This repository has been archived by the owner on Apr 19, 2023. It is now read-only.

[BUG] AttributeError: 'numpy.ndarray' object has no attribute 'nnz' #260

Closed
SeppeDeWinter opened this issue Nov 30, 2020 · 0 comments · Fixed by #262

Labels
bug Something isn't working


Describe the bug
When running the single_sample pipeline on a single .csv count matrix, execution of sc_h5ad_merge.py exits with AttributeError: 'numpy.ndarray' object has no attribute 'nnz'.

To Reproduce
Steps to reproduce the behavior:

  1. Configure with these options, using this publicly available dataset (downloaded with wget):

wget ftp.ncbi.nlm.nih.gov/geo/series/GSE129nnn/GSE129114/suppl/GSE129114_E9.5_trunk_Wnt1_counts.txt.gz
gunzip GSE129114_E9.5_trunk_Wnt1_counts.txt.gz
tr ' ' ',' < GSE129114_E9.5_trunk_Wnt1_counts.txt > GSE129114_E9.5_trunk_Wnt1_counts.csv

Generated the .config file using:

nextflow pull vib-singlecell-nf/vsn-pipelines
nextflow config vib-singlecell-nf/vsn-pipelines \
    -profile csv,singularity,single_sample > single_sample.config

Adapted the following lines of the .config file:

filter {
            report_ipynb = '/src/scanpy/bin/reports/sc_filter_qc_report.ipynb'
+           cellFilterMinNGenes = 0
+           cellFilterMaxNGenes = 4000000
+           cellFilterMaxPercentMito = 0.15
+           geneFilterMinNCells = 5
            off = 'h5ad'
            outdir = 'out'
         }
...
data {
      csv {
+        file_paths = '/staging/leuven/stg_00002/lcb/sdewin/PhD/SAMMAP_TEST/data/musMus/GSE129114_E9.5_trunk_Wnt1_counts.csv'
         suffix = '.csv'
      }
   }
  2. Run using this entry point:
nextflow -C single_sample.config \
    run vib-singlecell-nf/vsn-pipelines \
    -entry single_sample
  3. See error:
WARN: Killing pending tasks (3)                   
WARN: To render the execution DAG in the required format it is required to install Graphviz -- See http://www.graphviz.org for more info.
Error executing process > 'single_sample:SINGLE_SAMPLE:SCANPY__SINGLE_SAMPLE:FINALIZE:FILE_CONVERTER_TO_SCANPY:SC__H5AD_MERGE (1)'

Caused by:
  Process `single_sample:SINGLE_SAMPLE:SCANPY__SINGLE_SAMPLE:FINALIZE:FILE_CONVERTER_TO_SCANPY:SC__H5AD_MERGE (1)` terminated with an error exit status (1)

Command executed:                                                                                   

  /user/leuven/330/vsc33053/.nextflow/assets/vib-singlecell-nf/vsn-pipelines/src/utils/bin/sc_h5ad_merge.py             *             "GSE129114_E9.5_trunk_Wnt1_counts.SC__H5AD_MERGE.h5ad"

Command exit status:                              
  1

Command output:                                                                                     
  (empty)

Command error:
  Traceback (most recent call last):
    File "/user/leuven/330/vsc33053/.nextflow/assets/vib-singlecell-nf/vsn-pipelines/src/utils/bin/sc_h5ad_merge.py", line 46, in <module>
      if not all([(adatas[0].raw.X != _adata.raw.X).nnz == 0 for _adata in adatas]):
    File "/user/leuven/330/vsc33053/.nextflow/assets/vib-singlecell-nf/vsn-pipelines/src/utils/bin/sc_h5ad_merge.py", line 46, in <listcomp>
      if not all([(adatas[0].raw.X != _adata.raw.X).nnz == 0 for _adata in adatas]):
  AttributeError: 'numpy.ndarray' object has no attribute 'nnz'

Work dir:
  /ddn1/vol1/staging/leuven/stg_00002/lcb/sdewin/PhD/SAMMAP_TEST/preprocessing/musMus/work/c3/6e05797a63f30d2cd807012cd01391

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`
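
The traceback suggests that, for this CSV input, adata.raw.X is a dense numpy array rather than a scipy sparse matrix. A minimal sketch of the difference, using small made-up matrices instead of the real .h5ad files (all variable names here are hypothetical):

```python
import numpy as np
from scipy.sparse import csr_matrix

# Two identical toy count matrices standing in for adatas[i].raw.X.
dense_a = np.array([[0, 1], [2, 0]])
dense_b = dense_a.copy()

# Dense != dense returns a plain boolean ndarray, which has no .nnz attribute.
dense_diff = dense_a != dense_b
print(type(dense_diff).__name__, hasattr(dense_diff, "nnz"))

# Sparse != sparse returns a sparse matrix; .nnz counts the differing entries.
sparse_diff = csr_matrix(dense_a) != csr_matrix(dense_b)
print(sparse_diff.nnz)
```

So the check on line 46 of sc_h5ad_merge.py only works when the raw.X slots are sparse.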

The problem can be fixed by editing .nextflow/assets/vib-singlecell-nf/vsn-pipelines/src/utils/bin/sc_h5ad_merge.py:

+from scipy.sparse import csr_matrix

...

- if not all([(adatas[0].raw.X != _adata.raw.X).nnz == 0 for _adata in adatas]):
+ if not all([(csr_matrix(adatas[0].raw.X) != csr_matrix(_adata.raw.X)).nnz == 0 for _adata in adatas]):
      # Source: https://stackoverflow.com/questions/30685024/check-if-two-scipy-sparse-csr-matrix-are-equal
      raise Exception("VSN ERROR: adata.raw.X slots are not the same between h5ad files.")
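
An alternative to converting everything to csr_matrix would be a small helper that accepts both dense and sparse inputs. This is only a sketch, not necessarily what the linked fix does; the helper name matrices_equal is made up for illustration:

```python
import numpy as np
from scipy.sparse import csr_matrix, issparse


def matrices_equal(a, b):
    """Return True if a and b hold the same values (dense or sparse)."""
    # Different shapes can never be equal; checking first avoids
    # shape-dependent behaviour of the element-wise comparison.
    if a.shape != b.shape:
        return False
    if issparse(a) or issparse(b):
        # Same trick as in the issue: count the entries that differ.
        return (csr_matrix(a) != csr_matrix(b)).nnz == 0
    # Both dense: compare directly without building sparse copies.
    return np.array_equal(a, b)


dense = np.array([[0, 1], [2, 0]])
changed = np.array([[0, 1], [3, 0]])

print(matrices_equal(dense, dense.copy()))
print(matrices_equal(csr_matrix(dense), dense))
print(matrices_equal(dense, changed))
```

The check in sc_h5ad_merge.py could then read `all(matrices_equal(adatas[0].raw.X, _adata.raw.X) for _adata in adatas)`, assuming every raw.X is either a numpy array or a scipy sparse matrix.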


Please complete the following information:

  • OS: CentOS Linux 7 (Core)
  • Nextflow Version: 20.04.1.5335
  • vsn-pipelines Version: 0.21.0

Additional context
N/A

SeppeDeWinter added the bug label Nov 30, 2020
dweemx added a commit that referenced this issue Nov 30, 2020
KrisDavie pushed a commit that referenced this issue Dec 8, 2020
Former-commit-id: 2b1bd4e