
Issue with using sketch assay after loading single cell/single nuc data with BPcells from h5ad file #9301

Open
fingeram opened this issue Sep 11, 2024 · 1 comment


@fingeram

Hi,

I am working with a very large sc/sn RNA-seq dataset. Starting from an h5ad file, I have used the BPCells package to load the data in-memory as follows:

`raw <- open_matrix_anndata_hdf5(path="/novo/projects/departments/compbio/sysbio/Projects/mouse_liver_models/single_cell_and_nuclei/concatenated.dir/concatenated.h5ad") #imports as data type float

raw <- convert_matrix_type(raw, type = "uint32_t") #must convert count matrix from type float (non-integer) to integer values

write_matrix_dir(mat = raw, dir = "/novo/projects/shared_projects/liver_biology_colab/people/aqnf/mouse_sc_sn_AQNF_June24/BPcells/mouse_counts")

raw.mat <- open_matrix_dir(dir = "/novo/projects/shared_projects/liver_biology_colab/people/aqnf/mouse_sc_sn_AQNF_June24/BPcells/mouse_counts")

sobj <- CreateSeuratObject(counts = raw.mat)

meta <- merge(x= metadata_BSCK, y= metadata_CPDM, by.x = "LibraryID", by.y = "library_id", all.y=T)

sobj<- AddMetaData(sobj, metadata = meta)`

I am working with Seurat v5, so I am trying to split layers based on the preparation method (single-cell and single-nucleus sequencing). After that, I create a sketch assay for my Seurat object in-memory in order to run downstream analysis more efficiently (the dataset is too large for the available memory):

`sobj <- subset(sobj, subset = nCount_RNA < 50000 & nFeature_RNA > 250 & nFeature_RNA < 8000 & pct_ribo < 20)

sobj[["RNA"]] <- split(sobj[["RNA"]], f = sobj$group)

sobj <- NormalizeData(sobj)

sobj <- FindVariableFeatures(sobj)

sobj.sketch <- SketchData(
object = sobj,
ncells = 50000,
method = "LeverageScore",
sketched.assay = "sketch")

DefaultAssay(sobj.sketch) <- "sketch"`

Up to that point everything runs fine, but when I try to start the dimensionality reduction I run into issues that I don't understand. It seems that something goes wrong in RunPCA: the elbow plot looks very weird, and other steps of the pipeline that rely on the PCA fail to run. I tried to trace the issue but failed, so help is very welcome:

`sobj.sketch <- FindVariableFeatures(sobj.sketch)

sobj.sketch <- ScaleData(sobj.sketch)

sobj.sketch <- RunPCA(sobj.sketch)

sobj.sketch <- FindNeighbors(sobj.sketch, dims = 1:30)

Computing nearest neighbor graph
Computing SNN
Error: std::bad_alloc`

@fingeram
Author

Some additional info about my data objects:

raw.mat
33696 x 1328118 IterableMatrix object with class MatrixDir

Row names: Xkr4, Gm1992 ... ENSMUSG00000095041
Col names: AAACCCAAGCCTGAGA-97, AAACCCAGTCGTACAT-97 ... TTTGTTGTCTGCATGA-96

Data type: uint32_t
Storage order: column major

Queued Operations:

Load compressed matrix from directory /novo/projects/shared_projects/liver_biology_colab/people/aqnf/mouse_sc_sn_AQNF_June24/BPcells/mouse_counts

sobj
An object of class Seurat
33696 features across 1191094 samples within 1 assay
Active assay: RNA (33696 features, 2000 variable features)
4 layers present: counts.SC, counts.SN, data.SC, data.SN

sobj.sketch
An object of class Seurat
67392 features across 1191094 samples within 2 assays
Active assay: sketch (33696 features, 2000 variable features)
5 layers present: counts.SC, counts.SN, data.SC, data.SN, scale.data
1 other assay present: RNA
1 dimensional reduction calculated: pca

sobj.sketch@assays$sketch
Assay (v5) data with 33696 features for 1e+05 cells
Top 10 variable features:
Mmp12, Igfbp5, Igkc, Nxph1, Kcnip4, Ighm, Grm8, Nrg1, Jchain, Siglech
Layers:
counts.SC, counts.SN, data.SC, data.SN, scale.data

Elbow plot after RunPCA on the sketch assay in sobj.sketch:
[Image attachment: file_show (1)]
