scRNA-seq data processing

Hyunsoo Kim edited this page Feb 9, 2023 · 3 revisions

Step 1: Align the sequences in the scRNA-seq FASTQ files to the GRCh38 reference transcriptome with 10x Genomics cellranger count. This produces one filtered_feature_bc_matrix.h5 file per sample, so two files for the two samples in this example.
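As a rough sketch of Step 1, the loop below runs cellranger count once per sample. The reference path, the FASTQ root, the one-folder-per-sample FASTQ layout, and the DRY_RUN switch are all assumptions made for illustration and are not part of this pipeline. Depending on the Cell Ranger version, additional arguments may be required, so check cellranger count --help before running.

```shell
set -eu

# Hypothetical locations; adjust to your environment.
REF="${REF:-/path/to/refdata-gex-GRCh38-2020-A}"
FASTQ_ROOT="${FASTQ_ROOT:-/path/to/fastqs}"
# Assumption: DRY_RUN=1 only prints the command, DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"

run_count() {
  # $1 = sample name (also used as the run id and the FASTQ subfolder name)
  set -- cellranger count "--id=$1" "--transcriptome=$REF" \
    "--fastqs=$FASTQ_ROOT/$1" "--sample=$1"
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

for sample in Patient1 Patient2; do
  run_count "$sample"
done
```

With --id=Patient1, Cell Ranger writes its results to Patient1/outs/, which is where the filtered_feature_bc_matrix.h5 file used in Step 2 comes from.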

Step 2: Create the following directory structure by copying or linking the files.

../count_male-bc
├── Patient1
│   └── outs
│       └── filtered_feature_bc_matrix.h5
└── Patient2
    └── outs
        └── filtered_feature_bc_matrix.h5
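The layout above can be built with symbolic links, for example as sketched below. CELLRANGER_DIR is a hypothetical name for wherever the Step 1 run folders live, and the helper link_sample is introduced here only for illustration.

```shell
set -eu

# Hypothetical: directory holding the cellranger count run folders from Step 1.
CELLRANGER_DIR="${CELLRANGER_DIR:-./cellranger_runs}"
# Target directory, as shown in the tree above.
COUNT_DIR="${COUNT_DIR:-../count_male-bc}"

link_sample() {
  # $1 = sample name
  src="$CELLRANGER_DIR/$1/outs/filtered_feature_bc_matrix.h5"
  dest_dir="$COUNT_DIR/$1/outs"
  if [ ! -f "$src" ]; then
    echo "missing: $src" >&2
    return 1
  fi
  mkdir -p "$dest_dir"
  # Use an absolute path so the link does not depend on the current directory.
  abs_src="$(cd "$(dirname "$src")" && pwd)/$(basename "$src")"
  ln -sf "$abs_src" "$dest_dir/filtered_feature_bc_matrix.h5"
}

for sample in Patient1 Patient2; do
  link_sample "$sample" || echo "skipped $sample" >&2
done
```

Replacing ln -sf with cp gives the copy variant. Links save disk space but break if the original run folders are moved.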

Step 3: Create a Seurat object for each sample with the following command:

./make_sc-rna-seq_seurat_obj.R --dir_count ../count_male-bc --dir_output ./output_male-bc --dir_seurat_obj ./output_male-bc/rds_male-bc --type_qc arguments --min_ncount_rna 5000 --min_nfeature_rna 2000 --th_percent.mt 25 --max_dimstouse 30 --seurat_resolution 0.8 --method_to_update_cell_types epithelial_cell_types --method_to_identify_subtypes none --type_infercnv_argset vignettes --method_to_determine_th_cna_value_corr fixed --th_cna_value 0.05 --th_cna_corr 0.35 male-bc Patient1

The example above is for Patient1 only. To make the Seurat object for Patient2, change the last argument to Patient2. After both runs, the output directory ./output_male-bc contains the following:

output_male-bc/
├── infercnv
│   ├── male-bc_Patient1_cnv_postdoublet
│   └── male-bc_Patient2_cnv_postdoublet
├── output
│   └── log
├── rds_male-bc
│   ├── male-bc_Patient1_sc-rna-seq_sample_seurat_obj.rds
│   ├── male-bc_Patient2_sc-rna-seq_sample_seurat_obj.rds
│   └── wilcox_degs
├── tsv
│   ├── infercnv_input_barcode_group_male-bc_Patient1.tsv
│   └── infercnv_input_barcode_group_male-bc_Patient2.tsv
└── xlsx
    ├── male-bc_Patient1_sc-rna-seq_pipeline_summary.xlsx
    └── male-bc_Patient2_sc-rna-seq_pipeline_summary.xlsx
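Since only the last argument changes between samples, the Step 3 command can be wrapped in a loop. The sketch below reuses the arguments from the Patient1 example unchanged. The function name make_sample_obj and the DRY_RUN switch are assumptions added for illustration and are not options of the R script.

```shell
set -eu

# Assumption: DRY_RUN=1 only prints the command, DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"

make_sample_obj() {
  # $1 = sample name; every other argument is copied from the Step 3 command.
  set -- ./make_sc-rna-seq_seurat_obj.R \
    --dir_count ../count_male-bc \
    --dir_output ./output_male-bc \
    --dir_seurat_obj ./output_male-bc/rds_male-bc \
    --type_qc arguments \
    --min_ncount_rna 5000 --min_nfeature_rna 2000 --th_percent.mt 25 \
    --max_dimstouse 30 --seurat_resolution 0.8 \
    --method_to_update_cell_types epithelial_cell_types \
    --method_to_identify_subtypes none \
    --type_infercnv_argset vignettes \
    --method_to_determine_th_cna_value_corr fixed \
    --th_cna_value 0.05 --th_cna_corr 0.35 \
    male-bc "$1"
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

for sample in Patient1 Patient2; do
  make_sample_obj "$sample"
done
```

Running it first with DRY_RUN=1 lets you compare the printed commands against the example before starting the actual runs.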

Step 4: Merge the per-sample Seurat objects into a single merged Seurat object with the following command:

./make_sc-rna-seq_merged_seurat_obj.R --dir_output ./output_male-bc --dir_seurat_obj ./output_male-bc/rds_male-bc --type_parsing_rds_filename unc-male-bc --method_integration none --max_dimstouse 30 --seurat_resolution 0.2 --harmony_theta 0 male-bc

The output file is written to ./output_male-bc/rds_male-bc, the directory specified by the --dir_seurat_obj argument.

output_male-bc/
│   ...
├── rds_male-bc
│   ├── male-bc_Patient1_sc-rna-seq_sample_seurat_obj.rds
│   ├── male-bc_Patient2_sc-rna-seq_sample_seurat_obj.rds
│   ├── male-bc_sc-rna-seq_merged_seurat_obj.rds
│   └── wilcox_degs
...
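Before moving on to downstream analysis, it can help to confirm that the .rds files listed above exist. The check below is a minimal sketch; the function name check_outputs is hypothetical, and the file names are taken from the tree above.

```shell
set -eu

check_outputs() {
  # $1 = output directory; remaining arguments = sample names.
  out_dir="$1"
  shift
  rds_dir="$out_dir/rds_male-bc"
  missing=0
  # One per-sample Seurat object from Step 3 for each sample.
  for sample in "$@"; do
    f="$rds_dir/male-bc_${sample}_sc-rna-seq_sample_seurat_obj.rds"
    [ -f "$f" ] || { echo "missing: $f" >&2; missing=1; }
  done
  # The merged Seurat object from Step 4.
  f="$rds_dir/male-bc_sc-rna-seq_merged_seurat_obj.rds"
  [ -f "$f" ] || { echo "missing: $f" >&2; missing=1; }
  return "$missing"
}

check_outputs ./output_male-bc Patient1 Patient2 \
  || echo "some expected outputs are missing" >&2
```

The function returns a non-zero status when any file is absent, so it can also gate later steps in a larger script.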