Update: A repository of pipelines for single-cell data in Nextflow DSL2. Updated to make SCENIC single- and multi-run functional with v3 feather files

JamesHowie14/vsn-pipelines

 
 

VSN-Pipelines

## 2024-09-03

# Fork Notes:

I noticed that the repository has been archived. Unfortunately, the most recent version does not run with the most up-to-date motif files.

Therefore, I produced a fork that can run the scenic module of this VSN-pipeline in both single-run and multi-run modes.

To do this, I borrowed two fixes from the ccasar/vsn-pipelines fork. These allow the VSN-pipeline to run in single-run mode when skipReports = true is set in the config. To allow the multi-run aggregation to function, I made one further tweak.

These are small changes, but they can be tricky to track down, and they should save time for people who want to use SCENIC multi-run mode with automatic aggregation.

---

# Run Notes:

To run, create an environment and install the following:

  • Singularity: 3.8.6
  • Nextflow: 21.04.03 (crucial)
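Since the Nextflow version is flagged as crucial, it can help to confirm the installed versions before going further. The following is a minimal sketch and not part of the fork: `require_version` is a hypothetical helper name, and it only checks that the expected version string appears somewhere in the text the tool prints. Nextflow may report the release as `21.04.3` rather than `21.04.03`, so adjust the expected string to whatever your installation prints.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of vsn-pipelines): succeeds when the
# version text printed by a tool contains the expected version string.
require_version() {
    local name="$1" expected="$2" output="$3"
    if printf '%s\n' "$output" | grep -qF -- "$expected"; then
        echo "OK: $name matches $expected"
    else
        echo "MISMATCH: $name does not report $expected" >&2
        return 1
    fi
}

# Example calls, assuming both tools are on PATH:
#   require_version singularity "3.8.6" "$(singularity --version)"
#   require_version nextflow "21.04" "$(nextflow -version 2>&1)"
```

The match is a plain substring test, so it is a sanity check rather than a strict version comparison.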

Then, set the locale to C, checking the locale settings before and after:

locale
export LANG="C"
export LC_ALL="C"
locale

Then, pull the fork:

nextflow pull JamesHowie14/vsn-pipelines -r master
ls -l ~/.nextflow/assets/JamesHowie14/vsn-pipelines

Then, generate the config:

nextflow config JamesHowie14/vsn-pipelines \
   -profile scenic,scenic_multiruns,scenic_use_cistarget_motifs,scenic_use_cistarget_tracks,hg38,singularity > nf_CPUopt-Real-MultiRun.config

Then, edit the config:

container = 'aertslab/pyscenic_scanpy:0.12.1_1.9.1' // crucial

skipReports = true // crucial, for up-to-date feather files for the motifs/tracks
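Before launching a long run, the two edits above can be checked with a quick grep. This is a minimal sketch under stated assumptions: `check_scenic_config` is a hypothetical helper, not something the pipeline provides, and it only confirms that the container tag and a `skipReports = true` line appear somewhere in the file. It does not verify which config block they sit in.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of vsn-pipelines): returns 0 only when
# both edits described in the run notes are present in the given config.
check_scenic_config() {
    local config="$1" status=0
    # The pySCENIC container tag given in the run notes.
    if ! grep -qF "aertslab/pyscenic_scanpy:0.12.1_1.9.1" "$config"; then
        echo "missing: aertslab/pyscenic_scanpy:0.12.1_1.9.1 container" >&2
        status=1
    fi
    # skipReports must be true; whitespace around '=' may vary.
    if ! grep -qE '^[[:space:]]*skipReports[[:space:]]*=[[:space:]]*true' "$config"; then
        echo "missing: skipReports = true" >&2
        status=1
    fi
    return "$status"
}
```

For example, `check_scenic_config nf_CPUopt-Real-MultiRun.config && echo "config looks ready"` prints the message only when both settings are found.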

Then, run via:

nextflow -C nf_CPUopt-Real-MultiRun.config run JamesHowie14/vsn-pipelines -entry scenic -r master

This should allow you to run the VSN-pipeline implementation of pySCENIC in single-run mode and, crucially, also in multi-run mode with aggregation.

Note -- the scenic reports fail, hence they are skipped. Making them work requires an edit of vsn-pipelines/src/scenic/bin/reports/scenic_report.ipynb. I have not looked at this, but see the ccasar/vsn-pipelines fork for one attempt to fix it.

# JMH - Sept, 3rd, 2024

##

VSN-Pipelines has now been archived

2023-04-19 - Unfortunately due to lack of developers, VSN-pipelines is no longer being worked on and has been archived. The repo will remain in read-only mode from this point on.

A repository of pipelines for single-cell data analysis in Nextflow DSL2.

Full documentation is available on Read the Docs, or take a look at the Quick Start guide.

This main repo contains multiple workflows for analyzing single cell transcriptomics data, and depends on a number of tools, which are organized into subfolders within the src/ directory. The VIB-Singlecell-NF organization contains this main repo along with a collection of example runs (VSN-Pipelines-examples). Currently available workflows are listed below.

If VSN-Pipelines is useful for your research, consider citing:

Raw Data Processing Workflows

These are set up to run Cell Ranger and DropSeq pipelines.

| Pipeline / Entrypoint | Purpose | Documentation |
| --- | --- | --- |
| cellranger | Process 10x Chromium data | cellranger |
| demuxlet_freemuxlet | Demultiplexing | demuxlet_freemuxlet |
| nemesh | Process Drop-seq data | nemesh |

Single Sample Workflows

The Single Sample Workflows perform a "best practices" scRNA-seq analysis. Multiple samples can be run in parallel, treating each sample separately.

| Pipeline / Entrypoint | Purpose | Documentation |
| --- | --- | --- |
| single_sample | Independent samples | Single-sample Pipeline |
| single_sample_scenic | Ind. samples + SCENIC | Single-sample SCENIC Pipeline |
| scenic | SCENIC GRN inference | SCENIC Pipeline |
| scenic_multiruns | SCENIC run multiple times | SCENIC Multi-runs Pipeline |
| single_sample_scenic_multiruns | Ind. samples + multi-SCENIC | Single-sample SCENIC Multi-runs Pipeline |
| single_sample_scrublet | Ind. samples + Scrublet | Single-sample Scrublet Pipeline |
| decontx | DecontX | DecontX Pipeline |
| single_sample_decontx | Ind. samples + DecontX | Single-sample DecontX Pipeline |
| single_sample_decontx_scrublet | Ind. samples + DecontX + Scrublet | Single-sample DecontX Scrublet Pipeline |

Sample Aggregation Workflows

The Sample Aggregation Workflows perform a "best practices" scRNA-seq analysis on a merged and batch-corrected group of samples. Available batch correction methods include BBKNN, mnnCorrect, and Harmony.

| Pipeline / Entrypoint | Purpose | Documentation |
| --- | --- | --- |
| bbknn | Sample aggregation + BBKNN | BBKNN Pipeline |
| bbknn_scenic | BBKNN + SCENIC | BBKNN SCENIC Pipeline |
| harmony | Sample aggregation + Harmony | Harmony Pipeline |
| harmony_scenic | Harmony + SCENIC | Harmony SCENIC Pipeline |
| mnncorrect | Sample aggregation + mnnCorrect | MNN-correct Pipeline |

In addition, the pySCENIC implementation of the SCENIC workflow is integrated here and can be run in conjunction with any of the above workflows. The output of each of the main workflows is a loom-format file, which is ready for import into the interactive single-cell web visualization tool SCope. In addition, data is also output in h5ad format, and reports are generated for the major pipeline steps.

scATAC-seq workflows

Single-cell ATAC-seq processing steps are now included in VSN-Pipelines. Currently, a preprocessing workflow is available, which takes fastq inputs, applies barcode correction, read trimming, and bwa mapping, and outputs bam and fragments files for further downstream analysis. See here for complete documentation.
