Updated test data paths
GallVp committed Jul 26, 2024
2 parents 984e780 + f9f884b commit ab92c43
Showing 28 changed files with 837 additions and 267 deletions.
12 changes: 12 additions & 0 deletions .github/check_test_data_paths.sh
@@ -0,0 +1,12 @@
#!/usr/bin/env bash

for path in $(find . -name '*.nf.test');
do
    result=$(grep 'params.test_data' $path)

    if [[ $? -ne 1 ]]; then
        echo "$path"
        echo "$result"
        exit 1
    fi
done
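
Review note: the loop above word-splits the output of `find` and passes `params.test_data` to `grep` as a regular expression, so paths containing spaces and the `.` wildcard are both handled loosely. A minimal sketch of a stricter variant follows, assuming GNU or BSD `find` and `grep`; the function name `find_legacy_test_data_refs` is hypothetical and not part of this commit.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not in the commit): print every *.nf.test file under
# a directory that still mentions the legacy `params.test_data` map.
find_legacy_test_data_refs() {
    local root="${1:-.}"
    # -print0 with read -d '' keeps paths that contain spaces intact.
    find "$root" -name '*.nf.test' -print0 |
        while IFS= read -r -d '' path; do
            # -F matches the text literally; -l prints only the file name.
            # grep exits 1 when nothing matches, which is not an error here.
            grep -l -F 'params.test_data' "$path" || true
        done
}
```

A caller could capture the output and fail when it is non-empty, which mirrors the `exit 1` behaviour of the committed script while reporting all offending files at once.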
4 changes: 4 additions & 0 deletions .github/workflows/test.yml
@@ -183,6 +183,10 @@ jobs:
        exclude:
          - profile: conda
            path: modules/gallvp/braker3
          - profile: conda
            path: modules/gallvp/edta/edta
          - profile: conda
            path: subworkflows/gallvp/fasta_edta_lai
    env:
      NXF_ANSI_LOG: false
      NFTEST_VER: "0.9.0"
8 changes: 8 additions & 0 deletions .pre-commit-config.yaml
@@ -39,3 +39,11 @@ repos:
        always_run: true
        fail_fast: true
        pass_filenames: false
      - id: test_data_path_checks
        name: Test data path checks
        language: system
        entry: >
          ./.github/check_test_data_paths.sh
        always_run: true
        fail_fast: true
        pass_filenames: false
6 changes: 3 additions & 3 deletions README.md
@@ -30,9 +30,9 @@ pip install --upgrade --force-reinstall git+https://github.com/nf-core/tools.git

Following modules have been submitted and added (✅︎) to nf-core/modules and may be removed (⛔) from this repository without notice.

- | Module | Pull request |
- | ------------ | ----------------------------------------------------- |
- | pbtk/pbindex | [#5901](https://github.com/nf-core/modules/pull/5901) |
+ | Module | Pull request |
+ | ------------------- | ----------------------------------------------------- |
+ | pbtk/pbindex ✅︎ ⛔ | [#5901](https://github.com/nf-core/modules/pull/5901) |

And [more...](./SUBMITTED.md)

3 changes: 2 additions & 1 deletion docs/AVAILABLE.txt
@@ -1,17 +1,18 @@
<li><a href="https://github.com/gallvp/nxf-components/tree/main/subworkflows/gallvp/gxf_fasta_agat_spaddintrons_spextractsequences">subworkflows/gallvp/gxf_fasta_agat_spaddintrons_spextractsequences</a></li>
<li><a href="https://github.com/gallvp/nxf-components/tree/main/subworkflows/gallvp/fasta_ltrretriever_lai">subworkflows/gallvp/fasta_ltrretriever_lai</a></li>
<li><a href="https://github.com/gallvp/nxf-components/tree/main/subworkflows/gallvp/fasta_gxf_busco_plot">subworkflows/gallvp/fasta_gxf_busco_plot</a></li>
+ <li><a href="https://github.com/gallvp/nxf-components/tree/main/subworkflows/gallvp/fasta_edta_lai">subworkflows/gallvp/fasta_edta_lai</a></li>
<li><a href="https://github.com/gallvp/nxf-components/tree/main/modules/gallvp/tesorter">modules/gallvp/tesorter</a></li>
<li><a href="https://github.com/gallvp/nxf-components/tree/main/modules/gallvp/syri">modules/gallvp/syri</a></li>
<li><a href="https://github.com/gallvp/nxf-components/tree/main/modules/gallvp/repeatmasker">modules/gallvp/repeatmasker</a></li>
- <li><a href="https://github.com/gallvp/nxf-components/tree/main/modules/gallvp/pbtk/pbindex">modules/gallvp/pbtk/pbindex</a></li>
<li><a href="https://github.com/gallvp/nxf-components/tree/main/modules/gallvp/minimap2/align">modules/gallvp/minimap2/align</a></li>
<li><a href="https://github.com/gallvp/nxf-components/tree/main/modules/gallvp/ltrretriever/ltrretriever">modules/gallvp/ltrretriever/ltrretriever</a></li>
<li><a href="https://github.com/gallvp/nxf-components/tree/main/modules/gallvp/ltrretriever/lai">modules/gallvp/ltrretriever/lai</a></li>
<li><a href="https://github.com/gallvp/nxf-components/tree/main/modules/gallvp/ltrharvest">modules/gallvp/ltrharvest</a></li>
<li><a href="https://github.com/gallvp/nxf-components/tree/main/modules/gallvp/ltrfinder">modules/gallvp/ltrfinder</a></li>
<li><a href="https://github.com/gallvp/nxf-components/tree/main/modules/gallvp/gunzip">modules/gallvp/gunzip</a></li>
<li><a href="https://github.com/gallvp/nxf-components/tree/main/modules/gallvp/gffread">modules/gallvp/gffread</a></li>
+ <li><a href="https://github.com/gallvp/nxf-components/tree/main/modules/gallvp/edta/edta">modules/gallvp/edta/edta</a></li>
<li><a href="https://github.com/gallvp/nxf-components/tree/main/modules/gallvp/custom/shortenfastaids">modules/gallvp/custom/shortenfastaids</a></li>
<li><a href="https://github.com/gallvp/nxf-components/tree/main/modules/gallvp/custom/restoregffids">modules/gallvp/custom/restoregffids</a></li>
<li><a href="https://github.com/gallvp/nxf-components/tree/main/modules/gallvp/cat/cat">modules/gallvp/cat/cat</a></li>
15 changes: 10 additions & 5 deletions docs/index.html
@@ -105,6 +105,11 @@ <h1>gallvp/nxf-components</h1>
>subworkflows/gallvp/fasta_gxf_busco_plot</a
>
</li>
+ <li>
+ <a href="https://github.com/gallvp/nxf-components/tree/main/subworkflows/gallvp/fasta_edta_lai"
+ >subworkflows/gallvp/fasta_edta_lai</a
+ >
+ </li>
<li>
<a href="https://github.com/gallvp/nxf-components/tree/main/modules/gallvp/tesorter"
>modules/gallvp/tesorter</a
@@ -118,11 +123,6 @@ <h1>gallvp/nxf-components</h1>
>modules/gallvp/repeatmasker</a
>
</li>
- <li>
- <a href="https://github.com/gallvp/nxf-components/tree/main/modules/gallvp/pbtk/pbindex"
- >modules/gallvp/pbtk/pbindex</a
- >
- </li>
<li>
<a href="https://github.com/gallvp/nxf-components/tree/main/modules/gallvp/minimap2/align"
>modules/gallvp/minimap2/align</a
@@ -158,6 +158,11 @@ <h1>gallvp/nxf-components</h1>
>modules/gallvp/gffread</a
>
</li>
+ <li>
+ <a href="https://github.com/gallvp/nxf-components/tree/main/modules/gallvp/edta/edta"
+ >modules/gallvp/edta/edta</a
+ >
+ </li>
<li>
<a href="https://github.com/gallvp/nxf-components/tree/main/modules/gallvp/custom/shortenfastaids"
>modules/gallvp/custom/shortenfastaids</a
20 changes: 8 additions & 12 deletions modules/gallvp/busco/busco/tests/main.nf.test
@@ -18,7 +18,7 @@ nextflow_process {
"""
input[0] = [
[ id:'test' ], // meta map
- file( params.test_data['bacteroides_fragilis']['genome']['genome_fna_gz'], checkIfExists: true)
+ file(params.modules_testdata_base_path + 'genomics/prokaryotes/bacteroides_fragilis/genome/genome.fna.gz', checkIfExists: true)
]
input[1] = 'genome'
input[2] = 'bacteria_odb10' // Launch with 'auto' to use --auto-lineage, and specified lineages // 'auto' removed from test due to memory issues
@@ -80,8 +80,8 @@ nextflow_process {
input[0] = [
[ id:'test' ], // meta map
[
- file( params.test_data['bacteroides_fragilis']['genome']['genome_fna_gz'], checkIfExists: true),
- file( params.test_data['candidatus_portiera_aleyrodidarum']['genome']['genome_fasta'], checkIfExists: true)
+ file(params.modules_testdata_base_path + 'genomics/prokaryotes/bacteroides_fragilis/genome/genome.fna.gz', checkIfExists: true),
+ file(params.modules_testdata_base_path + 'genomics/prokaryotes/candidatus_portiera_aleyrodidarum/genome/genome.fasta', checkIfExists: true)
]
]
input[1] = 'genome'
@@ -165,7 +165,7 @@ nextflow_process {
"""
input[0] = [
[ id:'test' ], // meta map
- file( params.test_data['homo_sapiens']['genome']['genome_fasta'], checkIfExists: true)
+ file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true)
]
input[1] = 'genome'
input[2] = 'eukaryota_odb10'
@@ -208,8 +208,6 @@ nextflow_process {

with(path("${process.out.busco_dir[0][1]}/logs/busco.log").text) {
assert contains('DEBUG:busco.run_BUSCO')
- assert contains("'use_augustus', 'False'")
- assert contains("'use_metaeuk', 'True'") // METAEUK
assert contains('Results from dataset')
assert contains('how to cite BUSCO')

@@ -230,7 +228,7 @@ nextflow_process {
"""
input[0] = [
[ id:'test' ], // meta map
- file( params.test_data['homo_sapiens']['genome']['genome_fasta'], checkIfExists: true)
+ file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true)
]
input[1] = 'genome'
input[2] = 'eukaryota_odb10'
@@ -250,8 +248,6 @@ nextflow_process {

with(path("${process.out.busco_dir[0][1]}/logs/busco.log").text) {
assert contains('DEBUG:busco.run_BUSCO')
- assert contains("'use_augustus', 'True'")
- assert contains("'use_metaeuk', 'False'") // AUGUSTUS
assert contains('Augustus did not recognize any genes')

}
@@ -275,7 +271,7 @@ nextflow_process {
"""
input[0] = [
[ id:'test' ], // meta map
- file( params.test_data['candidatus_portiera_aleyrodidarum']['genome']['proteome_fasta'], checkIfExists: true)
+ file(params.modules_testdata_base_path + 'genomics/prokaryotes/candidatus_portiera_aleyrodidarum/genome/proteome.fasta', checkIfExists: true)
]
input[1] = 'proteins'
input[2] = 'bacteria_odb10'
@@ -337,7 +333,7 @@ nextflow_process {
"""
input[0] = [
[ id:'test' ], // meta map
- file( params.test_data['bacteroides_fragilis']['illumina']['test1_contigs_fa_gz'], checkIfExists: true)
+ file(params.modules_testdata_base_path + 'genomics/prokaryotes/bacteroides_fragilis/illumina/fasta/test1.contigs.fa.gz', checkIfExists: true)
]
input[1] = 'transcriptome'
input[2] = 'bacteria_odb10'
@@ -398,7 +394,7 @@ nextflow_process {
"""
input[0] = [
[ id:'test' ], // meta map
- file( params.test_data['bacteroides_fragilis']['genome']['genome_fna_gz'], checkIfExists: true)
+ file(params.modules_testdata_base_path + 'genomics/prokaryotes/bacteroides_fragilis/genome/genome.fna.gz', checkIfExists: true)
]
input[1] = 'genome'
input[2] = 'bacteria_odb10'
4 changes: 2 additions & 2 deletions modules/gallvp/busco/generateplot/tests/main.nf.test
@@ -19,7 +19,7 @@ nextflow_process {
"""
input[0] = [
[ id:'test' ], // meta map
- file(params.test_data['bacteroides_fragilis']['genome']['genome_fna_gz'], checkIfExists: true)
+ file(params.modules_testdata_base_path + 'genomics/prokaryotes/bacteroides_fragilis/genome/genome.fna.gz', checkIfExists: true)
]
input[1] = 'genome'
input[2] = 'bacteria_odb10'
@@ -55,7 +55,7 @@ nextflow_process {
when {
process {
"""
- input[0] = file(params.test_data['bacteroides_fragilis']['genome']['genome_fna_gz'], checkIfExists: true)
+ input[0] = file(params.modules_testdata_base_path + 'genomics/prokaryotes/bacteroides_fragilis/genome/genome.fna.gz', checkIfExists: true)
"""
}
}
92 changes: 92 additions & 0 deletions modules/gallvp/edta/edta/main.nf
@@ -0,0 +1,92 @@
process EDTA_EDTA {
    tag "$meta.id"
    label 'process_high'

    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
        'https://depot.galaxyproject.org/singularity/edta:2.1.0--hdfd78af_1':
        'biocontainers/edta:2.1.0--hdfd78af_1' }"

    input:
    tuple val(meta), path(fasta)
    path cds
    path curatedlib
    path rmout
    path exclude

    output:
    tuple val(meta), path('*.log') , emit: log
    tuple val(meta), path('*.EDTA.TElib.fa') , emit: te_lib_fasta
    tuple val(meta), path('*.EDTA.pass.list') , emit: pass_list , optional: true
    tuple val(meta), path('*.EDTA.out') , emit: out_file , optional: true
    tuple val(meta), path('*.EDTA.TEanno.gff3') , emit: te_anno_gff3 , optional: true
    path "versions.yml" , emit: versions

    when:
    task.ext.when == null || task.ext.when

    script:
    def args = task.ext.args ?: ''
    def prefix = task.ext.prefix ?: "${meta.id}"
    def mod_file_name = "${fasta}.mod"
    def cds_file = cds ? "--cds $cds" : ''
    def curatedlib_file = curatedlib ? "--curatedlib $curatedlib": ''
    def rmout_file = rmout ? "--rmout $rmout" : ''
    def exclude_file = exclude ? "--exclude $exclude" : ''
    """
    EDTA.pl \\
        --genome $fasta \\
        --threads $task.cpus \\
        $cds_file \\
        $curatedlib_file \\
        $rmout_file \\
        $exclude_file \\
        $args \\
        &> >(tee "${prefix}.log" 2>&1)

    mv \\
        "${mod_file_name}.EDTA.TElib.fa" \\
        "${prefix}.EDTA.TElib.fa"

    [ -f "${mod_file_name}.EDTA.raw/LTR/${mod_file_name}.pass.list" ] \\
        && mv \\
            "${mod_file_name}.EDTA.raw/LTR/${mod_file_name}.pass.list" \\
            "${prefix}.EDTA.pass.list" \\
        || echo "EDTA did not produce a pass.list file"

    [ -f "${mod_file_name}.EDTA.anno/${mod_file_name}.out" ] \\
        && mv \\
            "${mod_file_name}.EDTA.anno/${mod_file_name}.out" \\
            "${prefix}.EDTA.out" \\
        || echo "EDTA did not produce an out file"

    [ -f "${mod_file_name}.EDTA.TEanno.gff3" ] \\
        && mv \\
            "${mod_file_name}.EDTA.TEanno.gff3" \\
            "${prefix}.EDTA.TEanno.gff3" \\
        || echo "EDTA did not produce a TEanno gff3 file"

    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        EDTA: \$(EDTA.pl -h | awk ' /##### Extensive/ {print \$7}')
    END_VERSIONS
    """

    stub:
    def args = task.ext.args ?: ''
    def prefix = task.ext.prefix ?: "${meta.id}"
    def touch_pass_list = args.contains("--anno 1") ? "touch ${prefix}.EDTA.pass.list" : ''
    def touch_out_file = args.contains("--anno 1") ? "touch ${prefix}.EDTA.out" : ''
    def touch_te_anno = args.contains("--anno 1") ? "touch ${prefix}.EDTA.TEanno.gff3": ''
    """
    touch "${prefix}.log"
    touch "${prefix}.EDTA.TElib.fa"
    $touch_pass_list
    $touch_out_file
    $touch_te_anno

    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        EDTA: \$(EDTA.pl -h | awk ' /##### Extensive/ {print \$7}')
    END_VERSIONS
    """
}
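
Review note: the optional outputs in the script block use the `[ -f … ] && mv … || echo …` idiom, in which the `echo` branch also runs when the file exists but `mv` fails, so the task still exits with status 0. A minimal sketch of an explicit `if`/`else` alternative follows. The helper name `move_if_present` and its arguments are hypothetical and not part of the module, and inside a Nextflow `script:` string the shell variables would need escaping (for example `\$src`).

```shell
#!/usr/bin/env bash

# Hypothetical helper (not in the module): rename an optional output file
# when it exists, otherwise report that it was not produced.
move_if_present() {
    local src="$1" dest="$2" label="$3"
    if [ -f "$src" ]; then
        # A failing mv returns its non-zero status to the caller
        # instead of falling through to the message branch.
        mv "$src" "$dest"
    else
        echo "EDTA did not produce ${label}"
    fi
}
```

With this shape a missing file is still tolerated, matching the `optional: true` outputs, while a failed rename surfaces as a task error.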
82 changes: 82 additions & 0 deletions modules/gallvp/edta/edta/meta.yml
@@ -0,0 +1,82 @@
---
# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/meta-schema.json
name: "edta_edta"
description: Extensive de-novo TE Annotator (EDTA)
keywords:
  - genome
  - repeat
  - annotation
  - transposable-elements
tools:
  - "edta":
      description: Extensive de-novo TE Annotator (EDTA)
      homepage: "https://github.com/oushujun/EDTA"
      documentation: "https://github.com/oushujun/EDTA"
      tool_dev_url: "https://github.com/oushujun/EDTA"
      doi: "10.1186/s13059-019-1905-y"
      licence: ["GPL v3"]
input:
  - meta:
      type: map
      description: |
        Groovy Map containing sample information
        e.g. `[ id:'test' ]`
  - fasta:
      type: file
      description: Genome fasta file
      pattern: "*.{fsa,fa,fasta}"
  - cds:
      type: file
      description: |
        A FASTA file containing the coding sequence (no introns, UTRs, nor TEs)
        of this genome or its close relative
      pattern: "*.{fsa,fa,fasta}"
  - curatedlib:
      type: file
      description: |
        A curated library to keep consistent naming and classification for known TEs
      pattern: "*.liban"
  - rmout:
      type: file
      description: |
        Homology-based TE annotation instead of using the EDTA library for masking in
        RepeatMasker .out format
      pattern: "*.out"
  - exclude:
      type: file
      description: Exclude regions (bed format) from TE masking in the MAKER.masked output
      pattern: "*.bed"
output:
  - meta:
      type: map
      description: |
        Groovy Map containing sample information
        e.g. `[ id:'test' ]`
  - log:
      type: file
      description: Log emitted by EDTA
      pattern: "*.log"
  - te_lib_fasta:
      type: file
      description: A non-redundant TE library in fasta format
      pattern: "*.EDTA.TElib.fa"
  - pass_list:
      type: file
      description: A summary table of intact LTR-RTs with coordinate and structural information
      pattern: "*.EDTA.pass.list"
  - out_file:
      type: file
      description: RepeatMasker annotation of all LTR sequences in the genome
      pattern: "*.EDTA.out"
  - te_anno_gff3:
      type: file
      description: A gff3 file containing both structurally intact and fragmented TE annotations
      pattern: "*.EDTA.TEanno.gff3"
  - versions:
      type: file
      description: File containing software versions
      pattern: "versions.yml"
authors:
  - "@GallVp"
maintainers:
  - "@GallVp"