Fix inclusion of input bams in work directory #3507

Closed

anoronh4 commented Jun 8, 2023

Staged input bam and bai in a subfolder using stageAs, in order to prevent the inputs from matching the wildcard in the bam output channel.
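
As a rough illustration of the wildcard concern described above, the following Python sketch mimics a task directory on a plain filesystem. It is not Nextflow code and does not reproduce how Nextflow collects outputs (which may treat staged inputs specially); the directory layout, the file names, and the `simulate_task_dir` helper are all hypothetical.

```python
import os
import tempfile
from pathlib import Path
from typing import List, Optional


def simulate_task_dir(stage_subdir: Optional[str]) -> List[str]:
    """Build a throwaway task directory and return the names matched by '*.bam'.

    stage_subdir=None links the input at the top level of the task directory;
    stage_subdir="input" links it inside that subfolder instead.
    """
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)

        # Hypothetical upstream file that the task receives as input.
        source = root / "upstream" / "sample.bam"
        source.parent.mkdir()
        source.write_bytes(b"input")

        # Hypothetical task (work) directory.
        work = root / "work"
        work.mkdir()

        # Stage the input as a symlink, at the top level or in a subfolder.
        staged_dir = work / stage_subdir if stage_subdir else work
        staged_dir.mkdir(exist_ok=True)
        os.symlink(source, staged_dir / "sample.bam")

        # The tool writes its own BAM at the top level of the task directory.
        (work / "sample.dedup.bam").write_bytes(b"output")

        # Non-recursive glob, so nothing inside a subfolder is matched.
        return sorted(p.name for p in work.glob("*.bam"))


print(simulate_task_dir(None))     # input staged at the top level
print(simulate_task_dir("input"))  # input staged under input/
```

With the input linked at the top level, a bare `*.bam` glob sees both the staged input and the tool's output; with the input linked under `input/`, only the output matches.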

PR checklist

Closes #3504

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the module conventions in the contribution docs?
  • If necessary, include test data in your PR.
  • Remove all TODO statements.
  • Emit the versions.yml file.
  • Follow the naming conventions.
  • Follow the parameters requirements.
  • Follow the input/output options guidelines.
  • Add a resource label
  • Use BioConda and BioContainers if possible to fulfil software requirements.
  • Ensure that the test works with either Docker / Singularity. Conda CI tests can be quite flaky:
    • PROFILE=docker pytest --tag <MODULE> --symlink --keep-workflow-wd --git-aware
    • PROFILE=singularity pytest --tag <MODULE> --symlink --keep-workflow-wd --git-aware
    • PROFILE=conda pytest --tag <MODULE> --symlink --keep-workflow-wd --git-aware

anoronh4 changed the title from "staged input files for umitools/dedup in subfolder to avoid copying" to "Fix inclusion of input bams in work directory" Jun 8, 2023

nvnieuwk commented Jun 9, 2023

Input files are always excluded from the output glob pattern, so this shouldn't be a problem. You can also change the output to ${prefix}.bam instead if you want to be 100% certain that only the output BAM is in the output channel. Using stageAs should be seen as a last resort, which isn't the case here :)

anoronh4 commented Jun 9, 2023

> Input files are always excluded from the output glob pattern, so this shouldn't be a problem. You can also change the output to ${prefix}.bam instead if you want to be 100% certain that only the output BAM is in the output channel. Using stageAs should be seen as a last resort, which isn't the case here :)

@nvnieuwk Yeah, it's occurring to me that there is a problem, but not exactly the one I thought. The BAM isn't actually going into the output channel. Nonetheless, the file is being copied from the scratch directory instead of being cleaned up, and it becomes a hard link. I will raise this on the Nextflow GitHub and see what they say.
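
For readers who want to check which kind of file ends up in a task directory, here is a minimal Python sketch that tells a symlink apart from a hard-linked file and from an independent copy. It only inspects filesystem metadata and does not reproduce Nextflow's scratch handling; `link_kind`, `demo`, and the file names are hypothetical.

```python
import os
import tempfile
from pathlib import Path
from typing import Dict


def link_kind(path: Path) -> str:
    """Classify a path as 'symlink', 'hard-linked file', or 'regular file'."""
    if path.is_symlink():
        return "symlink"
    # lstat() does not follow symlinks; st_nlink counts the directory
    # entries that point at the same inode.
    if path.lstat().st_nlink > 1:
        return "hard-linked file"
    return "regular file"


def demo() -> Dict[str, str]:
    """Create one file of each kind in a temporary directory and classify it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        original = root / "sample.bam"
        original.write_bytes(b"input")

        soft = root / "soft.bam"
        os.symlink(original, soft)       # pointer to the original path

        hard = root / "hard.bam"
        os.link(original, hard)          # second name for the same inode

        copy = root / "copy.bam"
        copy.write_bytes(original.read_bytes())  # independent copy

        return {
            "soft": link_kind(soft),
            "hard": link_kind(hard),
            "copy": link_kind(copy),
        }


print(demo())
```

Note that once a hard link exists, the original name reports a link count above one as well, so this check identifies files that share an inode rather than which name came first.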

anoronh4 commented Jun 9, 2023

@nvnieuwk Actually, I found that someone else has already posted an issue that matches this exactly: nextflow-io/nextflow#3995.

Since this issue has not been addressed by the Nextflow team yet, should we fix the umitools/dedup module with this workaround? If so, I'll merge the two PRs related to umitools/dedup (this one and #3506).

Successfully merging this pull request may close these issues.

umitools/dedup converts input bam from symlink to hard link