PCR duplicates issue when WASP tag included #1177

Closed
AL-Roberts opened this issue Mar 17, 2021 · 9 comments

@AL-Roberts

Hi,

I would like to use WASP correction and remove PCR duplicates from my RNA-Seq data for ASE analysis.

However, using --runMode inputAlignmentsFromBAM with BAMs which include WASP tags (following --runMode alignReads) throws an error.

Is there a way around this?

Many thanks in advance,

Amy

@alexdobin
Owner

Hi Amy,

What STAR commands are you using, and what error do you see?
If you can send me the Log.out files, that would be great.

Cheers
Alex

@AL-Roberts
Author

Hi Alex,

Here's the error I get:
Mar 23 16:26:37 ..... reading from BAM, remove duplicates, output BAM
Segmentation fault (core dumped)

And attached is the Log.out file where you can see the STAR commands.
testLog.out.txt

Thank you,

Amy

@alexdobin
Owner

Hi Amy,

I could reproduce the seg-fault, will work to debug it.

Thanks!
Alex

@alexdobin
Owner

Hi Amy,

Actually, the seg-fault in my tests occurred because STAR could not open the file listed in --inputBAMfile.
Could you please check that the file exists, has proper read permission, and is a correctly formatted BAM (e.g. check that samtools index works)?
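
The three checks above (existence, read permission, BAM formatting) can be sketched in Python before passing the file to --inputBAMfile. This is a minimal illustration using only the standard library, not part of STAR; the function name `check_bam_readable` is hypothetical. It assumes only that BAM files are BGZF (gzip-compatible) streams whose decompressed content begins with the magic bytes `BAM\1`, and it does not validate the rest of the file.

```python
import gzip
import os


def check_bam_readable(path):
    """Return a list of problems found with a BAM path (empty list = looks OK).

    Hypothetical helper for illustration; only the first bytes are inspected.
    """
    problems = []
    if not os.path.isfile(path):
        problems.append("file does not exist")
        return problems
    if not os.access(path, os.R_OK):
        problems.append("file is not readable by the current user")
        return problems
    try:
        # BAM is BGZF-compressed, which the gzip module can decompress.
        with gzip.open(path, "rb") as handle:
            magic = handle.read(4)
    except (OSError, EOFError):
        problems.append("file is not gzip/BGZF compressed")
        return problems
    if magic != b"BAM\x01":
        problems.append("decompressed data does not start with the BAM magic bytes")
    return problems
```

An empty list only means the file passed these quick checks; running samtools index, as suggested above, remains the more thorough test.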

Thanks!
Alex

alexdobin added this to the 2.7.8b: bug-fix release milestone Mar 31, 2021

@AL-Roberts
Author

Hi Alex,

I was able to index the bam file using samtools without issues.

All the best,

Amy

@alexdobin
Owner

Hi Amy,

Could you try to deduplicate with the latest release, 2.7.8a?
If this does not work, could you subset the BAM file to a small set (~100k) of reads that still causes the seg-fault and send it to me?
I would need to replicate the problem on my system.
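
One way to produce such a subset is to filter the SAM text form of the file, keeping the header and the first N alignment records. The sketch below is an illustration only: `subset_sam` is a hypothetical name, and the approach is an assumption about what is convenient, not a procedure described in this thread. In a coordinate-sorted paired-end file, taking the first N records can separate mates, so the subset should be re-tested to confirm it still triggers the seg-fault.

```python
def subset_sam(lines, max_records=100_000):
    """Yield all SAM header lines plus the first `max_records` alignment records.

    `lines` is any iterable of SAM text lines, header first
    (for example the output of `samtools view -h`).
    """
    kept = 0
    for line in lines:
        if line.startswith("@"):
            # Header lines start with '@' and precede all alignments.
            yield line
        elif kept < max_records:
            kept += 1
            yield line
        else:
            # Enough alignment records collected; stop reading.
            break
```

In practice the input lines could come from `samtools view -h` piped into a small script that writes the result of `subset_sam(sys.stdin)` to standard output, which can then be converted back to BAM with `samtools view -b`.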

Cheers
Alex

@AL-Roberts
Author

Hi Alex,

Sorry for the delay. I've tested the deduplication with the latest release 2.7.8a and I get the same problem.

I will send you a subset of my BAM file by email. Please let me know if you need anything else.

All the best,

Amy

@alexdobin
Owner

Hi Amy,

Sorry for the long delay, this issue slipped my mind.
I finally figured out what the problem was: the file you sent me did not contain AS tags, which are required for deduplication. The read with the highest alignment score (AS) is selected as the representative, while the lower-scoring reads are considered duplicates.
I have added a check in the code to throw an error instead of a seg-fault.
Still, you would need to re-generate your BAM files with the AS tag. Another required tag is NH.
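
To illustrate the two points above, here is a minimal Python sketch that works on SAM text lines (for example from `samtools view`). It checks whether a record carries the AS and NH tags, and shows the "highest AS wins" idea for a group of alignments that have already been judged to be duplicates of one another. The function names are hypothetical, and this is not STAR's implementation: how STAR groups alignments into duplicate sets and how it breaks ties between equal scores are not described in this thread and are not modeled here.

```python
def optional_tags(sam_line):
    """Return {tag: value} for the optional fields of one SAM alignment line."""
    fields = sam_line.rstrip("\n").split("\t")
    tags = {}
    for field in fields[11:]:  # optional TAG:TYPE:VALUE fields start at column 12
        tag, _type, value = field.split(":", 2)
        tags[tag] = value
    return tags


def missing_required_tags(sam_line, required=("AS", "NH")):
    """List which of the required tags are absent from this alignment."""
    present = optional_tags(sam_line)
    return [tag for tag in required if tag not in present]


def pick_representative(duplicate_group):
    """From alignments assumed to be duplicates, keep the one with the highest AS.

    On ties, max() returns the first such record; this tie-break is an
    assumption of the sketch, not a statement about STAR's behavior.
    """
    return max(duplicate_group, key=lambda line: int(optional_tags(line)["AS"]))
```

If `missing_required_tags` reports anything for records in the input BAM, the file needs to be regenerated with those attributes included in the alignment step's `--outSAMattributes` setting.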

Cheers
Alex

@AL-Roberts
Author

AL-Roberts commented Jun 22, 2021 via email

alexdobin added a commit that referenced this issue Jun 25, 2021
…and AS tags for duplication removal jobs (--runMode inputAlignmentsFromBAM --bamRemoveDuplicatesType UniqueIdenticalNotMulti).