Unstranded/sense/antisense for reads mapped to N_noFeature or mapped ambiguously #1796
Replies: 4 comments
-
The 2/3/4=Unstranded/Forward/Reverse columns are for different methods of assigning reads to genes.
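As a rough illustration of what "choosing a column" can look like in practice, here is a minimal Python sketch that reads ReadsPerGene.out.tab and keeps only one of the three count columns. It assumes the layout described in this thread (gene ID in column 1, then the Unstranded/Forward/Reverse counts in columns 2/3/4) and that the summary rows are named N_unmapped, N_multimapping, N_noFeature, and N_ambiguous. The function name, the `COLUMN_FOR` mapping, and the example rows are made up for illustration and are not part of STAR.

```python
import csv
import io

# 0-based index of each counting method in a ReadsPerGene.out.tab row.
# Index 0 is the gene ID, so the thread's "columns 2/3/4" are indices 1/2/3.
COLUMN_FOR = {"unstranded": 1, "forward": 2, "reverse": 3}

# Names of the summary rows assumed to sit at the top of the file.
SUMMARY_ROWS = {"N_unmapped", "N_multimapping", "N_noFeature", "N_ambiguous"}


def read_gene_counts(handle, strandedness):
    """Return (summary, counts) for a single counting method.

    summary maps the N_* row names to their values in the chosen column;
    counts maps gene IDs to their values in the chosen column.
    """
    col = COLUMN_FOR[strandedness]
    summary, counts = {}, {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        target = summary if row[0] in SUMMARY_ROWS else counts
        target[row[0]] = int(row[col])
    return summary, counts


# Made-up rows for illustration only, not real STAR output.
EXAMPLE = (
    "N_unmapped\t100\t100\t100\n"
    "N_multimapping\t50\t50\t50\n"
    "N_noFeature\t30\t900\t40\n"
    "N_ambiguous\t20\t2\t5\n"
    "geneA\t500\t10\t480\n"
    "geneB\t300\t5\t290\n"
)

summary, counts = read_gene_counts(io.StringIO(EXAMPLE), "reverse")
print(summary["N_noFeature"], counts)
```

With a real file you would pass an open file handle instead of the `io.StringIO` wrapper. As I read the STAR manual, the Forward column corresponds to libraries where read1 is on the same strand as the RNA and the Reverse column to libraries where read2 is, so a single column is used rather than a sum of columns.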
-
Dear Alex,
Thank you again for your clarification. I am continuing to use STAR and I still do not quite understand the strandedness issue. Let me ask a very specific question.
"You need to choose the column that agrees with the strandedness of your library prep": yes, sure, I understand this in general, but what exactly does it mean? Specifically, my libraries are stranded RNA-seq, sequenced with a paired-end Illumina protocol. Which read count should I use? The sum of read1 and read2? That would make sense, right, because both forward and reverse reads mapped to a gene meaningfully represent RNA abundance.
Incidentally, I tried to play around with --readStrand Reverse/Forward and I am getting this:
EXITING: FATAL INPUT ERROR: unrecognized parameter name "readStrand" in input "Command-Line-Initial"
SOLUTION: use correct parameter name (check the manual)
Also, the manual for the current version is not available: https://github.com/alexdobin/STAR/blob/master/doc/STARmanual.pdf returns "Error rendering embedded code Invalid PDF". I have the manual that I downloaded with one of the previous versions (2.7.0), which is where I am reading about --readStrand. Is it no longer supported by the current version?
Many thanks in advance,
Lev Yampolsky
From: Alexander Dobin ***@***.***>
Date: Friday, March 17, 2023 at 11:20 AM
To: alexdobin/STAR ***@***.***>
Cc: Yampolsky, Lev ***@***.***>, Author ***@***.***>
Subject: [EXTERNAL] Re: [alexdobin/STAR] Unstranded/sense/antisense for reads mapped to NoFeather or mapped ambiguously (Discussion #1796)
The 2/3/4=Unstranded/Forward/Reverse columns are for different methods of assigning reads to genes.
You need to choose the column that agrees with the strandedness of your library prep - this will represent the read count per gene.
Ambiguous reads overlap more than one gene, according to the strand method of each column.
-
Dear Alex,
I guess my misunderstanding boils down to this:
When STAR maps a read to the genome in a stranded manner, does it use only the read as sequenced, or the reverse complement of the read as well? If only the read as sequenced, then of course I should use just read1 for genes that are on "+" relative to the genome, and only read2 for those that are on "-". If both the read and its reverse complement are used, then both. But I am thinking now that "stranded" means just that: only the read itself is used.
My other questions remain: the current manual and the use of --readStrand.
Thanks!
Lev
From: Yampolsky, Lev ***@***.***>
Date: Wednesday, August 7, 2024 at 2:37 PM
To: alexdobin/STAR ***@***.***>
Subject: Re: [alexdobin/STAR] Unstranded/sense/antisense reads, --readStrand, AND Current manual no available
-
Since I still cannot get an answer to this, I must say, pretty crucial and fundamental question, let me ask it again in the most specific way possible. Which of the statements A, B, and C below is correct, or maybe none of them?

A. When mapping stranded reads to the genome, STAR maps both read1 and read2 to each strand of a gene, just keeping track of what mapped to what strand. This implies that there should be equal (or at least very similar) counts of read1 and read2 mapped to each gene. If A is correct, then in an RNA-seq experiment both reads reflect the frequency of a transcript, and the sum of the two should be used as a measure of transcription level.

I apologize if the answer to this is written somewhere in the paper or in the manual, and I just don't see something obvious. And thank you very much in advance to whoever might be able to answer this.

Best,
Lev Yampolsky
-
(columns 2 - 3 - 4 of ReadsPerGene.out.tab).
What is the meaning of these counts? OK, for no_feature mapping, this is relative to just the strand used as the reference (in which case it is interesting why there may be an asymmetry between sense and antisense). What about ambiguously mapped reads? What sets the strandedness for them: a random choice of one reference out of several?
What are the unstranded counts anyway, in stranded RNA-seq data?
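One way to look at the sense/antisense asymmetry asked about here is to total the per-gene counts in the Forward and Reverse columns and compare them. The Python sketch below does that and turns the comparison into a rough guess at which column matches the library. This is a common heuristic, not something stated in this thread or taken from STAR itself; the function names, the 0.8 threshold, and the example rows are all assumptions made up for illustration.

```python
import io

# Names of the summary rows assumed to sit at the top of the file.
SUMMARY_ROWS = {"N_unmapped", "N_multimapping", "N_noFeature", "N_ambiguous"}


def column_totals(handle):
    """Sum per-gene counts in the Unstranded/Forward/Reverse columns.

    Summary (N_*) rows and malformed lines are skipped.
    """
    totals = [0, 0, 0]
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4 or fields[0] in SUMMARY_ROWS:
            continue
        for i in range(3):
            totals[i] += int(fields[i + 1])
    return tuple(totals)


def guess_strandedness(forward, reverse, threshold=0.8):
    """Guess the matching column from the forward share of stranded counts.

    threshold is an arbitrary illustrative cutoff, not a STAR setting.
    """
    total = forward + reverse
    if total == 0:
        return "undetermined"
    share = forward / total
    if share >= threshold:
        return "forward"
    if share <= 1 - threshold:
        return "reverse"
    return "unstranded"


# Made-up rows for illustration only, not real STAR output.
EXAMPLE_ROWS = (
    "N_unmapped\t100\t100\t100\n"
    "N_multimapping\t50\t50\t50\n"
    "N_noFeature\t30\t900\t40\n"
    "N_ambiguous\t20\t2\t5\n"
    "geneA\t500\t10\t480\n"
    "geneB\t300\t5\t290\n"
)

unstranded_total, forward_total, reverse_total = column_totals(io.StringIO(EXAMPLE_ROWS))
print(guess_strandedness(forward_total, reverse_total))
```

In these made-up rows almost all gene-assigned counts fall in the Reverse column, and N_noFeature is correspondingly large in the Forward column, which is the kind of asymmetry one would expect when a stranded library is counted against the wrong strand.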