Polymorphic Gene Regions #790

kbergin · 2019-09-03T15:38:16Z

From Dario Strbenac on HCA Zendesk (dstr7320@uni.sydney.edu.au):

I am testing the data portal and noticed a couple of issues regarding polymorphic genes. RNA-seq reads for these genes are processed with the same algorithms as reads for all other genes. This is unsuitable because these genes have many variants in the human population and are very similar to each other. For example, in recent work we found that patients who did not express HLA-G according to laboratory experiments nevertheless had high HLA-G counts by RNA-seq. Upon further investigation, we realised that the reads mapping to HLA-G had a mismatch score only 1 less than their mismatch score against HLA-A in the reference genome. The alternative approach we implemented is:

1. Replace the hg38 sequence at the HLA and KIR gene loci with N to force reads not to map there.
2. Use an RNA-seq aligner to map the reads to the modified reference genome sequence and output the unmapped reads to a separate FASTQ file.
3. Take the unmapped reads and the IMGT HLA database (which contains thousands of alleles for each gene) and use RSEM to determine where the reads should really be assigned.
4. Use the reads mapped to the masked reference sequence to process all other genes (i.e. the non-polymorphic ones).
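
The first step above (masking loci with N) can be sketched roughly as follows. This is a minimal illustration, not the reporter's actual implementation: the function name `mask_regions`, the in-memory dictionary representation of the genome, and the toy coordinates are all assumptions made for the example. Real HLA and KIR coordinates would have to come from a gene annotation for hg38 and are not given in this issue.

```python
from typing import Dict, Iterable, Tuple

# (contig, start, end) with 0-based, half-open coordinates, as in BED files.
Region = Tuple[str, int, int]


def mask_regions(genome: Dict[str, str], regions: Iterable[Region]) -> Dict[str, str]:
    """Return a copy of `genome` in which every listed region is replaced by N.

    `genome` maps contig names to sequences. Regions extending past the end
    of a contig are clipped. The input dictionary is not modified.
    """
    # bytearray allows in-place edits without one Python object per base.
    masked = {name: bytearray(seq, "ascii") for name, seq in genome.items()}
    for contig, start, end in regions:
        seq = masked[contig]
        start, end = max(0, start), min(len(seq), end)
        if start < end:
            # Same-length replacement, so downstream coordinates do not shift.
            seq[start:end] = b"N" * (end - start)
    return {name: seq.decode("ascii") for name, seq in masked.items()}


# Toy example with made-up sequences and coordinates (hypothetical values).
toy_genome = {"chr6": "ACGTACGTAC", "chr19": "GGGGCCCC"}
toy_regions = [("chr6", 2, 6), ("chr19", 6, 100)]
masked_genome = mask_regions(toy_genome, toy_regions)
```

Replacing bases with N rather than deleting them keeps every other gene at its original coordinates, so existing annotations can still be used for the non-polymorphic genes in the last step.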

We found that with this approach the results matched the biologists' experimental results and avoided reference sequence bias. Such bias is usually not a problem for most genes in the genome, which are highly conserved and, unlike the HLA and KIR genes, do not have paralogs.

AC:

Determine if/when/why/how we want to handle polymorphic gene regions.
TODO: Kylee to better understand the use case for doing this at all.

┆Issue is synchronized with this Jira Spike

kbergin added the pipelines label Sep 3, 2019
