This repository has been archived by the owner on Jul 20, 2021. It is now read-only.

Add reference for sequencing bias term #273

Open
corburn opened this issue Apr 11, 2017 · 0 comments

corburn commented Apr 11, 2017

http://readiab.org/book/0.1.3/2/5

The term "sequencing bias" is mentioned once, in section 2.5, without a definition or a reference. The passage in question reads:

Another application of grouping similar sequences (or OTU clustering, or OTU picking, as it is sometimes referred to) is in grouping sequences in a database before investigating them, to reduce taxonomic bias in the database. For example, E. coli is one of the most heavily sequenced microbes. If you're interested in understanding the frequency of variants of a specific gene across a range of microbial diversity, you might begin by obtaining all sequences of that gene from GenBank. Because there may be many more E. coli sequences, purely because of sequencing bias, you'd likely want to group your sequences into OTUs before computing variant frequencies, so your calculations are not biased toward the frequencies in E. coli, as hundreds of E. coli sequences would likely group to one or a few closely related OTUs. In other words, you're trying to find a divergent set of sequences to work with (and an aptly named tool was published in 2006 to automate this process).

IAB could expand on the topic or reference another source for more information.
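
To make the effect described in the quoted passage concrete, here is a minimal sketch of how grouping sequences before counting changes the variant frequencies. It is an illustration under stated assumptions, not the book's method: the sequences are invented toy data, the function names (`hamming_identity`, `greedy_otus`, `variant_frequencies`) are hypothetical, and identity is computed as a simple per-position match fraction on equal-length strings rather than from a real alignment.

```python
from collections import Counter


def hamming_identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(seqs, threshold=0.9):
    """Greedy clustering sketch: each sequence joins the first centroid it
    matches at >= threshold identity, otherwise it founds a new OTU."""
    centroids, clusters = [], []
    for s in seqs:
        for i, c in enumerate(centroids):
            if hamming_identity(s, c) >= threshold:
                clusters[i].append(s)
                break
        else:
            centroids.append(s)
            clusters.append([s])
    return centroids, clusters


def variant_frequencies(seqs, position):
    """Relative frequency of each base observed at one position."""
    counts = Counter(s[position] for s in seqs)
    total = sum(counts.values())
    return {base: n / total for base, n in counts.items()}


# Invented toy data: eight near-identical sequences standing in for a
# heavily sequenced organism, plus one sequence each from two other taxa.
seqs = (
    ["ACGTACGTAC"] * 6
    + ["ACGTACGTAT"] * 2
    + ["GCTTAGGTCA", "TCGAACCTTG"]
)

centroids, clusters = greedy_otus(seqs, threshold=0.9)

# Frequencies at position 0, before and after collapsing to one
# representative (the centroid) per OTU.
raw = variant_frequencies(seqs, position=0)
dedup = variant_frequencies(centroids, position=0)

print("OTUs:", len(centroids))
print("raw frequencies:", raw)
print("per-OTU frequencies:", dedup)
```

In this toy set the over-represented group dominates the raw frequencies, while counting one representative per OTU weights each group equally. Real OTU picking would use alignment-based identity and a dedicated tool.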
