So far we have a simple search where genes are either on or off, and for this we can index the sparse gene expression matrix. To do a more sophisticated search where genes are hi/lo, we need the normalized matrix and a more elaborate indexing scheme.
We first defined a flow cytometric strategy to identify the known B cell subsets and plasma cells in intestinal mucosa and in circulation, identifying plasma cells as live CD45+CD38hiCD27+ cells and nonplasma cell B cells as live CD45+CD38−CD19+ cells. Among nonplasma cell B cells, naïve B cells were defined as CD45+CD38−CD19+IgD+IgM+ cells, whereas switched memory (SM) B cells were defined as CD45+CD38−CD19+IgD−IgM− cells (fig. S2).
So [CD45+CD38−CD19+] would be a possible search query.
For the simple binary scheme already implemented, each cell has a genes field containing the ids of the genes that are non-null in that cell. A search is then a bool query whose must clause holds one match query per gene id in the query.
Now, as an example, if CD45 has gene id 45 and CD38 has gene id 38, then the query [CD45+ CD38+] is [genes=45 and genes=38] in ES search terms. Note that we can’t search for a low value such as CD38-, since we can only search for the presence of a gene, not its absence.
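As a rough sketch of the binary case, the request body could be built as below. `binary_query` is a hypothetical helper, the ids 45 and 38 are the illustrative ids from the example above, and the `genes` field name follows the description above. The body uses the standard Elasticsearch `bool`/`must`/`match` query DSL, but it has not been checked against the actual index mapping.

```python
import json


def binary_query(gene_ids):
    """Build a request body that requires every given gene id to be present.

    Hypothetical helper: one `match` clause per gene id inside `bool.must`,
    so a cell only matches if all of the ids appear in its `genes` field.
    """
    return {
        "query": {
            "bool": {
                # ids are sent as strings; this is an assumption about the mapping
                "must": [{"match": {"genes": str(g)}} for g in gene_ids]
            }
        }
    }


# [CD45+ CD38+] with the example ids 45 and 38
print(json.dumps(binary_query([45, 38]), indent=2))
```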
To support hi/lo and +/-, we first have to define what these mean. Presumably they are conventions about percentiles (or standard deviations) in the normalized matrix. For example, + could mean any value greater than zero (i.e. a positive sd), - any value below zero (a negative sd), hi any value >= 1 sd, and so on.
Then, at indexing time, we again have a single genes field, but each value is bucketed by the sd range it falls in. For example, a value between 0 and 1 sd is assigned bucket +1, a value between 1 and 2 sd bucket +2, and a value over 2 sd bucket +3. Values less than 0 are bucketed similarly (-1, -2, -3). The bucket is then appended to the gene id, so 45+1 means that gene 45 has a value between 0 and 1 sd in that cell.
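The bucketing step might look roughly like the sketch below. `bucket_token` and `cell_tokens` are hypothetical names, and the handling of the boundaries is an assumption: a value of exactly 1 sd goes to bucket 2 (so that hi, defined as >= 1 sd, maps to buckets 2 and 3), a value of exactly 2 sd goes to bucket 3, and a value of exactly 0 is not indexed at all.

```python
import math


def bucket_token(gene_id, z):
    """Return the token to index for a gene whose normalized value is z (in sd)."""
    if z == 0:
        return None  # assumption: exactly zero is neither + nor -, so skip it
    sign = "+" if z > 0 else "-"
    # (0, 1) -> 1, [1, 2) -> 2, [2, inf) -> 3; mirrored for negative values
    level = min(3, math.floor(abs(z)) + 1)
    return f"{gene_id}{sign}{level}"


def cell_tokens(normalized):
    """Map one cell's {gene id: normalized value} dict to its `genes` tokens."""
    tokens = (bucket_token(g, z) for g, z in normalized.items())
    return [t for t in tokens if t is not None]


print(cell_tokens({45: 0.4, 38: -1.7, 19: 2.5}))  # ['45+1', '38-2', '19+3']
```

One caveat, stated as an assumption rather than a tested fact: tokens such as `45+1` contain punctuation, so the genes field would probably need to be mapped as a `keyword` field (or use a whitespace-style analyzer) to keep each token intact at index and query time.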
Then the query [CD45+ CD38-] becomes [(genes=45+1 or genes=45+2 or genes=45+3) and (genes=38-1 or genes=38-2 or genes=38-3)] in ES search terms.
Similarly, [CD45+ CD38hi] becomes [(genes=45+1 or genes=45+2 or genes=45+3) and (genes=38+2 or genes=38+3)].
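The query translation above can be sketched as follows. Everything named here is an assumption for illustration: `parse_query`, the `GENE_IDS` lookup (using the example ids), the requirement that terms are separated by whitespace, and the `lo` convention, which simply mirrors `hi`. Each "or" group is expressed with the Elasticsearch `terms` query, which matches a document containing any of the listed values, and the groups are combined under `bool.must`.

```python
import re

# Suffix -> buckets, following the conventions discussed above.
# "lo" is an assumed mirror image of "hi".
SUFFIX_BUCKETS = {
    "+": ["+1", "+2", "+3"],
    "-": ["-1", "-2", "-3"],
    "hi": ["+2", "+3"],
    "lo": ["-2", "-3"],
}

# Hypothetical marker name -> gene id lookup, using the ids from the example.
GENE_IDS = {"CD45": 45, "CD38": 38}

TERM_RE = re.compile(r"^(?P<name>[A-Za-z0-9]+?)(?P<suffix>hi|lo|\+|-)$")


def parse_query(text):
    """Turn a query such as 'CD45+ CD38hi' into an Elasticsearch request body."""
    must = []
    # Accept the typographic minus sign (U+2212) as well as ASCII '-'.
    for term in text.replace("\u2212", "-").split():
        m = TERM_RE.match(term)
        if m is None:
            raise ValueError(f"cannot parse query term: {term!r}")
        gene_id = GENE_IDS[m.group("name")]
        tokens = [f"{gene_id}{b}" for b in SUFFIX_BUCKETS[m.group("suffix")]]
        # One `terms` clause per gene: any of its buckets may match (the "or").
        must.append({"terms": {"genes": tokens}})
    # All genes must match (the "and").
    return {"query": {"bool": {"must": must}}}


print(parse_query("CD45+ CD38hi"))
```

Since `terms` does not analyze its input, this sketch only works if the indexed tokens are stored unmodified, for example in a `keyword` field.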
Source of the quoted example: http://stm.sciencemag.org/content/10/461/eaau4711