Download Newick labeled by colorBy #1236
Closed
Description of proposed changes
I was just doing some downstream phylogeography analyses on top of https://nextstrain.org/ncov/north-america, where I needed to work from a labeled Newick for compatibility with scripts I had made for BEAST trees that include phylogeography.
I could have modified the Augur script json_tree_to_nexus.py to collect a particular key from the JSON, but I thought the more useful direction would be to build this labeling into the Auspice download. This should surface the functionality to many more users.
BEAST trees that include traits (phylogeography or otherwise) use a format of [&country="Thailand"] to label nodes. This PR makes the Download Data modal download labeled Newicks, with labeling based on the current colorBy.
Some notes on implementation choices:
num_date and author download a regular non-labeled Newick.
Internal node names such as NODE_0000496 are kept rather than removed to fit with standard Newick format. I'm worried that a discordance between JSON from data.nextstrain.org and downloaded Newick will cause "gotcha" bugs.
Whitespace is removed from trait values, e.g. North America becomes NorthAmerica. FigTree errors if whitespace is included here.
Testing
I've tested this across a number of colorBys from different pathogens. It works for continuous trait values and for categorical trait values, and it sidesteps troublesome colorBys like num_date and author.
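For readers who want to see the labeling behavior from the implementation notes in concrete form, here is a minimal Python sketch. It is not the Auspice code from this PR: the node structure (name, branch_length, traits, children), the function names, and the unquoted rendering of numeric values are all assumptions made for illustration. It only mirrors the three behaviors described above: num_date and author fall back to an unlabeled Newick, internal node names are kept, and whitespace is stripped from trait values.

```python
# Hypothetical sketch of BEAST-style labeled Newick export.
# The tree structure and all names here are illustrative assumptions,
# not the Auspice JSON schema or the code in this PR.

# colorBys that fall back to a plain, non-labeled Newick (per the notes above)
UNLABELED_COLOR_BYS = {"num_date", "author"}


def format_label(color_by, value):
    """Return an annotation like [&country="Thailand"], or "" when unlabeled."""
    if color_by is None or color_by in UNLABELED_COLOR_BYS or value is None:
        return ""
    if isinstance(value, str):
        # Remove all whitespace, since FigTree errors on whitespace here.
        cleaned = "".join(value.split())
        return f'[&{color_by}="{cleaned}"]'
    # Assumption: continuous (numeric) values are written unquoted.
    return f"[&{color_by}={value}]"


def to_newick(node, color_by=None, is_root=True):
    """Serialize a nested dict tree to Newick, keeping internal node names."""
    children = node.get("children", [])
    subtree = ""
    if children:
        inner = ",".join(to_newick(child, color_by, False) for child in children)
        subtree = f"({inner})"
    label = format_label(color_by, node.get("traits", {}).get(color_by))
    text = f"{subtree}{node['name']}{label}"
    if "branch_length" in node:
        text += f":{node['branch_length']}"
    return (text + ";") if is_root else text


tree = {
    "name": "NODE_0000001",
    "traits": {"region": "North America"},
    "children": [
        {"name": "A", "branch_length": 0.1, "traits": {"region": "North America"}},
        {"name": "B", "branch_length": 0.2, "traits": {"region": "Asia"}},
    ],
}

print(to_newick(tree, "region"))
print(to_newick(tree, "num_date"))
```

With the toy tree above, the first call yields a Newick in which every node carries a region annotation with whitespace removed, and the second yields the plain Newick with node names only.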