Download Newick labeled by colorBy #1236

Closed


@trvrb trvrb commented Nov 22, 2020

Description of proposed changes

I was just doing some downstream phylogeography analyses on top of https://nextstrain.org/ncov/north-america, where I needed to work from a labeled Newick for compatibility with scripts I had made for BEAST trees that include phylogeography.

I could have modified the Augur script json_tree_to_nexus.py to collect a particular key from the JSON, but I thought the more useful direction would be to build this labeling into the Auspice download, which should surface the functionality to many more users.

BEAST trees that include traits (phylogeography or otherwise) use a format of [&country="Thailand"] to label nodes. This PR makes the Download Data modal download labeled Newicks with labeling based on current colorBy.

Some notes on implementation choices:

  • num_date and author download a regular, non-labeled Newick
  • I've removed internal node names (e.g. NODE_0000496) to fit with standard Newick format; I'm worried that a discordance between the JSON from data.nextstrain.org and the downloaded Newick will cause "gotcha" bugs
  • I've quoted tip names with single quotes (')
  • I've removed whitespace from labels, e.g. changing North America to NorthAmerica. FigTree errors if whitespace is included here.
  • The Download Data modal tells users when the Newick will include colorBy labeling
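
To make the choices above concrete, here is a minimal Python sketch of the labeling logic. It is not the code in this PR (which lives in Auspice's JavaScript download code). It assumes an Auspice-style JSON tree in which each node has a `name`, an optional `children` list, and `node_attrs` holding a cumulative `div` plus traits stored as `{"value": ...}`. The names `labeled_newick`, `to_newick`, and `annotation` are hypothetical, as is the placement of the annotation between the node label and the branch length.

```python
import re

# colorBys that fall back to a plain, non-labeled Newick (per the notes above)
UNLABELED_COLOR_BYS = {"num_date", "author"}


def annotation(node, color_by):
    """Return a BEAST-style [&key=value] comment for this node, or ''."""
    if color_by is None or color_by in UNLABELED_COLOR_BYS:
        return ""
    attr = node.get("node_attrs", {}).get(color_by)
    # Assumption: traits are stored as {"value": ...}; skip anything else.
    if not isinstance(attr, dict) or "value" not in attr:
        return ""
    value = attr["value"]
    if isinstance(value, (int, float)):
        # Continuous traits are written unquoted.
        return f"[&{color_by}={value}]"
    # Categorical traits: strip whitespace, since FigTree errors on it.
    value = re.sub(r"\s+", "", str(value))
    return f'[&{color_by}="{value}"]'


def to_newick(node, color_by=None, parent_div=0.0):
    """Recursively serialize a node; branch length is the change in div."""
    div = float(node.get("node_attrs", {}).get("div", parent_div))
    length = div - float(parent_div)
    children = node.get("children", [])
    if children:
        inner = ",".join(to_newick(child, color_by, div) for child in children)
        label = f"({inner})"  # internal node names are dropped
    else:
        # Tip names are single-quoted; embedded quotes are doubled.
        label = "'" + node["name"].replace("'", "''") + "'"
    return f"{label}{annotation(node, color_by)}:{length}"


def labeled_newick(tree, color_by=None):
    """Serialize the whole tree, terminated with ';'."""
    return to_newick(tree, color_by) + ";"
```

Calling `labeled_newick(tree, "country")` on such a tree would annotate every node that carries a `country` value, while `labeled_newick(tree, "num_date")` would return the same topology with no annotations.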


Testing

I've tested this across a number of colorBys from different pathogens. It works for both continuous and categorical trait values, and it sidesteps troublesome colorBys like num_date and author.

@jameshadfield

Closing in favor of #1245 which uses this approach (and code!) to allow Nexus trees with annotations 😄
