Integrating FASTA Sequence into Influenza A/H3N2 Evolution Analysis and Visualizing in Nextclade #147

Closed
sekhwal opened this issue Feb 11, 2024 · 6 comments
Labels
enhancement New feature or request

Comments

@sekhwal

sekhwal commented Feb 11, 2024

I'm looking to integrate my FASTA sequence into the evolutionary analysis of Influenza A/H3N2 using Nextclade and visualize it appropriately. I used Nextclade for the analysis, but I could not add year information to the x-axis of the phylogenetic tree. Any suggestions would be appreciated.

@joverlee521
Contributor

Hi @sekhwal,

As @rneher stated in the discussion forum, Nextclade does not support time-scaled trees so you will have to run a full Nextstrain phylogenetic workflow to create the time-scaled tree.

We are currently lacking documentation on how to run the seasonal flu workflow with custom sequences. The easiest way to get started for now will be to follow the Quickstart with GISAID data.

@sekhwal
Author

sekhwal commented Feb 13, 2024 via email

@joverlee521
Contributor

However, I could not find the "EpiFlu" link in the top navigation bar at GISAID (https://gisaid.org/). I am not sure if I have to register at GISAID to get the EpiFlu link.

You will need to register at GISAID in order to access and download data from them.

Also, please let me know how to get "profiles/gisaid/builds.yaml". If you could provide a template to prepare "builds.yaml", that would be great.

You can start with the existing profiles/gisaid/builds.yaml file in this repo.

@sekhwal
Author

sekhwal commented Feb 14, 2024

I have some more follow-up questions.

  1. Downloading the sequences from GISAID takes a very long time, and it allows downloading only 20,000 sequences. Also, when running nextstrain build with the following command, where do I provide the downloaded FASTA files as input?

nextstrain build . --configfile profiles/gisaid/builds.yaml --use-conda --conda-frontend mamba

  2. In addition, should I download the "seasonal-flu" GitHub repo?

  3. In builds.yaml, do I need to change anything in the following part? Where should I provide the metadata file?

reference: "config/h3n2/{segment}/reference.fasta"
annotation: "config/h3n2/{segment}/genemap.gff"
tree_exclude_sites: "config/h3n2/{segment}/exclude-sites.txt"
clades: "config/h3n2/ha/clades.tsv"
subclades: "config/h3n2/ha/subclades.tsv"
auspice_config: "config/h3n2/auspice_config.json"

@joverlee521
Contributor

Also, when running nextstrain build with the following command, where do I provide the downloaded FASTA files as input?

Following the Quickstart with GISAID data, please move your downloaded files to data/h3n2/metadata.xls and data/h3n2/raw_sequences_ha.fasta.
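
As a rough sketch of that step, the helper below copies two downloaded files to the paths named above. The function name `stage_gisaid_inputs` and the example source file names are made up for illustration (GISAID export names vary), and it copies rather than moves so the original downloads stay in place.

```shell
# Hypothetical helper (not part of the seasonal-flu repo): copy GISAID
# downloads to the locations the Quickstart expects for an H3N2 HA build.
stage_gisaid_inputs() {
    metadata_src=$1          # downloaded metadata spreadsheet
    sequences_src=$2         # downloaded HA sequences in FASTA format
    repo_dir=${3:-.}         # root of the seasonal-flu checkout (assumed)

    # Stop early if either download is missing.
    if [ ! -f "$metadata_src" ] || [ ! -f "$sequences_src" ]; then
        echo "missing input file" >&2
        return 1
    fi

    mkdir -p "$repo_dir/data/h3n2"
    cp "$metadata_src" "$repo_dir/data/h3n2/metadata.xls"
    cp "$sequences_src" "$repo_dir/data/h3n2/raw_sequences_ha.fasta"
}

# Example call; both source paths are placeholders:
# stage_gisaid_inputs ~/Downloads/gisaid_metadata.xls ~/Downloads/gisaid_ha.fasta .
```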

In addition, should I download the "seasonal-flu" GitHub repo?

Yes, you will need to download the seasonal flu repo to run the workflow.

In builds.yaml, do I need to change anything in the following part?

Try using the default values first to produce the build. Then if you would like to make adjustments, you can edit the parameters in the builds.yaml file.
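
Before starting a first build with the defaults, it can help to confirm that the files mentioned in this thread are where the workflow will look for them. The check below is only a sketch: `check_build_inputs` is a hypothetical name, and the list of paths is limited to the three named in this thread, not everything the workflow reads.

```shell
# Hypothetical pre-flight check (not part of the seasonal-flu repo).
# Reports which of the files named in this thread are missing.
check_build_inputs() {
    repo_dir=${1:-.}         # root of the seasonal-flu checkout (assumed)
    missing=0
    for rel in \
        profiles/gisaid/builds.yaml \
        data/h3n2/metadata.xls \
        data/h3n2/raw_sequences_ha.fasta
    do
        if [ ! -f "$repo_dir/$rel" ]; then
            echo "missing: $rel" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Possible use: only start the build when every file is present.
# check_build_inputs . && nextstrain build . --configfile profiles/gisaid/builds.yaml
```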

@joverlee521
Contributor

Closing since the conversation has continued in #149.
