V1.2.0 Update #58

Merged 25 commits on Aug 6, 2024
Commits
573e1a6 Version updates (dthoward96, Feb 23, 2024)
5dbb982 Env update (dthoward96, Feb 23, 2024)
6e7a18c Delete .github/workflows/python-package-mamba.yml (dthoward96, Feb 23, 2024)
ac2007b pandera schema update (dthoward96, Mar 29, 2024)
85b5ea0 Merge remote-tracking branch 'refs/remotes/origin/v1.1.0' into v1.1.0 (dthoward96, Mar 29, 2024)
e026f4f Delete gisaid_cli/poxCLI directory (dthoward96, Mar 29, 2024)
16d335c bug fixes (dthoward96, Apr 9, 2024)
d984065 Merge branch 'v1.2.0' of https://github.com/CDCgov/seqsender into v1.2.0 (dthoward96, Apr 11, 2024)
6814281 Delete FLU_test directory (dthoward96, Apr 11, 2024)
70eb75d Delete OTHER_species directory (dthoward96, Apr 11, 2024)
9f6eac5 Delete POX_species directory (dthoward96, Apr 11, 2024)
ea50744 Merge branch 'v1.2.0' of https://github.com/CDCgov/seqsender (dthoward96, Apr 11, 2024)
3658d6c mypy validation added (dthoward96, Apr 18, 2024)
f6516b7 Merge branch 'v1.2.0' of https://github.com/CDCgov/seqsender into v1.2.0 (dthoward96, Apr 18, 2024)
44ec6f9 Merge branch 'master' into v1.2.0 (dthoward96, Apr 19, 2024)
228d100 mypy integration (dthoward96, Apr 19, 2024)
cbda8c7 Shiny Update (dthoward96, Jun 3, 2024)
4e0bba2 Seqsender v1.2.0 website updates (dthoward96, Jun 17, 2024)
9b680ec Update README.md (dthoward96, Jun 17, 2024)
bf4c636 Update README.md (dthoward96, Jun 17, 2024)
a631c7f shiny website updates (dthoward96, Jun 17, 2024)
56ea569 Seqsender shiny updates (dthoward96, Jun 17, 2024)
4886094 V1.2.0 Prod Update (dthoward96, Aug 6, 2024)
2b88024 Merge branch 'master' into v1.2.0 (dthoward96, Aug 6, 2024)
62df1f3 Merge remote-tracking branch 'refs/remotes/origin/v1.2.0' into v1.2.0 (dthoward96, Aug 6, 2024)
40 changes: 6 additions & 34 deletions README.Rmd
@@ -26,15 +26,15 @@ github_pages_url <- description$GITHUB_PAGES

<p style="font-size: 16px;"><em>Public Database Submission Pipeline</em></p>

-**Beta Version**: `r version`. This pipeline is currently in Beta testing, and issues could appear during submission. Please use it at your own risk. Feedback and suggestions are welcome!
+**Beta Version**: v1.2.0. This pipeline is currently in Beta testing, and issues could appear during submission. Please use it at your own risk. Feedback and suggestions are welcome!

**General Disclaimer**: This repository was created for use by CDC programs to collaborate on public health related projects in support of the [CDC mission](https://www.cdc.gov/about/organization/mission.htm). GitHub is not hosted by the CDC, but is a third party website used by CDC and its partners to share information and collaborate on software. CDC use of GitHub does not imply an endorsement of any one particular service, product, or enterprise.

-# [Documentation](`r github_pages_url`/index.html)
+# [Documentation](https://dthoward96.github.io/seqsender_test_website/)

## Overview

-``r program`` is a Python program that is developed to automate the process of generating necessary submission files and batch uploading them to <ins>NCBI archives</ins> (such as **BioSample**, **SRA**, and **Genbank**) and <ins>GISAID databases</ins> (e.g. **EpiFlu** and **EpiCoV**). Presently, the pipeline is capable of uploading **Influenza A Virus** (FLU) and **SARS-COV-2** (COV) data. However, the dynamic nature of this pipeline can allow for additional uploads of other organisms in future updates or requests.
+``r program`` is a Python program that is developed to automate the process of generating necessary submission files and batch uploading them to <ins>NCBI archives</ins> (such as **BioSample**, **SRA**, and **Genbank**) and <ins>GISAID databases</ins> (e.g. **EpiFlu**, **EpiCoV**, **EpiPox**, **EpiArbo**). Presently, the pipeline is capable of uploading **Influenza A Virus** (FLU), **SARS-COV-2** (COV), **Monkeypox** (POX), **Arbovirus** (ARBO), and a wide variety of other organisms. If you'd like to have ``r program`` support your virus create a issue.

## Contacts

@@ -58,15 +58,15 @@ github_pages_url <- description$GITHUB_PAGES

4. Refer to this page for information regarding requirements for GenBank submissions via FTP only. This page applies only for COVID and Influenza [NCBI GenBank FTP Submissions](https://submit.ncbi.nlm.nih.gov/sarscov2/genbank/#step5) For further questions contact <a href="mailto:gb-admin@ncbi.nlm.nih.gov">gb-admin@ncbi.nlm.nih.gov</a> to discuss requirements for submissions.

-5. Coordinate a NCBI namespace name (**spuid_namespace**) that will be used with Submitter Provided Unique Identifiers (**spuid**) in the submission. The liaison of **spuid_namespace** and **spuid** is used to report back assigned accessions as well as for cross-linking objects within submission. The values of **spuid_namespace** are up to the submitter to decide but they must be unique and well-coordinated prior to make a submission. For more information about these two fields, see [BioSample](`r github_pages_url`/articles/biosample_submission.html#metadata) / [SRA](`r github_pages_url`/articles/sra_submission.html#metadata) / [GENBANK](`r github_pages_url`/articles/genbank_submission.html#metadata) metadata requirements.
+5. Coordinate a NCBI namespace name (**spuid_namespace**) that will be used with Submitter Provided Unique Identifiers (**spuid**) in the submission. The liaison of **spuid_namespace** and **spuid** is used to report back assigned accessions as well as for cross-linking objects within submission. The values of **spuid_namespace** are up to the submitter to decide but they must be unique and well-coordinated prior to make a submission.

- **GISAID Submissions**

-``r program`` makes use of GISAID's Command Line Interface tools to bulk uploading meta- and sequence-data to GISAID databases. Presently, the pipeline only allows upload to EpiFlu (**Influenza A Virus**) and EpiCoV (**SARS-COV-2**) databases. Before uploading, submitter needs to
+``r program`` makes use of GISAID's Command Line Interface tools to bulk uploading meta- and sequence-data to GISAID databases. Presently, the pipeline supports upload to EpiFlu (**Influenza A Virus**), EpiCoV (**SARS-COV-2**), EpiPox (**Monkeypox**), and EpiArbo (**Arbovirus**). Before uploading, submitter needs to

1. Have a GISAID account. To sign up, visit [GISAID Platform](https://gisaid.org/).

-2. Request a client-ID for EpiFlu or EpiCoV database in order to use its CLI tool. The CLI utilizes the client-ID along with the username and password to authenticate the database prior to make a submission. To obtain a client-ID, please email <a href="mailto:clisupport@gisaid.org" >clisupport@gisaid.org</a> to request. _**Important note**: If submitter would like to upload a "test" submission first to familiarize themselves with the submission process prior to make a real submission, one should additionally request a test client-id to perform such submissions._
+2. Request a client-ID for your specified Epi(Flu/CoV/Pox/Arbo) database in order to use its CLI tool. The CLI utilizes the client-ID along with the username and password to authenticate the database prior to make a submission. To obtain a client-ID, please email <a href="mailto:clisupport@gisaid.org" >clisupport@gisaid.org</a> to request. _**Important note**: If submitter would like to upload a "test" submission first to familiarize themselves with the submission process prior to make a real submission, one should additionally request a test client-id to perform such submissions._

3. Download the <a href="`r github_pages_url`/articles/images/fluCLI_download.png" target="_blank">EpiFlu</a> or <a href="`r github_pages_url`/articles/images/covCLI_download.png" target="_blank">EpiCoV</a> CLI from the **GISAID platform** and stored them in the destination of choice prior to perform a batch upload.

@@ -75,34 +75,6 @@ Here is a quick look of where to store the downloaded **GISAID CLI** package.
![](man/figures/gisaid_cli_dir.png)



-## Requirement Files
-
-Before submitters can perform a batch submission using ``r program``, they must make sure the requirement files (such as *config.yaml*, *metadata.csv*, *sequence.fasta*, *raw reads*, etc.) are already prepared and stored in a submission directory of choice.
-
-(a) To prep for FLU submissions, select one of the databases below to get started:
-
-> <a href="`r github_pages_url`/articles/biosample_submission.html" target="_blank">BioSample</a> <br>
-> <a href="`r github_pages_url`/articles/sra_submission.html" target="_blank">SRA</a> <br>
-> <a href="`r github_pages_url`/articles/genbank_submission.html" target="_blank">Genbank</a> <br>
-> <a href="`r github_pages_url`/articles/gisaid_flu_submission.html" target="_blank">GISAID</a> <br>
-<!-- > <a href="`r github_pages_url`/articles/multiple_databases_flu_submission.html" target="_blank">Multiple databases</a> -->
-
-(b) To prep for COV submissions, select one of the databases below to get started:
-
-> <a href="`r github_pages_url`/articles/biosample_submission.html" target="_blank">BioSample</a> <br>
-> <a href="`r github_pages_url`/articles/sra_submission.html" target="_blank">SRA</a> <br>
-> <a href="`r github_pages_url`/articles/genbank_submission.html" target="_blank">Genbank</a> <br>
-> <a href="`r github_pages_url`/articles/gisaid_cov_submission.html" target="_blank">GISAID</a> <br>
-<!-- > <a href="`r github_pages_url`/articles/multiple_databases_cov_submission.html" target="_blank">Multiple databases</a> -->
-
-## Quick Start
-
-- [How to run seqsender locally](`r github_pages_url`/articles/local_installation.html)
-- [How to run seqsender with Docker](`r github_pages_url`/articles/docker_installation.html)
-- [How to run seqsender with Compose](`r github_pages_url`/articles/compose_installation.html)
-- [How to run seqsender with Singularity](`r github_pages_url`/articles/singularity_installation.html)
-
## Code Attributions

Dakota Howard and Reina Chau for majority of the code base with input and testing from [colleagues](`r github_pages_url`/authors.html).
82 changes: 17 additions & 65 deletions README.md
@@ -11,7 +11,7 @@

</p>

-**Beta Version**: 1.1.0. This pipeline is currently in Beta testing, and
+**Beta Version**: 1.2.0. This pipeline is currently in Beta testing, and
issues could appear during submission. Please use it at your own risk.
Feedback and suggestions are welcome\!

@@ -23,18 +23,19 @@ CDC and its partners to share information and collaborate on software.
CDC use of GitHub does not imply an endorsement of any one particular
service, product, or enterprise.

-# [Documentation](https://cdcgov.github.io/seqsender/index.html)
+# [Documentation](https://dthoward96.github.io/seqsender_test_website/)

## Overview

`seqsender` is a Python program that is developed to automate the
process of generating necessary submission files and batch uploading
them to <ins>NCBI archives</ins> (such as **BioSample**, **SRA**, and
-**Genbank**) and <ins>GISAID databases</ins> (e.g. **EpiFlu** and
-**EpiCoV**). Presently, the pipeline is capable of uploading **Influenza
-A Virus** (FLU) and **SARS-COV-2** (COV) data. However, the dynamic
-nature of this pipeline can allow for additional uploads of other
-organisms in future updates or requests.
+**Genbank**) and <ins>GISAID databases</ins> (e.g. **EpiFlu**,
+**EpiCoV**, **EpiPox**, **EpiArbo**). Presently, the pipeline is capable
+of uploading **Influenza A Virus** (FLU), **SARS-COV-2** (COV),
+**Monkeypox** (POX), **Arbovirus** (ARBO), and a wide variety of other
+organisms. If you’d like to have `seqsender` support your virus create a
+issue.

## Contacts

@@ -84,31 +85,26 @@ FTP on the command line. Before attempting to submit a submission using
used to report back assigned accessions as well as for cross-linking
objects within submission. The values of **spuid\_namespace** are up
to the submitter to decide but they must be unique and
-well-coordinated prior to make a submission. For more information
-about these two fields, see
-[BioSample](https://cdcgov.github.io/seqsender/articles/biosample_submission.html#metadata)
-/
-[SRA](https://cdcgov.github.io/seqsender/articles/sra_submission.html#metadata)
-/
-[GENBANK](https://cdcgov.github.io/seqsender/articles/genbank_submission.html#metadata)
-metadata requirements.
+well-coordinated prior to make a submission.

<!-- end list -->

- **GISAID Submissions**

`seqsender` makes use of GISAID’s Command Line Interface tools to bulk
uploading meta- and sequence-data to GISAID databases. Presently, the
-pipeline only allows upload to EpiFlu (**Influenza A Virus**) and EpiCoV
-(**SARS-COV-2**) databases. Before uploading, submitter needs to
+pipeline supports upload to EpiFlu (**Influenza A Virus**), EpiCoV
+(**SARS-COV-2**), EpiPox (**Monkeypox**), and EpiArbo (**Arbovirus**).
+Before uploading, submitter needs to

1. Have a GISAID account. To sign up, visit [GISAID
Platform](https://gisaid.org/).

-2. Request a client-ID for EpiFlu or EpiCoV database in order to use
-its CLI tool. The CLI utilizes the client-ID along with the username
-and password to authenticate the database prior to make a
-submission. To obtain a client-ID, please email
+2. Request a client-ID for your specified Epi(Flu/CoV/Pox/Arbo)
+database in order to use its CLI tool. The CLI utilizes the
+client-ID along with the username and password to authenticate the
+database prior to make a submission. To obtain a client-ID, please
+email
<a href="mailto:clisupport@gisaid.org" >clisupport@gisaid.org</a> to
request. ***Important note**: If submitter would like to upload a
“test” submission first to familiarize themselves with the
@@ -127,50 +123,6 @@ package.

![](man/figures/gisaid_cli_dir.png)

-## Requirement Files
-
-Before submitters can perform a batch submission using `seqsender`, they
-must make sure the requirement files (such as *config.yaml*,
-*metadata.csv*, *sequence.fasta*, *raw reads*, etc.) are already
-prepared and stored in a submission directory of choice.
-
-1) To prep for FLU submissions, select one of the databases below to
-get started:
-
-> <a href="https://cdcgov.github.io/seqsender/articles/biosample_submission.html" target="_blank">BioSample</a>
-> <br>
-> <a href="https://cdcgov.github.io/seqsender/articles/sra_submission.html" target="_blank">SRA</a>
-> <br>
-> <a href="https://cdcgov.github.io/seqsender/articles/genbank_submission.html" target="_blank">Genbank</a>
-> <br>
-> <a href="https://cdcgov.github.io/seqsender/articles/gisaid_flu_submission.html" target="_blank">GISAID</a>
-> <br>
-> <!-- > <a href="https://cdcgov.github.io/seqsender/articles/multiple_databases_flu_submission.html" target="_blank">Multiple databases</a> -->
-
-2) To prep for COV submissions, select one of the databases below to
-get started:
-
-> <a href="https://cdcgov.github.io/seqsender/articles/biosample_submission.html" target="_blank">BioSample</a>
-> <br>
-> <a href="https://cdcgov.github.io/seqsender/articles/sra_submission.html" target="_blank">SRA</a>
-> <br>
-> <a href="https://cdcgov.github.io/seqsender/articles/genbank_submission.html" target="_blank">Genbank</a>
-> <br>
-> <a href="https://cdcgov.github.io/seqsender/articles/gisaid_cov_submission.html" target="_blank">GISAID</a>
-> <br>
-> <!-- > <a href="https://cdcgov.github.io/seqsender/articles/multiple_databases_cov_submission.html" target="_blank">Multiple databases</a> -->
-
-## Quick Start
-
-- [How to run seqsender
-locally](https://cdcgov.github.io/seqsender/articles/local_installation.html)
-- [How to run seqsender with
-Docker](https://cdcgov.github.io/seqsender/articles/docker_installation.html)
-- [How to run seqsender with
-Compose](https://cdcgov.github.io/seqsender/articles/compose_installation.html)
-- [How to run seqsender with
-Singularity](https://cdcgov.github.io/seqsender/articles/singularity_installation.html)
-
## Code Attributions

Dakota Howard and Reina Chau for majority of the code base with input