R script for identifying trends in bacterial and archaeal genomics through the number of genome sequencing projects submitted to NCBI per year.
- ggplot2
- stringr
- tidyr
- tidyverse
- data.table
install.packages(c("stringr", "ggplot2", "tidyr", "tidyverse", "data.table"))
-
Data gathering from the NCBI FTP repository (ftp://ftp.ncbi.nlm.nih.gov/genomes/GENOME_REPORTS/prokaryotes.txt)
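
A minimal sketch of this step, assuming the report is a tab-separated text file with a header row. The helper name `read_report` is hypothetical and is not taken from the script itself; `fread()` from data.table accepts either a URL or a local path.

```r
library(data.table)

# URL exactly as given in this README.
report_url <- "ftp://ftp.ncbi.nlm.nih.gov/genomes/GENOME_REPORTS/prokaryotes.txt"

# Hypothetical helper: read a tab-separated genome report from a URL or a local file.
read_report <- function(source = report_url) {
  # quote = "" stops quotation marks inside organism names from being
  # interpreted as field delimiters.
  fread(source, sep = "\t", quote = "", header = TRUE)
}

# Downloads and parses the full report (requires network access):
# prokaryotes <- read_report()
```

Passing a local path instead of the URL makes it possible to work from a previously downloaded copy of the report.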
-
Data formatting: keep only genera and years
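
The formatting step could look roughly like the sketch below. It assumes the organism name is in the first column and that a column named `Release Date` starts with a four-digit year; both are assumptions about the report layout, and `format_report` is a hypothetical name. The genus is taken as the first word of the organism name, so names that begin with a qualifier such as "Candidatus" would need extra handling.

```r
library(stringr)

# Hypothetical helper: reduce a genome report to one genus and one year per row.
format_report <- function(report, name_col = 1, date_col = "Release Date") {
  data.frame(
    genus = word(report[[name_col]], 1),                     # first word of the name
    year  = as.integer(str_sub(report[[date_col]], 1, 4)),   # first four characters
    stringsAsFactors = FALSE
  )
}

# Toy rows standing in for the real report (made-up values, for illustration only).
toy <- data.frame(
  `#Organism/Name` = c("Escherichia coli K-12", "Vibrio cholerae O1",
                       "Escherichia coli O157:H7"),
  `Release Date`   = c("2013/09/26", "2014/02/03", "2014/11/18"),
  check.names = FALSE, stringsAsFactors = FALSE
)

genera_years <- format_report(toy)
print(genera_years)
```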
-
Function (subsetting and plotting)
genomic_trends("Escherichia|Pseudomonas|Vibrio|Campylobacter|Salmonella|Brucella", 3)
genomic_trends("Methanosarcina|Ignicoccus|Pyrococcus|Sulfolobus", 2)
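
For readers who want to see the idea behind the calls above, here is a rough sketch of a subset-and-plot function. It is not the script's own `genomic_trends`: the names `count_by_genus_year` and `genomic_trends_sketch` are hypothetical, the input is assumed to be a table with `genus` and `year` columns, and the second argument is assumed to be the number of facet columns in the plot, which this README does not confirm.

```r
library(ggplot2)

# Hypothetical helper: count projects per genus and year for genera matching a regex.
count_by_genus_year <- function(genera_years, pattern) {
  keep <- grepl(pattern, genera_years$genus)
  counts <- as.data.frame(
    table(genus = genera_years$genus[keep], year = genera_years$year[keep]),
    responseName = "projects", stringsAsFactors = FALSE
  )
  counts$year <- as.integer(counts$year)
  counts
}

# Hypothetical plotting function: one panel per genus, projects per year.
# n_cols is ASSUMED to control the number of facet columns.
genomic_trends_sketch <- function(genera_years, pattern, n_cols) {
  counts <- count_by_genus_year(genera_years, pattern)
  ggplot(counts, aes(x = year, y = projects)) +
    geom_line() +
    geom_point() +
    facet_wrap(~ genus, ncol = n_cols) +
    labs(x = "Year", y = "Genome projects submitted")
}

# Toy input (made-up values, for illustration only).
toy_years <- data.frame(
  genus = c("Escherichia", "Escherichia", "Escherichia", "Vibrio", "Pyrococcus"),
  year  = c(2013L, 2014L, 2014L, 2014L, 2013L),
  stringsAsFactors = FALSE
)

toy_counts <- count_by_genus_year(toy_years, "Escherichia|Vibrio")
toy_plot   <- genomic_trends_sketch(toy_years, "Escherichia|Vibrio", 2)
```

Because the pattern is a regular expression matched anywhere in the genus name, anchoring it (for example `"^(Escherichia|Vibrio)$"`) avoids accidental partial matches.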
Manuel García-Ulloa (https://github.com/manuelgug)
Mariette Viladomat Jasso (https://github.com/MarietteViladomat)