Research: A comprehensive and quantitative exploration of thousands of viral genomes
Abstract
The complete assembly of viral genomes from metagenomic datasets (short genomic sequences gathered from environmental samples) has proven to be challenging, so there still remain significant blind spots in our view of viral genomes through the lens of metagenomics. One approach to overcoming this problem is to leverage the thousands of complete viral genomes that are publicly available. Here we describe our efforts to assemble a comprehensive resource that provides a quantitative snapshot of viral genomic trends – such as gene density, noncoding percentage, and abundances of functional gene categories – across thousands of viral genomes. We have also developed a coarse-grained method for visualizing viral genome organization for hundreds of genomes at once, and have explored the extent of the overlap between bacterial and bacteriophage gene pools. Existing viral classification systems were developed prior to the sequencing era, so we present our analysis in a way that allows us to assess the utility of the different classification systems for capturing genomic trends.
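To make two of the quantities named above concrete, here is a minimal Python sketch of how gene density and noncoding percentage could be computed from a list of gene coordinates. The definitions used (genes per kilobase, and the percentage of bases not covered by any annotated gene) are assumptions made for illustration and may differ from the exact definitions in the manuscript and its repository. The function names and the 0-based, end-exclusive interval convention are also hypothetical.

```python
from typing import List, Tuple

# Assumed convention: (start, end), 0-based, end-exclusive.
Interval = Tuple[int, int]


def merge_intervals(genes: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching gene intervals so no base is counted twice."""
    merged: List[Interval] = []
    for start, end in sorted(genes):
        if merged and start <= merged[-1][1]:
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def gene_density(genes: List[Interval], genome_length: int) -> float:
    """Number of annotated genes per kilobase of genome (assumed definition)."""
    return len(genes) / (genome_length / 1000)


def noncoding_percentage(genes: List[Interval], genome_length: int) -> float:
    """Percentage of genome bases not covered by any gene (assumed definition)."""
    coding = sum(end - start for start, end in merge_intervals(genes))
    return 100.0 * (genome_length - coding) / genome_length


# Toy 1,000 bp genome with three genes, two of which overlap.
toy_genes = [(0, 300), (250, 600), (700, 900)]
print(gene_density(toy_genes, 1000))          # 3.0 genes per kb
print(noncoding_percentage(toy_genes, 1000))  # 20.0
```

Merging intervals before summing matters for viral genomes in particular, where overlapping genes are common and a naive sum of gene lengths would overstate the coding fraction.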
Data availability
All source data, scripts and output data used to create this manuscript are available in the manuscript, its supporting files and our GitHub repository: https://github.com/gitamahm/VirologyByTheNumbers
- NCBI viral genomes resource: publicly available at the NCBI viral resource page (https://www.ncbi.nlm.nih.gov/genome/viruses/).
Article and author information
Author details
Funding
John Templeton Foundation (51250)
- Rob Phillips
National Institutes of Health (RFA-GM-17-002)
- Rob Phillips
National Science Foundation (DGE‐1144469)
- Gita Mahmoudabadi
National Institutes of Health (R01- GM098465)
- Rob Phillips
National Science Foundation (NSF PHY11-25915)
- Rob Phillips
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Copyright
© 2018, Mahmoudabadi & Phillips
This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.
Metrics
- 12,199 views
- 1,180 downloads
- 71 citations
Views, downloads and citations are aggregated across all versions of this paper published by eLife.
Citations by DOI
- 71 citations for umbrella DOI https://doi.org/10.7554/eLife.31955