Global biogeographic sampling of bacterial secondary metabolism

  1. Zachary Charlop-Powers
  2. Jeremy G Owen
  3. Boojala Vijay B Reddy
  4. Melinda A Ternei
  5. Denise O Guimarães
  6. Ulysses A de Frias
  7. Monica T Pupo
  8. Prudy Seepe
  9. Zhiyang Feng
  10. Sean F Brady (corresponding author)
  1. Howard Hughes Medical Institute, Rockefeller University, United States
  2. Universidade Federal do Rio de Janeiro–Campus Macaé, Brazil
  3. University of São Paulo, Brazil
  4. Nelson R Mandela School of Medicine, South Africa
  5. Nanjing Agricultural University, China
2 figures and 7 additional files

Figures

Global abundance and comparative distribution of AD/KS sequences.

The global abundance (A and C), sample-to-sample variation (B and D), and geographic distribution (E, F, G, and H) of adenylation domains (AD) and ketosynthase domains (KS) were assessed by pyro-sequencing of amplicons generated using degenerate primers targeting AD and KS domains found in 185 soils/sediments from around the world. (A and C) Global AD (A) or KS (C) domain diversity estimates were obtained by rarefying the global OTU table (de novo clustering at 95%) for AD and KS sequences and calculating the average Chao1 diversity metric at each sampling depth. (B and D) The ecological distance (i.e., Jaccard dissimilarity) between AD (B) or KS (D) domain populations sequenced from each metagenome was determined as a function of the great circle distance between sample collection sites (km). Insets show local relationships (<500 km) in more detail. (E and F) All sample collection sites are shown on each world map and lines are used to connect sample sites that share at least the indicated fraction (3%, 10%) of AD (E) or KS (F) OTUs. (G and H) Biome-specific relationships within domain OTU populations sequenced from geographically proximal samples assessed by Jaccard similarity. Samples were collected from (G) Atlantic forest, saline or cerrado environments or from the (H) New Mexican desert topsoils or hot springs sediments.

https://doi.org/10.7554/eLife.05048.003
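
The distance-decay comparison in panels B and D pairs two quantities per sample pair: the great circle distance between collection sites and the Jaccard dissimilarity between their OTU sets. The following is a minimal sketch of both calculations, not the authors' pipeline. The function names, the haversine formulation, and the Earth radius constant are assumptions made for illustration; the caption does not say which software or formula was used.

```python
import math

# Assumed constant: mean Earth radius in kilometres.
EARTH_RADIUS_KM = 6371.0


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great circle distance (km) between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def jaccard_dissimilarity(otus_a, otus_b):
    """1 - |A intersect B| / |A union B| on presence/absence of OTU ids."""
    a, b = set(otus_a), set(otus_b)
    union = a | b
    if not union:
        # Two empty samples share nothing to compare; treat as identical.
        return 0.0
    return 1.0 - len(a & b) / len(union)
```

Each point in panels B and D would then correspond to one pair `(great_circle_km(...), jaccard_dissimilarity(...))` computed for two samples. The shared-OTU fraction used to draw lines in panels E and F is a related but possibly differently normalized quantity, so it is not derived here.
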
Biomedically relevant natural product hotspots and diversity.

Hotspot analysis of natural product biosynthetic diversity to identify samples with a high total proportion of reads corresponding to a natural product family of interest (A and D), the maximum number of unique OTUs corresponding to a natural product family of interest (B and D), or the estimated sample biodiversity (C and D). In A and B samples are arranged by longitude and hemisphere as is shown in the Sample Key. (A) For each sample, sequence reads assigned by eSNaPD are expressed as a percentage of total reads obtained for that sample. A sample is designated a hotspot if more than one percent (0.01; horizontal line) of its reads map to a specific gene cluster. Fractional observance data for five representative gene clusters or gene cluster families (zorbamycin, oocydin, tiacumicin B, epoxomicin, glycopeptides) that show significant sample-dependent differences in read frequency are shown. (B) Hotspots of elevated gene cluster family diversity can be identified by determining the number of unique OTUs occurring in each sample that, by eSNaPD, map to a natural product gene cluster of interest. Sample-specific OTU counts for nocardicin, rifamycin, bleomycin, and daptomycin clusters are shown. Samples containing greater than 50% of the maximum observed OTU value are colored and mapped in (C). OTU diversity measurements do not predict the abundance of a specific cluster in a metagenome [as predicted in (A)], but instead are used to identify locations where the largest number of congener-encoding clusters may be found. These sites are predicted to be most useful for increasing the structural diversity and therefore potential clinical utility of these medically important families of natural products. (C) Estimated diversity of AD/KS reads by sample. AD and KS OTU tables were combined and for each sample the Chao1 diversity metric was calculated at 5000 reads, providing a baseline metric for comparing sample biosynthetic diversity. The average number of unique OTUs observed over 10 rarefaction analyses is shown (also see Supplementary file 7). (D) Hotspot map of samples identified in A, B and C. (E) Representative structures of target molecule families highlighted in A and B.

https://doi.org/10.7554/eLife.05048.004
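
The three hotspot criteria in this caption can be sketched in a few lines. The code below is an illustrative sketch under stated assumptions, not the authors' implementation: all function names are hypothetical, rarefaction is done by simple subsampling without replacement, and the bias-corrected form of Chao1 is used because it stays defined when no doubleton OTUs are present (the caption does not say which form was applied). The 1% read threshold, the 50% of maximum OTU count rule, the 5000-read depth, and the 10 rarefactions come from the caption.

```python
import random
from collections import Counter


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1*(F1-1) / (2*(F2+1))."""
    counts = [c for c in counts if c > 0]
    s_obs = len(counts)                      # observed OTUs
    f1 = sum(1 for c in counts if c == 1)    # singletons
    f2 = sum(1 for c in counts if c == 2)    # doubletons
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


def rarefy(otu_counts, depth, rng):
    """Subsample `depth` reads without replacement from {otu_id: count}."""
    reads = [otu for otu, n in otu_counts.items() for _ in range(n)]
    if depth > len(reads):
        raise ValueError("sample has fewer reads than the rarefaction depth")
    return Counter(rng.sample(reads, depth))


def mean_rarefied_chao1(otu_counts, depth=5000, n_iter=10, seed=0):
    """Average Chao1 over repeated rarefactions to a fixed depth."""
    rng = random.Random(seed)
    estimates = [chao1(rarefy(otu_counts, depth, rng).values())
                 for _ in range(n_iter)]
    return sum(estimates) / n_iter


def is_read_hotspot(cluster_reads, total_reads, threshold=0.01):
    """Panel A rule: more than 1% of a sample's reads map to the cluster."""
    return total_reads > 0 and cluster_reads / total_reads > threshold


def diversity_hotspots(otu_count_by_sample, fraction=0.5):
    """Panel B rule: samples above `fraction` of the maximum OTU count."""
    max_count = max(otu_count_by_sample.values(), default=0)
    return [sample for sample, n in otu_count_by_sample.items()
            if max_count > 0 and n > fraction * max_count]
```

For example, `is_read_hotspot(2, 100)` is true while `is_read_hotspot(1, 100)` is not, since the rule requires strictly more than one percent of reads.
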

Additional files

Supplementary file 1

Sample Location Data.

https://doi.org/10.7554/eLife.05048.005
Supplementary file 2

Sample Read and 95% OTU Count.

https://doi.org/10.7554/eLife.05048.006
Supplementary file 3

Adenylation Domain Rarefaction Data (Figure 1A).

https://doi.org/10.7554/eLife.05048.007
Supplementary file 4

Ketosynthase Domain Rarefaction Data (Figure 1C).

https://doi.org/10.7554/eLife.05048.008
Supplementary file 5

Pairwise Sample Distances. Great Circle Distance and Jaccard Distance for AD and KS Amplicons.

https://doi.org/10.7554/eLife.05048.009
Supplementary file 6

eSNaPD Hits Broken Down by Sample and Molecule.

https://doi.org/10.7554/eLife.05048.010
Supplementary file 7

Per Sample Chao1 Biodiversity Estimates at a Rarefaction Depth of 5000 Reads.

https://doi.org/10.7554/eLife.05048.011

Cite this article

  1. Zachary Charlop-Powers
  2. Jeremy G Owen
  3. Boojala Vijay B Reddy
  4. Melinda A Ternei
  5. Denise O Guimarães
  6. Ulysses A de Frias
  7. Monica T Pupo
  8. Prudy Seepe
  9. Zhiyang Feng
  10. Sean F Brady
(2015)
Global biogeographic sampling of bacterial secondary metabolism
eLife 4:e05048.
https://doi.org/10.7554/eLife.05048