Abstract

Shotgun metagenomic sequencing is a powerful approach for studying microbiomes in an unbiased manner and is of increasing relevance for identifying novel enzymatic functions. However, the potential of metagenomics to relate microbiome composition to function has thus far been underutilized. Here, we introduce the Metagenomics Genome-Phenome Association (MetaGPA) study framework, which links genetic information in metagenomes with a dedicated functional phenotype. We applied MetaGPA to identify enzymes associated with cytosine modifications in environmental samples. From the 2365 genes that met our significance criteria, we confirmed known pathways for cytosine modifications and proposed novel cytosine-modifying mechanisms. Specifically, we identified and characterized a novel nucleic acid-modifying enzyme, 5-hydroxymethylcytosine carbamoyltransferase, that catalyzes the formation of a previously unknown cytosine modification, 5-carbamoyloxymethylcytosine, in DNA and RNA. Our work introduces MetaGPA as a novel and versatile tool for advancing functional metagenomics.

Data availability

All raw and processed sequencing data generated in this study have been submitted to the NCBI Sequence Read Archive (SRA; https://www.ncbi.nlm.nih.gov/sra) under accession number PRJNA714147.

Article and author information

Author details

  1. Weiwei Yang

    Research department, New England Biolabs Inc, Ipswich, United States
    Competing interests
    Weiwei Yang, The author is an employee of New England Biolabs Inc., a manufacturer of restriction enzymes and molecular reagents.
  2. Yu-Cheng Lin

    Research department, New England Biolabs Inc, Ipswich, United States
    Competing interests
    Yu-Cheng Lin, The author was an employee of New England Biolabs Inc., a manufacturer of restriction enzymes and molecular reagents.
  3. William Johnson

    Research department, New England Biolabs Inc, Ipswich, United States
    Competing interests
    William Johnson, The author was an employee of New England Biolabs Inc., a manufacturer of restriction enzymes and molecular reagents.
  4. Nan Dai

    RNA Biology, New England Biolabs Inc, Ipswich, United States
    Competing interests
    Nan Dai, The author is an employee of New England Biolabs Inc., a manufacturer of restriction enzymes and molecular reagents.
  5. Romualdas Vaisvila

    Research department, New England Biolabs Inc, Ipswich, United States
    Competing interests
    Romualdas Vaisvila, The author is an employee of New England Biolabs Inc., a manufacturer of restriction enzymes and molecular reagents.
  6. Peter Weigele

    Research department, New England Biolabs Inc, Ipswich, United States
    Competing interests
    Peter Weigele, The author is an employee of New England Biolabs Inc., a manufacturer of restriction enzymes and molecular reagents.
  7. Yan-Jiun Lee

    Research department, New England Biolabs Inc, Ipswich, United States
    Competing interests
    Yan-Jiun Lee, The author is an employee of New England Biolabs Inc., a manufacturer of restriction enzymes and molecular reagents.
  8. Ivan R Corrêa Jr

    RNA Biology, New England Biolabs Inc, Ipswich, United States
    Competing interests
    Ivan R Corrêa Jr, The author is an employee of New England Biolabs Inc., a manufacturer of restriction enzymes and molecular reagents.
    ORCID: 0000-0002-3169-6878
  9. Ira Schildkraut

    Research department, New England Biolabs Inc, Ipswich, United States
    Competing interests
    Ira Schildkraut, The author is an employee of New England Biolabs Inc., a manufacturer of restriction enzymes and molecular reagents.
  10. Laurence Ettwiller

    Research department, New England Biolabs Inc, Ipswich, United States
    For correspondence
    laurence.ettwiller@gmail.com
    Competing interests
    Laurence Ettwiller, The author is an employee of New England Biolabs Inc., a manufacturer of restriction enzymes and molecular reagents.
    ORCID: 0000-0002-3957-6539

Funding

New England Biolabs (no data)

  • Weiwei Yang
  • Yu-Cheng Lin
  • William Johnson
  • Nan Dai
  • Romualdas Vaisvila
  • Peter Weigele
  • Yan-Jiun Lee
  • Ivan R Corrêa Jr
  • Ira Schildkraut
  • Laurence Ettwiller

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Copyright

© 2021, Yang et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

Weiwei Yang, Yu-Cheng Lin, William Johnson, Nan Dai, Romualdas Vaisvila, Peter Weigele, Yan-Jiun Lee, Ivan R Corrêa Jr, Ira Schildkraut, Laurence Ettwiller (2021) A Genome-Phenome Association study in native microbiomes identifies a mechanism for cytosine modification in DNA and RNA. eLife 10:e70021. https://doi.org/10.7554/eLife.70021
