Mutation saturation for fitness effects at human CpG sites

  1. Ipsita Agarwal (corresponding author)
  2. Molly Przeworski (corresponding author)
  1. Columbia University, United States

Abstract

Whole exome sequences have now been collected for millions of humans, with the related goals of identifying pathogenic mutations in patients and establishing reference repositories of data from unaffected individuals. As a result, we are approaching an important limit, in which datasets are large enough that, in the absence of natural selection, every highly mutable site will have experienced at least one mutation in the genealogical history of the sample. Here, we focus on CpG sites that are methylated in the germline and experience mutations to T at an elevated rate of ~10⁻⁷ per site per generation; considering synonymous mutations in a sample of 390,000 individuals, ~99% of such CpG sites harbor a C/T polymorphism. Methylated CpG sites provide a natural mutation saturation experiment for fitness effects: as we show, at current sample sizes, not seeing a non-synonymous polymorphism is indicative of strong selection against that mutation. We rely on this idea in order to directly identify a subset of CpG transitions that are likely to be highly deleterious, including ~27% of possible loss-of-function mutations, and up to 20% of possible missense mutations, depending on the type of functional site in which they occur. Unlike methylated CpGs, most mutation types, with rates on the order of 10⁻⁸ or 10⁻⁹, remain very far from saturation. We discuss what these findings imply for interpreting the potential clinical relevance of mutations from their presence or absence in reference databases and for inferences about the fitness effects of new mutations.
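
The saturation argument can be illustrated with a back-of-the-envelope calculation. The sketch below is not the authors' analysis. It assumes that mutations at a neutral site arise as a Poisson process along the sample genealogy, so the chance of seeing at least one mutation is 1 - exp(-mu * T), where T is the total branch length of the genealogy in generations. It further assumes that synonymous methylated CpG sites are effectively neutral and share a single rate of 10⁻⁷, and it uses the reported ~99% polymorphic fraction to back out an implied T. The function names are hypothetical.

```python
import math


def implied_tree_length(mu, frac_polymorphic):
    """Total genealogy length (in generations) at which a fraction
    `frac_polymorphic` of neutral sites with mutation rate `mu` carry at
    least one mutation, assuming Poisson-distributed mutations."""
    return -math.log(1.0 - frac_polymorphic) / mu


def prob_at_least_one_mutation(mu, tree_length):
    """Probability that a neutral site with per-generation rate `mu` has
    mutated at least once somewhere on a genealogy of total length
    `tree_length` (same Poisson assumption)."""
    return 1.0 - math.exp(-mu * tree_length)


# Assumption: ~99% of synonymous methylated CpG sites are polymorphic at
# mu = 1e-7, as stated in the abstract; all sites share one rate.
tree_length = implied_tree_length(1e-7, 0.99)

for mu in (1e-7, 1e-8, 1e-9):
    p = prob_at_least_one_mutation(mu, tree_length)
    print(f"mu = {mu:.0e}: expected fraction of sites mutated = {p:.3f}")
```

Under these simplifying assumptions, the same genealogy that saturates sites with a rate of 10⁻⁷ leaves only about 37% of sites with a rate of 10⁻⁸, and under 5% of sites with a rate of 10⁻⁹, carrying a mutation. This is consistent with the statement that most mutation types remain far from saturation.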

Data availability

All source data are freely available to researchers, with sources provided in the manuscript. Data and code to generate the figures are available at https://github.com/agarwal-i/cpg_saturation.

The following previously published data sets were used

Article and author information

Author details

  1. Ipsita Agarwal

    Department of Biological Sciences, Columbia University, New York, United States
    For correspondence
    ia2337@columbia.edu
    Competing interests
    No competing interests declared.
    ORCID: 0000-0001-8537-0008
  2. Molly Przeworski

    Department of Systems Biology, Columbia University, New York, United States
    For correspondence
    mp3284@columbia.edu
    Competing interests
    Molly Przeworski, Senior editor, eLife.
    ORCID: 0000-0002-5369-9009

Funding

National Institutes of Health (GM122975)

  • Molly Przeworski

National Institutes of Health (GM121372)

  • Molly Przeworski

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Copyright

© 2021, Agarwal & Przeworski

This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

Ipsita Agarwal, Molly Przeworski (2021)
Mutation saturation for fitness effects at human CpG sites
eLife 10:e71513.
https://doi.org/10.7554/eLife.71513
