Limited role of generation time changes in driving the evolution of the mutation spectrum in humans

  1. Ziyue Gao (corresponding author)
  2. Yulin Zhang
  3. Nathan Cramer
  4. Molly Przeworski
  5. Priya Moorjani (corresponding author)
  1. University of Pennsylvania, United States
  2. University of California, Berkeley, United States
  3. Columbia University, United States

Abstract

Recent studies have suggested that the human germline mutation rate and spectrum evolve rapidly. Variation in generation time has been linked to these changes, though its contribution remains unclear. We develop a framework to characterize temporal changes in polymorphisms within and between populations, while controlling for the effects of natural selection and biased gene conversion. Application to the 1000 Genomes Project dataset reveals multiple independent changes that arose after the split of continental groups, including a previously reported, transient elevation in TCC>TTC mutations in Europeans and novel signals of divergence in C>G and T>A mutation rates among population samples. We also find a significant difference between groups sampled in and outside of Africa in old T>C polymorphisms that predate the out-of-Africa migration. This surprising signal is driven by TpG>CpG mutations, and stems in part from mis-polarized CpG transitions, which are more likely to undergo recurrent mutations. Finally, by relating the mutation spectrum of polymorphisms to parental age effects on de novo mutations, we show that plausible changes in the generation time cannot explain the patterns observed for different mutation types jointly. Thus, other factors (genetic modifiers or environmental exposures) must have had a non-negligible impact on the human mutation landscape.

Data availability

All data generated or analyzed during this study are based on publicly available datasets, such as the 1000 Genomes Project. The source data files for Figures 1-4 contain the numerical data used to generate the figures. Source data for Figure 1 are available at the following URL: https://doi.org/10.6078/D19B0H. (Note: for private access prior to publication, the dataset is available at the URL: https://datadryad.org/stash/share/JK1BdqPhl6azkQru6gLTi6_dA-6lobKUxzpUM7mW69Y)

The following previously published data sets were used

Article and author information

Author details

  1. Ziyue Gao

    Department of Genetics, University of Pennsylvania, Philadelphia, United States
    For correspondence
    ziyuegao@pennmedicine.upenn.edu
    Competing interests
    No competing interests declared.
    ORCID: 0000-0001-9244-0238
  2. Yulin Zhang

    Center for Computational Biology, University of California, Berkeley, Berkeley, United States
    Competing interests
    No competing interests declared.
  3. Nathan Cramer

    Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, United States
    Competing interests
    No competing interests declared.
  4. Molly Przeworski

    Department of Systems Biology, Columbia University, New York, United States
    Competing interests
    Molly Przeworski, Reviewing editor, eLife.
    ORCID: 0000-0002-5369-9009
  5. Priya Moorjani

    Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, United States
    For correspondence
    moorjani@berkeley.edu
    Competing interests
    No competing interests declared.
    ORCID: 0000-0002-0947-5673

Funding

National Institutes of Health (R35GM146810)

  • Ziyue Gao

Alfred P. Sloan Foundation

  • Ziyue Gao

National Institutes of Health (R35GM142978)

  • Priya Moorjani

Alfred P. Sloan Foundation

  • Priya Moorjani

National Institutes of Health (GM122975)

  • Molly Przeworski

National Science Foundation (DGE 2146752)

  • Nathan Cramer

Hellman Family Foundation

  • Priya Moorjani

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Reviewing Editor

  1. Philipp W Messer, Cornell University, United States

Version history

  1. Preprint posted: June 18, 2022 (view preprint)
  2. Received: June 18, 2022
  3. Accepted: February 2, 2023
  4. Accepted Manuscript published: February 13, 2023 (version 1)
  5. Accepted Manuscript updated: February 15, 2023 (version 2)
  6. Version of Record published: March 14, 2023 (version 3)

Copyright

© 2023, Gao et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • Page views: 1,255
  • Downloads: 179
  • Citations: 5

Article citation count generated by polling the highest count across the following sources: PubMed Central, Crossref, Scopus.

Cite this article

Ziyue Gao, Yulin Zhang, Nathan Cramer, Molly Przeworski, Priya Moorjani (2023)
Limited role of generation time changes in driving the evolution of the mutation spectrum in humans
eLife 12:e81188.
https://doi.org/10.7554/eLife.81188
