Background selection and biased gene conversion affect more than 95% of the human genome and bias demographic inferences

  1. Fanny Pouyet (corresponding author)
  2. Simon Aeschbacher
  3. Alexandre Thiéry
  4. Laurent Excoffier (corresponding author)
  1. University of Bern, Switzerland

Abstract

Disentangling the effect on genomic diversity of natural selection from that of demography is notoriously difficult, but necessary to properly reconstruct the history of species. Here, we use high-quality human genomic data to show that purifying selection at linked sites (i.e. background selection, BGS) and GC-biased gene conversion (gBGC) together affect as much as 95% of the variants of our genome. We find that the magnitude and relative importance of BGS and gBGC are largely determined by variation in recombination rate and base composition. Importantly, synonymous sites and non-transcribed regions are also affected, albeit to different degrees. Their use for demographic inference can lead to strong biases. However, by conditioning on genomic regions with recombination rates above 1.5 cM/Mb and mutation types (C↔G, A↔T), we identify a set of SNPs that is mostly unaffected by BGS or gBGC, and that avoids these biases in the reconstruction of human history.
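The conditioning described at the end of the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline (their scripts are in the Dryad repository listed under Data availability). The `Snp` record, its field names, and the function names are hypothetical and chosen only for illustration. Only the 1.5 cM/Mb threshold and the two mutation types (C↔G, A↔T) come from the abstract.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Mutation types named in the abstract: C<->G and A<->T.
# Both alleles of such a SNP have the same GC content, which is why
# these types are used to avoid GC-biased gene conversion.
UNBIASED_ALLELE_PAIRS = {frozenset("CG"), frozenset("AT")}

# Threshold taken from the abstract (recombination rate above 1.5 cM/Mb).
MIN_RECOMB_RATE_CM_PER_MB = 1.5


@dataclass(frozen=True)
class Snp:
    """Hypothetical SNP record; the field names are assumptions."""
    chrom: str
    pos: int
    ref: str
    alt: str
    recomb_rate: float  # local recombination rate in cM/Mb


def is_reference_snp(snp: Snp,
                     min_rate: float = MIN_RECOMB_RATE_CM_PER_MB) -> bool:
    """Return True if the SNP passes both filters from the abstract."""
    alleles = frozenset((snp.ref.upper(), snp.alt.upper()))
    return snp.recomb_rate > min_rate and alleles in UNBIASED_ALLELE_PAIRS


def filter_reference_snps(snps: Iterable[Snp],
                          min_rate: float = MIN_RECOMB_RATE_CM_PER_MB) -> List[Snp]:
    """Keep only SNPs in high-recombination regions with C<->G or A<->T alleles."""
    return [s for s in snps if is_reference_snp(s, min_rate)]


if __name__ == "__main__":
    example = [
        Snp("chr1", 1000, "C", "G", 2.3),  # kept: C<->G, rate above threshold
        Snp("chr1", 2000, "A", "G", 2.3),  # dropped: A<->G is not in the set
        Snp("chr1", 3000, "A", "T", 0.4),  # dropped: recombination rate too low
    ]
    for snp in filter_reference_snps(example):
        print(snp.chrom, snp.pos, snp.ref, snp.alt)
```

In practice the recombination rate would be taken from a genetic map and the alleles from a VCF file; the sketch leaves both steps out and assumes each SNP already carries these values.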

Data availability

All data generated and the scripts used to analyse them are provided in the Dryad repository: http://dx.doi.org/10.5061/dryad.t76fk80

Article and author information

Author details

  1. Fanny Pouyet

    Institute of Ecology and Evolution, University of Bern, Bern, Switzerland
    For correspondence
    fanny.pouyet@iee.unibe.ch
    Competing interests
    The authors declare that no competing interests exist.
    ORCID: 0000-0001-5614-6998
  2. Simon Aeschbacher

    Institute of Ecology and Evolution, University of Bern, Bern, Switzerland
    Competing interests
    The authors declare that no competing interests exist.
  3. Alexandre Thiéry

    Institute of Ecology and Evolution, University of Bern, Bern, Switzerland
    Competing interests
    The authors declare that no competing interests exist.
  4. Laurent Excoffier

    Institute of Ecology and Evolution, University of Bern, Bern, Switzerland
    For correspondence
    laurent.excoffier@iee.unibe.ch
    Competing interests
    The authors declare that no competing interests exist.
    ORCID: 0000-0002-7507-6494

Funding

Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung (310030B-166605)

  • Laurent Excoffier

University of California, Berkeley (Visiting Miller Professorship)

  • Laurent Excoffier

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Reviewing Editor

  1. Krishna Veeramah, Stony Brook University, United States

Publication history

  1. Received: March 1, 2018
  2. Accepted: August 17, 2018
  3. Accepted Manuscript published: August 20, 2018 (version 1)
  4. Accepted Manuscript updated: August 23, 2018 (version 2)
  5. Version of Record published: October 9, 2018 (version 3)

Copyright

© 2018, Pouyet et al.

This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

Fanny Pouyet, Simon Aeschbacher, Alexandre Thiéry, Laurent Excoffier (2018)
Background selection and biased gene conversion affect more than 95% of the human genome and bias demographic inferences
eLife 7:e36317.
https://doi.org/10.7554/eLife.36317
