Broad-scale variation in human genetic diversity levels is predicted by purifying selection on coding and non-coding elements

  1. David A Murphy (corresponding author)
  2. Eyal Elyashiv
  3. Guy Amster
  4. Guy Sella (corresponding author)

Affiliations

  1. Oklahoma Medical Research Foundation, United States
  2. Columbia University, United States

Abstract

Analyses of genetic variation in many taxa have established that neutral genetic diversity is shaped by natural selection at linked sites. Whether the mode of selection is primarily the fixation of strongly beneficial alleles (selective sweeps) or purifying selection on deleterious mutations (background selection) remains unknown, however. We address this question in humans by fitting a model of the joint effects of selective sweeps and background selection to autosomal polymorphism data from the 1000 Genomes Project. After controlling for variation in mutation rates along the genome, a model of background selection alone explains ~60% of the variance in diversity levels at the megabase scale. Adding the effects of selective sweeps driven by adaptive substitutions to the model does not improve the fit, and when both modes of selection are considered jointly, selective sweeps are estimated to have had little or no effect on linked neutral diversity. The regions under purifying selection are best predicted by phylogenetic conservation, with ~80% of the deleterious mutations affecting neutral diversity occurring in non-exonic regions. Thus, background selection is the dominant mode of linked selection in humans, with marked effects on diversity levels throughout autosomes.

Data availability

Shared data can be found at github.com/sellalab/HumanLinkedSelectionMaps. This repository includes fully documented code for: downloading and processing public datasets used, running inferences, analyzing results, and generating all figures from the manuscript. This repository also includes B-maps for all "best-fitting" models described in the manuscript. Customized CADD scores with bStatistic removed are available on Data Dryad at https://doi.org/10.5061/dryad.n8pk0p2x0.

Article and author information

Author details

  1. David A Murphy

    Genes and Human Disease Research Program, Oklahoma Medical Research Foundation, Oklahoma City, United States
    For correspondence
    david-murphy@omrf.org
    Competing interests
    No competing interests declared.
    ORCID: 0000-0002-0715-3355
  2. Eyal Elyashiv

    Department of Biological Sciences, Columbia University, New York, United States
    Competing interests
    Eyal Elyashiv is affiliated with MyHeritage. The author has no financial interests to declare.
  3. Guy Amster

    Department of Biological Sciences, Columbia University, New York, United States
    Competing interests
    Guy Amster is affiliated with Flatiron Health Inc. The author has no financial interests to declare.
    ORCID: 0000-0002-9108-5200
  4. Guy Sella

    Department of Biological Sciences, Columbia University, New York, United States
    For correspondence
    gs2747@columbia.edu
    Competing interests
    No competing interests declared.
    ORCID: 0000-0002-5239-7930

Funding

NIH (GM115889)

  • Guy Sella

NIH (T32GM008798)

  • David A Murphy

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Copyright

© 2022, Murphy et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

David A Murphy, Eyal Elyashiv, Guy Amster, Guy Sella (2022)
Broad-scale variation in human genetic diversity levels is predicted by purifying selection on coding and non-coding elements
eLife 11:e76065.
https://doi.org/10.7554/eLife.76065
