Background selection and biased gene conversion affect more than 95% of the human genome and bias demographic inferences
Abstract
Disentangling the effect on genomic diversity of natural selection from that of demography is notoriously difficult, but necessary to properly reconstruct the history of species. Here, we use high-quality human genomic data to show that purifying selection at linked sites (i.e. background selection, BGS) and GC-biased gene conversion (gBGC) together affect as much as 95% of the variants of our genome. We find that the magnitude and relative importance of BGS and gBGC are largely determined by variation in recombination rate and base composition. Importantly, synonymous sites and non-transcribed regions are also affected, albeit to different degrees. Their use for demographic inference can lead to strong biases. However, by conditioning on genomic regions with recombination rates above 1.5 cM/Mb and mutation types (C↔G, A↔T), we identify a set of SNPs that is mostly unaffected by BGS or gBGC, and that avoids these biases in the reconstruction of human history.
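The conditioning step described at the end of the abstract can be sketched in a few lines of Python. This is only an illustration, not the authors' pipeline (their scripts are in the Dryad repository listed below). The `Snp` record, its `recomb_rate` field, and the function names are hypothetical. The sketch keeps a SNP only if its local recombination rate is strictly above 1.5 cM/Mb and its two alleles form a C↔G or A↔T pair. These two mutation types leave GC content unchanged, which is why they are expected to be insensitive to gBGC.

```python
from dataclasses import dataclass

# Threshold taken from the abstract: recombination rate above 1.5 cM/Mb.
MIN_RECOMB_CM_PER_MB = 1.5

# Mutation types named in the abstract (C<->G and A<->T), stored as
# unordered allele pairs so that C>G and G>C are treated alike.
GC_CONSERVATIVE_PAIRS = {frozenset("CG"), frozenset("AT")}


@dataclass(frozen=True)
class Snp:
    """Hypothetical SNP record; the field names are assumptions."""
    chrom: str
    pos: int
    ref: str
    alt: str
    recomb_rate: float  # local recombination rate in cM/Mb (assumed known)


def is_retained(snp, min_rate=MIN_RECOMB_CM_PER_MB):
    """Return True if the SNP passes both conditioning criteria."""
    pair = frozenset((snp.ref.upper(), snp.alt.upper()))
    return snp.recomb_rate > min_rate and pair in GC_CONSERVATIVE_PAIRS


def filter_snps(snps, min_rate=MIN_RECOMB_CM_PER_MB):
    """Keep only SNPs in high-recombination regions with C<->G or A<->T alleles."""
    return [s for s in snps if is_retained(s, min_rate)]


# Made-up example records, for illustration only.
example = [
    Snp("chr1", 1000, "C", "G", 2.3),  # kept: high rate, C<->G
    Snp("chr1", 2000, "A", "G", 2.3),  # dropped: A<->G changes GC content
    Snp("chr1", 3000, "A", "T", 0.4),  # dropped: recombination rate too low
    Snp("chr2", 500, "T", "A", 1.8),   # kept: high rate, A<->T
    Snp("chr2", 900, "G", "C", 1.5),   # dropped: rate not strictly above 1.5
]
kept = filter_snps(example)
```

In practice the recombination rate would come from a genetic map and the alleles from a variant file, so the filter would run over those inputs rather than hand-written records.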
Data availability
All data generated and the scripts used to analyse them are provided in the Dryad repository: http://dx.doi.org/10.5061/dryad.t76fk80
- Data from: Background selection and biased gene conversion affect more than 95% of the human genome and bias demographic inferences. Available at Dryad Digital Repository under a CC0 Public Domain Dedication.
Article and author information
Author details
Funding
Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung (310030B-166605)
- Laurent Excoffier
University of Berkeley (Visiting Miller Professorship)
- Laurent Excoffier
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Reviewing Editor
- Krishna Veeramah, Stony Brook University, United States
Publication history
- Received: March 1, 2018
- Accepted: August 17, 2018
- Accepted Manuscript published: August 20, 2018 (version 1)
- Accepted Manuscript updated: August 23, 2018 (version 2)
- Version of Record published: October 9, 2018 (version 3)
Copyright
© 2018, Pouyet et al.
This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.