On the limits of fitting complex models of population history to f-statistics
Abstract
Our understanding of population history in deep time has been assisted by fitting admixture graphs ('AGs') to data: models that specify the ordering of population splits and mixtures, which, along with the amount of genetic drift on each lineage and the proportions of mixture, is the only information needed to predict the patterns of allele frequency correlation among populations. Not needing to specify population size changes, split times, or whether admixture events were sudden or drawn out simplifies the space of models that need to be searched. However, the space of possible AGs relating populations is vast and cannot be sampled fully, and thus most published studies have identified fitting AGs through a manual process driven by prior hypotheses, leaving the vast majority of alternative models unexplored. Here, we develop a method for systematically searching the space of all AGs that can incorporate non-genetic information in the form of topology constraints. We implement this findGraphs tool within a software package, ADMIXTOOLS 2, which is a reimplementation of the ADMIXTOOLS software with new features and large performance gains. We apply this methodology to identify alternative models to AGs that played key roles in eight published studies and find that graphs modeling more than six populations and two or three admixture events are often not unique, with many alternative models fitting nominally or significantly better than the published one. Our results suggest that strong claims about population history from AGs should only be made when all well-fitting and temporally plausible models share common topological features.
Our re-evaluation of published data also provides insight into the population histories of humans, dogs, and horses, identifying features that are stable across the models we explored, as well as scenarios of population relationships that differ in important ways from models that have been highlighted in the literature, that fit the allele frequency correlation data, and that are not obviously wrong.
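As a concrete illustration of the "allele frequency correlation" that admixture graphs are fitted to, the following minimal sketch computes naive f2, f3, and f4 statistics from per-SNP allele frequencies. It is not the ADMIXTOOLS 2 interface: the function names, the population labels, and the toy frequencies are all hypothetical, and the estimators deliberately omit the finite-sample bias correction and block-jackknife standard errors used in practice.

```python
from statistics import fmean


def f2(a, b):
    """Naive f2(A, B): mean squared allele-frequency difference."""
    return fmean((x - y) ** 2 for x, y in zip(a, b))


def f3(c, a, b):
    """Naive f3(C; A, B): mean of (c - a) * (c - b) across SNPs."""
    return fmean((z - x) * (z - y) for z, x, y in zip(c, a, b))


def f4(a, b, c, d):
    """Naive f4(A, B; C, D): mean of (a - b) * (c - d) across SNPs."""
    return fmean((w - x) * (y - z) for w, x, y, z in zip(a, b, c, d))


# Hypothetical toy allele frequencies at four SNPs (illustration only).
freqs = {
    "A": [0.10, 0.50, 0.80, 0.30],
    "B": [0.20, 0.40, 0.90, 0.30],
    "C": [0.60, 0.10, 0.40, 0.70],
    "D": [0.70, 0.20, 0.30, 0.70],
}

A, B, C, D = (freqs[k] for k in "ABCD")

# f3 and f4 are linear combinations of f2 values, so a table of
# pairwise f2 statistics is enough to recover them.
f4_direct = f4(A, B, C, D)
f4_from_f2 = 0.5 * (f2(A, D) + f2(B, C) - f2(A, C) - f2(B, D))
f3_direct = f3(C, A, B)
f3_from_f2 = 0.5 * (f2(C, A) + f2(C, B) - f2(A, B))

print(f"f4 direct={f4_direct:.6f}  from f2={f4_from_f2:.6f}")
print(f"f3 direct={f3_direct:.6f}  from f2={f3_from_f2:.6f}")
```

Because f3 and f4 can be written in terms of f2, a fitted graph only needs to reproduce the pairwise f2 values for its predictions of the other statistics to follow.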
Data availability
As indicated in the 'Materials availability statement', the ancient human genome newly reported in this manuscript (Table S2) is freely available at the European Nucleotide Archive in the form of an alignment of reads to the hg19 human reference genome (project accession number PRJEB58199). All the other data we analyze are previously reported. As we state in the 'Materials availability statement', the exact versions of the published archaeogenetic datasets re-analyzed in this manuscript were kindly shared by the corresponding authors of the following publications upon our requests:
1. Bergström A, Frantz L, Schmidt R, et al. Initial Upper Palaeolithic humans in Europe had recent Neanderthal ancestry. Nature. 2021 Apr;592(7853):253-257. doi: 10.1038/s41586-021-03335-3.
2. Lazaridis I, Patterson N, Mittnik A, et al. Ancient human genomes suggest three ancestral populations for present-day Europeans. Nature. 2014 Sep 18;513(7518):409-13. doi: 10.1038/nature13673.
3. Librado P, Khan N, Fages A, et al. The origins and spread of domestic horses from the Western Eurasian steppes. Nature. 2021 Oct;598(7882):634-640. doi: 10.1038/s41586-021-04018-9.
4. Lipson M, Ribot I, Mallick S, et al. Ancient West African foragers in the context of African population history. Nature. 2020 Jan;577(7792):665-670. doi: 10.1038/s41586-020-1929-1.
5. Shinde V, Narasimhan VM, Rohland N, et al. An Ancient Harappan Genome Lacks Ancestry from Steppe Pastoralists or Iranian Farmers. Cell. 2019 Oct 17;179(3):729-735.e10. doi: 10.1016/j.cell.2019.08.048.
6. Sikora M, Pitulko VV, Sousa VC, et al. The population history of northeastern Siberia since the Pleistocene. Nature. 2019 Jun;570(7760):182-188. doi: 10.1038/s41586-019-1279-z.
7. Wang CC, Yeh HY, Popov AN, et al. Genomic insights into the formation of human populations in East Asia. Nature. 2021 Mar;591(7850):413-419. doi: 10.1038/s41586-021-03336-2.
Various statistics for these re-used datasets are summarized in Table S1.
Article and author information
Funding
Czech Ministry of Education, Youth and Sports (project no. LL2103)
- Pavel Flegontov
- Olga Flegontova
- Piya Changmai
Czech Ministry of Education, Youth and Sports (LM2015070)
- Pavel Flegontov
- Piya Changmai
Czech Ministry of Education, Youth and Sports (project no. LTAUSA18153)
- Pavel Flegontov
- Piya Changmai
National Institutes of Health (GM100233)
- Robert Maier
- David Reich
National Institutes of Health (HG012287)
- Robert Maier
- David Reich
John Templeton Foundation (grant 61220)
- Robert Maier
- David Reich
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Copyright
© 2023, Maier et al.
This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.
Metrics
- 5,747 views
- 979 downloads
- 103 citations
Views, downloads and citations are aggregated across all versions of this paper published by eLife.