Studies in a variety of species have shown evidence for positively selected variants introduced into a population via introgression from another, distantly related population, a process known as adaptive introgression. However, there are few explicit frameworks for jointly modelling introgression and positive selection in order to detect these variants using genomic sequence data. Here, we develop an approach based on convolutional neural networks (CNNs). CNNs do not require the specification of an analytical model of allele frequency dynamics, and have outperformed alternative methods for classification and parameter estimation tasks in various areas of population genetics. Thus, they are potentially well suited to the identification of adaptive introgression. Using simulations, we trained CNNs on genotype matrices derived from genomes sampled from the donor population, the recipient population and a related non-introgressed population, in order to distinguish regions of the genome evolving under adaptive introgression from those evolving neutrally or experiencing selective sweeps. Our CNN architecture exhibits 95% accuracy on simulated data, even when the genomes are unphased, and accuracy decreases only moderately in the presence of heterosis. As a proof of concept, we applied our trained CNNs to human genomic datasets, both phased and unphased, to detect candidates for adaptive introgression that shaped our evolutionary history.
Source code is available from https://github.com/grahamgower/genomatnn/.
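As a rough illustration of the kind of input described in the abstract, the sketch below bins haplotypes from the donor, recipient and non-introgressed populations into fixed-width matrices of allele counts and stacks the three panels into one array. It is not taken from genomatnn: the function names, the equal-width binning and the default number of bins are assumptions made for this example only.

```python
import numpy as np


def bin_haplotypes(haplotypes, positions, region_length, n_bins):
    """Count alleles per haplotype in equal-width bins (hypothetical helper).

    haplotypes: (n_haplotypes, n_sites) array of 0/1 allele states
    positions:  (n_sites,) integer coordinates in [0, region_length)
    """
    haplotypes = np.asarray(haplotypes, dtype=np.int64)
    positions = np.asarray(positions, dtype=np.int64)
    # Map each site to the bin that contains its coordinate.
    bin_idx = (positions * n_bins) // region_length
    # One-hot site-to-bin matrix; the product sums alleles within each bin.
    n_sites = positions.shape[0]
    onehot = np.zeros((n_sites, n_bins), dtype=np.int64)
    onehot[np.arange(n_sites), bin_idx] = 1
    return haplotypes @ onehot


def build_input(donor, recipient, outgroup, positions, region_length, n_bins=8):
    """Stack the binned panels row-wise: donor, recipient, then outgroup."""
    panels = [
        bin_haplotypes(panel, positions, region_length, n_bins)
        for panel in (donor, recipient, outgroup)
    ]
    return np.concatenate(panels, axis=0)


# Toy data: four segregating sites in a region of length 100.
positions = np.array([0, 10, 55, 99])
donor = np.array([[1, 1, 0, 1]])
recipient = np.array([[0, 1, 1, 0], [1, 0, 0, 0]])
outgroup = np.array([[0, 0, 0, 1]])
matrix = build_input(donor, recipient, outgroup, positions, 100, n_bins=4)
```

For unphased data, one option under the same assumptions would be to sum the two haplotype rows of each individual before stacking, so that each row holds genotype counts rather than haplotype counts.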
A global reference for human genetic variation. doi:10.1038/nature15393
Multiple deeply divergent Denisovan ancestries in Papuans. doi:10.1016/j.cell.2019.02.035
- Fernando Racimo
- Matteo Fumagalli
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
- George H Perry, Pennsylvania State University, United States
© 2021, Gower et al.
This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.
Cell-free DNA (cfDNA) tests use small amounts of DNA in the bloodstream as biomarkers. While it is thought that cfDNA is largely released by dying cells, the proportion of dying cells' DNA that reaches the bloodstream is unknown. Here, we integrate estimates of cellular turnover rates to calculate the expected amount of cfDNA. By comparing this to the actual amount of cell type-specific cfDNA, we estimate the proportion of DNA reaching plasma as cfDNA. We demonstrate that <10% of the DNA from dying cells is detectable in plasma, and the ratios of measured to expected cfDNA levels vary a thousand-fold among cell types, often reaching well below 0.1%. The analysis suggests that local clearance, presumably via phagocytosis, takes up most of the dying cells' DNA. Insights into the underlying mechanism may help to understand the physiological significance of cfDNA and improve the sensitivity of liquid biopsies.
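The comparison of measured to expected cfDNA can be sketched as simple arithmetic. The example below is an illustration only, not the authors' calculation: it assumes a steady state in which cfDNA enters plasma at the same rate at which it is cleared, counts DNA in genome equivalents (one per dying cell), and uses a hypothetical function name and placeholder input values.

```python
import math


def cfdna_fraction(cells_dying_per_day, cfdna_genomes_per_ml,
                   plasma_volume_ml, half_life_min):
    """Fraction of dying cells' DNA that appears in plasma (hypothetical helper).

    Assumes first-order clearance at steady state, so the daily flux into
    plasma equals (amount in plasma) * ln(2) / half-life.
    """
    genomes_in_plasma = cfdna_genomes_per_ml * plasma_volume_ml
    minutes_per_day = 60 * 24
    flux_per_day = genomes_in_plasma * math.log(2) / half_life_min * minutes_per_day
    # Expected release is one genome equivalent per dying cell.
    return flux_per_day / cells_dying_per_day


# Placeholder numbers chosen only to show the arithmetic, not measured values.
fraction = cfdna_fraction(
    cells_dying_per_day=1e9,
    cfdna_genomes_per_ml=1000,
    plasma_volume_ml=3000,
    half_life_min=60,
)
```

With these placeholder inputs the ratio is about 5%, which would mean that most of the DNA from the dying cells never reaches plasma as measurable cfDNA.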
The muscle synergy is a guiding concept in motor control research that relies on the general notion of muscles ‘working together’ towards task performance. However, although the synergy concept has provided valuable insights into motor coordination, muscle interactions have not been fully characterised with respect to task performance. Here, we address this research gap by proposing a novel perspective on the muscle synergy that assigns specific functional roles to muscle couplings by characterising their task-relevance. Our novel perspective provides nuance to the muscle synergy concept, demonstrating how muscular interactions can ‘work together’ in different ways: (1) irrespective of the task at hand, (2) redundantly, or (3) complementarily towards common task goals. To establish this perspective, we leverage information- and network-theory and dimensionality reduction methods to include discrete and continuous task parameters directly during muscle synergy extraction. Specifically, we introduce co-information as a measure of the task-relevance of muscle interactions and use it to categorise such interactions as task-irrelevant (present across tasks), redundant (shared task information), or synergistic (different task information). To demonstrate these types of interactions in real data, we first apply the framework in a simple way, revealing its added functional and physiological relevance with respect to current approaches. We then apply the framework to large-scale datasets and extract generalisable and scale-invariant representations consisting of subnetworks of synchronised muscle couplings and distinct temporal patterns. The representations effectively capture the functional interplay between task end-goals and biomechanical affordances and the concurrent processing of functionally similar and complementary task information.
The proposed framework unifies the capabilities of current approaches in capturing distinct motor features while providing novel insights and research opportunities through a nuanced perspective on the muscle synergy.
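To make the co-information measure concrete, the sketch below computes it for two discretised muscle signals and a discrete task variable using plug-in entropy estimates. This is a minimal sketch under stated assumptions, not the estimator used in the study: the function names are hypothetical, the inputs are assumed to be already discretised, and the sign convention is the one in which co-information equals I(X;Y) minus I(X;Y|T), so positive values indicate redundant task information and negative values indicate synergistic task information.

```python
from collections import Counter
from math import log2


def entropy(*columns):
    """Plug-in joint entropy (in bits) of one or more discrete sequences."""
    joint = Counter(zip(*columns))
    n = sum(joint.values())
    return -sum(c / n * log2(c / n) for c in joint.values())


def co_information(x, y, t):
    """Co-information I(X;Y;T) = I(X;Y) - I(X;Y|T), via inclusion-exclusion."""
    return (entropy(x) + entropy(y) + entropy(t)
            - entropy(x, y) - entropy(x, t) - entropy(y, t)
            + entropy(x, y, t))


# Toy case 1: the task is the XOR of two independent binary signals, so the
# pair is informative about the task only when considered jointly.
x = [0, 0, 1, 1]
y = [0, 1, 0, 1]
task_xor = [0, 1, 1, 0]
synergistic = co_information(x, y, task_xor)

# Toy case 2: both signals are copies of the task, so they carry the same
# task information.
copy = [0, 1, 0, 1]
redundant = co_information(copy, copy, copy)
```

In the first toy case the value is negative (synergy) and in the second it is positive (redundancy). Real electromyographic signals are continuous, so an analysis along these lines would need a binning step or a continuous estimator, which this sketch leaves out.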