Most organisms are more closely related to nearby than distant members of their species, creating spatial autocorrelations in genetic data. This allows us to predict the location of a genetic sample by comparing it to a set of samples of known geographic origin. Here we describe a deep learning method, which we call Locator, to accomplish this task faster and more accurately than existing approaches. In simulations, Locator infers sample location to within 4.1 generations of dispersal and runs at least an order of magnitude faster than a recent model-based approach. We leverage Locator's computational efficiency to predict locations separately in windows across the genome, which allows us to both quantify uncertainty and describe the mosaic ancestry and patterns of geographic mixing that characterize many populations. Applied to whole-genome sequence data from Plasmodium parasites, Anopheles mosquitoes, and global human populations, this approach yields median test errors of 16.9 km, 5.7 km, and 85 km, respectively.
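To make the windowed analysis and the reported error metric concrete, the following is a minimal sketch, not Locator's own code. It assumes each sample already has one predicted (longitude, latitude) pair per genomic window, collapses those into a point estimate with a simple spread measure, and scores predictions by great-circle distance. The function names, the coordinate-wise median, and the haversine distance with a mean Earth radius of 6371 km are all assumptions of this illustration; the published method may summarize windows and measure error differently.

```python
import math
from statistics import median

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def summarize_windows(window_predictions):
    """Collapse per-window (lon, lat) predictions for one sample.

    Returns the coordinate-wise median as a point estimate, plus the median
    distance (km) from each window's prediction to that estimate as a rough
    measure of uncertainty. Hypothetical helper; it ignores wrap-around at
    the antimeridian.
    """
    lons = [lon for lon, _ in window_predictions]
    lats = [lat for _, lat in window_predictions]
    center = (median(lons), median(lats))
    spread_km = median(
        haversine_km(lon, lat, center[0], center[1])
        for lon, lat in window_predictions
    )
    return center, spread_km


def median_test_error_km(predicted, true):
    """Median great-circle error (km) over paired predicted/true locations."""
    return median(
        haversine_km(p[0], p[1], t[0], t[1]) for p, t in zip(predicted, true)
    )
```

A wide spread across windows for a single sample can reflect either low information in individual windows or genuinely mixed ancestry, so the spread is best read alongside a map of the per-window predictions rather than as a formal confidence interval.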
Locator is implemented as a command-line program written in Python: www.github.com/kern-lab/locator. SNP calls for the Anopheles dataset are available at https://www.malariagen.net/data/ag1000g-phase1-ar3, for P. falciparum at https://www.malariagen.net/resource/26, and for the HGDP at ftp://ngs.sanger.ac.uk/production/hgdp. Code to run continuous-space simulations can be found at https://github.com/kern-lab/spaceness/blob/master/slim_recipes/spaceness.slim. This publication uses data from the MalariaGEN Plasmodium falciparum Community Project as described in Pearson et al. (2019). Statistical analyses and many plots were produced in R (R Core Team, 2018).
Ag1000G phase 1 AR3 data release. MalariaGEN, http://www.malariagen.net/data/ag1000g-phase1-AR3.
Plasmodium falciparum community project version 6 data release. MalariaGEN, https://www.malariagen.net/resource/26.
Insights into human genetic variation and population history from 929 diverse genomes. HGDP, ftp://ngs.sanger.ac.uk/production/hgdp.
- CJ Battey
- Andrew D Kern
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
- Magnus Nordborg, Austrian Academy of Sciences, Austria
© 2020, Battey et al.
This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.