Rapid adaptation to malaria facilitated by admixture in the human population of Cabo Verde

  1. Iman Hamid (corresponding author)
  2. Katharine L Korunes
  3. Sandra Beleza
  4. Amy Goldberg (corresponding author)

  1. Department of Evolutionary Anthropology, Duke University, United States
  2. Department of Genetics and Genome Biology, University of Leicester, United Kingdom
6 figures, 2 tables and 2 additional files

Figures

Figure 1 with 2 supplements
Enrichment of West African ancestry at the DARC locus in Santiago, Cabo Verde.

(A) Map of Cabo Verde islands and sample sizes for number of individuals from each island region. (B) The distribution of West African-related local ancestry proportion across the genome by SNP (n = …

Figure 1—figure supplement 1
Local ancestry proportion along the genome in Santiago.

The mean is indicated by the solid horizontal line, and dashed horizontal lines represent three standard deviations from the mean. Again, this plot demonstrates Duffy-null (red dot) as the highest …
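The three-standard-deviation rule described in this caption can be sketched as follows. This is a minimal illustration on made-up values, not the authors' pipeline: the function name `flag_outliers`, the toy ancestry values, and the use of the population standard deviation are all assumptions.

```python
from statistics import mean, pstdev

def flag_outliers(values, n_sd=3.0):
    """Return (mean, sd, indices of values more than n_sd SDs from the mean)."""
    mu = mean(values)
    sd = pstdev(values)  # population SD; an assumption, the caption does not say which
    upper = mu + n_sd * sd
    lower = mu - n_sd * sd
    outliers = [i for i, v in enumerate(values) if v > upper or v < lower]
    return mu, sd, outliers

# Hypothetical local ancestry proportions at 100 loci, with one elevated locus.
ancestry = [0.72, 0.74, 0.73, 0.75] * 25
ancestry[40] = 0.95

mu, sd, outliers = flag_outliers(ancestry)
print(f"mean={mu:.4f}, sd={sd:.4f}, outlier indices={outliers}")
```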

Figure 1—figure supplement 2
The observed frequency of Duffy-null for each island vs the neutral expectation based on mean global ancestry (as estimated by the admixture software).

* indicates a significant p-value (<0.001) for the binomial test (see Table 1 for sample sizes and further details).

Figure 2 with 5 supplements
Long, high-frequency West African ancestry tracts span the DARC locus in Santiago.

(A) The distribution of West African (purple) and European (green) ancestry tract lengths spanning the DARC locus (dashed line). Each horizontal line represents a single chromosome in the population …

Figure 2—figure supplement 1
Mean standardized integrated Decay in Ancestry Tract (iDAT) score for 20 Mb sliding windows (step size = 1 Mb), using standardized iDAT for 10,000 random positions across the genome for (A) Fogo and (B) the Northwest Cluster.

Solid gray lines indicate mean windowed standardized iDAT score for each island (Fogo, 0.006; NW Cluster, −0.024) and dashed gray lines indicate three standard deviations from the mean. Vertical …
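The windowing scheme in this caption (20 Mb windows, 1 Mb step) can be sketched as below. The function `windowed_means`, the half-open window convention, and the toy positions and scores are assumptions made for illustration; the computation of the iDAT scores themselves is not shown here.

```python
def windowed_means(positions, scores, chrom_length,
                   window=20_000_000, step=1_000_000):
    """Mean score in each half-open window [start, start + window), sliding by `step`.

    Windows that contain no scored position are skipped.
    """
    results = []
    start = 0
    while start + window <= chrom_length:
        end = start + window
        in_window = [s for p, s in zip(positions, scores) if start <= p < end]
        if in_window:
            results.append((start, end, sum(in_window) / len(in_window)))
        start += step
    return results

# Hypothetical standardized scores at three positions on a 30 Mb chromosome.
positions = [5_000_000, 15_000_000, 25_000_000]
scores = [1.0, 3.0, 5.0]

windows = windowed_means(positions, scores, chrom_length=30_000_000)
for start, end, avg in windows:
    print(f"{start / 1e6:.0f}-{end / 1e6:.0f} Mb: mean score {avg:.2f}")
```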

Figure 2—figure supplement 2
Density distributions for five ancestry-based statistics under eight neutral models.

Summary statistics were calculated from a random sample of 172 individuals from each simulated population, matching the number of individuals from Santiago included in our analyses. High population …

Figure 2—figure supplement 3
Density distributions for five ancestry-based statistics under simulations using different genetic maps.

Simulations shown assumed a single pulse of admixture with exponential growth at a rate of 0.05 per generation and an initial population size of N = 10,000. Initial admixture contributions were …

Figure 2—figure supplement 4
Performance of integrated Decay in Ancestry Tract (iDAT) under various scenarios.

Each plot corresponds to number of generations since admixture (10 – left; 100 – middle; 1000 – right). Line and point colors correspond to source population one admixture contribution at m=0.1 (gray), …

Figure 2—figure supplement 5
Performance of integrated Decay in Ancestry Tract (iDAT) for various chromosome sizes and cut-off values.

Line and point colors correspond to simulated human chromosome and corresponding size (chr 1 – green; chr 7 – blue; chr 15 – yellow; chr 22 – gray). X-axis shows DAT cut-off values, and y-axis shows …

Figure 3
Absolute values of iHS for SNPs in the Cabo Verde data set.

iHS was calculated using the hapbin software and standardized using the default method based on allele frequencies. (A) Santiago, (B) Fogo, and (C) NW Cluster. Value for Duffy-null SNP is indicated …

Figure 4 with 2 supplements
Strong selection inferred at the DARC locus in Santiago.

(A) Pairs of s and h that result in a small difference between the final allele frequency calculated under the model and the allele frequency observed in the Santiago genetic data, $|p_{20} - p_{\mathrm{Duffy}}| < 0.01$, under a deterministic …
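A grid search of this kind can be sketched with a textbook single-locus recursion. Everything specific here is an assumption rather than the authors' model: the fitness scheme (1 + s, 1 + hs, 1), the starting frequency `p0 = 0.65`, the grid spacing, and the function names are hypothetical. Only the 20 generations, the 0.01 tolerance, and the observed Santiago frequency of 0.834 (Table 1) come from the text.

```python
def next_freq(p, s, h):
    """One generation of deterministic selection.

    Assumed genotype fitnesses: 1 + s (homozygous for the selected allele),
    1 + h*s (heterozygous), 1 (homozygous for the other allele).
    """
    q = 1.0 - p
    w_bar = p * p * (1 + s) + 2 * p * q * (1 + h * s) + q * q
    return (p * p * (1 + s) + p * q * (1 + h * s)) / w_bar

def freq_after(p0, s, h, generations=20):
    """Allele frequency after iterating the recursion for `generations` steps."""
    p = p0
    for _ in range(generations):
        p = next_freq(p, s, h)
    return p

def matching_pairs(p0, p_obs, s_grid, h_grid, tol=0.01, generations=20):
    """All (s, h) pairs whose final frequency lies within `tol` of p_obs."""
    return [(s, h) for s in s_grid for h in h_grid
            if abs(freq_after(p0, s, h, generations) - p_obs) < tol]

p0 = 0.65      # hypothetical starting frequency, for illustration only
p_obs = 0.834  # observed Duffy-null frequency in Santiago (Table 1)
s_grid = [i * 0.005 for i in range(41)]  # s from 0 to 0.2
h_grid = [i * 0.1 for i in range(11)]    # h from 0 to 1

pairs = matching_pairs(p0, p_obs, s_grid, h_grid)
print(f"{len(pairs)} (s, h) pairs fall within the tolerance")
```

Because many combinations of s and h reach the same final frequency, a search like this returns a band of compatible pairs rather than a single estimate.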

Figure 4—figure supplement 1
Results of approximate Bayesian computation (ABC) estimation of posterior distributions for (A) selection coefficient for Duffy-null and (B) initial West African ancestry contribution for Santiago.

Duffy-null allele was modeled as additive (blue; h=0.5), dominant (yellow; h=1 in SLiM), or recessive (pink; h=0 in SLiM). Posterior median estimates for selection coefficient: $s_{\mathrm{rec}} = 0.052$, $s_{\mathrm{add}} = 0.0795$, $s_{\mathrm{dom}} = 0.183$; initial …

Figure 4—figure supplement 2
Results of leave-one-out cross-validation of approximate Bayesian computation (ABC) joint estimation.

(A) Selection coefficient (RMSE = 0.0083, $R^2$ = 0.9785) and (B) initial West African admixture contribution (RMSE = 0.0090, $R^2$ = 0.9985).
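The two accuracy measures reported for the cross-validation can be computed as in the sketch below. The definition of R² as one minus the ratio of residual to total sum of squares is an assumption (the caption does not state which definition was used), and the true and estimated values are invented for illustration.

```python
from math import sqrt

def rmse(true_values, estimates):
    """Root mean squared error between true and estimated parameter values."""
    n = len(true_values)
    return sqrt(sum((t - e) ** 2 for t, e in zip(true_values, estimates)) / n)

def r_squared(true_values, estimates):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_true = sum(true_values) / len(true_values)
    ss_res = sum((t - e) ** 2 for t, e in zip(true_values, estimates))
    ss_tot = sum((t - mean_true) ** 2 for t in true_values)
    return 1.0 - ss_res / ss_tot

# Hypothetical held-out selection coefficients and their estimates.
true_s = [0.02, 0.05, 0.08, 0.11]
est_s = [0.025, 0.045, 0.085, 0.105]

print(f"RMSE = {rmse(true_s, est_s):.4f}, R^2 = {r_squared(true_s, est_s):.4f}")
```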

Figure 5 with 1 supplement
Selection at a single locus impacts genome-wide ancestry proportion.

(A) Inferred (dark gray), simulated (white), and observed (red) mean of global ancestry in Santiago over time. The dark gray histogram plots the posterior distribution for initial (g=1) West African …

Figure 5—figure supplement 1
Effect of selection on global ancestry across simulation methods.

Pink circles indicate West African mean global ancestry after 20 generations versus selection coefficient for whole autosome (22 chromosome) simulations, using a uniform recombination rate within …

Figure 6 with 1 supplement
Precision-recall curve for validation of SWIF(r) classification of neutral and positively selected variants, using 1000 neutral and 1000 positive selection simulations.

With our ancestry-based measures, SWIF(r) achieved an area under the curve (AUC) of 0.966, where an AUC of 1 represents a classifier with perfect skill. Horizontal dashed line indicates the no-skill …
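For readers unfamiliar with precision-recall curves, the sketch below shows one common way to build the curve and summarize its area. It uses the step-wise average-precision sum, which is an assumption and may differ from how the reported AUC was computed; the labels and scores are invented, ties between scores are not handled specially, and the function names are hypothetical.

```python
def precision_recall_points(labels, scores):
    """(recall, precision) after each example, ranked by decreasing score.

    labels: 1 for selected, 0 for neutral. Ties in score are broken arbitrarily.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    n_pos = sum(labels)
    tp = fp = 0
    points = []
    for _, label in ranked:
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((tp / n_pos, tp / (tp + fp)))
    return points

def average_precision(points):
    """Step-wise area under the precision-recall curve."""
    area = 0.0
    prev_recall = 0.0
    for recall, precision in points:
        area += (recall - prev_recall) * precision
        prev_recall = recall
    return area

# Hypothetical classifier output for six simulated variants.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]

ap = average_precision(precision_recall_points(labels, scores))
no_skill = sum(labels) / len(labels)  # precision of a classifier that labels everything positive
print(f"area = {ap:.4f}, no-skill precision = {no_skill:.2f}")
```

With equal numbers of neutral and selected simulations, as in the 1000 versus 1000 design here, the fraction of positives is 0.5.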

Figure 6—figure supplement 1
SWIF(r) classification results for 1000 neutral and 1000 positive selection simulations used for the test set based on Santiago’s demographic history.

(A) Confusion matrix with threshold P(selection)>0.5. There are no false positives in the test set and a high rate of false negatives. (B) Scatterplot of initial admixture contribution vs selection …
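A confusion matrix at a fixed probability threshold can be tallied as in this small sketch. The function name and the toy probabilities are assumptions, chosen so that the example shows the same qualitative pattern as the caption (no false positives, several false negatives).

```python
def confusion_matrix(labels, probabilities, threshold=0.5):
    """Count TP, FP, FN, TN when calling 'selected' for P(selection) > threshold."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for label, prob in zip(labels, probabilities):
        called_selected = prob > threshold
        if label == 1 and called_selected:
            counts["TP"] += 1
        elif label == 0 and called_selected:
            counts["FP"] += 1
        elif label == 1:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts

# Hypothetical simulations: 1 = positive selection, 0 = neutral.
labels = [1, 1, 1, 0, 0]
probabilities = [0.9, 0.4, 0.2, 0.1, 0.3]

matrix = confusion_matrix(labels, probabilities)
print(matrix)
```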

Tables

Table 1
Expected and observed Duffy-null allele frequencies for each island and source population.

Expected Duffy-null frequencies are approximated by mean West African global ancestry proportion for each island, calculated using the admixture software.

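| Population | n (sampled individuals) | Expected frequency | Observed frequency | Binomial test p-value |
|---|---|---|---|---|
| Santiago | 172 | 0.737 | 0.834 | 2.193 × 10⁻⁵ |
| Fogo | 129 | 0.498 | 0.539 | 0.192 |
| NW Cluster | 236 | 0.552 | 0.557 | 0.817 |
| GWD | 107 | 0.997 | 1.000 | - |
| IBS | 107 | 0.002 | 0.019 | - |
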
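The comparison in Table 1 can be illustrated with an exact two-sided binomial test written with the standard library only. Several choices here are assumptions rather than the authors' stated procedure: the number of trials is taken as 2n allele copies, the observed count is the rounded product of the observed frequency and 2n, and the two-sided p-value sums all outcomes no more probable than the observed one. The output is therefore not expected to match the published p-values exactly.

```python
from math import exp, lgamma, log

def binom_logpmf(k, n, p):
    """Log of the binomial probability of k successes in n trials (0 < p < 1)."""
    return (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
            + k * log(p) + (n - k) * log(1 - p))

def binom_test_two_sided(k, n, p):
    """Exact two-sided p-value: total probability of outcomes at most as likely as k."""
    log_obs = binom_logpmf(k, n, p)
    total = 0.0
    for i in range(n + 1):
        log_prob = binom_logpmf(i, n, p)
        if log_prob <= log_obs + 1e-9:  # small slack for floating-point ties
            total += exp(log_prob)
    return min(1.0, total)

# (population, sampled individuals, expected frequency, observed frequency) from Table 1
rows = [
    ("Santiago", 172, 0.737, 0.834),
    ("Fogo", 129, 0.498, 0.539),
    ("NW Cluster", 236, 0.552, 0.557),
]

p_values = {}
for name, n_individuals, expected, observed in rows:
    n_alleles = 2 * n_individuals            # assumption: two allele copies per individual
    count = round(observed * n_alleles)      # assumption: count recovered by rounding
    p_values[name] = binom_test_two_sided(count, n_alleles, expected)
    print(f"{name}: {count}/{n_alleles} Duffy-null copies, p = {p_values[name]:.3g}")
```

GWD and IBS are left out because the table reports no test for the source populations.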
Table 2
Demographic models used for single-chromosome neutral simulations relevant to Cabo Verde demographic history.
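
| Initial population size (N) | Population growth model | Population growth rate (per generation) | Admixture type | Proportion of new migrants (per generation) | Scenario number |
|---|---|---|---|---|---|
| 1000 | Constant size | - | Single-pulse | - | 1 |
| 1000 | Constant size | - | Continuous | 0.01 | 2 |
| 1000 | Exponential | 0.05 | Single-pulse | - | 3 |
| 1000 | Exponential | 0.05 | Continuous | 0.01 | 4 |
| 10,000 | Constant size | - | Single-pulse | - | 5 |
| 10,000 | Constant size | - | Continuous | 0.01 | 6 |
| 10,000 | Exponential | 0.05 | Single-pulse | - | 7 |
| 10,000 | Exponential | 0.05 | Continuous | 0.01 | 8 |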

Additional files

Supplementary file 1

Chromosome 16:46582888–60359576 GO terms.

File containing ENSEMBL gene IDs and associated GO terms for the 10 genes that overlap with region showing extreme iDAT signatures.

https://cdn.elifesciences.org/articles/63177/elife-63177-supp1-v2.zip

Transparent reporting form

https://cdn.elifesciences.org/articles/63177/elife-63177-transrepform-v2.docx
