6 figures, 7 tables and 1 additional file

Figures

Figure 1
Differentiation between African and Asian genomes.

(A) Neighbour-joining tree showing population structure across all sampling sites, with sample branches coloured according to country groupings (Table 1). Branches with a large open tip circle indicate kelch13 mutants, while those with a small black symbol indicate mixed infections (i.e. a mixture of wild-type and mutant parasites, or of two mutant parasites with different mutations). Branches without tip symbols are kelch13 wild type. African kelch13 mutants are, at a genomic level, similar to other African samples. (B) Plot of the second principal component (PC2) against the first (PC1), computed from a principal coordinate analysis (PCoA) of all samples in the present study, based on the same pairwise genetic distance matrix used for the tree of Figure 1A. PC1 clearly separates African samples from those collected in SEA, while PC2 is mainly driven by extreme population structure in ESEA.

https://doi.org/10.7554/eLife.08714.006
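
The PCoA described for Figure 1B can be sketched as classical multidimensional scaling of a pairwise distance matrix. The following is a minimal illustration, assuming NumPy is available; the function name `pcoa` and the toy distance matrix are hypothetical and do not come from the study, whose exact implementation is not described in this section.

```python
import numpy as np

def pcoa(dist, n_components=2):
    """Classical PCoA (metric MDS) of a symmetric pairwise distance matrix."""
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    # Double-centre the squared distances (Gower's transformation).
    j = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * j @ (d ** 2) @ j
    # Eigen-decompose and keep the components with the largest eigenvalues.
    vals, vecs = np.linalg.eigh(b)
    order = np.argsort(vals)[::-1][:n_components]
    vals, vecs = vals[order], vecs[:, order]
    # Coordinates are eigenvectors scaled by the square root of the
    # (non-negative) eigenvalues.
    return vecs * np.sqrt(np.clip(vals, 0.0, None))

# Toy example (hypothetical distances): samples 0-1 and 2-3 form two groups.
toy = np.array([
    [0.0, 1.0, 9.0, 9.0],
    [1.0, 0.0, 9.0, 9.0],
    [9.0, 9.0, 0.0, 1.0],
    [9.0, 9.0, 1.0, 0.0],
])
coords = pcoa(toy)  # column 0 is PC1, column 1 is PC2
```

In this toy case the first coordinate separates the two groups of samples, in the same way that PC1 separates African from SEA samples in the figure.
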
Figure 2 with 2 supplements
Local origin of African kelch13 mutations.

(A) Chromosome painting (see Materials and methods) of the 56 African kelch13 mutants across the two 250 kbp flanking regions on each side of the kelch13 gene. Each genome chunk is coloured according to the aggregated posterior probabilities that it originated in the African (red) or SEA (blue) population, according to the scale shown. (B) Detail of the flanking regions over a span of approximately 15 kbp, using the same colour scheme. The country of origin is indicated on the left, followed by the proportion of African chunks identified; the kelch13 mutation carried by the sample is shown on the right. Samples are sorted by the proportion of Asian chunks in this window, and the same order was applied to Panel A. Only five samples (lower region of the panel) show strong probability of Asian origin of the chunks closest to kelch13.

https://doi.org/10.7554/eLife.08714.007
Figure 2—figure supplement 1
Diversity within kelch13-flanking haplotypes in African mutants.

Using the same SNPs as in the analysis shown in Figure 2B, we estimated diversity in a 1 kbp sliding window around kelch13 (highlighted in grey) by computing the mean SNP heterozygosity in a random set of 56 African kelch13 wild-type genomes. The blue shaded area covers 95% of the distribution of values over 1000 iterations. We also computed the heterozygosity estimate for the set of 56 African kelch13 mutants (red line) and found that it did not differ significantly from that of wild-type parasites, suggesting that no common flanking haplotype is shared among these mutants.

https://doi.org/10.7554/eLife.08714.008
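
The resampling comparison described for Figure 2—figure supplement 1 can be sketched as follows for a single window. This is a minimal illustration, not the study's code: it assumes biallelic SNPs coded 0/1 and takes heterozygosity to be 2p(1-p), which is an assumption because the exact estimator is not stated in this section. All function names and the toy genotypes are hypothetical.

```python
import random

def mean_heterozygosity(genotypes):
    """Mean expected heterozygosity, 2p(1-p), across biallelic SNPs.

    `genotypes` is a list of samples, each a list of 0/1 allele calls
    (one call per SNP in the window).
    """
    n_samples = len(genotypes)
    n_snps = len(genotypes[0])
    total = 0.0
    for snp in range(n_snps):
        p = sum(sample[snp] for sample in genotypes) / n_samples
        total += 2.0 * p * (1.0 - p)
    return total / n_snps

def resampled_interval(wild_type, subset_size, iterations=1000, seed=0):
    """Central 95% interval of the statistic over random wild-type subsets."""
    rng = random.Random(seed)
    values = sorted(
        mean_heterozygosity(rng.sample(wild_type, subset_size))
        for _ in range(iterations)
    )
    lo = values[int(0.025 * (iterations - 1))]
    hi = values[int(0.975 * (iterations - 1))]
    return lo, hi

# Toy data (hypothetical): 200 wild-type and 56 mutant samples, 5 SNPs.
rng = random.Random(1)
wild_type = [[int(rng.random() < 0.1) for _ in range(5)] for _ in range(200)]
mutants = [[int(rng.random() < 0.1) for _ in range(5)] for _ in range(56)]

low, high = resampled_interval(wild_type, subset_size=56)
observed = mean_heterozygosity(mutants)
within_null = low <= observed <= high  # True if mutants resemble wild type
```

Repeating this for each position of the sliding window would give the shaded band (wild type) and the line (mutants) described in the legend.
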
Figure 2—figure supplement 2
Origin of flanking haplotypes in selected African kelch13 mutants.

Chromosome painting was applied to the 15 kbp region around the kelch13 gene (as in Figure 2B) for the five samples in which kelch13 was flanked by chunks with >50% probability of Asian origin. The SEA population was divided into two groups, kelch13 wild type (blue) and mutants (yellow), to estimate the likelihood that the haplotype originated in SEA kelch13 mutants. Strong evidence of origin from SEA mutants was found in only two samples: a Y493H mutant from Ghana (top row) and a C580Y mutant from Cameroon (bottom row).

https://doi.org/10.7554/eLife.08714.009
Figure 3 with 1 supplement
Number and density of variants in Africa and Southeast Asia.

(a) Allele frequency spectrum for Africa (red) and SEA (blue). Polymorphisms were binned by their minor allele frequency (MAF), and the counts in each bin were plotted against frequency, shown on a logarithmic scale. Although the number of high-frequency variants is consistent between the two regions, samples from Africa possess an excess of low-frequency variants. (b) SNP density per gene: for each gene, the number of variants found in the two regions is normalized by the length of the coding region (in kbp). African samples have on average 3.9 times more mutations than parasites from SEA. (c) Non-synonymous/synonymous ratio per gene: ratios of non-synonymous to synonymous mutations found per gene are similar in the two regions. To reduce artifacts due to small numbers, only genes with at least 10 SNPs were considered in both analyses.

https://doi.org/10.7554/eLife.08714.010
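
The binning in panel (a) and the normalisation in panel (b) of Figure 3 can be sketched as below. This is a minimal illustration under stated assumptions: one bin per decade of MAF is an arbitrary choice (the bin width used in the figure is not given here), and the function names and example values are hypothetical.

```python
import math
from collections import Counter

def maf_spectrum(allele_counts, n_samples, bins_per_decade=1):
    """Count variants per logarithmic minor-allele-frequency bin.

    `allele_counts` holds, for each variant, the number of samples that
    carry the non-reference allele. The result maps a bin index k to the
    number of variants whose MAF lies in
    [10 ** (k / bins_per_decade), 10 ** ((k + 1) / bins_per_decade)).
    """
    spectrum = Counter()
    for count in allele_counts:
        freq = count / n_samples
        maf = min(freq, 1.0 - freq)
        if maf <= 0.0:
            continue  # monomorphic in this population
        spectrum[math.floor(math.log10(maf) * bins_per_decade)] += 1
    return dict(spectrum)

def snp_density(n_snps, coding_length_bp):
    """Number of SNPs per kbp of coding sequence."""
    return n_snps / (coding_length_bp / 1000.0)

# Hypothetical example: four variants typed in 1000 samples.
spectrum = maf_spectrum([3, 3, 50, 400], 1000)  # {-3: 2, -2: 1, -1: 1}
density = snp_density(39, 3000)                 # 13.0 SNPs per kbp
```
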
Figure 3—figure supplement 1
Site frequency spectrum for non-synonymous mutations in the KPBD.

Allele frequency spectra for Africa (red) and SEA (blue) of non-synonymous mutations in the KPBD. Polymorphisms were binned by their minor allele frequency (MAF), shown on a logarithmic scale, and the counts in each bin reported. The two populations exhibit dramatically different behaviours with an abundance of medium and high frequency variations in SEA contrasted by an extreme paucity in Africa.

https://doi.org/10.7554/eLife.08714.011
Figure 4
Structure of kelch13, positioning of mutations in Africa and Southeast Asia, and sequence conservation.

(a) The amino acid positions of kelch13 polymorphisms observed in Africa (red) and SEA (blue) are shown. Coloured rectangles describe the extents of the resistance domains (BTB/POZ: aa. 349–448; kelch propeller: aa. 443–721) and upstream region, with the locations of non-synonymous changes indicated above, and those of synonymous changes below. Short lines represent singleton/doubleton mutations, while longer lines represent more frequent mutations. (b) Conservation score across amino acid residues of kelch13, derived by applying a CCF53P62 matrix to alignments of the P. falciparum gene coding sequence with its homologues in six other Plasmodium species for which high-quality sequence data were available: P. reichenowi, P. vivax, P. knowlesi, P. yoelii, P. berghei, and P. chabaudi (see Materials and methods). Although the region below position 200 is less conserved, particularly in rodent species (P. yoelii, P. berghei, and P. chabaudi), there is remarkably high conservation across all species over the rest of the gene, which includes the KPBD.

https://doi.org/10.7554/eLife.08714.012
Figure 5 with 2 supplements
Genome-wide analysis of N/S ratio.

(a) For each gene with more than 2 synonymous or non-synonymous SNPs, the N/S ratios in Africa (red points) and in SEA (blue points) are plotted against the conservation score of the gene coding sequence. The kelch13 gene values are represented by larger circles. For each region, a solid line shows median values, while dotted lines delimit 95% of the genes at varying levels of conservation. This plot is truncated on the y-axis to show more clearly the bulk of the distribution; the full range is shown in Figure 5—figure supplement 1. (b) Histogram showing the distribution of the ratio of N/S ratios in SEA and Africa, for all genes with ≥5 synonymous and ≥5 non-synonymous SNPs in each region. An arrow shows the placement of kelch13.

https://doi.org/10.7554/eLife.08714.015
Figure 5—figure supplement 1
N/S ratio of well-known drug resistance genes.

This figure shows the same data as Figure 5a, extended by mapping genes known to have undergone selection for antimalarial resistance. For each gene, we show circles marking the N/S ratio in Africa (red letter) and SEA (blue letter), joined by a vertical line annotated with the p-value (by Fisher’s exact test) for the difference between the two ratios. The genes shown are kelch13 (denoted by the letter ‘K’), pfcrt (chloroquine resistance, ‘C’), pfdhfr (pyrimethamine resistance, ‘R’), pfdhps (sulfadoxine resistance, ‘S’), and pfmdr1 (resistance to chloroquine and other drugs, ‘M’). These results are best analyzed in the context of the underlying selective sweeps. For example, the small difference between ratios for the pfmdr1 gene can be explained by the association of the drug resistance phenotype with gene amplification, rather than with non-synonymous changes. Also, the extremely high N/S ratio for pfcrt in SEA is largely due to a paucity of synonymous SNPs, caused by the near-fixation of a very small number of haplotypes in the SEA region, yielding a p-value comparable to that of kelch13.

https://doi.org/10.7554/eLife.08714.016
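
The per-gene comparison of N/S ratios by Fisher's exact test, as described for Figure 5—figure supplement 1, can be sketched with a small pure-Python implementation. This is a minimal illustration: the function name is hypothetical, the two-sided definition used here (summing all tables no more probable than the observed one) is an assumption about how the test was applied, and the gene counts are invented for the example.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed
    # one (a small tolerance guards against floating-point ties).
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))

# Hypothetical counts for one gene: non-synonymous (N) and synonymous (S)
# SNPs in each region.
afr_n, afr_s = 20, 40
sea_n, sea_s = 30, 10
ratio_afr = afr_n / afr_s  # N/S ratio in Africa: 0.5
ratio_sea = sea_n / sea_s  # N/S ratio in SEA: 3.0
p_value = fisher_exact_two_sided(afr_n, afr_s, sea_n, sea_s)
```

A small p-value indicates that the split between N and S differs between the two regions more than expected by chance for that gene.
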
Figure 5—figure supplement 2
Genome-wide analysis of high-frequency SNP density.

(a) Plot of the density of frequent (present in 3 or more samples) non-synonymous mutations in Africa (red) and SEA (blue) against conservation, for all genes with more than 2 synonymous or non-synonymous SNPs, showing that kelch13 (larger circle) is consistent with other similarly conserved genes in Africa, but has excess non-synonymous polymorphisms in SEA. (b) An analogous plot for synonymous mutations also shows that kelch13 follows the normal trend among genes in Africa, but has far fewer SNPs than expected in SEA.

https://doi.org/10.7554/eLife.08714.017
Figure 6 with 1 supplement
Structure of the kelch13 propeller domain, showing the position of mutations in Southeast Asia and Africa.

The sequence of the kelch13 propeller domain (amino acids 443–726) is organized according to its 6-blade tertiary structure, with the four β-strands characterizing each blade highlighted in colour. Polymorphisms observed in SEA (top panel) and Africa (bottom panel) are shown by circles above the mutated position. Small circles indicate very rare mutations (singletons and doubletons), while larger circles are used for more frequent mutations.

https://doi.org/10.7554/eLife.08714.018
Figure 6—figure supplement 1
Characterization of kelch13 mutations in Africa and Southeast Asia.

In these plots, all kelch13 amino acids in the propeller domains, downstream of position 430, are plotted against their Kyte-Doolittle hydrophobicity score (KD, gray). Circle symbols in plot (a) show derived amino acid alleles in P. falciparum, coloured according to the conservation score against the ancestral allele (Table 5), derived from the CCF53P62 substitution matrix; lower values indicate more radical substitutions. The remaining plots use the same colouring scheme to show polymorphisms observed in ≥3 samples in Africa (b) or in SEA (c).

https://doi.org/10.7554/eLife.08714.019
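
The Kyte-Doolittle score used in Figure 6—figure supplement 1 is a fixed per-residue value, so it can be looked up directly. The sketch below uses the standard Kyte-Doolittle hydropathy scale; the helper names are hypothetical, and the change in hydropathy between ancestral and derived residues is shown only as an illustration, not as a quantity reported in the figure.

```python
# Kyte-Doolittle hydropathy index per amino acid (Kyte & Doolittle, 1982).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def parse_mutation(name):
    """Split a label such as 'C580Y' into (ancestral, position, derived)."""
    return name[0], int(name[1:-1]), name[-1]

def hydropathy_change(name):
    """Difference in KD score between the derived and ancestral residue."""
    ref, _position, alt = parse_mutation(name)
    return round(KD[alt] - KD[ref], 1)

# Example with two mutations listed in Table 3.
change_c580y = hydropathy_change("C580Y")  # -3.8 (Cys 2.5 -> Tyr -1.3)
change_a578s = hydropathy_change("A578S")  # -2.6 (Ala 1.8 -> Ser -0.8)
```
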

Tables

Table 1

Count of samples analysed in this study.

https://doi.org/10.7554/eLife.08714.003
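| Region | Code | Sample counts | Country | Code | Sample counts |
|---|---|---|---|---|---|
| West Africa | WAF | 957 | Burkina Faso | BF | 56 |
| | | | Cameroon | CM | 134 |
| | | | Ghana | GH | 478 |
| | | | Gambia, The | GM | 73 |
| | | | Guinea | GN | 124 |
| | | | Mali | ML | 87 |
| | | | Nigeria | NG | 5 |
| East Africa | EAF | 412 | Kenya | KE | 52 |
| | | | Madagascar | MG | 18 |
| | | | Malawi | MW | 262 |
| | | | Tanzania | TZ | 68 |
| | | | Uganda | UG | 12 |
| Central Africa | CAF | 279 | D. R. Congo | CD | 279 |
| South America | SAM | 27 | Colombia | CO | 16 |
| | | | Peru | PE | 11 |
| South Asia | SAS | 75 | Bangladesh | BD | 75 |
| West Southeast Asia | WSEA | 497 | Myanmar | MM | 111 |
| | | | Thailand | TH | 386 |
| East Southeast Asia | ESEA | 1102 | Cambodia | KH | 762 |
| | | | Laos | LA | 120 |
| | | | Vietnam | VN | 220 |
| Oceania | OCE | 62 | Indonesia (Papua) | ID | 17 |
| | | | Papua New Guinea | PG | 45 |
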
Table 2

Worldwide distribution of kelch13 mutations. Number of kelch13 propeller and BTB-POZ domain (KPBD) mutations present (not necessarily exclusively) in 5 populations (AFR = Africa, SEA = Southeast Asia, SAS = South Asia, OCE = Oceania, SAM = South America). Sample size is reported for each population.

https://doi.org/10.7554/eLife.08714.004
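| Type | Gene region | AFR (N = 1,648) | SEA (N = 1,599) | SAS (N = 75) | OCE (N = 62) | SAM (N = 27) |
|---|---|---|---|---|---|---|
| Non-synonymous | KPBD | 26 | 34 | 1 | 1 | 0 |
| | Upstream region | 42 | 16 | 2 | 1 | 1 |
| Synonymous | KPBD | 38 | 9 | 1 | 0 | 0 |
| | Upstream region | 22 | 3 | 1 | 0 | 0 |
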
Table 3

List of non-synonymous KPBD mutations. Non-synonymous mutations found in the kelch13 propeller and BTB-POZ domains (KPBD) and their position on chromosome Pf3D7_13_v3. For each mutation, we report the populations in which it was observed and the number of samples carrying it. Where known, we report whether the mutation has previously been observed in patients with a prolonged parasite clearance half-life (>5 hr) by Miotto et al. 2014 and/or Ashley et al. 2014. Sample size is reported for each population.

https://doi.org/10.7554/eLife.08714.005
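| Mutation | Genomic coordinates | AFR (N = 1,648) | SEA (N = 1,599) | SAS (N = 75) | OCE (N = 62) | SAM (N = 27) | Observed in ART-R samples? |
|---|---|---|---|---|---|---|---|
| D353Y | 1725941 | - | 4 | - | - | - | Yes |
| F395Y | 1725814 | - | 1 | - | - | - | No |
| I416V | 1725752 | 1 | 1 | - | - | - | |
| I416M | 1725750 | 1 | - | - | - | - | |
| K438N | 1725684 | - | 1 | - | - | - | No |
| P441L | 1725676 | - | 27 | - | - | - | Yes |
| P443S | 1725671 | - | 1 | - | - | - | |
| F446I | 1725662 | - | 7 | - | - | - | Yes |
| G449A | 1725652 | - | 7 | - | - | - | Yes |
| S459L | 1725622 | 2 | 2 | - | - | - | |
| A481V | 1725556 | - | 4 | - | - | - | Yes |
| S485N | 1725544 | - | 1 | - | - | - | |
| Y493H | 1725521 | 1 | 76 | - | - | - | Yes |
| V520I | 1725440 | 1 | - | - | - | - | |
| S522C | 1725434 | 2 | 1 | - | - | - | Yes |
| P527H | 1725418 | 1 | 5 | - | - | - | |
| C532S | 1725404 | 1 | - | - | - | - | |
| V534L | 1725398 | 2 | - | - | - | - | |
| N537I | 1725388 | 1 | 1 | - | - | - | No |
| G538V | 1725385 | - | 19 | - | - | - | Yes |
| R539T | 1725382 | - | 63 | - | - | - | Yes |
| I543T | 1725370 | - | 34 | - | - | - | Yes |
| P553L | 1725340 | 2 | 24 | - | - | - | Yes |
| A557S | 1725329 | 1 | - | - | - | - | |
| R561H | 1725316 | 1 | 24 | - | - | - | Yes |
| V568G | 1725295 | - | 6 | - | - | - | Yes |
| T573S | 1725280 | 2 | - | - | - | - | |
| P574L | 1725277 | - | 12 | - | - | - | Yes |
| R575K | 1725274 | - | 3 | - | - | - | |
| A578S | 1725266 | 18 | - | - | - | - | No |
| C580Y | 1725259 | 2 | 423 | - | 1 | - | Yes |
| D584V | 1725247 | - | 3 | - | - | - | Yes |
| V589I | 1725233 | 2 | - | - | - | - | |
| T593S | 1725221 | 1 | - | - | - | - | |
| E612D | 1725162 | 1 | - | - | - | - | |
| Q613E | 1725161 | 5 | 1 | - | - | - | |
| Q613L | 1725160 | 1 | - | - | - | - | No |
| F614L | 1725158 | - | 1 | - | - | - | No |
| Y630F | 1725109 | 2 | 1 | - | - | - | |
| V637I | 1725089 | 2 | - | - | - | - | |
| P667A | 1724999 | - | 2 | - | - | - | |
| P667L | 1724998 | - | 2 | 1 | - | - | |
| F673I | 1724981 | - | 3 | - | - | - | Yes |
| A675V | 1724974 | 1 | 18 | - | - | - | Yes |
| A676S | 1724972 | 2 | 3 | - | - | - | |
| H719N | 1724843 | 1 | 8 | - | - | - | Yes |
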
Table 4

Frequency of the non-synonymous KPBD mutations. Counts of non-synonymous mutations in the conserved propeller and BTB-POZ domains of kelch13 are shown for each geographical region, stratified by the number of samples in which they are observed. Sample size for each population is reported.

https://doi.org/10.7554/eLife.08714.013
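| Mutation frequency | AFR (N = 1,648) | SEA (N = 1,599) | SAS (N = 75) | OCE (N = 62) | SAM (N = 27) |
|---|---|---|---|---|---|
| 1–2 samples | 24 | 13 | 1 | 1 | 0 |
| 3–5 samples | 1 | 7 | 0 | 0 | 0 |
| >5 samples | 1 | 14 | 0 | 0 | 0 |
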
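
The stratification in Table 4 can be reproduced from the per-mutation sample counts in Table 3. The sketch below does this for the SEA column only; the counts are transcribed from Table 3, while the function names and class labels are hypothetical choices made for this illustration.

```python
from collections import Counter

# Number of SEA samples carrying each non-synonymous KPBD mutation,
# transcribed from the SEA column of Table 3.
SEA_COUNTS = {
    "D353Y": 4, "F395Y": 1, "I416V": 1, "K438N": 1, "P441L": 27,
    "P443S": 1, "F446I": 7, "G449A": 7, "S459L": 2, "A481V": 4,
    "S485N": 1, "Y493H": 76, "S522C": 1, "P527H": 5, "N537I": 1,
    "G538V": 19, "R539T": 63, "I543T": 34, "P553L": 24, "R561H": 24,
    "V568G": 6, "P574L": 12, "R575K": 3, "C580Y": 423, "D584V": 3,
    "Q613E": 1, "F614L": 1, "Y630F": 1, "P667A": 2, "P667L": 2,
    "F673I": 3, "A675V": 18, "A676S": 3, "H719N": 8,
}

def frequency_class(n_samples):
    """Assign a mutation to one of the three frequency classes of Table 4."""
    if n_samples <= 2:
        return "1-2"
    if n_samples <= 5:
        return "3-5"
    return ">5"

def stratify(counts):
    """Count mutations per frequency class, ignoring unobserved mutations."""
    return dict(Counter(frequency_class(n) for n in counts.values() if n > 0))

sea_classes = stratify(SEA_COUNTS)  # {'3-5': 7, '1-2': 13, '>5': 14}
```

The result matches the SEA column of Table 4 (13, 7 and 14 mutations in the three classes).
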
Table 5

Kelch13 propeller domain mutations in different Plasmodium species. Here we report amino acid allele differences in a multiple sequence alignment of kelch13 homologues for seven species of Plasmodium parasites for which high-quality sequence data were available: P. falciparum (Pf), P. reichenowi (Pr), P. vivax (Pv), P. knowlesi (Pk), P. yoelii (Py), P. berghei (Pb), and P. chabaudi (Pc). The species formed three groups by similarity: Laverania (Pf, Pr), primate Plasmodia (Pv, Pk) and rodent Plasmodia (Py, Pb and Pc). An allele shared by all members of two different groups was identified as a putative ancestral allele. The table shows, for each position where at least one species exhibits a difference from the others: the amino acid position in the Pf kelch13 sequence; the putative ancestral amino acid allele; the alleles in the various species (columns whose heading lists multiple species show mutations common to those species); and a substitution score of the mutation, based on a CCF53P62 substitution matrix (see Materials and methods). All substitution scores are ≥0, denoting conservative substitutions.

https://doi.org/10.7554/eLife.08714.014
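| Pf Position | Ancestral Allele | Pf,Pr | Pv,Pk | Pk | Py,Pb,Pc | Py,Pb | Pb | Pc | Substitution Score |
|---|---|---|---|---|---|---|---|---|---|
| 434 | F | - | - | - | - | - | - | Y | 2 |
| 447 | C | - | - | - | - | S | - | - | 1 |
| 448 | I | - | M | - | - | - | - | - | 2 |
| 517 | T | V | - | - | - | - | - | - | 0 |
| 519 | F | Y | - | - | - | - | - | - | 2 |
| 520 | V | - | - | - | L | - | - | - | 0 |
| 534 | V | - | - | - | I | - | - | - | 2 |
| 550 | S | - | - | C | - | - | - | - | 1 |
| 566 | V | - | - | - | I | - | - | - | 2 |
| 568 | V | - | I | - | - | - | - | - | 2 |
| 578 | A | - | S | - | - | - | - | - | 0 |
| 584 | D | - | - | E | - | - | - | - | 1 |
| 590 | I | - | - | - | - | V | - | - | 2 |
| 593 | T | - | - | - | A | - | - | - | 0 |
| 605 | D | E | - | - | - | - | - | - | 1 |
| 613 | Q | - | - | - | - | N | - | K | 0 |
| 648 | D | - | - | - | E | - | - | - | 1 |
| 666 | V | - | - | - | I | - | - | - | 2 |
| 676 | A | - | - | - | T | - | - | - | 0 |
| 691 | D | E | - | - | - | - | - | - | 1 |
| 708 | I | L | - | - | - | - | - | - | 0 |
| 711 | S | - | - | - | - | - | P | - | 0 |
| 723 | I | - | - | - | V | - | - | - | 2 |
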
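
The rule stated in the Table 5 legend, that an allele shared by all members of two different groups is taken as the putative ancestral allele, can be sketched as follows. This is a minimal illustration under that rule only; the function name and data layout are hypothetical, and the two example columns are written out from the table rows for positions 448 and 434.

```python
# Species groups as defined in the Table 5 legend.
GROUPS = {
    "Laverania": ("Pf", "Pr"),
    "primate": ("Pv", "Pk"),
    "rodent": ("Py", "Pb", "Pc"),
}

def putative_ancestral(alleles):
    """Return the allele fixed in at least two groups, or None.

    `alleles` maps each species code to its amino acid at one alignment
    column.
    """
    fixed = []
    for members in GROUPS.values():
        observed = {alleles[species] for species in members}
        if len(observed) == 1:  # all members of the group agree
            fixed.append(observed.pop())
    for allele in set(fixed):
        if fixed.count(allele) >= 2:
            return allele
    return None

# Position 448: Pv and Pk carry M, all other species carry I.
pos_448 = {"Pf": "I", "Pr": "I", "Pv": "M", "Pk": "M",
           "Py": "I", "Pb": "I", "Pc": "I"}
# Position 434: only Pc differs (Y), so the rodent group is not fixed.
pos_434 = {"Pf": "F", "Pr": "F", "Pv": "F", "Pk": "F",
           "Py": "F", "Pb": "F", "Pc": "Y"}

ancestral_448 = putative_ancestral(pos_448)  # 'I'
ancestral_434 = putative_ancestral(pos_434)  # 'F'
```

With three groups, at most one allele can be fixed in two of them, so the function returns either a single allele or `None` when no two groups agree.
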
Table 6

Conservation score of KPBD mutations. The table shows, for each non-synonymous KPBD mutation observed in the dataset, the number of samples carrying the mutation in Africa (AFR) and in Southeast Asia (SEA), and a substitution score of the mutation, based on a CCF53P62 substitution matrix; lower values indicate more radical substitutions. Mutations observed in Africa tend to have higher conservation scores, whereas mutations in SEA tend to be more radical.

https://doi.org/10.7554/eLife.08714.020
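| Mutation | AFR | SEA | CCF53P62 |
|---|---|---|---|
| Q613E | 5 | 1 | 2 |
| Y630F | 2 | 1 | 2 |
| V637I | 2 | 0 | 2 |
| V589I | 2 | 0 | 2 |
| T573S | 2 | 0 | 2 |
| Y493H | 1 | 76 | 2 |
| I416V | 1 | 1 | 2 |
| T593S | 1 | 0 | 2 |
| V520I | 1 | 0 | 2 |
| I416M | 1 | 0 | 2 |
| F395Y | 0 | 1 | 2 |
| S522C | 2 | 1 | 1 |
| E612D | 1 | 0 | 1 |
| C532S | 1 | 0 | 1 |
| R575K | 0 | 3 | 1 |
| A578S | 18 | 0 | 0 |
| A676S | 2 | 3 | 0 |
| V534L | 2 | 0 | 0 |
| R561H | 1 | 24 | 0 |
| A675V | 1 | 18 | 0 |
| H719N | 1 | 8 | 0 |
| P527H | 1 | 5 | 0 |
| A557S | 1 | 0 | 0 |
| R539T | 0 | 63 | 0 |
| I543T | 0 | 34 | 0 |
| G449A | 0 | 7 | 0 |
| F446I | 0 | 7 | 0 |
| A481V | 0 | 4 | 0 |
| F673I | 0 | 3 | 0 |
| P667A | 0 | 2 | 0 |
| F614L | 0 | 1 | 0 |
| S485N | 0 | 1 | 0 |
| P443S | 0 | 1 | 0 |
| C580Y | 2 | 423 | -2 |
| D353Y | 0 | 4 | -2 |
| K438N | 0 | 1 | -2 |
| P553L | 2 | 24 | -3 |
| P441L | 0 | 27 | -3 |
| G538V | 0 | 19 | -3 |
| P574L | 0 | 12 | -3 |
| V568G | 0 | 6 | -3 |
| P667L | 0 | 2 | -3 |
| S459L | 2 | 2 | -4 |
| N537I | 1 | 1 | -4 |
| Q613L | 1 | 0 | -4 |
| D584V | 0 | 3 | -5 |
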
Table 7

Frequency of the genetic background alleles across the world. Frequency of the four genetic background alleles identified in Miotto et al. (2015) for each geographical region. For each SNP, we show mutation name; chromosome number; nucleotide position; and frequencies of the mutant allele in the various populations.

https://doi.org/10.7554/eLife.08714.021
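| Mutation | Chr | Pos | AFR | SAS | SEA | PNG | SAM |
|---|---|---|---|---|---|---|---|
| arps10-V127M | 14 | 2481070 | 0.0% | 0.0% | 59.4% | 0.0% | 0.0% |
| fd-D193Y | 13 | 748395 | 0.1% | 2.2% | 62.8% | 23.9% | 0.0% |
| mdr2-T484I | 14 | 1956225 | 0.1% | 5.7% | 64.2% | 0.4% | 0.0% |
| crt-N326S | 7 | 405362 | 0.8% | 28.2% | 68.6% | 0.1% | 0.0% |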

Additional files

Supplementary file 1

Table of N/S ratio and conservation per gene.

For each P. falciparum gene, this table lists: the systematic ID, position and description of the gene; the orthologous gene in P. chabaudi; the conservation score; the counts of all (N)on-synonymous and (S)ynonymous mutations in Africa (AFR) and Southeast Asia (SEA); the counts of rare mutations (i.e. present in only 1 or 2 samples); the N/S log fold-change in SEA vs AFR; and the Fisher’s exact test p-value of N/S in AFR vs SEA.

https://doi.org/10.7554/eLife.08714.022


MalariaGEN Plasmodium falciparum Community Project (2016) Genomic epidemiology of artemisinin resistant malaria. eLife 5:e08714. https://doi.org/10.7554/eLife.08714