Research Article

Improving statistical power in severe malaria genetic association studies by augmenting phenotypic precision

Mahidol Oxford Tropical Medicine Research Unit, Faculty of Tropical Medicine, Mahidol University, Thailand
Centre for Tropical Medicine and Global Health, Nuffield Department of Medicine, University of Oxford, United Kingdom
KEMRI-Wellcome Trust Research Programme, Centre for Geographic Medicine Research-Coast, Kenya
The Wellcome Sanger Institute, United Kingdom
Wellcome Trust Centre for Human Genetics, University of Oxford, United Kingdom
Medical Research Council Clinical Trials Unit, University College London, United Kingdom
Institute of Global Health Innovation, Imperial College, London, United Kingdom
Nuffield Department of Medicine, University of Oxford, United Kingdom
Department of Statistics, University of Oxford, United Kingdom

Jul 6, 2021

https://doi.org/10.7554/eLife.69698

Open access

Version of Record: November 22, 2021
Version of Record: July 27, 2021
Accepted Manuscript: July 6, 2021


Part of Collection
Malaria: A Collection of Articles

Edited by Olivier Silvie et al.

Abstract

Severe falciparum malaria has substantially affected human evolution. Genetic association studies of patients with clinically defined severe malaria and matched population controls have helped characterise human genetic susceptibility to severe malaria, but phenotypic imprecision compromises discovered associations. In areas of high malaria transmission, the diagnosis of severe malaria in young children and, in particular, the distinction from bacterial sepsis are imprecise. We developed a probabilistic diagnostic model of severe malaria using platelet and white count data. Under this model, we re-analysed clinical and genetic data from 2220 Kenyan children with clinically defined severe malaria and 3940 population controls, adjusting for phenotype mis-labelling. Our model, validated by the distribution of sickle trait, estimated that approximately one-third of cases did not have severe malaria. We propose a data-tilting approach for case-control studies with phenotype mis-labelling and show that this reduces false discovery rates and improves statistical power in genome-wide association studies.

eLife digest

In areas of sub-Saharan Africa where malaria is common, most people are frequently exposed to the bites of mosquitoes carrying malaria parasites, so they often have malaria parasites in their blood. Young children, who have not yet built up strong immunity against malaria, often fall ill with severe malaria, a life-threatening disease. It is unclear why some children develop severe malaria and die, while other children with high numbers of parasites in their blood do not develop any apparent symptoms.

Genetic susceptibility studies are designed to uncover why such differences exist by comparing individuals with severe malaria (referred to as ‘cases’) with individuals drawn from the general population (known as ‘controls’). But severe malaria can be a challenge to diagnose. Since high numbers of malaria parasites can be found in healthy children, it is sometimes difficult to determine whether the parasites are making a child ill, or whether they are a coincidental finding. Consequently, some of the ‘cases’ recruited into these studies may actually have a different disease, such as bacterial sepsis. This ultimately affects how the studies are interpreted, and introduces error and inaccuracy into the data.

Watson, Ndila et al. investigated whether measuring blood biomarkers in patients (derived from the complete blood count, including platelet counts and white blood cell counts) could improve the accuracy with which malaria is diagnosed. They developed a new mathematical model that incorporates platelet and white blood cell counts. This model estimates that in a large cohort of 2,220 Kenyan children diagnosed with severe malaria, around one third of enrolled children did not actually have this disease. Further analysis suggests that patients with severe malaria are highly unlikely to have platelet counts higher than 200,000 per microlitre. This defines a cut-off that researchers can use to avoid recruiting patients who do not have severe malaria in future studies. Additionally, the ability to diagnose severe malaria more accurately can make it easier to detect and treat other diseases with similar symptoms in children with high numbers of malaria parasites in their blood.

Watson, Ndila et al.’s findings support the recommendation that all children with suspected malaria be given broad spectrum antibiotics, as many misdiagnosed children will likely have bacterial sepsis. It also suggests that using complete blood counts, which are cheap to obtain and increasingly available in low-resource settings, could improve diagnostic accuracy in future clinical studies of severe malaria. This could ultimately improve the ability of these studies to find new treatments for this life-threatening disease.

Introduction

Severe malaria caused by the parasite Plasmodium falciparum kills nearly half a million children each year, mostly in sub-Saharan Africa (World Health Organization, 2020). By causing death in children before they reach their reproductive age, P. falciparum has exerted a substantial selective evolutionary pressure on the human genome (Carter and Mendis, 2002; Kariuki and Williams, 2020). Recent advances in whole-genome sequencing and haplotype imputation (Teo et al., 2010), combined with data gathered prospectively from large patient cohorts, have improved our understanding of genetic susceptibility to P. falciparum infection and severe disease (Malaria Genomic Epidemiology Network et al., 2013; Malaria Genomic Epidemiology Network, 2014; Band et al., 2019; Malaria Genomic Epidemiology Network et al., 2017), but many questions remain unanswered (Kariuki and Williams, 2020). A major limitation of genetic association studies in severe malaria is that the diagnosis of severe falciparum malaria in children is imprecise (White et al., 2013; Taylor et al., 2004; Bejon et al., 2007). This imprecision increases with transmission intensity because of the low positive predictive value of a ‘positive blood film’ or rapid diagnostic test (RDT) in areas where the background prevalence of microscopy detectable parasitaemia in apparently healthy young children is high (often around 30%, Rodriguez-Barraquer et al., 2018, but can exceed 90%, Smith et al., 1994).

Severe falciparum malaria has been defined by experts convened by the World Health Organization (WHO) as clinical or laboratory evidence of vital organ dysfunction in the presence of circulating asexual P. falciparum parasitaemia (World Health Organisation, 2014). The WHO definition of severe malaria is aimed primarily at clinicians and health care workers managing patients with malaria who appear severely ill. This appropriately prioritises sensitivity over specificity (Anstey and Price, 2007). An inclusive clinical definition ensures that cases are not missed and patients receive the best treatment. In contrast, genetic association studies require high specificity (Zondervan and Cardon, 2007). For a given sample size, their statistical power, false discovery rates (FDRs) and the validity of their interpretation are all compromised by phenotypic inaccuracy. Specificity in the diagnosis of severe malaria depends in part on the prevalence of malaria parasitaemia. This reflects background transmission intensity. In areas of low or seasonal transmission (e.g. most of endemic Asia and the Americas), clinical and laboratory signs of severity accompanied by a positive blood film for P. falciparum are highly specific for severe malaria, which predominantly affects young adults. In contrast, in high transmission areas in sub-Saharan Africa and in lowland areas of the island of New Guinea, where severe malaria is largely a disease of young children, the diagnostic criteria for defining severe malaria are less specific because of the high background prevalence of asymptomatic parasitaemia and the lower specificity of the clinical manifestations. Standard case definitions of severe malaria will therefore inevitably include patients whose severe illness is non-malarial but who have concomitant parasitaemia or concomitant non-severe malaria.

Our goal was to develop a biomarker-based model that can differentiate probabilistically between ‘true severe malaria’ and severe illness not caused primarily by malaria, but with concomitant parasitaemia. We define ‘true severe malaria’ conceptually as a febrile illness caused by malaria parasites, with organ dysfunction, that can result in death whereby mortality is attributable directly to the malaria parasites. This attributable mortality can be given a formal causal definition by using a conceptual (albeit unethical) randomised experiment of delayed versus prompt antimalarial therapy. In a theoretical patient population with true severe malaria, delay in administration of an effective antimalarial would result in increased mortality (Warrell et al., 1982; Gomes et al., 2009) whereas in a population with severe illness not caused by malaria (‘not severe malaria’) there would not be a corresponding increase in mortality.

We developed a probabilistic diagnostic model of severe malaria based on haematological biomarkers using data from 1704 adults and children mainly from low transmission settings whose diagnosis of severe malaria is considered to be highly specific. We used this model to demonstrate low phenotypic specificity in a cohort of 2220 Kenyan children who were diagnosed clinically with severe malaria. We validated the predictions using a natural experiment, the distribution of sickle cell trait (HbAS), the genetic polymorphism with the strongest known protective effect against all forms of clinical malaria (Malaria Genomic Epidemiology Network, 2014). Building on work on ‘data-tilting’ (Nie et al., 2013), we suggest a new method for testing genetic associations in the context of case-control studies in which cases are re-weighted by the probability that the severe malaria diagnosis is correct under the model. As proof of concept, we ran a genome-wide association study across 9.6 million imputed biallelic variants using the subset of cases with genome-wide genotype data (n = 1297) and population controls (n = 1614). Adjusting for case mis-classification decreased genome-wide FDRs (Storey, 2002) and increased effect sizes in three of the top regions of the human genome most strongly associated with protection from severe malaria in East Africa (HBB, ABO and FREM3, Band et al., 2019). A re-analysis of 120 directly typed polymorphisms in 70 candidate malaria-protective genes in the 2220 Kenyan cases and 3940 population controls, examining differential effects between correctly and incorrectly classified cases, suggests that the protective effect of glucose-6-phosphate dehydrogenase (G6PD) deficiency has been obscured in this population by case mis-classification. 
Our results show that adding full blood count metadata – routinely measured in most hospitals in sub-Saharan Africa – to severe malaria cohorts would lead to more accurate quantitative analyses in case-control studies and increased statistical power.

Results

Reference model of severe malaria

We used the joint distribution of platelet counts and white blood cell counts (both on a logarithmic scale) to develop a simple biomarker-based reference model of severe malaria. To fit the reference model (i.e. P[Data | Severe malaria]), we used platelet and white count data from (i) severe malaria patient cohorts enrolled in low transmission areas where severe disease accompanied by a positive blood stage parasitaemia has a high positive predictive value for severe malaria (930 adults from Vietnam [Hien et al., 1996; Phu et al., 2010] and 653 adults and children from Thailand and Bangladesh); and (ii) severely ill African children with plasma PfHRP2 concentrations >1000 ng/mL and >1000 parasites per μL of blood (121 children from Uganda, Maitland et al., 2011). Severe illness accompanied by a high plasma PfHRP2 concentration makes the diagnosis of severe falciparum malaria highly specific (Hendriksen et al., 2012). The joint distribution of platelet and white blood cell counts in severe malaria was modelled as a bivariate t-distribution with both blood count variables on the log₁₀ scale.
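To make concrete what evaluating P(Data | Severe malaria) involves, the following minimal Python sketch computes the log-density of a bivariate t-distribution for a pair of log₁₀ blood counts. It is an illustration only, not the fitted model from this study: the location is set to the reference-data medians reported in this article (57,000 platelets and 8400 white cells per μL), while the scale matrix and the degrees of freedom are hypothetical values chosen for the example.

```python
import math

def bivariate_t_logpdf(x, mean, scale, df):
    """Log-density of a bivariate Student-t distribution at the point x.

    x, mean: pairs (log10 platelet count, log10 white cell count).
    scale:   2x2 scale matrix as ((a, b), (c, d)).
    df:      degrees of freedom.
    """
    (a, b), (c, d) = scale
    det = a * d - b * c
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Quadratic form (x - mean)^T scale^{-1} (x - mean), using the 2x2 inverse
    maha = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return (math.lgamma((df + 2) / 2) - math.lgamma(df / 2)
            - math.log(df * math.pi) - 0.5 * math.log(det)
            - (df + 2) / 2 * math.log1p(maha / df))

# Location: reference medians from the article. Scale and df: HYPOTHETICAL.
ref_mean = (math.log10(57_000), math.log10(8_400))
hyp_scale = ((0.10, 0.0), (0.0, 0.05))
hyp_df = 5

# Compare a patient at the reference medians with one at the Kenyan medians
ll_reference = bivariate_t_logpdf(ref_mean, ref_mean, hyp_scale, hyp_df)
ll_kenya = bivariate_t_logpdf((math.log10(120_000), math.log10(13_000)),
                              ref_mean, hyp_scale, hyp_df)
print(ll_reference, ll_kenya)
```

Under these assumed parameters, a patient with the median Kenyan counts receives a lower log-likelihood than one with the median reference counts, which is the kind of contrast the reference model is designed to quantify.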

Figure 1A shows the reference data (green triangles: patients with a highly specific diagnosis of severe malaria, summarised in Table 1) alongside data from a large Kenyan cohort of hospitalised children diagnosed with severe malaria, whose diagnosis had unknown specificity (pink squares). The median platelet count in the reference data was 57,000 per μL, and the median total white blood cell count was 8400 per μL. In contrast, the median platelet count in the Kenyan children was 120,000 per μL, and the median total white blood cell count was 13,000 per μL. Direct comparisons of white counts across these two datasets are confounded by geography and age. Total white blood cell counts are known to be age-dependent and vary across genetic backgrounds; in particular, lower neutrophil counts are associated with mutations in the ACKR1 gene that result in the Duffy negative phenotype prevalent in African populations (Reich et al., 2009). However, after adjustment for age (see Materials and methods), the marginal distributions of total white counts were comparable between Asian adults and children with severe malaria and African children with high plasma PfHRP2 (Appendix 1). Platelet counts are not age-dependent and do not vary substantially across genetic backgrounds. The marginal distributions of platelet counts were comparable between Asian adults and children with severe malaria and African children with high plasma PfHRP2 (Appendix 2). A low platelet count (thrombocytopenia) is a universal feature of severe malaria (see evidence collated in Materials and methods).
To illustrate this important point, in a cohort of 567 severely ill Ugandan children enrolled in the Fluid Expansion as Supportive Therapy (FEAST) trial (Maitland et al., 2011), a trial including all severe illness not restricted to severe malaria, low platelet counts were highly predictive of blood stage parasitaemia and elevated PfHRP2 ($p = 10^{-16}$ for a spline term on the log₁₀ platelet count in a generalised additive logistic regression model predicting PfHRP2 >1000 ng/mL, Appendix 2). Children enrolled in the FEAST trial who had significant thrombocytopenia (<100,000 platelets per μL) had comparable PfHRP2 concentrations to Asian adults diagnosed with severe falciparum malaria (Figure 1B).

Figure 1


Platelet counts and white blood cell counts as diagnostic predictors of severe falciparum malaria.

Panel (A) shows the bivariate marginal distribution for the reference data (thought to be highly specific to severe malaria, green triangles, n = 1704, summarised in Table 1) and for the Kenyan case data (pink squares, n = 2220; black diamonds: HbAS). The dashed ellipses show the 50% and 95% bivariate normal probability contours approximating each dataset (dark green: reference data; purple: Kenyan data). Panel (B) shows the relationship between platelet counts and plasma PfHRP2 in adults with severe malaria from Bangladesh (green circles, n = 172, the dashed green line shows a linear fit) and in children enrolled in the FEAST trial (n = 567, not specific to severe malaria, Maitland et al., 2011). Undetectable plasma PfHRP2 concentrations were set to 1 ng/mL ± random jitter. Orange squares: malaria-positive blood slide; black triangles: malaria-negative blood slide. The brown line shows a spline fit to the FEAST data (*smooth.spline* function in R with default parameters) including the data points where PfHRP2 was below the lower limit of detection.

Table 1

Summary of severe disease datasets used in our analyses.

For age and parasite density, we show the median values as the distributions are highly skewed. *For the FEAST trial, the severe malaria reference dataset only included platelet and white count data from the 121 patients who had PfHRP2 >1000 ng/mL and >1000 parasites per μL. IQR: interquartile range.

	Bangladesh-Thailand	Vietnam	FEAST (Uganda)	Kenya
Description	Observational studies of severe malaria	Randomised controlled trials in severe malaria	Randomised controlled trial in severe febrile illness	Observational severe malaria cohort
Purpose	Reference data	Reference data	Reference data* and Figure 1B	Testing data
Published references	Leopold et al., 2019	Hien et al., 1996; Phu et al., 2010	Maitland et al., 2011	MalariaGEN Consortium et al., 2018
n	653	930	567	2220
Age (years, range)	28 (2–80)	30 (15–79)	2.1 (0–12)	2.3 (0–13)
Parasite density (per μL, IQR)	48,984 (8289–187,395)	83,084 (13,047–316,512)	400 (0–53,200)	72,000 (6208–315,250)
Mortality (%)	18.2	12.9	11.3	11.6

Estimating the proportion of children mis-diagnosed with severe malaria

We can consider the hospitalised Kenyan children in this series as a mixture of two latent sub-populations, ‘severe malaria’ and ‘not severe malaria’ (i.e. an alternative aetiology for severe illness). To estimate the proportion of each, we use the distribution of HbAS, the human polymorphism most protective against all forms of clinical falciparum malaria. HbAS provides at least 90% protection against severe malaria (Taylor et al., 2012; Malaria Genomic Epidemiology Network, 2014). The causal SNP rs334 was genotyped in 2213 of the Kenyan children, of whom 57 were HbAS. The causal pathways (a) or (b) in Figure 2 (note that all children have been selected into the study on the basis of clinical symptoms consistent with severe malaria) show how the distribution of HbAS can be used to infer the marginal probability P(Severe malaria) in the Kenyan cohort, as the prevalence of HbAS is expected to differ in the two latent sub-populations.

Figure 2


Theoretical causal pathways that lead to the clinical diagnosis of severe malaria under the current WHO definition (World Health Organisation, 2014).

Pathways (a) and (b) represent the two ways patients can be mis-classified as severe malaria. For both pathways (a) and (b), we expect a higher prevalence of HbAS relative to the population with true severe malaria as a consequence of the protective bottlenecks. In this causal model, we assume that HbAS does not protect against asymptomatic parasitaemia, although this assumption is not strictly necessary. Adapted with permission from Small et al., 2017.

We assumed that cases with the highest likelihood values P(Data | Severe malaria) under the reference model (a bivariate t-distribution fit to the severe malaria reference data) had a diagnosis of severe malaria that was 100% specific (top 40% of cases; a sensitivity analysis varied this threshold). The cases with lower likelihood values were assumed to be drawn from a mixture of the two latent populations with an unknown mixing proportion; the prevalence of HbAS in the ‘not severe malaria’ subgroup was estimated from a cohort of hospitalised children enrolled in the same hospital and who were malaria blood slide positive but were clinically diagnosed as not having severe malaria (n = 6748 of whom 364 were HbAS; Uyoga et al., 2019). We assumed that this diagnosis of ‘not severe malaria’ was 100% specific. Under these assumptions, we estimated that P(Severe malaria) = 0.64 (95% credible interval [C.I.] 0.46–0.8), implying that approximately one-third of the 2220 cases are from the ‘not severe malaria’ sub-population (they have malaria parasitaemia in addition to another severe illness, likely to be bacterial sepsis; Figure 2).
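The logic of using HbAS prevalence to infer the mixing proportion can be illustrated with a back-of-envelope calculation. The Python sketch below is a crude method-of-moments estimate, not the Bayesian analysis used in this study. It treats the overall HbAS prevalence in the cases as a weighted average of the prevalences in the two latent sub-populations. As a simplifying assumption, the prevalence in true severe malaria is taken from the top quintile of P(Severe malaria | Data) reported elsewhere in this article (3 out of 446), which stands in for the top 40% of cases used by the authors.

```python
# HbAS prevalence among all genotyped Kenyan cases (from the article)
hbas_all = 57 / 2213

# HbAS prevalence in children diagnosed as 'not severe malaria' (from the article)
hbas_not_sm = 364 / 6748

# ASSUMPTION: prevalence in true severe malaria approximated by the top
# quintile of P(Severe malaria | Data); the study itself uses a different,
# model-based estimate.
hbas_sm = 3 / 446

# Solve hbas_all = pi * hbas_sm + (1 - pi) * hbas_not_sm for pi
pi_sm = (hbas_not_sm - hbas_all) / (hbas_not_sm - hbas_sm)
print(f"Crude estimate of P(Severe malaria): {pi_sm:.2f}")
```

This simple calculation gives a value of roughly 0.6, which falls inside the reported 95% credible interval (0.46–0.8) even though it ignores all uncertainty in the three prevalences.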

Estimating individual probabilities of severe malaria

We then estimated P(Severe malaria | Data) for each Kenyan case by fitting a mixture model to the reference data and to the Kenyan data jointly. The model assumed that the platelet and white count data for the Kenyan children were drawn from a mixture of P(Data | Severe malaria) and P(Data | Not severe malaria). The reference data (Asian adults and children with severe malaria and African children with PfHRP2 >1000 ng/mL) were assumed to be drawn only from P(Data | Severe malaria). P(Data | Not severe malaria) was modelled itself as a mixture of bivariate t-distributions. We used an informative prior on the mixture proportion (‘severe malaria’ versus ‘not severe malaria’) in the Kenyan cases, a beta distribution approximating the posterior estimate from the analysis of HbAS prevalence.
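For a two-component mixture of this kind, the individual probability follows from Bayes' rule. The notation below is introduced here for illustration and is not taken from the article: $x_i$ denotes the log₁₀ platelet and white counts of case $i$, $\pi$ the mixing proportion P(Severe malaria), and $f_{\text{SM}}$ and $f_{\text{not SM}}$ the densities P(Data | Severe malaria) and P(Data | Not severe malaria).

$$
P(\text{Severe malaria} \mid x_i) = \frac{\pi \, f_{\text{SM}}(x_i)}{\pi \, f_{\text{SM}}(x_i) + (1 - \pi) \, f_{\text{not SM}}(x_i)}
$$

In the fitted model these quantities are estimated jointly, so the reported individual probabilities are posterior summaries rather than a plug-in evaluation of this expression.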

Figure 3A shows the bimodal distribution of the posterior individual estimates of P(Severe malaria | Data). As expected, the individual posterior probabilities of severe malaria were highly predictive of HbAS ($p = 10^{-6}$ from a generalised additive logistic regression model fit, Figure 3C). The individual probabilities were also predictive of in-hospital mortality ($p = 10^{-9}$ from a generalised additive model fit; Figure 3D) and admission peripheral blood parasite density ($p = 10^{-25}$ from a generalised additive model fit; Figure 3E). In the top quintile of patients with the highest estimated P(Severe malaria | Data), the prevalence of HbAS was 0.7% (3 out of 446). In contrast, for patients in the lowest quintile of estimated P(Severe malaria | Data), the prevalence of HbAS was 4.8% (21 out of 444). The patients with a low probability of severe malaria had a substantially higher case fatality ratio (18.8% mortality for patients in the bottom quintile of P[Severe malaria | Data] versus 6.1% mortality for the top quintile of P[Severe malaria | Data]). This may be explained by the higher case-specific mortality of severe bacterial sepsis (the most likely alternative cause of severe illness). The admission parasite densities in patients with a probability of severe malaria close to 1 were approximately fivefold higher than in patients with a probability of severe malaria close to 0. The blood culture positive rate was 2.1% in the top quintile of P(Severe malaria | Data) and 4.4% in the lowest quintile of P(Severe malaria | Data), and the individual probabilities were predictive of blood culture results ($p = 0.004$ under a generalised additive logistic regression model fit).

Figure 3


Model estimates of P(Severe malaria | Data) in 2220 Kenyan children clinically diagnosed with severe malaria.

Panel (A) shows the distribution of posterior probabilities of severe malaria being the correct diagnosis. Panel (B) shows these same probabilities plotted as a function of the platelet and white counts on which they are based (dark red: probability close to 0; dark blue: probability close to 1). The black diamonds show the HbAS individuals. Panels (C–E) show the relationship between the estimated probabilities of severe malaria and HbAS, in-hospital mortality and admission parasite density, respectively. The black lines (shaded areas) show the mean estimated values (95% confidence intervals) from a generalised additive logistic regression model with a smooth spline term for the likelihood (R package *mgcv*). The horizontal lines in panels (C–E) show the mean values in the data.

Accounting for case imprecision in case-control studies

‘False-positive’ cases reduce statistical power and dilute effect size estimates in case-control studies. We propose a novel approach for case-control studies with phenotypic imprecision based on data-tilting (Nie et al., 2013). The idea is to ‘tilt’ the cases towards a pseudo-population with higher specificity for severe malaria. We can do this by re-weighting the data by the probabilities P(Severe malaria | Data), that is, re-weighting the contribution to the log-likelihood in an association model.
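The re-weighting idea can be sketched for a single variant as a weighted logistic regression in which each case contributes to the log-likelihood in proportion to its probability of being a true case. The Python example below is a minimal illustration under stated assumptions, not the implementation used in this study: the function name, the binary genotype coding, the synthetic counts and the two weight values (0.9 and 0.1) are all hypothetical.

```python
import math

def fit_weighted_logistic(genotypes, labels, weights, n_iter=25):
    """Newton-Raphson fit of logit P(case) = b0 + b1 * genotype, maximising
    sum_i w_i * [y_i * log(p_i) + (1 - y_i) * log(1 - p_i)]."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y, w in zip(genotypes, labels, weights):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            r = w * (y - p)          # weighted score contribution
            v = w * p * (1.0 - p)    # weighted information contribution
            g0 += r
            g1 += r * x
            h00 += v
            h01 += v * x
            h11 += v * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

def make_rows(n_ref, n_alt, label, weight):
    """Rows of (genotype, case label, weight) for a group of individuals."""
    return [(0, label, weight)] * n_ref + [(1, label, weight)] * n_alt

# SYNTHETIC data: controls have weight 1; likely true cases have weight 0.9;
# likely mis-labelled cases have weight 0.1 and carry the allele more often.
rows = (make_rows(80, 20, 0, 1.0)
        + make_rows(45, 5, 1, 0.9)
        + make_rows(40, 10, 1, 0.1))
g, y, w = zip(*rows)

_, beta_tilted = fit_weighted_logistic(g, y, w)
_, beta_plain = fit_weighted_logistic(g, y, [1.0] * len(rows))
print(f"log-odds ratio, non-weighted: {beta_plain:.3f}, weighted: {beta_tilted:.3f}")
```

In this synthetic example the weighted fit recovers a more protective log-odds ratio than the non-weighted fit, because the mis-labelled cases, which dilute the association, are down-weighted.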

We applied this approach as proof of concept to a genome-wide association study using the subset of Kenyan children who had clinical and genome-wide data available (after quality control checks n = 1297 cases) and a set of matched population controls (n = 1614), across 9.6 million biallelic variants on the autosomal chromosomes (Band et al., 2019). We compared the data-tilting method to the standard non-weighted approach by estimating local FDRs (Storey, 2002). Compared to the standard non-weighted GWAS, data-tilting substantially increased the number of significant associations for local FDRs in the range of 1–5% (Figure 4). For example, at an FDR of 2%, the number of significant hits is more than doubled with the additional hits all around known loci associated with protection from severe malaria. We note that if the data weights were not predictive of the true latent phenotype, we would expect fewer significant hits for a given FDR because of the reduction in effective sample size. This is demonstrated by permuting the data weights (for the cases only), which results in 50–75% reduction in the number of significant hits at FDRs < 5% (Appendix 3).
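The loss of effective sample size caused by down-weighting can be approximated with Kish's formula, n_eff = (Σw)² / Σw². This formula is a standard approximation introduced here for illustration; the article does not state how, or whether, the authors quantified effective sample size. The weights in the Python sketch below are hypothetical.

```python
def kish_effective_n(weights):
    """Kish's approximate effective sample size for a set of weights."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return total * total / total_sq

# HYPOTHETICAL weights: half the cases near-certain, half likely mis-labelled
example_weights = [0.9] * 50 + [0.1] * 50
n_eff = kish_effective_n(example_weights)
print(f"{len(example_weights)} cases behave like about {n_eff:.0f} equally weighted cases")
```

Permuting the weights leaves this effective sample size unchanged but removes their relationship with the true phenotype, which is why the permutation analysis isolates the cost of re-weighting from its benefit.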

Figure 4


The number of significant hits as a function of the FDR for the genome-wide association study across 9.6 million biallelic variants.

This analysis is based on the subset of Kenyan children with whole-genome data available and passing quality checks (n = 1297 cases) and on n = 1614 population controls. Dashed line: weighted model; thick line: non-weighted model.

Examining three major genetic regions strongly associated with protection from severe malaria in East Africa (HBB: HbAS; ABO: O blood group; FREM3: in close linkage with the GYPA/B/E structural variants that encode the Dantu blood group; Band et al., 2019), the data-tilting approach estimated larger effect sizes compared to the non-weighted model in all three regions (effect size increases: 30% around HBB, 9% around ABO and 5% around FREM3). This resulted in larger –log₁₀ p-values for HBB and ABO, but slightly smaller for FREM3 (Figure 5). We note that there was no signal of association at ATP2B4 in this subset, most likely due to limited power (ATP2B4 had the third largest Bayes factor for association in the largest multicentre GWAS to date, Band et al., 2019).

Figure 5


The three regions in the human genome with the greatest evidence for protection against severe malaria in East Africa (*HBB*, *ABO* and *FREM3*; Band et al., 2019).

The Manhattan plots (left panels) compare p-values from the weighted model (blue) and the non-weighted model (orange). Each Manhattan plot is centred around the known causal position shown by the vertical dashed line (0.5 Mb region). The horizontal dashed line shows $p = 10^{-7}$ (threshold often used for defining genome-wide significance). The 10 positions with the greatest –log₁₀ p-values under the non-weighted model are shown as large diamonds. The scatter plots on the right compare absolute effect size estimates under both models with the same top 10 hits shown by the larger purple diamonds. Increases of 30, 9 and 5% are seen for the 10 top hits for *HBB*, *ABO* and *FREM3*, respectively.

Reappraisal of directly typed polymorphisms

We re-analysed case-control associations for 120 polymorphisms on 70 candidate malaria-protective genes which were typed directly in the 2220 Kenyan children along with 3940 population controls. In this case-control cohort, 14 polymorphisms had previously been identified as associated with protection or increased risk in severe malaria (MalariaGEN Consortium et al., 2018). A re-analysis of these 14 variants using the same models of association as previously published and down-weighting the likely mis-classified cases replicated the majority of associations, with increased effect sizes and increased –log₁₀ p-values (Appendix 4). For the three major genes (HBB, ABO, FREM3), effect sizes were increased by 10–30% and associations all had higher significance levels on the –log₁₀ scale (0.25–1.7). The allele frequencies of all three polymorphisms were directly associated with the probability weights, showing increased protection in individuals more likely to have severe malaria (Appendix 5). Two polymorphisms on the genes ARL14 and LOC727982, reported previously as associated with protection in severe malaria (neither of which are related to red cells), showed decreased effect sizes and –log₁₀ p-values and are thus potentially spurious hits.

We explored whether there was evidence of differential effects in the Kenyan cases using P[Severe malaria | Data] to assign probabilistically each case to the ‘severe malaria’ versus ‘not severe malaria’ sub-populations. We fitted a categorical logistic regression model predicting the latent sub-population label versus control, where the latent case label was estimated from the weights shown in Figure 3A. This resulted in approximately 1279 cases in the ‘severe malaria’ sub-population and 941 cases in the ‘not severe malaria’ sub-population. Differential effects were tested by comparing the estimated log-odds for the two sub-populations. After accounting for multiple testing, two polymorphisms showed significant differential effects: rs334 (derived allele encodes haemoglobin S, $p = 10^{-6}$) and rs1050828 (derived allele encodes G6PD + 202T, $p = 10^{-3}$ in the model fit to females only), see Figure 6. As expected, rs334 was associated with protection in both sub-populations (Scott et al., 2011; Uyoga et al., 2019) but the effect was almost eight times larger on the log-odds scale in the ‘severe malaria’ sub-population relative to the ‘not severe malaria’ sub-population (odds ratio of 0.029 [95% C.I. 0.0088–0.094] in the ‘severe malaria’ population versus 0.63 [95% C.I. 0.48–0.83] in the ‘not severe malaria’ population). For rs1050828 (G6PD + 202T allele), approximately the same absolute log-odds were estimated for both sub-populations but they had opposite signs. Under an additive model in females, the rs1050828 T allele was associated with protection in the ‘severe malaria’ sub-population (odds ratio of 0.71 [95% C.I. 0.57–0.88]) but with increased risk in the ‘not severe malaria’ sub-population (odds ratio of 1.30 [95% C.I. 1.00–1.70]). The additive model including both males and females was consistent with these opposing effects but significant only at a nominal threshold ($p = 0.02$).
Opposing effects across the two sub-populations are consistent with the hypothesis that G6PD deficiency leads to a greater risk of being erroneously classified as severe malaria under the severe anaemia criterion (Watson et al., 2019), shown in more detail in Appendix 5. Investigation of haemoglobin concentrations as a function of P(Severe malaria | Data) indicates that the mis-classified group is very heterogeneous, but with a larger proportion of severe anaemia (<5 g/dL) relative to the correctly classified sub-population (Appendix 6).
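The comparisons on the log-odds scale can be checked directly from the reported odds ratios. The short Python sketch below uses only the point estimates quoted in the text; the variable names are introduced here for illustration.

```python
import math

# rs334 (HbAS): reported odds ratios in the two sub-populations
log_or_hbas_sm = math.log(0.029)
log_or_hbas_not_sm = math.log(0.63)
hbas_ratio = log_or_hbas_sm / log_or_hbas_not_sm

# rs1050828 (G6PD + 202T), additive model in females
log_or_g6pd_sm = math.log(0.71)
log_or_g6pd_not_sm = math.log(1.30)

print(f"rs334 log-odds ratio of effects: {hbas_ratio:.2f}")
print(f"rs1050828 log-odds: {log_or_g6pd_sm:.2f} versus {log_or_g6pd_not_sm:.2f}")
```

The ratio for rs334 is a little under eight, and the two rs1050828 log-odds have opposite signs with similar magnitudes, in line with the description in the text.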

Figure 6


Exploring differential effects in 120 directly typed polymorphisms across 70 candidate malaria-protecting genes.

(A) Case-control effect sizes estimated for the ‘severe malaria’ sub-population versus the ‘not severe malaria’ sub-population (n = 3940 controls and n = 2220 cases, with approximately 1279 in the ‘severe malaria’ sub-population and 941 in the ‘not severe malaria’ sub-population). The vertical and horizontal grey lines show the 95% credible intervals. (B) The *log*₁₀ p-values testing the hypothesis that the effects are the same for the two sub-populations relative to controls. The top dashed line shows the Bonferroni corrected $α = 0.05$ significance threshold (assuming 70 independent tests). The bottom dashed line shows the nominal $α = 0.05$ significance threshold. In both panels, red circles denote $p < 0.05$ (nominal significance level), and red squares denote $p < 0.05/70$. (C) Analysis of the rs1050828 SNP (encoding G6PD + 202T) under a non-additive model (hemi/homozygotes and heterozygotes are distinct categories). This shows that heterozygotes are clearly under-represented in the ‘severe malaria’ sub-population and hemi/homozygotes are clearly over-represented in the ‘not severe malaria’ sub-population. (D) Evidence of differential effects for the O blood group (rs8176719, recessive model) and *FREM3* (additive model).

Discussion

The clinical diagnosis of severe falciparum malaria in African children is imprecise (Taylor et al., 2004; Bejon et al., 2007; White et al., 2013). Even with quantitation of parasite densities, specificity is still imperfect (Bejon et al., 2007). In children with cerebral malaria (unrouseable coma with malaria parasitaemia), the most specific of the severe malaria clinical syndromes, postmortem examination revealed another diagnosis in a quarter of cases studied in Blantyre, Malawi (Taylor et al., 2004). Diagnostic specificity can be improved by visualisation of the obstructed microcirculation in vivo (e.g. through indirect ophthalmoscopy) or from parasite biomass indicators (quantitation and staging of malaria parasites on thin blood films, counting of neutrophil-ingested malaria pigment, measurement of plasma concentrations of PfHRP2 or parasite DNA), but these are still largely research procedures and have not been widely adopted or measured at scale for genetic association studies. Our results suggest that imprecision in clinical phenotyping is more substantial than thought previously. In this cohort of 2220 Kenyan children diagnosed with severe malaria from an area of moderate transmission, a probabilistic assessment suggests that around one-third may not have had severe malaria (although malaria may have contributed to their illness; Small et al., 2017). This supports our previous conclusion that differences in treatment effects between Asian adults and African children (i.e. the benefits of artesunate over quinine in severe malaria estimated from randomised trials; Dondorp et al., 2005; Dondorp et al., 2010) are predominantly driven by differences in diagnostic specificity (Hendriksen et al., 2012; White et al., 2013). Mortality was higher in the ‘not severe malaria’ patients, probably because the main illness was bacterial sepsis.
This strongly supports current recommendations to give broad-spectrum antibiotics to all children in endemic areas with suspected severe malaria (World Health Organisation, 2014). Using HbAS as a natural experiment to validate the biomarker model, we show that the joint distribution of platelet and white blood cell counts is a diagnostic predictor of severe malaria. Complete blood counts are inexpensive and increasingly available in low-resource setting hospitals. Application of an upper threshold of 200,000 platelets per μL would have substantially decreased mis-classification in this large cohort of Kenyan children diagnosed with severe malaria.

This re-analysis using rich clinical data provides additional evidence for the three major genetic polymorphisms protective against severe malaria present in East Africa. After probabilistic down-weighting of the likely mis-classified cases, substantial increases in effect sizes were found. Dilution of effect sizes resulting from mis-classification could partially explain the large heterogeneity in effects noted in the largest severe malaria GWAS to date (Band et al., 2019). For haemoglobin S (rs334), there was a fourfold variation in estimated odds ratios across participating sites. Some of this heterogeneity can be attributed to variations in linkage disequilibrium affecting imputation accuracy (Malaria Genomic Epidemiology Network et al., 2013), but our analysis shows an additional substantial source of heterogeneity which results from diagnostic imprecision. This can be adjusted for if detailed clinical data are available. For example, in the case of rs334 (directly typed), the data-tilting approach results in a 25% increase in effect size on the log-odds scale, corresponding to a decrease of roughly 35% in the estimated odds ratio (0.1 versus 0.16).

As for the interpretation of genetic effects, one of the most interesting results concerns the G6PD gene. G6PD deficiency is the most common enzymopathy of humans. Its potential role in protecting against falciparum malaria has been controversial (MalariaGEN Consortium et al., 2017; Watson et al., 2019). A very large multi-country genetic association study with over 11,000 severe malaria cases and 17,000 population controls found no overall protective effect of the G6PD + 202T allele (the most common mutation in sub-Saharan Africa causing G6PD deficiency), under an additive model (Malaria Genomic Epidemiology Network, 2014). The same pattern is observed in this Kenyan cohort (which is a subset of the larger study). In the Kenyan cohort overall, a previous analysis found no clear evidence of protection for male hemizygotes but substantial evidence of protection for female heterozygotes (MalariaGEN Consortium et al., 2015). This would suggest a heterozygote advantage leading to a balancing polymorphism. However, when the Kenyan cases are modelled as two distinct sub-populations, there is evidence of differential effects between the ‘severe malaria’ and ‘not severe malaria’ sub-populations. Hemi- and homozygous G6PD deficiency was associated with an increased risk of mis-classification (reflecting an increased risk of severe anaemia), but it is unclear whether or not hemi/homozygous G6PD deficiency was protective in the ‘true severe malaria’ sub-population (Figure 6C). On the other hand, heterozygote deficiency was very clearly protective in the true severe malaria subgroup, consistent with previous findings, and did not appear to lead to an increased risk of mis-classification (consistent with a lower risk of extensive haemolysis and thus false classification in heterozygotes who have both normal and G6PD-deficient erythrocytes in their circulation).
When examining the ‘severe malaria’ sub-population only, the sample size in this study is too small to discriminate between the heterozygote and additive models of association. In our view, the relationship between G6PD deficiency and severe falciparum malaria remains unresolved. A biomarker-driven approach should be applied to other case-control cohorts for a definitive understanding of the role of this major human polymorphism.

The limitations of our diagnostic model can be summarised as follows. First, the validity and interpretation of the individual probabilities of severe malaria is heavily dependent on the reference model and thus the reference data. Our reference data were primarily from Asian adults in whom diagnostic specificity for severe malaria is thought to be very high. Diagnostic checks suggested that the marginal distributions of platelet counts were similar between adults and children, and we made age corrections to the white blood cell count, but small deviations could reduce the discriminatory value (e.g. lower white counts associated with the Duffy negative phenotype; Reich et al., 2009). Second, it is possible that rare genetic conditions exist in which the probabilities of severe malaria under this model might be biased. One example is sickle cell disease (HbSS, <0.5% in the Kenyan cases), which results in chronic inflammation with high white counts and low platelet counts relative to the normal population (Sadarangani et al., 2009). The 11 children with HbSS in this cohort were all assigned low probabilities of severe malaria, but this should be interpreted with caution. Whether HbSS is protective against severe malaria or increases the risk of severe malaria remains unclear (Williams and Obaro, 2011). For these patients, other biomarkers such as plasma PfHRP2 may be more appropriate. Third, it is possible that the joint distribution of the complete blood count variables used to fit the reference model could be dependent on the severe malaria sub-phenotype. For example, if the reference data were biased towards cerebral malaria, and the joint distribution of platelet and white cell counts in cerebral malaria differed from those in the other severe malaria syndromes, then the predicted outliers could represent other forms of severe malaria instead of ‘not severe’ malaria. However, there are no known biological reasons why this would be the case. 
The strong correlation between platelet counts and PfHRP2 (Figure 1B) suggests that low platelet counts are a universal feature of severe malaria.

In summary, under a probabilistic model based on routine blood count data, we have shown that it is possible to estimate mis-classification rates in diagnosed severe childhood malaria in a malaria endemic area of East Africa and compute probabilistic weights that can downweight the contribution of likely mis-classified cases. The well-established protective effect of HbAS provided an independent validation of the model. Relative to predicted mis-classified cases, patients predicted to have ‘true severe malaria’ had a substantially lower prevalence of HbAS, higher parasite densities, lower rates of positive blood cultures and lower mortality. These data strongly support the current guideline to give broad-spectrum antibiotics to all children with suspected severe malaria and suggest that normal range platelet counts (>200,000 per μL) could be used as a simple exclusion criterion in studies of severe malaria. Based on this analysis, we recommend that future studies in severe malaria collect and record complete blood count data. Further studies of platelet and white blood cell counts from a diverse cohort of children with severe falciparum malaria, confirmed using high-specificity diagnostic techniques such as visualisation of the microcirculation, and measurement of plasma PfHRP2, or plasma P. falciparum DNA concentrations, should be conducted to validate this approach.

Materials and methods

Data

Kenyan case-control cohort

The Kenyan case-control cohort has been described in detail previously (MalariaGEN Consortium et al., 2018). Severe malaria cases consisted of all children aged <14 years who were admitted with clinical features of severe falciparum malaria to the high-dependency ward of Kilifi County Hospital between 11 June 1999 and 12 June 2008. Severe malaria was defined as a positive blood film for P. falciparum along with at least one of the following: prostration (Blantyre Coma Score of 3 or 4), cerebral malaria (Blantyre Coma Score of <3), respiratory distress (abnormally deep breathing) or severe anaemia (haemoglobin <5 g/dL). Controls were infants aged 3–12 months who were born within the same area as the cases and who were recruited to a cohort study investigating genetic susceptibility to a wide range of childhood diseases. Cases and controls were genotyped for the rs334 SNP and for $α^{+}$ -thalassaemia along with 120 other SNPs using DNA extracted from fresh or frozen samples of whole blood as described in detail previously (MalariaGEN Consortium et al., 2018; Wambua et al., 2006).

Fluid Expansion as Supportive Therapy (FEAST)

FEAST was a multicentre randomised controlled trial comparing fluid boluses for severely ill children (n = 3161) that was not specific to severe malaria (Maitland et al., 2011). Platelet counts, white blood cell counts, parasite densities and PfHRP2 were jointly measured for 566 children (patients enrolled in the sites in Mulago, Lacor and Mbale, in Uganda). In order to select only those with a very high probability of having severe malaria as the primary cause of illness, we selected the 121 children who had measured PfHRP2 >1000 ng/mL and parasitaemia >1000 per μL.

AQ Vietnam and AAV randomised controlled trials

The AQ and the AAV studies were two randomised clinical trials in Vietnamese adults diagnosed clinically with severe falciparum malaria recruited to a specialist ward of the Hospital for Tropical Diseases, Ho Chi Minh City, Vietnam, between 1991 and 2003 (Hien et al., 1996; Phu et al., 2010). AQ Vietnam was a double-blind comparison of intramuscular artemether versus intramuscular quinine (n = 560); AAV compared intramuscular artesunate and intramuscular artemether (n = 370).

Observational studies in Thai and Bangladeshi adults and children

We included data from multiple observational studies in severe falciparum malaria conducted by the Mahidol Oxford Tropical Medicine Research Unit in Thailand and Bangladesh between 1980 and 2019. These pooled data have been described previously (Leopold et al., 2019). Platelet counts and white blood cell counts were available in 657 patients. We excluded one 30-year-old adult from Bangladesh whose recorded platelet count was 1000 per μL and three other adults with platelet counts greater than 450,000 per μL as outliers reflecting likely data entry errors. Plasma PfHRP2 concentrations were available in 172 patients from Bangladesh. 55 patients from this series were younger than 15 years of age.

Multiple imputation

In the Kenyan severe malaria cohort (n = 2220), data on platelet counts were missing in 18%, white blood counts were missing in 0.2% and parasite density was missing in 1.6%. In-hospital outcome (died/survived) was missing for 13 patients. rs334 genotype was missing for 7; $α^{+}$ -thalassaemia genotype was missing for 101 patients. In the Vietnamese adults, platelet counts were missing in 4%, white counts in 2% and parasitaemia in 0%.

We did multiple imputation using random forests for all available clinical variables using the R package missForest (targeted genotyping data was not included for imputation). Appendix 7 shows the missing data pattern in the studies in Vietnamese adults and in the Kenyan severe malaria cases. Ten datasets were imputed for each dataset independently and were used for the subsequent analyses. Analyses using directly typed genetic polymorphisms or the within-hospital outcome as the dependent variables used only the data where these outcomes were recorded, assuming that they were missing at random.

Reference model of severe malaria

Biological rationale

Thrombocytopenia accompanied by a normal white blood count and a normal neutrophil count are typical features of severe malaria (Hanson et al., 2015; Leblanc et al., 2020), but they may also occur in some systemic viral infections and in severe sepsis. Neutrophil leukocytosis may sometimes occur in very severe malaria, but is more characteristic of pyogenic bacterial infections. These indices, whilst individually not very specific, could each have useful discriminatory value. We reasoned therefore that their joint distribution could help discriminate between children with severe malaria versus those severely ill with coincidental parasitaemia. The Kenyan severe malaria cohort did not have differential white count data, so we used platelet counts and total white blood cell counts as the two diagnostic biomarkers in the reference model of severe malaria.

Choice of reference data and confounders

The best data for fitting the biomarker model are either from children or adults from low transmission areas (where parasitaemia has a high positive predictive value) or in children or adults with high plasma PfHRP2 measurements indicating a large latent parasite biomass (Hendriksen et al., 2012).

In the first years of life, white blood cell counts are often much higher than in adults because of lymphocytosis. We used data from 858 children from the FEAST trial, in whom white counts were measured, to estimate the relationship between age and mean white count in severe illness (median age was 24 months). The estimated relationship is shown in Appendix 8 (using a generalised additive linear model with the white count on the log₁₀ scale), with mean white counts reaching a plateau around 5 years of age. We used this to correct all white count data in children less than 5 years of age, both in the reference data and the Kenyan cohort.

There is also a systematic difference associated with the Duffy negative phenotype which is near fixation in Africa but absent in Asia. Duffy negative individuals have lower neutrophil counts (termed benign ethnic neutropenia) (Reich et al., 2009). The use of Asian adults to estimate the reference distribution of white counts in severe malaria could thus falsely include individuals with elevated white counts (relative to the normal ranges). However, a diagnostic quantile-quantile plot (Appendix 1, on the log scale) comparing the white blood cell count distribution in Vietnamese adults and in children in the FEAST trial who had PfHRP2 >1000 ng/mL did not suggest any major differences. In fact the African children had slightly higher white counts on average even after the correction for age. This may represent imperfect specificity for severe malaria when using a plasma PfHRP2 cutoff of 1000 ng/mL.

For platelet counts (which have the greatest diagnostic value for severe malaria in our series), age is not a confounder and published data support the hypothesis that thrombocytopenia is highly specific for ‘true’ severe malaria in children as well as adults suspected of having severe malaria (with a diagnostic and a prognostic value). The French national guidelines specifically mention thrombocytopenia (<150,000 per μL) for the diagnosis of severe malaria in children who have travelled to a malaria endemic area. In a French paediatric severe malaria series in travellers, almost half had severe thrombocytopenia (<50,000 per μL) (Lanneaux et al., 2016; Pediatric Imported Malaria Study Group for the ‘Centre National de Référence du Paludisme’ et al., 2017). In Dakar, Senegal (one of the lowest transmission areas in Africa), thrombocytopenia was an independent predictor of death and the median platelet count was 100,000 (Gérardin et al., 2007; Gérardin et al., 2002). Comparison of the distributions of platelet counts (on the log scale) between Asian children and Asian adults suggested no major differences (Appendix 1), although we had few data for Asian children. In the seminal Blantyre autopsy study (Taylor et al., 2004), platelet counts were substantially different between fatal cases confirmed postmortem to be severe malaria (62,000 per μL and 56,000 per μL for the children with sequestration only and sequestration + microvascular pathology, respectively) and fatal cases with a mis-diagnosis of severe malaria (no sequestration: 176,000 per μL; the inter-group difference was statistically significant, $p = 0.008$ ). A larger cohort from the same centre in Malawi reported substantially higher platelet counts in retinopathy-negative cerebral malaria (mean platelet count was 161,000 per μL, n = 288) compared to retinopathy-positive cerebral malaria (mean count was 81,000 per μL, n = 438) (Small et al., 2017).

We visually checked approximate normality for each marginal distribution using quantile-quantile plots (Appendix 9). On the log₁₀ scale, platelet counts and white counts show a good fit to the normal approximation but with some outliers so a t-distribution was used (robust to outliers). For all modelling of the joint distribution of platelet counts and white blood cell counts, we chose bivariate t-distributions with 7 degrees of freedom as the default model. The final reference model used was a bivariate t-distribution fit to the joint distribution of platelet counts and white counts both on the logarithmic scale. On the log₁₀ scale, the mean values (standard deviations) were approximately 1.76 (0.11) and 0.92 (0.055) for platelets and white counts, respectively. The covariance was approximately 0.0035. These values varied very slightly across the 10 imputed datasets. Log-likelihood values for each severe malaria case in the Kenyan cohort were calculated for each imputed dataset independently. The median log-likelihoods per case were then used in downstream analyses.
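To make the reference model concrete, the sketch below evaluates the log-density of a bivariate t-distribution with 7 degrees of freedom in Python. It is a minimal illustration, not the authors' code (which is in R and Stan). It assumes that the approximate means, standard deviations and covariance quoted above can be used directly as the location and scale matrix, and that counts are expressed in thousands per μL (so that 10^1.76 is roughly 58,000 platelets per μL). The two example patients are hypothetical.

```python
import math

def bivariate_t_logpdf(x, mu, sigma, df=7.0):
    """Log-density of a bivariate Student-t with location mu and 2x2 scale matrix sigma."""
    (a, b), (c, d) = sigma
    det = a * d - b * c
    dx, dy = x[0] - mu[0], x[1] - mu[1]
    # Mahalanobis distance using the closed-form inverse of a 2x2 matrix
    maha = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    # In two dimensions the normalising constant reduces to 1 / (2*pi*sqrt(det))
    log_norm = -math.log(2.0 * math.pi) - 0.5 * math.log(det)
    return log_norm - 0.5 * (df + 2.0) * math.log1p(maha / df)

# Approximate values from the text (log10 scale); treated here as assumptions
mu = (1.76, 0.92)                      # platelets, white count
sigma = ((0.11 ** 2, 0.0035),
         (0.0035, 0.055 ** 2))

# Hypothetical patients: (platelets, white count), both in thousands per microlitre
typical = (math.log10(60), math.log10(8))
atypical = (math.log10(300), math.log10(25))

lp_typical = bivariate_t_logpdf(typical, mu, sigma)
lp_atypical = bivariate_t_logpdf(atypical, mu, sigma)
print(f"log-likelihood: typical {lp_typical:.1f}, atypical {lp_atypical:.1f}")
```

A case with a low platelet count and a normal white count receives a much higher log-likelihood under the reference model than one with a normal platelet count and leukocytosis, which is the property the downstream weighting relies on.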

Limitations of the model

The diagnostic model of severe malaria using platelet counts and white blood cell counts cannot be applied to all patients. We summarise here the known and possible limitations. When using this model to estimate the association between a genetic polymorphism and the risk of severe malaria, if the genetic polymorphism of interest affects the complete blood count independently, there will be selection bias (see the directed acyclic graph in Appendix 10). One example is HbSS. Children with HbSS have chronic inflammation with white blood cell counts about 2–3 times higher than normal and slightly lower platelet counts (Sadarangani et al., 2009). All 11 children in the Kenyan cohort with HbSS were assigned low probabilities of having severe malaria (Appendix 10), but these probabilities could reflect a deficiency of the model. Including or excluding these children from the analysis had no impact on the results as they represent less than 0.5% of the cases.

The second possible limitation concerns the validation using HbAS. Previous studies have suggested negative epistasis between the malaria-protective effects of HbAS and $α^{+}$ -thalassaemia (Williams et al., 2005; Opi et al., 2014). The 3.7 kb deletion across the HBA1-HBA2 genes (known as $α^{+}$ -thalassaemia) has an allele frequency of $\sim 40$ % in this population; therefore, 16% of HbAS individuals are homozygous for $α^{+}$ -thalassaemia (Ndila et al., 2020). Negative epistasis implies that those with both polymorphisms would have less or no protective effect against severe malaria. Of the 2113 Kenyan cases with both HbS and $α^{+}$ -thalassaemia genotyped, 13 were HbAS and homozygous $α^{+}$ -thalassaemia. Appendix 11 shows that the majority of those with both polymorphisms had clinical indices pointing away from severe malaria, suggesting that the observed number of patients with both HbAS and homozygous $α^{+}$ -thalassaemia is inflated by two- to threefold.

The third possible problem concerns the use of white blood cell counts in relation to invasive bacterial infections. Bacteraemia could either be the cause of severe illness (with coincidental parasitaemia) or it could be concomitant (which may result from extensive parasitised erythrocyte sequestration in the gut), that is, a result of severe malaria. The former should be identified as ‘not severe malaria’ (as bacteraemia is the main cause of illness), but the latter should be identified as ‘severe malaria’ and might be mis-classified as ‘not severe malaria’ under our model. However, in a series of 845 Vietnamese adults (high diagnostic specificity), only one of eight patients who had concomitant-invasive bacterial infections and a white count measured had leukocytosis (median white count was 8100; range 3500–14,850 per μL; Phu et al., 2020).

Estimating the diagnostic specificity in the Kenyan cohort

We assume that the Kenyan cases are a latent mixture of two sub-populations: P₀ is the population ‘severe malaria’ and P₁ is the population ‘not severe malaria’ (mis-classified). For a set of diagnostic biomarkers $X$ , this implies that $X \sim G = π f_{0} + (1 - π) f_{1}$ , where $f_{0}, f_{1}$ are the sampling distributions (likelihoods) of each sub-population, respectively.
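Under this mixture, Bayes' rule gives the probability that a case with biomarker values $x$ was drawn from P₀, which is the quantity written as P(Severe malaria | Data) elsewhere in the text. The expression below is the standard identity for a two-component mixture, written in the notation of this section. It is included as a reading aid and not as a description of the fitting procedure.

$$
\Pr(\text{case drawn from } P_{0} \mid X = x) = \frac{π f_{0}(x)}{π f_{0}(x) + (1 - π) f_{1}(x)}
$$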

We can infer the value of π (proportion correctly classified as severe malaria) without making parametric assumptions about f₁ by using the distribution of HbAS (motivated by the causal pathways shown in Figure 2). This is done as follows: we first estimate ${\hat{f}}_{0}$ by fitting a bivariate t-distribution to the reference data – this approximates the sampling distribution for P₀. We then make three assumptions:

Out of the 2213 Kenyan cases with rs334 genotyped, we assume that cases in the top 40% of the likelihood distribution under ${\hat{f}}_{0}$ are drawn from P₀: $N_{0} = 887$ , of which $N_{0}^{sickle} = 9$ are HbAS.
For the other cases, the proportion drawn from P₀ is unknown and denoted $π^{'}$ : $N_{G} = 1326$ , of which $N_{G}^{sickle} = 48$ are HbAS.
Finally, additional information is incorporated by using data from a cohort of individuals with severe disease from the same hospital who had positive malaria blood slides but whose diagnosis was not severe malaria ( $N_{1} = 6748$ , of which $N_{1}^{sickle} = 364$ were HbAS) (Uyoga et al., 2019).

Under these assumptions, we can fit a Bayesian binomial mixture model to these data with three parameters: $\{π^{'}, p_{0}, p_{1}\}$ . The likelihood is given by

$$
\begin{array}{l}
N_{0}^{sickle} \sim \text{Binomial}(p_{0}, N_{0}) \\
N_{G}^{sickle} \sim \text{Binomial}(π^{'} p_{0} + (1 - π^{'}) p_{1}, N_{G}) \\
N_{1}^{sickle} \sim \text{Binomial}(p_{1}, N_{1})
\end{array}
$$

The priors used were $p_{1} \sim Beta (5, 95)$ (i.e. 5% prior probability with 100 pseudo observations) and $p_{0} \sim Beta (1, 99)$ (1% prior probability with 100 pseudo observations). A sensitivity analysis with flat beta priors (Beta[1,1]) did not qualitatively change the result (the final estimate of π changed by one percentage point). To check the validity of the use of the external population from Uyoga et al., 2019, we did a sensitivity analysis using the lowest quintile of the likelihood ratio distribution as a population drawn entirely from P₁ (instead of the external data from Uyoga et al., 2019).
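The binomial mixture model above is small enough to approximate on a grid. The following Python sketch is a rough illustration under stated assumptions, not the authors' implementation. It uses the counts and the Beta priors for p₀ and p₁ given in the text, assumes a uniform prior on π′ (the text does not state one), and converts π′ into an overall proportion using our reading of the first two assumptions, π = (N₀ + π′ N_G) / (N₀ + N_G). The grid ranges are ad hoc choices that cover the bulk of the posterior.

```python
import math

# (HbAS count, group size) for the three groups described above
s0, n0 = 9, 887      # highest-likelihood cases, assumed drawn from P0
sg, ng = 48, 1326    # remaining cases, unknown mixture of P0 and P1
s1, n1 = 364, 6748   # external cohort, assumed drawn from P1

def log_binom(k, n, p):
    # Binomial log-likelihood up to an additive constant
    return k * math.log(p) + (n - k) * math.log(1.0 - p)

def log_beta(p, a, b):
    # Beta log-density up to an additive constant
    return (a - 1.0) * math.log(p) + (b - 1.0) * math.log(1.0 - p)

def midpoints(lo, hi, m):
    # Midpoints of m equal-width cells on [lo, hi]
    step = (hi - lo) / m
    return [lo + (i + 0.5) * step for i in range(m)]

pi_primes, log_posts = [], []
for p0 in midpoints(0.002, 0.03, 40):
    lp0 = log_binom(s0, n0, p0) + log_beta(p0, 1, 99)
    for p1 in midpoints(0.04, 0.07, 40):
        lp1 = lp0 + log_binom(s1, n1, p1) + log_beta(p1, 5, 95)
        for w in midpoints(0.0, 1.0, 100):   # w plays the role of pi' (uniform prior assumed)
            q = w * p0 + (1.0 - w) * p1      # HbAS prevalence in the mixed group
            pi_primes.append(w)
            log_posts.append(lp1 + log_binom(sg, ng, q))

# Normalise on the log scale to avoid underflow
top = max(log_posts)
mass = [math.exp(lp - top) for lp in log_posts]
pi_prime_mean = sum(w * m for w, m in zip(pi_primes, mass)) / sum(mass)
pi_overall = (n0 + pi_prime_mean * ng) / (n0 + ng)
print(f"posterior mean pi' = {pi_prime_mean:.2f}, implied overall pi = {pi_overall:.2f}")
```

A grid is used here only because it is deterministic and dependency-free; a sampler such as Stan is the more natural choice for anything beyond three parameters.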

Estimating P(Severe malaria | Data) in the Kenyan cohort

Denote the platelet and white count data from the FEAST trial as $\{X_{i}^{FEAST}\}_{i = 1}^{121}$ ; the data from the Asian adults and children (Vietnam, Thailand and Bangladesh) as $\{X_{i}^{Asia}\}_{i = 1}^{1583}$ ; the data from the Kenyan children as $\{X_{i}^{Kenya}\}_{i = 1}^{2220}$ . We fit the following joint model to the reference biomarker data and the Kenyan biomarker data.

$$
\begin{array}{l}
X_{i}^{FEAST} \sim \text{Student}(μ_{SM}^{1}, Σ_{SM}^{1}, 7) \\
X_{i}^{Asia} \sim \text{Student}(μ_{SM}^{2}, Σ_{SM}^{2}, 7) \\
X_{i}^{Kenya} \sim π f_{0} + (1 - π) f_{1} \\
f_{0} = p \, \text{Student}(μ_{SM}^{1}, Σ_{SM}^{1}, 7) + (1 - p) \, \text{Student}(μ_{SM}^{2}, Σ_{SM}^{2}, 7) \\
f_{1} = \sum_{j = 1}^{K} α_{j} \, \text{Student}(μ_{notSM}^{j}, Σ_{notSM}^{j}, 7)
\end{array}
$$

with the following prior distributions and hyperparameters, where $α = \{α_{1}, \ldots, α_{K}\}$ such that $\sum_{j = 1}^{K} α_{j} = 1$ :

$$
\begin{array}{l}
π \sim \text{Beta}(40.3, 24.7) \\
p \sim \text{Beta}(2, 2) \\
μ_{SM}^{1, 2} \sim \text{Normal}(\{1.8, 0.95\}, 0.1^{2}) \\
μ_{notSM}^{1..K} \sim \text{Normal}(\{2.5, 1.5\}, 0.25^{2}) \\
α \sim \text{Dirichlet}(1 / K, \ldots, 1 / K)
\end{array}
$$

The covariance matrices $Σ_{SM}^{1, 2}$ and $Σ_{notSM}^{1..6}$ were parameterised as their Cholesky LKJ decomposition, where the L correlation matrices had a uniform prior (i.e. hyperparameter ν = 1). The model was implemented in rstan.

This models the biomarker data in ‘not severe malaria’ as a mixture of $K$ t-distributions. We chose $K = 6$ as the default choice (sensitivity analysis increasing this has no impact). The Dirichlet prior with hyperparameter $1 / K$ forces sparsity in this mixture model (most of the prior weight is on the vertices of the K-dimensional simplex); see, for example, Frühwirth-Schnatter and Malsiner-Walli, 2019. This is a very general and flexible way of modelling the ‘not severe malaria’ distribution: we are not trying to make inferences about this distribution, we just want the mixture model to be flexible enough to describe it. The model also allows for differences in the joint distribution of platelet counts and white counts between the reference datasets (FEAST trial and the Asian studies). The Kenyan cases drawn from the ‘severe malaria’ sub-population are then modelled as a mix of these two reference models.
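The sparsity induced by the Dirichlet prior with hyperparameter 1/K can be seen by simulation. The Python sketch below is an illustration only: the helper names, the number of draws and the seed are arbitrary choices of ours. It compares the average size of the largest mixture weight under Dirichlet(1/K, ..., 1/K) with that under a flat Dirichlet(1, ..., 1), for K = 6.

```python
import random

def dirichlet(alpha, k, rng):
    # Normalised independent Gamma(alpha, 1) draws follow a symmetric Dirichlet(alpha)
    g = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(g)
    return [v / total for v in g]

def mean_largest_weight(alpha, k=6, draws=2000, seed=1):
    # Average of the largest component over repeated prior draws
    rng = random.Random(seed)
    return sum(max(dirichlet(alpha, k, rng)) for _ in range(draws)) / draws

sparse = mean_largest_weight(1.0 / 6)   # hyperparameter 1/K, as in the text
flat = mean_largest_weight(1.0)         # uniform on the simplex, for comparison
print(f"mean largest weight: {sparse:.2f} (alpha = 1/K) vs {flat:.2f} (alpha = 1)")
```

Under the sparse prior a single component typically carries most of the weight, so unused components are effectively switched off unless the data call for them.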

Re-weighted likelihood for case-control analyses

For each $\{X_{i}^{Kenya}\}_{i = 1}^{2220}$ , we estimate the posterior probability of being drawn from the sampling distribution f₀. The mean posterior probability then defines a precision weight $w_{i}$ which can be used in a standard generalised linear model (glm) with the same interpretation as inverse probability weights. The weighted glm is equivalent to computing the maximum likelihood estimate where the log-likelihood is weighted by $w_{i}$. In our case-control analyses, all the controls are given weight 1. Nie et al., 2013 give a proof of correctness for this re-weighted log-likelihood (equivalent to ‘tilting’ the dataset towards the desired distribution ${\hat{f}}_{0} (X)$ ). The log-odds ratio computed from the weighted logistic regression can be interpreted as the causal effect of the polymorphism on ‘true severe malaria’ relative to the controls, where ‘true severe malaria’ is defined by the sampling distribution f₀. Appendix 12 shows the results of a simulation study demonstrating how the effect estimates and standard error estimates vary as a function of the proportion of mis-classified cases (as given by the probability weights).
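The re-weighted likelihood can be illustrated with a small, self-contained Python sketch. It is not the authors' R code: the data are entirely hypothetical, the function name is ours, and only point estimates are computed (the correction of standard errors for the reduced effective sample size is not attempted). With a single binary genotype, the weighted maximum likelihood estimate of the log-odds ratio should coincide with the log-odds ratio of the weighted 2×2 table, which gives a simple check.

```python
import math

def weighted_logistic(x, y, w, iters=25):
    """Maximise sum_i w_i * loglik_i for logit P(y=1) = b0 + b1*x by Newton's method."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi, wi in zip(x, y, w):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            r = wi * (yi - p)            # weighted score contribution
            v = wi * p * (1.0 - p)       # weighted information contribution
            g0 += r
            g1 += r * xi
            h00 += v
            h01 += v * xi
            h11 += v * xi * xi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical toy data: (carrier status x, case status y, weight w, count)
groups = [
    (0, 0, 1.0, 80), (1, 0, 1.0, 20),   # controls, all given weight 1
    (0, 1, 0.9, 40), (1, 1, 0.9, 2),    # cases likely to be 'severe malaria'
    (0, 1, 0.2, 15), (1, 1, 0.2, 5),    # cases likely to be mis-classified
]
x, y, w = [], [], []
for xi, yi, wi, n in groups:
    x += [xi] * n
    y += [yi] * n
    w += [wi] * n

_, b1_weighted = weighted_logistic(x, y, w)
_, b1_unweighted = weighted_logistic(x, y, [1.0] * len(x))
print(f"log-odds ratio: weighted {b1_weighted:.3f}, unweighted {b1_unweighted:.3f}")
```

In this toy example the carriers are concentrated among the low-weight cases, so down-weighting moves the estimated log-odds ratio further from zero, mirroring the dilution effect described in the text.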

Genome-wide association study

Anonymised whole-genome data from the Illumina Omni 2.5M platform for 1944 severe malaria cases and 1738 population controls were downloaded from the European Genome-Phenome Archive (dataset accession ID: EGAD00010001742, release date March 2019; Band et al., 2019). This contained genotyping data on 2,383,648 variants. We used the quality control metadata provided with the 2019 data release to select SNPs and individuals with high-quality data. We first excluded 386 individuals (due to relatedness: 155; missing data or low intensity: 226; gender: 5). We then removed 616,426 SNPs that did not pass quality control, leaving a total of 1,767,222 SNPs. We used plink2 to prune the SNPs (options: --maf 0.01 --indep-pairwise 50 2 0.2) down to a set of 462,120 SNPs in approximate linkage equilibrium. These SNPs were then used to calculate the first five principal components (Appendix 13), which we subsequently used to control for population structure in the genome-wide association study. We used the Michigan imputation server with the 1000 Genomes Phase 3 (version 5) as the reference panel to impute 28.6 million polymorphisms across the 22 autosomal chromosomes. This is a web-based service that runs imputation pipelines (phasing is done with Eagle2, imputation with Minimac4). Encrypted results are returned with a one-time password. Of the remaining 3296 individuals (1681 cases and 1615 controls), we had clinical data available for 1297 cases. We used only the subset of individuals with clinical data available, to allow a fair comparison between the weighted and non-weighted genome-wide association studies. We ran subsequent genome-wide association studies on all biallelic sites with a minor allele frequency $\geq 5$ % (9,615,446 sites in total) assuming an additive model of association. We used the R function glm with a binomial link for all tests of association (genetic data were encoded as the number of ancestral alleles).
The supplementary appendix gives the R code for weighted logistic regression. The point estimates from the weighted model estimated by glm are correct but it is necessary to transform the standard errors in order to take into account the reduction in effective sample size (see code).

Case-control study in directly typed polymorphisms

We fit a categorical (multinomial) logistic regression model to the case-control status as a function of the directly typed polymorphisms (120 after discarding those that are monomorphic in this population; see MalariaGEN Consortium et al., 2018 for additional details). We modelled the severe malaria cases as two separate sub-populations with a latent variable: ‘severe malaria’ versus ‘not severe malaria’, resulting in three possible labels (controls, ‘severe malaria’, ‘not severe malaria’). The models adjusted for self-reported ethnicity and sex. The model was coded in stan (Stan Development Team, 2020) using the log-sum-exp trick to marginalise out the likelihood over the latent variables (see code). Normal(0,5) priors were set on all parameters, and parameter estimates and standard errors were estimated from the maximum a posteriori value (function optimizing in rstan).

Code availability

Code, along with a minimal clinical dataset for reproducibility of the diagnostic phenotyping model, is available via a GitHub repository: https://github.com/jwatowatson/Kenyan_phenotypic_accuracy (Watson, 2021; copy archived at swh:1:rev:03a2de285d38b85a769aa25de46b7960487efc62).

Data availability

A curated minimal clinical dataset is currently available alongside the code on the GitHub repository. This will also be made available at publication via the KEMRI-Wellcome Harvard Dataverse (https://dataverse.harvard.edu/dataverse/kwtrp).

This paper used genome-wide genotyping data generated by Band et al., 2019, available on request from the European Genome-Phenome Archive (dataset accession ID: EGAD00010001742).

Requests for access to appropriately anonymised clinical data and directly typed genetic variants (Malaria Genomic Epidemiology Network, 2014) for the Kenyan severe malaria cohort can be made by application to the data access committee at the KEMRI-Wellcome Trust Research Programme by email to mmunene@kemri-wellcome.org.

The FEAST trial datasets are available from the principal investigator on reasonable request (k.maitland@imperial.ac.uk). Requests for access to appropriately anonymised clinical data from the AQ and AAV Vietnam study and the Asian paediatric cohort can be made via the Mahidol Oxford Tropical Medicine Research Unit data access committee by emailing the corresponding author JAW (jwatowatson@gmail.com) or Rita Chanviriyavuth (rita@tropmedres.ac).

Appendix 1

Appendix 1—figure 1

Comparison of the marginal distributions of white blood cell counts between Asian adults and children with severe malaria and African children with severe malaria.

FEAST: 121 severely ill Ugandan children with PfHRP2 >1000 ng/mL (Maitland et al., 2011). Vietnamese adults: 930 adults from two large randomised trials in severe malaria (Phu et al., 2010; Hien et al., 1996). Bangladesh/Thailand: 653 adults and children from observational studies of severe malaria (Leopold et al., 2019).

Appendix 1—figure 2

Comparison of the marginal distributions of platelet counts between Asian adults and children with severe malaria and African children with severe malaria.

FEAST: 121 severely ill Ugandan children with PfHRP2 >1000 ng/mL (Maitland et al., 2011). Vietnamese adults: 930 adults from two large randomised trials in severe malaria (Phu et al., 2010; Hien et al., 1996). Bangladesh/Thailand: 653 adults and children from observational studies of severe malaria (Leopold et al., 2019). The bottom-left qqplot compares the white counts from the children in the FEAST study with the combined dataset from Vietnam and Bangladesh/Thailand.

Appendix 2

Appendix 2—figure 1

The relationship between platelet counts and plasma PfHRP2 in severely ill African children.

The black line (shaded area) shows the estimated probability (95% confidence interval) that plasma PfHRP2 is >1000 ng/mL as a function of *log*₁₀ platelet count. This fit is derived from a generalised additive logistic regression model ($p < 10^{-16}$ for the spline term), fit using the R package *mgcv*. The generalised additive model was fit to data from 566 African children enrolled in the FEAST trial (Maitland et al., 2011) (all the children who had both platelet counts and PfHRP2 data available). Plasma PfHRP2 >1000 ng/mL is highly discriminatory for severe malaria (Hendriksen et al., 2012).

Appendix 3

Appendix 3—figure 1

Effect of permuting the weights in the re-weighted (data-tilting) GWAS.

Here we show the results of 20 random permutations of the weights, applied to the Kenyan case-control GWAS using only chromosomes 4, 9 and 11 (where the top hits are located; we limit the analysis to these three chromosomes for computational reasons). The random permutations (grey) decrease the number of significant hits compared to the non-weighted model (thick black) and the non-permuted re-weighted model (dashed purple).
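
A permutation check of this kind can be sketched as below. This is a hypothetical Python illustration, not the authors' R code: the weights are shuffled across cases so that each case receives another case's weight, which breaks any link between weight and genotype while keeping the overall weight distribution fixed. The function name, the seed and the toy weights are assumptions.

```python
import random

def permute_weights(weights, n_perm=20, seed=0):
    """Return n_perm random permutations of the case weights."""
    rng = random.Random(seed)  # fixed seed so the permutations are reproducible
    return [rng.sample(weights, len(weights)) for _ in range(n_perm)]

# Hypothetical case weights, i.e. P(Severe malaria | Data) for five cases
case_weights = [0.95, 0.10, 0.80, 0.30, 0.99]
for permuted in permute_weights(case_weights, n_perm=3):
    # Each permuted vector would be passed to the weighted association test
    print(permuted)
```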

Appendix 4

Appendix 4—figure 1

Comparison of the non-weighted and weighted models of association for directly typed polymorphisms previously reported as associated with severe malaria (MalariaGEN Consortium et al., 2018).

(A) Estimated effect sizes under the non-weighted model versus the difference in effect sizes between the weighted and non-weighted models (absolute effects on the log-odds scale). Differences > 0 imply that the absolute effect size is estimated to be larger under the weighted model. (B) –*log*₁₀ p-values under the non-weighted model versus the differences in –*log*₁₀ p-values under the weighted and non-weighted models, again differences > 0 represent larger –*log*₁₀ p-values for the weighted model. Each point is represented by the gene name. In each case, we use the model that best fit the data in the original analysis (MalariaGEN Consortium et al., 2018). For the X-linked polymorphisms (*G6PD, CD40LG*), multiple models were reported and so the association model is also shown. H: heterozygote; A: additive; M: males only; F: females only; M/F: all.

Appendix 5

Appendix 5—figure 1

Case-only analysis of five key polymorphisms affecting red cells, reported in Ndila et al., 2020 under additive, recessive or heterozygote models.

The horizontal dashed lines show the estimated frequency in the controls (for additive models, this is the frequency of the derived allele; for the heterozygote or recessive models, this is the frequency of the genotype thought to confer protection). The line (shaded area) shows logistic regression fits with P(Severe malaria | Data) as the predictor (95% confidence interval of the fit). The p-value corresponds to the test that the predictor P(Severe malaria | Data) is not associated with the genotype in the cases only. OBG: O blood group.

Appendix 6

Appendix 6—figure 1

Distribution of admission haemoglobin concentrations as a function of P(Severe malaria | Data).

Severe anaemia is generally defined as a haemoglobin less than 5 g/dL in African children diagnosed with severe malaria, shown by the horizontal dashed red line in the top panel and the vertical dashed red lines in the bottom panels. The vertical dashed red lines in the top panel show the top and bottom quintiles of the probability distribution (0.9 and 0.2, respectively). Patients in the bottom quintile of the probability distribution had a markedly bimodal distribution of haemoglobin concentrations, with a substantial proportion meeting the severe anaemia criterion and a substantial proportion with relatively high haemoglobin concentrations (>10 g/dL), suggesting two patient subgroups. Patients in the top quintile had a unimodal distribution of haemoglobin.

Appendix 7

Appendix 7—figure 1

Pattern of missing clinical data in the 930 Vietnamese adults.

These data pool the AQ Vietnam severe malaria study (Hien et al., 1996) and the AAV severe malaria study (Phu et al., 2010) (red: missing; yellow: recorded).

Appendix 7—figure 2

Missing clinical data in the 2220 Kenyan children diagnosed with severe malaria (red: missing; yellow: recorded).

Appendix 8

Appendix 8—figure 1

Relationship between age and mean white count (modelled on the *log*₁₀ scale).

This is estimated from 858 children in the FEAST trial who had white counts available using an additive linear model ($p = 10^{-8}$ for the smooth spline term). We used this model to adjust observed *log*₁₀ white counts in all children less than 5 years of age in the reference and Kenyan datasets.

Appendix 9

Appendix 9—figure 1

Normal-quantile plots for platelet counts and white blood cell counts in the reference data.

Both were standardised to have mean 0 and standard deviation of 1 on the *log*₁₀ scale. The diagonal lines show the identity line.

Appendix 10

Appendix 10—figure 1

Collider bias in the diagnostic model of severe malaria based on complete blood count data.

*HBB* in its homozygous S form (HbSS, <1% prevalence in this Kenyan population) is a rare example of how this can occur. Children with HbSS have white counts about 2–3 times higher than the normal population and slightly lower platelet counts (Sadarangani et al., 2009). Under the probabilistic model, all 11 children with HbSS were classified as having a low probability of severe malaria, based on their high white counts (mean 40,000 per μL). These probabilities cannot be taken at face value, and it remains an unanswered question whether children with HbSS are more or less susceptible to severe malaria than their wild-type counterparts (Williams and Obaro, 2011).

Appendix 10—figure 2

The relationship between HbSS and the estimated probabilities of severe malaria under the diagnostic model.

There were 11 children with HbSS and they all had low probabilities of severe malaria, but this is biased as these children have chronic inflammation with white counts 2–3 times higher than the general population (Sadarangani et al., 2009) (see Appendix 10—figure 1 above for the causal diagram showing collider bias).

Appendix 11

Appendix 11—figure 1

Scatter plots of platelet counts versus white blood cell counts for the Kenyan cohort, showing the 13 individuals with the double mutation HbAS and homozygous $\alpha^{+}$-thalassaemia as large black diamonds (HZ-alpha-thal).

The red-yellow-blue colour scheme is proportional to the P(Severe malaria | Data) as given by the legend in the top-left corner.

Appendix 12

Simulation study

To demonstrate how the re-weighted likelihood works on simulated data where the true latent classes are known, we constructed a simulation with the following assumptions:

A biallelic marker with a derived allele frequency of 10% in the control population (diplotypes encoded as 0, 1, 2).
An additive protective effect for the true cases resulting in a derived allele frequency of 7% in the true cases; no effect in the false cases.
The latent class probability weights for the true cases are drawn from a Beta(0.2, 1) distribution, and the probability weights for the false cases are drawn from a Beta(1, 0.2) distribution.
A proportion of true versus false cases varying between 50% and 100%.

The R code for the simulation is given in the file Simulation_study_weightedLikelihood.R in the GitHub repository https://github.com/jwatowatson/Kenyan_phenotypic_accuracy. Figures 1 and 2 of this appendix show how the estimated effect sizes, the standard errors and the power (1-type 2 error) vary as a function of the proportion of true cases.
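
To see why mis-labelled cases matter in this set-up, the expected attenuation of a non-weighted analysis can be worked out directly from the allele frequencies listed above. The Python sketch below is a simplified illustration and not the authors' simulation: it uses the allelic log odds ratio as a stand-in for the per-allele effect from an additive logistic model, and the function names are hypothetical.

```python
import math

CONTROL_FREQ = 0.10    # derived allele frequency in controls (from the text)
TRUE_CASE_FREQ = 0.07  # derived allele frequency in true cases (from the text)

def log_odds(p):
    return math.log(p / (1.0 - p))

def labelled_case_freq(prop_true):
    """Expected allele frequency among labelled cases when a fraction
    prop_true are true cases and the rest carry the control frequency."""
    return prop_true * TRUE_CASE_FREQ + (1.0 - prop_true) * CONTROL_FREQ

def allelic_log_or(case_freq, control_freq=CONTROL_FREQ):
    """Allelic log odds ratio of cases versus controls."""
    return log_odds(case_freq) - log_odds(control_freq)

# Proportion of true cases varying between 50% and 100%, as in the simulation
for prop_true in (0.5, 0.6, 0.7, 0.8, 0.9, 1.0):
    freq = labelled_case_freq(prop_true)
    print(prop_true, round(freq, 4), round(allelic_log_or(freq), 4))
```

As the proportion of true cases falls, the expected allele frequency in the labelled cases moves towards the control frequency and the apparent protective effect shrinks towards zero, which is the bias that re-weighting aims to reduce.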

Appendix 12—figure 1

Simulation study demonstrating how likelihood re-weighting can improve estimation accuracy in case-control studies.

Panels (A) and (B) show histograms of the case probability weights used in the simulations for the scenarios when 50% of cases are true cases and when 100% of cases are true cases, respectively. Panel (C) shows the estimated effect sizes as a function of the proportion of mis-classified cases. Panel (D) shows the standard errors of the effect estimates as a function of the proportion of mis-classified cases.

Appendix 12—figure 2

Effect of case re-weighting on power (1-type 2 error).

The thick red line shows the estimated power for the re-weighted approach; the dashed black line shows the estimated power for the non-weighted approach.

Appendix 13

Appendix 13—figure 1

Principal components analysis of 1666 Kenyan cases and 1606 population controls.

The colours show the main self-reported ethnicities (black: Chonyi; red: Giriama; green: Kauma; blue: other). The first five principal components were used to stratify for population structure in the GWAS analyses.

Data availability

A curated minimal clinical dataset is currently available alongside the code on the GitHub repository. This is also available via the KEMRI-Wellcome Harvard Dataverse (https://doi.org/10.7910/DVN/TH8WAW). Whole-genome data are available from the European Genome-Phenome Archive (dataset accession ID: EGAD00010001742). Requests for access to appropriately anonymised clinical data and directly typed genetic variants for the Kenyan severe malaria cohort can be made by application to the data access committee at the KEMRI-Wellcome Trust Research Programme by email to mmunene@kemri-wellcome.org. The FEAST trial datasets are available from the principal investigator on reasonable request (k.maitland@imperial.ac.uk). Requests for access to appropriately anonymised clinical data from the AQ and AAV Vietnam study and the Asian paediatric cohort can be made via the Mahidol Oxford Tropical Medicine Research Unit data access committee by emailing the corresponding author JAW (jwatowatson@gmail.com) or Rita Chanviriyavuth (rita@tropmedres.ac).

The following data sets were generated

(2021) Harvard Dataverse
Replication Data for: Improving statistical power in severe malaria genetic association studies by augmenting phenotypic precision.

https://doi.org/10.7910/DVN/TH8WAW

The following previously published data sets were used

1. MalariaGen Consortium
(2015) European Genome-Phenome Archive
ID EGAD00010001742. A genome-wide study of resistance to severe malaria in 18,000 samples from eleven worldwide populations, including eight populations sub-Saharan Africa.

https://ega-archive.org/studies/EGAS00001001311

References

1. Anstey NM
2. Price RN
(2007) Improving case definitions for severe malaria
PLOS Medicine 4:e267.

https://doi.org/10.1371/journal.pmed.0040267
1. Band G
2. Le QS
3. Clarke GM
4. Kivinen K
5. Hubbart C
6. Jeffreys AE
7. Rowlands K
8. Leffler EM
9. Jallow M
10. Conway DJ
11. Sisay-Joof F
12. Sirugo G
13. d’Alessandro U
14. Toure OB
15. Thera MA
16. Konate S
17. Sissoko S
18. Mangano VD
19. Bougouma EC
20. Sirima SB
21. Amenga-Etego LN
22. Ghansah AK
23. Hodgson AVO
24. Wilson MD
25. Enimil A
26. Ansong D
27. Evans J
28. Ademola SA
29. Apinjoh TO
30. Ndila CM
31. Manjurano A
32. Drakeley C
33. Reyburn H
34. Phu NH
35. Quyen NTN
36. Thai CQ
37. Hien TT
38. Teo YY
39. Manning L
40. Laman M
41. Michon P
42. Karunajeewa H
43. Siba P
44. Allen S
45. Allen A
46. Bahlo M
47. Davis TME
48. Simpson V
49. Shelton J
50. Spencer CCA
51. Busby GBJ
52. Kerasidou A
53. Drury E
54. Stalker J
55. Dilthey A
56. Mentzer AJ
57. McVean G
58. Bojang KA
59. Doumbo O
60. Modiano D
61. Koram KA
62. Agbenyega T
63. Amodu OK
64. Achidi E
65. Williams TN
66. Marsh K
67. Riley EM
68. Molyneux M
69. Taylor T
70. Dunstan SJ
71. Farrar J
72. Mueller I
73. Rockett KA
74. Kwiatkowski DP
75. Network MGE
(2019) Insights into malaria susceptibility using genome-wide data on 17,000 individuals from Africa, Asia and Oceania
Nature Communications 10:5732.

https://doi.org/10.1038/s41467-019-13480-z
1. Bejon P
2. Berkley JA
3. Mwangi T
4. Ogada E
5. Mwangi I
6. Maitland K
7. Williams T
8. Scott JA
9. English M
10. Lowe BS
11. Peshu N
12. Newton CR
13. Marsh K
(2007) Defining childhood severe falciparum malaria for intervention studies
PLOS Medicine 4:e251.

https://doi.org/10.1371/journal.pmed.0040251
1. Carter R
2. Mendis KN
(2002) Evolutionary and historical aspects of the burden of malaria
Clinical Microbiology Reviews 15:564–594.

https://doi.org/10.1128/CMR.15.4.564-594.2002
1. Dondorp A
2. Nosten F
3. Stepniewska K
4. Day N
5. White N
(2005) Artesunate versus quinine for treatment of severe falciparum malaria: a randomised trial
Lancet 366:717–725.

https://doi.org/10.1016/S0140-6736(05)67176-0
1. Dondorp AM
2. Fanello CI
3. Hendriksen ICE
4. Gomes E
5. Seni A
6. Chhaganlal KD
7. Bojang K
8. Olaosebikan R
9. Anunobi N
10. Maitland K
11. Kivaya E
12. Agbenyega T
13. Nguah SB
14. Evans J
15. Gesase S
16. Kahabuka C
17. Mtove G
18. Nadjm B
19. Deen J
20. Mwanga-Amumpaire J
21. Nansumba M
22. Karema C
23. Umulisa N
24. Uwimana A
25. Mokuolu OA
26. Adedoyin OT
27. Johnson WBR
28. Tshefu AK
29. Onyamboko MA
30. Sakulthaew T
31. Ngum WP
32. Silamut K
33. Stepniewska K
34. Woodrow CJ
35. Bethell D
36. Wills B
37. Oneko M
38. Peto TE
39. von Seidlein L
40. Day NPJ
41. White NJ
(2010) Artesunate versus quinine in the treatment of severe falciparum malaria in african children (AQUAMAT): an open-label, randomised trial
The Lancet 376:1647–1657.

https://doi.org/10.1016/S0140-6736(10)61924-1
1. Frühwirth-Schnatter S
2. Malsiner-Walli G
(2019) From here to infinity: sparse finite versus dirichlet process mixtures in model-based clustering
Advances in Data Analysis and Classification 13:33–64.

https://doi.org/10.1007/s11634-018-0329-y
1. Gérardin P
2. Ka AS
3. Imbert P
4. Jouvencel P
5. Brousse V
6. Rogier C
(2002) Prognostic value of thrombocytopenia in african children with falciparum malaria
The American Journal of Tropical Medicine and Hygiene 66:686–691.

https://doi.org/10.4269/ajtmh.2002.66.686
1. Gérardin P
2. Rogier C
3. Ka AS
4. Jouvencel P
5. Diatta B
6. Imbert P
(2007) Outcome of life-threatening malaria in african children requiring endotracheal intubation
Malaria Journal 6:51.

https://doi.org/10.1186/1475-2875-6-51
1. Gomes MF
2. Faiz MA
3. Gyapong JO
4. Warsame M
5. Agbenyega T
6. Babiker A
7. Baiden F
8. Yunus EB
9. Binka F
10. Clerk C
11. Folb P
12. Hassan R
13. Hossain MA
14. Kimbute O
15. Kitua A
16. Krishna S
17. Makasi C
18. Mensah N
19. Mrango Z
20. Olliaro P
21. Peto R
22. Peto TJ
23. Rahman MR
24. Ribeiro I
25. Samad R
26. White NJ
(2009) Pre-referral rectal artesunate to prevent death and disability in severe malaria: a placebo-controlled trial
The Lancet 373:557–566.

https://doi.org/10.1016/S0140-6736(08)61734-1
1. Hanson J
2. Phu NH
3. Hasan MU
4. Charunwatthana P
5. Plewes K
6. Maude RJ
7. Prapansilp P
8. Kingston HW
9. Mishra SK
10. Mohanty S
11. Price RN
12. Faiz MA
13. Dondorp AM
14. White NJ
15. Hien TT
16. Day NP
(2015) The clinical implications of thrombocytopenia in adults with severe falciparum malaria: a retrospective analysis
BMC Medicine 13:1–9.

https://doi.org/10.1186/s12916-015-0324-5
1. Hendriksen IC
2. Mwanga-Amumpaire J
3. von Seidlein L
4. Mtove G
5. White LJ
6. Olaosebikan R
7. Lee SJ
8. Tshefu AK
9. Woodrow C
10. Amos B
11. Karema C
12. Saiwaew S
13. Maitland K
14. Gomes E
15. Pan-Ngum W
16. Gesase S
17. Silamut K
18. Reyburn H
19. Joseph S
20. Chotivanich K
21. Fanello CI
22. Day NP
23. White NJ
24. Dondorp AM
(2012) Diagnosing severe falciparum malaria in parasitaemic african children: a prospective evaluation of plasma PfHRP2 measurement
PLOS Medicine 9:e1001297.

https://doi.org/10.1371/journal.pmed.1001297
1. Hien TT
2. Day NPJ
3. Phu NH
4. Mai NTH
5. Chau TTH
6. Loc PP
7. Sinh DX
8. Chuong LV
9. Vinh H
10. Waller D
11. Peto TEA
12. White NJ
(1996) A controlled trial of artemether or quinine in vietnamese adults with severe falciparum malaria
New England Journal of Medicine 335:76–83.

https://doi.org/10.1056/NEJM199607113350202
1. Kariuki SN
2. Williams TN
(2020) Human genetics and malaria resistance
Human Genetics 139:801–811.

https://doi.org/10.1007/s00439-020-02142-6
1. Lanneaux J
2. Dauger S
3. Pham LL
4. Naudin J
5. Faye A
6. Gillet Y
7. Bosdure E
8. Carbajal R
9. Dubos F
10. Vialet R
11. Chéron G
12. Angoulvant F
(2016) Retrospective study of imported falciparum malaria in french paediatric intensive care units
Archives of Disease in Childhood 101:1004–1009.

https://doi.org/10.1136/archdischild-2015-309665
1. Leblanc C
2. Vasse C
3. Minodier P
4. Mornand P
5. Naudin J
6. Quinet B
7. Siriez JY
8. Sorge F
9. de Suremain N
10. Thellier M
11. Kendjo E
12. Faye A
13. Imbert P
(2020) Management and prevention of imported malaria in children update of the french guidelines
Médecine Et Maladies Infectieuses 50:127–140.

https://doi.org/10.1016/j.medmal.2019.02.005
1. Leopold SJ
2. Watson JA
3. Jeeyapant A
4. Simpson JA
5. Phu NH
6. Hien TT
7. Day NPJ
8. Dondorp AM
9. White NJ
(2019) Investigating causal pathways in severe falciparum malaria: a pooled retrospective analysis of clinical studies
PLOS Medicine 16:e1002858.

https://doi.org/10.1371/journal.pmed.1002858
1. Maitland K
2. Kiguli S
3. Opoka RO
4. Engoru C
5. Olupot-Olupot P
6. Akech SO
7. Nyeko R
8. Mtove G
9. Reyburn H
10. Lang T
11. Brent B
12. Evans JA
13. Tibenderana JK
14. Crawley J
15. Russell EC
16. Levin M
17. Babiker AG
18. Gibb DM
(2011) Mortality after fluid bolus in african children with severe infection
New England Journal of Medicine 364:2483–2495.

https://doi.org/10.1056/NEJMoa1101549
1. Malaria Genomic Epidemiology Network
2. Malaria Genomic Epidemiological Network
3. Band G
4. Le QS
5. Jostins L
6. Pirinen M
7. Kivinen K
8. Jallow M
9. Sisay-Joof F
10. Bojang K
11. Pinder M
12. Sirugo G
13. Conway DJ
14. Nyirongo V
15. Kachala D
16. Molyneux M
17. Taylor T
18. Ndila C
19. Peshu N
20. Marsh K
21. Williams TN
22. Alcock D
23. Andrews R
24. Edkins S
25. Gray E
26. Hubbart C
27. Jeffreys A
28. Rowlands K
29. Schuldt K
30. Clark TG
31. Small KS
32. Teo YY
33. Kwiatkowski DP
34. Rockett KA
35. Barrett JC
36. Spencer CC
(2013) Imputation-based meta-analysis of severe malaria in three african populations
PLOS Genetics 9:e1003509.

https://doi.org/10.1371/journal.pgen.1003509
1. Malaria Genomic Epidemiology Network
(2014) Reappraisal of known malaria resistance loci in a large multicenter study
Nature Genetics 46:1197–1204.

https://doi.org/10.1038/ng.3107
1. Malaria Genomic Epidemiology Network
2. Leffler EM
3. Band G
4. Busby GBJ
5. Kivinen K
6. Le QS
7. Clarke GM
8. Bojang KA
9. Conway DJ
10. Jallow M
11. Sisay-Joof F
12. Bougouma EC
13. Mangano VD
14. Modiano D
15. Sirima SB
16. Achidi E
17. Apinjoh TO
18. Marsh K
19. Ndila CM
20. Peshu N
21. Williams TN
22. Drakeley C
23. Manjurano A
24. Reyburn H
25. Riley E
26. Kachala D
27. Molyneux M
28. Nyirongo V
29. Taylor T
30. Thornton N
31. Tilley L
32. Grimsley S
33. Drury E
34. Stalker J
35. Cornelius V
36. Hubbart C
37. Jeffreys AE
38. Rowlands K
39. Rockett KA
40. Spencer CCA
41. Kwiatkowski DP
(2017) Resistance to malaria through structural variation of red blood cell invasion receptors
Science 356:eaam6393.

https://doi.org/10.1126/science.aam6393
1. MalariaGEN Consortium
2. Uyoga S
3. Ndila CM
4. Macharia AW
5. Nyutu G
6. Shah S
7. Peshu N
8. Clarke GM
9. Kwiatkowski DP
10. Rockett KA
11. Williams TN
(2015) Glucose-6-phosphate dehydrogenase deficiency and the risk of malaria and other diseases in children in Kenya: a case-control and a cohort study
The Lancet Haematology 2:e437–e444.

https://doi.org/10.1016/S2352-3026(15)00152-0
1. MalariaGEN Consortium
2. Clarke GM
3. Rockett K
4. Kivinen K
5. Hubbart C
6. Jeffreys AE
7. Rowlands K
8. Jallow M
9. Conway DJ
10. Bojang KA
11. Pinder M
12. Usen S
13. Sisay-Joof F
14. Sirugo G
15. Toure O
16. Thera MA
17. Konate S
18. Sissoko S
19. Niangaly A
20. Poudiougou B
21. Mangano VD
22. Bougouma EC
23. Sirima SB
24. Modiano D
25. Amenga-Etego LN
26. Ghansah A
27. Koram KA
28. Wilson MD
29. Enimil A
30. Evans J
31. Amodu OK
32. Olaniyan S
33. Apinjoh T
34. Mugri R
35. Ndi A
36. Ndila CM
37. Uyoga S
38. Macharia A
39. Peshu N
40. Williams TN
41. Manjurano A
42. Sepúlveda N
43. Clark TG
44. Riley E
45. Drakeley C
46. Reyburn H
47. Nyirongo V
48. Kachala D
49. Molyneux M
50. Dunstan SJ
51. Phu NH
52. Quyen NN
53. Thai CQ
54. Hien TT
55. Manning L
56. Laman M
57. Siba P
58. Karunajeewa H
59. Allen S
60. Allen A
61. Davis TM
62. Michon P
63. Mueller I
64. Molloy SF
65. Campino S
66. Kerasidou A
67. Cornelius VJ
68. Hart L
69. Shah SS
70. Band G
71. Spencer CC
72. Agbenyega T
73. Achidi E
74. Doumbo OK
75. Farrar J
76. Marsh K
77. Taylor T
78. Kwiatkowski DP
(2017) Characterisation of the opposing effects of G6PD deficiency on cerebral malaria and severe malarial anaemia
eLife 6:e15085.

https://doi.org/10.7554/eLife.15085
1. MalariaGEN Consortium
2. Ndila CM
3. Uyoga S
4. Macharia AW
5. Nyutu G
6. Peshu N
7. Ojal J
8. Shebe M
9. Awuondo KO
10. Mturi N
11. Tsofa B
12. Sepúlveda N
13. Clark TG
14. Band G
15. Clarke G
16. Rowlands K
17. Hubbart C
18. Jeffreys A
19. Kariuki S
20. Marsh K
21. Mackinnon M
22. Maitland K
23. Kwiatkowski DP
24. Rockett KA
25. Williams TN
(2018) Human candidate gene polymorphisms and risk of severe malaria in children in Kilifi, Kenya: a case-control association study
The Lancet Haematology 5:e333–e345.

https://doi.org/10.1016/S2352-3026(18)30107-8
1. Ndila C
2. Nyirongo V
3. Macharia A
4. Jeffreys A
5. Rowlands K
6. Hubbart C
7. Busby G
(2020) Haplotype heterogeneity and low linkage disequilibrium reduce reliable prediction of genotypes for the form of α-thalassaemia using genome-wide microarray data [version 1; peer review: awaiting peer review]
Wellcome Open Research 5:287.

https://doi.org/10.12688/wellcomeopenres.16320.1
1. Nie L
2. Zhang Z
3. Rubin D
4. Chu J
(2013) Likelihood reweighting methods to reduce potential bias in noninferiority trials which rely on historical data to make inference
The Annals of Applied Statistics 7:1796–1813.

https://doi.org/10.1214/13-AOAS655
1. Opi DH
2. Ochola LB
3. Tendwa M
4. Siddondo BR
5. Ocholla H
6. Fanjo H
7. Ghumra A
8. Ferguson DJ
9. Rowe JA
10. Williams TN
(2014) Mechanistic studies of the negative epistatic malaria-protective interaction between sickle cell trait and α+ thalassemia
EBioMedicine 1:29–36.

https://doi.org/10.1016/j.ebiom.2014.10.006
(2017) Severe imported malaria in children in France: A national retrospective study from 1996 to 2005
PLOS ONE 12:e0180758.

https://doi.org/10.1371/journal.pone.0180758
1. Phu NH
2. Tuan PQ
3. Day N
4. Mai NT
5. Chau TT
6. Chuong LV
7. Sinh DX
8. White NJ
9. Farrar J
10. Hien TT
(2010) Randomized controlled trial of artesunate or artemether in vietnamese adults with severe falciparum malaria
Malaria Journal 9:97.

https://doi.org/10.1186/1475-2875-9-97
1. Phu NH
2. Day NPJ
3. Tuan PQ
4. Mai NTH
5. Chau TTH
6. Van Chuong L
7. Vinh H
8. Loc PP
9. Sinh DX
10. Hoa NTT
11. Waller DJ
12. Wain J
13. Jeyapant A
14. Watson JA
15. Farrar JJ
16. Hien TT
17. Parry CM
18. White NJ
(2020) Concomitant bacteremia in adults with severe falciparum malaria
Clinical Infectious Diseases 19:191.

https://doi.org/10.1093/cid/ciaa191
1. Reich D
2. Nalls MA
3. Kao WH
4. Akylbekova EL
5. Tandon A
6. Patterson N
7. Mullikin J
8. Hsueh WC
9. Cheng CY
10. Coresh J
11. Boerwinkle E
12. Li M
13. Waliszewska A
14. Neubauer J
15. Li R
16. Leak TS
17. Ekunwe L
18. Files JC
19. Hardy CL
20. Zmuda JM
21. Taylor HA
22. Ziv E
23. Harris TB
24. Wilson JG
(2009) Reduced neutrophil count in people of african descent is due to a regulatory variant in the duffy antigen receptor for chemokines gene
PLOS Genetics 5:e1000360.

https://doi.org/10.1371/journal.pgen.1000360
(2018) Quantification of anti-parasite and anti-disease immunity to malaria as a function of age and exposure
eLife 7:e35832.

https://doi.org/10.7554/eLife.35832
1. Sadarangani M
2. Makani J
3. Komba AN
4. Ajala-Agbo T
5. Newton CR
6. Marsh K
7. Williams TN
(2009) An observational study of children with sickle cell disease in Kilifi, Kenya
British Journal of Haematology 146:675–682.

https://doi.org/10.1111/j.1365-2141.2009.07771.x
1. Scott JAG
2. Berkley JA
3. Mwangi I
4. Ochola L
5. Uyoga S
6. Macharia A
7. Ndila C
8. Lowe BS
9. Mwarumba S
10. Bauni E
11. Marsh K
12. Williams TN
(2011) Relation between falciparum malaria and bacteraemia in kenyan children: a population-based, case-control study and a longitudinal study
The Lancet 378:1316–1323.

https://doi.org/10.1016/S0140-6736(11)60888-X
1. Small DS
2. Taylor TE
3. Postels DG
4. Beare NA
5. Cheng J
6. MacCormick IJ
7. Seydel KB
(2017) Evidence from a natural experiment that malaria parasitemia is pathogenic in retinopathy-negative cerebral malaria
eLife 6:e23699.

https://doi.org/10.7554/eLife.23699
(1994) Attributable fraction estimates and case definitions for malaria in endemic Areas
Statistics in Medicine 13:2345–2358.

https://doi.org/10.1002/sim.4780132206
Software
1. Stan Development Team
(2020) RStan: the R interface to Stan
R package version.

https://cran.r-project.org/web/packages/rstan/vignettes/rstan.html
1. Storey JD
(2002) A direct approach to false discovery rates
Journal of the Royal Statistical Society: Series B 64:479–498.

https://doi.org/10.1111/1467-9868.00346
1. Taylor TE
2. Fu WJ
3. Carr RA
4. Whitten RO
5. Mueller JS
6. Fosiko NG
7. Lewallen S
8. Liomba NG
9. Molyneux ME
10. Mueller JG
(2004) Differentiating the pathologies of cerebral malaria by postmortem parasite counts
Nature Medicine 10:143–145.

https://doi.org/10.1038/nm986
(2012) Haemoglobinopathies and the clinical epidemiology of malaria: a systematic review and meta-analysis
The Lancet Infectious Diseases 12:457–468.

https://doi.org/10.1016/S1473-3099(12)70055-5
(2010) Methodological challenges of genome-wide association analysis in Africa
Nature Reviews Genetics 11:149–160.

https://doi.org/10.1038/nrg2731
1. Uyoga S
2. Macharia AW
3. Ndila CM
4. Nyutu G
5. Shebe M
6. Awuondo KO
7. Mturi N
8. Peshu N
9. Tsofa B
10. Scott JAG
11. Maitland K
12. Williams TN
(2019) The indirect health effects of malaria estimated from health advantages of the sickle cell trait
Nature Communications 10:856.

https://doi.org/10.1038/s41467-019-08775-0
1. Wambua S
2. Mwangi TW
3. Kortok M
4. Uyoga SM
5. Macharia AW
6. Mwacharo JK
7. Weatherall DJ
8. Snow RW
9. Marsh K
10. Williams TN
(2006) The effect of alpha+-thalassaemia on the incidence of malaria and other diseases in children living on the coast of Kenya
PLOS Medicine 3:e158.

https://doi.org/10.1371/journal.pmed.0030158
(1982) Dexamethasone proves deleterious in cerebral malaria
New England Journal of Medicine 306:313–319.

https://doi.org/10.1056/NEJM198202113060601
1. Watson JA
2. Leopold SJ
3. Simpson JA
4. Day NP
5. Dondorp AM
6. White NJ
(2019) Collider bias and the apparent protective effect of glucose-6-phosphate dehydrogenase deficiency on cerebral malaria
eLife 8:e43154.

https://doi.org/10.7554/eLife.43154
Software
1. Watson JA
(2021) Kenyan_phenotypic_accuracy, version swh:1:rev:03a2de285d38b85a769aa25de46b7960487efc62
Software Heritage.

https://archive.softwareheritage.org/swh:1:dir:236fe424e79fd4d45db768d96e140070d5e1c025;origin=https://github.com/jwatowatson/Kenyan_phenotypic_accuracy;visit=swh:1:snp:c763ec5d56fcbceab1e3f6b1b97b56791289f294;anchor=swh:1:rev:03a2de285d38b85a769aa25de46b7960487efc62
1. White NJ
2. Turner GD
3. Day NP
4. Dondorp AM
(2013) Lethal malaria: Marchiafava and Bignami were right
Journal of Infectious Diseases 208:192–198.

https://doi.org/10.1093/infdis/jit116
1. Williams TN
2. Mwangi TW
3. Wambua S
4. Peto TE
5. Weatherall DJ
6. Gupta S
7. Recker M
8. Penman BS
9. Uyoga S
10. Macharia A
11. Mwacharo JK
12. Snow RW
13. Marsh K
(2005) Negative epistasis between the malaria-protective effects of alpha+-thalassemia and the sickle cell trait
Nature Genetics 37:1253–1257.

https://doi.org/10.1038/ng1660
1. Williams TN
2. Obaro SK
(2011) Sickle cell disease and malaria morbidity: a tale with two tails
Trends in Parasitology 27:315–320.

https://doi.org/10.1016/j.pt.2011.02.004
1. World Health Organisation
(2014) Severe malaria
Tropical Medicine & International Health 19:7–131.

https://doi.org/10.1111/tmi.12313_2
Report
1. World Health Organization
(2020)
World Malaria Report 2020: 20 Years of Global Progress and Challenges

World Health Organization.
1. Zondervan KT
2. Cardon LR
(2007) Designing candidate gene and genome-wide case-control association studies
Nature Protocols 2:2492–2501.

https://doi.org/10.1038/nprot.2007.366

Article and author information

Author details

James A Watson
1. Mahidol Oxford Tropical Medicine Research Unit, Faculty of Tropical Medicine, Mahidol University, Bangkok, Thailand
2. Centre for Tropical Medicine and Global Health, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
Contribution
Conceptualization, Software, Formal analysis, Investigation, Visualization, Methodology, Writing - original draft, Writing - review and editing

Contributed equally with
Carolyne M Ndila

For correspondence
jwatowatson@gmail.com

Competing interests
No competing interests declared

ORCID iD: 0000-0001-5524-0325
Carolyne M Ndila
1. Mahidol Oxford Tropical Medicine Research Unit, Faculty of Tropical Medicine, Mahidol University, Bangkok, Thailand
2. Centre for Tropical Medicine and Global Health, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
Contribution
Resources, Data curation, Writing - review and editing

Contributed equally with
James A Watson

Competing interests
No competing interests declared
Sophie Uyoga

KEMRI-Wellcome Trust Research Programme, Centre for Geographic Medicine Research-Coast, Kilifi, Kenya

Contribution
Resources, Data curation, Writing - review and editing

Competing interests
No competing interests declared
Alexander Macharia

KEMRI-Wellcome Trust Research Programme, Centre for Geographic Medicine Research-Coast, Kilifi, Kenya

Contribution
Data curation, Writing - review and editing

Competing interests
No competing interests declared
Gideon Nyutu

KEMRI-Wellcome Trust Research Programme, Centre for Geographic Medicine Research-Coast, Kilifi, Kenya

Contribution
Data curation, Methodology

Competing interests
No competing interests declared
Shebe Mohammed

KEMRI-Wellcome Trust Research Programme, Centre for Geographic Medicine Research-Coast, Kilifi, Kenya

Contribution
Data curation

Competing interests
No competing interests declared
Caroline Ngetsa

KEMRI-Wellcome Trust Research Programme, Centre for Geographic Medicine Research-Coast, Kilifi, Kenya

Contribution
Data curation, Writing - review and editing

Competing interests
No competing interests declared
Neema Mturi

KEMRI-Wellcome Trust Research Programme, Centre for Geographic Medicine Research-Coast, Kilifi, Kenya

Contribution
Data curation, Writing - review and editing

Competing interests
No competing interests declared
Norbert Peshu

KEMRI-Wellcome Trust Research Programme, Centre for Geographic Medicine Research-Coast, Kilifi, Kenya

Contribution
Data curation, Writing - review and editing

Competing interests
No competing interests declared
Benjamin Tsofa

KEMRI-Wellcome Trust Research Programme, Centre for Geographic Medicine Research-Coast, Kilifi, Kenya

Contribution
Data curation

Competing interests
No competing interests declared
Kirk Rockett
1. The Wellcome Sanger Institute, Cambridge, United Kingdom
2. Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, United Kingdom
Contribution
Resources, Data curation, Writing - review and editing

Competing interests
No competing interests declared
Stije Leopold
1. Mahidol Oxford Tropical Medicine Research Unit, Faculty of Tropical Medicine, Mahidol University, Bangkok, Thailand
2. Centre for Tropical Medicine and Global Health, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
Contribution
Data curation, Writing - review and editing

Competing interests
No competing interests declared

ORCID iD: 0000-0002-0482-5689
Hugh Kingston
1. Mahidol Oxford Tropical Medicine Research Unit, Faculty of Tropical Medicine, Mahidol University, Bangkok, Thailand
2. Centre for Tropical Medicine and Global Health, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
Contribution
Data curation, Writing - review and editing

Competing interests
No competing interests declared

ORCID iD: 0000-0003-1869-8307
Elizabeth C George

Medical Research Council Clinical Trials Unit, University College London, London, United Kingdom

Contribution
Data curation, Writing - review and editing

Competing interests
No competing interests declared
Kathryn Maitland
1. KEMRI-Wellcome Trust Research Programme, Centre for Geographic Medicine Research-Coast, Kilifi, Kenya
2. Institute of Global Health Innovation, Imperial College, London, London, United Kingdom
Contribution
Data curation, Funding acquisition, Writing - review and editing

Competing interests
No competing interests declared

ORCID iD: 0000-0002-0007-0645
Nicholas PJ Day
1. Mahidol Oxford Tropical Medicine Research Unit, Faculty of Tropical Medicine, Mahidol University, Bangkok, Thailand
2. Centre for Tropical Medicine and Global Health, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
Contribution
Resources, Supervision, Funding acquisition, Validation, Writing - review and editing

Competing interests
No competing interests declared

ORCID iD: 0000-0003-2309-1171
Arjen M Dondorp
1. Mahidol Oxford Tropical Medicine Research Unit, Faculty of Tropical Medicine, Mahidol University, Bangkok, Thailand
2. Centre for Tropical Medicine and Global Health, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
Contribution
Resources, Validation, Writing - review and editing

Competing interests
No competing interests declared

ORCID iD: 0000-0001-5190-2395
Philip Bejon
1. Centre for Tropical Medicine and Global Health, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
2. KEMRI-Wellcome Trust Research Programme, Centre for Geographic Medicine Research-Coast, Kilifi, Kenya
Contribution
Resources, Supervision, Funding acquisition, Validation, Writing - review and editing

Competing interests
No competing interests declared
Thomas N Williams
1. KEMRI-Wellcome Trust Research Programme, Centre for Geographic Medicine Research-Coast, Kilifi, Kenya
2. Institute of Global Health Innovation, Imperial College, London, London, United Kingdom
Contribution
Resources, Data curation, Supervision, Funding acquisition, Validation, Methodology, Writing - review and editing

Contributed equally with
Chris C Holmes and Nicholas J White

Competing interests
No competing interests declared

ORCID iD: 0000-0003-4456-2382
Chris C Holmes
1. Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
2. Department of Statistics, University of Oxford, Oxford, United Kingdom
Contribution
Resources, Supervision, Validation, Methodology, Writing - review and editing

Contributed equally with
Thomas N Williams and Nicholas J White

Competing interests
No competing interests declared
Nicholas J White
1. Mahidol Oxford Tropical Medicine Research Unit, Faculty of Tropical Medicine, Mahidol University, Bangkok, Thailand
2. Centre for Tropical Medicine and Global Health, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
Contribution
Conceptualization, Resources, Supervision, Funding acquisition, Validation, Methodology, Project administration, Writing - review and editing

Contributed equally with
Thomas N Williams and Chris C Holmes

Competing interests
No competing interests declared

ORCID iD: 0000-0002-1897-1978

Funding

Wellcome Trust (209265/Z/17/Z)

Kathryn Maitland
Nicholas PJ Day
Arjen M Dondorp

Wellcome Trust (202800/Z/16/Z)

Thomas N Williams

Wellcome Trust (093956/Z/10/C)

Nicholas J White

Medical Research Council (MC_UU_12023/26)

Elizabeth C George

Wellcome Trust (WT077383/Z/05/Z)

Kirk Rockett

Medical Research Council (G0801439)

Elizabeth C George
Kathryn Maitland

Wellcome Trust (090770/Z/09/Z; 204911/Z/16/Z)

Kathryn Maitland

Medical Research Council (G0600718; G0600230; MR/M006212/1)

Kathryn Maitland

Wellcome Trust (203141/Z/16/Z)

Kathryn Maitland

Wellcome Trust (206194)

Kathryn Maitland

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

This research was funded by The Wellcome Trust. A CC BY or equivalent licence is applied to the author accepted manuscript arising from this submission, in accordance with the grant's open access conditions. This work was done as part of SMAART (Severe Malaria Africa – A consortium for Research and Trials) funded by a Wellcome Collaborative Award in Science grant (209265/Z/17/Z) held in part by KM, NPJD and AD. TNW and NJW are senior and principal research fellows respectively funded by the Wellcome Trust (202800/Z/16/Z and 093956/Z/10/C, respectively). ECG acknowledges funding from a core grant to the MRC CTU at UCL from the MRC (MC_UU_12023/26).

The human data used in this study were generated through the Malaria Genomic Epidemiology Network (https://www.MalariaGEN.net) Consortial Project 1, for which a full list of Consortium members is provided at https://www.malariagen.net/projects/consortial-project-1/malariagen-consortium-members. The Malaria Genomic Epidemiology Network study of severe malaria was supported by Wellcome (WT077383/Z/05/Z) and the Bill and Melinda Gates Foundation (https://www.gatesfoundation.org/) through the Foundation for the National Institutes of Health (https://fnih.org/) as part of the Grand Challenges in Global Health Initiative. The Resource Centre for Genomic Epidemiology of Malaria is supported by Wellcome (090770/Z/09/Z; 204911/Z/16/Z). This research was supported by the Medical Research Council (G0600718; G0600230; MR/M006212/1). Wellcome also provides core awards to the Wellcome Centre for Human Genetics (203141/Z/16/Z) and the Wellcome Sanger Institute (206194).

This study also makes use of data from the FEAST trial. The FEAST trial was supported by a grant (G0801439) from the Medical Research Council, UK, provided through the (MRC) DFID concordat. KM and ECG were supported by this grant.

Ethics

Human subjects: All clinical data are from published studies in which all participants or their guardians gave fully informed consent. Access to the human genetic data was approved by the MalariaGEN data access committee.

Copyright

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

1,140 views
134 downloads
27 citations

Views, downloads and citations are aggregated across all versions of this paper published by eLife.

Citations by DOI

27 citations for umbrella DOI https://doi.org/10.7554/eLife.69698

Cite this article

James A Watson, Carolyne M Ndila, Sophie Uyoga, Alexander Macharia, Gideon Nyutu, Shebe Mohammed, Caroline Ngetsa, Neema Mturi, Norbert Peshu, Benjamin Tsofa, Kirk Rockett, Stije Leopold, Hugh Kingston, Elizabeth C George, Kathryn Maitland, Nicholas PJ Day, Arjen M Dondorp, Philip Bejon, Thomas N Williams, Chris C Holmes, Nicholas J White (2021) Improving statistical power in severe malaria genetic association studies by augmenting phenotypic precision
eLife 10:e69698.

https://doi.org/10.7554/eLife.69698

Categories and tags

Research organism

Human

Platelet counts and white blood cell counts as diagnostic predictors of severe falciparum malaria.

Summary of severe disease datasets used in our analyses.

Theoretical causal pathways that lead to the clinical diagnosis of severe malaria under the current WHO definition (World Health Organisation, 2014).

Model estimates of P(Severe malaria | Data) in 2220 Kenyan children clinically diagnosed with severe malaria.

The number of significant hits as a function of the FDR for the genome-wide association study across 9.6 million biallelic variants.

The three regions in the human genome with the greatest evidence for protection against severe malaria in East Africa (HBB, ABO and FREM3; Band et al., 2019).

Exploring differential effects in 120 directly typed polymorphisms across 70 candidate malaria-protecting genes.

Comparison of the marginal distributions of white blood cell counts between Asian adults and children with severe malaria and African children with severe malaria.

Comparison of the marginal distributions of platelet counts between Asian adults and children with severe malaria and African children with severe malaria.

The relationship between platelet counts and plasma PfHRP2 in severely ill African children.

Effect of permuting the weights in the re-weighted (data-tilting) GWAS.

Comparison of the non-weighted and weighted models of association for directly typed polymorphisms previously reported as associated with severe malaria (MalariaGEN Consortium et al., 2018).

Case-only analysis of five key polymorphisms affecting red cells, reported in Ndila et al., 2020, under additive, recessive or heterozygote models.

Distribution of admission haemoglobin concentrations as a function of P(Severe malaria | Data).

Pattern of missing clinical data in the 930 Vietnamese adults.

Missing clinical data in the 2220 Kenyan children diagnosed with severe malaria (red: missing; yellow: recorded).

Relationship between age and mean white count (modelled on the log10 scale).

Normal-quantile plots for platelet counts and white blood cell counts in the reference data.

Collider bias in the diagnostic model of severe malaria based on complete blood count data.

The relationship between HbSS and the estimated probabilities of severe malaria under the diagnostic model.

Scatter plots of platelet counts versus white blood cell counts for the Kenyan cohort, showing the 13 individuals with the double mutation HbAS and homozygous α+-thalassaemia as large black diamonds (HZ-alpha-thal).

Simulation study demonstrating how likelihood re-weighting can improve estimation accuracy in case-control studies.

Effect of case re-weighting on power (1-type 2 error).

Principal components analysis of 1666 Kenyan cases and 1606 population controls.
