Impact of seasonal variations in Plasmodium falciparum malaria transmission on the surveillance of pfhrp2 gene deletions

  1. Oliver John Watson (corresponding author)
  2. Robert Verity
  3. Azra C Ghani
  4. Tini Garske
  5. Jane Cunningham
  6. Antoinette Tshefu
  7. Melchior K Mwandagalirwa
  8. Steven R Meshnick
  9. Jonathan B Parr
  10. Hannah C Slater
  1. Imperial College London, United Kingdom
  2. World Health Organization, Switzerland
  3. University of Kinshasa, Democratic Republic of the Congo
  4. University of North Carolina at Chapel Hill, United States

Abstract

Ten countries have reported pfhrp2/pfhrp3 gene deletions since the first observation of pfhrp2-deleted parasites in 2012. In a previous study (Watson et al., 2017), we characterised the drivers selecting for pfhrp2/3 deletions and mapped the regions in Africa with the greatest selection pressure. In February 2018, the World Health Organization issued guidance on investigating suspected false-negative rapid diagnostic tests (RDTs) due to pfhrp2/3 deletions. However, no guidance is provided regarding the timing of investigations. Failure to consider seasonal variation could cause premature decisions to switch to alternative RDTs. In response, we have extended our methods and predict that the prevalence of false-negative RDTs due to pfhrp2/3 deletions is highest when sampling from younger individuals during the beginning of the rainy season. We conclude by producing a map of the regions impacted by seasonal fluctuations in pfhrp2/3 deletions and a database identifying optimum sampling intervals to support malaria control programmes.

https://doi.org/10.7554/eLife.40339.001

Introduction

Diagnostic testing of suspected malaria cases has more than doubled in the last 15 years, with 75% of suspected cases seeking treatment from the public health sector receiving a diagnostic test in 2017 (World Health Organization, 2018a). Much of this progress reflects the increased distribution of rapid diagnostic tests (RDTs), with the most commonly used RDTs targeting the P. falciparum protein HRP2 (PfHRP2). In 2014, a review of published reports of pfhrp2/3 deletions was conducted and included a critical assessment of the comprehensiveness of the diagnostic investigation (Cheng et al., 2014). The findings of this review highlighted a need for a harmonized approach to investigating and confirming or excluding pfhrp2/3 deletions and called for further studies to determine the prevalence and impact of pfhrp2/3 gene deletions. Since that review, false-negative RDT results due to pfhrp2/3 gene deletions have been reported in 10 countries in sub-Saharan Africa (SSA) (World Health Organization, 2018b). The frequency of pfhrp2/3 deletions varies across SSA, with the highest burden observed in Eritrea where 80.8% of samples from Ghindae Hospital were both pfhrp2-negative and pfhrp3-negative in 2016 (Berhane et al., 2018).

Mathematical modelling has predicted that the continued use of only PfHRP2 RDTs will quickly select for parasites without the pfhrp2 gene (Gatton et al., 2017). This selection pressure occurs due to the misdiagnosis of infections caused by parasites lacking the pfhrp2 gene, which will subsequently contribute more towards onwards transmission than wild-type parasites that are correctly diagnosed due to the expression of pfhrp2. In 2017, we conducted an analysis of the drivers of pfhrp2 gene deletion selection, identifying the administrative regions in SSA with the greatest potential for selecting for pfhrp2-deleted parasites (Watson et al., 2017). The regions identified were areas with both a low prevalence of malaria and a high frequency of people seeking treatment and being treated on the basis of PfHRP2-based RDT diagnosis. The precise strength of selection, however, is not known, with other factors such as the rate of non-malarial fevers and non-adherence to RDT outcomes likely to impact the number of misdiagnosed cases receiving treatment.

In February 2018, the World Health Organization (WHO) issued guidance for national malaria control programmes on how to investigate suspected false-negative RDTs with an emphasis on pfhrp2/3 gene deletions (World Health Organization, 2018c). The primary study outcome to be calculated in the guidance is as follows:

\[
\text{Proportion of P. falciparum cases with false-negative HRP2 RDT results due to pfhrp2/3 deletions}
= \frac{\text{\# of confirmed falciparum patients with pfhrp2/3 gene deletions and HRP2 RDT negative results}}{\text{\# of confirmed P. falciparum cases (by either RDT or microscopy)}}
\]

The guidance recommends that a national change to non PfHRP2-based RDTs should be made if the estimated proportion of P. falciparum cases with false-negative HRP2 RDT results due to pfhrp2/3 deletions is above 5%. If the estimated proportion is less than 5% the country is recommended to establish a monitoring scheme whereby the study is repeated in two years if the 95% confidence interval does not include 5%, or one year if it does include 5%. The 5% threshold approximates the point at which the number of cases missed due to false-negative PfHRP2-based RDTs caused by pfhrp2/3 deletions may become greater than the number of cases that would be missed due to the decreased sensitivity of non PfHRP2-based RDTs. The guidance also specifies a sampling scheme to be used when estimating the prevalence of pfhrp2/3 gene deletions. Samples are to be collected from at least 10 health facilities per province to be tested, with sampling focussed on symptomatic P. falciparum patients presenting at the health facilities. All samplings are to be ideally completed within an 8-week period.
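The decision rule described above can be sketched in a few lines of Python. This is a minimal illustration of the logic as summarised in this section, not the WHO protocol itself: the function names, the use of a Wilson score interval for the 95% confidence interval and the example counts are all assumptions made here for illustration.

```python
import math

def wilson_interval(k, n, z=1.96):
    """Wilson score interval for a binomial proportion k/n (assumed CI method)."""
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("need n > 0 and 0 <= k <= n")
    p = k / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

def recommended_action(false_neg_deleted, confirmed_cases, threshold=0.05):
    """Apply the rule summarised in the text to one survey (hypothetical helper)."""
    lower, upper = wilson_interval(false_neg_deleted, confirmed_cases)
    proportion = false_neg_deleted / confirmed_cases
    if proportion > threshold:
        return "switch to a non PfHRP2-based RDT"
    if lower <= threshold <= upper:
        return "repeat the study in one year"
    return "repeat the study in two years"

# Hypothetical survey counts, for illustration only.
for k in (30, 45, 80):
    print(k, recommended_action(k, 1000))
```

Because the rule depends only on the estimate and its interval, any seasonal bias in the numerator feeds directly into the recommended action.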

The 8-week interval permits a rapid turnaround and allows for efficient investigations and policy responses. However, the timing of the 8-week interval chosen within a transmission season is important. The chosen interval could lead to estimates of the proportion of P. falciparum cases with false-negative HRP2 RDT results due to pfhrp2/3 deletions that are not representative of the annual average proportion. Consequently, any recorded estimate may not be predictive of the number of cases that may be misdiagnosed due to pfhrp2/3 deletions in the years between sampling intervals. For example, an overestimation of the annual average proportion of false-negative RDTs due to pfhrp2/3 deletions could result in a switch to a less sensitive RDT, resulting in an increase in the number of malaria cases misdiagnosed if the annual average proportion of false-negative RDTs due to pfhrp2/3 deletions is less than 5%. The alternative RDT may also be both more expensive and more complicated to implement. Similarly, an underestimation of the annual average proportion of P. falciparum cases with false-negative HRP2 RDT results due to pfhrp2/3 deletions would result in continued use of an overall less effective test and could provide pfhrp2/3 deleted parasite populations an opportunity to expand.

In response to these concerns, we extended our original methods (Watson et al., 2017) to characterise the impact of seasonal variations in transmission intensity on the proportion of false-negative RDTs due to pfhrp2-deleted parasites. We present an extended version of our previous model, which predicts that more false-negative RDTs due to pfhrp2 gene deletions are observed when monoclonal infections are more prevalent, with the highest proportion observed when sampling from younger children at the start of the rainy season. We then assess how samples collected within an 8-week interval can both over- and underestimate this proportion when compared to the annual average, which reflects the monitoring scheme recommended by the WHO for follow-up studies if the outcomes of the original study are inconclusive. Lastly, we map the administrative regions in SSA with the greatest potential for estimates of the proportion of P. falciparum cases with false-negative HRP2 RDT results due to pfhrp2 deletions to be not predictive of the annual average. In addition, we identify the optimum sampling intervals for each level one administrative region, which are most representative of the annual average.

Results

Using our model, we first explored how the proportion of clinical cases only infected with pfhrp2-deleted parasites varies throughout a transmission season. We recorded the proportion of clinical cases that are PfHRP2-negative in four settings (a low and moderate transmission setting with both a low and highly seasonal transmission dynamic), which had a starting pfhrp2 deletion frequency of 6%. A frequency of 6% was chosen to reflect our previously estimated frequency of pfhrp2 deletions prior to the introduction of RDTs in the Democratic Republic of the Congo (DRC) (Watson et al., 2017). We initially assumed that the frequency of pfhrp2 deletions was not increasing over time before considering scenarios in which the selective pressure for pfhrp2 deletions causes an increase in the population frequency of pfhrp2 deletions. This decision allowed for the impact of seasonality on the proportion of clinical cases that are pfhrp2-negative to be isolated, before allowing comparisons to scenarios in which the proportion of clinical cases that are pfhrp2-negative is also increasing due to changes in the population frequency of pfhrp2 deletions.

Our predictions suggest that the misdiagnosis of clinical cases due to pfhrp2-negative RDT results is heavily dependent on transmission intensity (Figure 1). For the same population frequency of pfhrp2 gene deletions (Figure 1Q–T), the observed proportion of clinical cases that are pfhrp2-negative is predicted to be higher in lower transmission settings (Figure 1I–P). The annual average proportion of clinical cases that are pfhrp2-negative was equal to 5% and 3.25% in the low and moderate transmission setting, respectively. This observation is attributable to the lower rate of superinfection in low transmission settings. The lower rate of superinfection reduces the number of polyclonal infections and increases the chance that an individual is only infected with pfhrp2-negative parasites (Figure 1—figure supplement 1). When we considered scenarios with a selective advantage for pfhrp2-deletions (Figure 1—figure supplement 2), the population frequency of pfhrp2 gene deletions increased over the two years observed (Figure 1—figure supplement 2Q–T) with a corresponding increase in the proportion of clinical cases that are pfhrp2-negative (Figure 1—figure supplement 2I–P).
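The effect of polyclonality can be illustrated with a simplified calculation that is not part of the transmission model. Suppose, purely for illustration, that each of the n parasite clones within an infection independently carries the pfhrp2 deletion with probability f, the population frequency of the deletion. The symbols n and f and the independence assumption are introduced here only for this sketch, which also ignores PfHRP3 cross-reactivity.

\[
P(\text{infected only with pfhrp2-deleted parasites} \mid n \text{ clones}) = f^{\,n}
\]

With f = 0.06, a monoclonal infection (n = 1) is pfhrp2-negative with probability 0.06, whereas an infection with two clones (n = 2) is pfhrp2-negative with probability 0.06² = 0.0036. Under these assumptions, almost all PfHRP2-negative cases arise from monoclonal infections, so any increase in superinfection lowers the observed proportion even when f is unchanged.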

Figure 1 (with 2 supplements)
Relationship between seasonality, transmission intensity and proportion of clinical cases that are infected with only pfhrp2-deleted parasites.

Graphs show in (A–D) and (E–H) the model-predicted PCR prevalence and annual clinical incidence, respectively, at both a low and a moderate transmission intensity. In (I–L) and (M–P) the proportion of clinical cases only infected with pfhrp2-negative parasites is shown for both the whole population and in children under 5 years old, respectively. Graphs (Q–T) show the population allele frequency of pfhrp2 gene deletions, which was set equal to 6% at the beginning of each simulation. Ten simulation realisations are shown in each graph, with the mean shown by the black line. Lastly, the 5% threshold for switching RDT provided by the WHO is shown with the dashed horizontal line in plots (I–P).

https://doi.org/10.7554/eLife.40339.002

An increased proportion of individuals only infected with pfhrp2 gene deletions is predicted to occur at the beginning of the rainy season just before incidence starts to increase. During the rainy season, the observed proportion of cases expected to yield a false-negative RDT due to pfhrp2-deleted parasites (PfHRP2-negative) falls, with the lowest proportion observed after the end of the rainy season. These dynamics are more pronounced in highly seasonal transmission regions (Figure 1B, F, J, N, R, D, H, L, P and T). In the highly seasonal settings, the observed proportion of clinical cases that are PfHRP2-negative is predicted to fluctuate above and below the 5% threshold for switching RDT provided by the WHO (Figure 1J, L, N and P). Smaller fluctuations are seen in less seasonal transmission regions (Figure 1A, E, I, M, Q, C, G, K, O and S), with no fluctuations in the observed proportion of clinical cases that are PfHRP2-negative occurring above 5% in the moderate transmission setting (Figure 1K and O). Similar patterns were observed in scenarios with an increasing frequency of pfhrp2-deletions, with fluctuations in the proportion of clinical cases that were PfHRP2-negative observed in the highly seasonal settings (Figure 1—figure supplement 2J, L, N and P). The highest proportion of cases expected to yield a false-negative RDT due to pfhrp2-deleted parasites was still observed at the beginning of the rainy season.

The specific 8-week interval during which samples are collected is predicted to impact the observed proportion of false-negative RDTs due to pfhrp2 gene deletions (Figure 2). In a moderate transmission setting, a clear seasonal pattern is predicted (Figure 2C), with sampling at the beginning of the transmission season resulting in significant overestimation of the annual average proportion of false-negative RDTs. By contrast, sampling at the end of the rainy season is predicted to yield estimates that are most representative of the annual average. In comparison, surveillance in regions with low seasonality is predicted to yield estimates representative of the annual average throughout the transmission season (Figure 2B and D). In all settings, using a sampling scheme spanning the entire transmission season produced estimates that accurately reflected the annual average. A moderate increase in the proportion of false-negative RDTs is also predicted when sampling younger individuals, with the same patterns also seen within asymptomatic individuals. This observation reflects the increased probability that children younger than 5 years old yield symptoms after the first infection, due to their comparatively lower acquired clinical immunity. Similar seasonal dynamics were observed in the highly seasonal settings when we considered scenarios with a selective advantage for pfhrp2-deletions (Figure 2—figure supplement 1A and C).

Figure 2 (with 1 supplement)
Observed proportion of false-negative PfHRP2 RDTs within clinical cases during 8-week intervals.

Graphs show the proportion of clinical cases yielding false-negative PfHRP2 RDTs at 8-week intervals within a transmission season for both moderate (C, D) and low (A, B) transmission settings and high (A, C) and low (B, D) seasonality. In each panel, the observed proportion of pfhrp2-negative clinical cases is shown for the whole population and within children aged under 5 years old. Ten stochastic realisations are represented by the points in each plot, with the mean relationship throughout the transmission season shown in black with a locally weighted scatterplot smoothing regression (loess). The annual average proportion of false-negative RDTs due to pfhrp2 gene deletions is shown with the horizontal dashed red line, and a sampling scheme that occurs throughout the year, with samples collected proportionally to clinical incidence, is shown with grey points circled in red.

https://doi.org/10.7554/eLife.40339.005

Using data from a national survey of pfhrp2 gene deletions in the DRC, we found that the model-predicted outcomes above were similar to those observed in the field (Figure 3) (Parr et al., 2017). Among 2752 PCR-positive P. falciparum cases in the DRC, individuals were more likely to be infected with only pfhrp2-negative parasites if the clinical incidence in the month prior to sample collection was lower (p=4.1×10^−6), and if the individuals were younger (p=0.016). These findings were maintained when comparing across age and transmission groups, with samples collected during periods of lower transmission found to be more likely to be pfhrp2-negative in both older and younger age groups (p=6.6×10^−5 and 5.6×10^−4, respectively). Samples collected in younger individuals were more likely to be pfhrp2-negative in both lower and higher transmission groups when compared to older individuals (p=0.06 and 0.06, respectively).

Figure 3
Impact of age and transmission intensity upon pfhrp2 deletion in the Democratic Republic of the Congo (DRC), 2013–2014.

Graphs show the percentage of PCR-positive P. falciparum samples taken from children under the age of 5 years from the 2013–2014 Demographic and Health Survey in DRC that are pfhrp2-negative. Children who are younger than the median age in the 2752 samples are grouped within the younger category. In addition, samples are classified as lower transmission if the incidence of malaria in the month prior to sample collection is lower than the median clinical incidence. The 95% binomial confidence intervals are indicated with the vertical error bars.

https://doi.org/10.7554/eLife.40339.007

Lastly, we predicted and mapped the potential for estimates collected within 8-week intervals to be unrepresentative of the annual average proportion of false-negative RDTs due to pfhrp2 gene deletions across 598 first-level administrative regions in SSA (Figure 4). We predict that 66 regions possess at least one 8-week interval for which a premature switch to a non PfHRP2-based RDT would have been made in more than 75% of simulations (Figure 4A) and 29 regions are predicted to possess at least one 8-week interval for which a premature decision to continue using PfHRP2-based RDTs would have been made in more than 75% of simulations (Figure 4B). Out of these 29 regions, 25 are also present within the previously identified 66 regions. The data for each administrative region can be viewed online at the following interactive database https://shiny.dide.imperial.ac.uk/seasonal_hrp2/.

Figure 4
Predicted areas with the potential for collected estimates of the proportion of false-negative PfHRP2 RDTs due to pfhrp2 deletions to be unrepresentative of the annual average.

The maps show (A) the number of 8-week intervals at which an administrative region would prematurely swap to a non PfHRP2-based RDT due to overestimating the proportion of false-negative PfHRP2 RDTs due to pfhrp2 gene deletions in more than 75% of simulations. In (B) the opposing trend is shown, with the number of 8-week intervals at which an administrative region would prematurely continue to use PfHRP2-based RDTs due to underestimating the proportion of false-negative PfHRP2 RDTs due to pfhrp2 gene deletions in more than 75% of simulations.

https://doi.org/10.7554/eLife.40339.008

Discussion

This research characterises the potential for surveillance in highly seasonal areas within sub-Saharan Africa to produce estimates that fail to represent the annual average proportion of P. falciparum cases with false-negative HRP2 RDT results due to pfhrp2 deletions. These findings highlight the impact of both the seasonal timing and the age of individuals sampled when estimating the proportion of false-negative RDTs due to pfhrp2 deletions. Policy decisions based on the proportion of clinical cases presenting with false-negative RDTs due to pfhrp2 gene deletions should thus be made with an awareness of the seasonal transmission dynamics of the region considered.

Our modelling predicted that there would be increased observation of false-negative HRP2 RDT results after periods of lower transmission and within younger individuals. This prediction is consistent with a large, nationally representative survey of pfhrp2-negative samples among asymptomatic subjects in the DRC (Parr et al., 2017). These predictions are also in agreement with other observations from Dioro in the Ségou region of Mali, where in 2012 more than 80% of smear-positive individuals had false-negative RDTs when collected at the end of the dry season (Koita et al., 2013). The proportion of false-negative RDTs then rapidly decreased to 20% within 3–4 weeks after the start of the rainy season. It is, however, likely that a proportion of these false-negative RDTs were due to the increased observation of lower parasitaemia at lower transmission intensities such as at the end of the dry season (Okell et al., 2012). In addition, findings from Eritrea also support our model-predicted outcomes. Eritrea is a region with lower malaria prevalence compared to the Ségou region of Mali. The resultant decrease in transmission intensity is likely to result in an increased proportion of monoclonal infections throughout the transmission season. Consequently, we would predict less variability in the number of false-negative RDTs due to pfhrp2 gene deletions at any given period within a transmission season. We also expect the observed prevalence of pfhrp2 deletions to be more stochastic due to the lower effective population size of the parasite. Indeed, infections due to pfhrp2-deleted parasites identified in Eritrea between November 2013 and November 2014 were not more likely to have occurred after periods of lower transmission intensity (p=0.56, n=144, pfhrp2 deletions at 9.7%) (Menegon et al., 2017).

As in the original publication (Watson et al., 2017), there are a number of modelling assumptions in this study. Firstly, there are modelling uncertainties when predicting the dynamics of false-negative RDTs due to pfhrp2-deleted parasites. To account for this uncertainty in this analysis, we have controlled for the drivers characterised in our earlier study by assuming there was no selective advantage associated with pfhrp2-deleted parasites and recording the number of individuals who would have been pfhrp2-negative and subsequently misdiagnosed. The absence of a selective advantage in this way enabled the frequency of pfhrp2 deletions to remain constant, which ensured that any observed dynamics in the estimates of false-negative RDTs due to pfhrp2 deletions were due to the seasonality of transmission and not due to an increase in the population frequency of pfhrp2 deletions. However, we are aware that there is likely a selective advantage for pfhrp2-deleted parasites and we therefore repeated the analyses with the selective advantage included. In these simulations, we predicted a substantial increase in the frequency of pfhrp2 gene deletions (Figure 1—figure supplement 2Q–T); nevertheless, clear seasonal dynamics, with an increased proportion of false-negative RDTs due to pfhrp2 deletions at the beginning of the transmission season, were still observed (Figure 2—figure supplement 1C). The observed dynamics were, however, less clear in settings with the greatest increase in the frequency of pfhrp2 deletions (Figure 2—figure supplement 1B).

Secondly, we assessed the potential for a region to yield unrepresentative estimates of the proportion of false-negative RDTs due to pfhrp2 deletions through comparisons to the annual average proportion. This decision reflected firstly the monitoring period defined in the WHO technical guidance, with follow-up studies recommended after two years if the 95% CI for the proportion of P. falciparum cases with false-negative HRP2 RDT results due to pfhrp2/3 deletions does not include 5%, or one year if it does include 5%. It also reflected our modelling assumption that the population frequency of pfhrp2 deletions is not increasing over time. However, in simulations in which a selective advantage to pfhrp2 deleted parasites was included, a comparison to the annual average proportion is less suitable. For example, in Figure 2—figure supplement 1B, because we started our simulations in January the optimum sampling interval is simply the interval in the middle of the year, reflecting the constant increase in pfhrp2 deleted parasites. In these scenarios, it could be argued that the correct comparison would be to the average proportion of false-negative RDTs due to pfhrp2/3 gene deletions in the year after sampling, which reflects how many cases could be misdiagnosed between sampling rounds. Unfortunately, this comparison is difficult without knowing how the proportion of false-negative RDTs due to pfhrp2/3 gene deletions will change over time. However, we believe that it is more important to focus on the assumption that the strength of selection is negligible (see Figure 5). Our rationale for this is that it is only in areas with a low selective pressure, for which the frequency of pfhrp2/3 deletions is constant over time, that one could repeatedly make an incorrect decision with regards to whether to switch RDT (Figure 5A). In areas with a selective pressure, it is still possible to incorrectly estimate the annual average for the following year; however, the presence of the selective pressure is likely to cause any decision made to be simply premature as the frequency of pfhrp2/3 deletions and subsequently false-negative PfHRP2 RDTs will increase over time (Figure 5B).

Figure 5
The impact of an assumed selective pressure for pfhrp2/3-deleted parasites on the decision to switch RDT.

The graphs show two hypothetical scenarios, with two different regions shown in red and blue in each scenario. In (A) there are strong seasonal dynamics but no selective pressure. The absence of a selective pressure causes the mean proportion of false-negative RDTs due to pfhrp2/3 deletions over a 1 year period to be constant, which is shown with a horizontal dashed line. Consequently, there are time periods in which an incorrect decision to switch RDT could be made for the region in blue, and an incorrect decision to not switch RDT could be made for the region in red. In (B), there are both seasonal dynamics and a selective pressure, which results in an increasing annual mean proportion of false-negative RDTs due to pfhrp2/3 deletions over time. As in (A), there are periods in which the observed proportion of false-negative RDTs due to pfhrp2/3 deletions is both higher and lower than the rolling mean shown. However, decisions made in these periods are premature rather than definitively incorrect as the selection pressure would eventually cause the proportion to be greater than 5%.

https://doi.org/10.7554/eLife.40339.009

Lastly, it is important to note again that the true strength of selection is unknown. The precise strength of selection is dependent on a number of factors such as the magnitude of any fitness costs associated with pfhrp2 deletion, the degree to which microscopy-based diagnosis is used, the level of non-adherence to RDT results, the treatment coverage and the prevalence of malaria in the region considered. Consequently, our results should not be interpreted as precise predictions of how unrepresentative future samples may be. They should instead be used to support surveillance efforts and to reinforce the need for longitudinal measures conducted at the same point within a transmission season. In addition, we recommend that if possible, sample collection in highly seasonal regions should not occur at the beginning of the transmission season, as this is predicted to lead to premature decisions to switch RDT irrespective of the strength of selection. It will, however, be possible after the samples have been collected to estimate the likely frequency of pfhrp2 gene deletions by incorporating estimates of the multiplicity of infection within the sampled population. This frequency could then be used to estimate how the proportion of false-negative RDT results due to pfhrp2 deletions could increase in response to decreases in the prevalence of malaria.

In summary, our extended model predicts that highly seasonal dynamics in malaria transmission intensity will cause comparable dynamics in the observed proportion of false-negative RDT results due to pfhrp2 gene deletions. The observed proportion of false-negative RDTs due to pfhrp2 deletions is higher when monoclonal infections are more prevalent, with the highest prevalence observed when sampling at the start of the rainy season as individuals are less likely to already be infected. Similarly, the observed proportion of false-negative RDTs due to pfhrp2 deletions is higher in younger individuals who have lower clinical immunity, as they are more likely to present with clinical symptoms after their first infection event. As the rainy season progresses, individuals are more likely to be superinfected and acquire wild-type parasites, resulting in positive PfHRP2-based RDT results and a decrease in the observed proportion of false-negative RDTs due to pfhrp2 deletions. In response to these dynamics, it may be sensible for national malaria control programmes conducting surveillance for pfhrp2/3 deletions to choose a sampling interval towards the end of the transmission season, which is predicted to be most representative of the annual average proportion of false-negative RDTs due to pfhrp2 deletions. To support surveillance efforts, we have published an online database detailing the optimum sampling interval as well as the fluctuations throughout the transmission season for each administrative region.

Materials and methods

Extensions to the P. falciparum transmission model

In our previous publication (Watson et al., 2017), we presented an extended version of an individual-based model of malaria transmission to characterise the key drivers of pfhrp2 deletion selection; however, it did not capture seasonality. To address this, we incorporated seasonal variation in malaria transmission intensity through the inclusion of seasonal curves fitted to daily rainfall data available from the US Climate Prediction Center (National Weather Service Climate Prediction Center, 2010). Rainfall data was available at a 10 × 10 km spatial resolution from 2002 to 2009, with data missing for only two days. The data was subsequently aggregated to a series of 64 points per year, before Fourier analysis was conducted to capture the seasonal dynamics within this time period (Cairns et al., 2012). The first three frequencies of the resultant Fourier transformed data were used to generate a normalised seasonal curve. This inclusion alters the rate at which new adult mosquitoes are born, with the differential equation governing the susceptible adult stage of the mosquito population now given by:

\[
\frac{dS_M}{dt} = \theta_t \mu_M M_v - \mu_M S_M - \Lambda_M S_M
\]

where $\mu_M$ is the daily death rate of adult mosquitoes, $M_v$ is the total mosquito population, that is $S_M + E_M + I_M$, $\Lambda_M$ is the force of infection on the mosquito population and $\theta_t$ is the normalised seasonal curve, with a period equal to 365 days. The rest of the model equations remain the same as in our original study (Watson et al., 2017).
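The construction of the seasonal curve and its use in the susceptible mosquito equation can be sketched as follows. This is a minimal Python illustration rather than the published R implementation. It assumes that the "first three frequencies" are the first three non-zero harmonics added to the mean, that negative values are truncated at zero and that "normalised" means an annual mean of one. The function names and the synthetic rainfall series are hypothetical.

```python
import math

def seasonal_curve(rain, n_freq=3):
    """Build theta(t) from equally spaced rainfall values covering one year."""
    n = len(rain)
    mean = sum(rain) / n
    coeffs = []
    for k in range(1, n_freq + 1):
        a = 2 / n * sum(r * math.cos(2 * math.pi * k * i / n) for i, r in enumerate(rain))
        b = 2 / n * sum(r * math.sin(2 * math.pi * k * i / n) for i, r in enumerate(rain))
        coeffs.append((a, b))

    def raw(t_days):
        x = 2 * math.pi * (t_days % 365) / 365
        value = mean + sum(a * math.cos(k * x) + b * math.sin(k * x)
                           for k, (a, b) in enumerate(coeffs, start=1))
        return max(value, 0.0)  # assumption: truncate negative values at zero

    annual_mean = sum(raw(d) for d in range(365)) / 365
    if annual_mean <= 0:
        raise ValueError("rainfall series gives an empty seasonal curve")
    return lambda t_days: raw(t_days) / annual_mean  # assumption: mean of one

def dSM_dt(t, S_M, E_M, I_M, mu_M, lambda_M, theta):
    """Right-hand side of the susceptible adult mosquito equation."""
    M_v = S_M + E_M + I_M
    return theta(t) * mu_M * M_v - mu_M * S_M - lambda_M * S_M

# Synthetic 64-point rainfall series with a single annual peak (illustrative only).
rain = [1 + math.cos(2 * math.pi * i / 64) for i in range(64)]
theta = seasonal_curve(rain)
print(round(theta(0), 3), round(theta(91), 3))
```

With a mean of one, the seasonal term rescales the emergence of adult mosquitoes within the year without changing its annual average.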

All extensions to the previous model code have been made using the R language (RRID:SCR_001905) (R Development Core Team, 2016) and are available through an open source MIT licence at https://github.com/OJWatson/hrp2malaRia (Watson, 2019; copy archived at https://github.com/eLifeProduction/hrp2malaRia_2019). In addition, these extensions have been included in the pseudocode description of the model (Supplementary file 1).

Characterising the impact of seasonal transmission intensities upon the proportion of false-negative RDTs due to pfhrp2 gene deletions

The impact of seasonality was examined by recording the proportion of clinical incidence that would have been misdiagnosed due to pfhrp2 gene deletions across the year. This proportion was summarised at 12 8-week intervals, that is January – March, February – April, December – February. This proportion was recorded in both high and low seasonality settings, characterised by a Markham Seasonality Index = 80% and 10%, respectively (Cairns et al., 2015). These settings were examined at both low and moderate transmission intensity (EIR = 1 and 10 respectively), with the starting proportion of pfhrp2-deleted parasites in the whole population set equal to 6% in agreement with previous observations of pfhrp2 gene deletions in the DRC (Watson et al., 2017) The proportion of symptomatic cases seeking treatment was assumed to be 40% (fT = 0.4). In all simulations, 10 stochastic realisations of 100,000 individuals were simulated for 60 years to reach equilibrium first, before setting the frequency of pfhrp2 deletions. Initially, we assumed there was no assumed fitness cost or selective advantage associated with pfhrp2 gene deletion. This was modelled by assuming that individuals who are only infected with parasites with pfhrp2 gene deletions will still be treated. This decision allowed us to control for selection within our investigation by ensuring that the changes observed in the observation of PfHRP2-negative clinical cases are only due to seasonal variation in transmission intensity, and not due to an increase in the frequency of pfhrp2 gene deletions due to the selective advantage by evading diagnosis. As a result, when reporting the proportion of clinical cases that were misdiagnosed resulting from a false-negative PfHPR2-negative RDT we are reporting the proportion of cases that are infected with only pfhrp2-deleted parasites, that is individuals who would have been pfhrp2-negative and subsequently misdiagnosed. 
We also assume that 25% of individuals who are only infected with pfhrp2-deleted parasites will still be pfhrp2-positive due to the cross reactivity of PfHRP3 epitopes causing a positive PfHRP2-based RDT result (Baker et al., 2005).
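
The interval summary described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name, the input lists of daily counts, the choice to start each 8-week window on the first day of a calendar month, and the way the 25% cross-reactivity is applied as a simple multiplier are all assumptions made for the example.

```python
DAYS_IN_YEAR = 365
WINDOW_DAYS = 56          # 8 weeks
CROSS_REACTIVITY = 0.25   # deletion-only infections assumed still RDT-positive via PfHRP3

# Assumed window starts: day index of the first day of each month (non-leap year).
MONTH_STARTS = [0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334]


def false_negative_proportions(cases_per_day, deleted_only_per_day):
    """Return 12 proportions, one per 8-week window starting at each month.

    cases_per_day: clinical cases per day over one year (length 365).
    deleted_only_per_day: cases infected only with pfhrp2-deleted parasites.
    Windows starting late in the year wrap around into the following January.
    """
    proportions = []
    for start in MONTH_STARTS:
        days = [(start + offset) % DAYS_IN_YEAR for offset in range(WINDOW_DAYS)]
        total = sum(cases_per_day[d] for d in days)
        deleted = sum(deleted_only_per_day[d] for d in days)
        # Only the share not rescued by PfHRP3 cross-reactivity is a false negative.
        proportion = (1 - CROSS_REACTIVITY) * deleted / total if total else 0.0
        proportions.append(proportion)
    return proportions
```

With a constant 10 cases per day, of which 1 is deletion-only, every window returns 0.075; seasonal variation in the two inputs is what would make the 12 values differ.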

Model predictions were subsequently compared to data collected from the Democratic Republic of the Congo as part of its 2013–2014 Demographic and Health Survey (DHS). In overview, 7137 blood samples were collected from children under 5 years of age, of whom 2752 were diagnosed with P. falciparum infection by real-time PCR targeting the lactate dehydrogenase (pfldh) gene. The RDT barcodes for the 2752 samples were identified and matched to the DHS survey to identify both the age of the children and the date of sample collection. The collection date was used to predict the mean clinical incidence over the previous 30 days for each sample. This was estimated using the deterministic implementation of our model fitted to the observed PCR prevalence of malaria from the DRC DHS 2013–2014 survey (Meshnick et al., 2013), incorporating the seasonality and treatment coverage for each province. Children who were younger than the median age in the 2752 samples were grouped within a younger category. In addition, samples were classified as lower transmission if the clinical incidence of malaria in the month prior to sample collection was lower than the median clinical incidence. The counts of pfhrp2-negative samples within each group were subsequently compared using the Pearson chi-squared test with Rao-Scott corrections to account for the hierarchical survey design implemented within DHS surveys (Rao and Scott, 1984). Pearson chi-squared tests were also used in a similar analysis of samples collected from the Gash-Barka and Debub regions in Eritrea between 2013 and 2014, for which the dates of sample collection were made available to us (Menegon et al., 2017).
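
As a rough illustration of the group comparison, the sketch below computes an ordinary Pearson chi-squared statistic for a 2×2 table (for example, younger versus older children against pfhrp2-negative versus pfhrp2-positive). It deliberately omits the Rao-Scott correction for survey design, so it is not equivalent to the analysis described above; the function name is hypothetical and any counts passed to it here are made up for demonstration.

```python
import math


def pearson_chi_squared_2x2(table):
    """Pearson chi-squared test for a 2x2 table of counts (no survey correction).

    table: ((a, b), (c, d)), rows are groups and columns are outcomes.
    Returns (statistic, p_value) with 1 degree of freedom.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected

    # Survival function of a chi-squared variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Illustrative counts only; these are not data from the DHS or Eritrea analyses.
stat, p = pearson_chi_squared_2x2(((10, 20), (20, 10)))
```

A design-based analysis would additionally rescale the statistic using the survey's design effects, which requires the cluster and weight information held in the DHS data.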

Finally, the seasonal profiles for 598 first-level administrative regions across sub-Saharan Africa were used to characterise the potential for estimates of the proportion of false-negative PfHRP2 RDTs due to pfhrp2 gene deletions to be unrepresentative of the annual average. For each region, 100 simulation repetitions were conducted for 60 years to reach equilibrium first before fitting the frequency of pfhrp2 gene deletions in each simulation such that the annual average proportion of false-negative RDT results due to pfhrp2 deletions is equal to 5%. Each repetition was subsequently simulated for two further years, with 7300 individuals seeking treatment sampled from each 8-week interval. This number approximates the recommended sample size within the WHO protocol for pfhrp2 deletion prevalence at 5 ± 0.5%. For each sample, the proportion of false-negative PfHRP2-based RDTs due to pfhrp2 gene deletions was recorded. For each sample, a binomial confidence interval was calculated and the resultant percentage of intervals that did not include the annual prevalence of 5% was calculated. For each region, the number of 8-week intervals for which a premature decision to either swap from a PfHRP2-based RDT or continue using a PfHRP2-based RDT was made in more than 75% of simulations was recorded and mapped. The raw results of this analysis were subsequently used to create a database that details the optimum sampling intervals for estimating the annual proportion of false-negative RDT results due to pfhrp2 deletions.
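
The sample-size and interval check in this paragraph can be sketched as below. The text does not state which binomial interval was used, so the normal-approximation (Wald) interval here is an assumption, as are the function names. Under that assumption, a 95% interval of 5 ± 0.5% corresponds to a sample of 7300, which agrees with the figure quoted above.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def required_sample_size(prevalence, margin, z=Z_95):
    """Smallest n for which the Wald half-width is at most `margin`."""
    return math.ceil(z ** 2 * prevalence * (1 - prevalence) / margin ** 2)


def wald_interval(false_negatives, sample_size, z=Z_95):
    """Normal-approximation confidence interval for a binomial proportion."""
    p_hat = false_negatives / sample_size
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / sample_size)
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


def excludes(interval, value):
    """True if `value` lies outside the interval."""
    low, high = interval
    return value < low or value > high


n = required_sample_size(0.05, 0.005)          # 7300
on_target = wald_interval(365, n)              # sample at exactly 5%
early_season = wald_interval(438, n)           # hypothetical sample at 6%
```

In this sketch the 6% sample gives an interval lying wholly above 5%, so it would be counted among the intervals that did not include the annual value, whereas the 5% sample would not.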

Data availability

All data generated are provided within the online database, hosted through a shiny application at https://ojwatson.shinyapps.io/seasonal_hrp2/. The raw data for the application is available within the GitHub repository at https://github.com/OJWatson/hrp2malaRia (copy archived at https://github.com/eLifeProduction/hrp2malaRia_2019).


References

    1. Koita OA, Ndiaye J-L, Nwakanma D, Sangare L, Ndiaye D, Joof F (2013)
       Seasonal changes in the frequency of false negative rapid diagnostic tests based on histidine rich protein 2 (HRP2)
       The American Journal of Tropical Medicine and Hygiene 89:1.
    2. R Development Core Team (2016)
       R: A Language and Environment for Statistical Computing (software)
       R Foundation for Statistical Computing, Vienna, Austria.

Decision letter

  1. Ben Cooper
    Reviewing Editor; Mahidol Oxford Tropical Medicine Research Unit, Thailand
  2. Eduardo Franco
    Senior Editor; McGill University, Canada
  3. Ben Cooper
    Reviewer; Mahidol Oxford Tropical Medicine Research Unit, Thailand
  4. Elena Gómez-Díaz
    Reviewer; Doñana Biological Station (EBD-CSIC), Spain
  5. Penelope Anne Lynch
    Reviewer; University of Exeter Cornwall Campus, United Kingdom

In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.

Thank you for submitting your article "The impact of seasonal variations in Plasmodium falciparum malaria transmission on the surveillance of pfhrp2 gene deletions" for consideration by eLife. Your article has been reviewed by three peer reviewers, including Ben Cooper as the Reviewing Editor and Reviewer #1, and the evaluation has been overseen by Prabhat Jha as the Senior Editor. The following individuals involved in review of your submission have agreed to reveal their identity: Elena Gómez-Díaz (Reviewer #2); Penelope Anne Lynch (Reviewer #3).

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

Summary:

This work is a research advance that builds on a previous paper by the same group, which considered the selection pressure for deletions to the pfhrp2 gene exerted by the widespread use of rapid diagnostic tests for malaria in sub-Saharan Africa. Such deletions can lead to false-negative test results. Subsequent to the previous publication, the World Health Organization has produced guidance for the investigation of suspected false-negative diagnostic test results due to pfhrp2/3 deletions. However, this guidance says nothing about the recommended timing of such investigations. The current work uses an extension of the model-based analysis to show that seasonal variation in malaria transmission can lead to substantial biases in estimates of the prevalence of pfhrp2/3 deletions, leading to poor choices of rapid diagnostic test. The risk of this sampling bias is mapped by region, and optimum sampling intervals are proposed.

Essential revisions:

All reviewers thought that the work was important and conducted to a high standard and should be published if some essential revisions are made. These revisions are needed primarily to improve the clarity of the work and in some cases to extend the Discussion to consider other factors that might be important (see individual reviews below). Comments marked with a * below should be considered discretionary revisions. In particular, though non-essential, it was felt that an additional figure might help to clarify the relationship between monoclonal/multiclonal infections and pfhrp2 deletions prevalence and selection. An important part of these clarifications is provision of pseudo code for the revised model, just to document exactly how the updated DDEs shown in the manuscript are incorporated into the simulation model.

Reviewer #1:

This seems to be a useful research advance that addresses an important policy question using a model described in a previous eLife paper. The work is well-motivated and clearly described.

Reviewer #2:

This is a research advance upon a previous study (Watson et al., 2017).

In the previous article, authors modeled the potential for RDT-led diagnosis to drive selection of pfhrp2-deleted parasites. In the present work, authors extend the model so it now considers the impact of transmission intensity and seasonality on the prevalence of pfhrp2 gene deletions. They found that regions with low transmissibility and high seasonality are those with higher number of false negatives (higher prevalence of pfhrp2 deletions). They also show that this bias is stronger in young children.

The article is clearly written, the figures are very illustrative, and the new data support the conclusions. The new findings are significant. The data provided represents an important resource for the community.

- The extended analysis focuses on seasonality and transmission intensity. I wonder about other possible causes of RDT misdiagnosis. For example, the work seems to focus only on the clinical cases. What are the dynamics expected for pfhrp2 deletions in asymptomatic infections? This is important because asymptomatic malaria significantly impacts transmission dynamics and asymptomatic infections show seasonality.

- The study models pfhrp2 deletions, but no consideration is given to the effect of the type of treatment driving selection. Might there be temporal and spatial variability in this regard that has not been considered?

*- The link between transmission intensity and multiplicity of infections is clear. However, I find confusing the relationship between monoclonal/multiclonal infections and pfhrp2 deletions prevalence and selection. I think this should be elaborated further and possibly modeled?

- Previous studies indicated that PfHRP3 may play a role in the performance of PfHRP2-based RDTs. Do the authors have data on pfhrp3? Apart from pfhrp2 deletions, could other sequence differences contribute to lower sensitivity of RDTs?

Reviewer #3:

This paper provides novel insights into an issue of practical public health importance. The results are interesting, and deserve to be disseminated and understood. In order to achieve this fully, the paper would benefit from greater clarity in some areas. Elements of the story which are perhaps viewed as self-evident by the authors may not be self-evident to readers, and are key to interpreting the paper and its results.

This paper adds seasonal variation to an individual-based model simulating prevalence of pfhrp2-del strains and false negative results in a population over time. I have not attempted to check the original model, but the amendment shown in the current paper seems correct. Could the authors provide an updated version of the pseudo-code documentation reflecting the updates?

WHO guidelines recommend a transition from HRP2-based RDTs to alternatives when the prevalence of false-negatives due to pfhrp2 deletion exceeds 5%, and specify survey protocols to test for this. This paper focusses on potential biases in the survey results arising from variation due to effects of seasonality and transmission intensity. Since it is central to the paper's premise, a brief explanation of the WHO survey protocol is needed, with an explicit explanation of the links between the simulation outputs and the values measured in the protocol. I think the relevant values are all present in the paper, but their meaning and relationships could be more clearly explained.

Can the authors clarify the basis on which the 5% threshold value was selected by WHO? The bias discussed in the paper may have different implications depending on whether the key comparator is the underlying prevalence of pfhrp2/3 gene deletions or the annual average proportion of pfhrp2/3-del false negatives. Is there any potential to add some discussion about the implications of this study for the WHO threshold value, for example whether specific values could be specified for particular seasonality and transmission-intensity contexts?

The text regarding assumptions about selection and fitness (copied below) is confusing. False negative RDTs and consequent treatment choices reflected in the model will inherently exert selection, which seems to conflict with statements in the text.

'Additionally, there was no assumed fitness cost or selective advantage associated with pfhrp2 gene deletion, i.e. individuals who are only infected with parasites with pfhrp2 gene deletions are assumed to yield a false-negative RDT result. This decision allowed us to control for selection within our investigation. This ensures that the dynamics observed are only due to seasonal variation in transmission intensity, and not due to an increase in the frequency of pfhrp2 gene deletions due to a selective advantage by evading diagnosis. As a result, when reporting individuals who are pfhrp2-negative we assume that 25% of individuals who are only infected with pfhrp2-deleted parasites will still be pfhrp2-positive due to the cross reactivity of PfHRP3 epitopes causing a positive PfHRP2-based RDT result.'

[Editors' note: further revisions were requested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "Impact of seasonal variations in Plasmodium falciparum malaria transmission on the surveillance of pfhrp2 gene deletions" for further consideration at eLife. Your revised article has been favorably evaluated by Prabhat Jha as the Senior Editor, a Reviewing Editor, and two reviewers.

The manuscript has been improved but there are some remaining issues that need to be addressed before acceptance, as outlined below:

As you know, there was some confusion in this paper, as the original submission indicated that the model did account for selection for pfhrp2 mutants, but the subsequent correspondence indicated that the model didn't.

While we understand that there is some value in considering the situation where the frequency of pfhrp2 deletions is not affected by selective forces (i.e. delayed treatment), clearly selective forces are likely to be acting in most settings and following consultation the consensus was that completely removing this real-world effect from the model was hard to justify. Therefore in addition to the analysis that has been done, the authors should add additional work where they do what they originally said they had done i.e. including a model where there is a selective advantage for pfhrp2 deletion changes/mutants as originally indicated.

The authors appear to be assuming that the intended meaning of the 5% threshold is the average proportion of HRP2 RDT results for patients infected with P. falciparum which are false negatives caused by pfhrp2/3 gene deletions, during a given year. The Discussion and conclusion then focus on differences between the prevalence of false negatives at specific timepoints during a year vs the average prevalence value over the year. It would hugely improve the clarity of the paper to state this assumption explicitly and early in the text. It is also necessary to demonstrate using information in the WHO documentation that this is in fact the intended definition of the WHO 5% threshold. Without unequivocal evidence that this is the precise meaning of the threshold value intended by the WHO, then the use of terms such as 'bias', 'overestimate', 'underestimate' etc. is unjustified throughout.

If it is not clear that the threshold is defined as an annual average, then the paper's message needs to change slightly. By indicating the extent to which the prevalence of false negatives can vary seasonally, even when the prevalence of gene deletions is constant, the results presented here indicate that a conscious choice about this aspect of the definition is very important. Should the threshold represent the acceptable maximum prevalence of false negatives, or should it be the annual average? In either case, the results can inform strategies for applying the protocol in ways most likely to identify the required value.

Because of the extensive nature of the requested revisions and clarifications which cannot easily be summarized, more extensive comments from both reviewers are appended below. All substantive points should be addressed satisfactorily as we are unable to extend the review process beyond this next revision.

Reviewer #2:

The manuscript has improved and authors addressed most of my comments satisfactorily. I have however a few additional comments on the revised manuscript and rebuttal letter which I feel would require additional clarification.

- Exceptionally unhelpfully, the use of "false-negative" should be "positive" here. We carried out all our simulations with the assumption that individuals who are only infected with pfhrp2 gene deleted parasites will still be treated. As such, the gene deleted parasites behave exactly the same as the wild type parasites.

I am afraid I don't fully follow this reasoning. My understanding is that the motivation of the study was that pfhrp2 gene deleted parasites could indeed be misdiagnosed, and so simulations should treat them as false negatives (Introduction, second paragraph). If simulations treat those as positive, how could the model effectively estimate the rate of misdiagnosis and the seasonality in such an estimate? Might I have missed something?

Besides, I don't think that the authors have actually addressed the real concern that came with their original consideration that false negative RDT pfhrp2 deleted parasites would allow them to control for selection.

- Related to the same issue above, and in response to my comment, authors replied:

"We do not include a selective advantage to pfhrp2 gene deletion (apologies again for the error mentioned at the beginning of our response) and so we would not expect there to see a temporal variability in the selection pressure. If we did consider this then there would definitely be a temporal element, with the increase in the absolute number of people who seek treatment (we assume a constant proportion of people with a malarial fever seek treatment) during periods of higher transmission causing an increase in the prevalence of the pfhrp2 gene deletion. It was because of this reason that we decided not to model selection, so that we could exclude this effect of selection and be more confident that the dynamics seen are due to the fluctuations in individuals being only infected with pfhrp2 deleted parasites."

The selective advantage comes with pfhrp2 gene deletion individuals being misdiagnosed and not getting treatment. If you consider those as positive you remove selection but this is not reflecting any more the reality of the situation.

- About the relationship between monoclonal/multiclonal infections and pfhrp2 deletions prevalence and selection.

I thank the authors for including a supplementary figure, but would it be possible to clarify the relationships further in the text? Saying that the relationship is unclear is not of much help.

- About my comment: "'The regions identified were areas with both a low prevalence of malaria and a high frequency of people seeking…' Were these the only factors?"

To which authors responded "These were the only factors we looked at within our modelling study".

I don't find this reply satisfactory. I know they modelled only those, but my comment was more a recommendation that it be acknowledged somewhere in the Introduction or the Discussion whether there could be other factors that have not been considered and have been shown or suggested to influence the misdiagnoses.

Reviewer #3:

The authors' clarifications make sense and are helpful. However, my improved understanding of the authors' intentions and the results and conclusions presented in the paper has generated some additional questions and comments. I still feel that the paper would benefit from greater clarity.

My understanding is that the key values being considered are:

1) The proportion of HRP2 RDT results for patients infected with P. falciparum which are false negatives caused by pfhrp2/3 gene deletions at a given timepoint.

2) The average proportion of HRP2 RDT results for patients infected with P. falciparum which are false negatives caused by pfhrp2/3 gene deletions, during a given year.

3) The proportion of P. falciparum parasites in a given region which have pfhrp2/3 gene deletions.

4) The 5% threshold in the WHO guidelines.

It would be incredibly helpful if the authors could provide a precise definition for this, as the various wordings I have found so far in the WHO protocol and information note are open to interpretation regarding whether the 5% is intended to represent: a) The proportion of HRP2 RDT results for patients infected with P. falciparum which are false negatives caused by pfhrp2/3 gene deletions; or b) The proportion of P. falciparum parasites in a given region which have pfhrp2/3 gene deletions.

Part of a full definition for this value is the assumed timing. A quick review of the WHO documentation does not immediately yield any specific information about assumed timings, an absence which would be consistent with an assumption that the rate is effectively constant through a season, or might equally mean that the relevant value is that at the time of sampling.

In the paper, the authors appear to be assuming that the intended meaning of the 5% threshold is the average proportion of HRP2 RDT results for patients infected with P. falciparum which are false negatives caused by pfhrp2/3 gene deletions, during a given year (item 2 in the list above). The Discussion and conclusion then focus on differences between the prevalence of false negatives at specific timepoints during a year vs the average prevalence value over the year. It would hugely improve the clarity of the paper to state this assumption explicitly and early in the text. It is also necessary to demonstrate using information in the WHO documentation that this is in fact the intended definition of the WHO 5% threshold. Without unequivocal evidence that this is the precise meaning of the threshold value intended by the WHO, then the use of terms such as 'bias', 'overestimate', 'underestimate' etc. is unjustified throughout.

If it is not clear that the threshold is defined as an annual average, then the paper's message needs to change slightly. By indicating the extent to which the prevalence of false negatives can vary seasonally, even when the prevalence of gene deletions is constant, the results presented here indicate that a conscious choice about this aspect of the definition is very important. Should the threshold represent the acceptable maximum prevalence of false negatives, or should it be the annual average? In either case, the results can inform strategies for applying the protocol in ways most likely to identify the required value.

There is also some confusion in the text between the prevalence of false-negative results and the prevalence of the gene deletion, with the text referring to a change of RDT being triggered by an incorrect assessment of the prevalence of gene deletions (e.g. Introduction, fourth paragraph), suggesting that the authors may in fact be defining the threshold value as equal to value 3 in the list above.

These are key to the meaning and the implications of the work presented here, and clarity about what is being assumed or referred to is crucial to allow the text to tell its story clearly, and to make it easy to assess the consistency of that story. Confusing references to different prevalence values in the text should be reviewed and resolved wherever they arise throughout the text, including some specific instances detailed below.

Detailed comments:

Introduction, third and fourth paragraphs: In the third paragraph of the Introduction the authors give a definition of the WHO threshold value as being the prevalence of false negatives caused by pfhrp2/3 gene deletions. However, in the fourth paragraph of the Introduction they suggest that incorrect assessment of the prevalence of pfhrp2/3 gene deletions could drive the decision to switch to non HRP2 RDTs. Is there another mechanism in the WHO guideline in addition to the 5% false negatives threshold which would drive a change of policy based on gene deletion prevalence rather than false negative RDT prevalence?

'The protocol in this guidance details how to estimate the local prevalence of false-negative PfHRP2-based RDTs due to pfhrp2/3 gene deletions and recommends that a national change to non PfHRP2-based RDTs be made if the estimated prevalence is above 5%.'

'the timing of the 8-week interval chosen within a transmission season could lead to bias in the sampled prevalence of pfhrp2/3 gene deletions. An overestimation of the true prevalence of pfhrp2/3 gene deletions could result in a switch to a less sensitive RDT'

Results, first paragraph and similar elsewhere in text: 'In a moderate transmission setting, a clear seasonal pattern is predicted (Figure 2C), with sampling at the beginning of the transmission seasons resulting in significant overestimation of the true proportion of false negative RDTs..'

'true' is not adequately defined to be used here in this way. It might legitimately be assumed to mean the population prevalence of false-negative RDTs at the time of sampling. What is meant here, I think, is that sampling at the beginning of the transmission season is expected to give a value higher than the true average value for the year.

Introduction, last paragraph, Figure 4 description and title, Results, last paragraph.

Introduction, last paragraph and figure description indicate that the values used to generate Figure 4 are the gene deletion prevalences

Results, last paragraph and implication of contents of plot indicate that the plot is based on prevalences of false negative values.

Results, first and last paragraph and Discussion, first paragraph and similar elsewhere in text – 'biased' and 'unbiased' are mathematical terms with specific meanings and it is not clear that those meanings are correctly applied here and elsewhere in the text. It would be better to replace them with other terms unless the mathematical meaning is genuinely indicated.

Discussion, first and second paragraphs. These paragraphs both begin by describing the research presented in the manuscript as relating to estimates of prevalence of pfhrp2 gene deletions. The remaining text all seems to actually describe the results regarding the prevalence of false-negative HRP2 RDT results, but the first sentences mean that it all reads as discussion of the gene deletion prevalence.

'This research characterises the potential for surveillance in highly seasonal areas within sub-Saharan Africa to produce biased estimates of the prevalence of pfhrp2 gene deletions. These findings highlight the impact of both the seasonal timing and…'

'Our modelling predicted that there would be increased observation of pfhrp2 gene deletions after periods of lower transmission and within younger individuals…'

Discussion, first, third and fourth paragraphs. 'However, the true prevalence of parasites with a pfhrp2 gene deletion in each administrative region is fundamentally unknown, and as such, our results should not be interpreted as predictions of the bias in future sampled estimates of pfhrp2 deletion. They should instead be used to support surveillance efforts and to reinforce the need for longitudinal measures of pfhrp2 gene deletions conducted at the same point with a transmission season.'

Is this compatible with the database mentioned in the Discussion? 'To support surveillance efforts, we have published an online database detailing the optimum sampling interval as well as the sampling bias throughout the transmission season for each administrative region'

'The observed prevalence of pfhrp2 deletions is higher when monoclonal infections are more prevalent, with the highest prevalence observed when sampling at the start of the rainy season as individuals are less likely to already be infected. Similarly, the observed prevalence of pfhrp2 deletions is higher in younger individuals who have lower clinical immunity, as they are more likely to present with clinical symptoms after their first infection event.'

Should these two references be to prevalence of false negatives rather than prevalence of pfhrp2 deletions?

Discussion, last paragraph. This seems to be simply repeating contents of first paragraph of Discussion?

Subsection “Characterising the impact of seasonal transmission intensities upon pfhrp2 deletion prevalence”, last paragraph. '…fitting the frequency of pfhrp2 gene deletions in each simulation such that the true prevalence of false-negative RDT results due to pfhrp2 deletions is equal to 5%.'

'.. percentage of intervals that did not include the true prevalence of 5% was calculated.'

'true' not adequately defined, should simply say '.. the average annual prevalence..' or similar.

Figure 1 legend. '..In I – L and M – P the proportion of clinical cases due to pfhrp2-negative parasites is shown for both the whole population and..'

Wording is confusing, does this mean cases infected only with pfhrp2-negative parasites?

'…the population allele frequency of pfhrp2 gene deletions, which was set equal to 6% at the beginning of each simulation..'

Is the reason for or significance of the 6% value given anywhere?

'…10 simulation realisations are shown in each graph, with the mean shown with the thicker line. Lastly, the 5% threshold for switching RDT provided by the WHO is shown with the black line in plots I – P…'

I think the means are shown by the black line, and the 5% by the dashed horizontal line?

Figure 3 legend. Should '..age and seasonality..' be '..age and transmission intensity..'?

Figure 4 legend, description and title. '..pfhrp2 deletion..' should be '..false-negative pfhrp2 RDTs?..'

Should also be revised as necessary to reflect assumed exact definition of threshold value.

Pseudo code, second line 048

// Loop through every day in simulation and calculate the seasonal curve for that day
045 FOR day := 1 TO t_max // t_max is total simulation time in days
046   theta[day] := Fourier_average + first_cosine_term * cos(2*pi*day/365) + second_cosine_term * cos(2*2*pi*day/365) + third_cosine_term * cos(3*2*pi*day/365) + first_sine_term * sin(2*pi*day/365) + second_sine_term * sin(2*2*pi*day/365) + third_sine_term * sin(3*2*pi*day/365)
047 ENDFOR

// Loop through every day in simulation and normalise seasonal curve for that day
048 FOR day := 1 TO t_max // t_max is total simulation time in days
048   theta[day] := theta[day] / mean(theta[1 TO 365]) // normalise theta with first 365 days of theta
049   IF (theta[day] < 0.001) // with only 1st 3 terms of Fourier used we need to check for <0
050     theta[day] := 0.001
051   ENDIF
052 ENDFOR

I'm assuming this is just a problem with the pseudo code, not the actual code, but that should be checked and confirmed. It seems that in the normalisation loop, the sum of theta values by which θ(n) is divided will use the normalised rather than original values for all θ(<n).

Could the authors please review the pseudo code for consistency with the actual code?
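
One way to avoid the issue raised here is to compute the first-year mean once, from the raw curve, before any value is overwritten. The Python sketch below illustrates that ordering only; it is not the authors' code, and the function name, argument names and the tuple-based coefficient layout are assumptions chosen to mirror the pseudo code.

```python
import math


def seasonal_curve(t_max, fourier_average, cosine_terms, sine_terms, floor=0.001):
    """Daily seasonal multiplier theta for days 1..t_max.

    cosine_terms / sine_terms: coefficients of the first three harmonics.
    The curve is divided by the mean of its first 365 raw values, then any
    value below `floor` is raised to `floor`, as in the pseudo code.
    """
    raw = []
    for day in range(1, t_max + 1):
        angle = 2 * math.pi * day / 365
        value = fourier_average
        for k, (c, s) in enumerate(zip(cosine_terms, sine_terms), start=1):
            value += c * math.cos(k * angle) + s * math.sin(k * angle)
        raw.append(value)

    # Mean taken once from the untouched raw values, so earlier days are
    # never normalised twice.
    first_year = raw[:365]
    first_year_mean = sum(first_year) / len(first_year)

    return [max(value / first_year_mean, floor) for value in raw]
```

Because the divisor is fixed before the second pass, every day is scaled by the same constant, which is presumably the behaviour the pseudo code intends.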

https://doi.org/10.7554/eLife.40339.016

Author response

Essential revisions:

All reviewers thought that the work was important and conducted to a high standard and should be published if some essential revisions are made. These revisions are needed primarily to improve the clarity of the work and in some cases to extend the Discussion to consider other factors that might be important (see individual reviews below). Comments marked with a * below should be considered discretionary revisions. In particular, though non-essential, it was felt that an additional figure might help to clarify the relationship between monoclonal/multiclonal infections and pfhrp2 deletions prevalence and selection. An important part of these clarifications is provision of pseudo code for the revised model, just to document exactly how the updated DDEs shown in the manuscript are incorporated into the simulation model.

Thank you to all the reviewers for their thoughtful comments and kind words. We have responded to all the comments below, and we would like to make one clarification initially here as it was picked up by all three reviewers. There was an error in the following:

“Additionally, there was no assumed fitness cost or selective advantage associated with pfhrp2 gene deletion, i.e. individuals who are only infected with parasites with pfhrp2 gene deletions are assumed to yield a false-negative RDT result. This decision allowed us to control for selection within our investigation.”

Exceptionally unhelpfully, the use of “false-negative” should be “positive” here. We carried out all our simulations with the assumption that individuals who are only infected with pfhrp2 gene deleted parasites will still be treated. As such, the gene deleted parasites behave exactly the same as the wild type parasites.

The reason for simulating it this way was so that we could see the impact of seasonality on the appearance of individuals that are only infected with pfhrp2-deleted parasites, i.e. those that would be misdiagnosed. We knew from the original study that at low transmission there is a strong selection pressure in favour of pfhrp2-deleted parasites. Consequently, over the timespan we consider in the simulations, it is likely that the frequency of pfhrp2 gene deletions would have increased. If the prevalence of the gene deletion were substantially higher at the end of a transmission season, it would have been less clear how the dynamics of individuals only infected with pfhrp2-deleted parasites change throughout that season. Our apologies for the confusion and thank you for picking up on it. This section now reads as follows:

“Additionally, there was no assumed fitness cost or selective advantage associated with pfhrp2 gene deletion. This was modelled by assuming that individuals who are only infected with parasites with pfhrp2 gene deletions will still be treated. This decision allowed us to control for selection within our investigation by ensuring that the changes observed in the observation of PfHRP2-negative clinical cases are only due to seasonal variation in transmission intensity, and not due to an increase in the frequency of pfhrp2 gene deletions due to the selective advantage by evading diagnosis.”

Reviewer #2:

This is a research advance upon a previous study (Watson et al., 2017).

In the previous article, authors modeled the potential for RDT-led diagnosis to drive selection of pfhrp2-deleted parasites. In the present work, authors extend the model so it now considers the impact of transmission intensity and seasonality on the prevalence of pfhrp2 gene deletions. They found that regions with low transmissibility and high seasonality are those with higher number of false negatives (higher prevalence of pfhrp2 deletions). They also show that this bias is stronger in young children.

The article is clearly written, the figures are very illustrative, and the new data support the conclusions. The new findings are significant. The data provided represents an important resource for the community.

- The extended analysis focuses on seasonality and transmission intensity. I wonder about other possible causes of RDT misdiagnosis. For example, the work seems to focus only on clinical cases. What are the dynamics expected for pfhrp2 deletions in asymptomatic individuals? This is important because asymptomatic malaria significantly impacts transmission dynamics and asymptomatic infections show seasonality.

We agree that considering asymptomatic individuals is very important, especially as they are the major driver of onwards transmission. Additionally, understanding the dynamics in asymptomatics will be useful in any planned community surveillance or reactive case detection.

The focus on the clinical cases within the analysis was initially chosen so that it aligned with the population of individuals that would be sampled within the WHO protocol. However, we did also look at the dynamics within the asymptomatics when conducting the analysis for Figure 2 and found very similar patterns. This was encouraging, as the data from the DRC that we used to check whether our model predictions agreed with real data were taken mostly from asymptomatic individuals. We have, though, included an additional plot, Figure 1—figure supplement 1, which looks at the mean proportion of asymptomatic infections that are:

1) Only infected with pfhrp2-deleted parasites

2) Only infected with wild type parasites

3) Infected with both pfhrp2-deleted and wild type parasites

This plot also considers these proportions within clinical cases as well, which will hopefully clarify a later point about the relationship between monoclonal/multiclonal infections.

- The study models pfhrp2 deletions but no consideration is given to the effect of the type of treatment driving selection. Might there be temporal and spatial variability in this regard that has not been considered?

We do not include a selective advantage to pfhrp2 gene deletion (apologies again for the error mentioned at the beginning of our response) and so we would not expect there to see a temporal variability in the selection pressure. If we did consider this then there would definitely be a temporal element, with the increase in the absolute number of people who seek treatment (we assume a constant proportion of people with a malarial fever seek treatment) during periods of higher transmission causing an increase in the prevalence of the pfhrp2 gene deletion. It was because of this reason that we decided not to model selection, so that we could exclude this effect of selection and be more confident that the dynamics seen are due to the fluctuations in individuals being only infected with pfhrp2 deleted parasites.

*- The link between transmission intensity and multiplicity of infections is clear. However, I find confusing the relationship between monoclonal/multiclonal infections and pfhrp2 deletions prevalence and selection. I think this should be elaborated further and possibly modeled?

We agree that the relationship between monoclonal/multiclonal infections is unclear, and so we hope Figure 1—figure supplement 1 described above helps clarify this.

- Previous studies indicated that PfHRP3 may play a role in the performance of PfHRP2-based RDTs. Do the authors have data on pfhrp3? Apart from pfhrp2 deletions, could other sequence differences contribute to lower sensitivity of RDTs?

Attempts to quantify the role that PfHRP3 plays in yielding a positive RDT result have been made previously, and within our modelling we assume that a positive RDT will be produced in 25% of cases due to PfHRP3 cross-reactivity, a value we sourced from Baker et al., 2005. We make reference to this study in the original paper, but not in this paper, and so we have added this reference to the end of the first paragraph under the “Characterising the impact of seasonal transmission intensities upon pfhrp2 deletion prevalence” section of the Materials and methods.

The latter point is, however, particularly interesting as we do believe that a full deletion of the pfhrp2 gene is not essential in reducing the detection sensitivity of a PfHRP2-based RDT. The study by Baker et al. listed above looks into the impact of amino acid repeats within pfhrp2 on the detection sensitivity of PfHRP2-based RDTs. It would definitely be of interest, however outside the scope of this study, to create an updated map of the pfhrp2/3 genetic variants and compare it to reports of RDT performance within sub-Saharan Africa.

Reviewer #3:

This paper provides novel insights into an issue of practical public health importance. The results are interesting, and deserve to be disseminated and understood. In order to achieve this fully, the paper would benefit from greater clarity in some areas. Elements of the story which are perhaps viewed as self-evident by the authors may not be self-evident to readers, and are key to interpreting the paper and its results.

This paper adds seasonal variation to an individual-based model simulating prevalence of pfhrp2-del strains and false negative results in a population over time. I have not attempted to check the original model, but the amendment shown in the current paper seems correct. Could the authors provide an updated version of the pseudo-code documentation reflecting the updates?

We have provided an updated version of the pseudocode to reflect the updates made.

WHO guidelines recommend a transition from HRP2-based RDTs to alternatives when the prevalence of false-negatives due to pfhrp2 deletion exceeds 5%, and specify survey protocols to test for this. This paper focusses on potential biases in the survey results arising from variation due to effects of seasonality and transmission intensity. Since it is central to the paper's premise, a brief explanation of the WHO survey protocol is needed, with an explicit explanation of the links between the simulation outputs and the values measured in the protocol. I think the relevant values are all present in the paper, but their meaning and relationships could be more clearly explained.

The overview of the WHO survey protocol is given within the now lengthened third paragraph of the Introduction. We have also added further clarification in the last paragraph of the Materials and methods about how our model outputs represent the population that will be sampled as part of the WHO survey protocol:

“Each repetition was subsequently simulated for 2 further years, with 7,300 individuals seeking treatment sampled from each 8-week interval. […] For each sample the proportion of false-negative PfHRP2-based RDTs due to pfhrp2/3 gene deletions was recorded.”

Can the authors clarify the basis on which the 5% threshold value was selected by WHO? The bias discussed in the paper may have different implications depending on whether the key comparator is the underlying prevalence of pfhrp2/3 gene deletions or the annual average proportion of pfhrp2/3-del false negatives. Is there any potential to add some discussion about the implications of this study for the WHO threshold value, for example whether specific values could be specified for particular seasonality and transmission-intensity contexts?

The 5% threshold was calculated during the design of the protocol, and represents the difference in sensitivity between HRP2-based RDTs and non-HRP2-based RDTs. The justification is listed as follows in the WHO technical protocol:

“A threshold of 5% was selected because it is somewhere around this point that the proportion of cases missed by HRP2 RDTs due to non-hrp2 expression may be greater than the proportion of cases that would be missed by less-sensitive pLDH-based RDTs”

We agree, however, that there is a difference between the underlying prevalence of the gene deletion and the annual average proportion of false negative RDTs due to pfhrp2/3 deletions. Ideally, samples that are collected as part of the WHO protocol would also be assessed for their multiplicity of infection, and then we would also be able to work out the true deletion frequency for a given region. We could then predict at what transmission intensity that region would expect to see 5% of RDTs yielding false negatives due to pfhrp2/3 deletions. Unfortunately, this would require a substantially larger amount of lab work, given the number of samples that are already likely to be collected as part of the survey.

We have added some discussion about this to the end of the third paragraph of the Discussion as follows:

“It will, however, be possible after the samples have been collected to estimate the likely frequency of pfhrp2 gene deletions by incorporating estimates of the multiplicity of infection within the sampled population. This frequency could then be used to predict the bias in future estimates, as well as estimating how the prevalence of false-negative RDT results due to pfhrp2/3 gene deletions will change if the prevalence of malaria changes.”

The text regarding assumptions about selection and fitness (copied below) is confusing. False negative RDTs and consequent treatment choices reflected in the model will inherently exert selection, which seems to conflict with statements in the text.

'Additionally, there was no assumed fitness cost or selective advantage associated with pfhrp2 gene deletion, i.e. individuals who are only infected with parasites with pfhrp2 gene deletions are assumed to yield a false-negative RDT result. This decision allowed us to control for selection within our investigation. This ensures that the dynamics observed are only due to seasonal variation in transmission intensity, and not due to an increase in the frequency of pfhrp2 gene deletions due to a selective advantage by evading diagnosis. As a result, when reporting individuals who are pfhrp2-negative we assume that 25% of individuals who are only infected with pfhrp2-deleted parasites will still be pfhrp2-positive due to the cross reactivity of PfHRP3 epitopes causing a positive PfHRP2-based RDT result.'

Our apologies again for this error. This has been addressed in the opening section of our response.

[Editors' note: further revisions were requested prior to acceptance, as described below.]

The manuscript has been improved but there are some remaining issues that need to be addressed before acceptance, as outlined below:

As you know, there was some confusion in this paper, as the original submission indicated that the model did account for selection for pfhrp2 mutants, but the subsequent correspondence indicated that the model didn't.

While we understand that there is some value in considering the situation where the frequency of pfhrp2 deletions is not affected by selective forces (i.e. delayed treatment), clearly selective forces are likely to be acting in most settings and following consultation the consensus was that completely removing this real-world effect from the model was hard to justify. Therefore in addition to the analysis that has been done, the authors should add additional work where they do what they originally said they had done i.e. including a model where there is a selective advantage for pfhrp2 deletion changes/mutants as originally indicated.

The authors appear to be assuming that the intended meaning of the 5% threshold is the average proportion of HRP2 RDT results for patients infected with P. falciparum which are false negatives caused by pfhrp2/3 gene deletions, during a given year. The Discussion and conclusion then focus on differences between the prevalence of false negatives at specific timepoints during a year vs the average prevalence value over the year. It would hugely improve the clarity of the paper to state this assumption explicitly and early in the text. It is also necessary to demonstrate using information in the WHO documentation that this is in fact the intended definition of the WHO 5% threshold. Without unequivocal evidence that this is the precise meaning of the threshold value intended by the WHO, then the use of terms such as 'bias', 'overestimate', 'underestimate' etc. is unjustified throughout.

If it is not clear that the threshold is defined as an annual average, then the paper's message needs to change slightly. By indicating the extent to which the prevalence of false negatives can vary seasonally, even when the prevalence of gene deletions is constant, the results presented here indicate that a conscious choice about this aspect of the definition is very important. Should the threshold represent the acceptable maximum prevalence of false negatives, or should it be the annual average? In either case, the results can inform strategies for applying the protocol in ways most likely to identify the required value.

Thank you to the reviewers and the editor for the reviews. The two major comments concerning selection and the decision to use an annual average measure are useful and, we agree, important to consider. In response we have conducted the model simulations again with the assumed strength of selection used within the original study. These predictions have been included as new supplementary figures (Figure 1—figure supplement 2, Figure 2—figure supplement 1). We have also more clearly detailed why we are considering an annual average earlier in the Introduction, and have added two additional paragraphs in the Discussion exploring these comments. As the reviewers have pointed out, the annual average proportion of false-negative RDTs due to pfhrp2/3 deletions will only be constant over time if the frequency of pfhrp2/3 deletions is constant, i.e. under negligible selection pressure. Because the reviewers’ two major comments are connected (the decision to keep the frequency of pfhrp2 deletions constant over time and the comparison of samples collected within time periods to the annual average), we wanted to address them together here before responding further to specific comments.

One reason for not wanting to explore selection was that this would lead to an increase in pfhrp2 deletion, which would make it harder to observe patterns solely due to seasonal changes in transmission intensity and not due to the impact of a selective pressure. However, the true strength of any selective pressure is unknown, as raised by reviewer 3, and there are many factors that could affect how frequently individuals are only infected by pfhrp2-deleted parasites. In addition, the technical guidance seeks to identify regions with a sufficiently high frequency of pfhrp2/3 deletions causing false-negative RDTs (above 5% PfHRP2-negative RDTs due to pfhrp2/3 gene deletions), as these regions will have an increase in misdiagnosed cases compared to if a non-HRP2 RDT was used. Decisions made in light of these findings are made irrespective of any assumptions about the selective pressure, except that it is greater than or equal to 0, i.e. a switch to a non-HRP2 RDT is correct even if there is no selective pressure that would increase the proportion of false-negative RDTs due to pfhrp2 deletions.

If we assume that there is a negligible selective pressure then the frequency of pfhrp2 gene deletions (ignoring fade out and sufficiently large populations) will remain constant over the time period of monitoring for pfhrp2/3 deletions. This period is defined on page 12 of the WHO technical guidance as either two years if the 95% CI for the proportion of P. falciparum cases with false-negative HRP2 RDT results due to pfhrp2/3 deletions is less than 5%, or one year if it does include 5%, which is quite likely given the size of the binomial confidence interval for 370 samples. With this assumption of no selection then any measure of this proportion collected at one 8-week time period is hoped to be representative for the period of monitoring, which would thus represent the annual or biennial average (with the biennial average being the same as an annual average if there is no selection). This was why we initially were interested in looking at the annual average given our assumption of no selective pressure.
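To illustrate why an inconclusive result is quite likely with 370 samples, the following minimal Python sketch computes a 95% interval for a hypothetical survey in which 19 of 370 cases are false negatives due to deletions. It assumes a Wilson score interval, which may not be the method used in the WHO technical guidance, and the function name and counts are illustrative only.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives roughly 95%)."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width

# Hypothetical survey: 19 of 370 confirmed cases are false negatives due to deletions
lower, upper = wilson_interval(19, 370)

# True when the interval spans the 5% threshold, i.e. the result is inconclusive
includes_threshold = lower < 0.05 < upper
```

With an observed proportion close to 5%, the interval is several percentage points wide and spans the threshold, so a follow-up study after one year would be indicated under the rule described above.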

If there is a selective pressure, however, then any measure is indicative of the proportion within that time period, which would only be the same as the annual average if the strength of selection tended to zero (i.e. the no selection scenario). However, with a strong selective pressure this proportion is likely to increase. Importantly, we do not know precisely by how much and thus we do not wish to make predictions about whether a measure made in a given 8-week period is representative of that region’s average proportion in the annual or biennial period of monitoring. It is for this reason we think the recommendation for follow up studies made by the WHO in the technical guidance is very sensible. This is also because if there is a strong selective pressure but the seasonality profile of the region and the 8-week period chosen result in a measure that underestimates the annual average for the year after sample collection, then it is likely that the follow up study will correctly detect this increase. If the 8-week period chosen leads to an overestimate of the annual average, then the decision to switch RDT should not be considered incorrect, as it is likely that if the use of HRP2-based RDTs had continued then the selective pressure would have meant a switch would have been needed at the end of the period of monitoring.

It is only in situations where there is a negligible selective pressure that you could make a decision that would actually be incorrect as opposed to premature. The regions identified in this way (Figure 4) would also likely have a negligible selective pressure as they are largely areas with both a high transmission intensity and high seasonality.

We hope the above clarifies why we were interested both in looking at areas with no selective pressure and in considering annual averages. We acknowledge that this was neither clear nor obvious from the manuscript as it was written. To address this we have extended the Introduction to lay out early what the measure we are recording is (the proportion of false-negative RDTs due to pfhrp2/3 deletions) and why we are comparing this to the annual average, i.e. due to the period defined for follow up monitoring. We agree that this comparison is open to interpretation and so we have removed the use of terms such as bias or unbiased, and have presented the results in terms of whether they are representative of the annual average proportion. We have also added Figure 5, which is referenced in the fourth paragraph of the Discussion, a new paragraph in which we discuss the assumption we are making about looking at the annual average as well as why we are focussing on the seasonal dynamics with the frequency of pfhrp2 deletion fixed. This figure illustrates the above discussion by considering 2 scenarios: a seasonal setting with selection and one without selection. In this diagram we hope to show clearly for readers why we are focussing our study on the assumption that pfhrp2 deletion frequency is not changing over time and subsequently why we would compare measures collected in 8-week periods to the annual average.

Thank you again to the reviewers for their incredibly helpful comments and we hope our responses to the specific comments have addressed the outstanding points satisfactorily.

Because of the extensive nature of the requested revisions and clarifications which cannot easily be summarized, more extensive comments from both reviewers are appended below. All substantive points should be addressed satisfactorily as we are unable to extend the review process beyond this next revision.

Reviewer #2:

The manuscript has improved and authors addressed most of my comments satisfactorily. I have however a few additional comments on the revised manuscript and rebuttal letter which I feel would require additional clarification.

- Exceptionally unhelpfully, the use of "false-negative" should be "positive" here. We carried out all our simulations with the assumption that individuals who are only infected with pfhrp2 gene deleted parasites will still be treated. As such, the gene deleted parasites behave exactly the same as the wild type parasites.

I am afraid I don't fully follow this reasoning. My understanding is that the motivation of the study was that pfhrp2 gene deleted parasites could indeed be misdiagnosed and so simulations should treat them as false negatives (Introduction, second paragraph). If simulations treat those as positive, how could the model effectively estimate the rate of misdiagnosis and the seasonality in such an estimate? Have I missed something?

Besides, I don't think that the authors have actually addressed the real concern that came with their original consideration that false negative RDT pfhrp2 deleted parasites would allow them to control for selection.

Thank you for bringing this up, and we hope the following explanation and changes make it clearer how we simulated this. To clarify how we were conducting the earlier simulation in which no selective pressure was assumed, we assumed individuals who were infected with only pfhrp2 deleted parasites would be correctly treated, which removed the selective pressure. However, when reporting on the rate of misdiagnosis we would regard these individuals as misdiagnosed as they are only infected with pfhrp2 deleted parasites. Thus we are reporting the frequency of misdiagnoses that would have occurred throughout the time period for a given frequency of pfhrp2 gene deletions, while controlling for any increases in pfhrp2 gene deletions. The advantage of conducting simulations in this way is that we are more certain that dynamics observed in the frequency of misdiagnoses are due to seasonal fluctuations in the prevalence of individuals only infected with pfhrp2 deleted parasites, and not due to increases in the population frequency of pfhrp2 deletions. We have added text to clarify this in the Materials and methods section as follows:

“As a result, when reporting the proportion of clinical cases that were misdiagnosed resulting from a false-negative PfHRP2-negative RDT we are reporting the proportion of cases that are infected with only pfhrp2-deleted parasites, i.e. individuals who would have been pfhrp2-negative and subsequently misdiagnosed.”

To make this clearer for the reader, this is further expanded upon in the opening paragraph of the Results:

“We initially assumed that the frequency of pfhrp2 deletions was not increasing over time before considering scenarios in which the selective pressure for pfhrp2 deletions causes an increase in the population frequency of pfhrp2 deletions. This decision allowed for the impact of seasonality on the proportion of clinical cases that are pfhrp2-negative to be isolated, before allowing comparisons to scenarios in which the proportion of clinical cases that are pfhrp2-negative is increasing also due to changes in the population frequency of pfhrp2 deletions.”

- Related to the same issue above, and in response to my comment, authors replied:

"We do not include a selective advantage to pfhrp2 gene deletion (apologies again for the error mentioned at the beginning of our response) and so we would not expect there to see a temporal variability in the selection pressure. If we did consider this then there would definitely be a temporal element, with the increase in the absolute number of people who seek treatment (we assume a constant proportion of people with a malarial fever seek treatment) during periods of higher transmission causing an increase in the prevalence of the pfhrp2 gene deletion. It was because of this reason that we decided not to model selection, so that we could exclude this effect of selection and be more confident that the dynamics seen are due to the fluctuations in individuals being only infected with pfhrp2 deleted parasites."

The selective advantage comes with pfhrp2 gene deletion individuals being misdiagnosed and not getting treatment. If you consider those as positive you remove selection but this is not reflecting any more the reality of the situation.

We agree with the reviewers that a selective pressure is likely and that the presence of a selective pressure better represents the biological realism of the dynamics of pfhrp2 gene deletions in most settings. We have included these additional model simulations in the revised manuscript as supplementary figures (Figure 1—figure supplement 2 and Figure 2—figure supplement 1). In these simulations, the selective pressure used in the original study was included, and the same simulation settings were explored.

The included supplementary figures for Figures 1 and 2 that consider selection show that there is a significant increase in pfhrp2 deletions over the time period studied. In particular, in the low transmission setting considered with low seasonality, the frequency of pfhrp2 gene deletions doubled from 6% to over 12% on average after two years (Figure 1—figure supplement 2Q). As a result, for this setting there is a systematic increase in the proportion of false-negative PfHRP2 RDTs within clinical cases within 8-week periods (Figure 2—figure supplement 1B). This manifests in the observation that you would expect estimates collected at the end of the calendar year (because we started our simulations in January) to overestimate the prevalence of false-negative RDTs due to pfhrp2 deletions when compared to the annual mean over the time period shown. However, this comparison is arguably unsuitable; the estimate should perhaps be compared to the annual prevalence after sample collection, i.e. the mean for the period of monitoring. We do not, however, know the true strength of selection and would feel it incorrect to suggest what this would be, whereas we are confident that the results presented when the frequency of pfhrp2 deletions is constant are able to demonstrate which time periods are most likely to be systematically above or below the annual average.

We hope that the inclusion of these additional figures and the discussion made here have helped to clarify the above two issues. We have included a discussion of this in the third, fourth and fifth paragraph of the Discussion, which also includes some of the earlier discussion at the top of this response.

- About the relationship between monoclonal/multiclonal infections and pfhrp2 deletions prevalence and selection.

I thank the authors for including a supplementary figure, but could the relationships be clarified further in the text? Saying that the relationship is unclear is not of much help.

Thank you for highlighting this; on reading it back, the supplementary figure was not actually linked, nor were the relationships clarified. The second Results paragraph now has a longer description of why lower transmission settings have an increased chance of individuals being only infected with pfhrp2-negative parasites, and the supplementary figure is now linked here:

“This observation is attributable to the lower rate of superinfection in low transmission settings. The lower rate of superinfection reduces the number of polyclonal infections and increases the chance that an individual is only infected with pfhrp2-negative parasites (Figure 1—figure supplement 1).”
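The direction of this relationship can be illustrated with a deliberately simplified Python sketch. It is not the individual-based model used in the study. It assumes, purely for illustration, that the number of clones in an infected individual follows a zero-truncated Poisson distribution, that each clone independently carries the deletion with probability equal to the population deletion frequency, and that 25% of deletion-only infections still test positive through PfHRP3 cross-reactivity (the value cited from Baker et al., 2005). The function and parameter names are hypothetical.

```python
import math

def prop_only_deleted(freq, poisson_rate):
    """Probability that every clone in an infection carries the deletion.

    Assumes clone counts are zero-truncated Poisson with rate `poisson_rate`
    and clones are independent draws with deletion frequency `freq`, giving
    (exp(rate * freq) - 1) / (exp(rate) - 1).
    """
    return math.expm1(poisson_rate * freq) / math.expm1(poisson_rate)

def prop_false_negative(freq, poisson_rate, cross_reactivity=0.25):
    """Proportion of infections expected to give a false-negative HRP2 RDT."""
    return (1 - cross_reactivity) * prop_only_deleted(freq, poisson_rate)

# Same deletion frequency, two hypothetical superinfection levels
low_transmission = prop_false_negative(0.06, poisson_rate=0.5)
high_transmission = prop_false_negative(0.06, poisson_rate=4.0)
```

Under these assumptions, a lower rate of superinfection (fewer clones per infection) gives a higher proportion of deletion-only infections for the same deletion frequency, which is the qualitative pattern described in the quoted text.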

- About my comment "The regions identified were areas with both a low prevalence of malaria and a high frequency of people seeking…" Were these the only factors?"

To which the authors responded: "These were the only factors we looked at within our modelling study".

I don't find this reply satisfactory. I know they modelled only those, but my comment was more a recommendation that it be acknowledged somewhere in the Introduction or the Discussion whether there could be other factors that have not been considered and have been shown or suggested to influence the misdiagnoses.

Thank you for raising this again and apologies for misunderstanding. We have added the following sentence to the end of the second paragraph of the Introduction highlighting that there are other factors that could affect the rate of selection and the number of misdiagnoses made:

“The precise strength of selection, however, is not known with other factors such as the rate of non-malarial fevers and non-adherence to RDT outcomes likely to impact the number of misdiagnosed cases receiving treatment.”

Reviewer #3:

The author's clarifications make sense and are helpful. However, my improved understanding of the authors' intentions and the results and conclusions presented in the paper has generated some additional questions and comments. I still feel that the paper would benefit from greater clarity.

My understanding is that the key values being considered are:

1) The proportion of HRP2 RDT results for patients infected with P. falciparum which are false negatives caused by pfhrp2/3 gene deletions at a given timepoint.

2) The average proportion of HRP2 RDT results for patients infected with P. falciparum which are false negatives caused by pfhrp2/3 gene deletions, during a given year.

3) The proportion of P. falciparum parasites in a given region which have pfhrp2/3 gene deletions.

4) The 5% threshold in the WHO guidelines.

It would be incredibly helpful if the authors could provide a precise definition for this, as the various wordings I have found so far in the WHO protocol and information note are open to interpretation regarding whether the 5% is intended to represent: a) The proportion of HRP2 RDT results for patients infected with P. falciparum which are false negatives caused by pfhrp2/3 gene deletions; or b) The proportion of P. falciparum parasites in a given region which have pfhrp2/3 gene deletions.

Part of a full definition for this value is the assumed timing. A quick review of the WHO documentation does not immediately yield any specific information about assumed timings, an absence which would be consistent with an assumption that the rate is effectively constant through a season, or might equally mean that the relevant value is that at the time of sampling.

Thank you for the attention to the clarity regarding the model outputs that are discussed in the manuscript. We agree that we have not been consistent about which values we are referencing in the text. The main outcome that we are considering is #2 in the list above. The decision to compare to the annual average is based on the period of monitoring recommended on page 12 of the WHO technical guidance, in which follow-up studies should be conducted again after one year if the findings are inconclusive, i.e. the 95% CI for the proportion of false-negative RDTs due to pfhrp2/3 deletions includes 5%. We have made this assumption clearer in the third paragraph of the Introduction by including the specific equation listed in the WHO technical guidance, as follows:

“In February 2018, the World Health Organization (WHO) issued guidance for national malaria control programmes on how to investigate suspected false-negative RDTs with an emphasis on pfhrp2/3 gene deletions (World Health Organization, 2018b). The primary study outcome to be calculated in the guidance is as follows:

Proportion of P. falciparum cases with false-negative HRP2 RDT results due to pfhrp2/3 deletions = (# of confirmed P. falciparum patients with pfhrp2/3 gene deletions and HRP2 RDT negative results) / (# of confirmed P. falciparum cases, by either RDT or microscopy)”
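As a minimal sketch of how the outcome above and the "inconclusive" criterion (a 95% CI that includes 5%) could be evaluated, the following Python computes the proportion and an interval around it. The function names are hypothetical, and the Wilson score interval is an assumed choice of method: the text here does not state how the guidance computes its confidence interval. The counts in the usage note are purely illustrative and are not study data.

```python
import math


def false_negative_proportion(n_deleted_and_rdt_negative, n_confirmed):
    """Numerator over denominator of the primary outcome quoted above."""
    if n_confirmed <= 0:
        raise ValueError("need at least one confirmed P. falciparum case")
    return n_deleted_and_rdt_negative / n_confirmed


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval (assumed CI method)."""
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


def classify_against_threshold(successes, n, threshold=0.05):
    """Compare the interval with the 5% threshold."""
    lower, upper = wilson_interval(successes, n)
    if lower > threshold:
        return "above threshold"
    if upper < threshold:
        return "below threshold"
    return "inconclusive"  # interval includes the threshold
```

For example, `classify_against_threshold(10, 200)` gives an estimate of 5% whose interval spans the threshold, which under the monitoring scheme described above would correspond to repeating the study after one year.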

In the paper, the authors appear to be assuming that the intended meaning of the 5% threshold is the average proportion of HRP2 RDT results for patients infected with P. falciparum which are false negatives caused by pfhrp2/3 gene deletions, during a given year (item 2 in the list above). The Discussion and conclusion then focus on differences between the prevalence of false negatives at specific timepoints during a year vs the average prevalence value over the year. It would hugely improve the clarity of the paper to state this assumption explicitly and early in the text. It is also necessary to demonstrate using information in the WHO documentation that this is in fact the intended definition of the WHO 5% threshold. Without unequivocal evidence that this is the precise meaning of the threshold value intended by the WHO, then the use of terms such as 'bias', 'overestimate', 'underestimate' etc. is unjustified throughout.

If it is not clear that the threshold is defined as an annual average, then the paper's message needs to change slightly. By indicating the extent to which the prevalence of false negatives can vary seasonally, even when the prevalence of gene deletions is constant, the results presented here indicate that a conscious choice about this aspect of the definition is very important. Should the threshold represent the acceptable maximum prevalence of false negatives, or should it be the annual average? In either case, the results can inform strategies for applying the protocol in ways most likely to identify the required value.

Thank you for raising this issue of interpretation. We agree that the technical guidance is open to interpretation about whether the estimates collected should be reflective of that point in time or should be representative of the annual average. We have assumed the latter, which was based on the period of monitoring detailed above and on our assumption of a negligible selective pressure resulting in a constant deletion frequency. To make this clearer we have extended text in the second last paragraph of the Introduction to clarify that the comparisons to the annual average are chosen to reflect the established monitoring scheme:

“The 8-week interval permits for a rapid turnaround and allows for efficient investigations and policy responses. […] Subsequently, any recorded estimate may not be predictive of the number of cases that may be misdiagnosed due to pfhrp2/3 deletions in the years between sampling intervals.”

However, we do acknowledge that this is open to interpretation, and as such we have replaced our use of the terms biased and unbiased and instead refer to whether the reported estimates are representative of the annual average. We have discussed this at length in the fifth paragraph of the Discussion, in which we consider this assumption in the context of scenarios in which the proportion of false-negative RDTs due to pfhrp2/3 deletions is increasing over time, because these settings make the comparison to the annual average trickier. However, we conclude this paragraph by arguing that it is more prudent to be concerned with scenarios under the assumption that there is a negligible selective pressure, in which case the comparison to the annual average is justified. Our justification is demonstrated in an additional figure (Figure 5), with the following text at the end of the fourth paragraph of the Discussion raising this justification:

“However, we believe that it is more important to focus on the assumption that the strength of selection is negligible (see Figure 5). […] In areas with a selective pressure it is still possible to incorrectly estimate the annual average for the following year, however the presence of the selective pressure is likely to mean any decision made is simply premature as the frequency of pfhrp2/3 deletions and subsequently false-negative PfHRP2 RDTs will increase over time (Figure 5B).”

There is also some confusion in the text between the prevalence of false-negative results and the prevalence of the gene deletion, with the text referring to change of RDT being triggered by an incorrect assessment of the prevalence of gene deletions (e.g. Introduction, fourth paragraph), suggesting that the authors may in fact be defining the threshold value as equal to value 3 in the list above.

Thank you for highlighting these inconsistencies, as they do make the messaging harder to follow. We agree we were not consistent in the prevalence values mentioned, and we have addressed these below.

These are key to the meaning and the implications of the work presented here, and clarity about what is being assumed or referred to is crucial to allow the text to tell its story clearly, and to make it easy to assess the consistency of that story. Confusing references to different prevalence values in the text should be reviewed and resolved wherever they arise throughout the text, including some specific instances detailed below.

Detailed comments:

Introduction, third and fourth paragraphs: In the third paragraph of the Introduction the authors give a definition of the WHO threshold value as being the prevalence of false negatives caused by pfhrp2/3 gene deletions. However, in the fourth paragraph of the Introduction they suggest that incorrect assessment of the prevalence of pfhrp2/3 gene deletions could drive the decision to switch to non-HRP2 RDTs. Is there another mechanism in the WHO guideline in addition to the 5% false negatives threshold which would drive a change of policy based on gene deletion prevalence rather than false-negative RDT prevalence?

'The protocol in this guidance details how to estimate the local prevalence of false-negative PfHRP2-based RDTs due to pfhrp2/3 gene deletions and recommends that a national change to non PfHRP2-based RDTs be made if the estimated prevalence is above 5%.'

'the timing of the 8-week interval chosen within a transmission season could lead to bias in the sampled prevalence of pfhrp2/3 gene deletions. An overestimation of the true prevalence of pfhrp2/3 gene deletions could result in a switch to a less sensitive RDT'

Apologies for the confusion. This has been changed to refer to the proportion of P. falciparum cases with false-negative HRP2 RDT results due to pfhrp2/3 deletions throughout.

Results, first paragraph and similar elsewhere in text: 'In a moderate transmission setting, a clear seasonal pattern is predicted (Figure 2C), with sampling at the beginning of the transmission seasons resulting in significant overestimation of the true proportion of false negative RDTs..'

'true' is not adequately defined to be used here in this way. It might legitimately be assumed to mean the population prevalence of false-negative RDTs at the time of sampling. What is meant here, I think, is that sampling at the beginning of the transmission season is expected to give a value higher than the true average value for the year.

We agree this is unclear and have removed the use of “true” throughout. We are now clearer that we are referring to the annual average, which we define early in the manuscript, in the Introduction.

Introduction, last paragraph, Figure 4 description and title, Results, last paragraph.

Introduction, last paragraph and figure description indicate that the values used to generate Figure 4 are the gene deletion prevalences

Results, last paragraph and implication of contents of plot indicate that the plot is based on prevalences of false negative values.

Results, first and last paragraph and Discussion, first paragraph and similar elsewhere in text – 'biased' and 'unbiased' are mathematical terms with specific meanings and it is not clear that those meanings are correctly applied here and elsewhere in the text. It would be better to replace them with other terms unless the mathematical meaning is genuinely indicated.

Thank you for highlighting this area of confusion. We have removed these terms and have consistently referred to measures as being either representative or unrepresentative of the annual average proportion of false-negative RDTs due to pfhrp2/3 deletions.

Discussion, first and second paragraphs. These paragraphs both begin by describing the research presented in the manuscript as relating to estimates of prevalence of pfhrp2 gene deletions. The remaining text all seems to actually describe the results regarding the prevalence of false-negative HRP2 RDT results, but the first sentences mean that it all reads as discussion of the gene deletion prevalence.

'This research characterises the potential for surveillance in highly seasonal areas within sub-Saharan Africa to produce biased estimates of the prevalence of pfhrp2 gene deletions. These findings highlight the impact of both the seasonal timing and…'

'Our modelling predicted that there would be increased observation of pfhrp2 gene deletions after periods of lower transmission and within younger individuals…'

Apologies for these. We have changed these to refer consistently to the proportion of false-negative RDTs due to pfhrp2 deletions throughout. Any explicit mention of changes in the frequency of the gene deletions is now made in the context of how it will impact the proportion of false-negative RDTs.

Discussion, first, third and fourth paragraphs. 'However, the true prevalence of parasites with a pfhrp2 gene deletion in each administrative region is fundamentally unknown, and as such, our results should not be interpreted as predictions of the bias in future sampled estimates of pfhrp2 deletion. They should instead be used to support surveillance efforts and to reinforce the need for longitudinal measures of pfhrp2 gene deletions conducted at the same point with a transmission season.'

Is this compatible with the database mentioned in the Discussion? 'To support surveillance efforts, we have published an online database detailing the optimum sampling interval as well as the sampling bias throughout the transmission season for each administrative region'

We do not know the prevalence of deletions in all regions. However, we agree that this phrasing confuses our point, which is simply that, without knowing the fitness costs, selection pressure and prevalence of deletions, our results cannot accurately predict how unrepresentative collected estimates may be. We are, however, able to confidently indicate the time periods that are most at risk of producing estimates that do not reflect the annual average. This section has also been rewritten to better consider our assumption of comparing to the annual average in the context of our focus on simulations in which the frequency of pfhrp2 deletions is constant over time.

'The observed prevalence of pfhrp2 deletions is higher when monoclonal infections are more prevalent, with the highest prevalence observed when sampling at the start of the rainy season as individuals are less likely to already be infected. Similarly, the observed prevalence of pfhrp2 deletions is higher in younger individuals who have lower clinical immunity, as they are more likely to present with clinical symptoms after their first infection event.'

Should these two references be to prevalence of false negatives rather than prevalence of pfhrp2 deletions?

Thank you again for highlighting these inconsistencies. In line with the other changes we have changed these to refer to false-negative RDTs due to pfhrp2 deletions.

Discussion, last paragraph. This seems to be simply repeating contents of first paragraph of Discussion?

Thank you for pointing this out. We have shortened the opening paragraph of the Discussion as a result.

Subsection “Characterising the impact of seasonal transmission intensities upon pfhrp2 deletion prevalence”, last paragraph. '…fitting the frequency of pfhrp2 gene deletions in each simulation such that the true prevalence of false-negative RDT results due to pfhrp2 deletions is equal to 5%.'

'.. percentage of intervals that did not include the true prevalence of 5% was calculated.'

'true' not adequately defined, should simply say '.. the average annual prevalence..' or similar.

Thank you – we agree the use of true is unclear, and we have replaced this to now read as “annual prevalence”.

Figure 1 legend. '..In I – L and M – P the proportion of clinical cases due to pfhrp2-negative parasites is shown for both the whole population and..'

Wording is confusing, does this mean cases infected only with pfhrp2-negative parasites?

Thank you for highlighting this area of confusion. We agree and have made this clearer by changing to “proportion of clinical cases only infected with pfhrp2-negative parasites”.

'…the population allele frequency of pfhrp2 gene deletions, which was set equal to 6% at the beginning of each simulation..'

Is the reason for or significance of the 6% value given anywhere?

Thank you for also picking up on this. The reasoning is mentioned in the Materials and methods, and relates to the 6% identified in the original study as the likely frequency of pfhrp2 gene deletions in DRC prior to the use of RDTs. We have also added this to the opening paragraph of the Results section, which includes some description of the methodology to help improve the reader’s understanding.

'…10 simulation realisations are shown in each graph, with the mean shown with the thicker line. Lastly, the 5% threshold for switching RDT provided by the WHO is shown with the black line in plots I – P…'

I think the means are shown by the black line, and the 5% by the dashed horizontal line?

Thank you for picking this up and apologies for the unclear description. This has been corrected to read:

“10 simulation realisations are shown in each graph, with the mean shown by the black line. Lastly, the 5% threshold for switching RDT provided by the WHO is shown with the dashed horizontal line in plots I – P.”

Figure 3 legend. Should '..age and seasonality..' be '..age and transmission intensity..'?

Transmission intensity is clearer and more accurate than seasonality and has been changed. Thank you for the suggestion.

Figure 4 legend, description and title. '..pfhrp2 deletion..' should be '..false-negative pfhrp2 RDTs?..'

Should also be revised as necessary to reflect assumed exact definition of threshold value.

Thank you – we have updated the caption to refer to proportion of false-negative PfHRP2 RDTs due to pfhrp2 deletions.

Pseudocode, second line 048

// Loop through every day in simulation and calculate the seasonal curve for that day
045 FOR day := 1 TO t_max // t_max is total simulation time in days
046   theta[day] := Fourier_average + first_cosine_term * cos(2*pi*day/365) + second_cosine_term * cos(2*2*pi*day/365) + third_cosine_term * cos(3*2*pi*day/365) + first_sine_term * sin(2*pi*day/365) + second_sine_term * sin(2*2*pi*day/365) + third_sine_term * sin(3*2*pi*day/365)
047 ENDFOR

// Loop through every day in simulation and normalise seasonal curve for that day
048 FOR day := 1 TO t_max // t_max is total simulation time in days
048   theta[day] := theta[day] / mean(theta[1 TO 365]) // normalise theta with first 365 days of theta
049   IF (theta[day] < 0.001) // with only 1st 3 terms of Fourier used we need to check for <0
050     theta[day] := 0.001
051   ENDIF
052 ENDFOR

I'm assuming this is just a problem with the pseudo code, not the actual code, but that should be checked and confirmed. It seems that in the normalisation loop, the sum of theta values by which θ(n) is divided will use the normalised rather than original values for all θ(<n).

Could the authors please review the pseudo code for consistency with the actual code?

Apologies for the shorthand here in the pseudocode, as we agree it was misleading. The normalisation uses the mean of the original values. This has been corrected by specifying the calculation of the mean prior to the normalisation loop. The line numbers have all been updated as well to reflect this extra line.
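The correction described above can be sketched in Python as follows. This is an illustrative translation of the pseudocode, not the authors' actual code: the function name `seasonal_curve` and its argument layout are assumptions, and the coefficients are passed as lists rather than as the individually named terms used in the pseudocode. The key point is that the mean of the original first 365 values is computed once, before the normalisation loop, so that already-normalised values never enter the divisor.

```python
import math


def seasonal_curve(t_max, fourier_average, cosine_terms, sine_terms, floor=0.001):
    """Daily seasonal curve from the first Fourier terms, normalised by the
    mean of the ORIGINAL values for the first 365 days."""
    theta = []
    for day in range(1, t_max + 1):
        value = fourier_average
        # k-th harmonic: cos(k*2*pi*day/365) and sin(k*2*pi*day/365)
        for k, (a, b) in enumerate(zip(cosine_terms, sine_terms), start=1):
            angle = k * 2 * math.pi * day / 365
            value += a * math.cos(angle) + b * math.sin(angle)
        theta.append(value)

    # Compute the divisor once, from un-normalised values, before the loop.
    first_year = theta[:365]
    mean_first_year = sum(first_year) / len(first_year)

    # Normalise, then floor: with only three Fourier terms the curve can dip
    # below zero, so values under the floor are set to the floor.
    return [max(value / mean_first_year, floor) for value in theta]
```

This sketch assumes the mean over the first year is positive, and if `t_max` is shorter than 365 days it averages over the days available.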

https://doi.org/10.7554/eLife.40339.017

Article and author information

Author details

  1. Oliver John Watson

    MRC Centre for Global Infectious Disease Analysis, Department of Infectious Disease Epidemiology, Imperial College London, London, United Kingdom
    Contribution
    Conceptualization, Data curation, Software, Formal analysis, Investigation, Methodology, Writing—original draft, Project administration, Writing—review and editing
    For correspondence
    o.watson15@imperial.ac.uk
    Competing interests
    No competing interests declared
    ORCID: 0000-0003-2374-0741
  2. Robert Verity

    MRC Centre for Global Infectious Disease Analysis, Department of Infectious Disease Epidemiology, Imperial College London, London, United Kingdom
    Contribution
    Formal analysis, Supervision, Visualization, Methodology, Writing—review and editing
    Competing interests
    No competing interests declared
  3. Azra C Ghani

    MRC Centre for Global Infectious Disease Analysis, Department of Infectious Disease Epidemiology, Imperial College London, London, United Kingdom
    Contribution
    Supervision, Methodology, Writing—review and editing
    Competing interests
    No competing interests declared
  4. Tini Garske

    MRC Centre for Global Infectious Disease Analysis, Department of Infectious Disease Epidemiology, Imperial College London, London, United Kingdom
    Contribution
    Data curation, Software, Methodology, Writing—review and editing
    Competing interests
    No competing interests declared
  5. Jane Cunningham

    Global Malaria Programme, World Health Organization, Geneva, Switzerland
    Contribution
    Supervision, Methodology, Writing—review and editing
    Competing interests
    No competing interests declared
  6. Antoinette Tshefu

    School of Public Health, University of Kinshasa, Kinshasa, Democratic Republic of the Congo
    Contribution
    Resources, Formal analysis, Writing—review and editing
    Competing interests
    No competing interests declared
  7. Melchior K Mwandagalirwa

    1. School of Public Health, University of Kinshasa, Kinshasa, Democratic Republic of the Congo
    2. Department of Epidemiology, Gillings School for Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, United States
    Contribution
    Resources, Formal analysis, Writing—review and editing
    Competing interests
    No competing interests declared
  8. Steven R Meshnick

    1. Department of Epidemiology, Gillings School for Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, United States
    2. Division of Infectious Diseases, Department of Medicine, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, United States
    Contribution
    Resources, Formal analysis, Writing—review and editing
    Competing interests
    No competing interests declared
  9. Jonathan B Parr

    Division of Infectious Diseases, Department of Medicine, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, United States
    Contribution
    Conceptualization, Data curation, Formal analysis, Supervision, Methodology, Writing—review and editing
    Competing interests
    No competing interests declared
  10. Hannah C Slater

    MRC Centre for Global Infectious Disease Analysis, Department of Infectious Disease Epidemiology, Imperial College London, London, United Kingdom
    Contribution
    Conceptualization, Formal analysis, Supervision, Investigation, Visualization, Methodology, Writing—review and editing
    Competing interests
    No competing interests declared

Funding

Wellcome Trust (109312/Z/15/Z)

  • Oliver John Watson

Medical Research Council (MR/N01507X/1)

  • Robert Verity

Department for International Development

  • Azra C Ghani

Medical Research Council

  • Tini Garske

National Institute of Allergy and Infectious Diseases (R01AI132547)

  • Steven R Meshnick

American Society for Tropical Medicine and Hygiene (Burroughs Wellcome Fund-ASTMH Postdoctoral Fellowship in Tropical Infectious Diseases)

  • Jonathan B Parr

Burroughs Wellcome Fund (Burroughs Wellcome Fund-ASTMH Postdoctoral Fellowship in Tropical Infectious Diseases)

  • Jonathan B Parr

Imperial College London

  • Hannah C Slater

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank Dr Michela Menegon for sharing the sample dates for the 2013–2014 study in Eritrea and the administrators and participants of the Demographic and Health Surveys.

Senior Editor

  1. Eduardo Franco, McGill University, Canada

Reviewing Editor

  1. Ben Cooper, Mahidol Oxford Tropical Medicine Research Unit, Thailand

Reviewers

  1. Ben Cooper, Mahidol Oxford Tropical Medicine Research Unit, Thailand
  2. Elena Gómez-Díaz, Doñana Biological Station (EBD-CSIC), Spain
  3. Penelope Anne Lynch, University of Exeter Cornwall Campus, United Kingdom

Publication history

  1. Received: July 25, 2018
  2. Accepted: April 29, 2019
  3. Accepted Manuscript published: May 2, 2019 (version 1)
  4. Version of Record published: May 23, 2019 (version 2)

Copyright

© 2019, Watson et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.


Cite this article

  1. Oliver John Watson
  2. Robert Verity
  3. Azra C Ghani
  4. Tini Garske
  5. Jane Cunningham
  6. Antoinette Tshefu
  7. Melchior K Mwandagalirwa
  8. Steven R Meshnick
  9. Jonathan B Parr
  10. Hannah C Slater
(2019)
Impact of seasonal variations in Plasmodium falciparum malaria transmission on the surveillance of pfhrp2 gene deletions
eLife 8:e40339.
https://doi.org/10.7554/eLife.40339
