From multiplicity of infection to force of infection in sparsely sampled high-transmission Plasmodium falciparum populations

  1. Qi Zhan (corresponding author)
  2. Kathryn E Tiedje
  3. Karen P Day
  4. Mercedes Pascual (corresponding author)
  1. Committee on Genetics, Genomics and Systems Biology, The University of Chicago, United States
  2. Department of Microbiology and Immunology, Bio21 Institute and The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, Australia
  3. Department of Biology and Department of Environmental Studies, New York University, United States
  4. Santa Fe Institute, United States
29 figures, 1 table and 8 additional files

Figures

Figure 1
Confidence intervals for estimated mean FOI values in simulated scenarios of homogeneous exposure risk, before and during IRS interventions at three different coverage levels.

The times between local transmission events follow a Gamma distribution, with seasonal transmission in a closed system. FOI estimates are derived from true MOI values and from MOI estimates obtained through the Bayesian formulation of the varcoding method or the bootstrap imputation approach correcting for all or individual sampling limitations. The true mean FOI per host per year is computed by dividing the total number of infections acquired by the population by the total number of hosts in the population. Confidence intervals are estimated from 200 bootstrap replicates using non-parametric bootstrap analysis. Each boxplot shows minimum, 5% quantile, median, 95% quantile, and maximum values.
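The two summary quantities in this caption, the true mean FOI (total infections divided by total hosts) and the five-number boxplot summary over 200 non-parametric bootstrap replicates, can be illustrated with a minimal sketch. The per-host infection counts, the function name `bootstrap_mean_summary`, and the simple index-based quantile rule are assumptions made for illustration. In the study, each replicate feeds an FOI inference step rather than a plain mean, so this is not the authors' pipeline.

```python
import random
import statistics


def bootstrap_mean_summary(values, n_replicates=200, seed=0):
    """Resample `values` with replacement and summarize the replicate means."""
    rng = random.Random(seed)
    n = len(values)
    # One mean per bootstrap replicate, sorted so quantiles can be read by index.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_replicates)
    )

    def quantile(q):
        # Simple index-based quantile on the sorted replicate means (an assumption).
        idx = min(n_replicates - 1, max(0, round(q * (n_replicates - 1))))
        return means[idx]

    return {
        "min": means[0],
        "q05": quantile(0.05),
        "median": quantile(0.5),
        "q95": quantile(0.95),
        "max": means[-1],
    }


# Hypothetical infections acquired per host over one year (illustrative only).
infections_per_host = [0, 1, 3, 2, 5, 0, 4, 2, 1, 6]

# True mean FOI per host per year: total infections / total hosts.
true_mean_foi = sum(infections_per_host) / len(infections_per_host)

summary = bootstrap_mean_summary(infections_per_host)
```

The dictionary keys mirror the five values each boxplot displays (minimum, 5% quantile, median, 95% quantile, maximum).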

Figure 2
Confidence intervals for estimated mean FOI values in simulated scenarios of heterogeneous exposure risk, before and during IRS interventions at three different coverage levels.

The times between local transmission events follow a Gamma distribution, with seasonal transmission in a semi-open system. FOI estimates are derived from true MOI values and MOI estimates obtained through the Bayesian formulation of the varcoding method or the bootstrap imputation approach correcting for all or individual sampling limitations. The true mean FOI per host per year is computed by dividing the total number of infections acquired by the population by the total number of hosts in the population. Confidence intervals are estimated from 200 bootstrap replicates using non-parametric bootstrap analysis. Each boxplot shows minimum, 5% quantile, median, 95% quantile, and maximum values.

Figure 3 with 3 supplements
Confidence intervals for the estimated mean FOI values in Ghana surveys before and immediately after a transient three-round IRS intervention.

(A) The estimated FOI values when excluding antimalarial-treated individuals from the analysis. (B) The estimated FOI values when discarding the infection status and MOI estimates of treated individuals and sampling from non-treated ones with MOI >0. Since this case samples non-zero MOIs for treated and uninfected individuals, it results in an upper bound for FOI estimates. Confidence intervals are estimated from 200 bootstrap replicates using non-parametric bootstrap analysis. Each boxplot shows minimum, 5% quantile, median, 95% quantile, and maximum. The value of c is set to 30. FOI estimates with other values of c can be found in Figure 3—figure supplements 1–3.

Figure 3—figure supplement 1
Confidence intervals for the estimated mean FOI values in Ghana surveys before and immediately after a transient three-round IRS intervention.

(A) The estimated FOI values when excluding antimalarial-treated individuals from the analysis. (B) The estimated FOI values when discarding the infection status and MOI estimates of treated individuals and sampling from non-treated ones. Confidence intervals are estimated from 200 bootstrap replicates using non-parametric bootstrap analysis. Each boxplot shows minimum, 5% quantile, median, 95% quantile, and maximum. The value of c is set to 25.

Figure 3—figure supplement 2
Confidence intervals for the estimated mean FOI values in Ghana surveys before and immediately after a transient three-round IRS intervention.

(A) The estimated FOI values when excluding antimalarial-treated individuals from the analysis. (B) The estimated FOI values when discarding the infection status and MOI estimates of treated individuals and sampling from non-treated ones. Confidence intervals are estimated from 200 bootstrap replicates using non-parametric bootstrap analysis. Each boxplot shows minimum, 5% quantile, median, 95% quantile, and maximum. The value of c is set to 40.

Figure 3—figure supplement 3
Confidence intervals for the estimated mean FOI values in Ghana surveys before and immediately after a transient three-round IRS intervention.

(A) The estimated FOI values when excluding antimalarial-treated individuals from the analysis. (B) The estimated FOI values when discarding the infection status and MOI estimates of treated individuals and sampling from non-treated ones. Confidence intervals are estimated from 200 bootstrap replicates using non-parametric bootstrap analysis. Each boxplot shows minimum, 5% quantile, median, 95% quantile, and maximum. The value of c is set to 60.

Figure 4
The saturation in FOI with increasing EIR and their non-linear relationship from previous field studies.

(A) and (B) present our empirical estimates (with c=30) when excluding treated individuals from the analysis. (C) and (D) show our estimates when discarding the infection status and MOI estimates of treated individuals and instead sampling from non-treated ones with MOI >0. Since this case samples non-zero MOIs for treated and uninfected individuals, it results in an upper bound for FOI estimates. The black points represent paired EIR–FOI values from the literature, as summarized by Smith et al., 2010, with crosses indicating instances where multiple estimates or ranges were reported or estimated for the same location. The yellow curve represents the best fit to these paired EIR–FOI values (Smith et al., 2010). The purple hollow diamond and plus represent the Ghana data, showing our FOI estimates using the two methods and the EIR measured in the field by the entomological team (Tiedje et al., 2022).

Appendix 1—figure 1
Agent-based model for falciparum malaria transmission.

(A) The stochastic model tracks infection history and specific immune memory of individual hosts to variant surface antigens encoded by var genes. At transmission events, a donor and a recipient host are randomly selected. Transmission occurs if the donor host has blood-stage infections, and the recipient host has not reached carrying capacity of infections in its liver. Each parasite genome in the donor host is transmitted to a mosquito with a probability of 1/(number of genomes) multiplied by the transmissibility of the currently expressed gene. Each parasite genome carries 45 var genes, with each gene represented by a linear combination of two epitopes (depicted by different shapes), with many possible variants each (alleles, depicted by different colors). (B) During the sexual stage within mosquitoes, different parasite genomes can exchange var genes through meiotic recombination, generating novel recombinant repertoires. The recipient host can receive either recombinant genomes or original genomes. (C) When a repertoire is successfully transmitted to a recipient host, it undergoes a 7-day dormant liver stage before entering the blood stage, where var genes are sequentially expressed. If the host has no immunity against either epitope of a given var gene, its expression lasts 7 days (and either 7.5 or 8 days in additional simulations). Immunity to one of the two epitopes reduces the expression by approximately half, while complete immunity to both epitopes leads to immediate clearance of the gene product. An infection ends either when all var genes in the repertoire have been expressed or recognized, or, alternatively, with a certain probability, before the full repertoire is exhausted (Appendix 1—Simulation data, subsection ‘An extended var model,’ sub-subsection ‘Within-host dynamics’). (D) During the asexual blood stage of infection, var genes within the same genome can swap their two epitope alleles through mitotic (ectopic) recombination, generating new epitopes with a certain probability. (E) Var genes can also mutate their epitopes to create new genes.

Appendix 1—figure 2
Simulation design, transmission scenarios, under-sampling or imperfect detection of var genes, and the empirical survey design from Bongo District, northern Ghana.

(A) Each simulation comprises three stages: a ‘pre-IRS’ period where local transmission reaches a semi-stationary state, followed by a 3-year ‘IRS’ intervention period (transient IRS) that reduces the transmission rate, and a ‘post-IRS’ period where transmission rates return to original levels. After transmission initialization, closed systems do not receive migrant genomes from the regional pool. Semi-open systems explicitly model two local populations connected by migration. Regionally open systems continually receive migrant genomes from the regional pool throughout the simulation. This figure was adapted from Zhan et al., 2024 (Figure 1) (CC BY 4.0 license). The copyright holder has granted permission to publish under a CC BY 4.0 license. (B) Transmission intensity or effective contact rate varies seasonally across the pre-, during-, and post-intervention periods. We simulate three levels of perturbation corresponding to approximately 20% (low-coverage IRS), 40–45% (mid-coverage IRS), and 65–75% (high-coverage IRS) reductions in transmission. Under non-seasonal transmission, the transmission intensity remains constant throughout the year, decreases only during IRS, and then returns to its original level once IRS ends. (C) We examine different statistical distributions for times between local transmission events: exponential and Gamma. We consider homogeneous and heterogeneous exposure risks. In the latter, 2/3 of the population are high-risk, receiving approximately 94% of all bites, while the remaining population receives the rest. (D) The measurement error is depicted as a histogram showing the number of non-upsA (i.e., upsB and upsC) DBLα types per repertoire from putatively ‘monoclonal’ infections, characterized by having 45 or fewer non-upsA DBLα types. These sequences were collected during six cross-sectional surveys conducted from 2012 to 2016 in Bongo District. This measurement error represents under-sampling or imperfect detection of var genes. (E) The study consists of four age-stratified cross-sectional surveys in Bongo District, Ghana, conducted at the end of wet/high-transmission seasons (blue circles) and dry/low-transmission seasons (gold circles). Two phases are covered: (1) Pre-IRS: Survey 1 (S1) in October 2012 and Survey 2 (S2) in May/June 2013; (2) Immediately post-IRS: Survey 3 (S3) in October 2015 and Survey 4 (S4) in May/June 2016. IRS was implemented against a background of widespread LLIN usage, with LLINs distributed between 2010 and 2012 and again in 2016 (Tiedje et al., 2025; Gogue et al., 2020). This figure was adapted from Tiedje et al., 2022 (Figure 1) (CC BY 4.0 license). The copyright holder has granted permission to publish under a CC BY 4.0 license.
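Panel (C) contrasts exponential and Gamma distributions for the times between local transmission events. As a minimal sketch of how the two relate, a Gamma distribution can be parameterized by its mean and variance (shape = mean²/variance, scale = variance/mean), and the exponential is the special case in which the variance equals the squared mean. The numerical values and function names below are illustrative assumptions, not parameters or code from the study.

```python
import random
import statistics


def gamma_shape_scale(mean, variance):
    """Convert a mean and variance into Gamma shape and scale parameters."""
    shape = mean * mean / variance
    scale = variance / mean
    return shape, scale


def sample_inter_event_times(mean, variance, n, seed=0):
    """Draw n Gamma-distributed inter-event times with the given mean and variance."""
    rng = random.Random(seed)
    shape, scale = gamma_shape_scale(mean, variance)
    # random.gammavariate takes (shape, scale); its mean is shape * scale.
    return [rng.gammavariate(shape, scale) for _ in range(n)]


# Illustrative values only (in days); not taken from the study.
mean_days, var_days = 30.0, 300.0
shape, scale = gamma_shape_scale(mean_days, var_days)
times = sample_inter_event_times(mean_days, var_days, n=20000)
sample_mean = statistics.fmean(times)

# Exponential special case: variance equal to mean squared gives shape 1.
exp_shape, exp_scale = gamma_shape_scale(mean_days, mean_days ** 2)
```

Working in terms of mean and variance keeps the two distributions directly comparable, since only the variance changes between them when the mean is held fixed.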

Appendix 1—figure 3
The relationship between the parasitemia level of the individual (measured in parasites per µl) and (A) the number of non-upsA var genes per isolate/individual, or (B) MOI estimates from the Bayesian formulation of the varcoding method.

There is a lack of association between the x- and y-axis variables among both untreated and antimalarial drug-treated individuals. We scale the parasitemia levels and the number of non-upsA var genes or MOI estimates before performing the regression.

Appendix 1—figure 4
Schematic illustration of (A) systems in queuing theory and (B) malaria transmission.
Appendix 1—figure 5
The shape of the negative log likelihood for (A) a simulation run (pre-IRS) with Gamma-distributed times between local transmission events in a seasonal, semi-open system with heterogeneous exposure risk, and (B) Ghana pre-IRS surveys (Survey 1 and 2) with c = 30 and mid PCR detectability.

We remove the infinite and extremely large values of the negative log likelihood, and plot the rest to improve visualization.

Appendix 1—figure 6
The impact of grid value choices on the results of FOI inference in either simulated outputs or Ghana data.

By further reducing the grid width to include more combinations of the mean and variance values of inter-arrival times, the FOI inference results remain either unchanged or deviate by no more than 1% from those based on the original grid width.

Appendix 1—figure 7
True and estimated FOI by the two-moment and Little’s Law methods for additional simulated scenarios of homogeneous exposure risk.

The times between local transmission events are Gamma-distributed, with non-seasonal transmission in a closed system. The true mean FOI per host per year is calculated by dividing the total number of infections acquired by the population by the total number of hosts in the population. Confidence intervals are estimated from 200 bootstrap replicates using non-parametric bootstrap analysis. Each boxplot shows minimum, 5% quantile, median, 95% quantile, and maximum values.
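For orientation, the standard queueing statement of Little's Law is L = λW: the mean number of items in a system equals the arrival rate times the mean time each item spends in it. Reading MOI as L, infection duration as W, and FOI as λ gives FOI = mean MOI / mean infection duration. The sketch below illustrates only this textbook relationship; the function name, the MOI sample, and the duration value are hypothetical, and the authors' implementation may differ in its details.

```python
def foi_from_littles_law(moi_values, mean_duration_years):
    """Arrival rate lambda = L / W, with L the mean MOI and W the mean duration."""
    if mean_duration_years <= 0:
        raise ValueError("mean_duration_years must be positive")
    mean_moi = sum(moi_values) / len(moi_values)
    return mean_moi / mean_duration_years


# Hypothetical MOI values for sampled hosts (illustrative only).
moi_sample = [0, 2, 1, 4, 3, 0, 2, 4]

# Hypothetical mean infection duration of half a year.
foi = foi_from_littles_law(moi_sample, mean_duration_years=0.5)
```

With a mean MOI of 2 and a mean duration of 0.5 years, the sketch returns 4 infections per host per year.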

Appendix 1—figure 8
True and estimated FOI by the two-moment and Little’s Law methods for additional simulated scenarios of heterogeneous exposure risk.

The times between local transmission events are Gamma-distributed, with non-seasonal transmission in a semi-open system. The true mean FOI per host per year is calculated by dividing the total number of infections acquired by the population by the total number of hosts in the population. Confidence intervals are estimated from 200 bootstrap replicates using non-parametric bootstrap analysis. Each boxplot shows minimum, 5% quantile, median, 95% quantile, and maximum values.

Appendix 1—figure 9
True and estimated FOI by the two-moment and Little’s Law methods for additional simulated scenarios of homogeneous exposure risk.

The times between local transmission events are Gamma-distributed, with seasonal transmission in a regionally open system. The true mean FOI per host per year is calculated by dividing the total number of infections acquired by the population by the total number of hosts in the population. Confidence intervals are estimated from 200 bootstrap replicates using non-parametric bootstrap analysis. Each boxplot shows minimum, 5% quantile, median, 95% quantile, and maximum values.

Appendix 1—figure 10
True and estimated FOI by the two-moment and Little’s Law methods for additional simulated scenarios of homogeneous exposure risk.

The times between local transmission events are Gamma-distributed, with non-seasonal transmission in a regionally open system. The true mean FOI per host per year is calculated by dividing the total number of infections acquired by the population by the total number of hosts in the population. Confidence intervals are estimated from 200 bootstrap replicates using non-parametric bootstrap analysis. Each boxplot shows minimum, 5% quantile, median, 95% quantile, and maximum values.

Appendix 1—figure 11
True and estimated FOI by the two-moment and Little’s Law methods for additional simulated scenarios of homogeneous exposure risk.

The times between local transmission events follow an exponential distribution, with seasonal transmission in a closed system. The true mean FOI per host per year is calculated by dividing the total number of infections acquired by the population by the total number of hosts in the population. Confidence intervals are estimated from 200 bootstrap replicates using non-parametric bootstrap analysis. Each boxplot shows minimum, 5% quantile, median, 95% quantile, and maximum values.

Appendix 1—figure 12
As in Figure 1, we present confidence intervals for the estimated mean FOI values; all aspects of the simulation setup are identical except that infections are allowed to clear stochastically before full repertoire exhaustion.

Specifically, while any var gene, whether non-final or final, is being expressed, there is a small probability of infection clearance that depends on the host’s pre-existing immunity to that gene’s epitopes (Appendix 1—Simulation data, subsection ‘An extended var model,’ sub-subsection ‘Within-host dynamics’).

Appendix 1—figure 13
As in Figure 2, we present confidence intervals for the estimated mean FOI values; all aspects of the simulation setup are identical except that infections are allowed to clear stochastically before full repertoire exhaustion.

Specifically, while any var gene, whether non-final or final, is being expressed, there is a small probability of infection clearance that depends on the host’s pre-existing immunity to that gene’s epitopes (Appendix 1—Simulation data, subsection ‘An extended var model,’ sub-subsection ‘Within-host dynamics’).

Appendix 1—figure 14
As in Appendix 1—figure 7, we present confidence intervals for the estimated mean FOI values; all aspects of the simulation setup are identical except that infections are allowed to clear stochastically before full repertoire exhaustion.

Specifically, while any var gene, whether non-final or final, is being expressed, there is a small probability of infection clearance that depends on the host’s pre-existing immunity to that gene’s epitopes (Appendix 1—Simulation data, subsection ‘An extended var model,’ sub-subsection ‘Within-host dynamics’).

Appendix 1—figure 15
As in Appendix 1—figure 8, we present confidence intervals for the estimated mean FOI values; all aspects of the simulation setup are identical except that infections are allowed to clear stochastically before full repertoire exhaustion.

Specifically, while any var gene, whether non-final or final, is being expressed, there is a small probability of infection clearance that depends on the host’s pre-existing immunity to that gene’s epitopes (Appendix 1—Simulation data, subsection ‘An extended var model,’ sub-subsection ‘Within-host dynamics’).

Appendix 1—figure 16
As in Appendix 1—figure 9, we present confidence intervals for the estimated mean FOI values; all aspects of the simulation setup are identical except that infections are allowed to clear stochastically before full repertoire exhaustion.

Specifically, while any var gene, whether non-final or final, is being expressed, there is a small probability of infection clearance that depends on the host’s pre-existing immunity to that gene’s epitopes (Appendix 1—Simulation data, subsection ‘An extended var model,’ sub-subsection ‘Within-host dynamics’).

Appendix 1—figure 17
As in Appendix 1—figure 10, we present confidence intervals for the estimated mean FOI values; all aspects of the simulation setup are identical except that infections are allowed to clear stochastically before full repertoire exhaustion.

Specifically, while any var gene, whether non-final or final, is being expressed, there is a small probability of infection clearance that depends on the host’s pre-existing immunity to that gene’s epitopes (Appendix 1—Simulation data, subsection ‘An extended var model,’ sub-subsection ‘Within-host dynamics’).

Appendix 1—figure 18
As in Appendix 1—figure 11, we present confidence intervals for the estimated mean FOI values; all aspects of the simulation setup are identical except that infections are allowed to clear stochastically before full repertoire exhaustion.

Specifically, while any var gene, whether non-final or final, is being expressed, there is a small probability of infection clearance that depends on the host’s pre-existing immunity to that gene’s epitopes (Appendix 1—Simulation data, subsection ‘An extended var model,’ sub-subsection ‘Within-host dynamics’).

Appendix 1—figure 19
Estimated standard deviation of the inter-arrival times using the two-moment approximation method across different simulation scenarios and field data from Bongo District, Ghana.
Appendix 1—figure 20
Comparison of the distribution of infection durations among naive hosts during the pre-IRS phase in simulated seasonal, semi-open systems where times between local transmission events follow a Gamma distribution, versus historical clinical data from neurosyphilis patients treated with Plasmodium falciparum.

In the simulations, each infection can clear before all of its var genes have been expressed and recognized. Specifically, during the expression of any gene, whether non-final or final, there is a small probability of infection clearance that depends on the host’s pre-existing immunity to that gene’s epitopes (Appendix 1—Simulation data, subsection ‘An extended var model,’ sub-subsection ‘Within-host dynamics’).

Appendix 1—figure 21
As in Appendix 1—figure 20, we compare here the distribution of infection durations for the same simulation conditions with those from the historical clinical data, but show the results for children aged 1–5 years rather than naive hosts in the simulation.
Appendix 1—figure 22
Comparison of the distribution of infection durations among naive hosts during the pre-IRS phase in simulated non-seasonal, semi-open systems where times between local transmission events follow a Gamma distribution, versus historical clinical data from neurosyphilis patients treated with Plasmodium falciparum.

In the simulations, each infection can clear before all of its var genes have been expressed and recognized. Specifically, during the expression of any gene, whether non-final or final, there is a small probability of infection clearance that depends on the host’s pre-existing immunity to that gene’s epitopes.

Appendix 1—figure 23
As in Appendix 1—figure 22, we compare here the distribution of infection durations under the same simulation conditions with those from the historical clinical data, but show the results for children aged 1–5 years rather than naive hosts in the simulation.
Appendix 1—figure 24
Comparison of the distribution of infection durations among naive hosts during the pre-IRS phase in simulated non-seasonal, regionally open systems where times between local transmission events follow a Gamma distribution, versus historical clinical data from neurosyphilis patients treated with Plasmodium falciparum.

In the simulations, each infection can clear before all of its var genes have been expressed and recognized. Specifically, during the expression of any gene, whether non-final or final, there is a small probability of infection clearance that depends on the host’s pre-existing immunity to that gene’s epitopes.

Appendix 1—figure 25
As in Appendix 1—figure 24, we compare here the distribution of infection durations under the same simulation conditions with those from the historical clinical data, but show the results for children aged 1–5 years rather than naive hosts in the simulation.

Tables

Key resources table
Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information
Software, algorithm | R 3.6.1 | R Development Core Team (2019) | RRID:SCR_001905 |

Additional files

Supplementary file 1

Performance of the Bayesian varcoding method and bootstrap imputation under sampling limitations.

Cramér–von Mises and Anderson–Darling test statistics and p-values comparing estimated and true MOI distributions across sampling limitation scenarios. Most p-values are non-significant (p>0.05), indicating good agreement, with deviations primarily observed under certain high-transmission conditions.

https://cdn.elifesciences.org/articles/100076/elife-100076-supp1-v1.xlsx
Supplementary file 2

Tests for deviation from Poisson-distributed MOI estimates.

Results of statistical tests assessing deviations of MOI distributions from a Poisson distribution for both empirical survey data from Ghana and simulated outputs.

https://cdn.elifesciences.org/articles/100076/elife-100076-supp2-v1.xlsx
Supplementary file 3

Performance of the two proposed methods for inferring FOI.

Comparison of inferred and true FOI values for the two proposed inference methods based on bootstrap estimates from simulation outputs. For each scenario, we assess whether the true FOI lies within the bootstrap distribution of inferred values and compute the relative deviation, defined as the difference between the true FOI and the median of the bootstrap distribution, normalized by the true FOI.

https://cdn.elifesciences.org/articles/100076/elife-100076-supp3-v1.xlsx
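The two checks described for Supplementary file 3 can be sketched as follows. The relative deviation follows the definition given there, (true FOI − bootstrap median) / true FOI. Interpreting "lies within the bootstrap distribution" as falling between the minimum and maximum bootstrap values is an assumption made here, and the function name and numbers are hypothetical.

```python
import statistics


def assess_foi_inference(true_foi, bootstrap_estimates):
    """Return (within_range, relative_deviation) for one scenario."""
    median = statistics.median(bootstrap_estimates)
    # Assumption: "within the bootstrap distribution" means between min and max.
    within_range = min(bootstrap_estimates) <= true_foi <= max(bootstrap_estimates)
    # Difference between true FOI and bootstrap median, normalized by true FOI.
    relative_deviation = (true_foi - median) / true_foi
    return within_range, relative_deviation


# Hypothetical true FOI and bootstrap estimates (illustrative only).
within_range, relative_deviation = assess_foi_inference(
    4.0, [3.0, 3.5, 3.6, 3.8, 4.2]
)
```

A positive relative deviation under this sign convention means the bootstrap median underestimates the true FOI.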
Supplementary file 4

Ranges and grid values of mean and variance parameters for inter-arrival times used in likelihood-based estimation.

https://cdn.elifesciences.org/articles/100076/elife-100076-supp4-v1.xlsx
Supplementary file 5

Parameters and values used in the agent-based malaria transmission model for generating simulation outputs.

https://cdn.elifesciences.org/articles/100076/elife-100076-supp5-v1.xlsx
Supplementary file 6

Skewness of bootstrap distributions of FOI estimates.

Skewness values of bootstrap distributions of FOI estimates across simulated scenarios.

https://cdn.elifesciences.org/articles/100076/elife-100076-supp6-v1.xlsx
Supplementary file 7

Comparison of the original varcoding method and its Bayesian formulation for MOI estimation.

Results of Cramér–von Mises and Anderson–Darling tests comparing MOI estimates from the original varcoding method and its Bayesian formulation against true MOI values, as well as against each other. The Bayesian formulation shows improved agreement with true MOI, particularly in moderate- and high-transmission settings.

https://cdn.elifesciences.org/articles/100076/elife-100076-supp7-v1.xlsx
MDAR checklist
https://cdn.elifesciences.org/articles/100076/elife-100076-mdarchecklist1-v1.docx

Qi Zhan, Kathryn E Tiedje, Karen P Day, Mercedes Pascual (2026)
From multiplicity of infection to force of infection in sparsely sampled high-transmission Plasmodium falciparum populations
eLife 13:RP100076.
https://doi.org/10.7554/eLife.100076.4