Nationally-representative serostudy of dengue in Bangladesh allows generalizable disease burden estimates

  1. Henrik Salje (corresponding author)
  2. Kishor Kumar Paul
  3. Repon Paul
  4. Isabel Rodriguez-Barraquer
  5. Ziaur Rahman
  6. Mohammad Shafiul Alam
  7. Mahmadur Rahman
  8. Hasan Mohammad Al-Amin
  9. James Heffelfinger
  10. Emily Gurley
  1. Institut Pasteur, UMR2000, CNRS, France
  2. Johns Hopkins Bloomberg School of Public Health, United States
  3. International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b), Bangladesh
  4. University of California, San Francisco, United States
  5. Institute of Epidemiology, Disease Control and Research (IEDCR), Bangladesh
  6. Centers for Disease Control and Prevention, United States
Research Article
Cite this article as: eLife 2019;8:e42869 doi: 10.7554/eLife.42869

Abstract

Serostudies are needed to answer generalizable questions on disease risk. However, recruitment is usually biased by age or location. We present a nationally-representative study for dengue from 70 communities in Bangladesh. We collected data on risk factors, trapped mosquitoes and tested serum for IgG. Out of 5866 individuals, 24% had evidence of historic infection, ranging from 3% in the north to >80% in Dhaka. Being male (aOR:1.8, [95%CI:1.5–2.0]) and recent travel (aOR:1.3, [1.1–1.8]) were linked to seropositivity. We estimate that 40 million [34.3–47.2] people have been infected nationally, with 2.4 million ([1.3–4.5]) annual infections. Had we visited only 20 communities, seropositivity estimates would have ranged from 13% to 37%, highlighting the lack of representativeness generated by small numbers of communities. Our findings have implications for both the design of serosurveys and tackling dengue in Bangladesh.

https://doi.org/10.7554/eLife.42869.001

eLife digest

Dengue is a mosquito-borne virus that infects millions of people each year. Often the countries most affected by the virus, such as Bangladesh, do not have the resources needed to tackle the disease. For resources sent to these countries to have the greatest impact, it is important to know which areas are most affected, and which subsets of the population are most at risk. A way to gather this information is to test the blood of individuals for dengue virus antibodies, which are proteins produced by the immune system in response to the infection. However, previous efforts to use these tests to understand dengue risk in communities have generally only been done in single locations, typically a major city, and the findings of these tests are unlikely to be applicable to the wider population.

Now, Salje et al. have visited 70 different communities from all around Bangladesh and used these tests on blood samples collected from over 5,000 individuals from a range of age-groups. From these measurements it was estimated that an average 2.4 million people are infected with dengue each year in Bangladesh, with major cities, such as Dhaka, experiencing more concentrated levels. The exposure to dengue outside major cities was much lower, and men, who tend to travel more, were found to be at greater risk of infection.

Salje et al. also showed that using a small number of communities to estimate national levels of infection led to misleading results. This highlights the danger of using information collected from a limited number of places to represent the effects of a disease on the wider population.

Public health agencies in Bangladesh will be able to use this information to tackle dengue more effectively, focusing on the areas and the populations most affected by the disease. In addition, the design and analytical approaches used in this study could be applied to other countries, and to different diseases.

https://doi.org/10.7554/eLife.42869.002

Introduction

There has been growing recognition of the utility of nationally representative serum banks to monitor the burden from infectious diseases in a population (Metcalf et al., 2016; Wilson et al., 2012; Osborne et al., 1997; Jardine et al., 2010; De Melker and Conyn-van Spaendonck, 1998; van der Klis et al., 2009). By tracking the levels of pathogen-specific antibodies in populations, these banks are a powerful tool for public health agencies to understand a wide range of factors that can assist in the fight against diseases, including pathogen circulation patterns, vaccination levels, and the existence of spatial pockets of susceptibility (De Melker and Conyn-van Spaendonck, 1998; Gidding et al., 2005; Osborne et al., 2000). To date, the accumulation and use of nationally representative banked sera have focused almost exclusively on vaccine preventable childhood infections. However, serum banks also have the potential to be invaluable in efforts to understand the burden from arboviruses and optimize efforts for control, including the targeted deployment of vaccines (Imai and Ferguson, 2018). In these diseases, high levels of subclinical infection and frequent clinical misclassification mean that even in locations with good disease surveillance, the underlying risk of infection is poorly understood (Halstead, 2007).

Studies that collect serum for the detection of pathogen-specific antibodies typically rely on either convenience samples (e.g., blood donations) or focus on single communities, often in major cities or where the perceived risk of the pathogen is believed to be greatest (Petersen et al., 2012; Salje et al., 2016a; Rodríguez-Barraquer et al., 2014). While they are often a good starting place for understanding historic pathogen circulation in the sampled communities and age-groups, the ability to generalize the results to the wider population is rarely known. By contrast, if communities are randomly chosen from across a country, we can estimate national incidence rates, the underlying spatial heterogeneity in burden, the spatial dependence across neighbouring communities and identify risk factors for infection. Such study designs therefore provide mechanistic insights into pathogen spread and facilitate the development of data-informed control policies.

Dengue virus is a flavivirus, transmitted by Aedes mosquitoes, that is found across tropical and subtropical regions and causes a range of disease manifestations, ranging from asymptomatic infection to death (Petersen et al., 2012). Transmission of arboviruses, such as dengue, appears to be driven by the interplay of individual- (e.g., sex, age, travel), household- (e.g., water supply, use of mosquito control) and community-level (e.g., urban/rural, mosquito abundance) factors (Salje et al., 2016b; Rodríguez-Barraquer et al., 2015). In order to make data-informed decisions on how best to control spread, we need to understand the relative importance of these different factors by collecting detailed data across these scales. A recent literature search found only one nationally representative dengue seroprevalence study, from Singapore, but there was only a subset of age-groups considered (Imai et al., 2015).

Outside of city states such as Singapore, Bangladesh is the most densely populated country in the world, with 160 million people living in an area under 150,000 km². The dengue burden in Bangladesh is unclear. Sporadic cases were reported in the 1960s and a major outbreak occurred in 2000 (Rahman et al., 2002; Sharmin et al., 2015; Yunus et al., 2001), with clinical cases reported annually since then (Government of the People’s Republic of Bangladesh, Ministry of Health and Family Welfare, 2017). However, our knowledge of dengue epidemiology in the country is largely restricted to Dhaka, where a seroprevalence of 80% has been observed (Dhar-Chowdhury et al., 2017), with the burden elsewhere unknown (Government of the People’s Republic of Bangladesh, Ministry of Health and Family Welfare, 2017). Here, we present the results of a study where we use sequential annual visits in randomly selected communities across Bangladesh to determine the burden of dengue and identify key risk factors for infection.

Materials and methods

Community and household selection

We randomly selected 70 communities from the 97,162 communities in the national census, where the probability of selection was proportional to the size of the community population. In rural locations (around three-quarters of the country), these census-communities consist of villages, whereas in urban places, these communities are city wards. Study teams visited each of the selected communities at least twice, once during the period 08/2014-12/2014 (Y1) and once during the period 10/2015-01/2016 (Y2) to conduct interviews, collect serum and trap mosquitoes. A further visit was conducted in 06/2015-07/2015 in a subset of communities for additional mosquito collection only. For each visit, the study team spent at least 5 days in the community. In an attempt to select households randomly, the study staff identified the house where the most recent wedding had taken place and identified the closest neighbour. They then counted six households in a random direction to identify the first household for the study. To select each additional household for the study, they used the previous household as a starting point and counted six households in a random direction. Different households were selected in each visit. For selected households, the household head was informed of the study and invited to participate. If the household head was away during the first visit, the study team returned at a later time. If the household head agreed to participate, all household residents over the age of 6 months were also invited to participate. Residents were offered a test to determine their blood group as a benefit of participating. If some members of the household agreed and some refused, all consenting members were included in the study. Where some household members were not present at the time of the visit, study staff organised a time to come back. 
Data collection for a community was considered to be complete when at least 40 serum samples from at least 10 households had been collected. There were three elements to data collection: (A) questionnaires (B) serum collection and (C) mosquito collection.
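
The probability-proportional-to-size selection described above can be sketched as follows. This is a minimal illustration that assumes systematic PPS sampling over a census list; the text does not state which PPS algorithm was used, and the function name and community sizes below are hypothetical.

```python
import random
from itertools import accumulate


def pps_systematic_sample(populations, n_select, rng):
    """Select n_select unit indices with probability proportional to size.

    Systematic PPS (an assumed algorithm, not confirmed by the source):
    lay the units end to end on a line of length sum(populations), pick a
    random start within the first sampling interval, then take the unit
    lying under each subsequent equally spaced point. A unit larger than
    the interval can be selected more than once.
    """
    total = sum(populations)
    step = total / n_select
    start = rng.uniform(0, step)
    cumulative = list(accumulate(populations))
    selected, unit = [], 0
    for k in range(n_select):
        target = start + k * step
        # Advance to the first unit whose cumulative size reaches the target.
        while unit < len(cumulative) - 1 and cumulative[unit] < target:
            unit += 1
        selected.append(unit)
    return selected


rng = random.Random(42)
# Hypothetical census list: 1000 communities with made-up population sizes.
census_sizes = [rng.randint(200, 20000) for _ in range(1000)]
chosen = pps_systematic_sample(census_sizes, 70, rng)
print(len(chosen), "communities selected")
```

Under this scheme larger communities are more likely to be drawn, so sampling a fixed number of people per community gives each person in the country a roughly equal chance of inclusion.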

Questionnaires

Each participant was led through a questionnaire. Where individuals were too young to answer, older individuals from the household answered for them. We asked a series of questions on demographics (age, sex), whether they had ever been diagnosed with dengue and whether they had travelled outside of their community in the prior 7 days, 30 days or 6 months. In addition, the head of the household was asked to complete a separate questionnaire, which included questions about their education level, total household income, household utilities (e.g., access to electricity and clean water), whether they had used any form of mosquito control in the last week and whether they owned land away from the household.

Serum sample collection and testing

A phlebotomist collected 5 ml of venous blood from all individuals who gave consent. Individuals who were sick at the time were ineligible. These samples were centrifuged in the field and the serum extracted into separate vials before being shipped in nitrogen dry shippers to the laboratories of icddr,b (previously known as the International Centre for Diarrhoeal Disease Research, Bangladesh) in Dhaka. The samples were tested for IgG antibodies against dengue virus, which indicate historic infection, using PanBio indirect IgG ELISAs (Alere Inc, Massachusetts, USA).

Mosquito collection

During the first visit in 2014, BG Sentinel traps (Biogents AG, Germany) were placed in eight randomly sampled households in each of the 70 participating communities. The traps were placed in the main living area of the households and collected after 24 hr, and all mosquitoes were sent to icddr,b laboratories in Dhaka, where an entomologist identified the species of each captured mosquito. To help ensure that communities where no Aedes mosquitoes were found during the initial trapping truly had no Aedes, trapping was repeated in these communities from June 2015 to July 2015: eight households were randomly selected to have sentinel traps placed in their homes for 24 hr, and the traps and mosquitoes were processed in the same way as in the initial round.

Understanding risk factors for infection using regression analysis

We divided the covariates into individual-level (age, sex, travel patterns), household-level (household income, electricity in household, access to water in household, mosquito control) and community-level (Ae. albopictus/Ae. aegypti in community, log population size) categories. For each covariate, we initially performed simple logistic regression to explore associations with dengue serostatus using a hierarchical model with random intercepts for household and community. We accounted for spatial correlation structure using a Matern covariance function using a stochastic partial differential equation and fitted the models using integrated nested Laplace approximations (INLA) in a Bayesian framework (Lindgren et al., 2011). All covariates were then included in a multivariable analysis. As the probability of being seropositive is strongly linked to the past circulation of dengue, we also performed a sensitivity model in which we recalculated the regressions using only individuals > 20 years as seropositivity was not found to differ by age among older adults. Finally, we assessed the importance of using spatial correlation structure by calculating the coefficients in a separate regression that did not include a spatial covariance matrix.
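
As a point of reference for the odds ratios produced by these models, the sketch below computes a crude odds ratio with a Woolf (log-scale) 95% confidence interval from a 2×2 table, using the counts by sex reported in Table 1. It is not the hierarchical spatial model described above: it ignores household and community clustering and spatial correlation, so its value is expected to differ from the model-based estimates. The function name is illustrative.

```python
import math


def crude_odds_ratio(pos_exposed, neg_exposed, pos_unexposed, neg_unexposed):
    """Crude odds ratio and Woolf 95% confidence interval from a 2x2 table."""
    odds_ratio = (pos_exposed * neg_unexposed) / (neg_exposed * pos_unexposed)
    # Standard error of log(OR) under the Woolf approximation.
    se_log = math.sqrt(
        1 / pos_exposed + 1 / neg_exposed + 1 / pos_unexposed + 1 / neg_unexposed
    )
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log)
    return odds_ratio, lower, upper


# Table 1 counts: seropositive and seronegative males, then females.
or_male, ci_low, ci_high = crude_odds_ratio(761, 2060, 642, 2402)
print(f"crude OR for male sex: {or_male:.2f} ({ci_low:.2f}-{ci_high:.2f})")
```

Random intercepts estimate associations conditional on household and community, which is one reason model-based odds ratios can differ from a crude value such as this one.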

Mapping the risk of dengue across Bangladesh

To explore the variability in dengue risk across Bangladesh, we initially placed a 5 km × 5 km grid over the country and estimated the population size in each of those grid cells using data available from worldpop.org (Tatem, 2017). We then fit a multivariable model using the data from our sampled locations using log(population size), age category (<10 y, 11–20 y, 21–30 y, 31–40 y, 41–50 y, 51–60 y, >60 y) and sex as covariates, which represent variables that are either available for all the grid cells (population size) or where we can use the overall proportion of the population that is within each category (age and sex) from the national census. As above, we fit the model in a Bayesian framework with a Matern spatial correlation structure using integrated nested Laplace approximations. We used the fitted model to predict in the unsampled grid cells by drawing 1000 samples from the posterior for each grid cell and calculated the mean as well as 2.5% and 97.5% quantiles to quantify uncertainty. The estimated number of seropositive individuals in a cell was calculated by multiplying the estimated proportion seropositive in a cell by the population within that cell. The total number of seropositive individuals in the country was calculated as the sum of seropositive individuals across all the grid cells. As a sensitivity analysis, we also predicted the spatial distribution of dengue seropositivity in the country using a model with the Matern spatial covariance matrix only (i.e., without any covariates).
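
The post-processing step described above (summarising posterior draws per grid cell, then multiplying by cell population and summing) can be sketched as follows. This is only an illustration of the arithmetic: the posterior draws here are simulated from made-up beta distributions rather than produced by the fitted spatial model, and the cell populations and function name are hypothetical.

```python
import random
import statistics


def national_seropositive(draws_by_cell, population_by_cell):
    """Summarise cell-level posterior draws and total the seropositive count.

    draws_by_cell[c] holds posterior draws of the proportion seropositive in
    grid cell c; population_by_cell[c] is that cell's population. Returns a
    list of (mean, 2.5% quantile, 97.5% quantile) per cell and the national
    total obtained by summing mean proportion x population over cells.
    """
    summaries = []
    for draws in draws_by_cell:
        # n=40 gives cut points every 2.5%; first and last are 2.5% and 97.5%.
        cuts = statistics.quantiles(draws, n=40, method="inclusive")
        summaries.append((statistics.fmean(draws), cuts[0], cuts[-1]))
    total = sum(mean * pop for (mean, _, _), pop in zip(summaries, population_by_cell))
    return summaries, total


rng = random.Random(1)
populations = [12000, 450000, 3000]  # hypothetical cell populations
# Hypothetical posteriors: 1000 draws per cell, as in the text.
draws = [
    [rng.betavariate(a, b) for _ in range(1000)]
    for a, b in [(5, 45), (80, 20), (2, 60)]
]
cell_summaries, total_seropositive = national_seropositive(draws, populations)
print(round(total_seropositive))
```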

We assessed the ability of different model formulations to accurately predict the level of seropositivity in unsampled locations. We considered four different models: (i) crude proportion seropositive, (ii) multivariable logistic model with sex, age-group and population size as covariates and Matern spatial covariance (the baseline model), (iii) multivariable logistic model with sex, age-group and population size as covariates but with no spatial dependence, and (iv) spatial dependence model using Matern spatial covariance with no covariates. For 100 iterations and for each model in turn, we repeatedly randomly selected a subset of communities to train the model (varied between 2, 20, 40, 60 and 69 communities) and predicted the seroprevalence in the remaining communities not used to fit the model. Separately, we considered the impact of having sampled fewer people per community. We reran each of the four models over repeated iterations using 50 randomly selected communities and with between 2 and 80 individuals sampled per community to train the models. We then estimated the seroprevalence in the 20 remaining communities.
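
For the simplest of the four models (the crude proportion seropositive), the repeated-subsampling procedure can be sketched as below. The community-level counts are simulated and purely hypothetical, chosen only to be strongly heterogeneous; the sketch does not reproduce the spatial or multivariable models, and the function name is illustrative.

```python
import random


def crude_range_by_subset(positives, sampled, n_communities, n_iter, rng):
    """Range of the crude seroprevalence over random subsets of communities.

    positives[i] and sampled[i] are the seropositive count and number of
    people tested in community i. Each iteration draws n_communities without
    replacement and pools their counts into a single crude proportion.
    """
    indices = list(range(len(sampled)))
    estimates = []
    for _ in range(n_iter):
        subset = rng.sample(indices, n_communities)
        pos = sum(positives[i] for i in subset)
        tot = sum(sampled[i] for i in subset)
        estimates.append(pos / tot)
    return min(estimates), max(estimates)


rng = random.Random(7)
sampled = [84] * 70  # hypothetical: 84 people tested in each of 70 communities
# Hypothetical community seroprevalences drawn from a skewed beta distribution.
positives = [round(84 * rng.betavariate(0.6, 1.9)) for _ in range(70)]
for k in (20, 40, 60):
    low, high = crude_range_by_subset(positives, sampled, k, 100, rng)
    print(f"{k} communities: {low:.2f} to {high:.2f}")
```

With heterogeneous communities, the spread of the pooled estimate narrows as more communities are included, which is the behaviour the sensitivity analysis is designed to quantify.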

Estimation of the force of infection using catalytic models

We used the probability of being seropositive as a function of age to estimate the proportion of the susceptible population that gets infected each year using catalytic models, an approach which has been used frequently to reconstruct the past circulation of pathogens (Salje et al., 2016a; Rodríguez-Barraquer et al., 2014; Imai et al., 2015; Ferguson et al., 1999). We assumed a constant force of infection due to all four serotypes, λ, and that there were no differences in risk by age.

The proportion seropositive at age a is given by $z(a) = 1 - \exp(-\lambda \times \min(a, N_{Years}))$, where $N_{Years}$ is the number of years prior to 2014 that dengue has circulated in Bangladesh. We fixed $N_{Years}$ at 20 to reflect the approximate period when dengue first appeared in the country. We conducted a sensitivity analysis where this was varied between 15 and 25 years. We estimated λ using maximum likelihood, where the contribution to the likelihood from a seronegative individual of age a in community i is $\exp(-\lambda \times \min(a, N_{Years})) \times 1/wt(comm_i)$, and the weights, $wt(comm_i)$, represent the proportion of that community that was sampled (number of people sampled in community i / population of community i). This approach was used to ensure that all individuals contributed equally to the likelihood. The contribution to the likelihood from a seropositive individual is $(1 - \exp(-\lambda \times \min(a, N_{Years}))) \times 1/wt(comm_i)$.
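
A minimal sketch of fitting this catalytic model is given below. It makes two assumptions that are not stated in the text: the inverse sampling weights are applied to each individual's log-likelihood contribution (a weighted pseudo-likelihood), and λ is found by a golden-section search rather than by whatever optimiser the authors used. The data are simulated and all function names are illustrative.

```python
import math
import random

N_YEARS = 20  # years of dengue circulation assumed in the text


def prob_seropositive(age, lam, n_years=N_YEARS):
    """z(a) = 1 - exp(-lambda * min(a, NYears))."""
    return 1.0 - math.exp(-lam * min(age, n_years))


def weighted_log_likelihood(lam, records, n_years=N_YEARS):
    """records: iterable of (age, seropositive, sampling_fraction).

    Assumption: each log-likelihood term is weighted by 1/sampling_fraction.
    """
    total = 0.0
    for age, seropositive, sampling_fraction in records:
        z = prob_seropositive(age, lam, n_years)
        p = z if seropositive else 1.0 - z
        total += math.log(max(p, 1e-300)) / sampling_fraction
    return total


def fit_force_of_infection(records, lo=1e-6, hi=1.0, tol=1e-8):
    """Golden-section search for the lambda that maximises the log-likelihood."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if weighted_log_likelihood(c, records) > weighted_log_likelihood(d, records):
            b = d  # maximum lies in [a, d]
        else:
            a = c  # maximum lies in [c, b]
    return (a + b) / 2


# Simulated data: hypothetical ages, a known force of infection, equal weights.
rng = random.Random(3)
true_lam = 0.016
records = []
for _ in range(5000):
    age = rng.uniform(1, 70)
    records.append((age, rng.random() < prob_seropositive(age, true_lam), 0.05))
lam_hat = fit_force_of_infection(records)
print(f"estimated force of infection: {lam_hat:.4f}")
```

Because exposure is capped at NYears, everyone older than 20 has the same expected seropositivity under this model, which is why the fitted age curve is flat among adults.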

We calculated the force of infection for the entire sampled population as well as separate estimates by sex and for the locations from the three largest cities (Dhaka, Chittagong and Khulna) only versus the rest of the country.

Estimation of the number of infected individuals per year

To estimate the number of people infected each year we used the estimated population by age for each year for the period 1995–2014 (Kinsella and He, 2009). We assumed that in 1995, the entire population was susceptible. The proportion of the population that has monotypic immunity is calculated as $w(a,y) = 4 \times \exp(-3 \lambda_s a^*) \times (1 - \exp(-\lambda_s a^*))$, where $\lambda_s$ is the serotype-specific force of infection, calculated as $\lambda/4$, and $a^*$ is the number of years an individual has been alive since the introduction, calculated as $\min(a, y - 1995)$. Similarly, the proportion of the population that has previously been infected with two serotypes is $w_2(a,y) = 6 \times \exp(-2 \lambda_s a^*) \times (1 - \exp(-\lambda_s a^*))^2$ and the proportion previously infected with three serotypes is $w_3(a,y) = 4 \times \exp(-\lambda_s a^*) \times (1 - \exp(-\lambda_s a^*))^3$. Using these proportions we can calculate the number of primary, secondary, tertiary and quaternary dengue infections: $N(a,y) \times 4 \lambda_s \times \exp(-4 \lambda_s a^*)$ is the number of primary infections, $N(a,y) \times 3 \lambda_s \times w(a,y)$ the number of secondary infections, $N(a,y) \times 2 \lambda_s \times w_2(a,y)$ the number of tertiary infections and $N(a,y) \times \lambda_s \times w_3(a,y)$ the number of quaternary infections, where $N(a,y)$ is the size of the population of age group a in year y. We present the estimated total number of infections across primary, secondary, tertiary and quaternary infections.
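
The bookkeeping in this section can be sketched in a few lines. The proportions with zero to four past infections follow a binomial distribution over the four serotypes, and a person with k past infections remains at risk from 4 − k serotypes. The function names and the example inputs are illustrative, not taken from the study data.

```python
import math


def immunity_profile(lam, years_exposed):
    """Proportions of an age group with 0, 1, 2, 3 or 4 past infections.

    lam is the total force of infection across four serotypes; each serotype
    is assumed to circulate with force lam / 4, as in the text.
    """
    lam_s = lam / 4
    p = 1 - math.exp(-lam_s * years_exposed)  # infected by a given serotype
    q = 1 - p
    return [math.comb(4, k) * p**k * q ** (4 - k) for k in range(5)]


def annual_infections(population, lam, years_exposed):
    """Expected primary, secondary, tertiary and quaternary infections."""
    lam_s = lam / 4
    profile = immunity_profile(lam, years_exposed)
    # k past infections leaves 4 - k serotypes that can still infect.
    return [population * (4 - k) * lam_s * profile[k] for k in range(4)]


# Illustrative inputs: 100,000 people aged 30 in 2014, so a* = min(30, 2014 - 1995).
years_exposed = min(30, 2014 - 1995)
by_order = annual_infections(100_000, 0.016, years_exposed)
print([round(x) for x in by_order], round(sum(by_order)))
```

Here `profile[1]`, `profile[2]` and `profile[3]` correspond to w, w2 and w3 in the text, and `profile[0]` to the fully susceptible proportion exp(−4 × λs × a*).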

Ethical review

This study was approved by the icddr,b ethical review board (protocol number PR-14058). The U.S. Centers for Disease Control and Prevention relied on icddr,b’s ethical review board approval. All adult participants provided written, informed consent after receiving detailed explanation of the study and procedures. Parents/guardians of all child participants were asked to provide written, informed consent on their behalf.

Results

In total, 5866 individuals fully participated (completed questionnaire and had blood taken) in our study across 70 communities, 2911 during August–December 2014 and 2955 during October 2015 – January 2016 (Table 1). We obtained serum from 76% of household members of participating households (Figure 1—figure supplement 1). Per community there were an average of 84 participants (range 81–116) from 20 households (range 20–23). The age and sex distributions in the study largely matched those obtained by the 2011 census, although we had some under-representation in those <10 years (Figure 1—figure supplement 2). The PanBio assay appeared to discriminate well between those with and without past dengue infection (Figure 1—figure supplement 3). We found that overall, 24% of individuals had evidence of a past infection. We observed substantial heterogeneity by age and sex (Figure 1B), with 27% of males seropositive compared to 21% of females (p-value<0.001). Individuals > 20 y had 30% seropositivity compared to 14% in those under 20 y. There was close correlation between the proportion seropositive in a community in 2014 and in 2015 (Pearson correlation of 0.92) (Figure 1C). Overall, there was no observed difference in seropositivity across the two years of the study (p=0.66). While most of the study population (91%) aged >10 y had heard of dengue, only 38 individuals (0.6%) reported having had dengue, of whom only 16 had evidence of past infection.

Figure 1. Dengue seropositivity in the sampled communities.

(A) Locations of sampled communities and the estimated seroprevalence by community. (B) Proportion seropositive by age and sex with 95% confidence intervals. (C) Seropositivity in Y1 compared to seropositivity in Y2 for each community.

https://doi.org/10.7554/eLife.42869.003
Table 1
Individual-, household- and community-level characteristics of participants, stratified by serostatus to dengue.
https://doi.org/10.7554/eLife.42869.007
| Characteristic | Serum obtained (N = 5,866), n | Seropositive (N = 1,403), n (%) | Seronegative (N = 4,463), n (%) |
|---|---|---|---|
| Individual level | | | |
| Year of study | | | |
| 2014 | 2911 | 704 (24) | 2207 (76) |
| 2015 | 2955 | 699 (24) | 2256 (76) |
| Age group (years) in 2014: | | | |
| <10 | 832 | 88 (11) | 744 (89) |
| 11–20 | 1402 | 228 (16) | 1174 (84) |
| 21–30 | 1062 | 314 (30) | 748 (70) |
| 31–40 | 818 | 261 (32) | 557 (68) |
| 41–50 | 679 | 197 (29) | 482 (71) |
| 51–60 | 541 | 144 (27) | 397 (73) |
| >60 | 525 | 171 (33) | 354 (67) |
| Sex: | | | |
| Male | 2821 | 761 (27) | 2060 (73) |
| Female | 3044 | 642 (21) | 2402 (79) |
| Heard of dengue: | | | |
| No | 772 | 115 (15) | 657 (85) |
| Yes | 5093 | 1288 (25) | 3805 (75) |
| Reported having had dengue: | | | |
| No | 5827 | 1387 (24) | 4440 (76) |
| Yes | 38 | 16 (42) | 22 (58) |
| Last time left community: | | | |
| <7 days | 773 | 269 (35) | 504 (65) |
| 7 d–1 month | 1198 | 311 (26) | 887 (74) |
| 1–6 months | 1142 | 285 (25) | 857 (75) |
| >6 months | 2753 | 538 (20) | 2215 (80) |
| Household level | | | |
| Electricity in home | | | |
| No | 780 | 149 (19) | 631 (81) |
| Yes | 5075 | 1253 (25) | 3822 (75) |
| Access to water in home* | | | |
| No | 301 | 125 (42) | 176 (58) |
| Yes | 2599 | 578 (22) | 2021 (78) |
| Own home | | | |
| No | 444 | 239 (54) | 205 (46) |
| Yes | 5408 | 1161 (21) | 4247 (79) |
| Own land away from home | | | |
| No | 1303 | 373 (29) | 930 (71) |
| Yes | 4552 | 1029 (23) | 3523 (77) |
| Mosquito control used | | | |
| No | 2185 | 512 (23) | 1673 (77) |
| Yes | 3670 | 890 (24) | 2780 (76) |
| Household head education: | | | |
| No education | 1034 | 306 (30) | 728 (70) |
| Primary school | 1574 | 362 (23) | 1212 (77) |
| High school | 1407 | 320 (23) | 1087 (77) |
| Higher | 1840 | 414 (22) | 1426 (78) |
| Household income (Taka, 100 Taka = 1.2 USD): | | | |
| <7,000 | 921 | 212 (23) | 709 (77) |
| 7,000–9,999 | 1176 | 229 (19) | 947 (81) |
| 10,000–20,000 | 1980 | 509 (26) | 1471 (74) |
| >20,000 | 1766 | 452 (26) | 1314 (74) |
| Community level | | | |
| Aedes aegypti mosquitoes captured | | | |
| No | 3931 | 668 (17) | 3263 (83) |
| Yes | 1935 | 735 (38) | 1200 (62) |
| Aedes albopictus mosquitoes captured | | | |
| No | 3416 | 941 (28) | 2475 (72) |
| Yes | 2450 | 462 (19) | 1988 (81) |
| Type of community | | | |
| Urban | 1505 | 557 (37) | 948 (63) |
| Rural | 4361 | 846 (19) | 3515 (81) |
| Division: | | | |
| Dhaka | 1484 | 407 (27) | 1077 (73) |
| Chittagong | 1533 | 382 (25) | 1151 (75) |
| Barisal | 329 | 68 (21) | 261 (79) |
| Khulna | 672 | 302 (45) | 370 (55) |
| Rajshahi | 668 | 152 (23) | 516 (77) |
| Rangpur | 920 | 80 (9) | 840 (91) |
| Sylhet | 260 | 12 (5) | 248 (95) |

While all communities had at least one seropositive individual, there was substantial spatial heterogeneity across the country, with the proportion seropositive ranging from 3% in rural Maulvibazar in Sylhet Division to 88% in urban Chittagong. Communities in the north of the country appeared largely unaffected. Communities in the northern division of Rangpur had a mean seropositivity of 9% compared to 45% for communities in Khulna division in the southwest. Even within Dhaka district (which includes the capital and has the highest population density), where we visited three urban (‘Thana’) communities, there was substantial heterogeneity, with seropositivity ranging from 36 to 85%. The two urban communities we visited in the city of Chittagong had seropositivities of 84 and 88%.

We found that several individual-level variables were associated with seropositivity (Table 2). In particular, males were much more likely to be seropositive (odds ratio [OR]: 1.6 [95%CI: 1.4–1.9]; adjusted odds ratio [aOR]: 1.7 [1.5–2.0]), although this difference was concentrated in communities where overall seropositivity was <20% (Figure 2—figure supplement 1). Travel also appeared important, with those who had travelled in the prior 7 days having nearly twice the unadjusted odds of being seropositive compared to those that had not travelled in the prior 6 months (OR: 1.9 [1.5–2.4]; aOR: 1.4 [1.1–1.8]). Household-level covariates did not appear to be important in determining risk of seropositivity, including having household electricity, household access to clean water, land ownership or household income. The use of mosquito control in the household was also not associated with seropositivity (OR: 0.9 [0.8–1.1]; aOR: 0.9 [0.7–1.1]). At the community level, we found some evidence that individuals living in locations where we had found Ae. aegypti were more likely to be seropositive, although this effect was weaker in the multivariable model (OR 1.8 [1.2–2.8], aOR 1.4 [0.9–2.2]). Having Ae. albopictus in the community was not linked to individual serostatus (OR 1.1 [0.7–1.6], aOR 1.0 [0.7–1.6]). Overall, Ae. aegypti was found in 23 (33%) and Ae. albopictus in 29 (41%) of communities, with a slight negative correlation between the two (Pearson correlation of −0.2, p-value 0.07). The median seropositivity in communities with Ae. aegypti was 33% compared to 13% in the other communities. The median seropositivity in communities with Ae. albopictus was 15% compared to 18% in communities where it was not found. Individuals living in urban communities were more likely to be seropositive than those living in rural communities, with each unit increase in log population size associated with 1.3 times higher odds of being seropositive (95% CI: 1.2–1.5). 
The intraclass correlation coefficients showed that the Matern spatial covariance matrix explained 15% of the variance, the community-level random effects explained 6% and the household random intercept explained 12% of the variance in individual level responses. In a model without the spatial covariance matrix, the community-level random intercept explained 23% of the variance with the household-level effect unchanged. Including spatial covariance was associated with a small improvement in model fit, justifying its inclusion (Deviance Information Criterion [DIC] difference of 4). While most of the coefficient estimates were largely consistent in models that did and did not include the spatial covariance structure, the impact of Ae. aegypti changed significantly, increasing to aOR 2.4 (95% CI: 1.3–4.5) when spatial correlation was not incorporated (Figure 2—figure supplement 2). Not including random intercepts by household and community resulted in falsely narrow confidence intervals and some changes in coefficient estimates and a substantial drop in model fit (DIC difference of 801). Coefficients of models where the data was restricted to adults only were largely unchanged.

Table 2
Regression results.
https://doi.org/10.7554/eLife.42869.008
| Variable | Unadjusted odds ratio (95% confidence interval) | Multivariable adjusted odds ratio (95% confidence interval) |
|---|---|---|
| Individual level | | |
| Year of study (vs Y1) | 1.0 (0.8–1.1) | 1.0 (0.9–1.2) |
| Age group (years) in 2014: | | |
| <10 | Ref | Ref |
| 11–20 | 1.9 (1.4–2.6) | 2.0 (1.4–2.7) |
| 21–30 | 5.2 (3.8–7.1) | 5.5 (4.1–7.6) |
| 31–40 | 5.9 (4.3–8.1) | 6.2 (4.5–8.6) |
| 41–50 | 5.5 (4.0–7.7) | 5.8 (4.2–8.2) |
| 51–60 | 5.1 (3.6–7.1) | 5.1 (3.6–7.2) |
| >60 | 7.5 (5.4–10.6) | 7.7 (5.4–10.8) |
| Male | 1.6 (1.4–1.9) | 1.7 (1.5–2.0) |
| Last time left community: | | |
| <7 days | 1.9 (1.5–2.4) | 1.4 (1.1–1.8) |
| 7 d–1 month | 1.4 (1.2–1.7) | 1.2 (0.9–1.4) |
| 1–6 months | 1.1 (0.9–1.3) | 1.0 (0.8–1.2) |
| >6 months | Ref | Ref |
| Household level | | |
| Electricity in home | 1.0 (0.8–1.2) | 0.9 (0.7–1.2) |
| Water in home | 1.0 (0.7–1.4) | - (1) |
| Own home | 0.9 (0.7–1.3) | 0.9 (0.7–1.4) |
| Own land | 0.9 (0.8–1.1) | 0.9 (0.7–1.1) |
| Mosquito control used | 0.9 (0.8–1.1) | 0.9 (0.7–1.1) |
| Household head education: | | |
| No education | Ref | Ref |
| Primary school | 0.9 (0.7–1.1) | 0.9 (0.7–1.1) |
| High school | 0.9 (0.7–1.1) | 1.0 (0.7–1.2) |
| Higher | 0.8 (0.7–1.0) | 0.8 (0.6–1.0) |
| Household income (Taka): | | |
| <7,000 | Ref | Ref |
| 7,000–9,999 | 0.9 (0.7–1.2) | 0.9 (0.7–1.1) |
| 10,000–20,000 | 1.0 (0.8–1.2) | 1.0 (0.7–1.2) |
| >20,000 | 0.8 (0.6–1.0) | 0.8 (0.6–1.0) |
| Community level | | |
| Population density (log scale) | 1.3 (1.2–1.5) | 1.3 (1.1–1.8) |
| Aedes aegypti mosquitoes captured | 1.8 (1.2–2.8) | 1.4 (0.9–2.2) |
| Aedes albopictus mosquitoes captured | 1.1 (0.7–1.6) | 1.0 (0.7–1.6) |

We used a spatial prediction model that incorporates population size, the age and sex distribution, and spatial correlation structure to estimate the level of seropositivity throughout the country (Figure 2A). We found that the proportion of people seropositive in communities was spatially correlated up to 108 km (as measured from the Matern covariance function), consistent with that observed in a variogram of the seropositivity between communities (Figure 2B). Our model performed well at estimating the observed levels of seropositivity in participating communities in leave-one-out cross-validation, with a Pearson correlation of 0.8 between the observed and fitted values and a mean absolute error of 8% (Figure 2C). These maps further suggest dengue is currently concentrated in the three largest cities of Dhaka, Chittagong and Khulna. This estimated distribution of dengue risk in the country was very similar if we used the spatial dependence information only, without age and sex covariates (Figure 2—figure supplement 3). Overall, we estimate that approximately 25% (95% CI: 21–29%) of the population had been infected with dengue at some point during their lives, equivalent to 40.3 million individuals (95% CI: 34.3–47.2 million). This estimate is consistent with that obtained using the crude proportion seropositive among our samples (24% or 39.0 million individuals).

Figure 2. Modelled dengue seropositivity across Bangladesh and across age groups.

(A) Spatial predictions of seropositivity for the whole country. Kh. = Khulna, Dh. = Dhaka, Ch. = Chittagong (B) Semivariogram showing spatial dependence between the proportion seropositive between communities as a function of distance between them. (C) Observed versus predicted levels of seropositivity by community from leave one out cross validation. (D) Observed (points) and fitted seropositivity by age for the sampled communities within the three largest cities (Khulna, Chittagong and Dhaka) for both males and females. (E) Observed and fitted seropositivity for the remaining communities by sex.

https://doi.org/10.7554/eLife.42869.009

Using a catalytic model to estimate the proportion seropositive by age, we estimated that 1.6% (95% CI: 1.5–1.7%) of the susceptible population gets infected each year across the four serotypes, equivalent to an average of 2.4 million annual infections (95% CI: 2.2–2.5 million) (Figure 2—figure supplement 4). However, estimates were much higher for the three major urban hubs of Dhaka, Chittagong and Khulna compared to the rest of the country. Within these hubs, 6.4% (95% CI: 5.4–7.6%) of the susceptible population gets infected annually with no differences by sex, whereas this drops to 1.0% (95% CI: 0.9–1.2%) for females outside these areas and 1.6% (95% CI: 1.4–1.8%) for males (Figure 2D–E).

We assessed the sensitivity of our results to the number of participating communities and the model framework used. Over repeated iterations, we used a subset of our communities to estimate the overall proportion seropositive and to train a suite of models that were then used to estimate seropositivity in the remaining communities. We found that if we had only visited twenty communities, the seropositivity among the samples would have ranged from 13% to 37%, depending on the communities visited. Spatially explicit models which incorporated data on sex, age and population size, did not result in substantial improvements with the range of seropositivity estimates similarly wide (Figure 3A). By contrast, had we visited 60 communities, the range would have been much smaller (22–28%), with similar results in the spatial models. The accuracy of our predictions in unsampled locations improved substantially with increasing numbers of communities visited (Figure 3B). In spatial models with no covariates, the mean absolute error in the predictions per community fell from 13.6% when 20 communities were sampled to 10.5% when 69 communities were sampled, with a corresponding rise in the correlation between the observed and predicted seroprevalence from 0.39 to 0.81 (Figure 3C). Incorporating information on age, sex and population size resulted in a small improvement in performance when between 20 and 60 communities were sampled (e.g., when 20 communities were sampled, the mean correlation was 0.39 when no covariates were used and 0.49 when covariates were incorporated). Multivariable models with age, sex and population size as covariates but with no spatial correlation performed poorly, with no improvements with increasing numbers of communities visited. 
We found that sampling fewer people per community had little effect on our estimates, with the performance of nationwide and community-level seropositivity similar if 20 people were sampled per location compared to 80 (Figure 3D–F).
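The effect of the number of communities on the spread of the overall estimate can be sketched with a simple resampling exercise. The community-level seroprevalence values below are hypothetical (a few high-prevalence communities among many low-prevalence ones) and are not the study data; the sketch also assumes equal numbers of participants per community, so that the overall estimate is the unweighted mean across the selected communities.

```python
import random


def subsample_range(community_prevalence, n_communities, iterations=1000, seed=1):
    """95% range of the overall estimate when only a random subset
    of communities is visited (sampling without replacement)."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(iterations):
        subset = rng.sample(community_prevalence, n_communities)
        estimates.append(sum(subset) / len(subset))
    estimates.sort()
    lower = estimates[int(0.025 * iterations)]
    upper = estimates[int(0.975 * iterations) - 1]
    return lower, upper


# Hypothetical seroprevalence in 70 communities (illustrative values only).
hypothetical_prevalence = [0.05] * 50 + [0.30] * 10 + [0.80] * 10

for k in (20, 40, 60):
    low, high = subsample_range(hypothetical_prevalence, k)
    print(f"{k} communities: {low:.2f} to {high:.2f}")
```

With strongly heterogeneous communities, the range of plausible overall estimates narrows markedly as more communities are included, which mirrors the pattern described above.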

Figure 3
Accuracy of estimates for different numbers of sampled communities (top row) and different numbers of sampled individuals per community (bottom row) using different estimation methods.

(A) 95% range of estimates of overall seroprevalence from 100 repeated iterations when data from a random subset of communities are used. (B) Mean absolute error among held-out communities over repeated iterations. (C) Mean correlation between predicted and observed seroprevalence among held-out communities. (D) 95% range of estimates of overall seroprevalence from 100 repeated iterations when data from a random number of individuals from 50 communities are used. (E) Mean absolute error among 20 randomly selected held-out communities over repeated iterations. (F) Mean correlation between predicted and observed seroprevalence among 20 randomly selected held-out communities. The different estimation methods are: overall proportion seropositive (black); spatial correlation model using Matern covariance structure and no covariates (blue); spatial correlation model using Matern covariance structure and age, sex and population size as covariates; and logistic regression using age, sex and population size as covariates with no spatial component (orange).

https://doi.org/10.7554/eLife.42869.014
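The two accuracy measures shown in panels B, C, E and F (mean absolute error and Pearson correlation between observed and predicted seroprevalence in held-out communities) can be computed as in the sketch below. The function names are hypothetical and the inputs are assumed to be plain lists of community-level proportions; this is not the authors' code.

```python
import math


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def pearson_correlation(observed, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return cov / math.sqrt(var_o * var_p)
```

In a held-out evaluation, `observed` would hold the measured seroprevalence of communities excluded from model fitting and `predicted` the corresponding model predictions.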

Discussion

We have presented the results of a large, nationally-representative serostudy that provides a comprehensive description of dengue infection in Bangladesh. Our results demonstrate that, to date, dengue risk is very heterogeneous across the country. They also show that the vast majority of the population has never been infected. The framework presented here can act as a strategy for future efforts to estimate nationally-representative infection risks in a population.

Our results suggest that since dengue re-emerged in the late 1990s, the virus has only established a pattern of sustained endemic transmission in a few urban settings, and not throughout the country, as is the case in nearby Myanmar, Thailand and Cambodia (Rahman et al., 2002; Government of the People’s Republic of Bangladesh, Ministry of Health and Family Welfare, 2017; van Panhuis et al., 2015). Part of the reason for this may be the limited presence of the principal vector, Ae. aegypti, which was found in only one third of the communities that participated in this study. By contrast, more (especially rural) communities had the secondary vector, Ae. albopictus, but its presence was not associated with infection. Our finding of a negative correlation between Ae. aegypti and Ae. albopictus presence is consistent with the species occupying different environmental niches or competition between the two species, as has previously been suggested (Braks et al., 2004). All communities had at least one seropositive individual, suggesting that external viral introductions may be common and that there may be factors preventing large-scale outbreaks, including the limited abundance of the Ae. aegypti vector. It is unclear how stable these vector populations are. Characterizing the drivers of Ae. aegypti spread and maintenance, especially in the context of changing land use and climate, appears key to understanding future risk of spread.

We were able to use our study to identify risk factors for infection. At the individual level, we found that males had 1.6 times the odds of having been infected as females, although this difference was concentrated in communities where overall seropositivity was low, suggesting that comparing infection proportions between males and females can be a good marker of local dengue endemicity. In addition, those that travelled more (as indicated by having left the community recently) were more likely to have been infected. These findings suggest that in the current scenario in which infection risk is heterogeneously distributed around the country, individuals from communities with low or no transmission are likely to get infected when they travel to higher transmission areas. If the vector presence expands throughout the country, these individuals could act as sources of outbreaks within these communities. Our findings are in marked contrast to what has been observed with chikungunya in Bangladesh, where women in a community that had a widespread chikungunya epidemic were found to be at significantly increased risk of being infected compared to males, with the increased risk of infection linked to greater time spent in and around the home (Salje et al., 2016b). These findings suggest that it may be difficult to generalize inferences across arboviruses due to differences in vector species and the frequency of introductions and risk of onwards spread. While we incorporated spatial correlation into our risk factor regression analyses, in practice this only resulted in a relatively small improvement in model fit compared to hierarchical models with random intercepts at the community and household levels. The biggest impact of incorporating the spatial correlation was to move the coefficient estimate for Ae. aegypti presence towards the null. 
This suggests that the covariance structure is a better predictor than the basic mosquito absence/presence data as the covariance is driven by, and absorbs, the true underlying environmental drivers including mosquito distributions.

Overall, we estimated that around a quarter of the population, around 40 million individuals, have been infected by dengue with an average of 2.4 million annual infections. This figure is much less than the 16.7 million previously estimated through a modelling exercise (Bhatt et al., 2013), highlighting the need for representative data to help support these models and the importance of considering immunity in the estimation of annual case numbers. We did not detect a significant difference in seropositivity between the two study years, though we were not sufficiently powered to detect small changes in seropositivity across the population. This points to a clear trade-off between resampling the same individuals across the two study years, which would have facilitated quantification of the incidence between the two years, and our approach, which allowed us to maximize our sample size. While there was substantial spatial heterogeneity in the risk of being seropositive across the country, ultimately our crude estimate of the proportion seropositive in the country (24%) was very close to the modelled estimate that incorporated the age, sex and population distribution in the country (25%). This provides strong support for the sampling frame we used to capture population-level estimates of exposure.
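The comparison between a crude and a population-adjusted estimate can be illustrated with direct standardisation, in which stratum-specific seroprevalence is weighted by each stratum's share of the census population. This is a simplified stand-in for the spatial model used in the study, and the strata and counts in the example are hypothetical.

```python
def crude_prevalence(strata):
    """Pooled proportion seropositive, ignoring how strata were sampled.

    strata: list of (census_population, n_tested, n_positive) tuples.
    """
    tested = sum(n for _, n, _ in strata)
    positive = sum(k for _, _, k in strata)
    return positive / tested


def poststratified_prevalence(strata):
    """Stratum-specific prevalence weighted by census population share."""
    total_population = sum(pop for pop, _, _ in strata)
    return sum(pop * (k / n) for pop, n, k in strata) / total_population


# Hypothetical example: two strata sampled equally, but of unequal size.
example_strata = [
    (100, 50, 10),  # small stratum, 20% seropositive
    (300, 50, 30),  # large stratum, 60% seropositive
]
print(crude_prevalence(example_strata))
print(poststratified_prevalence(example_strata))
```

When a sample is close to self-weighting, that is, when each stratum is sampled roughly in proportion to its population, the two quantities should be similar, which is the pattern reported above for the crude and modelled estimates.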

While spatial prediction models did not help improve overall estimates of national burden, they did allow us to build maps of how infection risk is distributed throughout the country. Incorporating covariates (e.g., sex, age, population distribution) did not result in substantive improvements in predictive accuracy compared to models that included a spatial covariance term only. This highlights the importance of spatial dependence in obtaining accurate estimates in unsampled locations. Environmental and climate covariates were not considered here and may improve estimates, especially where too few (or no) sampled locations exist to make use of the spatial correlation structure. Our approach is particularly relevant to settings like Bangladesh where there was little prior understanding of the distribution of disease burden in the country. Alternative strategies may exist in other settings where there is already some knowledge of where risk is concentrated. However, even in these settings, unmeasured spatial differences in healthcare seeking or in surveillance system infrastructure may mean that it is preferable to randomly sample communities without focusing on specific areas or populations in the country.

Our study provides key insight for the national vaccine policy for dengue. For the only currently licensed vaccine, Dengvaxia, the WHO has recommended that countries perform nationally representative serosurveys to inform vaccine rollout as the vaccine only provides protection in people with existing antibodies (World Health Organization, 2016). The vaccination of seronegative individuals has been linked to increased risk of subsequent severe disease (Salje et al., 2018; Hadinegoro et al., 2015). For Bangladesh, our findings suggest that any vaccine rollout should be concentrated in the urban areas of Dhaka, Chittagong and Khulna. However, even in these communities, the proportion seropositive at age 9 years is far below the threshold of 80% where vaccine rollout is potentially feasible without pre-vaccination screening (World Health Organization, 2018). Therefore, any rollout will require the screening of individuals for presence of antibodies before vaccination to avoid placing large numbers of individuals at risk for more severe disease manifestations.

The approach presented here could be used as a strategy for other countries interested in obtaining national estimates of disease risk. The optimal number of communities to visit will depend on the size and distribution of the population, the underlying level and heterogeneity of infection in the population, the required level of precision and the available budget. Therefore, it is difficult to make general recommendations about the number of communities which should be sampled in other settings. However, our finding that spatial correlation exists within 100 km suggests that communities should be sampled at a density that ensures there is at least one sampled community, and preferably more, within 100 km of all residents. Sampling as few as 20 individuals per community still provides robust nationwide estimates. However, in practice, there are fewer budget and time constraints to sampling additional individuals within a community than to visiting additional communities.
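The 100 km guideline can be turned into a simple design check: for a set of candidate sampling sites, list the population locations that have no sampled community within the chosen distance. The sketch below uses great-circle (haversine) distances; the coordinates are hypothetical and the function names are illustrative, not part of the study's methods.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def uncovered_locations(population_points, sampled_sites, max_km=100.0):
    """Return population points with no sampled site within max_km."""
    if not sampled_sites:
        return list(population_points)
    return [
        point for point in population_points
        if min(haversine_km(*point, *site) for site in sampled_sites) > max_km
    ]


# Hypothetical (latitude, longitude) pairs, for illustration only.
sites = [(0.0, 0.0)]
population = [(0.0, 0.5), (0.0, 2.0)]
print(uncovered_locations(population, sites))
```

Any location returned by `uncovered_locations` would indicate a gap where an additional sampled community might be needed under the assumed distance threshold.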

Cross-reactivity of antibodies is a problem for all seroprevalence studies, especially with flaviviruses. This prevents us from quantifying the relative importance of the different dengue serotypes. In addition, some seropositive individuals may have been infected with Japanese encephalitis rather than dengue. However, Japanese encephalitis is typically only found in rural communities where it circulates at low levels (estimated at 2.7 cases/100,000) (Paul et al., 2011). As we estimated only low levels of dengue seropositivity in rural communities, the number of false positives from Japanese encephalitis cross-reactivity is likely to be small. Individuals who participated in the study may not be representative of all members in the community. In particular, individuals who travel frequently may have been away. We attempted to minimize this risk by organizing times to meet with household members who were not present in the initial visit. Around 90% of households that we approached agreed to take part in the study. We used BG sentinel traps that have been shown to be well suited to trapping Aedes mosquitoes (Maciel-de-Freitas et al., 2006; Obenauer et al., 2010). In addition, we revisited all communities where we did not find Aedes mosquitoes in the initial visit for additional mosquito trapping. However, given the heterogeneous nature of mosquito distributions within communities, we may have nevertheless failed to find Aedes mosquitoes in communities where they breed. This would mean that Aedes may be more widespread than we found. We used the time period since individuals last left the community as a marker of travel. While this is likely to broadly capture trends in mobility, it remains a crude marker and more detailed measures of movement (from e.g., movement diaries, global positioning system monitors) would help provide a more detailed understanding of how people move. 
To randomly select households in a community, we would ideally have used a sampling frame of all households in the community. However, in this setting there were no detailed community maps and enumerating all households in the communities would have added an additional day in the field per community. Therefore, we used a quasi-random approach that identified a starting point for household sampling based on the area of the community where the most recent wedding took place; given that >95% of adults marry in Bangladesh, it is unlikely that this approach could bias the sample we obtained. However, in cultural contexts where marriage rates may vary by community or location within a community, this method of choosing a random starting point could produce a biased sample.

We found that simply asking people about whether or not they had been infected with dengue was not informative of past infection. This highlights that studies that rely on questionnaires alone can provide only limited burden information for pathogens such as dengue. For example, Demographic Health Surveys, which collect detailed health questionnaires from randomly selected individuals across many countries, could benefit significantly from adding a serological component to their studies. In Bangladesh, where dengue is still emerging, surveillance for vectors could be a way to monitor risk of future outbreaks and continued efforts to understand drivers of transmission could point to interventions to reduce its geographic spread.

References

  12. An evaluation of the Australian national serosurveillance program
    A Jardine, SL Deeks, MS Patel, RI Menzies, GL Gilbert, PB McIntyre (2010)
    Communicable Diseases Intelligence Quarterly Report 34:29–36.
  13. An Aging World: 2008, International Population Reports
    K Kinsella, W He (2009)
    Washington, DC: US Census Bureau.
  30. Second national serum bank for population-based seroprevalence studies in the Netherlands
    FR van der Klis, L Mollema, GA Berbers, HE de Melker, RA Coutinho (2009)
    The Netherlands Journal of Medicine 67:301–308.
  34. Dengue vaccine: WHO position paper – September 2018
    World Health Organization (2018)
    Weekly Epidemiological Record 93:457–476.
  35. Dengue Outbreak 2000 in Bangladesh: From Speculation to Reality and Exercises
    EB Yunus, AM Bangali, M Mahmood, MM Rahman, AR Chowdhury, KR Talukder (2001)
    World Health Organization.

Decision letter

  1. Ben Cooper
    Reviewing Editor; Mahidol Oxford Tropical Medicine Research Unit, Thailand
  2. Neil M Ferguson
    Senior Editor; Imperial College London, United Kingdom
  3. Ben Cooper
    Reviewer; Mahidol Oxford Tropical Medicine Research Unit, Thailand
  4. Oliver Brady
    Reviewer; London School of Hygiene & Tropical Medicine, United Kingdom

In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.

Thank you for submitting your article "Nationally-representative serostudies are needed for generalizable burden estimates: dengue in Bangladesh case-study" for consideration by eLife. Your article has been reviewed by three peer reviewers, including Ben Cooper as the Reviewing Editor and Reviewer #1, and the evaluation has been overseen by Neil Ferguson as the Senior Editor. The following individual involved in review of your submission has agreed to reveal their identity: Oliver Brady (Reviewer #3).

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

Summary:

Nationally representative serum samples are a valuable tool for understanding disease dynamics and planning vaccination interventions, but outside high income countries such studies are lacking (the seroprevalence studies that do exist are typically based on convenience samples). The current study, which reports the results of a nationally representative serological study of dengue in Bangladesh, provides an important insight into the burden of disease and informs public health and future intervention strategies. It also represents an important demonstration of how such a national serological study can be conducted and analysed in a LMIC setting and how it can inform policy.

Essential revisions:

There was a consensus amongst reviewers that the analysis does need a bit more work – certainly in description and possibly also in rigour. At the moment some detail does seem to be missing, as outlined below, and it is unclear how the risk factor analysis links to the burden predictions.

1) The title and Abstract of the manuscript are very heavily focused on the need to conduct a large number (> 10) seroprevalence surveys to generate accurate estimates of burden, but this is not tested in detail in the paper. Aside from the numerous other methods of estimating dengue burden which is probably outside of the scope of this paper, the analyses presented here only focus on randomised sampling schemes. WHO guidance on conducting seroprevalence surveys for dengue published only last year (Informing vaccination programs: a guide to the design and conduct of dengue serosurveys, WHO, Geneva, 2017) recommends stratified sampling based on historic dengue incidence as documented by passive surveillance. Passive surveillance data may not be available in Bangladesh, but it is in the vast majority of dengue endemic countries, thus limiting the generalizability of the findings presented here. Using the spatial correlation or risk factor analysis might also give unbiased estimates from < 10 well placed surveys (not tested here). It was also difficult to reconcile these interpretations with statements like: "ultimately our crude estimates of the proportion seropositive in the country (24%) was very close to an estimate adjusted for the age and sex distribution in the country". There is real value in the work that has been done here in its own right, so it is a little confusing why the focus of the paper is so heavily skewed towards a statement that was not so thoroughly tested.

2) Using leave one out cross validation in a dataset of size 70 communities is probably not a particularly stringent test. At a minimum, a CV split of 80% training 20% testing would be more standard. Looking at the model fit (2A-B) even with this small hold-out set there seems to be a fair amount of unexplained variance. Could this be better explained by including other (non-spatial) covariates – ideally those related to the risk factors identified in Table 2? Would this change the conclusions about minimum site sample size for national representativeness?

3) How confident are the authors of the claim made in the second paragraph of the Introduction that nationally representative serum samples are lacking outside high income countries? Can this be backed up by a systematic search of the literature, or is there robust evidence from one or more of the cited papers (Metcalf et al., 2016; Wilson et al., 2012; Osborne, Weinberg and Miller, 1997; Jardine et al., 2010; De Melker and Conyn-van Spaendonck 1998; Ang et al., 2015)?

4) The analysis accounts for clustering at the village level (through a hierarchical model) but not spatial correlation, which in general would be expected to reduce the amount of information in the data. It would be useful if this could be discussed, the decision to ignore such correlation justified, and the situations where it is important to account for spatial correlation in the regression modelling discussed.

5) Results section: "no household level covariates were significantly linked to seropositivity". Following the ASA's report on p-values (and many recommendations before that), it is widely considered unwise to report results according to "bright lines" for p-values (in this case 0.05 presumably). By all means report the p-values, and certainly report the confidence interval, but hopefully we are moving away from the era where we use such arbitrary thresholds to decide whether to report results, while ignoring the magnitude of the effect.

6) It is slightly difficult to read this paper as the Materials and methods are at the end and not enough detail is given in the Results for it to be clear what was actually done without referring to the Materials and methods. Note that eLife guidelines say that "A Methods or Model section can appear after the Introduction where it makes sense to do so". I think the authors should either consider moving the Materials and methods to before the Results or at least briefly saying what was done in the Results with a more detailed explanation in the Materials and methods.

7) It's quite unclear how the results were used to derive nationally representative estimates. The Materials and methods simply say "we also calculated a census-adjusted proportion seropositive by community that adjusted for any sampling bias" but details of how this was done are lacking. Such adjustment also seems to have used only the age and sex distribution from the 2011 census. Why not also use urban vs. rural (an important predictor according to Table 2 and which should be easily obtained)? Other community-level covariates could also be used, potentially. Note there is a literature on multilevel regression and post-stratification to address this type of problem that might be relevant here (see, for example, Zhang et al. Am J Epidemiol 2014, https://academic.oup.com/aje/article/179/8/1025/109078).

8) References to Figure 2B and Figure 2C in the text need to be swapped.

9) "All covariates that were statistically significant at a p-value of <0.1 in the unadjusted analysis were included in a multivariable analysis." Not clear why this was done (except that many other papers do this). There seem to be enough data to include all covariates, and covariates with p-values >0.1 may still have important effects either alone or in combinations. Sometimes covariates have to be selected using some approach as there aren't enough data to use them all, but this doesn't seem to be the case here.

10) "using catalytic models". Given there is no space limit it would be helpful to define the technical details here. Also, a constant force of infection was assumed for all serotypes. Was this assumption of a constant f.o.i. tested (e.g. by comparison with other models)? How consistent are the data with this assumption? The Materials and methods suggest that f.o.i. was allowed to vary by age (is this correct)? Why not also sex given the reported sex differences?

11) "we estimate that 40 million people have been infected with 2.3 million annual infections." (Abstract and Results). 95% CIs are needed for these estimates?

12) Sample size calculations will be useful to others. However, those currently given in the supplementary text use a formula without further justification (or reference to any justification) and condition on the fact that 70 communities are being sampled. To be more useful, I think it would be helpful if the authors could discuss the sample size implications of sampling more (or fewer) communities for given correlations between villages. For example, for the observed within-village correlation, how would the required sample size change as a function of the number of communities sampled? What would the impact of spatial correlation be on these numbers? What about different levels of within-village correlation? This would be useful for others planning such studies and I think it would also be appropriate to discuss this issue in the Discussion section, i.e. what are the resources required for a given precision/resolution and how are the resources required likely affected by study design choices. It would also be useful if the authors could at least discuss the issue of spatial correlation and discuss the merits/demerits of accounting for it in the modelling in general (as well as in this specific example).

13) "Reported having dengue" and "heard of dengue" seem peculiar things to include in the regression, as these might be expected to be a consequence of having dengue rather than a potential risk factor. Perhaps this reflects the fact that the purpose of the regression analysis is not clearly stated. It would be good to have a clear statement about the purpose of the regression and a justification for the choice of variables to include.

14) The supplementary material describes the recruitment strategy: "the study staff identified the house where the most recent wedding had taken place and identified the closest neighbour. They then counted six households in a random direction to identify the first household for the study. To select each additional household for the study, they used the previous household as a starting point and counted six households in a random direction." This seems a little eccentric and not obviously guaranteed to give a random sample. Is there any justification for this choice? Wouldn't numbering households and selecting at random or throwing darts at maps be better? I think this point at least needs to be discussed (with recommendations for future studies) and this aspect of the methods should be moved to the Materials and methods section in the main text.

15) "may provide some guidance". Not sure what the intended meaning of this is. What kind of guidance?

16) Was any attempt made at assessing the accuracy of the recorded household data?

17) The entomological approach to determining the presence or absence of Ae. aegypti is quite superficial. The intensity of surveillance (number of BG traps per community) and duration (time in the field) are too limited to arrive at a conclusion of presence/absence. In dengue endemic cities, where Ae. aegypti has a well-documented presence, there can be quite marked spatial heterogeneity in the distribution of Ae. aegypti when measured by BG traps, e.g. some houses can be free of this species for consecutive weeks but in a house 50 metres away they can be caught regularly. With regard to Ae. albopictus prevalence, BG traps set indoors are not the optimal method of determining presence/absence; it would have been better to set them outdoors or to use outdoor ovitraps. The authors should qualify their conclusions by recognising that trapping method, intensity, duration and seasonality can all influence the likelihood of Aedes detection and this could change the conclusions of the manuscript.

18) Is there any reason uncertainty (in either or all of the data, kriging model and the force of infection model) can't be propagated through to the final burden estimates? Comparing mean estimates with Bhatt et al. to prove that nationally representative surveys are needed probably also needs to consider uncertainty. I think they might have also included tertiary and quaternary infections as well if you want to be comparable.

19) Can code for statistical analysis be made available?

20) "Our findings are in marked contrast to what has been observed with chikungunya in Bangladesh". Can the authors offer any hypothesis as to why this might be the case?

21) Can the authors provide at least one concrete example of how such a survey could lead to better decision-making about vaccination?

22) It was felt that some of the statements about findings showing that lack of spread of aegypti is the reason behind heterogeneities in dengue transmission in Bangladesh were not fully justified in light of the known limitations of entomological surveying.

23) The authors correctly point to the possibility that the Panbio-based seroprevalence survey might have reflected past JEV exposure and cite the low JE case incidence to suggest dengue is the primary culprit for the seroprevalence. This might be true, but JEV is notoriously difficult to diagnose in the absence of laboratory testing and there would almost certainly be under-reporting of cases in Bangladesh. Having a random subset of samples tested by DENV/JEV PRNT50 assay, regarded as the most specific assay of DENV serostatus, could have helped clarify this point. Can the authors do this readily, perhaps on a sample of 100 or so (or use Luminex as a less preferred option)? Though not essential, it was felt this would improve the quality of the manuscript if it could be done within 2 months. If it can't be done the Discussion needs to be qualified accordingly.

24) Materials and methods: when were interviews conducted in relation to the dengue season? Could this have introduced recall bias of "whether diagnosed with dengue"?

25) Was there any evidence that travel in the last 7 days was a reasonable proxy for long term travel history?

26) "Our finding of a negative correlation between Ae. aegypti and Ae. albopictus presence is consistent with competition between the two species" – or is just evidence that they have different environmental niches?

https://doi.org/10.7554/eLife.42869.019

Author response

Essential revisions:

There was a consensus amongst reviewers that the analysis does need a bit more work – certainly in description and possibly also in rigour. At the moment some detail does seem to be missing, as outlined below, and it is unclear how the risk factor analysis links to the burden predictions.

1) The title and Abstract of the manuscript are very heavily focused on the need to conduct a large number (> 10) seroprevalence surveys to generate accurate estimates of burden, but this is not tested in detail in the paper. Aside from the numerous other methods of estimating dengue burden which is probably outside of the scope of this paper, the analyses presented here only focus on randomised sampling schemes. WHO guidance on conducting seroprevalence surveys for dengue published only last year (Informing vaccination programs: a guide to the design and conduct of dengue serosurveys, WHO, Geneva, 2017) recommends stratified sampling based on historic dengue incidence as documented by passive surveillance. Passive surveillance data may not be available in Bangladesh, but it is in the vast majority of dengue endemic countries, thus limiting the generalizability of the findings presented here. Using the spatial correlation or risk factor analysis might also give unbiased estimates from < 10 well placed surveys (not tested here). It was also difficult to reconcile these interpretations with statements like: "ultimately our crude estimates of the proportion seropositive in the country (24%) was very close to an estimate adjusted for the age and sex distribution in the country". There is real value in the work that has been done here in its own right, so it is a little confusing why the focus of the paper is so heavily skewed towards a statement that was not so thoroughly tested.

We thank the reviewer and editor for their thoughts and agree that in some settings there exist sufficient data to guide alternative sampling strategies. The optimal sampling strategy in these other settings is certainly an interesting question, but we believe it falls outside the scope of this paper. We now discuss that alternative strategies may exist where some knowledge already exists. We, however, disagree with the statement that “Passive surveillance data may not be available in Bangladesh, but it is in the vast majority of dengue endemic countries”. For example, in India, the country which (probably) suffers from the greatest disease burden from dengue, the situation is very poorly understood, with little/no epidemiological data from large (especially rural) areas. Similarly, there is little dengue epidemiological data from African countries – a setting where dengue case reports are frequent but there is rarely systematic surveillance. Further, even in countries with developed surveillance systems like Thailand, only a minority of cases are laboratory confirmed (in Thailand this has been estimated at 10%), so most are based on clinical presentation only, which results in frequent misdiagnosis. There are also important differences in healthcare seeking behaviours across the country. Therefore there may be an important discrepancy between what is recorded in surveillance systems and true underlying burden. With regards to the WHO guidance on conducting seroprevalence studies – some of the coauthors (along with the reviewer) were involved in drafting the guidance document. It is very specifically focused on dengue vaccine guidance rather than understanding the level of seropositivity. As such, it specifically does not consider, e.g., areas where few cases are recorded (which would be the majority of Bangladesh) and only focuses on children. Therefore it is only of limited relevance to this study.

We agree with the reviewers that in our first draft we did not robustly compare the possible inferences from different analytical approaches and the number of communities/people sampled as indicated by the title/Abstract. Therefore in the revised manuscript we have now systematically compared the nationwide and community level inferences using different approaches and assessed the accuracy of estimates with different numbers of communities and participants per community. The different analytical approaches are:

1) Crude proportion seropositive

2) Spatial covariance with no covariates

3) Spatial covariance with age, sex and population size covariates (this is our baseline model, following suggestions by the reviewer)

4) Age, sex, population size covariates and no spatial covariance

As set out in a new figure (Figure 3), we find that for the overall nationwide estimate, there is little difference in the uncertainty of estimates across the methods – i.e., incorporating a spatial model and/or covariates does not improve estimates when only a small number of places are sampled. So, for example, if 20 communities had been chosen out of our 70, the overall crude estimate would have ranged between 12% and 37% depending on which communities were chosen, with the range of values using spatial models being very similar. For community-level estimates in held out locations (i.e., the goal of spatial prediction exercises), we find that, interestingly, adding covariates provides only minor improvement in predictive accuracy compared to a basic spatial correlation model – this appears to be because the covariates (population size) are themselves spatially correlated and therefore their effect can be absorbed in the covariance matrix.
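The subsampling exercise described above can be illustrated with a minimal sketch. It uses synthetic data, not the study data, and only the crude (pooled) estimator; the function names, the number of random draws and the simulated seropositivity values are all assumptions made for illustration.

```python
import random


def crude_seroprevalence(communities):
    """Pooled proportion seropositive across (n_positive, n_tested) pairs."""
    positives = sum(pos for pos, _ in communities)
    tested = sum(n for _, n in communities)
    return positives / tested


def subsample_range(communities, k, n_draws=1000, seed=0):
    """Min and max crude estimate over random subsets of k communities."""
    rng = random.Random(seed)
    estimates = [
        crude_seroprevalence(rng.sample(communities, k))
        for _ in range(n_draws)
    ]
    return min(estimates), max(estimates)


# Synthetic data (NOT the study data): 70 communities of 80 people each,
# a few with high seropositivity and the rest with low values.
rng = random.Random(42)
synthetic = []
for i in range(70):
    p = 0.8 if i < 10 else rng.uniform(0.02, 0.3)
    positives = sum(rng.random() < p for _ in range(80))
    synthetic.append((positives, 80))

full = crude_seroprevalence(synthetic)
low, high = subsample_range(synthetic, k=20)
print(f"all 70 communities: {full:.2f}; 20-community subsets: {low:.2f} to {high:.2f}")
```

Changing `k` shows how the spread of the crude estimate narrows as more communities are included; the spatial and covariate-adjusted estimators compared in Figure 3 are not reproduced here.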

In the revised document, we describe the different model formulations in the Materials and methods. We include a new figure (Figure 3). In the Materials and methods we include: “We assessed the ability of different model formulations to accurately predict the level of seropositivity in unsampled locations. […] We then estimated the seroprevalence in the 20 remaining communities.”

In the Results we include:

“We assessed the sensitivity of our results to the number of participating communities and the model framework used. […] We found that sampling fewer people per community would have had little effect on our estimates, with the performance of nationwide and community-level seropositivity similar if 20 people were sampled per location compared to 80 (Figures 3D-F).”

In the Discussion, we now include:

“While spatial prediction models did not help improve overall estimates of national burden, they did allow us to build maps of how infection risk is distributed throughout the country. […] Although, even in these settings, unmeasured spatial differences in healthcare seeking or in surveillance system infrastructure may mean that it is preferable to randomly sample communities without specifically focusing on specific areas of populations in the country.”

2) Using leave one out cross validation in a dataset of size 70 communities is probably not a particularly stringent test. At a minimum, a CV split of 80% training 20% testing would be more standard. Looking at the model fit (2A-B) even with this small hold-out set there seems to be a fair amount of unexplained variance. Could this be better explained by including other (non-spatial) covariates – ideally those related to the risk factors identified in Table 2? Would this change the conclusions about minimum site sample size for national representativeness?

While we agree that in many analyses, 20% held out testing makes more sense – here, the main information being used is the spatial location of the sampled units and their associated community seropositivity. We are therefore asking the question: can we estimate the seroprevalence at an unsampled location using the spatial covariance from sampled locations? (The information is contained in the spatial covariance matrix, which goes away if there are no sampled communities nearby.) Leaving out 20% changes the question to “if we had sampled 56 locations, what are the estimates obtained”. Therefore we prefer to keep the leave one out analysis. However, we thank the reviewer for this comment as it motivated us to explicitly explore the impact of only having observed a subset of communities (which is equivalent to varying the proportion in the training set and the proportion in the validation set as suggested by the reviewer) and then applying different spatial prediction models. In the revised manuscript, we assess the ability of the different models to accurately estimate seroprevalence when between 2 and 69 communities are used to train the model and the rest are used to validate it.
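As a rough illustration of the leave-one-out idea, the sketch below predicts each community's seroprevalence from all the others and records the error. It deliberately swaps in inverse-distance weighting as a simple stand-in for the Matern spatial covariance model used in the paper, and the coordinates and seroprevalence values are synthetic; the function names are hypothetical.

```python
import math


def idw_predict(target_xy, known, power=2.0):
    """Inverse-distance-weighted mean of known (x, y, value) points.

    A simple stand-in for a spatial covariance model, not the authors' method.
    """
    num = den = 0.0
    for x, y, value in known:
        d = math.hypot(x - target_xy[0], y - target_xy[1])
        if d == 0:
            return value  # target coincides with a sampled location
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den


def leave_one_out_errors(points):
    """Prediction error for each point when it is held out in turn."""
    errors = []
    for i, (x, y, value) in enumerate(points):
        rest = points[:i] + points[i + 1:]
        errors.append(idw_predict((x, y), rest) - value)
    return errors


# Synthetic communities (x, y, seroprevalence): one high cluster, one low.
points = [
    (0.0, 0.0, 0.80), (0.1, 0.1, 0.75),
    (1.0, 1.0, 0.10), (1.1, 0.9, 0.05), (0.9, 1.2, 0.08),
]
errors = leave_one_out_errors(points)
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
print(f"leave-one-out RMSE: {rmse:.3f}")
```

Holding out more than one point at a time (for example, training on between 2 and 69 of 70 communities) only requires replacing the single held-out index with a random held-out subset.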

Separately, we also agree that risk factors could potentially improve predictions. Following the reviewer’s suggestion, we now include population size and age/sex in the spatial prediction maps. Note that in Bangladesh, there do not exist maps that can link individual communities (and associated census data) to lat/lon points. We believe that this complication is not unique to Bangladesh with communities in e.g., India and African countries also having similar problems.

We now estimate the predictive accuracy of four different approaches under different numbers of communities and participants per community (Figure 3).

We also include predictive maps with population size, age and sex as spatial predictors as the main analysis (Figure 2—figure supplement 3A) but also compare to a map using spatial covariance only (Figure 2—figure supplement 3B).

3) How confident are the authors of the claim made in the second paragraph of the Introduction that nationally representative serum samples are lacking outside high income countries? Can this be backed up by a systematic search of the literature, or is there robust evidence from one or more of the cited papers (Metcalf et al., 2016; Wilson et al., 2012; Osborne, Weinberg and Miller, 1997; Jardine et al., 2010; De Melker and Conyn-van Spaendonck 1998; Ang et al., 2015)?

We agree that this statement was overly broad and difficult to back up. We have therefore removed this sentence.

4) The analysis accounts for clustering at the village level (through a hierarchical model) but not spatial correlation, which in general would be expected to reduce the amount of information in the data. It would be useful if this could be discussed, the decision to ignore such correlation justified, and the situations where it is important to account for spatial correlation in the regression modelling discussed.

This is an interesting point that warranted further investigation. In order to explore the impact of the estimation method, we have moved to a Bayesian framework. In the revised manuscript we compare the inferences made when using (1) logistic regression with no random intercepts, (2) random intercepts by household/community but without spatial correlation (i.e., what was in the original submission) and (3) a model where we also include spatial dependence through a spatial fields approach as implemented in INLA (which has now become our baseline model). We find that the latter model is slightly better supported (DIC difference of 4). Interestingly, the point estimates for the individual and household covariates are very similar for models 2 and 3. However, the estimated effect of the mosquito populations does move towards the null. This is likely a result of the strong spatial correlation in the mosquito populations themselves, which gets absorbed in the spatial covariance matrix. We include a discussion of these analyses in the revised document and a new figure showing the difference in coefficients (Figure 2—figure supplement 2).

We now include a comparison of coefficient estimates under different model formulations (Figure 2—figure supplement 2). We include in the Results:

“The intraclass correlation coefficients showed that the Matern spatial covariance matrix explained 15% of the variance, the community-level random effects explained 6% and the household random intercept explained 12% of the variance in individual level responses. [...] Coefficients of models where the data was restricted to adults only were largely unchanged.”

In the Discussion we include: “While we incorporated spatial correlation into our risk factor regression analyses, in practice this only resulted in a relatively small improvement in model fit compared to hierarchical models with random intercepts at the community and household levels. […] This suggests that the covariance structure is a better predictor than the basic mosquito absence/presence data as the covariance is driven by, and absorbs, the true underlying environmental drivers including mosquito distributions.”

5) Results section "no household level covariates were significantly linked to seropositivity". Following the ASA's report on p-values (and many recommendations before that), it is widely considered unwise to report results according to "bright lines" for p-values (in this case 0.05 presumably). By all means report the p-values, and certainly report the confidence interval, but hopefully we are moving away from the era where we use such arbitrary thresholds to decide whether to report results, while ignoring the magnitude of the effect.

We included this language as it has become standard in the field; however, we agree completely that this is a shame and that we should avoid it where possible. We have reworded this phrase in the revised document to “Household-level covariates did not appear to be important in determining risk of seropositivity”.

6) It is slightly difficult to read this paper as the Materials and methods are at the end and not enough detail is given in the Results for it to be clear what was actually done without referring to the Materials and methods. Note that eLife guidelines say that "A Methods or Model section can appear after the Introduction where it makes sense to do so". I think the authors should either consider moving the Materials and methods to before the Results or at least briefly saying what was done in the Results with a more detailed explanation in the Materials and methods.

As suggested, we have moved the Materials and methods to after the Introduction.

7) It's quite unclear how the results were used to derive nationally representative estimates. The Materials and methods simply say "we also calculated a census-adjusted proportion seropositive by community that adjusted for any sampling bias" but details of how this was done are lacking. Such adjustment also seems to have been using only age and sex distribution from the 2011 census. Why not also use urban vs. rural (an important predictor according to Table 2 and which should be easily obtained)? Other community-level covariates could also be used, potentially. Note there is a literature on multilevel regression and post-stratification to address this type of problem that might be relevant here (see, for example, Zhang et al. Am J Epidemiol 2014, https://academic.oup.com/aje/article/179/8/1025/109078).

Following the reviewer’s suggestions (here and elsewhere), we have moved the spatial predictions to a Bayesian framework and incorporate sex, age, population size (values that are easily available for the rest of the country). We also compared the performance of different model formulations. We also have added detail into the Materials and methods as to how we did the spatial prediction.

We have changed the main analysis to include population size, sex and age as suggested. We include comparisons of maps that do and do not include these covariates (Figure 2—figure supplement 4) and also the predictive accuracy of these different formulations (Figure 3).

In the Materials and methods we now include “To explore the variability in dengue risk across Bangladesh, we initially placed a 5km x 5km grid over the country and estimated the population size in each of those grid cells using data available from worldpop.org 23. […] As a sensitivity analysis, we also predicted the spatial distribution of dengue seropositivity in the country using a model with the Matern spatial covariance matrix only (i.e., without any covariates).”

8) References to Figure 2B and Figure 2C in the text need to be swapped.

Figure 2 has now changed and the references have been checked.

9) "All covariates that were statistically significant at a p-value of <0.1 in the unadjusted analysis were included in a multivariable analysis." Not clear why this was done (except that many other papers do this). There seem to be enough data to include all covariates, and covariates with p-values >0.1 may still have important effects either alone or in combinations. Sometimes covariates have to be selected using some approach as there aren't enough data to use them all, but this doesn't seem to be the case here.

As suggested, we now include all covariates in the multivariable model (the results are essentially unchanged).

10) "using catalytic models". Given there is no space limit it would be helpful to define the technical details here. Also, a constant force of infection was assumed for all serotypes. Was this assumption of a constant f.o.i. tested (e.g. by comparison with other models)? How consistent are the data with this assumption? The Materials and methods suggest that f.o.i. was allowed to vary by age (is this correct)? Why not also sex given the reported sex differences?

We have now included the technical details of the model. We thank the reviewer for the suggestions and we also run models for the main urban sites and by sex. We find that they highlight the key differences in the force of infection in these different populations. We include a new figure highlighting this (Figure 2D-E). We assume no differences in risk by age – this has been clarified.

We include a new section in the Materials and methods “Estimation of the force of infection using catalytic models” which sets out the likelihood based approach for calculating the force of infection:

“Estimation of the force of infection using catalytic models

We used the probability of being seropositive as a function of age to estimate the proportion of the susceptible population that get infected each year using catalytic models, an approach which has been used frequently to reconstruct the past circulation of pathogens 12, 13, 16, 24. […] We calculated the force of infection for the entire sampled population as well as separate estimates by sex and for the locations from the three largest cities (Dhaka, Chittagong and Khulna) only versus the rest of the country.”

In the Results we include a new figure (Figure 2D-E) with the force of infection estimate by the different populations.

We also include the text: “Using a catalytic model to estimate the proportion seropositive by age, we estimated that 1.6% (95% CI: 1.5%-1.7%) of the susceptible population gets infected each year across the four serotypes, equivalent to an average of 2.4 million annual infections (95% CI: 2.2-2.5 million) (Figure 2—figure supplement 4). However, estimates were much higher for the three major urban hubs of Dhaka, Chittagong and Khulna compared to the rest of the country. Within these hubs, 6.4% (95% CI: 5.4%-7.6%) of the population gets infected annually with no differences by sex, whereas this drops to 1.0% (95% CI: 0.9%-1.2%) for females outside these areas and 1.6% (95% CI: 1.4%-1.8%) for males (Figure 2D, E).”
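For readers unfamiliar with catalytic models, the following is a minimal sketch of the simplest version: a single constant force of infection, so that the probability of being seropositive at age a is 1 - exp(-foi * a), fitted by maximum likelihood. It is not the authors' code. The data are simulated with an assumed true value of 0.016 per year, serotypes are pooled into one rate, and the golden-section optimiser and all function names are choices made for this illustration.

```python
import math
import random


def seropositive_probability(foi, age):
    """P(seropositive by a given age) under a constant force of infection."""
    return 1.0 - math.exp(-foi * age)


def log_likelihood(foi, data):
    """Bernoulli log-likelihood; data is a list of (age, is_seropositive)."""
    total = 0.0
    for age, positive in data:
        if positive:
            total += math.log(seropositive_probability(foi, age))
        else:
            total += -foi * age  # log of exp(-foi * age)
    return total


def fit_foi(data, lower=1e-6, upper=1.0, tol=1e-8):
    """Maximise the log-likelihood by golden-section search.

    The log-likelihood of this model is concave in foi, so a bracketing
    search on [lower, upper] finds the maximum. Ages must be positive.
    """
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lower, upper
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if log_likelihood(c, data) > log_likelihood(d, data):
            b = d
        else:
            a = c
    return (a + b) / 2


# Simulated serosurvey (NOT the study data): 5000 people aged 1 to 70.
true_foi = 0.016  # assumed value used only to generate the synthetic data
rng = random.Random(1)
synthetic = []
for _ in range(5000):
    age = rng.randint(1, 70)
    positive = rng.random() < seropositive_probability(true_foi, age)
    synthetic.append((age, positive))

estimate = fit_foi(synthetic)
print(f"estimated force of infection: {estimate:.4f} per year")
```

Separate estimates by sex or by location, as reported in the quoted text, would correspond to calling `fit_foi` on each subset of the data.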

11) "we estimate that 40 million people have been infected with 2.3 million annual infections." (Abstract and Results). 95% CIs are needed for these estimates?

We now include confidence intervals for these estimates.

12) Sample size calculations will be useful to others. However, those currently given in the supplementary text use a formula without further justification (or reference to any justification) and condition on the fact that 70 communities are being sampled. To be more useful, I think it would be helpful if the authors could discuss the sample size implications of sampling more (or fewer) communities for given correlations between villages. For example, for the observed within-village correlation, how would the required sample size change as a function of the number of communities sampled? What would the impact of spatial correlation be on these numbers? What about different levels of within village correlation? This would be useful for others planning such studies and I think it would also be appropriate to discuss this issue in the Discussion section, i.e., what are the resources required for a given precision/resolution and how are the resources required likely affected by study design choices. It would also be useful if the authors could at least discuss the issue of spatial correlation and discuss the merits/demerits of accounting for it in the modelling in general (as well as in this specific example).

We thank the reviewer for the comment – we agree that sample size considerations are important. The optimal number of communities to visit will depend on the size and distribution of the population, the underlying level and heterogeneity of infection in the population, the required level of precision, the available budget and whether the goal is to measure overall exposure or have community-specific estimates. We did not feel we could do justice to all these considerations in a simple sample-size guide. Instead, we feel the overall findings of the paper will help future studies. We also now report the impact of within household and within community correlation and provide rough example guidance in the Discussion.

With regards to the issue of accounting for spatial correlation, we now include analyses where we explicitly consider the impact of not including it – both on coefficient estimates (Figure 2—figure supplement 2) and how it helps with spatial prediction (Figure 3).

In the Results we include the estimates of the intraclass correlation coefficients. In the Discussion we include:

“The approach presented here could be used as a strategy for other countries interested in obtaining national estimates of disease risk. […] Sampling as few as 20 individuals per community still provides robust nationwide estimates, however, in practice, there are fewer budget and time constraints to sampling additional individuals within a community than visiting additional communities.”

We also include “While we incorporated spatial correlation into our risk factor regression analyses, in practice this only resulted in a relatively small improvement in model fit compared to hierarchical models with random intercepts at the community and household levels. […] This suggests that the covariance structure is a better predictor than the basic mosquito absence/presence data as the covariance is driven by, and absorbs, the true underlying environmental drivers including mosquito distributions.”

13) "Reported having dengue" and "heard of dengue" seem peculiar things to include in the regression, as these might be extended to be a consequence of having dengue rather than a potential risk factor. Perhaps this reflects the fact that the purpose of the regression analysis is not clearly stated. It would be good to have a clear statement about the purpose of the regression and a justification for the choice of variables to include.

We agree with the reviewer that it was confusing to include them in the regression as the main purpose was to look at risk factors. We now remove them from the regression.

14) The supplementary material describes the recruitment strategy: "the study staff identified the house where the most recent wedding had taken place and identified the closest neighbour. They then counted six households in a random direction to identify the first household for the study. To select each additional household for the study, they used the previous household as a starting point and counted six households in a random direction." This seems a little eccentric and not obviously guaranteed to give a random sample. Is there any justification for this choice? Wouldn't numbering households and selecting at random or throwing darts at maps be better? I think this point at least needs to be discussed (with recommendations for future studies) and this aspect of the methods should be moved to the Methods section in the main text.

There do not exist detailed maps for each community (nor indeed maps that can tell you reliably where any community is located). While mapping and enumerating households by the teams would be preferable – in reality this is rarely feasible where there can be hundreds of households and the teams only had a short amount of time in each community. The sampling approach has been used frequently by the teams in icddr,b to identify a random geographic point within the community to begin household enrolment. In practice it allows the teams to cover a substantial part of the communities. It is also difficult to think of how such an approach could have brought in a systematic bias, especially given there tend to not be many differences within Bangladeshi communities and over 95% of adults marry in the country. Nevertheless, we agree that the strategy may appear a little unusual and in the revised document we discuss that it would have been preferable to enumerate all households but this was not feasible in this study.

We now include the recruitment strategy in the main document. In the Discussion we include: “To randomly select households in a community, we would ideally have used a sampling frame of all households in the community. […] However, in cultural contexts where marriage rates may vary by community or location within a community, this method of choosing a random starting point could produce a biased sample.”

15) "may provide some guidance". Not sure what the intended meaning of this is. What kind of guidance?

This has now been removed.

16) Was any attempt made at assessing the accuracy of the recorded household data?

The interviewers visually confirmed the accuracy of the reported household characteristics through observation where this was possible.

17) The entomological approach to determining the presence or absence of Ae. aegypti is quite superficial. The intensity of surveillance (number of BG traps per community) and duration (time in the field) is too short to arrive at a conclusion of presence/absence. In dengue endemic cities, where Ae. aegypti has a well-documented presence, there can be quite marked spatial heterogeneity in the distribution of Ae. aegypti when measured by BG traps, e.g. some houses can be free of this species for consecutive weeks but in a house 50 metres away they can be caught regularly. In regards to Ae. albopictus prevalence, BG traps set indoors are not the optimal method of determining presence/absence; it would have been better to set them outdoors or to use outdoor ovitraps. The authors should qualify their conclusions by recognising that trapping method, intensity, duration and seasonality can all influence the likelihood of Aedes detection and this could change the conclusions of the manuscript.

We feel confident with our approach of determining presence or absence of Aedes within our sampled communities. We used BG traps that are very well suited for Aedes mosquitoes (including albopictus – see Maciel-de-Freitas et al., 2006; Obenauer et al., 2010). Further, we revisited all communities where we did not record Aedes populations in the initial visit, and in each visit we ran traps in 8 households for 24 hours each. This means that in negative communities, no mosquitoes were trapped over 16 x 24 = 384 trap-hours. We specifically targeted the known Aedes season for trapping mosquitoes in communities where the first trapping efforts detected no mosquitoes. We are also confident with the specificity of identifying species, as trained and experienced entomologists were used to speciate the mosquitoes (therefore false instances of presence are very unlikely). Finally, the strong spatial correlation in where each mosquito species was found (see Author response image 1) would be difficult to achieve if the mosquito data were not robust. Nevertheless, in the revised document, we include a discussion that we could have missed some mosquitoes (i.e., there may be some false negatives).

In the Discussion, we include “We used BG sentinel traps that have been shown to be well suited to trapping Aedes mosquitoes 34,35. […] This would mean that Aedes may be more widespread than we found.”

18) Is there any reason uncertainty (in either or all of the data, kriging model and the force of infection model) can't be propagated through to the final burden estimates? Comparing mean estimates with Bhatt et al. to prove that nationally representative surveys are needed probably also needs to consider uncertainty. I think they might have also included tertiary and quaternary infections as well if you want to be comparable.

Thank you for this suggestion. We now include uncertainty in the estimates. In Bhatt et al., the authors rely on fitting models to inapparent:apparent ratios from cohort data that have paired serology – as tertiary (and quaternary) infections will be IgG positive in both samples and IgM responses will be the same as secondary infections, this approach cannot usually detect these infections. A key issue of the Bhatt paper is that it failed to account for population immunity. Hence the high estimates. In many places (India included) they estimate more infections than births/year. In any event we now include tertiary and quaternary infections in the estimates, which gives similar estimates (only a small subset of the population has the right immune history to have these sequences of heterotypic infections) – it moves the total number of estimated infections from 2.32 million to 2.35 million.

We now include in the Results methods on how we calculated the number of infections. In the Results we include “Using a catalytic model to estimate the proportion seropositive by age, we estimated that 1.6% (95% CI: 1.5%-1.7%) of the susceptible population gets infected each year across the four serotypes, equivalent to an average of 2.4 million annual infections (95% CI: 2.2-2.5 million).”

19) Can code for statistical analysis be made available?

We will deposit code for the manuscript on GitHub.

20) "Our findings are in marked contrast to what has been observed with chikungunya in Bangladesh". Can the authors offer any hypothesis as to why this might be the case?

The differences in the chikungunya versus dengue experiences may be due to vector, level of endemicity or behavioral differences. We have changed this section to read “Our findings are in marked contrast to what has been observed with chikungunya in Bangladesh, where women in a community that had a widespread chikungunya epidemic were found to be at significantly increased risk of being infected compared to males, with the increased risk of infection linked to greater time spent in and around the home. These findings suggest that it may be difficult to generalize inferences across arboviruses due to differences in vector species and the frequency of introductions and risk of onwards spread.”

21) Can the authors provide at least one concrete example of how such a survey could lead to better decision-making about vaccination?

We now include the following: “For Bangladesh, our findings suggest that any vaccine rollout should be concentrated to the urban areas of Dhaka, Chittagong and Khulna. […] Therefore, any rollout will require the screening of individuals for presence of antibodies before vaccination to avoid placing large numbers of individuals at risk for more severe disease manifestations.”

22) It was felt that some of the statements about findings showing that lack of spread of aegypti is the reason behind heterogeneities in dengue transmission in Bangladesh were not fully justified in light of the known limitations of entomological surveying.

As discussed in the response to comment 17 above, we feel confident in our estimates of mosquito presence. Even for Aedes albopictus, the BG trap has been shown to be very good (see https://doi.org/10.1603/EN09322). We would not observe such clear spatial correlation in where the mosquitoes were found (and not found) if the observations were not robust. Nevertheless, we agree that it is possible that we have missed the mosquitoes in a small number of communities (i.e. false negatives). We include this point in the Discussion. The effect of mosquito distribution has also been downplayed throughout the document.

We now include in the Discussion: “We used BG sentinel traps that have been shown to be well suited to trapping Aedes mosquitoes (Maciel-de-Freitas, Eiras and Lourenço, 2006; Obenauer et al., 2010). […] This would mean that Aedes may be more widespread than we found.”

23) The authors correctly point to the possibility that the Panbio-based seroprevalence survey might have reflected past JEV exposure and cite the low JE case incidence to suggest dengue is the primary culprit for the seroprevalence. This might be true, but JEV is notoriously difficult to diagnose in the absence of laboratory testing and there would almost certainly be under-reporting of cases in Bangladesh. Having a random subset of samples tested by DENV/JEV PRNT50 assay, regarded as the most specific assay of DENV serostatus, could have helped clarify this point. Can the authors do this readily, perhaps on a sample of 100 or so (or use Luminex as a less preferred option)? Though not essential, it was felt this would improve the quality of the manuscript if it could be done within 2 months. If it can't be done the Discussion needs to be qualified accordingly.

We agree that additional PRNT testing for JE and dengue on some of our samples would be useful, but we are unable to arrange this within a few months. The reviewer is correct that JEV is difficult to diagnose, but our estimation that past JEV exposure is very low among the general population is based upon laboratory confirmed patients who had samples tested at the US CDC as part of a long-running hospital-based surveillance program. These data showed that approximately half of all cases diagnosed are adults, suggesting a low force of infection. A community-based study that combined a population-based door to door survey, looking for people who had ever had an illness compatible with encephalitis, with the estimated prevalence of JEV among hospitalized encephalitis patients with confirmed infection estimated only 2.7 cases/100,000 population in that part of the country. The low level of JE in Bangladesh is not surprising, as pigs (the main reservoir for JE) are uncommon in the predominantly Muslim country.

Finally, laboratory-diagnosed JEV cases in Bangladesh have all been rural residents, and the level of dengue seropositivity we found was low in rural environments. This means the level of false positives due to JE exposure is likely to be minimal. Note that Luminex testing does not resolve the cross-reactivity problems completely and neutralization testing would still be necessary. We have clarified this in the Discussion.

The Discussion now includes: “Cross-reactivity of antibodies is a problem for all seroprevalence studies, especially with flaviviruses. […] As we estimated only low levels of dengue seropositivity in rural communities, the number of false positives from Japanese encephalitis cross-reactivity is likely to be small.”
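
The age-distribution reasoning in the response above (a large share of adult cases pointing to a low force of infection) can be illustrated with a simple catalytic model. The sketch below is not the authors' analysis. It assumes a constant force of infection, lifelong immunity after infection, uniform survival to a fixed maximum age, and that diagnosed cases track infections. The function name, the adult age cut-off of 18 years, the maximum age of 70 years and the example force-of-infection values are all hypothetical choices made for illustration.

```python
import math


def adult_share_of_infections(foi_per_year: float,
                              adult_age: float = 18.0,
                              max_age: float = 70.0) -> float:
    """Share of a cohort's infections that occur at or after adult_age.

    Illustrative assumptions (not taken from the article): constant force
    of infection, lifelong immunity, everyone survives to max_age.
    """
    if foi_per_year <= 0:
        raise ValueError("force of infection must be positive")
    if not 0 <= adult_age < max_age:
        raise ValueError("need 0 <= adult_age < max_age")
    # Probability of remaining uninfected up to a given age.
    susceptible_at_adult = math.exp(-foi_per_year * adult_age)
    susceptible_at_max = math.exp(-foi_per_year * max_age)
    # Infections between adult_age and max_age, divided by all infections
    # that occur between birth and max_age.
    return (susceptible_at_adult - susceptible_at_max) / (1.0 - susceptible_at_max)


# Hypothetical force-of-infection values, chosen only to show the trend.
for foi in (0.01, 0.05, 0.20):
    share = adult_share_of_infections(foi)
    print(f"FOI {foi:.2f} per year: adult share of infections = {share:.2f}")
```

Under these assumptions the adult share falls steadily as the force of infection rises, because most people are infected in childhood when transmission is intense. That is the direction of the argument in the response; the model is too crude to put a number on the JEV force of infection in Bangladesh.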

24) Materials and methods: when were interviews conducted in relation to the dengue season? Could this have introduced recall bias of "whether diagnosed with dengue"?

The interviews were conducted towards the end of the dengue season. However, as the probability of infection in any particular season is small, any infections are likely to be several years old, and it is very possible that they were not recalled or were falsely recalled. The purpose of the question was to assess whether asking somebody about their dengue history was at all informative of their true status.

25) Was there any evidence that travel in the last 7 days was a reasonable proxy for long term travel history?

We had no external validation of this measure, although, on a population scale at least, it seems likely that recent travel is a marker of frequently leaving the community. We include this in the limitations in the revised document.

We include: “We used the time period since individuals last left the community as a marker of travel. While this is likely to broadly capture trends in mobility, it remains a crude marker and more detailed measures of movement (from e.g., movement diaries, GPS monitors) would help provide a more detailed understanding of how people move.”

26) "Our finding of a negative correlation between Ae. aegypti and Ae. albopictus presence is consistent with competition between the two species" – or is just evidence that they have different environmental niches?

We agree that this is an alternative explanation and have added: “Our finding of a negative correlation between Ae. aegypti and Ae. albopictus presence is consistent with the species occupying different environmental niches or even competition between the two species.”

https://doi.org/10.7554/eLife.42869.020

Article and author information

Author details

  1. Henrik Salje

    1. Mathematical Modelling of Infectious Diseases Unit, Institut Pasteur, UMR2000, CNRS, Paris, France
    2. Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, United States
    Contribution
    Conceptualization, Formal analysis, Supervision, Funding acquisition, Validation, Investigation, Visualization, Methodology, Writing—original draft, Project administration
    For correspondence
    hsalje@pasteur.fr
    Competing interests
    No competing interests declared
    ORCID: 0000-0003-3626-4254
  2. Kishor Kumar Paul

    International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b), Dhaka, Bangladesh
    Contribution
    Data curation, Supervision, Investigation, Project administration, Writing—review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-6054-3571
  3. Repon Paul

    International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b), Dhaka, Bangladesh
    Contribution
    Project administration, Writing—review and editing
    Competing interests
    No competing interests declared
  4. Isabel Rodriguez-Barraquer

    University of California, San Francisco, San Francisco, United States
    Contribution
    Formal analysis, Methodology, Writing—review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0001-6784-1021
  5. Ziaur Rahman

    International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b), Dhaka, Bangladesh
    Contribution
    Supervision, Investigation, Methodology, Writing—review and editing
    Competing interests
    No competing interests declared
  6. Mohammad Shafiul Alam

    International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b), Dhaka, Bangladesh
    Contribution
    Investigation, Methodology, Writing—review and editing
    Competing interests
    No competing interests declared
  7. Mahmadur Rahman

    Institute of Epidemiology, Disease Control and Research (IEDCR), Dhaka, Bangladesh
    Contribution
    Supervision, Investigation, Project administration, Writing—review and editing
    Competing interests
    No competing interests declared
  8. Hasan Mohammad Al-Amin

    International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b), Dhaka, Bangladesh
    Contribution
    Supervision, Investigation, Methodology, Writing—review and editing
    Competing interests
    No competing interests declared
  9. James Heffelfinger

    Division of Global Health Protection, Center for Global Health, Centers for Disease Control and Prevention, Atlanta, United States
    Contribution
    Project administration, Writing—review and editing
    Competing interests
    No competing interests declared
  10. Emily Gurley

    Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, United States
    Contribution
    Conceptualization, Formal analysis, Supervision, Funding acquisition, Investigation, Methodology, Project administration, Writing—review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-8648-9403

Funding

Centers for Disease Control and Prevention

  • Henrik Salje
  • Emily Gurley

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Ethics

Human subjects: This study was approved by the icddr,b ethical review board (protocol number PR-14058). The U.S. Centers for Disease Control and Prevention relied on icddr,b's ethical review board approval. All adult participants provided written, informed consent after receiving a detailed explanation of the study and procedures. Parents/guardians of all child participants were asked to provide written, informed consent on their behalf.

Senior Editor

  1. Neil M Ferguson, Imperial College London, United Kingdom

Reviewing Editor

  1. Ben Cooper, Mahidol Oxford Tropical Medicine Research Unit, Thailand

Reviewers

  1. Ben Cooper, Mahidol Oxford Tropical Medicine Research Unit, Thailand
  2. Oliver Brady, London School of Hygiene & Tropical Medicine, United Kingdom

Publication history

  1. Received: October 15, 2018
  2. Accepted: April 4, 2019
  3. Accepted Manuscript published: April 8, 2019 (version 1)
  4. Version of Record published: May 13, 2019 (version 2)
  5. Version of Record updated: May 16, 2019 (version 3)

Copyright

This is an open-access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.
