Linking rattiness, geography and environmental degradation to spillover Leptospira infections in marginalised urban settings: An eco-epidemiological community-based cohort study in Brazil

  1. Max T Eyre (corresponding author)
  2. Fábio N Souza
  3. Ticiana SA Carvalho-Pereira
  4. Nivison Nery
  5. Daiana de Oliveira
  6. Jaqueline S Cruz
  7. Gielson A Sacramento
  8. Hussein Khalil
  9. Elsio A Wunder
  10. Kathryn P Hacker
  11. José E Hagan
  12. James E Childs
  13. Mitermayer G Reis
  14. Mike Begon
  15. Peter J Diggle
  16. Albert I Ko
  17. Emanuele Giorgi
  18. Federico Costa
  1. Centre for Health Informatics, Computing, and Statistics, Lancaster University Medical School, United Kingdom
  2. Liverpool School of Tropical Medicine, United Kingdom
  3. Institute of Collective Health, Federal University of Bahia, Brazil
  4. Swedish University of Agricultural Sciences, Sweden
  5. Oswaldo Cruz Foundation, Brazilian Ministry of Health, Brazil
  6. Department of Epidemiology of Microbial Diseases, Yale School of Public Health, United States
  7. University of Pennsylvania, United States
  8. World Health Organization (WHO) Regional Office for Europe, Denmark
  9. Department of Evolution, Ecology and Behaviour, University of Liverpool, United Kingdom

Abstract

Background:

Zoonotic spillover from animal reservoirs is responsible for a significant global public health burden, but the processes that promote spillover events are poorly understood in complex urban settings. Endemic transmission of Leptospira, the agent of leptospirosis, in marginalised urban communities occurs through human exposure to an environment contaminated by bacteria shed in the urine of the rat reservoir. However, it is unclear to what extent transmission is driven by variation in the distribution of rats or by the dispersal of bacteria in rainwater runoff and overflow from open sewer systems.

Methods:

We conducted an eco-epidemiological study in a high-risk community in Salvador, Brazil, by prospectively following a cohort of 1401 residents to ascertain serological evidence for leptospiral infections. A concurrent rat ecology study was used to collect information on the fine-scale spatial distribution of ‘rattiness’, our proxy for rat abundance and exposure of interest. We developed and applied a novel geostatistical framework for joint spatial modelling of multiple indices of disease reservoir abundance and human infection risk.

Results:

The estimated infection rate was 51.4 (95% CI 40.4, 64.2) infections per 1000 follow-up events. Infection risk increased with age until 30 years of age and was associated with male gender. Rattiness was positively associated with infection risk for residents across the entire study area, but this effect was stronger in higher elevation areas (OR 3.27; 95% CI 1.68, 19.07) than in lower elevation areas (OR 1.14; 95% CI 1.05, 1.53).

Conclusions:

These findings suggest that, while frequent flooding events may disperse bacteria in regions of low elevation, environmental risk in higher elevation areas is more localised and directly driven by the distribution of local rat populations. The modelling framework developed may have broad applications in delineating complex animal-environment-human interactions during zoonotic spillover and identifying opportunities for public health intervention.

Funding:

This work was supported by the Oswaldo Cruz Foundation and Secretariat of Health Surveillance, Brazilian Ministry of Health, the National Institutes of Health of the United States (grant numbers F31 AI114245, R01 AI052473, U01 AI088752, R01 TW009504 and R25 TW009338); the Wellcome Trust (102330/Z/13/Z), and by the Fundação de Amparo à Pesquisa do Estado da Bahia (FAPESB/JCB0020/2016). MTE was supported by a Medical Research UK doctorate studentship. FBS participated in this study under a FAPESB doctorate scholarship.

Editor's evaluation

In their work, the authors present a novel geostatistical framework allowing for modelling complex animal-environment-human interactions during zoonotic spillover. The presented case relates to zoonotic spillover of Leptospira infections in a marginalised urban setting in Salvador, Brazil. The outcomes of such applications could contribute to inform public health interventions. The methodological approach is to be applauded and can be of benefit beyond the study of zoonotic spillover.

https://doi.org/10.7554/eLife.73120.sa0

Introduction

Zoonotic spillover, the transmission of pathogens from infected vertebrate animals to humans, is responsible for a significant public health burden globally. Understanding the processes that promote spillover transmission is essential for improving our ability to predict and prevent spillover events, but for many zoonoses, such as Leptospira interrogans, Escherichia coli O157 and Giardia spp., they are poorly understood (Plowright et al., 2017). This is due to the complex nature of the spillover system, in which the probability of transmission is governed by dynamic interactions in space and time between ecological, epidemiological, behavioural, and immunological factors that determine pathogen pressure, exposure and host susceptibility. Zoonotic spillover research must explore interactions between the environment, disease reservoirs and local epidemiology, presenting two central challenges: (i) the need for transdisciplinary studies at the animal-human disease interface (a One Health approach) that accurately collect data on multiple components of the spillover process at common temporal and spatial scales at which these events take place; (ii) the development of integrative approaches to jointly analyse these diverse datasets within a spatially and temporally explicit framework (Plowright et al., 2017; Becker et al., 2019; Dhewantara et al., 2019).

Leptospirosis, a neglected zoonotic disease caused by pathogenic bacteria from the genus Leptospira, is an important example of zoonotic spillover. Globally, it is estimated to cause more than one million cases and over 58,000 deaths each year (Costa et al., 2015a), with an annual global burden of 2.9 million disability-adjusted life years (DALYs) (Torgerson et al., 2015). This burden falls heavily on marginalised urban populations in low- and middle-income countries who live in areas characterised by high population density, poor quality housing and inadequate provision of healthcare, sanitation, and waste management services. In these settings, leptospiral infection occurs through contact with water or soil contaminated with leptospires shed in the urine of the principal reservoir, the Norway rat (Rattus norvegicus; Bierque et al., 2020). These areas produce the socio-ecological conditions that allow rodent populations to proliferate and leptospires to persist for long periods in the environment (Goarant, 2016). Residents consequently have frequent, intense and largely unavoidable exposure to the contaminated environment, often exacerbated by their geographical vulnerability to flooding events (Lau et al., 2010). In response, the World Health Organisation (WHO) has convened the Leptospirosis Burden Epidemiology Reference Group (LERG) which has recommended ‘Targeted intervention based on the improved knowledge of disease ecology’ (WHO, 2010), highlighting the current knowledge gap for Leptospira transmission mechanisms and target points for effective intervention.

Multiple studies have helped to elucidate key aspects of the Leptospira transmission cycle in urban settings, identifying socioeconomic vulnerability, household environment and behavioural exposures as important determinants of infection risk (Reis et al., 2008; Felzemburgh et al., 2014; Hagan et al., 2016; Khalil et al., 2021; Barcellos and Sabroza, 2000; Barcellos and Sabroza, 2001; Mwachui et al., 2015; Keenan et al., 2010; Goarant, 2016; Prabhakaran et al., 2014; Briskin et al., 2019). However, these variables have been unable to explain fine-scale spatial variation in risk (Reis et al., 2008; Hagan et al., 2016). This is likely to be driven by the high spatial and temporal heterogeneity in environmental risk, observed in recent studies of Leptospira in soil, and surface and sewage waters (Schneider et al., 2018; Casanovas-Massana et al., 2018; Bierque et al., 2020). These findings lead to two key questions: (i) to what extent does environmental contamination by localised rat shedding drive infection risk, rather than exposure to leptospires that have been dispersed by rainwater runoff and overflowing sewer systems; and (ii) how does this change across the geography of a community, for example at different elevation levels?

Establishing a dynamic link between rats, the environment and Leptospira transmission is complicated by the difficulty of measuring and modelling the rat contamination process. However, urban Norway rats have been found to have high Leptospira prevalence and shedding rates worldwide (Pellizzaro et al., 2019; Costa et al., 2014a; Boey et al., 2019; Yusof et al., 2019; Krøjgaard et al., 2009; Costa et al., 2015b; de Faria et al., 2008). This suggests that rat abundance may be predictive of environmental risk, and could be used as a proxy for this shedding process. While several studies have identified associations between infection risk and household rat sightings and infestation (Reis et al., 2008; Costa et al., 2014b; Hagan et al., 2016; Costa et al., 2021; Pellizzaro et al., 2019; Bhardwaj et al., 2008), their ability to explore fine-scale spatial variation in risk was limited by a reliance on household infestation surveys or aggregation of incidence and abundance indices to a common coarse spatial scale. All modelled abundance as a regression covariate, thereby not accounting for uncertainty in its measurement. The absence of methods applied to formally integrate abundance and spillover infection data is an issue for rodent-borne zoonoses more widely (Bordes et al., 2015; Dhewantara et al., 2019).

There is no gold-standard index of abundance and field teams use a range of imperfect indices, such as traps, infestation surveys and track plates. In our previous work, we developed a multivariate generalized linear geostatistical model for joint spatial modelling of multiple imperfect abundance indices (Eyre et al., 2020). We use the term ‘abundance’ here to denote all ecological processes that are associated with animal abundance and measured by abundance indices, for example animal presence, density and activity, and that may be useful to quantify exposure to a zoonotic disease of interest. This methodology was then used to model the spatial distribution of ‘rattiness’, our proxy for rat abundance, at a fine scale within a community in Salvador, Brazil (Eyre et al., 2020). The spatial distribution of rattiness was highly heterogeneous, suggesting that it could be a driver of micro-heterogeneity in infection risk.

To analyse reservoir host abundance (as defined previously) and infection data at fine spatial scales, we propose that a framework should (i) account for spatial correlation in human and reservoir host data; (ii) jointly model multiple imperfect indices of abundance while accounting for the appropriate sampling distribution of each index; (iii) account for uncertainty in abundance indices; (iv) allow for the prediction of abundance and infection risk at all locations within the study area; and (v) quantify the uncertainty associated with those predictions. Several studies have attempted to model spatial associations between disease reservoir or vector abundance and human infection for leptospirosis (Hurd et al., 2017; Lau et al., 2016; Mayfield et al., 2018), tularemia (Rotejanaprasert et al., 2018), Lyme disease (Nicholson and Mather, 1996), West Nile virus (Winters et al., 2008), dengue fever (Cromwell et al., 2017) and Lassa fever (Fichet-Calvet et al., 2007). However, none of the approaches used satisfies all five of the above conditions. The development of new tools for the joint spatial analysis of abundance and human infection may consequently be beneficial for the study of other zoonoses and vector-borne diseases (Eisen and Eisen, 2008).

The aim of this study was to develop a flexible modelling framework for zoonotic spillover to explore whether rattiness, acting as a proxy for local leptospiral contamination by Norway rats, can explain spatial heterogeneity in leptospiral transmission in a high-risk urban community in Brazil where 80% of rats are estimated to be actively shedding the bacteria (Costa et al., 2015b; de Faria et al., 2008). We extend the rattiness framework of Eyre et al., 2020 to include human infection risk. We describe findings from a transdisciplinary eco-epidemiological study which comprises a prospective community-based cohort study with two serosurveys and a fine-scale rat ecology study. The ecology study was used to collect information on the spatial distribution of rat abundance, our exposure of interest, in the period between the two surveys using multiple abundance indices. Then, we explore associations between infection risk, rattiness and a range of measured environmental and individual risk factors.

Materials and methods

Study design

Study area

The study was conducted in Pau da Lima community (13°32’53.47” S; 38°43’51.10” W), a marginalised informal settlement located in the city of Salvador, Northeast Brazil. The study site has an area of 0.25 km² and is characterised by three connected valleys with large elevation gradients, high population density and a heterogeneous environment of vegetation, paved surfaces and exposed soil (Figure 1). There are significant gradients in socioeconomic status and infrastructure quality over small elevation increases, with the most marginalised members of the community living at lower elevations. The community suffers from low quality housing, poor provision of waste management services and inadequate drainage and sanitation systems (Hagan et al., 2016; Hacker et al., 2020). Residents are consequently often unable to avoid intense exposure to mud and floodwater. These factors result in abundant rat populations (Eyre et al., 2020) and a high estimated annual Leptospira infection rate of 35.4 (95% CI 30.7, 40.6) infections per 1000 annual follow-up events (Hagan et al., 2016). For this reason, Pau da Lima has become an exemplar for investigating urban Leptospira transmission in Brazil over the last 15 years.

Study site and timeline.

(A) Map of the three valleys within the study site in Pau da Lima, with household locations for the serosurveys marked as orange circles. Locations sampled in the rat ecology study are shown for each of the rat abundance indices as follows: Plates & Signs (track plates, burrows, faeces and trails), Traps & Signs (traps, burrows, faeces and trails) and Signs only (burrows, faeces, and trails); (B) Land cover classification map (impervious cover is defined as man-made structures, e.g. pavement and buildings); (C) Study timeline for the two community serosurveys and rat ecology study.

Serosurveys

We conducted a prospective community cohort study with two serosurveys carried out in August-October 2014 and January-April 2015. After an initial census of the study site, all ground floor households were visited and inhabitants who met the eligibility criteria of ≥5 years of age who had slept ≥3 nights in the previous week in a study household were invited to join the study. This study focussed on ground floor households because they are vulnerable to flooding and consequently at high risk for leptospiral transmission. The criterion for determining whether a resident is currently living at a household location is commonly applied in this context to account for resident mobility.

During each survey, trained phlebotomists collected blood samples from participants and administered a modified version of the standardised questionnaire used previously (Costa et al., 2014b; Hagan et al., 2016). Information was collected on demographic and socioeconomic indicators, household environmental characteristics and exposures to potential sources of environmental contamination in the previous six months (the average time between the two serosurveys). Study data were collected and managed using REDCap electronic data capture tools (Harris et al., 2009) and all individual data were anonymised. The locations of sampled households are shown in Figure 1 - panel A. If an individual was not found during a sample collection visit, their house was revisited at least five times on different days of the week.

The microscopic agglutination test (MAT) was used to determine titers of agglutinating antibodies against pathogenic Leptospira in sera obtained from the blood samples collected in each serosurvey. Serological samples were reacted with a panel of two Leptospira reference strains that are dominant in Pau da Lima: Leptospira interrogans serovars Copenhageni (COPL1) and Cynopteri 3522 C (C3522C). These two strains have been shown to have the same performance in identifying MAT seroconversion in our prospective studies as the WHO recommended battery of 19 reference serovars. When agglutination was observed at a dilution of 1:50, the sample was titrated in serial twofold dilutions to determine the highest agglutination titer. The study outcome of leptospiral infection was defined as seroconversion (an MAT titer increase from negative to ≥1:50) or a fourfold increase in titer for either serovar between paired samples from cohort subjects. All laboratory analyses were performed in the Laboratory of Pathology and Molecular Biology at Fiocruz, Salvador. As part of quality control procedures, two independent evaluations were conducted by Yale University for all infected subjects and 8% of all samples, with high concordance between results.
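The outcome definition above can be written as a small decision rule. The following Python sketch is illustrative only: the function names, the encoding of titers as reciprocal dilutions (50 for 1:50, 0 for a negative sample) and the reading of 'fourfold increase' as 'at least fourfold' are assumptions, not details taken from the study's laboratory protocol.

```python
def seroconverted(baseline_titer, followup_titer):
    """Check one serovar's paired titers against the infection definition.

    Titers are reciprocal dilutions (50 means 1:50); 0 encodes a negative
    sample. Both the encoding and the function name are hypothetical.
    """
    if baseline_titer == 0:
        # Seroconversion: negative at baseline and >= 1:50 at follow-up.
        return followup_titer >= 50
    # Otherwise require an (at least) fourfold rise between paired samples.
    return followup_titer >= 4 * baseline_titer


def infected(paired_titers):
    """paired_titers maps serovar name -> (baseline, follow-up) titers.

    A subject counts as infected if either serovar meets the rule.
    """
    return any(seroconverted(b, f) for b, f in paired_titers.values())
```

For example, a subject with Copenhageni titers of (50, 100) and Cynopteri titers of (0, 100) would be classified as infected under this sketch, because the second serovar changes from negative to at least 1:50.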

Rat ecology study

To estimate exposure risk due to local rat contamination between the two serosurveys, a cross-sectional rat ecology study was conducted from October to December 2014. As has been described previously (Eyre et al., 2020), the aim of this study was to collect data on the fine-scale spatial variation in rat reservoir population abundance. Data were collected for five indices of rat abundance: live trapping, track plates, number of active burrows present, presence of faecal droppings and presence of trails. Rat trapping was carried out at 189 locations, randomly distributed across the study area (see Panti-May et al., 2016). Two traps were deployed for four consecutive 24 hr trapping periods at each location. Trapping success and trap closure without a rat, a common malfunction, were recorded after each 24 hr period. Track plates were placed at 415 locations for two consecutive 24 hr periods following the standardised protocol for placement and survey developed and validated previously (Hacker et al., 2016), with five plates placed at each location in the shape of a ‘five’ on a die. After each 24 hr period, plates were repainted and any lost plates were recorded and replaced. On the first day of trapping or plate placement, a survey for signs of rat infestation, adapted from the Centers for Disease Control and Prevention (2006) and validated in the study area (Costa et al., 2014b), was conducted within an area of 10 m radius around each trapping or plate location to record the number of active burrows and the presence of faecal droppings and trails. In total, 595 independent locations were sampled for traps, track plates and the three survey indices for signs of rat infestation. The spatial distribution of these locations is shown in Figure 1 - panel A. At 21 locations, theft and local gang violence meant that data for track plates and traps were not collected and only the three survey indices for signs of rat infestation were used.

Environmental data

In addition to the environmental survey conducted at each household location, we also collected information for three spatially continuous environmental variables: elevation relative to the bottom of each valley, distance to large public refuse piles and the proportion of land cover classified as impervious (man-made structures) within a 30 m radius. The land cover variable was created from Digital Globe’s WorldView-2 satellite imagery (8 bands) taken on February 17, 2013 which was classified using a maximum likelihood supervised algorithm and validated with ground truthed data collected from 20 randomly selected sites of size 5 m by 5 m. The classification map is shown in Figure 1 - panel B.

Ethics

Participants were enrolled according to written informed consent procedures approved by the Institutional Review Boards of the Oswaldo Cruz Foundation and Brazilian National Commission for Ethics in Research, Brazilian Ministry of Health (CAAE: 01877912.8.0000.0040) and Yale University School of Public Health (HIC 1006006956).

For the rat ecology study, the ethics committee for the use of animals from the Oswaldo Cruz Foundation, Salvador, Brazil, approved the protocols used (protocol number 003/2012), which adhered to the guidelines of the American Society of Mammalogists for the use of wild mammals in research (Sikes and Gannon, 2011) and the guidelines of the American Veterinary Medical Association for the euthanasia of animals (Leary et al., 2013). These protocols were also approved by the Yale University’s Institutional Animal Care and Use Committee (IACUC), New Haven, Connecticut (protocol number 2012–11498).

Joint modelling of rat abundance and human infection: the rattiness-infection framework

The developed geostatistical modelling framework jointly models multiple rat abundance indices as measurements of a common latent process, called rattiness. Rattiness at each household location contributes to the risk of infection for all inhabitants, in addition to other measured individual or household-level explanatory variables.

We model the rat abundance data following a similar structure to that previously outlined (Eyre et al., 2020). Let $R(x)$ denote a spatially continuous stochastic process, representing rattiness. The rat data then consist of a set of outcomes $Y_i = (Y_{i,k} : k = 1, \ldots, 5)$, for $i = 1, \ldots, N_r$, collected at a discrete set of locations $X = \{x_i : i = 1, \ldots, N_r\}$. The outcome variables $Y_k : k = 1, \ldots, 5$ are the set of five rat abundance indices that provide information about $R(x)$: traps ($k=1$), track plates ($k=2$), number of burrows ($k=3$), presence of faecal droppings ($k=4$) and presence of trails ($k=5$).

Human data are collected from $N_h$ households and consist of an infection outcome $Z_{i,j}$ for individual $j$ at household location $i$, for $i = N_r+1, \ldots, N_r+N_h$, collected at a discrete set of locations $X = \{x_i : i = N_r+1, \ldots, N_r+N_h\}$.

Let ‘[·]’ be a shorthand notation for ‘the probability distribution of ·’. We write $Y = (Y_1, \ldots, Y_{N_r})$, $Z = (Z_{N_r+1}, \ldots, Z_{N_r+N_h})$ and $R = (R(x_1), \ldots, R(x_{N_r+N_h}))$. We assume that the $Y_{i,k} : k = 1, \ldots, 5$ and $Z_{i,j}$ are conditionally independent given $R(x_i)$, from which it follows that

(1) $[Y, Z \mid R] = \prod_{i=1}^{N_r} \prod_{k=1}^{5} [Y_{i,k} \mid R(x_i)] \; \prod_{i=N_r+1}^{N_r+N_h} \prod_{j=1}^{J_i} [Z_{i,j} \mid R(x_i)],$

where $J_i$ denotes the number of individuals at household $i$. This model structure is shown schematically in Figure 2. The conditional independence assumption in Equation 1 is reasonable for a vector-borne disease or one that is transmitted indirectly, in which context the observed rat indices are to be considered as noisy indicators of the unobservable spatial variation in the extent to which the environment is contaminated with rat-derived pathogen. It would be more questionable for applications in which the disease of interest is spread by direct transmission from rat to human.

Directed acyclic graph (DAG) of the rattiness-infection model framework.

$R(x)$ is the value of a spatially continuous stochastic rattiness process at location $x$. The outcome variables $Y_k : k = 1, \ldots, 5$ are the set of five rat abundance indices that provide information about $R(x)$: traps ($k=1$), track plates ($k=2$), number of burrows ($k=3$), presence of faecal droppings ($k=4$) and presence of trails ($k=5$). The outcome variable $Z_{i,j}$ is the observed health outcome, which in this case represents infection status. The terms $d_h$ and $d_r$ represent the sets of spatially continuous explanatory variables which contribute to spatial variation in infection risk in humans and $R(x)$, respectively. The terms $d_h$ and $d_r$ are not mutually exclusive groups of explanatory variables and the same variables may contribute to both infection risk and $R(x)$. The term $e$ represents a set of individual- and household-level explanatory variables which contribute to variation in infection risk. Square objects correspond to observable variables, and circles to latent random variables.

Rattiness

We define rattiness at location x as

(2) $R(x_i) = d_r(x_i)^\top \beta_r + \sqrt{\psi}\, S(x_i) + \sqrt{1-\psi}\, U_i.$

The terms on the right-hand side of Equation 2 have the following interpretations: $d_r(x_i)$ is a vector of explanatory variables with associated regression coefficients $\beta_r$; the $U_i$ are a set of independently and identically distributed zero-mean Gaussian variables with unit variance; $S(x_i)$ is a stationary and isotropic spatial Gaussian process; $\psi \in (0,1)$ regulates the relative contributions of spatially structured variation, $S(x_i)$, and unstructured random variation, $U_i$, to $R(x_i)$.

For the Gaussian process, $S(x_i)$, we specify an exponential spatial correlation function:

$\mathrm{Corr}(S(x), S(x')) = e^{-u/\phi}$, where $u = \|x - x'\|$ is the Euclidean distance between $x$ and $x'$, and $\phi$ regulates how fast the spatial correlation decays to zero with increasing distance $u$.
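To give a feel for the scale parameter of the exponential correlation function, the short Python sketch below evaluates the correlation between two locations. The function name, the coordinate units (metres) and the value of the scale parameter are illustrative assumptions rather than estimates from this study.

```python
import math


def exponential_correlation(x, x_prime, phi):
    """Exponential correlation between S(x) and S(x') with scale phi > 0.

    x and x_prime are coordinate pairs in the same units as phi
    (assumed here to be metres).
    """
    u = math.dist(x, x_prime)  # Euclidean distance between the two locations
    return math.exp(-u / phi)


# Illustrative value only: with phi = 10 m, two points 10 m apart have
# correlation exp(-1), and points 30 m apart have correlation exp(-3).
corr_10m = exponential_correlation((0.0, 0.0), (0.0, 10.0), phi=10.0)
corr_30m = exponential_correlation((0.0, 0.0), (30.0, 0.0), phi=10.0)
```

Under this correlation function the correlation falls to roughly 0.05 at a distance of about three times the scale parameter, which is a convenient way to read a fitted value as a practical range.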

Rat abundance outcomes

The variable $Y_{i,1}$, conditionally on $R(x_i)$, is a binomial variable representing the number of traps, out of $n_{i,1}$, in which rats were captured. We assume that the times of rat captures from a trap follow a time-varying inhomogeneous Poisson process with intensity $t_i \mu_1(x_i)$, where $t_i$ is the time (in days) for which a trap is operative and $\log\{\mu_1(x_i)\} = \alpha_1 + \sigma_1 R(x_i)$. It follows that the probability of capturing a rat is

$1 - \exp\{-t_i \mu_1(x_i)\}.$

If a trap is found closed without a rat, we assume that the trap was disturbed and set t=0.5. In all other cases, t=1 day. We conducted a sensitivity analysis for this assumption (see ‘Appendix 6’) and found that it did not materially affect rattiness parameter estimates (Appendix 6—table 1).
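The effect of the operative-time assumption on the capture probability can be illustrated with a few lines of Python. This is a minimal sketch: the function name and the parameter values used in the example are hypothetical and are not the fitted estimates.

```python
import math


def capture_probability(alpha_1, sigma_1, rattiness, t_days=1.0):
    """Probability that a trap captures a rat while operative for t_days.

    Uses a log-linear capture intensity, log(mu_1) = alpha_1 + sigma_1 * R,
    and the capture probability 1 - exp(-t * mu_1).
    """
    mu_1 = math.exp(alpha_1 + sigma_1 * rattiness)
    return 1.0 - math.exp(-t_days * mu_1)


# Hypothetical parameter values, for illustration only.
p_full_day = capture_probability(alpha_1=-2.0, sigma_1=1.0, rattiness=0.0)
p_disturbed = capture_probability(alpha_1=-2.0, sigma_1=1.0, rattiness=0.0,
                                  t_days=0.5)
```

A trap treated as operative for half a day has a lower capture probability than one operative for a full day at the same rattiness, so a closed-but-empty trap contributes less evidence of low rattiness than an open, empty one.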

$Y_{i,2}$ is the number of track plates, out of $n_{i,2}$, that show presence of rats. We model this as a binomial variable with $n_{i,2}$ trials and probability $\mu_2(x_i)$, where $\log\{\mu_2(x_i)/(1-\mu_2(x_i))\} = \alpha_2 + \sigma_2 R(x_i)$.

$Y_{i,3}$ is the number of active rat burrows found at location $x_i$. We model this as a Poisson variable with rate $\mu_3(x_i)$, where $\log\{\mu_3(x_i)\} = \alpha_3 + \sigma_3 R(x_i)$.

The variables $Y_{i,4}$ and $Y_{i,5}$ are binary indicators taking value 1 if at least one faecal dropping or trail, respectively, was found at location $x_i$, and 0 otherwise. We model the probability of finding a sign of faecal droppings or trails, $\mu_4(x_i)$ and $\mu_5(x_i)$, using logit-linear regressions $\log\{\mu_4(x_i)/(1-\mu_4(x_i))\} = \alpha_4 + \sigma_4 R(x_i)$ and $\log\{\mu_5(x_i)/(1-\mu_5(x_i))\} = \alpha_5 + \sigma_5 R(x_i)$.
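The way a single rattiness value drives the remaining four indices through index-specific intercepts and scaling coefficients can be sketched as follows. The function names, dictionary keys and parameter values are hypothetical, chosen only to show the logit and log links at work.

```python
import math


def inv_logit(eta):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-eta))


def index_means(rattiness, alpha, sigma):
    """Map one rattiness value to the mean of indices k = 2, ..., 5.

    alpha and sigma are dicts keyed by index number k, holding the
    intercept and the rattiness scaling coefficient for that index.
    """
    eta = {k: alpha[k] + sigma[k] * rattiness for k in (2, 3, 4, 5)}
    return {
        "track_plate_prob": inv_logit(eta[2]),  # binomial probability
        "burrow_rate": math.exp(eta[3]),        # Poisson rate
        "faeces_prob": inv_logit(eta[4]),       # Bernoulli probability
        "trail_prob": inv_logit(eta[5]),        # Bernoulli probability
    }


# Hypothetical coefficients, for illustration only.
alpha = {2: 0.0, 3: 0.0, 4: 0.0, 5: 0.0}
sigma = {2: 1.0, 3: 1.0, 4: 1.0, 5: 1.0}
means_at_zero = index_means(0.0, alpha, sigma)
means_at_one = index_means(1.0, alpha, sigma)
```

Because every index shares the same latent value, a location with high rattiness is expected to score high on all indices at once, with each scaling coefficient controlling how strongly its index responds.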

Human infection outcome

Conditionally on $R(x_i)$, we model the binary human infection outcome $Z_{i,j}$ as a Bernoulli variable with probability $p_j(x_i)$ that individual $j$ at location $i$ is infected. This is modelled with a logit link function and the following linear predictor

(3) $\log\left\{\frac{p_j(x_i)}{1-p_j(x_i)}\right\} = \alpha_h + d_h(x_i)^\top \beta_h + e_{i,j}^\top \gamma + \xi(x_i) R(x_i) + V_i$

where: $d_h(x_i)$ is a vector of spatially continuous explanatory variables with associated regression coefficients $\beta_h$; $e_{i,j}$ is a vector of household-level and individual-level explanatory variables with associated regression coefficients $\gamma$; $V_i$ is a set of independently and identically distributed zero-mean Gaussian variables with variance $\sigma^2$ representing unexplained household-level variation; $\xi(x_i)$ regulates the contribution of rattiness to risk of infection.
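As an illustration of how the linear predictor translates into an individual infection probability, the Python sketch below evaluates it for given covariates and coefficients. All names and numerical values are hypothetical, and the household random effect is passed in as a fixed number rather than drawn from its Gaussian distribution.

```python
import math


def infection_probability(alpha_h, d_h, beta_h, e_ij, gamma, xi, rattiness,
                          v_i=0.0):
    """Infection probability for one individual from the logit-linear model.

    d_h / beta_h: spatial covariates and coefficients (equal-length lists).
    e_ij / gamma: individual- and household-level covariates and coefficients.
    xi: coefficient on rattiness at the household's location.
    v_i: household-level random effect (supplied, not simulated).
    """
    eta = (alpha_h
           + sum(d * b for d, b in zip(d_h, beta_h))
           + sum(e * g for e, g in zip(e_ij, gamma))
           + xi * rattiness
           + v_i)
    return 1.0 / (1.0 + math.exp(-eta))


def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)


# Hypothetical values, for illustration only.
p_low = infection_probability(-3.0, [1.0], [0.2], [1.0, 0.0], [0.5, 0.1],
                              xi=0.4, rattiness=0.0)
p_high = infection_probability(-3.0, [1.0], [0.2], [1.0, 0.0], [0.5, 0.1],
                               xi=0.4, rattiness=1.0)
odds_ratio = odds(p_high) / odds(p_low)
```

With everything else held fixed, a one-unit increase in rattiness multiplies the odds of infection by the exponential of its coefficient, which is how an estimated coefficient can be read as an odds ratio.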

Parameterising to test for an interaction with relative elevation

To explore variation in the role of local rat populations in transmission within sections of the study area with different flooding risk profiles, $\xi$ was parameterised to test for an interaction, on human infection risk, between rattiness and household elevation relative to the bottom of the valley. Relative elevation was modelled as a piecewise constant function with breaks at 6.7 and 15.6 m, resulting in three categories: low, medium and high elevation. This was implemented by first dividing the study area into three elevation categories with different flooding risk profiles (as observed during our work in the study area over the last 15 years): low (0-6.7 m from bottom of valley; high flooding risk with maintenance of floodwater for long periods), medium (6.7-15.6 m; moderate flooding with high water runoff), and high (>15.6 m; limited flooding and water runoff). Our study was then designed to evenly sample across this elevation gradient, and minimum and maximum values for each elevation category were chosen to include an equal number of households in each level. We then define the sets of household locations in the low, medium, and high elevation categories as $x_{\text{low}}$, $x_{\text{med}}$, and $x_{\text{high}}$, respectively. Three values of $\xi$ were then estimated such that:

(4) $\xi(x_i) = \begin{cases} \xi_{\text{low}} & \text{at locations } x_i \in x_{\text{low}} \\ \xi_{\text{med}} & \text{at locations } x_i \in x_{\text{med}} \\ \xi_{\text{high}} & \text{at locations } x_i \in x_{\text{high}} \end{cases}$
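The piecewise-constant parameterisation amounts to a lookup from a household's relative elevation to one of three coefficients. A minimal Python sketch is given below. The function names are hypothetical, the treatment of a household lying exactly on the 6.7 m break is an assumption (the text leaves that boundary ambiguous), and the coefficient values are placeholders rather than estimates.

```python
def elevation_category(relative_elevation_m):
    """Classify relative elevation (metres above the valley bottom).

    Breaks at 6.7 m and 15.6 m; 'high' is strictly above 15.6 m. Assigning
    exactly 6.7 m to 'low' is an assumption made for this sketch.
    """
    if relative_elevation_m <= 6.7:
        return "low"
    if relative_elevation_m <= 15.6:
        return "medium"
    return "high"


def rattiness_coefficient(relative_elevation_m, xi_by_category):
    """Return the rattiness coefficient that applies at a given elevation."""
    return xi_by_category[elevation_category(relative_elevation_m)]


# Placeholder coefficients, for illustration only (not fitted values).
xi_by_category = {"low": 0.1, "medium": 0.3, "high": 0.9}
xi_at_20m = rattiness_coefficient(20.0, xi_by_category)
```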

Variable selection

Predictors of rattiness

The exploratory analysis for the rattiness model followed the steps developed and described previously (Eyre et al., 2020). Firstly, we explored the functional form of the relationship between rattiness and three continuous explanatory variables: relative elevation, distance to large refuse piles and land cover type. To do this, we fitted a simplified rattiness model that did not include covariates or account for spatial correlation. Rattiness is consequently modelled purely as unstructured random variation; hence $R(x_i) = U_i$ (Eyre et al., 2020). We then computed the predictive expectation of this simplified rattiness process, $\hat{U}_i$, at all locations for which rat index measurements were observed. A generalized additive model (GAM) (Hastie and Tibshirani, 1987) was then fitted to the $\hat{U}_i$ with the three explanatory variables, and the shape of each fitted smooth function was used to assess whether the relationship between each variable and rattiness was linear. Non-linear relationships were modelled using linear splines based on the identified functional form, with knots placed at relative elevations of 8 m and 22 m, and at a distance from large refuse piles of 50 m (see Appendix 1—figure 1). For variable selection, linear models with all combinations of these variables were fitted and ranked by their Akaike Information Criterion (AIC) value (Bozdogan, 1987). The model with the lowest AIC included all of the variables and their linear splines (Appendix 2—table 1).
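A linear spline with fixed knots can be built from a simple set of basis terms, which is one way to picture how relative elevation enters the model with knots at 8 m and 22 m. The Python sketch below uses a truncated-linear basis. This particular basis and the function name are assumptions made for illustration, and the authors' exact parameterisation may differ.

```python
def linear_spline_basis(x, knots):
    """Truncated-linear spline basis for one covariate value.

    Returns [x, max(0, x - k1), max(0, x - k2), ...]. A regression on these
    terms is piecewise linear in x, with the slope allowed to change at
    each knot.
    """
    return [x] + [max(0.0, x - knot) for knot in knots]


# Relative elevation of 10 m with knots at 8 m and 22 m (knots from the text).
basis_at_10m = linear_spline_basis(10.0, [8.0, 22.0])
# Distance of 30 m from a refuse pile with a single knot at 50 m.
basis_at_30m = linear_spline_basis(30.0, [50.0])
```

With this basis, the coefficient on each truncated term is the change in slope at that knot, so the fitted relationship stays continuous while its gradient differs between segments.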

Following the methodology outlined previously (Eyre et al., 2020), we fitted the full geostatistical rattiness model using the variables selected in ‘Predictors of rattiness’. We then plugged in the maximum likelihood estimates and made predictions for rattiness at all human household locations; here, the predictive target is $T(x) = d_r(x)^\top \beta_r + \sqrt{\psi}\, S(x)$ rather than $R(x)$ as defined by Equation 2, because the predicted value of the spatially uncorrelated $U(x)$ at any location $x$ where rat abundance indices have not been recorded is zero. The expectation of this predictive distribution was then computed to provide an estimate of mean predicted rattiness at all household locations. This was then used as an exploratory covariate in the following section.

Risk factors for human infection

All explanatory variables were grouped into the following four domains: social status, household environment, occupational exposures and behavioural exposures (see Table 2 for the full list of considered variables by group). A group of a priori confounding variables was then identified, with age, gender and household per capita income selected based on previous findings (Hagan et al., 2016; Reis et al., 2008; Felzemburgh et al., 2014), and valley also included to account for otherwise unmeasured differences between the three valley regions within the study area. In the household environment domain, two variables were used to capture risk due to sewer flooding close to the household: (i) the presence of an open sewer within 10 metres of the household location and (ii) a binary ‘unprotected from open sewer’ variable which identified those households within 10 metres of an open sewer that did not have any physical barriers erected to prevent water overflow. Three high-risk occupations were included in the occupational exposures domain as binary variables. Construction workers and refuse collectors have direct contact with potentially contaminated soil, building materials and refuse in areas that provide harbourage and food for rats. Travelling salespeople have regular and high levels of exposure to the environment (particularly during flooding events) as they move from house to house by foot. Two other binary occupational exposure variables were included that measured whether a participant worked in an occupation that involves contact with floodwater or sewer water.

The relationship between continuous explanatory variables and infection risk (on the log-odds scale) was assessed for linearity by fitting a GAM while controlling for the four confounders. As before, non-linear relationships were modelled using linear splines based on the identified functional form. Age was modelled with a knot at 30 years old, education at 5 years and relative elevation at 20 m (Appendix 1—figure 2). A univariable analysis was conducted to explore the relationship between each explanatory variable and infection risk while controlling for the four a priori confounding variables. Crude and adjusted odds ratios were estimated using a mixed effects logistic regression with a random effect to account for unexplained variation at the household-level.
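
A linear spline with fixed knots can be encoded by splitting each continuous variable into per-segment increments, so that each regression coefficient is the slope within one segment. The sketch below illustrates this for age with a knot at 30 years. The function name `linear_spline_basis` is hypothetical, and the code is written in Python for illustration only (the published analysis used R); it assumes non-negative input values.

```python
def linear_spline_basis(value, knots):
    """Split `value` into per-segment increments for a broken-stick model.

    With knots [k1, k2, ...], column j holds the part of `value` that falls
    between the j-th lower and upper bound, so the coefficient on column j
    is the slope within that segment. Assumes value >= 0.
    """
    value = float(value)
    bounds = [0.0] + [float(k) for k in sorted(knots)] + [float("inf")]
    basis = []
    for lower, upper in zip(bounds[:-1], bounds[1:]):
        # Amount of `value` lying inside [lower, upper), clipped at zero.
        basis.append(max(0.0, min(value, upper) - lower))
    return basis


# Age with a knot at 30 years: a 42-year-old contributes 30 years to the
# first segment and 12 years to the second.
print(linear_spline_basis(42, [30]))  # [30.0, 12.0]
print(linear_spline_basis(18, [30]))  # [18.0, 0.0]
```

Under this encoding, an odds ratio reported "per year" for the 0–30 segment applies only to the first column, and the odds ratio for the >30 segment applies only to the second.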

For the multivariable model, variable selection was conducted within each domain separately. Mixed-effect logistic regression models were fitted for all combinations of the variables in each domain and were ranked by their Akaike Information Criterion (AIC) value (Appendix 2—table 2). Variables in the model with the minimum AIC value were selected for each domain. Age, gender, household per capita income and valley were controlled for in all models throughout this process. Then, the variables selected from each domain were combined and the mean predicted rattiness estimate (obtained in ‘Predictors of rattiness at each household location’) was included with an interaction with relative elevation category. This set of variables was reduced once more following the same process and all selected variables were included in the final multivariable model (‘Appendix 3’).
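
The within-domain selection step can be sketched as an exhaustive search over variable subsets, ranked by AIC, with the a priori confounders kept in every model. The code below is a minimal illustration, not the authors' implementation: `rank_subsets` and `toy_fit` are hypothetical names, the variable labels are placeholders, and the log-likelihood gains are invented numbers standing in for a real mixed-effects logistic regression fit.

```python
from itertools import combinations


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: lower is better."""
    return 2 * n_params - 2 * log_likelihood


def rank_subsets(candidates, forced, fit):
    """Fit every subset of `candidates` (always keeping `forced`), sort by AIC.

    `fit` is a user-supplied callable returning (log_likelihood, n_params)
    for a tuple of variable names.
    """
    results = []
    for size in range(len(candidates) + 1):
        for subset in combinations(candidates, size):
            log_lik, n_params = fit(tuple(forced) + subset)
            results.append((aic(log_lik, n_params), subset))
    return sorted(results, key=lambda item: item[0])


# Toy stand-in for a model fit: invented log-likelihood gains, NOT study results.
TOY_GAIN = {"open_sewer": 2.5, "impervious": 4.0, "hillside": 0.2}


def toy_fit(variables):
    log_lik = -300.0 + sum(TOY_GAIN.get(v, 0.0) for v in variables)
    return log_lik, len(variables) + 1  # +1 for the intercept


forced = ["age", "gender", "income", "valley"]
ranking = rank_subsets(list(TOY_GAIN), forced, toy_fit)
best_aic, best_subset = ranking[0]
print(best_subset, best_aic)
```

Because each extra parameter adds 2 to the AIC, a variable is retained in this toy example only if it raises the log-likelihood by more than 1.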

Model fitting

All rat and human variables selected in ‘Variable selection’ were then included in the full joint model defined in Equation 2 and Equation 3. We fitted this model using the Monte Carlo maximum likelihood (MCML) method (Christensen, 2004) as described in ‘Appendix 4’, and computed 95% confidence intervals by re-fitting the model to 1000 parametric bootstrap samples. A formal diagnostic investigation of randomized quantile residuals (Dunn and Smyth, 1996; Smyth et al., 2021) is included in ‘Appendix 7’. We found no evidence in the diagnostic plots (Appendix 7—figure 1) to suggest that there were issues with our modelling approach.
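
The idea of a parametric bootstrap can be illustrated on a much simpler model than the joint rattiness-infection model. The sketch below fits a single Bernoulli probability, simulates new data sets from the fitted model, re-fits each one, and takes percentiles of the re-fitted estimates. The percentile construction is an assumption (the text does not state how intervals were derived from the bootstrap fits), the function names are hypothetical, and the data are invented.

```python
import random


def fit_proportion(outcomes):
    """Maximum likelihood estimate of a Bernoulli probability."""
    return sum(outcomes) / len(outcomes)


def parametric_bootstrap_ci(outcomes, n_boot=1000, level=0.95, seed=1):
    """Percentile interval from re-fitting the model to simulated data sets."""
    rng = random.Random(seed)
    p_hat = fit_proportion(outcomes)
    n = len(outcomes)
    estimates = []
    for _ in range(n_boot):
        # Simulate a new data set from the FITTED model, then re-fit.
        simulated = [1 if rng.random() < p_hat else 0 for _ in range(n)]
        estimates.append(fit_proportion(simulated))
    estimates.sort()
    tail = (1 - level) / 2
    lower = estimates[round(tail * n_boot)]
    upper = estimates[round((1 - tail) * n_boot) - 1]
    return p_hat, lower, upper


# Invented toy data (30 events in 400 trials), not the study cohort.
data = [1] * 30 + [0] * 370
estimate, lower, upper = parametric_bootstrap_ci(data)
print(f"estimate={estimate:.3f}, 95% CI=({lower:.3f}, {upper:.3f})")
```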

Prediction maps

The maximum likelihood parameter estimates were then used to make prediction maps for rattiness and infection risk as follows.

To map a general predictive target, $T(x)$ say, we first define $T^* = (T(x_1^*), \ldots, T(x_H^*))$, where $X^* = \{x_1^*, \ldots, x_H^*\}$ is a finely spaced grid of locations covering the region of interest. We then draw samples from the predictive distribution of $T^*$, that is, its conditional distribution given all relevant data. These samples can then be used to compute any desired summary of the predictive distribution. In our analysis, we used the expectation and 95% prediction interval as summaries.
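
Once predictive samples are available, the summaries are computed location by location. The sketch below assumes the samples are stored as a list of draws, each holding one value per grid location, and uses an equal-tailed interval based on empirical quantiles; the function name `summarise_predictive_samples` and the simulated draws are illustrative assumptions.

```python
import random
import statistics


def summarise_predictive_samples(samples, level=0.95):
    """Mean and equal-tailed interval at each grid location.

    `samples[b][h]` is the b-th draw of the target at grid location h.
    """
    n_locations = len(samples[0])
    tail = (1 - level) / 2
    summaries = []
    for h in range(n_locations):
        draws = sorted(draw[h] for draw in samples)
        lower = draws[round(tail * (len(draws) - 1))]
        upper = draws[round((1 - tail) * (len(draws) - 1))]
        summaries.append((statistics.fmean(draws), lower, upper))
    return summaries


# Invented draws: 2000 samples at 3 grid locations.
rng = random.Random(0)
true_means = [-1.0, 0.0, 2.0]
samples = [[rng.gauss(m, 1.0) for m in true_means] for _ in range(2000)]
for mean, lower, upper in summarise_predictive_samples(samples):
    print(f"mean={mean:.2f}, 95% interval=({lower:.2f}, {upper:.2f})")
```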

Our first predictive target is rattiness, for which $T(x) = d_r(x)\beta_r + \psi S(x)$. Our second is human infection risk, for which $T(x) = d_h(x)\beta_h + e\gamma + \xi(x_i)R(x) + V_i$. In either case, we first sample from $[R \mid W; \theta, \omega]$ using the same sampling algorithm as for maximizing the likelihood in ‘Joint modelling rat abundance and human infection: the rattiness-infection framework’, with the parameters $\theta$ and $\omega$ fixed at their maximum likelihood estimates. After obtaining samples $r^{(b)}$, $b = 1, \ldots, B$, we then sample from $[T^* \mid r^{(b)}]$, which in both cases follows a multivariate Gaussian distribution with mean and covariance matrix easily obtained from their joint Gaussian distribution, $[R, T^*]$. The resulting values, $t^{(b)}(x_h^*)$, $h = 1, \ldots, H$; $b = 1, \ldots, B$, constitute $B$ samples drawn from $[T^* \mid W]$ as required. Note that each $t^{(b)}(x_h^*)$, $h = 1, \ldots, H$, is a sample from the joint predictive distribution of the complete surface of $T(x)$ over the whole of the region of interest and can therefore be used to make inferences about spatially aggregated properties of $T(x)$ if required.
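
For reference, the conditional distribution used in the second sampling step follows from the standard result for partitioned multivariate Gaussian vectors. The block notation below ($\mu_R$, $\mu_T$, $\Sigma_{RR}$, $\Sigma_{RT}$, $\Sigma_{TR}$, $\Sigma_{TT}$ for the blocks of the joint mean and covariance of $[R, T^*]$) is introduced here for illustration and is not taken from the source.

```latex
\begin{aligned}
\begin{pmatrix} R \\ T^* \end{pmatrix}
&\sim \mathcal{N}\!\left(
\begin{pmatrix} \mu_R \\ \mu_T \end{pmatrix},
\begin{pmatrix} \Sigma_{RR} & \Sigma_{RT} \\ \Sigma_{TR} & \Sigma_{TT} \end{pmatrix}
\right), \\
T^* \mid R = r^{(b)}
&\sim \mathcal{N}\!\left(
\mu_T + \Sigma_{TR}\Sigma_{RR}^{-1}\bigl(r^{(b)} - \mu_R\bigr),\;
\Sigma_{TT} - \Sigma_{TR}\Sigma_{RR}^{-1}\Sigma_{RT}
\right).
\end{aligned}
```

Only the conditional mean depends on the sampled value $r^{(b)}$; the conditional covariance is the same for every $b$, so in principle it need only be computed once.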

Data and code accessibility

Data and code used in this analysis are publicly available at https://github.com/maxeyre/Rattiness-infection-framework (copy archived at swh:1:rev:e7953d38269ce97221dbdd83c0be2c65d92dff40, Eyre, 2022) and have been published (Eyre et al., 2021). However, household coordinates and valley ID have been removed from the human data to ensure participant anonymity. The analysis was conducted using R (R Development Core Team, 2016) and the following packages: tidyverse (Wickham, 2017), mgcv (Wood and Wood, 2015), PrevMap (Giorgi and Diggle, 2017), MuMIn (Barton, 2020), lme4 (Bates et al., 2007), and statmod (Smyth et al., 2021). We also include a step-by-step explanation of the model building process in ‘Appendix 8’ to guide future users of the rattiness-infection framework.

Results

Study overview

In Pau da Lima, we identified 3179 eligible residents using a baseline community census and household visits, and through other members of the household. Of these, 2018 (63.4%) individuals consented to join the study and provided a blood sample in the first serosurvey (August–October 2014). As a result of loss to follow-up, only 1401 (69.4%) of these participants (from 669 households) completed the second serosurvey (January–April 2015). Individuals were lost to follow-up because they could not be found after at least five attempts (44.4%), had moved out of the study area (31.1%) or did not wish to provide a second blood sample (19.8%). An overview of participant recruitment is provided in Figure 3. Individuals lost to follow-up were similar in age to those who remained in the study cohort (mean 29.0 and 28.8 years old, respectively; t = −0.37, df = 1288.5, p = 0.7) but were more likely to be male (49.8% male compared to 42.6%; χ² = 8.5, df = 1, p < 0.01). A full description of the study cohort is included in Appendix 5—table 1.

Figure 3
The study participant flow chart in line with the STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) statement (http://www.strobestatement.org).

Between the two serosurveys there was serological evidence of 72 leptospiral infections in the cohort, with an overall infection rate of 51.4 (95%CI 40.4, 64.2) infections per 1000 follow-up events. Valleys 2 and 3 had high estimated infection rates, with 66.4 (95%CI 47.3, 90.2) and 49.6 (95%CI 33.6, 69.9) infections per 1000 follow-up events, respectively, compared to 23.2 (95%CI 9.2, 46.9) infections per 1000 follow-up events in Valley 1. The number of infected participants in each household is mapped in Figure 4 - panel A, with relative elevation shown for reference in Figure 4 - panel B.
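
The overall rate is simple arithmetic on the cohort counts (72 infections among the 1401 participants who completed follow-up). The sketch below reproduces the point estimate and attaches a Wilson score interval. The text does not state which interval method was used, so the Wilson interval is an assumption here and its bounds need not match the published ones exactly.

```python
import math


def rate_per_1000(events, n):
    """Events per 1000 follow-up events."""
    return 1000 * events / n


def wilson_interval(events, n, z=1.96):
    """Wilson score interval for a binomial proportion."""
    p = events / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


rate = rate_per_1000(72, 1401)
low, high = wilson_interval(72, 1401)
print(f"{rate:.1f} per 1000 (Wilson 95% CI {1000 * low:.1f}, {1000 * high:.1f})")
```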

Figure 4
Household infection and elevation maps.

(A) Map of participant household locations with the number of leptospiral infections in each household marked (grey circle - no infections; orange square - 1 infection; red diamond - 2 infections; pink triangle - 3 infections) and contours marking low, medium, and high relative elevation category; (B) Elevation (metres) relative to the bottom of the valley with contours marking low, medium, and high relative elevation levels.

In the rat ecology study, a rat was captured on 129 (9.0%) of 1512 trapping-days, and 263 (37.4%) of 703 track plate days had at least one positive plate. Of the 580 sampled locations, 28.5%, 19.7%, and 25.9% had at least one sign of active burrows, faecal droppings and trails, respectively.

Exploratory analysis and model selection

The results from the exploratory multivariable analysis of rattiness are shown in Table 1. The linear splines used were informed by the functional forms shown in Appendix 1—figure 1. The relationship between rattiness and relative elevation demonstrates a trade-off between the high availability of food sources at the bottom of the valley and the high risk of flooding, which prevents the establishment of burrows. In the lowest elevation areas (0–8 m above the bottom of the valley), relative elevation and rattiness were positively associated, with an increase of 0.04 (95%CI 0.00, 0.07) rattiness units per 1 m; when interpreting the magnitude of effect estimates, note that rattiness is defined to have unit variance. Rattiness peaked at an elevation of 8 m before declining with increasing elevation, changing by −0.04 (95%CI −0.09, 0.01) units per metre until an elevation of 22 m. Above this elevation, rattiness started to increase again by 0.06 (95%CI 0.00, 0.10) units per metre. Rattiness decreased with increasing distance from large refuse piles, a source of food and harbourage, changing by −0.07 (95%CI −0.13, −0.01) units per 10 m until a distance of 50 m, beyond which there was a smaller increase in rattiness of 0.02 (95%CI −0.05, 0.09) units per 10 m. Impervious land cover (defined as the proportion of the area within a 30 m radius around each sampling location classified as pavement or building) was negatively associated with rattiness, which changed by −0.05 (95%CI −0.08, −0.01) units for every 10% increase in impervious cover.

Table 1
Multivariable linear regression analysis of predictors for rattiness (note that rattiness is a unit-variance random variable when interpreting the magnitude of effect estimates).
Variable | Estimate (95% CI)*
Relative elevation (per 1 m increase)
  0–8 m | 0.04 (0.00, 0.07)
  8–22 m | −0.04 (−0.09, 0.01)
  >22 m | 0.06 (0.00, 0.10)
Distance to large refuse piles (per 10 m increase)
  0–50 m | −0.07 (−0.13, −0.01)
  >50 m | 0.02 (−0.05, 0.09)
Impervious land cover (per 10% increase) | −0.05 (−0.08, −0.01)

* CI, Confidence interval.

The effects of relative elevation and distance to refuse are modelled as broken linear models with transitions at 8 m and 22 m, and 50 m, respectively. This was informed by the relationship described by Generalized Additive Modelling in Appendix 1—figure 1.

In the community cohort data, the univariable analysis identified several risk factors that increased a resident’s risk of leptospiral infection (Table 2). Variables in two of the four domains (demographic and social status and behavioural exposures) had estimated effect sizes with 95% confidence intervals that did not include an odds ratio of one (statistically significant at the conventional 5% level). Within the demographic and social status domain, risk of infection increased with age and was found to be higher for male participants and those living in Valleys 2 and 3. In the behavioural exposures domain, participants who had had frequent contact with floodwater in the last six months were more likely to be infected. Two individuals were excluded from the multivariable analysis (n=1399) because of missing data for the floodwater exposure survey question.

Table 2
Univariable mixed effects logistic regression analysis of human risk factors for leptospiral infection.
Variable | OR (95% CI)* | aOR (95% CI)*
Demographic and social status
Age (per year)
  0–30 years old | 1.08 (1.03, 1.13) | 1.09 (1.04, 1.15)
  >30 years old | 1.02 (0.96, 1.09) | 1.02 (0.95, 1.08)
Male gender | 2.22 (1.31, 3.85) | 2.78 (1.56, 4.96)
Daily per capita household income (US$/day) | 1.01 (0.89, 1.11) | 0.92 (0.80, 1.05)
Valley
  1 | REF | REF
  2 | 3.35 (1.33, 10.37) | 3.52 (1.23, 10.05)
  3 | 2.39 (0.93, 7.38) | 2.53 (0.88, 7.27)
Adult illiteracy | 1.34 (0.61, 2.79) | 0.66 (0.29, 1.49)
Education (per year of education)
  0–5 years | 1.05 (0.85, 1.32) | 1.14 (0.91, 1.44)
  >5 years | 0.96 (0.73, 1.27) | 0.96 (0.75, 1.26)
Household environment
Impervious land cover (per 10% increase) | 0.87 (0.76, 0.99) | 0.82 (0.71, 0.95)
Relative elevation (per 1 m increase)
  0–20 m | 0.94 (0.89, 0.99) | 0.93 (0.88, 0.99)
  >20 m | 1.12 (0.98, 1.29) | 1.12 (0.97, 1.29)
Relative elevation category
  Low (0–6.7 m) | REF | REF
  Medium (6.7–15.6 m) | 0.72 (0.37, 1.39) | 0.72 (0.36, 1.44)
  High (>15.6 m) | 0.58 (0.27, 1.20) | 0.51 (0.23, 1.11)
Open sewer within 10 m | 1.60 (0.85, 3.17) | 1.69 (0.85, 3.37)
Unprotected from open sewer | 1.00 (0.55, 1.79) | 1.11 (0.61, 2.03)
Live on hillside | 0.99 (0.52, 1.86) | 0.89 (0.46, 1.71)
Occupational exposures
Work in construction § | 1.36 (0.51, 3.21) | 0.62 (0.23, 1.67)
Work as travelling salesperson § | 4.81 (1.12, 18.78) | 2.97 (0.71, 12.40)
Work in refuse collection § | 2.95 (1.04, 7.89) | 1.57 (0.56, 4.42)
Work involves contact with floodwater § | 0.89 (0.04, 5.61) | 0.52 (0.05, 4.96)
Work involves contact with sewer water § | 3.61 (0.45, 20.38) | 1.92 (0.29, 12.80)
Behavioural exposures
Contact with floodwater in last 6 months
  Never/rarely | REF | REF
  Sometimes | 0.61 (0.27, 1.25) | 0.66 (0.30, 1.47)
  Frequently | 2.14 (0.91, 4.94) | 2.84 (1.18, 6.86)
Contact with sewer water in last 6 months
  Never/rarely | REF | REF
  Sometimes | 0.55 (0.19, 1.31) | 0.67 (0.25, 1.78)
  Frequently | 1.42 (0.51, 3.50) | 1.63 (0.61, 4.41)

* OR, Odds ratio; aOR, Adjusted odds ratio; CI, Confidence interval; REF, Reference level.

The effects of age, education and relative elevation are modelled as broken linear models with transitions at 30 years old, 5 years of education and an elevation of 20 m. This was informed by the relationship described by Generalized Additive Modelling (Appendix 1—figure 2).

Relative elevation category consists of three discrete groups representing three regions with different flooding risk profiles.

§ Binary variable with reference category of ‘no occupational exposure’.

In the exploratory results from the multivariable model there was strong evidence of an interaction between rattiness and household relative elevation category on human infection risk (see Appendix 3—table 1 for all parameter estimates for this model). In the high elevation category area, a unit increase in mean predicted rattiness at the household location was estimated to multiply the odds of infection by 6.92 (95%CI 1.88, 25.47). In contrast, in the low and medium elevation category areas there was no evidence of a relationship between rattiness and infection risk, as shown in Figure 5. Consequently, this interaction effect was also included in the rattiness-infection joint model.

Figure 5
Predicted relationship between rattiness and infection risk from the multivariable mixed effects logistic regression demonstrating evidence of an interaction with relative elevation category (low, medium and high).

Shown on the log-odds scale with shaded areas corresponding to 95% confidence intervals.

The explanatory variables selected in the rat and human multivariable analyses were then entered into the full rattiness-infection joint model with the functional forms included in Table 1. To test for residual spatial correlation in the human infection data after controlling for explanatory variables and rattiness, we fitted the joint model with an additional spatial Gaussian process in the human infection linear predictor. The estimated value for the scale of spatial correlation for this Gaussian process was less than 1 m and indistinguishable from household-level variation. We consequently fitted the joint model specified in Equation 3 which assumes that there is no residual spatial correlation in the human infection data.

Joint rattiness-infection model

Human infection risk factors, rattiness predictors and other model parameters estimated using the joint rattiness-infection model are shown in Table 3. Infection risk was strongly associated with age, with the odds of infection increasing by a factor of 1.09 (95%CI 1.04, 1.19) for every year of life up until 30 years of age, and by 1.02 (95%CI 0.92, 1.09) for each additional year thereafter. Male participants were more likely to be infected than female participants (OR 2.69, 95% CI 1.58, 5.89). Compared with individuals living in Valley 1, those living in Valley 2 had higher estimated odds of infection (OR 2.91, 95% CI 1.03, 20.82). Individuals living in the medium (OR 0.77, 95% CI 0.31, 1.66) and high (OR 0.67, 95% CI 0.11, 1.64) elevation areas had lower estimated odds of infection relative to those living in the low relative elevation category area, where there are open sewers and flooding risk is higher; however, these confidence intervals included an odds ratio of one (not statistically significant at the conventional 5% level).

Table 3
Parameter estimates for the full joint rattiness-infection model.
Parameter | Estimate (95% CI)
Human infection risk factors | OR
Age (per year)
  0–30 years old | 1.09 (1.04, 1.19)
  >30 years old | 1.02 (0.92, 1.09)
Male gender | 2.69 (1.58, 5.89)
Daily per capita household income (US$/day) | 0.93 (0.74, 1.05)
Valley
  1 | REF
  2 | 2.91 (1.03, 20.82)
  3 | 2.28 (0.86, 14.00)
Relative elevation category
  Low (0–6.7 m) | REF
  Medium (6.7–15.6 m) | 0.77 (0.31, 1.66)
  High (>15.6 m) | 0.67 (0.11, 1.64)
Work as travelling salesperson | 3.16 (0.38, 20.57)
Contact with floodwater in last 6 months
  Never/rarely | REF
  Sometimes | 0.62 (0.18, 1.39)
  Frequently | 2.47 (0.67, 7.41)
Rattiness (per unit rattiness)
  $\xi_{\text{low}}$ | 1.14 (1.05, 1.53)
  $\xi_{\text{med}}$ | 1.25 (1.08, 1.74)
  $\xi_{\text{high}}$ | 3.27 (1.68, 19.07)
$\sigma^2$ (variance of household-level random effect) | 1.36 (0.23, 5.35)
Rattiness variables
Relative elevation (per 1 m increase)²
  0–8 m | 0.05 (−0.01, 0.13)
  8–22 m | −0.06 (−0.16, 0.02)
  >22 m | 0.05 (−0.03, 0.14)
Distance to large refuse piles (per 10 m increase)³
  0–50 m | −0.10 (−0.21, 0.02)
  >50 m | 0.03 (−0.11, 0.17)
Impervious land cover (per 10% increase) | −0.07 (−0.14, −0.01)
Rattiness parameters
$\alpha_{\text{traps}}$ | −2.94 (−3.27, −2.65)
$\alpha_{\text{plates}}$ | −2.06 (−2.50, −1.74)
$\alpha_{\text{burrows}}$ | −1.41 (−1.67, −1.16)
$\alpha_{\text{faeces}}$ | −2.82 (−3.83, −2.32)
$\alpha_{\text{trails}}$ | −2.22 (−2.96, −1.76)
$\sigma_{\text{traps}}$ | 0.72 (0.45, 0.97)
$\sigma_{\text{plates}}$ | 2.37 (2.05, 2.68)
$\sigma_{\text{burrows}}$ | 1.28 (1.08, 1.45)
$\sigma_{\text{faeces}}$ | 2.36 (1.80, 3.34)
$\sigma_{\text{trails}}$ | 2.43 (1.85, 3.12)
ψ | 0.67 (0.29, 1.00)
ϕ | 9.23 (3.21, 18.24)

Infection risk was positively associated with rattiness for households situated in all three levels of the relative elevation category variable. However, while the effect size (per unit increase in rattiness) was similar in the low (OR 1.14, 95% CI 1.05, 1.53) and medium (OR 1.25, 95% CI 1.08, 1.74) elevation areas, in the high elevation area the effect of increasing rattiness on infection risk was significantly stronger (OR 3.27, 95% CI 1.68, 19.07). This interaction effect between rattiness and household relative elevation category on human infection risk was confirmed with a test for evidence against the null hypothesis that $\xi_{\text{low}} = \xi_{\text{med}} = \xi_{\text{high}}$ (p = 0.026, χ² = 7.33, df = 2).
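
The reported p-value can be checked by hand, because a chi-squared distribution with 2 degrees of freedom is an exponential distribution with mean 2, so its upper-tail probability has a closed form. The helper name `chi2_sf_df2` is hypothetical.

```python
import math


def chi2_sf_df2(statistic):
    """Upper-tail probability of a chi-squared variable with 2 degrees of freedom.

    For df = 2 the survival function has the closed form exp(-x / 2).
    """
    return math.exp(-statistic / 2)


p_value = chi2_sf_df2(7.33)
print(round(p_value, 3))  # 0.026
```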

Parameter estimates for the rattiness variables were very similar to the estimates from the exploratory linear regression (Table 1), with slightly larger effect sizes for the distance to refuse piles and land cover variables. There was evidence of small-scale spatial correlation in rattiness (ϕ = 9.23 m, 95% CI 3.21, 18.24 m), corresponding to a spatial correlation range (the distance at which the correlation reduces to 5%) of approximately 28 m. The estimate for ψ of 0.67 (95%CI 0.29, 1.00) indicates that the majority of the unexplained variation in rattiness is spatially structured, with the remainder modelled as a nugget effect.
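
The step from ϕ to a range of about 28 m can be made explicit. The sketch below assumes an exponential correlation function, exp(−u/ϕ), which is consistent with the reported figures but is an assumption here; under it, the correlation falls to 5% at a distance of ϕ × ln(20), roughly three times ϕ. The function name `practical_range` is hypothetical.

```python
import math


def practical_range(phi, threshold=0.05):
    """Distance at which exp(-u / phi) falls to `threshold`."""
    return -phi * math.log(threshold)


# Point estimate and confidence limits for phi (metres) from Table 3.
for phi in (3.21, 9.23, 18.24):
    print(f"phi = {phi:5.2f} m -> range = {practical_range(phi):.1f} m")
```

Applying the same transformation to the confidence limits for ϕ gives a rough sense of the uncertainty in the range itself.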

Spatial prediction

There was heterogeneous spatial variation in predicted rattiness. The numerous small regions of high rattiness in Figure 6 - panel A are indicative of the small-scale spatial correlation in the data. The low elevation areas in the central length of each valley (relative elevation is shown in Figure 6 - panel E with the contours marking the low, medium and high elevation areas) had high mean predicted rattiness. High rattiness was also predicted in several high elevation areas, for example the northern tip of Valley 3 and several small hotspots along the three valleys’ high elevation sides.

Figure 6
Joint rattiness-infection model predictions.

(A) Mean predicted rattiness; (B) Mean predicted leptospiral infection risk for 30-year-old male participants with a household per capita income of US$1/day who never/rarely have contact with floodwater and do not work as a travelling salesperson; (C) lower 95% prediction interval for predicted infection risk; (D) upper 95% prediction interval for predicted infection risk.

To illustrate the spatial variation in infection risk within the study area, prediction maps are shown in Figure 6 - panel B for a 30-year-old male participant with a household per capita income of US$1/day who never or rarely had contact with floodwater in the previous six months and did not work as a travelling salesperson. Infection risk was low across most of Valley 1 (<2.5%), with marginally higher average values found in the central low elevation area (2.5–5%). Risk was consistently higher across most of Valleys 2 and 3 (7.5–15%), with the effect of elevation on risk clearly visible. Where predicted infection risk was higher and more spatially heterogeneous, for example in the central region of Valley 2, this was driven by high levels of predicted rattiness. The stronger estimated effect of rattiness on infection risk in higher elevation areas was particularly visible in Valleys 2 and 3, as seen in the three hotspots with risk reaching 20% and the moderate risk hotspots along the sides of both valleys. Prediction intervals were relatively narrow across most of the study area (Figure 6 - panel C and Figure 6 - panel D), with greater uncertainty in the high risk areas.
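
The maps express predictions as probabilities, whereas Table 3 reports odds ratios. The sketch below shows how an odds ratio per unit of rattiness translates into a change in risk from a given baseline. The 5% baseline is a hypothetical value chosen for illustration, the function name `apply_odds_ratio` is an assumption, and the calculation holds all other covariates and random effects fixed.

```python
def apply_odds_ratio(baseline_risk, odds_ratio, units=1.0):
    """Risk after multiplying the baseline odds by odds_ratio ** units."""
    odds = baseline_risk / (1 - baseline_risk)
    new_odds = odds * odds_ratio**units
    return new_odds / (1 + new_odds)


# Hypothetical 5% baseline risk; odds ratios are the Table 3 point estimates.
for label, odds_ratio in [("low", 1.14), ("medium", 1.25), ("high", 3.27)]:
    risk = apply_odds_ratio(0.05, odds_ratio)
    print(f"{label:>6} elevation: {100 * risk:.1f}% after +1 unit of rattiness")
```

Because the conversion is non-linear, the same odds ratio produces a larger absolute change in risk when the baseline risk is higher.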

Discussion

We developed and applied a novel framework for joint spatial modelling of disease reservoir abundance and human infection risk to a community-based cohort study and fine-scale rat ecology study. We found that higher levels of rattiness, our proxy for rat abundance, at the household location were associated with a higher risk of leptospiral infection for residents across the entire study area. Importantly, we found that a unit increase in rattiness in high elevation areas was associated with an almost three times higher odds ratio for infection than in low and medium elevation areas. To our knowledge, this is the first study to jointly model rodent abundance and human infection data for a rodent-borne zoonosis. The findings provide new insights into how the dominant mechanisms of Leptospira transmission within complex urban settings may vary over small distances, as a result of interactions between rats, the environment, geography, and local epidemiology.

The finding that rattiness was associated with infection risk indicates that the spatial distribution of rat populations was an important driver of transmission close to the household across the entire study area. This is consistent with a recent study investigating the predictive power of household rat infestation scores for human infection (Costa et al., 2021). There was no residual spatial correlation in the infection data after accounting for rattiness in our analysis, possibly suggesting that previously unexplained spatial heterogeneity in risk could be driven by variation in rattiness (Hagan et al., 2016). Our model also predicted high average rattiness across the low elevation areas where leptospiral transmission is high (Hagan et al., 2016; Reis et al., 2008). This supports the hypothesis that abundant rat populations are responsible for high levels of observed environmental contamination across these lower areas (Casanovas-Massana et al., 2018; Casanovas-Massana et al., 2022), and consequently increased infection risk.

The identified interaction between elevation and rattiness on infection risk suggests that relatively small changes in environment and topography can modify transmission pathways within an urban community. The weaker effect of rattiness on infection risk at low and medium elevation areas relative to high elevation areas may be explained by differences in their hydrological profiles. While high rat abundance in low and medium areas results in high leptospiral contamination, these areas are prone to high levels of water runoff and flooding. This disperses the pathogen across low elevation areas. The ability of leptospires to persist in the environment for weeks or months means that this process can significantly increase environmental risk in low elevation areas for long periods. This process disconnects shedding and infection events in space and time (Plowright et al., 2017) and obscures the relationship between infection risk and rattiness in low and medium elevation households.

In contrast, high elevation areas have lower levels of water runoff and flooding due to improved drainage and sewage systems, and a smaller upstream catchment area for rainfall. Leptospires are consequently less likely to be washed away from the location at which they were shed, and environmental risk remains more localised and strongly associated with the spatial distribution of rats. This hypothesised role of hydrology in the aggregation and dispersal of leptospires (Plowright et al., 2017) is supported by a recent study in low elevation areas of Pau da Lima which found that soil contamination was not associated with local rat activity (Schneider et al., 2018). However, our finding that rattiness was associated with infection in low and medium elevation areas suggests that the spatial distribution of environmental risk in these areas is not entirely determined by water dispersal.

Interestingly, a previous study of surface waters in Pau da Lima found that the probability of a sample being positive for Leptospira was highest in low elevation areas and lowest in medium elevation areas, with no significant difference between low and high elevation areas (Casanovas-Massana et al., 2018). This is consistent with our findings and suggests that there may be a ‘washing out’ of locally deposited leptospires in medium elevation areas but not in high elevation areas.

This has several implications for disease control strategies which aim to reduce environmental risk. Improving drainage systems at all elevation levels can reduce the dispersal of leptospires from high to low elevation areas. Closure of sewer systems, which generally run through low elevation areas, can protect local residents from exposure and reduce the introduction of additional contamination from upstream sewer water. Paving over soil surfaces can reduce the surface area over which leptospires can persist (Bierque et al., 2020), reducing environmental risk further.

A reduction in the dispersal and accumulation of bacteria will result in more localised environmental risk, as was observed in the high elevation areas in this study. Higher risk will then be found in areas with a high abundance of infected rats. This may also reduce environmental exposure for rats, thereby lowering shedding rates and acting as a feedback loop into the Leptospira transmission cycle (Minter et al., 2018). Given the limited and short-term impact of chemical rodenticide campaigns on Norway rat abundance in these settings (de Masi et al., 2009), longer-term environment management strategies targeted at rattiness hotspots may also be needed to reduce the availability of key predictors of rattiness, such as large refuse piles and vegetation and soil land cover. Funding and political will for large-scale infrastructural interventions are often limited in marginalised urban settings, and small-scale community-based interventions which target these mechanisms should be evaluated.

Transmission is dynamic in space and time and the alignment of conditions which enable spillover infection can vary over time (Plowright et al., 2017). Our study was designed to explore the spatial variation in rattiness and infection risk in Pau da Lima during the driest period of the year, and it may not be representative of transmission mechanisms during the rainy season. There is some evidence, however, that this may not necessarily be the case, with two recent studies in Pau da Lima reporting low seasonal variation in both rat abundance (Panti-May et al., 2016) and spatial infection risk patterns (Hagan et al., 2016). Nonetheless, future studies across different time periods are needed to establish the role of rat abundance in Leptospira transmission.

In this study we used household location to link rattiness to an individual, under the assumption that the majority of their exposure occurs close to home. Given the spatially heterogeneous distribution of rattiness and environmental contamination (Casanovas-Massana et al., 2018) within the community, future epidemiological studies of leptospirosis and zoonotic spillover could benefit from trying to pinpoint key sources of infection away from the household using GPS mobility data, as has been attempted in a small study previously (Owers et al., 2018). The rattiness-infection framework could then be extended to model cumulative environmental exposure to the rattiness surface by integrating along a person’s trajectory as they move around the community.
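
The proposed extension can be sketched as a numerical line integral of an exposure surface along a time-stamped GPS track. The code below uses a trapezoidal rule between consecutive fixes; the function names, the toy surface and the track are all invented for illustration and do not come from the study data or the published framework.

```python
import math


def cumulative_exposure(track, surface):
    """Trapezoidal approximation to the time integral of `surface` along a track.

    `track` is a list of (time, x, y) fixes ordered by time; `surface(x, y)`
    returns the exposure intensity (e.g. predicted rattiness) at a point.
    """
    total = 0.0
    for (t0, x0, y0), (t1, x1, y1) in zip(track[:-1], track[1:]):
        total += 0.5 * (surface(x0, y0) + surface(x1, y1)) * (t1 - t0)
    return total


def toy_surface(x, y):
    """Invented smooth surface with a single hotspot at (50, 50)."""
    return math.exp(-((x - 50) ** 2 + (y - 50) ** 2) / (2 * 20**2))


# Invented track: (minutes, metres east, metres north).
track = [(0, 0, 0), (10, 25, 25), (20, 50, 50), (50, 50, 50), (60, 0, 0)]
print(round(cumulative_exposure(track, toy_surface), 2))
```

With sparse GPS fixes the trapezoidal rule can miss narrow hotspots between fixes, so the sampling interval would need to be short relative to the spatial correlation range of the surface.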

Our framework did not account for disease dynamics within rat populations. Given that 80% of rats are estimated to be actively shedding Leptospira in Pau da Lima (Costa et al., 2015b; de Faria et al., 2008) and prevalence in rats is generally high in urban areas globally (Pellizzaro et al., 2019; Costa et al., 2014a; Boey et al., 2019; Yusof et al., 2019; Krøjgaard et al., 2009), the use of rattiness as a proxy for rat shedding appears reasonable and it may be a useful proxy in other epidemiological studies. Despite this, non-shedding rats may be spatially clustered and future work would benefit from the collection of georeferenced rat infection data. For other zoonotic spillover systems where pathogen release does not occur at a high and homogeneous rate across the reservoir host population, accounting for spatially heterogeneous or time-varying (Davis et al., 2005) disease dynamics will be important.

A possible limitation of this study is the titre rise cut-off values used for classifying seroconversion and reinfection in the cohort that determine the sensitivity and specificity of the infection criteria. However, these criteria were used because they are the standard definitions for serological determination of infection that are commonly applied for leptospirosis and a wide range of other infections, and they enable the comparison of results with other previous leptospirosis studies.

The rattiness-infection modelling framework is a flexible tool for exploring the spatial association between reservoir abundance, the environment and human health outcomes. It provides a statistically principled method for joint spatial modelling of infection risk and multiple indices of reservoir abundance, pooling data between indices and directly accounting for uncertainty in their measurement in all parameter estimates and predictions. The framework’s geostatistical structure includes spatially continuous predictors for abundance and accounts for spatial correlation, enabling mapping of both infection risk and rattiness. This can be useful for identifying high-risk areas and targeting control. One inherent limitation is its dependence on the availability of spatially continuous environmental variables and abundance data, both of which are prone to high measurement error. This can result in high uncertainty in the model parameter estimates and predictions, as demonstrated by the wider confidence intervals for risk factors in the joint model compared to the standard mixed-effects logistic regression analysis. An additional benefit of the geostatistical structure is that abundance measurements do not have to be taken at the household location, providing some flexibility in the design of eco-epidemiological studies and the indices used. The framework may have important applications beyond the study of zoonotic spillover, with the rattiness component replaced by other exposure measures, for example mosquito density or ecological indices (such as pollution, where there are multiple, related measures of air or groundwater quality), to model associations with human or animal health outcomes.

In conclusion, we have developed a framework that may have broad applications in delineating complex animal-environment-human interactions during zoonotic spillover and identifying opportunities for public health intervention. We demonstrate its potential by applying it to Leptospira in an urban setting, finding evidence that the extent to which local rat shedding drives spillover transmission is moderated by elevation, most likely a proxy for water runoff. Future work examining these transmission mechanisms in similar settings and across different time points will be key to establishing how generalisable these results are.

Appendix 1

Functional form of continuous explanatory variables

Appendix 1—figure 1
Generalized Additive Model (GAM) partial dependence plots for the unstructured random variation in rattiness, $\hat{U}_i$, plotted against the continuous explanatory variables considered in the analysis (shaded areas correspond to 95% confidence intervals).

(A) elevation relative to the bottom of valley, (B) distance to large refuse piles, (C) impervious land cover in 20 m radius buffer around sampling point. The $\hat{U}_i$ are estimated using a non-spatial model which excludes all covariates. (D) is a variogram computed from $\hat{U}_i$ using a non-spatial model that includes all of the covariates; the dashed lines correspond to 95% confidence intervals under the assumption of spatial independence.

A single knot point for the distance to refuse piles variable was chosen at 50 m to account for the expected decay in the effect of food resources up to a rat home range distance, beyond which little effect would be expected. We did not include an additional knot point at 145 m despite there being a visible change in gradient in Appendix 1—figure 1 - panel B for two reasons. Firstly, the home range of Norway rats is estimated to be less than 100 m in these urban settings (Feng and Himsworth, 2014; Davis et al., 1948; Byers et al., 2019), meaning that rat abundance is very unlikely to be affected by the availability of anthropogenic food sources beyond this distance, particularly given the high availability of food across the study area. Secondly, we could offer no scientific rationale for why rattiness would start to increase again beyond 50 m before peaking at a very large distance of 145 m from a refuse pile and decreasing thereafter.

In contrast, the mechanisms by which rattiness varies with elevation are more complex, with significant changes in the environment occurring at different elevations. For example, the relationship identified in Appendix 1—figure 1 - panel A can be explained by the high risk of flooding at the bottom of the valley, which carries resources down to lower elevations (resulting in a peak of rattiness at about 7 m) but makes the very lowest elevations unsuitable for rat burrows. The highest elevations in our study area are close to a main road with food markets where large quantities of food waste are left out in the street for collection. Although there is large uncertainty about this relationship, it is highly possible that this may be driving the positive relationship between 23 m and 40 m.

Appendix 1—figure 2
Generalized Additive Model (GAM) partial dependence plots for human infection risk plotted against the continuous explanatory variables considered in this analysis (shaded areas correspond to 95% confidence intervals).

(A) age, (B) household per capita income (in USD), (C) years of education, (D) household elevation relative to the bottom of valley, (E) impervious land cover in 20 m radius buffer around household.

Single knot points were considered for age at 30 years (Appendix 1—figure 2 - panel A), education at 5 years (Appendix 1—figure 2 - panel C) and relative elevation at 20 metres (Appendix 1—figure 2 - panel D) based on the value of the explanatory variable at which the gradient of the relationship changed in these plots.
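The single-knot ("broken linear") terms described above can be illustrated with a minimal Python sketch. It assumes a parameterisation with one slope below the knot and a separate slope above it, which matches how the tables report segment-specific effects; the function names are hypothetical and the authors' R code may parameterise the spline differently.

```python
def broken_stick(x, knot):
    """Split x into two segment terms for a linear spline with one knot."""
    below = min(x, knot)        # increases with x up to the knot, then stays flat
    above = max(x - knot, 0.0)  # zero up to the knot, then increases with x
    return below, above


def linear_predictor(x, knot, intercept, slope_below, slope_above):
    """Piecewise-linear predictor that is continuous at the knot."""
    below, above = broken_stick(x, knot)
    return intercept + slope_below * below + slope_above * above


AGE_KNOT = 30.0  # years, the knot considered for age in the text above

for age in (10, 30, 45):
    print(age, broken_stick(age, AGE_KNOT))
```

With this coding, each fitted coefficient is the change in the linear predictor per unit increase of the variable within its own segment.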

Appendix 2

Model selection tables

Appendix 2—table 1
AIC fit of the five highest ranked multivariable rattiness models (’+’ indicates that a variable was selected in the model).
Candidate variables: Dist. refuse (0–50), Dist. refuse (>50), Land cover, Elevation (0–8 m), Elevation (8–22 m), Elevation (>22 m).

Model | Variables selected | df* | AICc*
M1 | ++++++ | 8 | 1476.48
M2 | ++++ | 6 | 1479.31
M3 | +++++ | 7 | 1481.01
M4 | +++++ | 7 | 1481.99
M5 | +++++ | 7 | 1482.97

* df, degrees of freedom; AICc, corrected Akaike Information Criterion.

Appendix 2—table 2
AIC fit of the five highest ranked multivariable human infection models (’+’ indicates that a variable was selected in the model).
Candidate variables: Age (0–30), Age (>30), Sex, Valley, Floodwater, Income, Land cover, Salesperson, Elevation level, Rattiness, Ratt:Elev.

Model | Variables selected | df* | AICc*
M1 | ++++++++++ | 16 | 523.14
M2 | +++++++++ | 15 | 523.52
M3 | +++++++++++ | 17 | 523.72
M4 | ++++++++++ | 16 | 524.11
M5 | +++++++++ | 14 | 525.04
M* | ++++++++ | 13 | 532.13

* df, degrees of freedom; AICc, corrected Akaike Information Criterion.

Model M* was ranked outside of the top 5 models but is included here for reference to demonstrate the improvement in model fit when rattiness is included.
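The size of the improvement in fit can be read off as differences in AICc relative to the best-ranked model. The short Python sketch below does this arithmetic for the human infection models; the AICc values are copied from the table above, while the function name is purely illustrative.

```python
# AICc values copied from the human infection model selection table above
aicc = {
    "M1": 523.14,
    "M2": 523.52,
    "M3": 523.72,
    "M4": 524.11,
    "M5": 525.04,
    "M*": 532.13,
}


def delta_aicc(scores):
    """Difference between each model's AICc and the lowest AICc in the set."""
    best = min(scores.values())
    return {name: round(value - best, 2) for name, value in scores.items()}


deltas = delta_aicc(aicc)
for name, delta in deltas.items():
    print(f"{name}: delta AICc = {delta:.2f}")
```

On these values, the reference model M* sits 8.99 AICc units above M1, whereas the top five models all lie within 2 units of one another.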

Appendix 3

Exploratory multivariable analysis of human risk factors

Appendix 3—table 1
Multivariable mixed effects logistic regression analysis of risk factors for leptospiral infection in community members.

Note: there was missing information for the contact with floodwater question for two individuals and consequently only 1399 participants from 668 households were included in this analysis.

Variable | OR (95% CI)
Demographic and social status
Age (per year)*
 0–30 years old | 1.10 (1.04, 1.16)
 >30 years old | 1.02 (0.96, 1.09)
Male gender | 2.90 (1.59, 5.28)
Daily per capita household income (US$/day) | 0.93 (0.81, 1.06)
Valley
 1 | REF
 2 | 3.91 (1.33, 11.68)
 3 | 2.26 (0.74, 6.93)
Household environment
Relative elevation level
 High (>15.6 m) | REF
 Medium (6.7–15.6 m) | 0.71 (0.30, 1.70)
 Low (0–6.7 m) | 1.08 (0.44, 2.62)
Occupational exposures
Work as travelling salesperson† | 3.38 (0.77, 14.87)
Behavioural exposures
Contact with floodwater in last 6 months
 Never/rarely | REF
 Sometimes | 0.64 (0.28, 1.43)
 Frequently | 2.48 (1.02, 6.02)
Rattiness
Rattiness at high elevation level (per unit rattiness) | 6.92 (1.88, 25.47)
Elevation level: Low × rattiness | 0.10 (0.02, 0.62)
Elevation level: Medium × rattiness | 0.15 (0.02, 0.91)
σ2 (variance of household random effect) | 1.78

* The effect of age is modelled as a broken linear model with a transition at 30 years old, as informed by the relationship described by Generalized Additive Modelling (Appendix 1—figure 2).

† Binary variable with reference category of ‘no occupational exposure’.
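To read the rattiness rows of this table, the interaction terms can be combined with the reference-level odds ratio. The Python sketch below assumes the usual logistic regression interpretation, in which interaction coefficients add on the log-odds scale and therefore multiply on the odds ratio scale. It uses the rounded point estimates from the table, so the results are approximate, and it does not produce confidence intervals (these would need the covariance of the estimates, which is not reported here).

```python
import math

# Rounded point estimates copied from the table above
or_rattiness_high = 6.92   # OR per unit rattiness at the reference (high) elevation level
ratio_medium = 0.15        # interaction: Medium elevation x rattiness
ratio_low = 0.10           # interaction: Low elevation x rattiness


def combined_or(reference_or, interaction_ratio):
    """Add coefficients on the log-odds scale, then transform back to an odds ratio."""
    return math.exp(math.log(reference_or) + math.log(interaction_ratio))


or_rattiness_medium = combined_or(or_rattiness_high, ratio_medium)
or_rattiness_low = combined_or(or_rattiness_high, ratio_low)

print(f"Medium elevation: OR per unit rattiness ~ {or_rattiness_medium:.2f}")
print(f"Low elevation:    OR per unit rattiness ~ {or_rattiness_low:.2f}")
```

Under this reading, the implied odds ratios per unit rattiness are roughly 1.04 at medium elevation and 0.69 at low elevation, compared with 6.92 at high elevation.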

Appendix 4

Model fitting

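To fit the joint model, we proceed as follows. Let $W=(Y,Z)$, and let $\theta=(\alpha_1,\ldots,\alpha_5,\alpha_h,\sigma_1,\ldots,\sigma_5,\beta_h,\gamma,\xi,\sigma^2)$ and $\omega=(\beta_r,\phi,\psi)$ be the vectors of unknown parameters associated with $[W|R]$ and $[R]$, respectively. The likelihood function is then given by

(5) $$L(\theta,\omega)=[W;\theta,\omega]=\int_{\mathbb{R}^N}[R;\omega]\,[W|R;\theta]\,dR$$

The integral in Equation 5 cannot be solved analytically, so we approximate it using Monte Carlo methods. Specifically, let $\theta_0$ and $\omega_0$ be our initial best guesses for $\theta$ and $\omega$, respectively. Since $[R;\omega][W|R;\theta]\propto[R|W;\theta,\omega]$, we rewrite the integral in Equation 5 using an importance sampling distribution proportional to $[R;\omega_0][W|R;\theta_0]$ to give

(6) $$L(\theta,\omega)\propto\int_{\mathbb{R}^N}\frac{[R;\omega]\,[W|R;\theta]}{[R;\omega_0]\,[W|R;\theta_0]}\,[R|W;\theta_0,\omega_0]\,dR=E\left[\frac{[R;\omega]\,[W|R;\theta]}{[R;\omega_0]\,[W|R;\theta_0]}\right],$$

where the expectation is taken with respect to the distribution $[R|W;\theta_0,\omega_0]$.

Based on Equation 6, we then approximate Equation 5 (up to a constant that does not depend on $\theta$ or $\omega$) with

(7) $$L(\theta,\omega)\approx\frac{1}{B}\sum_{b=1}^{B}\frac{[r_{(b)};\omega]\,[W|r_{(b)};\theta]}{[r_{(b)};\omega_0]\,[W|r_{(b)};\theta_0]}$$

where $r_{(b)}$ is the $b$-th sample from $[R|W;\theta_0,\omega_0]$. To obtain the maximum likelihood estimates for $\theta$ and $\omega$, we maximize Equation 7 using numerical optimization. To simulate from $[R|W;\theta_0,\omega_0]$, we use the Laplace sampling algorithm described in detail by Christensen, 2004 and Giorgi and Diggle, 2017. We draw 110,000 samples from $[R|W;\theta_0,\omega_0]$, discard the first 10,000 as burn-in and retain every 10th of the remaining samples, leaving 10,000 MCMC samples.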

To improve the approximation of the likelihood function, we also update our guesses ω0 and θ0 by plugging their estimated values into the denominator of Equation 7 and iterate its maximization until convergence.
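The importance sampling idea behind Equations 6 and 7 can be checked on a toy problem where the answer is known in closed form. The Python sketch below is not the rattiness-infection model: it assumes a scalar latent variable r ~ N(mu, 1) and a single observation w | r ~ N(r, 1), so that the marginal of w is N(mu, 2) and the conditional of r given w at the initial guess mu0 is N((w + mu0) / 2, 1/2). That conditional is sampled exactly here, in place of the MCMC step, and all function names are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def mc_likelihood_ratio(mu, mu0, w, samples):
    """Toy analogue of Equation 7: average of importance weights over samples
    drawn from the conditional distribution of r given w at mu0."""
    total = 0.0
    for r in samples:
        numerator = normal_pdf(r, mu, 1.0) * normal_pdf(w, r, 1.0)     # [r; mu][w | r]
        denominator = normal_pdf(r, mu0, 1.0) * normal_pdf(w, r, 1.0)  # [r; mu0][w | r]
        total += numerator / denominator
    return total / len(samples)


def exact_likelihood_ratio(mu, mu0, w):
    """Closed-form L(mu) / L(mu0) for the toy model, where w ~ N(mu, 2)."""
    return normal_pdf(w, mu, 2.0) / normal_pdf(w, mu0, 2.0)


rng = random.Random(1)
w, mu0, mu = 1.3, 0.5, 0.9  # made-up observation, initial guess and trial value

# Exact draws from [r | w; mu0] = N((w + mu0) / 2, 1/2), standing in for MCMC output
samples = [rng.gauss((w + mu0) / 2.0, math.sqrt(0.5)) for _ in range(20000)]

mc_ratio = mc_likelihood_ratio(mu, mu0, w, samples)
exact_ratio = exact_likelihood_ratio(mu, mu0, w)
print(f"Monte Carlo: {mc_ratio:.4f}, exact: {exact_ratio:.4f}")
```

The Monte Carlo average estimates the likelihood relative to its value at the initial guess, which is why the approximation is only defined up to a constant. It is also most accurate for parameter values close to the initial guess, which is consistent with iterating the maximization with updated guesses.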

Appendix 5

Baseline cohort characteristics

Appendix 5—table 1
Summary of demographic, socioeconomic and environmental risk factors.
Variable | No. or Median (% or IQR)*
Demographic and social status
Age (years) | 27 (15–41)
Male gender | 597 (42.6%)
Daily per capita household income (US$/day) | 1.6 (0.8–2.8)
 Valley 1 | 259 (18.5%)
 Valley 2 | 557 (39.8%)
 Valley 3 | 585 (41.8%)
 Literacy | 1125 (80.3%)
Education (years) | 6 (4–9)
Household environment
Impervious land cover (%) | 49.6 (35.1–70.6)
Relative elevation (metres) | 11.0 (5.9–16.3)
Elevation level
 Low (0–6.7 m) | 474 (33.8%)
 Medium (6.7–15.6 m) | 524 (37.4%)
 High (>15.6 m) | 403 (28.8%)
Open sewer within 10 m | 926 (66.1%)
Unprotected from open sewer | 666 (47.6%)
Live on hillside | 453 (32.4%)
Occupational exposures
Work in construction | 105 (7.5%)
Work as travelling salesperson | 24 (1.7%)
Work in refuse collection | 61 (4.4%)
Work involves contact with mud | 27 (1.9%)
Work involves contact with floodwater | 23 (1.6%)
Work involves contact with sewer water | 16 (1.1%)
Behavioural exposures
Contact with floodwater in last 6 months
 Never/rarely | 986 (70.5%)
 Sometimes | 299 (21.4%)
 Frequently | 114 (8.1%)
Contact with sewer water in last 6 months
 Never/rarely | 1120 (80.2%)
 Sometimes | 180 (12.9%)
 Frequently | 97 (6.9%)

* No., number; IQR, interquartile range. Percentages are calculated without missing values. All variables had ≤ 5 missing values.

Appendix 6

Sensitivity analysis for disturbed trap modelling assumption

In the rattiness-infection framework we assumed that a trap was disturbed when it was found closed without a rat and set t=0.5 (see ‘Rat abundance outcomes’) in the equation for the probability of capturing a rat

$$1-\exp\{-t_i\,\mu_1(x_i)\}.$$

This occurred in 554 (36.6%) out of 1,512 trapping-days. To ascertain the potential impact of this on model parameter estimates we conducted a sensitivity analysis as follows:

  1. Draw values for t from U(0,1) for all trap observations that were found closed.

  2. Fit a simplified rattiness model with covariates that did not account for spatial correlation by setting ψ=0 in Equation 2 in ‘Rattiness’.

  3. Repeat steps 1–2 a total of 1,000 times.

  4. Estimate the between-imputation standard error for each parameter, defined as:

    $$SE_{imp}=\sqrt{\frac{\sum_{i=1}^{B}(\theta_i-\bar{\theta})^2}{B-1}}$$

    for imputation i of a total B imputed datasets.
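The procedure above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the analysis itself. The between-imputation standard error is read here as the sample standard deviation of a parameter's estimates across imputed datasets. The model refit in step 2 is replaced by a deliberately crude, hypothetical stand-in (`placeholder_fit`). The trap counts are invented, and undisturbed traps are assumed to have t = 1.

```python
import math
import random


def capture_probability(t, mu):
    """Probability of capturing a rat in a trap active for a fraction t of the period."""
    return 1.0 - math.exp(-t * mu)


def between_imputation_se(estimates):
    """Sample standard deviation of one parameter's estimates across imputations."""
    b = len(estimates)
    mean = sum(estimates) / b
    return math.sqrt(sum((e - mean) ** 2 for e in estimates) / (b - 1))


def placeholder_fit(t_values, n_captures):
    """Crude stand-in for refitting the model (NOT the rattiness model):
    solves 1 - exp(-mean_t * mu) = observed capture proportion for mu."""
    mean_t = sum(t_values) / len(t_values)
    proportion = n_captures / len(t_values)
    return -math.log(1.0 - proportion) / mean_t


rng = random.Random(42)
n_open, n_closed, n_captures = 60, 40, 10  # made-up toy counts

estimates = []
for _ in range(1000):  # step 3: repeat steps 1 and 2
    # Step 1: draw t ~ U(0, 1) for traps found closed (assumed t = 1 otherwise)
    t_values = [1.0] * n_open + [rng.random() for _ in range(n_closed)]
    # Step 2: refit the (placeholder) model to the imputed data
    estimates.append(placeholder_fit(t_values, n_captures))

# Step 4: between-imputation standard error
se_imp = between_imputation_se(estimates)
mean_estimate = sum(estimates) / len(estimates)
print(f"mean estimate = {mean_estimate:.4f}, SE_imp = {se_imp:.4f}")
```

Comparing `se_imp` with the size of the corresponding estimate mirrors the comparison made in the sensitivity analysis.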

The results for each parameter can be seen in Appendix 6—table 1 below. Estimated between-imputation standard errors were small relative to parameter estimates, indicating that uncertainty due to the missing trap disturbance information is unlikely to have significantly affected parameter estimates in the full rattiness-infection model.

Appendix 6—table 1
Trap disturbance sensitivity analysis: non-spatial rattiness model parameter estimates and between-imputation standard errors.
Parameter | Estimate | SE_imp
α_traps | –2.8274 | 0.0128
α_plates | –1.9058 | 0.0004
α_burrows | –1.3794 | 0.0008
α_faeces | –2.8617 | 0.0027
α_trails | –2.1538 | 0.0023
σ_traps | 0.7010 | 0.0120
σ_plates | 2.4016 | 0.0004
σ_burrows | 1.3820 | 0.0008
σ_faeces | 2.6704 | 0.0031
σ_trails | 2.6431 | 0.0036
Relative elevation (per 1 m increase)
 0–8 m | 0.0525 | 0.0001
 8–22 m | –0.0583 | 0.0001
 >22 m | 0.1112 | 0.0002
Distance to large refuse piles (per 10 m increase)
 0–50 m | –0.1090 | 0.0002
 >50 m | 0.0405 | 0.0001
Impervious land cover (per 10% increase) | –0.0592 | 0.0001

Appendix 7

Residual diagnostics

To examine the fit of the full rattiness-infection model to the human infection data, a formal diagnostic investigation was conducted using randomized quantile residuals (Dunn and Smyth, 1996; Smyth et al., 2021). The residual plots in Appendix 7—figure 1 exhibit no trends between quantile residuals and fitted values (Panel A) or variables in the model (Panels B-H).

Appendix 7—figure 1
Residual diagnostic plots showing randomised quantile residuals plotted against: (A) fitted values; (B–H) variables in the model.
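For a binary outcome, randomized quantile residuals in the sense of Dunn and Smyth can be sketched in a few lines of Python. The example below is a generic illustration with simulated data and hypothetical names, not the residual computation used for the joint model. For outcome y with fitted probability p, a uniform value is drawn between the Bernoulli cumulative probabilities just below y and at y, and then mapped through the standard normal quantile function.

```python
import random
from statistics import NormalDist, mean, pstdev


def randomized_quantile_residual(y, p, rng):
    """Randomized quantile residual for a Bernoulli outcome y with fitted probability p."""
    # Bernoulli CDF just below y and at y: F(0-) = 0, F(0) = 1 - p, F(1) = 1
    lower = 0.0 if y == 0 else 1.0 - p
    upper = 1.0 - p if y == 0 else 1.0
    u = rng.uniform(lower, upper)
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep u strictly inside (0, 1) for inv_cdf
    return NormalDist().inv_cdf(u)


# Simulated check: outcomes generated from the same probabilities used as "fitted" values
rng = random.Random(7)
fitted = [rng.uniform(0.02, 0.30) for _ in range(5000)]
outcomes = [1 if rng.random() < p else 0 for p in fitted]
residuals = [randomized_quantile_residual(y, p, rng) for y, p in zip(outcomes, fitted)]

print(f"mean = {mean(residuals):.3f}, sd = {pstdev(residuals):.3f}")
```

When the fitted probabilities are correct, these residuals behave like a standard normal sample, so plotting them against fitted values or covariates should show no trend.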

Appendix 8

Model building guidance

To help guide the model building process for future users of the rattiness-infection framework we outline the following key steps (to be viewed with the available R code at https://github.com/maxeyre/Rattiness-infection-framework):

  1. Set up the rat (or any other animal reservoir) component of the model [script: 1-rat-explore.R]:

    1. Fit the non-spatial rattiness model with no explanatory variables and predict rattiness at all sampled rat locations.

    2. Explore the relationship between predicted mean rattiness and explanatory variables using Generalized Additive Models (GAMs) to decide on their functional form. Note: all variables considered must also be measured at household locations.

    3. Conduct model selection using a linear model with mean predicted rattiness as the dependent variable.

    4. Fit the non-spatial rattiness model with selected explanatory variables and compute the empirical variogram to check for evidence of residual spatial autocorrelation. If the variogram shows no signs of residual correlation, consider confirming this result by fitting the spatial model in the following step: the estimated scale of spatial correlation ϕ should be close to zero (the fitting procedure may also be unable to estimate ϕ, with its value changing considerably between iterations).

    5. Fit the spatial rattiness model with selected explanatory variables and predict mean rattiness at household locations to create an exploratory rattiness variable. To improve model convergence use parameter estimates from the non-spatial model as the first guess for the parameters and repeat model fitting by plugging in previous estimates.

  2. Set up the human infection component of the model [script: 2-human-explore.R]:

    1. Explore the relationship between infection risk and explanatory variables using Generalized Additive Models (GAMs) to decide on their functional form.

    2. Conduct model selection.

    3. Explore the relationship between infection risk and mean predicted rattiness using Generalized Additive Models (GAMs) and consider interactions where relevant.

    4. Test for residual spatial correlation after controlling for selected variables and mean predicted rattiness (we recommend using the PrevMap package).

  3. Using the joint rattiness-infection framework:

    1. Fit the joint model [script: 3-fit-joint-model.R]. In the ‘control’ file, the inclusion of a household-level random effect (or ‘nugget’) and an additional spatial Gaussian process (if there was evidence of residual spatial correlation after controlling for explanatory variables and rattiness in the previous step) in the human linear predictor can be controlled. We recommend monitoring parameter estimates from each iteration to assess how well the model is converging. If parameters for the spatial Gaussian processes are not converging then this may indicate that the data do not support the inclusion of an additional spatial Gaussian process in the human component and a simpler model should be considered. This model fitting process can be time consuming and ideally should be run on a high-end computing network.

    2. Conduct a residual diagnostic analysis for the full rattiness-infection model [script: 5-revision subanalyses.R].

    3. Bootstrap to estimate uncertainty in parameter estimates [script: 3-fit-joint-model.R].

    4. Create prediction maps for rattiness, infection risk and spatial Gaussian processes (if required) [script: 4-spatial-prediction.R]. The prediction grid for your study area must include values for all variables included in the model.
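As an illustration of the empirical variogram check in step 1.4, the Python sketch below bins all pairs of locations by distance and averages half the squared difference of their residuals within each bin. It is a generic, minimal implementation with hypothetical names and toy data. It is not the code in the repository, which is written in R, and it omits the permutation-based confidence envelope used to judge spatial independence.

```python
import math
from collections import defaultdict


def empirical_variogram(coords, values, bin_width, max_dist):
    """Return {bin midpoint: (semivariance, number of pairs)} for values at point locations."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            if d >= max_dist:
                continue  # ignore pairs beyond the maximum distance of interest
            k = int(d // bin_width)  # index of the distance bin
            sums[k] += 0.5 * (values[i] - values[j]) ** 2
            counts[k] += 1
    return {(k + 0.5) * bin_width: (sums[k] / counts[k], counts[k]) for k in sorted(counts)}


# Toy data: three points on a line with made-up residual values
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
values = [0.0, 1.0, 3.0]
variogram = empirical_variogram(coords, values, bin_width=1.5, max_dist=3.0)
for midpoint, (semivariance, n_pairs) in variogram.items():
    print(f"distance ~ {midpoint:.2f}: semivariance = {semivariance:.2f} ({n_pairs} pairs)")
```

A variogram that is roughly flat across distance bins suggests little residual spatial correlation, whereas semivariance that rises with distance before levelling off suggests spatial structure that the model should account for.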

Data availability

Rat and human data analysed in this study have been deposited in OSF (https://doi.org/10.17605/OSF.IO/AQZ2Y). However, household coordinates and valley ID have been removed from the human data to ensure participant anonymity. Modelling functions, R scripts and metadata for analyses in this manuscript are publicly available at https://github.com/maxeyre/Rattiness-infection-framework (copy archived at swh:1:rev:e7953d38269ce97221dbdd83c0be2c65d92dff40).

The following data sets were generated

  1. Eyre M, Costa F, Ko A (2021) Linking rattiness, geography and environmental degradation to spillover Leptospira infections in marginalised urban settings: Data sources. Open Science Framework. https://doi.org/10.17605/OSF.IO/AQZ2Y

References

  1. [Software] Barton K (2020) Mu-min: multi-model inference, version 0.2. R Package Version.
  2. [Software] Bates D, Sarkar D, Bates MD, Matrix L (2007) The lme4 package, version 2.0. R Package Version.
  3. [Report] Centers for Disease Control and Prevention (2006) Integrated Pest Management: Conducting Urban Rodent Surveys. US Department of Health and Human Services.
  4. Dunn PK, Smyth GK (1996) Randomized quantile residuals. Journal of Computational and Graphical Statistics 5:236–244. https://doi.org/10.1080/10618600.1996.10474708
  5. [Report] Leary SL, Underwood W, Anthony R, Cartner S, Corey D (2013) AVMA guidelines for the euthanasia of animals. American Veterinary Medical Association.
  6. [Report] WHO (2010) Leptospirosis burden epidemiology reference group (LERG). Geneva: World Health Organization.
  7. [Software] Wickham H (2017) The tidyverse, version 1.1. R Package.
  8. [Software] Wood S, Wood MS (2015) Package ‘mgcv’, version 1.8-40. R Package.

Decision letter

  1. Niel Hens
    Reviewing Editor; Hasselt University, Belgium
  2. Miles P Davenport
    Senior Editor; University of New South Wales, Australia
  3. Benny Borremans
    Reviewer; University of California, Los Angeles, United States

Our editorial process produces two outputs: (i) public reviews designed to be posted alongside the preprint for the benefit of readers; (ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.

Decision letter after peer review:

Thank you for submitting your article "Linking rattiness, geography and environmental degradation to spillover Leptospira infections in marginalised urban settings: an eco-epidemiological community-based cohort study in Brazil" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, one of whom is a member of our Board of Reviewing Editors, and the evaluation has been overseen by David Serwadda as the Senior Editor. The following individual involved in review of your submission has agreed to reveal their identity: Benny Borremans (Reviewer #3).

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

Essential revisions:

The reviewers and I have several comments that should be carefully addressed. I tried to merge the essential ones here, though the public reviews and the recommendations for authors should be taken into account too (note that there is some overlap between the essential revisions and the public reviews).

1) On the statistical model and the choices made:

– On p7, section 2.2.2 the authors use mu_2 for defining the intensity of the inhomogeneous Poisson process. Shouldn't this be mu_1 rather than mu_2? If not, what makes using the same function for Yi,1 and Yi,2 a reasonable choice? Note that the model used in both components is not the same.

– Did the authors perform a sensitivity analysis related to the assumption of t=0.5 for the disturbed traps (how often did this occur)?

– For the most part, the explanatory variables assessed in the different models were well described and justified, however there were some cases for which further explanation would have been helpful. For example, how did the authors determine which occupations to evaluate? Specifically, why traveling salesperson? What is the difference between open sewer within 10 m and unprotected from sewer?

– Sup file 2 and Figure S1 (and Table 1): This could be a function of me not understanding correctly, not necessarily the authors not conducting the study appropriately, but I couldn't understand why elevation was split into 3 when distance to large refuse piles was only split into 2 categories since the shape of the splines was similar. Based on Figure S1 (B) it seems that the effect decreases until ~ 60m then increases until ~ 145m and then decreases again? Also, it was unclear to me why in Table 1 an effect estimate for distance to large refuse piles of .02 is 'of little effect' when one of -.07 is considered noteworthy. They both seemed quite small.

– Table 2: It was unclear to me why both relative elevation and elevation level were included and how they differed. Further explanation would be helpful.

– Figure 4. It seems to me that the elevation levels were chosen simply by identifying the elevation cut-offs that divided the household sample sizes into three equal groups. It would be helpful if the authors included a viable biological justification for this division.

– The authors provide an extensive model building exercise and investigate, in different ways, whether the model captures the necessary complexity (GAM smoothers – testing linearity, spatial correlation, etc). I believe the work would benefit from (1) a formal diagnostic investigation, if feasible; (2) providing guidelines on how model building should be performed.

More specifically there are some additional concerns about this specific analysis:

(1) The infection risk data: while the actual infection risk data are not shown, the map shown in Figure 5B suggests that there is an infection hotspot that happens to be at high elevation. This raises the question of how strongly this single hotspot is driving the observed correlation between rat abundance and infection risk (which the authors find to be much stronger at high elevation than at lower elevations).

(2) The statistical models: if I understand correctly, all tested models of infection risk include the variable rat abundance, and while the individual effect estimates for rat abundance are statistically significant (Table 3), the more important question of how the fit of a model without the rat abundance variables compares with those of the other tested models (shown in Supplementary Table S2) has not been addressed.

I am wondering about this curious spatial pattern, where there seems to be one main predicted hotspot of infection risk (Figure 5B), which happens to be at a high elevation. There are a few other locations at a similar elevation, but these don't result in high infection risk predictions, which I assume is because of a difference in other important covariates? When comparing this result to the rattiness map (Figure 5A), one would never guess there is a meaningful (biologically significant) correlation between rattiness and infection risk. Model selection however did find a statistically significant effect of rattiness (Table 3), with the largest effect sizes for the high elevation. This makes me wonder whether the statistical pattern is mostly driven by this one hotspot that happens to be at high elevation, and how important rattiness really is overall.

It would be great to see a map based on the raw infection data, so it's possible to get a better sense of this possible biasing effect on the contribution of rattiness. Maybe add it to figure 5?

One way to test this would be to do the same analysis, but without the location(s) driving this high infection risk hotspot, and see if rattiness is still an important contributor to infection risk.

Perhaps more importantly, all human infection models (Supplementary table 2) include rattiness, so there is no way to assess how a model without rattiness compares with those that do. I strongly suggest adding at least one model without rattiness, for example model M1 but excluding rattiness. If the AIC values of all models in Table S2 are much lower than a model without rattiness, it would add a lot of confidence to the assumed significant effect of rattiness. This is related to the model framework relying on conditional independence within its built up (equation (1)). Whereas this is a reasonable assumption, it would be good to discuss situations in which this assumption is questionable and what the implications are for applying the modeling framework to other settings. In addition the authors indicate that the most complex model was chosen when modeling rattiness (p8, section 2.3.1). Doesn't this imply that the model selection reaches its limits given the candidate models at hand, ie is there a need to consider more complex models?

2) Presentation and interpretation of results

– In Tables 2 and 3, the authors present their results in a comprehensive way but it's not easy to connect those tables in terms of results. For example; are the occupational exposures binary variables? If not, what is the reference category and why is only one (work as traveling salesperson) retained in Table 3? Which of the variables reach overall significance?

Reviewer #1 (Recommendations for the authors):

I believe the manuscript is overall clearly written. I do have a few questions for clarification though.

On p2, section 2.1.2 serosurveys, eligibility criteria for inclusion in the cohort study are outlined. It would help explaining why these specific conditions were used for inclusion: ie 'who had slept more than 3 nights in the previous week in a study household'.

On p7, section 2.2.3 the authors define Zi,j using a Bernoulli variable with probability p_j(x_i). Wouldn't it make sense to consider a hazard-based framework or derive the corresponding hazard function in terms of its interpretation?

Textual comments:

– Please use \mbox for the correlation in section 2.2.1.

Reviewer #2 (Recommendations for the authors):

Line 62 – typo? Analyze vs. analysing?

Figure 2 description – typo? … dh and dr are not mutually exclusive groups of explanatory [variables – missing?] and the same variables…

Line 277 – it would be nice to reference table 2 here so that the readers can see the full list of considered variables by group.

Regarding supplemental information, it would have been easier if the actual table or figure had been referenced vs. the file in which it could be found.

Reviewer #3 (Recommendations for the authors):

It was a pleasure to review your manuscript.

In my opinion the writing is excellent, the study design is clever and powerful (and must have been a lot of work!), and the spatial statistics are performed expertly.

I do have a few suggestions, that I hope can either be easily refuted or can help to improve the analyses.

Congratulations on this fantastic work.

L65: I suggest writing DALYs in full, as not all readers will know what this is.

L94: I don't agree that there is an absence of methods (multilevel Bayesian models for example have been around for a while), but rather that they are rarely applied in this context.

L99: I find this particular unspecific use of abundance quite confusing, as this is already a very specific and well-defined ecological term. For example, what exactly is then meant by reservoir host abundance on L104? Is this the number of reservoir hosts, or the number of infected hosts, or the number of leptospires in the environment?

If it is used as a measure of exposure to a disease of interest (L100), why not use a term like pathogen pressure, or just exposure? I strongly suggest using different words to describe actual abundance and pathogen-related abundance.

L115: The term rattiness is useful (and fun), but does it really represent leptospire pressure by rats if the model does not take into account leptospira prevalence/shedding in the rat populations? I agree that the presence/abundance of rats can be a decent proxy for the potential risk of leptospira spillover in locations with known presence of leptospira in the rat population, but I'm less inclined to accept that it is ok to define rattiness, which implies rat abundance, as a proxy for leptospiral contamination when the study did not measure the presence of leptospira in the rats.

I see that this is mentioned in the discussion (L493). I think it might be more useful to add this information earlier on, at the place where the term rattiness is introduced.

I agree that with such a high prevalence, it is reasonable to use rattiness as a proxy, but would still be wary: these 80% of rats are likely not distributed randomly across the area, as pathogen transmission is typically more spatially clustered. That means that 1 out of 5 local rat populations are not infected, which is definitely not negligible. That is an important caveat to highlight clearly, early on.

L208, 213: On L208, i is defined as a location, and on L213 as household location. I assume these represent the same location? If so, it might be best to be a bit more specific in the definition on L208, and add 'household', just so it's clear there is only one definition of a location.

L284: What is the rationale for choosing those specific knots?

L324: Kudos for citing the individual R packages (as one should, but often not done).

https://doi.org/10.7554/eLife.73120.sa1

Author response

Essential revisions:

1) On the statistical model and the choices made:

– On p7, section 2.2.2 the authors use mu_2 for defining the intensity of the inhomogeneous Poisson process. Shouldn't this be mu_1 rather than mu_2? If not, what makes using the same function for Yi,1 and Yi,2 a reasonable choice? Note that the model used in both components is not the same.

Thank you for pointing out this typographical error, it has been corrected to mu_1.

– Did the authors perform a sensitivity analysis related to the assumption of t=0.5 for the disturbed traps (how often did this occur)?

We have included the results of a sensitivity analysis to evaluate the impact of this assumption on our model parameter estimates in Appendix 6. In brief, for this analysis we drew independent samples for t from the uniform distribution for all 554 trap observations that closed early (36.6% of n=1,512 trap observations) and fitted a simplified rattiness model to the simulated data. This was repeated 1,000 times and between-imputation standard errors were computed. Estimated between-imputation standard errors were very small relative to parameter estimates (the maximum value of a SE divided by the point estimate was equal to 1.7%, with the rest below 1%), indicating that uncertainty due to the missing trap disturbance information is unlikely to have materially affected estimates of uncertainty in the full rattiness-infection model. This evidence supports the use of the t=0.5 assumption in our full analysis. We have added the following text to Section 2.2.2:

“We conducted a sensitivity analysis for this assumption (see Appendix 6) and found that it did not materially affect rattiness parameter estimates.”

– For the most part, the explanatory variables assessed in the different models were well described and justified, however there were some cases for which further explanation would have been helpful. For example, how did the authors determine which occupations to evaluate? Specifically, why traveling salesperson? What is the difference between open sewer within 10 m and unprotected from sewer?

We have added the following additional text to Section 2.3.2 on line 297 to clarify the definition and reason for inclusion for these variables:

“In the household environment domain, two variables were used to capture risk due to sewer flooding close to the household: (i) the presence of an open sewer within 10 metres of the household location and (ii) a binary `unprotected from open sewer' variable which identified those households within 10 metres of an open sewer that did not have any physical barriers erected to prevent water overflow. Three high-risk occupations were included in the occupational exposures domain as binary variables. Construction workers and refuse collectors have direct contact with potentially contaminated soil, building materials and refuse in areas that provide harbourage and food for rats. Travelling salespeople have regular and high levels of exposure to the environment (particularly during flooding events) as they move from house to house by foot. Two other binary occupational exposure variables were included that measured whether a participant worked in an occupation that involves contact with floodwater or sewer water.”

– Sup file 2 and Figure S1 (and Table 1): This could be a function of me not understanding correctly, not necessarily the authors not conducting the study appropriately, but I couldn't understand why elevation was split into 3 when distance to large refuse piles was only split into 2 categories since the shape of the splines was similar. Based on Figure S1 (B) it seems that the effect decreases until ~ 60m then increases until ~ 145m and then decreases again?

We have added the following text in Appendix 1 to explain the decision to model distance to refuse piles in this way and clarify the difference for elevation:

“A single knot point for the distance to refuse piles variable was chosen at 50m to account for the expected decay in the effect of food resources up to a rat home range distance, beyond which little effect would be expected. We did not include an additional knot point at 145m despite there being a visible change in gradient in Figure S1B for two reasons. Firstly, the home range of Norway rats is estimated to be less than 100m in these urban settings [1-3], meaning that rat abundance is very unlikely to be affected by the availability of anthropogenic food sources beyond this distance, particularly given the high availability of food across the study area. Secondly, we could offer no scientific rationale for why rattiness would start to increase again beyond 50m before peaking at a very large distance of 145m from a refuse pile and decreasing thereafter.

In contrast, the mechanisms by which rattiness varies with elevation are more complex, with significant changes in the environment occurring at different elevations. For example, the relationship identified in Figure S1A can be explained by the high risk of flooding at the bottom of the valley, which carries resources down to lower elevations (resulting in a peak of rattiness at about 7m) but makes the very lowest elevations unsuitable for rat burrows. The highest elevations in our study area are close to a main road with food markets where large quantities of food waste are left out in the street for collection. Although there is large uncertainty about this relationship, it is highly possible that this may be driving the positive relationship between 23m and 40m.”
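The linear spline with a single knot described above can be sketched as follows. This is a minimal illustration, assuming a truncated-line basis; the function names and the coefficient values are hypothetical and are not estimates from the study.

```python
def linear_spline_basis(x, knots):
    """Return [x, (x - k1)+, (x - k2)+, ...] for a continuous piecewise-linear term."""
    return [x] + [max(x - k, 0.0) for k in knots]


def spline_effect(x, slopes, knots):
    """Contribution to the linear predictor.

    slopes[0] is the slope below the first knot; each later entry is the
    CHANGE in slope at the corresponding knot.
    """
    basis = linear_spline_basis(x, knots)
    return sum(b * s for b, s in zip(basis, slopes))


KNOT_M = 50.0  # knot for distance to refuse piles, as stated in the text

# Hypothetical coefficients for illustration only: a decline up to the knot,
# then a slope change that cancels it, giving a flat effect beyond 50 m.
slopes = [-0.01, 0.01]

for distance_m in (0, 25, 50, 100, 145):
    basis = linear_spline_basis(distance_m, [KNOT_M])
    print(distance_m, basis, round(spline_effect(distance_m, slopes, [KNOT_M]), 3))
```

Adding a second knot (for example at 145 m) would simply append another truncated term to the basis, which is the extra flexibility the authors chose not to include.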

Also, it was unclear to me why in Table 1 an effect estimate for distance to large refuse piles of .02 is 'of little effect' when one of -.07 is considered noteworthy. They both seemed quite small.

Thank you for pointing this out. We have added the following additional text to Section 3.2 to clarify this: “there was a smaller increase in rattiness of 0.02 (95% CI -0.05, 0.09) per 10m.”

– Table 2: It was unclear to me why both relative elevation and elevation level were included and how they differed. Further explanation would be helpful.

We have changed ‘relative elevation level’ to ‘relative elevation category’ in the text to distinguish more clearly between these two parameterisations. These are two different parameterisations for household elevation above the valley floor: (i) relative elevation – a continuous variable modelled as a linear spline with a knot at 20 metres; (ii) relative elevation category – a categorical variable modelled as a piecewise constant function with breaks at 6.7 and 15.6 metres resulting in three categories: low, medium and high elevation levels. We have added the following additional text in Section 2.2.4 to clarify this:

“a categorical parameterisation of household elevation relative to the bottom of the valley (modelled as a piecewise constant function with breaks at 6.7 and 15.6 metres, resulting in three categories: low, medium and high elevation levels.)”. We have also added the following footnote to Table 2: “Relative elevation category consists of three discrete groups representing three regions with different flooding risk profiles.”.

Both of these variables were included in the model selection because relative elevation has conventionally been modelled as a continuous variable in earlier studies and we wished to maintain consistency. Relative elevation category was included because our primary aim was to test the hypothesis of whether the role of rats in driving transmission varied across the chosen three elevation categories (low, medium and high) due to their different flood risk profiles. We have added additional text to clarify the reason for the inclusion of the relative elevation category variable.

– Figure 4. It seems to me that the elevation levels were chosen simply by identifying the elevation cut-offs that divided the household sample sizes into three equal groups. It would be helpful if the authors included a viable biological justification for this division.

The choice of these groupings was based on our observations on the spatial variation in flooding risk and leptospirosis risk at low, medium and high elevations in the community as defined in earlier studies conducted at this site. Consequently, our study area was defined to have a roughly even number of households across this elevation gradient. We have added the following additional text to Section 2.2.4 to clarify this:

“This was implemented by first dividing the study area into three elevation categories with different flooding risk profiles (as observed during our work in the study area over the last 15 years): low (0-6.7m from bottom of valley; high flooding risk with maintenance of floodwater for long periods), medium (6.7-15.6m; moderate flooding with high water runoff) and high (>15.6m; limited flooding and water runoff). Our study was then designed to evenly sample across this elevation gradient and minimum and maximum values for each elevation category were chosen to include an equal number of households in each level.”
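The piecewise constant parameterisation described above amounts to mapping each household's relative elevation to one of three categories. A minimal sketch is given below; the function name is hypothetical, and the handling of values falling exactly on a break (assigned here to the lower category) is an assumption, since the text does not specify it.

```python
from bisect import bisect_left

# Breaks (metres above the valley floor) and labels as given in the text
BREAKS_M = [6.7, 15.6]
LABELS = ["low", "medium", "high"]


def elevation_category(relative_elevation_m):
    """Map relative elevation to 'low', 'medium' or 'high'.

    Assumption: a value exactly equal to a break goes to the lower category,
    consistent with 'high' being defined as strictly above 15.6 m.
    """
    if relative_elevation_m < 0:
        raise ValueError("relative elevation is measured from the valley floor")
    return LABELS[bisect_left(BREAKS_M, relative_elevation_m)]


for elevation in (3.0, 6.7, 10.0, 15.6, 20.0):
    print(elevation, elevation_category(elevation))
```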

– The authors provide an extensive model building exercise and investigate, in different ways, whether the model captures the necessary complexity (GAM smoothers – testing linearity, spatial correlation, etc). I believe the work would benefit from (1) a formal diagnostic investigation, if feasible;

We have added a new Appendix 7 with diagnostic plots of randomized quantile residuals to check the rattiness-infection model fit with the human infection data and included the following text in Section 2.4 of the main text:

“A formal diagnostic investigation of randomized quantile residuals is included in Appendix 7. We found no evidence in the diagnostic plots to suggest that there were issues with our modelling approach.”
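For readers unfamiliar with randomized quantile residuals, the idea for a binary outcome can be sketched as follows. This is a generic illustration for a Bernoulli response with fitted probabilities, not the code used for Appendix 7; the function name, the example data and the clamping constant are assumptions.

```python
import random
from statistics import NormalDist


def randomized_quantile_residuals(outcomes, fitted_probs, seed=0):
    """Randomized quantile residuals for Bernoulli outcomes.

    For each observation, draw u uniformly between F(y - 1) and F(y), where F
    is the fitted Bernoulli CDF, then transform u to the standard normal scale.
    If the model is adequate, the residuals are approximately N(0, 1).
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    eps = 1e-12  # keeps u strictly inside (0, 1) so inv_cdf is defined
    residuals = []
    for y, p in zip(outcomes, fitted_probs):
        # Bernoulli CDF: F(-1) = 0, F(0) = 1 - p, F(1) = 1
        lower, upper = (0.0, 1.0 - p) if y == 0 else (1.0 - p, 1.0)
        u = rng.uniform(lower, upper)
        u = min(max(u, eps), 1.0 - eps)
        residuals.append(std_normal.inv_cdf(u))
    return residuals


# Hypothetical outcomes and fitted infection probabilities, for illustration only
y_obs = [0, 1, 0, 0]
p_hat = [0.10, 0.20, 0.05, 0.30]
print(randomized_quantile_residuals(y_obs, p_hat))
```

In practice the residuals would be inspected with a normal Q-Q plot or plotted against covariates and fitted values to look for systematic departures.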

(2) providing guidelines on how model building should be performed.

To supplement the R code that is publicly available for repeating all of the steps in this analysis, we have now also included a detailed step-by-step explanation of the model building process in Appendix 8 that outlines the key steps for building the rat and infection components of the model (variable selection and evaluation of residual spatial autocorrelation) and fitting and examining the joint rattiness-infection model. We have added the following text in Section 2.6 of the main text:

“We also include a step-by-step explanation of the model building process to guide future users of the rattiness-infection framework in Appendix 8.”

More specifically there are some additional concerns about this specific analysis:

(1) The infection risk data: while the actual infection risk data are not shown, the map shown in Figure 5B suggests that there is an infection hotspot that happens to be at high elevation. This raises the question of how strongly this single hotspot is driving the observed correlation between rat abundance and infection risk (which the authors find to be much stronger at high elevation than at lower elevations).

Please see below.

(2) The statistical models: if I understand correctly, all tested models of infection risk include the variable rat abundance, and while the individual effect estimates for rat abundance are statistically significant (Table 3), the more important question of how the fit of a model without the rat abundance variables compares with those of the other tested models (shown in Supplementary Table S2) has not been addressed.

I am wondering about this curious spatial pattern, where there seems to be one main predicted hotspot of infection risk (Figure 5B), which happens to be at a high elevation. There are a few other locations at a similar elevation, but these don't result in high infection risk predictions, which I assume is because of a difference in other important covariates? When comparing this result to the rattiness map (Figure 5A), one would never guess there is a meaningful (biologically significant) correlation between rattiness and infection risk. Model selection however did find a statistically significant effect of rattiness (Table 3), with the largest effect sizes for the high elevation.

This makes me wonder whether the statistical pattern is mostly driven by this one hotspot that happens to be at high elevation, and how important rattiness really is overall.

It would be great to see a map based on the raw infection data, so it's possible to get a better sense of this possible biasing effect on the contribution of rattiness. Maybe add it to figure 5?

One way to test this would be to do the same analysis, but without the location(s) driving this high infection risk hotspot, and see if rattiness is still an important contributor to infection risk.

Thank you for this comment. We have added a new figure (Figure 4) earlier on in the article with the raw infection data overlaid on contour lines for the three elevation levels to provide the reader with a better overview of the raw data. We decided to add this here rather than to Figure 6 (formerly Figure 5) to ensure that the map is large enough that the points in Figure 4A are easily visible; please note that it is included as a larger and easier to view image in the main eLife template version. This new Figure 4 shows that out of a total of 403 participants in the high elevation region there were 16 infections, of which only 5 (31%) were located in the large hotspot in Valley 3 (valleys are numbered 1 to 3 from west to east, see Figure 1A). In addition to the largest hotspot in the north of Valley 3, there are several other areas in the high elevation region with raised predicted infection risk values relative to their surroundings where there were also rattiness hotspots and infected participants in the raw data: five cases (red and yellow infection risk areas in Figure 5B) on the western side of Valley 2; the two cases on the eastern edge of Valley 2; the two cases on the western edge of Valley 3; and the single case in the southwest of Valley 3. Other variables are also important drivers of infection risk and at several of these locations the contribution of rattiness increases infection risk significantly relative to the low-risk surrounding area (e.g. to 10% in areas where risk is closer to 1% or 2%) without reaching the more obviously visible high infection risk values closer to 20%.
We believe that our statistical model provides a better test of whether there is a statistical association between rattiness and infection at high elevations than a visual examination, and that this association is supported by the large number of observations in the high elevation area (403) and the distribution of infected and uninfected households, which demonstrates that the observed association is not driven only by the hotspot in Valley 3.

Perhaps more importantly, all human infection models (Supplementary table 2) include rattiness, so there is no way to assess how a model without rattiness compares with those that do. I strongly suggest adding at least one model without rattiness, for example model M1 but excluding rattiness. If the AIC values of all models in Table S2 are much lower than a model without rattiness, it would add a lot of confidence to the assumed significant effect of rattiness.

Thank you for pointing this out. These models were considered but were ranked outside of the top five models and for this reason were not reported in Table S2. We agree that showing the AIC of a model without rattiness in this table can more clearly demonstrate the improved fit of the model with rattiness. To do this we have added the highest ranked model without rattiness (M*) to Table S2 and added a note to the table explaining the reason for its inclusion (“Model M* was ranked outside of the top 5 models but is included here for reference to demonstrate the improvement in model fit when rattiness is included”). The AIC of M* was 532.13. This is substantially higher than the AIC values of the top five models (M1 = 523.14 and M5 = 525.04), justifying the inclusion of rattiness in this model and in the joint rattiness-infection framework.
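The size of the improvement can be made explicit by computing AIC differences from the values quoted above. The sketch below also reports the relative likelihood exp(-ΔAIC/2), a standard summary that is an addition here and is not reported in the response; the function names are hypothetical.

```python
import math

# AIC values quoted in the response above (M* is the best model without rattiness)
aic = {"M1": 523.14, "M5": 525.04, "M*": 532.13}


def delta_aic(aic_values):
    """Difference between each model's AIC and the lowest AIC in the set."""
    best = min(aic_values.values())
    return {name: value - best for name, value in aic_values.items()}


def relative_likelihood(delta):
    """Relative likelihood of a model compared with the best model."""
    return math.exp(-delta / 2.0)


deltas = delta_aic(aic)
for name, d in deltas.items():
    print(f"{name}: delta AIC = {d:.2f}, relative likelihood = {relative_likelihood(d):.3f}")
```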

This is related to the model framework relying on conditional independence within its built up (equation (1)). Whereas this is a reasonable assumption, it would be good to discuss situations in which this assumption is questionable and what the implications are for applying the modeling framework to other settings.

We have added the following text immediately after “is shown schematically in Figure 2” following equation (1) on line 225:

“The conditional independence assumption in (1) is reasonable for a vector-borne disease or one that is transmitted indirectly, in which context the observed rat indices are to be considered as noisy indicators of the unobservable spatial variation in the extent to which the environment is contaminated with rat-derived pathogen. It would be more questionable for applications in which the disease of interest is spread by direct transmission from rat to human.”

In addition the authors indicate that the most complex model was chosen when modeling rattiness (p8, section 2.3.1). Doesn't this imply that the model selection reaches its limits given the candidate models at hand, ie is there a need to consider more complex models?

The number of variables available for consideration in this rattiness model was limited to the three variables included in the model because of the requirement that they were also measured at all household locations. For this reason, we were unable to consider any other environmental variables. In terms of the functional forms of the included variables, we wished to maintain interpretability in the modelled relationships and decided to model them with linear splines rather than considering more complex and less interpretable smoothing functions.

2) Presentation and interpretation of results

– In Tables 2 and 3, the authors present their results in a comprehensive way but it's not easy to connect those tables in terms of results. For example; are the occupational exposures binary variables? If not, what is the reference category and why is only one (work as traveling salesperson) retained in Table 3? Which of the variables reach overall significance?

These occupational exposure variables are binary. We have now added additional text in Section 2.3.2 on lines 294 and 297 to clarify this and have added a footnote explaining that the reference category was participants who do not have this exposure in Tables 2 and 3: “Binary variable with reference category of ‘no occupational exposure’.”. A reference category row was not included for each variable because Tables 2 and 3 are already large and we wished to keep each of them on a single page.

In terms of the overall significance, we prefer to show confidence intervals, which focus on the range of possible effect sizes compatible with the data, rather than p-values. However, all effects whose OR confidence intervals did not include the value one are significant at the conventional 5% level. To make this more consistent in the text we have edited the text as follows:

Line 394 – “Variables in two of the four domains (demographic and social status and behavioural exposures) had estimated effect sizes with 95% confidence intervals that did not include an odds ratio of one (statistically significant at the conventional 5% level).”

Line 422 – “Individuals living in the medium (OR 0.77 95% CI 0.31, 1.66) and high (OR 0.67 95% CI 0.11, 1.64) elevation areas had a lower estimated odds of infection relative to those living in the low relative elevation category area where there are open sewers and flooding risk is higher; however, these confidence intervals included an odds ratio of one (not statistically significant at the conventional 5% level).”
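The correspondence between a 95% confidence interval and significance at the 5% level can be expressed as a simple check. The sketch below applies it to the two intervals quoted above; the function name is hypothetical.

```python
def excludes_null(lower, upper, null_value=1.0):
    """True if the confidence interval does not contain the null value.

    For an odds ratio the null value is 1, so a 95% CI excluding 1
    corresponds to significance at the conventional 5% level.
    """
    return not (lower <= null_value <= upper)


# Intervals quoted in the revised text for Line 422
intervals = {
    "medium elevation": (0.31, 1.66),
    "high elevation": (0.11, 1.64),
}

for label, (lower, upper) in intervals.items():
    print(label, "significant at 5% level:", excludes_null(lower, upper))
```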

Reviewer #1 (Recommendations for the authors):

I believe the manuscript is overall clearly written. I do have a few questions for clarification though.

On p2, section 2.1.2 serosurveys, eligibility criteria for inclusion in the cohort study are outlined. It would help explaining why these specific conditions were used for inclusion: ie 'who had slept more than 3 nights in the previous week in a study household'.

We have added the following additional text to clarify this: “This study focussed on ground floor households because they are vulnerable to flooding and consequently at high risk for leptospiral transmission. The criterion for determining whether a resident is currently living at a household location is commonly applied in this context to account for resident mobility.”.

On p7, section 2.2.3 the authors define Zi,j using a Bernoulli variable with probability p_j(x_i). Wouldn't it make sense to consider a hazard-based framework or derive the corresponding hazard function in terms of its interpretation?

Thank you for your comment. Our study design consisted of two cross-sectional serological surveys conducted six months apart. We defined infections during this six-month period based on a comparison of antibody titres in paired serological samples from the two surveys. As we had no ongoing surveillance during this period (and because a significant proportion of infections are asymptomatic) we were only able to consider the outcome of whether there had been (at least one) infection event during the six-month period. Consequently, we did not have a time-to-event outcome and were therefore unable to consider a hazard-based framework.

Textual comments:

– Please use \mbox for the correlation in section 2.2.1.

This has been corrected, thank you for pointing this out.

Reviewer #2 (Recommendations for the authors):

Line 62 – typo? Analyze vs. analysing?

This has been corrected, thank you for pointing this typo out.

Figure 2 description – typo? … dh and dr are not mutually exclusive groups of explanatory [variables – missing?] and the same variables…

This has been corrected by adding in ‘variables’, thank you for pointing this typo out.

Line 277 – it would be nice to reference table 2 here so that the readers can see the full list of considered variables by group.

We have added the following text in the suggested sentence (Section 2.3.2): “(see Table 2 for the full list of considered variables by group)”

Regarding supplemental information, it would have been easier if the actual table or figure had been referenced vs. the file in which it could be found.

We have now added the Figure/Table reference for all references to supplemental information throughout the main text.

Reviewer #3 (Recommendations for the authors):

It was a pleasure to review your manuscript.

In my opinion the writing is excellent, the study design is clever and powerful (and must have been a lot of work!), and the spatial statistics are performed expertly.

I do have a few suggestions, that I hope can either be easily refuted or can help to improve the analyses.

Congratulations on this fantastic work.

L65: I suggest writing DALYs in full, as not all readers will know what this is.

We have added the following text: “disability-adjusted life years (DALYs)”.

L94: I don't agree that there is an absence of methods (multilevel Bayesian models for example have been around for a while), but rather that they are rarely applied in this context.

We have changed ‘The absence of methods to formally integrate abundance into analyses of spillover mechanisms is an issue for rodent-borne zoonoses more widely’ to ‘the absence of methods applied to formally integrate abundance and spillover infection data is an issue for rodent-borne zoonoses more widely’.

L99: I find this particular unspecific use of abundance quite confusing, as this is already a very specific and well-defined ecological term. For example, what exactly is then meant by reservoir host abundance on L104? Is this the number of reservoir hosts, or the number of infected hosts, or the number of leptospires in the environment?

If it is used as a measure of exposure to a disease of interest (L100), why not use a term like pathogen pressure, or just exposure? I strongly suggest using different words to describe actual abundance and pathogen-related abundance.

We agree that abundance in this context is well-defined as the density of reservoir hosts, however it is very commonly used to describe measures or estimates that are all proxies for ‘true abundance’. In our paper we wished to explicitly acknowledge the implications of using several imperfect abundance indices on our latent process. To make this clearer and more consistent we have changed the text to “We use the term ‘abundance’ here to denote all ecological processes that are associated with animal abundance and measured by abundance indices, for example animal presence, density and activity, and that may be useful to quantify exposure to a zoonotic disease of interest.”. We have also added “(as defined previously)” on line 106 to make it clear that we are consistently using this definition throughout the paper.

L115: The term rattiness is useful (and fun), but does it really represent leptospire pressure by rats if the model does not take into account leptospira prevalence/shedding in the rat populations? I agree that the presence/abundance of rats can be a decent proxy for the potential risk of leptospira spillover in locations with known presence of leptospira in the rat population, but I'm less inclined to accept that it is ok to define rattiness, which implies rat abundance, as a proxy for leptospiral contamination when the study did not measure the presence of leptospira in the rats.

I see that this is mentioned in the discussion (L493). I think it might be more useful to add this information earlier on, at the place where the term rattiness is introduced.

I agree that with such a high prevalence, it is reasonable to use rattiness as a proxy, but would still be wary: these 80% of rats are likely not distributed randomly across the area, as pathogen transmission is typically more spatially clustered. That means that 1 out of 5 local rat populations are not infected, which is definitely not negligible. That is an important caveat to highlight clearly, early on.

Thank you for this comment. We have added the following text on line 530 in the discussion to emphasise the possibility of spatial clustering of non-shedding rats:

“Despite this, non-shedding rats may be spatially clustered and future work would benefit from the collection of georeferenced rat infection data”. We have also edited the first sentence on line 116 in the Introduction to the following: “The aim of this study was to develop a flexible modelling framework for zoonotic spillover to explore whether rattiness, acting as a proxy for local leptospiral contamination by Norway rats, can explain spatial heterogeneity in leptospiral transmission in a high-risk urban community in Brazil where 80% of rats are estimated to be actively shedding the bacteria.”

L208, 213: On L208, i is defined as a location, and on L213 as household location. I assume these represent the same location? If so, it might be best to be a bit more specific in the definition on L208, and add 'household', just so it's clear there is only one definition of a location.

We denote the full set of locations for which we have collected data (rat abundance sampling locations and participant household locations) with ‘i’, but index them as i = 1,…,Nr for rat index locations and i = Nr+1,…,Nr+Nh for household locations. This is important for our definition of R(xi) at all rat and human locations. We have tried to be consistent with this definition for the rat and human data – on line 219 we defined the human data to be at ‘a discrete set of locations X=…’. The reason for including the text ‘for individual j at household location i’ was to make it explicit that it was the household location which we were using for the human data.
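The indexing convention described above can be illustrated with a short sketch that stacks rat sampling locations before household locations. The function name and the coordinates are hypothetical; only the index ranges follow the text.

```python
def build_location_index(rat_locations, household_locations):
    """Combine both sets of locations under a single 1-based index i.

    Rat locations take i = 1, ..., Nr and household locations take
    i = Nr + 1, ..., Nr + Nh, matching the convention in the text.
    """
    combined = list(rat_locations) + list(household_locations)
    n_r, n_h = len(rat_locations), len(household_locations)
    rat_index = range(1, n_r + 1)
    household_index = range(n_r + 1, n_r + n_h + 1)
    return combined, rat_index, household_index


# Hypothetical coordinates, for illustration only
rats = [(0.0, 0.0), (10.0, 5.0), (20.0, 15.0)]
households = [(3.0, 4.0), (12.0, 9.0)]

locations, rat_idx, household_idx = build_location_index(rats, households)
for i in household_idx:
    # Convert the 1-based index i to Python's 0-based list position
    print(i, locations[i - 1])
```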

L284: What is the rationale for choosing those specific knots?

We have added the following additional text in Appendix 2 to explain the use of the generalized additive model (GAM) plots to explore non-linear relationships between explanatory variables and human infection risk more explicitly:

“Single knot points were considered for age at 30 years (Figure S2A), education at 5 years (Figure S2C) and relative elevation at 20 metres (Figure S2D) based on the value of the explanatory variable at which the gradient of the relationship changed in these plots.”. This is already referenced in the main text “As before, non-linear relationships were modelled using linear splines based on the identified functional form. Age was modelled with a knot at 30 years old, education at 5 years and relative elevation at 20m (Figure S2 in Appendix 1)”.

Please see above for similar added text to clarify the same process for the rattiness explanatory variables.

L324: Kudos for citing the individual R packages (as one should, but often not done).

Thank you.

https://doi.org/10.7554/eLife.73120.sa2

Article and author information

Author details

  1. Max T Eyre

    1. Centre for Health Informatics, Computing, and Statistics, Lancaster University Medical School, Lancaster, United Kingdom
    2. Liverpool School of Tropical Medicine, Liverpool, United Kingdom
    Present address
    Centre for Health Informatics, Computing and Statistics, Lancaster Medical School, Lancaster University, Lancaster, United Kingdom
    Contribution
    Conceptualization, Data curation, Software, Formal analysis, Investigation, Visualization, Methodology, Writing – original draft, Writing – review and editing
    For correspondence
    maxeyre3@gmail.com
    Competing interests
    No competing interests declared
    ORCID: 0000-0001-9847-8632
  2. Fábio N Souza

    Institute of Collective Health, Federal University of Bahia, Salvador, Brazil
    Contribution
    Data curation, Investigation, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-3542-8918
  3. Ticiana SA Carvalho-Pereira

    Institute of Collective Health, Federal University of Bahia, Salvador, Brazil
    Contribution
    Data curation, Investigation, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0003-2370-2198
  4. Nivison Nery

    Institute of Collective Health, Federal University of Bahia, Salvador, Brazil
    Contribution
    Data curation, Investigation, Writing – review and editing
    Competing interests
    No competing interests declared
  5. Daiana de Oliveira

    Institute of Collective Health, Federal University of Bahia, Salvador, Brazil
    Contribution
    Data curation, Investigation, Writing – review and editing
    Competing interests
    No competing interests declared
  6. Jaqueline S Cruz

    Institute of Collective Health, Federal University of Bahia, Salvador, Brazil
    Contribution
    Data curation, Investigation, Writing – review and editing
    Competing interests
    No competing interests declared
  7. Gielson A Sacramento

    Institute of Collective Health, Federal University of Bahia, Salvador, Brazil
    Contribution
    Data curation, Investigation, Writing – review and editing
    Competing interests
    No competing interests declared
  8. Hussein Khalil

    1. Institute of Collective Health, Federal University of Bahia, Salvador, Brazil
    2. Swedish University of Agricultural Sciences, Umeå, Sweden
    Contribution
    Writing – review and editing
    Competing interests
    No competing interests declared
  9. Elsio A Wunder

    1. Oswaldo Cruz Foundation, Brazilian Ministry of Health, Salvador, Brazil
    2. Department of Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, United States
    Contribution
    Data curation, Investigation, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-5239-8511
  10. Kathryn P Hacker

    University of Pennsylvania, Philadelphia, United States
    Contribution
    Data curation, Investigation, Writing – review and editing
    Competing interests
    No competing interests declared
  11. José E Hagan

    World Health Organization (WHO) Regional Office for Europe, Copenhagen, Denmark
    Contribution
    Data curation, Investigation, Writing – review and editing
    Competing interests
    No competing interests declared
  12. James E Childs

    1. Oswaldo Cruz Foundation, Brazilian Ministry of Health, Salvador, Brazil
    2. Department of Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, United States
    Contribution
    Writing – review and editing
    Competing interests
    No competing interests declared
  13. Mitermayer G Reis

    1. Institute of Collective Health, Federal University of Bahia, Salvador, Brazil
    2. Oswaldo Cruz Foundation, Brazilian Ministry of Health, Salvador, Brazil
    Contribution
    Conceptualization, Funding acquisition, Investigation, Project administration, Writing – review and editing
    Competing interests
    No competing interests declared
  14. Mike Begon

    Department of Evolution, Ecology and Behaviour, University of Liverpool, Liverpool, United Kingdom
    Contribution
    Conceptualization, Supervision, Writing – review and editing
    Competing interests
    No competing interests declared
  15. Peter J Diggle

    Centre for Health Informatics, Computing, and Statistics, Lancaster University Medical School, Lancaster, United Kingdom
    Contribution
    Conceptualization, Software, Supervision, Methodology, Writing – review and editing
    Competing interests
    No competing interests declared
  16. Albert I Ko

    1. Oswaldo Cruz Foundation, Brazilian Ministry of Health, Salvador, Brazil
    2. Department of Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, United States
    Contribution
    Conceptualization, Funding acquisition, Writing – review and editing
    Competing interests
    AIK has received funding from Serimmune and Zoetis for work related to leptospirosis. AIK also received payment and honoraria from the Reckitt Global Hygiene Institute for participating in a non-profit panel. AIK received travel support from the World Health Organization and the Brazilian Ministry of Health. AIK is listed as co-inventor on an issued patent (US 7,718,183 B2) and pending patent (US 61/951,732) related to leptospirosis vaccines. AIK is also on the following boards: Board of Directors, American Society of Tropical Medicine and Hygiene; Executive Board Member (2009-present), International Leptospirosis Society; Member, Inaugural Expert Panel, Reckitt Global Hygiene Institute; Steering Committee Member, Global Leptospirosis Environmental Action Network (GLEAN), WHO. The author has no other competing interests to declare
    ORCID: 0000-0001-9023-2339
  17. Emanuele Giorgi

    Centre for Health Informatics, Computing, and Statistics, Lancaster University Medical School, Lancaster, United Kingdom
    Contribution
    Conceptualization, Software, Supervision, Methodology, Writing – review and editing
    Contributed equally with
    Federico Costa
    Competing interests
    No competing interests declared
    ORCID: 0000-0003-0640-181X
  18. Federico Costa

    1. Centre for Health Informatics, Computing, and Statistics, Lancaster University Medical School, Lancaster, United Kingdom
    2. Institute of Collective Health, Federal University of Bahia, Salvador, Brazil
    3. Oswaldo Cruz Foundation, Brazilian Ministry of Health, Salvador, Brazil
    4. Department of Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, United States
    Contribution
    Conceptualization, Resources, Supervision, Funding acquisition, Investigation, Project administration, Writing – review and editing
    Contributed equally with
    Emanuele Giorgi
    Competing interests
    No competing interests declared

Funding

National Institutes of Health (F31 AI114245)

  • Albert I Ko

National Institutes of Health (R01 AI052473)

  • Albert I Ko

National Institutes of Health (U01 AI088752)

  • Albert I Ko

National Institutes of Health (R01 TW009504)

  • Albert I Ko

National Institutes of Health (R25 TW009338)

  • Albert I Ko

Medical Research Council (964635)

  • Max T Eyre

Wellcome Trust (102330/Z/13/Z)

  • Nivison Nery
  • Federico Costa

Fundação Oswaldo Cruz

  • Federico Costa

Fundação de Amparo à Pesquisa do Estado da Bahia (FAPESB/JCB0020/2016)

  • Fábio N Souza
  • Federico Costa

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication. For the purpose of Open Access, the authors have applied a CC BY public copyright license to any Author Accepted Manuscript version arising from this submission.

Acknowledgements

We thank the residents and community leaders of the Pau da Lima community for their support and participation in this study. This work was supported by the Oswaldo Cruz Foundation and Secretariat of Health Surveillance, Brazilian Ministry of Health, the National Institutes of Health of the United States, the Wellcome Trust and the Fundação de Amparo à Pesquisa do Estado da Bahia. MTE was supported by a UK Research and Innovation (UKRI) doctoral studentship. FNS was supported by a FAPESB doctoral scholarship.

Ethics

Human subjects: Participants were enrolled according to written informed consent procedures approved by the Institutional Review Boards of the Oswaldo Cruz Foundation and Brazilian National Commission for Ethics in Research, Brazilian Ministry of Health (CAAE: 01877912.8.0000.0040) and Yale University School of Public Health (HIC 1006006956).

For the rats captured in the rat ecology study, the ethics committee for the use of animals from the Oswaldo Cruz Foundation, Salvador, Brazil, approved the protocols used (protocol number 003/2012), which adhered to the guidelines of the American Society of Mammalogists for the use of wild mammals in research (Sikes and Gannon, 2011) and the guidelines of the American Veterinary Medical Association for the euthanasia of animals (Leary et al., 2013). These protocols were also approved by Yale University's Institutional Animal Care and Use Committee (IACUC), New Haven, Connecticut (protocol number 2012-11498).

Senior Editor

  1. Miles P Davenport, University of New South Wales, Australia

Reviewing Editor

  1. Niel Hens, Hasselt University, Belgium

Reviewer

  1. Benny Borremans, University of California, Los Angeles, United States

Publication history

  1. Received: August 17, 2021
  2. Preprint posted: September 22, 2021
  3. Accepted: September 14, 2022
  4. Accepted Manuscript published: September 16, 2022 (version 1)
  5. Accepted Manuscript updated: September 26, 2022 (version 2)
  6. Version of Record published: October 13, 2022 (version 3)
  7. Version of Record updated: October 18, 2022 (version 4)

Copyright

© 2022, Eyre et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

  1. Max T Eyre
  2. Fábio N Souza
  3. Ticiana SA Carvalho-Pereira
  4. Nivison Nery
  5. Daiana de Oliveira
  6. Jaqueline S Cruz
  7. Gielson A Sacramento
  8. Hussein Khalil
  9. Elsio A Wunder
  10. Kathryn P Hacker
  11. José E Hagan
  12. James E Childs
  13. Mitermayer G Reis
  14. Mike Begon
  15. Peter J Diggle
  16. Albert I Ko
  17. Emanuele Giorgi
  18. Federico Costa
(2022)
Linking rattiness, geography and environmental degradation to spillover Leptospira infections in marginalised urban settings: An eco-epidemiological community-based cohort study in Brazil
eLife 11:e73120.
https://doi.org/10.7554/eLife.73120
