Introduction

GPS loggers are an increasingly used tool for capturing both human and animal movements.1,2 These small devices can be worn by individuals and record locations at regular preset time intervals.

Compared to other methods of collecting human movements, such as cell tower traffic or Google Location History, which are suited for analysing large-scale mobility,3,4 these devices can capture very fine-scale movements. These data are crucial in quantifying exposure within complex environments, where terrain can change rapidly. Furthermore, movements recorded by GPS loggers can be assigned to specific individuals. This allows linkage between individual socio-demographic factors and the data collected, which is especially convenient when performing epidemiological analyses. Other methods for measuring human mobility are inherently anonymous and do not allow this connection to be made. An important challenge when using GPS loggers is that they rely on individual compliance for carrying the device at all times, an issue which is overcome by the other methods mentioned above.

The analysis of human telemetry data is an emerging field of research in epidemiology. Whilst previous methods have advanced this area of research, improvements could be made. For example, the methods used by Owers et al.1 to assess the relationship between urban slum residents’ movements and the risk of leptospirosis infection were able to analyse differences between genders, but did not consider other important socio-demographic factors. In another study, Fornace et al.2 used GPS loggers to assess human exposure to mosquito vectors of Plasmodium knowlesi malaria and environmental factors associated with this. Various individual-level factors were included in the analyses performed in this paper, questioning how these could affect participants’ movements. However, by not including comparisons of possible choices an individual could have made, this study could not determine how the environment may have influenced movement.

Leptospirosis is a zoonotic bacterial infectious disease with strong environmental drivers. It has been estimated to cause over 1 million human cases worldwide each year, leading to 58 900 deaths.5 Rats are the main reservoir of the disease, shedding bacteria in their urine.6 Human infection is associated with exposure to contaminated waters and soils.6–8 Evidence shows that in urban slum settings, men have a higher infection risk than women.9 This has been attributed to differences in behaviours, especially in how individuals move through their communities, rather than to biological differences. Indeed, there is evidence that men tend to visit much larger areas during their daily journeys than women.1

Exactly where people are most exposed to high leptospirosis contamination, and therefore where infection is most likely to occur, has not been investigated. Previous studies have focused on the assessment of the peri-domiciliary environment and its associations with infection risk.7–9 However, these analyses assume people are mostly exposed to infection risk in this area and ignore the exposure that individuals may incur when they move further away from their households throughout their daily activities. Furthermore, people’s movement patterns may differ depending on individual socio-demographic factors, which could in turn affect their risk of exposure. If individuals traverse highly contaminated areas where the risk of exposure is heightened, this can lead to an increased risk of infection. This is particularly important in environmentally heterogeneous areas, such as urban slums, where the landscape can change drastically in small spaces. Technological advances now allow us to record and analyse fine-scale movements to understand how these may affect infection risk.

In this paper, we developed a modelling framework to understand how telemetry data can be used to identify and quantify determinants of human movements, adapting methods from animal movement ecology. We present a novel method for analysing telemetry data to estimate environmental selection as individuals move through their urban communities. This method is applied in a low-income urban setting in Salvador, Brazil, and is used to examine how individuals interact with various key points in their surrounding environment. Furthermore, we analyse if there are any differences in movements inside the study areas between genders, ages and leptospirosis serological status. This method of analysis overcomes limitations from other studies by, firstly, specifically modelling choice of movement in relation to environmental factors and, secondly, incorporating multiple socio-demographic factors which allows regression relationships to be jointly adjusted for these.

Methods

Study areas

This study was nested in a prospective cohort study taking place across Salvador, Brazil.10 Salvador is the third largest city in Brazil, located in the north-eastern region of the country and has a tropical climate. The study areas are considered urban slums (locally called ‘favelas’).

They were selected for a number of reasons: firstly, they all have similar demographic and socio-economic factors within their populations; secondly, they all have a stream running through the centre of the community, which is considered contaminated; and thirdly, there is a high burden of leptospirosis in these populations.

All four study areas are small, with an approximate size of 0.03 km². They are located across the outskirts of Salvador (Figure 1). The communities have very heterogeneous environments, with rapid changes in both land cover and slope. Buildings in these communities have been built with limited or no urban planning. They can be of varying quality, ranging from gated areas with multiple dwellings protected from rain and flooding to single brick buildings with informal entryways.

Map showing location of each study area in Salvador.

Each area includes symbology for stream (blue line), open sewer points (purple diamond), and domestic rubbish piles (orange triangle).

Individual characteristics

The eligibility criteria for inclusion in the study were: individuals who (1) had been living at one of the study areas for at least 6 months, (2) slept there at least 3 nights a week, (3) were at least 18 years old and (4) gave written consent.10 Participants were asked to answer a baseline survey which collected their demographic, social and economic characteristics, including age and gender. A blood sample was taken from each participant to determine serological evidence for Leptospira infection using the microscopic agglutination test (MAT), the standard test used for leptospirosis diagnosis.6 In this analysis, a MAT showing antibodies with a titre >1:50 against any Leptospira serovar was considered a positive result. Further details about the laboratory work carried out are available in the supplementary material. The location of each participant’s household was recorded and georeferenced by the research team.

Participants who were already enrolled in the cohort study were recruited to take part in the movement analysis study. A target of 30 people per study area, balanced by gender, was chosen for this study.

GPS Data

Individuals who consented to take part in this study were asked to wear GPS loggers for continuous periods of up to 48 hours, which could be repeated. The GPS loggers used were i-got U GT-600, set to record their location every 35 seconds. Data were collected between March and November 2022.

Once the GPS telemetry data were collected, participants’ recorded locations were cleaned so as to retain only relocations within the study area boundaries that were recorded between 5 am and 9 pm, which generally corresponds to an individual’s active hours. Interactions with environmental factors outside of the study area boundaries could not be considered in the analysis because high-resolution environmental data outside of the study areas were not available. Individuals with fewer than 50 relocations within the study area were excluded from the analysis.
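The cleaning rules described above can be illustrated with a minimal Python sketch. This is not the authors' code (the analysis was carried out in R); the function names, the polygon representation of the boundary and the choice to treat 9 pm as an exclusive upper limit are all assumptions made for illustration.

```python
from datetime import datetime, time

ACTIVE_START = time(5, 0)    # 5 am
ACTIVE_END = time(21, 0)     # 9 pm, treated here as exclusive (assumption)
MIN_RELOCATIONS = 50         # exclusion threshold stated in the text


def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def clean_fixes(fixes, boundary):
    """Keep fixes recorded in active hours and inside the study boundary."""
    return [
        (t, x, y)
        for (t, x, y) in fixes
        if ACTIVE_START <= t.time() < ACTIVE_END
        and point_in_polygon(x, y, boundary)
    ]


def has_enough_relocations(cleaned):
    """Apply the minimum-relocation rule to one individual's cleaned fixes."""
    return len(cleaned) >= MIN_RELOCATIONS


# Hypothetical square boundary and three fixes (projected coordinates, metres).
boundary = [(0, 0), (100, 0), (100, 100), (0, 100)]
fixes = [
    (datetime(2022, 3, 1, 6, 0), 10.0, 10.0),    # kept
    (datetime(2022, 3, 1, 23, 0), 10.0, 10.0),   # outside active hours
    (datetime(2022, 3, 1, 12, 0), 150.0, 10.0),  # outside the boundary
]
cleaned = clean_fixes(fixes, boundary)
```

In this toy example only the first fix survives, so the individual would be excluded under the 50-relocation rule.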

Environmental Data

This analysis focused on three environmental factors: community stream, open sewers and domestic rubbish piles. The last of these represents areas where rats are more likely to be found, whilst the other two factors represent risks of having close contact with Leptospira-contaminated muds or waters. The locations of these different points of interest in the study area were mapped by trained research teams. These environmental factors were included in analyses in two ways: using distance rasters and buffer rasters. A 1 meter resolution raster was created for each environmental factor by calculating the nearest distance for each pixel to the reference points. The buffer rasters, one for each factor, were created using a 20 meter buffer around each reference point. All pixels within this buffer were assigned a value of 1, whilst those outside were given a value of 0. Buffers were used to understand the effect of the immediate vicinity of each reference point on movement behaviours. Buffer rasters were also created for each individual’s household location, with a 10 meter buffer around each location. This represented space within and immediately outside each house.
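As an illustration of the two raster types, the following Python sketch builds a distance raster and derives a binary buffer raster from it. It is a simplified stand-in rather than the authors' workflow: the grid is a plain list of lists, distances are measured from pixel centres, and pixels exactly at the buffer radius are counted as inside, all of which are assumptions.

```python
import math


def distance_raster(width, height, points, cell_size=1.0):
    """Distance from each pixel centre to the nearest reference point."""
    raster = []
    for row in range(height):
        cy = (row + 0.5) * cell_size
        raster.append([
            min(math.hypot((col + 0.5) * cell_size - px, cy - py)
                for px, py in points)
            for col in range(width)
        ])
    return raster


def buffer_raster(dist, radius):
    """1 for pixels within `radius` of a reference point, 0 otherwise."""
    return [[1 if d <= radius else 0 for d in row] for row in dist]


# Hypothetical single sewer point on a 50 m x 1 m strip at 1 m resolution.
sewer_points = [(0.5, 0.5)]
dist = distance_raster(width=50, height=1, points=sewer_points)
sewer_buffer = buffer_raster(dist, radius=20.0)   # 20 m environmental buffer
house_buffer = buffer_raster(dist, radius=10.0)   # 10 m household-style buffer
```

Deriving the buffer from the distance raster keeps the two representations consistent with each other, although they enter separate models in the analysis.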

Movement Analysis

The analysis was performed in two phases (Figure 2). Firstly, each individual’s data was analysed alongside the environmental factors. This phase created a set of parameters—called selection coefficients—for each individual. These selection coefficients were specific to each of the environmental factors. In the second phase, the selection coefficient for a particular environmental factor was analysed across the study population. This phase incorporated the individual characteristics for each participant: gender, age and Leptospira serological status.

Schematic diagram showing what data sources are used in which model, and how models are linked with each other.

The blue sections represent phase one, the individual-level models, whilst the orange section represents phase two, the population-level model.

These phases are detailed below. All analyses were carried out in R, version 4.2.1,11 using tools from tidyverse.12 Specific movement analyses were carried out using package amt.13

Phase 1: Individual-level model

Drawing from current methodological developments in animal movement ecology, we used step-selection functions to characterise individuals’ movement behaviours in relation to the environmental factors described above. Step-selection functions are a type of movement analysis method that falls under the Resource Selection umbrella. They can also be classified as spatio-temporal point process models.14 In these models, an individual’s location at time point i (μi) is conditioned on the previous location it was in (μi−1), the selection coefficients of the environment (β) and the available space the individual could have travelled to (θ).

Step-selection functions have two important components: the availability function (f(…)) and the selection function (g(…)). The availability function defines the space within which an individual could move, given a set of space and time constraints. The selection function specifies how the individual responds to the environmental factors that are close to them when choosing their path, creating a set of selection coefficients for each factor (or resource) included in the model. These selection coefficients are specific to a given individual. This latter component is the focus of our analysis, whilst the former availability function was pre-defined using the empirical data.
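For reference, the two components are commonly combined in the movement ecology literature as a weighted distribution. The expression below is a sketch of that general form, not necessarily the exact formulation used here; the notation x(μ), for the environmental covariate values at location μ, is an assumption introduced for illustration.

```latex
\[
[\mu_i \mid \mu_{i-1}, \beta, \theta]
  = \frac{g\big(x(\mu_i), \beta\big)\, f(\mu_i \mid \mu_{i-1}, \theta)}
         {\int g\big(x(\mu), \beta\big)\, f(\mu \mid \mu_{i-1}, \theta)\, d\mu}
\]
```

The denominator normalises over every location the individual could have reached from μi−1. In practice the integral is approximated by sampling a finite set of available steps.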

The availability function was fitted separately to each recorded location. The step lengths and turning angles between consecutive steps were used to parametrise movement characteristics for an individual (Figure 3A). Using these characteristics, a group of available steps (Figure 3B, grey dots) was created for each used step (Figure 3B, black dots). These represented locations that were consistent with human movements that an individual could have travelled to but chose not to. A total of 100 available steps were created for each used step (Figure 3B).
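The generation of available steps can be sketched in Python as follows. The text does not state which distributions were used to parametrise step lengths and turning angles, so this sketch assumes a gamma distribution for step lengths and a von Mises distribution for turning angles; the function name and all parameter values are hypothetical.

```python
import math
import random


def sample_available_steps(prev, start, n_steps, shape, scale, kappa, rng):
    """Sample end points of available steps that begin at `start`.

    `prev` is the location before `start` and defines the current heading.
    Step lengths ~ gamma(shape, scale); turning angles ~ von Mises(0, kappa).
    Both distributional choices are assumptions made for this sketch.
    """
    heading = math.atan2(start[1] - prev[1], start[0] - prev[0])
    steps = []
    for _ in range(n_steps):
        sl = rng.gammavariate(shape, scale)
        # Returned in [0, 2*pi); a mean of 0 favours continuing straight on.
        ta = rng.vonmisesvariate(0.0, kappa)
        bearing = heading + ta
        end_x = start[0] + sl * math.cos(bearing)
        end_y = start[1] + sl * math.sin(bearing)
        steps.append({"x": end_x, "y": end_y, "sl": sl, "cos_ta": math.cos(ta)})
    return steps


rng = random.Random(42)  # fixed seed so the sketch is reproducible
available = sample_available_steps(
    prev=(0.0, 0.0), start=(1.0, 0.0),
    n_steps=100,            # 100 available steps per used step, as in the text
    shape=2.0, scale=5.0,   # hypothetical gamma parameters (metres)
    kappa=1.0,              # hypothetical von Mises concentration
    rng=rng,
)
```

Each sampled step carries its own step length and turning-angle cosine, which is what allows these movement characteristics to enter the selection model as covariates.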

Descriptive diagram of step-selection functions.

A: step lengths (sl) and turning angles (θ) are used to characterise an individual’s movements. B: these parameters are used to create a set of available steps (grey dots) for every used step (black dots).

Each individual’s telemetry data was analysed by time periods within daytime active hours. These were periods of 4 hours, representing morning (05:00 – 09:00), midday (09:00 – 13:00), afternoon (13:00 – 17:00) and evening (17:00 – 21:00) activities. Movements across the whole daytime period were also analysed (05:00 – 21:00). This analysis was performed to examine the effects of circular journeys, when people travel to and back from a same place using a very similar route. By looking at specific time periods, we hoped to capture one-way journeys.
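A minimal Python sketch of assigning a relocation to one of these periods is given below. The text does not say how boundary times such as 09:00 were handled, so the half-open intervals used here (start included, end excluded) are an assumption, as are the names.

```python
from datetime import datetime

# (label, start hour inclusive, end hour exclusive); boundary handling is assumed
PERIODS = [
    ("morning", 5, 9),
    ("midday", 9, 13),
    ("afternoon", 13, 17),
    ("evening", 17, 21),
]


def time_period(timestamp):
    """Return the period label for a timestamp, or None outside 05:00-21:00."""
    for label, start_hour, end_hour in PERIODS:
        if start_hour <= timestamp.hour < end_hour:
            return label
    return None
```

For example, `time_period(datetime(2022, 5, 1, 8, 59))` falls in the morning period, whereas a fix at exactly 09:00 is assigned to midday under this convention.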

A conditional logistic regression was used to estimate the selection coefficients for each of the environmental variables for a given individual. A separate model was used for each time period.

The model estimated the probability (p) of a step being used rather than being available and unused, on the logit scale (logit(p)). The first three variables included in the model (x1–x3) represented the different environmental factors (central stream, open sewers and domestic rubbish piles), with corresponding selection coefficients (β1–β3). Distance rasters and buffer rasters were included in separate models. The household buffer rasters were included in the next variable (x4). The following three variables (x5–x7) represented the movement characteristics of the individual: the step length (sl), the natural logarithm of the step length (log(sl)) and the cosine of the turning angle (cos(θ)). These are the same movement characteristics used to create the set of available steps. The final variable included (x8) was the hour within which each step was recorded. The model was stratified by each used step (αstratumj), where j represents each used step and its associated available steps. This model estimates a selection coefficient for each of the environmental factors of interest, conditioned on all other environmental factors, the individual’s household location, the individual’s movement characteristics and the hour of the day. These selection coefficients can be interpreted as the likelihood of moving into a specific environmental condition whilst keeping other environmental factors, movement characteristics and hour of the day constant. For distance rasters, the selection coefficient represents the odds of moving further away from the reference point. For buffer rasters, it represents the odds of moving inside the 20 meter buffer of each reference point.
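Written out from the variable definitions above, the individual-level model could take the following form. This is a reconstruction under assumptions: the coefficient labels β4 to β8 and the treatment of hour as a single term x8 are not confirmed by the text.

```latex
\[
\operatorname{logit}(p)
  = \alpha_{\mathrm{stratum}_j}
  + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3
  + \beta_4 x_4
  + \beta_5\, sl + \beta_6 \log(sl) + \beta_7 \cos(\theta)
  + \beta_8 x_8
\]
```

In a conditional logistic regression the stratum intercepts cancel out of the likelihood, so that within stratum j the probability of the used step (indexed 0) among its 100 available alternatives depends only on the covariates:

```latex
\[
\Pr(\text{used step chosen in stratum } j)
  = \frac{\exp\big(\beta^{\top} x_{j0}\big)}
         {\sum_{l=0}^{100} \exp\big(\beta^{\top} x_{jl}\big)}
\]
```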

Phase 2: Population-level model

To assess movement differences between individual characteristics, a population-level linear regression model was used. Separate models were created for each of the three environmental factors, using their corresponding selection coefficients as the outcome, and for each time period (whole daytime period, morning, midday, afternoon and evening). We used two main groups of models: (1) those assessing differences between genders and ages, which were conditioned on both of these variables and the study area; (2) those assessing differences between Leptospira antibody statuses, which were conditioned on gender, age and study area. The shared equation for each of the models is defined as follows:

In these models, the outcome was the estimated selection coefficient for each environmental factor (k). The first two variables, x1 and x2, represented gender (taking values 0 for male and 1 for female), and age, used as a continuous variable. The third variable, x3, represented Leptospira antibody status, as a binary variable taking values 1 for a positive test and 0 otherwise. As mentioned previously, a positive result was defined as a positive MAT result for any Leptospira serovar. The final variable in the model, x4, represented the study area, included to adjust for any unmeasured differences between study areas. The error term, Zk, captured the residuals from the model, which also accounted for any variation between individuals which was not measured as well as the sampling error inherent to the estimates of the selection coefficients. To account for variation in the standard errors of the selection coefficients, the variance of Zk was defined as wk², where wk is the estimated variance of the estimated selection coefficient, which was used to account for the heterogeneity in the estimate of the selection coefficient.
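Based on the variable definitions above, one plausible form of the shared population-level equation is sketched below. The coefficient symbols γ0 to γ4 are assumed (γ follows the notation used in the table captions), and the variance expression simply restates the definition given in the text.

```latex
\[
\hat{\beta}_k = \gamma_0 + \gamma_1 x_1 + \gamma_2 x_2 + \gamma_3 x_3 + \gamma_4 x_4 + Z_k,
\qquad
\operatorname{Var}(Z_k) = w_k^{2}
\]
```

Under this reading, the models in group (1) would omit the antibody status term γ3 x3, and estimates with larger uncertainty would contribute less to the fitted regression.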

Results

Descriptive statistics

A total of 130 individuals consented to take part in this movement study. Of these, 6 individuals were removed from further analysis due to not having sufficient relocations within the study area boundaries. The remaining 124 individuals represented 11.4% of the sample population from the parent study (n = 1086). Of the participants in the movement study, 57 (46.0%) were female and their ages ranged from 18 to 83, with a median age of 38 and mean age of 39.5 (sd = 15.5). There were 12 individuals (9.7%) who tested positive for Leptospira antibodies. Although these proportions were very similar to those present in the larger sample population from the parent study, the individuals in the movement analysis skewed female and older (Table 1).

Summary table comparing parent study participants and movement study participants

The majority of individuals spent most of their recorded time during their active daytime hours within their study area boundaries. The percentage of recorded time spent within the study area boundaries ranged from 4% to 100%. The mean percentage was 80%, with a median of 91% and a standard deviation of 25%. Females spent less time within the boundaries than males (females: mean = 76%, sd = 28%; males: mean = 83%, sd = 22%). Individuals who had antibodies against Leptospira spent a similar proportion of time within the study area boundaries to individuals with no antibodies (positive: mean = 83%, sd = 26%; negative: mean = 80%, sd = 25%).

The maximum values for the different environmental distance rasters varied across the four study areas. The maximum distance to open sewers was lowest in study area 3 and highest in study area 2 (1: 199 m; 2: 235 m; 3: 80 m; 4: 208 m). Similarly, the maximum distance to domestic rubbish piles was lowest in study area 3 and highest in study area 2 (1: 214 m; 2: 363 m; 3: 153 m; 4: 247 m). These differences are attributed to the number of open sewer points and domestic rubbish piles within each study area. The maximum distance to the central stream was highest in study area 1 and lowest in study area 3 (1: 217 m; 2: 209 m; 3: 94 m; 4: 172 m). More detailed descriptive statistics are available in Supplementary Material I.

Movement analysis

The results from the movement analysis are presented on the odds scale. A value above 1 represents higher odds of moving towards an increasing value for each raster. As described previously, for distance rasters this is interpreted as moving further away from the point of reference (Table 2), whilst for buffer rasters this is interpreted as moving into the 20 meter buffer area for each point of reference (Table 3).
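Assuming the reported odds ratios were obtained by exponentiating regression estimates and Wald-type 95% confidence limits (the text does not state the interval method), the conversion can be sketched in Python as follows. The function name and the input values are hypothetical and are not taken from the study.

```python
import math


def to_odds_scale(estimate, std_error, z=1.96):
    """Exponentiate an estimate and its Wald interval onto the odds scale."""
    lower = estimate - z * std_error
    upper = estimate + z * std_error
    return math.exp(estimate), math.exp(lower), math.exp(upper)


# Hypothetical estimate and standard error, for illustration only.
odds_ratio, ci_lower, ci_upper = to_odds_scale(-0.05, 0.02)
print(f"OR: {odds_ratio:.2f}; 95% CI: {ci_lower:.2f}, {ci_upper:.2f}")
```

An estimate of zero maps to an odds ratio of exactly 1, which is why 1 rather than 0 is the reference value when reading the tables.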

Estimated differences (γ) in selection coefficients (β) for each environmental factor using distance-based rasters. Values >1 represent increasing distance from points of reference.

Estimated differences (γ) in selection coefficients (β) for each environmental factor using 20 meter buffers around each point of reference. Values >1 represent movement within the buffer zone for each point of reference.

We found no differences in how individuals moved in regards to the distance to the central stream by age (OR: 1.00; 95% CI: 1.00, 1.00; p = 0.697) or Leptospira antibody status (OR: 0.99; 95% CI: 0.96, 1.01; p = 0.273). Similarly, movements relative to the 20 meter buffer for the central stream were the same across ages (OR: 1.00; 95% CI: 1.00, 1.01; p = 0.280) and across Leptospira serological status (OR: 0.89; 95% CI: 0.67, 1.19; p = 0.433). There was evidence that women moved closer to the stream than men, even after accounting for the effects of age, study area and the location of their households (OR: 0.98; 95% CI: 0.97, 0.99; p = 0.003). This effect was more pronounced in the analysis of the 20 meter buffered area (OR: 1.22; 95% CI: 1.02, 1.46; p = 0.026).

As with the above, there was no evidence of different movement behaviours relative to distance to open sewers by age (OR: 1.00; 95% CI: 1.00, 1.00; p = 0.572) or Leptospira antibody status (OR: 1.03; 95% CI: 1.00, 1.07; p = 0.054). Women were found to move further away from open sewers compared to men (OR: 1.04; 95% CI: 1.02, 1.06; p < 0.001). When analysing movements relative to the 20 meter buffer around open sewers, we found no evidence of differences between genders (OR: 0.95; 95% CI: 0.80, 1.14; p = 0.580). We found evidence of a small tendency to move outside of the 20 meter buffer around open sewers as people aged, although the effect could be considered negligible (OR: 0.99; 95% CI: 0.98, 1.00; p = 0.003). We also found evidence of a strong inclination for people with Leptospira antibodies to move outside of the buffers around open sewers, compared to people with no antibodies (OR: 0.64; 95% CI: 0.47, 0.87; p = 0.005).

Our analysis showed no evidence of different movement behaviours relative to the distance to rubbish piles across genders (OR: 0.99; 95% CI: 0.98, 1.01; p = 0.280), ages (OR: 1.00; 95% CI: 1.00, 1.00; p = 0.466) or Leptospira antibody statuses (OR: 1.00; 95% CI: 0.98, 1.02; p = 0.760). We also found no evidence when analysing movements relative to the 20 meter buffer around rubbish piles across genders (OR: 0.92; 95% CI: 0.66, 1.27; p = 0.600), ages (OR: 1.00; 95% CI: 0.99, 1.01; p = 0.989) or Leptospira antibody statuses (OR: 0.80; 95% CI: 0.44, 1.49; p = 0.482).

Analysis by time periods

Movements were subdivided into four time periods: morning (5 am – 9 am), midday (9 am – 1 pm), afternoon (1 pm – 5 pm) and evening (5 pm – 9 pm). The interactions with the environmental factors were similar to those reported for whole day activities, although there were some key differences (Figure 4).

Graph showing results of final analyses: A) results for distance based rasters, values above 1 interpreted as increasing distance to points of reference; B) results for 20 meter buffer based rasters, values above 1 show movement within buffer zones.

Each horizontal band represents a specific time period (right hand side y-axis label): all day (5 am – 9 pm, Tables 2 and 3), morning (5 am – 9 am), midday (9 am – 1 pm), afternoon (1 pm – 5 pm) and evening (5 pm – 9 pm). All data points include their corresponding 95% confidence intervals, some of which are too narrow to show up clearly.

We found no differences in movements relative to the central stream as people aged or between Leptospira antibody status across the four periods. Women still moved closer to the central stream than men across all periods. We also saw that women had a higher tendency to move within the 20 meter buffer for the stream compared to men across all periods.

Movement in relation to distance to open sewer points and their respective 20 meter buffers showed no difference across all four periods. The strength of the selection effect seen in seropositive individuals for moving outside of the 20 meter buffer varied, with stronger effects seen in the morning and evening periods.

Domestic rubbish piles did not appear to have an effect on movement differences between ages or Leptospira antibody status across all periods. We found women moved outside of the 20 meter buffer zone more than men during the morning period only. Otherwise, no notable differences were seen.

Discussion

Our study aimed to apply a novel methodology to the area of human mobility analysis in infectious disease epidemiology, focusing on leptospirosis in four urban slums in Salvador, Brazil. We assessed movements in relation to central streams, open sewer points, and domestic rubbish piles and observed changes throughout the day using step-selection functions, a modelling approach which we adapted from animal movement ecology. Our findings showed that step-selection functions could be an effective method to identify movement behaviours. To understand how the results could be described in the context of infectious disease epidemiology, we have explained our interpretation of the findings, including strengths and limitations. However, it is important to highlight that, given this is a novel methodology, the evidence we present is not conclusive and further research is required.

The results suggested no movement differences between Leptospira antibody statuses or ages concerning the distances to stream, open sewer points, or domestic rubbish piles. Our findings consistently showed that women tended to move closer to the central stream and farther from open sewer points than men, adjusted for age and study area. We also found that women had a tendency to move within the 20 meter buffer of the central streams compared to men, and that seropositive individuals were more inclined to move outside of the buffer zone for open sewers compared to seronegative individuals. Movement patterns did not vary significantly throughout the day. Previous research indicates that men in similar communities perceive themselves as less vulnerable to leptospirosis compared to women.15 Additionally, a knowledge, attitudes and practices analysis showed that men have lower scores for both knowledge and attitudes towards leptospirosis and its associated risks.16 Our findings align with these studies, suggesting that women may avoid open sewers due to perceived risks, while men may not share these perceptions. Social areas, which may have gender differences, also contribute to different movement behaviours. One might conclude that the stream is used for gendered chores such as washing clothes. However, this is not the case in our study areas. Following discussion with residents, we know that they perceive the stream as highly contaminated and avoid using its waters for cleaning or other household chores.

Our results contrast with those reported by Owers et al.,1 who found no differences in space use between genders after using GPS loggers to analyse individuals’ movements. This discrepancy could be explained by the differences in length of time being analysed. Owers et al. were only able to analyse data collected over 24 hour periods, whereas our analysis was longer and included data collected over periods of up to 48 hours, which could be repeated. The contrasting results could also be attributed to the different populations studied. Although overall these populations resided in very similar communities in Salvador, they could have different characteristics that affect movement behaviours.

Our findings regarding the interactions with rubbish piles may be explained by various reasons. There is evidence that proximity to rubbish piles does not drive Leptospira seropositivity in similar areas to those used in our analysis.15 Whilst this proximity does increase rat sightings, this reduced effect on infection risk could lead individuals to disregard the locations of rubbish piles when choosing their travel paths. Another possible explanation is that there may be an unmeasured environmental variable that is interacting with the distance to rubbish piles, which needs further investigation. For example, violence could be interacting with where rubbish accumulates. We discuss violence further in a paragraph below.

The evidence showing Leptospira-positive individuals avoiding open sewers was surprising. Although we were expecting to see an effect in the opposite direction, showing individuals with Leptospira antibodies interacting closely with open sewers, there are a few possible explanations for our findings. If individuals with antibodies are also actively infected, they could be symptomatic and therefore alter their behaviour to avoid high risk areas. Alternatively, individuals with antibodies could be more aware of risks due to previous infections and display more protective behaviours than people who have not had any previous infections.

During informal conversations with community residents, it became clear that violence plays a key role in individuals’ decisions on where they go. Violence in these communities is perceived as hyper-local, restricted to one corner or small square within the communities. It is unclear what drives this perception, but nevertheless it is an important factor that could be accounted for. Further research is required to develop methods that can capture these perceptions in spatial formats that could be incorporated into similar movement studies.

Age did not affect movement choices, suggesting consistent perceptions of environmental risks or stable use of urban spaces across ages.

We expected different movement patterns at various times of day, anticipating circular journeys (an individual going somewhere and back again on the same route). However, our results showed consistent movement patterns, possibly due to the analysis period’s length or other unmeasured factors modulating movements. Our results could also indicate that strictly circular journeys through these communities, where individuals travel along the exact same path in both directions, are not common, and that movement interactions with urban surroundings do not vary throughout the day.

To our knowledge, this is the first study that uses step-selection functions to model movement behaviours in the context of human infectious disease epidemiology. This method has provided quantitative evidence that there may be differences in how men and women move through their communities, strengthening the argument that the variation in leptospirosis exposure and infection risk between genders is due to behavioural differences rather than physiological differences. Additionally, we show that individuals consider environmental features differently when moving through their communities. Highlighting the effects of these variables on movement would not have been possible with the approaches previously used to model human movement. Our approach provides a better understanding of how individuals relate to their surrounding urban environment and how they interact with features that could increase risk of leptospirosis.

Several important limitations must be highlighted. This study involves a relatively small sample of a larger population, slightly skewed towards older women compared to the parent study. There are few individuals testing positive for Leptospira antibodies. As a result, the findings are biased towards the more represented individuals, limiting their generalisability. Further research is needed to develop appropriate study designs using these methods, including how many individuals should be recruited. The small number of Leptospira-positive individuals also makes the estimation of the effect of this characteristic more difficult. We would also like to restate that a positive antibody response to any Leptospira serovar does not indicate active infection. A positive result merely indicates that the individual has been infected at some stage, either symptomatically or asymptomatically, and has produced an immune response. Information on the timing of the infection could instead be a variable showing a stronger association with movement. Another important limitation is that we did not collect data on behaviours. If data on risky or protective behaviours, such as the use of closed footwear, had been available at the appropriate temporal resolution (e.g. hourly intervals), these could have been included in the step-selection functions and could have shown significant associations. Although these are important limitations which require cautious interpretation of results, they do not detract from the value of exploring this novel methodology in this context. This methodology also provides a crucial starting point for exploring how movement characteristics can differ between individuals in these environments.

Step-selection functions also have limitations that must be considered. While these methods can model the choice of moving in a specific direction, they do not account for the individual’s initial distance from an environmental feature. For instance, an individual moving towards the central stream from far away will have a high selection coefficient for this environmental factor, but the coefficient does not indicate their starting distance. This is important because environmental risk factors cease to pose a risk beyond a certain distance. We addressed this limitation by using buffer zones around specific points of interest, but it must be kept in mind to interpret all results correctly. Similarly, step-selection functions do not quantify how long an individual spent within this high-risk distance.
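As an illustration only, and not the code used in this study, the sketch below shows one way a buffer zone could be turned into a binary covariate, and how time spent inside the buffer could be approximated from the raw GPS fixes as a complement to the selection coefficients. The function names, the planar coordinates in metres and the rule of crediting each interval to its starting fix are all assumptions made for this example.

```python
import math


def in_buffer(point, features, radius_m):
    """Return True if point lies within radius_m of any feature location.

    Coordinates are assumed to be projected (x, y) values in metres.
    """
    px, py = point
    return any(math.hypot(px - fx, py - fy) <= radius_m for fx, fy in features)


def time_in_buffer(fixes, features, radius_m):
    """Approximate the seconds spent inside the buffer.

    fixes -- list of (timestamp_s, x, y) tuples sorted by time.
    Each interval between consecutive fixes is counted as inside the
    buffer when its starting fix is inside (a simplifying assumption).
    """
    total = 0.0
    for (t0, x0, y0), (t1, _, _) in zip(fixes, fixes[1:]):
        if in_buffer((x0, y0), features, radius_m):
            total += t1 - t0
    return total
```

The buffer radius would need to be chosen for each environmental feature; no particular value is implied here.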

Despite these limitations, this study has several valuable strengths. By including steps an individual could have taken but did not (i.e., available steps, grey dots in Figure 3B), the models allow us to estimate choice. Additionally, the models use each individual’s movement characteristics to create these available steps, resulting in a realistic representation of movement behaviours. This creates more realistic estimates of environmental interactions than those created using existing methods.
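The generation of available steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this study: it assumes step lengths follow a gamma distribution and turning angles a von Mises distribution, with parameters fitted to each individual's observed movement. The function name and all parameter values are hypothetical.

```python
import math
import random


def sample_available_steps(start, heading, n_steps, shape, scale, kappa, rng):
    """Draw hypothetical end points for one observed step.

    start        -- (x, y) origin of the step, in projected metres (assumed)
    heading      -- bearing of the previous step, in radians
    shape, scale -- gamma parameters for this individual's step lengths
    kappa        -- von Mises concentration for their turning angles
    rng          -- a random.Random instance, for reproducibility
    """
    x0, y0 = start
    steps = []
    for _ in range(n_steps):
        length = rng.gammavariate(shape, scale)   # step length, always positive
        turn = rng.vonmisesvariate(0.0, kappa)    # turning angle centred on zero
        bearing = heading + turn
        steps.append((x0 + length * math.cos(bearing),
                      y0 + length * math.sin(bearing)))
    return steps
```

Each observed step would then be paired with its sampled alternatives to form one stratum, and environmental covariates would be extracted at the end point of every step in that stratum.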

Another significant strength is the specificity of the individual-level and population-level models. First, the population-level linear regression models allow multiple individual characteristics to be included, producing results that can be adjusted as needed. Although not considered in this study, these models also provide flexibility in the type of variable interactions that can be specified, allowing for non-linear effects if necessary. Second, the individual-level conditional logistic regression models are conditioned on all included variables. This enables the estimation of the selection coefficient for each environmental factor after adjusting for potential confounders. This is particularly useful in our case, as open sewer points are often close to the central stream in all study areas (Figure 1).
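To make the adjustment explicit, the sketch below computes the contribution of a single stratum (one used step and its available steps) to a conditional logistic likelihood. It is an illustrative example with hypothetical names rather than the fitting code used in this study. Because all coefficients enter the same linear score, each one is estimated while accounting for the other covariates.

```python
import math


def stratum_log_likelihood(betas, used, available):
    """Log-probability that the used step is chosen over its alternatives.

    betas     -- selection coefficients, one per environmental covariate
    used      -- covariate values at the end of the observed step
    available -- list of covariate vectors, one per available step
    """
    def score(x):
        return sum(b * v for b, v in zip(betas, x))

    scores = [score(used)] + [score(x) for x in available]
    top = max(scores)  # subtract the maximum for numerical stability
    log_denominator = top + math.log(sum(math.exp(s - top) for s in scores))
    return scores[0] - log_denominator
```

The full log-likelihood for an individual would be the sum of this quantity over all of their strata, maximised with respect to the coefficients.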

Overall, we believe this method is a useful tool for analysing human mobility in the context of infectious disease epidemiology. This modelling approach could also be used in other areas of research which analyse human movements and choice relating to surrounding environmental features, such as urban planning. A major benefit of step-selection functions is the use of rasters, which provide flexibility when investigating environmental features. Creative uses of rasters could raise new questions and produce new results. Although the focus of these models is choice in space, the methods could also be adapted to analyse choice in time (e.g. whether temporal variables affect when a rat enters a household).

To conclude, we provide a worked example of how to use step-selection functions to analyse human movements in the field of infectious disease epidemiology. This highlights the usefulness of adapting methods from other fields to answer questions that would otherwise be difficult to answer with the existing methodology. By doing so, we develop a better understanding of environmental interactions and how to leverage the large data sets provided by GPS loggers.

Although our focus was leptospirosis, these methods can be adapted to model the exposure to any disease where movement and the environment play an important role.

Ethics

Ethical approval for this study was obtained from the ethics committee at the Collective Health Institute, Federal University of Bahia (CEP/ISC/UFBA) under number CAAE 32361820.7.0000.5030, and the national research ethics committee (CONEP) linked to the Brazilian Ministry of Health under approval number 4.235.251. All participants involved in the study provided written informed consent before data collection.

Supplementary Material

Descriptive Statistics

Telemetry data

The mean number of hours of telemetry data provided by an individual was 13.3 hours (SD = 13.5). The mean number of locations recorded by the GPS loggers was 2767 points (SD = 1947.2). There were no differences in the number of hours or number of locations recorded by gender, age or leptospirosis antibody status. There were notable differences in the number of hours recorded and the number of locations by study area. Study area 1 (NVS) had the lowest number of hours recorded (mean = 5.6 hours, SD = 5.6), whilst all other areas had similar hours recorded (area 2: mean = 15.0, SD = 11.4; area 3: mean = 10.9, SD = 14.0; area 4: mean = 20.7, SD = 15.3). The mean number of locations recorded was similar across all study areas (area 1: mean = 2048, SD = 1206; area 2: mean = 2831, SD = 1302; area 3: mean = 2992, SD = 2737; area 4: mean = 3107, SD = 1761).

Serological data

Serologically positive individuals were equally distributed across ages and genders, although the oldest male included in the analysis was also serologically positive (Figure S2).

There were also no significant skews in the household characteristics relative to the environmental factors being analysed (Figure S3).

Laboratory work

All samples were tested using the microscopic agglutination test (MAT), the reference test for the serological diagnosis of leptospirosis, as designated by the WHO. The diagnostic panel included the following serovars:

  • L. kirschneri serovar Cynopteri strain 3522C

  • L. kirschneri serovar Grippotyphosa strain Duyster

  • L. interrogans serovar Canicola strain H. Utrecht

  • L. interrogans serovar Autumnalis strain Akiyami A

  • L. borgpetersenii serovar Ballum strain MUS 127

  • L. interrogans serovar Copenhageni strain Fiocruz L1-130 (locally isolated in 1996)

  • L. interrogans serovar Copenhageni strain Fiocruz LV3954

Figure S1. Distribution of telemetry data provided by each individual across 24-hour periods (x axis), separated into each of the four study areas. Overlapping areas represent multiple days. Vertical bars represent 5 am (left-hand bar) and 9 pm (right-hand bar), the period of analysis.

Figure S2. Distribution of leptospirosis antibody status (serological status) by gender and age.

Figure S3. Distribution of nearest distance to each of the environmental factors being analysed (central stream, open sewer points and domestic rubbish piles) by serological status and study area. NA represents the remaining households in the study area that did not take part in the movement analysis.

Acknowledgements

We would like to thank all residents from the study areas, without whom this work could not have been completed. The authors would also especially like to thank the GIC (Grupo Impulsor Comunitario) for their support and warmth.

Additional information

Data Availability

Given the sensitive nature of the data used, individual-level GPS data will not be made available. However, we have submitted the anonymised data that were used to fit the population-level models.

Authors’ contributions

P.R.C., F.N.S., F.C. and E.G.: conceptualization, formal analysis, investigation, methodology, data curation, software, visualization, writing—original draft, writing—review and editing; J.R., C.C., F.C. and E.G.: supervision; F.C.: funding acquisition; M.T.E., J.R. and C.C.: formal analysis, writing—original draft, writing—review and editing; J.O.S., D.S.O., R.C.N., A.G.S., E.V.R.S., F.A.G., D.C.C.S. and P.S.R.: investigation, field team, laboratory, data curation, writing—review and editing.

Funding

This work was supported by the Wellcome Trust, NIHR/Wellcome Global Health Partnership (218987/Z/19/Z) to F.C. F.N.S. received a research scholarship from the Brazilian National Research Council (CNPq: 150142/2024-2). PRC is in receipt of a studentship from the Medical Research Council, United Kingdom. ME was supported through a Reckitt Global Hygiene Institute fellowship. JMR acknowledges support from the National Institute of Allergy and Infectious Diseases (grant number 1R01AI160780-01).

Additional files

Supplementary material.