
Social fluidity mobilizes contagion in human and animal populations

  1. Ewan Colman  Is a corresponding author
  2. Vittoria Colizza
  3. Ephraim M Hanks
  4. David P Hughes
  5. Shweta Bansal  Is a corresponding author
  1. Department of Biology, Georgetown University, United States
  2. Roslin Institute, University of Edinburgh, United Kingdom
  3. INSERM, Sorbonne Université, Institut Pierre Louis d’Épidémiologie et de Santé Publique (IPLESP UMRS 1136), F75012, France
  4. Department of Statistics, Eberly College of Science, Penn State University, United States
  5. Department of Entomology, College of Agricultural Sciences, Penn State University, United States
Research Article
Cite this article as: eLife 2021;10:e62177 doi: 10.7554/eLife.62177

Abstract

Humans and other group-living animals tend to distribute their social effort disproportionately. Individuals predominantly interact with a small number of close companions while maintaining weaker social bonds with less familiar group members. By incorporating this behavior into a mathematical model, we find that a single parameter, which we refer to as social fluidity, controls the rate of social mixing within the group. Large values of social fluidity correspond to gregarious behavior, whereas small values signify the existence of persistent bonds between individuals. We compare the social fluidity of 13 species by applying the model to empirical human and animal social interaction data. To investigate how social behavior influences the likelihood of an epidemic outbreak, we derive an analytical expression of the relationship between social fluidity and the basic reproductive number of an infectious disease. For species that form more stable social bonds, the model describes frequency-dependent transmission that is sensitive to changes in social fluidity. As social fluidity increases, animal-disease systems become increasingly density-dependent. Finally, we demonstrate that social fluidity is a stronger predictor of disease outcomes than both group size and connectivity, and it provides an integrated framework for both density-dependent and frequency-dependent transmission.

Introduction

Social behavior is fundamental to the survival of many species. It allows the formation of social groups providing fitness advantages from greater access to resources and better protection from predators (Krause and Ruxton, 2002). Structure within these groups can be found in the way individuals communicate across space, cooperate in sexual or parental behavior, or clash in territorial or mating conflicts (Hinde, 1976). While animal societies are usually studied independently of each other, studying how they differ in these regards has potential to reveal new insights into the nature of social living (Sah et al., 2018; Dunbar and Shultz, 2010).

When social interaction requires shared physical space it can also be a conduit for the transmission of infectious disease (Altizer et al., 2003). In a typical infectious disease model, if the disease spreads through the environment then the transmission rate is assumed to scale proportionally to the local population density (de Jong et al., 1995; Hopkins et al., 2020). Alternatively, if transmission requires close proximity encounters that only occur between bonded individuals then we expect social connectivity to determine the outcome. These two paradigms are known in the literature as density-dependence and frequency-dependence (Silk et al., 2017).

The problem, however, is that real diseases are not so easy to categorize (Patterson and Ruckstuhl, 2013). For example, as social groups grow in size, new bonds must be created to maintain cohesiveness (Lehmann et al., 2007). To manage the time and cognitive effort required to create these bonds, individuals tend to interact mostly with a small number of close companions while maintaining cohesion with the wider group through less frequent contact (Silk, 2007; Sueur et al., 2011; Dakin and Ryder, 2020). For an infectious disease, this creates fewer transmission opportunities than we would expect to see in a group with highly fluid social dynamics. The extent to which group size amplifies the transmission rate therefore depends on how individuals choose to distribute their social effort between strong and weak ties (Karsai et al., 2014).

While transmission rate has been observed to scale non-linearly with group size for a number of disease systems (Cross et al., 2013; Smith et al., 2009; Silk et al., 2017), it remains unclear how much this dependency is related to the internal social structure of the group; few studies observe social dynamics at sufficient detail while simultaneously monitoring the disease status of each individual. In the absence of direct observations, our contribution to this discussion centers around modeling; incorporating empirical social data with computational simulations. We address two specific questions. Firstly, can we quantify the variability in how individuals choose to distribute their social effort within a group, and secondly, what will this tell us about the effect that population density has on disease transmission?

In the first part of this paper, we introduce a mathematical model founded on the concept of social fluidity which we define as variability in the amount of social effort the individual invests in each member of their social group. Using openly available data, we estimate the social fluidity of 57 human and animal social systems. In the second part, we derive an expression for the basic reproductive number of an infectious disease in the social fluidity model and demonstrate its accuracy in predicting simulated outcomes. Furthermore, social fluidity emerges as a coherent mathematical framework providing the smooth connection between density-dependent and frequency-dependent disease systems.

Characterizing social behavior

Our first objective is to measure social behavior in a range of human and animal populations. We start by introducing a model that captures a hidden element of social dynamics: how individual group members distribute their social effort. We mathematically describe the relationships between social variables that are routinely found in studies of animal behavior, the number of social ties and the number of interactions observed, and apply the model to empirical data to reveal behavioral differences between several species.

Social behavior model

Consider a closed system of N individuals and a set of interactions between pairs of individuals that were recorded during some observation period. These observations can be represented as a network: each individual, i, is a node; an edge exists between two nodes i and j if at least one interaction was observed between them; the edge weight, $w_{i,j}$, denotes the number of times this interaction was observed. The total number of interactions of i is its strength, $s_i = \sum_j w_{i,j}$, and the number of nodes with whom i is observed interacting is its degree, $k_i$ (Barrat et al., 2004).
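The two network quantities defined above can be read directly off a table of pairwise interaction counts. The following is a minimal sketch, assuming the data are held as a dictionary with one entry per unordered pair; the function name and the toy data are hypothetical and are not taken from the study's repository.

```python
from collections import defaultdict

def strength_and_degree(edge_weights):
    """Return ({node: s_i}, {node: k_i}) from {(i, j): w_ij}, one entry per unordered pair."""
    strength = defaultdict(int)
    degree = defaultdict(int)
    for (i, j), w in edge_weights.items():
        if w <= 0:
            continue  # no interaction observed, so no edge
        for node in (i, j):
            strength[node] += w  # s_i: total number of interactions involving the node
            degree[node] += 1    # k_i: number of distinct interaction partners
    return dict(strength), dict(degree)

# Hypothetical toy data: three individuals, two observed pairs.
weights = {("a", "b"): 3, ("a", "c"): 1}
s, k = strength_and_degree(weights)
print(s)
print(k)
```

Individual "a" has strength 4 and degree 2 here, because its four interactions are spread over two partners.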

We define $x_{j|i}$ to be the probability that an interaction involving i will also involve node j. Therefore, the probability that at least one of the $s_i$ interactions of i is with j is $1-(1-x_{j|i})^{s_i}$. The main assumption of the model is that the values of $x_{j|i}$ over all i,j pairs are distributed according to a probability distribution, ρ(x). Thus, if a node interacts s times, the marginal probability that an edge exists between that node and any other given node in the network is

(1) $\Psi(s) = 1 - \int \rho(x)\,(1-x)^{s}\,dx.$

Technically, ρ(x) is the distribution of the marginal $x_{j|i}$ values of the joint probability distribution $\rho(X)$, where X is a matrix whose i,j entry is -1 if i=j and $x_{j|i}$ otherwise. While the values of $x_{j|i}$ are subject to network interdependencies, specifically $AX = X^{T}A$ and $X\mathbf{1} = \mathbf{0}$, where A is any diagonal matrix with positive entries, and $\mathbf{1}$ and $\mathbf{0}$ are column vectors of length N containing only 1 and only 0, respectively, we do not take these constraints into account when estimating ρ.

Our goal is to find a form of ρ that accurately reproduces network structure observed in real social systems. Motivated by our exploration of empirical interaction patterns from a variety of species, we propose that ρ has a power-law form:

(2) $\rho(x) = \dfrac{\phi\,\epsilon^{\phi}}{1-\epsilon^{\phi}}\,x^{-(1+\phi)} \quad \text{for } \epsilon < x < 1,$

where ϕ controls the variability in the values of x, and ϵ simply truncates the distribution to avoid divergence. The form of ρ(x) was chosen for its analytical tractability but other heavy-tailed distributions produce a similar result (Figure 1—figure supplement 1 ). Combining (1) and (2) we find

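(3) $\Psi(s,\phi,\epsilon) = 1 - \dfrac{\phi\,\epsilon^{\phi}(1-\epsilon)^{s+1}}{(1-\epsilon^{\phi})(s+1)}\;{}_2F_1(s+1,\,1+\phi;\,s+2;\,1-\epsilon),$

where the notation ${}_2F_1$ refers to the Gauss hypergeometric function (Abramowitz and Stegun, 1975). It follows from $\sum_j x_{j|i} = 1$ that

(4) $N = 1 + \dfrac{(1-\phi)(1-\epsilon^{\phi})}{\phi\,\epsilon^{\phi}(1-\epsilon^{1-\phi})},$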

which can be solved numerically to find ϵ for given values of N and ϕ. The expectation of the degree is κ(s,ϕ,N)=(N-1)Ψ(s,ϕ,ϵ).
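To make the chain from Equations (1), (2), and (4) to the expected degree concrete, the sketch below solves Equation (4) for ϵ and evaluates Ψ numerically. It is an illustration under stated assumptions rather than the implementation used in this study: ϵ is found by bisection and the integral in Equation (1) is evaluated by Simpson's rule after the substitution t = ln x, so that only the Python standard library is needed. All function names are hypothetical, and the case ϕ = 1, where Equation (4) must be taken as a limit, is not handled.

```python
import math

def group_size(eps, phi):
    """Equation (4), written with eps**phi * (1 - eps**(1 - phi)) = eps**phi - eps."""
    return 1 + (1 - phi) * (1 - eps**phi) / (phi * (eps**phi - eps))

def solve_epsilon(N, phi, iterations=200):
    """Find eps in (0, 1) such that group_size(eps, phi) == N, by bisection on log10(eps)."""
    lo, hi = -200.0, -1e-12  # bounds on log10(eps); the implied N decreases as eps grows
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if group_size(10.0**mid, phi) > N:
            lo = mid  # eps too small: the implied group is larger than N
        else:
            hi = mid
    return 10.0 ** (0.5 * (lo + hi))

def edge_probability(s, phi, eps, n=2000):
    """Equation (1) with rho from Equation (2); Simpson's rule in t = ln(x), n even."""
    c = phi * eps**phi / (1 - eps**phi)  # normalising constant of rho
    a = math.log(eps)                    # t runs from ln(eps) to 0
    h = -a / n

    def f(t):
        # rho(x) * (1 - x)**s * dx/dt with x = exp(t)
        return c * math.exp(-phi * t) * (1 - math.exp(t)) ** s

    total = f(a) + f(0.0)
    for m in range(1, n):
        total += (4 if m % 2 else 2) * f(a + m * h)
    return 1 - total * h / 3

def expected_degree(s, phi, N):
    """kappa(s, phi, N) = (N - 1) * Psi(s, phi, eps), with eps fixed by Equation (4)."""
    eps = solve_epsilon(N, phi)
    return (N - 1) * edge_probability(s, phi, eps)

for phi in (0.3, 1.5):  # illustrative low and high social fluidity
    print(phi, [round(expected_degree(s, phi, 50), 2) for s in (1, 10, 100)])
```

A useful sanity check is that a single interaction always yields exactly one partner, so the expected degree at s = 1 should equal 1 for any ϕ and N.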

Figure 1 illustrates how the value of ϕ can produce different types of social behavior. As ϕ is the main determinant of social behavior in our model, we use the term social fluidity to refer to this quantity. Low social fluidity ($\phi \ll 1$) produces what we might describe as ‘allegiant’ behavior: interactions with the same partner are frequently repeated at the expense of interactions with unfamiliar individuals. As ϕ increases, the model produces more ‘gregarious’ behavior: interactions are repeated less frequently and the number of partners grows faster. While names like ‘social strategy’ and ‘loyalty’ have been applied to similar concepts (Valdano et al., 2015; Miritello et al., 2013), fluidity, as a property of matter, is a useful metaphor for communicating the main idea behind this model.

Figure 1
Left: Each individual can be represented as a single point on this plot.

Dashed lines mark the boundary of the region where data points can feasibly be found. The mean degree is plotted for two values of ϕ representing two possible types of social behavior; as the number of observed interactions grows, the set of social contacts increases; the rate at which it increases influences how we categorize their social behavior. Middle: The weight of the edges between i and the other nodes represents the propensity of i to interact with each of the other individuals in the group. Right: Probability distributions that correspond to the different levels of evenness in the contact propensities, both distributions are expressed by Equation (2).

Estimating social fluidity in empirical networks

To understand the results of the model in the context of real systems, we estimate ϕ in 57 networks from 20 studies of human and animal social behavior (further details in the supplement) (Isella et al., 2011; Stehlé et al., 2011a; Mastrandrea et al., 2015; Vanhems et al., 2013; Modlmeier et al., 2019; Blonder and Dornhaus, 2011; Génois et al., 2015; Carter and Wilkinson, 2013; Grant, 1973; Levin et al., 2016; Sailer and Gaulin, 1984; Mourier et al., 2017; Massen and Sterck, 2013; Sade, 1972; Butovskaya et al., 1994; Takahata, 1991; Hass, 1991; Lott, 1979; Schein and Fohrman, 1955; Hobson and DeDeo, 2015; Gernat et al., 2018), focusing our attention on those interactions which are capable of disease transmission (i.e. those that, at the least, require close spatial proximity). The advantage of using this model over more detailed network descriptions is that we obtain a single parameter estimate, ϕ, that is easily compared across animal species and environments.

Each dataset provides the number of interactions that were observed between pairs of individuals. We assume that the system is closed, and that the total network size (N) is equal to the number of individuals observed in at least one interaction. To estimate social fluidity, we find the value of ϕ that minimizes $\sum_i [k_i - \kappa(s_i,\phi,N)]^2$, the total squared error between the observed degrees and their expectation given by the model. Uncertainty is displayed using the 2.5th and 97.5th percentiles of the distribution of ϕ computed on a set of 1000 ‘bootstrap’ samples, created by sampling N data points, $\{k_i, s_i\}$, with replacement, from the observed data. Being estimated from the relationship between strength and degree, and not their absolute values, social fluidity is a good candidate for comparing social behavior across different systems as it is independent of the distributions of $s_i$ or $k_i$, and of the timescale of interactions.
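The fitting procedure described above can be sketched as follows. This is an illustration under several assumptions, not the study's code: the minimization is replaced by a search over a hypothetical grid of ϕ values (skipping ϕ = 1), the expected degree is computed by bisection and Simpson's rule rather than with the hypergeometric function, and the synthetic data are generated by the model itself so that the correct answer is known in advance.

```python
import math
import random
from functools import lru_cache

def solve_epsilon(N, phi):
    """Bisection on log10(eps) for Equation (4); valid for phi != 1."""
    def size(eps):
        return 1 + (1 - phi) * (1 - eps**phi) / (phi * (eps**phi - eps))
    lo, hi = -200.0, -1e-12
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if size(10.0**mid) > N:
            lo = mid
        else:
            hi = mid
    return 10.0 ** (0.5 * (lo + hi))

@lru_cache(maxsize=None)
def expected_degree(s, phi, N, n=400):
    """kappa(s, phi, N) = (N - 1) * Psi, with Psi from Simpson's rule in t = ln(x)."""
    eps = solve_epsilon(N, phi)
    c = phi * eps**phi / (1 - eps**phi)
    a = math.log(eps)
    h = -a / n
    def f(t):
        return c * math.exp(-phi * t) * (1 - math.exp(t)) ** s
    total = f(a) + f(0.0) + sum((4 if m % 2 else 2) * f(a + m * h) for m in range(1, n))
    return (N - 1) * (1 - total * h / 3)

# Hypothetical search grid: 0.10, 0.15, ..., 2.00, skipping phi = 1.
PHI_GRID = [round(0.05 * m, 2) for m in range(2, 41) if m != 20]

def squared_error(data, phi, N):
    """Sum over individuals of [k_i - kappa(s_i, phi, N)]^2; data is a list of (k_i, s_i)."""
    return sum((k - expected_degree(s, phi, N)) ** 2 for k, s in data)

def estimate_phi(data, N):
    """Grid value of phi with the smallest total squared error."""
    return min(PHI_GRID, key=lambda phi: squared_error(data, phi, N))

def bootstrap_interval(data, N, n_boot=1000, seed=0):
    """2.5th and 97.5th percentiles of phi over resamples drawn with replacement."""
    rng = random.Random(seed)
    estimates = sorted(
        estimate_phi(rng.choices(data, k=len(data)), N) for _ in range(n_boot)
    )
    return estimates[int(0.025 * n_boot)], estimates[min(n_boot - 1, int(0.975 * n_boot))]

# Noise-free synthetic check: degrees generated by the model itself at phi = 0.4.
N = 30
data = [(expected_degree(s, 0.4, N), s) for s in (5, 10, 20, 40, 80)]
print(estimate_phi(data, N))
```

A grid search is used here only because it is easy to verify; a continuous optimizer would give finer resolution in ϕ.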

Figure 2 shows the estimated values of ϕ for all networks in our study. We organize the measurements of social fluidity by interaction type. Aggressive interactions have the highest fluidity (which implies that most interactions are rarely repeated between the same individuals), while grooming and other forms of social bonding have the lowest (which implies frequent repeated interactions between the same individuals). Social fluidity also appears to be related to species: ant systems cluster around ϕ=1, monkeys around ϕ=0.5, humans take a range of values that depend on the social environment. Sociality type does not appear to affect ϕ; sheep, bison, and cattle have different social fluidity compared to kangaroos and bats, although they are all categorized as fission-fusion species (Sah et al., 2018).

Figure 2
Each point represents a human or animal system for which social fluidity was estimated.

Colors correspond to the species and the setting in the case of human networks. Different shapes are used as a visual aid. Lines represent the 95% bootstrap confidence interval. Results are organized by interaction type: aggression includes fighting and displays of dominance, food sharing refers to mouth-to-mouth passing of food, antennation is when the antenna of one insect touches any part of another, space sharing interactions occur with spatial proximity during foraging, face-to-face refers to close proximity interactions that require individuals to be facing each other, association is defined as co-membership of the same social group, and grooming is when one individual cleans another with their hand or other body part.

Across the 57 networks, there is no evidence that social fluidity scales with the size of the network or the number of observations per individual. No correlation was found between the mean number of interactions per individual ($\bar{s}$) and social fluidity when testing for a monotonic relationship between the variables (Spearman $r^2=0.02$, p=0.36), and in general no correlation across sets of networks taken from the same study (Supplementary file 1: Table S2). Similarly, network size (N) does not correlate with ϕ (Spearman $r^2=0.02$, p=0.28). To test for a non-monotonic relationship, we partition the set of networks into 10 equally sized groups according to each of the two measures being compared, and compute the adjusted mutual information (AMI) of the two groupings. We find AMI=0.15 for the relationship between ϕ and N, and AMI=0.2 between ϕ and $\bar{s}$. While positive values of AMI typically indicate a non-random relationship, an inherent amount of clustering is to be expected in data aggregated from a diverse range of sources.
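The two ingredients of these tests, a rank correlation and a partition into rank-order classes, can be sketched with the standard library alone. The function names are hypothetical, ties are assigned their mean rank, and the class labels produced by rank_classes are the kind of input from which an adjusted mutual information score could then be computed (that final step is not reproduced here).

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = 0.5 * (start + end) + 1
        start = end + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

def rank_classes(values, n_classes=10):
    """Label each value with its rank-order class; class sizes differ by at most one."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for position, index in enumerate(order):
        labels[index] = position * n_classes // len(values)
    return labels

# Hypothetical values, for illustration only.
phi_values = [0.2, 0.9, 0.5, 1.3, 0.7]
sizes = [12, 40, 25, 90, 31]
print(spearman(phi_values, sizes) ** 2)   # squared coefficient, as reported in the text
print(rank_classes(phi_values, n_classes=5))
```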

Larger values of ϕ correspond to higher mean degrees (Spearman $r^2=0.21$, p<0.001) and lower variability in the distribution of edge weights (measured as the index of dispersion of $w_{i,j}$; Spearman $r^2=0.46$, p<0.001). Weight variability and mean degree are uncorrelated in these data (Spearman $r^2=0.01$, p=0.54, AMI=0.01), implying that ϕ combines these two entirely distinct features of social behavior. Finally, the modularity of the network (computed by the Louvain method on the unweighted network; Blondel et al., 2008) is negatively correlated with ϕ ($r^2=0.52$, p<0.001). This is expected as individuals tend to be loyal to those within the same module while maintaining weaker connections with the remaining network - in all but one network the mean weight of edges within modules is higher than the mean weight of edges between modules (supplementary document).

As with any applied modeling, the validity of these results depends on the extent to which each study system conforms to the assumptions of the model. The value of N, for example, might not represent the true group size if some individuals in the group did not have their interactions recorded, or if there are individuals who did not interact during the time-frame of observation. While we found that variation in the value of N did not have a large impact on the estimated value of ϕ, as shown in Figure 2—figure supplement 1, we warn that the amount of consistency between model assumptions and the conditions of each study will vary, and close consideration should be given to the way data were collected when interpreting these results.

Characterizing disease spread with social fluidity

Our objective is to characterize how social behavior influences the exposure of the group to infectious disease in a range of human and animal social systems. Intuitively, we expect an infected individual in a group with low social fluidity to expose fewer susceptible group members to the pathogen than they would in a group with highly fluid social dynamics. We explore this idea by introducing an analytical transmission model that incorporates social fluidity. Using this model, we mathematically characterize the impact of social fluidity on density dependence, and apply the model to empirical networks to predict disease spread.

Disease transmission model

We consider the transmission of an infectious disease on the social behavior model introduced in the previous section. An infectious node i interacting with a susceptible node j will transmit the infection with probability β. The node will recover from infection with rate γ, assuming an exponential distribution of the length of the infectious period. The probability that the infection is transmitted from i to any given j is

(5) $T_{ij}(\beta,\gamma,s_i,\tau,x_{j|i}) = 1 - \exp\!\left(-\dfrac{s_i\,x_{j|i}\,\beta}{\gamma\tau}\right),$

assuming that the $s_i$ interactions of i are distributed randomly across an observation period of duration τ.

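By integrating Equation (5) over all possible values of $x_{j|i}$ and infectious period durations, and multiplying by the number of susceptible individuals (N-1), we obtain the expected number of infections caused by individual i:

(6) $r(s_i) = \dfrac{1-\phi}{\phi(\epsilon^{\phi}-\epsilon)}\left[1-\epsilon^{\phi}+\epsilon^{\phi}\,{}_2F_1\!\left(-\phi,1;1-\phi;-\dfrac{\beta s_i}{\gamma\tau}\right)-{}_2F_1\!\left(-\phi,1;1-\phi;-\dfrac{\epsilon\beta s_i}{\gamma\tau}\right)\right].$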

The basic reproductive number (usually denoted $R_0$) is defined as the mean number of secondary infections caused by a typical infectious individual in an otherwise susceptible population (Diekmann et al., 1990). We will use the notation $R_0^{\phi}$ to signify the social fluidity reproductive number, that is, the analogue of $R_0$ derived from our social behavior model.

We assess the relation of the reproductive number with the population density by focusing on a special case where every node has the same strength, that is $s_i = s$ for all i, so that $R_0^{\phi} = r(s)$. Furthermore, we choose $\beta = \gamma\tau R_0/s$, where $R_0$ is the limit of $R_0^{\phi}$ as $\phi \to \infty$, that is, a constant that represents what the basic reproductive number would be if every new interaction occurred between a pair of individuals who have not previously interacted with each other.

Figure 3 shows the effect of social fluidity on the density dependence of the disease. At small population sizes, $R_0^{\phi}$ increases with N, and it converges as N goes to ∞ (Figure 3A). The rate of this convergence increases with ϕ, and the limit it converges to is higher, meaning that ϕ determines the extent to which density affects the spread of disease. As $N \to \infty$, we find that $R_0^{\phi} \to R_0$ for ϕ>1. When ϕ<1, $R_0^{\phi} \to [(1-\phi)/\phi]\,[{}_2F_1(-\phi,1;1-\phi;-R_0)-1]$. At these values of ϕ the disease is constrained by individuals choosing to repeat interactions despite having the choice of infinitely many potential interaction partners (Figure 3B).
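The density dependence described here can be explored numerically. The sketch below does not evaluate the hypergeometric terms of Equation (6); instead it integrates a per-pair transmission probability of the form cx/(1+cx), with c = βs/(γτ), over the distribution in Equation (2). That form is an assumption: it is our reading of what averaging over an exponentially distributed infectious period produces, and it is the integrand that yields hypergeometric terms of the kind appearing in Equation (6). Function names and parameter values are hypothetical, and ϕ = 1 is not handled.

```python
import math

def solve_epsilon(N, phi):
    """Bisection on log10(eps) for Equation (4); valid for phi != 1."""
    def size(eps):
        return 1 + (1 - phi) * (1 - eps**phi) / (phi * (eps**phi - eps))
    lo, hi = -200.0, -1e-12
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if size(10.0**mid) > N:
            lo = mid
        else:
            hi = mid
    return 10.0 ** (0.5 * (lo + hi))

def expected_infections(s, phi, N, beta, gamma, tau, n=2000):
    """(N - 1) * integral of rho(x) * c*x / (1 + c*x) dx, with c = beta * s / (gamma * tau)."""
    eps = solve_epsilon(N, phi)
    c = beta * s / (gamma * tau)
    norm = phi * eps**phi / (1 - eps**phi)  # normalising constant of rho
    a = math.log(eps)                       # Simpson's rule in t = ln(x), from ln(eps) to 0
    h = -a / n

    def f(t):
        x = math.exp(t)
        return norm * math.exp(-phi * t) * c * x / (1 + c * x)

    total = f(a) + f(0.0) + sum((4 if m % 2 else 2) * f(a + m * h) for m in range(1, n))
    return (N - 1) * total * h / 3

R0 = 3.0                      # value of beta * s / (gamma * tau) fixed by the calibration
s, gamma, tau = 20, 0.1, 1.0  # hypothetical strength, recovery rate, observation period
beta = gamma * tau * R0 / s   # calibration used in the text
for phi in (0.5, 1.5):
    row = [round(expected_infections(s, phi, N, beta, gamma, tau), 3) for N in (10, 100, 10000)]
    print(phi, row)
```

For ϕ = 1.5 the printed values should approach 3 as N grows, whereas for ϕ = 0.5 they should level off at a lower value, in line with the two limits stated in the text.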

Figure 3
Density dependence in populations where every node has the same strength.

(A) For different values of social fluidity, ϕ, we show $R_0^{\phi}$ (from Equation (6)) as a function of N (from Equation (4)) through their parametric relation with ϵ. Dashed lines show the limit for large N. (B) In large populations $R_0^{\phi}$ increases with ϕ up to ϕ=1. Beyond this value, infections occur as frequently as they would if every new interaction occurred between a pair of individuals who have not previously interacted with each other.

Infection spread in empirical networks with heterogeneous connectivity

To apply this analogue of a reproductive number to an animal-disease system, we need to account for heterogeneous levels of social connectivity in the given population and thus the tendency for infected individuals to be those with a greater number of social partners (Anderson et al., 1986). For the basic reproductive number, this is often done using the mean excess degree, that is the degree of an individual selected with probability proportional to their degree (Newman, 2018). Following a similar reasoning, we define $R_0^{\mathrm{Est}}$, which incorporates the effect of social fluidity, as the expected number of infections ($r(s_i)$) caused by an individual that has been selected with probability proportional to their degree ($k_i$):

(7) $R_0^{\mathrm{Est}}(\{s_i\},\{k_i\},\tau,\beta,\gamma) = \dfrac{\sum_i k_i\,r(s_i)}{\sum_i k_i}.$

Given the degree and strength of each individual in a network, the duration over which those interactions occurred, and the transmission and recovery rates of the disease, we are able to estimate ϕ, compute Equation (6) for each individual, and finally use Equation (7) to derive a statistic that provides a measure of the risk of the host population to disease outbreak.
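The final step, Equation (7), is a degree-weighted mean. A minimal sketch, assuming the per-individual values r(s_i) have already been computed from Equation (6); the function name and the numbers are hypothetical.

```python
def r0_estimate(degrees, infections):
    """Equation (7): mean of r(s_i) weighted by the degree k_i."""
    total_degree = sum(degrees)
    if total_degree == 0:
        raise ValueError("at least one individual must have a positive degree")
    return sum(k * r for k, r in zip(degrees, infections)) / total_degree

# Hypothetical values: the r(s_i) would come from Equation (6).
degrees = [1, 3]
infections = [2.0, 4.0]
print(r0_estimate(degrees, infections))  # (1*2.0 + 3*4.0) / 4 = 3.5
```

Weighting by degree pulls the estimate toward the better connected individual, which is why the result exceeds the unweighted mean of 3.0.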

Numerical validation using empirical networks

We simulated the spread of disease through the interactions that occurred in the empirical data (Materials and methods). We compute $R_0^{\mathrm{Sim}}(g)$, defined as the ratio of the number of individuals infected at the (g+1)-th generation to the number infected at the g-th generation over $10^3$ simulated outbreaks, for g=0,1,2 (g=0 refers to the initial seed of the outbreak).
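The generation ratio can be computed from per-outbreak counts as sketched below. The sketch assumes that each simulated outbreak is summarized as a list whose g-th entry is the number of individuals infected in generation g, and that counts are pooled over outbreaks before the ratio is taken; both the data layout and the pooling choice are assumptions, and the simulation itself is not reproduced.

```python
def generation_ratios(outbreaks, generations=(0, 1, 2)):
    """For each g: infections in generation g+1 divided by infections in generation g, pooled."""
    ratios = {}
    for g in generations:
        current = sum(o[g] for o in outbreaks if len(o) > g)
        following = sum(o[g + 1] for o in outbreaks if len(o) > g + 1)
        ratios[g] = following / current if current else float("nan")
    return ratios

# Hypothetical counts for two outbreaks, each seeded with one individual.
outbreaks = [[1, 2, 4, 3], [1, 0, 0, 0]]
print(generation_ratios(outbreaks))  # {0: 1.0, 1: 2.0, 2: 0.75}
```

Pooling keeps outbreaks that die out immediately in the denominator at g = 0, which averaging per-outbreak ratios would handle differently.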

Table 1 shows the Pearson correlation coefficient and the adjusted mutual information between $R_0^{\mathrm{Sim}}(g)$ and its corresponding value $R_0^{\mathrm{Est}}$ obtained from Equation (7) (Materials and methods). Equivalent results are also presented for other indicators and network statistics. The results correspond to one set of simulation conditions and are consistent across a wide range of parameter combinations (see Supplementary file 1). Note that a different value of β was chosen for each network to control for the varying interaction rates between networks while keeping the upper bound ($R_0$) constant (Materials and methods). While contact frequency is known to be one of the major contributors to disease risk, calibrating β in this way eliminates its effect, allowing the contribution of other network characteristics to be compared. Thus, the mean strength does not have a significant effect on $R_0^{\mathrm{Sim}}(g)$, and higher mean edge weight does not necessarily imply higher transmission probability over the edges of the network.

Table 1
The Pearson correlation coefficient between quantities calculated on the network and the simulated disease outcomes (with R0=3).

Results that are significant with p<0.01 are labeled with *. Adjusted mutual information is calculated between the variables after partitioning the set of networks into 10 equally sized rank-order classes.

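| Quantity | Corr. with $R_0^{\mathrm{Sim}}(g=1)$ | Adjusted MI |
|---|---|---|
| $R_0^{\mathrm{Est}}$ | 0.91* | 0.35 |
| Social fluidity | 0.73* | 0.24 |
| Excess degree | 0.64* | 0.15 |
| Mean degree | 0.53* | 0.14 |
| Network size | 0.47* | 0.18 |
| Mean strength | -0.07 | -0.02 |
| Mean clustering | -0.15 | 0.12 |
| Mean edge weight | -0.45* | 0.10 |
| Edge weight heterogeneity | -0.48* | 0.21 |
| Modularity | -0.59* | 0.12 |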

These correlations support a known result regarding repeat contacts in network models of disease spread: that indicators of disease risk that are derived solely from the degree distribution are unreliable and the role of edge weights should not be neglected (Smieszek et al., 2009; Stehlé et al., 2011b). After transmission has occurred from one individual to another, repeating the same interaction serves no advantage for disease (most directly transmitted microparasites are not dose-dependent). Since a large edge weight implies a high frequency of repeated interactions, networks with a higher mean weight tend to have lower basic reproductive numbers. Furthermore, variability in the distribution of weights concentrates a yet larger proportion of interactions onto a small number of edges, further increasing the number of repeat interactions and reducing the reproductive number.

The correlation between modularity and $R_0^{\mathrm{Sim}}(g)$ is partly due to the strong negative correlation between modularity and social fluidity. Consistent with other evidence (Sah et al., 2017), this suggests that transmission events occur mostly within the module of the seed node, with weaker social ties facilitating transmission to other modules. Clustering (a measure of the number of connected triples in the network; Watts and Strogatz, 1998) correlates with smaller $R_0^{\mathrm{Sim}}(2)$, consistent with other theoretical work (Miller, 2009; Smieszek et al., 2009).

Finally, we find the model estimate of the social fluidity reproductive number, $R_0^{\mathrm{Est}}$, to be, on average, within 10% of the simulated value, $R_0^{\mathrm{Sim}}(g)$, at g=1. At g=2 the amount of error is larger (up to 29% for some parameter choices). Prediction accuracy at this generation is negatively correlated with the mean clustering coefficient. This is not surprising as $R_0^{\mathrm{Est}}$ does not account for the accelerated depletion of susceptible neighbors that is known to occur in clustered networks (Miller, 2009; Smieszek et al., 2009). No other properties of the network affect the accuracy of $R_0^{\mathrm{Est}}$ consistently across all parameter combinations (see Supplementary file 1).

Results and discussion

We proposed a measure of fluidity in social behavior which quantifies how much mixing exists within the social relationships of a population. While social networks can be measured with a variety of metrics including size, connectivity, contact heterogeneity and frequency, our methodology reduces all such factors to a single quantity allowing comparisons across a range of human and animal social systems. Social fluidity correlates with both the density of social ties (mean degree) and the variability in the weight of those ties, although these quantities do not correlate with each other. Social fluidity is thus able to combine these two aspects seamlessly in one quantity.

By measuring social fluidity across a range of human and animal systems we are able to rank social behaviors. We identify aggressive interactions as the most socially fluid; this indicates a possible learning effect whereby each aggressive encounter is followed by a period during which individuals avoid further aggression with each other (Parker, 1974). At the opposite end of the scale, we find interactions that strengthen bonds (and thus require repeated interactions) such as grooming in monkeys (Seyfarth and Cheney, 1984) and food-sharing in bats (Carter and Wilkinson, 2013). The fact that food-sharing ants are far more fluid than bats, despite performing the same kind of interaction, reflects their eusocial nature and the absence of any need to consistently reinforce bonds with their kin (Hölldobler and Wilson, 2009).

Our results contribute to a body of work examining the disproportionate distribution of social effort in both human and animal groups. This phenomenon has been directly observed in human telecommunication (Mac Carron et al., 2016; Saramäki et al., 2014; Gonçalves et al., 2011; Tamarit et al., 2018). Quantifying this aspect of sociality in animal systems, however, has been held back by the limitations of the data, such as the bias introduced by variation in activity levels across the social group (Di Bitetti, 2000). Additionally, while heterogeneous interaction frequencies and temporal dynamics have become common in epidemiological models (Rocha and Blondel, 2013; Colman et al., 2018), our results highlight the importance of including variability in how the individual chooses to expend their social effort.

As with most studies that aim to describe and quantify social structure, there are a number of concerns that ought to be mentioned. The degree of an individual, for example, is known to scale with the length of the observation period (Perra et al., 2012). This is also true of the networks used here (Figure 2—figure supplement 1). Similarly, social fluidity can be affected by the length of the observation window. However, since our model focuses not on the absolute value of degree, but on how degree scales with the number of observations, the results we obtain are relatively robust against this variability (Figure 2—figure supplement 1). Additionally, observed interactions are typically assumed to persist over time (Perreault, 2010). In our model this is not the case; only the distribution of edge weights remains constant, an assumption consistent with growing evidence (Miritello et al., 2013; Centellegher et al., 2017).

We therefore consider the model to be applicable to the data analysed in this study, but advise caution when applying this approach to other data sources. If the duration of a study allows for substantial developments in the group structure, for example, then a model of edge formation and dissolution may be preferred.

Finally, we do not know the extent to which an interaction, as defined for each network, is capable of transmission which can depend on the pathogen’s transmission mode and the infectious dose required. Furthermore, the transmission probability is unlikely to be the same for all interactions within the group since, for example, the duration of contact is known to be important for disease spread (Stehlé et al., 2011b). We did not include explicitly the duration of each contact in our model as this information was only available in a fraction of the datasets (Barrat et al., 2014). There is therefore potential to improve the applicability of this model as more high resolution data becomes openly available.

Our estimate of the reproductive number derived from social fluidity provides a better predictor for the epidemic risk of a host population, going beyond predictors based on density or degree only. To illustrate this point, the social network of individuals at a conference ($R_0^{\mathrm{Est}}=1.60$; conference_0, supplementary document) is predicted to be at higher risk compared to the social network at a school ($R_0^{\mathrm{Est}}=1.39$; highschool_0), despite having a smaller size and lower connectivity (N=93 vs. N=312, and $\bar{k}=5.63$ vs. $\bar{k}=6.78$, respectively). The discrepancy in the risk prediction comes from the lower frequency of repeated contacts between individuals in the conference, compared to the school. Interactions between infectious individuals and those they have previously infected are redundant in terms of transmission. This dynamic is nicely captured by the social fluidity, with ϕ=0.66 for the conference and ϕ=0.40 for the high school.

Unlike previous work that explores the disease consequences of population mixing (Volz and Meyers, 2007; Reluga and Shim, 2014), our analysis allows us to investigate this relation across a range of social systems. We see, for example, how the relationship between mixing and disease risk scales with group size. For social systems that have high values of social fluidity, $R_0^{\phi}$ is highly sensitive to changes in N, whereas this sensitivity is not present at low values of ϕ. This corroborates past work on the scaling of transmission being associated with heterogeneity in contact (Begon et al., 2002; Ferrari et al., 2011). Going beyond previous work, our model captures both density-dependence and frequency-dependence in a coherent theoretical framework, and social fluidity is the measure that tunes from one to the other in a continuous way. Since many empirical studies support a transmission function that is somewhere between these two modeling paradigms (Smith et al., 2009; Cross et al., 2013; Borremans et al., 2017; Hopkins et al., 2020), the modeling approaches applied in this paper can be carried forward to inform transmission relationships in future disease studies.

Materials and methods

Python libraries

Mean clustering coefficients were computed using the networkx Python library. To evaluate the hypergeometric function in Equation (3), we used the hyp2f1 function from the scipy.special Python library. Numerical solutions to Equation (4) were obtained using the fsolve function from the scipy.optimize Python library. Adjusted mutual information was computed using the adjusted_mutual_info_score function from the sklearn.metrics library. All scripts, data, and documentation used in this study are available through https://github.com/EwanColman/Social-Fluidity (Colman, 2021, copy archived at swh:1:rev:90b27e1b84ce4417633885cd260c89bbf1b07eac).

Data handling

Only freely available downloadable sources of data have been used for this study. Details of the experimentation and data collection, including how the interaction type is defined, can be found through their respective publications. Here, we note some additional processes we have applied for our study.

Each human contact dataset lists the identities of the people in contact, as well as the 20 s interval of detection (Isella et al., 2011; Vanhems et al., 2013; Stehlé et al., 2011a; Mastrandrea et al., 2015; Génois et al., 2015). Any sequence of consecutive time intervals for which contact is detected between two individuals is considered to be one interaction. To exclude contacts detected while participants momentarily walked past one another, only contacts detected in at least two consecutive intervals are considered interactions. Data were then separated into 24 hr subsets.
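The merging rule described above can be sketched as follows. This is a minimal illustration, not the authors' script: the record layout `(t, i, j)` (interval start time in seconds and two participant identifiers) and the function name `extract_interactions` are assumptions made for the example.

```python
from collections import defaultdict

INTERVAL = 20  # seconds per detection interval, as stated in the text


def extract_interactions(records, interval=INTERVAL, min_intervals=2):
    """Merge consecutive detection intervals into interactions.

    records: iterable of (t, i, j), where t is the start of a detection
    interval (hypothetical layout). Returns sorted (start, end, i, j)
    tuples for runs spanning at least `min_intervals` intervals.
    """
    by_pair = defaultdict(list)
    for t, i, j in records:
        by_pair[tuple(sorted((i, j)))].append(t)  # contact is undirected

    interactions = []
    for pair, ts in by_pair.items():
        ts = sorted(set(ts))
        start = prev = ts[0]
        # The trailing None flushes the final run.
        for t in ts[1:] + [None]:
            if t is not None and t - prev == interval:
                prev = t  # the run of consecutive intervals continues
                continue
            n_intervals = (prev - start) // interval + 1
            if n_intervals >= min_intervals:
                interactions.append((start, prev + interval, pair[0], pair[1]))
            if t is not None:
                start = prev = t
    return sorted(interactions)


records = [(0, "a", "b"), (20, "b", "a"), (40, "a", "b"),
           (100, "a", "b"), (200, "a", "c"), (220, "a", "c")]
result = extract_interactions(records)
```

In this toy input the isolated detection at t = 100 is discarded as a momentary pass, while the two longer runs are each kept as a single interaction. Splitting the output into 24 hr subsets would then amount to grouping on the integer part of `start / 86400`.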

The bee trophallaxis study provided experimental data for five unrelated colonies under continuous observation; we use the first hour of recorded data for each colony (Gernat et al., 2018). The ant trophallaxis study provided six networks: three unrelated colonies continuously observed under two different experimental conditions (Modlmeier et al., 2019). The ant antennation study provided six networks: three colonies, each observed for 4 hr in two sessions separated by a 2-week period. The bat study collected individual data at different times and under different experimental conditions (Carter and Wilkinson, 2013). For bats that were studied on more than one occasion, we use only the first day they were observed.

Some data sets provided data for group membership collected through intermittent, rather than continuous, observation (Grant, 1973; Massen and Sterck, 2013; Levin et al., 2016; Sailer and Gaulin, 1984; Mourier et al., 2017) and typically recorded over multiple days or weeks. We construct networks from these data by recording an interaction when two individuals were seen to be in the same group during one round of observation. The shark data were divided into six datasets, each one constructed from 10 consecutive observation bouts, and spread out evenly through the 46-day period over which the data were collected.
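As an illustration of the group-membership rule above, the sketch below counts one interaction per pair for every observation round in which the two individuals share a group. The input layout (a list of rounds, each a list of sets of identifiers) and the function name are hypothetical choices for this example.

```python
from collections import Counter
from itertools import combinations


def group_rounds_to_weights(rounds):
    """Count co-membership interactions per pair of individuals.

    rounds: list of observation rounds; each round is a list of groups,
    and each group is a set of individual identifiers (assumed layout).
    Returns a Counter mapping (i, j) with i < j to the interaction count.
    """
    weights = Counter()
    for groups in rounds:
        for group in groups:
            for i, j in combinations(sorted(group), 2):
                weights[(i, j)] += 1
    return weights


rounds = [
    [{"a", "b", "c"}, {"d", "e"}],
    [{"a", "b"}, {"c", "d", "e"}],
]
weights = group_rounds_to_weights(rounds)
```

The resulting counts play the role of edge weights: pairs seen together in both rounds get weight 2, and pairs never seen together are absent from the counter.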

For the grooming data (Butovskaya et al., 1994; Sade, 1972), if one animal was grooming another during one round of observations, this was recorded as a directed interaction. Aggressive interactions (Parker, 1974; Takahata, 1991; Hass, 1991; Lott, 1979; Schein and Fohrman, 1955; Hobson and DeDeo, 2015) were handled similarly. These data are typically collected over a period of days or weeks. When an animal was determined to be the winner of a dominance encounter, this was recorded as a directed interaction between the winner and the loser. We consider an interaction in either direction to be a contact in the network.

We considered including two rodent studies in which interaction is defined as being observed within the same territorial space (Smith et al., 2009; Borremans et al., 2017). We did not find these suitable for our analysis, since the network we obtain and the consequent results are sensitive to the setting of arbitrary threshold values regarding what should, or should not, be considered sufficient contact for an interaction.

For data that did not contain the time of each interaction, contact time series were generated synthetically. For those networks, the interactions between each pair were given synthetic timestamps in three different ways. Poisson: the time of each interaction is chosen uniformly at random from $\{0, 1, \ldots, 10^4\}$ seconds. Circadian: the time is chosen uniformly at random from $\{0, 1, \ldots, 3333, 6666, \ldots, 10^4\}$. Bursty: interaction times occur with power-law distributed inter-event times adjusted to give an expected total duration of $10^4$ seconds.
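The three timestamp schemes can be sketched roughly as below. The Poisson and Circadian sets follow the description directly. The Bursty variant is only a guess at one possible construction: the text does not give the power-law exponent or the exact adjustment, so the Pareto shape `alpha=2.5` and the rescaling of gaps are assumptions, and all function names are hypothetical.

```python
import random

T_MAX = 10**4  # length of the synthetic window, in seconds


def poisson_times(n, rng):
    """n timestamps drawn uniformly from {0, 1, ..., T_MAX}."""
    return sorted(rng.randint(0, T_MAX) for _ in range(n))


def circadian_times(n, rng):
    """n timestamps drawn uniformly from {0..3333} and {6666..T_MAX}."""
    allowed = list(range(0, 3334)) + list(range(6666, T_MAX + 1))
    return sorted(rng.choice(allowed) for _ in range(n))


def bursty_times(n, rng, alpha=2.5):
    """n timestamps with Pareto-distributed gaps (ASSUMED exponent alpha > 1).

    Gaps are rescaled so that the expected time of the last event is T_MAX.
    """
    mean_unit_gap = alpha / (alpha - 1.0)  # mean of rng.paretovariate(alpha)
    scale = T_MAX / (n * mean_unit_gap)
    times, t = [], 0.0
    for _ in range(n):
        t += scale * rng.paretovariate(alpha)
        times.append(t)
    return times


rng = random.Random(1)
p = poisson_times(50, rng)
c = circadian_times(200, rng)
b = bursty_times(20, rng)
```

Passing an explicit `random.Random` instance keeps the synthetic series reproducible, which helps when comparing results across the three schemes.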

Disease simulation

Simulations of disease spread were executed using the contacts provided by the datasets. The bat network was omitted from this part, since these data were collected over a series of independent experiments carried out at different times and under different experimental treatments.

In one run of the simulation, one seed node is randomly chosen from the network and, at a randomly selected point in time during the duration of the data, transitions to the infectious state. The duration for which they remain infectious is a random variable drawn from an exponential distribution with mean 1/γ. During this time, any contact they have with other individuals who have not previously been infected will cause an infection with probability β.

The simulation runs until all individuals who were infected at the second generation of the disease, that is, those infected by those infected by the seed, have recovered. The datasets are ‘looped’ to ensure that the timeframe of the data collection does not influence the outcome. In other words, immediately after the latest interaction, the interactions are repeated exactly as they were originally. This continues until the termination criterion is met.
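A single run of the procedure described above might look like the following sketch. It is an illustration under stated assumptions rather than the authors' code: contacts are taken to be `(t, i, j)` tuples with `0 <= t < duration`, contacts are treated as instantaneous and symmetric, and the function returns the number of infected individuals per generation, leaving open how a reproductive number is read off from those counts.

```python
import bisect
import random


def simulate_outbreak(contacts, duration, beta, gamma, rng):
    """One simulation run on looped contact data.

    contacts: list of (t, i, j) with 0 <= t < duration (assumed layout).
    Returns a dict mapping generation number to the count of infected nodes.
    """
    contacts = sorted(contacts, key=lambda c: c[0])
    times = [t for t, _, _ in contacts]
    nodes = sorted({n for _, i, j in contacts for n in (i, j)})

    seed = rng.choice(nodes)
    t0 = rng.uniform(0, duration)
    # node -> (generation, recovery time); infectious period has mean 1/gamma
    infected = {seed: (0, t0 + rng.expovariate(gamma))}

    n = len(contacts)
    k = bisect.bisect_left(times, t0)  # first contact at or after seeding
    while True:
        t_raw, i, j = contacts[k % n]
        t = t_raw + (k // n) * duration  # looping: shift by whole passes
        # Stop once the seed, generation 1 and generation 2 have all recovered.
        if all(rec <= t for gen, rec in infected.values() if gen <= 2):
            break
        for src, dst in ((i, j), (j, i)):
            if src in infected and dst not in infected:
                gen, rec = infected[src]
                if t < rec and rng.random() < beta:
                    infected[dst] = (gen + 1, t + rng.expovariate(gamma))
        k += 1

    counts = {}
    for gen, _ in infected.values():
        counts[gen] = counts.get(gen, 0) + 1
    return counts


contacts = [(10, "a", "b"), (20, "b", "c"), (30, "c", "d"), (40, "a", "d")]
no_spread = simulate_outbreak(contacts, 100, 0.0, 0.01, random.Random(0))
full_spread = simulate_outbreak(contacts, 100, 1.0, 0.01, random.Random(2))
```

Because every infected individual is stored permanently in `infected`, a recovered node can never be infected again, which matches the rule that only individuals "who have not previously been infected" are at risk.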

We set the parameters to normalise for the variation in contact rates between networks. To achieve this, we consider a hypothetical counterpart to each network in which the strength of every node is unchanged, but each interaction occurs between a pair of individuals who have not previously interacted. This is equivalent to $\phi\to\infty$. Under these conditions $x_{j|i}=1/(N-1)$ for all pairs $i,j$. It follows that Equation (5) becomes $T_{ij}\approx s_i\beta/[\gamma\tau(N-1)]$, then $r(s_i)\approx s_i\beta/(\gamma\tau)$, and, since $k_i=s_i$ for all nodes $i$, Equation (7) gives

(8) $$R_0=R_0^{\text{Est}}(\{s_i\},\{s_i\},\tau,\beta,\gamma)=\frac{\beta\sum_i s_i^2}{\gamma\tau\sum_i s_i}$$

The value of $R_0$ can be chosen arbitrarily. Then, by setting $\gamma=1/\tau$ and $\beta=R_0\sum_i s_i/\sum_i s_i^2$ we guarantee that Equation (8) holds for every network. To test that our results hold over a range of disease scenarios, we repeat our analysis with $R_0=2$, 3, and 4.
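The calibration can be checked numerically with a short sketch. The function names and the example strengths are hypothetical; `tau` is treated here simply as a given timescale, since its definition lies outside this section.

```python
def calibrate(strengths, r0, tau):
    """Return (beta, gamma) so that Equation (8) yields the chosen r0."""
    gamma = 1.0 / tau
    beta = r0 * sum(strengths) / sum(s * s for s in strengths)
    return beta, gamma


def r0_equation_8(strengths, beta, gamma, tau):
    """Right-hand side of Equation (8)."""
    return beta * sum(s * s for s in strengths) / (gamma * tau * sum(strengths))


strengths = [3, 5, 8, 2]          # illustrative node strengths s_i
beta, gamma = calibrate(strengths, r0=3.0, tau=100.0)
recovered_r0 = r0_equation_8(strengths, beta, gamma, tau=100.0)

# Doubling every strength halves the calibrated beta.
beta_doubled, _ = calibrate([2 * s for s in strengths], r0=3.0, tau=100.0)
```

The last line illustrates the normalising effect of this choice: a network with twice as many contacts per node receives half the per-contact transmission probability, so differences in raw contact frequency do not drive differences in the simulated outcome.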

Data availability

All scripts, data, and documentation used in this study are available through https://github.com/EwanColman/Social-Fluidity (copy archived at https://archive.softwareheritage.org/swh:1:rev:90b27e1b84ce4417633885cd260c89bbf1b07eac).

The following previously published data sets were used
    1. Modlmeier AP
    2. Colman E
    3. Hanks EM
    4. Bringenberg R
    5. Bansal S
    6. Hughes DP
    (2019) Dryad Digital Repository
    Ant colonies maintain social homeostasis in the face of decreased density.
    https://doi.org/10.5061/dryad.sh4m4s6
    1. Hobson EA
    2. DeDeo S
    (2016) Dryad Digital Repository
    Social feedback and the emergence of rank in animal society.
    https://doi.org/10.5061/dryad.p56q7
    1. Mourier J
    2. Brown C
    3. Planes S
    (2016) Dryad Digital Repository
    Learning and robustness to catch-and-release fishing in a shark social network.
    https://doi.org/10.5061/dryad.gg859
    1. Carter CG
    2. Wilkinson GS
    (2013) Dryad Digital Repository
    Food sharing in vampire bats: reciprocal help predicts donations more than relatedness or harassment.
    https://doi.org/10.5061/dryad.tg7b1
    1. Levin II
    2. Zonana DM
    3. Fosdick BK
    4. Song SJ
    5. Knight R
    6. Safran RJ
    (2016) Dryad Digital Repository
    Stress response, gut microbial diversity and sexual signals correlate with social interactions.
    https://doi.org/10.5061/dryad.3jn35

References

  1. Book
    1. Abramowitz M
    2. Stegun I
    (1975)
    Handbook of Mathematical Functions
    Dover.
  2. Book
    1. de Jong MC
    2. Diekmann O
    3. Heesterbeek J
    (1995) How does transmission of infection depend on population size?
    In: Mollison D, editors. Epidemic Models: Their Structure and Relation to Data. Cambridge University Press. pp. 1019–1022.
    https://doi.org/10.1007/BF02459495
  3. Book
    1. Dunbar RI
    2. Shultz S
    (2010) Bondedness and sociality
    In: Dunbar R. I, editors. Behaviour. Brill. pp. 775–803.
    https://doi.org/10.1163/000579510X501151
  4. Book
    1. Hölldobler B
    2. Wilson EO
    (2009)
    The Superorganism: The Beauty, Elegance, and Strangeness of Insect Societies
    WW Norton & Company.
    1. Silk JB
    (2007) The adaptive value of sociality in mammalian groups
    Philosophical Transactions of the Royal Society B: Biological Sciences 362:539–559.
    https://doi.org/10.1098/rstb.2006.1994
  5. Book
    1. Takahata Y
    (1991)
    Diachronic changes in the dominance relations of adult female japanese monkeys of the arashiyama b group
    In: Fedigan L. M, Asquith P. M, editors. The Monkeys of Arashiyama. Albany: State University of New York Press. pp. 123–139.
    1. Volz E
    2. Meyers LA
    (2007) Susceptible–infected–recovered epidemics in dynamic contact networks
    Proceedings of the Royal Society of London B: Biological Sciences 274:2925–2934.
    https://doi.org/10.1098/rspb.2007.1159

Decision letter

  1. Niel Hens
    Reviewing Editor; Hasselt University & University of Antwerp, Belgium
  2. Miles P Davenport
    Senior Editor; University of New South Wales, Australia
  3. Jari Saramaki
    Reviewer
  4. Niel Hens
    Reviewer; Hasselt University & University of Antwerp, Belgium

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Acceptance summary:

This manuscript makes an important point for studies of contagion in both human and animal populations. This paper provides a way of characterising heterogeneity in social systems. Furthermore, the proposed measure of social fluidity can be used to distinguish between different types of animal social systems. Hence, the measure is of relevance for studies of human and animal social networks.

Decision letter after peer review:

Thank you for submitting your article "Social fluidity mobilizes contagion in human and animal populations" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, including Niel Hens as the Reviewing Editor and Reviewer #3, and the evaluation has been overseen by Miles Davenport as the Senior Editor. The following individual involved in review of your submission has agreed to reveal their identity: Jari Saramaki (Reviewer #2).

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

Please pay particular attention to the comments by Reviewer #1: items 1 and 2 and Reviewer #3: item 1 which focus on the relevance of this work. The other comments address punctual issues and/or clarifications; they should be looked at carefully and it would be good to organise your reply based on topics rather than a point-by-point reply.

We would like to draw your attention to changes in our revision policy that we have made in response to COVID-19 (https://elifesciences.org/articles/57162). Specifically, we are asking editors to accept without delay manuscripts, like yours, that they judge can stand as eLife papers without additional data, even if they feel that they would make the manuscript stronger. Thus the revisions requested below only address clarity and presentation.

Reviewer #1:

This is a well written paper on an interesting and important topic, characterizing the social contact frequency and network degree into a single measure, social fluidity. The concept is explained and mathematically derived, then applied to an analysis of 50 network data sets spanning 13 human and animal species. The authors then apply that parameter into a mathematical model for infectious disease dynamics to characterize its impact on transmission potential, via R. This is useful in developing understanding of the measure, and demonstrating its applicability in epidemiology. I have only relatively minor suggestions, mostly related to clarifying ideas and terms in the manuscript.

1. The main concern is the treatment of time within the data and fluidity measure. The authors appropriately note this as a limitation in the Discussion section. Because the inputs to fluidity depend on degree and within-node contacts, the measurement time frame for both in each dataset is critical. The authors cite a paper to support the idea that degree scales with the observation length (citation 59), but that is a limited analysis of 3 human networks (2 of which were online convenience samples) that may not apply more broadly. It would be helpful to have more clarification on the range of time frames for data collection across the datasets to understand whether weighted cross-sectional networks are the best modeling approach here (versus modeling dynamic networks of edge formation and dissolution). This may be important in the comparative analysis of different contact types (aggression contacts versus grooming contacts).

2. The authors establish the importance of the fluidity measure for comparative empirical research well, but it would also be helpful to have further clarity on its advantages for modeling over current approaches that might model network structure with two or more parameters. What are the broader benefits of this approach?

3. It was not clear how the contact types (aggression versus grooming, eg) were defined. Were these a component of the secondary data, or did the current authors use a classification scheme? Some further details on the measurement of the data would be helpful.

4. What are the implications of treating the system as closed and the network consisting only of non-isolates on the empirical comparisons and the epidemic modeling? These are assumptions required by the data, but not always realistic and could have a meaningful impact on the outcomes (e.g., non-differential misclassification for aggression contacts that may change the network structure, or include many isolates). The importance of these assumptions may depend on the measurement timeframe (i.e., less important if short observations).

Reviewer #2:

This manuscript makes an important point for studies of contagion in both human and animal populations: the heterogeneity of contact frequencies matters a lot. The individual-level heterogeneity of weights/contact frequencies in egocentric networks is nicely captured by the concept of social fluidity and the model parametrised by $\phi$ whose fitted values clearly differ for datasets from different species (Figure 2). Finally, the spreading model shows that $\phi$ has clear effects on R0 – the effect of ego-network weight heterogeneity on disease transmission is something that I have hypothesised myself as well, so I congratulate the authors for getting there first!

In my view, this paper makes several important contributions: in addition to the context of contagious disease, it provides a way of characterising heterogeneity in social systems and shows that it even works for distinguishing human contact networks under different circumstances (the Sociopatterns data sets). Furthermore, the proposed measure of social fluidity can be used to distinguish between different types of animal social systems. Hence, the measure is of relevance for studies of human and animal social networks.

As the science is solid, the results are important, and the manuscript is well and clearly written, I recommend publishing it, after some fairly straightforward clarifications/modifications.

1) Would it be possible to justify the choice of the power-law form for Equation (2)? And would the results be sensitive to this choice – would using something like a log-normal or stretched exp yield similar results? (Intuitively, my expectation is that the exact form of the distribution should not matter too much as N is fairly low in all studied cases, so whatever is mathematically the most convenient distribution should be fine).

2) page 3, bottom left column: "There is no significant correlation between the mean number of interactions per individual (s) and social fluidity…" Has r^2 been calculated over all datasets or separately for one species over their respective data sets?

"…which implies that sampling bias does not affect the estimation of social fluidity". How this is to be interpreted depends on the answer to the above, but I am not certain if one can make this statement, at least if the correlation is over all species/datasets. I would think that it is difficult to escape some sampling bias (as for most network measures…), unless one has several samples of different size for the same species under the same circumstances, and can show in those samples that $\phi$ doesn't depend on N.

"Similarly, network size does not correlate with $\phi$…" Again, is the correlation over all species?

3) In the subsection "Numerical validation using empirical networks" it is stated that "Since a large edge weight implies a high frequency of repeated interactions, networks with a higher mean weight tend to have lower basic reproductive number. Furthermore, variability in the distribution of weights…"

Would the variability not be a requirement for a higher mean weight leading to lower R0, so that the cause of the lower R0 is the combination of higher weights and high variability? If one considers two networks with uniform weights that are otherwise identical but one has twice the mean weight, would that one not have a *higher* R0?

4) Discussion: "We see, for example, how the relationship between mixing and disease risk scales with population density. For social systems that have high values of social fluidity, $R_0^\phi$ is highly sensitive to changes in N…" Is N conceptually the same as population density? Would, under the network paradigm, the average degree be a better proxy of population density?

Reviewer #3:

The authors define the concept of social fluidity to better define how social behaviour influences contagion process in human and animal populations. Whereas I believe the manuscript is well written, its current version requires a few clarifications.

1. I think it's important to mention that the concept of social fluidity hasn't been tested in relation to infectious disease data. Does it provide a good/better fit to infectious disease data as compared to assumptions of frequency and density dependent mass action etc.

2. In the social behavior model: the authors use frequency for the edge weight; should weighing not be done on the basis of risk assessment of these interactions?

3. Please better motivate the use of the power-law form in equation (2). Have the authors considered alternatives?

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "Social fluidity mobilizes contagion in human and animal populations" for further consideration by eLife. Your revised article has been evaluated by Miles Davenport (Senior Editor) and a Reviewing Editor.

The manuscript has been improved but there are some remaining issues that need to be addressed, as outlined below:

– In terms of the authors' reply on 'Was this relationship linear … ': Given that a Pearson correlation coefficient assumes linearity holds, one cannot measure strength of association in case of non-linearity. Please verify or use an alternative measure (see e.g. https://www.pnas.org/content/111/9/3354).

– The authors use 100 bootstraps to quantify the uncertainty of phi; this seems small to me; why not use 1000 bootstraps (as well as assessing whether or not estimates of 5% and 95% percentiles are stable for that number)?

https://doi.org/10.7554/eLife.62177.sa1

Author response

Please pay particular attention to the comments by Reviewer #1: items 1 and 2 and Reviewer #3: item 1 which focus on the relevance of this work. The other comments address punctual issues and/or clarifications; they should be looked at carefully and it would be good to organise your reply based on topics rather than a point-by-point reply.

As suggested, we start by detailing our response and modifications to the manuscript that were mentioned by more than one reviewer, and then address our remaining responses to each individual reviewer.

Point 1: Power law

Reviewer 2: "1) Would it be possible to justify the choice of the power-law form for Equation (2)? And would the results be sensitive to this choice – would using something like a log-normal or stretched exp yield similar results? (Intuitively, my expectation is that the exact form of the distribution should not matter too much as N is fairly low in all studied cases, so whatever is mathematically the most convenient distribution should be fine)"

Reviewer 3: "3. Please better motivate the use of the power-law form in equation (2). Have the authors considered alternatives?"

It is true that the exact form of this distribution is not important, and other heavy-tailed distributions will give similar results. We show this in a supplementary analysis (Figure 1-supplement 1) by simulating the interactions of one individual whose interactions with other members of the group are determined by probabilities drawn from other distributions. Figure 1-supplement 1 shows that results obtained from log-Normal and power-law distributions look very similar and can cover a similar range of behaviours.

In the main text this is referenced in the "Social behavior model" section. When introducing the power-law it now says:

“The form of $ρ(x)$ was chosen for its analytical tractability but other heavy-tailed distributions produce a similar result (Figure S2).”

Point 2: Scope of this work

Reviewer 1: "The authors establish the importance of the fluidity measure for comparative empirical research well, but it would also be helpful to have further clarity on its advantages for modeling over current approaches that might model network structure with two or more parameters. What are the broader benefits of this approach?"

Reviewer 3: "1. I think it's important to mention that the concept of social fluidity hasn't been tested in relation to infectious disease data. Does it provide a good/better fit to infectious disease data as compared to assumptions of frequency and density dependent mass action etc."

We feel that the strength of our work resides in (i) introducing a concept, named social fluidity and expressed in mathematical form, that captures the distribution of interactions of individual hosts with varying frequency and strength, and (ii) showing that social fluidity is a stronger predictor of epidemic outcomes than commonly used metrics. While this specific study was not meant to fit the social fluidity model to infectious disease data, social fluidity was estimated from empirical networks and its role in disease spread was compared to simulated disease data on the original empirical networks. Finally, the theory we introduce is able to seamlessly connect between density-dependent and frequency-dependent approaches, so far considered independent and resting on disjoint frameworks.

Both comments suggest that we need to clarify the scope and purpose of this work. Firstly, we have added additional sentences to the abstract to emphasize the novelty…

“Large values of social fluidity correspond to gregarious behavior whereas small values signify the existence of persistent bonds between individuals.”

And the contribution to the field…

“…we demonstrate that social fluidity is a stronger predictor of disease outcomes than both group size and connectivity, and it provides an integrated framework for both density-dependent and frequency-dependent transmission.”

We have also changed the third paragraph of the introduction to better motivate the purpose of our study and the current gap in the literature that it addresses…

“While transmission rate has been observed to scale non-linearly with group size for a number of disease systems, it remains unclear how much this dependency is related to the internal social structure of the group; few studies observe social dynamics at sufficient detail while simultaneously monitoring the disease status of each individual. In the absence of direct observations, our contribution to this discussion centers around modelling; incorporating empirical social data with computational simulations. We address two specific questions. Firstly, can we quantify the variability in how individuals choose to distribute their social effort within a group, and secondly, what will this tell us about the effect that population density has on disease transmission?”

Additionally, in the first paragraph of "Estimating social fluidity in empirical networks" we mention one specific advantage of the model over models that have more parameters…

“The advantage of using this model over more detailed network descriptions is that we obtain a single parameter estimate, $\phi$ that is easily compared across animal species and environments.”

Reviewer #1:

1. The main concern is the treatment of time within the data and fluidity measure. The authors appropriately note this as a limitation in the Discussion section.

We thank the reviewers for this comment. We have addressed this with a supplementary analysis of the networks for which temporal information is available (Figure 2-supplement 2). Briefly, it is a sensitivity analysis to see the effects of using a shorter time frame. References to this analysis are added to various parts of the main text and correspond to your more specific remarks as follows…

Because the inputs to fluidity depend on degree and within-node contacts, the measurement time frame for both in each dataset is critical.

Our supplementary analysis presents the values of phi calculated when only the first 50% of the observations are used. The estimation of social fluidity is largely insensitive to the sampling time frame, using two different methods of estimation. This is referenced in the "Discussion" section …

“However, since our model focuses not on the absolute value of degree, but on how degree scales with the number of observations, the results we obtain are relatively robust against this variability (Figure S3A)”

The authors cite a paper to support the idea that degree scales with the observation length (citation 59), but that is a limited analysis of 3 human networks (2 of which were online convenience samples) that may not apply more broadly.

We checked this for the networks used in our study and include the results in the same supplementary figure. At the same point in the discussion we now add …

“This is also true of the networks used here (Figure 2 supplement 2).”

It would be helpful to have more clarification on the range of time frames for data collection across the datasets to understand whether weighted cross-sectional networks are the best modeling approach here (versus modeling dynamic networks of edge formation and dissolution). This may be important in the comparative analysis of different contact types (aggression contacts versus grooming contacts).

The time frames for data collection do vary greatly and this is something we want to be clear about. We include details of the datasets in the "Data handling" part of the "Materials and methods" section. In the "Discussion" section we mention this concern…

“We therefore consider the model to be applicable to the data analysed in this study, but advise caution when applying this approach to other data sources. If the duration of a study allows for substantial developments in the group structure, for example, then a model of edge formation and dissolution may be preferred.”

3. It was not clear how the contact types (e.g. aggression versus grooming) were defined. Were these a component of the secondary data, or did the current authors use a classification scheme? Some further details on the measurement of the data would be helpful.

These are all defined by the original studies. We have added a sentence to the "Data handling" part of the "Materials and methods" section to make this clear.

4. What are the implications of treating the system as closed and the network consisting only of non-isolates on the empirical comparisons and the epidemic modeling? These are assumptions required by the data, but not always realistic and could have a meaningful impact on the outcomes (e.g., non-differential misclassification for aggression contacts that may change the network structure, or include many isolates). The importance of these assumptions may depend on the measurement timeframe (i.e., less important if short observations).

We agree with the reviewer that isolates (individuals that do not interact during the time frame of observation) are an important data limitation to consider as they can affect the value of N (group size). In our supplementary analysis we tested this directly by varying the time frame of observation, and thus changing the number of isolates.

The following was added to the end of the section "Estimating social fluidity in empirical networks":

“As with any applied modelling, the validity of these results depends on the extent to which each study system conforms to the assumptions of the model. The value of $N$, for example, might not represent the true group size if some individuals in the group did not have their interactions recorded, or if there are individuals who did not interact during the time-frame of observation. While we found that variation in the value of $N$ did not have a large impact on the estimated value of $\phi$, as shown in Figure S3D, we warn that the amount of consistency between model assumptions and the conditions of each study will vary, and close consideration should be given to the way data were collected when interpreting these results.”

Reviewer #2:

2) page 3, bottom left column: "There is no significant correlation between the mean number of interactions per individual (s) and social fluidity…" Has r^2 been calculated over all datasets or separately for one species over their respective data sets?

The results were presented for all species, and we have added a set of supplementary tables containing correlations between all the social variables within species. This is referenced in the section "Estimating social fluidity in empirical networks" as follows …

Across the $57$ networks, there is no significant correlation between the mean number of interactions per individual ($\bar{s}$) and social fluidity (Pearson $r^{2}=0.02$, $p=0.27$), and in general no correlation across sets of networks taken from the same study (Tables S2).

"…which implies that sampling bias does not affect the estimation of social fluidity". How this is to be interpreted depends on the answer to the above, but I am not certain if one can make this statement, at least if the correlation is over all species/datasets. I would think that it is difficult to escape some sampling bias (as for most network measures…), unless one has several samples of different size for the same species under the same circumstances, and can show in those samples that $\phi$ doesn't depend on N.

We thank the reviewer for this comment – we have removed that statement now. Instead we say:

“Thus there is no evidence that the number of observations per individual affects the estimated value of $\phi$.”

"Similarly, network size does not correlate with $\phi$…" Again, is the correlation over all species?

As before, we have now added that information in the supplement. The main results still hold.

3) In the subsection "Numerical validation using empirical networks" it is stated that "Since a large edge weight implies a high frequency of repeated interactions, networks with a higher mean weight tend to have lower basic reproductive number. Furthermore, variability in the distribution of weights…"

Would the variability not be a requirement for a higher mean weight leading to lower R0, so that the cause of the lower R0 is the combination of higher weights and high variability? If one considers two networks with uniform weights that are otherwise identical, but one has twice the mean weight, would that one not have a ‘higher’ R0?

An important part of the disease model is the calibration of the β parameter to control for the effect of contact frequency, which varies greatly between different networks. Since mean weight is closely related to the contact frequency (i.e. the mean strength divided by the length of the time frame), choosing β in the way we do causes networks with high mean weight to have lower values of β (all else being equal). In the example you provide, the network with higher mean weight would indeed have lower R0, since the mean strength would also double and the β we apply would be halved.

We have explained this in a bit more depth in the revised version:

“While contact frequency is known to be one of the major contributors to disease risk, calibrating $\beta$ in this way eliminates its effect, allowing the contribution of other network characteristics to be compared. Thus, the mean strength does not have a significant effect on $R_{0}^{\text{Sim}}(g)$, and higher mean edge weight does not necessarily imply higher transmission probability over the edges of the network.”
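The reviewer's two-network example can be worked through numerically. The sketch below is illustrative only and rests on two assumptions that are not taken from the paper: that β is set inversely proportional to the mean strength (with a hypothetical constant `target`), and that each recorded interaction transmits independently with probability β. The function names are likewise hypothetical.

```python
def mean_strength(weights, n_nodes):
    """Mean strength: every undirected edge weight counts for both endpoints."""
    return 2 * sum(weights.values()) / n_nodes


def calibrated_beta(weights, n_nodes, target=1.0):
    """Hypothetical calibration: beta is inversely proportional to mean strength."""
    return target / mean_strength(weights, n_nodes)


def edge_transmission_probability(beta, weight):
    """Chance of at least one transmission over `weight` independent contacts."""
    return 1 - (1 - beta) ** weight


# Two 4-node ring networks with uniform weights; the second doubles every weight.
base = {(0, 1): 2, (1, 2): 2, (2, 3): 2, (3, 0): 2}
doubled = {edge: 2 * w for edge, w in base.items()}

beta_base = calibrated_beta(base, n_nodes=4)
beta_doubled = calibrated_beta(doubled, n_nodes=4)
p_base = edge_transmission_probability(beta_base, 2)
p_doubled = edge_transmission_probability(beta_doubled, 4)

print(beta_base, beta_doubled)  # beta is halved when the weights double
print(p_doubled < p_base)       # per-edge transmission probability drops
```

Under these assumptions, doubling every weight halves β, and the probability of transmission over any single edge falls slightly rather than rising, which is consistent with the direction of the effect described in the response.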

4) Discussion: "We see, for example, how the relationship between mixing and disease risk scales with population density. For social systems that have high values of social fluidity, $R_0^\phi$ is highly sensitive to changes in N…" Is N conceptually the same as population density? Would, under the network paradigm, the average degree be a better proxy of population density?

We have changed "population density" to "group size" to more accurately reflect what we found.

1) Figure 2 – it took me a while to understand that there are several data sets for each species (or Sociopatterns setting), so this could be mentioned in the caption.

2) This is probably an error in production, but there is a full-page figure (k vs s for the different datasets) at the end of the PDF without any caption that is also not referred to in the text (unless I missed it).

Thank you for pointing these things out. We have added "Colours correspond to the species and the setting in the case of human networks" to the figure caption. The supplementary figure 2 is now included in the new supplementary document.

Reviewer #3:

2. In the social behavior model: the authors use frequency for the edge weight; should weighing not be done on the basis of risk assessment of these interactions?

In general, yes, we agree that it makes sense to define "interaction" in terms of a disease. In our study, however, we do not specify any particular disease. Specifically, in the section "Estimating infection spread in empirical networks with heterogeneous connectivity" we specify how the transmission risk parameter (β) is selected:

“… value of $\beta$ was chosen for each network to control for the varying interaction rates between networks … While contact frequency is known to be one of the major contributors to disease risk, calibrating $\beta$ in this way eliminates its effect, allowing the contribution of other network characteristics to be compared.”

To address your concern we have added the following to the part of "Discussion" that mentions the limitations of this study:

“Finally, we do not know the extent to which an interaction, as defined for each network, is capable of transmission which can depend on the pathogen's transmission mode and the infectious dose required.”

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

The manuscript has been improved but there are some remaining issues that need to be addressed, as outlined below:

– In terms of the authors' reply on 'Was this relationship linear … ': Given that a Pearson correlation coefficient assumes linearity holds, one cannot measure strength of association in case of non-linearity. Please verify or use an alternative measure (see e.g. https://www.pnas.org/content/111/9/3354).

We want to know if the size of the network, or the number of observations in the sample, has an effect on our introduced measure, social fluidity. We therefore think it is appropriate to test for a monotonic relationship, and have switched from using the Pearson coefficient to the Spearman coefficient, which yields similar results. We further address whether there may be a non-monotonic relationship using mutual information as suggested.

This is now addressed in the section "Estimating social fluidity in empirical networks" as follows:

Across the $57$ networks, there is no evidence that social fluidity scales with the size of the network or the number of observations per individual. No correlation was found between the mean number of interactions per individual ($\bar{s}$) and social fluidity when testing for a monotonic relationship between the variables (Spearman $r^{2}=0.02$, $p=0.36$), and in general no correlation across sets of networks taken from the same study (Tables S2). Similarly, network size ($N$) does not correlate with $\phi$ (Spearman $r^{2}=0.02$, $p=0.28$). To test for a non-monotonic relationship, we partition the set of networks into $10$ equally sized groups according to each of the two measures being compared, and compute the adjusted mutual information (AMI) of the two groupings. We find AMI=$0.15$ for the relationship between $\phi$ and $N$, and AMI=$0.2$ between $\phi$ and $\bar{s}$. While non-negative values of AMI typically indicate a non-random relationship, an inherent amount of clustering is to be expected in data aggregated from a diverse range of sources.
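The two checks described in this passage can be sketched with the Python standard library. This is a minimal illustration, not the authors' code: the data are synthetic stand-ins for the 57 networks, the helper names are hypothetical, and the last function computes plain mutual information rather than the adjusted version (AMI additionally corrects for the mutual information expected by chance, which is not implemented here).

```python
import math
import random
from collections import Counter


def ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman coefficient: the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def equal_groups(values, n_groups=10):
    """Label each value with the index of its (near) equally sized group."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for position, idx in enumerate(order):
        labels[idx] = position * n_groups // len(values)
    return labels


def mutual_information(a, b):
    """Plain (unadjusted) mutual information of two labelings, in nats."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        (c / n) * math.log(c * n / (pa[u] * pb[v]))
        for (u, v), c in pab.items()
    )


# Synthetic stand-ins for the 57 networks (not the paper's data).
rng = random.Random(0)
phi = [rng.random() for _ in range(57)]
size = [rng.randint(10, 200) for _ in range(57)]

rho_squared = spearman(phi, size) ** 2
mi = mutual_information(equal_groups(phi), equal_groups(size))
print(rho_squared, mi)
```

With 57 values and 10 groups, the groups cannot be exactly equal in size, so `equal_groups` produces groups of five or six networks each.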

We have also added the measure of mutual information to Table 1 for comparing analytical and simulated values of R0. Note that we still use the Pearson coefficient here as we want to show that this relationship is linear.

– The authors use 100 bootstraps to quantify the uncertainty of phi; this seems small to me; why not use 1000 bootstraps (as well as assessing whether or not estimates of 5% and 95% percentiles are stable for that number)?

We have increased the number of bootstrap samples to 1000 and updated Figure 2.
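The stability check the editors ask about can be sketched as a percentile bootstrap run at both sample counts. This is an illustrative sketch under stated assumptions: the observations are made up, `statistics.mean` stands in for the estimator of $\phi$ (which is not reproduced here), and the resampling unit used in the paper may differ.

```python
import random
import statistics


def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in [0, 100]) of pre-sorted data."""
    pos = (len(sorted_values) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)


def bootstrap_interval(data, estimator, n_boot=1000, seed=0):
    """5th and 95th percentiles of the estimator over resampled data."""
    rng = random.Random(seed)
    estimates = sorted(
        estimator(rng.choices(data, k=len(data))) for _ in range(n_boot)
    )
    return percentile(estimates, 5), percentile(estimates, 95)


# Hypothetical observations; statistics.mean stands in for the phi estimator.
observations = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9]

for n_boot in (100, 1000):
    low, high = bootstrap_interval(observations, statistics.mean, n_boot=n_boot)
    print(n_boot, round(low, 2), round(high, 2))
```

Comparing the printed bounds for 100 and 1000 resamples, and repeating with different seeds, gives a rough sense of whether the percentile estimates have settled.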

https://doi.org/10.7554/eLife.62177.sa2

Article and author information

Author details

  1. Ewan Colman

    1. Department of Biology, Georgetown University, Washington, United States
    2. Roslin Institute, University of Edinburgh, Midlothian, United Kingdom
    Contribution
    Conceptualization, Data curation, Software, Formal analysis, Investigation, Visualization, Methodology, Writing - original draft, Writing - review and editing
    For correspondence
    ecolman@ed.ac.uk
    Competing interests
    No competing interests declared
    ORCID: 0000-0003-2551-8589
  2. Vittoria Colizza

    INSERM, Sorbonne Université, Institut Pierre Louis d’Épidémiologie et de Santé Publique (IPLESP UMRS 1136), F75012, Paris, France
    Contribution
    Methodology, Writing - review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-2113-2374
  3. Ephraim M Hanks

    Department of Statistics, Eberly College of Science, Penn State University, State College, United States
    Contribution
    Funding acquisition, Methodology, Writing - review and editing
    Competing interests
    No competing interests declared
  4. David P Hughes

    Department of Entomology, College of Agricultural Sciences, Penn State University, State College, United States
    Contribution
    Funding acquisition, Writing - review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-9954-8919
  5. Shweta Bansal

    Department of Biology, Georgetown University, Washington, United States
    Contribution
    Conceptualization, Supervision, Funding acquisition, Investigation, Methodology, Writing - review and editing
    For correspondence
    shweta.bansal@georgetown.edu
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-1740-5421

Funding

National Science Foundation (Award 141429)

  • Ewan Colman
  • Ephraim M Hanks
  • David P Hughes
  • Shweta Bansal

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We are grateful to Andreas Modlmeier for his involvement in the inception of this project. We are grateful for insightful feedback from Pratha Sah and Daniela Gerwehns. We also thank all the researchers who have made their behavioral data openly accessible, making this study possible.

Senior Editor

  1. Miles P Davenport, University of New South Wales, Australia

Reviewing Editor

  1. Niel Hens, Hasselt University & University of Antwerp, Belgium

Reviewers

  1. Jari Saramaki
  2. Niel Hens, Hasselt University & University of Antwerp, Belgium

Publication history

  1. Received: August 17, 2020
  2. Accepted: June 25, 2021
  3. Version of Record published: July 30, 2021 (version 1)

Copyright

© 2021, Colman et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
