Social learning mechanisms shape transmission pathways through replicate local social networks of wild birds

  1. Kristina B Beck (corresponding author)
  2. Ben C Sheldon
  3. Josh A Firth
  1. Edward Grey Institute of Field Ornithology, Department of Biology, University of Oxford, United Kingdom

Abstract

The emergence and spread of novel behaviours via social learning can lead to rapid population-level changes whereby the social connections between individuals shape information flow. However, behaviours can spread via different mechanisms and little is known about how information flow depends on the underlying learning rule individuals employ. Here, comparing four different learning mechanisms, we simulated behavioural spread on replicate empirical social networks of wild great tits and explored the relationship between individual sociality and the order of behavioural acquisition. Our results reveal that, for learning rules dependent on the sum and strength of social connections to informed individuals, social connectivity was related to the order of acquisition, with individuals with increased social connectivity and reduced social clustering adopting new behaviours faster. However, when behavioural adoption depends on the ratio of an individual’s social connections to informed versus uninformed individuals, social connectivity was not related to the order of acquisition. Finally, we show how specific learning mechanisms may limit behavioural spread within networks. These findings have important implications for understanding whether and how behaviours are likely to spread across social systems, the relationship between individuals’ sociality and behavioural acquisition, and therefore for the costs and benefits of sociality.

Editor's evaluation

This valuable study will be of interest to researchers in the fields of behavioural ecology, social ecology and evolution, and network science. The authors use simulations on empirically-recorded great tit social networks to examine how behavioural contagion might spread through social groups if individuals follow different social learning rules. The evidence supporting the conclusions is convincing, with careful modeling and parameterization for the chosen system.

https://doi.org/10.7554/eLife.85703.sa0

Introduction

Social learning, in which individuals learn from others, is widespread in the animal kingdom, and enables individuals to acquire novel behaviours facilitating phenotypic change (Heyes, 1994; Hoppitt and Laland, 2013; Whiten, 2021). Socially induced changes in behaviour can spread through a population and social networks provide the pathways along which behaviour can spread (Hasenjager et al., 2021). Research increasingly shows how the structure of social networks and individual sociality can together influence information flow (Aplin et al., 2012; Evans et al., 2021; Kulahci et al., 2016; Romano et al., 2018; Voelkl and Noë, 2008). However, information can spread via various social learning mechanisms (Cantor et al., 2021; Evans et al., 2021; Firth, 2020; Nunn et al., 2009) and we know surprisingly little about how the relationship between sociality and information flow depends on the social learning mechanisms at play.

By definition, the social spread of behaviour to a focal individual requires contact with at least one knowledgeable individual. Frequently, it is intuitively assumed that the extent (i.e. number and duration) of social contacts to knowledgeable others predicts the likelihood of adoption (Coussi-Korbel and Fragaszy, 1995; Franz and Nunn, 2009; Hasenjager et al., 2021). In this way, behavioural spread is predicted to follow a similar pattern to the transmission of many diseases. However, in contrast to disease spread, individuals may employ ‘social learning rules’ where they can actively shape how to act on acquired novel information. For example, for more costly behaviours such as the usage of a novel food type, an individual may only change its behaviour after the majority of its social contacts consumes the novel food. Consequently, behavioural spread may require exposure to multiple sources (rather than just one) and depends on the ratio of connections to both informed and uninformed individuals (rather than just the connections to informed others) (Centola and Macy, 2007; Firth et al., 2020; Guilbeault et al., 2018; Hodas and Lerman, 2014). Therefore, the type of behaviour considered and its underlying learning rule can fundamentally influence whether and how behaviour spreads through a social network (Centola and Macy, 2007; Firth et al., 2020).

In sociology, research increasingly demonstrates that the spread of various behaviours, from innovations, to health, and political movements (Guilbeault et al., 2018), follows diverse and more complex learning rules compared to the assumptions of many disease models (Centola and Macy, 2007). In contrast, research in animal systems has rarely explored how the diffusion dynamics of behaviours may be altered by learning rules (but see Nunn et al., 2009; Cantor et al., 2021; Evans et al., 2020; Evans et al., 2021), which is somewhat surprising given that previous studies have revealed several social learning strategies in animals that suggest a range of different underlying social learning mechanisms (Hoppitt and Laland, 2013; Kendal et al., 2018). For instance, an increasingly reported learning mechanism is conformist learning in which individuals disproportionately adopt the behaviour performed by the majority of their social connections (e.g. sticklebacks: Pike and Laland, 2010; chimpanzees: Haun et al., 2012; vervet monkeys: van de Waal et al., 2013; great tits: Aplin et al., 2015a; fruit flies: Danchin et al., 2018). Further, individuals often only learn from specific individuals (e.g. depending on status, Canteloup et al., 2020; relatedness, Wild et al., 2019; or conspecifics, Farine et al., 2015) or adopt behaviours only once the social connections to informed others surpass a certain threshold (Rosenthal et al., 2015).

Research on social learning in animal social networks has frequently assumed that more social individuals (i.e. with more social connections and central network positions) have a higher probability to adopt new behaviours because they are more likely to hold connections to knowledgeable others compared to less social individuals (Aplin et al., 2012; Claidière et al., 2013; Kulahci and Quinn, 2019). This link between individual sociality and behavioural adoption can be expected if the learning rule depends on the sum and strength of social connections to knowledgeable others. However, this relationship may change when learning rules rely on both the connections to informed and uninformed individuals (Centola and Macy, 2007; Firth, 2020). For instance, in the case of conformist learning, we may expect that the most social individuals will be less likely to adopt (because it may take longer until the majority of their social connections becomes informed). Such patterns have been reported in humans, where highly connected individuals required stronger social signals in order to act on information (Hodas and Lerman, 2014; Hodas and Lerman, 2012) and poorly connected individuals may utilize information sooner (González-Avella et al., 2011). Hence, predictions of how individual sociality relates to the probability of acquiring novel behaviour, and the resulting transmission pathways, can change fundamentally depending on the social learning mechanism at play (Centola and Macy, 2007; Firth et al., 2020).

Examining and comparing the transmission pathways of behaviours that follow different learning mechanisms in wild animals is challenging. Therefore, research investigating the relationship between social structure and information flow often simulates behavioural spread (Cantor et al., 2021; Evans et al., 2021; Evans et al., 2020; Nunn et al., 2009; Voelkl and Noë, 2008). For instance, studies have compared the transmission speed (number or proportion of individuals informed at a given timestep) of simple versus conformity learning (Evans et al., 2021; Evans et al., 2020) or ‘prestige’ (subordinates copy dominants) versus conformity learning (Nunn et al., 2009). While these studies show that, on the population level, different learning mechanisms, together with the social network structure, can fundamentally impact the diffusion dynamics (e.g. how quickly a behaviour can spread), we know little about how learning mechanisms impact the relationship between individual sociality and the probability of behavioural adoption.

In addition, behavioural simulations are often performed on artificial social networks with pre-defined structure and size (Voelkl and Noë, 2008; Nunn et al., 2009; Cantor et al., 2021; Evans et al., 2021), and may thus represent unrealistic social structures, failing to capture the social behaviour observed in real animal social networks (but see Naug, 2008; Romano et al., 2018). Therefore, in addition to purely computational studies, it is important to examine real-world social networks to test whether general findings from artificial networks can be replicated using real social systems. Further, simulated social networks are often relatively large including 100 or more individuals, and empirical social networks may be generated over prolonged periods of time. However, for behavioural spread, an individual’s social connections at a relatively small temporal scale may predict subsequent transmission (Aplin et al., 2015a; Aplin et al., 2015b; Somveille et al., 2018). Many animal species live in non-stable social groups such as fission–fusion societies (e.g. various species of birds: Silk et al., 2014, primates: Amici et al., 2008, and fish: Papastamatiou et al., 2020; Wilson et al., 2014) where group composition and size frequently change. As a result, social connections between individuals can change over time, and empirical networks generated over weeks/months, and artificial, large networks, may overestimate the social connections of an individual at the time a new behaviour emerges. Thus, it is crucial to examine social networks – both empirical and artificially derived – on a meaningful temporal scale (which will be study species dependent) to better understand whether and how different types of behaviours spread through social networks.

In this study, we explore by simulation how novel behaviours, transmitted according to different social learning mechanisms, spread through replicated empirical social networks of great tits (Parus major). Great tits are small songbirds that forage in fission–fusion mixed-species flocks during winter (Ekman, 1989) and frequently use social information (e.g. to find novel food: Aplin et al., 2012; Firth et al., 2016, to access novel food: Aplin et al., 2015a, and for prey avoidance: Hämäläinen et al., 2020; Thorogood et al., 2018) which makes them an ideal study species. Here, we create social networks from empirical data on birds’ foraging associations at distinct locations sampled on two days each week to capture the social structure at a relatively small spatiotemporal scale. Subsequently, we simulate behavioural spread on these weekly, local, networks using four different social learning mechanisms and compare how the social behaviour of individual great tits relates to the order in which they acquire novel behaviour under the four different mechanisms.

The first learning mechanism follows the omnipresent concept of simple contagion, which is mainly inspired by models on disease spread and was first formulated in the field of sociology (Guilbeault et al., 2018). Simple contagion assumes that the probability of adopting a novel behaviour depends on the number and strength of connections to informed individuals (hereafter simple rule, Coussi-Korbel and Fragaszy, 1995; Franz and Nunn, 2009; Hasenjager et al., 2021). The other three learning mechanisms imply more complex adoption rules (Centola and Macy, 2007) where behavioural adoption requires more social reinforcement: (1) a threshold rule, (2) a proportion rule, and (3) a conformity adoption rule. Here, the probability of adopting the novel behaviour depends on: (1) the connections to informed individuals surpassing a given threshold; (2) the proportion of connections to informed individuals (rather than the sum); and (3) the behaviour that the majority of connections performs. Threshold-based learning rules have been studied frequently in sociology and network sciences (González-Avella et al., 2011; Granovetter, 1978; Watts, 2002), but have rarely been considered in animals (Rosenthal et al., 2015). In contrast, conformity learning, where individuals are disproportionately more likely to copy the behaviour performed by the majority, has received much attention both in humans (Boyd and Richerson, 1988; Haun et al., 2012; Toyokawa and Gaissmaier, 2022) and animals (Aplin et al., 2015a; Danchin et al., 2018; van de Waal et al., 2013). The proportion rule assumes that the transmission rate is proportional to the ratio of informed and uninformed individuals (rather than disproportional as in the conformity rule) and has rarely been considered (Centola, 2018; Firth, 2020; Rosenthal et al., 2015).

Individual variation in sociality – the number and strength of social connections and centrality within the network – may influence the access to information and thus behavioural adoption. We infer individuals’ sociality by extracting three commonly used weighted social network metrics: the weighted degree (i.e. sum and strength of their social connections to others), weighted clustering coefficient (propensity for their associates to be associated with one-another), and weighted betweenness (propensity to act as a ‘bridge’ within the network). We predicted that the relationship between individual sociality and behavioural adoption would differ depending on the social learning mechanism. Specifically, if the likelihood to adopt a behaviour depends on the number and strength of connections to informed individuals such as in the case for the simple and threshold rule, we predicted that individuals with high degree and betweenness and low clustering coefficient should be faster in adopting the novel behaviour due to being more likely to be connected to at least one informed conspecific. In contrast, if the likelihood of adopting a behaviour depends on the ratio of an individual’s informed and uninformed connections, such as in the case for the proportion or conformity rule, we expected that individuals with low degree and betweenness and high clustering coefficient should be faster in adopting the novel behaviour because the majority of their social connections should become informed faster.

Materials and methods

Study system

The empirical data used in this study were collected over 3 years (December 2011–March 2014) in a population of great tits located in Wytham Woods, Oxfordshire, UK (51°46′ N, 01°20′ W, approx. 385 ha). Great tits are short-lived (mean lifespan of 1.9 years, Bulmer and Perrins, 1973) hole-nesting songbirds that form socially monogamous pairs, and establish territories during the breeding season (March–June). During the non-breeding season (September–February), great tits forage with other species in loose fission–fusion flocks that differ in size and composition (Ekman, 1989; Hinde, 1952) and consist of mainly unrelated individuals (annual population turnover of about 50% and less than 1.5% of social foraging associations are between first-order relatives, Firth and Sheldon, 2016). Great tits frequently use social information in foraging contexts (Aplin et al., 2012; Farine et al., 2015; Firth et al., 2016; Thorogood et al., 2018).

The woodland contains 1017 nest boxes hosting breeding great tits and 65 bird feeders that were deployed during the winter months in an evenly spaced grid (see Figure 1). Each feeder contained two access holes, both of which were equipped with radio-frequency identification (RFID) antennas. The feeders were in place from December to February across three winters (2011–2012, 2012–2013, and 2013–2014) and collected data on bird visits on 2 days each week (from pre-dawn Saturday morning until after dusk on Sunday evening), resulting in 13 sampling periods each year. At other times feeders were closed. For the duration of the study, the location of each feeder was consistent.

Figure 1. Schematic overview of the simulation procedure.

First, a weekly social network of one of the feeder locations (shown as black dots) in the study site was selected. Second, behavioural spread was simulated on the selected network using four different social learning rules (i.e. simple, threshold, proportion, and conformity). The starting point (i.e. the first individual performing the new behaviour) was randomly chosen. Then, at each timestep (t1–tn), a naive individual adopted the novel behaviour with a given probability of the adoption event being from social learning (dependent on the social learning rule at play; see methods for further details) until all individuals in the network had adopted the novel behaviour. Third, we calculated a correlation coefficient (Spearman’s rank correlation coefficient) between three individual social network metrics (i.e. weighted clustering coefficient, weighted degree, and weighted betweenness) and the order (i.e. timestep) in which individuals adopted the novel behaviour. Finally, we repeated this process 100 times for each weekly, local social network.

All birds were caught in either a nest-box or a mist-net and were fitted with a uniquely numbered metal leg ring (British Trust for Ornithology). In addition, each bird was also fitted with a uniquely coded passive integrated transponder (PIT) tag enclosed in a plastic ring fitted to the other leg. This allowed us to record each visit of a PIT-tagged bird when it came close to the RFID antenna of a feeder (approximately 3 cm). At every detection, the bird’s unique PIT tag code and the date and time were saved to a data logger. Breeding surveys and frequent trapping allowed us to fit almost all individuals with metal rings and PIT tags (>90%, Aplin et al., 2013a).

Social networks

We created social networks based on the foraging associations of PIT-tagged great tits at each feeder and each of the 13 weekends across the 3 years. We created temporally and locally restricted networks because we wanted to generate a large number of different social networks (rather than just one network from the whole population) and because we expect networks from such a small time-window (i.e. one weekend) to be most meaningful in capturing the social connections relevant for the spread of a novel behaviour. For instance, Aplin et al. (2015b) showed that individuals disproportionately copied the behaviour of the majority of individuals in the social group that preceded the focal individual’s first successful solve. Across the winter, individual great tits may move between locations and new individuals arrive at the study site at different times. Therefore, a social network spanning the whole study period will contain many connections not relevant at the time a novel behaviour emerges. Further, when examining the relationship between individual sociality and the probability of adoption, generating a social network from the whole population would add considerable spatial noise. For instance, within a sub-population where a new behaviour emerges, individuals with high connectivity may be faster in adopting the behaviour. However, if examined on the population level, such a relationship may be obscured by spatial effects, because an individual’s probability of behavioural adoption will be strongly predicted by its spatial proximity to the location of behavioural emergence. Further, creating social networks for each feeder location provided a comparable spatial unit and did not require drawing arbitrary spatial boundaries across the study site.

All analyses were conducted in R 4.0.5 (R Development Core Team, 2020). An ‘association’ was defined as two birds foraging together within the same flock. Flock membership was identified using Gaussian Mixture Models (Psorakis et al., 2015; Psorakis et al., 2012) from the R package ‘asnipe’ (Farine, 2013). This method detects events of increased feeding activity in the spatiotemporal data, clusters these into non-overlapping gathering events (i.e. flocking events), and assigns each individual detection to the event it most likely belonged to. This provided us with information about which individuals co-occurred in the same flock (Psorakis et al., 2015; Psorakis et al., 2012). From the pattern of co-occurrences, we then inferred the strength of associations for each dyad. We calculated association strength using the simple ratio index (SRI, Cairns and Schwager, 1987). The SRI describes the proportion of observations of two individuals in which they were seen together, ranging from 0 (never observed in the same flock) to 1 (always observed in the same flock). We inferred a proportional measure for the association strength between individuals (rather than just a measure for the total number of times two individuals were observed together) because we have an unequal number of observations for each individual (Farine and Whitehead, 2015; Hoppitt and Farine, 2018). Further, the SRI provides a more representative measure for the social relationship between two individuals across multiple contexts (e.g. while not foraging at the feeder). We created undirected social networks with edges weighted by the SRI for each sampling weekend (in total: 39 weekends across 3 years) and feeder location (in total: 65).
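To make the simple ratio index concrete, the following is a minimal Python sketch (the original analysis used the R package ‘asnipe’, not this code). It assumes flocks are given as sets of bird identifiers and computes, for each dyad, the number of flocks containing both birds divided by the number of flocks containing at least one of them. The function name `simple_ratio_index` and the toy flocks are hypothetical.

```python
from itertools import combinations

def simple_ratio_index(flocks):
    """Return {(a, b): SRI} for every pair of birds observed in `flocks`.

    `flocks` is a list of sets of bird IDs (one set per flocking event).
    SRI = flocks with both birds / flocks with at least one of the two.
    """
    birds = sorted(set().union(*flocks))
    # Number of flocks in which each bird was observed.
    seen = {b: sum(b in f for f in flocks) for b in birds}
    sri = {}
    for a, b in combinations(birds, 2):
        together = sum(a in f and b in f for f in flocks)
        either = seen[a] + seen[b] - together
        sri[(a, b)] = together / either
    return sri

# Toy data (hypothetical): four flocking events at one feeder.
flocks = [{"A", "B"}, {"A", "B", "C"}, {"C"}, {"A"}]
sri = simple_ratio_index(flocks)
```

In this toy example, birds A and B share two of the three flocks in which either was seen, so their index is 2/3, whereas A and C share one of four.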

For each of the weekly, local networks, we then inferred three social network metrics for each individual: the weighted clustering coefficient, the weighted degree, and the weighted betweenness. All network metrics were calculated using the R package ‘igraph’ (Csardi and Nepusz, 2006). The weighted clustering coefficient was calculated following Barrat et al., 2004. It represents the summed edge weights of those pairs of connections of a focal individual i that form a triangle (i.e. where two direct connections of individual i are themselves connected), relative to the summed edge weights of all pairs of connections of individual i. The weighted degree describes the total interaction rate for a focal individual i with all other individuals, defined as the sum of all the focal individual’s edge weights. The weighted betweenness describes the number of weighted shortest paths from all individuals to all other individuals that pass through the focal individual i and measures an individual’s propensity to move between groups. Here, path lengths were weighted by the inverse of the edge weights, so that stronger associations correspond to shorter paths.
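As an illustration of two of these metrics, the sketch below computes the weighted degree and the Barrat et al. (2004) weighted clustering coefficient from a symmetric weight matrix in plain Python. It is not the authors' ‘igraph’ code. The function names and the toy matrix are hypothetical, and returning 0.0 for individuals with fewer than two neighbours is a convention of this sketch only. Weighted betweenness is omitted for brevity.

```python
def weighted_degree(w, i):
    """Sum of all edge weights of individual i."""
    return sum(w[i])

def barrat_clustering(w, i):
    """Barrat weighted clustering coefficient of individual i.

    c_i = sum over ordered neighbour pairs (j, h) that are themselves
    connected of (w_ij + w_ih) / 2, divided by s_i * (k_i - 1), where
    s_i is the weighted degree and k_i the number of neighbours.
    """
    nbrs = [j for j, wij in enumerate(w[i]) if wij > 0 and j != i]
    k = len(nbrs)
    if k < 2:
        return 0.0  # convention of this sketch; undefined in the formula
    s = sum(w[i][j] for j in nbrs)
    closed = 0.0
    for j in nbrs:
        for h in nbrs:
            if j != h and w[j][h] > 0:
                closed += (w[i][j] + w[i][h]) / 2
    return closed / (s * (k - 1))

# Toy SRI-weighted network (hypothetical): a triangle 0-1-2 plus a pendant 3.
w = [
    [0.0, 0.5, 0.5, 0.2],
    [0.5, 0.0, 0.4, 0.0],
    [0.5, 0.4, 0.0, 0.0],
    [0.2, 0.0, 0.0, 0.0],
]
c0 = barrat_clustering(w, 0)  # only part of individual 0's connections close a triangle
c1 = barrat_clustering(w, 1)  # both neighbours of individual 1 are connected
```

Individual 1 has a coefficient of 1 because its two neighbours are associated with each other, while individual 0 scores lower because its connection to individual 3 closes no triangle.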

Finally, we standardized the individual metrics within each network to allow comparisons between networks. Social networks including fewer than ten individuals or exhibiting no variation in individual network metrics were excluded from further analyses, resulting in a final sample size of 1343 social networks (generated from 62 locations). The three individual network metrics were moderately correlated: the weighted clustering coefficient was negatively correlated with weighted degree and weighted betweenness, whereas weighted degree and weighted betweenness were positively correlated (see Supplementary file 1a). In addition, we provide example networks with individual great tits colour-coded based on their different weighted network metrics (Figure 1—figure supplement 1) and calculated four global network measures (network density, average path length, average edge weight, and modularity) to provide a general overview of the weekly, local great tit social structures (Figure 1—figure supplement 2, details on how these metrics were calculated can be found in the figure legend).

Simulations

On each network, we simulated behavioural spread using four different social learning rules, as described in Firth et al., 2020, using the R package ‘complexNBDA’. A brief explanation is provided below, but a full description and tests of each rule can be found in Firth et al., 2020, and all resources are freely available at https://github.com/whoppitt/complexNBDA (Hoppitt, 2020). The R code and data to replicate the simulations used in this manuscript can be found at https://osf.io/6jrhz/.

(1) Simple rule: This transmission rule follows the logic of the classic NBDA framework:

$$\lambda_i(t) = \lambda_0(t)\left(s\sum_{j=1}^{N} a_{ij} z_j(t) + 1\right)\left(1 - z_i(t)\right)$$

Here, $\lambda_i(t)$ represents the rate at which individual $i$ acquires a novel behaviour as a function of time. $\lambda_0(t)$ represents a baseline rate function (i.e. the rate of asocial learning at time $t$) and $s$ determines the strength of social transmission. When simulating the order of acquisitions across individuals (OADA) for a specified parameter set instead of the times of acquisitions (TADA) (Hasenjager et al., 2021), the probability that each specific individual is next to learn is independent of $\lambda_0(t)$, and thus $\lambda_0(t)$ drops out of the equation. $z_i(t)$ is the ‘status’ of individual $i$ at time $t$ (1 = informed; 0 = naive), $a_{ij}$ is the weight of the connection between individuals $i$ and $j$, and $N$ is the number of individuals in the network. The rate at which an individual acquires a novel behaviour through social learning is proportional to $\sum_{j=1}^{N} a_{ij} z_j(t)$, the total connections to informed individuals at time $t$. Therefore, $s$ gives the rate of transmission per unit connection relative to the rate of asocial learning of the novel behaviour. For example, when $s = 2$, an increase of 1 in an individual’s edge weights to informed individuals will increase the rate of social learning by 2 times the baseline rate. The term $1 - z_i(t)$ ensures that only naive individuals acquire the behaviour. Consequently, the more and stronger connections to informed individuals an individual has, the more likely it is to adopt the behaviour. The social transmission strength $s$ was set to 5.
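The simple rule can be sketched in a few lines of Python (an illustration only, not the ‘complexNBDA’ implementation). Because the baseline rate cancels in an order-of-acquisition setting, the relative rate of a naive individual is taken here as $s \sum_j a_{ij} z_j(t) + 1$, and the probability of being the next to learn is that rate divided by the sum over all individuals. The function names and the three-bird network are hypothetical.

```python
def simple_rule_rates(adj, informed, s=5.0):
    """Relative acquisition rates under the simple rule (baseline rate omitted)."""
    rates = []
    for i, row in enumerate(adj):
        if informed[i]:
            rates.append(0.0)  # (1 - z_i) term: informed birds cannot learn again
        else:
            exposure = sum(a * z for a, z in zip(row, informed))
            rates.append(s * exposure + 1.0)
    return rates

def next_learner_probabilities(rates):
    """Probability that each individual is the next to acquire the behaviour."""
    total = sum(rates)
    return [r / total for r in rates]

# Toy network (hypothetical): bird 0 is informed, bird 1 is tied to it, bird 2 is not.
adj = [
    [0.0, 0.4, 0.0],
    [0.4, 0.0, 0.2],
    [0.0, 0.2, 0.0],
]
informed = [1, 0, 0]
rates = simple_rule_rates(adj, informed)
probs = next_learner_probabilities(rates)
```

With $s = 5$, bird 1 has a relative rate of 5 × 0.4 + 1 = 3 and bird 2 a rate of 1, so bird 1 is three times as likely to be the next learner.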

Next, we define three more complex rules that generalize the classic NBDA model as:

$$\lambda_i(t) = \lambda_0(t)\left(T(a_i, z(t)) + 1\right)\left(1 - z_i(t)\right)$$

Here, $a_i$ represents the connections individual $i$ has to all others in the network, $z(t)$ gives the status of each individual in the network at time $t$, and $T(a_i, z(t))$ is a transmission function determining how the rate of transmission depends on $a_i$ and $z(t)$.

(2) Threshold rule: This transmission rule is defined as:

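$$T(a_i, z(t)) = \left(\frac{c}{1 - \frac{1}{1 + \exp(ab)}}\right)\left(\frac{1}{1 + \exp\left(-b\left(\sum_j a_{ij} z_j(t) - a\right)\right)} - \frac{1}{1 + \exp(ab)}\right)$$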

Similar to the classic NBDA, the rate of social transmission is zero when the total connections to informed individuals are zero, that is, when $\sum_j a_{ij} z_j(t) = 0$. However, the rate of transmission increases suddenly as the threshold, $a$, is approached, to a maximum value of $c$. The parameter $b$ determines how sharp the threshold effect is. Our threshold rule differs from how threshold rules are sometimes defined in network sciences, where the threshold represents a true step function rather than a sigmoidal curve. Here, we aimed to generate a model with a clear sharp threshold, so we set $b = 3$ for our simulations (for details and other parameter settings for $b$, see Firth et al., 2020). We set the threshold value $a$ to 5 for all networks. This means that an individual’s weighted connections to informed individuals need to be ≥5 before an individual’s behavioural adoption is likely to stem from social learning. For example, if an individual with two connections with weights of 2 each to informed others (a total of 4) adopts the new behaviour, the probability of this adoption event stemming from social learning is low. In contrast, if the individual’s two connections have a weight of 3 each (a total of 6), the behavioural adoption likely stemmed from social learning under the set threshold rule (see also Firth et al., 2020 for more details). The social transmission strength $s$ was set to 5.
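The behaviour of the threshold function can be checked numerically with a short Python sketch (illustrative only, not the ‘complexNBDA’ code). It implements the sigmoidal form described above with $a = 5$ and $b = 3$. Setting the maximum $c$ to 5 is an assumption of this sketch that equates $c$ with the stated social transmission strength. The function name is hypothetical.

```python
import math

def threshold_transmission(exposure, a=5.0, b=3.0, c=5.0):
    """Sigmoidal threshold rule; `exposure` is the summed weight to informed birds.

    c = 5.0 is an assumption of this sketch (taken equal to the stated s).
    """
    floor = 1.0 / (1.0 + math.exp(a * b))  # sigmoid value at zero exposure
    sigmoid = 1.0 / (1.0 + math.exp(-b * (exposure - a)))
    # Shift and rescale so that T(0) = 0 and T approaches c for large exposure.
    return c / (1.0 - floor) * (sigmoid - floor)

t_none = threshold_transmission(0.0)    # no informed connections
t_below = threshold_transmission(4.0)   # two informed connections of weight 2
t_above = threshold_transmission(6.0)   # two informed connections of weight 3
t_max = threshold_transmission(100.0)   # far above the threshold
```

Under these settings the transmission term stays close to zero for a total informed weight of 4, rises steeply around 5, and is already close to its maximum at 6, which matches the worked example in the text.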

(3) Proportion rule: This transmission rule is defined as:

$$T(a_i, z(t)) = s\,\frac{\sum_j a_{ij} z_j(t)}{\sum_j a_{ij}}$$

Here, the learning rate is proportional to the proportion of an individual i’s connections that are to informed others. As such, the individual with the highest proportion is most likely to learn. The rule thereby assumes an additional influence of an individual’s uninformed connections, rather than considering only the sum of connections to informed individuals as in the simple and threshold models. The social transmission strength s was set to 5.
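A short Python sketch shows why the proportion rule weakens the advantage of highly connected individuals (illustrative only; the function name and the toy rows are hypothetical). Two naive birds hold the same connection of weight 0.6 to one informed bird, but the second bird also holds connections to two naive birds.

```python
def proportion_transmission(row, informed, s=5.0):
    """Proportion rule: s times the share of edge weight tied to informed birds."""
    total = sum(row)
    if total == 0:
        return 0.0  # isolated individual: no social transmission possible
    return s * sum(a * z for a, z in zip(row, informed)) / total

informed = [1, 0, 0]            # status of the three potential associates
sparse_row = [0.6, 0.0, 0.0]    # connected only to the informed bird
dense_row = [0.6, 0.6, 0.6]     # also connected to two naive birds

t_sparse = proportion_transmission(sparse_row, informed)
t_dense = proportion_transmission(dense_row, informed)
```

The sparsely connected bird reaches the maximum transmission term of 5, whereas the well-connected bird reaches only a third of that value, although both have the same exposure to the informed bird.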

(4) Conformity rule: Finally, the fourth transmission rule assumes that individuals are disproportionately more likely to copy the majority of the population (i.e. frequency dependent):

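$$T(a_i, z(t)) = s\,\frac{\left(\sum_j a_{ij} z_j(t)\right)^f}{\left(\sum_j a_{ij} z_j(t)\right)^f + \left(\sum_j a_{ij}\left(1 - z_j(t)\right)\right)^f}$$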

Here, the frequency dependence parameter is f ≥ 1, and s > 0. When f = 1 this model reduces to the proportion rule above, and as f increases the strength of conformity bias increases. Thus, an individual is expected to adopt a new behaviour if it is perceived as being performed by the majority of its social connections. Similar to the proportion rule, the conformity rule considers an individual’s informed and uninformed connections. In this way, the conformity rule is somewhat analogous to a threshold rule, but based on the proportion of informed connections rather than the total connectivity to informed individuals. For our simulations, we set f to 5 and the social transmission strength s to 5.
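The effect of the frequency dependence parameter can be illustrated with a minimal Python sketch (not the ‘complexNBDA’ implementation; the function name and toy values are hypothetical). A bird with four equally weighted connections is evaluated when one and when three of its associates are informed.

```python
def conformity_transmission(row, informed, s=5.0, f=5.0):
    """Conformity rule: frequency-dependent weighting of informed vs naive ties."""
    inf = sum(a * z for a, z in zip(row, informed))
    naive = sum(a * (1 - z) for a, z in zip(row, informed))
    if inf == 0:
        return 0.0  # no informed connections (also covers isolated individuals)
    return s * inf ** f / (inf ** f + naive ** f)

row = [0.3, 0.3, 0.3, 0.3]
minority = [1, 0, 0, 0]   # one of four associates informed
majority = [1, 1, 1, 0]   # three of four associates informed

t_minority = conformity_transmission(row, minority)
t_majority = conformity_transmission(row, majority)
t_linear = conformity_transmission(row, minority, f=1.0)  # reduces to the proportion rule
```

With f = 5, the transmission term is 5/244 when a quarter of the connections are informed and 5 × 243/244 when three quarters are, while f = 1 gives 5 × 0.25, the value expected under the proportion rule.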

For each simulation, the individual ‘initiating’ the behaviour (i.e. the demonstrator) was randomly chosen and was then used across the four transmission models. We then simulated behavioural spread across the entire network under each transmission model separately. At each timestep, one new individual adopted the seeded behaviour, and each individual had a given probability of adopting the behaviour through social learning, ranging from 0 (asocial acquisition, i.e. an individual has no connections to informed individuals and cannot socially learn the behaviour under the set learning rule) to 1 (social acquisition, i.e. an individual has the maximum probability of learning socially under the set learning rule). Therefore, at each timestep, the new individual adopting the behaviour was stochastically chosen based on their probability of adopting the behaviour in the previous timestep (i.e. the individual most likely to be chosen was the one most likely to adopt the behaviour under the given social learning rule). For instance, under the proportion rule, the next individual adopting the behaviour (i.e. from t1 → t2, see Figure 1) would most likely be the one with the highest proportion of connections to informed others. We then inferred the probability of this adoption event stemming from social learning. For instance, for an individual that adopted the novel behaviour with a proportion of connections to informed others of 0.2, the probability of the behavioural adoption stemming from social learning would be lower than for an individual with a proportion of 0.7.
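The procedure described above can be sketched as a short simulation loop in Python (a simplified illustration, not the authors' R code). A random demonstrator is seeded, and at each timestep one naive individual is drawn with probability proportional to its relative rate T + 1. The probability that an adoption stemmed from social learning is taken here as T / (T + 1), which is the usual NBDA reading and an assumption about what the original package reports. All function names and the toy network are hypothetical, and the simple rule stands in for any of the four transmission functions.

```python
import random

def simple_transmission(row, informed, s=5.0):
    """Simple rule: s times the summed edge weight to informed individuals."""
    return s * sum(a * z for a, z in zip(row, informed))

def simulate_order(adj, transmission, seed=None):
    """Simulate one order of acquisition on a weighted network."""
    rng = random.Random(seed)
    n = len(adj)
    informed = [0] * n
    demonstrator = rng.randrange(n)       # randomly chosen first performer
    informed[demonstrator] = 1
    order, p_social = [demonstrator], []
    while len(order) < n:
        naive = [i for i in range(n) if not informed[i]]
        social = [transmission(adj[i], informed) for i in naive]
        rates = [t + 1.0 for t in social]  # social plus asocial component
        pick = rng.choices(range(len(naive)), weights=rates)[0]
        # Assumed reading: share of the rate that is due to social transmission.
        p_social.append(social[pick] / rates[pick])
        informed[naive[pick]] = 1          # individuals stay informed
        order.append(naive[pick])
    return order, p_social

adj = [
    [0.0, 0.5, 0.5, 0.2],
    [0.5, 0.0, 0.4, 0.0],
    [0.5, 0.4, 0.0, 0.0],
    [0.2, 0.0, 0.0, 0.0],
]
order, p_social = simulate_order(adj, simple_transmission, seed=1)
```

Passing a different transmission function with the same `(row, informed)` signature would run the same loop under another learning rule, and repeating the call with different seeds mimics the repeated simulation runs per network.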

Once individuals adopted the novel behaviour, they remained ‘informed’ within each simulation run. For each network, we repeated the simulations 100 times to minimize the influence of the identity of the randomly selected demonstrator on the subsequent transmission pathways. We selected 100 simulation runs because this was sufficient to obtain a relatively stable mean correlation coefficient between network metric and order of acquisition (Figure 1—figure supplement 3). We repeated our simulations testing various parameter combinations for s (1, 5, 10), f (3, 5, 7), and a (3, 5, 7), which we consider appropriate for our study system (i.e. the average strength of an individual’s connections is approximately 2.5). In the main text we present results for s = 5, f = 5, and a = 5, which amounts to a total of 537,200 simulations (1343 weekly, local networks × 4 different learning rules × 100 simulation runs). Results for all other parameters can be found in the supplementary material, and other parameter combinations can be tested with the code provided.

Data summary statistics


To assess the relationship between each individual’s network metric and the order of acquisition, we calculated Spearman’s rank correlation coefficients. After every simulation, for each of the networks and transmission models, we calculated the Spearman’s rank correlation coefficient between the order in which individuals adopted the behaviour (always excluding the demonstrator) and each of the three individual network metrics (i.e. the weighted clustering coefficient, the weighted degree and the weighted betweenness). For an overview of the transmission process, see Figure 1. For each network and model, we then calculated the average correlation coefficient for each network metric across the 100 simulations (Figure 1). To additionally assess the general relationships between the size of the network, and the correlation coefficient between individuals’ centralities and their acquisition order, we used linear mixed-effect models using the ‘lme4’ package (Bates et al., 2015). For each model separately, we set the average correlation coefficient as the dependent variable and network size as the predictor variable. Location identity and week nested in year were set as random effects to factor in these differences when assessing this relationship (Supplementary file 1b). We examined model assumptions and fit using graphical methods (e.g. qq plot of residuals, fitted values versus residual plots, Korner-Nievergelt et al., 2015).
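The per-run correlation and its average across runs can be sketched as follows. This is an illustrative standard-library implementation under assumptions made here (the function names, the dictionary input, and the toy values are hypothetical); the authors' analysis was carried out with their own code, and the mixed-effect models are not reproduced.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions pos+1..end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if x or y is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman's rank correlation as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def mean_correlation(metric_by_individual, orders):
    """Average Spearman coefficient between a network metric and adoption order.

    metric_by_individual: dict mapping individual id -> metric value
    orders: one list of ids per simulation run, in order of acquisition
            (demonstrator already excluded)
    """
    coefficients = []
    for order in orders:
        positions = list(range(1, len(order) + 1))
        metric = [metric_by_individual[i] for i in order]
        coefficients.append(spearman(metric, positions))
    return mean(coefficients)

# Hypothetical weighted degrees and two simulated acquisition orders.
degree = {"a": 5.0, "b": 3.0, "c": 1.0}
print(mean_correlation(degree, [["a", "b", "c"], ["a", "c", "b"]]))
```

A negative average indicates that individuals with larger metric values tended to adopt earlier, which is the direction reported for weighted degree and betweenness under the simple rule.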

Results

The data in this study were simulated across 1343 empirically derived social networks, inferred from recordings of 1774 individual great tits at 62 feeder locations across 39 weekends and 3 years. Social networks varied in size (right skewed distribution towards smaller networks, see Figure 1—figure supplement 4) and consisted of an average of 21.7 individuals (min = 10, max = 77, sd = 10.0; note that we excluded social networks smaller than 10 from the analysis [see methods] resulting in the minimum network size of 10). For each location, we included on average 21.7 networks into the analysis (min = 1, max = 39, sd = 11.9) and each individual was part of on average 16.5 networks (min = 1, max = 88, sd = 13.2). Individuals on average visited 1.3 different feeder locations on a weekend (min = 1, max = 10, sd = 0.7) and from 21,036 occasions where individuals were recorded on a given weekend, individuals had visited only one location in 14,888 occasions (71%).

Relationship between individual social behaviour and order of acquisition

Simulating behavioural spread under four different social learning rules revealed different transmission pathways across social networks. For both the simple and the threshold rule, the weighted clustering coefficient was on average positively related to the order of acquisition, with more clustered individuals adopting the seeded behaviour later than less clustered individuals across different network sizes (Figure 2). Weighted degree and betweenness were on average negatively related to the order of acquisition. Thus, individuals with higher weighted degree and weighted betweenness adopted the novel behaviour on average faster than individuals with a lower weighted degree and betweenness (Figure 2). In addition, we show the relationship between mean individual network metrics and the standardized order of acquisition across network sizes (Figure 2—figure supplement 1). For the simple learning rule, the relationship between network metric and order of acquisition only changed when the majority of individuals in a network had already adopted the behaviour (with approximately 75% of individuals knowledgeable; Figure 2—figure supplement 1). Behavioural spread under the threshold rule showed a small ‘hump’ shortly after the start of the spread, especially for the network metric weighted degree (Figure 2, Figure 2—figure supplement 1). Here, the initial individuals to acquire the behaviour exhibited network metrics close to the mean, suggesting that acquisition at the initial stages (when the starting individual is chosen randomly) likely depends on the connections to the demonstrator and/or asocial learning (particularly as social learning may not be likely yet under the set threshold, see section on ‘Probability of social spread’). 
As more individuals became informed, social learning became much more likely because the threshold could be reached, and individuals with higher weighted degree, and possibly higher betweenness and lower clustering, adopted the behaviour sooner (e.g. start of the hump). Finally, the relationship between mean network metric and order of acquisition reversed, suggesting that individuals with higher weighted clustering and lower weighted degree and betweenness adopted the behaviour last. This may partly be a product of necessity (given the opposite type of individuals are already informed). Interestingly, however, the ‘hump’ was most prominent in larger networks and at low thresholds (a = 3, Figure 2—figure supplement 5), suggesting that under these scenarios it may be related to a larger variation in network positions (or more extremely central individuals) or to more opportunities for social learning being present earlier on in the total diffusion (see also section on ‘Probability of social spread’). In contrast to the simple and threshold rules, there was little or no relationship between individual social network metrics and the order of acquisition under both the proportion and conformity learning rules (Figure 2, Figure 2—figure supplement 1).

Figure 2 (with 5 supplements)
Relationship between individual network metric and the order of acquisition for each social learning rule.

Each column shows a different network metric (left to right: weighted clustering coefficient, weighted degree, and weighted betweenness). Each row represents one of the four spreading rules (top to bottom: simple, threshold, proportion, and conformity). Lines plot the average network metric for each order of acquisition and ribbons show the 95% confidence interval from the 100 simulations for each binned group of network sizes. Colour represents network size with darker colour indicating smaller networks. The social transmission rate, the threshold location, and the frequency dependence parameter were set to 5.

Assessing the relationship between each individual’s network metric and the order of acquisition across all social networks and simulation runs revealed substantial variation in relationship strength (Figure 3). Across network metrics, correlations were on average strongest under the simple rule, positive for the weighted clustering coefficient and negative for weighted degree and betweenness (Figure 3), and coefficients for the threshold, proportion, and conformity models were weaker. Even though Figure 2 indicates a clear relationship between average network metric and order of acquisition under the threshold model, correlation coefficients were very small (Figure 3). This may be because of the non-linear relationship under the threshold rule, in which the slope of the relationship between network position and order of acquisition changes direction as more of the population becomes informed (Figure 2, Figure 2—figure supplement 1), leading to overall low correlation coefficients (Figure 3).
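The point that a relationship which changes direction yields a small rank correlation can be illustrated with synthetic numbers. The values below are invented for illustration only and are not simulation output; the helper names are hypothetical, and the shortcut formula used assumes no tied values.

```python
def ranks(values):
    """Ranks from 1 (smallest) for distinct values; ties are not handled here."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_no_ties(x, y):
    """Spearman's rho via 1 - 6 * sum(d^2) / (n * (n^2 - 1)), valid without ties."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))

order = list(range(1, 10))  # order of acquisition, 1 = first adopter

# Synthetic metric that falls steadily with acquisition order.
monotone_metric = [10 - k for k in order]

# Synthetic metric that falls and then rises again (direction reverses).
reversing_metric = [abs(k - 5) + 0.01 * k for k in order]

rho_monotone = spearman_no_ties(monotone_metric, order)
rho_reversing = spearman_no_ties(reversing_metric, order)
print(rho_monotone, rho_reversing)
```

The monotone series gives a coefficient of -1, whereas the reversing series gives a coefficient close to zero even though the metric is clearly structured along the order of acquisition.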

Figure 3 (with 2 supplements)
Distribution of average correlation coefficients for each social learning rule and network metric.

Violin and boxplots show the distribution of the average correlation coefficients between individual network metric and order of acquisition across 100 simulations from each network for each of the four social learning rules (i.e. simple, proportion, conformity, and threshold). Each plot shows one of the individual network metrics (weighted clustering coefficient, weighted degree, and weighted betweenness).

The direction of the relationship between average network metrics and order of acquisition remained unchanged when setting lower or higher parameters for the social transmission strength ‘s’ (Figure 2—figure supplements 2 and 3). However, for the simple rule, the correlation coefficients became on average stronger under larger social transmission rates (Figure 3—figure supplement 1). Further, different values for the frequency dependence ‘f’ under the conformity rule, and different values for the threshold location ‘a’ under the threshold rule did not change the general direction of the relationship between average network metrics and order of acquisition (Figure 2—figure supplements 4 and 5, Figure 3—figure supplement 2). However, under the threshold model, increasing the threshold location on average reduced the correlation between network metrics and order of acquisition (Figure 2—figure supplement 5, Figure 3—figure supplement 2).

Relationship between social network size and pathways of behavioural diffusion

The direction and magnitude of the correlation between individual sociality and their order of acquisition were partly predicted by network size (Figure 4, Supplementary file 1b). For the simple and threshold model, behavioural spread on larger networks led to more positive correlations between individual network metric and order of acquisition for weighted clustering coefficient and more negative correlations for weighted degree and betweenness (Figure 4, Supplementary file 1b). The predicted effects of network size on mean correlation coefficient inferred under the proportion and conformity rule suggest contrasting directions or no relationship with network size (Figure 4, Supplementary file 1b).

Figure 4 (with 2 supplements)
Relationship between correlation coefficient and network size across the four social learning rules.

Each row shows one of the individual network metrics (top to bottom: weighted clustering coefficient, weighted degree, and weighted betweenness) and each column a different social learning rule (left to right: simple, threshold, proportion, and conformity). Average correlation coefficients across the 100 simulations per network are plotted as count dots (larger dots indicate more values for the respective value), lines represent the predicted effects generated from linear mixed-effect models (LMM) and ribbons represent the 95% confidence intervals (see Supplementary file 1b for model results).

Overall, the predicted effects of network size on the inferred correlation coefficients were small, particularly for more complex contagions (Figure 4, Supplementary file 1b). The relationship between correlation coefficient and network size was modulated by the social transmission rate (‘s’ parameter; Figure 4—figure supplement 1). For the simple model, the direction of the relationship did not change across different social transmission rates, that is, correlation coefficients became on average more positive (weighted clustering coefficient) and more negative (weighted degree and weighted betweenness) with increasing network size (Figure 4—figure supplement 1). Further, the slope of the relationship between network size and correlation coefficient for each network metric remained relatively constant across social transmission rates (Figure 4—figure supplement 1). However, for weighted betweenness, there was no relationship between correlation coefficients and network size under a strong transmission rate (i.e. s = 10; Figure 4—figure supplement 1). For the threshold model, the direction of the relationship likewise did not change across different social transmission rates, that is, correlation coefficients became on average more positive (weighted clustering coefficient) and more negative (weighted degree and weighted betweenness) with increasing network size (Figure 4—figure supplement 1). However, the relationship was strongest (i.e. steepest slope) for larger transmission rates (Figure 4—figure supplement 1). Similar patterns are present for weighted degree and weighted betweenness under the proportion and conformity models, where correlation coefficients increase with increasing network size (Figure 4—figure supplement 1). For the weighted clustering coefficient, the slopes pointed in opposing directions for different transmission rates under the proportion and conformity models but were generally small and non-significant (Figure 4—figure supplement 1). 
The relationship between correlation coefficient and network size under the threshold model was also modulated by the threshold location ‘a’ (Figure 4—figure supplement 2). Here, the relationship between correlation coefficient and network size was strongest for small threshold locations (Figure 4—figure supplement 2). The frequency dependence parameter had no effect on the predicted relationship (Figure 4—figure supplement 2).

Probability of social spread

Across the different social learning mechanisms, the average probability that an individual socially adopted the seeded behaviour increased with increasing timestep (i.e. order of acquisition, Figure 5). This is because, as more individuals became knowledgeable, the probability of being connected to informed others and thus of reaching the set learning criterion increased (Figure 5). Behaviours were most likely to spread socially under the simple and proportion models (on average a non-zero probability of social spread, Figure 5). That is because neither rule required surpassing a given threshold (such as that set for the threshold rule, or reaching the majority of an individual’s connections as set for the conformity rule). Therefore, one connection to an informed conspecific was sufficient for the seeded behaviour to spread socially. However, under more complex spreading processes, social learning was limited (Figure 5). This was particularly the case for smaller networks when transmission followed the threshold rule. For instance, in networks consisting of fewer than 20 individuals, the set threshold for socially adopting the behaviour was almost never met (average probability of social learning close to 0; Figure 5, Threshold). The high rate of asocial learning presumably led to no clear relationship between individual network metric and order of acquisition (Figure 2, Threshold; see also section on ‘Relationship between individual social behaviour and order of acquisition’). For the conformity rule, larger networks limited social spread, but only at the initial phase of transmission (Figure 5, Conformity). In larger social networks, individuals have on average more social connections and thus, at the initial stages of behavioural spread, it is less likely that the majority of an individual’s social connections are already informed.
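The effect of having many connections at the start of a conformist spread can be illustrated numerically. The sketch below rests on two assumptions that are not taken from the article: connections are treated as unweighted, and conformity is modelled with a sigmoidal frequency-dependence function of a kind often used in the social learning literature. The article's own conformity function is defined in its methods and may differ; the function names here are hypothetical.

```python
def informed_proportion(n_informed, n_connections):
    """Proportion of an individual's (unweighted) connections that are informed."""
    if n_connections == 0:
        return 0.0
    return n_informed / n_connections

def conformity_weight(p, f):
    """Assumed sigmoidal frequency dependence: p^f / (p^f + (1 - p)^f).

    For f > 1 the weight is below p when p < 0.5 and above p when p > 0.5.
    """
    return p ** f / (p ** f + (1 - p) ** f)

f = 5  # frequency dependence value used in the main text

# One informed neighbour early in the spread, for two hypothetical individuals.
few_ties = informed_proportion(1, 3)     # individual with 3 connections
many_ties = informed_proportion(1, 12)   # individual with 12 connections

print(few_ties, conformity_weight(few_ties, f))
print(many_ties, conformity_weight(many_ties, f))
```

With a single informed neighbour, the individual with 12 connections has a much smaller informed proportion than the individual with 3 connections, and the assumed conformity function reduces its weight even further. This matches the qualitative argument above that well-connected individuals in larger networks are unlikely to reach a majority of informed connections early on.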

Figure 5 (with 2 supplements)
Relationship between the probability of an individual socially adopting the seeded behaviour and the number of informed individuals within the network.

Panels show the results for each of the four social learning rules (simple, threshold, proportion, and conformity). The x-axis describes the number of informed individuals within a social network. The simulations are set so that at each timestep a new individual adopts the behaviour, whereby each time each individual has a probability of adopting the behaviour through social learning (y-axis) given the set learning rule. Lines plot the average probability for each timestep and ribbons show the 95% confidence intervals from the 100 simulations across the binned groups for different network sizes. Colour represents network size with darker colour indicating smaller networks. The social transmission rate, the threshold location, and the frequency dependence parameter were set to 5.

For both the simple and threshold model, the mean probability of social spread initially increased but then dropped towards the later adoptions (Figure 5). This may be caused by a few individuals that are not well connected in the social network, and thus have in general a low probability to socially learn under the simple and threshold rule. For instance, some great tits may only be connected to one other individual and thus, even if the number of knowledgeable individuals in the social network increases over time, it is very unlikely to reach the required threshold for social learning. For the proportion and conformity rule, the average probability of social learning peaked after a given percentage of individuals being knowledgeable (Figure 5).

The probability of socially adopting a behaviour was modulated by the model parameters chosen. Across all four models, the probability of social learning increased with an increasing social transmission rate ‘s’ (Figure 5—figure supplement 1). Further, for the threshold model, the likelihood of social spread decreased with increasing threshold location ‘a’, especially in smaller networks (Figure 5—figure supplement 2). For the conformity model, changing the frequency dependence parameter ‘f’ only slightly affected the probability of social learning in larger networks, whereby with higher parameters social spread at the initial stage was more limited (Figure 5—figure supplement 2).

Discussion

Using simulations on large numbers of empirical great tit social networks, we show how the underlying social learning rules individuals employ strongly influence the transmission pathways of behaviours across social networks derived from real-world data. Under learning rules that rely purely on the extent of social connections to informed others, we found that individual great tits with a higher weighted degree and betweenness, and lower clustering coefficients were likely to adopt the seeded behaviour faster, in line with common expectations of the benefits of sociality for gaining information. However, if the likelihood of adopting a behaviour depended on the ratio of connections to informed and uninformed others, such as conformist learning, social connectivity was not strongly related to the order in which individuals acquired the seeded behaviour. Notably, this contrasts with the widely proposed prediction that more social individuals may be more likely to adopt new information. Thus, our results show how the relationship between individual sociality and behavioural acquisition can change fundamentally with the type of social learning mechanism at play. Finally, we reveal that the probability of social spread under certain social learning rules is predicted to be limited in certain real-world settings, particularly in networks of a very small or large size.

Individuals differ in the quantity and quality of social connections to others which impacts several aspects of life history (Alberts, 2019; Beck et al., 2021; Farine and Sheldon, 2015; Formica et al., 2012; MacIntosh et al., 2012; McDonald, 2007), including the social transmission of information (Aplin et al., 2012; Kulahci and Quinn, 2019). Generally, individuals that hold many social connections to others, and occupy central network positions, are expected to be more likely to acquire information (Aplin et al., 2012; Claidière et al., 2013; Hoppitt and Laland, 2011; Kulahci et al., 2016). In line with these assumptions, we find that great tits with lower weighted clustering coefficients, and higher weighted degree and betweenness were more likely to adopt the seeded behaviour faster when the underlying spreading mechanism depended on the extent of connections to knowledgeable others (Figure 2). However, when rules were more complex, for example when behavioural adoption depended on the ratio of connections to knowledgeable and naive individuals, sociality was not strongly related to the order of acquisition (Figure 2), matching the expectations from studies on complex contagions in humans (Guilbeault et al., 2018). For instance, in humans individuals with more social connections require stronger exposure to identify ‘useful’ information from the noise received from all their associates (Hodas and Lerman, 2014; Hodas and Lerman, 2012) and individuals with fewer social connections may utilize information sooner (González-Avella et al., 2011). These findings suggest that, under certain spreading mechanisms, the extent of social connections can reduce social spread. Thus, the concept that more social and central individuals are especially important in acquiring and subsequently spreading information (Kulahci and Quinn, 2019) cannot be generalized across different types of behaviours, and may even lead to erroneous conclusions. 
For instance, if no evidence is found that individual sociality is related to the probability of behavioural adoption, one might be led to conclude the absence of social learning where, in fact more complex social learning rules may be in operation.

These findings have important consequences for our understanding of the relationship between sociality and behavioural acquisition, and also for what may constitute an ‘optimal’ social structure for efficient social transmission (Cantor et al., 2021; Pasquaretta et al., 2014; Romano et al., 2018). In many social structures, individuals differ in their social connectivity, ranging from highly to less connected individuals (i.e. heterogeneous degree distribution). In networks with higher variation in connectivity, simple contagions may spread more efficiently because highly connected individuals can act as ‘hubs’ (Evans et al., 2020; Xue et al., 2020). In contrast, more complex contagions may spread more slowly on networks with heterogeneous degree distribution (Evans et al., 2020; Xue et al., 2020). Many animals live in fission–fusion societies (Amici et al., 2008; Silk et al., 2014; Wilson et al., 2014) where individuals frequently join, leave, and rejoin groups, which can result in large individual differences in social connectivity (Sah et al., 2018). In contrast, some animals form highly stable groups (e.g. many primates and carnivores; Kappeler and van Schaik, 2002; Holekamp et al., 2007) with lower individual variation in social connectivity (Sah et al., 2018). In social networks with heterogeneous degree distribution, an individual’s social network position may, under certain behavioural contagions, have a strong impact on the probability of behavioural acquisition, whereas in groups with homogeneous degree distribution, an individual’s position within the network may not be as important. Ultimately, the ‘optimal’ social structure for information transmission will depend strongly on the behaviour and the underlying learning mechanism. For future work it will be interesting to test how different behavioural contagions spread on social networks of different species, ideally with contrasting social structures.

Our results have implications for our understanding of the costs and benefits of individual sociality. While increased access to information is one of the postulated key benefits of sociality, simply holding more connections to others may in fact hinder the adoption of novel behaviours under certain social learning rules. For instance, if a novel behaviour follows a conformist learning mechanism, highly social individuals with lots of connections may be exposed to the new behaviour sooner than less social individuals because they are more likely to be connected to at least one informed conspecific, but despite this will be less likely to adopt the novel behaviour if it requires the majority of their social connections to become informed first (Firth, 2020) or if many social connections make the detection of useful information more difficult (Hodas and Lerman, 2014; Hodas and Lerman, 2012). In addition, our results suggest that occupying more peripheral network positions may be costlier in terms of behavioural adoption than occupying central network positions is beneficial, but that this is dependent on the learning rule. For instance, the proportional rule and conformity rule did not show that individuals with lower social connectivity were likely to adopt very late, yet the simple and threshold learning rule showed that the average network centrality of late adopters was very low (after approximately 75% of individuals being informed; Figure 2—figure supplement 1) but remained relatively unchanged for earlier adoptions (before 75%, Figure 2—figure supplement 1). This suggests that in various cases, poorly connected individuals adopt the behaviour last and have in general a very low probability of socially adopting behaviour (see decrease in social learning probability for simple and threshold rule, Figure 5) but that this can be negated when certain social learning rules (e.g. the proportional and conformity rules) are in play. 
We speculate that the relationship between network position and order of acquisition at the initial stage (i.e. before approx. 75%), and the large variation in correlation coefficients (Figure 3), are also strongly determined by the demonstrator’s network position (i.e. the starting position of spread, Banerjee et al., 2013) and the underlying network structure (Cantor et al., 2021; Evans et al., 2020; Romano et al., 2018), which may warrant future work. Finally, how biologically meaningful differences in the order of acquisition are will ultimately depend on the behaviour and context, and the actual observed variation in adoption times. Therefore, quantifying real individual variation in the timing of behavioural acquisition in the wild will be crucial for our understanding of the potential costs and benefits of individual sociality.

Examining behavioural innovation and its subsequent spread in nature is challenging (Klump et al., 2021; Whiten and Mesoudi, 2008). This is because new behaviours are often only detected once the majority of individuals in a population are already knowledgeable. Alternatively, behavioural innovations may occur much more frequently but remain undetected if the behaviour does not spread far (e.g. because the behaviour is mechanically challenging for individuals [Gajdon et al., 2006] or when the carryover of older, outdated behaviour hinders the spread of novel and more adaptive behaviours [Aplin et al., 2017; Barrett et al., 2019]). Here we used simulations on real-world empirical networks, and demonstrate that the ability of a behaviour to socially spread depends on its underlying social learning mechanism. Our findings are thus consistent with other studies that investigated different spreading processes (Cantor et al., 2021; Evans et al., 2021; Evans et al., 2020; Nunn et al., 2009). For instance, Evans et al., 2020 simulated a simple and conformity contagion on different social structures and show that disease/information spreads faster (i.e. time until a certain number of individuals had been infected/informed) under a simple contagion compared to a conformity contagion. Our study supplements past research by specifically focusing on the diffusion dynamics on the individual rather than the population level. Across the four models, the probability that individual great tits socially adopted the seeded behaviour increased with increasing timestep (Figure 5). This is expected as at each timestep in simulations a new individual becomes knowledgeable and remains in this state thereby increasing the number of knowledgeable individuals within the network. However, social transmission under certain social learning mechanisms was limited. For instance, behaviours that needed to surpass a given threshold of social connections to informed others (i.e. threshold or conformity rule) required more asocial learning events in the initial spreading phase to be able to subsequently transmit via social learning (Figure 5, Figure 4—figure supplement 2). In contrast, for the simple and proportion rules, the probability of social transmission was always higher, even in the initial spreading phase and when considering different transmission rates (Figure 5, Figure 5—figure supplement 1). This demonstrates that in the initial stage some behaviours may have a higher likelihood to spread socially, whereas other behaviours, following more complex processes, may rely on a larger extent of asocial learning events or social reinforcement. Therefore, the majority of empirical research on behavioural spread in animals may in fact examine more simple social transmission processes (because they are easier to detect and observe) which may bias our picture of the existing social learning mechanisms of animals and behavioural innovations.

We found that network size impacted behavioural spread across the four transmission models. The strength of the relationship between individual network metric and the order of acquisition increased with increasing network size under the simple and threshold rule (Figure 4) and was mediated by the social transmission rate (Figure 4—figure supplement 1) and the threshold location (Figure 4—figure supplement 2). The weaker correlation between individual network metrics and order of acquisition in smaller networks might be caused by the relatively reduced likelihood of social spread (Figure 5). For instance, for the threshold rule, individuals in larger networks have an overall higher number of social connections (compared to individuals in smaller networks) which facilitates reaching the set threshold for social learning and thus increases the likelihood of social spread. This is supported by our results when simulating spread using different threshold values where behaviours were most likely to socially spread across different network sizes under small thresholds (see Figure 2—figure supplement 5). In addition, our findings may suggest that in smaller networks, the order of acquisition is mainly predicted by who an individual is connected to (e.g. whether it is directly connected to the demonstrator) compared to the number and extent of social connections it has. In contrast, in larger networks, the probability of behavioural acquisition may be strongly influenced by the number and extent of connections an individual has. Therefore, the importance of ‘who’ you know versus ‘how many you know’ may differ in networks of varying size. Past studies have investigated behavioural spread on relatively large and static networks (using simulations: Cantor et al., 2021; Evans et al., 2021; Nunn et al., 2009; Voelkl and Noë, 2008 or natural observations: Allen et al., 2013; Aplin et al., 2015a). 
However, in many species social associations can be highly dynamic and only the social connections at a relatively small temporal scale (e.g. at the time of emergence) may predict an individual’s decision to adopt a novel behaviour. Our findings that network size can impact behavioural spread thus have important consequences for our understanding of the influence of wider society structure on when and where behaviours may emerge, and how to interpret empirical results. As such, it is important to improve our understanding of the factors that give rise to different social network sizes and structures and to consider networks on an appropriate spatiotemporal scale or dynamic versus static networks (Hasenjager et al., 2021; Hobaiter et al., 2014). For instance, spatiotemporal variation in environmental features (such as the availability and distribution of resources) may influence population densities and subsequently local social network size and structure across space and time, and as such may influence local social spread.

Our study suggests further questions for future research. Our models assume that each individual adopts a seeded behaviour under the same ‘learning rule’. While variation in individual learning rules and their impact on behavioural contagions have been widely examined in humans (Aral and Nicolaides, 2017; McCullen et al., 2013; Melnik et al., 2013; Muthukrishna et al., 2016), their investigation in animals remains scarce. However, individuals may differ in the extent of social information use and the thresholds required for behavioural adoption (Chimento et al., 2022), which may be context and state dependent (Penndorf and Aplin, 2020; Rendell et al., 2011). For instance, dominance rank (Krueger et al., 2014) and sex (Aplin et al., 2013b) have been shown to be related to variation in social information use, and individuals may only copy behaviour from certain individuals, based on familiarity or kin (Boogert et al., 2018; Kavaliers et al., 2005). Therefore, learning rules may differ within individuals (e.g. with changes in age or dominance) and between individuals (e.g. sex, relatedness). Future simulation and empirical studies could explore how heterogeneity in learning rules, within and between individuals, and variation in acquisition versus adoption, impact information flow (Chimento et al., 2022). In addition, we only explored learning rules which might be relevant for our study system. However, future research could test the same and different rules on species with different social structures. For instance, in species with more stable social groups such as in primates and many carnivores (Holekamp et al., 2007; Kappeler and van Schaik, 2002), kin- or dominance-based learning rules may be more applicable.

Furthermore, using simulations did not allow us to test changes in the social network resulting from behavioural adoption: individuals may occupy more central social network positions once performing a new behaviour (Kulahci et al., 2018; Kulahci and Quinn, 2019). This can be the case if individuals preferentially associate with knowledgeable others (Kulahci et al., 2018), or if individuals change their behaviour in response to information acquisition, which can also lead to an increase in social connections (Kulahci and Quinn, 2019). Such dynamics are not reflected in our study, and would require the investigation of natural behavioural spread, ideally under experimental conditions in the wild. In addition, our models assume that once individuals become ‘informed’, they cannot return to an ‘uninformed’ state. In natural conditions, however, individuals may return to an ‘uninformed’ state if the novel behaviour was not rewarding. Future studies incorporating these aspects could provide further insights into the patterns of social transmission and its link to sociality. Finally, we used empirical networks to capture the fine-scale social association patterns between wild birds and to explore how different behavioural contagions spread on them. While it is important to test predictions on real networks, such an approach also raises additional considerations. For instance, network metrics are often correlated and dependent on one another (Supplementary file 1a), which makes it difficult to tease apart the direct effect of each metric alone on the probability of behavioural adoption. In studies using simulated networks, interdependencies can be controlled for in certain ways and the effects of each metric can be teased apart (e.g. by using sensitivity analysis). Therefore, we suggest that, while empirical and simulation studies can each provide valuable information on their own, treating the two as complementary will improve our understanding of how behaviours spread on different networks and how different social network metrics relate independently to behavioural adoption.

From copying the majority (Aplin et al., 2015a; Danchin et al., 2018; van de Waal et al., 2013) to learning from specific tutors (Canteloup et al., 2020; Wild et al., 2019), individuals use a large range of different learning strategies (Hoppitt and Laland, 2013; Kendal et al., 2018). While an increasing number of studies show how social network structure can influence social transmission (using simulations: Evans et al., 2021; Romano et al., 2018; Voelkl and Noë, 2008 and empirical data: Firth et al., 2016; Naug, 2008; Romano et al., 2018), our research highlights the importance of the underlying social learning mechanism in shaping the transmission pathways across social networks. We demonstrate that the common assumption that sociality is linked to a higher likelihood of acquiring information and adopting new behaviours cannot be generalized for behavioural spread in real-world networks. This also sheds new light on our current understanding of the costs and benefits of individuals’ sociality and calls for a greater focus on the social learning mechanism at play, and for differentiating between access to information and behavioural adoption (Chimento et al., 2022). In addition, we reveal that social transmission can be limited under certain adoption rules (such as the threshold rule) and in social networks of particular sizes. Our findings thus have important consequences for our understanding of whether and how behaviours spread across different social networks, and subsequently the establishment of traditions and cultures. Further, differences in spreading mechanisms alter predictions of what may constitute optimal social structures for the transmission of information, and how selection may act on sociality.

Data availability

All data and code to reproduce the analyses can be accessed at https://osf.io/6jrhz/.

The following data sets were generated
    1. Beck K
    (2022) Open Science Framework
    ID 6jrhz. Data and R code for: ‘Social learning mechanisms shape transmission pathways through replicate local social networks of wild birds’.

References

  1. Book
    1. Boyd R
    2. Richerson PJ.
    (1988)
    Culture and the evolutionary process
    University of Chicago Press.
  2. Book
    1. Ekman J
    (1989)
    Ecology of non-breeding social systems of Parus
    In: Prize E, editors. The Wilson Bulletin. Wilson Bull. pp. 263–288.
  3. Book
    1. Hinde RA.
    (1952)
    The Behaviour of the Great Tit (Parus major) and Some Other Related Species
    Brill.
  4. Conference
    1. Hodas NO
    2. Lerman K
    (2012) How visibility and divided attention constrain social contagion
    2012 International Conference on Privacy, Security, Risk and Trust and 2012 International Conference on Social Computing. pp. 249–257.
    https://doi.org/10.1109/SocialCom-PASSAT.2012.129
  5. Journal
    1. Holekamp KE
    2. Sakai ST
    3. Lundrigan BL
    (2007) Social intelligence in the spotted hyena (Crocuta crocuta)
    Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences 362:523–538.
    https://doi.org/10.1098/rstb.2006.1993
  6. Book
    1. Hoppitt W
    2. Laland KN
    (2013)
    Social learning
    Princeton University Press.
  7. Software
    1. R Development Core Team
    (2020) R: A language and environment for statistical computing
    R Foundation for Statistical Computing, Vienna, Austria.

Decision letter

  1. Yuuki Y Watanabe
    Reviewing Editor; National Institute of Polar Research, Japan
  2. Christian Rutz
    Senior Editor; University of St Andrews, United Kingdom

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Decision letter after peer review:

[Editors’ note: the authors submitted for reconsideration following the decision after peer review. What follows is the decision letter after the first round of review.]

Thank you for submitting your article "Social learning mechanisms shape transmission pathways through replicate great tit social networks" for consideration by eLife.

Your article has been reviewed by three peer reviewers, and the evaluation has been overseen by a Reviewing Editor and a Senior Editor. The reviewers have opted to remain anonymous.

Comments to the Authors:

All three reviewers found the modeling approach and main results valuable. That said, they raised a number of major concerns, which can be summarized as follows (for many additional points, please see their original reports below):

(1) A more thorough description of the model should be provided. Ideally, all code should be made available so that readers can replicate the modeling.

(2) Parameter selection is weakly justified, and sensitivity analyses for the parameters are missing.

(3) It is unclear to what extent the results depend on the characteristics of great tit networks and how they are relevant to other species. Related to this point, the relevant literature needs to be covered better, including earlier work using empirically-recorded networks to simulate diffusion processes.

Reviewer #1 (Recommendations for the authors):

In this manuscript, the authors simulate learning processes using data collected from wild birds on their social networks. The results suggest that the nature of the learning process, as well as the size of the social network, determine which individuals will adopt a new behavior faster.

1. The main advance is simulating information spread over empirical networks rather than artificial networks. I doubt that this constitutes a major advance. Specifically, the authors mention in the discussion that the results are in agreement with previous results obtained from artificial networks. Therefore, while it is important and interesting to use empirical networks, their usage did not seem to provide additional insight.

2. The results depend on the characteristics of great tit networks. Although many different networks were used in the simulations, we can expect different results if networks of other species were used. It would be helpful if the authors could provide some information about the structure of the empirical networks, and perhaps some predictions regarding their relevance to other species.

3. As far as I could tell, the specific code and data for running these simulations were not shared (not just the package). This makes it impossible to replicate the results and assess the data structure.

4. Learning rules in the wild were briefly mentioned in the introduction. Is there any evidence for the rules being used in wild animal populations, supporting the inclusion of these four rules?

5. Figure 2 and S2: I don't really understand why the order of acquisition is plotted as predicting the network traits (e.g., clustering coefficient). If there is any connection between the two it should be the other way around. To me it makes these figures confusing. Perhaps related, at least in the lower panels all individuals seem to have very similar trait values (e.g., weighted degree) -- how can this be explained?

6. L. 94-100: Earlier, the authors mention that individuals may tend to learn from specific individuals such as relatives or with a given trait value. To me that sounds like a very likely mechanism, such as "learn only from your mother, or from your strongest affiliation". I would be very interested to see versions of such rules tested here.

L. 90: Did you mean predator avoidance?

L. 152: The introduction mentioned weekly networks but here weekends are reported. How much time does each network include?

Figure 1: This figure is referenced only from the methods section. It raises a question: Why were the networks generated from only one location each time? I assume that two birds may have also interacted in adjacent locations. Perhaps some data on the overlap across adjacent locations can help.

L. 206-207: This statement does not seem to be supported by Figure 3, in which the correlation coefficients for weighted degree and weighted betweenness are actually lower in the simple learning rule.

L. 248…: I was surprised to see here, in the last part of the results, an explanation of the simulation procedure, which belongs either to the methods section, or to the beginning of the results.

L. 470: foal should be focal

Reviewer #2 (Recommendations for the authors):

Overall, I found the article interesting and feel that it provides an interesting examination of the consequences of how individuals learn on how behavioural contagions spread through animal societies. Previous studies have not applied these types of model to empirically-derived animal network data or studied as many forms of behavioural contagion simultaneously. Consequently, while (in the context of broader network science research) the individual results here are not new, it is valuable to see them presented together and in a way that is accessible to behavioural ecologists.

I felt the article was well-framed in the introduction, and the results were largely presented clearly and intuitively. In particular, I thought that Figures 2 and 5 were an excellent way to clearly show some of the main results of the study. However, there was insufficient information provided about the models or the experimental design of the modelling component prior to the results, which made interpretation of model results harder than it should be and at times made the results hard to follow when intertwined with methodological information.

I have some concerns about the modelling approach used, although perhaps some of them are related to the rather limited information provided on the simulation approach (I would really encourage the authors to provide clearer methods that incorporate functions for the learning rules and a step-by-step description of the simulation algorithm). It is not clear to me whether acquisition is deterministic based on likelihoods or probabilistic, and the step explaining how the likelihood of information being acquired socially or asocially is determined is particularly unclear. This plays into some concerns about the appropriateness of a strictly order-based algorithm -- it seems a rather artificial choice based on an existing statistical model rather than one that is clearly biologically justified. Given that the shape of the adoption curves for different contagions will differ, and that there is also an asocial learning component, avoiding a time-based approach seems like it could potentially have different consequences for different social learning rules. I can potentially see an argument that individuals would never truly acquire information at the "same" time, but this perhaps raises the question as to: (a) how meaningful changes in behaviour might or might not be to the subsequent individual to acquire the behaviour; and (b) how biologically meaningful differences in adoption time are (outside of very particular biological contexts such as during a predation event).

Another concern with the work presented here is that there are minimal checks of sensitivity to model parameterization (aside from changing the threshold number of connections in the threshold model). Parameter selection for the different forms of social learning is fairly weakly justified and not well explained, so it is not clear what effect this might have on the generalizability of the results. For example, previous modeling work has suggested that the difference (in speed) in how simple contagions and contagions based on conformist learning spread through some types of networks depends on how easily the contagion can spread. This may not impact results related to the order of acquisition, but it is hard to tell if this is likely to be the case, especially given there is also an asocial learning component that has an impact on the adoption of behaviours.

I also found it interesting that the networks were weighted using simple ratio indices, and the reasoning for this wasn't clear. There are some important assumptions hidden in this choice that could potentially impact the results of the study, related to what these indices represent and how individuals learn. An (intentionally) naïve starting point (in my mind at least) would be that the likelihood of social learning depends on the number of times individuals associate at a food resource rather than the proportion of times that they occur together (e.g., individuals that occur together in 4 out of 10 observations may well have more opportunities to learn than individuals that occur together in 1 out of 2). This is clearly a very direct interpretation of the simple ratio indices which ignores the potential for associations elsewhere and their "representative value". It is also based on the assumption that the number of associations is the factor that drives social learning (as opposed to, e.g., associations within a set amount of time, strength of social bond, etc.). However, it would be good to more clearly set out the reasoning for this choice and perhaps test or discuss the sensitivity to different assumptions here.
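
The contrast drawn in this comment can be made concrete with a minimal sketch. It assumes a simplified simple ratio index (co-occurrences divided by the number of observations of the pair) and uses the illustrative numbers from the comment; the function and variable names are hypothetical and are not taken from the authors' code.

```python
def simple_ratio_index(times_together: int, total_observations: int) -> float:
    """Simplified simple ratio index (assumed definition for illustration):
    the proportion of observations in which two birds were recorded together."""
    if total_observations <= 0:
        raise ValueError("total_observations must be positive")
    if not 0 <= times_together <= total_observations:
        raise ValueError("times_together must lie between 0 and total_observations")
    return times_together / total_observations


# Illustrative dyads from the comment: (times together, total observations).
dyads = {"pair_a": (4, 10), "pair_b": (1, 2)}

for name, (together, total) in dyads.items():
    sri = simple_ratio_index(together, total)
    # Report both candidate edge weights: the raw count and the proportion.
    print(f"{name}: co-occurrences={together}, SRI={sri:.2f}")
```

Under the proportion-based weight, pair_b has the stronger edge (0.50 versus 0.40), whereas under a count-based weight pair_a does (4 versus 1), so the two weightings can rank the same dyads in opposite order.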

Away from the research itself, one frustration when reading this paper is that it does a fairly poor job of placing itself in the context of the wider literature. First, while it does a good job of citing the relevant studies that conduct similar modelling work in animal societies, there is relatively little effort to engage with the findings of these studies in the introduction or discussion of the paper. There is a real opportunity here to unpack the results of this study in relation to similar and contrasting findings from other papers that is missed here. Different papers have focused on different aspects of how social structure and connections influence contagions in animal societies and by linking better with some of these papers it could perhaps address how the findings of this study might generalize (e.g., to different social structures, considering different "transmissibilities" of contagions, etc.). Second, there is little effort made to acknowledge or consider the large number of modelling studies that address similar questions in the broader network science literature. While the network here is empirically derived, from that point on this study is purely computational and there are studies that have addressed very similar or overlapping questions elsewhere in this literature (e.g., how the number of connections influences speed of acquisition for different forms of contagion).

Related to this point, it would be nice to see a more nuanced discussion about the strengths and weaknesses of computational research that applies simulation models to a single empirical case study vs. research that applies similar computational models to more generalized network structures. While there was a point when these types of model were applied to very generic network structures (random, small-world, etc.) and to an extent still are in network science (where the research aims are somewhat different), more frequently now studies that use simulated network structures do so with express biological questions in mind and design simulated networks accordingly. Taking this approach is a powerful way of tackling specific questions and/or generating a range of generalizable structures. Equally, applying these types of models to particular empirical case studies is very valuable in its own right for different reasons. Related to the previous point, I think it would be great to make the most of their complementary strengths to better integrate the lessons learned from these different approaches.

This recently published paper is perhaps relevant/useful:

https://royalsocietypublishing.org/doi/full/10.1098/rspb.2022.1001

L20: Given the weak correlations illustrated in Figure 3, it feels slightly misleading to describe this as a "strong" relationship.

L22-23 (and elsewhere): Discussion of this idea throughout the paper doesn't acknowledge previous work showing this outside ecology. This review contains more links to studies in network science as a useful resource https://onlinelibrary.wiley.com/doi/full/10.1111/oik.07148?saml_referrer. For example, this paper https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0020207 tackles how conformist social learning leads to this pattern.

L48: "requiring exposure to multiple sources" isn't necessarily a difference between infection and behaviour spread in networks (e.g., see literature on dose-response curves).

L76-78: Evans et al. (2021) also consider a simple contagion and conformist learning and explore the potential implications of considering the simple contagion as being something other than the spread of infection in the discussion.

L83-85: While I think a really interesting aspect of this study is its exploration of the role of network size, I am not sure how well this criticism works (in its current form) given that even big networks can capture interaction patterns at small spatial scales, and that how network structure depends on the temporal scale will be highly study-system dependent.

L96-100: Given the methods come last, these rules need to be more accurately described here to help with the interpretation of the results. It would also be good to provide a more in-depth exploration of previous theory/discussion about these rules beyond ecology.

L112-113: Low clustering coefficient will not always indicate a less sociable individual, this will depend considerably on the broader structure of the network. It would seem to in this study, but some greater context would be helpful here. In the results it would be helpful to quickly describe the correlation between the centrality measures used in the studied networks.

Figure 1: Would be useful (and take up no extra space) to give the thresholds used here.

L151-154: It would be good to provide more clear information on networks per feeder site, network appearances per bird, etc., and an indication of what timespan were data included from. Also perhaps valuable to point out that the minimum of 10 (and rest of these descriptive stats) applies after some networks were excluded.

Figure 2: One thing I found particularly interesting in these results is that for simple, threshold and to some extent conformity there appears to be a stronger pattern that being the most peripheral is costly than that being the most central is beneficial. Clearly, this may depend on what the information/behaviour that is spreading is related to, but it's a neat result and perhaps worthy of some discussion for what it means for social ecology/evolution in this system.

L210: Is this now meant to refer to Figure 3?

Figure 3/Figure 4: Even the correlations different from zero are (predominantly) small. Later in the paper it might be interesting to discuss how biologically meaningful they are in networks of different sizes in this system. One thing apparent in these results is that there is a lot of noise (presumably) related to the seeded individual. It may be worth using this to highlight that in small networks who you know is just as/more important than how many you know -- even if it is just as a suggestion for further research.

Figure 4: I found distinguishing between the two blues here pretty difficult. Another colour scheme may be clearer or a different way of presenting the data given how dense the point clouds are.

L219-231: It would be good to make this aspect of the experimental design clearer earlier. I also found these results were written less clearly than other parts -- some rewriting might be helpful.

L248-255: Some indication of how the model broadly works such as this should ideally come at the end of the methods to help with interpreting the results in this methods-last format.

L270-274: Would perhaps be helpful at the end of the introduction to set up the model or in the methods?

L362-364: Is there not a little more nuance here, given that there won't necessarily be a single learning rule for each behaviour? This suggests perhaps that the importance of different learning rules varies as a behaviour spreads.

L378-380: Similarly to the previous point, an appropriately parameterized (dynamic) network for a large population can capture interactions at very fine spatial and temporal scale -- this doesn't seem so much a point directly related to network size. One aspect of network size that perhaps becomes interesting is that the importance of "who" you know versus how many you know changes in importance in different size networks.

L383: Is there a reason for using network "shape" here rather than more established terms like topology or structure?

L387-405: Good to see a sensible consideration of model limitations.

L449-453: Is there empirical data to support this for this study system given it has been the subject of previous experimental work? Justifying with previous empirical work would strengthen this point considerably.

L449-453: What biases do you introduce to the networks by only including interactions at a single feeder? How many individuals use multiple feeders?

L470: I am not convinced the use of clique (in the strict network analysis definition) is correct here.

L505: I would suggest making it clear earlier in this description that simulations were repeated multiple times in each network.

L520: Is this number correct given different thresholds were used too for further simulation runs?

L530-53: Would be good to provide confirmatory information on model goodness of fit checks and clarify how statistical inference was done (presumably from the full/fitted model?).

I hope my comments help improve the manuscript.

Reviewer #3 (Recommendations for the authors):

This study is a timely theoretical exploration of how variation in transmission rules interacts with variation in social phenotype to influence diffusion dynamics. The authors predict that the likelihood of an individual adopting novel behaviors should depend on the learning rule, as well as the individual's sociality. The authors explore the spread of behavior under 4 different learning rules: a simple adoption rule, a threshold rule, a proportion rule and a conformity adoption rule. They quantify the sociality of individuals by calculating their weighted degree, clustering coefficient and betweenness.

The authors find that under simple and threshold rules, high degree, high betweenness, and low clustered individuals acquired the seeded behavior earlier in simulations. Under proportional and conformity rules, there was no strong relationship between social phenotype and order of acquisition. The authors find that network size predicts the magnitude and direction of the correlation between social phenotype and order of acquisition and that this relationship also depends on the learning rules.

Strengths

1. Overall the paper is well written, and the motivation for using computational simulations is well warranted to explore this question.

2. The topic is timely, and tackles an important theoretical question of how variation in learning rules might interact with social phenotype to influence cultural diffusions. This is a difficult topic to address but is critical for improving our understanding of how diffusions might differ between populations.

3. The authors construct simulations using real social networks of great tits, rather than artificially generated networks, which is a rarity, and thus of great value, in the SL modeling literature. These networks are hard earned -- taken at a relatively fine temporal scale from weekly sampling.

4. The authors provide a thorough discussion of the implications of their results.

Weaknesses

1. One main weakness of the paper is the lack of details given about the transmission model. The authors do not provide equations, descriptions of parameters, or a detailed schedule of the model. The descriptions they do provide are spread throughout the manuscript, making it more difficult to assess. NBDA is definitely an appropriate model for the question they want to answer, but it seems like the authors have altered some features of the model (e.g., only 1 individual can learn per timestep, 1 individual must learn per timestep). This lack of clarity makes it harder to assess the results they present. Further, from their description, it appears that they have allowed for asocial learning, which adds unnecessary noise to a study that is focused on social transmission.

2. Another main weakness is that the authors do not use a sensitivity analysis, and thus it is difficult to assess the relative effects of each network metric, as they are not necessarily independent of one another. For example, degree and clustering can be correlated simply as a result of how clustering is calculated. This is the downside of using real networks, as without synthetic data, there may be insufficient data to perform a sensitivity analysis. Further, the authors do not present an assessment of variation in their results, instead showing mean values within network size as evidence of their claims.

3. Related to the interpretation of sociality, there is opportunity to increase clarity. The authors describe more social individuals as having a high degree, high betweenness, and low clustering, and less social individuals as low degree, low betweenness and high clustering. One could also imagine a bird who has high degree, low betweenness, and high clustering, being at the center of their group, but rarely going between groups. It seems harder to argue that this bird is less social than a bird with high degree and high betweenness but low clustering. The manuscript would benefit from a careful description of how different combinations of these social metrics could be interpreted.
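
For readers less familiar with NBDA, the formulation that the first weakness refers to can be sketched as follows. This is a minimal illustration of a commonly used NBDA-style rate, in which an uninformed individual's relative rate of acquisition is an asocial term plus s times its summed association strength to informed individuals, combined with an order-based step in which exactly one individual learns per event. The parameter names A and s follow the notation used in these reports; the function names, the toy network and the parameter values are hypothetical, and the authors' actual implementation may differ.

```python
def nbda_rates(assoc, informed, s=1.0, A=1.0):
    """Relative acquisition rate per individual (illustrative NBDA-style form):
    rate_i = A + s * sum_j assoc[i][j] * informed[j] for uninformed i, else 0."""
    n = len(assoc)
    rates = []
    for i in range(n):
        if informed[i]:
            rates.append(0.0)  # already informed: cannot acquire again
            continue
        social = s * sum(assoc[i][j] for j in range(n) if j != i and informed[j])
        rates.append(A + social)
    return rates


def next_learner_probs(rates):
    """Order-based step: probability that each individual is the next to learn,
    proportional to its rate (exactly one individual learns per event)."""
    total = sum(rates)
    if total == 0:
        return [0.0] * len(rates)  # nobody can learn (e.g. A = 0 and no ties)
    return [r / total for r in rates]


# Hypothetical 3-bird network: bird 0 is the seeded demonstrator.
assoc = [
    [0.0, 0.5, 0.0],
    [0.5, 0.0, 0.25],
    [0.0, 0.25, 0.0],
]
informed = [True, False, False]

with_asocial = next_learner_probs(nbda_rates(assoc, informed, s=1.0, A=1.0))
pure_social = next_learner_probs(nbda_rates(assoc, informed, s=1.0, A=0.0))
print("A = 1:", with_asocial)
print("A = 0:", pure_social)
```

In this toy case, setting A to 0 means only the bird tied to the demonstrator can learn next, whereas with A = 1 the unconnected bird also has a non-zero chance; this is the stochasticity from asocial learning that the reports ask the authors to separate from social transmission.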

Point by point comments

1. It should probably be mentioned somewhere in the introduction that social learning rules apply to either the social transmission of novel behavior (e.g., Aplin et al. 2015) or the social influence of others on behavior (e.g., Pike and Laland 2010, Danchin 2018), and that you aim specifically to look at social transmission.

2. The initial conditions of the simulations are not well enough explained before we get to the results. I was left wondering how the authors chose the first knowledgeable agent, which isn't answered until later.

3. The methods section could use more explanation. Those who are unfamiliar with NBDA would need to refer to other publications to see the equations, especially the meaning of 's=1', etc. Also, what are the other parameters set to (λ, A)? Consider including the equations, as well as a more thorough description of parameters. The same could be said for equations describing the network metrics.

4. Related to point 3, what happens when A=0, in a pure social learning environment? This would reduce stochasticity due to asocial innovations, and would provide a pure test of the effect that the authors predict arises from sociality and learning rules.

5. L470 "foal" should be "focal".

6. I actually think it's more helpful to show results when you standardize network size in the main text, and put Figure 2 in the supplement. Figure 2 is difficult to read, and something like Figure S2 is easier to interpret if you're assessing relative differences in diffusion dynamics. Also rather than presenting each network size as a color, select one network size (or a binned size) and present a variation metric (e.g., percentile intervals).

7. Suggest changing section title "Social network size and behavioral spreading" to match first section "Relationship between…". Also suggest "diffusion" rather than behavioural spreading. After reading the section, this seems more to do with how network size impacts the correlation between variables, rather than the diffusion itself. Maybe change the heading to reflect this?

8. L204 – 231: Overall I think this section is fairly dense compared to the first section, and after reading it several times, I'm still not sure what I should take away from it. It looks like you have a very low N at large network sizes, which could drive some of these correlations in Figure 4. The fact that agents can asocially learn also makes it hard to interpret what these correlations mean.

9. L204-216: I had to read the beginning of this section several times, and it's more confusing than the first section of results. The results communicated until L210 do not relate to network size, and seem to repeat the previous section. I suggest removing this or incorporating it into the previous section and starting with how network size affected simulation dynamics.

a. Also I suggest rewriting to avoid putting the variable of interest in parentheses (e.g., L 209, 213).

b. L208: "the mean average network metric" was confusing -- do you mean "average network measure"?

c. L210: I suggest "The direction and magnitude of the correlation between individual sociality and order of acquisition were predicted by network size. This relationship was modulated by transmission rule…" to improve clarity.

10. Figure 4: I find it very hard to see all 4 lines, maybe choose different colors?

11. L224: Which mean network metric?

12. L248: This information about initial conditions should come before the results.

13. L248: "Spreading simulation" -> diffusion simulation.

14. L253: Without the equations written out in the methods, it's difficult to assess how the learning model works. Is asocial learning turned off under obligate social learning? It's my understanding that in NBDA, the s parameter controls the relative strength of social learning per unit connection to asocial learning. In the usual formalization of NBDA, the probability of asocial learning is constant in all individuals, contra L251 which states that asocial learning only occurs in unconnected individuals. Does your model assume that individuals who are "well-connected" (also undefined in the manuscript) have the $A$ parameter set to zero? If this is the case, the authors should include a justification/definition of being well-connected.

15. L317: I'm still finding it hard to wrap my head around how a bird with low clustering is central and highly social. A nice way to explain/justify the differences between more and less social individuals would be to make a figure of an exemplar network, with several stereotypes highlighted, along with their social metrics.

16. L332: Cantor et al. (2021, Proc. R. Soc. B) should probably also be cited here, as they measure the performance of recombination and subsequent diffusion.

17. L342: Overall, this manuscript has synergy with the study "Cultural diffusion dynamics depend on behavioural production rules" (doi.org/10.1098/rspb.2022.1001), which explicitly explores the difference between acquisition and usage, and also uses NBDA as a generative model. It would be relevant to cite here.

18. Figure 5: If individuals have a low probability of social learning, do they have a high probability of asocial learning? Or not learning at all? Are there cases when both the probability of individual learning and social learning are low? Also, this is another case where normalizing the x axis between network sizes would be more informative. The authors might set asocial learning to 0 and simply directly measure the probability of acquisition by each naive agent at each time-step, since the manuscript is focused on social transmission rather than social transmission and asocial learning.

19. Related to the interpretation of the model, the authors use the word "adopt" throughout the manuscript, although one could argue that their model is not of adoption, but of knowledge transmission, since there is no mechanism to determine whether individuals would actually use the behavior once acquiring knowledge of it. In other places, the authors have used the language of knowledge transmission (e.g., Figure 1 caption). It might be best to stick with knowledge transmission throughout the paper.

https://doi.org/10.7554/eLife.85703.sa1

Author response

[Editors’ note: the authors resubmitted a revised version of the paper for consideration. What follows is the authors’ response to the first round of review.]

Comments to the Authors:

All three reviewers found the modeling approach and main results valuable. That said, they raised a number of major concerns, which can be summarized as follows (for many additional points, please see their original reports below):

(1) A more thorough description of the model should be provided. Ideally, all code should be made available so that readers can replicate the modeling.

A major change to the accessibility of the model and associated methods description is the structural change of the manuscript, i.e. through the editors allowing us to present our methods section before the results, we are now able to relay this information more clearly. In our revised manuscript, we provide further description of the model, all relevant equations, and have added more detailed explanation of our methodological approach. Further, we have made all data and code available at https://osf.io/6jrhz/ so that the readers can replicate the modelling, and also easily modify it to fit their own system’s needs.

(2) Parameter selection is weakly justified, and sensitivity analyses for the parameters are missing.

We apologise for this shortcoming. In our revised manuscript, we added a justification of our parameter selection in the methods section on ‘Simulations’ and we also performed a range of sensitivity analyses by testing multiple combinations of parameters for each of the different simulation types (lines 324-330; line numbers refer to the revised manuscript). In the main text, we present results for the parameters s=5 (social transmission rate), f=5 (frequency dependence parameter), a=5 (threshold location), but we also present results using a range of smaller and larger parameter values in the extensive supplementary material (Figure S6-S15). Further, because we provide all data and code, interested readers can now repeat our analyses and also run them with different parameters. Interestingly, we find that the parameter selection generally makes little difference to the conclusions drawn here.

(3) It is unclear to what extent the results depend on the characteristics of great tit networks and how they are relevant to other species. Related to this point, the relevant literature needs to be covered better, including earlier work using empirically-recorded networks to simulate diffusion processes.

We apologise for this shortcoming. In our revised manuscript, we provide a much more detailed description of great tit social structure in the methods (lines 165-172, 240-247, Figure S1, S2) and elaborate on how our results on great tit networks may be similar/different to other species with similar/different social structures (lines 586-602). We also provide a range of points on which considerations/findings are likely to be relevant across systems. In addition, we substantially rewrote our introduction and discussion, which now include relevant work on similar topics in sociology and network science, and engage more with the findings of other studies (e.g. lines 79-82, 89-96, 572-576, 641-645).

Reviewer #1 (Recommendations for the authors):

In this manuscript, the authors simulate learning processes using data collected from wild birds on their social networks. The results suggest that the nature of the learning process, as well as the size of the social network, determine which individuals will adopt a new behavior faster.

1. The main advance is simulating information spread over empirical networks rather than artificial networks. I doubt that this constitutes a major advance. Specifically, the authors mention in the discussion that the results are in agreement with previous results obtained from artificial networks. Therefore, while it is important and interesting to use empirical networks, their usage did not seem to provide additional insight.

We have now revised various aspects of the writing in relation to this comment. Indeed, we believe that the main advance of our study is not only (1) the use of empirical networks, as identified by the reviewer here, but also that we (2) test four different types of behavioural contagion simultaneously, (3) specifically examine the relationship between individual sociality and the probability of behavioural adoption (rather than just comparing the effect of different contagions on the ‘efficiency’, e.g. the speed, of spread within a population), and (4) explore the effect of network size by considering real networks generated from social behaviour on a relatively small temporal scale. While some of these aspects may have been investigated separately in human and network sciences, they have not been investigated in tandem, nor have they been explored in animals, and the availability of such fine-scale information on the social behaviour of many individuals across space and time is rare in animal studies. Some of our results are in line with previous work on artificial networks but also add new aspects to it. For instance, Evans et al. (2020) simulated a simple and a conformity contagion on different social structures and showed that disease/information spread is generally faster (i.e. more individuals are infected/informed after a given time) for simple contagion than for conformity contagion. Their findings show that the diffusion dynamics depend on the underlying social learning rule. Our study is in line with their findings in the sense that our results also suggest that diffusion dynamics depend on the learning rule considered. However, while Evans et al. (2020) investigated the ‘efficiency’ of spread (i.e. time until a certain number of individuals had been infected/informed), we specifically focus on the order in which individuals of different network positions acquired novel behaviour.
Throughout the revised manuscript, we highlighted in more detail how our study differs from and complements other studies (e.g. lines 58-69, 89-95, 134-142, 641-645; line numbers refer to the revised manuscript). Further, we believe that considering empirical work in addition to simulation work is crucial to test whether findings are the same or different between approaches. The reviewer’s next comment also highlights the importance of considering empirical networks because different species may provide different results. Simulation studies and more empirical studies both have advantages and disadvantages, and we believe that treating the two approaches as complementary will provide valuable information on how different contagions and sociality shape behavioural diffusion (we discuss this point in lines 586-602).

Evans, J. C., Silk, M. J., Boogert, N. J., and Hodgson, D. J. (2020). Infected or informed? Social structure and the simultaneous transmission of information and infectious disease. Oikos, 129(9), 1271-1288.

2. The results depend on the characteristics of great tit networks. Although many different networks were used in the simulations, we can expect different results if networks of other species were used. It would be helpful if the authors could provide some information about the structure of the empirical networks, and perhaps some predictions regarding their relevance to other species.

We have acted on this comment both with regard to quantifying great tit social structure and in terms of generalising this to other study systems more broadly. Specifically, we have added histograms of the distribution of four global network metrics (i.e. network density, modularity, average edge strength, average path length) depicting the great tit social network structure (see Figure S2). Great tits form fission-fusion societies with groups of variable size and composition in which individuals join, leave, and re-join groups at frequent intervals. Fission-fusion societies are not only widespread among other bird species (see references within Silk et al. 2014) but are also frequently found in mammals (e.g. different species of primates (Amici et al. 2008), hyenas (Smith et al. 2007), dolphins (Lusseau et al. 2006)) and fish (e.g. guppies (Wilson et al. 2014), reef sharks (Papastamatiou et al. 2020)). Therefore, our observed social structures are likely meaningful for other species as well. However, many species live in highly stable groups (e.g. many primates), which may provide different results. Such species may, for example, express lower variation in social connectivity (i.e. a homogeneous degree distribution), in which case an individual’s social network position may, under certain behavioural contagions, not have a strong impact on the probability of information acquisition (in contrast to social structures with heterogeneous degree distributions, such as expected in more gregarious species). Thus, our findings may differ for species living in highly stable social structures, and it would be interesting for future work to test how different behavioural contagions spread on social networks of species with contrasting social structures (and the framework we use here would enable this).
Yet, the common point remains that, across all systems, the type of contagion in play will shape which individuals (with regard to their network position) acquire and transmit this contagion. We added more information on the great tit social structure in lines 165-172, 240-247 (and Figure S2) and discuss in more detail how our results may relate to other species in lines 583-602.

Aplin LM, Farine DR, Morand-Ferron J, Cockburn A, Thornton A, Sheldon BC. 2015. Experimentally induced innovations lead to persistent culture via conformity in wild birds. Nature. 518(7540):538–541.

Centola D. 2018. How behavior spreads: The science of complex contagions. Princeton University Press Princeton, NJ.

Danchin E, Nöbel S, Pocheville A, Dagaeff A-C, Demay L, Alphand M, Ranty-Roby S, Van Renssen L, Monier M, Gazagne E. 2018. Cultural flies: Conformist social learning in fruitflies predicts long-lasting mate-choice traditions. Science (80- ). 362(6418):1025–1030.

Firth JA, Albery GF, Beck KB, Jarić I, Spurgin LG, Sheldon BC, Hoppitt W. 2020. Analysing the Social Spread of Behaviour: Integrating Complex Contagions into Network Based Diffusions. arXiv Prepr arXiv201208925.

Hoppitt W, Laland KN. 2013. Social learning. Princeton University Press.

Kendal RL, Boogert NJ, Rendell L, Laland KN, Webster M, Jones PL. 2018. Social learning strategies: Bridge-building between fields. Trends Cogn Sci. 22(7):651–665.

Lachlan RF, Ratmann O, Nowicki S. 2018. Cultural conformity generates extremely stable traditions in bird song. Nat Commun. 9(1):1–9.

Rosenthal SB, Twomey CR, Hartnett AT, Wu HS, Couzin ID. 2015. Revealing the hidden networks of interaction in mobile animal groups allows prediction of complex behavioral contagion. Proc Natl Acad Sci. 112(15):4690–4695.

Van de Waal E, Borgeaud C, Whiten A. 2013. Potent social learning and conformity shape a wild primate’s foraging decisions. Science (80- ). 340(6131):483–485.

Whiten A. 2005. The second inheritance system of chimpanzees and humans. Nature. 437(7055):52–55.

3. As far as I could tell, the specific code and data for running these simulations were not shared (not just the package). This makes it impossible to replicate the results and assess the data structure.

We apologize for not providing the code and data earlier. Data and code to replicate all results are now available at https://osf.io/6jrhz/.

4. Learning rules in the wild were briefly mentioned in the introduction. Is there any evidence for the rules being used in wild animal populations, supporting the inclusion of these four rules?

Many different social learning rules are used within animal populations (Hoppitt and Laland 2013; Kendal et al. 2018), and we reference such work throughout our text. More specifically, several studies provide evidence for the conformity learning rule, where individuals are disproportionately more likely to acquire a behaviour when that behaviour is being performed by the majority of the population (Whiten 2005; Van de Waal et al. 2013; Aplin et al. 2015; Danchin et al. 2018; Lachlan et al. 2018). The proportional learning rule is conceptually just a special case of the conformity rule (i.e. it only modifies the shape of the curve (Firth et al. 2020)); as such, it is well within the scope of many animal species and differs from the conformity rule only in the assumption that the rate of social transmission is proportional (rather than disproportional). In contrast to studies on human behaviour (e.g. see references within Centola 2018), there is little research on threshold-type learning in animals (but see Rosenthal et al. 2015). We added more information on the evidence for these learning rules in our revised introduction and provided a more in-depth exploration of previous theory/discussion about these rules beyond ecology (lines 55-58, 124-142). Finally, we believe that our work here, and the encouragement of such approaches, will increase the evidence for learning rules in the wild. Specifically, the current NBDA approach does not allow users to directly test for such learning rules, but these new empirical approaches (as applied here) will allow investigation of (and evidence for) learning rules.

Aplin LM, Farine DR, Morand-Ferron J, Cockburn A, Thornton A, Sheldon BC. 2015. Experimentally induced innovations lead to persistent culture via conformity in wild birds. Nature. 518(7540):538–541.

Centola D. 2018. How behavior spreads: The science of complex contagions. Princeton University Press, Princeton, NJ.

Danchin E, Nöbel S, Pocheville A, Dagaeff A-C, Demay L, Alphand M, Ranty-Roby S, Van Renssen L, Monier M, Gazagne E. 2018. Cultural flies: Conformist social learning in fruitflies predicts long-lasting mate-choice traditions. Science. 362(6418):1025–1030.

Firth JA, Albery GF, Beck KB, Jarić I, Spurgin LG, Sheldon BC, Hoppitt W. 2020. Analysing the Social Spread of Behaviour: Integrating Complex Contagions into Network Based Diffusions. arXiv preprint arXiv:2012.08925.

Hoppitt W, Laland KN. 2013. Social learning. Princeton University Press.

Kendal RL, Boogert NJ, Rendell L, Laland KN, Webster M, Jones PL. 2018. Social learning strategies: Bridge-building between fields. Trends Cogn Sci. 22(7):651–665.

Lachlan RF, Ratmann O, Nowicki S. 2018. Cultural conformity generates extremely stable traditions in bird song. Nat Commun. 9(1):1–9.

Rosenthal SB, Twomey CR, Hartnett AT, Wu HS, Couzin ID. 2015. Revealing the hidden networks of interaction in mobile animal groups allows prediction of complex behavioral contagion. Proc Natl Acad Sci. 112(15):4690–4695.

Van de Waal E, Borgeaud C, Whiten A. 2013. Potent social learning and conformity shape a wild primate’s foraging decisions. Science. 340(6131):483–485.

Whiten A. 2005. The second inheritance system of chimpanzees and humans. Nature. 437(7055):52–55.

5. Figure 2 and S2: I don't really understand why the order of acquisition is plotted as predicting the network traits (e.g., clustering coefficient). If there is any connection between the two it should be the other way around. To me it makes these figures confusing. Perhaps related, at least in the lower panels all individuals seem to have very similar trait values (e.g., weighted degree) -- how can this be explained?

Yes, we would expect that the network trait predicts the order of acquisition (OAC), as we discuss in the text in some depth. However, in these figures, we intended to show the OAC as a sequential event, which is why we presented the OAC on the x-axis. Further, we only aimed to illustrate the correlations between the OAC and individual network metrics, and we did not perform analyses where one variable predicted the other. Finally, the network metrics are scaled (centred and divided by the standard deviation) and cannot be divided into separate ranks/groups as readily as the OAC. If we calculate a mean OAC (y-axis) for each value of a network metric (x-axis), this results in mean OACs for many distinct values of the network metric and produces a figure that is difficult to read (see Author response image 1). Because of these problems in visualising our results that way, and because the other two reviewers did not raise concerns about Figure 2/S2, we decided to leave the OAC on the x-axis. The previous Figure 2/S2 showed averaged network metrics for each OAC. In the lower panels, the averaged network metrics across the OAC all lie around 0 because there is no apparent relationship between the OAC and network metrics across all network sizes and simulation runs when using the proportion and conformity learning rules.

Author response image 1
Relationship between order of acquisition and weighted degree for the simple learning rule.

Lines plot the average OAC for each value of weighted degree (scaled) and network size from the 100 simulations of each network. Colour represents network size with darker colour indicating smaller networks.

6. L. 94-100: Earlier, the authors mention that individuals may tend to learn from specific individuals such as relatives or with a given trait value. To me that sounds like a very likely mechanism, such as "learn only from your mother, or from your strongest affiliation". I would be very interested to see versions of such rules tested here.

We agree that many different rules (other than the ones used here) would be very interesting to test. In this particular case, however, great tits are relatively short-lived, with a mean life span of about 1.9 years (Bulmer and Perrins 1973). Short life spans result in a high annual population turnover of about 50%, which results in no prominent kin structure: fewer than 1.5% of winter social connections are between first-order relatives (Firth and Sheldon 2016). We added this information in lines 165-170. Therefore, a learning rule based on ‘only learn from your mother’ is rather unlikely in tits, given the lack of kin structure in this population. However, such a rule might be highly relevant for species with more prominent kin structures, e.g. where mother-offspring bonds are stronger than bonds with other group members. We highlight in our revised discussion that testing such a rule in future work would be very interesting (lines 703-707). A rule based on ‘learn from your strongest affiliation’ may result in behaviours not spreading rapidly because, if the strongest affiliate of individual A is individual B, it is very likely that the strongest affiliate of individual B is individual A. For instance, in great tits, future social pairs often have the strongest association strength. Therefore, spreading such a behaviour across a population would require a substantial number of asocial learning events. Nevertheless, we believe that our work here generally encourages readers to consider which social learning rules may be operating in their system, and provides the conceptual framework to test a number of rules.
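The argument about a ‘learn from your strongest affiliation’ rule can be illustrated with a minimal sketch. The toy network, the function names and the strict reading of the rule (an individual learns only once its single strongest affiliate is informed, with no asocial learning) are all assumptions made for illustration; they are not taken from the manuscript or its code.

```python
def strongest_affiliate(weights):
    """Map each individual to the neighbour with the largest edge weight."""
    return {i: max(nbrs, key=nbrs.get) for i, nbrs in weights.items() if nbrs}


def spread_from(seed, weights):
    """Individuals reached when each learns only from its strongest affiliate.

    Hypothetical strict rule: a naive individual becomes informed only when
    its strongest affiliate is informed; asocial learning is switched off.
    """
    best = strongest_affiliate(weights)
    informed = {seed}
    changed = True
    while changed:
        changed = False
        for individual, affiliate in best.items():
            if individual not in informed and affiliate in informed:
                informed.add(individual)
                changed = True
    return informed


# Hypothetical symmetric association network with two tightly bonded pairs
# (A-B and C-D) joined by weaker edges.
toy_network = {
    "A": {"B": 0.9, "C": 0.3},
    "B": {"A": 0.9, "C": 0.5},
    "C": {"A": 0.3, "B": 0.5, "D": 0.8},
    "D": {"C": 0.8},
}

best = strongest_affiliate(toy_network)
reached = spread_from("A", toy_network)
print(best)
print(sorted(reached))
```

In this toy case A and B are each other's strongest affiliate, as are C and D, so a behaviour seeded in A reaches B and then stalls; crossing to the C-D pair would need an asocial learning event.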

Bulmer MG, Perrins CM. 1973. Mortality in the great tit Parus major. Ibis. 115(2):277–281.

Firth JA, Sheldon BC. 2016. Social carry-over effects underpin trans-seasonally linked structure in a wild bird population. Ecol Lett. 19(11):1324–1332.

L. 90: Did you mean predator avoidance?

We refer here to studies providing evidence for social information transmission on the avoidance of aposematic prey by predators (blue tits and great tits).

L. 152: The introduction mentioned weekly networks but here weekends are reported. How much time does each network include?

We apologise for the lack of clarity. The ‘weekly’ networks only represent a weekly snapshot of data collected over a weekend (i.e. this is the weekly sampling) and thus represent two-day networks. We now clarify this in lines 118-119.

Figure 1: This figure is referenced only from the methods section. It raises a question: Why were the networks generated from only one location each time? I assume that two birds may have also interacted in adjacent locations. Perhaps some data on the overlap across adjacent locations can help.

In agreement with the editors, the methods section now precedes the Results section in our revised manuscript and therefore Figure 1 is referenced earlier. We had several reasons for creating social networks for each feeder location separately, and understanding the spread of behaviours within local populations (rather than between local populations) was a key aspect of this manuscript. More specifically: First, we generated local social networks to remove any spatial effects influencing the probability of social behavioural acquisition. For instance, within a subpopulation where a new behaviour emerges, individuals with high connectivity may be faster in adopting the behaviour. However, if examined at the population level, such a relationship may be obscured by spatial effects, because an individual’s probability of behavioural adoption will be strongly predicted by its spatial proximity to the location of behavioural emergence, regardless of its overall social network position within the whole population. Further, focusing on one feeder location provides a comparable spatial unit and does not require drawing arbitrary spatial boundaries across the study site. Second, in our study system birds on average only visit 1.3 feeder locations on a given weekend (min=1, max=10, sd=0.7) and, of the 21036 occasions where individuals were recorded on a given weekend, individuals had visited only one location on 14888 occasions (71%). Therefore, in the majority of cases, individuals only visited one feeder on a weekend. We added this information in lines 364-373. Last, we aimed to test behavioural diffusion on a wide variety of social networks, which we obtained by calculating weekly and local social networks. Creating only weekly networks across the whole population would have resulted in only 39 social networks (compared to 1343 weekly, local networks). We added a justification of why we calculated local social networks in lines 199-207.

L. 206-207: This statement does not seem to be supported by Figure 3, in which the correlation coefficients for weighted degree and weighted betweenness are actually lower in the simple learning rule.

We meant to say that the simple learning rule created the strongest positive and negative correlations overall. We now clarify this in lines 409-411.

L. 248…: I was surprised to see here, in the last part of the results, an explanation of the simulation procedure, which belongs either to the methods section, or to the beginning of the results.

In agreement with the editors, we restructured our manuscript so that the methods section precedes the Results section. We therefore removed the explanation of the simulation procedure from the Results section and give a more detailed explanation in the methods (see section on ‘Simulations’), which will hopefully increase clarity.

L. 470: foal should be focal

Changed in line 230.

Reviewer #2 (Recommendations for the authors):

Overall, I found the article interesting and feel that it provides an interesting examination of the consequences of how individuals learn on how behavioural contagions spread through animal societies. Previous studies have not applied these types of model to empirically-derived animal network data or studied as many forms of behavioural contagion simultaneously. Consequently, while (in the context of broader network science research) the individual results here are not new, it is valuable to see them presented together and in a way that is accessible to behavioural ecologists.

I felt the article was well-framed in the introduction, and the results were largely presented clearly and intuitively. In particular, I thought that Figures 2 and 5 were an excellent way to clearly show some of the main results of the study. However, there was insufficient information provided about the models or the experimental design of the modelling component prior to the results, which made interpretation of model results harder than it should be and at times made the results hard to follow when intertwined with methodological information.

Thank you for the encouraging comments here, and for the suggestions. In agreement with the editors, we restructured our manuscript so that the methods section precedes the Results section, which will hopefully increase clarity. Further, we added a more detailed description of our modelling procedure, including all necessary equations (see methods section on ‘Simulations’), and provide all R code and data to replicate our results.

I have some concerns about the modelling approach used, although perhaps some of them are related to the rather limited information provided on the simulation approach (I would really encourage the authors to provide clearer methods that incorporate functions for the learning rules and a step-by-step description of the simulation algorithm). It is not clear to me whether acquisition is deterministic based on likelihoods or probabilistic, and the step explaining how the likelihood of information being acquired socially or asocially is determined is particularly unclear. This plays into some concerns about the appropriateness of a strictly order-based algorithm -- it seems a rather artificial choice based on an existing statistical model rather than clearly biologically justified. Given the shape of the adoption curves for different contagions will differ, and that there is an asocial learning component, avoiding a time-based approach seems like it could potentially have different consequences for different social learning rules. I can potentially see an argument that individuals would never truly acquire information at the "same" time, but this perhaps raises the question as to: (a) how meaningful changes in behaviour might or might not be to the subsequent individual to acquire the behaviour; and (b) how biologically meaningful differences in adoption time are (outside of very particular biological contexts such as during a predation event).

Thank you for this comment. Firstly, in response to the aspects of clarity around our simulation description, we have reworded various parts of this description (as well as placing it before the Results), and we added a more detailed step-by-step explanation of our simulation algorithm, including all relevant equations, in our methods section ‘Simulations’, which will hopefully improve clarity. Secondly, regarding the order of acquisition: we were specifically interested in the relationship between individual sociality and the order of acquisition, and not necessarily in the time it takes to inform a given proportion of the population (which is the focus of many other studies, e.g. Evans et al. 2021) or in the exact time when individuals adopt a behaviour. This is due to two things:

(1) Even though our networks are empirical, the diffusions upon them are simulated, and as such the order-based approaches make fewer assumptions (and require fewer parameters) than the time-based approaches. (2) Within natural populations with empirical diffusions, the order-based approach is one of the most used and most generalisable approaches for considering the spread of information and behaviours (albeit within the context of simple NBDA), and as such this work is a close match to those empirical pieces. We agree that the ‘order of acquisition’ is somewhat artificial and that in natural settings, individuals will adopt a novel behaviour at different times, with some individuals adopting the behaviour almost simultaneously while others adopt behaviours much sooner than others. Nevertheless, in reality, our observation of natural systems often matches the order of acquisition assumptions more closely, as it is uncommon to know the exact timing of acquisition for each individual, but rather to have ‘snapshots’ of which individuals are likely to have acquired in which order. We also believe that in simulation studies that examine the timing of acquisition, the times at which individuals adopt a novel behaviour will be highly dependent on the model’s parameter settings and thus are also artificial measures. Therefore, meaningful measures of between-individual variation in the timing of acquisition can only be acquired empirically in systems with complete observations. How biologically meaningful differences in the order of acquisition are will ultimately depend on the behaviour and context, and the actual observed variation in adoption times. For instance, we find that under simple contagions, individuals with a higher weighted degree adopt novel behaviours sooner. Under natural settings this could mean that individuals with a higher weighted degree acquire information about e.g. a novel food source sooner than individuals with a lower weighted degree.
Whether this difference in the order of behavioural adoption is biologically meaningful will depend on the variation in time of acquisition. For instance, if the whole population acquires the same information within a few minutes, then the benefit to individuals with a higher weighted degree of finding the novel food source sooner will be smaller than if information spread takes several days or weeks. In the latter case, individuals with a lower weighted degree may miss out on a potentially highly profitable food source for several days. We mention these points in our revised discussion (lines 626-630; line numbers refer to the revised manuscript).
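For readers less familiar with order-based diffusion, the idea can be sketched as follows. This is only an illustration under stated assumptions, not the simulation code used in the manuscript (which is available at the OSF link given above): the function name, the constant asocial baseline rate of 1 for every naive individual, and the use of the simple learning rule are all choices made here for brevity.

```python
import random


def simulate_order(weights, s=5.0, asocial=1.0, seed=None):
    """Return one simulated order of acquisition on a weighted network.

    weights: dict mapping each individual to {neighbour: edge weight}.
    s:       social transmission rate per unit connection (simple rule).
    asocial: assumed constant baseline rate for every naive individual.
    Only the ORDER of events is drawn; no event times are generated.
    """
    rng = random.Random(seed)
    naive = sorted(weights)
    informed = set()
    order = []
    while naive:
        # Relative rate of each naive individual being the next to acquire.
        rates = [
            asocial + s * sum(w for j, w in weights[i].items() if j in informed)
            for i in naive
        ]
        next_learner = rng.choices(naive, weights=rates, k=1)[0]
        naive.remove(next_learner)
        informed.add(next_learner)
        order.append(next_learner)
    return order


# Hypothetical four-bird network used only to exercise the function.
example_network = {
    "A": {"B": 0.6, "C": 0.2},
    "B": {"A": 0.6, "C": 0.4},
    "C": {"A": 0.2, "B": 0.4, "D": 0.1},
    "D": {"C": 0.1},
}
example_order = simulate_order(example_network, seed=1)
print(example_order)
```

Because only relative rates matter when drawing the next learner, this kind of sketch needs no time-scale parameter, which is the sense in which an order-based approach requires fewer parameters than a time-based one.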

Evans JC, Hodgson DJ, Boogert NJ, Silk MJ. 2021. Group size and modularity interact to shape the spread of infection and information through animal societies. Behav Ecol Sociobiol. 75(12):1–14.

Another concern of the work presented here is that there are minimal checks of sensitivity to model parameterization (aside from changing the threshold number of connections in the threshold model). Parameter selection for the different forms of social learning is fairly weakly justified and not well explained so it is not clear what effect this might have on the generalizability of the results. For example, previous modeling work has suggested that the difference (in speed) in how simple contagions and contagions based on conformist learning spread through some types of networks depends on how easy the contagion can spread. This may not impact results related to the order of acquisition, but it is hard to tell if this is likely to be the case, especially given there is also an asocial learning component that has an impact on the adoption of behaviours.

We apologise for this shortcoming. In our revised manuscript, we repeated the simulations with different parameter combinations for each of the different simulations (see section on ‘Simulations’) and provide the extensive additional results in the Supplementary Material. Specifically, we altered the ‘s’ parameter which denotes the social transmission rate. We tested ‘s’ values of 1, 5, 10. We further altered the frequency dependence parameter ‘f’ in the conformity model and the threshold location parameter ‘a’ in the threshold model (f/a: 3, 5, 7). In the proportion model, the frequency dependence parameter is by definition 1. If it is >1, the model reduces to the conformity model. In the main text, we present results on the parameters s=5, f=5, a=5, and present results on smaller and larger parameters in the supplementary material (Figure S6-S15). Overall, altering these parameters did not change our general findings and conclusions. However, for instance, the probability of social spread increased with increasing social transmission rate (Figure 14) and larger threshold locations (i.e. larger values for a) decreased the probability of social spread (Figure 15). In addition, we now provide all data and R code with which results can be replicated and readers can test different parameters themselves.
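To make the role of the frequency dependence parameter concrete, the following is a minimal Python sketch, not the R code released with the manuscript. It assumes a frequency-dependent cue of the form i^f / (i^f + u^f), where i and u are the summed edge weights to informed and uninformed associates; this functional form, the function name, and the full crossing of ‘s’ and ‘f’ values are illustrative assumptions. Under this assumed form, f = 1 gives the plain proportion and f > 1 gives a conformity-like curve, in line with the statement above. The threshold rule is not sketched here.

```python
from itertools import product


def frequency_dependent_cue(informed, uninformed, f=1.0):
    """Share of weighted connections to informed birds, sharpened by f.

    Assumed form, for illustration only: i**f / (i**f + u**f).
    f = 1 is the plain proportion; f > 1 exaggerates the majority.
    """
    if informed + uninformed == 0:
        return 0.0  # an isolated bird receives no social cue
    return informed ** f / (informed ** f + uninformed ** f)


# Values reported in the response: s in {1, 5, 10}; f (or a) in {3, 5, 7}.
# Crossing them into a full grid is an assumption of this sketch.
S_VALUES = (1, 5, 10)
F_VALUES = (3, 5, 7)
parameter_grid = list(product(S_VALUES, F_VALUES))

if __name__ == "__main__":
    # A bird with three quarters of its connection weight to informed birds.
    for f in (1, 3, 5, 7):
        print(f, round(frequency_dependent_cue(3.0, 1.0, f), 3))
```

With f = 1 the cue equals the raw proportion (0.75 in this example), and larger f pushes it towards 1 when informed associates are in the majority and towards 0 when they are in the minority.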

I also found it interesting that the networks were weighted using simple ratio indices, and the reasoning for this wasn't clear. There are some important assumptions hidden in this choice that could potentially impact the results of the study related to what these indices relate to and how individuals learn. An (intentionally) naïve starting point (in my mind at least) would be that the likelihood of social learning depends on the number of times individuals associate at a food resource rather than a proportion of times that they occur together (e.g., individuals that occur together 4 out of 10 observations may well have more opportunities to learn than 1 out of 2 times). This is clearly a very direct interpretation of the simple ratio indices which ignores the potential for associations elsewhere and their "representative value". It is also based on the assumption that number of associations is the factor that drives social learning (as opposed to, e.g., associations, within a set amount of time, strength of social bond, etc.). However, it would be good to more clearly set out the reasoning for this choice and perhaps test or discuss the sensitivity to different assumptions here.

The Simple Ratio Index (SRI) is widely used in animal social network studies as a proxy for association strength between two individuals (and network position among individuals) given the factors that play into these systems, such as differences in observation (Hoppitt and Farine 2018). Much previous research has justified this as a useful measure for estimating the ‘true’ social network, as well as for relating this inferred network to the network in other contexts, and also in other social processes. So, yes, intuitively in a perfectly observed system an individual in the former case described by the reviewer (4 out of 10) may have more opportunities to learn at the feeder compared to the latter case (1 out of 2) at that particular feeder. If one had equal numbers of observations for all individuals in the group or population, measuring the association strength based simply on the number of times two individuals were seen together might also be appropriate. However, in our study and in most animal network studies, the number of observations differs between individuals – and also the researcher may be interested in not just the associations observed in that one particular sampling context but also in contexts outside of that. In these common cases, the social association between two individuals is better represented as a ratio (Farine and Whitehead 2015). Therefore, we intended to use the SRI as a representative value for the general social relationships between individuals, not necessarily to just reflect opportunities for learning at the feeder. For instance, learning could also happen away from the feeder and we thus wanted to get a generalized measure of ‘association strength’. Last, the SRI was used as a measure to infer weighted social networks in various studies on great tits showing that the weighted edges between individuals are meaningful in predicting a whole range of population processes (e.g. breeding settlement (Firth and Sheldon 2016), mating (Culina et al. 2015), and information transmission (Aplin et al. 2015)). We added a better justification for why we chose the SRI in lines 216-223.
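The contrast between counts and ratios can be illustrated with a short Python sketch. It uses the standard definition of the simple ratio index, x / (x + y_A + y_B + y_AB), where x counts sampling periods in which both birds were seen associating, y_A and y_B count periods in which only one of them was seen, and y_AB counts periods in which both were seen but apart. The function name and the way the reviewer's "4 out of 10" and "1 out of 2" cases are split into these counts are hypothetical.

```python
def simple_ratio_index(together, only_a, only_b, both_apart=0):
    """Simple ratio index for a dyad: x / (x + y_A + y_B + y_AB)."""
    denominator = together + only_a + only_b + both_apart
    if denominator == 0:
        return 0.0  # dyad never observed; treated here as no association
    return together / denominator


# Reviewer's example, with a hypothetical split of the non-shared observations.
well_sampled = simple_ratio_index(together=4, only_a=3, only_b=3)  # 4 of 10
sparse = simple_ratio_index(together=1, only_a=1, only_b=0)        # 1 of 2

if __name__ == "__main__":
    print(well_sampled, sparse)
```

The sparsely observed dyad receives the higher index despite fewer co-occurrences, which is exactly the property the reviewer asks about: the index standardises for unequal sampling effort rather than counting learning opportunities at the feeder.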

Aplin LM, Farine DR, Morand-Ferron J, Cockburn A, Thornton A, Sheldon BC. 2015. Experimentally induced innovations lead to persistent culture via conformity in wild birds. Nature. 518(7540):538–541.

Culina A, Hinde CA, Sheldon BC. 2015. Carry-over effects of the social environment on future divorce probability in a wild bird population. Proc R Soc B Biol Sci. 282(1817):20150920.

Farine DR, Whitehead H. 2015. Constructing, conducting and interpreting animal social network analysis. J Anim Ecol. 84(5):1144–1163.

Firth JA, Sheldon BC. 2016. Social carry-over effects underpin trans-seasonally linked structure in a wild bird population. Ecol Lett. 19(11):1324–1332.

Hoppitt, W. J., and Farine, D. R. (2018). Association indices for quantifying social relationships: how to deal with missing observations of individuals or groups. Animal Behaviour. 136:227-238.

Away from the research itself, one frustration when reading this paper is that it does a fairly poor job of placing itself in the context of the wider literature. First, while it does a good job of citing the relevant studies that conduct similar modelling work in animal societies, there is relatively little effort to engage with the findings of these studies in the introduction or discussion of the paper. There is a real opportunity here to unpack the results of this study in relation to similar and contrasting findings from other papers that is missed here. Different papers have focused on different aspects of how social structure and connections influence contagions in animal societies and by linking better with some of these papers it could perhaps address how the findings of this study might generalize (e.g., to different social structures, considering different "transmissibilities" of contagions, etc.). Second, there is little effort made to acknowledge or consider the large number of modelling studies that address similar questions in the broader network science literature. While the network here is empirically derived, from that point on this study is purely computational and there are studies that have addressed very similar or overlapping questions elsewhere in this literature (e.g., how the number of connections influences speed of acquisition for different forms of contagion).

We apologise for this shortcoming. In our revised manuscript, we cite and engage more with the findings of other studies and incorporate more studies from the fields of sociology and network science in the revised introduction (e.g. lines 79-82, 89-96) and discussion (e.g. lines 572-575, 640-645). We would also be happy to include any specific works that the reviewer may be referring to here.

Related to this point, it would be nice to see a more nuanced discussion about the strengths and weaknesses of computational research that applies simulation models to a single empirical case study vs. research that applies similar computational models to more generalized network structures. While there was a point when these types of model were applied to very generic network structures (random, small-world, etc.) and to an extent still are in network science (where the research aims are somewhat different), more frequently now studies that use simulated network structures do so with express biological questions in mind and design simulated networks accordingly. Taking this approach is a powerful way of tackling specific questions and/or generating a range of generalizable structures. Equally, applying these types of models to particular empirical case studies is very valuable in its own right for different reasons. Related to the previous point, I think it would be great to make the most of their complementary strengths to better integrate the lessons learned from these different approaches.

Yes, we agree that both empirical and simulation studies can provide valuable results on their own, and that treating the two as complementary will improve our understanding of how behaviours spread and how individual sociality relates to the probability of behaviour acquisition. We highlight this in lines 718-729.

This recently published paper is perhaps relevant/useful:

https://royalsocietypublishing.org/doi/full/10.1098/rspb.2022.1001

Thank you for forwarding this paper. This is indeed a very relevant study and we included it in our revised manuscript (e.g. line 741).

L20: Given the weak correlations illustrated in Figure 3, it feels slightly misleading to describe this as a "strong" relationship.

We agree and removed the word ‘strong’ in line 20.

L22-23 (and elsewhere): Discussion of this idea throughout the paper doesn't acknowledge previous work showing this outside ecology. This review contains more links to studies in network science as a useful resource https://onlinelibrary.wiley.com/doi/full/10.1111/oik.07148?saml_referrer. For example, this paper https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0020207 tackles how conformist social learning leads to this pattern.

Thank you for drawing our attention to these relevant papers. We acknowledge more previous work in sociology and network science in our revised introduction and discussion (e.g. 79-82, 134-142, 572-575, 639-645).

L48: "requiring exposure to multiple sources" isn't necessarily a difference between infection and behaviour spread in networks (e.g., see literature on dose-response curves).

Yes, we agree that this is not necessarily a difference between infection and behaviour spread. In this sentence we meant to contrast this statement with the frequent assumption of social learning, i.e. that the extent (i.e. number and duration) of social contacts to knowledgeable others predicts the likelihood of adoption, and not contrasting information transmission with disease transmission (see line 49).

L76-78: Evans et al. (2021) also consider a simple contagion and conformist learning and explore the potential implications of considering the simple contagion as being something other than the spread of infection in the discussion.

We included the relevant reference throughout the revised manuscript (e.g. line 60).

L83-85: While I think a really interesting aspect of this study is its exploration of the role of network size, I am not sure how well this criticism works (in its current form) given that even big networks can capture interaction patterns at small spatial scales, and that how network structure depends on the temporal scale will be highly study-system dependent.

We agree that large networks can also capture interaction patterns at small spatial scales.

Generating local social networks was done for numerous reasons (see response to reviewer 1 above) including to remove spatial effects between local populations from our analysis. When examining the relationship between individual sociality and the probability of adoption, generating a social network from the whole population would add considerable spatial effects. For instance, within a sub-population where a new behaviour emerges, individuals with high connectivity may be faster in adopting the behaviour. However, if examined on the population level, such a relationship may be obscured by spatial effects, because an individual’s probability of behavioural adoption will be considerably predicted by its spatial proximity to the location of behavioural emergence. Further, we aimed at testing behavioural diffusion on a wide variety of social networks, which we acquired by calculating weekly and local social networks. Creating only weekly networks across the whole population would have resulted in only 39 social networks (compared to 1343 weekly, local networks). We added a justification of why we calculated local social networks in lines 199-207 and re-wrote the section in the introduction accordingly (lines 101-103).

We agree that the extent to which social structure depends on the temporal scale will be very study-system dependent. Great tits forage in fission-fusion flocks during the winter and thus their group size and composition (and thus their social connections) frequently change across the winter. In contrast, in more stable social structures (such as those found in many primate species), social connections between individuals are likely very stable across time and are subject to little change. We clarify this in lines 101-112.

L96-100: Given the methods come last, these rules need to be more accurately described here to help with the interpretation of the results. It would also be good to provide a more in-depth exploration of previous theory/discussion about these rules beyond ecology.

In agreement with the editors, we re-structured our manuscript so that the methods section precedes the Results section. We hope this improves the clarity of our methods. In addition, we included a more in-depth discussion of previous theory and evidence for the four learning rules in lines 124-142.

L112-113: Low clustering coefficient will not always indicate a less sociable individual, this will depend considerably on the broader structure of the network. It would seem to in this study, but some greater context would be helpful here. In the results it would be helpful to quickly describe the correlation between the centrality measures used in the studied networks.

We agree that a low clustering coefficient will not always indicate a less sociable individual and rewrote accordingly across the whole revised manuscript. The three individual network metrics are moderately correlated, with weighted clustering coefficient being negatively correlated with weighted degree and weighted betweenness, and weighted degree and weighted betweenness being positively correlated (see Table S1). In addition, we provide example networks with individual great tits colour-coded based on their different weighted network metrics (Figure S1) and calculated four global network measures (network density, average path length, average edge weight, modularity) to provide a general overview of the weekly, local great tit social structures (Figure S2; details on how these metrics were calculated can be found in the figure legend).

Figure 1: Would be useful (and take up no extra space) to give the thresholds used here.

We have not used any thresholding in Figure 1. Figure 1 is purely illustrative showing the simulation processes. We now ensure this is clear from the figure caption. We speculate that the reviewer was referring to Figure 2 and we added all parameters used (including the threshold parameters) to produce the respective results in the figure legends of the revised Figure 2 and 5.

L151-154: It would be good to provide more clear information on networks per feeder site, network appearances per bird, etc., and an indication of what timespan were data included from. Also perhaps valuable to point out that the minimum of 10 (and rest of these descriptive stats) applies after some networks were excluded.

We provide more detailed information on these points in lines 176-179 and 364-373. For each location, we included on average 21.7 networks into the analysis (min=1, max=39, sd=11.9) and each individual was part of on average 16.5 networks (min=1, max=88, sd=13.2). In each of the three winters (2011-12, 2012-2013, 2013-2014) the feeders were in place from December to February and collected data on the bird visits from pre-dawn Saturday morning until after dusk on Sunday evening. Feeder locations were consistent across the three years.

Thank you for pointing this out. We clarify that the minimum of 10 only applies after our data exclusion (lines 367-368).

Figure 2: One thing I found particularly interesting in these results is that for simple, threshold and to some extent conformity there appears to be a stronger pattern of being the most peripheral is costly rather than being the most central is beneficial. Clearly, this may depend on what the information/behaviour that is spreading is related to, but it's a neat result and perhaps worthy of some discussion for what it means for social ecology/evolution in this system.

Thank you for pointing this out. This is indeed an interesting take on this result which may have important consequences for the associated costs and benefits of individual sociality. We discuss the view of this finding in lines 611-626.

L210: Is this now meant to refer to Figure 3?

We re-wrote this whole section in lines 433-457.

Figure 3/Figure 4: Even the correlations different from zero are (predominantly) small. Later in the paper it might be interesting to discuss how biologically meaningful they are in networks of different sizes in this system. One thing apparent in these results is that there is a lot of noise (presumably) related to the seeded individual. It may be worth using this to highlight that in small networks who you know is just as/more important than how many you know -- even if it is just as a suggestion for further research.

Thank you for pointing this out. Yes, correlation coefficients are generally small which might be due to multiple reasons. As noted in our manuscript (lines 411-416), this may result from general nonlinear relationships between network metrics and order of acquisition in real network structures like this. For instance, for the threshold model, the average network metric is close to 0 at first until it slightly increases and then decreases again for late adoptions (Figure 2) which may cause low correlation coefficients. Indeed, we also speculate that the starting position of the seeded behaviour (i.e. the network position of the demonstrator) and likely the underlying general social network structure (e.g. network density, modularity) cause a large variation in diffusion pathways, and thus correlation coefficients. We discuss in more detail the biological relevance of our findings in lines 626-630 and highlight that some of these aspects such as exploring the relevance of the starting position in shaping diffusion pathways would be very interesting for future work (lines 624-626).
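The point about nonlinear relationships can be illustrated with a small Python sketch. It assumes a rank-based correlation and uses made-up values for a hump-shaped pattern of the kind described for the threshold model; the type of correlation coefficient, the numbers, and the function names are illustrative assumptions, not results from the study.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the tied 1-based positions
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def rank_correlation(x, y):
    """Pearson correlation of the ranks (a Spearman-type coefficient)."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


order_of_acquisition = list(range(1, 10))
# Hypothetical metric values: low for early adopters, peaking mid-way, low again.
hump_shaped_metric = [0, 1, 2, 3, 4, 3, 2, 1, 0]
# Hypothetical monotone case: the most connected birds adopt first.
monotone_metric = [9, 8, 7, 6, 5, 4, 3, 2, 1]

hump_rho = rank_correlation(order_of_acquisition, hump_shaped_metric)
monotone_rho = rank_correlation(order_of_acquisition, monotone_metric)
```

In this toy case the hump-shaped pattern gives a coefficient of essentially zero even though the metric clearly varies with the order of acquisition, whereas the monotone pattern gives a coefficient of -1.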

Figure 4: I found distinguishing between the two blues here pretty difficult. Another colour scheme may be clearer or a different way of presenting the data given how dense the point clouds are.

We chose a different colour scheme across all figures and the revised Figure 4 now shows each learning rule in a separate plot.

L219-231: It would be good to make this aspect of the experimental design clearer earlier. I also found these results were written less clearly than other parts -- some rewriting might be helpful.

We re-wrote this section and hope it improved in clarity (lines 375-457). Further, we included information on testing different parameters already in the methods section (lines 324-330).

L248-255: Some indication of how the model broadly works such as this should ideally come at the end of the methods to help with interpreting the results in this methods-last format.

We restructured as suggested. Further, in agreement with the editors, the methods section precedes the Results section in the revised manuscript. We hope this improves the clarity of our methods and results.

L270-274: Would perhaps be helpful at the end of the introduction to set up the model or in the methods?

We included this information in the methods section on ‘Simulations’.

L362-364: Is there not a little more nuance here given that there won't necessarily be a single learning rule for each behaviour so it suggests perhaps that the importance of different learning rules varies as a behaviour spreads.

Yes, these findings suggest that the importance of the social learning rule may be most prominent during the initial stages of spread. We made this clearer in line 655. Further, we agree that one limitation of our models is that each individual adopts a seeded behaviour under the same ‘learning rule’. However, individuals will likely differ in the extent of social information use. For future work, it would be very interesting to examine within and between individual variation in different learning rules. We discuss this limitation in lines 690-703.

L378-380: Similarly to the previous point, an appropriately parameterized (dynamic) network for a large population can capture interactions at very fine spatial and temporal scale -- this doesn't seem so much a point directly related to network size. One aspect of network size that perhaps becomes interesting is that the importance of "who" you know versus how many you know changes in importance in different size networks.

Yes, we agree that networks of a whole population can also capture associations on a fine spatiotemporal scale. As mentioned above, one of our main reasons for generating local social networks was to remove spatial structure per se from our analysis, and consider social transmission within local populations. Otherwise, when examining the relationship between individual sociality and the probability of adoption, generating a social network from the whole population would add considerable spatial noise since spatial proximity to the source would impact an individual’s probability of social learning. In addition, we wanted to generate a large variety of social networks differing in structure and size (see lines 189-207). We agree that the extent to which social structure depends on the temporal scale will be study-system dependent. Great tits forage in fission-fusion flocks during the winter and thus their group size/composition, and thus their social connections, frequently change across the winter. For other species, e.g. those living in highly stable social groups, this may be different. We re-wrote to clarify in lines 103-112. Related to this point, we now highlight the importance of considering dynamic versus static networks (line 686). We also agree with, and discuss, the potential difference in the importance of ‘who’ you know versus ‘how many’ you know in relation to network size in lines 671-676.

L383: Is there a reason for using network "shape" here rather than more established terms like topology or structure?

Rephrased to ‘network structure’ in line 689.

L387-405: Good to see a sensible consideration of model limitations.

Thank you.

L449-453: Is there empirical data to support this for this study system given it has been the subject of previous experimental work? Justifying with previous empirical work would strengthen this point considerably.

In Aplin et al. (2015a, b), authors report that the population-level bias for an introduced technique (i.e. which direction to push open a door to access food at a feeder) increased daily. Further, by measuring the proportion of individuals that were observed performing each behavioural variant (push left or right) in the social group that preceded a naïve bird’s first successful solution, Aplin et al. (2015b) show that individuals were disproportionately likely to copy the behavioural variant of the majority of individuals. This finding suggests that the social connections on a very small temporal scale (e.g. the social group preceding a focal individual’s first solve) are important in determining behavioural adoption. We added a more detailed description of our reasoning in lines 194-199.

Aplin LM, Farine DR, Morand-Ferron J, Cockburn A, Thornton A, Sheldon BC. 2015a. Experimentally induced innovations lead to persistent culture via conformity in wild birds. Nature. 518(7540):538–541.

Aplin LM, Farine DR, Morand-Ferron J, Cockburn A, Thornton A, Sheldon BC. 2015b. Counting conformity: evaluating the units of information in frequency-dependent social learning. Anim Behav. 110:e5–e8.

L449-453: What biases do you introduce to the networks by only including interactions at a single feeder? How many individuals use multiple feeders?

In this work, we aimed to consider the transmission of behaviours within local populations (rather than among local populations). On a given weekend, birds on average only visit 1.3 feeder locations (min=1, max=10) and of 21036 occasions where individuals were recorded on a given weekend, individuals had visited only one location on 14888 occasions (71%). We include this information now in the Results section (lines 364-373). While the majority of birds only visited one feeder, some birds did visit more feeders and thus the local social network position will not capture all of an individual’s social connections on a given weekend. Therefore, the social network position of an individual inferred at one location may not be representative of its inferred network position at another location or across the whole population. For instance, at location X, individual A may have few connections compared to other individuals at location X (because it may have visited location X only a few times), whereas at location Y, individual A might have many connections compared to other individuals at location Y. However, in our study we were specifically interested in capturing the local diffusion pathways. In this case, we would expect that if a novel behaviour emerges at location X, individual A may have a lower probability to socially adopt the behaviour compared to others, whereas at location Y, the situation may be reversed. Therefore, we believe that even though some of the local individual social network metrics may not be representative of an individual’s overall social network position, this will not change our general findings and conclusions. Further, we expect that the overall social network structure is dependent on the spatial scale over which associations are considered. We speculate that networks inferred at only one location are for instance denser and less fragmented compared to networks generated across the whole population.

L470: I am not convinced the use of clique (in the strict network analysis definition) is correct here.

We re-wrote in lines 229-232.

L505: I would suggest making it clear earlier in this description that simulations were repeated multiple times in each network.

The methods section now precedes the Results section and thus this information comes much earlier in the main text (line 321).

L520: Is this number correct given different thresholds were used too for further simulation runs?

Thank you for pointing this out. Indeed, these numbers did not include the simulations carried out when using different threshold parameters. They only referred to simulations with one threshold parameter (the one presented across all Figures in the main text). We clarify this now in line 327.

L530-53: Would be good to provide confirmatory information on model goodness of fit checks and clarify how statistical inference was done (presumably from the full/fitted model?).

We included information on statistical inference in lines 346-347 and models can be replicated with the data and R code provided at https://osf.io/6jrhz/.

I hope my comments help improve the manuscript.

We thank the reviewer for the very constructive comments which greatly helped to improve our manuscript.

Reviewer #3 (Recommendations for the authors):

This study is a timely theoretical exploration of how variation in transmission rules interacts with variation in social phenotype to influence diffusion dynamics. The authors predict that the likelihood of an individual adopting novel behaviors should depend on the learning rule, as well as the individual's sociality. The authors explore the spread of behavior under 4 different learning rules: a simple adoption rule, a threshold rule, a proportion rule and a conformity adoption rule. They quantify the sociality of individuals by calculating their weighted degree, clustering coefficient and betweenness.

The authors find that under simple and threshold rules, high degree, high betweenness, and low clustered individuals acquired the seeded behavior earlier in simulations. Under proportional and conformity rules, there was no strong relationship between social phenotype and order of acquisition. The authors find that network size predicts the magnitude and direction of the correlation between social phenotype and order of acquisition and that this relationship also depends on the learning rules.

Strengths

1. Overall the paper is well written, and the motivation for using computational simulations is well warranted to explore this question.

2. The topic is timely, and tackles an important theoretical question of how variation in learning rules might interact with social phenotype to influence cultural diffusions. This is a difficult topic to address but is critical for improving our understanding of how diffusions might differ between populations.

3. The authors construct simulations using real social networks of great tits, rather than artificially generated networks, which is a rarity, and thus of great value, in the SL modeling literature. These networks are hard earned -- taken at a relatively fine temporal scale from weekly sampling.

4. The authors provide a thorough discussion of the implications of their results.

Thank you very much for these positive comments. We were particularly encouraged reading the reviewer’s recognition that “…These networks are hard earned…”.

Weaknesses

1. One main weakness of the paper is the lack of details given about the transmission model. The authors do not provide equations, descriptions of parameters, or a detailed schedule of the model. The descriptions they do provide are spread throughout the manuscript, making it more difficult to assess. NBDA is definitely an appropriate model for the question they want to answer, but it seems like the authors have altered some features of the model (e.g., only 1 individual can learn per timestep, 1 individual must learn per timestep). This lack of clarity makes it harder to assess the results they present. Further, from their description, it appears that they have allowed for asocial learning, which adds unnecessary noise to a study that is focused on social transmission.

We apologise for this shortcoming. In agreement with the editors, the methods section now precedes the Results section, which we hope will improve clarity. In addition, we added a more detailed description of our modelling procedure, including all relevant equations (see method section on ‘Simulations’) and test different parameters (lines 325-330; line numbers refer to the revised manuscript). Further, we now provide all data and code so that readers can replicate our simulations at https://osf.io/6jrhz/.

Regarding the altered features of the NBDA: we were specifically interested in the relationship between individual sociality and the order of acquisition, which is why only one individual could adopt the behaviour at a time. We were not necessarily interested in the time it takes to inform a given proportion of the population (which is the focus of many other studies, e.g. Evans et al. 2021) or in the exact time at which individuals adopt a behaviour. The ‘order of acquisition’ is of course somewhat artificial: in natural settings, individuals will adopt a novel behaviour at different times, with some individuals adopting the behaviour almost simultaneously and others adopting it much sooner than the rest (see also response to reviewer 2 above). Nevertheless, our observation of natural systems often matches the order-of-acquisition assumptions more closely, as it is uncommon to know the exact timing of acquisition for each individual; more often we have ‘snapshots’ of which individuals are likely to have acquired the behaviour in which order. We also believe that in simulation studies that examine the timing of acquisition, the times at which individuals adopt a novel behaviour will be highly dependent on the model’s parameter settings and thus are also artificial measures.

Our reasoning for ‘a new individual learning at each timestep’ was that we wanted to examine the relationship between individual sociality and the order of acquisition across the whole network. Without such a rule (and without allowing for asocial learning, see below), behaviours would not spread socially far (or at all) under certain circumstances (e.g. depending on the position of the demonstrator or the underlying learning rule).
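The ‘one new individual per timestep’ rule can be illustrated with a minimal sketch. It is not the code deposited at the OSF repository; it assumes a simple rule in which each naive individual's learning rate is a baseline asocial rate of 1 plus `s` times its summed association strength to informed individuals. The function name `simulate_order`, the argument `seed_index`, and the toy matrix in the usage note are hypothetical.

```python
import random


def simulate_order(adj, s=5.0, seed_index=0, rng=None):
    """Return the order in which individuals acquire the behaviour.

    adj: symmetric matrix (list of lists) of association strengths.
    s: assumed social transmission rate relative to an asocial rate of 1.
    seed_index: hypothetical choice of the first informed individual.
    Exactly one naive individual learns per timestep.
    """
    rng = rng or random.Random()
    n = len(adj)
    informed = [False] * n
    informed[seed_index] = True
    order = [seed_index]
    while len(order) < n:
        naive = [i for i in range(n) if not informed[i]]
        # Assumed rate: asocial baseline (1) plus s times the summed
        # association strength to currently informed individuals.
        rates = [
            1.0 + s * sum(adj[i][j] for j in range(n) if informed[j])
            for i in naive
        ]
        # Draw the next learner with probability proportional to its rate.
        learner = rng.choices(naive, weights=rates, k=1)[0]
        informed[learner] = True
        order.append(learner)
    return order
```

For a toy network such as `[[0, 1, 0], [1, 0, 0], [0, 0, 0]]`, the isolated third individual still appears in the returned order, because the asocial baseline keeps its rate above zero. This is why the diffusion can reach the whole network under such a rule.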

Finally, regarding asocial learning: ‘asocial learning’ here simply describes the acquisition of the behaviour by an individual that is not observed to take place due to the links within the observed social network. As such, some ‘asocial learning’ is necessary for the analysis to work at all, as without it there would be zero individuals with the behaviour to begin with, and this would persist throughout the simulation. Therefore, ‘allowing’ for asocial learning is a necessity, and the question becomes how much asocial learning should be allowed for. We believe that the reviewer is not arguing against allowing asocial learning at all, but for allowing only the very minimum amount (i.e. 1 asocial learning event per simulation per local population) and then banning asocial learning from that point onwards. However, we do not know of an animal system where such a scenario is likely to take place in natural settings, nor do we know of any previous empirical work that has dismissed the possibility of asocial learning entirely after the first asocial learning event. Therefore, we have opted to match what we believe would be relevant to the systems considered here, and continue allowing asocial learning after the first asocial learning event (indeed, this is also a major conceptual aspect of NBDA approaches generally). Nevertheless, as we now provide all the data and code to replicate our simulations, interested readers are free to modify this to their desired asocial learning levels. However, we would caution against assuming zero possibility of asocial learning after the first asocial learning event, especially as ‘asocial learning’ captures not only the chance of an individual acquiring the behaviour by itself, but also any events where the individual acquires the behaviour socially but from outside of the observed network. Scenarios where we could totally write off both of these possibilities in animal social systems are very rare.

2. Another main weakness is that the authors do not use a sensitivity analysis, and thus it is difficult to assess the relative effects of each network metric, as they are not necessarily independent of one another. For example, degree and clustering can be correlated simply as a result of how clustering is calculated. This is the downside of using real networks, as without synthetic data, there may be insufficient data to perform a sensitivity analysis. Further, the authors do not present an assessment of variation in their results, instead showing mean values within network size as evidence of their claims.

Yes, a downside of using empirical networks is that it is difficult to perform a sensitivity analysis to assess the effects of each network metric on its own. However, in our study, we did not intend to quantify the relative effect of each network metric separately and also make no claims of that sort (e.g. by suggesting that a high weighted degree increases the likelihood of behavioural adoption more than a high weighted betweenness). We aimed at choosing three metrics that are commonly used in animal social network studies, and that also describe different properties of an individual’s network position and therefore roughly represent less and more social individuals. In our social networks, weighted degree and weighted betweenness are positively correlated, whereas weighted degree and weighted clustering, and weighted betweenness and weighted clustering, are negatively correlated (Table S1). In our manuscript, we often described individuals with lower clustering and high degree and betweenness as more social individuals. We agree that this is not appropriate and have been more careful in our wording in the revised manuscript. In addition to the correlation coefficients, we added a supplementary figure (Figure S1) showing a few example social networks, colour-coded by the range of each network metric, which will hopefully help illustrate the overall correlation in network metrics. In our revised manuscript, we discuss some of the advantages and disadvantages of empirical and purely computational studies and highlight that one key strength of computational studies is the ability to examine the relative contribution of the parameters considered (e.g. by sensitivity analysis, lines 721-731).
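As a rough illustration of how correlations between individual-level metrics can be checked, the sketch below computes weighted degree as the row sums of an association matrix and a rank correlation between two metric vectors. The choice of Spearman's coefficient is an assumption made for illustration (the response does not state which coefficient underlies Table S1), and the helper names are hypothetical.

```python
def weighted_degree(adj):
    """Weighted degree: sum of each individual's association strengths."""
    return [sum(row) for row in adj]


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation (Pearson correlation of the ranks).

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5
```

Calling `spearman(weighted_degree(adj), clustering)` with a per-individual `clustering` list computed elsewhere would give one such coefficient. Because the metrics are calculated from the same network, a strong correlation does not by itself separate their individual effects.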

Showing confidence intervals in the original Figures 2 and 5 would not have benefited visualization, because the lines would have overlapped heavily. In our revised figures, however, we binned network size into different groups and now present the 95% confidence intervals alongside the mean (e.g. Figures 2 and 5). The revised Figure 3 and Figure 4 show the same measures of variation as before (violin plots and boxplots showing the 25%, median, and 75% of the data in Figure 3, and 95% confidence intervals in Figure 4).
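The binning step can be sketched as follows. This is an illustrative example only: the bin edges are hypothetical, and the interval uses a normal approximation (mean ± 1.96 standard errors), which may differ from how the intervals in the revised figures were computed.

```python
from math import sqrt
from statistics import mean, stdev


def mean_ci_by_bin(records, bins, z=1.96):
    """Summarise values per network-size bin.

    records: iterable of (network_size, value) pairs.
    bins: list of (low, high) pairs; a size falls in a bin if low <= size < high.
    Returns {(low, high): (mean, lower, upper)} using a normal approximation.
    """
    records = list(records)
    summary = {}
    for low, high in bins:
        vals = [v for size, v in records if low <= size < high]
        if len(vals) < 2:
            continue  # an interval needs at least two observations
        m = mean(vals)
        half_width = z * stdev(vals) / sqrt(len(vals))
        summary[(low, high)] = (m, m - half_width, m + half_width)
    return summary
```

Bins with fewer than two observations are skipped rather than reported with a zero-width interval, which matters when few large networks are available.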

3. Related to the interpretation of sociality, there is opportunity to increase clarity. The authors describe more social individuals as having a high degree, high betweenness, and low clustering, and less social individuals as low degree, low betweenness and high clustering. One could also imagine a bird who has high degree, low betweenness, and high clustering, being at the center of their group, but rarely going between groups. It seems harder to argue that this bird is less social than a bird with high degree and high betweenness but low clustering. The manuscript would benefit from a careful description of how different combinations of these social metrics could be interpreted.

We agree that our wording was previously unclear in this sense and we have chosen a more careful description throughout the revised manuscript. In our social networks, weighted degree and weighted betweenness are positively correlated, whereas weighted degree and weighted clustering, and weighted betweenness and weighted clustering, are negatively correlated (Table S1). In addition, we added a supplementary figure (Figure S1) showing a few example social networks, colour-coded by the range of each network metric, which will hopefully help illustrate the overall correlation in network metrics.

Point by point comments

1. It should probably be mentioned somewhere in the introduction that social learning rules apply to either the social transmission of novel behavior (e.g., Aplin et al. 2015) or the social influence of others on behavior (e.g., Pike and Laland 2010, Danchin 2018), and that you aim specifically to look at social transmission.

We now ensure it is clear throughout our introduction that we consider the social transmission of ‘novel’ behaviour (e.g. lines 32, 112, 125).

2. The initial conditions of the simulations are not well enough explained before we get to the results. I was left wondering how the authors chose the first knowledgeable agent, which isn't answered until later.

In agreement with the editors, we restructured our manuscript so that the Methods section precedes the Results section. Further, we substantially extended our Methods section with a more detailed description and provide all equations and code (https://osf.io/6jrhz/) to replicate our results. We hope this improves the clarity of our methods.

3. The methods section could use more explanation. Those who are unfamiliar with NBDA would need to refer to other publications to see the equations, especially the meaning of 's=1', etc. Also, what are the other parameters set to (λ, A)? Consider including the equations, as well as a more thorough description of parameters. The same could be said for equations describing the network metrics.

We apologise for this shortcoming. We substantially extended our Methods section with a more detailed description and provide all equations for the simulations (see section on ‘Simulations’) and code to replicate our results (https://osf.io/6jrhz/). Further, we included better definitions of all model parameters and also provide results from testing various values of the social transmission rate ‘s’, the frequency dependence parameter ‘f’, and the threshold location ‘a’. In the main text, we present results for s=5 (social transmission rate), f=5 (frequency dependence parameter), and a=5 (threshold location), and present results for smaller and larger parameter values in the supplementary material (Figures S6-S15).

4. Related to point 4, what happens when A=0, in a pure social learning environment? This would reduce stochasticity due to asocial innovations, and would provide a pure test of the effect that authors predict arises from sociality and learning rules.

Please see our first response to the reviewer’s public comment:

“Finally, regarding asocial learning: ‘asocial learning’ here simply describes the acquisition of the behaviour by an individual that is not observed to take place due to the links within the observed social network. As such, some ‘asocial learning’ is necessary for the analysis to work at all, as without it there would be zero individuals with the behaviour to begin with, and this would persist throughout the simulation. Therefore, ‘allowing’ for asocial learning is a necessity, and the question becomes how much asocial learning should be allowed for. We believe that the reviewer is not arguing against allowing asocial learning at all, but for allowing only the very minimum amount (i.e. 1 asocial learning event per simulation per local population) and then banning asocial learning from that point onwards. However, we do not know of an animal system where such a scenario is likely to take place in natural settings, nor do we know of any previous empirical work that has dismissed the possibility of asocial learning entirely after the first asocial learning event. Therefore, we have opted to match what we believe would be relevant to the systems considered here, and continue allowing asocial learning after the first asocial learning event (indeed, this is also a major conceptual aspect of NBDA approaches generally). Nevertheless, as we now provide all the data and code to replicate our simulations, interested readers are free to modify this to their desired asocial learning levels.”

5. L470 "foal" should be "focal".

Changed in line 229.

6. I actually think it's more helpful to show results when you standardize network size in the main text, and put Figure 2 in the supplement. Figure 2 is difficult to read, and something like Figure S2 is easier to interpret if you're assessing relative differences in diffusion dynamics. Also rather than presenting each network size as a color, select one network size (or a binned size) and present a variation metric (e.g., percentile intervals).

We agree that Figure S2 is easier to interpret for assessing the relative difference in diffusion dynamics. However, we were specifically interested in also presenting the effects of network size. Therefore, we decided to leave Figure 2 in the main text. However, we binned network size into different groups and now present the mean and the 95% confidence intervals as a measure of variation.

7. Suggest changing section title "Social network size and behavioral spreading" to match first section "Relationship between…". Also suggest "diffusion" rather than behavioural spreading. After reading the section, this seems more to do with how network size impacts the correlation between variables, rather than the diffusion itself. Maybe change the heading to reflect this?

We changed the heading to ‘Relationship between social network size and pathways of behavioural diffusion’ in line 433.

8. L204 – 231: Overall I think this section is fairly dense compared to the first section, and after reading it several times, I'm still not sure what I should take away from it. It looks like you have a very low N at large network sizes, which could drive some of these correlations in Figure 4. The fact that agents can asocially learn also makes it hard to interpret what these correlations mean.

We restructured this section to improve clarity in lines 433-457. Removing large networks (>=50 individuals) did not change the correlation coefficients substantially (see Figure 3).

9. L204-216: I had to read the beginning of this section several times, and it's more confusing than the first section of results. The results communicated until L210 do not relate to network size, and seem to repeat the previous section. I suggest removing this or incorporating it into the previous section and starting with how network size affected simulation dynamics.

a. Also I suggest rewriting to avoid putting the variable of interest in parentheses (e.g., L 209, 213).

We restructured this section (lines 375-457) and made changes to the variables in parentheses as suggested (lines 435-438).

b. L208: "the mean average network metric" was confusing -- do you mean "average network measure"?

We removed this sentence in the revised manuscript.

c. L210: I suggest "The direction and magnitude of the correlation between ind. Sociality and order of acquisition were predicted by network size. This relationship was modulated by transmission rule…" to improve clarity.

Changed as suggested in lines 434-435.

10. Figure 4: I find it very hard to see all 4 lines, maybe choose different colors?

We changed the colours in the revised Figure 4 and present each social learning rule in a separate plot to improve visibility.

11. L224: Which means network metric?

We are not sure what the reviewer is referring to here. We would be grateful if the reviewer could clarify.

12. L248: This information about initial conditions should come before the results.

This information was moved to the Methods section, which, in accordance with the editors, now precedes the Results section (lines 302-320).

13. L248: "Spreading simulation" -> diffusion simulation.

Changed to ‘simulation’ in line 302.

14. L253: Without the equations written out in the methods, it's difficult to assess how the learning model works. Is asocial learning turned off under obligate social learning? It's my understanding that in NBDA, the s parameter controls the relative strength of social learning per unit connection to asocial learning. In the usual formalization of NBDA, the probability of asocial learning is constant in all individuals, contra L251 which states that asocial learning only occurs in unconnected individuals. Does your model assume that individuals who are "well-connected" (also undefined in the manuscript) have the $A$ parameter set to zero? If this is the case, the authors should include a justification/definition of being well-connected.

We apologise for the lack of clarity. We now include all relevant equations and rewrote several sections of the text in line with the other reviewers’ suggestions (see revised Methods section). The statement ‘asocial learning only occurs in unconnected individuals’ is not true, but it is true that ‘only asocial learning occurs in unconnected individuals’, which is a big difference. We have ensured this is clear throughout. All individuals have the same asocial learning rate. What we measure, at each event in which a new individual adopts the behaviour, is the probability that this adoption stems from social learning under the given social learning rule. If an individual has no social connections to any informed individuals, then its probability of adopting the behaviour via social learning is 0. Since a new individual learns at every timestep in our simulations, in such a case the individual would adopt the behaviour via asocial learning.
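The quantity described here can be sketched under the same kind of simple assumption: a constant asocial rate of 1 for every individual and a social rate equal to `s` times the summed association strength to informed individuals. The function name and the toy values are hypothetical, and the learning rules in the revised manuscript may transform the social term differently.

```python
def social_learning_probability(adj, informed, i, s=5.0):
    """Probability that individual i's acquisition stems from social learning.

    adj: symmetric matrix (list of lists) of association strengths.
    informed: list of booleans marking informed individuals.
    s: assumed social transmission rate relative to an asocial rate of 1.
    """
    social_rate = s * sum(
        w for j, w in enumerate(adj[i]) if informed[j] and j != i
    )
    asocial_rate = 1.0  # identical for all individuals, connected or not
    return social_rate / (social_rate + asocial_rate)
```

With `adj = [[0, 0.5, 0], [0.5, 0, 0], [0, 0, 0]]` and only the first individual informed, the second individual has a social rate of 2.5 against an asocial rate of 1, while the unconnected third individual has a social rate of 0 and can therefore only acquire the behaviour asocially.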

15. L317: I'm still finding it hard to wrap my head around how a bird with low clustering is central and highly social. A nice way to explain/justify the differences between more and less social individuals would be to make a figure of an exemplar network, with several stereotypes highlighted, along with their social metrics.

We apologise for this lack of clarity. As mentioned above, we agree that classifying individuals with lower weighted clustering coefficients as more social is not clear, and we have revised the wording throughout the manuscript. Overall, weighted degree and weighted betweenness are positively correlated, whereas weighted degree and weighted clustering, and weighted betweenness and weighted clustering, are negatively correlated in our social networks (Table S1). In addition, we added a supplementary figure (Figure S1) showing a few example social networks, colour-coded by the range of each network metric, which will hopefully help illustrate the overall correlation in network metrics.

16. L332: Cantor et al. (2021, Proc. R. Soc. B) should probably also be cited here, as they measure the performance of recombination and subsequent diffusion.

Cited in line 585.

17. L342: Overall, this manuscript has synergy with the study "Cultural diffusion dynamics depend on behavioural production rules" (doi.org/10.1098/rspb.2022.1001), which explicitly explores the difference between acquisition and usage, and also uses NBDA as a generative model. It would be relevant to cite here.

Thank you for forwarding this relevant paper. We cited it across the revised manuscript (e.g. line 698).

18. Figure 5: If individuals have a low probability of social learning, do they have a high probability of asocial learning? Or not learning at all? Are there cases when both the probability of individual learning and social learning are low? Also, this is another case where normalizing the x axis between network sizes would be more informative. The authors might set asocial learning to 0 and simply directly measure the probability of acquisition by each naive agent at each time-step, since the manuscript is focused on social transmission rather than social transmission and asocial learning.

In our simulations, a new individual adopts the behaviour at each timestep. All individuals have the same asocial learning rate. What we measure, at each event in which a new individual adopts the behaviour, is the probability that this adoption stems from social learning under the given social learning rule. If an individual has no social connections to any informed individuals, then its probability of adopting the behaviour via social learning is 0. Since a new individual learns at every timestep in our simulations, in such a case the individual would adopt the behaviour via asocial learning.

19. Related to the interpretation of the model, the authors use the word "adopt" throughout the manuscript, although one could argue that their model is not of adoption, but of knowledge transmission, since there is no mechanism to determine whether individuals would actually use the behavior once acquiring knowledge of it. In other places, the authors have used the language of knowledge transmission (e.g., Figure 1 caption). It might be best to stick with knowledge transmission throughout the paper.

We agree that we cannot philosophically distinguish between knowledge transmission and behavioural adoption. The vast majority of empirical research on animal social learning probably refers to behavioural adoption because knowledge acquisition is difficult to measure in animals, whilst the adoption of a behaviour is observable. Many of our more complex learning rules refer to behavioural adoption rather than knowledge acquisition. For instance, under conformity learning, we expect an individual to adopt a novel behaviour only once the behaviour is performed by the majority of its social connections. In such a case, an individual may have already acquired the knowledge to perform a novel behaviour but will adopt it only once the majority of its social connections perform the behaviour. Further, it is behavioural adoption (rather than knowledge acquisition) that matters for transmitting the behaviour further along the network. Therefore, we decided to use behavioural adoption rather than knowledge transmission and revised the wording accordingly throughout the manuscript.

https://doi.org/10.7554/eLife.85703.sa2

Article and author information

Author details

  1. Kristina B Beck

    Edward Grey Institute of Field Ornithology, Department of Biology, University of Oxford, Oxford, United Kingdom
    Contribution
    Conceptualization, Formal analysis, Visualization, Methodology, Writing – original draft
    For correspondence
    kbbeck.mail@gmail.com
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-5027-0207
  2. Ben C Sheldon

    Edward Grey Institute of Field Ornithology, Department of Biology, University of Oxford, Oxford, United Kingdom
    Contribution
    Conceptualization, Supervision, Funding acquisition, Project administration, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-5240-7828
  3. Josh A Firth

    Edward Grey Institute of Field Ornithology, Department of Biology, University of Oxford, Oxford, United Kingdom
    Contribution
    Conceptualization, Supervision, Visualization, Methodology, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0001-7183-4115

Funding

Natural Environment Research Council (NE/S010335/1)

  • Ben C Sheldon

European Research Council (AdG 250164)

  • Ben C Sheldon

Biotechnology and Biological Sciences Research Council (BB/S009752/1)

  • Josh A Firth

Natural Environment Research Council (NE/V013483/1)

  • Josh A Firth

The funders had no role in study design, data collection, and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank the large number of contributors to the data collection, and we are very grateful to Will Hoppitt for all his input. The study has been supported by grants from NERC (NE/S010335/1 & NE/V013483/1), ERC (AdG 250164), and BBSRC (BB/S009752/1). We also thank three anonymous reviewers and the Editor for their comments and suggestions on this manuscript.

Ethics

All work was subject to review by the University of Oxford, Department of Zoology, Animal Welfare and Ethical Review Board (approval number: APA/1/5/ZOO/NASPA/Sheldon/TitBreedingEcology). Data collection adhered to local guidelines for the use of animals in research and all birds were caught, tagged, and ringed by appropriate BTO licence holders.

Senior Editor

  1. Christian Rutz, University of St Andrews, United Kingdom

Reviewing Editor

  1. Yuuki Y Watanabe, National Institute of Polar Research, Japan

Publication history

  1. Preprint posted: June 23, 2022
  2. Received: December 20, 2022
  3. Accepted: April 5, 2023
  4. Version of Record published: May 2, 2023 (version 1)

Copyright

© 2023, Beck et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.


Cite this article

  1. Kristina B Beck
  2. Ben C Sheldon
  3. Josh A Firth
(2023)
Social learning mechanisms shape transmission pathways through replicate local social networks of wild birds
eLife 12:e85703.
https://doi.org/10.7554/eLife.85703
