Introduction

Dispersal and range expansion go ‘hand in hand’; movement by individuals away from a population’s core is a pivotal precondition of witnessed growth in species’ geographic limits (Ronce, 2007; Chuang and Peterson, 2016). Because ‘who’ disperses—in terms of sex—varies both within and across taxa (for example, male-biased dispersal is dominant among fish and mammals, whereas female-biased dispersal is dominant among birds; see Table 1 in Trochet et al., 2016), skewed sex ratios are apt to arise at expanding range fronts, and, in turn, differentially drive invasion dynamics (Miller et al., 2011). Female-biased dispersal, for instance, can ‘speed up’ staged invertebrate invasions by increasing offspring production (Miller and Inouye, 2013). Alongside sex-biased dispersal, learning is also argued to contribute to species’ colonisation capacity, as novel environments inevitably present novel (foraging, predation, shelter, and social) challenges that newcomers need to surmount in order to settle successfully (Wright et al., 2010; Sol et al., 2013; Barrett et al., 2019).

Indeed, a growing number of studies show support for this supposition (as recently reviewed in Lee and Thornton, 2021). Carefully controlled choice tests, for example, show urban-dwelling individuals—that is, the invaders—will learn novel stimulus-reward pairings more readily than do rural-dwelling counterparts, supporting the idea that urban invasion selects for learning phenotypes at the dispersal and/or settlement stage(s) (Batabyal and Thaker, 2019). Given the independent influence of sex-biased dispersal and learning on range expansion, it is perhaps surprising, then, that their potential interactive influence on movement ecology remains unexamined empirically (but not theoretically: Liedtke and Fromhage, 2021a,b), particularly in light of concerns over (in)vertebrates’ resilience to ever-increasing urbanisation (Eisenhauer and Hines, 2021; Li et al., 2022).

Great-tailed grackles (Quiscalus mexicanus; henceforth, grackles) are an excellent model for empirical examination of the interplay between sex-biased dispersal, learning, and ongoing urban-targeted rapid range expansion: over the past ∼150 years, they have seemingly shifted their historically urban niche to include more variable urban environments (e.g., arid habitat; Summers et al., 2023), moving from their native range in Central America into much of the United States, with several first-sightings spanning as far north as Canada (Dinsmore and Dinsmore, 1993; Wehtje, 2003; Fink et al., 2020). Notably, the record of this urban invasion is heavily peppered with first-sightings involving a single or multiple male(s) (41 of 63 recorded cases spanning most of the twentieth century; Dinsmore and Dinsmore, 1993). Moreover, recent genetic data show, when comparing grackles within a population, average relatedness: (i) is higher among females than among males; and (ii) decreases with increasing geographic distance among females; but (iii) is unrelated to geographic distance among males; hence confirming that urban invasion in grackles is male-led via sex-biased dispersal (Sevchik et al., 2022). Considering these life history and genetic data in conjunction with data on grackle wildlife management efforts (e.g., pesticides, pyrotechnics, and sonic booms; Luscier, 2018), it seems plausible that urban invasion might drive differential learning between male and female grackles, potentially resulting in a spatial sorting of the magnitude of this sex difference with respect to population establishment age (i.e., sex-effect: newer population > older population; Phillips et al., 2010). In range-expanding western bluebirds (Sialia mexicana), for example, more aggressive males disperse towards the invasion front; however, in as little as three years, the sons of these colonisers show reduced aggression as the invasion front moves on (Duckworth and Badyaev, 2007). Whether sex-biased dispersal and learning similarly interact in urban-invading grackles remains an open and timely question.

Here, for the first time (to our knowledge), we examine whether, and, if so, how sex mediates learning across 32 male and 17 female wild-caught, temporarily-captive grackles either inhabiting a core (17 males, 5 females), middle (4 males, 4 females) or edge (11 males, 8 females) population of their North American range (based on year of first breeding: 1951, 1996, and 2004, respectively; details in Materials and methods; Figure 1). Collating, cleaning, and curating existing reinforcement learning data (Logan, 2016c; Logan et al., 2022a, 2023b)—wherein novel stimulus-reward pairings are presented (i.e., initial learning), and, once successfully learned, these reward contingencies are reversed (i.e., reversal learning)—we test the hypothesis that sex differences in learning are related to sex differences in dispersal. As range expansion should disfavour slow, error-prone learning strategies, we expect male and female grackles to differ across at least two reinforcement learning behaviours: speed and choice-option switches. Specifically, as documented in our preregistration (see Supplementary file 1), if learning and dispersal relate, we expect male—versus female—grackles: (predictions 1 and 2) to be faster to, firstly, learn a novel colour-reward pairing, and secondly, reverse their colour preference when the colour-reward pairing is swapped; and (prediction 3) to make fewer choice-option switches during their colour-reward learning. Finally, we further expect (prediction 4) such sex-mediated differences in learning to be more pronounced in grackles living at the edge, rather than the intermediate and/or core, region of their range.

Participants and experimental protocol. Thirty-two male and 17 female wild-caught, temporarily-captive great-tailed grackles either inhabiting a core (17 males, 5 females), middle (4 males, 4 females) or edge (11 males, 8 females) population of their North American breeding range (establishment year: 1951, 1996, and 2004, respectively) are participants in the current study (grackle images: Wikimedia Commons). Each grackle is individually tested on a two-phase reinforcement learning paradigm: initial learning, in which two colour-distinct tubes are presented, but only one coloured tube (e.g., dark grey) contains a food reward (F+ versus F-); and reversal learning, in which the stimulus-reward tube-pairings are swapped. The learning criterion is identical in both learning phases: 17 F+ choices out of the last 20 choices, with trial 17 being the earliest a grackle can successfully finish (for details, see Materials and methods).

To comprehensively examine links between sex-biased dispersal and learning in urban-invading grackles, we employ a combination of Bayesian computational and cognitive modelling methods, and both agent-based and evolutionary simulation techniques. Specifically, our paper proceeds as follows: (i) we begin by describing grackles’ reinforcement learning and testing our predictions using multi-level Bayesian Poisson models; (ii) we next ‘unblackbox’ candidate learning mechanisms generating grackles’ reinforcement learning, using a multi-level Bayesian reinforcement learning model; (iii) we then try to replicate our behavioural data via agent-based forward simulations, to determine if our detected learning mechanisms underpin our grackles’ reinforcement learning; (iv) and we conclude by examining the evolutionary implications of variation in these learning mechanisms under urban-like (or not) settings via algorithmic optimisation.

Results

Reinforcement learning behaviour

We observe robust reinforcement learning dynamics across populations (full between- and across-population model outputs in Supplementary file 2). As such, we compare male and female grackles’ reinforcement learning across populations. Both sexes start out as similar learners, finishing initial learning in comparable trial numbers (median trials-to-finish: males, 32; females, 35; Figure 2A and Supplementary file 2a), and with comparable counts of choice-option switches (median switches-at-finish: males, 10.5; females, 15; Figure 2B and Supplementary file 2b). Indeed, the male-female (M-F) posterior contrasts for both behaviours centre around zero, evidencing no sex-effect (Figure 2C). Once reward contingencies reverse, however, male—versus female—grackles finish this ‘relearning’ faster: they take fewer trials (median trials-to-finish: males, 64; females, 81; Figure 2D and Supplementary file 2a), and make fewer choice-option switches (median switches-at-finish: males, 25; females, 35; Figure 2E and Supplementary file 2b). The M-F posterior contrasts, which lie almost entirely below zero, clearly capture this sex-effect (Figure 2F and Supplementary file 2a and b). Environmental unpredictability, then, dependably directs disparate reinforcement learning trajectories between male and female grackles (faster versus slower finishers, respectively), supporting our overall expectation of sex-mediated differential learning in urban-invading grackles.

Reinforcement learning mechanisms

Because (dis)similar behaviour can result from multiple latent processes (McElreath, 2018), we next employ computational methods to delimit reinforcement learning mechanisms. Specifically, we adapt a multi-level Bayesian reinforcement learning model (from Deffner et al., 2020), which we validate a priori via agent-based simulation (see Materials and methods and Supplementary file 1), to estimate the contribution of two core latent learning parameters to grackles’ reinforcement learning: the information-updating rate φ (How rapidly do learners ‘revise’ knowledge?) and the risk-sensitivity rate λ (How strongly do learners ‘weight’ knowledge?). Both learning parameters capture individual-level internal response to incurred reward-payoffs (full mathematical details in Materials and methods). Specifically, as φ approaches 1 (from 0), information-updating increases; as λ approaches ∞ (from 0), risk-sensitivity strengthens. In other words, by formulating our scientific model as a statistical model, we can reverse engineer which values of our learning parameters most likely produce grackles’ choice behaviour—an analytical advantage over less mechanistic methods (McElreath, 2018).

Looking at our reinforcement learning model’s estimates between populations to determine replicability, we observe: in initial learning, the information-updating rate φ of core- and edge-inhabiting male grackles is largely lower than that of female counterparts (M-F posterior contrasts lie more below zero; Figure 2G and Supplementary file 2c), with smaller sample size likely explaining the middle population’s more uncertain estimates (M-F posterior contrasts centre widely around zero; Figure 2G and Supplementary file 2c); while in reversal learning, the information-updating rate φ of both sexes is nearly identical irrespective of population membership, with females dropping to the reduced level of males (M-F posterior contrasts centre closely around zero; Figure 2H and Supplementary file 2c). Therefore, the information-updating rate φ across male and female grackles is initially different (males < females), but converges downwards over reinforcement learning phases (across-population M-F posterior contrasts lie mostly below, and then, tightly bound zero; Figure 2G and H and Supplementary file 2c).

These primary mechanistic findings are, at first glance, perplexing: if male grackles generally outperform female grackles in reversal learning (Figure 2D-F), why do all grackles ultimately update information at matched, dampened pace? This apparent conundrum, however, in fact highlights the potential for multiple latent processes to direct behaviour. Case in point: the risk-sensitivity rate λ is distinctly higher in male grackles, compared to female counterparts, regardless of population membership and learning phase (M-F posterior contrasts lie more, if not mostly, above zero; Figure 2I and J and Supplementary file 2d), outwith the middle population in initial learning likely due to sample size (M-F posterior contrasts centre broadly around zero; Figure 2I and Supplementary file 2d). In other words, choice behaviour in male grackles is more strongly governed by relative differences in predicted reward-payoffs, as spotlighted by across-population M-F posterior contrasts that lie almost entirely above zero (Figure 2I and J and Supplementary file 2d). Thus, these combined mechanistic data reveal, when reward contingencies reverse, male— versus female—grackles ‘relearn’ faster via pronounced reward-payoff sensitivity, a persistence-based risk-sensitive learning strategy.

Grackle reinforcement learning. Behaviour. Across-population learning speed and choice-option switches in (A-B) initial (M, 32; F, 17) and (D-E) reversal learning (M, 29; F, 17), with (C,F) respective posterior estimates and M-F contrasts. Mechanisms. Within- and across-population estimates and contrasts of information-updating rate φ and risk-sensitivity rate λ in (G,I) initial and (H,J) reversal learning. In (G-J) open circles show 100 random posterior draws; red filled circles and vertical lines show posterior means and 89% HPDI, respectively. Simulations. Learning speed and choice-option switches by: 10,000 full posterior-informed ‘birds’ (n = 5,000 per sex) in (K-L) initial and (N-O) reversal learning; and six average posterior-informed ‘birds’ (n = 3 per sex) in (M) initial and (P) reversal learning. In (K,N) the full simulation sample is plotted; in (L,O) open circles show 100 random simulant draws. Note (K,N) x-axes are cut to match (A,D) x-axes. Medians are plotted/labelled in (A,B,D,E,K,L,N,O).

Figure 2—figure supplement 1. Excluding extra learning trials.

Agent-based simulations and replication of reinforcement learning

To determine definitively whether our learning parameters are sufficient to generate grackles’ observed reinforcement learning, we conduct agent-based forward simulations; that is, we simulate ‘birds’ informed by the grackles in our data set. Specifically, whilst maintaining the correlation structure among learning parameters, we randomly assign 5000 ‘males’ and 5000 ‘females’ information-updating rate φ and risk-sensitivity rate λ estimates from the full across-population posterior distribution of our reinforcement learning model, and we track synthetic reinforcement learning trajectories. By comparing these synthetic data to our real data, we gain valuable insight into the learning and choice behaviour implied by our reinforcement learning model results. Specifically, a close mapping between the two data sets would indicate our information-updating rate φ and risk-sensitivity rate λ estimates can account for our grackles’ differential reinforcement learning; whereas a poor mapping would indicate some important mechanism(s) are missing (e.g., Deffner et al., 2020).

Ten thousand synthetic reinforcement learning trajectories, together, compellingly show our ‘birds’ behave just like our grackles: ‘males’ outpace ‘females’ in reversal but not in initial learning (median trials-to-finish initial and reversal learning: ‘males’, 31 and 62; ‘females’, 32 and 79; respectively; Figure 2K and N); and ‘males’ make fewer choice-option switches in reversal but not in initial learning, compared to ‘females’ (median switches-at-finish in initial and reversal learning: ‘males’, 11 and 20; ‘females’, 11 and 29; respectively; Figure 2L and O). Figure 2M and P show, respectively, synthetic initial and reversal learning trajectories by three average ‘males’ and three average ‘females’ (i.e., simulants informed via learning parameter estimates that average over our posterior distribution), for the reader interested in representative individual-level reinforcement learning dynamics. Such quantitative replication confirms that our reinforcement learning model results can account for our behavioural sex-difference data.

Selection and benefit of reinforcement learning mechanisms under urban-like environments

Learning mechanisms in grackles obviously did not evolve to be successful in the current study; instead, they likely reflect selection pressures and/or adaptive phenotypic plasticity imposed by urban environments (Blackburn et al., 2009; Sol et al., 2013; Lee and Thornton, 2021; Vinton et al., 2022; Caspi et al., 2022). Applying an evolutionary algorithm model (Figure 3A), we conclude by examining how urban environments might favour different information-updating rate φ and risk-sensitivity rate λ values, by estimating optimal learning strategies in settings that differ along two key ecological axes: environmental stability u (How often does optimal behaviour change?) and environmental stochasticity s (How often does optimal behaviour fail to pay off?). Urban environments are generally characterised as both stable (lower u) and stochastic (higher s): more specifically, urbanisation routinely leads to stabilised biotic structure, including predation pressure, thermal habitat, and resource availability, and to enhanced abiotic disruption, such as anthropogenic noise and light pollution (reviews in Shochat et al., 2006; Francis and Barber, 2013; Gaston et al., 2013). Seasonal survey data from (sub)urban British neighbourhoods show, for example, 40-75% of households provide supplemental feeding resources for birds (e.g., seed, bread, and peanuts; Cowie and Hinsley, 1988; Davies et al., 2009), the density of which can positively relate to avian abundance within an urban area (Fuller et al., 2008). But such supplemental feeding opportunities are necessarily traded off against increased vigilance due to unpredictable predator-like anthropogenic disturbances (e.g., automobile and airplane traffic; as outlined in Frid and Dill, 2002).

Strikingly, under characteristically urban-like (i.e., stable but stochastic) conditions, our evolutionary model shows the learning parameter constellation robustly exhibited by male grackles in our study—that is, low information-updating rate φ and high risk-sensitivity rate λ—should be favoured by natural selection (darker and lighter squares in the left and right plots of Figure 3B, respectively). These results imply, in urban and other statistically similar environments, learners benefit by averaging over prior experience (i.e., gradually updating ‘beliefs’), and by informing behaviour based on this experiential history (i.e., proceeding with ‘caution’), highlighting the adaptive value of strategising risk-sensitive learning in urban-like environments.

Evolutionary optimality of strategising risk-sensitive learning. (A) Illustration of our evolutionary algorithm model to estimate optimal learning parameters that evolve under systematically varied pairings of two key (urban) ecology axes: environmental stability u and environmental stochasticity s. Specifically, 300-member populations run for 10 independent 7000-generation simulations per pairing, using ‘roulette wheel’ selection (parents are chosen for reproduction with a probability proportional to collected F+ rewards out of 1000 choices) and random mutation (offspring inherit learning genotypes with a small deviation in random direction). (B) Mean optimal learning parameter values discovered by our evolutionary model (averaged over the last 5000 generations). As the statistical environment becomes more urban-like (lower u and higher s values), selection should favour lower information-updating rate φ and higher risk-sensitivity rate λ (darker and lighter squares in the left and right plot, respectively). We note arrows are intended as illustrative aids and do not correspond to a linear scale of ‘urbanness’.

Discussion

Mapping a full pathway from behaviour to mechanisms through to selection and adaptation, we show risk-sensitive learning is a viable strategy to help explain how male grackles—the dispersing sex—currently lead their species’ remarkable North American urban invasion. Specifically, in wild-caught, temporarily-captive core-, middle- or edge-range grackles, we show: (i) irrespective of population membership, male grackles outperform female counterparts on stimulus-reward reversal reinforcement learning, finishing faster and making fewer choice-option switches; (ii) they achieve their speedier reversal learning performance via pronounced reward-payoff sensitivity (low φ and high λ), as ‘unblackboxed’ by our mechanistic model; (iii) these learning mechanisms are sufficient to explain our sex-difference behavioural data, as we replicate our results using agent-based forward simulations; and (iv) risk-sensitive learning—i.e., low φ and high λ—appears advantageous in characteristically urban-like environments (stable but stochastic settings), according to our evolutionary model. These results set the scene for future comparative research.

The term ‘behavioural flexibility’—broadly defined as some ‘attribute’, ‘cognition’, ‘characteristic’, ‘feature’, ‘trait’ and/or ‘quality’ that enables animals to adapt behaviour to changing circumstances (Coppens et al., 2010; Audet and Lefebvre, 2017; Barrett et al., 2019; Lea et al., 2020)—has previously been hypothesised to explain invasion success (Wright et al., 2010), including that of grackles (Summers et al., 2023). But as eloquently argued elsewhere (Audet and Lefebvre, 2017), this term is conceptually uninformative, given the many ways in which it is applied and assessed. Of these approaches, reversal learning and serial—multiple back-to-back—reversal learning tasks are the most common experimental assays of behavioural flexibility (non-exhaustive examples of each assay in bees: Strang and Sherry, 2014; Raine and Chittka, 2012; birds: Bond et al., 2007; Morand-Ferron et al., 2022; fish: Lucon-Xiccato and Bisazza, 2014; Bensky and Bell, 2020; frogs: Liu et al., 2016; Burmeister, 2022; reptiles: Batabyal and Thaker, 2019; Gaalema, 2011; primates: Cantwell et al., 2022; Lacreuse et al., 2018; and rodents: Rochais et al., 2021; Boulougouris et al., 2007). We have shown, however, at least for our grackles, that faster reversal learning is governed primarily by pronounced reward-payoff sensitivity, so: firstly, these go-to experimental assays do not necessarily measure the unit they claim to measure (a point similarly highlighted in Aljadeff and Lotem, 2021); and secondly, formal models based on the false premise that variation in learning speed relates to variation in behavioural flexibility require reassessment (Lea et al., 2020; Blaisdell et al., 2021; Logan et al., 2022b; Lukas et al., 2023; Logan et al., 2023a,c). Heeding previous calls (Dukas, 1998; McNamara and Houston, 2009; Fawcett et al., 2013), our study provides an analytical solution to facilitate productive research on proximate and ultimate explanations of seemingly flexible (or not) behaviour: we publicly provide step-by-step code to examine individual decision making, two core underlying learning mechanisms, and their theoretical selection and benefit (see https://github.com/alexisbreen/Sex-differences-in-grackles-learning), which can be tailored to specific research questions. The reinforcement learning model, for example, generalises, in theory, to a variety of choice-option paradigms (Barrett, 2022), and these learning models can be extended to estimate asocial and social influence on individual decision making (e.g., McElreath et al., 2005; Aplin et al., 2017; Barrett et al., 2017; Deffner et al., 2020; Chimento et al., 2022), facilitating insight into the multi-faceted feedback process between individual cognition and social systems (Trump et al., 2023). Our open-access analytical resource thus allows researchers to dispense with the umbrella term behavioural flexibility, and to biologically inform and interpret their science—only then can we begin to meaningfully examine the functional basis of behavioural variation across taxa and/or contexts.

Ideas and speculation

Related to this final point, it is useful to outline how additional drivers outwith sex-biased risk-sensitive learning might also contribute towards urban invasion success in grackles. Grackles exhibit a polygynous mating system, with territorial males attracting multiple female nesters (Johnson et al., 2000). Recent learning ‘style’ simulations show the sex with high reproductive skew approaches pure individual learning, while the other sex approaches pure social learning (Smolla et al., 2019). During population establishment, then, later-arriving female grackles could rely heavily on vetted information provided by male grackles on ‘what to do’ (Wright et al., 2010), as both sexes ultimately face the same urban-related challenges. Moreover, risk-sensitive learning in male grackles should help reduce the elevated risk associated with any skew towards acquiring knowledge through individual learning. And, as males are the dispersing sex, this process would operate independently of their proximity to a range front—a pattern suggestively supported by our mechanistic data (i.e., risk-sensitivity: males > females; Figure 2I and J). As such, future research on potential sex differences in social learning propensity in grackles seems particularly prudent, alongside systematic surveying of population-level environmental and fitness components across spatially (dis)similar populations; for this, our annotated and readily available analytical approach should prove useful, as highlighted above.

The lack of spatial replicates in the existing data set used herein inherently poses limitations on inference. But it is worth noting that phenotypic filtering by invasion stage is not a compulsory signature of successful (urban) invasion; instead, phenotypic plasticity and/or inherent species trait(s) may be facilitators (Blackburn et al., 2009; Sol et al., 2013; Lee and Thornton, 2021; Vinton et al., 2022; Caspi et al., 2022). For urban-invading grackles, both of these biological explanations seem strongly plausible, given: firstly, grackles’ highly plastic foraging and nesting ecology (Selander and Giller, 1961; Davis and Arnold, 1972; Wehtje, 2003); secondly, grackles’ apparent historic and current realised niche being—albeit in present day more variable—urban environments, a consistent habitat preference that cannot be explained by changes in habitat availability or habitat connectivity (Summers et al., 2023); and finally, our combined behavioural, mechanistic, and evolutionary modelling results showing environments approaching grackles’ general species niche—urban environments—select for particular traits that exist across grackle populations (here, sex-biased risk-sensitive learning). Admittedly, our evolutionary model is not a complete representation of urban ecology dynamics. Relevant factors—e.g., spatial dynamics and realistic life histories—are left out. These omissions are tactical ones. Our evolutionary model solely focuses on the response of reinforcement learning parameters to two core urban-like (or not) environmental statistics, providing a baseline for future study to build on; for example, it would be interesting to investigate such selection on learning parameters of ‘true’ invaders and not their descendants, a logistically tricky but nonetheless feasible research possibility (e.g., Duckworth and Badyaev, 2007).

Conclusions

Our study reveals robust interactive links between the dispersing sex and risk-sensitive learning in an urban invader (grackles); these fully replicable insights, coupled with our finding that urban-like environments favour pronounced risk-sensitivity, imply risk-sensitive learning is a winning strategy for urban-invasion leaders. Our modelling methods, which we document in-depth and make freely available, can now be comparatively applied, establishing a biologically meaningful analytical approach for much-needed study on (shared or divergent) drivers of geographic and phenotypic distributions (Somveille et al., 2018; Bro-Jørgensen et al., 2019; Lee and Thornton, 2021; Breen et al., 2021; Breen, 2021; Deffner et al., 2022).

Materials and methods

Data provenance

The current study uses data from two types of sources: publicly archived data at the Knowledge Network for Biocomplexity (Logan, 2016c; Logan et al., 2022a); or private advance access, granted to A.J.B., to (at the time) unpublished data by Grackle Project principal investigator Corina Logan, who declined participation in this study. We note these shared data are now also available at the Knowledge Network for Biocomplexity (Logan et al., 2023b).

Data contents

The data used herein chart colour-reward reinforcement learning performance from 32 male and 17 female wild-caught, temporarily-captive grackles inhabiting one of three study sites that differ in their range-expansion demographics; that is, defined as a core, middle or edge population (based on time-since-settlement population growth dynamics, as outlined in Chuang and Peterson, 2016). Specifically: (i) Tempe, Arizona (17 males and 5 females)—herein, the core population (estimated to be breeding since 1951, by adding the average time between first sighting and first breeding to the year first sighted; Wehtje, 2003, 2004); (ii) Santa Barbara, California (4 males and 4 females)—herein, the middle population (known to be breeding since 1996; Lehman, 2020); and (iii) Greater Sacramento, California (11 males and 8 females)—herein, the edge population (known to be breeding since 2004; Hampton, 2004).

Experimental protocol

Below we detail the protocol for the colour-reward reinforcement learning test that we analysed herein.

Reinforcement learning test

The reinforcement learning test consists of two experimental phases (Figure 1): (i) stimulus-reward initial learning and (ii) stimulus-reward reversal learning. In both experimental phases, two different coloured tubes are used: for Santa Barbara grackles, gold and grey; for all other grackles, light and dark grey. Each tube consists of an outer and inner diameter of 26 mm and 19 mm, respectively; and each is mounted to two pieces of plywood attached at a right angle (entire apparatus: 50 mm wide × 50 mm tall × 67 mm deep); thus resulting in only one end of each coloured tube being accessible (Figure 1).

In initial learning, grackles are required to learn that only one of the two coloured tubes contains a food reward (e.g., dark grey); this colour-reward pairing is counterbalanced across grackles within each study site. Specifically, the rewarded and unrewarded coloured tubes are placed—either on a table or on the floor—in the centre of the aviary run (distance apart: table, 2 feet; floor, 3 feet), with the open tube-ends facing, and perpendicular to, their respective aviary side-wall. Which coloured tube is placed on which side of the aviary run (left or right) is pseudorandomised across trials. A trial begins at tube-placement, and ends when a grackle has either made a tube-choice or the maximum trial time has elapsed (eight minutes). A tube-choice is defined as a grackle bending down to examine the contents (or lack thereof) of a tube. If the chosen tube contains food, the grackle is allowed to retrieve and eat the food, before both tubes are removed and the rewarded coloured tube is rebaited out of sight (for the grackle). If a chosen tube does not contain food, both tubes are immediately removed. Each grackle is given, first, up to three minutes to make a tube-choice, after which a piece of food is placed equidistant between the tubes to entice participation; and then, if no choice has been made, an additional five minutes maximum, before both tubes are removed. All trials are recorded as either correct (choosing the rewarded coloured tube), incorrect (choosing the unrewarded coloured tube), or incomplete (no choice made). To successfully finish initial learning, a grackle must meet the learning criterion, detailed below.

In reversal learning, grackles are required to learn that the colour-reward pairing has been swapped; that is, the previously unrewarded coloured tube (e.g., light grey) now contains a food reward (Figure 1). The protocol for this second and final experimental phase is identical to that of initial learning, described above.

Reinforcement learning criterion

For all grackles in the current study, we apply the following learning criterion: to successfully finish their respective learning phase, grackles must make a correct choice in 17 of the most recent 20 trials. Therefore, the earliest a grackle can successfully finish initial or reversal learning in the current study is at trial 17. This learning criterion is the most compatible with the previous learning criteria used by the original experimenters. Specifically, Logan (Logan, 2016c) and Logan et al. (Logan et al., 2022a) used a fixed-window learning criterion for core- and middle-population grackles, in which grackles were required to make 17 out of the last 20 choices correctly, with a minimum of eight and nine correct choices across the last two sets of 10 trials, assessed at the end of each set. If a core- or middle-population grackle successfully satisfied the fixed-window learning criterion, the grackle was assigned by Logan or colleagues the final trial number for that set (e.g., 20, 30, 40), which is problematic because this trial did not always coincide with the true passing trial (by a maximum of two additional trials; see below).

For edge-population grackles, Logan and colleagues (Logan et al., 2023b) used a sliding-window learning criterion, in which grackles were required to again make 17 out of the last 20 choices correctly, with the same minimum correct-choice counts for the previous two 10-trial sets, except that this criterion was assessed at every trial (from 20 onward) rather than at the end of discrete sets. This second method is also problematic because a grackle can successfully reach criterion via a shift in the sliding window before making a choice. For example, a grackle could make three wrong choices followed by 17 correct choices (i.e., 7/10 correct and 10/10 correct in the last two sets of 10 trials), and at the start of the next trial, the grackle will reach criterion because the summed choices now consist of 8/10 correct and at least 9/10 correct in the last two sets of 10 trials no matter their subsequent choice—see initial learning performance by bird ‘Kel’ for a real example (row 1816 in https://github.com/alexisbreen/Sex-differences-in-grackles-learning; as well as in Logan et al., 2023b). Moreover, the use of different learning criteria (fixed- and sliding-window) by Logan and colleagues in different populations represents a confound when populations are compared. Thus, our applied 17/20 learning criterion ensures our assessment of grackles’ reinforcement learning is informative, straightforward, and consistent.

As a consequence of applying our 17/20 learning criterion, grackles can remain in initial and/or reversal learning beyond reaching criterion. These extra learning trials, however, already exist for some core- and middle-population grackles originally assessed via the fixed-window learning criterion (N = 18 in initial [range: 1-2 extra trials]; N = 13 in reversal [range: 1-2 extra trials]), as explained above. And our cleaning of the original data (see our Data_Processing.R script at https://github.com/alexisbreen/Sex-differences-in-grackles-learning) detected additional cases where grackles remained in-test despite meeting the applied criterion (fixed-window: N = 1 in reversal for 11 extra trials; sliding-window: N = 11 in initial [range: 1-10 extra trials]; N = 7 in reversal [range: 1-14 extra trials]), presumably due to experimenter oversight. Similarly, our data cleaning detected four birds in the core population that did not in fact meet the fixed-window learning criterion because of incorrect trial numbers entered by the original experimenters (e.g., skipping trial 24). Moreover, our data cleaning detected two birds in the middle population that were passed by the original experimenters despite not meeting the assigned fixed-window learning criterion; instead, both made 7/10 and 10/10 correct choices in the last two sets of 10 trials. We note these data issues, as well as the problematic nature of both the fixed- and sliding-window learning criterion, continue to be unaddressed in work by Logan (et al.) (Logan, 2016b,a; Logan et al., 2022b, 2023a,c). In any case, in our study we: (i) verified that our 17/20 learning criterion results in a similar proportion of male and female grackles experiencing extra initial learning trials (females, 15/17; males, 30/32); and (ii) confirmed that our learning parameter estimates during initial learning remain relatively unchanged irrespective of whether we exclude or include extra initial learning trials (Figure 2—figure supplement 1). Thus, we are confident that any carry-over effect of extra initial learning trials on grackles’ reversal learning in our study is negligible if not nonexistent, and we therefore excluded extra learning trials.
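For illustration, the applied criterion can be assessed at every trial with a simple rolling count; the following minimal R sketch (using a hypothetical binary choice vector, and not taken from our processing script) shows the logic:

```r
# Minimal sketch: find the earliest trial at which a grackle meets the
# 17-correct-of-the-most-recent-20 criterion, assessed at every trial.
# 'choices' is a hypothetical binary vector (1 = correct, 0 = incorrect).
finish_trial <- function(choices, n_correct = 17, window = 20) {
  for (t in seq_along(choices)) {
    recent <- choices[max(1, t - window + 1):t]  # at most the 20 most recent choices
    if (sum(recent) >= n_correct) return(t)      # criterion met at trial t (earliest: trial 17)
  }
  NA_integer_                                    # criterion never met
}

# Example: 3 incorrect choices followed by 17 correct choices meet the criterion at trial 20
finish_trial(c(rep(0, 3), rep(1, 17)))
```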

Statistical analyses

We analysed, processed, and visually presented our data using, respectively, the ‘rstan’ (Stan Development Team, 2020), ‘rethinking’ (McElreath, 2018), and ‘tidyverse’ (Wickham et al., 2019) packages in R (R Core Team, 2021). We note our reproducible code is available at https://github.com/alexisbreen/Sex-differences-in-grackles-learning. We further note our reinforcement learning model, defined below, does not exclude cases—two males in the core, and one male in the middle population—where a grackle was dropped (due to time constraints) early on from reversal learning by the original experimenters, because individual-level φ and λ estimates can still be obtained irrespective of trial number; the certainty around the estimates will simply be wider (McElreath, 2018). Our Poisson models, however, do exclude these three cases for our modelling of reversal learning, to keep estimation conservative. The full output from each of our models, which use weakly informative and conservative priors, is available in Supplementary file 2, including posterior means and 89% highest posterior density intervals (HPDI) (McElreath, 2018).

Poisson models

For our behavioural assay of reinforcement learning finishing trajectories, we used a multi-level Bayesian Poisson regression to quantify the effect(s) of sex and learning phase (initial versus reversal) on grackles’ recorded number of trials to successfully finish each phase. This model was performed at both the population and across-population level, and accounted for individual differences among birds through the inclusion of individual-specific varying (i.e., random) effects.

For our behavioural assay of reinforcement learning choice-option switching, we used an identical Poisson model to that described above, to predict the total number of switches between the rewarded and unrewarded coloured tubes.
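For orientation, a minimal sketch of such a model in the ‘rethinking’ package might look as follows; the data, column names, index coding, and priors are illustrative placeholders rather than our exact specification (which is available in our repository):

```r
library(rethinking)

# Hypothetical data: one row per bird and learning phase (illustrative values only)
d <- list(
  trials    = c(32, 64, 35, 81),    # trials-to-finish
  sex_phase = c(3L, 4L, 1L, 2L),    # 1 = F/initial, 2 = F/reversal, 3 = M/initial, 4 = M/reversal
  bird      = c(1L, 1L, 2L, 2L)     # bird identity for the varying effect
)

# Minimal sketch of the trials-to-finish model: a Poisson outcome with one rate
# per sex-by-phase combination plus bird-specific varying intercepts
m_speed <- ulam(
  alist(
    trials ~ dpois(mu),
    log(mu) <- a[sex_phase] + a_bird[bird],  # sex/phase means + varying bird effects
    a[sex_phase] ~ dnorm(3, 1),              # weakly informative prior on the log scale
    a_bird[bird] ~ dnorm(0, sigma_bird),
    sigma_bird ~ dexp(1)
  ),
  data = d, chains = 4, cores = 4
)

# Male-female posterior contrast on the trial scale (initial learning),
# assuming the hypothetical index coding above
post <- extract.samples(m_speed)
contrast_initial <- exp(post$a[, 3]) - exp(post$a[, 1])
```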

Reinforcement learning model

We employed an adapted (from Deffner et al., 2020) multi-level Bayesian reinforcement learning model, to examine the influence of sex on grackles’ initial and reversal learning. Our reinforcement learning model, defined below, allows us to link observed coloured tube-choices to latent individual-level attraction updating, and to translate the influence of latent attractions (i.e., expected payoffs) into individual tube-choice probabilities. As introduced above, we can reverse engineer which values of our two latent learning parameters—the information-updating rate φ and the risk-sensitivity rate λ—most likely produce grackles’ choice behaviour, by formulating our scientific model as a statistical model. Therefore, this computational method facilitates mechanistic insight into how multiple latent learning parameters simultaneously guide grackles’ reinforcement learning (McElreath, 2018).

Our reinforcement learning model consists of two equations:
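In the notation described below (with A denoting attraction and π the experienced reward-payoff), and following the formulation of Deffner et al. (2020), these equations take the form:

A_{i,j,t+1} = (1 − φ_{k,l}) · A_{i,j,t} + φ_{k,l} · π_{i,j,t}    (1)

P(i)_{j,t+1} = exp(λ_{k,l} · A_{i,j,t+1}) / Σ_{m} exp(λ_{k,l} · A_{m,j,t+1})    (2)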

Equation (1) expresses how attraction A to choice-option i changes for an individual j across time (t + 1) based on their prior attraction to that choice-option (A_{i,j,t}) plus their recently experienced choice reward-payoffs (π_{i,j,t}), whilst accounting for the relative influence of recent reward-payoffs (φ_{k,l}). As φ_{k,l} increases in value, so, too, does the rate of individual-level attraction updating based on reward-payoffs. Here, then, φ_{k,l} represents the information-updating rate. We highlight that the k, l indexing (here and elsewhere) denotes that we estimate separate φ parameters for each population (k = 1 for core; k = 2 for middle; k = 3 for edge) and for each sex-and-learning-phase combination (l = 1 for females/initial; l = 2 for females/reversal; l = 3 for males/initial; l = 4 for males/reversal).

Equation (2) is a softmax function that expresses the probability P that choice-option i is selected in the next choice-round (t + 1) as a function of the attractions A and the parameter λ_{k,l}, which governs how much relative differences in attraction scores guide individual choice behaviour. In the reinforcement learning literature, the λ parameter is known by several names—for example, ‘inverse temperature’, ‘exploration’ or ‘risk-appetite’ (Sutton and Barto, 2018; Chimento et al., 2022)—since the higher its value, the more deterministic the choice behaviour of an individual becomes (note λ = 0 generates random choice). In line with foraging theory (Stephens and Krebs, 2019), we call λ the risk-sensitivity rate, where higher values of λ imply foragers are more sensitive to risk, seeking higher expected payoffs based on their prior experience, instead of randomly sampling alternative options.
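As a brief numerical illustration (with hypothetical attraction scores), suppose an individual’s attractions to the two coloured tubes are A_1 = 0.6 and A_2 = 0.4. With λ = 1, Equation (2) gives P(1) = exp(0.6)/(exp(0.6) + exp(0.4)) ≈ 0.55—close to random choice; with λ = 10, P(1) = exp(6)/(exp(6) + exp(4)) ≈ 0.88—strongly favouring the higher-payoff option. The same attraction difference thus translates into very different choice behaviour depending on λ.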

From the above reinforcement learning model, then, we generate inferences about the effect of sex on φ_{k,l} and λ_{k,l} from at least 1000 effective samples of the posterior distribution, at both the population- and across-population-level. We note our reinforcement learning model also includes bird as a random effect (to account for repeated measures within individuals); however, for clarity, this parameter is omitted from our equations (but not our code: https://github.com/alexisbreen/Sex-differences-in-grackles-learning). Our reinforcement learning model does not, on the other hand, include trials where a grackle did not make a tube-choice, as this measure cannot clearly speak to individual learning—for example, satiation rather than any learning of ‘appropriate’ colour tube-choice could be invoked as an explanation in such cases. Indeed, there are, admittedly, a number of intrinsic and extrinsic factors (e.g., temperament and temperature, respectively) that might bias grackles’ tube-choice behaviour, and, in turn, the output from our reinforcement learning model (Webster and Rutz, 2020). But the aim of such models is not to replicate the entire study system. Finally, we further note, while we exclude extra learning trials from all of our analyses (see above), our reinforcement learning model initiates estimation of φ and λ during reversal learning based on individual-level attractions encompassing all previous choices. This parameterisation ensures we precisely capture grackles’ attraction scores up to the point of stimulus-reward reversal (for details, see our RL_Execution.R script at https://github.com/alexisbreen/Sex-differences-in-grackles-learning).

Agent-based simulations: pre- and post-study

Prior to analysing our data, we used agent-based simulations to validate our reinforcement learning model (full details in our preregistration—see Supplementary file 1). In brief, the tube-choice behaviour of simulants is governed by a set of rules identical to those defined by equations (1) and (2), and we apply the exact same learning criterion for successfully finishing both learning phases. Crucially, this a priori model vetting verifies our reinforcement learning model can (i) detect simulated sex effects and (ii) accurately recover simulated parameter values in both extreme and more realistic scenarios.

After model fitting, we used the same agent-based approach to forward simulate—that is, simulate via the posterior distribution—synthetic learning trajectories by ‘birds’, using individual-level parameter estimates generated from our across-population reinforcement learning model. Specifically, maintaining the correlation structure among sex- and phase-specific learning parameters, we draw samples from the full or averaged random-effects multivariate normal distribution describing the inferred population of grackles. We use these post-study forward simulations to gain a better understanding of the implied consequences of the estimated sex differences in grackles’ learning parameters (see Figure 2 and associated main text; for an example of this approach in a different context, see Deffner et al., 2020).
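To illustrate the mechanics of these forward simulations, the following minimal R sketch runs a single synthetic ‘bird’ through initial learning under equations (1) and (2); the parameter values and starting attractions are illustrative only, whereas our actual simulations draw parameter values from the posterior (preserving their correlation structure) and continue into reversal learning with attractions carried over:

```r
# Minimal sketch of one synthetic 'bird' (illustrative values; not our simulation code)
simulate_bird <- function(phi, lambda, max_trials = 300, n_correct = 17, window = 20) {
  A <- c(0.1, 0.1)       # illustrative starting attractions to the two coloured tubes
  rewarded <- 1          # option 1 holds the food reward during initial learning
  correct <- integer(0)
  for (t in seq_len(max_trials)) {
    p <- exp(lambda * A) / sum(exp(lambda * A))        # Equation (2): softmax choice probabilities
    choice <- sample(1:2, size = 1, prob = p)
    payoff <- as.integer(choice == rewarded)
    A[choice] <- (1 - phi) * A[choice] + phi * payoff  # Equation (1): update the chosen option
    correct <- c(correct, payoff)
    recent <- tail(correct, window)
    if (length(recent) >= n_correct && sum(recent) >= n_correct) return(t)  # 17/20 criterion met
  }
  NA_integer_  # criterion not met within max_trials
}

set.seed(2023)
simulate_bird(phi = 0.03, lambda = 4)  # trials-to-finish for one illustrative parameter pair
```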

Evolutionary model

To investigate the evolutionary significance of strategising risk-sensitive learning, we used algorithmic optimisation techniques (Yu and Gen, 2010; Otto and Day, 2011). Specifically, we construct an evolutionary model of grackle learning, to estimate how our learning parameters—the information-updating rate φ and the risk-sensitivity rate λ—evolve in environments that systematically vary across two ecologically relevant (see main text) statistical properties: the rate of environmental stability u and the rate of environmental stochasticity s. The environmental stability parameter u represents the probability that behaviour leading to a food reward changes from one choice to the next. If u is small, individuals encounter a world where they can expect the same behaviour to be adaptive for a relatively long time. As u becomes larger, optimal behaviour can change multiple times within an individual’s lifetime. The environmental stochasticity parameter s describes the probability that, on any given day, optimal behaviour may not result in a food reward due to external causes specific to this day. If s is small, optimal behaviour reliably produces rewards. As s becomes larger, there is more and more daily ‘noise’ regarding which behaviour is rewarded.

We consider a population of fixed size with N = 300 individuals. Each generation, individual agents are born naïve and make t = 1000 binary foraging decisions resulting in a food reward (or not). Agents decide and learn about the world through reinforcement learning governed by their individual learning parameters, φ and λ (see equations (1) and (2)). Both learning parameters can vary continuously, corresponding to the infinite-alleles model from population genetics (Otto and Day, 2011). Over the course of their lifetime, agents collect food rewards, and the sum of rewards collected over the last 800 foraging decisions (or ‘days’) determines their individual fitness. We ignore the first 200 choices because selection should respond to the steady state of the environment, independently of initial conditions (Otto and Day, 2011).

To generate the next generation, we assume asexual, haploid reproduction, and use fitness-proportionate (or ‘roulette wheel’) selection to choose individuals for reproduction (Yu and Gen, 2010; Otto and Day, 2011). Here, juveniles inherit both learning parameters, φ and λ, from their parent but with a small deviation (in random direction) due to mutation. Specifically, during each mutation event, a value drawn from the zero-centred normal distribution N(0, μ_φ) or N(0, μ_λ) is added to the parent value on the logit-/log-scale to ensure parameters remain within allowed limits (between 0 and 1 for φ; positive for λ). The mutation parameters μ_φ and μ_λ thus describe how much offspring values might deviate from parental values, which we set to 0.05. We restrict the risk-sensitivity rate λ to the interval 0 to 15, because greater values result in identical choice behaviour. All results reported in the main text are averaged over the last 5000 generations of 10 independent 7000-generation simulations per parameter combination. This duration is sufficient to reach steady state in all cases.
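The following minimal R sketch (with illustrative settings and simplified bookkeeping, not our full simulation code) outlines one generation of this evolutionary algorithm: agents learn and collect rewards under given u and s values via equations (1) and (2), parents are then sampled with probability proportional to fitness, and offspring inherit mutated learning parameters:

```r
# Minimal sketch of one generation (illustrative settings; not our full simulation code).
# The environment switches the rewarded option with probability u, and optimal
# behaviour fails to pay off with probability s.
run_generation <- function(phi, lambda, u, s, t_life = 1000, t_burn = 200) {
  N <- length(phi)
  fitness <- numeric(N)
  for (n in 1:N) {
    A <- c(0.1, 0.1); rewarded <- 1; reward_sum <- 0
    for (t in 1:t_life) {
      if (runif(1) < u) rewarded <- 3 - rewarded            # optimal behaviour changes
      p <- exp(lambda[n] * A) / sum(exp(lambda[n] * A))     # Equation (2)
      choice <- sample(1:2, 1, prob = p)
      payoff <- as.integer(choice == rewarded && runif(1) >= s)
      A[choice] <- (1 - phi[n]) * A[choice] + phi[n] * payoff  # Equation (1)
      if (t > t_burn) reward_sum <- reward_sum + payoff     # fitness ignores the first 200 choices
    }
    fitness[n] <- reward_sum
  }
  # Roulette-wheel selection: reproduction probability proportional to fitness
  parents <- sample(1:N, N, replace = TRUE, prob = fitness + 1e-6)
  # Mutation on the logit/log scale keeps phi in (0, 1) and lambda positive (capped at 15)
  phi_new    <- plogis(qlogis(phi[parents]) + rnorm(N, 0, 0.05))
  lambda_new <- pmin(exp(log(lambda[parents]) + rnorm(N, 0, 0.05)), 15)
  list(phi = phi_new, lambda = lambda_new)
}

# Example: a few generations of a small population under urban-like settings (low u, high s)
set.seed(1)
pop <- list(phi = runif(20, 0.01, 0.9), lambda = runif(20, 0.5, 10))
for (g in 1:3) pop <- run_generation(pop$phi, pop$lambda, u = 0.01, s = 0.4)
```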

In summary, our evolutionary model is a necessary and useful first step towards addressing targeted research questions about the interplay between learning phenotype and environmental characteristics.

Acknowledgements

We thank Jean-François Gerard and Rachel Harrison for useful feedback on our study before data analyses; and James St Clair, Sue Healy, and Richard McElreath for similarly useful presubmission feedback. We are further, and most, grateful to Richard McElreath for his full support of the study overall. And we thank all members, past and present, of the Grackle Project for collecting and making available, either via public archive (core and middle population) or permitted advance access (edge population; see Data provenance), the data analysed herein. We note this material uses illustrations from Vecteezy.com. Finally, we further note this material uses data from the eBird Status and Trends Project at the Cornell Lab of Ornithology, eBird.org. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the Cornell Lab of Ornithology.

Additional information

Funding

A.J.B. and D.D. received no independent funding for this research.

Author contributions

A.J.B. conceived of the study; collated, cleaned, and curated all target data; D.D. led model and simulation building, with input from A.J.B.; A.J.B. and D.D. contributed equally to mechanistic modelling and agent-based forward simulations; D.D. performed the Poisson and the evolutionary modelling; A.J.B. prepared all figures and tables, with help on Figure 3 from D.D.; A.J.B. and D.D. together annotated all open-source material; A.J.B. wrote the manuscript, with constructive contributions and revisions by D.D.; A.J.B. and D.D. agree on and approve the final draft of the manuscript.

Author ORCIDs

Alexis J Breen https://orcid.org/0000-0002-2331-0920

Dominik Deffner https://orcid.org/0000-0002-1649-3861

Additional files

Supplementary files

  • Supplementary file 1. Study preregistration, including reinforcement learning model validation.

  • Supplementary file 2. Supplementary tables. (a) Total-trials-in-test Poisson regression model output. (b) Total-choice-option-switches-in-test Poisson regression model output. (c) Bayesian reinforcement learning model information-updating rate φ output. (d) Bayesian reinforcement learning model risk-sensitivity rate λ output. For (a-c), both between- and across-population posterior means and corresponding 89% highest-posterior density intervals are reported for males, females, and male-female contrasts.

Reinforcement learning speed.

Between- and across-population total-trials-in-test Poisson regression model estimates and male-female contrasts, with corresponding lower (L) and upper (U) 89% highest-posterior density intervals in parentheses.

Reinforcement learning switches.

Between- and across-population total-choice-option-switches-in-test Poisson regression model estimates and male-female contrasts, with corresponding lower (L) and upper (U) 89% highest-posterior density intervals in parentheses.

Reinforcement learning information-updating rate φ.

Between- and across-population computational model φ estimates and male-female contrasts, with posterior means and corresponding lower (L) and upper (U) 89% highest-posterior density intervals in parentheses.

Reinforcement learning risk-sensitivity rate λ.

Between- and across-population computational model λ estimates and male-female contrasts, with posterior means and corresponding lower (L) and upper (U) 89% highest-posterior density intervals in parentheses.

Investigating sex differences in learning in a range-expanding bird

Alexis J. Breen1,* & Dominik Deffner1,2,3

1Department of Human Behavior, Ecology and Culture, Max Planck Institute for Evolutionary Anthropology, Leipzig 04103, Germany

2Science of Intelligence Excellence Cluster, Technical University Berlin, Berlin 10623, Germany

3Center for Adaptive Rationality, Max Planck Institute for Human Development, Berlin 14195, Germany

*alexis_breen@eva.mpg.de

Abstract

How might differences in dispersal and learning interact in range expansion dynamics? To begin to answer this question, in this preregistration we detail the background, hypothesis plus associated predictions, and methods of our proposed study, including the development and validation of a mechanistic reinforcement learning model, which we aim to use to assay colour-reward reinforcement learning (and the influence of two candidate latent parameters—speed and sampling rate—on this learning) in great-tailed grackles—a species undergoing rapid range expansion, where males disperse.

Introduction

Dispersal and range expansion go ‘hand in hand’; movement by individuals away from a population’s core is a pivotal precondition of witnessed growth in species’ geographic limits (Chuang & Peterson, 2016; Ronce, 2007). Because ‘who’ disperses—in terms of sex—varies both within and across taxa (for example, male-biased dispersal is dominant among fish and mammals, whereas female-biased dispersal is dominant among birds; see Table 1 in Trochet et al., 2016), skewed sex ratios are apt to arise at expanding range fronts, and, in turn, differentially drive invasion dynamics. Female-biased dispersal, for instance, can ‘speed up’ staged invertebrate invasions by increasing offspring production (Miller & Inouye, 2013). Alongside sex-biased dispersal, learning ability is also argued to contribute to species’ colonisation capacity, as novel environments inevitably present novel (foraging, predation, shelter, and social) challenges that newcomers need to surmount in order to settle successfully (Sol et al., 2013; Wright et al., 2010). Indeed, a growing number of studies show support for this supposition (as recently reviewed in Lee & Thornton, 2021). Carefully controlled choice tests, for example, show that urban-dwelling individuals—that is, the ‘invaders’—will both learn and unlearn novel reward-stimulus pairings more rapidly than their rural-dwelling counterparts (Batabyal & Thaker, 2019), suggesting that range expansion selects for enhanced learning ability at the dispersal and/or settlement stage(s). Given the independent influence of sex-biased dispersal and learning ability on range expansion, it is perhaps surprising, then, that their potential interactive influence on this aspect of movement ecology remains unexamined, particularly as interactive links between dispersal and other behavioural traits such as aggression are documented within the range expansion literature (Duckworth, 2006; Gutowsky & Fox, 2011).

That learning ability can covary with, for example, exploration (e.g., Auersperg et al., 2011; Guillette et al., 2011) and neophobia (e.g., Verbeek et al., 1994), two behaviours which may likewise play a role in range expansion (Griffin et al., 2017; Lee & Thornton, 2021), is one potential reason for the knowledge gap introduced above. Such correlations stand to mask what contribution, if any, learning ability lends to range expansion—an undoubtedly daunting research prospect. A second (and not mutually exclusive) reason is that, for many species, a detailed diary of their range expansion is lacking (Blackburn et al., 2009; Udvardy & Papp, 1969). And patchy population records inevitably introduce interpretive ‘noise,’ imaginably impeding population comparisons of learning ability (or the like).

In range-expanding great-tailed grackles (Quiscalus mexicanus), however, learning ability appears to represent a unique source of individual variation; more specifically, temporarily-captive great-tailed grackles’ speed to solve colour-reward reinforcement learning tests does not correlate with measures of their exploration (time spent moving within a novel environment), inhibition (time to reverse a colour-reward preference), motor diversity (number of distinct bill and/or feet movements used in behavioural tests), neophobia (latency to approach a novel object), risk aversion (time spent stationary within a ‘safe spot’ in a novel environment), persistence (number of attempts to engage in behavioural tests), or problem solving (number of test-relevant functional and non-functional object-choices) (Logan, 2016a, 2016b). Moreover, careful combing by researchers of public records, such as regional bird reports and museum collections, means that great-tailed grackle range-expansion data is both comprehensive and readily available (Dinsmore & Dinsmore, 1993; Pandolfino et al., 2009; Wehtje, 2003). Thus, great-tailed grackles offer behavioural ecologists a useful study system to investigate the interplay between life-history strategies, learning ability, and range expansion.

Here, for the first time (to our knowledge), we propose to investigate potential differences in colour-reward reinforcement learning performance between male and female great-tailed grackles (Figure 1), to test the hypothesis that sex differences in learning ability are related to sex differences in dispersal. Since the late nineteenth century, great-tailed grackles have been expanding their range at an unprecedented rate, moving northward from their native range in Central America into the United States (breeding in at least 20 states), with several first-sightings spanning as far north as Canada (Dinsmore & Dinsmore, 1993; Wehtje, 2003). Notably, the record of this range expansion in great-tailed grackles is heavily peppered with first-sightings involving a single or multiple male(s) (Dinsmore & Dinsmore, 1993; Kingery, 1972; Littlefield, 1983; Stepney, 1975; Wehtje, 2003). Moreover, recent genetic data show that, when comparing great-tailed grackles within a population, average relatedness: (i) is higher among females than among males; and (ii) decreases with increasing geographic distance among females; but (iii) is unrelated to geographic distance among males; hence, confirming a role for male-biased dispersal in great-tailed grackles (Sevchik et al., in press). Considering these natural history and genetic data, then, we expect male and female great-tailed grackles to differ across at least two colour-reward reinforcement learning parameters: speed and sampling rate (here, sampling is defined as switching between choice-options). Specifically, we expect male—versus female—great-tailed grackles: (prediction 1 & 2) to be faster to, firstly, learn a novel colour-reward pairing, and secondly, reverse their colour preference when the colour-reward pairing is swapped; and (prediction 3) to be more deterministic—that is, sample less often—in their colour-reward learning; if learning ability and dispersal relate. Indeed, since invading great-tailed grackles face agribusiness-led wildlife management strategies, including the use of chemical crop repellents (Werner et al., 2011, 2015), range expansion should disfavour slow, error-prone learning strategies, resulting in a spatial sorting of learning ability in great-tailed grackles (Wright et al., 2010). Related to this final point, we further expect (prediction 4) such sex differences in learning ability to be more pronounced in great-tailed grackles living at the edge, rather than the intermediate and/or core, region of their range (e.g., Duckworth, 2006).

Left panel: images showing a male and female great-tailed grackle (credit: Wikimedia Commons). Right panel: schematic of the colour-reward reinforcement learning experimental protocol. In the initial learning phase, great-tailed grackles are presented with two colour-distinct tubes; however, only one coloured tube (e.g., dark grey) contains a food reward (F+ versus F-). In the reversal learning phase, the colour-reward tube-pairings are swapped. The passing criterion is identical in both phases (see main text for details).

Methods

Data

This preregistration aims to use colour-reward reinforcement learning data collected (or being collected) in great-tailed grackles across three study sites that differ in their range-expansion demographics; that is, belonging to a core, intermediate, or edge population (based on time-since-settlement population growth dynamics, as outlined in Chuang & Peterson, 2016). Specifically, data will be utilised from: (i) Tempe, Arizona—hereafter, the core population (estimated—by adding the average time between first sighting and first breeding to the year first sighted—to be breeding since 1951) (Walter, 2004; Wehtje, 2003); (ii) Santa Barbara, California—hereafter, the intermediate population (known to be breeding since 1996) (Lehman, 2020); and (iii) Woodland, California—hereafter, the edge population (known to be breeding since 2004) (Hampton, 2001). Data collection at both the Tempe, Arizona and Santa Barbara, California study sites has been completed prior to the submission of this preregistration (total sample size across sites: nine females and 25 males); however, data collection at the Woodland, California study site is ongoing (current sample size: three females and nine males; anticipated minimum total sample size: five females and ten males). Thus, the final data set should contain colour-reward reinforcement learning data from at least 14 female and 35 male great-tailed grackles.

Experimental protocol

General

A step-by-step description of the experimental protocol is reported elsewhere (e.g., Blaisdell et al., 2021). As such, below we detail only the protocol for the colour-reward reinforcement learning tests that we propose to analyse herein.

Colour-reward reinforcement learning tests

The reinforcement learning tests consist of two phases (Figure 1, right panel): (i) colour-reward learning (hereafter, initial learning) and (ii) colour-reward reversal learning (hereafter, reversal learning). In both phases, two different coloured tubes are used: for Santa Barbara great-tailed grackles, gold and grey (Logan, 2016a, 2016b); for all other great-tailed grackles, light and dark grey (Blaisdell et al., 2021). Each tube has an outer diameter of 26 mm and an inner diameter of 19 mm, and each is mounted to two pieces of plywood attached at a right angle (entire apparatus: 50 mm wide × 50 mm tall × 67 mm deep); thus, only one end of each coloured tube is accessible (Figure 1, right panel).

In the initial learning phase, great-tailed grackles are required to learn that only one of the two coloured tubes contains a food reward (e.g., dark grey; this colour-reward pairing is counterbalanced across great-tailed grackles within each study site). Specifically, the rewarded and unrewarded coloured tubes are placed—either on a table or on the floor—in the centre of the aviary run (distance apart: table, 2 ft; floor, 3 ft), with the open tube-ends facing, and perpendicular to, their respective aviary side-wall. Which coloured tube is placed on which side of the aviary run (left or right) is pseudorandomised across trials. A trial begins at tube-placement, and ends when a great-tailed grackle has either made a tube-choice or the maximum trial time has elapsed (eight minutes). A tube-choice is defined as a great-tailed grackle bending down to examine the contents (or lack thereof) of a tube. If the chosen tube contains food, the great-tailed grackle is allowed to retrieve and eat the food, before both tubes are removed and the rewarded coloured tube is rebaited out of sight (for the great-tailed grackle). If a chosen tube does not contain food, both tubes are immediately removed. Each great-tailed grackle is given, first, up to three minutes to make a tube-choice (after which a piece of food is placed equidistant between the tubes to entice participation); and then, if no choice has been made, an additional five minutes maximum, before both tubes are removed. All trials are recorded as either correct (choosing the rewarded colour tube), incorrect (choosing the unrewarded colour tube), or incomplete (no choice made); and are presented in 10-trial blocks. To pass initial learning, a great-tailed grackle must make a correct choice in at least 17 out of the most recent 20 trials, with a minimum of eight and nine correct choices across the last two blocks.
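To make the passing criterion concrete, a minimal sketch in R is given below; it is not the project’s analysis code, and it assumes a bird’s completed trials are coded as a simple binary vector (1 = correct, 0 = incorrect) in presentation order.

```r
# Minimal sketch (not the project's own code): check the passing criterion for
# one bird, given `choices`, a vector of completed trials coded 1 = correct,
# 0 = incorrect, in presentation order.
passed_criterion <- function(choices) {
  n <- length(choices)
  if (n < 20) return(FALSE)          # need at least two full 10-trial blocks
  last20 <- choices[(n - 19):n]
  block1 <- sum(last20[1:10])        # penultimate block
  block2 <- sum(last20[11:20])       # most recent block
  # >= 17/20 correct overall, with >= 8 correct in one block and >= 9 in the
  # other (the latter is implied when the total is >= 17 and neither block
  # falls below 8)
  (block1 + block2) >= 17 && min(block1, block2) >= 8
}

# Example: a bird scoring 8 then 9 correct across its last two blocks passes
passed_criterion(c(rep(0, 5), rep(1, 8), rep(0, 2), rep(1, 9), 0))
```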

In the reversal learning phase, great-tailed grackles are required to learn that the colour-reward pairing has been switched; that is, the previously unrewarded coloured tube (e.g., light grey) now contains a food reward. The protocol for this second and final learning phase is identical to that, described above, of the initial learning phase.

Analysis plan

General

Here, we will process, analyse, and visually present our data using the ‘rstan’ (Stan Development Team, 2020), ‘rethinking’ (McElreath, 2018), and ‘tidyverse’ (Wickham et al., 2019) packages in R (R Core Team, 2021). Our reproducible code is available on GitHub (https://github.com/alexisbreen/Sex-differences-in-grackles-learning).

Reinforcement learning model

In this preregistration, we propose to employ an adapted (from Deffner et al., 2020) Bayesian reinforcement learning model, to examine the influence of sex on great-tailed grackles’ initial and reversal learning performance. The reinforcement learning model, defined below, allows us to link observed coloured tube-choices to latent individual-level knowledge-updating (of attractions towards, learning about, and sampling of, either coloured tube) based on recent tube-choice reward-payoffs, and to translate such latent knowledge-updating into individual tube-choice probabilities; in other words, we can reverse engineer the probability that our parameters of interest (speed and sampling rate) produce great-tailed grackles’ observed tube-choice behaviour by formulating our scientific model as a statistical model (McElreath, 2018, p. 537). This method can therefore capture whether, and, if so, how multiple latent learning strategies simultaneously guide great-tailed grackles’ decision making—an analytical advantage over more traditional methods (e.g., comparing trials to passing criterion) that ignore the potential for equifinality (Barrett, 2019; Kandler & Powell, 2018).

Our reinforcement learning model consists of two equations:

Equation 1 expresses how attraction (A) to a choice-option (i) changes for an individual (j) across time (t + 1) based on their prior attraction to that choice-option (A_{i,j,t}) plus their recently experienced choice-payoff (π_{i,j,t}), whilst accounting for the weight given to recent payoffs (φ_{k,l}). As φ_{k,l} increases in value, so, too, does the rate of individual attraction-updating; thus, φ_{k,l} represents the individual learning rate. We highlight that the k, l indexing denotes that we estimate separate φ parameters for each phase of the experiment (k = 1 for initial, k = 2 for reversal) and each sex (l = 1 for females, l = 2 for males).
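Written out, the attraction-updating rule takes the following form (a reconstruction consistent with the verbal description above and with the formulation in Deffner et al., 2020, from which our model is adapted):

$$A_{i,j,t+1} = (1 - \phi_{k,l})\,A_{i,j,t} + \phi_{k,l}\,\pi_{i,j,t} \quad \text{(Equation 1)}$$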

Equation 2 is a softmax function that expresses the probability (P) that option (i) is selected in the next choice-round (t + 1) as a function of the attractions and a parameter (λ_{k,l}) that governs how much relative differences in attraction scores guide individual choice-behaviour. The higher the value of λ_{k,l}, the more deterministic (less option-switching) the choice-behaviour of an individual becomes (note λ_{k,l} = 0 generates random choice); thus, λ_{k,l} represents the individual sampling rate for phase k and sex l.
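In full (again, a reconstruction consistent with the description above and with the softmax form in Deffner et al., 2020):

$$P(i)_{j,t+1} = \frac{\exp\!\left(\lambda_{k,l}\,A_{i,j,t+1}\right)}{\sum_{m=1}^{2}\exp\!\left(\lambda_{k,l}\,A_{m,j,t+1}\right)} \quad \text{(Equation 2)}$$

When λ_{k,l} = 0, the two exponents are equal and each tube is chosen with probability 0.5, matching the ‘random choice’ behaviour noted above.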

From the above reinforcement learning model, then, we will generate inferences about the effect of sex on φ_{k,l} and λ_{k,l} from at least 1000 effective samples of the posterior distribution (see our model validation below). We note that our reinforcement learning model also includes both individual bird and study site as random effects (to account for repeated measures within both individuals and populations); however, for clarity, these parameters are omitted from our equations (but not our code: https://github.com/alexisbreen/Sex-differences-in-grackles-learning). Regarding our study site random effect, we further note that, as introduced above, we will also explore population-mediated sex-effects on φ and λ, by comparing these learning parameters both within and between sexes at each study site. Finally, our reinforcement learning model excludes trials where a great-tailed grackle did not make a tube-choice, as this measure cannot clearly speak to individual learning ability—for example, satiation rather than any learning of ‘appropriate’ colour tube-choice could be invoked as an explanation in such cases. Indeed, there are, admittedly, a number of intrinsic and extrinsic factors (e.g., temperament and temperature, respectively) that might bias great-tailed grackles’ tube-choice behaviour, and, in turn, the output from our reinforcement learning model (Webster & Rutz, 2020). Nonetheless, our reinforcement learning model serves as a useful first step towards addressing whether learning ability and dispersal relate in great-tailed grackles (for a similar rationale, see McElreath & Smaldino, 2015).
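As an illustration of how such inferences would be drawn, the following R sketch computes male-minus-female contrasts on φ and λ from posterior draws; the object `post` and its indexing (draw × phase × sex) are hypothetical stand-ins for the extracted posterior, not the repository’s actual code, and the draws below are simulated purely so the sketch runs.

```r
library(rethinking)  # provides HPDI() for highest posterior density intervals

# Hypothetical posterior draws standing in for the model's extracted posterior:
# arrays indexed [draw, phase, sex] (phase 1 = initial, 2 = reversal;
# sex 1 = female, 2 = male), simulated here for illustration only
set.seed(1)
post <- list(
  phi    = array(rnorm(2000 * 2 * 2, mean = 0.03, sd = 0.01), dim = c(2000, 2, 2)),
  lambda = array(rnorm(2000 * 2 * 2, mean = 4,    sd = 0.50), dim = c(2000, 2, 2))
)

# Male-minus-female contrasts in the reversal learning phase
phi_contrast    <- post$phi[, 2, 2]    - post$phi[, 2, 1]
lambda_contrast <- post$lambda[, 2, 2] - post$lambda[, 2, 1]

# Posterior means and 89% HPDIs; an interval excluding zero is taken as
# evidence of a sex difference on that learning parameter
mean(phi_contrast);    HPDI(phi_contrast, prob = 0.89)
mean(lambda_contrast); HPDI(lambda_contrast, prob = 0.89)
```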

Model validation

We validated our reinforcement learning model in three steps. First, we performed agent-based simulations. Specifically, we followed the tube-choice behaviour of simulated great-tailed grackles—that is, 14 females and 35 males from one of three populations (where population membership matched known study site sex distributions)—across the described initial learning and reversal learning phases. The tube-choice behaviour of the simulated great-tailed grackles was governed by a set of rules identical to those defined by our mathematical equations—for example, coloured tube attractions were independently updated based on the reward outcome of tube choices. Because we assigned higher average φ and λ values to simulated male (versus female) great-tailed grackles, the resulting data set should show males outperforming females on initial and reversal learning, at both the group and individual level; it did (Figures 2 and S1, respectively).
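The R sketch below illustrates the structure of such an agent-based simulation. It is deliberately simplified relative to our actual simulation code (for example, it uses a fixed number of trials per phase rather than the criterion-based phase switch, and it updates only the chosen tube’s attraction), and the parameter values are illustrative rather than those reported in Table 1.

```r
# Simplified agent-based simulation of one bird's tube choices across the
# initial and reversal learning phases (illustrative structure and values only)
set.seed(2022)

simulate_bird <- function(phi, lambda, n_trials_per_phase = 50) {
  A <- c(0.1, 0.1)                    # starting attractions to tubes 1 and 2
  rewarded <- 1                       # tube 1 rewarded in initial learning
  n_total <- 2 * n_trials_per_phase
  out <- data.frame(trial  = seq_len(n_total),
                    phase  = rep(c("initial", "reversal"), each = n_trials_per_phase),
                    choice = NA_integer_, correct = NA_integer_)
  for (t in seq_len(n_total)) {
    if (t == n_trials_per_phase + 1) rewarded <- 2       # reversal: tube 2 rewarded
    p <- exp(lambda * A) / sum(exp(lambda * A))          # softmax (Equation 2)
    choice <- sample(1:2, size = 1, prob = p)
    payoff <- as.integer(choice == rewarded)
    A[choice] <- (1 - phi) * A[choice] + phi * payoff    # attraction update (Equation 1)
    out$choice[t]  <- choice
    out$correct[t] <- payoff
  }
  out
}

# Illustrative sex difference: the 'male' agent is assigned higher phi and lambda
female_bird <- simulate_bird(phi = 0.03, lambda = 3)
male_bird   <- simulate_bird(phi = 0.06, lambda = 4)
```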

Group-level tube-choice behaviour of simulated great-tailed grackles across colour-reward reinforcement learning trials (females: yellow, n = 14; males: green, n = 35), following model validation step one. Tube option 1 (e.g., dark grey) was the rewarded option in the initial learning phase; conversely, tube option 2 (e.g., light grey) contained the food reward in the reversal learning phase. Each open circle represents an individual tube-choice; black lines indicate binomial smoothed conditional means fitted with grey 89% compatibility intervals.

Next, we ran our simulated data set through our reinforcement learning model. Here, we endeavoured to determine whether our reinforcement learning model: (i) recovered our assigned φ_{k,l} and λ_{k,l} values (it did; Table 1); and (ii) produced ‘correct’ qualitative inferences—that is, detected the simulated sex differences in great-tailed grackles’ initial and reversal learning (it did; Figure 3).

Comparison of assigned and recovered φ and λ values, following model validation step two. Eighty-nine percent highest posterior density intervals (HPDI) are shown for recovered values.

Comparison of learning ability in simulated female (yellow; n = 14) and male (green; n = 35) great-tailed grackles across initial and reversal colour-reward reinforcement learning, following model validation step two. (A) φ, the rate of learning, i.e., speed. (B) λ, the rate of sampling, i.e., switching between choice-options. (C) and (D) show posterior distributions for the respective contrasts between female and male learning. Eighty-nine percent highest posterior density intervals are shaded in grey; an interval that does not cross zero indicates a simulated effect of sex on learning ability.

Finally, we repeated step one and step two, using a range of realistically plausible φ and λ sex differences (values for female great-tailed grackles were left unchanged from Table 1), to determine whether our reinforcement learning model could detect different effect sizes of sex on our target learning parameters. This final step confirmed that, for our anticipated minimum sample size, our reinforcement learning model: (i) detects sex differences in φ values ≥ 0.03 and λ values ≥ 1; and (ii) infers a null effect for φ values < 0.03 and λ values < 1, i.e., very weak simulated sex differences (Figure 4). Together, these points show that, with our reinforcement learning model, a null result would not be attributable simply to small sample size. Additionally, estimates obtained from step three were more precise in the reversal learning phase than in the initial learning phase (Figure 4), so we can expect to detect even smaller sex differences if we analyse learning across both phases—an approach we will apply if we detect no effect of phase. In sum, model validation steps one through three confirm that our reinforcement learning model is fit for purpose.

Parameter recovery test for different sizes of simulated sex differences. Plots show posterior estimates of the effect of sex (contrasts between simulated male and female great-tailed grackles; n = 35 and 14, respectively) on the speed (φ) and sampling (λ) learning parameters, following model validation step three. Black circles represent the mean recovered sex-effect estimates with grey 89% highest posterior density intervals (HPDIs); black solid diagonal lines represent a ‘perfect’ match between assigned and recovered parameter estimates (note that we would not expect perfect correspondence, owing to the stochasticity of the agent-based simulations); and black dashed horizontal lines represent a recovered null sex effect.

Bias

AJB and DD are (at the time of submitting this preregistration) blind with respect to all but two aspects of the target data: the sex and population membership of each grackle that has, thus far, completed, or is expected to complete, the colour-reward reinforcement learning tests (because these parameters were used in model validation simulations—see above).

Open materials

https://github.com/alexisbreen/Sex-differences-in-grackles-learning

Acknowledgements

We thank all members, past and present, of the Grackle Project for collecting and sharing the data that we propose to analyse herein. We further thank Richard McElreath for study support.

Ethics

All data utilised herein were collected with ethical approval.

Supplementary material

Individual-level tube-choice behaviour of simulated great-tailed grackles across colour-reward reinforcement learning trials (females: yellow, n = 14; males: green, n = 35). Tube option 1 (e.g., dark grey) was the rewarded option in the initial learning phase; conversely, tube option 2 (e.g., light grey) contained the food reward in the reversal learning phase. Each open circle shows an individual tube-choice; black solid lines show loess smoothed conditional means fitted with grey 89% compatibility intervals; and dashed black lines mark each individual’s transition between learning phases.

Comparison of learning rate (φ) and sampling rate (λ) estimates (top and bottom row, respectively) in initial learning, excluding and including extra initial learning trials (left and right column, respectively), which are present in the original data set (see Methods). Because this comparison does not show any noticeable difference depending on their inclusion or exclusion, we excluded extra initial learning trials from our analyses. All plots are generated via model estimates using our full sample size: 32 males and 17 females.
