Computational modeling of threat learning reveals links with anxiety and neuroanatomy in humans

  1. Rany Abend  Is a corresponding author
  2. Diana Burk
  3. Sonia G Ruiz
  4. Andrea L Gold
  5. Julia L Napoli
  6. Jennifer C Britton
  7. Kalina J Michalska
  8. Tomer Shechner
  9. Anderson M Winkler
  10. Ellen Leibenluft
  11. Daniel S Pine
  12. Bruno B Averbeck
  1. Emotion and Development Branch, National Institute of Mental Health, National Institutes of Health, United States
  2. Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, United States
  3. Department of Psychiatry and Human Behavior, Brown University Warren Alpert Medical School, United States
  4. Department of Psychology, University of Miami, United States
  5. Department of Psychology, University of California, Riverside, United States
  6. Psychology Department, University of Haifa, Israel

Abstract

Influential theories implicate variations in the mechanisms supporting threat learning in the severity of anxiety symptoms. We use computational models of associative learning in conjunction with structural imaging to explicate links among the mechanisms underlying threat learning, their neuroanatomical substrates, and anxiety severity in humans. We recorded skin-conductance data during a threat-learning task from individuals with and without anxiety disorders (N=251; 8-50 years; 116 females). Reinforcement-learning model variants quantified processes hypothesized to relate to anxiety: threat conditioning, threat generalization, safety learning, and threat extinction. We identified the best-fitting models for these processes and tested associations among latent learning parameters, whole-brain anatomy, and anxiety severity. Results indicate that greater anxiety severity related specifically to slower safety learning and slower extinction of response to safe stimuli. Nucleus accumbens gray-matter volume moderated learning-anxiety associations. Using a modeling approach, we identify computational mechanisms linking threat learning and anxiety severity and their neuroanatomical substrates.

Editor's evaluation

The authors present an investigation into the role of threat learning processes in symptoms of anxiety across a broad sample of subjects with and without clinical anxiety, across multiple age groups. Authors demonstrated weaker safety learning in those who were more anxious.

https://doi.org/10.7554/eLife.66169.sa0

Introduction

Threat learning encompasses processes that rapidly generate associations between neutral stimuli and aversive outcomes. Influential theories implicate variations in the mechanisms underlying such conserved associative-learning processes in anxiety symptoms (Mineka and Oehlberg, 2008; Duits et al., 2015). Computational learning theory offers tools to quantify associative learning dynamics as they relate to variations in these mechanisms. Here, we study physiological data recorded during a threat learning paradigm in a large sample (N = 251) featuring a wide range of normative to pathological anxiety severity across childhood to adulthood. We use these data to uncover links among threat learning processes, their neuroanatomical substrates, and anxiety severity via latent learning parameters estimated by reinforcement learning models.

Neural circuits have evolved to allow rapid learning of threat associations following encounters with danger (Fanselow, 2018; LeDoux, 2014; LeDoux, 2000). Through threat conditioning, a neutral stimulus rapidly comes to elicit fear responding in anticipation of danger (Fanselow, 2018). Through extinction, such conditioned anticipatory responding is attenuated if the stimulus no longer predicts the occurrence of threat. Extensive research across species implicates conserved neural circuitry in these processes, highlighting their adaptive value (LeDoux, 2000).

At the same time, aspects of these processes could prove maladaptive. Thus, influential theories link individual differences in threat learning to the emergence and persistence of excessive threat-anticipatory fear responses which are central in expression of anxiety symptoms (Mineka and Oehlberg, 2008; Duits et al., 2015; Lissek, 2005; Barlow, 2002; Abend et al., 2021; Corchs and Schiller, 2019). Moreover, aberrant maturational processes in the circuitry supporting such learning may underlie the emergence of anxiety symptoms in childhood and adolescence (Beesdo et al., 2009; Kessler et al., 2012; Casey et al., 2015; Pattwell et al., 2012; Lau et al., 2011; Craske et al., 2018; Craske et al., 2012). However, studies in youth and adults attempting to link threat learning and anxiety symptoms yield inconsistent findings (Duits et al., 2015; Lissek, 2005; Dvir et al., 2019), hindering neuroscience research on normal and abnormal threat learning mechanisms.

Failure to detect robust and replicable associations between threat learning and anxiety symptoms could reflect limitations of standard analytic methods because these are not designed to capture associative learning dynamics (Lonsdorf et al., 2017; Ney et al., 2018). These limitations could potentially be overcome through applications of computational learning theory, which provides a mathematical framework for quantifying the temporal dynamics of associative learning processes, as indexed with latent variables (Garrison et al., 2013; Li and McNally, 2014; Keiflin and Janak, 2015; Rescorla and Wagner, 1972). Based on the measurement of prediction errors between expected and experienced outcomes, associative learning models were initially applied to reward learning (Keiflin and Janak, 2015; Schultz, 2013; Lee et al., 2012), including examining reward learning perturbation in anxiety (Huang et al., 2017; Aylward et al., 2019). Recent studies extend this approach to model associations between threat cues and aversive outcomes. These include studies in healthy participants (Wise et al., 2019; Tzovara et al., 2018; Zhang et al., 2016; Li et al., 2011; Schiller et al., 2008; Atlas et al., 2016), individuals with elevated anxiety symptoms (Browning et al., 2015; Michalska et al., 2019), and in adult patients with post-traumatic stress disorder and transdiagnostic symptom dimensions (Homan et al., 2019; Gagne et al., 2020; Wise and Dolan, 2020). These studies utilize variations of threat learning paradigms, such as reversal or instrumental conditioning procedures, to identify parameters that determine important aspects of learning, including some that relate to psychopathology. Here, we complement this body of work by studying the mechanisms that link individual differences in rapid threat conditioning and extinction to anxiety severity across the lifespan (Mineka and Oehlberg, 2008; Duits et al., 2015; Lissek, 2005).

In the current report, we address this gap by applying reinforcement learning models to skin conductance response (SCR) data recorded during a threat learning paradigm, in conjunction with structural brain imaging, to quantify age-dependent associations among threat learning processes, neuroanatomy, and anxiety severity. We complement and extend prior work in several ways. First, we study a sample that includes individuals with and without anxiety disorders falling along a wide age span, from age 8 to 50 years. This sample contains wide ranges of anxiety symptoms and age, which increase statistical power to detect associations among age, anxiety, and threat learning indices. Second, previous work modeled cue responding over many reinforcement trials (e.g. > 60 trials) (Homan et al., 2019; Tzovara et al., 2018; Zhang et al., 2016), which may more closely model the expression of conditioned responses. Here, we focus instead on the rapid learning of threat associations that takes place during a shorter schedule (Fanselow, 2018). Third, we model four learning effects: threat conditioning, threat generalization, safety learning, and threat extinction. This modeling approach provides tests of different theories that link mechanistic variations in these specific processes to the severity of anxiety (Mineka and Oehlberg, 2008; Lonsdorf et al., 2017; Craske et al., 2012; Abend et al., 2020; Michalska et al., 2017; Curzon et al., 2009). Finally, we use data from a relatively large sample (N = 215) of medication-free subjects, enabling parameter estimation with greater accuracy and over a large model space. We hypothesize that anxiety severity is associated with latent learning parameters (Mineka and Oehlberg, 2008; Duits et al., 2015); that age moderates specifically the association between extinction learning and anxiety (Casey et al., 2015); and that morphometry features in subcortical structures and prefrontal cortex moderate associations between learning parameters and anxiety (Fullana et al., 2018; Maren, 2001; Fullana et al., 2016).

Materials and methods

Participants

All participants were recruited from the community as paid volunteers for research at the National Institute of Mental Health (NIMH), Bethesda, MD. Written informed consent was obtained from adult participants (≥18 years) as well as parents, and written assent was obtained from youth. Procedures were approved by the NIMH Institutional Review Board (protocol 01-M-0192). Data from a sample of 351 individuals were initially considered. Due to our focus on model fitting with trial-by-trial data and the noisy nature of SCR data, we excluded individuals with excessive missing data, with similar exclusion proportion to prior work (Homan et al., 2019) (n=136). This led to a final sample of N=215, which included 104 healthy participants (53 females; ages 8-44 years) and 111 medication-free participants with anxiety disorders (63 females; ages 8-50 years). Healthy and anxiety groups did not differ in age, t(213)=0.97, p=0.774, d=0.13, sex, χ(1)2=3.56, p=0.060, V<0.01; or IQ, t(213)=0.15, p=0.882, d=0.02. Analyses on raw psychophysiology data and neuroanatomy data from portions of this sample have been previously reported (Abend et al., 2020; Shechner et al., 2015; Britton et al., 2013; Gold et al., 2020). The current study reports novel analyses of these data. Data from 32 participants were excluded due to aborting the task (22 anxious, 8 healthy) or technical problems (1 anxious, 1 healthy). Data from four additional participants (2 anxious, 2 healthy) were excluded from analyses since they inquired and were then informed of the CS contingencies prior to the conditioning phase (Mechias et al., 2010).

Anxiety severity

Request a detailed protocol

All participants were interviewed by trained clinicians using semi-structured clinical interviews (Kaufman et al., 1997; First et al., 2002); see Appendix 1. Anxious patients were required to meet criteria for generalized, social, and/or separation anxiety disorder. Healthy participants did not meet criteria for any psychiatric diagnosis. To assess current anxiety symptom severity, youth and adults completed standard self-report anxiety questionnaires. Youth (age < 18 years) and parents completed the Screen for Child Anxiety Related Emotional Disorders (SCARED) (Birmaher et al., 1997), and adults (age ≥ 18 years) completed the trait subscale of the State-Trait Anxiety Inventory (STAI) (Spielberger et al., 1970), within 3 months of the task. The SCARED is a child- and parent-report measure comprising 41 items assessing recent anxiety symptoms and possessing strong psychometric properties (Birmaher et al., 1997; Birmaher et al., 1999); to reduce informant differences, child- and parent-report scores were averaged (Behrens et al., 2019; Abend et al., 2021). The STAI (Spielberger et al., 1970) consists of 20 items relating to general anxious moods and possesses strong psychometric properties (Elwood et al., 2012). To combine these anxiety measures, we Z-transformed each measure within its respective age sample; these Z-scores were then combined across samples and used in analyses (Abend et al., 2019). These anxiety severity scores manifested a unimodal, continuous distribution across the sample (see Figure 1—figure supplement 1). Given that the SCARED and STAI might not capture the construct of anxiety in an identical manner, we also report on SCARED analyses in youth and STAI analyses in adult participants separately.

Threat conditioning and extinction task

The task involved rapid, uninstructed Pavlovian conditioning and extinction of threat associations (Fanselow, 2018). The task has previously been found effective (i.e. produce conditioning while maintaining a low dropout rate) among individuals with anxiety and healthy participants from both youth and adult populations (Lau et al., 2011; Michalska et al., 2017; Shechner et al., 2015; Britton et al., 2013; Gold et al., 2020; Den et al., 2015; Lau et al., 2008; Ryan et al., 2019). The task consisted of a pre-conditioning phase, a conditioning phase, and an extinction phase (Figure 1). Photographs of two women displaying neutral expressions (Tottenham et al., 2009) served as the conditioned threat and safety stimuli (CS + and CS-, respectively), each presented for 7s. During the pre-conditioning phase, each CS was presented four times to allow physiological responses to the novel stimuli to habituate. During the threat conditioning phase, each CS was presented 10 times, and the CS + was followed by the unconditioned stimulus (UCS), a 1s presentation of the same actress displaying fear and co-occurring with a 95dB female scream delivered via headphones, with an 80% reinforcement schedule. Participants were instructed that they could learn to predict when the UCS would occur, but they were not explicitly informed of this contingency. During the extinction phase, the CSs were each presented eight times in the absence of the UCS. In all phases, the duration of the CS + and CS- presentation was 8s, but shortened to 7s when UCS occurred. An inter-trial interval (a gray screen presented for 8-21s, averaging 15s) separated the trials. Presentation order of the CSs was pseudo-randomized (two different orders counterbalanced across participants). The task was programmed and administered using PsyLab psychophysiological recording system (PsyLab SAM System, Contact Precision Instruments, London). Skin conductance, electromyography, and electrocardiography were recorded. Due to our focus on skin conductance responses as indexing conditioned anticipatory threat responses (Lonsdorf et al., 2017), the other measures are not analyzed in the current report.

Figure 1 with 1 supplement see all
Threat learning task and physiological data.

Top: Schematic representation of the threat learning paradigm. During the pre-conditioning phase, the designated threat (CS+) and safety (CS-) stimuli were presented without reinforcement. During the conditioning phase, the CS + was paired with a fearful face co-terminating with a scream (UCS); the CS- was never reinforced. During the extinction phase, both CS + and CS- were not reinforced by the UCS. Bottom: Mean raw skin conductance response for the CS + and CS- by task phase and trial. Note: Trial number indicates the nth trial for that stimulus. However, the CS+ and CS- trials were presented in counterbalanced order throughout the task. CS = conditioned stimulus; UCS = unconditioned stimulus. Error bars indicate one standard error of the mean.

Skin conductance

Request a detailed protocol

Skin conductance was recorded at 1000Hz using PsyLab from two Ag/AgCl electrodes from the medial phalanx of the middle and ring non-dominant-hand fingers. In line with prior research, skin conductance response (SCR) was determined by the square-root-transformed difference between base-to-peak amplitude within 1-5s after stimulus onset (Zhang et al., 2016; Li et al., 2011; Schiller et al., 2008; Shechner et al., 2015; Marin et al., 2017; Marin et al., 2020). To generate more accurate estimates from trial-by-trial data, we next cleaned trial-level data. Missing skin conductance trial values and within-subject outliers were identified per subject, for each CS type during conditioning and extinction phase separately; outliers were defined and up to three consecutive values were linearly interpolated using the zoo package in R (Zeileis and Grothendieck, 2005). Next, trailing missing trial values during conditioning and leading missing values during extinction were linearly extrapolated. Subjects with more than 50% missing trial values were excluded. Then, overall CS+ and CS- SCRs were averaged for each subject. Group outliers for each CS type were identified and excluded. Based on these conservative criteria, skin conductance data from 136 participants were excluded from modeling (52 anxious, 84 healthy). This proportion is similar to excluded data in prior work (Homan et al., 2019; Lonsdorf et al., 2019). Included and excluded participants did not differ in anxiety severity (by diagnosis or continuously) or IQ; excluded relative to included participants had a higher mean age and a greater proportion of females were excluded (see Appendix 1—table 1).

Raw SCR data

Request a detailed protocol

Raw SCR data by trial and task phase are depicted in Figure 1. During the conditioning phase, CS+ threat conditioning (acquisition of conditioned fear response) was noted; response to the CS- was characterized by generalization of conditioned fear response (increased response despite no reinforcement) and safety learning (diminishing response through continued non-reinforcement). During the extinction phase, rapid extinction (diminishing response) was noted for both stimuli. Statistics on these data are reported in the Results section. As noted, perturbations in these processes have been suggested to contribute to the emergence and persistence of anxiety symptoms (Mineka and Oehlberg, 2008; Duits et al., 2015; Lissek, 2005; Dymond et al., 2015; Pittig et al., 2016; Vervliet et al., 2013.) Specifically, facilitated threat conditioning (Orr et al., 2000), greater threat generalization (Dymond et al., 2015; Vervliet et al., 2013; Lissek et al., 2014), and slower safety learning (Craske et al., 2018; Craske et al., 2012) have each been theorized to lead to anxiety symptoms. Further, threat extinction processes have been implicated in both anxiety symptoms and exposure-based treatment for anxiety (Milad and Quirk, 2012; Casey et al., 2015; Craske et al., 2018; Vervliet et al., 2013). Due to the potential relevance of these learning processes to anxiety, and as they were evident in the data, we used reinforcement modeling to examine them.

Reinforcement learning models

Request a detailed protocol

In previous work on these data, we used standard statistical models to assess the effects of anxiety on learning, specifically by averaging responses across trials (Abend et al., 2020). In this manuscript, the goal was to explore dynamic learning processes that could account for the model-agnostic effects seen in the previous study. To model these dynamic processes during threat conditioning and extinction, we fit multiple reinforcement learning models to the trial-level SCR data for each participant. The models describe the trial-by-trial changes in the associative strength between the conditioned stimulus (CS) and the aversive unconditioned stimulus (UCS). Using a family of related models, we focused on modeling specific processes that were evident in the data and have potential relevance to anxiety.

Conditioning phase

Request a detailed protocol

We modeled three processes taking place during the conditioning phase: threat conditioning (Mineka and Oehlberg, 2008; Orr et al., 2000), whereby the CS+ comes to elicit a threat-anticipatory response via reinforcement by the UCS; threat generalization (Vervliet et al., 2013; Lissek et al., 2014), whereby conditioned fear response is generalized to the safe CS-; and safety learning (Craske et al., 2018; Craske et al., 2012), whereby CS- generalized fear responses are gradually diminished through non-reinforcement. We fit fourteen reinforcement learning models (variations of three main models) to CS+ and CS- conditioning data that varied in the number of parameters and features used to model the psychophysiological responses.

Twelve models were initially based on a Rescorla-Wagner (RW) model (Rescorla and Wagner, 1972), whereby at trial t+1, the value vCS of each CS is updated based on its value in the preceding trial vCS(t) plus the prediction error, δ(t), on trial t. We introduced several additional features to the basic model to better capture the dynamics in the data. For example, as can be seen in the conditioning phase data (Figure 1), participants rapidly generalized a fear response to the CS- (i.e., showed an increase in response to CS- despite non-reinforcement). A standard RW model will not capture conditioning to the CS-, as this cue was never reinforced. Generalization of the conditioned response to the CS- can be captured by a model that tracks the psychophysiological responses to the CS+ and ‘projects’ them onto the CS-. Therefore, our base RW model contains a CS- generalization term that updates CS- values using prediction errors from CS+ UCS trials.

The prediction error was calculated in only the CS+ trials and only when a UCS was delivered as:

(1) δ(t)=r(t)νCS+(t)

where r(t) is the SCR to the UCS on trial t. In CS+ trials the value for both the CS+ and CS- were updated according to:

(2) νCS+(t+1)=νCS+(t)+αCS+δ(t)
(3) νCS(t+1)=νCS(t)+αCSδ(t)

We used the magnitude of response to the UCS as the reinforcement term, r(t), responses to the UCS diminished across acquisition and setting r(t) to 1 when a UCS was delivered would not have the appropriate scale to capture SCRs to the CS; see Appendix 1 for further discussion. We also fit models (10-12, see below), in which we used an additional regression step, and in this case we used values of 0/1 for r(t).

We explored other updating schemes, including crossing the updates on both CS+ and CS- trials to the other cue. However, the approach in equations 1-3 provided the best fit. Note that below when we write the variable without subscripts, we are referring to both CS+ and CS- terms.

As noted, we explored additional features to capture learning effects, which led to fourteen total models (see Table 1). First, we fit a learning inertia term (Abbott, 2008), which assumed that the effect of the prediction error accumulated across m recent trials. Specifically, if we define the prediction error using equation 1 above, the learning inertia term updated the value for the CS+/CS- using νCS(t+1)=νCS(t)+αδin where:

(4) δin(t)=k=0kmδ(tk)
Table 1
Specifications and estimated free parameters for each model fit to the CS- and CS+.
ModelModel specificationFree parametersInitialization values
1.RWνCS(t+1)=νCS(t)+αδαCS+,αCS,νCS+(0)=Vi
νCS(0)=Vi

where vi is the SCR value from the last habituation trial (acquisition) and SCR value from the first extinction trial (extinction)
2.RW with inertiaνCS(t+1)=νCS(t)+α(t)δin, whereby, δin(t)=k=0kmδ(tk), with m=number of recent trialsαCS+, αCSsame as model 1
3.RW with Bayesian learning-rate decayνCS(t+1)=νCS(t)+α(t)δ, with α(t)=αsqrt(t)αCS+, αCSsame as model 1
4.RW with habituationvCS(t+1)=vCS(t)+αδ, with νCS(t) multiplied by eφCS[tt0]+ after updateαCS+,αCS;φCS+,φCSsame as model 1
5.RW with inertia and Bayesian learning-rate decayvCS(t+1)=vCS(t)+α(t)δin, with α(t)=αsqrt(t), and δin(t)=k=0kmδ(tk), with m=number of recent trialsαCS+,αCSsame as model 1
6.RW with inertia and habituationνCS(t+1)=νCS(t)+α(t)δin, whereby δin(t)=k=0kmδ(tk), with m=number of recent trials, and νCS(t) multiplied by eφCS[tt0]+ after updateαCS+,αCS;φCS+,φCSsame as model 1
7.RW with Bayesian learning-rate decay and habituationνCS(t+1)=νCS(t)+αδ, whereby α(t)=αsqrt(t) and vCS(t) multiplied by eφ[tt0]+ after updatesame as model 1
8.RW with inertia and Bayesian learning-rate decay, and habituation
νCS(t+1)=νCS(t)+α(t)δin, whereby,α(t)=αsqrt(t) and δin(t)=k=0kmδ(tk), with m=number of recent trials, and νCS(t) multiplied by eφCS[tt0]+ after update
αCS+,αCS;φCS+,φCSsame as model 1
9.RW-PH hybridνCS(t+1)=νCS(t)+καδ, whereby γCS+, γCS, κCS+, κCSαCS+(0)=1
αCS(0)=1
10.Hybrid(V) Li et al., 2011(Changing Vn for νCS(t+1) δ(n)=bUCS(n)Vn(xn)), where bUCS(n)=1 if a UCS was delivered and bUCS(n)=0 if no UCS was delivered (b indicates binary UCS). SCR was then predicted using a regression: pSCR(V)nN(β0+β1 Vn(xn), σ)
The squared error:
Hybrid(V)Error=(pSCR(V)SCR)2
γCS+, γCS, κCS+, κCS, β0, β1αCS+(0)=1αCS(0)=1vCS+(0)=vivCS(0)=vi
where vi is the SCR value on the last habituation trial and SCR value from the first extinction trial (extinction)
11.Hybrid(α) Li et al., 2011Same as model 10 (changing Vn for νCS(t+1)) except pSCR(αn)N(β0+β1αn(xn),σ)
The squared error:
Hybrid(α)Error=(pSCR(α)SCR)2
γCS+, γCS, κCS+, κCS,β0, β1same as model 10
12.Hybrid (V+α) Li et al., 2011Same as model 10 changing Vn for νCS(t+1) with additional regression such that:
pSCR(V,α)nN(β0+β1 Vn(xn)+β2 αn(xn),σ)
Hybrid(V+α)Error=(pSCR(V,α)SCR)2
γCS+, γCS, κCS+, κCS,β0, β1same as model 10
13.Mixed prior mean and uncertainty model Tzovara et al., 2018hCS(t)=ln(αCS+βCS)

zCS(t)=hCS(t)+E[θ]

Where B(αCS, βCS) is a Beta function whose parameters are updated according to:
αCS(t)=αCS(t1)+u(t1)
βCS(t)=βCS(t1)u(t1)+1

Where u(t)=1 if a US occurred and u(t)=0 otherwise. β0, β1  are the regression parameters relating zCS(t) to SCR
β0, β1αCS(0)=1βCS(0)=1
14.Mixed prior mean and uncertainty model (Model 13) Tzovara et al., 2018 with habituationSame as model 13 with the addition: hCS(t) multiplied by eφCS[tt0]+ after update for habituationβ0, β1, φCS+, φCSsame as model 13

The value km=2 maximized the fit across all participants. Thus, we replaced the prediction error defined in equation 1 with the prediction error defined in equation 4. This term has no free parameters at the single-subject level, as km was adjusted once at the population level and held constant for all participants. Qualitatively, this model improved the fit, but we found only indeterminate statistical evidence to support it.

Second, we examined Bayesian learning rate decay. This assumes that the learning rate starts high and then diminishes over trials according to the information that has been accumulated (Fiesler and Beale, 1995). In this model variant, αCS+ and αCS were free parameters and the learning rate on each trial was given by:

(5) α(t)=αsqrt(t)

This variant has no additional free parameters over the base RW model.

Third, we examined habituation of the conditioned response (diminished response over trials despite reinforcement). This captures habituation in addition to decreased response to the UCS over trials. Such habituation has been observed in other studies (Homan et al., 2019; Tzovara et al., 2018), and requires an additional parameter beyond the basic RW learning parameter to account for it. Accordingly, we fit a multiplicative exponential decay term, which resulted in an SCR estimate given by:

(6) νCShab(t)=νCS(t)eφCS[tt0]+

When we fit this model, we estimated vCS using equations 2 and 3, and then multiplied the estimated values by the habituation term given in equation 6. Note that there were two of these equations, defined by separate free parameters φCS+ and φCS. This model, therefore, had two additional parameters for each participant which controlled the rate of habituation for the CS+ and CS- cues. The rectified linear operator, [x]+, returns 0 for negative values and x for positive values. We also estimated a single value of t0 across all participants, as it was clear that the habituation process did not start in the first conditioning trial. We found that t0=2 was optimal and significantly improved the pooled model fit across participants; see Appendix 1 for additional information.

In addition to these RW models, we fit a RW-Pearce-Hall (PH) “hybrid” model to the data, given findings on its relevance to threat learning proceses (Homan et al., 2019; Tzovara et al., 2018; Zhang et al., 2016; Li et al., 2011). This model utilizes Equation 2, Equation 3 for the value update with absolute value of the prediction error. However, in the PH model the association parameter is adaptive, and updated on each trial as:

(7) α(t+1)=γδ(t)+(1γ)α(t)

And the update equation is given by:

(8) νCS(t+1)=νCS(t)+κα

The variable γ is a free parameter that controls the rate of update of α, and the variable κ is a fixed learning rate. We fit separate γ and κ parameters for CS+ and CS-. Model 9 therefore, has 4 parameters (κCS+; κCS-; γCS+; γCS-).

We also fit a series of previously reported models (Models 10, 11, 12) that had an additional regression step, such that SCR was related to value, associability, or both through regression coefficients. Here, we change notation to be consistent with previous papers and write Vn(xn) instead of νCS where xn indicates CS+ and CS-, and n is the trial number. Model fits were conducted as previously described, except for two changes specific to these models. First, the prediction error was calculated using a 1/0 for whether a UCS was delivered or not: δ(n)=bUCS(n)V(xn), where bUCS(n)=1 if a UCS was delivered and bUCS(n)=0 if no UCS was delivered (b indicates binary UCS). Second, SCR was predicted using a regression, which can be written as a normal distribution around a mean determined by the scaled predicted value, associability, or both (Homan et al., 2019; Zhang et al., 2016; Li et al., 2011).

pSCR(V)nN(β0+β1Vn(xn),σ)
pSCR(α)nN(β0+β1αn(xn),σ)
pSCR(V,α)nN(β0+β1Vn(xn)+β2αn(xn),σ)

where, in this case, n is the trial number. The squared error between the scaled predicted value, associability or both was then calculated as:

Hybrid(V)Error=(pSCR(V)SCR)2
Hybrid(α)Error=(pSCR(α)SCR)2
Hybrid(V+α)Error=(pSCR(V,α)SCR)2

Thus Models 10 and 11 had two additional parameters and Model 12 had three additional parameters to Model 9, that is, the β regression parameters, which were not used in other models.

Finally, we adapted the (Tzovara et al., 2018) model, which uses uncertainty and value to co-determine the model’s predictions. We modeled the response to both the CS+ and CS- jointly and used two parameters for the regression values relating uncertainty and value to the SCR (see Table 1). For model 14, we added an additional parameter for each type of reinforced data (CS+ and CS-) to account for the habituation observed, for a total of four parameters.

Together, we fit a total of fourteen models (see Table 1) to the acquisition data, where model 1 was the base RW model; model 2 included a prediction-error inertia term; model 3 assumed a learning-rate decay; model 4 included a habituation term; model 5 included the inertia term and learning-rate decay; model 6 included inertia and habituation terms; model 7 included learning-rate decay and habituation; model 8 included inertia, learning-rate decay, and habituation; model 9 was the RW-PH hybrid model; model 10 was the Hybrid(V) model; model 11 was the Hybrid(α) model; model 12 was the Hybrid(V+α) model; model 13 was the uncertainty model and model 14 was the uncertainty model with habituation. In summary, for models 1-9, predictions were not scaled to the data by using regression. Models 1-9 predict CS-related SCR from US-related SCR. For models 10-14, CS-related SCR is predicted using a regression to associability, value, or uncertainty, as described for each model. For models 1-9, outcome was modeled as a continuous variable. For models 10-14, we modeled outcome as a binary value of 0 or 1, as described in past publications using these models. For the equations for each model, see Table 1.

Of note, the learning processes studied here manifest over few trials, which could make parameter estimates noisy. To accommodate for that, we use data from a large number of participants. Because our hypotheses were tested using hierarchical models, variance in parameter estimates at the first level can be compensated for with additional participants, when testing hypotheses at the second level.

For each model, parameters were estimated and optimized for each participant separately using the fminsearch function in MATLAB by minimizing the difference between the predicted and measured SCR to the CS (Michalska et al., 2017). We used a series of initial values for the learning rate between 0.1 and 0.8. There were no parameter constraints for the acquisition data fits, as estimates fell within a reasonable range of –1 to 1. We also carried out parameter recovery and found that while some parameters were reasonably well recovered, for the more complex models, many were not. We therefore conducted model comparison for the models with recoverable parameters. We defined a recoverable model as one for which all the parameters had a correlation coefficient of at least 0.20. The limited ability to recover parameters is likely due to the limited number of trials available to estimate the parameters. See Appendix 1 for a full description of model and parameter recovery.

For all models, we calculated the Bayesian Information Criterion (BIC) (Schwarz, 1978), which offers a trade-off between model fit and model complexity, for each model. BIC was calculated under a Gaussian assumption as:

(9) BIC=kln(n)+nln(variance)

where k is the number of parameters in the model, n is the number of data points in the dataset that was fit (which corresponds to the number of trials in CS+ and CS- data for the given phase being fit), under the assumption that model errors are i.i.d. and normally distributed (Priestley, 1981).

We then compared the distributions of BIC values generated for each model across all subjects using t-tests. First, the BIC value for each model for each participant was computed. The differences between sets of BIC values (by model, for all participants) were used to run t-tests to determine whether there were statistically significant differences in the distributions of values for each of the models. We also examined the best model for each participant. In this case the BIC values were compared by comparing the fraction of participants best fit by each of the models. Parameters from the best fitting model were then used in analyses of anxiety symptoms and brain structure.

Extinction phase

Request a detailed protocol

We modeled threat extinction rates, whereby responses to the CS+ and CS- cues decrease over trials. No UCS was delivered during the extinction phase. We fit twelve models to the extinction data (the uncertainty models, models 13 and 14, were excluded because no UCS was ever delivered for these trials, and therefore the uncertainty model would never update), using the same procedure as above. Fitting these models to the extinction data with unconstrained learning rates led to poor performance and the learning rates for some of the participants went to extreme ranges to accommodate the data. Therefore, we refit the models with learning rates constrained to an interval of −1: + 1, where model fits would be interpretable. With this approach ~14% of participants had learning rates pinned at –1 and 1, and for ten participants, the models still did not converge to a solution. Although we would not expect a negative learning rate, we allowed for flexibility in learning by allowing the model to converge with a small or negative learning rate if necessary for the model to converge and provide a reasonable solution. Thus, rates were constrained to the range –1: +1. We also conducted parameter recovery for the twelve models used for extinction (see Appendix). The models that had recoverable parameters were included in model comparison using BIC values and the same methods used for the acquisition data.

Brain imaging

Request a detailed protocol

Analyses tested associations among threat learning parameters and imaging measures. MRI images (1 mm3) acquired on a 3-Tesla MR750 GE scanner (32-channel head coil; sagittal; 176 slices; 256 × 256 matrix; 1mm3 isotropic voxels; flip angle = 7°; repetition time (TR) = 7.7 ms, echo time (TE) = 3.42 ms), were collected from 148 of the 215 participants (69%; 81 females, M age = 18.38 years) within 90 days of the task. MRI data for a larger sample containing these participants appear in previous reports using different analyses (Abend et al., 2020; Gold et al., 2017; Gold et al., 2016). FreeSurfer (version 6.0, http://surfer.nmr.mgh.harvard.edu) was used for processing, as reported elsewhere (Abend et al., 2020). Statistical tests were performed using PALM (Permutation Analysis of Linear Models) (Winkler et al., 2014). Surface-based analyses considered whole-brain cortical thickness (10,242 vertices) using the TFCE (threshold-free cluster enhancement) statistic (Smith and Nichols, 2009). Analyses of subcortical volumes generated by FreeSurfer (bilateral amygdala, hippocampus, thalamus, diencephalon, caudate, putamen, pallidum, and nucleus accumbens; brainstem) considered gray matter volume (GMV). For each morphometry measure, analyses included global whole-brain estimates of the measure (global average thickness, total intracranial volume) as nuisance, as recommended in prior research (Nordenskjöld et al., 2015). Sex was also used as a nuisance variable given reported sex differences in brain structure (Beesdo et al., 2009; Abend et al., 2020; Ruigrok, 2014). Subcortical results were visualized using Blender version 2.90 (Kent, 2015).

Data analysis

Request a detailed protocol

Analyses tested associations among parameters of the best-fitting models for threat conditioning and extinction processes, anxiety severity, and brain morphometry measures, as moderated by age. Specifically, for each phase, we first identified the model best accounting for physiological responses. For each estimated parameter in this winning model, we next examined whether it was associated with anxiety severity, and the moderation of this association by age, using a single regression model. Conditioning effects were best modeled using four parameters (see Results), and thus significance level for each tested effect was determined via Bonferroni at α = 0.05/(4 parameters) = 0.0125. Extinction effects were best modeled using one parameter (see Results), and thus significance level for each tested effect was determined at α = 0.05/(2 parameters) = 0.025. Next, we examined whether brain structure moderated the associations between each learning parameter and anxiety severity. As noted above, analyses considered whole-brain cortical thickness (at the vertex level) and subcortical GMV (at the structure level), and used FWE rate correction for multiple comparisons of α < 0.05, whereby the family of tests for each analysis consisted of all vertices/subcortical structures across all tested effects, as above.

Sample size was not determined by a power analysis, for two reasons. First, no prior data exist on modeling of the processes of interest and their associations with anxiety severity; second, we aimed, a priori, to recruit a sample that was substantially larger than prior studies on threat learning in anxiety (Duits et al., 2015; Dvir et al., 2019).

Results

Raw SCR data

Raw SCR data by trial and phase are depicted in Figure 1. Repeated-measures ANOVA of SCR data with Phase (conditioning, extinction) and CS (CS-, CS+) as within-subject factors and Anxiety severity (Z-scores) and Age (years) as between-subject factors, indicated a significant Phase×CS interaction, F(1,212)=10.97, p = 0.001. Follow-up analyses indicated greater response to CS+ than CS- during conditioning, F(1,212)=24.24, p < 0.001, but not during extinction, F(1,212)=0.24, p = 0.631, demonstrating successful conditioning and extinction in the task. Anxiety and age did not significantly moderate the Phase×CS interaction, ps > 0.05.

Additional lower-order effects were noted. We noted a main effect of Phase, F(1,212)=10.29, p = 0.002, with greater response during conditioning than during extinction. Further, we noted a main effect of CS, F(1,212)=11.93, p < 0.001, with greater response to CS+ than to CS- across the task. These effects were qualified by a significant Phase×Anxiety interaction, F(1,212)=6.25, p = 0.013; follow-up analyses indicated no main effect of Anxiety during conditioning, F(1,212)=0.14, p = 0.712, and a trend-level main effect of Anxiety during extinction, F(1,212)=3.30, p = 0.074. Finally, we noted a main effect of Age, F(1,212)=25.42, p < 0.001, with decreasing response with greater age.

Reinforcement models

Conditioning phase

We fit a series of reinforcement learning models to the conditioning data (see Figure 2—figure supplement 1). Our goal was to provide accurate fits of the SCRs, along with parameter estimates that could be related to symptom and structural brain data. In all cases, we used the response to the UCS, on each trial in which it occurred, to generate prediction errors, δ(t), that were used to update CS+/- predictions (see below). We built a family of models that included different features and their combinations, as well as a hybrid Rescorla-Wagner Pearce-Hall models (Li et al., 2011) and uncertainty models with and without habituation (Tzovara et al., 2018). We characterized these models and our model fitting process by carrying out parameter and model recovery (see Appendix 1). The models for which parameters could not be recovered were removed from model comparison. After this step, six models remained (models 1, 4, 5, 7, 8, 13, 14).

To examine the quality of the fits of the remaining models, we calculated BIC values for each model for each participant, and based on the BIC values, determined which model was best for each participant (Figure 2A). As these models are related, and we had a broad sample of different responses to the conditioning task, there was not a dominant model. However, we used the BIC values to define a best model at the population level.

Figure 2 with 3 supplements see all
Modeling threat conditioning.

(A) Bars in left panel depict BIC values for each of the fourteen models fit to the CS- and CS + conditioning data. Error bars indicate one standard error of the mean. Bars in right panel depict the proportion of participants for whom each model provided the best fit. (B) Based on model fit indices, model 7 was chosen as the best-fitting model for conditioning data. Graphs depict empirical skin conductance data (full line) and fitted data (dashed line) for model 7 fitted to CS+ (red) and CS- (blue) conditioning data. Data are smoothed for display purposes only. (C) Association between model 7’s CS- learning rate parameter and anxiety severity. (D) The association between model 7’s CS- learning rate parameter and anxiety severity was moderated by left accumbens gray matter volume (GMV).

To characterize model fits at a population level, we first performed a repeated-measures ANOVA on BIC values (model as fixed effect, participant as random effect, BIC value as dependent variable). We found that BICs varied across models, F(1,5)=22.54, p<0.001. We further found that models 7 and 8 performed better than all other models based on post-hoc comparisons (ps<0.05, Bonferroni-corrected). These models did not, however, statistically differ (p=0.20). Since model 7 is a simpler model, we chose it as the optimal model for our data and used its four free parameters (for CS+, α: threat learning rate, φ: habituation rate; for CS-, α: threat generalization rate, φ: safety learning rate) in subsequent analyses; see Figure 2. See Appendix 1 for model fit when the sample is split into sub-groups based on youth vs adult and patient vs healthy control participants.

Conditioning: CS+ threat learning and habituation

Anxiety

We examined associations between anxiety severity and CS+ threat learning rate and habituation rate (generated by model 7). These rates did not significantly correlate with anxiety severity, ßs < 0.09, ps > 0.193.

Brain structure

No significant associations emerged between threat learning or habituation rate and brain structure, all pFWEs > 0.05.

Conditioning: CS- threat generalization and safety learning

Anxiety

We examined associations between anxiety severity and CS- parameters generated by model 7: generalization of fear response to the non-reinforced CS- (threat generalization) and reduction in response to it through non-reinforcement (safety learning). Anxiety severity was negatively associated with safety learning rate, indicating slower safety learning with greater anxiety severity, β = −0.243, p = 0.001; see Figure 2C. This effect remained significant when controlling for CS+ habituation rate, β = −0.239, p = 0.001, indicating the specificity of the association between anxiety and safety learning (as opposed to a general habituation effect). No other effects were observed.

Brain structure

A significant interaction between anxiety severity and left accumbens GMV on safety learning rate emerged, pFWE = 0.043. This effect remained significant when controlling for CS+ habituation rate, pFWE = 0.040. Decomposition of this interaction indicated that when accumbens volume was smaller, slower safety learning was associated with greater anxiety severity; see Figure 2D.

Extinction phase

Model fit

Extinction data from each participant were fit with the first twelve models used to model acquisition responses (See Figure 3—figure supplement 1). Four models with parameters that could be recovered were included in model comparison (models 1, 2, 3, and 5). A repeated-measures ANOVA on BIC values (model as fixed effect, participant as random effect), yielded a significant effect of model, F(1,3)=15.45, p < 0.001. Model 3 was picked most often as the best-fitting model (Figure 3), and therefore we compared it with other models providing better fit using t-tests. However, model 3 did not differ statistically from models 1, 2, or 5 (ps > 0.05, Bonferroni-corrected). Given its relative simplicity and the fact that it was frequently chosen across participants, we used the parameters from model 3 (Bayesian-decay learning rate) to represent the extinction process, with one learning rate for CS+ and one for CS-.

Figure 3 with 1 supplement see all
Modeling threat extinction.

(A) Bars in left panel depict BIC values for each of the twelve models fit to the CS- and CS+ extinction data. Error bars indicate one standard error of the mean. Bars in right panel depict the proportion of participants for whom each model provided the best fit. (B) Based on model fit indices, model 3 was chosen as the best-fitting model for extinction data. Graphs depict empirical skin conductance data (full line) and fitted data (dashed line) for model 3 fitted to CS+ (red) and CS- (blue) extinction data. Data are smoothed for display purposes only. (C) Association between model 3’s CS- extinction rate parameter and anxiety severity; this association is only trend-level significant. (D) The association between model 3’s CS- learning rate parameter and anxiety severity was moderated by left accumbens gray matter volume (GMV); this association is only trend-level significant.

Anxiety

CS+ extinction rates did not correlate with anxiety severity. CS- extinction rate marginally correlated with anxiety severity, β = −0.16, p = 0.034, such that slower extinction was associated with greater anxiety; see Figure 3. A comparable effect was noted when controlling for CS+ extinction rate, p = 0.031, as well as when controlling for CS- learning and habituation rates during conditioning, p = 0.030. No other effects were observed.

Brain structure

Gray matter volume in left accumbens moderated the association between CS- extinction rate and anxiety severity at trend level, pFWE = 0.062, such that as accumbens volume was smaller, slower extinction was associated with greater anxiety severity; see Figure 3. No other effects were observed.

In addition to the primary analyses examining associations between anxiety severity and learning parameters and the moderation of these associations by brain structure across the full sample, we also examined these effects separately among youth and adult participants. These are reported in full in Appendix 1. Briefly, they suggest that observed anxiety effects on safety learning and extinction are attenuated with age. Further, these suggest moderation of anxiety-safety learning rate by several structures (bilateral nucleus accumbens, brain stem, and amygdala) in youth participants but not in adults. In contrast, extinction rates were associated with amygdala, accumbens, and brainstem volume moderation in adults, but not in youths; see Appendix 1.

Discussion

This study modeled trial-by-trial psychophysiological data to quantify threat learning processes and their associations with anxiety symptoms and neuroanatomy (Mineka and Oehlberg, 2008; Duits et al., 2015; Fanselow, 2018; Casey et al., 2015). Several findings emerged. First, threat conditioning was optimally modeled with a RW model that featured Bayesian learning rate decay and a habituation rate. Within the conditioning process, slower safety learning rate was found to relate to more severe anxiety, and nucleus accumbens volume moderated this association. Finally, extinction learning was best modeled using a Bayesian (diminishing) learning rate decay and extinction of response to the safety, but not threat, stimulus was associated with greater anxiety. This study identifies specific latent threat-learning parameters that relate to anxiety symptom severity and neuroanatomical features.

The optimal fit for threat conditioning was a RW model that assumes Bayesian learning decay and habituation terms. A RW learning rule has long been proposed to underlie reinforcement learning, whereby point estimates of future outcomes are calculated by modifying current predictions according to prediction error, as weighted by a fixed learning rate (Rescorla and Wagner, 1972). Models based on a RW rule have been shown to describe the acquisition of conditioned fear responses in animals and humans (Herry and Johansen, 2014). Here, we show that an incrementally-diminishing threat-learning rate provides a better fit to physiology data (i.e., SCR) in humans, suggesting that conditioning of threat-anticipatory physiological responses is governed by a set learning parameter that is weighted less in proportion to additional reinforcement trials. It therefore places greater emphasis on learning from initial encounters with danger, supporting an adaptive role for this learning process in environments that contain threats (Fanselow, 2018).

The inclusion of a habituation term could indicate that conditioned responding decreases as the threat cue becomes more predictable, beyond what is explained by the basic RW model itself (Rescorla and Wagner, 1972). This could be a product of the 80% reinforcement schedule used here; a different reinforcement ratio could have affected this habituation term. Additionally, the habituation term could reflect our use of trial-level UCS value when modeling, rather than 0/1 as is often done, such that the magnitude of response to the UCS contributes to learning, rather than just its categorical absence/presence. Indeed, response to UCS in conditioning paradigms is often shown to naturally diminish as learning is achieved, potentially reflecting effective and reliable anticipation of it (Britton et al., 2013; Goodman et al., 2018; Johansen et al., 2010). Further, in other work (Abend, 2021) we show that increasing UCS potency results in increasing physiological response to the UCS. The response habituation term in the model may therefore account for individual differences in dynamic changes in the strength of response that both CS+ and UCS elicit.

This study extends prior work by modeling multiple learning effects that were evident in the data during threat conditioning and showed associations with anxiety severity and neuroanatomy. The introduction of an aversive UCS led to generalization of conditioned fear response to the CS- occurring most strongly early in conditioning, and followed by safety learning as this generalized response was inhibited by the CS- not predicting the aversive outcome. Learning theories of anxiety posit that each of these processes might play a role in the emergence and maintenance of pathological anxiety (Mineka and Oehlberg, 2008; Duits et al., 2015; Dymond et al., 2015; Vervliet et al., 2013; Lissek et al., 2014). Our modeling approach allowed us to test these theories. First, threat generalization was noted in the data, indicating that this process can be usefully modeled with this approach. However, the rate of generalization was not significantly associated with anxiety severity. Second, our findings provide support for propositions (Mineka and Oehlberg, 2008; Lissek et al., 2009) that link greater anxiety with slower safety learning rates (Mineka and Oehlberg, 2008; Craske et al., 2018). This association remained significant when controlling for CS+ habituation during conditioning, indicating that it was not a general habituation process. This finding suggests that fear responses to aversive events may initially generalize to neutral proximal stimuli, as a potentially adaptive mechanism, but that continued responding to these stimuli may contribute to the persistent and generalized fears associated with anxiety (Lissek et al., 2014). Importantly, the use of computational modeling enabled us to extend prior work by directly examining associations between learning rates and anxiety.

Our data suggest that the association between anxiety and safety learning was moderated by nucleus accumbens volume. Recent work indicates a role for this structure in regulating fear responses and threat-safety discrimination, including in the context of extinction learning (Ray et al., 2020; Dutta et al., 2021; Abraham et al., 2014). Our findings extend such work by suggesting that variation in accumbens structure relates to fear regulation following fear generalization to safety cues. Importantly, this effect is associated with anxiety severity, indicating potential pathophysiological relevance. These findings encourage more research on accumbens structure in predicting future anxiety as well as functional imaging work linking its structure and function in the context of safety learning.

Our results show that threat extinction learning also follows an incrementally-diminishing learning rate. As in threat generalization and safety learning, this is the first study to directly quantify the temporal dynamics of this learning process via modeling. We show an association between slower extinction of the safety stimulus and greater anxiety severity, providing some support for major theories of anxiety and its treatment that highlight extinction processes (Mineka and Oehlberg, 2008; Duits et al., 2015; Milad and Quirk, 2012; Barlow, 2002; Craske et al., 2018; Pittig et al., 2016; Vervliet et al., 2013; Papalini et al., 2020). Thus, in the context of exposure to conditioned stimuli during threat extinction, individuals with higher relative to lower anxiety symptoms diminish fear responses to safe stimuli more slowly. Importantly, this association was maintained when controlling for threat extinction rate as well as safety learning rate during conditioning, indicative of specificity. As in safety learning, accumbens structure moderated anxiety-learning associations. Thus, associations with anxiety severity, and moderation by accumbens anatomy, emerged for two processes in which the safe value of the CS- is to be learned, highlighting the role of response regulation to such stimuli in anxiety symptoms. It should be emphasized, however, that effects for extinction were weaker than those observed during conditioning and were evident only at trend-level, calling for caution in its interpretation and for replication.

Of note, prior work identified the RW-PH “hybrid” model as providing optimal fit to data for the expression of threat contingencies among healthy adults as well as adults with PTSD (Homan et al., 2019; Tzovara et al., 2018; Zhang et al., 2016). Here, we found that this model did not provide the best fit for rapid threat conditioning or extinction among the models tested. This discrepancy could potentially be attributed to the nature of threat learning processes studied. While we focused on rapid acquisition/extinction of threat contingencies which takes place over a short training schedule (LeDoux, 2000; Fanselow, 2018), prior modeling studies examined threat contingencies over much longer durations (over 80 CS+ trials). The hybrid model includes a cumulative associability term which could be more sensitive to tracking contingency values or the expression of conditioned fear as it is optimized over longer durations. Thus, differences in rapid, crude acquisition vs optimized expression of threat contingencies processes could account for model differences. As such, the current and prior work could be seen as complementary in terms of threat learning processes studied, and both sets of findings could usefully inform study design for future work.

While the design of this study offers a unique opportunity to examine age effects on threat learning as these relate to anxiety severity, an inherent challenge that arises in research on anxiety along the lifespan is how to combine anxiety data from youth and adult participants. Although the SCARED and STAI are each considered ‘gold standard’ measures in their respective target populations, they nevertheless are not identical. Under the assumption that these capture similar constructs, we uncovered several anxiety effects across the full age range. Alternatively, one may consider these measures to be incomparable, in which case analyses should be restricted to specific age groups. This alternative approach indicated an attenuation of anxiety-learning associations with age, as well as age group-specific moderation of these associations by brain structure. Given that age-dependent effects appear to emerge more strongly when age groups were considered separately, future research could emphasize this approach. When hypotheses call for analyses across the lifespan, consideration of optimal harmonization of clinical data is needed.

Our prior report used this dataset to examine links among associative threat learning, anxiety, and neuroanatomy using a ‘standard’ analytical approach whereby responses to CSs are averaged across trials (Abend et al., 2020). Here, we utilized a reinforcement modeling framework to test such links; this approach importantly extends our prior work in several ways. First, the approach applied here enabled us to directly quantify the temporal dynamics of CS and UCS associations that are at the core of associative learning (as opposed to measuring only responses to CSs and treating all trials as equivalent), thereby providing a more sensitive measure of threat learning processes, and how these relate to anxiety. Second, the current approach allowed us to quantify multiple characteristics evident in the learning processes (e.g. acquisition and habituation rates). As such, the modeling approach may be more sensitive to detect different, specific aspects of learning; this, in turn, enabled us to simultaneously test different theories linking these specific threat learning aspects and anxiety (e.g., slower safety learning) with greater sensitivity. Indeed, whereas our previous report failed to identify such associations, the current report provides novel insight on the pathophysiology of anxiety by linking variations in distinct threat learning processes to anxiety symptoms and subcortical structures. This distinction therefore highlights the utility of using a reinforcement learning framework to test questions on threat learning and anxiety.

Along these lines, the findings generated from this application of a computational approach to link biological, clinical, and imaging data could initiate continued research along several lines. In terms of clinical research, the identified associations between threat learning processes and anxiety severity could inform theories on the etiology of anxiety symptoms and guide studies that aim to further qualify the learning conditions and mechanisms that promote anxiety (Li and McNally, 2014; Vervliet et al., 2013). Further, influential theories that place threat extinction processes at the center of exposure-based therapy (Milad and Quirk, 2012; Craske et al., 2018; Pittig et al., 2016; Papalini et al., 2020) could be evaluated more sensitively using modeling-derived indices, while such treatment approaches could incorporate insight on safety learning (Craske et al., 2018). Translational research on mechanisms of threat learning could also benefit from these findings. Thus, neuroscience research in humans could focus on the specific processes identified here and extend our structural imaging findings to insight on function within this circuitry. Research in animals could complement such work by further delineating, via invasive manipulations (e.g., Likhtik and Paz, 2015), the roles in threat learning of the subcortical structures identified here.

Several important limitations should be acknowledged. First, modeling was based on relatively few task trials; while we focused on the specific processes of rapid learning of threat associations that take place over very few pairings (Fanselow, 2018), this could have led to noisier parameter estimates that could diminish accuracy and statistical power. Second, this was a cross-sectional study; a longitudinal design examining whether learning rates predict later emergence of symptoms would allow stronger inferences about developmental processes (Lonsdorf and Merz, 2017). Third, only structural MRI data were examined; functional imaging during threat learning could reveal additional correlates of learning circuitry (Homan et al., 2019), although the delivery of aversive stimuli to participants with anxiety in the MRI scanner presents a challenge (Thorpe et al., 2008). Fourth, SCR data were analyzed using a single method, which, while established, relies only on directly-observable effects; future studies may consider using novel, computational analysis methods which could potentially reveal effects not observed using the current method (Bach et al., 2018; Bach and Friston, 2013; Bach et al., 2020; Ojala and Bach, 2020). Along these lines, a multiverse approach may be used in future work to comprehensively compare multiple methods of quantifying threat learning. Fifth, while we used strict exclusion criteria to reduce effects of psychopathology other than anxiety, this does not eliminate all such potential effects (particularly when using the STAI Knowles and Olatunji, 2020); future research may wish to utilize computational approaches (e.g. bifactor models) to estimate symptom variability that is unique to anxiety (Tseng et al., 2021). Sixth, future research may consider alternative psychophysiology measures to SCR which may potentially be more sensitive to threat learning effects (Ojala and Bach, 2020). Finally, there were differences in sex and age between individuals included and excluded (non-responders) from analyses, as reported previously (Boucsein et al., 2012; Bari et al., 2020); while these differences did not influence our findings, they could still limit generalizability and thus future studies should consider such potential differences. Several strengths partially mitigate these limitations and address general shortcomings in threat learning research (Lonsdorf et al., 2017; Ney et al., 2018). First, the large sample size increases precision of estimated parameters (Asendorpf et al., 2020), offsetting, to some extent, the small number of trials used for modeling. Second, participants were carefully assessed and free of medications known to impact threat learning and psychophysiology (Lonsdorf et al., 2017). Third, wide anxiety-symptom and age ranges generate inferences with reasonable statistical power. Fourth, task and setting were identical for all participants, reducing measurement confounds and noise.

As a final technical comment, reinforcement learning model parameters were estimated using maximum likelihood techniques on individual subjects followed by model comparison. Future work could expand on this by using hierarchical Bayesian parameter estimation to reduce the variance around parameter estimates (Piray et al., 2019; van Geen and Gerraty, 2021; Lee and Newell, 2011). However, choosing prior distributions within the hierarchical Bayesian approach is not trivial and may not work for all of the models tested in this study. As such, future work could focus on fewer models, such as those that survived parameter recovery in this study, test a range of priors, and determine whether parameter estimation could be improved.

In conclusion, in this study we used computational modeling to index dynamic learning processes associated with threat learning. Through this modeling approach, we quantified these learning processes, revealed specific associations with anxiety symptoms, and identified neuroanatomical substrates. These findings extend our knowledge of how these learning processes manifest in humans and how variations in these could potentially contribute to anxiety symptomatology.

Appendix 1

Methods

Participants

Prior to participation in the study, patients’ psychiatric symptoms were assessed on three separate occasions, via (1) telephone screen with a psychiatric nurse, (2) in-person, standardized diagnostic assessment (see below) with a trained clinician, and (3) independent assessment and confirmation of diagnosis by a senior psychiatrist. All patients agreed to enter treatment for their anxiety disorder; as such, the patient data reported reflect populations of youth and adults with anxiety disorders.

Individuals were included if they were medication-free, physically healthy, and had an IQ>70, based on the Vocabulary and Matrix Reasoning subscales of the Wechsler Abbreviated Scale of Intelligence (Wechsler, 1999). All pediatric patients in the anxiety group had to suffer from generalized anxiety, social anxiety, and/or separation anxiety disorders as their major source of psychiatric impairment and need for treatment. Adult patients were eligible for any of the same ongoing three anxiety disorders as well as panic disorder and were not required to be seeking treatment. A diagnosis of major depressive disorder, bipolar disorder, obsessive compulsive disorder, disruptive mood dysregulation disorder, or posttraumatic stress disorder was exclusionary. Patients with anxiety were permitted to have comorbid additional anxiety disorders or attention-deficit/hyperactivity disorder if presenting as a secondary, minor problem, relative to the primary diagnosis. Healthy participants were diagnosis-free. Exclusion criteria for both groups included current psychotropic medications, inclusion of family relatives in the study, or physical health problems.

Data from 32 participants were excluded due to aborting the task (22 anxious, 8 healthy) or technical problems (1 anxious, 1 healthy). Data from 4 additional participants (2 anxious, 2 healthy) were excluded from analyses since they inquired and were then informed of the CS contingencies prior to the conditioning phase (Mechias et al., 2010).

Diagnosis

Diagnosis of an anxiety disorder was given by trained clinicians using the Kiddie Schedule for Affective Disorders and Schizophrenia for School-Age Children-Present and Lifetime Version (KSADS-PL) (Kaufman et al., 1997) for youths (n=140; age <18 years) and the Structured Clinical Interview for DSM-IV-TR Axis I Disorders (SCID) (First et al., 2002) for adults (n=75; age ³18 years). All clinicians were trained on an initial series of recorded interviews and then were regularly monitored through a review of interview tapes and reassessments of patients.

Anxiety severity scores

As noted in the main text, we used dimensional anxiety severity scores in analyses derived from Z-scores on standardized anxiety questionnaires. As can be seen in Figure 1—figure supplement 1, severity scores were unimodal and continuous across the sample. Supplemental analyses considered youth and adult participants separately.

Threat conditioning and extinction task

A schematic representation of the threat conditioning and extinction task is provided in Figure 1. The task consisted of a pre-conditioning phase, a conditioning phase, and an extinction phase. Each conditioned stimulus (CS+ and CS-) was presented for 7s. During the pre-conditioning phase, each CS was presented four times to allow physiological responses to the novel stimuli to habituate. During the threat conditioning phase, each CS was presented 10 times, and the CS+ was followed by the UCS with an 80% reinforcement schedule. Participants were instructed that they could learn to predict when the UCS would occur, but they were not explicitly informed of this contingency. During the extinction phase, the CSs were each presented eight times in the absence of the UCS. Throughout all phases, presentation order of the CSs and an inter-trial interval (a gray screen presented for 8-21s, averaging 15s) was pseudo-randomized (two different orders counterbalanced across participants). The task was programmed and administered using PsyLab psychophysiological recording system (PsyLab SAM System, Contact Precision Instruments, London).

Following extraction of SCR data as specified in the main text, we cleaned the data at the trial level. This was of particular importance in this study, since models were fit to trial-by-trial data, and outliers would skew model estimates. Missing skin conductance trial values and within-subject outliers were identified per subject, for each CS type during conditioning and extinction phase separately; outliers were defined and up to three consecutive values were linearly interpolated using the default settings in the zoo package in R66. Next, trailing missing trial values during conditioning and leading missing values during extinction were linearly extrapolated. Interpolation and extrapolation of missing data was done, as opposed to censoring, since model generation relied on limited trial-by-trial data (per subject), and learning effects were very rapid, and spurious estimates due to censoring data points was a concern. Subjects with more than 50% missing trial values (i.e., skin conductance = 0mS) for CS+ trials during conditioning were excluded. Then, overall CS+ and CS- skin conductance responses were averaged for each subject. Group outliers for each CS type were identified and excluded. Based on these conservative criteria, skin conductance data from 136 participants were excluded from modeling (52 anxious, 84 healthy).

This proportion of exclusion is similar to excluded data in prior work (Homan et al., 2019; Lonsdorf et al., 2019). See Appendix 1—table 1 for demographic and clinical differences as a function of inclusion/exclusion. No significant differences in exclusion were noted for anxiety severity (by diagnosis or continuously) or IQ, but excluded relative to included participants had a higher mean age and a greater proportion of females. Note that age was included in tested models, and thus observed anxiety effects are independent of age effects; further, results did not change when using sex as a covariate. Based on our experience, we believe that this exclusion proportion reflects challenges of physiological data collection during uninstructed aversive tasks, particularly across a wide age range and in those diagnosed with anxiety disorders, in conjunction with conservative exclusion rules since modeling required sufficient trial-by-trial data. These considerations should be taken into account in future research, particularly in studies across a wide age range and in those that consider sex, and require sufficient trial-level data. Implementing newer analytic techniques and equipment, as well as use of more robust aversive stimuli, could potentially mitigate this issue.

In addition to skin conductance, startle electromyography (EMG) and electrocardiography were recorded, but not analyzed in the current report. For the startle EMG measure, startle probes (i.e., 40ms, 4-10 psi of compressed air delivered to the forehead) were presented during the CS trials (but not within the SCR response window, 5-6 seconds post-stimulus onset) and during the inter-stimulus interval.

Reinforcement modeling

When constructing the models, we used the magnitude of response to the UCS as the reinforcement term, and not the mere presence/absence of the UCS (0/1). This is due to two reasons. First, the Rescorla-Wagner model is best for reinforcement learning processes where there is convergence to a level of performance, as the function asymptotes to the expected value of the reinforcement. Thus, this model naturally asymptotes and fits asymptotic choice (performance) behavior well. When used in the current context, the skin conductance data do not asymptote between a fixed range (i.e., 0 to 100% in the case of choice performance) and therefore the scale must be considered. To use 0/1 modeling of UCS delivery, one would need to additionally normalize the skin conductance for each participant or have a unique scaling factor for each participant. Doing this would be inaccurate given the limited amount of data available for each participant and the presence of noise in SCR data in general. Second, it is important to note that there is substantial decrease in response to the UCS over trials, F(7,2422)=45.33, p<0.001. This has been noted in other work (Goodman et al., 2018). Thus, the value of reinforcement naturally diminishes with presentation, and this should be integrated into the model as reinforcement is not fixed but rather changes with time, potentially affecting learning over trials. As such, we used the UCS response as the reinforcement term.

Imaging data processing and analysis

All participants underwent MRI scanning at the NIMH Functional Magnetic Resonance Imaging Core Facility. Participants completed a high-resolution, T1-weighted magnetization-prepared rapid-conditioning gradient-echo scan (MPRAGE) with the following parameters: sagittal conditioning; 176 slices; 256x256 matrix; 1mm3 isotropic voxels; flip angle = 7°; repetition time (TR) = 7.7ms, echo time (TE) = 3.42ms. Imaging was conducted within 90 days of the task

Image Processing

Surface-based analysis followed the procedures in (Fischl and Dale, 2000 and Dale et al., 1999). T1-weighted images were corrected for magnetic field inhomogeneities, affine-registered to the Talairach-Tournoux atlas (Talairach and Tournoux, 1988), and then skull-stripped. White matter (WM) voxels were identified based on their locations, their intensities, and the intensities of neighboring voxels, and grouped into a mass of connected voxels using a six-neighbor connectivity scheme. A mesh of triangular faces was the constructed using two triangles per exposed voxel face. The mesh was next smoothed based on local intensity in the original images using trilinear interpolation (Dale and Sereno, 1993); a second smoothing iteration was then applied, resulting in a realistic representation of the interface between gray and white matter. The external cortical surface was produced by identifying a point where tissue contrast is maximal, maintaining constraints on smoothness and possibility of self-intersection (Fischl and Dale, 2000). These surfaces were then parcellated using an automated process into smaller regions (Fischl et al., 2004). We used the regions of the atlas developed by Desikan et al., 2006. Cortical thickness served as the cortical measure of interest.

The subcortical volume-based analysis stream is designed to automatically preprocess MRI volumes and label subcortical tissue classes (Fischl et al., 2004; Fischl et al., 2002). First, images were affine-registered to MNI305 space. Next, initial volumetric labeling was conducted and variation in intensity due to the B1 bias field was corrected. Finally, a high-dimensional nonlinear volumetric alignment to the MNI305 atlas was performed, and structures were labeled. These included brainstem and left and right amygdala, hippocampus, thalamus, caudate, putamen, pallidum, and nucleus accumbens. The permutations tests corrected for the number of structures tested (see below).

Bias-corrected images from FreeSurfer, in the same space as the labelled structures, were segmented into gray matter, white matter, and cerebrospinal fluid using the FAST module of FSL. The outputs of FAST are images in which the value at each voxel corresponds to the proportion of the volume of the voxel that is occupied by each of these tissue classes (Zhang et al., 2001).

Analyses

Analyses testing for associations among brain structure, learning rates, anxiety, and age were conducted using PALM (Permutation Analysis of Linear Models Winkler et al., 2014). Analyses were based on 1000 permutations, followed by an approximation to the tail of the permutation distribution of the maximum statistic using a generalized Pareto distribution (Winkler et al., 2016). For each morphometry measure, analyses included global whole-brain estimates of the measure as nuisance. Thus, for subcortical GMV, we controlled for total intracranial volume; for cortical thickness, we controlled for global average thickness. For the cortical analysis, only the surface vertices that represent actual cortex were included, masking out sub-callosal region of each hemisphere that is included in the surfaces only to ensure the topology of a sphere. Sex also served as a nuisance variable in light of known differences in brain structure (Abend et al., 2020; Ruigrok, 2014). All analyses used familywise error (FWE) rate correction for multiple comparisons across all contrasts. Subcortical results were visualized using Blender version 2.90 (Kent, 2015).

Results

Raw SCR data

In addition to the effects on raw SCR data reported in the main text, we also noted a main effect of Phase, F(1,212)=10.29, p=0.002, with greater response during conditioning than during extinction. Further, we noted a main effect of CS, F(1,212)=11.93, p<0.001, with greater response to CS+ than to CS- across the task. These effects were qualified by a significant Phase´Anxiety interaction, F(1,212)=6.25, p=0.013; follow-up analyses indicated no main effect of anxiety during conditioning, F(1,212)=0.14, p=0.71, and a trend-level main effect of anxiety during extinction, F(1,212)=3.30, p=0.07. Finally, we noted a main effect of age, F(1,212)=25.42, p<0.001, with decreasing response with greater age.

Reinforcement modeling

Figure 2—figure supplement 1 depicts the empirical and modeled data across the sample for each of the models fit to the CS- and CS+ conditioning data.

We carried out parameter recovery by using the parameters for each model from the participants to whom it was best fit and used those parameters to generate synthetic data from the model. For each model, we then fit the original model to the synthetic data and compared the corresponding actual and “recovered” parameters using correlation coefficients; see Appendix 1—table 2. For acquisition, eight models had all parameter correlation coefficients > 0.2; these models were included in model comparison and subsequent analyses.

Parameter recovery was also conducted for the extinction modeling. The same criteria were applied to include only models with correlation coefficients > 0.20 for all model parameters. See Appendix 1—table 3.

Appendix 1—table 1
Demographics (sex, age, IQ) and anxiety severity (by diagnosis: anxious/healthy; by continuous anxiety scores on the Screen for Child Anxiety Related Emotional Disorders or ) for participants who were included or excluded from data analysis due to excessive missing data.

Differences between included and excluded participants were tested using chi-squared or independent-samples t-tests.

ExcludedIncludedTest Statistic
N136 (94 F)215 (116 F)
% Female69.1153.95χ(1)2= 7.35, P=.006
N Anxiety diagnosis55104χ(1)2= 1.81, P=.18
N Healthy85111χ(1)2= 3.56, P=.06
Mean (SD) age23.40 (9.16)18.76 (9.39)t(302.29) = 4.62, P<.001
Mean (SD) anxiety–0.09 (0.89)0.05 (1.06)t(288.66) = 1.27, P=.21
Mean (SD) IQ114.68 (13.32)113.98 (11.90)t(271) = .50, P=.61
Appendix 1—table 2
Corelations between the actual parameters and recovered parameters for each of the 14 models for acquisition data.

Models in bold met the criteria for model comparison (all corelation values for all parameters >0.20). Note that the last row reflects an additional variant of the Tzovara et al. model with two additional habituation parameters (habituation for CS+ and CS- for 6 total parameters) that was examined for completeness (see Methods).

ModelParameter 1, CS+ learning RateParameter 2, CS- learning RateParameter 3, CS+ habituationParameter 4, CS- habituation
10.990.56
2 (inertia)0.860.22
3 (Bayesian)0.520.21
40.760.99>0.990.99
5 (inertia +Bayesian)0.740.46
6 (inertia)0.380.170.53>0.99
7 (Bayesian)0.590.420.98>0.99
8 (inertia +Bayesian)0.520.620.96>0.99
Parameter 1, CS +learning RateParameter 2, CS- learning RateParameter 3, CS +learning rate updateParameter 4, CS- learning rate updateParameter 5, regression βoParameter 6 regression β1Parameter 7 regression β2
90.530.130.520.02
10–0.090.04–0.370.460.350.21
110.490.07–0.370.460.350.21
120.340.04–0.390.660.0020.020.99
Parameter 1, βoParameter 2, β1Parameter 5, CS+ habituationParameter 4, CS- habituation
130.900.96
140.810.90–0.030.33
Appendix 1—table 3
Correlations between the actual parameters and recovered parameters for each model for extinction data.

Models in bold met the criteria for model comparison (all correlation values for all parameters > 0.20).

ModelParameter 1, CS+ learning rateParameter 2, CS- learning rateParameter 3, CS+ habituationParameter 4, CS- habituation
10.590.45
2 (inertia)0.510.46
3 (Bayesian)0.830.69
40.280.070.460.93
5 (inertia + Bayesian)0.800.82
6 (inertia)0.490.270.05-0.04
7 (Bayesian)0.320.400.050.40
8 (inertia + Bayesian)0.220.25-0.0030.01
Parameter 1, CS+ learning rateParameter 2, CS- learning rateParameter 3, CS+ learning rate updateParameter 4, CS- learning rate updateParameter 5, regression βoParameter 6,
regression β1
Parameter 7, regression β2
90.470.330.110.01
100.280.010.010.030.820.19
110.850.110.990.86–0.31–0.03
120.99-0.0020.990.930.900.90–0.01

Indeed, most prior work that finds this model to work well has included substantially more trials and contingency reversals; here, we focus only on initial, rapid acquisition as opposed to change in threat value over many trials. Thus, when modeling contingencies and their change over longer durations, the hybrid model may provide an optimal fit, whereas initial acquisition of contingencies might reflect a specific case in which other models perform better.

Model recovery was performed by simulating data from the nine base models using the parameters from the participants best fit by each model. However, due to the unequal distribution of participants best fit by each model, we calculated the mean and standard deviation of the parameters from the best fit participants, and then sampled from the corresponding Gaussian distributions to create a total of 100 sets of parameters for each model. These parameters were then used to generate synthetic datasets. That simulated data was then fit with all the models to determine the extent to which the model selection approach would identify the model used to generate the data. Note that in some cases, for example in habituation models, we could sample very small parameter values for the habituation term, which would lead to selection of a simpler model. See confusion matrix for model recovery results (Figure 2—figure supplement 2).

Overall, the models were recovered reasonably well, except for models 6 and 9. It is important to note that models 1-8 are variants of a general model that captures the process of interest. We did not intentionally bias our selection of parameters to the parameter range that would best separate these models and improve model recovery. Thus, we chose the participants best fit by the models as a reasonable starting point for parameter and model recovery procedures. In this context, models 3 and 7 recovered relatively better, while model 9 (RW-PH) and model 6 did not recover as well.

Figure 3—figure supplement 1 depicts empirical and modeled data overlays for CS- and CS+ during extinction.

Additional analyses

In this report, we did not bin the participants into categories of anxiety (patients vs. healthy comparisons) or age (youth vs. adults). This is because both anxiety and age are naturally continuous variables, and it is not clear that breaking them down by categories offers an advantage, either empirically or conceptually. Nevertheless, when participants are binned categorically, we do not see a clear relationship between which model is picked and anxiety, as can be seen in Figure 2—figure supplement 3.

In addition to the primary analyses reported in the main text, we also examined associations between learning parameters and anxiety separately among youth participants and among adult participants, as described below.

Conditioning

Anxiety was not significantly associated with CS+ threat conditioning or habituation rates in the youth group, βs ≤ 0.11, ps ≥ 0.665, and in the adult group, βs ≤ 0.53, ps ≥ 0.289. Anxiety severity was not significantly associated with CS- generalization rate in the youth group, βs ≤ 0.51, ps ≥ 0.297, and in the adult group, ßs ≤ 0.45, ps ≥ 0.297. CS- safety learning rate was negatively correlated with anxiety in the youth group, β = −0.27, p = 0.002, as in the full sample, and marginally so in the adult group, β = −0.24, p = 0.056, suggesting an attenuation of anxiety-safety learning association with age. No other effects were observed.

Effects of brain structure emerged in the youth group. These included a negative association between brain stem volume and CS- safety learning rate, as well as bilateral accumbens moderation effects similar to those reported in the main text, pFWEs < 0.05. Right amygdala showed a similar moderation effect at trend level, pFWE = 0.0628. In the youth group, left accumbens showed a moderation effect similar to that reported in the main text and in the youth group, but at trend level, pFWE = 0.073.

Extinction

In the youth group and the adult group, anxiety was not significantly associated with CS + extinction rate, β = 0.12, p = 0.227, and β = −0.16, p = 0.204, respectively. CS- extinction rate was negatively correlated with anxiety in the youth group at comparable magnitude to what was observed across the full sample, β = −0.20, p = 0.032, but not in the adult group, β = −0.09, p = 0.504, suggesting an attenuation of the association between anxiety and CS- extinction with age. No other effects were observed.

In the youth group, no effects of brain structure emerged. In contrast, in the adult group, a significant positive association between right amygdala volume and CS+ extinction rate was noted, pFWE = 0.008. Brain stem volume moderated the association between anxiety and CS+ extinction, pFWE = 0.012. Anxiety-CS- extinction rate was moderated by left accumbens and brain stem volume at trend level, pFWEs < 0.073.

Quadratic effects of age

Some evidence suggests quadratic effects of age on threat learning, and extinction learning in particular (Casey et al., 2015; Pattwell et al., 2012). Exploratory analyses testing a quadratic effect of age instead of a linear effect did not yield significant effects in all analyses, ps > 0.06.

Data availability

We cannot share the full dataset due to the NIH IRB requirements, which require participants to explicitly consent to their data being shared publicly. An important element in that is to protect patients who agree to participate in studies that relate to their psychopathology. Such consent was not acquired from most participants; as such, we cannot upload our complete dataset in its raw or deidentified form, or derivatives of the data, since we will be violating IRB protocols. Still, a subset of participants did consent to data sharing and we have uploaded their data as noted in the revised manuscript (https://github.com/rany-abend/threat_learning_eLife, copy archived at swh:1:rev:221919cc02c4b5ae00fc21764095b8b4d2a14201). Researchers interested in potentially acquiring access to the data could contact Dr. Daniel Pine (https://pined@mail.nih.gov/), Chief of the Emotion and Development Branch at NIH, with a research proposal; as per IRB rules, the IRB may approve adding such researchers as Associate Investigators if a formal collaboration is initiated. No commercial use of the data is alowed. The modeling and imaging analyses have now been uploaded in full as source code files.

References

  1. Book
    1. Barlow DHA
    (2002)
    Its Disorders: The Nature and Treatment of Anxiety and Panic
    The Guilford.
  2. Book
    1. Curzon P
    2. Rustay NR
    3. Browman KE
    (2009)
    Methods of Behavior Analysis in Neuroscience
    CRC Press/Taylor & Francis.
  3. Book
    1. Kent BR
    (2015)
    CA
    San Rafael: Morgan & Claypool Publishers.
    1. Lee MD
    2. Newell BR
    (2011)
    Using hierarchical Bayesian methods to examine the tools of decision-making
    Judgment and Decision Making 6:832–842.
  4. Book
    1. Rescorla RA
    2. Wagner AR
    (1972)
    In Classical Conditioning II (Ed A. H. Black & W. F. Prokasy) 64-99
    New York: Appleton-Century-Crofts.
  5. Book
    1. Talairach J
    2. Tournoux P
    (1988)
    Co-Planar Stereotaxic Atlas of the Human Brain: 3-Dimensional Proportional System: An Approach to Cerebral Imaging
    Georg Thieme.
  6. Book
    1. Wechsler D
    (1999)
    Wechsler Abbreviated Scale of Intelligence
    The Psychological Corporation.

Decision letter

  1. Alexander Shackman
    Reviewing Editor; University of Maryland, United States
  2. Michael J Frank
    Senior Editor; Brown University, United States
  3. Dominick Bach
    Reviewer

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Decision letter after peer review:

Thank you for submitting your article "Computational Modeling of Threat Learning Reveals Links with Anxiety and Neuroanatomy" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, and the evaluation has been overseen by Drs. Shackman (Reviewing Editor) and Michael Frank (Senior Editor).

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

Summary:

The authors present an investigation into the role of threat learning processes in symptoms of anxiety across a broad sample of subjects with and without clinical anxiety, across multiple age groups. The study uses a simple fear conditioning paradigm in which subjects first undergo an acquisition procedure to learn associations between visual CSs and loud noise USs, before undergoing an extinction procedure where the CS+ is presented without the US. Reinforcement learning models are fit to skin conductance responses, and parameters from these models are used to predict symptoms of anxiety. This demonstrated weaker safety learning (i.e. learning that the CS– did not predict threat) in those who were more anxious, and slower extinction in more anxious individuals, a result that was moderated by gray matter volume in a number of subcortical brain regions.

The reviewers expressed enthusiasm for the manuscript:

– This is an important area, and the results will be of interest to readers across the field of computational psychiatry, and mental health research more widely.

– This work has the potential to be an important study in the field of computational psychiatry. Investigations of threat learning processes in individuals with clinical anxiety are lacking, despite their clear relevance as a mechanism underlying symptoms of these disorders.

– [noted by all reviewers] The sample is large and includes subjects with clinically significant levels of anxiety who were not receiving medication.

– The modelling approach is carefully considered, and the authors have clearly been thorough in testing different model variants.

– The inclusion of brain volume data provides some interesting insights into how gray matter volume relates to these core learning processes.

– The authors explain the added value brought by the computational modelling approach well, and it clearly provides some interesting insights into the role of threat learning that would not be seen in more traditional analyses (e.g. prior work by this team).

– All published work on computational models of threat learning in SCR used comparably smaller samples with more limited age range (Li et al. 2011, Zhang et al. 2016, Tzovara et al. 2018, Homan et al. 2019). Hence, this is an important data set that could offer substantial insights into threat learning.

– This paper is of potential interest to researchers within the fields of clinical psychology, clinical neuroscience and computational psychiatry.

– Over recent years, both the use of dimensional measures of symptoms and adoption of computational modelling has advanced our understanding of psychopathology–related differences in cognitive and brain function. This has encompassed both original empirical studies and application of computational modelling to previously published datasets. Both these approaches have potential value and different advantages and disadvantages. The current study is an example of the latter approach.

– The consideration of nine alternate models in the modelling of SCR responses to the CS+ during acquisition is a strength of the paper.

– The application of computational modelling techniques to such a large differential fear conditioning dataset, with data from both healthy adult and child participants and unmedicated patients seeking treatment for anxiety is highly appealing.

Nevertheless, all 3 reviewers expressed some significant concerns, mostly centered on the analytic framework and other aspects of the approach. There was a consensus among us that the necessary revisions will entail considerable effort on the part of the authors.

– Modeling Approach

Concern: The modelling approach could be validated more robustly. For example, model and parameter recovery analyses are not presented, and so it is unclear how much confidence can be placed in the winning model and the parameter values derived from this model.

Recommendation: Include a more thorough validation of the modelling approach, adding model recovery and parameter recovery analyses. This would provide confidence in the conclusions drawn from the modelling by demonstrating that it is possible to recover the winning model and its parameters accurately.

Recommendation: It would be helpful to also show the BICs for different models to give the reader a clearer idea of how model fit differed across the different models.

Recommendation: It would be helpful if the authors could conduct Bayesian model comparison (see Stephan et al., Neuroimage, 2009) and report the population–level estimate of the proportion of participants best fit by each model. Given the age by anxiety analyses, it would also be helpful if they could report the best–fitting model for high and low anxious children (as assessed using the SCARED) and high and low anxious adults (as assessed using the STAI). Does the same model 'win' for both child and adult participants divided into low and high anxious subgroups?

Concern: Description of model estimation and comparison lacks detail. Model parameters are fit with fminsearch (using which starting values? random? grid search? just one starting value would hardly be sufficient. any parameter constraints?) using OLS minimisation. How is BIC then computed? How are the models compared – the methods mention "per participant and across the sample"; do you assume a fixed or random model structure? What is the purpose of a t–test (p. 13) in the model comparison?

Recommendation: Provide modeling details.

Concern: The model space of the acquisition session does not include any of the "best" threat learning models from previously published work (see below for some specific examples). This precludes a direct comparison; more importantly it may also miss out on the best model in this sample. First, previous work has simultaneously modelled CS– and CS+ responses. This is appropriate (for all sessions), since a priori the agent does not know which is the "safety" cue, and will presumably use the same mechanisms to update value for both cues. The authors state that standard models cannot account for the large increase from first CS– trial (before first reinforced CS+) to next trial; while this is correct, the authors have a model to account for that, and more trivially one may just leave out the first CS– response. Secondly, the observation function (which is only specified in the legend of table 1 – this should appear in the text) maps SCR onto the value terms of all models. All previous SCR modelling work has found SCR is more closely related to associability or uncertainty, or a mixture of these with value.

Concern: The models considered are broadly suitable but comparison of model fit across both SCRs to the CS+ and CS– during acquisition and SCRs to the CS+ and CS– in extinction is missing, which is a limitation.

Recommendation: Include prior models in the model space. Models should simultaneously include CS+ and CS– responses. To this end, they may leave out responses before the first reinforced CS+ trial. Model space should include additional variants of models 9 that account for a mapping onto associability (as in Li et al. 2011, Zhang et al. 2016) or combined associability and value (as in Homan et al. 2019) and a variant of model 3 that accounts for a mixture of uncertainty and value (as in Tzovara et al. 2018).

Related Concern from Another Reviewer: The authors do not consider alternate models for the SCRs to the CS– during acquisition or for the SCRs to the CS+ during extinction and they do not model the SCRs to the CS– during extinction at all.

Recommendations: It would be helpful to know which model best fits all the acquisition data (to both the CS+ and CS–). Does the hybrid model still perform less well than the non–hybrid models? And does model 8 still outperform model 7? (in models 8 and 9, as in model 7, the CS– value could also be modelled as updating after each UCS delivery with model 8 now having 6 free parameters and model 9 4 free parameters.) Similarly, it would be useful to have both CS– and CS+ trials during extinction modelled. Given the raw data indicates the response to both the CS+ and CS– drop off during extinction, understanding if decreased responding to the CS+ during extinction represents active specific updating of the CS+ UCS contingency or a non–specific decreases in expectancy of the UCS after both the CS+ and CS– is clearly important.

– Multiple Comparisons [Multiple Reviewers]

Concern – There are many post–hoc tests on association of model parameters with brain results and participant traits. It is unclear how this was planned and performed – how did the authors correct for multiple comparison across parameters, traits, and experiment phases?

Recommendation: provide appropriate, principled correction for multiple comparisons to mitigate familywise α inflation

Recommendation: Please state how many tests were conducted and what correction was applied for example how many brain regions were investigated, was grey matter volume in each of these related to each of the five learning rate parameters (across acquisition, generalisation and extinction), and were the moderating effects of anxiety, age and anxiety by age considered in each case?

Recommendation: please be clear in defining what "counts" as a family of analyses

– Specificity of key results

Concern: The relationship between anxiety and safety learning (as indexed by the habituation parameter to the CS– during acquisition) is one of the main findings.

Concern: The novelty of this study depends both on the strength of the computational analysis and on the extent to which the results provide new insights into the relationship between child and adult anxiety and differential fear conditioning relative to the non–computational analyses reported in the authors' prior publication on this dataset (Abdend et al., 2020). The relationship between anxiety and safety learning (as indexed by the habituation parameter to the CS– during acquisition) could be one such novel insight – however here a direct comparison of the relationship between anxiety and this parameter versus the habituation parameter to the CS+ is needed to ensure this isn't simply a non–specific habituation effect. The same issue holds for the finding of a relationship between anxiety and rate of extinction to the CS+ – the authors need to show this holds over and above any relationship between anxiety and drop off in responsivity to the CS–.

Recommendation: Here, a comparison of the relative strength of relationship between anxiety and participants' habituation parameter values for the CS– versus that for the CS+ is needed to ensure this isn't simply a non–specific habituation effect. Similarly, the authors need to directly compare the relationship between anxiety and drop off in SCRs to the CS+ during extinction against the relationship between anxiety and drop off in SCRs to the CS– during extinction to be able to make any claim specific to extinction to the CS+. This also applies to the moderating effects of brain structure on this relationship.

– SCR Approach

Concern: SCR data are likely not optimal for model fitting at the subject level; SCR is noisy, and the task involves very few trials. This may result in poorly estimated parameters; however it is not clear from the presented analyses how much of a problem this is.

Recommendation – Given the noisiness of SCR data and the limited number of trials available, parameter estimation could likely be improved substantially by using hierarchical Bayesian parameter estimation methods.

Concern: The quantification of the conditioned response is in line with some previous work, but in fact there is a range of possible analysis choices, summarised in Lonsdorf et al. 2017.

Recommendation: Given this heterogeneity, it would make sense to use the most sensitive method. Bach et al. 2020 Behaviour Research and Therapy and Ojala and Bach 2020 Neurosci Biobeh Rev discuss the most sensitive method to quantify threat conditioning from skin conductance data, and Bach et al. 2020 Nat Hum Behav provide background.

– Missing SCR Data

[Multiple reviewers] Concern: More than 1/3 of the sample are excluded due to "excessive missing data", defined as >50% missing trials.

Recommendations: It is unclear why these data were missing. Please address the explanation and please discuss the potential consequences for inference. Do excluded subjects systematically differ from those included? It would be helpful if the authors could include a table showing a break–down of the characteristics of included and excluded participants by age, gender, diagnostic status and anxiety levels and if they could address in the discussion any associated limitations in particular in relation to the age by anxiety analyses reported.

Concern: It is unclear why missing data are inter/extrapolated?

Recommendation: Use MSE. The model fitting procedure should be happy with missing data points if MSE (rather than SSE) is used as objective function. This would be more appropriate than giving the model a data value that does not exist. Censor, rather than interpolate.

– Disparate Dimensional Measures of Anxiety [Multiple Reviewers]

Concern: The combination of multiple anxiety measures (SCARED, STAI) into a single anxiety measure may not be valid, as it's reasonable to assume these are measuring subtly different constructs (aside from anything else, the SCARED measures recent symptoms while the STAI is a trait measure).

Concern: The main analyses appear to combine measures of anxiety severity across age groups as the primary variable of interest (using the SCARED for youth and the STAI for adults), and in one analysis find an interaction between age and anxiety that is driven by the younger age group. This raises some concerns as these measures are not necessarily measuring the same construct, and may limit the interpretability of these results.

Concern: Children and adults show opposing direction relationships between anxiety and fear generalisation. It is difficult to interpret the anxiety by age results given different measures were used to assess anxiety in children and in adults. This could potentially account for the different relationship between anxiety and fear generalisation in under and over 18s (see Figure 3B). I am not convinced that the authors can simply merge scores from the SCARED (which averages symptom scores based on child and adult reports) and STAI (an adult self–report questionnaire) as if they came from the same measure.

Recommendation: Absent some sort of direct quantitative harmonization (as in the GWAS literature for N/NE) or additional evidence to motivate the Z–score approach, it may not be appropriate to combine or directly compare the younger (SCARED) and older (STAI) samples (i.e. more appropriate to keep them distinct and examine them separately)

– Potential nonlinear effects of age

Concern: Only linear effects of age are modelled.

Recommendation: explore potential nonlinear effects (given that many developmental effects are nonlinear).

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

Thank you for submitting your revised manuscript – "Computational Modeling of Threat Learning Reveals Links with Anxiety and Neuroanatomy in Humans" – for consideration by eLife. Your revised article has been evaluated by Drs. Frank (Senior Editor) and Shackman (Reviewing/Handling Editor) with guidance and consultation from 2 of the 3 original Reviewers.

The manuscript has improved, with one Reviewer emphasizing that "the authors have done a good job of addressing many of the suggestions and the manuscript is definitely substantially improved as a result."

Nevertheless, there was agreement that the paper would benefit from some additional modeling.

One Reviewer wrote that, Abend et al. have addressed some, but not all of my previous concerns. The model space of the acquisition session does not include any of the "best" threat learning models from previous publications. This precludes a direct comparison; more importantly it may also miss out on the best model in this dataset.

The authors now model CS– and CS+ responses simultaneously. Yet they still do not compare their models to any of the those used in previous studies of conditioned SCR (Li et al. 2011, Zhang et al. 2016, Tzovara et al. 2018, Homan et al. 2019). They claim that their model 9 is the hybrid model used in three of these studies, but actually it is not, because the update works differently (presently they update only upon US delivery, and both CS+ and CS– from PE in CS+ trials, where as previous RW and hybrid models used the standard update on every trial, on signed PE, and with no cross–update). They qualitatively argue why the associability term in previous work does not fit their data, but in fact they should just quantitatively compare models that map value and/or associability onto SCR. Finally, they argue that Tzovara's model can only update CS+ or CS– expectation with one parameter, and modelling both CS types within the same model would lead to parameter inflaton, but in fact Tzovara's winning learning model is entirely parameter–free (only has parameters for the observation function), so this is inaccurate.

The authors need to implement these pre–existing models and show quantitatively that they do not fit the data, rather than on rely on qualitative and methodological arguments for why this is unnecessary. Based on their rebuttal, it seems that they had already implemented the models, so it should be relatively straightforward to make the necessary direct ("head–to–head") comparisons.

Along related lines, the other Reviewer noted that, [page 11] The authors argue that to adapt the Tzovara et al. model, a 5th parameter would need to be added, and that this would make it too complex. It would still be worth testing this properly [per the other reviewer's comments], and it is possible (although probably unlikely based on the provided evidence) that the model could fit well enough that it still wins despite the additional complexity–related penalisation.

The Reviewers and Editors also discussed hierarchical Bayesian approaches, with one reviewer noting that, [page 19] Reviewer suggestions related to hierarchical Bayesian parameter estimation do not seem to have been addressed in the revision. For clarity, this refers to the method of estimating reinforcement learning model parameters (see van Geen and Gerraty, 2021), rather than the SCR quantification, and there's a strong possibility it would improve parameter estimation and hence sensitivity. I appreciate that this could be represent a substantial re–analysis.

During the consultation session, the 2 Reviewers agreed that the hierarchical Bayesian approach would be nice, but is not strictly necessary for the revision, with one noting that, "I wouldn't want to force them to redo all the model fitting with a hierarchical Bayesian approach, it would likely provide more accurate estimates, but it would be a fairly big undertaking. Some acknowledgement that other model–fitting procedures exist and could be more appropriate would be nice, however (this probably also applies to their SCR quantification methods)".

And the other writing that, "I agree. This model fitting approach is a major step and may not work for some of the models."

In sum, the Bayesian approach should –– at minimum –– be noted in the Discussion.

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "Computational Modeling of Threat Learning Reveals Links with Anxiety and Neuroanatomy in Humans" for further consideration by eLife. Your revised article has been evaluated by Drs. Shackman (Reviewing Editor) and Frank (Senior Editor).

We are very pleased to accept your paper for publication pending receipt of a revision that adequately addresses the remaining suggestions from Reviewer #2. Congratulations in advance!

Reviewer #2:

This revision of the paper by Abend et al. is significantly stronger by including a direct comparison of the current model space with previous "best" models. There are two substantive and a few presentation points remaining.

1. The Tzovara et al. model is implemented differently from the original paper. The authors use separate regression parameters for CS+ and CS– trials for the Tzovara et al. model (i.e. 4 parameters) and it is not clear to me why – Tzovara et al. themselves had used only 2 regression parameters across these two trials types. With 2 instead of 4 parameters, the model might fare better (even though the fit/RSS will of course become worse). Could the authors include the original 2–parameter Tzovara et al. model?

2. Perhaps relatedly, I've been wondering about the initial values of the various terms in the update equations. Can they be presented in the tables? I'm wondering, for example, whether they used the same values for α_0 and β_0 in the β distribution of the Tzovara et al. model, but other readers might wonder about other models. Of course, it can make a big difference how a model is "initialised".

3. The authors make an effort to explain the difference between models 1–9, and models 10–14, by using different symbols on the LHS of the equations. Perhaps it could be made more prominent that models 1–9 predict CS–related SCR from US–related SCR and thus need no regression to scale the predictions to the data, whereas models 10–14 predict CS–related SCR from an arbitrary reinforcement variable.

4. I'm still unsure about using t–tests to compare BICs across a group. I have never seen this. What is the statistical foundation for this approach? I would suggest adding the RFX–model comparison by Stephan et al. as had been suggested by two reviewers in the context of the initial manuscript version. It is implemented in a matlab function (spm_BMS), requires only the matrix of BIC values, and takes a few seconds to run.

https://doi.org/10.7554/eLife.66169.sa1

Author response

Summary:

[…]

– Modeling Approach

Concern: The modelling approach could be validated more robustly. For example, model and parameter recovery analyses are not presented, and so it is unclear how much confidence can be placed in the winning model and the parameter values derived from this model.

Recommendation: Include a more thorough validation of the modelling approach, adding model recovery and parameter recovery analyses. This would provide confidence in the conclusions drawn from the modelling by demonstrating that it is possible to recover the winning model and its parameters accurately.

Thank you for this suggestion. We have now carried out parameter recovery and model recovery, as specified below.

Parameter recovery:

To carry out parameter recovery, we pulled the parameters for each model from the subjects to whom it was best fit and used those parameters to generate synthetic data from the model. We then fit the original model to the synthetic data and compared the corresponding actual and “recovered” parameters using a correlation coefficient. In general, the parameters were well recovered, with the exception of model 9 (RW-Pearce-Hall hybrid). We believe that our task design, probing our specific processes of interest, is not well-suited to drive the flexible learning component of the RW-PH model. Indeed, most prior work that finds this model to work well has included substantially more trials and contingency reversals; here, we focus only on initial, rapid acquisition and extinction as opposed to change in threat value over many trials. The main text now directs the reader (p. 15) to the full description of this rationale in Appendix 1 (pp. 57-58): “We carried out parameter recovery by using the parameters for each model from the participants to whom it was best fit and used those parameters to generate synthetic data from the model. For each model, we then fit the original model to the synthetic data and compared the corresponding actual and “recovered” parameters using correlation coefficients; see Appendix 1-table 2. In general, parameters were recovered well, with the exception of model 9 (RW-PH hybrid). This may suggest that this model is not well-suited to capture rapid conditioning. Indeed, most prior work that finds this model to work well has included substantially more trials and contingency reversals; here, we focus only on initial, rapid acquisition as opposed to change in threat value over many trials. Thus, when modeling contingencies and their change over longer durations, the hybrid model may provide an optimal fit, whereas initial acquisition of contingencies might reflect a specific case in which other models perform better.”

The correlation between the actual parameters and recovered parameters for each model was calculated and placed in a table (Appendix 1-table 2):

Model recovery:

We now note in Appendix 1 (p. 59): “Model recovery was performed by simulating data from all the models using the parameters from the participants best fit by each model. However, due to the unequal distribution of participants best fit by each model, we calculated the mean and standard deviation of the parameters from the best fit participants, and then sampled from the corresponding Gaussian distributions to create a total of 100 sets of parameters for each model. These parameters were then used to generate synthetic datasets. That simulated data was then fit with all the models to determine the extent to which the model selection approach would identify the model used to generate the data. Note that in some cases, for example in habituation models, we could sample very small parameter values for the habituation term, which would lead to selection of a simpler model.”

We summarized this with a confusion matrix (added to Appendix 1 as Figure 2—figure supplement 2):

The models were recovered reasonably well, with the exception of model 6 and model 9. It is important to note that models 1-8 are related variants of a model that captures the process of interest, i.e., they are rather similar (as we now note in p. 59). As such, the models are generally comparable and the accuracy of parameter and model recovery lower and more similar relative to model comparison between very different models. Note that we did not intentionally bias our selection of parameters or data to get full coverage of the parameter or data space best represented by these models, which would separate the differences and seemingly improve model recovery. To do this, one would need a process that exploits what each model is capturing differently than all other models. Thus, we chose the subjects best fit by the models as a reasonable starting point for parameter and model recovery procedures. In this context, models 3 and 7 recovered relatively better, while model 9 (RW-PH hybrid) and model 6 did not recover as well. This is now described in full in Appendix 1 (p. 59).

Recommendation: It would be helpful to also show the BICs for different models to give the reader a clearer idea of how model fit differed across the different models.

Thank you for this recommendation. In Author response image 1 we present a bar plot of the average BIC values for each of the models fit to the acquisition data to show similarity. (We performed similar analyses for extinction; those results are included in a later reply to a reviewer requesting more modeling of extinction). (Author response image 1) shows that the average BIC values for the acquisition models all fall within a range of ~10 values, reflecting the similarity of the models to each other. This suggests that the different variants capture the acquisition process comparably well. Our response to the next recommendation provides more detail on BIC comparison.

Author response image 1

Recommendation: It would be helpful if the authors could conduct Bayesian model comparison (see Stephan et al., Neuroimage, 2009) and report the population–level estimate of the proportion of participants best fit by each model. Given the age by anxiety analyses, it would also be helpful if they could report the best–fitting model for high and low anxious children (as assessed using the SCARED) and high and low anxious adults (as assessed using the STAI). Does the same model 'win' for both child and adult participants divided into low and high anxious subgroups?

Please see our response to the previous recommendation for a figure showing mean BIC by model. In Author response image 2 we present the fraction of participants best fit by each model, as suggested.

Author response image 2

We now note in p. 14 how we compared the models: “We then compared the distributions of BIC values generated for each model across all subjects using t-tests. First, the BIC value for each model for each participant was computed. The differences between sets of BIC values (by model, for all participants) were used to run t-tests to determine whether there were statistically significant differences in the distributions of values for each of the models. We also examined the best model for each subject. In this case the BIC values were compared, and the model with the lowest BIC value for each subject was selected (without t-test).” We then note in pp. 18-19: “To characterize model fits at a population level, we first performed a repeated-measures ANOVA on BIC values (model as fixed effect, participant as random effect, BIC value as dependent variable). We found that BICs varied across models, F(1,8)=38.98, p<0.001. We further found that models 7 and 8 performed better than all other models based on post-hoc comparisons (ps<0.05, Bonferroni-corrected). These models did not, however, statistically differ (p=0.20). Since model 7 is a simpler model, we chose it as the optimal model for our data…”With regard to proportion of models providing best fit, we do not see large stand-outs. Rather, we see that subjects are spread across models, reflecting natural variability among the subtle variants. Our goal with model fitting was to find models which captured the skin conductance response dynamics well, rather than support assertions about which model captured a particular observed behavior (as is often common in studies doing model comparison). In our original analyses, we used only simple models and found that without a habituation term in the model, the models fit poorly, and the learning rates tended to absorb aspects of the learning curves that were not due to differences in learning, because these models had large bias. The main thing that comes from this figure is that the simplest model (model 1) does not seem to capture the skin conductance data well, while the models with additional parameters can provide a more reliable estimate of skin conductance response values. Because the BIC distributions did offer a reasonable way to identify an accurate model, we based our model selection most strongly on that.

We want to emphasize that for this manuscript, we did not bin the participants into categories of anxiety (high vs. low anxiety) or age (youth vs. adults). This is because both anxiety and age are naturally continuous variables, and it is not clear that breaking them down by categories offers an advantage, either empirically or conceptually. Nevertheless, when subjects are binned categorically, we do not see a clear relationship between which model is picked and anxiety. Author response image 3 shows the fractions of participants best fit by each model, broken down by group. We do report in Appendix 1 results for the primary analyses (associations between learning parameters and anxiety, and moderation by brain structure). This way, the reader has results for both the full sample and by age group.

Author response image 3

Concern: Description of model estimation and comparison lacks detail. Model parameters are fit with fminsearch (using which starting values? random? grid search? just one starting value would hardly be sufficient. any parameter constraints?) using OLS minimisation. How is BIC then computed? How are the models compared – the methods mention "per participant and across the sample"; do you assume a fixed or random model structure? What is the purpose of a t–test (p. 13) in the model comparison?

Recommendation: Provide modeling details.

We apologize for not clarifying. We added more information to the text to clarify this. The model parameters were fit with fminsearch. We used starting values between 0.1 and 0.8 for learning rate. The habituation parameters started at 0. This is now noted in pp. 13-14. In pilot exploration, we found that this approach gave good fits, with a reasonable amount of computation. There were no parameter constraints for the acquisition data fits, as estimates naturally fell in the range -1:+1. This is noted in pp. 13-14. For extinction, unconstrained fits with all 9 models (described in more detail later) led to extreme estimates for some participants in order to accommodate the data. When learning rates fall outside 1/-1, the RW model is unstable. As such, learning rates were constrained to -1:+1, which improved model fit. Still, ~14% of participants had estimated pinned at -1 or 1, and for four participants, models did not converge. This is now all noted in p. 15. Data for these were not used in analyses.

The parameter set associated with the maximum log-likelihood (same as the minimum negative log-likelihood) was selected out of 10 iterations across parameter starting values. We calculated the BIC using a standard Gaussian assumption on the residuals:

BIC=kln(n)+nln(variance) Where k is the number of parameters in the model (2 or 4), and n is the number of data points in the dataset that was fit (which corresponds to the number of trials in CS+ and CS- data for the given phase being fit). This is the BIC calculation under the assumption that model errors are i.i.d. and normally distributed (Priestley, M.B. (1981). Spectral Analysis and Time Series. Academic Press. ISBN 978-0-12-564922-3. (p 375)). This is now noted in p.14.

As noted in our response to the previous recommendation, we also added information on how models were statistically compared (p. 14; pp. 18-19).

Concern: The model space of the acquisition session does not include any of the "best" threat learning models from previously published work (see below for some specific examples). This precludes a direct comparison; more importantly it may also miss out on the best model in this sample. First, previous work has simultaneously modelled CS– and CS+ responses. This is appropriate (for all sessions), since a priori the agent does not know which is the "safety" cue, and will presumably use the same mechanisms to update value for both cues. The authors state that standard models cannot account for the large increase from first CS– trial (before first reinforced CS+) to next trial; while this is correct, the authors have a model to account for that, and more trivially one may just leave out the first CS– response. Secondly, the observation function (which is only specified in the legend of table 1 – this should appear in the text) maps SCR onto the value terms of all models. All previous SCR modelling work has found SCR is more closely related to associability or uncertainty, or a mixture of these with value.

Concern: The models considered are broadly suitable but comparison of model fit across both SCRs to the CS+ and CS– during acquisition and SCRs to the CS+ and CS– in extinction is missing, which is a limitation.

Recommendation: Include prior models in the model space. Models should simultaneously include CS+ and CS– responses. To this end, they may leave out responses before the first reinforced CS+ trial. Model space should include additional variants of models 9 that account for a mapping onto associability (as in Li et al. 2011, Zhang et al. 2016) or combined associability and value (as in Homan et al. 2019) and a variant of model 3 that accounts for a mixture of uncertainty and value (as in Tzovara et al. 2018).

As requested by the reviewers, the 9 models were entirely refit to the combined CS+ and CS- trials during conditioning (see Figure 2—figure supplement 1), although it should be noted that it is not clear that threat and safety learning rely on the same circuitry and thus the same learning processes. We now clarify in the text in multiple places that both CSs were modeled in the same model (p. 10-15).

This was also done for the extinction process (discussed later; Figure 3—figure supplement 1). The models referenced in Li et al., Zhang et al., and Homan et al., are all the same version of the Pearce-Hall model which has already been used in our study (model 9). We apologize for not describing this model clearly. We include two associability terms in the model that drive changes in learning rate (for fitting CS+ and CS-), as now noted in p. 14.

We stress that the family of models chosen for this study was intentionally similar, and we are using them to improve the quality of the metrics (parameters) used to link the SCR data to the symptom and neural data. We note this in pp. 9-10.

We reviewed the variants of the PH model mentioned by reviewers (Li et al. 2011, Zhang et al. 2016, Homan et al. 2019) and examined the associability terms for model 9. We plotted the median values of the two associability parameters (α positive and α negative for CS+ and CS-), resulting from the model 9 fit (see Author response image 4) and compared the profiles to the acquisition data. However, the time course of associability for acquisition lacked the learning dynamics we observed in the SCR data, and for extinction, associability and value were highly similar.

Author response image 4

As suggested, we also attempted to adapt the Tzovara et al. model, which uses uncertainty and value to co-determine the model’s predictions, to our reinforcement learning data. As Tzovara et al. conceived the model, it contains two parameters, and can only account for one type of reinforced data (i.e., CS+ or CS-, but not both). To model the response to both the CS+ and CS- simultaneously, we would need to add a third parameter (in addition to another two for the CS-) to the model, for a total of 5 parameters (more than any of our models, which have a maximum of 4 parameters). While possible, the BIC calculation would deduct points for this additional parameter, thus making it not an ideal model for these data. Author response image 5 comparing the Tzovara et al. uncertainty model to the CS+ data shows that the model fit across subjects does not pick up on the initial acquisition or subsequent habituation to the CS+:

Author response image 5

Thus, it is possible that the process captured by Tzovara et al., which is modeled over many trials, is different than the rapid, initial acquisition process targeted in our study. Given this result and the difference in processes of interest, we did not pursue this model further. We provide this rationale in full in Appendix 1 (p. 60).Of note, we also discuss the absence of findings regarding the hybrid model in the context of this specific task (Discussion, p. 25): “Of note, prior work identified the RW-PH “hybrid” model as providing optimal fit to data for the expression of threat contingencies among healthy adults as well as adults with PTSD12,33,34. Here, we found that this model did not provide the best fit for rapid threat conditioning or extinction among the models tested. This discrepancy could potentially be attributed to the nature of threat learning processes studied. While we focused on rapid acquisition/extinction of threat contingencies which takes place over a short training schedule, prior modeling studies examined threat contingencies over much longer durations (over 80 CS+ trials). The hybrid model includes a cumulative associability term which could be more sensitive to tracking contingency values or the expression of conditioned fear as it is optimized over longer durations. Thus, differences in rapid, crude acquisition vs optimized expression of threat contingencies processes could account for model differences. As such, the current and prior work could be seen as complementary in terms of threat learning processes studied, and both sets of findings could usefully inform study design for future work.”

Related Concern from Another Reviewer: The authors do not consider alternate models for the SCRs to the CS– during acquisition or for the SCRs to the CS+ during extinction and they do not model the SCRs to the CS– during extinction at all.

Recommendations: It would be helpful to know which model best fits all the acquisition data (to both the CS+ and CS–). Does the hybrid model still perform less well than the non–hybrid models? And does model 8 still outperform model 7? (in models 8 and 9, as in model 7, the CS– value could also be modelled as updating after each UCS delivery with model 8 now having 6 free parameters and model 9 4 free parameters.) Similarly, it would be useful to have both CS– and CS+ trials during extinction modelled. Given the raw data indicates the response to both the CS+ and CS– drop off during extinction, understanding if decreased responding to the CS+ during extinction represents active specific updating of the CS+ UCS contingency or a non–specific decreases in expectancy of the UCS after both the CS+ and CS– is clearly important.

Joint fitting of CS+ and CS- during acquisition:

As noted, we re-fit our nine models to both the CS+ and CS- during acquisition and performed model comparison as suggested. This is detailed in the previous responses and in the text. The hybrid model (model 9) did not perform particularly well, as can be seen in the bar chart of average BIC values across the models or fraction of best fit across the sample (Figure 2A). Notably, model 9 is also one of the two models (the other being model 6) that did not do well in model recovery. This model is designed to allow flexibility in learning rate, which is important in situations where there are changes in uncertainty over time. Because our task is a straightforward rapid acquisition/extinction task, we believe it might not necessarily provide the right statistics to lead to it fitting best. We note in this in Appendix 1 (pp. 59-60).

We performed a repeated-measures ANOVA on BIC values (model as fixed effect, subject as random effect), yielding a significant effect of model, F(1,8)=38.98, p<0.001, as noted in pp.18-19. Pairwise t-tests were conducted for model 9 versus all other models (Bonferroni corrected). Models 3, 4, 7, and 8 were significantly better than model 9, model 2 performed worse. Thus, we conclude that fitting the data from this task with this model does not produce parameter values that are optimal representations for this type of learning (i.e., rapid acquisition/extinction). This is now reported in Appendix 1 (pp. 59-60). Models 7 and 8, on the other hand, still perform best compared to all the other models. When directly compared to each other, model 8 and model 7 do not statistically differ (p=0.20). As before, since model 7 is a simpler model and they do not statistically differ, we chose model 7 as the optimal model for our data and used these parameters in subsequent analyses (pp.19-20). Thus, we have completely redone the analyses to account for these changes (pp. 19-21).

Extended modeling of extinction:

All 9 models were next fit to the CS+ and CS- extinction data; see Figure 3—figure supplement 1.

As noted in our response to previous recommendations, we provide information on joint CS-/CS+ fitting in p. 10-11. Several models had relatively low BIC values across the population, although overall there were no large differences between models, as can be seen Author response image 6.

Author response image 6
Model fit: BIC values and fraction of best fit.

We now note in p. 20 that we conducted a mixed-effects ANOVA on the BIC values (subject as random effect, model as fixed effect) to determine whether any of the models were significantly different than other models, and found a difference between models, F(1,8)=26.89, p<0.001. As can be seen, this effect is driven primarily by worse fits of models 4 and 9. We then ran follow-up pairwise t-tests for model 3, which was picked most often as the best at the individual subject level, and found that model 3 was not statistically different than models 1,2,5,6,7, or 8. Since model 3 was the simplest among the better models, and picked most often for individual subject fits, we used its parameters to model extinction. This is all now described fully in the text (p. 20).

We have accordingly redone the entire relevant section of the results concerning the relationship between the extinction data, anxiety symptoms, and neural data using model 3 parameters (pp. 20-21).

– Multiple Comparisons [Multiple Reviewers]

Concern – There are many post–hoc tests on association of model parameters with brain results and participant traits. It is unclear how this was planned and performed – how did the authors correct for multiple comparison across parameters, traits, and experiment phases?

Recommendation: provide appropriate, principled correction for multiple comparisons to mitigate familywise α inflation

We apologize for not making this part sufficiently clear, and we appreciate the opportunity to clarify our analysis plan as well as test level (α) adjustments. We now specify in the Methods (p. 16) that for each of the two phases (conditioning, extinction), we first identified the model best accounting for the phase data (which are now modeled using both CS- and CS+). For each estimated parameter in the winning model, we then examined whether it was associated with anxiety severity, and the moderation of this association by age using a regression model. Conditioning effects were best modeled using a model with four parameters (2 for CS+ and 2 for CS- responses, as we show in the Results section), and thus the test level was determined via Bonferroni as α=0.05/(4 parameters)=0.0125. Extinction effects were best modeled using two parameters (one for each CS, see Results), and thus significance level for each tested effect was determined as α=0.05/(2 parameters)=0.025. In terms of imaging analyses, we examined whether brain vertex (cortical thickness)/subcortical structure (GMV) was associated with each learning parameter as moderated by age. All imaging analyses used FWE rate correction for multiple comparisons of α<0.05, whereby the family of tests for each analysis consisted of all vertices/subcortical structures across all tested effects. This is now all detailed in the Methods section, p. 16.

Recommendation: Please state how many tests were conducted and what correction was applied for example how many brain regions were investigated, was grey matter volume in each of these related to each of the five learning rate parameters (across acquisition, generalisation and extinction), and were the moderating effects of anxiety, age and anxiety by age considered in each case?

Recommendation: please be clear in defining what "counts" as a family of analyses

We hope that our previous response clarifies the types of multiple-comparison correction used in analyses and what the tested effects were. Specifically, we note (p. 16) what counts as a family for behavioral analyses (all tested effects) and for imaging analyses (all vertices/subcortical structures across tested effects). We also now specify the number of vertices tested in the whole-brain cortical thickness analysis and moved the listing of subcortical structures tested for GMV from Appendix 1 to the main text (p. 15).

– Specificity of key results

Concern: The relationship between anxiety and safety learning (as indexed by the habituation parameter to the CS– during acquisition) is one of the main findings.

Concern: The novelty of this study depends both on the strength of the computational analysis and on the extent to which the results provide new insights into the relationship between child and adult anxiety and differential fear conditioning relative to the non–computational analyses reported in the authors' prior publication on this dataset (Abdend et al., 2020). The relationship between anxiety and safety learning (as indexed by the habituation parameter to the CS– during acquisition) could be one such novel insight – however here a direct comparison of the relationship between anxiety and this parameter versus the habituation parameter to the CS+ is needed to ensure this isn't simply a non–specific habituation effect. The same issue holds for the finding of a relationship between anxiety and rate of extinction to the CS+ – the authors need to show this holds over and above any relationship between anxiety and drop off in responsivity to the CS–.

Recommendation: Here, a comparison of the relative strength of relationship between anxiety and participants' habituation parameter values for the CS– versus that for the CS+ is needed to ensure this isn't simply a non–specific habituation effect. Similarly, the authors need to directly compare the relationship between anxiety and drop off in SCRs to the CS+ during extinction against the relationship between anxiety and drop off in SCRs to the CS– during extinction to be able to make any claim specific to extinction to the CS+. This also applies to the moderating effects of brain structure on this relationship.

Thank you for this recommendation, which allowed us to examine stronger inferences regarding the specificity of effects. The association between anxiety and CS- habituation rate during conditioning remained significant when controlling for CS+ habituation rate, β=-0.239, p=0.001, indicating the specificity of anxiety effects to safety learning. This is now noted in p. 19-20, and in the Discussion, pp. 24-25. Similarly, the association between anxiety and CS- extinction rate maintained its magnitude when controlling for CS+ extinction rate, p=0.031, and also when controlling for CS- learning and habituation rates during conditioning, p=0.030, providing support for the specificity of the anxiety-extinction association (reported in p. 21). Likewise, the significant brain structure finding for CS- habituation and anxiety moderation by accumbens GMV remained significant when controlling for CS+ habituation rate p=0.040 (p. 20). The effect for extinction was not re-assessed as it was only at trend level.

– SCR Approach

Concern: SCR data are likely not optimal for model fitting at the subject level; SCR is noisy, and the task involves very few trials. This may result in poorly estimated parameters; however it is not clear from the presented analyses how much of a problem this is.

Recommendation – Given the noisiness of SCR data and the limited number of trials available, parameter estimation could likely be improved substantially by using hierarchical Bayesian parameter estimation methods.

We agree that SCR data are indeed inherently noisy. This was a major motivation to accumulate a large sample of participants, especially in the studied populations. Accordingly, data collection took several years, as recruitment of the target populations (medication-free, clinically-diagnosed anxiety patients, including children, who are willing to take part in an aversive experiment) is a difficult, expensive, and time-consuming process. We opted to retain our original equipment and software during this period. Since it has been several years since data were collected and analyzed, and as our group has moved on to newer equipment for data collection and analysis since then, we, unfortunately, no longer have access to the original raw data and cannot reanalyze them. While model fit to the SCR data was good (as was model recovery) and the sample size was large, we are aware that not using newer methods could possibly decrease statistical power, rendering our approach rather conservative. It should also be noted, however, that recent work does not clearly indicate whether newer processing methods provide stronger or more valid estimates in the context of the paradigm used here (Kuhn, M., Gerlicher, A., and Lonsdorf, T. B. (2021, April 30). Navigating the manifold of skin conductance response quantification approaches – a direct comparison of Trough-to-Peak, Baseline-correction and model-based approaches in Ledalab and PsPM. https://doi.org/10.31234/osf.io/9h2kd). Nevertheless, we now note this as a limitation of the study in p. 28, citing papers by Bach et al. which provides background and examples for such methods: “Fourth, SCR data were analyzed using conservative methods (base to peak amplitude); future studies should consider utilizing more novel analysis methods which may increase sensitivity and statistical power96-99”.

Concern: The quantification of the conditioned response is in line with some previous work, but in fact there is a range of possible analysis choices, summarised in Lonsdorf et al. 2017.

Recommendation: Given this heterogeneity, it would make sense to use the most sensitive method. Bach et al. 2020 Behaviour Research and Therapy and Ojala and Bach 2020 Neurosci Biobeh Rev discuss the most sensitive method to quantify threat conditioning from skin conductance data, and Bach et al. 2020 Nat Hum Behav provide background.

We appreciate the suggestion to re-analyze the dataset using alternative methods of analysis as indeed multiple different choices do exist. It is quite possible that changing the way that we analyze data in our group to a different method might increase the effects sizes, although, as noted above, it is not clear that this effect would be achieved (see reference to Kuhn et al.; newer methods might not necessarily show more sensitivity than the method used here). In any event, re-analyzing the data will likely change the nature of findings, from learning models to their associations with symptoms/brain structure, since different methods might not necessarily capture the same physiological processes (e.g., assuming a canonical phasic response vs possible changes in SC that include phasic but also tonic changes – our own recent work, under review elsewhere, supports the latter notion). As such, it is difficult for us to conclusively say that all alternative methods measure the same exact physiological process, and we cannot conclusively say (given recent evidence) which method is the ideal one in this context. In this case, the dataset should be re-analyzed using multiple methods, and we worry that having to analyze the data using multiple methods and indices might make interpretation difficult and the paper more challenging to read. Importantly, as we note in our response to the previous comments, we were also unable to reanalyze the data in this specific case, for practical reasons. Nevertheless, we agree that using the current method, while standard, might not necessarily be the most powerful method and could therefore be considered more conservative. As such, as noted above, we added an important limitation that pertains to the analysis method that also encourages future work to explore newer analytic methods (p. 28; see above). We cite the papers suggested by the Editor, as well as additional work by Bach et al., to provide context.

– Missing SCR Data

[Multiple reviewers] Concern: More than 1/3 of the sample are excluded due to "excessive missing data", defined as >50% missing trials.

Recommendations: It is unclear why these data were missing. Please address the explanation and please discuss the potential consequences for inference. Do excluded subjects systematically differ from those included? It would be helpful if the authors could include a table showing a break–down of the characteristics of included and excluded participants by age, gender, diagnostic status and anxiety levels and if they could address in the discussion any associated limitations in particular in relation to the age by anxiety analyses reported.

Indeed, a substantial amount of SCR was missing. We agree that this is an important point to clarify and discuss. We report the criteria and exclusion process, as recommended by Lonsdorf et al. (2019, eLife), in Appendix 1 (pp. 52-53). Exclusion was due to individuals with no response or with outlier values. A comparably large proportion of non-responders has previously been reported (e.g., Homan et al., 2019; Lonsdorf et al., 2019). To assess whether the SCR processing method led to this dropout level, we analyzed using newer approaches (as suggested above) a different, ongoing dataset of adolescents who completed this task. We observed ~25% missing data using similar criteria, suggesting that it is not the processing method. Instead, based on our extensive experience, the noisiness and difficulty in collecting good signal reflects the challenges of physiological data collection particularly across a wide age range and in those diagnosed with anxiety disorders, in the context of uninstructed aversive tasks. Due to these challenges, we chose conservative criteria, resulting in a large proportion of exclusion, since modeling was based on trial-by-trial data, and we needed sufficient usable trial-level data. Additionally, data collection for this study took place over several years, due to the need to acquire a large sample of patients across age; newer systems used today may be more sensitive, signal-wise.

As recommended by the Editor, we added Appendix 1-table 1 which details differences between included and excluded participants in terms of age, sex, anxiety (both diagnostic status and continuous severity), and IQ. We noted significant differences in age and sex (but not anxiety or IQ), such that excluded (non-responder) individuals tended to be older (M=23.4 vs. 18.8 years) and with a greater proportion of females (69.1% vs. 54.0%). This is now noted in the main text (p. 8) and Appendix 1. Indeed, age and sex differences in responsivity have been previously noted, but inconsistently (e.g., Boucsein, et al., 2012, Psychophysiology; Bari, et al., 2020, J Biol Phys; Ganella, et al., 2017, Front Hum Neuro).

It should be noted that primary analyses included age in the models, and as such, anxiety effects are independent of age effects. We further examined the primary analyses using sex as a covariate (p. 53).

Finally, we added a paragraph discussing these important issues to Appendix 1 (p. 53): “This proportion of exclusion is similar to excluded data in prior work12,67. See Appendix 1-table 1 for demographic and clinical differences as a function of inclusion/exclusion. No significant differences in exclusion were noted for anxiety severity (by diagnosis or continuously) or IQ, but excluded relative to included participants had a higher mean age and a greater proportion of females. Note that age was included in tested models, and thus observed anxiety effects are independent of age effects; further, results did not change when using sex as a covariate. Based on our experience, we believe that this exclusion proportion reflects challenges of physiological data collection during uninstructed aversive tasks, particularly across a wide age range and in those diagnosed with anxiety disorders, in conjunction with conservative exclusion rules since modeling required sufficient trial-by-trial data. These considerations should be taken into account in future research, particularly in studies across a wide age range and in those that consider sex, and require sufficient trial-level data. Implementing newer analytic techniques and equipment, as well as use of more robust aversive stimuli, could potentially mitigate this issue.”

Given other comments, we no longer report on significant Age x Anxiety interactions (see response to later comment). As requested, to address this important issue, we added another limitation referring to age and sex differences (p. 28): “Finally, there were differences in sex and age between individuals included and excluded (non-responders) from analyses, as previously reported102,103; while these differences did not influence our findings, they could still limit generalizability and thus future studies should consider such potential differences.”

We hope that this information could help researchers by alerting them to such potential differences in responsivity.

Concern: It is unclear why missing data are inter/extrapolated?

Recommendation: Use MSE. The model fitting procedure should be happy with missing data points if MSE (rather than SSE) is used as objective function. This would be more appropriate than giving the model a data value that does not exist. Censor, rather than interpolate.

We chose to impute missing time-series data points since model generation relied on trial-by-trial data for a limited number of trials, and learning effects were very rapid. We chose to assume that missing data points lie between the recorded data around them and impute missing data, rather than censor since we were worried that censoring would lead to spurious estimates. We now provide this rationale in Appendix 1, p. 52.

– Disparate Dimensional Measures of Anxiety [Multiple Reviewers]

Concern: The combination of multiple anxiety measures (SCARED, STAI) into a single anxiety measure may not be valid, as it's reasonable to assume these are measuring subtly different constructs (aside from anything else, the SCARED measures recent symptoms while the STAI is a trait measure).

Concern: The main analyses appear to combine measures of anxiety severity across age groups as the primary variable of interest (using the SCARED for youth and the STAI for adults), and in one analysis find an interaction between age and anxiety that is driven by the younger age group. This raises some concerns as these measures are not necessarily measuring the same construct, and may limit the interpretability of these results.

Concern: Children and adults show opposing direction relationships between anxiety and fear generalisation. It is difficult to interpret the anxiety by age results given different measures were used to assess anxiety in children and in adults. This could potentially account for the different relationship between anxiety and fear generalisation in under and over 18s (see Figure 3B). I am not convinced that the authors can simply merge scores from the SCARED (which averages symptom scores based on child and adult reports) and STAI (an adult self–report questionnaire) as if they came from the same measure.

Recommendation: Absent some sort of direct quantitative harmonization (as in the GWAS literature for N/NE) or additional evidence to motivate the Z–score approach, it may not be appropriate to combine or directly compare the younger (SCARED) and older (STAI) samples (i.e. more appropriate to keep them distinct and examine them separately)

Thank you for this interesting comment. We agree that the SCARED and STAI are not the same measure, despite each being considered a ‘gold standard’ for the overall assessment of anxiety severity for its age group, and thus may not be fully comparable. At the same time, given the developmental trajectory of anxiety (Beesdo et al., 2009, Psych Clin N Am) and the great interest in developmental effects on threat learning circuitry in the context of anxiety and its treatment (e.g., Casey, et al., 2015, Neuron; Shechner, et al., 2015, Dep and Anx; King, et al., 2014, Stress), which was the impetus for this study, we believe it is still important to report on age effects across the age continuum. Balancing the need to comprehensively report our results as intended with the potential incongruity of anxiety measures, we repeated the primary analyses separately in youth and adults while also keeping the original analyses.

To alert the reader to this important issue and to the fact that additional analyses were conducted, we provide this acknowledgement relatively early in the Methods section (p. 7): “Given that the SCARED and STAI might not capture the construct of anxiety in an identical manner, we also report on SCARED analyses in youth and STAI analyses in adult participants separately.”

We retained the analyses examining associations between anxiety and learning across the age continuum since these already control for age. Further, we now report in the Results section (p. 21) that additional analyses were performed, as well as the rationale, and provide a brief summary. We provide the results in full in Appendix 1. Specifically, the brief summary in the main text notes: “In addition to the primary analyses examining associations between anxiety and learning parameters and the moderation of these associations by brain structure across the full sample, we also examined these effects separately among youth and adult participants. These are reported in full in Appendix 1 (see below). Briefly, they suggest that observed anxiety effects on safety learning and extinction are attenuated with age. Further, these suggest moderation of anxiety-safety learning rate by several structures (bilateral nucleus accumbens, brain stem, and amygdala) in youth participants but not in adults. In contrast, extinction rates were associated with amygdala, accumbens, and brainstem volume moderation in adults, but not in youths; see Appendix 1.”

The full text in Appendix 1 (pp. 61-62) states: “In addition to the primary analyses reported in the main text, we also examined associations between learning parameters and anxiety separately among youth and adult participants, as described below.

Conditioning. Anxiety was not significantly associated with CS+ threat conditioning or habituation rates in the youth group, βs≤0.11, ps≥0.665, and in the adult group, βs≤0.53, ps≥0.289. Anxiety severity was not significantly associated with CS- generalization rate in the youth group, βs≤0.51, ps≥0.297, and in the adult group, βs≤0.45, ps≥0.297. CS- safety learning rate was negatively correlated with anxiety in the youth group, β=-0.27, p=0.002, as in the full sample, and marginally so in the adult group, β=-0.24, p=0.056, suggesting an attenuation of anxiety-safety learning association with age. No other effects were observed.

Effects of brain structure emerged in the youth group. These included a negative association between brain stem volume and CS- safety learning rate, as well as bilateral accumbens moderation effects similar to those reported in the main text, pFWEs<0.05. Right amygdala showed a similar moderation effect at trend level, pFWE=0.0628. In the youth group, left accumbens showed a moderation effect similar to that reported in the main text and in the youth group, but at trend level, pFWE=0.073.

Extinction. In the youth group and the adult group, anxiety was not significantly associated with CS+ extinction rate, β=0.12, p=0.227, and β=-0.16, p=0.204, respectively. CS- extinction rate was negatively correlated with anxiety in the youth group at comparable magnitude to what was observed across the full sample, β=-0.20, p=0.032, but not in the adult group, β=-0.09, p=0.504, suggesting an attenuation of the association between anxiety and CS- extinction with age. No other effects were observed.

In the youth group, no effects of brain structure emerged. In contrast, in the adult group, a significant positive association between right amygdala volume and CS+ extinction rate was noted, pFWE=0.008. Brain stem volume moderated the association between anxiety and CS+ extinction, pFWE=0.012. Anxiety-CS- extinction rate was moderated by left accumbens and brain stem volume at trend level, pFWEs<0.073.”

Importantly, we also added a full paragraph in the Discussion section in which the differences between analyses in full vs. age-specific samples are discussed (pp. 25-26). Specifically, the text reads: “While the design of this study offers a unique opportunity to examine age effects on threat learning as these relate to anxiety severity, an inherent challenge that arises in research on anxiety along the lifespan is how to combine anxiety data from youth and adult participants. Although the SCARED and STAI are each considered “gold standard” measures in their respective target populations, they nevertheless are not identical. Under the assumption that these capture similar constructs, we uncovered several anxiety effects across the full age range. Alternatively, one may consider these measures to be incomparable, in which case analyses are restricted to specific age groups. This alternative approach indicated an attenuation of anxiety- learning associations with age. As such, one interpretation of the discrepancies is that observed effects may in fact be instrument-specific. Another interpretation is that limited range and sample size reduced statistical power to detect associations in the different sub-groups. In either case, consideration of harmonizing clinical data is needed in future research that examines individuals across a wide age range.”

– Potential nonlinear effects of age

Concern: Only linear effects of age are modelled.

Recommendation: explore potential nonlinear effects (given that many developmental effects are nonlinear).

We previously reported quadratic effects (all with null results) in Appendix 1. These results are reported in p. 62.

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

The manuscript has improved, with one Reviewer emphasizing that "the authors have done a good job of addressing many of the suggestions and the manuscript is definitely substantially improved as a result."

Thank you for this feedback. We agree that subsequent recommendations in the second round of reviews have also significantly improved our manuscript and the clarity of our findings.

Nevertheless, there was agreement that the paper would benefit from some additional modeling.

One Reviewer wrote that, Abend et al. have addressed some, but not all of my previous concerns. The model space of the acquisition session does not include any of the "best" threat learning models from previous publications. This precludes a direct comparison; more importantly it may also miss out on the best model in this dataset.

The authors now model CS– and CS+ responses simultaneously. Yet they still do not compare their models to any of the those used in previous studies of conditioned SCR (Li et al. 2011, Zhang et al. 2016, Tzovara et al. 2018, Homan et al. 2019). They claim that their model 9 is the hybrid model used in three of these studies, but actually it is not, because the update works differently (presently they update only upon US delivery, and both CS+ and CS– from PE in CS+ trials, where as previous RW and hybrid models used the standard update on every trial, on signed PE, and with no cross–update). They qualitatively argue why the associability term in previous work does not fit their data, but in fact they should just quantitatively compare models that map value and/or associability onto SCR. Finally, they argue that Tzovara's model can only update CS+ or CS– expectation with one parameter, and modelling both CS types within the same model would lead to parameter inflaton, but in fact Tzovara's winning learning model is entirely parameter–free (only has parameters for the observation function), so this is inaccurate.

The authors need to implement these pre–existing models and show quantitatively that they do not fit the data, rather than on rely on qualitative and methodological arguments for why this is unnecessary. Based on their rebuttal, it seems that they had already implemented the models, so it should be relatively straightforward to make the necessary direct ("head–to–head") comparisons.

Along related lines, the other Reviewer noted that, [page 11] The authors argue that to adapt the Tzovara et al. model, a 5th parameter would need to be added, and that this would make it too complex. It would still be worth testing this properly [per the other reviewer's comments], and it is possible (although probably unlikely based on the provided evidence) that the model could fit well enough that it still wins despite the additional complexity–related penalisation.

These suggestions are valid. In response, we have fit 5 additional models to our data, including three from Li et al., 2011: Hybrid(V), Hybrid(α) and Hybrid(V+α) and the Tzovara et al., 2018 probabilistic uncertainty model with and without an additional habituation parameter. We fit these 5 models to our acquisition data and 3 of the models (Li et al., 2011) to the extinction data. It is not possible to fit the Tzovara model in a meaningful way to extinction because there are no UCSs. This led to a total of 14 models used to model acquisition data and 12 models used to model extinction data. We have updated the manuscript to reflect this (detailed below).

We found that on average, these models provided reasonable fits to subject data. Please see Figure 2—figure supplement 1 and Figure 3—figure supplement 1 in the Appendix for all model and data overlays that include the new models for acquisition and extinction. Although we were able to fit the models, the Li et al. models had a large number of parameters relative to the number of data points available to fit them. After fitting these models, we also carried out parameter recovery on all of the models to evaluate the quality of the fits. We found that the Li et al., models had poor parameter recovery. Therefore, to improve the model selection process, we imposed a minimal criterion on the correlation coefficients of the parameters to determine which models should be considered in final model comparison. We imposed a criterion that all parameter correlation coefficients for a given model should be ≥0.20, if we subsequently included that model in model comparison.

Many parameters could not be recovered for the new models (see Appendix 1- Table 2). The Li et al., models had considerable additional complexity, and thus it is likely that the regression step that adds 1-3 additional parameters was allowing for too much flexibility to fit noise in the data. Our original model 9, which we called the Pearce-Hall Hybrid model, lacked this regression step, but was otherwise a similar model. This model also did not survive parameter recovery, suggesting that this family of models was not the best set of models for this particular dataset and processes modeled, despite previous reports and use of these models for other datasets. Importantly, as we note in the text, this dataset features a limited number of trials per subject in order to capture rapid learning processes; while we compensate for that with a large number of subjects, perhaps those models are better suited for processes that use more trials for each subject to estimate the larger number of parameters.

The Tzovara models fit the data reasonably well, and because parameters could be recovered (Appendix table 1- table 2), were included in model comparison. However, these models did not produce an average BIC better than the previously chosen model (Figure 2A). Pairwise t-tests showed that models 7 and 8 remained the best models (ps<0.05, Bonferroni-corrected). As these two models did not differ from each other, we chose the simpler model (model 7, Bayesian learning rate and habituation) and the original conclusions from this section did not change.

Of note, we also tried another variant of the Tzovara model with two additional habituation parameters (habituation for CS+ and CS- for 6 total parameters) but the habituation parameters of this model were not recoverable during parameter recovery (p. 14; Appendix 1-table 2). Thus, we retained the model with a single habituation parameter.

Using the same methods, we evaluated the new models for extinction data, which included the 3 Li et al., models (Figure 3—figure supplement 1). After parameter recovery and imposing a criterion for parameter recovery, four models remained for model comparison: models 1, 2, 3 and 5 (Appendix 1- Table 3). These models were not statistically different from each other (ps>0.05, Bonferroni-corrected), and model 3 was the best fitting model for more subjects than the competing models (Figure 3B). Thus, we retained our original model and use of parameters from this model to correlate with anxiety severity and brain volume.

Please see updated figures and tables that account for the additional models:

– Figure 2A, 2B (main text): updated bar plots for models compared for acquisition

– Figure 2—figure supplement 1: model/data overlays for all 14 models fit to acquisition

– Figure 2—figure supplement 3: fractions of subjects by age/anxiety for acquisition

– Figure 3A, 3B (main text): updated bar plots for models compared for extinction Figure 3—figure supplement 1: model/data overlays for all 12 models fit to extinction

– Figure 3—figure supplement 1: model/data overlays for all 14 models fit to extinction

– Appendix 1-table 2: parameter recovery for acquisition

– Appendix 1-table 3: parameter recovery for extinction

The Reviewers and Editors also discussed hierarchical Bayesian approaches, with one reviewer noting that, [page 19] Reviewer suggestions related to hierarchical Bayesian parameter estimation do not seem to have been addressed in the revision. For clarity, this refers to the method of estimating reinforcement learning model parameters (see van Geen and Gerraty, 2021), rather than the SCR quantification, and there's a strong possibility it would improve parameter estimation and hence sensitivity. I appreciate that this could be represent a substantial re–analysis.

During the consultation session, the 2 Reviewers agreed that the hierarchical Bayesian approach would be nice, but is not strictly necessary for the revision, with one noting that, "I wouldn't want to force them to redo all the model fitting with a hierarchical Bayesian approach, it would likely provide more accurate estimates, but it would be a fairly big undertaking. Some acknowledgement that other model–fitting procedures exist and could be more appropriate would be nice, however (this probably also applies to their SCR quantification methods)".

And the other writing that, "I agree. This model fitting approach is a major step and may not work for some of the models."

In sum, the Bayesian approach should –– at minimum –– be noted in the Discussion.

Thank you for this suggestion and we apologize for not acknowledging other model fitting techniques more clearly. We have updated the Discussion section to include a section acknowledging Bayesian parameter estimation and potential future work in this direction. We acknowledge the suggested papers in addition to others describing hierarchical Bayesian methods. Please see Discussion, pages 29,30:

“As a final technical comment, reinforcement learning model parameters were estimated using maximum likelihood techniques on individual subjects followed by model comparison. Future work could expand on this by using hierarchical Bayesian parameter estimation to reduce the variance around parameter estimates105-107. However, choosing prior distributions within the hierarchical Bayesian approach is not trivial and may not work for all of the models tested in this study. As such, future work could focus on fewer models, such as those that survived parameter recovery in this study, test a range of priors, and determine whether parameter estimation could be improved.”

We have also expanded our description of other SCR quantification methods. (Discussion, page 29):

“Fourth, SCR data were analyzed using a single method which, while established, relies only on directly-observable effects. Future studies may consider using novel, computational analysis methods which could potentially reveal effects not observed using the current method96-99. Along these lines, a multiverse approach may be used in future work to comprehensively compare multiple methods of quantifying threat learning”.

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

We are very pleased to accept your paper for publication pending receipt of a revision that adequately addresses the remaining suggestions from Reviewer #2. Congratulations in advance!

Reviewer #2 (Recommendations for the authors):

This revision of the paper by Abend et al. is significantly stronger by including a direct comparison of the current model space with previous "best" models. There are two substantive and a few presentation points remaining.

1. The Tzovara et al. model is implemented differently from the original paper. The authors use separate regression parameters for CS+ and CS– trials for the Tzovara et al. model (i.e. 4 parameters) and it is not clear to me why – Tzovara et al. themselves had used only 2 regression parameters across these two trials types. With 2 instead of 4 parameters, the model might fare better (even though the fit/RSS will of course become worse). Could the authors include the original 2–parameter Tzovara et al. model?

Thank you for this detail and for the suggestion. We have repeated the fitting of the Tzovara model with only 2 regression parameters for CS+ and CS- trials. Previously, we had tried to give the model the best chance of success by using separate regressors for CS+ and CS-, as the CS+ data has different dynamics than the CS- data. The model and data overlays for the Tzovara model with 2 parameters are shown in the updated Figure 2—figure supplement 1. For completeness, we also repeated the modeling using additional habituation terms for a Tzovara model with 4 total parameters. The model and data overlays are also shown in updated Figure 2—figure supplement 1. We replaced the previous models 13 and 14 with the requested versions of the Tzovara models in all aspects of the manuscript. Parameter recovery values have been updated in Appendix1-Table 2. These are the new models 13 and 14 to replace those reported previously. Model 13 survived the parameter recovery criteria, whereas model 14 did not. Thus, model 13 was included in the revised versions of Figure 2A and Figure 2—figure supplement 3 and BIC analyses.

We have also updated the uploaded code (available on a GitHub page) to reflect the most recent versions of the Tzovara models.

2. Perhaps relatedly, I've been wondering about the initial values of the various terms in the update equations. Can they be presented in the tables? I'm wondering, for example, whether they used the same values for α_0 and β_0 in the β distribution of the Tzovara et al. model, but other readers might wonder about other models. Of course, it can make a big difference how a model is "initialised".

Thank you for this suggestion. We have added the initialization values, for both conditioning and extinction models, to the table of equations (Table 1) for completeness and clarity.

3. The authors make an effort to explain the difference between models 1–9, and models 10–14, by using different symbols on the LHS of the equations. Perhaps it could be made more prominent that models 1–9 predict CS–related SCR from US–related SCR and thus need no regression to scale the predictions to the data, whereas models 10–14 predict CS–related SCR from an arbitrary reinforcement variable.

Thank you for this suggestion to improve the clarity of our modeling procedures. We have added the following text to the manuscript in the Methods on page 14:

“In summary, for models 1-9, predictions were not scaled to the data by using regression. Models 1-9 predict CS-related SCR from US-related SCR. For models 10-14, CS-related SCR is predicted using a regression to associability, value, or uncertainty, as described for each model. For models 1-9, outcome was modeled as a continuous variable. For models 10-14, we modeled outcome as a binary value of 0 or 1, as described in past publications using these models.”

4. I'm still unsure about using t–tests to compare BICs across a group. I have never seen this. What is the statistical foundation for this approach? I would suggest adding the RFX–model comparison by Stephan et al. as had been suggested by two reviewers in the context of the initial manuscript version. It is implemented in a matlab function (spm_BMS), requires only the matrix of BIC values, and takes a few seconds to run.

We apologize that our model comparison procedures were unclear. The referenced paper by Stephan et al., (2009), “Bayesian model selection for group studies,” states a description of the random effects procedure as was used in our manuscript:

“A straightforward random effects procedure to evaluate the between-subject consistency of evidence for one model relative to others is to use the log-evidences across subjects as the basis for a classical log-likelihood ratio statistic, testing the null hypothesis that no single model is better (in terms of their log-evidences) than any other. This essentially involves performing an ANOVA, using the log evidence as a summary statistic of model adequacy for each subject.

This ANOVA then compares the differences among models to the differences among subjects with a classical F-statistic. If this statistic is significant one can then compare the best model with the second best using a post hoc t-test.”

“The most general implementation would be a repeated-measures ANOVA, where the log-evidences for the different models represent the repeated measure. At its simplest, the comparison of just two models over subjects reduces to a simple paired t-test on the log-evidences.”

This was the methodology used in our manuscript as described in the Methods (pp. 15-16)­.

In addition, per the Reviewer’s suggestion, we have run the spm_BMS function from the SPM imaging toolbox on our BIC values. Using a flat prior across the models, we reproduced the posterior probabilities of the six models compared for the acquisition data:

Model 1: 0.0351

Model 4: 0.1356

Model 5: 0.1950

Model 7: 0.1423

Model 8: 0.2369

Model 13: 0.2551

These values are nearly identical to the fraction of subjects best fit by each model, shown in Figure 2A. Given that we do not expect that prior probabilities of each model should be anything but uniform, we did not expect significant differences between the RFX model comparison method and the random effects procedure.

https://doi.org/10.7554/eLife.66169.sa2

Article and author information

Author details

  1. Rany Abend

    Emotion and Development Branch, National Institute of Mental Health, National Institutes of Health, Bethesda, United States
    Contribution
    Conceptualization, Formal analysis, Investigation, Methodology, Project administration, Software, Supervision, Validation, Visualization, Writing - original draft, Writing – review and editing
    Contributed equally with
    Diana Burk
    For correspondence
    rany.abend@nih.gov
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0003-0022-3418
  2. Diana Burk

    Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Bethesda, United States
    Contribution
    Formal analysis, Methodology, Software, Visualization, Writing – review and editing
    Contributed equally with
    Rany Abend
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0001-7775-2989
  3. Sonia G Ruiz

    Emotion and Development Branch, National Institute of Mental Health, National Institutes of Health, Bethesda, United States
    Contribution
    Data curation, Formal analysis, Methodology, Project administration, Visualization, Writing - original draft, Writing – review and editing
    Competing interests
    No competing interests declared
  4. Andrea L Gold

    1. Emotion and Development Branch, National Institute of Mental Health, National Institutes of Health, Bethesda, United States
    2. Department of Psychiatry and Human Behavior, Brown University Warren Alpert Medical School, Providence, United States
    Contribution
    Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Supervision, Writing - original draft, Writing – review and editing
    Competing interests
    No competing interests declared
  5. Julia L Napoli

    Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Bethesda, United States
    Contribution
    Formal analysis, Investigation, Methodology, Software, Visualization, Writing – review and editing
    Competing interests
    No competing interests declared
  6. Jennifer C Britton

    1. Emotion and Development Branch, National Institute of Mental Health, National Institutes of Health, Bethesda, United States
    2. Department of Psychology, University of Miami, Coral Gables, United States
    Contribution
    Conceptualization, Data curation, Investigation, Methodology, Project administration, Supervision, Writing - original draft, Writing – review and editing
    Competing interests
    No competing interests declared
  7. Kalina J Michalska

    1. Emotion and Development Branch, National Institute of Mental Health, National Institutes of Health, Bethesda, United States
    2. Department of Psychology, University of California, Riverside, Riverside, United States
    Contribution
    Data curation, Investigation, Methodology, Project administration, Supervision, Writing - original draft, Writing – review and editing
    Competing interests
    No competing interests declared
  8. Tomer Shechner

    1. Emotion and Development Branch, National Institute of Mental Health, National Institutes of Health, Bethesda, United States
    2. Psychology Department, University of Haifa, Haifa, Israel
    Contribution
    Data curation, Investigation, Project administration, Supervision, Writing – review and editing
    Competing interests
    No competing interests declared
  9. Anderson M Winkler

    Emotion and Development Branch, National Institute of Mental Health, National Institutes of Health, Bethesda, United States
    Contribution
    Formal analysis, Investigation, Methodology, Software, Visualization, Writing – review and editing
    Competing interests
    No competing interests declared
  10. Ellen Leibenluft

    Emotion and Development Branch, National Institute of Mental Health, National Institutes of Health, Bethesda, United States
    Contribution
    Methodology, Supervision, Writing – review and editing
    Competing interests
    No competing interests declared
  11. Daniel S Pine

    Emotion and Development Branch, National Institute of Mental Health, National Institutes of Health, Bethesda, United States
    Contribution
    Conceptualization, Funding acquisition, Investigation, Project administration, Resources, Supervision, Writing – review and editing
    Competing interests
    No competing interests declared
  12. Bruno B Averbeck

    Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Bethesda, United States
    Contribution
    Conceptualization, Formal analysis, Investigation, Methodology, Project administration, Software, Supervision, Writing - original draft, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-3976-8565

Funding

National Institutes of Health (ZIAMH002781-15)

  • Daniel S Pine

National Institutes of Health (R00MH091183)

  • Jennifer C Britton

Brain and Behavior Research Foundation (28239)

  • Rany Abend

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank the participants and families, as well as the staff of the Intramural Research Program of the National Institute of Mental Health (IRP, NIMH), National Institutes of Health. We also thank Emily Ronkin, Elizabeth Steuber, Madeline Farber, Jessica Sachs, Brigid Behrens, Carolyn Spiro, and Omri Lily for their contribution to data collection.

This research was supported (in part) by the NIMH IRP (ZIAMH002781-15; DSP), NIH grant K99/R00MH091183 (JCB), and a NARSAD Young Investigator Grant from the Brain & Behavior Research Foundation (RA).

Ethics

Written informed consent was obtained from adult (age greater than or equal to 18 years) participants as well as parents, and written assent was obtained from youth (age under 18). Procedures were approved by the NIMH Institutional Review Board (protocol 01-M-0192).

Senior Editor

  1. Michael J Frank, Brown University, United States

Reviewing Editor

  1. Alexander Shackman, University of Maryland, United States

Reviewer

  1. Dominick Bach

Publication history

  1. Received: December 31, 2020
  2. Accepted: April 25, 2022
  3. Accepted Manuscript published: April 27, 2022 (version 1)
  4. Version of Record published: June 14, 2022 (version 2)

Copyright

This is an open-access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.

Metrics

  • 946
    Page views
  • 212
    Downloads
  • 0
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Open citations (links to open the citations from this article in various online reference manager services)

Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)

  1. Rany Abend
  2. Diana Burk
  3. Sonia G Ruiz
  4. Andrea L Gold
  5. Julia L Napoli
  6. Jennifer C Britton
  7. Kalina J Michalska
  8. Tomer Shechner
  9. Anderson M Winkler
  10. Ellen Leibenluft
  11. Daniel S Pine
  12. Bruno B Averbeck
(2022)
Computational modeling of threat learning reveals links with anxiety and neuroanatomy in humans
eLife 11:e66169.
https://doi.org/10.7554/eLife.66169
  1. Further reading

Further reading

    1. Cancer Biology
    2. Computational and Systems Biology
    Pan Cheng, Xin Zhao ... Teresa Davoli
    Research Article

    How cells control gene expression is a fundamental question. The relative contribution of protein-level and RNA-level regulation to this process remains unclear. Here, we perform a proteogenomic analysis of tumors and untransformed cells containing somatic copy number alterations (SCNAs). By revealing how cells regulate RNA and protein abundances of genes with SCNAs, we provide insights into the rules of gene regulation. Protein complex genes have a strong protein-level regulation while non-complex genes have a strong RNA-level regulation. Notable exceptions are plasma membrane protein complex genes, which show a weak protein-level regulation and a stronger RNA-level regulation. Strikingly, we find a strong negative association between the degree of RNA-level and protein-level regulation across genes and cellular pathways. Moreover, genes participating in the same pathway show a similar degree of RNA- and protein-level regulation. Pathways including translation, splicing, RNA processing, and mitochondrial function show a stronger protein-level regulation while cell adhesion and migration pathways show a stronger RNA-level regulation. These results suggest that the evolution of gene regulation is shaped by functional constraints and that many cellular pathways tend to evolve one predominant mechanism of gene regulation at the protein level or at the RNA level.

    1. Computational and Systems Biology
    2. Neuroscience
    Janus RL Kobbersmed, Manon MM Berns ... Alexander M Walter
    Research Article Updated

    Synaptic communication relies on the fusion of synaptic vesicles with the plasma membrane, which leads to neurotransmitter release. This exocytosis is triggered by brief and local elevations of intracellular Ca2+ with remarkably high sensitivity. How this is molecularly achieved is unknown. While synaptotagmins confer the Ca2+ sensitivity of neurotransmitter exocytosis, biochemical measurements reported Ca2+ affinities too low to account for synaptic function. However, synaptotagmin’s Ca2+ affinity increases upon binding the plasma membrane phospholipid PI(4,5)P2 and, vice versa, Ca2+ binding increases synaptotagmin’s PI(4,5)P2 affinity, indicating a stabilization of the Ca2+/PI(4,5)P2 dual-bound state. Here, we devise a molecular exocytosis model based on this positive allosteric stabilization and the assumptions that (1.) synaptotagmin Ca2+/PI(4,5)P2 dual binding lowers the energy barrier for vesicle fusion and that (2.) the effect of multiple synaptotagmins on the energy barrier is additive. The model, which relies on biochemically measured Ca2+/PI(4,5)P2 affinities and protein copy numbers, reproduced the steep Ca2+ dependency of neurotransmitter release. Our results indicate that each synaptotagmin engaging in Ca2+/PI(4,5)P2 dual-binding lowers the energy barrier for vesicle fusion by ~5 kBT and that allosteric stabilization of this state enables the synchronized engagement of several (typically three) synaptotagmins for fast exocytosis. Furthermore, we show that mutations altering synaptotagmin’s allosteric properties may show dominant-negative effects, even though synaptotagmins act independently on the energy barrier, and that dynamic changes of local PI(4,5)P2 (e.g. upon vesicle movement) dramatically impact synaptic responses. We conclude that allosterically stabilized Ca2+/PI(4,5)P2 dual binding enables synaptotagmins to exert their coordinated function in neurotransmission.