Flexible control of representational dynamics in a disinhibition-based model of decision-making
Abstract
Inhibition is crucial for brain function, regulating network activity by balancing excitation and implementing gain control. Recent evidence suggests that beyond simply inhibiting excitatory activity, inhibitory neurons can also shape circuit function through disinhibition. While disinhibitory circuit motifs have been implicated in cognitive processes, including learning, attentional selection, and input gating, the role of disinhibition is largely unexplored in the study of decision-making. Here, we show that disinhibition provides a simple circuit motif for fast, dynamic control of network state and function. This dynamic control allows a disinhibition-based decision model to reproduce both value normalization and winner-take-all dynamics, the two central features of neurobiological decision-making captured in separate existing models with distinct circuit motifs. In addition, the disinhibition model exhibits flexible attractor dynamics consistent with different forms of persistent activity seen in working memory. Fitting the model to empirical data shows it captures well both the neurophysiological dynamics of value coding and psychometric choice behavior. Furthermore, the biological basis of disinhibition provides a simple mechanism for flexible top-down control of the network states, enabling the circuit to capture diverse task-dependent neural dynamics. These results suggest a biologically plausible unifying mechanism for decision-making and emphasize the importance of local disinhibition in neural processing.
Editor's evaluation
This novel theoretical work outlines a unifying architecture for decision-making via disinhibition. The model clearly links observations across multiple empirical studies and highlights how characteristics from previous decision models can be effectively integrated into a single mechanism. This will be of interest to a wide variety of neuroscientists who work across levels of analysis.
https://doi.org/10.7554/eLife.82426.sa0
Introduction
Inhibition is an essential component in neural network models of decision-making. In standard decision models, pools of option-selective excitatory neurons compete in a winner-take-all (WTA) selection process via feedback inhibition (Roach et al., 2023; Wang, 2002; Wong and Wang, 2006). Generally, such inhibition is thought to be homogeneous and non-selective, with a single pool of inhibitory neurons receiving broad excitation, and in turn inhibiting excitatory neurons. However, more recent empirical findings suggest that inhibitory neurons interact with the decision circuit in a more structured manner. Inhibitory neurons active in decision-making exhibit choice-selective activity on par with excitatory neurons in the frontal cortex (Allen et al., 2017), parietal cortex (Allen et al., 2017; Najafi et al., 2020), and striatum (Gage et al., 2010) in contrast to the non-selective or broadly tuned inhibition seen in visual cortex during stimulus representation (Bock et al., 2011; Chen et al., 2013; Hofer et al., 2011; Kerlin et al., 2010; Liu et al., 2009; Niell and Stryker, 2008; Sohya et al., 2007). At an anatomic level, inhibitory interneurons also exhibit a remarkable diversity in morphology, connectivity, and physiological functions (Kepecs and Fishell, 2014; Markram et al., 2004; Tremblay et al., 2016). A prominent circuit motif observed in these anatomical studies is local disinhibition in which vasoactive intestinal peptide (VIP)-expressing interneurons inhibit the neighboring interneurons expressing somatostatin (SST) or parvalbumin (PV) that inhibit dendritic or perisomatic areas in pyramidal neurons, thus locally disinhibiting the activities of the pyramidal neurons in the neighboring area (Chiu et al., 2013; Fino and Yuste, 2011; Fu et al., 2014; Karnani et al., 2014; Karnani et al., 2016; Lee et al., 2013; Letzkus et al., 2011; Pfeffer et al., 2013; Pi et al., 2013; Urban-Ciecko and Barth, 2016). 
Here, we explore the computational implications of that motif in decision-making.
While disinhibitory circuit motifs have been implicated in cognitive processes including learning, attentional selection, and input gating (Fu et al., 2014; Letzkus et al., 2011; Wang and Yang, 2018), how disinhibition functions in decision-making circuits is unknown. Local circuit inputs to the VIP neurons suggest that disinhibition may be a key mechanism for generating the mutual competition necessary for option selection in decision-making. In addition, given the existence of long-range inputs (Kepecs and Fishell, 2014; Lee et al., 2013; Pfeffer et al., 2013; Pi et al., 2013; Schuman et al., 2021) and neuromodulatory inputs (Alitto and Dan, 2012; Fu et al., 2014; Pfeffer et al., 2013; Prönneke et al., 2020; Rudy et al., 2011; Tremblay et al., 2016) to the VIP neurons, local disinhibition has been proposed to play a particular role in dynamic gating of circuit activity; such gating may be essential in decision circuits underlying flexible behavior, mediating top-down control of network function (Fu et al., 2014; Kamigaki, 2019; Lee et al., 2013; Letzkus et al., 2011; Pi et al., 2013; Schuman et al., 2021; Zhang et al., 2014). Here, we hypothesize that disinhibition controls a transition between information processing states, allowing a single decision-making circuit to both represent the values of alternatives and select a single best option amongst those alternatives.
Value representation is prominent in the early stage of a decision. Integrated decision variables combine outcome information such as expected gain and probability of realization. Neural firing rates in numerous decision-related brain areas vary with the integrated option values, including the frontal (Kiani et al., 2014; Kim and Shadlen, 1999; Padoa-Schioppa, 2013; Padoa-Schioppa and Conen, 2017; Pastor-Bernier and Cisek, 2011; Roesch and Olson, 2003; Thura and Cisek, 2014; Thura and Cisek, 2016; Yamada et al., 2018) and parietal (Andersen and Buneo, 2002; Churchland et al., 2008; Dorris and Glimcher, 2004; Hanks et al., 2014; Kiani et al., 2008; Kiani et al., 2014; Louie and Glimcher, 2010; Platt and Glimcher, 1999; Roitman and Shadlen, 2002; Rorie et al., 2010; Shadlen and Newsome, 2001; Sugrue et al., 2004) cortices and basal ganglia (Ding and Gold, 2010; Ding and Gold, 2012; Ding and Gold, 2013; Thura and Cisek, 2017). Recent research shows more specifically that neural value coding is contextual in nature, with the value of a given option represented relative to the value of available alternatives (Churchland et al., 2008; Kira et al., 2015; Louie et al., 2011; Louie et al., 2013; Louie et al., 2014; Pastor-Bernier and Cisek, 2011; Rorie et al., 2010; Strait et al., 2014; Yamada et al., 2018). Furthermore, this relative value coding employs a divisive normalization-like representation (Hunt et al., 2012; Louie et al., 2011; Louie et al., 2015; Yamada et al., 2018), a canonical computation prevalent in sensory processing and thought to implement efficient coding principles (Carandini et al., 1999; Carandini and Heeger, 1994; Carandini and Heeger, 2012; Heeger, 1992; Heeger, 1993; Schwartz and Simoncelli, 2001; Silver, 2010) and temporal adaptation (Chau et al., 2020; Heeger, 1992; Louie et al., 2013; Louie et al., 2015; Steverson et al., 2019; Webb et al., 2014).
Option selection and categorical choice occur when the decision process progresses beyond simple representation. A common and powerful neural mechanism for this categorical choice is WTA competition (Wickens et al., 2007; Wilson, 2007). WTA dynamics are widely observed in multiple brain regions: the neural firing rate representing the chosen option or action target increases in concert with selection (often reaching an activity threshold at choice), while firing rates representing the other unchosen option are suppressed (Churchland et al., 2008; Gold and Shadlen, 2007; Hanes and Schall, 1996; Hanks et al., 2014; Lo et al., 2015; Lo and Wang, 2006; Roitman and Shadlen, 2002; Rorie et al., 2010; Shadlen and Newsome, 2001; Wang, 2002; Wong and Wang, 2006). The wide prevalence of WTA dynamics in decision-related neural activities suggests that it is a general feature of biological choice.
Existing models have identified core circuit motifs that produce either normalized value representation or WTA selection (Figure 1). For normalized value representation, dynamic circuit-based models emphasize a crucial role for both lateral and feedback inhibition (Lofaro et al., 2014; Louie et al., 2014). In the dynamic normalization model (DNM), paired excitatory and inhibitory neurons represent each choice option (Figure 1A); feedforward excitation delivers value inputs, lateral connectivity mediates contextual interactions, and feedback inhibition drives divisive scaling. This simple differential equation model emphasizes the crucial role of lateral connectivity and feedback inhibition in driving empirically observed divisive scaling and contextual interactions (Figure 1B).
For WTA selection, the predominant class of decision models (recurrent network models, hereafter RNM) proposes a central role for recurrent connectivity (Houck and Person, 2014; Ito, 2002; Ito, 2006; Ito, 2008; Llinás, 1975; Sathyanesan et al., 2019; Sillitoe and Joyner, 2007) and non-selective feedback inhibition (Wickens et al., 2007; Wilson, 2007; Figure 1C). RNMs capture psychophysical and neurophysiological results in perceptual (Furman and Wang, 2008; Wang, 2002; Wong et al., 2007; Wong and Wang, 2006) and economic (Hunt et al., 2012; Jocham et al., 2012; Rustichini and Padoa-Schioppa, 2015; Soltani and Wang, 2006) choices, recapitulating much of the complex nonlinear dynamics of empirical neurons (Figure 1D). The competitive nature of the RNM generates attractor states that maintain continued activity even in the absence of stimuli, consistent with persistent spiking activity associated with working memory during delay intervals (Brunel and Wang, 2001; Compte et al., 2000; Constantinidis et al., 2018; Furman and Wang, 2008; Hart and Huk, 2020; Lo and Wang, 2006; Macoveanu et al., 2006; Murray et al., 2017; Tegnér et al., 2002; Wang et al., 2013; Wang, 1999; Wang, 2002; Wong and Wang, 2006).
While sequential valuation and selection processes may occur independently, electrophysiological evidence shows sequentially coexisting value coding and WTA signals in many prominent decision-related circuits. When decisions are framed as action selection, such integrated representation of values exists primarily in frontoparietal areas tightly linked to motor action commitment. In the control of eye movements, valuation and selection dynamics coexist in multiple brain regions including the lateral intraparietal (LIP) cortex (Louie and Glimcher, 2010; Roitman and Shadlen, 2002; Rorie et al., 2010; Shadlen and Newsome, 2001; Sugrue et al., 2004), the frontal eye fields (Ding and Gold, 2012; Kim and Shadlen, 1999; Roesch and Olson, 2003), and the superior colliculus (Basso and Wurtz, 1997; Basso and Wurtz, 1998; Horwitz et al., 2004; Horwitz and Newsome, 1999; Zhang et al., 2021). In these areas, neural activity initially represents the relevant decision variables but shifts to encode the selected saccade after a WTA-like interval. Similar activity emerges in parallel circuits controlling arm movements, including the parietal reach region (Kubanek et al., 2015; Rajalingham et al., 2014; Snyder et al., 1997), dorsal premotor cortex (Cisek and Kalaska, 2005; Pastor-Bernier and Cisek, 2011; Thura and Cisek, 2016), and primary motor cortex (Thura and Cisek, 2014). Notably, when examined, contextual value coding during a decision typically arises after the initial absolute value coding (Louie et al., 2014; Pastor-Bernier and Cisek, 2011; Rorie et al., 2010), consistent with a local normalization process; these dynamics suggest that normalized value coding is not simply inherited from upstream regions and support coexisting within-region normalization and selection computations.
Despite electrophysiological evidence for sequentially coexisting relative value coding and WTA signals in prominent decision-related circuits, no current model integrates both properties within a single circuit. The DNM cannot capture late-stage choice dynamics because it lacks a mechanism for WTA competition. Similarly, RNMs typically neither exhibit contextual value coding nor predict contextual choice patterns (Wang, 2012) due to the lack of structured lateral inhibition. Here, we propose that disinhibition is a biologically plausible solution to unify these key features of decision-making into a single circuit. We develop and characterize a biological circuit consisting of three neuronal types which critically include a form of local disinhibition. This model hybridizes the architectural features of divisive gain control and recurrent self-excitation used in existing models but utilizes disinhibition rather than the commonly assumed pooled inhibition to implement competition. We find that the disinhibition-based model unifies multiple characteristics of decision activity including normalized value coding, WTA choice, and working memory. A top-down gating signal operating via this disinhibition enables the model to switch between the states of value representation and WTA selection and to reproduce decision activity in a range of experimental paradigms with diverse task timing and activity dynamics. These findings suggest that local disinhibition provides a robust, biologically plausible integration of normalization and WTA selection in a single-circuit architecture.
Results
Local disinhibition decision model
To develop an integrated circuit model of decision-making, we systematically tested a series of models incorporating disinhibitory motifs and the core elements of existing models, namely divisive gain control, recurrent excitation, and mutual competition (Figure 2—figure supplement 1; see Methods, 'Motifs tested and compared for normalized coding and WTA choice', for the analysis details). This analysis identified local disinhibition as the crucial component that can integrate mutual competition and value normalization within the existing circuit architecture of the DNM. In the rest of this paper, outside of the methods and supplementary figures, we focus on this local disinhibition decision model (hereafter LDDM) that emerged from our detailed examination of potential models.
In the LDDM (Figure 2A), as in the DNM, option-specific excitatory R units receive value inputs and interact via widespread lateral inhibition. However, the LDDM also includes an option-specific disinhibitory D unit that receives input from its associated excitatory R unit and locally inhibits the inhibitory G unit in the local circuit. In this way, disinhibition biased by different value inputs can serve to selectively release local circuit gain control, generating an unbalanced gain control between local and opponent circuits and leading to a WTA competition. In this model, the network thus shifts from value coding to WTA competition regimes in response to the onset of disinhibition (controlled by the coupling strength between R and D). With zero or weak R-D coupling, the circuit preserves normalized value coding consistent with the DNM; with strong R-D coupling, the circuit switches to a state of WTA selection (Figure 2B). Inhibitory units, as a result, dynamically switch from a non-selective response pattern to a selective response pattern (G and D units in Figure 2B) driven by local disinhibition. This flexible onset of disinhibition is modeled after biological findings, which show that activation of disinhibition in cortical circuits arises from exogenous, long-distance projections (Fu et al., 2014; Kamigaki, 2019; Lee et al., 2013; Pi et al., 2013; Zhang et al., 2014; Figure 2C). This form of top-down control allows for flexibility in the relative timing of the valuation and selection processes, consistent with neural and behavioral data in different task paradigms (see Gated disinhibition provides top-down control of choice dynamics).
Activity dynamics of the LDDM are described by a set of differential equations:
$$\tau_R \frac{dR_i}{dt} = -R_i + \frac{V_i + B_R + \alpha R_i}{1 + G_i} \tag{1}$$
$$\tau_G \frac{dG_i}{dt} = -G_i + \sum_{j=1}^{N} \omega_{ij} R_j + B_G - D_i \tag{2}$$
$$\tau_D \frac{dD_i}{dt} = -D_i + \beta R_i \tag{3}$$
where $i = 1, \ldots, N$ designates choice alternatives, each of which is represented by an R unit receiving selective input $V_i$ and non-selective baseline input $B_R$. $\tau_R$, $\tau_G$, and $\tau_D$ are the time constants for the R, G, and D units. The weights $\omega_{ij}$ represent the coupling strength between excitatory units $R_j$ and inhibitory (gain control) units $G_i$, with each G unit driven by a weighted sum of excitatory inputs from all R units and a non-selective baseline input $B_G$ and inhibited by its local $D_i$; the parameter α reflects the strength of recurrent self-excitation on R units. Finally, β weights the coupling strength between the excitatory and the disinhibitory units and is presumed to be under external (task-triggered) control.
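As a rough illustration of how such a rate model can be integrated numerically, the following minimal sketch applies forward-Euler steps to one reading of the dynamics: each R unit relaxes toward (V_i + B_R + αR_i)/(1 + G_i), each G unit toward the pooled R activity plus B_G minus its local D_i, and each D unit toward βR_i. The function name, the uniform coupling weight `w`, the time constants, the step size, and the rectification of activities at zero are all assumptions made for illustration; they are not values or choices reported here.

```python
def simulate_lddm(values, beta, alpha=0.0, b_r=0.0, b_g=0.0, w=1.0,
                  tau_r=0.1, tau_g=0.1, tau_d=0.1, dt=0.001, t_end=5.0):
    """Noise-free forward-Euler integration of the assumed LDDM rate equations.

    values: selective inputs V_i, one per option.
    beta:   R-to-D coupling (0 = disinhibition off).
    All default parameter values are hypothetical.
    Returns the final (R, G, D) activity lists.
    """
    n = len(values)
    r, g, d = [0.0] * n, [0.0] * n, [0.0] * n
    for _ in range(int(round(t_end / dt))):
        pooled = w * sum(r)  # sum_j w_ij * R_j with uniform weights (assumption)
        dr = [(-r[i] + (values[i] + b_r + alpha * r[i]) / (1.0 + g[i])) / tau_r
              for i in range(n)]
        dg = [(-g[i] + pooled + b_g - d[i]) / tau_g for i in range(n)]
        dd = [(-d[i] + beta * r[i]) / tau_d for i in range(n)]
        # Simultaneous update; activities are clipped at zero (assumption).
        r = [max(r[i] + dt * dr[i], 0.0) for i in range(n)]
        g = [max(g[i] + dt * dg[i], 0.0) for i in range(n)]
        d = [max(d[i] + dt * dd[i], 0.0) for i in range(n)]
    return r, g, d


if __name__ == "__main__":
    r, g, d = simulate_lddm([55.0, 45.0], beta=0.0)
    print("R:", r, "ratio R1/R2:", r[0] / r[1])
```

With `beta=0.0`, zero baselines, and uniform weights, the steady-state ratio R1/R2 in this sketch equals the input ratio V1/V2, which is the relative value coding the text attributes to the regime without disinhibition.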
Dynamic divisive normalization preserved in the LDDM
We first examine whether the LDDM retains the dynamics of divisively normalized value coding seen empirically and in the DNM (Lofaro et al., 2014; Louie et al., 2014). As discussed above, during the initial option evaluation, the disinhibitory units are silent (β = 0); therefore, the sole difference between the LDDM and the DNM is recurrent excitation (controlled by α). Example activity traces in Figure 3B show that the LDDM preserves characteristic early-stage dynamics and contextual modulation seen in both empirical data (Figure 3C) and the original DNM (Lofaro et al., 2014; Louie et al., 2011; Louie et al., 2014). Immediately after stimulus onset, R1 activities replicate the transient peak observed in a wealth of studies (Andersen and Buneo, 2002; Churchland et al., 2008; Gnadt and Andersen, 1988; Louie et al., 2011; Louie et al., 2014; Platt and Glimcher, 1999; Rorie et al., 2010; Sugrue et al., 2004). Furthermore, the network settles to equilibrium displaying relative value coding: R1 activity increases with V1 and decreases with V2, reflecting a contextual representation of value (Figure 3B, R1 activity across V1 inputs [upper panel] and V2 inputs [bottom panel]).
Taking advantage of its simplified mathematical form, we analytically evaluated the LDDM by conducting phase plane analyses. We found that it represents each set of input values (V1, V2) as one unique and stable equilibrium point in its output space (R1, R2) when β = 0. Specifically, we solved for the equilibrium state of each R unit by setting each differential equation (Equations 1–3) to zero, which defines the nullcline of each R unit as a function of the activity of the complementary R unit, visualized in Figure 3D. The nullclines of R1 (solid) and R2 (dashed) intersect at a unique equilibrium point, regardless of whether input values are equal or unequal (see different panels for examples of different inputs). This point indicates that the dynamical system, when receiving any positive inputs, can maintain a unique equilibrium where every unit maintains a steady level of activity. Linearization analysis around this point suggests that this point is attractive: given any initial values to the system, the activities of the units will converge to the unique equilibrium point for the network (see Methods, 'Equilibria and stability analysis of the LDDM', for mathematical proof). The steady state of neural activity at equilibrium (noted as $R_i^*$) reflects divisive normalization (Equation 4), as in the original DNM (Lofaro et al., 2014; Louie et al., 2014). The only difference between the LDDM and the DNM at equilibrium is the introduction of a constant in the denominator ($B_G - \alpha$) representing baseline gain control and recurrent excitation; this change rescales the activity magnitudes but preserves normalized value coding.
$$R_i^* = \frac{V_i + B_R}{1 + B_G - \alpha + \sum_{j=1}^{N} \omega_{ij} R_j^*} \tag{4}$$
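The steps from the dynamics to the normalized equilibrium can be written out as follows. This is a sketch that assumes the rate equations take the form in which each R unit relaxes toward $(V_i + B_R + \alpha R_i)/(1 + G_i)$, each G unit toward $\sum_j \omega_{ij} R_j + B_G - D_i$, and each D unit toward $\beta R_i$; the starred quantities denote equilibrium values, and the disinhibition is taken to be off ($\beta = 0$).

```latex
\begin{align*}
0 &= -D_i^* + \beta R_i^*
  &&\Rightarrow\quad D_i^* = 0 \quad (\beta = 0),\\
0 &= -G_i^* + \sum_{j} \omega_{ij} R_j^* + B_G - D_i^*
  &&\Rightarrow\quad G_i^* = \sum_{j} \omega_{ij} R_j^* + B_G,\\
0 &= -R_i^* + \frac{V_i + B_R + \alpha R_i^*}{1 + G_i^*}
  &&\Rightarrow\quad R_i^* \left(1 + G_i^* - \alpha\right) = V_i + B_R,\\
R_i^* &= \frac{V_i + B_R}{1 + B_G - \alpha + \sum_{j} \omega_{ij} R_j^*}.
\end{align*}
```

Under these assumptions, setting $\alpha = 0$ and $B_G = 0$ reduces the denominator constant to 1, so the extra term $B_G - \alpha$ only rescales activity while the divisive dependence on the pooled activity of all options is unchanged.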
We next verified that the normalized value coding produced by the LDDM cannot be implemented by standard RNMs. Figure 4A compares the activity of R1 as a function of both value inputs (V1 and V2) in the LDDM (left panel), the original DNM (middle panel), and the RNM (right panel). Both the LDDM and the DNM exhibit activities (indicated by color) that monotonically increase with input V1 but decrease with V2, with a slightly steeper V2 dependence in the LDDM than in the DNM, depending on the rescaling described above. In contrast, strong WTA dynamics in the RNM implement categorical (choice) coding rather than relative value representation, with high or low coding of input values (right panel).
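The monotonic dependence on V1 and V2 can be checked directly in the simplest case. The sketch below assumes two options, uniform coupling weights of 1, and an equilibrium of the divisive form R_i = (V_i + B_R) / (c + R_1 + R_2) with c = 1 + B_G - α; summing the two equations gives a quadratic in the pooled activity, which is solved in closed form. The function name and the example input values are hypothetical.

```python
import math


def normalized_equilibrium(v1, v2, alpha=0.0, b_r=0.0, b_g=0.0):
    """Closed-form equilibrium for two options with all weights equal to 1.

    Assumes R_i * (c + R_1 + R_2) = V_i + B_R with c = 1 + B_G - alpha,
    i.e. the regime without disinhibition (beta = 0).
    """
    c = 1.0 + b_g - alpha
    total_drive = v1 + v2 + 2.0 * b_r
    # Pooled activity s = R_1 + R_2 solves s * (c + s) = total_drive.
    s = (-c + math.sqrt(c * c + 4.0 * total_drive)) / 2.0
    return (v1 + b_r) / (c + s), (v2 + b_r) / (c + s)


if __name__ == "__main__":
    for v2 in (20.0, 40.0, 60.0):
        row = [round(normalized_equilibrium(v1, v2)[0], 3)
               for v1 in (20.0, 40.0, 60.0)]
        print("V2 =", v2, "-> R1 for V1 in (20, 40, 60):", row)
```

Each printed row increases from left to right (R1 grows with V1), and the rows shrink as V2 grows, which is the contextual pattern described for the LDDM and DNM panels.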
To quantitatively test value normalization, we fit the models to observed firing rates of monkey LIP neurons under varying reward conditions (Louie et al., 2011). In the empirical data (Figure 4B, dots), LIP activity increases with the reward (water quantity) associated with the target inside the neuronal response field ($V_{in}$) and decreases with the summed rewards of targets outside the response field ($V_{out}$). The fitting results show that the DNM captures the rescaled firing rates very well with only two free parameters (baseline input $B_R$ = 70.92 and an arbitrary scaling parameter; see Methods; middle panel in Figure 4B, $R^2$ = 0.9640). The LDDM, with an additional parameter introduced by self-excitation and baseline gain control, fitted slightly better than the DNM ($B_R$ = 71.53; see Methods; left panel in Figure 4B, $R^2$ = 0.9646; parameter recovery analysis shows that the LDDM is highly robust in the data fitting, Figure 4—figure supplement 1). Note that fitting to the current dataset cannot differentiate the contributions of α and $B_G$ to the neural dynamics (see proof in Methods); thus more empirical data will be needed to draw conclusions about the role of recurrent self-excitation in value coding. However, we do show below that self-excitation is critical for generating persistent activities (see section: Disinhibition controls point versus line attractor dynamics in persistent activity).
We found that fitting the standard RNM with its standard four parameters (see Methods) cannot capture the pattern of neural activity as well as the LDDM and DNM (right panel in Figure 4B; $R^2$ = 0.8920). This small but clear difference in performance between model classes arises from the difference between divisive (DNM and LDDM) and subtractive (RNM) types of inhibition, with subtractive inhibition failing to capture the concave contextual effects predicted by divisive models. Furthermore, fitting the RNM to the data results in a parameter regime that can no longer generate WTA competition; instead, the model predicts mean firing rates in a low-activity regime with a maximum value of 3.5 Hz (Figure 4—figure supplement 2). These results suggest that RNMs cannot simultaneously support both normalized value coding and WTA selection regimes.
Local disinhibition drives WTA competition
A key question is whether the LDDM can also produce WTA competition. Given the architecture of the LDDM, local disinhibition is hypothesized to break the symmetry between option-specific R-G sub-circuits, enabling a competitive interaction between sub-circuits. To examine whether this competition produces WTA selection, we simulated model activity in a reaction-time version of a motion discrimination task, a standard perceptual decision-making paradigm in non-human primates (Churchland et al., 2008; Roitman and Shadlen, 2002). The task contains two stages of processing: the pre-motion stage with only the choice targets presented and the motion stage presenting a random-dot motion stimulus simultaneously with a go signal. Animals are allowed to select an option, indicating their perception of the main direction of the motion, at any time following motion stimulus/go signal onset (see timeline, Figure 5A). During the pre-motion stage, we simulated equal value inputs, given the equal prior probability of either target being correct in the standard task. The simulated dynamics replicate the characteristic transient peak observed in both perceptual and economic decision-making tasks (Andersen and Buneo, 2002; Churchland et al., 2008; Louie et al., 2011; Rorie et al., 2010). At motion stimulus onset, inputs to the two R units are changed according to the task design; disinhibition (i.e. the β value) is switched on at the go signal, simultaneously with motion inputs.
We find that the LDDM replicates neural and behavioral aspects of WTA competition. In Figure 5A, we show example model activities for five input strengths corresponding to different motion coherence levels. Consistent with electrophysiological recordings in the posterior parietal cortex (Churchland et al., 2008; Roitman and Shadlen, 2002; Shadlen and Newsome, 2001), model R unit activities bifurcate based on the input strengths, with the unit receiving stronger input ramping-up to an (arbitrary) decision threshold while the activity of the opponent unit is suppressed. The speed of bifurcation depends on the contrast between the inputs, a variable equivalent to motion coherence in the experimental literature (Roitman and Shadlen, 2002; Shadlen and Newsome, 2001). Furthermore, the LDDM predicts the dynamics of the two types of interneurons G and D governing excitatory neuron computation (Figure 5B). Prior to the go signal, the two G units share the same activity. However, after the go signal, the activity levels bifurcate because of disinhibition. In contrast to R units, the G unit in the sub-circuit receiving stronger input shows lower activity, indicating a stronger disinhibition of the associated R unit. Thus, the LDDM exhibits mutual competition that generates WTA selection in excitatory neurons, as in the existing RNM; this competition is mediated by a novel disinhibitory control input achieved through the use of biologically identified different interneuron subtypes.
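The two-stage protocol described above can be sketched as a noise-free simulation in which β is zero during the pre-motion stage and switched to a positive value when the inputs become unequal. The rate equations are assumed to have the form R_i → (V_i + αR_i)/(1 + G_i), G_i → R_1 + R_2 − D_i, D_i → βR_i, with zero baselines and unit weights. The single shared time constant, the input values, β = 1.4, the stage durations, and the clipping of activities at zero are hypothetical choices for illustration, not fitted or reported values.

```python
def euler_step(state, values, beta, alpha=0.0, tau=0.1, dt=0.001):
    """One forward-Euler step of the assumed two-option rate equations."""
    r, g, d = state
    pooled = sum(r)  # unit coupling weights (assumption)
    new_r, new_g, new_d = [], [], []
    for i in range(len(r)):
        dr = (-r[i] + (values[i] + alpha * r[i]) / (1.0 + g[i])) / tau
        dg = (-g[i] + pooled - d[i]) / tau
        dd = (-d[i] + beta * r[i]) / tau
        # Activities are clipped at zero (assumption).
        new_r.append(max(r[i] + dt * dr, 0.0))
        new_g.append(max(g[i] + dt * dg, 0.0))
        new_d.append(max(d[i] + dt * dd, 0.0))
    return new_r, new_g, new_d


def run_trial(pre_values=(50.0, 50.0), motion_values=(55.0, 45.0),
              beta_on=1.4, pre_steps=1000, motion_steps=5000):
    """Pre-motion stage with beta = 0, then motion stage with beta = beta_on."""
    state = ([0.0, 0.0], [0.0, 0.0], [0.0, 0.0])
    for _ in range(pre_steps):
        state = euler_step(state, pre_values, beta=0.0)
    pre_r = list(state[0])
    for _ in range(motion_steps):
        state = euler_step(state, motion_values, beta=beta_on)
    return pre_r, state


if __name__ == "__main__":
    pre_r, (r, g, d) = run_trial()
    print("R before go signal:", pre_r)
    print("R at end of trial: ", r)
    print("G at end of trial: ", g)
```

In this sketch the two R units sit at the same level before the go signal and separate afterward, with the unit receiving the stronger input ending far above its competitor and its local G unit ending lower than the opponent's G unit. Adding a noise term and a decision threshold would be needed to produce choices and reaction times.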
What features of the LDDM are essential to generate WTA competition? We examined the dynamical properties of the system under disinhibition by conducting phase plane analyses. As shown in Figure 5C, the network in the choice regime (β > 0) shows a different configuration of nullcline intersections than the network in the value representation regime (β = 0; Figure 3D). Given equal (value) inputs, the nullclines of R1 and R2 intersect at three equilibrium points (left panel in Figure 5C), with the central point unstable and the two peripheral points stable. Thus, given an initial configuration of R1–R2 activities (with the presence of noise), the system will converge to the closer peripheral attractor (see example activity traces in blue and red thin lines) and implement WTA competition. Given moderately unequal inputs, the basin of attraction is biased toward the side with higher input, resulting in a higher probability of falling into the side with higher input (middle panel in Figure 5C). When inputs are extremely unequal, the unstable equilibrium in the middle of the basin and the stable equilibrium point associated with weaker input no longer exist, leaving only the attractor associated with stronger input (Figure 5C, right). Thus, across varying degrees of input coherence, disinhibition drives the LDDM toward selecting one of the potential choices. This can be seen in Figure 5D by viewing the output ratio of the preferred attractor as a function of the input ratio: under active disinhibition (β > 0) we observe categorical coding (green line), whereas under inactive disinhibition (β = 0) the output ratio faithfully preserves the original ratio of inputs (dark line).
To understand the operating regimes of the LDDM, we quantified model behavior across the full parameter space defined by recurrent excitation weight (α) and local disinhibition weight (β), both of which are critical in determining the properties of the system (see Methods, 'Equilibria and stability analysis of the LDDM', for mathematical proof). Decisions with equivalent inputs are a critical test of WTA behavior since WTA systems should select an option (stochastically) even in these symmetric scenarios (Furman and Wang, 2008; Lo and Wang, 2006; Wang, 2002; Wong and Wang, 2006); we therefore analyzed system behavior under equal value inputs. As shown in Figure 5E, this analysis revealed two distinct territories corresponding to value representation and WTA-operating regimes. The value representation regime generates a unique attractor for normalized value representation but no WTA attractors; in contrast, the WTA regime (induced by a change in α or β) generates no normalization attractor, but instead, R1 and R2 always diverge into high-contrast attractors (see Figure 5—figure supplement 1 and Methods, 'Equilibria and stability analysis of the LDDM', for a full description of regime parcellation). The β value in the WTA regime is always larger than zero, even though it asymptotically approaches zero when recurrent excitation is extremely strong, suggesting that disinhibition is always required to generate WTA choices. Models with a wide range of recurrent excitation can shift from value representation to WTA choice with an increase in local disinhibition strength (e.g. red arrow in Figure 5E) or, under more limited conditions, with an increase in recurrent activation. These findings emphasize the impact of changes in local disinhibition on WTA choice and highlight a particular role for a dynamic gating signal in controlling the transition from value coding to option selection.
The LDDM captures empirical choice behavior and neural activity
While the preceding analyses show that the LDDM can generate value normalization and WTA selection, a critical question is whether this circuit architecture accurately captures empirically observed behavioral and neural aspects of decision-making. Here, we take advantage of the limited number of parameters in this differential equation-based LDDM (compared to more complicated conductance-based biophysical models; Tegnér et al., 2002; Wang, 1999; Wang, 2002; Wong and Wang, 2006), which allows model fitting to empirical data. Specifically, we fit LDDM parameters to nonhuman primate behavior from the reaction-time version of the motion discrimination task described above. The choice and reaction time (RT) data from monkeys align with a reduced form model of decision-making (the drift-diffusion model; Ratcliff and McKoon, 2008), and the activity of posterior parietal neurons recorded during this task display characteristic decision-related features (motion-dependent ramping, a common decision threshold, and WTA activity).
To fit the LDDM to behaviorally observed RTs, we employed the standard quantile maximum likelihood estimation (QMLE) method to the RT distributions across input coherence levels (0–51.2%), with correct and error trials dissociated (Hawkins et al., 2015; Heathcote et al., 2002; Ratcliff and Tuerlinckx, 2002). We set as 1 and the baseline input BR as zero. Baseline gain control () and self-excitation () are collinear as mentioned above (see model fitting in Figure 4), and this is also true in fitting WTA choice behavior (see Figure 6—figure supplement 3). To address this, we kept as a free parameter but set to zero (note that this limits the interpretability of fit values as simply the level of recurrence, a point we address below and in the supplementary materials). The model is then reduced to seven parameters: recurrent excitation weight , local disinhibition weight , noise parameter , input value scaling parameter S, and time constants , , and (see Methods for model-fitting details). Predictions of the best fitting model are shown in Figure 6A (best fitting parameters: , , , S = 3251, , , and ). The optimization surfaces visualized across pairs of parameters (Figure 6—figure supplement 1) were consistent with the robust parameter fitting. A parameter recovery analysis indicated that the parameters are recoverable and identifiable within the network (Figure 6—figure supplement 2). While there is a small amount of collinearity between α and β in the fit to behavioral choice data, further simulation uncovered that these two parameters have notably different effects on the shapes of LDDM-predicted RT distributions: increasing β decreases the skewness of the RT distribution; whereas, increasing α increases the skewness. 
These effects on choice dynamics likely play a role in the ability of the LDDM to outperform alternative models in fitting behavioral data (see below) and reinforce the separability and influence of disinhibition and recurrence on behavior (Palminteri et al., 2017).
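The QMLE objective described above can be sketched as follows. This is a minimal illustration rather than the authors' fitting code: the function name `qmle_nll`, the choice of quantile probabilities, the use of simulated model trials in place of an analytic RT distribution, and the gamma-distributed toy data are all assumptions made for the example. Empirical RT quantiles define bins separately for correct and error trials, and the model is scored by the joint probability mass it places in each bin.

```python
import numpy as np

def qmle_nll(obs_rt, obs_correct, sim_rt, sim_correct,
             probs=(0.1, 0.3, 0.5, 0.7, 0.9), eps=1e-10):
    """Negative quantile log-likelihood for one coherence level (illustrative)."""
    n_bins = len(probs) + 1
    nll = 0.0
    for resp in (True, False):           # correct and error trials kept separate
        obs = obs_rt[obs_correct == resp]
        if obs.size == 0:
            continue
        sim = sim_rt[sim_correct == resp]
        edges = np.quantile(obs, probs)   # empirical RT quantiles define the bins
        n_obs = np.bincount(np.searchsorted(edges, obs), minlength=n_bins)
        n_sim = np.bincount(np.searchsorted(edges, sim), minlength=n_bins)
        # Joint probability of (response type, RT bin) under the model,
        # estimated from simulated trials; eps guards against log(0).
        p_model = n_sim / sim_rt.size
        nll -= np.sum(n_obs * np.log(np.maximum(p_model, eps)))
    return nll

# Toy data (hypothetical): shifted-gamma RTs with 80% correct responses.
rng = np.random.default_rng(0)

def fake_trials(n, shift=0.0):
    rt = 0.3 + shift + rng.gamma(2.0, 0.2, size=n)
    correct = rng.random(n) < 0.8
    return rt, correct

obs = fake_trials(2000)
nll_match = qmle_nll(*obs, *fake_trials(20000))             # same generating process
nll_shift = qmle_nll(*obs, *fake_trials(20000, shift=0.3))  # RTs 300 ms too slow
```

In a full fit, this quantity would be summed over coherence levels and minimized over the model parameters; a simulator that matches the data-generating process should yield a lower value than a mismatched one.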
Model-predicted RT distributions (lines) closely follow the empirical distributions (bars) for both correct (blue) and error (red) trials across different levels of input coherence. The aggregated mean choice accuracy and RT data are shown in Figure 6C. Model choice accuracy (line) captures the average empirical psychometric function (crosses); model RT captures coherence-dependent changes in the chronometric function, including longer RTs in error trials (dashed line and empty dots) compared to correct trials (solid line and dots). Beyond mean RT data, the LDDM accurately captured aspects of the empirical RT distributions, as evident in the quantile probability plot of RT quantiles as functions of chosen ratio (Figure 6B). Given the mathematical collinearity between baseline gain control and self-excitation, it is important to note that the fitted value of the self-excitation parameter should not be interpreted as reflecting the exact level of recurrence in the circuit. Future empirical data will be needed to differentiate how recurrence and baseline inhibition contribute to the LDDM WTA selection.
We compared the performance of the LDDM in fitting this classical dataset with the reduced form of the RNM (Wong and Wang, 2006; Figure 6—figure supplement 4), as well as another prominent computational decision model with a similar architecture of mutual inhibition – the leaky competing accumulator (LCA) model (Usher and McClelland, 2001; see Figure 6—figure supplement 5). The performances of the three models were close in predicting averaged RTs and choice accuracy (panel C). However, the LDDM captures the skewness and the shape of RT distributions better than the other two, as reflected in lower goodness-of-fit (negative log-likelihood, nLL) and Akaike information criterion (AIC) values (nLL: LDDM = 16,546, RNM = 16,573, LCA = 16,948; AIC: LDDM = 33,109, RNM = 33,165, LCA = 33,932).
Notably, the LDDM – fit only to behavior – generates predictions about the underlying neural dynamics that can be compared to electrophysiological findings. We examined R unit activity in the best-fitting model, with predicted activity aggregated across trials and aligned to the onset of stimuli and the time of decision as in the original study (Roitman and Shadlen, 2002). Aligned to the onset of stimuli (Figure 6D, left), neural responses are aggregated by coherence level and eventual choice and truncated at median RT. These data show clear evidence of WTA competition: chosen (solid) and unchosen (dashed) activity traces diverge over time. Moreover, neural activity is stimulus dependent: the dynamics of both chosen and unchosen units ramp at different, coherence-dependent speeds, consistent with empirical findings that point to an accumulation process. More quantitatively, we examined the relationship between activity and coherence at the specific time point reported in the original work (arrow points a and b, Figure 6E). Model predictions align well with empirical observations: across the three alternative models, the deviation between empirical recordings and model-predicted activity, quantified by root-mean-square error (RMSE), is the smallest for the LDDM (RMSE: LDDM = 2.74 [Figure 6E]; RNM = 20.10 [Figure 6—figure supplement 4E]; LCA = 3.92 [Figure 6—figure supplement 5E]).
Aligned to the onset of decision (Figure 6D, right), model R unit activity near the time of choice shows further evidence of the WTA competition observed in real neurons: the initial divergence between chosen and unchosen activity traces extends into a categorical coding of choice. The relationship between activity and coherence quantitatively replicates the empirical pattern immediately preceding the decision time (Roitman and Shadlen, 2002): chosen activity (indicated by arrow c in Figure 6D and plotted in Figure 6E) no longer shows much difference across coherence conditions, while unchosen activity (indicated by d in Figure 6D and plotted in Figure 6E) retains a decrease. Quantification shows that the LDDM again best predicted empirical neural activity with data aligned to choice onset (RMSE: LDDM = 6.77 [Figure 6E]; RNM = 9.35 [Figure 6—figure supplement 4E]; LCA = 7.51 [Figure 6—figure supplement 5E]). Thus, R unit activity – in a model with parameters fit only to behavior – replicates the recorded activity of parietal neurons during both initial decision processing and eventual choice selection.
Unlike the RNM and LCA models, the LDDM predicts different dynamics in different subtypes of interneurons (Figure 6F–I). The inhibitory (G) units selectively code input values and choice but exhibit complex dynamics due to the interplay of feedforward excitation, lateral inputs, and disinhibition. Early on (dynamics sorted to the left in Figure 6F and upper panel in Figure 6G), the G activities initially increase due to excitatory drive from R units. Later on, when the inhibition from D units increases (Figure 6H), the G activities start to decrease. Near the time of choice (dynamics sorted to the right in Figure 6F and the lower panel in Figure 6G), the chosen G units show lower activities than the unchosen side because of stronger inhibition from D as an outcome of WTA competition. The dynamics of D units rapidly increase in the early stage, driven by excitatory R unit activity (dynamics sorted to the left in Figure 6H). Dynamics in the late stage (dynamics sorted to the right in Figure 6H) show higher activity on the chosen side than the unchosen side as an outcome of WTA competition. Both types of interneurons show different time-dependent patterns of coherence-dependence that likely reflect the complex dynamics of the system and RT-based data aggregation methods (Figure 6G and H). While the activities of different interneuron subtypes have not been widely recorded in decision tasks, these new LDDM predictions provide a testbed for future empirical and theoretical investigations.
The LDDM integrates normalized value coding and WTA choices
While the LDDM separately replicates normalized value coding and WTA dynamics shown in different empirical studies, a key distinguishing feature of the LDDM is that it can capture both phenomena within a single experimental context. Numerous studies using the random-dot motion paradigm show two stages of dynamics: target (action) representation during the pre-motion stage and WTA selection after the go cue following motion stimuli (Churchland et al., 2008; Rorie et al., 2010). Neural activity in the pre-motion stage shows a characteristic phasic-sustained dynamic response to the presentation of visual cues; rather than purely sensory information, activity during this stage reflects the magnitude and probability of reward associated with the visual cues (Rorie et al., 2010). After the go cue, WTA dynamics reflect an integration of motion information and implement a transition from initial value coding to a categorical coding of choice in the late stage of the decision (Churchland et al., 2008; Ding and Gold, 2010; Kiani et al., 2008; Roitman and Shadlen, 2002; Rorie et al., 2010; Shadlen and Newsome, 2001). Studies of economic choice show a similar set of dynamics, a context-dependent valuation, followed by a shift to WTA after a go cue (Louie et al., 2011; Louie et al., 2014; Louie and Glimcher, 2010; Pastor-Bernier and Cisek, 2011; Sugrue et al., 2004).
Neural dynamics are also observed to be influenced by the number of options, a feature captured by the LDDM. Specifically, the number of options offered to non-human primates has been empirically observed to affect the neural dynamics during both representation and choice (Basso and Wurtz, 1997; Basso and Wurtz, 1998; Churchland et al., 2008). When the choice set is expanded from two options to four options, early representational activity is lower during pre-motion dynamics (Figure 7A) and the speed of WTA dynamics slows after motion onset (Figure 7C).
Here, we show that the LDDM replicates the impact of the number of options on the neural dynamics observed in real neurons during both the representation and WTA phases. Under four (versus two) options, LDDM R unit activity during the representation stage decreases because of increased recurrent inhibition, driven by multiple contextual inputs (left side in Figure 7D). Similarly, the ramping speed after motion onset and disinhibition decreases in the four-option (versus the two-option) condition, despite identical parameters (Figure 7E). These results highlight the LDDM as a potential mechanism for integrating normalized value coding and WTA competition within a single-circuit architecture.
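The set-size effect on representational activity can be illustrated with a minimal rate-model sketch. The equations below are an assumed simplified form of the R-G-D circuit (uniform gain control weights of 1, zero baselines, no noise, disinhibition silent), and the input value of 100 per option, the time constants, and the function name `simulate_representation` are hypothetical choices for illustration, not the fitted model.

```python
import numpy as np

def simulate_representation(V, alpha=0.0, beta=0.0, tau=0.1, T=5.0, dt=1e-3):
    """Euler integration of an assumed R-G-D rate model; returns final R."""
    V = np.asarray(V, dtype=float)
    R = np.zeros_like(V)   # excitatory (value-coding) units
    G = np.zeros_like(V)   # gain-control interneurons
    D = np.zeros_like(V)   # disinhibitory interneurons
    for _ in range(int(T / dt)):
        dR = (-R + (V + alpha * R) / (1.0 + G)) / tau   # divisive gain control
        dG = (-G + R.sum() - D) / tau                   # pooled drive minus disinhibition
        dD = (-D + beta * R) / tau                      # local drive from own R unit
        R = np.maximum(R + dt * dR, 0.0)
        G = np.maximum(G + dt * dG, 0.0)
        D = np.maximum(D + dt * dD, 0.0)
    return R

r_two = simulate_representation([100.0, 100.0])                  # two options
r_four = simulate_representation([100.0, 100.0, 100.0, 100.0])   # four options
print(r_two[0], r_four[0])
```

Under these assumptions, each unit settles to a lower rate with four options than with two, because every additional option adds to the pooled gain-control signal that divides each unit's input.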
Disinhibition controls point versus line attractor dynamics in persistent activity
We next examine the implications of the local disinhibition architecture for another characteristic of decision-related neural firing: persistent activity. In brain areas such as the parietal (Kiani et al., 2008; Kiani et al., 2014; Kiani and Shadlen, 2009; Roitman and Shadlen, 2002; Shadlen and Newsome, 2001), prefrontal (Funahashi et al., 1989; Fuster and Alexander, 1971; Goldman-Rakic, 1995; Rigotti et al., 2013), and premotor cortices (Pastor-Bernier and Cisek, 2011), neurons show elevated firing in the absence of stimulus-driven input over intervals of seconds; such persistent activity is thought to underlie working memory and enable decisions based on internally maintained information. In the RNM, recurrent excitation and feedback inhibition preserve categorical choice information after input withdrawal because of the point-attractor dynamics (Furman and Wang, 2008; Wang, 2002; Wong and Wang, 2006). Here, we answer two questions: does the LDDM generate persistent activity, and how does this persistent activity differ from that in the RNM?
We found that the LDDM can generate two distinct forms of persistent activity, controlled by the state of disinhibition. Figure 8A shows example dynamics of two R units before and after the withdrawal of inputs while disinhibition is silent. Following input withdrawal, network activity decreases but still preserves elevated firing rates, governed by the self-excitation parameter (the network loses elevated activity when ). The persistent activity ratio between R1 and R2 preserves the ratio between the input values V1 and V2 during the memory interval in contrast to RNMs which immediately lose all value information and only preserve categorical information about the largest value (see Figure 8—figure supplement 1 and Methods Analysis for persistent activity for mathematical proof). Phase plane analyses suggest that relative value coding in persistent activity arises from a line-attractor dynamic in the network during the inactivation of disinhibition, unlike the point-attractor dynamics in the RNM, which shed value information immediately (Figure 8B). Like other line-attractor models of persistent activity that store continuous-valued information (Burak and Fiete, 2009; Compte et al., 2000; Ganguli et al., 2008; Seung, 1996), an unbiased coding of the input ratio requires perfectly balanced gain control weights from G to R. Unbalanced weights will result in distorted coding of the input ratio, and graded coding of the inputs will decay over time (Figure 8—figure supplement 1D and E). For perfectly balanced weights, the line attractor state is vulnerable to noise perturbation. A small perturbation can easily drive the activity to drift on the line of attractors, with the summed value of R1 and R2 as a constant (). The preserved ratio between R1 and R2 drifts stochastically over time, similar to the prediction of other line-attractor circuits and consistent with behavioral and neural variability related to working memory (Seung, 1996; Wimmer et al., 2014).
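As a rough illustration of why silent disinhibition can yield a line attractor, the short derivation below uses an assumed simplified form of the circuit equations (uniform gain control weights of 1, zero baselines, no noise). It is a sketch under those stated assumptions; the formal proof is the one referenced in the Methods.

```latex
% Assumed rate equations (\omega = 1, baselines = 0, no noise):
\tau_R \dot{R}_i = -R_i + \frac{V_i + \alpha R_i}{1 + G_i}, \qquad
\tau_G \dot{G}_i = -G_i + \sum_j R_j - D_i, \qquad
\tau_D \dot{D}_i = -D_i + \beta R_i .

% Delay period with silent disinhibition:
% V_i = 0,\ \beta = 0 \;\Rightarrow\; D_i \to 0,\quad G_i \to G = R_1 + R_2 .
0 = R_i \left( \frac{\alpha}{1 + G} - 1 \right)
\;\Longrightarrow\; R_1 + R_2 = \alpha - 1 \qquad (\alpha > 1).

% Both units share the same multiplicative rate whenever G_1 = G_2,
% so their ratio is conserved along the line of fixed points:
\frac{d}{dt} \ln \frac{R_1}{R_2}
  = \frac{1}{\tau_R} \left[ \frac{\alpha}{1 + G_1} - \frac{\alpha}{1 + G_2} \right] = 0 .
```

In this simplified setting, any split of the total activity between the two units is a fixed point, which is why noise can move the stored ratio along the line without a restoring force.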
However, a line attractor is not the only state that the LDDM predicts. If disinhibition is activated during the delay interval, the network switches to a point attractor dynamic similar to the one exhibited by the RNM (see Figure 8—figure supplement 2 and Methods Analysis for persistent activity for mathematical proof). Figure 8D shows the example dynamics of two R units before and after the withdrawal of inputs. Disinhibition drives a competition between the two R units, resulting in a switch from the graded coding of the input ratio to a categorical coding of the largest value ( in visualization). Interestingly, a transition of coded information from input values to categorical information has been widely observed in firing rates in decision-related regions, such as LIP and the superior colliculus, during the delay period of decision-making (Rorie et al., 2010; Shadlen and Newsome, 2001; Zhang et al., 2021). The point attractor predicted by the circuit under disinhibition (Figure 8E) is highly tolerant to perturbations compared to the line attractor. Choice performance over long delays may require a switch from the value coding to the categorical regimes to achieve this robustness. As a plausible biological mechanism for mediating top-down control, disinhibition may gate such a transition without imposing any distinct change on the network architecture.
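The contrast between the two delay-period regimes can be sketched with a small noise-free simulation. The rate equations are an assumed simplified form of the circuit (uniform gain control weights of 1, zero baselines), and the parameter values (`alpha = 15`, `beta = 0.4`, inputs of 120 and 40, time constants of 0.1 s) and the helper name `run` are hypothetical, chosen only so that both regimes are visible.

```python
import numpy as np

def run(R, G, D, V, alpha, beta, T, dt=1e-3, tau=0.1):
    """Euler-integrate an assumed R-G-D rate model for T seconds."""
    for _ in range(int(T / dt)):
        dR = (-R + (V + alpha * R) / (1.0 + G)) / tau   # self-excitation, divisive gain control
        dG = (-G + R.sum() - D) / tau                   # pooled excitation minus disinhibition
        dD = (-D + beta * R) / tau                      # disinhibition driven by own R unit
        R = np.maximum(R + dt * dR, 0.0)
        G = np.maximum(G + dt * dG, 0.0)
        D = np.maximum(D + dt * dD, 0.0)
    return R, G, D

alpha = 15.0
V = np.array([120.0, 40.0])   # input ratio 3:1
zero = np.zeros(2)

# Stimulus period with disinhibition silent.
state = run(zero, zero, zero, V, alpha, beta=0.0, T=5.0)

# Delay period (inputs withdrawn), disinhibition silent: graded ratio is kept.
R_line, _, _ = run(*state, V=zero, alpha=alpha, beta=0.0, T=5.0)

# Delay period, disinhibition active: activity collapses onto the larger option.
R_point, _, _ = run(*state, V=zero, alpha=alpha, beta=0.4, T=10.0)

print("silent disinhibition:", R_line, "ratio", R_line[0] / R_line[1])
print("active disinhibition:", R_point)
```

With disinhibition silent, the delay activity in this sketch keeps the 3:1 input ratio while the summed activity relaxes to a constant; with disinhibition active, the unit with the larger input remains elevated and the other decays toward zero.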
The LDDM can be easily expanded to multiple options. Here, we show an example of a five-option case with five sets of option-specific R-G-D units. A line attractor network with silent disinhibition (Figure 8C, right) is able to retain relative input value information for all five items simultaneously in the network. Due to normalization, the neural activity representing each alternative decreases with the total number of alternatives, with the summed value as a constant (), leading to a lower signal-to-noise ratio when coding more items; this set-size effect may be related to working memory (WM) span constraints (Cowan, 2010; Cowan, 2016; Engle, 2001; Engle, 2002; Oberauer et al., 2016). When disinhibition is active, the LDDM exhibits a point attractor (Figure 8F, right), and the network only holds the information of the largest item as a categorical code during persistent activity.
Gated disinhibition provides top-down control of choice dynamics
In addition to its crucial role in generating WTA competition, local disinhibition provides an intrinsic mechanism for top-down control of choice dynamics. Decision circuits show remarkable flexibility in timing, with similar neurophysiological evidence of this flexibility recorded in a variety of task paradigms. In addition to reaction-time tasks, in which subjects can choose at any time after stimulus onset, decision-related neural activity has been widely studied in fixed-duration and delayed-response tasks. In fixed-duration tasks, subjects are required to withhold their selection of action until an instruction signal. Neural activity prior to the instruction signal reflects value information, for example, about reward characteristics (Dorris and Glimcher, 2004; Louie et al., 2011; Platt and Glimcher, 1999; Sugrue et al., 2004; Watanabe, 1996) or accumulating perceptual evidence (Kiani et al., 2008; Kiani et al., 2014; Kiani and Shadlen, 2009; Kim and Shadlen, 1999; Roitman and Shadlen, 2002; Rorie et al., 2010; Shadlen and Newsome, 2001); however, this activity does not fully diverge or reach the decision threshold until after the instruction cue, suggesting a gating of the competition process. In delayed-response (working memory) tasks, subjects must postpone selection for an interval that includes both stimulus presentation and an additional subsequent interval after the stimulus is withdrawn. As in fixed-duration tasks, neural activity in delayed-response tasks typically carries decision-related information (across both the stimulus and delay periods), but WTA selection – and behavioral choice – is withheld until the instruction cue is given (Kiani et al., 2008; Kiani et al., 2014; Kiani and Shadlen, 2009; Kim and Shadlen, 1999; Roitman and Shadlen, 2002; Shadlen and Newsome, 2001). 
Thus, biological decision circuits are able to evaluate choice options while selectively initiating the WTA selection process with variable context-dependent timing.
Despite this evidence of top-down control, how neural circuits implement dynamic control of selection – and temporal separation of evaluation and WTA choice – is largely unaddressed in current decision models. For example, in the RNM, neural activity is driven by fixed attractor dynamics; option evaluation and the selection process cannot be disambiguated, and WTA competition is essentially ballistic and not under top-down control. In this section, we examine how the timing of a dynamic top-down control signal – modulating the strength of disinhibition via long-range inputs and neuromodulation – allows the LDDM to capture neural activity in different task paradigms. In these simulations, disinhibition is activated when the choice instruction cue is presented. Figure 9A shows LDDM activity in a reaction-time task, a standard paradigm in perceptual decision-making (Churchland et al., 2008; Roitman and Shadlen, 2002). As in previous analyses (Figures 5 and 6), LDDM R units show simultaneous evaluation (coherence-dependent ramping) and WTA selection (rise to threshold) processes driven by immediate activation of disinhibition at motion stimulus onset.
In a fixed-duration task (Figure 9B), disinhibition is activated after a required interval of stimulus presentation as in the empirical data. Compared to the reaction-time task, LDDM activity here shows distinct, temporally separated patterns during stimuli viewing and option selection; this temporal segregation is driven by the activation of disinhibition (a step function on in this example), which promotes a transition between value representation and WTA choice.
A further demonstration of this temporal flexibility arises from considering delayed-response tasks (Figure 9C), which include an interval between stimuli offset and the onset of the instruction cue. Consistent with its ability to maintain persistent activity (Figure 8), the LDDM shows value coding across the delay interval. It delays WTA selection until after the instruction cue and the accompanying activation of disinhibition. These results show that the LDDM, via modulation in the timing of disinhibition activation, can temporally separate the value representation and selection processes (unlike the RNM), enabling it to capture the diversity of neural dynamics seen in reaction-time, fixed-duration, and delayed-response tasks.
Inhibitory potentiation distinguishes LDDM from earlier models
The architecture of disinhibition employed by the LDDM is more structured than the earlier non-selective inhibition used in most standard competition networks. This distinction gives rise to a prediction that is fundamentally novel to this class of model: the influence of global changes in inhibitory tone is non-selective during representation but becomes input-selective after disinhibition is increased. The LDDM contains two different types of inhibition, and thus, its reaction to inhibitory potentiation depends on both the state of the disinhibitory network and the intensity of potentiation. To highlight the importance of that prediction, we implemented different levels of inhibitory connection weights in both the LDDM and the standard RNM.
At the neural level, the LDDM predicts a dissociable effect of potentiated inhibition on the primary (R) neuron’s activity (Figure 10A). During option representation (cue interval in fixed duration trials), potentiated inhibition increases both recurrent and lateral inhibition, leading to decreased firing rates and a weaker modulation by value in the R neurons. During option selection (go/choice intervals in fixed duration trials), local disinhibition increases WTA activity and decreases the late-stage representation of value. As a result, these changes speed up RTs but reduce choice accuracy (Figure 10B). The expected differences between the control condition and the inhibitory potentiation condition would be evident in chronometric and psychometric curves across different levels of inputs, effectively implementing a speed-accuracy tradeoff (Figure 10C). Note that the qualitative predictions for inhibitory potentiation effects on RT and accuracy are robust to specific LDDM parameterizations (Figure 10D). In contrast, in more traditional networks like the RNM that employ non-selective inhibition, potentiated inhibition suppresses the excitatory neural activities during the WTA competition (Figure 10E). The suppression in neural coding in these models slows down RTs but does not affect choice accuracy (Figure 10F and G), thus failing to replicate the observed speed-accuracy tradeoff. We note that these novel predictions, which distinguish models that rely on structured disinhibition, could be readily tested using modern optogenetic techniques.
Discussion
The prevalence of disinhibitory circuit motifs in the brain, and recent evidence for structured decision-related inhibitory activity, argue for a more structured implementation of inhibition than has been previously employed in computational models of decision-making. Here, we show that the disinhibition-based LDDM replicates three characteristic features of observed neurobiological decision-making circuits – normalized value coding, WTA choice, and persistent activity – within a single-circuit architecture. We find that our disinhibition-based model outperforms existing recurrent circuits both in fitting empirical choice data and in replicating decision-related neural dynamics. Perhaps most importantly, the LDDM provides a novel mechanism for top-down control of decision dynamics which regulates phenomena like the empirically observed speed-accuracy tradeoff. By controlling the timing of disinhibition, the LDDM effectively paces the decision process and replicates neural dynamics from a broader range of empirical choice tasks than previous models.
Flexible control of dynamic regimes
While normalized value coding and WTA selection have largely been modeled separately, the LDDM offers a biologically plausible circuit architecture that integrates these two features via local disinhibition. Existing neurophysiological evidence shows that WTA dynamics and normalized value coding co-exist in the same brain regions. On the one hand, neural activities show relative value coding in the early stage of decision-making, reflecting a context-dependent modulation consistent with the canonical divisive normalization computation (Churchland et al., 2008; Kira et al., 2015; Louie et al., 2011; Pastor-Bernier and Cisek, 2011; Rorie et al., 2010; Strait et al., 2014; Yamada et al., 2018). On the other hand, WTA choice dynamics are widely observed during later stages of decision-making across multiple brain regions of non-human primates (Andersen and Buneo, 2002; Churchland et al., 2008; Ding and Gold, 2010; Ding and Gold, 2012; Ding and Gold, 2013; Dorris and Glimcher, 2004; Hanks et al., 2014; Kiani et al., 2008; Kiani et al., 2014; Kim and Shadlen, 1999; Louie and Glimcher, 2010; Padoa-Schioppa, 2013; Padoa-Schioppa and Conen, 2017; Pastor-Bernier and Cisek, 2011; Platt and Glimcher, 1999; Roesch and Olson, 2003; Roitman and Shadlen, 2002; Rorie et al., 2010; Shadlen and Newsome, 2001; Sugrue et al., 2004; Thura and Cisek, 2014; Yamada et al., 2018), including many of the brain regions that show normalized value coding. In addition, neural firing rates show a graded coding of perceptual evidence and reward during the early stage of decision-making tasks that require evidence accumulation, gradually transitioning to a categorical coding for choice in the late period of decision-making (Churchland et al., 2008; Dorris and Glimcher, 2004; Gold and Shadlen, 2007; Platt and Glimcher, 1999; Roitman and Shadlen, 2002; Rorie et al., 2010; Shadlen and Newsome, 1996; Sugrue et al., 2004; Zhang et al., 2021).
Existing models of decision-making typically capture activity dynamics only in specific temporal intervals during decision-making tasks or across trials in specific task paradigms (Hart and Huk, 2020; Hunt et al., 2012; Louie et al., 2014; Wang, 2002; Wong and Wang, 2006), and thus do not generalize across tasks in the way that the empirically observed neural circuits do. In contrast, the LDDM presented here modulates circuit dynamics via gated disinhibition driven by the external action instruction, without requiring changes in circuit structure. Controlling the timing of the valuation-to-WTA regime transition enables the LDDM to replicate neural dynamics in a much more diverse set of task paradigms with different stimulus and action timing schedules (Kiani et al., 2008; Roitman and Shadlen, 2002; Rorie et al., 2010; Shadlen and Newsome, 2001).
Biological plausibility and fast modulation of disinhibition
The top-down control of normalization via disinhibition used in the model mirrors recently proposed mechanisms for flexible modulation of contextual processing in sensory circuits (Coen-Cagli et al., 2012; Coen-Cagli et al., 2015; Schwartz and Coen-Cagli, 2013). The input-scaled disinhibition we employ implements a self-sparing (‘donut-like’) inhibition motif central to existing midbrain models of categorical selection (Mahajan and Mysore, 2022; Mysore and Kothari, 2020). The micro-circuit structure underlying this donut-like inhibition has been revealed as a mechanism of localized disinhibition from VIP neurons to PV/SST neurons in the cortex (Karnani et al., 2016). Recent research on neuromodulatory control of disinhibition offers biologically plausible mechanisms for such top-down control of circuit dynamics. In addition to evidence that VIP neurons are recruited by long-range projections from distant regions (Lee et al., 2013; Zhang et al., 2014), VIP neurons are recruited by neuromodulatory projections such as acetylcholine (Fu et al., 2014) from the basal forebrain and pedunculopontine nuclei and serotonin from the raphe nuclei. With ionotropic acetylcholine receptor (nAChR) and serotonin receptors (5HT3aR and 5HT2R), VIP neurons depolarize in response to acetylcholine and serotonin (Alitto and Dan, 2012; Pfeffer et al., 2013; Rudy et al., 2011; Tremblay et al., 2016). The spiking mode of a major type of VIP neurons in layer II/III of the cortex switches from an input-insensitive burst-quiescent mode to an input-sensitive tonic mode under cholinergic and serotonergic modulation (Prönneke et al., 2020). Such a mode-switching feature allows the disinhibitory neurons to receive excitatory projections with different gains under different levels of neuromodulation. This provides a mechanism, which we employ as a central feature of the LDDM, to modulate network dynamics via disinhibition without a change in network structure. 
In vivo studies show that disinhibition mediated by cholinergic activation is triggered on a surprisingly fast time scale of tens of milliseconds (Alitto and Dan, 2012; Hangya et al., 2015; Letzkus et al., 2011), supporting a fast modulation mechanism of disinhibition and network plasticity of the kind the LDDM instantiates.
The contribution of LDDM relative to existing disinhibition models
Disinhibition has been previously linked in separate models to several of the computational functions that are exhibited in a unified manner by the LDDM. For example, a computational model employing dendritic disinhibition captures flexible information routing in a context-dependent decision task, with dendritic disinhibition gating on specific inputs to a circuit while gating off other pathways (Yang et al., 2016). However, disinhibition plays a different role in this model (context-dependent input gating) from that employed in the LDDM (transition from value coding to WTA selection and mutual competition). In another example, PV neuron activation within a disinhibitory circuit motif can produce a divisive normalization of tuning curves in a model of the visual cortex (Litwin-Kumar et al., 2016). This specific model of division, however, arises from different circuit mechanisms than those we employ, such as reduced tuned input and firing rate nonlinearities. Finally, disinhibition has also been proposed to underlie the long time scales of information processing seen in working memory, as enhancing inhibitory-to-inhibitory connections stabilizes temporal dynamics and improves working memory performance in recurrent neural networks (Kim and Sejnowski, 2021). One other notable difference between previous research and our current work is that disinhibition in past models typically contributes to a specific function (e.g. input gating, categorical selection, working memory, etc.), whereas disinhibition in the LDDM both mediates a transition from value coding to WTA selection and plays an integral role in the selection process itself. Taken together, previous results and our current work reinforce the importance of incorporating disinhibition in circuit models of decision-making.
Disinhibition in cortical-basal ganglia pathways: similarities and differences
While largely absent in standard existing cortical decision models, disinhibition is a key element of action selection in models of the cortical-basal ganglia (CBG) system (Bogacz and Gurney, 2007; Frank, 2005; Lo and Wang, 2006; Schroll and Hamker, 2013; Wei et al., 2015). In the basal ganglia direct pathway, GABAergic neurons in the striatum inhibit neurons in the substantia nigra pars reticulata and internal globus pallidus, which in turn send inhibitory projections to the thalamus. Cortical inputs to the striatum thus produce a disinhibition of thalamic outputs to the cortex and brainstem motor areas, resulting in motor facilitation. Crucially, the activation of disinhibition in the CBG system is selective: the selection of a specific action requires a selective disinhibition driven by asymmetries in cortical inputs or striatal synaptic weights. This selective disinhibition is an essential element of computational models of the CBG system (Frank, 2005; Lo and Wang, 2006), including more complex models that incorporate global inhibition mediated by the indirect and hyper-direct pathways (Bogacz and Gurney, 2007; Schroll and Hamker, 2013; Wei et al., 2015).
While both the LDDM and standard CBG models utilize disinhibition to drive selection, they differ in two important ways. First, disinhibition in the LDDM specifically functions to implement a transition between value coding and WTA selection states. This transition is mediated by a broad/non-selective activation of disinhibition across the decision circuit. The activation of disinhibition is not biased toward specific alternatives until a period of interaction with differential value inputs to option-specific subcircuits that instantiates the WTA process. Second, disinhibition in the LDDM is tightly integrated with the lateral inhibition that mediates competition (and hence normalization) between alternatives; consistent with the microarchitecture of the cortex which it seeks to model (Fu et al., 2014; Karnani et al., 2016; Kepecs and Fishell, 2014; Pi et al., 2013; Zhang et al., 2014), disinhibitory, inhibitory, and excitatory neurons are part of the same local circuit. In contrast, the basal ganglia are known to lack these local, lateral connections and mutual competition. As a result, CBG models typically require both direct pathway disinhibition and diffusive suppression of competing motor plans via the indirect or hyper-direct pathways (Bogacz and Gurney, 2007; Schroll and Hamker, 2013; Wei et al., 2015) for effective operation. Thus, while conceptually similar to the CBG models, disinhibition in the LDDM is in some ways quite distinct, being tightly integrated with competitive inhibition and providing dynamic control of circuit state, both characteristics of decision-making in cortical brain areas.
Point- and line-attractor persistent activity
An interesting feature of the LDDM is that it can produce both point attractor (Bathellier et al., 2012; Kopec et al., 2015; Niessing and Friedrich, 2010; Wills et al., 2005) and continuous/line attractor (Ganguli et al., 2008; Wimmer et al., 2014; Yoon et al., 2013) dynamics in persistent activity, the balance between these two being controlled by the level of disinhibition. Given ambiguous empirical evidence, it remains controversial whether persistent activity in neural circuits exhibits point attractor (Bathellier et al., 2012; Kopec et al., 2015; Niessing and Friedrich, 2010; Wills et al., 2005) or continuous/line attractor (Ganguli et al., 2008; Wimmer et al., 2014; Yoon et al., 2013) dynamics. Most existing circuit models of persistent activity exclusively predict either a point attractor (Amit and Brunel, 1997; Brunel and Wang, 2001; Hopfield, 1982; Wang, 1999) or a line attractor (Amari, 1977; Burak and Fiete, 2009; Compte et al., 2000; Ganguli et al., 2008; Seung, 1996). The LDDM achieves the flexible reconfiguration of line attractor and point attractor states under the control of disinhibition, suggesting that attractor dynamics might not be a fixed property of a network; rather, it may be adaptive and controllable by a top-down signal operating via gated disinhibition. Of course, similar reconfiguration has been achieved by other important circuit mechanisms that have been well-described. For example, a mutual inhibition network can capture the different regimes of sequential two-interval decision-making – stimulus loading, working memory, and comparison – by assuming a flexible reconfiguration of the external inputs (Machens et al., 2005). Similar to the LDDM, this model can transition between point attractor (initial stimulus encoding), line attractor (working memory), and saddle point (comparison) dynamics. 
Interestingly, disinhibition may also play a role in this model by providing a theoretical mechanism to switch the routing of external inputs within the circuit, which drives the switch from line attractor to comparison dynamics.
A second point relevant to persistent activity is that the exact degree of recurrent excitation in the network (controlled by α) cannot be identified from the current datasets owing to its collinearity with the degree of baseline gain control from SST/PV neurons to the pyramidal neurons (controlled by BG). We believe that this feature reflects the E-I balance in the network: when recurrence exceeds gain control, the network is able to generate persistent activity after excitatory input is withdrawn; otherwise, the network cannot maintain such excitability. Since α and BG are highly collinear in predicting either neural dynamics or behavior, future empirical work is needed to identify the features that dissociate the two parameters. For example, one possible approach is to measure the neural activity of different neuronal types, taking advantage of advanced genetic labeling and in vivo calcium imaging (Najafi et al., 2020). Since we propose that baseline gain control is linked to the activity of SST/PV interneurons, a direct test would be to measure the activities of SST and PV neurons across the full dynamics of decision-making tasks; identifying BG in this way would help dissociate its contribution from that of recurrence.
Conclusions
In conclusion, we introduce a novel, biologically plausible architecture for decision-making based on local disinhibition. Our model unifies the characteristic decision-making features of normalized value coding, WTA competition, and persistent activity in a single circuit. The LDDM captures both psychometric and chronometric aspects of behavioral choice, as well as realistic neural dynamics in essentially all standard decision-making tasks. The local disinhibition it employs provides a mechanism for top-down control of local decision circuit dynamics, enabling the LDDM both to replicate variable task-dependent timing in diverse decision-making paradigms and to implement realistic speed-accuracy tradeoffs. These results suggest a new circuit mechanism for decision-making that can capture a large suite of empirical data and emphasize the importance of interneuron diversity, local circuit architecture, and top-down control in models of the decision process.
Methods
Equilibria and stability analysis of the LDDM
In Figures 3 and 5, we showed that the LDDM exhibits different patterns of equilibria and stabilities under normalized value coding and WTA competition, mediated through disinhibition. Here, we provide a detailed mathematical analysis of the equilibria and stability of this dynamical system under different states of disinhibition.
Equilibria of the system were solved by taking the intersection of the nullclines of all units, i.e., the steady states of each unit. These are obtained by setting the time derivatives of all units equal to zero in Equations 1–3. The solution of the equilibrium state of R units (Ri*) can be written as:
For a binary input system (N=2), the six differential equations can be simplified to two equations with only the R units explicitly in the expression (Equation 6). Each equation describes the nullcline of a single R unit.
Given that the equilibrium states of the system can be reduced with only R units explicitly in the expression, these equilibrium points can be visualized in the space of R1 and R2 activities as the intersection of the nullclines of the two R units (as shown in Figures 3 and 5). The stability of each equilibrium point was then examined by checking the eigenvalues of the Jacobian matrix around it. The equilibrium point is attractive and stable when all of the eigenvalues have negative real parts; the equilibrium point is divergent and unstable when there exist any positive real parts of eigenvalues. By denoting as the differential equations for all units in their steady states, the Jacobian matrix around the point can be written as Equation 7:
We examined the configuration of nullclines and checked the eigenvalues of the Jacobian matrix across a wide range of values of the recurrent excitation weight (α) and the local disinhibition weight (β). The gain control weight was set as a unit value of 1 for the sake of simplicity. and were set as zero in the following visualization.
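The eigenvalue criterion described above can be sketched in a few lines. This is a minimal illustration, not the analysis code used for the paper: it is restricted to a two-dimensional vector field (the full LDDM Jacobian covers all units), it uses a finite-difference Jacobian rather than an analytic one, and the function names and the toy systems in the usage note are hypothetical.

```python
import cmath


def jacobian_2d(f, x, y, h=1e-5):
    """Central-difference Jacobian of a 2-D vector field f(x, y) -> (fx, fy)."""
    fx_xp, fy_xp = f(x + h, y)
    fx_xm, fy_xm = f(x - h, y)
    fx_yp, fy_yp = f(x, y + h)
    fx_ym, fy_ym = f(x, y - h)
    return [
        [(fx_xp - fx_xm) / (2 * h), (fx_yp - fx_ym) / (2 * h)],
        [(fy_xp - fy_xm) / (2 * h), (fy_yp - fy_ym) / (2 * h)],
    ]


def eigenvalues_2x2(jac):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    trace = jac[0][0] + jac[1][1]
    det = jac[0][0] * jac[1][1] - jac[0][1] * jac[1][0]
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2


def classify_equilibrium(f, x, y, tol=1e-6):
    """Apply the rule from the text: stable if all real parts are negative,
    unstable if any real part is positive, otherwise linearization is
    inconclusive (e.g. a zero eigenvalue along a line of equilibria)."""
    eigs = eigenvalues_2x2(jacobian_2d(f, x, y))
    if all(e.real < -tol for e in eigs):
        return "stable"
    if any(e.real > tol for e in eigs):
        return "unstable"
    return "undetermined"
```

For example, `classify_equilibrium(lambda x, y: (-x, -2 * y), 0.0, 0.0)` classifies the origin of a simple decaying system as stable, while a field with a whole line of equilibria returns "undetermined" for any point on that line because one eigenvalue is zero.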
The behavior of the system under equivalent inputs is a critical test since it determines whether the system is able to implement a WTA choice and select an option. Thus, we examined the WTA properties of the system under equal inputs. Examining the full space of α and β revealed five territories distinguished by the number of equilibrium points and their stabilities (Figure 5—figure supplement 1A). For each territory, the configuration of nullclines is illustrated in Figure 5—figure supplement 1 labeled by color. Dark green region: when disinhibition is smaller (), α and β show a trade-off in generating WTA competition. When both α and β are small, the system generates a unique equilibrium point of normalized coding (dark green region in Figure 5—figure supplement 1A, nullclines shown in Figure 5—figure supplement 1B). All eigenvalues at this equilibrium point have negative real parts, indicating it is a stable equilibrium. Blue region: as α values increase (at smaller β values), the system generates three equilibrium points (Figure 5—figure supplement 1D), with two high-contrast (stable) attractors at the periphery and one (unstable) repellor in the center of the R1–R2 space. Neural activities of R1 and R2 with equal initial values bifurcate into the high-contrast attractors to realize WTA competition (example traces shown in red and blue lines). Green region: when the strength of disinhibition increases (), most of the regimes (yellow and red regions) show the properties of WTA competition except for a small regime when (an almost invisible region between dark green and yellow). In the green region, the nullclines of R1 and R2 still intersect on three equilibrium points, but in contrast to the blue region, the two points with a high contrast of R1–R2 activities are unstable and the equilibrium point in the center is stable; therefore, the system maintains normalized coding (Figure 5—figure supplement 1C).
Yellow region: when disinhibition is large (), most of the parameter regime in the yellow region shows only one repellor at the center (Figure 5—figure supplement 1E). The activities of R1 and R2 bifurcate from the center repellor to the high-contrast corners. The restriction of maximum activity depends on the value of . When , the model predicts a limited value of activity on each R unit as (; vertical and horizontal dashed lines in Figure 5—figure supplement 1E). When , the model predicts no boundary on the maximum activities (though a boundary may still need to be considered because of biological constraints). Red region: when disinhibition is extremely large (), the two nullclines show no intersections (Figure 5—figure supplement 1F). Most of the other features in this region are similar to the yellow region. The neural activities of R1 and R2 bifurcate from initial values from the center to the corners of high contrast (example traces shown in red and green thin lines). The boundary of neural activity is predicted when and not accounted when .
Taken together, the five territories can be simplified into two regions based on the properties of the system in implementing either normalized coding or WTA competition as discussed in the main text (Figure 5E). These two regions show clear-cut dichotomous separation in the two-dimensional space of recurrent excitation weight (α) and local disinhibition weight (β).
Numerical simulations
To quantify neural dynamics and behavioral performance (choice/RT), time-varying activity was represented by a system of differential equations (Equations 1-3) which was solved numerically using the Runge-Kutta method implemented in MATLAB (MathWorks) at a time step of 1 ms. Evaluations using smaller time steps (0.1 ms) were examined and produced similar results. At each time step, the model unit activities were updated based on their values at the previous step according to the differential equations. Considering the biological reality that spike rates cannot be negative, the activities were constrained to be non-negative. For the simulations including noise, we assumed an additive noise term for each unit, which evolved independently based on an Ornstein-Uhlenbeck process (Equation 8),
where is the variance of the noise, is a Gaussian white noise with zero mean and unit variance, and is the time constant for the noise fluctuation process. The time constant for the noise process () was set to 2 ms, aligned with previous studies (Wang, 2002; Wong and Wang, 2006). Note that this approach assumes for convenience that noise arises in model unit activity; however, similar stochasticity can be implemented assuming noise arises in inputs external to the circuit, generalizing our findings.
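As an illustration of the noise process described above, the following sketch generates an Ornstein-Uhlenbeck trace with the 2 ms time constant and 1 ms time step given in the text. It rests on assumptions: Equation 8 is not reproduced here, so the sketch uses the exact discrete-time OU update and treats `stationary_sd` as the stationary standard deviation of the noise, which may differ from the paper's noise parameter by a constant factor. The function name and arguments are hypothetical.

```python
import math
import random


def simulate_ou_noise(n_steps, stationary_sd, tau=0.002, dt=0.001, seed=0):
    """Exact discrete-time update of a zero-mean Ornstein-Uhlenbeck process.

    tau is the noise time constant (2 ms in the text), dt the simulation
    step (1 ms in the text). stationary_sd is an assumed parametrization.
    """
    rng = random.Random(seed)
    decay = math.exp(-dt / tau)                       # memory carried over one step
    kick = stationary_sd * math.sqrt(1.0 - decay ** 2)  # keeps the variance stationary
    eta = 0.0
    trace = []
    for _ in range(n_steps):
        eta = eta * decay + kick * rng.gauss(0.0, 1.0)
        trace.append(eta)
    return trace
```

One such trace would be generated independently for each unit and added to that unit's activity. The exact update is used here instead of an Euler-Maruyama step because the time step is half the noise time constant, where a first-order approximation would be coarse.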
All parameters used for visualization were set as the following unless specified elsewhere or fitted as free parameters in Figure 6: the time constants of the three unit types were set to the same value of 100 ms only for non-quantitative visualization purposes and fitted independently as free parameters in the model fittings; the gain control weight was set as a unit value of 1 for simplicity; the self-excitation weight was set as 15; the disinhibition weight was assumed as zero in representation (i.e. ) and set as 1.1 in WTA competition; the input values V1 and V2 were set as S*(1+c’) and S*(1-c’), where c’ indicates the motion coherence of the stimulus, with varied values (0, 3.2, 6.4, 12.8, 25.6, 38.4, and 51.2%), and S indicates the scale of input (set as 250). Baseline input was set as 70 in representation in Figure 3, fit as a free parameter in Figure 4, and set as 0 in WTA competition and persistent activity in Figures 5—10. Ornstein-Uhlenbeck noise was set as zero in most figures, which aim at visualizing the model properties, except in Figure 10B–D, and was fit as a free parameter in Figure 6. The set of parameters in Figure 7 were adjusted to predict the multi-alternative choice data: was set as ; was set as 1.5; scaling parameter was set as 640 for both pre-motion and motion period but set as 427 for the first 190 ms of motion period to replicate the initial dip; all parameters were kept the same between two- and four-alternative choices. Parameters in Figure 8 were adjusted between two- and five-item cases in order to get comparable scale of activities in visualization: for two-item case, S = 250, α = 15, and βon = 0.4; for five-item case, S = 50, , and .
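The input construction described above, V1 = S*(1+c’) and V2 = S*(1-c’), can be written as a small helper. The function and constant names are hypothetical; the coherence levels and the default scale S = 250 are the visualization values listed in the text, with coherences expressed as fractions rather than percentages.

```python
# Motion coherence levels used for visualization, as fractions (0 to 51.2%).
COHERENCES = (0.0, 0.032, 0.064, 0.128, 0.256, 0.384, 0.512)


def stimulus_inputs(coherence, scale=250.0):
    """Return (V1, V2) = (S*(1+c'), S*(1-c')) for one coherence level."""
    return scale * (1.0 + coherence), scale * (1.0 - coherence)


# One (V1, V2) pair per coherence level.
INPUT_PAIRS = [stimulus_inputs(c) for c in COHERENCES]
```

Under this definition the summed input V1 + V2 stays at 2S for every coherence, so coherence changes only the contrast between the two options and not the total drive.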
Fitting the LDDM and the DNM to the neural firing rates of normalized value coding
In order to quantify the performance of the LDDM in fitting to the neural dynamics of normalized value coding and compare with the original DNM, we fit the equilibrium values of the LDDM and DNM to the dataset of normalized value coding (Figure 4 in Louie et al., 2011). In this task, monkeys are asked to represent the reward targets (1, 2, or 3) on the corresponding location of the screen. The neural activity in the response field receiving direct input V1 is recorded. Different combinations of V1, V2, and V3 are provided to the monkeys based on the associated volume of water in the presented targets (varying from 50, 100, 200, and 250 µl or omitted targets marked as 0), resulting in 28 data points.
To fit the DNM, we employed the following differential equations,
To fit LDDM, we employed Equations 1-3.
The direct input value () to each pool takes the value of the volume of water reward () plus a baseline input value . was set as 1. In the LDDM, there are additional terms of self-excitation weighted by , baseline gain control input fed into , and coupling between and the disinhibitory neurons weighted by .
To fit the predicted activities to the empirical mean firing rates during the sustained phase, we fit the predicted activities at the equilibria of these models. The equilibria of the two models were solved in Equation 10 and Equation 11, respectively, by setting the differential equations (Equation 9 and Equations 1-3) to zero.
For DNM,
For LDDM,
To fit the empirical activities with normalized scale, we need another scaling parameter to capture the arbitrary rescaling, which results in the following equations (Equation 12 and Equation 13).
For DNM,
For LDDM,
Since we assume the disinhibition modules in LDDM keep silent during representation, takes zero. For a trinary input system, the equilibria of the two models can be described by the following equations (Equation 14 for DNM and Equation 15 for LDDM).
For DNM,
For LDDM,
From Equation 15, we realized that and share the same term and cannot be independently identified. Thus, we combined these parameters as one in our model fitting.
Based on the above analyses, two free parameters were estimated for the DNM (baseline input and the scaling parameter ). Three free parameters were estimated for the LDDM (, S, and a combined parameter ). The Bayesian adaptive direct search (BADS) algorithm (Acerbi and Ma, 2017a; Acerbi and Ma, 2017b) was implemented to minimize the ordinary squared error between the steady state of the predicted neural firing rates on R1 and the empirical data.
Fitting the RNM to the neural firing rates of normalized value coding
In order to quantify the performance of the RNM in predicting normalized value coding, we fit the reduced form of the RNM (Wong and Wang, 2006) with four free parameters (JNi,i,i, JNi,j,k(i≠j≠k), I0, and a scaling parameter S applied to the predicted neural firing rates) to a normalized value coding dataset (Figure 4 in Louie et al., 2011). Other parameters are set the same as reported in the original paper (Wong and Wang, 2006), except that the noise term is set as zero. The RNM is expanded to a trinary choice circuit, with three selective populations wired together based on the same rules specified in the original paper (Wong and Wang, 2006). We study the predicted neural activity on Pool 1 that receives direct input from V1 and investigate how the activity of Pool 1 changes with the values of contextual inputs V2 and V3. The BADS algorithm was used to minimize the mean squared error between the predicted neural firing rates of Pool 1 and the empirical neural firing rates data reported in Figure 4 of Louie et al., 2011. The best-fitting result shows that the RNM explains 89.2% of the variance, worse than the DNM and LDDM we reported in the main text (best-fitting parameters: = 0.0055, = 0.0861, = 0.3511, and S = 1.074).
Fitting the LDDM to empirical behavioral data
The LDDM with seven free parameters (the weights of self-excitation [] and disinhibition [], the variance of Gaussian white noise in the Ornstein-Uhlenbeck process [], the scaling parameter of input [S], and time constants for three types of units , , and ) was fit to choice behavior (RT and choice accuracy) in a classic perceptual decision-making dataset (Roitman and Shadlen, 2002). was set to zero since any positive values will worsen the accuracy performance. was fixed as zero since it shows high collinearity with (Figure 6—figure supplement 3). We employed the commonly used QMLE method (Heathcote et al., 2002; Ratcliff and McKoon, 2008). The rationale of QMLE is to minimize the differences between the predicted data and the empirical data on the proportion of trials located in each RT bin. Choice accuracy was implicitly estimated because the algorithm accounts for the proportion of trials between correct and error trials. Nine quantiles (from 0.1 to 0.9 with 0.1 of step size) were used, resulting in 10 RT bins, with correct and error trials accounted for separately at each coherence level. Because the LDDM has no closed-form analytic expression for the RT distribution, we evaluated the prediction by Monte Carlo simulations (10,240 repetitions for each input coherence). In each simulated trial, the initial R1 and R2 activities were set as 32 Hz to be comparable to the empirical data (Churchland et al., 2008; Roitman and Shadlen, 2002). Visual stimulus (motion) inputs were defined as S*(1+c’) and S*(1-c’) for V1 and V2, where the free parameter S models input scaling and the coherence c’ replicated values in the original experiment (0, 3.2, 6.4, 12.8, 25.6, and 51.2 %)(Roitman and Shadlen, 2002). A gap period (90ms) was implemented at visual stimulus onset as a non-decision period to capture the commonly observed initial dip untuned to inputs in empirical firing rates (Roitman and Shadlen, 2002). Gated disinhibition was activated along with inputs after the gap. 
A decision was reached when either of the R unit activities reached a decision threshold of 70 Hz, the biological threshold observed in the empirical data (Roitman and Shadlen, 2002). A non-decision time of 30 ms was added to the threshold-crossing time to capture the delay in downstream motor execution. After the decision, the input values, V1 and V2, were reset to zero. The negative log-likelihood (nLL) of QMLE was minimized using the BADS algorithm in MATLAB (Acerbi and Ma, 2017b). The estimation was conducted using GPU (NVIDIA Tesla V100) parallel computation on a high-performance cluster (NYU Langone), with 160 sets of random initial parameter values to avoid local minima. The set with the smallest nLL in its fitting result was selected as the best-fitting result.
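The QMLE objective described above can be sketched as follows for a single response category (for example, correct trials at one coherence). This is an illustrative sketch rather than the fitting code used here: the function names, the linear-interpolation quantile rule, and the probability floor are assumptions. Passing the total number of simulated trials (correct plus error) as `n_simulated_total` is what lets the objective account for choice proportions implicitly.

```python
import bisect
import math

QUANTILE_PROBS = (0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9)


def quantile_edges(values, probs=QUANTILE_PROBS):
    """RT quantiles by linear interpolation between order statistics."""
    xs = sorted(values)
    edges = []
    for p in probs:
        pos = p * (len(xs) - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(xs) - 1)
        edges.append(xs[lo] + (pos - lo) * (xs[hi] - xs[lo]))
    return edges


def bin_counts(values, edges):
    """Count values in the len(edges) + 1 bins delimited by the edges."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[bisect.bisect_right(edges, v)] += 1
    return counts


def qmle_nll(empirical_rts, simulated_rts, n_simulated_total,
             probs=QUANTILE_PROBS, floor=1e-10):
    """Negative log-likelihood of the empirical bin counts under the
    bin proportions predicted by Monte Carlo simulation."""
    edges = quantile_edges(empirical_rts, probs)
    observed = bin_counts(empirical_rts, edges)
    predicted = bin_counts(simulated_rts, edges)
    nll = 0.0
    for n_obs, n_pred in zip(observed, predicted):
        p = max(n_pred / n_simulated_total, floor)  # avoid log(0) for empty bins
        nll -= n_obs * math.log(p)
    return nll
```

The total objective would be the sum of `qmle_nll` over correct and error trials at every coherence level. The floor on predicted proportions is a common practical safeguard when a simulated bin is empty; its value here is arbitrary.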
The visualization of the predicted RT distribution (Figure 6A) was calculated based on 60 evenly distributed RT bins, with correct and error trials calculated separately under each coherence. The predicted neural dynamics (Figure 6D) were generated using the model best fit to behavior. R unit activities were aggregated across correct trials, segregated by units associated with the chosen and unchosen sides. As in the original experiment data visualization (Roitman and Shadlen, 2002), activity early in trials was aligned to stimulus onset. Data within 100 ms of boundary crossing were omitted to reduce the impact of decision dynamics on visualizing early-stage ramping dynamics. Early activity traces were cut off at the median value of RT for each coherence level to ensure that the average trace was based on at least half of the trials. Activity late in trials was aligned to the time of the decision, and data within 200ms of stimulus onset was omitted.
Fitting the RNM to empirical behavioral data
In order to compare the model performance in predicting choice behaviors, we fit the original RNM to the classical perceptual decision dataset (Roitman and Shadlen, 2002). We used the reduced form of the RNM (Wong and Wang, 2006). We set eight parameters in the reduced model (see the Appendix in its original paper) as free parameters to fit: self-excitatory coupling weights , mutual inhibitory coupling weights , non-selective input , noise amplitude of Ornstein-Uhlenbeck (OU) process , input scale , synaptic kinetic parameter , initial value , and time constant . The other parameters that describe the input-output relationship of a single cell were set as the same in the paper: a = 270 (VnC)⁻¹, b = 108 Hz, and d = 0.154 s. The time constant for the AMPA receptor was fixed as 2 ms. The task setting, non-decision time (90 ms delay after stimulus onset and 30 ms delay before saccade), and optimization were kept the same as in fitting the LDDM (see above). The time step dt was set as 0.001 s.
Fitting the LCA to empirical behavioral data
Another widely acknowledged decision circuit model, the LCA model (Usher and McClelland, 2001), was fit to the behavioral data (Roitman and Shadlen, 2002). The dynamics of the two nodes in the LCA can be described using the following differential equations (Equation 16).
where () indicates the activity of each node; indicates the excitatory input value to each node; indicates the net leakage on each node after the cancellation of recurrent excitation; weighs the mutual inhibition strength from the other nodes; is a Gaussian random noise on each node with an SD of .
The input values were set as 1+c’ for Option 1 and 1−c’ for Option 2, with c’ changing over 0 to 0.512. We fitted the threshold as a free parameter. In that way, the time constant can be taken as an arbitrary value (100 ms used in our case) since it was not independent from the threshold. Other than the parameters we mentioned above, non-decision time was fixed as 120 ms, sharing the same assumption with the other two models based on the empirically observed delays after stimulus onset (90 ms) and before the saccade (30 ms). That gives in total four free parameters to estimate , , , and . Since the scale of the activities is arbitrarily defined, it would need rescaling when compared to the empirical data of mean firing rates in the unit of Hz. The task setting and the optimization used were kept the same as in fitting the LDDM (see above). The time step dt was set as 0.001 s.
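A single LCA trial under the settings above can be sketched as follows. Because Equation 16 is not reproduced here, the update rule is an assumption based on the standard two-node leaky competing accumulator form (input minus net leak minus mutual inhibition, plus Gaussian noise, with activities rectified at zero); the function and parameter names are hypothetical. The inputs 1+c’ and 1−c’, the 100 ms time constant, the 1 ms time step, and the 120 ms non-decision time come from the text.

```python
import math
import random


def simulate_lca_trial(coherence, leak, inhibition, noise_sd, threshold,
                       tau=0.1, dt=0.001, non_decision=0.12,
                       max_time=5.0, seed=None):
    """Simulate one two-node LCA trial.

    Returns (choice, rt) with choice in {1, 2} and rt in seconds, or
    (None, None) if no node reaches the threshold within max_time.
    """
    rng = random.Random(seed)
    inputs = (1.0 + coherence, 1.0 - coherence)
    x = [0.0, 0.0]
    n_steps = int(round(max_time / dt))
    for step in range(1, n_steps + 1):
        new_x = []
        for i in (0, 1):
            other = x[1 - i]
            drift = (inputs[i] - leak * x[i] - inhibition * other) * dt / tau
            noise = noise_sd * math.sqrt(dt / tau) * rng.gauss(0.0, 1.0)
            new_x.append(max(0.0, x[i] + drift + noise))  # activities stay non-negative
        x = new_x
        if max(x) >= threshold:
            choice = 1 if x[0] >= x[1] else 2
            return choice, step * dt + non_decision
    return None, None
```

In a fit, the four free parameters (leak, inhibition, noise SD, and threshold) would be adjusted while `tau` stays fixed, consistent with the statement above that the time constant is not independent of the threshold.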
Analysis of persistent activity
We showed in Results that the LDDM with recurrent excitation predicts persistent activity that maintains input information during delay intervals. Here, we provide mathematical analyses of the LDDM differential equations to examine the properties and genesis of this persistent activity. In addition to examining the property of the system with symmetric gain-control weights (), we expanded our analysis to allow the gain-control weights to be asymmetric; this allows us to examine the robustness of LDDM properties to asymmetric weights.
Equilibrium states of the differential equations (Equations 1-3) after the withdrawal of inputs were considered. The gain control weights were split into two parts, with the local-option weight denoted as () and the cross-option weight denoted as (). The input values were set to zero, and local disinhibition was assumed inactive (). Equilibria of the system were solved by taking the intersection of the steady states of all units, i.e., when , , and all equal to . When the input terms are set to zero, the solution degrades from Equation 5 to Equation 17 as a linear form,
For a binary choice system, the solution of Equation 17 is denoted in linear algebra as:
The solutions of the equations depend on the value of recurrent excitation and baseline gain control . When , the equations do not provide a positive solution. This explains why the system without recurrent excitation () cannot generate persistent activity. When , the equations provide positive solutions. The model generates persistent activities in three different patterns depending on the symmetry of gain control weights, i.e., , , and .
First, by assuming and , the nullclines of R1 and R2 overlap on a line of attraction, as shown in Figure 8B (the same as Figure 8—figure supplement 1B). Any position on this line is an equilibrium point. This is a special case where the eigenvalues on each point have a real part of zero; therefore, linearization around the equilibrium points cannot tell us their stability. Thus, we checked the instantaneous change direction of neural activities instead across a wide range of initial values to see whether the system converges to the line of attraction. From the differential equations (Equations 1-3), the ratio of the instantaneous change rates of R1 () and R2 () keeps the same ratio as the ratio of original activities (R1/R2), given under the assumption of symmetric gain control weights. As a result, for any given initial values, R1 and R2 activities change in the direction that preserves the original ratio until reaching equilibrium on the line of attraction. The instantaneous changes of R1 and R2 are shown as a vector field (red arrows) in Figure 8B. Thus, any positive initial values will drop into an equilibrium state with the ratio of R1*/R2* maintaining the ratio of initial values, which preserves the ratio of inputs when the activities are inherited from the stage of value representation. Figure 8—figure supplement 1E shows example dynamics of R1 and R2 under different ratios of input values (Figure 8—figure supplement 1G). The activities show the characteristic dynamics of divisive normalization during the inputs and preserve this input information after the withdrawal of inputs.
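The ratio-preserving convergence onto the line of attraction described above can be checked numerically with a short simulation. This sketch depends on an assumed form of the model, since Equations 1-3 are not reproduced in this section: after input withdrawal and with disinhibition inactive, each R unit is taken to follow tau*dR_i/dt = -R_i + alpha*R_i/(1 + G_i), and each gain-control unit tau*dG_i/dt = -G_i + omega*(R_1 + R_2) + B_G, with symmetric gain control weight omega. The symbol omega, the function name, and the forward-Euler scheme are assumptions; alpha = 15, a unit gain control weight, and 100 ms time constants follow the visualization settings given earlier.

```python
def simulate_delay_activity(r_init, alpha=15.0, omega=1.0, b_g=0.0,
                            tau=0.1, dt=0.001, duration=5.0):
    """Forward-Euler simulation of two R units after inputs are withdrawn.

    Assumed dynamics (disinhibition inactive, symmetric gain control):
        tau * dR_i/dt = -R_i + alpha * R_i / (1 + G_i)
        tau * dG_i/dt = -G_i + omega * (R_1 + R_2) + b_g
    """
    r1, r2 = r_init
    # Start gain control at its steady state for the initial R activities.
    g1 = g2 = omega * (r1 + r2) + b_g
    for _ in range(int(round(duration / dt))):
        dr1 = (-r1 + alpha * r1 / (1.0 + g1)) * dt / tau
        dr2 = (-r2 + alpha * r2 / (1.0 + g2)) * dt / tau
        dg1 = (-g1 + omega * (r1 + r2) + b_g) * dt / tau
        dg2 = (-g2 + omega * (r1 + r2) + b_g) * dt / tau
        # Activities are constrained to be non-negative, as in the text.
        r1, r2 = max(0.0, r1 + dr1), max(0.0, r2 + dr2)
        g1, g2 = max(0.0, g1 + dg1), max(0.0, g2 + dg2)
    return r1, r2
```

Under these assumptions, both R units are scaled by the same factor at every step, so the R1/R2 ratio of the initial values is carried onto the line of equilibria, where the summed activity settles at (alpha - 1 - b_g)/omega.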
However, since the values of and are complementary on the line of attraction, any combination of values with a constant sum satisfies the equilibrium. Thus, any disturbance to the system (e.g. random noise) will drive and to deviate from their original ratio resulting in a loss of the coded information about the inputs. Noise-driven drift on the line of attraction will cause the decaying of the coded value information over time, consistent with the degradation attribute of working memory (Barrouillet et al., 2011; Barrouillet and Camos, 2012; Lee and Harris, 1996; Paivio and Bleasdale, 1974; Portrat et al., 2008).
In addition, under the special condition of symmetric gain control weights (), the formula in Equation 18 can be easily expanded to multiple inputs with the equilibrium delay interval activities defined by:
The summed value of all R units equals to a constant . When the number of inputs (N) increases, the activity shared by each R unit decreases and leads to a lower signal relative to the noise scale. Thus, as the number of coded items increases, the information kept during persistent activity may become less accurate considering a lower signal-to-noise ratio. This may explain another important attribute of working memory – the constraint of the working memory span (Cowan, 2010; Cowan, 2016; Engle, 2001; Engle, 2002; Oberauer et al., 2016).
Second, by assuming and , the nullclines of R1 and R2 intersect on a unique equilibrium point, where R1 and R2 share the same value (Figure 8—figure supplement 1A). The point is confirmed as attractive by linearization. Any positive initial values on the space of R1 and R2 will converge into this point, which is visualized in the instantaneous change ranges of R1 and R2 (red arrows) for a wide range of given initial values (Figure 8—figure supplement 1A). Thus, R1 and R2 will gradually converge to be equal, and the original information about input values will be lost. Nevertheless, the dynamic of information loss is based on the level of asymmetry of . For a close-to-symmetric matrix, the input information can still be preserved for a considerable amount of time. We showed example dynamics of information loss in Figure 8—figure supplement 1D. After the withdrawal of inputs, the R unit activities collapse into the same level and the coded ratio information gradually diminishes (simulation parameters: ).
Finally, by assuming and , the nullclines of R1 and R2 intersect on a unique equilibrium point, which is confirmed as unstable by linearization (Figure 8—figure supplement 1C). Any initial values of activities on the space will diverge into the upper-left or bottom-right corner of the space generating high contrast between R1 and R2, with the higher activity as and the lower activity suppressed to zero. The instantaneous change rates of R1 and R2 (red arrows) are visualized in the vector field in Figure 8—figure supplement 1C. The instantaneous change direction bifurcates at the line of , biased to the side associated with the higher initial activity. As an outcome, the R unit with higher initial values tends to increase while the opponent unit tends to be suppressed to zero, a process that implements WTA competition before the action stage but with constrained higher activity. Example R1 and R2 activity dynamics are shown in Figure 8—figure supplement 1F. After the withdrawal of inputs, R1 activities with different preceding input values collapse onto the same level of high activity, while R2 activities with lower input values are suppressed to zero. Thus, the system gradually switches from the normalized coding of inputs to a categorical coding of choice over the delay interval.
We also examined whether persistent activity could exist with active local disinhibition. We showed in Results that persistent activity in the working-memory task switches to WTA choice under the dynamic control of disinhibition (Figure 8D–F). How does the transition from persistent activity to WTA choice happen? How might disinhibition change the dynamic pattern of persistent activity during a delay interval?
The analysis was based on the differential equations of the system with symmetric gain control weights and without inputs (Equations 1-3). The equilibrium solution is given by:
With binary inputs, the solution can be thus written as:
Besides the impact of recurrent excitation and baseline gain control discussed above, equilibrium responses are determined by the relative strength between disinhibition () and the gain control weight (). We examined three separate conditions: , , and . We have already shown the analysis for the special case when above (phase plane analysis and example dynamic shown in Figure 8—figure supplement 1B) and replotted in Figure 8—figure supplement 2A for the sake of comparison with the other two conditions.
By assuming , the nullclines of R1 and R2 intersect on a unique equilibrium point, which was confirmed as unstable by checking the eigenvalues of the Jacobian matrix around the point (Figure 8—figure supplement 2B). Any initial values on the space will diverge into the upper-left or bottom-right corner of the space, with the higher activity value as and the lower activity value as zero. We show the instantaneous change rates of R1 and R2 at given initial values in the vector field (red arrows; Figure 8—figure supplement 2B). In Figure 8—figure supplement 2E, we show example R1 and R2 activity dynamics (value setting kept the same as in Figure 8—figure supplement 1G). All of the R1 activities with larger input values converge to the same level of activity after the withdrawal of inputs, while all of the R2 activities with lower input values are suppressed to zero, implementing a WTA competition. Thus, the system gradually switches from normalized coding of input values to categorical choice from the early to the late stage of persistent activity.
With strong disinhibition (β > ω), most features are similar to the previous case, except that the model no longer constrains the maximum activity (Figure 8—figure supplement 2C). The nullclines intersect at a unique repellor, and the activities of R1 and R2 bifurcate along the line R1 = R2. The example dynamics show that the activity of the unit with the higher initial value increases without bound and thus reaches a decision threshold. The speed of this rise depends on the unit's advantage over its competitor, as defined by their initial values.
Taken together, these analyses show that persistent activity is maintained as normalized coding of input values only with symmetric gain control weights and inactive disinhibition (β = 0). When disinhibition has moderate strength (β < ω), the persistent activity gradually transitions from value coding to categorical choice coding but does not reach the decision threshold. When disinhibition is strong enough (β > ω), the system generates WTA competition and reaches the decision threshold.
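The three regimes summarized above can be illustrated with a short simulation. This is a hedged Python sketch rather than the authors' MATLAB code. It assumes the input-free equations τ dR_i/dt = −R_i + αR_i/(1 + G_i), τ dG_i/dt = −G_i + ω(R_1 + R_2) − D_i, τ dD_i/dt = −D_i + βR_i, with zero baselines, equal time constants, and a rectified gain term added purely as a numerical safeguard. The initial rates, α = 15, ω = 1, τ = 0.1 s, and the 70 Hz threshold are illustrative choices, and `simulate_delay` is a hypothetical helper name.

```python
import numpy as np

def simulate_delay(beta, r0=(12.0, 8.0), alpha=15.0, omega=1.0,
                   tau=0.1, dt=0.001, t_max=20.0, threshold=70.0):
    """Euler integration of the assumed input-free system.

    Returns (R, t_cross): the final rates and the time at which either unit
    first reached `threshold` (None if the threshold was never reached).
    """
    R = np.array(r0, dtype=float)
    D = beta * R                       # start D and G at their quasi-steady values
    G = omega * R.sum() - D
    for step in range(int(round(t_max / dt))):
        gain = 1.0 + np.maximum(G, 0.0)    # rectification: safeguard assumption
        dR = (-R + alpha * R / gain) / tau
        dG = (-G + omega * R.sum() - D) / tau
        dD = (-D + beta * R) / tau
        R, G, D = R + dt * dR, G + dt * dG, D + dt * dD
        if R.max() >= threshold:
            return R, (step + 1) * dt
    return R, None

# beta = 0: no disinhibition; beta < omega: moderate; beta > omega: strong.
results = {beta: simulate_delay(beta) for beta in (0.0, 0.4, 1.4)}
for beta, (R, t_cross) in results.items():
    print(f"beta={beta}: R1={R[0]:.2f}, R2={R[1]:.2f}, threshold crossed at {t_cross}")
```

Under these assumptions, the β = 0 run preserves the initial ratio of R1 to R2 while the summed activity relaxes to (α − 1)/ω; the moderate case ends with the winner near (α − 1)/(ω − β) and the loser near zero; and the strong case crosses the threshold. Non-zero baseline terms would shift these values.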
Simulation of pharmacological manipulation of inhibitory activity
In Figure 10, we tested the effects of inhibitory potentiation (e.g. by a GABAergic agonist) in both the LDDM and the RNM (Wong and Wang, 2006) by assuming different levels of enhancement of the inhibitory projections. For the LDDM (Figure 10A–D), we assumed , , input scale S = 256, decision threshold = 70 Hz, and dt = 1 ms. Panel A illustrated the temporal dynamics of the excitatory pools (R1 and R2) at an input coherence of 25% under control (inhibitory connection weight = 1.0) and potentiation (inhibitory connection weight = 3.8) conditions (other parameters used were , , = 0, , and ). Panel B examined the predicted RT and choice accuracy over different input coherences (c’ = [0, 3.2, 6.4, 12.8, 25.6, and 51.2%]) and levels of inhibitory weights (from 1 [control] to 4 [enhanced]; , , , , , and 10,000 repetitions). Panel C showed the chronometric and psychometric curves over a range of input coherences (1–100%) for the contrast between control and inhibitory potentiation (inhibitory connection weight = 1.8). Panel D scanned the full parameter space of and for the contrast between control and inhibitory potentiation (inhibitory connection weight = 1.8; c’ = 3.2%, , = 0, , and 10,000 repetitions). For the RNM (Figure 10E–G), we used the parameters specified in Wong and Wang, 2006 for the mean-field rate model. Inhibitory potentiation was manipulated by weighting the inhibitory connection in the model. Panel E illustrated the noiseless neural dynamics of the RNM using the same input coherence and inhibitory enhancement levels as in panel A. Panel F was designed for comparison with panel B; the input coherences and inhibitory enhancement levels were therefore kept the same as in panel B, with the noise amplitude set as recommended by Wong and Wang, 2006. Panel G showed the chronometric and psychometric functions predicted by the RNM under the same input and inhibitory potentiation assumptions as in panel C.
Motifs tested and compared for normalized coding and WTA choice
We tested a series of motifs and found that local disinhibition is critical for integrating normalized valuation and choice functions. To do this, we tested four types of modifications that might enhance mutual competition between the option-specific local sub-circuits (Figure 2—figure supplement 1A): (a) Recurrent self-excitation (loops weighted by ), which provides self-amplification of each R unit, a property shown to be important for mutual competition in the RNM. (b) Local disinhibition (loops weighted by ), which is the focus of the main text and is mediated through disinhibitory units (D); a D unit inhibits the gain control G unit in its local sub-circuit, thereby releasing inhibition on the local R unit. (c) Cross inhibition (loops weighted by ), which directly inhibits the lateral R units through inhibitory units (I) to implement mutual inhibition. (d) Lateral gain control boost (loops weighted by ), which is mediated through excitatory units (E) that boost the lateral G units, thereby driving higher gain control on the lateral R than on the local R (i.e. asymmetric gain control) and realizing mutual inhibition.
To see which type of modification(s) is/are critical for integrated value normalization and choice, we tested different combinations of these modifications on the original DNM circuit. The full model with all modifications can be described by a set of differential equations (Equations 22-26):
where i=1, …, N designates choice alternatives, each of which receives input , and , , , , and are the time constants for the R, G, D, I, and E units. The weights represent the coupling strength between excitatory units and gain control units . The parameters , , , and control the active state of recurrent excitation, local disinhibition, cross inhibition, and lateral gain control boost loops, respectively.
The active and inactive states of the four types of loops can be combined into 2^4 = 16 possible models. Example dynamics are shown in Figure 2—figure supplement 1B for each type of model. When local disinhibition () is off (left two columns), the model generates WTA dynamics only when cross inhibition () is on. However, the maximum activity in the late stage is still restricted to a value lower than the phasic peak during the early stage, contradicting empirical findings that the late-stage decision threshold is usually higher than the activity at the early phasic peak (Churchland et al., 2008; Kiani et al., 2008; Kiani and Shadlen, 2009; Louie et al., 2011; Roitman and Shadlen, 2002; Rorie et al., 2010; Shadlen and Newsome, 2001; Sugrue et al., 2004). This restriction arises because, with only cross inhibition, local option gain control is not released; this release requires local disinhibition. With local disinhibition on (, the right two columns), the models generate WTA dynamics with high activity in the late stage that reaches the decision threshold. This holds even without any other modifications (see the panel with and off), highlighting the role of local disinhibition in generating WTA competition. For the sake of simplicity, we omitted the other non-essential modifications and kept only the loop of local disinhibition. Because recurrent excitation is important for persistent activity and exists widely in cortical circuits, we retained it as well. The modified DNM with local disinhibition and recurrent self-excitation is the primary model (LDDM) characterized in the current work.
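The 16 combinations mentioned above can be enumerated mechanically. The following Python sketch only illustrates the bookkeeping. The loop labels are hypothetical names for the four modifications and do not correspond to identifiers in the authors' code.

```python
from itertools import product

# Hypothetical labels for the four optional loops described in the text.
LOOPS = ("recurrent_excitation", "local_disinhibition",
         "cross_inhibition", "lateral_gain_boost")

def enumerate_motifs():
    """Return every on/off combination of the four loops as a dict."""
    return [dict(zip(LOOPS, states))
            for states in product((False, True), repeat=len(LOOPS))]

motifs = enumerate_motifs()
# Subset of circuits in which the local disinhibition loop is active.
with_disinhibition = [m for m in motifs if m["local_disinhibition"]]
print(len(motifs), len(with_disinhibition))
```

Each dictionary could then serve as a set of switches for a simulation of the full circuit, with the disinhibition-on and disinhibition-off subsets compared as in the text.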
Data availability
The empirical data presented in this paper and the MATLAB code used for simulations and for fitting the empirical data are available at https://doi.org/10.17605/OSF.IO/YGR57.
- Open Science Framework. Flexible control of representational dynamics in a disinhibition-based model of decision making. https://doi.org/10.17605/OSF.IO/YGR57
- Matlab m files, ID shadlenlab. Response of neurons in the lateral intraparietal area during a combined visual discrimination reaction time task.
References
- Practical Bayesian optimization for model fitting with Bayesian adaptive direct search. Proceedings of the 31st International Conference on Neural Information Processing Systems. pp. 1834–1844.
- Cell-type-specific modulation of neocortical activity by basal forebrain input. Frontiers in Systems Neuroscience 6:79. https://doi.org/10.3389/fnsys.2012.00079
- Dynamics of pattern formation in lateral-inhibition type neural fields. Biological Cybernetics 27:77–87. https://doi.org/10.1007/BF00337259
- Intentional maps in posterior parietal cortex. Annual Review of Neuroscience 25:189–220. https://doi.org/10.1146/annurev.neuro.25.112701.142922
- Further evidence for temporal decay in working memory: reply to Lewandowsky and Oberauer (2009). Journal of Experimental Psychology. Learning, Memory, and Cognition 37:1302–1317. https://doi.org/10.1037/a0022933
- As time goes by: temporal constraints in working memory. Current Directions in Psychological Science 21:413–419. https://doi.org/10.1177/0963721412459513
- Modulation of neuronal activity in superior colliculus by changes in target probability. The Journal of Neuroscience 18:7519–7534. https://doi.org/10.1523/JNEUROSCI.18-18-07519.1998
- Effects of Neuromodulation in a cortical network model of object working memory dominated by recurrent inhibition. Journal of Computational Neuroscience 11:63–85. https://doi.org/10.1023/a:1011204814320
- Accurate path integration in continuous attractor network models of grid cells. PLOS Computational Biology 5:e1000291. https://doi.org/10.1371/journal.pcbi.1000291
- Linearity and gain control in V1 simple cells. In: Ulinski PS, Jones EG, Peters A, editors. Models of Cortical Circuits. Springer. pp. 401–443. https://doi.org/10.1007/978-1-4615-4903-1
- Normalization as a canonical neural computation. Nature Reviews Neuroscience 13:51–62. https://doi.org/10.1038/nrn3136
- Decision-making with multiple alternatives. Nature Neuroscience 11:693–702. https://doi.org/10.1038/nn.2123
- Cortical surround interactions and perceptual salience via natural scene statistics. PLOS Computational Biology 8:e1002405. https://doi.org/10.1371/journal.pcbi.1002405
- Flexible gating of contextual influences in natural vision. Nature Neuroscience 18:1648–1655. https://doi.org/10.1038/nn.4128
- Persistent spiking activity underlies working memory. The Journal of Neuroscience 38:7020–7028. https://doi.org/10.1523/JNEUROSCI.2486-17.2018
- The magical mystery four: how is working memory capacity limited, and why. Current Directions in Psychological Science 19:51–57. https://doi.org/10.1177/0963721409359277
- Working Memory Capacity, 1st ed. New York: Psychology Press. https://doi.org/10.4324/9781315625560
- Caudate encodes multiple computations for perceptual decisions. The Journal of Neuroscience 30:15747–15759. https://doi.org/10.1523/JNEUROSCI.2894-10.2010
- What is working memory capacity. In: Roediger HL, Nairne JS, Neath I, Surprenant AM, editors. The Nature of Remembering: Essays in Honor of Robert G. Crowder. American Psychological Association. pp. 297–314. https://doi.org/10.1037/10394-000
- Working memory capacity as executive attention. Current Directions in Psychological Science 11:19–23. https://doi.org/10.1111/1467-8721.00160
- Mnemonic coding of visual space in the monkey’s dorsolateral prefrontal cortex. Journal of Neurophysiology 61:331–349. https://doi.org/10.1152/jn.1989.61.2.331
- Memory related motor planning activity in posterior parietal cortex of macaque. Experimental Brain Research 70:216–220. https://doi.org/10.1007/BF00271862
- The neural basis of decision making. Annual Review of Neuroscience 30:535–574. https://doi.org/10.1146/annurev.neuro.29.051605.113038
- Revisiting the evidence for collapsing boundaries and urgency signals in perceptual decision-making. The Journal of Neuroscience 35:2476–2484. https://doi.org/10.1523/JNEUROSCI.2410-14.2015
- Quantile maximum likelihood estimation of response time distributions. Psychonomic Bulletin & Review 9:394–401. https://doi.org/10.3758/BF03196299
- Normalization of cell responses in cat striate cortex. Visual Neuroscience 9:181–197. https://doi.org/10.1017/S0952523800009640
- Modeling simple-cell direction selectivity with normalized, half-squared, linear operators. Journal of Neurophysiology 70:1885–1898. https://doi.org/10.1152/jn.1993.70.5.1885
- Mechanisms underlying cortical activity during value-guided choice. Nature Neuroscience 15:470–476. https://doi.org/10.1038/nn.3017
- Historical review of the significance of the cerebellum and the role of Purkinje cells in motor learning. Annals of the New York Academy of Sciences 978:273–288. https://doi.org/10.1111/j.1749-6632.2002.tb07574.x
- Cerebellar circuitry as a neuronal machine. Progress in Neurobiology 78:272–303. https://doi.org/10.1016/j.pneurobio.2006.02.006
- Control of mental activities by internal models in the cerebellum. Nature Reviews Neuroscience 9:304–313. https://doi.org/10.1038/nrn2332
- Dissecting executive control circuits with neuron types. Neuroscience Research 141:13–22. https://doi.org/10.1016/j.neures.2018.07.004
- A blanket of inhibition: functional inferences from dense inhibitory connectivity. Current Opinion in Neurobiology 26:96–102. https://doi.org/10.1016/j.conb.2013.12.015
- Opening holes in the blanket of inhibition: localized lateral disinhibition by VIP interneurons. The Journal of Neuroscience 36:3471–3480. https://doi.org/10.1523/JNEUROSCI.3646-15.2016
- Neural correlates of a decision in the dorsolateral prefrontal cortex of the macaque. Nature Neuroscience 2:176–185. https://doi.org/10.1038/5739
- Contrast transfer characteristics of visual short-term memory. Vision Research 36:2159–2166. https://doi.org/10.1016/0042-6989(95)00271-5
- A disinhibitory circuit mediates motor integration in the somatosensory cortex. Nature Neuroscience 16:1662–1670. https://doi.org/10.1038/nn.3544
- Inhibitory stabilization and visual coding in cortical circuits with multiple interneuron subtypes. Journal of Neurophysiology 115:1399–1409. https://doi.org/10.1152/jn.00732.2015
- Visual receptive field structure of cortical inhibitory neurons revealed by two-photon imaging guided recording. The Journal of Neuroscience 29:10520–10532. https://doi.org/10.1523/JNEUROSCI.1915-09.2009
- The cortex of the cerebellum. Scientific American 232:56–71. https://doi.org/10.1038/scientificamerican0175-56
- Speed-accuracy tradeoff by a control signal with balanced excitation and inhibition. Journal of Neurophysiology 114:650–661. https://doi.org/10.1152/jn.00845.2013
- The temporal dynamics of cortical normalization models of decision-making. Letters in Biomathematics 1:2Lofaro. https://doi.org/10.30707/LiB1.2Lofaro
- Separating value from choice: delay discounting activity in the lateral intraparietal area. The Journal of Neuroscience 30:5498–5507. https://doi.org/10.1523/JNEUROSCI.5742-09.2010
- Reward value-based gain control: divisive normalization in parietal cortex. The Journal of Neuroscience 31:10627–10639. https://doi.org/10.1523/JNEUROSCI.1237-11.2011
- Dynamic divisive normalization predicts time-varying value coding in decision-related circuits. The Journal of Neuroscience 34:16046–16057. https://doi.org/10.1523/JNEUROSCI.2851-14.2014
- Adaptive neural coding: from biological to behavioral decision-making. Current Opinion in Behavioral Sciences 5:91–99. https://doi.org/10.1016/j.cobeha.2015.08.008
- Interneurons of the neocortical inhibitory system. Nature Reviews Neuroscience 5:793–807. https://doi.org/10.1038/nrn1519
- Working memory and decision-making in a frontoparietal circuit model. The Journal of Neuroscience 37:12167–12186. https://doi.org/10.1523/JNEUROSCI.0343-17.2017
- Highly selective receptive fields in mouse visual cortex. The Journal of Neuroscience 28:7520–7536. https://doi.org/10.1523/JNEUROSCI.0623-08.2008
- What limits working memory capacity. Psychological Bulletin 142:758–799. https://doi.org/10.1037/bul0000046
- Visual short-term memory: A methodological caveat. Canadian Journal of Psychology / Revue Canadienne de Psychologie 28:24–31. https://doi.org/10.1037/h0081973
- The importance of falsification in computational cognitive modeling. Trends in Cognitive Sciences 21:425–433. https://doi.org/10.1016/j.tics.2017.03.011
- Neural correlates of biased competition in premotor cortex. The Journal of Neuroscience 31:7083–7088. https://doi.org/10.1523/JNEUROSCI.5681-10.2011
- Time-related decay or interference-based forgetting in working memory. Journal of Experimental Psychology. Learning, Memory, and Cognition 34:1561–1564. https://doi.org/10.1037/a0013356
- Modulation of neural activity by reward in medial intraparietal cortex is sensitive to temporal sequence of reward. Journal of Neurophysiology 112:1775–1789. https://doi.org/10.1152/jn.00533.2012
- Three groups of interneurons account for nearly 100% of neocortical GABAergic neurons. Developmental Neurobiology 71:45–61. https://doi.org/10.1002/dneu.20853
- A neuro-computational model of economic decisions. Journal of Neurophysiology 114:1382–1398. https://doi.org/10.1152/jn.00184.2015
- Emerging connections between cerebellar development, behaviour and complex brain disorders. Nature Reviews Neuroscience 20:298–313. https://doi.org/10.1038/s41583-019-0152-2
- Computational models of basal-ganglia pathway functions: focus on functional neuroanatomy. Frontiers in Systems Neuroscience 7:122. https://doi.org/10.3389/fnsys.2013.00122
- Neocortical layer 1: an elegant solution to top-down and bottom-up integration. Annual Review of Neuroscience 44:221–252. https://doi.org/10.1146/annurev-neuro-100520-012117
- Natural signal statistics and sensory gain control. Nature Neuroscience 4:819–825. https://doi.org/10.1038/90526
- Visual attention and flexible normalization pools. Journal of Vision 13:25. https://doi.org/10.1167/13.1.25
- Neural basis of a perceptual decision in the parietal cortex (area LIP) of the rhesus monkey. Journal of Neurophysiology 86:1916–1936. https://doi.org/10.1152/jn.2001.86.4.1916
- Morphology, molecular codes, and circuitry produce the three-dimensional complexity of the cerebellum. Annual Review of Cell and Developmental Biology 23:549–577. https://doi.org/10.1146/annurev.cellbio.23.090506.123237
- A biophysically based neural model of matching law behavior: melioration by stochastic synapses. The Journal of Neuroscience 26:3731–3744. https://doi.org/10.1523/JNEUROSCI.5159-05.2006
- Choice-theoretic foundations of the divisive normalization model. Journal of Economic Behavior & Organization 164:148–165. https://doi.org/10.1016/j.jebo.2019.05.026
- The dynamical stability of reverberatory neural circuits. Biological Cybernetics 87:471–481. https://doi.org/10.1007/s00422-002-0363-9
- Somatostatin-expressing neurons in cortical networks. Nature Reviews Neuroscience 17:401–409. https://doi.org/10.1038/nrn.2016.53
- The time course of perceptual choice: the leaky, competing accumulator model. Psychological Review 108:550–592. https://doi.org/10.1037/0033-295x.108.3.550
- Synaptic basis of cortical persistent activity: the importance of NMDA receptors to working memory. The Journal of Neuroscience 19:9587–9603. https://doi.org/10.1523/JNEUROSCI.19-21-09587.1999
- Neural Dynamics and circuit mechanisms of decision-making. Current Opinion in Neurobiology 22:1039–1046. https://doi.org/10.1016/j.conb.2012.08.006
- A disinhibitory circuit motif and flexible information routing in the brain. Current Opinion in Neurobiology 49:75–83. https://doi.org/10.1016/j.conb.2018.01.002
- Role of the indirect pathway of the basal ganglia in perceptual decision making. The Journal of Neuroscience 35:4052–4064. https://doi.org/10.1523/JNEUROSCI.3611-14.2015
- GABAergic inhibition in the neostriatum. Progress in Brain Research 160:91–110. https://doi.org/10.1016/S0079-6123(06)60006-X
- A recurrent network mechanism of time integration in perceptual decisions. The Journal of Neuroscience 26:1314–1328. https://doi.org/10.1523/JNEUROSCI.3733-05.2006
- Neural circuit dynamics underlying accumulation of time-varying evidence during perceptual decision making. Frontiers in Computational Neuroscience 1:6. https://doi.org/10.3389/neuro.10.006.2007
- A dendritic disinhibitory circuit mechanism for pathway-specific gating. Nature Communications 7:12815. https://doi.org/10.1038/ncomms12815
- Specific evidence of low-dimensional continuous attractor dynamics in grid cells. Nature Neuroscience 16:1077–1084. https://doi.org/10.1038/nn.3450
Article and author information
Funding
National Institutes of Health (R01DA038063)
- Paul Glimcher
National Institutes of Health (R01DA043676)
- Paul Glimcher
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Copyright
© 2023, Shen et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.