A functional model of adult dentate gyrus neurogenesis
Abstract
In adult dentate gyrus neurogenesis, the link between maturation of newborn neurons and their function, such as behavioral pattern separation, has remained puzzling. By analyzing a theoretical model, we show that the switch from excitation to inhibition of the GABAergic input onto maturing newborn cells is crucial for their proper functional integration. When the GABAergic input is excitatory, cooperativity drives the growth of synapses such that newborn cells become sensitive to stimuli similar to those that activate mature cells. When GABAergic input switches to inhibitory, competition pushes the configuration of synapses onto newborn cells toward stimuli that are different from previously stored ones. This enables the maturing newborn cells to code for concepts that are novel, yet similar to familiar ones. Our theory of newborn cell maturation explains both how adult-born dentate granule cells integrate into the preexisting network and why they promote separation of similar but not distinct patterns.
Introduction
In the adult mammalian brain, neurogenesis, the production of new neurons, is restricted to a few brain areas, such as the olfactory bulb and the dentate gyrus (Deng et al., 2010). The dentate gyrus is a major entry point of input from cortex, primarily entorhinal cortex (EC), to the hippocampus (Amaral et al., 2007), which is believed to be a substrate of learning and memory (Jarrard, 1993). Adultborn cells in dentate gyrus mostly develop into dentate granule cells (DGCs), the main excitatory cells that project to area CA3 of hippocampus (Deng et al., 2010).
The properties of rodent adult-born DGCs change as a function of their maturation stage, until they become indistinguishable from other mature DGCs at approximately 8 weeks (Deng et al., 2010; Johnston et al., 2016; Figure 1a). Many of them die before they fully mature (Dayer et al., 2003). Their survival is experience-dependent and relies on NMDA receptor activation (Tashiro et al., 2006). Initially, newborn DGCs have enhanced excitability (Schmidt-Hieber et al., 2004; Li et al., 2017) and stronger synaptic plasticity than mature DGCs, reflected by a larger long-term potentiation (LTP) amplitude and a lower threshold for induction of LTP (Wang et al., 2000; Schmidt-Hieber et al., 2004; Ge et al., 2007). Furthermore, after 4 weeks of maturation, adult-born DGCs have only weak connections to interneurons, while at 7 weeks of age, their activity causes indirect inhibition of mature DGCs (Temprana et al., 2015).
Newborn DGCs receive no direct connections from mature DGCs (Deshpande et al., 2013; Alvarez et al., 2016) (yet see Vivar et al., 2012), but are indirectly activated via interneurons (Alvarez et al., 2016; Heigele et al., 2016). At about 3 weeks after birth, the γ-aminobutyric acid (GABAergic) input from interneurons to adult-born DGCs switches from excitatory in the early phase to inhibitory in the late phase of maturation (Ge et al., 2006; Deng et al., 2010) (‘GABA-switch’, Figure 1a). As in a similar transition during embryonic and early postnatal stages (Wang and Kriegstein, 2011), the GABA-switch is caused by a change in the expression profile of chloride cotransporters. In the early phase of maturation, newborn cells express the $\text{Na}^{+}$-$\text{K}^{+}$-$2\text{Cl}^{-}$ cotransporter NKCC1, which leads to a high intracellular chloride concentration. Hence, the GABA reversal potential is higher than the resting potential (Ge et al., 2006; Heigele et al., 2016), and GABAergic inputs lead to an outflow of $\text{Cl}^{-}$ ions through the ${\text{GABA}}_{A}$ ionic receptors, which results in depolarization of the newborn cell (Ben-Ari, 2002; Owens and Kriegstein, 2002). In the late phase of maturation, expression of the $\text{K}^{+}$-$\text{Cl}^{-}$-coupled cotransporter KCC2 sets in, which lowers the intracellular chloride concentration of the newborn cell to levels similar to those of mature cells, leading to a hyperpolarization of the cell membrane due to $\text{Cl}^{-}$ inflow upon GABAergic stimulation (Ben-Ari, 2002; Owens and Kriegstein, 2002). The transition from depolarizing (excitatory) to hyperpolarizing (inhibitory) effects of GABA is referred to as the ‘GABA-switch’. It has been shown that GABAergic inputs are crucial for the integration of newborn DGCs into the preexisting circuit (Ge et al., 2006; Chancey et al., 2013; Alvarez et al., 2016; Heigele et al., 2016).
The mammalian dentate gyrus contains – just like hippocampus in general – a myriad of inhibitory cell types (Freund and Buzsáki, 1996; Somogyi and Klausberger, 2005; Klausberger and Somogyi, 2008), including basket cells, chandelier cells, and hilar cells (Figure 1—figure supplement 1). Basket cells can be subdivided in two categories: some express cholecystokinin (CCK) and vasoactive intestinal polypeptide (VIP), while the others express parvalbumin (PV) and are fastspiking (Freund and Buzsáki, 1996; Amaral et al., 2007). Chandelier cells also express PV (Freund and Buzsáki, 1996). Overall, it has been estimated that PV is expressed in 15–21% of all dentate GABAergic cells (Freund and Buzsáki, 1996) and in 20–25% of the GABAergic neurons in the granule cell layer (Houser, 2007). Amongst the GABAergic hilar cells, 55% express somatostatin (SST) (Houser, 2007) and somatostatinpositive interneurons (SSTINs) represent about 16% of the GABAergic neurons in the dentate gyrus as a whole (Freund and Buzsáki, 1996). While axons of hilar interneurons stay in the hilus and provide perisomatic inhibition onto dentate GABAergic cells, axons of hilarperforantpathassociated interneurons (HIPP) extend to the molecular layer and provide dendritic inhibition onto both DGCs and interneurons (Yuan et al., 2017). HIPP axons generate lots of synaptic terminals and extend as far as 3.5 mm along the septotemporal axis of the dentate gyrus (Amaral et al., 2007). PVexpressing interneurons (PVINs) and SSTINs both target adultborn DGCs early (after 2–3 weeks) in their maturation (Groisman et al., 2020). PVINs provide both feedforward inhibition and feedback inhibition (also called lateral inhibition) to the DGCs (Groisman et al., 2020). In general, SSTINs provide lateral, but not feedforward, inhibition onto DGCs (Stefanelli et al., 2016; Groisman et al., 2020; Figure 1—figure supplement 1).
Adultborn DGCs are preferentially reactivated by stimuli similar to the ones they experienced during their early phase of maturation, up to 3 weeks after cell birth (Tashiro et al., 2007). Even though the amount of newly generated cells per month is rather low (3–6% of the total DGCs population [van Praag et al., 1999; Cameron and McKay, 2001]), adultborn DGCs are critical for behavioral pattern separation (Clelland et al., 2009; Sahay et al., 2011a; Jessberger et al., 2009), in particular in tasks where similar stimuli or contexts have to be discriminated (Clelland et al., 2009; Sahay et al., 2011a). However, the functional role of adultborn DGCs is controversial (Sahay et al., 2011b; Aimone et al., 2011). One view is that newborn DGCs contribute to pattern separation through a modulatory role (Sahay et al., 2011b). Another view suggests that newborn DGCs act as encoding units that become sensitive to features of the environment which they encounter during a critical window of maturation (Kee et al., 2007; Tashiro et al., 2007). Some authors have even challenged the role of newborn DGCs in pattern separation in the classical sense and have proposed a pattern integration effect instead (Aimone et al., 2011), while others suggest a dynamical (Aljadeff et al., 2015; ShaniNarkiss et al., 2020) or forgetting (Akers et al., 2014) role for newborn DGCs. Within that broader controversy, we ask two specific questions: First, why are GABAergic inputs crucial for the integration of newborn DGCs into the preexisting circuit? And second, why are newborn DGCs particularly important in tasks where similar stimuli or contexts have to be discriminated?
To address these questions, we present a model of how newborn DGCs integrate into the preexisting circuit. In contrast to earlier models where synaptic input connections onto newborn cells were assumed to be strong enough to drive them (Chambers et al., 2004; Becker, 2005; Crick and Miranker, 2006; Wiskott et al., 2006; Chambers and Conroy, 2007; Aimone et al., 2009; Appleby and Wiskott, 2009; Weisz and Argibay, 2009; Temprana et al., 2015; Finnegan and Becker, 2015; DeCostanzo et al., 2018), our model uses an unsupervised biologically plausible Hebbian learning rule that makes synaptic connections between EC and newborn DGCs either disappear or grow from small values at birth to values that eventually enable feedforward input from EC to drive DGCs. Contrary to previous modeling studies, our plasticity model does not require an artificial renormalization of synaptic connection weights since model weights are naturally bounded by the synaptic plasticity rule. We show that learning with a biologically plausible plasticity rule is possible thanks to the GABAswitch, which has been overlooked in previous modeling studies. Specifically, the growth of synaptic weights from small values is supported in our model by the excitatory action of GABA, whereas, after the switch, specialization of newborn cells arises from competition between DGCs, triggered by the inhibitory action of GABA. Furthermore, our theory of adultborn DGCs integration yields a transparent explanation of why newborn cells favor pattern separation of similar stimuli, but do not impact pattern separation of distinct stimuli.
Results
We model a small patch of cells within dentate gyrus as a recurrent network of 100 DGCs and 25 GABAergic interneurons, omitting the Mossy cells for the sake of simplicity (Figure 1b). The modeled interneurons correspond to SSTINs from the HIPP category, as they are the providers of feedback inhibition to DGCs through dendritic projections (Stefanelli et al., 2016; Yuan et al., 2017; Groisman et al., 2020; Figure 1—figure supplement 1). The activity of a DGC with index $i$ and an interneuron with index $k$ is described by their continuous firing rates ${\nu}_{i}$ and ${\nu}_{k}^{I}$, respectively. Firing rates are modeled by neuronal frequency–current curves that vanish for weak input and increase if the total input into a neuron is larger than a firing threshold. Since newborn DGCs exhibit enhanced excitability early in maturation (SchmidtHieber et al., 2004; Li et al., 2017), the firing threshold of model neurons increases during maturation from a lower to a higher value (Materials and methods). Connectivity in a localized patch of dentate neurons is high: DGCs densely project to GABAergic interneurons (Acsády et al., 1998), and SSTINs heavily project to cells in their neighborhood (Amaral et al., 2007). Hence, in the recurrent network model, each model DGC projects to, and receives input from, a given interneuron with probability 0.9. The exact percentage of GABAergic neurons (or SSTINs) in the dentate gyrus as a whole is not known, but has been estimated at about 10% and only a fraction of these are SSTINs (Freund and Buzsáki, 1996). The number of inhibitory neurons in our model network might therefore seem too high. However, our results are robust to substantial changes in the number of inhibitory neurons (Supplementary file 2).
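The network layout described above can be sketched in a few lines. This is only an illustration: the threshold-linear shape of the frequency–current curve, the independent sampling of the two connection directions, and the names `firing_rate` and `random_connectivity` are assumptions made here; the actual curve and parameters are specified in Materials and methods.

```python
import random

N_DGC = 100   # number of model DGCs (from the text)
N_INT = 25    # number of model interneurons (from the text)
P_CONN = 0.9  # connection probability (from the text)

def firing_rate(total_input, threshold):
    """Rectified frequency-current curve (assumed shape).

    The rate vanishes for input below the firing threshold and
    increases with the input above it.
    """
    return max(0.0, total_input - threshold)

def random_connectivity(n_rows, n_cols, p, rng):
    """Binary connectivity matrix; each entry is 1 with probability p."""
    return [[1 if rng.random() < p else 0 for _ in range(n_cols)]
            for _ in range(n_rows)]

rng = random.Random(0)
# Assumption: both directions are drawn independently of each other.
dgc_to_int = random_connectivity(N_DGC, N_INT, P_CONN, rng)
int_to_dgc = random_connectivity(N_INT, N_DGC, P_CONN, rng)

density = sum(map(sum, dgc_to_int)) / (N_DGC * N_INT)
print(f"DGC-to-interneuron connection density: {density:.3f}")
```

Raising `threshold` for a given cell mimics the decrease in excitability during maturation that the text describes.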
Each of the 100 model DGCs receives input from a set of 144 model EC cells (Figure 1b). In the rat, the number of DGCs has been estimated to be about $10^{6}$, while the number of EC input cells is estimated to be about $2\cdot 10^{5}$ (Andersen et al., 2007), yielding an expansion factor from EC to dentate gyrus of about 5. Theoretical analysis suggests that the expansion of the number of neurons enhances decorrelation of the representation of input patterns (Marr, 1969; Albus, 1971; Marr, 1971; Rolls and Treves, 1998) and promotes pattern separation (Babadi and Sompolinsky, 2014). Our standard network model does not reflect this expansion because we want to highlight the particular ability of adult neurogenesis in combination with the GABA-switch to decorrelate input patterns independently of specific choices of the network architecture. However, we show later that an enlarged network with an expansion from 144 model EC cells to 700 model DGCs (similar to the anatomical expansion factor) yields similar results.
At birth, a DGC with index $i$ does not receive synaptic glutamatergic input yet. Hence, the connection from any model EC cell with index $j$ is initialized at ${w}_{ij}=0$. The growth or decay of the synaptic strength ${w}_{ij}$ of the connection from $j$ to $i$ is controlled by a Hebbian plasticity rule (Figure 1c):
where ${x}_{j}$ is the firing rate of the presynaptic EC neuron, η (‘learning rate’) is the susceptibility of a cell to synaptic plasticity, and $\alpha ,\beta ,\gamma $ are positive parameters (Materials and methods, Table 1). The first term on the right-hand side of Equation (1) describes LTP whenever the presynaptic neuron is active (${x}_{j}>0$) and the postsynaptic firing rate ${\nu}_{i}$ is above a threshold θ; the second term on the right-hand side of Equation (1) describes long-term depression (LTD) whenever the presynaptic neuron is active and the postsynaptic firing rate is positive but below the threshold θ; LTD stops if the synaptic weight is zero. Such a combination of LTP and LTD is consistent with experimental data (Artola et al., 1990; Sjöström et al., 2001) as shown in earlier rate-based (Bienenstock et al., 1982) or spike-based (Pfister and Gerstner, 2006) plasticity models. The third term on the right-hand side of Equation (1) implements heterosynaptic plasticity (Chistiakova et al., 2014; Zenke and Gerstner, 2017): whenever strong presynaptic input arriving at synapses $k\ne j$ drives the firing of postsynaptic neuron $i$ at a rate above θ, the weight of a synapse $j$ is downregulated if synapse $j$ does not receive any input, while the weights of synapses $k\ne j$ are simultaneously increased due to the first term (Lynch et al., 1977). Importantly, the threshold condition for the third term (postsynaptic rate above θ) is the same as that for induction of LTP in the first term so that if some synapses are potentiated, silent synapses are depressed. In the model, heterosynaptic interaction between synapses is induced since information about postsynaptic activity is shared across synapses. This could be achieved in biological neurons via backpropagating action potentials or similar depolarization of the postsynaptic membrane potential at several synaptic locations; alternatively, heterosynaptic cross-talk could be implemented by signaling molecules.
Note that since our neuron model is a point neuron, all synapses are neighbors of each other. In our model, the ‘heterosynaptic’ term has a negative sign which ensures that the weights cannot grow without bounds (Materials and methods). In this sense, the third term has a ‘homeostatic’ function (Zenke and Gerstner, 2017), yet acts on a time scale faster than experimentally observed homeostatic synaptic plasticity (Turrigiano et al., 1998).
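The three terms described above can be illustrated with a short sketch of a rule of this type. The exact functional form is an assumption made here for illustration: in particular, the rectifications, the cubic dependence of the heterosynaptic term on the postsynaptic rate, the clipping of weights at zero, and the parameter values in the usage note are hypothetical. Equation (1) and Table 1 remain the reference.

```python
def rectify(x):
    """Return x if positive, else 0."""
    return x if x > 0.0 else 0.0

def delta_w(w, x_pre, nu_post, theta, eta, alpha, beta, gamma):
    """Weight change for one synapse under an assumed three-term rule.

    ltp    : presynaptic activity with postsynaptic rate above theta
    ltd    : presynaptic activity with postsynaptic rate in (0, theta)
    hetero : weight-dependent decay whenever the postsynaptic rate is
             above theta, even if this synapse receives no input
    """
    ltp = gamma * x_pre * nu_post * rectify(nu_post - theta)
    ltd = alpha * x_pre * nu_post * rectify(theta - nu_post)
    hetero = beta * w * rectify(nu_post - theta) * nu_post ** 3  # assumed power
    return eta * (ltp - ltd - hetero)

def update_weight(w, x_pre, nu_post, theta, eta, alpha, beta, gamma):
    """Apply one update; clipping at zero makes LTD stop at w = 0."""
    return max(0.0, w + delta_w(w, x_pre, nu_post, theta, eta,
                                alpha, beta, gamma))
```

With hypothetical values `theta=0.5` and `eta=alpha=beta=gamma=1.0`, an active synapse is potentiated when `nu_post=0.8`, depressed when `nu_post=0.2`, and a silent synapse (`x_pre=0.0`) with `w=0.1` is weakened when `nu_post=0.8`. Because the third term grows with `w`, it opposes unbounded growth of the weights.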
We ask whether such a biologically plausible plasticity rule enables adultborn DGCs to be integrated in an existing network of mature cells. To address this question, we exploit two observations (Figure 1a): first, the effect of interneurons onto newborn DGCs exhibits a GABAswitch from excitatory to inhibitory after about three weeks of maturation (Ge et al., 2006; Deng et al., 2010) and, second, newborn DGCs receive input from interneurons early in their maturation (before the third week), but project back to interneurons only later (Temprana et al., 2015). For simplicity, no plasticity rule was implemented within the dentate gyrus: connections between newborn DGCs and inhibitory cells are either absent or present with a fixed value (see below). However, before integration of adultborn DGCs can be addressed, an adultstage network where mature cells already store some memories has to be constructed.
Mature neurons represent prototypical input patterns
In an adultstage network, some mature cells already have a functional role. Hence, we start with a network that already has strong random ECtoDGC connection weights (Materials and methods). We then pretrain our network of 100 DGCs using the same learning rule (Equation (1), with identical learning rate η for all DGCs) that we will use later for the integration of newborn cells. For the stimulation of EC cells, we apply patterns representing thousands of handwritten digits in different writing styles from MNIST, a standard data set in artificial intelligence (Lecun et al., 1998). Even though we do not expect EC neurons to show a twodimensional arrangement, the use of twodimensional patterns provides a simple way to visualize the activity of all 144 EC neurons in our model (Figure 1d). We implicitly model feedforward inhibition from PVINs (Groisman et al., 2020; Figure 1—figure supplement 1) by normalizing input patterns so that all inputs have the same amplitude (Materials and methods). Below, we present results for a representative combination of three digits (digits 3, 4, and 5), but other combinations of digits have also been tested (Supplementary file 1).
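The normalization of input patterns can be sketched as follows, assuming that "same amplitude" means unit Euclidean (L2) norm; the precise definition is given in Materials and methods, and the function name `normalize_pattern` is introduced here only for illustration.

```python
import math

def normalize_pattern(pattern):
    """Scale a pattern of EC firing rates to unit L2 norm.

    Assumption: equal amplitude across patterns is taken to mean
    equal Euclidean norm. An all-zero pattern is returned unchanged.
    """
    norm = math.sqrt(sum(v * v for v in pattern))
    if norm == 0.0:
        return list(pattern)
    return [v / norm for v in pattern]

# Toy example with 4 inputs instead of 144 EC cells.
faint = normalize_pattern([0.1, 0.0, 0.2, 0.0])
bold = normalize_pattern([1.0, 0.0, 2.0, 0.0])
print(faint, bold)
```

In this toy case, a faint and a bold version of the same spatial pattern map to the same normalized input, so overall stimulus strength cannot by itself decide which DGC wins.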
After pretraining with patterns from digits 3 and 4 in a variety of writing styles, we examine the receptive field of each DGC. Each receptive field, consisting of the connections from all 144 EC neurons onto one DGC, is characterized by its spatial structure (i.e. the pattern of connection weights) and its total strength (i.e. the efficiency of the optimal stimulus to drive the cell). We observe that out of the 100 DGCs, some have developed spatial receptive fields that correspond to different writing styles of digit 3, others receptive fields that correspond to variants of digit 4 (Figure 1e).
Behavioral discrimination has been shown to be correlated with classification accuracy based on DGC population activity (Woods et al., 2020). Hence, to quantify the representation quality, we compute classification performance by a linear classifier that is driven by the activity of our 100 DGC model cells (Materials and methods). At the end of pretraining, the classification performance for patterns of digits 3 and 4 from a distinct test set not used during pretraining is high: 99.25% (classification performance on digit 3: 98.71%; digit 4: 99.80%), indicating that nearly all input patterns of the two digits are well represented by the network of mature DGCs. The median classification performance for 10 random combinations of two groups of pretrained digits is 98.54%, the 25th percentile 97.26%, and the 75th percentile 99.5% (Supplementary file 1).
A detailed mathematical analysis (Materials and methods) shows that heterosynaptic plasticity in Equation (1) ensures that the total strength of the receptive field of each selective DGC converges to a stable value which is similar for selective DGCs confirming the homeostatic function of heterosynaptic plasticity (Zenke and Gerstner, 2017). As a consequence, synaptic weights are intrinsically bounded without the need to impose hard bounds on the weight dynamics. Moreover, we find that the spatial structure of the receptive field represents the weighted average of all those input patterns for which that DGC is responsive. The mathematical analysis also shows that those DGCs that do not develop selectivity have weak synaptic connections and a very low total strength of the receptive field.
After convergence of synaptic weights during pretraining, selective DGCs are considered mature cells. Mature cells are less plastic than newborn cells (SchmidtHieber et al., 2004; Ge et al., 2007). So in the following, unless specified otherwise, we set $\eta =0$ in Equation (1) for mature cells (feedforward connection weights from EC to mature cells remain therefore fixed). A scenario where mature cells retain synaptic plasticity is also investigated (see Robustness of the model and Supplementary file 4). Some DGCs did not develop any strong weight patterns during pretraining and exhibit unselective receptive fields (highlighted in red in Figure 1e). We classify these as unresponsive units.
Newborn neurons become selective for novel patterns during maturation
In our main neurogenesis model, we replace unresponsive model units by plastic newborn DGCs ($\eta >0$ in Equation (1)), which receive lateral GABAergic input but do not receive feedforward input yet (all weights from EC are set to zero). The replacement of unresponsive neurons reflects the fact that unresponsive units have weak synaptic connections and, experimentally, a lack of NMDA receptor activation has been shown to be deleterious for the survival of newborn DGCs (Tashiro et al., 2006). To mimic exposure of an animal to a novel set of stimuli, we now add input patterns from digit 5 to the set of presented stimuli, which was previously limited to patterns of digits 3 and 4. The novel patterns from digit 5 are randomly interspersed into the sequence of patterns from digits 3 and 4; in other words, the presentation sequence was not optimized with a specific goal in mind.
We postulate that functional integration of newborn DGCs requires the twostep maturation process caused by the GABAswitch from excitation to inhibition. Since excitatory GABAergic input potentially increases correlated activity within the dentate gyrus network, we predict that newborn DGCs respond to familiar stimuli during the early phase of maturation, but not during the late phase, when inhibitory GABAergic input leads to competition.
To test this hypothesis, our model newborn DGCs go through two maturation phases (Materials and methods). The early phase of maturation is cooperative because, for each pattern presentation, activated mature DGCs indirectly excite the newborn DGCs via GABAergic interneurons. We assume that in natural settings, the activation of ${\text{GABA}}_{A}$ receptors is low enough that the mean membrane potential remains below the chloride reversal potential at which shunting inhibition would be induced (Heigele et al., 2016). In this regime, the net effect of synaptic activity is hence excitatory. This lateral activation of newborn DGCs drives the growth of their receptive fields in a direction similar to those of the currently active mature DGCs. Consistent with our hypothesis we find that, at the end of the early phase of maturation, newborn DGCs show a receptive field corresponding to a mixture of several input patterns (Figure 2a).
In the late phase of maturation, model newborn DGCs receive inhibitory GABAergic input from interneurons, similar to the input received by mature DGCs. Given that at the end of the early phase, newborn DGCs have receptive fields similar to those of mature DGCs, lateral inhibition induces competition with mature DGCs for activation during presentation of patterns from the novel digit. Because model newborn DGCs start their late phase of maturation with a higher excitability (lower threshold) compared to mature DGCs, consistent with observed enhanced excitability of newborn cells (SchmidtHieber et al., 2004; Li et al., 2017), the activation of newborn DGCs is facilitated for those input patterns for which no mature DGC has preexisting selectivity. Therefore, in the late phase of maturation, competition drives the synaptic weights of most newborn DGCs toward receptive fields corresponding to different subcategories of the ensemble of input patterns of the novel digit 5 (Figure 2b).
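The difference between the two maturation phases can be summarized by the sign with which interneuron activity enters the total input of a newborn DGC. The sketch below is a simplification under stated assumptions: a single shared weight `w_gaba`, a purely additive lateral input, and the function name `newborn_total_input` are all hypothetical, and the model's actual equations are in Materials and methods.

```python
def newborn_total_input(feedforward_drive, interneuron_rates, w_gaba, phase):
    """Total input to a newborn DGC in the early or late phase.

    early: GABAergic input is excitatory (positive sign), so activity
           of interneurons driven by mature DGCs helps the newborn cell fire.
    late : GABAergic input is inhibitory (negative sign), so the newborn
           cell competes with the cells that drive the interneurons.
    """
    if phase == "early":
        sign = 1.0
    elif phase == "late":
        sign = -1.0
    else:
        raise ValueError("phase must be 'early' or 'late'")
    lateral = sign * w_gaba * sum(interneuron_rates)
    return feedforward_drive + lateral

rates = [0.2, 0.5, 0.0]  # toy interneuron firing rates
early = newborn_total_input(0.0, rates, w_gaba=1.0, phase="early")
late = newborn_total_input(0.3, rates, w_gaba=1.0, phase="late")
print(early, late)
```

In the early phase the newborn cell can be activated with zero feedforward weights, which is what allows Hebbian growth to start; in the late phase the same lateral pathway subtracts from the feedforward drive.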
The total strength of the receptive field of a given DGC can be characterized by the sum of the squared synaptic weights of all feedforward projections onto the cell (i.e. the square of the L2-norm). During maturation, the L2-norm of the feedforward weights onto newborn DGCs increases (Figure 2e), indicating an increase in total glutamatergic innervation, e.g., through an increase in the number and size of spines (Zhao et al., 2006). Nevertheless, the distribution of firing rates of newborn DGCs is shifted to lower values at the end of the late phase compared to the end of the early phase of maturation (Figure 2c,d), consistent with in vivo calcium imaging recordings showing that newborn DGCs are more active than mature DGCs (Danielson et al., 2016).
We emphasize that upon presentation of a pattern of a given digit, only those DGCs with a receptive field similar to the specific writing style of the presented pattern become strongly active, others fire at a medium firing rate, yet others at a low rate (Figure 2g). As a consequence, the firing rate of a particular newborn DGC at the end of its maturation to a pattern from digit 5 is strongly modulated by the specific choice of stimulation pattern within the class of ‘5’s. Analogous results are obtained for patterns from pretrained digits 3 and 4 (Figure 2—figure supplement 1). Hence, the ensemble of DGCs is effectively performing pattern separation within each digit class as opposed to a simple ternary classification task. The selectivity of newborn DGCs develops during maturation. Indeed, during the late, competitive, phase, the percentage of active newborn DGCs decreases, both upon presentation of familiar patterns (digits 3 and 4), as well as upon presentation of novel patterns (digit 5) (Figure 2f). This reflects the development of the selectivity of our model newborn DGCs from broad to narrow tuning, consistent with experimental observations (MarínBurgin et al., 2012; Danielson et al., 2016).
If two novel ensembles of digits (instead of a single one) are introduced during maturation of newborn DGCs, we observe that some newborn DGCs become selective for one of the novel digits, while others become selective for the other novel digit (Figure 2—figure supplement 2). This was expected, since we have found earlier that DGCs are becoming selective for different prototype writing styles even within a digit category; hence introducing several additional digit categories of novel patterns simply increases the prototype diversity. Therefore, newborn DGCs can ultimately promote separation of several novel overarching categories of patterns, no matter if they are learned simultaneously or sequentially (Figure 2—figure supplement 2).
Adult-born neurons promote better discrimination
As above, we compute classification performance of our model network as a surrogate for behavioral discrimination (Woods et al., 2020). At the end of the late phase of maturation of newborn DGCs, we obtain an overall classification performance of 94.56% for the three ensembles of digits (classification performance for digit 3: 90.50%; digit 4: 98.17%; digit 5: 95.18%). Confusion matrices show that although novel patterns are not well classified at the end of the early phase of maturation (Figure 3e), they are as well classified as pretrained patterns at the end of the late phase of maturation (Figure 3f).
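The overall performance can be related to the per-digit values as a count-weighted average. The sketch below assumes that evaluation uses the full MNIST test split, which contains 1010, 982, and 892 images of digits 3, 4, and 5, respectively; the text does not state these counts, so this is a consistency check under that assumption rather than the authors' procedure.

```python
def overall_accuracy(per_class_accuracy, class_counts):
    """Count-weighted average of per-class accuracies (fractions in [0, 1])."""
    correct = sum(per_class_accuracy[c] * class_counts[c]
                  for c in per_class_accuracy)
    total = sum(class_counts[c] for c in per_class_accuracy)
    return correct / total

# Assumed number of test images per digit (full MNIST test split).
counts = {3: 1010, 4: 982, 5: 892}
# Per-digit performance at the end of the late phase (from the text).
accuracy = {3: 0.9050, 4: 0.9817, 5: 0.9518}

overall = overall_accuracy(accuracy, counts)
print(f"overall: {100 * overall:.2f}%")
```

Under this assumption the weighted average agrees with the reported overall performance to two decimal places.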
We compare this performance with that of a network where all three digit ensembles are pretrained simultaneously, starting from random weights (Figure 3a, control 1). In this case, the overall classification performance is 92.09% (classification performance for digit 3: 86.83%; digit 4: 98.78%; digit 5: 90.70%). The confusion matrix shows that all three digits are decently classified, but with an overall lower performance (Figure 3d). Across 10 simulation experiments, classification performance is significantly higher when a novel ensemble of patterns is learned sequentially by newborn DGCs (${P}_{2}$; Supplementary file 1) than if all patterns are learned simultaneously (${P}_{1}$; Supplementary file 1). Indeed, the distribution of ${P}_{2}-{P}_{1}$ for the 10 simulation experiments has a mean that is significantly different from zero (Wilcoxon signed-rank test: p-val = 0.0020, Wilcoxon signed rank = 55; one-way t-test: p-val = 0.0269, t-stat = 2.6401, df = 9; Supplementary file 1).
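The reported Wilcoxon statistic can be checked by exhaustive enumeration. For 10 paired experiments the largest possible sum of positive ranks is 10 · 11 / 2 = 55, reached only when all 10 differences are positive. The sketch below assumes that the reported statistic is the sum of positive ranks, that there are no ties or zero differences, and that the test is two-sided; the function name is introduced here for illustration.

```python
from itertools import product

def signed_rank_exact_p(n, w_plus):
    """Exact two-sided p-value of the Wilcoxon signed-rank statistic.

    Enumerates all 2**n sign assignments of the ranks 1..n, which are
    equally likely under the null hypothesis, and counts those whose
    positive-rank sum is at least as far from the null mean as w_plus.
    """
    mean = n * (n + 1) / 4
    observed = abs(w_plus - mean)
    extreme = 0
    for signs in product((0, 1), repeat=n):
        w = sum(rank for rank, s in zip(range(1, n + 1), signs) if s)
        if abs(w - mean) >= observed:
            extreme += 1
    return extreme / 2 ** n

p_value = signed_rank_exact_p(10, 55)
print(f"{p_value:.4f}")
```

Only two of the 1024 sign assignments are this extreme (all ranks positive or all negative), giving 2/1024 ≈ 0.0020.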
The GABA-switch guides learning of novel representations
To assess whether maturation of newborn DGCs promotes learning of a novel ensemble of digit patterns, we compare our results with two control models without neurogenesis (controls 2 and 3).
In control 2, similar to the neurogenesis case, the feedforward weights and thresholds of mature DGCs are fixed (learning rate $\eta =0$) after pretraining with patterns from digits 3 and 4, while the thresholds and weights of all unresponsive neurons remain plastic ($\eta >0$) upon introduction of patterns from the novel digit 5. The only differences to the model with neurogenesis are that unresponsive neurons: (1) keep their feedforward weights (i.e. no reinitialization to zero values) and (2) keep the same connections from and to inhibitory neurons. In this case, we find that the previously unresponsive DGCs do not become selective for the novel digit 5, no matter during how many epochs patterns are presented (we went up to 100 epochs) (Figure 3b, control 2). Therefore, if patterns from digit 5 are presented to the network, the model fails to discriminate them from the previously learned digits 3 and 4: the overall classification performance is 81.69% (classification performance for digit 3: 85.94%; digit 4: 97.56%; digit 5: 59.42%). This result suggests that integration of newborn DGCs is beneficial for sequential learning of novel patterns.
In control 3, all DGCs keep plastic feedforward weights (learning rate $\eta >0$) after pretraining and introduction of the novel digit 5, no matter if they became selective or not for the pretrained digits 3 and 4. We observe that in the case where all neurons are plastic, learning of the novel digit induces a change in selectivity of mature neurons. Several DGCs switch their selectivity to become sensitive to the novel digit (Figure 3c), while none of the previously unresponsive units becomes selective for presented patterns (compare with Figure 1e). In contrast to the model with neurogenesis, we observe a drop in classification performance to 90.92% (classification performance for digit 3: 85.45%; digit 4: 98.37%; digit 5: 88.90%). We find that the classification performance for digit 3 is the one which decreases the most. This is due to the fact that many DGCs previously selective for digit 3 modified their weights to become selective for digit 5. Importantly, the more novel patterns are introduced, the more overwriting of previously stored memories occurs. Hence, if all DGCs remain plastic, discrimination between a novel pattern and a familiar pattern stored long ago is impaired.
Maturation of newborn neurons shapes the representation of novel patterns
Since each input pattern stimulates slightly different, yet overlapping, subsets of the 100 model DGCs in a sparse code such that about 20 DGCs respond to each pattern (Figure 2g), there is no simple one-to-one assignment between neurons and patterns. In order to visualize the activity patterns of the ensemble of DGCs, we perform dimensionality reduction. We construct a two-dimensional space using the activity patterns of the network at the end of the late phase of maturation of newborn DGCs trained with ‘3’s, ‘4’s and ‘5’s. One axis connects the center of mass (in the 100-dimensional activity space) of all DGC responses to ‘3’s with that of all responses to ‘5’s (arbitrarily called ‘axis 1’) and the other axis connects the center of mass of responses to ‘4’s with that of responses to ‘5’s (arbitrarily called ‘axis 2’). We then project the activity of the 100 model DGCs upon presentation of MNIST testing patterns onto those two axes, both at the end of the early and late phase of maturation of newborn DGCs (Materials and methods). Each two-dimensional projection is illustrated by a dot whose color corresponds to the digit class of the presented input pattern (blue for digit 3, green for digit 4, red for digit 5). Different input patterns within the same digit class cause different activation patterns of the DGCs, as depicted by extended clouds of dots of the same color (Figure 4a,b). Interestingly, an example pattern of a ‘5’ that is visually similar to a ‘4’ (characterized by the orange cross) yields a DGC representation that lies closer to other ‘4’s (green cloud of dots) than to typical ‘5’s (red cloud of dots) at the end of the early phase of maturation (Figure 4a). Notably, the separation of the representation of ‘5’s from ‘3’s and ‘4’s is better at the end of the late phase (Figure 4b) than at the end of the early phase of maturation (Figure 4a).
For instance, even though the pattern ‘5’ corresponding to the orange cross is represented close to representations of ‘4’s at the end of the early phase of maturation (green cloud of dots, Figure 4a), it is represented far from any ‘3’s and ‘4’s at the end of maturation (Figure 4b). The expansion of the representation of ‘5’s into a previously empty subspace evolves as a function of time during the late phase of maturation (Figure 4d).
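The projection described above can be sketched in a few lines of NumPy. This is an illustrative sketch only: the function names, the normalization of each axis to unit length, and the use of a plain dot product without subtracting an origin are assumptions, not details taken from Materials and methods.

```python
import numpy as np

def class_axes(activity, labels, novel=5, familiar=(3, 4)):
    """Build one axis per familiar class, pointing from the center of mass of its
    DGC responses to the center of mass of the responses to the novel class.

    activity: array of shape (n_patterns, n_dgc); labels: array of shape (n_patterns,).
    Normalizing each axis to unit length is an assumption of this sketch.
    """
    center_novel = activity[labels == novel].mean(axis=0)
    axes = []
    for digit in familiar:
        direction = center_novel - activity[labels == digit].mean(axis=0)
        axes.append(direction / np.linalg.norm(direction))
    return np.stack(axes)  # shape (2, n_dgc): 'axis 1' and 'axis 2'

def project(activity, axes):
    """Project each activity pattern onto the two axes, giving shape (n_patterns, 2)."""
    return activity @ axes.T
```

The axes would be computed once from the late-phase activity and then reused to project both the early-phase and the late-phase responses, so that the two panels share the same coordinate system. Note that the two axes are in general not orthogonal.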
Robustness of the model
Our results are robust to changes in network architecture. As mentioned earlier, neither the exact number of GABAergic neurons (Supplementary file 2), nor that of DGCs is critical. Indeed, a larger network with 700 DGCs, thus mimicking the anatomically observed expansion factor of about 5 between EC and dentate gyrus (all other parameters unchanged), yields similar results (Supplementary file 3).
In the network with 700 DGCs, 275 cells remain unresponsive after pretraining with digits 3 and 4. In line with our earlier approach in the network with 100 DGCs, we can algorithmically replace all unresponsive neurons with newborn DGCs before patterns of digit 5 are added. Upon maturation, newborn DGC receptive fields provide a detailed representation of the prototypes of the novel digit 5 (Figure 4—figure supplement 1) and good classification performance is obtained (Supplementary file 3). Interestingly, due to the randomness of the recurrent connections, some newborn DGCs become selective for particular prototypes of the familiar (pretrained) digits 3 and 4 that are not already extensively represented by the network (see newborn DGCs selective for digit 4 highlighted by magenta squares in Figure 4—figure supplement 1).
As an alternative to replacing all unresponsive cells simultaneously, we can also replace only a fraction of them by newborn cells so as to simulate a continuous turnover of cells. For example, if 119 of the 275 unresponsive cells are replaced by newborn DGCs before the start of presentations of digit 5, then these 119 cells become selective for different writing styles and generic features of the novel digit 5 (Figure 4—figure supplement 2) and allow a good classification performance of all three digits. On the other hand, replacing only 35 of the 275 unresponsive cells is not sufficient (Supplementary file 3). In an even bigger network with more than 144 EC cells and more than 700 DGCs, we could choose to replace 1% of the total DGC population per week by newborn cells, consistent with biology (van Praag et al., 1999; Cameron and McKay, 2001). Importantly, if only a small fraction of unresponsive cells are replaced at a given moment, other unresponsive cells remain available to be replaced later by newborn DGCs that are then ready to learn new stimuli.
Interestingly, the timing of the introduction of the novel stimulus is important. In our main neurogenesis model with 100 DGCs, we introduce the novel digit 5 at the beginning of the early phase of maturation, which consists of one epoch of MNIST training patterns (all patterns are presented once). If the novel digit is only introduced in the middle of the early phase (half epoch), it cannot be properly learned (classification performance for digit 5: 46.52%). However, if introduced after three-eighths or one-quarter of the early phase, the novel digit can be picked out (classification performance for digit 5: 93.61% and 94.17%, respectively). We thus observe an increase in performance the earlier the novel digit is introduced after cell birth (classification performance for digit 5 was 95.18% when introduced at the beginning of the early phase of maturation). Therefore, our model predicts that a novel stimulus has to be introduced early enough with respect to newborn DGC maturation to be well discriminated and that the accuracy of discrimination is better the earlier it is introduced.
This could lead to an online scenario of our model, where adultborn DGCs are produced every day and different classes of novel patterns are introduced at different timepoints. To understand whether newborn DGCs in their early and late phase of maturation would interfere, two aspects should be kept in mind. First, since model newborn DGCs in the early phase of maturation do not project to other neurons yet, they do not influence the circuit and thus do not affect maturation of other newborn DGCs. Second, since model newborn DGCs in the late phase of maturation project to GABAergic neurons in the dentate gyrus, they will, just like mature cells, indirectly activate newborn DGCs that are in their early phase of maturation. As a result, early phase newborn DGCs will develop receptive fields that represent an average of all the stimuli that excite the mature and late phase newborn DGCs, which indirectly activate them. The ultimate selectivity of newborn DGCs is determined after the GABAswitch, when competition sets in, which makes those cells that have recently switched most sensitive to aspects of the input patterns that are not yet well represented by other cells. Therefore, in an online scenario, different model newborn DGCs would become selective for different novel patterns according to both their maturation stage with respect to presentation of the novel patterns, and the selectivity of mature and late phase newborn DGCs which indirectly activate them.
Finally, in our neurogenesis model, we have set the learning rate of mature DGCs to zero despite the observation that mature DGCs retain some plasticity (SchmidtHieber et al., 2004; Ge et al., 2007). We therefore studied a variant of the model in which mature DGCs also exhibit plasticity. First, we used our main model with 100 DGCs and 21 newborn DGCs. The implementation was identical, except that the learning rate of the mature DGCs was kept at a nonzero value during the maturation of the 21 newborn DGCs. We do not observe a large change in classification performance, even if the learning rate of the mature cells is the same as that of newborn cells (Supplementary file 4). Second, we used our extended network with 700 DGCs to be able to investigate the effect of plastic mature DGCs while having a proportion of newborn cells matching experiments. We find that with 35 newborn DGCs (corresponding to the experimentally reported fraction of about 5%), plastic mature DGCs (with a learning rate half of that of newborn cells) improve classification performance (Supplementary file 4). This is due to the fact that several of the mature DGCs (that were previously selective for ‘3’s or ‘4’s) become selective for prototypes of the novel digit 5. Consequently, more than the 35 newborn DGCs specialize for digit 5, so that digit 5 is eventually represented better by the network with mature cell plasticity than the standard network where plasticity is limited to newborn cells. Note that those mature DGCs that had earlier specialized on writing styles of digit 3 or 4 similar to a digit 5 are most likely to retune their selectivity. If the novel inputs were very distinct from the pretrained familiar inputs, mature DGCs would be unlikely to develop selectivity for the novel inputs.
Newborn DGCs become selective for similar novel patterns
To investigate whether our theory for integration of newborn DGCs can explain why adult dentate gyrus neurogenesis promotes discrimination of similar stimuli, but does not affect discrimination of distinct patterns (Clelland et al., 2009; Sahay et al., 2011a), we use a simplified competitive winner-take-all network (Materials and methods). It contains only as many DGCs as trained clusters, and the GABAergic inhibitory neurons are implicitly modeled through direct DGC-to-DGC inhibitory connections. DGCs are either silent or active (binary activity state, while in the detailed network DGCs had continuous firing rates). The synaptic plasticity rule is however the same as for the detailed network, with different parameter values (Materials and methods). We also construct an artificial data set (Figure 5a,b) that allows us to control the similarity $s$ of pairs of clusters (Materials and methods). The MNIST data set is not appropriate to distinguish similar from dissimilar patterns, because all digit clusters are similar and highly overlapping, reflected by a high within-cluster dispersion (e.g. across the set of all ‘3’s) compared to the separation between clusters (e.g. typical ‘3’ versus typical ‘5’).
After a pretraining period, a first mature DGC responds to patterns of cluster 1 and a second mature DGC to those of cluster 2 (Figure 5e,f). We then fix the feedforward weights of those two DGCs and introduce a newborn DGC in the network. Thereafter, we present patterns from three clusters (the two pretrained ones, as well as a novel one), while the plastic feedforward weights of the newborn DGC are the only ones that are updated. We observe that the newborn DGC ultimately becomes selective for the novel cluster if it is similar ($s=0.8$) to the two pretrained clusters (Figure 5i), but not if it is distinct ($s=0.2$, Figure 5j). The selectivity develops in two phases. In the early phase of maturation of the newborn model cell, a pattern from the novel cluster that is similar to one of the pretrained clusters activates the mature DGC that has a receptive field closest to the novel pattern. The activated mature DGC drives the newborn DGC via lateral excitatory GABAergic connections to a firing rate where LTP is triggered at active synapses onto the newborn DGC. LTP also happens when a pattern from one of the pretrained clusters is presented. Thus, synaptic plasticity leads to a receptive field that reflects the average of all stimuli from all three clusters (Figure 5g).
To summarize our findings in a more mathematical language, we characterize the receptive field of the newborn cell by the vector of its feedforward weights. Analogous to the notion of a firing rate vector that represents the set of firing rates of an ensemble of neurons, the feedforward weight vector represents the set of weights of all synapses projecting onto a given neuron (Figure 1b). In the early phase of maturation, for similar clusters, the feedforward weight vector onto the newborn DGC grows in the direction of the center of mass of all three clusters (the two pretrained ones and the novel one), because for each pattern presentation, be it a novel pattern or a familiar one, one of the mature DGCs becomes active and stimulates the newborn cell (compare Figure 5g and Figure 5k). However, if the novel cluster has a low similarity to pretrained clusters, patterns from the novel cluster do not activate any of the mature DGCs. Therefore, the receptive field of the newborn cell reflects the average of stimuli from the two pretrained clusters only (compare Figure 5h and Figure 5l).
As a result of the different orientation of the feedforward weight vector onto the newborn DGC at the end of the early phase of maturation, two different situations arise in the late phase of maturation, when lateral GABAergic connections are inhibitory. If the novel cluster is similar to the pretrained clusters, the weight vector onto the newborn DGC at the end of the early phase of maturation lies at the center of mass of all the patterns across the three clusters. Thus, it is closer to the novel cluster than the weight vector onto either of the mature DGCs (Figure 5g). So if a novel pattern is presented, the newborn DGC wins the competition between the three DGCs, and its feedforward weight vector moves toward the center of mass of the novel cluster (Figure 5i). By contrast, if the novel cluster is distinct, the weight vector onto the newborn DGC at the end of the early phase of maturation is located at the center of mass of the two pretrained clusters (Figure 5h). If a novel pattern is presented, no output unit is activated since their receptive fields are not similar enough to the input pattern. Therefore, the newborn DGC always stays silent and does not update its feedforward weights (Figure 5j). These results are consistent with studies that have suggested that dentate gyrus is only involved in the discrimination of similar stimuli, but not distinct stimuli (Gilbert et al., 2001; Hunsaker and Kesner, 2008). For discrimination of distinct stimuli, another pathway might be used, such as the direct EC to CA3 connection (Yeckel and Berger, 1990; Fyhn et al., 2007).
In conclusion, our model suggests that adult dentate gyrus neurogenesis promotes discrimination of similar patterns because newborn DGCs can ultimately become selective for novel stimuli, which are similar to already learned stimuli. On the other hand, newborn DGCs fail to represent novel distinct stimuli, precisely because they are too distinct from other stimuli already represented by the network. Presentation of novel distinct stimuli in the late phase of maturation therefore does not induce synaptic plasticity of the newborn DGC feedforward weight vector toward the novel stimuli. In the simplified network, the transition between similar and distinct can be determined analytically (Materials and methods). This analysis clarifies the importance of the switch from cooperative dynamics (excitatory interactions) in the early phase to competitive dynamics (inhibitory interactions) in the late phase of maturation.
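The two-phase argument can be illustrated with a toy simulation. The sketch below is not the plasticity rule of the paper: it swaps in the textbook competitive-learning update $w \leftarrow w + \eta (x - w)$, uses a hypothetical activation threshold `theta`, and builds three cluster centers whose pairwise dot product `overlap` stands in for the similarity $s$. All names and parameter values are assumptions chosen for illustration.

```python
import numpy as np

def make_centers(overlap, dim=20):
    """Three unit-norm, non-negative cluster centers with pairwise dot product `overlap`."""
    centers = np.zeros((3, dim))
    centers[:, 0] = np.sqrt(overlap)                                  # shared component
    centers[np.arange(3), np.arange(1, 4)] = np.sqrt(1.0 - overlap)   # private component
    return centers

def sample(center, rng, noise=0.02):
    """Draw a noisy, non-negative, unit-norm pattern around a cluster center."""
    x = center + np.abs(rng.normal(0.0, noise, center.size))
    return x / np.linalg.norm(x)

def newborn_selectivity(overlap, theta=0.5, eta=0.02, n_steps=500, seed=0):
    """Return the cosine between the newborn weight vector and the novel cluster center."""
    rng = np.random.default_rng(seed)
    centers = make_centers(overlap)
    w_mature = centers[:2]                  # assumed outcome of pretraining on clusters 1 and 2
    w_new = np.zeros(centers.shape[1])      # newborn cell starts with zero weights

    # Early (cooperative) phase: the newborn cell is driven whenever a mature cell is active.
    for _ in range(n_steps):
        x = sample(centers[rng.integers(3)], rng)
        if np.any(w_mature @ x > theta):
            w_new += eta * (x - w_new)

    # Late (competitive) phase: the newborn cell learns only when it wins the competition.
    for _ in range(n_steps):
        x = sample(centers[rng.integers(3)], rng)
        drive_new = w_new @ x
        if drive_new > theta and drive_new > np.max(w_mature @ x):
            w_new += eta * (x - w_new)

    return w_new @ centers[2] / np.linalg.norm(w_new)
```

With these illustrative settings, `newborn_selectivity(0.8)` should end close to 1 (the newborn cell aligns with the novel cluster), whereas `newborn_selectivity(0.2)` should stay low, because novel patterns never activate a mature cell in the early phase and the newborn cell never wins on them in the late phase.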
Upon successful integration the receptive field of a newborn DGC represents an average of novel stimuli
With the simplified model network, it is possible to analytically compute the maximal strength of the DGC receptive field via the L2-norm of the feedforward weight vector onto the newborn DGC (Materials and methods). In addition, the angle between the center of mass of the novel patterns and the feedforward weight vector onto the adult-born DGC can also be analytically computed (Materials and methods). To illustrate the analytical results and characterize the evolution of the receptive field of the newborn DGC, we thus examine the angle φ of the feedforward weight vector with the center of mass of the novel cluster (i.e. the average of the novel stimuli), as a function of maturation time (Figure 6b,c, Figure 6—figure supplement 1).
In the early phase of maturation, the feedforward weight vector onto the newborn DGC grows, while its angle with the center of mass of the novel cluster stays constant (Figure 6—figure supplement 1). In the late phase of maturation, the angle φ between the center of mass of the novel cluster and the feedforward weight vector onto the newborn DGC decreases in the case of similar patterns (Figure 6c, Figure 6—figure supplement 1), but not in the case of distinct patterns (Figure 6—figure supplement 1), indicating that the newborn DGC becomes selective for the novel cluster for similar but not for distinct patterns.
The analysis of the simplified model thus leads to a geometric picture that helps us to understand how the similarity of patterns influences the evolution of the receptive field of the newborn DGC before and after the switch from excitation to inhibition of the GABAergic input. For novel patterns that are similar to known patterns, the receptive field of a newborn DGC at the end of maturation represents the average of novel stimuli.
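For readers who want to track this geometric quantity in their own simulations, the angle φ can be computed as follows. This is a minimal sketch; the function name and the choice of degrees as unit are assumptions.

```python
import numpy as np

def angle_to_cluster_center(w, novel_patterns):
    """Angle (in degrees) between a feedforward weight vector `w` and the
    center of mass of the novel patterns (rows of `novel_patterns`)."""
    center = novel_patterns.mean(axis=0)
    cos_phi = w @ center / (np.linalg.norm(w) * np.linalg.norm(center))
    # Clip guards against values marginally outside [-1, 1] from rounding.
    return np.degrees(np.arccos(np.clip(cos_phi, -1.0, 1.0)))
```

Because the angle is invariant to the length of `w`, a weight vector that only grows without changing direction, as in the early phase, leaves φ unchanged.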
The cooperative phase of maturation promotes pattern separation for any dimensionality of input data
Despite the fact that input patterns in our model represent the activity of 144 or 128 model EC cells, the effective dimensionality of the input data was significantly below 100 because the clusters for different input classes were rather concentrated around their respective center of mass. We define the effective input dimensionality as the participation ratio (Mazzucato et al., 2016; LitwinKumar et al., 2017) (Materials and methods). Using this definition, the input data of both the MNIST 12 × 12 patterns from digits 3, 4, and 5 and the seven clusters of the handmade dataset for similar patterns ($s=0.8$) are relatively lowdimensional ($PR=19$ of a maximum of 144, and $PR=11$ of a maximum of 128, respectively). We emphasize that in both cases the spread of the input data around the cluster center implies that the effective dimensionality is larger than the number of clusters. In natural settings, we expect the input data to have even higher dimension. Therefore, here we investigate the effect of dimensionality of the input data on our neurogenesis model by increasing the spread around the cluster centers.
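As an illustration, the participation ratio is commonly computed from the eigenvalues $\lambda_i$ of the covariance matrix of the data as $PR = (\sum_i \lambda_i)^2 / \sum_i \lambda_i^2$. The sketch below assumes that this common definition is the one intended here; the function name is hypothetical.

```python
import numpy as np

def participation_ratio(patterns):
    """Effective dimensionality of `patterns` (shape: n_patterns x n_inputs),
    computed as (sum of covariance eigenvalues)^2 / (sum of squared eigenvalues)."""
    cov = np.cov(patterns, rowvar=False)   # covariance across input units
    eigvals = np.linalg.eigvalsh(cov)      # symmetric matrix, real eigenvalues
    return eigvals.sum() ** 2 / np.sum(eigvals ** 2)
```

Under this definition, data spread equally along $d$ orthogonal directions give $PR = d$, while data confined to a single direction give $PR = 1$, which is why broadening the clusters raises the effective dimensionality.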
We use our simplified network model and create similar artificial datasets ($s=0.8$) with different values for the concentration parameter κ (Materials and methods). The smaller the κ, the broader the distributions around their center of mass; hence, the larger the overlap of patterns generated from different cluster distributions. Therefore, we can increase the effective dimensionality of the input by decreasing the concentration parameter κ. First, as expected from our analysis (Materials and methods), we find that the broader the cluster distributions, the smaller the length of the feedforward weight vector onto newborn DGCs (from just below 1.5 with $\kappa ={10}^{4}$ to about 1.35 with $\kappa =6\cdot {10}^{2}$). Second, we examine the ability of the simplified network to discriminate input patterns coming from input spaces with different dimensionalities. To do so, we compare our neurogenesis model (Neuro.) with a random initialization model (RandInitL.). In both cases, two DGCs are pretrained with patterns from two clusters, as above. Then we fix the weights of the two mature DGCs and introduce patterns from a third cluster as well as a newborn DGC. For the neurogenesis case, after maturation of the newborn DGC we fix its weights (while for the random initialization model we keep them plastic) upon introduction of patterns from a fourth cluster as well as another newborn DGC, and so on until the network contains seven DGCs and patterns from the full dataset of seven clusters have been presented. We compare our neurogenesis model, where each newborn DGC starts with zero weights and undergoes a two-phase maturation (one epoch per phase), with a random initialization model where each newborn DGC is directly fully integrated into the circuit and whose feedforward weight vector is randomly initialized with a length of 0.1 (RandInitL.) and is then learned for two epochs.
Since clusters can be highly overlapping, we assess discrimination performance by computing the reconstruction error at the end of training. Reconstruction error is evaluated analogously to classification error, except that the readout layer has the task of an autoencoder: it contains as many readout units as there are input units. Reconstruction error is the mean squared distance between the input vector and the reconstructed output vector based on testing patterns. We observe that for any dimensionality of the input space, even as high as 97dimensional, the neurogenesis model performs better (has a lower total reconstruction error) than the random initialization model (Supplementary file 5). Indeed, in the neurogenesis case newborn DGCs grow their feedforward weights (from zero) in the direction of presented input patterns in their early cooperative phase of maturation and can later become selective for novel patterns during the competitive phase. In contrast, since the random initialization model has no early cooperative phase, the newborn DGC weight vector does not grow unless an input pattern is by chance well aligned with its randomly initialized weight vector (which is unlikely in a highdimensional input space). We get similar results for a larger initialization of the synaptic weights (e.g. the length of the weight vector at birth is set to 1, results not shown). Importantly, in high input dimensions, the advantage of a larger weight vector length at birth in the random initialization model is overridden by the capability of newborn DGCs to grow their weight vector in the appropriate direction during their early cooperative phase of maturation. 
Finally, we note that even if the length of the feedforward weight vector onto newborn DGCs is set to 1.5 (RandInitH., Supplementary file 5), which is the upper bound according to our analytical results (Materials and methods), the random initialization model performs worse than the neurogenesis model for low- up to relatively high-dimensional input spaces ($PR=83$, Supplementary file 5) despite its advantage in the competition conferred by the longer weight vector. It is only when input clusters are extremely broad and overlapping that the random initialization model performs similarly to the neurogenesis model ($PR=90,97$, Supplementary file 5). In other words, a random initialization at full length of weight vectors works well if input data is homogeneously distributed on the positive quadrant of the unit sphere but fails if the input data is clustered in a few directions. Moreover, random initialization requires that synaptic weights are large from the start, which is not biologically plausible. In summary, the two-phase neurogenesis model is advantageous for three reasons: the feedforward weights onto newborn cells can start at arbitrarily small values; during the cooperative phase, their growth is guided in a direction that is relevant for the task at hand; and the final competitive phase eventually enables specialization onto novel inputs.
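The reconstruction error used in this comparison can be sketched as follows. The sketch assumes a linear readout fitted in closed form by least squares on training patterns, which stands in for whatever training procedure the readout layer actually uses; the function names are hypothetical.

```python
import numpy as np

def fit_readout(activity_train, inputs_train):
    """Least-squares linear readout mapping DGC activity (n_patterns x n_dgc)
    back to the input patterns (n_patterns x n_inputs). Closed-form fitting is
    an assumption of this sketch."""
    readout, *_ = np.linalg.lstsq(activity_train, inputs_train, rcond=None)
    return readout  # shape (n_dgc, n_inputs)

def reconstruction_error(activity_test, inputs_test, readout):
    """Mean squared Euclidean distance between each testing input vector
    and its reconstruction from the DGC activity."""
    residual = inputs_test - activity_test @ readout
    return np.mean(np.sum(residual ** 2, axis=1))
```

The readout would be fitted on training patterns and the error evaluated on held-out testing patterns, so that a lower value indicates that the DGC code preserves more information about the inputs.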
Discussion
While experimental studies, such as manipulating the ratio of NKCC1 to KCC2, suggest that the switch from excitation to inhibition of the GABAergic input onto adultborn DGCs is crucial for their integration into the preexisting circuit (Ge et al., 2006; Alvarez et al., 2016) and that adult dentate gyrus neurogenesis promotes pattern separation (Clelland et al., 2009; Sahay et al., 2011a; Jessberger et al., 2009), the link between channel properties and behavior has remained puzzling (Sahay et al., 2011b; Aimone et al., 2011). Our modeling work shows that the GABAswitch enables newborn DGCs to become selective for novel stimuli, which are similar to familiar, alreadystored, representations, consistent with the experimentally observed function of pattern separation (Clelland et al., 2009; Sahay et al., 2011a; Jessberger et al., 2009).
Previous modeling studies already suggested that newborn DGCs integrate novel inputs into the representation in dentate gyrus (Chambers et al., 2004; Becker, 2005; Crick and Miranker, 2006; Wiskott et al., 2006; Chambers and Conroy, 2007; Appleby and Wiskott, 2009; Aimone et al., 2009; Weisz and Argibay, 2009; Temprana et al., 2015; Finnegan and Becker, 2015; DeCostanzo et al., 2018). However, our work differs from them in four important aspects. First of all, we implement an unsupervised biologically plausible plasticity rule, while many studies used supervised algorithmic learning rules (Chambers et al., 2004; Becker, 2005; Chambers and Conroy, 2007; Weisz and Argibay, 2009; Finnegan and Becker, 2015; DeCostanzo et al., 2018). Second, as we model the formerly neglected GABAswitch, the connection weights from EC to newborn DGCs are grown from small values through cooperativity in the early phase of maturation. This integration step was mostly bypassed in earlier models by initialization of the connectivity weights toward newborn DGCs to random, yet fully grown values (Crick and Miranker, 2006; Aimone et al., 2009; Weisz and Argibay, 2009; Finnegan and Becker, 2015). Third, as the dentate gyrus network is commonly modeled as a competitive network, weight normalization is crucial. In our framework, competition occurs during the late phase of maturation. Previous modeling works either applied algorithmic weight normalization or hard bounds on the weights at each iteration step (Crick and Miranker, 2006; Aimone et al., 2009; Weisz and Argibay, 2009; Temprana et al., 2015; Finnegan and Becker, 2015). Instead, our plasticity rule includes heterosynaptic plasticity, which intrinsically softly bounds connectivity weights by a homeostatic effect. Finally, although some earlier computational models of adult dentate gyrus neurogenesis could explain the pattern separation abilities of newborn cells, separation was obtained independently of the similarity between the stimuli. 
Contrary to experimental data, no distinction was made between similar and distinct patterns (Chambers et al., 2004; Becker, 2005; Crick and Miranker, 2006; Wiskott et al., 2006; Chambers and Conroy, 2007; Aimone et al., 2009; Appleby and Wiskott, 2009; Weisz and Argibay, 2012; Temprana et al., 2015; Finnegan and Becker, 2015; DeCostanzo et al., 2018). To our knowledge, we present the first model that can explain both (1) how adultborn DGCs integrate into the preexisting network and (2) why they promote pattern separation of similar stimuli and not distinct stimuli.
Our work emphasizes why a twophase maturation of newborn DGCs is beneficial for proper integration in the preexisting network. From a computational perspective, the early phase of maturation, when GABAergic inputs onto newborn DGCs are excitatory, corresponds to cooperative unsupervised learning. Therefore, the synapses grow in the direction of patterns that indirectly activate the newborn DGCs via GABAergic interneurons (Figure 6a). At the end of the early phase of maturation, the receptive field of a newborn DGC represents the center of mass of all input patterns that led to its (indirect) activation. In the late phase of maturation, GABAergic inputs onto newborn DGCs become inhibitory, so that lateral interactions change from cooperation to competition, causing a shift of the receptive fields of the newborn DGCs toward novel features (Figure 6b). At the end of maturation, newborn DGCs are thus selective for novel inputs. This integration mechanism is in agreement with the experimental observation that newborn DGCs are broadly tuned early in maturation, yet highly selective at the end of maturation (MarínBurgin et al., 2012; Danielson et al., 2016). Loosely speaking, the cooperative phase of excitatory GABAergic input promotes the growth of the synaptic weights coarsely in the relevant direction, whereas the competitive phase of inhibitory GABAergic input helps to specialize on detailed, but potentially important differences between patterns.
In the context of theories of unsupervised learning, the switch of lateral GABAergic input to newborn DGCs from excitatory to inhibitory provides a biological solution to the ‘problem of unresponsive units’ (Hertz et al., 1991). Unsupervised competitive learning has been used to perform clustering of input patterns into a few categories (Rumelhart and Zipser, 1985; Grossberg, 1987; Kohonen, 1989; Hertz et al., 1991; Du, 2010). Ideally, after learning of the feedforward weights between an input layer and a competitive network, input patterns that are distinct from each other activate different neuron assemblies of the competitive network. After convergence of competitive Hebbian learning, the vector of feedforward weights onto a given neuron points to the center of mass of the cluster of input patterns for which it is selective (Kohonen, 1989; Hertz et al., 1991). Yet, if the synaptic weights are randomly initialized, it is possible that the vector of feedforward weights onto some neurons of the competitive network points in a direction ‘quasi-orthogonal’ (Materials and methods) to the subspace of the presented input patterns. Therefore, those neurons, called ‘unresponsive units’, will never become active during pattern presentation. Different learning strategies have been developed in the field of artificial neural networks to avoid this problem (Grossberg, 1976; Bienenstock et al., 1982; Rumelhart and Zipser, 1985; Grossberg, 1987; DeSieno, 1988; Kohonen, 1989; Hertz et al., 1991; Du, 2010). However, most of these algorithmic approaches lack a biological interpretation. In our model, weak synapses onto newborn DGCs form spontaneously after neuronal birth. The excitatory GABAergic input in the early phase of maturation drives the growth of the synaptic weights in the direction of the subspace of presented patterns that succeed in activating some of the mature DGCs. 
Hence, the early cooperative phase of maturation can be seen as a smart initialization of the synaptic weights onto newborn DGCs, close enough to novel patterns so as to become selective for them in the late competitive phase of maturation. However, the cooperative phase is helpful only if the novel patterns are similar to the input statistics defined by the set of known (familiar) patterns.
Our results are in line with the classic view that dentate gyrus is responsible for decorrelation of inputs (Marr, 1969; Albus, 1971; Marr, 1971; Rolls and Treves, 1998), a necessary step for differential storage of similar memories in CA3, and with the observation that dentate gyrus lesions impair discrimination of similar but not distinct stimuli (Gilbert et al., 2001; Hunsaker and Kesner, 2008). To discriminate distinct stimuli, another pathway might be involved, such as the direct EC to CA3 connection (Yeckel and Berger, 1990; Fyhn et al., 2007).
The parallel between neurogenesis in dentate gyrus and in olfactory bulb suggests that similar mechanisms could be at work in both areas. Yet, even though adult olfactory bulb neurogenesis seems to have a functional role similar to that of adult dentate gyrus neurogenesis (Sahay et al., 2011b), and newborn olfactory cells follow a similar integration sequence and undergo a GABA-switch from excitatory to inhibitory, the circuits differ in several respects. First, while newborn neurons in dentate gyrus are excitatory, newborn cells in the olfactory bulb are inhibitory. Second, the newborn olfactory cells start firing action potentials only once they are well integrated (Carleton et al., 2003). Therefore, in view of a transfer of results to the olfactory bulb, it would be interesting to adjust our model of adult dentate gyrus neurogenesis accordingly. For example, a voltage-based synaptic plasticity rule could be used to account for subthreshold plasticity mechanisms (Clopath et al., 2010).
Our model of transition from an early cooperative phase to a late competitive phase makes specific predictions, at the behavioral and cellular level. In our model, the early cooperative phase of maturation can only drive the growth of synaptic weights onto newborn cells if they are indirectly activated by mature DGCs through GABAergic input, which has an excitatory effect due to the high NKCC1/KCC2 ratio early in maturation. Therefore, our model predicts that NKCC1knockout mice would be impaired in discriminating similar contexts or objects because newborn cells stay silent due to lack of indirect activation. The feedforward weight vector onto newborn DGCs could not grow in the early phase and newborn DGCs could not become selective for novel inputs. Therefore, our model predicts that since newborn DGCs are poorly integrated into the preexisting circuit, they are unlikely to survive. If, however, in the same paradigm newborn cells are activated by lightinduced or electrical stimulation, we predict that they become selective to novel patterns. Thus discrimination abilities would be restored and newborn DGCs are likely to survive. Analogously, we predict that using inducible NKCC1knockout mice, animals would gradually be impaired in discrimination tasks after induced knockout and reach a stable maximum impairment about 3 weeks after the start of induced knockout.
Experimental observations support the importance of the switch from early excitation to late inhibition of the GABAergic input onto newborn DGCs. An absence of early excitation using NKCC1-knockout mice has been shown to strongly affect synapse formation and dendritic development in vivo (Ge et al., 2006). Conversely, a reduction in inhibition in the dentate gyrus through decrease in KCC2 expression has been associated with epileptic activity (Pathak et al., 2007; Barmashenko et al., 2011). An analogous switch of the GABAergic input has been observed during development, and its proper timing has been shown to be crucial for sensorimotor gating and cognition (Wang and Kriegstein, 2011; Furukawa et al., 2017). In addition to early excitation and late inhibition, our theory also critically depends on the time scale of the switching process. In our model, the switch is an instantaneous transition between the early and the late phase of maturation. Several experimental results have suggested that the switch is indeed sharp and occurs within a single day, both during development (Khazipov et al., 2004; Tyzio et al., 2007; Leonzino et al., 2016) and adult dentate gyrus neurogenesis (Heigele et al., 2016). Furthermore, in hippocampal cell cultures, expression of KCC2 is upregulated by GABAergic activity but not affected by glutamatergic activity (Ganguly et al., 2001). A similar process during adult dentate gyrus neurogenesis would increase the number of newborn DGCs available for representing novel features by advancing the timing of their switch. In this way, instead of a few thousand newborn DGCs ready to switch (3–6% of the whole population [van Praag et al., 1999; Cameron and McKay, 2001], divided by 30 days), a larger fraction of newborn DGCs would be made available for coding, if appropriate stimulation occurs. 
Finally, while neurotransmitter switching has been observed following sustained stimulation for hours to days (Li et al., 2020), it is still unclear if it has the same functional role as the GABA-switch in our model. In particular, it remains an open question if neurotransmitter switching promotes the integration of neurons in the same way as our model GABA-switch does in the context of adult dentate gyrus neurogenesis.
To conclude, our theory for integration of adultborn DGCs suggests that newborn cells have a coding – rather than a modulatory – role during dentate gyrus pattern separation function. Our theory highlights the importance of GABAergic input in adult dentate gyrus neurogenesis and links the switch from excitation to inhibition to the integration of newborn DGCs into the preexisting circuit. Finally, it illustrates how Hebbian plasticity of EC-to-DGC synapses, together with the switch, makes newborn cells suitable to promote pattern separation of similar but not distinct stimuli, a longstanding mystery in the field of adult dentate gyrus neurogenesis (Sahay et al., 2011b; Aimone et al., 2011).
Materials and methods
Network architecture and neuronal dynamics
DGCs are the principal cells of the dentate gyrus. They mainly receive excitatory projections from the EC through the perforant path and GABAergic inputs from local interneurons, as well as excitatory input from Mossy cells. They project to CA3 pyramidal cells and inhibitory neurons, as well as local Mossy cells (Acsády et al., 1998; Henze et al., 2002; Amaral et al., 2007; Temprana et al., 2015; Figure 1—figure supplement 1). In our model, we omit Mossy cells for simplicity and describe the dentate gyrus as a competitive circuit consisting of ${N}_{DGC}$ DGCs and ${N}_{I}$ GABAergic interneurons (Figure 1b). The activity of ${N}_{EC}$ neurons in EC represents an input pattern $\overrightarrow{x}=({x}_{1},{x}_{2},\dots,{x}_{{N}_{EC}})$. Because the perforant path also induces strong feedforward inhibition in the dentate gyrus (Li et al., 2013), we assume that the effective EC activity is normalized, such that $\|\overrightarrow{x}\|=1$ for any input pattern $\overrightarrow{x}$ (Figure 1—figure supplement 1). We use $P$ different input patterns ${\overrightarrow{x}}^{\mu}$, $1\leqslant \mu \leqslant P$ in the simulations of the model.
In our network, model EC neurons have excitatory all-to-all connections to the DGCs. In rodent hippocampus, spiking mature DGCs activate interneurons in dentate gyrus, which in turn inhibit other mature DGCs (Temprana et al., 2015; Alvarez et al., 2016). In our model, the DGCs are thus recurrently connected with inhibitory neurons (Figure 1b). Connections from DGCs to interneurons exist in our model with probability ${p}_{IE}$ and have a weight ${w}_{IE}$. Similarly, connections from interneurons to DGCs occur with probability ${p}_{EI}$ and have a weight ${w}_{EI}$. All parameters are reported in Table 1 (Biologically plausible network).
Before an input pattern is presented, all rates of model DGCs are initialized to zero. We assume that the DGCs have a frequency–current curve that is given by a rectified hyperbolic tangent (Dayan and Abbott, 2001), which is similar to the frequency–current curve of spiking neuron models with refractoriness (Gerstner et al., 2014). Moreover, we exploit the equivalence of two common firing rate equations (Miller and Fumarola, 2012) and let the firing rate ${\nu}_{i}$ of DGC $i$ upon stimulation with input pattern $\overrightarrow{x}$ evolve according to:
where ${[.]}_{+}$ denotes rectification: ${[a]}_{+}=a$ for $a>0$ and zero otherwise. Here, ${b}_{i}$ is a firing threshold, $L$ is the smoothness parameter of the frequency–current curve (${L}^{-1}$ is the slope of the frequency–current curve at the firing threshold), and ${I}_{i}$ the total input to cell $i$:
$${I}_{i}={\sum}_{j=1}^{{N}_{EC}}{w}_{ij}{x}_{j}+{\sum}_{k=1}^{{N}_{I}}{w}_{ik}^{EI}{\nu}_{k}^{I}$$
with ${x}_{j}$ the activity of EC input neuron $j$, ${w}_{ij}\geqslant 0$ the feedforward weight from EC input neuron $j$ to DGC $i$, and ${w}_{ik}^{EI}$ the weight from inhibitory neuron $k$ to DGC $i$. The sum runs over all inhibitory neurons, but the weights are set to ${w}_{ik}^{EI}=0$ if the connection is absent. The firing rate ${\nu}_{i}$ is unit-free and normalized to a maximum of 1, which we interpret as a firing rate of 10 Hz. We take the synaptic weights as unitless parameters such that ${I}_{i}$ is also unit-free.
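As a rough illustration of the quantities defined above, the following Python sketch computes the total input ${I}_{i}$ and the rate that a rectified hyperbolic tangent frequency–current curve would give for a fixed total input. The functional form `tanh([I - b]_+ / L)` is an assumption, chosen so that ${L}^{-1}$ is the slope at the firing threshold; the function names and all numerical values are hypothetical and are not those of Table 1.

```python
import math

def total_input(w_ff, x, w_gaba, nu_inh):
    """Total input I_i: feedforward EC drive plus GABAergic input.

    w_gaba entries are negative when the GABAergic input is inhibitory
    and zero when the connection is absent.
    """
    feedforward = sum(w * xj for w, xj in zip(w_ff, x))
    lateral = sum(w * nu for w, nu in zip(w_gaba, nu_inh))
    return feedforward + lateral

def rate_for_fixed_input(I, b, L):
    """Assumed rectified-tanh f-I curve: 0 below threshold b, saturating at 1."""
    return math.tanh(max(I - b, 0.0) / L)

# Illustrative values only (not taken from the paper)
x = [0.6, 0.8]                                  # L2-normalized input pattern
I = total_input([3.0, 4.0], x, [-1.0], [0.5])   # 1.8 + 3.2 - 0.5
rate = rate_for_fixed_input(I, b=1.2, L=0.5)
```

In the full model the interneuron rates themselves depend on the DGC rates, so this function only describes the value reached once the total input has settled.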
The firing rate ${\nu}_{k}^{I}$ of inhibitory neuron $k$ is defined as:
with ${p}^{*}$ a parameter which relates to the desired ensemble sparsity, and ${I}_{k}^{I}$ the total input toward interneuron $k$, given as:
with ${w}_{ki}^{IE}$ the weight from DGC $i$ to inhibitory neuron $k$. (We set ${w}_{ki}^{IE}=0$ if the connection is absent.) The feedback from inhibitory neurons ensures a sparse activity of model DGCs for each pattern. With ${p}^{*}=0.1$ we find that more than 70% of model DGCs are silent (firing rate < 1 Hz [Senzai and Buzsáki, 2017]) when an input pattern is presented, and less than 10% are highly active (firing rate > 1 Hz) (Figure 2c,d), consistent with the experimentally observed sparse activity in dentate gyrus (Chawla et al., 2005).
Plasticity rule
Projections from EC onto newborn DGCs exhibit Hebbian plasticity (Schmidt-Hieber et al., 2004; Ge et al., 2007; McHugh et al., 2007). Therefore, in our model, the connections from EC neurons to DGCs are plastic, following a Hebbian learning rule that exhibits LTD or LTP depending on the firing rate ${\nu}_{i}$ of the postsynaptic cell (Bienenstock et al., 1982; Artola et al., 1990; Sjöström et al., 2001; Pfister and Gerstner, 2006). Input patterns, ${\overrightarrow{x}}^{\mu}$, $1\leqslant \mu \leqslant P$, are presented in random order. For each input pattern, we let the firing rate converge for a time $T$ where $T$ was chosen long enough to achieve convergence to a precision of ${10}^{-6}$. After $n-1$ presentations (i.e. at time $(n-1)\cdot T$), the weight vector has value ${w}_{ij}^{(n-1)}$. We then present the next pattern and update at time $n\cdot T$ (${w}_{ij}^{(n)}={w}_{ij}^{(n-1)}+\mathrm{\Delta}{w}_{ij}$), according to the following plasticity rule (Equation (1), written here for convenience):
where ${x}_{j}$ is the firing rate of presynaptic EC input neuron $j$, ${\nu}_{i}$ the firing rate of postsynaptic DGC $i$, η the learning rate, θ marks the transition from LTD to LTP, and the relative strengths γ of LTP and α of LTD depend on θ via $\alpha =\frac{{\alpha}_{0}}{{\theta}^{3}}>0$ and $\gamma ={\gamma}_{0}-\theta >0$. The values of the parameters ${\alpha}_{0}$, ${\gamma}_{0}$, β, and θ are given in Table 1 (Biologically plausible network). The weights are hard-bounded from below at 0, i.e., if Equation (1) leads to a new weight smaller than zero, ${w}_{ij}$ is set to zero. The first two terms of Equation (1) are a variation of the BCM rule (Bienenstock et al., 1982). The third term implements heterosynaptic plasticity (Chistiakova et al., 2014; Zenke and Gerstner, 2017) with three important features: first, heterosynaptic plasticity has a negative sign and therefore leads to synaptic depression; second, heterosynaptic plasticity sets in above a threshold (${\nu}_{i}>\theta $) that is the same threshold as that for LTP, so that if LTP occurs at some synapses LTD is induced at other synapses; third, above threshold the dependence upon the postsynaptic firing rate ${\nu}_{i}$ is supralinear. The interaction of the three different terms in the plasticity rule has several consequences. Because the first two terms of the plasticity rule are Hebbian (‘homosynaptic’) and proportional to the presynaptic activity ${x}_{j}$, the active DGCs (${\nu}_{i}>\theta $) update their feedforward weights in direction of the input pattern $\overrightarrow{x}$. Moreover, whenever LTP occurs at some synapses, all weights onto neuron $i$ are downregulated heterosynaptically by an amount that increases supralinearly with the postsynaptic rate ${\nu}_{i}$, implicitly controlling the length of the weight vector (see below) similar to synaptic homeostasis (Turrigiano et al., 1998) but on a rapid time scale (Zenke and Gerstner, 2017).
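The interplay of the three terms can be illustrated with a short Python sketch. The explicit form used here, $\mathrm{\Delta}{w}_{ij}=\eta \{\gamma {x}_{j}{\nu}_{i}{[{\nu}_{i}-\theta ]}_{+}-\alpha {x}_{j}{\nu}_{i}{[\theta -{\nu}_{i}]}_{+}-\beta {w}_{ij}{[{\nu}_{i}-\theta ]}_{+}{\nu}_{i}^{3}\}$, is an assumption: it is one form consistent with the description above (LTP and heterosynaptic depression above θ, LTD below θ, supralinear dependence on ${\nu}_{i}$) and with the stationary weight $\gamma {x}_{j}/(\beta {\widehat{\nu}}_{i}^{2})$ quoted in this section. Function names and parameter values are hypothetical.

```python
def delta_w(w, x, nu, eta, alpha, beta, gamma, theta):
    """Assumed three-term rule for one synapse (presynaptic x, postsynaptic nu)."""
    ltp = gamma * x * nu * max(nu - theta, 0.0)           # homosynaptic LTP, nu > theta
    ltd = alpha * x * nu * max(theta - nu, 0.0)           # homosynaptic LTD, nu < theta
    hetero = beta * w * max(nu - theta, 0.0) * nu ** 3    # heterosynaptic depression
    return eta * (ltp - ltd - hetero)

def update_weight(w, x, nu, **params):
    """Apply the update and enforce the hard lower bound at zero."""
    return max(0.0, w + delta_w(w, x, nu, **params))

# Illustrative parameters only (not those of Table 1)
params = dict(eta=0.01, alpha=1.0, beta=1.0, gamma=2.0, theta=0.5)
w_star = params["gamma"] * 0.5 / (params["beta"] * 0.8 ** 2)   # stationary weight for x=0.5, nu=0.8
```

Under this assumed form, a synapse sitting at `w_star` receives no net update when the postsynaptic rate is 0.8, whereas any postsynaptic rate between 0 and θ depresses the synapse until it reaches the lower bound.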
Analogous to learning in a competitive network (Kohonen, 1989; Hertz et al., 1991), the vector of feedforward weights onto active DGCs will move toward the center of mass of the cluster of patterns they are selective for, as we will discuss now.
For a given input pattern ${\overrightarrow{x}}^{\mu}$, there are three fixed points for the postsynaptic firing rate: ${\nu}_{i}=0$, ${\nu}_{i}=\theta $, and ${\nu}_{i}={\widehat{\nu}}_{i}$ (the negative root is omitted because ${\nu}_{i}\geqslant 0$ due to Equation (2)). For ${\nu}_{i}<\theta $, there is LTD, so the weights move toward zero: ${w}_{ij}\to 0$, while for ${\nu}_{i}>\theta $, there is LTP, so the weights move toward ${w}_{ij}\to \frac{\gamma {x}_{j}^{\mu}}{\beta {\widehat{\nu}}_{i}^{2}}$ (Figure 1c). The value of ${\widehat{\nu}}_{i}$ is defined implicitly by the network Equations (2–5). If a pattern ${\overrightarrow{x}}^{\mu}$ is presented only for a short time, these fixed points are not reached during a single pattern presentation.
Winners, losers, and quasi-orthogonal inputs
We define the winners as the DGCs that become strongly active (${\nu}_{i}>\theta $) during presentation of an input pattern. Since the input patterns are normalized to have an L2-norm of 1 ($\|{\overrightarrow{x}}^{\mu}\|=1$ by construction), and the L2-norm of the feedforward weight vectors is bounded (see Section Direction and length of the weight vector), the winning units are the ones whose weight vectors ${\overrightarrow{w}}_{i}$ (row of the feedforward connectivity matrix) align best with the current input pattern ${\overrightarrow{x}}^{\mu}$.
We emphasize that all synaptic weights and all presynaptic firing rates ${\nu}_{j}$ are non-negative: ${w}_{ij}\geqslant 0$ and ${\nu}_{j}\geqslant 0$. Thus, both the weight vectors and the vectors of input firing rates live in the positive quadrant. The angle between an input pattern ${\overrightarrow{x}}^{\mu}$ and the weight vector ${\overrightarrow{w}}_{i}$ of neuron $i$ can be at most ninety degrees. We say that an input pattern ${\overrightarrow{x}}^{\mu}$ is ‘quasi-orthogonal’ to a weight vector ${\overrightarrow{w}}_{i}$ if, in the stationary state, the input is not sufficient to activate neuron $i$, i.e., ${I}_{i}={\sum}_{j=1}^{{N}_{EC}}{w}_{ij}{x}_{j}+{\sum}_{k=1}^{{N}_{I}}{w}_{ik}^{EI}{\nu}_{k}^{I}<{b}_{i}$. If an input pattern ${\overrightarrow{x}}^{\mu}$ is quasi-orthogonal to a weight vector ${\overrightarrow{w}}_{i}$, then neuron $i$ does not fire in response to ${\overrightarrow{x}}^{\mu}$ after the stimulus has been applied for a long enough time. Note that for a case without inhibitory neurons and with ${b}_{i}\to 0$, we recover the standard orthogonality condition, but for finite ${b}_{i}>0$ quasi-orthogonality corresponds to angles larger than some reference angle.
Direction and length of the weight vector
Let us denote the ensemble of patterns for which neuron $i$ is a winner by ${C}_{i}$ and call this the set of winning patterns (${C}_{i}=\{\mu \mid {\nu}_{i}>\theta \}$). Suppose that neuron $i$ is quasi-orthogonal to all other patterns, so that for all $\mu \notin {C}_{i}$, we have ${\nu}_{i}=0$. Then the feedforward weight vector of neuron $i$ converges in expectation to:
where ${G}_{1}({\nu}_{i})=({\nu}_{i}-\theta ){\nu}_{i}$ and ${G}_{2}({\nu}_{i})=({\nu}_{i}-\theta ){\nu}_{i}^{3}$. Hence ${\overrightarrow{w}}_{i}$ is a weighted average over all winning patterns.
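The stationary weight vector can be sketched from the requirement that the expected update vanishes. The derivation below assumes that, above threshold, the plasticity rule of Section Plasticity rule reduces to an LTP term $\gamma {x}_{j}{G}_{1}({\nu}_{i})$ and a heterosynaptic term $-\beta {w}_{ij}{G}_{2}({\nu}_{i})$; this is an assumption consistent with the definitions of ${G}_{1}$ and ${G}_{2}$ above. Patterns outside ${C}_{i}$ do not contribute because ${\nu}_{i}=0$ for them.

```latex
\begin{aligned}
0 &= \langle \Delta w_{ij} \rangle_{\mu \in C_i}
   = \eta \left[ \gamma \, \langle G_1(\nu_i)\, x_j \rangle_{\mu \in C_i}
     - \beta \, w_{ij} \, \langle G_2(\nu_i) \rangle_{\mu \in C_i} \right] \\
\Rightarrow \quad
\overrightarrow{w}_i &= \frac{\gamma}{\beta} \,
   \frac{\langle G_1(\nu_i)\, \overrightarrow{x} \rangle_{\mu \in C_i}}
        {\langle G_2(\nu_i) \rangle_{\mu \in C_i}}
\end{aligned}
```

Each winning pattern thus enters with a weight proportional to ${G}_{1}({\nu}_{i})$, which is why the result can be read as a weighted average over winning patterns.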
The squared length of the feedforward weight vector can be computed by multiplying Equation (6) with ${\overrightarrow{w}}_{i}$:
Since input patterns have length one, the scalar product on the right-hand side can be rewritten as ${\overrightarrow{w}}_{i}\cdot \overrightarrow{x}=\|{\overrightarrow{w}}_{i}\|\mathrm{cos}(\alpha )$ where α is the angle between the weight vector and pattern $\overrightarrow{x}$. Division by $\|{\overrightarrow{w}}_{i}\|$ yields the L2-norm of the feedforward weight vector:
where the averages run, as before, over all winning patterns.
Let us now derive bounds for $\|{\overrightarrow{w}}_{i}\|$. First, since $\mathrm{cos}(\alpha )\leqslant 1$ we have ${\langle {G}_{1}({\nu}_{i})\mathrm{cos}(\alpha )\rangle }_{\mu \in {C}_{i}}\leqslant {\langle {G}_{1}({\nu}_{i})\rangle }_{\mu \in {C}_{i}}$. Second, since for all winning patterns ${\nu}_{i}>\theta $, where θ is the LTP threshold, we have ${\langle {G}_{2}({\nu}_{i})\rangle }_{\mu \in {C}_{i}}\geqslant \langle ({\nu}_{i}-\theta ){\nu}_{i}\rangle {\theta}^{2}$. Thus the length of the weight vector is finite and bounded by:
It is possible to make the second bound tighter if we find the winning pattern with the smallest firing rate ${\nu}_{\text{min}}$ such that ${\nu}_{i}\geqslant {\nu}_{\text{min}}\;\forall \mu \in {C}_{i}$:
The bound is reached if neuron $i$ is winner for a single input pattern.
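The chain of inequalities behind these upper bounds can be written out as follows. This is a sketch that assumes the norm of the stationary weight vector takes the form $\|{\overrightarrow{w}}_{i}\|=\frac{\gamma}{\beta}\langle {G}_{1}\mathrm{cos}(\alpha )\rangle /\langle {G}_{2}\rangle $, which is what the two steps described above suggest, and it uses ${G}_{2}({\nu}_{i})={G}_{1}({\nu}_{i}){\nu}_{i}^{2}$.

```latex
\begin{aligned}
\|\overrightarrow{w}_i\|
 &= \frac{\gamma}{\beta}\,
    \frac{\langle G_1(\nu_i)\cos(\alpha)\rangle_{\mu\in C_i}}
         {\langle G_2(\nu_i)\rangle_{\mu\in C_i}}
 \;\leqslant\; \frac{\gamma}{\beta}\,
    \frac{\langle G_1(\nu_i)\rangle_{\mu\in C_i}}
         {\langle G_1(\nu_i)\,\nu_i^{2}\rangle_{\mu\in C_i}} \\
 &\leqslant\; \frac{\gamma}{\beta\,\nu_{\min}^{2}}
 \;\leqslant\; \frac{\gamma}{\beta\,\theta^{2}}
\end{aligned}
```

For a single winning pattern the weight vector is parallel to that pattern, so $\mathrm{cos}(\alpha )=1$ and ${\nu}_{i}={\nu}_{\text{min}}$, and the tighter bound holds with equality.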
We can also derive a lower bound. For a pattern $\mu \in {C}_{i}$, let us write the firing rate of neuron $i$ as ${\nu}_{i}(\mu )={\overline{\nu}}_{i}+\mathrm{\Delta}{\nu}_{i}(\mu )$ where ${\overline{\nu}}_{i}$ is the mean firing rate of neuron $i$ averaged across all winning patterns and ${\langle \mathrm{\Delta}{\nu}_{i}\rangle }_{\mu \in {C}_{i}}=0$. We assume that the absolute size of $\mathrm{\Delta}{\nu}_{i}$ is small, i.e., ${\langle {(\mathrm{\Delta}{\nu}_{i})}^{2}\rangle }_{\mu \in {C}_{i}}\ll {({\overline{\nu}}_{i})}^{2}$. Linearization of Equation (8) around ${\overline{\nu}}_{i}$ yields:
Elementary geometric arguments for a neuron model with monotonically increasing frequency–current curve yield that the value of ${\langle \mathrm{cos}(\alpha )\mathrm{\Delta}{\nu}_{i}\rangle }_{\mu \in {C}_{i}}$ is positive (or zero) because an increase in the angle α lowers both the cosine and the firing rate, giving rise to a positive correlation. Since we are interested in a lower bound, we can therefore drop the term proportional to ${G}_{1}^{\prime}$ and evaluate the ratio ${G}_{1}/{G}_{2}$ to find:
where ${\nu}_{\mathrm{max}}$ is the maximal firing rate of a DGC and $\widehat{\alpha}={\mathrm{max}}_{\mu \in {\mathrm{C}}_{\mathrm{i}}}\{\alpha \}$ is the angle of the winning pattern that has the largest angle with the weight vector. The first bound is tight and is reached if neuron $i$ is winner for only two patterns.
To summarize, we find that the length of the weight vector remains bounded in a narrow range. Hence, for a reasonable distribution of input patterns and weight vectors, the value of $\|{\overrightarrow{w}}_{i}\|$ is similar for different neurons $i$, so that the weight vector will have, after convergence, similar lengths for all DGCs that are winners for at least one pattern. In our simulations with the MNIST data set, we find that the length of feedforward weight vectors lies in the range between 9.3 and 11.1 across all responsive neurons with a mean value close to 10; Figure 2e.
Early maturation phase
During the early phase of maturation, the GABAergic input onto a newborn DGC with index $l$ has an excitatory effect. In the model, it is implemented as follows: ${w}_{lk}^{EI}={w}_{EI}>0$ with probability ${p}_{EI}$ for any interneuron $k$ and ${w}_{lk}^{EI}=0$ otherwise (no connection). Since newborn cells do not project yet onto inhibitory neurons (Temprana et al., 2015), we have ${w}_{kl}^{IE}=0\;\forall l$. Newborn DGCs are known to have enhanced excitability (Schmidt-Hieber et al., 2004; Li et al., 2017), so their threshold is kept at ${b}_{l}=0\;\forall l$. Because the newborn model DGCs receive lateral excitation via interneurons and their thresholds are zero during the early phase of maturation, the lateral excitatory GABAergic input is always sufficient to activate them. Hence, if the firing rate of a newborn DGC exceeds the LTP threshold θ, the feedforward weights grow toward the presented input pattern, Equation (1).
Presentation of all patterns of the data set once (one epoch) is sufficient to reach convergence of the feedforward weights onto newborn DGCs. We define the end of the first epoch as the end of the early phase, i.e., simulation of one epoch of the model corresponds to about 3 weeks of biological time.
Late maturation phase
During the late phase of maturation (starting at about 3 weeks [Ge et al., 2006]), the GABAergic input onto newborn DGCs switches from excitatory to inhibitory. In terms of our model, it means that all existing ${w}_{lk}^{EI}$ connections switch their sign to ${w}_{EI}<0$. Furthermore, since newborn DGCs develop lateral connections to inhibitory neurons in the late maturation phase (Temprana et al., 2015), we set ${w}_{kl}^{IE}={w}_{IE}$ with probability ${p}_{IE}$, and ${w}_{kl}^{IE}=0$ otherwise. The thresholds of newborn DGCs are updated after presentation of pattern μ at time $n\cdot T$ (${b}_{l}^{(n)}={b}_{l}^{(n-1)}+\mathrm{\Delta}{b}_{l}$) according to $\mathrm{\Delta}{b}_{l}={\eta}_{b}\left({\nu}_{l}-{\nu}_{0}\right)$, where ${\nu}_{0}$ is a reference rate and ${\eta}_{b}$ a learning rate, to mimic the decrease of excitability as newborn DGCs mature (Table 1, Biologically plausible network). Therefore, the distribution of firing rates of newborn DGCs is shifted to the left (toward lower firing rates) at the end of the late phase of maturation compared to the early phase of maturation (Figure 2c,d). A sufficient condition for a newborn DGC to win the competition upon presentation of patterns of the novel cluster is that the scalar product between a pattern of the novel cluster and the feedforward weight vector onto the newborn DGC is larger than the scalar product between the pattern of the novel cluster and the feedforward weight vector onto any of the mature DGCs. Analogous to the early phase of maturation, presentation of all patterns of the data set once (one epoch) is sufficient to reach convergence of the feedforward weights onto newborn DGCs. We therefore consider that the late phase of maturation has been finished after one epoch.
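The two changes that mark the late phase, the sign switch of the GABAergic weights and the adaptation of the threshold, can be sketched in a few lines of Python. Function names and numerical values are hypothetical; only the update $\mathrm{\Delta}{b}_{l}={\eta}_{b}({\nu}_{l}-{\nu}_{0})$ and the sign flip of existing connections are taken from the text.

```python
def switch_gaba_sign(w_gaba_row):
    """Turn existing (non-zero) GABAergic weights onto a newborn DGC inhibitory.

    Absent connections (weight 0) stay absent.
    """
    return [-abs(w) if w != 0.0 else 0.0 for w in w_gaba_row]

def update_threshold(b, nu, nu_0, eta_b):
    """Threshold adaptation: b rises while the rate nu exceeds the reference nu_0."""
    return b + eta_b * (nu - nu_0)

# Illustrative values only (not those of Table 1)
w_late = switch_gaba_sign([0.3, 0.0, 0.3])
b = 0.0
for nu in [0.9, 0.8, 0.7]:          # hypothetical rates on successive presentations
    b = update_threshold(b, nu, nu_0=0.1, eta_b=0.1)
```

Because the threshold increases whenever the newborn cell fires above the reference rate, its excitability decreases over the course of the late phase.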
Input patterns
Two different sets of input patterns are used. Both data sets have a number $K$ of clusters and several thousands of patterns per cluster. As a first data set, we use the MNIST 12 × 12 patterns (Lecun et al., 1998) (${N}_{EC}=144$), normalized such that the L2-norm of each pattern is equal to 1. Normalization of inputs (be it implemented algorithmically as done here or by explicit inhibitory feedback) ensures that, once weight growth due to synaptic plasticity has ended and weights have stabilized, the overall strength of input onto DGCs is approximately identical for all cells (see Section Direction and length of the weight vector). Equalized lengths of weight vectors are, in turn, an important feature of classic soft or hard competitive networks (Kohonen, 1989; Hertz et al., 1991). The training set contains approximately 6000 patterns per digit, while the testing set contains about 1000 patterns per digit (Figure 1d). Both training patterns and test patterns contain a large variety of different writing styles indicating that the clusters of input patterns for each class are broadly distributed around their center of mass.
As a second data set, we use hand-made artificial patterns designed such that the distance between the centers of any two clusters, or in other words their pairwise similarity, is the same. All clusters lie on the positive quadrant of the surface of a hypersphere of dimension ${N}_{EC}-1$. The cluster centers are Walsh patterns shifted along the diagonal (Figure 5b):
with $\xi <1$ a parameter that determines the spacing between clusters. ${c}_{0}$ is a normalization factor to ensure that the center of mass of all clusters has an L2-norm of 1:
The number of input neurons ${N}_{EC}$ is ${N}_{EC}={2}^{K}$. The scalar product, and hence the angle Ω, between the center of mass of any pair of clusters $k$ and $l$ ($k\ne l$) is a function of ξ (Figure 5a):
We define the pairwise similarity $s$ of two clusters as: $s=1-\xi $. Highly similar clusters have a large $s$ due to the small distance between their centers (hence a small ξ).
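One construction that satisfies these properties can be sketched in Python. It assumes cluster centers of the form ${\overrightarrow{P}}^{k}={c}_{0}(\overrightarrow{1}+\xi {\overrightarrow{W}}^{k})$ with ${c}_{0}=1/\sqrt{{N}_{EC}(1+{\xi}^{2})}$, where ${\overrightarrow{W}}^{k}$ is a ±1 Walsh-type pattern whose sign is set by bit $k$ of the component index. This specific form and the function names are assumptions; they reproduce unit-norm centers in the positive quadrant with a pairwise scalar product of $1/(1+{\xi}^{2})$.

```python
import math

def cluster_centers(K, xi):
    """K cluster centers in dimension N_EC = 2**K (assumed construction)."""
    n = 2 ** K
    c0 = 1.0 / math.sqrt(n * (1.0 + xi ** 2))          # normalization factor
    centers = []
    for k in range(K):
        # +/-1 pattern: sign given by bit k of the component index j
        walsh = [1.0 if (j >> k) & 1 == 0 else -1.0 for j in range(n)]
        centers.append([c0 * (1.0 + xi * w) for w in walsh])
    return centers

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

xi = 0.2
centers = cluster_centers(K=7, xi=xi)                  # N_EC = 128
cos_omega = dot(centers[0], centers[1])                # 1 / (1 + xi**2) for any pair
omega_deg = math.degrees(math.acos(cos_omega))         # angle between two centers
similarity = 1.0 - xi                                  # pairwise similarity s
```

Decreasing `xi` moves all centers toward the diagonal, which shrinks the angle between any two of them and increases their similarity.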
To make the artificial data set comparable to the MNIST 12 × 12 data set, we choose $K=7$, so ${N}_{EC}=128$, and we generate 6000 noisy patterns per cluster for the training set and 1000 other noisy patterns per cluster for the testing set. Since our noisy high-dimensional input patterns have to be symmetrically distributed around the centers of mass ${\overrightarrow{P}}^{k}$, yet lie on the hypersphere, we have to use an appropriate sampling method. The patterns ${\overrightarrow{x}}^{\mu (k)}$ of a given cluster $k$ with center of mass ${\overrightarrow{P}}^{k}$ are thus sampled from a Von Mises–Fisher distribution (Mardia and Jupp, 2009):
with $\overrightarrow{\zeta}$ an L2-normalized vector taken in the space orthogonal to ${\overrightarrow{P}}^{k}$. The vector $\overrightarrow{\zeta}$ is obtained by performing the singular-value decomposition of ${\overrightarrow{P}}^{k}$ ($U\mathrm{\Sigma}{V}^{*}={\overrightarrow{P}}^{k}$) and multiplying the matrix $U$ (after removing its first column), which corresponds to the left-singular vectors in the orthogonal space to ${\overrightarrow{P}}^{k}$, with a vector whose elements are drawn from the standard normal distribution. Then the L2-norm of the obtained pattern is set to 1, so that it lies on the surface of the hypersphere. A rejection sampling scheme is used to obtain $a$ (Mardia and Jupp, 2009). The sample $a$ is kept if $\kappa a+({N}_{EC}-1)\text{ln}(1-\psi a)-c\geqslant \text{ln}(u)$, with κ a concentration parameter, $\psi =\frac{1-b}{1+b}$, $c=\kappa \psi +({N}_{EC}-1)\text{ln}(1-{\psi}^{2})$, $u$ drawn from a uniform distribution $u\sim U[0,1]$, $a=\frac{1-(1+b)z}{1-(1-b)z}$, $b=\frac{{N}_{EC}-1}{\sqrt{4{\kappa}^{2}+{({N}_{EC}-1)}^{2}}+2\kappa}$, and $z$ drawn from a beta distribution $z\sim \mathcal{B}e(\frac{{N}_{EC}-1}{2},\frac{{N}_{EC}-1}{2})$.
The concentration parameter κ characterizes the spread of the distribution around the center ${\overrightarrow{P}}^{k}$. In the limit where $\kappa \to 0$, sampling from the Von Mises–Fisher distribution becomes equivalent to sampling uniformly on the surface of the hypersphere, so the clusters become highly overlapping. In dimension ${N}_{EC}=128$, if $\kappa >{10}^{3}$, the probability of overlap between clusters is negligible. We use a value $\kappa ={10}^{4}$.
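The sampling procedure can be sketched in Python using only the standard library. Two points are assumptions rather than statements from the text: the sampled pattern is composed as $a{\overrightarrow{P}}^{k}+\sqrt{1-{a}^{2}}\overrightarrow{\zeta}$, and the orthogonal unit vector $\overrightarrow{\zeta}$ is obtained by projecting a Gaussian vector onto the space orthogonal to the center (a Gram–Schmidt step swapped in for the singular-value decomposition described above). Function names are hypothetical.

```python
import math
import random

def sample_cosine(kappa, n, rng):
    """Rejection sampling of a, the cosine of the angle to the cluster center."""
    d = n - 1
    b = d / (math.sqrt(4.0 * kappa ** 2 + d ** 2) + 2.0 * kappa)
    psi = (1.0 - b) / (1.0 + b)
    c = kappa * psi + d * math.log(1.0 - psi ** 2)
    while True:
        z = rng.betavariate(d / 2.0, d / 2.0)
        a = (1.0 - (1.0 + b) * z) / (1.0 - (1.0 - b) * z)
        u = 1.0 - rng.random()                      # uniform in (0, 1], avoids log(0)
        if kappa * a + d * math.log(1.0 - psi * a) - c >= math.log(u):
            return a

def sample_pattern(center, kappa, rng):
    """Draw one unit-norm pattern around a unit-norm center (assumed composition)."""
    n = len(center)
    a = sample_cosine(kappa, n, rng)
    # Gaussian vector projected onto the space orthogonal to the center, then normalized
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    proj = sum(gi * ci for gi, ci in zip(g, center))
    zeta = [gi - proj * ci for gi, ci in zip(g, center)]
    norm = math.sqrt(sum(v * v for v in zeta))
    s = math.sqrt(max(0.0, 1.0 - a * a))
    return [a * ci + s * zi / norm for ci, zi in zip(center, zeta)]

rng = random.Random(0)
center = [1.0 / math.sqrt(128)] * 128               # hypothetical unit-norm center
pattern = sample_pattern(center, kappa=1e4, rng=rng)
cosine = sum(p * c for p, c in zip(pattern, center))
```

With a large concentration parameter the sampled cosine stays close to 1, so the pattern remains near its cluster center.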
Classification performance (readout network)
It has been observed that classification performance based on DGC population activity is a good proxy for behavioral discrimination (Woods et al., 2020). Hence, to evaluate whether the newborn DGCs contribute to the function of the dentate gyrus network, we study classification performance. Once the feedforward weights have been adjusted upon presentation of many input patterns from the training set (Section Plasticity rule), we keep them fixed and determine classification on the test set using artificial readout units (RO).
To do so, the readout weights (${w}_{ki}^{RO}$ from model DGC $i$ to readout unit $k$) are initialized at random values drawn from a uniform distribution: ${w}_{ki}^{RO}\sim \sigma \mathcal{U}(0,1)$, with $\sigma =0.1$. The number of readout units, ${N}_{RO}$, corresponds to the number of learned classes. To adjust the readout weights, all patterns of the training data set that belong to the learned classes are presented one after the other. For each pattern ${\overrightarrow{x}}^{\mu}$, we let the firing rate of the DGCs converge (values at convergence: ${\nu}_{i}^{\mu}$). The activity of a readout unit $k$ is given by:
As we aim to assess the performance of the network of DGCs, the readout weights are adjusted by an artificial supervised learning rule. The loss function, which corresponds to the difference between the activity of the readout units and a one-hot representation of the corresponding pattern label (Hertz et al., 1991),
with ${L}_{k}^{\mu}$ the element $k$ of a one-hot representation of the correct label of pattern ${\overrightarrow{x}}^{\mu}$, is minimized by stochastic gradient descent:
The readout units have a rectified hyperbolic tangent frequency–current curve: $g(x)=\text{tanh}\left(2{[x]}_{+}\right)$, whose derivative is: ${g}^{\prime}(x)=2\left(1-{\left(\text{tanh}\left(2{[x]}_{+}\right)\right)}^{2}\right)$. We learn the weights of the readout units over 100 epochs of presentations of all training patterns with $\eta =0.01$, which is sufficient to reach convergence.
Thereafter, the readout weights are fixed. Each test set pattern belonging to one of the learned classes is presented once, and the firing rates of the DGCs are allowed to converge. Finally, the activity of the readout units ${\nu}_{k}^{RO,\mu}$ is computed and compared to the correct label ${L}_{k}^{\mu}$ of the presented pattern. If the readout unit with the highest activity value is the one that represents the class of the presented input pattern, the pattern is said to be correctly classified. Classification performance is given by the number of correctly classified patterns divided by the total number of test patterns of the learned classes.
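The readout training and the classification criterion can be illustrated with a minimal Python sketch. It assumes that the readout activity is ${\nu}_{k}^{RO}=g({\sum}_{i}{w}_{ki}^{RO}{\nu}_{i})$ and that the loss is the squared error $\frac{1}{2}{\sum}_{k}{({L}_{k}-{\nu}_{k}^{RO})}^{2}$, which is one common reading of the description above and not necessarily the exact expression used in the simulations. Function names and the toy activity vectors are hypothetical.

```python
import math

def g(x):
    """Rectified hyperbolic tangent frequency-current curve of the readout units."""
    return math.tanh(2.0 * max(x, 0.0))

def g_prime(x):
    """Derivative as given in the text; note that it evaluates to 2 for x <= 0."""
    return 2.0 * (1.0 - math.tanh(2.0 * max(x, 0.0)) ** 2)

def sgd_step(w_ro, nu, label, eta):
    """One stochastic gradient step on the assumed squared-error loss (in place)."""
    for k, row in enumerate(w_ro):
        drive = sum(w * v for w, v in zip(row, nu))
        err = label[k] - g(drive)
        for i, v in enumerate(nu):
            row[i] += eta * err * g_prime(drive) * v

def classify(w_ro, nu):
    """Index of the readout unit with the highest activity."""
    acts = [g(sum(w * v for w, v in zip(row, nu))) for row in w_ro]
    return max(range(len(acts)), key=acts.__getitem__)

# Toy example: two DGC activity vectors, two classes (illustrative values only)
data = [([1.0, 0.0], [1.0, 0.0]), ([0.0, 1.0], [0.0, 1.0])]
w_ro = [[0.05, 0.05], [0.05, 0.05]]
for _ in range(200):
    for nu, label in data:
        sgd_step(w_ro, nu, label, eta=0.1)
accuracy = sum(classify(w_ro, nu) == lab.index(1.0) for nu, lab in data) / len(data)
```

Here `accuracy` plays the role of the classification performance: the fraction of patterns for which the most active readout unit matches the label.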
Control cases
In our standard setting, patterns from a third digit are presented to a network that has previously only seen patterns from two digits. The question is whether neurogenesis helps when adding the third digit. We use several control cases to compare with the neurogenesis case. In the first control case, all three digits are learned in parallel (Figure 3a, control 1). In the two other control cases, we either keep all feedforward connections toward the DGCs plastic (Figure 3c, control 3) or fix the feedforward connections for all selective DGCs but keep unselective neurons plastic (as in the neurogenesis case) (Figure 3b, control 2). However, in both instances, the DGCs do not mature in the two-step process induced by the GABA-switch that is part of our model of neurogenesis.
Pretraining with two digits
As we are interested in neurogenesis at the adult stage, we pretrain the network with patterns from two digits, such that it already stores some memories before neurogenesis takes place. To do so, we randomly initialize the weights from EC neurons to DGCs: they are drawn from a uniform distribution (${w}_{ij}\sim U[0,1]$). The L2-norm of the feedforward weight vector onto each DGC is then normalized to 1, to ensure fair competition between DGCs during learning. Then we present all patterns from digits 3 and 4 in random order, as many times as needed for convergence of the weights. During each pattern presentation the firing rates of the DGCs are computed (Section Network architecture and neuronal dynamics) and their feedforward weights are updated according to our plasticity rule (Section Plasticity rule). We find that we need approximately 40 epochs for convergence of the weights and use 80 epochs to make sure that all weights are stable. At the end of pretraining, our network is considered to correspond to an adult stage, because some DGCs are selective for prototypes of the pretrained digits (Figure 1e).
Projection on pairwise discriminatory axes
To assess how separability of the DGC activation patterns develops during the late phase of maturation of newborn DGCs, we project the population activity onto axes which are optimized for pairwise discrimination (patterns from digit 3 versus patterns from digit 5, 4 versus 5, and 3 versus 4). Those axes are determined using Fisher linear discriminant analysis, as explained below.
We determine the vector of DGC firing rates, $\overrightarrow{\nu}$, at the end of the late phase of maturation of newborn DGCs upon presentation of each pattern, $\overrightarrow{x}$, from digits 3, 4, and 5 of the training MNIST dataset. The mean activity in response to all training patterns μ from digit $m$, ${\overrightarrow{\mu}}_{m}=\frac{1}{{N}_{m}}{\sum}_{\mu \in m}{\overrightarrow{\nu}}^{\mu}$, is computed for each of the three digits (${N}_{m}$ is the number of training patterns of digit $m$). The pairwise Fisher linear discriminant is defined as the linear function ${\overrightarrow{w}}^{T}\overrightarrow{\nu}$ that maximizes the distance between the means of the projected activity in response to two digits (e.g. $m$ and $n$), while normalizing for withindigit variability. The objective function to maximize is thus given as:
with ${S}_{B}=({\overrightarrow{\mu}}_{m}-{\overrightarrow{\mu}}_{n}){({\overrightarrow{\mu}}_{m}-{\overrightarrow{\mu}}_{n})}^{T}$ the between-digit scatter matrix, and ${S}_{W}={\mathrm{\Sigma}}_{m}+{\mathrm{\Sigma}}_{n}$ the within-digit scatter matrix (${\mathrm{\Sigma}}_{m}$ is the covariance matrix of the DGC activity in response to patterns of digit $m$, and ${\mathrm{\Sigma}}_{n}$ is the covariance matrix of the DGC activity in response to patterns of digit $n$). It can be shown that the direction of the optimal discriminatory axis between digits $m$ and $n$ is given by the eigenvector of ${S}_{W}^{-1}{S}_{B}$ that corresponds to the largest eigenvalue.
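For reference, the standard form of the Fisher criterion and its closed-form solution are sketched below. We assume that the objective is the usual Rayleigh quotient of the two scatter matrices and that ${S}_{W}$ is invertible. Because ${S}_{B}$ has rank one, the eigenvector with the only non-zero eigenvalue can be written down directly.

```latex
\begin{aligned}
J(\overrightarrow{w}) &=
  \frac{\overrightarrow{w}^{T} S_B \, \overrightarrow{w}}
       {\overrightarrow{w}^{T} S_W \, \overrightarrow{w}}, \\
S_W^{-1} S_B \, \overrightarrow{w} &=
  S_W^{-1} (\overrightarrow{\mu}_m - \overrightarrow{\mu}_n)\,
  \left[ (\overrightarrow{\mu}_m - \overrightarrow{\mu}_n)^{T} \overrightarrow{w} \right]
  \quad \Rightarrow \quad
  \overrightarrow{w}^{*} \propto S_W^{-1} (\overrightarrow{\mu}_m - \overrightarrow{\mu}_n)
\end{aligned}
```

In practice this means that the discriminatory axis can be obtained by solving a single linear system rather than a full eigenvalue problem.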
We arbitrarily set ‘axis 1’ as the optimal discriminatory axis between digit 3 and digit 5, ‘axis 2’ as the optimal discriminatory axis between digit 4 and digit 5, and ‘axis 3’ as the optimal discriminatory axis between digit 3 and digit 4. For each of the three discriminatory axes, we define its origin (i.e. projection value of 0) as the location of the average projection of all training patterns of the three digits on the corresponding axis. Figure 4 represents the projections of DGC activity upon presentation of testing patterns at the end of the early and late phase of maturation of newborn DGCs onto the above-defined axes.
Statistics
In the main text, we present a representative example with three digits from the MNIST data set (3, 4, and 5). It is selected from a set of 10 random combinations of three different digits. For each combination, one network is pretrained with two digits for 80 epochs. Then the third digit is added and neurogenesis takes place (one epoch of early phase of maturation, and one epoch of late phase of maturation). Furthermore, another network is pretrained directly with the three digits for 80 epochs. Classification performance is reported for all combinations (Supplementary file 1).
Simplified rate network
We use a toy network and the artificial data set to determine whether our theory of integration of newborn DGCs can explain why adult dentate gyrus neurogenesis helps for the discrimination of similar, but not for distinct patterns.
The rate network described above is simplified as follows. We use $K$ DGCs for $K$ clusters. Their firing rate ${\nu}_{i}$ is given by:
where $\mathscr{H}$ is the Heaviside step function. As before, ${b}_{i}$ is the threshold, and ${I}_{i}$ the total input toward neuron $i$:
with ${x}_{j}$ the input of presynaptic EC neuron $j$, ${w}_{ij}$ the feedforward weight between EC neuron $j$ and DGC $i$, and ${\nu}_{k}$ the firing rate of DGC $k$. Inhibitory neurons are modeled implicitly: each DGC directly connects to all other DGCs via inhibitory recurrent connections of value ${w}_{rec}<0$. During presentation of pattern ${\overrightarrow{x}}^{\mu}$, the firing rates of the DGCs evolve according to Equation (21). After convergence, the feedforward weights are updated: ${w}_{ij}^{(\mu )}={w}_{ij}^{(\mu -1)}+\mathrm{\Delta}{w}_{ij}$. The synaptic plasticity rule is the same as before, see Equation (1), but with the parameters reported in Table 1 (Simple network). They are different from those of the biologically plausible network because we now aim for a single winning neuron for each cluster. Note that for an LTP threshold $\theta <1$ all active DGCs update their feedforward weights because of the Heaviside function for the firing rate (Equation 21).
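Assuming a single winner ${i}^{*}$ for each pattern presentation, the input (Equation 22) to the winner is:
$${I}_{{i}^{*}}={\overrightarrow{w}}_{{i}^{*}}\cdot {\overrightarrow{x}}^{\mu} \tag{23}$$
while the input to the losers is:
$${I}_{i}={\overrightarrow{w}}_{i}\cdot {\overrightarrow{x}}^{\mu}+{w}_{rec} \tag{24}$$
Therefore, two conditions need to be satisfied for a solution with a single winner:
$${\overrightarrow{w}}_{{i}^{*}}\cdot {\overrightarrow{x}}^{\mu}>{b}_{{i}^{*}} \tag{25}$$
for the winner to actually be active, and:
$${\overrightarrow{w}}_{i}\cdot {\overrightarrow{x}}^{\mu}+{w}_{rec}<{b}_{i} \tag{26}$$
to prevent non-winners from becoming active. The value of ${b}_{i}$ in the model is lower in the early phase than in the late phase of maturation to mimic enhanced excitability (Schmidt-Hieber et al., 2004; Li et al., 2017).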
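The single-winner behavior can be illustrated with a minimal simulation sketch. It assumes rate dynamics of the form $\tau \, d{\nu}_{i}/dt=-{\nu}_{i}+\mathscr{H}({I}_{i}-{b}_{i})$ integrated with forward Euler, cluster centers built so that their pairwise scalar product is $1/(1+{\xi}^{2})$, and converged weights of L2-norm 1.5. The construction in `cluster_centers`, the values of `tau`, `dt` and `steps`, and all function names are illustrative assumptions, not the authors' code.

```python
import math

def cluster_centers(K, xi):
    """Unit vectors with pairwise scalar product 1/(1+xi^2) (assumed construction)."""
    norm = math.sqrt(1.0 + xi * xi)
    centers = []
    for k in range(K):
        p = [0.0] * (K + 1)
        p[0] = 1.0 / norm      # component shared by all clusters
        p[k + 1] = xi / norm   # cluster-specific component
        centers.append(p)
    return centers

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def run_rates(weights, x, b=1.2, w_rec=-1.2, tau=20.0, dt=0.1, steps=5000):
    """Forward-Euler integration of tau * dnu_i/dt = -nu_i + H(I_i - b)."""
    K = len(weights)
    ff = [dot(w, x) for w in weights]   # feedforward drive, constant during a presentation
    nu = [0.0] * K                      # rates are reset to zero before each pattern
    for _ in range(steps):
        total = sum(nu)
        # total input: feedforward drive plus inhibition from all other DGCs
        inputs = [ff[i] + w_rec * (total - nu[i]) for i in range(K)]
        nu = [nu[i] + dt / tau * (-nu[i] + (1.0 if inputs[i] > b else 0.0))
              for i in range(K)]
    return nu

centers = cluster_centers(K=3, xi=0.2)              # three similar clusters
weights = [[1.5 * c for c in p] for p in centers]   # converged weights of norm 1.5
rates = run_rates(weights, centers[0])              # present the center of cluster 1
print([round(r, 3) for r in rates])
```

With similar clusters all three units are driven above threshold at first, and the recurrent inhibition then silences the two units with the weaker feedforward drive, which is the single-winner solution assumed in the text.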
Similar versus distinct patterns with the artificial data set
Using the artificial data set with $\xi <1$ (Equation 13), the scalar product between the centers of mass of two different clusters, given by Equation (15), satisfies: $0.5\leqslant \frac{1}{1+{\xi}^{2}}\leqslant 1$. This corresponds to ${0}^{\circ}\leqslant \mathrm{\Omega}\leqslant {\mathrm{\Omega}}_{\text{max}}={60}^{\circ}$.
After stimulation with a pattern $\overrightarrow{x}$, it takes some time before the firing rates of the DGCs converge. We call two patterns ‘similar’ if they activate, at least initially, the same output unit, while we consider two patterns as ‘distinct’ if they do not activate the same output unit, not even initially. We now show that, with a large concentration parameter κ, patterns of different clusters are similar if $\xi <\sqrt{\frac{\lVert {\overrightarrow{w}}_{i}\rVert}{{b}_{i}}-1}$ and distinct if $\xi >\sqrt{\frac{\lVert {\overrightarrow{w}}_{i}\rVert}{{b}_{i}}-1}$.
We first consider a DGC $i$ whose feedforward weight vector has converged toward the center of mass of cluster $k$. If an input pattern ${\overrightarrow{x}}^{\mu (k)}$ from cluster $k$ is presented, it will receive the following initial input:
$${I}_{i}={\overrightarrow{w}}_{i}\cdot {\overrightarrow{x}}^{\mu (k)}=\lVert {\overrightarrow{w}}_{i}\rVert \,\mathrm{cos}({\vartheta}_{\text{kk}}) \tag{27}$$
where ${\vartheta}_{\text{kk}}$ is the angle between the pattern ${\overrightarrow{x}}^{\mu (k)}$ and the center of mass ${\overrightarrow{P}}^{k}$ of the cluster to which it belongs. The larger the concentration parameter κ for the generation of the artificial data set, the smaller the dispersion of the clusters, and thus the larger $\mathrm{cos}({\vartheta}_{\text{kk}})$. If instead, an input pattern from cluster $l$ is presented, that same DGC will receive a lower initial input:
$${I}_{i}={\overrightarrow{w}}_{i}\cdot {\overrightarrow{x}}^{\mu (l)}\approx \lVert {\overrightarrow{w}}_{i}\rVert \,\mathrm{cos}(\mathrm{\Omega})=\frac{\lVert {\overrightarrow{w}}_{i}\rVert}{1+{\xi}^{2}} \tag{28}$$
The approximation holds for a small dispersion of the clusters (large concentration parameter κ). We note that there is no subtraction of the recurrent input yet because output units are initialized with zero firing rate before each pattern presentation. By definition, similar patterns stimulate (initially) the same DGCs. A DGC can be active for two clusters only if its threshold is:
$${b}_{i}<\frac{\lVert {\overrightarrow{w}}_{i}\rVert}{1+{\xi}^{2}} \tag{29}$$
Therefore, with a high concentration parameter κ, patterns of different clusters are similar if $\xi <\sqrt{\frac{\lVert {\overrightarrow{w}}_{i}\rVert}{{b}_{i}}-1}$, while patterns of different clusters are distinct if $\xi >\sqrt{\frac{\lVert {\overrightarrow{w}}_{i}\rVert}{{b}_{i}}-1}$.
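For reference, the algebra behind this bound can be written out in one line. This is a sketch that assumes unit-norm input patterns, a positive threshold ${b}_{i}$, and the small-dispersion approximation in which a pattern of another cluster makes an angle $\mathrm{\Omega}$ with the weight vector, with $\mathrm{cos}(\mathrm{\Omega})=1/(1+{\xi}^{2})$:

$${b}_{i}<\frac{\lVert {\overrightarrow{w}}_{i}\rVert}{1+{\xi}^{2}}\;\Longleftrightarrow \;1+{\xi}^{2}<\frac{\lVert {\overrightarrow{w}}_{i}\rVert}{{b}_{i}}\;\Longleftrightarrow \;\xi <\sqrt{\frac{\lVert {\overrightarrow{w}}_{i}\rVert}{{b}_{i}}-1}$$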
Parameter choice
The upper bound of the expected L2-norm of the feedforward weight vector toward the DGCs at convergence can be computed, see Equation (10). With the parameters in Table 1 (Simple network), the value is $\lVert {\overrightarrow{w}}_{i}\rVert \leqslant 1.5$. Moreover, the input patterns for each cluster are highly concentrated; hence, their angle with the center of mass of the cluster they belong to is close to 0, so we have $\lVert {\overrightarrow{w}}_{i}\rVert \approx 1.5$. Therefore, at convergence, a DGC selective for a given cluster $k$ receives an input ${I}_{{i}^{\ast}}={\overrightarrow{w}}_{{i}^{\ast}}\cdot {\overrightarrow{x}}^{\mu (k)}\approx 1.5$ upon presentation of input patterns ${\overrightarrow{x}}^{\mu (k)}$ belonging to cluster $k$. We choose ${b}_{i}=1.2$ to satisfy Equation (25). Given ${b}_{i}$, the threshold value ${\xi}_{\text{thresh}}$ below which two clusters are similar (and above which two clusters are distinct) can be determined from Equation (29): ${\xi}_{\text{thresh}}=0.5$. We created a handmade data set with $\xi =0.2$ for the case of similar clusters (therefore with similarity $s=0.8$), and a handmade data set with $\xi =0.8$ for the distinct case (hence with similarity $s=0.2$).
Let us suppose that the weights of DGC $i$ have converged and made this cell respond to patterns of cluster $i$. If another DGC $k$ of the network is selective for cluster $k$, cell $i$ gets the input ${I}_{i}={\overrightarrow{w}}_{i}\cdot {\overrightarrow{x}}^{\mu (k)}+{w}_{\text{rec}}\approx \frac{1.5}{1+{\xi}^{2}}+{w}_{\text{rec}}$ upon presentation of input patterns ${\overrightarrow{x}}^{\mu (k)}$ belonging to cluster $k\ne i$. Hence, to satisfy Equation (26), we need ${w}_{\text{rec}}<{b}_{i}-{\mathrm{max}}_{\xi}\left(\frac{1.5}{1+{\xi}^{2}}\right)\approx -0.24$. We set ${w}_{\text{rec}}=-1.2$.
Furthermore, a newborn DGC is born with a null feedforward weight vector so that at birth, its input consists only of the indirect excitatory input from mature DGCs, which vanishes if all DGCs are quiescent and takes a value ${I}_{i}=-{w}_{\text{rec}}>0$ if a mature DGC responds to the input. For the feedforward weight vector to grow, the newborn cell $i$ needs to be active. This could be achieved through spontaneous activity that could be implemented by setting the intrinsic firing threshold at birth to a value ${b}_{\text{birth}}<0$. In this case, a difference between similar and distinct patterns is not expected. Alternatively, activity of newborn cells can be achieved in the absence of spontaneous activity under the condition $-{w}_{\text{rec}}>{b}_{\text{birth}}$. For the simulations with the toy model, we set ${b}_{\text{birth}}=0.9$, which leads to weight growth in newborn cells for similar, but not distinct patterns.
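The parameter choices in this subsection can be checked numerically with a short sketch. It uses only the values quoted in the text (weight norm 1.5, mature threshold 1.2, recurrent weight of magnitude 1.2 and inhibitory sign, birth threshold 0.9) and the small-dispersion approximation for the cross-cluster input. The constant and function names are illustrative assumptions, and the loop only considers patterns of a novel cluster presented to a newborn cell with null feedforward weights.

```python
import math

W_NORM = 1.5     # upper bound on the converged L2-norm (simple network)
B_MATURE = 1.2   # firing threshold of mature DGCs
W_REC = -1.2     # inhibitory recurrent weight
B_BIRTH = 0.9    # firing threshold of a newborn DGC at birth

def cross_cluster_input(xi, w_norm=W_NORM):
    """Approximate initial feedforward input from a pattern of another cluster."""
    return w_norm / (1.0 + xi ** 2)

# Similarity threshold and upper bound on the recurrent weight
xi_thresh = math.sqrt(W_NORM / B_MATURE - 1.0)
w_rec_bound = B_MATURE - max(cross_cluster_input(xi) for xi in (0.2, 0.8))

for xi in (0.2, 0.8):
    # Does a pattern of the novel cluster initially drive a mature DGC?
    mature_responds = cross_cluster_input(xi) > B_MATURE
    # Early phase: the newborn cell only receives -w_rec from active mature DGCs.
    newborn_input = -W_REC if mature_responds else 0.0
    newborn_active = newborn_input > B_BIRTH
    label = "similar" if xi < xi_thresh else "distinct"
    print(xi, label, mature_responds, newborn_active)
```

Under these assumptions, novel patterns activate the newborn cell only in the similar case ($\xi =0.2$), since in the distinct case ($\xi =0.8$) no mature DGC responds to them.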
Neurogenesis with the artificial data set
To save computation time, we initialize the feedforward weight vectors of two mature DGCs at two training patterns randomly chosen from the first two clusters, normalized such that they have an L2-norm of 1.5. We then present patterns from clusters 1 and 2 and let the feedforward weights evolve according to Equation (1) until they reach convergence.
We thereafter fix the feedforward weights onto the two mature cells and introduce a novel cluster of patterns as well as a newborn DGC in the network. The sequence of presentation of patterns from the three clusters (a novel one and two pretrained ones) is random. The newborn DGC is born with a null feedforward weight vector, and its maturation follows the same rules as before (plastic feedforward weights). In the early phase, GABAergic input has an excitatory effect (Ge et al., 2006) and the newborn DGC does not inhibit the mature DGCs (Temprana et al., 2015). This is modeled by setting ${w}_{\text{rec}}^{NM}=-{w}_{\text{rec}}$ for the connections from mature to newborn DGC, and ${w}_{\text{rec}}^{MN}=0$ for the connections from newborn to mature DGCs. The threshold of the newborn DGC starts at ${b}_{\text{birth}}=0.9$ at birth, mimicking enhanced excitability (Schmidt-Hieber et al., 2004; Li et al., 2017), and increases linearly up to 1.2 (same threshold as that of mature DGCs) over 12,000 pattern presentations, reflecting loss of excitability with maturation. The exact time window is not critical. In the late phase of maturation of the newborn DGC, GABAergic input switches to inhibitory (Ge et al., 2006), and the newborn DGC recruits feedback inhibition onto mature DGCs (Temprana et al., 2015). This is modeled by switching the sign of the connection from mature to newborn DGC: ${w}_{\text{rec}}^{NM}={w}_{\text{rec}}$, and by establishing connections from newborn to mature DGCs: ${w}_{\text{rec}}^{MN}={w}_{\text{rec}}$. Each of the 6000 patterns is presented once during the early phase of maturation and once during the late phase of maturation.
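The maturation schedule described above can be summarized in a small helper. This is a sketch under two assumptions that the text does not state explicitly: the early phase covers the first 6000 pattern presentations and the late phase the next 6000, and the linear threshold ramp starts at the first presentation and spans both phases. The function and constant names are hypothetical.

```python
W_REC = -1.2                   # recurrent weight between mature DGCs (inhibitory)
B_BIRTH, B_MATURE = 0.9, 1.2   # newborn threshold at birth, mature threshold
N_EARLY = 6000                 # presentations in the early phase (assumed)
N_TOTAL = 12000                # length of the threshold ramp, early + late phase

def newborn_parameters(t):
    """Return (threshold, w_rec_NM, w_rec_MN) at pattern presentation t (0-based)."""
    frac = min(max(t / N_TOTAL, 0.0), 1.0)
    threshold = B_BIRTH + frac * (B_MATURE - B_BIRTH)   # linear loss of excitability
    if t < N_EARLY:
        w_nm = -W_REC   # mature -> newborn: GABAergic input is excitatory
        w_mn = 0.0      # newborn -> mature: no feedback inhibition yet
    else:
        w_nm = W_REC    # mature -> newborn: GABAergic input is now inhibitory
        w_mn = W_REC    # newborn -> mature: feedback inhibition is recruited
    return threshold, w_nm, w_mn

for t in (0, 5999, 6000, 12000):
    print(t, newborn_parameters(t))
```

Since the text notes that the exact time window is not critical, `N_EARLY` and `N_TOTAL` can be varied independently in this sketch.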
The above paradigm is run separately for each of the two handmade data sets: the one where clusters are similar ($s=0.8$) and the one where clusters are distinct ($s=0.2$).
Analytical computation of the L2-norm and angle
We consider the case where two mature DGCs have learned their synaptic connections, such that the first mature DGC with feedforward weight vector ${\overrightarrow{w}}_{1}$ is selective for cluster 1 with normalized center of mass ${\overrightarrow{P}}^{1}$, and the second mature DGC with feedforward weight vector ${\overrightarrow{w}}_{2}$ is selective for cluster 2 with normalized center of mass ${\overrightarrow{P}}^{2}$. After convergence, we have ${\overrightarrow{w}}_{1}=\langle \lVert {\overrightarrow{w}}_{1}\rVert \rangle {\overrightarrow{P}}^{1}$ and ${\overrightarrow{w}}_{2}=\langle \lVert {\overrightarrow{w}}_{2}\rVert \rangle {\overrightarrow{P}}^{2}$, where $\langle \lVert {\overrightarrow{w}}_{k}\rVert \rangle$ is the expected L2-norm of the feedforward weight vector onto mature DGC $k$ that is selective for pretrained cluster $k$. In addition, the upper bound for the L2-norm of the weight vectors of the mature DGCs can be determined: $\langle \lVert {\overrightarrow{w}}_{1}\rVert \rangle =\langle \lVert {\overrightarrow{w}}_{2}\rVert \rangle \leqslant 1.5$. In our case, we obtain $\langle \lVert {\overrightarrow{w}}_{1}\rVert \rangle =\langle \lVert {\overrightarrow{w}}_{2}\rVert \rangle \approx 1.49$ because of the dispersion of the patterns around their center of mass; hence, we will use this value for the numerical computations below.
We represent the feedforward weight vector ${\overrightarrow{w}}_{i}$ onto a newborn DGC as an arrow of length $\langle \lVert {\overrightarrow{w}}_{i}\rVert \rangle$ (Figure 6—figure supplement 1). We compute analytically its L2-norm at the end of the early phase of maturation of the newborn DGC, as well as its angle φ with the center of mass of the novel cluster ${\overrightarrow{P}}^{i}$, to confirm the results obtained numerically (Figure 6, Figure 6—figure supplement 1).
In the early phase of maturation, the feedforward weight vector onto the newborn DGC grows. The norm stabilizes at a higher value in the case of similar patterns ($s=0.8$, Figure 6—figure supplement 1) than in the case of distinct patterns ($s=0.2$, Figure 6—figure supplement 1). This is because the center of mass of three similar clusters lies closer to the surface of the sphere than the center of mass of two distinct clusters (see below). In the late phase of maturation, for similar clusters we observe a slight increase of the L2-norm of the feedforward weight vector onto the newborn DGC concomitantly with the decrease of its angle with the center of mass of the novel cluster (Figure 6—figure supplement 1), because the center of mass of the novel cluster lies closer to the surface of the sphere than the center of mass of the three clusters.
Similar clusters
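The angle between the centers of mass of any pair of similar clusters ($s=0.8$, $\xi =0.2$) is given by Equation (15):
$$\mathrm{\Omega}=\mathrm{arccos}\left(\frac{1}{1+{\xi}^{2}}\right)$$
Half the distance between the projections of the centers of mass of any pair of similar clusters on a concentric sphere with radius $\langle \lVert {\overrightarrow{w}}_{1}\rVert \rangle$ is given by (Figure 6—figure supplement 1):
$$\langle \lVert {\overrightarrow{w}}_{1}\rVert \rangle \,\mathrm{sin}\left(\frac{\mathrm{\Omega}}{2}\right)$$
The triangle that connects the centers of mass of the three clusters is equilateral, and $y$ bisects one of its angles ($\pi /6$ [rad] on each side). So the length $y$ can be calculated:
$$y=\frac{\langle \lVert {\overrightarrow{w}}_{1}\rVert \rangle \,\mathrm{sin}(\mathrm{\Omega}/2)}{\mathrm{cos}(\pi /6)}$$
Using Pythagoras' formula, we can thus determine the expected L2-norm $\langle \lVert {\overrightarrow{w}}_{i}\rVert \rangle$ of the feedforward weight vector onto the newborn DGC at the end of the early phase of maturation:
$$\langle \lVert {\overrightarrow{w}}_{i}\rVert \rangle =\sqrt{{\langle \lVert {\overrightarrow{w}}_{1}\rVert \rangle}^{2}-{y}^{2}}$$
and finally its angle with the center of mass of the novel cluster:
$$\phi =\mathrm{arctan}\left(\frac{y}{\langle \lVert {\overrightarrow{w}}_{i}\rVert \rangle}\right)$$
The numerical values are as follows: $\langle \lVert {\overrightarrow{w}}_{i}\rVert \rangle \approx 1.47$ and $\phi \approx {9.21}^{\circ}$, which correspond to the values in Figure 6—figure supplement 1.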
Distinct clusters
In the case of distinct patterns ($s=0.2$, $\xi =0.8$), the angle between the centers of mass of any pair of clusters is given by Equation (15):
$$\mathrm{\Omega}=\mathrm{arccos}\left(\frac{1}{1+{\xi}^{2}}\right)$$
We can directly compute the expected L2-norm of the feedforward weight vector onto the newborn DGC at the end of the early phase of maturation (Figure 6—figure supplement 1):
$$\langle \lVert {\overrightarrow{w}}_{i}\rVert \rangle =\langle \lVert {\overrightarrow{w}}_{1}\rVert \rangle \,\mathrm{cos}\left(\frac{\mathrm{\Omega}}{2}\right)$$
We can then calculate the length $z$ between the projection of the center of mass of one of the two pretrained clusters on a concentric sphere with radius $\langle \lVert {\overrightarrow{w}}_{1}\rVert \rangle$ and the feedforward weight vector onto the newborn DGC:
$$z=\langle \lVert {\overrightarrow{w}}_{1}\rVert \rangle \,\mathrm{sin}\left(\frac{\mathrm{\Omega}}{2}\right)$$
Analogous to the similar case, we observe that $y$ bisects one angle of the equilateral triangle connecting the projections of the centers of mass of the clusters on the sphere, consequently:
$$y=\frac{z}{\mathrm{tan}(\pi /6)}$$
Finally, the angle between the center of mass of the novel cluster and the feedforward weight vector onto the newborn DGC at the end of the early phase of maturation is:
$$\phi =\mathrm{arccos}\left(\frac{{\langle \lVert {\overrightarrow{w}}_{i}\rVert \rangle}^{2}+{\langle \lVert {\overrightarrow{w}}_{1}\rVert \rangle}^{2}-{y}^{2}}{2\,\langle \lVert {\overrightarrow{w}}_{i}\rVert \rangle \,\langle \lVert {\overrightarrow{w}}_{1}\rVert \rangle}\right)$$
We obtain the following approximate values: $\langle \lVert {\overrightarrow{w}}_{i}\rVert \rangle \approx 1.34$ and $\phi \approx {47.2}^{\circ}$, which correspond to the values in Figure 6—figure supplement 1. The angle φ is smaller in the similar case than in the distinct case, hence the norm is larger in the similar case, as observed in Figure 6—figure supplement 1.
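The quoted norms and angles for both cases can be cross-checked with explicit vectors instead of triangle geometry. The sketch below rests on two assumptions. First, the newborn weight vector at the end of the early phase is taken to be the mean of the radius-1.49 projections of the cluster centers that drive it: all three clusters in the similar case, the two pretrained clusters in the distinct case. Second, the cluster centers are built as unit vectors with pairwise scalar product $1/(1+{\xi}^{2})$. All names are illustrative.

```python
import math

R = 1.49  # expected L2-norm of the mature weight vectors quoted in the text

def unit_centers(xi, K=3):
    """Unit vectors with pairwise scalar product 1/(1+xi^2) (assumed construction)."""
    norm = math.sqrt(1.0 + xi ** 2)
    return [[1.0 / norm] + [xi / norm if j == k else 0.0 for j in range(K)]
            for k in range(K)]

def newborn_geometry(xi, driving_clusters, novel=2):
    """Norm of the newborn weight vector and its angle (degrees) to the novel center."""
    centers = unit_centers(xi)
    dim = len(centers[0])
    # Assumed fixed point: mean of the radius-R projections of the driving clusters.
    w = [R * sum(centers[k][d] for k in driving_clusters) / len(driving_clusters)
         for d in range(dim)]
    norm = math.sqrt(sum(c * c for c in w))
    cos_phi = sum(a * b for a, b in zip(w, centers[novel])) / norm
    return norm, math.degrees(math.acos(cos_phi))

# Similar case: clusters 0 and 1 are pretrained, cluster 2 is novel; all three drive the cell.
norm_sim, phi_sim = newborn_geometry(0.2, driving_clusters=[0, 1, 2])
# Distinct case: only the two pretrained clusters drive the newborn cell.
norm_dis, phi_dis = newborn_geometry(0.8, driving_clusters=[0, 1])
print(round(norm_sim, 2), round(phi_sim, 2))
print(round(norm_dis, 2), round(phi_dis, 1))
```

If the assumptions hold, the two printed pairs should agree with the values quoted for the similar and the distinct case to the stated precision.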
Effective dimensionality and participation ratio
The effective dimensionality of the input is measured as the participation ratio (PR) defined as $PR={(\text{Tr}(C))}^{2}/\text{Tr}({C}^{2})$, where $C$ is the covariance matrix of the input patterns, and $\text{Tr}(C)$ denotes the trace of matrix $C$ (Mazzucato et al., 2016; Litwin-Kumar et al., 2017).
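As an illustration of the definition, the participation ratio can be computed directly from a list of patterns. This is a minimal pure-Python sketch, not the authors' implementation; the function name and the two toy data sets are assumptions made for the example.

```python
def participation_ratio(patterns):
    """PR = Tr(C)^2 / Tr(C^2), with C the covariance matrix of the rows of `patterns`."""
    n, d = len(patterns), len(patterns[0])
    mean = [sum(p[j] for p in patterns) / n for j in range(d)]
    centered = [[p[j] - mean[j] for j in range(d)] for p in patterns]
    # Covariance matrix; the 1/n normalisation cancels in the ratio.
    cov = [[sum(row[a] * row[b] for row in centered) / n for b in range(d)]
           for a in range(d)]
    trace = sum(cov[a][a] for a in range(d))
    trace_sq = sum(cov[a][b] * cov[b][a] for a in range(d) for b in range(d))  # Tr(C^2)
    return trace ** 2 / trace_sq

line = [[-1.0, 0.0], [1.0, 0.0]]                            # variance along one axis only
cross = [[-1.0, 0.0], [1.0, 0.0], [0.0, -1.0], [0.0, 1.0]]  # equal variance along two axes
print(participation_ratio(line), participation_ratio(cross))
```

The PR equals 1 when all variance lies along a single direction and equals the number of dimensions when the variance is spread equally across them. The function is undefined (division by zero) if all patterns are identical.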
Data availability
Simulation and plotting scripts can be found at: https://github.com/ogozel/NeurogenesisModel (copy archived at https://archive.softwareheritage.org/swh:1:rev:e46f2dfc10c21d69ac057f31c5800f46644b004a).

The MNIST database of handwritten digits. ID: yann.lecun.com/exdb/mnist/
References
GABAergic cells are the major postsynaptic targets of mossy fibers in the rat hippocampus. The Journal of Neuroscience 18:3386–3403. https://doi.org/10.1523/JNEUROSCI.180903386.1998
A theory of cerebellar function. Mathematical Biosciences 10:25–61. https://doi.org/10.1016/00255564(71)900514
Transition to chaos in random networks with cell-type-specific connectivity. Physical Review Letters 114:088101. https://doi.org/10.1103/PhysRevLett.114.088101
The dentate gyrus: fundamental neuroanatomical organization (dentate gyrus for dummies). Progress in Brain Research 163:3–22. https://doi.org/10.1016/S00796123(07)630015
The Hippocampus Book. Oxford University Press. https://doi.org/10.1093/acprof:oso/9780195100273.001.0001
Additive neurogenesis as a strategy for avoiding interference in a sparsely-coding dentate gyrus. Network: Computation in Neural Systems 20:137–161. https://doi.org/10.1080/09548980902993156
Excitatory actions of GABA during development: the nature of the nurture. Nature Reviews Neuroscience 3:728–739. https://doi.org/10.1038/nrn920
Adult neurogenesis produces a large pool of new granule cells in the dentate gyrus. The Journal of Comparative Neurology 435:406–417. https://doi.org/10.1002/cne.1040
Becoming a new neuron in the adult olfactory bulb. Nature Neuroscience 6:507–518. https://doi.org/10.1038/nn1048
GABA depolarization is required for experience-dependent synapse unsilencing in adult-born neurons. Journal of Neuroscience 33:6614–6622. https://doi.org/10.1523/JNEUROSCI.078113.2013
Heterosynaptic plasticity: multiple mechanisms and multiple roles. The Neuroscientist: A Review Journal Bringing Neurobiology, Neurology and Psychiatry 20:483–498. https://doi.org/10.1177/1073858414529829
Connectivity reflects coding: a model of voltage-based STDP with homeostasis. Nature Neuroscience 13:344–352. https://doi.org/10.1038/nn.2479
Apoptosis, neurogenesis, and information content in Hebbian networks. Biological Cybernetics 94:9–19. https://doi.org/10.1007/s0042200500268
Short-term and long-term survival of new neurons in the rat dentate gyrus. The Journal of Comparative Neurology 460:563–572. https://doi.org/10.1002/cne.10675
Hippocampal neurogenesis reduces the dimensionality of sparsely coded representations to enhance memory encoding. Frontiers in Computational Neuroscience 12:99. https://doi.org/10.3389/fncom.2018.00099
New neurons and new memories: how does adult hippocampal neurogenesis affect learning and memory? Nature Reviews Neuroscience 11:339–350. https://doi.org/10.1038/nrn2822
Adding a conscience to competitive learning. IEEE International Conference on Neural Networks. pp. 117–124. https://doi.org/10.1109/ICNN.1988.23839
Clustering: a neural network approach. Neural Networks 23:89–107. https://doi.org/10.1016/j.neunet.2009.08.007
Neurogenesis paradoxically decreases both pattern separation and memory interference. Frontiers in Systems Neuroscience 9:136. https://doi.org/10.3389/fnsys.2015.00136
Neonatal maternal separation delays the GABA excitatory-to-inhibitory functional switch by inhibiting KCC2 expression. Biochemical and Biophysical Research Communications 493:1243–1249. https://doi.org/10.1016/j.bbrc.2017.09.143
Neuronal Dynamics: From Single Neurons to Networks and Models of Cognition. Cambridge University Press. https://doi.org/10.1017/CBO9781107447615
Single granule cells reliably discharge targets in the hippocampal CA3 network in vivo. Nature Neuroscience 5:790–795. https://doi.org/10.1038/nn887
Interneurons of the dentate gyrus: an overview of cell types, terminal fields and neurochemical identity. Progress in Brain Research 163:217–232. https://doi.org/10.1016/S00796123(07)630131
On the role of the hippocampus in learning and memory in the rat. Behavioral and Neural Biology 60:9–26. https://doi.org/10.1016/01631047(93)906644
Paradox of pattern separation and adult neurogenesis: a dual role for new neurons balancing memory resolution and robustness. Neurobiology of Learning and Memory 129:60–68. https://doi.org/10.1016/j.nlm.2015.10.013
Developmental changes in GABAergic actions and seizure susceptibility in the rat hippocampus. European Journal of Neuroscience 19:590–600. https://doi.org/10.1111/j.0953816X.2003.03152.x
Self-Organization and Associative Memory. Springer-Verlag. https://doi.org/10.1007/9783642881633
Gradient-based learning applied to document recognition. Proceedings of the IEEE 86:2278–2324. https://doi.org/10.1109/5.726791
Decoding neurotransmitter switching: the road forward. The Journal of Neuroscience 40:4078–4089. https://doi.org/10.1523/JNEUROSCI.000520.2020
A theory of cerebellar cortex. The Journal of Physiology 202:437–470. https://doi.org/10.1113/jphysiol.1969.sp008820
Simple memory: a theory for archicortex. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences 262:23–81. https://doi.org/10.1098/rstb.1971.0078
Stimuli reduce the dimensionality of cortical activity. Frontiers in Systems Neuroscience 10:11. https://doi.org/10.3389/fnsys.2016.00011
Is there more to GABA than synaptic inhibition? Nature Reviews Neuroscience 3:715–727. https://doi.org/10.1038/nrn919
Triplets of spikes in a model of spike timing-dependent plasticity. Journal of Neuroscience 26:9673–9682. https://doi.org/10.1523/JNEUROSCI.142506.2006
Neural Networks and Brain Function. Oxford: Oxford University Press. https://doi.org/10.1093/acprof:oso/9780198524328.001.0001
Feature discovery by competitive learning. Cognitive Science 9:75–112. https://doi.org/10.1207/s15516709cog0901_5
Young adult-born neurons improve odor coding by mitral cells. Nature Communications 11:5867. https://doi.org/10.1038/s41467020194728
Defined types of cortical interneurone structure space and spike timing in the hippocampus. The Journal of Physiology 562:9–26. https://doi.org/10.1113/jphysiol.2004.078915
Running increases cell proliferation and neurogenesis in the adult mouse dentate gyrus. Nature Neuroscience 2:266–270. https://doi.org/10.1038/6368
Monosynaptic inputs to new neurons in the dentate gyrus. Nature Communications 3:1107. https://doi.org/10.1038/ncomms2101
Hebbian plasticity requires compensatory processes on multiple timescales. Philosophical Transactions of the Royal Society B: Biological Sciences 372:20160259. https://doi.org/10.1098/rstb.2016.0259
Article and author information
Funding
Swiss National Science Foundation (200020 184615): Wulfram Gerstner
Horizon 2020 Framework Programme (785907): Wulfram Gerstner
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
We thank Josef Bischofberger and Laurenz Wiskott for great discussions and useful remarks, as well as Paul Miller and an anonymous reviewer for constructive comments and suggestions. This research was supported by the Swiss National Science Foundation (no. 200020 184615) and by the European Union Horizon 2020 Framework Program under grant agreement no. 785907 (Human Brain Project, SGA2).
Copyright
© 2021, Gozel and Gerstner
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.