Thetamodulation drives the emergence of connectivity patterns underlying replay in a network model of place cells
Abstract
Place cells of the rodent hippocampus fire action potentials when the animal traverses a particular spatial location in any environment. Therefore for any given trajectory one observes a repeatable sequence of place cell activations. When the animal is quiescent or sleeping, one can observe similar sequences of activation known as replay, which underlie the process of memory consolidation. However, it remains unclear how replay is generated. Here we show how a temporally asymmetric plasticity rule during spatial exploration gives rise to spontaneous replay in a model network by shaping the recurrent connectivity to reflect the topology of the learned environment. Crucially, the rate of this encoding is strongly modulated by ongoing rhythms. Oscillations in the theta range optimize learning by generating repeated prepost pairings on a timescale commensurate with the window for plasticity, while lower and higher frequencies generate learning rates which are lower by orders of magnitude.
https://doi.org/10.7554/eLife.37388.001Introduction
As an animal explores in any given environment, place cells in the hippocampus fire selectively at particular locations (O'Keefe and Dostrovsky, 1971; O'Keefe and O’Keefe, 1976; Harvey et al., 2009), known as the ‘placefields’ of the cells. Furthermore, the place fields of an ensemble of place cells remap to completely new positions with respect to one another if the animal enters a distinct environment (Muller and Kubie, 1987; Kubie and Muller, 1991; Bostock et al., 1991). Sequential place cell activation during exploration therefore acts as a unique fingerprint for each environment, providing information needed for navigation and spatial learning. The spontaneous replay of such sequential activation, which occurs within sharpwave/ripples (SWRs) during quiet wakefulness (Foster and Wilson, 2006; Karlsson and Frank, 2009; Carr et al., 2011) and sleep (Wilson and McNaughton, 1994; Lee and Wilson, 2002) suggests that the animal has formed an internal representation of the corresponding environment, presumably during exploration (Wu and Foster, 2014). Synaptic plasticity is the clear candidate mechanism for this formation. Nonetheless it remains unclear how changes in the synaptic connectivity are coordinated at the network level in order to generate wellordered sequences spontaneously. An additional prominent physiological signature of exploratory behavior in the hippocampus is the theta rhythm (4–12 Hz). Decreases in theta power due to lesions of the medial septum strongly reduce performance in spatialmemory based tasks (Winson, 1978; Mitchell et al., 1982; Dwyer et al., 2007) and other hippocampal dependent tasks (Berry and Thompson, 1979; Allen et al., 2002) although they do not eliminate place fields on linear tracks (Brandon et al., 2014; Wang et al., 2015). Despite this, lesioned animals can still reach fixed performance criteria given enough time (Winson, 1978; Berry and Thompson, 1979). This shows not only that theta is important for learning, but also suggests a potential mechanism. Namely, it may act as a dial for the learning rate, presumably by modulating the networkwide coordination of synaptic plasticity. How this occurs is unknown. Here, we develop a model that explains how synaptic plasticity shapes the patterns of synaptic connectivity in a strongly recurrent hippocampal circuit, such as CA3, as an animal explores a novel environment. We show that a temporally asymmetric spiketiming dependent plasticity (STDP) rule (Markram et al., 1997; Bi and Poo, 1998) during motiondriven sequential placecell activity leads to the formation of network structure capable of supporting spontaneous bursts. These bursts occur in the absence of placefield input, that is during awakequiescence or sleep. Importantly, the spatiotemporal structure of the bursts undergoes a sharp transition during exploration, exhibiting wellordered replay only after a critical time of exploration in the novel environment. The underlying rate of this plasticity process, and hence the critical time, is strongly modulated through external oscillatory drive: for very low and high frequency the rate is near zero, while for an intermediate range, set by the timescale of the plasticity rule, it is higher by several orders of magnitude, allowing for learning on realistic timescales. Our theoretical findings lend support to the hypothesis that the theta rhythm accelerates learning by modulating placecell activity on a timescale commensurate with the window for plasticity. This maximizes the growth rate of networkwide patterns of synaptic connectivity which drive spontaneous replay. Finally, we confirm a main prediction from the model using simultaneous recordings of hippocampal place cells from a rat exploring a novel track. Namely, pairwise correlations between cells with neighboring place fields show a pronounced increase over the span of several minutes at the outset of exploration. Furthermore, in a rat in which the medial septum is partially inactivated via muscimol injection, which strongly reduces theta modulation, this increase is not seen.
Results
Plasticity during exploration of a novel environment leads to a transition in the structure of sharpwave events
We modeled the dynamics of hippocampal place cells in a strongly recurrent circuit, such as area CA3, as an animal sequentially explored a series of distinct novel ringlike tracks, Figure 1a. The model consisted of recurrently coupled, excitatory stochastic firing rate neurons, each of which received a placespecific external input on any given track, and inhibition was modeled as a global inhibitory feedback, see Materials and methods for model details. To model the global remapping of place fields from one track to another, we randomized the position of the peak input, and as a consequence the place field, of each neuron. In this way, the ordering of cells according to their place field location on one track was random and uncorrelated with their ordering on any other track, see Figure 1b.
We were interested in knowing how the exploration of a novel environment affected the pattern of synaptic connectivity between place cells via a temporally asymmetric plasticity rule, and how this in turn shaped the activity. While the plasticity rule we use is formally referred to as 'Spiketiming dependent’ (Kempter et al., 1999; Song et al., 2000; Pfister and Gerstner, 2006), our model neurons generate spikes stochastically as Poisson processes. Therefore the exact spike timing itself does not play any role but rather only the time variations in the underlying firing rate. Recent theoretical work has shown that the plasticity induced by a number of plasticity rules, including heuristic STDP rules, is in fact dominated by such firing rate variations when irregular, invivo like activity is considered (Graupner et al., 2016). We also considered a ‘balanced’ plasticity rule, for which the degree of potentiation and depression were the same, on average, given constant firing rates. Such a rule is a means of avoiding fully potentiating or fully depressing all synapses on long timescales, that is it allows for the generation and maintenance of structured synaptic connectivity. Alternatively, we could have considered an unbalanced plasticity rule with additional stabilizing mechanisms (Zenke et al., 2015).
In order to study ‘typical’ placecell dynamics during exploration we first exposed the network to a series of distinct tracks until the matrix of recurrent synaptic weights no longer depended on its initial state. When the trajectory and velocity of the virtual animal was stochastic and distinct on different tracks, the resulting evolution of the synaptic weights also exhibited some variability from track to track, see Figure 2—figure supplement 1. In simulations with constant velocity this variability largely vanished, see Figure 2—figure supplement 2a. After exploring 10 tracks, we exposed the network to another novel track and studied the dynamics in detail; because the exploration process had already become largely stereotyped (despite some variability), the dynamics on the novel track reflected what would be seen qualitatively during the exploration of any other novel track in the future, that is it was dependent only on the learning process itself and not the initial state of the synaptic weight matrix.
In our simulations, we modeled the movement of a virtual animal around the track by varying the external inputs to place cells. Specifically, the external input was maximal for cells with place fields at the animal’s current position and minimal for cells with place fields at the opposite side of the track, with the input decaying smoothly for intermediate locations like a cosine curve, see Materials and methods for details. We modeled the motion of the animal by taking the velocity to be an OrnsteinUhlenbeck process with a time constant of 10 s; this lead to realistic trajectories with changes of direction, see Figure 2—figure supplement 3. (We will use the metaphor of a virtual animal for conceptual ease, although we really mean that we moved the bumplike external input to our model neurons). Every three minutes of simulation time we stopped the animal at its current location for three seconds and removed the place field input. This led to spontaneous bursting via a synaptic depressiondependent mechanism (Romani and Tsodyks, 2015), reminiscent of sharpwaves seen during awake quiescence and sleep, see Figure 2. Neither the bursting, nor the learning process itself depended qualitatively on the exact amount of time the virtual animal spent moving versus staying still. Synaptic plasticity was allowed to occur during both theta activity and sharpwaves, that is it was never ‘turnedoff’.
As the virtual animal explored the track, the sequential activation of the place cells led to changes in the recurrent connectivity via the plasticity rule, Figure 2a. Whereas the connectivity between cells was initially unstructured on the novel track, it evolved over the span of minutes to tens of minutes such that cells with nearby place fields were more strongly connected than ones with disparate place fields. Furthermore the resulting connectivity was slightly asymmetric, reflecting the bias of the animal’s motion, see Figure 2—figure supplement 3. Although the changes we observed in the recurrent connectivity, see Figure 2a, and concomitant changes in place cell activity were continuous and smooth during the duration of the simulation, there was a dramatic and sharp transition in the structure of the burst activity during awake quiescence. We quantified this transition by measuring the mean pairwise crosscorrelation of cells with neighboring place fields on the novel track. Such cells are ‘neighbors’ only on the novel track and not on any other track by virtue of the global remapping or randomization of placefields. We call this measure the sequential correlation (SC) because it is high when there is a properly ordered sequence of placecell activations on any given track. The SC during thetaactivity (exploration) was already nonzero at the outset of the simulation by virtue of the external placefield input which generates sequential activity, black squares Figure 2b. On the other hand, the SC was initially near zero during spontaneous bursts, red circles Figure 2b. Interestingly the SC abruptly increased during burst activity at a critical time and remained elevated for the duration of the simulation. This transition occurred even in simulations where we included strong heterogeneity in place field tuning, see Figure 2—figure supplement 4. Note that the SC for thetaactivity also showed a steady increase leading up to this transition, as did the SC when the total activity was taken into account, without discerning between epochs of movement or awake quiescence, see Figure 2b inset. This steady increase is due to the emergence of socalled theta sequences (Foster and Wilson, 2007). These are bursts of placecell activity which sweep ahead of the animal’s position on every theta cycle during awake exploration, see Figure 2—figure supplement 5. The emergence of theta sequences here also coincides with a precession of the phase of the underlying firing rate with the ongoing theta rhythm, see Figure 2—figure supplement 5f and discussion for more detail.
Spacetime plots of placecell activity show that the abrupt increase in SC coincides with a transition in the spatiotemporal structure of the bursts: it is initially uncorrelated with the ordering of cells on the novel track, Figure 2c (i), and after the transition there is a clear sequential activation reminiscent of socalled ‘replay’ activity, Figure 2c (ii). In fact, before the transition the bursts are highly structured, but this structure is correlated with a previously explored track and not the novel one. If the learning process was carried on for much longer times all relevant network properties, such as connectivity and mean firing rates, saturated, see Figure 2—figure supplement 8.
The transition in the structure of bursts is strongly dependent on the shape of the plasticity rule and the frequency of ‘theta’ modulation
We hypothesized that changes in the recurrent connections between place cells in our model during exploration were shaping the spontaneous activity and driving the transition to replaylike activity. The connectivity profile at any point in time could be decomposed into a series of spatial Fourier modes, Figure 3a. We tracked these modes in time and discovered that the transition in burst activity always occurred when the coefficient of the first even mode reached a critical value. Specifically, we conducted additional simulations in which we increased the firing rate of the feedforward inputs which drove place cell activity which in turn led the transition in burst activity to occur at earlier times, see shaded regions in Figure 3b. In fact, a theoretical analysis of our model shows that there is an instability of the spontaneous activity to bursts when the even mode of the connectivity times the gain in the neuronal transfer function is greater than a critical value, see Materials and methods. In our simulations the gain in the transfer function was nearly always the same during bursts because it depended on the mean input to the network during quiescence, which was constant. Therefore it was principally the even mode of the recurrent connectivity which determined the time of the transition. For this reason, the transition to replay occurred when the coefficient of the even mode reached a critical value as seen in Figure 3c. Note that there was no such dependence on the odd mode (dotted lines in Figure 3c).
We then asked how the growth of the even mode depended on the details of the plasticity rule and the frequency of periodic modulation of the place cell activity. We found that the growth of the even mode depended strongly on the frequency, peaking at an intermediate range and decreasing by orders of magnitude at low and high frequencies, Figure 4ai. This occurred independent of the details of the animal’s motion, for example compare Figure 4ai and Figure 2—figure supplement 2e for irregular versus purely clockwise motion respectively. On the other hand the odd mode, which represents a directional bias in the recurrent connectivity, closely tracks the cumulative bias in the animal’s motion, see Figure 2—figure supplement 3, and is independent of the forcing frequency to leading order, see Materials and methods for details. The strong dependence of the growth of the even mode on frequency, meant that, for simulations of up to 1 hr, transitions in the burst activity were observed only when the frequency was in the range of 2–10 Hz, Figure 4aii. Furthermore, the even mode only grew, and transitions only occurred, when the plasticity rule had dominant potentiation at short latencies. For a perfectly antisymmetric plasticity rule, and for a rule with dominant inhibition at short latencies, the even mode did not change and even decreased, respectively, Figure 4b and c. In the latter case bursts were suppressed entirely (not shown).
A selfconsistent theory to explain how the interplay between thetamodulation and the plasticity rule govern changes in recurrent connectivity
We sought to understand the mechanism underlying the evolution of the recurrent connectivity seen in simulations by studying a continuum version of a network of place cells, which can be written
where $r(\theta ,t)$ is the firing rate of a place cell with place field centered at a location $\theta $, $w(\theta ,{\theta}^{\text{\u2032}})$ is the synaptic weight from a cell at a position ${\theta}^{\text{\u2032}}$ to a cell at a position $\theta $, and $I(\theta ,t)$ is the external input which has the form
The position of the virtual animal is given by $x\left(t\right)$, and the corresponding placefield input is modulated with a frequency $f$. This type of multiplicative theta modulation is seen in intracellular recordings invivo, for example see Figures 1 and 5 in (Harvey et al., 2009) and Figure 4 in (Lee et al., 2012).
To model the evolution of the recurrent connectivity we made use of the fact that the stepwise increases in synaptic strength due to the plasticity rule can be approximated as a smooth process as long as plasticity occurs much more slowly than the firing rate dynamics. When this is the case, the rate of change of the synaptic weight from place cell with place field centered at ${\theta}^{\text{\u2032}}$ to one with place field at $\theta $ can be written as
where $K\left(T\right)$ is the change in the synaptic weight according to the plasticity rule given a spike pair with latency $T$ (Kempter et al., 1999) and see Materials and methods. This equation reflects the fact that the total change in the synaptic weight is the sum of all the pairwise contributions from the pre and postsynaptic cells, with each pair of spikes weighted by the plasticity rule with the appropriate latency. (Equations 1–3) represent a selfconsistent model for the coevolution of the firing rates and synaptic weights in the network.
In order to derive an analytical solution we first assume that the neuronal transfer function $\varphi $ is linear. We then make the assumption of slowly evolving synaptic weights explicit by scaling the amplitudes of the potentiations and depressions from the plasticity rule by a small parameter. The upshot is that the connectivity evolves to leading order only on a slow time scale, much slower than the fast neuronal dynamics. Furthermore, we know from numerical simulations that after sufficient exploration the probability of connection between any two cells depends on average only on the difference in place field locations. Therefore, by averaging the connectivity over the fast time $t$ we can write
where the growth rates of the even and odd modes $\u27e8{\dot{w}}_{even}\u27e9}_{t$ and $\u27e8{\dot{w}}_{odd}\u27e9}_{t$ are functions of the plasticity rule parameters, the velocity of the animal and the frequency of periodic modulation, see Materials and methods for details. It turns out it is possible to understand these dependencies intuitively and comprehensively without having to study the analytical formulas. Specifically, if we wish to isolate the growth rate of the even mode, which is responsible for driving the emergence of replay in the burst, we can consider place cell pairs where $\theta ={\theta}^{\text{\u2032}}$, that is pairs with overlapping place fields. When this is the case we can combine (Equations 3 and 4) to yield
where $AC\left(T\right)={\u27e8r\left(\theta ,t\right)r\left(\theta ,t+T\right)\u27e9}_{t}$ is the autocorrelation (AC) of the placecell activity. Note that despite the similarity in form between (Equation 5) and (Equation 3), the biological interpretation of the two is quite distinct. (Equation 3) describes the changes in the strength of a specific synapse, that from a cell with placefield centered at a position ${\theta}^{\text{\u2032}}$ onto a cell with placefield centered at a position $\theta $. The evolution of this single synapse depends only on the sum of the changes from all spike pairs from these two cells. If we averaged this equation over fast fluctuations in time we would find that the slow evolution of the synapse depends on the crosscorrelation of the activity of the two neurons times the STDP window. There is an equation like this for every cell pair in the network; clearly for a large network it is not feasible to solve these equations selfconsistently. On the other hand, (Equation 5) does not describe the changes in strength of a single synapse. Rather, it describes changes in the strength of a particular pattern of synaptic connectivity in the network. This pattern is one in which cells with highly overlapping place fields have strong and symmetric recurrent connectivity. Furthermore the strength of the synaptic connections decays smoothly with the difference between place field locations. In our theoretical model, in the limit of a large network, the crosscorrelation of very nearby cells is simply given by the autocorrelation, which is why it appears in (Equation 5). It is perhaps remarkable that using purely local information, namely the AC of place cell activity, one can infer the growth rate of a networkwide mode of synaptic connectivity. The key assumption that makes this possible is that the synaptic weight between any two cells should depend only on the difference in placefield location and not on the absolute position, (Equation 4). Therefore the growth rate of the even mode is found by multiplying the AC of placecell activity by the window for plasticity, and integrating. Because the effect of periodic modulation on the AC is straightforward, we can determine graphically how the frequency of modulation interacts with the plasticity rule to drive changes in the burst structure.
We first note that if there is no periodic modulation of placecell activity then the AC will simply reflect the movement of the animal. This will lead to a very broad AC compared to the timescale of plasticity. For example, if we assume that the width of the place field is a fraction of the track length (as in our model), then a rat running between 5 and 50 cm/sec on a multimeter track would have an AC which decays on the order of between seconds and tens of seconds. Therefore, the AC is essentially constant compared to a typical plasticity window of between tens and hundreds of milliseconds, and the integral in (Equation 5) will give nearly zero. Periodically modulating placecell activity will increase the growth rate as long as potentiation is dominant at short latencies, Figure 5a. Slow modulation will bias the integral in (Equation 5) toward the potentiation lobe of the STDP (Figure 5a left and middle, top), while in an optimal range of frequencies the peaks and troughs of the AC will maximally capture potentiation and flip the sign of the depression lobe (Figure 5a left and middle, middle). Finally, at higher frequencies the plasticity rule undergoes multiple sign flips on a fast time scale, which again gives a near zero value for (Equation 5) (Figure 5a, left and middle, bottom). This means that the maximal growth rate of the even mode occurs for an intermediate range of frequencies: those which modulate placecell activity on a time scale commensurate with the window for plasticity, Figure 5a right. Note that this has nothing to do with the overall rate of plasticity, which is the same independent of the modulation frequency. That is, even if the AC is flat, recurrent synapses undergo large numbers of potentiations and depressions. Rather, periodic modulation serves to organize the structure of synaptic plasticity at the network level by preferentially strengthening connections between placecells with overlapping or nearby place field and weakening others. The optimal range of frequencies for growth of the even mode depends only weakly on the ratio of the width of the potentiation to that of the depression lobe, Figure 5c, but significantly on the total width, Figure 5d. Allowing for triplet interactions (as opposed to just pairwise) in the plasticity rule increases the overall growth rate but does not alter the range of optimal frequencies, Figure 5e. On the other hand, the theory predicts that the growth rate of the odd mode is only weakly dependent on the modulation frequency (Figure 5b) as is seen in simulations (Figure 2—figure supplement 2e), and can be understood by considering (Equation 3) with $\theta {\theta}^{\text{\u2032}}=\pi /2$. In this case the growth rate depends on the product of the plasticity rule with the crosscorrelation (CC) of cells with disparate place fields. When there is an overall direction bias in motion then the CC will have peak shifted from zerolag and the product with the STDP rule reliably gives a positive (or negative) growth rate. When there is very little motion bias the CC will be nearly flat, yielding little growth in the odd mode and the resulting connectivity will be highly symmetric.
It is also clear from (Equation 5) that a perfectly antisymmetric plasticity rule will lead to no growth in the even mode as the product of the rule and the AC will itself be an odd function, see Figure 4b. A rule with dominant depression at short latencies will similarly yield a growth rate which is precisely the inverse of that shown in Figure 5a. This causes a decay in the even mode resulting in a connectivity pattern with locally dominant inhibition, see Figure 4c.
Finally, to go beyond this qualitative treatment and find the exact dependence of the evolution of the synaptic weights on model parameters requires solving (Equations 1–3) selfconsistently, see (Equations 43–45) in Materials and methods for the case when $v$ is constant. These are the equations we use to generate the curves in Figure 5. When the AC of the place cell activity is predominantly shaped by the modulation frequency $f$, compared to the motion of the animal (strictly speaking when $v/\left(2\pi f\right)\ll 1$, which is the case already for $f>1$ Hz and for reasonable values of $v$) then the connectivity evolves to leading order as
Equation 6 shows clearly that the growth of the even mode, which drives the emergence of replay, is proportional to strength of oscillatory modulation ${I}_{theta}$. Additionally it can be seen on inspection that the modulation frequency itself plays a crucial role in setting the growth rate, and peaks at intermediate values. On the other hand, the growth of the odd mode, which represents the degree of asymmetry in the recurrent connectivity, is driven entirely by the motion of the animal to leading order. In the case of constant motion, studied here, it depends only on the velocity of the animal.
Sparse coding allows for the replay of activity from multiple explored tracks during bursts
The spatiotemporal structure of sharpwavelike bursts in our model reflected the sequential ordering of place fields on the most recently explored track; correlation with earlier tracks was erased or greatly reduced. In reality, only a fraction of place cells have welldefined place fields in any given environment, that is coding is sparse (Bostock et al., 1991; Fyhn et al., 2007). We incorporated this sparsecoding strategy to our model by providing placefield input to only one half of the total neurons in the network; the other half received constant input. These place cells were chosen randomly from one track to the next and place field locations were assigned randomly as before. Therefore the overlap in the population of placecells between any two tracks was also fifty percent. We then allowed the network to evolve by having the virtual animal explore thirty distinct tracks, each for 1 hr of simulation time, as before. The resulting matrix of synaptic connections was correlated with the ordering of placecells in several past explored environments, Figure 6a. The amplitude of the even mode, which is responsible for generating spontaneous replay, decayed roughly exponentially as a function of the recency of the explored track, Figure 6a inset. This suggested that the replay dynamics in the network with sparse coding might be correlated with several previously explored tracks simultaneously (Romani and Tsodyks, 2015). Indeed, when the network was driven with a global, nonselective input, the replay was strongly correlated with the past two explored environments, Figure 6b, whereas the correlation with earlier environments was negligible. On the other hand, when the constant external drive was selective to the subset of placecells on any given track, replay activity was robustly correlated with activity on that track only, Figure 6c. Such selective input may originate in cortical circuits which store environmentspecific sensory information as stable patterns of activity; these patterns correspond to attracting fixed points in network models of longterm memory storage and memory recall (Hopfield, 1982; Tsodyks and Feigel'man, 1988; Recanatesi et al., 2015). Spontaneous switching between cortical representations during slowwave activity would therefore engage distinct hippocampal replay patterns, which in turn could strengthen and consolidate the corresponding cortical memory trace.
Therefore, sparse coding causes the synaptic connectivity matrix to be simultaneously correlated with place field orderings from multiple tracks and allows for robust replay of activity from those tracks given appropriate inputs. The number of different environments which could be simultaneously encoded in replay depended on model details, particularly the coding level, that is the fraction of active place cells in any given environment (not shown) (Battaglia and Treves, 1998).
Experimental evidence for a transition to replay
Pairwise reactivations of CA1 placecells with overlapping place fields during awake SWRs improve during exploration; they are stronger during late exploration than early exploration (O'Neill et al., 2006). This holds true not only for pairwise correlations, but also reactivations of entire neuronal ensembles, at least on linear tracks (Jackson et al., 2006). More recent work has shown that significant replay events during awake SWRs in rats running along a threearm maze emerge abruptly only after a certain number of runs (Wu and Foster, 2014). These results are consistent with our model predictions. We additionally sought to directly test for a timeresolved increase in SC in neuronal data. We looked for this increase in multiunit recordings of placecell activity from the hippocampus of rats exploring novel, periodic tracks (Wang et al., 2015).
We first identified cells with welldefined place fields by extracting the coefficients and phase of the first two spatial Fourier modes of their timeaveraged activity as a function of the normalized distance along the track (in radians), see Figure 7—figure supplements 1–3. We kept only those cells for which the ratio of the coefficients, the Tuning Index (TI), exceeded the threshold of one, indicating strong spatially selective activity, see methods for details. We then ordered the cells according to their phase (approximately the position of peak firing). The SC of activity over the total duration of the experiment using this ordering was significantly higher than 5000 randomly reshuffled orderings which on average gave SC = 0, see Figure 7a and b.
When the medial septum (MS) was inactivated via muscimol the SC did not exhibit any dynamics as a function of time, Figure 7c. However, once the animal recovered from the muscimol the SC using the proper phase ordering exhibited an initial ramp over the first ten minutes of exploration and then remained high (significant difference between first point and all others, ttest with multiplecomparison Bonferroni correction, pvalues<0.004 and between second and third points), see Figure 7d, solid circles. This is consistent with the model result which showed a similar ramping increase when the total activity (and not just bursts) was considered, see for example Figure 2bii. On the other hand, the SC computed for shuffled phases showed no dynamics and remained close to zero, see Figure 7c,d, open squares. This indicates that there are no global changes in neuronal correlations during exploration, which could occur, for example, due to slow changes in theta modulation or neuronal excitability. Rather, there is a sustained increase in pairwise correlations only between those neurons which encode nearby places, and only when strong theta modulation is present. This finding also held when we considered the more lax criterion $TI\ge 1/2$, see Figure 7—figure supplement 4 (rat A991). In a second animal there was an insufficient number of welltuned units to repeat the analysis, see Figure 7—figure supplement 4 (rat A992). One possible confound in attributing this increase in correlation to theta modulation alone, is the fact that the firing rates of placecells during MS inactivation were lower on average than during the postmuscimol experiment. Lower firing rates in the computational model lead to lower rates of plasticity. However, this difference in firing rates was not significant for the welltuned neurons shown in Figure 7 ($TI\ge 1$), (ttest, p=0.30). An initial, sustained increase in SC was also observed in data from a separate experiment in which an animal was first exposed to a novel hexagonal track, Figure 7e (difference between first point and all others, ttest with correction for multiple comparison, pvalues<10–6). Furthermore, no such increase was found on subsequent sessions on the same track, indicating that the change in correlation only occurred when the environment was novel. This result held when we considered the more lax criterion $TI\ge 1/2$ and also when all units were used, regardless of tuning, see Figure 7—figure supplement 3a (rat A992). In a second rat there were too few welltuned cells to use the criterion $TI\ge 1$, but an initial and sustained increase in SC was also seen using all units, although this correlation was not always significantly different from that calculated from shuffled orderings of the cells (unshaded points in Figure 7—figure supplement 3a, rat A991).
Discussion
Summary
We have presented a computational model of a hippocampal placecell network in order to investigate how the exploration of novel environments shapes the patterns of recurrent synaptic connectivity. Because placefields remap randomly from one environment to the next, the recurrent connectivity, shaped by previous learning, is initially uncorrelated with placecell activity in a novel environment. Our major finding is that the learning rate during the exploration of a novel environment depends almost entirely on the product of the autocorrelation of placecell activity and the window for plasticity. The integral of this product determines the growth rate of a global, networkwide pattern of synaptic connectivity, which results in strong local recurrence and longrange competition. It is this mode which drives spontaneous replay activity in our model network. The growth rate of this mode is maximum when placecell activity is periodically modulated on a timescale commensurate with the plasticity rule, which for realistic time constants yields frequencies in the theta range. Furthermore, lower and higher frequencies than theta lead to learning rates which are orders of magnitude slower. This suggests that the role of theta is to accelerate learning. Note that the overall rate of plasticity is not affected by the presence of oscillations. The number of spike pairs, and hence the number of potentiations and depressions, depends only on the firing rates. Rather, theta oscillations generate repeated prepost pairings in both directions, which coupled with a plasticity rule with dominant potentiation at short latencies bias plasticity toward potentiating events for neurons with neighboring place fields. One signature of this mechanism is a continuous increase in the pairwise crosscorrelation in the activity of neighboring placecells leading up to a critical time. We have found evidence consistent with this by analyzing the activity of simultaneously recorded hippocampal place cells in a rat during the exploration of a novel track.
The assumption of plasticity at recurrent synapses
In our model we have assumed that plasticity occurs only in the recurrent excitatory synaptic connections, and not in the feedforward inputs. Therefore we also assume that the placefield input, which peaks at the spatial position of the virtual animal at given moment in time, is itself stable. In fact, consistent with this assumption, it seems most place cells are active from the outset of exploration of a new environment, although see (Hill, 1978; Frank et al., 2004; Monaco et al., 2014). Furthermore cells tend to exhibit only subtle changes in the size and shape of their place fields over time (Mehta et al., 1997; Mehta et al., 2000), also consistent with our model, see Figure 2—figure supplement 2b. On the other hand, it has been shown in area CA1 that some placecells exhibit placefields only after several minutes of exploration (Frank et al., 2004; Frank et al., 2006). Recent intracellular recordings indicate that appearance of these ‘hidden’ placefields requires the coincidence of active dendritic events and synaptic input via the Schaeffer collaterals (Lee et al., 2012; Bittner et al., 2015; Bittner et al., 2017). It may be that this mechanism for placecell ‘generation’ is particularly salient in cells of area CA1 by virtue of being uniquely positioned to compare and integrate both entorhinal and hippocampal inputs. In any case the strongly recurrent nature of the network we study may make it a more relevant model for circuits in area CA3.
Nonetheless we would expect that changes in spiking activity arising in CA3 due to plasticity in the recurrent connectivity, as in our model, would be reflected in similar changes in the spiking activity of CA1 cells due to the direct inputs via the Schaeffer collaterals. More specifically, in contrast to plasticity mechanisms leading to the formation of place cells themselves, here we have modeled on how plasticity shapes the recurrent connections between alreadyformed place cells. We find that pairwise correlations between place cells with nearby preferred locations increases during exploration of a novel environment. Assuming such an increase occurs in a strongly recurrent circuit in CA3, we would also expect to observe an increase in correlation in target CA1 pyramidal cells, as long as there exists some mapping from CA3 place cells to CA1 place cells. Such a mapping could be a simple random projection or a more ordered relationship. Recent work suggests that place fields of CA1 pyramidal cells on linear tracks are built up of a weighted sum of CA3 inputs from positions surrounding the relevant place field (Bittner et al., 2017); in such a case the increased correlation in CA3 activity would be shared by CA1 output. Indeed, the data we have analyzed from place cells of area CA1 show increase sequential correlation as predicted by our recurrent model. A recent computational model of STDPinduced formation of place fields in CA1 cells via the feedforward excitatory connections from CA3 (Schaeffer Collaterals) suggests a role for theta in speeding up place cell formation (D'Albis et al., 2015). This raises the intriguing suggestion that theta may play a key role both in placecell formation through plasticity of feedforward inputs, and also in the emergence of replay through plasticity of recurrent synaptic connections as indicated by our work here.
Remapping of place fields for different directions of motion on the same track
Hippocampal place cells actually exhibit global remapping of their place fields depending on the direction of motion of the animal on linear tracks (McNaughton et al., 1983; Muller et al., 1994). This is perhaps not surprising given that the behaviorally relevant information for the animal is not just the position along the track but also the way in which it is facing; for example this determines how far away a potential reward is, if located at one or both ends of the track. Studies using periodic tracks have shown no such global remapping, but rather some degree of rate remapping (Schwindel et al., 2016), that is the direction of motion affects the shape and amplitude of place fields, but not their position. In the data we have analyzed there is very weak remapping, see Figure 7—figure supplements 1c and 2b, and so the assumption of invariance of place field to direction of motion is a good one. In our model we have made this assumption. The consequence of this is that while exclusively clockwise (CW) or counterclockwise (CCW) motion will lead to highly asymmetric recurrent connectivity, exploration of both directions will give rise to much more symmetric connectivity, see Figure 2a. In practice any trajectory over a finite amount of time will have a directional bias; in the data we have analyzed the rat spends a slightly larger fraction of the time moving CW compared to CCW, and this will necessarily lead to asymmetries in the connectivity. In linear tracks, due to the global remapping such asymmetries should be even more pronounced.
Forward versus backward replay
The inevitable asymmetry in the recurrent connectivity of our model placecell network due to plasticity during exploration strongly biases spontaneous activity. On a periodic track this replay would be exclusively CW or CCW depending on the corresponding bias in motion during exploration, while learning on a linear track would always produce forward replay. Previous work has shown that perfectly symmetric connectivity can give rise to both forward and backward replay in equal measures, due to spontaneous symmetry breaking of activity (Romani and Tsodyks, 2015). We would argue that such symmetric connectivity is not robust for the reasons given above, although we cannot rule out the existence of homeostatic mechanisms which would conspire to make it so. Rather, we propose here an alternative mechanism for generating backward replay given asymmetric connectivity, based on local sensory input.
Specifically, if global input to our model network is homogeneous then replay occurs only in one direction. This is due to the asymmetry in the connectivity which comes about through a bias in the direction of motion of the animal during exploration. On the other hand, if the input is not homogeneous, but rather spatially localized, then the replay dynamics can be qualitatively distinct. This difference is mediated by the shortterm synaptic depression mechanism present in the model. With a strong, spatially localized input, synapses to downstream neurons (in the sense of the asymmetric bias of the connectivity) become rapidly depressed, see Figure 2—figure supplement 6. This forces the activity to travel backward with respect to the bias in the connectivity. In fact, in experiment, when local spatial input is absent, for example when the animal is sleeping in a rest box, forward replay is predominant (Roumis and Frank, 2015). On the other hand, both backward and forward replay are observed when the animal is awake but quiescent on a given track. This is precisely when locally sensory cues are available to the animal, and could potentially shape spontaneous replay events. Recent work shows that some neurons in area CA2 fire more strongly during awake quiescence than during exploration (Kay et al., 2016); they may be providing information regarding local sensory cues. In our scenario the mechanisms leading to forward versus backward replay are distinct and therefore in principle relevant replay statistics such as replay velocity and spiking intensity should also be different.
Robustness to changes in the plasticity model and to the presence of spike correlations
Here we have considered a simple phenomenological model of plasticity which depends on the timing of spike pairs. Taking into account spike triplets as opposed to only pairs does not alter our findings qualitatively, see Figure 5e, although we are necessarily ensuring a balanced rule through our choice of parameters. Fits of heuristic spike timingdependent models to data from slice experiments yield parameter values which do not lead to balanced rules such as the ones we have used here (Pfister and Gerstner, 2006). Specifically, when the repetitionrate of spike pairs is high, invitro STDP is dominantly potentiating; for such a rule there would be no nonmonotonic dependence of the learning rate on modulation frequency at high rates. The mechanism we propose for theta would not be operative in that case. However, in a recurrent network model, unbalanced plasticity rules would quickly lead to synapses either saturating to maximum efficacy or vanishing completely; nontrivial network structure is therefore not possible with such rules. One solution to this problem is to complement the heuristic, unbalanced Hebbian rule, extracted from slice experiments, with a nonHebbian heterosynaptic rule, as well as additional, slower homeostatic mechanisms (Zenke et al., 2015). It is unclear if the learning rate of this rule would also exhibit a nonmonotonic dependence of the modulation frequency, as in our rule. However, heuristic rules describing synaptic plasticity invivo are still largely unknown. Therefore one reasonable approach to studying plasticity in recurrent networks invivo is to choose the simplest possible rule which allows for the emergence of nontrivial structure; this has been our approach here. It remains to be studied how more realistic voltage or calciumbased plasticity rules interact with the thetamodulation to affect learning in recurrent networks (Clopath et al., 2010; Graupner and Brunel, 2012), although at a singlesynapse one can find qualitatively similar regimes for an array of plasticity rules in the presence of pre and postsynaptic oscillations (Albers et al., 2013).
Our results clearly do not depend on the actual spike timing since our model neurons generate spikes as Poisson processes; rather, all lasting changes in the connectivity are due to timevarying modulations of the firing rates. In fact, recent work with a spiking neuron model suggests that such modulations in the firing rate, as opposed to exact spike timing, are sufficient to explain the effect of plasticity from STDP and more realistic calciumbased plasticity rules in general (Graupner et al., 2016). In any case the contribution of pairwise spike correlations to the evolution of the recurrent connectivity can be formally taken into account in (Equation 4), that is via its affect on the autocorrelation of placecell activity.
Theta sequences and phase precession
The increase in SC during theta activity in our model is due to the emergence of theta sequences, Figure 2—figure supplement 5, a consequence of the strengthening of the recurrent excitatory connectivity during exploration. The emergence of theta sequences also gives rise to phase precession of the place cell activity. Specifically, the peak of the place cell firing rate shifts to earlier phases of the theta rhythm as the animal enters the place field. This mechanism was first studied in (Tsodyks et al., 1996), in which theta sequences arose through asymmetry in the recurrent connectivity. A very similar effect can be achieved through shortterm synaptic depression even without the need for asymmetric connectivity (Romani and Tsodyks, 2015). The reason is that the motion of the animal, and hence the sequential activation of place cells, generates an asymmetric pattern of synaptic depression (upstream synapses are more depressed than downstream ones). This is the mechanism responsible for theta sequences and phase precession in our model. Phase precession therefore emerges over time in our model, and is not present during early exploration. We have also studied the effect of phase precession on learning numerically by generating theta sequences directly via the placefield input itself (not shown). Preliminary simulations suggest that phase precession can actually speed up learning, although the mechanism is nontrivial and requires additional study.
Other network models of placecell activity
Recurrent network models for placecell activity provide a parsimonious explanation for electrophysiological phenomena associated with exploratory behavior as well as the generation of sharpwave bursts during awake quiescence and sleep (Romani and Tsodyks, 2015; Tsodyks et al., 1996; Shen and McNaughton, 1996; Cutsuridis and Hasselmo, 2011; Jahnke et al., 2015). In recent theoretical work (Romani and Tsodyks, 2015) sharpwavelike bursts were generated spontaneously by virtue of the spatial modulation of the recurrent connectivity, which drives an instability to traveling waves in the absence of placefield input. The presence of shortterm depression modulates the amplitude of the waves, leading to narrow bursts. This is the same mechanism we have used here. Alternatively, recent work with a biophysically detailed spiking network model focused on the role of nonlinear dendritic integration on the generation of replay during SWRs (Jahnke et al., 2015). In that work the authors found that the exploration of a virtual linear track in the presence of pairwise STDP led to highly asymmetric connectivity; this could generate replay activity given a sufficiently synchronous external input which recruited nonlinear dendritic events. In our work, we have sought to explain the replay as an emergent phenomenon which depends only on the networkwide organization of synaptic structure. In doing so we have considered a simple stochastic firing rate model which allowed us to fully characterize how interplay between the plasticity rule and the placecell activity affects learning. Nonetheless, a detailed reproduction of the phenomenology of SWRs certainly requires mechanisms we have not included here. In particular, while our model generates sharpwavelike bursts, it does not generate highfrequency ripples, which are likely generated by networks of inhibitory interneurons in CA1 (Buzsáki, 2006).
Spatial learning
It seems reasonable that the learning of tasks which depend on spatial information require the formation of an internal representation of the relevant environment. This is the process we have studied here. While we have not modeled any particular cognitive task, we propose that the networkwide organization of synaptic structure, in order that it be in concordance with the placefield distribution of place cells, should be a necessary step in spatial learning tasks. Our results suggest that this process is dramatically sped up by modulating placecell activity in the theta range, which is one possible role of this prominent rhythm. Interestingly, while theta modulation is prominent in landgoing rodents, it is absent in the bat (Yartsev and Ulanovsky, 2013). Although we cannot purport to know why this is, we find that it actually may be consistent with the theoretical mechanism for learning rates we put forth here. Namely, to attain a high learning rate the AC of placecell activity should be modulated on a timescale commensurate with the window for plasticity. In the case of land mammals like rats and mice, the behavioral timescale related to spatial navigation may be too slow, therefore necessitating oscillatory modulation in the form of theta. In the case of flying animals, like the bat, changes in sensory input alone are already likely to occur on the order of tens to hundreds of ms due to the high velocity of flight, thereby obviating the need for theta. This speculation is based on the modulatory effect of ongoing oscillations on learning in our computational model, and is therefore not yet supported by experimental evidence. Indeed, our theoretical prediction is that more generally, for learning to occur on behaviorally relevant time scales, neuronal activity should vary somehow on a timescale commensurate with synaptic plasticity mechanisms. One means of achieving this is to have a broad window for synaptic plasticity, on the order of seconds, socalled behavioral timescale plasticity (Bittner et al., 2017). Alternatively, internally generated rhythms may serve to modulate neuronal activity on nonbehavioral timescales, with a similar effect.
Materials and methods
Network Model
We simulate a network of $N$ placecells. The firing rate of neuron $i$ is given by
where the time constant $\tau =10$ ms and $N=100$ for all simulations except for those in Figure 6 in which $N=200$. We take $\varphi \left(I\right)=\alpha \phantom{\rule{thinmathspace}{0ex}}\mathrm{l}\mathrm{n}\left(1+{e}^{I/\alpha}\right)$ with $\alpha =1$ Hz for all simulations. The input
where ${\theta}_{i}^{\mu}$ is the center of the place field for cell $i$ on track $\mu $. These place field positions are random and uncorrelated for each cell $i$ on different tracks. The position of the animal is given by $X\left(t\right)$. For the simulations in all figures except Figure 2—figure supplement 2 the position was calculated by assuming that the velocity of the animal was random with a characteristic time constant of 10 sec. Specifically the velocity is given by $v={v}_{0}+{v}_{1}\left(t\right)$ where the mean velocity ${v}_{0}=0.5$ rad/sec and
where ${\xi}_{\nu}$ is a Gaussian whitenoise process with mean zero and unit variance, with ${\tau}_{\nu}=10$ sec and ${\sigma}_{\nu}=2$ rad/sec. The position of the animal is then given by solving $\dot{X}=v$. For Figure 2—figure supplement 2 the velocity is taken to be a constant $v=1$ rad/sec and the position $X=vt$.
The shortterm synaptic depression variable ${x}_{i}$ obeys
with ${\tau}_{x}=800$ ms and ${U}_{0}=0.0008$.
Synaptic weights
Request a detailed protocolThe synaptic weight from cell $j$ to cell $i$ is $\stackrel{~}{w}}_{ij}={w}_{ij}{w}_{I$, where ${w}_{I}$ represents a global inhibitory feedback. The weights ${w}_{ij}$ are plastic and evolve according to a spiketiming dependent plasticity rule. In order to generate spike trains we take the firing rates of the cells as the underlying rates for a Poisson process. Specifically, the probability that a cell $i$ produces a spike in a given time step $\Delta t\ll 1$ is defined as $Pr\left(\text{spike}\right)={r}_{i}\left(t\right)\Delta t$.
We consider a simple pairwise STDP rule (Kempter et al., 1999). To implement this numerically, if we take the point of view of a neuron $i$, which receives synaptic input from a neuron $j$, then the weight ${w}_{ij}$ will change value for each prepost spike pair. Specifically, the weight is strengthened every time there is a postsynaptic spike. The amount it is strengthened by depends on the how long ago the presynaptic spike occurred. If it is discounted exponentially then we need only take into account a presynaptic variable ${r}_{j}$, where
and ${S}_{j}\left(t\right)={\displaystyle \sum _{k}}\text{\hspace{0.17em}}\delta (t{t}_{j}^{k})$ is the train of spikes emitted by neuron $j$. Then, when the postsynaptic spike occurs at time $t$ we increase ${w}_{ij}\to {w}_{ij}+{A}_{+}{r}_{j}\left(t\right)$. Similarly, ${w}_{ij}$ is weakened whenever there is a presynaptic spike by an amount which depends on the spike times of the postsynaptic neuron $i$. Therefore, we can keep track of a postsynaptic variable ${o}_{i}$, where
and everytime there is a presynaptic spike at time t we update ${w}_{ij}\to {w}_{ij}{A}_{}{o}_{i}\left(t\right)$. Weights are bounded below by zero and above by ${w}_{max}$. This fully characterizes the plasticity rule. The parameters ${A}_{+}=0.1$, ${\tau}_{+}=20$ ms, ${A}_{}=0.1/3$, ${\tau}_{}=60$ ms unless otherwise stated.
Initial Condition
Request a detailed protocolThe weight matrix ${w}_{ij}$ was trained by simulating exploration on 10 distinct linear tracks for one hour each. The value of ${w}_{ij}$ was taken as a constant 40 for all synapses at the beginning of exploration of the first environment and the maximum possible weight was set at ${w}_{max}=80$. See Figure 2—figure supplement 1 for statistics of the synaptic weights during the training process. For the connectivity profiles, we calculate the mean synaptic weight between pairs of neurons with a difference in place field location $\Delta \theta $ at the given time of the snapshot. There are no autapses, but the curve is made continuous at $\Delta \theta $ by interpolating between adjacent points. The connectivity is normalized by subtracting the mean and dividing by 40.
Details for Figure 2
Request a detailed protocolTo generate the figure in Figure 2b we calculate the SC in 1 s bins during the entire simulation. Every 180 s there is a 3 s period during which the external thetamodulated place field input is removed, in order to model awake quiescence. Activity during this period is spontaneous and is considered to be bursts (black circles). During burst activity we only calculate the SC for the second and third seconds because it takes some time for the place field activity to die away and the burst activity to emerge, for example SC for bursts is calculated for seconds 181–182 and 182–183 but not for 180–181. For the simulation in Figure 2b the place field input is simply removed when the animal stops, that is the external input is set to a constant value of ${I}_{0}=3$ Hz while for Figure 2c it is set to ${I}_{0}=0.75$ Hz. Changing ${I}_{0}$ does not significantly affect the value of SC, only the degree of burstiness of the spontaneous activity, see Figure 2—figure supplement 7. The inset in Figure 2b was generated by binning the SC of the total network activity into 3 min bins. The ‘LFP’ in Figure 2c is the networkaveraged input current to the neurons, that is the networkaverage of the argument in the firing rate equations. Additional parameters are: ${I}_{0}=3$ Hz, ${I}_{PF}=25$ Hz, ${I}_{theta}=1$, $f=8$ Hz, ${w}_{I}=65$.
Details for Figure 3
Request a detailed protocolThe curves shown in Figure 3a are a cartoon meant to illustrate how the recurrent connectivity can be decomposed into a spatial Fourier series which include even (cosine) and odd (sine) terms. The amplitude of an even term (its coefficient) can lead to a transition in the network dynamics when it reaches a critical value. In Figure 3c the coefficients ${w}_{even}$ and ${w}_{odd}$ are the first cosine and sine Fourier coefficients of the mean recurrent connectivity are calculated as in Figure 2 at a given time during the simulation. Parameters in (c) are the same as in Figure 2.
Details for Figure 4
Request a detailed protocolThe parameters in (a) and (b) are the same as in Figure 2. Parameters in (c) and (d) are the same as in Figure 2 with the sole exception of the STDP rule. For the antisymmetric case (c) the parameters are ${A}_{+}=0.1$, ${\tau}_{+}=40$ ms, ${A}_{}=0.1$, ${\tau}_{}=40$ ms while for (d) they are ${A}_{}=0.1$, ${\tau}_{+}=20$ ms, ${A}_{+}=0.1/3$, ${\tau}_{}=60$ ms.
Details for Figure 6
Request a detailed protocolThe virtual animal has explored thirty distinct environments for one hour each. In each environment the coding level is one half, that is one half of the neurons are modeled as place cells (randomly assigned place field location from uniform distribution around the track) and the other half receive only constant background input with ${I}_{0}=0$ Hz. Plasticity occurs via STDP as before between all cell pairs. Place cells in any given environment are chosen randomly with equal probability; hence the overlap in placecell representation between any two environments is on average one half. The number of neurons is N = 200, so that in any environment there are still 100 place cells, as in previous simulations.
Derivation of continuous learning rule from pairwise and triplet STDP rule
Pairwise rule
Request a detailed protocolWe can write down an approximate description for the evolution of ${w}_{ij}$ which is accurate if the STDP occurs much more slowly than changes in the firing rates (see Kempter et al. PRE 1999 for details). For the case of inhomogeneous Poisson processes the equation is
As we are only interested in the evolution of the weights on a long timescale (longer than the timescale of fluctuations in the rates) we will consider the time averaged equation
For a stationary process we have $\u27e8{r}_{j}\left(tT\right){r}_{i}\left(t\right)\u27e9}_{t}={\u27e8{r}_{j}\left(t\right){r}_{i}\left(t+T\right)\u27e9}_{t$. Therefore, as long as we restrict our analysis to stationary processes (which is the case here), we can write (Equation 15) in the compact form
If neurons are arranged along a periodic track and can be parameterized according to their position (phase) $\theta $, then (Equation 16) becomes (Equation 3). Note that (Equation 3) is only used for analysis. Plasticity in the numerical simulations is modeled as described in the preceding section.
Triplet rule
Request a detailed protocolThe STDP rule which depends only on spike pairs cannot account for some general experimental findings, including the dependence of LTP on presentation frequency. A triplet rule can describe these effects (Pfister and Gerstner, 2006). The triplet rule is implemented as above, but by adding one additional variable to keep track of per neuron. Specifically, we rename the previous presynaptic and postsynaptic variables ${o}_{1}$ and ${r}_{1}$ and add two more ${o}_{2}$ and ${r}_{2}$. Now if there is a spike in neuron $i$ at time $t$, then the weight ${w}_{ij}$ is potentiated by an amount ${r}_{1,j}({A}_{2}^{+}+{A}_{3}^{+}{o}_{2,i}(t\u03f5\left)\right)$, where the $\u03f5$ means that you take the value of the postsynaptic variable ${o}_{2}$ before it is incremented due to the spike. Similarly, the weight ${w}_{ji}$ is depressed by an amount ${o}_{1,j}\left(t\right)({A}_{2}^{}+{A}_{3}^{}{r}_{2,i}(t\u03f5\left)\right)$.
One can write an equation similar to (Equation 3) for the triplet rule, which is exact if the learning is slow compared to the rate dynamics. The full equation, with pairwise and triplet interactions is
The kernel ${K}_{11}$ is just the standard pairwise one from before, where we write 11 to mean ‘one presynpatic spike and one postsynaptic spike’. The kernel ${K}_{12}$ is the one for ‘two presynaptic spikes and one postsynaptic spike’. The STDP rule is multiplicative, that is the potentiation due to the triplet is the product of ${r}_{1}$ and ${o}_{2}$ which implies that ${K}_{21}\left({\tau}_{1},{\tau}_{2}\right)={K}_{11}\left({\tau}_{1}\right){\stackrel{~}{K}}_{21}\left({\tau}_{2}\right)$, where ${\tau}_{1}={t}_{pre}{t}_{post,2}$ and ${\tau}_{2}={t}_{post,1}{t}_{post,2}$, so ${\tau}_{1}\le 0$ and ${\tau}_{2}\le 0$, and we are only integrating over the potentiating part of ${W}_{11}$ which is also clear from the STDP rule.
Plasticity in network of place cells on periodic track
Request a detailed protocolWe first consider a simple case where the place cell activity is given by $r\left(\theta ,t\right)={r}_{0}+{r}_{PF}\phantom{\rule{thinmathspace}{0ex}}\mathrm{cos}\left(\theta vt\right)$, that is the rate dynamics trivially follow the motion of the virtual animal and are not ‘theta’modulated. Also, we are taking a continuum limit in which the index of a neuron $i$ is replaced with the position of its place field along the track $\theta $. In (Equation 3) we have the product
where $f\left(t\right)$ is a timeperiodic function. The first two terms in (Equation 18) will therefore determine the mean growth of the even and odd modes of the connectivity on a slow timescale, while the remainder will lead to fluctuations about this mean on a faster time scale. (This timescale separation will be conducted more formally in a later section) Averaging over the fast time eliminates these fluctuations. Therefore the rate of change of the connectivity can be expressed as
where the brackets on the l.h.s. indicate that we have taken a timeaverage to eliminate timeoscillating terms. Therefore, (Equation 3) reduces to
Performing these integrals gives
We see from (Equation 22) that the balance condition ${A}_{+}{\tau}_{+}={A}_{}{\tau}_{}$ must hold to avoid the synapses from all reaching their maximum or minimum value. At this point we would also note that for any reasonable velocity the growth rate of the even mode is extremely small and far from the maximal possible growth rate. Specifically, the velocity has unit of radians. For a three meter track, a rat running at 50 cm/s would have a velocity of about 1 rad/sec. Typical time scale for STDP are 50 ms or 0.05 s. Therefore the nondimensional quantity ${\tau}^{2}{v}^{2}=0.0025\ll 1$ and, given the balance condition, ${\dot{w}}_{\text{even}}\sim 0$.
Including the triplet interactions by using (Equation 17) gives
From (Equation 25) we see there are two additional balance conditions to avoid saturation or decay to zero of synaptic weights: ${\tau}_{x}={\tau}_{y}$ and ${A}_{3}^{+}{\tau}_{+}{\tau}_{y}={A}_{3}^{}{\tau}_{}{\tau}_{x}$. Once we apply the balance conditions the growth rate of the mean connectivity ${w}_{0}$ is identically zero.
Including periodic modulation
Request a detailed protocolHere we consider the case where the place cell activity is modulated periodically with a frequency $f$, that is $r\left(t\right)={r}_{0}+{r}_{\text{PF}}\phantom{\rule{thinmathspace}{0ex}}\mathrm{cos}\left(\theta vt\right)\left(1+{r}_{\text{theta}}\phantom{\rule{thinmathspace}{0ex}}\mathrm{cos}\left(2\pi ft\right)\right)$. This can be rewritten as
where ${v}_{+}=v+2\pi f$ and ${v}_{}=v2\pi f$. Now when we plug (Equation 28) into (Equation 3) or (Equation 17) we find that each of the three cosine terms makes an independent contribution. The reason is that the cross terms (which arise due to the product of rates) lead to time periodic terms which vanish once we average the equation. Therefore we end up with equations of the form (Equation 22–24) or (Equations 25–27) but with three times the number of terms, that is for $v$, ${v}_{+}$ and ${v}_{}$. These equations can, however, be simplified by considering the limit where the ratio $\u03f5={\scriptscriptstyle \frac{v}{2\pi f}}\ll 1$. For example, when $f=8$ Hz and $v=1$ rad/sec (50 cm/sec on a three meter track) then ${\scriptscriptstyle \frac{v}{2\pi f}}\sim 0.02$. Slower running speeds or faster oscillations will make the following approximation even more accurate. In this limit we have (for the pairwise rule)
Furthermore, given that $\tau v$ is also a small parameter, we can neglect the first term in (Equation 29), which means that the growth rate of the even mode is entirely dominated by the modulation frequency to leading order. Because $f$ could take on any value the $\dot{w}}_{\text{even}$ need not be close to zero. In fact, there is a critical value of f which maximizes the growth rate $f={\scriptscriptstyle \frac{1}{2\pi \sqrt{{\tau}_{+}{\tau}_{}}}}$. On the other hand the growth of the odd mode is not dependent on the modulation frequency to leading order, but only on the motion of the animal.
Coupled rate dynamics and synaptic plasticity
Request a detailed protocolNow we return to the full problem, namely, (Equations 1–3) and we take $\varphi $ to be linear for simplicity. In principle (Equations 1–3) are extremely challenging to analyze by virtue of the nonlinear combinations of the rates and the connectivity, both of which evolve in time. However, if we assume that the connectivity evolves much more slowly than the rates, the problem simplifies considerably. Specifically, we assume that the changes to the synaptic weights due to each spike pair are small. Formally, we can introduce a small parameter $\u03f5$ and write $A}_{+}=\u03f5{\stackrel{~}{A}}_{+$ and $A}_{}=\u03f5{\stackrel{~}{A}}_{$, and so $K\left(T\right)=\u03f5\stackrel{~}{K}\left(T\right)$, where all of the ‘tilded’ quantities are order one. In the limit $\u03f5\to 0$ we would then find that (Equation 1) is unchanged while (Equation 3) becomes $\dot{w}\left(\theta {\theta}^{{\mathrm{\prime}}^{}}\right)=0$, that is the connectivity is a constant. This clearly is an approximation since the connectivity will change over time, just slowly. In fact it is precisely this slow evolution that we would like to describe. In order to do this we can define a new, slow time ${\tau}_{s}=\u03f5t$ and allow both the firing rate and the connectivity to evolve on both the fast and the slow time scales, which are now taken to be independent. This is known as a multiscale approach.
Formally we make the following ansatz
where ${t}_{s}=\u03f5t$. Therefore, the temporal derivatives in (Equations 1–3) can be written $\dot{r}={\mathrm{\partial}}_{t}r+\u03f5{\mathrm{\partial}}_{{t}_{s}}r$ by using the chain rule. This leads to the system of equations
To leading order we have the equations
(Equation 36) shows that, to leading order, the connectivity is independent of the fast time $t$, that is ${w}_{0}={w}_{0}(\theta {\theta}^{\text{\u2032}},{t}_{s})$. Given this, the term ${w}_{0}$ can be treated as a constant in (Equation 35), and the rate is given by ${r}_{0}={r}_{0}(\theta ,{t}_{s},t)$. At order $\vartheta \left(\u03f5\right)$ the equations are
The solution to (Equation 37), ${r}_{1}$ gives the next order correction to the firing rates. On the other hand, (Equation 38) would seem to require solving for both ${w}_{0}$ and ${w}_{1}$ simultaneously. However, we note that the product of rates ${r}_{0}(\theta ,{t}_{s},t){r}_{0}({\theta}^{\text{\u2032}},{t}_{s},t+T)$ can actually be written as $g(\theta {\theta}^{\text{\u2032}},{t}_{s};T)+f(\theta {\theta}^{\text{\u2032}},{t}_{s},t;T)$. That is, there is a part which is independent of time, except for the dependence of the slow time through the connectivity, and a part which varies on a regular, nonslow time scale. This fast, timevarying part, has zero mean (it is periodic here) and hence can be eliminated by averaging on an appropriate intermediate time scale, which is still much faster than the slow time scale, that is ${\u27e8f\u27e9}_{t}=0$. Therefore, we can define ${w}_{1}$ as that component of the connectivity which is driven by $f$, and hence also will have zero mean and will vanish upon averaging. Finally, combining the order one equation for the rate and the averaged order epsilon equation for the connectivity yields
(Equations 39 and 40) are the system we wish to solve here. We first note that the connectivity can always be decomposed into a Fourier series. Anticipating the fact that the only terms which will not have zero mean in time will be the first cosine and sine terms, we write
Given that the forcing term can be written
then the firing rates ${r}_{0}$ will be
where, after some linear algebra one finds
Taking the product of the rates and keeping only those terms with nonzero mean yields the function $f$, which when plugged into (Equation 40) gives
In the limit $v/\left(2\pi f\right)\ll 1$ the leading order terms of (Equations 44 and 45) are given by (Equations 6 and 7).
The solution for the triplet rule can be similarly found by replacing ${r}_{\text{PF}}$ in (Equations 25–27) by $({r}_{\text{even}}^{2}+{r}_{\text{odd}}^{2})/2$ as well as the analogous terms corresponding to ${v}_{+}$ and ${v}_{}$.
Motion with nonuniform velocity
Request a detailed protocolUp until now we have considered a constant velocity $v$ for ease in calculation. In reality, the motion of the animal will be much more erratic, with both positive and negative speeds during the exploration of the track. In the extreme case of a random walk, in which both clockwise (CW) and counterclockwise (CCW) motion is equally likely, the resulting connectivity is much more symmetric. This occurs because contributions from CW and CCW motion to the even mode simply add, while the same contributions to the odd mode cancel out. We can illustrate this more clearly by considering the continuum learning rule, (Equation 3), with a given distribution of velocities. Specifically, we take the simple case where the firing rates can be written ${r}_{0}+{r}_{\text{PF}}\phantom{\rule{thinmathspace}{0ex}}\mathrm{cos}\left(\theta vt\right)$ (we have seen that the solution to the full model can be written in an analogous way to the solution of this simpler problem) and assume a distribution of firing rates $\rho \left(v\right)$. From (Equations 20 and 21) we have
If we choose $\rho \left(v\right)=\delta (v{v}^{\ast})$, that is, a fixed velocity ${v}^{\ast}$, then we recover our results from previous sections. We can allow for a range of velocities by considering a uniform distribution of velocities, that is
For this choice of distribution we find $\dot{w}}_{\text{even}}={\int}_{\mathrm{\infty}}^{\mathrm{\infty}}\text{}dT\cdot K\left(T\right)\frac{\mathrm{sin}\left({v}^{\ast}T\right)}{{v}^{\ast}T$, while ${\dot{w}}_{\text{odd}}=0$. The function $\mathrm{sin}\left(x\right)/x$ is even and decays more slowly than $\mathrm{cos}\left(x\right)$ around zero lag, meaning that the AC of the firing rate is qualitatively similar whether $v$ is a constant or distributed. On the other hand, the CC of neurons with disparate place field location will be zero on average if the animal samples both directions equally. Nonetheless, any asymmetry in the velocity distribution (which is expected for any finite length of exploration) will lead to a nonzero odd mode in the connectivity.
A traveling wave instability of the rate equation for modulated connectivity
Request a detailed protocolIn the previous sections we have seen how the connectivity grows in time due to the nonstationarity of the firing rates. In particular, the AC of the placecell activity drives growth of the even mode, and this growth depends on the STDP rule parameters and the modulation (theta) frequency. Here we show that once the amplitude of the even mode reaches a critical value there is an instability of the network activity in the absence of place field input, which gives rise to spontaneous traveling waves. This is the origin of replay activity in the model.
We consider the firing rate equation in the absence of place field input, that is spontaneous dynamics.
and a small perturbation about a steady state solution of the form $r\left(\theta ,t\right)={r}_{0}+\left(\delta {r}_{0}+\delta {r}_{\text{even}}\mathrm{cos}\left(\theta \right)+\delta {r}_{\text{odd}}\mathrm{sin}\left(\theta \right)\right){e}^{\lambda t}$. Plugging this into (Equation 49) leads to a characteristic equation for $\lambda $ the solutions of which are
Therefore when ${w}_{\text{even}}>2/{\varphi}^{\text{\u2032}}$ there is an oscillatory instability with frequency $\omega ={\scriptscriptstyle \frac{{w}_{\text{odd}}}{2\tau}}{\varphi}^{\text{\u2032}}$.
Including the effects of synaptic depression
A general study of the dynamics in recurrent networks with a ring topology and synaptic depression can be found in (York and van Rossum, 2009). Here we outline several calculations which are relevant for our model in the context of plasticity and replay.
Evolution of recurrent connectivity
Request a detailed protocolSynaptic depression is included in the model in order to generate burstlike activity during awake quiescent periods. The equations are
Here we note that the quadratic nonlinearities $xr$ in (Equations 51 and 52) make it impossible to find a closed form solution. However, we also note that the growth rate of the first even mode of the connectivity will still depend only on the product of the STDP window with the autocorrelation (AC) of the firing rate, as before. Therefore, the modulation of the firing rate due to the theta rhythm, the resulting shape of the AC, and its consequence for the growth rate: maximal growth for intermediate frequencies commensurate with the STDP window, is qualitatively the same.
If we assume the place field input is weak, that is $I\left(\theta ,t\right)={I}_{0}+{I}_{\text{PF}}\mathrm{cos}\left(\theta vt\right)$, where ${I}_{\text{PF}}\ll 1$, then we can linearize the solution. In this case we can write $r\left(t\right)={r}_{0}+{r}_{\text{even}}\phantom{\rule{thinmathspace}{0ex}}\mathrm{cos}\left(\theta vt\right)+{r}_{\text{odd}}\phantom{\rule{thinmathspace}{0ex}}\mathrm{sin}\left(\theta vt\right)$ and $x\left(t\right)={x}_{0}+{x}_{\text{even}}\phantom{\rule{thinmathspace}{0ex}}\mathrm{cos}\left(\theta vt\right)+{x}_{\text{odd}}\phantom{\rule{thinmathspace}{0ex}}\mathrm{sin}\left(\theta vt\right)$, and the product
where ‘h.o.t.’ means higher order terms. Plugging this into (Equation 52) gives the solution to x in terms of r
Now we plug the expansion (Equation 54) into (Equation 51). Doing so yields the following nonlinear algebraic equations
which can be solved to find ${r}_{0}$, ${r}_{\text{even}}$ and ${r}_{\text{odd}}$. This analysis has been carried out without periodic modulation of the input. Including periodic modulation with frequency $f$ once again just leads to two equivalent sets of algebraic equations for ${v}_{+}=v+2\pi f$ and ${v}_{}=v2\pi f$. Then, the selfconsistent equations for the growth rates of the connectivity are just (Equations 43–45), where the $r$s are replaced by the $\stackrel{~}{r}$s calculated above.
Linear stability of spontaneous activity
Request a detailed protocolTo study the stability of the spontaneous state we consider (Equations 51 and 52) with constant input $I={I}_{0}$. We consider perturbations of the steady state solution of the form $r\left(t\right)={r}_{0}+\left(\delta {r}_{0}+\delta {r}_{\text{even}}\phantom{\rule{thinmathspace}{0ex}}\mathrm{cos}\left(\theta \right)+\delta {r}_{\text{odd}}\phantom{\rule{thinmathspace}{0ex}}\mathrm{sin}\left(\theta \right)\right){e}^{\lambda t}$, and $x\left(t\right)={x}_{0}+\left(\delta {x}_{0}+\delta {x}_{\text{even}}\phantom{\rule{thinmathspace}{0ex}}\mathrm{cos}\left(\theta \right)+\delta {x}_{\text{odd}}\phantom{\rule{thinmathspace}{0ex}}\mathrm{sin}\left(\theta \right)\right){e}^{\lambda t}$. Plugging these formulas into (Equations 51 and 52) yields a fourthorder characteristic equation for the eigenvalues associated with the spatially modulated modes. In the limit ${U}_{0}\ll 1$ and $1/{\tau}_{x}\ll 1$, which is always the case in our simulations, this equation reduces to a quadratic equation, the solution of which is
which is similar to the case without synaptic depression, (Equation 50), with the sole difference that the connectivity is multiplied by the the depression variable in the steady state ${x}_{0}$. Therefore there again is an instability to traveling waves, but the effective recurrent connectivity is weakened by the synaptic depression, thereby shifting the instability. When there is no depression ${x}_{0}=1$ and we recover the previous result.
Analysis of data from hippocampal recordings
We analyzed data collected from hippocampal recordings using bilateral silicon probes from two distinct sets of experiments. In the first set two rats had their medial septa reversibly inactivated through injection of muscimol and were then exposed to a novel square, periodic track for 35 min (Wang et al., 2015). After a recovery period they were reintroduced onto the same track for an additional 35 min. In the second set of experiments two rats were exposed to a novel hexagonal track for 35 min in a first session. Thereafter they were placed back onto the same track for two additional 35 min sessions with a rest period in between each session.
Medial septum inactivation experiments
Request a detailed protocolThere were a total of n = 124 and n = 128 cells from rats A991 and A992 respectively. In order to identify place cells we first calculated a rate map on the track for each neuron, and then linearized the rate to obtain a onedimensional place field. We then fit the place field with the function $r={r}_{0}+{r}_{1}\phantom{\rule{thinmathspace}{0ex}}\mathrm{cos}\left(\theta {\varphi}_{1}\right)+{r}_{2}\phantom{\rule{thinmathspace}{0ex}}\mathrm{cos}2\left(\theta {\varphi}_{2}\right)$ thereby extracting the coefficients of the Fourier modes and their phases, see Figure 7—figure supplement 1a and b and Figure 7—figure supplement 2. We also extracted the coefficients via FastFourier Transform and found identical results. We did not find any cells for which ${r}_{2}$ was significant compared to the first two coefficients and hence limited our analysis to the first two coefficients and the phase, that is the place field was centered at ${\varphi}_{1}$. We excluded all cells for which the tuning index (TI) ${r}_{1}/{r}_{0}$ was below one. We additionally required that the mean firing rate of selected cells over the entire experiment was greater than 0.4 Hz. With these criteria we had a total of n = 13 and n = 19 cells for the muscimol and postmuscimol sessions in rat A991 and n = 4 and n = 0 cells for rat A992, see Figure 7—figure supplement 4. To ensure we had a sufficient number of neurons we additionally compared the timeresolved SC to the SC from 5000 shuffled orderings of the cells (ttest, p<0.005, corrected for multiple comparison). Significant differences indicate that the SC carries spatial specificity as opposed to simply reflecting the degree of global correlation in the network (shaded regions in Figure 7—figure supplement 4). The SC for rat A991 was always significantly different from shuffled for all TI, while for rat A992 it was often not so. For this reason we only used the data from rat A991 for these experiments, see Figure 7. We furthermore computed the power spectrum of a simultaneously recorded local field potential (LFP) signal, see Figure 7c and d and Figure 7—figure supplement 4.
Hexagonal track experiments
Request a detailed protocolThere were a total of n = 158 and n = 134 cells from rats A991 and A992 respectively. We selected cells as described above. This left a total of n = 6, 9 and 4 cells and n = 17, 20 and 13 cells for rat A991 and A992 respectively for the three sessions. The SC calculated for rat A991 was not significantly different from that from 5000 shuffled data sets with random reorderings of the cells, see nonshaded regions in Figure 7—figure supplement 3. Therefore we only considered data from rat A992 for these experiments, see Figure 7e.
Calculating the SC
Request a detailed protocolWe consider the firing rate of a $N$ neurons in a time interval of length $T$, divided into $k$ equal time bins of length $T/k$. Once we have ordered the cells according to their place field positions, the crosscorrelation of two neighboring cells over the interval is given by
where ${r}_{i}^{t}$ is the firing rate of cell $i$ in time bin $t$, and $\ufffd{r}_{i}\ufffd$ and $Var\left({r}_{i}\right)$ are the mean and the variance of the firing rate over the interval. The superscript $\mu $ indicates the particular ordering. The sequential correlation is the average crosscorrelation of neighbors in the network
Therefore we can also write
where
We can use (Equation 65) to calculate the variance of the SC over the time interval in question. The SC in Figure 7c,d,e as well as in Figure 7—figure supplements 3,4, is calculated by first dividing the total experiment into intervals of 3 min. For each of these 3 min intervals a bin of 1 s was used to calculate the SC. In Figure 7b the firing rates over the whole experiment are used, in time bins of 500 ms. The shuffled SC shown in Figure 7a (histogram), and ce (open symbols) is found in the same way by randomly reordering the cells.
Data availability
All electrophysiological data has been uploaded to the Dryad website. The DOI is doi:10.5061/dryad.n9c1rb0

Dryad Digital RepositoryData from: Thetamodulation drives the emergence of connectivity patterns underlying replay in a network model of place cells.https://doi.org/10.5061/dryad.n9c1rb0
References

Thetaspecific susceptibility in a model of adaptive synaptic plasticityFrontiers in Computational Neuroscience 7:1–18.https://doi.org/10.3389/fncom.2013.00170

Synaptic modifications in cultured hippocampal neurons: dependence on spike timing, synaptic strength, and postsynaptic cell typeThe Journal of Neuroscience : The Official Journal of the Society for Neuroscience 18:10464–10472.https://doi.org/10.1523/JNEUROSCI

Conjunctive input processing drives feature selectivity in hippocampal CA1 neuronsNature Neuroscience 18:1133–1142.https://doi.org/10.1038/nn.4062

BookRhythms of the BrainOxford University Press.https://doi.org/10.1093/acprof:oso/9780195301069.001.0001

Connectivity reflects coding: a model of voltagebased STDP with homeostasisNature Neuroscience 13:344–352.https://doi.org/10.1038/nn.2479

Hippocampal plasticity across multiple days of exposure to novel environmentsJournal of Neuroscience 24:7681–7689.https://doi.org/10.1523/JNEUROSCI.195804.2004

First occurrence of hippocampal spatial firing in a new environmentExperimental Neurology 62:282–297.https://doi.org/10.1016/00144886(78)900584

Hippocampal sharp waves and reactivation during awake states depend on repeated sequential experienceJournal of Neuroscience 26:12415–12426.https://doi.org/10.1523/JNEUROSCI.411806.2006

A unified dynamic model for learning, replay, and SharpWave/RipplesJournal of Neuroscience 35:16236–16258.https://doi.org/10.1523/JNEUROSCI.397714.2015

Awake replay of remote experiences in the hippocampusNature Neuroscience 12:913–918.https://doi.org/10.1038/nn.2344

Hebbian learning and spiking neuronsPhysical Review E 59:4498–4514.https://doi.org/10.1103/PhysRevE.59.4498

Multiple representations in the hippocampusHippocampus 1:240–242.https://doi.org/10.1002/hipo.450010305

Attentive scanning behavior drives onetrial potentiation of hippocampal place fieldsNature Neuroscience 17:725–731.https://doi.org/10.1038/nn.3687

On the directional firing properties of hippocampal place cellsThe Journal of Neuroscience 14:7235–7251.https://doi.org/10.1523/JNEUROSCI.141207235.1994

The effects of changes in the environment on the spatial firing of hippocampal complexspike cellsThe Journal of Neuroscience 7:1951–1968.https://doi.org/10.1523/JNEUROSCI.070701951.1987

Place units in the hippocampus of the freely moving ratExperimental Neurology 51:78–109.https://doi.org/10.1016/00144886(76)900558

Triplets of spikes in a model of spike timingdependent plasticityJournal of Neuroscience 26:9673–9682.https://doi.org/10.1523/JNEUROSCI.142506.2006

Neural network model of memory retrievalFrontiers in Computational Neuroscience 9:1–11.https://doi.org/10.3389/fncom.2015.00149

Hippocampal sharpwave ripples in waking and sleeping statesCurrent Opinion in Neurobiology 35:6–12.https://doi.org/10.1016/j.conb.2015.05.001

Reactivation of rate remapping in CA3The Journal of Neuroscience 36:9342–9350.https://doi.org/10.1523/JNEUROSCI.167815.2016

Competitive hebbian learning through spiketimingdependent synaptic plasticityNature Neuroscience 3:919–926.https://doi.org/10.1038/78829

The enhanced storage capacity in neural networks with low activity levelEurophysics Letters 6:101–105.https://doi.org/10.1209/02955075/6/2/002

Theta sequences are essential for internally generated hippocampal firing fieldsNature Neuroscience 18:282–288.https://doi.org/10.1038/nn.3904

Hippocampal replay captures the unique topological structure of a novel environmentJournal of Neuroscience 34:6459–6469.https://doi.org/10.1523/JNEUROSCI.341413.2014

Recurrent networks with short term synaptic depressionJournal of Computational Neuroscience 27:607–620.https://doi.org/10.1007/s1082700901724
Decision letter

Michael J FrankSenior Editor; Brown University, United States

Mark CW van RossumReviewing Editor; University of Nottingham, United Kingdom

Maxim BazhenovReviewer; University of California, San Diego, United States
In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.
Thank you for submitting your article "Thetamodulation drives the emergence of connectivity patterns underlying replay in a network model of place cells" for consideration by eLife. Your article has been reviewed by three peer reviewers, one of whom is a member of our Board of Reviewing Editors, and the evaluation has been overseen by Michael Frank as the Senior Editor. The reviewers have opted to remain anonymous.
The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.
Summary:
Overall the reviewers thought that this was an interesting study. However, a substantial number of issues were identified, and in subsequent discussions all reviewers thought each other's revision requests were valid and important.
We believe that it should be manageable to address them within a reasonable time.
Essential revisions:
Here is the summary of the essential revisions. These are just summaries, for the full comments see below.
 Discuss, and ideally simulate, the effect of phase precession.
 More clearly discuss the requirement that STDP rule cannot be frequency dependent, as well as the exact prediction regarding the optimal frequency.
 Extend the discussion of the result of Equation 5 and its predictions.
 Discuss, and ideally simulate, reverse replay in the model.
 Clarify the brain region which is modeled from the outset.
Reviewer #1:
In this study the authors present a rate based network model representing the place field activity in rodents on a periodic track. An external input activates place cells according to their preferred position and an asymmetric learning rule strengthens the recurrent excitatory synaptic weights such that during bursts in a quiet period (no place cell input) replay of the place cell sequence occurs. This learning is accelerated by theta oscillations. Over all, this is a well done study, consisting of computer simulations, analytical derivations and the analysis of electrophysiological data. However there are a few comments that should be addressed.
The model assumes a Poissonrate for each neuron. However, it has been shown that place cells phase precess during theta oscillations which leads to a sequential organization of spiking during the theta cycle. This phenomenon is not mentioned at all. What effect does phase precession have on the presented results?
In this model the growth rate of the symmetric mode, which governs the learning process, depends on the autocorrelation and the plasticity rule. But the autocorrelation itself depends on the firing rate of the place cell. What happens with heterogeneous peak rates as shown in Figure 7A? Does this framework still hold?
I wonder whether the muscimol experiments are sufficient to claim the importance of theta oscillations for the learning process. A counterexample for place cell activity in the absence of theta oscillations has been demonstrated in bats (Yartsev and Ulanovsky, 2013), which should be mentioned. Yes, silencing the medial septum largely abolishes theta oscillations, however, it might be a stretch to reduce the medial septum to its pace making function.
Reviewer #2:
Summary:
The authors describe a spiking neural network model of place cells using what seems to be a variation of LinearNonlinearPoisson (LNP) neurons equipped with a temporally asymmetric SpikeTimingDependent Plasticity (STDP) rule and externally oscillatory modulation. The network is capable of giving rise to spontaneous replay following simulated exploration, and its learning rate is optimized by external modulation in the Theta range. Moreover, the authors performed an analytical treatment of a simplified version of the network model to derive a formal description of how the robustness of and timescale in which replay emerging from repeated prepost spike pairings is modulated by oscillatory activity at various frequencies.
The science behind the paper is very solid, and the analytical treatment in addition to the numerical simulations make this a very thorough, impressive modeling paper. The major criticisms revolve around how the authors chose to present some of their findings. Certain claims and findings are very exaggerated or misleading (see Major Critiques 13), and it is unclear what brain region are actually modeled (see Major Critique 4). Finally, there is an unsubstantiated claim about the model and oversight of extremely similar previously published work in the Discussion.
Major Critiques:
1) The authors consistently refer to the frequency band which optimizes the learning rate in the model as "theta", which is typically referred to as a hippocampal oscillation within the 812 Hz range. However, Figure 4A appears to show that optimal synaptic weight changes occur over a much broader frequency range, and Figure 4B specifically shows that optimal replay occurs within the 28 Hz range. The authors should be clear in the introduction about what is typically considered the theta range, and that, while their model does demonstrate the utility of oscillations within the theta range, the effect they observe is not limited to theta.
2) The authors should spend some time interpreting the analytical result from Equation 5 which demonstrates that "the growth rate of the even mode is found by multiplying the AC of placecell activity by the window for plasticity, and integrating." No real physiological/functional interpretations/implications of this result are mentioned (which should always be done), and this leaves it unclear to readers (particularly ones without a rigorous computational background) how impactful this result actually is. It is unsurprising that integrating the autocorrelation of placecell activity multiplied by the plasticity window provides the expected growth rate. This is essentially saying that by considering all of the placecell activity which occurs regularly, and the corresponding weight change given by STDP, while ignoring the noisy activity (which would tend to cancel out on average from this type of unsupervised learning rule) it is possible to estimate the growth rate of the even mode (i.e. the one relevant to learning and which governs replay). The straightforward intuition of this result should be discussed both so that less computationally inclined readers can appreciate the result, and so that it does not mistakenly get taken to be predictive of a novel interpretation of how STDP leads to learning.
3) The authors propose an unsubstantiated mechanism for how their network could also generate reverse replay in addition to the more commonly studied forward replay. They should either provide supporting data for this claim or go into a much more detailed explanation of their proposed mechanism. To begin with, it is unclear what exactly is meant by "synapses to downstream neurons become rapidly depressed". Is this referring to longterm depression, shortterm presynaptic depression, etc.? Assuming shortterm presynaptic depression (as it has the appropriate timescale) this mechanism would only make sense if the same placecell sequences were activated during the theta cycle as those immediately reactivated in during sharpwave replay. However, as far as I know, no such tight temporal correlation has been observed.
4) The authors need to be clearer from the beginning which region of the brain is modeled. The paper does not spend any time on this or the general architecture of the network model in the Introduction. The constant reference to "placecells" implies that this is a model of CA1, however the connectivity of the model makes this more likely to reflect CA3; a fact which the authors bring up themselves in the Discussion. Their defence of how these results can still apply to learning and replaying placecell sequences is solid, and I would recommend the authors take this approach from the start. Bringing it up at the end of the paper seems somewhat misleading and could potentially turn off readers and make them dismissive of the interesting findings reported in the paper.
Reviewer #3:
This paper presents a theoretical study how STDP together with shortterm synaptic depression and correctly tuned oscillation leads to replay like activity of hippocampal place cells. I think that none of the particular mechanisms are new, but the combination and its application to hippocampal sequences is and the analysis of the complete system is nice.
 Were any special settings needed to make sure the rotation of the attractor stopped (Figure 2C bottom), and did not continue rotating? Or would sufficient experience lead to longer and longer sequences?
 The maximum in Figure 4A hinges on the lack of frequency dependence in the STDP rule. The problem is that, AFAIK, STDP always has shown frequency dependence (more LTP at higher frequencies) and it is hard to imagine how biophysically a precise balanced STDP model could be constructed.
Currently it is written as if this is a minor issue, and that STDP needs some stabilization anyway. That might be true but the maximum might be lost. I think this weakness of the model needs to be exposed more.
Secondly, related but a bit less important, I would expect that even with balanced STDP the weights diverge due to the correlation contribution to LTP.
https://doi.org/10.7554/eLife.37388.032Author response
Essential revisions:
Here is the summary of the essential revisions. These are just summaries, for the full comments see below.
 Discuss, and ideally simulate, the effect of phase precession.
Phase Precession
The presence of phase precession in the model
It turns out there is, in fact, some degree of phase precession in our model. Because the model is stochastic, the phase precession should be understood as a precession in the probability of generating a spike (the peak firing rate) with respect to the underlying theta phase as the virtual animal runs through a given place field. This mechanism is explained in detail in (Romani and Tsodyks, 2015) Figure 5. Our model is essentially the same, with the addition of STDP, so the mechanism also holds, with some caveats. This mechanism is intimately tied to the generation of theta sequences during movement, which occurs once the connectivity becomes sufficiently modulated in a novel environment. That is, there is initially no phase precession in the model, just as there is no “replay”. It emerges as the connectivity is shaped through plasticity. We give a detailed description of this effect below.
The mechanism generating phase precession in our model is a network mechanism which was first proposed by Misha Tsodyks and others involving the generation of propagating activity during the theta cycle; this propagation was due to an assumed asymmetry in the recurrent connections between neurons which represented place cells on a linear track (Tsodyks et al., 1996). A very similar effect can be achieved through shortterm synaptic depression even without the need for asymmetric connectivity. The reason is that the motion of the animal, and hence the sequential activation of place cells, generates an asymmetric pattern of synaptic depression (upstream synapses are more depressed than downstream ones). This can also lead to theta sequences.
The relationship between theta sequences and phase precession can be illustrated in a cartoon, in which it also becomes very clear that such a mechanism relies on asymmetry in the place cell activity. Author response image 1A shows the collective activity from a population of place cells ordered according to their place field locations (the ovals can be thought of as contours of a firing rate map). The bump of activity reflects the animal’s motion and is modulated by an ongoing theta rhythm. We consider the firing rate of a single place cell, with a place field centered at the position shown by the horizontal dashed blue line. In this example there is no temporal asymmetry in the place cell activity and hence the maximal firing rate of all cells coincides with the peak in the theta rhythm, including that of the blue cell. The peak in the firing rate gives the most likely time for generating a spike, shown by the solid vertical lines. There is no phase precession.
Author response image 1B illustrates the effect of asymmetry in the placecell activity. Now, within each theta cycle there is a sequence of activation of the place cells. Because of this the maximum firing rate of the blue cell within each cycle does not necessarily coincide with the peak in the theta rhythm. In fact it moves to earlier phases as the animal moves out of the field. This generates phase precession. This is the same mechanism as shown in Figure 5 of (Romani and Tsodyks, 2015). Clearly the details of the phase precession depend on the exact shape of the placecell activity.
This mechanism is present in the model we study. This is shown in Figure 2—figure supplement 5. Specifically, during early exploration the recurrent connectivity is unstructured w.r.t. the space of the novel track. The place cell activity is therefore dominated by sensory input; activity is symmetric, Figure 2—figure supplement 5A, there are no theta sequences, Figure 2—figure supplement 5B, and there is no phase precession, Figure 2—figure supplement 5C. Note also that the place cell activity is highly heterogeneous due to the random nature of the recurrent connectivity from previous learning. So in Figure 2—figure supplement 5 it is even often difficult to define a preferred phase for firing. On the other hand, after sufficient plasticity, the recurrent connectivity becomes strongly modulated in the novel space. When this occurs, shortterm synaptic plasticity leads to strong temporal asymmetry in the placecell activity, Figure 2—figure supplement 5D, there are pronounced theta sequences, Figure 2—figure supplement 5E, and phase precession, Figure 2—figure supplement 5F.
In the case of our model, as in (Romani and Tsodyks, 2015), the phase precession resulting from the synapticdepressiondriven theta sequences, is not as pronounced as that seen in experiment. Therefore, theta sequences are probably not the whole story. In fact, the relationship between theta sequences and phase precession has been known for over a decade (Foster and Wilson, 2007). In that work, the authors show that theta sequences cannot be predicted solely based on the observed phase precession, which may imply the opposite as well. So additional mechanisms are likely.
The role of phase precession in plasticity
The weak phase precession already present in the model arises during the plasticity process, as the recurrent connectivity becomes significantly modulated. Therefore it is a result of the plasticity process. In order to study the effect of phase precession on the learning process itself, we can introduce it by hand by adding an additional term to the external input to place cells. This allows us to study the role of phase precession parametrically, assuming it is an extrinsic effect. The external forcing now takes the form
The new term, proportional to α, generates theta sequences. Note that when α = 0 we recover the previous forcing, for which the maximum of the input is simply θ_{max} = vt, i.e. it follows the motion of the animal. On the other hand, when α ≠ 0 the maximal input sweeps ahead of the position of the animal periodically. For example, for α = 1 it can be shown that the maximal input is θ_{max} = vt + ψ where ψ = πft mod 2π. Author response image 2 shows place cell activity and input for two values of α = 0,1 which correspond to theta activity without and with theta sequences respectively.
As it turns out, generating theta sequences, which is the ratebased equivalent of phase precession, has a significant effect on the growth rate of the even mode of the connectivity. It therefore also strongly influences the time at which a transition to replay occurs. Specifically, theta sequences speed up learning. This is shown in Author response image 3 which shows that the transition to replay occurs earlier when α = 1 compared with α = 0. The inset quantifies the dependence of the transition time on α. The curve asymptotes for large enough α.
Why is there a speedup in learning when theta sequences are present? We can solve Equations 39 and 40 of the Materials and methods using Equation 1 as the external input to see how the growth rate of the even modes of the connectivity are affected by α. Analytically this is quite messy. The reason is that for α = 0, which is the case studied in the paper, the input only has the spatial Fourier modes cosθ and sinθ, and a selfconsistent solution to Equations 39 and 40 can be found by considering only these. When α ≠ 0, additional spatial modes appear, specifically cos((1 ± α)θ), sin((1 ± α)θ). This means that a selfconsistent solution will involve at least these modes, making the calculation extremely lengthy and tedious.
Therefore, rather than do this we performed numerical simulations of the full model, using Equation 1 as an input, but with a linearthreshold transfer function. We also take I_{0} > I_{PF} which ensures that the firing rate dynamics operates in the linear regime. Doing this makes the full system equivalent to Equations 39 and 40, which assume a linear transfer function. When we do this we find that taking α ≠ 0 indeed leads to the growth of additional spatial modes in the connectivity (not shown). However, the amplitude of the even modes is actually smaller than in the α = 0 case. On the other hand, when we take I_{0} < I_{PF}, so that the firing rate dynamics operates in the nonlinear, rectified regime, the opposite occurs. Namely, when α ≠ 0, the even modes grow faster than when α = 0. This strongly suggests that the speedup of learning due to the theta sequences is a nonlinear effect, not captured by our current theory.
Summary of phase precession
We apologize for the length of our response, but it turned out that phase precession/theta sequence dynamics in our model are quite rich. In short, we can make the following two statements: 1 – theta sequences emerge spontaneously in our model during the learning process, leading to weak phase precession. 2 The presence of theta sequences during learning speeds up the learning process. The mechanism is nonlinear but not well understood (at least by us) at this point. Also note that the role of the “theta” frequency f on the growth rate of the even mode, continues to hold even in the presence of theta sequences. So theta sequences represent a means of providing an additional speedup of learning once the theta rhythm is already there, in our model.
Changes to Manuscript
We have now added two sentences to the Results. They are: “This steady increase is due to the emergence of socalled theta sequences. […] The emergence of theta sequences here also coincides with a precession of the phase of the underlying firing rate with the ongoing theta rhythm, see Figure 2—figure supplement 5f and Discussion for more detail.”
We have also added a new section in the Discussion on theta sequences and phase precession.
 More clearly discuss the requirement that STDP rule cannot be frequency dependent, as well as the exact prediction regarding the optimal frequency.
Frequency dependence of the STDP rule
Balanced rule
In this paper we have only considered a balanced STDP rule, i.e. one for which the strength of potentiation and depression are on average the same given fixed pre and postsynaptic firing rates. The upshot for our analysis is that only timevarying firing rates lead to lasting networkwide changes in the synaptic connectivity. Specifically, if we write the recurrent connectivity profile as
then with a balanced plasticity rule w_{0} will not change. When this is the case then the growth of w_{even} depends crucially on the presence of an ongoing oscillation in the placecell activity, as explained in the main text of the paper. It can be shown that with a pairwise STDP rule the maximum growth rate occurs for a frequency f = 1/(2π√τ_{+}τ−). This can be found by taking the derivative of Equation 23 with respect to the velocity term and plugging in v = 2πf. From this formula it is easy to see that temporal windows for the STDP rule on the order of 10s to 100s of ms will lead to best frequencies between 1 and 10 Hz. So over the physiological range of observed STDP windows the best frequency is within the range of theta. As noted by the reviewers, the growth curve does not drop off precipitously away from the best frequency. For example, in Figure 5C of the manuscript, frequencies in the range 312Hz more or less give similar growth rates. Therefore, depending on the STDP time constants there will be some range of frequencies for which the growth rate is high. For much lower and much higher frequencies the growth rate does fall by orders of magnitude. For example, again in Figure 5C, a frequency of 1Hz or 30Hz gives an order of magnitude lower growth rate. So if an 8Hz modulation leads to a transition to replay after 20min, then a 1Hz modulation would take 3hrs and 20 min. This is a large difference. If a balanced triplet rule is used the best frequency could be found by taking the derivative of Equation 26 w.r.t. the velocity. We cannot find an analytical solution, but Figure 5E of the main text suggests the best frequency is close to the one from the pairwise rule.
Unbalanced rule
If the rule is not balanced then w_{0} will grow or decay if potentiation or depression dominates respectively. In this case the bounds on the synaptic weights become very important. In particular, if we bound the weights below by a minimum and above by some maximum weight, then with the unbalanced rule, after sufficient time all of the synapses will be fully potentiated or fully depressed. Therefore no structure is possible in this network with an unbalanced rule. This is a very general finding in recurrent networks. In fact, even to store simple patterns using multistable fixed points in networks with STDP is quite hard, and requires additional fast homeostatic mechanisms, e.g. (Zenke et al., 2015).
On the other hand, invitro experiments have shown a “frequencydependence” of the stimulation protocol on the shape of the STDP window. We would like to emphasize that the frequency in question here is the repetitionrate, not the frequency f, i.e. the modulation or “theta” frequency from our model. Rather, it would correspond to the pre and postsynaptic firing rates. So from now on we will talk about a ratedependence of the STDP rule just to be clear. This ratedependence can be taken into account using a triplet STDP rule (Pfister and Gerstner, 2006). This rule is unbalanced in the sense that at high rates potentiation dominates and so using the rule as is will lead to a fully potentiated network. We can see from our Equation 26 that taking a sufficiently unbalanced rule can eliminate much of the dependence of the growth rate of the even mode on the modulation frequency. Specifically, if we take A3+ much larger than the pairwise parameters A_{+} and A− and A3 = 0, then the growth rate will be large and positive even if there is no modulation whatsoever. Unfortunately, if we do this then Equation 25 shows that w_{0} will also grow very fast, wiping out all structure in the network.
We can take a less extreme example to make the point. Author response image 4 of this reply shows the dependence of the growth rate of the homogeneous mode, i.e. the mean synaptic weight, as a function of the mean firing rate of the place cells. Note that this firing rate dependence is what would be called “frequency dependence” of the STDP rule; here we all it rate dependence. In order to calculate this curve, we use Equations 2527 of Materials and methods but including all of the terms which depend on the modulation frequency f as described in the section “including periodic modulation”. The parameters used are τ_{+} = 20ms, τ− = 60ms, A_{+} = 0.001, A− = 0.001/3, v = 0.001, τ = 20ms, τ_{x} = τ_{y} = 100ms, A+3 = 10−7, A3− = 10−8/3.
Again, for the balanced rule (black line in Author response image 4), there is no ratedependence of the growth rate of the mean connectivity. This is what is meant by balanced. In the unbalanced case (orange line) there is a dependence on rate. Author response image 5 shows the dependence of the even mode and the mean connectivity on the modulation frequency if we fix the mean firing rate at r_{0} = 10Hz. Both the balanced and unbalanced cases show a clear maximum in the growth rate of the even mode. On the other hand, for the unbalanced case the growth rate is significant for any frequency whereas in the balanced case it goes to zero at low and high frequencies. Importantly, the growth rate of the mean synaptic weight w_{0} is positive at any frequency. So again in the unbalanced case, after sufficient time all of the synapses will become maximally potentiated, and no network structure is possible. This example merely serves to illustrate the fact that we cannot study an unbalanced rule in a recurrent network without including additional homeostatic mechanisms.
Unbalanced rule with additional mechanisms
Therefore, in order for nontrivial structure to evolve in a recurrent network with unbalanced STDP, additional mechanisms must be included. In (Zenke et al., 2015) this includes a depressing heterosynaptic mechanism, Equation 13 in that paper. Studying that particular mechanism in the context of our model is possible, but would represent a major undertaking. The reason is that heterosynaptic term is proportional to the difference between the current value of the synaptic weight and a reference value which itself changes in time. Furthermore, the pairwise depression term is subject to slow homeostasis, Equations 1718 in that paper. So we unfortunately cannot give a detailed answer here regarding the robustness of our result for that rule. On the other hand, in order to allow for the development of nontrivial structure there should be some mechanism, on not too long a time scale, which brings down the orange curve in Author response image 5 (bottom). If this mechanism is nonlinear in the firing rates, which is the case for (Zenke et al., 2015) for example, then it will also cause the orange curve in Author response image 5 (top) to shift downward. In the limit of very fast “homeostasis” this will just be equivalent to the black curves. For noninstantaneous mechanisms (the relevant case) a systematic study of the dynamics would be needed.
Changes to manuscript
We have now added three sentences to the Discussion, under “Robustness to changes in the plasticity model and to the presence of spike correlations”. They are: “Specifically, when the repetitionrate of spike pairs is high, invitro STDP is dominantly potentiating; for such a rule there would be no nonmonotonic dependence of the learning rate on modulation frequency at high rates. […] It is unclear if the learning rate of this rule would also exhibit a nonmonotonic dependence of the modulation frequency, as in our rule.”
 Extend the discussion of the result of Equation 5 and its predictions.
Discussion of Equation 5
Equation 5 is one of the main results of the paper, so we appreciate the chance to discuss it more in depth. It is easiest to first discuss the plasticity between pairs of neurons in this model. This is given by Equation 3 of the manuscript. If we average this equation to eliminate fluctuations on a fast time scale we have
where CC _{θ,θ’} (T) = ⟨r (θ’, t) r (θ, t + T)⟩ _{t}. Author response image 6 shows how the integral can be evaluated for the hypothetical example in which the cell at θ′reliably fires before the cell at θ. In this case the likelihood of a spike pair peaks at 25ms. Author response image 6B and D show that the synapse from θ′to θ will potentiate while that from θ to θ′will depress. Although Equation 3 is useful for determining the evolution of the synaptic weights between a single pair of neurons, this framework is not very helpful in a large recurrent network. In a network of N place cells there are 2N such equations for the synaptic weights, not to mention the corresponding equations for the firing rates.
The great simplification comes about by decomposing the recurrent connectivity into its Fourier components, specifically
$w(\theta ,\theta )=w(\theta \theta )={w}_{0}+{w}_{even}cos(\theta \theta )+{w}_{odd}sin(\theta \theta )$
Here the coefficients w_{0}, w_{even} and w_{odd} are not the synaptic weights between pairs of neurons. Rather, w_{0} is the mean synaptic weight in the network, w_{even}
is the amplitude of a pattern of synaptic weights in which nearby neurons are strongly coupled in both directions, and more distant neurons have effectively inhibitory interactions. It is often what is called a “Mexican hat” connectivity profile. w_{odd} is a pattern in which nearby neurons are strongly coupled in one direction and effectively inhibitory in the other. We can isolate the Mexican
hat component of the connectivity by taking θ′= θ, i.e. by considering place cells with overlapping place fields and subtracting off the mean connectivity, w_{0}. This is because only growth of the even mode leads to strong recurrent excitation locally. If we now take the temporal derivative of the connectivity and average in time we arrive at Equation 5, which therefore gives the rate of change of the amplitude of the even mode, or the Mexican hatlike connectivity profile. The derivative of w_{0} vanishes because it is a balanced STDP rule.
Biological interpretation of Equation 5
The biological interpretation of Equation 3 is straightforward. If you want to know how the synapses change between a pair of cells, you should just count all spike pairs and use the STDP rule to look up all potentiations and depressions. If the neurons’ activity is a noisy but stationary process, then this prescription is equivalent to first using the full spike trains to calculate the crosscorrelations (CC). Then you just multiply the CC by the STDP window and integrate. Again, this is the same as adding up all of the changes due to each and every spike pair. But it is much easier to implement with real data (assuming you know what the STDP rule is).
The biological interpretation of Equation 5 (from the paper) is less straightforward, but is the following. If you are doing an experiment in which you believe the connectivity between any two place cells is most strongly dependent on the distance between their place fields alone, and not, say the absolute position of any place field, then you can decompose the connectivity into a series which depends only on this distance (in our case we have considered data from a periodic track which the animal explores more or less homogeneously, so there are no “special” locations. If there is a special location, such as a rewarded location, then this theory can be extended by separating the connectivity into a part which only depends on differences between locations, and a locationspecific (e.g. reward specific) part. But that goes beyond the scope of this paper). If this is the case, then you can actually infer global properties of the network structure just by looking at the neuronal activity locally. Specifically, if you find that the activity of neurons with overlapping (or very nearby) place fields would induce strong reciprocal potentiation, then you can conclude that a Mexicanhat like connectivity is forming. Theoretically we assume a large number of place cells with arbitrarily close place fields. In the limit of many cells, the CC of very nearby cells just becomes the autocorrelation AC. Therefore, one can simply take the full spike train of each place cell, calculate the AC, and multiply it by the STDP window. Given real data, for which the AC of each cell will be different, one could take the ensemble average to get the mean growth rate of the even mode. In our theoretical framework each cell has an identical AC by construction.
Now consider the case where the STDP window is perfectly antisymmetric, with identical depression and potentiation lobes. The AC is by definition symmetric. Therefore, the product of these functions is also antisymmetric (or odd) and so the integral is zero, i.e. there is zero growth of the even mode of the connectivity. What is the biological interpretation of this? If we consider place cells with very nearby place fields, then as the animal moves through the place fields in one direction the synapse from cell A to cell B will be strengthened on average, while those from B to A will be weakened. We are taking about changes due to spike pairs with latencies which are most likely to be relatively small, since the place fields are overlapping. If the animal only ever moves from A to B, then the synapse from A to B will continue to get stronger and from B to A weaker. If the animal moves equally in both directions then there will be no overall change in synaptic strength. Therefore, there will be no strong reciprocal excitatory connectivity in the network.
Changes made to manuscript
We have now added a detailed description of the biological interpretation of Equation 5, directly following Equation 5 in the main text. It reads: “Note that despite the similarity in form between Equation 5 and Equation 3, the biological interpretation of the two is quite distinct. […] The key assumption that makes this possible is that the synaptic weight between any two cells should depend only on the difference in placefield location and not on the absolute position, Equation 4.”
 Discuss, and ideally simulate, reverse replay in the model.
Reverse Replay
Replay in the model occurs spontaneously once the recurrent connectivity is sufficiently modulated through plasticity in any given environment. Specifically, once the even mode of the connectivity reaches a critical value, then traveling waves emerge spontaneously through an instability of the homogeneous state. The mathematics behind this are described in the section “A traveling wave instability of the rate equation for modulated connectivity” and “Including the effect of synaptic depression” of Materials and methods. Equations 50 and 61 show that the velocity of the traveling wave is proportional to the odd mode w_{odd}. Therefore, if, during the exploration of a novel environment, the recurrent connectivity is asymmetric, meaning the odd mode is not zero, then waves will appear which travel only in one direction. How then can one get replay in both directions?
Reverse replay through local input
The solution we propose in the discussion is to allow for strong local input. Such input can lead to activity propagating backward w.r.t. the asymmetry in the connectivity. Strong, local input, presumably at or near the animal’s location, leads to depletion of synaptic resources in the recurrent connections. Because the connectivity is asymmetric, this depletion is also asymmetric, more strongly affecting the direction in which replay would usually propagate. Therefore, the activity is now forced to propagate backward. A simulation illustrating this can be seen in Figure 2—figure supplement 7. One appealing feature of this hypothesis is that it requires spatial input to work. In fact, during sleep, when such sensory input is missing, replay is predominantly forward (Roumis and Frank, 2015). Backward replay is more often observed during awake quiescence when such input is available, and may be mediated by place cell activity in CA2 (Kay et al., 2016). So the mechanism is consistent with those particular findings. On the other hand, and as can be seen in Figure 2—figure supplement 7, the backward replay which arises via this mechanism does not look the same as the forward replay. Differences include spiking intensity, duration, and velocity. To our knowledge there has not been any systematic study of these properties comparing forward and backward replay in the experimental literature. Additionally, such differences may not be so striking unless many cells are recorded simultaneously.
Reverse replay with symmetric connectivity
When the connectivity is perfectly symmetric, once the even mode reaches a critical value there is an instability to a stationary spatial bump, and not a wave. This can be seen in Equations 50 and 61 by taking w_{odd} = 0. Nonetheless, in the presence of shortterm synaptic depression this stationary bump is not stable, and traveling waves occur anyway. These traveling waves emerge from the bump state and not the homogeneous state, and as such are not amenable to analysis. We can study them numerically however. Author response image 7 shows how the spontaneous activity changes when the connectivity is perfectly symmetric. Panels (A)(C) show the typical activity for our model, after exploration of 10 environments. This is just a reproduction of Figure 2—figure supplement 2 of the original manuscript. Panels (D)(F) show how the dynamics changes when the connectivity from (A)(C) is made symmetric by taking the connection from cell j to cell i to be (w_{ij} + w_{ji})/2 where the ws are from the learning process. In this way w_{ij} = w_{ji}. Note that in Author response image 7B the replay is always in the same direction, whereas in Author response image 7E the activity can propagate both ways. This is just an illustration. The statistics of the replay process for the symmetric case can be found in Figure 2 of (Romani and Tsodyks, 2015).
When would the connectivity be perfectly symmetric? In our model the connectivity would only be symmetric if the animal explored equal times in the clockwise (CW) and counterclockwise (CCW) directions. This seems unlikely. Indeed, in the data we have there is a clear bias in the animal’s movement for CW motion. Alternatively, there could be some homeostatic process which would conspire to make the connectivity symmetric.
Summary
Plasticity generates asymmetric recurrent connectivity in general in our model. On a periodic track there is no global remapping of place cells when the animal switches direction. Therefore, equal exploration of CW and CCW directions can lead to symmetric connectivity, which would give rise to forward and backward replay. Otherwise, replay goes in one direction, and strong local input is needed to generate replay in the other direction. On a linear track only this second mechanism would work due to global remapping.
Working out the details of replay, including differences between forward and backward replay statistics in data, and also between linear tracks and periodic tracks, would be a valuable contribution. It would also be a huge undertaking. Our proposed mechanism is therefore not a wellfleshed out theory, but merely an idea, albeit one which we have shown to work in our model. Therefore, while we felt it important to mention, we did not deem it conclusive enough to add to the main text. We felt a paragraph in the Discussion was fitting.
Changes to manuscript
We have altered the description of the mechanism for backward replay in the Discussion in order to clarify it. It now reads: “Specifically, if global input to our model network is homogeneous then replay occurs only in one direction. […] This forces the activity to travel backward with respect to the bias in the connectivity.”
The simulations in Figure 2—figure supplement 8 provide an illustration of this effect.
 Clarify the brain region which is modeled from the outset.
Clarifying the brain region
Our model is definitely a model of a strongly recurrent circuit, such as in CA3 and not in CA1. We now mention this explicitly at the outset of the paper. Specifically, in the Introduction we now state:
“Here, we develop a model that explains how synaptic plasticity shapes the patterns of synaptic connectivity in a strongly recurrent hippocampal circuit, such as CA3, as an animal explores a novel environment.”
At the beginning of the Results section we say:
“We modeled the dynamics of hippocampal place cells in a strongly recurrent circuit, such as area CA3, as an animal sequentially explored a series of distinct novel ringlike tracks”.
Unsolicited changes
On revising the manuscript we realized that the sequential correlation (SC) was not formally defined. We therefore added a brief section to Materials and methods to describe this.
Reviewer #1:
In this study the authors present a rate based network model representing the place field activity in rodents on a periodic track. An external input activates place cells according to their preferred position and an asymmetric learning rule strengthens the recurrent excitatory synaptic weights such that during bursts in a quiet period (no place cell input) replay of the place cell sequence occurs. This learning is accelerated by theta oscillations. Over all, this is a well done study, consisting of computer simulations, analytical derivations and the analysis of electrophysiological data. However there are a few comments that should be addressed.
The model assumes a Poissonrate for each neuron. However, it has been shown that place cells phase precess during theta oscillations which leads to a sequential organization of spiking during the theta cycle. This phenomenon is not mentioned at all. What effect does phase precession have on the presented results?
Although our neurons are Poisson we can still talk about the order in which neurons fire in terms of the underlying likelihood of spiking, i.e. the rate. Therefore, we can still study the phenomenon of phase precession, defined in this way. It turns out that: 1) Thetasequences emerge in our model during learning. A signature of these sequences is phase precession, although it is a weak effect. 2) Introducing phase precession by hand during the learning process, in the form of thetasequences, significantly speeds up learning. These two findings are described in depth above, under “Essential revisions”, in the section entitled “Phase precession”.
In this model the growth rate of the symmetric mode, which governs the learning process, depends on the autocorrelation and the plasticity rule. But the autocorrelation itself depends on the firing rate of the place cell. What happens with heterogeneous peak rates as shown in Figure 7A? Does this framework still hold?
A very relevant question. In our theory, in order to facilitate the analysis, we assume that place fields are homogeneously distributed and identical in shape. In the data this is clearly not the case. In the model there is some variability in place field shape due to heterogeneity in the recurrent connectivity, but it is largely homogeneous because each placecell receives placefield input of the same shape.
We have tested the effect of heterogeneity by allowing for a distribution of placefield input to place cells. Specifically, we distribute the amplitude of the placefield input I_{PF}. Figure 2—figure supplement 4A shows how heterogeneity affects the transition to replay. The curve with the black circles is identical to the simulation in Figure 2 of the main text, for which I_{PF} = 25Hz. The red squares are a simulation for which I_{PF} is uniformly distributed between 20 and 30Hz (and hence the mean is the same as before). The orange diamonds show an extreme case where I_{PF} is uniformly distributed between 0 and 50Hz. Figure 2—figure supplement 4B shows sample place cell activity for the strongly heterogeneous case. Note that in this case some cells are only very weakly selective to place, e.g. cell 3, while others have no place field whatsoever, e.g. cell 4. Therefore, heterogeneity in place cell activity has a quantitative effect on the learning dynamics and resulting transition, but does not alter the results qualitatively.
We have now added a sentence to the main text which states:
“This transition [in SC] occurred even in simulations where we included strong heterogeneity in place field tuning, see Figure 2—figure supplement 4.”
I wonder whether the muscimol experiments are sufficient to claim the importance of theta oscillations for the learning process. A counterexample for place cell activity in the absence of theta oscillations has been demonstrated in bats (Yartsev and Ulanovsky, 2013), which should be mentioned. Yes, silencing the medial septum largely abolishes theta oscillations, however, it might be a stretch to reduce the medial septum to its pace making function.
This is a fair point. Inactivating the medial septum undoubtedly has other effects beyond reducing theta. Some of these effects we have characterized to some extent in the supplementary figures. For example, there is stronger rate remapping of place fields when the animal switches from CW to CCW motion under muscimol, see Figure 7—figure supplement 1C. Firing rates are also reduced, see Figure 7—figure supplement 2A. Both of these effects are relatively weak and neither would qualitatively alter any conclusions drawn from our computational model. On the other hand in the experiments we have looked at theta is dramatically reduced, essentially eliminated in fact. This of course would make a huge difference in learning rates in the context of our model.
The absence of theta in the bat is a fascinating case. Although we cannot purport to know why this is, we find that it actually may be consistent with the theoretical mechanism for learning rates we put forth here. Namely, to attain a high learning rate the AC of placecell activity should be modulated on a timescale commensurate with the window for plasticity. In the case of land mammals like rats and mice, the behavioral timescale related to spatial navigation is too slow, therefore necessitating oscillatory modulation in the form of theta. In the case of flying animals, like the bat, changes in sensory input alone are already likely to occur on the order of tens to hundreds of ms due to the high velocity of flight, thereby obviating the need for theta. Note, for example, the two sample cells shown in Figure 4 of (Yartsev and Ulanovsky, 2013), which have AC which decay on the order of 100s of ms. We are not aware of data on STDP from bat hippocampus, so this is just an idea without any solid support. But in principle this idea could be put to the test.
Thank you for the reference.
We have now added several sentences to the Discussion section on the bat, essentially a slight rewriting of the above paragraph, as well as the above reference.
Reviewer #2:
Summary:
[…] The science behind the paper is very solid, and the analytical treatment in addition to the numerical simulations make this a very thorough, impressive modeling paper. The major criticisms revolve around how the authors chose to present some of their findings. Certain claims and findings are very exaggerated or misleading (see Major Critiques 13), and it is unclear what brain region are actually modeled (see Major Critique 4). Finally, there is an unsubstantiated claim about the model and oversight of extremely similar previously published work in the Discussion.
Major Critiques:
1) The authors consistently refer to the frequency band which optimizes the learning rate in the model as "theta", which is typically referred to as a hippocampal oscillation within the 812 Hz range. However, Figure 4A appears to show that optimal synaptic weight changes occur over a much broader frequency range, and Figure 4B specifically shows that optimal replay occurs within the 28 Hz range. The authors should be clear in the introduction about what is typically considered the theta range, and that, while their model does demonstrate the utility of oscillations within the theta range, the effect they observe is not limited to theta.
This is a fair point. The exact range which maximizes the learning rate clearly depends on the model parameters. As explained above in more detail in the section “Frequency dependence of the STDP rule”, the maximum learning rate occurs for f = 1/(2π√τ_{+}τ−. The squareroot dependence makes the optimal frequency only weakly sensitive to large changes in the STDP window, so that a large change in the width from 10s to 100s of ms only shift the optimal frequency from about 10Hz to 1Hz. Nearby frequencies do nearly as well as the optimal so that frequencies in the range 110Hz in general do very well.
In the Introduction we state that “for an intermediate range, set by the timescale of the plasticity rule, it [the growth rate] is higher…”. Also, when discussing Figure 4 we now replace “theta range” with “intermediate range”.
2) The authors should spend some time interpreting the analytical result from Equation 5 which demonstrates that "the growth rate of the even mode is found by multiplying the AC of placecell activity by the window for plasticity, and integrating." No real physiological/functional interpretations/implications of this result are mentioned (which should always be done), and this leaves it unclear to readers (particularly ones without a rigorous computational background) how impactful this result actually is.
We of course want to provide a clear biological interpretation. This is given above in the section “Discussion of Equation 5”.
It is unsurprising that integrating the autocorrelation of placecell activity multiplied by the plasticity window provides the expected growth rate. This is essentially saying that by considering all of the placecell activity which occurs regularly, and the corresponding weight change given by STDP, while ignoring the noisy activity (which would tend to cancel out on average from this type of unsupervised learning rule) it is possible to estimate the growth rate of the even mode (i.e. the one relevant to learning and which governs replay).
Again, the detailed explanation is given in “Discussion of Equation 5” but we would disagree that this result is unsurprising or in any way obvious. What is clear, is that given a pair of neurons, the changes in the synaptic weights will depend on the crosscorrelation (CC) of the neuronal activity. This follows just from the construction of the pairwise STDP rule. Adding up the potentiations and depressions from each spike pair separately (vis the STDP rule) is mathematically equivalent to taking the full spike trains, calculating the CC of the spikes and multiplying this by the STDP window. But here we are not talking about the weight changes between pairs of neurons. Rather we are talking about the growth of a pattern of synaptic connectivity which spans the entire network and which could involve changes in potentially hundreds or thousands of neurons simultaneously. The fact that one can infer the growth of a global pattern of connectivity just by looking at the AC of neuronal activity, a purely local quantity, is surely not trivial. It depends on the key assumption that the connectivity between any two place cells should only depend on the difference on their place cell positions, and not on their absolute positions. Again, details are given in “Discussion of Equation 5”.
The straightforward intuition of this result should be discussed both so that less computationally inclined readers can appreciate the result, and so that it does not mistakenly get taken to be predictive of a novel interpretation of how STDP leads to learning.
Again, we personally don’t find it straightforward, but we certainly agree that the explanation, both mathematical and physiological, should be as clear as possible, please see the section entitled “Discussion of Equation 5”.
3) The authors propose an unsubstantiated mechanism for how their network could also generate reverse replay in addition to the more commonly studied forward replay. They should either provide supporting data for this claim or go into a much more detailed explanation of their proposed mechanism.
As far as we know there are no substantiated mechanisms for how either forward or backward replay are generated. Everything is quite hypothetical at this point. But the point is well taken. We actually were conflicted about how much detail to go into on backward replay. The conundrum of course is that asymmetric connectivity, which is almost unavoidable in our model, leads to replay in one direction only. With symmetric connectivity there is no problem, and this can happen if the animal explores CW and CCW in equal measures. On a linear track this would not be possible due to global remapping, so the problem would persist. Our proposed solution is illustrated in simulations in Figure 2—figure supplement 7. A detailed description of the problem and our proposed solution is found above in section “Reverse replay”.
To begin with, it is unclear what exactly is meant by "synapses to downstream neurons become rapidly depressed". Is this referring to longterm depression, shortterm presynaptic depression, etc.? Assuming shortterm presynaptic depression (as it has the appropriate timescale) this mechanism would only make sense if the same placecell sequences were activated during the theta cycle as those immediately reactivated in during sharpwave replay. However, as far as I know, no such tight temporal correlation has been observed.
Indeed, it is just the shortterm depression we have in the model (through the synaptic resource variable x). The depression is caused by purported local sensory input during awake quiescence, which would therefore be correlated with replay in that environment. Therefore our mechanism would not work during sleep for example. Details are given in the section “Reverse replay” above.
4) The authors need to be clearer from the beginning which region of the brain is modeled. The paper does not spend any time on this or the general architecture of the network model in the Introduction. The constant reference to "placecells" implies that this is a model of CA1, however the connectivity of the model makes this more likely to reflect CA3; a fact which the authors bring up themselves in the Discussion. Their defence of how these results can still apply to learning and replaying placecell sequences is solid, and I would recommend the authors take this approach from the start. Bringing it up at the end of the paper seems somewhat misleading and could potentially turn off readers and make them dismissive of the interesting findings reported in the paper.
We have now added several sentences to the Introduction and the beginning of the Results sections to make it clear we are discussing a model of a strongly recurrent network, such as in CA3.
Reviewer #3:
This paper presents a theoretical study how STDP together with shortterm synaptic depression and correctly tuned oscillation leads to replay like activity of hippocampal place cells. I think that none of the particular mechanisms are new, but the combination and its application to hippocampal sequences is and the analysis of the complete system is nice.
 Were any special settings needed to make sure the rotation of the attractor stopped (Figure 2C bottom), and did not continue rotating? Or would sufficient experience lead to longer and longer sequences?
Excellent question. The duration of the burst does indeed depend on parameter values, in particular the mean external drive to the network. The dependence of the burst profile on the external input is shown in Figure 2—figure supplement 8. So there is only a small range of inputs for which the burst duration is physiological. Whenever the animal stops we set the input to this value. If we set it to another value for which the burst is longer lasting, there is no significant effect on the learning process; it’s just an aesthetic choice which gives rise to sharpwavelike profiles.
 The maximum in Figure 4A hinges on the lack of frequency dependence in the STDP rule. The problem is that, AFAIK, STDP always has shown frequency dependence (more LTP at higher frequencies) and it is hard to imagine how biophysically a precise balanced STDP model could be constructed. Currently it is written as if this is a minor issue, and that STDP needs some stabilization anyway. That might be true but the maximum might be lost. I think this weakness of the model needs to be exposed more.
Of course we don’t expect an invivo STDP rule to be precisely balanced, some additional plasticity mechanisms are surely present with homeostatic effect. On the other hand, the frequencydependence observed invitro would be disastrously unstable in an invivo network, an issue noted by (Zenke et al., 2015) who suggested heterosynaptic and homeostatic mechanisms to counter this. However, we completely agree that with a sufficiently strong frequency dependence (we call it rate dependence so as not to confuse it with the modulation frequency f) the pairwise rule becomes essentially irrelevant and the peak would be lost. A detailed discussion of this can be found above in the section “Frequency dependence of the STDP rule”, including changes made to the text.
Secondly, related but a bit less important, I would expect that even with balanced STDP the weights diverge due to the correlation contribution to LTP.
In our model this would not happen. But if the reviewer means in a spiking network in which pairwise spike correlations (above and beyond those due to underlying rate variations) are not negligible, this may be true. If correlations are weak then the divergence would be slow, and could be brought into check by a slow homeostatic process. Furthermore as noted in (Graupner et al., 2017), realistic invivolike firing patterns “imply low sensitivity of synaptic plasticity to spiketiming as compared to firing rate”. This suggests that in a recurrent spiking network in the asynchronous state, learning due to variations in firing rates, which is the mechanism we study here, should be much faster than changes due to spike correlations.
https://doi.org/10.7554/eLife.37388.033Article and author information
Author details
Funding
Ministerio de Economía y Competitividad (BFU201233413)
 Alex Roxin
Ministerio de Economía y Competitividad (MTM201571509)
 Alex Roxin
Generalitat de Catalunya (CERCA program)
 Alex Roxin
Howard Hughes Medical Institute
 Yingxue Wang
MaxPlanckGesellschaft
 Yingxue Wang
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
AR acknowledges helpful discussions with Pablo Jercog.
Senior Editor
 Michael J Frank, Brown University, United States
Reviewing Editor
 Mark CW van Rossum, University of Nottingham, United Kingdom
Reviewer
 Maxim Bazhenov, University of California, San Diego, United States
Publication history
 Received: April 9, 2018
 Accepted: October 24, 2018
 Accepted Manuscript published: October 25, 2018 (version 1)
 Version of Record published: November 8, 2018 (version 2)
Copyright
© 2018, Theodoni et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Metrics

 1,818
 Page views

 291
 Downloads

 6
 Citations
Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.
Download links
Downloads (link to download the article as PDF)
Open citations (links to open the citations from this article in various online reference manager services)
Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)
Further reading

 Neuroscience
Neurodegenerative diseases are characterized by selective vulnerability of distinct cell populations; however, the cause for this specificity remains elusive. Here we show that entorhinal cortex layer 2 (EC2) neurons are unusually vulnerable to prolonged neuronal inactivity compared with neighboring regions of the temporal lobe, and that reelin+ stellate cells connecting EC with the hippocampus are preferentially susceptible within the EC2 population. We demonstrate that neuronal death after silencing can be elicited through multiple independent means of activity inhibition, and that preventing synaptic release, either alone or in combination with electrical shunting, is sufficient to elicit silencinginduced degeneration. Finally, we discovered that degeneration following synaptic silencing is governed by competition between active and inactive cells, which is a circuit refinement process traditionally thought to end early in postnatal life. Our data suggests that the developmental window for wholesale circuit plasticity may extend into adulthood for specific brain regions. We speculate that this sustained potential for remodeling by entorhinal neurons may support lifelong memory but renders them vulnerable to prolonged activity changes in disease.

 Neuroscience
Human behaviour requires flexible arbitration between actions we do out of habit and actions that are directed towards a specific goal. Drugs that target opioid and dopamine receptors are notorious for inducing maladaptive habitual drug consumption; yet, how the opioidergic and dopaminergic neurotransmitter systems contribute to the arbitration between habitual and goaldirected behaviour is poorly understood. By combining pharmacological challenges with a wellestablished decisionmaking task and a novel computational model, we show that the administration of the dopamine D2/3 receptor antagonist amisulpride led to an increase in goaldirected or ‘modelbased’ relative to habitual or ‘modelfree’ behaviour, whereas the nonselective opioid receptor antagonist naltrexone had no appreciable effect. The effect of amisulpride on modelbased/modelfree behaviour did not scale with drug serum levels in the blood. Furthermore, participants with higher amisulpride serum levels showed higher explorative behaviour. These findings highlight the distinct functional contributions of dopamine and opioid receptors to goaldirected and habitual behaviour and support the notion that even small doses of amisulpride promote flexible application of cognitive control.