Abstract
Representations related to past experiences play a critical role in memory and decisionmaking processes. The rat hippocampus expresses these types of representations during sharpwave ripple (SWR) events, and previous work identified a minority of SWRs that contain ‘replay’ of spatial trajectories at ∼20x the movement speed of the animal. Efforts to understand replay typically make multiple assumptions about which events to examine and what sorts of representations constitute replay. We therefore lack a clear understanding of both the prevalence and the range of representational dynamics associated with replay. Here, we develop a state space model that uses a combination of movement dynamics of different speeds to capture the spatial content and time evolution of replay during SWRs. Using this model, we find that the large majority of replay events contain spatially coherent, interpretable content. Furthermore, many events progress at realworld, rather than accelerated, movement speeds, consistent with actual experiences.
Introduction
The brain has the remarkable ability to store and retrieve representations of past events, enabling memories of the past to influence future behavior. These memory processes depend critically on the hippocampus, where rapid plasticity during an experience is thought to drive the initial encoding of representations of events (Eichenbaum and Cohen, 2004). Subsequently, hippocampal 'replay' of stored representations during both slowwave sleep and periods of immobility during waking is thought to contribute to the longer term storage and updating of event memories in distributed hippocampalneocortical circuits (Frankland and Bontempi, 2005; Carr et al., 2011; Joo and Frank, 2018).
The canonical example of 'replay' is seen in the rodent, where hippocampal cells preferentially fire at specific locations in an environment, and thus ensembles of cells fire in sequence as the animal moves through a series of locations. When the animal is asleep or immobile, hippocampal cells can be reactivated during a 'sharpwave ripple' (SWR) event. A subset of SWRs contain sequential firing similar to that seen during a previous experience, and thus are thought to 'replay' these previous experiences. Importantly, previous work has reported that these sequential firing events proceed at an average speed of ∼10 meters per second, about 20x faster than the animal’s usual movement speed (Nádasdy et al., 1999; Lee and Wilson, 2002; Davidson et al., 2009; Karlsson and Frank, 2009).
While the existence of these sequential events is well established, the current consensus is that only a minority (∼5–45%) of hippocampal SWRs contain statistically identifiable, sequential replay (Davidson et al., 2009; Michon et al., 2019; Shin et al., 2019; Kaefer et al., 2020; Tingley and Peyrache, 2020b). One might therefore conclude that all memories that are stored or updated by SWRs correspond to sequences of locations. Our subjective experience, however, suggests that we can retrieve memories of individual locations without having to mentally traverse a long distance. Such memories would seem to be useful, in that they could encode the stimuli and values associated with a given place irrespective of the path used to get there. If the rodent hippocampus is capable of storing those sorts of memories then one might expect to see SWRs where the neural activity corresponds not to a rapid trajectory through space, but instead to either a stable pattern associated with a single location or perhaps a brief pattern more similar to one that occurs during a real experience.
Interestingly, there is evidence for SWRs where the spiking corresponds to a single location. Specifically, some SWRs contain spiking of neurons associated with single locations where animals are immobile (Yu et al., 2017), although two reports suggested that events that represent a single location are only seen in young animals (Farooq and Dragoi, 2019; Muessig et al., 2019). Other studies have reported activity that, when combined across an entire event, seems to correspond roughly to a single location, although the dynamics of these events is typically not investigated (Dupret et al., 2010). Thus, while it would seem useful to replay memories associated with single locations or a small region of the environment, the prevalence of that type of replay in awake animals remains unclear.
Our uncertainty stems in part from the dominant approaches used to identify the content of replay events. These approaches typically involve multiple steps and assumptions about the nature of replay, which are most commonly characterized using the standard 'Bayesian' decoder. First, an encoding model is constructed based on spiking activity during movement, most often using only spikes that have been clustered as single units (putative single neurons). Then, a subset of SWRs or events with high activity levels are selected based on a threshold for event size chosen by the experimenter (Foster and Wilson, 2006; Diba and Buzsáki, 2007; Karlsson and Frank, 2009; Stella et al., 2019). A decoding algorithm is then applied to the spikes within these events, yielding a set of position probability distributions for each time bin. Current approaches use either overlapping or nonoverlapping time bins whose size is also chosen by the experimenter. Finally, the most commonly used approaches for detecting sequential replay involve fitting a line to the resulting set of probability distributions, which relies on the assumption that the representations progress at a constant speed (Foster and Wilson, 2006; Diba and Buzsáki, 2007; Davidson et al., 2009; Karlsson and Frank, 2009; Carr et al., 2012; Ólafsdóttir et al., 2017; Tang et al., 2017; Drieu et al., 2018; Shin et al., 2019; Tingley and Buzsáki, 2020a; Bhattarai et al., 2020). A statistical test is then used to determine whether the line fit to the data is better than a line fit to shuffled versions of the data, where the shuffled version of the data represent an implicit definition of a 'random' sequence.
While the standard approach identifies constant speed events, it does not consider events that are rejected by the statistical test. This is problematic because it has the potential to mischaracterize real events that do not move at constant speeds, such as those that change direction or are discontinuous (Liu et al., 2018). Furthermore, the use of large, fixedsize temporal bins acts as a boxcar smoother that limits the potential movement speeds of the representation. For example, with 20 ms time bins and 3 cm position bins, an event can only move in 1.5 m/s increments (one or more bins in 20 ms) between time steps. The linear fit is also highly dependent on the estimation of event boundaries, such as the start and end times of the SWR, because the fit depends on all the data over the course of the event. Approaches that focus on the order of cell activity within each event (Lee and Wilson, 2002; Gupta et al., 2010) allow for a relaxation of that linear assumption, but replace it with a loss of statistical power due to either ignoring all but the first spike from each cell or using an arbitrarily smoothed version of the spike train. These approaches also do not provide information about the dynamics of the underlying spatial representation. Moreover, these approaches often exclude events that have stationary representations of a single location (Yu et al., 2017; Farooq and Dragoi, 2019).
Recognizing the problems with the linear fit, several studies have moved away from the constant velocity assumption, using the most probable location at each time bin (Pfeiffer and Foster, 2013; Wu and Foster, 2014; Grosmark and Buzsáki, 2016; Carey et al., 2019; Kaefer et al., 2020). For example, using this approach, Pfeiffer and Foster, 2015 found awake replay events containing alternation between representations of a single location and sequential spatial trajectories. On the other hand, Stella et al., 2019 reported that replays during sleep are spatially continuous and follow Brownian diffusion dynamics. However, both methods still used large time bins and neither took into account the uncertainty of the decoded estimates, making it hard to identify the source of the different conclusions.
An ideal approach to identifying and quantifying the dynamics of replay would circumvent these problems. It would use all the available spiking data to yield the most accurate decoded positions. It would provide a momentbymoment estimate of position and dynamics that would not be dependent on the estimation of SWR start and end times and could rapidly adjust to changes in dynamics. It would use very small temporal bins (1 or 2 ms) to allow for very rapid representational movement and would provide information about the certainty of the decoded estimates. It would be able to capture a range of movement dynamics including stationary or unchanging representations, trajectories that progress through space at constant or variable speeds, and disorganized, spatially incoherent events. It would provide a robust statistical assessment of confidence for each dynamic. Finally, where assumptions are made, it would provide welldefined parameters whose values could be explored systematically to understand their influence on the results.
We developed a state space model that achieves all those goals. State space models are a wellunderstood, wellknown statistical solution to the problems described above. By mathematically modeling the relationship between the data and latent dynamics, state space models make the assumptions of the model explicit and interpretable. Our model goes beyond previous approaches (Deng et al., 2016; Maboudi et al., 2018) by characterizing the spatial representations during SWRs as a mixture of three underlying patterns of movement dynamics: stationary trajectories, continuous trajectories that can progress at many times the typical speed of the animal, and spatially fragmented trajectories. We show how this model can take advantage of clusterless decoding—which relates multiunit spike waveform features to position without spike sorting—giving us more information about the population spiking activity. We apply this model to spiking data during SWR events from 10 rats, enabling a direct comparison to previous work.
We find that the large majority of SWRs contain spatially coherent content; that is, trajectories that are spatially concentrated at each moment in time and have no large discontinuities in position. Surprisingly, while the expected highspeed, sequential replay events were identified, the most common category of events expressed representations that moved at slower speeds, more consistent with realworld experiences. These findings illustrate the power of state space models and provide a new understanding of the nature of hippocampal replay.
Results
Overview of the model
We begin with an application of the state space model to simulated data, both to validate the model and to provide intuition (Figure 1). We simulate 19 Poisson spiking cells with Gaussian place fields on a 180 cm virtual linear track. Each place field has a 36 cm variance and a 15 Hz peak firing rate, and fields are spaced every 10 cm along the virtual track. We then construct a 280 ms spiking sequence (Figure 1A) and apply our model to the sequence. For the first 60 ms of this sequence, a single place cell fires repeatedly, resulting in the extended representation of a single location. For the second 190 ms of the sequence, the cells fire in sequential spatial order, representing a fast moving trajectory across the virtual linear track. For the last 30 ms of the sequence, the cells fire in an incoherent spatial order. These three firing patterns represent three different types of movement dynamics that could be expressed during SWRs, which we call stationary, continuous, and fragmented, respectively. The goal of our model is to characterize SWRs in terms of a mixture of these three dynamics at every time point.
Decoding the spiking sequence requires specifying two elements: the data model—how the spikes relate to position—and the movement dynamic models—how the position can change over time in each movement dynamic. For the data model, our decoder is the same as the standard ('Bayesian') decoder (Davidson et al., 2009; Pfeiffer and Foster, 2015; Stella et al., 2019). We compute an estimate of how each cell’s firing rate varies over position during movement (i.e. the place field, Figure 1B). This is used during decoding to compute the Poisson likelihood of position over time given the spiking sequence of interest. In contrast to the standard decoding approaches, we can use small time bins (in our case 2 ms vs. 20 ms or more) because we are able to take advantage of the prior placed on the dynamics by the state space model. This allows us to detect changes on smaller time scales than would be possible with the standard decoder and incorporate information about times when there is no spiking (Deng et al., 2015). Further, because place estimates from spikes within a single bin are combined, our small time bins allow us to measure the spatial information conveyed by single spikes, rather than assuming that a downstream neuron would integrate spikes from multiple neurons on the timescale of 20 ms or more.
Next, we specify movement dynamic models that describe a variety of ways that the latent position—the 'mental' position of the animal represented by the cells— could evolve over time. (Figure 1C). We do this by defining a state transition matrix that defines how the latent position can move from the previous time step (in our case, 2 ms). Previous findings suggest that replay may exhibit at least three distinct types of movement dynamics: stationary (Yu et al., 2017; Farooq and Dragoi, 2019; Muessig et al., 2019), continuous (Davidson et al., 2009), and fragmented, which could correspond to both extended spatially incoherent representations and representations that jump from one position to another in a short time period (Pfeiffer and Foster, 2015). We therefore define movement dynamic models to capture each of these possibilities.
In the stationary movement dynamic, the latent position does not change between time steps. The state transition matrix can thus be defined as an identity matrix, which predicts that the next position will be the same as the last position (at the resolution of the position bin). In the continuous movement dynamic, the latent position is most likely to be 'spatially close' to the position in the previous time step, so we use a Gaussian random walk state transition matrix. This means that, for a given latent position, the probability of moving to another position is modeled by a Gaussian centered at that position and 'spatially close' is defined by the variance of the Gaussian. In our case, since replay has been shown to move at speeds much faster than the animal’s speed (Davidson et al., 2009; Pfeiffer and Foster, 2015), we set the variance of the Gaussian to 6.0 cm. This ensures that with a 2 ms time step, the latent position is 95% likely to be within 4.90 cm of the previous latent position (or equivalently, this means that latent speeds from 0 to ∼25 m/s are most likely). Last, in the fragmented movement dynamic, the latent position can move to any available position instantaneously. We model this using a uniform state transition matrix, which makes transitions to each position equally likely. Importantly, the fragmented dynamic provides a wellspecified definition of what we mean by random, or unstructured activity.
Finally, we specify how likely each movement dynamic is to persist in time versus change to another dynamic via another state transition matrix between the movement dynamics (Figure 1D). In order to be conservative with respect to switching between dynamics, we assume that each movement dynamic is likely to dominate for ∼100 ms on average, which is approximately the duration of a SWR event. There is, however, a small probability of switching to one of the other movement dynamics. Accordingly, we set the probability of staying in a dynamic to 0.98 for each 2 ms time step, which corresponds to an expected duration of 100 ms for staying in a particular dynamic (the Markov assumption of the model means that the probability of staying in a dynamic follows a geometric distribution). Importantly, the data drives the estimated dynamic, so even if the probability of staying in a particular movement dynamic is high, data providing clear evidence for a change in dynamic can, within a time frame of ∼10 ms, drive a change to a different estimated dynamic, as in our simulated example. In line with this, we show below that our results are relatively insensitive to the value of this parameter.
Once we have specified the data and the movement dynamic models, we have fully specified the state space model. We use acausal decoding, meaning that we use all information from both past and future spikes, to estimate the joint posterior probability of position and dynamic (see Materials and methods). With this, we can summarize the resulting posterior probability with two quantities: the probability of each movement dynamic over time (by integrating out position; Figure 1E, left panel) and the probability of latent position over time, irrespective of movement dynamic (by summing over the movement dynamics; Figure 1E, right panel).
An examination of the two summaries shown in Figure 1E reveals that that the model successfully captures the dynamics of the population spiking activity in Figure 1A. The stable firing of the one active neuron indicates a stationary representation, and accordingly, the probability of the stationary movement dynamic is high at the beginning of the simulated replay. A change in the data then drives a rapid transition to the continuous movement dynamic, reflecting the trajectorylike spatially sequential spiking from the population of cells. Subsequently, as the population activity becomes spatially incoherent, the fragmented movement dynamic dominates for the last 30 ms of the simulated event.
This illustrates two key features of the model. First, as mentioned above, the fragmented dynamic gives us a way to directly identify times when the position representation is spatially incoherent, a higher resolution and less assumptiondependent alternative to the more commonly used nonparametric shuffle. Second, this approach allows the model to to capture a wide range of movement speeds for the latent position in contrast to the standard decoder line fit (red dashed line in Figure 1E, using the Radon transform). The model is defined in terms of a mixture of movement dynamics, as summarized by the probability of each movement dynamic, and which dynamic or dynamics are dominant at a given moment is related to the temporal evolution of the underlying position representation. To demonstrate this, we applied the model to 10,000 simulated trajectories, each trajectory proceeding at a constant speed (Figure 1F) from 1 cm/s to 10,000 cm/s. From this, we can see that not only are there regions of speed that correspond to each of our three movement dynamics being highly probable (where we define highly probable to be greater than or equal to 0.80 probability), but there are also intermediate speeds where two of the dynamics exhibit relatively high probability; and where the sum of two of the dynamics’ probabilities exceeds 0.80. In this manuscript, we will refer to these intermediate speeds as mixture dynamics. For example, when the stationary dynamic has a probability of 0.6 and the continuous has a probability of 0.4, we call this a stationarycontinuous mixture (light blue, Figure 1F) and this indicates that the trajectory is moving slowly. Correspondingly, if the continuous dynamic has a probability of 0.5 and the fragmented dynamic has a probability of 0.4, then we would call this a fragmentedcontinuousmixture and this indicates the trajectory is moving very quickly, but not as quickly as the fragmented dynamic dictates. In summary, we can characterize the speed or set of speeds that occur within an SWR based on the probability of each of the three movement dynamics over time. We further classify the probability of each movement dynamic as being part of one of five speed categories: stationary, stationarycontinuousmixtures, continuous, fragmentedcontinuous mixtures, and fragmented.
We note here that the choice of any particular threshold for classifying the movement dynamic of a SWR is arbitrary, and that the power of our approach lies in part on the ability to assign a probability for each dynamic or combinations of dynamics to each moment in time. Our goal in choosing 0.80 was to use a threshold that corresponds to highly probable events that roughly partition the latent position trajectory into interpretable categories of speed. Nonetheless, we also verify that our central results hold with a higher threshold of 0.95. We do not know if downstream neurons can explicitly responds to these dynamics based on either threshold, and there is probably no hard boundary between these dynamics, but our approach makes it possible to ask that question in a systematic way.
Finally, as mentioned above, we wanted to test the robustness of the model to the choice of probability of staying in a dynamic, because our choice of 0.98 or an expected duration of 100 ms is only based on the expected duration of an SWR. To investigate this we decoded the spiking sequence in Figure 1A with different probabilities of staying in the same dynamic versus switching to another dynamic (Figure 1—figure supplement 1). We found that for a large range of plausible probabilities of staying in one of the dynamics (between 0.96 and 0.993, corresponding to an expected duration between 25 and 150 ms), the model identified the movement dynamics consistently, with high probability (Figure 1—figure supplement 1A), demonstrating that data itself is the most influential element in the model. Furthermore, the most probable latent positions remained relatively consistent across the range of these probabilities as well (Figure 1—figure supplement 1B).
Identification of continuous dynamics in a SWR with sorted or clusterless spikes
We next sought to validate the model on real hippocampal data recorded from awake, behaving rats. To ensure that we could capture rapid trajectories many times the speed of the animal—as described by most studies of hippocampal replay—we first decoded a single SWR with sequential population activity (Figure 2A, top panel). Here the rat was performing a spatial alternation task on a Wshaped track (see Materials and methods for details). As expected, we observed that the probability of the continuous dynamic is high throughout this event, but the probability of the stationary dynamic was also noticeable at the beginning and end of the SWR (Figure 2A, middle panel). Using our speed classification scheme, this indicates that the speed of the replay is initially slower— a mixture of continuous and stationary dynamics—and then speeds up and slows down again. This is also evident in the posterior probability of latent linear position over time, which shows that the replay most likely travels down the center arm and up the right arm (Figure 2A, bottom panel). We can also see this when we project the maximum of the posterior of this trajectory (the most probable 'mental' position) to 2D to better see the spatial trajectory on the maze (Figure 2C). Importantly, when we apply the same model using the 2D position of the animal we get a similar result (Figure 2—figure supplement 1A).
One of our criteria for a more optimal method is for it to use all of the available spiking data. Using only clustered spikes discards any spike events that cannot be uniquely assigned to a putative single neuron, substantially reducing the amount of information that the resultant decoding can use. Additionally, spike sorting is not necessary to recover the underlying neural population dynamics (Trautmann et al., 2019) and often requires inclusion decisions that can vary widely across experimenters. We therefore adopted a 'clusterless' approach which directly uses multiunit spikes and their spike waveform features to decode position without spike sorting (see Materials and methods). Clusterless decoding has previously been used to successfully identify theta sequences and replay sequences in the hippocampus (Chen et al., 2012b; Kloosterman et al., 2014; Deng et al., 2016; Kay et al., 2020). Applying a clusterless decoder to the same SWR event, we get similar classification of the sequence (Figure 2B,D), both with 1D linearized position and 2D position (Figure 2—figure supplement 1B). As predicted, the spatial extent of the event is longer and the estimate of the posterior probability of latent position is narrower for the clusterless model. This reflects the clusterless model’s access to a larger pool of data that provides more information about the extent of the event and more certainty in the latent position and the dynamic (Figure 2D vs C).
Most replays contain coherent spatial content
After testing our model on a single SWR event, we applied our clusterless decoding algorithm to hippocampal recordings from 10 rats performing the Wtrack spatial alternation task (# tetrodes range: [10, 24], brain areas = [CA1, CA2, CA3]; some data previously used in Karlsson and Frank, 2009; Carr et al., 2012; Kay et al., 2020; position linearized to 1D). We first confirmed that our cell population for each animal could closely track the position of the animal during behavior by comparing the most probable decoded position to the position of the animal. We found that the average median difference between actual and decoded location is small (7 cm median difference, 5–9 cm 95% CI, for all times where the animal was moving greater than 4 cm/s, fivefold cross validation). This level of accuracy is comparable to that of other studies (Davidson et al., 2009; Shin et al., 2019; Farooq and Dragoi, 2019), and reflects in part the presence of 'theta sequences' (Dragoi and Buzsáki, 2006; Foster and Wilson, 2007) where position deviates behind and ahead of the animal across individual theta cycles. Importantly, we achieved this using small time bins (2 ms) compared to the commonly used 250 ms time bin that most studies use for decoding during movement, demonstrating the power of our algorithm. For each animal, we also verified that the population spiking rate was relatively consistent over all positions on the track, suggesting that we have comparable sampling of all locations in the environment (Figure 2—figure supplement 3).
Having established the validity of our decoder in tracking the animal’s position, we next assessed the prevalence of spatial content across SWRs. We detected SWRs using a permissive threshold (see Materials and methods) to ensure that we include both the rare largeamplitude events as well as the much more common smallamplitude events. As expected, and providing further support for the validity of the model, we observed replays that are classified as continuous throughout the entirety of the SWR, similar to those seen using standard approaches (Figure 2—figure supplement 2A–F). However, we also observed many events with spatially coherent content that do not have this structure. For example, there are trajectories that start in one direction and reverse back to the original position (Figure 3A, Figure 3—figure supplement 1B), representations that remain fixed in one position (Figure 3B, Figure 3—figure supplement 1G), and trajectories that jump between arms and between dynamics (Figure 3C,D,F, Figure 3—figure supplement 1F,H,I). We also observed SWRs where the content is spatially incoherent throughout (Figure 3—figure supplement 1A,D).
Using a 0.80 threshold, 89% (23,071 of 25,844) of SWRs contain at least one of the three dynamics or dynamic mixtures. To ensure that this reflects the spatially tuned firing of the neurons, we trained the encoding model with positions resampled with replacement, a shuffling procedure which disrupts the relationship between spiking and position, for two recording sessions. We then decoded the same SWR events, containing the original spikes. Only 9% of the SWRs are classified in the shuffled datasets, a value that is significantly less than that seen for the real data (p=0.02 for recording session 1, p=0.02 for recording session 2, Figure 3—figure supplement 2). This shows that our model does not impose movement dynamics in the absence of spatial information in the data.
Previous work focusing on spatially sequential replay reported that only a minority of events contain sequential spatial content (Foster and Wilson, 2006; Karlsson and Frank, 2009; Davidson et al., 2009). We therefore asked what fraction of classified events contain spatially coherent content, which we define as any times with stationary, stationarycontinuous mixture, or continuous dynamics (see Materials and methods and Figure 1F). We find that 96% (22,170 of 23,071) of classified SWRs and 86% (22,170 of 25,844) of all SWRs include spatially coherent structure, and that this prevalence of spatially coherent structure is consistent across animals (Figure 3G). We then asked what fraction of events contained spatially incoherent content, defined as an SWR containing any times with fragmented or fragmentedcontinuous mixture dynamics. We find that only 14% (3295 of 23,071) of classified SWRs include any times with spatially incoherent structure (Figure 3G). To further validate this result, we performed a shuffling procedure that, in contrast to our previous shuffle, preserves the local correlations between spikes but reduces global spatial structure. This controls for dynamics that may have been induced by bursts or other local features of the spike train while randomizing the position to spike relationship. To do this, we randomized the order of runs (from one reward well to another) and then circularly permuted the resulting segments of data across all tetrodes uniformly. Using this shuffle, we found that there was still less spatially coherent and more spatially incoherent content compared to the real data (recording session #1, spatially coherent: 99% vs. 71%, p=0.02, real vs. shuffled; spatially incoherent: 22% vs. 61%, p=0.02, real vs. shuffled; recording session #2, spatially coherent: 95% vs. 63%, p=0.02, real vs. shuffled; spatially incoherent: 16% vs. 79%, p=0.02, real vs. shuffled; Figure 3—figure supplement 3).
To more directly compare our findings to previous work, we quantified the percentage of classified SWRs that contained continuous content, as would typically be analyzed when using the standard decoder. Here, our findings were consistent with previous reports: in our datasets, 4449 of 23,071 or 19% of classified SWRs included time periods where the decoded position was classified as continuous (Figure 3G, 17% of all SWRs). Thus, focusing on only highspeed, linearfit trajectories excludes a large fraction of events where there is evidence for spatially coherent, but not continuously changing, content. We emphasize here that we did not limit our analysis to only those SWRs that had continuous dynamics for the entire SWR, as is assumed by linefitting approach of the standard decoder. The consideration of this broader class of SWRs allows us to take advantage of the ability of our decoder to capture a range different speeds within each SWR event.
We repeated our classification analysis with a higher classification threshold of 0.95 to ensure that our result was not dependent on the threshold of 0.80. We find that, while this change slightly reduced the total fraction of classified SWRs (19,478 of 26,159 or 74% of all SWRs), an even higher fraction of the classified SWRs (19,317 of 19,478 or 99% classified SWRs) included spatially coherent content. Similarly, SWRs containing spatially incoherent content comprised a small fraction of the classified SWRs (490 of 19,478 or 3% classified SWRs).
Because our model is specified in the context of a latent position associated with different movement dynamics, it allows us to not only classify events in terms of their dynamics, but also to quantify the model’s certainty in each position estimate at each moment in time given the model parameters. To do so, we can compute the cumulative spatial size of the 95% highest posterior density (HPD) region of the latent position estimate. The 95% HPD region corresponds to the set of most probable latent positions as estimated by the model. Larger values of the HPD region size indicate the model is less certain about position, because the most probable decoded positions are distributed over more of the track at a given time point. In contrast, smaller values of the HPD region size indicate that the model is more certain about the estimate of position because the extent of the HPD is more concentrated and covers less of the track. Thus, the HPD region size provides a complementary measure of spatial information, and evaluating it allows us to verify that the events we defined as spatially coherent also correspond to events where there is high certainty around the position estimates.
We find that spatially coherent events indeed have smaller HPD regions than events with fragmented dynamics. Figure 4A and Figure 4B show two example SWRs that are classified as having continuous and stationary dynamics, respectively. The most probable positions (as estimated by the model) at each time step in these SWRs is concentrated in a small portion of the track and correspondingly, the HPD region is small throughout the SWRs. In contrast, Figure 4C shows a SWR where the dynamics are fragmented and correspondingly, the most probable positions are more spatially diffuse and the HPD region is much larger. The HPD region size also provides insights into the momentbymoment structure of each event, which can change over the time course of a SWR. An example of this change is shown in Figure 4D, where the HPD region is small for most of the SWR until the end, at which point the uncertainty of position becomes much higher, reflecting a transition from a spatially coherent to a spatially incoherent representation. Overall, when we examine the average HPD region size for each SWR, grouped by dynamic, we find a clear bimodal split between spatially coherent dynamics and spatially incoherent dynamics (Figure 4E). For the spatially coherent dynamics, the average HPD region for each SWR was much smaller than the spatially incoherent dynamics (median 24 cm, spatially coherent vs. median 238 cm, spatially incoherent, p=2.2e16, onesided MannWhitneyU). The HPD region for the unclassified SWRs was similarly large.
We note here that while the size of the HPD region will be influenced by the dynamic estimated from the data, it remains possible for continuous or stationary dynamics to correspond to a large HPD region (see outliers in Figure 4E), indicating less spatial certainty for those specific events. In general, if the data does not exhibit strong place specific firing, the HPD region will be large, regardless of dynamic classification. To show this, we used the same shuffle as above, resampling position with replacement for two recording sessions and shuffling the relationship between spiking and spatial information when fitting the encoding model. The shuffled decodes have much larger HPD regions than the real data (recording session #1: 55 cm vs. 229 cm, p=0.02, real vs. shuffled; recording session #2: 72 cm vs. 231 cm, p=0.02, real vs. shuffled, onesided; Figure 4—figure supplement 1). We also compared the HPD regions between the real data and a shuffle that swapped the runs and circularly shuffled position, as above. This also resulted in larger HPD regions than the real data for spatially coherent and incoherent SWRs (recording session #1, spatially coherent: 29 cm vs. 52 cm, p=0.02, real vs. shuffled; spatially incoherent: 181 cm vs. 243 cm, p=0.02, real vs. shuffled; recording session #2, spatially coherent: 26 cm vs. 53 cm, p=0.02, real vs. shuffled; spatially incoherent: 210 cm vs. 263 cm, p=0.02, real vs. shuffled; Figure 4—figure supplement 2).
Many replay events are slow or stationary
Surprisingly, our results indicate that most events with coherent spatial content are not dominated by the continuous movement dynamic, but instead correspond to trajectories that are stationary or that move relatively slowly compared to their purely continuous counterparts. We therefore examined these events in more detail. We first note that most of the SWRs (14,989 of 21,433 or 70% of classified SWRs) were classified as having only a single dynamic or dynamic mixture (Figure 5A, for specific example SWR see Figure 3B), whereas SWRs with multiple dynamics or dynamic mixtures (such as those in Figure 3A,C,D,E and F) were less common. Interestingly, the most common category of classified SWRs were those with only stationarycontinuous mixtures (8944 of 21,433 or 42% of classified SWRs, Figure 5A, Figure 5—figure supplement 1, note that this percentage ignores the unclassified category). These events contain representations that move at slower speeds (Figure 5D)—similar to those speeds experienced by the animal in the environment (Figure 5G, median 17 cm/s for run periods and 4 cm/s for all times)—and are sustained, but slightly shorter, on average, than events with continuous dynamics (median duration: stationarycontinuousmixture 73 ms vs. continuous 94 ms, Figure 5B). Nonetheless, both the slowmoving events and continuous events frequently represent locations that were some distance away from the animal (mean trajectory distance from animal’s position: stationarycontinuousmixture 52 cm vs. continuous 43 cm, Figure 5C). This indicates that the content of these SWRs, like those that are classified as continuous, do not represent the animal’s actual position. We note that roughly a third of these events persisted for the entire duration of the SWR, but the other two thirds included some time where the model was uncertain about the dynamic. This shows the power of our state space decoder because it allows us identify periods of time when the model has high confidence with each SWR, and use these periods to characterize the SWR overall or to focus on specifically for further analysis.
The second most prevalent classification was exclusively stationary events (4858 of 21,433 or 23% classified SWRs; 47% of SWRs with stationary dynamics were entirely stationary for the entire duration of the SWR). Unlike the stationarycontinuous mixtures, most of these events represented a location close to the animal’s actual position (Figure 5C). There were, however, a number of stationary events that represented positions far from the animal. We set a threshold of 30 cm distance from the animal to identify nonlocal stationary events and found such content in ∼7% of classified SWRs (1437 of 21,433 classified SWRs). Of these, 46% (664 of 1437 stationary SWRs) were stationary throughout the duration of the SWR. These nonlocal stationary events most commonly represented reward wells or choice points (Figure 5E), consistent with Yu et al., 2017, but a small portion of them occurred at other locations of the track. This suggests that these representations can be flexibly deployed to represent discrete locations.
Control analyses
We then carried out a series of control analyses to determine whether our results could be due to differences in spiking statistics across dynamics or interneuronspecific activity patterns. We first calculated the multiunit spiking rates and found that these were similar across the different dynamics (Figure 5F). This indicates that all dynamics, including stationary and stationarycontinuous mixtures, were driven by comparable levels of sustained spiking information and could not be explained by the absence of spiking. Further corroborating this, we found that when we reran our analysis using periods of high multiunit activity instead of SWRs to identify events, we still found a larger proportion of stationary and stationarycontinuous dynamics relative to the continuous dynamics (Figure 5—figure supplement 3F).
We also verified that our classifications were very unlikely to be driven by the firing of interneurons, which are less likely to project to other brain areas and exhibit less spatial specificity than principal neurons in the hippocampus (Wilent and Nitz, 2007; Jinno et al., 2007; Hangya et al., 2010). First, we increased our tetrode participation threshold from two up to five tetrodes active during the SWR, to eliminate the possibility that events with a single tetrode with a high rate interneuron might drive the classification of stationary or stationarycontinuous mixture dynamics during SWRs. We found that our results were robust to this increase in threshold (Figure 5—figure supplement 2A). Indeed, most of our events and most periods of dynamics within each event included spikes from upwards of 10 tetrodes (Figure 5—figure supplement 3). Next, we removed spikes that had spike widths of less than 0.3 ms to reduce the potential contribution of narrowwaveform inhibitory interneurons (Fox and Ranck, 1975). Similarly, this had little effect: we still observed a high proportion of SWRs with stationary and stationarycontinuous mixture dynamics (Figure 5—figure supplement 2B). Using clustered data, we can more robustly categorize units as putative interneurons; thus, we then repeated this analysis on clustered data for 9 of the 10 animals, excluding putative interneurons based on spike width and firing rate and requiring that at least three putative pyramidal cells be active during the SWR. Although there were many fewer events to examine, we still observed many stationary and stationarycontinuous mixtures in SWRs (Figure 5—figure supplement 2D). Finally, if interneurons with less spatial specificity were driving the classification of slower dynamics, we would expect to see the posterior be less spatially concentrated during these events. Instead, we found that stationary and stationarycontinuous mixtures tended to have narrow posteriors (as characterized by 95% HPD region size in Figure 4E). This provides further evidence that our large fraction of slowspeed events were not the solely the result of the firing of interneurons.
We then asked whether the slow spatially coherent events (stationarycontinuous mixtures) potentially reflect a slow theta sweep rather than a replay event. We noted that this seems very unlikely given that the stationarycontinuous mixtures tended to represent locations far from the animal that would require rapid, longdistance theta sequences (median 50 cm Figure 5C). In addition, the majority of our events occur when the animal is moving at speeds less than 2 cm/s, as expected for SWRs (Figure 5—figure supplement 3B). Further, when we restrict the analyses to SWRs where the animal was moving at speeds less than 1 cm/s, we find similar proportion of SWRs with stationary and stationarycontinuous mixture dynamics (Figure 5—figure supplement 2C).
A third possible concern is that burst firing from a single cell is driving the stationary dynamics. There are several lines of evidence against this. First, as mentioned above, most of our SWRs contained spikes from multiple tetrodes, including those with stationary dynamics (Figure 5—figure supplement 3A), so it is unlikely that a single cell could drive these dynamics. Second, the stationary dynamics have a longer duration than one would expect from a burst alone (Figure 5B, 54 ms median duration, 38–74 ms quartiles vs. bursts which have a typical duration of 6–24 ms; Ranck Jr, 1973; Harris et al., 2001; Tropp Sneider et al., 2006). Third, only ∼50% of the spikes had interspike intervals (ISIs) of less than 6 ms during the stationary, stationarycontinuous, and continuous dynamics (Figure 5—figure supplement 3A), indicating that rapid bursts made up at most half of the observed spikes, and likely fewer given that these ISIs are computed for each tetrode, rather than for single units.
Finally, we wished to assess whether the spiking information from different hippocampal subfields (CA1, CA2, and CA3), could influence the types of dynamics we observed. This seemed unlikely, given that CA1 and CA3 spiking is tightly coordinated within and across hemispheres during SWR replay (Carr et al., 2012) and multiple previous papers have combined recordings across subfields and obtained results similar to those that restricted their analyses to CA1 (Diba and Buzsáki, 2007; Karlsson and Frank, 2009). However, we verified that when we used only CA1 tetrodes to decode (Figure 5—figure supplement 2E), we get similar proportions of dynamics as when using all the tetrodes.
Comparison to standard approaches
Finally, we asked how our method compares to the standard approaches to identifying significant replay events that use a nonparametric shuffle test to characterize their content. This comparison is complicated due to the variety of parameters and inference methods used previously, so here we chose three commonly used approaches and a reasonable parameter set for the comparison. We began with an examination of events based on statistical significance of a line fit to a circularly permuted spatial distribution, which has been the standard in the field. We applied the clusterless algorithm in 20 ms bins to produce probability densities of represented locations, as this binning is comparable to previous methods. We then used the Radon transform method, which finds the line with the highest summed posterior probability of position along the line (Davidson et al., 2009; Farooq and Dragoi, 2019), to calculate, for each SWR, a measure of significance based on the assumption that the probability density moves at a constant speed throughout the event (see examples in Figure 6—figure supplement 1A–F, fourth row, blue line). We found that ∼12% of the classified events (as defined by our statespace decoder) were identified as significant by this Radon method. Not surprisingly, the fraction of significant events was strongly related to which dynamic they contained. Of the SWRs that contained a period of continuous dynamics, 31% (1386 of 4511 SWRs) were significant. Of all SWRs that contained a period of stationary dynamics, 8% (542 of 7176 SWRs) were significant, and when we restricted the analysis to long stationary events (>100 ms duration), we found that 28% were significant. This increase is not surprising, since the shuffle used for the Radon method penalizes shorter events. An alternative approach, the linear regression method, which samples from the posterior probability and uses linear regression to fit a line to the samples (Karlsson and Frank, 2009; Carr et al., 2011; Shin et al., 2019), detects only nonzero slopes as significant, and thus by definition cannot identify stationary events as significant.
Conclusions about the representational content of events were similarly variable across methods. In addition to the Radon and linear regression methods, we also computed the MAP estimate, the maximum probability position of each 20 ms time bin (see examples in Figure 6—figure supplement 1A–F, fourth row, green line, Pfeiffer and Foster, 2015; Stella et al., 2019; Kaefer et al., 2020). We then compared the speed of the replay for each dynamic and for the entire SWR (Figure 6). We found that the methods that rely on fitting a line for the entire SWR (the Radon transform and the linear regression), tend to estimate that the replays proceed at faster speeds (Figure 6A,C), whereas the MAP method (Figure 6B), which only relies on picking the most probable position for a given time bin and allows for more variable speeds, meaning it can capture both fast and slow speeds. In addition, the speeds from the MAP method (Figure 6B) are quite similar to the speeds estimated by the state space decoder (Figure 6D). This is to be expected because the MAP method imposes less smoothing (although some is added by using 20 ms time bins, see Figure 6—figure supplement 2A–F for the same examples with 2 ms bins), whereas the linear methods impose a larger amount of smoothing and cannot change directions (see Figure 6—figure supplement 1A for an example where this is beneficial because the trajectory is linear and the smoothing ignores a lack of spiking and Figure 6—figure supplement 1B for an example where this is not beneficial because the linear fit cannot change direction). Our state space decoder with 2 ms bins imposes much less smoothing than the linear methods and therefore is closer to the MAP estimates with 20 ms bins (Figure 6—figure supplement 1B), but also takes into account the uncertainty of each time bin (Figure 6—figure supplement 1C) and is able to change direction more quickly than the 20 ms time bins (Figure 6—figure supplement 1E). Using 2 ms bins with the MAP method would allow for faster change in the direction of the latent position but also results in much more variable estimates (Figure 6—figure supplement 2). These results indicate that the methods that made more assumptions about the structure of replay (Radon and linear regression approaches) yield different conclusions about which events are potentially meaningful and how those events progress over time from methods that make fewer assumptions (the MAP and statespace approaches). The substantial variability in both the proportions of 'significant' events and the speed of the events demonstrates that key conclusions about the events could be very different depending on the which of the previous methods and parameter were used.
Discussion
We developed a state space model that identifies and quantifies replay in terms of a mixture of three movement dynamics: stationary, continuous, and fragmented. This model is robust: it is relatively insensitive to a range of plausible transition parameters between dynamics and instead strongly reflects the hippocampal place representation structure. We show that this model is interpretable: it has welldefined parameters that capture intuitive movement models and allows us to explain a large fraction of SWRs in terms of these dynamics—far more than have been previously associated with identifiable activity patterns. This model is also flexible: it can be used with sorted or clusterless spikes, it can be used with 1D or 2D positions, and—because it uses a mixture of dynamics—it allows us to confidently discover not only SWRs with spatial representations that move at a constant speed, but also representations that vary in speed, have slower speeds, or that are stationary. In fact, these slower moving representations constitute the majority of representations seen during SWRs, but would not be found with the typical approaches that assume that events progress at constant speed. The prevalence of these events indicates that hippocampal replay can recapitulate experience at both accelerated and closer to realworld speeds.
Previous work reported that less than half of SWRs or highpopulationactivity events contained replay events (Foster and Wilson, 2006; Karlsson and Frank, 2009; Davidson et al., 2009). This could lead one to the assumption that most events are not spatially coherent or meaningful. Our results indicate that this is not the case: in our dataset, the large majority of SWRs (86%) contained spatially coherent content, as defined by some combination of stationary and continuous dynamics. These events were mostly decoded representations where the most probable positions were highly spatially concentrated (i.e. had small HPD region size), indicating that they were driven by spatially specific firing. This was in contrast to unclassified events or events that included fragmented dynamics, which were associated with highly distributed decoded spatial representations.
Importantly, the spatially coherent events most often expressed a dynamic that was consistent with slower speeds of less than 1 m/s (100 cm/s), a speed range that includes the speeds at which rats traverse their environment. The next most common category were stationary events (Yu et al., 2017; Farooq and Dragoi, 2019; Muessig et al., 2019) that activated a persistent representation of a single location, while the 'classical' replay events—continuous, extended trajectories through space—made up only about 19% of classified events. Events with slower dynamics tended to persist for approximately the same length of time as events with faster continuous dynamics (∼50–100 ms), consistent with a brief activation of a representation of a location or a short trajectory through space. Events with slower dynamics most often represented the animal’s current location, similar to continuous events identified in previous work (Davidson et al., 2009; Karlsson and Frank, 2009); however, slower events were also capable of representing locations far from the animal. Interestingly, Louie and Wilson, 2001 found that during REM sleep, the longer time scale population activity (on the order of seconds) is highly correlated with firing during times when the animal is running, resembling the animal’s experience. Our results are similar but were observed during pauses in active behavior and on the timescale of SWRs, which last 10s to 100s of milliseconds.
These results challenge the longheld notion that hippocampal replay trajectories necessarily proceed at many times the speed of the animal (Nádasdy et al., 1999; Lee and Wilson, 2002; Davidson et al., 2009). Instead, replay events encompass a much richer variety of event speeds that could promote storage and updating of memories on different spatial and temporal scales, including slower moving representations that activate a spatially compact set of nearby locations and stationary representations of single locations. This is consistent with observations of events that, when decoded in a single time bin, represent a small neighborhood in the environment (Dupret et al., 2010). Interestingly, a previous paper (Farooq and Dragoi, 2019) reported that these stationary representations were most prevalent in juvenile rats and decreased to 'chance' levels when the rats reached adulthood. Our results strongly suggest that they do not disappear, but are a very common occurrence with adult rats. We believe that the difference in these findings is a result of defining 'chance' levels of occurrence of these events and using a restrictive decoding model which requires the decoded position to be stationary for the entire event. We also note that we chose to include a stationary dynamic to explicitly evaluate the prevalence of stationary events in adult animals, but in principle, a model with just continuous and fragmented dynamics could capture these events as well, as the continuous dynamic includes movement speeds close to 0 m/s.
The differences between our findings and those reported in some previous papers also highlight the importance of identifying the specific question that an analysis seeks to answer. Initial studies of replay (Lee and Wilson, 2002; Foster and Wilson, 2006; Davidson et al., 2009; Karlsson and Frank, 2009) focused on the question of whether the observed patterns of spiking seen during replay reflected structured content or 'random' spiking as defined by a comparison to one or more shuffles. These and subsequent studies answered the question 'are replay events the result of random spiking?' with a clear 'no'. These studies did not, however, establish that the significance criteria used to show nonrandom structure has anything to do with the presence of spatial structure or events that are potentially useful for memory processes.
Our method aims to identify structure within events and to capture their complex dynamics to determine what sort of information replay events could provide in service of memory functions. We precisely define patterns of spatial representation that are, and are not, interpretable in terms of the structure of a given environment and the spatial activity seen during movement. In so doing, we find that most, but not all, events contain spiking that corresponds to a location or a trajectory through the animal’s environment.
The different questions addressed by our method compared to previous methods can also explain the apparent differences in findings. Our results showed that 19% of SWRs contained continuous trajectories—meaning that portions of the SWR had spatially coherent trajectories that moved at high velocities—but only 4% were continuous throughout the entire SWR. This, prima facie, may seem at odds with previous results, which found between 5% and 45% of SWRs were 'significant', meaning that they were well fit by a line compared to various shuffled versions of the dataset. Furthermore, these studies found trajectories on average move at much higher speeds (20x normal rat movement speed) than we report here. These previous findings were based largely on a single linear fits to the data for each event, while our model allows the trajectory to change speed during the SWR, which can lead to very different estimates of the speed of the event. Many of the events that had continuous dynamics, and would be classified as having high speed trajectories, also exhibited stationarycontinuous mixture dynamics for some portion of the event(see for example Figure 2). Further, because the linear fit includes the whole SWR event, it is highly sensitive to the definition of the start and end of the SWR event, an often arbitrary definition which varies across research groups. In contrast, our model makes an estimate for each time point and only depends on the previous and future time step, which minimizes the dependence of the event on those boundaries and any possible 'noise' activity within the SWR. This, along with our more permissive threshold for SWRs, allows us to characterize more SWRs than previous studies, rather than exclude them from analysis.
Our results may also explain the seeming conflict between (Stella et al., 2019), who reported that hippocampal replay trajectories followed Brownianlike diffusion dynamics, and (Pfeiffer and Foster, 2015), who reported that a subset of replay trajectories contained systematic jumps in position. We found that while replay speeds are, on average, distributed log normally, a small subset of the replays include definitive 'jumps' from one position to another. Because our model makes it possible to identify different dynamics within the same SWR, we are able to identify both types of sequences.
We hypothesize that this large set of events supports the ability to store and update representations of individual places and snippets of experience. Previous findings demonstrated that SWRs with movement or immobilityrelated activity engage specific patterns of spiking activity outside the hippocampus (Jadhav et al., 2016; Yu et al., 2017). Thus, we would expect that the large set of events with slow dynamics would also engage specific patterns of activity outside the hippocampus, perhaps allowing animals to store or update the value of specific locations or short movement trajectories without having to generate full trajectories representing one of multiple ways to arrive at a location (Yu and Frank, 2015). Events that contain multiple dynamics (e.g. two continuous representations separated by a brief fragmented representation) might also help bind together experiences that occur in the same environment but at different places, whereas sustained spatially incoherent representations might represent coherent representations in other spatial environments (Karlsson and Frank, 2009) or nonspatial experiences. Alternatively, downstream structures may be are attuned to the shifts in dynamics, and that each period of coherent hippocampal dynamics drives its own set of responses elsewhere in the brain. We further note that recent results from a study of activity patterns during movement indicates that individual theta cycles can have different dynamics as well (Wang et al., 2020), suggesting that the underlying sequences seen during replay and theta may overlap in their dynamics as well as their content.
Similarly, it is possible that slow and fast dynamics are supported by different subcircuits within the hippocampal formation. Yamamoto and Tonegawa, 2017 found that blocking medial entorhinal cortex (MEC) output to CA1 resulted in shorter (and therefore slower) replay trajectories and (Oliva et al., 2018) showed that layer 2/3 MEC neurons increased their firing 50 ms before the start of longer duration SWRs (>100 ms). This raises the possibility that these slower replays are driven by CA3CA1 interaction while the faster trajectories are more externally driven by MECCA1, and that these different events could also engage different circuits outside of the hippocampus and entorhinal cortex.
One key aspect to understanding how the hippocampus engages with other brain structures is identifying the times when the neural dynamics have changed. The times of SWRs or increases in multiunit activity has provided an important, but coarse, landmark to this change in neural dynamics. Because our method allows for momentbymoment characterization of the neural dynamic, we suggest that evaluating the start and end of each dynamic may prove a powerful way identify periods of interest. Examination of such periods may advance our understanding of when and why the hippocampus may be representing current position, past experience, or possible future locations, how inputs to the hippocampus influence these events, and how these events may in turn influence downstream structures.
Our identification of the rich and varied dynamics in SWRs depended on several factors. First, we sought to describe as many SWRs as possible, instead of setting a high threshold for event detection. This more permissive approach was also critical for the identification of events that activate specific, immobilityassociated locations in previous work (Yu et al., 2017). More broadly, there is no evidence that SWR and high population activity events constitute clearly distinct classes of events that can be identified with a single fixed threshold. Instead, the sizes of these events are well described by a unimodal, longtailed distribution (Yu et al., 2017). High thresholds can therefore exclude large numbers of real events and potentially lead to erroneous conclusions.
Second, we used clusterless decoding, which enabled us to take advantage of information from neurons farther away from the tetrodes and not just those that could be cleanly and somewhat subjectively separated (Chen et al., 2012b; Kloosterman et al., 2014; Deng et al., 2016). We note that more fully automated modern spike sorting algorithms (such as MountainSort; Chung et al., 2017) in combination with better recording technology could reduce the differences in results when using sorted spikes or clusterless decoding, as these sorting methods help reduce experimenter subjectivity involved in spike sorting and identify larger numbers of putative single neurons.
Third, we built an explicit and flexible model that allowed us to identify not just one dynamic, but multiple dynamics consistent with different speeds of motion. We used these dynamics to describe the SWR events, rather than declaring the events as significant or not.
Fourth, our model avoided using large temporal bins and does not make assumptions about the linear progression of events. Instead, the model allows for movement in any direction allowed by geometry of the environment, and further, still accounts for the uncertainty of the data. Because the model gives a momentbymoment estimate of the position and dynamic, this also minimizes the effect of misidentifying the SWR start and end times. Combined with an acausal estimation of latent position, this approach allows us to identify the specific timing of transitions in dynamics at a fine timescale. In addition to our SWRbased analyses, this approach could also be used to decode activity during movement, where we might expect that placespecific activity would express combinations of stationary and continuous dynamics reflecting different speeds of representational movement during each cycle of the theta rhythm. Similarly, because our approach uses the same encoderdecoder approach as the standard decoder, it can be used to investigate sequence dynamics before experience (Dragoi and Tonegawa, 2013).
Our model can be viewed as an extension to the multiple temporal models approach of Johnson et al., 2008, which used model selection to determine the model with the most appropriate trajectory speed. Our model goes beyond the Johnson model in that it explicitly permits a mixture between the movement speeds, can work for arbitrary track geometries, and uses clusterless decoding. Our model also represents a middle ground between Hidden Markovstyle models (Chen et al., 2012a; Chen et al., 2016; Linderman et al., 2016; Maboudi et al., 2018), which seek to be environmentagnostic detectors of sequential patterns, and the standard decoder, which typically use arbitrarily chosen bin sizes and restrictive assumptions about the nature of the trajectories. In particular, our approach allows for a variety of dynamics while still yielding spatially interpretable results and makes it possible to use bin sizes of any size (here 2 ms).
Code and algorithms for decoding hippocampal replay have not typically been made accessible or easy to use. This is problematic because it can lead to severe variation or errors in code and parameters, limits reproducibility of results, and slows down scientific progress. Accordingly, we have made code for this model publicly available as a python package at the following URL https://github.com/EdenKramerLab/replay_trajectory_classification (Denovellis, 2021a; copy archived at swh:1:rev:83d84170ae0bdeef65cd123fa83448fcca9cb986). It is easily installable using the pip or conda package installer and contains tutorial Jupyter notebooks in order to facilitate reuse.
Statespace models like the one we use here can enable a richer set of analyses of events that take advantage of all of the available data. These approaches can be applied not just to SWRs but to all times, providing a momentbymoment estimate of the nature of the spatial representation in the hippocampus, including important information about the spatial coherence of that representation. The model can be extended to incorporate other previously experienced environments by training place field models on those environments and including the appropriate movement transition matrices for those environments. It can also be extended to account for task conditions (such as the inbound and outbound conditions in our spatial alternation task) and forward/reverse sequences as in Deng et al., 2016. Finally, models can be built to estimate any covariate, including movement direction (Kay et al., 2020). We therefore suggest that this approach has the potential to provide a much richer understanding of neural representations throughout the brain.
Materials and methods
Simulated data
Request a detailed protocolEncoding data for Figure 1 and Figure 1—figure supplement 1 were generated by simulating 15 traversals of a 180 cm linear track. The spiking of 19 neurons was simulated using an inhomogeneous Poisson process where the instantaneous firing rate of each neuron changes according to a Gaussian place field. The Gaussian place fields had a variance of 36 cm and were spaced at 10 cm intervals over the 180 cm track. The decoding data for Figure 1F was generated by simulating 20,000 linear traversals of the 180 cm track, each with a unique constant speed, starting at 0.5 cm/s and increasing by 0.5 cm/s up to 10,000 cm/s. Each simulated neuron 'spiked' when the traversal passed through the peak location of its place field.
Recording locations and techniques
Request a detailed protocolTen male Long Evans rats (500–700 g, 4–9 months old) were trained on a Wtrack spatial alternation task. nine rats contributed to previous studies (Karlsson and Frank, 2009; Carr et al., 2012; Kay et al., 2016; Kay et al., 2020). Neural activity was recorded in CA1, CA2, CA3, MEC, Subiculum, and DG depending on the rat. We only used tetrodes located in the CA1, CA2, and CA3 subregions in this study.
Behavioral task
Request a detailed protocolAll animals performed a Wtrack spatial alternation task, which is described in detail in Karlsson and Frank, 2009. In brief, each day, animals alternate between 20 min rest sessions in a rest box and 15 min run sessions on the Wshaped track equipped with a reward well at each arm end. On the Wtrack, animals are rewarded at the ends of an arm when that arm is the next correct arm in the sequence. Two rules determine the next correct arm. If the animal is in an outer arm, it must next visit the center arm. If the animal is in the center arm, it has to next visit the less recently visited outer arm. Correct performance of these rules result in the visit sequence: center, left, center, right, center, left, etc. Animals were free to choose any arm at any time. Only run recording sessions with at least nine putative hippocampal pyramidal cells that fired at least 100 spikes were included in the analysis.
Position of the animal and linearization
Request a detailed protocolThe animal’s 2D position was estimated from digital video (30 Hz) of two infrared diodes placed on the headstage preamplifiers using a semiautomated analysis. In order to decrease the time it takes to run the model, the 2D position of the animal was converted into a 1D position. This is done by first defining a 2D graph representation of the track (herein referred to as the track graph), where edges correspond to segments of the Wtrack and nodes represent intersection points between those segments. Then, based on the algorithm in Newson and Krumm, 2009, we use a Hidden Markov Model (HMM) to assign the position detected in each video frame to the most likely track segment. Using the HMM takes into account the time dependent nature of the data and helps prevents sudden jumps from one track segment to another, which is particularly important near intersections. The observation model of the HMM is Gaussian and it models the likelihood of being on a track segment as the Gaussian distance to that segment with a 5 cm standard deviation. The state transition model is empirically estimated, and changes with each time point to ensure that the Euclidean distance between successive position estimates is similar to the shortest path distance along the graph between successive position estimates. A slight bias of 0.1 is given to the diagonal of the state transition model to encourage staying on the same track segment. The most likely track segment the animal is on is computed using the Viterbi algorithm Viterbi, 1967. After finding the track segment that corresponds to each 2D position, the 2D position is projected onto the nearest point of the track segment. This allows us to define a distance from the center well in terms of shortest path length on the track, where 0 cm represents the center well position. The linear distance can then be converted into a linear position by assigning each track segment a position in 1D space. 15 cm gaps were placed between the center arm, left arm, and right arms in 1D space to prevent any smoothing done in the model from influencing neighboring segments inappropriately. The code used for linearization can be found at https://github.com/EdenKramerLab/loren_frank_data_processing (Denovellis, 2021b).
Sorted spikes, multiunit spikes, and waveform features
Request a detailed protocolTo obtain the neural spiking data used for decoding, electrical potentials from rat hippocampus were recorded at 30 kHz, referenced to a tetrode located in the corpus callosum, and then digitally filtered between 600 Hz and 6 kHz. Spiking events were detected as any potential exceeding a 60 $\mu V$ threshold on any one of the four tetrode wires of a tetrode. The electrical potential value on each wire of the tetrode at the time of maximum potential of any of the four wires was used as the waveform feature in the clusterless decoding.
For decoding using sorted spikes, the multiunit events were processed further to assign events to putative single cells. Putative single cells were manually identified based on the clustering of three waveform features within a day: peak amplitude, principal components, and spike width. Only putative hippocampal pyramidal cells—identified based on spike width and average firing rate—were included in the analysis.
SWR detection
Request a detailed protocolSharp wave ripples were detected using the same method as in Kay et al., 2016. Each CA1 LFP was obtained by downsampling the original 30 kHz electrical potential to 1.5 kHz and bandpass filtering between 0.5 Hz and 400 Hz. This was further bandpass filtered for the ripple band (150–250 Hz), squared, and then summed across tetrodes—forming a single population trace over time. This trace was smoothed with a Gaussian with a 4 ms standard deviation and the square root of this trace was taken to get an estimate of the population ripple band power. Candidate SWR times were found by zscoring the population power trace of an entire recording session and finding times when the zscore exceeded two standard deviations for a minimum of 15 ms and the speed of the animal was less than 4 cm/s. The SWR times were then extended before and after the threshold crossings to include the time until the population trace returned to the mean value. The code used for ripple detection can be found at https://github.com/EdenKramerLab/ripple_detection (Denovellis, 2021b). We only analyzed SWRs with spikes from at least two tetrodes.
The model
Request a detailed protocolLet x_{k} be a continuous latent variable that corresponds to the position represented by the population of cells at time t_{k} and let ${I}_{k}$ be a discrete latent variable that is an indicator for the movement dynamics we wish to characterize: stationary, continuous, and fragmented. The goal of the model is to estimate simultaneously the posterior probability of position and dynamics $p({x}_{k},{I}_{k}\mid {O}_{1:T})$, where ${O}_{1:T}$ corresponds to the observed spiking data from time one to time $T$. The observed data can be either spike trains $\mathrm{\Delta}{N}_{1:T}^{(1:C)}$ from $C$ putative cells when decoding with sorted spikes or multiunit spikes $\mathrm{\Delta}{N}_{1:T}^{(1:E)}$ and their associated wave form features ${\overrightarrow{m}}_{k,j}^{i}$ from each tetrode $E$ when decoding with clusterless spikes, where $i\in 1:E$, $k\in 1:T$, $j\in 1:\mathrm{\Delta}{N}_{k}^{i}$.
We have previously shown (Denovellis et al., 2019) that the posterior probability $p({x}_{k},{I}_{k}\mid {O}_{1:T})$ can be estimated by applying the following recursive causal filter equation, starting with initial conditions $p({x}_{0},{I}_{0})$ and iterating to time $T$:
and then applying the acausal smoother equation, starting from the last estimate of the casual filter $p({x}_{T},{I}_{T}\mid {O}_{1:T})$ and recursively iterating backwards to time 1:
where:
Therefore, to specify the model, we have to define or estimate the following quantities:
$p({x}_{0},{I}_{0})$  the initial conditions
$Pr({I}_{k}\mid {I}_{k1})$  the dynamics transition matrix
$p({x}_{k}\mid {x}_{k1},{I}_{k},{I}_{k1})$  the dynamics movement model
$p({O}_{k}\mid {x}_{k},{I}_{k})$  the likelihood of the observations
For the initial conditions $p({x}_{0},{I}_{0})$, we set each dynamic I_{0} to have uniform $1/3$ probability and each initial latent position to have uniform probability density over all possible positions $\mathcal{U}(\mathrm{min}x,\mathrm{max}x)$, reflecting the fact that we do not have any prior knowledge about which dynamic or position is more likely:
For the dynamics transition matrix $Pr({I}_{k}\mid {I}_{k1})$, which defines how likely the dynamic is to change to another dynamic versus persist in the same dynamic, we set it to be:
to encode the prior expectation that each of the dynamics will last the average duration of 100 ms, with a small probability of changing to one of the other dynamics.
For the dynamics movement model $p({x}_{k}\mid {x}_{k1},{I}_{k},{I}_{k1})$, which defines how likely the position x_{k} is to change given the previous position ${x}_{k1}$ and current ${I}_{k}$ and past dynamics ${I}_{k1}$, we set it to be:
where $\delta ({x}_{k1})$ is an identity transition matrix where position cannot change from the previous time step, $\mathcal{N}({x}_{k1},6.0)$ is a random walk from the previous position with variance 6.0, and $\mathcal{U}(\mathrm{min}x,\mathrm{max}x)$ is a uniform transition that allows transitions to any possible position. As discussed in the Results, this means that when persisting in the same dynamic, the stationary, continuous, and fragmented dynamics are defined by the identity transition, the random walk, and the uniform transition, respectively. When transitioning to or from the fragmented dynamic, we assume we do not have any information about the position, so the transition is uniform. Finally, when the transition is from the stationary to continuous, we assume the position is spatially close where it was previously, so we use a random walk. When the transition is from continuous to stationary, we assume that the position is no longer changing, so we use the identity transition.
Lastly, we evaluate the likelihood of the observations $p({O}_{k}\mid {x}_{k},{I}_{k})$ based on an encoding model fit during the encoding period. We assume the likelihood is the same for each dynamic ${I}_{k}$, so we only need to evaluate $p({O}_{k}\mid {x}_{k})$. It has been shown (Zhang et al., 1998; Brown et al., 1998) that the Poisson likelihood with sorted spikes can be computed as:
where ${N}_{{t}_{k}}^{i}$ represents a spike at time t_{k} from cell $i$, ${\mathrm{\Delta}}_{k}$ is the time bin size, and ${\lambda}_{i}({t}_{k}\mid {x}_{k})$ is the instantaneous firing rate of cell $i$ given position x_{k}. ${\lambda}_{i}({t}_{k}\mid {x}_{k})$ is the 'place field’ of the cell, which can be estimated by fitting a generalized linear model to each cell’s spiking during the encoding period.
Likewise, it has been shown (Chen et al., 2012b; Kloosterman et al., 2014; Deng et al., 2016) that the clusterless likelihood can be computed as:
where ${\lambda}_{i}({t}_{k},{\overrightarrow{m}}_{k,j}^{i}\mid {x}_{k})$ is now a generalized firing rate that depends on an associated wave form features $\overrightarrow{m}$ and ${\mathrm{\Lambda}}_{i}({t}_{k}\mid {x}_{k})$ is a marginal firing rate that is equivalent to a place field estimated on multiunit spikes. Both of these rates can be defined as:
and
where ${\mu}_{i}$ is the mean firing rate for tetrode $i$, $\pi (x)$ is the smoothed spatial occupancy of the animal on the track, ${p}_{i}({x}_{k})$ is the smoothed spatial occupancy only at times when spikes occur in tetrode $i$, and ${p}_{i}({x}_{k},{\overrightarrow{m}}_{k,j}^{i})$ is the smoothed occupancy in the space of both space and waveform features. $\pi (x)$, ${p}_{i}({x}_{k},{\overrightarrow{m}}_{k,j}^{i})$, ${p}_{i}({x}_{k})$ can all be estimated by training a kernel density estimator on each tetrode’s spike waveform features and corresponding position during the encoding period.
Encoding  sorted spikes
Request a detailed protocolIn order to encode how each cell’s spiking activity relates to position (the place field), we fit a generalized linear model (GLM) with a Poisson response distribution to each cell’s spiking activity during the encoding period, which we define as all movement times (time periods when the running speed is greater than 4 cm/s). We estimate the parameters $\beta $, which consist of ${\beta}_{0}$, the baseline firing rate over time, and ${\beta}_{i}$, weights for third degree Bspline basis functions f_{i} over position (or tensor products of the Bsplines when position is two dimensional). Bspline basis functions are used because place field firing activity is assumed to vary smoothly over position and this prior knowledge can be exploited to reduce the total number of model parameters needed. Each basis function is spaced every 5 cm over the range of the position and zero constrained so that the change encoded by the parameters is relative to the baseline firing rate. We use a log link function to convert the linear combination of parameters to an instantaneous firing rate over time $\lambda ({t}_{k})$ to ensure the rate is always positive.
A small L2 penalization term $\lambda {\parallel {\beta}_{i}\parallel}_{2}^{2}$ is used to prevent model fitting instability when spiking activity is very low. We set this to 0.5 for all cells. Fitting is done by maximizing the penalized likelihood using a NewtonRaphson algorithm. The code used to fit the GLMs is available at https://github.com/EdenKramerLab/regularized_glm; Denovellis, 2021b.
Encoding  clusterless
Request a detailed protocolIn order to relate each tetrode’s unsorted spiking activity and waveform features to position during the encoding period, which we define as all movement times (time periods when the running speed is greater than 4 cm/s), we used kernel density estimation (KDE) to estimate the following distributions: $\pi (x)$, ${p}_{i}({x}_{k})$, and ${p}_{i}({x}_{k},{\overrightarrow{m}}_{k})$. We used KDEs of the form:
where $y$ is the data with $D$ dimensions, $K$ is a one dimensional Gaussian kernel with bandwidth h_{d}, ${y}_{i,d}$ is the data observed during movement, $N$ is the number of observations during the movement period. For $\pi (x)$, ${y}_{i,d}$ is all positions observed during movement. For ${p}_{i}({x}_{k})$, ${y}_{i,d}$ is all positions at the time of multiunit spikes during movement. For ${p}_{i}({x}_{k},{\overrightarrow{m}}_{k,j}^{i})$, ${y}_{i,d}$ is all positions at the time of multiunit spikes and their associated waveform features during movement. We choose the bandwidth h_{d} to be 6.0 cm for any position dimension and 24.0 $\mu V$ for any spike amplitude dimension, because these parameters were found to minimize decoding error in Kloosterman et al., 2014. The mean firing rate ${\mu}_{k}$ was also estimated during movement.
Decoding
Request a detailed protocolIn order to decode $p({x}_{k},{I}_{k}\mid {O}_{1:T})$, we used a gridbased approximation of the latent position x_{k} that respected the geometry of the track. For 1D linearized positions, we discretized the position space based on the same track graph we used for linearization by finding bins less than or equal to the chosen position bin size for each edge of the graph. For 2D positions, we discretized the position space by binning 2D positions occupied by the animal with equal sized bins of the chosen position bin size, followed by morphological opening to get rid of any holes smaller than the bin size. This was done using Scipy image processing tools (SciPy 1.0 Contributors et al., 2020). This grid based approximation allows us to use Riemann sums to approximate the integrals in the causal filter and acausal smoother equations. We chose a position grid with bins of 3 cm in width (and height if the model was computed with 2D positions) in order to get good resolution for the random walk transition matrices (which had 6 cm variance) as well as for the clusterless and sorted spikes decoding (which have 6 cm bandwidth for the KDE position dimensions and 5 cm spline knots for the GLM). For sorted spikes decoding, we evaluated the place field ${\lambda}_{i}({t}_{k}\mid {x}_{k})$ on the midpoint of these bins. Likewise, for clusterless decoding, we evaluated the spike amplitudes observed during the decoding period by evaluating the KDE for ${p}_{i}({x}_{k},{\overrightarrow{m}}_{k,j}^{i})$ for the midpoint of these bins.
We also used these grid bins in combination with the track graph in order to construct the appropriate 1D random walk transition matrices that respected the track geometry. To do this, we inserted the bin centers as nodes in the track graph, and then computed the shortest path distance between all pairs of position bin centers. We then evaluated a zero mean Gaussian with a variance of 6 cm on these distances to get the appropriate transition probability of moving from bin to bin.
Finally, for our SWR analysis of $p({x}_{k},{I}_{k}\mid {O}_{1:T})$, we decoded each immobility time period (times when the animal’s speed was less than 4 cm / s) in 2 ms time bins and extracted the SWR times.
Probability of the dynamics
Request a detailed protocolThe probability of the dynamics is a quantity that indicates how much that dynamic is contributing to the posterior at a given time. We estimate the probability of the dynamics by integrating out position from the joint posterior:
As in other calculations, the integral is approximated with a Riemann sum.
Posterior probability of position
Request a detailed protocolThe posterior probability of position is a quantity that indicates the most probable 'mental' positions of the animal based on the data. We estimate it by marginalizing the joint probability over the dynamics.
Classification of dynamics
Request a detailed protocolWe used a threshold of 0.80 to classify the probability of each state $Pr({I}_{k}\mid {O}_{1:T})$ into five categories. Time periods during sharp wave ripples are labeled as stationary, continuous, or fragmented when the probability of each individual state is above 0.80, regardless of duration. Time periods are labeled as stationarycontinuousmixture or fragmentedcontinuousmixture when the sum of stationary and continuous or fragmented and continuous are above 0.80, respectively. Time periods where none of these criterion are met are considered unclassified.
Highest posterior density
Request a detailed protocolThe 95% highest posterior density (HPD) is a measurement of the spread of the posterior probability and is defined as the region of the posterior that contains the top 95% of values of the posterior probability (Casella and Berger, 2001). By using only the top values, this measurement of spread is not influenced by multimodal distributions (whereas an alternative measure like the quantiles of the distribution would be). In this manuscript, we use the HPD region size—the total area of the track covered by the 95% HPD region—to evaluate the uncertainty of the posterior probability of position.
In order to calculate the HPD region size, we first calculate the 95% HPD region by determining the maximum threshold value $h$ that fulfills the follow equality:
The 95% HPD region is the set of position bins with posterior values greater than the threshold $\{x:p({x}_{k}\mid {O}_{1:T})>h\}$. The HPD region size is calculated by taking the integral of the members of this set:
which we approximate with a Riemann sum.
Shuffle analysis of the effect of place encoding on classification of dynamics and HPD region size
Request a detailed protocolIn order to confirm that the model classification of dynamics depended on hippocampal place specific encoding, we resampled the position during movement with replacement, but preserved the spike times and spike waveform amplitudes. We fit the clusterless encoding model on the resampled data and then decoded the immobility periods. Like with the nonresampled data, we then extracted the SWR times and determined their classification based on our classification scheme. We repeated this shuffle analysis 50 times and compared this distribution to the real data for two recording sessions on two different animals (animal Bond, day 3, epoch two and animal Remy, day 35, epoch 2).
We also performed another shuffle that preserved more of the local correlation structure of spiking by shuffling the order of the welltowell runs and then circularly shuffling the position within the runs. A welltowell run starts when the animal moves more than 5 cm from a well and ends when the animal moves back within 5 cm of a well. This largely reflects runs from one well to another (e.g. from the center well to the left well), but for a small number of welltowell runs, the animal did turn back to the same well (e.g. went back to the center well after leaving the center well). As above, after permuting the order of the welltowell runs and circularly shuffling the position, we fit the clusterless encoding model and decoded SWRs. We repeated this shuffle analysis 50 times and compared this distribution to the real data for two recording sessions on two different animals (animal Bond, day 3, epoch two and animal Remy, day 35, epoch 2).
Distance from animal
Request a detailed protocolThe distance of the decoded position from the animal’s is defined as the shortest path distance along the track graph between the animal’s 2D position projected on to the track graph (see Linearization) and the MAP estimate of the posterior probability of position, ${\mathrm{arg}\mathrm{max}}_{{x}_{k}}p({x}_{k}\mid {O}_{1:T})$, which is the center of the position bin with the greatest posterior value. The shortest path is found using Dijkstra’s algorithm Dijkstra, 1959 as implemented in NetworkX (Hagberg et al., 2008).
Estimation of speed
Request a detailed protocolIn order to estimate the speed over time, we first found the MAP estimate of the posterior probability of position, ${\mathrm{arg}\mathrm{max}}_{{x}_{k}}p({x}_{k}\mid {O}_{1:T})$, which is the center of the position bin with the greatest posterior value. We then computed the first derivative using the Python library Numpy’s gradient function, which computes the difference in the forward and backward time direction and then averages them for all points except the boundaries. We then smoothed this quantity with a small Gaussian (2.5 ms standard deviation) and then averaged this for the classified time points. Because our position bin size of 3 cm makes it hard to distinguish between slower speeds in a 2 ms time step, we only analyzed classifications longer than 20 ms.
Nonlocal stationary position
Request a detailed protocolThe nonlocal stationary position is defined as replay distances at least 30 cm from the animal’s position during a time period classified as stationary.
Identifying events of high multiunit activity
Request a detailed protocolWe identified times of high multiunit activity when the animal was immobile as a control analysis. Our approach was similar to Davidson et al., 2009. High multiunit periods were identified as times when the zscored multiunit population spiking activity was greater than two standard deviations for at least 15 ms and the animal was moving at speeds less than 4 cm/s. The code used for detection of high multiunit time periods can be found at https://github.com/EdenKramerLab/ripple_detection.
Standard 'Bayesian’ decoding and significance estimation
Request a detailed protocolFor the standard decoder analysis, we first fit a clusterless encoding model in the same manner as described in the subsection Encoding  Clusterless. We then decoded the posterior probability of each position, using 20 ms time bins, using the clusterless likelihood equation as described in the subsection The Model and assuming a uniform prior probability. Finally, to get an estimate of speed of the replay, we used either the Radon transform method, the linear regression method, or the MAP estimate method.
For the Radon transform method, we densely sampled potential line trajectories and picked the line that maximized the probability along that line. The sum of the probability along the line is called the Radon score. We picked the maximum Radon score on lines fit on the combination of center arm and left arm posterior versus center arm and right arm posteriors. pValues were computed by circularly shuffling the posterior probabilities 1000 times and comparing the real data vs. the shuffled distribution.
For the linear regression method, we drew 1000 position samples from each time bin according to the probability of position in that time bin. A linear regression was then fit to the sampled positions using time as a covariate. Fits were made separately for the posterior probabilities corresponding to the center arm with the left arm and the center arm with the right arm. Quality of the fit was assessed using the Rsquared and the best fit between the arms was selected based on this metric.
For the MAP estimate, we pick the position bin with the maximum value for each time bin, which corresponds to the most probable position.
Software and code availability
Request a detailed protocolPython code used for analysis and generating figures in the paper is available at: https://github.com/EdenKramerLab/replay_trajectory_paper (Frank, 2021 copy archived at swh:1:rev:630d4e32f343e86a4921d0773f1f5c5adf90553f). Code for the decoder is available in a separate software repository to facilitate code reuse at: https://github.com/EdenKramerLab/replay_trajectory_classification (Denovellis, 2021a; copy archived at swh:1:rev:83d84170ae0bdeef65cd123fa83448fcca9cb986) (doi: 10.5281/zenodo.3713412). Both code bases rely on the following python packages: Numpy (van der Walt et al., 2011), Numba (Lam et al., 2015), Matplotlib (Hunter, 2007), xarray (Hoyer and Hamman, 2017), NetworkX (Hagberg et al., 2008), Pandas (McKinney, 2010), and Seaborn (Waskom, 2021). All code is opensource and licensed under the MIT Software License. Decoder code can be easily installed as a python package with all requisite dependencies using pip or conda. See software repositories for specific details.
Data availability
All data from this paper has been deposited in a dryad repository. Some of this data was previously deposited at the CRCNS data sharing website as dataset hc6, but for reproducibility purposes we have included it here.

Dryad Digital RepositoryData from: Hippocampal replay of experience at realworld speeds.https://doi.org/10.7272/Q61N7ZC3

Collaborative Research in Computational NeuroscienceSimultaneous extracellular recordings from hippocampal areas CA1 and CA3 (or MEC and CA1) from rats performing an alternation task in two Wshapped tracks that are geometrically identically but visually distinct.https://doi.org/10.6080/K0NK3BZJ
References

Reward revaluation biases hippocampal replay content away from the preferred outcomeNature Neuroscience 22:1450–1459.https://doi.org/10.1038/s4159301904646

BookDuxbury Advanced Series in Statistics and Decision SciencesIn: Berger R. L, editors. Statistical Inference. United States: Cengage. pp. 10–21.

Uncovering spatial topology represented by rat hippocampal population neuronal codesJournal of Computational Neuroscience 33:227–255.https://doi.org/10.1007/s108270120384x

ConferenceTransductive neural decoding for unsorted neuronal spikes of rat HippocampusEngineering in Medicine and Biology Society (EMBC), 2012 Annual International Conference of the IEEE. pp. 1310–1313.https://doi.org/10.1109/EMBC.2012.6346178

ConferenceBayesian nonparametric methods for discovering latent structures of rat hippocampal ensemble spikes2016 IEEE 26th International Workshop on Machine Learning for Signal Processing (MLSP) Vietri Sul Mare Salerno Italy: IEEE. pp. 1–6.https://doi.org/10.1109/MLSP.2016.7738867

Rapid classification of hippocampal replay content for realtime applicationsJournal of Neurophysiology 116:2221–2235.https://doi.org/10.1152/jn.00151.2016

ConferenceCharacterizing hippocampal replay using hybrid point process state space models53rd Asilomar Conference on Signals, Systems, and Computers. 245–249.https://doi.org/10.1109/IEEECONF44664.2019.9048688

Softwarereplay_trajectory_classification, version swh:1:rev:83d84170ae0bdeef65cd123fa83448fcca9cb986Software Heritage.

Forward and reverse hippocampal placecell sequences during ripplesNature Neuroscience 10:1241–1242.https://doi.org/10.1038/nn1961

A note on two problems in connexion with graphsNumerische Mathematik 1:269–271.https://doi.org/10.1007/BF01386390

The reorganization and reactivation of hippocampal maps predict spatial memory performanceNature Neuroscience 13:995–1002.https://doi.org/10.1038/nn.2599

BookFrom conditioning to conscious recollection: memory systems of the brainIn: Cohen N. J, editors. No. 35 in Oxford Psychology Series. New York: Oxford University Press. pp. 1–52.

Softwarereplay_trajectory_paper, version swh:1:rev:630d4e32f343e86a4921d0773f1f5c5adf90553fSoftware Heritage.

The organization of recent and remote memoriesNature Reviews. Neuroscience 6:119–130.https://doi.org/10.1038/nrn1607

ConferenceDynamics, and function using NetworkXProceedings of the 7th Python In Science Conference Pasadena. pp. 11–15.

Complementary spatial firing in place cellinterneuron pairsThe Journal of Physiology 588:4165–4175.https://doi.org/10.1113/jphysiol.2010.194274

Xarray: nd labeled arrays and datasets in PythonJournal of Open Research Software 5:10.https://doi.org/10.5334/jors.148

Matplotlib: a 2D graphics environmentComputing in Science & Engineering 9:90–95.https://doi.org/10.1109/MCSE.2007.55

Neuronal diversity in GABAergic longrange projections from the HippocampusThe Journal of Neuroscience 27:8790–8804.https://doi.org/10.1523/JNEUROSCI.184707.2007

BookMeasuring Distributed Properties of Neural Representations beyond the Decoding of Local Variables: Implications for CognitionIn: Holscher C, Munk M, editors. Information Processing by Neuronal Populations. Cambridge: Cambridge University Press. pp. 95–119.https://doi.org/10.1017/CBO9780511541650.005

The hippocampal sharp waveripple in memory retrieval for immediate use and consolidationNature Reviews. Neuroscience 19:744–757.https://doi.org/10.1038/s4158301800771

Awake replay of remote experiences in the HippocampusNature Neuroscience 12:913–918.https://doi.org/10.1038/nn.2344

Bayesian decoding using unsorted spikes in the rat HippocampusJournal of Neurophysiology 111:217–227.https://doi.org/10.1152/jn.01046.2012

ConferenceNumba: a LLVMBased Python JIT compilerProceedings of the Second Workshop on the LLVM Compiler Infrastructure in HPC LLVM ’15 Austin. pp. 1–6.

UpSet: visualization of intersecting setsIEEE Transactions on Visualization and Computer Graphics 20:1983–1992.https://doi.org/10.1109/TVCG.2014.2346248

A bayesian nonparametric approach for uncovering rat hippocampal population codes during spatial navigationJournal of Neuroscience Methods 263:36–47.https://doi.org/10.1016/j.jneumeth.2016.01.022

Methods for assessment of memory reactivationNeural Computation 30:2175–2209.https://doi.org/10.1162/neco_a_01090

ConferenceData structures for statistical computing in PythonPython in Science Conference . pp. 56–61.https://doi.org/10.25080/Majora92bf192200a

Replay and time compression of recurring spike sequences in the HippocampusThe Journal of Neuroscience 19:9497–9507.https://doi.org/10.1523/JNEUROSCI.192109497.1999

ConferenceHidden markov map matching through noise and sparseness17th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (ACM SIGSPATIAL GIS 2009). pp. 336–343.https://doi.org/10.1145/1653771.1653818

HippocampalPrefrontal reactivation during learning is stronger in awake compared with sleep statesThe Journal of Neuroscience 37:11789–11805.https://doi.org/10.1523/JNEUROSCI.229117.2017

On the methods for reactivation and replay analysisPhilosophical Transactions of the Royal Society B: Biological Sciences 375:20190231.https://doi.org/10.1098/rstb.2019.0231

The NumPy array: a structure for efficient numerical computationComputing in Science & Engineering 13:22–30.https://doi.org/10.1109/MCSE.2011.37

Error bounds for convolutional codes and an asymptotically optimum decoding algorithmIEEE Transactions on Information Theory 13:260–269.https://doi.org/10.1109/TIT.1967.1054010

Seaborn: statistical data visualizationJournal of Open Source Software 6:3021.https://doi.org/10.21105/joss.03021

Discrete place fields of hippocampal formation interneuronsJournal of Neurophysiology 97:4152–4161.https://doi.org/10.1152/jn.01200.2006

Hippocampal replay captures the unique topological structure of a novel environmentThe Journal of Neuroscience 34:6459–6469.https://doi.org/10.1523/JNEUROSCI.341413.2014

Hippocampalcortical interaction in decision makingNeurobiology of Learning and Memory 117:34–41.https://doi.org/10.1016/j.nlm.2014.02.002
Decision letter

Adrien PeyracheReviewing Editor; McGill University, Canada

Timothy E BehrensSenior Editor; University of Oxford, United Kingdom

Yunzhe LiuReviewer; University College London, United Kingdom

H Freyja ÓlafsdóttirReviewer; University College London, United Kingdom
In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.
Acceptance summary:
This paper presents a new framework to decode neuronal activity in the hippocampus during ripple events. This method reveals that most ripple events contain spatially interpretable content that often progresses at timescales slower than previously reported and that appears to better match previous wake activity at real time. These findings challenge the classical view of replay as primarily timecompressed representation of previous wake activity and provide a fuller picture of the repertoire of possible content for ripple events during rest.
Decision letter after peer review:
Thank you for submitting your article "Hippocampal replay of experience at realworld speeds" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Timothy Behrens as the Senior Editor. The following individuals involved in review of your submission have agreed to reveal their identity: Yunzhe Liu (Reviewer #1); H Freyja Ólafsdóttir (Reviewer #2).
The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.
Summary:
In the present study, the authors report that the proportion of candidate replay events in the hippocampus showing spatially interpretable content is much higher than previously described. In addition, these events often progress at slow ("realworld") timescales while it is generally assumed that hippocampal replay is compressed in time. Specifically, the authors have used a state space model that captures the spatial content and temporal evolution of replay. This model is based on few assumptions and is aimed to be unbiased – as it is based on "clusterless" extracellular spikes. While the three reviewers are generally enthusiastic about the work and agree that this study is potentially suitable for eLife, they have raised several concerns that should be addressed to warrant publication. The main concerns are summarized below, please refer to the detailed reviewer's comments for more information.
Essential revisions:
1. The authors should further test their method on exploration data to validate the approach, especially since non place cells may potentially bias the decoding of animal's speed in clusterless data (see reviewer #3 point #1). This analysis of hippocampal activity during run can be used to identify the encoded speed in within theta sequences (reviewer #1, point #2).
2. The authors have to improve controls, especially their shuffling procedures. Shuffling the spikelocation relationship is too permissive. See reviewer #2, point #1 and reviewer #3, point #9.
3. The authors should use more stringent thresholds on speed and theta power to rule out the possibility that some of the events are not replays but represent current position. See reviewer #2, points #23.
4. The classification of candidate replay events should be improved (especially events that may be partially "unclassified") and the authors should provide a more detailed comparison of the replay events detected with their method and with previously described methods (reviewer #3, points #23)
5. The authors should provide descriptive statistics demonstrating that spiking properties during candidate replay events (e.g. average rate per tetrode) are similar between sorted and clusterless data (reviewer #3 point #67). Only events in which spikes are detected on multiple tetrodes should be included, to prevent decoding from the firing of a single cell. Finally, spikes of putative inhibitory neurons should be discarded (reviewer #3, point #7).
6. The authors have used neuronal data collected from the different subcircuits of the hippocampus (CA13). It should be demonstrated that the main results hold if the analysis is restricted to CA1, on which most of previously published studies are based, especially since it can affect the nature of replay events. Is there a relationship between the category (e.g. continuous or stationary) of a given replay event and the proportion of neurons from the different substructures that fired during that event? Also, discussing the seemingly low proportion of continuous events (if the same proportion is observed when only including CA1 neurons). See reviewer #2, point #4 and reviewer #3, points #45.
7. The title of the manuscript is a bit misleading as the reported replayed speeds are often even slower than "realworld speed" and should therefore be changed. See reviewer #1, point #1 and reviewer #3, point #8.
8. The authors should further discuss the implication of their findings. See reviewer #1, point #3 and reviewer #2, point #4.
9. Please note that reviewer #3 has also raised a couple of minor concerns that should be addressed.
Reviewer #1:
Denovellis et al. have developed a state space model approach for unbiased replay detection. As a result, they are able to demonstrate that most of the candidate replay events (i.e., sharp wave ripple events) contains spatial coherent (stationary or continuous) content, and they progress at a relative lower speed (< 1m/s) than previously believed.
I like this paper a lot. The writing is very clear, and the analysis is thorough. The proposed method is likely to be very impactful and widely adopted in the replay community.
Reviewer #2:
Denovel and colleagues develop and apply a state space model to decode position representation during hippocampal SWR events. The main advantage of this model is that is makes fewer assumptions of the dynamics of SWR events allowing it capturing varying and mixed movement dynamics. Through the application of this model the authors are able to decode coherent spatial content from the vast majority (~90%) of SWR events – a much, much higher proportion than previous papers on the topic have reported – and they show that, contrary to previous work, the most common type of movement dynamic during SWR events is one that's characterised by a stationary/slow movement speeds.
I believe the work by Denove and colleagues represents a significant advance in our understanding of SWR events. Moreover, the state space method Denove et al. described also represents useful analytical tool which could potentially be useful to many neuroscientists in the field. I would also like to commend the authors for their clear and detailed explanation of the model.
I do however have some concerns about the robustness of their SWR event method and some of their control analyses. Below I explain these in more details and suggest ways to address these concerns. All suggestions are implementable and would, I believe, significantly strengthen the authors' conclusions.
1. To control for spurious SWR events (containing spatially coherent content), the authors shuffle the positionspike relationship. Although this type of control analysis is commonly used, it can be overly permissive, particularly if researchers are working with smalltomedium samples sizes. This is because for smaller samples sizes it is unlikely that all positions are represented equally by the cell population (which is an assumption of the control analysis). However, given the authors are using clusterless decoding, a cell ID shuffle is not possible. Thus to address this potential caveat, it would be useful if the authors show position representation in their cell population on the Wmaze? Moreover, if the position representation is not equal throughout the environment the authors need to account for this in their position resampling with replacement control analysis.
2. The authors use a speed threshold of 4cm/s to exclude SWR events that may occur during movement periods. Could the authors use a more stringent threshold – e.g. 1cm/sec or 0cm/sec? Although 4cm/sec is low I believe it is important to rule out the possibility that some of their detected spatially coherent SWR events don't merely reflect the movement of the animal during the event. This is particularly important given their finding that the most common type of replay event is one that is characterised by slow movement speeds – akin to those seen during actual behaviour.
3. Related to this, the authors could also use low theta power as an additional quality criteria to weed out events that may reflect (slow) movement periods.
4. One of the main findings of the paper is that SWR events can consist of different types of movement dynamics (stationary, continuous, fragmented) or a mixture of dynamics. This is indeed an interesting and novel finding, but it would be useful if the authors could elaborate on the theoretical implications of different type of replay events. For example, do the authors think SWR events belonging to different movement dynamic categories are supported by different subcircuits within the hippocampus? For example, some work has shown SWR events preceded by EC activity are longer in duration Oliva et al. (2016).
Reviewer #3:
The authors of this study investigated the prevalence and range of representational dynamics associated with replay activity in the rat hippocampus during periods of rest following a spatial alternation task in a Wshape maze. Using a state space model that captures the spatial content and temporal evolution of replay during sharpwave ripple activity, the authors report that most ripple events contain spatially interpretable content that often progresses at timescales slower than previously reported and that appears to better match previous wake activity at real time. The reported findings challenge the classical view of replay as primarily timecompressed representation of previous wake activity and build on previous work on stationary events and reduced proportion of significant trajectory replay events to provide a fuller picture of the repertoire of possible content for ripple events during rest. While the results are intriguing, a number of conceptual and technical shortcomings prevent a more enthusiastic appreciation of this work in its current state, as detailed below.
1. The main Bayesian model needs to be tested first during run on track, where the prior is being built. The authors need to compute the error of the model during run in terms of the difference between the decoded position during run and the actual animal trajectory. If, for any reason, the model appears stationary during run at any trial/lap, then the authors will need to take that into account when claiming stationarity during rest. In particular, clusterless decoding may make use of spikes that are not spatially well modulated and that would appear to support the stationary events during rest.
2. Please use more stringent classification of the events. As the current relaxed criterion for classifying events only requires a portion of the event to exhibit that behavior, what proportion of events classified as 'stationary' or 'continuous' significantly pass traditional, rigorous eventmatched shuffles? What proportion of extended stationary events, >100 ms for instance, have a linefit replay score above the 95th percentile of eventmatched shuffles (circular shifting of posterior decoded probabilities) and a trajectory length shorter than a certain value, e.g. 6 cm? Similarly, what proportion of continuous trajectories pass their respective eventmatched shuffles? What proportion of stationary events are entirely stationary? The authors need to quantify all of the above and present them in the Results.
3. One significant conclusion of this paper is that most population events during awake rest are stationarycontinuous mixtures or stationary as opposed to continuous (replay) events. However, given that an event is classified as stationary if part of the event is stationary (probability >0.8), there is a possibility that the rest of the event is unclassified (no clear structure). In plots 3G, 5AF it is important to include details about the unclassified ripples and the unclassified parts of the ripple which is otherwise classified. For instance, in 5A, the proportion of ripples with parts that are unclassified needs to be mentioned. This is important because parts of the event might be stationary, and parts might be noise/unclassified. Previous studies have characterized events as stationary only if they were stationary throughout the event and had excess stationarity than the 95th percentile of an eventbased shuffle. How many of the events classified as stationary pass this significance criterion? This relaxed classification might otherwise lead to overclassification of certain event types.
4. The reported results are based on combined recordings from 3 hippocampal areas: CA1, CA2, and CA3 (conform Methods). These areas have different cell types, some of which respond to animal stationarity rather than movement. This makes classification and comparison of all event types between this study and the previous studies reporting on CA1 place cells only which were active on simpler tasks on linear tracks difficult. The authors should restrict their analysis to tetrodes located exclusively in CA1 when computing proportions of event types and only then compare their results to previous studies reporting on significant replay and stationary events. To make this comparison even more interpretable, similar significance tests against shuffles and use of clustered data (available for 9 out of the 10 animals that were previously published) should be performed. If the presented proportions of events persist under these conditions, the impact of the findings will be significantly increased.
5. There is discrepancy in proportion of replay events reported here compared to previous studies. Continuous events are most similar to traditionally detected replay sequences, however, those sequences are unidirectional (move from one side of the track to the other to be significant), while these events can move in both directions (model of movement dynamic is symmetric for continuous). The authors should investigate and explain why the proportions of continuous events are less than those of traditionally detected replay sequence events with eventmatched shuffles (1050% in other studies as stated by the authors versus ~5% only continuous).
6. The properties of each event type need to be compared to rule out other reasons for their distinct spatial and temporal dynamics. These properties to be compared between classes of event types (e.g. stationary, continuous, etc.) include: number of tetrodes active in the event, ripple power, total number of spikes in the event, spikes per bin, event duration, proportion of spikes under 6 ms ISI for each tetrode in each event type, proportion of the event duration it lies in that classification. Could you also confirm that population events included for clusterless decoding are similar in properties to those obtained from clustered data by comparing their properties (duration and zscored population firing rates across the event)?
7. There is a possibility that one neuron dominates a population event and, therefore, stationary events are simply single units firing continuously. For instance, Figure 1A for the simulation shows just that, one simulated neuron fires and a stationary event is predicted by the algorithm. As clusterless decoding is used for the main analyses, can only events consisting of multiunit activity from at least 5 tetrodes be included in the analysis? Will the main classes of events be replicated under these conditions? Related to this, interneurons (putative inhibitory neurons) should be removed from the clusterless decoding based on waveform shape as there is a possibility that inclusion of interneurons in the clustering decoding analysis biases results towards detection of stationary events (there might be a hint of this from examples presented in Figure S4 where good stationary events seem to have higher spiking per tetrode compared to continuous events). The event classes should be recomputed and compared with the original classes.
8. The current title is misleading. There does not appear to be clear evidence of replay at realworld speeds. First, during each run lap, the animal trajectory changes unidirectionally at an estimated speed of 2535 cm/s. For replay to occur at real time speed, the trajectory decoded during rest should fully unfold at the same speed over 34 s for a 1mlong track. This is never reported here, partly because the events are shorter. The stationary events decode individual locations, but there is no indication that they evolve in succession (successive locations at successive times) to represent trajectories at real time speeds. Second, neurons fire in timecompressed theta sequences during run, so neuronal activity is naturally compressed even before replay. Compressed replay essentially already represents space at roughly the same speed as during run (theta sequences).
9. Bursting of neurons or other individual neuronal properties might lead to classification of successive bins as spatially coherent and/or stationary. In order to control for that possibility, in clustered decoding researchers typically generate shuffled datasets with conserved place map properties by circularly shifting individual place maps. The shuffles generated in this paper randomize the position to spike relationship destroying all singlecell properties. Can shuffles with conserved tetrode multiunit activity properties be generated instead to test this hypothesis? E.g. circularly shift by the same value all spikes from the same tetrode.
[Editors' note: further revisions were suggested prior to acceptance, as described below.]
Thank you for resubmitting your work entitled "Hippocampal replay of experience at realworld speeds" for further consideration by eLife. Your revised article has been evaluated by Timothy Behrens (Senior Editor) and a Reviewing Editor.
The manuscript has been improved but there are some remaining issues that need to be addressed, as outlined below:
The most critical aspect is about the decoding. During wake, theta sequences should strongly impact the accuracy of the decoding with 2m time bins, but it is unclear why this is not the case. The study should also provide some comparison of decoding accuracy with 2ms and 20ms and provide further details regarding how exactly events are eventually classified (see reviewer #2, points 911).
The study should also include examples showing the differences and similarities in decoding with the clusterless and more classical approaches (see reviewer #2, point #1).
Some other concerns were raised, mostly requesting clarification in the methods and the discussion of the results. Please see detailed review below (reviewer #2, points #28).
Reviewer #2:
I am satisfied with the changes the authors have made in their manuscript and the updated manuscript addresses all of my concerns.
I think the findings are exciting and will contribute to theoretical development in the field. I particularly like that the authors have made the code for their state space model available – this may prove useful for promoting methodological consistency and transparency in the replay field.
Reviewer #3:
In the revised manuscript, the authors have partially addressed the previous concerns. The manuscript has improved, and the findings remain potentially interesting. Below, please find additional comments aimed at improving the accuracy and readability of the manuscript:
1. Due to the various differences from earlier methods (clusterless decoding, state space encoding model, bin size, event detection criteria and quantification of dynamic type: radon transform vs current method), it is difficult to pinpoint which of these differences is leading to the authors' main conclusions. For the general readership, it will be extremely useful to show actual examples of the same decoded spike trains using standard Bayesian decoding (with clustered units) and using state space models. This could reveal any significant differences in the decoded locations which the authors claim are revealed by this method (pages 3 and 4). Since the authors claim there is a substantial difference between the two methods, differences in the inferred dynamics from the two methods should be visible by eye and many such examples should be available.
2. Is the higher preponderance of stationary/slow events caused by the authors setting of the beginning and ends of events as the start and ends of ripples, rather than using the multiunit population activity (which can span multiple ripples). Can the authors rerun the analysis with multiunit population activity to ensure that the results are not different from previous studies mostly due to how events are detected? Even if the results are due to that, it will be useful to report that to the reader as it will not diminish the impact of these findings (the authors already allude to the fact that the cut off is arbitrary) but will help them understand the reason for the differences between the studies.
3. Can the authors include a histogram of event duration versus proportion of such events for each of the dynamic so that the reader can better judge what is the likelihood of each event type by duration?
4. Although the fragmented dynamic would indeed identify discontinuities in decoded locations, how exactly is it better than a nonparametric shuffle? What if the fragmented dynamic has more falsepositives or falsenegatives?
5. Are the discovered dynamics restricted to the replay of the animal experience or can they be present before the experience? The authors should discuss this possibility.
6. How is stationarity a realworld 'speed'? Isn't it a lack of movement? Were only the spikes emitted while the animal was moving used for the encoding model and not those while it was resting? How are then the stationary events replays of that movement? What was the average velocity of the animal during movement? And of the stationary dynamic?
7. References cited and conclusions drawn from papers are not accurate in some instances, please verify and correct. Several cases are listed below, but the list is larger. Louie and Wilson, Neuron 2001 showed that replay of experience can occur at realtime speed during REM sleep, which should be cited. On lines 6768 the authors say: "events that represent a single location are only seen in young animals (Stella et al., 2019)". That study did not investigate young animals, the authors probably meant to refer to (Farooq and Dragoi, 2019), which should be cited. There is a difference between what previous studies have claimed, and what the authors attribute to those studies. For instance, previous studies observed low proportions of sequential trajectories, but they did not observe low proportions of spatially coherent timebins/cofiring, which are otherwise wellreported and have been observed many times.
8. How do the authors know that a probability of 0.8 or above can be inferred from downstream neurons as opposed to 0.95? Unless recordings are performed from downstream neurons that cannot really be determined. Both 0.8 and 0.95 face the problem of ambiguity of whether it is interpretable by a downstream neuron, however, a probability of 0.95 will enable readers to judge what happens when events are statistically significant. Can the authors discuss this in the main text?
9. Very low decoding errors (36 cm) during the run at the small timescale employed here (2 ms) may indicate that this method is incapable of capturing theta sequences (which should give a higher decoding error). Instead, they capture stationary dynamics like they do during rest. Larger bins for decoding during run are typically used to study position representation at larger timescales (at bigger timescales than theta sequences) which are likely to have low error. This could be problematic since the current method might be biased toward overrepresenting the stationary dynamic. The authors should address this in the manuscript.
10. What happens when a 20 ms time bin is used for state space modelling? Does it reduce the proportion of spatiallycoherent events?
11. Figure 5. When whole ripples are classified as stationary, stationarycontinuous mixture and so on, how is this classification done for the whole ripple? The classification of dynamics is done for each 2 ms bin, but what criteria are used to classify the whole ripple? Please provide more details on the intermediate steps. For instance, if an event is stationary, but only has one 2 ms bin which deviates, is it considered stationarycontinuous?
https://doi.org/10.7554/eLife.64505.sa1Author response
Essential revisions:
1. The authors should further test their method on exploration data to validate the approach, especially since non place cells may potentially bias the decoding of animal's speed in clusterless data (see reviewer #3 point #1). This analysis of hippocampal activity during run can be used to identify the encoded speed in within theta sequences (reviewer #1, point #2).
We have added a comparison between the decoded position and the animal’s position to validate our decoding of speed (page 7) as well as added a discussion about the ability of our model to identify speed during theta sequences. We have provided a more detailed response to reviewers #3 and #1 below.
2. The authors have to improve controls, especially their shuffling procedures. Shuffling the spikelocation relationship is too permissive. See reviewer #2, point #1 and reviewer #3, point #9.
We have added an additional shuffle where we randomized the order of runs (from one reward well to another) and then circularly permuted the resulting segments of data across all tetrodes uniformly (page 9). This shuffle preserves local spiking correlations (and thus spiking statistics), but the relationship to position is disrupted. We find that our results are very different when our model is applied to these shuffled data as we describe in detail in our response to reviewers #2 and #3 below.
3. The authors should use more stringent thresholds on speed and theta power to rule out the possibility that some of the events are not replays but represent current position. See reviewer #2, points #23.
We have performed a control analysis where we required animal speeds to be less than 1 cm/s and achieved similar results (Figure5—figure supplement 2C). We also show in Figure 5 that most of the dynamics do not represent the animal’s current position (although the stationary dynamics often represent the animal’s position). We have provided a more detailed response to reviewer #2 below.
4. The classification of candidate replay events should be improved (especially events that may be partially "unclassified") and the authors should provide a more detailed comparison of the replay events detected with their method and with previously described methods (reviewer #3, points #23)
We included more information about which SWR included times that were unclassified (Figure 5). We also performed a comparison between our method and three previously described methods (Figure 6). We have provided a more detailed response to reviewer #2 below.
5. The authors should provide descriptive statistics demonstrating that spiking properties during candidate replay events (e.g. average rate per tetrode) are similar between sorted and clusterless data (reviewer #3 point #67). Only events in which spikes are detected on multiple tetrodes should be included, to prevent decoding from the firing of a single cell. Finally, spikes of putative inhibitory neurons should be discarded (reviewer #3, point #7).
We have included descriptive statistics about the spiking statistics during each of the dynamics (Figure 5—figure supplement 3) as well as performed a control analysis using clustered units (Figure 5—figure supplement 2D). We have clarified in the Methods (page 22) that all the SWR events that we have analyzed included more than 2 tetrodes with spikes and we have included a further control where we required more than 5 tetrodes with spikes (Figure 5—figure supplement 2A). We have also performed a control analysis where we removed spikes with narrow waveforms to exclude spikes from putative interneurons (Figure 5—figure supplement 2B) and another analysis where we included only sorted putative pyramidal cells (Figure 5—figure supplement 2D). We have provided a more detailed response to reviewer #3 below.
6. The authors have used neuronal data collected from the different subcircuits of the hippocampus (CA13). It should be demonstrated that the main results hold if the analysis is restricted to CA1, on which most of previously published studies are based, especially since it can affect the nature of replay events. Is there a relationship between the category (e.g. continuous or stationary) of a given replay event and the proportion of neurons from the different substructures that fired during that event? Also, discussing the seemingly low proportion of continuous events (if the same proportion is observed when only including CA1 neurons). See reviewer #2, point #4 and reviewer #3, points #45.
We have performed a control analysis where we include only CA1 tetrodes (Figure5—figure supplement 2E). Our dataset does not allow comparison of hippocampal subregions because of the varying number of tetrodes in each region. We have also included a discussion (page 18) of why our proportion of continuous events seem lower than those observed in the literature, but are actually in line with the proportion observed in the literature because of the capability of our model to characterize more than one dynamic per SWR. We have provided a more detailed response to reviewers #2 and #3 below.
7. The title of the manuscript is a bit misleading as the reported replayed speeds are often even slower than "realworld speed" and should therefore be changed. See reviewer #1, point #1 and reviewer #3, point #8.
Here we respectfully disagree. We now provide a plot showing the actual speeds at which the animal moves through the environment, which cover a similar range to those seen during the slower replay events (Figure 5G). We have provided a more detailed response to reviewers #1 and #3 below.
8. The authors should further discuss the implication of their findings. See reviewer #1, point #3 and reviewer #2, point #4.
We have further expanded our discussion to discuss the functional implications of the slower replays and the possible subcircuits that could underlie the different replay speeds within the hippocampal structure (page 19). We have provided a more detailed response to reviewers #1 and #2 below.
Reviewer #2:
Denovellis and colleagues develop and apply a state space model to decode position representation during hippocampal SWR events. The main advantage of this model is that is makes fewer assumptions of the dynamics of SWR events allowing it capturing varying and mixed movement dynamics. Through the application of this model the authors are able to decode coherent spatial content from the vast majority (~90%) of SWR events – a much, much higher proportion than previous papers on the topic have reported – and they show that, contrary to previous work, the most common type of movement dynamic during SWR events is one that's characterised by a stationary/slow movement speeds.
I believe the work by Denovellis and colleagues represents a significant advance in our understanding of SWR events. Moreover, the state space method Denove et al. described also represents useful analytical tool which could potentially be useful to many neuroscientists in the field. I would also like to commend the authors for their clear and detailed explanation of the model.
I do however have some concerns about the robustness of their SWR event method and some of their control analyses. Below I explain these in more details and suggest ways to address these concerns. All suggestions are implementable and would, I believe, significantly strengthen the authors' conclusions.
1. To control for spurious SWR events (containing spatially coherent content), the authors shuffle the positionspike relationship. Although this type of control analysis is commonly used, it can be overly permissive, particularly if researchers are working with smalltomedium samples sizes. This is because for smaller samples sizes it is unlikely that all positions are represented equally by the cell population (which is an assumption of the control analysis). However, given the authors are using clusterless decoding, a cell ID shuffle is not possible. Thus to address this potential caveat, it would be useful if the authors show position representation in their cell population on the Wmaze? Moreover, if the position representation is not equal throughout the environment the authors need to account for this in their position resampling with replacement control analysis.
We thank the reviewer for this comment. As the reviewer noted, it is important to know how well the spatial environment is represented by cells because this could affect the decode of SWRs. To address this, we have added Figure 2—figure supplement 3 to show that the firing over all our tetrodes is roughly equal for most of our epochs. We believe this is one of the advantages of clusterless decoding in that we have more spikes and therefore are potentially less constrained by worries about under sampling positions in the environment. Furthermore, we also measured the median difference between the most probable decoded position and the animal’s actual position and found a 4 cm median difference (36 cm 95% CI), for all time in sessions. This gives us confidence that the cell population has spatial information about all positions in the environment.
We would also like to note that we directly test for nonspatially coherent content by including the fragmented dynamic in our model, which is what the shuffle tests are aimed at capturing in most analyses. We have added to the Results to clarify this point (page 6).
In addition, to better illustrate this point, we have added an additional shuffle where we have randomized the order of the runs between wells and circularly permuted the spatial position within the runs to preserve more of the structure of the task (Figure 3—figure supplement 3, page 9). This shuffle increased the prevalence of spatially incoherent dynamics and decreased the prevalence of spatially coherent dynamics as expected, showing that our addition of the fragmented state works well to capture these types of dynamics. The shuffle also substantially increased the spatial extent of the 95% HPD region, indicating that the model was much less certain of the decoded locations even in cases where it identified a dynamic. These results indicate that our results reflect structure in the data, not just structure in the model.
2. The authors use a speed threshold of 4cm/s to exclude SWR events that may occur during movement periods. Could the authors use a more stringent threshold – e.g. 1cm/sec or 0cm/sec? Although 4cm/sec is low I believe it is important to rule out the possibility that some of their detected spatially coherent SWR events don't merely reflect the movement of the animal during the event. This is particularly important given their finding that the most common type of replay event is one that is characterised by slow movement speeds – akin to those seen during actual behaviour.
As per the reviewer’s suggestion, we reran our analysis using a speed threshold of less than 1 cm/s and found a similar distribution of movement dynamics (Figure 5—figure supplement 2C). Also note that the majority of our SWRs occur at speeds less than 1 cm/s (Figure 5—figure supplement 3B). We have also quantified the distance of the decoded trajectory from the animal for stationarycontinuousmixtures (Figure 5C) and found that these represented locations that were some distance away from the animal (median 52 cm). Because of this we are confident that our results are not simply encoding the animal’s position or the result of theta sequences as the animal begins to move.
3. Related to this, the authors could also use low theta power as an additional quality criteria to weed out events that may reflect (slow) movement periods.
We agree that this is a very reasonable suggestion, and in fact we have tried this in the past. Unfortunately, low theta power is difficult to use as a criteria because the sharp wave that occurs during the SWR is also in this same frequency band. In our experience, there is no consistent threshold of theta power (or even theta/δ ratio) that consistently separates immobility from movement times. Because of these challenges, and because of our analysis above, we feel confident that our results are not due to movement of the animal and slow theta sequences.
4. One of the main findings of the paper is that SWR events can consist of different types of movement dynamics (stationary, continuous, fragmented) or a mixture of dynamics. This is indeed an interesting and novel finding, but it would be useful if the authors could elaborate on the theoretical implications of different type of replay events. For example, do the authors think SWR events belonging to different movement dynamic categories are supported by different subcircuits within the hippocampus? For example, some work has shown SWR events preceded by EC activity are longer in duration Oliva et al. (2016).
As per the reviewer’s suggestion, we have broadened our Discussion to talk about the theoretical implications of the different types of replay events (page 19). In particular, for the slower dynamics: stationary dynamics could be important for recalling specific locations in the environment and stationarycontinuousdynamics could be useful for recalling or evaluating small snippets of the environment. Continuous dynamics could be useful when needing to recall or consolidate an entire trajectory. Finally, fragmented dynamics could correspond to nonspatial memories or memories of other spatial environments. When there are jumps between positions, perhaps these represent instances when several snippets are consolidated in succession.
For example, do the authors think SWR events belonging to different movement dynamic categories are supported by different subcircuits within the hippocampus? For example, some work has shown SWR events preceded by EC activity are longer in duration Oliva et al. (2016).
It is certainly possible that the different dynamics are driven by different subcircuits within the hippocampus because awake SWRs are influenced by input from outside of the hippocampus, as the reviewer noted. However, due to the varying number of tetrodes in each hippocampal subfield, we are unable to attribute particular dynamics to specific subregions. We have added this to the Discussion (page 19).
Reviewer #3:
The authors of this study investigated the prevalence and range of representational dynamics associated with replay activity in the rat hippocampus during periods of rest following a spatial alternation task in a Wshape maze. Using a state space model that captures the spatial content and temporal evolution of replay during sharpwave ripple activity, the authors report that most ripple events contain spatially interpretable content that often progresses at timescales slower than previously reported and that appears to better match previous wake activity at real time. The reported findings challenge the classical view of replay as primarily timecompressed representation of previous wake activity and build on previous work on stationary events and reduced proportion of significant trajectory replay events to provide a fuller picture of the repertoire of possible content for ripple events during rest. While the results are intriguing, a number of conceptual and technical shortcomings prevent a more enthusiastic appreciation of this work in its current state, as detailed below.
1. The main Bayesian model needs to be tested first during run on track, where the prior is being built. The authors need to compute the error of the model during run in terms of the difference between the decoded position during run and the actual animal trajectory. If, for any reason, the model appears stationary during run at any trial/lap, then the authors will need to take that into account when claiming stationarity during rest. In particular, clusterless decoding may make use of spikes that are not spatially well modulated and that would appear to support the stationary events during rest.
As per the reviewers suggestion, we computed the median absolute difference between the animal’s most probable decoded position during run and the animal’s actual location, using a five fold cross validation. We found that the median difference between these two quantities was comparable to other studies (4 cm median difference, 36 cm 95% CI, for all time in sessions), even though we used 2 ms time bins, rather than the more commonly used 250 ms time bins. We have added this to the Results (page 7). We also note that the slower spatially coherent trajectories had posteriors that were highly spatially concentrated (as characterized by the size of the 95% highest posterior density in Figure 4E). This can only happen if the spikes themselves were from cells that were strongly spatially modulated, ruling out the possibility that the stationary events are driven by cells lacking spatial selectivity.
2. Please use more stringent classification of the events.
As suggested by the reviewer, we have added more comparison between our method and previous methods in the Results (Figure 6). However, we believe that it is a strength of our decoder that the classification of dynamics is not dependent on the entire SWR. The determination of the start and end times of the SWR are somewhat ad hoc and vary greatly from paper to paper. Because of this, other studies have used a procedure where the stop and start is determined by the decode and not the multiunit or SWR event boundary (Xu and Csicsvari 2019, Pfeiffer and Foster 2015). Using the whole SWR for determination of the trajectory means any noise or spatially incoherent period during the SWR could cause the whole event to be thrown out and not analyzed. Our method allows us to characterize the content of all SWRs, rather than exclude certain SWRs because the SWR does not fit our assumptions about constant velocity trajectories. Furthermore, we have shown that the spatially coherent dynamics that we characterize are sustained in duration (Figure 5, stationarycontinuousmixtures have a median duration of 73 ms for example). Finally, we and others have shown examples where there are multiple dynamics in a single SWR (Figure 3, Figure 3—figure supplement 1, Pfeiffer and Foster 2015), so we think it would be inappropriate to classify these types of events as a single dynamic. We have clarified this in the manuscript (pages 11, 18).
As the current relaxed criterion for classifying events only requires a portion of the event to exhibit that behavior, what proportion of events classified as 'stationary' or 'continuous' significantly pass traditional, rigorous eventmatched shuffles?
As requested by the reviewer, we have performed the comparison of our dynamics to the standard Bayesian decoder that passed eventmatched shuffles (page 15). These were all based on a Radon line fit with 20 ms bins, have a replay score with p < 0.05 using a circular shuffle of the posterior values for each time bin, and use clusterless decoding. We chose this method as the “traditional” method, but please note, as we mentioned in our Introduction, that there are many other methods that have been used in the replay field. We would like to emphasize that the dynamics captured by our decoder are not required to be present for the entire SWR while the line is fit over the entire SWR, so the comparison is, in our opinion, hard to interpret. However, we agree with the reviewer that, despite these limitations, it is useful to have this comparison for others in the field to understand the differences between the methods, so we have included them here and in the manuscript (Figure 6):
– 8% (542 of 7176) of SWRs containing stationary dynamics with p < 0.05.
– 14% (2253 of 16106) of SWRs containing stationarycontinuousmixtures dynamics with p < 0.05.
– 31% (1368 of 4511) of SWRs containing continuous dynamics with p < 0.05.
– 31% (799 of 2562) of SWRs containing fragmentedcontinuous dynamics with p < 0.05.
– 15% (244 of 1598) of SWRs containing fragmented dynamics with p < 0.05.
What proportion of extended stationary events, >100 ms for instance, have a linefit replay score above the 95th percentile of eventmatched shuffles (circular shifting of posterior decoded probabilities) and a trajectory length shorter than a certain value, e.g. 6 cm?
28% of extended stationary events have a linefit replay score with p < 0.05.
Similarly, what proportion of continuous trajectories pass their respective eventmatched shuffles?
30% of SWRs with continuous dynamics have a linefit replay score with p < 0.05.
What proportion of stationary events are entirely stationary? The authors need to quantify all of the above and present them in the Results.
47% of SWRs containing the stationary dynamics are entirely stationary. We have added all of these quantifications to the results.
3. One significant conclusion of this paper is that most population events during awake rest are stationarycontinuous mixtures or stationary as opposed to continuous (replay) events. However, given that an event is classified as stationary if part of the event is stationary (probability >0.8), there is a possibility that the rest of the event is unclassified (no clear structure). In plots 3G, 5AF it is important to include details about the unclassified ripples and the unclassified parts of the ripple which is otherwise classified. For instance, in 5A, the proportion of ripples with parts that are unclassified needs to be mentioned. This is important because parts of the event might be stationary, and parts might be noise/unclassified.
We agree with the reviewer that including information about the unclassified times is important in order to fully characterize the SWRs. As suggested, we included the unclassified categorization in Figure 5 to give the readers a sense for how often the unclassified category occurs, the duration of these time periods, and the multiunit firing during these time periods. We have also included the unclassified category in Figure 5—figure supplement 3 to add information about the ripple power, spikes per bin, and number of spikes per tetrode.
Previous studies have characterized events as stationary only if they were stationary throughout the event and had excess stationarity than the 95th percentile of an eventbased shuffle. How many of the events classified as stationary pass this significance criterion?
5% of events classified as stationary were significant at the 0.05 level using the standard Bayesian decoder with Radon transform line fit. We feel that our findings argue strongly against trusting this 5% number as an indication that there are no meaningful stationary events.
This relaxed classification might otherwise lead to overclassification of certain event types.
We hope that we have addressed this issue in the answers above and in the text (page 15), and we want to emphasize that given that many events contain multiple dynamics, the standard linefitting approaches are problematic in terms of accurately capturing the way that representations move during these events. The goal of our approach is to provide information about the sorts of representational dynamics that are present and that could provide information to downstream structures, and for that goal, we felt that a threshold of 0.80 was reasonable. That said, future work that examines how these events influence activity and plasticity should help us refine these classifications further.
4. The reported results are based on combined recordings from 3 hippocampal areas: CA1, CA2, and CA3 (conform Methods). These areas have different cell types, some of which respond to animal stationarity rather than movement. This makes classification and comparison of all event types between this study and the previous studies reporting on CA1 place cells only which were active on simpler tasks on linear tracks difficult. The authors should restrict their analysis to tetrodes located exclusively in CA1 when computing proportions of event types and only then compare their results to previous studies reporting on significant replay and stationary events.
As requested by the reviewer, we performed a control analysis where we only used CA1 tetrodes for clusterless decoding. We found similar dynamics as reported in the main result of the paper (Figure 5—figure supplement 2E).
However, we would like to point out that several studies (e.g. Diba and Buszaki 2007, Karlsson and Frank 2009) investigating replay have used cells from multiple subregions within the hippocampus. Second, we have also found cells show place specific activity during immobility across CA1, CA2 and CA3 (Yu et al. 2017, Kay et al. 2016). Finally, we have also found that CA1 and CA3 spiking are largely coordinated within and across hemispheres during awake SWR replay (Carr et al. 2012). For these reasons, we feel our main analysis should not exclude CA2 and CA3 tetrodes.
To make this comparison even more interpretable, similar significance tests against shuffles and use of clustered data (available for 9 out of the 10 animals that were previously published) should be performed. If the presented proportions of events persist under these conditions, the impact of the findings will be significantly increased.
We appreciate this point, and carried out an analysis of replays events based on clustered data. The results were very similar (Figure 5—figure supplement 2D).
5. There is discrepancy in proportion of replay events reported here compared to previous studies. Continuous events are most similar to traditionally detected replay sequences, however, those sequences are unidirectional (move from one side of the track to the other to be significant), while these events can move in both directions (model of movement dynamic is symmetric for continuous). The authors should investigate and explain why the proportions of continuous events are less than those of traditionally detected replay sequence events with eventmatched shuffles (1050% in other studies as stated by the authors versus ~5% only continuous).
The reviewer brings up an important point that we failed to clarify in the text. We believe that there is not a discrepancy in the proportion of replay events in our study compared to previous studies. Our decoder is much more flexible in terms of allowing different speeds during the SWR event, so an event that might have been characterized as having constant high speed for the entire event might actually have some time periods that have slower trajectories, even if the majority the time the trajectory proceeds at a higher speed (see Figure 2 for example). Accordingly, we still find 19% of classified SWR events contain continuous trajectories, which is consistent with those previous studies, and these continuous trajectories persist for a considerable duration (median 94 ms, Figure 5B). As the reviewer noted, only 5% of SWRs have continuous dynamics throughout the entire duration of the SWR, but we feel that restricting our analyses to dynamics which last the entire SWR is a limited characterization of the SWR, particularly given that we observed SWRs with multiple dynamics. Finally, we have also used a less strict threshold for including SWRs and interpreted a much larger fraction of these SWRs, so the percentage may be hard to compare between studies. We have added this clarification to the text (page 18).
6. The properties of each event type need to be compared to rule out other reasons for their distinct spatial and temporal dynamics. These properties to be compared between classes of event types (e.g. stationary, continuous, etc.) include; number of tetrodes active in the event, ripple power, total number of spikes in the event, spikes per bin, event duration, proportion of spikes under 6 ms ISI for each tetrode in each event type, proportion of the event duration it lies in that classification.
We thank the reviewer for this suggestion. We have added these statistics as a supplementary figure (Figure 5—figure supplement 3). These statistics confirm that slower dynamics are not driven by a single neuron as there are many tetrodes active for each dynamic and there are multiple spikes from different tetrodes in each time bin. There is also little evidence for more bursting in the slower dynamics compared to the continuous dynamics because we observed a similar proportion of spikes under 6 ms ISI for the slow dynamics. Only ~50% of the spikes had ISIs of less than 6 ms in the stationary, stationarycontinuous, and continuous dynamics (Figure 5—figure supplement 3). It should be noted that this metric should be interpreted with extreme caution because with clusterless decoding, the ISI of the multiunit spikes is not the same as if we had an individual unit. The spikes could be from multiple neurons firing, which would not tell us if there was bursting or not. Finally, these statistics show similar ripple power for all our dynamics as well as a tendency for the unclassified time periods during the SWR to have lower average ripple power.
Could you also confirm that population events included for clusterless decoding are similar in properties to those obtained from clustered data by comparing their properties (duration and zscored population firing rates across the event)?
We apologize for the confusion regarding our event detection method. In this manuscript, we analyzed SWR events which are detected based on the combined instantaneous rippleband power of CA1 LFPs and not based on high multiunit firing rate, as many studies have done (Xu et al. 2019, Farooq and Dragoi 2019, Drieu et al. 2018, Ólafsdóttir et al. 2017, Grossmark and Buzsáki 2016, Pfeiffer and Foster 2013, Davidson et al. 2009). Therefore, we are not able to compare the multiunit and clustered events because they are not selected on this basis.
7. There is a possibility that one neuron dominates a population event and, therefore, stationary events are simply single units firing continuously. For instance, Figure 1A for the simulation shows just that, one simulated neuron fires and a stationary event is predicted by the algorithm. As clusterless decoding is used for the main analyses, can only events consisting of multiunit activity from at least 5 tetrodes be included in the analysis? Will the main classes of events be replicated under these conditions?
As suggested by the reviewer, to exclude the possibility that a single neuron firing is the cause of stationary events, we performed a control analysis where we required spiking from at least 5 tetrodes for a SWR to be analyzed and found a similar distribution of movement dynamics (Figure 5—figure supplement 2A). It also should be noted that our original analysis required at least 2 tetrodes to be active for a SWR to be analyzed and we have clarified this in the Methods (page 22). We have also included a histogram of the number of active tetrodes per SWR event (Figure 5—figure supplement 3B) as well as a boxplot of the number of active tetrodes by dynamic (Figure 5—figure supplement 3B). These show that well over 5 tetrodes are active in each dynamic and during the SWR events.
Related to this, interneurons (putative inhibitory neurons) should be removed from the clusterless decoding based on waveform shape as there is a possibility that inclusion of interneurons in the clustering decoding analysis biases results towards detection of stationary events (there might be a hint of this from examples presented in Figure S4 where good stationary events seem to have higher spiking per tetrode compared to continuous events). The event classes should be recomputed and compared with the original classes.
As requested by the reviewer, we performed a control analysis where we excluded spikes with narrow waveform spike widths (<0.3 ms), which would preferentially exclude spikes from putative interneurons. We again found a similar distribution as in our main results (Figure 5—figure supplement 2B). However, we should note several caveats to this approach. First, interneurons are typically identified by both their high firing rate and the average narrow spike width waveforms. Our approach only considered the spike width for each individual spike (not average) and therefore may be much less discriminating and remove many spikes that may contain spatial information. Second, it is not clear that interneurons should be excluded. Interneurons can have spatially specific firing and may contain information about the position represented by the hippocampus (Hangya et al. 2010, Wilent and Nitz 2007). Third, if interneurons were not spatially specific and driving the slower dynamics, we would expect to see the posterior be less spatially concentrated. Instead, we found that these dynamics tended to have narrow posteriors (as characterized by spatial coverage of the 95% highest posterior density values in Figure 4E).
Finally, we performed a third control analysis where we also looked at the 9 of 10 animals in our dataset that already had clustered cells, excluding putative interneurons based on spike width and firing rate and requiring that at least three putative pyramidal cells be active during the SWR. Although there were many fewer events to examine, we still observed many stationary and stationarycontinuousmixtures in SWRs compared to continuous sequences (Figure 5—figure supplement 2D).
8. The current title is misleading. There does not appear to be clear evidence of replay at realworld speeds. First, during each run lap, the animal trajectory changes unidirectionally at an estimated speed of 2535 cm/s. For replay to occur at real time speed, the trajectory decoded during rest should fully unfold at the same speed over 34 s for a 1mlong track. This is never reported here, partly because the events are shorter. The stationary events decode individual locations, but there is no indication that they evolve in succession (successive locations at successive times) to represent trajectories at real time speeds.
We thank the reviewer for this comment. As the reviewer stated, the slower trajectories cannot span the same amount of distance as when the animal is moving at slow speeds because of the duration of the SWR. Our claim in the title and in the rest of the manuscript is that the speed of the decoded trajectory over the duration of the SWR is often similar to the speeds at which the animal experiences the environment, which includes times when the animal is immobile and times when the animal is running. This does indeed imply that the replay trajectory does not move very far. We have clarified this in the text (pages 12) and added a figure showing the distribution of speeds of the animal (Figure 5G). As this figure shows, there is substantial overlap between the movement speeds seen during replay and real movement speeds.
Second, neurons fire in timecompressed theta sequences during run, so neuronal activity is naturally compressed even before replay. Compressed replay essentially already represents space at roughly the same speed as during run (theta sequences).
The reviewer brings up an interesting and important point. Individual theta cycles can contain extended sequences spanning locations behind and in front of the animal, but other theta cycles can show much less prominent sequences, resulting in a range of potential representational movement speeds. In addition, a recent result reported quite different dynamics for the first and second halves of theta cycles (Wang et al. 2020).
Whether replay sequences are essentially the same as theta sequences remains to be determined, but we have noted that our methods could be used to study these sequences as well.
We emphasize, though, that the goal of this paper is to establish that replay sequences can proceed at a variety of speeds. We hope in future manuscripts to directly address the similarities and differences between replay and theta sequences using our approach.
9. Bursting of neurons or other individual neuronal properties might lead to classification of successive bins as spatially coherent and/or stationary. In order to control for that possibility, in clustered decoding researchers typically generate shuffled datasets with conserved place map properties by circularly shifting individual place maps. The shuffles generated in this paper randomize the position to spike relationship destroying all singlecell properties. Can shuffles with conserved tetrode multiunit activity properties be generated instead to test this hypothesis? E.g. circularly shift by the same value all spikes from the same tetrode.
The reviewer brings up an important point, which highlights our use of small time bins. Our time bins are 2 ms, which implies spike bursts (38 ms ISI, 23 spikes per burst, Tropp Sneider et al. (2006)) could fall in separate, potentially successive time bins as the reviewer noted (although the average ISI is ~6 ms so it’s likely not the case, Tropp Sneider et al. (2006)). Note that our use of 2 ms time bins is an advantage over the standard 20 ms time bins because with 20 ms time bins, either most of these spikes fall into one time bin, making the estimate stationary over the 20 ms time bin or the spikes could fall into two 20 ms time bins, making the estimate stationary over 40 ms; with our approach, the stationary period could only last for approximately as long as the burst modulo 1 or 2 ms.
As the reviewer stated, spike bursts from a single neuron could lead to successive time bins being classified as stationary. There are two lines of evidence that make us think that this is not the case. First, the stationary dynamics have a longer duration than one would expect from a burst alone (54 ms median duration, 3874 ms quartiles vs. bursts which have a duration of 624 ms). Second, thanks to a previous suggestion of this reviewer, we know that multiple tetrodes are active during each SWR (Figure 5—figure supplement 3B), there are often multiple spikes from different tetrodes in a given time bin (Figure 5—figure supplement 3A), and the proportion of spikes under 6 ms for stationary dynamics is similar to those as stationarycontinuousmixtures and continuous dynamics (Figure 5—figure supplement 3A). Moreover, our main claim in the manuscript is that the stationarycontinuousmixtures constitute the bulk of dynamics in SWRs, which can only happen with more than one tetrode active.
[Editors' note: further revisions were suggested prior to acceptance, as described below.]
The manuscript has been improved but there are some remaining issues that need to be addressed, as outlined below:
The most critical aspect is about the decoding. During wake, theta sequences should strongly impact the accuracy of the decoding with 2m time bins, but it is unclear why this is not the case.
The number we reported included both times when the animal was immobile and times when the animal was running. Because of this, the decoding accuracy was not strictly computed on times when theta was most prevalent. To prevent confusion for the readers, we have changed the decoding accuracy to be measured only on the times when the animal was moving. This resulted in a change of decoding accuracy from a median error of 4 cm (36 cm) to 7 cm (59 cm), which is in line with most previous results. We also note that we show an example of decoding during movement in a separate manuscript that uses a version of this algorithm (Gillespie et al. Biorxiv 2021, Figure 2B) which illustrates its capacity to detect theta sequences.
The study should also provide some comparison of decoding accuracy with 2ms and 20ms and provide further details regarding how exactly events are eventually classified (see reviewer #2, points 911).
We clarified this issue regarding decoding accuracy with Dr. Peyrache and have therefore not included this first analysis.
For the second analysis we now provide further details on the classification of events on page 26 and in our response to the reviewer. We also clarified this issue with Dr. Peyrache.
The study should also include examples showing the differences and similarities in decoding with the clusterless and more classical approaches (see reviewer #2, point #1).
We have added six examples comparing the differences in decoding with clusterless and more classical approaches in a new supplemental figure: Figure 6—figure supplement 1.
Some other concerns were raised, mostly requesting clarification in the methods and the discussion of the results. Please see detailed review below (reviewer #3, points #28).
Reviewer #3:
In the revised manuscript, the authors have partially addressed the previous concerns. The manuscript has improved, and the findings remain potentially interesting. Below, please find additional comments aimed at improving the accuracy and readability of the manuscript:
1. Due to the various differences from earlier methods (clusterless decoding, state space encoding model, bin size, event detection criteria and quantification of dynamic type: radon transform vs current method), it is difficult to pinpoint which of these differences is leading to the authors' main conclusions. For the general readership, it will be extremely useful to show actual examples of the same decoded spike trains using standard Bayesian decoding (with clustered units) and using state space models. This could reveal any significant differences in the decoded locations which the authors claim are revealed by this method (pages 3 and 4). Since the authors claim there is a substantial difference between the two methods, differences in the inferred dynamics from the two methods should be visible by eye and many such examples should be available.
We thank the reviewer for this great suggestion. We have added six examples of the fits from three different methods in a new supplemental figure: Figure 6supplemental1. These examples illustrate the variety of event types seen and the challenges inherent in using 20 ms bins and standard “Bayesian” decoding approaches.
2. Is the higher preponderance of stationary/slow events caused by the authors setting of the beginning and ends of events as the start and ends of ripples, rather than using the multiunit population activity (which can span multiple ripples). Can the authors rerun the analysis with multiunit population activity to ensure that the results are not different from previous studies mostly due to how events are detected? Even if the results are due to that, it will be useful to report that to the reader as it will not diminish the impact of these findings (the authors already allude to the fact that the cut off is arbitrary) but will help them understand the reason for the differences between the studies.
As requested by the reviewer, we reran the analysis using times of high multiunit activity during immobility and found that the distribution of slow dynamics is qualitatively similar. Thus, the preponderance of slow events does not seem to be a result of using sharp wave ripple rather than events with high multiunit activity. We have included this as Figure5—figure supplement 2F in the manuscript. This also corroborates our finding that these slow dynamics have similar multiunit firing rates to the higher speed, continuous dynamics (Figure 5F).
3. Can the authors include a histogram of event duration versus proportion of such events for each of the dynamic so that the reader can better judge what is the likelihood of each event type by duration?
We are unclear on whether proportion in this case refers to proportion relative to the length of the sharp wave ripple or whether proportion refers to over a recording session. We have included both here as Author response image 1 and Author response image 2. Given that we already show the distributions of dynamics per SWR and the distributions of the durations of the dynamics in main figures, it is not clear to us that these Author response images will substantially augment a reader’s understanding of our results, so we have not included them in the manuscript, but if the reviewer feel strongly that they need to be included we can do so.
4. Although the fragmented dynamic would indeed identify discontinuities in decoded locations, how exactly is it better than a nonparametric shuffle? What if the fragmented dynamic has more falsepositives or falsenegatives?
All statistical methods have tradeoffs. However, we believe that our fragmented state directly tests for “random” on a momentbymoment basis. This is an advantage because nonparametric methods cannot do this on a momentbymoment basis thereby potentially excluding content of interest. Shuffles also can require much time for computation often leading to small numbers of shuffles, which can make the tests underpowered. Finally, shuffles can lead to pathological null distributions which cannot occur in real data and therefore do not test against the null hypothesis of interest (e.g Foster 2017, van der Meer et al. 2020). Thus, including the fragmented dynamic preserves the existing data while still testing for the different dynamics. More broadly, testing for falsepositives and falsenegatives requires a model of what a true positive and a true negative is. The state space approach provides a relatively simple, interpretable version of such a model, while the model associated with a shuffle is often harder to define and understand. We have added this point to the Discussion.
5. Are the discovered dynamics restricted to the replay of the animal experience or can they be present before the experience? The authors should discuss this possibility.
Our study was not focused on these types of events but indeed our model could be used to discover dynamics present for the experience. We added this point to the Discussion that we could use this model to detect dynamics before the event (page 20).
6. How is stationarity a realworld 'speed'? Isn't it a lack of movement? Were only the spikes emitted while the animal was moving used for the encoding model and not those while it was resting? How are then the stationary events replays of that movement? What was the average velocity of the animal during movement? And of the stationary dynamic?
To answer the first question, the animal’s experience is not limited to running. The animals can stop at any position on the track, and thus a real world experience can include movement and immobility at any given position.
We only used spikes that occurred in association with speeds of > 4 cm / second in the encoding model. We had mentioned that in the Methods for the single spike analysis, but we had neglected to do so for the clusterless analysis, so we have added that information. We note, however, that place specific spiking in a location can continue even when animals are still (Kay et al. 2016; Yu et al. 2017), so using this encoding model still allows for the identification of periods of immobility or slow movement in a decoded representation. More importantly, stationary events are those that have slow velocity, but nowhere do we claim that this velocity is < 4 cm/second. Indeed, given our bin size of 3 cm, a 100 ms event could be moving at (for example) 10 cm /second and still remain within a single bin. Thus, we only claim that these events represent very slow speed movements consistent with (slower) realworld speeds, not that they are perfectly stationary at a point in space.
Finally, the median speed of our animals is 4 cm/s for all times (including immobility times) and 17 cm/s for run periods (speeds > 4 cm/s). We have added this to the manuscript in the Results (page 14).
7. References cited and conclusions drawn from papers are not accurate in some instances, please verify and correct. Several cases are listed below, but the list is larger. Louie and Wilson, Neuron 2001 showed that replay of experience can occur at realtime speed during REM sleep, which should be cited.
We thank the reviewer for the reference. We have added the reference to Louie and Wilson 2001 on page 18 in the Discussion.
On lines 6768 the authors say: "events that represent a single location are only seen in young animals (Stella et al., 2019)". That study did not investigate young animals, the authors probably meant to refer to (Farooq and Dragoi, 2019), which should be cited.
We thank the reviewer for catching this mistake. We have changed the reference to Farooq and Dragoi (2019) as we originally intended, which was erroneously referenced as Stella et al. 2019. We have correctly referenced Farooq and Dragoi (2019) in other parts of the paper and apologize for our carelessness in this instance.
There is a difference between what previous studies have claimed, and what the authors attribute to those studies. For instance, previous studies observed low proportions of sequential trajectories, but they did not observe low proportions of spatially coherent timebins/cofiring, which are otherwise wellreported and have been observed many times.
We agree that previous studies that examined cofiring across entire events found substantial structure, but those studies did not determine whether individual events were expressing spatially coherent structure across a set of time bins. We have modified the introduction and discussion to reflect this clarification.
8. How do the authors know that a probability of 0.8 or above can be inferred from downstream neurons as opposed to 0.95? Unless recordings are performed from downstream neurons that cannot really be determined. Both 0.8 and 0.95 face the problem of ambiguity of whether it is interpretable by a downstream neuron, however, a probability of 0.95 will enable readers to judge what happens when events are statistically significant. Can the authors discuss this in the main text?
Our threshold of 0.8 is used to categorize the dynamics to determine the speed and as a measure of our confidence in that dynamic. We are not claiming downstream neurons are explicitly making use of our dynamics categories, but rather our categories can be used to understand and summarize the dynamics of replay. Thus, our algorithm enables the possibility of a systematic investigation of how different thresholds and thus different category boundaries might be related to activity in downstream regions.
We have also shown that these categories broadly correspond to speeds of interest in Figure 1F. Additionally, we have also shown that our results still hold with a higher 0.95 threshold. We have added a discussion of this on page 6.
9. Very low decoding errors (36 cm) during the run at the small timescale employed here (2 ms) may indicate that this method is incapable of capturing theta sequences (which should give a higher decoding error). Instead, they capture stationary dynamics like they do during rest. Larger bins for decoding during run are typically used to study position representation at larger timescales (at bigger timescales than theta sequences) which are likely to have low error. This could be problematic since the current method might be biased toward overrepresenting the stationary dynamic. The authors should address this in the manuscript.
We reported decoding error for both immobility and running time periods to show that our model could capture stationary representation of the animal’s position at the well as well as theta sequences when the animal was running. We realize, however, this is confusing to the reader given what is typically reported. We have changed this to be the error only during running times. During running, we observed a median error of 7 cm (59 cm), showing that we indeed can capture theta sequences. This is similar to most other studies (Davidson et al. 2009: 7 cm, 9 cm, 8 cm, 8 cm for each rat; Shin et al. 2019: 3.81 cm; Farooq and Dragoi 2019: ~5 cm (inferred from Figure 1J); Stella et al. 2019: ~9 cm (Figure S2); Farooq et al. 2019: 7.6 cm; Grossmark and Buzsaki 2016: 5.14 cm; Ólafsdóttir et al. 2017: 17.5 cm). We also note that we illustrate the application of the state space algorithm to data from running in Figure 2B of Gillespie et al. Biorxiv (2021), which clearly illustrates that it can capture the dynamics of theta sequences.
We also do not think that our model is biased to overrepresenting stationary dynamics because we observed similar distributions of speeds when using the MAP estimate (which picks the most likely position without the state space modeling component) with 20 ms time bins (Figure 6B).
10. What happens when a 20 ms time bin is used for state space modelling? Does it reduce the proportion of spatiallycoherent events?
We do not have reason to believe that larger time bins would reduce the proportion of spatially coherent events because our analysis using the MAP estimate and 20 ms bins (Figure 6B) rather than the state space model (Figure 6D) also showed many slow events. In any case, we emphasize that there are serious issues with using longer bins for understanding the structure of the data because it introduces arbitrary boundaries between sets of spikes. Some papers try to overcome this with a sliding window, which partially smooths over those boundaries, but even there, the data are chunked into bins that are larger than necessary (e.g. 10 ms for a 20 ms sliding window that is offset by 10 ms each step) and we argue that there is no reason to introduce these arbitrary boundaries.
11. Figure 5. When whole ripples are classified as stationary, stationarycontinuous mixture and so on, how is this classification done for the whole ripple? The classification of dynamics is done for each 2 ms bin, but what criteria are used to classify the whole ripple? Please provide more details on the intermediate steps. For instance, if an event is stationary, but only has one 2 ms bin which deviates, is it considered stationarycontinuous?
Any time bin that meets the criteria is included. However, the majority of our dynamics have a duration much longer than a single time bin (Figure 5B). We have further clarified this in the methods (page 26).
https://doi.org/10.7554/eLife.64505.sa2Article and author information
Author details
Funding
Simons Foundation (542971)
 Uri T Eden
Simons Foundation (542981)
 Eric L Denovellis
 Anna K Gillespie
 Michael E Coulter
 Marielena Sosa
 Jason E Chung
 Loren M Frank
Howard Hughes Medical Institute
 Eric L Denovellis
 Loren M Frank
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
We thank Margaret Carr Larkin and Mattias P Karlsson for use of their openly available datasets. We also thank Abhilasha Joshi for feedback on the manuscript. This work was supported by the Simons Foundation for the Global Brain Grants 542971 (UTE) and 542981 (LMF) and the Howard Hughes Medical Institute (LMF).
Ethics
Animal experimentation: All experiments were conducted in accordance with University of California San Francisco Institutional Animal Care and Use Committee and US National Institutes of Health guidelines. The protocol was approved by the Institutional Animal Care and Use Committee, approval number AN17499103G. All surgical procedures were performed under anesthesia and every effort was made to minimize suffering.
Senior Editor
 Timothy E Behrens, University of Oxford, United Kingdom
Reviewing Editor
 Adrien Peyrache, McGill University, Canada
Reviewers
 Yunzhe Liu, University College London, United Kingdom
 H Freyja Ólafsdóttir, University College London, United Kingdom
Publication history
 Preprint posted: October 21, 2020 (view preprint)
 Received: October 30, 2020
 Accepted: September 8, 2021
 Version of Record published: September 27, 2021 (version 1)
 Version of Record updated: October 4, 2021 (version 2)
Copyright
© 2021, Denovellis et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Metrics

 1,654
 Page views

 166
 Downloads

 1
 Citations
Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.