Sampling motion trajectories during hippocampal theta sequences
Abstract
Efficient planning in complex environments requires that uncertainty associated with current inferences and possible consequences of forthcoming actions is represented. Representation of uncertainty has been established in sensory systems during simple perceptual decision making tasks but it remains unclear if complex cognitive computations such as planning and navigation are also supported by probabilistic neural representations. Here, we capitalized on gradually changing uncertainty along planned motion trajectories during hippocampal theta sequences to capture signatures of uncertainty representation in population responses. In contrast with prominent theories, we found no evidence of encoding parameters of probability distributions in the momentary population activity recorded in an openfield navigation task in rats. Instead, uncertainty was encoded sequentially by sampling motion trajectories randomly and efficiently in subsequent theta cycles from the distribution of potential trajectories. Our analysis is the first to demonstrate that the hippocampus is well equipped to contribute to optimal planning by representing uncertainty.
Editor's evaluation
This paper will be of interest to neuroscientists interested in predictive coding and planning. It presents a novel analysis of hippocampal place cells during exploration of an open arena. It performs a comprehensive comparison of real and synthetic data to determine which encoding model best explains population activity in the hippocampus.
https://doi.org/10.7554/eLife.74058.sa0Introduction
Modelbased planning and predictions are necessary for flexible behavior in a range of cognitive tasks. In particular, navigation is a domain that is ecologically highly relevant not only for humans but for rodents as well, which established a field for parallel investigation of the theory of planning, the underlying cognitive computations, and their neural underpinnings (Hunt et al., 2021; Mattar and Lengyel, 2022). Importantly, predictions extending into the future have to cope with uncertainty coming from multiple sources: uncertainty in the current state of the environment (our current location relative to a dangerous spot, the satiety of a predator or the actual geometry of the environment) and the availability of multiple future options when evaluating upcoming choices (Glimcher, 2003; Redish, 2016). Whether and how this planningrelated uncertainty is represented in the brain is not known.
The hippocampus has been established as one of the brain areas critically involved in both spatial navigation and more abstract planning (O’Keefe and Nadel, 1978; Miller et al., 2017). Recent progress in recording techniques and analysis methods largely contributed to understanding of the neuronal mechanisms underlying such computations (Pfeiffer and Foster, 2013; Kay et al., 2020). A crucial insight gained about the neural code underlying navigation is that neuron populations in the hippocampus represent the trajectory of the animal on multiple time scales: Not only the current position of the animal can be read out at the behavioral time scale (O’Keefe and Nadel, 1978; Wilson and McNaughton, 1993), but also trajectories starting in the past and ending in the near future are repeatedly expressed on a shorter time scale at accelerated speed during individual cycles of the 6–10 Hz theta oscillation (theta sequences, Foster and Wilson, 2007, Figure 1a). Moreover, features characteristic of planning can be identified in the properties of encoded trajectories, such as their dependence on the immediate context the animal is in, and on the span of the current run, future choices and rewards (Johnson and Redish, 2007; Gupta et al., 2012; Wikenheiser and Redish, 2015; Tang et al., 2021; Zheng et al., 2020). These data provide strong support for a computational framework where planning relies on sequential activity patterns in the hippocampus delineating future locations based on the current beliefs of the animal (Stachenfeld et al., 2017; Miller et al., 2017). Whether hippocampal computations also take into account the uncertainty associated with planning and thus the population activity represents the uncertainty of the encoded trajectories has not been studied yet.
Neuronal representations of uncertainty have been extensively studied in sensory systems (Ma et al., 2006; Orbán et al., 2016; Vértes and Sahani, 2018; Walker et al., 2020). Schemes for representing uncertainty fall into three broad categories (Figure 1d–f). In the first two categories (product and Distributed Distributional Code, DDC, schemes), the firing rate of a population encodes a complete probability distribution over spatial locations instantaneously at any given time by representing the parameters of the distribution (Wainwright and Jordan, 2007, Methods). In the product scheme, the firing rate of neurons encode a probability distribution through taking the product of the tuning functions (basis functions, Methods) of the coactive neurons (Ma et al., 2006; Pouget et al., 2013, Figure 1d). In contrast, in the DDC scheme a population of neurons represents a probability distribution by signalling the overlap between the distribution and the basis function of individual neurons (Zemel et al., 1998; Vértes and Sahani, 2018, Figure 1e). In the third category, the sampling scheme, the population activity represents a single value sampled stochastically from the target distribution. In this case uncertainty is represented sequentially by the acrosstime variability of the neuronal activity (Fiser et al., 2010, Figure 1f). These coding schemes provide a firm theoretical background to investigate the representation of uncertainty in hippocampus. Importantly, all these schemes have been developed for static features, where the represented features do not change in time (Ma et al., 2006; Orbán et al., 2016; but see Kutschireiter et al., 2017). In contrast, trajectories represented in the hippocampus encode the temporally changing position of the animal. Here, we extend the coding schemes to be able to accommodate the encoding of uncertainty associated with dynamic motion trajectories and investigate their neuronal signatures in rats while navigating an open environment.
Probabilistic planning in an open field entails the representation of subjective beliefs about the past, present, and future states, which requires that a continuous probability distribution over possible locations is represented. Previous studies investigated prospective codes during theta sequences in a constrained setting, in which binary decisions were required in a spatial navigation task (Johnson and Redish, 2007; Kay et al., 2020; Tang et al., 2021). These studies found that alternative choices were encoded sequentially in distinct theta cycles suggesting a sampling based representation. However, the dominant source of future uncertainty in these tasks is directly associated with the binary choice of the animal (left or right) and it remains unclear whether this generalizes to other sources of uncertainty relevant in open field navigation. In particular, it has been widely reported that the hippocampal spatial code has different properties in linear tracks, where the physical movement of the animal is constrained by the environment, than during openfield navigation (Buzsáki, 2005). Moreover, these previous studies did not attempt to test the consistency of the hippocampal code with alternative schemes for representing uncertainty. Thus, the way hippocampal populations contribute to probabilistic planning during general openfield navigation remains an open question.
In the present paper, we propose that the hippocampus is performing probabilistic inference in a model that represents the temporal dependencies between spatial locations. Using a computational model, we demonstrate that key features of the hippocampal single neuron and population activity are compatible with representing uncertainty of motion trajectories in the population activity during theta sequences. Further, by developing novel computational measures, we pitch three alternative schemes of uncertainty representation and a scheme that lacks the capacity to represent uncertainty, the mean scheme (Figure 1c), against each other and demonstrate that hippocampal activity does not show the hallmarks of schemes encoding probability distributions instantaneously. Instead, we demonstrate that the large and structured trial to trial variability between subsequent theta cycles is consistent with stochastic sampling from potential future trajectories but not with a scheme ignoring the uncertainty by representing only the most likely trajectory. Finally we confirm previous results in simpler mazes by showing that the trajectories sampled in subsequent theta cycles tend to be anticorrelated, a signature of efficient sampling algorithms. These results demonstrate that the brain employs probabilistic computations not only in sensory areas during perceptual decision making but also in associative cortices during naturalistic, highlevel cognitive processes.
Results
Neural variability increases within theta cycle
A key insight of probabilistic computations is that during planning uncertainty increases as trajectories proceed into more distant future (Murphy, 2012; Sezener et al., 2019). As a consequence, if planned trajectories are encoded in individual theta sequences, the uncertainty associated with the represented positions increases within a theta cycle (Figure 1a). This systematic change in the uncertainty of the represented position during theta cycles is a crucial observation that enabled us to investigate the neuronal signatures of uncertainty during hippocampal theta sequences. For this, we analyzed a previously published dataset (Pfeiffer and Foster, 2013). Briefly, rats were trained to collect food reward in a 2×2 m large open arena from one of the 36 uniformly distributed food wells alternating between random foraging and spatial memory task (Pfeiffer and Foster, 2013). Position of the animal was recorded via a pair of distinctly coloured headmounted LED light. Goal directed navigation in an open arena requires continuous monitoring and online correction of the deviations between the intended and actual motion trajectories. While sequences during both sharp waves and theta oscillations have been implicated in planning, here we focused on theta sequences as they are more strongly tied to the current position and thus averaging over many thousands of theta cycles can provide the necessary statistical power to identify the neuronal correlate of uncertainty representation.
Activity of hippocampal neurons was recorded by 20 tetrodes targeted to dorsal hippocampal area CA1 (Pfeiffer and Foster, 2013). Individual neurons typically had locationrelated activity (Figure 2b, see also Figure 4—figure supplement 1a), but their spike trains were highly variable (Skaggs et al., 1996; Fenton and Muller, 1998, Figure 2a). We used the empirical tuning curves (i.e., place fields) and a Bayesian decoder (Zhang et al., 1998) to estimate the represented spatial position from the spike trains of the recorded population in overlapping 20ms time bins. Theta oscillations partition time into discrete segments and analysis was performed in these cycles separately (Figure 2c). Despite the large number of recorded neurons (68–242 putative excitatory cells in 8 sessions from 4 rats), position decoding had a limited accuracy (Fisher lower bound on the decoding error in 20ms bins: 16–30 cm vs. typical trajectory length ∼20 cm). Yet, in high spike count cycles we could approximately reconstruct the trajectories encoded in individual theta cycles (Figure 2c). We then compared the reconstructed trajectories to the actual trajectory of the animal. We observed substantial deviation between the decoded trajectories and the motion trajectory of the animal: decoded trajectories typically started near the actual location of the animal and then proceeded forward (Foster and Wilson, 2007) often departing in both directions from the actual motion trajectory (Kay et al., 2020; Figure 2c).
To systematically analyse how this deviation depends on the theta phase, we sorted spikes into three classes (early, mid and late). For any given class, we decoded position from the spikes and compared it to the position of the animal shifted forward and backward along the motion trajectory. Time shift dependence of the accuracy of the decoders reveals the most likely portion of the trajectory the given class encodes (Figure 2d, see also Figure 4—figure supplement 2). For early spikes, the minimum of the average decoding error was shifted backward by ∼100ms, while for late spikes +500ms forward shift minimized the decoding error. Note that the position identified by the minima can only establish positions relative to the LED used to record the position of the animal. The relative shifts in the minima of the decoding error across different phases of theta are not affected by the arbitrariness of the positioning of the LED sources. The observed systematic biases indicate that theta sequences tended to start slightly behind the animal and extended into the near future (Foster and Wilson, 2007; Gupta et al., 2012).
Further, the minima of the decoding errors showed a clear tendency: position decoded from later phases displayed larger average deviation from the motion trajectory of the animal (Figure 2d). At the single neuron level, increased deviations at late theta phases resulted in the expansion of place fields (Skaggs et al., 1996): Place fields were more compact when estimated from earlyphase spikes than those calculated from latephase activity (Figure 2e). At the population level, larger deviation between the motion trajectory and the late phasedecoded future positions can be attributed to the increased variability in encoded possible future locations. Indeed, when we aligned the decoded locations relative to the current location and motion direction of the animal, we observed that the spread of the decoded locations increased in parallel with the forward progress of their mean within theta cycles (Figure 2f, g). Taken together, this analysis demonstrated that the variability of the decoded positions is larger at the end of the theta cycle when encoding more uncertain future locations than at the beginning of the cycle when representing the past position.
The observed increase of the variability across theta cycles could be a direct consequence of encoding variable twodimensional trajectories (Figure 4—figure supplement 2a, b) or it may be a signature of the neuronal representation of uncertainty. In the next sections, we set up a synthetic dataset to investigate the theta sequences and their variability in the four alternative coding schemes.
Synthetic data: testbed for discriminating the encoding schemes
To analyse the distinctive properties of the different coding schemes and to develop specific measures capable of discriminating them, we generated a synthetic dataset in which both the behavioral component and the neural component could be precisely controlled. The behavioral component was matched to the natural movement statistics of rats during navigating a 2D environment. The neural component was constructed such that it could accommodate the alternative encoding schemes for prospective representations during navigation.
Our synthetic dataset had three consecutive levels. First, we simulated a planned trajectory for the rat in a two dimensional open arena by allowing smooth changes in the speed and motion direction (Methods, Figure 3—figure supplement 1a, b). Second, similar to the real situation, the simulated animal did not have access to its true position, $x}_{n$, in theta cycle $n$, but had to infer it from the sensory inputs it had observed in the past, ${y}_{1:n}$ (Methods). To perform this inference and predict its future position, the animal used a model of its own movements in the environment. In this dynamical generative model, the result of the inference is a posterior distribution over possible trajectories starting $n}_{p$ steps back in the past and ending n_{f} steps forward in the future. To generate the motion trajectory of the animal noisy motor commands were calculated from the difference between its planned trajectory and its inferred current position (Methods, Figure 3—figure supplement 1a, b). The kinematics was matched between simulation and experimental animals (Figure 3—figure supplement 2).
Third, in our simulations the hippocampal population activity encoded an inferred trajectory at an accelerated speed in each theta cycle such that the trajectory started in the past at the beginning of the simulated theta cycle and led to the predicted future states (locations) by the end of the theta cycle (Figure 3a–c). To approximately match the size of the synthetic and experimental data, we simulated the activity of 200 hippocampal pyramidal cells (place cells). Firing rates of pyramidal cells depended on the encoded spatial location using either of the four different coding schemes (Methods): In the mean code, the population encoded the single most likely trajectory without representing uncertainty. In product and DDC schemes, a snapshot of the population activity at any given theta phase encoded the estimated past or predicted future part of the trajectory in the form of a probability distribution (Methods). Finally, in the sampling scheme, in each theta cycle a single trajectory was sampled stochastically form the distribution of possible trajectories (Figure 3—figure supplement 1). Spikes were generated from the firing rates independently across neurons via an inhomogenous Poisson process. The posterior distribution was updated in every ∼100ms matching the length of theta cycles. Importantly, all of the four encoding schemes yielded single neuron and population activity dynamics consistent with the known features of hippocampal population activity including spatial tuning, phase precession (Figure 3d) and theta sequences (Figure 3e, see also Figure 3—figure supplement 3).
After generating the synthetic datasets, we investigated how positional information is represented during theta sequences in each of the four alternative coding schemes. We decoded the synthetic population activity in early, mid and late theta phases and compared the estimated position with the actual trajectory of the simulated animal. The deviation between the decoded position and the motion trajectory increased throughout the theta cycle irrespective of the coding scheme (Figure 3f) due to the combined effects of the divergence of possible future trajectories and the increased variability of the encoded locations. Moreover, place fields were significantly larger when estimated from late than earlyphase spikes in all four encoding schemes (Figure 3g, $\mathrm{P}<{10}^{16}$ for each scheme). Finally, when aligned to the current position and motion direction of the simulated animal, the spread of the decoded locations increased with the advancement of their mean within theta cycles in all four coding schemes (Figure 3h).
Our analyses thus confirmed that the disproportionate increase in the variability of the encoded locations at late theta phase is consistent with encoding trajectories in a dynamical model of the environment irrespective of the representation of the uncertainty. These synthetic datasets provide a functional interpretation of the hippocampal population activity during theta oscillations and offer a unique testbed for contrasting different probabilistic encoding schemes. In the following sections, we will identify hallmarks for each of the encoding schemes of uncertainty and use these hallmarks to discriminate them through the analysis of neural data (Figure 4a).
Testing the product scheme: gain
To discriminate the product scheme from other representations, we capitalize on the specific relationship between response intensity of neurons and uncertainty of the represented variable. In a product representation, a probability distribution over a feature, such as the position, is encoded by the product of the neuronal basis functions (Methods). When the basis functions are localized, as in the case of hippocampal place fields, the width of the encoded distribution tends to decrease with the total number of spikes in the population (Ma et al., 2006, Figure 4b, Appendix 1, Figure 4—figure supplement 1). Therefore, we propose using the systematic decrease of the population firing rate (gain) with increasing uncertainty as a hallmark of the product scheme.
We first tested for the specificity of the comodulation of the population gain with uncertainty to the product scheme: we compared gain modulation in the four different coding schemes in our synthetic dataset. In each of the coding schemes, we identified the theta phase with the maximal uncertainty by the maximal forward bias in encoded positions. For this, we decoded positions from spikes using eight overlapping 120° windows in theta cycles and defined the end of the theta cycles based on the maximum of the average forward bias (Figure 4C top). Then we calculated the average number of spikes in a given 120° window as a function of the theta phase. We found that the product scheme predicted a systematic, ∼threefold modulation of the firing rate within the theta cycle (Figure 4c, bottom). The peak of the firing rate coincided with the start of the encoded trajectory, where the uncertainty is minimal and the correlation between the firing rate and the forward bias was negative (Figure 4d). The three other coding schemes did not constrain the firing rate of the population to represent probabilistic quantities, and thus the firing rate was independent of the theta phase or the encoded uncertainty.
After demonstrating the specificity of uncertaintyrelated gain modulation to the product scheme, we returned to the experimental dataset and applied the same analysis to neuronal activity recorded from freely navigating rats. We first decoded the spikes during theta oscillation falling in eight overlapping 120° window and aligned the decoded locations relative to the animals’ position and motion direction. We confirmed that the encoded location varied systematically within the theta cycle from the beginning towards the end of the theta cycle both when considering all theta cycles (Figure 4—figure supplement 2) or when constraining the analysis to theta cycles with the highest 5% spike count (Figure 4e, f; Figure 2f, g). Maximal population firing rate coincided with the end of the theta sequences which correspond to future positions characterized by the highest uncertainty (Figure 4f). Thus, a positive correlation emerged between the represented uncertainty and the population gain (Figure 4h). This result was consistent across recording sessions and animals (Figure 4g, h) and was also confirmed in an independent dataset where rats were running on a linear track (Figure 4—figure supplement 3a–d; Grosmark et al., 2016).
This observation is in sharp contract with the prediction of the product encoding scheme where the maximum of the firing rate should be near the beginning of the theta sequences (Figure 4c). The other encoding schemes are neutral about the theta modulation of the firing rate, and therefore they are all consistent with the observed small phasemodulation of the firing rate.
Testing the DDC scheme: diversity
Next, we set out to discriminate the DDC scheme from the mean and the sampling schemes. In the DDC scheme, neuronal firing rate represents the overlap of the basis functions with the encoded distribution (Figure 1e; Zemel et al., 1998; Vértes and Sahani, 2018). Intuitively, in this scheme the diversity of the coactive neurons increases as the variance of the encoded distribution increases, i.e. when the encoded variance is small, only a smaller fraction of basis functions will overlap with the distribution, thus limiting the number of neurons participating in encoding. (Figure 5a). Conversely, a diverse set of neurons becomes coactive when encoding a distribution of high variance (Figure 5a). Therefore, the set of active neurons reflects the width of the encoded probability distribution. This feature of the population code carries information about uncertainty, which can thus be decoded from the population activity (Figure 5—figure supplement 1b). We used a maximum likelihood decoder to estimate the first two moments (mean and SD) of the encoded distribution (Methods, Equation 19). Intuitively, increased uncertainty at later stages of the trajectory is expected to be reflected in a parallel increase in the decoded SD, and we propose to use this systematic increase of decoded SD as a hallmark of DDC encoding.
First, we turned to our synthetic dataset to demonstrate that systematic changes in decoded SD are specific to this scheme. In each of the three remaining coding schemes (mean, DDC and sampling), we divided the population activity to three distinct windows relative to the theta cycle (early, mid, and late). We decoded the mean and the SD of the encoded distribution of trajectories (Figure 5b inset) using the empirical tuning curves in each theta cycle and analysed the systematic changes in the decoded SD values from early to late theta phase (Methods). We found a systematic and significant increase in the decoded SD in the DDC scheme from early to late theta phases (onesided, two sample KolmogorovSmirnov test, $P=1.3\times {10}^{16}$), whereas the decoded SD was independent of the theta phase for the mean and the sampling schemes (KS test, mean scheme: $P=0.98$, sampling scheme: $P=0.95$, Figure 5b, c). This result was robust against using the theta cycles with higher than median spike count for the analysis (Figure 5d, mean: $P=0.99$, DDC: $P=8.2\times {10}^{12}$, sampling: $P=0.93$) or against using the estimated basis functions instead of the empirical tuning curves for decoding (Figure 5—figure supplement 2b; Appendix 2). Thus, our analysis of synthetic data demonstrated that the decoded SD is a reliable measure to discriminate the DDC scheme from sampling or mean encoding.
After testing on synthetic data, we repeated the same analysis on the experimental dataset. We divided each theta cycle into three windows of equal spike counts (early, mid, and late) and decoded the mean and the SD of the encoded trajectories from the population activity in each window. We found that the decoded SDs had a nearly identical distribution at the three theta phases for all recording sessions (Figure 5e). The mean of the decoded SD did not change significantly or consistently across the recording session neither when we analysed all theta cycles (early vs. late, KS test $P>0.62$ for all sessions, Figure 5e, f) nor when we constrained the analysis to the half of the cycles with higher than median spike count (KS test, $P>0.7$ for all sessions, Figure 5g) or when we used the estimated encoding basis functions instead of the empirical tuning curves for decoding (Figure 5—figure supplement 2d). We obtained similar results in a different dataset with rats running on a linear track (Figure 4—figure supplement 3ei; Grosmark et al., 2016). We conclude that there are no signatures of DDC encoding scheme in the hippocampal population activity during theta oscillation. Taken together with the findings on the product scheme, our results indicate that hippocampal neurons encode individual trajectories rather than entire distributions during theta sequences.
Testing the sampling scheme: excess variability
Both the mean scheme and the sampling scheme represent individual trajectories but only the sampling scheme is capable of representing uncertainty. Therefore, a critical question concerns if the two schemes can be distinguished based on the population activity during theta sequences. Samplingbased codes are characterized by large and structured trialtotrial neuronal variability (Orbán et al., 2016). Our results showed a systematic increase in the variability of the decoded location at phases of theta oscillation that correspond to later portions of the trajectory associated with higher uncertainty. This parallel increase of variability and uncertainty could be taken as evidence towards the sampling scheme. However, we demonstrated that the systematic theta phasedependence of the neural variability and the variability of the encoded trajectories is a general feature of predictions in a dynamical model characterizing both the sampling and the mean schemes (Figure 3). In the sampling scheme, the cycletocycle variability is further increased by the sampling variance, that is the stochasticity of the represented trajectory, such that the magnitude of excess variance is proportional to uncertainty. In order to discriminate sampling from the mean scheme we designed a measure, excess variability, that can identify the additional variance resulting from sampling. For this, we partitioned the variability of encoded trajectories such that the full magnitude of variability across theta cycles (cycletocycle variability, CCV) is compared to the amount of variability expected from the animal’s own uncertainty (trajectory encoding error, TEE, Figure 6a, Methods).
We assessed excess variability in our synthetic dataset using either the sampling or the mean representational schemes, that is encoding either sampled trajectories or mean trajectories. Specifically, we decoded the population activity in three separate windows of the theta cycle (early, mid, and late) using a standard static Bayesian decoder and computed the difference between the decoded locations across subsequent theta cycles (cycletocycle variability, CCV) and the difference between the decoded position and the true location of the animal (trajectory encoding error, TEE). As expected, both TEE and CCV increased from early to late theta phase both for mean (Figure 6b, c) and sampling codes (Figure 6d, e, Methods). Our analysis confirmed that it is the magnitude of the increase that is the most informative of the identity of the code: the increase of CCV is more intense within theta cycle than TEE in the case of sampling (Figure 6e) whereas the increase of the TEE is more intense during the theta cycle than that of CCV in the mean encoding scheme (Figure 6c, Methods). To evaluate this distinction in population responses, we quantified the difference between the rate of change of CCV and TEE using the excess variability index (EVindex). The EVindex was negative for mean (Methods, Figure 6f) and positive for sampling schemes (Figure 6g). To test the robustness of the EVindex against various factors influencing the cycletocycle variability, we analyzed potential confounding factors. First, to compensate for the potentially large decoding errors during low spike count theta cycles, we calculated the EVindex both using all theta cycles (Figure 6—figure supplement 1e, f) or only theta cycles with high spike count (Figure 6f, g). Second, we varied randomly the speed and the length of the encoded trajectories (irregular vs. regular trajectories, Figure 6f inset, Methods). Third, we introduced a jitter to the boundaries of the theta cycles in order to model our uncertainty about cycle boundaries in experimental data (jitter 0–40ms, Figure 6g inset, Methods). We found that the EVindex was robust against these manipulations, reliably discriminating samplingbased codes from mean codes across a wide range of parameters. Thus, EVindex can distinguish sampling related excess variability from variability resulting from other sources.
We repeated the same analysis on the dataset recorded from rats exploring the 2D spatial arena. We calculated cycletocycle variability and trajectory encoding error and found that typically both the error and the variability increased during theta (Figure 6h, inset). However, at early theta phase the distributions had high positive skewness due to a few outliers displaying extremely large errors typically at early theta phase (Figure 6h, arrows in the inset). The outliers could reflect an error in the identification of the start of the theta cycles, and the resulting erroneous assignment of spikes that encode a different trajectory caused increased error in the estimation of the starting position of the trajectory. To mitigate this effect, we calculated the EVindex using both the mean and the median across all theta cycles (Figure 6—figure supplement 1j, k) or including only high spike count cycles (Figure 6k, l). We found that the EVindex was consistently and significantly positive for all recording sessions. This analysis supports that the large cycletocycle variability of the encoded trajectories during hippocampal theta sequences (Gupta et al., 2012) is consistent with random sampling from the posterior distribution of possible trajectories in a dynamic generative model.
The consistently positive EVindex across all recording sessions signified that variance in the measured responses was higher at the end of theta cycle than what would be expected from a scheme not encoding uncertainty. In fact, the magnitude of the EVindex was substantially larger when evaluated on real data than in any of our synthetic datasets, including datasets where additional structured variability was introduced through randomly changing the speed or the length of the encoded trajectories (irregular trajectories, Figure 6g) or through additional randomness in the cycle boundaries (jitter in Figure 6g cf., Figure 6k). A potential source for higher excess variability could be taskdependent switching between multiple coexisting maps (Kelemen and Fenton, 2010; Jezek et al., 2011). However, we found that most neurons had identical spatial tuning in the two phases of the task (random foraging versus goal directed navigation; Figure 6—figure supplement 2a–c) and the remaining cells typically showed relatively minor change in their spatial activity across task phases (Figure 6—figure supplement 2; see also Pfeiffer, 2022). Thus, multiple maps are not responsible for the large EVindex. Furthermore, we did not find consistent differences in the EVindex evaluated in the random foraging versus the goal directed search phase of the task (Figure 6—figure supplement 2), implying that specific goals during planning do not modulate the uncertainty of trajectories during theta sequences. Finally, large variability could be indicative of efficient sampling algorithms where correlations between subsequent samples are actively suppressed (MacKay, 2003; Kay et al., 2020; Echeveste et al., 2020). In the final section, we tested this possibility using real and simulated data.
Signature of efficient sampling: generative cycling
The efficiency of a sampling process can be characterized by the number of samples required to cover the target probability distribution. The efficiency can be increased when subsequent samples are generated from distant parts of the distribution making the samples anticorrelated. To test the hypothesis that the magnitude of the EVindex may be indicative of the magnitude of correlation between the samples, we first generated five datasets with varying the degree of correlation between the endpoint of trajectories sampled in subsequent theta cycles (Figure 7a; Methods) and calculated the EVindex for all these datasets. We found that the EVindex varied consistently with the sign and the magnitude of the correlation between the sampled trajectories: The EVindex was negative when positive correlations between the subsequent trajectories reduced the cycletocycle variability (Figure 7b). In this case, persistent sampling biases could not be distinguished from erroneous inference (Beck et al., 2012) and sampling became very similar to mean scheme. Conversely, the EVindex was positive for independent or anticorrelated samples with a substantial increase in the magnitude of the EVindex for strongly anticorrelated samples (Figure 7b). Thus, the large EVindex observed in the data is consistent with an efficient sampling algorithm that preferentially collects diverging trajectories in subsequent theta cycles.
To more directly test the hypothesis that subsequently encoded trajectories are anticorrelated, we explored if future portions of decoded trajectories tend to visit distinct parts of the environments. If sampling is optimized to produce anticorrelated samples then future portions of trajectories are expected to alternate around the real trajectories more intensely than positively correlated trajectories. To formulate this, we calculated the duration of the consistent cycling periods (cycling length, $\mathrm{\ell}$, Figure 7c; Kay et al., 2020). In our synthetic dataset, we found that ch:cyclingthe cycling length took the value one more frequently than in the shuffle control, indicating the absence of alternation even when the samples were strongly anticorrelated (Figure 7d). We identified that the source of this bias is the difference between the observed true position of the animal, and the animal’s subjective location estimate (Figure 7e). Specifically, even when the sampled trajectories alternate around the estimated position, the alternating trajectories often fell into the same side of the real trajectory of the animal if the true and the estimated position do not coincide (Figure 7e). Indeed, the bias was be eliminated when we calculated the alternation ot the encoded trajectories with respect to the animal’s internal location estimate (Figure 7—figure supplement 1). Note that this bias is not present in simpler, 1dimensional environments when the position of the animal is constrained to a linear corridor (Kay et al., 2020). To compensate for this bias, we introduced the cycling index, that ignores the repeats ($\ell =1$) and measures the prevalence of long ($\ell \ge 4$) cycling periods relative to the short alternations ($1<\ell \le 3$; Figure 7f). We validated the cycling index on synthetic data by showing that it correctly distinguishes strongly correlated and anticorrelated settings.
Finally, we analysed the cycling behavior in the experimental data. We found that, similar to our synthetic datasets, repeats ($\ell =1$) were relatively overrepresented in the data compared to short alternations ($\ell =2$, Figure 7g). However, the distribution of cycling length had a long tail and long periods of alternations ($\ell >=4$) were frequent (Figure 7g, see also Figure 7—figure supplement 1c) resulting in a cycling index significantly larger than in shuffle control (Figure 7h). To standardize the cycling index for all sessions, we zscored it using the corresponding shuffle distribution and found that the normalized cycling index was significantly positive for 7 of the 8 recording sessions (Figure 7i) and not different from random in the remaining session. This finding indicates that hypothetical trajectories encoded in subsequent theta cycles tend to alternate between different directions relative to the animal’s true motion trajectory, which is a signature of efficient sampling from the underlying probability distribution.
Analysis of signatures using modelfree trajectories
The core of our analysis was a generative model that we used to obtain distributions of potential trajectories in order to identify signatures of alternative coding schemes. To test if specific properties of the generative model affect these signatures, we designed an analysis in which the same signatures could be tested in a modelfree manner. The original generative model directly provided a way to assess the distributions of past, current, and future potential positions using probabilistic inference in the model. In the modelfree version, we used only the recorded motion trajectories to construct an empirical distribution of hypothetical trajectories at each point in the motion trajectory of the animal. Specifically, for a particular target point on a recorded motion trajectory, we sampled multiple real trajectory segments with similar starting speed and aligned the start of these trajectories with the start of the target trajectory by shifting and rotating the sampled segments (Figure 7—figure supplement 2a). This alignment resulted in a bouquet of trajectories for each actual point along the motion trajectory that outlined an empirical distribution of hypothetical future positions and speeds (Figure 7—figure supplement 2a).
We used the empirical, modelfree distributions of motion trajectories to synthesize spiking data using the four different coding schemes. We then showed that the forward bias and the spread of the decoded locations changed systematically within the theta cycle (Figure 7—figure supplement 2b, c) indicating that theta sequences are similar in this modelfree dataset to real data (Figure 1d, f and g). Next, we repeated identical analyses that were performed with the original, modelbased synthetic trajectories. Our analyses demonstrated that the signatures we introduced to test alternative coding schemes are robust against changes in the way distributions of planned trajectories are constructed (Figure 7—figure supplement 2c–g).
Discussion
In this paper, we demonstrated that the statistical structure of the sequential neural activity during theta oscillation in the hippocampus is consistent with repeatedly performing inference and predictions in a dynamical generative model. Further, we have established a framework to directly contrast competing hypotheses about the way probability distributions can be represented in neural populations. Importantly, new measures were developed to dissociate alternative coding schemes and to identify their hallmarks in the population activity. Reliability and robustness of these measures was validated on multiple synthetic data sets that were designed to match the statistics of neuronal and behavioral activity of recorded animals. Our analysis demonstrated that the neural code in the hippocampus shows hallmarks of probabilistic planning by representing information about uncertainty associated with the encoded positions. Specifically, our analysis has shown that the hippocampal population code displays the signature of efficient sampling of possible motion trajectories but could not identify the hallmarks of alternative proposed schemes for coding uncertainty.
Planning and dynamical models
Hippocampal activity sequences both during theta oscillation and sharp waves have been implicated in planning. During sharp waves, sequential activation of place cells can outline extended trajectories up to 10 m long (Davidson et al., 2009), providing a mechanism suitable for selecting the correct path leading to goal locations (Pfeiffer and Foster, 2013; Mattar and Daw, 2018; Widloski and Foster, 2022). During theta oscillation, sequences typically cover a few tens of centimeters (Figure 2; Wikenheiser and Redish, 2015) and are thus more suitable for surveilling the immediate consequences of the imminent actions. Monitoring future consequences of actions at multiple time scales is a general strategy for animals, humans and autonomous artificial agents alike (Neftci and Averbeck, 2019).
In our model, the dominant source of uncertainty represented by the hippocampal population activity was the stochasticity of the animal’s forthcoming choices. Indeed, it has been observed that before the decision point in a Wmaze the hippocampus also represents hypothetical trajectories not actually taken by the animal (Redish, 2016; Kay et al., 2020; Tang et al., 2021). Here, we generalized this observation to an openfield task and found that monitoring alternative paths is not restricted to decision points but the hippocampus constantly monitors consequences of alternative choices. Disentangling the potential contribution of other sources of uncertainty, including ambiguous sensory inputs (Jezek et al., 2011) or unpredictable environments to hippocampal representations requires analysing population activity during more structured experimental paradigms (Kelemen and Fenton, 2010; Miller et al., 2017).
Although representation of alternative choices in theta sequences is well supported by data, it is not clear how much the encoded trajectories are influenced by the current policy of the animal. On one hand, prospective coding, reversed sequences during backward travel or the modulation of the sequences by goal location indicates that current context influences the content and dynamics of the sequences (Frank et al., 2000; Johnson and Redish, 2007; Cei et al., 2014; Wikenheiser and Redish, 2015). On the other hand, theta sequences represent alternatives with equal probability in binary decision tasks and the content of the sequences is not predictive about the future choice of the animal (Kay et al., 2020; Tang et al., 2021). Our finding that there is no consistent difference between the EVindex in home versus away trials (Figure 6—figure supplement 2) is also consistent with the idea that trajectories are sampled from a wide distribution not influenced strongly by the current goal or policy of the animal. Sampling from a relatively wide proposal distribution is a general motif utilized by several sampling algorithms (MacKay, 2003). Brain areas beyond the hippocampus, such as the prefrontal cortex, might perform additional computations on the array of trajectories sampled during theta oscillations, including rejection of the samples not consistent with the current policy (Tang et al., 2021). The outcome of the sampling process can have profound effect on future behavior as selective abolishment of theta sequences by pharmacological manipulations impairs behavioral performance (Robbe et al., 2006).
Compared to previous models simulating hippocampal place cell activity (McClain et al., 2019; Chadwick et al., 2015), a major novelty in our approach was that its spatial location was not assumed to be known by the animal. Instead, the animal had to estimate its own position and motion trajectory using a probabilistic generative model of the environment. Consequently, the neuronal activity in our simulations was driven by the estimated rather than the true positions. This probabilistic perspective allowed us to identify the sources of the variability of theta sequences, to define quantities (CCV, TEE and EVindex) that can discriminate between sampling and mean schemes and to recognise the origin of biases in generative cycling. Although the fine details of the model are not crucial for our results (Figure 7—figure supplement 2) and our specific parameter choices were motivated mainly to achieve a good match with the real motion and neural data (Figure 3—figure supplements 2 and 3), a probabilistic generative model was necessary for these insights and for the consistent implementation of the inference process. Alternative frameworks, such as the successor representations (Dayan, 1993; Stachenfeld et al., 2017) provide only aggregate predictions in the form of expected future state occupancy averaged across time and intermediate states instead of temporally and spatially detailed predictions on specific future states necessary for generating hypothetical trajectories (Johnson and Redish, 2005; Kay et al., 2020).
We found that excess variability in the experimental data was higher than that in the simulated data sampling independent trajectories in subsequent theta cycles. This high EVindex is consistent with preferentially selecting samples from the opposite lobes of the target distribution (MacKay, 2003) via generative cycling (Kay et al., 2020) leading to low autocorrelation of the samples. Recurrent neural networks can be trained to generate samples with rapidly decaying autocorrelation (Echeveste et al., 2020). Interestingly, these networks were shown to display strong oscillatory activity in the gamma band (Echeveste et al., 2020). Concurrent gamma and theta band activities are characteristics of hippocampus (Colgin, 2016), which indicates that network mechanisms underlying efficient sampling might be exploited by the hippocampal circuitry. Efficient sampling of trajectories instead of static variables could necessitate multiple interacting oscillations where individual trajectories evolve during multiple gamma cycles and alternative trajectories are sampled in subsequent theta cycles.
Circuit mechanisms
Recent theoretical studies established that recurrent neural networks can implement complex nonlinear dynamics (Mastrogiuseppe and Ostojic, 2018; Vyas et al., 2020) including sampling from the posterior distribution of a static generative model (Echeveste et al., 2020). External inputs to the network can efficiently influence the trajectories emerging in the network either by changing the internal state and initiating a new sequence (Kao et al., 2021) or by modulating the internal dynamics influencing the transition structure between the cell assemblies or the represented spatial locations (Mante et al., 2013; Stroud et al., 2018). We speculate that the recurrent network of the CA3 subfield could serve as a neural implementation of the dynamical generative model with inputs from the entorhinal cortex providing strong contextual signals selecting the right map and conveying landmark information necessary for periodic resets at the beginning of the theta cycles.
Although little is known about the role of entorhinal inputs to the CA3 subfield during theta sequences, inputs to the CA1 subfield show functional segregation consistent with this idea. Specifically, inputs to the CA1 network from the entorhinal cortex and from the CA3 region are activated dominantly at distinct phases of the theta cycle and are associated with different bands of gamma oscillations reflecting the engagement of different local microcircuits (Colgin et al., 2009; Schomburg et al., 2014). The entorhinal inputs are most active at early theta phases when they elicit fast gamma oscillations (Schomburg et al., 2014) and these inputs might contribute to the reset of the sequences (FernándezRuiz et al., 2017). Initial part of the theta cycle showing transient backward sequences (Wang et al., 2020) could reflect the effect of external inputs resetting the local network dynamics. Conversely, CA3 inputs to CA1 are coupled to local slow gamma rhythms preferentially occurring at later theta phases, associated with prospective coding and relatively long, temporally compressed paths (Schomburg et al., 2014; Bieri et al., 2014) potentially following the trajectory outlined by the CA3 place cells.
Representations of uncertainty
Uncertainty representation implies that not only a best estimate of an inferred quantity is maintained in the population but properties of a full probability distribution can also be recovered from the activity. Machine learning provides two major classes of computational methods to represent probability distributions: 1, instantaneous representations which rely on a set of parameters to encode a probability distribution; or 2, sequential representations that collect samples from the distributions (MacKay, 2003). Accordingly, theories of neural probabilistic computations fall into these categories: the product scheme and DDC instantaneously, while sampling sequentially represents uncertainty (Pouget et al., 2013; Savin and Deneve, 2014). Our analysis did not find evidence for representing a probability distributions instantaneously during hippocampal theta sequences. Instead, our data is consistent with representing a single location at any given time where uncertainty is encoded sequentially by the variability of the represented locations across time.
Importantly, it is not possible to accurately recover the represented position of the animal from the observed spikes in any of the coding schemes: one can only estimate it with a finite precision, summarized by a posterior distribution over the possible positions. Thus, decoding noisy neural activity naturally leads to a posterior distribution. However, this does not imply that it was actually a probability distribution encoded in the population activity (Zemel et al., 1998; Lange et al., 2020). Specifically, when a scalar variable $x$ is encoded in the spiking activity $s(x)$ of neurons using an exponential family likelihood function, then the resulting code is a linear probabilistic population code (PPC; Ma et al., 2006). In fact, we used an exponential family likelihood function (Poisson) in our mean and sampling scheme, so these schemes belong, by definition, to the PPC family. However, the PPC should not be confused with our product scheme where a target distribution is encoded in the noisy population activity instead of a single variable (Lange et al., 2020).
To test the product scheme, we used the population gain as a hallmark. We found that the gain varied systematically, but the variance was not consistent with the basic statistical principle, that on average uncertainty accumulates when predicting future states. An alternative test would be to estimate both the encoding basis functions and the represented distributions as proposed by Ma et al., 2006 but this would require the precise estimation of stimulus dependent correlations among the neurons.
We also did not find evidence for representing distributions via the DDC scheme during theta sequences. However, the lack of evidence for instantaneous representation of probability distributions in the hippocampus does not rule out that these schemes might be effectively employed by other neuronal systems (Pouget et al., 2013). In particular, on the behavioral time scale when averaging over many theta cycles, the sampling and DDC schemes become equivalent: when we calculate the average firing rate of a neuron that uses the sampling scheme across several theta cycles, it becomes the expectation of its associated encoding basis function under the represented distribution. This way, sampling alternative trajectories in each theta cycle can be interpreted as DDC on the behavioral time scale with all computational advantages of this coding scheme (Vértes and Sahani, 2019). Similarly, sampling potential future trajectories at the theta time scale naturally explains the emergence of successor representations on the behavioral time scale (Stachenfeld et al., 2017).
In standard sampling codes, individual neurons correspond to variables and their activity (membrane potential or firing rate) represent the value of the represented variable which is very efficient for sampling from complex, high dimensional distributions (Fiser et al., 2010; Orbán et al., 2016). Here, we take a population coding approach when individual neurons are associated with encoding basis functions and the population activity collectively encode the value of the variable (Zemel et al., 1998; Savin and Deneve, 2014). This scheme allows the hippocampal activity to efficiently encode the value of a low dimensional variable at high temporal precision.
Recurrent networks can implement parallel chains of sampling from the posterior distribution of static (Savin and Deneve, 2014) or dynamic (Kutschireiter et al., 2017) generative models. Similar to the DDC scheme, these implementations would also encode uncertainty by the increase of the diversity of the coactive neurons. Thus, our data indicates that the hippocampus avoids sampling multiple, parallel chains for representing uncertainty in dynamical models and rather multiplexes sampling in time by collecting several samples subsequently at an accelerated temporal scale.
Our analysis leveraged the systematic increase in the uncertainty of the predicted states with time in dynamical models. The advantage of this approach is that we could analyse 1000s of theta cycles, much more than the typical number of trials in behavioral experiments where uncertainty is varied by manipulating stimulus parameters (e.g. image contrast; Orbán et al., 2016; Walker et al., 2020). Uncertainty could also be manipulated during navigation by changing the amount of available sensory information (Zhang et al., 2014), introducing ambiguity regarding the spatial context (Jezek et al., 2011) or manipulating the volatility of the environment (Miller et al., 2017). Our analysis predicts that the variability across theta cycles will increase systematically after all manipulations causing an increase in the uncertainty regardless of the nature of this manipulation or the shape of the environment. These experiments would also allow a more direct test of our theory by comparing changes in the neuronal activity with a behavioral readout of subjective uncertainty.
Methods
Theory
To study the neural signatures of the probabilistic coding schemes during hippocampal theta sequences, we developed a coherent theoretical framework which assumes that the hippocampus implements a dynamical generative model of the environment. The animal uses this model to estimate its current spatial location and predict possible consequences of its future actions. Since multiple possible positions are consistent with recent sensory inputs and multiple options are available to choose from, representing these alternative possibilities, and their respective probabilities, in the neuronal activity is beneficial for efficient computations. Within this framework, we interpreted trajectories represented by the sequential population activity during theta oscillation as inferences and predictions in the dynamical generative model.
We define three hierarchical levels for this generative process (Figure 3—figure supplement 1a). (1) We modeled the generation of smooth planned trajectories in the twodimensional box, similar to the experimental setup, with a stochastic process. These trajectories represented the intended locations for the animal at discrete time steps. (2) We formulated the generation of motion trajectories via motor commands that are calculated as the difference between the planned trajectory and the position estimated from the sensory inputs. Again, this component was assumed to be stochastic due to noisy observations and motor commands. Calculating the correct motor command required the animal to update its position estimate at each time step and we assumed that the animal also maintained a representation of its inferred past and predicted future trajectory. (3) We modeled the generative process which encodes the represented trajectories by neural activity. Activity of a population of hippocampal neurons was assumed to be generated by either of the four different encoding schemes as described below.
We implemented this generative model to synthesize both locomotion and neural data and used it to test contrasting predictions of different encoding schemes. Importantly, the specific algorithm we used to synthesize motion trajectories and perform inference is not assumed to underlie the algorithmic steps implemented in the hippocampal network, it only provides sample trajectories from the distribution with the right summary statistics. The flexibility of this hierarchical framework enabled us to match qualitatively the experimental data both at the behavioral (Figure 3—figure supplement 2) and the neural (Figure 3—figure supplement 3) level. In the following paragraphs we will describe these levels in detail.
Generation of smooth planned trajectory
The planned trajectory was established at a temporal resolution corresponding to the length of a theta cycle, $\mathrm{\Delta}t=0.1$ s, and spanned a length $T\approx 1$ s providing a set of planned positions, $\overline{\mathbf{x}}}_{n$ for any given theta cycle $n$ (see Table 1 for a list of frequently used symbols). The planned trajectories were initialized from the center of a hypothetical 2 m×2 m box with a random initial velocity. Magnitude and direction of the velocity in subsequent steps were jointly sampled from their respective probability distributions. Specifically, at time step $n$ we first updated the direction of motion by adding a random twodimensional vector of length ${\mu}_{\mathrm{\Delta}v}=3.5$ cm/s to the velocity ${\overline{v}}_{n}$. Next, we changed the speed, the magnitude of ${\overline{v}}_{n}$, according to an OrnsteinUhlenbeck process:
where $\overline{\nu}}_{n$ denotes the speed in theta cycle $n$. We used the parameters ${\mu}_{\nu}=16$ cm/s, ${\tau}_{\nu}=2$ s, ${Q}_{\nu}=\sqrt{2/{\tau}_{\nu}}{\sigma}_{\nu}$ with ${\sigma}_{\nu}=10$ cm/s and $\u03f5\sim \mathcal{N}(0,1)$ when $2\text{cm/s}\le \nu \le 80\text{cm/s}$ and $\u03f5=0$ otherwise. The planned trajectory was generated by discretized integration of the velocity signal:
When the planned trajectory reached the boundary of the box, the trajectory was reflected from the walls by inverting the component of the velocity vector that was perpendicular to the wall. The parameters of the planned trajectory were chosen to approximate the movement of the real rats by the movement of the simulated animal (Figure 3—figure supplement 2).
Importantly, in any theta cycle multiple hypothetical planned trajectories could be generated by resampling the motion direction and the noise term, $\u03f5$ in Equation 1. Moreover, these planned trajectories can be elongated by recursively applying Equations 1 and 2. The planned trajectory influenced the motion of the simulated animal through the external control signal (motor command) as we describe it in the next paragraph.
Generation of motion trajectories
We assumed that the simulated animal aims at following the planned trajectory but does not have access to its own location. Therefore the animal was assumed to infer the location from its noisy sensory inputs. To follow the planned trajectory, the simulated animal calculated motor commands to minimize the deviation between its planned and estimated locations.
To describe the transition between physical locations, $x$, we formulated a model where transitions were a result of motor commands, $u}_{n$. For this, we adopted the standard linear Gaussian state space model:
where ${\epsilon}_{u}$ represented motor noise. The animal only had access to the locationdependent, but noisy sensory observations, $y}_{n$:
where ${\epsilon}_{y}$ is the sensory noise and $Q$ and $R$ are diagonal noise covariance matrices with ${Q}_{ii}=2.25$ cm^{2} and ${R}_{ii}=225$ cm^{2}. The small motor variance was necessary for smooth movements since motor errors accumulate across time (Figure 3—figure supplement 2). Conversely, large sensory variance was efficiently reduced by combining sensory information across different time steps (see below, Equation 6).
Since the location, $x}_{n$, was not observed, inference was required to calculate the motor command. This inference relied on estimates of the location in earlier theta cycles, the motor command, and the current sensory observation. The estimated location was represented by the Gaussian filtering posterior:
This posterior is characterized by the mean estimated location ${\mu}_{n}$ and a covariance, ${\mathrm{\Sigma}}_{n}$, which quantifies the uncertainty of the estimate. These parameters were updated in each time step (theta cycle) using the standard Kalman filter algorithm (Murphy, 2012):
where ${K}_{n}=({\mathrm{\Sigma}}_{n1}+Q){({\mathrm{\Sigma}}_{n1}+Q+R)}^{1}$ is the Kalman gain matrix.
The motor command, $u}_{n$, was calculated by lowpass filtering the deviation between the planned position, ${\overline{x}}_{n}$, and the estimated position of the animal (posterior mean, ${\mu}_{n1}$):
with $\alpha =0.25$ ensuring sufficiently smooth motion trajectories (Figure 3—figure supplement 2). Relationship between the planned position ${\overline{x}}_{n}$, actual position $x}_{n$, the sensory input $y}_{n$, and the estimated location $P({x}_{n}{y}_{1:n},{u}_{1:n})$ is depicted in Figure 3—figure supplement 1b.
To make predictions about future positions, we defined the subjective trajectory of the animal, the distribution of trajectories consistent with all past observations and motor commands: $P({\mathbf{x}}_{n}{y}_{1:n},{u}_{1:n})$. This subjective trajectory is associated with a particular theta cycle: since it is estimated on a cyclebycycle manner we use the index $n$ to distinguish trajectories at different cycles. Here $\mathbf{x}}_{n}\equiv {x}_{(n{n}_{p}):(n+{n}_{f})$ is a trajectory starting $n}_{p$ steps in the past and ending $n}_{f$ steps ahead in the future (Figure 3—figure supplement 1c, Table 1). We call the distribution $P({\mathbf{x}}_{n}{y}_{1:n},{u}_{1:n})$ the trajectory posterior. We sampled trajectories from the posterior distribution by starting each trajectory from the posterior of current position (filtering posterior, Equation 5) and proceeded first backward, sampling from the conditional smoothing posterior, and then forward, sampling from the generative model.
To sample the past component of the trajectory ($m\le n$), we capitalized on the following relationship:
where the first factor on the right hand side of Equation 8 is the filtering posterior (Equation 5) and the second factor is defined by the generative process ( (Equation 3) ) and $\mathrm{\Lambda}={({\mathrm{\Sigma}}_{m1}^{1}+{Q}^{1})}^{1}$. We started each trajectory by sampling its first point independently from the filtering posterior (Equation 5) and applied (Equation 8) recursively to elongate the trajectory backward in time.
To generate samples in the forward direction ($m>n$) we implemented an ancestral sampling approach. First, a hypothetical planned trajectory was generated as in Equations 1 and 2 starting from the last planned location ${\overline{x}}_{n}$. Next, we calculated the hypothetical future motor command, ${u}_{m+1}$ based on the difference between the next planned location ${\overline{x}}_{m+1}$ and current prediction for $m$, $x}_{m$ as in Equation 7. Finally, we sampled the next predicted position from the distribution
To elongate the trajectory further into the future we repeated this procedure multiple times.
We introduce ${\mathit{\mu}}_{n}(\phi )=\mathbb{E}[P({\mathbf{x}}_{n}{y}_{1:n},{u}_{1:n})]$ to denote the average over possible trajectories. $\phi $ indexes different parts of the trajectory and refers to the phase of the theta cycle at which the trajectory unfolds. Similarly, we also defined the covariance of the trajectories, ${\mathbf{\Sigma}}_{n}(\phi )$. We used an approximate, diagonal covariance matrix and ignored the covariances between different trajectories and theta phase. In our simulations we estimated ${\mathit{\mu}}_{n}(\phi )$ and ${\mathbf{\Sigma}}_{n}(\phi )$ from 100 samples both for the past and for the future part of the represented trajectories.
The motion profile of the simulated animal, including the distribution and the autocorrelation of the speed, acceleration and heading was matched to empirical data from real animals (Figure 3—figure supplement 2). Figure 3—figure supplement 1c, d illustrates the inference process in the model by showing a short segment of the true trajectory of the simulated animal centered on its location at time step $n$ as well as trajectories starting in the past and extending into the future sampled from the posterior distribution. As expected, the variance of these hypothetical trajectories ${\mathbf{\Sigma}}_{n}(\phi )$ increased consistently from the past towards the future (from the beginning to the end of the theta cycle; illustrated by the increasing diameter of the ellipses in Figure 3—figure supplement 1d), while their mean ${\mathit{\mu}}_{n}(\phi )$ tracked accurately the true trajectory of the animal (Figure 3—figure supplement 1c, d). Mean trajectories and trajectories sampled from the trajectory posterior at subsequent theta cycles are compared in Figure 3—figure supplement 1e, f and in Figure 3a, b.
To change the correlation between the subsequent trajectories in Figure 7, we first generated 100 candidate trajectories sampled randomly from the posterior in time step (theta cycle) $n$ and calculated $\vartheta}_{n}^{i$, the direction of the endpoint of each of them relative to the current motion direction of the animal. Next, we calculated the absolute circular difference in the endpoint direction between the candidate directions and the direction of the endpoint of the trajectory in the previous theta cycle:
Finally, we chose a single trajectory randomly, where the probability of each candidate trajectory was proportional to a sigmoid function of $\mathrm{\Delta}{\vartheta}_{n}^{i}$
with slope (${\gamma}_{\vartheta}$) and threshold (${\vartheta}_{0}$) parameters controlling the sign and magnitude of the correlation between subsequent samples (Table 2).
Modelfree generation of motion trajectories
In Figure 7—figure supplement 2 we replaced our generative model for potential motion trajectories by sampling trajectory segments from the real trajectory of an animal (R1D2). Specifically, we selected all 1.5 s long motion trajectory segments of continuous running and divided them into 10 quantiles based on the starting speed of the segments. To generate a distribution of potential motion trajectories for a given starting point, we collected 100 randomly sampled trajectory segments from the matching speed quantile as the reference trajectory such that all samples were consistent with the geometry of the environment (i.e. they did not cross the border of the arena). This set of trajectory segments were then used to evaluate the posterior mean trajectory (${\mathit{\mu}}_{n}(\phi )$), the posterior variance of the trajectories (${\mathbf{\Sigma}}_{n}(\phi )$). Note, that the variance of the trajectory segments were zero at the beginning (since all segments were aligned to the same starting point). To avoid unrealistically high firing rates in the product scheme, we added a constant 16 cm^{2} to ${\mathbf{\Sigma}}_{n}(\phi )$. We used these quantities to drive the neuronal activity (see below) in the same way as we used trajectories sampled from the posterior of the generative model.
Encoding the represented trajectory by the firing of place cells
We assumed that in each theta cycle the sequential activity of hippocampal place cells represents the temporally compressed trajectory posterior. The encoded trajectory was assumed to start in the past at the beginning of the theta cycle and arrive to the predicted future states (locations) by the end of the theta cycle (Skaggs et al., 1996; Foster and Wilson, 2007). Each model place cell $i$ was associated with an encoding basis function, ${\varphi}_{i}(x)$, mapping the encoded position to the firing rate of cell $i$. Each basis function was generated as the sum of $K$ Gaussian functions (subfields):
with the following choice of the parameters:
The number of subfields $K$ was sampled from a gamma distribution with parameters $\alpha =0.57$ and $\beta =1/0.14$ (Rich et al., 2014). We included only cells with at least one subfield within the arena ($K\ge 1$).
The location of the subfields (μ) was distributed uniformly within the environment.
The radius of each subfield, $\sigma $, was sampled uniformly between 10 and 30 cm.
The maximum firing rate of each subfield ${\rho}_{ik}$ was chosen uniformly on the range 5–15 Hz.
The baseline firing rate ${\rho}_{0}$ was chosen uniformly on the range 0.1–0.25 Hz.
Examples of the encoding basis functions are shown in Figure 5—figure supplement 1f (top row). Since the encoded positions can be different than the measured positions, the encoding basis function ${\varphi}_{i}(x)$ is not identical to the measured tuning curve ${\psi}_{i}(x)$, which is defined as a mapping from the measured position to the firing rate. The exact relationship between tuning curves and the encoding basis functions depends on the way the estimated location is encoded in the population activity. Tuning curves estimated from the synthetic position data are compared with experimentally recorded place cell tuning curves in Figure 3—figure supplement 3.
Importantly, in our model the activity of place cells was determined by the inferred trajectories and not the motion trajectory of the animal. The way the trajectories were encoded by the activity of place cells was different in the four encoding schemes:
In the mean encoding scheme the instantaneous firing rate of the place cells was determined by the mean of the trajectory posterior
(14) ${\mathit{\lambda}}_{i}(\phi )={\varphi}_{i}(\mathit{\mu}(\phi ))$That is, the cell’s firing rate changed within the theta cycle according to the value of its basis function at the different points of the mean inferred trajectory.
In the product scheme (Ma et al., 2006) the firing rate was controlled both by the posterior mean and variance of the trajectory:
(15) ${\mathit{\lambda}}_{i}(\phi )={\varphi}_{i}(\mathit{\mu}(\phi ))\phantom{\rule{thinmathspace}{0ex}}\frac{{\varsigma}_{0}^{2}}{{\varsigma}^{2}(\phi )}$where $\varsigma}^{2}(\phi )=\sqrt{det\mathbf{\Sigma}(\phi )$ and ${\varsigma}_{0}=5$ cm. This is similar to the mean encoding model, except that the population firing rate is scaled by the inverse of the posterior variance.
In the DDC scheme (Zemel et al., 1998; Vértes and Sahani, 2018) the instantaneous firing rate of cell $i$ is the expectation of the basis function $i$ under the trajectory posterior at the encoded time point (that is, the overlap between the basis function and the posterior):
(16) ${\mathit{\lambda}}_{i}(\phi )=\int {\varphi}_{i}(\mathbf{x})\phantom{\rule{thinmathspace}{0ex}}P(\mathbf{x}(\phi ){y}_{1:n},{u}_{1:n})\phantom{\rule{thinmathspace}{0ex}}dx$In the sampling scheme, the encoded trajectory was sampled from the trajectory posterior, $\stackrel{~}{\mathbf{x}}(\phi )\sim P({\mathbf{x}}_{n}{y}_{1:n},{u}_{1:n})$ and the instantaneous firing rate was the function of the sampled trajectory:
(17) ${\mathit{\lambda}}_{i}(\phi )={\varphi}_{i}(\stackrel{~}{\mathbf{x}}(\phi ))$
In each encoding model, spikes were generated from the instantaneous firing rate ${\lambda}_{i}(\phi )$ as an inhomogeneous Poisson process:
Decoding
To discriminate encoding schemes from each other we decoded position information both from experimentally recorded and from synthesized hippocampal neuronal activity. We performed decoding based on two different assumptions: First, assuming that a single position is encoded by the population activity (consistent with the mean and sampling schemes). Second, assuming that a distribution is encoded via the DDC scheme. We used static decoders to ensure that the variance of the decoder is independent of the theta phase as opposed to dynamic decoders, where the variance can be larger around the boundaries of the decoded segments.
Single point decoding
We performed static Bayesian decoding independently in each temporal bin at different phases of the theta cycle. The estimated position at theta phase $\phi $ is the mean of the posterior distribution calculated using Bayes rule:
where the prior $P(x)$ was the estimated occupancy map and we used a Poisson likelihood with spatial tuning curves ${\psi}_{i}(x)$, estimated from the smoothed (10 cm Gaussian kernel) and binned spike counts. We binned spikes either into windows of fixed duration (20ms, Figure 2a–c) fixed theta phase (120°, Figure 3h and Figure 4) or into three bins with equal number of spike counts within a theta cycle (Figures 5 and 6 and everywhere else). When calculating the spread of the decoded locations (Figure 2g, Figure 3h), we controlled for the possible biases introduced by theta phase modulation of the firing rates by randomly downsampling the data to match the spike count histograms across theta phase.
DDC decoding
The DDC decoder assumes that at each time point in the theta cycle an isotropic Gaussian distribution over the locations is encoded by the population activity via Equation 16 and we aim at recovering the mean and the variance of the encoded distribution. To ensure that the theta phase dependence of the firing rates does not introduce a bias in the decoded variance, we divided each theta cycle into three windows (early, middle and late) with equal number of spikes. As linear decoding of DDC codes (Vértes and Sahani, 2018) from spikes is very inaccurate, we performed maximum likelihood decoding of the parameters of the encoded distribution. The estimated mean and the variance in bin $\phi $ is:
where
Here, ${\varphi}_{i}(x)$ is the basis function of neuron $i$ used in the encoding process (Equation 16). We numerically searched for the maximum likelihood parameters with constraints $\mu \in (0,200)$ (cm) and $\sigma \in (3,200)$ (cm) using quasiNewton optimizer and a finitedifference approximation for the gradients.
In practice we do not have access to the encoding basis functions, ${\varphi}_{i}(x)$, only to the empirical tuning curves, $\psi (x)$. Based on the synthetic data we found that the tuning curves are typically more dispersed than the basis functions since the encoded and the measured location is not identical or the encoded distributions have nonzero variance (Figure 5—figure supplement 1e, g, Appendix 2). The difference between the size of the basis functions used for encoding and decoding introduces a bias in decoding the variance of the distribution (Figure 5—figure supplement 1a, c). To reduce this bias, we devised a nonparametric deconvolution algorithm that could estimate the encoding basis functions from the empirical tuning curves (Figure 5—figure supplement 1d–f, Appendix 2). Although we demonstrated on synthetic data that the bias of the decoder can be eliminated by using these estimated basis functions (Figure 5—figure supplement 1d), we obtained qualitatively similar decoding results with either the estimated basis functions or the empirical tuning curves. Therefore in Figure 5 we show DDC decoding results obtained using the empirical tuning curves and show decoding with the estimated basis functions in Figure 5—figure supplement 2 (see also Appendix 2).
Analysis of trajectory variability
To discriminate the mean scheme from the sampling scheme we introduced the excess variability index (EVindex). The EVindex is the difference between the cycletocycle variability and the trajectory encoding error and is positive for sampling and negative for the mean scheme. In this section, we provide definitions for these quantities, derive their expected value for the sampling and the mean scheme and show how the EVindex can be estimated from data.
Cycletocycle variability (CCV)
We defined cycletocycle variability ($\chi $) as the difference between the trajectories decoded from the neural activity in two subsequent theta cycles (Figure 3—figure supplement 1f):
where $\phi $ is the theta phase and the index $i$ runs over the 2 dimensions. As we show in Appendix 3, for the mean encoding scheme the expected value of the cycletocycle variability is the sum of two terms:
where ${\zeta}^{2}=\mathbb{E}[\sum _{i}({\mathit{\mu}}_{n}^{i}\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}{\mathit{\mu}}_{n1}^{i}{)}^{2}]$ is the expected change of the encoded mean trajectory between subsequent theta cycles, ${\u03f5}^{2}$ is the error of the decoder reflecting the finite number of observed neurons in the population and their stochastic spiking and the expectation runs across theta cycles. Since each theta cycle is divided into equal spike count bins, the decoder variance is independent of the theta phase $\phi $. Conversely, ${\zeta}^{2}$ increases with $\phi $ (Figure 3—figure supplement 1g) as new observations have larger effect on uncertain future predictions than on the estimated past positions.
Although our derivations use the expected value (mean across theta cycles), in practice we often found that the median is more robust to outliers and thus in Figure 6 and Figure 6—figure supplement 1 we also show results with median next to the mean.
When estimating the cycletocycle variability for the sampling scheme, where the population activity encodes independent samples drawn from the trajectory posterior, an additional term appears:
Here, ${\sigma}^{2}(\phi )=\mathbb{E}[\sum _{i}{\mathbf{\Sigma}}^{i}(\phi )]$ is the total posterior variance, which reflects the variance coming from the uncertainty of the inference. In our synthetic dataset, we found that the trajectory change is proportional to the posterior variance:
with the proportionality constant $0<\alpha <1$ (Figure 3—figure supplement 1g). Using this insight, we can simplify our treatment: by substituting Equation 24 into Equations 22 and 23, we can see that in both coding schemes the cycletocycle variability increases with the theta phase, and the magnitude of this increase, proportional to the total posterior variance ${\sigma}^{2}(\phi )$, can discriminate the two coding schemes. In order to obtain an independent estimate of ${\sigma}^{2}(\phi )$, we can exploit insights obtained from synthetic data and introduce another measure, the trajectory encoding error.
Trajectory encoding error (TEE)
We defined trajectory encoding error ($\gamma $) as the expected difference between the true twodimensional trajectory of the rat, $\mathbf{x}}_{n$, and the trajectory decoded from the neural activity ${\widehat{\mathbf{x}}}_{n}$ (Figure 3—figure supplement 1e):
When comparing decoded and physical trajectories, we assumed a fixed correspondence between theta phase $\phi $ and temporal shift along the true trajectory. Specifically, for each animal we first calculated the average decoding error for early, mid and late phase spikes with respect to the true position temporally shifted along the motion trajectory (Figure 4—figure supplement 2). Next, we determined the temporal shift $\mathrm{\Delta}t$ that minimized the decoding error separately for early, mid, and late theta phases and used this $\mathrm{\Delta}t$ to calculate $\gamma $.
In the case of mean encoding the trajectory encoding error is the sum of two terms:
where we used the fact that the encoded trajectory is the mean ${\mu}_{n}(\phi )$, and if the model of the animal is consistent, then the expected difference between the posterior mean and the true location equals the variance of the posterior, $\mathbb{E}[\sum _{i}({\mathbf{x}}_{n}^{i}{\mathit{\mu}}_{n}^{i}{)}^{2}]={\mathit{\sigma}}^{2}$.
In the case of sampling the encoded trajectory is ${\stackrel{~}{\mathbf{x}}}_{n}$ and the trajectory encoding error is increased by the difference between the mean and the sampled trajectory:
Decoding error
To directly compare cycletocycle variability with trajectory encoding error we have to estimate the decoding error, ${\u03f5}^{2}$. A lower bound to the decoding error is given by the CramérRao bound (Dayan and Abbott, 2001), but in our simulations the actual decoding error was slightly larger than this bound. Underestimating the decoding error would bias the excess variability towards more positive values (see below), which we wanted to avoid as it would provide false evidence for the sampling scheme.
To obtain a reasonable upper bound instead, we note, that both $\chi $ and $\gamma $ were evaluated in three different phases of the theta cycles: early (${\phi}_{1}$), mid (${\phi}_{2}$) and late (${\phi}_{3}$). At early theta phases when encoding past positions the posterior variance ${\sigma}^{2}$ is small and thus both $\chi $ and $\gamma $ are dominated by ${\u03f5}^{2}$. Thus, we estimated the decoding error from the measured cycletocycle variability and trajectory encoding error at early theta phase.
Furthermore, to compare cycletocycle variability with trajectory encoding error we defined the compensated cycletocycle variability and trajectory encoding error by subtracting the estimated decoding error:
Excess variability
We define the excess variability as the difference between ${\chi}^{\prime}$ and ${\gamma}^{\prime}$:
Substituting Equations 22 and 23, Equations 26 and 27 and Equations 28 and 29 to Equation 30 we can obtain the expectation of the excess variability in the sampling and the mean encoding scheme:
The excess variability is positive for sampling and negative for mean encoding. As $\mathit{\sigma}}^{2$ is expected to increase within a theta cycle, the excess variability is most distinctive at late theta phases. Therefore, throughout the paper we defined the EVindex as the excess variability at late theta phases:
Importantly, all terms in Equation 33 can be measured directly from the data.
We calculated the EVindex either using all theta cycles (Figure 6—figure supplement 1) or using theta cycles with high spike counts in order to mitigate the effect of large decoding error (Figure 6). To reduce the effect of outliers, we also reported the median of the EVindex in Figure 6l and k. Error bars on the EVindex (Figure 6f–g and k, Figure 6—figure supplement 1ef, j) reflect the standard error across the theta cycles. When showing the median across the theta cycles (Figure 6l, Figure 6—figure supplement 1k), the error bars indicate the 5% and 95% confidence intervals estimated by bootstrapping. Specifically, we obtained 1000 pseudodatasets by discarding randomly selected half of the theta cycles and calculating the EVindex of the remaining data. The statistical significance of the EVindex was tested using one sample ttests (Figure 6f, g and k; Figure 6—figure supplement 1e, f and j) or bootstrapping (Figure 6l and Figure 6—figure supplement 1k). The resulting pvalues are shown in Tables 3 and 4.
Data analysis
Processing experimental data
To test the predictions of the theory, we analysed the dataset recorded by Pfeiffer and Foster, 2013. In short, rats were required to collect food reward from one of the 36 uniformly distributed food wells alternating between random forging and spatial memory task. The rat’s position and head direction were determined via two distinctly colored, headmounted LEDs recorded by an overhead video camera and digitized at 30 Hz. Neural activity was recorded by 40 independently adjustable tetrodes targeting the left and the right hippocampi. Local field potential (LFP) was recorded on one representative electrode, digitally filtered between 0.1 and 500 Hz and recorded at 3,255 Hz. Individual units were identified by manual clustering based on spike waveform peak amplitudes based on the signals digitalized at 32,556 Hz as in Pfeiffer and Foster, 2013.
The raw position signal was filtered with a 250ms Gaussian kernel and instantaneous speed and motion direction was calculated from the smooth position signal. We restricted the analysis to run periods with $v>5$ cm/s for at least 1 s separated by stops with duration of $\delta t>0.2$ s. In this study we included only putative excitatory neurons on the basis of spike width and mean firing rate (Pfeiffer and Foster, 2013) with at least 200 spikes during the analysed run epochs. Position was binned (5 cm) and spatial occupancy map was calculated as the smoothed (10 cm Gaussian kernel) histogram of the time spent in each bin. Position tuning curves (ratemaps, $\psi (x)$) were calculated as the smoothed (10 cm Gaussian kernel) histogram of firing activity normalized by the occupancy map and we used ${\psi}_{\text{min}}=0.1$ Hz wherever $\psi (x)<0.1$ Hz.
Theta phase was calculated by applying Hilbert transformation on band pass filtered (4–12 Hz) LFP signal. To find the starting phase of the theta sequences in each animal, we calculated the forward bias by decoding spikes using 120° windows advanced in 45° increments (Figure 4h). The forward (lateral) bias of the decoder was defined as the average of the error between the decoded and the actual position of the animal parallel (perpendicular) to the motion direction of the animal. Theta start was defined as 135° after the peak of the cosine function (Figure 4f) fitted to the forward bias. We used two complementary strategies to avoid biases related to systemic changes in the firing rate during decoding: (1) When analysing early, mid and late theta phases, we divided each theta cycle into three periods of equal spike count. (2) When calculating decoding spread, we subsampled the data in order to match the distribution of spike counts across theta phases. The decoding spread was defined as $det{(\mathrm{\Sigma})}^{1/4}$ where $\mathrm{\Sigma}$ is the covariance matrix of the decoded positions.
To calculate the theta phase dependence of the place field size (Figure 2e) we estimated the place fields in the three theta phase (early, mid and late) using the position of the rat shifted in time with $\mathrm{\Delta}t(\phi )$ that minimized the median error between the decoded and temporally shifted position in that session (Figure 2d). Similarly, when we calculated trajectory encoding error, we compared the decoded position in each phase bin to the real position shifted in time with the same $\mathrm{\Delta}t(\phi )$.
To compare place fields between home and away conditions, we first estimated separately the neuronal tuning curves for the two trial types. Home trials where the rat did not reach the goal location within 100 s were excluded from the analysis and we only included neurons with average firing rate higher than 0.1 Hz. We zscored each tuning curve using the mean and variance of the tuning curve estimated from all trials. This step allowed us to compare changes in the tuning curves across cells (see below).
Next, we compared the normalized tuning curve changes between the home and away trials (${\delta}_{HA}$) to two different controls. First, we randomly split the data into trials of two types 100 times and estimated normalized tuning curves from the random splits. This provided a baseline measure of the tuning curve change ${\delta}_{\text{shuffle}}$. Second, we obtained an upper bound on the expected tuning curve changes by comparing normalized home and away tuning curves of different neurons, ${\delta}_{\text{indep}}$. Neurons with the average difference between the tuning curves across random splits of the data larger than the average difference across cells ($\mathbb{E}[{\delta}_{\mathrm{s}\mathrm{h}\mathrm{u}\mathrm{f}\mathrm{f}\mathrm{l}\mathrm{e}}]>\mathbb{E}[{\delta}_{\mathrm{i}\mathrm{n}\mathrm{d}\mathrm{e}\mathrm{p}}]$) were considered unreliable and were excluded from further analysis. Figure 6—figure supplement 2 shows the proportion of cells in each recording session where ${\delta}_{HA}$ was larger than the 95% of distribution of ${\delta}_{\text{shuffle}}$, the difference between random splits of the data. To quantify the magnitude of remapping between home and away trials, we calculated the remapping index:
which is 0 when the observed difference between home and away tuning curves (${\delta}_{HA}$) equals the mean tuning curve change in the shuffle control ($E[{\delta}_{\text{shuffle}}]$) and is 1 when the magnitude of tuning curve change is similar to the difference between the tuning curves of independent neurons ($E[{\delta}_{\text{indep}}]$).
Processing synthetic data
We simulated the movement of the animal using Equations 1–7 in a $2\times 2$ m open arena, except for Figure 3d, e, where we used a $2\times 0.1$ m long linear track. We used the same procedure to analyse the synthetic data as we applied to the experimental data: we filtered the raw position signal, calculated the speed and motion direction and estimated the spatial occupancy maps and position tuning curves from the generated spikes. Importantly, we used these estimated tuning curves and not the true, noiseless encoding basis functions for decoding position from spikes in the synthetic data. In our model, each theta cycle was 100ms long (but also see an alternative variant below) and the encoded trajectory spanned 2 s centered on the current position of the animal (regular theta sequences).
To test the robustness of the EVindex to variations across theta cycles (Figure 6f, g), we added variability to the theta cycles in two different ways: First, the duration and the content of each theta cycle was varied stochastically. Specifically, for the simulations using irregular theta sequences in Figure 6f, g, we varied the duration of each theta cycle (80–160 ms) together with the total length of the encoded trajectory (1–3 s) with constraining the encoded trajectory to start in the past and end in the future. Second, a uniformly distributed random jitter (0–40 ms) was added to the true time of the theta cycle boundaries before binning the spikes according to their theta phase.
Appendix 1
Product coding scheme
In a product representation, a probability distribution over a feature, such as the position $x$, is encoded by the product of the encoding basis functions, ${\varphi}_{i}(x)$:
where is the number of spikes fired by neuron $i$ and $\eta (\mathbf{s})=\int {\varphi}_{0}(x)\prod _{i}{\varphi}_{i}(x{)}^{{s}_{i}}\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}x$ is the normalizing factor. Alternatively, by defining a set of sufficient statistics, ${h}_{i}(x)=\mathrm{log}{\varphi}_{i}(x)$, the same distribution can be expressed in the standard, exponential form:
where the neuronal spikes serve as the canonical or exponential parameters of the distribution (Wainwright and Jordan, 2007). Intuitively, increasing the number of spikes $\mathbf{s}$ in the population adds more and more terms in Equations 35; 36 leading to an increase in the sharpness of the represented distribution. In the special case where the tuning functions are Gaussians with variances ${\sigma}_{i}^{2}$ and the likelihood is conditionally independent Poisson, the represented distribution is also Gaussian with a variance, ${\sigma}^{2}(\mathbf{s})$ that is inversely proportional to the total spike count (Ma et al., 2006):
Here we show numerically that this relationship still holds for more general, multimodal basis functions, similar to the empirical tuning curves found in our dataset (Figure 2b). We used 110 tuning curves from an example session with mean firing rate $\overline{r}>0.1$ Hz and tuning curve peak ${r}_{\mathrm{m}\mathrm{a}\mathrm{x}}>10$ Hz (Figure 4—figure supplement 1a), we normalised the tuning curves and used their logarithm as sufficient statistics to encode distributions in the form of Equation 36:
We used isotropic 2dimensional Gaussians with a wide range of mean (20 cm $\ge \mu \ge $ 180 cm) and variance ($4\phantom{\rule{thinmathspace}{0ex}}\mathrm{c}\mathrm{m}}^{2}\phantom{\rule{thinmathspace}{0ex}}\ge {\sigma}^{2}\ge \phantom{\rule{thinmathspace}{0ex}}{1600\phantom{\rule{thinmathspace}{0ex}}\mathrm{c}\mathrm{m}}^{2$) as target distributions and encoded them in the spiking activity of the neurons. Equation 36 defines how a set of observed spikes (canonical parameters) can be interpreted as encoding a probability distribution. However, this mapping is not invertible and finding a set of spikes approximating an arbitrary target distribution is computationally challenging. We solved this problem by iteratively and greedily selecting spikes to minimise the KullbackLeibler divergence between the encoded distribution, $Q(x)\propto \mathrm{exp}\left({\sum}_{i}{s}_{i}{h}_{i}(x)\right)$, and the Gaussian target distribution, $P(x)$. Specifically, we first selected a single neuron whose spike minimised ${\mathrm{D}}_{\mathrm{KL}}(Q,P)$, then added more spikes one by one until the additional spikes stopped decreasing ${\mathrm{D}}_{\mathrm{KL}}(Q,P)$.
We found that the approximations were reasonably accurate, especially for distributions with smaller variance (Figure 4—figure supplement 1b). Importantly, the average number of spikes used for encoding decreased systematically with the variance of the encoded distribution (Figure 4—figure supplement 1c). Moreover, we found that even for highly variable basis functions, the total spike count in the population was well approximated as a linear function of the target precision, as expected from the theory for uniform basis functions (Figure 4—figure supplement 1d; Equation 37). Therefore we used the systematic variation of the total spike count in the population (population gain) as a function of the represented uncertainty as a hallmark of the product scheme.
When encoding trajectories in the population activity we used a simplified encoding model and scaled the gain of the neurons by the instantaneous variance of the encoded distribution (Equation 15).
Appendix 2
DDC coding scheme
In the DDC coding scheme the firing rate of neuron $i$ corresponds to the expectation of its basis function under the encoded distribution (Zemel et al., 1998; Vértes and Sahani, 2018):
This encoding scheme has complementary properties to the product scheme, as it represents exponential family distributions using their mean parameters instead of their canonical, natural parameters (Wainwright and Jordan, 2007). Equation 39 defines a mapping from a distribution to firing rates, but the reverse mapping, from the rates to the encoded distribution is not trivial. When the firing rates are observed then the expectation of any nonlinear function $f(\cdot )$ under the encoded distribution, including its moments, the mean and the covariance, can be calculated as a linear combination of the rates (Vértes and Sahani, 2018):
where $f(x)\approx {\sum}_{i}{\alpha}_{i}{\varphi}_{i}(x)$. However, in the experimental data only the discrete spike counts are observed and the underlying firing rates are hidden. In this case Equation 40 can be computed as an expectation under the posterior of the firing rates $P(\mathit{\lambda}\mathbf{s}\mathbf{)}$:
with
Here the prior over the firing rates $P(\mathit{\lambda})$ (the distribution of coactivity patterns among neurons with different tuning curves or place fields) is usually assumed to be factorised for computational efficiency. However, correlations between neuronal tuning curves introduce strong dependence in the prior and thus using a factorised approximation of Equation 41 leads to a substantial loss of information (Ujfalussy et al., 2015). Therefore, instead of following this Bayesian approach, we aimed at directly estimating the parameters of the distribution encoded by the spikes observed in a neuronal population using a maximum likelihood decoder (Equation 19).
We tested the performance of the maximum likelihood DDC decoder by encoding isotropic 2dimensional Gaussian distributions with a wide range of mean (20 cm $\ge \mu \ge $ 180 cm) and variance ($40\phantom{\rule{thinmathspace}{0ex}}\mathrm{c}\mathrm{m}}^{2}\phantom{\rule{thinmathspace}{0ex}}\ge {\sigma}^{2}\ge \phantom{\rule{thinmathspace}{0ex}}400\phantom{\rule{thinmathspace}{0ex}}\mathrm{c}{\mathrm{m}}^{2$) as target in the spiking activity of a neural population using 200 encoding basis functions similar to the empirical tuning curves recorded experimentally (Figure 6—figure supplement 1, top). We calculated the firing rate of the neurons according to Equation 39 and generated Poisson spike counts in $\mathrm{\Delta}t=20100$ ms time bins. We found that on average, we could accurately estimate the encoded mean and SD of the target distributions when we used the same basis functions during encoding and decoding at least for $\mathrm{\Delta}t\ge 50$ ms. (Figure 5—figure supplement 1b). However, in practice, we do not have access to the encoding basis functions, only the empirical tuning curves, substantially wider than the basis functions (Figure 5—figure supplement 1e). When we used the wider tuning curves for decoding, the estimate of the SD became substantially biased: the decoded SD could be on average 5–7 cm below the target SD even for $\mathrm{\Delta}t=100$ ms (Figure 5—figure supplement 1e).
To identify DDC scheme in the data it is important to be able to reduce this decoding bias and accurately estimate the SD of the encoded distribution even in small time windows. We speculated that this bias is due to the difference between the wider empirical tuning curves used for decoding and the narrower basis functions used for encoding the target distributions. This effect is caused by encoding locations other than the actual position of the animal and by encoding distributions with nonzero variance using the DDC scheme. The decoder tries to match the overlap between the tuning functions and the estimated distribution with the observed spike counts. When the tuning functions used for DDCdecoding are wider than the encoding tuning functions, the decoder will compensate this by choosing a narrower decoded distribution to achieve a similar overlap with the decoding tuning functions (Figure 5—figure supplement 1a). Thus, the decoder will be biased towards lower variances, leading to systematic underestimation of the variance of the encoded distribution.
In order to reduce this bias we aimed at estimating the true encoding basis functions from the recorded neuronal activity. Note, that deriving an optimal estimator is hindered by the fact that in Equation 39 neither the tuning functions nor the encoded distributions or the true firing rates are observed. Therefore we developed an approximate algorithm to sharpen the empirical tuning curves and reconstruct the original tuning functions. The aim was to find a flexible algorithm that preserves the relative mass of different modes of the tuning curves but sharpens all of them to a similar degree. Our method was inspired by particle filtering and the use of annealing methods in sampling (Murray, 2007): we first generated a set of samples from the empirical tuning curves, and then added random noise to propagate the samples towards the desired distribution. Specifically, we used the following procedure to sharpen the tuning curves:
We generated $N=10,000$ samples from the empirical tuning curve $z\sim P(z)=\frac{1}{\eta}{\psi}_{i}(z)$, where $\eta =\int {\psi}_{i}(z)dz$ is the normalization constant to convert the tuning curve to a distribution.
We added a Gaussian random noise to the samples ${z}^{\mathrm{\prime}}=z+\u03f5$ with $\u03f5\sim \mathcal{N}(0,{\sigma}^{2})$ where the proposal variance, ${\sigma}^{2}$ controls the degree of mixing between the different modes. We used $\sigma =10$ cm.
We accepted the new sample ${z}_{k}^{\prime}$ with probability ${p}_{k}=\mathrm{min}(1,{\left(P({z}_{k}^{\prime})/P({z}_{k})\right)}^{\beta})$ where the parameter $\beta $ controls the sign and degree of sharpening. We used $\beta =3$.
We repeated steps 2. and 3. for a number of iterations, not necessarily until convergence.
Finally, we defined the estimated basis functions as the smoothed (5 cm Gaussian kernel) histogram of the number of samples in 5 cm spatial bins.
Figure 5—figure supplement 1g shows the tuning functions of a few example cells from our synthetic dataset (top row) together with the measured tuning curves (second row) and the result of the sharpening algorithm after 3 and 6 steps (3rd and 4th rows). We evaluated the sharpening algorithm by comparing the relative size of the estimated and the original tuning functions and the average difference between them. We found that 6 iterations of the algorithm eliminates overestimation bias of the tuning curve size (Figure 5—figure supplement 1e) and 3–6 iterations minimized the estimation error (Figure 5—figure supplement 1f). Using the 3step sharpened tuning curves also improved the performance of the DDCdecoder as it substantially reduced the bias when decoding the SD of the distribution (Figure 5—figure supplement 1d). Therefore we repeated the DDC decoding analysis shown in Figure 5 using the 3step sharpened tuning curves in Figure 5—figure supplement 2.
Even with this correction, trial by trial decoding of the represented distributions was not possible on short time scales. However, when averaged across 1000s of theta cycles, the decoder became sufficiently sensitive to the encoded SD to identify systematic changes in the represented uncertainty. Therefore, we used the average SD decoded from the neural activity as a hallmark of DDC encoding.
Appendix 3
Background for the calculation of EVindex
Cycletocycle variability
Here we derive the expectation of Equation 21 in the case of mean encoding scheme, when the population activity encodes the posterior mean trajectory ${\mathit{\mu}}_{n}$:
where ${\zeta}^{2}=\mathbb{E}[\sum _{i}({\mathit{\mu}}_{n}^{i}{\mathit{\mu}}_{n1}^{i}{)}^{2}]$ is the expected change of the posterior mean trajectory due to sensory inputs, and we omitted second order terms including the correlations between the decoding errors in subsequent theta cycles and the correlation between decoding error and trajectory change.
The same expectation in the case of sampling, when the encoded location ${\stackrel{~}{\mathbf{x}}}_{n}$ is sampled from the posterior:
where ${\mathit{\sigma}}^{2}=\mathbb{E}[\sum _{i}{\mathbf{\Sigma}}^{i}]$ measures the total variance of the posterior and we omitted all second order terms. In our numerical simulations we found that the contribution of second order terms are small in most cases, except for the bordereffect which captures the correlation of the decoding error in two subsequent theta cycles. The relatively strong positive correlation is explained by the bias of the decoder near the borders of the arena. Importantly, the contribution of this term is independent of theta phase and thus does not influence our analysis. We also assumed that the sample autocorrelation is near zero i.e., the samples drawn in subsequent theta cycles are independent.
Data availability
The current manuscript is a computational study, so no data have been generated for this manuscript. The (R) code and parameters for the simulations and analysis are available at https://github.com/bbujfalussy/tSeq (copy archived at swh:1:rev:6d131540ba45e56b46f07aba1d0cc211114d0a6d).

Collaborative Research in Computational NeuroscienceRecordings from hippocampal area CA1, PRE, during and POST novel spatial learning.https://doi.org/10.6080/K0862DC5
References

Reversed theta sequences of hippocampal cell assemblies during backward travelNature Neuroscience 17:719–724.https://doi.org/10.1038/nn.3698

Rhythms of the hippocampal networkNature Reviews. Neuroscience 17:239–249.https://doi.org/10.1038/nrn.2016.21

Statistically optimal perception and learning: from behavior to neural representationsTrends in Cognitive Sciences 14:119–130.https://doi.org/10.1016/j.tics.2010.01.003

DataRecordings from hippocampal area CA1, PRE, during and POST novel spatial learningCollaborative Research in Computational Neuroscience.https://doi.org/10.6080/K0862DC5

Segmentation of spatial experience by hippocampal θ sequencesNature Neuroscience 15:1032–1039.https://doi.org/10.1038/nn.3138

Formalizing planning and information search in naturalistic decisionmakingNature Neuroscience 24:1051–1064.https://doi.org/10.1038/s4159302100866w

Neural ensembles in CA3 transiently encode paths forward of the animal at a decision pointThe Journal of Neuroscience 27:12176–12189.https://doi.org/10.1523/JNEUROSCI.376107.2007

Bayesian inference with probabilistic population codesNature Neuroscience 9:1432–1438.https://doi.org/10.1038/nn1790

BookInformation Theory, Inference, and Learning AlgorithmsCambridge University Press.

Prioritized memory access explains planning and hippocampal replayNature Neuroscience 21:1609–1617.https://doi.org/10.1038/s415930180232z

Dorsal hippocampus contributes to modelbased planningNature Neuroscience 20:1269–1276.https://doi.org/10.1038/nn.4613

Reinforcement learning in artificial and biological systemsNature Machine Intelligence 1:133–143.https://doi.org/10.1038/s4225601900254

Probabilistic brains: knowns and unknownsNature Neuroscience 16:1170–1178.https://doi.org/10.1038/nn.3495

Vicarious trial and errorNature Reviews. Neuroscience 17:147–159.https://doi.org/10.1038/nrn.2015.30

Cannabinoids reveal importance of spike timing coordination in hippocampal functionNature Neuroscience 9:1526–1533.https://doi.org/10.1038/nn1801

ConferenceSpatiotemporal representations of uncertainty in spiking neural networksIn Proceedings of the 27th International Conference on Neural Information Processing Systems.

Optimizing the depth and the direction of prospective planning using information valuesPLOS Computational Biology 15:e1006827.https://doi.org/10.1371/journal.pcbi.1006827

The hippocampus as a predictive MAPNature Neuroscience 20:1643–1653.https://doi.org/10.1038/nn.4650

Motor primitives in space and time via targeted gain modulation in cortical networksNature Neuroscience 21:1774–1783.https://doi.org/10.1038/s4159301802760

Computation through neural population dynamicsAnnual Review of Neuroscience 43:249–275.https://doi.org/10.1146/annurevneuro092619094115

Graphical models, exponential families, and variational inferenceFoundations and Trends in Machine Learning 1:1–305.https://doi.org/10.1561/2200000001

A neural basis of probabilistic computation in visual cortexNature Neuroscience 23:122–129.https://doi.org/10.1038/s4159301905545

Hippocampal theta sequences reflect current goalsNature Neuroscience 18:289–294.https://doi.org/10.1038/nn.3909

Probabilistic interpretation of population codesNeural Computation 10:403–430.https://doi.org/10.1162/089976698300017818

Spatial representations of place cells in darkness are supported by path integration and border informationFrontiers in Behavioral Neuroscience 8:222.https://doi.org/10.3389/fnbeh.2014.00222
Decision letter

Adrien PeyracheReviewing Editor; McGill University, Canada

Laura L ColginSenior Editor; University of Texas at Austin, United States
Our editorial process produces two outputs: (i) public reviews designed to be posted alongside the preprint for the benefit of readers; (ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.
Decision letter after peer review:
Thank you for submitting your article "Sampling motion trajectories during hippocampal theta sequences" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Laura Colgin as the Senior Editor. The reviewers have opted to remain anonymous.
The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.
Essential revisions:
1) The demonstration that place cell firing samples potential future trajectories from the current location has so far only been shown in narrow armed mazes. This study, based on hippocampal recording in rats exploring 2D environments, is thus an interesting new contribution. However, there are several aspects of the analyses that should be clarified. Most importantly, it was recently shown that future sweeps of place cells alternate left/right on a Tmaze (Kay et al., 2020). Grid cells may do the same in open environments (Gardner R. J., Vollan A. Z., Moser M.B., Moser E. I. (2019) Soc. Neurosci. Abstr. 604.13/AA9). This study would thus benefit from bringing its original framing more up to date and from analyzing whether consecutive sweeps are anticorrelated.
2) Then, it is possible that the sampling of future trajectories corresponds to the representation of multiple coexisting "maps" (e.g. Jackson and Redish Hippocampus 2007, Kelemen and Fenton PLoS Biol 2010), especially since the animals alternated between two different behavioural strategies, namely the search of the "away" location and going back to the "home" location. This "flickering" between maps could potentially explain the overdispersion of the data, resulting in variance that could be interpreted as sampling of possible future trajectories. Additional analyses could clarify this potential bias.
3) The generative model seems to be a key aspect of the study. A lot of work has clearly been done to model as accurately as possible an animal's movement. However, it also raises the question of how critical this is for the whole analysis. Are the predictions somehow different with a simpler model, for example one that would be generated as a successor representation? It is possible that generating naturally looking trajectories this way is not possible, but it would be interesting to at least discuss why all components of the generative model are strictly necessary. In other words, to discuss why the same predictions cannot be done with a simpler model.
4) It is unclear how the diversity prediction of the DDC model was tested. Specifically, how the variance in panel 5b was computed?
5) Not all "signatures" were shown for all models. The study would benefit from a summary figure or table showing how each model's prediction compared to the data for each of the signatures.
Reviewer #1 (Recommendations for the authors):
I think this paper should be presented from the point of view that theta sweeps are already thought to represent specific potential trajectories extending from the current location, from extensive work on mazes. However, in open field data (where trajectories are not constrained to specific routes), although there are similar reports concerning replay in open fields and theta sweeps in entorhinal cortex, it is still possible that place cells actually represent the distribution of possible future trajectories, and you show that this is not the case, providing some of the first evidence that theta sweeps in open fields encode specific trajectories.
I did not find it helpful that the paper is framed as evaluating different possible neural representations of uncertainty, and I don't think the rejection of the product or DDC schemes for doing this necessarily tells us much about whether or not they are used in situations where the brain might in fact represent a probability distribution.
Reviewer #3 (Recommendations for the authors):
Please provide more details how the increased diversity prediction of the DDC was tested. How is the diversity decoded from the population? What do the cumulative probabilities on Figure 5 subplot b show exactly and why and how can they show the increased diversity? The authors may want to provide some intuition/more explanation about what the decoded SD reflects in Figure 5 b, d and why it reflects the diversity and not the uncertainty which would increase in the late cycles for the rest of the models too.
We assume all analyses ("signatures") were applied to all models but not all of the results are shown. We appreciate that the stepbystep approach eliminating one model at a time makes for a simpler story but a figure with all results would seem to be very informative: e.g. are the other signatures that contradict the product scheme?
Presentation: we did not understand the cumulative probability panels and how they are related to the text. E.g. lines 249252 and Figure 5b, same for Figure 6. Maybe it would be more intuitive to show the pdf instead of the cdf? But even then, the xaxis and general interpretation remains unclear to us.
Line 333: How do the results suggest efficient planning in the hippocampus? It suggests probabilistic, i.e. close to statistically optimal computations, but the authors should provide more details why they think it is also efficient.
[Editors’ note: further revisions were suggested prior to acceptance, as described below.]
Thank you for resubmitting your work entitled "Sampling motion trajectories during hippocampal theta sequences" for further consideration by eLife. Your revised article has been evaluated by Laura Colgin (Senior Editor) and a Reviewing Editor.
The manuscript has been improved but reviewer #2 has one remaining issue that needs to be addressed, as outlined below.
Reviewer #1 (Recommendations for the authors):
The authors have answered my main concerns
Reviewer #2 (Recommendations for the authors):
In this revision, the authors have made a number of improvements to what was already a systematic, rigorous examination of an important issue. The control analysis ruling out excess variance is due to different place maps across navigationtogoal and random foraging further enhances confidence in the results, and is additionally supported by a similar analysis in Brad Pfeiffer's paper that just came out (PMID: 35396328).
My one remaining hesitation with this paper is the one I brought up in my previous review but perhaps didn't explain clearly, so I will try again. My understanding of the core logic of the paper is that the authors first establish what the signatures are of the coding schemes they wish to distinguish, by implementing these schemes in a series of models that generate synthetic spiking data. Then, they test to what extent those signatures exist in real data, and draw inferences from comparison with the simulated results.
I think this is a powerful approach and am completely on board with it. However, it does raise an overall question of how robust the simulation results are: what components of the simulation are necessary and sufficient for the results? How sensitive are the results to specific parameter choices? I appreciate and agree with the authors' argument that the Kalman filter is an appropriate and relatively minimal way to model the animal's uncertainty about its own location. But it is not obvious to me what modeling the animal's uncertainty about its position contributes to the results in the first place. Would you get the same simulated signatures if all you had was a probability distribution of expected future trajectories given true current location, and then applied the various coding schemes (encoding MAP trajectory, sampling, etc)?
Hence, my suggestion of using the SR to obtain such trajectory distributions from the animal's behavioral data, but really any approach that estimates these distributions from the data would help address my concern. This feels important, because comparison between these different generative models would give a sense of what data sets/behavioral tasks/task conditions should show the predicted signatures. Ultimately we don't want to just understand what happens in the Pfeiffer and Foster 2013 data set the authors analyze, but have some idea of what to expect say, in the dark with high uncertainty about current location, or on armed mazes where possible futures are highly constrained.
I realize that the paper is already extensive and thorough, but unless I'm off base with this intuition or misunderstand the authors' argument, it could actually simplify their paper if the same results obtain with a simpler way of generating probability distributions over trajectories. If they don't obtain, then this is important to point out, because of the resulting prediction that theta sequences ought to have different spiking statistics in high vs low uncertaintyaboutcurrentposition conditions.
Reviewer #3 (Recommendations for the authors):
The authors have addressed all my concerns. They have added an interesting new analysis as the result of the reviewer suggestions. Congratulations to the authors for an impressive piece of work.
In the final version, I'd like to encourage the authors to better explain the cause of the bias towards similar directions described in line 404. I didn't understand it. Please elaborate.
https://doi.org/10.7554/eLife.74058.sa1Author response
Essential revisions:
1) The demonstration that place cell firing samples potential future trajectories from the current location has so far only been shown in narrow armed mazes. This study, based on hippocampal recording in rats exploring 2D environments, is thus an interesting new contribution. However, there are several aspects of the analyses that should be clarified. Most importantly, it was recently shown that future sweeps of place cells alternate left/right on a Tmaze (Kay et al., 2020). Grid cells may do the same in open environments (Gardner R. J., Vollan A. Z., Moser M.B., Moser E. I. (2019) Soc. Neurosci. Abstr. 604.13/AA9). This study would thus benefit from bringing its original framing more up to date and from analyzing whether consecutive sweeps are anticorrelated.
We thank the reviewers for encouraging us to analyse the autocorrelation structure of the theta sequences. We performed the suggested analysis and our results confirm and extend previous findings that trajectories encoded during theta sequences are indeed anticorrelated in 2dimensional environments. Since this result also underscores one of the main messages of the paper, i.e., that the brain is able to perform probabilistic computations efficiently: correlated samples require a higher number of samples to represent a probability distribution, and therefore decorrelation can contribute to a code that requires less samples and consequently less time to perform more accurate inference. We present these results on a new main figure (Figure 7) and associated section in the main text (Signature of efficient sampling:generative cycling, line ~382425). We also added a corresponding supplemenal figure (Figure 7 figure supplement 1).
2) Then, it is possible that the sampling of future trajectories corresponds to the representation of multiple coexisting "maps" (e.g. Jackson and Redish Hippocampus 2007, Kelemen and Fenton PLoS Biol 2010), especially since the animals alternated between two different behavioural strategies, namely the search of the "away" location and going back to the "home" location. This "flickering" between maps could potentially explain the overdispersion of the data, resulting in variance that could be interpreted as sampling of possible future trajectories. Additional analyses could clarify this potential bias.
Thank you for raising this issue. We performed the analysis to assess the differences in the behaviour and the spatial tuning of the recorded neurons between the two task phases (home versus away runs). Although we found significant differences in spatial tuning of a minority of the cells, these differences were relatively small and thus we do not believe that this flickering could substantially contribute to the overdispersion of the data. We included this analysis as a supplemental figure (Figure 6 figure supplement 2)
3) The generative model seems to be a key aspect of the study. A lot of work has clearly been done to model as accurately as possible an animal's movement. However, it also raises the question of how critical this is for the whole analysis. Are the predictions somehow different with a simpler model, for example one that would be generated as a successor representation? It is possible that generating naturally looking trajectories this way is not possible, but it would be interesting to at least discuss why all components of the generative model are strictly necessary. In other words, to discuss why the same predictions cannot be done with a simpler model.
The generality of the signatures follows from the fact that we derived them from the fundamental properties of the encoding schemes, not the generative model. The sole criterion for the generative model was to produce synthetic trajectories that match experimental trajectories well (Figure 3, figure supplement 2) and to enable simple inference in the model (Figure 3, figure supplement 1). We tested the robustness of signatures using both idealized test data (Figure 4, figure supplement 1cd, Figure 5, figure supplement 1) and our simulated hippocampal model (Figure 4c, Figure 5Bc, Fig6bg). We added a paragraph discussing the choice of the generative model and clarified the relationship between our model and the successor representation framework in the discussion (lines 486490).
4) It is unclear how the diversity prediction of the DDC model was tested. Specifically, how the variance in panel 5b was computed?
Thank you for raising the issue of clarity of presentation. We extended the description of DDC decoding in the main text (lines 268273) and added a reference to the Methods section where the mathematical details are provided (Equations 1718).
Briefly, In the DDC scheme we assumed that the hippocampal population activity at each time point encodes the probability distribution using the overlap between the neuronal basis functions and the target distribution. Just as we can use a static Bayesian decoder to decode the mean of the encoded location when a single position is encoded (as in the mean encoding scheme), we can also decode higher moments (e.g., variance) from a set of spikes when a full distribution is represented in the population activity (as in the product or the DDC schemes). In this case, we made a further assumption, that an isotropic Gaussian distribution was encoded, and used a maximum likelihood estimation to infer the three parameters of the encoded distribution (x and y position as well as the standard deviation).
5) Not all "signatures" were shown for all models. The study would benefit from a summary figure or table showing how each model's prediction compared to the data for each of the signatures.
We added a novel supplemental figure (Figure 5—figure supplement 3.) showing all signatures for all Models.
Reviewer #1 (Recommendations for the authors):
I think this paper should be presented from the point of view that theta sweeps are already thought to represent specific potential trajectories extending from the current location, from extensive work on mazes. However, in open field data (where trajectories are not constrained to specific routes), although there are similar reports concerning replay in open fields and theta sweeps in entorhinal cortex, it is still possible that place cells actually represent the distribution of possible future trajectories, and you show that this is not the case, providing some of the first evidence that theta sweeps in open fields encode specific trajectories.
We thank the reviewer for this suggestion. We added a new paragraph (lines 7488) to the introduction to clarify that one of the novel contributions of the paper is the generalization of previous intuitions, largely based on work on mazes, to open field environments.
I did not find it helpful that the paper is framed as evaluating different possible neural representations of uncertainty, and I don't think the rejection of the product or DDC schemes for doing this necessarily tells us much about whether or not they are used in situations where the brain might in fact represent a probability distribution.
We did not aim to suggest that our results exclude the viability of the product or DDC schemes in other brain regions or at different temporal scales. We clarified this point in the discussion (line 556).
Reviewer #3 (Recommendations for the authors):
Please provide more details how the increased diversity prediction of the DDC was tested. How is the diversity decoded from the population? What do the cumulative probabilities on Figure 5 subplot b show exactly and why and how can they show the increased diversity? The authors may want to provide some intuition/more explanation about what the decoded SD reflects in Figure 5 b, d and why it reflects the diversity and not the uncertainty which would increase in the late cycles for the rest of the models too.
We rewrote the text (lines ~264273) and the legend of the corresponding figures to clarify that we use the decoded SD and not the diversity across neurons to test the DDC code, and that the CDFs are calculated across theta cycles. We summarise the answer to the question below.
In the DDC scheme the firing rate of neurons is defined by the overlap between the target distribution and their tuning functions. Intuitively, this scheme leads to a code where the diversity of the coactive neurons increases with uncertainty. This increase in the diversity provides a useful intuition about how neuronal activities change with uncertainty in this scheme. We did not aim to decode diversity nor to directly evaluate how diversity changes within a theta cycle. Instead we aimed at directly decoding the summary statistics of the probability distribution that had been encoded in the population activity. To this end, we used the standard deviation (SD) of the distribution to quantify the uncertainty (large SD = large uncertainty, implying large diversity as well).
To achieve this, we built a maximum likelihood decoder that finds the most likely parameters (mean and SD) of the encoded distribution. We applied this decoder to early, mid, and late spikes in each theta cycle recorded either using simulated data (Figure 5Bd) or experimental data (Figure 5eg). The plots in Figure 5b and e show the results of this decoding process. Specifically, the three curves in Figure 5b (some of them are overlapping) show the distribution of the standard deviation values decoded from the observed spikes in the three different theta phases (early, mid and late) in the form of cumulative distribution functions (CDF). An increased uncertainty of the encoded distribution would be reflected by a rightward shift of the distribution. Importantly, the three CDF curves completely overlap for the sampling and the mean schemes as these schemes do not represent uncertainty in the instantaneous population activity. Instead, these schemes represent a single trajectory throughout the whole theta cycle.
Shift of CDF is only present in the DDC scheme (green), thus indicating that this measure displays specificity to DDC.
We assume all analyses ("signatures") were applied to all models but not all of the results are shown. We appreciate that the stepbystep approach eliminating one model at a time makes for a simpler story but a figure with all results would seem to be very informative: e.g. are the other signatures that contradict the product scheme?
We agree with the Reviewer that the signatures not shown could also be informative. We thus prepared a supplementary figure to show all signatures for all encoding schemes (Figure 5—figure supplement 3).
Further to the patterns observed before, our analysis also revealed that the DDC inspired decoding of SD grows within a theta cycle not only for DDC but for the product scheme as well. This can be taken as further evidence against this encoding scheme (Figure S9ab). The EVindex is positive for the product scheme and negative for the DDC scheme (Figure S9c). We explain the positivity in the case of the product scheme by the fact that the firing rate is the lowest for the most uncertain future predictions inflating the variability at late theta cycles. In the case of DDC scheme the EVindex is negative as the distributions encoded in subsequent theta cycles are similar.
However, we would like to add a cautionary note. The EVindex and the DDC decoded SD were not designed to work consistently for all encoding schemes and thus there is no theoretically solid justification for their specificity and robustness.
For example the positivity of the EVindex in the case of the product code may not reflect true variability of the encoded locations but the large decoding noise due to the low firing rate. Indeed, the magnitude of the EVindex decreases substantially for the product code if we consider only high spike count theta cycles (Figure 5—figure supplement 3c). Consequently, the value of these measures taken on simulated datasets could be misleading and may not generalise well to the real situation. Thus we prefer to keep this figure in the supplementary material.
Presentation: we did not understand the cumulative probability panels and how they are related to the text. E.g. lines 249252 and Figure 5b, same for Figure 6. Maybe it would be more intuitive to show the pdf instead of the cdf? But even then, the xaxis and general interpretation remains unclear to us.
To test the DDC code we assumed that a probability distribution is encoded in each moment by the activity of the neuron. We decoded the moments (mean and SD) of this distribution in each theta cycle for the three theta phases (early, mid and late) separately using a maximum likelihood decoder. As a result of this decoding process we obtained three distributions of the decoded SD values (three SD values for each theta cycle, corresponding to early mid and late phase) and plotted these distributions separately as three CDF curves with the SD on the xaxis. The shift of these curves along the xaxis indicate differences in the magnitude of the represented uncertainty.
The interpretation of the CDF curves in Fig6b,d,h are similar.
We decided to show the distribution using the CDF as relatively small, but significant differences between the distributions are more apparent in the CDF than in the PDF (Figure R1).
We rewrote the description of these panels in the text and in the legend to clarify that CDFs are calculated across theta cycles (line ~278).
Line 333: How do the results suggest efficient planning in the hippocampus? It suggests probabilistic, i.e. close to statistically optimal computations, but the authors should provide more details why they think it is also efficient.
In the revised manuscript we clarified our wording and use the word efficient solely when we refer to anticorrelated samples.
[Editors’ note: further revisions were suggested prior to acceptance, as described below.]
The manuscript has been improved but reviewer #2 has one remaining issue that needs to be addressed, as outlined below.
We conducted additional simulations and analysis to address the remaining comment of reviewer #2. Our results demonstrated that our analysis is robust to the way we generate synthetic data and thus the signatures we derived can be used to discriminate the alternative coding schemes. The results of this analysis in included in the new version of the paper.
We provide a detailed answer to the Reviewers’ comments below.
Reviewer #2 (Recommendations for the authors):
In this revision, the authors have made a number of improvements to what was already a systematic, rigorous examination of an important issue. The control analysis ruling out excess variance is due to different place maps across navigationtogoal and random foraging further enhances confidence in the results, and is additionally supported by a similar analysis in Brad Pfeiffer's paper that just came out (PMID: 35396328).
We thank the Reviewer for drawing our attention to this recent paper which we included in our reference list.
My one remaining hesitation with this paper is the one I brought up in my previous review but perhaps didn't explain clearly, so I will try again. My understanding of the core logic of the paper is that the authors first establish what the signatures are of the coding schemes they wish to distinguish, by implementing these schemes in a series of models that generate synthetic spiking data. Then, they test to what extent those signatures exist in real data, and draw inferences from comparison with the simulated results.
I think this is a powerful approach and am completely on board with it. However, it does raise an overall question of how robust the simulation results are: what components of the simulation are necessary and sufficient for the results? How sensitive are the results to specific parameter choices? I appreciate and agree with the authors' argument that the Kalman filter is an appropriate and relatively minimal way to model the animal's uncertainty about its own location. But it is not obvious to me what modeling the animal's uncertainty about its position contributes to the results in the first place. Would you get the same simulated signatures if all you had was a probability distribution of expected future trajectories given true current location, and then applied the various coding schemes (encoding MAP trajectory, sampling, etc)?
The reviewer suggested that representation of uncertainty associated with the current position of the animal might not be necessary to interpret experimental data. Indeed, the key to our analysis is the change in uncertainty along the planned trajectory of the animal that is less sensitive to the level of uncertainty at any given time. Accordingly, the trajectory distribution that is using the modelfree alternative, which lacks the representation of current uncertainty is expected to produce similar results than the generative model that explicitly represents uncertainty in current beliefs. To corroborate our expectations, we designed a minimalist approach to generate hypothetical future trajectories without explicitly modelling the animal’s subjective uncertainty and demonstrated that our signatures robustly discriminate the different coding schemes.
Hence, my suggestion of using the SR to obtain such trajectory distributions from the animal's behavioral data, but really any approach that estimates these distributions from the data would help address my concern. This feels important, because comparison between these different generative models would give a sense of what data sets/behavioral tasks/task conditions should show the predicted signatures. Ultimately we don't want to just understand what happens in the Pfeiffer and Foster 2013 data set the authors analyze, but have some idea of what to expect say, in the dark with high uncertainty about current location, or on armed mazes where possible futures are highly constrained.
We thank the reviewer for clarifying this idea. Indeed, we agree that paper benefits from the demonstration that the signatures of coding schemes are not sensitive to the specific properties of the generative model of motion trajectories.
Therefore we designed a minimal, modelfree approach to obtain trajectory distributions directly from the animal’s behavioral data. Specifically, we collected trajectory distribution for a particular target point on the real motion trajectory we first sampled multiple real trajectory segments with similar starting speed. Next, we aligned the start of these trajectory segments with the start of the target trajectory by shifting and rotating the sampled segments. This alignment resulted in a bouquet of trajectories for each actual point along the motion trajectory that outlined an empirical distribution of hypothetical future positions and speeds (Figure 7—figure supplement 2a).
Next we used the empirical, modelfree distributions of motion trajectories to generate spiking data using the four different coding schemes in the same way as we did it for modelbased trajectories. Next, we repeated identical analyses that were performed with the original, modelbased synthetic trajectories. We found that the population gain, the increase in the DDCdecoded SD and the sampling index were potent signatures to discriminate these alternative encoding schemes (Figure 7—figure supplement 2cg). We concluded that our results are robust against changes in the way distributions of planned trajectories are constructed. We present the results of this new analysis in a novel subsection of the Results (Analysis of signatures using modelfree trajectories, line 430) and in a novel Supplemental Figure (Figure 7—figure supplement).
I realize that the paper is already extensive and thorough, but unless I'm off base with this intuition or misunderstand the authors' argument, it could actually simplify their paper if the same results obtain with a simpler way of generating probability distributions over trajectories. If they don't obtain, then this is important to point out, because of the resulting prediction that theta sequences ought to have different spiking statistics in high vs low uncertaintyaboutcurrentposition conditions.
We had a hard time deciding how to include the new, modelfree analysis into the paper. We decided to present it at the end of the Results section as a separate subsection and on a new Figure Supplement. Our choice was based on the following arguments. First, we consider the presentation of the generative model of planning an insightful ingredient of the paper, as the readers are provided with new ways of thinking about hippocampal theta sequences and therefore has the perspective to inspire the design of new experiments in the future.
1. Experimental data suggests that the hippocampus is involved in modelbased planning. Our original approach, implementing a model based method to generate synthetic trajectories, is thus more similar to what might be actually implemented by the hippocampal circuitry than the new, modelfree alternative.
2. We believe that representing subjective uncertainty in a selfconsistent way using a probabilistic model is an important principle used in many artificial systems. This idea has been also applied recently to neuronal systems and gained support from a wide range of studies, albeit from different domains of cognition than the one studied here.
3. Our generative model was a key component of our framework providing crucial insights about the variability of hippocampal theta sequences. In particular, we realized that the subjective uncertainty of the animal is an important and often neglected factor shaping hippocampal representations. In our work, this is best reflected in the relationship between the decoding error, the subjective uncertainty, and the expected change in the posterior mean that allowed us to derive the sampling index and in the way we can explain the source of the bias in the cycling index (Figure 7e).
Second, we decided to present the keep the original logic of the paper and present the main findings first using the original, generative model and only describe the modelfree approach at the end of the manuscript as a control simulation. Alternatively, we could have introduce the two approaches in parallel around Figure 3 and demonstrate all results using both alternatives, but we were afraid that this would break the narrative of the main text and confuse the readers.
Reviewer #3 (Recommendations for the authors):
The authors have addressed all my concerns. They have added an interesting new analysis as the result of the reviewer suggestions. Congratulations to the authors for an impressive piece of work.
In the final version, I'd like to encourage the authors to better explain the cause of the bias towards similar directions described in line 404. I didn't understand it. Please elaborate.
We thank the Reviewer for their encouraging words.
We rewrote the corresponding section of the Results (lines 402409) and the caption of Figure 7e to clarify the source of this bias.
The bias can be best understood if we consider that the true position of the animal in room coordinates that we can directly observe can sometimes be different from the animal’s own position estimate. Although this difference may be quite small (it is typically not larger than a few centimeters in our synthetic dataset in a 2×2 m arena), the important point is that this subjective position represent the animal’s best guess about its own position and hypothetical trajectories will alternate around this subjective position during generative cycling. Even if the subsequent trajectories are alternating around the subjective position they might fall into the same side of the true position if the two are different (Figure 7e).
https://doi.org/10.7554/eLife.74058.sa2Article and author information
Author details
Funding
National Research, Development and Innovation Fund (PD125386)
 Balazs B Ujfalussy
National Research, Development and Innovation Fund (FK125324)
 Balazs B Ujfalussy
National Brain Research Centre (20171.2.1NKP201700002)
 Gergő Orbán
National Brain Research Centre (KTIANAP122201)
 Balazs B Ujfalussy
Artificial Intelligence National Laboratory (RRF2.3.121202200004)
 Gergő Orbán
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
We thank Brad E Pfeiffer and David J Foster for kindly sharing their data and Andres D Grosmark and György Buzsáki for making their data publicly available. We thank Márton Kis and Judit K Makara for useful discussions; Mihály Bányai and Judit K Makara for comments on a previous version of the manuscript. This work was supported by an NKFIH fellowship (PD125386, FK125324; BBU), by the Hungarian Brain Research Program (2017–1.2.1NKP2017–00002, GO and KTIANAP122201, BBU and GO), by the Artificial Intelligence National Laboratory (European Union project RRF2.3.121202200004, GO).
Senior Editor
 Laura L Colgin, University of Texas at Austin, United States
Reviewing Editor
 Adrien Peyrache, McGill University, Canada
Version history
 Received: September 20, 2021
 Preprint posted: December 16, 2021 (view preprint)
 Accepted: September 28, 2022
 Version of Record published: November 8, 2022 (version 1)
Copyright
© 2022, Ujfalussy and Orbán
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Metrics

 844
 Page views

 138
 Downloads

 2
 Citations
Article citation count generated by polling the highest count across the following sources: PubMed Central, Crossref, Scopus.
Download links
Downloads (link to download the article as PDF)
Open citations (links to open the citations from this article in various online reference manager services)
Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)
Further reading

 Neuroscience
Blindness affects millions of people around the world. A promising solution to restoring a form of vision for some individuals are cortical visual prostheses, which bypass part of the impaired visual pathway by converting camera input to electrical stimulation of the visual system. The artificially induced visual percept (a pattern of localized light flashes, or ‘phosphenes’) has limited resolution, and a great portion of the field’s research is devoted to optimizing the efficacy, efficiency, and practical usefulness of the encoding of visual information. A commonly exploited method is noninvasive functional evaluation in sighted subjects or with computational models by using simulated prosthetic vision (SPV) pipelines. An important challenge in this approach is to balance enhanced perceptual realism, biologically plausibility, and realtime performance in the simulation of cortical prosthetic vision. We present a biologically plausible, PyTorchbased phosphene simulator that can run in realtime and uses differentiable operations to allow for gradientbased computational optimization of phosphene encoding models. The simulator integrates a wide range of clinical results with neurophysiological evidence in humans and nonhuman primates. The pipeline includes a model of the retinotopic organization and cortical magnification of the visual cortex. Moreover, the quantitative effects of stimulation parameters and temporal dynamics on phosphene characteristics are incorporated. Our results demonstrate the simulator’s suitability for both computational applications such as endtoend deep learningbased prosthetic vision optimization as well as behavioral experiments. The modular and opensource software provides a flexible simulation framework for computational, clinical, and behavioral neuroscientists working on visual neuroprosthetics.

 Neuroscience
Extinction is a specific example of learning where a previously reinforced stimulus or response is no longer reinforced, and the previously learned behaviour is no longer necessary and must be modified. Current theories suggest extinction is not the erasure of the original learning but involves new learning that acts to suppress the original behaviour. Evidence for this can be found when the original behaviour recovers following the passage of time (spontaneous recovery) or reintroduction of the reinforcement (i.e. reinstatement). Recent studies have shown that pharmacological manipulation of noradrenaline (NA) or its receptors can influence appetitive extinction; however, the role and source of endogenous NA in these effects are unknown. Here, we examined the role of the locus coeruleus (LC) in appetitive extinction. Specifically, we tested whether optogenetic stimulation of LC neurons during extinction of a foodseeking behaviour would enhance extinction evidenced by reduced spontaneous recovery in future tests. LC stimulation during extinction trials did not change the rate of extinction but did serve to reduce subsequent spontaneous recovery, suggesting that stimulation of the LC can augment rewardrelated extinction. Optogenetic inhibition of the LC during extinction trials reduced responding during the trials where it was applied, but no longlasting changes in the retention of extinction were observed. Since not all LC cells expressed halorhodopsin, it is possible that more complete LC inhibition or pathwayspecific targeting would be more effective at suppressing extinction learning. These results provide further insight into the neural basis of appetitive extinction, and in particular the role of the LC. A deeper understanding of the physiological bases of extinction can aid development of more effective extinctionbased therapies.