Coding of latent variables in sensory, parietal, and frontal cortices during closed-loop virtual navigation

  1. Jean-Paul Noel
  2. Edoardo Balzani
  3. Eric Avila
  4. Kaushik J Lakshminarasimhan
  5. Stefania Bruni
  6. Panos Alefantis
  7. Cristina Savin
  8. Dora E Angelaki  Is a corresponding author
  1. Center for Neural Science, New York University, United States
  2. Center for Theoretical Neuroscience, Columbia University, United States

Abstract

We do not understand how neural nodes operate and coordinate within the recurrent action-perception loops that characterize naturalistic self-environment interactions. Here, we record single-unit spiking activity and local field potentials (LFPs) simultaneously from the dorsomedial superior temporal area (MSTd), parietal area 7a, and dorsolateral prefrontal cortex (dlPFC) as monkeys navigate in virtual reality to ‘catch fireflies’. This task requires animals to actively sample from a closed-loop virtual environment while concurrently computing continuous latent variables: (i) the distance and angle travelled (i.e., path integration) and (ii) the distance and angle to a memorized firefly location (i.e., a hidden spatial goal). We observed a patterned mixed selectivity, with the prefrontal cortex most prominently coding for latent variables, parietal cortex coding for sensorimotor variables, and MSTd most often coding for eye movements. However, even the traditionally considered sensory area (i.e., MSTd) tracked latent variables, demonstrating path integration and vector coding of hidden spatial goals. Further, global encoding profiles and unit-to-unit coupling (i.e., noise correlations) suggested a functional subnetwork composed by MSTd and dlPFC, and not between these and 7a, as anatomy would suggest. We show that the greater the unit-to-unit coupling between MSTd and dlPFC, the more the animals’ gaze position was indicative of the ongoing location of the hidden spatial goal. We suggest this MSTd-dlPFC subnetwork reflects the monkeys’ natural and adaptive task strategy wherein they continuously gaze toward the location of the (invisible) target. Together, these results highlight the distributed nature of neural coding during closed action-perception loops and suggest that fine-grain functional subnetworks may be dynamically established to subserve (embodied) task strategies.

Editor's evaluation

This important study investigates distributed neural coding across the three brain areas MST, 7a, and dlPFC in monkeys carrying out a novel behavioural paradigm with a naturalistic closed action-perception-loop developed by the same group previously. The convincing model-based analysis discerns potential influences (e.g. task variables, hidden variables) on firing rates and supports the claim of task-specific sub-networks being formed. The authors provide an important first step to unravel potential drivers of dynamic activity in distributed networks during recurrent action-perception-loops, which should be augmented by future analyses of, for instance, the contribution of changing visual input, especially as the recordings stem from areas involved in processing optical flow, and of signals across different circuit elements like cortical layers.

https://doi.org/10.7554/eLife.80280.sa0

Introduction

Despite the closed-loop interplay that exists between action and perception in the real world, our understanding of sensory encoding and the neural architectures supporting goal-directed behavior is largely derived from tasks segregating action from perception, and only sporadically requiring motor output. Arguably, the limited purview of this traditional approach has hindered our ability to understand neural coding for dynamic and flexible behaviors (see, e.g., Cisek and Kalaska, 2010; Gomez-Marin et al., 2014; Pitkow and Angelaki, 2017; Yoo et al., 2021, for similar arguments).

First, severing the loop between action and perception disrupts the natural timing that exists between sensory events and internal neural dynamics. In natural vision, for example, eye movements dictate the content, relative resolution, and timing of visual input. Indeed, work from active sensing, for example, Schroeder et al., 2010; Yang et al., 2016, has shown that neural excitability in primary visual cortex (Barczak et al., 2019) and the anterior nuclei of the thalamus (Leszczynski et al., 2020) are enhanced at saccade offset – precisely when new observations are made. This enhancement is likely mediated by phase-reset of neural oscillations (Lakatos et al., 2008; Rajkai et al., 2008) caused by the shifting gaze. In turn, physical movements of the eyes during sensory processing – an aspect absent in most laboratory, binary decision-making tasks – may not only enhance low-level visual encoding but may also favor certain channels of inter-area communication via local field potential (LFP) phase alignment or other coupling mechanisms (e.g., Jutras et al., 2013).

Second, the emphasis on tasks with poor dynamics, together with a technology-limited focus on studying one area at a time, has limited our ability to explore how within- and across-area communication enables flexible behavior. This has possibly led to a degeneracy in the number of functions ascribed to each neural node. For instance, notorious redundancy has been observed between parietal and frontal cortices with both areas showing similar properties during visual feature categorization (Swaminathan and Freedman, 2012), target selection (Suzuki and Gottlieb, 2013), visuo-spatial memory (Chafee and Goldman-Rakic, 1998), and working memory (Olesen et al., 2004), among others (see Katsuki and Constantinidis, 2012, for a review). While this redundancy is certainly an adaptive feature (Moreno-Bote et al., 2014; Driscoll et al., 2017), the joint characterization of sensory, parietal, and frontal areas during tasks requiring a flow of cognitive operations typical of daily life (e.g., sensing, memorizing, acting) may offer the possibility to study how these regions differ, and how they interact.

To tackle these limitations, we have developed a task requiring observers to catch memorized ‘fireflies’ in virtual reality (Lakshminarasimhan et al., 2018; Lakshminarasimhan et al., 2020; Noel et al., 2020; Noel et al., 2021). This goal-directed virtual navigation task addresses many of the limitations of the traditional approach, while remaining rooted in well-established neural elements: motion detection (Newsome and Paré, 1988), optic flow processing (Britten, 2008), and navigation (Ekstrom et al., 2018). Animals first detect a briefly flashed target, much like the blinking of a firefly. Then, they use a joystick controlling linear and angular velocity to navigate via path integration to the memorized location of this target (Figure 1). Importantly, the observers’ eyes are free to sample from their sensory surroundings, and trials last on the order of 2–3 s. In turn, this task requires integration of sensory evidence over protracted time periods, allows for active sensing, and engages the real-world loop between observations, beliefs, actions, and environmental states (Figure 1A). Importantly, the critical task variables are latent. Namely, observers must dynamically estimate their self-location (by accumulating velocity flow vectors) and that of the target (i.e., a hidden spatial goal). This dynamic computation of latent variables may offer a window into the mechanisms of internal computation (e.g., see Lakshminarasimhan et al., 2020).

Task and behavioral results.

(A) Behavioral task. Top left: Egocentric perspective of the monkey during the firefly task, emphasizing the ground plane optic flow elements (flashing triangles), the fact that the target disappears (target off), and eye positions (following target position perfectly when the target is on, and then continuing on the correct path but deviating with time). Bottom left: Overhead view of the monkey, starting location (star), accelerating and progressively re-orienting to face the firefly, before de-accelerating and stopping at (left; rewarded) or near (right: unrewarded) the location of the now invisible target. Right: This task involves making observation of the sensory environment (composed of optic flow), using these observations to generate a dynamic belief (of the relative distance to target), and producing motor commands based on the current belief, which in turn updates the state of the environment. Right: Continuous variables are shown for three example trials. (B) Top: spatial distribution of target positions across trials. Bottom: monkey trajectories. The orange dot and trajectory are an example trial, maintained throughout the figure. (C) Example session. Radial (top) and angular (bottom) endpoint (y-axis) as a function of target (x-axis). Gray dots are individual trials, black line is unity, and blue line is the regression between response and target (slope of 1 indicates no bias). (D) Radial (top) and angular (bottom) bias (slope,=1 means no bias, >1 means overshooting, and <1 means undershooting) for every session (transparent blue circles) and monkey (x-axis, dark blue circles are average for each monkey, Q, S, and M). Rightmost circle is the average across monkeys and error bars represent ±1 SEM. Overall, monkeys are fairly accurate but undershoot targets, both radially and in eccentricity. (E) Eye trajectories (green = eye position at firefly offset) for an example session, projected onto a two-dimensional ‘screen’. Eyes start in the upper field and gradually converge in the lower center (where the firefly ought to be when they stop). (F) Target-tracking index (variance in eye position explained by prediction of fixating on firefly) for an example session as a function of time since firefly onset and offset. (G) Average target-tracking index within 1 s for all sessions (light blue) and monkeys (dark blue) showing the monkeys attempt to track the invisible target.

Here, we leverage this ‘firefly task’ to simultaneously characterize the encoding profiles of single units in sensory (dorsomedial superior temporal area, MSTd), parietal (area 7a), and frontal (dorsolateral prefrontal cortex, dlPFC) areas during closed-loop goal-directed behavior (Figure 2—figure supplement 1). We record from these regions (MSTd, 7a, and dlPFC) because they form a series of reciprocally interconnected areas (Andersen et al., 1990; Rozzi et al., 2006) that are well established in the processing of optic flow for self-motion, sensorimotor transformations, and belief formation/working memory, all critical elements of the task (see, e.g., Kravitz et al., 2011; Constantinidis and Klingberg, 2016; Christophel et al., 2017; Noel and Angelaki, 2022, for reviews). We observe that all areas probed – including MSTd, a classically considered sensory area – encode latent task variables. Further, global encoding profiles and unit-to-unit couplings suggested a functional subnetwork composed by dlPFC and MSTd (and not area 7a, as would be predicted from anatomy). We suggest this MSTd-dlPFC subnetwork reflects the natural task strategy wherein monkeys continuously gaze toward the location of the (invisible) target.

Results

Monkeys navigate to remembered targets employing natural task strategies

Targets (‘fireflies’) were displayed for 300 ms on a ground plane composed of triangular optic flow elements that appeared transiently (250 ms) with random orientations. The density of these optic flow elements was 5 elements/m2. A single target was presented per trial. The firefly targets were uniformly distributed (across trials) within a radial range of 1–4 m, and an angular eccentricity spanning from –40° (leftward) to 40° (rightward) of visual angle (Figure 1A and B). Performance feedback was provided at the end of each trial in the form of juice reward for correctly stopping within a reward boundary (see Methods for further details).

Visualizing example trials shows that monkeys followed curvilinear trajectories and stopped near the latent targets (Figure 1B). To quantify their performance, we expressed the monkeys’ trajectory endpoints and firefly locations in polar coordinates, with an eccentricity from straight-ahead (θ) and a radial distance (r). Figure 1C shows radial (top; r vs. r~ slope = 0.90; R2=0.55) and angular (bottom; θ vs. θ~ slope = 0.95; R2=0.78) responses as a function of target location for an example session (omitting 18% of trials the animals aborted early or never stopped, see Methods). As shown by this example session, and confirmed across three animals and all sessions (Monkey Q, 27 sessions; S, 38 sessions; M, 17 sessions; Figure 1D), monkeys were on average accurate, but tended to undershoot targets (r vs. r~ slope, mean = 0.892, 95% CI=[0.860, 0.914]; θ vs. θ~ slope, M=0.792, 95% CI=[0.762, 0.842]). Undershooting targets is, in fact, the optimal strategy given that the uncertainty over one’s own and the firefly’s position scales supra-linearly with time and distance travelled (see Lakshminarasimhan et al., 2018).

Most interestingly, this task affords observers with degrees of freedom that may reveal their innate task strategy. For instance, they are not required to fixate and instead we may use their eye movements as an indicator of their internal beliefs (Lakshminarasimhan et al., 2020). Indeed, over the course of individual trajectories, the monkeys’ eyes move downward and become less eccentric (Figure 1E), as if tracking the (invisible) target progressively becoming nearer and aligned with straight ahead (Lakshminarasimhan et al., 2020; also see Ilg and Thier, 2003; Ilg and Thier, 1999). This behavior was quantified by deriving predictions for the binocular position of the observer’s eyes, assuming the monkeys maintained fixation at the center of the target throughout the trial. Then, we expressed a target-tracking index as the square root of the fraction of variance in the observed eye position that was explained by this prediction (see Lakshminarasimhan et al., 2020, for further details). An index of 1 implies that the subject consistently looked at the center of the firefly, while a value of 0 denotes a lack of correspondence between target and gaze locations. The target-tracking index was high at firefly onset, highest at firefly offset, and remained above chance throughout the course of the trial (Figure 1F shows an example session). Across all animals and sessions, the target-tracking index averaged 0.73 (95% CI = [0.72, 0.74]) during the first second of each trial. This suggest that, while not a task requirement, animals developed the strategy of fixating on the visible firefly and then attempted to maintain their gaze on it even after the target had disappeared (Figure 1G, see Lakshminarasimhan et al., 2020, for a full description of this phenomenon and demonstration that pursuing the invisible firefly with our eyes results in improved task performance).

Altogether, the animals accumulated optic flow velocity signals (Alefantis et al., 2021) into an evolving (and largely accurate) estimate of self-position. They used this estimate to path integrate to the location of latent targets over prolonged periods of time. In addition, they seemingly sampled from their environment in an intelligent manner, tracking the (invisible) target with their eyes in what is seemingly an embodied mnemonic strategy.

Patterned mixed selectivity across sensory, parietal, and frontal cortices

We recorded single-unit activity from a large number of neurons in MSTd (231 units), area 7a (3200 units), and dlPFC (823 units) across 82 sessions while monkeys performed the firefly task (Figure 2A and Figure 2—figure supplement 1).

Figure 2 with 11 supplements see all
Dorsomedial superior temporal area (MSTd), area 7a, and dorsolateral prefrontal cortex (dlPFC) encode a heterogeneous mixture of task variables.

(A) Schematic of brain areas recorded. (B) Raster plots of spiking activity in MSTd (green), 7a (blue), and dlPFC (red). Shaded yellow area represents the time of the target being visible, and black dots represent the timing of movement offset. Trials are sorted from shortest (bottom) to longest (top) duration of the trial. (C) Schematic of the Poisson generalized additive model (P-GAM) used to fit spike trains. The task and neural variables used as input to the model were: linear and angular velocity and acceleration, time of movement onset and offset, time of firefly onset, distance and angle from origin and to target, time of reward, vertical and horizontal position of the eyes, and ongoing phase of LFP at theta, alpha, and beta bands. (D) Top: random snippet of spiking activity of simultaneously recorded neurons (green = MSTd; blue = 7a; red = dlPFC). Bottom: corresponding cross-validated prediction reconstructed from the P-GAM. The average cross-validated pseudo-R2 was 0.072 (see Colin Cameron and Windmeijer, 1997). (E) Responses from an example MSTd, 7a, and dlPFC neuron (black), aligned to temporal task variables (e.g., time of movement onset and offset), or binned according to their value in a continuous task variable (e.g., linear velocity). Colored (respectively green, blue, and red for MSTd, 7a, and dlPFC) lines are the reconstruction from the reduced P-GAM. The responses are opaque (p<0.01) or transparent (p>0.01), according to whether the P-GAM kernel for the specific task variable is deemed to significantly contribute to the neuron’s firing rate. (F) Fraction of neurons tuned to the given task variable, according to brain area. Error bars are 99% CIs, and thus non-overlapping bars indicate a pair-wise significant difference.

To quantitatively account for the encoding properties of neurons within this task – wherein no two trials are alike (see Figure 2B for rasters demonstrating heterogeneity in spike times and trial durations) – we fit spike trains to a Poisson generalized additive model (P-GAM). The P-GAM we leveraged was purposely developed to efficiently and robustly estimate encoding profiles during closed-loop and naturalistic tasks such as the ‘firefly task’ (Balzani et al., 2020b). Namely, beyond capturing arbitrary non-linearities and handling collinear predictors (Dormann et al., 2013), this encoding model has the strong advantage of inferring marginal confidence bounds for the contribution of each feature to neural responses (see Balzani et al., 2020b, for details). This property allows us to identify the minimal set of task variables that each neuron is responsive to (setting statistical significance to p<0.01), while circumventing the need for computationally demanding (and often unstable) model selection procedures – particularly given the large number and time-varying nature of the variables in this task. Indeed, beyond the numerical results in Balzani et al., 2020b, demonstrating the ability to recover the ground-truth in artificial data, here we show that the P-GAM will capture the simplest possible interpretation of the neural responses in the statistical regime of our data, that is, even when neurons are weakly firing and predictors are correlated (see Figure 2—figure supplement 2).

In addition to continuous sensorimotor (e.g., linear and angular velocity and acceleration) and latent variables (e.g., distance from origin and to target, Figure 1A), as well as discrete task events (e.g., time of target onset, as well as movement onset and offset), we included elements of brain dynamics in the encoding model. These internal dynamics are most often not considered in accounting for task-relevant neural responses, yet they fundamentally shape spiking activity. These latter variables included the phase of LFP in different frequency bands (theta: 4–8 Hz; alpha: 8–12 Hz; beta: 12–30 Hz), and causal unit-to-unit coupling filters within (i.e., spike history, 36 ms wide temporal filter) and between units, both within (36 ms wide temporal filter) and across cortical areas (600 ms wide temporal filters, Figure 2C, see Methods). These coupling filters capture noise correlations, that is, the signal-independent relationship between one neuron firing and the likelihood that this same (i.e., spike history) or another neuron (i.e., coupling filter) will fire at a given time delay (see Hart and Huk, 2020). In total, the encoding model had 17 (analog or digital) task inputs (Figure 2C), in addition to hundreds of potential lateral and across region coupling filters (see Figure 2—figure supplement 2B for the distribution of the number parameters in the full and reduced encoding models, and Figure 2—figure supplement 2C for demonstration that while the reduced model had a fifth of the parameters present in the full model, its cross-validated ability to account for observed spiking was unchanged). The fitted model accounts well for observed spiking activity (see raw and model reconstructed spiking activity of simultaneously recorded units in Figure 2D, mean pseudo-R2=0.072).

Figure 2E shows the empirical (black) and P-GAM reconstructed (colored) firing rate to task variables in an example MSTd, 7a, and dlPFC neuron. The striking feature of these examples is that all units responded to, or were modulated by, a multitude of task variables. For instance, each of these example neurons responded at the time of firefly onset. The 7a neuron showed a strong dependency not only to sensorimotor variables, but also to LFP phases (in all frequency ranges). Finally, and perhaps most surprisingly, the MSTd neuron reflected not only sensorimotor and eye position variables, but also latent ones, such as the angular distance to the (invisible) target. We used the P-GAM to factorize the contribution of different elements of the task to a neuron’s overall firing rate (i.e., perform credit assignment). The example neurons in Figure 2E are opaque (contributes) or transparent (does not contribute) for different task variables, according to the P-GAMs estimate of their factorized contribution. Thus, there may be cases where there are clear evoked responses (e.g., time of movement onset in the dlPFC example neuron, Figure 2E, bottom) yet the P-GAM estimated the neuron to not be tuned to this variable, likely due to the correlated nature of input statistics during this naturalistic task (i.e., this neuron appears tuned to linear acceleration, which correlates with the time of movement onset). In Figure 2—figure supplements 36 we quantify effect sizes by comparing the modulation in firing rates, as well as the mutual information between these and a given variable, for the population of neurons significantly coding or not for a given task variable. Similarly, to provide a point of comparison with prior work studying optic flow processing, in Figure 2—figure supplement 7, we quantify the speed (i.e., liner velocity) and direction (i.e., angular velocity) discrimination index (see Methods and e.g., Chen et al., 2008; Avila et al., 2019) for neurons in MSTd, 7a, and dlPFC.

The fraction of neurons tuned to different task variables demonstrated a patterned mixed selectivity (Figure 2F). Namely, the fraction of neurons tuned to sensorimotor variables was highest in area 7a (linear and angular velocity, linear and angular acceleration, time of movement onset and offset, all p<0.01; Cohen’s d linear acceleration d=0.2, all the rest d>0.76), while most neurons in MSTd coded for eye position (58.8% and 63.3% respectively for vertical and horizontal eye position). Interestingly, a large fraction of dlPFC neurons also coded for eye position (dlPFC vs. 7a, vertical and horizontal eye position, p<6.4 × 10–17, all Cohen’s d>3), putatively reflecting the fact that the gaze indexes the internal belief over firefly position (Lakshminarasimhan et al., 2020). Neurons in area 7a showed a strong dependency to the phase of ongoing LFP fluctuations (see Figure 2—figure supplement 8 for further illustrations of this effect as quantified by the P-GAM, and Figure 2—figure supplement 9 for corroborative evidence by pairwise phase consistency, Vinck et al., 2010), a fact that was less observed in dlPFC or MSTd (all p<7.1 × 10–8, all d>1.02). Lastly, and most relevant to the task of path integrating to the location of invisible fireflies, we observed that the greatest fraction of neurons encoding for both path integration (i.e., travelled distance and angle turned) and the distance and angle to the target (i.e., spatial goal) were in dlPFC (ranging from 30.6% to 57.5%). Interestingly, while 7a had more neurons coding for path integration than MSTd (all p<1.6 × 10–5, all Cohen’s d>0.54), the latter area had a greater fraction of neurons coding for the latent distance and angle to target than 7a did (all p<6.1 × 10–5, all Cohen’s d>0.41). In the supplement we demonstrate that this coding was stable (contrasting odd vs. even trials; Figure 2—figure supplement 10) and task-relevant (Figure 2—figure supplement 11), in that passive viewing of the same stimuli did not elicit a comparable fraction of neurons tuned to task variables in 7a (passive viewing data in MSTd and dlPFC were unavailable). The fraction of neurons aligned with the phase of LFP in different frequency bands remained stable across passive and active viewing conditions, particularly in the beta band (all frequencies, active vs. passive, p=0.13; beta band, p=0.51). Altogether, the encoding pattern across areas may suggest that while dlPFC is critically involved in estimating the relative distance between self and target, 7a may be preferentially involved in the process of path integration, while somewhat unexpectedly, MSTd may play an important role in keeping track of the latent spatial goal.

Cortical codes for path integration and vector coding of spatial goals

Beyond the frequency with which we observe neurons tuned to the angle and distance from the origin (i.e., path integration) and to the target (i.e., vector coding of spatial goals), we may expect the distributions of preferred distances and angles to also be informative. Of note, distance/angle from origin and to the target are not the reciprocal of one another given that the target location varies on a trial-by-trial fashion. In other words, the travelled distance and the distance to target may correlate within a trial (but need not, given under- vs. overshooting) but certainly do not across trials (e.g., a distance of, say, 100 cm from the origin could corresponding to a whole host of distances from target). In Figure 3A we show rasters of representative neurons tuned to the distance from origin (example neuron 1) and to target (example neuron 2). The neuron tuned to the distance to target (example neurons 2) is not tuned to a particular distance from origin, but does demonstrate a patterned firing rate, discharging at further distances as the animal travels further. The third example (Figure 3A) is tuned to movement stopping, and demonstrates a pattern similar to the neuron tuned to distance to target when plotted as a function of distance from origin (Figure 3A, third vs. fifth panel), but not when visualized as a function of distance to target. The distributions shown in Figure 3B illustrate that the preferred distances/angles from origin and to target spanned the entire range of angles and distances visited, demonstrating a full basis set covering the feature space.

Figure 3 with 4 supplements see all
Preferred angle and distance from origin and to target in dorsomedial superior temporal area (MSTd), 7a, and dorsolateral prefrontal cortex (dlPFC).

(A) Rasters and average firing rate of three example neurons, sorted by their maximal distance from origin and to target. The first example neuron (left) responds at a distance of ~100 cm from origin and is not modulated by distance to target. The second example (middle) responds to a close distance to target (~30 cm). Arrows at the top of these rasters indicate the preferred distance from origin (example 1) and to target (example 2). We include a third example (tuned to movement stop) as a control, demonstrating that responding to a distance near the target and to stopping behavior are distinguishable. (B) Heatmaps showing neural responses (y-axis) sorted by preferred angles from origin (top), angle to target (second row), distance from origin (third row), and distance to target (bottom row) for MSTd (green), 7a (blue) and dlPFC (red) in Monkey S (data simultaneously recorded). Darker color indicates higher normalized firing rate. Neurons were sorted based on their preferred distances/angles in even trials and their responses during odd trials is shown (i.e., sorting is cross-validated, see Methods). (C) Histograms showing the probability of observing a given preferred angle or distance across all three monkeys. Inverted triangles at the top of each subplot indicate the median. Of note, however, medians may not appropriately summarize distributions in the case of bimodality. (D) We clustered the kernels driving the response to angle or distance to origin and from the target. Here, we show 10 representatives from each cluster (thin and transparent lines), as well as the mean of the cluster as a whole (ticker line). The inset quantifies the percentage of tunings within a particular area and for the particular variable that were deemed to belong within the cluster depicted (the most frequent one).

Across all animals, angles specifying a 0° offset from the heading at origin were over-represented (Figure 3C, top row, proportions across all animals), while the 0° offset from target location was under-represented (Figure 3C, second row). Instead, particularly in area 7a and dlPFC, the distribution of preferred angles to target was bimodal (bimodality coefficients, null = 0.55, MST = 0.57, 7a=0.89, dlPFC = 0.80, 7a and dlPFC p<0.05), with many neurons showing a preference for ~45–90° away from target, either leftward or rightward. This pattern is in stark contrast with observations from the bat’s hippocampus, where vector coding of hidden spatial goals exists and a goal angle of 0° is over-represented (Sarel et al., 2017). Speculatively, this reversed pattern of preferences between cortex (here) and hippocampus (Sarel et al., 2017) may suggest that while the former plays a role in guiding heading toward the desired state (akin to an error or correction signal), the hippocampus maintains this heading.

In terms of radial distance, MSTd, area 7a, and dlPFC all showed an over-representation of units coding for distances near to, as opposed to far from, the starting location (Figure 3C, third row, <~150 cm). On the other hand, we observed a clear differentiation between cortical areas regarding their preferred distance to target. Area 7a showed a strong preference for nearby targets with approximately 50% of units coding for targets within ~30 cm. Neurons in MSTd and dlPFC, on the other hand, responded primarily at intermediary and far distances from the target (~200–400 cm from target; Figure 3C, fourth row). The preference for distances further from the target in MSTd and dlPFC seemingly concords with their frequent tuning to eye position, and the fact that the eyes attempt to pursue the hidden target (i.e., as if these areas were involved in planning and sampling from a distance).

Lastly, to depict the full shape of the kernels encoding the distance and angle from the origin and to the target, we performed statistical clustering of these kernels, separately for each area and task variable. Figure 3D shows the mean and 10 example tuning functions for the most frequently present cluster within each area. The most noticeable difference between areas is that while MSTd and dlPFC prefer distances far from the target, area 7a responds mostly to distances near it (Figure 3C, bottom row). In Figure 3—figure supplements 14 we depict all the different types (i.e., high-dimensional clusters) of tuning functions that exist for all the different task variables (e.g., linear and angular velocity, horizontal and vertical eye position, etc.) in MSTd, area 7a, and dlPFC.

Single-cell properties and unit-to-unit coupling suggest two distinct functional subnetworks

Both the fraction of neurons tuned to different task variables (Figure 2) and the distribution of preferred angles and distances from origin and to target (Figure 3) show a surprising degree of coding similarity between MSTd and dlPFC. In turn, to systematically examine how the encoding of all task variables, and not solely a select few, varied across cortical areas, we employed a statistical clustering approach (see, e.g., Minderer et al., 2019, for a similar approach).

First, we leveraged the knowledge of which variables each neuron was significantly tuned to (e.g., Figure 2F), and attempted clustering neurons based on this binary data, each neuron being defined by a size 17 binary vector (i.e., tuned or not tuned) corresponding to the 17 task variables defined in Figure 2. This approach (see Methods for details) showed that nearly all MSTd (89%) and dlPFC (94%) neurons were categorized as belonging within the same cluster (Figure 4A, cluster number 1), one that was defined by true mixed selectivity (Figure 4B, top left). In contrast, area 7a neurons appeared in three distinct clusters (Figure 4A, cluster numbers 1–3, respectively, 36%, 22%, and 31%). Cluster 2 was characterized by a strong selectivity for sensorimotor variables and firing in alignment with specific LFP phases (Figure 4B, top center), while Cluster 3 was characterized by a near complete lack of tuning to latent variables and eye position (Figure 4B, top right). Other cluster types existed, for instance composed of neurons selectively tuned to the ongoing phase in LFP bands but no other task variable (Figure 4A and B, Cluster 4), or driven also by motor onset and offset (Figure 4A and B, Cluster 5). These remaining clusters were, however, less common (~1–5%). This analysis was conducted with the full dataset (4254 neurons in total), yet in the supplement (Figure 4—figure supplement 1A) we confirm that the results are unchanged when subsampling from areas with more neurons (7a and dlPFC) to match the number present in MSTd (231 neurons). Together, this pattern of clustering results based on whether neurons were tuned to different task variables demonstrated a surprising degree of coding similarity between MSTd and dlPFC, which are in turn different from area 7a.

Figure 4 with 4 supplements see all
Global single-unit encoding profiles and unit-to-unit coupling properties suggest a common functional role for dorsomedial superior temporal area (MSTd) and dorsolateral prefrontal cortex (dlFPC).

(A) Proportion of neurons being classified into distinct cluster (x-axis, seven total) according to which task parameters they were significantly tuned to. (B) Fraction of neurons tuned to each of the 17 task variables (order is the same as in Figure 2C, E, F) according to cluster. (C) Uniform Manifold Approximation and Projection (UMAP) of the tuning function shapes, color coded by brain area (first column) or Density-Based Spatial Clustering of Applications with Noise (DBSCAN) cluster (second, third, and forth column). (D) Fraction of neurons whose spiking activity is phase-locked to local field potential (LFP) phases in other areas, in theta (leftmost), alpha (center) and beta (rightmost) bands. An arrow projecting from, for example, MSTd to 7a (center, 0.24), indicates that the neuron in area 7a is influenced by ongoing LFP phase in MSTd. Width of arrows and associated number indicate the proportion of neurons showing the particular property. (E) As (D) but for unit-to-unit coupling. An arrow projecting from, for example, MSTd to dlPFC indicates that the firing of a neuron in MSTd will subsequently influence spiking activity in dlPFC. (F) Left: Cross-validated pseudo-R2 of the full encoding model (y-axis) and a reduced model without within and across area unit-to-unit coupling (x-axis). Right: Probability distribution of the change in pseudo-R2 from the reduced to the full model.

Subsequently, we questioned whether utilizing knowledge of the shape of these tuning functions (as opposed to simply whether tuning to a variable was significant or not) would dissociate between neurons in each area. Tuning functions for all task variables of a given neuron were stacked, then all neurons were projected onto a low-dimensional manifold via Uniform Manifold Approximation and Projection (UMAP; McInnes et al., 2020), and clustered on this projection via DBSCAN (Density-Based Spatial Clustering of Applications with Noise; Ester, 1996). Area 7a and dlPFC neatly segregated from one another, while neurons from MSTd could be found along a continuum from area 7a to dlPFC. Notably, however, the centroid of MSTd was 6.49 times closer to the centroid of dlPFC than area 7a (Figure 4C, top row. Note that Becht et al., 2018, have shown UMAP to conserve global structure and thus allows for a meaningful interpretation of distances). This finding also holds when subsampling from 7a and dlPFC to match the number of units present in MSTd (100 iterations, MSTd-dlPFC distance was 5.56 times closer than MSTd-7a, 95% CI = [4.07, 7.33]; Figure 4—figure supplement 1B). DBSCAN clustering highlighted three main clusters within this low-dimensional projection of tuning functions. Cluster 1 (Figure 4C, second row) contained 81.5% of all dlPFC neurons, 58.9% of all MSTd neurons, and 3.6% of all neurons from 7a. Clusters 2 and 3 exclusively contained neurons from area 7a (Figure 4C, third and fourth row). Thus, just as the clustering based on what variables were neurons tuned to, clustering based on tuning shapes also highlighted a stronger similarity in the encoding properties of MSTd and dlPFC, as opposed to either of the former and area 7a (see Figure 4—figure supplement 2 for UMAPs color coded by the mutual information between neural responses and a particular task variable).

Given the similarity in encoding profiles between MSTd and dlPFC, we next examined inter-area global synchronization and unit-to-unit coupling to question whether these two areas formed an interacting functional subnetwork. We examined inter-area coordination both from the standpoint of coarse grain LFP-to-spike phase locking, and via finer resolution unit-to-unit couplings as estimated from the P-GAM (Balzani et al., 2020b).

Spiking activity in some units was well explained by the ongoing LFP phase in different regions (above and beyond the ongoing LFP phase in their local area, see Figure 2F and Figure 2—figure supplements 8 and 9). Most notably, 37% of neurons in dlPFC were tuned to the ongoing LFP phase within the beta range in area 7a. An even greater proportion of units showed the reciprocal property, with 59% of neurons in area 7a being modulated by the ongoing phase of beta-band LFP in dlPFC. A considerable proportion of 7a units were also modulated by the ongoing phase of alpha-band LFP in MSTd (24%, Figure 4D). Globally, therefore, phase locking of spiking activity to LFP phase in different areas reflected known anatomical connections, with reciprocal projections between MSTd and 7a, as well as between 7a and dlPFC (Andersen et al., 1990; Rozzi et al., 2006). Interestingly, the putative ‘feedback’ influence from dlPFC to area 7a (potentially reflecting the ‘belief’ guiding motor action) was stronger than the putative ‘feedforward’ influence of 7a onto dlPFC. The lowest frequency we indexed (theta, 4–8 Hz) seemed to be reserved for putative ‘feedback’ influence of dlPFC onto both MSTd (12%) and area 7a (18%).

Finer grain unit-to-unit coupling was sparse, and within each area the probability of two units being functionally connected decreased as a function of the distance between neurons (Figure 4—figure supplement 3A). The overall likelihood of two units being coupled within a given area changed as a function of brain area and was modulated by task engagement (active vs. passive viewing in area 7a; Figure 2—figure supplement 11C), but not as a function of probe type used (Utah array or linear probe, see Figure 4—figure supplement 4). There were more units coupled in MSTd (13%, corrected for distance between electrodes, see Methods) and area 7a (22%) than in dlPFC (7%, Figure 4E), potentially reflecting an increase in the dimensionality of the neural code (i.e., decrease in redundancy) as we move from sensorimotor to cognitive areas. More importantly, the across area unit-to-unit coupling did not reflect global anatomical trends, and instead supported the functional association between MSTd and dlPFC, as suggested by the encoding profiles of these areas. Twelve percent of neurons in dlPFC were coupled to activity in MSTd, while 7% of neurons in MSTd were coupled to activity in dlPFC. Importantly, neither of these regions showed either a ‘feedforward’ or ‘feedback’ likelihood of being coupled with area 7a that exceeded 4% (Figure 4E).

Four arguments support the fact that unit-to-unit coupling as estimated by the P-GAM reflect a property of neural coding beyond global volume conductance. First, they significantly improved fits of the encoding model. Figure 4F shows the cross-validated pseudo-R2 for the encoding model, including all task variables, LFP phases, and unit-to-unit coupling (‘coupled model’, y-axis), relative to a reduced model lacking the unit-to-unit coupling (‘uncoupled model’, x-axis). Second, inter-area coupling filters did not show oscillatory behavior. Instead of sinusoidal patterns, we frequently observed gradual increases or decreases in the likelihood of the examined neuron to spike given the activity of another neurons (see Figure 4—figure supplement 3B for a ‘dictionary’ of coupling filters across areas and their relative frequency). Third, while units phase-locked to the LFP in a different region were likely to be phase-locked to LFP in their own area (reflecting the utility of LFPs in coordinating spiking activity across areas), there was no significant change in the likelihood of coupled vs. uncoupled units to be phased-locked to LFPs (Figure 4—figure supplement 3C). Lastly, and most importantly, the likelihood of two units being coupled was not random or purely depending on their physical distance, but instead varied as a function of their tuning similarity. Figure 4—figure supplement 3D shows that the more similar the tuning functions were between two neurons, the more likely these were to be coupled. The P-GAM was able to reflect this well-established ‘like-to-like connectivity’ (Cossell et al., 2015; Chettih and Harvey, 2019) within a naturalistic closed-loop task, even for latent variables, and even in MSTd.

Altogether, both the global encoding profiles and fine-grain unit-to-unit coupling across areas (Figure 4) suggested the presence of an MSTd-dlPFC functional subnetwork within this closed-loop virtual navigation task. Given the high probability to encountering neurons in MSTd and dlPFC tuned to the distance and angle from the target, as well as to eye position (Figure 2), and given the animals’ tendency to keep track of the firefly location with their eyes (Figure 1, and Lakshminarasimhan et al., 2020), we hypothesized that this MSTd-dlPFC network may reflect the monkey’s embodied mnemonic strategy in navigating to hidden targets.

MST-dlPFC coupling reflect the animals’ strategy of tracking hidden targets with their eyes

The monkeys’ gaze reflected their belief about the firefly location. Indeed, within a session and across trials, the mean absolute difference between a monkey’s eye position and where they ought to be looking if they were perfectly tracking the firefly correlated with steering endpoint error (r2 ± standard deviation across all datasets, 0.36±0.22; for shuffled data, 0.16±0.09, p=5.6 × 10–3, paired t-test; see Lakshminarasimhan et al., 2020, for the original description of this effect). Further, this relationship also held across sessions, with better target tracking correlating with less bias (slopes closer to 1, see Figure 1C), particularly in the angular domain (r2=0.43, p=0.004; radial: r2=0.26, p=0.04; Figure 5A), and with an increasing proportion of rewarded trials (r2=0.24, p=0.042). Thus, we may question whether the likelihood of observing unit-to-unit coupling (defined within a session but not within a trial) relates to session-by-session changes in target tracking and/or steering performance.

Figure 5 with 1 supplement see all
Increased dorsomedial superior temporal area (MSTd)-dorsolateral prefrontal cortex (dlPFC) coupling correlated with an increased likelihood of animals keeping track of the invisible fireflies with their eyes.

(A) Correlation between the target tracking index (x-axis, i.e., tracking the hidden target with their eyes) and the radial (top) or angular (bottom) bias (defined as the slope relating responses and targets, as in Figure 1). Only sessions included in the neural analysis in Panel B were included in Panel A. (B) Correlation between the target tracking index (x-axis) and the fraction of neurons coupled within the ‘sender’ region (y-axis). The diagonal shows within area couplings (MSTd-MSTd, 7a-7a, dlPFC-dlPFC), while off-diagonals show across area couplings. R2 and p-values are shown as insets for the significant correlations.

For well-fit sessions (mean pseudo-r2 >0.05) with at least two units in each of two different areas, we computed the monkey’s target tracking index (averaged across the entire trial and then across trials), as well as the probability of units being coupled with others within and across areas, as estimated by the P-GAM. As shown in Figure 5B (top right), we observed a strong association between the fraction of units showing MST-to-dlPFC coupling and the monkeys’ ability to track the hidden firefly with their eyes (r2=0.63, p=0.003). This association, with more coupling indexing better tracking of the firefly, was also true for the ‘feedback’ projection from dlPFC-to-MSTd (r2=0.41, p=0.035, Figure 5B, bottom left), as well as for dlPFC-to-dlPFC coupling (r2=0.43, p=0.047, Figure 5B, bottom right). The other sender-receiver pairings, including MSTd-to-MSTd, did not show a correlation with the ability of the monkey to track the hidden firefly with their eyes (all p>0.08). As shown in Figure 5—figure supplement 1, there was no correlation between the unit-to-unit couplings within a session, and either radial or angular biases (7a-7a coupling vs. angular bias p=0.06; all other p>0.12). There was similarly no correlation between the proportion of rewarded trials in a session and unit-to-unit coupling probability (all p>0.11, Bonferroni corrected). Overall, therefore, the functional subnetwork between MSTd and dlPFC (Figure 4) seemingly reflects the animals’ strategy in keeping track of the hidden target with their eyes. In turn, the eye movements (but not MSTd-dlPFC coupling directly) aid in successfully navigating to the location of the hidden target.

Discussion

The lion’s share of our understanding linking spiking activity to goal-directed behavior originates from tasks that bear little resemblance to real-world self-environment interactions. Major differences include the prevalence of closed loops between action and perception in natural behaviors, as well as the necessity to concurrently accomplish multiple sensory and cognitive operations over protracted periods of time. To examine how the brain operates and coordinates among multiple neural nodes under these naturalistic conditions, we recorded simultaneously from three interconnected cortical areas (MSTd, 7a, and dlPFC) while non-human primates detected a target, remembered its location, and path integrated to it by actively sampling from their environment. We focused on the encoding profiles of single units and described four main findings.

First, we found that all task variables were present across all brain areas probed. This multiplexing, however, was not random. The prefrontal cortex showed the greatest proportion of neurons tuned to latent variables. Posterior parietal cortex showed a strong selectivity for sensorimotor variables and the ongoing phase of LFPs. MSTd was prominently tuned to eye position (Komatsu and Wurtz, 1988; Nadler et al., 2009). Most notoriously, even MSTd, an area traditionally considered a sensory node, showed a considerable number of neurons encoding latent variables, such as the angle or radial distance from origin (i.e., path integration) and to target (i.e., vector coding of spatial goals). In fact, the global encoding profile of MSTd was strikingly akin to that of dlPFC, suggesting this area need not be purely sensory (see Mendoza-Halliday et al., 2014, for a similar argument).

Second, we found that MSTd, area 7a, and dlPFC all show evidence for vector-based coding of spatial goals (Sarel et al., 2017, also see Ekstrom et al., 2003; Gauthier and Tank, 2018; Marigold and Drew, 2017, for similar evidence in humans, rodents, and cats, respectively). Interestingly, we observed that area 7a showed a majority of neurons coding for locations near the origin or near the target. That is, 7a is likely involved in body-centered state transitions (e.g., from static to moving) and approaching behavior (see Medendorp and Heed, 2019, for a similar argument and Serino, 2019, for a review on the role of posterior parietal cortex in peri-personal space encoding). dlPFC, on the other hand (and to a lesser extent MSTd), showed a preponderance of units coding for locations near the origin and far from the target. Interestingly, human fMRI work (Howard et al., 2014) has suggested that the posterior hippocampus encodes path distance to goals, while the anterior hippocampus encodes a more abstract Euclidean distance. Together with the current findings, and the fact that anterior and posterior hippocampus respectively project to the prefrontal and parietal cortex (Strange et al., 2014) suggests that a circuit including the anterior hippocampus and prefrontal cortex may abstractly plan goal-directed trajectories, while a circuit including the posterior hippocampus and posterior parietal cortex may compute body-centric sensorimotor affordances.

The fact that all areas probed showed a preference for distances near (as opposed to far) from where the monkey started path integrating (i.e., the origin) may suggest a change in the computations undertaken early vs. late within a single trajectory. Namely, early on observers may compute their ongoing distance from the origin, and then switch to computing distance not from origin, but from the target. This switch from first coding displacement from origin to then coding location relative to a target has been previously suggested empirically (Gothard et al., 1996) and in recent computational accounts (Kanitscheider and Fiete, 2017).

The third major finding relates to the functional organization across MSTd, area 7a, and dlPFC, and again suggests that navigating to a remembered target may involve two interdependent but separable computations: (i) estimating own’s own location relative to a starting location (i.e., a landmark) and (ii) estimating the position of the target and discounting self-location from target location. Indeed, anatomical tracer studies suggest a distributed hierarchy, with recurrent connections between MSTd and 7a, and then recurrent connections between 7a and dlPFC (Andersen et al., 1990; Rozzi et al., 2006). As expected from anatomy, here we observed that spiking activity in 7a is influenced by LFP fluctuations in MSTd (particularly in the alpha range). In turn, spiking activity in dlPFC and 7a are influenced by LFP fluctuations, respectively, in 7a and dlPFC (particularly in the beta range). In other words, we observed that the coarse channels of communication riding on LFPs are in line with anatomy. Further, the strong tuning of area 7a to sensorimotor variables and the strong tuning of both area 7a and dlPFC (but not MSTd) to the distance and angle from origin suggest that this pathway may be primarily involved in path integration as opposed to the vector coding of spatial goals. In addition to this MSTd-7a-dlPFC pathway, we also observed significant functional coupling between MSTd and dlPFC, independent from 7a. Namely, the likelihood of MSTd neurons being coupled to dlPFC, and of dlPFC neurons being coupled to MSTd, was three and almost two times as large as the unit-to-unit coupling among either of these areas with 7a. These results suggest that despite the lack of direct anatomical connections, MSTd and dlPFC may form a functional subnetwork within this closed-loop navigate-to-target task.

Lastly, we examined the putative functional role of this MSTd-dlPFC subnetwork. As our group has previously reported, monkeys naturally track with their eyes the location of hidden goals during navigation (Lakshminarasimhan et al., 2020). Given this task strategy, as well as the prominent tuning of both MSTd and dlPFC to eye position and the distance and angle to the hidden target, we hypothesized that the MSTd-dlPFC subnetwork may reflect the monkeys’ innate task strategy. Indeed, we found that the more units were coupled between MSTd and dlPFC (either ‘feedforward’ or ‘feedback’), the better the monkeys tracked the hidden goal with their eyes. Thus, we suggest that the dynamic functional coupling between MSTd and dlPFC may reflect the innate task strategy monkeys adapted in pursuing the goal location with their eyes.

We must acknowledge a number of limitations and areas of ongoing or future experimentation. First, while the task we employed here is more naturalistic than most, further improvements are possible. For instance, virtual and real-world navigation may rely on partially distinct neural codes (Aghajan et al., 2015). Thus, it will be interesting to replicate the current experiment while macaques move freely in a 3D environment (e.g., Mao et al., 2021). This would also allow for independent eye and head movements (head was restrained here) and thus we could estimate whether eye movements in the current experiment partially reflected intended head movements (as they seemingly do in rodents; Michaiel et al., 2020; Meyer et al., 2020). Performing a ‘firefly task’ in a real environment would also suppose a more complex set of visual inputs (e.g., corners, textures, shadows) that could be leveraged in an expanded P-GAM taking visual features as input (see Parker et al., 2022, for recent work taking this approach). The second limitation relates to the (necessarily incomplete) sampling of neural areas, and the focus on single units as opposed to population dynamics. We report a functional subnetwork between MSTd and dlPFC based on the similarity of their encoding profiles (though they are of course not identical) and the likelihood of encountering unit-to-unit couplings across these areas. But this functional connection must be subserved by structure (e.g., perhaps a third area we did not record from fluctuating with both MSTd and dlPFC). Thus, in ongoing experiments we have trained rodents to perform the ‘firefly task’. This will allow recording from a wider array of neural areas and cortical layers (most of the recordings reported here being from Utah arrays and hence likely from a single layer and of limited independence). Similarly, to further corroborate the functional subnetwork between MST and dlPFC, it will be interesting to examine population dynamics and the possibility that these areas form a functional ‘communication subspace’ (Semedo et al., 2019), adapted to the naturalistic setting of this task (see Balzani et al., 2022b). The last limitation relates to causality. The results we report here amount to detecting a correlation (i.e., spikes occurring most often at a given LFP phase and this correlation not being accounted by other sensorimotor, latent, or internal covariates). In future experiments it will be interesting to test for causality within this network, by either demanding observers to fixate elsewhere (Lakshminarasimhan et al., 2020), or by directly perturbing this network, for instance by micro-stimulation or optogenetic manipulations.

In conclusion, we demonstrate what may be broad principles of neural computation during closed action-perception loops: (i) mixed yet patterned single cell selectivity, (ii) coding of latent variables even in areas traditionally considered as purely sensory, and (iii) differential coarse (e.g., LFP-spike phase alignment) and fine-grain connectivity between task-relevant areas. Most notoriously, here we indexed the presence of significant noise correlations between MSTd and dlPFC, independently from 7a. We suggest that this coupling between sensory and prefrontal areas may reflect (embodied) task strategies.

Methods

Animals and animal preparation

We recorded extracellularly from areas MSTd, area 7a, and dlPFC in three adult male rhesus macaques (Macaca mulatta; 9.8–13.1 kg). We collected behavioral and neural data from 27 recording sessions from Monkey Q, 38 sessions from Monkey S, and 17 sessions from Monkey M (see Figure 2—figure supplement 1 for additional detail regarding the recording locations from each animal). All animals were chronically implanted with a lightweight polyacetal ring for head restraint. Further, for acute recordings, animals were outfitted with a removable grid to guide electrode penetrations. For eye tracking, a subset of monkeys (Monkey Q) were implanted with scleral coils (CNC Engineering, Seattle WA, USA), while eye tracking was performed using a video-based system (ISCAN Inc, Woburn, MA, USA) in the remaining animals (Monkeys S and M). Monkeys were trained using standard operant conditioning procedures to use a joystick to navigate in virtual reality and stop at the location of briefly presented targets, ‘fireflies’. All surgeries and procedures were approved by the Institutional Animal Care and Use Committee at Baylor College of Medicine (Protocol A3317-01) and New York University (Protocol number 18-1502) and were in accordance with National Institutes of Health guidelines.

Experimental setup

Monkeys were head-fixed and secured in a primate chair. A three-chip DLP projector (Christie Digital Mirage 2000, Cypress, CA, USA) rear-projected images onto a 60×60 cm2 screen that was attached to the front of the field coil frame, ~30 cm in front of the monkey. To navigate, the animals used an analog joystick (M20U9T-N82, CTI electronics) with two degrees of freedom to control their linear and angular speeds in a virtual environment. This virtual world comprised a ground plane whose textural elements had limited lifetime (~250 ms) to avoid serving as landmarks. The ground plane was circular with a radius of 70 m (near and far clipping planes at 5 cm and 40 m, respectively), with the subject positioned at its center at the beginning of each trial. Each texture element was an isosceles triangle (base × height: 8.5 × 18.5 cm2) that was randomly repositioned and reoriented at the end of its lifetime, making it impossible to use as a landmark. The maximum linear and angular speeds were fixed to 2 m/s and 90 deg/s. The density of the ground plane was 5.0 elements/m2. All stimuli were generated and rendered using C++ Open Graphics Library (OpenGL; Nvidia Quadro FX 3000G accelerator board) by continuously repositioning a virtual camera based on joystick inputs to update the visual scene at 60 Hz. The virtual camera was positioned at a height of 0.1 m above the ground plane. Given the OpenGL approach in re-positioning a virtual camera within a 3D space, depth cues included linear perspective, texture gradient, and motion parallax. Further, the stimulus was rendered as a red-green anaglyph and monkeys wore goggles fitted with Kodak Wratten filters (red #29 and green #61) to view the stimulus. This additionally provided binocular parallax. The binocular crosstalk for the green and red channels was 1.7% and 2.3%, respectively. Spike2 software (Cambridge Electronic Design Ltd., Cambridge, UK) was used to record and store the timeseries of target and animal’s location, animal linear and angular velocity, as well as eye positions. All behavioral data were recorded along with the neural event markers at a sampling rate of 833.33 Hz.

Behavioral task

Monkeys steered to a target location (circular disc of radius 20 cm) that was cued briefly (300 ms) at the beginning of each trial. Each trial was programmed to start after a variable random delay (truncated exponential distribution, range: 0.2–2.0 s; mean: 0.5 s) following the end of the previous trial. The target appeared at a random location between –40 and 40 deg of visual angle, and between 1 and 4 m relative to where the subject was stationed at the beginning of the trial. The joystick was always active, and thus monkeys were free to start moving before the target vanished, or before it appeared. Monkeys typically performed blocks of 750 trials before being given a short break. In a session, monkeys would perform two or three blocks. Feedback in the form of juice reward was given following a variable waiting period after stopping (truncated exponential distribution, range: 0.1–0.6 s; mean: 0.25 s). They received a drop of juice if their stopping position was within 0.6 m from the center of the target. No juice was provided otherwise.

Neural recording and pre-processing

We recorded extracellularly, either acutely using a 24- or 36-channel linear electrode array (100 µm spacing between electrodes; U-Probe, Plexon Inc, Dallas, TX, USA; MSTd in Monkeys Q and S, and dlPFC in Monkey M) or chronically with multi-electrode arrays (Blackrock Microsystems, Salt Lake City, UT, USA; 96 electrodes in area 7a in Monkey Q, and 48 electrodes in both area 7a and dlPFC in Monkey S). During acute recordings with the linear arrays, the probes were advanced into the cortex through a guide-tube using a hydraulic microdrive. Spike detection thresholds were manually adjusted separately for each channel to facilitate real-time monitoring of action potential waveforms. Recordings began once the waveforms were stable. The broadband signals were amplified and digitized at 20 kHz using a multichannel data acquisition system (Plexon Inc, Dallas, TX, USA) and were stored along with the action potential waveforms for offline analysis. Additionally, for each channel, we also stored low-pass filtered (–3 dB at 250 Hz) LFP signals. For the chronic array recordings, broadband neural signals were amplified and digitized at 30 kHz using a digital headstage (Cereplex E, Blackrock Microsystems, Salt Lake City, UT, USA), processed using the data acquisition system (Cereplex Direct, Blackrock Microsystems) and stored for offline analysis. Additionally, for each channel, we also stored low-pass filtered (–6 dB at 250 Hz) LFP signals sampled at 500 Hz. Finally, copies of event markers were received online from the stimulus acquisition software (Spike2) and saved alongside the neural data.

Spike detection and sorting were performed on the raw (broadband) neural signals using KiloSort 2.0 software (Pachitariu et al., 2016) on an NVIDIA Quadro P5000 GPU. The software uses a template matching algorithm both for detection and for clustering of spike waveforms. The spike clusters produced by KiloSort were visualized in Phy2 and manually refined by a human observer, by either accepting or rejecting KiloSort’s label for each unit. In addition, we computed three isolation quality metrics: inter-spike interval violations (ISIv), waveform contamination rate (cR), and presence rate (PR). ISIv is the fraction of spikes that occur within 1 ms of the previous spike. cR is the proportion of spikes inside a high-dimensional cluster boundary (by waveform) that are not from the cluster (false positive rate) when setting the cluster boundary at a Mahalanobis distance such that there are equal false positives and false negatives. PR is 1 minus the fraction of 1 min bins in which there is no spike. We set the following thresholds in qualifying a unit as a single unit: ISIv <20%, cR <0.02, and PR >90%.

Analyses

Behavior

The location of targets and monkey’s endpoints were expressed in polar coordinates, with a radial distance (target = r, response = r~) and eccentricity from straight ahead (target = θ; response = θ~). On a subset of trials (~5%) animals stopped within 0.5 m of the origin (range of targets, 1–4 m). Similarly, on a subset of trials (~13%) animals did not stop during the course of a trial (max duration = 7 s). These trials were discarded before further behavioral analyses. As we have observed before (Lakshminarasimhan et al., 2018; Lakshminarasimhan et al., 2020; Noel et al., 2020; Noel et al., 2021), a linear model with multiplicative gain accounted well for the observed data (average R2=0.72). Thus, we used the slopes of the corresponding linear regressions as a measure of bias. Note that in this scheme a slope of one indicates no bias (i.e., targets and endpoints lie along the identity line), whereas slopes smaller than one indicate a bias wherein animals undershoot targets (either in radial or angular distance).

Poisson generalized additive model

The P-GAM (https://github.com/savin-lab; Balzani, 2020a) defines a non-linear mapping between spike counts of a unit ytN0 and a set of continuous covariates xt (angular and linear velocity and acceleration, angular and linear distance travelled, angular and linear distance to target, and LFP instantaneous phase across different frequency ranges), and discrete events zt (time of movement onset/offset, target onset, reward delivery, and the spike counts from simultaneously recorded units). As such, inputs to the P-GAM were of three types. First, the spike counts of the unit to be modeled, at a 6 ms resolution (i.e., the number of spikes within 6 ms windows, no baseline correction). Second, the continuous, discrete, and neural covariates, which were also sampled at a 6 ms resolution. The last input type were a set of 15 ‘knots’ per covariate, defining the nodes of eventual tuning functions. The location of knots were defined as to (i) cover the range of a given input variable from the 2nd to the 98th percentile with (ii) equi-probable knots (each knot covering the same probability mass). See (https://github.com/BalzaniEdoardo/PGAM/blob/master/PGAM%20Tutorial.ipynb; Balzani, 2022c) for a comprehensive tutorial.

The unit log-firing rate is modeled as a linear combination of arbitrary non-linear functions of the covariates,

(1) logμ=jfjxj+kfk*zk

where * is the convolution operator, and the unit spike counts are generated as Poisson random variables with rate specified by Equation 1.

Input specific non-linearities f are expressed in terms of flexible B-splines, fβb and are associated with a smoothness enforcing penalization term controlled by a scale parameter λf,

(2) PENf,λf=-12λfβTSfβ,Sf=b``b``Tdx

The larger λf , the smoother the model. This penalization terms can be interpreted as Gaussian priors over model parameters. The resulting log-likelihood of the model takes the form,

(3) Ly=logpyx,z,β+fPENf,λf

with yRT being the spike counts of the unit, xRJ×T being the continuous task variables, zRK×T being the discrete task events, T being the time points, β being the collection of all B-spline coefficients, and p the Poisson likelihood. Both parameters β and the hyperparameters λ are learned from the data by an iterative optimization procedure that switches between maximizing Equation 3 as a function of the parameters and minimizing a cross-validation score as a function of the hyperparameters (see Balzani et al., 2020b, for further details).

The probabilistic interpretation of the penalization terms allowed us to compute a posterior distribution for the model parameters, derive confidence intervals with desirable frequentist coverage properties, and implement a statistical test for input variable inclusion that selects a minimal subset of variables explaining most of the variance. In Figure 2—figure supplement 2, we demonstrate that this approach will appropriately select variables contributing to the spike trials of individual neurons. Further, we show that including all variables in the model (hundreds to thousands of parameters, given the cell-to-cell coupling) does not outperform the selected (i.e., ‘reduced’) model, which typically has an order of magnitude less parameters. The approach circumvents traditional model-comparison-based variable selection, which would be unfeasible in a fully coupled model with hundreds of covariates.

To show the stability in the estimated tuning functions, Figure 2—figure supplement 10 shows the fraction of units tuned to a given task variable as a function of brain area, and as a function of whether odd or even trials were fit to the P-GAM. Namely, we fit half of the dataset each time and show that the fraction of neurons tuned to a given task variable was the same regardless of whether we fit the odd numbered trials or the even numbered trials. Similarly, we index the ‘preferred’ distances and angles from origin and to target (Figure 3) as defined by the peak of tuning functions. In Figure 3B we sort neurons according to their preferred distances or angles in one subset of trials (i.e., ‘even’ trials) and plot the normalized responses in the other subset of trials (‘odd’ trials). Figure 3B therefore demonstrates that not only the fraction of neurons tuned to different variables was stable, but the estimated tuning functions were as well.

Pseudo-R2

Fit quality was assessed via the pseudo-R2 on subset of held-out test trials (20% of the total trials, not used for inferring model parameters). Pseudo-R2 is a goodness of fit measure that is suitable for models with Poisson observation noise (Colin Cameron and Windmeijer, 1997). The score is computed as:

(4) pseudoR2=1L(y)L(y^)L(y)L(y¯)

with Ly being the likelihood of the true spike counts, L (y^) being the likelihood of the P-GAM model prediction, and Ly´ being the likelihood of a Poisson null-model (constant mean rate). It can be interpreted as the fraction of the maximum possible likelihood (normalized by the null model) that the fit achieves. The score is 0 when the GAM fits are no more likely than the null model, 1 when it perfectly matches the data, and rarely can be negative (for test-set data, 3% of the recorded units) when overfitting occurs. In this latter case we excluded the unit from analysis. Empirically, the pseudo-R2 is a stringent metric and ranges in values that are substantially lower than the standard R2 when both are applicable (Domencich and McFadden, 1975) and depends on the firing statistics of the area recorded (such as mean firing rates and variance), the type of covariates, and their number (Benjamin et al., 2018). Our average score of 0.072 is about three times better than standard GLM performance even in areas such as primary somatosensory and motor areas (Benjamin et al., 2018).

Speed and direction discrimination index

To allow for comparison with prior reports studying optic flow processing within the cadre of two-alternative forced-choice tasks, we compute the discrimination index for speed (i.e., linear velocity) and direction (i.e., angular velocity) in MSTd, 7a, and dlPFC. The discrimination index (DDI) was defined as:

(5) DDI=RmaxRminRmaxRmin+2SSE(NM)

where Rmax and Rmin are the maximum and minimum response from a tuning function, SSE is the sum of squared errors around the mean responses, M is the number of stimulus directions, and N is the total number of observations. In the context of the current naturalistic experiment, we may bin linear and angular velocities and compute tuning functions defining Rmax and Rmin. The number of bins (here, we used 15 nodes, as defined by the P-GAM) defines M. To estimate SSE and N we must define trials wherein the full gamut of linear and angular velocities are experienced. To facilitate direct comparison with Chen et al., 2008, we divided our recordings in 80 segments, the mean number of trials (N) in Chen et al., 2008. Lastly, we computed DDI according to Equation 5.

Clustering

For Figure 4A and B, clustering was performed by employing a spectral clustering algorithm based on the Jaccard distance metric (Shi and Malik, 2000). This approach finds the k eigenvectors to split Jaccard distances k ways. Different values of k (≥3) resulted in conceptually identical findings. The results reported in the main text were generated for the full dataset, and thus a 4254 (231 MSTd neuron + 3200 7a neurons + 823 dlPFC neurons) × 17 (task variables) input matrix (where each entry denotes whether a particular neuron was significant or not for a particular task variable). In Figure 4—figure supplement 1A we confirm that the results reported in the main text were not driven by the fact that we had a considerably larger number of 7a neurons than other. Namely, we perform 10,000 iterations of spectral clustering while subsampling from 7a and dlPFC (without replacement) to match the number of units in MSTd (231). At each iteration, we compute the ratio of the size of the largest cluster, to the total number of units for a given area (i.e., 231). Results confirm that MSTd and dlPFC neurons were most often assigned to a single cluster, while neurons in 7a were most readily assigned to one of three clusters.

For Figure 4C all tuning functions of a neuron were stacked together, so each neuron was represented by a vector of dimension ni, where ni is the dimension of the i th tuning function. Then, the high-dimensional matrix (neurons × ni) was reduced to a matrix (neurons × m) by PCA projection, where m is the number of PCs that explains >90% of the variance. Finally, the matrix was further projected onto a 2D manifold using UMAP (McInnes et al., 2020) and clustered using DBSCAN (Ester, 1996). This clustering method automatically determined the number of clusters on the basis of the spatial density of the UMAPs. Clusters depicted in Figure 3D, Figure 3—figure supplements 14, and Figure 4—figure supplement 3B were equally determined by DBSCAN. As for Figure 4A and B, for Figure 4C we confirmed that the results reported in the main text were not driven by the unequal number of neurons recorded from in MSTd, 7a, and dlPFC. Namely, we perform 500 iterations of the procedure described above (i.e., stacking of tuning functions, then PCA, then UMAP) while subsampling from 7a and dlPFC to match the number of units present in MSTd. At each iteration, we compute the distance between the MSTd and dlPFC centroid in UMAP space, and between the MSTd and 7a centroid. We then compute the ratio of these distances, which are depicted in Figure 4—figure supplement 1B.

For Figure 4—figure supplement 2, mutual information was computed as the difference of the entropy of spike counts H(Y), and the entropy of the counts conditioned on the stimulus H(Y|S),

(6) I(Y,S)=H(Y)H(Y|S)

To estimate these quantities, we assumed that the counts Y are Poisson distributed with rate parameter λ equal to the mean firing rate of the unit. The stimulus was discretized and its distribution approximated as a binomial, while the conditional distribution of the counts given the stimuli p(Y|S) is given by the P-GAM generative model. Finally, we computed Y as:

(7) H(Y)=yp(y)logp(y),andH(Y|S)=sp(s)yp(y|s)logp(y|s)

Coupling filters

Coupling filters (and the corresponding inferential statistics) were determined via the P-GAM. Within area coupling filters were set to a duration of 36 ms, and across area filters were set to a duration of 600 ms. For the within-area coupling probability reported in Figure 4E, we corrected for the effects of unit distance (i.e., Figure 4—figure supplement 3A) by first fitting a brain region-specific logistic regression. Specifically, we expressed coupling probability as a non-linear function of electrode distance as follows:

(8) pc=1=logit-1fd

with c being a binary variable taking value 1 for significant coupling and 0 otherwise, d being the electrode distance, and f being a non-linear function expressed in terms of B-splines. Each brain area was fit independently, and the coupling probability in Figure 4E was set as the model prediction for a distance of 500 µm.

Data and code availability

Data and code are available at: https://osf.io/d7wtz/. Code and tutorials for utilizing the P-GAM are additionally available at: https://github.com/BalzaniEdoardo/PGAM, (copy archived at swh:1:rev:deaaef66ccff5e667fcfbbc11c3de75dafea5be4; Balzani, 2022a).

Data availability

Data and code are available at: https://osf.io/d7wtz/.

The following data sets were generated
    1. Noel J
    2. Balzani E
    (2022) Open Science Framework
    ID d7wtz. latent variables in sensory, parietal, and frontal cortices.

References

  1. Conference
    1. Balzani E
    2. Lakshminarasimhan K
    3. Angelaki D
    4. Savin C
    (2020b)
    Efficient estimation of neural tuning during naturalistic behavior
    Advances in Neural Information Processing Systems.
  2. Book
    1. Domencich TA
    2. McFadden DL
    (1975)
    Urban Travel Demand-A Behavioral Analysis. A Behavioral Analysis
    Amsterdam: North-Holland.
  3. Book
    1. Ekstrom A
    2. Spiers H
    3. Bohbot V
    4. Rosenbaum R
    (2018)
    Human Spatial Navigation
    Princeton University Press.
  4. Conference
    1. Ester M
    (1996)
    Density-Based spatial clustering of applications with noise (DBSCAN
    Proc. of the Second International Conference on Knowledge Discovery and Data Mining.
    1. Shi J
    2. Malik J
    (2000) Normalized cuts and image segmentation
    IEEE Transactions on Pattern Analysis and Machine Intelligence 22:888–905.
    https://doi.org/10.1109/34.868688
    1. Yoo SBM
    2. Hayden BY
    3. Pearson JM
    (2021) Continuous decisions
    Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences 376:20190664.
    https://doi.org/10.1098/rstb.2019.0664

Decision letter

  1. Kristine Krug
    Reviewing Editor; Otto-von-Guericke University Magdeburg, Germany
  2. Joshua I Gold
    Senior Editor; University of Pennsylvania, United States
  3. Marieke Schölvinck
    Reviewer; Ernst Strungmann Institute for Neuroscience, Germany

Our editorial process produces two outputs: (i) public reviews designed to be posted alongside the preprint for the benefit of readers; (ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.

Decision letter after peer review:

Thank you for submitting your article "Coding of latent variables in sensory, parietal, and frontal cortices during virtual closed-loop navigation" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, one of whom is a member of our Board of Reviewing Editors, and the evaluation has been overseen by Joshua Gold as the Senior Editor. The following individuals involved in review of your submission have agreed to reveal their identity: Marieke Schölvinck (Reviewer #2) and Sujaya Neupane (#3).

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this letter to help you prepare a revised submission.

Essential revisions:

1) Heat maps of preferred angle and distance in Figure 3 must be cross-validated across trials to show data reliability. Maybe the authors have done that and these are cross-validated plots. If so, it is not mentioned in the methods anywhere. Please verify the ordering of preferred angle and distance hold up with cross-validation. One can get a spurious, evenly tiled coding of any continuous variable if one takes a random matrix (say mean FR x variable value), normalizes each row (FR) and sorts the column by peak location (i.e. preferred variable value) for each row.

2) We are a bit confused by the distribution of preferred latent variables (Figure 3). For e.g. travelled distance and distance to target are anti-correlated. Isn't it trivial that the preferred coding would appear bimodal across a neural population if some neurons are coding for one and some for the other? For a travelled distance coding neuron, there is nothing to code at the onset since distance travelled is 0 and vice versa for distance to the target coding neuron at the offset.

Related to #2, it would be helpful to see PSTH examples of single neurons that code for travelled distance and those that code for distance to target. PSTH would be obtained by averaging across trials, binned over a range of trial-lengths (e.g. bin1: short trial length, bin2: medium trial length, bin3: long trial length). We would expect clear differences in firing rate at the onset for distance to target coders and at the offset for travelled distance coders. It is difficult to see this in the presented rasters, although according to P-GAM results, that should be the case.

3) It would be helpful to provide a few examples of LFP traces and their filtered form along with spike times to appreciate the phase modulations apparent in their statistical modelling results (Figure 2F).

4) Tuning strength.

From the manuscript it is difficult to judge how representative the different neuronal populations for each area are and to what extent their selectivity differs. The analysis and it variables are quite complex to follow. It would be helpful for the reader to understand how some of them relate to more traditional measures. It is great that the focus on single neurons allows this comparison.

What does "tuned" in this context mean in terms of strength and selectivity?

How many neurons would pass a minimal response criterion like 10 spikes/s. Would these show stronger tuning or correlations?

Figure 2E: "E. Responses from an example MSTd, 7a, and dlPFC neuron (black), aligned to temporal task variables (e.g., time of movement onset and offset), or binned according to their value in a continuous task variable (e.g., linear velocity)." It would be helpful to give the x-axis for each line for the reader to be able to ascertain what the nature of the scale and the range of variables are over which the firing rate changes are depicted.

Could the authors derive from their data one of the traditional measures used for MST, like a direction tuning index? A direct comparison with previous studies could help understand the nature of the sampled pool, particularly (but not exclusively) for areas when smaller neuronal samples, like MST.

5) Eye movements and visual input.

Another issue is the extent that it seems difficult to distinguish the effect of eye position from that of the background stimulus flow patterns, which of course must differ in direction and element size when animals fixate at different locations on the screen. To what extent was this visual input to neurons correlated with "latent variables" like latent distance and angle to target (latent spatial goal)?

In order to dissociate the contribution of eye position and task from visual input, do the authors have data on a passive viewing control condition, in which the animal fixates and the visual pattern is played back to animals exactly as if in an active one? How do neural responses compare across the three areas?

Could the authors discuss in the paper how the visual input is (or not) included in the model?

6) MSTd and dlPFC coupling.

a) As the animals were head-fixed, eye position would compensate in some cases the animals might have moved its head position (for instance to keep track of the target). Both, MSTd and dlPFC encoded eye position. Could the close coupling of MSTd and dlPFC be linked to this element of the task?

b) The authors claim that areas MSTd and dlPFC form a functional sub-network together, on the basis of similarity in the fractions of neurons tuned to certain variables, and the distribution of the preferred value of some of these variables. However, the fractions of neurons tuned to the latent variables in MSTd and dlPFC (see Figure 2F) are actually quite different. Perhaps the authors could comment on this.

c) When there was stronger MSTd-to-dlPFC coupling and better tracking of the hidden firefly with the eyes (Figure 5B), was the performance of the monkey also better (i.e. more hits)?

7) Sampling of areas.

a) Area 7a was exclusively sampled with chronic rather than moveable probes. It has also the largest number of "single units".

To what extent are these single units independent?

Could a sampling bias in these probes (part of 7a; layers) affect the results, especially when it comes to coupling. Please include in the discussion.

b) The number of recorded neurons in the three areas differs greatly: 231 units in MSTd, 823 units in dlPFC, and 3200 units in area 7a. Yet many conclusions in the paper rely on neuronal numbers: the fractions of neurons tuned to certain sensorimotor and latent variables differ between the areas, the variables explaining the firing rates cluster differently in the neurons of the three areas, and both the coarse LFP connectivity and the fine unit-to-unit coupling within areas differ. Especially the clustering results might depend on the number of recorded neurons: the fact that almost all MSTd and dlPFC neurons are categorized as belonging to the same cluster, whereas the area 7a neurons appear in three distinct clusters, could be caused by the much larger number of recorded neurons in area 7a. Also unit-to-unit coupling is more likely to show up in the data with a much larger number of recorded neurons. The data could be corrected for these differences in number of recorded neurons.

8) Lateralisation.

To what extent played the lateralization of the recording and task a role for neuronal response? This applies relative to brain hemisphere, body and eye position? Where in each monkey did the recordings take place? Which hand(s) did each monkeys use for the choice stick?

How was the lateralisation included in the model?

Please comment with regards to responses in MST, 7A, and dPFC and add information to the manuscript.

Specifically, it is unclear from Suppl Figure 1 whether within a particular monkey, some recording sites were interhemispheric, or whether within one monkey, all recordings were done in the same hemisphere. This of course has significant consequences for the effects of ongoing LFP and unit-to-unit coupling.

9) Data fed into the P-GAM model.

a) The P-GAM model is a great analysis tool for these kinds of data. However, the variables that the authors put into it are conceptually very different from each other. There are purely external task variables such as target onset and offset, latent variables such as distance to target that require knowledge of one's own position in space, and purely internal brain dynamics variables such as coupling to the LFP in another area. In that light, the finding of 'many variables contributing to the responses' is not surprising; all neurons in the brain are probably influenced both by external variables and internal brain dynamics. Maybe the authors could comment on the different nature of their variables and how that impacts their results.

b) Given that the sensorimotor and latent variables going into the G-PAM model are so crucial for the story, could you make a figure where you visualize them? This could maybe be added to Figure 1A. Also, 'radial bias' and 'angular bias' (in Fig1D) could be visualized here.

c) Quantification of electrophysiological activity processing that is fed into the P-GAM model is not entirely clear.

More details about the preprocessing of these data are required, for example, are the SUA baselined using pre-stimulus presentation activity? Are the LFP baselined as well? And how similar are the pooled responses within each area and across? This would allow the reader to spot possible problems when computing further neuronal properties, that could bias the main paper result:

An example is the tuning of the neurons to the phase of ongoing oscillation (Β, Α, Theta). There are a number of papers attempting to optimize methods to measure spike field coherency, e.g. the PPC pairwise phase consistency (Vinck et al., 2010). This method gives an estimation independent of spike count and LFP amplitudes (both parameter vary of course widely across time, tasks, subjects, areas…).

Here, it seems these two parameters are not considered and could lead to artefacts in the coupling results presented. The authors use temporal correlations to approximate coupling between spike/spike, and spike/LFP-phase. Correlation methods can potentially lead to artefacts and overestimations of coupling strength.

In their methods, the author state to 'bin spiking activity across 8ms window' prior to feeding this activity to the P-GAM. It means that 1 spike corresponds to an averaged 8ms time window. If you now try to calculate the dependency of this single spike to a specific phase of a β (30Hz) , the α (12Hz) and theta (4Hz) oscillation, it means that the chance level of assigning the binned spike to a particular phase differs considerably. Therefore, the statistical power of this analysis would decrease for higher frequency. It seems that the authors do not apply any correction.

Reviewer #1 (Recommendations for the authors):

1) From the manuscript it is difficult to judge how representative the different neuronal populations for each area are and to what extent their selectivity differs. The analysis and its variables are quite complex to follow. It would be helpful for the reader to understand how some of them relate to more traditional measures. It is great that the focus on single neurons allows this comparison.

What does "tuned" in this context mean in terms of strength and selectivity?

How many neurons would pass a minimal response criterion like 10 spikes/s. Would these show stronger tuning or correlations?

Figure 2E: "E. Responses from an example MSTd, 7a, and dlPFC neuron (black), aligned to temporal task variables (e.g., time of movement onset and offset), or binned according to their value in a continuous task variable (e.g., linear velocity)." It would be helpful to give the x-axis for each line for the reader to be able to ascertain what the nature of the scale and the range of variables are over which the firing rate changes are depicted.

Could the authors derive from their data one of the traditional measures used for MST, like a direction tuning index? A direct comparison with previous studies could help understand the nature of the sampled pool, particularly (but not exclusively) for areas when smaller neuronal samples, like MST.

2) Another issue is the extent that it seems difficult to distinguish the effect of eye position from that of the background stimulus flow patterns, which of course must differ in direction and element size when animals fixate at different locations on the screen. To what extent was this visual input to neurons correlated with "latent variables" like latent distance and angle to target (latent spatial goal)?

In order to dissociate the contribution of eye position and task from visual input, do the authors have data on a passive viewing control condition, in which the animal fixates and the visual pattern is played back to animals exactly as if in an active one? How do neural responses compare across the three areas?

Could the authors discuss in the paper how the visual input is (or not) included in the model?

3) As the animals were head-fixed, eye position would compensate in some cases the animals might have moved its head position (for instance to keep track of the target). Both, MSTd and dlPFC encoded eye position. Could the close coupling of MSTd and dlPFC be linked to this element of the task?

4) Area 7a was exclusively sampled with chronic rather than moveable probes. It has also the largest number of "single units".

To what extent are these single units independent?

Could a sampling bias in these probes (part of 7a; layers) affect the results, especially when it comes to coupling. Please include in the discussion.

5) To what extent played the lateralization of the recording and task a role for neuronal response? This applies relative to brain hemisphere, body and eye position? Where in each monkey did the recordings take place? Which hand(s) did each monkeys use for the choice stick?

How was the lateralisation included in the model?

Please comment with regards to responses in MST, 7A, and dPFC.

6) Quantification of task parameters are quite clear, this is not entirely the case for electrophysiological activity processing that they feed into their P-GAM model.

More details about the preprocessing of these data are required, for example, are the SUA baselined using pre-stimulus presentation activity? Are the LFP baselined as well? And how similar are the pooled responses within each area and across? This would allow the reader to spot possible problems when computing further neuronal properties, that could bias the main paper result:

An example is the tuning of the neurons to the phase of ongoing oscillation (Β, Α, Theta). There are a number of papers attempting to optimize methods to measure spike field coherency, e.g. the PPC pairwise phase consistency (Vinck et al., 2010). This method gives an estimation independent of spike count and LFP amplitudes (both parameter vary of course widely across time, tasks, subjects, areas…).

Here, it seems these two parameters are not considered and could lead to artefacts in the coupling results presented. The authors use temporal correlations to approximate coupling between spike/spike, and spike/LFP-phase. Correlation methods can potentially lead to artefacts and overestimations of coupling strength.

In their methods, the author state to 'bin spiking activity across 8ms window' prior to feeding this activity to the P-GAM. It means that 1 spike correspond to an averaged 8ms time window. If you now try to calculate the dependency of this single spike to a specific phase of a β (30Hz) , the α (12Hz) and theta (4Hz) oscillation, it means that the chance level of assigning the binned spike to a particular phase differs considerably. Therefore, the statistical power of this analysis would decrease for higher frequency. It seems that the authors do not apply any correction.

Reviewer #2 (Recommendations for the authors):

– I am missing a clear motivation for recording in the three areas that you chose. Could you maybe elaborate on this a bit more in the introduction?

– Given that the sensorimotor and latent variables going into the G-PAM model are so crucial for the story, could you make a figure where you visualize them? This could maybe be added to Figure 1A. Also, 'radial bias' and 'angular bias' (in Fig1D) could be visualized here.

– In Figure 1C, you have added 'slope=bias', whereas technically, it is 'deviation from slope=bias'.

– The legend of Figure 2 is extremely long and contains a lot of information that does not pertain directly to the figure. I suggest that the part '(The direct comparison of the goodness-of-fit….the complexity of their areas and tasks, reaches)' in Fig2D is taken out and added to the text somewhere else.

– It is unclear from Suppl Figure 1 whether within a particular monkey, some recording sites were interhemispheric, or whether within one monkey, all recordings were done in the same hemisphere. This of course has significant consequences for the effects of ongoing LFP and unit-to-unit coupling.

– In Fig2F, you show fractions of neurons tuned to the several variables of the G-PAM model, and in Fig4D, you show proportions of neurons phase-locked to LFP phases in other areas. I might have missed it, but I didn't see any quantification of how strong the tuning was, and how strong the phase-locking.

– When there was stronger MSTd-to-dlPFC coupling and better tracking of the hidden firefly with the eyes (Figure 5B), was the performance of the monkey also better (i.e. more hits)?

– There are a few spelling mistakes throughout the paper (psueudo-R on p.6; tunning on p.7)

Reviewer #3 (Recommendations for the authors):

1. Heat maps of preferred angle and distance in Figure 3 must be cross-validated across trials to show data reliability. Maybe the authors have done that and these are cross-validated plots. If so, it is not mentioned in the methods anywhere. Please verify the ordering of preferred angle and distance hold up with cross-validation. One can get a spurious, evenly tiled coding of any continuous variable if one takes a random matrix (say mean FR x variable value), normalizes each row (FR) and sorts the column by peak location (i.e. preferred variable value) for each row.

2. I am a bit confused by the distribution of preferred latent variables (Figure 3). For e.g. travelled distance and distance to target are anti-correlated. Isn't it trivial that the preferred coding would appear bimodal across a neural population if some neurons are coding for one and some for the other? For a travelled distance coding neuron, there is nothing to code at the onset since distance travelled is 0 and vice versa for distance to the target coding neuron at the offset.

3. Related to #2 above, it would be helpful to see PSTH examples of single neurons that code for travelled distance and those that code for distance to target. PSTH would be obtained by averaging across trials, binned over a range of trial-lengths (e.g. bin1: short trial length, bin2: medium trial length, bin3: long trial length). I would expect clear differences in firing rate at the onset for distance to target coders and at the offset for travelled distance coders. It is difficult to see this in the presented rasters, although according to P-GAM results, that should be the case.

4. It would be helpful to provide a few examples of LFP traces and their filtered form along with spike times to appreciate the phase modulations apparent in their statistical modelling results (Figure 2F).

https://doi.org/10.7554/eLife.80280.sa1

Author response

Essential revisions:

1) Heat maps of preferred angle and distance in Figure 3 must be cross-validated across trials to show data reliability. Maybe the authors have done that and these are cross-validated plots. If so, it is not mentioned in the methods anywhere. Please verify the ordering of preferred angle and distance hold up with cross-validation. One can get a spurious, evenly tiled coding of any continuous variable if one takes a random matrix (say mean FR x variable value), normalizes each row (FR) and sorts the column by peak location (i.e. preferred variable value) for each row.

We thank the reviewer for this suggestion. We had previously not cross-validated those plots, but have now. We fit the P-GAM to half the dataset (either “odd” or “even” trials). Then, for the figure referenced by the reviewer, we find the distance and angle that most strongly drives a particular neuron (given that the neuron is significantly tuned to the variable), as indicated by their tuning function (i.e., peak of the tuning function). Neurons were then sorted based on their preferred angle/distance in “even” trials, and the tuning functions generated by the “odd” trials are plotted as a heatmap. As shown in Figure 3, the results originally reported hold after cross-validation.

We have amended the figure caption and the methods section to include this information.

Figure caption:

“Heatmaps showing neural responses (y-axis) sorted by preferred angles from origin (top), angle to target (2nd row), distance from origin (3rd row), and distance to target (bottom row) for MSTd (green), 7a (blue) and dlPFC (red) in monkey S (data simultaneously recorded). Darker color indicates higher normalized firing rate. Neurons were sorted based on their preferred distances/angles in even trials and their responses during odd trials is shown (i.e., sorting is cross-validated, see Methods).”

Methods:

“To show the stability in the estimated tuning functions, Figure 2 – supplement 10 shows the fraction of units tuned to a given task-variable as a function of brain area, and as a function of whether odd or even trials were fit to the P-GAM. Namely, we fit half of the dataset each time and show that the fraction of neurons tuned to a given task variable was the same regardless of whether we fit the odd numbered trials, or the even numbered trials. Similarly, we index the “preferred” distances and angles from origin and to target (Figure 3) as defined by the peak of tuning functions. In Figure 3B we sort neurons according to their preferred distances or angles in one subset of trials (i.e., “even” trials) and plot the normalized responses in the other subset of trials (“odd” trials). Figure 3B, therefore demonstrates that not only the fraction of neurons tuned to different variables was stable, but the estimated tuning functions were as well.”

2) We are a bit confused by the distribution of preferred latent variables (Figure 3). For e.g. travelled distance and distance to target are anti-correlated. Isn't it trivial that the preferred coding would appear bimodal across a neural population if some neurons are coding for one and some for the other? For a travelled distance coding neuron, there is nothing to code at the onset since distance travelled is 0 and vice versa for distance to the target coding neuron at the offset.

Related to #2, it would be helpful to see PSTH examples of single neurons that code for travelled distance and those that code for distance to target. PSTH would be obtained by averaging across trials, binned over a range of trial-lengths (e.g. bin1: short trial length, bin2: medium trial length, bin3: long trial length). We would expect clear differences in firing rate at the onset for distance to target coders and at the offset for travelled distance coders. It is difficult to see this in the presented rasters, although according to P-GAM results, that should be the case.

We apologize for the confusion and recognize this deserved a more detailed explanation. The distance from origin and to target are not correlated (in fact, if they were fully correlated the P-GAM would be under-specified). Within a trial, these distances could be correlated (e.g., if the target were straight ahead and the animal traveled in a perfectly straight line), but need not. For instance, if the target were at 200 cm straight ahead, and the animal travelled in a perfect line 100 cm. Now the animal is 100cm from the origin and from the target. But if the animal now overshoots the target by 100cm, they are now 300cm from the origin, and still 100 cm from the target. A similar (more realistic) example happens when the animal does not take an optimal path, but increases their distance from origin without decreasing their distance from the target (e.g., think of a target that is 200cm from the origin / the animal at trial onset, the animal could navigate to form a perfect isosceles triangle, where they are now 200cm from target and origin). More importantly, on different trials the target appears at a random distance between 100cm and 400cm. Thus, if the animal travelled 100cm (imagine in a perfect line to the target), on different trials they could be 0cm from the target (first case), or 300cm from the target (second case). Thus, particularly across trials, there is absolutely no correlation between distance travelled and distance to target, and hence why we can distinguish between these.

We agree with the reviewers that illustrating this point and providing PSTHs (and even more strikingly, rasters) would go a long way to clarifying this issue. We did not perform the exact analysis suggested by the reviewers (because it is not necessarily the case that firing rates vary monotonically with distance, as their suggestion implies), but have added raster plots for example neurons tuned to either the distance from origin or to target. We also show an example neuron that responds to movement stop. This latter neuron, thus, shows a pattern similar to that of Example 2 when sorted from starting location (c.f., Figure 3A third and fifth panel), but critically, when aligned to target location it is evident there is no relation between its firing pattern and the distance to target. Hence, we can differentiate between the distance from origin, to target, and simply responding to moments of starting or stopping movements.

Further, we have modified the main text:

“Beyond the frequency with which we observe neurons tuned to the angle and distance from the origin (i.e., path integration) and to the target (i.e., vector coding of spatial goals), we may expect the distributions of preferred distances and angles to also be informative. Of note, distance/angle from origin and to the target are not the reciprocal of one another given that the target location varies on a trial-by-trial fashion. In other words, the travelled distance and the distance to target may correlate within a trial (but need not, given under- vs. overshooting) but certainly do not across trials (e.g., a distance of, say, 100cm from the origin could corresponding to a whole host of distances from target). In Figure 3A we show rasters of representative neurons tuned to the distance from origin (example neuron 1) and to target (example neuron 2). The neuron tuned to the distance to target (example neurons 2) is not tuned to a particular distance from origin, but does demonstrate a patterned firing rate, discharging at further distances as the animal travels further. The third example (Figure 3A) is tuned to movement stopping, and demonstrates a pattern similar to the neuron tuned to distance to target when plotted as a function of distance from origin (Figure 3A, 3rd vs. 5th panel), but not when visualized as a function of distance to target.

And the figure caption:

“A. Rasters and average firing rate of three example neurons, sorted by their maximal distance from origin and to target. The first example neuron (left) responds at a distance of ~100cm from origin and is not modulated by distance to target. The second example (middle) responds to a close distance to target (~30cm). Arrows at the top of these rasters indicate the preferred distance from origin (example 1) and to target (example 2). We include a third example (tuned to movement stop) as a control, demonstrating that responding to a distance near the target and to stopping behavior are distinguishable.”

3) It would be helpful to provide a few examples of LFP traces and their filtered form along with spike times to appreciate the phase modulations apparent in their statistical modelling results (Figure 2F).

We agree with the reviewers. We have added a new supplement figure (Figure 2 —figure supplement 8), where we show a few example LFP traces, their band-passed version, and the phases. We also show the spikes and how they align to the ongoing phase. Finally, we have also added histograms showing the phase at which a few example neurons fired throughout the course of a recording.

The accompanying figure captions is:

Figure 2 – supplement figure 8. Illustration of spike-LFP phase locking. A. Example trials. For each of four example trials (different sessions as well) we show the raw LFP (top), as well as the band-passed version (transparent) and extracted phase (opaque) in theta (green, second row), α (orange, third row), and β (blue, fourth row) ranges. Spikes are represented by dots, and they are placed on the y-axis according to the phase of the ongoing LFP. That is, across rows spikes occur at the same time along the x-axis, but are at different y-locations. If a neuron is phased-locked to LPF in a given range, spikes should predominantly occur at the same y location (as is seen in these examples for the Β band). B. Example sessions. For 6 example neurons, we show the distribution of phases (x-axis, in radians) at which spikes occurred, throughout the entire session. We show 2 example neurons that were not modulated by LFP phase (1st and 2nd column, uniformly distributed), 2 example neurons that were modulated solely by Β frequency phases (3rd and 4th column), and finally 2 example neurons that were modulated by phases at Theta, Α, and Β frequencies.

4) Tuning strength.

From the manuscript it is difficult to judge how representative the different neuronal populations for each area are and to what extent their selectivity differs. The analysis and it variables are quite complex to follow. It would be helpful for the reader to understand how some of them relate to more traditional measures. It is great that the focus on single neurons allows this comparison.

What does "tuned" in this context mean in terms of strength and selectivity?

How many neurons would pass a minimal response criterion like 10 spikes/s. Would these show stronger tuning or correlations?

We thank the reviewers for this question and consider that in essence they are asking a question about effect sizes. We have added 4 supplementary figures addressing this question (explained below), but must first add the caveat that when examining “raw” firing rates (e.g., response criterion of 10 spikes/s) in a naturalistic task where variables are continuous, dynamic, and correlated, it is not entirely clear what drives the response. This is exactly why we must use a method such as the P-GAM, attempting to factorize variance (i.e., perform credit assignment).

The 4 supplement figures we add (Figure 2 —figure supplement 3, 4, 5, and 6) follow the same format and provide 2 new analyses regarding effect sizes. First, for each sensorimotor (supplement 3), latent (supplement 4), LFP phase (supplement 5) or “other” (supplement 6) variable, we find the neurons that are and are not significantly tuned to that variable. Then, we compute a selectivity index, by computing the difference between the maximum and minimum response of a tuning function (in firing rate space). This gives a more traditional “evoked response”. Then for each population of neurons (significant vs. not) we log transform their evoked response (to render the population Gaussian) and compute Cohen’s d (the distance between means of the distributions normalized by their variance). As it can be observed in only Figure 2 —figure supplement 3, the effect sizes are considerable.

The percentage of neurons showing an increase in firing rate above 10spikes/s for linear velocity, linear acceleration, angular velocity, angular acceleration, the timing of movement onset, offset, and the timing of target presentation are respectively,

For significant units: 10.7%, 34.4%,10.7%, 19.8%, 21.4%, 21.0%, and 3.1%. Non-significant units: 1.5%, 6.0%, 1.1%, 2.3%, 2.2%, 0.5%, and 2.0%.

The second analysis performed (second column) is similar to the first, but contrasting mutual information (a measure of correlation) between evoked responses in the firing rate space and task variables. Findings are conceptually the same as for the first approach.

Associated caption:

“Figure 2 – supplement figure 3. Effect sizes in the firing rate space for neurons deemed to code for sensorimotor variables. Rows are the different sensorimotor variables, in the same order as in Figure 2E and F. The left column is the difference in evoked firing rate between the population of neurons deemed to significantly code for, or not, a given task variable. Namely, for each neuron we compute the difference in firing rate between the peak and the trough of its tuning function. The populations of significant and non-significant neurons are then log-transformed (to render normally distributed) and Cohen’s d is computed (indicated as the title of each subplot). Right column, as for the left column, but while contrasting the mutual information present between firing rates and the given task variable. For reference, Cohen’s d < 0.2 are typically considered weak effects, ~0.5 are considered moderate, and > ~0.8 are considered strong effects.”

Figure 2E: "E. Responses from an example MSTd, 7a, and dlPFC neuron (black), aligned to temporal task variables (e.g., time of movement onset and offset), or binned according to their value in a continuous task variable (e.g., linear velocity)." It would be helpful to give the x-axis for each line for the reader to be able to ascertain what the nature of the scale and the range of variables are over which the firing rate changes are depicted.

We thank the reviewer for this suggestion. An x-axis has been added. The figure is as follows:

Could the authors derive from their data one of the traditional measures used for MST, like a direction tuning index? A direct comparison with previous studies could help understand the nature of the sampled pool, particularly (but not exclusively) for areas when smaller neuronal samples, like MST.

We thank the reviewers for this suggestion. The traditional measure our group (e.g., Chen et al., 2008 in MSTd, and Avila et al., 2019 in 7a) has employed is the “discrimination index”, defined as,DDI=RmaxRminRmaxRmin+2SSENM Where Rmax and Rmin are the maximum and minimum response from a tuning function, SSE is the sum of squared errors around the mean responses, M is the number of stimulus directions, and N is the total number of observations. In the context of the current naturalistic experiment, we may bin linear and angular velocities and compute tuning functions defining Rmax and Rmin. The number of bins (here we used 15) defines M. To estimate SSE and N we must define trials wherein the full gamut of linear and angular velocities are experienced. To facilitate direct comparison with Chen et al., 2008, we divided our recordings in 80 segments, the mean number of trials (N) in Chen et al., 2008. Lastly, we computed DDI.

Importantly, we plot the mean estimates for MSTd (green), area 7a (blue), and dlPFC (red) from the current dataset in filled triangles, as well as the mean estimates for 7a (from Avila et al., 2019) and MSTd (from Chen et al., 2008) from “traditional experiments”. In general, there is good agreement between studies with MSTd, 7a, and finally dlPFC (in that order) showing the strongest discrimination for optic flow. We must note, however, that there are of course also coarse-grain differences, particularly for MSTd. This is expected given that in contrast to the P-GAM estimate, simply binning data and averaging within these bins does not factorize for the contribution of other experimental factors.

We have included this figure as Figure 2 —figure supplement 7. We have amended the text in the following manners:

Results:

“Similarly, to provide a point of comparison with prior work studying optic flow processing, in Figure 2 —figure supplement 7 we quantify the speed (i.e., liner velocity) and direction (i.e., angular velocity) discrimination index (see Methods and e.g., Chen et al., 2008; Avila et al., 2019) for neurons in MSTd, 7a, and dlPFC.”

Methods:

“To allow for comparison with prior reports studying optic flow processing within the cadre of two-alternative forced-choice tasks, we compute the discrimination index for speed (i.e., linear velocity) and direction (i.e., angular velocity) in MSTd, 7a, and dlPFC. The discrimination index (DDI) was defined as,DDI=RmaxRminRmaxRmin+2SSENM Where Rmax and Rmin are the maximum and minimum response from a tuning function, SSE is the sum of squared errors around the mean responses, M is the number of stimulus directions, and N is the total number of observations. In the context of the current naturalistic experiment, we may bin linear and angular velocities and compute tuning functions defining Rmax and Rmin. The number of bins (here we used 15 nodes, as defined by the P-GAM) defines M. To estimate SSE and N we must define trials wherein the full gamut of linear and angular velocities are experienced. To facilitate direct comparison with Chen et al., 2008, we divided our recordings in

80 segments, the mean number of trials (N) in Chen et al., 2008. Lastly, we computed DDI according to Equation 5.”

Figure caption:

“Figure 2 – supplement figure 7. Speed and direction discrimination index for neurons in MSTd, 7a, and dlPFC. To allow for direct comparison with prior studies, we compute the discrimination index (see Methods) for speed (i.e., linear velocity) and direction (i.e., angular velocity) in MSTd (green), 7a (blue), and dlPFC (red). Full triangles at the top indicate the mean of each population recorded from here (in their corresponding color), while the empty blue triangles show the mean in Avila et al., 2019 (7a recording) and the empty green triangles show the mean in Chen et al., 2008 (MSTd recordings).”

5) Eye movements and visual input.

Another issue is the extent that it seems difficult to distinguish the effect of eye position from that of the background stimulus flow patterns, which of course must differ in direction and element size when animals fixate at different locations on the screen. To what extent was this visual input to neurons correlated with "latent variables" like latent distance and angle to target (latent spatial goal)?

In order to dissociate the contribution of eye position and task from visual input, do the authors have data on a passive viewing control condition, in which the animal fixates and the visual pattern is played back to animals exactly as if in an active one? How do neural responses compare across the three areas?

These are excellent questions; we address them both together.

In principle there is no correlation between the visual input the animals receive and the latent variables. As it can be observed in Figure 1B (bottom) the animals most often initially move forward and rotate to face the target straight-ahead. Then, after approximately the first 500-1000ms of the trial, they move forward at a constant speed (maximal linear velocity but little to no angular component). Thus, while distance to target is continuously changing (decreasing if the animal is performing the task well), their linear velocity (driving much of the visual stimulation) is held constant.

Nonetheless, the reviewer is correct that the manuscript would be substantially bolstered by providing empirical evidence for the fact that the neural properties we report are in fact related to the animal’s computing taskrelevant metrics and not purely sensory. To address this, we perform exactly what the reviewer suggested. We recorded 2 sessions from area 7a (117 neurons) while the animal first engaged in the task, and then passively viewed the same stimuli replayed (unfortunately recordings in MSTd and dlPFC were not possible at this time). Figure 2 —figure supplement 11 (panel A) shows 7 example trials, showing that stimuli (linear velocity shown, but applies to all task variables) were matched between active and passive conditions. Panel A also shows single trial evoked responses (mean for all simultaneously recorded neurons in the first session, 56 neurons). As it can be observed, there are (single trial) evoked responses when the animal is activity performing the task, but not when passively viewing stimuli. Panel B shows that the mean firing rate (across the entire recording) was unchanged between the active version of the task and passive viewing (dashed line is the diagonal and blue circle is the mean across all neurons). Lastly, Panel C shows summary statistics: the fraction of neurons tuned to different task variables and the fraction of neurons coupled, in active (blue) and passive (black) conditions. All variables except for the phase alignment with LFP bands was significantly blunted in the passive viewing condition. Interestingly, still ~28% of neurons showed tuning to the sensory variables (“latent” and “other” variables average in the passive viewing conditions = 14.8%). Coupling filters were also less prominent in the passive viewing condition (mean ± sem: active = 20.62% ± 0.0047; passive = 15.30% ± 0.0041 ; p = 4.03 x 10-18).

The text has been amended in the following manner:

Results:

“In the supplement we demonstrate that this coding was stable (contrasting odd vs. even trials; Figure 2 —figure supplement 10) and task-relevant (Figure 2 —figure supplement 11), in that passive viewing of the same stimuli did not elicit a comparable fraction of neurons tuned to task variables in 7a (passive viewing data in MSTd and dlPFC were unavailable). The fraction of neurons aligned with the phase of LFP in different frequency bands remained stable across passive and active viewing conditions, particularly in the Β band (all frequencies, active vs. passive, p = 0.13; Β band, p = 0.51). Altogether, the encoding pattern across areas may suggest that while dlPFC is critically involved in estimating the relative distance between self and target, 7a may be preferentially involved in the process of path integration, while somewhat unexpectedly, MSTd may play an important role in keeping track of the latent spatial goal.”

And:

“Finer grain unit-to-unit coupling was sparse, and within each area the probability of two units being functionally connected decreased as a function of the distance between neurons (Figure 4 —figure supplement 3A). The overall likelihood of two units being coupled within a given area changed as a function of brain area and was modulated by task engagement (active vs. passive viewing in area 7a; Figure 2 —figure supplement 11C), but not as a function of probe type used (Utah array or linear probe, see Figure 4 —figure supplement 4).”

Figure caption:

“Figure 2 – supplement figure 11. Task engagement drives neural tuning. A. Example trials. To demonstrate that the fraction of neurons tuned to different task variables reported in Figure 2F are driven by task engagement and not purely low-level visual input, in a control experiment (2 sessions) we recorded from area 7a (117 neurons) as a monkey first actively engaged in the task (top), and then passively viewed replayed the exact visual input (bottom). We show 7 example trials, demonstrating that the linear velocity during active and passive trials matched (same for other task variables, not shown). Instead, the population evoked responses (1 trial, average across the entire population of simultaneously recorded neurons) was evident during active but not passive trials. B. Average firing rate. Firing rate (averaged over the entire recording) did not differ between active (x-axis) and passive (y-axis) viewing (blue dots are single cells in 7a, black dot is the mean, dashed black line is identity). C. Fraction of neurons tuned and coupled. The fraction of neurons tuned to different taskvariables, and the fraction of neurons coupled to each other in area 7a, were blunted (but not entirely absent) during passive viewing. The exceptions were variables related to internal neural dynamics, notably the phase locking of spiking activity to LFP phase in Theta, Α, and Β band.”

Further, in the discussion we had previously stated:

“To the best of our knowledge, the striking difference between the posterior parietal node and other areas (here MSTd and dlPFC) vis-à-vis their dependency on the LFP phase has not been previously reported and may have been acutely evident here given the natural timing between sensory inputs, motor outputs, and ongoing neural dynamics that exists within this closed-loop setting”.

This sentence has been removed given the new analysis here demonstrating that even during passive viewing, there was a stronger spike-LFP phase locking in 7a (particularly in theta and α bands) than dlPFC.

Could the authors discuss in the paper how the visual input is (or not) included in the model?

We thank the reviewers for this question. The visual input (in particular the task-relevant feature, being optic flow) was determined by (1) the linear and angular velocity of the animals, and (2) their eye movements. Both of these were included in the P-GAM. However, it is true that the characteristics of each pixel, and how these pixels varied over time, was not included in the model. We consider both of these approaches (ours: in a sense assuming an abstraction of self-velocity; vs. fitting the visual input itself: more a question of representation) to be interesting, yet different. Even complementary. We have modified the discussion in the following manner:

“Performing a “firefly task” in a real environment would also suppose a more complex set of visual inputs (e.g., corners, textures, shadows) that could be leveraged in an expanded P-GAM taking visual features as input (see Parker et al., 2022, for recent work taking this approach).”

6) MSTd and dlPFC coupling.

a) As the animals were head-fixed, eye position would compensate in some cases the animals might have moved its head position (for instance to keep track of the target). Both, MSTd and dlPFC encoded eye position. Could the close coupling of MSTd and dlPFC be linked to this element of the task?

We believe that the functional subnetwork established between MSTd and dlPFC reflects the fact that the animals are moving their eyes to keep track of the hidden latent variable. This is shown in Figure 5, demonstrating that the more MSTd and dlPFC were coupled, the more the eyes moved as to keep track of the firefly. In turn, it may be (though this is an empirical question) that if the animals employed another strategy (e.g., moving their heads instead, if freely moving), there would be no (or a reduced) coupling between MSTd and dlPFC. We have amended the discussion in the following manner:

“For instance, virtual and real-world navigation may rely on partially distinct neural codes (Aghajan et al., 2015). Thus, it will be interesting to replicate the current experiment while macaques move freely in a 3D environment (e.g., Mao et al., 2021). This would also allow for independent eye- and head-movements (head was restrained here) and thus we could estimate whether eye movements in the current experiment partially reflected intended head movements (as they seemingly do in rodents; Michaiel et al., 2020, Meyer et al., 2020).”

b) The authors claim that areas MSTd and dlPFC form a functional sub-network together, on the basis of similarity in the fractions of neurons tuned to certain variables, and the distribution of the preferred value of some of these variables. However, the fractions of neurons tuned to the latent variables in MSTd and dlPFC (see Figure 2F) are actually quite different. Perhaps the authors could comment on this.

The reviewer is correct that the fraction of neurons tuned to different variables is not identical in MSTd and dlPFC. However, these areas were in general less tuned to velocity, acceleration, and the timing of different sensorimotor variables than 7a was. MSTd and dlPFC were often tuned to eye position, and these areas were more tuned to the distance to target than 7a. Thus, overall, there were strong similarities between MSTd and dlPFC that did not exist with 7a. Most importantly, we do not make the claim that these areas form a functional subnetwork only because their tuning properties are similar, but because there was a greater likelihood of seeing unit-to-unit coupling between these areas, than between these areas and 7a. We are confident in these results, and in fact initial results regarding population dynamics also supports this claim. Namely, we have seen that the communication subspace (defined as in Semedo et al., 2019, Neuron) is largest between MSTd and dlPFC than between these areas and 7a. And the variable that is most readily decodable from the communication manifold between MSTd and dlPFC is related to eye movements (c.f. Balzani et al., 2022, Arxiv), as would be suggested by the correlation between coupling likelihood and the monkey’s eyes tracking the hidden target.

We consider that it would not be appropriate to include these population-level findings to the current manuscript, as they rely on sophisticated methodology and address a slightly different question: single units vs. population dynamics. Nevertheless, we amend the discussion to explicitly acknowledge that the fraction of units tuned to different variables is not identical in MSTd and dlPFC. The text has been modified in the following manner:

“Similarly, to further corroborate the functional subnetwork between MST and dlPFC it will be interesting to examine population dynamics and the possibility that these areas form a functional “communication subspace” (Semedo et al., 2019), adapted to the naturalistic setting of this task (see Balzani et al., 2022). “

c) When there was stronger MSTd-to-dlPFC coupling and better tracking of the hidden firefly with the eyes (Figure 5B), was the performance of the monkey also better (i.e. more hits)?

We thank the reviewers for this question. Indeed, the better the tracking of the hidden firefly, the better is performance. This is true both when quantified as bias (as we had reported in the original paper and as is reported in Lakshminarasimhan et al., 2020), and when quantified as hit rate (added to the manuscript following this question). Coupling probability (MSTs-to-dlPFC, dlPFC-to-MSTd, and dlPFC-to-dlPFC) does correlate with better gazing toward the firefly. However, this coupling is not predictive of bias (as originally reported in Figure 5 —figure supplement 1), nor of hit rate (as reported now following this question). Together, it appears that (1) coupling within dlPFC or across MSTd and dlPFC correlates with (2) better gazing toward the invisible firefly, and this latter one correlates with (3) better task-performance, but the association between (1) and (3) is not (directly, without (2)) true.

The text has been modified to include the correlations (or lack thereof) with hit rate:

“Further, this relationship also held across sessions, with better target tracking correlating with less bias (slopes closer to 1, see Figure 1C), particularly in the angular domain (r2 = 0.43, p = 0.004; radial: r2 = 0.26, p = 0.04; Figure 5A), and with an increasing proportion of rewarded trials (r2 = 0.24, p = 0.042).”

And:

“As shown in Figure 5 —figure supplement 1, there was no correlation between the unit-to-unit couplings within a session, and either radial or angular biases (7a-7a coupling v. angular bias p = 0.06; all other p > 0.12). There was similarly no correlation between the proportion of rewarded trials in a session and unit-to-unit coupling probability (all p > 0.11, Bonferroni corrected). Overall, therefore, the functional subnetwork between MSTd and dlPFC (Figure 4) seemingly reflects the animals’ strategy in keeping track of the hidden target with their eyes. In turn, the eye movements (but not MSTd-dlPFC coupling directly) aid in successfully navigating to the location of the hidden target.”

7) Sampling of areas.

a) Area 7a was exclusively sampled with chronic rather than moveable probes. It has also the largest number of "single units".

To what extent are these single units independent?

Could a sampling bias in these probes (part of 7a; layers) affect the results, especially when it comes to coupling. Please include in the discussion.

This is an excellent question. We address it by examining the existent recording from Monkey M (dlPFC recordings with linear probes), and by incorporating in the supplementary materials three new recordings (from a new monkey, Monkey B) in 7a, all with linear probes. We fit these sessions to the P-GAM and compute the fraction of neurons coupled within area. Importantly, we only perform this analysis for neurons that were exactly 400 micrometers apart, the minimum distance in Utah probe recordings. Then, we take the array recordings, and compute a new distribution. Namely, we perform 500 iterations in which we first subsample neurons from a randomly selected session to match the number of simultaneously recorded neurons with the linear probes. Then, given that these neurons were at the distance of 400 micrometers, we compute the fraction of neurons coupled. That is, we match the distance (400 micrometers) and number of neurons between array and linear probe recordings. Figure 4 —figure supplement 4 shows, most importantly, that there was no significant effect of probe: the mean of the probe data (dashed black line) fell within the distribution of the array data. Interestingly, it also corroborates Figure 4E, showing that coupling among 7a was more common than among dlPFC.

Figure caption:

“Figure 4 – supplement figure 4. Fraction of neurons coupled in 7a and dlPFC as a function of probe (Utah array or linear probe). We questioned whether the type of probed utilized during recording had an impact on the fraction of units the P-GAM estimated as coupled. To address this question, we examined a new set of recordings in 7a (5 sessions, 32 neurons in monkey B, not reported in the main text), as well as the recordings in dlPFC in monkey M (55 neurons), both of which were conducted with linear probes. For each area, we computed the fraction of neurons coupled during linear probe recordings, given that the distance between these neurons was 400 micrometers, the minimal distance between electrodes in the Utah array recordings. Then, we performed 500 iterations where we randomly subsampled from simultaneously recorded neurons in Utah arrays to match the number of simultaneously recorded neurons with the linear probe. We computed the fraction of neurons coupled within this subsample, again only for neurons at a distance of 400 micrometers. We plot the iterations as a histogram (blue for 7a and red for dlPFC), demonstrating that the fraction of neurons tuned did not significantly depend on type of probe used (7a, mean of linear probe = 0.17, 95%CI of array = [0.10 0.49]; dlPFC, mean of linear probe = 0.10, 95%CI of array = [0.03 0.22]).“

Results:

“Finer grain unit-to-unit coupling was sparse, and within each area the probability of two units being functionally connected decreased as a function of the distance between neurons (Figure 4 —figure supplement 3A). The overall likelihood of two units being coupled within a given area changed as a function of brain area and was modulated by task engagement (active vs. passive viewing in area 7a; Figure 2 —figure supplement 11C), but not as a function of probe type used (Utah array or linear probe, see Figure 4 —figure supplement 4).”

Lastly, we also amend the discussion to make reference to the issues of sampling from a single layer in 7a and the potential of non-independence of single-units (across sessions) during Utah array recordings. The text has been modified as follows:

“The second limitation relates to the (necessarily limited) sampling of neural areas, and the focus on single units as opposed to population dynamics. We report a functional subnetwork between MSTd and dlPFC based on the similarity of their encoding profiles (though they are of course not identical) and the likelihood of encountering unit-to-unit couplings across these areas. But this functional connection must be subserved by structure (e.g., perhaps a third area we did not record from fluctuating with both MSTd and dlPFC). Thus, in ongoing experiments we have trained rodents to perform the “firefly task”. This will allow recording from a wider array of neural areas and cortical layers (most of the recordings reported here being from Utah arrays and hence likely from a single layer and of limited independence).”

b) The number of recorded neurons in the three areas differs greatly: 231 units in MSTd, 823 units in dlPFC, and 3200 units in area 7a. Yet many conclusions in the paper rely on neuronal numbers: the fractions of neurons tuned to certain sensorimotor and latent variables differ between the areas, the variables explaining the firing rates cluster differently in the neurons of the three areas, and both the coarse LFP connectivity and the fine unit-to-unit coupling within areas differ. Especially the clustering results might depend on the number of recorded neurons: the fact that almost all MSTd and dlPFC neurons are categorized as belonging to the same cluster, whereas the area 7a neurons appear in three distinct clusters, could be caused by the much larger number of recorded neurons in area 7a. Also unit-to-unit coupling is more likely to show up in the data with a much larger number of recorded neurons. The data could be corrected for these differences in number of recorded neurons.

We thank the reviewers for this question. Respectfully, we disagree that the fraction of neurons tuned to a particular variable, or the fraction of neurons coupled with others, depends on the number of neurons (these are proportions and thus corrected for the number of neurons). On the hand, we do agree that clustering analyses could be impacted by the relative number of units per area included in the analysis. Thus, we have performed these analyses again, while subsampling from 7a and dlPFC to match the number of neurons present in MSTd.

For the clustering based on whether or not a neuron was tuned to a particular variable (Figure 4A and B), we performed 10k iterations. As highlighted by the reviewers, the striking finding originally reported was that most neurons in MSTd and dlPFC were mixed selective and belonged to a single cluster, while neurons from 7a belonged to a wider variety of clusters. Thus, the metric we compute here is the size of the largest cluster relative to the total number of units. As shown in Figure 4 —figure supplement 1 (panel A, means and 95%CI shown) even when matching the number of units across areas, ~70% of units in MSTd and dlPFC come from the most populous cluster, while only ~35% do so in 7a. This results thus confirm the original finding.

Regarding the clustering based on tuning function shapes and UMAP projections (Figure 4C), the key metric originally reported was that the centroid of the MSTd UMAP projection was 6.49 times closer to the centroid of dlPFC than to the centroid of 7a. We computed this metric again, while performing 100 iterations. Figure 4 —figure supplement 1 (Panel B) we depict the ratio of MSTd-to-dlPFC UMAP distance to MSTd-to-7a distance, for all iterations. As it can be observed, the finding holds, with the boundaries of the 95%CI being 4.07 and 7.33.

The results and figure caption have been amended:

Results:

“Other cluster types existed, for instance composed of neurons selectively tuned to the ongoing phase in LFP bands but no other task variable (Figure 4A and B, Cluster 4), or driven also by motor onset and offset (Figure 4A and B, Cluster 5). These remaining clusters were, however, less common (~1-5%). This analysis was conducted with the full dataset (4254 neurons in total), yet in the supplement (Figure 4 —figure supplement 1A) we confirm that the results are unchanged when subsampling from areas with more neurons (7a and dlPFC) to match the number present in MSTd (231 neurons). Together, this pattern of clustering results based on whether neurons were tuned to different task variables demonstrated a surprising commonality between MSTd and dlPFC, which are in turn different from area 7a.”

And:

“Notably, however, the centroid of MSTd was 6.49 times closer to the centroid of dlPFC than area 7a (Figure 4C, top row. Note that Becht et al., 2018, have shown UMAP to conserve global structure and thus allows for a meaningful interpretation of distances). This finding also holds when subsampling from 7a and dlPFC to match the number of units present in MSTd (100 iterations, MSTd-dlPFC distance was 5.56 times closer than MSTd7a, 95%CI = [4.07, 7.33]; Figure 4 —figure supplement 1B).”

Figure caption:

“Figure 4 – supplement figure 1. Clustering results, subsampling from neurons in dlPFC and 7a to match the number of units recorded from in MSTd. A. We performed spectral (Jaccard) clustering of neurons based on their 1 x 17 vector of Booleans, indicating whether they were tuned or not to particular task-variables (Figure 4A). Here, we perform this operation 10000 times, while randomly selecting (without replacement) 231 neurons from 7a and dlPFC. Thus, the full matrix clustered was 693 (231 x 3 brain areas) x 17 (task variables). Given that on each run clusters are assigned an arbitrary cluster number, for each run we compute the ratio of the largest cluster size to the total number of units per area (231). Namely, in the main text we report that MSTd and dlPFC are predominantly represented by 1 mixed-selective cluster, while 7a is represented in 3 approximately equal sized clusters. The results here concord with those in the main text, demonstrating that approximately 70% of neurons in MSTd and dlPFC belong to a single cluster, while approximately 35% of neurons belong to the largest cluster in area 7a. Circles are the mean across 10000 iteration, error bars are 95%CI. B. 100 iterations of UMAP while randomly subsampling from 7a and dlPFC to match the number of units in MSTd. On each run, we compute the distance in UMAP space between MSTd and dlPFC, and between MSTd and 7a. Then we compute their ratio (MSTd-to-dlPFC/MSTd-to-7a, thus >1 indicating closer MSTd-to-dlPFC distances). The figure shows the full distribution of ratios, concurring with the main text that MSTd and dlPFC are approximately 6 times closer in UMAP space, than MSTd and 7a are. “

8) Lateralisation.

To what extent played the lateralization of the recording and task a role for neuronal response? This applies relative to brain hemisphere, body and eye position? Where in each monkey did the recordings take place? Which hand(s) did each monkeys use for the choice stick?

How was the lateralisation included in the model?

Please comment with regards to responses in MST, 7A, and dPFC and add information to the manuscript.

Specifically, it is unclear from Suppl Figure 1 whether within a particular monkey, some recording sites were interhemispheric, or whether within one monkey, all recordings were done in the same hemisphere. This of course has significant consequences for the effects of ongoing LFP and unit-to-unit coupling.

We thank the reviewers for highlighting that Figure 1 —figure supplement 1 did not provide enough information as to lateralization. We have significantly expanded this figure caption to add the information requested by the reviewers. The figure caption reads:

“Location of acute recordings are indicated by spheres, color coded per animal (Monkey S, orange; Monkey Q, purple; Monkey M in green). Location of Utah arrays are indicated by squares. The pictures of the Utah arrays are framed in the color corresponding to the monkey. In turn, Monkey S had recordings performed from Utah arrays in dlPFC and area 7a on the left hemisphere, and from a linear probe in MSTd on the right hemisphere. Monkey Q had recordings performed from a Utah array in 7a on the left hemisphere and from a linear probe in MSTd on the right hemisphere. Monkey M had recordings performed from a linear probe in dlPFC on the right hemisphere. In total, therefore, each area was sampled twice. All 7a recordings were on the left hemisphere and all MSTd recordings were on the right hemisphere (note, given that MSTd is directly ventral to 7a, see Figure 2A, these cannot be recorded from the same hemisphere if the former area is implanted with an array). From the 823 neurons recorded in dlPFC, 55 were recorded on the right hemisphere in Monkey M. All inter-area coupling analysis (which requires simultaneous recordings) were based on left-hemisphere recordings in 7a and dlPFC, and from right-hemisphere recordings in MSTd. All monkeys were right-handed and used this hand to manipulate the joystick. AS, arcuate sulcus; IPS, intraparietal sulcus; PS, principal sulcus; STS, superior temporal sulcus; LF, lateral fissure.”

In short, all animals were right-handed. All recordings in area 7a were on the left-hemisphere, while all recordings in MSTd were on the right hemisphere. It is impossible to record from these two areas simultaneously in the same hemisphere, if recording high-density with a Utah array in 7a (the area on the surface). The vast majority of neural recordings in dlPFC (93%) were conducted via a Utah array on the left hemisphere. All data analysis requiring simultaneous recordings (i.e., inter-area coupling, inter-area LFP phase coupling) were conducted with MSTd data on the right hemisphere, and 7a and dlPFC on the left hemisphere. Thus, 7a and dlPFC were on equal footing with regard MSTd, in that they were both inter-hemispheric.

9) Data fed into the P-GAM model.

a) The P-GAM model is a great analysis tool for these kinds of data. However, the variables that the authors put into it are conceptually very different from each other. There are purely external task variables such as target onset and offset, latent variables such as distance to target that require knowledge of one's own position in space, and purely internal brain dynamics variables such as coupling to the LFP in another area. In that light, the finding of 'many variables contributing to the responses' is not surprising; all neurons in the brain are probably influenced both by external variables and internal brain dynamics. Maybe the authors could comment on the different nature of their variables and how that impacts their results.

We agree with the reviewer that neurons are very likely influenced by both external task variables and internal brain dynamics. In fact, we consider that the inclusion of both these types of variables to be a strong asset of the current manuscript. We have amended the text to more explicitly mention the difference between these types of variables. The text reads:

“In addition to continuous sensorimotor (e.g., linear and angular velocity and acceleration) and latent variables (e.g., distance from origin and to target, Figure 1A), as well as discrete task events (e.g., time of target onset, as well as movement onset and offset), we included elements of brain dynamics in the encoding model. These internal dynamics are most often not considered in accounting for task-relevant neural responses, yet they fundamentally shape spiking activity. These latter variables included the phase of LFP in different frequency bands (theta: 4-8 Hz; α: 8-12 Hz; β: 12-30 Hz), and causal unit-to-unit coupling filters within (i.e., spikehistory, 36 ms wide temporal filter) and between units, both within (36 ms wide temporal filter) and across cortical areas (600 ms wide temporal filters, Figure 2C, see Methods).”

b) Given that the sensorimotor and latent variables going into the G-PAM model are so crucial for the story, could you make a figure where you visualize them? This could maybe be added to Figure 1A. Also, 'radial bias' and 'angular bias' (in Fig1D) could be visualized here.

We thank the reviewers for the suggestion and agree that depicting these time-courses could be helpful to readers. We have modified the figure.

c) Quantification of electrophysiological activity processing that is fed into the P-GAM model is not entirely clear.

More details about the preprocessing of these data are required, for example, are the SUA baselined using pre-stimulus presentation activity? Are the LFP baselined as well? And how similar are the pooled responses within each area and across? This would allow the reader to spot possible problems when computing further neuronal properties, that could bias the main paper result:

An example is the tuning of the neurons to the phase of ongoing oscillation (Β, Α, Theta). There are a number of papers attempting to optimize methods to measure spike field coherency, e.g. the PPC pairwise phase consistency (Vinck et al., 2010). This method gives an estimation independent of spike count and LFP amplitudes (both parameter vary of course widely across time, tasks, subjects, areas…).

Here, it seems these two parameters are not considered and could lead to artefacts in the coupling results presented. The authors use temporal correlations to approximate coupling between spike/spike, and spike/LFP-phase. Correlation methods can potentially lead to artefacts and overestimations of coupling strength.

In their methods, the author state to 'bin spiking activity across 8ms window' prior to feeding this activity to the P-GAM. It means that 1 spike corresponds to an averaged 8ms time window. If you now try to calculate the dependency of this single spike to a specific phase of a β (30Hz) , the α (12Hz) and theta (4Hz) oscillation, it means that the chance level of assigning the binned spike to a particular phase differs considerably. Therefore, the statistical power of this analysis would decrease for higher frequency. It seems that the authors do not apply any correction.

We thank the reviewers for this question. We have added information regarding the pre-processing of neural data in the following manner:

“As such, inputs to the P-GAM were of three types. First, the spike counts of the unit to be modelled, at a 6ms resolution (i.e., the number of spikes within 6ms windows, no baseline correction). Second, the continuous, discrete, and neural co-variates, which were also sampled at a 6ms resolution. The last input type were a set of 15 “knots” per co-variate, defining the nodes of eventual tuning functions. The location of knots were defined as to (i) cover the range of a given input variable from the second to the 98th percentile with (ii) equi-probable knots (each knot covering the same probability mass). See

https://github.com/BalzaniEdoardo/PGAM/blob/master/PGAM%20Tutorial.ipynb for a comprehensive tutorial.“

Regarding the spike-LFP phase coupling, we thank the reviewers for highlighting the pairwise phase consistency (PPC0) measure. We have applied this measure and report the findings in Figure 2 —figure supplement 9.

As it can be observed, the main finding that neurons in area 7a are considerably more phase locked to LFPs (particularly in the Β range, but also in Α) holds. We have added Figure 2 —figure supplement 9 and the following text:

Results:

“Neurons in area 7a showed a strong dependency to the phase of ongoing LFP fluctuations (see Figure 2 —figure supplement 8 for further illustrations of this effect as quantified by the P-GAM, and Figure 2 —figure supplement 9 for corroborative evidence by pairwise phase consistency, Vinck et al., 2010), a fact that was less observed in dlPFC or MSTd (all p < 7.1 x 10-8, all d > 1.02) “

Figure caption:

“Figure 2 – supplement figure 9. Pairwise phase consistency. The findings related to spike-LFP phase coupling reported in the main text are based on the P-GAM, as are the rest tuning properties reported. However, detecting a correlation between when spikes occur and the phase of LFPs may be biased by a number of factors, for example the firing rate of neurons or the amplitude of LFP oscillations. In turn, we corroborated the spikeLFP phase coupling results by computing the pairwise phase consistency (PPC0), a bias-free estimator (see Vinck et al., 2010 for detail). This analysis confirmed the findings from the main text in demonstrating greater spike-LFP phase coupling in area 7a, particularly within the Β and Α ranges. Mean PPC across all neurons and sessions are reported separately by brain region and frequency range. Error bars are S.E.M.”

https://doi.org/10.7554/eLife.80280.sa2

Article and author information

Author details

  1. Jean-Paul Noel

    Center for Neural Science, New York University, New York City, United States
    Contribution
    Conceptualization, Data curation, Software, Formal analysis, Validation, Investigation, Visualization, Methodology, Writing - original draft, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0001-5297-3363
  2. Edoardo Balzani

    Center for Neural Science, New York University, New York City, United States
    Contribution
    Conceptualization, Resources, Data curation, Software, Formal analysis, Validation, Investigation, Visualization, Methodology, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-3702-5856
  3. Eric Avila

    Center for Neural Science, New York University, New York City, United States
    Contribution
    Conceptualization, Data curation, Investigation, Methodology
    Competing interests
    No competing interests declared
  4. Kaushik J Lakshminarasimhan

    1. Center for Neural Science, New York University, New York City, United States
    2. Center for Theoretical Neuroscience, Columbia University, New York, United States
    Contribution
    Conceptualization, Data curation
    Competing interests
    No competing interests declared
  5. Stefania Bruni

    Center for Neural Science, New York University, New York City, United States
    Contribution
    Methodology
    Competing interests
    No competing interests declared
  6. Panos Alefantis

    Center for Neural Science, New York University, New York City, United States
    Contribution
    Methodology
    Competing interests
    No competing interests declared
  7. Cristina Savin

    Center for Neural Science, New York University, New York City, United States
    Contribution
    Supervision, Validation, Investigation, Project administration, Writing – review and editing
    Contributed equally with
    Dora E Angelaki
    Competing interests
    No competing interests declared
  8. Dora E Angelaki

    Center for Neural Science, New York University, New York City, United States
    Contribution
    Conceptualization, Supervision, Funding acquisition, Project administration, Writing – review and editing
    Contributed equally with
    Cristina Savin
    For correspondence
    da93@nyu.edu
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-9650-8962

Funding

National Institutes of Health (1U19 NS118246)

  • Dora E Angelaki

National Institutes of Health (1R01 NS120407)

  • Dora E Angelaki

National Institutes of Health (1R01 DC004260)

  • Dora E Angelaki

National Institutes of Health (1R01MH125571)

  • Cristina Savin

National Science Foundation (1922658)

  • Cristina Savin

Google faculty award

  • Cristina Savin

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

The authors thank Jing Lin and Jian Chen for programming the experimental stimulus. We also thank Roozbeh Kiani for his surgical expertise during the Utah array implantations. The work was funded by a 1U19 NS118246, 1R01 NS120407, and 1R01 DC004260 from NIH to DEA, as well as by 1R01MH125571 from NIH, the National Science Foundation under NSF Award No. 1922658, and a Google faculty award to CS.

Ethics

All surgeries and procedures were approved by the Institutional Animal Care and Use Committee at Baylor College of Medicine and New York University and were in accordance with National Institutes of Health guidelines.

Senior Editor

  1. Joshua I Gold, University of Pennsylvania, United States

Reviewing Editor

  1. Kristine Krug, Otto-von-Guericke University Magdeburg, Germany

Reviewer

  1. Marieke Schölvinck, Ernst Strungmann Institute for Neuroscience, Germany

Version history

  1. Preprint posted: October 24, 2021 (view preprint)
  2. Received: May 14, 2022
  3. Accepted: October 24, 2022
  4. Accepted Manuscript published: October 25, 2022 (version 1)
  5. Version of Record published: November 16, 2022 (version 2)

Copyright

© 2022, Noel et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 1,379
    Page views
  • 239
    Downloads
  • 6
    Citations

Article citation count generated by polling the highest count across the following sources: PubMed Central, Crossref, Scopus.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Open citations (links to open the citations from this article in various online reference manager services)

Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)

  1. Jean-Paul Noel
  2. Edoardo Balzani
  3. Eric Avila
  4. Kaushik J Lakshminarasimhan
  5. Stefania Bruni
  6. Panos Alefantis
  7. Cristina Savin
  8. Dora E Angelaki
(2022)
Coding of latent variables in sensory, parietal, and frontal cortices during closed-loop virtual navigation
eLife 11:e80280.
https://doi.org/10.7554/eLife.80280

Further reading

    1. Cell Biology
    2. Neuroscience
    Elisabeth Jongsma, Anita Goyala ... Collin Yvès Ewald
    Research Article Updated

    The amyloid beta (Aβ) plaques found in Alzheimer’s disease (AD) patients’ brains contain collagens and are embedded extracellularly. Several collagens have been proposed to influence Aβ aggregate formation, yet their role in clearance is unknown. To investigate the potential role of collagens in forming and clearance of extracellular aggregates in vivo, we created a transgenic Caenorhabditis elegans strain that expresses and secretes human Aβ1-42. This secreted Aβ forms aggregates in two distinct places within the extracellular matrix. In a screen for extracellular human Aβ aggregation regulators, we identified different collagens to ameliorate or potentiate Aβ aggregation. We show that a disintegrin and metalloprotease a disintegrin and metalloprotease 2 (ADM-2), an ortholog of ADAM9, reduces the load of extracellular Aβ aggregates. ADM-2 is required and sufficient to remove the extracellular Aβ aggregates. Thus, we provide in vivo evidence of collagens essential for aggregate formation and metalloprotease participating in extracellular Aβ aggregate removal.

    1. Computational and Systems Biology
    2. Neuroscience
    Marjorie Xie, Samuel P Muscinelli ... Ashok Litwin-Kumar
    Research Article Updated

    The cerebellar granule cell layer has inspired numerous theoretical models of neural representations that support learned behaviors, beginning with the work of Marr and Albus. In these models, granule cells form a sparse, combinatorial encoding of diverse sensorimotor inputs. Such sparse representations are optimal for learning to discriminate random stimuli. However, recent observations of dense, low-dimensional activity across granule cells have called into question the role of sparse coding in these neurons. Here, we generalize theories of cerebellar learning to determine the optimal granule cell representation for tasks beyond random stimulus discrimination, including continuous input-output transformations as required for smooth motor control. We show that for such tasks, the optimal granule cell representation is substantially denser than predicted by classical theories. Our results provide a general theory of learning in cerebellum-like systems and suggest that optimal cerebellar representations are task-dependent.