1. Neuroscience
Download icon

Neural encoding of actual and imagined touch within human posterior parietal cortex

  1. Srinivas Chivukula
  2. Carey Y Zhang
  3. Tyson Aflalo  Is a corresponding author
  4. Matiar Jafari
  5. Kelsie Pejsa
  6. Nader Pouratian
  7. Richard A Andersen
  1. Department of Biology and Biological Engineering, California Institute of Technology, United States
  2. Tianqiao and Chrissy Chen Brain-Machine Interface Center, Chen Institute for Neuroscience, California Institute of Technology, United States
  3. Geffen School of Medicine, University of California, Los Angeles, United States
Research Article
  • Cited 1
  • Views 1,239
  • Annotations
Cite this article as: eLife 2021;10:e61646 doi: 10.7554/eLife.61646

Abstract

In the human posterior parietal cortex (PPC), single units encode high-dimensional information with partially mixed representations that enable small populations of neurons to encode many variables relevant to movement planning, execution, cognition, and perception. Here, we test whether a PPC neuronal population previously demonstrated to encode visual and motor information is similarly engaged in the somatosensory domain. We recorded neurons within the PPC of a human clinical trial participant during actual touch presentation and during a tactile imagery task. Neurons encoded actual touch at short latency with bilateral receptive fields, organized by body part, and covered all tested regions. The tactile imagery task evoked body part-specific responses that shared a neural substrate with actual touch. Our results are the first neuron-level evidence of touch encoding in human PPC and its cognitive engagement during a tactile imagery task, which may reflect semantic processing, attention, sensory anticipation, or imagined touch.

Introduction

Touch is a complex, multisensory perceptual process (de Haan and Dijkerman, 2020; de Lafuente and Romo, 2006; Graziano and Gross, 1993). In non-human primates (NHPs), multisensory input (e.g., visual, tactile) converges upon neurons in higher-order brain regions such as the posterior parietal cortex (PPC) where they are integrated into coherent representations (Graziano and Gross, 1993; Avillac et al., 2007; Graziano, 1999; Graziano, 2001; Graziano et al., 2000; Holmes and Spence, 2004; Hwang et al., 2014; Seelke et al., 2012; Sereno and Huang, 2014). Recent human neuroimaging studies suggest that the PPC is also recruited during touch cognition in the absence of actual tactile input (e.g., seen touch or imagined touch), supporting a notion that both higher-level touch processing and tactile cognition share a neural substrate (Chan and Baker, 2015; Lucas et al., 2015). To date, however, such a link has not been established at the single neuron level.

We recently reported an analogous relation in the parallel domain of motor function (Aflalo et al., 2020; Aflalo et al., 2015; Rutishauser et al., 2018Zhang et al., 2017). In these studies, we found that a shared PPC neuronal population coded for overt movements as well as cognitive motor variables including imagery, observed actions, and action verbs (Aflalo et al., 2020; Aflalo et al., 2015; Andersen and Buneo, 2002; Rutishauser et al., 2018Zhang et al., 2017). This richness of representation is made possible through a partially mixed encoding in which single neurons represent multiple variables, allowing a relatively small neuronal population (recorded through a 4 × 4 mm implanted microelectrode array) to provide many movement-related signals (Zhang et al., 2017; Zhang et al., 2020). Here, we hypothesize that the same PPC neuronal population engaged by high-level motor cognition also encodes actual tactile sensations as well as tactile cognition within this partially mixed encoding structure.

The neural correlates of somatosensory perception are characterized by spatially structured receptive fields to touch that respond at short latency (Keysers et al., 2010). In NHPs, subregions of the PPC within and medial to the intraparietal sulcus (IPS) encode tactile receptive fields that respond to bilateral stimuli (Graziano and Gross, 1993; Avillac et al., 2007; Graziano, 1999; Seelke et al., 2012; Sereno and Huang, 2014). These are often large receptive fields, extending across multiple body parts (Avillac et al., 2007; Graziano, 1999; Graziano, 2001; Graziano et al., 2000; Hwang et al., 2014; Seelke et al., 2012; Sereno and Huang, 2014; Sakata et al., 1973). In humans, functional magnetic resonance imaging (fMRI) studies support multisensory encoding of touch within and medial to the IPS in anterior portions of PPC (Huang et al., 2018; Sereno and Huang, 2014; Huang et al., 2012). Although these studies indicate that relatively small regions of PPC may encode touch to large portions of the body, the limited spatial resolution of fMRI precludes a characterization of tactile receptive fields. The inability to resolve single neurons in fMRI is especially problematic when attempting to understand the significance of the grossly overlapping representations of actual touch and cognitive representations of touch (Chan and Baker, 2015; Lucas et al., 2015). Spatial correspondence in fMRI cannot confirm whether representations share a neuron-level substrate (Caramazza et al., 2014). Taken together, it is unclear from the current literature whether individual neurons in human PPC discriminate touch to different segments of the body with spatially structured receptive fields, and, if so, whether cognitive processing of touch engages the same populations of cells.

In a unique opportunity, we investigated touch processing in a tetraplegic human subject at the level of single neurons recorded from an electrode array implanted in the left PPC for an ongoing brain machine interface (BMI) clinical trial. We recorded single- and multi-unit neural activity during the presentation of actual touch and during imagined touch to sensate dermatomes above the level of the participant’s injury. We found that neurons recorded at the junction of the postcental and intraparietal sulci in humans (postcentral-intraparietal, PC-IP) encoded actual touch at short latency (~50 ms) with bilateral spatially structured receptive fields, covering all tested, sensate regions within the head, face, neck, and shoulders. The tactile imagery task evoked body part-specific responses that shared a neural substrate with actual touch. Our results demonstrate that PPC neurons that discriminate touch are partially reactivated during a tactile imagery task in a body part-specific manner. The latter represents a novel finding, thus far untestable in NHP models, and suggests PPC involvement in the cognitive processing of touch.

Results

We recorded from on average 101.6 ± 7.2 neurons (Figure 1—figure supplement 1) over 14 sessions in the PPC (left hemisphere) of a tetraplegic human participant (spinal injury at levels 3–4; C3/4). In previous work, we referred to the implant area as the anterior intraparietal cortex, a region functionally defined in NHPs (Aflalo et al., 2020, Aflalo et al., 2015; Rutishauser et al., 2018Zhang et al., 2017; Zhang et al., 2020; Andersen et al., 2019; Sakellaridi et al., 2019). Here, we refer to the recording site as the PC-IP, acknowledging that further work is necessary to definitively characterize homologies between human and NHP anatomy. Recordings were split across four tasks, designed to probe basic properties of the neuronal population during both actual and imagined touch. Recordings were made from a chronic implanted array, and thus neuronal waveform sorting resulted in both well-isolated neuronal waveforms and multi-neuron groupings. The main figures aggregate across sorted channels while key analyses are performed separately for well-isolated and multi-unit activity in supplemental figures.

PC-IP neurons encode bilateral tactile receptive fields

We first examined the hypothesis that PC-IP neurons encode tactile receptive fields to dermatomes above the level of the participant’s spinal cord injury (SCI). Tactile stimuli were delivered as rubbing motions at approximately 1 Hz for 3 s. The subject was asked to keep her eyes closed to eliminate neural responses arising from visual input. Tactile stimuli were presented to bilateral axial (forehead, vertex, cheek, neck, back) and truncal (shoulder) body parts to determine the extent of body coverage of any tactile representations among PC-IP neurons. As controls, touch was also presented to the bilateral hands (insensate regions below the level of SCI), and a null condition was included (with no stimulus delivered) to verify that touch-related neural responses did not arise by chance.

For each neuron, we fit a linear model that explained firing rate as a function of responses to each touch location. Neural responses to a particular body location were considered significant if the t-statistic for the associated beta coefficient was significant (p<0.05, false discovery rate [FDR] corrected for multiple comparisons). A significant fraction of the neuronal population encoded touch to each of the tested body parts with preserved somatosensation (χ2(1)=3908.98, p<0.05; Figure 1A, Figure 1—figure supplement 2). These results are consistent with bilateral encoding as the tested body parts included both body sides. Neither touch to the hands nor the null condition elicited significant neuronal modulation. Single neurons discriminated the location of actual touch: Of the 263 responsive units shown in Figure 1A, we found that 257 discriminated touch location (ANOVA, FDR corrected for multiple comparisons). Representative examples of neurons showing clear discrimination between the different touch locations are shown in Figure 1B. As expected, a population of discriminative cells enabled accurate cross-validated classification of the touched body part (Figure 1C; see Figure 1—figure supplement 3 for single-session examples).

Figure 1 with 3 supplements see all
Postcentral-intraparietal (PC-IP) discriminably encodes bilateral tactile receptive fields.

(A) Percentage of the PC-IP neuronal population that demonstrated significant modulation relative to baseline for each tested stimulation site (p<0.05, false discovery rate corrected, n = 398 units). Results are shown as the mean percentage (horizontal black line) of the population ± bootstrapped 95% confidence interval (bar height). Gray bars represent truncal (midline) body locations, blue bars represent left (ipsilateral)-sided sites, and orange bars represent right (contralateral)-sided sites. Population results were pooled across recording sessions (Figure 1—figure supplement 1) and were not qualitatively affected by pooling together single and potential multi-units (Figure 1—figure supplement 2). (B) Representative neuronal responses illustrating body part discrimination. Each column of panels depicts the response for one neuron to body parts on the left (top two rows) and on the right (bottom two rows). The first and third rows show the neural response (mean firing rate ± standard error on the mean, n = 10 trials) as a function of time. The second and fourth rows show the spike rasters. Within these rows, each panel depicts the spike activity over each of the 10 trials (rows) and time (x-axis), and are color-coded by tested body site. The vertical line labeled ‘Go’ indicates the start of the stimulus phase. (C) Confusion matrix of the cross-validated classification accuracy (as percentage) for predicting body parts from population neural data. Colors represent the cross-validated accuracy, as in the scale. The matrix is an average of the confusion matrices computed for each recording day individually (Figure 1—figure supplement 3). L: left body, ipsilateral to implant; R: right body, contralateral to implant.

Single neurons were heterogenous, responding to variable numbers of touch sites (Figure 2A, Figure 2—figure supplement 1). Right and left sides tended to respond to the same number of fields (evidenced by the strong diagonal structure of Figure 2A). Tactile receptive fields of PC-IP neurons were diverse with evidence both for broad single-peaked fields and multi-peaked fields characterized by spatially separated regions of enhanced response (Figure 2—figure supplement 2, Figure 2—figure supplement 3).

Figure 2 with 6 supplements see all
Neurons respond to variable numbers of bilateral receptive fields.

Bilateral responses are mirror-symmetric. (A) Matrix showing the number of neurons from within the postcentral-intraparietal population that responded to the number of body parts shown, along each of the left and right body sides. Colors represent the number of neurons. Population results were not qualitatively affected by pooling together single and potential multi-units (Figure 2—figure supplement 1). Analysis of tactile receptive fields is shown in Figure 2—figure supplement 2 and Figure 2—figure supplement 3. (B) Neuronal population correlation demonstrating the relation in encoding structure between body locations. Colors represent strength of correlation, as in the scale. Population results were not qualitatively affected by pooling together single and potential multi-units (Figure 2—figure supplement 4) or by choice of distance metric (Figure 2—figure supplement 5). For mirror-symmetry analysis at the single unit level, see Figure 2—figure supplement 6. L: left body, ipsilateral to implant; R: right body, contralateral to implant.

PC-IP neurons demonstrated mirror-symmetric bilateral coding. We performed a cross-validated population correlation analysis to measure population-level similarity in the responses to each touch location (Figure 2B, Figure 2—figure supplement 4). In brief, the neural activation pattern elicited by touch to each body location was quantified as a vector, with each vector element capturing the mean response for a particular neuron during actual touch. These vectors were then pairwise correlated in a cross-validated manner so that the strength of correlation between any two body parts could be compared against the strength of correlation for repeated touches applied to the same body part. We found that responses to the same touch locations on the right and left sides are highly correlated, comparable to the correlation for repeated touches applied to the same body part. This result is consistent with a strong, mirror-symmetric, bilateral encoding. As expected, correlations involving the hands and the null condition were distributed about zero, consistent with a lack of systematic neural population response to these conditions. The results from the correlation analysis were similar for alternative distance metrics (Figure 2—figure supplement 5). Further, analysis of single units revealed mirror symmetry in bilateral representation for the vast majority of the population, paralleling population-level findings (Figure 2—figure supplement 6).

Tactile responses occur at short latency to bilateral stimuli

We explored PC-IP population response latency to tactile stimulation on the contralateral and ipsilateral body sides. In a variation of the basic task paradigm, we used a capacitive touch sensing probe to acquire precise measurements of the time of contact with the skin surface in order to measure the latency in neuronal response from the time of tactile stimulation. We probed latency on the bilateral cheeks and shoulders. As a control, we included both hands in the task design.

We measured latency as the time at which the response of the neural population rose above the pre-stimulus baseline activity (Figure 3). The neural population response was quantified as the first principal component computed from principal component analysis (PCA) of the activity of all neurons (Cunningham and Yu, 2014; Cunningham and Ghahramani, 2015). The first principal component was then fit with a piecewise linear function, and latency was computed as the time the linear function crossed the baseline pre-stimulus response. Response latency was short for both body sides and slightly shorter for contralateral (right) receptive fields (50 ms) than for ipsilateral (left) receptive fields (54 ms), although this difference was not statistically significant (permutation shuffle test, p>0.05). Figure 3A shows the time course of the first principal component relative to time of contact of the touch probe (stepped window; 2 ms window size, stepped at 2 ms, no smoothing) along with the piecewise linear fit (dashed line). A bootstrap procedure was used to find the interquartile range of latency estimates (Figure 3B).

Tactile responses occur at short latency.

(A) Population response was quantified as the first principal component (mean ± 95% CI). Population response was computed separately for the left (blue; ipsilateral to implant) and right body sides (orange; contralateral to implant) and is shown as a function of time (2 ms window size, 2 ms step size, no smoothing). Dashed lines show piecewise linear fit used to compute latency. Transparent vertical bar shows inter-quartile latency range based on a bootstrap procedure (see B). (B) Distribution illustrating variability of latency estimates for the recorded data using a bootstrap procedure. Color code as in A.

Tactile imagery task evokes body part-specific responses congruent with actual touch

The results thus far establish that PC-IP neurons have spatially structured tactile receptive fields that are activated at short latency consistent with processing of tactile sensations. Are neurons that encode tactile sensations also recruited during tactile imagery? And if so, how might evoked neural responses compare to those arising from actual touch? To address these questions, we performed an additional experiment allowing us to compare population activity elicited during a tactile imagery task with activity elicited during actual touch to matching body parts recorded during interleaved trials. During the imagery conditions, the participant was instructed to imagine touch to the right (contralateral) cheek, shoulder, or hand with the same qualities as the actual touch stimuli the participant experienced during interleaved trials. A null condition was included as a baseline to measure neural activity when no stimulus was presented.

As with findings for actual touch, neuronal responses elicited during the tactile imagery task following the go cue (during the imagery phase) were discriminably encoded (Figure 4A, cross-validated accuracy 92%). High decode accuracy is consistent with the participant’s compliance with task instructions and implies that the tactile imagery task elicited discriminative neural responses. A significant fraction of PC-IP neurons encoded actual touch to the cheeks and shoulders but not to the hands (Figure 4B; χ2(1)=355.73, p<0.05), consistent with the results presented in Figure 1. In comparison, a smaller fraction of the neuronal population was responsive to the cheek and shoulder during imagery of tactile stimuli (Figure 4B). Of note, a significant number of neurons were active during imagined touch to the hand (χ2(1)=188.89, p<0.05), despite the hand being clinically insensate in the study participant (and despite actual touch to the hand not eliciting neuronal activation). The extent of overlap between the set of units active during actual and the tactile imagery condition is illustrated in Figure 4C. The degree of overlap, compared to what is expected by chance, was statistically significant (permutation shuffle test, p<0.05). Results were qualitatively similar for well-isolated single units (Figure 4—figure supplement 1).

Figure 4 with 1 supplement see all
Postcentral-intraparietal (PC-IP) neurons encode body part-specific responses during the tactile imagery task.

(A) Average classification confusion matrix across recording sessions for body parts during tactile imagery and the baseline (null) condition. Colors represent prediction accuracy as a percentage, as in the scale. (B) Percentage of PC-IP neurons significantly modulated from baseline (mean ± 95% CI, p<0.05, false discovery rate corrected, n = 838 units) split by test condition. Population results were not qualitatively affected by pooling together single and potential multi-units (Figure 4—figure supplement 1). (C) Venn diagram illustrating the number (percentage) of PC-IP neurons recorded that activated during actual and imagined touch, and their overlap. (D) Population correlation matrix depicting similarity of the population response between all test conditions. Colors represent the correlation strength, as in the scale. (E) Distribution of correlations between actual shoulder (left) and cheek (right) touch and imagined cheek/shoulder touches, with the distributions computed over different splits of the data (see 'Materials and methods: Population correlation'). L: left body, ipsilateral to implant; R: right body, contralateral to implant.

We used the population correlation measure to compare population-level neural activity across conditions (Figure 4D). Neural activity during the tactile imagery task shared a neural substrate with responses evoked by actual touch: representations evoked during the imagery task and during actual touch were more similar for matching body parts than for mismatched body parts (Figure 4E, permutation shuffle test, p<0.05).

Dynamic evolution of population coding between task epochs suggests multiple cognitive processes

The analyses above were restricted to the mean neuronal activity following the go cue (e.g., during actual touch or during imagery) to allow a direct comparison with the results reported for the previous paradigms. We now expand this analysis. During the tactile imagery task, the participant heard a verbal cue specifying a body part (verbal cue = ‘cheek,’ ‘hand,’ or ‘shoulder’) followed approximately 1.5 s later by a beep instructing the participant to imagine the stimulus at the cued body part on the right side of the body. This cue-delay paradigm is standard in the motor physiology literature and is used to dissociate planning from motor execution-related neural activity (Aflalo et al., 2015; Rosenbaum, 1983a; Lecas et al., 1986; Ames et al., 2019). In our case, the cue-delay was unique to the tactile imagery condition. We utilized the cue-delay task to begin to dissociate in time neural activity related to different aspects of the task.

To leverage the benefits of the cue-delay paradigm, we performed a dynamic classification analysis (500 ms windows, stepped at 100 ms). Results are shown as a matrix (Figure 5). In brief, the diagonal elements represent the cross-validated prediction accuracy for a specific time window. The off-diagonal elements represent how well the classifier generalizes to alternate time windows. Each row can be interpreted as quantifying how well decision boundaries established for the diagonal time windows generalize to other time windows. This analysis allows us to measure when the neuronal population represents the different body parts (the diagonal) and whether population coding is similar or distinct during the task phases (the off-diagonal). We are interested in two main phases of the task: the early portion comprised the cue and delay (cue-delay), and the later portion when the participant is actively imagining the stimulus (go/imagery). Figure 5A schematically illustrates the examples of possible results. The examples are meant to be illustrative and are not an exhaustive list of possibilities. The population may be discriminative exclusively during the imagery phase, during the cue-delay and imagery phases but with distinct population coding, during the cue-delay and imagery phases with identical coding, or during the cue-delay and imagery phases with partially shared and partially distinct coding. Each pattern would suggest a different interpretation of various forms of cognitive processing that may be engaged in a tactile imagery task (see 'Discussion').

Shared and distinct coding of body parts during cue-delay and imagery epochs.

(A) Schematic illustrating possible dynamic classification patterns over epochs of the tactile imagery task. In each panel, the window used for classifier training is along the y-axis and the window used for classifier testing is along the x-axis. The start of the auditory cue (marking the onset of the cue-delay epoch) and the beep (marking the go signal for the imagery epoch) is shown as solid white lines, labeled ‘Cue’ and ‘Go.’ (B) Dynamic classification analysis results for the imagined touch test conditions with conventions as in A. The colors represent prediction accuracy values (as percentage), as in the scale. (C) Illustration of distinct and shared neuronal responses between the cue/delay and imagery epochs for the boxed window of B. Shared response illustrated with cross-validated, classification generalization accuracy (blue, mean with 95% confidence interval computed across sessions). Distinct response illustrated with cross-validated Mahalanobis distance (gray, mean with 95% confidence interval computed across sessions). The dashed vertical line marks the onset of the imagery epoch. The dashed horizontal line marks chance classification accuracy. (D) Dynamic classification matrices were constructed separately for all selective units. The first three principal components (PCs) of the dynamic classification matrices of single-unit activity are shown, along with the fractional variance explained by each. The mean activity of all neurons within the PC is shown within each panel, color-coded by PC weights. Plot conventions are as in A and B. s: seconds.

The results of our classification analysis (Figure 5B) are most consistent with body part selectivity during both the cue-delay and imagery phases, with partially shared and partially distinct population coding of the body parts between phases. The shared component is evident in the significant generalization accuracy in the off-diagonal elements, a representative row of which is shown in Figure 5C (blue portion) where cross-validated accuracy generalizes from approximately 70% within the cue-delay phase to approximately 60% during the imagery phase. The distinct population activity between phases is highlighted by a cross-validated Mahalanobis distance that provides a sensitive measure of change that is masked by the discretization process of classification (expanded rationale in 'Materials and methods: Temporal dynamics of population activity'). The findings demonstrate a significant change between the activity patterns in the cue-delay and imagery epochs (Figure 5C, gray).

To further clarify the properties of individual units, we conducted a dynamic classification analysis for each recorded unit. This resulted in the same matrices described above, but now each matrix represents how information coding evolves for a single unit. Time-resolved classification data were then analyzed using PCA, the first three principal components of which are shown in Figure 5D. A majority of variance (26%) is explained by units that are active during both epochs with similar coding. Coding during the imagery epoch exclusively or during the cue-delay epoch exclusively explained an additional 9% of variance.

Cognitive processing during the cue-delay and imagery epochs of the tactile imagery task shares a neural substrate with that for actual touch

Finally, we look at how encoding patterns through time generalize between the tactile imagery and actual touch conditions. A dynamic correlation analysis was applied both within and across the imagery and actual touch condition types (Figure 6A). In brief, the neural activation pattern elicited to each body location was quantified as a vector, and these vectors were concatenated to form a population response matrix for each condition type and for each point in time. These vectors were then pairwise correlated in a cross-validated manner so that the strength of correlation between conditions could be assessed relative to the strength of correlation within condition and across time. We found that the neural population pattern that defined responses to actual touch was similar to population responses both during the cue-delay or the imagery phases of the imagery task (Figure 6A). This implies that cognitive processing prior to active imagery as well as during imagery shares a neural substrate with actual touch. Sample neuronal responses that help to understand single unit and population behavior are shown in Figure 6B.

Cue-delay and imagery-evoked neural activity shares a neural substrate with actual touch.

(A) Within- and across-condition dynamic, cross-validated correlation analysis demonstrating a shared neural substrate between imagined and actual tactile sensations. Each panel shows how the neural population response at one slice of time compares to all other slices of time for the two formats being compared (x- and y-axis labels). Correlation magnitude is indicated by color as in the bar. The start of the auditory cue (marking the onset of the cue-delay epoch) and the beep (marking the go signal for the imagery epoch) is shown as solid white lines, labeled ‘Cue’ and ‘Go.’ (B) Representative neuronal responses illustrating selectivity during actual and imagined sensations. Each panel shows the firing rate (in Hertz, mean ± SEM) through time (vertical lines signal onset of cue/delay and go phases as labeled). Each column illustrates the responses of the same unit to tactile imagery of the right side (top), actual touch on the right side (middle), and actual touch on the left side (bottom) for matched body parts.

Discussion

We have previously reported that human PPC encodes many action variables in a high-dimensional and partially mixed representation (Aflalo et al., 2020; Zhang et al., 2017; Zhang et al., 2020). This architecture allows many parameters to be encoded by a small number of neurons, while still enabling meaningful relationships between variables to be preserved. Here, we show that neurons recorded from the same electrode array in the same clinical trial participant are also selective for bilateral touch at short latency. Responses to actual touch are organized around body part, sharing population representations between the left and right sides. Additionally, a tactile imagery task elicits body part-specific responses that share a neural substrate with that for actual touch. Furthermore, we found neural selectivity during the active imagery epoch as well as during the cue and delay epochs that precede imagery. The distinguishable population activity during these different phases indicates an encoding of multiple cognitive processes that may include semantic association, memory, attention, sensory anticipation, or imagery per se.

Human PC-IP encodes tactile stimuli with large and bilateral receptive fields

Cortical processing of somatosensory information begins in the anterior portion of the parietal cortex (APC) within four cyto-architectonically defined areas termed BA 3a, 3b, 1, and 2 (Kaas, 1983; Kaas et al., 1979; Sur et al., 1980). Each of these four sub-regions represents primarily contralateral somatosensory information (Ferezou et al., 2007; Geyer et al., 2000; Iwamura, 1998; Jiang et al., 1997; Ruben et al., 2001; Schnitzler et al., 1995; Tamè et al., 2016, Tamè et al., 2015). Moving from the APC to superior regions of the PPC, spatially localized and segregated sensory representations become progressively more integrated, resulting in neuronal receptive fields that are larger, frequently encompassing multiple segments of the body (de Lafuente and Romo, 2006; Sakata et al., 1973; Burton and Sinclair, 1990; Dykes, 1983; Garraghty et al., 1990; Pei et al., 2009; Pons et al., 1987; Saal et al., 2015; Soso and Fetz, 1980) including bilateral encoding (Schnitzler et al., 1995; Tamè et al., 2016, Tamè et al., 2015; Zhu et al., 2007). This process of integration is thought to play an integral role in sensory processing for the guidance of movement (Graziano et al., 2000; Mulliken et al., 2008). Our results, demonstrating that PPC neurons encode mirror-symmetric spatially structured tactile receptive fields at short latency, are consistent with these prior reports. They further provide the first single neuron evidence supporting a role of human PPC in tactile processing. As hypothesized, when comparing the current results with our prior reports, we find that the same PPC neuronal population engaged by high-level motor cognition also encodes actual tactile sensations, providing a common neural substrate for sensory and motor processing (Aflalo et al., 2020, Aflalo et al., 2015; Sakellaridi et al., 2019).

Short latency tactile responses

In NHPs, reported latency to touch responses in primary somatosensory cortex (S1) from contralateral touch ranges between 19 and 23 ms (Reed et al., 2010; Vázquez et al., 2012). PPC response latencies to touch are less clear, but neurons in the lateral intraparietal area within NHP PPC orient to visual stimuli at a mean latency of approximately 45 ms (Bisley et al., 2004). A recent human-invasive electrocorticographic study reported mean latencies to visual response of approximately 60 ms in PPC (Regev et al., 2018) similar to the mean response latencies to visual stimuli within the occipital cortex (Bisley et al., 2004; Regev et al., 2018). Our own response latency to actual touch of ~50 ms compares well with these data and is consistent with rapid somatosensory processing within PPC for updating internal estimates of the body (Graziano et al., 2000; Mulliken et al., 2008).

Tactile imagery dynamically invokes multiple cognitive processes in human PC-IP that share a neural substrate with actual touch

In motor neurophysiology, neural activity related to planning and execution is dissociated in time by introducing a delay between the cue instructing movement and the movement in response to the cue (Lecas et al., 1986; Rosenbaum, 1983b). We have previously found that such distinctions between planning and execution are preserved during motor imagery paradigms in tetraplegic individuals (Aflalo et al., 2015). Here, a similar paradigm allowed temporal dissociation in cognitive processing during tactile imagery. Single units demonstrated three dominant response profiles (Figure 5D): (1) a shared selectivity pattern between the cue-delay and imagery epochs, consistent with cognitive engagement during all phases of the imagery task; (2) selectivity exclusively during the cue-delay epoch but not the imagery epoch; and (3) selectivity exclusively during the imagery epoch but not the cue-delay epoch. In a previous study, we found similarly heterogeneous responses during the cue, delay, and imagery epochs for imagined hand grasp shapes (Klaes et al., 2015). These single-unit temporal selectivity profiles provide a basis for the population-level findings of generalization in classification results between the cue-delay and imagery epochs (Figure 5B,C) but also a separation in neural state-space between these epochs (Figure 5C).

The tactile imagery task evoked body part-specific cognitive activity that shared a neural substrate with actual touch within the PC-IP. Activity during imagined touch to the cheek, for example, was more similar in representation to actual touch to the cheek than to actual touch to the shoulder, and vice versa. Interestingly, the overlapping neural representations between actual touch and those elicited during imagery were not limited to the stimulus phase (actual touch and imagery) itself, but also extended to the cue-delay phase of the imagery task. This overlap echoes our recent findings for shared neural representations between imagined and attempted actions, as well as for shared neural representations between observed actions and action verbs (Aflalo et al., 2020; Zhang et al., 2017). These studies are consistent with views in which cognition recruits sensorimotor cortical regions (Binder and Desai, 2011; Meyer and Damasio, 2009; Miyashita, 2019; Patterson et al., 2007; Ralph et al., 2017). We acknowledge that, as with all passive neural recording studies, ours cannot establish a causal role for these neurons in tactile cognition. Understanding the unique contribution of PC-IP neurons within the larger network of brain regions engaged in cognitive touch processing remains to be explored. Nonetheless, our current results provide the first human single-unit evidence of a shared neural substrate between tactile imagery and actual touch.

One concern with the use of all imagery experiments is that participant compliance cannot be externally validated. This raises the possibility that the participant is not performing the task or is performing the task in an unexpected manner. We think this is unlikely for three reasons. First, the subject by the time of this study was well versed in performing cue-delayed paradigms in the motor domain using both motor imagery and overt movements. In Zhang et al., 2017, the participant’s performance of overt movements was perfect: the participant performed both the correct cued action and the action at the go cue (i.e., no movements began prior to the go cue as validated by measurements of electromyogram activity; Zhang et al., 2017). Second, our current pattern of results that includes stable and accurate body part-specific encoding within the cue-delay and imagery epochs, with a shift between epochs, is consistent with the participant performing the task as instructed. At a minimum, it is consistent with the participant’s performing two distinct cognitive operations during the two primary phases of the task with remarkable trial-to-trial consistency. Third, evidence for a shared neural substrate between the actual touch and imagined touch conditions indicates that selective responses during the imagery task are related to tactile cognition.

What does neural processing within human PC-IP during tactile imagery represent?

While our task identifies dynamic engagement of multiple cognitive processes during tactile imagery, it is inadequate to precisely define the cognitive correlates of the observed neural activity. A number of cognitive processes may be engaged during the tactile imagery task including preparation for and/or execution of imagery, engagement of an internal model of the body, semantic processing of the auditory cue, allocation of attention to the cued body location or nature of the upcoming stimulus, and/or sensory memory for the corresponding actual sensation applied by the experimenter.

The precise neural correlates of tactile imagery are unknown, but evidence suggests that both imagined and actual touch may engage the same internal mental representations, or internal models of the body (Kilteni et al., 2018; Schmidt and Blankenburg, 2019). Support for such a shared representation comes largely from the parallel domain of motor imagery (Kilteni et al., 2018). Imagined and actual movements show similarity at the behavioral (e.g., similar duration), physiological (e.g., alteration of heart rate), and neural levels (e.g., activating the same neural substrates) (Decety et al., 1993, Decety et al., 1991, Decety et al., 1989; Decety and Michel, 1989; Lotze and Halsband, 2006; Papaxanthis et al., 2002a, Papaxanthis et al., 2002b). These studies have been interpreted as evidence that imagined movements are the simulation of the internal models that track the state of our bodies during movement (Jeannerod and Decety, 1995). In powerful support of such a notion, we have shown that populations of neurons in human PPC code motor imagery and overt actions in highly similar ways (Zhang et al., 2017). The domain of tactile imagery has been less studied in comparison. However, relevant to the current paper, behavioral evidence has demonstrated that internal models of motor actions can influence sensory perception of touch (Kilteni et al., 2018). Further, human neuroimaging studies suggest that overlapping brain regions are activated during both imagined and actual touch, including the PPC (Schmidt and Blankenburg, 2019; Lucas et al., 2015; Wise et al., 2016). This points to not only a shared substrate for the representation of imagined and actual touch, but also to their likely engagement of similar internal models. Because an internal model may be involved in anticipatory or planning activity (and/or related to imagery), it could at least partly explain the pre-stimulus (post-cue, pre-imagination) neural activity we observed.

Another possibility is that the neural activity following the auditory cue in our study represents semantic processing of the cued word. Evidence suggests that a network of brain regions is activated in processing word meaning, including those involved in processing their higher-order sensory aspects, or motor intentions such as PPC (Binder and Desai, 2011; Meyer and Damasio, 2009; Ralph et al., 2017; Huth et al., 2016; Martin, 2016; Pulvermüller, 2013). Within this framework, semantic processing of the auditory cue (e.g., instructing imagined touch to the cheek) may engage the same population of neurons responsible for the higher-level processing of touch, consistent with our data. In support, we recently reported that action verbs and visually observed actions share a common neural substrate in the same PPC neural populations reported in the current study (Aflalo et al., 2020). Results were consistent with automatic semantic processing as distinct from imagery. The current findings would extend possible semantic processing to the tactile domain and demonstrate neuronal selectivity for auditory cues (in addition to written text used in the previous study).

Hearing an auditory cue can direct the study participant’s attention to the cued body part. Attention to a stimulated body part has been shown to enhance sensory processing in human neuroimaging (Johansen-Berg et al., 2000; Hämäläinen et al., 2000; Puckett et al., 2017; Roland, 1981). In neurophysiological studies, this manifests as a gain in stimulus responses (Boynton, 2009; Williford and Maunsell, 2006). However, during the imagery task, no stimulus was delivered to the participant. An attention account of our data would require that attention result in highly discriminable patterns of activity without a sensory stimulus (or pre-stimulus). Most studies of pre-stimulus attention report modest modulation of baseline neural activity (Boynton, 2009; Williford and Maunsell, 2006; Snyder et al., 2018). However, the failure to find pre-stimulus effects may be the consequence of insensitive analysis techniques: indeed, recent single neuron work in NHP visual cortical area 4 (V4) demonstrated discriminable coding for the locus of attention prior to stimulus presentation and, further, that the pre-stimulus activation patterns were systematically related to the post-stimulus response patterns (Snyder et al., 2018). These recent results suggest that attention may be decodable elsewhere, and they match the results presented in this paper. It is also consistent with what we have previously described as partially mixed selectivity (Zhang et al., 2017; Zhang et al., 2020). If our results are interpreted within the framework of attention, our current findings are inconsistent with a simple gain-like mechanism for attention, but instead suggest a richer mechanism by which information is selectively enhanced for further processing (Snyder et al., 2018).

Our task was not designed to tease apart the different possible cognitive correlates of the observed neural activity engaged during imagery. We think the temporal dynamics of the signal indicate that multiple cognitive process may be engaged throughout the course of the task. The above cognitive phenomena may each independently engage the same neural population as distinct phenomena or may be distinct processes that nonetheless engage the same underlying neural substrate.

PC-IP and plasticity following spinal injury

The extent to which the human PPC reorganizes following SCI is unknown. Lesion studies in NHPs suggest that BA 3b and 3a, 1 and 2, show altered sensory maps following SCI in a manner dependent on thalamic input from the afferent sensory pathways such as the dorsal column-medial lemniscus system (Tandon et al., 2009). With mid-cervical lesions, for instance, there is an initial loss of BA 3b hand representations and a slight expansion in face representation at approximately 2 years (Tandon et al., 2009; Mohammed and Hollis, 2018). Although significant axonal sprouting has been demonstrated to occur at the site of deafferentation in the spinal cord, with increased projections to brainstem nuclei, the changes observed in the somatosensory cortex are significantly smaller (Tandon et al., 2009; Mohammed and Hollis, 2018). Moreover, the reorganization in higher-order somatosensory centers such as the secondary somatosensory cortex is even more restricted than in BA 3b (Mohammed and Hollis, 2018). Similar stability in the topography of the somatosensory cortex has been identified in human subjects that have suffered limb amputations. In these amputees, there is a preserved digit map within the primary somatosensory cortex (Kikkert et al., 2016; Makin and Flor, 2020).

The results of our experiments suggest significant stability in tactile somatosensory architecture within the PPC. A substantial fraction of the neuronal population responded to imagined touch to the hand, while there was no significant response to actual touch (insensate in the study participant). The response during the imagery task lends support to the idea that despite the lack of peripheral input from the hand due to the participant’s SCI, the brain maintains an internal representation of tactile sensations (Makin and Bensmaia, 2017). The findings that intracortical microstimulation produces discernible tactile perceptions from insensate body regions further reinforce a maintained representation of somatosensation after deafferentation (Armenta Salas et al., 2018; Flesher et al., 2016). These findings will prove useful for bidirectional neural prostheses. We acknowledge that while additional work probing cortical reorganization following SCI is necessary to fully understand its electrophysiological consequences, our results provide insight into the maintenance of basic tactile processing within the human PC-IP, after SCI.

Conclusion

Multiple lines of evidence indicate a critical role for the human PPC in the integration of convergent multimodal sensory information to enable complex cognitive processing and motor control. To date, however, its processing of somatosensory information at the single neuron level has remained fundamentally unexplored. In the unique opportunity of a BMI clinical trial, we examined the neural encoding of actual and cognitive touch within the human PC-IP. We found that local populations of PC-IP neurons within a 4 × 4 mm patch of cortex encode bilateral touch sensations to all tested body regions above the level of the participant’s injury at short latency. A significant fraction of PC-IP neurons responded during the imagined touch condition with matching sensory fields to actual touch. The activity in the delay period of the task, between cueing and imagining touch, may reflect cognitive processes including tactile semantics, sensory anticipation, attention, as well as active imagery. Together, our results provide the first single-unit evidence of touch processing within the human PC-IP and identify a putative substrate for the encoding of cognitive representations of touch, thus far untested in animal models.

Materials and methods

Key resources table
Reagent type (species) or
resource
DesignationSource or referenceIdentifiersAdditional
information
Software, algorithmMATLABMathWorks, MATLAB R2019bRRID: SCR_001622
Software, algorithmPsychophysics toolboxPsychophysics toolbox PTB3,http://psychtoolbox.org/RRID: SCR_001622
OtherNeuroportNeuroport Recording System with Utah array implant, https://blackrockmicro.com/
OtherCapacitive Touch SensoryAdafruit Capacitive Touch HAT for Raspberry Pi, https://www.adafruit.com/product/2340Product ID: 2340

Study participant

Request a detailed protocol

The study participant, NS, is a 60-year-old tetraplegic female with a motor complete SCI at cervical level C3-4 that she sustained approximately 10 years prior to this report. She has intact motor and sensory function to the level of her bilateral deltoids. NS was implanted with two 96-channel Neuroport Utah electrode arrays (Blackrock Microsystems model numbers 4382 and 4383) 6 years post-injury for an ongoing BMI clinical study. Implants were made in the left hemisphere as human neuroimaging studies have typically reported stronger coding of intention-related activity in left versus right PPC (Gallivan et al., 2011a, Gallivan et al., 2011b). She consented to the surgical procedure as well as to the subsequent clinical studies after understanding their nature, objectives, and potential risks. All procedures were approved by the California Institute of Technology (IRB #18–0401), University of California, Los Angeles (IRB #13–000576-AM-00027), and Casa Colina Hospital and Centers for Healthcare (IRB #00002372) Institutional Review Boards.

Implant methodology and physiological recordings

Request a detailed protocol

The electrode implant methodology in NS has been previously published (Aflalo et al., 2015; Zhang et al., 2017; Zhang et al., 2020). One array was implanted at the junction of the left IPS with the left post-central sulcus in what we refer to as PC-IP. The other was implanted in the left superior parietal lobule (SPL). Implant locations were determined based on preoperative fMRI. The participant performed imagined hand reaching and grasping movements during a functional MRI scan to localize limb and hand areas within this region. Following localization, a craniotomy was performed on August 26, 2014. The PC-IP electrode array was implanted over the hand/limb region of the PPC within the dominant (left) hemisphere, at Talairach coordinates [−36 lateral, 48 posterior, 53 superior]. In the weeks following implantation, it was found that the SPL implant did not function. Although this electrode array was not explanted, only data recorded from the PC-IP implant were used in this study.

Experimental setup

Request a detailed protocol

All experimentation procedures were conducted at Casa Colina Hospital and Centers for Healthcare. Participant NS was seated in a motorized wheelchair in a well-lit room. Task procedures are presented in detail in the sections below. For most tasks, however, one experimenter stood directly behind the participant and was responsible for providing tactile stimuli to the participant. A 27-inch LCD monitor was positioned behind NS (visible to the experimenter but not to the subject) to cue the experimenter for the presentation of a stimulus. Cue presentation was controlled by the psychophysics toolbox (Brainard, 1997) for MATLAB (MathWorks) (Brainard, 1997).

Data collection and unit selection

Request a detailed protocol

Data were collected over a period of approximately 8 months in the fourth year after NS was implanted. Study sessions were conducted between two and three times per week, lasting approximately 1 hr each. Neural activity recorded from the array was amplified, digitized, and sampled at 30 kHz from the electrodes using a Neuroport neural signal processor (NSP). The system has received Food and Drug Administration (FDA) clearance for less than 30 days of recording. We received an investigational device exemption (IDE) from the FDA (IDE #G120096, G120287) to extend the implant duration for the purposes of the BMI clinical study. Putative neuron action potentials were detected at threshold crossings of −3.5 times the root-mean-square of the high-pass filtered (250 Hz) full bandwidth signal. Each individual waveform was made of 48 samples (1.6 ms) with 10 samples prior to triggering and 38 samples after. Single- and multi-unit activity was sorted using Gaussian mixture modeling on the first three principal components of the detected waveforms. The details of our sorting algorithm have been previously published by our group (Zhang et al., 2017). To minimize noise and low-firing effects in our analyses, we used as a selection criterion for units, a mean firing rate greater than 0.5 Hz and a signal-to-noise ratio (SNR) greater than 0.5. We defined SNR for the waveform shapes as the difference between their mean peak amplitude and the baseline amplitude, divided by the variability in the baseline.

Each recording session was assumed to be independent in reporting the total number of units. However, it is likely that some fraction of units recorded on separate days are resamples of the same neuron, given that recordings for different days were made from the same array. Neural waveforms are largely a function of the geometry of the electrode with respect to the neuron, and not a unique signature that can be used to characterize a neuron. Thus, it is impossible to determine whether waveforms collected on separate days represent the same or different neurons. However, the degree to which recordings change from one day to the next can be taken as a general indicator of whether there may be some daily change in which neurons are recorded from. In other words, if the set of waveforms across the array are the exact same from one day to the next, it is likely that we are recording from the same neurons. Conversely, if the waveforms change substantially from one day to the next, it is likely that we are recording from, at least partially, distinct populations of neurons. We performed two analyses to quantify changes in neural recordings across days. In the first, more conservative of the two analyses, we compare the number of waveforms on each channel between days. If the number of waveforms on a single channel changes, then this is strong evidence that there has been some substantial change in the neural recordings. By this measure, an average of 29 ± 4.2% of channels change across days. In the second analysis, we used a permutation shuffle test to measure whether the recorded waveforms on the same channel were more similar than the waveforms across different channels. By this assessment, 58 ± 8% of neurons were distinct from one day to another. These values indicate that there was some degree of neural turnover from one day to the next.

Well-isolated single and multi-units were pooled across recording sessions. To ensure that such pooling did not bias the conclusions of this paper, we performed core analyses within this paper on single units alone, potential multi-units alone, and all units together. The results of these analyses, shown as supplemental figures for key results, generally demonstrate that our results were robust to the pooling of all sorted units together. For the separation of spike sorted units into high-quality single and multi-units, we used as a threshold the mean across all units of the base-10 logarithm of their cluster isolation distances, based on a previously described method (Zhang et al., 2017; Harris et al., 2016). Sorted units for which the cluster isolation distance was above this measure were considered single units, and those with a distance below this threshold were considered potential multi-units. Findings were robust to the exact choice of isolation distance threshold.

For measurements of neural latency to stimulus response (please refer to the task descriptions below for more information), a custom capacitive probe was used to record the exact time of tactile stimulation. This probe was built using a Raspberry Pi 2B and Adafruit Capacitive Touch Hat (Adafruit product ID 2340). The digital output (a binary output for touch or no touch) was transmitted through a BNC cable into the NSP at an analog signal sampling rate of 2 kHz.

Task procedures

Request a detailed protocol

We used several experimental paradigms to probe various features of actual and imagined touch representations in the PC-IP. In each paradigm, the participant was instructed to keep her eyes closed. The basic task structure comprised three phases. Each trial began with the presentation of a cue to the experimenter (or an auditory cue in the tactile imagery condition, see specific task description below), 1.5 s in duration, indicating the stimulus (e.g., touch NS’s left cheek). This was followed by a brief delay, 1 s in duration. Then written text appeared on the screen to signal the experimenter to present the instructed stimulus, for 3 s (in the tactile imagery paradigm, a beep indicated the ‘go’ signal for the participant). Exact time intervals varied depending on task. Trials were pseudorandomly interleaved; all conditions were necessarily required to be performed at least once before they were repeated. In tasks in which both left and right body sides (ipsilateral and contralateral to the implant, respectively) were tested, stimuli were delivered to one body side at a time.

Neural responsiveness to touch

Request a detailed protocol

This task variant explored neuronal responsiveness and selectivity to actual touch to body parts (receptive fields) with preserved somatosensory input (above the level of SCI). Body parts tested included the forehead, vertex of the head, left and right back of the head, left and right cheeks, left and right sides of the neck, and the dorsal surfaces of the left and right shoulders. As controls, the left and right hands (clinically insensate) and a null condition (no stimulus presentation) were also included. Actual touch stimuli were presented to each body part as finger rubs by the experimenter at approximately one per second. The experimenters’ fingertips were used to provide touch stimuli over an approximately 3 × 5 cm patch of skin on each tested body part. Stimuli to the left and right body sides were delivered on separate trials to evaluate each side independently. To ensure that any neural activity observed arose from actual touch and not from observed touch or other stimuli, NS was instructed to close her eyes throughout the task. She additionally wore ear plugs to block auditory input. This task was performed on four separate days. On each day, 10 trials per condition were conducted. In total, we recorded from 398 sorted units on four separate testing days.

Neural response latency

Request a detailed protocol

The purpose of this task was to determine the latency of neural response to actual touch for the left and right sides of the body. Tested regions included the left and right cheeks, the left and right shoulders, and as controls, the left and right hands (insensate). Actual touch stimuli were presented as in the task above. Instead of finger rubs, however, a capacitive touch probe was used to enable precise delineation of the actual time of contact (touch) before the onset of a neural response. This task was performed on eight separate days, with eight trials per condition in each run of the task. In total, we recorded from 838 sorted units.

Receptive field size

Request a detailed protocol

This task aimed to estimate the size of neuronal receptive fields to actual touch. Neural responses to nine equally spaced points were evaluated, 2 cm apart, along a straight line from NS’s right cheek to her neck (Figure 2—figure supplement 3). Only the right side (contralateral to the implant) was tested in this task. The first of these nine points was on the malar eminence, and the ninth point was on the neck as shown. In addition to the nine points, a null condition (no stimulus presentation) was also included. Stimuli were presented via a paintbrush (3 mm round tip) gently brushed against each location, at a frequency of one brush per second. Touch was restricted to a small region of approximately 0.5 cm (parallel to test sites) by 2.5 cm (perpendicular to the test sites) immediately above the marked points shown in panel A of Figure 2—figure supplement 3. The paintbrush was employed to deliver spatially localized sensations without accompanying skin distortion that could mechanically stimulate nearby sensory fields. Data were recorded on six separate days. On each day, 10 trials of each condition were tested. In total, we recorded from 585 sorted units.

Engagement during tactile imagery

Request a detailed protocol

This task was intended to establish whether PC-IP neurons are engaged by tactile imagery and whether neural patterns evoked by cognitive processing of imagined touch and actual touch share a common neural substrate (e.g., activate the same population of neurons in similar ways). In this variant, the participant was presented with either actual touch stimuli or instructed to imagine the sensation of being touched. NS was instructed to keep her eyes closed throughout. Actual stimuli were cued to the experimenter with written words that appeared on the monitor. Because the participant’s eyes were closed, the participant did not receive any information about the body part that would be stimulated prior to experiencing the touch. The cue was followed by a 1 s delay, and then at the sound of a beep (the ‘go’ signal), rubs at 1 Hz were presented with the capacitive touch probe to either the left or right cheeks, shoulders, or hands. During imagined touch trials, an auditory cue was presented to NS instructing her to imagine being touched on her right cheek, shoulder, or hand. The auditory cue consisted of a voice recording of the words ‘cheek,’ ‘hand,’ or ‘shoulder’ with cue duration of approximately 0.5 s. After a 1 s delay, at the sound of the beep, NS imagined touch to the cued body part. We asked the participant to imagine the sensations as alternating 1 Hz rubbing motions similar to what she actual during actual touch trials. A null condition (without actual or imagined touch), not preceded by an auditory cue, was used to establish a baseline neural response. Data were recorded on eight separate days. Eight trials of each condition were performed on each testing day. In total, we recorded from 838 sorted units.

Quantification and statistical analysis

In the analysis of data from the various task paradigms used in this study, we utilized several statistical methods. Some were specific for certain tasks, but others were applicable to multiple sets of data from the different paradigms. For ease of reference, we have described all methods together in this section. Where necessary, we provide specific examples from tasks to help illustrate their use in this paper. Unless explicitly noted, all recorded units for a given task were used in the statistical analyses pertaining to that task.

Linear analysis (relevant for Figures 1 and 4, and for Figure 1—figure supplement 2, Figure 2—figure supplement 3, and Figure 4—figure supplement 1)

Request a detailed protocol

To determine whether a neuron was tuned (i.e., differentially modulated to touch locations), we fit a linear model that describes firing rate as a function of the neuron’s response to each touch location. Neuronal responses were summarized as the mean firing rate computed between 0.5 and 2.5 s after stimulus presentation onset. The starting time of 0.5 s was chosen to minimize the influence of variable experimenter delay in presenting the stimulus. The baseline response was summarized as the mean firing rate during the 1.5 s window before the stimulus presentation cue. The linear equation is written as

FR=cβcXc+β0

where FR is the firing rate, Xc is the vector indicator variable for touch location c, βc is the estimated scalar weighting coefficient for touch location c, and β0 is a constant offset term. Xc was constructed by assigning a value of 1 if the corresponding firing rate was collected when touch location c was being stimulated and with a 0 otherwise. All baseline samples were also assigned a 0, effectively pooling together baseline data independent of subsequent touch location. Here, we used indicator variables as our predictors because our stimulus was applied in a binary manner, either touch was applied to a position on the skin or not. Note that in principle the formalism of linear models allows multiple indicator variables to take on a value of 1 at the same time. In our experiment, this would amount to simultaneous touch of two or more body parts. However, in out experiments, simultaneous touch was not tested and thus only one indicator variable could take a value of 1 at a time. Neural responses to a particular body location were considered responsive if the t-statistic for the associated beta coefficient was significant (p<0.05, FDR corrected for multiple comparisons). A unit was considered tuned if the F-statistic comparing the beta coefficients was significant (p<0.05, FDR corrected for multiple comparisons).

The linear models for each task were computed using all test conditions within the task, except when comparing discriminative coding between the left and right body sides. For this analysis, the goal was to determine how informative information encoded for one body side was for the other. Each neuron was fit by two linear models, one for touch locations exclusive to sensate regions of the contralateral side (e.g., contralateral shoulder, neck, back, and cheek) and one for touch locations exclusive to sensate regions of the ipsilateral side (e.g., ipsilateral shoulder, neck, back, and cheek). More details regarding this analysis are given in 'Materials and methods: Tests for mirror symmetric neural coding of body locations: single-unit analysis.'

Population correlation (relevant for Figures 2 and 4, and for Figure 2—figure supplement 4 and Figure 4—figure supplement 1)

Request a detailed protocol

We used correlation to compare the population neural representations of various tested conditions (stimulus presentations) against each other in a pairwise fashion. Correlation was chosen over alternative distance metrics (such as Mahalanobis or Euclidean distance) because it provides an intuitive metric of similarity that is robust to gross changes in baseline neural activity across the entire neural population. Alternative distance metrics were tested and gave comparable results (e.g., Figure 2—figure supplement 5).

To perform the population correlation analyses, we quantified the neural representations as a vector of firing rates, one vector for each condition (stimulus location) with each vector element summarizing the response of an individual unit. As before, neural activity was summarized as the mean firing rate during the stimulation phase window, defined as 0.5–2.5 s after the onset of stimulus presentation. Firing rate vectors were constructed by averaging the responses across 50–50 splits of trial repetitions. The mean responses across different splits were correlated within and across conditions (e.g., across stimulations of different sensory fields), then the splits were regenerated, and the correlation computed 250 times. Performing the splits 250 times was chosen based on an empirical analysis applied to preliminary data. For preliminary data, we performed the analysis with N splits, with N ranging from 5 to 200 in steps of 5. We found that the mean correlation across splits converged to a stable value by about 80 splits. We then roughly tripled that to ensure that the numerical sampling scheme would capture a stable value of our cross-validated correlation metric. The across-condition correlations measured similarity between population responses for different sensory fields, answering the question – are the tactile sensations similar or dissimilar from the perspective of the recorded neural population? The within-condition correlations assist in our interpretation of the across-format correlations by allowing us to quantify the theoretical maxima of the similarity measure (e.g., if the within-condition correlation is measured at 0.6, then an across condition of 0.6 suggests identical neural representations).

To test whether the difference between any pair of conditions was statistically significant, we used a shuffle permutation test applied to the correlations computed over the 250 random splits. To illustrate, in Figure 4E we applied this analysis to test whether the correlation between actual and imagined cheek touch was significantly different from that of actual cheek touch and imagined shoulder touch. The true difference in the correlations was computed as the difference in the mean correlations between actual and imagined cheek touches (over the 250 splits) and the mean of the correlations between actual cheek touch and imagined shoulder touches. We then randomly shuffled the two distributions together (2000 times) and computed the difference in the mean correlations for each shuffle. The distribution of shuffled differences served as the null distribution, against which we compared the true difference to determine significance. As in the case above, our permutation shuffle test used 2000 shuffles to ensure that the numerical sampling scheme would capture a stable value of the percentile of our true difference compared to the empirical null distribution.

Decode analysis (confusion matrix; relevant for Figures 1 and 4, and for Figure 1—figure supplement 3)

Request a detailed protocol

Classification was performed using linear discriminant analysis with the following parameter choices: (1) only the mean firing rates differ for unit activity in response to each touch location (covariance of the normal distributions are the same for each condition) and (2) firing rates for each unit are independent (covariance of the normal distribution is diagonal). These choices do not reflect assumptions about the behavior of neurons, but instead were found to improve cross-validation prediction accuracy on preliminary data. In our experiments, we acquired 10 repetitions per touch location, generally not enough data to robustly estimate the covariance matrix that describes the conditional dependence of the neural behavior on the stimulus. In choosing equal covariance, we are able to pool data across touch locations, achieving a more generalizable approximation of the neural response as verified by cross-validation.

The classifier took as input a matrix of firing rates for all sorted units. The analysis was not limited to significantly modulated units to avoid ‘peeking’ effects (Smialowski et al., 2010). Classification performance is reported as prediction accuracy of a stratified leave-one-out cross-validation analysis. The analysis was performed independently for each recording session, and results were then averaged across days.

Tests for mirror symmetric neural coding of body locations: single-unit analysis (relevant for Figure 2—figure supplement 6)

Request a detailed protocol

The purpose of this analysis was to assess whether neural responses to one body side were the same as neural responses to the alternate body side on a single-unit basis. Heuristically, we used a cross-validation approach, similar in concept to the population correlation, to ask whether the neural responses to one body are similar to the other body side. The transition to single units required one major modification from the population approach: instead of comparing the pattern of response across neurons (as in the population case), we compared the pattern of response across the set of lateralized body locations (shoulder, neck, cheek, and back). We first selected the set of neurons that demonstrated discriminative encoding to at least one of the body locations that was tested to ensure that there was a meaningful discriminative pattern across sites to form a basis of comparison. Then we used a cross-validation procedure to compare within and across body-side encoding. A schematic representation of how the two sides were compared is shown in panels B–F of Figure 2—figure supplement 6.

For each neuron, we created a linear model that explained firing rate as a function of the response to each touch location on the right side. The linear model was constructed using indicator variables as described above; however, the set of body locations was restricted to shoulder, neck, cheek, and back. In this way, the response of a neuron is quantified by the continuous set of beta values for the four locations. This model was then used to predict the responses for the same four locations on the left side. The ability to predict the responses was quantified as the R2Right to Left. This metric is hard to interpret on its own; a low R2Right to Left could indicate that responses are very distinct between the right and left sides or it could indicate that the neuron is not very discriminative (e.g., there is high trial-to-trial variability relative to the differences in response to the different touch locations). Therefore, we also computed a cross-validated R2Left to Left measure. This disambiguates the R2Right to Left measure. If R2Right to Left is low but R2Left to Left is high, then we know that the unit is discriminative, but that the patterns of response between the right and left sides are distinct. To compare apples-to-apples, both the R2Left to Left and R2Right to Left were computed using leave-one-out cross-validation. This is necessary to ensure that the two measures are computed based on the same amount of training data. To directly compare these values, we plotted them against each other as a scatter plot. If the patterns of response are similar, this would lead to data points falling along the identity line. If the patterns are distinct, the data points should fall below the identity line.

Response latency (relevant for Figure 3)

Request a detailed protocol

We quantified the neural response latency to touch stimuli at the level of the neural population. Prior to the analysis, trials were aligned by touch onset as detected by the capacitive touch sensor (ground truth). PCA was used to summarize the population-level temporal response of recorded neurons (Cunningham and Yu, 2014). We constructed a matrix of neural data D that was (n) by (t * c * r) in size, with n being the number of neurons, t being the number of time points, c being the number of conditions, and r being the number of trial repetitions. For each neuron, activity was sampled every 2 ms and no additional smoothing was applied. 2 ms windows were chosen to allow high temporal resolution to precisely localize the timing of the neural response with respect to touch contact. We used t = 201 time bins starting from −150 ms and stopping at 250 ms with respect to the time of touch sensor contact. Different ranges from time of contact (up to −500 ms before and 500 ms after probe contact) were tested, and the basic average latency was robust to the exact window choice. c = 2, including data for touch to the cheek and touch to the shoulder. r = 10, as we acquired 10 repetitions per condition. Principal components were calculated based on the singular value decomposition algorithm.

The first principal component (1PC) was retained, and responses were averaged across conditions and repetitions. Single-trial results were visually inspected, and basic temporal profiles were consistent across conditions and repetitions. This process was performed separately for data acquired for touch to the left side and right side of the body. The 1PC was then fit with a piecewise linear function with two transition points. The choice of two transition points was set based on visual inspection of the data and allows for an initial baseline, a subsequent rise, and a final plateau. The time at which transitions occurred was not constrained, being purely a product of the fitting process. Latency was reported as the time the piecewise linear fit crossed the 95th percentile of the baseline data, as measured by the distribution of activity in the window between −150 and 0 ms. To compute bootstrapped quartile bounds of the latency estimates, the above process was repeated 1000 times while resampling with replacement from the 1PC single-trial results. To verify that 1000 resamples were sufficient to estimate a stable estimate of the quartile range, we repeated the process with 1500 resamples and found that the quartile estimate changed less than 1%.

To determine whether the mean difference of latency estimates was significant between the right and the left side, we performed a permutation shuffle test. We used a rank test to compare the true difference in latency estimates against an empirical null distribution of differences in latency estimates generated by shuffling labels and repeating the comparison 2000 times.

Quantifying macroscale receptive field structure (relevant for Figure 2—figure supplement 2)

Request a detailed protocol

We found that many neurons responded to touch to multiple body locations. We wished to further characterize the receptive field structure to determine whether neurons were characterized by single-peaked broad receptive fields or discontinuous receptive fields with multiple peaks. To adjudicate between these possibilities, we selected touch locations to the contralateral (right) forehead, cheek, neck, and shoulder for further analysis because these locations are approximately collinear. We reasoned that if neurons are characterized by single-peak-type responses, then responses across a collinear set of testing sites will result in a single local maximum (either with a single peak and fall off on either side or as a monotonic increase to the edge locations). On the other hand, if receptive fields are characterized by multiple peaks, then responses should have multiple local maxima.

Neurons were first restricted to those demonstrating significant differential responses between the four sites (ANOVA, p<0.05, FDR corrected). Each neuron was then grouped according to its location of preferred (peak) response. This resulted in four groups of neurons: neurons that responded maximally to the forehead, cheek, neck, or shoulder. For each neuron, the goal was to identify if the firing rate monotonically decreases with increasing distance from the preferred location or rises again, allowing for a second maxima. For example, for a unit preferring the forehead, this would manifest as firing rate at forehead larger than at cheek, at cheek larger than at neck, and at neck larger than at shoulder. Tests of firing rate between adjacent locations were performed by one-tailed t-tests between the pair of locations, evaluating whether the firing rate at the location nearer the preferred response was greater than that at the location more distant. In the example of the forehead preferring units, the t-tests evaluated whether cheek>neck and neck>shoulder. If it was found that any of those comparisons was not true (e.g., firing rate at neck greater than at cheek) after correcting for multiple comparisons, this implied a second local maxima. The unit was then classified as multi-peak. If no second local maxima was found, the unit was classified as single peak.

Receptive field size estimation (relevant for Figure 2—figure supplement 3)

Request a detailed protocol

In our first experiment, we tested touch responses across major body parts at a course resolution. Patterns of neuronal responses suggest that multiple body parts can be represented in individual neurons, although the response field around each body part is locally narrow (not expansive, covering all body parts). To evaluate this further, as a complimentary dataset, we tested tactile representations at a finer spatial precision to begin to characterize the size of their receptive fields. We characterized the response patterns of individual neurons to tactile stimuli delivered to each of nine points along the subject’s face and neck. All units demonstrating a differential spatial response to touch to each of the nine fields were included in this analysis. For each of these units, we first identified the preferred site of stimulus delivery as the point associated with the largest firing rate. Next, we examined its response to delivering stimuli to the other points. To estimate the average size of a neuronal receptive field as a function of its preferred point of stimulus delivery, we fit a Gaussian model to the average responses grouped by the preference of the neuron. The Gaussian model had three free parameters and was defined as

Gx=Ae-12x-μσ2+c

Here, A is the amplitude of the Gaussian, σ is the standard deviation, and c is the constant offset term. µ is the mean/center of the Gaussian and was fixed at the preferred point. A separate model (with the appropriate value of µ) was fit to each of the response groups. The field size was described as the full width at half maximum.

Temporal dynamics of population activity during tactile imagery task: within category (relevant for Figure 5)

Request a detailed protocol

We performed a sliding-window classification analysis to quantify the strength and temporal dynamics of population coding in the tactile imagery task. In this task, the participant heard an auditory cue specifying a body part (‘cheek,’ ‘hand,’ or ‘shoulder’) that lasted approximately 0.5 s, followed by an approximately 2 s delay, and finally a beep instructing the participant to initiate imagining a touch sensation at the cued body part. This task could engage at least four cognitive processes: (1) semantic processing of the cue, (2) preparation/anticipation for imagery, (3) attentional modulation, and (4) imagined touch per se. We used a dynamic classification analysis to understand how the neural population evolved during the course of the trial to determine whether the population was best described as mediating single or multiple cognitive processes. In brief, the analysis consisted of creating a dataset that consisted of the population response measured in small temporal windows throughout the course of the trial. We trained a classifier separately on each temporal window and applied each classifier to both temporal windows. In this way, we can measure how information about the cued stimulus evolves in time (e.g., does there exist neural coding during the delay portion of the trial, and, if so, does the neural coding during the delay match neural coding during active imagery). Classification was performed using linear discriminant analysis as described above. We used cross-validation to ensure that training and predicting on the same time window was directly comparable to training on one window and testing on an alternate time window; in other words, we were careful to ensure that accuracy across all comparisons reflects generalization accuracy using the same amount of training and test data. Classifiers were trained and tested on neural responses to the three imagery conditions: cheek, hand, and shoulder. Population response activity for each time window was computed as the average neural response within a 500 ms window, stepped at 100 ms intervals. Window onsets started at −700 ms relative to auditory cue onset (cue-delay epoch) with the final window chosen 3.5 s after the beep (onset of the imagery epoch). Classification was performed on all sorted units acquired within a single session. Mean and bootstrapped 95% CIs were computed for each time bin from the cross-validated accuracy values computed across sessions.

We used a fixed window size for averaging time series data for analysis (box-car smoothing) as it provides straightforward bounds for the temporal range of data that are included in the analysis for a particular time window. 500 ms was chosen as a good balance between temporal resolution and noise mitigation. We note that although the window size can influence various metrics (e.g., larger smoothing windows can increase coefficients of determination, R2) the choice of smoothing size is largely inconsequential as long as the kernel size is kept consistent when making comparisons across conditions. The choice of a 100 ms step size was anchored to the choice of smoothing window. A small step, such as 1 ms, would not be justified with a 500 ms time window. We chose 100 ms, representing a change in 20% of the data, to allow us the ability to temporally localize changes in neural response without unnecessarily oversampling a smoothed signal and thus not unnecessarily increasing computation time for analysis.

We believe that this technique, by helping us to understand when information appears and how information compares across task phases, provides a valuable approach to understanding how population activity relates to the underlying cognitive processes. For example, if neural decoding reaches significance only after the go cue, neural activity would be inconsistent with semantic or anticipatory processing. Alternatively, if neural processing begins with the cue, and the same pattern of neural activity is maintained throughout the trial, with no changes during the active imagery phase, then the data would be inconsistent with processing imagined touch per se.

The classification analysis described above was used to measure general similarity of the population response to the tested conditions across time. However, to explicitly test whether population activity was changing, we used Mahalanobis distance as our measure. This is necessary as classification involves a discretization step that makes the technique relatively insensitive to changes in neural population activity that do not cross decision thresholds. Mahalanobis distance, being a proper distance measure, is a more sensitive measure of change. To illustrate, imagine that a classifier is trained on time point A and tested on time point B. At time point A, the means of the two classes are 0 and 1, respectively, and at time point 2 the means are 0 and 4, respectively. All classes are assumed to have equal but negligible variance (e.g., 0.01) in this example. When trained on time point A, the classifier finds a decision boundary at 0.5. with 100% classification accuracy. When tested on time point B, with the same 0.5 decision boundary, the classifier again is 100%. Naively, this could be interpreted as signifying that no change in the underlying data has occurred, even though the mean of the second distribution has shifted.

Separation in neural activity between the cue-delay epoch and the imagery epoch was quantified using a cross-validated Mahalanobis distance computed between the observed neural activity at a time point and a reference (baseline) defined as the neural activity immediately following the presentation of the auditory cue, from 0.25 to 0.75 s. Distances were measured separately for each of the three conditions and then averaged. The mean and standard error on the mean were computed across sessions for the cross-validated distance measures and plotted in Figure 5C. Activity during the cue-delay epoch and go epoch was compared using a rank-sum test of the averaged activity during the phase-averaged responses across sessions.

Temporal dynamics of single-unit activity during tactile imagery task: within category (relevant for Figure 5)

Request a detailed protocol

We wished to understand the behavior of single neurons that led to the temporal dynamics of the population. The temporal dynamics of single-unit activity during the imagery task (for the imagined touch conditions only) were quantified a PCA (Figure 5D). A sliding-window classification analysis was first performed on each sorted unit from all testing days in the same manner as described above for the population activity, with the exception that classifier took as input a vector of the firing rates for a single unit as opposed to a matrix of the firing rates for all units recorded in a single session. This allowed a quantitative description of the temporal dynamics for each sorted unit. Next, a PCA was applied to the dynamic classification matrices with individual neurons counting as the independent observations. PCA has become a standard method for describing the behavior of neural populations (Cunningham and Yu, 2014). Typically, PCA is applied to firing rate measurements of neurons. However, in our case, we were less interested in capturing the main modes of variability with respect to individual conditions, but instead wanted to capture the main modes of variability with respect to the temporal dynamics of information encoding.

Temporal dynamics of population activity during tactile imagery task: across category (relevant for Figure 6)

Request a detailed protocol

We wished to evaluate the similarity in neural representations of actual and imagined touch in a time-resolved manner, as well as to compare the similarities in activity from one epoch (cue-delay) to another (stimulus: actual or imagined touch). We performed a sliding-window (dynamic) correlation analysis in a cross-validated manner to compute within-format correlation in addition to across-format correlations. For this analysis, we restricted the tested body sites in the actual touch format to the right cheek and right shoulder only. Neural activity from the left side was not used, to try and match the conditions for the imagined touch format, in which only touch to the right side was tested. Similarly, the hand was not included in this analysis to match conditions that evoked responses in both formats.

For cross-validation, the analysis began with splitting trial repetitions into training and testing sets (five trial repetitions each). A sliding time-window was used for the analysis with window size of 500 ms and step size of 100 ms. Correlations were computed between training and testing sets for all combinations of windows, starting from 500 ms before the cue onset to 1000 ms after the end of the stimulus phase. Within each window, we organized the neural response data into two matrices (one each for the training and the test trial splits) with two columns each. Each column contained trial-averaged firing rates during the corresponding time window for each of the two tested stimulation sites (cheek and shoulder), with one value per unit. The columns represented the two formats. Thus, for N units recorded, two tested stimulation sites and two formats (actual and imagined), each matrix was of size (2*N) × 2. The mean response across each matrix (computed separately for training and test sets) was subtracted from each value to ensure that a positive correlation across formats reflected a similarity in the pattern of responses to the two body sites and not general offsets in the mean response of the different neurons. Finally, correlations were computed between training and test sets for all combinations of time windows. This was done across 50 random 50–50 trial splits and the results averaged across these repetitions. The analysis was performed for each recording session independently and the depicted results averaged over days.

Data availability

Data and analysis for key figures will be made available on github: https://github.com/tysonnsa/eLifePPCTouch copy archived at https://archive.softwareheritage.org/swh:1:rev:aead504c828568a46cf9555598211f1800f2187d/.

References

    1. Cunningham JP
    2. Ghahramani Z
    (2015)
    Linear dimensionality reduction: survey, insights, and generalizations
    Journal of Machine Learning Research: JMLR 16:2859–2900.
  1. Book
    1. Rosenbaum DA
    (1983a)
    The movement precuing technique: assumptions, applications and extensions
    In: Magill R. A, editors. Memory and Control of Action. Amsterdam. pp. 231–274.

Decision letter

  1. Tamar R Makin
    Senior and Reviewing Editor; University College London, United Kingdom

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Thank you for submitting your article "Neural encoding of felt and imagined touch within human posterior parietal cortex" for consideration by eLife. Your article has been reviewed by three peer reviewers, and the evaluation has been overseen by Tamar Makin as the Senior and Reviewing Editor. The reviewers have opted to remain anonymous.

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

We would like to draw your attention to changes in our revision policy that we have made in response to COVID-19 (https://elifesciences.org/articles/57162). Specifically, we are asking editors to accept without delay manuscripts, like yours, that they judge can stand as eLife papers without additional data, even if they feel that they would make the manuscript stronger. Thus the revisions requested below only address clarity and presentation.

Chivukula and colleagues report an extensive set of multi-unit neural recordings from PPC of a tetraplegic patient taking part in a brain machine interface clinical trial. The recordings were collected across a set of tasks, designed to investigate neuronal responses to both experienced and imagined touch. It was found that many neurons are responsive to touch in specific locations. Most of the recorded neurons were activated bilaterally, which is consistent with earlier monkey work from this lab. Probably the most important component of the work is the analysis of the modest activation in this area that occurs simply when the participant imagines different places on her body being touched – even the insensate arm. This work is virtually impossible to do in animals, and as such offers a unique opportunity to describe neural properties for higher-level representation of touch. The study therefore paves the way for a deeper understanding of the role of the human PPC in the cognitive processing of somatosensation.

Overall, we found the manuscript to be well-written, the study to be interesting, and for the most part the analyses are well thought out. But at the same time, the reviewers raised multiple main concerns regarding missing information and unclear descriptions of some of the analyses undertaken, which are detailed below. In addition, it was felt that there was unnecessary overlap across analyses – the first part especially contains a number of analyses that seem to make very similar points repeatedly or where it is not entirely clear what the point is in the first place. As such, there is a need to identify and cut a lot of the duplicative analyses/results and explain both the essential methods and the interpretation of the remaining results more succinctly and clearly. The key analyses could then be streamlined and better justified, ideally with an eye towards a consistent approach in both parts of the paper. As you will see, most of the comments detailed below are geared towards guiding the authors through this major revision (please forgive any duplicative comments on our behalf, given the technical nature of most of the comments, the Editor was keen to preserve the original comments as made by the reviewers). In addition, there are also some major considerations regarding the contextualisation and interpretation of the key imagery results, as detailed in the first major comment below.

1) Perhaps the most exciting innovation of the study relates to the neural responses related to the imagery conditions. Yet, we know from many previous studies in humans that cognitive processes (most notably attention), can modulate somatosensory responses. What the current study offers is an opportunity to characterise this cognitive modulation at a single neuron (and population) level over space and time, with hopes of achieving better insight into its underlying mechanism. While the reviewers agree that this unique contribution is valuable, it was also agreed that the manuscript would benefit from additional context in the Introduction as well as a more thorough discussion – particularly with respect to related research on this topic and potential mechanisms that could be recruited to support the observed neural modulation during the imagery condition.

Introduction: The second paragraph did well in establishing why one might be interested in examining somatosensory processing in the PPC. It was however, less clear why the particular questions at the end of the paragraph were being posed. Perhaps an extra paragraph could be added to bridge the notion that a sizeable body of literature has been developed around somatosensory representation within the PPC and the several "fundamental" questions remaining that are of interest here.

Discussion: The manuscript would benefit from a more thorough discussion of "imagination per se" and the various top-down processes that might be involved – as well as better positioning with respect to previous studies investigating top-down modulation of the somatosensory system. The authors state that the cognitive engagement during the tactile imagery may reflect semantic processing, sensory anticipation, and imagined touch per se – which we would not argue. But we would also expect some explicit mention of processes like attention and prediction. Perhaps these are intended to be captured by "sensory anticipation" – but, for example, attention can be deployed even if no sensation is anticipated. Importantly, it seems that imagining a sensation at a particular body site might well involve attending to that body part. That is, one may first attend to a body part before "imagining" a sensation there – and then even continue to attend there the entire time the imagining is being done. Because of this, perhaps the authors are considering attention to be a part of "imagination per se". But since attention has been shown to modulate somatosensory cortex without imagination, how can one exclude the possibility that the neuronal activity measured here simply reflects this attention component?

Regardless, we think the Discussion would benefit from a more explicit treatment of these top-down processes – especially given the number of previous studies showing that they are able to modulate activity throughout the somatosensory system. Some literature that may be of interest include:

Roland P (1981) Somatotopical tuning of postcentral gyrus during focal attention in man. A regional cerebral blood flow study. Journal of Neurophysiology 46 (4):744-754

Johansen-Berg H, Christensen V, Woolrich M, Matthews PM (2000) Attention to touch modulates activity in both primary and secondary somatosensory areas. Neuroreport 11 (6):1237-1241

Hamalainen H, Hiltunen J, Titievskaja I (2000) fMRI activations of SI and SII cortices during tactile stimulation depend on attention. Neuroreport 11 (8):1673-1676. doi:10.1097/00001756-200006050-00016

Puckett AM, Bollmann S, Barth M, Cunnington R (2017) Measuring the effects of attention to individual fingertips in somatosensory cortex using ultra-high field (7T) fMRI. Neuroimage 161:179-187. doi:10.1016/j.neuroimage.2017.08.014

Yu Y, Huber L, Yang J, Jangraw DC, Handwerker DA, Molfese PJ, Chen G, Ejima Y, Wu J, Bandettini PA (2019) Layer-specific activation of sensory input and predictive feedback in the human primary somatosensory cortex. Sci Adv 5 (5):eaav9053. doi:10.1126/sciadv.aav9053

More useful references could be found in this recent review:

https://doi.org/10.1016/j.neuroimage.2020.117255

2) The Materials and methods would benefit from additional rationale / supporting references throughout. Whereas it is generally clear what was done, it is sometimes less clear why certain choices were made. Perhaps some of the choices are "standard practice" when working with single unit recordings, but I was left in want of a bit more reasoning (or at least direction to relevant literature). Some examples are below:

For the population correlation: why was the correlation computed 250 times or why were the two distributions shuffled together 2000 times?

For the decode analysis: consider providing a reference for those interested in better understanding the "peeking" effects mentioned.

Response latency: how were window parameters determined (for both visualization and the latency calculation). And what was the rationale for them being different – especially given that the data used for the response latency calculation was still visualized (at least in part)? Relatedly, I'd be curious to see the entire time-course for that data rather than just the shaded region of the "visualization" data. Also, it would be nice if a comment (or some data) could be provided regarding how much the latency estimates change based on these parameter choices.

Temporal dynamics of population activity: why use a 500 ms window, stepped at 100 ms intervals instead of something else?

Temporal dynamics of single unit activity: it is stated that the neurons were restricted to those whose 90th percentile accuracy was at least 50% to ensure only neurons with some degree of significant selectivity were used for the cluster analysis. But why these particular values? Are the results sensitive to this choice? In this section, I'd also suggest providing references for those interested in better understanding the use of Bayesian information criteria. Similarly, it is stated that PCA is a "standard method for describing the behavior of neural populations" – as such it would be nice to provide some relevant references for the reader

3) At the start of the Results section it is stated that the recordings were from "well-isolate and multi-unit neurons". This seems to contradict the Materials and methods section, which only talks about "sorted" neurons. This needs to be clarified, and if multi-units were included, it should be stated which sections this concerns as it will have implications for the results (e.g. for selectivity for different body parts). In any case, the number of neurons included in different analyses should be evident. There are some numbers in the Materials and methods and sprinkled throughout the Results section, but for some of the analyses (e.g. clustering analysis, which was run only on a responsive subset of neurons) no numbers are provided.

4) The linear analysis section needs further details. The coefficients are matched to "conditions" but it is not explained how. I am assuming that each touch location is assigned to a condition c, however the way the model is described suggests that the vector X can in principle have multiple conditions active at the same time. Additionally, could the authors confirm whether it is the significance of the coefficients that determined whether a neuron was classed as responsive as shown in Figure 1? This analysis states a p-value but does give no further information on which test was run and on what data.

5) Figure 1C could be converted into a matrix that lists all combinations of RF numbers on either side of the body to highlight whether larger RFs on one side of the body generally imply larger RFs on the other side.

6) I am confused about the interpretation of the coefficient of determination as shown in Figure 2A. In the text this is described as testing the "selectivity" of the neurons. To clarify, I am assuming that the "regression analysis" is referring to the linear model described in a previous section. The authors then presumably took the coefficients from this model for a single side only and tested how well they could predict the responses to the opposite side, as assessed by R2 (Figure 2C,E). Before that in Figure 2A, they tested how well each single-side model could predict the responses. This is all fine, but the "within" comparison then simply tests how well a linear model can explain the observed responses, and has nothing to do with the selectivity of the neuron. For example, the neuron might be narrowly or broadly selective, but the model might fit equally well.

7) We computed a cross-validated coefficient of determination (R2 within) to measure the strength of neuronal selectivity for each body side.

Even after reading the methods (further comments below) it is difficult to figure out what all these related measures reveal. At this point in the text it is very difficult to intuit how R2 would measure selectivity.

8) Regarding the timing analysis, it is not clear to me how the accuracy can top out at 100% as shown in the figure, when the control conditions were included. Additionally, the authors should state the p value and statistic for the comparison of latencies.

9) Spatial analysis. Could the authors provide the size of the paintbrush tip that was used in this analysis. Furthermore, as stimulation sites were 2 cm apart, it is not appropriate to specify receptive fields down to millimeter precision.

10) Imagery: how many neurons were responsive to both imagery and real touch? Were all neurons that were responsive to imagery also responsive to actual touch? This is left vague and Figure 5 only includes the percentages per condition, but gives no indication of how many neurons responded to several conditions. Whether and how many neurons were responsive to both conditions also determines the ceiling for the correlation analysis in Figure 5D (e.g. if the most neurons are responsive only to actual but not imaginary touch, this will limit the population correlation).

11) What is added by including both classification and Mahalanobis distance?

Relatedly, information coding evolves for a single unit. Two complimentary analyses were then performed.

In what sense are they complementary? What is added (besides complexity) by including both cluster analysis and PCA?

12) Classification was performed using linear discriminant analysis with the following assumptions:

one, the prior probability across tested task epochs was uniform;

It is not clear what prior probability this refers to. Just stimulus site?

two, the conditional probability distribution of each unit on any epoch was normal;

Is this a reference to firing rate probability conditioned on stimulus site?

three, only the mean firing rates differ for unit activity during each epoch (covariance of the normal distributions are the same for each);

four, firing rates for each input are independent (covariance of the normal distribution is diagonal).

Does this refer to independent firing rates of neurons across stimulus sites? This seem very unlikely, given everything we know about dimensionality of cortex. Perhaps it refers to something else. Cannot all of these assumptions be tested? Were they?

13) We computed the cross-validated coefficient of determination

(R2 within) to measure how well a neuron's firing rate could be explained by the responses to the sensory fields.

This needs a better description, and I may be missing the point entirely. I assume it is an analysis of mean firing rate (which should be stated explicitly) and that it uses something like the indicator variable of the linear analysis of individual neuron tuning above. In this case is this is a logistic regression? As it is computed for each side independently, it would appear that there are only four bits to describe the firing of any given neuron. This would seem to be a pretty impoverished statistic, even if the statistical model is accurate.

14) The purpose of computing a specificity index was to quantify the degree to which a neuron was tuned to represent information pertaining to one side of the body over the other.

This is all pretty hard to follow. The R2 metric itself is a bit mysterious, as noted above. Within and across R2 is fairly straightforward, but adds to the complexity, as does SI, which makes comparisons of three different combinations of these measures across sides. Aside from R2 itself, the math is pretty transparent. However, a better high-level description of what insight all the different combinations provide would help to justify using them all. As is, there is no discussion and virtually no description of the difference across these three scatter plot. The critical point apparently, is that, "nearly all recorded PC-IP neurons demonstrate bilateral coding". There should be much a more direct way to make this point.

15) Computing response latency via RF discrimination is rather indirect and assumes that there is significant classification in the first place. I suspect it will add at least some delay beyond more typical tests. Why not a far simpler and more direct test of means in the same sliding window? Alternatively, a change point analysis?

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

Thank you for submitting your article "Neural encoding of felt and imagined touch within human posterior parietal cortex" for consideration by eLife. Your article has been reviewed by two peer reviewers, and the evaluation has been overseen by Tamar Makin as the Senior and Reviewing Editor. The reviewers have opted to remain anonymous.

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

The reviewers were happy with the vast majority of the extensive revisions that have been implemented relative to the original submission, but felt that the presentation of the results and their discussion was inadequate. While we recognise the enormous amount of work that was poured into the revisions, there are still a few issues, originating from the revisions, that the reviewers felt require further consideration. But beyond these specific considerations that are elaborated below, we all agreed that the paper, at its present form, is too dense with results/interpretation, making it very difficult to read and evaluate. As one of the reviewers summarised:

"My main high-level concern was the, "extensive and overlapping analyses" that made it difficult to follow and to find a clear takeaway message from the paper. I suggested eliminating a number of figures and clarifying the remainder to improve the impact of the paper. Although a few small panels have disappeared, most of the duplication remains and there are now nine, fairly complex figures (had been eight). It's hard to judge how much longer the text got because the figures are no longer embedded, but that has certainly also increased. The paper has not improved as a consequence".

There is a strong need for a “deep clean” (or more precisely deep edit) of the paper. As highlighted in our original review, there's a lot of overlap between analyses/findings that will not be of interest to the average reader. We believe the key innovation of the study – the imagery results – could be presented a lot more concisely with a more focused discussion of the findings and the limitations of the suggested interpretation. Much of the details of the tactile RF properties is of secondary interest and as such should be moved to the supplementary section. I leave the decision of how to achieve this leaner and more focused version to the authors, but please consider the average reader (and their very limited time and attention span) when making the edits. As a rule of thumb, the reviewers estimated you will be likely be able to remove ~50% of the figure panels and 33% of the main text without weakening the key findings.

1) Perhaps the most exciting innovation of the study relates to the neural responses related to the imagery conditions. Yet, we know from many previous studies in humans that cognitive processes…

While the Introduction has been much improved, the Discussion still mostly disregards the alternative cognitive processes that are likely to drive the present findings (prediction, attention). The authors seem to downplays the impact of attention considerably. For example, they state:

"Most studies of pre-stimulus attention report that any modulation of baseline activity is modest at best."

But especially when considering the modest effects of imagery, this statement is misleading. The Roland (1981) study that we provided, for example, show a 25% increase in rCBF when subjects attended the index finger without being stimulated. And this increase was spatially specific as it shifted to the lip area when attending to the upper lip instead of the finger. Although the other 3 attention references were looking at the effects of attention while sensory stimulation was present, they still seem relevant to the discussion and show what I'd consider to be greater than modest effects of attention. For example, the Puckett (2017) reference shows clear digit-specific modulation when attending to the different fingertips. Note that despite stimulation being present during this condition, the stimulation was constant (allowing the phasic attention-related signal to be somewhat isolated from the sensory signal). The amplitude of the phasic attention signal was similar to that elicited by phasic stimulation alone (i.e., without endogenous attention).

There needs to be a more serious consideration that the effects attributed here to imagery are in fact modulated by (or even driven by) related cognitive processes, such as prediction and attention, which are not specifically linked to the auditory cue used in the present study.

2) Figure 4: Several panels would be more effective…

This figure has changed quite lot, addressing my cosmetic concerns. However, I do not understand this statistical test: If no comparison was significant (FDR corrected), the unit was classified as "single peak." If at least one of the comparisons was statistically significant, it was characterized as "multi-peak." I must be missing something fundamental. I took this to be a test of differences of the responses to the different body parts, with respect to the peak response. How is this a single peak? No differences sounds like a flat line. By this definition a neuron with no response whatsoever would be "single peak". Likewise, the multi-peak definition is a puzzle.

3) What is added by including both classification and Mahalanobis distance?

"Mahalanobis distance that provides a sensitive measure of change which is masked by the discretization process of classification "

I don't think the discrete nature of the classifier output is really the biggest issue. By averaging across many instances, it essentially becomes continuous, as in this figure. There are also classifier-related metrics that are by their nature, continuous. The nature of the distance measure they are making is likely more important. In this case, I do not understand why the classification rises only slightly at t=0 as the M distance increases sharply. Subsequently, between 1 and 2.5 s, classifier success drops back below its original level even as distance is stable. The two measures really don't seem to be concordant. What is going on here? I think this concern is not unrelated to my next comment about Figure 8C (now 9C).

4) Figure 8C: Despite my best efforts, I have no idea…

"This asymmetry is likely a consequence of the analysis technique and may not be of physiological significance.”

I agree with the statement, but not its sentiment. Perhaps I'm missing something, but the fact that a single classifier can distinguish between rest state and two very different activation states does nothing to suggest those two states are a general representation of an input. The classification failure in the opposite direction only reinforces that. Presumably, classifying imagined and actual touch would be trivial, at a much higher level of success than rest and imagined touch, suggesting that they are in fact, rather different, even by this metric. If the authors wish to make the claim that their results show more than grossly common receptive fields bilaterally and across modes (which is not an uninteresting finding) they would do well to adopt tools more appropriate for it, like those that have been used by the groups of Shenoy, Churchland, Miller, Kaufman, and others: Canonical correlations, principal angles, subspace overlap…

5) Computing response latency via RF discrimination is rather indirect…

The authors have adopted a more sensible and sensitive test of latency. I do not agree with this statement, however: "We believe this is likely related to discussion above about Mahalanobis distance versus classification: namely, changes in the underlying neural behavior are only detected once the neural responses cross a decision line which likely results in delays detecting changes in neural behavior." In what fundamental sense is classification significance different from a significant distance in M-space? It seems to me that the more likely explanation is simply that significant modulation precedes significant discrimination.

https://doi.org/10.7554/eLife.61646.sa1

Author response

Overall, we found the manuscript to be well-written, the study to be interesting, and for the most part the analyses are well thought out. But at the same time, the reviewers raised multiple main concerns regarding missing information and unclear descriptions of some of the analyses undertaken, which are detailed below. In addition, it was felt that there was unnecessary overlap across analyses – the first part especially contains a number of analyses that seem to make very similar points repeatedly or where it is not entirely clear what the point is in the first place. As such, there is a need to identify and cut a lot of the duplicative analyses/results and explain both the essential methods and the interpretation of the remaining results more succinctly and clearly. The key analyses could then be streamlined and better justified, ideally with an eye towards a consistent approach in both parts of the paper. As you will see, most of the comments detailed below are geared towards guiding the authors through this major revision (please forgive any duplicative comments on our behalf, given the technical nature of most of the comments, the Editor was keen to preserve the original comments as made by the reviewers). In addition, there are also some major considerations regarding the contextualisation and interpretation of the key imagery results, as detailed in the first major comment below.

1) Perhaps the most exciting innovation of the study relates to the neural responses related to the imagery conditions. Yet, we know from many previous studies in humans that cognitive processes (most notably attention), can modulate somatosensory responses. What the current study offers is an opportunity to characterise this cognitive modulation at a single neuron (and population) level over space and time, with hopes of achieving better insight into its underlying mechanism. While the reviewers agree that this unique contribution is valuable, it was also agreed that the manuscript would benefit from additional context in the Introduction as well as a more thorough discussion – particularly with respect to related research on this topic and potential mechanisms that could be recruited to support the observed neural modulation during the imagery condition.

Introduction: The second paragraph did well in establishing why one might be interested in examining somatosensory processing in the PPC. It was however, less clear why the particular questions at the end of the paragraph were being posed. Perhaps an extra paragraph could be added to bridge the notion that a sizeable body of literature has been developed around somatosensory representation within the PPC and the several "fundamental" questions remaining that are of interest here.

We have expanded the Introduction to better situate the questions we address in this manuscript. We have tried to highlight that despite a growing body of literature surrounding higher-order somatosensory processing within the PPC (both in humans and in animal models), there exist gaps in our knowledge related to the spatial structure of tactile receptive fields and whether tactile cognition engages the same populations of cells with similar spatial tuning preferences.

The modified text is below.

Introduction:

“Touch is a complex, multisensory perceptual process (1-3). In non-human primates (NHPs), multisensory input (e.g., visual, tactile) converges upon neurons in higher-order brain regions such as the posterior parietal cortex (PPC) where they are integrated into coherent representations (3-11). […] The latter represents a novel finding, thus far untestable in NHP models, and suggests PPC involvement in the cognitive processing of touch.”

Discussion: The manuscript would benefit from a more thorough discussion of "imagination per se" and the various top-down processes that might be involved – as well as better positioning with respect to previous studies investigating top-down modulation of the somatosensory system. The authors state that the cognitive engagement during the tactile imagery may reflect semantic processing, sensory anticipation, and imagined touch per se – which we would not argue. But we would also expect some explicit mention of processes like attention and prediction. Perhaps these are intended to be captured by "sensory anticipation" – but, for example, attention can be deployed even if no sensation is anticipated. Importantly, it seems that imagining a sensation at a particular body site might well involve attending to that body part. That is, one may first attend to a body part before "imagining" a sensation there – and then even continue to attend there the entire time the imagining is being done. Because of this, perhaps the authors are considering attention to be a part of "imagination per se". But since attention has been shown to modulate somatosensory cortex without imagination, how can one exclude the possibility that the neuronal activity measured here simply reflects this attention component?

Regardless, we think the Discussion would benefit from a more explicit treatment of these top-down processes – especially given the number of previous studies showing that they are able to modulate activity throughout the somatosensory system. Some literature that may be of interest include:

Roland P (1981) Somatotopical tuning of postcentral gyrus during focal attention in man. A regional cerebral blood flow study. Journal of Neurophysiology 46 (4):744-754

Johansen-Berg H, Christensen V, Woolrich M, Matthews PM (2000) Attention to touch modulates activity in both primary and secondary somatosensory areas. Neuroreport 11 (6):1237-1241

Hamalainen H, Hiltunen J, Titievskaja I (2000) fMRI activations of SI and SII cortices during tactile stimulation depend on attention. Neuroreport 11 (8):1673-1676. doi:10.1097/00001756-200006050-00016

Puckett AM, Bollmann S, Barth M, Cunnington R (2017) Measuring the effects of attention to individual fingertips in somatosensory cortex using ultra-high field (7T) fMRI. Neuroimage 161:179-187. doi:10.1016/j.neuroimage.2017.08.014

Yu Y, Huber L, Yang J, Jangraw DC, Handwerker DA, Molfese PJ, Chen G, Ejima Y, Wu J, Bandettini PA (2019) Layer-specific activation of sensory input and predictive feedback in the human primary somatosensory cortex. Sci Adv 5 (5):eaav9053. doi:10.1126/sciadv.aav9053

More useful references could be found in this recent review:

https://doi.org/10.1016/j.neuroimage.2020.117255

We significantly modified the Discussion to better discuss possible contributions to cognitive touch processing, especially imagery.

The pertinent modified text from the Discussion is below.

“What does neural processing within human PC-IP during tactile imagery represent?

While our task identifies dynamic engagement of multiple cognitive processes during tactile imagery, it is inadequate to precisely define the cognitive correlates of the observed neural activity. […] The above cognitive phenomena may each independently engage the same neural population as distinct phenomena, or may be distinct processes that nonetheless engage the same underlying neural substrate.”

2) The Materials and methods would benefit from additional rationale / supporting references throughout. Whereas it is generally clear what was done, it is sometimes less clear why certain choices were made. Perhaps some of the choices are "standard practice" when working with single unit recordings, but I was left in want of a bit more reasoning (or at least direction to relevant literature).

We have updated the Materials and methods section to clarify the analyses, provide additional technical details, expand motivation for analysis choices, and to point the reader to supporting references for further information.

Some examples are below:

For the population correlation: why was the correlation computed 250 times or why were the two distributions shuffled together 2000 times?

In the correlation analysis, for each condition (touch location), we split the data into test and training sets, averaged responses across repetitions, and finally computed a correlation to measure population response similarity within and across conditions. Because the correlation was computed for independent training and testing sets, within and across condition correlations can be directly compared. We repeated this process 250 times, each time creating new random assignments of which trials are put into the training and test sets. Each split gives a slightly different correlation value, and we average across all values to give our best estimate of the actual population correlation. Performing the splits 250 times is somewhat arbitrary, but was chosen based on an empirical analysis applied to preliminary data. For preliminary data, we performed the analysis with N splits, with N ranging from 5 to 200 in steps of 5. We found that the mean correlation across splits converged to a stable value by about 80 splits. We then roughly tripled that number to be safe. So 250 was chosen to ensure that the numerical sampling scheme would capture a stable value of our cross-validated correlation metric. A similar process was used for performing 2000 shuffles as part of our permutation shuffle test.

The following was added to the relevant Materials and methods section (“Population Correlation”)

“Performing the splits 250 times was chosen based on an empirical analysis applied to preliminary data. For the preliminary data, we performed the analysis with N splits, with N ranging from 5 to 200 in steps of 5. We found that the mean correlation across splits converged to a stable value by approximately 80 splits. We then roughly tripled this value to ensure that the numerical sampling scheme would capture a stable value of our cross-validated correlation metric.”

“As in the case above, our permutation shuffle test used 2000 shuffles to ensure that the numerical sampling scheme would capture a stable value of the percentile of our true difference as compared to the empirical null distribution.”

For the decode analysis: consider providing a reference for those interested in better understanding the "peeking" effects mentioned.

The “peeking” phenomena refers to overestimation of generalization error when using supervised feature selection on the entire dataset. We have added a reference to the text, Smialowski, Frishman and Kramer, 2010. To quickly touch on the subject and provide intuition, imagine that you have a population of 1000 neurons that are unmodulated by task condition. By chance, some of these units may appear task-modulated: with a p-value of 0.05, we would expect to find 50 or so “modulated” units. This is expected to occur by chance and is the reason multiple comparisons corrections are needed. Here, peeking refers to the situation where neurons are preselected to be modulated prior to cross-validated classification analysis. By cherry-picking “modulated” units you break the logic of cross-validation which requires independence between training and testing sets.

Response latency: how were window parameters determined (for both visualization and the latency calculation). And what was the rationale for them being different – especially given that the data used for the response latency calculation was still visualized (at least in part)? Relatedly, I'd be curious to see the entire time-course for that data rather than just the shaded region of the "visualization" data. Also, it would be nice if a comment (or some data) could be provided regarding how much the latency estimates change based on these parameter choices.

We are replacing the original version of the latency analysis. In line with other reviewer comments, in the updated manuscript, we measure latency using the basic responsiveness, not discriminability, of the neural population. One benefit of the updated approach is that we did not have to introduce any additional smoothing beyond averaging activity within 2 ms windows. The updates and rationale for the new latency calculation are discussed in response to major point 15.

The following is not included in the revised manuscript as we have revised how we perform the latency analysis. While no longer pertinent to the latency analysis in the revised manuscript, we include a response below for completeness:

As background, selecting the window size/smoothing kernel involves a balance between preserving temporal resolution (which asks for a smaller time window) and mitigating against noise (which asks for a larger window). For computing the latency estimate, fine temporal resolution is critical and so we chose a narrow smoothing kernel, essentially as small as we could make it without high-frequency noise obscuring the transition. Given our truncated Gaussian kernel, increasing the kernel width (smoothing more) pulls in information from the future and past into the firing rate estimate at the current moment in time. In computing latencies, this is highly problematic as it destroys any ability to resolve the absolute timing of signals. With our truncated Gaussian kernel, increasing the smoothing kernel for computing latency simply results in earlier and earlier latency estimates, but only because the firing rate estimate is allowed to peak further and further into the future. For the “visualization” kernel, the latency was actually pre-stimulus (~-100 ms). It is possible to use “causal” filters that can only smooth by averaging past data (not past and future.) This does not solve the core problem though. Increasing the smoothing width in this case simply delays the latency more and more as an increasing amount of the data used to compute the estimate comes from pre-stimulus activity thus adding noise and decreasing sensitivity. This same basic challenge is inherent in all techniques, although it is not always explicit. For example, hidden Markov models essentially use exponential smoothing where the smoothing coefficient is indirectly set by the window durations used for the pre and post-stimulus phases (which is still an experimenter choice).

Please see Author response image 1 for a plot of the full window with the “latency” smoothing kernel. Orange=contralateral, blue = ipsilateral. It’s fairly innocuous and essentially shows the same results as the “visualization” kernel with more high-frequency noise (as expected).

Author response image 1

We understand why choosing two separate smoothing’s, especially without better justification, warranted scrutiny. In brief, the only point of the “visualization” was to show that classification performance over the course of the trial was well behaved, with chance performance before the stimulus and sustained accurate performance after. For this basic point, a smoother signal with larger time steps seemed appropriate and required less time to compute. Again, the new version uses no smoothing and only has one visualization (see point 15).

Temporal dynamics of population activity: why use a 500 ms window, stepped at 100 ms intervals instead of something else?

Unlike the original latency analysis, for all other analysis we used a 500 ms window size for averaging time series data (also called box-car smoothing). We chose a fixed window size (as opposed to e.g. truncated Gaussian) as the exact bounds of what data was used in analysis is transparent for any particular analysis window. The use of 500 ms is somewhat arbitrary and was chosen as a balance between temporal resolution and noise mitigation. While the exact choice of window size can massage around various metrics (e.g. larger smoothing window can increase R2), for anything with decent signal-to-noise, as is the case with our data, the exact choice of smoothing size is fairly inconsequential so long as the same kernel is used when making comparisons between different conditions. The choice of step size is essentially anchored to the choice of smoothing window. We could have stepped at 1ms, but with a 500 ms smoothing window, such a small step is not justified. 100 ms, representing a change in 20% of the data, allows us the ability to temporally localize changes in neural responses while being efficient with our compute time. We have added text to the Materials and methods (“Temporal dynamics of population activity during tactile imagery task: within category”) as well to clarify the choice of time window and step size. The text is copied below.

“We used a fixed window size for averaging time series data for analysis (box-car smoothing) as it provides straight-forward bounds for the temporal range of data that are included in the analysis for a particular time window. 500 ms was chosen as a good balance between temporal resolution and noise mitigation. We note that although the window size can influence various metrics (e.g., larger smoothing windows can increase coefficients of determination, R2) the choice of smoothing size is largely inconsequential as long as the kernel size is kept consistent when making comparisons across conditions. The choice of a 100 ms step size was anchored to the choice of smoothing window. A small step, such as 1 ms would not be justified with a 500 ms time window. We chose 100 ms, representing a change in 20% of the data, to allow us the ability to temporally localize changes in neural response without unnecessarily oversampling a smoothed signal and thus not unnecessarily increasing computation time for analysis.”

Temporal dynamics of single unit activity: it is stated that the neurons were restricted to those whose 90th percentile accuracy was at least 50% to ensure only neurons with some degree of significant selectivity were used for the cluster analysis. But why these particular values? Are the results sensitive to this choice? In this section, I'd also suggest providing references for those interested in better understanding the use of Bayesian information criteria. Similarly, it is stated that PCA is a "standard method for describing the behavior of neural populations" – as such it would be nice to provide some relevant references for the reader

In consideration of the reviewers’ comments, and for simplicity of presentation, we have eliminated the cluster analysis previously presented in Figure 7A. Figure 7B (the PCA analysis) has been moved to Figure 8D. We opted to go with the PCA approach as PCA is more commonly used in the literature and involves fewer assumptions. We have included a reference to the use of PCA in neuroscience, below.

Cunningham JP, Yu BM. Dimensionality reduction for large-scale neural recordings. Nat Neurosci. 2014;17(11):1500-9.

While no longer in the manuscript, we briefly explain the choices made for the cluster analysis for completeness. We wanted to ensure that we are applying the classification analysis only to units with a reasonable strength of encoding. The cross-validated classification analysis generates a large matrix of values that, by chance, can sometimes result in large values. We are interested in the maximum classification accuracy but wanted to avoid these chance peaks. We therefore selected the 90th percentile to provide a more robust estimate of the peak value. The 50% accuracy number was chosen through a shuffle analysis, in which we found that it reflected greater than the 95th percentile of chance outcomes. We performed a basic sensitivity analysis, testing the 80th, 85th and 90th percentile, and threshold values of 55%, 55% and 60% percent. The basic result, with three clusters and relatively similar temporal patterns, were robust to these choices.

For further reading on Bayesian information criteria:

Original article:

G. E. Schwarz, Estimating the Dimension of a Model. Annals of Statistics6, 461–464 (1978).

For use in clustering, see:

S. S. Chen, P. S. Gopalakrishnan, Clustering via the Bayesian information criterion with applications in speech recognition. Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing2, 645-648 (1998).

3) At the start of the Results section it is stated that the recordings were from "well-isolate and multi-unit neurons". This seems to contradict the Materials and methods section, which only talks about "sorted" neurons. This needs to be clarified, and if multi-units were included, it should be stated which sections this concerns as it will have implications for the results (e.g. for selectivity for different body parts). In any case, the number of neurons included in different analyses should be evident. There are some numbers in the Materials and methods and sprinkled throughout the Results section, but for some of the analyses (e.g. clustering analysis, which was run only on a responsive subset of neurons) no numbers are provided.

There is no contradiction, but we understand the confusion. All channels were sorted into putative neurons, which we then categorized as either being single unit or multi-unit neurons. Multi-unit refers to the situation where two or possibly more neurons have sufficiently similar waveforms that they cannot be distinguished. All units were pooled for the analyses presented in the manuscript. To ensure that the pooling did not impact the validity of core results, we performed separate analyses on the single and multi-unit responses based on the cluster “isolation distance” computed during the spike sorting stage and found that the basic conclusions were consistent across single and multi-unit data. We have now included new supplemental figures (Figure 1—figure supplement 2, Figure 2—figure supplement 2, Figure 3—figure supplement 2, Figure 4—figure supplement 1 and Figure 7—figure supplement 1) that show the results for the pooled, single unit, and multi-unit data for representative analyses that may be especially sensitive to mixing response properties from multiple units. As can be seen, the qualitative results are similar for multi and single-unit data. We have included within the Materials and methods (“Data collection and unit selection”) a clarification on unit selection, pooling, the invariance of our results to the pooling of units, and our approach to categorizing units as single units or potentially multi-units.

Materials and methods:

“Well isolated single and multi-units were pooled across recording sessions. To ensure that such pooling did not bias the conclusions of this manuscript, we performed core analyses within this manuscript on single units alone, potential multi-units alone and all units together. The results of these analyses, shown as supplemental figures for key results, generally demonstrate that our results were robust to the pooling of all sorted units together. For the separation of spike sorted units into high quality single and multi-units, we used as a threshold the mean across all units of the base-10 logarithm of their cluster isolation distances, based on a previously described method (18, 50). Sorted units for which the cluster isolation distance was above this measure were considered single units, and those with a distance below this threshold were considered potential multi-units. Findings were robust to the exact choice of isolation distance threshold.”

4) The linear analysis section needs further details. The coefficients are matched to "conditions" but it is not explained how. I am assuming that each touch location is assigned to a condition c, however the way the model is described suggests that the vector X can in principle have multiple conditions active at the same time. Additionally, could the authors confirm whether it is the significance of the coefficients that determined whether a neuron was classed as responsive as shown in Figure 1? This analysis states a p-value but does give no further information on which test was run and on what data.

Your assumptions are correct. The description of the linear analysis (Materials and methods, “Linear analysis” relevant for Figures 1, 5 and 7) has been updated to improve clarity. The relevant text is included below:

“To determine whether a neuron was tuned (i.e., differentially modulated to touch locations), we fit a linear model that describes firing rate as a function of the neuron’s response to each touch location. […] A unit was considered tuned if the F-statistic comparing the β coefficients was significant (p<0.05, false discovery rate (FDR) corrected for multiple comparisons).”

5) Figure 1C could be converted into a matrix that lists all combinations of RF numbers on either side of the body to highlight whether larger RFs on one side of the body generally imply larger RFs on the other side.

We have converted Figure 1C (currently Figure 2B) to matrix form. We agree that seeing that larger RFs on one side imply larger RFs on the other side is a nice feature of this presentation.

6) I am confused about the interpretation of the coefficient of determination as shown in Figure 2A. In the text this is described as testing the "selectivity" of the neurons. To clarify, I am assuming that the "regression analysis" is referring to the linear model described in a previous section. The authors then presumably took the coefficients from this model for a single side only and tested how well they could predict the responses to the opposite side, as assessed by R2 (Figure 2C,E). Before that in Figure 2A, they tested how well each single-side model could predict the responses. This is all fine, but the "within" comparison then simply tests how well a linear model can explain the observed responses, and has nothing to do with the selectivity of the neuron. For example, the neuron might be narrowly or broadly selective, but the model might fit equally well.

I think your core understanding is correct; but our usage of “selectivity” created unnecessary confusion. We understand the confusion and have taken several steps to clarify. First, we were using selectivity here to refer to the ability to accurately differentiate between the different touch locations using the linear model. You are correct that this would have nothing to do with narrow or broad tuning. To avoid this confusion, we have replaced “selective” with “discriminative” throughout the manuscript. Further, we have clarified the text describing what is to be learned from this analysis, including an illustrative schematic, Figure 3—figure supplement 4. The relevant updates (from Materials and methods: “Tests for mirror symmetric neural coding of body locations: single unit analysis” (relevant for Figure 3) are included below:

“The purpose of this analysis was to assess whether neural responses to one body side were the same as neural responses to the alternate body side on a single unit basis. Heuristically, we used a cross-validation approach, similar in concept to the population correlation, to ask whether the neural responses to one body side are similar to the other body side. […] If the patterns of response are similar, this would lead to data points falling along the identity line. If the patterns are distinct, the data points should fall below the identity line.”

7) We computed a cross-validated coefficient of determination (R2 within) to measure the strength of neuronal selectivity for each body side.

Even after reading the Materials and methods (further comments below) it is difficult to figure out what all these related measures reveal. At this point in the text it is very difficult to intuit how R2 would measure selectivity.

The confusion here is very much related to point 6. Again, we believe we understand the source of the confusion and have replaced “selective” with “discriminative” and have updated the relevant text for clarity (see response to major point 6).

8) Regarding the timing analysis, it is not clear to me how the accuracy can top out at 100% as shown in the figure, when the control conditions were included. Additionally, the authors should state the p value and statistic for the comparison of latencies.

Based on reviewer feedback and additional analysis, we have changed the way latency is computed (see major point 15). In the original manuscript, only the experienced touch conditions were included in the sliding window classification analysis: touch to the shoulder, cheek, and hand. With the three conditions, chance performance is 33% and accuracy can hit 100%, consistent with the previous results.

A detailed description of the new approach includes a description of how the p-value was computed based on a permutation shuffle test (see major point 15).

9) Spatial analysis. Could the authors provide the size of the paintbrush tip that was used in this analysis. Furthermore, as stimulation sites were 2 cm apart, it is not appropriate to specify receptive fields down to millimeter precision.

The Materials and methods have been updated to include the size of the paintbrush tip (3 mm).

Materials and methods have been updated to read:

“Stimuli were presented via a paintbrush (three mm round tip) gently brushed against each location, at a frequency of one brush per second.”

The millimeter place has been removed.

10) Imagery: how many neurons were responsive to both imagery and real touch? Were all neurons that were responsive to imagery also responsive to actual touch? This is left vague and Figure 5 only includes the percentages per condition, but gives no indication of how many neurons responded to several conditions. Whether and how many neurons were responsive to both conditions also determines the ceiling for the correlation analysis in Figure 5D (e.g. if the most neurons are responsive only to actual but not imaginary touch, this will limit the population correlation).

We now include in Figure 7 (panel D) a Venn diagram that illustrates the number of units that were significantly modulated by experienced touch alone, experienced touch and imagery, and imagery alone. The analysis was restricted to the cheek and shoulder given that touch induced no response on the hand.

11) What is added by including both classification and Mahalanobis distance?

We used classification analysis as the values are readily interpretable and it captures the key feature that we think is important: the ability to discriminate between the different task conditions and whether the decision boundaries generalize between experimental manipulations, e.g. between imagery and experienced touch. However, the discretization process in classification analysis can lead to a loss in sensitivity. The population response might be changing, but not in a way that crosses the decision boundaries established by the classification algorithm. Mahalanobis distance is very sensitive to changes in neural population response. A limitation of Mahalanobis distance is that we, for example, cannot tell whether the changes will affect how e.g., a classifier would interpret the population response. Thinking about this in the context of Figure 8C (previously Figure 6C), the classification analysis tells us that the basic population level response patterns that allow us to distinguish the different classes are generally preserved from the cue through the execution period. The Mahalanobis adds to this picture by showing that there is a clear transition from one point to another in the neural state space, and that this transition occurs at short latency and over a short duration just after the go cue.

A description motivating the benefit of Mahalanobis distance is included in the Materials and methods (“Temporal dynamics of population activity during tactile imagery task: within category” (relevant for Figure 8)):

“…to explicitly test whether population activity was changing, we used Mahalanobis distance as our measure. This is necessary as classification involves a discretization step that makes the technique relatively insensitive to changes in neural population activity that do not cross decision thresholds. Mahalanobis distance, being a proper distance measure, is a more sensitive measure of change. To illustrate, imagine that a classifier is trained on time point A and tested on time point B. At time point A, the means of the two classes are 0 and 1 respectively and at time point 2 the means are 0 and 4 respectively. All classes are assumed to have equal but negligible variance (e.g. 0.01) in this example. When trained on time point A, the classifier finds a decision boundary at 0.5. with 100% classification accuracy. When tested on time point B, with the same 0.5 decision boundary, the classifier again is 100%. Naively, this could be interpreted as signifying that no change in the underlying data has occurred, even though the mean of the second distribution has shifted.”

Relatedly, information coding evolves for a single unit. Two complimentary analyses were then performed.

In what sense are they complementary? What is added (besides complexity) by including both cluster analysis and PCA?

They are complimentary in the sense that while PCA estimates how much of the variability across all units is captured by specific temporal patterns while clustering can show how many units are described by the different patterns. However, aggregating across reviewer responses, and acknowledging that the primary message from both is essentially the same, we have opted to simplify presentation and are removing the cluster analysis.

12) Classification was performed using linear discriminant analysis with the following assumptions:

one, the prior probability across tested task epochs was uniform;

It is not clear what prior probability this refers to. Just stimulus site?

two, the conditional probability distribution of each unit on any epoch was normal;

Is this a reference to firing rate probability conditioned on stimulus site?

three, only the mean firing rates differ for unit activity during each epoch (covariance of the normal distributions are the same for each);

four, firing rates for each input are independent (covariance of the normal distribution is diagonal).

Does this refer to independent firing rates of neurons across stimulus sites? This seem very unlikely, given everything we know about dimensionality of cortex. Perhaps it refers to something else. Cannot all of these assumptions be tested? Were they?

These assumptions are more about tractability of analysis given limited data than true assumptions about the behavior of neurons. To elaborate, in linear discriminant analysis, the response to each stimulation site is modeled as a Gaussian characterized by a mean and variance. In our experiments, we only acquire 10 repetitions per condition, generally not enough to get a reasonable estimate of variance. Therefore, we compute a pooled estimate of variance based on all conditions to try and get a better variance estimate. Likewise, estimates of covariance require even more data. We sometimes use the Ledoit-Wolf optimal shrinkage estimator for covariance to compute a regularized estimate, but this almost invariably returns the diagonal matrix given our amount of data and thus it feels somewhat disingenuous to claim we are capturing covariance structure between neurons. To validate these choices, we use cross-validation to test how well the models explain held-out data. For on the order of ~10 repetitions per condition, the assumptions above nearly always outperform or at worst match performance for alternative assumptions. As an aside, we have collected data in other experiments where we have on the order of ~80 trials per condition. With this amount of data, allowing separate estimates of variance and a full covariance matrix (sometimes called quadratic discriminant analysis) can improve classification performance, but only marginally. Note that we know a priori that equal variance is unlikely as neurons exhibit Poisson spiking statistics (the variance is proportional to the mean); nonetheless, the regularization through a pooled variance estimate still allows improved generalization and thus is the lesser of two evils. Anyhow, we have updated the Materials and methods for a simplified and better justified.

Materials and methods :

“Classification was performed using linear discriminant analysis with the following parameter choices: one, only the mean firing rates differ for unit activity in response to each touch location (covariance of the normal distributions are the same for each condition); and, two, firing rates for each unit are independent (covariance of the normal distribution is diagonal). These choices do not reflect assumptions about the behavior of neurons, but instead, were found to improve cross-validation prediction accuracy on preliminary data. In our experiments, we acquired 10 repetitions per touch location, generally not enough data to robustly estimate the covariance matrix that describes the conditional dependence of the neural behavior on the stimulus. In choosing equal covariance, we are able to pool data across touch locations, achieving a more generalizable approximation of the neural response as verified by cross-validation.”

13) We computed the cross-validated coefficient of determination

(R2 within) to measure how well a neuron's firing rate could be explained by the responses to the sensory fields.

This needs a better description, and I may be missing the point entirely. I assume it is an analysis of mean firing rate (which should be stated explicitly) and that it uses something like the indicator variable of the linear analysis of individual neuron tuning above. In this case is this is a logistic regression? As it is computed for each side independently, it would appear that there are only four bits to describe the firing of any given neuron. This would seem to be a pretty impoverished statistic, even if the statistical model is accurate.

You are correct in saying that the model, a function built on indicator variables, is a description of the mean response to each tactile field, with the mean being captured by the continuously valued β coefficient. In comparing across body sides, we are asking whether the mean response to each tactile field computed from one side of the body can predict the single trial responses to each tactile field on the other side of the body. A few clarifications: while the indicator variable is composed of zeros and ones, the β values (means) that scale the indicator variable in the linear model are continuous values. Thus, it is not 4 bits, but instead, 4 continuous values. Further, if we had simply computed the mean values for each side, and then correlated the resulting sets of 4 values, we would not expect especially meaningful results as correlations computed with such a small number of values can by chance cover the whole spectrum of correlations. Instead, we are seeing how well the trial-to-trial responses of one side is explained by the mean responses of the other side. This is summarized as the R2 value. We have revised the description of the linear model for greater clarity, updated the description of the approach, and included a new schematic figure (Figure 3—figure supplement 4) to motivate the approach. This figure along with its caption was shown in response to major point 6.

The revisions to the relevant text on the linear model can be found in the response to major point 4.

And clarification of the across body-side regression can be found in the response to major point 6.

14) The purpose of computing a specificity index was to quantify the degree to which a neuron was tuned to represent information pertaining to one side of the body over the other.

This is all pretty hard to follow. The R2 metric itself is a bit mysterious, as noted above. Within and across R2 is fairly straightforward, but adds to the complexity, as does SI, which makes comparisons of three different combinations of these measures across sides. Aside from R2 itself, the math is pretty transparent. However, a better high-level description of what insight all the different combinations provide would help to justify using them all. As is, there is no discussion and virtually no description of the difference across these three scatter plot. The critical point apparently, is that, "nearly all recorded PC-IP neurons demonstrate bilateral coding". There should be much a more direct way to make this point.

We have updated the text and streamlined the figure to make this section a little more digestible. These changes can be found in response to major point 6.

15) Computing response latency via RF discrimination is rather indirect and assumes that there is significant classification in the first place. I suspect it will add at least some delay beyond more typical tests. Why not a far simpler and more direct test of means in the same sliding window? Alternatively, a change point analysis?

The original sliding window classification analysis included three conditions: touch to the shoulder, cheek, and hand. The inclusion of the hand, a condition resulting in no significant modulation of the population, should allow for detection of the touch response, even in the absence of RF modulation. However, following your suggestion, we reanalyzed the latency data and have updated our method as outlined below. In brief, we now use a piece wise linear model on the first principal component (1PC) to detect the time the signal rises above the baseline response. A bootstrap procedure is used to find the quartile range of these times. Interestingly, the latency estimates are clearly earlier then what we got for the classification approach. We believe this is likely related to discussion above about Mahalanobis distance versus classification: namely, changes in the underlying neural behavior are only detected once the neural responses cross a decision line which likely results in delays detecting changes in neural behavior.

Here we opted for the 1PC as it is generally recommended when trying to capture the basic properties of the population response (51). Consistent with this, as shown in Author response image 2, the mean across the population (left) is quite a bit noisier then the first PC (right).

Author response image 2

Another nice property of the first PC, is that save for averaging responses in 2ms bins, we did not impose any additional smoothing.We have updated the manuscript to include the new approach. The text from the Results and Materials and methods are copied below for convenience.

Results:

“We measured latency as the time at which the response of the neural population rose above the pre-stimulus baseline activity (Figure 6). The neural population response was quantified as the first principal component computed from principal component analysis (PCA) of the activity of all neurons (51, 52). The first principal component was then fit with a piece-wise linear function and latency was computed as the time the linear function crossed the baseline pre-stimulus response. Response latency was short for both body sides and was slightly shorter for contralateral (right) receptive fields (50 ms) than for ipsilateral (left) receptive fields (54 ms) although this difference was not statistically significant (Permutation shuffle test, p>0.05). Figure 6A shows the time course of the first principal component relative to time of contact of the touch probe (stepped window; 2 ms window size, stepped at 2 ms, no smoothing) along with the piece-wise linear fit (dashed line). A bootstrap procedure was used to find the inter-quartile range of latency estimates (Figure 6B)”

Materials and methods:

“We quantified the neural response latency to touch stimuli at the level of the neural population. […] We used a rank test to compare the true difference in latency estimates against an empirical null distribution of differences in latency estimates generated by shuffling labels and repeating the comparison 2000 times.”

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

The reviewers were happy with the vast majority of the extensive revisions that have been implemented relative to the original submission, but felt that the presentation of the results and their discussion was inadequate. While we recognise the enormous amount of work that was poured into the revisions, there are still a few issues, originating from the revisions, that the reviewers felt require further consideration. But beyond these specific considerations that are elaborated below, we all agreed that the paper, at its present form, is too dense with results/interpretation, making it very difficult to read and evaluate. As one of the reviewers summarised:

"My main high-level concern was the, "extensive and overlapping analyses" that made it difficult to follow and to find a clear takeaway message from the paper. I suggested eliminating a number of figures and clarifying the remainder to improve the impact of the paper. Although a few small panels have disappeared, most of the duplication remains and there are now nine, fairly complex figures (had been eight). It's hard to judge how much longer the text got because the figures are no longer embedded, but that has certainly also increased. The paper has not improved as a consequence".

There is a strong need for a “deep clean” (or more precisely deep edit) of the paper. As highlighted in our original review, there's a lot of overlap between analyses/findings that will not be of interest to the average reader. We believe the key innovation of the study – the imagery results – could be presented a lot more concisely with a more focused discussion of the findings and the limitations of the suggested interpretation. Much of the details of the tactile RF properties is of secondary interest and as such should be moved to the supplementary section. I leave the decision of how to achieve this leaner and more focused version to the authors, but please consider the average reader (and their very limited time and attention span) when making the edits. As a rule of thumb, the reviewers estimated you will be likely be able to remove ~50% of the figure panels and 33% of the main text without weakening the key findings.

We understand the concern and have substantially reduced the amount of text and figures. We have cut the early portion of the text that discusses actual touch in half (from ~1800 words to ~850 words.) In addition we have cut the number of main figures in this section from 5 down to 2. This reduction was accomplished by a combination of removing figure panels (e.g. former panel 1B, 3B, 5C, 5E, portions of 5D and F were removed; we also removed 7C, 9B, and 9C) as well as pushing figures to the supplement (e.g. 3C, 4, and remaining parts of 5). This takes the main figures in the actual touch section from 18 panels down to 5 panels. This new presentation gets the key points across, e.g. neurons have spatially structured tactile receptive fields that are activated at short-latency consistent with processing of tactile sensations, in a more succinct manner.

This reorganization does leave a number of supplementary figures. However, the text is structured such that these figures are truly supplements, and can be skimmed or skipped without compromising a basic understanding of the primary message. We considered removing additional panels. However, 1) to our knowledge this is the first report of single unit responses in human PPC to tactile stimuli, and we think an (admittedly smaller) group of interested readers will find these analyses essential and 2) we have several upcoming papers what will cite these basic touch responses.

1) Perhaps the most exciting innovation of the study relates to the neural responses related to the imagery conditions. Yet, we know from many previous studies in humans that cognitive processes…

While the Introduction has been much improved, the Discussion still mostly disregards the alternative cognitive processes that are likely to drive the present findings (prediction, attention). The authors seem to downplays the impact of attention considerably. For example, they state:

"Most studies of pre-stimulus attention report that any modulation of baseline activity is modest at best."

But especially when considering the modest effects of imagery, this statement is misleading. The Roland (1981) study that we provided, for example, show a 25% increase in rCBF when subjects attended the index finger without being stimulated. And this increase was spatially specific as it shifted to the lip area when attending to the upper lip instead of the finger. Although the other 3 attention references were looking at the effects of attention while sensory stimulation was present, they still seem relevant to the discussion and show what I'd consider to be greater than modest effects of attention. For example, the Puckett (2017) reference shows clear digit-specific modulation when attending to the different fingertips. Note that despite stimulation being present during this condition, the stimulation was constant (allowing the phasic attention-related signal to be somewhat isolated from the sensory signal). The amplitude of the phasic attention signal was similar to that elicited by phasic stimulation alone (i.e., without endogenous attention).

There needs to be a more serious consideration that the effects attributed here to imagery are in fact modulated by (or even driven by) related cognitive processes, such as prediction and attention, which are not specifically linked to the auditory cue used in the present study.

We agree that the precise neural correlate of the activity is unclear. As the reviewer mentioned, imagery is just one possibility, and the manuscript does not state that the responses are the neural correlate of imagery.

For example, the Abstract states:

“Our results are the first neuron level evidence of touch encoding in human PPC and its cognitive engagement during tactile imagery, which may reflect semantic processing, attention, sensory anticipation, or imagined touch.”

Introducing the relevant section of the Discussion we write:

“While our task identifies dynamic engagement of multiple cognitive processes during tactile imagery, it is inadequate to precisely define the cognitive correlates of the observed neural activity. A number of cognitive processes may be engaged during the tactile imagery task including preparation for and/or execution of imagery, engagement of an internal model of the body, semantic processing of the auditory cue, allocation of attention to the cued body location or nature of the upcoming stimulus, and/or sensory memory for the corresponding actual sensation applied by the experimenter.“

That said, there are a couple of places where the distinction between the task that is used to evoke activity and the underlying cognitive process which is encoded could be made even more explicit. We have updated the text to more clearly specify that neural activity results from a “tactile imagery task”, e.g. updating the Abstract:

“We recorded neurons within the PPC of a human clinical trial participant during actual touch presentation and during a tactile imagery task. Neurons encoded actual touch at short latency with bilateral receptive fields, organized by body part, and covered all tested regions. The tactile imagery task evoked body part specific responses that shared a neural substrate with actual touch. Our results are the first neuron level evidence of touch encoding in human PPC and its cognitive engagement during a tactile imagery task, which may reflect semantic processing, attention, sensory anticipation, or imagined touch.”

We have also updated the portion of the Discussion that explicitly discusses a possible role for attention (We noticed that the references for neuroimaging were misaligned to our purpose in the revision. We had intended to use the citations the reviewer provided (here shown as 1-4). Thank you for references, and we have made the correction.):

“Hearing an auditory cue can direct the study participant’s attention to the cued body part. […] If our results are interpreted within the framework of attention, our current findings are inconsistent with a simple gain-like mechanism for attention, but instead suggest a richer mechanism by which information is selectively enhanced for further processing (7).”

This update is similar to the original, but I think the text is a fair accounting, consistent with our reading of the literature. First, we state that attention may be engaged by the task and that attention allocated to the body has been shown to effect neural responses. We are not questioning whether attention alters somatosensory processing. However, critically, this is not the question at hand. What is relevant to the current paper is whether attention generates condition specific changes in single unit neural firing, in the absence of any stimulus. The statement “Most studies of pre-stimulus attention report modest modulation of baseline neural activity“ is, to the best of our knowledge, an accurate statement of reported effects of pre-stimulus attention on measures of single unit activity. This is not a dismissal of attention, it just reflects the majority of studies that have looked at single unit spiking during the pre-stimulus period. In the revised text we go further than I believe necessary by stating that the inability to find a positive result may have been a failure to use sensitive analysis as suggested by the Snyder et al. paper. We then cite the one paper that shows relatively substantive pre-stimulus effects of attention in the firing of single units and state our results are compatible with the attention findings. We go on to interpret our results within the framework of attention. Again, I think the text is a fair accounting, consistent with our reading of the literature.

The reviewer provided references to bolster the case for attention. Three of the references measure attention effects on stimulus responses. We unequivocally state that “Attention to a stimulated body part has been shown to enhance sensory processing in human neuroimaging (1-4).” We are not questioning whether attention alters sensory processing and have included the recommended references to support a role for attention in the somatosensory system. As we state above however, it is not clear what this would say about the effect of prestimulus attention on spiking activity.

The reviewer also provides a reference to Roland 1981, which measures a 25% change in rCBF to attended body parts absent a stimulus. Can these changes be clearly attributed to attention-based modulation of single unit firing? To this point, Roland states “The most natural explanation for the rCBF and metabolic increase in the sensory finger region is that they mainly result from the local sum of many EPSPs … [and] a part of the metabolic increase in the sensory finger area may also be due to IPSPs.” (see Discussion). Roland thinks the bulk of the signal is dendritic processing, not a result of action potentials. Please understand that we are not using Roland’s words as a general critique of the ability of neuroimaging methods to measure single unit firing; we know that while they can dissociate, they generally correlate. That said, it is especially something like attention, defined in its modulatory capacity, that could give rise to dendritic computations in the absence of explicit spiking during the pre-stimulus period. Further, looking closer at Roland’s methodology, it is not clear that the rCBF results can clearly be attributed to attention: In Roland’s study, rCBF measurements were made absent an overt stimulus. However, only data in which the subject reported having experienced sensations were included in analysis. These “false alarms or reports of stimuli though none were applied … was considered a sign of high focal attention” otherwise “If no false alarms were reported the test was repeated.”. In this way “high focal attention” is inextricably tied to what might reasonably called imagined sensations, amongst other related cognitive variables. As with our paper, the highly correlated nature of many cognitive variables makes definitive interpretation difficult.

Taken together, I think that we have handled the possible interpretation of cognitive activity in a responsible manner.

2) Figure 4: Several panels would be more effective.

This figure has changed quite lot, addressing my cosmetic concerns. However, I do not understand this statistical test: If no comparison was significant (FDR corrected), the unit was classified as "single peak." If at least one of the comparisons was statistically significant, it was characterized as "multi-peak." I must be missing something fundamental. I took this to be a test of differences of the responses to the different body parts, with respect to the peak response. How is this a single peak? No differences sounds like a flat line. By this definition a neuron with no response whatsoever would be "single peak". Likewise, the multi-peak definition is a puzzle.

From our results: “In brief, for each neuron, we found the location of maximal response and asked whether we could find a second local maxima that rose significantly above the neighboring values. If no significant second local maxima was found, the neuron was categorized as single peak, otherwise, the neuron was categorized as multi-peak.”

We have updated the Materials and methods description for improved clarity, copied below for convenience:

“We found that many neurons responded to touch to multiple body locations. We wished to further characterize the receptive field structure to determine whether neurons were characterized by single-peaked broad receptive fields or discontinuous receptive fields with multiple peaks. […] The unit was then classified as multi-peak. If no second local maxima was found the unit was classified as single-peak. ”

3) What is added by including both classification and Mahalanobis distance?

"Mahalanobis distance that provides a sensitive measure of change which is masked by the discretization process of classification "

I don't think the discrete nature of the classifier output is really the biggest issue. By averaging across many instances, it essentially becomes continuous, as in this figure. There are also classifier-related metrics that are by their nature, continuous. The nature of the distance measure they are making is likely more important. In this case, I do not understand why the classification rises only slightly at t=0 as the M distance increases sharply. Subsequently, between 1 and 2.5 s, classifier success drops back below its original level even as distance is stable. The two measures really don't seem to be concordant. What is going on here? I think this concern is not unrelated to my next comment about Figure 8C (now 9C).

There is a question generated from another question. It may be that only the last portion of this response is needed, but in the interest of clarity, I’ll introduce the topic more generally.

Neural state-space is encoded as discrete values when using classification. Consider the schematic in Author response image 3 in which the neural state can either be described using the continuous variables of firing rate, or the discrete variables of class label (A or B):

Author response image 3

In this example, we see how two transitions (shown as vectors) through neural state space of comparable Mahalanobis distance are dramatically different in terms of classifier output (e.g. the movement started in class B and ended in class B, there is no indication that any movement occurred at all). From this example, it should be clear that Mahalanobis distance is a more sensitive measure of change. At the same time, it can also be significant whether or not the movement crosses significant boundaries. The classification result shows that although there is movement in neural state-space, the transition still leaves the neural state within the same basic class boundaries.Regarding how movement in state-space can give rise to the behavior shown, consider the following in Author response image 4:

Author response image 4

Initially, the neural state moves further from the boundary (improving accuracy[Note that I am showing “mean” trajectories. Single trial results would be quite a bit noisier, accounting for trial-to-trial variability in accuracy]), while increasing Mahalanobis distance from the starting location. Towards the end of the movement, the position in state-space maintains a consistent distance from the starting point while drifting to the boundary. This would lead to a steady-state distance, but dropping accuracy. Other geometries, factoring in mean and variance, could account for the results. However, given that the changes in accuracy are on the order of a couple percentage points, it is not clear that a detailed analysis would be of interest to a general audience.

4) Figure 8C: Despite my best efforts, I have no idea…

"This asymmetry is likely a consequence of the analysis technique and may not be of physiological significance.”

I agree with the statement, but not its sentiment. Perhaps I'm missing something, but the fact that a single classifier can distinguish between rest state and two very different activation states does nothing to suggest those two states are a general representation of an input. The classification failure in the opposite direction only reinforces that. Presumably, classifying imagined and actual touch would be trivial, at a much higher level of success than rest and imagined touch, suggesting that they are in fact, rather different, even by this metric. If the authors wish to make the claim that their results show more than grossly common receptive fields bilaterally and across modes (which is not an uninteresting finding) they would do well to adopt tools more appropriate for it, like those that have been used by the groups of Shenoy, Churchland, Miller, Kaufman, and others: Canonical correlations, principal angles, subspace overlap.

I believe there is a misunderstanding here. Perhaps it is best to take a step back.

Our objective is to test the hypothesis that actual and cognitive representations of touch share a neural substrate at the population level in PPC. We address this by asking whether the neural population representations of the stimulated body part are similar across these two contexts. There are a number of metrics that can be used to measure similarity. In the draft, we use cross-classification accuracy as the metric of similarity. This approach creates a low-dimensional discretized representation of neural space (the decision space defined by the classifier) based on the conditions of one context (e.g. imagery) and asks whether trials from the alternate context map to corresponding regions of neural space. This is a powerful notion of similarity; it says that a decision maker (e.g. the classifier) interprets population level responses across contexts in a similar way.

Cross-classification is not the only way of computing a meaningful measure of similarity. The reviewer suggests a number of methods, with the implication that these alternate metrics would reveal something more fundamental about the relationship between the two contexts. While I agree that these proposed methods would measure something different, I disagree that they can reveal something more fundamental.

Consider principal angles. In this approach, we would compute the latent manifold independently for each context using, e.g. PCA. We would then compute the principal angles between these manifolds, thereby quantifying the degree to which population responses in the two contexts live in the same manifold. On the surface, this sounds interesting, and, no doubt, has its place. However, for our purposes, I would argue that it is less stringent and less revealing than something like cross-classification. This is because overlapping manifolds does not imply similar neural encoding. In fact, manifolds can be perfectly aligned in the potentially uninteresting case that the only correspondence between the two contexts is that the same set of neurons are modulated, but in completely unrelated ways. To illustrate, I created a toy problem (full matlab code below) where the two contexts are characterized by independent realizations of a multivariate normal distribution (e.g. X=randn(500,50); Y=randn(500,50)). I then added a multiplier on the first 10 dimensions. As shown in Author response image 5 (left panel) there is no covariance between the two contexts. Due to the multiplier term, there is a ten-dimensional manifold that explains >95% of the variance (Author response image 5 : middle column). Because the multiplier was applied to the same columns of the two contexts (X and Y), we see that the first 10 principal angles are close to 0 (Author response image 5 : right column), indicating that the two manifolds are very similar. As this example illustrates, near perfect alignment of manifolds might say nothing more than that the same neurons modulate in both contexts. This has its place, and may even be necessary to find any meaningful relationship when eg. Using unlabeled data or when temporal considerations prevent a meaningful alignment of the data. However, in our case, we wanted a metric that is able to say whether the same neurons are modulating in similar ways based on task conditions, not just that the same neurons are modulating. Cross-classification gives us this, principal angles do not.

Matlab code:

X=randn(500,50); Y=randn(500,50);

idx=1:10;

X(:,idx)=X(:,idx)*10; Y(:,idx)=Y(:,idx)*10;

[coeffX,scoreX,latentX,tsquaredX,explainedX,muX] = pca(X);

[coeffY,scoreY,latentY,tsquaredY,explainedY,muY] = pca(Y);

figure; subplot(1,3,1); hold on

imagesc(corr(X,Y),[-1 1]); colormap('jet');colorbar

axis image; title('Correlation')

xlabel('Context 1'); ylabel('Context 2')

subplot(1,3,2); hold on

plot(explainedX,'r.-','markersize',10)

plot(explainedY,'g.-','markersize',10)

ylabel('Var Explained'); legend({'Context 1','Context 2'})

theta=subspacea(coeffX(:,1:12),coeffY(:,1:12));

subplot(1,3,3);

plot(theta'*180/pi,'.-','markersize',10)

title('Principal Angle')

ylabel('Angle (deg)')

Author response image 5

Canonical correlation analysis (CCA) is also proposed. Again, CCA has its place but I do not think it is correct to say that it would reveal something more interesting/fundamental. Consider the following example. Author response image 6 shows the response of two neurons in two contexts, color coded by condition. Cross-classification in this scenario would fail to show a similar relationship between labeled data between contexts. In contrast, CCA would show a very strong similarity (correlation values ~.8 in this simulated data). CCA is successful because it finds the linear projection that best equates these two scenarios (in this case, heuristically, by rotating Context 1 clockwise). In effect, CCA computes a measure of similarity, after an arbitrary linear transformation is applied to maximize similarity. This is fine for what it is (and sometimes necessary when e.g. observations from the two contexts might come from different sensors that have no natural correspondence) but I don’t think it would provide deeper insight given that we already show that there is correspondence prior to an optimized linear mapping.

Author response image 6

None of this changes the fact that the asymmetries inherent in cross-classification can add complication without any clear benefit. In our case, I believe the asymmetry may not be especially interesting, thus adding complication without benefit. For this reason, we are substituting out cross-classification for cross-correlation. This maintains the benefits of direct tests of whether the condition specific patterns of activation are consistent across contexts without the inherent complications that come from different discretization of neural space depending on the dataset used for training. As can be seen, the basic pattern of results is essentially identical.Results:

“Cognitive processing during the cue-delay and imagery epochs of the tactile imagery task shares a neural substrate with that for actual touch

Finally, we look at how encoding patterns through time generalize between the tactile imagery and actual touch conditions. A dynamic correlation analysis was applied both within and across the imagery and actual touch condition types (Figure 6A). In brief, the neural activation pattern elicited to each body location was quantified as a vector, and these vectors were concatenated to form a population response matrix for each condition type and for each point in time. These vectors were then pair-wise correlated in a cross-validated manner so that the strength of correlation between conditions could be assessed relative to the strength of correlation within condition, and across time. We found that the neural population pattern that defined responses to actual touch was similar to population responses both during the cue-delay or the imagery phases of the imagery task (Figure 6A). This implies that cognitive processing prior to active imagery as well as during imagery share a neural substrate with actual touch. Sample neuronal responses that help to understand single unit and population behavior are shown in Figure 6B.”

5) Computing response latency via RF discrimination is rather indirect.

The authors have adopted a more sensible and sensitive test of latency. I do not agree with this statement, however: "We believe this is likely related to discussion above about Mahalanobis distance versus classification: namely, changes in the underlying neural behavior are only detected once the neural responses cross a decision line which likely results in delays detecting changes in neural behavior." In what fundamental sense is classification significance different from a significant distance in M-space? It seems to me that the more likely explanation is simply that significant modulation precedes significant discrimination.

The classification involved discrimination between touch to three locations, the hand, shoulder and cheek. Touch to the insensate hand resulted in no neural modulation. For this reason, modulation of cheek and shoulder should enable discrimination against the unmodulated hand condition. Peak accuracy in this case would not be 100% but results should still be significant. The exact reason why classification failed to detect the time of modulation is a timely manner is a legitimate question, but moot given that we have removed the classification-based approach entirely.

However, to illustrate why the discretization problem (as defined in response 3) could contribute, consider the schematic in Author response image 7:

Author response image 7

Depending on the rate of change of the neural activity, and exact location of the decision boundary, there could be a significant delay between latency as measured by time of modulation and time neural activity crosses a decision threshold.

https://doi.org/10.7554/eLife.61646.sa2

Article and author information

Author details

  1. Srinivas Chivukula

    1. Department of Biology and Biological Engineering, California Institute of Technology, Pasadena, United States
    2. Tianqiao and Chrissy Chen Brain-Machine Interface Center, Chen Institute for Neuroscience, California Institute of Technology, Pasadena, United States
    3. Geffen School of Medicine, University of California, Los Angeles, Los Angeles, United States
    Contribution
    Data curation, Software, Formal analysis, Validation, Investigation, Methodology, Writing - original draft, Writing - review and editing
    Contributed equally with
    Carey Y Zhang and Tyson Aflalo
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-3570-162X
  2. Carey Y Zhang

    1. Department of Biology and Biological Engineering, California Institute of Technology, Pasadena, United States
    2. Tianqiao and Chrissy Chen Brain-Machine Interface Center, Chen Institute for Neuroscience, California Institute of Technology, Pasadena, United States
    Contribution
    Data curation, Software, Formal analysis, Investigation, Methodology, Writing - original draft, Writing - review and editing
    Contributed equally with
    Srinivas Chivukula and Tyson Aflalo
    Competing interests
    No competing interests declared
  3. Tyson Aflalo

    1. Department of Biology and Biological Engineering, California Institute of Technology, Pasadena, United States
    2. Tianqiao and Chrissy Chen Brain-Machine Interface Center, Chen Institute for Neuroscience, California Institute of Technology, Pasadena, United States
    Contribution
    Conceptualization, Resources, Data curation, Software, Formal analysis, Supervision, Funding acquisition, Validation, Investigation, Methodology, Writing - original draft, Project administration, Writing - review and editing
    Contributed equally with
    Srinivas Chivukula and Carey Y Zhang
    For correspondence
    taflalo@caltech.edu
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-0101-2455
  4. Matiar Jafari

    1. Department of Biology and Biological Engineering, California Institute of Technology, Pasadena, United States
    2. Tianqiao and Chrissy Chen Brain-Machine Interface Center, Chen Institute for Neuroscience, California Institute of Technology, Pasadena, United States
    3. Geffen School of Medicine, University of California, Los Angeles, Los Angeles, United States
    Contribution
    Resources, Data curation, Validation, Writing - review and editing
    Competing interests
    No competing interests declared
  5. Kelsie Pejsa

    1. Department of Biology and Biological Engineering, California Institute of Technology, Pasadena, United States
    2. Tianqiao and Chrissy Chen Brain-Machine Interface Center, Chen Institute for Neuroscience, California Institute of Technology, Pasadena, United States
    Contribution
    Resources, Data curation, Supervision, Funding acquisition, Methodology
    Competing interests
    No competing interests declared
  6. Nader Pouratian

    1. Department of Biology and Biological Engineering, California Institute of Technology, Pasadena, United States
    2. Tianqiao and Chrissy Chen Brain-Machine Interface Center, Chen Institute for Neuroscience, California Institute of Technology, Pasadena, United States
    3. Geffen School of Medicine, University of California, Los Angeles, Los Angeles, United States
    Contribution
    Resources, Data curation, Funding acquisition, Methodology, Writing - review and editing
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-0426-3241
  7. Richard A Andersen

    1. Department of Biology and Biological Engineering, California Institute of Technology, Pasadena, United States
    2. Tianqiao and Chrissy Chen Brain-Machine Interface Center, Chen Institute for Neuroscience, California Institute of Technology, Pasadena, United States
    Contribution
    Resources, Supervision, Funding acquisition, Validation, Project administration, Writing - review and editing
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-7947-0472

Funding

National Eye Institute (R01EY015545)

  • Tyson Aflalo
  • Nader Pouratian
  • Richard A Andersen

Conte Center (P50MH094258)

  • Tyson Aflalo
  • Richard A Andersen

T&C Chen Brain-Machine Interface Center at Caltech

  • Tyson Aflalo
  • Richard A Andersen

Boswell Foundation

  • Richard A Andersen

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

The authors thank subject NS for participating in the studies, Viktor Shcherbatyuk for technical assistance, and Kelsie Pejsa for administrative and regulatory assistance. This work was supported by the National Institute of Health (R01EY015545), the Tianqiao and Chrissy Chen Brain-machine Interface Center at Caltech, the Conte Center for Social Decision Making at Caltech (P50MH094258), and the Boswell Foundation.

Ethics

Clinical trial registration NCT01958086.

Human subjects: All procedures were approved by the California Institute of Technology (IRB #18-0401), University of California, Los Angeles (IRB #13-000576-AM-00027), and Casa Colina Hospital and Centers for Healthcare (IRB #00002372) Institutional Review Boards. Informed consent was obtained after the nature of the study and possible risks were explained.

Senior and Reviewing Editor

  1. Tamar R Makin, University College London, United Kingdom

Publication history

  1. Received: July 31, 2020
  2. Accepted: February 8, 2021
  3. Version of Record published: March 1, 2021 (version 1)

Copyright

© 2021, Chivukula et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 1,239
    Page views
  • 128
    Downloads
  • 1
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Download citations (links to download the citations from this article in formats compatible with various reference manager tools)

Open citations (links to open the citations from this article in various online reference manager services)

Further reading

    1. Neuroscience
    Stephanie Maynard et al.
    Research Article

    Precise quantitative information about the molecular architecture of synapses is essential to understanding the functional specificity and downstream signaling processes at specific populations of synapses. Glycine receptors (GlyRs) are the primary fast inhibitory neurotransmitter receptors in the spinal cord and brainstem. These inhibitory glycinergic networks crucially regulate motor and sensory processes. Thus far the nanoscale organization of GlyRs underlying the different network specificities has not been defined. Here, we have quantitatively characterized the molecular arrangement and ultra-structure of glycinergic synapses in spinal cord tissue using quantitative super-resolution correlative light and electron microscopy (SR-CLEM). We show that endogenous GlyRs exhibit equal receptor-scaffold occupancy and constant packing densities of about 2000 GlyRs µm-2 at synapses across the spinal cord and throughout adulthood, even though ventral horn synapses have twice the total copy numbers, larger postsynaptic domains and more convoluted morphologies than dorsal horn synapses. We demonstrate that this stereotypic molecular arrangement is maintained at glycinergic synapses in the oscillator mouse model of the neuromotor disease hyperekplexia despite a decrease in synapse size, indicating that the molecular organization of GlyRs is preserved in this hypomorph. We thus conclude that the morphology and size of inhibitory postsynaptic specializations rather than differences in GlyR packing determine the postsynaptic strength of glycinergic neurotransmission in motor and sensory spinal cord networks.

    1. Neuroscience
    Haleigh N Mulholland et al.
    Research Article

    Intracortical inhibition plays a critical role in shaping activity patterns in the mature cortex. However, little is known about the structure of inhibition in early development prior to the onset of sensory experience, a time when spontaneous activity exhibits long-range correlations predictive of mature functional networks. Here, using calcium imaging of GABAergic neurons in the ferret visual cortex, we show that spontaneous activity in inhibitory neurons is already highly organized into distributed modular networks before visual experience. Inhibitory neurons exhibit spatially modular activity with long-range correlations and precise local organization that is in quantitative agreement with excitatory networks. Furthermore, excitatory and inhibitory networks are strongly co-aligned at both millimeter and cellular scales. These results demonstrate a remarkable degree of organization in inhibitory networks early in the developing cortex, providing support for computational models of self-organizing networks and suggesting a mechanism for the emergence of distributed functional networks during development.