Peer review process
Not revised: This Reviewed Preprint includes the authors’ original preprint (without revision), an eLife assessment, and public reviews.
Editors
- Reviewing Editor: Peter Kok, University College London, London, United Kingdom
- Senior Editor: Tamar Makin, University of Cambridge, Cambridge, United Kingdom
Reviewer #1 (Public Review):
This study explored how expectations influence tactile perception. In summary, anticipating a tactile event enhances detection compared to when knowledge is lacking or ambiguous. However, prior information can also impair performance if the expected and actual stimuli are incongruent. The authors used fMRI and multivariate decoding analyses to investigate the underlying mechanisms of this behavioural phenomenon.
They stimulated two fingers (thumb and ring) of the left hand and analysed activity patterns in contralateral and ipsilateral somatosensory regions during and before stimulation. They were able to distinguish activity patterns for the two fingers during both stimulation and the pre-stimulation stage, specifically for the congruent condition. The authors suggest that congruent vibrotactile stimulation leads to higher multivariate information content and improved behavioural detection performance. They also found that the expectation of vibrotactile stimulation elicits somatotopic activity in contralateral S1, similar to the activity generated by actual stimulation.
I thoroughly enjoyed reading this well-written and clear work. The incorporation of multivariate decoding analysis alongside univariate analysis is a good choice for addressing the claimed questions. In the following sections, I will highlight the strengths and weaknesses of the study. While I generally agree with the authors' conclusions regarding the functional mechanisms underlying behavioural improvements, I believe there are limitations in the experimental design and chosen measures that constrain the interpretations drawn from the results. I hope that my comments can contribute to clarifying certain details and improving aspects of the study that may be considered weak. I believe this study holds significance for the field and provides a foundation for future investigations into the influence of top-down processing on tactile processing.
Strengths:
The research question is highly intriguing as it delves into the unexplored territory of top-down processes within the tactile domain that still needs to be well characterised.
The addition of multivariate decoding analysis alongside the univariate analysis was a good choice in my opinion, since activity level per se may not accurately reflect the underlying information content. Both high activity levels and absence of activity (as observed in this study) can still contain information. To be more specific, Figure 2C shows no significant activity in the congruent condition, but significant decoding for the two finger activity patterns is still possible in this condition (Figure 3A).
The utilization of a staircase before each functional run was also a good approach, although a potential limitation is noted (discussed below). Considering that prior knowledge can be particularly influential in the presence of weak or noisy stimuli, it is crucial to confirm that the stimulation was at threshold to maximize the likelihood of detecting differences in the pre-stimulus stage.
Weaknesses:
My main concern regarding this study lies in the choice of a detection paradigm, which may introduce response biases and affect the interpretation of results. If the threshold was set too low for some participants, it is possible that they reported feeling the touch more frequently on the cued finger, even when no actual sensation was present. Consequently, accuracy may be inflated for the congruent condition and reduced for the incongruent condition, making it difficult to attribute the observed improvements solely to enhanced detection. I think it would have been more appropriate to use a discrimination task (e.g., discriminating pin patterns), as employed in Kok et al., 2012, where behavioural performance can be directly linked to decoding accuracy between related activity patterns. Additionally, incorporating trials with no stimulation (I am not sure whether this was the case in this study) and utilizing "None" responses to calculate accuracy could provide a more reliable measure of performance. Using d-prime (d′) as a performance measure, which separates sensitivity from response bias, may be more appropriate. However, I remain concerned that participant responses are influenced more by the cue than by the actual detection of stimuli.
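The d-prime suggestion above can be illustrated with a minimal sketch. It assumes that no-stimulation trials are available, so that a "false alarm" (reporting touch on a finger when nothing was delivered) can be counted; the function name, the log-linear correction, and all trial counts below are hypothetical and are not taken from the study.

```python
from statistics import NormalDist


def dprime_and_criterion(hits, misses, false_alarms, correct_rejections):
    """Return (d', c) from trial counts for one finger and one cue condition.

    Assumption: stimulus-absent trials exist, so false alarms are defined.
    """
    # Log-linear correction: add 0.5 to each cell so that rates of exactly
    # 0 or 1 never reach the inverse normal CDF, where it is undefined.
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf
    d_prime = z(hit_rate) - z(fa_rate)            # sensitivity
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))  # response bias (negative = liberal)
    return d_prime, criterion


# Hypothetical counts: an unbiased observer versus one who says "touch" more
# readily on the cued finger (more hits, but also more false alarms).
unbiased = dprime_and_criterion(35, 15, 15, 35)
cue_biased = dprime_and_criterion(45, 5, 30, 20)
print("unbiased   d' = %.2f, c = %.2f" % unbiased)
print("cue-biased d' = %.2f, c = %.2f" % cue_biased)
```

In this illustration a cue-driven tendency to report touch appears mainly as a shift in the criterion c rather than as a gain in d′, which is the distinction that raw accuracy cannot make.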
While I appreciate the use of the staircase method, I was somewhat surprised by the relatively short length of each staircase (only 7 trials). I do not have extensive experience in this area, so this may still be adequate for fingers, but I want to emphasize the importance of accurately determining the threshold for this study (as discussed in the previous point). However, I can see from Figure 1B that there seems to be consistency across runs (at least in the participant shown).
The absence of significant decoding in the incongruent condition (Figure 3A) raises some questions. It seems reasonable to expect that discrimination between the two finger activity patterns should still be possible in this condition, albeit with reduced accuracy as observed in Kok et al., 2012. Could this lack of significant decoding result from the detection task or possibly due to the smaller number of trials in the incongruent condition?
I am a bit confused about which specific region of interest (ROI) was used for both the univariate and decoding analysis during the stimulation stage, and the decoding analysis and RSA during the pre-stimulation phase. From my understanding, the entire S1 region (as defined using the SPM Anatomy toolbox) was included, encompassing not only the hand territory but the entire body. However, I may have misinterpreted the methodology. Given that an independent localizer was used to define ROIs for the univariate analysis during the pre-stimulation phase, it raises the question of why the same approach was not applied to the analyses during the stimulation phase and the multivariate analysis during the pre-stimulation phase.
Because a large ROI was used for the analysis (as mentioned in the previous point), the straightforward interpretation of the BOLD level (i.e., no significant activity) in the congruent condition (Figure 2C) becomes less clear. It raises the question of whether there is truly no activity in the congruent condition or whether activity would be observed with a smaller region. This aligns with the findings of Kok et al., 2012, who demonstrate activity in both expected and unexpected conditions, albeit reduced in the expected condition.
The previous point raises another issue regarding the suggestion that significant decoding results imply higher multivariate information content in finger representations of congruent vibrotactile stimulations. Suppose a smaller ROI were used, revealing activity in the congruent condition and differential activity between the two finger conditions. In that case, the substantial difference in activation levels suggests that increased decoding accuracy may not necessarily require higher multivariate information content. It is conceivable that discrimination between the two conditions could be achieved with just two voxels: one in the thumb territory and one in the ring finger territory.
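The two-voxel argument above can be made concrete with a small simulation. This is a sketch under stated assumptions, not the authors' pipeline: the data are synthetic, the signal and noise values are arbitrary, and a nearest-centroid classifier stands in for the SVM used in the study.

```python
import random


def simulate_two_voxel_decoding(n_trials=500, signal=3.0, noise_sd=1.0, seed=0):
    """Decode finger identity from two synthetic voxels; return test accuracy.

    Voxel 0 responds to finger 0 and voxel 1 to finger 1 (hypothetical setup).
    """
    rng = random.Random(seed)

    def trial(finger):
        # Each voxel carries `signal` only when its preferred finger is touched.
        return [(signal if finger == v else 0.0) + rng.gauss(0.0, noise_sd)
                for v in (0, 1)]

    train = [(trial(f), f) for f in (0, 1) for _ in range(n_trials)]
    test = [(trial(f), f) for f in (0, 1) for _ in range(n_trials)]

    # Nearest-centroid classifier: mean training pattern per finger.
    centroids = {}
    for f in (0, 1):
        patterns = [p for p, label in train if label == f]
        centroids[f] = [sum(col) / len(patterns) for col in zip(*patterns)]

    def predict(pattern):
        return min((0, 1), key=lambda f: sum((a - b) ** 2
                                             for a, b in zip(pattern, centroids[f])))

    correct = sum(predict(p) == label for p, label in test)
    return correct / len(test)


print("with activation difference:", simulate_two_voxel_decoding(signal=3.0))
print("without activation difference:", simulate_two_voxel_decoding(signal=0.0))
```

High accuracy here comes entirely from a coarse difference in activation level between two voxels, which is why decoding accuracy alone does not establish richer multivariate information content.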
Reviewer #2 (Public Review):
Summary
The authors conducted a study in which participants perceived near-threshold touch at either the thumb or the ring finger while lying in the MR scanner. Prior to stimulation, a visual cue either indicated with 80% probability which finger would be touched next (thumb or ring finger) or provided no meaningful information about which finger would be touched. Subsequently, participants indicated via button press which finger was actually touched. By showing that (1) participants were more accurate in reporting which finger was touched in the congruent compared to the incongruent and neutral conditions, (2) S1 responses were higher in the incongruent compared to the congruent and neutral conditions, (3) decoding accuracies were higher for the congruent compared to the incongruent and neutral conditions, and (4) decoding was also successful in the period after cueing and before stimulation, the authors argue that, similar to V1, S1 shows decreased BOLD activation in response to expected versus non-expected stimuli, whereas the finger-specific response is more precise for expected versus non-expected stimuli. The authors also argue that behavioral improvement is associated with a tactile stimulus being predicted in location.
Strengths
The manuscript combines a behavioral threshold task that can be analyzed using psychophysics with BOLD responses in S1, providing a rich paradigm to understand the relationship between S1 responsivity and tactile perception. The authors combine GLM with both ROI-based and whole-brain searchlight-based decoding analyses, and therefore offer different analysis methods to obtain a comprehensive picture of S1 responsivity during expected versus non-expected touch. It is also a strength of the paper that two different fingers were investigated, hence addressing the aspect of topography.
Weaknesses
The behavioral paradigm that was chosen is not ideal to address the authors' questions on whether or not behavior improves for expected versus non-expected touch. More precisely, in 80% of the cases when it was indicated that the ring finger would be touched, in fact later the ring finger was touched, whereas in 80% of the cases when it was indicated that the thumb would be touched, in fact later the thumb was touched. In the congruent conditions where later the indicated finger was indeed touched, participants showed on average 70% accuracy. Therefore, they could have reached this accuracy level simply by choosing the indicated finger unless they had a strong sensation that indicated to them to respond otherwise. In order to show that the cueing can improve behavioral performance, one would have to choose a tactile task that is not related to finger identity (which was cued), such as frequency detection or spatial acuity.
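The arithmetic behind this concern can be sketched with a simple observer model. It is an illustration only: it assumes a two-choice response (thumb or ring finger, no "none" option), and the parameter names `p_perceive` and `p_follow_cue` are hypothetical, not quantities estimated in the study.

```python
def expected_accuracy(p_perceive, p_follow_cue):
    """Expected accuracy on (congruent, incongruent) trials.

    p_perceive:   probability the touch is actually felt and reported correctly.
    p_follow_cue: probability of reporting the cued finger when nothing is felt.
    """
    # When nothing is felt, following the cue is correct on congruent trials
    # and wrong on incongruent trials.
    congruent = p_perceive + (1 - p_perceive) * p_follow_cue
    incongruent = p_perceive + (1 - p_perceive) * (1 - p_follow_cue)
    return congruent, incongruent


# An observer who never feels the stimulus but reports the cued finger on 70%
# of trials reaches 70% "accuracy" in the congruent condition.
congruent, incongruent = expected_accuracy(p_perceive=0.0, p_follow_cue=0.7)
print("congruent: %.2f, incongruent: %.2f" % (congruent, incongruent))

# With 80% cue validity, overall accuracy is the validity-weighted mean.
overall = 0.8 * congruent + 0.2 * incongruent
print("overall: %.2f" % overall)
```

Under this model a congruent-versus-incongruent accuracy difference arises without any change in tactile sensitivity, which is why a task orthogonal to the cued feature would be more informative.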
The correlation between accuracy and decoding accuracies as shown in Figure 3b does not seem to be correct. The decoding accuracies indicate how well the algorithm can differentiate between D1 versus D4 stimulation in the congruent condition, whereas the behavior indicates the difference between congruent and incongruent responses. I think those two measures should not be compared directly, in addition to the general problem that is inherent in the behavioral paradigm, as outlined above. I would therefore treat this correlation and the behavioral analyses in general with great caution.
Alternative ways to interpret the data
It is worth noting that the incongruent stimulation condition did not reveal significant D1 versus D4 decoding results either when ROI-based decoding was used or when searchlight-based decoding was used (see Figure 3a,c). Therefore, it seems that when the wrong finger was cued, the finger representation of the actually touched finger did not respond. Given that the decoding accuracy is even below 50% for the incongruent ROI-based decoding, this seems to indicate that the finger-specific response in S1 to the cued finger is even stronger than the finger-specific response in S1 to the actually touched finger. This may be the major take-home message of the paper. This hypothesis could be directly tested by showing the plot in Figure 2c for each finger: the results may show that the higher activation in the incongruent condition is actually due to the fact that in this condition, both the non-touched finger and the touched finger respond, whereas this is not the case for the other conditions.
When discussing this finding, the authors write that "finger representations of congruent vibrotactile stimulations are associated with higher multivariate information content, are more aligned with the somatotopic organization in contralateral S1, and that the enhanced representation of these stimuli is strongly associated with behavioral detection performance." - A better formulation may be that for threshold tactile stimulation, the expectation of finger touch can override the actual finger touch, indicating a strong influence of top-down control on S1 finger maps. This is also supported by the analyses showing that there is finger-specific activation in the cue-stimulation interval. However, as indicated above, finger- and condition-specific BOLD activation needs to be shown to explore this in more detail.
Reviewer #3 (Public Review):
The authors have devised a clever experimental design involving the provision of cues to participants, indicating the finger that is more likely to be stimulated in each trial (e.g., ring finger or thumb). Employing fMRI analyses, the authors have leveraged the distinct and well-defined finger representations in the somatosensory cortex to investigate how prior knowledge influences the processing of haptic stimuli in a probability cueing paradigm. The authors successfully replicate key neural phenomena associated with predictive processing, encompassing expectation suppression, the sharpening of expected information representation, and the pre-activation of sensory templates associated with the anticipated stimulus. The methodology employed in this study is straightforward, and the obtained results are convincing.
However, it is worth noting that the cue-finger and finger associations were explicitly conveyed to the participants in this study. Additionally, the inter-stimulus interval (ISI) between the finger-cue and the cue varied randomly across trials, rendering the onset of the cue unpredictable (in time) for the participants. These experimental manipulations lead me to consider that the observed results may not be solely explained by predictive mechanisms but could also involve top-down controlled attention. It would be valuable for the authors to include a task similar to Experiment 2 in Kok et al. (2012), where participants' attention was diverted away from the grating contrast, yet decoding sharpening for expected but task-irrelevant stimulus orientations was still evident. Incorporating such a task would help elucidate whether the authors would replicate similar results when predictive information remains intact but the predicted stimulus feature becomes task-irrelevant.
Furthermore, I have concerns regarding potential issues related to the training of the multivariate decoder. If I understand correctly, instead of using the functional localiser to train the SVM classifier, the authors directly employed the experimental data from the congruent, incongruent, and non-informative conditions together. It is noted that the number of trials used in each training fold was downsampled to achieve an equal number of trials from each condition, controlling for the asymmetry in the number of trials between the incongruent and congruent conditions. However, I am concerned that if there are univariate differences between the activity patterns in the training datasets (e.g., congruent < incongruent), the decoder might exhibit a bias towards relying more on the activity of one specific condition, thereby potentially performing better in decoding that particular condition. To address this, I suggest presenting Representational Similarity Analysis (RSA) results using the activity patterns evoked by congruent, incongruent, and non-informative stimuli. This analysis would offer a simpler, more interpretable representation of changes in the representational geometry of the stimuli based on previous predictions (see Blank & Davis, 2016), and might shed some light on whether your results correspond to sharpening or dampening of the expected information.
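As a rough illustration of the suggested RSA, the sketch below builds a representational dissimilarity matrix (RDM) from condition-averaged patterns using correlation distance (1 minus Pearson r). The condition labels and voxel values are made up for the example, and correlation distance is only one of several possible dissimilarity measures; none of this reflects the authors' data or code.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of voxel values."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def correlation_rdm(patterns):
    """Map each (label_a, label_b) pair to 1 - r between their patterns."""
    labels = list(patterns)
    return {(a, b): 1.0 - pearson(patterns[a], patterns[b])
            for a in labels for b in labels}


# Hypothetical condition-averaged patterns over four voxels.
patterns = {
    "congruent_thumb": [2.0, 0.1, 0.3, 0.0],
    "congruent_ring": [0.1, 0.2, 1.9, 0.1],
    "incongruent_thumb": [1.2, 0.4, 1.0, 0.2],
    "incongruent_ring": [0.9, 0.3, 1.3, 0.1],
}
rdm = correlation_rdm(patterns)
for cond in ("congruent", "incongruent"):
    d = rdm[(cond + "_thumb", cond + "_ring")]
    print("%s thumb vs ring dissimilarity: %.2f" % (cond, d))
```

Because correlation distance removes each pattern's mean and scale, comparing thumb-versus-ring dissimilarity across cue conditions is less sensitive to overall univariate differences than classifier accuracy is.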