Uncertainty-based inference of a common cause for body ownership

  1. Marie Chancel (corresponding author)
  2. H Henrik Ehrsson
  3. Wei Ji Ma
  1. Department of Neuroscience, Karolinska Institutet, Sweden
  2. Center for Neural Science and Department of Psychology, New York University, United States

Abstract

Many studies have investigated the contributions of vision, touch, and proprioception to body ownership, i.e., the multisensory perception of limbs and body parts as our own. However, the computational processes and principles that determine subjectively experienced body ownership remain unclear. To address this issue, we developed a detection-like psychophysics task based on the classic rubber hand illusion paradigm, where participants were asked to report whether the rubber hand felt like their own (the illusion) or not. We manipulated the asynchrony of visual and tactile stimuli delivered to the rubber hand and the hidden real hand under different levels of visual noise. We found that: (1) the probability of the emergence of the rubber hand illusion increased with visual noise and was well predicted by a causal inference model involving the observer computing the probability of the visual and tactile signals coming from a common source; (2) the causal inference model outperformed a non-Bayesian model involving the observer not taking into account sensory uncertainty; (3) by comparing body ownership and visuotactile synchrony detection, we found that the prior probability of inferring a common cause for the two types of multisensory percept was correlated but greater for ownership, which suggests that individual differences in rubber hand illusion can be explained at the computational level as differences in how priors are used in the multisensory integration process. These results imply that the same statistical principles determine the perception of the bodily self and the external world.

Editor's evaluation

In the rubber hand illusion, a rubber hand feels as if it were part of one's body when stroked in synchrony with one's own occluded hand. By varying visual noise and the temporal lags between the strokes to the rubber hand and to the real hand, the authors suggest that body ownership is governed by an active and probabilistic causal inference process that uses both prior knowledge and sensory uncertainty. The authors argue that probabilistic functions rather than fixed multisensory integration rules govern body ownership, thereby opening new avenues for investigating its computational principles. The evidence is compelling, and the findings advance our fundamental understanding of the computational principles of body ownership.

https://doi.org/10.7554/eLife.77221.sa0

Introduction

The body serves as an anchor point for experiencing the surrounding world. Humans and animals need to be able to perceive what constitutes their body at all times, i.e., which objects are part of their body and which are not, to effectively interact with objects and other individuals in the external environment and to protect their physical integrity through defensive action. This experience of the body as one’s own, referred to as ‘body ownership’ (Ehrsson, 2012), is automatic and perceptual in nature and depends on integrating sensory signals from multiple sensory modalities, including vision, touch, and proprioception. We thus experience our physical self as a blend of sensory impressions that are combined into a coherent unitary experience that is separable from the sensory impressions associated with external objects, events, and scenes in the environment. This perceptual distinction between the self and nonself is fundamental not only for perception and action but also for higher self-centered cognitive functions such as self-recognition, self-identity, autobiographical memory, and self-consciousness (Banakou et al., 2013; Beaudoin et al., 2020; Bergouignan et al., 2014; Blanke et al., 2015; Maister and Tsakiris, 2014; Tacikowski et al., 2020; van der Hoort et al., 2017). Body ownership is also an important topic in medicine and psychiatry, as disturbances in bodily self-perception are observed in various neurological (Brugger and Lenggenhager, 2014; Jenkinson et al., 2018) and psychiatric disorders (Costantini et al., 2020; Keizer et al., 2014; Saetta et al., 2020), and body ownership is a critical component of the embodiment of advanced prosthetic limbs (Collins et al., 2017; Makin et al., 2017; Niedernhuber et al., 2018; Petrini et al., 2019). Thus, understanding how body ownership is generated is an important goal in psychological and brain sciences.

The primary experimental paradigm for investigating the sense of body ownership has been the rubber hand illusion (Botvinick and Cohen, 1998). In the rubber hand illusion paradigm, participants watch a life-sized rubber hand being stroked in the same way and at the same time as strokes are delivered to their real passive hand, which is hidden from view behind a screen. After a period of repeated synchronized strokes, most participants start to feel the rubber hand as their own and sense the touches of the paintbrush on the rubber hand where they see the model hand being stroked. The illusion depends on the match between vision and somatosensation and is triggered when the observed strokes match the sensed strokes on the hidden real hand and when the two hands are placed sufficiently close and in similar positions. A large body of behavioral research has characterized the temporal (Shimada et al., 2009; Shimada et al., 2014), spatial (Lloyd, 2007; Preston, 2013), and other (e.g. form and texture; Filippetti et al., 2019; Holmes et al., 2006; Lin and Jörg, 2016; Lira et al., 2017; Tieri et al., 2015; Ward et al., 2015) rules that determine the elicitation of the rubber hand illusion and has found that these rules are reminiscent of the spatial and temporal congruence principles of multisensory integration (Ehrsson, 2012; Kilteni et al., 2015). Moreover, neuroimaging studies associate body ownership changes experienced under the rubber hand illusion with activations of multisensory brain regions (Ehrsson et al., 2004; Guterstam et al., 2019a; Limanowski and Blankenburg, 2016). However, we still know very little about the perceptual decision process that determines whether sensory signals should be combined into a coherent own-body representation or not, i.e., the multisensory binding problem that lies at the heart of body ownership and the distinction between the self and nonself.

The current study goes beyond the categorical comparisons of congruent and incongruent conditions that have dominated the body representation literature and introduces a quantitative model-based approach to investigate the computational principles that determine body ownership perception. Descriptive models (e.g. Gaussian fit) traditionally used in psychophysics experiments are useful to provide detailed statistical summaries of the data. These models describe ‘what’ perception emerges in response to stimulation without making assumptions about the underlying sensory processing. By contrast, computational approaches using process models make quantitative assumptions about ‘how’ the final perception is generated from sensory stimulation. Among these types of models, Bayesian causal inference (BCI) models (Körding et al., 2007) have recently been used to explain the multisensory perception of external objects (Cao et al., 2019; Kayser and Shams, 2015; Rohe et al., 2019), including the integration of touch and vision (Badde et al., 2020). The interest in this type of model stems from the fact that it provides a formal solution to the problem of deciding which sensory signals should be bound together and which should be segregated in the process of experiencing coherent multisensory objects and events. In BCI models, the most likely causal structure of multiple sensory events is estimated based on spatiotemporal correspondence, sensory uncertainty, and prior perceptual experiences; this inferred causal structure then determines to what extent sensory signals should be integrated with respect to their relative reliability.

In recent years, it has been proposed that this probabilistic model could be extended to the sense of body ownership and the multisensory perception of one’s own body (Fang et al., 2019; Kilteni et al., 2015; Samad et al., 2015). In the case of the rubber hand illusion, the causal inference principle predicts that the rubber hand should be perceived as part of the participant’s own body if a common cause is inferred for the visual, tactile, and proprioceptive signals, meaning that the real hand and rubber hand are perceived as the same. Samad et al., 2015 developed a BCI model for the rubber hand illusion based on the spatiotemporal characteristics of visual and somatosensory stimulation but did not quantitatively test this model. These authors used congruent and incongruent conditions and compared questionnaire ratings and skin conductance responses obtained in a group of participants (group level) to the model simulations; however, they did not fit their model to individual responses, i.e., did not quantitatively test the model. Fang et al., 2019 conducted quantitative model testing, but a limitation of their work is that they did not use body ownership perceptual data but an indirect behavioral proxy of the rubber hand illusion (reaching error) that could reflect processes other than body ownership (arm localization for motor control). More precisely, these authors developed a visuoproprioceptive rubber hand illusion based on the action of reaching for external visual targets. The error in the reaching task, induced by manipulating the spatial disparity between the image of the arm displayed on a screen and the subject’s (a monkey or human) real unseen arm, was successfully described by a causal inference model. In this model, the spatial discrepancy between the seen and felt arms is taken into account to determine the causal structure of these sensory stimuli. 
The inferred causal structure determines to what extent vision and proprioception are integrated in the final percept of arm location; this arm location estimate influences the reaching movement by changing the planned action’s starting point. Although such motor adjustments to perturbations in sensory feedback do not equate to the sense of body ownership, in the human participants, the model’s outcome was significantly correlated with the participants’ subjective ratings of the rubber hand illusion. While these findings are interesting (Ehrsson and Chancel, 2019), the evidence for a causal inference principle governing body ownership remains indirect, using the correlation between reaching performance and questionnaire ratings of the rubber hand illusion instead of a quantitative test of the model based on perceptual judgments of body ownership.

Thus, the present study’s first goal was to test whether body ownership is determined by a Bayesian inference of a common cause. We developed a new psychophysics task based on the classical rubber hand illusion to allow for a trial-by-trial quantitative assessment of body ownership perception and then fitted a BCI model to the individual-level data. Participants performed a detection-like task focused on the ownership they felt over a rubber hand within a paradigm where the tactile stimulation they felt on their real hidden hand was synchronized with that of the rubber hand or systematically delayed or advanced in intervals of 0–500 ms. We calculated the percentage of trials in which participants felt the rubber hand as theirs for each degree of asynchrony. A Bayesian observer (or ‘senser’, as the rubber hand illusion creates a bodily illusion that one feels) would perceive the rubber hand as their own hand when the visual and somatosensory signals are inferred as coming from a common source, a single hand. In this BCI for body ownership model (which we refer to as the ‘BCI model’), the causal structure is inferred by comparing the absolute value of the measured asynchrony between the participants’ seen and felt touches to a criterion that depends on the prior probability of a common source for vision and somatosensation.
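The criterion comparison described above can be written out explicitly under one common set of assumptions. These assumptions are not spelled out in this section, so what follows is a sketch rather than the exact model specification: a common cause (C = 1) implies zero true asynchrony, separate causes (C = 2) imply an asynchrony drawn from a zero-mean Gaussian with SD \sigma_S, and the measured asynchrony x carries Gaussian noise with SD \sigma. The symbols x, \sigma_S, d, and k are introduced here for illustration.

```latex
\begin{align}
p(x \mid C=1) &= \mathcal{N}\!\left(x;\,0,\,\sigma^2\right), \qquad
p(x \mid C=2) = \mathcal{N}\!\left(x;\,0,\,\sigma^2+\sigma_S^2\right) \\
d &= \log\frac{p(C=1\mid x)}{p(C=2\mid x)}
   = \log\frac{p_{\text{same}}}{1-p_{\text{same}}}
   + \frac{1}{2}\log\frac{\sigma^2+\sigma_S^2}{\sigma^2}
   - \frac{x^2}{2}\,\frac{\sigma_S^2}{\sigma^2\left(\sigma^2+\sigma_S^2\right)} \\
d > 0 \;&\Longleftrightarrow\; |x| < k, \qquad
k^2 = \frac{\sigma^2\left(\sigma^2+\sigma_S^2\right)}{\sigma_S^2}
\left[\,2\log\frac{p_{\text{same}}}{1-p_{\text{same}}}
 + \log\frac{\sigma^2+\sigma_S^2}{\sigma^2}\right]
\end{align}
```

Under these assumptions, the criterion k increases with the prior p_same and depends on the sensory noise \sigma; if the bracketed term is negative, no measurement favors a common cause.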

A second key aim was to test whether sensory uncertainty influences the inference of a common cause for the rubber hand illusion, which is a critical prediction of the BCI models not tested in earlier studies (Fang et al., 2019; Samad et al., 2015). Specifically, a Bayesian observer would take into account trial-to-trial fluctuations in sensory uncertainty when making perceptual decisions, changing their decision criterion in a specific way as a function of the sensory noise level of the current trial (Keshvari et al., 2012; Körding et al., 2007; Magnotti et al., 2013; Qamar et al., 2013; Zhou et al., 2020). Alternatively, the observer might incorrectly assume that sensory noise does not change or might ignore variations in sensory uncertainty. Such an observer would make a decision regarding whether the rubber hand is theirs or not based on a fixed criterion (FC) that does not depend on sensory uncertainty. Suboptimal but potentially ‘easy-to-implement’ observer models using an FC decision rule have often been used to challenge Bayesian models of perception (Badde et al., 2020; Qamar et al., 2013; Rahnev et al., 2011; Stengård and van den Berg, 2019; Zhou et al., 2020). To address whether humans optimally adjust their perceptual decisions to the level of sensory uncertainty when inferring a common cause for body ownership, we varied the level of sensory noise from trial to trial and determined how well our BCI model fit the data compared with an FC model.

Finally, we directly compared body ownership and a basic multisensory integration task within the same computational modeling framework. Multisensory synchrony judgment is a widely used task to examine the integration versus segregation of signals from different sensory modalities (Colonius and Diederich, 2020), and such synchrony perception follows BCI principles (Adam and Noppeney, 2014; Magnotti et al., 2013; Noel et al., 2018; Noppeney and Lee, 2018; Shams et al., 2005). Thus, we reasoned that by comparing ownership and synchrony perceptions, we could directly test our assumption that both types of multisensory percepts follow similar probabilistic causal inference principles and identify differences that can advance our understanding of the relationships of the two (see further information below). To this end, we collected both visuotactile synchrony judgments and body ownership judgments of the same individuals under the same conditions; only instructions regarding which perceptual feature to detect – hand ownership or visuotactile synchrony – differed. Thus, we fit both datasets using our BCI model. We modeled shared sensory parameters and lapses for both tasks as we applied the same experimental stimulations to the same participants, and we compared having a shared prior for both tasks versus having separate priors for each task and expected the latter to improve the model fit (see below). Furthermore, we tested whether the estimates of prior probabilities for a common cause in the ownership and synchrony perceptions were correlated in line with earlier observations of correlations between descriptive measures of the rubber hand illusion and individual sensitivity to asynchrony (Costantini et al., 2016; Shimada et al., 2014). 
We also expected the prior probability of a common cause to be systematically higher for body ownership than for synchrony detection; this a priori greater tendency to integrate vision and touch for body ownership would explain how the rubber hand illusion could emerge despite the presence of noticeable visuotactile asynchrony (Shimada et al., 2009; Shimada et al., 2014). In the rubber hand illusion paradigm, the rubber hand’s placement corresponds with an orientation and location highly probable for one’s real hand, a position that we often adopt on a daily basis. We theorized that such previous experience likely facilitates the emergence of the rubber hand illusion (Samad et al., 2015) while not necessarily influencing visuotactile simultaneity judgments (Smit et al., 2019).

Our behavioral and modeling results support the predictions made for the three main aims described above. Thus, collectively, our findings establish the uncertainty-based inference of a common cause for multisensory integration as a computational principle for the sense of body ownership.

Results

Behavioral results

In this study, participants performed a detection-like task on the ownership they felt toward a rubber hand; the tactile stimulation they felt on their hidden real hand (taps) was synchronized with the taps applied to the rubber hand that they saw or systematically delayed (negative asynchronies) or advanced (positive asynchronies) by 150, 300, or 500 ms. Participants were instructed to answer ‘yes’ or ‘no’ to the statement ‘the rubber hand felt like it was my hand’. For each degree of asynchrony, the percentage of trials in which the participants felt like the rubber hand was theirs was determined (Figure 1A). Three different noise conditions were tested, corresponding to 0, 30, and 50% of visual noise being displayed via augmented reality glasses (see Materials and methods). The rubber hand illusion was successfully induced in the synchronous condition: the participants reported perceiving the rubber hand as their own hand in 94 ± 2% (mean ± SEM) of the 12 trials when the visual and tactile stimulations were synchronous; more precisely, 93 ± 3%, 96 ± 2%, and 95 ± 2% of responses were ‘yes’ responses for the conditions with 0, 30, and 50% visual noise, respectively. Moreover, for every participant, increasing the asynchrony between the seen and felt taps decreased the prevalence of the illusion. When the rubber hand was touched 500 ms before the real hand, the illusion was reported in only 20 ± 5% of the 12 trials (noise level 0: 13 ± 4%, noise level 30: 21 ± 5%, and noise level 50: 26 ± 7%); when the rubber hand was touched 500 ms after the real hand, the illusion was reported in only 19 ± 6% of the 12 trials (noise level 0: 10 ± 3%, noise level 30: 18 ± 5%, and noise level 50: 29 ± 6%; main effect of asynchrony: F[6, 84]=5.97, p<0.001; for the individuals’ response plots, see Figure 2—figure supplements 1–4). 
Moreover, regardless of asynchrony, the participants perceived the illusion more often when the level of visual noise increased (F[2, 28]=22.35, p<0.001; Holm’s post hoc test: noise level 0 versus noise level 30: p=0.018, davg = 0.4; noise level 30 versus noise level 50: p=0.005, davg = 0.5; noise level 0 versus noise level 50: p<0.001, davg = 1; Figure 1B). The next step was to examine whether these behavioral results, including the increased emergence of the rubber hand illusion with visual noise, can be accounted for by BCI principles.

Elicited rubber hand illusion under different levels of visual noise.

(A) Colored dots represent the mean reported proportion of elicited rubber hand illusions (± SEM) for each asynchrony for the 0 (black), 30 (orange), and 50% (red) noise conditions. (B) Bars represent how many times in the 84 trials the participants answered ‘yes (the rubber hand felt like my own hand)’ under the 0 (black), 30 (orange), and 50% (red) noise conditions; gray dots are individual data points. There was a significant increase in the number of ‘yes’ answers when the visual noise increased (*p<0.001).

Figure 1—source data 1

Sum of "yes" answers for the different asynchrony and noise levels tested in the body ownership judgment task used in Figure 1.

https://cdn.elifesciences.org/articles/77221/elife-77221-fig1-data1-v3.xlsx

BCI model fit to body ownership

Our main causal inference model, the BCI model, assumes that the observer infers the causal structure of the visual and tactile signal to decide to what extent they should be merged into one coherent percept. In this model, the inference depends on the prior probability of the common cause and the trial-to-trial sensory uncertainty. Thus, this model has five free parameters: psame is the prior probability of a common cause for vision and touch, independent of any sensory stimulation, σ0, σ30, σ50 correspond to the noise impacting the measured visuotactile asynchrony in each of the three noise conditions, and λ is the lapse rate to account for random guesses and unintended responses (see Materials and methods and Appendix 1 for more details). This BCI model fits the observed data well (Figure 2A). This finding supports our hypothesis that the sense of body ownership is based on an uncertainty-based inference of a common cause. Three further observations can be noted. First, the probability of a common cause for the visual and tactile stimuli psame exceeded 0.5 (mean ± SEM: 0.80±0.05), meaning that in the context of body ownership, observers seemed to assume that vision and touch were more likely to come from one source than from different sources. This result broadly corroborates previous behavioral observations that the rubber hand illusion can emerge despite considerable sensory conflicts, for example, visuotactile asynchrony of up to 300 ms (Shimada et al., 2009). Second, the estimates for the sensory noise σ increased with the level of visual white noise: 116±13 ms, 141±25 ms, and 178±33 ms for the 0, 30, and 50% visual noise conditions, respectively (mean ± SEM); this result echoes the increased sensory uncertainty induced by our experimental manipulation. 
Finally, the averaged lapse rate estimate λ was rather low, 0.08±0.04, as expected for this sort of detection-like task, when participants were performing the task according to the instructions (see Figure 2—figure supplement 1 for individual fit results).
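As an illustration of how the five parameters could generate a response probability, the following minimal Python sketch computes the probability of a ‘yes’ response for a given true asynchrony. It rests on assumptions that this section does not state: a common cause implies zero asynchrony, separate causes imply a zero-mean Gaussian asynchrony with SD `sigma_s` (the value 350 ms is a hypothetical placeholder), and lapses are random guesses with equal probability of ‘yes’ and ‘no’. All function and variable names are hypothetical.

```python
import math

SIGMA_S_MS = 350.0  # hypothetical SD of asynchronies under separate causes (assumption)


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bci_criterion(p_same: float, sigma: float, sigma_s: float = SIGMA_S_MS) -> float:
    """Criterion k on |x| below which a common cause is the more probable one.

    Derived from the log posterior ratio of the two causal structures under
    the Gaussian assumptions named in the lead-in. Returns 0.0 when no
    measurement would favor a common cause.
    """
    var, var_s = sigma ** 2, sigma_s ** 2
    k_squared = (var * (var + var_s) / var_s) * (
        2.0 * math.log(p_same / (1.0 - p_same)) + math.log((var + var_s) / var)
    )
    return math.sqrt(k_squared) if k_squared > 0.0 else 0.0


def bci_p_yes(s: float, p_same: float, sigma: float, lapse: float,
              sigma_s: float = SIGMA_S_MS) -> float:
    """Probability of reporting 'yes' at true asynchrony s (in ms)."""
    k = bci_criterion(p_same, sigma, sigma_s)
    # Probability that the noisy measurement x ~ N(s, sigma^2) falls in [-k, k].
    p_common = normal_cdf((k - s) / sigma) - normal_cdf((-k - s) / sigma)
    # Lapse trials are treated as coin flips (assumption).
    return lapse / 2.0 + (1.0 - lapse) * p_common


if __name__ == "__main__":
    # Group-mean estimates quoted in the text, used only to illustrate the shape
    # of the predictions; they are not a substitute for the individual fits.
    for sigma in (116.0, 141.0, 178.0):
        row = [round(bci_p_yes(s, 0.80, sigma, 0.08), 2) for s in (0, 150, 300, 500)]
        print(sigma, row)
```

In this sketch the criterion widens as `sigma` grows, so the same asynchrony is more likely to be attributed to a common cause under higher noise, which is the qualitative pattern reported in the behavioral results.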

Figure 2 (with 5 supplements)
Observed and predicted detection responses for body ownership in the rubber hand illusion.

Bars represent how many times across the 84 trials participants answered ‘yes’ in the 0 (black), 30 (orange), and 50% (red) noise conditions (mean ± SEM). Lighter polygons denote the Bayesian causal inference (BCI) model predictions (A) and fixed-criterion (FC) model predictions (C) for the different noise conditions. Observed data refer to 0 (black dots), 30 (orange dots), and 50% (red dots) visual noise and corresponding predictions (mean ± SEM; gray, yellow, and red shaded areas, respectively) for the BCI model (B) and FC model (D).

Comparing the BCI model to Bayesian and non-Bayesian alternative models

Next, we compared our BCI model to alternative models (see Materials and methods and Appendix 1). First, we observed that adding an additional parameter to account for observer-specific stimulation uncertainty in the BCI* model did not improve the fit of the BCI model (Table 1, Figure 2—figure supplement 3). This observation suggests that assuming the observer’s assumed stimulus distribution has the same SD as the true stimulus distribution was reasonable, i.e., allowing a participant-specific value for σS did not improve the fit of our model enough to compensate for the loss of parsimony.

Table 1
Bootstrapped CIs (95% CI) of the Akaike information criterion (AIC) and Bayesian information criterion (BIC) differences between our main model Bayesian causal inference (BCI) and the BCI* (first line) and fixed criterion (FC; second line) models.

A negative value means that the BCI model is a better fit. Thus, the BCI model outperformed the other two.

Model comparison | AIC lower bound (95% CI) | AIC raw sum | AIC upper bound (95% CI) | BIC lower bound (95% CI) | BIC raw sum | BIC upper bound (95% CI)
BCI – BCI* | –28 | –25 | –21 | –81 | –77 | –74
BCI – FC | –116 | –65 | –17 | –116 | –65 | –17
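The differences in Table 1 are sums across participants with bootstrapped confidence intervals. A minimal Python sketch of that kind of computation is given here. The resampling unit (participants), the number of resamples, and the percentile method are assumptions, since this section reports only the resulting intervals; all names are hypothetical.

```python
import math
import random


def aic(log_lik: float, n_params: int) -> float:
    """Akaike information criterion for one participant and one model."""
    return 2.0 * n_params - 2.0 * log_lik


def bic(log_lik: float, n_params: int, n_trials: int) -> float:
    """Bayesian information criterion for one participant and one model."""
    return n_params * math.log(n_trials) - 2.0 * log_lik


def bootstrap_sum_ci(diffs, n_boot=10000, alpha=0.05, seed=0):
    """Percentile bootstrap CI of the summed per-participant differences.

    diffs: one criterion difference (model A minus model B) per participant.
    Returns (lower bound, raw sum, upper bound).
    """
    rng = random.Random(seed)
    n = len(diffs)
    # Resample participants with replacement and sum their differences.
    sums = sorted(sum(rng.choice(diffs) for _ in range(n)) for _ in range(n_boot))
    lower = sums[int((alpha / 2.0) * n_boot)]
    upper = sums[int((1.0 - alpha / 2.0) * n_boot) - 1]
    return lower, sum(diffs), upper
```

When two models have the same number of free parameters, the penalty terms cancel and the AIC and BIC differences coincide, as in the BCI – FC row.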
Second, an important alternative to the Bayesian model is a model that ignores variations in sensory uncertainty when judging if the rubber hand is one’s own, for example, because the observer incorrectly assumes that sensory noise does not change. This second alternative model based on a fixed decisional criterion is the FC model. The goodness of fit of the BCI model was found to be higher than that of the FC model (Figure 2, Table 1, and Figure 2—figure supplement 2). This result shows that the BCI model provides a better explanation for the ownership data than the simpler FC model that does not take into account the sensory uncertainty in the decision process.

Finally, the pseudo-R2 values were of the same magnitude for each model (mean ± SEM: BCI = 0.62 ± 0.04, BCI* = 0.62 ± 0.04, FC = 0.60 ± 0.05). However, the exceedance probability analysis confirmed the superiority of the Bayesian models over the fixed-criterion one for the ownership data (family exceedance probability [EP]: Bayesian: 0.99, FC: 0.0006; when comparing our main model to the FC: protected-EPFC = 0.13, protected-EPBCI = 0.87, posterior probabilities: RFX: p[H1|y] = 0.740, null: p[H0|y] = 0.260).
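To make the contrast with a fixed-criterion observer concrete, here is a minimal Python sketch of an FC response rule. It assumes the same Gaussian measurement noise as a Bayesian observer, but the criterion is a free parameter that does not change with the noise level. The lapse handling (coin-flip guesses) and all names are assumptions for illustration.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def fc_p_yes(s: float, criterion: float, sigma: float, lapse: float) -> float:
    """Probability of reporting 'yes' at true asynchrony s (in ms).

    The observer answers 'yes' when the measured asynchrony x ~ N(s, sigma^2)
    satisfies |x| < criterion. The criterion is the same for every noise level.
    """
    p_within = normal_cdf((criterion - s) / sigma) - normal_cdf((-criterion - s) / sigma)
    return lapse / 2.0 + (1.0 - lapse) * p_within


if __name__ == "__main__":
    k_fixed = 275.0  # hypothetical criterion in ms
    for sigma in (116.0, 178.0):
        print(sigma, [round(fc_p_yes(s, k_fixed, sigma, 0.0), 2) for s in (0, 300, 500)])
```

With a fixed criterion, higher noise still broadens the spread of measurements, so in this sketch ‘yes’ responses become less frequent at zero asynchrony and more frequent at large asynchronies. A Bayesian observer differs in that the criterion itself is recomputed from the current noise level.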

Comparison of the body ownership and synchrony tasks

The final part of our study focused on the comparison of causal inferences of body ownership and visuotactile synchrony detection. In an additional task, participants were asked to decide whether the visual and tactile stimulation they received happened at the same time, i.e., whether the felt and seen touches were synchronous or not. The procedure was identical to the body ownership detection task apart from a critical difference in the instructions, which was now to detect if the visual and tactile stimulations were synchronous (instead of judging illusory rubber hand ownership).

Extension analysis results (Table 2 and Figure 3 and Figure 3—figure supplement 1)

Table 2
Bootstrapped CIs (95% CI) for the Akaike information criterion (AIC) and Bayesian information criterion (BIC) differences between shared and different psame values for the Bayesian causal inference (BCI) model in the extension analysis.

A negative value means that the model with different psame values is a better fit.

Model comparison | AIC lower bound (95% CI) | AIC raw sum | AIC upper bound (95% CI) | BIC lower bound (95% CI) | BIC raw sum | BIC upper bound (95% CI)
Different psame – shared parameters | –597 | –352 | –147 | –534 | –289 | –83
Figure 3 (with 3 supplements)
Extension analysis results.

(A) Correlation between the prior probability of a common cause psame estimated for the ownership and synchrony tasks in the extension analysis. The psame estimate is significantly lower for the synchrony task than for the ownership task. The solid line represents the linear regression between the two estimates, and the dashed line represents the identity. Numbers denote the participants’ numbers. (B and C) Colored dots represent the mean reported proportion of perceived synchrony for visual and tactile stimulation for each asynchrony under the 0 (purple), 30 (blue), and 50% (light blue) noise conditions (±SEM). Lighter shaded areas show the corresponding Bayesian causal inference (BCI) model predictions made when all parameters are shared between the ownership and synchrony data (B) and when psame is estimated separately for each dataset (C) for the different noise conditions (see also Figure 3—figure supplement 1).

Figure 3—source data 1

Parameter estimates for the extension and transfer analysis and collected answers in the synchrony detection tasks used in Figure 3.

https://cdn.elifesciences.org/articles/77221/elife-77221-fig3-data1-v3.xlsx

The BCI model fits the combined dataset from both ownership and synchrony tasks well (Figure 3B and C and Figure 3—figure supplement 1). Since the model used identical parameters (or identical parameters except for one), this observation supports the hypothesis that both the rubber hand illusion and visuotactile synchrony perception are determined by similar multisensory causal inference processes. However, in agreement with one of our other hypotheses, the goodness of fit of the model improved greatly when the probability of a common cause (psame) differed between the two tasks (Table 2). Importantly, psame was significantly lower for the synchrony judgment task (mean ± SEM: 0.65±0.04) than for the ownership judgment task (mean ± SEM: 0.83±0.04, paired t-test: t=5.9141, df = 14, and p<0.001). This relatively stronger a priori probability for a common cause for body ownership compared to visuotactile synchrony judgments supports the notion that body ownership and visuotactile event synchrony correspond to distinct multisensory perceptions, albeit ones determined by similar probabilistic causal inference principles. Finally, in line with our hypothesis, we found that the psame values estimated separately for the two tasks were correlated (Pearson correlation: p=0.002, cor = 0.71; Figure 3A). That is, individuals who displayed a higher prior probability of combining the basic tactile and visual signals and perceiving the visuotactile synchrony of the events also showed a greater likelihood of combining multisensory signals in the ownership task and experiencing the rubber hand illusion. This observation corroborates the link between visuotactile synchrony detection and body ownership perception and provides a new computational understanding of how individual differences in multisensory integration can explain individual differences in the rubber hand illusion.
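The two variants compared in the extension analysis can be sketched as one joint likelihood in which the sensory noise and lapse parameters are shared across tasks and the prior is either shared or task-specific. The Python sketch below is illustrative only. It assumes a binomial likelihood over ‘yes’ counts, zero asynchrony under a common cause, a zero-mean Gaussian asynchrony with a hypothetical SD of 350 ms under separate causes, and coin-flip lapses. All names and the data layout are hypothetical.

```python
import math

SIGMA_S_MS = 350.0  # hypothetical SD of asynchronies under separate causes (assumption)


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def p_yes(s, p_same, sigma, lapse, sigma_s=SIGMA_S_MS):
    """Probability of a 'yes' response at true asynchrony s (in ms)."""
    var, var_s = sigma ** 2, sigma_s ** 2
    k_sq = (var * (var + var_s) / var_s) * (
        2.0 * math.log(p_same / (1.0 - p_same)) + math.log((var + var_s) / var)
    )
    k = math.sqrt(k_sq) if k_sq > 0.0 else 0.0
    p_common = normal_cdf((k - s) / sigma) - normal_cdf((-k - s) / sigma)
    return lapse / 2.0 + (1.0 - lapse) * p_common


def task_log_lik(data, p_same, sigmas, lapse):
    """Binomial log-likelihood for one task (constant term dropped).

    data: iterable of (asynchrony_ms, noise_level, n_yes, n_trials).
    sigmas: dict mapping each noise level to its sensory noise SD in ms.
    """
    total = 0.0
    for s, noise, n_yes, n_trials in data:
        # Clip to avoid log(0) at extreme predicted probabilities.
        p = min(max(p_yes(s, p_same, sigmas[noise], lapse), 1e-9), 1.0 - 1e-9)
        total += n_yes * math.log(p) + (n_trials - n_yes) * math.log(1.0 - p)
    return total


def joint_neg_log_lik(ownership, synchrony, sigmas, lapse, p_same_own, p_same_sync=None):
    """Joint negative log-likelihood with sigmas and lapse shared across tasks.

    Leaving p_same_sync as None uses a single shared prior for both tasks.
    """
    if p_same_sync is None:
        p_same_sync = p_same_own
    return -(task_log_lik(ownership, p_same_own, sigmas, lapse)
             + task_log_lik(synchrony, p_same_sync, sigmas, lapse))
```

Minimizing this function once with a shared prior and once with separate priors yields the two log-likelihoods that enter an AIC or BIC comparison of the kind shown in Table 2; the optimizer itself is left out of the sketch.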

Transfer analysis results (Table 3, Figure 3—figure supplement 2)

Table 3
Bootstrapped CIs (95% CIs) of the Akaike information criterion (AIC) and Bayesian information criterion (BIC) differences between the partial and full transfer analyses for the Bayesian causal inference (BCI) model.

‘O to S’ corresponds to the fitting of synchrony data by the BCI model estimates from ownership data. ‘S to O’ corresponds to the fitting of ownership data by the BCI model estimates from synchrony data. A negative value means that the partial transfer model is a better fit.

Transfer direction | AIC lower bound (partial – full transfer, 95% CI) | AIC raw sum | AIC upper bound | BIC lower bound (partial – full transfer, 95% CI) | BIC raw sum | BIC upper bound
O to S | –1837 | –1051 | –441 | –1784 | –998 | –388
S to O | –1903 | –1110 | –448 | –1851 | –1057 | –394

Finally, we compared the body ownership and synchrony tasks using what we call a transfer analysis. We used the parameters estimated for the ownership task to fit the synchrony task data (O to S) or the parameters estimated for the synchrony task to fit the ownership task data (S to O). Leaving psame as a free parameter always led to a much better fit of the data, as displayed in Table 3 (see also Figure 3—figure supplement 2). Thus, this analysis leads us to the same conclusion as that of the extension analysis. The body ownership task and synchrony task involved different processing of the visual and somatosensory signals for the participants, and this difference in behavioral responses was well captured when two different a priori probabilities for a common cause were used to model each task.

Note that the exceedance probability analysis also confirmed the superiority of the Bayesian models over the FC one for the synchrony data when analyzed separately from the ownership data (family exceedance probability: Bayesian: 0.71, FC: 0.29; when comparing our main model to the FC: protected-EPFC=0.46, protected-EPBCI=0.54, posterior probabilities: RFX (random-effect analysis): p[H1|y]=0.860, null: p[H0|y]=0.140). Further details about the behavioral results for the synchrony judgment task can be found in Figure 3—figure supplement 3.

Discussion

The main finding of the present study is that body ownership perception can be described as a causal inference process that takes into account sensory uncertainty when determining whether an object is a part of one’s own body or not. Participants performed a detection-like task on the ownership they felt over a rubber hand placed in full view in front of them in our version of the rubber hand illusion paradigm that involved the use of psychophysics, robotically controlled sensory stimulation, and augmented reality glasses (to manipulate visual noise); the tactile stimulation that the participants felt on their own hidden hand was synchronized with the taps applied to the rubber hand that they saw or systematically delayed or advanced. For each degree of asynchrony, the percentage of trials for which the participants felt like the rubber hand was theirs was determined. We found that the probability of the emergence of the rubber hand illusion was better predicted by a Bayesian model that takes into account the trial-by-trial level of sensory uncertainty to calculate the probability of a common cause for vision and touch given their relative onset time than by a non-Bayesian (FC) model that does not take into account sensory uncertainty. Furthermore, in comparing body ownership and visuotactile synchrony detection, we found interesting differences and similarities that advance our understanding of how the perception of multisensory synchrony and body ownership are related at the computational level and how individual differences in the rubber hand illusion can be explained as individual differences in causal inference. Specifically, the prior probability of a common cause was found to be higher for ownership than for synchrony detection, and the two prior probabilities were found to be correlated across individuals. We conclude that body ownership is a multisensory perception of one’s own body determined by an uncertainty-based probabilistic inference of a common cause.

Body ownership perception predicted by inference of a common cause

One of the strengths of the present study lies in its direct, individual-level testing of a causal inference model on body ownership perceptual data. This novel means to quantify the rubber hand illusion based on psychophysics is more appropriate for computational studies focused on body ownership than traditional measures such as questionnaires or changes in perceived hand position (proprioceptive drift). Previous attempts made to apply BCI to body ownership were conducted at the group level by the categorical comparison of experimental conditions (Samad et al., 2015); however, such a group-level approach does not properly challenge the proposed models as required according to standards in the field of computational behavioral studies. The only previous study that used quantitative Bayesian model testing analyzed target-reaching error in a virtual reality version of the rubber hand illusion (Fang et al., 2019), but reaching errors tend to be relatively small, and it is unclear how well the reaching errors correlate with the subjective perception of the illusion (Heed et al., 2011; Kammers et al., 2009; Newport et al., 2010; Newport and Preston, 2011; Rossi Sebastiano et al., 2022; Zopf et al., 2011). Thus, the present study contributes to our computational understanding of body ownership as the first direct fit of the BCI model to individual-level ownership sensations judged under the rubber hand illusion.

Computational approaches to body ownership can lead to a better understanding of the multisensory processing involved in this phenomenon than traditional descriptive approaches. The BCI framework informs us about how various sensory signals and prior information about body states are integrated at the computational level. Previous models of body ownership focus on temporal and spatial congruence rules and temporal and spatial ‘windows of integration’; if visual and somatosensory signals occur within a particular time window (Shimada et al., 2009; Costantini et al., 2016) and within a certain spatial zone (Lloyd, 2007; Brozzoli et al., 2012), the signals will be combined, and the illusion will be elicited (Ehrsson, 2012; Tsakiris, 2010; Makin et al., 2008). However, these models do not detail how this happens at the computational level or explain how the relative contribution of different sensory signals and top-down prior information dynamically changes due to changes in uncertainty. Instead of occurring due to a sequence of categorical comparisons as proposed by Tsakiris, 2010 or by a set of rigid temporal and spatial rules based on receptive field properties of multisensory neurons as implied by Ehrsson, 2012 or Makin et al., 2008, body ownership under the rubber hand illusion arises as a consequence of a probabilistic computational process that infers the rubber hand as the common cause of vision and somatosensation by dynamically taking into account all available sensory evidence given their relative reliability and prior information. The causal inference model further has greater predictive power than classical descriptive models in that it makes quantitative predictions about how illusion perception will change across a wide range of temporal asynchronies and changes in sensory uncertainty.
For example, the ‘time window of integration’ model – which is often used to describe the temporal constraint of multisensory integration (Meredith et al., 1987; Stein and Meredith, 1993) – only provides temporal thresholds (asynchrony between two sensory inputs) above which multisensory signals will not be integrated (Colonius and Diederich, 2004). In contrast, the present causal inference model explains how information from such asynchronies is used together with prior information and estimates of uncertainty to infer that the rubber hand is one’s own or not. Even though the present study focuses on temporal visuotactile congruence, spatial congruence (Fang et al., 2019; Samad et al., 2015) and other types of multisensory congruences (e.g. Ehrsson et al., 2005; Tsakiris et al., 2010; Ide, 2013; Crucianelli and Ehrsson, 2022) would naturally fit within the same computational framework (Körding et al., 2007; Sato et al., 2007). Thus, in extending beyond descriptive models of body ownership, our study supports the idea that individuals use probabilistic representations of their surroundings and their own body that take into account information about sensory uncertainty to infer the causal structure of sensory signals and optimally process them to create a clear perceptual distinction between the self and nonself.

From a broader cognitive neuroscience perspective, causal inference models of body ownership can be used in future neuroimaging and neurophysiological studies to investigate the underlying neural mechanisms of the computational processes. For example, instead of simply identifying frontal, parietal, and subcortical structures that show higher activity in the illusion condition compared to control conditions that violate temporal and spatial congruence rules (Ehrsson et al., 2004; Ehrsson et al., 2005; Guterstam et al., 2013; Limanowski and Blankenburg, 2016; Guterstam et al., 2019a; Rao and Kayser, 2017), one can test the hypothesis that activity in key multisensory areas closely follows the predictions of the BCI model and correlates with specific parameters of this model. Such a model-based imaging approach, recently successfully used in audiovisual paradigms (Cao et al., 2019; Rohe and Noppeney, 2015; Rohe and Noppeney, 2016; Rohe et al., 2019), can thus afford us a deeper understanding of the neural implementation of the causal inference for body ownership. From previous neuroimaging work (Ehrsson et al., 2004; Guterstam et al., 2013; Limanowski and Blankenburg, 2016; Guterstam et al., 2019a), anatomical and physiological considerations based on nonhuman primate studies (Avillac et al., 2007; Graziano et al., 1997; Graziano et al., 2000; Fang et al., 2019), and a recent model-based fMRI study on body ownership judgments (Chancel et al., 2022), we theorize that neuronal populations in the posterior parietal cortex and premotor cortex could implement the computational processes of the uncertainty-based inference of a common cause of body ownership.

Observers take trial-to-trial sensory uncertainty into account in judging body ownership

The current study highlights the contribution of sensory uncertainty to body ownership by showing the superiority of a Bayesian model in predicting the emergence of the rubber hand illusion relative to a non-Bayesian model. Although BCI is an often-used model to describe multisensory processing from the behavioral to cerebral levels (Badde et al., 2020; Cao et al., 2019; Dokka et al., 2019; Kayser and Shams, 2015; Körding et al., 2007; Rohe et al., 2019; Rohe and Noppeney, 2015; Wozny et al., 2010), it is not uncommon to observe behaviors induced by sensory stimulation that diverge from strict Bayesian-optimal predictions (Beck et al., 2012). Some of these deviations from optimality can be explained by a contribution of sensory uncertainty to the perception that differs from that assumed under a Bayesian-optimal inference (Drugowitsch et al., 2016). Challenging the Bayesian-optimal assumption is thus a necessary good practice in computational studies (Jones and Love, 2011), and this is often done in studies of the perception of external sensory events, such as visual stimuli (Qamar et al., 2013; Stengård and van den Berg, 2019; Zhou et al., 2020). However, very few studies have investigated the role of sensory uncertainty in perceiving one’s own limbs from a computational perspective. Such studies explore the perception of limb movement trajectory (Reuschel et al., 2010), limb movement illusion (Chancel et al., 2016), or perceived static limb position (van Beers et al., 1999; van Beers et al., 2002) but not the sense of body ownership or similar aspects of the embodiment of an object. These studies assume the full integration of visual and somatosensory signals and describe how sensory uncertainty is taken into account when computing a single-fused estimate of limb movement or limb position. However, none of these previous studies investigate inferences about a common cause. 
A comparison between Bayesian and non-Bayesian models was also missing from the above-described studies of the rubber hand illusion and causal inference (Fang et al., 2019; Samad et al., 2015). Thus, the current results reveal how uncertainty influences the automatic perceptual decision to combine or segregate bodily related signals from different sensory modalities and that this inference process better follows Bayesian principles than non-Bayesian principles. While we have argued that people take into account trial-to-trial uncertainty when making their body ownership and synchrony judgments, it is also possible that they learn a criterion at each noise level (Ma and Jazayeri, 2014), as one might predict in standard signal detection theory. However, we believe this is unlikely because we used multiple interleaved levels of noise while withholding any form of experimental feedback. Thus, more broadly, our results advance our understanding of the multisensory processes that support the perception of one’s own body, as they serve as the first conclusive empirical demonstration of BCI in a bodily illusion. Such successful modeling of the multisensory information processing in body ownership is relevant for future computational work into bodily illusions and bodily self-awareness, for example, more extended frameworks that also include contributions of interoception (Azzalini et al., 2019; Park and Blanke, 2019), motor processes (Burin et al., 2015; Burin et al., 2017), pre-existing stored representations about what kinds of objects may or may not be part of one’s body (Tsakiris et al., 2010), expectations (Chancel et al., 2021; Guterstam et al., 2019b; Ferri et al., 2013), and high-level cognition (Lush et al., 2020; Slater and Ehrsson, 2022). Future quantitative computational studies like the present one are needed to formally compare these different theories of body ownership and advance the corresponding theoretical framework.

In the present study, we compared the Bayesian hypothesis to an FC model. FC strategies are simple heuristics that could arise from limited sensory processing resources. Our body plays such a dominant and critical role in our experience of the world that one could easily imagine the benefits of an easy-to-implement heuristic strategy for detecting what belongs to our body and what does not. Our body is more stable than our ever-changing environment, so in principle, a resource-effective and straightforward strategy for an observer could be to disregard, or not optimally compute, sensory uncertainty to determine whether an object in view is part of one’s own body or not. However, our analysis shows that the BCI model outperforms such a model. Thus, observers seem to take into account trial-to-trial sensory uncertainty to respond regarding their body ownership perception. More visual noise, i.e., increased visual uncertainty, increases the probability of the rubber hand illusion, consistent with the predictions of Bayesian probabilistic theory. Intuitively, this makes sense, as it is easier to mistake a partner’s hand for one’s own under poor viewing conditions (e.g. in semidarkness) than when viewing conditions are excellent. However, this basic effect of sensory uncertainty on own-body perception is not explained by classical descriptive models of the rubber hand illusion (Botvinick and Cohen, 1998; Tsakiris et al., 2010; Ehrsson, 2012; Makin et al., 2008). Thus, the significant impact of sensory uncertainty on the rubber hand illusion revealed here advances our understanding of the computational principles of body ownership and of bodily illusions and multisensory bodily perception more generally.
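
The direction of this noise effect can be illustrated with a minimal sketch of the posterior probability of a common cause for a single measured asynchrony. It assumes the generative model given in the Materials and methods (an asynchrony of 0 under a common cause, a normally distributed asynchrony with an SD of 348 ms otherwise, and Gaussian measurement noise). The function names, the prior of 0.5, and the measurement-noise SDs of 100, 200, and 300 ms are illustrative assumptions, not fitted estimates.

```python
import math

SIGMA_S_MS = 348.0  # SD of asynchronies under separate causes, as reported in the Methods


def normal_pdf(x, sd):
    """Density of a zero-mean normal distribution with standard deviation sd."""
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def posterior_common_cause(x_ms, sigma_ms, p_same, sigma_s_ms=SIGMA_S_MS):
    """p(C=1 | x) for a measured asynchrony x under the assumed Gaussian model."""
    # C=1: true asynchrony is 0, so x is measurement noise only.
    like_c1 = normal_pdf(x_ms, sigma_ms)
    # C=2: stimulus spread and measurement noise add in variance.
    like_c2 = normal_pdf(x_ms, math.sqrt(sigma_ms ** 2 + sigma_s_ms ** 2))
    return p_same * like_c1 / (p_same * like_c1 + (1 - p_same) * like_c2)


# Illustrative (hypothetical) noise levels; larger sigma stands for more visual noise.
for sigma in (100.0, 200.0, 300.0):
    print(sigma, round(posterior_common_cause(300.0, sigma, p_same=0.5), 3))
```

Under these assumed values, the posterior at a measured asynchrony of 300 ms rises as the measurement noise grows, which is the qualitative pattern described above. A full model would additionally average over the distribution of noisy measurements rather than evaluate a single one.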

Relationship between body ownership and synchrony perception

The final part of our study focused on the comparison of causal inferences of body ownership and visuotactile synchrony detection. Previous studies have already demonstrated that audiovisual synchrony detection can be explained by BCI (Adam and Noppeney, 2014; Magnotti et al., 2013; Noel et al., 2018; Noppeney and Lee, 2018; Shams et al., 2005). We successfully extend this principle to visuotactile synchrony detection in the context of a rubber hand illusion paradigm. The results of our extension analysis using both ownership and synchrony data suggest that both multisensory perceptions follow similar computational principles in line with our expectations and previous literature. Whether the rubber hand illusion influences synchrony perception was not investigated in the present study, as the goal was to design ownership and synchrony tasks to be as identical as possible for the modeling. However, the results from the previous literature diverge regarding the potential influence of body ownership on synchrony judgment (Ide and Hidaka, 2013; Maselli et al., 2016; Smit et al., 2019), so this issue deserves further investigation in future studies.

Body ownership and synchrony perception were better predicted when modeling different priors instead of a single shared prior. The goodness of fit of the BCI model is greatly improved when the a priori probability of a common cause is different for each task, even when the loss of parsimony due to an additional parameter is taken into account. This result holds whether the two datasets are fitted together (extension analysis), or the parameters estimated for one task are used to fit the other (transfer analysis). Specifically, the estimates of the a priori probability of a common cause were found to be smaller for the synchrony judgment than for the ownership judgment. This means that the degree of asynchrony had to be lower for participants to perceive the seen and felt taps as occurring simultaneously compared to the relatively broader degree of visuotactile asynchrony that still resulted in the illusory ownership of the rubber hand. This result suggests that a common cause for vision and touch outcomes is a priori more likely to be inferred for body ownership than for visuotactile synchrony. We believe that this makes sense, as a single cause for visual and somatosensory impressions in the context of the ownership of a human-like hand in an anatomically matched position in sight is a priori a more probable scenario than a common cause for brief visual and tactile events that in principle could be coincidental and stem from visual events occurring far from the body. This observation is also consistent with previous studies reporting the induction of the rubber hand illusion for visuotactile asynchronies of as long as 300 ms (Shimada et al., 2009), which are perceptually noticeable.
While it seems plausible that psame reflects the real-world prior probability of a common cause of the visual and somatosensory signals, it could also be influenced by experimental properties of the task, demand characteristics (participants forming beliefs based on cues present in a testing situation, Weber and Cook, 1972; Corneille and Lush, 2022; Slater and Ehrsson, 2022), and other cognitive biases.
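
To make the link between the prior and the tolerated asynchrony concrete, the sketch below computes the largest measured asynchrony at which a common cause is still the more probable explanation, i.e., the point where the posterior equals 0.5. It assumes the Gaussian generative model described in the Materials and methods and a decision rule of answering ‘yes’ when the posterior exceeds 0.5. The function name and the numerical values of psame and σ are hypothetical and chosen only for illustration.

```python
import math

SIGMA_S_MS = 348.0  # SD of asynchronies under separate causes, as reported in the Methods


def asynchrony_threshold(p_same, sigma_ms, sigma_s_ms=SIGMA_S_MS):
    """Return k such that p(C=1 | x) > 0.5 exactly when |x| < k (0.0 if never)."""
    var_c1 = sigma_ms ** 2                    # measurement variance when C=1
    var_c2 = sigma_ms ** 2 + sigma_s_ms ** 2  # measurement variance when C=2
    # Solving p_same * N(x; 0, var_c1) = (1 - p_same) * N(x; 0, var_c2) for x**2.
    numerator = 2 * math.log(p_same / (1 - p_same)) + math.log(var_c2 / var_c1)
    if numerator <= 0:
        return 0.0  # the prior is too low for a common cause to ever win
    return math.sqrt(numerator / (1 / var_c1 - 1 / var_c2))


# Hypothetical priors: a lower one (synchrony-like) and a higher one (ownership-like).
for p_same in (0.5, 0.8):
    print(p_same, round(asynchrony_threshold(p_same, sigma_ms=150.0), 1))
```

With the measurement noise held fixed, the higher prior yields the wider threshold, which mirrors the broader range of asynchronies that still produced ownership reports.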

How the a priori probabilities of a common cause under different perceptive contexts are formed remains an open question. Many studies have shown the importance of experience in shaping the prior (Adams et al., 2004; Chambers et al., 2017; Snyder et al., 2015), and recent findings also seem to point toward the importance of effectors in sensorimotor priors (Yin et al., 2019) and dynamical adjustment during a task (Prsa et al., 2015). In addition, priors for own-body perception could be shaped early during development (Bahrick and Watson, 1985; Bremner, 2016; Rochat, 1998) and influenced by genetic and anatomical factors related to the organization of cortical and subcortical maps and pathways (Makin and Bensmaia, 2017; Stein et al., 2014).

The finding that prior probabilities for a common cause were correlated for the ownership and synchrony data suggests a shared probabilistic computational process between the two multisensory tasks. This result could account for the previously observed correlation at the behavioral level between individual susceptibility to the rubber hand illusion and individual temporal resolution (‘temporal window of integration’) in visuotactile synchrony perception (Costantini et al., 2016). It is not that having a narrower temporal window of integration makes one more prone to detect visuotactile temporal mismatches leading to a weaker rubber hand illusion as the traditional interpretation assumes. Instead, our behavioral modeling suggests that the individual differences in synchrony detection and the rubber hand illusion can be explained by individual differences in how prior information on the likelihood of a common cause is used in multisensory causal inference. This probabilistic computational explanation for individual differences in the rubber hand illusion emphasizes differences in how information from prior knowledge, bottom-up sensory correspondence, and sensory uncertainty is combined in a perceptual inferential process rather than there being ‘hard-wired’ differences in temporal windows of integration or trait differences in top-down cognitive processing (Eshkevari et al., 2012; Germine et al., 2013; Marotta et al., 2016). It should be noted that other multisensory factors not studied in the present study can also contribute to individual differences in the rubber hand illusion, notably the relative reliability of proprioceptive signals from the upper limb (Horváth et al., 2020). The latter could be considered in future extensions of the current model that also consider the degree of spatial disparity between vision and proprioception and the role of visuoproprioceptive integration (Samad et al., 2015; Fang et al., 2019; Kilteni et al., 2015).

Conclusion

BCI models have successfully described many aspects of perception, decision making, and motor control, including sensory and multisensory perception of external objects and events. The present study extends this probabilistic computational framework to the sense of body ownership, a core aspect of self-representation and self-consciousness. Specifically, the study presents direct and quantitative evidence that body ownership detection can be described at the individual level by the inference of a common cause for vision and somatosensation, taking into account trial-to-trial sensory uncertainty. The fact that the brain seems to use the same probabilistic approach to interpret the external world and the self is of interest to Bayesian theories of the human mind (Ma and Jazayeri, 2014; Rahnev, 2019) and suggests that even our core sense of conscious bodily self (Blanke et al., 2015; Ehrsson, 2020; Tsakiris, 2017; de Vignemont, 2018) is the result of an active inferential process making ‘educated guesses’ about what we are.

Materials and methods

Participants

Eighteen healthy participants naïve to the conditions of the study were recruited for this experiment (six males, aged 25.2±4 years, right-handed; they were recruited from outside the department, never having taken part in a bodily illusion experiment before). Note that in computational studies such as the current one, the focus is on fitting and comparing models within participants, i.e., on rigorously quantifying perception at the single-subject level rather than relying only on statistical results at the group level. All volunteers provided written informed consent prior to their participation. All participants received 600 SEK (Swedish krona) as compensation for their participation (150 SEK per hour). All experiments were approved by the Swedish Ethics Review Authority (Ethics number 2018/471-31/2).

Inclusion test

In the main experiment, participants were asked to judge the ownership they felt toward the rubber hand. It was therefore necessary for them to be able to experience the basic rubber hand illusion. However, we know that approximately 20–25% of healthy participants do not report a clear and reliable rubber hand illusion (Kalckert and Ehrsson, 2014), and such participants are not able to make reliable ownership discriminations in psychophysics tasks (Chancel and Ehrsson, 2020), which were required for the current modeling study (they tended to respond randomly). Thus, all participants were first tested on a classical rubber hand illusion paradigm to ensure that they could experience the illusion. For this test, each participant sat with their right hand resting on a support beneath a small table. On this table, 15 cm above the hidden real hand, the participant viewed a life-sized cosmetic prosthetic male right hand (model 30916 R, Fillauer, filled with plaster; a ‘rubber hand’) placed in the same position as the real hand. The participant kept their eyes fixed on the rubber hand while the experimenter used two small probes (firm plastic tubes, diameter: 7 mm) to stroke the rubber hand and the participant’s hidden hand for 12 s, synchronizing the timing of the stroking as much as possible. Each stroke lasted 1 s and extended approximately 1 cm; the strokes were applied to five different points along the real and rubber index fingers at a frequency of 0.5 Hz. The characteristics of the strokes and the duration of the stimulation were designed to resemble the stimulation later applied by the robot during the discrimination task (see below). Then, the participant completed a questionnaire adapted from that used by Botvinick and Cohen, 1998 (see also Chancel and Ehrsson, 2020 and Figure 4—figure supplement 1).
This questionnaire includes three items assessing the illusion and four control items to be rated with values between –3 (‘I completely disagree with this item’) and 3 (‘I completely agree with this item’). Our inclusion criteria for a rubber hand illusion strong enough for participation in the main psychophysics experiment were as follows: (1) a mean score for the illusion statements (Q1, Q2, and Q3) of greater than 1 and (2) a difference between the mean score for the illusion items and the mean score for the control items of greater than 1. Three participants (two females) did not reach this threshold; therefore, 15 subjects participated in the main experiment (five males, aged 26.3±4 years, Figure 4—figure supplement 2). The inclusion test session lasted 30 min in total. After this inclusion phase, the participants were introduced to the setup used in the main experiment.
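
The two inclusion criteria can be expressed as a short check. This is a minimal sketch that assumes the ratings are supplied as lists of integers between –3 and 3; the function name and the example ratings are hypothetical and are not taken from the study data.

```python
from statistics import mean


def passes_inclusion(illusion_ratings, control_ratings):
    """Apply the two inclusion criteria to one participant's questionnaire.

    illusion_ratings: ratings for the three illusion items (Q1, Q2, Q3).
    control_ratings: ratings for the four control items.
    All ratings are expected to lie between -3 and 3.
    """
    illusion_mean = mean(illusion_ratings)
    control_mean = mean(control_ratings)
    # Criterion 1: mean illusion score greater than 1.
    # Criterion 2: illusion mean exceeds control mean by more than 1.
    return illusion_mean > 1 and (illusion_mean - control_mean) > 1


# Hypothetical participant: clear illusion, low agreement with control items.
print(passes_inclusion([2, 3, 2], [-1, 0, -2, -1]))
```

Both comparisons are strict, matching the "greater than 1" wording of the criteria.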

Experimental setup

During the main experiment, the participant’s right hand lay hidden, palm down, on a flat support surface beneath a table (30 cm lateral to the body midline), while on this table (15 cm above the real hand), a right rubber hand was placed in the same orientation as the real hand aligned with the participants’ arm (Figure 4A). The participant’s left hand rested on their lap. A chin rest and elbow rest (Ergorest Oy, Finland) ensured that the participant’s head and arm remained in a steady and relaxed position throughout the experiments. Two robot arms (designed in our laboratory by Martti Mercurio and Marie Chancel, see Chancel and Ehrsson, 2020 for more details) applied tactile stimuli (taps) to the index finger of the rubber hand and to the participant’s hidden real index finger. Each robot arm was composed of three parts: two 17-cm-long, 3-cm-wide metal pieces and a metal slab (10×20 cm) as a support. The joint between the two metal pieces and that between the proximal piece and the support was powered by two HS-7950TH Ultra Torque servos that included 7.4 V optimized coreless motors (Hitec Multiplex, USA). The distal metal piece ended with a ring containing a plastic tube (diameter: 7 mm) that was used to touch the rubber hand and the participant’s real hand.

Figure 4. Experimental setup (A) and experimental procedure (B and C) for the ownership judgment task.

A participant’s real right hand is hidden under a table while they see a life-sized cosmetic prosthetic right hand (rubber hand) on the table (A). The rubber hand and real hand are touched by robots for periods of 12 s, either synchronously or with the rubber hand touched slightly earlier or later at a degree of asynchrony that is systematically manipulated (±150 ms, ±300 ms, or ±500 ms). The participant is then required to state whether the rubber hand felt like their own hand or not (‘yes’ or ‘no’ forced choice task) (B). Using the Meta2 headset, three noise conditions are tested: 0 (top picture), 30 (middle picture), and 50% (bottom picture) visual noise (C).

During the experiment, the participants wore augmented reality glasses: a Meta2 VR headset with a 90° field of view, 2560×1440 high-dpi display, and 60 Hz refresh rate (Meta View Inc). Via this headset, the uncertainty of the visual scene could be manipulated: The probability of a pixel of the scene observed by the participant turning white from one frame to the other varied (frame rate: 30 images/s); when turning white, a pixel became opaque, losing its meaningful information (information on the rubber hand and robot arm touching the rubber hand) and therefore becoming irrelevant to the participant. The higher the probability of the pixels turning white becomes, the more uncertain the visual information becomes. During the experiment, the participants wore earphones playing white noise to cancel out any auditory information from the robots’ movements that might have otherwise interfered with the behavioral task and with illusion induction (Radziun and Ehrsson, 2018).
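
The noise manipulation can be sketched as an independent per-pixel draw on each frame. The example below is only an illustration under stated assumptions: frames are represented as nested lists of grayscale values, white is coded as 255, and the function name is hypothetical. It is not the rendering code used with the headset.

```python
import random

WHITE = 255  # assumed grayscale code for an opaque white pixel


def add_white_noise(frame, p_white, rng):
    """Return a copy of frame in which each pixel turns white with probability p_white.

    frame: list of rows, each a list of grayscale pixel values.
    p_white: probability (0 to 1) that a given pixel is replaced by white.
    rng: a random.Random instance, so that the noise is reproducible.
    """
    return [
        [WHITE if rng.random() < p_white else pixel for pixel in row]
        for row in frame
    ]


# Apply the three noise levels used in the study to a small, uniformly dark frame.
rng = random.Random(0)
dark_frame = [[0] * 8 for _ in range(4)]
for p_white in (0.0, 0.3, 0.5):
    noisy = add_white_noise(dark_frame, p_white, rng)
    n_white = sum(pixel == WHITE for row in noisy for pixel in row)
    print(p_white, n_white)
```

Redrawing the mask on every frame means that a different random subset of pixels is masked each time, so the expected proportion of masked pixels equals the chosen probability.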

Procedure

The main experiment involved two tasks conducted in two different sessions: a body ownership judgment task and a synchrony judgment task. Both tasks were yes/no psychophysical detection tasks (Figure 4B).

Body ownership judgment task

In each trial, the participant was asked to decide whether the rubber hand felt like their own hand, i.e., to determine whether they felt the key phenomenological aspect of the rubber hand illusion (Botvinick and Cohen, 1998; Ehrsson et al., 2004; Longo et al., 2008). Each trial followed the same sequence. The robots repeatedly tapped the index fingers of the rubber hand and the actual hand six times each for a total period of 12 s in five different locations in randomized order (‘stimulation period’): immediately proximal to the nail on the distal phalanx, on the distal interphalangeal joint, on the middle phalanx, on the proximal interphalangeal joint, and on the proximal phalanx. All five locations were stimulated at least once in each 12 s trial, and the order of stimulation sites randomly varied from trial to trial. The participant was instructed to focus their gaze on the rubber hand. Then, the robots stopped while the participant heard a tone instructing them to verbally report whether the rubber hand felt like their own hand by saying ‘yes’ (the rubber hand felt like it was my hand) or ‘no’ (the rubber hand did not feel like it was my hand). This answer was registered by the experimenter. A period of 12 s was chosen in line with a previous rubber hand illusion psychophysics study (Chancel and Ehrsson, 2020), and because earlier studies with individuals susceptible to the illusion have shown that the illusion is reliably elicited in approximately 10 s (Ehrsson et al., 2004; Guterstam et al., 2013; Lloyd, 2007). Different locations on the finger were chosen to prevent irritation of the skin during the long psychophysics session and in line with earlier studies stimulating different parts of the hand and fingers to elicit the rubber hand illusion (e.g. Guterstam et al., 2011). During this period of stimulation, the participant was instructed to look at and focus on the rubber hand.

After the stimulation period and the body ownership judgment answer, the participant was asked to wiggle their right fingers to avoid any potential numbness or muscle stiffness from keeping their hand still and to eliminate possible carry-over effects to the next stimulation period by breaking the rubber hand illusion (moving the real hand while the rubber hand remained immobile eliminates the rubber hand illusion). The participant was also asked to relax their gaze by looking away from the rubber hand because fixating on the rubber hand for a whole session could have been uncomfortable. 5 s later, a second tone informed the participant that the next trial was about to start; the next trial started 1 s after this sound cue.

Two variables were manipulated in this experiment: (1) the synchronicity between the taps seen and those felt by the participants (asynchrony condition) and (2) the level of visual white noise added to the visual scene (noise condition). Seven different asynchrony conditions were tested. The taps on the rubber hand could be synchronized with the taps on the participant’s real hand (synchronous condition) or could be delayed or advanced by 150, 300, or 500 ms. In the rest of this article, negative values of asynchrony (−150, −300, and −500 ms) mean that the rubber hand was touched first, and positive values of asynchrony (+150, +300, and +500 ms) mean that the participant’s hand was touched first. The seven levels of asynchrony appeared with equal frequencies in pseudorandom order so that no condition was repeated more than twice in a row. The participants did not know how many different asynchrony levels were tested (as revealed in informal post-experiment interviews), and no feedback was given on their task performance. Three different noise conditions were tested, corresponding to 0, 30, and 50% of visual noise being displayed, i.e., the pixels of the Meta2 headset screen could turn white from one frame to another with a probability of 0, 30, or 50% (Figure 4C). The three levels of noise also appeared with equal frequencies in pseudorandom order. During the experiment, the experimenter was blind to the noise level presented to the participants, and the experimenter sat out of the participants’ sight.
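
One simple way to obtain such a pseudorandom order is to reshuffle the full trial list until the repetition constraint is met. The sketch below is an assumption about how this could be done, not the procedure actually used: it applies the "no more than twice in a row" constraint to the asynchrony level only, uses 12 repetitions per condition, and all names are hypothetical.

```python
import random
from itertools import product

ASYNCHRONIES_MS = [-500, -300, -150, 0, 150, 300, 500]
NOISE_LEVELS_PERCENT = [0, 30, 50]
REPETITIONS = 12  # repetitions of each asynchrony-by-noise condition


def longest_run(values):
    """Length of the longest run of identical consecutive values."""
    longest = current = 1
    for previous, value in zip(values, values[1:]):
        current = current + 1 if value == previous else 1
        longest = max(longest, current)
    return longest


def make_trial_order(seed=0, max_run=2, max_tries=100_000):
    """Shuffle all trials until no asynchrony occurs more than max_run times in a row."""
    rng = random.Random(seed)
    trials = list(product(ASYNCHRONIES_MS, NOISE_LEVELS_PERCENT)) * REPETITIONS
    for _ in range(max_tries):
        rng.shuffle(trials)
        if longest_run([asynchrony for asynchrony, _ in trials]) <= max_run:
            return list(trials)
    raise RuntimeError("no valid order found; increase max_tries")


order = make_trial_order()
print(len(order), order[:5])
```

Rejection sampling keeps every condition at exactly equal frequency, because only the order of a fixed, balanced trial list changes between attempts.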

Visuotactile synchrony judgment task

During this task, the participant was asked to decide whether the visual and tactile stimulation they received happened at the same time, i.e., whether the felt and seen touches were synchronous or not. The procedure was identical to the body ownership detection task apart from a critical difference in the instructions, which was now to determine if the visual and tactile stimulations were synchronous (instead of judging illusory rubber hand ownership). In each trial, a 12 s visuotactile stimulation period was followed by the yes/no verbal answer given by the participant and a 4 s break. The same two variables were manipulated in this experiment: the synchronicity between the seen and felt taps (asynchrony condition) and the level of visual white noise (noise condition). The asynchronies used in this synchrony judgment task were smaller than those of the ownership judgment task (±50, ±150, or ±300 ms instead of ±150, ±300, or ±500 ms) to maintain an equivalent difficulty level between the two tasks; this decision was made based on a pilot study involving 10 participants (three males, aged 27.0±4 years, different from the main experiment sample) who performed the ownership and synchrony tasks under 11 different levels of asynchrony (Appendix 1—table 3 and Figure 2). The noise conditions were identical to those used for the ownership judgment task.

The ordering of the tasks was counterbalanced across the participants. Each condition was repeated 12 times, leading to a total of 252 judgments made per participant and task. The trials were randomly divided into three experimental blocks per task, each lasting 13 min.
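
The design arithmetic can be checked in a few lines. This sketch assumes that the three blocks of a task contain equal numbers of trials, which the text does not state explicitly; the variable names are hypothetical. It also computes the sample SD of the seven ownership-task asynchronies, which rounds to the 348 ms quoted in the modeling section.

```python
from statistics import stdev

OWNERSHIP_ASYNCHRONIES_MS = [-500, -300, -150, 0, 150, 300, 500]
NOISE_LEVELS_PERCENT = [0, 30, 50]
REPETITIONS_PER_CONDITION = 12
BLOCKS_PER_TASK = 3

# Seven asynchronies crossed with three noise levels.
n_conditions = len(OWNERSHIP_ASYNCHRONIES_MS) * len(NOISE_LEVELS_PERCENT)
# Total judgments per participant and task.
n_trials = n_conditions * REPETITIONS_PER_CONDITION
# Assumption: trials are split evenly across the three blocks.
trials_per_block = n_trials // BLOCKS_PER_TASK
# Sample SD (n - 1 denominator) of the tested ownership asynchronies.
asynchrony_sd_ms = stdev(OWNERSHIP_ASYNCHRONIES_MS)

print(n_conditions, n_trials, trials_per_block, round(asynchrony_sd_ms))
```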

Modeling

As explained in the introduction, we assumed that the rubber hand illusion is driven by the integration of visual and tactile signals in the current paradigm. To describe this integration, we designed a model in which the observer performs BCI; we compared this model to a non-Bayesian model. We then extended the same models to the synchrony judgment task and examined whether the same model with the same parameters could describe a participant’s behavior in both tasks.

BCI model for body ownership

We first specify the BCI model for body ownership. A more detailed and step-by-step description of the modeling can be found in Appendix 1.

Generative model

Bayesian inference is based on a generative model, which is a statistical model of the world that the observer believes to give rise to observations. By ‘inverting’ this model for a given set of observations, the observer can make an ‘educated guess’ about a hidden state. Therefore, we first must specify the generative model that captures both the statistical structure of the task as assumed by the observer and an assumption about measurement noise. In our case, the model contains three variables: the causal structure category C, the tested asynchrony s, and the measurement of this asynchrony by the participant x. Even though the true frequency of synchronous stimulation (C=1) is 1/7=0.14, we allow it to be a free parameter, which we denote as psame. One can view this parameter as an incorrect belief, but it can equivalently be interpreted as a perceptual or decisional bias. Next, when C=1, the asynchrony s is always 0; we assume that the observer knows this. When C=2, the true asynchrony takes one of several discrete values; we do not assume that the observer knows these values or their probabilities and instead assume that the observer assumes that asynchrony is normally distributed with the correct SD σS of 348 ms (i.e., the true SD of the stimuli used in this experiment). In other words, $p(s \mid C=2) = \mathcal{N}(s; 0, \sigma_s^2)$. Next, we assume that the observer makes a noisy measurement x of the asynchrony. We make the standard assumption (inspired by the central limit theorem) that this noise follows a normal distribution:

$$p(x \mid s) = \mathcal{N}(x; s, \sigma^2)$$

where the variance depends on the sensory noise for a given trial. Finally, we assume that the observer has accurate knowledge of this part of the generative model.
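To make the generative model concrete, the three variables can be sampled in the order C, s, x. The sketch below is only an illustration in Python, not the authors' MATLAB code; the function name `sample_trial` and the example values for psame and σ are assumptions chosen for the demonstration.

```python
import random

SIGMA_S = 348.0  # SD (ms) of the asynchrony distribution assumed for C=2


def sample_trial(p_same, sigma, rng):
    """Draw one (C, s, x) triple from the observer's assumed generative model."""
    # Causal structure: common cause (C=1) with prior probability p_same.
    c = 1 if rng.random() < p_same else 2
    # Asynchrony: exactly 0 under a common cause, Gaussian otherwise.
    s = 0.0 if c == 1 else rng.gauss(0.0, SIGMA_S)
    # Noisy measurement of the asynchrony with sensory noise SD sigma.
    x = rng.gauss(s, sigma)
    return c, s, x


# Hypothetical usage with illustrative parameter values.
rng = random.Random(0)
samples = [sample_trial(p_same=0.6, sigma=150.0, rng=rng) for _ in range(10000)]
```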

Inference

Now that we have specified the generative model, we can turn to inference. Visual and tactile inputs are to be integrated, leading to the emergence of the rubber hand illusion if the observer infers a common cause (C=1) for both sensory inputs. On a given trial, the model observer uses x to infer the category C. Specifically, the model observer computes the posterior probabilities of both categories, p(C=1|x) and p(C=2|x), i.e., the belief that the category was C. Then, the observer would report ‘yes, it felt like the rubber hand was my own hand’ if the former probability were higher, or in other words, when d>0, where

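$$d = \log \frac{p(C=1 \mid x)}{p(C=2 \mid x)}.$$

This equation can be written as a sum of the log prior ratio and the log-likelihood ratio:

$$d = \log \frac{p_{\mathrm{same}}}{1-p_{\mathrm{same}}} + \log \frac{p(x \mid C=1)}{p(x \mid C=2)}.$$

The decision rule d>0 is thus equivalent to (see Appendix 1)

$$|x| < k,$$

where

$$k = \sqrt{K}$$

and

$$K = \frac{\sigma^2 (\sigma_s^2 + \sigma^2)}{\sigma_s^2} \left( 2 \log \frac{p_{\mathrm{same}}}{1-p_{\mathrm{same}}} + \log \frac{\sigma_s^2 + \sigma^2}{\sigma^2} \right),$$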

where σ is the sensory noise level of the trial under consideration. As a consequence, the decision criterion changes as a function of the sensory noise affecting the observer’s measurement (Figure 5). This is a crucial property of BCI and indeed a property shared by Bayesian models used in previous work on multisensory synchrony judgments (Magnotti et al., 2013), audiovisual spatial localization (Körding et al., 2007), visual search (Stengård and van den Berg, 2019), change detection (Keshvari et al., 2012), collinearity judgment (Zhou et al., 2020), and categorization (Qamar et al., 2013). The output of the BCI model is the probability of the observer reporting the visual and tactile inputs as emerging from the same source when presented with a specific asynchrony value s:

$$p(\hat{C}=1 \mid s) = 0.5\lambda + (1-\lambda)\left(\Phi(k; s, \sigma^2) - \Phi(-k; s, \sigma^2)\right)$$

Here, the additional parameter λ reflects the probability of the observer lapsing, i.e., randomly guessing. This equation is a prediction of the observer’s response probabilities and can thus be fit to a participant’s behavioral responses.
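As a rough illustration of how the criterion and the response probability depend on the sensory noise, the following Python sketch evaluates both quantities. It is not the authors' MATLAB implementation; the function names and the example parameter values are assumptions, and Φ(value; mean, variance) is taken to be the normal cumulative distribution, following the convention of Appendix 1.

```python
import math

SIGMA_S = 348.0  # fixed SD (ms) of the assumed asynchrony distribution


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def bci_criterion(p_same, sigma, sigma_s=SIGMA_S):
    """Return k = sqrt(K), or None when K <= 0 (the observer never says yes)."""
    var, var_s = sigma ** 2, sigma_s ** 2
    big_k = var * (var + var_s) / var_s * (
        2.0 * math.log(p_same / (1.0 - p_same))
        + math.log((var + var_s) / var)
    )
    return math.sqrt(big_k) if big_k > 0 else None


def p_yes_bci(s, p_same, sigma, lapse, sigma_s=SIGMA_S):
    """Probability of reporting a common cause for asynchrony s (in ms)."""
    k = bci_criterion(p_same, sigma, sigma_s)
    if k is None:
        core = 0.0
    else:
        core = normal_cdf(k, s, sigma) - normal_cdf(-k, s, sigma)
    return 0.5 * lapse + (1.0 - lapse) * core


# Illustrative values only: the criterion widens as sensory noise grows.
k_low_noise = bci_criterion(p_same=0.8, sigma=100.0)
k_high_noise = bci_criterion(p_same=0.8, sigma=200.0)
```

With these illustrative values the criterion is wider under the higher noise level, which is the property that distinguishes the BCI observer from a fixed-criterion observer.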

Decision process for the emergence of the rubber hand illusion (RHI) according to the Bayesian and fixed criterion observers.

(A) The measured asynchrony between the visual and tactile events for the low (orange) or high (red) noise level conditions and the probability of the different causal scenarios: the visual and tactile events come from one source, the observer’s body, or from two different sources. The probability of a common source is a narrow distribution (full curves), and the probability of two distinct sources is a broader distribution (dashed curve), both centered on synchronous stimulation (0 ms) such that when the stimuli are almost synchronous, it is likely that they come from the same source. When the variance of the measured stimulation increases from trial to trial, decision criteria may adjust optimally (Bayesian – light blue) or stay fixed (fixed – dark blue). The first assumption corresponds to the Bayesian causal inference (BCI) model, and the second corresponds to the fixed criterion (FC) model (see next paragraph for details). The displayed distributions are theoretical, and the BCI model’s psame is arbitrarily set at 0.5. (B) The decision criterion changes from trial to trial as a function of sensory uncertainty according to the optimal decision rule from the BCI model. Black curves represent this relationship for different psame values of 0.4–0.9 (from lightest to darkest). (C) From left to right, these last plots illustrate how the BCI model-predicted outcome is shaped by psame , σ, and λ, respectively. Left: psame = 0.8 (black), 0.6 (green), and 0.9 (blue). Middle: σ = 150 ms (black), 100 ms (green), and 200 ms (blue). Right: λ = 0.05 (black), 0.005 (green), and 0.2 (blue). (D) Finally, this last plot shows simulated outcomes predicted by the BCI model (in full lines and bars) and the FC model (in dashed lines and shredded bars). 
In this theoretical simulation, both models predict the same outcome distribution for one given level of sensory noise (0%); however, since the decision criterion of the BCI model is adjusted to the level of sensory uncertainty, an overall increase of the probability of emergence of the RHI is predicted by this Bayesian model. On the contrary, the FC model, which is a non-Bayesian model, predicts a negligible effect of sensory uncertainty on the overall probability of emergence of the RHI.

The BCI model has five free parameters: psame: the prior probability of a common cause for vision and touch, independent of any sensory stimulation, σ0, σ30, σ50 : the noise impacting the measurement x specific to each noise condition, and λ: a lapse rate to account for random guesses and unintended responses. We assumed a value of 348 ms for σS , i.e., σS is equal to the actual SD of the asynchronies used in the experiment, but we challenged this assumption later. Moreover, in our experiment, the spatial parameters and the proprioceptive state of our participants are not manipulated or altered from one condition to the other. Thus, our model focuses on the temporal aspects of the visuotactile integration in the context of body ownership. In this, it differs from the model proposed by Samad et al., 2015 in which both spatial and temporal aspects were modeled separately and then averaged to obtain an estimate of body ownership (that they then compared with questionnaire ratings of rubber hand illusion).

Alternative models

BCI model for body ownership with a free level of uncertainty impacting the stimulation (BCI*)

For the BCI model, we assumed that the observer’s assumed stimulus distribution has the same SD σS as the true stimulus distribution. We also tested a variant in which the assumed SD σS is a free parameter. As a result, this model is less parsimonious than the BCI model. The model has six free parameters (psame, σ0, σ30, σ50, σS, and λ). Nevertheless, the decision rule remains the same as that of the BCI model.

FC (non-Bayesian) model

An important alternative to the Bayesian model is a model that ignores variations in sensory uncertainty when judging if the rubber hand is one’s own, for example, because the observer incorrectly assumes that sensory noise does not change. We refer to this as the FC model. The decision rule for the FC model then becomes the following:

$$|x| < k_0,$$

where k0 corresponds to a fixed criterion (FC) for each participant, which does not vary with trial-to-trial sensory uncertainty. Although the decisional stage is independent of the trial-to-trial sensory uncertainty, the encoding stage is still influenced by the level of sensory noise. Thus, the output of the FC model is the probability of the observer reporting the illusion when presented with a specific asynchrony value s:

$$p(\mathrm{illusion} \mid s) = 0.5\lambda + (1-\lambda)\left(\Phi(k_0; s, \sigma^2) - \Phi(-k_0; s, \sigma^2)\right)$$

Again, the additional parameter λ reflects the probability of the observer lapsing, i.e., randomly guessing. This equation is a prediction of the observer’s response probabilities and can thus be fitted to a participant’s behavioral responses.
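For comparison with the Bayesian observer, the FC prediction can be sketched in the same way. The Python function below is an illustrative assumption rather than the authors' code; `p_yes_fc` and the example values of k0 and σ are hypothetical.

```python
import math


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def p_yes_fc(s, k0, sigma, lapse):
    """FC model: the criterion k0 is constant, only the encoding noise varies."""
    core = normal_cdf(k0, s, sigma) - normal_cdf(-k0, s, sigma)
    return 0.5 * lapse + (1.0 - lapse) * core


# Illustrative values: same criterion, two noise levels, synchronous taps.
p_low_noise = p_yes_fc(s=0.0, k0=200.0, sigma=100.0, lapse=0.0)
p_high_noise = p_yes_fc(s=0.0, k0=200.0, sigma=200.0, lapse=0.0)
```

Because k0 does not widen with σ, more noise can only spread the measurement beyond the fixed bounds, so in this example the predicted probability of a yes response at 0 ms falls as noise increases.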

Parameter estimation

All model fitting was performed using maximum-likelihood estimation implemented in MATLAB (MathWorks). We used both the built-in MATLAB function fmincon and the Bayesian adaptive direct search (BADS) algorithm (Acerbi and Ma, 2017), each using 100 different initial parameter combinations per participant. Fmincon is gradient based, while BADS is not. The best estimate from either of these two procedures was kept, i.e., the set of estimated parameters that corresponded to the maximal log-likelihood for the models. Fmincon and BADS produced the same log-likelihood for the BCI, BCI*, and FC models for 12, 13, and 14 of the 15 participants, respectively. For the remaining participants, the BADS algorithm performed better. Moreover, the fitting procedure run 100 times (with different initial parameter combinations) led to the same set of estimated parameters at least 31 times for all participants and models. To validate our procedure, we performed parameter recovery. For this procedure, data simulated from random parameters were fitted using the models we designed. Because the generating random parameters were recovered, i.e., were similar to the estimated parameters, we are confident that the parameter estimation applied for the fitting procedure used in the current study is reliable (Appendix 1—figure 1 & Appendix 1—table 2).

Model comparison

The Akaike information criterion (AIC; Akaike, 1973) and Bayesian information criterion (BIC; Schwarz, 1978) were used as measures of goodness of model fit. The lower the AIC or BIC, the better the fit. The BIC penalizes the number of free parameters more heavily than the AIC. We calculated AIC and BIC values for each model and participant according to the following equations:

$$\mathrm{AIC} = 2 n_{\mathrm{par}} - 2 \log L^*$$
$$\mathrm{BIC} = n_{\mathrm{par}} \log n_{\mathrm{trial}} - 2 \log L^*$$

where L* is the maximized value of the likelihood, npar the number of free parameters, and ntrial the number of trials. We then calculated the AIC and BIC difference between models and summed across the participants. We estimated a CI using bootstrapping: 15 random AIC/BIC differences were drawn with replacement from the actual participants’ AIC/BIC differences and summed; this procedure was repeated 10,000 times to compute the 95% CI.
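The criteria and the bootstrap over participants can be sketched as follows. This Python sketch is not the authors' MATLAB code; it assumes a percentile bootstrap for the 95% CI (the text does not specify the CI method), and the function names are hypothetical.

```python
import math
import random


def aic(log_lik, n_par):
    """Akaike information criterion from the maximized log-likelihood."""
    return 2 * n_par - 2 * log_lik


def bic(log_lik, n_par, n_trial):
    """Bayesian information criterion from the maximized log-likelihood."""
    return n_par * math.log(n_trial) - 2 * log_lik


def bootstrap_sum_ci(diffs, n_boot=10_000, seed=0):
    """95% percentile CI of the summed per-participant differences.

    Each bootstrap sample draws len(diffs) differences with replacement
    and sums them; the CI bounds are the 2.5th and 97.5th percentiles.
    """
    rng = random.Random(seed)
    sums = sorted(
        sum(rng.choices(diffs, k=len(diffs))) for _ in range(n_boot)
    )
    return sums[int(0.025 * n_boot)], sums[int(0.975 * n_boot) - 1]
```

A summed difference whose CI excludes zero would indicate that one model is consistently favored across participants.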

As an additional assessment of the models, we compute the coefficient of determination R2 (Nagelkerke, 1991) defined as

$$R^2 = 1 - \exp\left(-\frac{2}{n}\left(\log L(M) - \log L(M_0)\right)\right)$$

where logL(M) and logL(M0) denote the log-likelihoods of the fitted and the null model, respectively, and n is the number of data points. For the null model, we assumed that an observer randomly chooses one of the two response options, i.e., we assumed a discrete uniform distribution with a probability of 0.5. As in our case the models’ responses were discretized to relate them to the two discrete response options, the coefficient of determination was divided by the maximum coefficient (Nagelkerke, 1991) defined as

$$\max(R^2) = 1 - \exp\left(\frac{2}{n} \log L(M_0)\right)$$
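The normalized coefficient of determination can be computed directly from the two log-likelihoods. The following Python sketch is illustrative only; the function name is hypothetical, and the null log-likelihood is built from the guessing probability of 0.5 stated above.

```python
import math


def nagelkerke_r2(log_lik_model, log_lik_null, n):
    """Nagelkerke R^2: generalized R^2 divided by its maximum attainable value."""
    r2 = 1.0 - math.exp(-(2.0 / n) * (log_lik_model - log_lik_null))
    r2_max = 1.0 - math.exp((2.0 / n) * log_lik_null)
    return r2 / r2_max


# Null model: each of n binary responses is guessed with probability 0.5.
n_points = 252
log_lik_null = n_points * math.log(0.5)
```

By construction, a model that fits no better than guessing yields 0, and a model that predicts every response with certainty yields 1.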

We also performed Bayesian model selection (Rigoux et al., 2014) at the group level to obtain the exceedance probability for the candidate models (i.e. the probability that a given model is more likely than any other model given the data) using the VBA (Variational Bayesian Analysis) toolbox (Rigoux et al., 2014). With this analysis, we consider a certain degree of heterogeneity in the population instead of assuming that all participants follow the same model and assess the a posteriori probability of each model.

Ownership and synchrony tasks

The experimental contexts of the ownership and synchrony judgment tasks only differed in the instructions given to the participants regarding which perceptual feature they were to detect (rubber hand ownership or visuotactile synchrony). Thus, the bottom-up processing of the sensory information is assumed to be the same. In particular, the uncertainty impacting each sensory signal is likely to be the same between the two tasks, since the sensory stimulation delivered to the observer is identical. The difference in the participants’ synchrony and ownership perceptions should be reflected in the a priori probability of the causal structure. For our BCI model, this means that the σ0, σ30, and σ50 parameters are assumed to be the same for the two tasks. The same applies for the lapse rate λ that depends on the observer and not on the task. In contrast, the prior probability for a common cause psame could change when a different judgment (ownership or synchrony) is assessed.

We used two complementary approaches to test whether people show different prior probabilities of a common cause for body ownership and synchrony perceptions: an extension analysis and a transfer analysis. In the extension analysis, we applied our BCI model to both sets of data and compared the fit of a model in which all parameters (psame, σ0, σ30, σ50, σS, and λ) were shared between tasks to that of a model with two separate priors: psame,ownership for the body ownership task and psame,synchrony for the synchrony task. In the transfer analysis, we used the estimated parameters for one task (ownership or synchrony) to predict the data from the other task (synchrony or ownership). We compared a full transfer, in which all previously estimated parameters were used, to a partial transfer, in which psame was left as a free parameter. We again used the AIC and BIC to compare the different models.

Appendix 1

1. Bayesian causal inference model for body ownership

Bayesian models typically require three steps: first, specification of the generative model, which represents the statistics of the variables and their relationships, as believed by the observer; second, specification of the actual inference process, in which the observer uses a particular observation and ‘inverts’ the generative model to build a posterior distribution over the world state of interest; and third, specification of the predicted response distribution, which can be directly related to data. Below, we lay out these three steps for the body ownership task, in which the observer judges whether the rubber hand is theirs or not. For the synchrony detection task, everything is the same except for the interpretation of the category variable C.

Step 1: generative model

We first need to specify the generative model, which captures the statistical structure of both the task and the measurement noise, as assumed by the observer. It contains three variables: the category, C, the physical visuotactile asynchrony, s, and the noisy measurement of this asynchrony, x. The variable C represents the high-level scenario:

C=1: only one common source, hence the rubber hand is my hand.

C=2: two different sources, hence the rubber hand is not my hand.

The a priori probability of a common cause, before any sensory stimulation is delivered to the observer, is expressed as:

$$p(C=1) = p_{\mathrm{same}}$$

Next, we assume that the observer correctly assumes that the asynchrony s is always zero when C=1, and incorrectly assumes that the asynchrony follows a Gaussian distribution with standard deviation σs when C=2:

$$p(s \mid C=1) = \delta(s) \tag{1}$$
$$p(s \mid C=2) = \mathcal{N}(s; 0, \sigma_s^2) \tag{2}$$

Note that the distribution p(s|C=2) is not the experimental asynchrony distribution, which would be a mixture of delta functions because, in the C=2 condition, we presented a discrete set of asynchronies (±500 ms, ±300 ms, ±150 ms, and 0 ms). Why do we assume that the observer’s assumed asynchrony distribution for C=2 is different from the experimental one? We reasoned that it is unlikely that our participants were aware of the discrete nature of the experimental distribution, and that it is more likely that they assumed the distribution to be continuous. We use a Gaussian distribution because, in view of its simplicity and frequent occurrence, this seems to be a distribution that participants could plausibly assume. We tested both a model in which the SD of the Gaussian is equal to the experimental SD, and one in which it is not necessarily so (and therefore fitted as a free parameter).

Finally, we assume that the observer assumes that the measured asynchrony x is affected by Gaussian noise with standard deviation σ:

$$p(x \mid s) = \mathcal{N}(x; s, \sigma^2) \tag{3}$$

This assumption is standard and loosely motivated by the central limit theorem.

Step 2: inference

We now move to the inference performed by the observer. Visual and tactile inputs are to be integrated, thus leading to the emergence of the rubber hand illusion if the observer inferred a common cause (C=1) for both sensory inputs. On a given trial, the observer receives a particular measured asynchrony xtrial (simply a number) and infers the category C by computing the posterior probabilities p(C=1|xtrial) and p(C=2|xtrial). These probabilities are conveniently combined into the log posterior ratio d:

$$d = \log \frac{p(C=1 \mid x_{\mathrm{trial}})}{p(C=2 \mid x_{\mathrm{trial}})} \tag{4}$$

The observer would report ‘yes, it felt like the rubber hand was my own hand’ if d is positive. Equation (4) can be written as a sum of the log prior ratio and the log-likelihood ratio:

$$d = \log \frac{p_{\mathrm{same}}}{1-p_{\mathrm{same}}} + \log \frac{p(x_{\mathrm{trial}} \mid C=1)}{p(x_{\mathrm{trial}} \mid C=2)} \tag{5}$$

Further evaluation of this expression requires us to calculate two likelihoods. The likelihood of C=1 is

$$p(x_{\mathrm{trial}} \mid C=1) = p(x_{\mathrm{trial}} \mid s=0) = \mathcal{N}(x_{\mathrm{trial}}; 0, \sigma^2)$$

where we used Equations (1) and (3). The likelihood of C=2 is

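$$p(x_{\mathrm{trial}} \mid C=2) = \int p(x_{\mathrm{trial}} \mid s)\, p(s \mid C=2)\, ds = \mathcal{N}(x_{\mathrm{trial}}; 0, \sigma^2 + \sigma_s^2)$$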

where we used Equations (2) and (3). Substituting both likelihoods into Equation (5), we can now calculate d:

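$$d = \log \frac{p_{\mathrm{same}}}{1-p_{\mathrm{same}}} + \log \frac{\mathcal{N}(x_{\mathrm{trial}}; 0, \sigma^2)}{\mathcal{N}(x_{\mathrm{trial}}; 0, \sigma^2 + \sigma_s^2)} \tag{6}$$
$$\phantom{d} = \log \frac{p_{\mathrm{same}}}{1-p_{\mathrm{same}}} + \frac{1}{2} \log \frac{\sigma^2 + \sigma_s^2}{\sigma^2} - \frac{x_{\mathrm{trial}}^2}{2} \left( \frac{1}{\sigma^2} - \frac{1}{\sigma^2 + \sigma_s^2} \right) \tag{7}$$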

As mentioned above, we assume that the observer reports ‘yes, the rubber hand felt like my own hand’ if d>0. Using Equation (7), we can now rewrite this condition in terms of xtrial.

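$$\frac{x_{\mathrm{trial}}^2}{2} \left( \frac{1}{\sigma^2} - \frac{1}{\sigma^2 + \sigma_s^2} \right) < \log \frac{p_{\mathrm{same}}}{1-p_{\mathrm{same}}} + \frac{1}{2} \log \frac{\sigma^2 + \sigma_s^2}{\sigma^2}$$
$$x_{\mathrm{trial}}^2 < \frac{\sigma^2 (\sigma^2 + \sigma_s^2)}{\sigma_s^2} \left( 2 \log \frac{p_{\mathrm{same}}}{1-p_{\mathrm{same}}} + \log \frac{\sigma^2 + \sigma_s^2}{\sigma^2} \right)$$

Then, we define

$$K = \frac{\sigma^2 (\sigma^2 + \sigma_s^2)}{\sigma_s^2} \left( 2 \log \frac{p_{\mathrm{same}}}{1-p_{\mathrm{same}}} + \log \frac{\sigma^2 + \sigma_s^2}{\sigma^2} \right)$$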

If K<0, which can theoretically happen when psame is very small, then the condition d>0 is never satisfied, regardless of the value of xtrial. This corresponds to the (unrealistic) case in which a common cause is so improbable a priori that no amount of sensory evidence can override that belief. If K≥0, the condition d>0 is equivalent to

$$|x_{\mathrm{trial}}| < k$$

where we call $k = \sqrt{K}$ the decision criterion. Notice that k takes into account both psame and the sensory uncertainty. This concludes our specification of the Bayesian inference performed by our model observer.
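The equivalence between the sign of the log posterior ratio d and the criterion rule can be checked numerically. The Python sketch below evaluates d from the two Gaussian likelihoods and compares the decision d>0 with |x|<k on a grid of measurements; the function names and parameter values are assumptions chosen for illustration.

```python
import math


def log_normal_pdf(x, var):
    """Log density of a zero-mean normal distribution with variance var."""
    return -0.5 * math.log(2.0 * math.pi * var) - x ** 2 / (2.0 * var)


def log_posterior_ratio(x, p_same, sigma, sigma_s):
    """d = log prior ratio + log likelihood ratio (Equations 5 and 6)."""
    prior = math.log(p_same / (1.0 - p_same))
    likelihood = (log_normal_pdf(x, sigma ** 2)
                  - log_normal_pdf(x, sigma ** 2 + sigma_s ** 2))
    return prior + likelihood


def criterion(p_same, sigma, sigma_s):
    """k = sqrt(K); returns None when K <= 0 (d > 0 is never satisfied)."""
    var, var_s = sigma ** 2, sigma_s ** 2
    big_k = var * (var + var_s) / var_s * (
        2.0 * math.log(p_same / (1.0 - p_same))
        + math.log((var + var_s) / var)
    )
    return math.sqrt(big_k) if big_k > 0 else None


def rules_agree(p_same, sigma, sigma_s=348.0):
    """Check that d > 0 and |x| < k give the same answer on an integer grid."""
    k = criterion(p_same, sigma, sigma_s)
    for x in range(-1000, 1001, 7):
        by_ratio = log_posterior_ratio(x, p_same, sigma, sigma_s) > 0
        by_criterion = k is not None and abs(x) < k
        if by_ratio != by_criterion:
            return False
    return True
```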

Step 3: response probability

We complete the model by calculating the probability that our model observer responds ‘I felt like the rubber hand was my hand’ (which we denote by C^=1) for the visuotactile asynchrony strial experimentally presented on a given trial. The first case to consider is K<0. Then,

$$p(\hat{C}=1 \mid s_{\mathrm{trial}}) = 0$$

Otherwise,

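$$p(\hat{C}=1 \mid s_{\mathrm{trial}}) = \Pr\nolimits_{x_{\mathrm{trial}} \mid s_{\mathrm{trial}}}\left(|x_{\mathrm{trial}}| < k\right) = \Phi(k; s_{\mathrm{trial}}, \sigma^2) - \Phi(-k; s_{\mathrm{trial}}, \sigma^2)$$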

where Φ denotes the cumulative normal distribution. Finally, we introduce a lapse rate, which is the probability of making a random response (which we assume to be yes or no [the rubber hand felt like my hand] with equal probability). Then, the overall response probability becomes

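$$p_{\text{with lapse}}(\hat{C}=1 \mid s_{\mathrm{trial}}) = 0.5\lambda + (1-\lambda)\left(\Phi(k; s_{\mathrm{trial}}, \sigma^2) - \Phi(-k; s_{\mathrm{trial}}, \sigma^2)\right)$$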

It is this outcome probability that we want to fit to our data. Five free parameters need to be fitted: θ=[psame,σ0,σ30, σ50, λ]. In the basic model, the source noise σs is fixed, its value corresponding to the real SD of the asynchronies used in the experiment (348 ms).
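As a quick check of the fixed value, the SD of the seven asynchronies can be computed directly. The short Python sketch below suggests that 348 ms corresponds to the sample SD (dividing by n − 1) of the seven tested levels; this reading is an inference from the numbers and is not stated explicitly in the text.

```python
import statistics

ASYNCHRONIES_MS = [-500, -300, -150, 0, 150, 300, 500]

# Sample SD divides by n - 1; population SD divides by n.
sample_sd = statistics.stdev(ASYNCHRONIES_MS)
population_sd = statistics.pstdev(ASYNCHRONIES_MS)

print(round(sample_sd), round(population_sd))  # prints: 348 322
```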

2. Alternative models

BCI model with free source noise: BCI*

This model shares the generative model and decision rule of the Bayesian causal inference (BCI) model (Equation 7). However, the level of noise impacting the stimulation σs is considered as a free parameter instead of being fixed. Thus, six parameters need to be fitted: θ=[psame,σ0,σ30, σ50, σs, λ].

BCI model with a minimal asynchrony different from 0: BCI_bias

We also designed a model that did not assume that the observer treats an asynchrony of 0 as minimal. In this alternative model, the decision criterion is the same as in the BCI model (Equation 7); however, a parameter μ (representing the mean of the distribution of asynchrony) is taken into account when computing the predicted answer in the following step:

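$$p_{\text{with lapse}}(\hat{C}=1 \mid s_{\mathrm{trial}}) = 0.5\lambda + (1-\lambda)\left(\Phi(k+\mu; s_{\mathrm{trial}}, \sigma^2) - \Phi(-k+\mu; s_{\mathrm{trial}}, \sigma^2)\right)$$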

Thus, six parameters need to be fitted: θ=[psame,σ0,σ30, σ50, μ, λ].

Fixed-criterion model: FC

This model shares the generative model with the BCI models, but the variations of the level of sensory uncertainty from trial to trial are not taken into account in the decision rule (Equation 7). Because psame remains constant in our experiment, the decision rule is equivalent to reporting ‘yes, the rubber hand felt like my hand’ if the measured asynchrony is smaller than a constant k0 :

$$|x_{\mathrm{trial}}| < k_0$$

Five free parameters need to be fitted: θ=[k0,σ0,σ30, σ50, λ].

Note that although the decisional stage in the FC model is independent of the trial-to-trial sensory uncertainty, the encoding stage is still influenced by the level of sensory noise. Thus, the output of the FC model is the probability of the observer reporting the illusion when presented with a specific asynchrony value s:

$$p_{\text{with lapse}}(\hat{C}=1 \mid s_{\mathrm{trial}}) = 0.5\lambda + (1-\lambda)\left(\Phi(k_0; s_{\mathrm{trial}}, \sigma^2) - \Phi(-k_0; s_{\mathrm{trial}}, \sigma^2)\right)$$

As in the main BCI model, the additional parameter λ reflects the probability of the observer lapsing, i.e., randomly guessing. This equation is a prediction of the observer’s response probabilities and can thus be fit to a participant’s behavioral responses.

3. Model fitting and comparison

Model fitting

For each model, we want to find the combination of parameters that best describe our data D, i.e., the yes/no responses to the presented asynchronies. We use maximum-likelihood estimation to estimate the model parameters, which for a given model, we collectively denote by θ. The likelihood of θ is the probability of the data D given θ:

$$L(\theta) = p(D \mid \theta)$$

We next assume that the trials are conditionally independent so that the likelihood becomes a product over trials:

$$L(\theta) = \prod_{\text{trial } t} p(\hat{C}_t \mid s_t, \sigma_t, \theta)$$

where st and σt are the asynchrony and the noise level on the tth trial, respectively. It is convenient to maximize the logarithm of the likelihood, which is

$$\log L(\theta) = \sum_{\text{trial } t} \log p(\hat{C}_t \mid s_t, \sigma_t, \theta) \tag{8}$$

We now switch notation and group trials by noise condition (labeled i and corresponding to the three noise levels) and stimulus condition (labeled j and corresponding to the seven asynchronies). Then, we can compactly denote the observed data by n1ij and n0ij, which are the numbers of times the participant reported ‘yes’ and ‘no,’ respectively, in the i, jth condition. Then, Equation 8 simplifies to

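$$\log L(\theta) = \sum_{i,j} \left[ n_{1ij} \log p(\hat{C}=1 \mid s_j, \sigma_i, \theta) + n_{0ij} \log\left(1 - p(\hat{C}=1 \mid s_j, \sigma_i, \theta)\right) \right]$$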
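The grouped log-likelihood can be sketched in a few lines. The Python code below is an illustration under assumptions, not the authors' MATLAB implementation: the count arrays are indexed as `n_yes[i][j]` for noise condition i and asynchrony j, the function names are hypothetical, and probabilities are clipped slightly away from 0 and 1 to avoid taking the logarithm of zero.

```python
import math

SIGMA_S = 348.0
ASYNCHRONIES_MS = [-500, -300, -150, 0, 150, 300, 500]


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def p_yes_bci(s, p_same, sigma, lapse, sigma_s=SIGMA_S):
    """BCI response probability with lapse for asynchrony s (in ms)."""
    var, var_s = sigma ** 2, sigma_s ** 2
    big_k = var * (var + var_s) / var_s * (
        2.0 * math.log(p_same / (1.0 - p_same))
        + math.log((var + var_s) / var)
    )
    if big_k <= 0:
        core = 0.0
    else:
        k = math.sqrt(big_k)
        core = normal_cdf(k, s, sigma) - normal_cdf(-k, s, sigma)
    return 0.5 * lapse + (1.0 - lapse) * core


def grouped_log_likelihood(n_yes, n_no, p_same, sigmas, lapse):
    """Sum the binomial log-likelihood over noise levels i and asynchronies j."""
    total = 0.0
    for i, sigma in enumerate(sigmas):
        for j, s in enumerate(ASYNCHRONIES_MS):
            p = p_yes_bci(s, p_same, sigma, lapse)
            p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
            total += n_yes[i][j] * math.log(p) + n_no[i][j] * math.log(1.0 - p)
    return total
```

An optimizer would then search over θ for the maximum of this function (or the minimum of its negative).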

The hard and plausible bounds used in the optimization algorithms can be found in Appendix 1—table 1.

Appendix 1—table 1
Bounds used in the optimization algorithms.

| Parameter | Type | Hard bound | Plausible bound |
|---|---|---|---|
| psame | Probability | (0, 1) | (0.3, 0.7) |
| σ | Sensory noise (log) | (−Inf, +Inf) | (−3, 9) |
| λ | Lapse | (0, 1) | (eps, 0.2) |
| k0 | Asynchrony (log) | (−Inf, +Inf) | (−3, 9) |

Parameter recovery

In order to qualitatively assess our fitting process, we performed parameter recovery. We used random sets of parameters θ=[psame, σ0, σ30, σ50, σs, λ] to generate data from the BCI model, then fitted the BCI model to these simulated data. We then made three assessments: (1) the log-likelihoods of the fitted parameters were higher than those of the generating parameters (negative log-likelihood, NLL: NLL(Minitial) = 920 ± 78; NLL(Mrecovered) = 812 ± 79) and than those of an alternative model (NLL(FC) = 948 ± 89); (2) the model fits to the simulated data looked excellent (Appendix 1—figure 1); (3) the generating parameters were roughly recovered after this procedure (Appendix 1—table 2). Thus, parameter recovery was successful.
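The logic of parameter recovery can be illustrated with a deliberately reduced Python sketch: responses are simulated from the BCI model with known parameters, and a single parameter (psame) is then re-estimated by a grid search over the log-likelihood while the other parameters are held at their generating values. This is an assumption-laden simplification and not the authors' procedure, which fitted all parameters with fmincon and BADS; the names, the generating values, and the large number of simulated repetitions are hypothetical.

```python
import math
import random

SIGMA_S = 348.0
ASYNCHRONIES_MS = [-500, -300, -150, 0, 150, 300, 500]


def normal_cdf(x, mean, sd):
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def p_yes(s, p_same, sigma, lapse):
    """BCI response probability with lapse (sigma_s fixed at 348 ms)."""
    var, var_s = sigma ** 2, SIGMA_S ** 2
    big_k = var * (var + var_s) / var_s * (
        2.0 * math.log(p_same / (1.0 - p_same))
        + math.log((var + var_s) / var)
    )
    core = 0.0
    if big_k > 0:
        k = math.sqrt(big_k)
        core = normal_cdf(k, s, sigma) - normal_cdf(-k, s, sigma)
    return 0.5 * lapse + (1.0 - lapse) * core


def simulate(p_same, sigmas, lapse, reps, rng):
    """Return {(sigma, s): number of simulated 'yes' answers out of reps}."""
    return {
        (sg, s): sum(rng.random() < p_yes(s, p_same, sg, lapse) for _ in range(reps))
        for sg in sigmas for s in ASYNCHRONIES_MS
    }


def log_lik(counts, reps, p_same, lapse):
    """Binomial log-likelihood of the simulated counts for a candidate p_same."""
    total = 0.0
    for (sg, s), n_yes in counts.items():
        p = p_yes(s, p_same, sg, lapse)  # stays within (0, 1) because lapse > 0
        total += n_yes * math.log(p) + (reps - n_yes) * math.log(1.0 - p)
    return total


# Hypothetical generating values; reps is large so the estimate is stable.
rng = random.Random(0)
true_p_same, sigmas, lapse, reps = 0.7, (100.0, 150.0, 200.0), 0.05, 500
counts = simulate(true_p_same, sigmas, lapse, reps, rng)
grid = [i / 100 for i in range(5, 96)]
recovered_p_same = max(grid, key=lambda p: log_lik(counts, reps, p, lapse))
```

If the recovered value lies close to the generating value, the likelihood is informative about that parameter for this design, which is the property the full recovery analysis assesses for all parameters jointly.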

Appendix 1—figure 1
The figure displays simulated ‘yes (the rubber hand felt like my own hand)’ answers as a function of visuotactile asynchrony (dots) and corresponding Bayesian causal inference (BCI) model fit (curves).

As in the main text, black, orange, and red correspond to the 0, 30, and 50% noise levels, respectively.

Appendix 1—table 2
Initial parameters used to generate the simulations and recovered parameters.
Participant | Initial (psame, σ0, σ30, σ50, λ) | Recovered (psame, σ0, σ30, σ50, λ)
S10.532461641290.090.512641761330.11
S20.741832041300.150.861521711090.21
S30.39281962230.150.413131112510.09
S40.909732850.020.899433830.02
S50.7318596290.070.74176101310.07
S60.542381982150.190.502942212750.00
S70.261382751100.120.2715117,8031230.12
S80.9012401410.010.87252561460.01
S90.6972652960.080.6602743160.06
S100.1910142120.050.3636477640.05
S110.755032130.160.7647342300.18
S120.691082701910.100.671112722130.09
S130.81224461810.080.79237481930.06
S140.2222203830.010.2234232760.02
S150.402152471560.050.392322231570.03

Model comparison

We used the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) to compare models. These quantities are calculated for each model and each participant:

$$\mathrm{AIC} = 2 n_{\mathrm{par}} - 2 \log L^*$$
$$\mathrm{BIC} = n_{\mathrm{par}} \log n_{\mathrm{trial}} - 2 \log L^*$$

where L* is the maximized value of the likelihood, npar the number of free parameters, and ntrial the number of trials. To compare two models, we calculated the difference in AIC between the two models per participant and summed the differences across the 15 participants. We obtained CIs through bootstrapping: we drew 15 random AIC differences with replacement from the actual participants’ AIC differences, then summed those. This procedure was repeated 10,000 times to compute the 95% CI. The same analysis was also conducted for the BIC results.

4. Pilot experiment and asynchrony sample adjustment

We chose to qualitatively match task difficulty by adjusting the degree of asynchrony in the synchrony judgment task after analyzing the results from 10 participants (six women, 26±4 years) in a pilot study. We only used the zero-noise condition in this pilot and tested identical asynchronies in the two tasks (from –500 ms to +500 ms); otherwise, the procedure was identical to the main experiment. As shown in the table below, in the ±500 ms and the ±300 ms conditions, the number of trials for which the visuotactile stimulation was perceived as synchronous was consistently very low, and in many cases zero. This observation suggests that the synchrony task was too easy and that it would not produce behavioral data that would be useful for model fitting or testing the BCI model. Thus, we adjusted the asynchrony conditions in the synchrony task to make this task more challenging and more comparable to the ownership judgment task. Note that we could not change the asynchronies in the ownership task to match the synchrony task because we need the longer 300 ms and 500 ms asynchronies to break the illusion effectively.

Appendix 1—table 3
Pilot data.

Number of ‘yes’ (the visual and tactile stimulation were synchronous) answers in the synchrony judgment task and of ‘yes’ (the rubber hand felt like it was my own hand) answers in the body ownership task (total number of trials per condition: 12).

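Columns give the asynchrony in ms; S = synchrony judgment, O = ownership judgment.

| Participant | S −500 | S −300 | S −150 | S 0 | S 150 | S 300 | S 500 | O −500 | O −300 | O −150 | O 0 | O 150 | O 300 | O 500 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| P1 | 0 | 0 | 5 | 11 | 4 | 0 | 0 | 0 | 1 | 6 | 7 | 3 | 4 | 0 |
| P2 | 0 | 0 | 2 | 12 | 3 | 0 | 0 | 9 | 12 | 12 | 12 | 12 | 10 | 0 |
| P3 | 0 | 0 | 1 | 12 | 2 | 0 | 0 | 0 | 2 | 11 | 12 | 12 | 9 | 0 |
| P4 | 0 | 0 | 1 | 12 | 1 | 1 | 0 | 4 | 6 | 9 | 11 | 11 | 11 | 8 |
| P5 | 0 | 1 | 3 | 11 | 1 | 0 | 0 | 0 | 3 | 7 | 12 | 6 | 2 | 0 |
| P6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 11 | 12 | 12 | 12 | 11 | 9 | 7 |
| P7 | 0 | 0 | 1 | 9 | 2 | 0 | 0 | 0 | 8 | 12 | 12 | 12 | 2 | 0 |
| P8 | 0 | 0 | 2 | 10 | 0 | 1 | 0 | 5 | 6 | 8 | 11 | 8 | 4 | 2 |
| P9 | 1 | 0 | 1 | 12 | 3 | 0 | 0 | 3 | 7 | 10 | 12 | 3 | 2 | 0 |
| P10 | 0 | 0 | 3 | 12 | 2 | 0 | 0 | 0 | 4 | 10 | 12 | 5 | 2 | 0 |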

To assess whether this change in asynchrony range between tasks could explain the lower prior probability of a common cause in the synchrony detection task, we applied our extension analysis to the pilot data to test the BCI model on tasks with identical asynchronies. The pilot study did not manipulate the level of sensory noise (only the 0% noise level was included). Appendix 1—figure 2 shows the key results regarding the estimated psame. The same trend was observed as in the main experiment: the estimated a priori probability of a common cause was lower for synchrony judgment than for body ownership. However, for more than half of our pilot participants, psame for body ownership reached the extremum (psame=1). This ceiling effect probably arose because the synchrony task was too easy when using asynchronies of 300 ms and 500 ms as in the ownership task; it lacked the challenging stimulation conditions required to finely assess the participants’ perception as a gradual function. This observation further convinced us that we needed to make the synchrony judgment task more difficult by reducing the longer asynchronies to obtain high-quality behavioral data that would allow us to test the subtle effects of sensory noise, compare different models, and compare with the ownership judgment task in a meaningful way. From a more general perspective, different tasks may interact differently with sensory factors, but we argue that such task differences are most likely reflected in a change in prior. Even if our model cannot rule out some task-related influences on sensory processing, our interpretation that the priors are genuinely different between the two tasks is consistent with previous studies that examined the relationship between synchrony perception and body ownership (Costantini et al., 2016; Chancel and Ehrsson, 2020; Maselli et al., 2016; see introduction).

Appendix 1—figure 2
Correlation between the prior probability of a common cause psame estimated for the ownership and synchrony tasks in the extension analysis in the pilot study (left) and the main study (right).

The solid line represents the linear regression between the two estimates, and the dashed line represents the identity function (f(x) = x).

Data availability

Figure 1—source data 1, Figure 2—source data 1 and Figure 3—source data 1 contain the numerical data used to generate the figures and their supplements. These files have also been made available: https://osf.io/zu2h6/.

The following data sets were generated
    1. Chancel M
    2. Ehrsson HH
    3. Ma WJ
    (2021) Open Science Framework
    ID n7atw. Uncertainty-based inference of a common cause for body ownership.

References

Acerbi L, Ma WJ (2017) Practical Bayesian optimization for model fitting with Bayesian adaptive direct search. Advances in Neural Information Processing Systems 30:1834–1844.

Akaike H (1973) Information theory and an extension of the maximum likelihood principle. Second International Symposium on Information Theory, 267–281. Budapest: Academiai Kiado.

de Vignemont F (2018) Mind the Body: An Exploration of Bodily Self-Awareness. Oxford University Press.

Ehrsson HH (2012) The concept of body ownership and its relation to multisensory integration. In: The New Handbook of Multisensory Processes. MIT Press.

Ehrsson HH (2020) Multisensory processes in body ownership. In: Multisensory Perception, 179–200. Elsevier. https://doi.org/10.1016/B978-0-12-812492-5.00008-5

Radziun D, Ehrsson HH (2018) Auditory cues influence the rubber-hand illusion. Journal of Experimental Psychology: Human Perception and Performance 44:1012–1021. https://doi.org/10.1037/xhp0000508

Stein BE, Meredith MA (1993) The Merging of the Senses. The MIT Press.

Decision letter

  1. Virginie van Wassenhove
    Reviewing Editor; CEA, DRF/I2BM, NeuroSpin; INSERM, U992, Cognitive Neuroimaging Unit, France
  2. Tamar R Makin
    Senior Editor; University of Cambridge, United Kingdom
  3. Zoltan Dienes
    Reviewer; University of Sussex, United Kingdom
  4. Liping Wang
    Reviewer; Chinese Academy of Sciences, China
  5. Mate Aller
    Reviewer; MRC Cognition and Brain Sciences Unit, United Kingdom

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Decision letter after peer review:

Thank you for submitting your article "Uncertainty-based inference of a common cause for body ownership" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Tamar Makin as the Senior Editor. The following individuals involved in the review of your submission have agreed to reveal their identity: Zoltan Dienes (Reviewer #1); Liping Wang (Reviewer #2); Mate Aller (Reviewer #3).

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

Essential revisions:

(1) A first reviewer highlights a major interpretation problem based on the 12 s asynchrony, contrasting a domain-general reasoning module with a body ownership one. The reviewer makes suggestions on how to accommodate and clearly refine the claims (e.g. explaining that a main observation is about a shift in criterion).

(2) Consistent with the above issue, a second reviewer highlights several empirical limitations and inconsistencies in the authors' analytical approach (points 2 to 5) with a lack of clear comparisons between the two tasks. These points need to be addressed as listed.

(3) The issue of inter-individual differences is being raised by two reviewers and several suggestions are made to provide a more transparent report of the BCI model and their consistency with authors' interpretations.

Reviewer #1 (Recommendations for the authors):

The authors explore the basis of the "rubber hand illusion" in which people come to feel a rubber hand as their own when it is stroked or tapped synchronously with their real hand. They show that when participants are exposed to lags in the tapping of the rubber and real hand, under conditions of different visual noise, people adjust their sense of ownership according to the manipulated variables. The degree of adjustment can be fitted by a Bayesian model, i.e. one in which evidence for ownership is construed as strength of evidence for the stroking being synchronous, and this is used to update the prior probability of there being a common cause for the sight and feeling of stroking.

The paper shows a lot of care in writing and study methodology. Setting up the equipment, the robotic device and the VR, required a lot of effort; the modeling and analysis have been conscientiously performed. The weakness for me is in the background framing of what is going on: Naturally, as is the authors' prerogative, they have taken a particular stance. However, they also say that these data may help distinguish different theories and approaches from their own preferred one. But no argument is made for how the data distinguish their interpretative approach from different ones. This is left as a promissory note. In fact, I think there is a rather different way of viewing the data.

The authors frame their work in terms of a mechanism for establishing bodily ownership. On the other hand, people may infer how feelings of ownership should vary given what is manipulated in the experiment. That is, if asynchrony is manipulated we know people view this as something that is meant to change their feeling of ownership (e.g. Lush 2020 Collabra). That is, the results may be based on general inference rather than any special ownership module. Consistently, reasoning in medical diagnosis and legal decisions can be fit by Bayesian models as well (e.g. Kitzis et al., Journal Math. Psych.; Shengelia and Lagnado, 2021, Frontiers in Psych). That is, the study could be brought under the research programme of showing people often reason in approximately Bayesian ways in all sorts of inference tasks, especially when concrete information is given trial by trial (e.g. Oaksford and Chater).

The results support the bare conclusions I stated, but what those conclusions mean is still up for grabs. I always welcome attempts to model theoretical assumptions, and the approach is more rigorous than many other experiments in that field. Hopefully it will set an example.

Modeling how inferences are formed when presented with different sources of information is an important task, whether or not those inferences reflect the general ability of people to reason, or else the specific processes of a particular model. For example, Fodor claimed that general reasoning was beyond the scope of science, but the fact that across many different inference tasks similar principles arise – roughly though not exactly Bayesian – indicates Fodor was wrong!

Specific comments:

i) The model can be described as Bayesian, but how different is it from a signal detection theory (SDT) model, with adjustable vs fixed criteria, and a criterion offset for the RHI and asynchrony judgment task? In other words, rather than the generative model being an explicit model for the system, instead different levels of asynchrony simply produce certain levels of evidence for a time difference, as in an SDT model. Then set up criteria to judge whether there was or was not a time difference. Adjust criteria according to prior probabilities, as is often done in SDT. That's it. Is this just a verbal rephrasing?

One thing on the framing of the model: Surely a constant offset over several taps is just as good evidence for a common cause no matter whether the offset is 0 or something else? But if the prior probability is specifically that the common cause is that the same hand is involved (which requires an offset close to 0), surely that prior probability is essentially zero? So how should the assumption of C1 be properly framed?

ii) Lines 896-897: "…temporally correlated visuotactile signals are a key driving factor behind the emergence of the rubber hand illusion Chancel and Ehrsson, 202, …"

Cite findings that need some twisting to fit in with this view, e.g. the finding that imagining a hand is one's own creates SCRs to its being cut; and, perhaps easier to deal with but I think rather telling, no visual input of a hand is needed (Guterstam et al., 2013) and laser lights instead of brushes work just about as well (Durgin et al., 2013) as does stroking the air (Guterstam et al., 2016), making magnetic sensations akin to the magnetic hands suggestion in hypnosis. It seems the simplest explanation is that participants respond to what they perceive as what is meant to be relevant in the study manipulations. Suitable responses are constructed, based on genuine experience or otherwise, in accordance with the needs of the context. The very way the question is framed determines the answer according to principles of general inference (e.g. Lush and Seth, 2022, Nat Coms; Corneille and Lush, https://psyarxiv.com/jqyvx/).

iii) Provide means and SEs for conditions for synchrony judgment tasks.

iv) Discuss the alternative views of how the study could be interpreted as I have indicated above.

Reviewer #2 (Recommendations for the authors):

Using rubber hand illusion in humans, the study investigated the computational processes in self-body representation. The authors found that the subjects' behavior can be well captured by the Bayesian causal inference model, which is widely used and well described in multisensory integration. The key point in this study is that the body ownership perception was well predicted by the posterior probability of the visual and tactile signals coming from a common source. Although this notion was investigated before in humans and monkeys, the results are still novel:

1. This study directly measures illusions with the alternative forced-choice report instead of objective behavioral measurements (e.g., proprioceptive drift).

2. The visual sensory uncertainty was changed trial by trial to examine the contribution of sensory uncertainty in multisensory body perception. Combined with the model comparison results, these results support the superiority of a Bayesian model in predicting the emergence of the rubber hand illusion relative to the non-Bayesian model.

3. The authors investigated the asynchrony task in the same experiment to compare the computational processing in the RHI task and visual-tactile synchrony detection task. They found that the computational mechanisms are shared between these tasks, while the prior of common source is different.

In general, the conclusions are well supported, and the findings advanced our understanding of the computational principles of body ownership.

Main comments:

1. One of the critical points in this study is the comparison between the BCI model which takes the sensory uncertainty into account and the non-Bayesian (fixed-criterion) model. Therefore, I suggest the authors show the prediction results of both the BCI and non-Bayesian model in Figure 2 and compare the key hypothesis and the prediction results.

2. This study has two tasks: the ownership task and the asynchrony task. As the temporal disparity is the key factor, the criteria for determining the time disparities are important. The authors claim that the time disparities used in the two tasks were determined based on the pilot experiments to maintain an equivalent difficulty level between the two tasks. If I understand correctly, the subjects were asked to report whether the rubber hand felt like their own hand or not in the ownership task. Thus, there are no objective criteria of right or wrong. The authors should clarify how they define the difficulty of the two tasks and how they ensure that the difficulty is equal. I think this is important because the time disparities in these two tasks were different, and the comparison of Psame in these tasks may be affected by the time disparities. Furthermore, the authors claimed that the ownership and visual-tactile synchrony perception had distinct multisensory processing according to the different Psame in the two tasks. Thus the authors should show further evidence to exclude the possibility that the difference in Psame results from the chosen time disparities.

3. Related to the question above, the authors found that the same BCI model can reasonably predict the behavioral results in these two tasks with the same parameters (or only different Psame). They claimed that these two tasks shared similar results in multisensory causal inference. While in the following, they argued that there was a distinct multisensory perception in these two tasks because the Psame were different. If these tasks shared the same BCI computational approaches and used the posterior probability of common source to determine ownership and synchrony judgment, what is the difference between them?

4. The extension analysis showed that the Psame values from the two tasks were correlated across subjects. This is very interesting. However, since the uncertainty of timing perception (the sensory uncertainty of perceived time of the tactile stimuli on real hand and fake hand) was taken into account to estimate the posterior probability of common source in this study, the variance across subjects in ownership and synchrony task can only be interpreted by the Psame. In fact, the timing perception was considered as a Gaussian distribution in a modulated version of the BCI model for the agency (temporal binding) (R. Legaspi, 2019) and ownership (Samad, 2015). It would be more persuasive if the authors excluded the possibility that individual differences in timing uncertainty explain the variance across subjects.

5. Please include the results of single-subject behavior in the asynchrony task. It is helpful to compare the behavioral pattern between these two tasks. The authors compared the Psame between ownership and asynchrony tasks. Still, they did not directly compare the behavioral results (e.g., the reported proportion of elicited rubber hand illusions and the reported proportion of perceived synchrony).

6. The analysis of model fitting seems to lack some details. If I understood it correctly, the authors repeated the fitting procedure 100 times (line 502), then averaged all repeats as the final results of each parameter? It is reported that "the same set of estimated parameters at least 31 times for all participants and models". What does this sentence mean? Can the authors show more details about the repeats of the model result?

7. In Figure 3A, was there an interaction effect between the time disparity and visual noise level?

8. Line 624, the model comparison results suggested that the subjects have the same standard deviation as the true stimulus distribution. I encourage the authors to directly compare the BCI* model predicted uncertainty (σs) to the true stimulus uncertainty, which will make this conclusion more convincing.

9. How did the authors fit the BCI model to the combined dataset from both ownership and synchrony tasks? What is the cost function when fitting the combined dataset?

10. As shown in the supplementary figures, the variations of ownership between the three visual noise levels varied widely among subjects, as did the predicted visual sensory sigmas (Appendix 1 – Table 2). Does ownership in the three visual noise levels correlate with individual differences in visual uncertainty?

11. The statements of the supplementary figures are confusing. For example, it is hard to determine which one is the "Supplementary File 1. A" in line 249?

12. Line 1040, it is hard to follow how to arrive at this equation from the previous formulas. Please give some more details and explanations.

Reviewer #3 (Recommendations for the authors):

This study investigated the computational mechanisms underlying the rubber hand illusion. Combining a detection-like task with the rubber hand illusion paradigm and Bayesian modelling, the authors show that human behaviour regarding body ownership can be best explained by a model based on Bayesian causal inference which takes into account the trial-by-trial fluctuations in sensory evidence and adjusts its predictions accordingly. This is in contrast with previous models which use a fixed criterion and do not take trial-by-trial fluctuations in sensory evidence into account.

The main goal of the study was to test whether body ownership is governed by a probabilistic process based on Bayesian causal inference (BCI) of a common cause. The secondary aim was to compare the body ownership task with a more traditional multisensory synchrony judgement task within the same probabilistic framework.

The objective and main question of the study is timely and interesting. The authors developed a new version of the rubber hand illusion task in which participants reported their perceived body ownership over the rubber hand on each trial. With the manipulation of visual uncertainty through augmented reality glasses they were able to assess whether trial-by-trial fluctuation in sensory uncertainty affects body ownership – a key prediction of the BCI model.

This behavioural paradigm opens up the intriguing possibility of testing the BCI model for body ownership at a neural level with fMRI or EEG (e.g., as in Rohe and Noppeney (2015, 2016) and Aller and Noppeney (2019)).

I was impressed by the methodological rigour, modelling and statistical methods of the paper. I was especially glad to see the modelling code validated by parameter recovery. This greatly increases one's confidence that good coding practices were followed. It would be even more reassuring if the analysis code were made publicly available.

The data and analyses presented in the paper support the key claims. The results represent a relevant contribution to our understanding of the computational mechanisms of body ownership. The results are adequately discussed in light of a broader body of literature. Figures are well designed and informative.

Main points:

1. Line 298: It is not clear if all 5 locations were stimulated in each 12 s stimulation phase or they were changed only between stimulation phases. Please clarify.

2. Line 331: "The 7 levels of asynchrony appeared with equal frequencies in pseudorandom order". I assume this was also true of the noise conditions, i.e., they also appeared in pseudorandom order with equal frequencies and not e.g., blocked. Could you please make this explicit here?

3. Line 348: Was the pilot study based on an independent sample of participants from the main experiment? Please also include standard demographics data (mean+/-SD age, sex) from the pilot study.

4. Line 406: From the standpoint of understanding the inference process at a high level, the crucial step of how prior probabilities are combined with sensory evidence to compute posterior probabilities is missing from the equations. More precisely, it is not exactly missing, but it is buried inside the definition of K (line 416) if I understand correctly. I think it would make it easier for non-experts to follow the thought process if Equation 5 from the Supplementary material were included here.

5. Line 511: There are different formulations of BIC, could you please state explicitly the formula you used to compute it? Please also state the formula for AIC.

6. Line 512: "Badness of fit": Interesting choice of word, I completely understand why it is chosen here, however perhaps I would use "goodness of fit" instead to avoid confusion and for the sake of consistency with the rest of the paper.

7. Figure 4: I think the title could be improved here, e.g., "Model predictions of behavioural results for body ownership" or something similar. Details in the current title (mean +/- sem etc.) could go in the figure legend text.

I am a bit confused about why the shaded overlays from the model fits are shaped as oblique polygons? This depiction hints that there is a continuous increase in the proportion of "yes" answers in the neighbourhood of each noise level. Aren't these model predictions based on a single noise level value?

The mean model predictions are not indicated in the figure only the +/- SEM ranges marked by the shaded areas.

Line 261: Given that participants' right hand was used consistently, I am wondering if it makes any difference if the dominant or non-dominant hand is used to elicit the rubber hand illusion? If this is a potential source of variability it would be useful to include information on the handedness of the participants in the methods section.

Line 314: Please use either stimulation "phase" or "period" consistently across the manuscript.

Line 422: The study by Körding et al., (2007) is about audiovisual spatial localization, not synchrony judgment as is referenced currently. Please correct.

Line 529: Perhaps the better paper to cite here would be Rigoux et al., (2014) as this is an improvement over Stephan et al., (2009) and also this is the paper that introduces the protected exceedance probability which is used here.

Figure 3 and potentially elsewhere: Please overlay individual data points on bar graphs. There is plenty of space to include these on bar graphs and would provide valuable additional information on the distribution of data.

Figure 5A: Please consider increasing the size of markers and their labels for better visibility (i.e., similar size as in panels B and C).

Lines 608, 611, 639-640, and potentially elsewhere: Please indicate what values are stated in the form XX +/- YY. I assume they represent mean +/- SEM, but this must be indicated consistently throughout the manuscript.

[Editors’ note: further revisions were suggested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "Uncertainty-based inference of a common cause for body ownership" for further consideration by eLife. Your revised article has been evaluated by Tamar Makin (Senior Editor) and a Reviewing Editor.

The manuscript has been improved but there are some remaining issues that need to be addressed, as outlined below:

In particular, there was a consensus that the potential contribution of demand characteristics should be discussed, rather than dismissed. We also ask that you discuss the potential utility of a more simple model (signal detection theory). Please see more details below.

1) The authors added a paragraph detailing why they minimised (in their opinion) the contributions of demand characteristics. They argue that the theory that subjects respond to demand characteristics "cannot explain the specifically shaped psychometric curves with respect to the subtle stepwise asynchrony manipulations and the widening of this curve by the noise manipulation as predicted by the causal inference model." We were not fully convinced by this argument and wonder why you would want to categorically rule this possibility out.

Reviewer 1 wrote back: This claim is false. The authors should spell out the alternative theory in terms of subjects using general purpose Bayesian inference to infer what is required. Subjects do not need to know how general purpose inference works; they just need to use it. Does the fact that information was delivered trial by trial over many trials rule out general purpose Bayesian inference? On the contrary, it supports it. Bayesian updating is particularly suited to trial by trial situations (e.g. see general learning and reasoning models by Christoph Mathys), as illustrated by the authors' own model. The authors' model could be a general purpose model of Bayesian inference, shown by its applicability to the asynchrony judgment task. The fact that the number of asynchrony levels may not have been noticed by subjects is likewise irrelevant; subjects do not need to know this. Indeed, the authors point out that the asynchrony judgment task was easier than the RHI task, so that the authors needed to use a smaller asynchrony range for this task than the RHI one. That is, subjects' ability to discriminate asynchronies is shown by the authors' data to be more than enough to allow task-appropriate RHI judgments. (Even if successive differences between asynchrony levels were below a jnd, one would of course still get a well-formed psychophysical function over an appropriate span of asynchronies; so subjects could still use asynchrony in lawful ways as part of general reasoning, i.e. apart from any specific self module.)

(In their cover letter the authors bring up other points that are covered by the fact that subjects do not need to know how e.g. inference and imagination work in order to use them. It is well established that response to imaginative suggestions involves the neurophysiology underlying the corresponding subjective experience (e.g., https://psyarxiv.com/4zw6g/ for review); imagining a state of affairs will create appropriate fMRI or SCR responses without the subject knowing how either fMRI or SCRs work.) In sum, none of the arguments raised by the authors actually count against the alternative theory of general inference in response to demand characteristics (followed by imaginative absorption), which remains a simple alternative explanation.

Following a discussion, we ask that, to address the Reviewer's perspective, you acknowledge the possibility that demand characteristics are contributing to participants' performance.

2) "How the a priori probabilities of a common cause under different perceptive contexts are formed remains an open question."

Reviewer 1: One plausible possibility is that in a task emphasizing discrimination (asynchrony task) vs emphasizing integration (RHI) subjects infer different criteria are appropriate; namely in the former, one says "same" less readily.

and

3) "temporally correlated visuotactile signals are a key driving factor behind the emergence of the rubber hand illusion"

Reviewer 1: On the theory that the RH effect is produced by whatever manipulation appears relevant to subjects, of course in this study, where asynchrony was clearly manipulated, asynchrony would come up as relevant. So the question is, are there studies where visual-tactile asynchrony is not manipulated, but something else is, so subjects become responsive to something else? And the answer is yes. Guterstam et al., obtained a clear RH ownership effect and proprioceptive drift for brushes stroking the air, i.e. not touching the rubber hand. Durgin et al., obtained a RH effect with laser pointers, i.e. no touch involved either. The authors may think the latter effect will not replicate; but potentially challenging results still need to be cited.

Here we do not ask that you provide an extensive literature review. Instead, we simply ask that you acknowledge in the discussion that task differences might influence participants' performance (similar to our request above).

4) On noise effects.

Reviewer 1: If visual noise increases until 100% of pixels are turned white, the ratio of likelihoods for C=1 vs C=2 must go to 1 (as there is no evidence for degree of asynchrony) so the probability of saying "yes" goes to p_same, no matter the actual asynchrony (which by assumption cannot be detected at all in this case). p_same is estimated as 0.8 in the RHI condition. Yet as noise increases, p(yes) actually increases higher than 0.8 in the -150 to +150 asynchrony range (Figure 2). Could an explanation be given of why noise increases p(yes) other than the apparent explanation I just gave (that p(yes) moves towards p_same as the relative evidence for the different causal processes reduces)?

The deeper issue is that it seems as visual noise increases, the probability that subjects say the rubber hand is their own increases and becomes less sensitive to asynchrony. In the limit it means that if one had very little visual information, and just knew a rubber hand had been placed nearby on the table, one would be very likely to say it feels like one's own hand (maybe around the level of p_same), just because one felt some stroking on one's own hand. But if the reported feeling of the rubber hand were the output of a special self processing system, the prior probability of a rubber hand slapped down on the table being self must be close to 0; and it must remain close to zero even if you felt some stroking of your hand and saw visual noise. But if the tendency to say "it felt like my own hand" was an experience constructed by realizing the paradigm called for this, then the baseline probability of saying a rubber hand is self could well be high, even in the presence of a lot of visual noise.

Please consider that this effect may bear on the two different explanations.
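
The limiting argument in point 4 can be written out with Bayes' rule. The following is only a sketch: x stands for the noisy asynchrony measurement and C for the number of causes, and the likelihood ratio Λ(x) is notation introduced here for illustration rather than taken from the manuscript.

```latex
\begin{align}
p(C=1 \mid x)
  &= \frac{p_{\text{same}}\, p(x \mid C=1)}
          {p_{\text{same}}\, p(x \mid C=1) + (1 - p_{\text{same}})\, p(x \mid C=2)}
   = \frac{p_{\text{same}}\, \Lambda(x)}{p_{\text{same}}\, \Lambda(x) + 1 - p_{\text{same}}},
  \qquad \Lambda(x) = \frac{p(x \mid C=1)}{p(x \mid C=2)} \\
\Lambda(x) \to 1
  &\;\Longrightarrow\; p(C=1 \mid x) \to \frac{p_{\text{same}}}{p_{\text{same}} + 1 - p_{\text{same}}} = p_{\text{same}}
\end{align}
```

With p_same = 0.8, the posterior therefore tends to 0.8 when the measurement carries no information. Whether the probability of a "yes" report also tends to p_same depends on how the posterior is mapped to a response (for example, reporting in proportion to the posterior versus reporting "yes" whenever the posterior exceeds 0.5), which this sketch leaves open.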

5) The authors reject an SDT model in the cover letter on the grounds that subjects would not know where to place their criterion on a trial by trial basis taking into account sensory uncertainty.

Reviewer 1: Why could not subjects attempt to roughly keep p(yes) the same across uncertainties? If the authors say this is asking subjects to keep track of too much, note in the Bayesian model the subjects need an estimate of a variance for each uncertainty to work out the corresponding K. That seems to be asking for even more from subjects. The authors should acknowledge the possibility of a simple SDT model and what if anything hangs on using these different modelling frameworks.

We feel that a brief mention of this possibility will benefit the community when considering how to leverage your interesting work in future studies.
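
The contrast raised in point 5, between an observer with a fixed criterion and a Bayesian observer whose criterion K depends on sensory uncertainty, can be illustrated with a minimal sketch. It assumes a generative model that is not spelled out in this section: under a common cause the true asynchrony is 0, under separate causes it is drawn from a zero-mean Gaussian with standard deviation `sigma_s`, and the measurement adds Gaussian noise with standard deviation `sigma`. All function names and numeric values (in ms) are hypothetical and chosen for illustration only; this is not the authors' code.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def posterior_common(x: float, sigma: float, sigma_s: float, p_same: float) -> float:
    """Posterior p(C=1 | x) under the assumed Gaussian generative model."""
    var, var_s = sigma ** 2, sigma_s ** 2
    # C=1: x ~ N(0, sigma^2); C=2: x ~ N(0, sigma^2 + sigma_s^2)
    like1 = math.exp(-x * x / (2 * var)) / math.sqrt(2 * math.pi * var)
    like2 = math.exp(-x * x / (2 * (var + var_s))) / math.sqrt(2 * math.pi * (var + var_s))
    return p_same * like1 / (p_same * like1 + (1 - p_same) * like2)


def bayesian_criterion(sigma: float, sigma_s: float, p_same: float) -> float:
    """Criterion K such that p(C=1 | x) > 0.5 exactly when |x| < K."""
    var, var_s = sigma ** 2, sigma_s ** 2
    log_term = math.log(p_same / (1 - p_same)) + 0.5 * math.log((var + var_s) / var)
    k_squared = 2 * var * (var + var_s) / var_s * log_term
    # If k_squared <= 0, the posterior never exceeds 0.5, so the criterion collapses to 0.
    return math.sqrt(k_squared) if k_squared > 0 else 0.0


def p_yes(s: float, sigma: float, k: float) -> float:
    """Probability that |x| < k when x ~ N(s, sigma^2), i.e. of a "yes" report."""
    return normal_cdf((k - s) / sigma) - normal_cdf((-k - s) / sigma)


if __name__ == "__main__":
    sigma_s, p_same = 300.0, 0.8          # hypothetical values
    k_fixed = bayesian_criterion(100.0, sigma_s, p_same)  # criterion frozen at low noise
    for sigma in (100.0, 200.0):          # hypothetical low and high sensory noise
        k_bayes = bayesian_criterion(sigma, sigma_s, p_same)
        print(sigma, p_yes(150.0, sigma, k_bayes), p_yes(150.0, sigma, k_fixed))
```

Under these assumptions the Bayesian criterion widens as sensory noise grows, whereas the fixed-criterion observer keeps the same bound at every noise level; an SDT observer who shifts the criterion to hold p(yes) roughly constant would be a third variant not shown here.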

6) A few proofing notes that have been picked up by Reviewer 3 (these are not comprehensive, so please read over the manuscript again more carefully):

1. Main points 1 and 3: The changes in response to these points as indicated in the response to reviewers are not exactly incorporated in the main manuscript file. Could you please correct?

2. Main point 4: in the main manuscript file there is an unnecessary '#' symbol at the end of the equation, please remove.

3. Main point 7: the title for figure 2 in the updated manuscript does not match the title indicated in the response to reviewers. I think the latter would be a better choice.

4. Supplements for figures 2, 3, 4: It seems that after re-numbering these figures, the figure legends for their supplement figures have not been updated and they still show the original numbering. Could you please update?

https://doi.org/10.7554/eLife.77221.sa1

Author response

Essential revisions:

(1) A first reviewer highlights a major interpretation problem based on the 12 s asynchrony, contrasting a domain-general reasoning module with a body ownership one. The reviewer makes suggestions on how to accommodate and clearly refine the claims (e.g. explaining that a main observation is about a shift in criterion).

We have considered the first reviewer’s comments very carefully and provided detailed responses to all his comments; we also explicitly discuss these issues in the new version of the manuscript.

However, we disagree with the comment that there is a major interpretational problem with our experimental asynchrony manipulation. We are also puzzled by the recommendation that we need to “refine the claims” and that “the main observation is about a shift in criterion” because all three reviewers stated that our results support our main conclusions, and none of the reviewers claimed that our main finding is about a shift in criterion, as far as we can see: Reviewer 2 discusses some issues related to the comparisons of the two tasks and inter-individual differences that we replied to, and Reviewer 1 wants us to consider a couple of alternative interpretations. We addressed the latter concern and detailed why we disagree with Reviewer 1’s suggestion that demand characteristics and domain-general reasoning are a reasonable alternative explanation for our key findings.

Our main findings are the specifically shaped psychometric curves with respect to the subtle stepwise asynchrony manipulations and the widening of this curve by the noise manipulation as predicted by the causal inference model. We do not see how this finding can be explained by demand characteristics or a domain-general cognitive reasoning strategy. In brief, the “computational” hypothesis is hidden from the naïve participants and can probably not be figured out spontaneously by them in the current detection task with over 500 randomized trials. Note further that we analyze and model each individual subject’s illusion detections, and we find very good replication of our key modeling results between individual participants. There are also many features of the task design and the experimental procedures that reduce the risk of demand characteristics, e.g., the use of robots and the experimenter being blind to the noise manipulation. Critically, the shortest asynchrony we used (150 ms) was short enough that most participants did not perceive it reliably as asynchronous. Indeed, previous work in the literature identified the crossmodal threshold to detect visuotactile asynchrony at around 200 ms, which was confirmed by the analysis of our own asynchrony detection task data showing that participants did not detect the +/- 150 ms asynchrony above chance level. Moreover, the participants did not know how many different asynchrony levels were tested (as revealed in post-experiment interviews). Another crucial point is that no feedback was given to the participants about task performance, so they could not learn to map their responses to the different experimental manipulations.

Cognitive bias might, of course, influence our data as in any perceptual decision paradigm, but our critical argument is that in the current study, such effects will most likely affect the conditions globally (across conditions) and are very unlikely to explain the specific changes in the psychometric curves that fitted our causal inference model’s predictions.

We also disagree with some of reviewer 1’s more general theoretical comments about the rubber hand illusion, where he seems to imply that the illusion might be nothing more than a combination of demand characteristics and domain-general reasoning, perhaps supplemented by hypnotic suggestibility. Such strong claims are not supported by the previous rubber hand illusion literature or the current study’s results. We have criticized Peter Lush’s and reviewer 1’s controversial claims about the rubber hand illusion in other publications (Ehrsson et al., 2022 Nature Communications; Slater and Ehrsson, 2022 Frontiers in Human Neuroscience), and in the current response letter we revisit and further clarify some of this debate with respect to the current study’s specific findings and present results from additional analyses. We are also attaching materials from two other ongoing studies that further support the current study’s main conclusion. In one model-based fMRI study, using the current rubber hand illusion detection task, we showed that the hand ownership detection decisions are associated with increased activity in specific multisensory brain areas. And in one behavioral study, we used signal detection theory in a rubber hand illusion discrimination task, the results of which support the perceptual nature of the decision processes.

(2) Consistent with the above issue, a second reviewer highlights several empirical limitations and inconsistencies in the authors' analytical approach (points 2 to 5) with a lack of clear comparisons between the two tasks. These points need to be addressed as listed.

We have addressed all the empirical issues raised by the second reviewer in our point-by-point responses. Thanks to these constructive remarks, we think our analytical approach is now more clearly presented. We also justified in more detail our computational approach and explained the strengths and limitations of comparing the two tasks at the computational level.

(3) The issue of inter-individual differences is being raised by two reviewers and several suggestions are made to provide a more transparent report of the BCI model and their consistency with authors' interpretations.

We addressed all the reviewers’ concerns related to inter-individual differences and used many of their recommendations to make changes in the manuscript so that the reporting of the models and the consistency of the results with our conclusions are more transparent. We also present several additional analyses and figures in the point-by-point response letter to further clarify these points. We hope that all our results regarding the BCI model and the issues related to inter-individual differences are now presented clearly and transparently.

We thank the reviewers and the editor for the valuable feedback and exciting discussions, and we are confident that the manuscript has been improved by taking into account all the various concerns, suggestions, and positive feedback.

Reviewer #1 (Recommendations for the authors):

The authors explore the basis of the "rubber hand illusion" in which people come to feel a rubber hand as their own when it is stroked or tapped synchronously with their real hand. They show that when participants are exposed to lags in the tapping of the rubber and real hand, under conditions of different visual noise, people adjust their sense of ownership according to the manipulated variables. The degree of adjustment can be fitted by a Bayesian model, i.e. one in which evidence for ownership is construed as strength of evidence for the stroking being synchronous, and this is used to update the prior probability of there being a common cause for the sight and feeling of stroking.

The paper shows a lot of care in writing and study methodology. Setting up the equipment, the robotic device and the VR, required a lot of effort; the modeling and analysis have been conscientiously performed. The weakness for me is in the background framing of what is going on: Naturally, as is the authors' prerogative, they have taken a particular stance. However, they also say that these data may help distinguish different theories and approaches from their own preferred one. But no argument is made for how the data distinguish their interpretative approach from different ones. This is left as a promissory note. In fact, I think there is a rather different way of viewing the data.

The authors frame their work in terms of a mechanism for establishing bodily ownership. On the other hand, people may infer how feelings of ownership should vary given what is manipulated in the experiment. That is, if asynchrony is manipulated we know people view this as something that is meant to change their feeling of ownership (e.g. Lush 2020 Collabra). That is, the results may be based on general inference rather than any special ownership module. Consistently, reasoning in medical diagnosis and legal decisions can be fit by Bayesian models as well (e.g. Kitzis et al., Journal Math. Psych.; Shengelia and Lagnado, 2021, Frontiers in Psych). That is, the study could be brought under the research programme of showing people often reason in approximately Bayesian ways in all sorts of inference tasks, especially when concrete information is given trial by trial (e.g. Oaksford and Chater).

The results support the bare conclusions I stated, but what those conclusions mean is still up for grabs. I always welcome attempts to model theoretical assumptions, and the approach is more rigorous than many other experiments in that field. Hopefully it will set an example.

Modeling how inferences are formed when presented with different sources of information is an important task, whether or not those inferences reflect the general ability of people to reason, or else the specific processes of a particular model. For example, Fodor claimed that general reasoning was beyond the scope of science, but the fact that across many different inference tasks similar principles arise – roughly though not exactly Bayesian – indicates Fodor was wrong!

We thank the reviewer for his positive remarks on the methods, modeling results, and the writing of our paper. We are more than happy to discuss the alternative ways of interpreting the data that the reviewer is bringing up in this report. We agree with the reviewer that we conducted our study to test a particular hypothesis and computational model about the rubber hand illusion, and that our study was not designed to try to distinguish between different theories. So in the manuscript, we will stay focused on the conclusions that are directly supported by our results and avoid unnecessary theoretical speculation.

The reviewer’s central critical point is that our results might stem from ‘demand characteristics’, i.e., that participants may infer how feelings of ownership should vary given the experimental manipulations and then use a general reasoning strategy to generate the behavioral responses to meet these expectancies. In other words, rather than performing perceptual decisions based on a genuine illusion experience, the participants may simply be reporting as they think the researchers want them to report. In our response below, we will first analyze this concern in detail, and we will argue that it is very unlikely that demand characteristics and a general reasoning strategy can explain our findings. Next, we will consider some of the more general theoretical points raised by the reviewer, including the recent studies by Lush and colleagues (Lush, 2020; Lush et al., 2020). Here we will argue that the current study’s main findings are well-protected against the kind of cognitive effects reported in Lush’s articles. Despite some theoretical disagreements about rubber hand illusion (Ehrsson et al., 2022; Slater and Ehrsson, 2022), we think our positions converge on the importance of developing new methods to extend existing computational frameworks to several domains of human cognition and perception.

Demand characteristics and a general cognitive reasoning strategy

The reviewer’s central concern is that the current data emerge from a general cognitive inference process driven by cognitive expectancies that the participants develop about the different conditions. But what would these expectations be, more precisely? We used seven steps of visuotactile asynchrony (0 ms, +/- 150 ms, +/- 300 ms and +/- 500 ms) and these differed by small temporal intervals. This means that it is difficult for the participants to separate them and keep track of which stimuli belong to which trial type among the hundreds of fully randomized trials in our psychophysics experiments. Critically, our hypotheses are “hidden” and correspond to specific psychometric curves that change in an unintuitive way with the noise manipulation. Thus, it is very difficult for the participants to develop a meaningful understanding of what the study is supposed to demonstrate and what they should feel on any given trial.

Importantly, even an asynchrony as short as 150 ms leads to significantly different detection of the rubber hand illusion (- 150 ms versus 0 ms conditions: t = -3.31, df = 14, p = 0.0052, 95% CI = [-2.75; -0.59]; + 150 ms versus 0 ms conditions: t = -3.251, df = 14, p = 0.0059, 95% CI = [-3.65; -0.75]). However, the perceptual threshold for visuotactile asynchrony is around or above 200 ms (e.g., 211 ± 59.9 ms (mean ± SD) in Costantini et al., 2016; 302 ± 35 ms (mean ± SD) in Shimada et al., 2014; in our own synchrony detection task, participants did not detect the +/-150 ms asynchrony above chance level: 50% detection threshold = 179 +/- 47 ms (mean ± SD)). Thus, in the +/- 150 ms trials the participants are exposed to very subtle manipulations of asynchrony that most participants will not reliably perceive as asynchronous. Consequently, in our view it is implausible that participants form different expectations for different asynchrony trials that they do not even experience as different. Note further that the participants were never informed about how many levels of asynchrony were used in the current study or told anything about the temporal intervals. Indeed, informal interviews at the end of the experimental sessions suggest the participants never realized that seven levels of asynchrony were used (most participants guessed three or four). So, we think it is very unlikely that they could develop the kind of precise cognitive expectations that would be required in order to generate the hypothesized curves of behavioral responses using a cognitive reasoning strategy.

The concern with demand characteristics becomes even more implausible when the visual noise conditions are taken into consideration. The causal inference model predicts that increasing sensory uncertainty by increasing visual noise should lead to a specific widening of the psychometric curves so that greater asynchronies are tolerated in the illusion; and the data fit the causal inference model’s predictions well (and better than a fixed-criteria model). We think that it is very implausible that the naïve participants in our study could figure out this hidden computational hypothesis by themselves. In informal interviews after the experiments, some participants spontaneously reported that they thought the noisier visual information should degrade the rubber hand illusion, but most participants had no idea what the noise was supposed to do. Even experts in our field rarely correctly predicted the impact of the visual noise: this work has been presented at two conferences (Body Representation Network – 2021; ASSC24 – 2021) and during two invited seminars (LICÉA – Paris Nanterre, 2022; LPNC – Grenoble Alpes University, 2022); when academic colleagues were first introduced to the task at these meetings most of them expected the visual noise to work against the illusion, i.e., lead to “a weaker illusion”, which is the opposite of our results and does not capture the graded effect. Furthermore, it is critical to again note that participants never receive any feedback about their behavior; they are never told if they are right or wrong. Thus, with the trial-to-trial random variation in sensory noise and visuotactile asynchrony in the absence of feedback, it is impossible for the participants to learn to map their answers onto the specific predictions of the model.
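
To make the predicted widening concrete, the following is a minimal sketch of a causal inference observer, not the fitted model from the manuscript. It assumes that the measured asynchrony is Gaussian around the true asynchrony with standard deviation `sigma`, that the asynchrony is zero under a common cause, and that it is drawn from a zero-mean Gaussian with standard deviation `sigma_s` under separate causes. The names `p_same` and `sigma_s`, all parameter values, and the omission of a lapse rate are illustrative assumptions.

```python
import math

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def bci_criterion(sigma, p_same=0.8, sigma_s=300.0):
    """Largest |measured asynchrony| (ms) for which a common cause is
    at least as probable as separate causes. Values are hypothetical."""
    var_c1 = sigma ** 2                  # variance of x if C = 1 (s = 0)
    var_c2 = sigma ** 2 + sigma_s ** 2   # variance of x if C = 2
    log_prior_odds = math.log(p_same / (1.0 - p_same))
    k_squared = (2.0 * log_prior_odds + math.log(var_c2 / var_c1)) / (
        1.0 / var_c1 - 1.0 / var_c2
    )
    return math.sqrt(max(k_squared, 0.0))

def p_yes(s, sigma, p_same=0.8, sigma_s=300.0):
    """Probability of reporting the illusion at true asynchrony s (ms)."""
    k = bci_criterion(sigma, p_same, sigma_s)
    return phi((k - s) / sigma) - phi((-k - s) / sigma)

# Seven asynchrony steps under a low and a high (assumed) noise level.
for sigma in (100.0, 200.0):
    curve = [round(p_yes(s, sigma), 2) for s in (-500, -300, -150, 0, 150, 300, 500)]
    print(f"sigma = {sigma:.0f} ms: {curve}")
```

Under these assumptions the criterion grows with `sigma`, so larger asynchronies are tolerated at higher noise, which is the qualitative pattern described above.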

It is also relevant to point out here that the experimenter was always out-of-sight of the participant and that all visuotactile stimulation was produced by two robots. Thus, the participant could not pick up putative subtle social cues from the experimenter that could potentially serve as an implicit source of information about the performance or the hypotheses. Moreover, the experimenter was blind to the visual noise condition.

If we look a bit more closely at the psychophysics task itself, please note that it is based on a huge number of trials presented in a fully randomized order, and with relatively little time after each trial to give the classification response. This experimental design makes the use of a reasoning strategy difficult. It is improbable that the participants could keep track of the 12 randomized repetitions of the 42 conditions tested in the current experiment, which sums up to over 500 trials in a single experiment, as would be required in order to generate response patterns based on a cognitive reasoning strategy to “simulate” the causal inference model’s fit. In addition, such a cognitive strategy based on domain-general reasoning would put great demands on working memory, long-term memory, analytical reasoning skills, as well as knowledge about probabilistic principles of perception. In addition to thinking about each trial, they would simultaneously have to remember the approximate number of yes/no responses for each asynchrony level in their previous history of responses and how the frequencies of responses in each of the seven asynchrony conditions and three levels of noise should change according to Bayesian probabilistic principles of causal inference. We are not sure that this is even theoretically possible with the current paradigm, and recall that our participants were naive participants recruited from outside the department who had nothing to gain from even attempting to perform such an unpleasantly demanding cognitive task. It is much easier for the participants just to follow the instruction and base their response on each trial on the rubber hand illusion feeling.

In addition to the above arguments, also note that we analyze and model the data of each participant individually. Thus, even if we assume that there were a few participants who could figure out the specific hypotheses and who also had the motivation, determination, theoretical knowledge, and cognitive capacity to simulate behavior to meet our computational predictions against task instruction (a super version of “the good subject”), this cannot explain why we observed good model fits in the large majority of participants (pseudoR2 above 0.60 for 11 out of 15 participants). Importantly, when comparing our different models, we added to our AIC/BIC analysis a protected exceedance probability analysis (Rigoux et al., 2014). This type of analysis includes participant as a random factor, i.e., it takes into account the possibility that the goodness of fit of one model would be mostly due to a few “perfect” subjects while other participants followed another model or a random distribution, by computing the posterior probability that one model occurs more frequently than any other model in the set, above and beyond chance. The results of this type of analysis highlighted the relevance and dominance of our main model across our whole sample. In addition to these specific results, most participants in experimental behavioral studies like the current one simply want to follow task instructions and “the good subject” seems to be relatively rare: in reviewing the classic literature on demand characteristics in social psychology experiments, Weber and Cook (1972) concluded that evidence for demand characteristics in experimental psychological studies was weak and ambiguous in most cases and that convincing evidence for instances where “the good subject” explained the results was lacking.
The current analysis approach based on individual participants’ perception also means that our results cannot stem from weak demand characteristics or cognitive reasoning effects occurring in individual subjects that are then aggregated into a “false” group-level effect.
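
For readers less familiar with the information criteria mentioned above, the standard AIC and BIC definitions can be sketched as follows. The log-likelihoods, parameter counts, and trial count in the usage lines are hypothetical placeholders chosen only for illustration; they are not values from our fits.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

# Hypothetical per-participant fits of two models to the same trials.
n_trials = 252                          # assumed number of responses
fit_a = {"log_lik": -100.0, "k": 5}     # placeholder values
fit_b = {"log_lik": -112.0, "k": 5}     # placeholder values

delta_aic = aic(fit_b["log_lik"], fit_b["k"]) - aic(fit_a["log_lik"], fit_a["k"])
delta_bic = (bic(fit_b["log_lik"], fit_b["k"], n_trials)
             - bic(fit_a["log_lik"], fit_a["k"], n_trials))
print(delta_aic, delta_bic)  # positive differences favor model A
```

Because both criteria are computed per participant, a summary such as the protected exceedance probability is still needed to judge how consistently one model wins across the sample.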

So, in sum, we do not see how expectations, demand characteristics, and high-level domain-general reasoning can explain the current study’s main results or constitute a plausible alternative interpretation for the conclusion that body ownership in the rubber hand illusion is governed by Bayesian causal inference of sensory evidence. Although cognitive bias might lead to “global” changes in decision criteria, as we will discuss further below, such possible effects cannot, in our view, explain the specific shapes of the psychometric curves related to the subtle asynchrony manipulation or the changes in these curves occurring when sensory uncertainty is manipulated.

The fact that reasoning at a “high cognitive level” (e.g., medical and judicial decisions) can be described as a near-Bayesian general inferential process does not exclude the existence of specific perceptual modules. Similar probabilistic computational decision principles may govern cognition and perception (e.g., Shams and Beierholm, 2022). However, this does not mean that if a perceptual decision task follows Bayesian probabilistic principles, it must be based on high-level cognition. The idea that automatic perceptual decisions related to the multisensory binding problem are implemented in the brain’s perceptual systems and operate according to Bayesian causal inference principles is rather well established in the literature on multisensory perception (Aller and Noppeney, 2019; Körding et al., 2007; Rohe et al., 2019; Rohe and Noppeney, 2016, 2015; Shams and Beierholm, 2010). In the current study, we extend this principle that is relevant for illusory and non-illusory audio-visual perceptual effects to the case of a multisensory bodily illusion.

Lush’s studies

The reviewer states that “if asynchrony is manipulated we know people view this as something that is meant to change their feeling of ownership” and cites Lush 2020 (Collabra). We disagree with this statement because we think it is too strong and too generalizing, and it is not clear how Lush’s findings relate to the current study’s results since Lush 2020 did not test the rubber hand illusion. In fact, we still know quite little about how conceptual knowledge about the rubber hand illusion influences subjective ratings of the illusion in actual experiments when naïve participants are instructed to report on their illusion feeling.

Lush (2020) has several limitations (Slater and Ehrsson, 2022), which we need to point out because they are relevant for the current discussion. First, Lush 2020 never tested the participants on the actual rubber hand illusion, so we do not know how the participants’ knowledge and expectations might have influenced their subjective ratings of illusion strength in an actual rubber hand illusion experiment. Second, the information and instructions provided to the participants in Lush 2020 are different from a typical rubber hand illusion experiment. In Lush 2020 the participants, who were psychology undergraduates, studied extensive written and video material about the rubber hand procedures (on their own laptops), including detailed information about the synchronous and asynchronous stimulation conditions, videos of the hidden real hand receiving the different types of tactile stimulation, and explicit information that the purpose of the rubber hand illusion was to “generate changes in experience”. The participants' task was to try to guess what people experience in the rubber hand illusion when shown the questionnaire statements that are used to quantify the illusion. In typical rubber hand illusion experiments, naïve participants are given minimal information about the procedures, and they are just instructed to report what they experience. Thus, these differences in task instruction (metacognitive evaluation versus rating a perceptual sensation) and the differences in the information about the rubber hand illusion may have created demand characteristics and expectations that are specific to Lush’s study and not representative of rubber hand illusion experiments in general. In other words, regardless of what the psychology undergraduates may or may not have thought about how the illusion works, this may not have substantially influenced their ratings of the illusion in a real rubber hand illusion experiment.
Third, when the participants in Lush 2020 are given a rubber hand illusion questionnaire and asked to fill it out according to their expectancies, they report affirmative mean illusion ratings in both the synchronous and the asynchronous conditions. Notably, this is qualitatively different from real rubber hand illusion experiments, where the illusion is typically clearly rejected in the asynchronous condition (negative scores in the order of mean -1 to -2; e.g., Kalckert and Ehrsson, 2014; Reader et al., 2021), suggesting that the participants’ guesses about the RHI were vague when it came to the specific effect of the asynchrony manipulation.

Interestingly, in Lush et al. (2020), the participants' expectations about what they would experience in the synchronous and asynchronous conditions were registered before the rubber hand illusion (together with their trait suggestibility), and then the rubber hand illusion was tested and quantified with questionnaires (and proprioceptive drift). Importantly, these expectations only had a very small effect on the questionnaire ratings from the actual rubber hand illusion experiment, only affecting the synchronous condition’s ratings a little and apparently not at all affecting the asynchronous condition (Slater and Ehrsson, 2022). Critically, the contribution was so small that it could effectively be ignored compared to the contribution of the visuotactile synchrony-asynchrony manipulation and trait suggestibility (Slater and Ehrsson, 2022). Furthermore, the contribution of visuotactile synchrony-asynchrony was two-to-three times more important than expectations and trait suggestibility combined. Thus, potential expectancies are unlikely to explain the current study’s main behavioural results. In addition to this, Lush et al., 2020, as well as recent reanalyses of the same dataset (Ehrsson et al., 2022; Slater and Ehrsson, 2022), clearly show that the differences in illusion ratings between the synchronous and asynchronous conditions are unrelated to trait suggestibility. Thus, possible differences in trait suggestibility cannot explain the current study's main findings since these are based on the asynchrony manipulation and differences between conditions, and such difference measures do not correlate with trait suggestibility (Lush et al., 2020; Ehrsson et al., 2022; Slater and Ehrsson, 2022). We think it was relevant to clarify this point here because suggestibility is a key concept in Lush’s articles.

To us, Lush and colleagues’ work is interesting because it informs us about individual differences in how trait suggestibility and cognitive expectancies may influence subjective illusion reports. There is no incompatibility between the argument that the rubber hand illusion is a bodily illusion driven to a large extent by multisensory correlations and the argument that subjective rubber hand illusion reports can be modulated by top-down cognitive processes (Slater and Ehrsson, 2022) and show individual differences that are modulated by cognition. In the current study, we focus on the former multisensory perceptual aspects of the illusion at the level of individual subjects. We did not test Lush and colleagues’ hypotheses about demand characteristics, although the current computational modeling and psychophysics approach could be used for this purpose in future studies.

Cognitive bias

To be clear, we are not arguing that participants’ cognitive expectations cannot influence rubber hand illusion reports at all, or that the current dataset is completely free from cognitive biases, or that there could be no changes in decision criteria between some of the conditions in some of the participants. Our main argument is that such effects can probably not explain our main computational modeling results. It is possible that some participants adopt a more conservative decision criterion than other individuals in the detection task, while others use a more liberal criterion. Such differences in decision criteria could stem from many different postperceptual processes, including individual differences in trait suggestibility, but also from differences in perceptual bias that can capture key perceptual aspects of perceptual illusions (Morgan et al., 1990). The effect of biases (cognitive or perceptual) could be accounted for by changes in the prior in our models. As such, a cognitive bias effect related to demand characteristics and expectations would most likely manifest itself as a global influence on all or some of the conditions (Lush et al., 2020; Slater and Ehrsson, 2022), and would not explain the specific patterns of trial-to-trial results we observed that fit with the causal inference model’s predictions.

Furthermore, a hypothetical change in decision criterion related to perceived synchrony (e.g., 0 ms and +/- 150 ms trials) or asynchrony (e.g., +/- 300 ms and +/- 500 ms) would not lead to the specific changes predicted by the causal inference model. Actually, such possible effects would correspond to a “fixed criteria” strategy rather than a Bayesian causal inference one. But the current study argues against such a fixed criteria strategy in our data because we formally compared the fit of the Bayesian causal inference model to our Fixed-Criterion (FC) model, and the causal inference model outperformed the FC model. In addition, and as already said, we have used a randomized trial design where all asynchrony conditions and the noise conditions are presented in random trial order. Such a design is considered to reduce the risk of adaptation and changes in decision criteria compared to designs where different conditions are presented in separate runs, and this was something we took into account when designing the study.
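
The contrast with a fixed-criterion observer can be illustrated with a short sketch. It assumes, for illustration only, that the measured asynchrony is Gaussian around the true asynchrony with standard deviation `sigma` and that the observer reports the illusion whenever the absolute measured asynchrony falls below a constant `k`. The value of `k`, the noise levels, and the absence of a lapse rate are hypothetical choices, not the fitted FC model.

```python
import math

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p_yes_fixed(s, sigma, k=250.0):
    """Fixed-criterion observer: the criterion k (ms) ignores sigma."""
    return phi((k - s) / sigma) - phi((-k - s) / sigma)

# With k held constant, more noise only blurs the curve: fewer "yes"
# responses at synchrony and more at large asynchronies.
for sigma in (100.0, 200.0):
    print(f"sigma = {sigma:.0f} ms:",
          round(p_yes_fixed(0, sigma), 3),
          round(p_yes_fixed(500, sigma), 3))
```

In this sketch, higher noise lowers the "yes" rate for synchronous trials because the criterion does not move, whereas a causal inference observer relaxes its criterion as uncertainty increases. This difference in predicted curve shape is what the formal model comparison exploits.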

In conclusion, and after considering all the reviewer’s concerns, we still think that our conclusion that the rubber hand illusion is governed by Bayesian causal inference based on sensory evidence and sensory uncertainty is well-supported by the results. This conclusion is in line with our hypothesis and theoretical framework of the rubber hand illusion as a multisensory bodily illusion, as well as the broader empirical and theoretical literature on causal inference in multisensory perception. We think that demand characteristics and a domain-general cognitive strategy, as suggested by reviewer 1, are an unlikely explanation for the current study’s main modeling findings, but we have made changes in the revised version of the manuscript to explicitly discuss this issue.

Specific comments:

(i) The model can be described as Bayesian, but how different is it from a signal detection theory model, with adjustable vs fixed criteria, and a criteria offset for the RHI and asynchrony judgment task? In other words, rather than the generative model being an explicit model for the system, instead different levels of asynchrony simply produce certain levels of evidence for a time difference, as in an SDT model. Then set up criteria to judge whether there was or was not a time difference. Adjust criteria according to prior probabilities, as is often done in SDT. That's it. Is this just a verbal rephrasing?

The reviewer points toward an interesting discussion on the differences and potential overlap between two different computational frameworks and correctly points out that SDT and BCI sometimes lead to the same predictions at the behavioral level. However, we would argue that the scope of these two frameworks diverges and that the BCI framework is more relevant for the current study’s aims.

The SDT allows the estimation of decisional thresholds given one or several sensory inputs. From this standpoint, SDT can describe the thresholds used by the participants but does not explain how they are established. Under this approach, one could speculate that thresholds are quantitatively adjusted in two different tasks (e.g., under two different priors), which would require complexifying the initial SDT assumptions. However, it would be unrealistic to assume that the participants learn to adjust their SDT threshold from trial-to-trial, taking into account the level of sensory uncertainty.

On the contrary, the BCI framework is designed to be a more comprehensive, process-based approach: this framework explains how the perceptual thresholds are computed. This is not just “verbal rephrasing” but a fundamentally different approach to examine hypotheses about underlying computational principles. As a result, our BCI model efficiently captures the observed behavioral effect of visual noise on body ownership perception, including the variations in rubber hand illusion detection on a trial-to-trial basis.

Both approaches could be used in future psychophysics studies on bodily illusions, but with different purposes. For example, we are currently working on an SDT rubber hand illusion study based on a 2-AFC hand ownership task that we have developed in our lab (Chancel et al., 2021; Chancel and Ehrsson, 2020). Among this study's many interesting observations, relevant to mention here is that hand ownership sensitivity (d’) is significantly above zero for stimulation asynchronies of 50 ms, 100 ms and 200 ms, i.e., small asynchronies in a similar range as used in the current experiment. Note also the significant ownership sensitivity for delays of 50 ms, which are too brief to be consciously perceived and therefore should not produce cognitive expectancies (Lanfranco et al., 2022 in preparation). Since hand ownership sensitivity (d’) is a bias-free estimate of the participants’ rubber hand illusion discrimination behavior, this finding is in line with the current investigation’s Bayesian modeling results and interpretation.

One thing on the framing of the model: Surely a constant offset over several taps is just as good evidence for a common cause no matter whether the offset is 0 or something else? But if the prior probability is specifically that common cause is the same hand is involved (which requires an offset close to 0), surely that prior probability is essentially zero? So how should the assumption of C1 be properly framed?

The reviewer is correct that the temporal correlation between the visual and tactile stimulation in our experiment could promote multisensory integration due to the constant offset over several taps regardless of the visuotactile asynchrony (Parise and Ernst, 2016). Yet, the visuotactile delay also matters for multisensory perception (as discussed in Parise and Ernst, 2016), and that is the parameter that we manipulated in the current study keeping all other factors constant (except for the noise manipulation). Thus, both the correlations and the asynchrony matter, and the greater the asynchrony, the less information in favor of the rubber hand as being one’s own despite the visuotactile correlation.

Importantly, the inference of a common cause in models such as the one we are using takes into account the discrepancy between the sensory signals; thus, the magnitude of the offset matters. This is especially so because, in our model, a common cause for vision and touch (C = 1) means that the same hand is the source of both the visual and tactile inputs. And indeed, this requires the offset to be close to 0; that is why we assume that the asynchrony s is always zero when C = 1 (see appendix 1 for a detailed presentation, page 36 lines 1056 to 1059). A similar phrasing of the different causal scenarios was already proposed by Samad et al., (2015). In the future, more complex models could be developed that incorporate both temporal correlations and visuotactile delays, but this is beyond the aim of the current study.
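
The role of the offset magnitude can be sketched as follows. This is only an illustration under stated assumptions, not a reproduction of the fitted model: we assume a noisy measurement x of the asynchrony with Gaussian noise of standard deviation sigma, s = 0 when C = 1 (as stated above), and, when C = 2, an asynchrony drawn from a zero-mean Gaussian with standard deviation sigma_s. All parameter values, and the function names, are hypothetical.

```python
import math


def normal_pdf(x, sd):
    """Density of a zero-mean Gaussian with standard deviation sd at x."""
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def posterior_common_cause(x, sigma, sigma_s, p_same):
    """Posterior probability of a common cause (C = 1) given measurement x.

    x       : measured visuotactile asynchrony (ms)
    sigma   : sensory noise SD (ms); assumed Gaussian
    sigma_s : SD of asynchronies under C = 2 (ms); assumed Gaussian
    p_same  : prior probability of a common cause
    """
    like_c1 = normal_pdf(x, sigma)  # C = 1: s = 0, so x ~ N(0, sigma^2)
    like_c2 = normal_pdf(x, math.sqrt(sigma ** 2 + sigma_s ** 2))  # C = 2
    numerator = p_same * like_c1
    return numerator / (numerator + (1.0 - p_same) * like_c2)


# Hypothetical parameters: the posterior falls as the measured offset grows.
for x_ms in (0.0, 150.0, 300.0, 500.0):
    p = posterior_common_cause(x_ms, sigma=100.0, sigma_s=300.0, p_same=0.5)
    print("x = %5.0f ms -> p(C = 1 | x) = %.3f" % (x_ms, p))
```

Under these assumptions, a larger measured offset gives a lower posterior probability of a common cause, and a larger sensory noise makes a given offset less informative against a common cause.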

(ii) lines 896-897 "…temporally correlated visuotactile signals are a key driving factor behind the emergence of the rubber hand illusion Chancel and Ehrsson, 2020, …"

Cite findings that need some twisting to fit in with this view, e.g. the finding that imagining a hand is one's own creates SCRs to its being cut; and perhaps more easy to deal with but I think rather telling, no visual input of a hand is needed (Guterstam et al., 2013) and laser lights instead of brushes work just about as well (Durgin et al., 2013) as does stroking the air (Guterstam et al., 2016), making magnetic sensations akin to the magnetic hands suggestion in hypnosis. It seems the simplest explanation is that participants respond to what they perceive as what is meant to be relevant in the study manipulations. Suitable responses are constructed, based on genuine experience or otherwise, in accordance with the needs of the context. The very way the question is framed determines the answer according to principles of general inference (e.g. Lush and Seth, 2022, Nat Coms; Corneille and Lush, https://psyarxiv.com/jqyvx/).

The statement that “temporally correlated visuotactile signals are an important factor driving the emergence of the rubber hand illusion” is supported by a very large previous literature, as well as the current study’s findings. Note that we are not saying this is the only factor that drives the illusion (e.g., spatial congruence factors also contribute), only that it is an important factor.

We cited Chancel and Ehrsson (2020) because it is a highly relevant rubber hand illusion study given the similarities in paradigm and the manipulation of fine-grained temporal asynchronies of visual and tactile signals. Chancel and Ehrsson (2020) used seven levels of visuotactile asynchrony shorter than or equal to 200 ms, which means that the different visuotactile asynchronies were not clearly perceived by the participants, and yet the collected data showed a significant relationship between the degree of visuotactile asynchrony and the rubber hand illusion as quantified in a discrimination task. Moreover, the illusory hand ownership discriminations were influenced by a manipulation of the distance between the participants’ hand and the rubber hand (5 cm change in lateral axis) in line with the spatial congruence principle of multisensory integration, even though the participants did not notice this distance manipulation (which occurred between runs and out-of-sight of the participants, and was confirmed in post-experimental interviews) and were thus very unlikely to form high-level cognitive expectations about the specific hypothesis related to this orthogonal and small spatial manipulation. Our statement that temporally correlated visuotactile signals are a key driving factor behind the rubber hand illusion comes from these observations, but as we said, also from a very large previous literature (e.g., Blanke et al., 2015; Botvinick and Cohen, 1998; Ehrsson, 2020; Ehrsson et al., 2004; Kilteni et al., 2015; Slater and Ehrsson, 2022; Tsakiris, 2010; Tsakiris and Haggard, 2005).

The statement under discussion is also supported by our own data. To illustrate this further, we ran a new post hoc analysis. A mixed-effect logistic regression with participant as a random effect confirms a significant effect of asynchrony (p < 2e-16), as expected (see Author response image 1). A clear effect of asynchrony is seen in every participant, which is in line with our claim that this is a key factor driving the emergence of the rubber hand illusion.

Author response image 1
Mixed-effect logistic regression with participant as random effect.

Dots represent individual responses, the curves are the regression fits, and the shaded areas are the 95% confidence intervals.

We could very well have cited Guterstam et al., (2013) here because it is a well-controlled study with many experiments that support the conclusion that temporally correlated visuotactile signals play a critical role in bodily illusions such as the rubber hand illusion. In this study, Arvid Guterstam presented evidence from 10 separate experiments conducted on different groups of naive participants; 234 subjects in total (one explorative pilot experiment and nine experiments that would nowadays be called hypothesis testing experiments). Nine of the experiments included synchronous and asynchronous conditions, and all findings support an important role for synchronous visuotactile correlations in driving this version of the rubber hand illusion. Several additional control conditions were included in this study, such as various spatial manipulations related to the spatial rule of multisensory integration and a control condition involving a block of wood that eliminates the illusion due to shape incongruence and top-down factors. The outcome measures were questionnaire ratings, increases in SCR triggered by physical threats towards the illusory hand, changes in hand position sense towards the illusory hand (proprioceptive drift), as well as one functional magnetic resonance imaging experiment. Critically the synchronous illusion condition leads to greater illusion questionnaire ratings, greater threat-evoked SCR, greater proprioceptive drift, and greater BOLD signals in key areas related to multisensory integration of bodily signals (such as the posterior parietal cortex, premotor cortex, and cerebellum) than the asynchronous condition and the other control conditions. Thus, the findings from Guterstam et al., (2013) support the sentence under discussion above and are in line with the current study’s findings and conclusions.

Guterstam et al., (2016) examine a somewhat different perceptual phenomenon of a perceived causal connection (similar to a magnetic force field) between visual and tactile events close to one’s own hand in peripersonal space. Again, this study is well controlled and contains many experiments, controls and measures, and the study finds that synchronous visuotactile stimulation and a limited spatial extent of peripersonal space around the hand are critical factors for this visuotactile illusion effect to arise. This study’s findings are in line with the rubber hand illusion literature and the literature on multisensory integration in peripersonal space. The “similarity” to “magnetic hands suggestions” that the reviewer mentions makes no sense to us. Such hypnotic suggestions do not obey temporal and spatial rules of multisensory integration and do not depend on correlated visuotactile signals. Also, the procedures and instructions are very different.

We cannot say much about the Durgin et al., 2007 study because we have not tried to replicate it. However, in an ongoing study, we have used a control condition where participants just look at a rubber hand while we shine a laser light on it without any tactile or other somatosensory stimulation delivered to the hidden real hand. We find low questionnaire ratings in this condition, significantly lower than typically observed in rubber hand illusion conditions with synchronous visuotactile stimulation and more similar to the control condition in which participants just look at a rubber hand without any stimulation. Synchronous visuotactile stimulation leads to a significantly stronger rubber hand illusion than this latter condition (e.g., Guterstam et al., 2019). We think more studies are needed before we can draw any strong conclusions from Durgin et al., (2007).

It is well known in psychological research that SCR is an unspecific measure, and that different perceptual, emotional, and cognitive processes can influence SCR. Thus, just as in psychophysiological research in general, it is very important to have adequate control conditions and adopt a hypothesis-driven approach when using the SCR in bodily illusion research. SCR responses triggered by physical threats are ecologically valid because they probe basic emotional defense reactions triggered by bodily threats (Ehrsson et al., 2007; Graziano and Cooke, 2006). In our bodily illusion studies with threat-evoked SCR, we always use control conditions (e.g., Gentile et al., 2013; Guterstam et al., 2015) and only focus on illusion-condition specific increases in SCR triggered by the threat-stimulus compared to when identical threats are presented in the asynchronous and other control conditions; in addition, we sometimes use control stimuli like non-threatening objects (e.g., Guterstam et al., 2015; Petkova and Ehrsson, 2008). Bear in mind also that most naïve participants do not understand how SCR works. Thus, in our view it is very unlikely that naïve participants can voluntarily control their evoked SCR responses in a condition-specific manner to simulate physiological responses in experimental designs like the ones described above. Also note that physical threats toward the rubber hand elicit specific fMRI responses in areas related to pain anticipation and fear, such as the anterior insular cortex (Ehrsson et al., 2007; Gentile et al., 2013) and amygdala (Guterstam et al., 2015), and the stronger the illusion in the synchronous condition compared to the asynchronous (and other controls), the stronger these threat-evoked BOLD responses. This suggests that the threat-evoked SCR reflects centrally mediated emotional defense reactions that are triggered by perceived threats towards one’s own body.

Note that in the recent study by Lush and colleagues (Lush et al., 2021) no SCR recordings were conducted, so the claims about possible links between cognitive expectancies and changes in SCR remain speculative. This study also suffers from the same limitations as the questionnaire study discussed above (Lush 2020) in that psychology students were asked to guess which of the two conditions – the synchronous conditions or the asynchronous conditions – they thought should produce the strongest SCR after reading and learning about the rubber hand illusion. But as said, SCR was not registered, and the rubber hand illusion was not tested. Thus, it is unclear how metacognitive guesses about the rubber hand illusion relate to condition-specific differences in threat-evoked SCR during an actual rubber hand illusion experiment in genuinely naïve participants. To the best of our knowledge, the potential influence of cognitive expectations on SCR has not yet been tested in a controlled bodily illusion experiment.

The reviewer brought up the topic of indirect measures of the rubber hand illusion here, so we would like to make a couple of further clarifications. In rubber hand illusion studies, it is common to supplement the results from questionnaires and rating scales with more objective tests such as proprioceptive drift, the cross-modal congruence task, threat-evoked SCR, fMRI, and electrophysiology (EEG/ECoG) (see Slater and Ehrsson 2022 for a recent review). The results of these studies support the hypothesis that the rubber hand illusion is a multisensory body illusion and underscore the importance of visual and somatosensory signals. Take fMRI, for example. Numerous studies have found differences between the synchronous and asynchronous conditions in specific premotor and posterior parietal areas that are known to be involved in the integration of visual, tactile, and proprioceptive bodily signals (Brozzoli et al., 2012; Ehrsson et al., 2004; Gentile et al., 2013; Limanowski and Blankenburg, 2016). Moreover, the stronger the illusion-condition-specific differences in BOLD signals in these areas, the stronger the illusion as rated in questionnaires (illusion ratings synchronous minus asynchronous).

Extremely relevant to the present study, in a recent imaging study, which is currently under revision in a leading neuroscience journal, we used fMRI to scan 30 participants as they performed the current rubber hand illusion detection task (based on the same stepwise asynchrony manipulation) and fitted the responses to the current BCI model (Chancel et al., see attached abstract for a poster at the OHBM conference in Glasgow – June 2022). As expected, BOLD activity in the premotor and posterior parietal cortices was related to illusion detection at the level of individual participants and trials, and activity in the posterior parietal cortex reflected the Bayesian causal inference model’s predicted probability of illusion emergence based on each participant’s behavioral response-profile. These findings corroborate the current behavioral study’s findings and conclusions and suggest that the rubber hand illusion detection decisions involve activity in multisensory areas related to integration versus segregation of bodily multisensory information.

That mental imagery of one’s hand being cut may modulate SCR is unsurprising. Mental imagery can influence emotional processes and lead to changes in SCR recordings. Without knowing which study you refer to, and what control conditions were used, it is difficult for us to say much more. For example, Hägni et al., (2008) did not elicit a bodily illusion but compared passive viewing of arms playing a ball game with a mental imagery condition, so the changes in SCR reported in this study could relate to different cognitive factors.

Last but not least, we disagree with the reviewer’s statement that “the simplest explanation (for the rubber hand illusion, our addition) is that participants respond to what they perceive as what is meant to be relevant in the study manipulation”. Given all the arguments we presented above and in our first response, we fail to see how this “simple” interpretation can explain our key findings. Another simple explanation accounts for them well: participants simply report what they feel when they experience the illusion.

References:

Aller M, Noppeney U. 2019. To integrate or not to integrate: Temporal dynamics of hierarchical Bayesian causal inference. PLOS Biology 17:e3000210. doi:10.1371/journal.pbio.3000210

Blanke O, Slater M, Serino A. 2015. Behavioral, Neural, and Computational Principles of Bodily Self-Consciousness. Neuron 88:145–166. doi:10.1016/j.neuron.2015.09.029

Botvinick M, Cohen J. 1998. Rubber hands “feel” touch that eyes see. Nature 391:756. doi:10.1038/35784

Brozzoli C, Gentile G, Ehrsson HH. 2012. That’s near my hand! Parietal and premotor coding of hand-centered space contributes to localization and self-attribution of the hand. J Neurosci 32:14573–14582. doi:10.1523/JNEUROSCI.2660-12.2012

Chancel M, Ehrsson HH. 2020. Which hand is mine? Discriminating body ownership perception in a two-alternative forced-choice task. Atten Percept Psychophys. doi:10.3758/s13414-020-02107-x

Chancel M, Hasenack B, Ehrsson HH. 2021. Integration of predictions and afferent signals in body ownership. Cognition 212:104722. doi:10.1016/j.cognition.2021.104722

Costantini M, Robinson J, Migliorati D, Donno B, Ferri F, Northoff G. 2016. Temporal limits on rubber hand illusion reflect individuals’ temporal resolution in multisensory perception. Cognition 157:39–48. doi:10.1016/j.cognition.2016.08.010

Durgin FH, Evans L, Dunphy N, Klostermann S, Simmons K. 2007. Rubber Hands Feel the Touch of Light. Psychological Science 18:152–157. doi:10.1111/j.1467-9280.2007.01865.x

Ehrsson HH. 2020. Multisensory processes in body ownership. In: Multisensory Perception. Elsevier. pp. 179–200. doi:10.1016/B978-0-12-812492-5.00008-5

Ehrsson HH, Spence C, Passingham RE. 2004. That’s My Hand! Activity in Premotor Cortex Reflects Feeling of Ownership of a Limb. Science 305:875–877. doi:10.1126/science.1097011

Ehrsson HH, Wiech K, Weiskopf N, Dolan RJ, Passingham RE. 2007. Threatening a rubber hand that you feel is yours elicits a cortical anxiety response. Proc Natl Acad Sci USA 104:9828–9833. doi:10.1073/pnas.0610011104

Gentile G, Guterstam A, Brozzoli C, Ehrsson HH. 2013. Disintegration of multisensory signals from the real hand reduces default limb self-attribution: an fMRI study. J Neurosci 33:13350–13366. doi:10.1523/JNEUROSCI.1363-13.2013

Graziano MSA, Cooke DF. 2006. Parieto-frontal interactions, personal space, and defensive behavior. Neuropsychologia 44:845–859. doi:10.1016/j.neuropsychologia.2005.09.009

Guterstam A, Björnsdotter M, Gentile G, Ehrsson HH. 2015. Posterior cingulate cortex integrates the senses of self-location and body ownership. Curr Biol 25:1416–1425. doi:10.1016/j.cub.2015.03.059

Guterstam A, Gentile G, Ehrsson HH. 2013. The invisible hand illusion: multisensory integration leads to the embodiment of a discrete volume of empty space. J Cogn Neurosci 25:1078–1099. doi:10.1162/jocn_a_00393

Guterstam A, Larsson DEO, Zeberg H, Ehrsson HH. 2019. Multisensory correlations—Not tactile expectations—Determine the sense of body ownership. PLOS ONE 14:e0213265. doi:10.1371/journal.pone.0213265

Guterstam A, Zeberg H, Özçiftci VM, Ehrsson HH. 2016. The magnetic touch illusion: A perceptual correlate of visuo-tactile integration in peripersonal space. Cognition 155:44–56. doi:10.1016/j.cognition.2016.06.004

Hägni K, Eng K, Hepp-Reymond M-C, Holper L, Keisker B, Siekierka E, Kiper DC. 2008. Observing Virtual Arms that You Imagine Are Yours Increases the Galvanic Skin Response to an Unexpected Threat. PLOS ONE 3:e3082. doi:10.1371/journal.pone.0003082

Kalckert A, Ehrsson HH. 2014. The moving rubber hand illusion revisited: comparing movements and visuotactile stimulation to induce illusory ownership. Conscious Cogn 26:117–132. doi:10.1016/j.concog.2014.02.003

Kilteni K, Maselli A, Kording KP, Slater M. 2015. Over my fake body: body ownership illusions for studying the multisensory basis of own-body perception. Front Hum Neurosci 9:141. doi:10.3389/fnhum.2015.00141

Körding KP, Beierholm U, Ma WJ, Quartz S, Tenenbaum JB, Shams L. 2007. Causal inference in multisensory perception. PLoS ONE 2:e943. doi:10.1371/journal.pone.0000943

Limanowski J, Blankenburg F. 2016. Integration of Visual and Proprioceptive Limb Position Information in Human Posterior Parietal, Premotor, and Extrastriate Cortex. J Neurosci 36:2582–2589. doi:10.1523/JNEUROSCI.3987-15.2016

Lush P. 2020. Demand Characteristics Confound the Rubber Hand Illusion. Collabra: Psychology 6:22. doi:10.1525/collabra.325

Lush P, Botan V, Scott RB, Seth AK, Ward J, Dienes Z. 2020. Trait phenomenological control predicts experience of mirror synaesthesia and the rubber hand illusion. Nat Commun 11:4853. doi:10.1038/s41467-020-18591-6

Lush P, Seth AK. 2022. Reply to: No specific relationship between hypnotic suggestibility and the rubber hand illusion. Nat Commun 13:563. doi:10.1038/s41467-022-28178-y

Lush P, Seth AK, Dienes Z. 2021. Hypothesis awareness confounds asynchronous control conditions in indirect measures of the rubber hand illusion. Royal Society Open Science 8:210911. doi:10.1098/rsos.210911

Morgan MJ, Hole GJ, Glennerster A. 1990. Biases and sensitivities in geometrical illusions. Vision Research, Optics Physiology and Vision 30:1793–1810. doi:10.1016/0042-6989(90)90160-M

Parise CV, Ernst MO. 2016. Correlation detection as a general mechanism for multisensory integration. Nat Commun 7:11543. doi:10.1038/ncomms11543

Petkova VI, Ehrsson HH. 2008. If I Were You: Perceptual Illusion of Body Swapping. PLOS ONE 3:e3832. doi:10.1371/journal.pone.0003832

Reader AT, Trifonova VS, Ehrsson HH. 2021. The Relationship Between Referral of Touch and the Feeling of Ownership in the Rubber Hand Illusion. Front Psychol 12:629590. doi:10.3389/fpsyg.2021.629590

Rigoux L, Stephan KE, Friston KJ, Daunizeau J. 2014. Bayesian model selection for group studies — Revisited. NeuroImage 84:971–985. doi:10.1016/j.neuroimage.2013.08.065

Rohe T, Ehlis A-C, Noppeney U. 2019. The neural dynamics of hierarchical Bayesian causal inference in multisensory perception. Nat Commun 10:1–17. doi:10.1038/s41467-019-09664-2

Rohe T, Noppeney U. 2016. Distinct Computational Principles Govern Multisensory Integration in Primary Sensory and Association Cortices. Curr Biol 26:509–514. doi:10.1016/j.cub.2015.12.056

Rohe T, Noppeney U. 2015. Cortical Hierarchies Perform Bayesian Causal Inference in Multisensory Perception. PLOS Biology 13:e1002073. doi:10.1371/journal.pbio.1002073

Samad M, Chung AJ, Shams L. 2015. Perception of body ownership is driven by Bayesian sensory inference. PLoS ONE 10:e0117178. doi:10.1371/journal.pone.0117178

Shams L, Beierholm U. 2022. Bayesian causal inference: A unifying neuroscience theory. Neurosci Biobehav Rev 137:104619. doi:10.1016/j.neubiorev.2022.104619

Shams L, Beierholm UR. 2010. Causal inference in perception. Trends Cogn Sci 14:425–432. doi:10.1016/j.tics.2010.07.001

Shimada S, Suzuki T, Yoda N, Hayashi T. 2014. Relationship between sensitivity to visuotactile temporal discrepancy and the rubber hand illusion. Neuroscience Research 85:33–38. doi:10.1016/j.neures.2014.04.009

Slater M, Ehrsson HH. 2022. Multisensory Integration Dominates Hypnotisability and Expectations in the Rubber Hand Illusion. Frontiers in Human Neuroscience 16.

Tsakiris M. 2010. My body in the brain: a neurocognitive model of body-ownership. Neuropsychologia 48:703–712. doi:10.1016/j.neuropsychologia.2009.09.034

Tsakiris M, Haggard P. 2005. The Rubber Hand Illusion Revisited: Visuotactile Integration and Self-Attribution. Journal of Experimental Psychology: Human Perception and Performance 31:80–91. doi:10.1037/0096-1523.31.1.80

Weber SJ, Cook TD. 1972. Subject effects in laboratory research: An examination of subject roles, demand characteristics, and valid inference. Psychological Bulletin 77:273–295. doi:10.1037/h0032351

(iii) Provide means and SEs for conditions for synchrony judgment tasks.

A table has been added to Figure3_Supplement3 (former Figure5_Supplement3).

(iv) Discuss the alternative views of how the study could be interpreted as I have indicated above.

The revised manuscript includes a discussion of the alternative views suggested by the reviewer, and we make the key arguments leading to our main conclusions more explicit for the reader. We think this study’s main results are well protected against demand characteristics and suggestibility and that it is implausible that participants can solve the current rubber hand illusion task using a domain-general reasoning strategy. We agree with the reviewer that it is best to discuss these issues in the text.

The reviewer is correct that we conducted our study to test a particular hypothesis and computational model about the rubber hand illusion and that our study was not designed to try to distinguish between different theories. We are happy to stay more focused on the conclusions that directly relate to our modelling results and hypotheses and avoid unnecessary broader theoretical speculations. So, we have removed the original sentence we had where we cited Lush et al., 2020, which was uninformative and vague.

Note that the current manuscript is already very long, and the other reviewers have also asked us to include more technical details (eLife does not allow supplementary material with supplementary discussion). We also do not want to replicate previous more general discussions about demand characteristics and suggestibility in the rubber hand illusion, which have already been well covered in other publications (e.g., Lush and Seth, 2022; Ehrsson et al., 2022; Slater and Ehrsson, 2022). Thus, we are focusing the new discussion sentences on how the issues raised by the reviewer relate to the current study’s experimental design, main modelling results, and key conclusions.

Old version:

“Such successful modeling of the multisensory processing driving body ownership is especially relevant as several alternative models of body ownership emphasize interoception (Azzalini et al., 2019, Park and Blanke, 2019), motor processes (Burin et al., 2015, 2017), hierarchical cognitive and perceptual models (Tsakiris et al., 2010), or high-level cognition and expectations (Lush et al., 2020). Therefore, quantitative computational studies like the present one are needed to formally compare these different theories of body ownership and advance the corresponding theoretical framework.”

New version:

“Such successful modeling of the multisensory information processing in body ownership is relevant for future computational work into bodily illusions and bodily self-awareness, for example, more extended frameworks that also include contributions of interoception (Azzalini et al., 2019, Park and Blanke, 2019), motor processes (Burin et al., 2015, 2017), pre-existing stored representations about what kind of objects that may or may not be part of one’s body (Tsakiris et al., 2010), and high-level cognition and expectations (Lush et al., 2020; Lush 2019). Quantitative computational studies like the present one are needed to formally compare these different theories of body ownership and advance the corresponding theoretical framework.”

One of the reviewers raised the question of to what extent demand characteristics and cognitive expectations could influence the current findings, citing the recent debate on this topic (Lush 2020; Lush et al., 2020; Ehrsson et al., 2022; Lush and Seth 2022; Slater and Ehrsson 2022). Although demand characteristics and trait suggestibility are likely to modulate illusion responses across all levels of asynchrony in the current study (Lush 2020; Slater and Ehrsson 2022), and may thus explain some of the differences in the overall tendency to give ‘yes’ responses across participants, such effects cannot explain the specifically shaped psychometric curves with respect to the subtle stepwise asynchrony manipulations and the widening of this curve by the noise manipulation as predicted by the causal inference model. Note that there are many features of the task design and the experimental procedures that reduce the risk of demand characteristics or changes in decision criterion: a large number of trials were presented in a fully randomized order, robots delivered the stimuli, the experimenter was blind to the noise manipulation and out-of-view of the participants, the data were analyzed and modeled in individual participants, and the BCI model’s specific predictions are “hidden” and very difficult to figure out spontaneously; note further that the shortest asynchrony we used (150 ms) was so brief that most participants did not perceive it reliably as asynchronous (Costantini et al., 2016; Shimada et al., 2009), and the participants did not know how many different asynchrony levels were tested (as revealed in post-experiment interviews). Another crucial point is that no feedback was given to the participants about task performance, so they could not learn to map their responses to the different experimental manipulations. Thus, our main results reflect the effect of the causal experimental manipulations – the level of asynchrony and visual noise – on the RHI detection responses.

We also made some additions to the method section to highlight the features of the task design and the experimental procedures that reduce the risk of demand characteristics:

– Page 26, lines 790 – 792: During the experiment, the experimenter was blind to the noise level presented to the participants, and the experimenter sat out of the participants’ sight.

– Page 26, lines 785 – 787: The participants did not know how many different asynchrony levels were tested (as revealed in informal post-experiment interviews), and no feedback was given on their task performance.

– Page 23, lines 672 – 673: Eighteen healthy participants naïve to the conditions of the study were recruited for this experiment (6 males, aged 25.2 ± 4 years, right-handed; they were recruited from outside the department, never having taken part in a bodily illusion experiment before).

Reviewer #2 (Recommendations for the authors):

Using rubber hand illusion in humans, the study investigated the computational processes in self-body representation. The authors found that the subjects' behavior can be well captured by the Bayesian causal inference model, which is widely used and well described in multisensory integration. The key point in this study is that the body ownership perception was well predicted by the posterior probability of the visual and tactile signals coming from a common source. Although this notion was investigated before in humans and monkeys, the results are still novel:

1. This study directly measures illusions with the alternative forced-choice report instead of objective behavioral measurements (e.g., proprioceptive drift).

2. The visual sensory uncertainty was changed trial by trial to examine the contribution of sensory uncertainty in multisensory body perception. Combined with the model comparison results, these results support the superiority of a Bayesian model in predicting the emergence of the rubber hand illusion relative to the non-Bayesian model.

3. The authors investigated the asynchrony task in the same experiment to compare the computational processing in the RHI task and visual-tactile synchrony detection task. They found that the computational mechanisms are shared between these tasks, while the prior of common source is different.

In general, the conclusions are well supported, and the findings advanced our understanding of the computational principles of body ownership.

Thank you for taking the time to assess our manuscript and carefully review it. We are delighted to read that the reviewer thinks our conclusions are well supported and that our study advances our understanding of the computational principles of body ownership.

Main comments:

1. One of the critical points in this study is the comparison between the BCI model which takes the sensory uncertainty into account and the non-Bayesian (fixed-criterion) model. Therefore, I suggest the authors show the prediction results of both the BCI and non-Bayesian model in Figure 2 and compare the key hypothesis and the prediction results.

This is an excellent suggestion on how to illustrate the key difference between the BCI and FC models. Following this suggestion, we propose to add Author response image 2 as a panel D to Figure 5 (previously Figure 2).

Author response image 2
(D) Finally, this last plot shows simulated outcomes predicted by the Bayesian Causal Inference model (BCI in full lines and bars) and the fixed criterion model (FC in dashed lines and shredded bars).

In this theoretical simulation, both models predict the same outcome distribution for one given level of sensory noise (0%). However, since the decision criterion of the BCI model is adjusted to the level of sensory uncertainty, this Bayesian model predicts an overall increase in the probability of emergence of the rubber hand illusion. In contrast, the FC model, which is a non-Bayesian model, predicts a negligible effect of sensory uncertainty on the overall probability of emergence of the rubber hand illusion.
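
The contrast between the two models can be sketched numerically. This is a simplified illustration rather than the simulation behind the figure: it assumes Gaussian measurement noise, s = 0 under a common cause, a zero-mean Gaussian distribution of asynchronies (SD sigma_s) under separate causes, no lapse rate, and a "yes" response whenever the posterior probability of a common cause exceeds 0.5 (equivalent to |x| < k). The parameter values, the inclusion of a 0 ms asynchrony, and all function names are hypothetical.

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF


def bci_criterion(sigma, sigma_s, p_same):
    """Bound k on |x| below which the posterior of a common cause exceeds 0.5."""
    total_var = sigma ** 2 + sigma_s ** 2
    log_term = math.log(p_same / (1.0 - p_same)) + 0.5 * math.log(total_var / sigma ** 2)
    k_squared = 2.0 * sigma ** 2 * total_var / sigma_s ** 2 * log_term
    return math.sqrt(k_squared) if k_squared > 0 else 0.0


def p_yes(s, sigma, k):
    """Probability that |x| < k when the measurement x ~ N(s, sigma^2)."""
    return PHI((k - s) / sigma) - PHI((-k - s) / sigma)


ASYNCHRONIES_MS = [-500, -300, -150, 0, 150, 300, 500]
SIGMAS_MS = [100.0, 150.0, 200.0]  # hypothetical noise SDs, lowest noise first
SIGMA_S_MS = 300.0                 # hypothetical
P_SAME = 0.8                       # hypothetical prior of a common cause


def mean_p_yes(sigma, k):
    """Average probability of a 'yes' response across the tested asynchronies."""
    return sum(p_yes(s, sigma, k) for s in ASYNCHRONIES_MS) / len(ASYNCHRONIES_MS)


# FC model: the criterion is fixed (here, matched to BCI at the lowest noise).
k_fixed = bci_criterion(SIGMAS_MS[0], SIGMA_S_MS, P_SAME)

# BCI model: the criterion is recomputed for each noise level.
bci_means = [mean_p_yes(sig, bci_criterion(sig, SIGMA_S_MS, P_SAME)) for sig in SIGMAS_MS]
fc_means = [mean_p_yes(sig, k_fixed) for sig in SIGMAS_MS]

for sig, b, f in zip(SIGMAS_MS, bci_means, fc_means):
    print("sigma = %3.0f ms  BCI = %.3f  FC = %.3f" % (sig, b, f))
```

With these assumed values, the two models coincide at the lowest noise level; as noise increases, the BCI criterion widens and the average probability of a "yes" response rises, whereas the fixed criterion leaves that average nearly unchanged.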

2. This study has two tasks: the ownership task and the asynchrony task. As the temporal disparity is the key factor, the criteria for determining the time disparities are important. The authors claim that the time disparities used in the two tasks were determined based on the pilot experiments to maintain an equivalent difficulty level between the two tasks. If I understand correctly, the subjects were asked to report whether the rubber hand felt like my hand or not in the ownership task. Thus, there are no objective criteria of right or wrong. The authors should clarify how they define the difficulty of the two tasks and how to make sure the difficulty is equal.

The reviewer rightfully pointed out that we did not use an objective external criterion of right or wrong answer in our rubber hand illusion detection task thus we should not say that the difficulty is equal in both tasks. Our choice to reduce the range of tested asynchronies in the synchrony detection task was made after collecting the pilot data presented in Supplementary File 2A. In the +/- 500 and +/- 300 ms asynchrony conditions the number of trials for which the visuo-tactile stimulation was perceived as synchronous was consistently very low or never happened (zeros) in many cases. This observation suggests that the synchrony task was too easy with such relatively longer asynchronies and that it would not produce behavioral data that would be useful for model fitting or testing the BCI model. Thus, we adjusted the asynchrony conditions in the synchrony task to make this task more challenging and more comparable to the ownership judgment task in terms of modeling. Note that we could not change the asynchronies in the ownership task to match the synchrony task because we need the longer 300 ms and 500 ms asynchronies to suppress the illusion effectively. We agree that the term “difficulty” was misleading, thus we replaced it by “sensitivity”.

Please note that the key purpose of including two tasks was that we wanted to be able to show that both follow Bayesian causal inference, which is a finding that strengthens our main conclusions about multisensory causal inference in body ownership. We can now conclude from our own data that the rubber hand illusion obeys similar causal inference principles based on sensory evidence and sensory uncertainty as more basic forms of multisensory integration. So having the two tasks in the current study has several advantages. Still, one should think of the synchrony task as a control task in the classic sense of the word because it is a different task than the rubber hand illusion detection task, so there will always be differences between them. The fact that illusory hand ownership can be elicited with longer asynchronies is one such difference, which was already pointed out in earlier work (Shimada et al., 2009).

I think this is important because the time disparities in these two tasks were different, and the comparison of Psame in these tasks may be affected by the time disparities. Furthermore, the authors claimed that the ownership and visual-tactile synchrony perception had distinct multisensory processing according to the different Psame in the two tasks. Thus the authors should show further evidence to exclude the difference of Psame results from the chosen time disparities.

The reviewer raises an interesting point. The fact that the tested asynchronies are different for the two tasks can at first glance be seen as a limitation. However, as explained above, the decision to have shorter asynchronies in the synchrony task was taken after careful consideration and analysis of the pilot data. As mentioned in the discussion, we cannot exclude that this methodological choice might have influenced the observed differences in the causal prior in the two tasks (i.e., higher prior probability of a common cause for the smaller range of asynchrony) instead of reflecting differences in causal inference of hand ownership versus visuotactile simultaneity perception, as we suggest based on theoretical considerations and the previous literature.

However, to further examine the reviewer’s concern, we here report the results from an additional analysis. We applied our extension analysis to the pilot data to test the BCI model on tasks with identical asynchronies, as shown in Supplementary File 2B. Note that the pilot study did not manipulate the level of sensory noise (only the 0% noise level was included). As described above, this pilot study included ten naïve participants and used the same setup and procedures as the main experiment, with the only exceptions being that the asynchronies in the synchrony judgment task were identical to those in the ownership judgment task and that, as noted, there was no noise manipulation.

Author response image 3 shows the key results regarding the estimated psame. Note that the same trend is observed as in the main experiment: the estimated a priori probability for a common cause for synchrony judgment is lower than for body ownership judgment. However, in this pilot experiment psame for body ownership and synchrony are not correlated (Spearman correlation: S = 150, p = 0.81, rho = 0.09), and for more than half of our pilot participants, psame for body ownership reaches the extremum (psame = 1). This ceiling effect might be because the synchrony task was too “easy” when using asynchronies of 300 ms and 500 ms; it lacked the challenging stimulation conditions required to assess the participants’ perception as a finely graded function, and thus these data were unsuitable for our modeling purposes. This observation further convinced us that we needed to make the synchrony judgment task more difficult by reducing the longer asynchronies to obtain high-quality behavioral data that would allow us to test the subtle effects of sensory noise, compare different models, and compare with the ownership judgment task in a meaningful way.

Author response image 3
Correlation between the prior probability of a common cause psame estimated for the ownership and synchrony tasks in the extension analysis in the pilot study (left) and the main study (right).

The psame estimate is significantly lower for the synchrony task than for the ownership task. The solid line represents the linear regression between the two estimates, and the dashed line represents the identity function (f(x) = x).

3. Related to the question above, the authors found that the same BCI model can reasonably predict the behavioral results in these two tasks with the same parameters (or only different Psame). They claimed that these two tasks shared similar results in multisensory causal inference. While in the following, they argued that there was a distinct multisensory perception in these two tasks because the Psame were different. If these tasks shared the same BCI computational approaches and used the posterior probability of common source to determine ownership and synchrony judgment, what is the difference between them?

We thank the reviewer for this opportunity to clarify our claim. We think it makes good sense that the degree of asynchrony of the visual and tactile stimuli drives both the synchrony judgments (for obvious reasons) and the rubber hand illusion judgments (since the temporal congruence of the visuotactile correlations is a key factor that drives the emergence of the illusion). Both follow multisensory causal inference, which we can demonstrate with our own data, and which supports our main claim about multisensory body ownership. But of course, the tasks are different in that the perception of simultaneous brief visuotactile events and the perception of the rubber hand as one’s own are different percepts (and the latter also involves other sources of sensory evidence that we kept constant in the current study, such as visuo-proprioceptive spatial congruence, the number of correlated stimuli, and the prior state of body representation), so we think it is to be expected that psame differs between them.

What our modeling analysis shows is that both tasks use the same probabilistic principles for sensory decision making: the same strategy to combine sensory information and take into account sensory uncertainty. The difference lies in what type of prior information is combined and what the causal inference is about. The sensory input signals to be considered are the same, since the visuotactile stimulations are the same, as well as the visual impressions of the rubber hand and the proprioceptive feedback from the hidden hand. However, the prior information used in the causal inference process is different. Hence, we conclude that the two tasks use similar computational principles but call upon different prior representations, in line with the fact that the tasks probe two distinct types of perception. The prior is an umbrella term that can reflect different types of information. Nevertheless, the finding of different priors in the two tasks is in line with our conclusion of differences in computational processing between the two tasks. However, we agree that the comparison of the psame across the two tasks is the least conclusive finding in the current study, and that is why we also discuss this finding critically in the Discussion section. Note, however, that this finding is independent of our main finding that hand-ownership feeling in the rubber hand illusion follows Bayesian causal inference based on sensory information and sensory uncertainty.

4. The extension analysis showed that the Psame values from the two tasks were correlated across subjects. This is very interesting. However, since the uncertainty of timing perception (the sensory uncertainty of perceived time of the tactile stimuli on real hand and fake hand) was taken into account to estimate the posterior probability of common source in this study, the variance across subjects in ownership and synchrony task can only be interpreted by the Psame. In fact, the timing perception was considered as a Gaussian distribution in a modulated version of the BCI model for the agency (temporal binding) (R. Legaspi, 2019) and ownership (Samad, 2015). It will be more persuasive if the authors exclude the possibility that the individual difference of timing uncertainty cannot explain the variance across subjects.

We are happy to read that the reviewer found this result interesting. Let us clarify our approach: in our study, the noise parameters (i.e., the timing uncertainty) are fitted individually, just like the psame parameters; therefore, they reflect part of the variability across participants. Thus, this does not go against our interpretation of the results: the observed differences between the two tasks reflect the use of different priors between the two types of perception and not an individual difference in the measured timing uncertainty.

5. Please include the results of single-subject behavior in the asynchrony task. It is helpful to compare the behavioral pattern between these two tasks. The authors compared the Psame between ownership and asynchrony tasks. Still, they did not directly compare the behavioral results (e.g., the reported proportion of elicited rubber hand illusions and the reported proportion of perceived synchrony).

We agree with the reviewer that several approaches could be used to compare ownership and synchrony detection data. Focusing on a comparison at the computational level as we do means that we handle a “summary” of the datasets, obtained thanks to careful and statistically robust fitting methods. By doing so we avoid limitations related to multiple comparisons and random effects from the participant samples. These limitations could be controlled for in a more traditional approach based on classic inferential statistics, but not without impacting the statistical power of our analyses. Critically, in the present study we use a model-based approach to learn more about the computational principles of body ownership, so we want to use this approach throughout the manuscript. Adding a lot of extra comparisons based on descriptive statistics might be distracting to the reader (and we already have lots of results in the current manuscript). Moreover, the two tasks have many differences that make direct statistical comparisons of the sort the reviewer is recommending somewhat difficult to interpret. Thus, we prefer to focus on a model-based comparison of the two tasks, using a computational approach to pinpoint differences in processes between the tasks rather than specific observations. Additionally, we would like to emphasize that the main findings of the study are the fit of the BCI model to the body ownership data and the modulation of body ownership by the sensory uncertainty. The comparison with the synchrony judgment task and the finding of the different psame are interesting and novel but can almost be seen as a secondary “bonus finding” of the study, albeit one that we think is of interest to researchers in the rubber hand illusion community. Nonetheless, if the reviewer thinks it would be insightful, we could add the plot of the individual responses and corresponding fit presented within Author response image 4 as Figure3_Supplement4.

Author response image 4
Individual data and BCI model fit.

The figure displays one plot per participant. In each plot, the proportion of “yes [the rubber hand felt like my own hand]" answers is plotted as a function of visuo-tactile asynchrony (dots), together with the corresponding BCI model fit (curves). As in the main text, dark blue, light blue, and cyan correspond to the 0%, 30%, and 50% noise levels, respectively.

6. The analysis of model fitting seems to lack some details. If I understood it correctly, the authors repeated the fitting procedure 100 times (line 502), then averaged all repeats as the final results of each parameter? It is reported that "the same set of estimated parameters at least 31 times for all participants and models". What does this sentence mean? Can the authors show more details about the repeats of the model result?

As the reviewer mentioned, we repeated the fitting procedure 100 times, with different initial values for the parameters; however, we did not average the results but selected the parameter set that maximized the log-likelihood of the model. This procedure is a way to find the “best estimation” of the parameters and avoid local minima (sets of estimated parameters that do not lead to the true minimum negative log-likelihood for a given participant and a given model).

Finding the same set of estimated parameters several times is an argument in favor of the robustness of our fitting procedure: the optimization algorithm leads to the same “best set of estimated parameters” starting from different initial values for the parameters.

We added the following sentence to the parameter estimation section to clarify this procedure:

“The best estimate from either of these two procedures was kept, i.e., the set of estimated parameters that corresponded to the maximal log-likelihood for the models”
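The multi-start logic described in this response (many random initializations, keep the run with the highest log-likelihood, and count how often the best solution recurs) can be sketched as follows. This is only an illustration: the toy objective `toy_nll`, the simple pattern search in `local_search`, and the search bounds are hypothetical stand-ins, not the optimizer or likelihood used in the study.

```python
import random

def local_search(nll, x0, step=0.5, tol=1e-6):
    """Tiny derivative-free descent on a 1-D function (stand-in for a real optimizer)."""
    x, fx = x0, nll(x0)
    while step > tol:
        moved = False
        for cand in (x - step, x + step):
            f_cand = nll(cand)
            if f_cand < fx:
                x, fx, moved = cand, f_cand, True
        if not moved:
            step /= 2.0  # refine the step once no neighbor improves the fit
    return x, fx

def multi_start(nll, n_starts, lower, upper, rng):
    """Run the local search from random starts and keep the lowest negative log-likelihood."""
    runs = [local_search(nll, rng.uniform(lower, upper)) for _ in range(n_starts)]
    best_x, best_nll = min(runs, key=lambda run: run[1])
    # Number of runs that converged to the same best parameter value.
    n_agree = sum(1 for x, _ in runs if abs(x - best_x) < 1e-3)
    return best_x, best_nll, n_agree

def toy_nll(x):
    """Hypothetical objective with a global minimum near -2.03 and a local one near 1.97."""
    return (x * x - 4.0) ** 2 + x

rng = random.Random(0)
best_x, best_nll, n_agree = multi_start(toy_nll, 100, -4.0, 4.0, rng)
print(best_x, best_nll, n_agree)
```

A single run started on the wrong side of the hill would stop at the local minimum; selecting the best of many runs avoids this, and a large `n_agree` indicates that the best solution is found reproducibly.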

7. In Figure 3A, was there an interaction effect between the time disparity and visual noise level?

The interaction between time disparity and visual noise was not significant (F(12, 280) = 1.6, p = 0.097). Note that Figure 3 is now Figure 1.

8. Line 624, the model comparison results suggested that the subjects have the same standard deviation as the true stimulus distribution. I encourage the authors to directly compare the BCI* model predicted uncertainty (σ s) to the true stimulus uncertainty, which will make this conclusion more convincing.

We respectfully disagree that a comparison based on one parameter estimation would be a stronger argument to select the best model between BCI* and BCI than a confidence interval and bootstrap analysis using AIC and BIC since we are dealing here with a parsimony question. Our analysis shows that the BCI* would present a risk of an overfit. Since the estimated value of one parameter is not independent of the evaluation of the other parameters, an analysis based on parameter estimation in the case of an overfitted model does not make sense to us. Hence, we find it more relevant to use a model comparison method that is based on the overall fit of the models and not just one parameter value.

9. How did the authors fit the BCI model to the combined dataset from both ownership and synchrony tasks? What is the cost function when fitting the combined dataset?

We have added the information about the extension analysis the reviewer is asking about to the appendix:

When fitting the BCI to the combined dataset, we used the same expression of the log-likelihood to be maximized as when fitting only one dataset:

$$\log L(\theta) = \sum_{i,j} \left[ n_{1ij} \log p(\hat{C}=1 \mid s_j, \theta) + n_{0ij} \log\left(1 - p(\hat{C}=1 \mid s_j, \theta)\right) \right]$$

where $n_{1ij}$ and $n_{0ij}$ are the observed data, i.e., the numbers of times the participant reported “yes” and “no”, respectively, in the (i, j)th condition.

When one dataset was considered, there were 21 conditions (7 asynchronies and 3 levels of visual noise). When the datasets were combined, the conditions were doubled since all experimental conditions were tested in the body ownership task and in the synchrony detection task. Most parameters were fitted to all the data, the only exception being when a different value of psame was estimated for each type of judgment. In this case, psame is affected by only half of the data. As a result, the parameters to be fitted (θ) depended on the type of extension analysis:

$\theta = [p_{same}, \sigma_0, \sigma_{30}, \sigma_{50}, \lambda]$, when both tasks shared all parameters.

$\theta = [p_{same,ownership}, p_{same,synchrony}, \sigma_0, \sigma_{30}, \sigma_{50}, \lambda]$, when a different value of psame was estimated for each type of judgment.
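The combined cost function can be sketched as the sum of two binomial log-likelihoods that share the noise and lapse parameters but may use a task-specific psame. In this sketch, the response model (`p_yes`), the way the lapse rate enters (`lapse / 2 + (1 - lapse) * p`), the treatment of `sigma_s` as a single fixed constant, and all numbers are assumptions made for illustration; the function and variable names are hypothetical.

```python
import math

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p_yes(s, sigma, sigma_s, p_same, lapse):
    """Assumed probability of a 'yes' response at asynchrony s."""
    var, var_s = sigma ** 2, sigma_s ** 2
    k_sq = (var * (var + var_s) / var_s) * (
        2.0 * math.log(p_same / (1.0 - p_same)) + math.log((var + var_s) / var)
    )
    k = math.sqrt(max(k_sq, 0.0))
    p = phi((k - s) / sigma) - phi((-k - s) / sigma)
    return lapse / 2.0 + (1.0 - lapse) * p  # assumed lapse formulation

def log_likelihood(counts, sigmas, sigma_s, p_same, lapse):
    """counts maps (noise level, asynchrony) to (number of 'yes', number of 'no')."""
    total = 0.0
    for (noise, s), (n_yes, n_no) in counts.items():
        p = p_yes(s, sigmas[noise], sigma_s, p_same, lapse)
        p = min(max(p, 1e-9), 1.0 - 1e-9)  # guard against log(0)
        total += n_yes * math.log(p) + n_no * math.log(1.0 - p)
    return total

def combined_log_likelihood(own, sync, sigmas, sigma_s, p_same_own, p_same_sync, lapse):
    """Shared sigmas and lapse; psame may differ between the two tasks."""
    return (log_likelihood(own, sigmas, sigma_s, p_same_own, lapse)
            + log_likelihood(sync, sigmas, sigma_s, p_same_sync, lapse))

# Made-up counts for a few conditions, for illustration only.
own = {(0, 0): (10, 2), (0, 300): (4, 8), (50, 300): (9, 3)}
sync = {(0, 0): (11, 1), (0, 200): (3, 9), (50, 200): (6, 6)}
sigmas = {0: 120.0, 50: 300.0}

ll_shared = combined_log_likelihood(own, sync, sigmas, 350.0, 0.8, 0.8, 0.05)
ll_split = combined_log_likelihood(own, sync, sigmas, 350.0, 0.8, 0.6, 0.05)
print(ll_shared, ll_split)
```

Passing the same value for both psame arguments corresponds to the case where the two tasks share all parameters; passing different values corresponds to the case where each psame is constrained by only half of the data.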

10. As shown in the supplementary figures, the variations of ownership between the three visual noise levels varied widely among subjects and the predicted visual sensory sigmas (Appendix 1 – Table 2). The ownership in the three visual noise levels correlated with the individual difference of visual uncertainty?

We are not sure we fully understand the reviewer's concern here. Indeed, the Appendix 1 – Table 2 presents the results of the parameter recovery analysis; therefore, it is based on simulation and not actual participants. The variability is therefore greater than in the actual dataset. Moreover, in our study, because all the parameters are fitted individually, the variations in all parameters explain the variations among individuals, and all contribute to the goodness of the final fit.

11. The statements of the supplementary figures are confusing. For example, it is hard to determine which one is the "Supplementary File 1. A" in line 249?

We apologize for this lack of clarity. We initially created a supplementary material document containing supplementary figures, files, and notes. However, the submission format to eLife does not allow supplementary notes. In accepted manuscripts, the supplementary figures would be directly linked to the main figures they supplement (e.g., Figure 1_Supplement1) while the supplementary files are meant to be independent, which should help clarify the referencing of the supplementary material. For more clarity, we transformed supplementary file 2 into appendix 1 – Section 4 and 5 and supplementary file 1 into figure 4 – supplement 1 and 2.

12. Line 1040, it is hard to follow how to arrive at this equation from the previous formulas. Please give some more details and explanations.

Good point. We have added the following equations to make the different steps leading to the expression of the decision criterion easier to follow.

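$$\frac{x_{trial}^2}{2}\left(\frac{1}{\sigma^2} - \frac{1}{\sigma^2+\sigma_s^2}\right) < \log\left(\frac{p_{same}}{1-p_{same}}\right) + \frac{1}{2}\log\left(\frac{\sigma^2+\sigma_s^2}{\sigma^2}\right)$$
$$x_{trial}^2 < \frac{\sigma^2(\sigma^2+\sigma_s^2)}{\sigma_s^2}\left(2\log\left(\frac{p_{same}}{1-p_{same}}\right) + \log\left(\frac{\sigma^2+\sigma_s^2}{\sigma^2}\right)\right)$$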
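As a numerical sanity check of these steps, the sketch below compares the sign of the decision variable (log prior ratio plus log-likelihood ratio) with the closed-form criterion on the squared measurement. It assumes, as the inequalities imply, zero-mean Gaussian likelihoods with variance σ² under a common cause and σ² + σs² under separate causes; the parameter values and function names are hypothetical.

```python
import math

def log_normal_pdf(x, var):
    """Log density of a zero-mean Gaussian with variance var."""
    return -0.5 * math.log(2.0 * math.pi * var) - x * x / (2.0 * var)

def decision_variable(x, sigma, sigma_s, p_same):
    """d = log prior ratio + log-likelihood ratio; d > 0 favors a common cause."""
    var_common = sigma ** 2
    var_separate = sigma ** 2 + sigma_s ** 2
    log_prior_ratio = math.log(p_same / (1.0 - p_same))
    log_lik_ratio = log_normal_pdf(x, var_common) - log_normal_pdf(x, var_separate)
    return log_prior_ratio + log_lik_ratio

def criterion_squared(sigma, sigma_s, p_same):
    """Right-hand side of the final inequality on x_trial squared."""
    var, var_s = sigma ** 2, sigma_s ** 2
    return (var * (var + var_s) / var_s) * (
        2.0 * math.log(p_same / (1.0 - p_same)) + math.log((var + var_s) / var)
    )

# Hypothetical parameter values (ms and probability), for illustration only.
sigma, sigma_s, p_same = 120.0, 350.0, 0.8
k_sq = criterion_squared(sigma, sigma_s, p_same)

# The two formulations should agree for every measurement on the grid.
agree = all(
    (decision_variable(x, sigma, sigma_s, p_same) > 0.0) == (x * x < k_sq)
    for x in range(-1000, 1001, 7)
)
print(agree, math.sqrt(k_sq))
```

The check relies on the fact that the difference of the two Gaussian log densities contributes the quadratic term in the measurement and the half log-variance ratio, which is exactly what the first inequality rearranges.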

Reviewer #3 (Recommendations for the authors):

This study investigated the computational mechanisms underlying the rubber hand illusion. Combining a detection-like task with the rubber hand illusion paradigm and Bayesian modelling, the authors show that human behaviour regarding body ownership can be best explained by a model based on Bayesian causal inference which takes into account the trial-by-trial fluctuations in sensory evidence and adjusts its predictions accordingly. This is in contrast with previous models which use a fixed criterion and do not take trial-by-trial fluctuations in sensory evidence into account.

The main goal of the study was to test whether body ownership is governed by a probabilistic process based on Bayesian causal inference (BCI) of a common cause. The secondary aim was to compare the body ownership task with a more traditional multisensory synchrony judgement task within the same probabilistic framework.

The objective and main question of the study is timely and interesting. The authors developed a new version of the rubber hand illusion task in which participants reported their perceived body ownership over the rubber hand on each trial. With the manipulation of visual uncertainty through augmented reality glasses they were able to assess whether trial-by-trial fluctuation in sensory uncertainty affects body ownership – a key prediction of the BCI model.

This behavioural paradigm opens up the intriguing possibility of testing the BCI model for body ownership at a neural level with fMRI or EEG (e.g., as in Rohe and Noppeney (2015, 2016) and Aller and Noppeney (2019)).

I was impressed by the methodological rigour, modelling and statistical methods of the paper. I was especially glad to see the modelling code validated by parameter recovery. This greatly increases one's confidence that good coding practices were followed. It would be even more reassuring if the analysis code were made publicly available.

The data and analyses presented in the paper support the key claims. The results represent a relevant contribution to our understanding of the computational mechanisms of body ownership. The results are adequately discussed in light of a broader body of literature. Figures are well designed and informative.

Thank you for this positive evaluation of our work. We are especially pleased to read that the reviewer thinks that our modeling approach is sound and that our data and analyses support the key claims. We were also delighted to see that you highlighted that the current approach offers new perspectives, including the possibility of investigating the neural basis of causal inference of body ownership. Indeed, this is something we are currently pursuing in a new fMRI study (see the attached abstract, the corresponding manuscript is currently under review; see also our responses to reviewer 1).

Main points:

1. Line 298: It is not clear if all 5 locations were stimulated in each 12 s stimulation phase or they were changed only between stimulation phases. Please clarify.

The clarification has been added to the manuscript: “All five locations were stimulated at least once in each 12 s trial and the order of stimulation sites randomly varied from trial to trial.” Indeed, a 12 s stimulation meant six touches: one on every location, with one location stimulated a second time. The order of stimulation and the location stimulated twice were randomly chosen from one trial to the other (see below an example of the stimulation sequence for one participant).

Example of a stimulation location sequence, the touch location being numbered from 1 to 5:

Trial 1: 4, 2, 3, 1, 2, 5 – Trial 2: 3, 5, 4, 2, 5, 1 – Trial 3: 2, 4, 3, 3, 5, 1 – Trial 4: 5, 2, 1, 3, 4, 2 – ….
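A sequence with these properties can be generated as in the following minimal sketch. The function name and the use of Python's `random` module are assumptions for illustration; the response does not describe the software actually used to randomize the touches.

```python
import random

def stimulation_sequence(rng, n_locations=5, n_touches=6):
    """Return a random touch order in which every location appears at least once."""
    locations = list(range(1, n_locations + 1))
    # The remaining touch(es) repeat randomly chosen locations.
    extras = [rng.choice(locations) for _ in range(n_touches - n_locations)]
    sequence = locations + extras
    rng.shuffle(sequence)
    return sequence

rng = random.Random(1)
for trial in range(1, 5):
    print(f"Trial {trial}:", stimulation_sequence(rng))
```

Because the repeated location is shuffled together with the others, the same location may occasionally be touched twice in a row, as in Trial 3 of the example above.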

2. Line 331: "The 7 levels of asynchrony appeared with equal frequencies in pseudorandom order". I assume this was also true to the noise conditions, i.e., they also appeared in pseudorandom order with equal frequencies and not e.g., blocked. Could you please make this explicit here?

The reviewer is correct; in the new version of the manuscript, we have explicitly explained the pseudo randomization of the noise conditions:

“The three levels of noise also appeared with equal frequencies in pseudorandom order.”

3. Line 348: Was the pilot study based on an independent sample of participants from the main experiment? Please also include standard demographics data (mean+/-SD age, sex) from the pilot study.

The pilot participants were different from the ones participating in the main experiment. We added the demographics of the pilot sample to the manuscript:

“(3 males, aged 27.0 ± 4 years, different than the main experiment sample)”.

4. Line 406: From the standpoint of understanding the inference process at a high level, the crucial step of how prior probabilities are combined with sensory evidence to compute posterior probabilities is missing from the equations. More precisely it is not exactly missing, but it is buried inside the definition of K (line 416) if I understand correctly. I think it would make it easier for non-experts to follow the thought process if Equation 5 from Supplementary material would be included here.

We thank the reviewer for pointing out this lack of clarity. We tried to make our procedure as transparent as possible without penalizing the readability of our manuscript. Following the reviewer’s suggestion, we added the equation showing how prior probability and sensory evidence are combined (Equation 5 in the appendix) when presenting the inference step of the model in the main manuscript:

“… This equation can be written as a sum of the log prior ratio and the log-likelihood ratio:

$$d = \log\left(\frac{p_{same}}{1-p_{same}}\right) + \log\left(\frac{p(x_{trial} \mid C=1)}{p(x_{trial} \mid C=2)}\right)$$

5. Line 511: There are different formulations of BIC, could you please state explicitly the formula you used to compute it? Please also state the formula for AIC.

The BIC and AIC formulas, originally reported only in the appendix, are now added to the main text as well.

“We calculated AIC and BIC values for each model and participant according to the following equations:

$$AIC = 2 n_{par} - 2 \log L$$
$$BIC = n_{par} \log n_{trial} - 2 \log L$$

where L is the maximized value of the likelihood, npar the number of free parameters, and ntrial the number of trials.”
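For concreteness, the two criteria can be computed as in the short sketch below, which uses the standard definitions with the natural logarithm. The function names and the example numbers (log-likelihood, number of parameters, number of trials) are hypothetical.

```python
import math

def aic(log_l, n_par):
    """Akaike information criterion from the maximized log-likelihood."""
    return 2.0 * n_par - 2.0 * log_l

def bic(log_l, n_par, n_trial):
    """Bayesian information criterion from the maximized log-likelihood."""
    return n_par * math.log(n_trial) - 2.0 * log_l

# Hypothetical example: maximized log-likelihood of -100 with 5 free parameters.
example_aic = aic(-100.0, 5)
example_bic = bic(-100.0, 5, 252)
print(example_aic, example_bic)
```

Lower values indicate a better trade-off between fit and complexity; BIC penalizes each extra parameter more strongly than AIC whenever the logarithm of the number of trials exceeds 2.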

6. Line 512: "Badness of fit": Interesting choice of word, I completely understand why it is chosen here, however perhaps I would use "goodness of fit" instead to avoid confusion and for the sake of consistency with the rest of the paper.

We followed the reviewer’s suggestion and used the term “goodness of fit” to avoid any confusion.

7. Figure 4: I think the title could be improved here, e.g., "Model predictions of behavioural results for body ownership" or something similar. Details in the current title (mean +/- sem etc.) could go in the figure legend text.

I am a bit confused about why the shaded overlays from the model fits are shaped as oblique polygons? This depiction hints that there is a continuous increase in the proportion of "yes" answers in the neighbourhood of each noise level. Aren't these model predictions based on a single noise level value?

The mean model predictions are not indicated in the figure only the +/- SEM ranges marked by the shaded areas.

We followed the reviewer’s recommendation to improve the figure’s title and renamed Figure 4, now Figure 2: “Observed and predicted detection responses for body ownership in the rubber hand illusion”. Moreover, we thank the reviewer for pointing out a mistake in the code used to plot subplots A and C, where continuous predictions were used instead of being pooled by noise level. The figure has been edited accordingly. Finally, we created a Figure2_Supplement5 that displays the proportion of “yes” responses (mean +/- SEM) predicted by our main models.

Line 261: Given that participants' right hand was used consistently, I am wondering if it makes any difference if the dominant or non-dominant hand is used to elicit the rubber hand illusion? If this is a potential source of variability it would be useful to include information on the handedness of the participants in the methods section.

To the best of our knowledge, no significant effect of handedness has been observed in the rubber hand illusion (Smit M, Kooistra DI, van der Ham IJM, Dijkerman HC (2017) Laterality and body ownership: Effect of handedness on experience of the rubber hand illusion. Laterality 22:703–724.). The illusion seems to work equally well on the right hand and the left hand, and in the literature, one finds that many studies use the left hand (as Botvinick and Cohen 1998) and that many use the right hand (which we typically do because most neuroimaging studies have studied the right hand, so we know more about the central representation of the right upper limb). However, we included only right-handed participants to eliminate any putative effect of handedness in the current data. This clarification has been added to the method section.

Line 314: Please use either stimulation "phase" or "period" consistently across the manuscript

We thank the reviewer for pointing out this lack of consistency. We now use ‘stimulation period’ instead of phase.

Line 422: The study by Körding et al., (2007) is about audiovisual spatial localization, not synchrony judgment as is referenced currently. Please correct.

We corrected our mistake.

Line 529: Perhaps the better paper to cite here would be Rigoux et al., (2014) as this is an improvement over Stephan et al., (2009) and also this is the paper that introduces the protected exceedance probability which is used here.

We agree with the reviewer, the reference to Stephan et al.’s study was not the most relevant. We are now citing Rigoux et al., (2014) instead.

Figure 3 and potentially elsewhere: Please overlay individual data points on bar graphs. There is plenty of space to include these on bar graphs and would provide valuable additional information on the distribution of data.

Individual data points have been added to the bar plot on figure 1 (former Figure 3). The bar plots on figure 2 (former figure 4) display the same data with the addition of the model predictions. To avoid overloading the latter figure, we did not add the individual data points (that would be the same as in figure 1.B) but that can be done if needed. Individual data points have also been added to Figure3_Supplement3.

Figure 5A: Please consider increasing the size of markers and their labels for better visibility (i.e., similar size as in panels B and C).

Markers’ size has been increased.

Line 608,611, 639-640, and potentially elsewhere: Please indicate what values are stated in the form XX +/- YY. I assume they represent mean +/- SEM, but this must be indicated consistently throughout the manuscript.

The reviewer is correct; for consistency, we always used mean +/- SEM; this is now indicated across the Results section.

[Editors’ note: further revisions were suggested prior to acceptance, as described below.]

In particular, there was a consensus that the potential contribution of demand characteristics should be discussed, rather than dismissed. We also ask that you discuss the potential utility of a more simple model (signal detection theory). Please see more details below.

Thank you for your feedback and the time you spent on our manuscript. We now discuss the potential contribution of demand characteristics without dismissing the concern. We also explicitly mention the signal detection theory in the text.

(1) The authors added a paragraph detailing why they minimised (in their opinion) the contributions of demand characteristics. They argue that the theory that subjects respond to demand characteristics "cannot explain the specifically shaped psychometric curves with respect to the subtle stepwise asynchrony manipulations and the widening of this curve by the noise manipulation as predicted by the causal inference model." We were not fully convinced by this argument and wonder why you would want to categorically rule this possibility out.

Reviewer 1 wrote back: This claim is false. The authors should spell out the alternative theory in terms of subjects using general purpose Bayesian inference to infer what is required. Subjects do not need to know how general purpose inference works; they just need to use it. Does the fact that information was delivered trial by trial over many trials rule out general purpose Bayesian inference? On the contrary, it supports it. Bayesian updating is particularly suited to trial by trial situations (e.g. see general learning and reasoning models by Christoph Mathys), as illustrated by the authors' own model. The authors' model could be a general purpose model of Bayesian inference, shown by its applicability to the asynchrony judgment task. The fact that the number of asynchrony levels may not have been noticed by subjects is likewise irrelevant; subjects do not need to know this. Indeed, the authors point out that the asynchrony judgment task was easier than the RHI task, so that the authors needed to use a smaller asynchrony range for this task than the RHI one. That is, subjects' ability to discriminate asynchronies is shown by the authors' data to be more than enough to allow task-appropriate RHI judgments. (Even if successive differences between asynchrony levels were below a jnd, one would of course still get a well formed psychophysical function over an appropriate span of asynchronies; so subjects could still use asynchrony in lawful ways as part of general reasoning, i.e. apart from any specific self module.)

(In their cover letter the authors bring up other points that are covered by the fact that subjects do not need to know how e.g. inference and imagination work in order to use them. It is well established that response to imaginative suggestions involves the neurophysiology underlying the corresponding subjective experience (e.g., see https://psyarxiv.com/4zw6g/ for a review); imagining a state of affairs will create appropriate fMRI or SCR responses without the subject knowing how either fMRI or SCRs work.) In sum, none of the arguments raised by the authors actually count against the alternative theory of general inference in response to demand characteristics (followed by imaginative absorption), which remains a simple alternative explanation.

Following a discussion, we ask that, to address the Reviewer's perspective, you acknowledge the possibility that demand characteristics are contributing to participants' performance.

In the new version of the manuscript, we now acknowledge that demand characteristics can influence the participants' judgments; moreover, we point out how they could act within our computational framework (by influencing psame).

However, we prefer not to speculate about unknown cognitive factors that may contribute to demand characteristics in the current study. We find reviewer 1’s proposal about (unconscious) Bayesian general inference in response to demand characteristics followed by imaginative absorption very speculative and improbable. The participants received no experimental feedback, and we are not aware of any published study that has reported (unconscious) Bayesian general inference in a paradigm like ours; the articles the reviewer referred to in law and medicine are very different. Note further that in control experiments with similar asynchrony manipulation, we find little evidence of general cognitive inference (https://psyarxiv.com/uw8gh). Regarding “imaginative absorption” and “hypnotic hallucinations”, we are not aware of any previous study that has reported perceptual or imaging results showing such effects to occur in normal, not highly hypnotizable subjects, and in the absence of active hypnotic suggestions (the classic studies cited in the Dienes and Lush preprint, e.g., Kosslyn et al., 2000 Am J Psychiatry, used highly hypnotizable subjects and hypnotic induction). Also, as we pointed out in our previous response, in Lush et al., 2021, there was no relationship between hypnotic suggestibility and the subjective strength of the rubber hand illusion when considering the difference between synchronous and asynchronous conditions; and if no changes were seen in such a simple design, we do not understand how hypnotic hallucinations or “imaginative absorption” could explain the current detailed psychometric curves as a function of asynchrony and noise level.
Indeed, strong claims like the ones the reviewer is making would require further formalization of this theory, especially quantifiable – hence testable – predictions about the expected influence of demand characteristics on the participants’ behavior when confronted with variable degrees of asynchronous stimulation. Because of this lack of specific predictions and based on the reasons we outlined in the previous rebuttal letter, we think that considering demand characteristics and associated cognitive factors as a major explanation for the current findings is very speculative. But, as stated above, we now acknowledge the possibility of demand characteristics in the text, have taken the reviewer’s perspective into account, and added the following to the discussion (page 20, lines 584-588):

“While it seems plausible that psame reflects the real-world prior probability of a common cause of the visual and somatosensory signals, it could also be influenced by experimental properties of the task, demand characteristics (participants forming beliefs based on cues present in a testing situation, Weber et al., 1972; Corneille and Lush, 2022, Slater and Ehrsson, 2022), and other cognitive biases.”

(2) "How the a priori probabilities of a common cause under different perceptive contexts are formed remains an open question."

Reviewer 1: One plausible possibility is that in a task emphasizing discrimination (asynchrony task) vs emphasizing integration (RHI) subjects infer different criteria are appropriate; namely in the former, one says "same" less readily.

We agree that we cannot exclude that experimental properties of the tasks may influence the responses. Thus, in response to the reviewer’s comment, we edited the following sentence in the paragraph discussing what factors may explain the observed difference in psame between the two tasks (page 20, lines 582-586):

“While it seems plausible that psame reflects the real-world prior probability of a common cause of the visual and somatosensory signals, it could also be influenced by experimental properties of the task, demand characteristics (participants forming beliefs based on cues present in a testing situation, Weber et al., 1972; Corneille and Lush, 2022, Slater and Ehrsson, 2022), and other cognitive biases.”

and

(3) "temporally correlated visuotactile signals are a key driving factor behind the emergence of the rubber hand illusion"

Reviewer 1: On the theory that the RH effect is produced by whatever manipulation appears relevant to subjects, of course in this study, where asynchrony was clearly manipulated, asynchrony would come up as relevant. So the question is, are there studies where visual-tactile asynchrony is not manipulated, but something else is, so subjects become responsive to something else? And the answer is yes. Guterstam et al. obtained a clear RH ownership effect and proprioceptive drift for brushes stroking the air, i.e. not touching the rubber hand. Durgin et al. obtained a RH effect with laser pointers, i.e. no touch involved either. The authors may think the latter effect will not replicate; but potentially challenging results still need to be cited.

Here we do not ask that you provide an extensive literature review. Instead, we simply ask that you acknowledge in the discussion that task differences might influence participants' performance (similar to our request above).

Following the Reviewer’s and Editors’ suggestion, we rephrased the part of the sentence under discussion to emphasize that other types of factors can influence the RHI and body ownership judgments. Together with the other changes we have implemented in the discussion (see above), we think that the role of visuotactile temporal congruence has now been better contextualized and cognitive factors acknowledged. We added the following sentence (page 16, lines 458-462):

“Even though the present study focuses on temporal visuotactile congruence, spatial congruence (Fang et al., 2019; Samad et al., 2015) and other types of multisensory congruences (e.g., Ehrsson et al., 2005; Tsakiris et al., 2010; Ide 2013; Crucianelli and Ehrsson, 2022) would naturally fit within the same computational framework (Körding et al., 2007, Sato et al., 2007).”

(4) On noise effects.

Reviewer 1: If visual noise increases until 100% of pixels are turned white, the ratio of likelihoods for C=1 vs C=2 must go to 1 (as there is no evidence for degree of asynchrony) so the probability of saying "yes" goes to p_same, no matter the actual asynchrony (which by assumption cannot be detected at all in this case). p_same is estimated as 0.8 in the RHI condition. Yet as noise increases, p(yes) actually increases higher than 0.8 in the -150 to +150 asynchrony range (Figure 2). Could an explanation be given of why noise increases p(yes) other than the apparent explanation I just gave (that p(yes) moves towards p_same as the relative evidence for the different causal processes reduces)?

The deeper issue is that it seems as visual noise increases, the probability that subjects say the rubber hand is their own increases and becomes less sensitive to asynchrony. In the limit, it means that if you had very little visual information and just knew a rubber hand had been placed near you on the table, you would be very likely to say it feels like your own hand (maybe around the level of p_same), just because you felt some stroking on your own hand. But if the reported feeling of the rubber hand were the output of a special self processing system, the prior probability of a rubber hand slapped down on the table being self must be close to 0; and it must remain close to zero even if you felt some stroking of your hand and saw visual noise. But if the tendency to say "it felt like my own hand" was an experience constructed by realizing the paradigm called for this, then the baseline probability of saying a rubber hand is self could well be high – even in the presence of a lot of visual noise.

Please consider that this effect may bear on the two different explanations.

We would first like to point out a misunderstanding here regarding how the causal scenario is selected: the observer is not sampling from the distribution of p(C=1|s)/p(C=2|s) but maximizing it; that’s why p(yes)>psame when the noise increases. In a hypothetical situation where psame is slightly above .5 and visual noise is very high, the participant would indeed rely mostly on the prior, and this would lead to a majority of “yes” responses.
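
The difference between maximizing and sampling can be illustrated with a minimal sketch. It assumes a Gaussian causal inference observer in which a common cause (C=1) implies zero asynchrony, separate causes (C=2) imply asynchronies drawn from a zero-mean Gaussian with standard deviation sigma_s, and the measurement is corrupted by Gaussian noise with standard deviation sigma. The function names and all parameter values (p_same = 0.8, sigma_s = 500 ms, the three noise levels) are illustrative assumptions, not fitted estimates from this study.

```python
import math


def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_pdf(x, sd):
    """Density of a zero-mean Gaussian with standard deviation sd."""
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def posterior_common(x, sigma, sigma_s, p_same):
    """Posterior probability of a common cause given the measurement x."""
    like_c1 = normal_pdf(x, sigma)
    like_c2 = normal_pdf(x, math.sqrt(sigma ** 2 + sigma_s ** 2))
    num = p_same * like_c1
    return num / (num + (1.0 - p_same) * like_c2)


def p_yes_maximizing(s, sigma, sigma_s, p_same):
    """P("yes") when the observer reports C=1 whenever its posterior exceeds 0.5.

    This is equivalent to |x| < k, with k obtained by solving posterior = 0.5.
    """
    var_c2 = sigma ** 2 + sigma_s ** 2
    k_sq = (sigma ** 2 * var_c2 / sigma_s ** 2) * (
        2.0 * math.log(p_same / (1.0 - p_same)) + math.log(var_c2 / sigma ** 2)
    )
    if k_sq <= 0.0:
        return 0.0  # the posterior never exceeds 0.5
    k = math.sqrt(k_sq)
    return phi((k - s) / sigma) - phi((-k - s) / sigma)


def p_yes_sampling(s, sigma, sigma_s, p_same, n=4001):
    """P("yes") when the observer says "yes" with probability equal to the posterior."""
    lo, hi = s - 8.0 * sigma, s + 8.0 * sigma
    dx = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        x = lo + i * dx
        weight = 0.5 if i in (0, n - 1) else 1.0  # trapezoidal rule
        total += weight * normal_pdf(x - s, sigma) * posterior_common(
            x, sigma, sigma_s, p_same
        )
    return total * dx


if __name__ == "__main__":
    asynchrony = 300.0  # ms, hypothetical
    for sigma in (50.0, 500.0, 5000.0):  # hypothetical sensory noise levels
        p_max = p_yes_maximizing(asynchrony, sigma, 500.0, 0.8)
        p_samp = p_yes_sampling(asynchrony, sigma, 500.0, 0.8)
        print(f"sigma={sigma:7.1f}  maximizing={p_max:.3f}  sampling={p_samp:.3f}")
```

Under these assumptions, the sampling observer's p(yes) settles near psame as the noise grows, whereas the maximizing observer's p(yes) keeps rising above psame whenever psame exceeds 0.5.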

Second, we are concerned that in the theoretical case presented here, with 100% visual noise, participants would not see anything; it would be like a unisensory stimulation condition. Recall that we ask our participants “did the hand you saw feel like it was your own hand?”. If the signal is indiscernible from the noise, participants won’t see a hand, and therefore there’s no measurement of visuotactile asynchrony, and thus our model is not applicable. Moreover, in this “extreme case scenario” mentioned by the reviewer, the participant is aware of a hand being put in front of them while observing an extremely noisy visual scene, and the reviewer reasons that this would lead the participant to report having the illusion. We argue that having your hand stroked while seeing a hand near yours not being touched could be likened to a strong incongruence from a visuotactile perspective (asynchrony > 12 seconds), which would still lead to a “no” answer, despite a noisy visual signal, as long as the participant is able to measure a visual signal (i.e., not noise only).

Finally, we would like to remind the reviewer that our data are much richer than merely a baseline probability of reporting the illusion. Namely, we measure and model detailed psychometric curves as a function of asynchrony and noise level. Thus, we do not believe that demand characteristics would provide a complete alternative explanation. Nevertheless, we have acknowledged in the discussion that psame could be influenced by demand characteristics (see our response to the first point).

(5) The authors reject an SDT model in the cover letter on the grounds that subjects would not know where to place their criterion on a trial by trial basis taking into account sensory uncertainty.

Reviewer 1: Why could not subjects attempt to roughly keep p(yes) the same across uncertainties? If the authors say this is asking subjects to keep track of too much, note in the Bayesian model the subjects need an estimate of a variance for each uncertainty to work out the corresponding K. That seems to be asking for even more from subjects. The authors should acknowledge the possibility of a simple SDT model and what if anything hangs on using these different modelling frameworks.

We feel that a brief mention of this possibility will benefit the community when considering how to leverage your interesting work in future studies.

The subjects do not seem to attempt to roughly keep p(yes) the same across uncertainties since the number of yes answers increases significantly across our visual noise conditions as mentioned in the Results section (page 8: “regardless of asynchrony, the participants perceived the illusion more often when the level of visual noise increased (F(2, 28) = 22.35, p < .001; Holmes’ post hoc test: noise level 0 versus noise level 30: p = .018, davg = 0.4; noise level 30 versus noise level 50: p = .005, davg = 0.5; noise level 0 versus noise level 50: p < .001, davg = 1, Figure 1B)”). Nonetheless, the risk of a learned threshold for each uncertainty is worth considering. We believe this is unlikely because we used multiple interleaved levels of noise while withholding any form of experimental feedback. We added this information to our discussion (page 18, lines 513-518).

“While we have argued that people take into account trial-to-trial uncertainty when making their body ownership and synchrony judgments, it is also possible that they learn a criterion at each noise level (Ma and Jazayeri, 2014), as one might predict in standard signal detection theory. We believe this is unlikely because we used multiple interleaved levels of noise while withholding any form of experimental feedback.”
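
The contrast between a single fixed criterion and a criterion that grows with sensory uncertainty can be sketched as follows. This is an illustration under stated assumptions, not the fitted models: the observer says "yes" when the absolute measured asynchrony falls below a criterion k, the measurement noise is Gaussian, and the asynchrony set, noise levels, fixed criterion (250 ms), sigma_s (500 ms), and p_same (0.8) are all hypothetical values. Note that an uncertainty-dependent criterion could in principle be computed trial by trial or learned separately at each noise level; the sketch only shows what a single fixed criterion predicts.

```python
import math

# Hypothetical stimulus set and noise levels (ms); not taken from the fitted models.
ASYNCHRONIES = [-500.0, -300.0, -150.0, 0.0, 150.0, 300.0, 500.0]
NOISE_SDS = [100.0, 200.0, 400.0]
FIXED_K = 250.0


def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def p_yes(s, sigma, k):
    """Probability that a noisy measurement of asynchrony s falls within [-k, k]."""
    return phi((k - s) / sigma) - phi((-k - s) / sigma)


def uncertainty_dependent_criterion(sigma, sigma_s=500.0, p_same=0.8):
    """Criterion at which the posterior probability of a common cause equals 0.5."""
    var_c2 = sigma ** 2 + sigma_s ** 2
    k_sq = (sigma ** 2 * var_c2 / sigma_s ** 2) * (
        2.0 * math.log(p_same / (1.0 - p_same)) + math.log(var_c2 / sigma ** 2)
    )
    return math.sqrt(max(k_sq, 0.0))


def mean_yes_rate(sigma, k):
    """Average p(yes) across the hypothetical asynchrony set."""
    return sum(p_yes(s, sigma, k) for s in ASYNCHRONIES) / len(ASYNCHRONIES)


fixed_rates = [mean_yes_rate(sd, FIXED_K) for sd in NOISE_SDS]
adaptive_rates = [
    mean_yes_rate(sd, uncertainty_dependent_criterion(sd)) for sd in NOISE_SDS
]

for sd, fixed, adaptive in zip(NOISE_SDS, fixed_rates, adaptive_rates):
    print(f"sigma={sd:6.1f}  fixed criterion={fixed:.3f}  adaptive criterion={adaptive:.3f}")
```

With these illustrative values, the fixed criterion yields fewer "yes" responses as noise increases, whereas the uncertainty-dependent criterion yields more, which is the direction reported in the Results.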

(6) A few proofing notes that have been picked up by Reviewer 3 (these are not comprehensive, so please read over the manuscript again more carefully):

1. Main points 1 and 3: The changes in response to these points as indicated in the response to reviewers are not exactly incorporated in the main manuscript file. Could you please correct?

Correction done.

2. Main point 4: in the main manuscript file there is an unnecessary '#' symbol at the end of the equation, please remove.

Correction done.

3. Main point 7: the title for figure 2 in the updated manuscript does not match the title indicated in the response to reviewers. I think the latter would be a better choice.

Correction done.

4. Supplements for figures 2, 3, 4: It seems that after re-numbering these figures, the figure legends for their supplement figures have not been updated and they still show the original numbering. Could you please update?

Correction done.

https://doi.org/10.7554/eLife.77221.sa2

Article and author information

Author details

  1. Marie Chancel

    Department of Neuroscience, Karolinska Institutet, Stockholm, Sweden
    Contribution
    Conceptualization, Data curation, Formal analysis, Investigation, Visualization, Methodology, Writing – original draft, Project administration, Writing – review and editing
    For correspondence
    marie.chancel@ki.se
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-3052-5268
  2. H Henrik Ehrsson

    Department of Neuroscience, Karolinska Institutet, Stockholm, Sweden
    Contribution
    Conceptualization, Resources, Supervision, Funding acquisition, Validation, Writing – original draft, Writing – review and editing
    Contributed equally with
    Wei Ji Ma
    Competing interests
    No competing interests declared
    ORCID: 0000-0003-2333-345X
  3. Wei Ji Ma

    Center for Neural Science and Department of Psychology, New York University, New York, United States
    Contribution
    Conceptualization, Supervision, Validation, Methodology, Writing – original draft, Writing – review and editing
    Contributed equally with
    H Henrik Ehrsson
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-9835-9083

Funding

European Research Council (787386)

  • H Henrik Ehrsson

Wenner-Gren Foundation

  • Marie Chancel

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We would like to thank Martti Mercurio for his help in building the robots and writing the program to control them. We also thank Pius Kern and Birgit Hasenack for their help with data acquisition during the pilot phase of this study. We thank the reviewers for their constructive feedback during the reviewing process that helped us improve the article.

This research was funded by the Swedish Research Council, the Göran Gustafssons Foundation, and the European Research Council under the European Union’s Horizon 2020 research and innovation programme (grant 787386 SELF-UNITY). MC was funded by a postdoctoral grant from the Wenner-Gren Foundation.

Ethics

Human subjects: All volunteers provided written informed consent prior to their participation. All experiments were approved by the Swedish Ethics Review Authority (Ethics number 2018/471-31/2).

Senior Editor

  1. Tamar R Makin, University of Cambridge, United Kingdom

Reviewing Editor

  1. Virginie van Wassenhove, CEA, DRF/I2BM, NeuroSpin; INSERM, U992, Cognitive Neuroimaging Unit, France

Reviewers

  1. Zoltan Dienes, University of Sussex, United Kingdom
  2. Liping Wang, Chinese Academy of Sciences, China
  3. Mate Aller, MRC Cognition and Brain Sciences Unit, United Kingdom

Publication history

  1. Preprint posted: June 1, 2021
  2. Received: January 20, 2022
  3. Accepted: September 27, 2022
  4. Accepted Manuscript published: September 27, 2022 (version 1)
  5. Version of Record published: October 12, 2022 (version 2)
  6. Version of Record updated: October 25, 2022 (version 3)

Copyright

© 2022, Chancel et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

  1. Marie Chancel
  2. H Henrik Ehrsson
  3. Wei Ji Ma
(2022)
Uncertainty-based inference of a common cause for body ownership
eLife 11:e77221.
https://doi.org/10.7554/eLife.77221
