Introduction

A substantial amount of research from medicine, neuroscience, psychology, and education aims to establish the effectiveness of different treatments, such as drugs, cognitive training, biofeedback, and neurostimulation, in both clinical and non-clinical populations. However, the research findings from these fields tend to be heterogeneous. As a result, there has been increased scepticism among researchers about the efficacy of these treatments (Lampit et al., 2014; López-Alonso et al., 2014; Sitaram et al., 2017).

In recent years, neuromodulation has been studied as one of the most promising treatment methods. Reflecting this interest, the global neuromodulation device industry is expected to grow to $13.3 billion in 2022 (Colangelo, 2020). Further, one particular form of neuromodulation, transcranial magnetic stimulation (TMS), has been approved by regulatory bodies in multiple countries, including the US Food and Drug Administration (FDA), and is used as an evidence-based treatment for patients with migraine, major depression, obsessive-compulsive disorder and smoking addiction. Moreover, TMS and other neuromodulatory techniques, such as transcranial focused ultrasound and transcranial electrical stimulation (tES), have been highlighted as potential treatments for psychiatric, neurological, and neurodevelopmental disorders (Grover et al., 2021; Khedr et al., 2005; McGough et al., 2019), and they have also been used to enhance various mental processes, including attention, memory, language, mathematics and intelligence, in healthy populations (Santarnecchi et al., 2015). These encouraging findings have raised hope for the potential application of these techniques within and outside the clinic (Dubljević et al., 2014).

Despite some encouraging results on the beneficial effects of both TMS and tES, contradictory findings have emerged across different studies (Horvath et al., 2015; Medina & Cason, 2017; Parkin et al., 2015; Wang et al., 2018; Westwood et al., 2017). Several factors have been proposed as plausible reasons for the heterogeneity in research results (Filmer et al., 2020; Guerra et al., 2020; van Bueren et al., 2021). However, a crucial factor that researchers have largely overlooked is the extent to which subjective beliefs can explain variability in treatment efficacy. Here, we address this gap by examining whether modelling participants’ beliefs about receiving the placebo or active treatment can account for changes in clinical, cognitive and behavioural outcomes.

Participants who take part in TMS and tES studies consistently report various perceptual sensations, such as audible clicks, visual disturbances, and cutaneous sensations (Davis et al., 2013). As a result of these bodily sensations, they can become aware of having received the active treatment. This, in turn, makes it more likely that subjective beliefs and demand characteristics about the aim of the intervention might influence performance (Polanía et al., 2018). To account for the emergence of such non-specific effects, sham (placebo) protocols have been developed and utilised. During sham stimulation, a negligible amount of stimulation or none at all is administered to the subject while the interventional procedure is otherwise kept identical. For example, in transcranial direct current stimulation (tDCS), which is the most common form of tES, the stimulation intensity is slowly ramped up and, after a brief period of actual stimulation (e.g., 30 seconds), promptly ramped down, unlike active stimulation, which typically lasts up to 20 minutes. Another example is sham TMS, in which the TMS coil is tilted so that only an edge remains in contact with the head. As a result, a sham TMS pulse produces a clicking sound similar to active TMS, along with some somatosensory effects and even peripheral nerve stimulation, without stimulating the brain area of interest (Duecker & Sack, 2015). Overall, these types of sham stimulation aim to mimic the perceptual sensations associated with active stimulation without substantially affecting cortical excitability (Fritsch et al., 2010; Nitsche & Paulus, 2000). As a result, sham treatments should allow controlling for participants’ specific beliefs about the type of stimulation received.

Previous studies have addressed whether manipulating participants’ expectations about the effects of either active or sham stimulation can moderate treatment efficacy (Braga et al., 2021; Rabipour et al., 2018). However, to our knowledge, these studies have not examined whether individual differences in participants’ subjective experience of receiving active or sham treatment provide a better model fit than the condition to which participants are assigned in the intervention. We term the former subjective treatment and the latter objective treatment.

The above consideration becomes particularly crucial given that the experimental design of most randomised controlled trials (RCTs) involves recording whether participants believed they received the active or placebo treatment. While it is common practice to assess experimental blinding using these data, the explanatory power of individual differences in subjective treatment is rarely, if at all, considered. This practice rests on the assumption that if no differences emerge at the group level in participants’ guesses about receiving the active vs the placebo treatment (i.e., if experimental blinding was successful), placebo effects cannot explain the obtained results.

Here, we hypothesise that such an assumption can be erroneous and aim to explore how accounting for differences in subjective beliefs can shed light on the conclusions of previous treatment studies. Moreover, we introduce a simple and straightforward approach that could be used to analyse existing data and guide future clinical and fundamental research to examine whether subjective treatment explains variability in experimental outcomes over and above objective treatment. Below, we demonstrate this approach by reanalysing four independently published neurostimulation studies (including TMS and tES) that test clinical and non-clinical samples from different age groups (Blumberger et al., 2016; Filmer et al., 2019; Kaster et al., 2018; Leffa et al., 2022). The data and the codebook of the analyses are available on the Open Science Framework (https://osf.io/rztxu/?view_only=879a6dbccd624d87a96cb9d839dd833c).

Results

Study 1

Repetitive TMS (rTMS) is a method for treating depression that has been approved by the FDA (Connolly et al., 2012). In Study 1 (Blumberger et al., 2016), patients aged 18-85 years with treatment-resistant depression (N=121) were randomised to receive either bilateral rTMS, unilateral left-rTMS or sham rTMS for 3 or 6 weeks (objective treatment). We examined whether participants’ beliefs about receiving active or sham stimulation (subjective treatment) explained changes in depression over time. In this study, subjective treatment was based on participants’ reports of whether they thought they received active or sham rTMS, inquired at the end of treatment (week 6).

A linear mixed model with depression scores (HAMD-17) as the outcome was fitted to the data for weeks 1–6. The baseline model included time (week 1/week 3/week 6) as a main effect, as well as the interaction of time and objective treatment. We first added subjective treatment to this model as a main effect. Next, we extended the model to include the two-way interaction of time and subjective treatment. Lastly, we considered a model with the three-way interaction of time, subjective treatment and objective treatment. Our results showed that adding the two-way interaction between subjective treatment and time led to a significantly better model fit (BIC=2027.48, AIC=1985.62, P<.001; see Supplementary Table S1). Hence, our analysis suggests that participants’ subjective experience of the treatment accounted for variability in depression scores over time, while the actual treatment condition to which participants were assigned did not. As shown in Figure 1, participants who thought they received active stimulation showed a steeper decrease in depression over time than participants who thought they received sham. The interaction of subjective treatment and time was significant at week three and week six. We used contrasts to break down this interaction and compare depression scores at weeks three and six to depression at baseline (week zero) between participants who reported active vs sham as subjective treatment. Our results showed that depression scores were lower for participants who thought they were receiving active compared to sham stimulation at both week three (b=-3.14, t(323)=3.46, P=.000613) and week six (b=-6.67, t(323)=6.88, P<.001).
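The nested comparisons above are summarised with AIC and BIC, where lower values indicate a better trade-off between fit and complexity. The following is a minimal sketch of how the two criteria are computed from a model's maximised log-likelihood. The function names, log-likelihoods, parameter counts and sample size are hypothetical placeholders, not values from the reanalysed data, and the choice of sample size for BIC in mixed models is itself a modelling assumption.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2*lnL."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k*ln(n) - 2*lnL."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical fits: a baseline model and one that adds subjective treatment * time.
models = {
    "baseline": {"loglik": -1010.0, "k": 8},
    "plus_subjective_x_time": {"loglik": -985.0, "k": 11},
}
n_obs = 340  # hypothetical number of observations

for name, fit in models.items():
    print(name,
          "AIC =", round(aic(fit["loglik"], fit["k"]), 2),
          "BIC =", round(bic(fit["loglik"], fit["k"], n_obs), 2))
```

BIC penalises each extra parameter by ln(n) rather than 2, so it favours the simpler model more strongly than AIC once n exceeds about 7.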

Depression scores as a function of subjective treatment over time.

Each diamond represents the mean depression score (HAMD-17) for the time points (baseline, week 3, week 6), and each line in the background represents a patient. Error bars represent ± 1 standard error of the mean.

We next examined whether variability in depression scores was explained by both objective and subjective treatment. To this end, we ran a model comparison that added first objective treatment and then the interaction of objective treatment with time to a baseline model already including the interaction of subjective treatment and time. Our results showed that the inclusion of neither objective treatment nor the objective treatment*time interaction led to a better model fit (see Supplementary Table S2). Therefore, a statistical model that includes the participants’ subjective experience of receiving the real or sham treatment at baseline fits the observed data better than a statistical model that only includes the actual treatment allocation.

We also investigated whether subjective treatment could explain variability in participants’ response rates and remission rates. In the study, response was defined as a >50% reduction in depressive symptomatology and was binary coded. A mixed binomial model with HAMD-17 response as the outcome was fitted to the data. The baseline model included only objective treatment as a predictor and was compared to an updated model that added subjective treatment as a main effect. Given that response rates were measured only once, time did not vary and was therefore omitted from the model. We then compared the model with subjective treatment as a main effect to a model including the subjective treatment*objective treatment interaction. Our results showed that the model with subjective treatment as a main effect led to a significantly better fit (deviance=14.78, P=.0001; see Supplementary Table S3). As shown in Figure 2, response rates were higher for participants who reported thinking they received the active compared to the sham treatment (log(OR)=1.28, z=3.21, SE=0.40, P=.002). In contrast, adding objective treatment to a model already including subjective treatment did not lead to a better fit (deviance=1.40, P=.496; see Supplementary Table S4). Therefore, treatment allocation did not explain changes in patients’ depression when subjective beliefs were already accounted for in the model.
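The deviance comparisons above can be read as likelihood-ratio tests, in which the drop in deviance between nested models is referred to a chi-square distribution. The sketch below illustrates this under the assumption (not stated in the text) that subjective treatment contributes 1 degree of freedom and the three-level objective treatment contributes 2. The helper `chi2_sf` is a hypothetical name and covers only those two cases, for which the upper tail has a closed form.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable (df = 1 or 2 only)."""
    if x < 0:
        raise ValueError("the test statistic must be non-negative")
    if df == 1:
        # For df=1 the chi-square tail equals the two-sided normal tail.
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        # For df=2 the chi-square distribution is exponential with mean 2.
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only supports df=1 or df=2")


# Deviance differences reported for the HAMD-17 response models in Study 1;
# the degrees of freedom are assumptions made for this illustration.
p_adding_subjective = chi2_sf(14.78, df=1)
p_adding_objective = chi2_sf(1.40, df=2)

print(f"adding subjective treatment: P = {p_adding_subjective:.4f}")
print(f"adding objective treatment:  P = {p_adding_objective:.3f}")
```

Under these assumed degrees of freedom, the resulting probabilities are close to the reported P values, which is consistent with a likelihood-ratio reading of the deviance statistics.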

Depression response rates as a function of subjective treatment.

The left plot presents the contribution of subjective treatment on the response rate of the HAMD-17, and the right plot presents the contribution of subjective treatment on the BDI-II. Each dot represents an individual patient, stacked toward 100% representing a response or 0% representing no response. Error bars represent ± 1 standard error of the mean.

The same pattern of results was replicated for response rates calculated from another depression scale (the BDI-II), where adding subjective treatment as a main effect led to a significantly better model fit (deviance=10.81, P=.001; see Supplementary Table S5), and participants reporting the active subjective treatment showed higher response rates (b=1.85, z=3.06, SE=0.60, P=.002; Figure 2). In contrast, adding objective treatment to a model already including subjective treatment did not lead to a better fit (deviance=0.27, P=.873; see Supplementary Table S6).
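Coefficients from binomial models are on the log-odds scale, so exponentiating them gives odds ratios that are easier to interpret. The sketch below converts the Study 1 estimates into odds ratios with Wald 95% intervals. The intervals are not reported in the text and are computed here only as an illustration. Treating the BDI-II coefficient b as a log odds ratio is an assumption, and `odds_ratio_with_ci` is a hypothetical helper name.

```python
import math


def odds_ratio_with_ci(log_or, se, z_crit=1.96):
    """Return (odds ratio, lower bound, upper bound) of a Wald interval."""
    lower = math.exp(log_or - z_crit * se)
    upper = math.exp(log_or + z_crit * se)
    return math.exp(log_or), lower, upper


# Estimates and standard errors as reported for subjective treatment in Study 1.
estimates = {
    "HAMD-17 response": (1.28, 0.40),
    "BDI-II response (assumed log-odds scale)": (1.85, 0.60),
}

for label, (log_or, se) in estimates.items():
    odds_ratio, lower, upper = odds_ratio_with_ci(log_or, se)
    print(f"{label}: OR = {odds_ratio:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

An odds ratio above 1 with an interval excluding 1 corresponds to higher odds of response among participants who believed they had received active stimulation.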

Additionally, in the study, participants were classified as either remitters or non-remitters based on blinded clinical ratings at the end of weeks three and six, with remission defined as a HAMD-17 score less than or equal to 7. We conducted a survival analysis to examine whether subjective treatment explained variability in remission rates. For objective treatment, the survival curves did not significantly differ between the active and sham conditions (Gehan-Breslow-Wilcoxon test(1)=3.72, P=.053), indicating that remission rates did not differ between patients who received active rTMS and patients who received sham. In contrast, for subjective treatment, a significant difference emerged (Gehan-Breslow-Wilcoxon test(1)=18.16, P<.001). Specifically, patients who believed they had received active stimulation showed higher remission rates than patients who believed they had received sham (Gehan-Breslow-Wilcoxon test(1)=5.12, P=.020).
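The Gehan-Breslow-Wilcoxon test is a weighted log-rank test in which each event time is weighted by the number of patients still at risk, so early differences between the curves count more. The following is a minimal two-group sketch of that statistic, not the code used for the reported analysis. The function name and the toy remission data are hypothetical.

```python
import math


def gehan_breslow_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test weighted by the number at risk.

    times_*: week of remission or censoring; events_*: 1 if remission observed.
    Returns (chi-square statistic on 1 df, p-value).
    """
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})
    numerator = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        if n < 2:
            continue
        expected_a = d * n_a / n
        var_t = d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
        numerator += n * (d_a - expected_a)  # weight = number at risk
        variance += n * n * var_t
    if variance == 0:
        return 0.0, 1.0
    stat = numerator ** 2 / variance
    return stat, math.erfc(math.sqrt(stat / 2.0))


# Hypothetical toy data: remission assessed at weeks 3 and 6.
believed_active = ([3, 3, 3, 6, 6, 6, 6, 6], [1, 1, 1, 1, 1, 0, 0, 0])
believed_sham = ([3, 6, 6, 6, 6, 6, 6, 6], [1, 1, 0, 0, 0, 0, 0, 0])
stat, p_value = gehan_breslow_test(*believed_active, *believed_sham)
print(f"chi-square(1) = {stat:.2f}, P = {p_value:.3f}")
```

In this framing the "event" is remission, so a curve that drops faster corresponds to a group that remits sooner.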

Study 2

In Study 2 (Kaster et al., 2018), 52 participants aged between 60 and 85 years diagnosed with late-life depression were randomised to active or sham high-dose deep rTMS. Compared to standard rTMS, deep rTMS with the H1 coil has been designed to stimulate deeper and larger areas of the cortex (primarily the left DLPFC and portions of the right DLPFC). We examined whether, also in this case, subjective treatment could account for the changes in participants’ depression scores, despite the use of a different sample and TMS technique. Notably, in contrast to the other studies, participants were asked to report whether they thought they received the active or sham treatment after the first week of treatment rather than at the end of the study (the fourth week). This timing reduces the risk that subjective treatment, when assessed only at the end of the study, is biased by the clinical change the patient experienced during the intervention.

A linear mixed model with HDRS-24 score as the outcome was fitted to the data for weeks 1–4. As in Study 1, we first compared the baseline model, including time and its interaction with the objective treatment, to a model including the interaction of subjective treatment*time. The latter model was then compared to a three-way interaction model with subjective treatment*objective treatment*time. Our results showed that the three-way interaction model led to a significantly better fit (AIC=1601.91, BIC=1668.74, P=.010; see Supplementary Table S7 & Table S8). Hence, participants’ beliefs explained variability in depression scores over time in relation to the experimental allocation. As follow-up contrasts for the three-way interaction, we investigated the differences between the objective and subjective treatment at each week compared to the baseline (see also Figure 3). The analysis showed a steeper decrease in depression from baseline to week 3 (b=8.79, t(102.57)=2.01, SE=4.37, P=.045) and from baseline to week 4 (b=9.19, t(103.76)=2.09, SE=4.39, P=.038). In both cases, the scores for active objective treatment and active subjective treatment were higher than for the sham treatments. Another way to explore the three-way interaction is to examine the polynomial contrasts of the weeks variable between the objective and subjective treatment conditions. The analysis showed that the objective and subjective treatments differed in the linear contrast of time (b=28.03, t(182.29)=3.12, P=.002). The contrast showed a negative slope across the weeks that differed significantly between the objective treatment levels under subjective sham treatment (b=-19.63, t(181.05)=-2.77, P=.001), but not under subjective active treatment (b=8.40, t(184.33)=1.52, P=.128). Thus, the results show that the steepest change in depression occurred among those who received the active treatment but believed they received the sham treatment (compared to those who believed they received the active treatment).

Subjective sham treatment drives the difference between objective treatments in depression scores.

Three-way interaction between subjective treatment, objective treatment, and time showing the reduction of depression scores over time in the objective treatment group is accounted for by subjective treatment. The left plot shows subjective sham treatment, and the right plot shows subjective active treatment. Each line in the background represents a patient. Error bars represent ± 1 standard error of the mean.

We further investigated whether subjective treatment could provide a better model fit for the patients’ remission and response rates than objective treatment. To this aim, we fitted two mixed binomial models with remission and response rates as the outcomes. Time was not considered in this case because both remission and response rates were collected only once at the end of the fourth week. In line with our previous results, we found that the interaction of subjective treatment*objective treatment was significantly better at predicting remission rates (deviance=4.47, P=.035; see Supplementary Table S9 & Table S10) and response rates (deviance=49.80, P=.004; see Supplementary Table S11 & Table S12).

For remission rates, we found a significant two-way interaction between objective treatment and subjective treatment (log(OR)=-0.81, z=-1.99, SE=0.41, P=.047; Figure 4). Remission rates did not differ significantly between active and sham rTMS as the objective treatment when participants thought they received the active stimulation (b=-1.01, z=-.82, SE=1.23, P=.410), but they were higher for active than for sham rTMS when participants thought they received sham (b=2.69, SE=1.29, z=2.08, P=.038). These results were replicated when we considered response rates as the outcome (Figure 4), for which we found a significant two-way interaction (log(OR)=-1.02, z=-2.58, SE=0.4, P=.010). Again, for participants who thought they received the active stimulation, response rates did not differ significantly between active and sham rTMS as the objective treatment (b=-1.7, z=-1.44, SE=1.18, P=.150). In contrast, when participants thought they received sham stimulation, they showed higher response rates in the active compared to sham rTMS as the objective treatment (b=2.38, SE=1.05, z=2.26, P=.020).

Remission and response rates as a function of subjective and objective treatment.

The left column presents the contribution of objective active treatment, and the right column the contribution of objective sham treatment. Each dot represents an individual patient and is stacked toward 100% representing a response or 0% representing no response. Error bars represent ± 1 standard error of the mean.

Study 3

In Study 3, the researchers examined the effect of home-based tDCS treatment used for four weeks on a clinical group of adults diagnosed with ADHD (Leffa et al., 2022; N=64). The primary outcome measure was symptoms of inattention taken from a clinician-administered questionnaire (Adult ADHD Self-report Scale; CASRS-I). Data on participants’ beliefs reflecting subjective treatment were collected at the end of the experiment. In line with the studies above, we first tested whether adding subjective treatment to the model accounting for objective treatment improved the fit between the baseline and the last assessments. Including subjective treatment led to a better model fit (AIC=593.80, BIC=609.78, P<.001; see Supplementary Table S13). Subsequent contrasts revealed that inattention scores for participants who believed they were getting the active treatment were significantly lower than for those who believed they belonged to the sham group (b=-3.3, t(100)=-3.35, SE=0.99, P<.001; see Figure 5). This finding provides further evidence supporting the contribution of subjective treatment over objective treatment, extending our previous results to another mental health condition, population and tES method.

The contribution of subjective and objective treatment on symptoms of inattention taken from a clinician-administered questionnaire.

The left plot shows the contribution of subjective treatment, and the right plot shows the contribution of objective sham treatment. Each dot represents an individual patient. Error bars represent ± 1 standard error of the mean.

Next, we investigated whether adding objective treatment could explain further variability in a model already including subjective treatment. Unlike in Studies 1 and 2, where this addition was not significant, here the addition of objective treatment was significant (AIC=593.80, BIC=609.78, P<.001; see Supplementary Table S14). As expected, the contrast showed lower inattention symptoms in the objective active treatment group (b=-4.17, t(100)=-4.21, SE=0.99, P<.001). Thus, subjective treatment did not wholly override the contribution of objective treatment to research outcomes. As discussed below, this finding demonstrates the varied explanatory power that subjective treatment can have in relation to various types of tES treatments.

Study 4

In Study 4, we extended our results beyond clinical populations by examining the effects of different doses (current intensity) of tDCS on mind wandering in healthy participants (N=150). Similar to Studies 1 and 3, participants were asked about subjective beliefs at the end of the experiment. For this study, we tested whether not only subjective treatment but also subjective dosage (participants’ beliefs about the strength of the stimulation they received) could explain variability in the results attributed originally to objective treatment. A linear regression model with average mind wandering scores calculated over the whole experimental session as the outcome was fitted to the data. In line with Studies 1-3, subjective treatment contributed to a significantly better model fit. Specifically, participants’ beliefs explained variability in mind wandering when subjective treatment was included as a main effect on top of objective treatment (AIC=284.72, BIC=305.80, P=.045; see Supplementary Table S15). Furthermore, as shown in Figure 6, participants who believed they received active treatment showed higher mind wandering levels than those who believed they received sham treatment (b=-0.21, SE=0.1, t(140)=-2, P=.044).

Mind wandering scores as a function of subjective treatment and subjective dosage.

Each dot represents a participant. Error bars represent ± 1 standard error of the mean.

The experimental design in this study also allowed us to expand our previous findings by examining the contribution of subjective dosage to a model including objective treatment (AIC=282.90, BIC=310, P=.025; see Supplementary Table S17). In this regard, we found that mind wandering increased for people reporting weak (b=.31, SE=0.14, t(142)=2.23, P=.003), moderate (b=.27, SE=0.12, t(142)=2.24, P=.003) and strong (b=.47, SE=0.19, t(142)=2.41, P=.017) subjective dosage compared to none. These results indicate that mind wandering increased proportionally as the subjective dosage increased (from none to strong). Conversely, our results showed that participants’ objective treatment did not lead to a better model fit either when added on top of subjective treatment (AIC=284.72, BIC=305.8, P=.093; see Supplementary Table S16) or when added on top of subjective dosage (AIC=282.9, BIC=310.0, P=.106; see Supplementary Table S18). These findings highlight that participants’ beliefs regarding the type of treatment received and their subjective experience of the treatment dosage can explain variability in cognitive performance.
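The subjective dosage coefficients are each expressed relative to the "none" level, which is how a dummy-coded (treatment-coded) factor behaves in a regression. As a simplified illustration, in a model with a single dummy-coded factor and no other predictors, each coefficient equals the difference between that level's mean and the reference mean. The reported model also includes objective treatment, so its coefficients are adjusted and will not reduce to raw mean differences. The function name and the scores below are hypothetical.

```python
from statistics import mean


def contrasts_vs_reference(scores_by_level, reference="none"):
    """Mean difference of each level from the reference level.

    Equivalent to the slopes of a one-factor, dummy-coded linear regression.
    """
    reference_mean = mean(scores_by_level[reference])
    return {
        level: mean(values) - reference_mean
        for level, values in scores_by_level.items()
        if level != reference
    }


# Hypothetical mind wandering scores grouped by reported subjective dosage.
scores = {
    "none": [1.8, 2.0, 2.2, 2.0],
    "weak": [2.2, 2.4, 2.3, 2.5],
    "moderate": [2.1, 2.3, 2.4, 2.2],
    "strong": [2.5, 2.4, 2.6, 2.5],
}

for level, difference in contrasts_vs_reference(scores).items():
    print(f"{level} vs none: b = {difference:.2f}")
```

A positive value means higher mind wandering at that subjective dosage level than among participants who reported feeling no stimulation.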

Discussion

In this work, we used a novel approach to examine whether and to what extent participants’ subjective beliefs may account for variability in research outcomes. To this aim, we analysed four independent datasets from the field of neurostimulation; specifically, two rTMS RCTs in patients with depression (Blumberger et al., 2016; Kaster et al., 2018), one tDCS study in adults with ADHD (Leffa et al., 2022), and another tDCS study in a healthy adult sample (Filmer et al., 2019).

We demonstrate that participants’ subjective beliefs about receiving the active vs control (sham) treatment are an important factor that explains variability in the primary outcome and, in some cases, fit the observed data better than the actual treatment participants received during the experiment. Specifically, in Studies 1, 2 and 4, whether participants believed they were in the active or control condition explained variability in clinical and cognitive scores to a greater extent than the objective treatment alone. Notably, the same pattern of results emerged when we replaced subjective treatment with subjective dosage in the fourth experiment, showing that subjective beliefs about treatment intensity also explained variability in research results better than objective treatment.

An important question arising from our findings relates to the causal role of subjective beliefs. This question is a complex one to answer and falls outside the scope of this study. Because the goal is usually to test blinding efficacy, it is standard practice for current treatment studies to record data on subjective beliefs only at the end of the experiment rather than before and throughout. This was the case in Studies 1, 3, and 4. While in Study 3 both objective and subjective treatment explained unique variance in the clinical outcome, in Studies 1 and 4 it was impossible to conclude whether participants’ beliefs, which were captured by the subjective treatment, affected experimental outcomes or, on the contrary, whether participants’ changes in performance or symptoms throughout the studies influenced beliefs regarding treatment allocation. Therefore, it is important to consider that subjective treatment could also capture the changes experienced by participants in placebo-controlled trials, whereby, if they feel better, it would be psychologically hard to report that they believe the improvement was due to the placebo.

Study 2 used a less common approach, in which subjective beliefs about treatment allocation were queried after the first week. Notably, in this study, significant results emerged only two weeks after this inquiry (in weeks 3 and 4). Given that subjective beliefs about treatment allocation were documented before the emergence of changes in any clinical outcome, we suggest that participants’ beliefs may have affected experimental outcomes rather than vice versa.

Based on the above considerations, future studies should strive to record data on subjective beliefs at different time points: before, during and after the experiment. This will allow mapping how subjective beliefs might be differentially associated with experimental results depending on the study and treatment type considered. However, we acknowledge two caveats of this suggestion. Firstly, participants may become more prone to pay attention to their treatment allocation and, consequently, to figure out their assigned condition. Secondly, recording subjective beliefs at multiple time points might interfere with the effects of the treatment. For instance, patients might suppress their response for fear that the treatment received is a placebo (Sonawalla & Rosenbaum, 2002). An alternative approach could entail deception, whereby all participants are told they received the active treatment. While this raises an ethical concern, such an approach would 1) allow minimising the effect of subjective beliefs on research outcomes and 2) hold more ecological validity, as it would mimic the way approved treatments are delivered in the clinic, where all patients know they are receiving the active treatment (Burke et al., 2019).

While this study focuses only on neuromodulation techniques, we want to highlight that the proposed approach can be applied to other forms of treatment (e.g., pharmacological studies, cognitive training) tested as part of standard experiments or RCTs. It is worth noting that the contribution of subjective beliefs to experimental results might be even greater for interventions carried out in seemingly cutting-edge research settings, such as experiments involving virtual reality, neurofeedback paradigms, and other types of brain-computer interfaces. In such cases, participants might be more susceptible to forming specific expectations about treatment effects (Burke et al., 2019; Thibault et al., 2017). Therefore, the explanatory power of subjective beliefs could be intensified compared to more traditional forms of treatment, such as pharmacology.

One question emerging from this study is whether these results, observed with self-report measures, would apply to more objective behavioural outcomes (e.g., sensorimotor recovery in stroke patients, improvements in fluid intelligence in participants with learning difficulties) and neural functions (e.g., functional connectivity). We argue that the answer to this question is likely to be positive since placebo effects have been shown to impact not only behaviour but also brain activity (Burke et al., 2019; Hashmi, 2018; Oken et al., 2008; Schmidt et al., 2014). However, independent of this possibility, the contribution of subjective treatment to explaining variability in self-reported outcomes should not be underestimated. Notably, in most RCTs investigating the effect of different treatments on clinical and subclinical groups (e.g., depression, chronic pain, eating disorders, attention-deficit/hyperactivity disorder), some of which have also been approved by the FDA, the measurement of symptomatology is mostly based on self-reported outcomes, such as questionnaires. This consideration makes the case for subjective treatment even stronger, hinting at the potential role of this factor in explaining experimental results across a variety of experimental outcomes and treatment types.

While our study examined the explanatory power of subjective beliefs about receiving treatment, none of the four studies (similar to most studies in the field) collected data on participants’ expectations. Indeed, as has recently been shown by Parong et al. (2022), expectations regarding the effectiveness of cognitive training (i.e., whether it will increase or decrease performance) can significantly modulate the effect of training. Thus, investigating the interplay between expectations and subjective treatment could allow examining the directionality and strength of the effect of subjective treatment on the outcome of interest. For instance, some participants may expect a treatment to improve their capabilities or symptoms, whereas others may expect the opposite, and the level of these expectations can vary during the intervention. These factors could, in turn, impact individual variability in subjective treatment. Arguably, when questioned early, subjective treatment could be more related to expectations than to an actual reflection of the treatment benefits. This variation may explain the findings in Study 2 (improvement in depression for subjective sham treatment) compared to Study 1 (decrements in depression for subjective active treatment), where only in the former were subjects questioned during the procedure (week 1) and not at its end. This possibility is a post hoc explanation, and future experiments collecting data on participants’ behavioural, cognitive and clinical outcomes should also record subjective expectations thoroughly (Boot et al., 2013).

We want to highlight that while we present subjective treatment as an important variable with explanatory power in addition to objective treatment, these results do not imply that participants’ subjective beliefs can explain all of the variability in research outcomes (see also Hochman et al., under review, commenting on Gordon et al., 2022). This is demonstrated in Study 3, where the objective treatment significantly explained inattention symptoms even when subjective treatment was accounted for. Additionally, we present in the supplementary materials an example of a neuromodulation study in which objective treatment explained variability in treatment effects that could not be attributed to subjective treatment (Murphy et al., 2020). Based on this consideration, where researchers have data, examining variability in participants’ subjective treatment may add further insight into prior results. However, unsurprisingly, when researchers were contacted about providing data on subjective treatment, many reported that subjective beliefs, aside from side effects, had not been recorded. Indeed, even our group’s procedure in the past lacked the recording of subjective treatment (Cohen Kadosh et al., 2007, 2010; Looi et al., 2016).

Overall, our findings hold twofold importance. Firstly, we introduce two new concepts in the academic literature: subjective treatment and subjective dosage. Secondly, we cast light on the role of participants’ subjective experience in explaining the variability of results from RCTs and experiments that test the effectiveness of treatments on mental health and behaviour. Altogether, we call for future studies to systematically collect data on participants’ subjective beliefs and expectations. Studies that have collected data on subjective beliefs at the start and end of the intervention may consider examining the potential effect of such beliefs on their results. Aside from estimating the contribution of subjective beliefs about belonging to the active or control condition, we suggest that future research may consider collecting and analysing data on 1) participants’ beliefs before and at some midpoint during the experiment rather than only at the end and 2) participants’ expectations about the directionality and strength of the effect of subjective treatment on expected outcomes. This approach would be enhanced with designs that include deception, whereby all participants are told that they received the active treatment. However, such designs require careful ethical review, particularly in clinical populations. Overall, such data will allow a thorough examination of subjective beliefs, yielding more valid and replicable results to progress scientific and clinical studies to benefit human health and behaviour.

Methods

Participants and design

Study 1

One hundred twenty-one patients (77 females, age range 18-85 years) with treatment-resistant depression took part in this study, which was based on the data from Blumberger et al. (2016). Patients were randomised as part of a mixed design to receive sequential bilateral rTMS, unilateral high-frequency left (HFL)-rTMS or sham rTMS for three or six weeks, depending on treatment response. Patients were included in the study if: 1) the Structured Clinical Interview for DSM-IV (SCID) provided a DSM-IV diagnosis of MDD; 2) they were experiencing a current major depressive episode (MDE) with a score of 20 or higher on the HAMD-17; 3) they had failed to achieve a clinical response to, or did not tolerate, at least two different antidepressants from distinct classes at sufficient doses for at least six weeks; 4) they had been receiving psychotropic medications for at least four weeks before randomisation took place. Patients were excluded if: 1) a history of DSM-IV substance dependence was present in the six months before the study or a history of DSM-IV substance abuse was present in the month preceding the study; 2) the Structured Clinical Interview for DSM-IV provided a diagnosis of borderline personality disorder or antisocial personality disorder; 3) an unstable medical or neurologic illness or a history of seizures was present; 4) they were suicidal; 5) they were pregnant; 6) they had metal implants in the skull; 7) they had a cardiac pacemaker; 8) they had an implanted defibrillator or a medication pump; 9) they presented a diagnosis of dementia or a current Mini-Mental State Examination (MMSE) score of less than 24; 10) they were taking lorazepam or an equivalent medication during the four weeks before the study.

Study 2

Fifty-two outpatients (20 females, age range 65-80 years) with late-life depression took part in this study, which was based on the data from Blumberger et al. (2018). Patients were randomised as part of a mixed design to receive active deep rTMS or sham rTMS for four weeks. The same inclusion criteria applied as in Study 1 aside from 1) the age restriction and 2) the depression diagnosis (defined based on a score of ≥22 on the HDRS-24). Similarly, in addition to the exclusion criteria outlined in Study 1, patients were excluded if: 1) any of the following diagnoses were present: bipolar I or II disorder, primary psychotic disorder, psychotic symptoms in the current episode, or a primary diagnosis of obsessive-compulsive, post-traumatic stress, anxiety, or personality disorder; 2) a dementia diagnosis was present, based on a Mini-Mental State Examination (MMSE) score of <26; 3) rTMS contraindications were present (such as a history of seizures or an intracranial implant); 4) they had a previously failed ECT trial during the current episode; 5) they had received previous rTMS treatment; 6) they were receiving bupropion >300 mg/day, due to the dose-dependent increased risk of seizures.

Study 3

Sixty-four patients (30 females, mean age 38.6 years, SD = 9.6) with ADHD (48% inattentive presentation and 52% combined presentation) took part in this study, which was based on the data from Leffa et al. (2022). Patients were randomised to receive active tDCS or sham tDCS for four weeks for a total of 28 sessions. Patients were included in the study if they: 1) met DSM-5 criteria for ADHD based on a semistructured clinical interview conducted by trained psychiatrists, 2) were either not being treated with stimulants or agreed to perform a 30-day washout from stimulants before starting the tDCS, 3) had an estimated IQ score of 80 or above (based on the Wechsler Adult Intelligence Scale, Third Edition), 4) self-reported being of European descent. Patients were excluded if they: 1) showed moderate to severe symptoms of depression based on the Beck Depression Inventory-II (BDI), 2) had a diagnosis of bipolar disorder with a manic or depressive episode or a history of noncontrolled epilepsy with seizures in the year prior to the study, 3) had a diagnosis of autism spectrum disorder, schizophrenia or psychotic disorder, 4) screened positive for substance use disorder, 5) showed an unstable medical condition with reduction of functional capacity, 6) were pregnant or willing to become pregnant in the 3 months subsequent to the beginning of the study, 7) were unable to use the home-based tDCS device for any reason, 8) had a previous history of neurosurgery or the presence of any ferromagnetic metal in the head or implanted medical devices in the head or neck region.

The outcome measure was based on the Inattentive scores in the clinician-administered version of the Adult ADHD Self-report Scale version 1.1 (CASRS-I).

Study 4

One hundred fifty healthy participants (96 females, age M=23, SD=5) took part in this study, which was based on the data from Filmer et al. (2019). All subjects were right-handed, had normal or corrected-to-normal vision and passed a safety screening procedure. Participants were tested as part of a between-subject design and were randomly assigned to one of the following five conditions: anodal 1 mA, cathodal 1 mA, 1.5 mA, 2 mA, or sham tDCS.

Materials and Procedure

Study 1

All participants received treatment five times per week over three weeks for fifteen treatments, only delivered on weekdays. After the first three weeks, participants were classified as either remitters (HAMD-17 score < 8) or non-remitters (HAMD-17 score ≥ 8) based on blinded clinical ratings. Those who achieved remission completed the study at week three, while those classified as non-remitters entered a second phase, during which they received an additional three weeks of the same treatment under double-blind conditions.
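The week-three decision rule described above can be sketched as follows. This is a minimal illustration, not the trial's own code: the function names are hypothetical, and the session count for non-remitters assumes that the five-sessions-per-week schedule continued unchanged during the additional three weeks.

```python
REMISSION_CUTOFF = 8      # remitters score below 8 on the HAMD-17
SESSIONS_PER_WEEK = 5     # weekday-only schedule stated in the text


def classify_week3(hamd17_score: int) -> str:
    """Classify a participant from the blinded week-3 HAMD-17 rating."""
    return "remitter" if hamd17_score < REMISSION_CUTOFF else "non-remitter"


def planned_sessions(hamd17_week3: int) -> int:
    """Total planned sessions: 3 weeks for remitters, 6 weeks otherwise.

    Assumption: the second phase keeps the same 5-per-week schedule.
    """
    weeks = 3 if classify_week3(hamd17_week3) == "remitter" else 6
    return weeks * SESSIONS_PER_WEEK


if __name__ == "__main__":
    for score in (7, 8, 15):
        print(score, classify_week3(score), planned_sessions(score))
```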

During the study, rTMS was administered using a Magventure RX-100 repetitive magnetic stimulator (Tonika/Magventure) and a cool B-65 figure-8 coil. To derive stimulation intensity, the motor threshold was obtained before treatment. In order to localise the stimulation site (left dorsolateral prefrontal cortex), a structural MRI was coregistered to participants’ heads using a magnetic tracking device (miniBIRD, Ascension Technology Group) for coil-to-cortex coregistration. Sham stimulation was administered in a randomised fashion, either as sham HFL-rTMS or sham bilateral rTMS with the coil angled 90° away from the skull in a single-wing tilt position, leading to some scalp sensations and sound intensity similar to that of active stimulation. Moreover, participants could not see the coil, reducing the likelihood of detecting the treatment allocation. Full details of the neuronavigation procedure and applied stimulation can be found in the supplementary material of Blumberger et al. (2016). After the final session, participants were asked whether they thought they received active or sham stimulation (presented as a binary choice).

Study 2

Participants were randomised to active rTMS or sham rTMS, administered five days per week for a total of 20 treatments over four weeks, and only delivered during weekdays. Remission by the end of week 4 was defined as both HDRS-24 ≤10 and ≥60% reduction from baseline on two consecutive weeks. Participants were withdrawn if HDRS-24 increased from baseline by >25% on two consecutive assessments, or if they developed significant suicidal ideation or attempted suicide.
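As an illustration of the remission and score-based withdrawal criteria, the sketch below applies both rules to a series of weekly HDRS-24 scores. The function names are hypothetical, and the reading of "two consecutive" as any two adjacent assessments in the series is an assumption; the original trial protocol should be consulted for the exact operationalisation.

```python
from typing import Sequence


def two_consecutive(flags: Sequence[bool]) -> bool:
    """True if any two adjacent assessments both satisfy the criterion."""
    return any(a and b for a, b in zip(flags, flags[1:]))


def is_remitter(baseline: float, weekly_scores: Sequence[float]) -> bool:
    """HDRS-24 <= 10 AND >= 60% reduction from baseline, on two consecutive weeks."""
    met = [
        score <= 10 and (baseline - score) / baseline >= 0.60
        for score in weekly_scores
    ]
    return two_consecutive(met)


def meets_score_withdrawal(baseline: float, weekly_scores: Sequence[float]) -> bool:
    """HDRS-24 more than 25% above baseline on two consecutive assessments.

    The suicidality criteria in the text are clinical judgments and are
    deliberately not modelled here.
    """
    worse = [(score - baseline) / baseline > 0.25 for score in weekly_scores]
    return two_consecutive(worse)


if __name__ == "__main__":
    print(is_remitter(30, [25, 18, 10, 9]))
    print(meets_score_withdrawal(24, [31, 32, 20, 19]))
```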

This study administered rTMS using a Brainsway deep rTMS system with the H1 coil device (Brainsway Ltd, Jerusalem, Israel). The intensity was derived using the resting motor threshold (RMT) obtained before treatment. All participants included in the analysis received rTMS with the H1 coil targeting the dorsolateral and ventrolateral prefrontal cortex bilaterally and performed at 120% of the RMT. The active rTMS group received the following standardised dose of rTMS: 18 Hz, at 120% RMT, 2 s pulse train, 20 s inter-train interval, 167 trains, for a total of 6012 pulses per session over 61 min. The sham group received treatment with the same parameters, device, and helmet. However, the active H1 coil was disabled when initiating the sham mode. A second coil (sham H1 coil) was located within the treatment helmet but activated far above the participant’s scalp. This sham H1 coil delivered a tactile and auditory sensation similar to the active H1 coil, but the electric field was insufficient to induce neuronal activation. Full details regarding the applied stimulation can be found in Blumberger et al. (2018). After the first session, participants were asked whether they thought they received active or sham stimulation (presented as a binary choice) via a short questionnaire.
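The reported dose can be checked with simple arithmetic from the stated parameters. The sketch below is only a consistency check; counting one inter-train interval per train (including after the last train) is an assumption, and dropping the final interval changes the duration by 20 s only.

```python
FREQUENCY_HZ = 18            # pulses per second within a train
TRAIN_DURATION_S = 2         # seconds of stimulation per train
INTER_TRAIN_INTERVAL_S = 20  # pause between trains
N_TRAINS = 167

pulses_per_train = FREQUENCY_HZ * TRAIN_DURATION_S
total_pulses = pulses_per_train * N_TRAINS

# Assumption: every train is followed by one inter-train interval.
session_seconds = N_TRAINS * (TRAIN_DURATION_S + INTER_TRAIN_INTERVAL_S)
session_minutes = session_seconds / 60

if __name__ == "__main__":
    print(f"pulses per train: {pulses_per_train}")
    print(f"total pulses:     {total_pulses}")
    print(f"session length:   {session_minutes:.1f} min")
```

Both values agree with the figures given in the text (6012 pulses over roughly 61 min).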

Study 3

The authors used a home-based tDCS device developed at Hospital de Clínicas de Porto Alegre for this study. The at-home tDCS device has been used in previous studies and included a user-friendly interface sensitive to impedance, such that sessions with too high impedance were automatically blocked. Furthermore, the number of the sessions, the dosage of the sessions and the stimulations were pre-programmed with a minimum interval between 2 consecutive sessions of 16 hours along with an option to abort a session (if necessary). Additionally, the capacity to save the number of sessions and time of stimulation performed by each participant was also controlled and pre-programmed. The current was delivered using 35-cm² electrodes (7 cm × 5 cm) coated with a vegetable sponge moistened with saline solution before the stimulation by two silicone cannulas coupled to the electrode. The electrodes were fixed on one of three sizes of neoprene caps that were given to each patient based on their head circumference.
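The pre-programmed scheduling constraints can be sketched as a simple lockout check. This is a hypothetical illustration rather than the device's firmware: the function name is invented, the 16-hour interval is assumed to be measured between session start times, and the 28-session cap is taken from the study design.

```python
from datetime import datetime, timedelta
from typing import Sequence

MIN_INTERVAL = timedelta(hours=16)  # minimum gap between consecutive sessions
MAX_SESSIONS = 28                   # total sessions in the four-week protocol


def session_allowed(previous_starts: Sequence[datetime], now: datetime) -> bool:
    """Return True if a new session may start at `now`.

    Assumption: the interval is counted from the start of the last session.
    """
    if len(previous_starts) >= MAX_SESSIONS:
        return False
    if not previous_starts:
        return True
    return now - max(previous_starts) >= MIN_INTERVAL


if __name__ == "__main__":
    first = datetime(2022, 1, 1, 9, 0)
    print(session_allowed([first], first + timedelta(hours=15)))
    print(session_allowed([first], first + timedelta(hours=24)))
```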

Instructions on using the device were given at the baseline assessment, when participants received the first stimulation session, assisted by trained staff. Participants were instructed to remain seated during sessions, but no other behavioural restriction was imposed. Participants underwent 30-minute daily sessions of tDCS, 2-mA direct constant current, for four weeks for a total of 28 sessions (including weekends). The anodal and cathodal electrodes were positioned over F4 and F3, corresponding to the right and left DLPFC according to the international 10-20 electroencephalography system. Devices programmed for sham treatment delivered a 30-second ramp-up (0-2 mA) stimulation followed by a 30-second ramp-down (2-0 mA) at the application’s beginning, middle, and end. This procedure was performed to mimic the tactile sensations commonly reported with tDCS. Also, each participant received a daily reminder in the form of a text message on their cell phones to improve adherence. The participants were encouraged to perform the stimulation sessions at the same time of the day.
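To make the sham waveform concrete, the sketch below returns the current at a given time within a 30-minute sham session. The ramp durations and the 2-mA peak come from the text; the exact placement of the three ramp pairs (starting at 0 s, centred on the session midpoint, and ending at 30 min) and the linear ramp shape are assumptions made for illustration only.

```python
RAMP_S = 30.0         # duration of each ramp-up and each ramp-down
PEAK_MA = 2.0         # peak current reached at the top of each ramp
SESSION_S = 30 * 60   # 30-minute session

# Assumed start times of the three ramp-up/ramp-down pairs.
SHAM_STARTS = (0.0, SESSION_S / 2 - RAMP_S, SESSION_S - 2 * RAMP_S)


def ramp_pair(t: float, start: float) -> float:
    """Current (mA) of one linear 30 s ramp-up followed by a 30 s ramp-down."""
    dt = t - start
    if 0 <= dt <= RAMP_S:
        return PEAK_MA * dt / RAMP_S
    if RAMP_S < dt <= 2 * RAMP_S:
        return PEAK_MA * (2 * RAMP_S - dt) / RAMP_S
    return 0.0


def sham_current(t: float) -> float:
    """Sham current (mA) at time t seconds; zero outside the ramp pairs."""
    return max(ramp_pair(t, start) for start in SHAM_STARTS)


if __name__ == "__main__":
    for t in (0, 15, 30, 60, 300, 900, 1770, 1800):
        print(t, sham_current(t))
```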

Study 4

The experiment was conducted on a single day and consisted of three parts. Firstly, participants were familiarised with the experimental paradigm. Secondly, participants were instructed to sit quietly with their eyes open and stimulation was applied offline to the left prefrontal cortex for 20 min. Lastly, participants performed a sustained attention task for 40 minutes, during which mind wandering, the main outcome of this study, was measured. Overall, each participant completed a single session, lasting approximately 1.5 hours.

Stimulation was delivered with a NeuroConn stimulator (neuroConn GmbH, Ilmenau, Germany). The target was placed over F3 (EEG 10–20 system), and the reference was over the right orbitofrontal region (e.g., Figure 3b). For the four groups who received active stimulation (e.g., Figure 3c), tDCS lasted 20 minutes (including 30 s ramping up and down). During stimulation, participants were asked to sit quietly and keep their eyes open. The group that received sham stimulation had the same instructions but only received 15 s of constant current. The current was ramped up for 30 s up to 1.5 mA, then ramped down for 30 s. Stimulation was single-blinded, meaning that while the participants were blind to the stimulation they received, the experimenters were aware of the participant’s stimulation group.

During the experiment, participants completed a sustained attention task (SART) in which they were asked to respond via a keypress (space bar) to non-target stimuli (single digits excluding the number 3) and to withhold responses to target stimuli (the number 3); see Figure 3a. Half of the trials ended in a target stimulus; the other half ended in a task-unrelated thought (TUT) probe. The TUT probe asked: “To what extent have you experienced task-unrelated thoughts prior to the thought probe? 1 (minimal) – 4 (maximal)”. Participants’ average response to the probe across trials was taken as a measure of mind wandering, with higher scores indicating more mind wandering.
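The response rule and the mind-wandering score can be sketched as below. The function names are hypothetical and the example ratings are invented for illustration; only the target digit, the 1-4 probe scale and the averaging step come from the description above.

```python
from statistics import mean
from typing import Sequence

TARGET_DIGIT = 3  # responses must be withheld for this digit


def is_correct(digit: int, key_pressed: bool) -> bool:
    """Correct = press for non-targets, withhold for the target digit."""
    return key_pressed != (digit == TARGET_DIGIT)


def mind_wandering_score(probe_ratings: Sequence[int]) -> float:
    """Average TUT probe rating (1 = minimal, 4 = maximal) across trials."""
    if not probe_ratings:
        raise ValueError("at least one probe rating is required")
    if any(r not in (1, 2, 3, 4) for r in probe_ratings):
        raise ValueError("probe ratings must be integers from 1 to 4")
    return mean(probe_ratings)


if __name__ == "__main__":
    illustrative_ratings = [1, 2, 4, 3]  # invented example data
    print(mind_wandering_score(illustrative_ratings))
```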

At the end of the experiment, participants were asked whether they thought they received active or sham stimulation (presented as a binary choice) via a short questionnaire. Moreover, at the end of the study, participants were also asked to guess which stimulation dosage they received, choosing between the following options: none, weak, moderate, or strong.

Statistical Analysis

Statistical analysis was run using R (version 4.2.0 for Windows). When considering a dependent variable on a continuous scale (e.g., depression scores), the lme4 package (Bates et al., 2015) was used to fit a linear mixed-effects model in the formulation described by Laird and Ware (1982). This analytic framework has two advantages over non-mixed linear models: 1) it allows the pooling of the same grand mean for both the sham and the active groups at baseline, and 2) the within-group errors are allowed to be correlated and/or have unequal variances, so the assumption of homoscedasticity need not hold. When the dependent variables were coded as binary (e.g., remission and response rates), the function glm (R Core Team, 2022) was used to fit generalised linear models.

We here refer to the subject’s judgment of whether they received active or sham stimulation as subjective treatment, in opposition to objective treatment, which indicates the actual type of stimulation that each subject received during the experiment. Similarly, we refer to participants’ judgment of stimulation dosage as subjective dosage.

We performed a theoretically driven model comparison to address the following two questions: 1) does the inclusion of subjective treatment lead to a model with a significantly better fit than the baseline model including objective treatment (and time, when applicable), and do they interact?; and 2) does the inclusion of objective treatment lead to a model with a significantly better fit than the baseline model including subjective treatment (and time, when applicable)?

In order to address the first question, we defined a baseline model including time and the time by objective treatment interaction as fixed effects. Time was defined as a categorical variable, with each level reflecting the weekly assessments from baseline to the end of the study. Participants were entered into the model as random effects. Notably, the reference level (the intercept) for all of the models was the baseline; therefore, each effect was expressed as a difference from baseline performance. Thus, the effect of time captures the overall change over time relative to the baseline. In the same vein, the interaction terms of time and treatment (either subjective or objective) can be conceptualised as covariates capturing the effect of the treatment over time relative to baseline performance. Given our interest in the contribution of subjective treatment over time, we compared the baseline model to an updated model that also included subjective treatment in a two-way interaction with time. Model comparison was run using the anova function in R (R Core Team, 2022). Our focus was on whether the comparison was significant (p < .05), indicating that the inclusion of subjective treatment led to a significantly better model fit, explaining variability in the dependent variable in addition to the explanatory power of objective treatment over time. Lastly, we compared the updated model to a more complex model, including the three-way interaction of time, subjective treatment and objective treatment. In this case, our focus was whether the model comparison was significant, indicating that subjective treatment interacted with time and objective treatment to explain variability.
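One way to write down the three nested models for the first question is given below. This is a sketch under stated assumptions rather than the exact specification used: the symbols are introduced here for illustration, the random effect is assumed to be a by-participant random intercept, and no treatment main effects appear because both groups share the baseline grand mean.

```latex
% y_{it}: outcome of participant i at assessment t (t = 0 is baseline)
% O_i, S_i: objective and subjective treatment (0 = sham, 1 = active)
% \mathbb{1}[t = k]: indicator that assessment t is post-baseline week k
\begin{align*}
M_0:\quad y_{it} &= \beta_0 + \sum_{k=1}^{K} \beta_k \,\mathbb{1}[t=k]
      + \sum_{k=1}^{K} \gamma_k \,\mathbb{1}[t=k]\, O_i + u_i + \varepsilon_{it} \\
M_1:\quad y_{it} &= M_0 \text{ terms} + \sum_{k=1}^{K} \delta_k \,\mathbb{1}[t=k]\, S_i \\
M_2:\quad y_{it} &= M_1 \text{ terms} + \sum_{k=1}^{K} \eta_k \,\mathbb{1}[t=k]\, O_i S_i \\
u_i &\sim \mathcal{N}(0, \sigma_u^2), \qquad \varepsilon_{it} \sim \mathcal{N}(0, \sigma^2)
\end{align*}
```

Under this notation, comparing M_0 with M_1 tests whether the coefficients \delta_k are jointly zero, and comparing M_1 with M_2 tests whether the coefficients \eta_k are jointly zero.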

For the second question, we reversed the order in which the treatment terms entered the models. The baseline model included time and the interaction of subjective treatment with time, and the comparison model additionally included objective treatment. The subsequent comparison involving the three-way interaction was identical to the one used for the first question. This allows us to establish whether objective treatment explained variability over and above subjective treatment.
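For nested mixed models fitted with lme4, R's anova reports a likelihood-ratio chi-square test. The sketch below reproduces that computation in plain Python from two log-likelihoods and parameter counts, to show what "significantly better fit" means numerically. The function names and the log-likelihood values in the usage example are hypothetical and are not results from any of the four studies.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable, integer df >= 1."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Closed form for even df: exp(-x/2) * sum_{r<df/2} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= (x / 2) / r
            total += term
        return math.exp(-x / 2) * total
    # Closed form for odd df: normal tail plus a finite correction series.
    term, total = math.sqrt(x), 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    tail = math.erfc(math.sqrt(x / 2))
    return tail + math.sqrt(2 / math.pi) * math.exp(-x / 2) * total


def likelihood_ratio_test(loglik_reduced, k_reduced, loglik_full, k_full):
    """Return (chi-square statistic, df, p-value) for two nested models."""
    statistic = 2 * (loglik_full - loglik_reduced)
    df = k_full - k_reduced
    return statistic, df, chi2_sf(statistic, df)


if __name__ == "__main__":
    # Hypothetical values: baseline model vs. model adding time x subjective treatment.
    stat, df, p = likelihood_ratio_test(-512.4, 6, -506.1, 9)
    print(f"chi2({df}) = {stat:.2f}, p = {p:.4f}")
```

The same function covers both questions, since only the choice of reduced and full model changes between them.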

Credit Author Statement

Luisa Fassi: Formal analysis; Conceptualisation; Data curation; Methodology; Investigation; Writing - original

Shachar Hochman: Formal analysis; Data curation; Methodology; Investigation; Writing - review and editing

Daniel M. Blumberger: Conceptualisation; Data curation; Investigation; Writing - review and editing

Zafiris J. Daskalakis: Data curation; Investigation; Writing - review and editing

Roi Cohen Kadosh: Formal analysis; Conceptualisation; Supervision; Methodology; Writing – original