1. Introduction

Noninvasive neuromodulation is a powerful tool for causal inference that strengthens our understanding of the brain and holds great clinical potential (Bergmann & Hartwigsen, 2021; Bestmann & Walsh, 2017). Transcranial ultrasonic stimulation (TUS) is a particularly promising non- invasive brain stimulation technique, overcoming current limitations with high spatial resolution and depth range (Darmani et al., 2022). The efficacy of TUS is well-established in cell cultures and animal models (Menz et al., 2013; Mohammadjavadi et al., 2019; Murphy et al., 2022; Tyler et al., 2008, 2018; Yoo et al., 2022), and emerging evidence for the neuromodulatory utility of TUS in humans has been reported for both cortical and subcortical structures (cortical: Butler et al., 2022; Lee et al., 2016; Liu et al., 2021; Zeng et al., 2022; subcortical: Ai et al., 2016; Cain et al., 2021; Nakajima et al., 2022). Especially now, at this foundational stage of TUS in humans, it is essential to converge on protocols that maximize the specificity and efficacy of stimulation (Folloni et al., 2019; Verhagen et al., 2019).

Motor inhibitory effects of a commonly applied 1000 Hz pulsed TUS protocol are among the most robust and replicable human findings (Fomenko et al., 2020; Legon, Bansal, et al., 2018; Xia et al., 2021). Here, by concurrently applying transcranial magnetic stimulation (TMS), modulation of corticospinal excitability is indexed by motor-evoked potentials (MEPs). However, the mechanism by which TUS evokes motor inhibition has remained under debate (Xia et al., 2021).

Recent studies in both animal and human models demonstrate how electrophysiological and behavioral outcomes of TUS can be elicited by nonspecific auditory activation rather than direct neuromodulation (Airan & Butts Pauly, 2018; Braun et al., 2020; Guo et al., 2018; Sato et al., 2018). Indirect effects of stimulation are not unique to TUS, as transcranial magnetic and electric stimulation are also associated with auditory and somatosensory confounds. Indeed, the field of non-invasive brain stimulation as a whole depends on controlling for these confounding factors when present, to unveil the specificity of the neuromodulatory effects (Conde et al., 2019; Duecker et al., 2013; Polanía et al., 2018; Siebner et al., 2022). However, prior online TUS-TMS studies, including those exploring optimal neuromodulatory parameters to inform future work, have considered some but not all necessary conditions to control for the salient auditory confound elicited by a 1000 Hz pulsed protocol (Fomenko et al., 2020; Legon, Bansal, et al., 2018; Xia et al., 2021).

In this multicenter study, we employed improved control conditions to disentangle direct neuromodulatory and indirect auditory contributions to motor inhibitory effects of TUS. We further investigated dose-response effects through administration of multiple stimulus durations, stimulation intensities, and individualized simulations of intracranial intensity. Additionally, we considered the possibility that online TUS might not drive a global change in the excitation/inhibition balance but instead might interact with ongoing neural dynamics by introducing state-dependent noise. Finally, we interrogated sound-driven effects through modulation of auditory confound volume, duration, pitch, and auditory masking. We show that motor inhibitory effects of TUS are spatially nonspecific and driven by sound-cued preparatory motor inhibition. However, we do find preliminary evidence that TUS might introduce dose- and state-dependent neural noise to the dynamics of corticospinal excitability. The present study highlights the importance of carefully constructed control conditions to account for confounding factors while exploring and refining TUS as a promising technique for human neuromodulation.

2. Materials and methods

2.1. Participants

This multicenter study comprised of four experiments conducted independently across three institutions. Experiment I (N = 12, 4 female, Mage = 25.9, SDage = 4.6; METC: NL76920.091.21) and Experiment II (N = 27, 13 female, Mage = 24.1, SDage = 3.7; METC: NL76920.091.21) were conducted at the Donders Institute of the Radboud University (the Netherlands). Experiment III was conducted at the Krembil Research Institute (N = 16, 8 female, Mage = 31.4, SDage = 7.9; Toronto University Health Network Research Ethics Board: 20-5740, Canada), and Experiment IV at the Neuroimaging Centre of the Johannes Gutenberg University Medical Centre Mainz (N = 12, 11 female, Mage = 23.0, SDage = 2.7, Landesärztekammer Rheinland-Pfalz: 2021-15808_01, Germany). All participants were healthy, right- handed, without a history of psychiatric or neurological disorders, and provided informed consent. Ethical approval was obtained for each experiment.

2.2. Transcranial ultrasonic and magnetic stimulation

Ultrasonic stimulation was delivered with the NeuroFUS system (manufacturer: Sonic Concepts Inc., Bothell, WA, USA; supplier/support: Brainbox Ltd., Cardiff, UK). A radiofrequency amplifier powered a piezoelectric ultrasound transducer via a matching network using a rectangular pulse shape. Transducers consisted of a two-element annular array, either with ‘solid water coupling’ (NeuroFUS, Exp. I, II, & IV), or without (Exp. III). Ultrasonic stimulation parameters were based on those used in prior TUS-TMS studies (Table 1, Fig. 1B; Fomenko et al., 2020; Legon, Bansal, et al., 2018; Xia et al., 2021).

Experimental conditions and procedures.

(A) Main conditions: on-target TUS of the left-hemispheric hand motor area (Exp. I-III), active control TUS of the right-hemispheric face motor area (Exp. I-II), sound-only sham (Exp. I-III), and inactive control TUS of the white matter ventromedial to the hand motor area (Exp. IV). (B) Ultrasonic stimulation protocol. (C) Experimental timing. (D) TUS-TMS clamp (DOI: 10.5281/zenodo.6517599).

Ultrasonic stimulation parameters. f = fundamental frequency, PD = pulse duration, PRF = pulse repetition frequency, DC = duty cycle, SD = stimulus duration, Isppa = spatial-peak pulse-average intensity in free- water, P = pressure, MI = derated mechanical index. For estimated intracranial indices for Experiments I & II see Supplementary Figure 2.

Single-pulse TMS was delivered with a figure-of-eight coil held at 45° from midline to induce an approximate posterolateral to anteromedial current. The hand motor hotspot and required TMS intensity were determined using standard procedures as outlined in Supplementary Fig. 1. To apply TUS and TMS concurrently, the ultrasound transducer was affixed to the center of the TMS coil using a custom-made 3D-printed clamp (Fig. 1D; Experiments I, II, & IV; Experiment IV: see Fomenko et al., 2020). TMS was triggered 10 ms prior to the offset of TUS (Fig. 1C). Muscular activity was recorded in the first dorsal interosseous (FDI; Experiments I-III) or in the abductor pollicis brevis (APB; Experiment IV) via electromyography with surface adhesive electrodes using a belly-tendon montage (Supplementary Table 1).

In Experiments I, II, and IV, we used online neuronavigation with individual anatomical scans to support target selection and consistent TMS and TUS placement (Localite Biomedical Visualization Systems GmbH, Sankt Augustin, Germany; MRI specifications: Supplementary Table 2). Further, we recorded the position of TUS in Experiments I and II for post-hoc acoustic and thermal simulations.

2.3. Experiment I

On-target TUS was delivered to the left-hemispheric hand motor area to determine the effect of ultrasonic stimulation on corticospinal excitability. We assessed TUS spatial specificity with an active control condition targeting the right-hemispheric face motor area. Further, we included a sound-only sham condition wherein a MATLAB-generated 1000 Hz square wave tone with a 4:1 signal-to-noise ratio was administered over bone-conducting headphones (Fig. 1A; AfterShockz Trekz, TX, USA). Finally, we incorporated a baseline condition consisting solely of TMS.

Ultrasonic stimulation was delivered at two stimulus durations (100/500 ms) and at two intensities (32.5/65 W/cm2 Isppa) to probe a potential dose-response effect. Additionally, with consideration of potentially audible differences between on-target and active control stimulation sites, we applied these conditions both with and without masking stimuli. These auditory stimuli were identical those used during sound-only sham (Braun et al., 2020). See Supplementary Fig. 3 for an overview of conditions and experimental timing for each experiment.

Conditions were administered in a single-blind inter-subject counterbalanced blocked design while participants were seated at rest. Ultrasound gel was used to couple both transducers to the participant’s scalp (Aquasonic 100, Parker Laboratories, NJ, USA). In total, participants completed 14 blocks of 20 trials each. Each trial lasted 6 ± 1 seconds. Two baseline measurements were completed, the first occurring as one of the first four blocks, and the second as one of the last four, to capture any general shift in excitability throughout the experiment. TMS was administered on every trial for a total of 280 single pulses.

2.4. Experiment II

To confirm and expand upon our findings from Experiment I we conducted a second, preregistered, experiment using the same main conditions and procedures, with a few adaptations (https://doi.org/10.17605/OSF.IO/HS8PT). The 2x2x2 design comprised of stimulation site (on- target/active control), stimulation intensity (6.35/19.06 W/cm2), and auditory masking (no mask/masked). We applied ultrasonic stimulation exclusively at an effective 500 ms stimulus duration. In this experiment, in-ear headphones were used for sound-only sham and auditory masking conditions (ER-3C Insert Earphones, Etymotic Research, Illinois, USA). To better capture any baseline shift in excitability during the experiment, we presented conditions in a single-blind pseudorandomized order in which each consecutive set of 10 trials included each of 10 conditions once. Participants completed 25 trials per condition, resulting in 250 trials total.

To further probe a potential dose-response effect of stimulation intensity, we ran acoustic and thermal simulations (Supplementary Fig. 4). Here, we assessed the relationship between estimated intracranial intensities and perturbation of corticospinal excitability. While simulations were also run for Experiment I, its sample size was insufficient to test for intracranial dose-response effects.

Following the main experiment, we tested the efficacy of our masking stimuli with a forced- choice task wherein participants reported if they had received TUS for each condition, excluding baseline. Additionally, we investigated whether audible differences between stimulation sites were present during auditory masking (Supplementary Fig. 5).

2.5. Experiment III

We further characterized possible effects of auditory confounds on motor cortical excitability by administering varied auditory stimuli, both alongside on-target TUS and without TUS (i.e., sound-only sham). Auditory stimuli were either 500 or 700 ms in duration, the latter beginning 100 ms prior to TUS. Both durations were presented at two pitches. A 12 kHz tone was administered over speakers positioned to the left of the participant as in Fomenko and colleagues (2020), and a 1 kHz tone was administered as in Experiments I, II, IV, and prior research (Braun et al., 2020) over noise-cancelling earbuds.

First, we investigated changes in corticospinal excitability from baseline following these auditory stimuli. Participants received 15 trials of baseline (i.e., TMS only) and 15 trials of each of the four sound-only sham stimuli. Conditions were presented in a blocked single-blind randomized order with participants seated at rest. An inter-trial interval of 5 seconds was used.

Next, we assessed whether applying on-target TUS during these auditory stimuli affected motor excitability. Here, TMS intensity was set to evoke a ∼1 mV MEP separately for each of the four sound-only sham conditions (Supplementary Fig. 1). To account for different applied TMS intensities between baseline and these conditions, we calculated Relative MEP amplitude by multiplying each trial by the ratio of applied TMS intensity to baseline TMS intensity. Participants received 15 trials of each auditory stimulus, once with on-target TUS and once as a sound-only sham. Ultrasound gel (Wavelength MP Blue, Sabel Med, Oldsmar, FL) and a 1.5 mm thick gel-pad (Aquaflex, Parker Laboratories, NJ, USA) were used to couple the transducer to the participants’ scalp. Conditions were presented in pairs of sound-sham and TUS for each auditory stimulus, counterbalanced between subjects. The order of the different auditory stimuli was randomized across participants.

2.6. Experiment IV

We further investigated the role of TUS audibility on motor excitability by administering stimulation to an inactive control site – the white matter ventromedial to the hand motor area. In doing so, TUS is applied over a homologous region of the scalp and skull without likely direct neuromodulation, thus allowing us to closely replicate the auditory confound while simultaneously isolating its effects.

Here, we probed whether the varying volume of the auditory confound at different stimulation intensities might itself impact motor cortical excitability. To this end, we applied stimulation at 4.34, 8.69, and 10.52 W/cm2 Isppa, or in effect, at three auditory confound volumes. We additionally applied stimulation both with and without a continuous auditory masking stimulus that consisted of a 1 kHz square wave with white noise, presented through wired bone-conducting headphones (LBYSK Wired Bone Conduction Headphones). The volume and signal-to-noise ratio of the masking stimulus were increased until the participant could no longer hear TUS, or until the volume became uncomfortable.

We administered conditions in a single-blind inter-subject randomized blocked design. Two blocks were measured per condition, each including 30 TUS-TMS trials and an additional 30 TMS-only trials to capture drifts in baseline excitability. These trials were applied in random order within each block with an inter-trial interval of 5 ± 1 seconds. Ultrasound gel (Aquasonic 100, Parker Laboratories, NJ, USA) and a ∼2-3 mm thick gel-pad were used to couple the transducer to the participant’s scalp (Aquaflex, Parker Laboratories, NJ, USA). During blocks with auditory masking, the mask was played continuously throughout the block. Following each block, participants were asked whether they could hear TUS (yes/no/uncertain).

2.7. Analysis

Raw data were exported to MATLAB, where MEP peak-to-peak amplitude was calculated using a custom script and confirmed by trial-level visual inspection. Trials where noise prevented an MEP to be sufficiently quantified were removed. Given the right-skewed nature of the raw MEP values, we performed a square root transformation to support parametric statistics. For visualization purposes, baseline corrected MEP amplitudes were also calculated.

Linear mixed-effects models (LMMs) were fitted using the lme4 package in R (Bates et al., 2015; R core team, 2021). Intercepts and condition differences (slopes) were allowed to vary across participants, including all possible random intercepts, slopes, and correlations in a maximal random effects structure (Barr et al., 2013). Statistical significance was set at two-tailed α = 0.05 and was computed with t-tests using the Satterthwaite approximation of degrees of freedom. For direct comparisons to a reference level (e.g., baseline), we report the intercept (b), standard error (SE), test- statistic (t), and significance (p). For main effects and interactions, we report the F statistic, significance, and partial eta squared. LMMs included square root transformed MEP peak-to-peak amplitude as the dependent variable, with the relevant experimental conditions and their interactions as predictors. Given the large number of baseline trials in Experiment IV (50% of total), the LMM testing effects of stimulation intensity and auditory masking instead included baseline corrected MEP amplitude as the dependent variable.

3. Results

3.1. Motor cortical inhibition is not specific to on-target TUS

We first corroborate previous reports of MEP suppression following 500 ms of TUS applied over the hand motor area (Experiments I-III; Fomenko et al., 2020; Legon, Bansal, et al., 2018; Xia et al., 2021). An LMM revealed significantly lower MEP amplitudes following on-target TUS as compared to baseline for Experiment I (b = -0.14, SE = 0.06, t(11) = -2.23, p = 0.047), Experiment II (b = -0.18, SE = 0.04, t(26) = -4.82, p = 6⋅10-5), and Experiment III (b = -0.22, SE = 0.07, t(15) = -3.08, p = 0.008).

However, corticospinal inhibition from baseline was also observed following control conditions. LMMs revealed significant attenuation of MEP amplitude following active control stimulation of the right-hemispheric face motor area (Experiment I: b = -0.12, SE = 0.05, t(11) = -2.29, p = 0.043; Experiment II: b = -0.22, SE = 0.04, t(26) = -5.60, p = 7⋅10-6), as well as after inactive control stimulation of the white matter ventromedial to the left-hemispheric hand motor area (Experiment IV: b = -0.14; SE = 0.04; t(11) = -3.09; p = 0.010). The same effect was observed following sound-only sham (Experiment I: b = -0.14; SE = 0.05; t(11) = -3.18; p = .009; Experiment II: b = -0.22; SE = 0.04; t(26) = -5.38; p = 1⋅10-5; Experiment III: b = -0.24; SE = 0.08; t(15) = -2.86; p = 0.012). These results suggest a spatially non-specific effect of TUS that is related to the auditory confound (Fig. 2).

Non-specific motor inhibitory effects of TUS.

A significant suppression of MEP amplitude relative to baseline (gray) was observed for on-target TUS (green), but also for stimulation of a control region (cyan), and presentation of a sound alone (sound-sham; blue) indicating a spatially non-specific and sound-driven effect on motor cortical excitability. Bar plots depict condition means, error bars represent standard errors, clouds indicate the distribution over participants, and points indicate individual participants. Square-root corrected MEP amplitudes are depicted for Experiments I, II, and IV, and Relative MEP amplitude is depicted for Experiment III (see Methods). *p < 0.05, **p < 0.01, ***p < 0.001.

3.2. No dose-response effects of TUS on corticospinal inhibition

We further tested for direct ultrasonic neuromodulation by investigating a potential dose-response effect of TUS intensity (Isppa) on motor cortical excitability. First, we applied TUS at multiple free-water stimulation intensities (Fig. 3B). In Experiment I, a linear mixed model with the factor ‘intensity’ (32.5/65 W/cm2) did not reveal a significant effect of different on-target TUS intensities on motor excitability (F(1,11) = 0.47, p = 0.509, ηp2 = 0.04). In Experiment II, a linear mixed model with the factors ‘stimulation site’ (on-target/active control), ‘masking’ (no mask/masked), and ‘intensity’ (6.35/19.06 W/cm2) similarly did not reveal an effect of stimulation intensity (F(1,50) = 1.29, p = 0.261, ηp2 = 0.03). Importantly, there was no effect of stimulation site (F(1,168) = 1.75, p = 0.188, ηp2 = 0.01), nor any significant interactions (all p-values > 0.1; all ηp2 < 0.06). These results provide neither evidence for spatially specific neuromodulation when directly comparing stimulation sites, nor evidence for a dose- response relationship within the range of applied intensities.

No dose-response effects of TUS.

(A) Acoustic (top) and thermal (bottom) simulations for a single subject in Experiment II. The acoustic simulation depicts estimated pulse-average intensity (Ipa) above a 0.15 W/cm2 lower bound, with the dotted line indicating the full-width half-maximum of the pressure. The thermal simulation depicts maximum estimated temperature rise. (B) There is no significant effect of free-water stimulation intensity on MEP amplitude. Values are expressed as a percentage of baseline MEP amplitude (square root corrected). Remaining conventions are as in Fig. 2. (C) On-target TUS MEP amplitude as a percentage of active control MEP amplitude against simulated intracranial intensities at the two applied free- water intensities: 6.35 W/cm2 (top) and 19.06 W/cm2 (bottom). The shaded area represents the 95% CI, points represent individual participants. (D) Temporal autocorrelation, operationalized as the slope of the linear regression between trial t and baseline trial t-1, differed significantly as a function of stimulation site and intensity for masked trials. Individual points represent the differential autocorrelation compared to the active control site. Autocorrelation was not modulated by sound-only sham, but was significantly higher for on-target TUS at 6.35 W/cm2, and significantly lower for on-target TUS at 19.06 W/cm2 compared to active control TUS. *p < 0.05, **p < 0.01, ***p < 0.001.

However, it is likely that the effectiveness of TUS depends primarily on realized intracranial intensity, which we estimated with individualized 3D simulations (Fig. 3A). Yet, testing the relationship between estimated intracranial intensity and MEP amplitude change following on-target TUS similarly did not yield evidence for a dose-response effect (Fig. 3C, Supplementary Fig. 6).

Prior work has primarily focused on probing facilitatory or inhibitory effects on corticospinal excitability. Here, we also considered an alternative: how TUS might introduce noise to ongoing neural dynamics, rather than a directional modulation of excitability. Indeed, human TUS studies have often failed to show a global change in behavioral performance, instead finding TUS effects primarily around the perception threshold where noise might drive stochastic resonance (Butler et al., 2022; Legon, Ai, et al., 2018). Neural noise introduced by brain stimulation is likely state-dependent and might not exceed the dynamic range of the intra-subject variability (Silvanto et al., 2007). Therefore, in an exploratory analysis, we exploited the natural structure in corticospinal excitability that exhibits as a strong temporal autocorrelation in MEP amplitude. Specifically, we tested how strongly the MEP on test trial t is predicted by the previous baseline trial t-1. As such, we quantified state-dependent autocorrelation between baseline MEP amplitude and MEP amplitude following on-target TUS, active control TUS, and sound-sham conditions (Supplementary Fig. 7). In brief, we found a significant interaction between previous baseline (t-1), stimulation site (on-target/active control), and intensity (6.35/19.06 W/cm2; F(1,30) = 12.10, p = 0.002, ηp2 = 0.28) during masked trials. This interaction exhibited as increased autocorrelation for on-target TUS compared to active control TUS at 6.35 W/cm2 (i.e., lower TUS-induced noise; F(1,1287) = 13.43, p = 3⋅10-4, ηp2 = 0.01), and reduced autocorrelation at 19.06 W/cm2 (i.e., higher noise; F(1,1282) = 5.76, p = 0.017, ηp2 = 4⋅10-3; Fig. 3D). This effect was not only dependent upon intensity and stimulation site, but also dependent on the presence of auditory masking. As such, the effect was also observed in a four-way interaction of the previous baseline, site, intensity, and masking (Supplementary Fig. 7). These results might suggest that ultrasound stimulation interacted with ongoing neural dynamics by introducing temporally specific noise, rather than biasing the overall excitation/inhibition balance beyond its natural variation.

3.3. Audible differences between stimulation sites do not underlie nonspecific inhibition

Stimulation over two separate sites could evoke distinct perceptual experiences arising from bone- conducted sound (Braun et al., 2020). To account for possible audibility differences between stimulation of on-target and active control sites in Experiments I and II, we also tested these conditions in the presence of a time-locked masking stimulus (Fig. 4D-E). Following Experiment II, we additionally assessed the blinding efficacy of our masking stimuli, finding that the masking stimulus effectively reduced participant’s ability to determine whether TUS was administered to approximately chance level (Supplementary Fig. 5).

Sound-driven effects on corticospinal excitability.

(A) Longer (auditory) stimulus durations resulted in lower MEP amplitudes, regardless of TUS administration, indicating a sound-duration-dependency of motor inhibitory outcomes (Exp. I). (C) A significant effect of auditory stimulus duration was also observed in Experiment III. (B) Less MEP attenuation was measured during continuous masking, particularly for lower stimulation intensities (i.e., auditory confound volumes), pointing towards a role of TUS audibility in MEP attenuation. (D-E) There were no significant effects of time-locked masking, indicating that audible differences between stimulation sites did not obscure or explain the absence of direct neuromodulation. (F) The pitch of auditory stimuli affected MEPs, where lower amplitudes were observed following a 1 kHz tone. There was no effect of TUS. Conventions are as in Figs. 2-3B.

In Experiment I, a linear mixed model with factors ‘masking’ (no mask/masked) and ‘stimulation site’ (on-target/active control) did not reveal a significant effect of masking (F(1,11) = 0.01, p = 0.920, ηp2 = 1⋅10-5), stimulation site (F(1,11) = 0.15, p = 0.703, ηp2 = 0.01), nor their interaction (F(1,11) = 1⋅10-3, p = 0.971, ηp2 = 1⋅10-4). Similarly, in Experiment II, the linear mixed model described under the previous section revealed no significant main effect of masking (F(1,30) = 1.68, p = 0.205, ηp2 = 0.05), nor any interactions (all p-values > 0.1; all ηp2 < 0.06). These results indicate that an underlying specific neuromodulatory effect of TUS was not being obscured by audible differences between stimulation sites.

3.4. Sound-driven effects on corticospinal excitability

3.4.1. Duration and pitch

Prior research has shown that longer durations of TUS significantly inhibited motor cortical excitability (i.e., ≥400 ms; Fomenko et al., 2020), while shorter durations did not. In Experiment I, we applied on- target, active control, and sound-sham conditions at shorter and longer durations to probe this effect. When directly comparing these conditions at different stimulus durations (100/500 ms), no evidence for an underlying neuromodulatory effect of TUS was observed, in line with our aforementioned findings. Instead, a linear mixed model with factors ‘condition’ (on-target/active control/sound-sham) and ‘stimulus duration’ (100/500 ms) revealed only a significant main effect of (auditory) stimulus duration, where longer stimulus durations resulted in stronger MEP attenuation (F(1,11) = 10.07, p = 0.009, ηp2 = 0.48). There was no significant effect of condition (F(2,11) = 1.30, p = 0.311, ηp2 = 0.19), nor an interaction between stimulus duration and condition (F(2,11) = 0.65, p = 0.543, ηp2 = 0.11). These results further show that the auditory confound and its timing characteristics, rather than ultrasonic neuromodulation, underlies the observed inhibition of motor cortical excitability (Fig. 4A).

We further tested auditory effects in Experiment III, where we administered sound-sham stimuli at four combinations of duration and pitch. A LMM with factors ‘duration’ (500/700 ms) and ‘pitch’ (1/12 kHz) revealed significantly lower MEPs following 500 ms auditory stimuli (Fig. 4C; duration: F(1,15) = 7.12, p = 0.017, ηp2 = 0.32; pitch: F(1,15) = 0.02, p = 0.878, ηp2 = 2⋅10-3; interaction: F(1,15) = 2.23, p = 0.156, ηp2 = 0.13), supporting the role of auditory stimulus timing in perturbation of MEP amplitude.

Subsequently, ultrasonic stimulation was also administered alongside these four auditory stimuli. Here, a LMM with factors ‘auditory stimulus duration’ (500/700 ms), ‘pitch’ (1/12 kHz), and ‘ultrasonic stimulation’ (yes/no) revealed no significant effect of auditory stimulus duration in contrast to the first test (F(1,15) = 0.44, p = 0.517, ηp2 = 0.03). However, a 1 kHz pitch resulted in significantly lower MEP amplitudes than a 12 kHz pitch (Fig 4F; F(1,15) = 4.94, p = 0.042, ηp2 = 0.25). Importantly, we find no evidence for ultrasonic neuromodulation, where both on-target TUS and sound-sham reduced MEP amplitude from baseline (Fig. 2C), and where applying on-target TUS did not significantly affect MEP amplitude as compared to sound-sham (F(1,15) = 0.42, p = 0.526, ηp2 = 0.03; Fig. 4F). We observed a nonsignificant trend for the interaction between ‘ultrasonic stimulation’ and ‘auditory stimulus duration’ (F(1,15) = 4.22, p = 0.058, ηp2 = 0.22). No trends were observed for the remaining interactions between these three factors (all ηp2 < 0.06, p > 0.3). Taken together, these results do not provide evidence for direct ultrasonic neuromodulation but support the influence of auditory stimulation characteristics on motor cortical excitability.

3.4.2. TUS audibility and confound volume

In Experiment IV, we applied TUS to an inactive target – the white matter ventromedial to the left- hemispheric hand motor area – both with and without a continuous auditory masking stimulus. MEP amplitudes did not significantly differ in baseline conditions regardless of whether a continuous sound was being played (b = 0.03, SE = 0.06, t(11) = 0.52, p = 0.616), indicating that continuous auditory stimulation alone might not be sufficient to inhibit MEP amplitude.

We additionally applied stimulation at multiple intensities to isolate the effect of auditory confound volume. A linear mixed model with factors ‘masking’ (no mask/masked) and ‘intensity’ (4.34/8.69/10.52 Wcm-2) with a random intercept and slope for each factor revealed a significant interaction (F(2,4038) = 3.43, p = 0.033, ηp2 = 2⋅10-3) and an accompanying effect of ‘masking’ with lesser MEP attenuation when stimulation was masked (F(1,11) = 11.84, p = 0.005, ηp2 = 0.52; Fig. 4B). Follow-up comparisons revealed significantly less attenuation for masked stimulation at 4.34 W/cm2 intensity (F(1,11) = 13.02, p = 0.004, ηp2 = 0.55), and a nonsignificant trend for the higher intensities (8.69 W/cm2: F(1,11) = 3.87, p = 0.077, ηp2 = 0.27; 10.52 W/cm2: F(1,11) = 3.47, p = 0.089, ηp2 = 0.24).

In direct comparisons to baseline, all conditions resulted in a significant inhibition of MEP amplitude (all t < -3.36, all p < 0.007), with the exception of continuously masked stimulation at the lowest intensity of 4.34 W/cm2 Isppa (b = -0.06, SE = 0.03, t(11) = -2.04, p = 0.065).

The data indicate that continuous masking reduces motor inhibition, likely by minimizing the audibility of TUS, particularly when applied at a lower stimulation intensity (i.e., auditory confound volume). It is likely that the remaining motor inhibition observed during masked trials owes to, albeit decreased, persistent audibility of TUS during masking. Indeed, MEP attenuation in the masked conditions descriptively scale with participant reports of audibility (Supplementary Fig. 8).

3.4.3. Preparatory cueing of TMS

We find that MEP attenuation results from auditory stimulation rather than direct neuromodulation. Two putative mechanisms through which sound cuing may drive motor inhibition have been proposed, positing either that explicit cueing of TMS timing results in compensatory processes that drive MEP reduction (Capozio et al., 2021; Tran et al., 2021), or suggesting the evocation of a startle response that leads to global inhibition (Fisher et al., 2004; Furubayashi et al., 2000; Ilic et al., 2011; Kohn et al., 2004; Wessel & Aron, 2013). Critically, we can dissociate between these theories by exploring the temporal dynamics of MEP attenuation.

In Experiments I and II, linear mixed models with ‘trial number’ as a predictor show significant changes in MEP amplitude throughout the experiment, pointing to a learning effect. Specifically, in Experiment I, a significant reduction in MEP amplitude was observed across the first 10 trials where a 500 ms stimulus was delivered (b = -0.04, SE = 0.01, t(11) = -2.88, p = 0.015), following by a stabilization in subsequent blocks (b = -2⋅10-4, SE = 3⋅10-4, t(11) = -0.54, p = 0.601). This same pattern was observed in Experiment II, with a significant reduction across the first 20 trials (b = -0.01, SE = 3⋅10-3, t(26) = - 4.08, p = 4⋅10-4), followed by stabilization (b = 6⋅10-5, SE = 1⋅10-4, t(26) = 0.46, p = 0.650; Fig. 5). The data suggest that the relative timing of TUS and TMS is learned across initial trials, followed by a stabilization at a decreased MEP amplitude once this relationship has been learned. These results could reflect auditory cueing of TMS that leads to a compensatory expectation-based reduction of motor excitability.

Auditory cueing of TMS.

There was a significant reduction in MEP amplitude when participants were first presented with a 500 ms stimulus (initial trials) in Experiment I (left) and Experiment II (right), following by a stabilization of MEP amplitude during the rest of the experiment (following trials), indicating a learning process by which TUS acts as a cue signaling the onset of TMS. The solid line depicts the loess regression fit, and the shaded area represents the 95% confidence interval.

Discussion

In this study, we show the considerable impact of auditory confounds during audibly pulsed TUS in humans. We employed improved control conditions compared to prior work across four experiments, one preregistered, at three independent institutions. Here, we disentangle direct neuromodulatory and indirect auditory contributions during ultrasonic neuromodulation of corticospinal excitability. While we corroborated motor inhibitory effects of online TUS (Fomenko et al., 2020; Legon, Bansal, et al., 2018; Xia et al., 2021), we demonstrated that this inhibition also occurs with stimulation of a control region or presentation of a sound alone, suggesting that the auditory confound rather than direct ultrasonic neuromodulation drives inhibition. Further, no direct neuromodulatory effects on overall excitability were observed, regardless of stimulation timing, intensity, or masking. However, we note that an exploratory investigation of temporal dynamics indicated ultrasound might introduce noise to the neural system. Importantly, we found convincing evidence that characteristics of auditory stimuli do globally affect motor excitability, where auditory cueing of TMS pulse timing can affect measures of corticospinal excitability. This highlights the importance of explicit cueing in TMS experimental design. Most importantly, our results call for a reevaluation of earlier findings following audible TUS, and highlight the importance of suitable controls in experimental design (Bergmann & Hartwigsen, 2021; Siebner et al., 2022).

No evidence for direct neuromodulation by TUS

Prior studies have highlighted sound-driven effects of TUS in behavioral and electrophysiological research (Airan & Butts Pauly, 2018; Braun et al., 2020; Guo et al., 2018; Johnstone et al., 2021; Sato et al., 2018). Here, we assessed whether the auditory confound of a conventional 1000 Hz pulsed protocol might underlie motor inhibitory effects, which are among the most robust and replicable human findings (Fomenko et al., 2020; Legon, Bansal, et al., 2018; Xia et al., 2021). While we successfully replicated this inhibitory effect, we found the same inhibition following stimulation of a motor control site (contralateral, active) and stimulation of a white-matter control site (ipsilateral, inactive; Fig. 2). This contrasts with a prior TUS-TMS study which found that TUS of the contralateral hand motor area did not change motor cortical excitability (Xia et al., 2021). Indeed, in all direct comparisons between on-target and control stimulation, no differences in excitability were observed, pointing towards a spatially nonspecific effect of TUS. Considering further inhibitory effects following administration of an auditory stimulus alone, the data suggest that online TUS motor inhibition is largely driven by the salient auditory confound, rather than spatially specific and direct neuromodulation. However, an exploratory analysis that tested for effects beyond a global shift in excitation-inhibition balance revealed that TUS might interact with ongoing neural dynamics by introducing dose-dependent noise (Fig. 3D).

We found no evidence of a dose-response relationship between TUS intensity (Isppa) and motor inhibition when applying stimulation at a wide range of intensities, nor when testing the relationship between simulated intracranial intensities and changes in excitability (Fig. 3B-C). Similarly, administration of a time-locked auditory masking stimulus that effectively reduced TUS detection rates did not provide evidence of direct effects being obscured by audible differences between conditions (Fig. 4D-E, Supplementary Fig. 5). Taken together, this study presents no evidence for direct and spatially specific TUS inhibition of motor excitability when applying a clearly audible protocol, despite using improved control conditions, higher stimulation intensities, and a larger sample size than prior studies (Fomenko et al., 2020; Legon, Bansal, et al., 2018; Xia et al., 2021). Building on these results, the current challenge is to develop efficacious neuromodulatory protocols with minimal auditory interference. Efforts in this direction are already underway (Mohammadjavadi et al., 2019; Nakajima et al., 2022; Zeng et al., 2022).

Sound-cued motor inhibition

Until now, it was unclear how TUS induced motor inhibition in humans. Here, we show that this inhibition is caused by peripheral auditory stimulation. It is well-known that MEPs are sensitive to both sensory and psychological factors (Duecker et al., 2013). For example, several studies find MEP attenuation following a startling auditory stimulus (Fisher et al., 2004; Furubayashi et al., 2000; Ilic et al., 2011; Kohn et al., 2004; Wessel & Aron, 2013), and have demonstrated the impact of stimulus duration and volume on this inhibition (Furubayashi et al., 2000). It is possible that a similar mechanism is at play for audible TUS protocols. Indeed, we observed modulation of motor cortical excitability dependent upon the characteristics of auditory stimuli, including their duration and timing (Fig. 4A,C), their pitch/frequency (Fig. 4F), and whether the confound was audible in general, including perceived volume (Fig. 4B, Supplementary Fig. 6).

One possible interpretation of the observed MEP attenuation is that the auditory confound acts as a salient cue to predict the upcoming TMS pulse. Prediction-based attenuation has been reported in both sensory and motor domains (Ford et al., 2007; Tran et al., 2021). For example, MEPs are suppressed when the timing of a TMS pulse can be predicted by a warning cue (Capozio et al., 2021; Tran et al., 2021). In the current experimental setup, participants could also learn the relative timing of the auditory stimulus and the TMS pulse. Indeed, we observe MEP attenuation emerge across initial trials as participants learn when to expect TMS, until a stable (i.e., learned) state is reached (Fig. 5). Moreover, no motor inhibition was observed when TUS onset was inaudible or when stimulation timing was potentially too fast to function as a predictive cue (100 ms). Taken together, a parsimonious explanation is expectation-based inhibition of TMS-induced MEPs. This inhibitory response might either reflect inhibition of competing motor programs – a component of motor preparation – or a homeostatic process anticipating the TMS-induced excitation (Capozio et al., 2021; Tran et al., 2021).

Limitations

The precise biomolecular and neurophysiological mechanisms underlying ultrasonic neuromodulation remain under steadily progressing investigation (Weinreb & Moses, 2022; Yoo et al., 2022). A shared interpretation is that mechano-electrophysiological energy transfer is proportional to acoustic radiation force, and thus proportional to stimulation intensity. Accordingly, one could argue that the TUS dose in the present study could have been insufficient to evoke direct neuromodulation. Indeed, despite the applied intensities exceeding prior relevant human work (Fomenko et al., 2020; Legon, Bansal, et al., 2018; Xia et al., 2021) the total applied neuromodulatory doses are relatively low as compared to, for example, repetitive TUS protocols (rTUS) in animal work (Folloni et al., 2019; Verhagen et al., 2019) or recent human studies (Nakajima et al., 2022).

Alternatively, insufficient neural recruitment could be attributed to stimulation parameters other than intensity. If so, the absence of direct neuromodulation across these experiments might not generalize to parameters outside the tested set. For example, while we replicated and extended prior work targeting the hand motor area at ∼30 mm from the scalp (Fomenko et al., 2020; Legon, Bansal, et al., 2018; Xia et al., 2021), other studies have suggested that the optimal stimulation depth to engage the hand motor area may be more superficial (Osada et al., 2022; Siebner et al., 2022).

One might further argue that the TMS hotspot provides insufficient anatomical precision to appropriately target the underlying hand muscle representation with TUS. The motor hotspot may not precisely overly the cortical representation of the assessed muscle due to the increased coil-cortex distance introduced by the TUS transducer. This distance, and the larger TMS coils required to evoke consistent MEPs, results in a broad electric field that is substantially larger than the TUS beam width (e.g., 6 mm for 250 kHz; Fomenko et al., 2020; Legon, Bansal, et al., 2018). Thus, it is possible that a transducer aligned with the center of the TMS coil may not be adequate. Nevertheless, we note that previous work utilizing a similar targeting approach has effectively induced changes in corticospinal motor excitability (Zeng et al., 2022). We also note that our stimulation depth and targeting procedures were comparable to all prior TUS-TMS studies, and that our simulations confirmed targeting (Fig. 3A, Supplementary Fig. 4). In summary, our main finding that the auditory confound drove motor inhibition in the present study, and likely had an impact in previous studies, holds true.

Considerations and future directions

Crucially, our results do not provide evidence that TUS is globally ineffective at inducing neuromodulation. While the present study and prior research highlight the confounding role of indirect auditory stimulation during pulsed TUS, there remains strong evidence for the efficacy of ultrasonic stimulation in animal work when auditory confounds are accounted for (Mohammadjavadi et al., 2019), or in controlled in-vitro systems such as an isolated retina, brain slices, or neuronal cultures in which the auditory confound carries no influence (Menz et al., 2013; Tyler et al., 2018).

It follows that where an auditory confound could be expected, appropriate control conditions are critical. These controls could involve stimulating a control region, and/or including a matched sound-only sham. In parallel, or perhaps alternatively, the impact of this confound can be mitigated in several ways. First, we recommend that the influence of auditory components be considered in transducer design and selection. Second, masking the auditory confound can help to blind participants to experimental conditions. Titrating auditory mask quality by participant to account for intra- and inter-individual differences in subjective perception of the auditory confound would be beneficial. However, this auditory stimulation might still influence cognitive task performance, among other measures. Third, the ultrasonic pulse and pulse train can be modulated, or ramped, to minimize or even eliminate the auditory confound. This approach still requires validation and will only be relevant for protocols with pulses of sufficient duration. Here, one can expect that the experimental control required to account for auditory confounds might also hold for alternative peripheral effects, such as somatosensory confounds. Longer pulse durations are common in offline rTUS paradigms (Zeng et al., 2022), with more opportunity for inaudible pulse shaping and the added benefit of separating the time of stimulation from that of measurement. However, appropriate control conditions remain central to make inferences on interventional specificity.

Conclusion

Transcranial ultrasonic stimulation is rapidly gaining traction as a promising neuromodulatory technique in humans. For TUS to reach its full potential we must identify robust and effective stimulation protocols. Here, we demonstrate that one of the most reliable findings in the human literature – online motor cortical inhibition during a 1000 Hz pulsed protocol – primarily stems from an auditory confound rather than direct neuromodulation. Instead of driving overall inhibition, we found preliminary evidence that TUS introduces noise to ongoing neural dynamics. Future research must carefully account for peripheral confounding factors to isolate direct neuromodulatory effects of TUS, thereby enabling the swift and successful implementation of this technology in both research and clinical settings.

Data availability

Data and code required to achieve the reported results will be made available for all four experiments pending peer review.

CRediT authorship contribution statement

Contribution diagram.

This figure depicts the involvement of each author using the CRediT taxonomy (Brand et al., 2015) and categorizes their contributions according to three levels represented by color: ‘none’ (gray), ‘substantial contribution’ (light green), ‘leading contribution’ (dark green).

Declaration of competing interest

Umair Hassan is the head of software development at sync2brain GmbH, where the bossdevice used in Experiment IV was developed. All other authors declare that no competing interests exist.

Acknowledgements

Experiments I & II were supported by the Dutch Research Council (NWO), awarding VIDI fellowships to L.V. (18919) and H.E.M.d.O. (452-17-016), and the Topsector programme Holland High Tech (HiTMaT-38H3). We thank Brittany van Beek for assistance with data acquisition and administration, Sarmad Peymaei for contributions to technical piloting, Norbert Hermesdorf for developing the 3D- printed TUS-TMS clamp, and Gerard van Oijen and Pascal de Water for their technical support. Experiment III was funded by the Canadian Institutes of Health Research (FDN 154292, ENG 173742) and the Natural Science and Engineering Research Council of Canada (RGPIN-2020-04176, RTI-2020- 0024). Experiment IV and T.N. were supported by a grant from the Boehringer Ingelheim Foundation to T.O.B.

Appendix A. Supplementary information

Supplementary tables and figures can be found in Appendix A.