Abstract
Summary
Our movements result in predictable sensory feedback that is often multimodal. Based on deviations between predictions and actual sensory input, primary sensory areas of cortex have been shown to compute sensorimotor prediction errors. How prediction errors in one sensory modality influence the computation of prediction errors in another modality is still unclear. To investigate multimodal prediction errors in mouse auditory cortex (ACx), we used a virtual environment to experimentally couple running to both self-generated auditory and visual feedback. Using two-photon microscopy, we first characterized responses of layer 2/3 (L2/3) neurons to sounds, visual stimuli, and running onsets and found responses to all three stimuli. Probing responses evoked by audiomotor mismatches, we found that they closely resemble visuomotor mismatch responses in visual cortex (V1). Finally, testing for cross modal influence on audiomotor mismatch responses by coupling both sound amplitude and visual flow speed to the speed of running, we found that audiomotor mismatch responses were amplified when paired with concurrent visuomotor mismatches. Our results demonstrate that multimodal and non-hierarchical interactions shape prediction error responses in cortical L2/3.
Introduction
Neuronal responses consistent with prediction errors have been described in a variety of different cortical areas (Audette and Schneider, 2023; Ayaz et al., 2019; Han and Helmchen, 2023; Heindorf et al., 2018; Keller et al., 2012; Liu and Kanold, 2022) and across different species (Eliades and Wang, 2008; Keller and Hahnloser, 2009; Stanley and Miall, 2007). In V1, these responses are thought to be learned with experience (Attinger et al., 2017) and depend on local plasticity in cortex (Widmer et al., 2022). Prediction errors signal the unanticipated appearance or absence of a sensory input, and are thought to be computed as a deviation between top-down predictions and bottom-up sensory inputs (Rao and Ballard, 1999). One type of top-down signals that conveys a prediction of sensory input are motor-related signals (Leinweber et al., 2017). In auditory cortex, motor-related signals can modulate responses to self-generated vocalizations (Eliades and Wang, 2008) or to sounds coupled to locomotion (Schneider et al., 2018). Auditory cortex is thought to use these motor-related signals to compute audiomotor prediction error responses (Audette et al., 2022; Audette and Schneider, 2023; Eliades and Wang, 2008; Keller and Hahnloser, 2009; Liu and Kanold, 2022). These audiomotor prediction errors can be described in a hierarchical variant of predictive processing, in which top-down motor-related signals function as predictions of bottom-up sensory input. While parts of both auditory and visual processing streams are well described by a hierarchy, the cortical network as a whole does not easily map onto a hierarchical architecture, anatomically (Markov et al., 2013) or functionally (St-Yves et al., 2023; Suzuki et al., 2023), in a non-trivial way. One of the connections that does not neatly fit into a hierarchical model is the surprisingly dense reciprocal connection between ACx and V1 (Clavagnier et al., 2004; Falchier et al., 2002; Ibrahim et al., 2016; Leinweber et al., 2017; Zhao et al., 2022). From ACx to V1 this connection conveys a prediction of visual input given sound (Garner and Keller, 2022). What the reciprocal projection from V1 to ACx conveys, is still unclear. In proposals for hierarchical implementations of predictive processing there are no such lateral connections, and there is no reason to assume prediction error computations in different modalities should directly interact at the level of primary sensory areas. Thus, we argued that the lateral interaction between V1 and ACx is a good starting point to investigate how non-hierarchical interactions are involved in the computation of prediction errors, and how multimodal interactions shape sensorimotor prediction errors.
Based on this idea, we designed an experiment in which we could couple and transiently decouple running speed in a virtual environment to both self-generated auditory feedback and self-generated visual flow feedback. While doing this, we recorded activity in L2/3 neurons of ACx using two-photon calcium imaging. Using this approach, we first confirmed that a substantial subset of L2/3 neurons in ACx responds to either auditory, visual (Sharma et al., 2021), or motor-related inputs (Henschke et al., 2021; Morandell et al., 2023; Vivaldo et al., 2023). While we found that L2/3 neurons in ACx responded to audiomotor mismatches in a way that closely resembles visuomotor mismatch responses found in V1 (Keller et al., 2012), we found no evidence of responses to visuomotor mismatch in ACx. However, when coupling both visual flow and auditory feedback to running, we found that L2/3 neurons in ACx non-linearly combine information about visuomotor and audiomotor mismatches. Overall, our results demonstrate that prediction errors can be potentiated by multimodal interactions in primary sensory cortices.
Results
Auditory, visual, and motor-related signals were intermixed in L2/3 of ACx
To investigate auditory, visual, and motor-related signals in mouse ACx, we combined an audiovisual virtual reality system with two-photon calcium imaging in L2/3 ACx neurons (Figure 1A). We used an adeno-associated viral (AAV) vector to express a genetically encoded calcium indicator (AAV2/1-EF1α-GCaMP6f-WPRE) in ACx (Figures 1B-D). Following recovery from surgery, mice were habituated to the virtual reality setup (Figure 1C). We first mapped the location of the primary auditory cortex (A1) and the anterior auditory field (AAF) using widefield calcium imaging (Figure 1C and Figure 1E). Based on these maps, we then chose recording locations for two-photon imaging of L2/3 neurons in either A1 or AAF. For the purposes of this work, we did not distinguish between A1 and AAF and will refer to these two areas here as ACx. To characterize basic sensory and motor-related responses, we recorded neuronal responses to pure tones, full-field moving gratings, and running onsets. We first assessed population responses of L2/3 neurons evoked by sounds (pure tones presented at 4 kHz, 8 kHz, 16 kHz, or 32 kHz at either 60 dB or 75 dB sound pressure level (SPL); Figure 1F). While pure tones resulted in both increases or decreases in calcium activity in individual neurons (Figure 1G), the average population response exhibited a significant decrease in activity (Figure 1H). Next, we analyzed visual responses evoked by full-field drifting gratings (see Methods; Figure 1I). Visual stimulation resulted in a diverse response across the population of L2/3 neurons (Figure 1J) that was initially positive at the population level (Figure 1K). To quantify motor-related inputs, we analyzed activity during running onsets (Figure 1L). We found that the majority of neurons increased their activity during running onsets (Figure 1M), which was also reflected in a significant positive response on the population level (Figure 1N). Finally, we investigated how running modulates auditory and visual responses in ACx. In V1, running strongly increases responses to visual stimuli (Niell and Stryker, 2010), while in auditory cortex running has been shown to modulate auditory responses in a variety of different ways (Audette et al., 2022; Bigelow et al., 2019; Henschke et al., 2021; McGinley et al., 2015; Morandell et al., 2023; Schneider et al., 2014; Vivaldo et al., 2023; Yavorska and Wehr, 2021; Zhou et al., 2014). Separating auditory responses by running state, we found that sound evoked responses of ACx neurons were overall similar during sitting and running, but exhibited a smaller decrease in activity when the mouse was sitting (Figure S1A). Visual responses also appeared overall similar but with a small increase in strength during running (Figure S1B), similar to the running modulation effect observed on visual responses in V1. Thus, running appears to moderately and differentially modulate auditory and visual responses in L2/3 ACx neurons. Consistent with previous work, these results demonstrate that auditory, visual, and motor-related signals are all present in ACx and that running modulation influences L2/3 ACx neurons differently than in V1.
L2/3 neurons in ACx responded to audiomotor mismatch
To test whether auditory, visual, and motor-related signals are integrated in L2/3 neurons of ACx to compute prediction errors, we first probed for responses to audiomotor (AM) mismatches. A mismatch in this context, is the absence of a sensory input that the brain predicts to receive from the environment, and thus a specific type of negative prediction error. We experimentally generated a coupling between movement and sensory feedback and then used movement as a proxy for what the mouse predicts to receive as sensory feedback. To do this with an auditory stimulus, we coupled the sound amplitude of an 8 kHz pure tone to the running speed of the mouse on the spherical treadmill such that sound amplitude was proportional to locomotion speed (Figures 2A and 2B). In this paradigm, a running speed of 0 corresponded to a sound amplitude of 0, while 30 cm/s running speed corresponded to a sound amplitude of 60 dB SPL. We refer to this type of session as closed loop. We then introduced AM mismatches by setting the sound amplitude to 0 for 1 s at random times (on average every 15 s). An alternative approach to introduce AM mismatches would have been to clamp the sound amplitude to a constant value. However, based on the analogy between sound amplitude and visual flow speed in visuomotor (VM) mismatch paradigms, where we induce VM mismatch by setting visual flow speed to 0 (Keller et al., 2012), we chose the former. We found that AM mismatch resulted in a strong population response (Figure 2C and Figure 2D). Interestingly, this response was already apparent in the first closed loop session with audiomotor coupling that the mice ever experienced, suggesting that this coupling is learned very rapidly (Figure S2A). To test whether AM mismatch responses can be explained by a sound offset response, we performed recordings in open loop sessions that consisted of a replay of the sound profile the mouse had self-generated in the preceding closed loop session. Mice were free to run during this session and did so at similar levels as during the closed loop session (Figure S2B). The average response to the playback of sound halt during the open loop session was significantly less strong than the average response to AM mismatch (Figure 2D), but in contrast to visual playback halt responses in V1 (Vasilevskaya et al., 2023), we found no evidence of a running modulation of the response to the playback halt (Figure S2C). Thus, L2/3 neurons in ACx respond to AM mismatch in a way similar to how the L2/3 neurons in V1 respond to visuomotor (VM) mismatch.
Assuming that AM mismatch responses are computed as a difference between an excitatory motor-related prediction and an auditory stimulus driven inhibition, we would expect the neurons with high AM mismatch responses to exhibit opposing influence of motor-related and auditory input. To test this, we selected the 5% of neurons with the strongest responses to AM mismatch and quantified the responses of these neurons to sound stimulation and running onsets. Consistent with a model of a subtractive computation of prediction errors, we found that AM mismatch neurons exhibited a strong reduction in activity in response to sound stimulation and an increase of activity on running onsets (Figure 2E). Given that mismatch responses are likely enriched in the superficial part of L2/3 (O’Toole et al., 2023), and that in our two-photon imaging experiments we also preferentially recorded from more superficial neurons, we suspect that our population is enriched for mismatch neurons. Consistent with this interpretation, we observed a strong population response to AM mismatches (Figures 2C and 2D) and a decrease in population activity in response to sound stimulation (Figure 1H). Nevertheless, sound evoked responses were significantly more negative in neurons strongly responsive to AM mismatch, than for the remainder of the L2/3 neuronal population (Figure 2F). This effect was similar when we used different thresholds for the selection of AM mismatch neurons (10% or 20% of neurons with the strongest response to AM mismatch; Figure S3). Consistent with a sound driven reduction of activity and running related increase of activity in AM mismatch neurons, the correlation of calcium activity of AM mismatch neurons was predominantly negative with sound amplitude and positive with running speed in open loop sessions (Figure 2G). This again resembles the properties of VM mismatch neurons in V1 (Attinger et al., 2017). If AM mismatch responses are computed as a difference between a locomotion driven excitation and a sound driven inhibition, we could also expect to find a correlation between the strength of mismatch response and the strength of sound playback halt responses. Even in the absence of locomotion driven excitation, a relief from sound driven inhibition could trigger an increase in calcium activity. When comparing AM mismatch responses with playback sound halt responses for all neurons, we do indeed find a positive correlation between the two (Figure 2H). Overall, these results suggest that the implementation of sensorimotor prediction error computation generalizes beyond V1 to other primary cortices and might be a canonical cortical computation in L2/3.
We found no evidence of visuomotor mismatch responses in L2/3 of ACx
Visuomotor mismatch responses are likely calculated in V1 (Jordan and Keller, 2020), and spread across dorsal cortex from there (Heindorf and Keller, 2023). To investigate multimodal mismatch responses, we first quantified the strength of these VM mismatch responses, which are independent of auditory input, in ACx. In these experiments, the running speed of the mouse was coupled to the visual flow speed in a virtual corridor, but not to any sound feedback (Figures 3A and 3B). We introduced VM mismatches by halting visual flow for 1 s at random times while the mice were running, as previously described (Keller et al., 2012; Zmarz and Keller, 2016). To control for visual responses independent of visuomotor coupling, we used an open loop replay of the visual flow generated in the previous session (see Methods). We found that neither VM mismatches nor visual flow playback halts, which the mouse experienced in open loop sessions, resulted in a measurable population response in ACx (Figures 3C and 3D). Selecting the 5% of neurons with the strongest responses to VM mismatches and quantifying their responses to grating presentations and running onsets, we found that these neurons exhibited positive responses to running onset and no significant response to grating stimuli (Figure 3E). These responses were not different from the population responses of the remainder of the neurons (Figure 3F). Quantifying the correlation of calcium activity with visual flow speed and running speed in the open loop session, we found that VM mismatch responsive neurons exhibited a distribution not different from chance (Figure 3G). We also found no evidence of a correlation between VM mismatch responses and playback halt responses (Figure 3H). Thus, while there may be a small subset of VM mismatch responsive neurons in L2/3 of ACx, we find no evidence of a VM mismatch response at the level of the L2/3 population.
Mismatch responses were potentiated by multimodal interactions
Finally, we explored how multimodal coupling of both auditory and visual feedback to running speed influenced mismatch responses in L2/3 of ACx. To do this, we coupled both sound amplitude and visual flow speed to the running speed of the mouse in an audiovisual virtual environment (Figures 4A and 4B). We then introduced mismatch events by halting both sound and visual flow for 1 s to trigger a concurrent audiomotor and visuomotor [AM + VM] mismatch (Figure 4B). The nomenclature here is such that the first letter in the pair denotes the sensory input that is being predicted, while the second letter denotes the putative predictor – the square brackets are used to denote that the two events happen concurrently. By putative predictor, we mean an information source available to the mouse that would, in principle, allow it to predict another input, given the current experimental environment. Thus, in the case of a [AM + VM] mismatch both the halted visual flow and the halted sound amplitude are predicted by running speed. The [AM + VM] mismatch resulted in a significant response on the population level (Figure 4C). The concurrent experience of mismatch between multiple modalities could simply be the result of a linear combination of the responses to the different mismatch stimuli or could be the result of a non-linear combination. To test whether we find evidence of a non-linear combination of mismatch responses, we compared the [AM+VM] mismatch to [AM] and [VM] mismatch events presented alone. We found that the presentation of a [AM+VM] mismatch led to a significantly larger response than either an [AM] or a [VM] mismatch in isolation (Figure 4D). To test whether the linear summation of [AM] + [VM] mismatch responses could explain the response to the concurrent presentation [AM+VM], we compared the two directly, and found that the concurrent presentation [AM+VM] elicited a significantly larger response than the linear sum of [AM] + [VM] mismatch responses (Figures 4E and S4). Plotting the [AM+VM] mismatch responses against the linear sum of the [AM] + [VM] mismatch responses for each neuron, we found that while there is some correlation between the two, there is a subset of neurons (13.7%; red dots, Figure 4F) that selectively respond to the concurrent [AM+VM] mismatch, while a different subset of neurons (11.2%; orange dots, Figure 4F) selectively responds to the mismatch responses in isolation. This demonstrates that mismatch responses in different modalities can interact non-linearly.
Discussion
Consistent with previous reports, we found that auditory, visual, and motor-related signals are intermixed in the population of L2/3 neurons in ACx. Responses to both motor-related (McGinley et al., 2015; Schneider et al., 2014; Vivaldo et al., 2023; Yavorska and Wehr, 2021; Zhou et al., 2014) and visual signals (Bigelow et al., 2022; Morrill and Hasenstaub, 2018; Sharma et al., 2021) have been reported across layers in ACx, with the strongest running modulation effect found in L2/3 (Schneider et al., 2014). Also, consistent with previous reports, we found that a subset of L2/3 neurons in ACx respond to audiomotor prediction errors (Audette et al., 2022; Liu and Kanold, 2022). In V1, it has been demonstrated that neurons signaling prediction errors exhibit opposing influence of bottom-up visual and top-down motor-related inputs. This has been speculated to be the consequence of a subtractive computation of prediction errors (Jordan and Keller, 2020; Keller et al., 2012; Leinweber et al., 2017). Our findings now reveal a similar pattern of opposing influence in prediction error neurons in primary ACx that exhibit a positive correlation with motor-related input and a negative correlation with auditory input (Figure 2G). This would be consistent with the idea that both visuomotor and audiomotor prediction errors are computed as a subtractive difference between bottom-up and top-down inputs. Based on this, it is conceivable that this type of computation extends also beyond primary sensory areas of cortex and may be a more general computational principle implemented in L2/3 of cortex.
Finally, we found that concurrent prediction errors in multiple modalities result in an increase in prediction error response that exceeds a linear combination of the prediction error responses in single modalities (Figure 4E), with a subset of neurons selectively responding only to the combination of prediction error responses (Figure 4F). A similar non-linear relationship has been described between auditory and visual oddball responses in both ACx and V1 (Shiramatsu et al., 2021). At this point, it should be kept in mind that deviations from linearity in terms of spiking responses are difficult to assess using calcium imaging data. However, given that the difference between the concurrent presentation and the linear sum of the two individual mismatch responses was approximately a factor of two (Figure 4E), and the fact that we found a population of neurons that responds selectively to the concurrent presentation of both mismatches, we suspect that also the underlying spiking responses are non-linear. What are the mechanisms that could underlie this interaction? Neurons in ACx have access to information about VM mismatches from at least two sources. In widefield calcium imaging, VM mismatch responses are detectable across most of dorsal cortex (Heindorf and Keller, 2023). Thus, VM mismatch responses could be present in long-range cortico-cortical axons from V1, or possibly in L4, L5, or L6 neurons in ACx. Alternative sources of VM mismatch input are neuromodulatory signals. Locus coeruleus, for example, drives noradrenergic signals in response to VM mismatches across the entire dorsal cortex (Jordan and Keller, 2023). However, given that noradrenergic signals only weakly modulate responses in L2/3 neurons in V1 (Jordan and Keller, 2023), it is unclear if the broadcasted noradrenergic signals could non-linearly potentiate the AM mismatch responses of ACx neurons. We speculate that cholinergic signals are also unlikely to contribute to this effect. In V1 there are no cholinergic responses to VM mismatch (Yogesh and Keller, 2023). However, given that ACx and V1 receive cholinergic innervation from different sources (Kim et al., 2016), we cannot rule out the possibility that cholinergic signals in ACx respond to VM mismatch. Nevertheless, given that the [AM+VM] mismatch responses do not simply appear to be an amplified variant of the [AM] mismatch responses (Figure 4F), we speculate that [AM+VM] mismatch responses are primarily driven by long-range cortico-cortical input from V1 that interacts with a local computation of [AM] mismatch responses in the L2/3 ACx circuit.
Lateral interactions in the computation of prediction errors between sensory streams are not accounted for by hierarchical variants of predictive processing. In these hierarchical variants, prediction errors are computed as a comparison between top-down and bottom-up inputs (Rao and Ballard, 1999). To explain the lateral interactions between prediction errors likely computed in ACx (AM mismatch responses) and prediction errors likely computed in V1 (VM mismatch responses) that we describe here, we will need new variants of predictive processing models that include lateral and non-hierarchical interactions. Thus, our results demonstrate that mismatch responses in different modalities interact non-linearly and can potentiate each other. The circuit mechanisms that underlie this form of multimodal integration of mismatch responses are still unclear and will require further investigation. However, we would argue that the relatively strong multimodal interaction demonstrates that unimodal and hierarchical variants of predictive processing are insufficient to explain cortical mismatch responses - if predictive processing aims to be a general theory of cortical function, we will need to explore non-hierarchical variants of predictive processing.
Supplementary figures
Methods
Mice and surgery
All animal procedures were approved by and carried out in accordance with guidelines of the Veterinary Department of the Canton Basel-Stadt, Switzerland. C57BL/6 female mice (Charles River), between the ages of 7 and 12 weeks were used in this study. For cranial window implantation, mice were anesthetized using a mixture of fentanyl (0.05 mg/kg), medetomidine (0.5 mg/kg), and midazolam (5 mg/kg). Analgesics were applied perioperatively. Lidocaine was injected subcutaneously into the scalp (10 mg/kg s.c.) prior to the surgery. Mice underwent a cranial window implantation surgery at an age of between 7 and 8 weeks. First, a custom-made titanium head-plate was attached to the skull (right hemisphere) with dental cement (Heraeus Kulzer). Next, a 3 mm craniotomy was made over left ACx (4.2 mm to 4.4 mm lateral from the midline and 2.6 mm to 2.8 mm posterior from bregma) followed by 4 to 6 injections of approximately 200 nl each of the AAV vector: AAV2/1-EF1α-GCaMP6f-WPRE (1013- 14 GC/ml). A circular glass cover slip was glued (Ultragel, Pattex) in place to seal the craniotomy. Metacam (5 mg/kg, s.c.) and buprenorphine (0.1 mg/kg s.c.) were injected intraperitoneally for 2 days after completion of the surgery. Mice were returned to their home cage and group housed for 10 days prior to the first experiments.
Virtual reality environment
All recordings were done with mice head-fixed in a virtual reality system, as described previously (Leinweber et al., 2014). Mice were free to run on an air-supported polystyrene ball. Three types of closed loop conditions were used for the experiments. The rotation of the spherical treadmill was either coupled 1: to the sound amplitude of an 8 kHz pure tone (audiomotor coupling), while the animal was locomoting in darkness, 2: to the movement in a virtual corridor (visuomotor coupling), or 3: to both the sound amplitude of an 8 kHz pure tone and the movement in a virtual corridor (audio-visuo-motor coupling). For audiomotor coupling, we used the running speed of the mouse to control the SPL of an 8 kHz pure tone presented to the mouse through a loudspeaker (see section auditory stimulation). This closed loop coupling was not instantaneous but exhibited a delay of 260 ms ± 60 ms (mean ± STD). For visuomotor coupling, the running speed of the mouse was coupled to the visual flow speed in the virtual environment projected onto a toroidal screen surrounding the mouse using a Samsung SP-F10M projector synchronized to the turnaround times of the resonant scanner of the two-photon microscope. The delay in the visuomotor closed loop coupling was 90 ms ± 10 ms (mean ± STD). From the point of view of the mouse, the screen covered a visual field of approximately 240 degrees horizontally and 100 degrees vertically. The virtual environment presented on the screen was a corridor tunnel with walls consisting of vertical sinusoidal gratings. Prior to the recording experiments, mice were habituated in darkness to the setup in 1 to 2-hour long sessions for up to 5 days, until they displayed regular locomotion. Closed loop sessions were followed by open loop sessions, in which rotation of the spherical treadmill was decoupled from both the sound amplitude and the movement in the virtual corridor. During these open loop sessions, we replayed the amplitude modulated sound or the visual flow recorded in the previous closed loop session.
Auditory stimulation
Sounds were generated with a 16-bit digital-to-analog converter (PCI6738, National Instruments) using custom scripts written in LabVIEW (LabVIEW 2020, National Instruments) at 160 kHz sampling rate, amplified (SA1, Tucker Davis Technologies, FL, USA) and played through an MF1 speaker (Tucker Davis Technologies. FL, USA) positioned 10 cm from the mouse’s right ear. Stimuli were calibrated with a wide-band ultrasonic acoustic sensor (Model 378C01, PCB Piezotronics, NY, USA). To study sound-evoked responses, we used 4 kHz, 8 kHz, 16 kHz, and 32 kHz pure tones played at 60 dB and 75 dB SPL (1 s duration, at a randomized inter-stimulus interval 4 s ± 1 s, 10 repetitions, 1 ms on and off-ramp, in a randomized order). For audiomotor coupling experiments, we used an 8 kHz pure tone with a sound amplitude that varied between 40 dB and 75 dB SPL.
Visual stimulation
For visual stimulation, we used full-field sinusoidal drifting grating (0 degrees, 45 degrees, 90 degrees, 270 degrees, moving in either direction) in a pseudo-random sequence, each presented for a duration of 6 s ± 2 s, with between 2 and 7 repetitions, with a randomized inter-stimulus interval of 4.5 s ± 1.5 s during which a gray screen was displayed.
Running onsets
Running onsets were defined as the running speed crossing a threshold of 3 cm/s, where the average speed in the previous 3 s was below 1.8 cm/s. To separate trials with AM mismatch, VM mismatch, auditory stimulus and grating stimulus based on locomotion state into those running and those while sitting, we used threshold of 0.3 cm/s in a 1 s window preceding the stimulus onset.
Widefield calcium imaging
To establish a reference tonotopic map of A1 and AAF (Figure 1E), we performed widefield fluorescence imaging experiments on a custom-built microscope consisting of objectives mounted face-to-face (Nikon 85 mm/f1.8 sample side, Nikon 50 mm/f1.4 sensor side), as previously described (Heindorf and Keller, 2023). Blue illumination was provided by a light-emitting diode (470 nm, Thorlabs) and passed through an excitation filter (SP490, Thorlabs). Green fluorescence emission was filtered with a 525/50 bandpass filter. Images were acquired at a frame rate of 100 Hz on a sCMOS camera (PCO edge 4.2). The raw images were cropped on-sensor, and the resulting data was saved to disk with custom-written software in LabVIEW (National Instruments).
Two-photon imaging
Calcium imaging of L2/3 neurons in A1 and AAF was performed using a modified Thorlabs Bergamo II microscope with a 16x, 0.8 NA objective (Nikon N16XLWD-PF), as previously described (Leinweber et al., 2014). To record in left ACx, the microscope was tilted 45 degrees to the left. The excitation light source was a tunable, femtosecond-pulsed laser (Insight, Spectra Physics or Chameleon, Coherent) tuned to 930 nm. The laser power was adjusted to 30 mW. A 12 kHz resonance scanner (Cambridge Technology) was used for line scanning, and we acquired 400 lines per frame. This resulted in a frame rate of 60 Hz at a resolution of 400 × 750 pixels. We used a piezo-electric linear actuator (Physik Instrumente, P-726) to record from imaging planes at four different cortical depths, separated by 15 μm. This reduced the effective frame rate per layer to 15 Hz. The emission light was bandpass filtered using a 525/50 nm filter (Semrock), and signals were detected with a photomultiplier (Hamamatsu, H7422), amplified (Femto, DHCPCA-100), digitized at 800 MHz (National Instruments, NI5772), and bandpass filtered at 80 MHz with a digital Fourier-transform filter on a field-programmable gate array (National Instruments, PXIe-7965). Recording locations were visually registered against the reference images acquired with widefield imaging previously using blood vessels patterns.
Widefield image analysis
Off-line data processing and data analysis were done with custom-written MATLAB scripts. Slow drifts in the fluorescence signal were removed using 8th percentile filtering with a 62.5 s moving window, similar to what was used for two-photon imaging data (Dombeck et al., 2007). Activity was calculated as the ΔF/F0, where F0 was the median fluorescence over the entire recording session. For stimulus responses, we use a response window of 0.2 s to 1.2 s following stimulus onset and a baseline window of −1 s to 0 s before stimulus onset. The pixels with the strongest response (top 3% - 5% of response distribution), were used to mark the tonotopic areas corresponding to the different stimuli.
Two-photon image analysis
Calcium imaging data were processed as described previously. In brief, raw images were full-frame registered to correct for lateral brain motion. Neurons were selected manually based on mean and maximum fluorescence images. Average fluorescence per neuron over time was corrected for slow fluorescence drift using an 8th percentile filter and a 66 s (or 1000 frames) window (Dombeck et al., 2007; Keller et al., 2012; Leinweber et al., 2014) and divided by the median value over the entire trace to calculate ΔF/F0. All stimulus-response curves were baseline subtracted. The baseline subtraction window was −0.5 s to 0 s before stimulus onset. For quantification of responses during different onset types (auditory, visual, running, mismatch), ΔF/F was averaged over the response time window (0.5 s to 2.5 s after stimulus onset) and baseline subtracted (mean activity in a window preceding stimulus onset, −0.5 s to 0 s). Onsets which were not preceded by at least 2 s of baseline or not followed by at least 3 s of recording time, were excluded from the analysis. Sessions with less than two onsets were not included in the analysis. To quantify the difference in average calcium responses as a function of time, we used a hierarchical bootstrap test for every 5 frames of the calcium trace (333 ms) and marked comparisons where responses were different (p < 0.05). Mismatch responsive neurons were selected based on the absolute response strength over the response time window (0.5 s to 2.5 s). To infer spikes from calcium signals (Figure S4), we used CASCADE (Rupprecht et al., 2021).
Statistical tests
All statistical information for the tests performed in this manuscript is provided in Table S1. We used hierarchical bootstrapping (Saravanan et al., 2020) for statistical testing to account for the nested structure of the data (multiple neurons from one imaging site). We first resampled the data with replacement at the level of imaging sites, followed by resampling at the level of neurons. We then computed the mean responses across the resampled population and repeated this process 10 000 times. The probability of one group being different from the other was calculated as a fraction of bootstrap sample means which violated the tested hypothesis.
Key Resource Table
Acknowledgements
We thank Tingjia Lu for the production of viral vectors and all the members of the Keller lab for discussion and support. This project has received funding from the Swiss National Science Foundation (GBK), the Novartis Research Foundation (GBK), and the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 865617) (GBK).
Declaration of interests
The authors declare no competing financial interests.
References
- Visuomotor Coupling Shapes the Functional Development of Mouse Visual CortexCell 169:1291–1302https://doi.org/10.1016/j.cell.2017.05.023
- Stimulus-specific prediction error neurons in mouse auditory cortexhttps://doi.org/10.1101/2023.01.06.523032
- Precise movement-based predictions in the mouse auditory cortexCurr Biol 32:4925–4940https://doi.org/10.1016/j.cub.2022.09.064
- Layer-specific integration of locomotion and sensory information in mouse barrel cortexNature Communications 10:1–14https://doi.org/10.1038/s41467-019-10564-8
- Movement and VIP Interneuron Activation Differentially Modulate Encoding in Mouse Auditory CortexeNeuro 6https://doi.org/10.1523/ENEURO.0164-19.2019
- Visual modulation of firing and spectrotemporal receptive fields in mouse auditory cortexCurrent Research in Neurobiology 3https://doi.org/10.1016/j.crneur.2022.100040
- Long-distance feedback projections to area V1: Implications for multisensory integration, spatial awareness, and visual consciousness. Cognitive, Affective& Behavioral Neuroscience 4:117–126https://doi.org/10.3758/CABN.4.2.117
- Imaging large-scale neural activity with cellular resolution in awake, mobile miceNeuron 56:43–57https://doi.org/10.1016/j.neuron.2007.08.003
- Neural substrates of vocalization feedback monitoring in primate auditory cortexNature 453:1102–6https://doi.org/10.1038/nature06910
- Anatomical evidence of multimodal integration in primate striate cortexJ Neurosci 22:5749–5759https://doi.org/10.1523/JNEUROSCI.22-13-05749.2002
- A cortical circuit for audio-visual predictionsNat Neurosci 25:98–105https://doi.org/10.1038/s41593-021-00974-7
- Behavior-relevant top-down cross-modal predictions in mouse neocortexhttps://doi.org/10.1101/2023.04.03.535389
- Mouse Motor Cortex Coordinates the Behavioral Response to Unpredicted Sensory FeedbackNeuron 99:1040–1054https://doi.org/10.1016/j.neuron.2018.07.046
- Antipsychotic drugs selectively decorrelate long-range interactions in deep cortical layershttps://doi.org/10.1101/2022.01.31.478462
- Enhanced modulation of cell-type specific neuronal responses in mouse dorsal auditory field during locomotionCell Calcium 96https://doi.org/10.1016/j.ceca.2021.102390
- Cross-Modality Sharpening of Visual Cortical Processing through Layer-1-Mediated Inhibition and DisinhibitionNeuron 89:1031–1045https://doi.org/10.1016/j.neuron.2016.01.027
- The locus coeruleus broadcasts prediction errors across the cortex to promote sensorimotor plasticityeLife 12https://doi.org/10.7554/eLife.85111
- Opposing Influence of Top-down and Bottom-up Input on Excitatory Layer 2/3 Neurons in Mouse Primary Visual CortexNeuron 108:1194–1206https://doi.org/10.1016/j.neuron.2020.09.024
- Sensorimotor Mismatch Signals in Primary Visual Cortex of the Behaving MouseNeuron 74:809–815https://doi.org/10.1016/j.neuron.2012.03.040
- Neural processing of auditory feedback during vocal practice in a songbirdNature 457:187–90https://doi.org/10.1038/nature07467
- Selectivity of Neuromodulatory Projections from the Basal Forebrain and Locus Ceruleus to Primary Sensory CorticesJ. Neurosci 36:5314–5327https://doi.org/10.1523/JNEUROSCI.4333-15.2016
- A Sensorimotor Circuit in Mouse Cortex for Visual Flow PredictionsNeuron 95:1420–1432https://doi.org/10.1016/j.neuron.2017.08.036
- Two-photon calcium imaging in mice navigating a virtual reality environmentJournal of visualized experiments : JoVE e 50885
- Interactive auditory task reveals complex sensory-action integration in mouse primary auditory cortexhttps://doi.org/10.1101/2022.12.12.520155
- Cortical high-density counterstream architecturesScience. American Association for the Advancement of Science https://doi.org/10.1126/science.1238406
- Cortical Membrane Potential Signature of Optimal States for Sensory Signal DetectionNeuron 87:179–192https://doi.org/10.1016/j.neuron.2015.05.038
- Movement-related modulation in mouse auditory cortex is widespread yet locally diversebioRxiv https://doi.org/10.1101/2023.07.03.547560
- Visual Information Present in Infragranular Layers of Mouse Auditory CortexJ Neurosci 38:2854–2862https://doi.org/10.1523/JNEUROSCI.3102-17.2018
- Modulation of visual responses by behavioral state in mouse visual cortexNeuron 65:472–9https://doi.org/10.1016/j.neuron.2010.01.033
- Molecularly targetable cell types in mouse visual cortex have distinguishable prediction error responsesNeuron 111:2918–2928https://doi.org/10.1016/j.neuron.2023.08.015
- Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effectsNature Neuroscience 2:79–87https://doi.org/10.1038/4580
- A database and deep learning toolbox for noise-optimized, generalized spike inference from calcium imagingNat Neurosci 24:1324–1337https://doi.org/10.1038/s41593-021-00895-5
- Application of the hierarchical bootstrap to multi-level data in neuroscienceNeuron Behav Data Anal Theory
- A synaptic and circuit basis for corollary discharge in the auditory cortexNature 513:189–194https://doi.org/10.1038/nature13724
- A cortical filter that learns to suppress the acoustic consequences of movementNature 561:391–395https://doi.org/10.1038/s41586-018-0520-5
- Modulation of auditory responses by visual inputs in the mouse auditory cortexhttps://doi.org/10.1101/2021.01.22.427870
- Auditory, Visual, and Cross-Modal Mismatch Negativities in the Rat Auditory and Visual CorticesFront Hum Neurosci 15https://doi.org/10.3389/fnhum.2021.721476
- Functional activation in parieto-premotor and visual areas dependent on congruency between hand movement and visual stimuli during motor-visual primingNeuroImage 34:290–9https://doi.org/10.1016/j.neuroimage.2006.08.043
- Brain-optimized deep neural network models of human visual areas learn non-hierarchical representationsNat Commun 14https://doi.org/10.1038/s41467-023-38674-4
- How deep is the brain? The shallow brain hypothesisNat. Rev. Neurosci :1–14https://doi.org/10.1038/s41583-023-00756-z
- Locomotion-induced gain of visual responses cannot explain visuomotor mismatch responses in layer 2/3 of primary visual cortexCell Rep 42https://doi.org/10.1016/j.celrep.2023.112096
- Auditory cortex ensembles jointly encode sound and locomotion speed to support sound perception during movementPLOS Biology 21https://doi.org/10.1371/journal.pbio.3002277
- NMDA receptors in visual cortex are necessary for normal visuomotor integration and skill learningeLife 11https://doi.org/10.7554/eLife.71476
- Effects of Locomotion in Auditory Cortex Are Not Mediated by the VIP NetworkFront Neural Circuits 15https://doi.org/10.3389/fncir.2021.618881
- Cholinergic input to mouse visual cortex signals a movement state and acutely enhances layer 5 responsivenesseLife 12https://doi.org/10.7554/eLife.89986
- Whole-Brain Direct Inputs to and Axonal Projections from Excitatory and Inhibitory Neurons in the Mouse Primary Auditory AreaNeurosci Bull 38:576–590https://doi.org/10.1007/s12264-022-00838-5
- Scaling down of balanced excitation and inhibition by active behavioral states in auditory cortexNat Neurosci 17:841–850https://doi.org/10.1038/nn.3701
- Mismatch Receptive Fields in Mouse Visual CortexNeuron 92:766–772https://doi.org/10.1016/j.neuron.2016.09.057
Article and author information
Author information
Version history
- Preprint posted:
- Sent for peer review:
- Reviewed Preprint version 1:
- Reviewed Preprint version 2:
Copyright
© 2024, Magdalena Solyga & Georg B. Keller
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.