Introduction

Sensory perception requires fast encoding of relevant stimuli from a mixture of complex signals. Sensory cortices play a vital role in such sensory processing. In the auditory domain, for example, small neuronal ensembles in the auditory cortex (AC) are actively engaged to efficiently perceive relevant acoustic information 1–4. The AC contains multiple ensembles of neurons that can be functionally identified, e.g., those formed by subsets of neurons preferring the same frequency, also referred to as co-tuned neurons 5–7. Repeated presentation of the same acoustic stimulus, e.g., a tone of the same frequency, leads to a stable percept, but in the AC different ensembles of neurons are activated together at each repeat, indicating a high trial-by-trial variability 5,8–10. Activation of these different subsets of co-tuned neurons at each presentation of a stimulus reflects a sparse encoding of sound stimuli. Such sparse representation of co-activated neurons enables efficient coding with reduced metabolic energy to process complex information 11–16. The sparse neuronal representation raises key questions of how activation of different ensembles leads to the same percept and how the overall activity within the cortical network is balanced across ensembles of co-tuned neurons. In particular, when a specific sound is present, a subset of co-tuned neurons will be activated, but not all co-tuned neurons 17. Given that the percept of a repeating stimulus is constant, we speculated that neural activity is balanced across co-tuned as well as non co-tuned ensembles. While neuronal ensembles constantly update their activities based on incoming information, how the activation of a particular sparse neuronal ensemble affects other neurons within the network to maintain the overall network balance for processing specific sensory information in vivo is largely unknown.
In vivo optogenetic stimulation studies in the visual cortex (VC) suggested that inhibitory processes play a role in balancing network activity. In particular, in vivo single-cell holographic stimulation on a group of target cells, which induced increased response amplitude, resulted in changes in the response amplitudes of neighboring non-target neurons in the primary visual cortex (V1) 18, with similarly tuned neurons’ activity being suppressed. Moreover, in vivo holographic optogenetic stimulation showed that visually-suppressed neurons had attenuated response amplitudes when holographic stimulation was given along with the visual stimulus presentation, which was not observed in visually activated neurons 19. This suggested that neurons exhibit supralinear-to-linear input-output (IO) functions in vivo, rather than threshold-linear IO functions observed in vitro. These studies suggest that inhibitory influence from additional neuronal activation in the VC seems to play a major role during in vivo sensory processing, likely to maintain the activity balance of the network by modulating activities of neighboring neurons that share a similar tuning property.

One major difference between VC and AC is that the frequency tuning of neurons in the AC is less spatially localized, especially in a superficial layer (Layer 2/3) 20. The local frequency preferences in the AC are diverse, thus neighboring neurons can show widely differing tuning properties 5. To test how activity in specific AC cells among an intermingled and spatially distributed co-tuned and non co-tuned cell population is balanced during auditory processing, we stimulated a small group of AC cells using in vivo holographic optogenetic stimulation 21,22 while imaging AC population activity using 2-photon Ca2+ imaging in awake mice. We further tested whether any activity changes induced by holographic stimulation persist, as recurrent cortical networks engage homeostatic plasticity to stabilize overall network activity levels 23,24. Stimulating small ensembles of co-tuned neurons together with the presentation of a pure tone in their preferred frequency increased their tone-evoked activity. Furthermore, we observed that non-stimulated co-tuned neurons decreased their tone-evoked activity. Non co-tuned ensembles did not exhibit such changes in tone-evoked responses, regardless of the pure tone frequency. Thus, the increased activity in the stimulation-targeted ensemble had caused a decrease in activity in the non-stimulated co-tuned ensembles, specifically when the stimulation-paired pure tone was their preferred frequency. Non-target co-tuned neurons exhibiting such effects were not necessarily neighboring the targeted cells, suggesting specific interactions between co-tuned but not co-located neurons. Lastly, the decreased activity in the non-stimulated co-tuned ensembles persisted in the subsequent imaging session, even in the absence of holographic stimulation. These results suggest that co-tuned ensembles form interacting overall networks that balance their activity.

Results

Optogenetic holographic stimulation increases activity in small ensembles in vivo

To sparsely manipulate neuronal ensembles, we used in vivo holographic stimulation. To achieve reliable and selective in vivo holographic optogenetic stimulation of small ensembles of neurons with single-cell precision, we generated an AAV co-expressing the red-shifted opsin rsChRmine and GCaMP8s (AAV9-hSyn-GCaMP8s-T2A-rsChRmine), as rsChRmine minimizes optical cross talk, reducing possible activation by the imaging laser (940 nm excitation wavelength) 25. Injecting AAV9-hSyn-GCaMP8s-T2A-rsChRmine into AC yielded cells expressing both GCaMP and opsin (Fig. 1A). We first tested the efficiency and reliability of holographic stimulation by targeting either a single cell or a small ensemble of five cells. For single-cell stimulation, we shifted the stimulation point away from the target cell position by 10, 20, and 30 μm along either the x-axis or y-axis of the field of view (FOV; n = 15 cells, 3 animals). Response amplitudes decayed rapidly as the stimulation point moved away from the original cell position, confirming reliable holographic stimulation at the single-cell level (∼15 μm diameter) (mixed-effect model, p < 0.05; Fig. 1B). Furthermore, the stimulation effect on target cells was specific to the targeted z-plane, showing no stimulation effect when the stimulation z-plane was off by 20 μm (Fig. S2). For 5-cell stimulation, a majority of cells reliably responded to photostimulation in vivo (5 mW/cell, 15 μm spiral, 30 revolutions, 6 s inter-stimulus interval (ISI), 5 trials) and exhibited robust Ca2+ responses (Fig. 1C, D; permutation test, all p < 0.05), comparable with responses to other opsins 26–29. Thus, in vivo holographic stimulation enables precise targeting and activation of groups of single neurons in AC.
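For readers who want to see how a permutation test of this kind can be set up, the following Python sketch compares per-trial response amplitudes with and without stimulation for one cell. It is only an illustration under stated assumptions: the function name, the per-trial input layout, the one-sided direction, and the add-one correction are hypothetical choices and are not taken from the analysis reported here. The default of 100 iterations mirrors the number of iterations mentioned in the figure captions.

```python
import numpy as np

def permutation_p_value(stim, spont, n_iter=100, seed=0):
    """One-sided permutation test for mean(stim) > mean(spont).

    stim, spont: 1-D arrays of per-trial dF/F amplitudes for one cell
    (hypothetical inputs; the exact test layout is an assumption).
    """
    rng = np.random.default_rng(seed)
    stim = np.asarray(stim, dtype=float)
    spont = np.asarray(spont, dtype=float)
    observed = stim.mean() - spont.mean()

    pooled = np.concatenate([stim, spont])
    n_stim = stim.size
    count = 0
    for _ in range(n_iter):
        # Shuffle the trial labels and recompute the mean difference.
        shuffled = rng.permutation(pooled)
        diff = shuffled[:n_stim].mean() - shuffled[n_stim:].mean()
        if diff >= observed:
            count += 1
    # Add-one correction so the estimate is never exactly zero.
    return (count + 1) / (n_iter + 1)
```

With only 100 iterations the smallest attainable p-value under this correction is 1/101, so a larger number of iterations would be needed to resolve very small p-values.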

Holographic optogenetic stimulation in AC and experimental procedure.

A: An example brain slice showing cells in AC expressing both GCaMP and opsin (AAV9-hSyn-GC8s-T2A-rsChRmine). B: An example field of view (FOV) where single-cell targeting precision was tested, and response traces to the holographic stimulation from an example cell. Stimulation was offset from the original position (red circle) to distance-shifted positions in 10 μm increments (gray dashed circles along the x-axis or y-axis of the FOV). Responses were greatest when the stimulation was performed at the original cell position (red solid line trace). Amplitudes decayed rapidly with the position shift (red dashed line traces). Gray error shades indicate SEM across trials. The inset error-bar plot on the right shows the grand average amplitude change per stimulation point across all tested cells (n = 15 cells, 3 animals). Error bars indicate SEM across cells. C: An example FOV showing a population of cells (left) and amplitude changes to 5-cell stimulation as a stimulation effect (ΔF/Fstim - ΔF/Fspont., right). Filled squares indicate individual cells. White circles indicate stimulation-targeted cells. D: (left) Proportion of stimulated cells that showed an increase in fluorescence following photostimulation. Error bars indicate SEM across FOVs. A horizontal dashed line indicates average permutation results (random permutation test on 100 iterations, p < 0.0001). (right) Grand average of the stimulation effect across imaging sessions. Error bars indicate SEM across FOVs. A horizontal dashed line indicates average permutation results (random permutation test on 100 iterations, p < 0.0001). E: Experimental procedure. A total of four consecutive imaging sessions were acquired: 1) a cell selection session to identify neurons selective for 16 kHz pure tones, 2) a baseline imaging session to acquire tone-evoked responses to either 16 or 54 kHz pure tones, 3) a stimulation session targeting five cells responsive to either 16 kHz or 54 kHz to examine the effect of stimulation synchronized to tone presentations, and 4) a post-stimulation session to examine whether stimulation-related changes in the network persist.

Optogenetic holographic stimulation increases sound evoked activity in A1 ensembles

Since repeated sound stimulation activates different ensembles while resulting in the same percept, we reasoned that ensembles interacted and speculated that increased activity in one ensemble would prevent or reduce activity in co-tuned ensembles. We thus next sought to investigate how increased neural activity in small co-tuned ensembles during sound presentation affected sound-evoked responses in targeted and non-target co-tuned and non co-tuned ensembles. To achieve this, we first needed to identify the tuning properties of single neurons and then target a subset of co-tuned neurons for stimulation. To study how the increased activity of a small number of neurons influences the activity of other neurons according to their frequency tuning properties, we designed an experimental paradigm comprising four sequential imaging sessions (Fig. 1E):

First, in the cell selection session (Fig. 1E), we identified tuned ensembles in primary auditory cortex (A1) layer 2/3 (L2/3) by assessing frequency tuning properties of neurons within the FOVs covering 550 μm2 (total cells = 7344, sound responsive cells = 1331, FOVs = 23; Fig. 1D). We presented pure tones of three different frequencies spanning the hearing range of the mouse (4 kHz, 16 kHz, and 54 kHz, 100 ms duration, 2 sec. ISIs, 10 repeats for each frequency). We chose 16 kHz and 54 kHz as the representative target ensemble tone frequencies, as 16 kHz is within the most sensitive frequency range of mice 30 and 54 kHz is within the range of mouse ultrasonic vocalization 31. By selecting target ensembles in two different frequencies, we ensured that effects of stimulation were not specific to one particular population. For each condition (16 kHz or 54 kHz target ensemble for stimulation), we selected 5 target cells to stimulate. To ensure that all cells in the ensemble were selective for the target tone, we chose the most responsive cells in each condition. Thus, for the 16 kHz target ensemble condition, we selected five cells (target cells) among the top 30% most responsive cells to the 16 kHz tone. Similarly, for the 54 kHz target ensemble condition, we selected five of the top 30% most 54 kHz tone responsive cells. By selecting target cells sharing the same frequency preference, we aimed at investigating how activity changes from co-tuned neuronal ensembles alter the processing of the target frequency in other co-tuned and non co-tuned cells.

Second, in the baseline session (Fig. 1E), we determined the sound-evoked responses of all imaged cells by presenting a series of 16 kHz and 54 kHz pure tones in a random order (100 ms duration, 5.8-6.5 sec. ISI; baseline session, 30 repeats for each frequency). Exemplar responses of cells from a 16 kHz and a 54 kHz ensemble are shown as black traces in Figure 2A and 2C.

Targeted cells and non-target cells show response changes due to stimulation.

A: (top) An example FOV showing the stimulation effect (ΔF/Fstim - ΔF/Fbaseline) of sound-responsive cells for 16 kHz target cell stimulation (filled squares). Black circles indicate stimulated target cells. (bottom) Mean response traces of an example target cell and an example non-target cell in the baseline session (black) and the stimulation session (red). Error bars indicate SEM across trials. The example target cell shows an increased response due to the stimulation. The example non-target cell shows a decreased response due to stimulation of the target cells. B: (left) A violin plot of the proportion of stimulated cells that showed increased activity due to stimulation across FOVs. The horizontal solid line indicates the mean proportion, the empty circle indicates the median proportion, and gray filled circles indicate individual FOVs. (right) Mean amplitude changes of target cells for the stimulation session and the post-stimulation session, normalized to the baseline session. Error bars indicate SEM across cells. Dashed horizontal lines on both panels indicate average permutation results (random permutation test on 100 iterations, p < 0.0001). C, D: Same as A, B for 54 kHz target cell stimulation.

Third, in the stimulation session (Fig. 1E), we examined how all sound-responsive cells change their responses when a small group of cells in the network increases its activity. We presented the same tones (16 kHz and 54 kHz in a random order), in tandem with the optogenetic stimulation of five target cells (stimulation session, 100 ms stimulation duration). We performed different sessions for the 16 kHz and 54 kHz target ensembles, varying FOVs for each session (18 FOVs for the 16 kHz target ensemble condition and 12 FOVs for the 54 kHz target ensemble condition). Figures 2A and 2C show two example FOVs with targeted neurons for a 16 kHz and a 54 kHz ensemble.

Since both imaging and optogenetic stimulation involve optomechanical components, we wanted to ensure that effects were not due to artifacts caused by our stimulation or imaging setup. Moreover, cells can adapt their responses to repeated sound presentation. Thus, to confirm that any response changes observed in the stimulation session were due to the optogenetic stimulation rather than to repeated sound presentation alone, we added a control condition. For this control condition, we performed the “stimulation” session with five target cells but with 0 mW laser power (i.e., no stimulation) to verify that any response changes occurring in the stimulation session compared to the baseline session were not simply due to the eventual response adaptation of neurons to the tuned frequency (control condition; 13 FOVs). By selecting cells and presenting 0 mW laser power, instead of selecting no target cells or selecting an area of the FOV without cells, we ensured that the laser power given to the selected cells was the only difference between the actual stimulation and control conditions, keeping any noise caused by the imaging and stimulation setup the same.

Fourth, after the stimulation session (Fig. 1E), we performed an additional imaging session (post-stimulation session), presenting another series of 16 kHz and 54 kHz pure tones in a random order to examine whether changes in the sound-evoked responses persisted or reverted back after the stimulation session.

Optogenetic holographic stimulation increases activity in targeted ensembles

We first identified the effect of the optogenetic stimulation on the targeted ensembles. Figures 2A and 2C show fluorescence traces of exemplar cells from 16 kHz and 54 kHz target ensembles. Optogenetic stimulation increased the sound-evoked fluorescence amplitude in these individual cells. To quantify the effect of the optogenetic stimulation, we compared the tone-evoked fluorescence responses of the targeted cells with and without stimulation (Stimulation effect = ΔF/F(stimulation session) - ΔF/F(baseline session)). Around 72% of target cells (66 out of 90 cells over 18 FOVs for 16 kHz target stimulation and 42 out of 60 cells over 12 FOVs for 54 kHz target stimulation) showed increased response amplitude during the stimulation session compared to the baseline session, regardless of the tone presented (Fig. 2B, D; permutation tests, all p < 0.001). These results indicate that holographic stimulation was able to reliably increase activity in small populations of neurons. Moreover, given that the target cells we selected were most responsive to their preferred tone frequency, this increase in fluorescence indicates that the cells’ responses to their preferred tone were not saturated.
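The stimulation effect defined above is a simple per-cell difference, and the reported proportion follows directly from the cell counts given in the text. The short Python sketch below illustrates both; the function name and the toy amplitudes are hypothetical, while the counts (66 of 90 and 42 of 60 target cells) are taken from the paragraph above.

```python
import numpy as np

def stimulation_effect(dff_stim, dff_base):
    """Per-cell stimulation effect: dF/F(stimulation) - dF/F(baseline)."""
    return np.asarray(dff_stim, dtype=float) - np.asarray(dff_base, dtype=float)

# Toy example with made-up amplitudes for three hypothetical target cells.
toy_effect = stimulation_effect([0.9, 0.4, 0.7], [0.5, 0.5, 0.2])
toy_fraction_increased = float(np.mean(toy_effect > 0))

# Proportion of target cells with increased responses, from the reported counts.
n_increased = 66 + 42   # 16 kHz and 54 kHz target conditions
n_targets = 90 + 60     # 5 target cells x (18 + 12) FOVs
fraction_increased = n_increased / n_targets
```

Here `fraction_increased` evaluates to 0.72, matching the roughly 72% stated in the text.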

Optogenetic holographic stimulation decreases activity in non-target co-tuned ensembles

We next investigated whether the optogenetically enhanced sound-evoked activity of a small group of cells would cause activity changes in other non-stimulated cells. During holographic optogenetic stimulation of the targeted cells, the non-target, but sound-responsive cells (n = 995 cells for 16 kHz target ensemble condition and n = 675 cells for 54 kHz target ensemble condition), also changed their activity, showing either increased or decreased response amplitudes (Fig. 2AC).

If cortical networks rebalance their activity, we speculated that the increased tone-evoked activity in the targeted ensemble would lead to a decrease in tone-evoked activity in coupled ensembles. Such rebalancing would keep the activity within the cortical network stable. Moreover, given that we increased the activity to the preferred sound frequency, if this rebalancing happens, it should occur only for the distinct sound frequency related to the cell’s tuning property. For example, stimulation of a 16 kHz ensemble should cause a greater reduction in the 16 kHz tone response of non-targeted 16 kHz cells compared to their response to the 54 kHz tone.

To address these questions, we investigated whether increased activity in the targeted cells influenced the activity of non-target cells and how these changes were related to the tuning properties of the cells. We first defined each sound-responsive cell’s frequency selectivity by computing the difference between its response amplitudes to 16 kHz and 54 kHz tones in the baseline session (frequency preference = ΔF/F(16 kHz) - ΔF/F(54 kHz)). We then divided these cells into either 16 kHz preferring or 54 kHz preferring groups, taking 0 (i.e., no preference) as the criterion (Fig. S2). Both subgroups exhibited stronger tone-evoked responses to their preferred frequency, independent of the condition (t(5700) = 4.79, p < 0.0001; Fig. S2). This confirms that the criterion used to divide the cell groups is valid.
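The grouping rule described above can be written in a few lines. The following Python sketch computes the frequency preference and splits cells at 0; the function names are hypothetical, and the handling of cells with a preference of exactly 0 (left unassigned here) is an assumption, since the text does not specify it.

```python
import numpy as np

def frequency_preference(dff_16k, dff_54k):
    """Frequency preference = dF/F(16 kHz) - dF/F(54 kHz), per cell."""
    return np.asarray(dff_16k, dtype=float) - np.asarray(dff_54k, dtype=float)

def split_by_preference(pref):
    """Boolean masks for 16 kHz preferring (> 0) and 54 kHz preferring (< 0) cells.

    Cells with a preference of exactly 0 fall in neither group (assumption).
    """
    pref = np.asarray(pref, dtype=float)
    return pref > 0, pref < 0

# Toy baseline amplitudes for three hypothetical cells.
pref = frequency_preference([0.8, 0.2, 0.5], [0.3, 0.6, 0.5])
prefers_16k, prefers_54k = split_by_preference(pref)
```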

We then focused on our main question by comparing the stimulation effect of the two target ensemble groups to the control condition to identify whether stimulation decreased the response of non-target co-tuned neurons. Neural activity in AC rapidly shows stimulus-specific adaptation to the repeated presentation of the stimulus 32–34, which can obscure stimulation-related changes. We thus used the response amplitude change between the baseline and the “stimulation” control session as a representative threshold to test the effect of the stimulation. We once again used the difference in response amplitude between the baseline and stimulation sessions as the measure of the stimulation effect (ΔF/F(stimulation session) - ΔF/F(baseline session)). Neighboring cells within 20 μm of the target stimulation point were removed from the analysis since they could have been directly affected by the stimulation.
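The exclusion of cells near the stimulation points amounts to a distance filter. The Python sketch below shows one way to implement it, assuming cell and target positions are available as (x, y) coordinates in micrometers within the imaging plane; the function name, the coordinate layout, and the treatment of cells at exactly 20 μm (excluded here) are assumptions.

```python
import numpy as np

def far_from_targets(cell_xy, target_xy, min_dist_um=20.0):
    """Return a boolean mask of cells farther than min_dist_um from every target.

    cell_xy: (n_cells, 2) array of cell centers in micrometers.
    target_xy: (n_targets, 2) array of stimulation points in micrometers.
    """
    cell_xy = np.asarray(cell_xy, dtype=float)
    target_xy = np.asarray(target_xy, dtype=float)
    # Pairwise Euclidean distances, shape (n_cells, n_targets).
    dists = np.linalg.norm(cell_xy[:, None, :] - target_xy[None, :, :], axis=-1)
    return dists.min(axis=1) > min_dist_um

# Toy example: one target at the origin and four hypothetical cells.
cells = [[0.0, 0.0], [10.0, 0.0], [25.0, 0.0], [100.0, 100.0]]
targets = [[0.0, 0.0]]
keep = far_from_targets(cells, targets)
```

Because a target cell sits at distance 0 from its own stimulation point, this filter also drops the targets themselves, which leaves only the non-target cells considered in the analysis.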

We compared the stimulation effect between non-target co-tuned and non co-tuned cells across conditions (16 kHz and 54 kHz target ensembles as well as control conditions) for different pure tone presentations. Since our primary interest was how non-target cells respond to increased activity in target ensembles, we focused on conditions where the pure tone frequency matched or did not match the tuning properties of the non-target cells. Since we stimulated during tone presentation, the effects of the holographic stimulation and stimulus-specific adaptation co-occurred. To isolate these components, we used a linear mixed-effect model with cell group, condition, and pure tone frequency as fixed factors, and FOVs as a random factor. We then performed ANOVA on the model to assess the main effects and interactions.

A marginally significant main effect of the condition (F(2,37.1) = 2.983, p = 0.0628) on the response change in the stimulation session relative to the baseline session (i.e., stimulation effect) was observed, indicating that these changes may depend on the stimulation condition. We further analyzed the data to better understand how the different factors interacted in the response amplitude changes. A significant interaction between the pure tone frequency and cell group (F(1,4397.6) = 186.967, p < 0.0001) suggests that each cell group responded differently to the two pure tone frequencies. Specifically, the response amplitude decreases in the stimulation session relative to the baseline session were more pronounced for each cell group when the played pure tone matched their tuning property. This interaction between pure tone frequency and cell group highlights the importance of frequency tuning in modulating response amplitudes. Such response amplitude decreases of non-target cells to their preferred pure tone presentation further align with the stimulus-specific adaptation due to repeatedly presented pure tones 32. Additionally, a significant three-way interaction across condition, cell groups, and pure tone frequency (F(2,4397.6) = 3.517, p = 0.0298) suggests that the combined effects of the stimulation condition and the cell group on response amplitude depend on the pure tone frequency. The stimulation effect is not uniform across cell groups and depends heavily on the frequency, highlighting a complex interplay between the tuning property of cells, stimulation condition, and presented pure tone frequency.

Consequently, we performed post-hoc comparisons of estimated marginal means with contrasts, as our focus was on how co-tuned cells change their responses due to the increased activity in the target cells, depending on the frequency of the presented pure tone.

For 16 kHz preferring cell group (n = 537), we observed a greater stimulation effect (i.e., decrease in response amplitude) for 16 kHz tone presentation when 16 kHz target ensemble was stimulated compared to the control condition (t(124) = 3.114, p = 0.0064). For all other pairs, no significant stimulation effect was observed. This suggests that non-target 16 kHz co-tuned cells reduce their response amplitudes when target ensembles share the same tuning property. Furthermore, such response change occurs only when they process their preferred frequency (Fig. 3B, left).

Non-target co-tuned cells show more decreased response amplitudes due to stimulation when synchronized with their preferred tones.

A: Stimulation effect (ΔF/Fstim - ΔF/Fbaseline) in all sound-responsive cells, including both target and non-target cells, responding to either 16 kHz (blue) or 54 kHz (orange) pure tones, representing global activity changes due to the stimulation. No significant differences between stimulation conditions and responses to different frequencies were observed (all p > 0.05). B: Stimulation effect (ΔF/Fstim - ΔF/Fbaseline) in 16 kHz (blue) and 54 kHz (orange) preferring cells. Both cell groups show decreased amplitude to their preferred frequency regardless of condition due to acoustic stimulus-specific adaptation. Only co-tuned cells (16 kHz preferring cells for 16 kHz stimulation or 54 kHz preferring cells for 54 kHz stimulation) show a further decrease in response amplitudes due to the stimulation, when the preferred pure tone (PT) frequency was synchronized. Error bars indicate SEM across FOVs (*: p < 0.0001). C: Stimulation effect from the model prediction. Amplitude changes computed from simulated data by applying cell suppression to all cells (All supp.) or only co-tuned cells (Co-tuned supp.) were compared with real data. Only the Co-tuned supp. model showed a significant amplitude decrease for co-tuned neurons compared to non co-tuned neurons, similar to the result from the real data (p < 0.05; see text for more detail). D: Post-stimulation effect (ΔF/Fpost - ΔF/Fstim) in 16 kHz (blue) and 54 kHz (orange) preferring cells. No significant response amplitude changes were observed. Error bars indicate SEM across FOVs. E: Sub-categorization of cells based on the frequency selectivity for each target stimulation condition (left: 16 kHz stim, right: 54 kHz stim). Cells were first grouped into either 16 kHz preferring cells (blue) or 54 kHz preferring cells (orange). Within each cell group, cells were further subdivided into low, mid, and high frequency selectivity categories based on tertiles (33% quantile ranges). Note that the frequency preference was log-transformed for visualization, but the x-axis labels kept the original frequency selectivity values before transformation. Vertical dashed lines indicate the tertile boundaries. F: Response amplitude change based on the frequency selectivity category for each cell group (blue: 16 kHz preferring cells, orange: 54 kHz preferring cells). Significant response amplitude changes relative to the control condition were observed only for the high frequency selectivity category when target stimulated cells were co-tuned (*: p < 0.05).

We repeated the experiments and the analysis with 54 kHz cells as the target group. In general, we observed similar results. The stimulation effect was significantly more pronounced for 54 kHz tone presentation when the 54 kHz target ensemble (n = 359) was stimulated compared to the control condition (t(168) = 3.074, p = 0.0069; all p-values were adjusted for multiple comparisons using the Tukey method). All other pairs did not show any stimulation effect (Fig. 3B, right).

To further explore whether the stimulation effect could be explained by activity rebalancing within the co-tuned network, we implemented a simple model in which a suppression term was applied either to all neurons or specifically to non-target co-tuned cells. By comparing two different model outcomes and the real data, we observed a significant effect of the model type (F(2, 2535) = 34.943, p < 0.0001). Moreover, an interaction between the model type and cell groups was observed (F(2, 2535) = 36.348, p < 0.0001). Applying suppression to only non-target co-tuned cells during the stimulation session yielded a significant response amplitude decrease for co-tuned cells compared to non co-tuned cells (F(1, 2535) = 45.62, p < 0.0001), which resembles the real data. In contrast, applying suppression to all non-target cells led to similar amplitude changes in both co-tuned and non co-tuned neurons (F(1, 2535) = 0.87, p = 0.35), which was not observed in either the real data or the simulated data restricted to co-tuned cell suppression. These results suggest that the target cell stimulation induces a selective activity suppression within the co-tuned network for processing their preferred frequency.
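The logic of comparing the two suppression variants can be illustrated with a toy simulation. The Python sketch below is a hypothetical stand-in and not the model used here: the additive form of the suppression term, the shared adaptation offset, the parameter values, and all names are assumptions chosen only to show why suppressing all cells leaves the two groups indistinguishable, whereas suppressing only co-tuned cells separates them.

```python
import numpy as np

def simulate_change(is_cotuned, mode, suppression=0.1, adaptation=-0.05,
                    noise_sd=0.0, seed=0):
    """Simulate per-cell amplitude change (stimulation - baseline).

    mode="all":     the suppression term is applied to every non-target cell.
    mode="cotuned": the suppression term is applied to co-tuned cells only.
    All parameter values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    is_cotuned = np.asarray(is_cotuned, dtype=bool)
    # Every cell shares the same adaptation-related change plus noise.
    change = np.full(is_cotuned.shape, adaptation, dtype=float)
    change += rng.normal(0.0, noise_sd, size=is_cotuned.shape)
    if mode == "all":
        change -= suppression
    elif mode == "cotuned":
        change[is_cotuned] -= suppression
    else:
        raise ValueError("mode must be 'all' or 'cotuned'")
    return change

def group_difference(change, is_cotuned):
    """Mean change of co-tuned cells minus mean change of non co-tuned cells."""
    is_cotuned = np.asarray(is_cotuned, dtype=bool)
    return change[is_cotuned].mean() - change[~is_cotuned].mean()

# Toy population: first half co-tuned, second half non co-tuned.
labels = np.array([True] * 50 + [False] * 50)
diff_all = group_difference(simulate_change(labels, "all"), labels)
diff_cotuned = group_difference(simulate_change(labels, "cotuned"), labels)
```

In this noise-free toy case the group difference is zero when all cells are suppressed and equals the negative suppression term when only co-tuned cells are suppressed, which is the qualitative pattern the comparison in the text relies on.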

Together, these results indicate that the effect of holographic optogenetic stimulation depends not on the specific tuning of cells, but on the co-tuning between stimulated and non-stimulated neurons. Also, this effect is not driven solely by a few non-target cells with large response changes. Rather, the overall population of cells shows relative response changes due to the stimulation when synchronized with their preferred frequency.

Overall, these results further suggest that when neural activity is increased in a subset of target cells due to photostimulation in addition to the target sound presentation, other co-tuned cells selectively reduce their tone-evoked responses to their preferred tone presentation, indicating that the network rebalances to maintain network activity within a certain range.

Rebalanced network responses are stable

We then asked whether such response amplitude changes due to stimulation within the local network are persistent. To test whether the rebalanced status of the neuronal ensemble is persistent, we examined the tone-evoked response amplitude changes between the post-stimulation and the stimulation sessions (post-stimulation effect: ΔF/F(post-stimulation session) - ΔF/F(stimulation session)). Response amplitudes were similar across conditions and tone presentation frequencies for both groups of cells (F(2, 4056) = 1.83, p = 0.16; Fig. 3D). These results indicate that pairing exogenous stimulation of a subset of neurons with sounds can instantaneously change the network responses to sounds, and this change can persist at least for many minutes during the experimental session.

Neurons with higher frequency selectivity show greater response changes

Our results demonstrate that response changes in non-target cells are significantly influenced by the frequency tuning of stimulation-target cells as well as the frequency of the pure tone presented along with the stimulation. However, we also observed a marginal stimulation effect in the 54 kHz non-target cell group during 54 kHz pure tone presentation, even when 16 kHz target cells were stimulated. We reasoned that this effect might be due to some weak sound activation of 54 kHz cells by 16 kHz tones, potentially due to the asymmetric shapes of many auditory tuning curves in AC 35,36. Indeed, many cells exhibited broad tuning properties, responding to both 16 kHz and 54 kHz (Fig. 3E). Thus, this marginal stimulation effect could be attributed to cells grouped as 54 kHz preferring cells, yet still showing sound-evoked responses to 16 kHz, particularly given that 16 kHz is within the sensitive frequency range in mice 30.

Building on our findings of a rebalanced cortical network, we next aimed to identify whether frequency tuning selectivity influences response amplitude changes in the non-targeted co-tuned neurons. For each cell, we calculated the frequency preference index (ΔF/F(16 kHz) - ΔF/F(54 kHz)) and divided the cells into three categories of frequency selectivity: low, mid, and high. We removed cells with extreme frequency preference values, where the index values exceeded ± 4 standard deviations from the median, prior to dividing them into three categories. This removed about 1% of cells from the dataset for further analyses. This grouping was based on tertiles (33% quantile ranges), with each category representing one-third of the data distribution (Fig. 3E). Values closer to 0 indicate cells more broadly tuned across frequencies, while extreme positive and negative values indicate cells sharply tuned to either frequency.
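The outlier removal and three-way split described above can be sketched as follows in Python. This is a minimal illustration for one cell group at a time, under stated assumptions: the function name is hypothetical, selectivity strength is taken as the absolute value of the preference index, the standard deviation is computed over all cells passed in, and category boundaries are the 1/3 and 2/3 quantiles.

```python
import numpy as np

def selectivity_categories(pref, n_sd=4.0):
    """Drop extreme cells, then label the rest as low/mid/high selectivity.

    pref: 1-D array of frequency preference values for one cell group.
    Returns (keep_mask, labels), where labels align with pref[keep_mask].
    """
    pref = np.asarray(pref, dtype=float)
    # Remove cells farther than n_sd standard deviations from the median.
    keep = np.abs(pref - np.median(pref)) <= n_sd * pref.std()
    strength = np.abs(pref[keep])
    # Tertile boundaries: each category holds about one-third of the cells.
    lo, hi = np.quantile(strength, [1.0 / 3.0, 2.0 / 3.0])
    labels = np.where(strength <= lo, "low",
                      np.where(strength <= hi, "mid", "high"))
    return keep, labels

# Toy example: nine hypothetical cells with increasing preference values.
keep, labels = selectivity_categories(np.arange(1, 10))
```

For the toy input the first three cells are labeled low, the middle three mid, and the last three high.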

We then tested whether cells with higher frequency selectivity to one frequency exhibited greater response amplitude changes. We performed a three-way ANOVA to examine the effect of frequency selectivity (low, mid, high selectivity), stimulation condition (control, 16 kHz target stim, 54 kHz target stim), and cell groups (16 kHz vs. 54 kHz preferring cells) on the response amplitude change. There were significant main effects of frequency selectivity (F(2, 2183) = 23.52, p < 0.0001) and stimulation condition (F(2, 2183) = 11.03, p < 0.0001). No significant main effect of cell group was observed (F(1, 2183) = 0.77, p = 0.379). Likewise, neither the interaction between frequency selectivity and cell group (F(2, 2183) = 0.69, p = 0.503), nor the interaction between condition and cell group (F(2, 2183) = 2.64, p = 0.072) was significant. The three-way interaction between frequency selectivity, stimulation condition, and cell group was also not significant (F(4, 2183) = 0.86, p = 0.487).

However, the interaction between frequency selectivity and stimulation condition was significant (F(4, 2183) = 2.82, p = 0.0238), indicating that the effect of frequency selectivity depended on the condition. These results suggest that the response amplitude changes across conditions were more prominent for cells with higher frequency selectivity.

To identify where the significant response difference occurred across conditions, we further performed post-hoc pairwise comparisons between conditions within each frequency selectivity category for each cell group. For 16 kHz preferring cells, we observed a significant difference in the response change between control and 16 kHz stim conditions only in the high frequency selectivity category (p < 0.027, Holm-Bonferroni correction for multiple comparisons). In parallel, for 54 kHz preferring cells, the significant effect was observed between control and 54 kHz stim conditions (p = 0.033, Holm-Bonferroni correction for multiple comparisons; Fig. 3D). These results indicate that non-targeted cells with higher frequency selectivity exhibit the greatest response amplitude changes, but only when the stimulated target cells were co-tuned with them. These results also suggest that frequency selective neurons form co-tuned networks.

Sparsely distributed non-target co-tuned ensembles immediately rebalance their activities to maintain the network balance

Network balance can be achieved by multiple mechanisms operating on different timescales. To gain insight into the potential mechanisms underlying the observed rebalancing, we next investigated how rapidly cells start adjusting their responses during the stimulation condition. We thus examined the stimulation effect (changes in response amplitude due to stimulation of target cells) for non-target co-tuned ensembles at the single-trial level. We observed decreased response amplitudes from the first trial, with no significant decay across trials (Fig. 4A), regardless of cell groups, frequency presentation, and conditions (sum-of-squares F-test, all p > 0.05). The absence of trial-related response amplitude changes in non-target co-tuned ensembles indicates that non-target co-tuned cells immediately change their activity whenever targeted cells increase their activity due to stimulation, to maintain the network balance.

Rebalanced response changes on non-target 16 kHz cells are immediate and widely distributed.

A: Stimulation effect (ΔF/Fstim - ΔF/Fbaseline) in 16 kHz (blue) and 54 kHz (orange) preferring cells for each trial for the 5-cell 16 kHz stimulation condition (left), 54 kHz stimulation condition (middle), and the no-cell control condition (right). Each circle represents the average stimulation effect for one trial. Decreased amplitudes to preferred frequencies were observed from as early as trial 1 with no significant further changes across trials, regardless of frequencies and conditions (sum-of-squares F-test, all p > 0.05). Solid lines indicate fitted curves and dashed lines indicate 95% confidence intervals. B: Stimulation effect (ΔF/Fstim - ΔF/Fbaseline) of each non-target 16 kHz (blue) or 54 kHz (orange) preferring cell for either 16 kHz (top row) or 54 kHz (bottom row) presentation in relation to the center of mass distance to any target cell for the 16 kHz stimulation condition (left), 54 kHz stimulation condition (middle), and the control condition (right). Each circle represents one cell. For the control condition, we considered the top 5 most tone-responsive cells from the baseline session as “target” cells, as there was no stimulation. Non-target cells are widely distributed within the FOV (550 μm²), regardless of cell groups, frequencies, and conditions. Gray lines indicate fitted curves, excluding cells that are closer than 15 μm (vertical green lines; cells < 15 μm marked in lighter shades), and dashed lines indicate 95% confidence intervals.

Non-target co-tuned ensembles that show rebalancing are spatially distributed

Activity rebalancing could be driven by local changes, e.g., in excitatory-inhibitory (E/I) balance 37, or by distributed changes. To identify whether co-tuned ensembles that changed their activities are locally or widely distributed, we computed the center of mass distance from each non-target cell to any of the target cells. For the stimulation condition, we excluded non-target cells that were within 20 μm of the target cells to ensure that the observed effects could not be attributed, even partially, to direct photostimulation of those neighboring cells (Fig. 1B-D). We observed that non-target co-tuned ensembles were widely distributed within the FOV, similar to non-target non co-tuned ensembles as well as those from the control condition (sum-of-squares F-test, p > 0.05; Fig. 4B). This indicates that activity changes of non-target co-tuned ensembles are not merely the result of direct input from external photostimulation within a tight localized network. Rather, widespread, sparsely represented co-tuned ensembles continuously update incoming information based on their tuning properties.
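The distance computation and exclusion step can be illustrated with a short sketch. It assumes that the distance of a non-target cell "to any of the target cells" means the distance to its nearest target, and that cells closer than the exclusion radius are dropped; the function names and coordinates are hypothetical.

```python
import math

def min_distance_to_targets(cell_xy, target_xys):
    """Distance (in um) from one cell's center of mass to its nearest target."""
    return min(math.dist(cell_xy, t) for t in target_xys)

def filter_non_targets(non_target_xys, target_xys, exclusion_um=20.0):
    """Keep non-target cells at least exclusion_um away from every target.

    Returns a list of (cell_xy, distance_to_nearest_target) pairs.
    """
    kept = []
    for xy in non_target_xys:
        d = min_distance_to_targets(xy, target_xys)
        if d >= exclusion_um:
            kept.append((xy, d))
    return kept

# Toy coordinates in micrometers (illustrative only).
targets = [(0.0, 0.0), (100.0, 0.0)]
cells = [(10.0, 0.0), (50.0, 0.0), (100.0, 30.0)]
kept_cells = filter_non_targets(cells, targets)
```

In this toy example the first cell lies 10 μm from a target and is excluded, while the other two are retained along with their nearest-target distances.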

Discussion

Trial-by-trial variability in neuronal activity is ubiquitous in the brain, with sensory stimuli evoking activity in different sparse co-tuned ensembles at different times. How sensory-evoked activity is distributed and coordinated across sparsely distributed co-tuned networks has been unknown. Here, we leveraged the capability for selective in vivo stimulation via holographic optogenetics to investigate how functionally related neuronal ensembles in AC coordinate activity. Our results show that manipulating a small subset of target co-tuned neurons alters the auditory-evoked responses of other non-target co-tuned neurons. Increased activity in one subset of neurons is balanced by decreased activity in the rest of the co-tuned population of neurons.

Importantly, such network rebalancing occurs only for processing acoustic features specific to their tuning. Our analysis shows that the most selective non-targeted neurons show the strongest effect after stimulation, suggesting that selective neurons form functionally interacting sub-networks consistent with in vivo correlation analyses 38 and in vitro studies in visual cortex 39. Functionally related cells might form these subnetworks during development, likely due to lineage relationships and Hebbian processes 40–42. Together, our findings suggest that neuronal ensembles sharing similar functional tuning properties, with strengthened connectivity among them, actively interact and update their status to maintain the overall network balance, enabling energy-efficient sensory processing.

The present work applied holographic optogenetic stimulation to manipulate neuronal activity at a single-cell resolution in the AC for the first time. Similar to previous findings in VC 18,19, our study further supports the idea that extra activation within the network exerts an inhibitory influence on a subset of neurons. In the VC, single cell stimulation resulted in suppression of neighboring co-tuned neurons 18. Our results here show that feature-specific suppression occurs in spatially dispersed ensembles of non-target co-tuned neurons. The widely distributed response amplitude decreases in those neurons suggest that this phenomenon is not limited to local neighboring cells but involves widespread networks 8. Thus, the effect is not solely due to inhibition caused by neighboring interneurons from the optogenetic stimulation. Instead, neurons with similar functional characteristics, sparsely distributed throughout the AC, actively interact, with more sharply tuned neurons modulating their activity the most.

Cortical networks are shaped by dynamic changes in neural activity driven by various factors. Neurons rapidly modulate their responses based on their functional roles in sensory processing. Recurrent cortical networks are thought to update their activity based on incoming information to maintain homeostatic balance 23,24. Concurrently, co-activated neurons processing similar acoustic properties strengthen their connectivity by Hebbian learning 43–45. Thus, cortical networks are shaped by an active interplay of synaptic plasticity, homeostasis and Hebbian learning, rather than by a single dominant mechanism 46–49. Rebalancing of network activity is often attributed to homeostatic rebalancing of individual cells’ activity 24,50 or an E/I balance: increased activity of inhibitory neurons resulting in reduced activity of excitatory neurons 37. Our results suggest that rebalancing is tuning-specific.

Given that inhibitory neurons in A1 are generally less frequency selective than excitatory neurons 51, changes in inhibition are unlikely to be the only contributor to the observed effect. While our AAV was not cell-type specific, it is also unlikely that many selected target neurons would be inhibitory interneurons, as a greater proportion of neurons in sensory cortices are excitatory (about 80% compared to 20% inhibitory 52–55; AAV-hSyn is expressed in both excitatory and inhibitory neurons in similar proportions 56). Furthermore, our primary cell selection criterion for stimulation yielded a subgroup of cells with strong, frequency-specific responses (top 30% of cells that show the biggest evoked responses to the target tone after excluding cells that show responses to multiple tones). This criterion likely selected more excitatory neurons, as they generally show greater stimulus-selective responses than inhibitory neurons in sensory cortices 51,57.

It is noteworthy that not all target neurons showed clear activation in response to holographic stimulation. Although we attempted to pre-select cells responsive to stimulation, some of them seemed to exhibit reduced activation during the experiment, potentially due to motion artifacts, response adaptation, network suppression, or trial-by-trial variability. Nevertheless, the frequency-specific suppression of co-tuned neurons observed in this study suggests that this effect can be driven by activation of a very small number of neurons. One caveat is that, while we presume most target cells were likely excitatory, inhibitory neurons may have been included in the target cell group. Inhibitory neurons could show reliable, strong responses to optogenetic stimulation compared to the more variable responses that excitatory neurons could show 58, and thus may have been included in the target cell group. However, if more inhibitory cells were included, this would have reduced trial-by-trial variability from the stimulation, yielding a higher probability of target cell activation, which is different from what we observed. Additionally, re-occurring sounds evoke activity changes in both excitatory and inhibitory neurons in the same direction, leaving the E/I balance unchanged 59,60. Moreover, the fraction of inhibitory cells is 10–20% of all cortical neurons 61–63, thus the chance of stimulating them is small. Together, these considerations suggest that our results are unlikely to be the effect of the activation of inhibitory neurons. Rather, other balancing mechanisms such as short-term depression at thalamocortical synapses may play a role 64.

In contrast to the slow changes of homeostatic plasticity related to the E/I balance, plasticity in cortical cellular responses can occur quickly and is cell-specific, thereby tuned to the cells’ functional response properties 65–67. Based on this, we speculate that the decrease in response amplitudes of non-target neurons likely reflects rapid activity-dependent synaptic changes in excitatory cells. Future work using Cre-dependent virus expression or a cell type specific labeling approach will be required to confirm cell-type-specific roles in this phenomenon and underlying mechanisms.

Lastly, no additional response changes were observed in the post-stimulation session indicating that the rebalanced network status remained constant. Indeed, constant amplitudes were observed regardless of conditions, subgroups, and frequencies, suggesting that the network persistence was achieved through repeated acoustic stimuli presentation and photostimulation. The persistence after the stimulation condition further suggests that once the newly learned rebalanced network status is achieved, the network response stabilizes and remains persistent. The mechanisms behind this stabilization are unclear but may involve an active interplay of homeostasis and Hebbian learning to form co-active networks 68.

Taken together, the present study reveals how neuronal ensembles in the AC rebalance to maintain a homeostatic processing equilibrium for a given sensory input, and that rebalanced networks remain persistent. Moreover, our results show that network activity can be controlled by even a small subset of neurons, and the network changes are closely tied to the functional tuning properties of neurons.

Materials and Methods

Methods

Animals

A total of 14 mice over 8 weeks old (8 – 30 weeks, 6 males and 8 females) were used in the experiments. To retain high-frequency hearing at experimental ages, offspring from C57BL/6J and B6.CAST-Cdh23Ahl+/Kjn (Jax 002756) were used for all experiments. Animals were housed on a 12-hr reversed light/dark cycle. All experimental procedures were approved by the Johns Hopkins Institutional Animal Care and Use Committee.

Preparation of AAV9-hSyn-GCaMP8s-T2A-rsChRmine

To generate the AAV9-hSyn-GCaMP8s-T2A-rsChRmine virus, a gene fragment containing T2A, rsChRmine, Kv2.1 soma-targeted localization motif, and 3xHA tag (synthesized by Twist Biosciences) was subcloned into the pGP-AAV-syn-jGCaMP8s plasmid vector (Addgene: 162374). Viral vectors were commercially prepared (Virovek) to a concentration of 1 × 10^13 vg/mL.

Surgery and virus injection

Surgery was performed as described in previous studies 9,10. We injected dexamethasone (1 mg/kg, VetOne) subcutaneously (s.c.) to minimize brain swelling prior to surgery. 4% isoflurane (Fluriso, VetOne) with a calibrated vaporizer (Matrix VIP 3000) was used to induce anesthesia, which was then reduced to 1.5 – 2% for maintenance. Body temperature was monitored and maintained at around 36 °C throughout the surgery (Harvard Apparatus Homeothermic monitor). We first removed the hair on the head using a hair removal product (Nair) to expose the skin. Betadine and 70% ethanol were applied three times to the exposed skin. Skin and tissues were then removed, and muscles were scraped to expose the left temporal side where the craniotomy was conducted. A unilateral craniotomy was performed to expose a region about 3.5 mm in diameter over the left AC. Virus (AAV9-hSyn-GCaMP8s-T2A-rsChRmine titer of 1:2) injection was performed at 2-3 sites near the tentative A1 area at about 300 μm depth from the surface, using a glass pipette controlled by a micromanipulator (Sutter Instrument MPC-200 and ROE-200 controller). We injected 300 nL at each site at a rate of 180 nL/min (Nanoject3). Once virus injection was completed, two circular glass coverslips (one of 3 mm and one of 4 mm in diameter) were affixed with a clear silicone elastomer (Kwik-Sil, World Precision Instruments). An extra layer of dental acrylic (C&B Metabond) was applied around the edge of the cranial window to further secure it, cover the exposed skull, and adhere a custom 3D-printed stainless steel headpost. Carprofen (5 mg/kg) and cefazolin (300 mg/kg) were injected (s.c.) post-operatively. Animals were given at least 3-4 weeks of recovery and viral expression time before any imaging was performed.

Experimental procedures

Animals were head-fixed on a custom-made stage, where a free field speaker (TDT ED1) faced the right ear at 45 degrees. All sound stimuli were driven by a TDT RX6 multiprocessor and we imaged GCaMP8s responses with a resonant scanning two-photon microscope (Bruker Ultima 2Pplus; 940 nm excitation wavelength) at A1 L2/3 (160 – 200 μm). A1 was identified by its tonotopy gradient using widefield imaging (Fig. 1E) 69. During 2-photon imaging sessions, we first conducted a short imaging session (about 1 min.) presenting 100-ms pure tones at 3 different frequencies (4, 16, 54 kHz) at 70 dB SPL, covering the mouse hearing range, 10 times each in a randomized order (ISI: 2 sec). This was to identify initial tuning properties of sound-responsive cells (cell selection session; Fig. 1E). The acquired imaging data were immediately processed using ‘suite2p’ and a custom-written MATLAB script to identify tone-responsive cells for each frequency. We continued only when at least 50% of cells within the FOV showed sound-evoked responses. We took 16 kHz or 54 kHz as our target functional properties. Target frequency was randomly assigned. We manually tested the response changes to stimulation of the top 30% of target frequency responsive cells (∼ 20 cells) selected from the cell selection session by using a stimulation laser (Light Conversion Carbide; 1040 nm excitation wavelength). Stimulation laser power was set around 5 mW per cell. We selected 5 representative cells (target cells) that showed visible fluorescence changes to the stimulation.

Prior to the main experimental session, we ran a short (∼ 1 min.) stimulation session without any sound presentation that comprised 5 trials of 100-ms stimulation with 6-second ISIs for a rapid check of the stimulation effect. This session was restricted to 5 trials of stimulation given the limited time of the imaging session, leading to larger variability of the observed stimulation effect. We further verified the stimulation effect from the experimental stimulation session where 30 trials were given.

For the main experiment, three consecutive imaging sessions were conducted with the presentation of either 16 kHz or 54 kHz 100-ms pure tones with random ISIs between 5.8 – 6.5 sec. for 30 trials each, as baseline session, stimulation session, and post-stimulation session (Fig. 1E). Only the second imaging session (i.e., stimulation session) received holographic stimulation on the pre-selected 5 target cells for 30 revolutions of a 15 μm spiral for about 100 ms at 5 mW laser power per cell (16 kHz target cell or 54 kHz target cell stimulation conditions) or 0 mW laser power per cell (control condition). The default mechanical setup (Bruker PrairieView version 5.7) of the microscope opens and closes the uncaging shutter to enable the stimulation laser path for every single stimulation time point, which causes an external mechanical sound that can trigger neural activation of the AC. To minimize any effect of external mechanical sounds on cells in the AC, we kept the uncaging shutter open during the imaging session. The number of imaging sessions per animal varied depending on the virus expression. Regardless, all animals were used for the control and at least one stimulation condition, with sessions a minimum of 2 days apart and with varying FOVs to avoid imaging the same cells multiple times. The order of conditions and the imaging depth presented to the same animal were randomized.

An additional single-cell stimulation only session was conducted on a subset of animals to further test the reliability of holographic stimulation at a single-cell level (n = 3 animals). We varied the stimulation position from the original target stimulation point to 10, 20, or 30 μm shifted along the x-axis or y-axis, generating 7 different stimulation points (original cell position, 10, 20, or 30 μm shifted along the x-axis, 10, 20, or 30 μm shifted along the y-axis). We stimulated each stimulation point 10 times in a randomized order across stimulation points with 8-sec. ISIs.

Analysis

Imaging data were processed with ‘suite2p’ for motion correction, cell detection, and cell fluorescence trace extraction 70. We applied neuropil correction to the fluorescence traces using the following equation: F(cell_corrected) = F(cell) – (0.8 * F(neuropil)). We then computed ΔF/F normalized to the baseline, by following the equation: ΔF/F = (F(trace) – F(baseline)) / F(baseline), where baseline is about 300 ms before the sound onset. For the single-cell level stimulation session, we computed peak ΔF/F per each stimulation point and applied a mixed-effect model by taking peak ΔF/F as dependent variables, stimulation point as independent variables, and cells as random factor. For the 5-cell stimulation validation session, we computed a proportion of activated cells (any cell with peak ΔF/F > 0) among stimulated cells per each FOV as well as peak ΔF/F and ran a random permutation with 100 iterations.
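The two preprocessing equations above can be written as a short sketch. The function names and the frame-based baseline window are hypothetical; in particular, the number of frames spanning the roughly 300 ms baseline depends on the imaging frame rate, which is not stated here.

```python
def neuropil_correct(f_cell, f_neuropil, coeff=0.8):
    """F(cell_corrected) = F(cell) - coeff * F(neuropil), sample by sample."""
    return [c - coeff * n for c, n in zip(f_cell, f_neuropil)]

def delta_f_over_f(trace, onset_idx, n_baseline_frames):
    """dF/F = (F(trace) - F(baseline)) / F(baseline).

    F(baseline) is the mean of the n_baseline_frames samples that
    immediately precede sound onset (index onset_idx).
    """
    baseline = trace[onset_idx - n_baseline_frames:onset_idx]
    f0 = sum(baseline) / len(baseline)
    return [(f - f0) / f0 for f in trace]

# Toy traces (arbitrary fluorescence units, illustrative only).
raw_cell = [10.0, 10.0, 10.0, 20.0]
raw_neuropil = [5.0, 5.0, 5.0, 5.0]
corrected = neuropil_correct(raw_cell, raw_neuropil)
dff = delta_f_over_f(corrected, onset_idx=3, n_baseline_frames=3)
```

Note that the baseline is computed on the neuropil-corrected trace, so the correction coefficient also affects the normalization.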

For experimental sessions, to select sound responsive cells, we compared sound-evoked activity (160 ms – 660 ms after sound onset capturing the sound-evoked response due to slow calcium transient) and the baseline activity (300 ms – 0 ms before the sound onset). We considered cells sound responsive when the amplitude of the average sound-evoked activity exceeded two standard deviations of the amplitude of the average baseline activity. Sound responsive cells were selected based on fluorescence traces only from the baseline session to minimize any potential effect of stimulation on cell selection. We then computed the difference in evoked activity between the two frequencies as an index of the frequency preference (ΔF/F(16kHz) - ΔF/F(54kHz)). We subgrouped sound-responsive cells into either 16 kHz preferring cells or 54 kHz preferring cells based on the frequency preference, taking 0 as the subgroup criterion. As our main interest was changes in non-target cells, we excluded target cells from further analyses. To compare response changes due to stimulation for each group of cells, average sound-evoked activity of sound-responsive cells from the baseline session was subtracted from that of the stimulation session (stimulation effect: ΔF/F(stimulation session) - ΔF/F(baseline session)) for each condition. We further compared response changes between the post-stimulation session and the stimulation session, again by subtracting the response amplitudes between the two sessions for each group and condition (post-stimulation effect: ΔF/F(post-stimulation session) - ΔF/F(stimulation session)). To quantify the stimulation effect based on functional properties for conditions and groups, we applied a mixed-effect model by taking amplitude changes as dependent variables, frequency, conditions, and cell groups as independent variables, and FOVs as random factor. 
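The cell selection and effect computations described in this paragraph can be sketched as follows. The helper names are hypothetical, and the responsiveness test follows a literal reading of the criterion (mean evoked amplitude greater than two standard deviations of the baseline samples); the treatment of a preference index of exactly 0 is likewise an assumption.

```python
import statistics

def is_sound_responsive(trial_avg_trace, baseline_idx, evoked_idx, n_sd=2.0):
    """True if the mean evoked dF/F exceeds n_sd * SD of the baseline dF/F."""
    baseline = [trial_avg_trace[i] for i in baseline_idx]
    evoked = [trial_avg_trace[i] for i in evoked_idx]
    return statistics.mean(evoked) > n_sd * statistics.stdev(baseline)

def preference_group(resp_16k, resp_54k):
    """Frequency preference index and subgroup, using 0 as the criterion."""
    index = resp_16k - resp_54k
    # Assumption: an index of exactly 0 is assigned to the 54 kHz group.
    return index, ("16 kHz" if index > 0 else "54 kHz")

def stimulation_effect(evoked_stim, evoked_baseline):
    """dF/F(stimulation session) - dF/F(baseline session)."""
    return evoked_stim - evoked_baseline

# Toy trial-averaged trace: 4 baseline frames followed by 3 evoked frames.
trace = [0.0, 0.1, -0.1, 0.0, 1.0, 1.2, 0.8]
responsive = is_sound_responsive(trace, range(0, 4), range(4, 7))
index, group = preference_group(0.5, 0.2)
effect = stimulation_effect(0.6, 1.0)  # negative value: reduced response
```

The post-stimulation effect follows the same pattern, with the stimulation session taking the place of the baseline session in the subtraction.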
We then computed Type III analysis of variance (ANOVA) to examine whether the effect of functional property (frequency) to the amplitude changes was specific to the target frequency presentation synchronized with the stimulation (i.e., 16 kHz for 16 kHz target cell stimulation or 54 kHz for 54 kHz target cell stimulation). To quantify whether the response amplitude changes due to stimulation differ across trials, we fitted a dataset of average stimulation effect across each trial per each condition (stimulation or control), each cell group (16 kHz preferring or 54 kHz preferring cells), and each tone presentation (16 kHz or 54 kHz) to the three-parameter model and computed the extra sum-of-squares F-test to compare whether response amplitude changes across trials were different from a constant line 71. To quantify a relationship between response amplitude changes of non-target cells and their distances to target cells, we computed a center of mass distance of each cell position relative to target cells, and fitted the dataset of the response amplitude changes across distance to the three-parameter model to compute the extra sum-of-squares F-test, per each condition (16 kHz target stimulation, 54 kHz target stimulation, or control), each cell group (16 kHz preferring or 54 kHz preferring cells), and each tone presentation (16 kHz or 54 kHz). For the control condition, as there were no stimulated target cells, we chose top five most tone-responsive cells from the baseline session as “target” cells.
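The logic of the extra sum-of-squares F-test can be illustrated with a simplified nested-model comparison. Because the functional form of the three-parameter model is not given here, the sketch substitutes a straight line (two parameters) as the alternative to a constant line (one parameter); this substitution and the function name are assumptions for illustration only. The p-value would be obtained from the F distribution with the returned degrees of freedom.

```python
def extra_ss_f_statistic(x, y):
    """Extra sum-of-squares F statistic: constant model vs. straight line.

    Returns (F, (df_numerator, df_denominator)).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Null model: a constant line at the mean of y (1 parameter).
    ss_null = sum((yi - mean_y) ** 2 for yi in y)
    # Alternative model: least-squares straight line (2 parameters).
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_alt = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    df_null, df_alt = n - 1, n - 2
    f_stat = ((ss_null - ss_alt) / (df_null - df_alt)) / (ss_alt / df_alt)
    return f_stat, (df_null - df_alt, df_alt)

# Toy data: a stimulation effect that fluctuates around a constant value.
trials = [1, 2, 3, 4, 5, 6]
effects = [0.1, -0.1, 0.1, -0.1, 0.1, -0.1]
f_stat, dfs = extra_ss_f_statistic(trials, effects)
```

A small F statistic, as in this toy example, means the more complex model does not reduce the residual sum of squares enough to justify its extra parameter, which corresponds to the "no significant change across trials" outcome.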

We then generated a simple model in which a suppression term was applied either to all neurons or specifically to non-target co-tuned cells to test our results from the data. We took a similar range of number of neurons and FOVs to closely simulate the model to the real dataset structure. On 50 simulated calcium traces of neurons (n),

Where R_n(t) is a response amplitude from either baseline or stimulation session, θ_n is a suppression term applied either to all neurons or only to non-target co-tuned neurons, only during the stimulation session, and ε_n(t) is additive noise. θ was defined as proportional to the average stimulation strength from target neurons, derived from the real dataset, and scaled by a factor α = 0.3 in the current simulation. To introduce neuron-level variability, an additional jitter (ε_n) was applied as follows:

Similar to the real data analyses, we compared the response change between the stimulation and baseline sessions’ trace amplitudes.
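A minimal sketch of a simulation in this spirit is given below. It assumes an additive form in which the stimulation-session response equals the neuron's underlying amplitude minus θ_n plus noise, with θ_n = α × stimulation strength × (1 + jitter). This functional form, the function name, and every parameter value other than n = 50 and α = 0.3 are assumptions made for illustration.

```python
import random

def simulate_response_change(n_neurons=50, co_tuned_fraction=0.5,
                             stim_strength=1.0, alpha=0.3, noise_sd=0.05,
                             jitter_sd=0.1, suppress_all=False, seed=0):
    """Mean (stimulation - baseline) response change per cell group."""
    rng = random.Random(seed)
    n_co_tuned = int(n_neurons * co_tuned_fraction)
    changes = {"co_tuned": [], "other": []}
    for n in range(n_neurons):
        group = "co_tuned" if n < n_co_tuned else "other"
        amplitude = rng.gauss(1.0, 0.2)  # underlying tone response
        baseline = amplitude + rng.gauss(0.0, noise_sd)
        # Suppression term with neuron-level jitter (assumed form).
        theta_n = alpha * stim_strength * (1.0 + rng.gauss(0.0, jitter_sd))
        suppressed = suppress_all or group == "co_tuned"
        stim = amplitude - (theta_n if suppressed else 0.0) \
            + rng.gauss(0.0, noise_sd)
        changes[group].append(stim - baseline)
    return {g: sum(v) / len(v) for g, v in changes.items()}

specific = simulate_response_change(suppress_all=False)
global_suppression = simulate_response_change(suppress_all=True)
```

Comparing the two outputs shows the distinguishing signature: tuning-specific suppression lowers the response change only in the co-tuned group, whereas global suppression lowers it in both groups.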

Histology

Animals were deeply anesthetized with 4% isoflurane to perform transcardial perfusion with 4% paraformaldehyde (PFA) in 0.1 M phosphate buffer saline (PBS). The extracted brains were post-fixed in 4% PFA for an additional 12-24 hours. Coronal brain sections at 50 μm containing the AC were stained with primary antibodies against HA-Tag (1:500) for red-shifted opsins and chicken anti-Green Fluorescent Protein (GFP, 1:500) for GCaMP8s, followed by secondary antibodies of 594-conjugated anti-rabbit IgG (1:1000) and 488-conjugated anti-chicken IgG (1:1000).

Data and materials availability

All preprocessed imaging data and relevant analysis scripts will be deposited at the Johns Hopkins University research data repository, available for open access.

Acknowledgements

Supported by NIH U19 NS107464 (POK), NIH R01DC017785 (POK), NIH F32DC019842 (TAB).

Additional information

Author Contributions

Conceptualization: HK, TAB, POK; Methodology: HK, TAB, POK; Investigation: HK, TAB; Writing: HK, POK

Funding

NINDS (U19 NS107464)

NIDCD (R01DC017785)

NIDCD (F32DC019842)

Additional files

Supplemental Figure