Introduction

Learning to make appropriate decisions based on external stimuli is fundamental for survival and adaptive behavior. This process relies on the coordinated activity of distributed neural circuits spanning sensory, association, and motor areas. Previous studies have implicated multiple cortical and subcortical regions in visual task learning and decision-making. In the cortex, anterior regions have been found to play important roles. The medial prefrontal cortex (mPFC) was indispensable for both learning visual discrimination and maintaining enhanced visual acuity after learning 1. The secondary motor cortex (M2) encoded sensory history information in a flexible visual decision task, and its inactivation impaired adaptive action selection 2. Visuomotor learning also promoted visually evoked activity in M2 and anterior cingulate cortex (ACC) 3. Top-down inputs to the primary visual cortex (V1) are also critical. Orbitofrontal cortex (OFC) projections to V1 were required for learning a visual Go/No-Go task 4, and retrosplenial inputs to V1 were essential for encoding task-related events 5. Among subcortical regions, the dorsomedial striatum was necessary for visual category learning 6, and innervation from M2 to the dorsal striatum suppressed inappropriate visual decisions 7. The mediodorsal thalamus (MDTh) regulated prefrontal signal and noise via distinct circuit mechanisms under different scenarios of decision uncertainty 8. Although the roles of these brain regions have been well studied individually or in pairwise combinations, how these regions dynamically reorganize their functional interactions as a mesoscale network during the learning of decision-making remains unclear. Addressing this question requires an approach that captures large-scale, longitudinal activity patterns across both cortical and subcortical areas.

Recent advances in large-scale neural recordings have enabled monitoring of activity across multiple brain regions, providing new insights into information representation and transformation at a mesoscale level 9,10. It has been revealed that the encoding of sensory, choice, and body motion information is not confined to a single or a few brain regions, but widely distributed across brain regions during visual decision tasks 11–13. Similarly, sensorimotor transformations during decision-making have been found to be highly distributed across many brain regions following the learning of a visual change detection task 14. In a delayed-response paradigm, learning has been associated with the emergence of a specific subnetwork involving layer 2/3 neurons in the anterior lateral motor cortex and posterior parietal cortex, accompanied by sparser global functional connectivity across the dorsal cortex 15. However, most of these studies have focused on neural dynamics in expert animals or have been restricted to superficial cortical layers, largely due to technical constraints of commonly used techniques such as high-density silicon probes and calcium imaging. These approaches typically provide limited spatial coverage in depth or lack the capability for longitudinal recording across learning. As a result, how cortical-subcortical neural dynamics evolve during task acquisition across broad spatial scales remains poorly understood.

To tackle this question, we utilized uFINE-M (ultra-Flexible Implantable Neural Electrodes for Mouse) arrays 16 to simultaneously record spiking activity across 10 brain regions in mice learning a Go/No-Go task over two to three weeks. The chronic implantation capability of uFINE-M arrays enabled us to track the evolution of a mesoscale functional network throughout learning. By analyzing functional connectivity patterns and information encoding dynamics, we found that learning reshaped interregional communication and accelerated the broadcast of stimulus information throughout the network. These findings provide insights into how distributed brain networks adapt during the acquisition of decision-making skills.

Results

High-throughput recording in mice performing a visual Go/No-Go task

To investigate the dynamics of the mesoscale functional network during decision-making task learning, we trained head-fixed mice to discriminate between two visual stimuli (vertical vs. horizontal static gratings) using a Go/No-Go paradigm, which has been used to study local neural dynamics during visual associative learning and decision-making 4. In this task, mice were required to lick a waterspout in response to Go stimuli (Hit trials) within a specified response window to receive a water reward, while withholding licking for the No-Go stimuli (Correct rejection trials, CR trials) to avoid timeout punishment (Figure 1A). Despite substantial daily fluctuations in task performance (Figure S1), mice generally gained proficiency over time by learning to make more correct rejection decisions at No-Go stimuli (Figure 1B).

High-throughput recording in mice performing a visual Go/No-Go task.

(A) Schematic of the task. (B) Average correct rate during training (mean ± SEM, n = 7 mice). (C) Photos showing the uFINE-M shanks and recording sites. (D) Schematic showing the implantation sites of uFINE-M arrays, along with example single-unit waveforms recorded from each brain region. Brain section images are adapted from the Allen reference atlas 23. (E) Example spike rasters during two trials. (F) Top, the number of single units recorded in each brain region. Each data point represents data from an individual recording session. Bottom, the total number of single units recorded during training (n = 5 mice). Each symbol represents data from an individual mouse.

To capture neural spiking activity across multiple brain regions throughout task learning, we chronically implanted eight 128-channel uFINE-M arrays into the left hemisphere of each mouse brain and simultaneously recorded from 10 brain regions, including frontal regions (mPFC, OFC, and ACC), motor cortices (M1 and M2), visual cortices (V1, V2M, and V2L), and subcortical regions (striatum and MD thalamus) (Figures 1C and D). These regions have been implicated in visuomotor tasks 2,4,6,7,17–19 and exhibit dense structural connectivity 20. On average, 532.1 ± 92.5 single units (mean ± SD, n = 39 sessions from 5 mice) were recorded across 1024 channels in each session, with no fewer than 15 units recorded from each region of interest (Figures 1E and F). Given the fluctuations in behavioral performance (Figure S1), we categorized trials in early sessions with low behavioral discriminability 21 (d-prime < 2) as “early stage” data, and trials in late sessions with high behavioral discriminability (d-prime > 3) as “expert stage” data (Figure S1). Only these data were used in the following analyses.
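The session classification by behavioral discriminability can be illustrated with a minimal sketch. The function names, the log-linear correction for extreme rates, and the "excluded" label are assumptions made for illustration and are not taken from the Methods; the sketch also ignores whether a session occurred early or late in training.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Behavioral discriminability d' = z(hit rate) - z(false alarm rate)."""
    n_go = hits + misses
    n_nogo = false_alarms + correct_rejections
    # Log-linear correction keeps rates away from 0 and 1 (an assumption,
    # not necessarily the correction used in the study).
    hit_rate = (hits + 0.5) / (n_go + 1)
    fa_rate = (false_alarms + 0.5) / (n_nogo + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


def label_stage(dp, early_max=2.0, expert_min=3.0):
    """Map a session's d' onto the thresholds quoted in the text."""
    if dp < early_max:
        return "early"
    if dp > expert_min:
        return "expert"
    return "excluded"  # hypothetical label for intermediate sessions


if __name__ == "__main__":
    dp = d_prime(hits=95, misses=5, false_alarms=5, correct_rejections=95)
    print(round(dp, 2), label_stage(dp))
```

Under this sketch, a session with equal hit and false alarm rates yields d' = 0 and would fall into the early stage.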

The average firing rate patterns showed substantial changes during learning (Figure 2A), with an overall decrease in firing rate during CR trials and increase in firing rate during Hit trials. Moreover, many brain regions exhibited significant changes in their temporal profile of activity in CR trials as learning progressed (Figure 2A).

Activity changes throughout task learning.

(A) Averaged firing rate aligned to the visual stimulus onset for all CR trials and Hit trials in the early and expert stages (n = 118 early CR and 828 early Hit trials from 7 sessions of 3 mice, 610 expert CR trials and 677 expert Hit trials from the same mice). Shading, SEM. (B) Left, distribution of activity onset timing across time. Right, activity onset timing of each region. Each data point represents data from a neuron. (C) Same as B but for Hit trials. (D) Comparison of activity onset timing between the early and expert stages. *p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001, two-way ANOVA, Sidak’s multiple comparison. Error bars, SEM.

Based on these observations, we calculated the activity onset timing of each single unit relative to the visual stimulus onset (Methods) and found that the mean activity onset timing formed a clearer sequential activation pattern across brain regions in expert CR trials (Figures 2B and C). This sequential pattern was accompanied by a more temporally compressed activation profile with learning, as evidenced by a significant reduction in the spread of peak activation times across regions (Figure S2 and Methods, early: 170.0 ± 160.0 ms vs. expert: 56.67 ± 38.58 ms, mean ± SD, p < 1.0 × 10⁻⁴, two-way ANOVA, Sidak’s multiple comparison). Additionally, several visual and frontal regions (V1, V2M, mPFC, OFC, M2, and M1) also showed faster average response time following stimulus onset in expert CR trials compared to the early stage (Figure 2D). This learning-induced compression of mesoscale activity is reminiscent of similar phenomena observed in motor skill learning 22, suggesting a potential general principle of temporal refinement in distributed brain networks during task acquisition.

In contrast, learning this visual Go/No-Go task did not induce significant changes in the average activity onset timing for most regions in Hit trials (except V2L, Figure 2D), nor did it significantly alter the spread of regional peak activation times (Figure S2). This could be attributed to the fact that the Go stimulus was already introduced during pre-training stages, when the mice learned the basic trial structure (Methods).

In summary, we found that learning this visual Go/No-Go task led to a sequential activation pattern of the neural activity across brain regions, particularly generating an earlier and more compressed activity sequence in CR trials. These results led us to further explore the leading-following relationships between regions and the roles of individual brain regions within the functional brain network. We focused specifically on CR trials, as mice improved their performance mainly by learning to correctly reject No-Go stimuli.

Ranking dynamics of mesoscale brain network during learning of CR trials

To study functional connectivity dynamics and quantify the overall extent of leading-following relationship among spiking activity across brain regions, we first identified neuron pairs that exhibited functional connectivity if their cross-correlation score of spiking activity (TSPE algorithm) 24 was above chance level (p < 0.05, Figure 3). We focused on fast connections within 20 ms and included only excitatory connections in subsequent analyses. If the spiking activity of neuron a preceded that of neuron b, we defined neuron a as having functional output to neuron b, and neuron b receiving functional input from neuron a. The functional input/output strength between any two brain regions was then defined as the proportion of neuron pairs with significant excitatory functional input/output, relative to the total number of possible input/output neuron pairs between these two regions. Considering that differences in firing rates might bias cross-correlation between spike trains 25, making raw counts of significant neuron pairs difficult to compare across conditions, we ranked the values in the regional connection matrix on a scale from 1 to 10. This ranking approach enabled us to focus on the relative importance of each region within the brain network and more effectively evaluate the ranking dynamics across time windows and trial types.
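The definition of regional connection strength above can be sketched as follows. This is a minimal illustration that assumes the significant, excitatory, directed neuron pairs have already been identified (for example by a cross-correlation method such as TSPE); the function name and the input data layout are hypothetical.

```python
from collections import Counter


def connection_strength(neuron_region, significant_pairs):
    """Proportion of significant directed neuron pairs between each pair of regions.

    neuron_region: dict mapping neuron id -> region name.
    significant_pairs: iterable of (leading_neuron, following_neuron) tuples,
        i.e. pairs already judged significant and excitatory.
    Returns a dict mapping (source_region, target_region) -> strength in [0, 1].
    """
    n_neurons = Counter(neuron_region.values())
    n_significant = Counter()
    for leader, follower in set(significant_pairs):
        src, dst = neuron_region[leader], neuron_region[follower]
        if src != dst:  # only interregional connections are counted here
            n_significant[(src, dst)] += 1
    strength = {}
    for src in n_neurons:
        for dst in n_neurons:
            if src != dst:
                n_possible = n_neurons[src] * n_neurons[dst]
                strength[(src, dst)] = n_significant[(src, dst)] / n_possible
    return strength


if __name__ == "__main__":
    regions = {"a1": "V1", "a2": "V1", "b1": "OFC", "b2": "OFC", "b3": "OFC"}
    pairs = [("a1", "b1"), ("a1", "b2"), ("b3", "a2")]
    print(connection_strength(regions, pairs))
```

In this toy example, the V1-to-OFC entry is the functional output strength of V1 to OFC and, equivalently, the functional input strength that OFC receives from V1.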

Definition of functional connection.

(A) Schematic of data processing flow of calculating functional connectivity. For each 200-ms time window (t), cross-correlation scores were calculated between spike trains of neuron pairs and the percentage of neuron pairs that showed significant cross-correlations was treated as functional connection strength between brain regions. The regional connection matrix was then ranked from 1 to 10 to evaluate the relative importance of regional connection compared with other connections within the same time window of the same trial. (B) Details of processing stages. (C) Connection rank matrix calculated from the example data in A.

In early CR trials, most brain regions did not show obvious differences in input/output rankings (Figure 4A), with rank values remaining close to 5 (the expected level for random data) throughout the CR trial. Only the ACC and V2L maintained high rank values in early CR trials, suggesting their roles in visual attention and high-level visual processing. After the mice achieved proficiency in this task, we observed a clear separation of input/output rankings among different brain regions (Figure 4A). ACC maintained a high rank, whereas the ranks of V2L, striatum, and MDTh decreased across all trial periods. In contrast, the ranks of V1, V2M, and OFC increased across all trial periods, and M2 exhibited an increased rank during the response period (Figures 4B and C, Figure S3, p < 0.05, two-way ANOVA, Sidak’s multiple comparison). These results suggest that, during the learning of visual-based decision-making, brain regions within the network differentiate in task involvement. Regions associated with visual processing, value processing, and action selection can emerge as key input/output hubs, forming a more task-relevant subnetwork.

Ranking dynamics in CR trials during learning.

(A) Input/output ranking dynamics during early and expert CR trials. (B) Average input rank of each brain region in the early stimulus period (0–400 ms after stimulus onset), late stimulus period (400–800 ms after stimulus onset), early response period (800–1800 ms after stimulus onset), and late response period (1800–2800 ms after stimulus onset) of early and expert CR trials, mapped on brain atlas 28. ΔRank represents the rank change between the expert and early stages. (C) Same as B but for output ranks. n = 118 early CR trials from 7 sessions of 3 mice, and 610 expert CR trials from 6 sessions of the same mice. Error bars, SEM.

Since previous studies have reported the dominance of movement-related activity across brain regions 12,26, we examined the extent to which the observed changes in functional connectivity patterns could be explained by potential changes in body movements during task training. We performed video recording in a cohort of mice and quantified motion energy of facial movements, foot movements, and pupil dynamics 27 during CR trials. Pupil dynamics showed a reduction across all CR trial periods during learning (Figure S4), which might account for the overall decrease in firing rates in CR trials (Figure 2A). Among body movements, only facial movements decreased, and only during the early stimulus period (Figure S4). Changes in body movements therefore could not fully explain the changes in functional connection ranks in the stimulus and response periods during learning.

Ranking dynamics during learning of Hit trials and fruitless learning

Compared to CR trials, the mesoscale network showed more rapid and dynamic transitions at different intra-trial time points in Hit trials (Figure 5A). In early Hit trials, a transient separation of input/output rankings was observed around the visual stimulus onset, with the ACC ranking the highest. During the response period, regional rankings became more convergent. After the mice reached expert level in this task, although the correct rate of Hit trials remained unchanged (Figure 1B), the striatum acquired a high input ranking, particularly during the early response period of expert Hit trials (Figures 5A and B, Figure S5A). The rise in the input rank of the striatum during the response period remained clear when the analyses were repeated on spike time data aligned to the first lick in each Hit trial, indicating that this observation was not a result of possible changes in lick initiation time during learning (Figure S6). In addition to the striatum, we also observed significant changes in the input/output ranks of multiple regions between early and expert Hit trials (Figures 5B and C, Figure S5), suggesting that distinct mesoscale functional connectivity patterns could underlie similar behaviors.

Ranking dynamics in Hit trials during learning.

(A) Input/output ranking dynamics in early and expert Hit trials. (B) Average input rank of each brain region in the early stimulus period (0–400 ms after stimulus onset), late stimulus period (400–800 ms after stimulus onset), early response period (800–1800 ms after stimulus onset), and late response period (1800–2800 ms after stimulus onset) of early and expert Hit trials, mapped on brain atlas 28. ΔRank represents the rank change between the expert and early stages. (C) Same as B but for output ranks. n = 828 early Hit trials from 7 sessions of 3 mice, and 677 expert Hit trials from 6 sessions of 3 mice. Error bars, SEM.

We also trained mice on a “fruitless learning” task, in which all visual stimuli and task structure were identical to those in the normal learning group, but the Go/No-Go visual stimuli were presented randomly and had no association with reward. As expected, mice continued to lick in every trial regardless of the visual stimulus type, and we defined the trials in which mice were randomly rewarded as fruitless-learning Hit trials. During the early stimulus period, the ACC and visual regions showed the highest ranks in these fruitless-learning Hit trials (Figure S7), similar to those in the normal learning group. In the early response period, however, we observed a strong elevation of the rank of the MDTh (Figure S7), suggesting cognitive effort in the face of uncertainty regarding the task rules 29,30. These results further demonstrate that distinct mesoscale functional connectivity patterns can emerge and evolve depending on task demands, in accordance with previous reports 15,31–33.

Rank increase of the visual/frontal regions was attributed to elevated regional connection rank in CR trials

To investigate the factors driving the rank changes in CR trials during learning, we examined the regional connection ranks of brain regions that showed rank increases in CR trials (V1, V2M, M2, and OFC, Figures 4B and C, Figure S3). For each time window within a trial, regional connection strength was ranked on a scale from 1 to 10, with a rank of 1 representing the lowest 10% strength among all regional connections within the same time window (Figures 3B and C). We observed a general increase in the ranks of regional connections between these regions (Figure 6, V1, V2M, M2, and OFC) and other frontal and motor regions (Figure 6, mPFC, ACC, M2, and M1).
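The 1-to-10 ranking of regional connection strengths within a time window can be sketched as a decile binning. The tie handling (tied values share their mean position) and the summary of a region's input/output rank as the mean rank of its incoming/outgoing connections are assumptions made for illustration; the exact procedure is the one shown in Figure 3.

```python
import math
from collections import defaultdict


def decile_ranks(strength):
    """Rank connection strengths on a 1-10 scale within one time window.

    strength: dict mapping (source_region, target_region) -> strength.
    With enough connections, rank 1 covers the lowest 10% of values.
    """
    items = sorted(strength.items(), key=lambda kv: kv[1])
    n = len(items)
    ranks = {}
    i = 0
    while i < n:
        j = i
        while j + 1 < n and items[j + 1][1] == items[i][1]:
            j += 1  # extend over a run of tied values
        mean_position = (i + j) / 2 + 1  # 1-based position shared by ties
        rank = max(1, math.ceil(10 * mean_position / n))
        for k in range(i, j + 1):
            ranks[items[k][0]] = rank
        i = j + 1
    return ranks


def region_ranks(ranks):
    """Mean rank of outgoing and incoming connections per region (assumed summary)."""
    outgoing, incoming = defaultdict(list), defaultdict(list)
    for (src, dst), rank in ranks.items():
        outgoing[src].append(rank)
        incoming[dst].append(rank)
    output_rank = {r: sum(v) / len(v) for r, v in outgoing.items()}
    input_rank = {r: sum(v) / len(v) for r, v in incoming.items()}
    return input_rank, output_rank


if __name__ == "__main__":
    toy = {("A", "B"): 0.6, ("A", "C"): 0.5, ("B", "A"): 0.1,
           ("B", "C"): 0.2, ("C", "A"): 0.3, ("C", "B"): 0.4}
    print(region_ranks(decile_ranks(toy)))
```

Because ranks are assigned relative to the other connections in the same window, a uniform scaling of all strengths (for example from a global change in firing rate) leaves the ranks unchanged in this sketch.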

Rank increase in CR trials was attributed to elevated input/output rank from/to other regions.

(A) Average input rank changes for the four regions (V1, V2M, M2, and OFC) that showed increased rank values in the stimulus or response period of CR trials during visual learning. *p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001, two-way ANOVA, Sidak’s multiple comparison. (B) Same as A but for the response period of CR trials. (C-D) Same as A-B but for output ranks. n = 118 early CR trials from 7 sessions of 3 mice, and 610 expert CR trials from 6 sessions of 3 mice. Dashed lines at rank 5 indicate the average level of random data. Error bars, SEM. Stim, stimulus period. Res, response period.

We also examined the regional connection ranks of regions that exhibited rank decreases in CR trials (V2L, MDTh, and striatum, Figure S8). All three regions showed a decline in regional connection rank both with each other and with most frontal and motor regions (mPFC, ACC, M2, and M1). The striatum, which exhibited the most pronounced rank decrease, showed the most widespread reduction in regional connection ranks.

In summary, these results suggest that the network forms a more compact functional motif during learning to reject the No-Go visual stimulus. This reorganization is characterized by increased relative connection strength among several key visual (V1 and V2M), frontal (mPFC, OFC, and ACC), and motor regions (M2 and M1), while regions such as V2L, MDTh, and the striatum become less engaged in the task-related functional network.

Visual stimulus information became widespread in the stimulus period as learning progressed

After examining network dynamics during different trial periods and learning stages, we wondered how the stimulus encoding ability of each region changed during task learning. To assess the stimulus encoding ability based on spike counts, we grouped trials according to visual stimulus identity (with behavioral choice balanced) and applied ROC (receiver operating characteristic) analyses in each 200-ms time window 21. In each session, spike count data for each neuron were bootstrap-resampled to balance the number of trials across different trial types (Figure 7A). For each neuron, 50 trials were resampled with replacement for each trial type to perform ROC analyses, and this procedure was repeated 500 times for each time window. A neuron was classified as stimulus-selective if its ROC selectivity was above 95% of its own randomly shuffled spike count data (p < 0.05) in more than 95% of resampling iterations.
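The bootstrap ROC procedure for a single neuron and a single time window can be sketched as below. Several details are assumptions made for illustration: selectivity is taken as |2 × AUC − 1|, the null distribution is built by shuffling stimulus labels within each resample, and the number of shuffles (`n_shuffle`) is hypothetical. Balancing of behavioral choice is not handled here.

```python
import random


def auc(go, nogo):
    """Area under the ROC curve: P(go count > no-go count), ties counted as 0.5."""
    wins = 0.0
    for g in go:
        for n in nogo:
            if g > n:
                wins += 1.0
            elif g == n:
                wins += 0.5
    return wins / (len(go) * len(nogo))


def selectivity(go, nogo):
    """Unsigned selectivity in [0, 1]; 0 means no discrimination (assumed definition)."""
    return abs(2.0 * auc(go, nogo) - 1.0)


def is_stimulus_selective(go_counts, nogo_counts, n_resample=50, n_iter=500,
                          n_shuffle=100, alpha=0.05, rng=None):
    """Classify a neuron as stimulus-selective from its spike counts in one window."""
    rng = rng or random.Random()
    n_significant = 0
    for _ in range(n_iter):
        # Resample with replacement so both trial types have equal trial numbers.
        go = rng.choices(go_counts, k=n_resample)
        nogo = rng.choices(nogo_counts, k=n_resample)
        observed = selectivity(go, nogo)
        pooled = go + nogo
        n_below = 0
        for _ in range(n_shuffle):
            rng.shuffle(pooled)  # break the link between counts and stimulus
            if selectivity(pooled[:n_resample], pooled[n_resample:]) < observed:
                n_below += 1
        if n_below / n_shuffle > 1.0 - alpha:
            n_significant += 1
    # Selective only if significant in more than 95% of resampling iterations.
    return n_significant / n_iter > 0.95


if __name__ == "__main__":
    rng = random.Random(0)
    go = [0, 1, 2, 3] * 10
    nogo = [10, 11, 12, 13] * 10
    print(is_stimulus_selective(go, nogo, n_iter=20, n_shuffle=40, rng=rng))
```

The pairwise AUC is quadratic in the number of trials, which is acceptable for 50 resampled trials per type but would be worth replacing with a rank-based computation at larger scales.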

Encoding of visual stimulus information during task learning.

(A) Schematic of the ROC analyses and example data from a neuron preferring the No-Go stimulus. (B) Percentage of stimulus-selective neurons in each brain region during the early and late training stages. (C) Mean percentage of stimulus-selective neurons in the early (0–400 ms after stimulus onset) and late (400–800 ms after stimulus onset) stimulus periods. n = 10 time bins for the early training stage and 30 time bins for the expert training stage. (D) Same as C, but for the early (800–1800 ms after stimulus onset), and late (1800–2800 ms after stimulus onset) response periods. *p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001, two-way ANOVA, Sidak’s multiple comparison. n = 34 time bins for the early training stage and 102 time bins for late training stage. (E) Correlation between stimulus encoding peak time and input/output rank in expert CR trials. A significant correlation was observed during the stimulus period but not the response period (Pearson’s correlation).

In the early training stage, stimulus-selective neurons were mainly found in V1 during the stimulus period (Figures 7B and C), while other regions contained very few stimulus-selective neurons during this period. By the late response period, stimulus-selective neurons were found in larger proportions in nearly all regions (except V2L), suggesting that the visual information was broadcast through the network at this time (Figures 7B and D). In the expert stage, however, stimulus-selective neurons emerged in all regions during the stimulus period, and their proportion also substantially increased during the response period (Figure 7D), in alignment with previous reports that visuomotor learning could promote visually evoked activity in dorsal medial prefrontal cortex 3. We also noticed that some regions showed stimulus encoding even before the visual stimulus onset, suggesting either effects of trial history 34 or that expert mice had learned the pseudo-random trial sequence (Methods) and anticipated upcoming visual stimuli based on sensory history.

In expert mice, ROC encoding curves of most regions showed two distinct peaks, one during the stimulus period and another during the response period (Figure 7B). Therefore, we defined the peak time of stimulus encoding in each trial period as the center of the time window with the highest mean percentage of stimulus-selective neurons. We found a significant correlation between the input/output rank of each brain region in CR expert trials and its encoding peak time during the stimulus period (p < 0.05, Pearson’s correlation, Figure 7E), with higher-ranked regions reaching their encoding peaks earlier. No significant correlation was observed during the response period in expert CR trials (Figure 7E).
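The peak-time definition and its correlation with regional rank can be sketched as follows. The function names and the toy numbers are hypothetical, and the sketch computes only the Pearson coefficient, not the associated p-value.

```python
import math


def peak_time(window_centers_ms, pct_selective):
    """Center of the time window with the highest percentage of selective neurons."""
    best = max(range(len(pct_selective)), key=lambda i: pct_selective[i])
    return window_centers_ms[best]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


if __name__ == "__main__":
    # Toy values: higher-ranked regions peak earlier, giving a negative r.
    ranks = [8.0, 7.0, 5.5, 4.0]
    peaks_ms = [300, 300, 500, 700]
    print(peak_time([100, 300, 500, 700], [2, 10, 8, 3]), pearson_r(ranks, peaks_ms))
```

A negative coefficient in this sketch corresponds to the reported pattern in which higher-ranked regions reach their encoding peaks earlier.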

In summary, as learning progressed, visual information propagated more rapidly through the network, likely due to the increased functional connection ranks between visual and frontal regions. Moreover, a region’s connection rank within the network became highly predictive of how quickly it reached its encoding peak during stimulus viewing.

Optogenetic inhibition of rank-increasing regions impaired task learning

Finally, to examine whether the regions with increased rank during CR trials actually contributed to task learning, we performed manipulation experiments on two of these regions, specifically V2M and OFC. For each manipulation group, we expressed AAV2/9-mCaMKIIa-eJaws3.0-mRuby3-WPRE-pA (AAV2/9-mCaMKIIa-mCherry-WPRE-pA for the control group) in the bilateral OFC or V2M and inhibited these regions during either the stimulus or response period of task training (Figure 8).

The effects of bilateral optogenetic inhibition on task performance.

(A) Expression of AAV2/9-mCaMKIIa-eJaws3.0-mRuby3-WPRE-pA in the OFC. VO: ventral orbitofrontal cortex; MO: medial orbitofrontal cortex. Regions were named according to the Paxinos atlas 35. (B) Correct rejection rate for the OFC-stimulus period inhibition group (eJaws 3.0 Stim), and the control group (mCherry Stim). Shading, SEM. n = 8 and 14 mice for the eJaws 3.0 and mCherry group, respectively. ****p < 0.0001, significance for the group factor in two-way ANOVA. (C) Same as B, but for the OFC-response period inhibition group (n = 8 mice) and control group (n = 16 mice). (D) Average miss rate for each mouse in the OFC manipulation group and control group. (E-H): Same as A to D but for V2M inhibition. n = 8 mice for each manipulation group. ***p < 0.001, significance for the group factor in two-way ANOVA. The control group here was the same group of mice in A-D.

We found that bilateral inhibition of the OFC (Figures 8B and C) significantly impaired task learning when applied during either the stimulus period (F(1, 399) = 29.91, p < 1.0 × 10⁻⁴, η² = 0.047, two-way ANOVA) or the response period (F(1, 420) = 87.51, p < 1.0 × 10⁻⁴, η² = 0.098, two-way ANOVA). The interaction with training sessions was not significant for either period (F(19, 399) = 0.75, p = 0.77, η² = 0.022 for the stimulus period, F(19, 420) = 1.05, p = 0.40, η² = 0.022 for the response period), suggesting consistent impairment across training sessions. Bilateral inhibition of V2M during the stimulus period also impaired task learning with a small effect size (F(1, 399) = 8.19, p = 4.4 × 10⁻³, η² = 0.012, two-way ANOVA, Figure 8F), whereas inhibition during the response period did not affect task learning (F(1, 440) = 0.0095, p = 0.92, two-way ANOVA, Figure 8G). None of the manipulation groups showed significant differences in miss rate compared to the mCherry control group (p > 0.05, Welch’s t test, Figures 8D and H), indicating that the observed performance decline was not due to task abandonment.

Taken together, the manipulation effects on task performance provide some support for the connection rank analysis, suggesting that regions with increased rank during learning likely contribute to task acquisition. However, while a rise in connection rank may reflect a region’s involvement in the learning process, it does not necessarily imply a causal relationship with learning.

Discussion

In this study, we investigated how mesoscale functional networks changed during the learning of a visual-based decision-making task. Using 1024-channel uFINE-M arrays for chronic spiking activity recording across multiple cortical and subcortical regions, we were able to examine the mesoscale network dynamics at different timescales: rapid transitions between different periods within a trial, distinct functional connectivity patterns across trial types within a session, and the long-term evolution of network dynamics throughout task learning.

A key finding of our study is that task learning reshaped interregional connectivity, leading to the emergence of a more task-relevant subnetwork as mice learned to correctly reject No-Go stimuli. Specifically, several visual and frontal regions (V1, V2M, OFC, and M2) gained prominence in the network, while others (V2L, MDTh, and striatum) became less engaged. These findings suggest that learning is accompanied by a selective refinement of interregional communication, with a shift in functional connectivity toward regions more directly involved in processing task-relevant information. This observation aligns with previous reports of spatiotemporal refinement in cortical activity during the learning of a texture discrimination task 36 and a visually guided delayed-response task 15, suggesting that the emergence of a more task-relevant functional network may be a general mesoscale network feature of learning.

Beyond connectivity changes, we also found that the encoding of stimulus information became more widely distributed across the network as learning progressed. Moreover, the connectivity rank of a brain region was strongly correlated with the timing of its stimulus encoding peak during the stimulus period, suggesting that high-ranked regions may not only receive information earlier but also play a more central role in relaying task-relevant signals. These findings indicate that learning facilitates more efficient information flow through the network, potentially enhancing sensory processing and decision-making processes. The broader recruitment of stimulus-selective neurons during the response period in expert mice further supports the notion that learned associations between sensory inputs and behavioral outcomes become increasingly embedded in distributed circuits over time 13,14.

We also noticed discrepancies between the results of network ranking analyses and optogenetic inhibition experiments. Inhibiting OFC during either the stimulus or response period significantly impaired learning, consistent with its increased rank in both task periods. However, while V2M also showed increased network rank in both the stimulus and response periods, inhibition of V2M during the response period had no significant effect. This suggests that rank increases with learning do not necessarily indicate a direct causal role in driving behavioral improvements. Other factors, such as neuromodulatory influences and internal state changes, may also contribute to the observed changes in functional connectivity 37. We also cannot fully rule out a possible effect of rebound activity following optogenetic inhibition 38,39, which may confound the interpretation of manipulation effects during the stimulus period (Figure 8). However, this would not change the causal role of the OFC in learning or the discrepancy between the ranking analysis and the manipulation results in this case.

Despite recording from 10 brain regions, our study remains limited compared to the extensive network of brain regions implicated in visual-based decision-making 13, including the midbrain, hindbrain, and cerebellum. Future studies should aim for broader spatial coverage, ideally with stable tracking of the same neuronal populations throughout the entire learning process, to achieve a more comprehensive characterization of the mesoscale network dynamics. Additionally, carefully designed decision-making tasks 40 will be essential for disentangling neural representations of sensory stimulus information, decision-making, action execution, and arousal states. More importantly, as recording scales continue to expand, future work should aim to systematically evaluate the predictive power of different analysis methods in determining the causal contributions of various brain regions to learning and decision-making.

Materials and methods

Animals

Animal use procedures were approved by the Animal Care and Use Committee at the Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences (approval number NA-056-2023). Data were collected from a total of 84 male adult C57BL/6 mice (3–5 months old). Among them, 7 were used for task training to acquire behavioral learning results without electrode array or optical fiber implants, 5 for electrophysiological recordings during the behavioral task (3 for visual-based decision-making learning and 2 for fruitless learning), 69 for optogenetic manipulation experiments, and 3 for video analysis. Mice were generally housed in groups of 3–4 per cage, but mice for chronic extracellular recordings were housed individually to protect the implants. Mice were water-deprived in the home cage and received water reward during daily behavioral sessions. On days when mice did not perform the task, restricted water access (∼0.8 mL per mouse) was provided each day. All mice were maintained on a 12-h light/12-h dark cycle (lights on at 7:00 a.m.), and all sessions were performed in the light phase.

Visual Go/No-Go task

Mice were head-fixed during training sessions and positioned in an acrylic tube placed in a behavioral chamber. A capacitive lick detector and a peristaltic water pump were controlled by custom MATLAB (MathWorks) scripts and digital I/O devices (Arduino Uno R3, Arduino) to monitor tongue licks and deliver water rewards, respectively. Visual stimuli were presented on a 19" LCD monitor (Dell P1917S, max luminance 80 cd/m²) placed 10 cm from the right eye of the head-fixed mouse. A yellow light-emitting diode (LED) was placed above the waterspout to signal the onset of the response period (response signal).

Each trial was initiated automatically after the preceding inter-trial interval (ITI) expired. A full-field visual stimulus (vertical or horizontal static gratings with a spatial frequency of 0.09 cycles/° and 100% contrast) was presented on the monitor for 800 ms, followed by illumination of the yellow LED to indicate the start of the response window (response signal). Mice were required to lick during the response period of “Go” stimuli to receive a water reward (Hit trials). Failure to lick during the response window of “Go” stimuli resulted in a Miss trial with no reward or punishment. Licking during the response period of “No-Go” stimuli was punished with an 8-s timeout (False Alarm trials, FA). Correctly withholding licking for “No-Go” stimuli (Correct rejection trials, CR) was rewarded with a 2-s reduction in the ITI. ITIs were randomized between 4 and 6 s, but licking during the ITI extended the interval by an additional 4–6 s (up to a maximum of 30 s) to punish impulsive licking. The training session was terminated if there were no lick responses in 20 consecutive trials.
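The trial contingencies described above can be summarized in a short sketch. This is an illustration only, not the MATLAB/Arduino control code used in the study; the function names classify_trial and next_iti are hypothetical, and treating the 8-s timeout as time added to the following ITI is an assumption.

```python
import random


def classify_trial(stimulus: str, licked: bool) -> str:
    """Label one trial from the stimulus type and the lick response."""
    if stimulus == "go":
        return "Hit" if licked else "Miss"
    if stimulus == "nogo":
        return "FA" if licked else "CR"
    raise ValueError(f"unknown stimulus: {stimulus!r}")


def next_iti(outcome: str, rng: random.Random) -> float:
    """Draw a base ITI of 4-6 s and apply the outcome-dependent change."""
    iti = rng.uniform(4.0, 6.0)
    if outcome == "CR":
        iti -= 2.0  # correct rejection: ITI shortened by 2 s
    elif outcome == "FA":
        iti += 8.0  # false alarm: 8-s timeout (modeled here as extra ITI; assumption)
    return iti
```

The extension of the ITI after impulsive licks (an additional 4–6 s, capped at 30 s) is omitted from this sketch for brevity.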

The mice were trained to perform this task in three sequential steps. In step 1 (day 1–2), mice were allowed to collect water rewards by simply licking the waterspout placed under the nose, with a fixed interval of 4 s. In step 2 (day 3–5), mice were required to lick specifically during the response window (in the presence of the LED response signal) and refrain from impulsive licking during the ITI. Only the “Go” stimulus was presented in step 2. In step 3 (day 6–20), the “No-Go” stimulus was introduced, and mice were trained to withhold licking for the “No-Go” stimulus to avoid the timeout punishment (Figure 1A). The trial sequence was pseudo-randomized to maintain a balanced number of “Go” and “No-Go” stimuli in every 6 trials.
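One way to generate such a pseudo-randomized sequence is block randomization. The sketch below assumes that "balanced in every 6 trials" means non-overlapping blocks of 6 trials, each containing 3 Go and 3 No-Go stimuli in shuffled order; the function name balanced_sequence is hypothetical and the original scripts may differ.

```python
import random


def balanced_sequence(n_trials: int, block_size: int = 6, seed=None) -> list:
    """Build a Go/No-Go sequence that is balanced within each block."""
    if block_size % 2:
        raise ValueError("block_size must be even to balance two stimuli")
    rng = random.Random(seed)
    sequence = []
    while len(sequence) < n_trials:
        # Half Go and half No-Go per block, shuffled independently.
        block = ["go"] * (block_size // 2) + ["nogo"] * (block_size // 2)
        rng.shuffle(block)
        sequence.extend(block)
    return sequence[:n_trials]
```

With this scheme, the longest possible run of identical stimuli is one block length (3 at the end of one block followed by 3 at the start of the next).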

For mice used in electrophysiological recordings, each mouse underwent 7.67 ± 2.08 (mean ± SD) sessions in training step 3 until it reached an average correct rate of ∼85% in daily sessions. For optogenetic manipulation experiments, mice completed a fixed 20-session training in step 3.

Design, fabrication, and assembly of ultra-flexible microelectrode array (uFINE-M)

Each microelectrode array contained 128 channels for electrophysiological recording (Figure 1C). The array consisted of four flexible implantable shanks, each with 32 recording sites arranged in a 16 × 2 matrix. Each recording site was circular, with a diameter of 20 µm. The flexible shank was 6 mm in length, and the longitudinal spacing of electrode recording sites was either 30 μm or 50 μm to suit different spatial coverage requirements. The shank spacing was customized by adjusting the spacing between the four shuttling tungsten wires, which were used to guide the flexible shanks into brain tissue, ensuring proper alignment with the targeted implantation region (Figure 1D).

The fabrication of uFINE-M was adapted from a planar microfabrication technique featuring a multilayer architecture, as previously described 16. The structural and passivation material was non-photosensitive polyimide, and patterning was achieved through O2 plasma etching. Titanium was used as the adhesion layer between metal and polymer. The overall device thickness was limited to 1–1.5 µm to maintain low bending stiffness and minimize tissue damage. Both the interconnects and recording site surfaces were made of gold. A 20-µm diameter hole was designed at the tip of the flexible shank, in which the tip of the shuttling tungsten wire was anchored to drag the shank into brain tissue during implantation. The recording sites were coated with either 200 nm of sputtered iridium oxide film or electrochemically deposited PEDOT:PSS (poly(3,4-ethylenedioxythiophene) polystyrene sulfonate) to lower the electrode impedance to below 100 kΩ at 1 kHz in saline solution.

Each array was soldered to a 128-channel flexible printed circuit (FPC) board measuring 42 mm in length, which was connected to the SpikeGadgets 128-channel headstage (SpikeGadgets, San Francisco, USA) for signal acquisition. The four shuttling tungsten wires were fixed onto a carrier chip with 5% Poly (ethylene oxide)-300000 (PEO; CAS No. 25322-68-3) before implantation.

Surgery

Electrode array implantation

Electrode array implantation and viral injection for optogenetic inactivation experiments were performed before behavioral training. Mice were anesthetized with isoflurane before surgery (3–4% for induction, ∼1% for maintenance) and head-fixed in a stereotaxic apparatus. Body temperature was maintained at 37℃ using a heating pad. Chlortetracycline hydrochloride eye ointment was applied to prevent corneal drying. A circular piece of scalp was removed to expose the skull, and the incision site was treated with cyanoacrylate tissue adhesive (Vetbond, 3M, Saint Paul, USA).

For chronic implantation of uFINE-M arrays (Figure 1D), three craniotomies (∼6 mm² each) were performed over the left hemisphere, and the dura was left intact. A grounding silver wire was implanted posterior to lambda on the right hemisphere. The cortical surface was kept moist with artificial cerebrospinal fluid or 1× phosphate-buffered saline (PBS). Arrays were implanted in OFC (orbitofrontal cortex, LO, VO and MO, AP 2.46 mm, ML 0.65 mm, depth 2.20 mm), across anterior M1 (primary motor cortex) and anterior M2 (secondary motor cortex, AP 1.94 mm, ML 1.50 mm, depth 0.80 mm), mPFC (medial prefrontal cortex, PL and IL, AP 1.78 mm, ML 0.30 mm, depth 2.20 mm), striatum (caudate putamen, AP 1.42 mm, ML 0.30 mm, depth 2.90 mm), across posterior M1, posterior M2, and ACC (anterior cingulate cortex, Cg1 and Cg2, AP -0.20 mm, ML 0.55 mm, depth 1.20 mm), MDTh (mediodorsal thalamus, MDL, MDC, MDM, AP -1.34 mm, ML 0.30 mm, depth 3.30 mm), across V1 (primary visual cortex) and V2L (secondary visual cortex lateral area, AP -2.80 mm, ML 3.25 mm, depth 0.90 mm), and V2M (secondary visual cortex medial area, V2MM and V2ML, AP -2.80 mm, ML 1.40 mm, depth 0.90 mm). Brain regions were named according to The Mouse Brain in Stereotaxic Coordinates by Franklin and Paxinos (3rd edition) 35. The exposed parts of the arrays were bonded together layer by layer using light-curable resin (Filtek™ Z350 XT, 3M, Saint Paul, USA). The craniotomy was sealed with a thin layer of silicone elastomer Kwik-Cast (World Precision Instruments, Sarasota, USA). A custom-designed headplate was positioned on the skull and secured using Super-Bond C&B (SUN MEDICAL, Japan). After the Super-Bond polymer cured, several layers of dental acrylic cement were applied to secure the entire implant.

Viral injection

For viral injections, the skull was not cracked but only thinned to allow smooth entry of a borosilicate glass pipette with a tip diameter of ∼40–50 μm. A total of 150 nL of viral solution, either AAV2/9-mCaMKIIa-eJaws 3.0-mRuby3-WPRE-pA (for manipulation group mice) or AAV2/9-mCaMKIIa-mCherry-WPRE-pA (for control group mice), was injected at a depth of 1750 μm for OFC and 500 μm for V2M using a syringe pump (Nanoject II Auto-Nanoliter Injector, Drummond Scientific Company, USA). Group identity (manipulation or control) was randomly assigned among cage mates. After injection, the pipette was left in place for 10–15 minutes before retraction. Optical fibers were implanted bilaterally above the virus injection sites (1000 μm deep for OFC and on the cortical surface for V2M), angled ∼10 degrees laterally. Mice received an analgesic (5 mg/kg) subcutaneously for postoperative analgesia.

Mice were allowed to recover from the surgery for at least 3 weeks before water-restriction and behavioral training.

Electrophysiological recording

Neural signals were amplified and recorded using a SpikeGadgets 1024-channel system (SpikeGadgets, San Francisco, USA). Raw voltage signals were sampled at 30 kHz. Task-related behavioral events were digitized as TTL signals and recorded simultaneously by the SpikeGadgets system.

Optogenetic inactivation

Optical silencing via activation of eJaws 3.0 was induced by LED red laser (625 nm; Thorlabs) and controlled by digital I/O devices (Arduino Uno R3, Arduino). To manipulate neural activity in either the OFC or V2M, the laser was delivered during the stimulus period (0–800 ms after visual stimulus onset) or the response period (800–2500 ms after visual stimulus onset) of all trials in separate manipulation groups. The laser power at the fiber tip was calibrated to 2 mW.

Histology

Mice were deeply anesthetized with isoflurane followed by an intraperitoneal injection of 15% ethyl carbamate solution. Transcardial perfusion was then performed using 4% paraformaldehyde (PFA). Brains were extracted, post-fixed in 4% PFA at 4℃ overnight, and then transferred to 30% sucrose in PBS until equilibration for cryoprotection. Brains were sectioned at a thickness of 25 μm, and slices were mounted with antifade mounting medium (with DAPI). Fluorescence images were acquired using a virtual slide microscope (VS120, Olympus, Shinjuku, Japan; Figure 8).

Analysis of behavioral performance

To classify behavioral trials by task performance, we used the d-prime (d’) metric 21 and labeled each trial by the d-prime value (Figure S1) calculated with the 10 trials before and 10 trials after it:

d’ = norminv(Hit rate) − norminv(FA rate)

where norminv is the inverse of the cumulative normal distribution function, Hit rate is the frequency of Hit responses in the Go stimulus trials, and FA rate is the frequency of False Alarm responses in the No-Go stimulus trials. To minimize confounding effects of the animal’s motivation on the evaluation of task performance, the first 20 trials and the trials after the last lick in daily sessions were discarded (Figure S1).
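A minimal sketch of the trial-wise d-prime labeling follows, using the standard definition d’ = norminv(Hit rate) − norminv(FA rate). Several details are assumptions rather than the study's implementation: rates of exactly 0 or 1 are clipped by a hypothetical eps parameter so that the inverse normal stays finite, the labeled trial itself is excluded from its window, and windows at the session edges are simply truncated.

```python
from statistics import NormalDist


def d_prime(hit_rate: float, fa_rate: float, eps: float = 0.01) -> float:
    """d' = z(hit rate) - z(FA rate), with rates clipped to [eps, 1 - eps]."""
    z = NormalDist().inv_cdf  # inverse cumulative normal ("norminv")
    clip = lambda p: min(max(p, eps), 1.0 - eps)
    return z(clip(hit_rate)) - z(clip(fa_rate))


def sliding_d_prime(outcomes: list, half_window: int = 10) -> list:
    """Label each trial with d' from the trials before and after it.

    outcomes holds "Hit", "Miss", "FA" or "CR" per trial. Returns None
    where the window lacks either Go or No-Go trials.
    """
    labels = []
    for i in range(len(outcomes)):
        before = outcomes[max(0, i - half_window):i]
        after = outcomes[i + 1:i + 1 + half_window]
        window = before + after
        n_go = window.count("Hit") + window.count("Miss")
        n_nogo = window.count("FA") + window.count("CR")
        if n_go == 0 or n_nogo == 0:
            labels.append(None)
            continue
        labels.append(d_prime(window.count("Hit") / n_go,
                              window.count("FA") / n_nogo))
    return labels
```

The clipping value sets the maximum attainable d’ (about 4.65 for eps = 0.01), so it should be chosen and reported together with the window size.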

Mouse oral-facial movements during training were recorded at 250 frames per second (fps) using a high-speed camera (MV-CA016-10UC, Hikrobot Co., Ltd., China). The video data were processed with the open-source software Facemap 41 and custom MATLAB scripts. ROIs were manually defined and the motion energy (Figure S4) at each timepoint was calculated as the absolute value of the difference between consecutive frames, summed across all pixels within the ROI 26. To account for motion energy changes caused by environmental luminance changes (e.g., the LED response signal), we subtracted the motion energy data of Miss trials from the data of other trial types during the response period, as mice did not show observable oral-facial motion in Miss trials.
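The motion energy definition above can be illustrated with a dependency-free sketch. It is not the Facemap implementation; frames are represented as nested lists of pixel intensities, the rectangular ROI format (row_start, row_stop, col_start, col_stop) is an assumption, and both function names are hypothetical.

```python
def motion_energy(frames: list, roi: tuple) -> list:
    """Sum of absolute frame-to-frame pixel differences within an ROI.

    frames: list of 2D lists (rows x columns) of pixel intensities.
    roi: (row_start, row_stop, col_start, col_stop), stop exclusive.
    Returns one value per consecutive frame pair.
    """
    r0, r1, c0, c1 = roi
    energy = []
    for prev, curr in zip(frames, frames[1:]):
        total = 0
        for r in range(r0, r1):
            for c in range(c0, c1):
                total += abs(curr[r][c] - prev[r][c])
        energy.append(total)
    return energy


def subtract_miss_baseline(trial_energy: list, mean_miss_energy: list) -> list:
    """Remove luminance-driven energy using the Miss-trial average."""
    return [e - m for e, m in zip(trial_energy, mean_miss_energy)]
```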

Analysis of neuronal responses

Spike sorting

Spike sorting was performed offline using custom MATLAB scripts and the open-source software Spyking Circus 42. Raw voltage signals were first filtered above 300 Hz, then denoised with a manual threshold and common-median referenced within each probe shank 43. The preprocessed signals were fed to Spyking Circus, which applies automated density-based clustering and template-matching algorithms for spike detection. The results from Spyking Circus were then manually curated using Phy (https://github.com/cortex-lab/phy) to remove obvious artifacts with abnormal waveform shapes. Spike clusters were considered single units if the interspike interval exceeded 1 ms.

Firing rate and activity onset timing

To calculate the average firing rate, the spikes were first binned at 1-ms resolution and the spike rate was computed over a 25-ms time window. To identify the activity onset timing of individual single units, the firing rate of a single unit in each 25-ms time window across trials was compared to its baseline activity (500–0 ms before the visual stimulus onset) using a t-test. We identified time windows in which P values were below 0.05 for at least three consecutive time windows, and defined the first time window as the timing of activity onset (Figure 2).
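The windowing and onset criterion can be sketched as follows. The example assumes that the per-window P values have already been obtained from the t-test against baseline; windowed_rate and onset_window are hypothetical names, and the direct counting of spikes into 25-ms windows stands in for the two-step 1-ms binning described above.

```python
def windowed_rate(spike_times_ms: list, t_start_ms: float, t_stop_ms: float,
                  window_ms: float = 25.0) -> list:
    """Firing rate (Hz) in consecutive non-overlapping windows."""
    n_windows = int((t_stop_ms - t_start_ms) // window_ms)
    counts = [0] * n_windows
    for t in spike_times_ms:
        idx = int((t - t_start_ms) // window_ms)
        if 0 <= idx < n_windows:
            counts[idx] += 1
    # spikes per window -> spikes per second
    return [c * 1000.0 / window_ms for c in counts]


def onset_window(p_values: list, alpha: float = 0.05, min_run: int = 3):
    """Index of the first window starting a run of min_run significant windows.

    Returns None if no such run exists.
    """
    run = 0
    for i, p in enumerate(p_values):
        run = run + 1 if p < alpha else 0
        if run == min_run:
            return i - min_run + 1
    return None
```

For example, with P values [0.5, 0.01, 0.2, 0.01, 0.02, 0.03, 0.5], the isolated significant window at index 1 is ignored and the onset is assigned to index 3, the first of three consecutive significant windows.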

For each brain region, the time window with the highest proportion of neurons exhibiting activity onset was defined as the regional peak activation time, and the pairwise differences in peak activation times across all brain regions were used as a measure of the temporal compression of activation sequence (Figure S2).
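As an illustration of this regional measure, the sketch below takes the onset window index of each neuron, finds the most frequent onset window per region, and computes pairwise differences between regions. Resolving ties in favor of the earliest window and using absolute (unsigned) differences are assumptions, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def peak_activation_window(onsets: list):
    """Most common onset window among neurons of one region.

    Neurons without a detected onset are passed as None and ignored.
    """
    valid = [o for o in onsets if o is not None]
    if not valid:
        return None
    counts = Counter(valid)
    best = max(counts.values())
    # Tie-break: earliest window (assumption).
    return min(w for w, c in counts.items() if c == best)


def pairwise_peak_differences(peaks_by_region: dict) -> dict:
    """Absolute difference in peak window for every pair of regions."""
    regions = sorted(peaks_by_region)
    return {(a, b): abs(peaks_by_region[a] - peaks_by_region[b])
            for a, b in combinations(regions, 2)}
```

Smaller pairwise differences across the network correspond to a more temporally compressed activation sequence.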

Connection rank analyses

Functional connections between neurons were defined based on the significance of cross-correlation scores between their spike trains. For each 200-ms time bin, cross-correlation scores between neurons were calculated with the total spiking probability edge (TSPE) algorithm 24, in which an edge filter was applied to the cross-correlogram to facilitate the detection of local maxima and minima. A functional connection was identified if its cross-correlation score exceeded at least 95% of the cross-correlation scores calculated from randomly shuffled spike trains. Only excitatory connections with latencies within 20 ms were included in subsequent analyses.

To establish regional connection profiles and identify key brain regions within the network, we defined the functional input/output strength between any two brain regions as the proportion of neuron pairs that had significant excitatory functional connections, of all possible input/output pairs between these two regions. To better evaluate the relative importance of each region within the brain network, we ranked the summed values of input/output strength of each brain region on a scale from 1 to 10. To better compare interregional input/output strength, for each time window within a trial, regional connection strength was ranked on a scale of 1 to 10, with a rank of 1 representing the lowest 10% strength among all regional connections within the same time window (Figures 3-7, Figure S3, S5-S8).
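The strength and ranking steps can be sketched as below. This is one possible reading of the procedure, not the study's code: connection strength is the fraction of significant neuron pairs, and ranks are assigned by decile position after sorting, so that rank 1 covers the lowest 10% of values. Ties are broken by input order, which is an assumption, and both function names are hypothetical.

```python
def connection_strength(n_significant_pairs: int, n_source: int,
                        n_target: int) -> float:
    """Proportion of significant connections among all possible pairs."""
    n_possible = n_source * n_target
    return n_significant_pairs / n_possible if n_possible else 0.0


def decile_ranks(values: list, n_levels: int = 10) -> list:
    """Rank values on a 1..n_levels scale by their sorted position.

    Rank 1 is assigned to the lowest 1/n_levels of the values.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0] * n
    for position, i in enumerate(order):
        ranks[i] = position * n_levels // n + 1
    return ranks
```

With 10 regions there are 90 directed interregional connections, so under this scheme each rank level would contain 9 connections per time window.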

ROC analysis

To quantify the selectivity of each neuron for the visual stimulus, we applied receiver operating characteristic (ROC) analysis 21 to the distributions of spike counts in each 200-ms time window within the trial. A neuron was included in the ROC analysis only if it had at least 5 trials for each of the four trial types (Hit, CR, Miss, and FA). For each neuron, 500 bootstrap-resampled datasets were generated in each time window, and the numbers of trials with different choices and visual stimuli were balanced to ensure 50 trials for each condition. The area under the ROC curve (auROC) indicates the accuracy with which an ideal observer can correctly classify whether a given response is recorded in one of the two conditions. ROC selectivity was defined as 2 × abs(auROC − 0.5), which ranges from 0 to 1. A neuron was classified as stimulus-selective if its ROC selectivity was larger than 95% of randomly shuffled data (p < 0.05) in at least 95% of the bootstrap-resampled datasets (Figure 7).
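The selectivity index and the shuffle test can be sketched as follows. The auROC is computed here through its equivalence to the Mann-Whitney statistic (the probability that a spike count from one condition exceeds one from the other, with ties counted as half). The bootstrap resampling and trial balancing steps are omitted, and the function names and the n_shuffles default are assumptions for illustration.

```python
import random


def auroc(a: list, b: list) -> float:
    """Area under the ROC curve for spike counts a versus b."""
    wins = 0.0
    for x in a:
        for y in b:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(a) * len(b))


def roc_selectivity(a: list, b: list) -> float:
    """2 * |auROC - 0.5|, ranging from 0 (none) to 1 (perfect)."""
    return 2.0 * abs(auroc(a, b) - 0.5)


def exceeds_shuffle(a: list, b: list, n_shuffles: int = 200,
                    seed: int = 0) -> bool:
    """True if observed selectivity is larger than 95% of label shuffles."""
    rng = random.Random(seed)
    observed = roc_selectivity(a, b)
    pooled = list(a) + list(b)
    n_below = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        if roc_selectivity(pooled[:len(a)], pooled[len(a):]) < observed:
            n_below += 1
    return n_below / n_shuffles >= 0.95
```

In the full procedure described above, this test would be repeated on each bootstrap-resampled dataset, and a neuron would count as stimulus-selective only if the test passes in at least 95% of them.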

Statistical analysis

No statistical methods were used to pre-determine sample sizes. Sample sizes were consistent with similar studies in the field. Statistical analyses were performed using MATLAB or GraphPad Prism (GraphPad Software). Two-way ANOVA was used to determine the significance of the effects. Correlation values were computed using Pearson’s correlation. Unless otherwise specified, data were reported as mean ± SEM and statistical significance was set at p < 0.05.

Data Availability

Spiking data and behavior data analyzed during this study are available upon request.

Acknowledgements

The authors thank Nanofabrication Facility for Advanced Brain Science at CEBSIT and Dr. Xiaocheng Li for supporting electrode fabrication and thank Dr. Muming Poo and Dr. Jun Yan for discussion and advice on various details in task design and data analysis. This work was supported by the National Science and Technology Innovation 2030 Major (No. 2021ZD0202200 and No. 2021ZD0202202), Shanghai Municipal Science and Technology Major Project (No. 2021SHZDZX), Lingang Laboratory (No. LG202105-01), the National Natural Science Foundation of China (No. 32200917), and Shanghai Pujiang Program (No. 23PJ1414400).

Additional information

Ethics

All experimental procedures were approved by the Animal Care and Use Committee at the Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, protocol # NA-056-2020.

Author Contributions

Tian-Yi Wang and Chengcong Feng contributed equally to this work. Tian-Yi Wang, conceptualization, performed all the analyses on electrophysiology data and optogenetic manipulation, all the related behavior training experiments, drafting and revising the article; Chengcong Feng, performed all the implantation surgeries, video analyses of mouse oral-facial movements during task learning and related behavior training, design of ultra-flexible microelectrode array devices; Chengyao Wang, design and fabrication of ultra-flexible microelectrode array devices; Chi Ren, supervision, funding acquisition, revising the article; Zhengtuo Zhao, conceptualization, supervision, funding acquisition, revising the article.

Funding

Ministry of Science and Technology of the People's Republic of China (MOST) (No. 2021ZD0202200)

Ministry of Science and Technology of the People's Republic of China (MOST) (No. 2021ZD0202202)

Shanghai Municipal People’s Government (2021SHZDZX) (No. 2021SHZDZX)

Shanghai Municipal People’s Government (2021SHZDZX) (No. 23PJ1414400)

Shanghai Municipal People’s Government (2021SHZDZX) (No. LG202105-01)

National Natural Science Foundation of China (NSFC) (No. 32200917)

Additional files

Supplemental Figures