Introduction

Adaptive behavior depends on evaluating sensory cues according to their emotional significance. Acoustic signals are particularly effective for threat detection because they provide rapid, omnidirectional information that often precedes danger. This broad accessibility reduces spatial specificity and increases learning variability, making auditory associative learning a powerful model for studying how the brain extracts shared, robust features across experiences for proper generalization. The prelimbic cortex (PL) contributes to the expression (Burgos-Robles et al., 2009; Sierra-Mercado et al., 2011; Sotres-Bayon & Quirk, 2010) and the proper discrimination and generalization of threat memories (Rosas-Vidal et al., 2025; Stujenske et al., 2022). Accumulating evidence also points to a broader role for the PL in auditory processing, particularly through top-down attentional modulation along the auditory pathways (Hockley & Malmierca, 2024; Zikopoulos & Barbas, 2006). However, it remains unclear whether the same or distinct PL neurons encode sensory features of sounds versus their learned or inferred emotional significance, and how these representations change with experience.

At the population level, episodic (DeNardo et al., 2019; Kitamura et al., 2017) and procedural (Do-Monte et al., 2015; Iqbal et al., 2026) memory ensembles undergo pronounced reorganization across days—a hallmark of systems consolidation—even as threat-associated behaviors remain stable over time. This dissociation between neural instability and behavioral persistence indicates that memory representations are not fixed at the level of individual neurons, but instead depend on population-level structure that is preserved across time (Deitch et al., 2021; Gallego et al., 2020). This idea raises several central questions: How can population-level representations remain functionally stable while their cellular constituents continually change? Are all neurons equally dynamic, or do distinct subpopulations exhibit differential stability? How do dynamic networks support appropriate generalization, ensuring that only potentially dangerous stimuli generate overlapping neural representations, amid ongoing ensemble turnover?

Recent studies demonstrate that PL neurons generalize emotional value by inferentially updating the valence of previously neutral stimuli after learning (Gu & Johansen, 2025). However, the neural mechanisms underlying this process remain unclear. Because generalization often extends to entirely novel stimuli that have never been experienced (for example, danger associated with a familiar siren generalizing to a new siren), it is unknown how subsets of neurons responding to these novel cues infer valence amid ongoing network dynamism. Here, our goal was to determine which network-level features preserve the essential components of a memory across time and how these elements support appropriate generalization to novel cues. We hypothesized that generalization to novel stimuli depends on stable subnetwork organization that enables comparisons between learned and inferred valence, as well as population-level features that reduce variability across related representations. To test this hypothesis, we combined longitudinal calcium imaging with computational analyses in freely moving mice to examine PL network dynamics during discriminative auditory learning and retrieval in the presence of both conditioned and novel tones. Our results show that stable cortical subnetworks integrate the emotional “gist” of memory and inferred valence for novel cues over time, despite ongoing ensemble reorganization, and that population-level firing rate similarity across stimulus presentations determines threat generalization.

Results

Logarithmic tone separation predicts freezing generalization

To assess memory and generalization, GRIN-lens–implanted and non-implanted mice were trained in a differential auditory fear-conditioning paradigm (Fig. 1a). One tone (CS⁺; 80 dB) was paired with a mild foot shock (0.5 mA, 0.5 s), whereas a second tone (CS⁻; 80 dB) was never paired with shock. In one group, the CS⁺ was 15 kHz and the CS⁻ was 3 kHz (CS⁺15; N = 27; 7 implanted, 20 non-implanted); in a second group, contingencies were reversed (CS⁺3; N = 22; 5 implanted, 17 non-implanted). A no-shock control group was exposed to the same tones without shock (N = 13; 6 implanted, 7 non-implanted).

Experimental design, behavior, histology and calcium imaging.

a, Experimental design illustrating discriminative fear conditioning with either a 15 kHz CS⁺ or a 3 kHz CS⁺ (experimental groups), and no-shock controls tested at identical time points but never exposed to footshock. All groups were tested with 3 and 15 kHz tones and two intermediate frequencies (7 and 11 kHz) on days 1, 15 and 30 after conditioning. b, Behavioral responses across testing days (CS⁺15 kHz: n = 27 mice; CS⁺3 kHz: n = 22 mice; no-shock controls: n = 13 mice). Two-way repeated-measures ANOVA: CS⁺15: effect of day: F(2,52) = 13.28, p < 0.001; frequency: F(3,78) = 85.09, p < 0.001; day × frequency interaction: F(6,156) = 3.79, p = 0.002; CS⁺3: effect of day: F(2,42) = 14.42, p < 0.001; frequency: F(3,63) = 58.81, p < 0.001; day × frequency interaction: F(6,126) = 1.41, p = 0.217; no-shock controls: effect of day: F(2,20) = 1.38, p = 0.276; frequency: F(3,30) = 1.43, p = 0.253; day × frequency interaction: F(6,60) = 0.335, p = 0.916. Significant Tukey multiple comparisons are denoted by asterisks, ***p < 0.001. c, Representative histological section showing GCaMP6f expression. Imaging depth did not exceed 300 μm, restricting recordings to PL. d, Example calcium imaging data showing raw fluorescence signals, deconvolved activity, thresholded events and corresponding activity traces.

Memory retrieval was tested on days 1, 15, and 30 after conditioning to probe early, long-term, and remote memory (Bontempi et al., 1996). During retrieval, mice were tested in a novel context with the CS⁺, CS⁻, and two intermediate frequencies (7 and 11 kHz), presented in semi-random order, with each tone repeated three times (Fig. 1a). Frequencies were spaced linearly to assess whether logarithmic frequency separation predicts similarity to the threat-associated tone. Although CS⁺/CS⁻ separation was identical across groups (2.32 octaves), spacing between the CS⁺ and adjacent frequencies differed (0.45 octaves for 11–15 kHz; 1.22 octaves for 3–7 kHz). Accordingly, if logarithmic separation governs threat generalization, freezing was expected to generalize to 11 kHz in CS⁺15 mice, with weaker generalization between 3 and 7 kHz in CS⁺3 mice.
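The octave separations quoted above follow directly from the base-2 logarithm of the frequency ratio. The short sketch below reproduces that arithmetic for the four test tones; the function name `octave_distance` is introduced here for illustration only and is not part of the original analysis.

```python
import math

def octave_distance(f1_khz: float, f2_khz: float) -> float:
    """Separation between two frequencies in octaves: |log2(f2 / f1)|."""
    return abs(math.log2(f2_khz / f1_khz))

# Tone frequencies presented at retrieval (kHz), as stated in the text.
tones_khz = [3, 7, 11, 15]

# Separation between each pair of adjacent test tones.
for low, high in zip(tones_khz, tones_khz[1:]):
    print(f"{low}-{high} kHz: {octave_distance(low, high):.2f} octaves")

# Separation between the two conditioned tones (CS+ and CS-).
print(f"3-15 kHz: {octave_distance(3, 15):.2f} octaves")
```

Although the tones are 4 kHz apart on a linear scale, the 11–15 kHz pair is less than half an octave apart, whereas the 3–7 kHz pair is separated by more than one octave, which is the asymmetry the behavioral prediction relies on.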

In the CS⁺15 group, mice consistently discriminated between the CS⁺ and CS⁻. Although generalization to intermediate frequencies was broad on day 1, it became progressively restricted to 11 kHz—the frequency adjacent to the CS⁺—during long-term (day 15) and remote (day 30) retrieval (p < 0.05; Fig. 1b, left). CS⁺3 mice also discriminated between the CS⁺ and CS⁻; however, these animals exhibited greater freezing to the CS⁺ than to all intermediate frequencies across testing days (p < 0.05; Fig. 1b, center), indicating reduced generalization to the adjacent tone when the logarithmic separation was larger. No-shock control mice showed no significant differences in freezing across frequencies on any testing day (p > 0.05; Fig. 1b, right), confirming that freezing reflected associative learning. Together, these results indicate that logarithmic spacing, rather than simple proximity to the CS⁺, was the primary determinant of generalization.

No sex differences were observed in CS⁺15 mice (11 females, 16 males) or no-shock controls (6 females, 7 males; p > 0.05). In CS⁺3 mice (11 females, 11 males), minor and inconsistent sex-related effects were detected on days 1 and 15, but these effects were absent by day 30 (Fig. S1). Accordingly, sex was not included as a factor in subsequent neural analyses.

Behavioral performance did not differ between implanted and non-implanted mice on days 1 and 15 (p > 0.05). On day 30, a modest main effect of group was observed in the CS⁺3 group, reflecting higher overall freezing levels in implanted animals compared with non-implanted mice (p < 0.03). Importantly, both experimental groups exhibited robust frequency effects across days (p < 0.001), with no group × frequency interactions (p > 0.05), indicating preserved threat discrimination and generalization profiles. A similar elevation in overall freezing was observed in implanted no-shock controls (p < 0.05), suggesting that group differences reflect nonspecific changes in behavioral output rather than differences in associative learning.

PL sound-responsive networks exhibit dynamic properties over time

To examine the temporal evolution of PL neuronal responses, we performed longitudinal calcium imaging in CS⁺15 (N = 7), CS⁺3 (N = 5), and no-shock control mice (N = 6). Across groups, neurons were tracked during conditioning and all retrieval sessions. Figs. 1c and 1d show GCaMP6f expression in PL, representative calcium footprints, and activity traces. Sound-evoked responses were visualized using raster plots with calcium activity ranked across simultaneously recorded neurons (Fig. 2a). Across animals and sessions, we identified distinct neuronal populations showing positive modulation, negative modulation, mixed responses, or no consistent response to sound (Fig. 2b). Sound responder neurons were classified using a test that detected modulation based on magnitude relative to baseline variability, allowing reliable identification of both transient and sustained responses while remaining robust to noise. Fig. 2b summarizes the proportions of response types pooled across animals and sessions.
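As an illustration of what classifying responders by "magnitude relative to baseline variability" can look like, the sketch below scores the mean tone-period activity as a z-score against the baseline and applies a symmetric threshold. The exact test used in the study is not specified in the text, so the z-score criterion, the threshold of 2, and the function name `classify_responder` are all assumptions; the mixed-response category is not modeled here.

```python
from statistics import mean, stdev

def classify_responder(baseline, stimulus, z_threshold=2.0):
    """Label a neuron from its baseline and tone-period activity samples.

    Assumed criterion: the mean tone-period activity is expressed as a
    z-score relative to the baseline mean and standard deviation.
    Requires at least two baseline samples.
    """
    base_mean = mean(baseline)
    base_sd = stdev(baseline)
    if base_sd == 0:
        # No baseline variability: the z-score is undefined, so stay neutral.
        return "non-responsive"
    z = (mean(stimulus) - base_mean) / base_sd
    if z >= z_threshold:
        return "positive"
    if z <= -z_threshold:
        return "negative"
    return "non-responsive"

# Hypothetical activity samples (arbitrary units), for illustration only.
baseline = [0.10, 0.20, 0.15, 0.12, 0.18]
print(classify_responder(baseline, [0.9, 1.0, 0.8]))     # strong increase
print(classify_responder(baseline, [0.0, 0.01, 0.0]))    # strong decrease
print(classify_responder(baseline, [0.14, 0.16, 0.15]))  # no change
```

Scaling by baseline variability means a noisy neuron needs a larger absolute change to be counted as a responder than a quiet one.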

Neuronal activity distributions across days and stability of the active network.

a, Raster plots from an animal trained with a 15 kHz CS+, ordered by activity level. Raster plots illustrate positive sound-responsive neurons (bottom), negative sound-responsive neurons (top), and mixed or non-responsive neurons (middle). b, Proportions of neuronal response types across experimental groups and testing days. Number of cells: CS+15: conditioning: 3,834; day 1: 3,177; day 15: 4,186; day 30: 3,693; CS+3: conditioning: 1,628; day 1: 1,381; day 15: 1,881; day 30: 1,833; control: conditioning: 1,118; day 1: 986; day 15: 1,386; day 30: 563 (2 mice only yielded data up to day 15). c, Venn diagrams showing the proportions of consistently active neurons (active across all retrieval sessions; one-way ANOVA: F(2,13) = 2.70, p = 0.104), partially active neurons (active in two sessions; one-way ANOVA: days 1–15, F(2,13) = 1.45, p = 0.271; days 15–30, F(2,13) = 1.02, p = 0.388; days 1–30, F(2,13) = 0.33, p = 0.723), and transiently active neurons (active in a single session; day 1, Kruskal–Wallis: H(2) = 1.90, p = 0.387; day 15, one-way ANOVA: F(2,13) = 0.06, p = 0.942; day 30, Kruskal–Wallis: H(2) = 7.52, p < 0.05). Diagrams and analyses for control mice include only mice recorded for 30 days.

Given the ongoing debate over whether neocortical memory traces stabilize or remain dynamic over time (DeNardo et al., 2019; Kitamura et al., 2017; Kupke & Oliveira, 2025; Lopez et al., 2024; Mau et al., 2020; Rao-Ruiz et al., 2021; Refaeli et al., 2023; Terranova et al., 2023; Zaki & Cai, 2024), we next assessed the temporal stability of PL networks by tracking the cellular footprints of sound-responsive neurons across retrieval sessions. Network stability was visualized using Venn diagrams (Fig. 2c). A moderate proportion of neurons was present across all retrieval sessions, with no differences between groups (p > 0.05). Likewise, there were no group differences in the proportion of neurons overlapping across only two sessions (p > 0.05). For neurons present in only a single session, control mice exhibited a higher proportion on day 30 compared with CS⁺15 mice (p < 0.05), with no other differences observed. Overall, these data indicate that the majority of neurons overlapped across only a limited number of sessions or were transiently active, reflecting pronounced population-level dynamism over time.

Sound-modulated PL population responses encode learned and inferred valence

To assess population-level representations of learned and novel tones, calcium activity was averaged across three presentations for each frequency during a 10-s baseline, 20-s tone presentation, and 10-s post-stimulus interval. In CS⁺15 mice, positively modulated sound-responsive neurons exhibited graded tone-evoked activity reflecting the learned valence of the conditioned tones as well as the inferred valence of novel tones across testing days (Fig. 3a, top). Area-under-the-curve (AUC) analyses, obtained by averaging recorded cells per animal, revealed the highest responses at 15 and 11 kHz, intermediate responses at 7 kHz, and the lowest at 3 kHz; AUCs for 11 and 15 kHz exceeded those for 3 kHz (p < 0.05), with no difference between 11 and 15 kHz (p > 0.05) across testing days. This population-level gradient mirrored behavioral generalization (Fig. 1b, left), with strong responses to 11 kHz and intermediate responses to 7 kHz. By contrast, negatively modulated neurons showed overlapping responses across frequencies with no significant differences at any time point (p > 0.05; Fig. 3a, bottom).
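The averaging and AUC steps described above can be sketched as follows. This is a minimal illustration, assuming a trace sampled at a fixed rate, a baseline-subtracted tone window, and trapezoidal integration; the sampling rate, the baseline subtraction, and the function names are assumptions rather than details given in the text.

```python
def average_presentations(presentations):
    """Average equally long traces sample-by-sample across presentations."""
    n = len(presentations)
    return [sum(samples) / n for samples in zip(*presentations)]

def tone_auc(trace, fs_hz, baseline_s=10.0, tone_s=20.0):
    """Trapezoidal area under the baseline-subtracted trace during the tone.

    The trace is assumed to start with `baseline_s` seconds of baseline,
    followed by `tone_s` seconds of tone (10 s and 20 s in the text).
    """
    n_base = int(round(baseline_s * fs_hz))
    n_tone = int(round(tone_s * fs_hz))
    baseline = trace[:n_base]
    tone = trace[n_base:n_base + n_tone]
    offset = sum(baseline) / len(baseline)
    corrected = [x - offset for x in tone]
    dt = 1.0 / fs_hz
    return sum((a + b) / 2.0 * dt for a, b in zip(corrected, corrected[1:]))

# Hypothetical 1 Hz traces: 10 s baseline, 20 s tone, 10 s post-stimulus.
trial = [0.0] * 10 + [1.0] * 20 + [0.0] * 10
mean_trace = average_presentations([trial, trial, trial])
print(tone_auc(mean_trace, fs_hz=1.0))
```

In a per-animal analysis, the AUC would be computed for each tone frequency from the animal's averaged cells, giving one value per frequency per animal for the statistical comparison.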

Population activity of positive sound responders shows graded emotional-valence patterns in response to tones.

a-c, Population responses in animals trained with a 15 kHz CS+ (a), a 3 kHz CS+ (b), and no-shock controls (c). In all groups, upper panels show positively responsive neurons and lower panels negatively responsive neurons. Boxplots show the median (center line), interquartile range (box), and whiskers extending to ±1.5× the interquartile range; points outside the whiskers represent individual observations beyond this range. CS+15: positively modulated: day 1: F(3,18) = 9.963, p < 0.001; day 15: F(3,18) = 9.973, p < 0.001; day 30: F(3,18) = 6.627, p = 0.003; negatively modulated: day 1: F(3,18) = 2.483, p = 0.094; day 15: F(3,18) = 1.877, p = 0.178; day 30: F(3,18) = 2.753, p = 0.073. CS+3: positively modulated: day 1: F(3,12) = 6.899, p = 0.006; day 15: F(3,12) = 9.247, p = 0.002; day 30: F(3,12) = 6.123, p = 0.009; negatively modulated: day 1: F(3,12) = 0.87, p = 0.484; day 15: F(3,12) = 0.512, p = 0.641; day 30: F(3,12) = 1.448, p = 0.278. No-shock control: positively modulated: day 1: F(3,15) = 0.527, p = 0.670; day 15: F(3,15) = 1.852, p = 0.181; day 30: F(3,9) = 1.046, p = 0.418; negatively modulated: day 1: F(3,13) = 1.205, p = 0.347; day 15: F(3,13) = 1.375, p = 0.294; day 30: F(3,9) = 0.95, p = 0.457. Significant Tukey multiple comparisons are denoted by asterisks, *p < 0.05, **p < 0.01, ***p < 0.001.

A complementary pattern was observed in CS⁺3 mice. On day 1, the AUC for 3 kHz was greater than that for 11 kHz (p < 0.05), but only showed a trend relative to 15 kHz (p = 0.68; Fig. 3b); however, on days 15 and 30, it exceeded that for both 11 and 15 kHz (p < 0.05). Responses to 7 kHz differed significantly from those to 3 kHz on days 1 and 15 (p < 0.05), but not on day 30. As in CS⁺15 mice, negatively modulated neurons exhibited overlapping responses across frequencies and days (p > 0.05). In no-shock controls, although both positive and negative responses were present, population activity was not modulated by tone frequency or valence (p > 0.05; Fig. 3c, bottom panel), indicating that graded responses require associative learning. Together, these results show that despite substantial neuronal turnover, PL population responses encode graded sound-valence associations that reflect both learning and inference, closely matching behavioral generalization.

Consistently active neurons preserve valence representations as newly recruited neurons sharpen remote memory traces

Longitudinal tracking of individual neurons revealed how subpopulations with distinct stability profiles shape the evolution of cortical memory representations. We compared population responses across three stability types: consistently active neurons (active during conditioning and all retrieval sessions), emerging–retained neurons (recruited after conditioning and persisting through day 30), and transiently active neurons (only present during a single retrieval session). Consistently active, positively modulated neurons exhibited graded population responses reflecting both learned and inferred sound associations from day 1 onward in both experimental groups, which was quantified by calculating AUC per animal (p < 0.05; Fig. S2a–b, Table S1). In contrast, negatively responding neurons showed no graded tuning across days (p > 0.05), except for a day 1 difference between 15 and 3 kHz in CS⁺15 mice (p < 0.05). In control mice, positively and negatively modulated neurons displayed variable and inconsistent activity across frequencies in all cell categories (consistently active, emerging–retained, and transiently active). As a result, these neurons were not included in further cell-type analyses.

Emerging–retained neurons recruited after conditioning exhibited graded population responses when recruitment occurred after day 1 (Fig. S3a–b, Table S1). This category encompassed neurons that emerged on day 1 and persisted through day 30, as well as neurons that emerged on day 15 and remained active through day 30. In both experimental groups, graded tuning was absent on day 1 (p > 0.05) but emerged on days 15 and 30, when population responses closely mirrored behavioral generalization, with responses to 11 and 15 kHz becoming increasingly similar regardless of CS⁺ identity (p > 0.05). By contrast, emerging–retained negatively responding neurons showed no significant emotional tuning across days.

Finally, we examined transiently active neurons, defined as cells present in only a single retrieval session. Transiently positive responders showed no valence tuning when active on day 1 but exhibited clear graded population responses when recruited on days 15 or 30 in both experimental groups (Fig. S4a–b, Table S1). In contrast, transiently negative responders showed no significant frequency tuning at any time point (p > 0.05, Table S1). Together, these results indicate that consistently active neurons maintain stable representations of learned and inferred sound associations across time, whereas neurons recruited after conditioning progressively acquire graded tuning at later retrieval stages. This dynamic refinement suggests that cortical memory representations become increasingly selective during systems consolidation, while a stable neuronal subpopulation preserves the core emotional content of the memory.

Population vector similarity at stimulus onset determines degree of generalization

Consistent neural responses across stimulus repetitions within sessions support stable population representations that enable recognition and generalization to similar cues (Hoshi et al., 2023). To assess population similarity within each retrieval session, we compared activity across repeated tone presentations (three per session, presented in semirandom order). Population activity was estimated using CASCADE, a validated spike-inference neural network (Rupprecht et al., 2021), and activity vectors were constructed from simultaneously recorded neurons. We quantified response similarity across tone pairs during stimulus presentation to assess temporal and rate consistency at the population level.

Rate-based population similarity analyses revealed reliable, temporally structured responses for 3/3 tone pairs in CS⁺3 mice and for 15/15 and 15/11 tone pairs in CS⁺15 mice that were absent in control mice for any frequency pair (Fig. 4a-d). Because population similarity peaked shortly after stimulus onset, we quantified similarity during the first 5 s after tone onset relative to the CS⁺. In CS⁺15 mice, population similarity was highest for 15/15 and 15/11 tone pairs (p < 0.001), with no difference between them (p > 0.05), and was significantly greater than for 15/3 comparisons (p < 0.05, Fig. 4e). In CS⁺3 mice, similarity was highest for 3/3 tone pairs (p < 0.004) and significantly lower for 11/3 and 15/3 comparisons (Fig. 4f; p < 0.05). The 3/3 and 3/7 comparisons showed a non-significant trend (p = 0.08). No differences were observed among 7/15, 11/15, and 15/15 tone pairs in either group (p > 0.05). These findings indicate that population-level similarity at stimulus onset scales with behavioral threat generalization and is maximal for tones associated with robust threat responses.
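One way to picture the onset-window comparison is to reduce each tone presentation to a population vector of mean inferred rates over the first 5 s and compare two such vectors. The text does not name the similarity metric, so cosine similarity is used here purely as an assumption, and `onset_vector` and `cosine_similarity` are illustrative names.

```python
import math

def onset_vector(rates, fs_hz, window_s=5.0):
    """Mean rate of each neuron over the first `window_s` seconds of a tone.

    `rates` holds one time series per neuron, aligned to tone onset.
    """
    n = int(round(window_s * fs_hz))
    vector = []
    for series in rates:
        window = series[:n]
        vector.append(sum(window) / len(window))
    return vector

def cosine_similarity(u, v):
    """Cosine of the angle between two population vectors (0.0 if either is null)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical inferred rates for three neurons (1 Hz sampling, 6 s each).
first_cs = [[2, 2, 2, 2, 2, 0], [1, 1, 1, 1, 1, 0], [0, 0, 0, 0, 0, 0]]
second_cs = [[2, 2, 2, 2, 2, 0], [1, 1, 1, 1, 1, 0], [0, 0, 0, 0, 0, 0]]
other_tone = [[0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0], [3, 3, 3, 3, 3, 0]]

cs_vec = onset_vector(first_cs, fs_hz=1.0)
print(cosine_similarity(cs_vec, onset_vector(second_cs, fs_hz=1.0)))
print(cosine_similarity(cs_vec, onset_vector(other_tone, fs_hz=1.0)))
```

Under this reading, repeated presentations that recruit the same neurons at similar relative rates yield values near 1, whereas presentations that engage different neurons yield values near 0.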

Population vector similarity across tones and time.

a-c, Population similarity maps for all tone pairs across the time course of sound presentation in the CS+15 (a), CS+3 (b), and no-shock control (c) groups. d, Schematic illustrating the color scale and map orientation; the y-axis denotes the earlier tone in the comparison, and the x-axis denotes the later tone. e-f, Box plots showing population similarity during the first 5 s following tone onset, quantified relative to the CS+ for the CS+15 (e, F(3,36) = 12.025, p < 0.001) and CS+3 (f, F(3,24) = 7.435, p = 0.004) groups. Significant Tukey multiple comparisons are denoted by asterisks, *p < 0.05, **p < 0.01, ***p < 0.001.

Different subnetworks encode acoustic versus learned properties of sound association

Our previous analyses show that learned and inferred associations are represented at the population level. However, these results do not resolve whether graded responses arise from pooled activity of frequency-selective neurons or from subnetworks encoding integrated learned valence across tones. Because most neurons exhibit dynamic properties and only a subset remains stable over time, we hypothesized that the PL active ensemble segregates into functionally distinct subnetworks: one encoding tone-specific sensory features with dynamic characteristics, and another responding to all frequencies encoding stable core memory content and inferred emotional valence.

To test this hypothesis, we developed a clustering approach based on mutual information (MI), which captures both linear and nonlinear relationships between neuronal response profiles (Quian Quiroga & Panzeri, 2009). MI was used to group neurons with related activity patterns into functional subnetworks. Because MI does not preserve response polarity, neurons were subsequently classified by the sign of their pairwise correlations and re-clustered, consistently revealing two major response classes: positively and negatively modulated sound responders (Fig. 5a). Clustering was performed across all animals to derive global cluster identities (Fig. 5) and within each experimental and control group to identify learning-specific subnetworks (Figs. S5–S7).
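To make the signed-MI idea concrete, the sketch below estimates MI between two activity traces from a simple equal-width histogram and attaches the sign of their Pearson correlation. The binning scheme, the number of bins, and the function names are assumptions, since the text does not specify the MI estimator; the clustering step itself (which would take the resulting pairwise matrix as input) is not shown.

```python
import math
from collections import Counter

def discretize(x, n_bins=4):
    """Map values to equal-width bin indices between min(x) and max(x)."""
    lo, hi = min(x), max(x)
    if hi == lo:
        return [0] * len(x)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in x]

def mutual_information(x, y, n_bins=4):
    """Plug-in MI estimate (in bits) between two discretized traces."""
    bx, by = discretize(x, n_bins), discretize(y, n_bins)
    n = len(bx)
    joint, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum((c / n) * math.log2(c * n / (px[a] * py[b]))
               for (a, b), c in joint.items())

def pearson(x, y):
    """Pearson correlation; returns 0.0 when either trace is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def signed_mi(x, y, n_bins=4):
    """MI magnitude carrying the sign of the pairwise correlation."""
    sign = 1.0 if pearson(x, y) >= 0 else -1.0
    return sign * mutual_information(x, y, n_bins)

# Hypothetical traces: one neuron, a co-active neuron, an anti-correlated neuron.
neuron_a = [0, 1, 0, 1, 0, 1, 0, 1]
neuron_b = list(neuron_a)
neuron_c = [1 - v for v in neuron_a]
print(signed_mi(neuron_a, neuron_b), signed_mi(neuron_a, neuron_c))
```

The example shows why the sign step matters: the co-active and anti-correlated pairs share the same MI magnitude, and only the correlation sign separates them.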

Clustering of PL subnetworks based on signed mutual information

a, Schematic of the mutual information (MI)–based clustering pipeline. An unsorted MI matrix computed from simultaneously recorded prelimbic (PL) neurons was first subjected to spectral clustering to identify primary clusters based on shared information structure, independent of response sign. The MI matrix was then reordered according to these primary cluster assignments. Within each primary cluster, MI values were combined element-wise with a corresponding sign matrix encoding the direction of correlation between cell pairs, yielding a signed MI matrix. Spectral clustering was subsequently applied independently within each primary cluster to identify secondary subclusters with distinct signed interaction patterns. Each subcluster was assigned a unique label, and all labels were combined to generate the final clustered signed MI matrix, enabling separation of positively and negatively modulated sound-responsive neurons. b, Final clustered signed MI matrices for the experimental and control groups. Matrices are sorted by cluster labels for the CS+15 group (top), CS+3 group (middle), and No Shock group (bottom). Color scale indicates signed MI strength (HI to LO). c, Average stimulus-aligned population responses for clusters showing strong positive modulation to individual tones. C.1–C.4 show primary responses to 3 kHz, 7 kHz, 11 kHz, and 15 kHz tones, respectively, across groups. d, Average stimulus-aligned population responses for clusters showing strong negative modulation to individual tones. D.1–D.4 show primary responses to 3 kHz, 7 kHz, 11 kHz, and 15 kHz tones, respectively, across groups.

Clustering quality was assessed by sorting signed MI values by global cluster identity and comparing within- versus across-cluster correlations (Fig. 5b). The resulting box-diagonal structure indicated robust clustering, with comparable quality indices across groups (CS⁺15: 0.28; CS⁺3: 0.27; controls: 0.30). In all cases, within-cluster correlations exceeded across-cluster correlations, supporting the reliability of the approach (CS⁺15: 0.30 vs. −0.03; CS⁺3: 0.27 vs. −0.02; controls: 0.28 vs. −0.04).
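The within- versus across-cluster comparison can be illustrated with a small helper that splits the off-diagonal entries of a pairwise matrix by whether the two neurons share a cluster label. This is only a sketch of the comparison itself: the quality index reported above is not defined in the text and is not reproduced here, and `within_across_means` is a hypothetical name. Both groups of pairs are assumed to be non-empty.

```python
from statistics import mean

def within_across_means(matrix, labels):
    """Mean pairwise value for same-cluster pairs and for different-cluster pairs.

    `matrix` is a symmetric list of lists; `labels[i]` is the cluster of neuron i.
    Only the upper triangle (i < j) is visited, so each pair counts once.
    """
    within, across = [], []
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):
            if labels[i] == labels[j]:
                within.append(matrix[i][j])
            else:
                across.append(matrix[i][j])
    return mean(within), mean(across)

# Hypothetical 4-neuron matrix with two clusters of two neurons each.
pairwise = [
    [1.0, 0.8, 0.0, -0.1],
    [0.8, 1.0, 0.1, 0.0],
    [0.0, 0.1, 1.0, 0.6],
    [-0.1, 0.0, 0.6, 1.0],
]
cluster_labels = [0, 0, 1, 1]
print(within_across_means(pairwise, cluster_labels))
```

A box-diagonal matrix of the kind described corresponds to a within-cluster mean that clearly exceeds the across-cluster mean.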

Across all groups, we identified positively and negatively modulated clusters selective for individual frequencies, indicating that frequency-specific representations are present in PL independent of learning (Fig. 5c–d; Figs. S5–S6), consistent with a role in auditory processing (Hockley & Malmierca, 2024; Zikopoulos & Barbas, 2006). In contrast, clusters encoding graded learned associations, characterized by responses to all frequencies that scaled with learned and inferred emotional value, were observed exclusively in trained animals (Fig. 6; Figs. S5–S6). Responses peaking at 15 kHz were specific to the CS⁺15 group (Fig. 6a–b; Fig. S5), whereas responses peaking at 3 kHz were unique to the CS⁺3 group (Fig. 6d–e; Fig. S6).

Graded emotional response clusters are present in experimental groups.

a-b, Average stimulus-aligned population responses for clusters showing graded emotional valence in animals trained with a 15 kHz CS+, showing positive (a) and negative (b) response patterns. c-e, Average stimulus-aligned population responses for clusters showing graded emotional valence in animals trained with a 3 kHz CS+, showing positive (c-d) and negative (e) response patterns. f-g, Analysis of stability of neuronal identity within graded valence clusters across retrieval sessions. Benjamini–Hochberg corrected significance denoted by asterisks. h-i, Baseline/stimulus firing rate ratio (BSR) showing changes in activity of graded clusters only present in the experimental CS+15 (h) and CS+3 (i) groups across days. Significant Tukey multiple comparisons are denoted by asterisks, *p < 0.05, **p < 0.01, ***p < 0.001.

If neurons encoding graded responses carry core mnemonic information, they should exhibit enhanced stability over time. To test this hypothesis, we quantified the proportion of registered neurons that retained their cluster identity across at least two retrieval sessions and compared these values to a shuffled null distribution (10,000 iterations), with multiple comparisons controlled using the Benjamini–Hochberg procedure. Only the positively modulated clusters encoding graded learned associations showed significant identity stability across all retrieval intervals (days 1–15, 15–30, and 1–30; Fig. 6a-c). In contrast, negatively modulated graded responders and a smaller positively modulated cluster in the CS⁺3 group showed stability only between adjacent sessions (Fig. 6b, d, e, Table S2). We repeated this analysis focusing only on consistently active neurons and obtained the same results. These data demonstrate that graded valence clusters remain consistently active and retain stable cellular identity, enabling encoding of core memory components and providing a stable scaffold for inferred valence comparisons.
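The logic of comparing identity retention against a shuffled null and then correcting across comparisons can be sketched as below. The text gives the number of shuffles and the correction procedure but not what was shuffled, so permuting the second session's cluster labels across neurons is an assumption, as are the one-sided p-value and all function names.

```python
import random

def retention_fraction(labels_a, labels_b, target):
    """Fraction of neurons in cluster `target` in session A that keep it in session B."""
    members = [i for i, label in enumerate(labels_a) if label == target]
    if not members:
        return 0.0
    return sum(labels_b[i] == target for i in members) / len(members)

def permutation_p_value(labels_a, labels_b, target, n_iter=10000, seed=0):
    """One-sided p-value: how often shuffled labels retain identity as well as observed."""
    rng = random.Random(seed)
    observed = retention_fraction(labels_a, labels_b, target)
    shuffled = list(labels_b)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(shuffled)  # break the neuron-to-label correspondence
        if retention_fraction(labels_a, shuffled, target) >= observed:
            hits += 1
    return (hits + 1) / (n_iter + 1)

def benjamini_hochberg(p_values, alpha=0.05):
    """Return a reject/keep flag per p-value under the BH step-up procedure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            largest_rank = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= largest_rank:
            rejected[idx] = True
    return rejected

# Hypothetical labels for 20 registered neurons in two sessions (1 = graded cluster).
session_1 = [1] * 10 + [0] * 10
session_2 = [1] * 9 + [0] * 11
p = permutation_p_value(session_1, session_2, target=1, n_iter=2000)
print(p, benjamini_hochberg([p, 0.04, 0.5]))
```

Adding one to both the hit count and the iteration count keeps the estimated p-value strictly above zero, which is a common convention for permutation tests.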

Graded clusters encode emotional valence but constitute only a fraction of the active population; yet valence coding at the population level remains accurate and precise. This indicates that neurons newly recruited into the population—likely frequency-selective and organized within learning-independent clusters—can be shaped by associative processes through modulation of firing activity. To test this hypothesis, we calculated the normalized baseline/stimulus mean firing rate ratio (BSR) of positively responding neurons within the identified clusters using CASCADE (Rupprecht et al., 2021).
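For orientation, a normalized comparison of stimulus-period and baseline firing rates can be computed as a bounded contrast index. The exact normalization behind the BSR is not given in the text, so the formula below, (stimulus − baseline) / (stimulus + baseline), is an assumed stand-in chosen so that larger values indicate stronger tone-evoked firing; `normalized_rate_contrast` is a hypothetical name.

```python
def normalized_rate_contrast(baseline_rates, stimulus_rates):
    """Assumed normalization: (stim - base) / (stim + base), bounded in [-1, 1].

    Inputs are non-negative inferred firing rates sampled during the baseline
    and tone periods. Returns 0.0 when the neuron is silent in both periods.
    """
    base = sum(baseline_rates) / len(baseline_rates)
    stim = sum(stimulus_rates) / len(stimulus_rates)
    if base + stim == 0:
        return 0.0
    return (stim - base) / (stim + base)

# Hypothetical inferred rates (spikes/s) for a single neuron.
print(normalized_rate_contrast([1.0, 1.0], [3.0, 3.0]))  # tone-evoked increase
print(normalized_rate_contrast([2.0, 2.0], [2.0, 2.0]))  # no modulation
```

A bounded index of this kind makes neurons with very different absolute firing rates comparable within and across clusters.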

Although clusters of neurons responding to individual tones emerged independently of learning (i.e., they were present in experimental and control mice), their activity was modulated by associative processes. Specifically, neurons in clusters selectively responding to 3 kHz exhibited higher BSR in both experimental groups compared with neurons from the same cluster in controls (p < 0.05), indicating elevated activity regardless of whether 3 kHz served as the CS⁺ or CS⁻. By contrast, neurons selectively responding to 11 or 15 kHz in the CS⁺15 group showed significantly higher BSR than corresponding neurons in the CS⁺3 or control groups on days 15 and 30 (p < 0.05, Fig. S8 and Table S3). Importantly, in the CS⁺15 group, where robust threat generalization to 11 kHz was observed, clusters associated with 15 or 11 kHz exhibited similarly high BSR to both tones, indicating comparable firing-rate responses. As expected, the BSR of positively modulated graded-responder neurons mirrored freezing behavior across experimental groups, with higher BSR to the CS⁺ and to tones eliciting robust generalization, and lower BSR to tones that elicited discrimination (p < 0.05; Fig. 6h–i, Table S3).

Together, these results indicate that the firing rates of both stable graded clusters and dynamic tone-selective clusters encode learned and inferred emotional valence.

Discussion

Our results show that PL populations encode learned emotional valence despite substantial turnover in active neurons over time. A subset of neurons remains consistently active across sessions, preserving core components of the memory trace and supporting inference of emotional valence for novel sounds, while neurons recruited after conditioning progressively acquire valence selectivity at remote time points. Population similarity at stimulus onset is observed for threat-associated cues and for stimuli that elicit robust generalization, consistent with the idea that generalization arises when similar inputs reactivate shared population states (Aschauer et al., 2022). Notably, population responses emerge from coordinated activity across distinct subpopulations, including stable subnetworks encoding core memory content and inferred associations and dynamic sensory ensembles providing flexibility through learning-dependent modulation.

These findings bear directly on debates regarding systems consolidation and cortical memory stability (Frankland & Bontempi, 2005; Lopez et al., 2024). Classical systems consolidation models propose that memory formation initially depends on subcortical regions, followed by the gradual transfer and stabilization of memory traces in the neocortex (Moscovitch & Nadel, 1998; Nadel et al., 2000). In contrast, multiple-trace frameworks posit parallel encoding and continuous reorganization, with memory representations stabilizing only as they become increasingly schematic (Moscovitch & Nadel, 1998; Nadel et al., 2000; Tonegawa et al., 2018). Our data reconcile these views by demonstrating that cortical representations of emotional valence emerge rapidly after learning and persist within stable subnetworks, even as the broader population undergoes substantial turnover. This architecture preserves core mnemonic content while allowing flexibility in the surrounding ensemble.

At the circuit level, this organization aligns with principles established for memory engrams. A large body of work has shown that memories are encoded in sparse neuronal ensembles whose activity is both necessary and sufficient for memory expression (Josselyn & Tonegawa, 2020), and that these ensembles form stable functional units embedded within distributed circuits through strengthened synaptic connectivity (Tonegawa et al., 2018). Consistent with this framework, the graded valence subnetworks identified here exhibit hallmark properties of canonical engram populations, including stable cellular identity and persistent response profiles. Importantly, these subnetworks encode both learned contingencies and the inferred valence of novel stimuli along a graded representational axis, suggesting that strong recurrent connectivity provides a stable scaffold for emotional memory representations.

In the auditory cortex, neurons exhibiting sound-evoked suppression (“negative responders”) are thought to contribute to lateral inhibition, sharpening frequency tuning and improving signal-to-noise by suppressing activity in neurons tuned to similar or neighboring frequencies (Kato et al., 2017; Wehr & Zador, 2003). In our study, negative responders did not display consistent graded responses at the population level. However, we identified negatively graded clusters whose response profiles mirrored those of positively graded clusters, exhibiting opposite modulation across frequencies. This organization suggests that suppressive responses form structured subnetworks that may provide complementary inhibitory contrast to excitatory valence-encoding ensembles. Such opponent network dynamics could enhance discriminability at the population level without requiring individual negative responders to exhibit stable tuning across time.

Importantly, the precision of graded population responses does not arise from learning-dependent recruitment of all participating neurons. Dynamic tone-selective neurons emerge independently of learning, as they are present in both control and experimental mice, reflecting pre-existing sensory-driven properties of the PL (Hockley & Malmierca, 2024; Zikopoulos & Barbas, 2006). However, the activity of these subnetworks is strongly modulated by associative learning, indicating that learning reshapes the gain of dynamic sensory ensembles. Conversely, graded valence clusters emerge only after associative conditioning and maintain elevated activity over time. This interaction between pre-existing sensory organization and learning-dependent modulation provides a mechanism by which PL circuits remain flexible while maintaining emotionally relevant information (Mau et al., 2020). Within this framework, PL supports threat generalization and inference by engaging overlapping and consistent population states for behaviorally generalizing cues. Together, this organization provides a circuit-level substrate through which PL compares novel sensory inputs with established emotional representations, enabling appropriate generalization or discrimination while preserving core emotional memory components despite ongoing ensemble reorganization.

Methods

Subjects

Female and male C57BL/6J mice (IMSR_JAX:000664; Jackson Laboratory, Bar Harbor, ME), aged 8–10 weeks, were housed on a 12 h light/dark cycle. All experiments were conducted during the light phase of the cycle. Animal housing and care were consistent with standards set by the Association for Assessment and Accreditation of Laboratory Animal Care (AAALAC). All experimental procedures were approved by the University of Iowa Institutional Animal Care and Use Committee.

Discriminatory fear conditioning task

On the training day, mice were placed in a square conditioning chamber (20 × 20 cm; Context A, Fig. 1a). The first 3 minutes served as a habituation period. Following habituation, experimental mice received presentations of a conditioned stimulus (CS+) tone (15 or 3 kHz, 80 dB, 20.5 s) that co-terminated with a mild foot shock (unconditioned stimulus, US; 0.5 s, 0.5 mA). Between CS+ presentations, animals were presented with a non-conditioned stimulus (CS−) tone (3 or 15 kHz, 80 dB, 20.5 s) that was never paired with shock. Mice received 10 CS+ tone–shock pairings and 10 CS− tone presentations. Intertrial intervals between CS+ and CS− presentations alternated pseudo-randomly between 2, 4, and 6 minutes. Following the final CS presentation, mice remained in the chamber for an additional 60 s before being returned to their home cages. Control mice were exposed to the same auditory stimuli but did not receive foot shocks.

On day 1, mice were placed in a novel context (Context B; 20 × 20 cm; see Fig. 1a) that differed from the training context in wall color, floor texture, and background scent. Animals were allowed to freely explore the chamber for 3 minutes, after which they were presented with auditory stimuli consisting of the conditioned tone (CS+), the non-conditioned tone (CS−), and two intermediate frequencies (7 and 11 kHz) spanning the same spectral range. These intermediate frequencies were spaced linearly rather than logarithmically. Tones were presented in a semi-random order, with three presentations per frequency and 3-min intertrial intervals. Following the final tone presentation, mice remained in the chamber for an additional 60 s before being returned to their home cages.

Behavioral analysis

All behavioral sessions were video recorded using a mounted camera. Freezing behavior was defined as complete immobility except for respiratory movements and was quantified as a percentage of total trial time. Freezing was measured using FreezeFrame 5 software (Actimetrics, RRID:SCR_014429) for non-implanted mice and inertial measurement unit (IMU) data from the Inscopix miniscope system for implanted mice. The IMU-based approach quantifies movement across three axes and circumvents visual occlusion introduced by the miniscope tether. To validate automated scoring, a subset of sessions from implanted and non-implanted mice was randomly selected and manually scored by an experimenter blind to experimental condition, following established criteria (Phillips & LeDoux, 1992; Wang et al., 2012). Automated and manual measurements were highly correlated (Pearson’s r = 0.98 and 0.96), confirming strong inter-method reliability. Freezing responses were analyzed across three retrieval sessions (days 1, 15, and 30). Two animals in the non-shock control group (JAM061 and JAM062) were excluded from repeated-measures analyses due to missing data on day 30 but were retained for day-specific comparisons.

Viral construct stereotaxic surgery

Mice were anesthetized with isoflurane (3% in oxygen at 1 L/min) and placed in a stereotaxic apparatus (David Kopf Instruments, Tujunga, CA). During surgery, anesthesia was maintained at 1.5%. The head was positioned to ensure a flat skull in both the anterior–posterior (AP) and medial–lateral (ML) planes. Rimadyl (5 mg/kg, s.c.) was administered preoperatively and for two consecutive days postoperatively for analgesia. For calcium imaging experiments, three injections (150 nL each) of AAV1-CaMKIIα-GCaMP6f-WPRE-SV40 (Addgene cat. # 100834) were delivered into the PL at the following stereotaxic coordinates relative to bregma: AP +1.95 mm, ML ±0.3 mm, and DV −1.7, −1.5, and −1.3 mm. Injections were performed at a rate of 50 nL/min. After each injection, the syringe was left in place for 5 minutes to allow for viral diffusion and to minimize backflow. Mice were allowed to recover for four weeks prior to implantation of the GRIN lens and baseplate assembly for in vivo imaging.

Miniendoscope baseplate placement

Following AAV-GCaMP6f injections, a gradient-index (GRIN) lens (Inscopix, diameter: 1.0 mm, length: 4.0 mm, numerical aperture: 0.5, model: 1050-004637) and baseplate assembly (Inscopix, Mountain View, CA) were implanted above the PL. Stereotaxic coordinates relative to bregma were AP +1.94 mm, ML ±0.3 mm, and DV −1.5 mm. The cortical surface was identified and set as the zero reference point. To create a tract for lens placement, a 26-gauge syringe needle was lowered to approximately two-thirds of the final depth (DV −1.0 mm) and then slowly retracted. The GRIN lens was subsequently lowered to the final DV coordinate (−1.5 mm) and secured in place with a thin layer of cyanoacrylate adhesive applied around the lens perimeter and over the exposed skull. Dental cement (Lang Dental) was used to further stabilize the baseplate and anchor it to the skull. Mice were allowed to recover for at least two weeks before the onset of behavioral training.

Calcium imaging analysis

Calcium imaging data acquired using the Inscopix nVoke 2.0 miniscope system were processed using Inscopix Data Processing Software (IDPS; Inscopix, Mountain View, CA). Raw recordings were first preprocessed using the standard IDPS pipeline, which included spatial downsampling, background subtraction, spatial filtering, and rigid motion correction to compensate for movement artifacts. Fluorescence signals were then normalized to ΔF/F, defined as the change in fluorescence relative to a baseline fluorescence level computed for each pixel, to quantify activity-dependent calcium dynamics. Following preprocessing, neuronal signals were extracted using a constrained non-negative matrix factorization approach optimized for one-photon calcium imaging (CNMF-E implemented by IDPS). This method decomposes the imaging data into spatial components corresponding to putative neurons and their associated temporal activity traces while accounting for background and neuropil contamination. Extracted components were manually curated to exclude non-neuronal signals and artifacts based on established morphological and activity criteria (Fig. 1d).

Sound responder classification

To quantify stimulus-evoked modulation of neuronal activity, we implemented a window-based sound response test (SRT) that assesses changes in calcium activity relative to a pre-stimulus baseline while preserving temporal structure during the stimulus period.

Data alignment and normalization

For each neuron, calcium fluorescence traces (ΔF/F₀) were aligned to sound onset and resampled onto a common relative time axis. Each aligned response comprised a pre-stimulus baseline period (time < 0 s) and a stimulus period of fixed duration (20 s). Analyses were first performed at the single-trial level and subsequently aggregated across trials. To normalize activity across trials, each trial was z-scored relative to its own baseline period. Specifically, the mean and standard deviation of baseline activity were computed separately for each trial, and all time points within that trial were normalized by subtracting the baseline mean and dividing by the baseline standard deviation. A small constant was added to the denominator to prevent division by zero. This normalization ensured that trial-to-trial differences in baseline variance did not bias subsequent comparisons.
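A minimal sketch of the per-trial baseline normalization described above, in Python with NumPy. The function name `zscore_to_baseline`, the trials × samples array layout, and the value of the stabilizing constant `eps` are illustrative assumptions; the text does not specify them.

```python
import numpy as np

def zscore_to_baseline(trials, time, eps=1e-9):
    """Z-score each trial against its own pre-stimulus baseline (time < 0).

    trials : (n_trials, n_samples) aligned dF/F traces for one neuron
    time   : (n_samples,) time relative to sound onset, in seconds
    eps    : small constant added to the denominator (value is an assumption)
    """
    trials = np.asarray(trials, dtype=float)
    baseline = trials[:, np.asarray(time) < 0]
    mu = baseline.mean(axis=1, keepdims=True)   # per-trial baseline mean
    sd = baseline.std(axis=1, keepdims=True)    # per-trial baseline SD
    return (trials - mu) / (sd + eps)

# Toy data: two trials with different baseline offsets and variances,
# sampled at 10 Hz from -10 s to 20 s (sampling rate is an assumption).
time = np.arange(-100, 200) / 10.0
rng = np.random.default_rng(0)
trials = rng.normal(loc=[[1.0], [5.0]], scale=[[0.5], [2.0]],
                    size=(2, time.size))
z = zscore_to_baseline(trials, time)
```

After normalization, both toy trials have a baseline mean of about 0 and a baseline standard deviation of about 1, so they can be compared on a common scale.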

Sliding window construction

To capture temporally localized stimulus responses, normalized traces were segmented into overlapping sliding windows spanning the stimulus period. Windows were defined by a fixed duration (1 s) and stride (0.5 s), with window boundaries determined directly from the sampling interval of the data. Only windows fully contained within the stimulus epoch were included. This approach enabled detection of responses with variable onset latencies and durations without assuming a fixed response time.
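The window construction can be sketched as follows. This is an illustration under stated assumptions: the helper name `sliding_windows` is hypothetical, the sampling interval is taken as the median difference of the time axis, and window width and stride are rounded to whole samples.

```python
import numpy as np

def sliding_windows(time, width=1.0, stride=0.5, stim_start=0.0, stim_end=20.0):
    """Return (start, stop) sample indices of windows fully inside the stimulus epoch."""
    time = np.asarray(time, dtype=float)
    dt = float(np.median(np.diff(time)))        # sampling interval from the data
    w = int(round(width / dt))                  # window length in samples
    s = max(1, int(round(stride / dt)))         # stride in samples
    # Half-sample tolerance guards against floating-point error on the time axis.
    first = int(np.searchsorted(time, stim_start - dt / 2))
    last = int(np.searchsorted(time, stim_end - dt / 2))   # one past the last stimulus sample
    return [(i, i + w) for i in range(first, last - w + 1, s)]

# 10 Hz time axis from -10 s to 30 s (sampling rate is an assumption).
time = np.arange(-100, 300) / 10.0
windows = sliding_windows(time)
```

With a 20 s stimulus, 1 s windows, and a 0.5 s stride, this yields 39 windows, the last of which ends at the final stimulus sample.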

Window-wise response quantification

For each window and trial, the median normalized activity within the window was computed to provide a robust estimate of window-level response magnitude. For each trial, the median baseline activity was computed using the same normalized data. Window responses were expressed as the difference between the window median and the corresponding trial baseline median. Baseline-subtracted window responses were then averaged across trials, yielding a mean response magnitude for each time window. An effect size was computed for each window as the mean baseline-subtracted response divided by the standard deviation of baseline medians across trials. This metric expresses response magnitude in units of baseline variability and provides a standardized measure of modulation strength. For single-trial cases, the raw normalized difference was used directly.
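A hedged sketch of the window-wise quantification described above. The function name `window_effect_sizes` is hypothetical, and the use of the sample standard deviation (`ddof=1`) and the small `eps` guard are assumptions not stated in the text.

```python
import numpy as np

def window_effect_sizes(z, time, windows, eps=1e-12):
    """Mean baseline-subtracted window response and its effect size.

    z       : (n_trials, n_samples) baseline-normalized traces
    time    : (n_samples,) time relative to sound onset, in seconds
    windows : list of (start, stop) sample indices
    """
    z = np.asarray(z, dtype=float)
    base_med = np.median(z[:, np.asarray(time) < 0], axis=1)      # one value per trial
    win_med = np.stack([np.median(z[:, a:b], axis=1) for a, b in windows], axis=1)
    diff = win_med - base_med[:, None]                            # (n_trials, n_windows)
    mean_resp = diff.mean(axis=0)
    if z.shape[0] < 2:
        # Single-trial case: use the raw normalized difference directly.
        return mean_resp, mean_resp.copy()
    spread = base_med.std(ddof=1)                                 # SD of baseline medians across trials
    return mean_resp, mean_resp / (spread + eps)

# Deterministic toy example: 3 trials, 2 s baseline, two 1 s windows at 10 Hz.
time = np.arange(-20, 20) / 10.0
z = np.zeros((3, time.size))
z[0, :20], z[2, :20] = -0.1, 0.1     # baseline medians: -0.1, 0.0, 0.1
z[:, 20:30] = 2.0                    # response confined to the first window
mean_resp, effect = window_effect_sizes(z, time, [(20, 30), (30, 40)])
```

In this toy case the baseline medians have a standard deviation of 0.1 across trials, so a mean response of 2.0 corresponds to an effect size of 20.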

Statistical comparison to baseline

To determine whether activity within each window differed significantly from baseline, the distribution of normalized activity values within each window was compared to the pooled distribution of normalized baseline samples across trials. For each window, a two-sample Kolmogorov–Smirnov test was applied, yielding a window-specific p-value. This nonparametric test was selected to avoid assumptions about distributional shape and to remain sensitive to changes in both central tendency and distribution structure.
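The window-versus-baseline comparison could be implemented roughly as below, using `scipy.stats.ks_2samp`. Pooling window samples across trials before the test is an assumption (the text states this explicitly only for the baseline), and `window_pvalues` is a hypothetical name.

```python
import numpy as np
from scipy.stats import ks_2samp

def window_pvalues(z, time, windows):
    """Two-sample KS test of each window against the pooled baseline samples."""
    z = np.asarray(z, dtype=float)
    baseline = z[:, np.asarray(time) < 0].ravel()       # pooled across trials
    return np.array([ks_2samp(z[:, a:b].ravel(), baseline).pvalue
                     for a, b in windows])

# Toy data: 5 trials of unit-variance noise with a 3 SD shift in one window.
rng = np.random.default_rng(1)
time = np.arange(-100, 200) / 10.0
z = rng.normal(size=(5, time.size))
z[:, 100:110] += 3.0
pvals = window_pvalues(z, time, [(100, 110), (150, 160)])
```

The shifted window yields a very small p-value, whereas the unshifted window is drawn from the same distribution as the baseline.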

Identification of significant response epochs

A window was classified as significant if it met two criteria: (i) a p-value below a predefined threshold (typically p < 0.05) and (ii) an absolute effect size of at least one baseline standard deviation. Significant windows were further classified as positive (increased activity) or negative (decreased activity) based on the sign of the effect size. Contiguous significant windows of the same sign were grouped into response epochs. To account for brief interruptions due to noise, adjacent epochs separated by short gaps (≤1 window) were merged. For each neuron and response direction (positive or negative), the longest contiguous significant epoch was selected for reporting.
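The epoch-detection logic above can be sketched as follows. The function name `longest_epoch` and its return convention (inclusive window indices, or `None` when nothing is significant) are illustrative choices; the thresholds mirror the values given in the text.

```python
import numpy as np

def longest_epoch(pvals, effects, sign=+1, alpha=0.05, min_effect=1.0, max_gap=1):
    """Longest run of significant windows of one sign, merging gaps of <= max_gap windows.

    Returns (first_window, last_window), inclusive, or None if no window qualifies.
    """
    pvals, effects = np.asarray(pvals), np.asarray(effects)
    sig = (pvals < alpha) & (sign * effects >= min_effect)

    # Group contiguous significant windows into runs.
    runs, start = [], None
    for i, s in enumerate(sig):
        if s and start is None:
            start = i
        elif not s and start is not None:
            runs.append([start, i - 1])
            start = None
    if start is not None:
        runs.append([start, len(sig) - 1])

    # Merge runs separated by short gaps.
    merged = []
    for run in runs:
        if merged and run[0] - merged[-1][1] - 1 <= max_gap:
            merged[-1][1] = run[1]
        else:
            merged.append(run)

    if not merged:
        return None
    best = max(merged, key=lambda r: r[1] - r[0])
    return (best[0], best[1])

pvals = [0.01, 0.01, 0.50, 0.01, 0.50, 0.50, 0.01]
effects = [2.0, 2.0, 0.2, 2.0, 0.0, 0.0, 2.0]
positive = longest_epoch(pvals, effects, sign=+1)   # windows 0-3 after bridging the one-window gap
negative = longest_epoch(pvals, effects, sign=-1)   # no negative epoch
```

Window indices can be converted to onset and offset times using the window start times and the window duration.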

Summary response metrics

For each detected response epoch, several summary metrics were extracted, including: (1) the mean effect size across the epoch, (2) the peak response magnitude (reported both in normalized units and converted back to ΔF/F₀ using the baseline standard deviation), (3) onset and offset times relative to stimulus onset, and (4) total response duration. Based on these metrics, responses were classified as positively modulated (normalized effect size ≥ 3 and significant for at least 1 s) or negatively modulated (normalized effect size ≤ −1.5 and significant for at least 1 s).

Mapping active memory ensembles across time

For longitudinal analyses, neuronal identity across imaging sessions was tracked using spatial footprint registration to identify putatively identical neurons across days. Spatial footprints of cells detected during conditioning were registered longitudinally using a probabilistic model implemented in CellReg (Inscopix; Sheintuch et al., 2017). This algorithm aligns neurons based on the similarity of their spatial footprints, enabling consistent tracking of neuronal identity across imaging sessions. Cells classified as consistently active were those reliably detected across all phases of conditioning and retrieval, including sessions on days 1, 15, and 30.

Average stimulus-aligned trace procedure

For each neuron and stimulus frequency, the session-long calcium fluorescence trace was used as the initial input and then z-scored. For each tone presentation, traces were temporally aligned to stimulus onset. A time window spanning from 10 s before tone onset to 10 s after tone offset was extracted to generate a stimulus-aligned trace (SAT). To ensure equal trace length and consistent temporal alignment across presentations, each SAT was linearly interpolated. Baseline activity was computed as the mean signal during the 10 s pre-stimulus period (−10 to 0 s) and subtracted from the corresponding SAT to normalize activity relative to baseline. Baseline-corrected SATs from all repetitions were then averaged to yield a single average stimulus-aligned trace (ASAT) for each neuron and tone frequency.
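A minimal sketch of the ASAT computation for one neuron and one tone frequency. The function name `average_sat`, the 10 Hz output grid, and the use of `np.interp` for the linear interpolation are assumptions made for illustration.

```python
import numpy as np

def average_sat(trace, t, onsets, tone_dur=20.0, pad=10.0, fs_out=10.0):
    """Average stimulus-aligned trace (ASAT) for one neuron and one tone frequency.

    trace  : session-long fluorescence trace
    t      : timestamps of the trace, in seconds
    onsets : tone onset times, in seconds
    fs_out : rate of the common interpolation grid (an assumption)
    """
    trace = np.asarray(trace, dtype=float)
    z = (trace - trace.mean()) / trace.std()                 # z-score the whole session
    n = int(round((tone_dur + 2 * pad) * fs_out))
    grid = np.arange(n) / fs_out - pad                       # -10 s ... +30 s around onset
    sats = []
    for onset in onsets:
        sat = np.interp(grid, np.asarray(t) - onset, z)      # linear interpolation
        sats.append(sat - sat[grid < 0].mean())              # subtract pre-stimulus mean
    return grid, np.mean(sats, axis=0)

# Toy session: 200 s at 10 Hz with a step response during two 20 s tones.
t = np.arange(2000) / 10.0
onsets = [50.0, 120.0]
trace = np.zeros_like(t)
for o in onsets:
    trace[(t >= o) & (t < o + 20.0)] = 1.0
grid, asat = average_sat(trace, t, onsets)
```

Because each stimulus-aligned trace is baseline-corrected before averaging, the pre-stimulus portion of the resulting ASAT is centered on zero.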

Generation of Population Sound-Response Curves

To generate population-level sound-response curves, an average trace was first computed for each combination of subject, session, tone frequency, and modulation classification by averaging across all ASATs from individual cells within that condition (cell-averaged trace). Subsequently, for each experimental group, session, frequency, and modulation classification these cell-averaged traces were averaged across animals to obtain the group-level average trace, representing the population response to each auditory stimulus.

Population similarity over time across tone pairs

To estimate neuronal firing rates from calcium activity, deconvolution was performed using CASCADE, a supervised neural network–based algorithm trained on ground-truth electrophysiological recordings (RRID:SCR_005861) (Rupprecht et al., 2021). This approach provides temporally precise estimates of spiking activity from calcium fluorescence traces. Estimated firing rate traces were subsequently Gaussian-smoothed across time using a 2-s kernel and z-scored within each session to allow comparisons across neurons and sessions. For population-level analyses, activity vectors were constructed by averaging estimated firing rates across 1-s time windows for each neuron (overlap 0.5 s). These population vectors were used to compute similarity metrics. To quantify the consistency and temporal structure of population responses, we computed population similarity matrices across time points by correlating population activity vectors across independent repetitions of the same tone (within-tone comparisons) as well as across different tone identities (cross-tone comparisons). For all tone comparisons, similarity matrices were not symmetrized, thereby preserving potential asymmetries arising from differences in the order of the stimuli across repetitions (e.g., 11 vs 3 kHz or 3 vs 11 kHz), tone identity, and/or stimulus-evoked population dynamics. All similarity matrices were indexed by time relative to stimulus onset (−10 to 30 s), with tone presentation occurring from 0 to 20 s. In all graphs we plotted the earlier tone presentation on the y axis, and the later one on the x axis. To analyze population similarity of tone pairs during stimulus onset, the population activity vector correlations occurring during the first 5 s were computed for each animal. This analysis focused on tone pairs relative to the CS+ in each experimental group.
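The population-vector correlation step can be illustrated as below. This sketch assumes Pearson correlation as the similarity metric and starts from already deconvolved, smoothed, and z-scored rate traces; the function names are hypothetical, and the CASCADE deconvolution itself is not reproduced.

```python
import numpy as np

def binned_population_vectors(rates, fs, width=1.0, stride=0.5):
    """Average rates in overlapping windows: (n_neurons, n_samples) -> (n_bins, n_neurons)."""
    w, s = int(round(width * fs)), int(round(stride * fs))
    starts = range(0, rates.shape[1] - w + 1, s)
    return np.stack([rates[:, a:a + w].mean(axis=1) for a in starts])

def similarity_matrix(pv_early, pv_late):
    """Pearson correlation between every time bin of the earlier presentation (rows)
    and every time bin of the later presentation (columns); not symmetrized."""
    a = pv_early - pv_early.mean(axis=1, keepdims=True)
    b = pv_late - pv_late.mean(axis=1, keepdims=True)
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

# Toy data: 30 neurons, -10 s to 30 s at 10 Hz, two presentations of one tone.
rng = np.random.default_rng(0)
fs = 10.0
rates_first = rng.normal(size=(30, 400))
rates_second = rates_first + rng.normal(scale=0.5, size=(30, 400))
pv1 = binned_population_vectors(rates_first, fs)
pv2 = binned_population_vectors(rates_second, fs)
sim = similarity_matrix(pv1, pv2)
```

Swapping the two presentations transposes the matrix, which is why the order of presentations (earlier on the y axis, later on the x axis) matters when the matrices are left unsymmetrized.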

Classification of subnetwork tone-response patterns

Stimulus response vectors

To identify subsets of neurons with shared activity patterns in response to auditory tones, fluorescence traces from individual cells were detrended using the 10th percentile value to remove slow baseline drifts and z-scored to normalize activity levels across cells. For each session (12 tone presentations), 40-s segments of the calcium trace were extracted around each tone period—comprising 10 s before tone onset (baseline), 20 s during tone presentation, and 10 s after tone offset. Segments were interpolated to a uniform set of timestamps to ensure consistent temporal sampling across trials. For each tone frequency (3, 7, 11, and 15 kHz), baseline activity was subtracted and traces were averaged across repetitions. The four mean tone responses were concatenated in ascending frequency order to form a single composite response vector for each cell. All vectors from a session were compiled into a dataset for subsequent analyses.

Mutual information matrix

For each pair of cells, we computed the mutual information (MI) between their stimulus response vectors to quantify the statistical dependence between their responses. MI measures the reduction in uncertainty about one variable given knowledge of another. In this context, low MI values indicate that the stimulus response vectors of two cells are largely independent, whereas high MI values indicate shared information or coordinated response structure. The MI values for all cell pairs were assembled into a symmetric matrix (n_cells × n_cells), with each entry representing the information shared between a given pair of cells. Separate MI matrices were computed for each experimental group (non-shock, CS+3, CS+15), pooling cells across recording days within each group. In what follows, “MI matrix” refers to one of these matrices.
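A sketch of how a pairwise MI matrix could be assembled. The text does not state which MI estimator was used; the histogram-based plug-in estimator with 8 equal-width bins shown here is purely an assumption, as are the function names and the choice to leave the diagonal at zero.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Plug-in MI estimate (bits) from a 2-D histogram; the estimator is an assumption."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))

def mi_matrix(vectors, bins=8):
    """Symmetric (n_cells, n_cells) matrix of pairwise MI; the diagonal is left at zero."""
    n = len(vectors)
    mi = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            mi[i, j] = mi[j, i] = mutual_information(vectors[i], vectors[j], bins)
    return mi

# Toy response vectors: cells 0 and 1 are coupled, cell 2 is independent.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
vectors = np.vstack([x, x + 0.1 * rng.normal(size=500), rng.normal(size=500)])
mi = mi_matrix(vectors)
```

In the toy example the coupled pair shares far more information than either cell shares with the independent one, matching the interpretation of high and low MI given above.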

Adjacency matrix construction

To transform the MI matrix into a network representation, we applied a summed top-k adjacency procedure. For each cell (matrix row), connections to other cells were ranked by MI strength, and only the top-k values were retained. Each retained connection was weighted by the reciprocal of its rank (1 / rank), favoring the strongest associations. The resulting matrix was symmetrized to represent an undirected network of functional connectivity (if cell A is functionally connected to cell B then this enforces that B is also functionally connected to A).
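The top-k adjacency step can be sketched as follows. Symmetrizing by summing the matrix with its transpose is an assumption suggested by the term "summed top-k"; `topk_adjacency` is a hypothetical name.

```python
import numpy as np

def topk_adjacency(mi, k):
    """Keep each row's k strongest MI links, weight them by 1/rank, then symmetrize by summing."""
    mi = np.array(mi, dtype=float)
    np.fill_diagonal(mi, -np.inf)                         # exclude self-connections
    n = mi.shape[0]
    order = np.argsort(-mi, axis=1, kind="stable")[:, :k]  # indices of the top-k partners per row
    adj = np.zeros((n, n))
    adj[np.arange(n)[:, None], order] = 1.0 / np.arange(1, k + 1)
    return adj + adj.T                                    # undirected network

mi = np.array([[0.0, 0.9, 0.1, 0.2],
               [0.9, 0.0, 0.3, 0.1],
               [0.1, 0.3, 0.0, 0.8],
               [0.2, 0.1, 0.8, 0.0]])
adj = topk_adjacency(mi, k=2)
```

Here cells 0 and 1 rank each other first, so their edge weight is 1 + 1 = 2, while cells 0 and 2 never appear in each other's top two and remain unconnected.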

Spectral clustering

The adjacency matrix was analyzed using spectral clustering to identify functional subnetworks. Spectral clustering converts the adjacency matrix into a graph Laplacian, computes its eigenvalues and eigenvectors, and projects the data into a low-dimensional space in which clusters are more separable. K-means clustering was then applied to the eigenvector representations to delineate groups of highly connected neurons. This approach is well suited for detecting non-linear or irregular structures in neuronal co-activity networks.
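A compact illustration of spectral clustering on an adjacency matrix. This is a sketch under assumptions: it uses the symmetric normalized Laplacian, row-normalizes the embedding, and substitutes a small deterministic k-means (farthest-point initialization) for whatever implementation the authors used.

```python
import numpy as np

def spectral_embedding(adj, n_clusters):
    """Rows of the first n_clusters eigenvectors of the normalized graph Laplacian."""
    deg = adj.sum(axis=1)
    d = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    lap = np.eye(len(adj)) - d[:, None] * adj * d[None, :]
    _, vecs = np.linalg.eigh(lap)                       # eigenvalues in ascending order
    emb = vecs[:, :n_clusters]
    return emb / np.maximum(np.linalg.norm(emb, axis=1, keepdims=True), 1e-12)

def kmeans(points, k, n_iter=100):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    for _ in range(1, k):
        d2 = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[int(np.argmax(d2))])
    centers = np.array(centers)
    for _ in range(n_iter):
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        updated = np.array([points[labels == j].mean(axis=0) if np.any(labels == j)
                            else centers[j] for j in range(k)])
        if np.allclose(updated, centers):
            break
        centers = updated
    return labels

def spectral_clusters(adj, n_clusters):
    return kmeans(spectral_embedding(np.asarray(adj, dtype=float), n_clusters), n_clusters)

# Toy network: two tightly connected groups of three cells with weak cross-links.
adj = np.full((6, 6), 0.01)
adj[:3, :3] = 1.0
adj[3:, 3:] = 1.0
np.fill_diagonal(adj, 0.0)
labels = spectral_clusters(adj, 2)
```

In the toy network the two groups map to two well-separated points in the eigenvector space, so k-means recovers them directly.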

Hyperparameter selection

Two hyperparameters were determined: the top-k value and the number of clusters. We evaluated a range of top-k values (2 to 100) and, for each, computed the first 30 eigenvalues of the normalized graph Laplacian obtained during spectral clustering. Eigenvalues were averaged across top-k values to yield a single representative spectrum. The optimal cluster number was then determined using the eigengap method, defined as the largest gap between consecutive eigenvalues. The top-k values associated with this solution were averaged to obtain the final k.
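One possible reading of the hyperparameter search is sketched below. The interpretation of "top-k values associated with this solution" as those k whose individual spectrum yields the same eigengap-based cluster number is an assumption, as are all function names.

```python
import numpy as np

def topk_adjacency(mi, k):
    """Summed top-k adjacency with 1/rank weights (symmetrization by summing is an assumption)."""
    mi = np.array(mi, dtype=float)
    np.fill_diagonal(mi, -np.inf)
    n = mi.shape[0]
    order = np.argsort(-mi, axis=1, kind="stable")[:, :k]
    adj = np.zeros((n, n))
    adj[np.arange(n)[:, None], order] = 1.0 / np.arange(1, k + 1)
    return adj + adj.T

def laplacian_spectrum(adj, n_eig=30):
    """Smallest n_eig eigenvalues of the symmetric normalized graph Laplacian."""
    d = 1.0 / np.sqrt(np.maximum(adj.sum(axis=1), 1e-12))
    lap = np.eye(len(adj)) - d[:, None] * adj * d[None, :]
    return np.linalg.eigvalsh(lap)[:n_eig]

def eigengap(eigvals):
    """Cluster number at the largest gap between consecutive eigenvalues."""
    return int(np.argmax(np.diff(eigvals))) + 1

def select_hyperparameters(mi, k_values, n_eig=30):
    spectra = np.array([laplacian_spectrum(topk_adjacency(mi, k), n_eig) for k in k_values])
    n_clusters = eigengap(spectra.mean(axis=0))            # eigengap of the averaged spectrum
    matching = [k for k, s in zip(k_values, spectra) if eigengap(s) == n_clusters]
    k_final = float(np.mean(matching if matching else list(k_values)))
    return n_clusters, k_final

# Toy MI matrix: three groups of three cells, high MI within groups, low across.
mi = np.full((9, 9), 0.1)
for g in range(3):
    mi[3 * g:3 * g + 3, 3 * g:3 * g + 3] = 0.9
np.fill_diagonal(mi, 0.0)
n_clusters, k_final = select_hyperparameters(mi, k_values=[2])
```

For the real data the text evaluates k from 2 to 100; the toy call uses a single k only because the example network has nine cells.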

Positively and negatively modulated subnetworks

Because MI is non-directional, it cannot distinguish between positively and negatively correlated responses. To separate these responses, Spearman’s correlation was computed for all pairs of cells within the cluster, and only the sign of each correlation was retained, resulting in a sign matrix of the same dimensions as the MI matrix. A signed MI matrix was then created by element-wise multiplication of the sign matrix by the MI matrix. The signed MI matrix was then used as input for another round of clustering, as previously described. Briefly, this involved generating a top-k adjacency matrix, computing eigenvalues from the normalized Laplacian, determining the optimal number of clusters using the eigengap method, and assigning cluster labels. Based on this procedure, one cluster was separated into three sub-clusters, while all other clusters were separated into two sub-clusters.
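The signing step can be illustrated as follows. Spearman's correlation is computed here as the Pearson correlation of rank-transformed vectors, which is equivalent; the function name `signed_mi` and the toy inputs are hypothetical.

```python
import numpy as np
from scipy.stats import rankdata

def signed_mi(mi, vectors):
    """Multiply each MI entry by the sign of the pairwise Spearman correlation.

    mi      : (n_cells, n_cells) MI matrix for the cells of one cluster
    vectors : (n_cells, n_timepoints) stimulus response vectors of the same cells
    """
    ranks = rankdata(np.asarray(vectors, dtype=float), axis=1)
    rho = np.corrcoef(ranks)                 # Spearman = Pearson correlation of ranks
    return np.sign(rho) * np.asarray(mi, dtype=float)

# Toy cluster: cells 0 and 1 co-vary, cell 2 is their mirror image.
x = np.linspace(-1.0, 1.0, 50)
vectors = np.vstack([x, x ** 3, -x])
mi = np.full((3, 3), 0.5)
np.fill_diagonal(mi, 0.0)
smi = signed_mi(mi, vectors)
```

Cells with mirrored response profiles receive negative entries, so a second round of top-k ranking places them at the bottom of each other's partner lists and the clustering separates positively from negatively modulated cells.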

Global labels

Since our clustering method begins with a MI matrix, and each group has its own MI matrix and associated subclusters and subcluster labels, we performed a final cluster labeling step that allowed for comparison across groups. The results are such that, if a set of cells in group A have been assigned cluster label L, then the set of cells in group B having similar response patterns will be assigned the same cluster label L, allowing for comparison across groups.

Cluster Stability

To assess the stability and reorganization of functional clusters across days, we tracked registered cells that were detected on at least two of the three imaging sessions (day 1, day 15, and day 30). For each pair of days (A, B) and each cluster label L on day A, we quantified label stability (“percent_same”), defined as the percentage of cells that had label L on day A and retained the same label on day B:

\[ \text{percent\_same} = 100 \times \frac{n_{\text{same}}}{n_L} \]

where n_L is the number of cells assigned to label L on day A and n_same is the number of those cells that were assigned the same label on day B. Higher percent_same values indicate that a larger fraction of cells maintained the same cluster identity across days, suggesting that the corresponding functional cell assemblies were stable over time. Conversely, lower percent_same values reflect cluster reorganization, where individual cells changed their cluster membership between sessions.
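A short sketch of the percent_same computation for one pair of days. Representing each day as a dictionary from registered cell ID to cluster label, and restricting the count to cells detected on both days, are assumptions made for illustration.

```python
def percent_same(labels_a, labels_b):
    """Percentage of cells keeping their day-A cluster label on day B, per label.

    labels_a, labels_b : dict mapping registered cell ID -> cluster label.
    Only cells present on both days are counted (an assumption).
    """
    shared = set(labels_a) & set(labels_b)
    result = {}
    for label in sorted({labels_a[c] for c in shared}):
        cells = [c for c in shared if labels_a[c] == label]      # n_L
        n_same = sum(labels_b[c] == label for c in cells)        # n_same
        result[label] = 100.0 * n_same / len(cells)
    return result

day1 = {1: "a", 2: "a", 3: "b", 4: "b", 5: "a"}
day15 = {1: "a", 2: "b", 3: "b", 4: "b", 6: "a"}
stability = percent_same(day1, day15)
```

In this toy example one of the two shared cells in cluster "a" keeps its label (50%), while both shared cells in cluster "b" do (100%).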

Statistics

Behavioral and neural data were analyzed using one- or two-way analyses of variance (ANOVAs) with repeated measures. For one-way ANOVAs, tone frequency was treated as the repeated factor; for two-way ANOVAs, a between-subjects group factor (e.g., experimental vs. control) was additionally included. When ANOVAs revealed significant main effects or interactions, post hoc comparisons were performed using Tukey’s multiple-comparisons test. Data normality was assessed using the Shapiro–Wilk test and homogeneity of variances with the Brown–Forsythe test prior to statistical analysis. When assumptions were violated, Friedman tests were used for repeated-measures analyses, and Kruskal–Wallis tests were used for comparisons without repeated measures. Rank-based tests were followed by Dunn’s post hoc multiple-comparison tests. Statistical significance was evaluated using a two-sided alpha level of 0.05. For analyses involving multiple cell-identity comparisons, p values were adjusted using the Benjamini–Hochberg false discovery rate procedure.
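For reference, the Benjamini–Hochberg adjustment mentioned above can be sketched in a few lines of NumPy. This is a generic illustration of the procedure, not the authors' analysis code; the function name is hypothetical.

```python
import numpy as np

def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the original order."""
    p = np.asarray(pvals, dtype=float)
    n = p.size
    order = np.argsort(p)
    scaled = p[order] * n / np.arange(1, n + 1)           # p_(i) * n / i
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]    # enforce monotonicity
    adjusted = np.empty(n)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted

adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

A comparison is then declared significant when its adjusted p-value falls below the chosen alpha level.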

Histology

At the conclusion of experiments, mice were deeply anesthetized with isoflurane and transcardially perfused with phosphate-buffered saline (PBS), followed by 4% paraformaldehyde (PFA). Extracted brains were post-fixed in 4% PFA for 24 hours and subsequently transferred to a 20% sucrose solution containing sodium azide at 4 °C for an additional 24 hours. Frozen brains were coronally sectioned at 50 µm using a cryostat, and sections were mounted on Superfrost Plus microscope slides (Fisher Scientific). Histological verification was performed to confirm GCaMP expression and accurate lens placement within the PL.

Supplemental Figures

Sex differences in behavior.

a-c, Sex differences in CS⁺15-trained mice. No sex differences were observed on any testing day (effect of sex: day 1, F(1,25) = 1.52, p = 0.23; day 15, F(1,25) = 0.55, p = 0.467; day 30, F(1,25) = 0.17, p = 0.681). On all days, there was a significant main effect of frequency, indicating successful learning in both males and females (day 1, F(3,75) = 33.46, p < 0.001; day 15, F(3,75) = 26.32, p < 0.001; day 30, F(3,75) = 51.47, p < 0.001). No sex × frequency interactions were detected on any testing day (p > 0.05). Post hoc comparisons showed that on day 1, mice discriminated 3 kHz from 7, 11 and 15 kHz (p < 0.05), but generalized among the higher frequencies (p > 0.05). On days 15 and 30, mice discriminated 3 kHz from 7, 11 and 15 kHz (p < 0.05) and 7 kHz from 11 and 15 kHz (p < 0.05), while generalization persisted between 11 and 15 kHz (p > 0.05). d-f, Sex differences in CS⁺3-trained mice. No main effect of sex was observed on any testing day (day 1, F(1,20) = 1.24, p = 0.279; day 15, F(1,20) = 0.78, p = 0.388; day 30, F(1,20) = 0.003, p = 0.956). A significant main effect of frequency was present across all days, indicating learning in both sexes (day 1, F(3,60) = 40.65, p < 0.001; day 15, F(3,60) = 30.17, p < 0.001; day 30, F(3,60) = 35.19, p < 0.001). A significant sex × frequency interaction was observed on day 1 (F(3,60) = 6.91, p < 0.001) and day 15 (F(3,60) = 3.55, p < 0.02), reflecting sex-specific differences at 11 kHz on day 1 and at 3 kHz on day 15. These effects were not consistent across frequencies or present on day 30 (F(3,60) = 0.76, p = 0.522), likely reflecting increased behavioral variability in females, rather than stable sex differences. Significant Tukey multiple comparisons are denoted by asterisks, ** p < 0.01.

Population activity of consistently active neurons.

a-b, Population activity of consistently active neurons from animals trained with a 15 kHz CS+ (a) or a 3 kHz CS+ (b). Upper panels show positively tone-responsive neurons and lower panels show negatively tone-responsive neurons. Adjacent boxplots display areas under the population response curves. Consistently active positive responders exhibit graded responses across test days (day 1 to day 30). CS⁺15: day 1, F(3,18) = 6.072, p < 0.005; day 15, F(3,18) = 4.492, p = 0.016; day 30, F(3,18) = 4.599, p = 0.015; CS⁺3: day 1, F(3,12) = 9.304, p = 0.002; day 15, F(3,12) = 3.022, p = 0.072; day 30, F(3,12) = 5.26, p = 0.015; controls: day 1, F(3,9) = 5.343, p < 0.02; day 15, F(3,8) = 0.399, p = 0.758; day 30, F(3,8) = 0.19, p = 0.90. *p < 0.05; **p < 0.01.

Population activity of emerging-retained neurons.

Emerging-retained neurons become part of the active ensemble after conditioning and are retained thereafter. a-b, Population activity of emerging-retained neurons from animals trained with a 15 kHz CS+ (a) or a 3 kHz CS+ (b). Upper panels show positively tone-responsive neurons and lower panels show negatively tone-responsive neurons. Adjacent boxplots display areas under the population response curves (AUC). Emerging-retained neurons show greater variability on day 1 compared with later test days. Only positive responders show significant graded valence (CS⁺15: day 1, F(3,16) = 4.587, p = 0.017; day 15, F(3,18) = 4.749, p = 0.013; day 30, F(3,18) = 4.748, p = 0.013; CS⁺3: day 1, F(3,9) = 1.684, p = 0.239; day 15, F(3,12) = 7.728, p = 0.004; day 30, F(3,12) = 5.63, p = 0.012). Significant post hoc multiple comparisons are denoted in the figure by asterisks. *p < 0.05; **p < 0.01.

Population activity of transiently active neurons.

Transiently active neurons were active on only a single test day. a-b, Population activity of transiently active neurons from animals trained with a 15 kHz CS+ (a) or a 3 kHz CS+ (b). Upper panels show positively tone-responsive neurons and lower panels show negatively tone-responsive neurons. Adjacent boxplots display areas under the population response curves. Transiently active neurons did not show graded population responses on day 1; graded responses emerged by day 15 and were maintained through day 30 in positive sound responder neurons. *p < 0.05; **p < 0.01.

Clustering of PL subnetworks based on signed mutual information for the CS+ 15 kHz group per testing session.

a-b, Average stimulus-aligned population responses for clusters showing positive (a) or negative (b) modulation to individual tones (3, 7, 11, or 15 kHz, upper panels) or graded emotional tuning (a-b, bottom panels).

Clustering of PL subnetworks based on signed mutual information for the CS+ 3 kHz group per testing session.

a-b, Average stimulus-aligned population responses for clusters showing positive (a) or negative (b) modulation to individual tones (3, 7, 11, or 15 kHz, upper panels) or graded emotional tuning (a-b, bottom panels).

Clustering of PL subnetworks based on signed mutual information for the no shock control per testing session.

a-b, Average stimulus-aligned population responses for clusters showing positive (a) or negative (b) modulation to individual tones (3, 7, 11, or 15 kHz).

a–d, Baseline-to-stimulus firing rate ratio (BSR) illustrating changes in positively responding neurons within tone-specific clusters shown in Fig. 5, for the CS+15 (red), CS+3 (blue), and control groups. (a) BSR of neurons in clusters primarily responsive to 3 kHz (Fig. 5c.1), (b) 7 kHz (Fig. 5c.2), (c) 11 kHz (Fig. 5c.3), and (d) 15 kHz (Fig. 5c.4). ANOVA results are reported in Table S3; asterisks denote significant Tukey’s multiple-comparison tests (*p < 0.05; **p < 0.01; ***p < 0.001).

Statistics corresponding to consistently active, emerging-retained, or transiently active neurons

Statistics corresponding to cell identity across time for all neurons clustered using MI

Statistics corresponding to the baseline/stimulus firing rate changes (BSR) for tone-selective responders

Data availability

Datasets available at: https://data.mendeley.com/datasets/9yyn63g346/1.

Acknowledgements

This work was funded by the NSF (NSF/IOS 2303305 to IAM) and the NIH (R01 MH123260-01 to IAM; RISE GMO60655 to MRL).

Additional information

Author Contributions

MEN developed the method to identify subnetworks, wrote code for analysis, and contributed to writing the manuscript. PMO collected data, conducted behavioral and in vivo recording experiments, and contributed to writing the manuscript. MRL contributed to experimental design, collected data, and conducted behavioral and in vivo recording experiments. IAM supervised design, experiments, analysis, and writing of the manuscript.

Funding

National Science Foundation (NSF) (2303305)

  • Isabel A Muzzio

National Institute of Mental Health (NIMH) (MH123260-01)

  • Isabel A Muzzio