Figures and data

Automated tracking in the NoSeMaze and experimental design.
A, The NoSeMaze (see Supplementary Fig. S1) tracks social status and reinforcement learning in mouse societies (n = 9–10 per group) over multiple weeks without human intervention. RFID-tagged animals are automatically identified at the tube entrances and during olfactory stimulus-outcome learning tasks at the water lickport. B, Two experimental groups of male mice (young and older adults; n = 41 and n = 38, respectively) on a C57BL/6J background were used. Prior to the main experiment, repeated reshuffling of home cage group compositions ensured familiarity among all group members. Mice were then housed in the NoSeMaze for several three-week rounds. In each NoSeMaze round, group compositions were shuffled so that mice lived with different group members. Between rounds, mice returned to their home cages for several weeks, again including reshuffling of group compositions. NoSeMaze, Non-invasive Sensor-rich Maze; RFID, radio frequency identification.

Inter-individual differences in reinforcement learning in the NoSeMaze.
A, Mice performed stimulus-outcome learning trials for daily water intake, initiated by head insertion into the odor port. In CS+ trials, licking twice during odor presentation released water, while in CS− trials, it triggered a 6-second timeout. B, Reward contingencies switched every three days. C, Trial activity peaked during the dark phase. D, Number of trials required to reach 80% performance (hits or correct rejections, mean ± SEM) in each phase (cf. Fig. 2B). Performance decreased after the first reversal and improved in later phases, consistent with learning the task structure. Only data from the first round in the NoSeMaze were considered. E–G, Example lick patterns from three mice show variability in pre-CS licking rates and modulations during CS presentation in the stable phases of the last 150 trials before reversals. H, Pre-CS licking rates varied widely across animals. I, No correlation was observed between pre-CS licking rates and CS+ modulation peaks (see Supplementary Fig. S2B). J, The fraction of correct rejections was more broadly distributed across mice than the fraction of hits. K, Pre-CS licking rates correlated negatively with correct rejection rates (see Supplementary Fig. S2C). L, Spearman’s correlation matrix (data from 17 groups) revealed significant relationships between reinforcement-learning metrics, including pre-CS licking rate, correct hit and rejection rates, CS+ modulation peak, and latency to switch after contingency reversals. Only statistically significant correlation coefficients are shown (Bonferroni-corrected for multiple comparisons). M, K-means clustering based on reward-seeking features identified three distinct learning strategies: impulsive go learners (purple), cautious no-go learners (green), and flexible learners (orange). N, The three learning strategies (color labeling as in M) differed across reward-seeking features including hit and rejection rates, pre-CS licking rates, CS switch latencies, and CS+ modulation peak.
Horizontal bars denote statistical comparisons between clusters (p-values from permutation tests with n = 100,000 permutations using the median). P-values were corrected for multiple comparisons using the Benjamini–Hochberg false discovery rate (FDR) procedure (α = 0.05). Significant comparisons after FDR correction are marked with #. CI, confidence interval; CS+, rewarded conditioned stimulus; CS−, unrewarded conditioned stimulus; SEM, standard error of the mean. Note: The regression line is illustrative only; statistical inference is based on Spearman’s correlation, which does not assume linearity.
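
The cluster comparisons described above (a median-based permutation test followed by Benjamini–Hochberg correction) can be illustrated with a minimal sketch. It assumes the test statistic is the absolute difference of group medians and that the test is two-sided; the legend does not state either detail. The function names `permutation_test_median` and `benjamini_hochberg` are hypothetical and are not taken from the authors' analysis code.

```python
import random
from statistics import median


def permutation_test_median(a, b, n_perm=10_000, seed=0):
    """Two-sided permutation test on the absolute difference of group medians.

    Assumption: the statistic and the two-sided form are illustrative choices.
    """
    rng = random.Random(seed)
    observed = abs(median(a) - median(b))
    pooled = list(a) + list(b)
    n_a = len(a)
    hits = 0
    for _ in range(n_perm):
        # Randomly reassign group labels by shuffling the pooled values.
        rng.shuffle(pooled)
        diff = abs(median(pooled[:n_a]) - median(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (hits + 1) / (n_perm + 1)


def benjamini_hochberg(p_values, alpha=0.05):
    """Return True for each hypothesis rejected under BH FDR control."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose sorted p-value satisfies p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max (step-up procedure).
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected
```

In use, one p-value would be computed per pair of clusters for a given feature, and the resulting list passed to `benjamini_hochberg` to decide which comparisons receive the # marker.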

Social hierarchy in the spontaneous tube competitions in the NoSeMaze.
A, Incidental encounters in the tubes triggered dyadic tube competitions. B, Network graph derived from the cumulative tube competitions in an example group over three weeks. Outgoing arrows indicate wins, incoming arrows indicate losses. Hierarchical positions were calculated using David’s score and can be converted into linear social ranks from 1 (highest) to 10 (lowest). C, Box plots of metrics characterizing social hierarchy for 18 groups, including transitivity, steepness, stability, and uncertainty-by-repeatability (for details, see ‘Source Data’). D, Scatterplots show that David’s scores (z-scored) were significantly correlated across the three different weeks, indicating temporal stability of social hierarchy within one NoSeMaze round (data from 18 groups). E, David’s scores were strongly correlated with Elo ratings, demonstrating convergent validity of rank measures. F, Elo ratings corrected for differences in entry time were highly correlated with uncorrected Elo ratings. G, Relative body weight (z-scored per group) was significantly negatively correlated with the fraction of losses in tube competitions (cube root transformed, red), with heavier animals being less likely to lose. No association was found between body weight and the fraction of wins in tube competitions (cube root transformed). Body weight was measured prior to the animals’ introduction to the NoSeMaze. CI, confidence interval; SD, standard deviation. Note: Regression lines are illustrative only; statistical inference is based on Spearman’s correlation, which does not assume linearity.
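
To make the step from a win/loss matrix to hierarchical positions concrete, the following is a minimal sketch of David’s score based on dyadic win proportions, followed by conversion to linear ranks. It assumes the uncorrected proportion P_ij = wins_ij / (wins_ij + wins_ji); the legend does not say whether a chance-corrected dyadic dominance index was used instead. The function names `davids_scores` and `linear_ranks` are hypothetical.

```python
def davids_scores(wins):
    """David's score per animal from a square matrix of win counts.

    wins[i][j] = number of tube competitions in which animal i beat animal j.
    Assumption: uncorrected win proportions are used for each dyad.
    """
    n = len(wins)
    # P[i][j]: proportion of encounters between i and j won by i (0 if they never met).
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            total = wins[i][j] + wins[j][i]
            if i != j and total > 0:
                P[i][j] = wins[i][j] / total
    # w: summed win proportions; l: summed loss proportions.
    w = [sum(P[i]) for i in range(n)]
    l = [sum(P[j][i] for j in range(n)) for i in range(n)]
    # w2 and l2 weight each dyad by the opponent's own w or l value.
    w2 = [sum(P[i][j] * w[j] for j in range(n)) for i in range(n)]
    l2 = [sum(P[j][i] * l[j] for j in range(n)) for i in range(n)]
    return [w[i] + w2[i] - l[i] - l2[i] for i in range(n)]


def linear_ranks(scores):
    """Convert scores to ranks, with 1 assigned to the highest score."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0] * len(scores)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


# Toy example: a perfectly linear three-animal hierarchy (0 beats 1 and 2; 1 beats 2).
toy_wins = [[0, 5, 5], [0, 0, 5], [0, 0, 0]]
toy_scores = davids_scores(toy_wins)
toy_ranks = linear_ranks(toy_scores)
```

The toy matrix is invented for illustration and does not represent data from the study.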

Proactive chasing behavior in the NoSeMaze.
A, Schematic of the NoSeMaze setup for automated tracking of voluntary tube chasing. Chasing was quantified as the fraction of chases initiated by each individual relative to the total number of chases in their group (‘active chases’), as well as by the fraction of times being chased (see Supplementary Fig. S6A). B, Chasing events occurred predominantly during the dark phase. C, Cumulative fraction of chases initiated (dark blue) and received (dark red) by individuals ranked within each group. Top-ranking individuals initiated a high number of chases, with the two highest-ranking individuals accounting for more than 50% of all chases. Asterisks mark significant differences between the two curves (unpaired permutation test, n = 10,000 permutations; ***, p < 0.001). D, Active chases were highly consistent across weeks. E, Active chases were not significantly correlated with relative weight before entering the NoSeMaze. However, the fraction of times being chased was negatively correlated with weight, indicating that heavier mice were less likely to be chased. CI, confidence interval; SD, standard deviation. Note: Regression lines are illustrative only; statistical inference is based on Spearman’s correlation, which does not assume linearity.

Stability of social and reward-seeking features across repeated NoSeMaze rounds.
A–D, Social hierarchy and chasing behavior were stable across NoSeMaze rounds. Individual David’s scores (A), fraction of active chases (B), and fraction of times being chased (C) were all significantly correlated between round 1 and 2 (Spearman’s ρ = 0.57, 0.75, and 0.54, respectively; all p < 0.001). Lines in A–C are illustrative only; inference is based on Spearman’s correlation. (D) Intraclass correlation coefficients (ICC) confirm similar stability for the three metrics across all rounds. Blue circles show Spearman’s ρ between rounds 1 and 2 with 95% CIs; golden squares show ICC with 95% CIs estimated across all rounds to quantify across-round stability of each metric. E, Stability of key reward-seeking features. As in D, blue circles depict Spearman’s ρ between rounds 1 and 2 (95% CIs), whereas orange squares give ICCs computed across all rounds (95% CIs) to index stability over the full longitudinal series. Features include correct hits and correct rejections, pre-CS licking rate, CS+ modulation peak, and switch latencies for CS+ and CS− (see also Supplementary Fig. S8A-C). F, Sankey diagram illustrates consistency in individual reward-seeking strategies across rounds. Most mice retained their behavioral strategy classification (flexible, cautious, or impulsive) between rounds, with the majority of flexible and impulsive animals remaining in the same category (see Supplementary Fig. S8D). CI, confidence interval; CS+, rewarded conditioned stimulus; CS−, unrewarded conditioned stimulus. Note: Regression lines are illustrative only; statistical inference is based on Spearman’s correlation, which does not assume linearity.
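
As a rough illustration of how an ICC summarizes stability across rounds, the sketch below computes a one-way random-effects ICC, often written ICC(1,1), from a subjects-by-rounds table. The legend does not specify which ICC form was used, so this choice is an assumption, as is the requirement that every animal has a value for every round. The function name `icc_oneway` is hypothetical.

```python
def icc_oneway(data):
    """One-way random-effects ICC(1,1).

    data[i] = list of k repeated measurements (e.g., one per round) for animal i.
    Assumption: a balanced design with no missing rounds.
    """
    n = len(data)
    k = len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    means = [sum(row) / k for row in data]
    # Between-subject mean square: spread of per-animal means around the grand mean.
    ms_between = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    # Within-subject mean square: spread of rounds around each animal's own mean.
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(data, means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
```

Values near 1 indicate that differences between animals dominate over round-to-round fluctuation within an animal, which is the sense of stability indexed in panels D and E.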

Relationship between social rank, chasing behavior, and reward-seeking features.
A, Social hierarchy (tube competition David’s score) was positively correlated with the fraction of active chases. B, Across groups, the correlation coefficient calculated between David’s scores from tube competitions and fraction of active chases (x-axis, calculated separately within each group, as in A) was negatively associated with the group-level transitivity of the social hierarchy derived from tube competitions. Groups with lower transitivity showed a stronger association between social rank and chasing, suggesting that aggressive status signaling via chasing becomes more relevant when hierarchies are less well-defined. C, Correlation matrix showing relationships between social hierarchy and chasing metrics. Color scale indicates Spearman’s correlation coefficients; statistically significant values (Bonferroni-corrected) are shown in white. Note that the David’s score for tube competitions correlated with both the fraction of wins and the fraction of losses in tube competitions, while the ‘chasing David’s score’ was strongly correlated with the fraction of active chases, but not with the fraction of times being chased. D–F, Wins and losses in tube competitions (D) were concentrated in the top-right, occurring most consistently between animals of opposing ranks. In contrast, chasing interactions (E, F) clustered in the top-left, and thus within a subset of high-rank mice, in which they may help clarify dominance. G, Comparison of the fraction of active chases between the three different clusters defined from the reward-seeking task. Cautious mice initiated fewer chases than flexible individuals (p-values from permutation tests with n = 100,000 permutations using the median). Box plots show the median (line), interquartile range (IQR, box), and whiskers extending to 1.5 × the IQR. P-values were corrected for multiple comparisons using the Benjamini–Hochberg false discovery rate (FDR) procedure (α = 0.05). Significant comparisons after FDR correction are marked with #.
H, Comparison of the David’s score from tube competitions between the three different clusters defined from the reward-seeking task showed no significant difference (p-values from permutation tests with n = 100,000 permutations using the median). Box plots show the median (line), interquartile range (IQR, box), and whiskers extending to 1.5 × the IQR. I, Correlation heatmap between social and reward-seeking traits showed no significant associations after FDR correction for multiple comparisons. J, Loading values from a principal component analysis of social and reward-seeking (cognitive) traits for the top three components (PC1–PC3). Bars indicate each trait’s contribution to the respective component. PC1 and PC2 were dominated by cognitive traits (p = 0.0009 and p = 0.0063), PC3 by social traits (p = 0.0016; permutation tests, n = 10,000). The largely non-overlapping loadings suggest orthogonality between social and cognitive domains. CI, confidence interval; CS+, rewarded conditioned stimulus; CS−, unrewarded conditioned stimulus. Note: Regression lines are illustrative only; statistical inference is based on Spearman’s correlation, which does not assume linearity.