I. Introduction

Social behavior is fundamental for numerous animal species, encompassing human societies. From the dynamic spectacle of Mexican waves in a football stadium to the intricate waggle dance of bees, the diverse manifestations of social interaction raise a pivotal question: How do these social behaviors come to fruition, and what roles do individuals play in their emergence?

In recent decades, the exploration of social behavior has predominantly centered around studying animal dyads in controlled laboratory conditions. However, these experimental paradigms inherently impose limitations on investigating intricate social behaviors that often involve more than two interacting individuals. Studies on social interactions frequently employ tests with brief observation periods, during which animals are evaluated in novel environments, accompanied by the presence of an experimenter that induces stress, influencing social behavior [15]. Recently, there has been a notable shift towards conducting experiments in natural settings, involving animal groups such as flocks of birds [6] or swarms of midges [7]. Additionally, semi-naturalistic environments, exemplified by fish in tanks [810], marching locusts in arenas [11], flocks of sheep [12], and hordes of rodents [13, 14], are increasingly being utilized. These approaches present unique opportunities for the comprehensive quantification of complex social interactions and sociability.

Mice stand out as a valuable model system for delving into the complexities of social behavior, given their intricate manifestation of various social behaviors. They tend to form cohesive groups, showcasing both amicable and agonistic behaviors. Depending on the environmental context, mice demonstrate territoriality and dynamic social hierarchies [15]. Communication among mice is extensive, primarily mediated through odors, allowing them to convey emotional states such as stress, fear, and preferences in food [16, 17]. Additionally, mice exhibit prosocial behaviors, actively assisting distressed fellow mice in need [18]. Decades of research have extensively explored social interactions between pairs of mice, while the study of mouse groups has only recently become feasible with advancements in high-throughput technologies, particularly radiofrequency identification (RFID) [14, 19]. The Eco-HAB system, utilized in this study, leverages RFID tracking to observe naturally-formed cohorts of mice in a controlled yet naturalistic environment, enabling longitudinal experiments on sociability with minimal human interference [14].

Social behavior arises from chance, individual preferences, group structure, and the transmission of preferences and interactions among group members. To unravel these elements and understand the establishment of social networks and hierarchies, we need high-resolution quantitative measurements of behavior over extended periods and statistical modeling to capture natural variability. In this study, we integrate Eco-HAB recordings with statistical inference to construct models of collective behavior, focusing on the statistics of system states to identify interaction structures within the group.

The prefrontal cortex (PFC) plays a crucial role in processing social information, understanding others’ emotions, maintaining social hierarchy, and transmitting information about food safety in both rodents and humans [20, 21]. Neuronal activity of the PFC is correlated with proximity to conspecifics, and studies in mice reveal distinct PFC responses to social and non-social olfactory stimuli [22, 23]. The PFC integrates existing knowledge with new information about self and others, demonstrating dynamic neuronal plasticity [24]. In cognitive tasks involving the PFC and subcortical areas, neuronal connectivity refines more rapidly in the former, highlighting its adaptability to changing environments [25, 26]. Tissue inhibitors of metalloproteinases (TIMPs), particularly TIMP-1, influence synaptic plasticity by inhibiting matrix metalloproteinases (MMPs), especially MMP-9 [2731]. TIMP-1 is involved in long-term potentiation (LTP), a crucial process for cellular-level memory formation [32, 33]. This sustained release impedes the updating of neuronal connectivity in the prelimbic part of the PFC (PL), crucial for maintaining social structure [3436]. Our study employs nanoparticles for gradual TIMP-1 release over several days [37] to impact plasticity in the PL on the changes in group behavior.

Maximum entropy models successfully explain social rules governing collective behavior in bird flocks and mouse hordes [38, 39]. These models help distinguish observed correlations, like the clustering of mice in a specific location, from direct interactions or individual preferences. Shemesh et al. pioneered the use of these models in studying mouse group behavior, revealing the significance of higher-order interactions in co-localization patterns [39]. While Shemesh et al. utilized video tracking of groups of four mice, our Eco-HAB setup employs RFID technology for tracking groups of 10-15 mice, providing more compact data with longer recording times but lower spatial resolution. Our focus is on quantifying sociability in mouse hordes through pairwise interactions within groups, ensuring statistical power. We explore whether interactions between pairs can explain collective behavior and examine how social structure evolves over time. We analyze the effects of individual versus social preferences and investigate the impact of considering three animals instead of two. Using a data analysis approach based on wild type C57BL6/J male and female mice, we discuss social structure and sociability changes in mice with temporary prefrontal cortex plasticity modification.

II. Results

A. Recording of mice location in naturalistic environment

Eco-HAB is an automated, ethologically-relevant experimental apparatus that tracts voluntary behavior in group-housed mice [14]. Constructed to simulate notable characteristics of natural murine environment, it consists of four connected large compartments, two of which contain food and water (Fig. 1A). Cohorts of 10 to 15 mice are introduced into the Eco-HAB, where they behave freely while their locations are tracked over time. The details of used mouse strains and cohorts’ compositions can be found in Materials and Methods. The compartments are connected with tube-shaped corridors resembling underground tunnels, on whose ends there are 125kHz antennas recording every time a mouse crosses with an accuracy of over 20Hz. Each mouse is tagged with a unique RFID tag. The mice are recorded for 10 days with alternating 12-hour-long light-dark phases that simulate the day-night cycle.

Mice were tested in Eco-HAB, a system for automated, ecologically-relevant assessment of voluntary behavior in groups of mice. Animals were tested for 10 days.

(A) Schematic of the Eco-HAB system, where four compartments are connected with tunnels. Food and water are available ad libitum in compartments 2 and 4. (B) Time series of the location of 15 mice over 10 days, as aligned to the daylight cycle. (C) The activity of the mice is affected by the circadian clock. Error bars represent standard deviation across all mice (mouse-mouse variability, in blue) or across all days for the mean activity level for all mice (day-day variability, in orange). The two curves are slightly shifted horizontally for clearer visualization. We focus the following analysis on the data collected during the first half of the dark phase, between 13:00 and 19:00 (shaded region).

The location of each mouse at each time is reconstructed using the time stamps, reducing the data to a discrete time series, σt at time t = 1, 2, …, T, with possible values of the locations σt = 1, 2, 3, 4 corresponding to the four compartments. The time resolution for the discretization is set to 2 seconds. As shown by the color-coded location traces in Fig. 1B, the majority of mice are often found in the same compartment, especially in the non-active light phases: this corresponds to the ethological behavior – mice tend to sleep in a pile to keep each other warm. This suggests that the behavior depends on latent variables, i.a. the circadian clock.

To ensure the relative consistency of the analyzed data, we used the observed rate of transitions to measure the activity of each mouse, and choose an analysis window of 6 hours covering the first half of the dark phase (13:00 - 19:00), which corresponds to the most active time on each day for the entire duration of the experiment. The variability of activity across individuals is larger than the day-to-day variability for a single mouse, suggesting that the level of locomotor activity is a well-defined individual characteristic (Fig. 1C).

Pairwise interaction model explains the statistics of social behavior

We first establish a quantification of sociability by building probabilistic interaction models for groups of mice. Following previous work [14], we use the in-cohort sociability, which measures the excess probability of two mice being found in the same compartment compared to the case where they are independent. Mathematically, in-cohort sociability is defined as:

where and fij(r, r′) are respectively the empirical frequencies of finding a mouse i in compartment r, and a pair of mice (i, j) in compartments r, r′.

As schematically explained in Fig. 2A, in-cohort sociability is due to pairwise interactions between each pair of mice, and modifies how likely they are to be found in each compartment with respect to the mice’s innate preference for that compartment. However, considering the presence of more than two animals, in-cohort sociability is not an effective measure of social structure of the group: two animals with zero attraction to one another can still be found to have a high in-cohort sociability, if a third animal has a strong social bond with both of them, since they all will be spending time with one another.

Mice in Eco-HAB interact in a pairwise fashion.

(A, B) The schematics showing pairwise interactions: two mice are more likely to be found in the same compartment than the sum of their individual preference implies (A); also, the probability for three mice being in the same compartment can be predicted from the pairwise interactions (B). (C) From pairwise correlation Cij, defined as the probability for mouse i and mouse j being in the same compartment (subtracted by the prediction of the independent model), and the probability for mouse i to be found in compartment r (subtracted by the model where each mouse spends equal amount of time in each of the four compartments), mir − 1/4, pairwise maximum entropy model learns the interaction strength between a pair of mice, Jij, and the local field , which gives the tendency for each mouse in each compartment. The data shown is the aggregated 5-day data from day 1 to day 5 of the C57 males (cohort M1). The pairwise maximum entropy model can predict higher order statistical structures of the data (schematics in panel B), such as the probability for triplets of mice being in the same compartment (subtracted by the prediction of the independent model, mathematically fijk(r, r, r) − fi(r)fj(r)fj(r)) (D), and the probability of K mice being found in the same compartment (E).

Since measurements of location preference, and the incohort sociability, together with the dynamical observables such as the rate of activity, are stable over time (SI Fig. S1) it invites a quantitative modeling of the joint-probability distribution of the co-localization of mice.

To distinguish social structure interactions from the effective correlations that define in-cohort sociability, we build a maximum entropy model with pairwise interactions. This approach constrains the joint probability distribution of all the possible co-localization patterns of all mice to reproduce the empirical occupation frequencies and the in-cohort sociability , while otherwise remaining as random as possible [3941]. With these assumptions, the joint probability distribution of the mice co-localization patterns can be written as

where hir is the individual preference of mouse i to be in compartment r, and Jij is the interaction between mouse i and mouse j. The set of parameters (hir, Jij) is learned through gradient descent (see SI Fig. S2 and Materials and Methods for details). The interactions Jij may be positive or negative. We see that although the structure of the interactions Jij follows that of in-cohort sociability Cij, they are not identical. Likewise, individual mice preferences hir are not equal to the occupation probability mir (Fig. 2C). Thus, this approach allows us to distinguish direct interactions from indirect ones.

To validate the model, we tested that it is able to predict higher order features of the data, such as the probability of a specific combination of triplets of mice being in the same compartment (Fig. 2BD), and the probability of observing K mice in the same compartment (Fig. 2E).

Although the model assumes the strength of interaction does not depend on which compartment the mice are in, our minimal model can predict probability of K mice in certain compartments (SI Fig. S4, compartments 1 and 3). The model-predicted in-state probability matches the observed in-state probability (SI Fig. S5), showing that at any single time point, the inferred pairwise model and the rest of the network provides an unbiased estimator of the in-state probability for every individual mouse. These results show that pairwise interaction among mice are sufficient to assess the observed collective behavior.

Stability of sociability over time

The data-driven model and its inferred parameters allow us to explore various aspects of social behavior. To test the reliability of the interaction parameters Jij to quantify sociability in groups of animals, we first assessed their temporal consistency. We found the distribution of the inferred interaction parameters Jij to be stable across the whole 10 days of the experiments, as quantified by its mean and standard deviation (Fig. 3B, top panel ; Fig. 3C, blue). We compared this variability to that of the individual preferences hir. As we recall the subterritories 2 and 4 contain food and water (Fig. 1A), for simplicity of visualization, we focus on the preference for these compartments as the preference for food-containing compartments, Δhihi2 + hi4hi1hi3. We find that these preferences, Δhi, changed more strongly from day to day (Fig. 3D, top panel) relative to the interaction measured as the Jij’s. These results suggest that the structure of social interactions in a cohort as a whole is consistent across all days, and can be quantified using the mean and the standard deviation of the interactions across all pairs. Notably, these measures of sociability were also consistent across different cohorts of mice of the same strain (SI Fig. S8).

Quantification of sociability, and the impact of the impaired neuronal plasticity in the prelimbic cortex (PL).

(A) The schematic of the experiment, in which neuronal plasticity in the PL of the tested subjects was impaired with TIMP-1 treatment. A cohort of C57BL6/J male mice (N = 15) was tested in Eco-HAB for 10 days, and then removed from the cages for neuronal plasticity manipulation procedures. After a recovery period, they were placed back in Eco-HAB for another 10 days. The effect of TIMP-1 is known to decay after 5 days [37], and as expected the most notable effects were recorded during the first half of the experiment. (B) The model-inferred interactions Jij for each day of the experiment, both before (top panel) and after TIMP-1 treatment (bottom panel). (C) Interaction parameters (mean and standard deviation, respectively top and bottom) before (blue) and after TIMP-1 treatment (red). The mean of the interaction strength is similar, but the variability increases for the first few days after TIMP-1 treatment. Error bars are the standard deviation across random halves of the duration of the experiment. (D) Preference for the compartments containing food Δhi for each day of the experiment, both before (top panel) and after TIMP-1 treatment (bottom panel). Data points corresponding to the same mouse are connected with gray line segments. (E) Mean (top panel) and variability (bottom panel) of the preference for the compartments containing food, Δhi, for each day of the experiment, before (blue) and after TIMP-1 treatment (red). The preference for the compartments containing food immediately increases after injection of TIMP-1, and decays to a control level after 6 days. Error bars are the standard deviation across random halves of the duration of the experiment. (F) Conditional log-likelihood of mouse locations, predicted by the pairwise model (blue / red), the independent model (black), and the null model (gray dashed) assuming no compartment preference or interactions, averaged across all mice. for before (top panel) and after TIMP-1 treatment (bottom panel). After impairing neuronal plasticity in the PL, individual compartment preferences, represented by the independent model, explain most of the prediction power. Error bars are the standard deviation across all N = 15 mice.

In contrast, the strength of the individual interactions in the specific pairs of mice i and j, Jij, vary more notably (SI Fig. S6). This sets a limitation to our model. The consistency is improved when the interaction strength is summed over all mice, Ji ≡ ∑j≠i Jij, which measures individual mouse sociability regardless of the identity of the other mice it interacts with. One possible explanation is that mice interactions are driven by individual social drive, rather than by the bilateral interactions.

Quantifying the influence of social versus individual preferences

Further, we ask how important social interactions are for determining mice behavior, by measuring how much the data can be explained by the individual preferences for specific spaces within the territory vs. the interactions with other mice. Mice are social animals, yet they perform many behaviors based on their individual moment-to-moment needs, and it is unclear a priori how much the social interactions influence mice behavior in comparison to their individual preference.

For each mouse i, we consider three nested models with increasing descriptive power: first, the null model assuming each mouse has the same probability of being found in each compartment, P (0)(σi) = 1/4; second, the independent model that assumes no interactions among mice, and the probability of finding each mouse in each compartment is solely determined by their individual preferences, P (1)(σi) = fi(σi); third, the inferred pairwise interaction model based on voluntarily spending time with other mice considered, P (2)(σi|{σji}), using Eq. (2).

We then quantified how well each model explains the data by comparing the mean log-likelihoods of finding a mouse in a given compartment, conditioned on the location of all other mice. As shown in Fig. 3F (top panel), including information on pairwise interactions increases the log-likelihood of the data by as much as including information on individual compartment preferences, as shown by the similar values of the probability ratios P (2)/P (1) and P (1)/P (0). The likelihood is consistent across all ten days of the experiment, but exhibits variability across different cohorts of animals within the same strain (SI Fig. S9). Examining differences of performance between the pairwise model and the independent model for individual mice on each day shows that some mice are consistently more sociable than others (SI Fig. S7)

Another possible measure of sociability is the mutual information between a single mouse’s location within the territory and the location of the rest of the cohort, which tells us how accurately the position of a single mouse can be predicted if the positions of all other mice are known (see details in Materials and Methods IV A). The possible values of the mutual information is between 0 and 2 bits, where 0 bits means no predictability, and 2 bits means perfect predictability. In our Eco-HAB data, the average mutual information for each mouse is 0.076 ± 0.052(SD) bits, with the largest value being 0.33 bits, indicating that despite non-zero sociability, the precise mouse position at any single moment is difficult to predict from the network.

Effect of impairing neuronal plasticity in the PL on subterritory preferences and sociability

As a next step we investigate the effects of impairing neuronal plasticity in the prelimbic cortex (PL), the brain structure containing neural circuits indispensable for both maintaining proper social interactions and encoding individual preferences [22]. To that end, we inject animals with a Tissue Inhibitor of MetalloProteinases (TIMP-1), an enzyme regulating the activity of synaptic plasticity proteins. Changing its physiological levels was previously shown to disrupt the neuronal plasticity in various brain structures [42, 43], including the prefrontal cortex (PFC) [33, 44]. Now, we ask if we can identify changes in behavioral patterns after the mice have been injected with nanoparticles (NP) gradually releasing TIMP-1 (NP-TIMP-1) into the PL [37]. It has been demonstrated in the Eco-HAB that it can reduce the mice’s interest in chasing other animals (a proxy for their social ranks), and diminish persistence in seeking reward related to social olfactory cues [44]. Here, we measure the behavior of a cohort of N = 15 C57BL6/J mice, before and after the injection of NP-TIMP-1. A cohort of mice is introduced into the Eco-HAB and their free behavior is measured for 10 days (Fig. 3A). Then, neuronal activity in the PL of the subjects is impaired by injecting nanoparticles releasing TIMP-1 into the PL. After recovery animals are re-introduced into the Eco-HAB, and their behavior is measured for another 10 days. As a control we also have a cohort (male cohort M4, N = 9) that is injected with nanoparticles loaded with bovine serum albumin, a physiologically neutral substance having no impact on neuronal plasticity (BSA, vehicle). The detailed experimental procedure can be found in [44]. To provide a perspective on both sexes, a female cohort is also included in this study (female cohort F1, N = 13); it was processed identically to the experimental group of males described above. For each day of the experiment, we infer a pairwise model (Eq. 2) and study the changes of the inferred interactions, as well as individual preferences for specific spaces within the territory.

We observe change in both the interaction strength Jij and individual preferences for the compartments containing food Δhi following the prolonged release of TIMP-1. The mean Jij remains stable and similar to its value from before treatment. The variability of the Jij shows a slight increase compared to the before-treatment base-line in the first days after treatment, with a return to pre-treatment levels after five days (Fig. 3B right panel, Fig. 3C), consistent with the time course of TIMP-1 release [37]. The individual preferences for compartment containing food shows an increase in both its mean and its variance across all mice following treatment (Fig. 3D right panel, Fig. 3E). In comparison, the control cohort M4 shows the increase in preference for the compartments containing food after injection of the BSA-loaded nanoparticles, similar to that of TIMP-1 injected cohort, while the mean and the variability of the social interactions do not change (SI Fig. S10) The effect of increasing variability in interactions is even stronger in the female cohort, where for day 1 and day 2 after TIMP-1 injection, the variability of interaction both significantly increased compared to the pre-injection baseline (p = 0.0014 for day 1, p = 0.0010 for day 2, SI Fig. S11B).

For better statistics, we take advantage of the observation that TIMP-1 release from the TIMP-1-loaded nanoparticles diminishes after 5 days and perform analysis on subsets of 5-day aggregate data: the first and the last 5 days of the experiment with the untreated control animals, and the first and the last 5 days of the experiment after the same animals are TIMP-1/BSA treated. Similar to the whole-day longitudinal analysis, the variability of interactions increases significantly after the TIMP-1 injection for both the male cohort M1 and the female cohort F1, and remains unchanged before and after BSA injection for the control cohort M4 (SI Fig. S13A, two-samples F -test for equal variance). With 5-day aggregate data, we can also ask how TIMP-1 induced modification of PL plasticity affects individual mice, by comparing the pairwise specific interactions Jij before and after drug treatment. However, we cannot conclude much as the Pearson’s correlation coefficient between Jij shows almost no significant correlation across the four time periods in the above datasets for both the TIMP-1 treated cohort and the BSA-treated control cohort (SI Fig. S13B).

To quantify the sociability of the entire cohort, we compute conditional likelihoods as introduced in the previous paragraphs, as it measures how much the pairwise model explains the observed data compared to a model where mice behave independently. Figure 3F shows that for cohort M1, the model’s likelihoods sharply increase following treatment, meaning that the behavior is more predictable. Represented by the independent model, the individual compartment preferences explain most of this increase, suggesting that TIMP-1 treatment reorganizes preferences for specific subterritories. These differences decay back to pre-treatment levels after 5 to 8 days, following the time course of drug release. A slightly smaller increase in model’s likelihood is observed in the control cohort M4 after BSA injection (SI Fig. S10), suggesting that at least part of the change in compartment preferences can be due to the injection procedure rather than change in the neuronal plasticity itself. In contrast, the increasing model likelihood is not found in the female cohort F1, where the conditional likelihood remains constant after TIMP-1 treatment. However, the contribution of the pairwise interaction is increased (SI Fig. S11E), which points to a sex specificity of observed effects.

This observation is further confirmed by the sociability measure of mutual information between single mouse location and the positions of the rest of the cohort, which was introduced in previous paragraphs. As shown in SI Fig. S12, the mutual information either does not change (for the male cohort M1), or increases (for the female cohort F1) after the injection of TIMP-1 for both analysis on single-day data and on 5-day aggregate data (SI Fig. S13C).

Impaired neuronal plasticity in the PL affects the structure of social interactions

The increasing variability of pairwise interactions and the non-decreasing mutual information between single mouse location and the location of the rest of the group upon TIMP-1 requires further investigation in the face of previous results showing that injecting TIMP-1 to the PL of wide type animals reduces their sociability. Thus, we examined the detailed group structure of pairwise interactions. We define the dissatisfaction triplet index (DTI) among a triplet of mice as Fijk ≡ − JijJjkJki if and only if among the three pairwise interactions among mice i, j and k, exactly one of them is negative (see Fig. 3A for schematics), and otherwise zero. Notice that DTI is analogous to the concept of “frustration” in physics of disordered systems. A positive DTI means a triplet of pairwise interactions where all of them cannot be satisfied simultaneously – for instance, if mouse i likes to be with j and k, but j and k do not like to be together. We define the global DTI by averaging the local DTI’s across all possible triplets of mice. The larger the global DTI is for a cohort, the more difficult it is for the cohort to form cliques with multiple mice where the interactions among each possible pairs are positive, which may suggest possible difficulty in transmitting information between mice. As shown in Figure 4B, PL-targeted plasticity disruption with TIMP-1 significantly increases the global DTI for the male mice cohort M1 and the female cohort F1 (p = 0.0019 for day 2 after TIMP-1 treatment for cohort M1, p = 6.9 × 10−8 and 4.8 × 10−10 respectively for day 1 and day 2 after TIMP-1 treatment for cohort F1; see Materials and Methods for details of the significant test), followed by the decay after 2 to 5 days. In contrast, in the control cohort M4, injecting male mice with BSA does not significantly change the global DTI. Further grouping of the data into 5-day segments shows that the increase of the global DTI after TIMP-1 treatment is significant (Fig. 4C, two-sample Welch’s t-test, variability from random halves of the data). This increase of the global DTI is due to the increasing variance of the interaction Jij, which is related to more of the negative interactions. Randomly shuffling Jij does not change the global DTI, indicating that no network structure was found that contributes to this global DTI (SI Fig. S14).

Effect of TIMP-1 on the structure of the interaction network.

(A) Schematics of how triplets of mice may enter a state of “dissatisfaction” due to competitive pair-wise interactions. Dissatisfaction reduces the space of preferable states due to competitive interactions. (B) The global dissatisfaction triplet index (DTI), F, as a function of time, computed using the inferred interaction Jij matrix learned each day. The level of dissatisfaction F increases in the first few days after TIMP-1 treatment for both male and female mice, and recovers to the level of the control group after five days. In the control cohort M4 (male treated with BSA), there is no significant difference of the global DTI before and after drug treatment. In the male cohort M1, day 2 after TIMP-1 injection shows a significantly greater value of global DTI compared to the baseline from before drug application (p = 0.0019, for day 1, p = 0.0054, see Materials and Methods for details of the significance test). In the female cohort M1, the global DTI increases significantly for day 1 (p = 6.7 × 10−8) and day 2 (p = 4.8 × 10−10) after TIMP-1 injection. (C) Global DTI computed using inferred interaction from 5-day segments of the data shows that for both male and female mice treated with TIMP-1, the global DTI is significantly increased after drug treatment. Two-sided Welch’s t -test is performed to test the significance for the difference of the global DTI between the first 5 days after drug injection against the other 5-day segments of the data.

III. Discussion

We demonstrate how the joint probability distribution of the mice positions in the Eco-HAB can be used to quantify sociability. By building a pairwise interaction model whose parameters are learned directly from the data, we quantify how much the combined activities of all mice in the cohort are influenced by their individual preferences and how much by the social context. This approach shows that pairwise interactions between mice are sufficient to describe the statistics of collective behavior in larger groups. Additionally, the pairwise interaction model can capture changes in the social interactions of the network induced by alterations in the neuronal plasticity of the prelimbic cortex (PL) in the tested subjects. The Eco-HAB, combined with this analytical approach, provides a toolbox to quantify sociability in mice, which can be applied generally to different mouse strains to study various behavioral phenotypes, including characteristics associated with neurodevelopmental disorders such as autism. Compared to traditional experimental methods like the three-chamber test, our study combines the advantages of an ecologically relevant and automatic experimental apparatus with the powerful tools of statistical inference. This disentangles the effects of individual preferences versus pairwise social interactions in generating the patterns of mouse positions within their territory. The challenge in studying social behavior lies in finding a balance between being specific enough to capture the properties of sociability while avoiding the loss of generalizability. Including excessive details, such as the classification of precise social behavior among mice, may lead to a more accurate description of the specific mice cohort studied, such as the construction of a precise social network. However, it is difficult to assess comparability across different cohorts of mice. Alternatively, as used in this paper, one can construct minimal models and use the ensemble statistics of the models to quantify social properties of a mouse strain without explicitly constructing social networks for each cohort. For example, our study found that the inferred interaction has similar ensemble statistics across three different male cohorts of the same strain but differs across different sexes. This provides evidence to support our argument for a coarse-grained description of mouse social behavior.

Another challenge in studying social behavior lies in the interplay of timescales. Comparing the probabilistic models we constructed using single-day data and aggregated five-day data, we find it challenging to balance model construction with enough data versus studying the temporal evolution of sociability. Is the day-by-day variability of the social network a true property of the social interaction of the mice cohort, or is it due to variabilities of the inferred model caused by the finite data? To address this question, one needs to consider various timescales. For example, mice-mice interactions occur at a much shorter timescale compared to the timescale of changes in the social network, while in between, there are the timescales of adaptation to the new environment and the circadian cycle. These issues need to be addressed using a combination of theoretical tools and experimental validation methods in future works.

Additionally, we have simplified our analysis by focusing on a 6-hour time window each day, during which the rate of locomotor activity is most stable. This approach allowed us to circumvent issues related to individual or strain differences in the circadian cycle, such as the observed ‘lunch hour’ in C57 male mice. One avenue for future research involves reintroducing the circadian cycle as a latent variable to better explain the system. However, caution must be exercised to differentiate between group behavior influenced by the circadian cycle of individual mice and emergent behavior resulting from interactions.

We will now discuss the relationship between our study and that of Shemesh et al., wherein the authors applied a similar approach, investigating the social behavior of groups of 4 mice in a complex experimental environment using statistical modeling of the joint probability distribution of mice locations [39]. They found that triplet interactions are necessary to describe collective behavior, while we found that triplet interactions can be predicted by the pairwise model. We suspect the difference in our results could arise from two factors. First, our data is more coarse-grained spatially, as the state of each mouse is determined by the large compartment it is in, whereas in Shemesh et al., the location is more precise (e.g., a door, a pillar, etc.). As suggested by a comparison to renormalization theory in physics, at coarser spatial scales, the importance of higher-order interactions is likely to decrease. Second, our studies include more mice (10 to 15) compared to Shemesh et al. (4 mice), which may also influence the importance of higher-order interactions. To further investigate these effects, future experiments in Eco-HAB could include mice cohorts of smaller sizes.

How do we move forward, and what is the ideal experiment to study social behaviors? We believe that Eco-HAB offers a balance between a semi-natural environment and controllability, which works well in studying social behavior. One direction for future experimental studies is to focus on the biological function of social interactions. For example, how do mouse cohorts respond to novel odors and transmit information among the cohort? What is the speed of information transmission related to sociability? The current configuration of the Eco-HAB already allows for the introduction of novel odors accessible to all mice, while the next generation of experiments will localize the introduction of information to individuals. From the analysis perspective, as presented in this manuscript, our current model is purely static. Our model describes the joint probability distribution of mice positions within the territory at concurrent time points and does not model the dynamics of the cohort. To take into account the dynamic aspect of social behaviors, such as dominant mice actively chasing others, one will need to build dynamical models of interaction. For example, this can be done by modeling the probability of transitioning to another compartment of each mouse as a function of the history of its previous location and the locations of all other mice [45].

IV. Materials and methods

Animals

Animals were treated in accordance with the ethical standards of the European Union (directive no. 2010/63/UE) and Polish regulations. All the experiments were pre-approved by the Local Ethics Committee no. 1 in Warsaw, Poland. C57BL/6 male mice were bred in the Animal House of Nencki Institute of Experimental Biology, Polish Academy of Sciences or Mossakowski Medical Research Centre, Polish Academy of Sciences. The animals entered the experiments when 2-3 month old. They were littermates derived from several breeding pairs. The mice were transferred to the animal room at least 2 weeks before the experiments started and put in the groups of 12-15 in one cage (56 cm × 34 cm × 20cm) with enriched environment (tubes, shelters, nesting materials). They were kept under 12h/12h light-dark cycle. The cages were cleaned once per week.

Exclude inactive and dead mice from analysis

Mouse whose trajectory does not cover all four compartments within the 6-hour period for at least one day of the experiment is defined as inactive, and excluded from the analysis. Including inactive mice in the maximum entropy model will results in unstable learned parameters, as shown by bootstrapped results. For the same mouse cohort before and after injection of drug (M1, M4, and F1), if a mouse is dead or inactive in either phase of the experiment, its trajectory is masked out from the data for consistency of comparison before and after. Specifically, for cohort F1, mouse number 13 (in the original ordering of the 14 mice) is inactive after the drug application. For cohort M4, mouse number 3 and 11 (in the original ordering of the 12 mice) died after surgery, mouse number 9 (in the original ordering) is inactive in the 10th day after drug injection.

Longitudinal observation of social structure in the Eco-HAB

Cohorts of mice with the same gender and same strain were placed in the Eco-HAB systems and observed for 10 days, removed from the system to undergo stereotaxic injections with TIMP-1 loaded nanoparticles. After 4 to 6 days of recovery, the mice were placed back to a cleaned Eco-HAB, and observed for 10 days.

Activity level

The activity level for a given mouse i during a given time period (ti, tf) on day d, is computed by counting the number of times the mouse passes by any antenna, and denoted by .

Averaging this quantity over all N mice, one obtain the mean activity level for all mice during a given time period. Mathematically, . The standard deviation across all days is the day-to-day variability of mean activity level.

Averaging this quantity over all T days, one obtain the mean activity level for each mouse. Mathematically, . The standard deviation across all mice is the mouse-to-mouse variability of the mean activity level.

Mice location

The raw data consists of time points when mice cross an antenna, as well as the identity of the specific antenna, which are placed at the ends of the four tunnels. The location of a mouse at any given time point is deduced from the most recent time stamps before and after the current time point. For simplicity, for the time points when a mouse is in the tunnel, the location of the mouse is set to be the compartment it will enter. The time resolution is set to 2 seconds, as two adjacent time stamps with separation less than 2 seconds are likely an artifact of mice sniffing the tunnel and returning to the previous compartment.

Pseudocounts

Observing the mice for a finite amount of time means sometimes we have the situation where the mice is stuck in the same compartment for the entire 6 hours of observation. This is not common, but this messes up our statistical inference or model building procedure. To avoid this situation, we use pseudocounts that smoothen the observed statistics. We define

and

where q = 4 is the number of possible states, T is the total number of time points in the data, and λ is the parameter for the pseudocount. In our analysis, after scanning through a range of values for λ, we set λ = 8 around which value the outcome remained largely unchanged.

The probability model

Gauge fixing for the local field : the probability model is equivalent for the local fields upon a constant, i.e. P (h) and P (hir + δhi) are equivalent. We overcome this redundancy by enforcing the sum of all local fields for each mouse to be zero, i.e. ∑r hir = 0.

Learning the probability model : We train the model using gradient descent, at each learning step k updating the parameters by and , where α = 0.25 ∼ 0.8 is the step size of learning. The stopping condition is set such that when the difference between the model predicted correlation and magnetization is less than the data variability, estimated by extrapolation from random halves of the data. In addition, because we are interested in quantifying social properties of mice cohort using the statistics of learned parameters, we add to the stopping condition that the mean and the variation of inferred interaction reach a stable value, with change less than over 100 learning steps.

A. Computing higher-order correlations

The connected three-point correlation function gives the frequency of finding three mice in the same compartment, subtracting the contributions from the mean and the pair-wise correlation. Mathematically,

If we only subtract the individual preference, then we define

Compare in-state probability between model prediction and data observation

Given time-series data and the inferred joint probability distribution of mice location, we can compare the in-state probability of single mouse, as given by model prediction versus data observation. For each time point and each mouse i, the marginal probability of mouse i being in each of the four compartments is computed using the true position of all the other mice at the same time point, and the inferred compartment preference hir for mouse i and the inferred pairwise interaction Jij for all ji (see Eq. (2)). These model-predicted marginal probabilities are binned using histogram equalization for each mouse, and for each bin, the observed in-state probability for each mouse is computed by frequency counts.

Compute mutual information between single mouse position and the rest of the network

The mutual information between single mouse position and the rest of the network is a measure of collectiveness. For Eco-HAB with 4 compartments, the mutual information is between 0 and 2 bits. If the mutual information is close to 2 bits, knowing where other mice are is a perfect predictor for the position of single mouse. If the mutual information is close to 0 bits, knowing where other mice are do not help predicting the position of the singled-out mouse. The mutual information can be computed as the difference of the entropy of mouse i and the conditional entropy of mouse i with respect to the state of all other mice. Mathematically,

where the entropy of mouse i is

and the conditional entropy is computed using the conditional probability given {σj } and the inferred pair-wise data, and averaged over all observed data patterns {σj(t)}.

To reach the final results, we approximate the ensemble average over all possible mice configurations with a temporal average over all observed mice configuration in the data. We also replace the true underlying conditional probability of P (σi|{σj}ji) with the inferred pairwise probability model P (2).

Generate errorbars using random bootstrapped halves of the data

Error bars of the observed statistics 𝒪 (e.g. pairwise correlation, Cij, and probability in each compartment, mir), the inferred parameters 𝒫 (e.g. pairwise interaction Jij and compartment preference hir), and the sub-sequent results ℛ (e.g. the entropy, S(1,2) and S(1), and the dissatisfaction triplet index F) are bootstrap errors generated by repeatedly taking random halves of the data and computing the deviations in the mean. Specifically, each data set (at least 6 hours in duration) is first divided into time bins of 400 seconds. The length of the time bin is chosen such that it is longer than twice the correlation time for each mouse. Then, random halves of the time bins are chosen to compute the observables, as well as used to train a specific pairwise maximum entropy model, which generates a specific set of learned parameters. The deviation across the random halves,σbs, can be extrapolated to the full dataset by .

Test of significance for comparing observables and inferred parameters

To perform significance test across different days of the experiment, we used the Welch’s t -test for the mean of inferred interaction ⟨Jij ⟩ and for the global dissatisfaction triplet index (DTI), F.

For the significance test comparing the global DTI, random halves of the 5-day aggregated data is chosen 10 times, each used to learn the interaction parameters and compute the global DTI. The variance of the global DTI across the 10 random halves is used as variation due to finite amount of data, and is adjusted by . Two-tailed tests are performed, and the Bonferroni correction is applied as the total number of tests performed for the pairwise comparisons for the 5-day aggregated data before and after pharmacological intervention is 6. In Figure 4, the asterisks encode the following p-values: *, p ≤ 0.025; **, p ≤ 0.005; * * *, p ≤ 0.0005.

Significance test comparing the statistics for single-day longitudinal tests is designed as the following. The base-line of each specific statistics (e.g. the mean and variability of the interaction, the global DTI, etc.) is computed using the mean and the standard deviation (SD) using the day-to-day variability from the 10-day test before drug application. Then, for the 10-day test after the drug application, we compare the statistics from each day to the normal distribution assuming the mean and the SD from before. A test is significant if the probability of the statistics is drawn from such normal distribution is less than p = 0.005, for which the Bonferroni correction is applied as the total number of tests performed is 10. Without the Bonferroni correction, the significance threshold is set to p = 0.05.

Data and code availability

All data used in our manuscript and the MATLAB and python code to analyze the data can be found in https://github.com/statbiophys/social_mice.

Note

This reviewed preprint has been updated to use the correct text; previously, a version prior to the one reviewed was presented here.

Acknowledgements

This work was partially supported by the European Research Council Consolidator Grant n. 724208, ‘BRAINCITY - Centre of Excellence for Neural Plasticity and Brain Disorders’ project of the Polish Foundation for Science, and the National Science Center grant 2020/39/D/NZ4/01785.

Stability of the data, given by the time evolution across 10 days of the experiment (A-D) and the scatter plot between the observables measured using the first 5 days of the data vs. the last 5 days of the data for cohort M1 (E-H).

The observables plotted include (A, E) mir, probability of mouse i being found in compartment r, (B, F) Cij, the (connected) pairwise correlation, or the in-cohort sociability, between mouse i and mouse j, (C, G) the sum of observed in-cohort sociability, Ci ≡ ∑ji Cij, which gives a proxy for how mouse i is effected by the social interaction, and (D, H) the activity rate, measured by the number of transition event per second. The error bars in panels E-H are extrapolated by bootstrapping random halves of the data.

The pairwise maximum entropy model is trained such that the model reproduces the probability for each mouse in each compartment, mir, and the probability for pairs of mice in the same compartment, Cij, as given by the data.

Error bars are generated by bootstrapping random halves of the data.

Experiments used in this study. The column NP indicates the load of the injected nanoparticles.

Learned parameters in the pairwise interaction model versus the observed statistics, plotted for the 5-day aggregate data from the first 5 days of the experiment on male cohort M1 before TIMP-1 treatment.

(A) The inferred interaction Jij versus the connected correlation Cij ; (B) the inferred individual compartment preference hir versus the in-compartment probability for each mouse mir - 1/4.

The probability of K mice found in the same compartment, predicted by the pairwise maximum entropy model, the independent model, and computed from the 5-day aggregate data for the first 5 days in male cohort M1 before TIMP-1 treatment (N = 15).

The subpanels are arranged in the same order as in the Eco-HAB setup. Error bars for the experiment are extrapolated from 50 random halves of the data, for the independent is generated by 50 random cyclic shuffling of the data, and for the pairwise model is from 50 random MCMC samplings (each with 54000 realizations, the same number of data points as the data) for the pairwise model.

Model predicted in-state probability matches data observation for the aggregate data of first five days of experiment in mice cohort M1 - C57BL6/J male mice (N = 15), which shows the prediction of the inferred pairwise model is unbiased.

Error bars are extrapolated from 20 draws of random halves of the data.

The specific pairwise interaction Jij is unstable across different days of the experiment. Results shown for cohort M1.

(A) The preference for the compartments containing food, Δhi, the inferred pairwise interaction strength, Jij, and the partial sum of inferred interaction strength, Ji ≡ ∑ ji Jij, vary over the 10 days of the experiment. (B) We measure the consistency of the inferred parameters using the Pearson’s correlation coefficient. (C) Bar plot for the correlation coefficient of the inferred parameters across various days. The compartment preference and Ji are more consistent compared to the pair-specific pairwise interaction.

Box plot for how much is each mouse influenced by its peers, measured by the difference of the conditional log-likelihood, given by the pairwise model vs. the independent model.

top: The difference in conditional log-likelihood for each day. bottom: The difference of conditional log-likelihood for each mouse shows some mice are consistently more social than others.

The inferred socialbility is consistent across different cohorts of mice of the same strain.

Plotted as a function of days of the experiment is (A) the mean of the inferred interaction strength, ⟨Jij ⟩and (B) the variablilty of interaction, measured by the standard deviation of the inferred interaction σJ. Error bars are the standard deviation across random halves of the duration of the experiment.

The conditional log likelihood is different for each cohort of C57BL6/J male mice (N = 13 in cohort M2, and N = 10 in cohort M3), exhibiting individuality.

Quantification of sociability, and the impact of the injection of BSA-infused nanoparticles, a control which does not impair neuronal plasticity in the prelimbic cortex (PL). This figure follows Figure 3, now for cohort M4 (N = 9).

In panel (B), no significant deviation from the baseline is detected for both the mean interaction and the variability of interaction in the data after BSA treatment.

Quantification of sociability, and the impact of the impaired neuronal plasticity in the prelimbic cortex (PL) in female mice. This figure follows Figure 3, now for cohort F1 (N = 13).

In panel (B), the asterisks indicate day 1 and day 2 after TIMP-1 injection, the variability of interaction, σJ is significantly increased compared to the baseline (p = 0.0014 for day 1, p = 0.0010 for day 2, see Materials and Methods for details of the significance test).

Mutual information between single mouse position and the rest of the network, given by the inferred pairwise model, for the male cohort M1 before and after TIMP-1 treatment.

(A), the control male cohort M2 before and after BSA treatment (B), the female cohort F1 before and after TIMP-1 treatment (C). In panel (D), day-averaged MI value shows that TIMP-1 treatment either does not change the mutual information for the male cohort M1, or increases for the female cohort F1 (two-sample t -tests: p = 0.012 for comparison between day-averaged MI values before and after TIMP-1 injection, and p = 8 × 10−5 comparing to the first 5-days after TIMP-1 injection).

Interaction analysis for 5-day aggregated data, for the male and female cohort injected with TIMP-1, and the male cohort injected with the BSA control.

(A) Inferred interaction Jij for cohort M1 for the first and the last 5 days of the experiment, before and after TIMP-1 treatment. Two-sample F-test was performed to test whether the distributions have the same variance. The interaction strength learned from mice location in the first five day after TIMP-1 treatment has a significantly different variance compared to both before treatment, and day 6 to day 10 after treatment. The increase of interaction variability is also observed in the female cohort, although only when comparing the first 5 days after treatment with the first 5 days before treatment, and is not observed for the male BSA cohort. (B) Pearson’s correlation coefficient between inferred interaction Jij from different days. Asterisks indicate statistical significance. Almost no correlation is detected between the inferred Jij. (C) Mutual information between single mouse position and the rest of the network given by the inferred pairwise model. Consistent with single-day results from Fig. S12, the mutual information does not change for the male cohorts M1 and M4, and increases for the female cohort F1 after TIMP-1 treatme.

The global dissatisfaction triplet index (DTI) computed using shuffled interaction, Fshuffled vs. the global DTI computed using the inferred interaction, Frandom half.

Each point corresponds to one random half of the data. The two sets of global DTI’s are equal within the range of the error bars, computed by standard deviation across 20 random shuffling of the inferred interaction Jij, which shows the global DTI comes from the value of the inferred interaction, and that there is no additional network structure of the inferred interaction.