JAX Animal Behavior System (JABS), a genetics-informed, end-to-end advanced behavioral phenotyping platform for the laboratory mouse

9 figures and 5 additional files

Figures

Figure 1
JABS data acquisition module (JABS-DA).

The JABS-DA consists of hardware and software for video data acquisition and processing. (A) JABS pipeline highlighting individual steps towards automated behavioral quantification. (B) Detailed example of JABS data acquisition, including a picture of the monitoring hardware, architecture of the real-time monitoring app, and screenshots from videos taken during daytime and nighttime. The open field arena is shown from the outside (left), and a screenshot of video data is shown on the right. JABS-DA blocks visible light to the camera and only collects data using IR illumination, which produces uniform data during day and night. The JABS-DA computer hardware and software (middle) allow streaming of video data from edge devices, which enables remote welfare checks and web-based experiment setup and monitoring. Data compression is handled on these edge devices.

Figure 2 with 2 supplements
JABS data acquisition module (JABS-DA) consists of a web-based control system for recording and monitoring experiments.

(A) JABS pipeline. (B–E) Screenshots from the Angular web client, which allows multiple JABS acquisition units in multiple physical locations to be monitored on one screen (B). The dashboard view allows monitoring of all JABS units and their status, and Device Status provides detailed data on individual devices (C). The recording session dashboard allows initiation of new experiments (D), and the remote welfare view allows live video to be streamed from each unit (E).

Figure 2—figure supplement 1
JABS data acquisition module: Environmental parameters in the arena.

(A) JABS pipeline highlighting individual steps towards automated behavioral quantification. (B) Carbon dioxide concentrations and (C) ammonia concentrations were both much higher in the standard wean cage than in the JABS arena. Carbon dioxide was also compared to room background levels. (D) Temperature and (E) humidity measured at floor level in JABS arenas and a standard wean cage compared to room background across a 14-day period. (F) Average body weight as percent of start weight in each JABS arena and wean cage across the 14-day period. (G) Food and (H) water consumption shown as grams per mouse per day for one JABS arena and one wean cage for a 14-day period.

Figure 2—figure supplement 2
Representative hematoxylin and eosin (H&E) stained tissue sections from mice after spending 14 days in the JABS arena or control wean cage.

Tissues selected for examination (eye, lung, trachea, and nasal passages) are those expected to be most affected if the mice lived in a space with inadequate air flow. All tissues appeared normal.

Figure 3
JABS-AL is a behavior annotation and classification module that allows training classifiers with sparse labels.

(A) JABS pipeline highlighting individual steps towards automated behavioral quantification. (B) Screenshot of the Python-based open-source GUI application used for annotating multiple videos frame by frame. One can annotate multiple mice and multiple behaviors. The labeled data are used for training classifiers using either random forest or gradient boosting methods. An adjustable window size (the number of frames to the left and right of the current frame) controls which neighboring frames contribute features to the current frame. The labels and predicted labels are displayed at the bottom. (C) A sample workflow for training a typical classifier. Multiple experts can sparsely label videos to train multiple classifiers for the same behavior. These classifiers can be compared, and experts can consult to iterate through the training process.
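The windowing described in (B) can be illustrated with a minimal sketch. It assumes a single per-frame feature and uses the mean as the window statistic; the function name `window_mean` and the sample values are hypothetical and are not taken from the JABS code base.

```python
from statistics import mean


def window_mean(values, window_size):
    """Mean of a per-frame feature over frames [i - w, i + w].

    The window is clipped at the start and end of the video, so edge
    frames are averaged over fewer neighbors (an assumption of this sketch).
    """
    n = len(values)
    out = []
    for i in range(n):
        lo = max(0, i - window_size)
        hi = min(n, i + window_size + 1)
        out.append(mean(values[lo:hi]))
    return out


# Hypothetical per-frame feature, e.g. a speed value for five frames.
speed = [0.0, 1.0, 2.0, 3.0, 4.0]
print(window_mean(speed, window_size=1))  # [0.5, 1.0, 2.0, 3.0, 3.5]
```

A larger window smooths the feature over more context at the cost of blurring short events, which is one reason window size is treated as a tunable parameter.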

Figure 4 with 2 supplements
JABS Benchmarks: Selecting hyper-parameters and benchmarking JABS classifiers using grooming dataset.

(A) JABS pipeline highlighting individual steps towards automated behavioral quantification. Using feature window size, type of classification algorithm, and the number of training videos as our benchmarking parameters: (B) Accuracy of JABS classifiers trained using features computed with different window sizes (in frames). Each boxplot shows the range of accuracy values across different numbers of training videos and types of classification algorithm. (C, D) The effect of increasing the training data size on the accuracy and AUROC score of the JABS classifiers. (E) ROC curves for the JABS classifier trained with a window size of 60, the XGB algorithm, and varying training data size. (F) True positive rate at 5% false positive rate for the JABS classifier from panel (E) as the amount of training data is changed. (G) Comparison of the performance of JABS-based classifiers with a 3D convolutional neural network (CNN) and JAABA-based classifiers for different training data sizes. JAABA and CNN results were adopted from Geuther et al., 2021.
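The metric in panel (F), true positive rate at a fixed false positive rate, can be sketched as follows. This is a minimal illustration that sweeps every observed score as a threshold and does not interpolate between ROC points; the function name `tpr_at_fpr` and the toy data are hypothetical, and the benchmark itself may have been computed differently.

```python
def tpr_at_fpr(labels, scores, max_fpr=0.05):
    """Highest true positive rate reachable with false positive rate <= max_fpr.

    labels: 1 for behavior frames, 0 for non-behavior frames.
    scores: classifier score per frame (higher means more likely behavior).
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need both positive and negative frames")
    best = 0.0
    # Try every distinct score as a decision threshold.
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= thr and y == 0)
        if fp / neg <= max_fpr:
            best = max(best, tp / pos)
    return best


# Toy example with four frames.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
print(tpr_at_fpr(labels, scores, max_fpr=0.0))  # 0.5
```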

Figure 4—figure supplement 1
Per-strain F1 score and accuracy for the grooming dataset.

Figure 4—figure supplement 2
F1 score and accuracy for mice with different coat colors in the grooming dataset.

Figure 5
Frame-based comparison of classifiers from different annotators but trained for the same behavior.

(A) JABS pipeline highlighting individual steps towards automated behavioral quantification. (B, C) Two sample ethograms for the left turn behavior showing variation in behavior inference between two annotators. (D, G) Kernel density estimate (KDE) of the percentage of frames predicted to be a left turn and a right turn, respectively, by each annotator across all the videos. The major discrepancy between the two annotators is that A2 systematically predicts more frames as behavior than A1. (E, H) Confusion matrix showing the agreement between the predictions of the two classifiers over all the videos in the strain survey for left and right turn behavior. (F, I) Venn diagram capturing the frame-wise behavior agreement between the two annotators for left and right turn behavior.
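The frame-wise quantities summarized by the confusion matrices and Venn diagrams can be sketched as below. This is an illustrative example only: the function name `frame_agreement`, the toy predictions, and the use of intersection over union as a summary number are assumptions of this sketch.

```python
def frame_agreement(pred_a, pred_b):
    """Count frame-wise agreement between two binary prediction tracks."""
    if len(pred_a) != len(pred_b):
        raise ValueError("prediction tracks must cover the same frames")
    both = only_a = only_b = neither = 0
    for a, b in zip(pred_a, pred_b):
        if a and b:
            both += 1       # both classifiers call the behavior
        elif a:
            only_a += 1     # only annotator A's classifier calls it
        elif b:
            only_b += 1     # only annotator B's classifier calls it
        else:
            neither += 1    # both call "not behavior"
    union = both + only_a + only_b
    return {
        "both": both,
        "only_a": only_a,
        "only_b": only_b,
        "neither": neither,
        # Intersection over union of behavior frames (assumed summary).
        "jaccard": both / union if union else 1.0,
    }


a1 = [1, 1, 0, 0, 1]
a2 = [1, 0, 0, 1, 1]
print(frame_agreement(a1, a2))
```

The four counts are the cells of a 2x2 confusion matrix, and `both`, `only_a`, and `only_b` are the three regions of a two-set Venn diagram.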

Figure 6
Bout-based comparison of classifier predictions from different annotators but trained for the same behavior.

(A) JABS pipeline highlighting individual steps towards automated behavioral quantification. (B) Ethogram depicting frame-wise left turn predictions for annotators A1 (red) and A2 (blue). (C) Ethograph corresponding to the ethogram in panel (B), capturing the bout-level information as a bipartite network. The nodes represent bouts, with node size proportional to bout length and node color indicating the annotator. Edge weights capture the fraction of overlap between two bouts predicted by different annotators for the same behavior. Edges and nodes with zero weight or size indicate bouts missed by an annotator. These have been given a small positive value for visualization purposes only. (D–E) Bout length distribution of annotators A1 and A2 for left and right turn behavior. (F) The mathematical definition of the average bout agreement between two annotators, where w(u,v) represents the weight between nodes u and v (u ∈ U, v ∈ V) in the ethograph G(U,V,E) and w is the bout overlap threshold (fixed at 0.5 for our study). (G) Overview of the workflow for stitching and filtering at the bout level. (H, I) Hyper-parameter tuning to find optimal filtering and stitching thresholds. (J) Sample ethogram and its corresponding ethograph before and after applying stitching and filtering. (K) Frame-wise comparison of predictions underestimates inter-annotator agreement, whereas the bout-wise comparison after filtering and stitching captures the overall agreement in a more biologically meaningful way.
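The bout-level processing described in panels (C), (F), and (G) can be sketched as follows. This is a hedged illustration, not the definition shown in panel (F): the function names are hypothetical, the overlap fraction is assumed to be intersection over union, stitching is assumed to run before filtering, and the agreement score is assumed to be the fraction of bouts that have a partner whose overlap meets the threshold.

```python
def find_bouts(frames):
    """Return (start, end) pairs, end exclusive, for each run of behavior frames."""
    bouts, start = [], None
    for i, f in enumerate(frames):
        if f and start is None:
            start = i
        elif not f and start is not None:
            bouts.append((start, i))
            start = None
    if start is not None:
        bouts.append((start, len(frames)))
    return bouts


def stitch(bouts, max_gap):
    """Merge consecutive bouts separated by at most max_gap frames."""
    merged = []
    for start, end in bouts:
        if merged and start - merged[-1][1] <= max_gap:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged


def filter_short(bouts, min_len):
    """Drop bouts shorter than min_len frames."""
    return [(s, e) for s, e in bouts if e - s >= min_len]


def overlap_fraction(a, b):
    """Intersection over union of two bouts (assumed edge weight)."""
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union


def bout_agreement(bouts_a, bouts_b, threshold=0.5):
    """Fraction of all bouts with a partner overlapping by >= threshold."""
    total = len(bouts_a) + len(bouts_b)
    if total == 0:
        return 1.0
    hit_a = sum(1 for a in bouts_a
                if any(overlap_fraction(a, b) >= threshold for b in bouts_b))
    hit_b = sum(1 for b in bouts_b
                if any(overlap_fraction(a, b) >= threshold for a in bouts_a))
    return (hit_a + hit_b) / total


frames = [1, 1, 0, 1, 1, 0, 0, 0, 1, 0]
bouts = filter_short(stitch(find_bouts(frames), max_gap=1), min_len=2)
print(bouts)  # [(0, 5)]
```

Stitching closes brief gaps inside one behavioral event, and filtering removes isolated single-frame calls, so two annotators whose predictions differ by a few frames at bout edges can still be counted as agreeing on the bout.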

Figure 7 with 2 supplements
JABS-AI (analysis and integration) module: strain-level behavioral phenotyping across genetically diverse mouse populations.

(A) JABS pipeline highlighting individual steps towards automated behavioral quantification. (B) Heatmap showing Z-transformed behavioral scores for aggregate phenotypes measured at three time points (5, 20, and 55 min) across the JABS600 strain survey. Each column represents a genetically distinct mouse strain, and each row corresponds to a specific behavioral measure including locomotion (turning left/right), exploration (rearing), self-directed behaviors (grooming, scratching), and escape responses. Color intensity indicates deviation from the population mean, with red representing increased behavioral expression and blue representing decreased expression relative to the strain average. Z-score thresholding (|Z-score|>1) was applied to all behavioral measures, with escape behaviors displayed separately using modified thresholding parameters to preserve detection of outlier strains exhibiting rare but phenotypically important escape responses. Behavioral measures are stratified by time point (T5, T20, T55) to capture temporal dynamics of phenotypic variation across genetic backgrounds.
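The Z-transformation and thresholding used for the heatmap in (B) can be sketched as below. This is a minimal illustration with hypothetical values: it assumes the sample standard deviation and assumes that scores with |Z| at or below the cutoff are masked to zero, which may differ from how the figure was produced.

```python
from statistics import mean, stdev


def z_scores(values):
    """Z-transform one behavioral measure across strains."""
    mu = mean(values)
    sd = stdev(values)  # sample standard deviation (assumption)
    if sd == 0:
        raise ValueError("measure is constant across strains")
    return [(v - mu) / sd for v in values]


def mask_small(z, cutoff=1.0):
    """Keep only scores with |Z| > cutoff; set the rest to 0.0."""
    return [x if abs(x) > cutoff else 0.0 for x in z]


# Hypothetical strain-level values for one measure at one time point.
measure = [1.0, 2.0, 3.0, 4.0, 5.0]
print(mask_small(z_scores(measure)))
```

In this toy example only the two extreme strains survive the |Z| > 1 cutoff.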

Figure 7—figure supplement 1
JABS behavior characterization module: univariate analysis captures the combined effect of sex and strain on the aggregate phenotypes using the JABS600 dataset.

(A) JABS pipeline highlighting individual steps towards automated behavioral quantification. (B) The LOD scores (−log10(q-value)) and effect sizes are shown in the left and right panels, respectively. In the left panel, the number of asterisks (*) represents the strength of evidence against the null hypothesis of no sex effect, while + represents a suggestive effect. In the right panel, the color (red for female and blue for male) and area of the circle (area being proportional to the size of the effect) represent the direction and magnitude of the effect size. Strains with a sex difference in at least one of the aggregated phenotypes are colored pink.

Figure 7—figure supplement 2
JABS600 strain distribution by sex.

Figure 8 with 1 supplement
JABS-AI (analysis and integration) module: large-scale GWAS investigation of different mouse behaviors utilizing the JABS1200 dataset.

(A) Statistical power comparison between two datasets (JABS600 vs JABS1200) at the genome-wide significance threshold of 2.4e-07. The y-axis shows how power varies with SNP effect size (x-axis). (B) Heritability (PVE) estimates for the aggregate (55 min) phenotypes. (C) Lower triangular matrix of genotypic correlations among all of the 55-min aggregate phenotypes, estimated using a bivariate linear mixed model. (D) Linkage disequilibrium (LD) block size, along with the mean genotype correlations for SNPs at varying genomic distances. (E) Manhattan plot of the aggregated GWAS results. Colors are determined by the peak SNP clusters from (F); SNPs within the same LD block are colored to match their peak SNP. Each SNP is assigned the minimum p-value across all phenotypes. (F) Heatmap of all the significant peak SNPs for each phenotype. Each row represents a SNP and is colored according to its assigned k-means cluster. The same cluster color scheme is used in panel (E).
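The aggregation used for the Manhattan plot in (E), where each SNP takes its minimum p-value across phenotypes, can be sketched as follows. The SNP identifiers, phenotype names, and p-values here are hypothetical; only the threshold of 2.4e-07 comes from the caption.

```python
def aggregate_min_p(pvalues_by_phenotype):
    """Map each SNP to its minimum p-value across all phenotypes."""
    min_p = {}
    for snp_to_p in pvalues_by_phenotype.values():
        for snp, p in snp_to_p.items():
            min_p[snp] = min(p, min_p.get(snp, 1.0))
    return min_p


def significant_snps(min_p, threshold=2.4e-07):
    """SNPs whose aggregated p-value passes the genome-wide threshold."""
    return sorted(snp for snp, p in min_p.items() if p < threshold)


# Hypothetical per-phenotype association results.
pvalues = {
    "grooming": {"snp1": 1e-9, "snp2": 0.2},
    "rearing": {"snp1": 0.5, "snp2": 1e-3, "snp3": 1e-8},
}
min_p = aggregate_min_p(pvalues)
print(significant_snps(min_p))  # ['snp1', 'snp3']
```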

Figure 8—figure supplement 1
JABS1200 strain distribution by sex.

Figure 9
JABS-AI (analysis and integration): a web application for sharing the JABS classifiers and automated downstream genetic analysis.

(A) Workflow of the web application. The user trains a classifier with the JABS active learning application and deposits it into our web application, which performs automated behavioral and genetic analyses on the dataset the user selects from our curated strain survey collection (JABS600, JABS1200) via a dropdown menu. The results of these analyses, including behavioral patterns and genetic correlations, are then sent to the user’s designated email address within a short timeframe. (B) Screenshot of the web app showing the table of classifiers developed in our laboratory, with metadata such as the date of creation, training hyperparameters, and user ratings. When any two classifiers are selected, the application offers the option to analyze the genetic correlations between the corresponding phenotypes, together with their heritability scores.

Additional files

MDAR checklist
https://cdn.elifesciences.org/articles/107259/elife-107259-mdarchecklist1-v1.docx
Supplementary file 1

Training and classifier metadata for grooming benchmark.

Table 1: Data used for grooming benchmark. Number of videos (first column), and number of annotated frames (second and third columns). Table 2: Classifiers trained by JABS with their respective window sizes and F1 scores.

https://cdn.elifesciences.org/articles/107259/elife-107259-supp1-v1.xlsx
Supplementary file 2

Behavioral phenotypes definitions.

Table 3: Summary of frame-wise behavioral phenotypes and their definitions. Each value corresponds to the total duration (in seconds) of the indicated behavior during the specified time window, averaged across all analyzed videos. Table 4: Behavioral phenotypes annotated by different annotators (A1, A2). Each phenotype measures a specific metric related to bouts of the indicated behavior during the first 55 min of the video, averaged across all the analyzed videos.

https://cdn.elifesciences.org/articles/107259/elife-107259-supp2-v1.xlsx
Supplementary file 3

Table of JABS features used for training behavioral classifiers.

Table 5: List of JABS per-frame features.

https://cdn.elifesciences.org/articles/107259/elife-107259-supp3-v1.xlsx
Supplementary file 4

List of mouse strains used in this study and their JAX stock numbers.

Table 6: Mouse strains used in this study and their JAX stock numbers.

https://cdn.elifesciences.org/articles/107259/elife-107259-supp4-v1.xlsx

  1. Anshul Choudhary
  2. Brian Q Geuther
  3. Thomas J Sproule
  4. Glen Beane
  5. Vivek Kohar
  6. Jarek Trapszo
  7. Vivek Kumar
(2026)
JAX Animal Behavior System (JABS), a genetics-informed, end-to-end advanced behavioral phenotyping platform for the laboratory mouse
eLife 14:RP107259.
https://doi.org/10.7554/eLife.107259.3