Novel and optimized mouse behavior enabled by fully autonomous HABITS: Home-cage assisted behavioral innovation and testing system

  1. Bowen Yu
  2. Penghai Li
  3. Haoze Xu
  4. Yueming Wang
  5. Kedi Xu
  6. Yaoyao Hao (corresponding author)
  1. The State Key Lab of Brain-Machine Intelligence, Zhejiang University, China
  2. Nanhu Brain-computer Interface Institute, China
  3. Department of Biomedical Engineering, Zhejiang University, China
  4. College of Computer Science and Technology, Zhejiang University, China
6 figures, 2 videos, 1 table and 3 additional files

Figures

Figure 1 with 1 supplement
System setup for HABITS.

(A) Front (left) and side (right) views of HABITS, showing the components for stimulus presentation (LEDs and buzzers), reward delivery (water tanks and pumps), behavioral reporting (lickports), and health monitoring (weight platform). These components are coordinated by the controller unit and integrated into the mouse home-cage, which has a tray for bedding changes. (B) HABITS installed on a standard mouse cage rack. (C) A mouse living in the home-cage with food, bedding, nesting material (cotton), and enrichment (tube), performing a task on the weight platform. (D) System architecture for high-throughput behavioral training, showing different tasks running in parallel across groups of HABITS units, which connect wirelessly over Wi-Fi to a single PC and stream real-time data to the graphical user interface (GUI).

Figure 1—figure supplement 1
HABITS system.

(A) Block diagram of the HABITS control system, showing peripherals connected to the microcontroller through digital input/output (DIO) or a serial port. (B) Graphical user interface (GUI) of a specific cage (left, magnified) and the data plot window (right) opened by clicking the ‘plot’ button in the GUI, showing daily performance over all previous days, trial performance (green for correct and red for error trials) in the last 24 hours, and body weight data in the last 24 hours. (C) Example protocol programs for HABITS. (D) Around 100 HABITS units packed on standard racks for large-scale mouse behavioral testing. (E) Workflow pipeline for HABITS, showing fully autonomous mouse behavioral training after initialization of HABITS and before data are harvested from the SD card for analysis.

Figure 2 with 3 supplements
HABITS performance in d2AFC task.

(A) Task structure for d2AFC based on sound frequency. (B) Example licks for correct (blue), error (red), and early lick (gray) trials. Choice is the first lick after response onset. (C) Correct rate (black line) and early lick rate (gray line) of an example mouse during training in HABITS for the first 13 days. Shaded blocks indicate trials that occurred in the dark cycle. Trials with early lick inhibition only occur after the blue vertical line. Red vertical dashed lines represent delay duration advancement from 0.2 s to 1.2 s. (D) Averaged correct rate (left) and early lick rate (right) for all mice trained in d2AFC. Criterion level (75%) and chance level (50%) are labeled as gray and red dashed lines, respectively. (E) Same as (D) but for manual training (1–3 hr/day in home-cage). (F) Averaged correct rate (left), early lick rate (middle), and no response rate (right) of expert mice trained with the two protocols. (G) Averaged number of trials (left) and days (right) to reach the criterion performance for the two training protocols. Circles, individual mice. Error bar, mean, and 95% CI across mice. (H) Left, number of trials performed per day throughout the training schedule for three different protocols. Error bar indicates the mean and 95% confidence interval (CI) across mice. Middle, volume of water harvested per day. Right, relative body weights of mice on days 0, 8, 16, and 26. Bold line and shades indicate mean and 95% CI across mice. (I) Behavioral performance of all mice trained in the d2AFC task based on sound orientation (left), light orientation (middle), and light color (right). (J) Box plot of average number of trials (left) and days (right) to reach the criterion performance for d2AFC tasks with different sensory modalities. (K) Left, percentage of trials performed as a function of time of day for the four modalities trained autonomously (thick black line shows the average). Shaded area indicates the dark cycle. Top right, averaged correct rate of grouped mice in the dark cycle versus the light cycle. Error bars show 95% CI across mice. Bottom right, box plot of the averaged proportion of trials performed in the dark cycle for the four modalities. Data collected from expert mice. (L) Left, percentage of trials in blocks with varying numbers of consecutive trials for automated training in home-cage. Right, correct rate and early lick rate as functions of trial block size. Gray dashed line, the criterion performance; red dashed line, chance performance level. Data collected from trials of expert mice. For significance levels not mentioned in all figures, n.s., not significant, p>0.05; *, p<0.05; **, p<0.01 (two-sided Wilcoxon rank-sum tests).

Figure 2—figure supplement 1
Autonomous versus manual training in home-cage.

(A) Flow chart of the task training protocol in home-cage (Materials and methods). (B) Logistic regression model. (C) Top, behavioral performance of an example mouse in the autonomous training. Bottom, the significance of individual regressors; circle size corresponds to p-values. The significance of a regressor is evaluated by comparing the prediction of the full model to a partial model with the regressor of interest excluded. p-values are based on a cross-validation t-test (Materials and methods). (D) Percentage of trials significantly predicted by different regressors during task learning. Circles and light lines, individual mice; bars and bold lines, average across mice; shades and error bars, 95% CI. *, p<0.05, n.s., p>0.05, two-sided Wilcoxon rank-sum tests. (E) Averaged water harvested per day (left) and number of trials per day (right) changing from manual to autonomous training in home-cage. Circles, individual mice; bar plot and error bar, mean and 95% CI across mice. (F) Averaged relative body weights as a function of training days for free water (blue) and all d2AFC training mice (black). Shaded area shows 95% CI. (G) Performance of all 6 female mice performing the d2AFC task in home-cage automatically. (H) Histogram of inter-trial interval for both autonomous and manual training in HABITS.

Figure 2—figure supplement 2
Reaction-time-based 2AFC task training in home-cage automatically.

(A) Task structure of the RT-based 2AFC task. (B) Flow chart of the training protocol in home-cage. (C) Conditioned behavioral data of example trials for correct (blue block) and error (red block) choices. (D) Performance of an example mouse performing the task in home-cage. The background color corresponds to (B). Gray blocks indicate the dark cycle. Gray dashed line, the criterion performance. Red horizontal dashed line, chance performance level. (E) Correct rate of all mice. (F) Reaction time of all mice. Black line, fit to all mice from the onset to the end of training. (G) Histogram of reaction time. Data collected from all mice. The bold vertical line represents the median RT. (H) Conditioned histogram of inter-trial interval (ITI) for correct (blue) and error (red) trials.

Figure 2—figure supplement 3
Value-based dynamic foraging task.

(A) Task structure. (B) Example performance of a mouse in the early (top, first 6000 trials) and late (bottom, last 6000 trials) training stages with block size 500. Blue lines represent moving averaged behavioral probability of left choice within 40 trials. Purple lines show the assignment probability for left reward. (C) Averaged probability of choosing the lickport with the higher assignment probability (P(high)) across mice gradually increases following the number of trials. Black line indicates the assignment probability for left and right lickports is 60% (gray line, 52.5%) and 10% (gray line, 17.5%), respectively. Dots and error bars, mean and 95% CI. (D) Left, averaged P(high) across mice follows training sub-protocols with different block size. Right, the number of days to complete all training protocols from block size 500–100. Square dots indicate individual mice. (E) Same as (B) but data collected from the sub-protocol with block size 100.
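
The moving-averaged probability of left choice described in (B) can be illustrated with a minimal Python sketch. It assumes a trailing window (the caption does not say whether the 40-trial window is trailing or centered), and the function name and the 'L'/'R' choice encoding are hypothetical, not taken from the HABITS code.

```python
from collections import deque


def moving_left_probability(choices, window=40):
    """Trailing moving average of the probability of a left choice.

    choices: iterable of 'L' / 'R' labels, one per trial (assumed encoding).
    window:  number of most recent trials averaged (40 in the caption).
    Early trials are averaged over however many trials exist so far.
    """
    recent = deque(maxlen=window)  # keeps only the last `window` trials
    probabilities = []
    for choice in choices:
        recent.append(1 if choice == "L" else 0)
        probabilities.append(sum(recent) / len(recent))
    return probabilities


if __name__ == "__main__":
    # Tiny illustration with a 2-trial window instead of 40.
    print(moving_left_probability(["L", "L", "R", "R"], window=2))
```

A centered window would smooth block transitions symmetrically; the trailing form is used here only because it is the simplest to state.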

Figure 3 with 1 supplement
Representative cognitive task performed in HABITS.

(A) Contingency reversal task. (A1) Task structure. (A2) Correct rate of example mice with different learning rates. Gray vertical lines indicate contingency reversal. (A3) Relative number of trials to reach the criterion as a function of reverse times. Gray lines, individual mice. Black lines, linear fit. (A4) Number of trials in the first reversal learning versus the average number of trials of the rest of contingency reversal learning for each mouse (each dot). Black line, linear regression. Red dash line, diagonal line. (B) Working memory task with sound frequency modality. (B1) Task structure. (B2) Stimulus generation matrix (SGM) for left (orange) and right (green) trials. (B3) Left, averaged correct rate for each stimulus combination tested. Right, averaged correct rate for each (S1 +S2) stimulus combination across mice. Black line and shade, linear regression and 95% CI. (B4) Averaged psychometric curves, that is percentage of right choice as a function of frequency difference between sample 1 and sample 2. (B5) Averaged correct rate as a function of delay duration. (C) Evidence accumulation with spatial cue task. (C1) Task structure. (C2) Averaged psychometric curves, that is performance as a function of the difference between right and left clicks rates. (C3) Averaged correct rate across all mice as a function of sample duration for different Poisson rates (different colors). Error bar represents 95% CI. (D) Multimodal integration task. (D1) Task structure. (D2) Averaged correct rate across all mice as a function of sample duration for different stimulus modalities (different colors). (D3) Averaged event rates during sample period for left (red) and right (blue) choice trials. (D4) Averaged weights (black line) of logistic regression fitting to the choice of trials across expert mice tested in >1000 trials (N=11 mice) from the first bin (40ms) to the last bin (1000ms) of the sample period. A gray dash line represents the null hypothesis. 
Gray dots indicate significance, p<0.05, two-sided t-tests. (D5) Psychometric curve for trials with multimodal stimulus. (E) Confidence probing task. (E1) Task structure. (E2) Psychometric curve, that is right choice rate as a function of relative contrast (log scaled relative frequency). (E3) Histogram of time invested (TI) for both correct and error trials. (E4) Averaged correct rate across all mice as a function of TI. (E5) Averaged TI as a function of absolute relative contrast for both correct and error trials. Circles, individual mice; *, p<0.05; **, p<0.01, two-sided Wilcoxon rank-sum tests.
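
Several panels above (B4, C2, D5, E2) report psychometric curves, that is the rate of right choices as a function of a signed stimulus difference. A minimal sketch of how such a curve could be tabulated from trial data follows. The trial representation and the function name are assumptions made for illustration, and the sketch only bins raw choice rates; it does not fit a sigmoid or average across mice.

```python
from collections import defaultdict


def psychometric_curve(trials):
    """Fraction of right choices at each stimulus difference.

    trials: iterable of (stimulus_difference, chose_right) pairs, where
            stimulus_difference is any signed, hashable value (for example
            right minus left click rate) and chose_right is a bool.
    Returns a list of (stimulus_difference, right_choice_rate) tuples,
    sorted by stimulus difference.
    """
    counts = defaultdict(lambda: [0, 0])  # difference -> [right choices, trials]
    for difference, chose_right in trials:
        counts[difference][0] += int(chose_right)
        counts[difference][1] += 1
    return [(d, right / total) for d, (right, total) in sorted(counts.items())]


if __name__ == "__main__":
    example = [(-2, False), (-2, False), (0, True), (0, False), (2, True)]
    print(psychometric_curve(example))
```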

Figure 3—figure supplement 1
Other complex cognitive behavioral tasks training in home-cage automatically.

(A) Left, stimulus generation matrix of the working memory task. Middle, number of days to train. Right, correct rate for the SGM. Values on the diagonal correspond to the correct rate of probe trials. (B) Top, d3AFC task according to sound orientation and number of days to reach the criterion performance. Dots indicate individual mice. Performance (middle) and early lick rate (bottom) of all mice performing the d3AFC task. Red dashed line, chance performance level; gray dashed line, the criterion performance. (C) Left, contingency reversal of the d3AFC task according to sound frequency (top) and performance of an example mouse (bottom). Right, averaged correct rate across all mice for different reverse times (top). Number of trials needed to learn as a function of reverse training times (bottom). Dots, individual mice. Line and shades, linear regression.

Figure 4
Challenging mouse tasks innovated in HABITS.

(A) Continuous learning task. (A1) Task structure showing mice learning five subtasks one by one. (A2) Left, averaged correct rate of all mice performing the five tasks (different colors) continually. All task schedules are normalized to their maximum number of trials and divided into 10 stages equally. Right, box plot of number of trials to criteria for each task. (A3) Left, averaged reaction time of all mice performing the five tasks continually. Right, averaged median reaction time across the five tasks during early (perf. <0.55), middle (perf. <0.75), and trained (perf. >0.75) stage. Error bar indicates 95% CI. (A4) Same as (A3) but for absolute performance bias. n.s., p>0.05; **, p<0.01, two-sided Wilcoxon signed-rank tests. (B) Double delayed match sample task (dDMS) with sound frequency modality. (B1) Task structure. (B2) Averaged correct rate across all mice during training (left) and averaged number of days to reach the criterion (right). (B3) Averaged early lick rate across all mice. (B4) Averaged correct rate (black) and early lick rate (gray) for all combination of sample and test stimulus. (B5) Heatmap of error rate (left) and early lick rate (right) varies with different combination of delay1 and delay2 durations. (C) Delayed 3 alternative forced choice (d3AFC). (C1) Task structure. (C2) Averaged correct rate across all mice during training (left, colors indicate trial types) and averaged number of days to reach the criterion performance (right). (C3) Averaged correct rate (colors indicate trial types) and early lick rate (gray) for different trial types. (C4) Averaged error rate of choices conditioning trial types. In each subplot, the position of bars corresponds to different choices. ****, p<0.0001, n.s., p>0.05, two-sided t-tests. (C5) Averaged choice rates for the three lickports (colors) as a function of sample frequency. Data collected from trained mice. (D) Context-dependent attention task. (D1) Task structure. 
(D2) Averaged correct rate across all mice during training (left, data only from trials with multimodal w/ conflict) and averaged number of days to reach the criterion (right). (D3) Correct rate (left) and reaction time (right) conditioning modalities. (D4) Averaged psychometric curve and partitioned linear regression for the multimodal with and without conflict conditions, respectively. (D5) Performance bias to sound orientation modal as a function of pre-cue contrast, for the two multimodal conditions. (D6) Averaged correct rate as a function of delay duration.

Figure 5 with 1 supplement
MT enabled faster learning with higher quality.

(A) The framework of the machine teaching (MT) algorithm (see text for details). (B) Working memory task as in Figure 3B, but with the full stimulus generation matrix. (C) Averaged number of trials needed to reach the criterion for MT-based and random trial type selection strategies. **, p<0.01, two-sided Wilcoxon rank-sum test. (D) The absolute difference between contrast (contr.) of sample1 (S1) and sample2 (S2) during the training process for the two strategies. (E) Same as (D) but for correct rate. (F) MT-based d2AFC task training. Box plot of correct rate of expert mice (left) and number of trials needed to reach the criterion (right) for different training strategies (MT, anti-bias, and random). n.s., p>0.05, Kruskal–Wallis tests. (G) Left, averaged absolute performance bias for the three strategies during different training stages. Right, averaged across training stages. (H) Same as (G) but for absolute trial type bias. (I) Percentage of trials showing significance for different regressors during task learning. (J–K) Box plot of correct rate (J) and prediction performance difference between the full model and the partial model excluding current stimulus (S0) (K) for different trained stages, including early (perf. >75%), middle (perf. >80%), and well (perf. >85%) trained. *, p<0.05, **, p<0.01, ***, p<0.001, n.s., p>0.05, two-sided Wilcoxon rank-sum tests with Bonferroni correction.

Figure 5—figure supplement 1
Simulation of machine teaching algorithm in decision-making scenario.

(A) The weights of regressors in an ideal learner as they vary while learning a 2AFC task. Note that the initial weights of the bias and S1 regressors are not zero. (B) The presented trial types generated by random (black) and MT (red) strategies during the entire training process. (C) Same as (A) but with the weights of all regressors beginning at zero.

Figure 6 with 1 supplement
MT manifested distinct learning path with faster forgetting and higher learning rate.

(A) Task structure. (B) Chart of training path in latent decision space following three goals one by one. (C) Top, averaged correct rate across grouped mice during training (color, machine teaching; black, random). Bottom, same as top but performance for non-relative cue. (D) Top, the slopes of linear regression between trial number and correct rate. Bottom, same as top but between trial number and performance for non-relative cue. **, p<0.01; n.s., p>0.05; two-sided Wilcoxon rank-sum tests. (E) The learning path of mice (lines) in latent decision space for machine teaching and random training strategies. Light dots represent model weights fitted to individual mice’s behavioral data. Shaded dots, averaged across mice. (Square dots, testing protocol; cross dots, the first or the last half of trials in learning protocol; circle dots, all trials in learning protocol.) (F) Left, averaged absolute trial type bias between stay and switch conditions across grouped mice for the MT and random strategies from L1 to L3. Right, same as left but for the bias between left and right trials. (G) Same as (F) but for absolute performance bias in T1 and T2 protocols. L1, the first 500 trials of frequency learning protocol; L2, intermediate trials of frequency learning protocol; L3, the last 200 trials of frequency learning protocol; T1, testing orientation protocol; T2, testing frequency protocol. *, p<0.05; n.s., p>0.05; two-sided t-tests.

Figure 6—figure supplement 1
Details of behavioral analysis for multi-dimensional tasks.

(A) Left, linear regression between trial number and correct rate in the task requiring mice to attend to sound frequency. Right, the R-squared of every individual linear regression. (B) Same as (A) but for performance following non-relative cue. (C) The number of trials to reach criterion performance for the MT and random groups. (D) Performance of both groups of mice in the T1 and T2 protocols. n.s., not significant, two-sided Wilcoxon rank-sum tests. (E) The presented individual trials with Stay/Switch (top) and Left/Right (bottom) trial types generated by MT (L3) and Random (T2). (F, G) After mice were trained by MT as in Figure 6A, the training protocol was immediately reset to the beginning and the mice were retrained with a randomly generated trial sequence. We compared the correct rate of trials with sound frequency stimulus in the first and the second training, presented in (F). (G) shows the learning rate (left) and training efficiency (right) of the first and the second training processes. **, p<0.01; two-sided Wilcoxon signed-rank tests. (H) Correct rate of both groups of mice for stay and switch trials in the T2 protocol. n.s., not significant, two-sided Wilcoxon rank-sum tests.

Videos

Video 1
Free-moving mouse performing task in HABITS.
Video 2
The 24 hr activities of mice living in HABITS.

Tables

Table 1
All tasks trained in HABITS.

| Protocol name (abbreviation) | Modality | Animals trained (trained / used) | Note |
|---|---|---|---|
| delayed 2-Alternative Forced Choice (d2AFC) | Sound frequency (3 k vs. 10 kHz) | 11/11 | Figure 2 |
| | Sound orientation (Left vs. Right) | 10/11 | |
| | Light orientation (Left vs. Right) | 10/11 | |
| | Light color (Blue vs. Red) | 8/11 | |
| | Light color (Green vs. Blue) | 0/10 | Mice are insensitive to light colors. |
| | Light color (flashed Green vs. Blue) | 0/10 | |
| Reaction time 2AFC (RT-2AFC) | Sound frequency (3 k vs. 12 kHz) | 6/6 | Figure 2—figure supplement 2 |
| Contingency reversal | RT-2AFC, sound frequency (3 k vs. 12 kHz) | 8/8 | Figure 3A |
| Continuous learning | Sound freq. (3 k vs. 12 kHz), reversal sound freq., sound orient. (Left vs. Right), reversal sound orient., light orient. (Left vs. Right) | 10/30 | Figure 4A; 20 mice did not learn the light orientation modality within 90 days. |
| Evidence accumulation with spatial cue | Poisson distributed clicks with spatial diff. (Left vs. Right) | 8/10 | Figure 3C |
| Multimodal Integration | Poisson distributed clicks and flashes (4 vs 20 events/s) | 13/15 | Figure 3D |
| | Poisson distributed light flashes (4 vs 20 events/s) | 3/15 | 12/15 mice failed to discriminate light flash |
| Confidence probing task | Sound frequency (8 k vs. 32 kHz) | 5/6 | Figure 3E |
| | Sound frequency (8 k vs. 32 kHz) and Poisson distributed clicks with spatial diff. (Left vs. Right) | 0/12 | Delay period up to 8 sec, failed |
| Value-based dynamic foraging task | No sensory cues (Block size from 500 to 100) | 6/6 | Figure 2—figure supplement 3 |
| Working memory task | Temporal regular clicks with 5 alternative rates (8, 16, 32, 64, 128 Hz) | 5/6 | Figure 3B |
| | Temporal regular clicks with 3 alternative rates (8, 32, 128 Hz) | 8/8 | Figure 3—figure supplement 1C |
| double Delayed Match Sample (dDMS) | Sound frequency (3 k & 12 kHz) | 10/10 | Figure 4B; Sample and test period: 500 ms |
| | Sound frequency (3 k & 12 kHz) | 0/10 | Random sample and test period: 100 or 1000 ms |
| delayed 3-Alternative Forced Choice (d3AFC) | Sound frequency (8 k vs. 16 k vs. 32 kHz) | 14/14 | Figure 4C |
| | Sound frequency reversal (8 k vs. 16 k vs. 32 kHz) | 4/4 | Figure 3—figure supplement 1B |
| | Sound orientation (Left vs. Middle vs. Right) | 6/6 | Figure 3—figure supplement 1A |
| Context-dependent attention task | Sound frequency (3 k vs. 12 kHz) or sound orientation (Left vs. Right); regular click rates (16 vs 64 Hz) as context cue | 6/6 | Figure 4D |
| | Sound orientation (Left vs. Right) or light orientation (Left vs. Right); regular click rates (16 vs 64 Hz) as context cue | 0/10 | Mice failed in light modality |
| | Sound orientation (Left vs. Right) or light orientation (Left vs. Right); Sound frequency (3 k vs. 12 kHz) as context cue | 0/20 | |
| Machine teaching algorithm | Working memory task, Temporal regular clicks with 5 alternative rates and full stimulus matrix (8, 16, 32, 64, 128 Hz) | 7/7 (MT); 5/8 (Random) | Figure 5B |
| | RT-2AFC, Sound frequency (3 k vs. 12 kHz) | 10/10 (MT); 10/10 (random); 10/10 (antibias) | Figure 5F; Same group of mice used in continuous learning |
| | 2AFC with Sound frequency (3 k vs. 12 kHz), sound orientation (Left vs. Right) and sound orientation reversal, respectively | 10/10 (MT); 8/8 (random) | Figure 6 |
| Total | N/A | 200/284 | N/A |

Cite this article

Bowen Yu, Penghai Li, Haoze Xu, Yueming Wang, Kedi Xu, Yaoyao Hao (2025)
Novel and optimized mouse behavior enabled by fully autonomous HABITS: Home-cage assisted behavioral innovation and testing system
eLife 14:RP104833.
https://doi.org/10.7554/eLife.104833.3