DeepEthogram, a machine learning pipeline for supervised behavior classification from raw pixels

  1. James P Bohnslav
  2. Nivanthika K Wimalasena
  3. Kelsey J Clausing
  4. Yu Y Dai
  5. David A Yarmolinsky
  6. Tomás Cruz
  7. Adam D Kashlan
  8. M Eugenia Chiappe
  9. Lauren L Orefice
  10. Clifford J Woolf
  11. Christopher D Harvey (corresponding author)

  1. Department of Neurobiology, Harvard Medical School, United States
  2. F.M. Kirby Neurobiology Center, Boston Children’s Hospital, United States
  3. Department of Molecular Biology, Massachusetts General Hospital, United States
  4. Department of Genetics, Harvard Medical School, United States
  5. Champalimaud Neuroscience Programme, Champalimaud Center for the Unknown, Portugal
7 figures, 9 videos, 2 tables and 1 additional file

Figures

Figure 1 with 1 supplement
DeepEthogram overview.

(A) Workflows for supervised behavior labeling. Left: a common traditional approach based on manual labeling. Middle: workflow with DeepEthogram. Right: Schematic of expected scaling of user time …

Figure 1—figure supplement 1
Optic flow.

(A) Example images from the Fly dataset on two consecutive frames. (B) Optic flow estimated with TinyMotionNet. Note that the image size is half the original due to the TinyMotionNet architecture. …

Figure 2 with 6 supplements
Datasets and behaviors of interest.

(A) Left: raw example images from the Mouse-Ventral1 dataset for each of the behaviors of interest. Right: time spent on each behavior, based on human labels. Note that the times may add up to more …

Figure 2—figure supplement 1
Example images from the datasets, part 1.

(A) Examples from the Mouse-Ventral1 dataset. Each row is three consecutive frames of the indicated behavior. Right columns: optic flow computed by TinyMotionNet and visualized as in Figure 1—figure …

Figure 2—figure supplement 2
Example images from the datasets, part 2.

(A) Examples from the Mouse-Openfield dataset. Each row is three consecutive frames of the indicated behavior. Right columns: optic flow computed by TinyMotionNet and visualized as in Figure …

Figure 2—figure supplement 3
Example images from the datasets, part 3.

Examples from the Mouse-Homecage dataset. Each row is three consecutive frames of the indicated behavior. Right columns: optic flow computed by TinyMotionNet and visualized as in Figure 1—figure …

Figure 2—figure supplement 4
Example images from the datasets, part 4.

Examples from the Mouse-Social dataset. Each row is three consecutive frames of the indicated behavior. Right columns: optic flow computed by TinyMotionNet and visualized as in Figure 1—figure …

Figure 2—figure supplement 5
Example images from the datasets, part 5.

Examples from the Sturman-EPM dataset. Each row is three consecutive frames of the indicated behavior. Right columns: optic flow computed by TinyMotionNet and visualized as in Figure 1—figure …

Figure 2—figure supplement 6
Example images from the datasets, part 6.

(A) Examples from the Sturman-FST dataset. Each row is three consecutive frames of the indicated behavior. Right columns: optic flow computed by TinyMotionNet and visualized as in Figure 1—figure …

Figure 3 with 12 supplements
DeepEthogram performance.

All results are from the test sets only. (A) Overall accuracy for each model size and dataset. Error bars indicate mean ± SEM across five random splits of the data (three for Sturman-EPM). (B) …

Figure 3—figure supplement 1
DeepEthogram performance, precision.

All results are from the test sets only. (A) Overall precision for each model size and dataset. Error bars indicate mean ± SEM across five random splits of the data (three for Sturman-EPM). (B) …

Figure 3—figure supplement 2
DeepEthogram performance, recall.

All results are from the test sets only. (A) Overall recall for each model size and dataset. Error bars indicate mean ± SEM across five random splits of the data (three for Sturman-EPM). (B) Recall …

Figure 3—figure supplement 3
DeepEthogram performance, area under the receiver operating characteristic curve (AUROC).

All results are from the test sets only. (A) Overall AUROC for each model size and dataset. Error bars indicate mean ± SEM across five random splits of the data (three for Sturman-EPM). (B) AUROC …
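
The captions for Figure 3 and its supplements report accuracy, precision, recall, and AUROC, with error bars given as mean ± SEM across data splits. The following is a minimal sketch of how such quantities can be computed for a single behavior from per-frame binary labels; it is not the authors' evaluation code. The function names (`binary_metrics`, `auroc`, `mean_sem`) are hypothetical, and how the paper aggregates across behaviors into an "overall" value is not shown in these captions.

```python
import math
import statistics


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for one behavior from 0/1 frame labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def auroc(y_true, scores):
    """AUROC as the probability that a positive frame outscores a negative one.

    Ties count as one half. Requires at least one positive and one negative frame.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_sem(values):
    """Mean and standard error of the mean (sample SD / sqrt(n)) across splits."""
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(len(values))
```

For example, `mean_sem` applied to one accuracy value per random split yields the center and half-width of an error bar of the kind described in the captions.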

Figure 3—figure supplement 4
Ethogram examples for the Mouse-Ventral1 dataset.

(A) An example ethogram with above-average performance, showing the human labels, estimated probabilities for each behavior from DeepEthogram-medium, and the thresholded and postprocessed …

Figure 3—figure supplement 5
Ethogram examples for the Mouse-Ventral2 dataset.

(A) An example ethogram with above-average performance, showing the human labels, estimated probabilities for each behavior from DeepEthogram-medium, and the thresholded and postprocessed …

Figure 3—figure supplement 6
Ethogram examples for the Mouse-Openfield dataset.

(A) An example ethogram with above-average performance, showing the human labels, estimated probabilities for each behavior from DeepEthogram-medium, and the thresholded and postprocessed …

Figure 3—figure supplement 7
Ethogram examples for the Mouse-Homecage dataset.

(A) An example ethogram with above-average performance, showing the human labels, estimated probabilities for each behavior from DeepEthogram-medium, and the thresholded and postprocessed …

Figure 3—figure supplement 8
Ethogram examples for the Mouse-Social dataset.

(A) An example ethogram with above-average performance, showing the human labels, estimated probabilities for each behavior from DeepEthogram-medium, and the thresholded and postprocessed …

Figure 3—figure supplement 9
Ethogram examples for the Sturman-EPM dataset.

(A) An example ethogram with above-average performance, showing the human labels, estimated probabilities for each behavior from DeepEthogram-medium, and the thresholded and postprocessed …

Figure 3—figure supplement 10
Ethogram examples for the Sturman-FST dataset.

(A) An example ethogram with above-average performance, showing the human labels, estimated probabilities for each behavior from DeepEthogram-medium, and the thresholded and postprocessed …

Figure 3—figure supplement 11
Ethogram examples for the Sturman-OFT dataset.

(A) An example ethogram with above-average performance, showing the human labels, estimated probabilities for each behavior from DeepEthogram-medium, and the thresholded and postprocessed …

Figure 3—figure supplement 12
DeepEthogram exhibits position and heading invariance.

Nine randomly selected examples of the ‘face groom’ behavior from the Mouse-Openfield dataset. All examples were identified as ‘face groom’ by DeepEthogram-medium. The examples include different …

Figure 4
DeepEthogram performance on bout statistics.

All results from DeepEthogram-medium, test set only. (A–C) Comparison of model predictions and human labels on individual videos from the Mouse-Ventral1 dataset. Each point is one behavior from one …
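
As an illustration of what bout statistics can look like, here is a minimal sketch that derives them from a per-frame binary label vector for one behavior. The specific statistics (time spent, number of bouts, mean bout length), the function name `bout_statistics`, and the `fps` argument are assumptions made for this example; the truncated caption does not list the statistics the authors used.

```python
from itertools import groupby


def bout_statistics(labels, fps):
    """Summarize bouts of one behavior.

    labels: sequence of 0/1 values, one per video frame (1 = behavior present).
    fps: recording frame rate in frames per second (an assumed input).
    A bout is a maximal run of consecutive frames labeled 1.
    """
    bout_lengths = [sum(1 for _ in run) for value, run in groupby(labels) if value == 1]
    frames_on = sum(bout_lengths)
    n_bouts = len(bout_lengths)
    return {
        "time_spent_s": frames_on / fps,
        "n_bouts": n_bouts,
        "mean_bout_length_s": frames_on / n_bouts / fps if n_bouts else 0.0,
    }
```

Running the same function on human labels and on model predictions for a video gives paired values that can then be compared point by point.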

Figure 5 with 2 supplements
Comparison of model performance to human performance on bout statistics.

All model data are from DeepEthogram-medium, test set data. r values indicate Pearson’s correlation coefficient. (A) Performance on Mouse-Ventral1 dataset for time spent. Each circle is one behavior …
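
The r values in this caption are Pearson's correlation coefficients. A minimal sketch of that computation follows, written in plain Python so the formula is visible; the function name `pearson_r` and the example inputs are hypothetical, not data from the paper.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences.

    r = cov(x, y) / (sd(x) * sd(y)); the 1/n factors cancel, so sums suffice.
    Both sequences must have nonzero variance.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)
```

Here `x` could hold a statistic such as time spent as scored by one labeler and `y` the same statistic from the model or a second labeler, with one entry per behavior and video.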

Figure 5—figure supplement 1
Performance of keypoint-based behavior classification on the Mouse-Openfield dataset.

(A) Left: keypoints identified, labeled, and predicted using DeepLabCut. Right: example keypoint sequence predicted by DeepLabCut from a held-out video. (B) Example images from held-out videos …

Figure 5—figure supplement 2
Comparison with unsupervised methods.

(A) B-SoID pipeline. (B) B-SoID behavioral space. Shown are a random sample of points that B-SoID labeled confidently (57% of total data). Left: colors are B-SoID cluster assignments. Right: colors …

Figure 6
DeepEthogram performance as a function of training set size.

(A) Accuracy (top) and F1 score (bottom) for DeepEthogram-fast as a function of the number of videos in the training set for Mouse-Ventral1, shown for each behavior separately. The mean is shown …

Figure 7
Graphical user interface.

(A) Example DeepEthogram window with training steps highlighted. (B) Example DeepEthogram window with inference steps highlighted.

Videos

Video 1
DeepEthogram example from the Mouse-Ventral1 dataset.

Video is from the test set. Top: raw image. Title indicates frame number in video. Tick legends indicate pixels. Middle: human labels. Black box indicates the current frame. Bottom: DeepEthogram …

Video 2
DeepEthogram example from the Mouse-Ventral2 dataset.

Video is from the test set. Top: raw image. Title indicates frame number in video. Tick legends indicate pixels. Middle: human labels. Black box indicates the current frame. Bottom: DeepEthogram …

Video 3
DeepEthogram example from the Mouse-Openfield dataset.

Video is from the test set. Top: raw image. Title indicates frame number in video. Tick legends indicate pixels. Middle: human labels. Black box indicates the current frame. Bottom: DeepEthogram …

Video 4
DeepEthogram example from the Mouse-Homecage dataset.

Video is from the test set. Top: raw image. Title indicates frame number in video. Tick legends indicate pixels. Middle: human labels. Black box indicates the current frame. Bottom: DeepEthogram …

Video 5
DeepEthogram example from the Mouse-Social dataset.

Video is from the test set. Top: raw image. Title indicates frame number in video. Tick legends indicate pixels. Middle: human labels. Black box indicates the current frame. Bottom: DeepEthogram …

Video 6
DeepEthogram example from the Sturman-EPM dataset.

Video is from the test set. Top: raw image. Title indicates frame number in video. Tick legends indicate pixels. Middle: human labels. Black box indicates the current frame. Bottom: DeepEthogram …

Video 7
DeepEthogram example from the Sturman-FST dataset.

Video is from the test set. Top: raw image. Title indicates frame number in video. Tick legends indicate pixels. Middle: human labels. Black box indicates the current frame. Bottom: DeepEthogram …

Video 8
DeepEthogram example from the Sturman-OFT dataset.

Video is from the test set. Top: raw image. Title indicates frame number in video. Tick legends indicate pixels. Middle: human labels. Black box indicates the current frame. Bottom: DeepEthogram …

Video 9
DeepEthogram example from the Flies dataset.

Video is from the test set. Top: raw image. Title indicates frame number in video. Tick legends indicate pixels. Middle: human labels. Black box indicates the current frame. Bottom: DeepEthogram …

Tables

Table 1
Inference speed.
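Inference time (FPS), by GPU and model:
| Dataset | Resolution | Titan RTX, DEG_f | Titan RTX, DEG_m | Titan RTX, DEG_s | GeForce 1080 Ti, DEG_f | GeForce 1080 Ti, DEG_m | GeForce 1080 Ti, DEG_s |
|---|---|---|---|---|---|---|---|
| Mouse-Ventral1 | 256 × 256 | 235 | 128 | 34 | 152 | 76 | 13 |
| Mouse-Ventral2 | 256 × 256 | 249 | 132 | 34 | 157 | 79 | 13 |
| Mouse-Openfield | 256 × 256 | 211 | 117 | 33 | 141 | 80 | 13 |
| Mouse-Homecage | 352 × 224 | 204 | 102 | 28 | 132 | 70 | 11 |
| Mouse-Social | 224 × 224 | 324 | 155 | 44 | 204 | 106 | 17 |
| Sturman-EPM | 256 × 256 | 240 | 123 | 34 | 157 | 83 | 13 |
| Sturman-FST | 224 × 448 | 157 | 75 | 21 | 106 | 51 | 9 |
| Sturman-OFT | 256 × 256 | 250 | 125 | 34 | 159 | 84 | 13 |
| Flies | 128 × 192 | 623 | 294 | 89 | 378 | 189 | 33 |
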
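
To put the frames-per-second figures in Table 1 in practical terms, the sketch below converts an inference speed into the wall-clock time needed to process a video. The one-hour duration and 30 FPS recording rate are hypothetical values chosen for illustration; the two inference speeds are the Mouse-Ventral1 entries for DEG_f on the Titan RTX and DEG_s on the GeForce 1080 Ti.

```python
def processing_minutes(n_frames, inference_fps):
    """Minutes needed to run inference on n_frames at a given speed in FPS."""
    return n_frames / inference_fps / 60


# Hypothetical recording: one hour of video captured at 30 frames per second.
n_frames = 60 * 60 * 30

# Mouse-Ventral1 speeds taken from Table 1.
fast_titan_min = processing_minutes(n_frames, 235)  # DEG_f, Titan RTX
slow_1080_min = processing_minutes(n_frames, 13)    # DEG_s, GeForce 1080 Ti

print(f"DEG_f on Titan RTX: {fast_titan_min:.1f} min")
print(f"DEG_s on GeForce 1080 Ti: {slow_1080_min:.1f} min")
```

Under these assumptions the fastest configuration finishes well under real time, while the slowest takes longer than the recording itself.
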
Table 2
Model summary.
| Model name | Flow generator (parameters) | Feature extractor (parameters) | Sequence model (parameters) | # frames input to flow generator | # frames input to RGB feature extractor | Total parameters |
|---|---|---|---|---|---|---|
| DeepEthogram-fast | TinyMotionNet (1.9M) | ResNet18 × 2 (22.4M) | TGM (250K) | 11 | 1 | ~24.5M |
| DeepEthogram-medium | MotionNet (45.8M) | ResNet50 × 2 (49.2M) | TGM (250K) | 11 | 1 | ~95.2M |
| DeepEthogram-slow | TinyMotionNet3D (0.4M) | ResNet3D-34 × 2 (127M) | TGM (250K) | 11 | 11 | ~127.6M |
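
As a quick consistency check on Table 2, the sketch below sums the listed component parameter counts for each model and compares the result with the stated total. The dictionary layout and the function name `component_sum` are hypothetical; the numbers are those in the table, in millions of parameters, and the listed totals are approximate.

```python
# (flow generator, feature extractor, sequence model, stated total), in millions.
MODELS = {
    "DeepEthogram-fast": (1.9, 22.4, 0.25, 24.5),
    "DeepEthogram-medium": (45.8, 49.2, 0.25, 95.2),
    "DeepEthogram-slow": (0.4, 127.0, 0.25, 127.6),
}


def component_sum(name):
    """Sum of the three component parameter counts for a model, in millions."""
    flow, features, sequence, _stated_total = MODELS[name]
    return flow + features + sequence


for name, (_, _, _, stated_total) in MODELS.items():
    total = component_sum(name)
    print(f"{name}: components sum to {total:.2f}M, table lists ~{stated_total}M")
```

Each sum lands within 0.1M of the listed total, which is consistent with the totals being rounded.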

Additional files

Download links