SqueakPose Studio: An end-to-end platform for pose estimation and real-time edge-AI deployment

David L Haggerty
Caleb B Darden
David M Lovinger

Laboratory for Integrative Neuroscience, Institute on Alcohol Abuse and Alcoholism, National Institutes of Health, Bethesda, United States

https://doi.org/10.7554/eLife.111308.1

Open access

Figures and data

SqueakPose Studio provides a unified, modular environment for behavioral pose estimation that links dataset creation, training, and inference.
(a) Software architecture linking video import, labeling, training, and analysis. (b) YOLOv11s-pose network design with C2f-Darknet backbone, PAN-FPN++ neck, and multitask detection, pose, and segmentation heads. (c) Training speed comparison between YOLOv11s-pose and DeepLabCut (ResNet-50) under identical dataset conditions. (d) Inference speed comparison on a 10-min open-field video showing an 8.5× improvement in processing rate. (e) Distribution of keypoint localization error across six anatomical landmarks showing comparable spatial precision. Together, these benchmarks demonstrate that SqueakPose Studio achieves DeepLabCut-level accuracy with markedly improved training and inference efficiency.

The SqueakPose Studio GUI supports dataset management, annotation, and model training across operating systems.
(a) Application overview showing integrated dataset browser, model configuration, and training panels. (b) Bounding-box and keypoint labeling modes with crosshair placement and visibility-state assignment. (c) Model-assisted labeling using on-device YOLO predictions for annotation acceleration. (d) Dataset validation and export tools for generating YOLO-compatible structures and .yaml configuration files. (e) Integrated training interface for selecting model size, GPU, epochs, and batch size. Built in Python 3.12 with PyQt6, SqueakPose Studio enables interactive annotation and training without command-line dependencies.
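The YOLO-compatible export in panel (d) can be illustrated with a minimal sketch that writes a pose dataset configuration. It assumes the Ultralytics pose dataset conventions (`kpt_shape`, `flip_idx`, `names`); the function name, directory layout, and class name are hypothetical, and the six keypoints with a visibility flag are taken from the landmark count and visibility-state labeling described in the captions. The exact file that SqueakPose Studio writes may differ.

```python
from pathlib import Path


def build_pose_yaml(root, class_names, n_keypoints, flip_idx=None):
    """Return the text of an Ultralytics-style pose dataset YAML (illustrative)."""
    if flip_idx is None:
        # Identity mapping: no left/right keypoint swap on horizontal flip.
        flip_idx = list(range(n_keypoints))
    if sorted(flip_idx) != list(range(n_keypoints)):
        raise ValueError("flip_idx must be a permutation of the keypoint indices")
    lines = [
        f"path: {Path(root).as_posix()}",
        "train: images/train",  # assumed layout, relative to `path`
        "val: images/val",
        f"kpt_shape: [{n_keypoints}, 3]  # x, y, visibility per keypoint",
        f"flip_idx: {flip_idx}",
        "names:",
    ]
    lines += [f"  {i}: {name}" for i, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


# Hypothetical dataset: one class, six keypoints with visibility flags.
yaml_text = build_pose_yaml("datasets/open_field", ["mouse"], n_keypoints=6)
print(yaml_text)
```

If the labeled landmarks include left/right pairs, `flip_idx` would list the swapped indices so that horizontal-flip augmentation keeps the labels consistent.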

SqueakPose Analysis: A pre-built Jupyter notebook automates motion feature extraction and visualization from SqueakPose Studio detection outputs.
(a) Workflow for loading detection (.csv) and video (.mp4) files into the analysis utility. (b) Graphical interface for converting pixel coordinates to real-world units and applying 1-Euro smoothing; derived features include distance, velocity, acceleration, heading, and bounding-box descriptors. (c) Example model confidence trace across frames (mean ≈ 88.5%). (d) Frame-processing latency before, during, and after inference. (e–f) Locomotion profiles showing frame-wise distance and velocity over a 10-min open-field session. (g) Heading-direction distribution illustrating orientation sampling. (h–i) Spatial trajectory overlay and region-of-interest (ROI) GUI for defining arena zones. (j–k) Quantification of time and speed by ROI, revealing edge-biased exploration and higher center-zone velocities, consistent with the anxiogenic nature of the arena center.
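The smoothing and feature derivation in panel (b) can be sketched as follows. This is a minimal pure-Python illustration of the standard 1-Euro filter followed by frame-wise distance, speed, acceleration, and heading; it is not the notebook's code. The class and function names, the default filter parameters, and the choice to filter in pixels before unit conversion are assumptions.

```python
import math


class OneEuroFilter:
    """1-Euro filter: a low-pass filter whose cutoff rises with signal speed."""

    def __init__(self, rate_hz, min_cutoff=1.0, beta=0.0, d_cutoff=1.0):
        self.dt = 1.0 / rate_hz
        self.min_cutoff = min_cutoff  # baseline cutoff (Hz) for slow motion
        self.beta = beta              # how strongly speed raises the cutoff
        self.d_cutoff = d_cutoff      # cutoff (Hz) for the derivative estimate
        self.x_prev = None
        self.dx_prev = 0.0

    def _alpha(self, cutoff):
        tau = 1.0 / (2.0 * math.pi * cutoff)
        return 1.0 / (1.0 + tau / self.dt)

    def __call__(self, x):
        if self.x_prev is None:       # first sample passes through unchanged
            self.x_prev = x
            return x
        dx = (x - self.x_prev) / self.dt
        a_d = self._alpha(self.d_cutoff)
        dx_hat = a_d * dx + (1.0 - a_d) * self.dx_prev
        a = self._alpha(self.min_cutoff + self.beta * abs(dx_hat))
        x_hat = a * x + (1.0 - a) * self.x_prev
        self.x_prev, self.dx_prev = x_hat, dx_hat
        return x_hat


def motion_features(xs_px, ys_px, fps, cm_per_px, **filter_kwargs):
    """Smooth a pixel trajectory, convert to cm, and derive per-frame features."""
    fx = OneEuroFilter(fps, **filter_kwargs)
    fy = OneEuroFilter(fps, **filter_kwargs)
    xs = [fx(x) * cm_per_px for x in xs_px]
    ys = [fy(y) * cm_per_px for y in ys_px]
    dist, speed, heading = [], [], []
    for i in range(1, len(xs)):
        dx, dy = xs[i] - xs[i - 1], ys[i] - ys[i - 1]
        step = math.hypot(dx, dy)
        dist.append(step)                  # cm travelled between frames
        speed.append(step * fps)           # cm/s
        heading.append(math.degrees(math.atan2(dy, dx)) % 360.0)
    accel = [(speed[i] - speed[i - 1]) * fps for i in range(1, len(speed))]
    return {"distance_cm": dist, "speed_cm_s": speed,
            "accel_cm_s2": accel, "heading_deg": heading}
```

With `beta=0` the filter reduces to a fixed-cutoff exponential smoother; raising `beta` reduces lag during fast movement at the cost of passing more jitter.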

A semi-supervised workflow for identifying recurring behavioral clusters from pose-derived motion features.
(a) Overview of the analysis pipeline. Distance, velocity, acceleration, heading, and temporal features are extracted from detection files and embedded using UMAP, followed by unsupervised clustering with HDBSCAN. (b) Two-dimensional UMAP embedding of motion features colored by cluster identity, illustrating separable groups of behavioral states. (c) Hierarchical dendrogram showing similarity relationships among clusters using Ward distance. (d) Representative pose-estimation trajectories (2 seconds total) from selected clusters, labeled by the user as “edge exploration,” “edge stationary,” and “center exploration.” This semi-supervised workflow provides a practical example of behavior-space discovery and labeling, with cluster outputs directly exportable to advanced models such as CEBRA or keypoint-MoSeq for further analysis.
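The embed-then-cluster step in panel (a) might look like the sketch below, assuming the `umap-learn` and `hdbscan` Python packages. The function names, the z-scoring step, and the parameter values are illustrative assumptions and are not taken from the published workflow.

```python
import math


def zscore_columns(rows):
    """Standardize each feature column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Constant columns get a divisor of 1.0 to avoid division by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def embed_and_cluster(rows, min_cluster_size=50, random_state=0):
    """Embed per-window motion features with UMAP, then cluster with HDBSCAN."""
    # Imported lazily so zscore_columns works without these packages installed.
    import umap      # from the umap-learn package
    import hdbscan   # from the hdbscan package

    scaled = zscore_columns(rows)
    embedding = umap.UMAP(n_components=2,
                          random_state=random_state).fit_transform(scaled)
    # HDBSCAN labels points it cannot assign to any cluster as -1 (noise).
    labels = hdbscan.HDBSCAN(min_cluster_size=min_cluster_size).fit_predict(embedding)
    return embedding, labels
```

Each row here stands for one time window of features (for example distance, velocity, acceleration, and heading). Standardizing first keeps features with large numeric ranges from dominating the UMAP neighbor graph.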

MouseHouse: A 3D-printed, edge-computing environment integrating synchronized video acquisition and behavioral hardware control.
(a–d) Computer-aided design (CAD) renderings showing the modular enclosure, camera mount, and removable front panel. (e) Top-mounted USB3 machine-vision camera with visible and 850 nm infrared (IR) LED illumination for light and dark recordings. (f–g) Configurable arena fixtures for pellet wells and capacitive lickometers supporting fluid-access or operant tasks. (h) System wiring diagram linking the NVIDIA Jetson Orin Nano Super (6-core ARM CPU, 1024 CUDA cores, 32 tensor cores, 8 GB LPDDR5, 67 INT8 TOPS) to an RP2040 controller and MPR121 capacitive sensors. (i) Example configuration showing a two-bottle choice and fixed-ratio 1 (FR1) feeding task. Each unit operates as an independent edge device performing on-board YOLO inference, sensor synchronization via TTL, and bidirectional task control—enabling scalable, low-cost, networked behavioral experiments without workstation-grade GPUs.

SqueakView: A graphical interface for live behavioral recording, model deployment, and performance monitoring on Jetson edge devices.
(a) The Data Engine Builder notebook generates optimized TensorRT engines (FP32, FP16, or INT8) from trained YOLO weights for device-specific deployment. (b) Configuration dialog for experiment setup, including camera resolution, frame rate, pixel format, TTL trigger mode, serial communication port, and DeepStream/YOLO model selection. (c–e) Live interface showing real-time video feed with overlaid keypoints and bounding boxes, resource utilization (CPU, GPU, RAM), and behavior-event dashboards synchronized via serial input. (f) Output files saved per session include raw and annotated videos, JSON session metadata, and CSV logs for detections and system performance. (g–h) Frame-latency benchmarks for YOLOv11n detection and pose models showing real-time inference performance on the Jetson Orin Nano Super without frame buffering. Together, these components enable fully embedded, real-time video analysis synchronized with behavioral sensor data.
