Figures and data

DANCE assay overcomes the limitation of existing methods for quantification of aggression and courtship behaviors.
(A) Existing hardware for acquisition of aggression and courtship behavior using machine vision cameras and the simplified DANCE hardware. (B) Steps for developing DANCE classifiers and benchmarking with existing methods and manual ground-truthing to produce behavioral scores. (C) Various behavioral classifiers developed to quantify male aggression (lunge) and male courtship (wing extension, circling, following, attempted copulation, and copulation). (D) Raster plots comparing performance of ground-truth vs. DANCE classifier vs. CADABRA vs. Divider assay for aggression in a representative video. (E) Raster plots for various courtship behaviors comparing performance of ground-truth vs. DANCE classifiers vs. MateBook from representative videos. Created in BioRender.

DANCE lunge classifier outperforms existing methods of quantifying male aggression.
(A) Lunge scores from 20-minute-long videos from ground-truth (grey), DANCE lunge classifier (orange), CADABRA (purple), and Divider assay classifier (green). (B-E) Comparison of lunge scores at different levels of aggression. Male flies showing between 0 and 500 lunges were scored manually and with the new DANCE lunge classifier, CADABRA, and the Divider assay classifier. (B) 0–70 lunges, ‘low aggressive’ (n=10; **P<0.0017, ****P<0.0001). (C) 71–160 lunges, ‘moderately aggressive’ (n=11; **P<0.0102, ***P<0.0002, ***P<0.0001). (D) 161–300 lunges, ‘highly aggressive’ (n=11; **P<0.0057, ***P<0.0002, ***P<0.0004). (E) >300 lunges, ‘hyper-aggressive’ (n=8; *P<0.0402, **P<0.0029, ***P<0.0006; Friedman’s ANOVA with Dunn’s test). (F) Regression analysis of DANCE ‘lunge classifier’ vs. manual scores (R2=0.9893, n=40). (G) CADABRA vs. DANCE lunge classifier (R2=0.9, n=40). (H) Divider assay lunge classifier vs. manual scores (R2=0.7739, n=40). (I) F1 score, precision, and recall of the DANCE lunge classifier, CADABRA, and the Divider assay classifier.
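
The F1 score, precision, and recall reported in panel (I) follow the standard definitions based on true-positive, false-positive, and false-negative counts. The sketch below is a minimal illustration of those definitions only. It assumes that classifier events have already been matched against ground-truth events, and it does not reproduce the matching procedure used in the study. The function name and the example counts are hypothetical.

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Return (precision, recall, F1) from event counts.

    Assumes predicted events were already matched to ground-truth
    annotations; the matching rule itself is outside this sketch.
    """
    predicted = true_positives + false_positives   # all events the classifier reported
    actual = true_positives + false_negatives      # all events in the ground truth
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for illustration, not data from the study
precision, recall, f1 = precision_recall_f1(
    true_positives=90, false_positives=10, false_negatives=30
)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

A classifier that over-reports events loses precision, while one that misses events loses recall. F1 penalizes both, which is why it is a convenient single number when comparing several methods against the same ground truth.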

DANCE wing extension classifier outperforms existing methods of quantification.
(A) Wing extension index of males from 15-minute-long videos from the ground-truth (grey), DANCE wing extension classifier (orange) and MateBook (purple), against decapitated virgin females. MateBook underscored wing extension across multiple videos (black arrows). (B) Comparison of ground-truth vs. DANCE vs. MateBook wing extension classifier (Kruskal-Wallis ANOVA with Dunn’s test, ns, p>0.9999, *p=0.0436; n=15). (C) Regression analysis of the DANCE wing extension classifier vs. ground-truth (R2=0.9831, n=15). (D) MateBook vs. ground-truth (R2=0.1054, n=15). (E) F1 score, precision, and recall of DANCE wing extension classifier and MateBook against ground-truth scores.
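
The regression panels summarize agreement between a classifier and the ground truth with a coefficient of determination. The sketch below shows one way to obtain it from paired per-video scores, assuming an ordinary least-squares line with an intercept. The legend does not state which regression model was used, so that choice is an assumption. The function name and the example values are hypothetical.

```python
def linear_fit_r_squared(x, y):
    """Fit y = intercept + slope * x by least squares; return (slope, intercept, R^2)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each contain at least two distinct values")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return slope, intercept, 1 - ss_res / syy


# Hypothetical ground-truth vs. classifier scores, not data from the study
ground_truth = [1.0, 2.0, 3.0]
classifier = [1.0, 3.0, 2.0]
slope, intercept, r2 = linear_fit_r_squared(ground_truth, classifier)
print(slope, intercept, r2)
```

For a least-squares line with an intercept, this quantity equals the squared Pearson correlation and therefore lies between 0 and 1. A high value shows that the two sets of scores covary, but it does not by itself show that they are equal, which is why the slope and intercept are returned as well.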

DANCE attempted-copulation classifier outperforms existing methods of quantification.
(A) Attempted copulation index of males from 15-minute-long videos from the ground-truth (grey), ‘DANCE attempted copulation classifier’ (orange) and MateBook (purple) against both mated and decapitated females. (B) Comparison of ground-truth vs. DANCE attempted-copulation classifier vs. MateBook (Kruskal-Wallis ANOVA with Dunn’s test, ns, p>0.9999; ****p<0.0001, n=32). (C) Regression analysis of the attempted-copulation classifier vs. ground-truth (R2=0.8742, n=32). (D) Regression analysis of MateBook vs. ground-truth (R2=0.1512, n=32). (E) F1 score, precision, and recall of DANCE and MateBook attempted-copulation classifiers against ground-truth scores.

DANCE circling classifier outperforms existing methods of quantification.
(A) Circling index of males from 15-minute-long videos from the ground-truth (grey), ‘DANCE circling classifier’ (orange) and MateBook (purple) against decapitated virgin females. (B) Comparison of manual vs. DANCE vs. MateBook circling classifier (Ordinary one-way ANOVA with Dunnett’s test, ns, p=0.8014; *p=0.0157, n=12). (C) Regression analysis of the circling classifier vs. ground-truth (R2=0.92, n=12). (D) MateBook vs. ground-truth (R2=0.88, n=12). (E) F1 score, precision, and recall of DANCE and MateBook circling classifiers against ground-truth scores.

DANCE hardware and recordings setup.
(A) DANCE aggression setup. (B) 3D-rendered components of the aggression setup. (C) DANCE courtship setup. (D) 3D-rendered components of the courtship setup; male and female are separated on either side by an X-ray film separator or ‘divider comb’. (E-G) Top and side views of the DANCE setup, with a smartphone camera for recording and an electronic tablet used as a backlight.

Benchmarking DANCE hardware and testing various neurogenetic tools.
(A-B) Courtship behaviors recorded using a pre-existing (circular) setup and the DANCE setup, from group-housed (GH) and single-housed (SH) flies. (C-D) Wing extension, (C) ***p<0.0010, n=23; (D) **p<0.0013, GH, n=22 and SH, n=26. (E-F) Attempted copulation, (E) ***p<0.0002, n=23; (F) ns, p<0.1907, GH, n=18 and SH, n=22. (G-H) Following, (G) ns, p>0.0959, n=23; (H) ns, p<0.6589, GH, n=22 and SH, n=26. (I-J) Circling, (I) *p<0.012, n=23; (J) **p<0.0021, GH, n=19 and SH, n=22. (K-L) Aggressive lunges recorded using a pre-existing (circular) setup and the DANCE setup. (M-N) Lunges in SH flies compared to GH flies reared on food with yeast granules, (M) **p<0.0138, n=36; (N) **p<0.0372, n=40. (O) Effect of yeast extract food on aggressive behavior; ****p<0.0001, n=38. (P-Q) Genetic knockdown of the neuropeptide Drosulfakinin (Dsk) in insulin-producing neurons using dilp2-GAL4. (P) ns, p<0.0502, ns, p>0.9999, ****p<0.0001, **p<0.0040, n=35. (Q) ****p<0.0001, ns, p>0.9999, ****p<0.0001, *p<0.0210, n=30. (R) Optogenetic silencing of dopaminergic neurons by UAS-GtACR1 driven by the TH-GAL4 driver, ns, p<0.0986, ns, p>0.9999, ****p<0.0001, **p<0.0012, **p<0.0013, n=24. (C-J and M-O) Mann-Whitney U test; (P-R) Kruskal-Wallis test with Dunn’s multiple comparisons.

Aggression chamber described by Dankert et al., 2009.
It consists of a bottom food plate, 12 wells (aggression arenas), a top plate with fly loading holes and a screw slot for sliding the loading plate.

Courtship set-up described by Koemans et al., 2017.
It consists of 18 wells (courtship arenas), a top cover plate, a sliding loading plate, and a sliding divider assembly to separate male and female flies.

Comparison of annotations by two independent evaluators to assess observer bias during classifier ground-truthing.
(A) Aggressive lunge, p=0.8789, n=15. (B) Courtship including wing extension, p=0.9999, n=13. (C) Attempted copulation, p=0.0571, n=16. (D) Circling, p=0.4343, n=12. (E) Following, p=0.4405, n=13. (F) Copulation, p=0.9221, n=25. (A-F) Mann-Whitney U test.
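
The comparison between evaluators relies on the Mann-Whitney U test, which asks whether scores from one annotator tend to rank higher than scores from the other. The sketch below computes only the U statistics from their pairwise definition, with ties counted as one half. It does not compute a p-value, for which an established statistics package (for example `scipy.stats.mannwhitneyu`) would normally be used. The function name and the example scores are hypothetical.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) for two independent samples.

    U_a counts the pairs (a, b) in which a > b, with ties counted as 0.5.
    U_a + U_b always equals len(sample_a) * len(sample_b).
    """
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b


# Hypothetical per-video scores from two annotators, not data from the study
evaluator_1 = [1, 2, 3]
evaluator_2 = [2, 3, 4]
u1, u2 = mann_whitney_u(evaluator_1, evaluator_2)
print(u1, u2)
```

When two annotators score similarly, both U values sit near half the number of pairs. A strong imbalance between them indicates that one annotator scores systematically higher.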

DANCE wing extension classifier outperforms existing quantification methods in videos with mated females.
(A) Wing extension of males from 15-minute-long videos from the ground-truth (grey), the DANCE classifier (orange) and MateBook (purple) against mated females. MateBook underscored wing extension across multiple videos (black arrows) (Friedman’s ANOVA with Dunn’s test, p=0.3582; p<0.0001, n=25). (B) Comparison of ground-truth vs. DANCE vs. MateBook wing extension classifier (Kruskal-Wallis ANOVA with Dunn’s test, p>0.9999; p=0.1039, n=25). (C) Regression analysis of the DANCE classifier vs. ground-truth (R2=0.9951, n=25). (D) MateBook vs. ground-truth (R2=0.8282, n=25). (E) F1 score, precision, and recall of DANCE classifier and MateBook against ground-truth scores.

DANCE copulation classifier evaluation in mixed female dataset.
(A) Quantification of copulation in 15 minutes from individual videos showing scores from the manual method (grey), DANCE copulation classifier (orange), and MateBook (purple) (Friedman’s ANOVA with Dunn’s test, p>0.9999; p>0.9999, n=21). (B) Box plot comparison of manual vs. DANCE copulation classifier vs. MateBook (Kruskal-Wallis ANOVA with Dunn’s test, p>0.9999; p>0.9999, n=21). (C) Bar plots showing F1 score, precision, and recall of DANCE copulation classifier and MateBook against ground-truth scores. (D) Regression analysis of the copulation classifier vs. manual scores (R2=0.98, n=21). (E) MateBook vs. manual scores (R2=0.81, n=21).

DANCE circling classifier evaluation in the mated female dataset.
(A) Circling index of males from 15-minute videos from the ground-truth (grey), ‘DANCE circling classifier’ (orange) and MateBook (purple) against mated female dataset (Friedman’s ANOVA with Dunn’s test, p>0.9999; p<0.0001, n=19). (B) Comparison of manual vs. DANCE vs. MateBook circling classifier (Kruskal-Wallis ANOVA, p>0.9999; p=0.0822, n=19). (C) Regression analysis of the circling DANCE classifier vs. ground-truth (R2=0.9494, n=19). (D) MateBook vs. ground-truth (R2=0.6938, n=19). (E) F1 score, precision, and recall of DANCE and MateBook circling classifiers against ground-truth scores.

DANCE following classifier evaluation in the mated female dataset.
(A) Following index of males from 15-minute-long videos from the ground-truth (grey), ‘DANCE following classifier’ (orange) and MateBook (purple) against mated females (Friedman’s ANOVA with Dunn’s test, p=0.1794; p=0.0029, n=25). (B) Box plot comparison of manual vs. DANCE following classifier vs. MateBook (Kruskal-Wallis ANOVA with Dunn’s test, p>0.9999; p=0.5287, n=25). (C) Regression analysis of the following classifier vs. ground-truth (R2=0.9894, n=25). (D) MateBook vs. ground-truth (R2=0.9204, n=25). (E) F1 score, precision, and recall of DANCE and MateBook following classifiers against ground-truth scores.

Effect of optogenetic silencing of dopaminergic neurons on Drosophila activity.
(A, B) Transient silencing of dopaminergic neurons using UAS-GtACR1 driver did not produce a difference in daytime activity between SH and GH flies. (A) Without silencing mediated by green light on day 1; TH-GAL4, ns, p=0.7383, GH: n=43, SH: n=43; UAS-GtACR1, ns, p=0.4812, GH: n=51, SH: n=42; TH-GAL4>UAS-GtACR1, ns, p=0.9942, GH: n=54, SH: n=51. (B) With silencing mediated by green light on day 2; TH-GAL4, ns, p=0.9976, GH: n=43, SH: n=43; UAS-GtACR1, ns, p=0.9779, GH: n=51, SH: n=42; TH-GAL4>UAS-GtACR1, ns, p=0.9974, GH: n=54, SH: n=51. One-way ANOVA with Tukey’s multiple comparisons test for within-day comparisons. Two-way ANOVA for comparison across days; TH-GAL4, interaction, ns, p=0.5504, silencing, ns, p=0.5172, housing, ns, p=0.1602, GH: n=43, SH: n=41; UAS-GtACR1, interaction, ns, p=0.4533, silencing, ns, p=0.3602, housing, *p=0.255, GH: n=51, SH: n=42; TH-GAL4>UAS-GtACR1, interaction, ns, p=0.9977, silencing, ns, p=0.2454, housing, ns, p=0.5868, GH: n=54, SH: n=51.