
Automated analysis of long-term grooming behavior in Drosophila using a k-nearest neighbors classifier

  1. Bing Qiao
  2. Chiyuan Li
  3. Victoria W Allen
  4. Mimi Shirasu-Hiza
  5. Sheyum Syed (corresponding author)
  1. University of Miami, United States
  2. Columbia University, United States
Tools and Resources
Cite this article as: eLife 2018;7:e34497 doi: 10.7554/eLife.34497

Abstract

Despite being pervasive, the control of programmed grooming is poorly understood. We addressed this gap by developing a high-throughput platform that allows long-term detection of grooming in Drosophila melanogaster. In our method, a k-nearest neighbors algorithm automatically classifies fly behavior and finds grooming events with over 90% accuracy in diverse genotypes. Our data show that flies spend ~13% of their waking time grooming, driven largely by two major internal programs. One of these programs regulates the timing of grooming and involves the core circadian clock components cycle, clock, and period. The second program regulates the duration of grooming and, while dependent on cycle and clock, appears to be independent of period. This emerging dual-control model, in which one program controls timing and another controls duration, resembles the two-process regulatory model of sleep. Together, our quantitative approach presents the opportunity for further dissection of mechanisms controlling long-term grooming in Drosophila.

https://doi.org/10.7554/eLife.34497.001

eLife digest

From birds that preen their feathers to dogs that lick their fur, many animals groom themselves. They do so to stay clean, but routine grooming also has a range of other uses, such as social communication or controlling body temperature. Despite its importance, grooming remains poorly understood; it is especially unclear how this behavior is regulated.

Fruit flies could be a good model to study grooming because they are often used in laboratories to look into the genetic and brain mechanisms that control behavior. Flies clean themselves by sweeping their legs over their wings and body, but little is known about how the insects groom ‘naturally’ over long periods of time. This is partly because scientists have had to recognize and classify grooming behavior by eye, which is highly time-consuming.

Here, Qiao, Li et al. have created a system to automatically detect grooming behavior in fruit flies over time. First, a camera records the movement of an individual insect. A computer then analyzes the images and picks out general features of the fly’s movement that can help work out what the insect is doing. For example, if a fly is moving its limbs, but not the main part of its body, it is probably grooming itself. Qiao, Li et al. then borrowed an algorithm from an area of computer science known as ‘machine learning’ to teach the computer how to classify each fly’s behavior automatically.

The new system successfully recognized grooming behavior in over 90% of cases, and it revealed that fruit flies spend about 13% of their waking life grooming. It also showed that grooming seems to be controlled by two potentially independent internal programs. One program is tied to the internal body clock of the fly, and regulates when the insect grooms during the day. The other commands how long the fly cleans itself, and balances the amount of time spent on grooming with other behaviors.

Cleaning oneself is not just important for animals to stay disease-free: it also reflects the general health state of an individual. For example, a loss of grooming is associated with sickness, old age, and, in humans, with mental illness. If scientists can understand how grooming is controlled at the brain and molecular levels, this may give an insight into how these mechanisms relate to diseases. The system created by Qiao, Li et al. could help to make such studies possible.

https://doi.org/10.7554/eLife.34497.002

Introduction

Grooming is broadly defined as a class of behaviors directed at the external surface of the body. Most animals spend considerable time grooming (Mooring et al., 2004; Sachs, 1988), and this near universality suggests that grooming likely fulfills an essential role for animals (Spruijt et al., 1992). Grooming assumes a variety of forms in different species—for instance, birds preen the oily substance produced by the preening gland from their feathers and skin, cats and dogs lick their fur, and flies sweep their body parts with their legs. Although in most cases the primary function of grooming is to maintain a clean body surface, different species-specific forms of grooming have roles in diverse functions such as thermoregulation, communication and social relationships (Dawkins and Dawkins, 1976; Ferkin et al., 2001; Geist and Walther, 1974; McKenna, 1978; Patenaude and Bovet, 1984; Schino, 2001; Schino et al., 1988; Seyfarth, 1977; Spruijt et al., 1992; Thiessen et al., 1977; Walther, 1984).

Many animal behaviors, such as locomotion, have been shown to be controlled by both external stimuli (stimulated behavior) and internal programs (programmed behavior). An example of stimulated locomotor activity is the abrupt evasive response triggered by the sudden appearance of a predator. In contrast, programmed locomotor activities, such as daily foraging for food, are essential to maintain vital functions of the organism (Bergman et al., 2000). Similar to locomotion, limited data from mammals suggest that grooming may be controlled by both external stimuli and internal programs (Hart et al., 1992; Hawlena et al., 2008; Mooring and Samuel, 1998). For example, stimulated grooming might be performed when the animal is excessively dirty or itchy, and programmed grooming might be performed as a social ritual. Although grooming is a widely observed behavior, the basic mechanisms regulating grooming are still not well understood.

The fruit fly Drosophila melanogaster is an ideal model organism with which to dissect the fundamental mechanisms of grooming and its relationship to other behaviors. The fly is known to be a frequent groomer with a rich repertoire of behaviors and a sophisticated genetic toolkit developed to study them (Connolly, 1968; Owald et al., 2015). The study of Drosophila grooming can be traced back to the 1960s (Connolly, 1968; Szebenyi, 1969), and notable progress has since been made in studying grooming stimulated by the application of dust particles to the insect exterior (Hampel et al., 2015; Seeds et al., 2014). While most grooming studies thus far have focused on stimulated grooming, understanding the mechanisms responsible for programmed grooming will not only identify components distinct to each type of grooming but also inform us about how programmed grooming is prioritized with regard to other programmed behaviors such as locomotion, feeding, and sleep in the same organism.

A major hurdle in detecting programmed grooming in Drosophila is the lack of practical methodology. In many cases, fly grooming events are extracted by eye (King et al., 2016; Phillis et al., 1993; Yanagawa et al., 2014). Consequently, these data report only conspicuous behaviors within relatively short durations of observation. To improve resolution and accuracy, a number of sophisticated video-tracking methods have been recently developed for fly behavior (Kain et al., 2013; Mendes et al., 2013). These designs are not amenable to easy scale-up for tracking multiple individuals simultaneously. Moreover, while several of these methods are sufficient for short-term monitoring (Branson et al., 2009; Kabra et al., 2013), continuous multi-hour measurements and rapid, automated quantification methods are required to dissect long-term, unstimulated fly grooming relative to other daily behaviors like locomotion and sleep.

To overcome limitations of currently available methods, we developed a new platform for long-term video-tracking and automated analysis of fly grooming. The layout of our hardware takes advantage of a basic design for housing individual flies that is widely used in locomotion and sleep studies (Gilestro, 2012; Pfeiffenberger et al., 2010; Zimmerman et al., 2008). Here, we incorporate this standardized hardware into studies of grooming. Our algorithm maps fly activity onto a three-dimensional behavioral space and uses the k-nearest neighbors (kNN) method, a machine learning technique, to classify each video frame as grooming, locomotion or rest. Results from multi-day recordings reveal that Drosophila spend approximately 13% of their waking time grooming, and that the temporal pattern of grooming behavior is tightly regulated by the fly’s internal circadian pacemaker. These findings suggest that grooming, similar to feeding and rest, likely serves one or more critical functions in Drosophila. Additionally, genetic perturbations reveal that the transcription factors CYCLE and CLOCK are critical parts of an internal program that controls the amount of Drosophila grooming. These grooming data, the easily implementable hardware, and the automated analysis package together permit the construction of high-resolution ethograms of stereotypical fly behavior over the circadian time-scale.

Results

Automatic grooming detecting system

We used a custom-designed video set-up to monitor fly behavior. Within the set-up, insects were placed individually in cylindrical glass tubes 6 cm long and 5 mm wide with food and cotton at opposite ends (Figure 1A). Tubes were placed in a chamber where temperature and humidity were monitored and controlled. Flies were illuminated from the sides by white light-emitting diodes (LEDs) to simulate day-night conditions and by infrared LEDs from below for video imaging. Videos were captured by a digital camera above the chambers (see Materials and methods). A sample raw video clip is shown in Video 1. Because the tubes (commonly used with Drosophila Activity Monitors or DAMs) are commercially available for studying circadian and sleep behavior, this set-up can be easily replicated by other labs.

Overview of approach for detecting Drosophila grooming.

(A) Apparatus used in recording behavior. Flies constrained to individual tubes are continuously illuminated by infrared light from below and recorded by a digital camera from above. LED lights on sides of chamber simulate day-night light conditions. Temperature and humidity probes placed in the chamber are monitored by a computer. Inset: Camera photo of fly tubes in chamber. (B) Examples of the most commonly observed types of grooming in our experiments. The top row displays postures of a fly in inactive state. The three rows below show how the limbs and body of a fly coordinate to perform specific grooming movements. Arrows point to the moving part during grooming. (C) Flowchart of our algorithm used to classify fly behavior. After generating a suitable background image, the algorithm characterizes movements of fly center (CD), core (CM) and periphery (PM) to fully classify behavior in each frame.

https://doi.org/10.7554/eLife.34497.003
Video 1
Sample raw experimental video
https://doi.org/10.7554/eLife.34497.004

We then developed an automated video image analysis package that classifies fly behavior into grooming, locomotion, or rest. ‘Grooming’ in our algorithm is defined as fly legs rubbing against each other or sweeping over the surface of the body and wings (Szebenyi, 1969) (Videos 2 and 3), ‘locomotion’ as translation of the whole body, and ‘rest’ as the absence of either grooming or locomotion. Figure 1B shows images of grooming behaviors frequently observed in our videos involving the head, legs and wings. Since we are primarily interested in detecting grooming events rather than performing a detailed classification of all types of behavior (Branson et al., 2009), other behaviors involving body centroid movements, such as feeding, were initially classified as locomotion. This three-tier classification allowed our algorithm to efficiently and rapidly interpret grooming events in the recordings without incurring any significant errors in reporting locomotion and rest.

Video 2
Sample video of grooming on head and front legs
https://doi.org/10.7554/eLife.34497.005
Video 3
Sample video of grooming on wings and hind legs
https://doi.org/10.7554/eLife.34497.006

Behavior classification algorithm

To classify behavior, raw videos were processed through four major automated steps: fly identification, feature extraction, classifier training (optional), and subset behavior classification (Figure 1C). First, fly identification was accomplished with the following analysis. Fly shape was extracted from a video frame by computing the difference between the current frame and a reference frame. The reference or background frame was created by comparing eight randomly selected frames and erasing all moving objects from one of them (see Materials and methods). The background frame was updated every 1000 s to account for changes in the fly’s surroundings, such as a decrease in the level of food and accumulation of debris within the tube, over the course of multiple hours (Figure 2—figure supplement 1B). A preliminary image of flies in the current frame was obtained by comparing the frame to the background and retaining all pixels whose grayscale difference exceeded a threshold C0 = 10 (Figure 2A). Despite the use of C0, some artifacts in the form of small objects still remained in the extracted image. With C0 = 10, closed objects generated by noise did not exceed 20 pixels in area (Figure 2B). Based on this, to eliminate the remaining small objects, we erased all closed objects with areas less than a second threshold C1 = 25 pixels, retaining only the fly silhouette (Figure 2—figure supplement 1C, right). Thus, each individual fly and its movements were distinguished from background structures.
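The thresholding and small-object removal described above can be sketched in plain Python. This is an illustration only, not the authors' analysis package: the function names, the list-of-lists image representation, and the use of 4-connectivity to define a closed object are assumptions, while the thresholds C0 = 10 and C1 = 25 are taken from the text.

```python
from collections import deque

C0 = 10  # grayscale-difference threshold (from the text)
C1 = 25  # minimum object area in pixels (from the text)

def foreground_mask(frame, background, c0=C0):
    """Mark pixels whose grayscale differs from the background by more than c0."""
    return [[abs(f - b) > c0 for f, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]

def remove_small_objects(mask, min_area=C1):
    """Erase connected objects (4-connectivity, an assumption) smaller than min_area."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    cleaned = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill collects one connected object.
            seen[r][c] = True
            queue, component = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Keep only objects at least as large as the area threshold.
            if len(component) >= min_area:
                for y, x in component:
                    cleaned[y][x] = True
    return cleaned
```

Applied to a frame containing a fly-sized dark blob and an isolated noisy pixel, the first function flags both and the second keeps only the blob.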

Figure 2 (with 1 supplement)
Feature extraction and behavior classification.

(A) The distribution of grayscale fluctuations in the absence of mobile flies. A cutoff of grayscale value change C0 = 10 rules out >99.99% of fluctuations. Shown here are only positive values of fluctuations, which are symmetric about zero. (B) Maximum area (pixels) of a closed object generated by noise when different threshold C0 are applied. A C0 = 10 rejects objects larger than 20 pixels. Based on this, we set a threshold C1 = 25 to remove objects smaller than 25 pixels without affecting identification of flies which have a typical area of ~300 pixels in our studies. (C) Grayscale value distribution of pixels belonging to 20 individual flies. Two regions are clearly seen: the left region with peak around 40 represents the core of the flies and the right region with peak around 90 represents their periphery. (D) Variations in the center position of a stationary fly. The minimum displacement that represents a true fly center movement is 0.5-pixel length in our experiment, a requirement that excludes >99.99% of false displacements. (E) Examples of original and processed images of a fly displaying different behaviors: Top, left: front leg grooming; top, right: wing grooming; bottom, left: resting; bottom, right: locomoting. In each panel, original images from two consecutive frames are shown on left, periphery in the middle and core on the right. Changes of periphery and core are shown in the bottom row. PM and CM denote differences in the number of pixels representing the fly periphery and core, respectively, in two frames. Features PM and CM are different for different behaviors. Rubbing of front legs manifests through PM (top, left) while sweeping wings affects PM and CM (top, right). (F) k-nearest neighbors (kNN) algorithm works by placing an unclassified sample (black circle) representing a frame into a feature space with pre-labeled samples (green/gray/purple circles, the training set). 
The label of the unclassified point is decided by the most frequent label among its k-nearest neighbors. The three axes of the feature space are normalized periphery movement (PM), core movement (CM), and center displacement (CD). Fly activity in the feature space is separated into three regions: grooming (green), locomotion (gray) and resting (purple). Training samples (N = 9322 grooming, 9930 locomotion, 5748 rest) and nine unlabeled samples in PM-CM-CD space are shown.

https://doi.org/10.7554/eLife.34497.007

Second, we performed feature extraction to distinguish three specific types of behaviors, which are grooming, locomotion, and rest, performed by the individual fly. The features we used were: (1) periphery movement (PM), which characterizes movements of the legs, head and wings; (2) core movement (CM), which quantifies movements of the thorax and abdomen; and (3) centroid displacement (CD), which quantifies whole body displacement. Extracting these three features allowed us to identify patterns corresponding to different types of behavior.

To extract PM and CM, we split each fly’s body into a core and a periphery. Based on the grayscale distributions of the two parts (Figure 2C), we set the median of pixel grayscale values as the criterion to split a fly body into core (darker) and periphery (lighter). This criterion made the core and periphery areas roughly equal, giving PM and CM equal weight in the feature space. Slight variations in light condition across the arena can cause differing grayscale distribution for each individual. We therefore calculated the median value separately for each fly. After splitting the fly’s body into two parts, PM and CM were extracted by computing the number of non-overlapping periphery and core pixels, respectively, in two consecutive frames.
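A minimal sketch of this step follows, under stated assumptions: the fly is represented as a dictionary mapping pixel coordinates to grayscale values, pixels exactly at the median are assigned to the periphery, and "non-overlapping pixels" is read as the symmetric difference of the pixel sets in two consecutive frames. The function names are hypothetical.

```python
from statistics import median

def split_core_periphery(fly_pixels):
    """Split {(row, col): gray} into core (darker) and periphery (lighter) pixel sets."""
    cutoff = median(fly_pixels.values())  # computed separately for each fly
    core = {p for p, g in fly_pixels.items() if g < cutoff}
    periphery = set(fly_pixels) - core    # ties go to the periphery (an assumption)
    return core, periphery

def movement_features(prev_pixels, curr_pixels):
    """Return (PM, CM): counts of non-overlapping periphery and core pixels."""
    prev_core, prev_periphery = split_core_periphery(prev_pixels)
    curr_core, curr_periphery = split_core_periphery(curr_pixels)
    pm = len(prev_periphery ^ curr_periphery)  # pixels in one frame but not the other
    cm = len(prev_core ^ curr_core)
    return pm, cm
```

In this reading, a fly that moves one leg pixel while its body stays put yields PM = 2 (one pixel vacated, one newly occupied) and CM = 0.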

To extract CD, we calculated the average position of all pixels from the individual fly and defined changes in that quantity between every two consecutive frames as CD. Since the fly moves in essentially one dimension through the narrow tube, we ignored movements perpendicular to the long axis of the tube when calculating centroid movement. In subsequent analysis, fly location was represented by its centroid position. Noise in the apparatus may slightly change the centroid position even when a fly is stationary. Figure 2D shows the distribution of such centroid displacements caused by noise. Based on this distribution, we set 0.5 pixel length as the minimum actual CD; that is, displacements smaller than 0.5 pixel were ignored. Application of this threshold eliminated >99.99% of such false displacements and accurately identified fly centroid displacement.
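The centroid calculation can be illustrated as follows. This sketch assumes the long axis of the tube lies along the image column (x) coordinate and reports the magnitude of the shift; both choices, along with the function names, are assumptions made for the example. The 0.5 pixel cutoff is from the text.

```python
MIN_CD = 0.5  # pixels; smaller displacements are treated as noise (from the text)

def centroid_along_tube(fly_pixels):
    """Mean position of the fly's (row, col) pixels along the tube's long axis."""
    xs = [x for (_, x) in fly_pixels]
    return sum(xs) / len(xs)

def centroid_displacement(prev_pixels, curr_pixels, min_cd=MIN_CD):
    """Absolute centroid shift between two consecutive frames, zeroed below min_cd."""
    shift = abs(centroid_along_tube(curr_pixels) - centroid_along_tube(prev_pixels))
    return shift if shift >= min_cd else 0.0
```

A leg movement that shifts the centroid by 0.25 pixel is therefore reported as CD = 0, whereas a two-pixel translation of the whole body is reported as CD = 2.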

By extracting these three features (PM, CM and CD), we were able to distinguish between locomotion, rest, and grooming. As shown in Figure 2E, relative metrics of PM and CM were different depending on the type of behavior. Specifically, during locomotion, both parts moved significantly (Figure 2E, bottom-right) together with substantial changes in CD. During rest, no significant movement was seen either in the periphery or the core (Figure 2E, bottom-left). During grooming, the periphery moved more than the core (Figure 2E, top-left, top-right). Importantly, since differences in fly size can affect values of PM, CM and CD, we normalized these features to individual fly size before proceeding with further analysis (see Materials and methods). The behavior-dependent changes of these features suggest that PM, CM and CD are appropriate metrics for behavior classification.

Third, to produce a rapid, objective and automated quantification of grooming behavior, we performed classifier training to teach the algorithm to automatically recognize these features. We classified fly behavior by applying the k-nearest neighbors (kNN) technique to the normalized features (Bishop, 2007; Dankert et al., 2009; Kain et al., 2013). Briefly, kNN works by placing an unlabeled sample into a feature space with pre-labeled samples serving as a training set for the algorithm. The label or class of the unlabeled sample is then decided by the label that is most common among its k-nearest training samples. In our case, the nearest neighbors were searched through a k-d tree algorithm (Sproull, 1991). To construct the kNN classifier, we prepared a training set by visually labeling fly behavior from 25,000 frames (9322 frames of grooming, 9930 frames of locomotion and 5748 frames of resting from 20 w1118 flies) and mapping them onto a three-dimensional feature space where the axes correspond to normalized PM, CM and CD (Figure 2F, color symbols). With these training samples, we applied 10-fold cross-validation (Bishop, 2007; McLachlan et al., 2005) to the kNN classifier with k ranging from 1 to 50 and settled on k = 10 to achieve balance between computing time and accuracy (Figure 2—figure supplement 1D).
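The voting rule at the heart of kNN can be written in a few lines. The sketch below is not the authors' implementation: it replaces the k-d tree search with an exhaustive search, which returns the same neighbors but scales less well, and it breaks voting ties in favor of the label encountered first among the neighbors. The function name and the toy feature values in the usage note are hypothetical.

```python
import heapq
import math
from collections import Counter

def knn_classify(sample, training_set, k=10):
    """Label `sample` by majority vote among its k nearest training points.

    `training_set` is a list of ((pm, cm, cd), label) pairs of normalized features.
    """
    # Exhaustive search over all training points (a k-d tree would speed this up).
    nearest = heapq.nsmallest(
        k, training_set, key=lambda item: math.dist(sample, item[0]))
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

For example, with training points clustered near (0.8, 0.1, 0.0) for grooming, (0.8, 0.8, 0.9) for locomotion and (0, 0, 0) for rest, a frame with features (0.82, 0.12, 0.0) has only grooming points among its ten nearest neighbors and is labeled grooming.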

Finally, to specifically distinguish between grooming behavior and other types of peripheral movement, we pruned output labels from the kNN classifier (Figure 3A). The algorithm calculates features from every two consecutive frames, resulting in some classifications being confounded by short-term fly activity. For example, features extracted from only two frames often cannot distinguish a fly stretching its body parts from one that is grooming (Video 4). Based on our observations during creation of the training set, a typical bout of grooming lasts >3 s (15 frames at our normal frame rate), longer than an average stretching event, which lasts for ~1 s. Accordingly, we devised a strategy in which a ~15-frame-long temporal filter slid one frame at a time to eliminate false grooming labels caused by short, grooming-like behavior. Grooming designations were retained only if at least a minimum number of grooming frames were found within the filter (Figure 3A). To determine the size of the filter and the minimum number of grooming frames within it, we assessed the accuracy of our classifier with the ‘minimum number of grooming frames/size of filter’ at 4/5, 8/10, 8/15, 10/15, 10/20, 12/15, 14/15, and 15/20. These tests were conducted with a 10 min video (N = 20 Canton S flies). As expected, comparison between 8/15, 10/15, 12/15 and 14/15 shows (Figure 3B) that, for a fixed filter size, a larger minimum number of grooming frames led to fewer false positives (higher accuracy) but more frequent false negatives (lower sensitivity). Conversely, a minimum number below 12 increased the risk of misidentifying other short-term grooming-like behaviors as grooming. Based on these findings, we set the pruning filter to be 12/15, which kept both false positive and false negative errors low. Because of this pruning process, if fewer than 12 grooming frames were found within a 15-frame sliding window, then all grooming frames were re-labeled as locomotion once the left edge of the window reached the 15th frame (Figure 3A). Thus, these pruned labels were the final output of our grooming classification algorithm, consisting of fly identification, feature extraction and classifier training.
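One way to express the 12/15 pruning rule in code is shown below. It follows the reading given in the Figure 3A legend (a grooming label survives only if the frame belongs to some 15-frame window containing at least 12 grooming labels); the function name, label strings and handling of recordings shorter than one window are assumptions.

```python
def prune_grooming(labels, window=15, min_grooming=12,
                   grooming="grooming", fallback="locomotion"):
    """Keep a grooming label only if some `window`-frame span containing it
    holds at least `min_grooming` grooming frames; otherwise relabel it."""
    is_groom = [lab == grooming for lab in labels]
    keep = [False] * len(labels)
    count = sum(is_groom[:window])  # grooming frames in the first window
    for start in range(len(labels) - window + 1):
        if start > 0:
            # Slide by one frame: add the entering frame, drop the leaving one.
            count += is_groom[start + window - 1] - is_groom[start - 1]
        if count >= min_grooming:
            for i in range(start, start + window):
                if is_groom[i]:
                    keep[i] = True
    return [fallback if (g and not k) else lab
            for lab, g, k in zip(labels, is_groom, keep)]
```

With these settings, a 5-frame (~1 s) burst of grooming labels surrounded by rest is relabeled as locomotion, whereas a 20-frame bout passes through unchanged.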

Data pruning and performance evaluation.

(A) Grooming data are pruned after identification by the kNN classifier. A frame is finally labeled as grooming only if this frame is in a group of 15 frames in which 12 or more were labeled as grooming by the classifier (see B below). Frames previously labeled as grooming by the classifier that did not pass the pruning procedure are relabeled as locomotion. (B) Performance of the classifier with pruning filter sizes of 4/5, 8/10, 8/15, 10/15, 10/20, 12/15, 14/15 and 15/20. Accuracy (closed circles) is equal to the ratio of correct grooming labels to all output grooming labels. Sensitivity (open circles) is equal to the ratio of grooming identified by the classifier to all visually labeled grooming events. We set the pruning filter to be 12/15 to attain >90% accuracy and sensitivity. (C) Fly genotypes vary by size and pigmentation, which can potentially affect performance of our classifier. To verify the generality and robustness of our method to different genotypes, accuracy (top) and sensitivity (bottom) of the classifier on w1118, Canton S, iso31, and yw were tested. Error rates in all tested strains were less than 10%.

https://doi.org/10.7554/eLife.34497.011
Video 4
Sample video of grooming-like behavior (stretching body)
https://doi.org/10.7554/eLife.34497.013

The accuracy of our algorithm was evaluated by comparing the computer-identified grooming with manually labeled grooming identified by visual inspection. We tested a total of 450 min of videos from a different set of w1118 flies (N = 15) than the one used in training the classifier. The comparisons showed that, of the grooming events picked out by our algorithm, 92.1% were manually verified as true grooming events (Figure 3C, top panel). Furthermore, among all manually scored grooming events, 95.5% were successfully identified by our computational method (Figure 3C, bottom panel). Since size and pigmentation differences between genotypes can potentially affect behavioral classification, we investigated robustness of our w1118-trained classifier with manually-labeled data from Canton S, iso31, and yw strains (10 min videos with N = 20 of each type). As shown in Figure 3C, error rates in each tested strain were less than 10%. Together, these results suggest that our method identifies grooming with high fidelity in several different Drosophila melanogaster strains.
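The two performance measures can be computed from paired label sequences as sketched here. The definitions follow the Figure 3B legend (accuracy as correct grooming labels over all output grooming labels, sensitivity as identified grooming over all manually labeled grooming); scoring frame by frame rather than event by event is an assumption of this example, as is the function name.

```python
def grooming_accuracy_sensitivity(predicted, manual, grooming="grooming"):
    """Return (accuracy, sensitivity) for grooming, scored frame by frame."""
    true_pos = sum(p == grooming and m == grooming
                   for p, m in zip(predicted, manual))
    n_predicted = sum(p == grooming for p in predicted)
    n_manual = sum(m == grooming for m in manual)
    # Accuracy here is what machine learning texts usually call precision.
    accuracy = true_pos / n_predicted if n_predicted else float("nan")
    sensitivity = true_pos / n_manual if n_manual else float("nan")
    return accuracy, sensitivity
```

If the classifier outputs four grooming frames of which three match the manual labels, and the manual labels contain five grooming frames in total, the function returns an accuracy of 0.75 and a sensitivity of 0.6.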

Flies spend a significant portion of their awake time grooming

The solitary flies in our experiments also spent portions of their time feeding (Ja et al., 2007) and sleeping (Hendricks et al., 2000; Shaw et al., 2000), behaviors that our classifier did not initially label but that can nevertheless be identified by our algorithm. Prolonged proximity to food (>3 s, < body length) was accepted as a proxy for feeding. Rest periods lasting 5 min or longer (Dubowy and Sehgal, 2017) were classified as sleep, following the currently accepted definition of the behavior. Together, these additional classifications led to the identification of five major behaviors in our data: grooming, locomotion, feeding, short rest (<5 min of quiescence), and sleep (Figure 4). The first four behaviors are mutually exclusive at the level of single events, together defining the wake state of the fly, and collectively complementary to the sleep state (Figure 4A). We found that a typical iso31+ fly under 12 hr light:12 hr dark (LD) conditions spent approximately 6% of its daily time grooming, ~24% locomoting, ~3% feeding, ~16% resting, and the remaining ~51% sleeping (Figure 4B). That is, the average iso31+ fly spent ~13% of its awake time grooming. It is worth noting here that such behavioral statistics can vary even between wild-type laboratory strains (Colomb and Brembs, 2014; Zalucki et al., 2015). For instance, similar analysis of a Canton-S strain showed that these flies groomed ~19% of their awake time (Figure 4—figure supplement 1A). These analyses demonstrate that our platform for long-term video-tracking and automated analysis can provide a quantitative ethological structure for daily basal fly behavior.
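The 5 min rule that separates short rest from sleep can be applied to a per-frame label sequence as in the sketch below. It assumes a frame rate of 5 frames per second (300 frames per 60 s, as in the Figure 4A legend) and hypothetical label strings; it is not taken from the authors' package.

```python
from itertools import groupby

FPS = 5                      # 300 frames per 60 s, per the Figure 4A legend
SLEEP_FRAMES = 5 * 60 * FPS  # 5 min of uninterrupted rest = 1500 frames

def split_rest_into_sleep(labels, min_frames=SLEEP_FRAMES):
    """Relabel each uninterrupted run of 'rest' as 'sleep' or 'short rest'."""
    out = []
    for label, run in groupby(labels):
        n = len(list(run))
        if label == "rest":
            label = "sleep" if n >= min_frames else "short rest"
        out.extend([label] * n)
    return out
```

A 1500-frame run of rest becomes sleep, while a 100-frame run that follows a grooming bout is kept as short rest.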

Figure 4 (with 3 supplements)
How grooming fits into the daily routine of a fly.

(A) Ethogram of grooming (green), locomotion (gray), feeding (blue), short rest (purple), and sleep (dark gray) performed by an iso31+ fly in 60 s (300 frames). Individual events of the first four behaviors are mutually exclusive and together constitute wake (yellow-orange), which is complementary to sleep (dark gray). (B) Average fraction of time flies spent in each behavior. N = 83 iso31+ flies. (C, D) Correlation between pairs of behaviors. There is strong negative correlation between sleep and locomotion (r = −0.93) and between sleep and short rest (r = −0.63). Interestingly, time spent in grooming does not show strong correlation with any of the other four behaviors. N = 83 iso31+ flies. r is the Pearson product-moment correlation coefficient. (E) Temporal patterns of behaviors of a single iso31+ fly during 4 days in LD cycles. Behaviors shown here are grooming (G), locomotion (L), feeding (F), short rest (R), wake (W), and sleep (S). Level of activity is shown in terms of fraction of time spent in each behavior. Fraction is calculated every 30 min. White/black horizontal bars indicate light/dark environmental conditions, respectively. (F) Rhythmicity in grooming, locomotion and wake in an example fly. In LD condition, fractions of time spent in these behaviors are plotted on left. In power spectra on right of time series of behaviors (horizontal dashed line denotes threshold power for p = 0.05), temporal patterns of the three behaviors all show significant circadian rhythmicity. In right top, spectra of randomized grooming show no rhythmicity, while modified locomotion is still rhythmic. Similarly, in time series on right bottom, with the same randomized grooming, wake remains rhythmic while grooming, as one component of it, is arrhythmic. In time series of behaviors, activity is binned every 30 min.

https://doi.org/10.7554/eLife.34497.014
Table 1
Parameter values and fitting errors from fitting grooming time-series
https://doi.org/10.7554/eLife.34497.021
Fly #   bMD      bMR     bER      bED     T0     TM    TE    HM    HE     Error
1       0.000    0.007   −0.024   0.010   23.9   6.0   3.0   294   442    0.0115
2       0.008    0.008   −0.055   0.014   23.9   3.0   5.0   280   344    0.0894
3       −0.008   0.005   −0.070   0.027   24     4.0   3.0   295   811    0.0674
4       −0.019   0.003   −0.035   0.046   24     4.0   3.0   204   540    0.0674
5       0.002    0.007   −0.042   0.019   24     4.0   2.0   262   1155   0.0541
6       −0.013   0.005   0.009    0.026   24     3.0   3.0   171   317    0.0115
7       0.026    0.003   −0.018   0.006   24.3   4.0   3.0   210   436    0.076
8       0.110    0.008   −0.012   0.015   23.9   2.0   5.0   158   344    0.0057
9       −0.015   0.003   −0.001   0.098   23.9   3.0   4.0   267   475    0.0175
Table 2
Parameter values and fitting errors from fitting locomotion time-series
https://doi.org/10.7554/eLife.34497.022
Fly #   bMD      bMR     bER      bED     T0     TM    TE    HM     HE     Error
1       −0.001   0.004   0.004    0.033   24.0   6.0   2.0   1631   1675   0.01
2       −0.005   0.073   −0.069   0.028   24.1   2.0   2.0   825    1434   0.0037
3       −0.013   0.063   0.020    0.002   23.9   2.9   1.9   4162   1208   0.0469
4       −0.010   0.020   0.022    0.001   23.7   3.0   2.0   3355   1388   0.001
5       0.055    0.060   −0.338   0.007   24     3.0   3.0   741    1948   0.0056
6       0.009    0.015   −0.054   0.028   23.6   3.0   3.0   1509   1369   0.1029
7       0.001    0.028   −0.022   0.023   24     2.0   3.0   1535   1010   0.0072
8       −0.015   0.008   −0.007   0.032   23.9   3.0   2.0   1504   2308   0.007
9       0.014    0.020   −0.028   0.016   23.9   3.0   3.0   1519   2004   0.0007

Since sleep and wake are complementary states, we expected fractional time spent in sleep to negatively correlate with that of the four wake behaviors our method tracks. Pair-wise comparisons (Pearson’s correlation coefficient, r, see Materials and methods) of individual fly sleep with grooming, locomotion, short rest or feeding, showed the expected negative relationships (Figure 4C and Figure 4—figure supplement 1B). Interestingly, the strength of negative correlation with sleep (Figure 4C) increased with the average fractional time spent in a wake behavior (Figure 4B). We reasoned that similar analysis among the wake behaviors, in contrast, should show positive correlations. Pair-wise comparisons among grooming, locomotion, short rest and feeding showed the predicted positive correlations, although to varying degrees (Figure 4D and Figure 4—figure supplement 1C). The analyses further revealed that the fraction of time a fly spent in short rest was the best predictor of its grooming time (r = 0.42 in iso31+ and 0.26 in Canton-S) while locomotion (r = 0.26 and −0.13) and feeding (r = 0.27 and 0.06) were both less reliable in predicting grooming.
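For reference, the Pearson product-moment coefficient used in these comparisons can be computed as below. The function name is hypothetical and the code is a textbook formulation, not the authors' script; each input would hold one value per fly, such as the fraction of time spent asleep and the fraction spent locomoting.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)
```

Values near −1 indicate that flies spending more time in one behavior spend correspondingly less time in the other, values near +1 indicate that the two fractions rise together, and values near 0 indicate little linear relationship.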

The weaker grooming-locomotion and grooming-feeding correlations were unexpected for two reasons. First, daily variations in grooming levels had appeared to closely follow those in locomotion (Figure 4—figure supplement 2A), suggesting the possibility that grooming is a by-product of the more robustly driven locomotor activity. Second, feeding activity has been postulated to act as a trigger for grooming with food debris serving as an external stimulus (Hampel et al., 2015; Seeds et al., 2014). To further dissect the lack of predictive relationship between grooming and locomotion, we first examined temporal parameters that describe grooming and locomotion over short timescales (Figure 4—figure supplement 2A–E). Basal locomotor events during mid-day and night (Figure 4—figure supplement 2A, rectangles) were relatively sparse compared to grooming episodes during the same times. This difference in inter-event time interval between grooming and locomotion persisted to different degrees throughout the day-night cycle, such that the average longest pause between two subsequent grooming events was ~88 min while that between two locomotor events was ~132 min (Figure 4—figure supplement 2C). Examination of the duration of individual events showed grooming events on average lasted for ~0.23 min compared to ~0.44 for locomotor events (Figure 4—figure supplement 2D). These analyses revealed significant differences between the two behaviors over short timescales and do not support locomotor activity as a driver of grooming.

To focus on temporal dynamics at longer timescales, we binned multi-day data into 30 min bins (Figure 4—figure supplement 2F,G) and applied a least-squares fit to a previously developed mathematical model that describes long timescale variations in fly activity in terms of exponential functions (Lazopulo and Syed, 2016, 2017). The functions were defined by four rate parameters bMR, bMD, bER and bED, where subscripts denote morning rise (MR), morning decay (MD), evening rise (ER) and evening decay (ED), and two duration parameters that describe the relative durations of morning (TM) and evening (TE) peaks in activity (Figure 4—figure supplement 2H and Figure 4—figure supplement 3). Results from this analysis showed that the rate parameter bMR of grooming was smaller than that of locomotion (Figure 4—figure supplement 2I), indicating a slower increase in night-time grooming activity. Additionally, the evening duration parameter (TE) for grooming was greater than that for locomotion (Figure 4—figure supplement 2J), indicating that the evening peak in grooming lasted longer. These differences in long timescale kinetics were again inconsistent with locomotor activity as a driver of grooming. Finally, comparison with long timescale variations in feeding patterns showed that peak time in contacting food was offset by 2–4 hr from nearby peaks in grooming (Figure 4—figure supplement 2O–P). The large temporal offset suggests contact with food is also not likely to drive the majority of grooming events observed in our experiments. Thus, according to our analyses of the kinetics of Drosophila ethograms in our system, neither locomotor activity nor feeding is likely to be a primary driver of basal grooming.

To identify major drivers of basal grooming, we noted that multi-day time series of the behaviors showed time-of-day-dependent changes in each behavior (Figure 4E). The appearance of repeating patterns raised the possibility that external light-dark (LD) cycles alone or in combination with internal programs could be exerting temporal control over several of these behavioral outputs, including grooming. Indeed, environmental light-dark cycles through influence on the circadian clock are known to drive rhythmic changes in fly sleep and wake durations and within the awake state, feeding, and locomotor activities (Chatterjee and Rouyer, 2016; Pfeiffenberger et al., 2010). That these rhythms persisted in the absence of LD cycles is generally considered to be strong support for clock control of these behaviors.

We set out to determine whether the circadian clock drives rhythmic modulations in fly daily grooming independent of other circadian-regulated behaviors; that is, to test whether grooming exhibits circadian oscillations simply because individual grooming events are mutually exclusive of other individual wake activities. We recognized that the mutual exclusivity of the behaviors seen at the level of individual events (Figure 4A) did not persist at the level of fractional time in each behavior where the long timescale modulations are visible (Figure 4E). This is because fractional time data are binned and the only constraint on these data was that the sum of the time spent in each wake behavior (grooming, locomotion, feeding and short rest) and sleep equaled one for each time bin (Figure 4—figure supplement 1F). In this representation, therefore, rhythmicity of one behavior (e.g. grooming) did not dictate the rhythmic status of another (e.g. locomotion).
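The binned representation described above can be illustrated with a short sketch that converts per-frame behavior labels into fractional time per bin. The label strings, the function name `fractional_time` and the toy input are assumptions made for this example; they are not taken from the authors' Matlab code.

```python
from collections import Counter

# Hypothetical label names for the five behavioral classes.
BEHAVIORS = ("grooming", "locomotion", "feeding", "short_rest", "sleep")

def fractional_time(labels, frames_per_bin):
    """Return one dict per complete bin mapping behavior -> fraction of frames."""
    if frames_per_bin <= 0:
        raise ValueError("frames_per_bin must be positive")
    n_bins = len(labels) // frames_per_bin  # a trailing partial bin is dropped
    bins = []
    for i in range(n_bins):
        chunk = labels[i * frames_per_bin:(i + 1) * frames_per_bin]
        counts = Counter(chunk)
        bins.append({b: counts[b] / frames_per_bin for b in BEHAVIORS})
    return bins

# Toy example: two bins of four frames each.
labels = ["sleep", "sleep", "grooming", "locomotion",
          "grooming", "grooming", "feeding", "short_rest"]
fractions = fractional_time(labels, frames_per_bin=4)
```

As long as every frame carries one of the five labels, the fractions in each bin sum to one, which is the only constraint noted above. For 30 min bins analyzed at 5 Hz, `frames_per_bin` would be 30 × 60 × 5 = 9000.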

To test the independence of rhythms, we performed a series of ‘shuffling experiments’ using well-established (Allada and Chung, 2010; Chatterjee and Rouyer, 2016) rhythmicities of wakefulness and locomotion as metrics (Figure 4F). In brief, we took data from Figure 4E in which grooming, locomotion and wakefulness have LD-driven ~24 hr rhythms (Figure 4F, left and power spectra) and computationally randomized the grooming time-series such that it lost rhythmicity (Figure 4F, right). To account for the randomized grooming, we also adjusted either locomotion (Figure 4F, upper panel) or wakefulness (Figure 4F, bottom panel), in both cases ensuring that wakefulness was between 0 and 1 at all times (see Materials and methods). In either case, we found that rhythmicity in locomotion and wakefulness was intact regardless of the rhythmic status of grooming. The simulation result suggested that circadian control of fly locomotion and wakefulness does not guarantee circadian control of underlying basal grooming, at least as measured from changes in the duration of the behaviors. Therefore, demonstration of robust ~24 hr rhythms in grooming in the absence of any external cues should be strong evidence in favor of circadian control of the behavior.
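One way such a shuffling experiment could be set up is sketched below for the variant in which wakefulness is adjusted. This is an illustration under stated assumptions, not the procedure given in Materials and methods: it assumes the grooming bins are randomly permuted, that the change in grooming is added bin by bin to wakefulness, and that the result is clipped to the interval [0, 1]. The function name and seed argument are hypothetical.

```python
import random

def shuffle_grooming_adjust_wake(grooming, wake, seed=0):
    """Permute binned grooming fractions and shift wakefulness by the change.

    Assumed scheme: wake_new = wake + (grooming_new - grooming_old),
    clipped so that wakefulness stays between 0 and 1 in every bin.
    """
    if len(grooming) != len(wake):
        raise ValueError("time series must have the same number of bins")
    rng = random.Random(seed)  # seeded for reproducibility
    shuffled = list(grooming)
    rng.shuffle(shuffled)      # destroys the temporal order of grooming
    adjusted_wake = [
        min(1.0, max(0.0, w + (g_new - g_old)))
        for w, g_old, g_new in zip(wake, grooming, shuffled)
    ]
    return shuffled, adjusted_wake
```

The permutation preserves the total amount of grooming while removing its time-of-day structure, so any rhythm remaining in the adjusted wakefulness series cannot be attributed to grooming.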

Temporal pattern of grooming is controlled by the circadian clock

To test whether basal grooming is also under circadian control, we first entrained iso31+ flies to 2 days of alternating light-dark cycles and then monitored their behavior over multiple days in constant darkness (DD). In the absence of light cues, locomotion, feeding and sleep showed the familiar clock-driven rhythms in their daily timing (Figure 5A,B). Although short rest appeared to undergo rhythmic changes (Figure 5A), spectral analysis indicated these changes did not result in statistically significant rhythms at the p=0.05 level (Figure 5B). Lack of rhythms in short rest is consistent with our earlier reasoning that rhythmic wakefulness and locomotion do not necessarily imply rhythmicity of each behavior in the awake state.

Figure 5 with 3 supplements
Grooming is under control of the circadian clock.

(A) Average temporal patterns (fraction of time spent in 30 min bins) of locomotion, feeding, short rest and sleep of eight representative iso31+ flies during 3 days in constant darkness (DD). Black horizontal bar represents lights-off condition. (B) Power spectra of behaviors in panel (A). Except for short rest, temporal patterns of the other three behaviors show significant circadian rhythmicity. Horizontal dashed line and dash-dot line denote threshold powers for p=0.05 and p=0.01, respectively. (C) Grooming activity (in 30 min bins) of wild-type and clock mutants during 2 days in LD followed by 4 days in DD. Grooming traces are population averages. In DD, wild-type (WT, iso31+) grooming continues to show 24 hr rhythms. In comparison, grooming in perS or perL flies shows shorter or longer rhythms, respectively. For per0 flies, grooming is arrhythmic in DD. N = 8 WT, 8 perS, 8 perL, and 8 per0 representative flies. (D) Example power spectra showing circadian rhythmicity in grooming patterns of three individual wild-type, perS, perL and per0 flies. Spectra are normalized to variance of activity (in 30 min bins). Dashed lines and dash-dot lines represent threshold power at p=0.05 and p=0.01, respectively. More examples of individual power spectra are provided in Figure 5—figure supplement 1. (E) Spectral powers of circadian peaks of individual wild-type and circadian mutants. N = 29 control, 20 perS, 29 perL, 20 per0, 13 cyc01 and 11 clkJrk.

https://doi.org/10.7554/eLife.34497.023

Grooming data also showed periodic changes in constant darkness (Figure 5C). Power spectra of individual time-series (‘WT’ in Figure 5D and Figure 5—figure supplement 1A) indicated these periodic changes to be statistically rhythmic by revealing peaks significant at p=0.01 in 100% of flies (29 out of 29 individuals, Figure 5E). The average period of oscillations was 23.72 hr, with a standard deviation of 0.72 hr (Figure 5—figure supplement 1B). The presence of these robust circadian rhythms in the absence of external cues further supports the hypothesis that fly basal grooming is under control of the internal timekeeper. Consistent with our prediction that grooming rhythms in DD do not necessarily follow from rhythms in locomotion or wakefulness, we found that knowing locomotion or wakefulness is rhythmic did not inform about the rhythmic status of grooming (Figure 5—figure supplement 2). This finding further underscored the importance of the DD studies in establishing rhythmicity in basal grooming. It should be noted here that our simulation results do not demonstrate bidirectional independence of rhythmicity in wakefulness and grooming but only that rhythmicity of wakefulness does not depend on that of grooming. Demonstration of fully independent rhythms in the two behaviors is beyond the scope of the present study.

We next took advantage of several circadian mutants to examine further the control of grooming by the circadian clock. The Drosophila clock is composed of two interlocked genetic feedback loops in which period (per) is one of the core components and whose transcription is controlled by the primary transcription factors Clock (clk) and Cycle (cyc) (Allada and Chung, 2010). The per gene has several well-characterized mutant alleles, two of which, perS and perL, produce short and long circadian rhythms, respectively, while a third, per0, results in arrhythmic behavior (Konopka and Benzer, 1971). Population-averaged grooming of perS and perL showed altered oscillations in LD and DD (Figure 5C, second and third rows), with average DD periods of 19.23 ± 0.57 hr and 28.84 ± 1.13 hr, respectively (Figure 5D,E and Figure 5—figure supplement 1A). The periods of oscillation in grooming were well within published values of circadian rhythms of these mutants (Konopka and Benzer, 1971) and in agreement with alterations in locomotor rhythms of the flies (Figure 5—figure supplement 3A). Consistent with these results, grooming in per0 flies was arrhythmic (Figure 5C, bottom row) and, when analyzed at the individual fly level, the power spectra revealed the absence of statistically significant rhythms in 19 out of 20 flies at the p=0.01 level (Figure 5D,E and Figure 5—figure supplement 1A). Moreover, analysis of grooming patterns in cyc01 (Rutila et al., 1998) and clkJrk (Allada et al., 1998), arrhythmic mutants of cyc and clk, also showed loss of circadian rhythms (Figure 5E and Figure 5—figure supplement 3B–D). Together, these results support the hypothesis that the circadian clock temporally modulates fly grooming.

Grooming duration is controlled by cycle and clock

To test whether, in addition to regulating the timing of grooming, the circadian clock also regulates grooming duration, we examined the average duration of grooming in circadian mutants. Despite causing major changes in temporal patterns of grooming, the per0 mutation did not significantly change the average duration of grooming in these flies (Figure 6A). In contrast, cyc01 and clkJrk mutants both exhibited increased daily average grooming relative to their respective genetic controls (Figure 6B,C). While both mutants exhibited increased grooming duration, this change was accompanied by opposing changes in their locomotion: cyc01 flies spent less time and clkJrk flies spent almost twice as much time in locomotion (Figure 6B,C, pie plots). Thus, the increase in cyc01 grooming came almost entirely from loss of locomotor activity while the increase in clkJrk grooming came from loss of sleep. These results support the hypothesis that locomotion and grooming are partly independent behaviors and further suggest that the cyc01 and clkJrk mutations alter the insect’s internal homeostasis in distinct ways, similar to phenotypic differences reported previously in sleep studies involving cyc01 and clkJrk (Hendricks et al., 2003; Shaw et al., 2002). Importantly, together with per0 data, the results raise the possibility of non-circadian roles for cyc and clk in setting the duration of internally driven grooming in Drosophila.

Figure 6 with 1 supplement
Control of grooming duration is independent of circadian rhythmicity.

In each panel, bar plots on left show average fractional time spent in grooming in mutant and control flies. Pie charts on right present average fractional time spent in grooming (green), locomotion (gray), sleep (dark gray), short rest (purple) and feeding (blue). Here, numerical values for fractional time spent in behavior are indicated only for grooming, locomotion and sleep with additional details in Figure 6—figure supplement 1A. Although loss of a functional clock does not affect grooming amount (A), mutations in clock (B) and cycle (C) genes lead to robust increases in the time flies spend grooming. Additional time for grooming can come from reduction in sleep (B) or reduction in locomotion (C). Reduction in sleep, however, does not always entail similar changes in grooming since sleep mutants fumin (D) and sleepless (E) show divergent alterations in grooming durations. N = 83 control, 53 per0, p=0.28. N = 76 control, 18 cyc01, p=2.7×10−4. N = 28 control, 25 clkJRK, p=7.8×10−9. N = 17 control, 23 fumin, p=0.003. N = 28 control, 17 sss, p=1.3×10−10.

https://doi.org/10.7554/eLife.34497.031

cycle and clock have also been implicated in stress response, particularly in regulating the level of sleep in response to sleep deprivation and adjusting locomotor output in response to nutrient unavailability (Hendricks et al., 2003; Keene et al., 2010; Shaw et al., 2002). Because grooming and sleep have both been previously linked to stress, we asked whether reduction in sleep is always accompanied by an increase in grooming as seen in our clkJrk data. To address this question, we examined the relationship between grooming and sleep in standard LD cycles in two short-sleeping mutants, fumin and sleepless (sss). Consistent with the original studies (Koh et al., 2008; Kume et al., 2005), our method found both strains to have extremely low levels of sleep (Figure 6D,E, pie plots). But, while loss of sleep in fumin was accompanied by an upregulation in grooming (Figure 6D), loss of sleep in sss was accompanied by a dramatic downregulation in grooming, compared to control flies (Figure 6E). These divergent relationships between sleep and grooming (e.g. sss vs. fumin) and between locomotion and grooming (e.g. clkJrk vs. cyc01) became more evident when individuals of different genotypes were compared together (Figure 6—figure supplement 1F,G). To better visualize the effects of disparate mutations, data of each genotype in these plots were normalized to the population-mean of its genetic control. These results suggest that resetting of the level of internally-driven grooming can occur in a number of ways with complex compensatory changes in sleep and locomotor behavior.

Accumulated data from our experiments suggest that grooming is an innate fly behavior controlled by two major regulators. One of these regulators controls temporal patterns in grooming and the other controls amount of time spent in grooming. Circadian genes per, cyc and clk are involved in controlling the timing of peaks/troughs in grooming rhythms while cyc and clk are also involved in setting how much time is spent grooming. The apparent absence of per from the second regulatory mechanism is consistent with the possibility that the two control mechanisms operate independently.

Nearly all animals tested exhibit daily basal grooming behavior, suggesting that grooming is not only fundamental to health but also reflects a generally healthy state. Consistent with this, loss of grooming is indicative of sickness behavior (Hart, 1988) associated with infection or old age, and, in the case of humans, mental illness. A greater understanding of the molecular mechanisms regulating grooming would provide insight into the principles and neural circuits underlying other complex programmed behaviors, as well as potentially identify biomarkers of pathological disease states. Critical to the dissection of these molecular mechanisms is a system for rapid, automated interpretation of grooming in a genetically tractable model organism. The development of our platform will facilitate high-throughput and unbiased analysis of the genetic regulators and neural circuits that control grooming, as well as those responsible for loss of grooming in the context of disease.

Discussion

Grooming continues to be one of the least understood Drosophila behaviors, possibly due to the technical challenges of detecting grooming events in this small insect. Early work describing fly grooming relied on manual scoring (Connolly, 1968; Szebenyi, 1969; Tinbergen, 1965), which imposes significant limitations on the length of events that can be detected, fidelity and objectivity of detection, and the level of detail that can be extracted from the data. Despite such limitations, these initial studies made a number of noteworthy observations. Szebenyi delineated all the major modes of fly grooming and suggested that repetitive grooming actions may closely follow a preset sequence (Szebenyi, 1969). A subsequent study in the blowfly offered a more refined mechanistic picture of insect grooming by proposing that the sequential actions form a hierarchical structure (Dawkins and Dawkins, 1976). Combining modern computational and genetic tools, an elegant study in Drosophila recently confirmed these previous hypotheses (Seeds et al., 2014). That fruit flies may groom spontaneously in the absence of any apparent stimulus has also been previously suggested (Connolly, 1968; Tinbergen, 1965). Consistent with this, our work provides evidence that fruit flies groom as part of their daily repertoire of internally programmed behaviors and often without any obvious external stimulus. Our analysis revealed that over a period of hours, grooming is temporally structured by the fly circadian clock, with peak activity around dawn and dusk. The study also identifies transcription factors CLOCK and CYCLE as critical molecular components that control the amplitude of programmed Drosophila grooming.

Machine-learning is increasingly gaining popularity due to its applicability to virtually any problem involving pattern classification, including in studies aimed at deconstructing stereotyped behavior in the fruit fly (Branson et al., 2009; Kabra et al., 2013; Kain et al., 2013; Mendes et al., 2013; Valletta et al., 2017). Similar to these recent efforts, we constructed a computational pipeline incorporating elements of machine learning to automatically identify grooming events in video recordings of behaving flies. Our approach relies, in particular, on a supervised k-nearest neighbors algorithm to broadly classify behavior into grooming, locomotion and rest (Figure 2). Application of additional optional filters yielded approximate data on feeding and sleep (Figure 4). While previous methods offer important details on different modes of grooming (Berman et al., 2014; Seeds et al., 2014), leg movements (Kain et al., 2013; Mendes et al., 2013), and fly-fly interactions (Branson et al., 2009; Kabra et al., 2013) from short videos, the methods have limited capability for interpreting multi-day and multi-fly recordings. The method presented here offers less detail on modes of grooming, but can instead readily dissect circadian time-scale recordings into three to five behavioral classes on a typical personal computer.

The apparatus used in this method (Figure 1) also offers a number of advantages over current ones. First, most items used in the apparatus, including the ~6 cm tubes in which flies are visualized, are standard in a typical circadian experiment studying fly locomotion or sleep (Lazopulo et al., 2015; Pfeiffenberger et al., 2010) using the Drosophila Activity Monitor (DAM). The retention of this basic feature should lower the technical hurdle for the interested investigator who is likely to be one already engaged in locomotion and sleep studies in Drosophila. The use of a shared design to house flies also means that both experimental subjects and certain conclusions drawn from one platform may be readily transferred to the other. Most current grooming methods require specialized equipment for fly stimulation and detection (Seeds et al., 2014), elaborate optics (Kain et al., 2013), or a specific form of fluorescence microscopy (Mendes et al., 2013). Second, our apparatus can simultaneously monitor up to ~20 flies, while the existing approaches, although offering higher-resolution data, monitor only one animal at a time. The scalability and high-throughput nature of our platform should appeal to investigators interested in, for example, large-scale genetic studies to identify mechanisms that differentially affect grooming, locomotion and rest (King et al., 2016). Finally, the flies in our apparatus are allowed to move freely over a distance roughly 10 times their body length and still remain in the camera’s field of view while technical constraints in other studies limit visualization to short distances (Mendes et al., 2013). The relative freedom of mobility, access to food, and long time-scales of observation offered by our apparatus thus facilitate analysis of basal, internally programmed behavior.

These properties make our platform amenable to addressing questions of biological relevance, such as the importance of grooming behavior, its temporal regulation with regard to other fly behaviors, and its dependence on the circadian timekeeping system. First, we found that flies consistently devote a significant fraction of their waking time to grooming (13%) and, surprisingly, that grooming is observed even during periods of reduced locomotor activity (Figure 4—figure supplement 2A). This suggests that the benefits of grooming outweigh the caloric resources expended and the resulting interruption of rest, underscoring the hypothesis that daily grooming is a fundamental behavior of Drosophila.

A few recent studies (Hampel et al., 2015; Phillis et al., 1993; Seeds et al., 2014) have shown that fly grooming can be directly induced by peripheral stimuli, and there has been considerable progress toward identifying the behavioral and neural aspects of such stimulus-induced grooming. However, programmed grooming, or grooming in the absence of a macroscopic stimulus, remains relatively understudied in Drosophila. To our knowledge, the existence of programmed grooming, first proposed in the mid 60s, still remains unreported.

Data from this study suggest that a significant portion of daily fly grooming is driven by internal programs. Flies in our experiments are active for ~34% of the time within a 24 hr period, during which they mostly engage in grooming, locomotion and feeding. Behavioral analysis showed that, like sleep, locomotion and feeding, fly grooming behavior is modulated by oscillations of the circadian clock (Figure 5). This finding raised the possibility that the observed grooming was stimulated by rhythms in contact with food or locomotor activity. However, closer examination revealed that kinetics in feeding and locomotion were distinct from those of grooming (Figure 4—figure supplement 2). Additionally, genetic modifications resulted in contrasting changes in these behaviors (Figure 6). These results together suggest that the majority of grooming events detected in our experiments are not triggered by external stimuli such as light, food and locomotor movements. Rather, internal regulatory mechanisms, independent of external stimuli, likely drive this programmed behavior.

Multi-day recordings of wild-type flies in constant darkness showed 24 hr rhythms in daily grooming patterns (Figure 5, Figure 5—figure supplement 1). Furthermore, these rhythms were shifted appropriately in the canonical period mutants perL and perS and abolished in arrhythmic per0 flies (Figure 5). These data support a regulatory model in which timing of programmed grooming behavior is orchestrated by the circadian clock. Notably, since loss of rhythmicity did not significantly affect the amount of grooming (Figure 6A), our results suggest that the primary role of the clock is to organize the behavior in time without influencing the total time flies dedicate to grooming.

Intriguingly, two other circadian mutations, cyc01 and clkJrk, increased the proportion of daily time flies spend grooming (Figure 6B,C), implying that the changes in grooming level may not be due to circadian defects. These data are consistent with the hypothesis that clock-independent but cyc- and clk-dependent pathways regulate the amount of programmed grooming behavior.

Finally, why are flies innately programmed to groom? The present study does not directly address this important question, but given that microscopic pathogens can sporulate on the fly cuticle and eventually infect the insect (St. Leger et al., 2011), persistent grooming may serve as a first line of defense against such attack. Thus, the immune system may constitute another internal program, similar to the cyc and clk-controlled mechanisms, that drives fly grooming; if so, we hypothesized that mutants with defective immune response may exhibit altered grooming behavior (Lemaitre et al., 1995; Michel et al., 2001). Consistent with this, we found that grooming was reduced in the immune-deficient imd mutant (Figure 6—figure supplement 1H), although a second immune-deficient strain lacking a member of the Toll pathway (PGRP-SAseml) did not show a significant change. Further studies are required to clarify these initial results and elucidate the biological function of programmed grooming in Drosophila.

Together, our data provide strong supporting evidence for programmed grooming in Drosophila and suggest that this innate behavior is driven by two possibly distinct sets of regulatory systems. The circadian system temporally segregates time-dependent variations in grooming from those of other essential behavioral outputs like feeding and sleep. Circadian coordination of grooming underscores a previously under-appreciated importance of this behavior in the daily routine of the fruit fly. The second regulatory system adjusts the level of grooming relative to other behaviors. This set of regulation likely confers adaptability on the animal by allowing it to up- or downregulate grooming as necessitated by internal and external conditions. The dual control mechanism of grooming proposed here is highly reminiscent of the two-process framework (circadian and homeostatic) that is widely used in understanding sleep regulation (Borbély, 1982). Although this work has not demonstrated that grooming is under homeostatic control, future studies could be aimed at better characterizing the nature of the non-circadian regulatory system of fly grooming.

In summary, we present here a new platform to detect innate grooming behavior simultaneously and for days at a time in multiple individual fruit flies. The apparatus can be assembled easily, and the accompanying analytics are available publicly. Utilizing this platform, we report several mechanisms that are possibly responsible for driving the timing and level of programmed grooming in Drosophila. We also suggest future experiments that through use of this platform can lead to deeper understanding of the underlying biology of grooming and its relation to other essential fly behaviors.

Materials and methods

Key resources table
Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information
Strain, strain background (Drosophila melanogaster, male) | sssP1 | DOI: 10.1126/science.1155942 | | on iso31 background
Strain, strain background (D. melanogaster, male) | iso31 | DOI: 10.1126/science.1155942 | |
Strain, strain background (D. melanogaster, male) | fumin | DOI: 10.1523/JNEUROSCI.2048-05.2005 | | on w1118 background
Strain, strain background (D. melanogaster, male) | w1118 | Bloomington Drosophila Stock Center | BDSC: 3605 |
Strain, strain background (D. melanogaster, male) | Canton S | Bloomington Drosophila Stock Center | BDSC: 64349 |
Strain, strain background (D. melanogaster, male) | clkJrk | this paper | | backcrossed for five generations to iso31
Strain, strain background (D. melanogaster, male) | per0 | this paper | | backcrossed for five generations to iso31+
Strain, strain background (D. melanogaster, male) | perS | this paper | | backcrossed for six generations to iso31+
Strain, strain background (D. melanogaster, male) | perL | this paper | | backcrossed for six generations to iso31+
Strain, strain background (D. melanogaster, male) | cyc01 | other | | on Canton S background, gifts from William Ja
Strain, strain background (D. melanogaster, male) | iso31+ | other | | gifts from Michael Young

Fly strains

Clock mutants perS, perL, and per0 were backcrossed for five to six generations to an iso31 strain carrying a mini-white insertion (iso31+). cyc01 flies, gifts from William Ja (The Scripps Research Institute), have the Canton S background. ClkJrk flies were backcrossed for five generations to iso31. sssP1 mutant flies, gifts from Amita Sehgal (Perelman School of Medicine at the University of Pennsylvania), have the iso31 background. fumin mutants, gifts from F. Rob Jackson (Tufts University School of Medicine), have the w1118 background. Flies were bred and raised at 23°C and 40% relative humidity on standard cornmeal and molasses food. All experiments were done with 5–8-day-old males at 26°C and 70–80% relative humidity in a custom-built behavior tracking chamber (Figure 1 and Figure 2—figure supplement 1A). For each experiment, control strain refers to the genetic background of a mutant. WT flies in Figures 4 and 5 refer to the iso31+ line.

Behavior tracking apparatus

Chamber

Flies were placed individually in glass tubes (Trikinetics Inc., Waltham, MA, PGT5 × 65) with food and a cotton plug at opposite ends. Twenty tubes were placed on a custom-designed acrylic plate inside a transparent acrylic cuboid box for simultaneous imaging. Temperature and humidity were monitored every 5 min with a digital thermometer (Dallas Semiconductor, Dallas, TX, DS18B20) and a humidity sensor (Honeywell, Morris Plains, NJ, HIH-4010), respectively, while a wet sponge inside the chamber kept the relative humidity around 70–80% (Figure 2—figure supplement 1A).

Illumination

The chamber was illuminated by two sets of light-emitting diode (LED) strips. White LEDs (LEDwholesalers, Hayward, CA, 2026) producing ~700 lux were used to simulate daytime conditions and infrared LEDs (LEDLIGHTSWORLD, Bellevue, WA, SMD5050-300-IR 850 nm) were used to visualize the flies at all times.

Camera

A CCD monochrome camera (The Imaging Source, Charlotte, NC, DMK-23U445) fitted with a varifocal lens (Computar, Cary, NC, T2Z-3514-CS) was used for video imaging. To minimize the influence of the chamber’s light/dark conditions on video quality, we put a 780 nm long pass filter (Midopt, Palatine, IL, LP780-30.5) in front of the lens. Videos were saved as 8-bit images in .avi format with 1280 × 960 resolution at 10 Hz and down-sampled as needed.

Analytic hardware and runtime

Using a desktop computer with an Intel Core i7-4770 3.4 GHz processor and 4 × 4 GB DDR3 1600 MHz RAM, it takes ~7 hr to extract grooming, locomotion and rest data from an 8 hr video of 20 flies recorded at 10 Hz (288,000 frames in total) at 1280 × 960 pixel resolution. Videos are analyzed every second frame (5 Hz), which is sufficient to capture grooming events.

Algorithm for automatic detection of grooming

All computational analyses were done with custom-written Matlab scripts that will be available at https://github.com/sbadvance/Drosophila-Grooming-Tracking.git (Qiao, 2017; copy archived at https://github.com/elifesciences-publications/Drosophila-Grooming-Tracking).

Fly shape extraction. Fly shape was extracted by applying a background subtraction algorithm. The background or reference frame is constructed by randomly picking two frames, a template and a contrast, comparing their pixel grayscale values, and erasing all moving objects from the template frame. To remove the fly from the template frame, we replace the pixels belonging to the fly with the corresponding pixels from the contrast frame, relying on the fact that a fly is always darker than the surrounding objects. The template frame with no fly present then becomes the background frame. Additionally, because a fly’s surroundings, including food debris, change substantially during the course of an experiment (Figure 2—figure supplement 1B), the background frame is regenerated every 1000 s. Lastly, if a fly occupies the same area in the template and contrast frames, the overlapping region cannot be erased on the template. To circumvent this problem, every time a background frame is generated, we randomly choose seven frames, instead of one, as contrast frames and compare all of them with the template. When a fly does not move for more than 1000 s, the fly will not be removed from the background and cannot be detected in other frames during this 1000 s. Thus, when a fly is not detected, we consider the fly to be stationary at the position where it was last detected.

To reduce effects of charge coupled device (CCD) image noise and fluctuations in the system, we set a minimum change C0 as the threshold to accept grayscale changes from fly movements. We denote the grayscale value of a pixel located at (x, y) (in units of pixel, in our case, x ∈ [1:1280], y ∈ [1:960]) in the template as Itemplate(x,y) and in the contrast frame Icontrast(x,y). Only if

$$I_{\mathrm{template}}(x,y) - I_{\mathrm{contrast}}(x,y) > C_0$$

then

$$I_{\mathrm{template}}(x,y) = I_{\mathrm{contrast}}(x,y)$$

While increasing threshold C0 reduces noise, it can also lead to rejection of real movements of the fly. To optimize C0, we tested noise levels in our images by analyzing a 3-hr video with dead flies. In the test, 30 pairs of consecutive frames were randomly chosen from the video and the differences between their corresponding grayscale pixel values were calculated. The distribution of the differences, stemming from noise, is shown in Figure 2A. Based on this distribution, we set C0=10, which excludes 99.99% of noise-related changes in grayscale values.
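The threshold selection described above can be sketched as follows. This is a minimal Python illustration, not the authors' Matlab implementation: the function name `noise_threshold`, the use of absolute differences, and the synthetic frames standing in for the dead-fly video are all assumptions made for the example.

```python
import numpy as np

def noise_threshold(frame_pairs, coverage=0.9999):
    """Return a grayscale threshold C0 such that at least `coverage` of the
    absolute pixel differences between paired frames are <= C0."""
    diffs = np.concatenate([
        np.abs(a.astype(np.int32) - b.astype(np.int32)).ravel()
        for a, b in frame_pairs
    ])
    diffs.sort()
    # Index of the order statistic that covers the requested fraction
    idx = int(np.ceil(coverage * diffs.size)) - 1
    return int(diffs[idx])

# Synthetic stand-in for a dead-fly video (assumption): a static scene
# plus small, independent sensor noise in every frame
rng = np.random.default_rng(0)
scene = rng.integers(50, 200, size=(96, 128))
pairs = [
    (scene + rng.integers(-3, 4, size=scene.shape),
     scene + rng.integers(-3, 4, size=scene.shape))
    for _ in range(30)
]
c0 = noise_threshold(pairs)
print("C0 =", c0)
```

With real recordings, `pairs` would hold the 30 randomly chosen pairs of consecutive frames, and the returned value plays the role of C0 in the update rule.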

Feature normalization. Since PM and CM both represent areas (number of pixels in an area), while CD represents a distance, we take the square root of PM and CM to make the dimensions of the features homogeneous. In addition, fly size varies between individuals and across experimental settings. To facilitate comparison of data in feature space, we therefore normalize PM, CM and CD of each fly with a scale parameter SP equal to the square root of the area of that fly. Thus, the final forms of the normalized features are

Normalized PM = PM/SP
Normalized CM = CM/SP
Normalized CD = CD/SP
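As a hedged illustration of this normalization, the short Python sketch below assumes that PM and CM in the formulas refer to the square-rooted areas and that the fly area is given in pixels. The function name `normalize_features` and the numerical values are hypothetical.

```python
import math

def normalize_features(pm, cm, cd, fly_area):
    """pm, cm: areas in pixels; cd: distance in pixels; fly_area: pixels.

    Returns the three features as dimensionless, size-normalized values.
    """
    sp = math.sqrt(fly_area)  # scale parameter SP
    # Square roots turn the two areas into lengths before dividing by SP
    return math.sqrt(pm) / sp, math.sqrt(cm) / sp, cd / sp

# A fly imaged at twice the linear size has 4x the areas and 2x the distance,
# yet maps to the same point in the normalized feature space
small = normalize_features(pm=16, cm=4, cd=6, fly_area=400)
large = normalize_features(pm=64, cm=16, cd=12, fly_area=1600)
print(small, large)
```

Under these assumptions the normalized features are invariant to a uniform rescaling of the fly image, which is the property that makes feature values comparable between individuals of different size.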

Spectral analysis

Figures 4 and 5 and Figure 5—figure supplements 1–3: To measure periodicity in locomotion and grooming recordings, we applied the Lomb-Scargle periodogram (Lazopulo et al., 2015; Scargle, 1982) to time series that were binned into 30 min periods. The power levels at the indicated p-values shown in the power spectra were calculated according to

$$\mathrm{Power} = -\ln\left(1 - (1 - p)^{1/N}\right)$$

where p is the p-value and N is the number of frequencies computed in the Lomb-Scargle periodogram.
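The cut-off power can be evaluated directly from this expression. The Python sketch below is an illustration only; the function names and the example values of N are assumptions. The inverse relation, p = 1 − (1 − e^(−z))^N, is included as a consistency check.

```python
import math

def cutoff_power(p, n_freq):
    """Power threshold z satisfying 1 - (1 - exp(-z))**n_freq == p."""
    # -expm1(log1p(-p) / n_freq) equals 1 - (1 - p)**(1 / n_freq),
    # evaluated in a numerically stable way for small p and large n_freq
    return -math.log(-math.expm1(math.log1p(-p) / n_freq))

def false_alarm_probability(z, n_freq):
    """Inverse relation: p-value associated with a peak of power z."""
    return 1.0 - (1.0 - math.exp(-z)) ** n_freq

# Illustrative values of N (assumed, not taken from the analysis)
for n in (50, 500, 5000):
    print(n, round(cutoff_power(0.01, n), 2))
```

For small p the expression is approximately ln(N/p), so the cut-off rises only logarithmically as more frequencies are evaluated.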

To test the effect of binning on rhythmicity, we binned the grooming activity of individual flies in 30 min, 5 min, and 1 min bin sizes and ran Lomb-Scargle periodogram analysis on these binned data, as well as on raw data without any additional binning. Five example individual spectra for each bin size are shown in Figure 5—figure supplement 1C. As shown in the figure, the separation between the statistical cut-off power (at a given p-value, horizontal lines) and the peak power increases with smaller bin size or, equivalently, a larger number of data points (N). This is because, in the Lomb-Scargle periodogram, the cut-off power grows as log(N) while the peak power grows as N.

Time series randomization

In Figure 4F and Figure 5—figure supplement 2, randomized grooming was generated by randomly shuffling time in raw grooming data. The corresponding modified locomotion and wake were calculated according to

Modified locomotion = original locomotion + original grooming − randomized grooming

Modified wake = original wakefulness + original grooming − randomized grooming

These manipulations modified either locomotion or wake while keeping the other unchanged.
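A minimal Python sketch of this randomization is given below, assuming each behavior is stored as one value per time bin. The function name `randomize_grooming` and the toy arrays are hypothetical and only illustrate the two formulas above.

```python
import numpy as np

def randomize_grooming(grooming, other, rng):
    """Shuffle grooming in time and adjust `other` (locomotion or wake).

    Returns (randomized grooming, modified other), following
    modified = original other + original grooming - randomized grooming.
    """
    grooming = np.asarray(grooming, dtype=float)
    other = np.asarray(other, dtype=float)
    randomized = rng.permutation(grooming)  # same values, shuffled time order
    modified = other + grooming - randomized
    return randomized, modified

# Toy time series (assumed values, one entry per time bin)
rng = np.random.default_rng(1)
grooming = np.array([2.0, 0.0, 5.0, 1.0, 0.0, 3.0])
locomotion = np.array([10.0, 4.0, 8.0, 6.0, 0.0, 12.0])
randomized, modified_locomotion = randomize_grooming(grooming, locomotion, rng)
print(randomized, modified_locomotion)
```

By construction, the sum of the randomized grooming and the modified series in each bin equals the sum of the two original series, so the total amount of grooming is preserved while its timing is scrambled.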

Statistics

No sample size estimation was performed when the study was being designed. Unless otherwise specified, quantitative experiments with statistical analysis were repeated at least three times independently. Data were excluded from flies that were physically damaged (for example, broken wings or legs), physically confined (for example, trapped by condensation inside tubes), or that died during experiments. To test the statistical significance of differences between groups, we first tested the normality of the data with a one-sample Kolmogorov-Smirnov test. A two-sample F-test was used to test for equal variances. Samples with equal variances were compared using a two-tailed t-test. Satterthwaite's approximation for the effective degrees of freedom was applied for samples with unequal variances. Results are expressed as mean ± s.d., unless otherwise specified. *p<0.05, **p<0.01, ***p<0.001 were considered statistically significant.

In Figure 4C,D and Figure 4—figure supplement 1B,C, the Pearson correlation coefficient r for each pair of data was calculated according to the standard definition

$$r_{X,Y} = \frac{E[(X - \mu_X)(Y - \mu_Y)]}{\sigma_X \sigma_Y}$$

where X and Y are the times spent in two behaviors, rX,Y is the Pearson correlation coefficient between the two behaviors, E[ ] is the expectation value, and μ and σ are, respectively, the mean and standard deviation of a behavior. The statistical significance of r was estimated through bootstrapping. For each pair of behaviors, we randomly paired data from n flies (n = 84 for iso31+ and n = 76 for Canton S) and calculated a correlation coefficient r. This process was repeated 100,000 times and the empirical distribution of the randomly paired r values was used for a two-tailed test (Figure 4—figure supplement 1D). p-values for all Pearson correlation coefficients are presented in Figure 4—figure supplement 1E.
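The random-pairing test can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' code: the function names, the use of |r| for the two-tailed comparison, the reduced default number of iterations, and the synthetic data are all choices made for the example.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation from the standard definition (population std)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.mean((x - x.mean()) * (y - y.mean())) / (x.std() * y.std())

def shuffled_pairing_pvalue(x, y, n_iter=10_000, seed=0):
    """Two-tailed p-value of r from randomly re-paired data.

    The text above uses 100,000 repetitions; the default here is smaller
    only to keep the example fast.
    """
    rng = np.random.default_rng(seed)
    r_obs = pearson_r(x, y)
    # Null distribution: break the fly-by-fly pairing by permuting y
    null = np.array([pearson_r(x, rng.permutation(y)) for _ in range(n_iter)])
    p = np.mean(np.abs(null) >= abs(r_obs))
    return r_obs, p

# Synthetic per-fly durations (assumed values, arbitrary units)
rng = np.random.default_rng(2)
grooming = rng.normal(100, 10, size=84)
sleep = 500 - 2.0 * grooming + rng.normal(0, 10, size=84)
r, p = shuffled_pairing_pvalue(grooming, sleep)
print(f"r = {r:.2f}, p = {p:.4f}")
```

A p-value of zero from this procedure only means that no randomly paired sample reached the observed |r| within `n_iter` repetitions, so the resolution of the test is limited to 1/`n_iter`.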

References

  5. CM Bishop (2007). Pattern Recognition and Machine Learning. Springer.
  6. AA Borbély (1982). A two process model of sleep regulation. Human Neurobiology 1:195–204.
  15. V Geist, F Walther (1974). The Behaviour of Ungulates and Its Relation to Management, 24. I.U.C.N. Publication.
  36. G McLachlan, KA Do, C Ambroise (2005). Analyzing Microarray Gene Expression Data. John Wiley & Sons.
  41. D Owald, S Lin, S Waddell (2015). Light, heat, action: neural control of fruit fly behaviour. Philosophical Transactions of the Royal Society B: Biological Sciences 370:20140211. https://doi.org/10.1098/rstb.2014.0211
  44. RW Phillis, AT Bramlage, C Wotus, A Whittaker, LS Gramates, D Seppala, F Farahanchi, P Caruccio, RK Murphey (1993). Isolation of mutations affecting neural circuitry required for grooming behavior in Drosophila melanogaster. Genetics 133:581–592.
  60. N Tinbergen (1965). Animal Behavior. Time Incorporated.
  62. FR Walther (1984). Communication and Expression in Hoofed Mammals. Indiana University Press.

Decision letter

  1. Kristin Scott
    Reviewing Editor; University of California, Berkeley, Berkeley, United States

In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.

[Editors’ note: a previous version of this study was rejected after peer review, but the authors submitted for reconsideration. The first decision letter after peer review is shown below.]

Thank you for submitting your work entitled "Automated analysis of internally programmed grooming behavior in Drosophila using a k-nearest neighbors classifier" for consideration by eLife. Your article has been reviewed by three peer reviewers, and the evaluation has been overseen by a Reviewing Editor and a Senior Editor. The following individuals involved in review of your submission have agreed to reveal their identity: Benjamin de Bivort (Reviewer #1); Daniel Cavanaugh (Reviewer #3).

Our decision has been reached after consultation between the reviewers. Based on these discussions and the individual reviews below, we regret to inform you that your work will not be considered further for publication in eLife. Although all three reviewers found aspects of the work exciting, concerns about the manuscript as a methods paper and about the main biological finding that grooming is under circadian control preclude publication at this time.

Reviewer #1:

In this manuscript Qiao and Li et al. develop an automated annotation procedure for scoring grooming behavior in flies. They use this approach to then measure the circadian dependence of grooming, examining the effects of mutation in period, clock and cycle genes on grooming patterns. They find that the pattern of grooming is circadian and is disrupted by mutations that disrupt the circadian pattern of locomotion. They also examine the effect of stress on grooming. The major scientific result seems to be that control of grooming timing and duration are under independent genetic control, which I find plausible.

This work seems to be well done technically, and relatively clearly presented. The grooming detection method will be useful to other researchers. I am not sure of the significance of the scientific findings. It seems like it could be trivially explained by the circadian clock driving arousal, and once an animal is aroused it has to make an exclusive "choice" between grooming or locomoting. Thus, even if the circadian control of grooming is indirect, falling out of the direct control of locomotion by the circadian rhythm, the authors might see these results. Still maybe this counts as circadian control of grooming.

It doesn't feel like the paper is either clearly a methods paper or a circadian investigation. If it is the former, as suggested by the title, more work needs to be done to convey the generality and robustness of the classifier. If it's the latter, as suggested by the Abstract and the amount of Results section narrative devoted to the method (2p) vs circadian experiments (7.5), it seems fine but maybe in the wrong article category.

Lastly, as my expertise is not in circadian behavioral genetics, I can only say that their results in this area seem plausible and well performed but cannot evaluate them in the broader context of that subfield.

1) Were the flies used to assess the classifier accuracy all from the same genotype? If so which one, and do the authors have evidence that the classifier will generalize to other genotypes? Genotypes vary by size and pigmentation and this has the potential to change the value of the classification features and hence the classifier accuracy. Since the paper is given as a methods paper, more effort should be given to conveying the generality and robustness of the approach.

It should be easy to see if classification accuracy changes with the size of the segmented fly body from the data they have in hand, if there's enough natural variation in body size. But I don't think it's unreasonable to explore these questions with new measurements.

2) Subsection “Grooming plays an important role in the daily life of Drosophila” – Using distributions to assess behavioral stereotypy feels misguided, particularly as the authors have individual-level data. Individual flies have different behavioral biases in essentially all behaviors, so it's likely this is true for grooming. This could be true and nevertheless produce a normalized histogram that is relatively narrow. To assess, the authors could look at individual-level data of grooming rates across days. If every individual fly fills out the population distribution (i.e. the behavior is ergodic), then their argument would be supported.

The sentence in Subsection “Grooming plays an important role in the daily life of Drosophila” seems particularly speculative. More appropriate for the discussion? Even there, I don't see any reason why spontaneous behavior should be less idiosyncratic than stimulus evoked behavior. Perhaps it would even go the other way. All in all, this argument is not compelling.

The authors might want to, just for curiosity, look at the individual-level correlation between grooming and sleep.

3) Subsection “Temporal pattern of grooming is under control of the circadian clock” – The authors should clarify what they mean about testing whether grooming is under circadian control. In particular, if locomotion is given as under circadian control, and locomotion and grooming are definitionally exclusive, how can grooming *not* be under circadian control? i.e. What is the result which would falsify the hypothesis that grooming is under circadian control?

As the authors go onto conclude in Subsection “Temporal pattern of grooming is under control of the circadian clock” that circadian systems regulate grooming it's particularly important to clarify what they mean early on.

The reported lag between peak locomotion and peak grooming of 2 hours as suggested by Figure 4E is not particularly convincing to me that these processes aren't trivially coupled in the sense of being both driven by arousal but exclusive of each other. Such an arrangement would be consistent with circadian rhythms driving arousal, and when an aroused animal doesn't walk it grooms. This doesn't feel like the kind of circadian control of grooming the authors are interested in, but they should clarify if this sort of arrangement passes the test.

Also, the data presented for a lag between grooming and running aren't particularly convincing as they appear to only happen 2/4 times. The sample size should be shown in Figure 4F. Better would be to show the data points themselves.

Reviewer #2:

The authors develop methods to automate the detection of grooming (versus rest and locomotion) in Drosophila. They do so by using a trained classifier that relies on k-nearest neighbors clustering, to detect whether the fly periphery moves more than the core (grooming), the periphery and core move similarly (locomotion) or whether neither move much at all (rest). This is all performed on data from flies moving in small tubes, similar to those used extensively for circadian research. The authors manually label some data to assess the false positive and negative rates of their automated method – the algorithm is simple but nonetheless performs robustly with low false positive and negative rates. They then go on to collect data on fly grooming and locomotion over days and report on the statistics of these two behaviors over time, and to investigate whether grooming is modulated by circadian cycle. To do so, they use constant darkness and mutations in the period gene and find that they alter the rhythm of grooming but not the total time spent grooming each day. Finally, they examine mutations in cycle and clock genes, along with starvation conditions, and their effect on grooming rhythms, taking care to account for how flies change locomotor and feeding behavior concomitantly. This leads to the conclusion that there are two separate mechanisms that shape the rhythms of grooming (when flies groom throughout the day) and the duration of grooming (how long they spend grooming). I find the paper overall compelling for establishing a pipeline for studying fly grooming over long timescales – this setup should facilitate studies on the genetic and neural mechanisms underlying the regulation of grooming (versus locomotion, feeding, and sleep) over the course of days. However, there are a number of issues that I feel preclude publication of the manuscript in its current state.

1) After reading the title I expected a methods-heavy paper – however, much of the details on the automated grooming detector are in the Materials and methods section – and there is scant information describing the choices the authors made in how they extracted fly shape or optimized the classifier. Additionally, while the algorithm seems useful for studying grooming behaviors in these particular tubes under IR light, it is not clear how the algorithm would perform if the flies were in other (more natural) environments. The algorithm itself does not seem to be sufficiently innovative on its own to be of general interest, so it seems important for the authors to demonstrate its ability to detect grooming behaviors robustly in a number of conditions (thus making the algorithm useful to a number of investigators). On a related note, the statistics on grooming and locomotor behaviors reported in Figure 3 are likely to be dependent on the particular constraints of the environment these flies are in (housed singly, in thin narrow tubes, etc.) in addition to the genotypes of the flies (for example, Canton S, a lab inbred strain); the authors should be clear about this in the text. In a different environment or with a different genetic background, the relative statistics of the two behaviors might be quite different.

2) The authors bin the locomotor and grooming data into ~3 minute bins, and then investigate rhythmicity using the Lomb-Scargle periodogram (a robust method for identifying rhythms in sparsely sampled data). This corresponds to a sampling rate of 0.006 Hz – thus pushing any high frequency power in the grooming or locomotor rhythm down into the low frequencies (Figure 4C), and it is not clear if the peaks in the power spectrum are significant. The authors should report on the statistical significance of the periodicity (if it exists) and also report on the effect of binning on the rhythm. Because of this issue, it is not clear if and how the period mutants affect the grooming patterns – grooming behavior may fluctuate over the course of the day, but the question the authors have not yet established is whether it is rhythmic.

3) A central claim of the paper is that grooming is not simply a correlate of locomotion/activity but is under separate control by the circadian clock, and a model is used (Figure 2) to show that grooming and locomotion are distinctly patterned during the day. However, the parameters are only useful if the model provides a good description of the data. How well does the model fit the data? Example fits, and quantification of fit quality should be provided. Related to this issue, differences in the temporal patterning of locomotion vs grooming only become apparent in the averaged and normalized data shown in the supplement (Figure S3A). But the normalization could introduce artifacts. For instance, the absolute basal rates for both behaviors appear similar in the raw data (Figure 3F). After normalization, basal grooming rates appear elevated (Figure S3A) – but this is simply an effect of normalization to the max. The authors should ensure that this does not introduce artifacts in their model parameters.

4) The reduced variability of the fraction of time spent grooming vs. locomotion (Figure 3E) could result from the fact that the mean and variance of the underlying variables are correlated, as is common for Bernoulli or Poisson processes. For example, the average rates for locomotion are higher than those for grooming – under a Poisson model, their standard deviation is expected to be higher as well. This trivial explanation should be ruled out, for example by looking at the correlation between individual fly means and standard deviations for locomotion alone (or grooming alone).

Reviewer #3:

The authors of this manuscript have made a significant advance in allowing for automated assessment of grooming behavior over long periods of time, and this will be generally useful to Drosophila researchers. Particularly beneficial is the fact that their system has the ability to monitor basal grooming, as opposed to induced grooming, which should be very informative to the field. The authors have also done a nice job of explaining in detail how their algorithm works, which appears to accurately detect grooming events.

There are several major problems, however, with the specific application of this system and the resulting conclusion that grooming is under circadian control and that the amount of grooming is regulated by the clock and cycle genes. Most concerning is that the authors have failed to use standard assessments, such as periodogram analysis of individual fly grooming behavior, to confirm a circadian pattern of grooming and to determine the robustness of the rhythm. The only quantification of free-running grooming rhythms is Lomb-Scargle periodogram analysis on mean grooming data over 4 days in constant darkness. We therefore have no idea as to the periods and powers of behavior of individual flies. The raw data traces shown suggest very weak rhythms. They should also show how clock and cycle mutations affect the circadian pattern of grooming; it's curious that they only show total amount of grooming for these mutants when a loss of rhythmicity would bolster the idea that grooming behavior is under circadian control.

In addition, the locomotor behavior of wildtype flies as determined by this monitoring system is unusual as it appears that a substantial amount (more than half) of activity is occurring during the dark period (see Figure 3—Figure supplement 1C). This calls into question either the accuracy of the algorithm to determine locomotion, the lighting and noise controls during the experiment, or the genetic background of the wildtype flies used.

Finally, it is not clear that the rhythms of grooming behavior are under direct circadian control as opposed to being a secondary product of sleep-wake cycles. Grooming can only occur when the flies are awake, which makes it very difficult to disentangle from sleep-wake rhythms. The authors have attempted to address this question by comparing the onset of grooming and locomotion during the evening (Figure 4E), but locomotion is not the same thing as wakefulness, and thus it is possible that flies groom upon awakening from the afternoon siesta prior to engaging in large-scale movements. The authors should more closely analyze individual fly behavior to determine the temporal relationship between wakefulness and grooming. Unfortunately, short of identifying a mutant that selectively alters sleep-wake or grooming rhythms without affecting the other, it will be impossible to unequivocally address this concern.

The relationship between wakefulness and grooming might also explain why clkjrk and cyc01 mutants have increased grooming, as both mutants have decreased sleep overall (Hendricks et al., 2003; although note that cyc01 mutants used in this manuscript don't have reduced sleep amount as determined by the algorithm). It would be useful to look at grooming behavior in other short-sleeping mutants to see if it is similarly increased.

Overall, this manuscript represents a nice innovation that will benefit other researchers but has fallen short in using this innovation to uncover novel mechanisms regulating grooming behavior. At a minimum, additional analysis is needed to convincingly demonstrate circadian control of grooming.

[Editors’ note: what now follows is the decision letter after the authors submitted for further consideration.]

Thank you for resubmitting your work entitled "Automated analysis of internally programmed grooming behavior in Drosophila using a k-nearest neighbors classifier" for further consideration at eLife. Your revised article has been favorably evaluated by K VijayRaghavan (Senior editor), Kristin Scott (Reviewing editor), and three reviewers.

The manuscript has been improved but there are some remaining issues that need to be addressed as text changes before acceptance, as outlined below:

1) Independence of grooming cycles from locomotion/wakefulness cycles is challenging to show definitively. At this stage, the authors have provided evidence that grooming is not *trivially* rhythmic because of wakefulness rhythmicity (i.e. there's not a wake/sleep cycle and 1/nth of the time you're awake you groom). Clearly these are coupled processes in that they are hierarchical – you have to be awake to groom. The authors should restate their discussion to say that they haven't definitively demonstrated that grooming rhythms are independent of the sleep-wake cycle.

2) Separability of the modes of regulating grooming is not so clear, since the genes that seem to be required for the separate modes are all part of the core clock machinery. The justification for the idea of two separate programs in the current paper rests primarily on the fact that mutations in the per gene result in loss of grooming rhythms but not grooming amount, while mutations in clock and cyc affect both the rhythm and amount of grooming. The main problem here is that clock and cyc are core elements of the clock, and thus can't be separated from their role in driving rhythmicity. Furthermore, per is part of the negative arm of the clock while clock and cyc are part of the positive arm. Couldn't it be that the positive components, when missing, lead to increased grooming, but the negative components do not? The data do not prove that there are separate programs controlling the timing and amount of grooming. Without that additional evidence, this conclusion should be toned down.

3) To address concerns about binning their data, the authors now test 3 different bin sizes (1, 5, and 30 minute bins) and then search for low frequency rhythms – this seems okay, since for all cases the bin width is much smaller than the underlying rhythm they are trying to detect (24 hour circadian cycle). However, the authors should be able to run the Lomb-Scargle analysis on the data directly, without binning, and still uncover the rhythms, no? If so, this argues against a need for binning at all.

4) In subsection “Automatic grooming detecting system”, the authors say: "we used a system that incorporates features from Drosophila Activity Monitors (DAMs) with a custom video set-up". Based on my understanding, the only feature of the DAM system that the authors have incorporated is the 5 mm diameter glass tubes to house the flies. It seems misleading to say that they are using features of the DAM system, which would imply use of the monitoring technology (i.e. the actual monitor itself).

5) In Figure 2E, the word "periphery" is misspelled.

6) There is no mention of the Pearson correlation test in the methods section. I would recommend that the authors include p values in addition to the correlation coefficient. In Figure 4—figure supplement 1, it would be nice for the order of the Pearson correlation graphs to match the order from Figure 4C.

7) Are the data from Figure 5—figure supplement 2 collected in DD conditions?

8) This is a stylistic consideration, but some of the new figure captions (for example the caption for Figure 6), read like a Results section. They don't clearly explain what is being graphed and instead make conclusions about the data.

9) There are some residual typos, such as missing spaces before references and sporadic double spaces. There are almost certainly other typos that I failed to see in reading it. So careful copy editing should be done to reduce the number of these that make it into print.

10) The authors might want to consider versions of the title along these lines: "Automated classification of grooming behavior in Drosophila reveals independent modes of genetic control". This reflects a few suggestions: (1) I don't think it's essential to mention kNN in the title. Other clustering approaches would presumably work in that space, (2) "internally programmed behavior" still strikes me as an odd framing. I'd rather see a quick summary of their major science finding. Those are my two cents. Happy to leave this up to the authors.

https://doi.org/10.7554/eLife.34497.041

Author response

[Editors’ note: the author responses to the first round of peer review follow.]

Reviewer #1:

1) Were the flies used to assess the classifier accuracy all from the same genotype? If so which one, and do the authors have evidence that the classifier will generalize to other genotypes? Genotypes vary by size and pigmentation and this has the potential to change the value of the classification features and hence the classifier accuracy. Since the paper is given as a methods paper, more effort should be given to conveying the generality and robustness of the approach.

It should be easy to see if classification accuracy changes with the size of the segmented fly body from the data they have in hand, if there's enough natural variation in body size. But I don't think it's unreasonable to explore these questions with new measurements.

This is an excellent point. In the previous version, we overlooked mentioning the genotype. The genotype is now explicit in the manuscript. All training samples shown in Figure 2 are from w1118 flies (N=20). Since individual body sizes vary even within the same genotype, we created the PM, CD and CM feature space (Figure 2F) in terms of the features normalized by individual fly body size, as described in Materials and methods section.

The reviewer is correct – pigmentation and other morphological differences can also potentially affect classifier accuracy. To investigate the influence of these factors, we performed additional analyses assessing the accuracy metrics of the w1118–based training set for individuals from three other commonly used laboratory strains – Canton S, iso31, and yw. Each of the three videos was 10-minutes long and included 20 flies. As shown in Figure 3C, error rates in all tested strains are less than 10%. Details of these studies and their findings are now included in the manuscript. The text now reads (subsection “Behavior classification algorithm”):

“Since size and pigmentation differences between genotypes can potentially affect behavioral classification, we investigated robustness of our w1118-trained classifier with manually-labeled data from Canton S, iso31, and yw strains (10-minute videos with N=20 of each type). As shown in Figure 3C, error rates in all tested strains are less than 10%. Together, these results suggest that our method identifies grooming with high fidelity in several different Drosophila melanogaster strains.”

2) Subsection “Grooming plays an important role in the daily life of Drosophila” – Using distributions to assess behavioral stereotypy feels misguided, particularly as the authors have individual-level data. Individual flies have different behavioral biases in essentially all behaviors, so it's likely this is true for grooming. This could be true and nevertheless produce a normalized histogram that is relatively narrow. To assess, the authors could look at individual-level data of grooming rates across days. If every individual fly fills out the population distribution (i.e. the behavior is ergodic), then their argument would be supported.

We thank the reviewer for raising concerns about the presentation. We should not have presented the data using a normal distribution as it clearly conveyed the wrong point and was not appropriate for the purpose. Our intention was simply to draw attention to the observation that grooming has a smaller population-wide variance. In the revised version, because of the new structure of our work, this comparison between locomotion and grooming is removed from the manuscript. In Author response image 1 we show the confusing presentation and the simple graph that conveys our point.

In the panel on the right, inter-individual differences in daily grooming and locomotion are presented. Each point is the standardized daily grooming or locomotion of an individual fly, calculated by dividing the daily amount of grooming (green) or locomotion (gray) of that fly by the respective population average. The variation in standardized daily grooming time among individuals (coefficient of variation) is significantly less than in locomotion. The coefficient of variation of grooming is 0.14 compared with 0.34 for locomotion. N=66 flies. In the previous version (left panel), standardized daily grooming or locomotion of individual flies were fitted to normal distributions.

The sentence in Subsection “Grooming plays an important role in the daily life of Drosophila” seems particularly speculative. More appropriate for the discussion? Even there, I don't see any reason why spontaneous behavior should be less idiosyncratic than stimulus evoked behavior. Perhaps it would even go the other way. All in all, this argument is not compelling.

The authors might want to, just for curiosity, look at the individual-level correlation between grooming and sleep.

We have removed the comparison of population-wide variance between grooming and locomotion in the revised manuscript, as mentioned above. Thus, this sentence is no longer there.

We thank the reviewer for the advice to look at individual-level correlations. We have added individual-level correlations among grooming, locomotion, feeding, short rest, and sleep of wt flies to Figure 4C, D (iso31+) and to Figure 4—figure supplement 1B, C (Canton S). In addition, correlations between grooming and locomotion and between grooming and sleep in individual sleep mutants sss and fumin, and in circadian mutants per0, clkJRK and cyc01 flies, are provided in Figure 6—figure supplement 1F, G.

3) Subsection “Temporal pattern of grooming is under control of the circadian clock” – The authors should clarify what they mean about testing whether grooming is under circadian control. In particular, if locomotion is given as under circadian control, and locomotion and grooming are definitionally exclusive, how can grooming *not* be under circadian control? i.e. What is the result which would falsify the hypothesis that grooming is under circadian control?

As the authors go on to conclude in Subsection “Temporal pattern of grooming is under control of the circadian clock” that circadian systems regulate grooming, it's particularly important to clarify what they mean early on.

The issue of circadian control of grooming is an important result of this work and we thank the reviewer for pointing out insufficient clarity in the definitions and test criteria. Previous wording may have incorrectly implied that locomotion and grooming are always mutually exclusive. They are mutually exclusive at the level of individual events but not at the level of fractional duration in each behavior. The latter quantity is what we show in time-series and where any long-term oscillations would appear. Within the state of wakefulness, a fly can locomote, groom, feed or rest. As a result, wakefulness can be rhythmic even if only one of the four is rhythmic. Therefore, if we find locomotion to be rhythmic, there is no requirement on either grooming or rest to be rhythmic. To test for the presence of circadian rhythmicity in grooming behavior, it was therefore necessary to monitor the behavior in a constant environment (without any external light or temperature cues) and check for its rhythmicity.

Much of this reasoning was missing in the previous version. But, prompted by the referee’s questions, we have now added substantial additional text starting in the last paragraph of subsection “Flies spend a significant portion of their awake time grooming”. We have also redesigned figures and added new figures to address this important question: Figure 4F, Figure 4—figure supplement 1D and Figure 5—figure supplement 2.

The reported lag between peak locomotion and peak grooming of 2 hours as suggested by Figure 4E is not particularly convincing to me that these processes aren't trivially coupled in the sense of being both driven by arousal but exclusive of each other. Such an arrangement would be consistent with circadian rhythms driving arousal, and when an aroused animal doesn't walk it grooms. This doesn't feel like the kind of circadian control of grooming the authors are interested in, but they should clarify if this sort of arrangement passes the test.

Also, the data presented for a lag between grooming and running aren't particularly convincing as they appear to only happen 2/4 times. The sample size should be shown in Figure 4F. Better would be to show the data points themselves.

We have updated the number of samples, and this figure has been moved to Figure 4—figure supplement 2O, P to make room for new panels in the main figure. Sample size was N=50 iso31+ flies.

Reviewer #2:

1) After reading the title I expected a methods-heavy paper – however, much of the details on the automated grooming detector are in the Materials and methods section – and there is scant information describing the choices the authors made in how they extracted fly shape or optimized the classifier. Additionally, while the algorithm seems useful for studying grooming behaviors in these particular tubes under IR light, it is not clear how the algorithm would perform if the flies were in other (more natural) environments. The algorithm itself does not seem to be sufficiently innovative on its own to be of general interest, so it seems important for the authors to demonstrate its ability to detect grooming behaviors robustly in a number of conditions (thus making the algorithm useful to a number of investigators). On a related note, the statistics on grooming and locomotor behaviors reported in Figure 3 are likely to be dependent on the particular constraints of the environment these flies are in (housed singly, in thin narrow tubes, etc.) in addition to the genotypes of the flies (for example, Canton S, a lab inbred strain)- the authors should be clear about this in the text. In a different environment or with a different genetic background, the relative statistics of the two behaviors might be quite different.

a) We agree with the reviewer that more information about our approach ought to be included outside of Methods and in the main body of the manuscript. We have added more details on fly shape extraction, feature extraction, model optimization, and validation in the Results section. We have put more data on pruning filter size optimization in Figure 3B and accuracy tests of our methods on different wt strains in Figure 3C.

b) This study intended to devise a platform that would add high-throughput detection of fly grooming to the current list of fly circadian behaviors studied in the laboratory. Since locomotor activity and sleep are two of the most commonly studied circadian behaviors and the studies use the same experimental design (Drosophila Activity Monitor (DAM)), we attempted to retain as much of that design as possible in designing the new platform. The choice of ~6 cm long glass tubes to house individual animals retains an important feature of ongoing fly circadian experiments and lowers the technical hurdle of setting up the platform for the interested researcher (who is more likely to be one already studying locomotion and/or sleep). This also means the same fly preliminarily measured using DAM can be promptly placed in the new platform without the need for any special handling/transfer steps. Also, a shared arena design means that conclusions from one platform on some aspects of locomotion and sleep can be assumed to be valid in another, thus speeding up progress.

Although we have not conducted a comprehensive study, limited tests suggest our method should work well in other environments as long as flies are solitary and back-lit with a stable infra-red source.

c) Thank you for bringing up this important point. We agree that some of the behavioral statistics are potentially specific to the environment in which the flies are housed. Some features are clearly dependent on the genotype as well, as revealed by our data (e.g. Figure 4 B-D and Figure 4—figure supplement 1 A-C; Figure 6 and Figure 6—supplement 1). We now further stress these points in subsection “Flies spend a significant portion of their awake time grooming” with additional references:

“It is worth noting here that such behavioral statistics can vary even between wild-type laboratory strains (Colomb et al., 2015; Zalucki et al., 2015).”

2) The authors bin the locomotor and grooming data into ~3 minute bins, and then investigate rhythmicity using the Lomb-Scargle periodogram (a robust method for identifying rhythms in sparsely sampled data). This corresponds to a sampling rate of 0.006Hz – thus pushing any high frequency power in the grooming or locomotor rhythm down into the low frequencies (Figure 4C), and it is not clear if the peaks in the power spectrum are significant. The authors should report on the statistical significance of the periodicity (if it exists) and also report on the effect of binning on the rhythm. Because of this issue, it is not clear if and how the period mutants affect the grooming patterns – grooming behavior may fluctuate over the course of the day, but the question the authors have not yet established is whether it is rhythmic.

We apologize for not reporting significance metrics for the circadian peaks in the original manuscript. We now include a few example power spectra (with p values) and detailed statistics of all tested control and mutant flies in Figure 5D, E. Accompanying text in Results has been updated. A new supplementary figure, Figure 5—figure supplement 1A, has been added to show additional example power spectra (with p=0.05 and p=0.01 cut-offs) of wild-type and mutant flies. These details clearly show that grooming rhythms are statistically significant in wild-type and period-shifted mutants and well below the cut-off in the classic arrhythmic flies.

To test the effect of binning on rhythmicity, we binned grooming activity of individual flies into 30-minute, 5-minute, and 1-minute bins and ran Lomb-Scargle periodogram analysis on these binned data. Examples of 5 individual spectra for each bin size are shown in Figure 5—figure supplement 1C. As shown in the figure, the separation between the statistical cut-off power (at a given p value, horizontal lines) and the peak power increases with smaller bin size or, equivalently, a larger number of data points (N). This is because, in the Lomb-Scargle periodogram, cut-off power grows as ln(N) while peak power grows as N. This detail has been added to the Methods section.
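The scaling argument can be illustrated with a short sketch. It assumes the classical Scargle false-alarm relation p = 1 - (1 - exp(-z0))^M for a periodogram normalized by the noise variance, and further assumes that the number of independent frequencies M equals the number of data points N. Neither assumption is confirmed by the text above, and the four-day record length and function names are hypothetical.

```python
import math

def n_points(days, bin_minutes):
    """Number of samples in a record of `days` days binned at `bin_minutes`."""
    return int(days * 24 * 60 / bin_minutes)

def cutoff_power(n_freqs, p):
    """Power z0 that pure noise exceeds with probability p at any of
    n_freqs independent frequencies: p = 1 - (1 - exp(-z0)) ** n_freqs."""
    return -math.log(1.0 - (1.0 - p) ** (1.0 / n_freqs))

# Hypothetical 4-day record; M = N is an assumption of this sketch.
n_ref = n_points(days=4, bin_minutes=30)
for bin_minutes in (30, 5, 1):
    n = n_points(days=4, bin_minutes=bin_minutes)
    z05 = cutoff_power(n, 0.05)
    z01 = cutoff_power(n, 0.01)
    # Peak power of a fixed-amplitude rhythm is taken to scale linearly with N.
    relative_peak = n / n_ref
    print(f"bin={bin_minutes:>2} min  N={n:>5}  cut-off(p=0.05)={z05:5.2f}  "
          f"cut-off(p=0.01)={z01:5.2f}  peak relative to 30-min bins={relative_peak:4.0f}x")
```

Under these assumptions the cut-off is close to ln(N/p), so going from 30-minute to 1-minute bins raises the cut-off only modestly while the expected peak grows thirtyfold.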

3) A central claim of the paper is that grooming is not simply a correlate of locomotion/activity but is under separate control by the circadian clock, and a model is used (Figure 2) to show that grooming and locomotion are distinctly patterned during the day. However, the parameters are only useful if the model provides a good description of the data. How well does the model fit the data? Example fits, and quantification of fit quality should be provided.

We thank the reviewer for pointing out this missing important detail. A summary about the model and method, including analytic functions used in the fitting procedure and example fits, is now in Figure 4—figure supplement 4. In addition, fitting errors are shown in the accompanying Table 1 and Table 2 in Figure 4—figure supplement 3.

Related to this issue, differences in the temporal patterning of locomotion vs grooming only become apparent in the averaged and normalized data shown in the supplement (Figure S3A). But the normalization could introduce artifacts. For instance, the absolute basal rates for both behaviors appear similar in the raw data (Figure 3F). After normalization, basal grooming rates appear elevated (Figure S3A) – but this is simply an effect of normalization to the max. The authors should ensure that this does not introduce artifacts in their model parameters.

The actual fittings of the data were done on the raw activity and power spectrum of individual flies, without any normalization. It is true that the absolute value of basal rate of grooming is lower than that of locomotion (previous Figure 3D, 3F). The point for normalization in Figure S3A (removed in revised version) was to allow easy comparison of the two behaviors and show the smaller relative day-night difference (smaller amplitude) in grooming than in locomotion.

4) The reduced variability of the fraction of time spent grooming vs. locomotion (Figure 3E) could result from the fact that the mean and variance of the underlying variables are correlated, as is common for Bernoulli or Poisson processes. For example, the average rates for locomotion are higher than those for grooming – under a Poisson model, their standard deviation is expected to be higher as well. This trivial explanation should be ruled out, for example by looking at the correlation between individual fly means and standard deviations for locomotion alone (or grooming alone).

We thank the referee for this advice. Following substantial revision and streamlining, this comparison between locomotion and grooming is no longer part of the current manuscript.

But regarding the now excluded figure: It is true that variance and mean of grooming and locomotion of individual flies could be correlated. In the previous Figure 3E we presented the variability of daily grooming and locomotion among individuals in the population. In the figure, daily grooming or locomotion of individual flies was standardized by dividing by the population average grooming or locomotion (coefficients of variation of grooming and locomotion). Based on the central limit theorem, the distribution of both standardized behaviors should be normally distributed. Thus, in the two distributions that were presented, mean and variance should not be correlated.

Reviewer #3:

There are several major problems, however, with the specific application of this system and the resulting conclusion that grooming is under circadian control and that the amount of grooming is regulated by the clock and cycle genes. Most concerning is that the authors have failed to use standard assessments, such as periodogram analysis of individual fly grooming behavior, to confirm a circadian pattern of grooming and to determine the robustness of the rhythm. The only quantification of free-running grooming rhythms is Lomb-Scargle periodogram analysis on mean grooming data over 4 days in constant darkness. We therefore have no idea as to the periods and powers of behavior of individual flies. The raw data traces shown suggest very weak rhythms. They should also show how clock and cycle mutations affect the circadian pattern of grooming-it's curious that they only show total amount of grooming for these mutants when a loss of rhythmicity would bolster the idea that grooming behavior is under circadian control.

We apologize for this important omission. We have updated Figure 5D, E, Figure 5—figure supplement 1, and Figure 5—figure supplement 3 with examples of individual power spectra and detailed circadian-rhythm related statistics of wild-type, per mutants, clkJRK and cyc01 flies. All single fly power spectra now appear with power at the p=0.05 and p=0.01 levels (horizontal dashed and dash-dot lines) indicated.

These details now offer a clearer picture of circadian rhythmicity in grooming of wild-type and period-shifted mutants and its absence in canonical arrhythmic mutants.

In addition, the locomotor behavior of wildtype flies as determined by this monitoring system is unusual as it appears that a substantial amount (more than half) of activity is occurring during the dark period (see Figure 3—Figure supplement 1C). This calls into question either the accuracy of the algorithm to determine locomotion, the lighting and noise controls during the experiment, or the genetic background of the wildtype flies used.

All per mutant strains we used in this work are backcrossed to an iso31 line with a miniwhite insertion (iso31+). Based on raw data from Figure 3—figure supplement 1C (now Figure 5—figure supplement 3A in the revised version), in 12-12 light-dark experiments these wildtype flies do show a higher level of locomotor activity during the night (~41%) than during the day (~32%). The ratio of night-time to day-time locomotor activity is ~1.28. To validate data from our new platform, we tracked locomotor behavior of the iso31+ flies with the traditional single IR beam monitors in a commercial incubator with better light and noise control than the enclosure used for grooming experiments. Note that activity measurement units differ between the two systems (video data yield time spent in locomotion, whereas IR measurements yield beam breaks per unit time) and are therefore not directly comparable. Examples of single-beam LD data of individual flies, Author response image 2, show more locomotor activity during the dark period than during the light period, the ratio being 1.11, consistent with our video tracking data. The comparison rules out the possibilities that night-time activity is caused by algorithmic error or experimental noise.

Finally, it is not clear that the rhythms of grooming behavior are under direct circadian control as opposed to being a secondary product of sleep-wake cycles. Grooming can only occur when the flies are awake, which makes it very difficult to disentangle from sleep-wake rhythms. The authors have attempted to address this question by comparing the onset of grooming and locomotion during the evening (Figure 4E), but locomotion is not the same thing as wakefulness, and thus it is possible that flies groom upon awakening from the afternoon siesta prior to engaging in large-scale movements. The authors should more closely analyze individual fly behavior to determine the temporal relationship between wakefulness and grooming. Unfortunately, short of identifying a mutant that selectively alters sleep-wake or grooming rhythms without affecting the other, it will be impossible to unequivocally address this concern.

We thank the reviewer for pointing out these subtleties in the relationship between the various behaviors and what they mean for rhythmicity of grooming. Our interpretation of temporal constraints on the behaviors was missing in the previous version and that may have been responsible for the perceived incorrect implication that locomotion and grooming are mutually exclusive in terms of fractional time spent in the two behaviors. At any point in time, flies in our experiments can transition into locomotion, grooming, feeding, short rest, or sleep. Although individual events of these different states are mutually exclusive, once they are binned in time, the behaviors lose their rigid mutual exclusivity. This means that within a given time bin (as in the fractional time data traces) a fly can be engaged in multiple behaviors, as long as the first four (constituting wakefulness) and sleep add up to 1 (Figure 4—figure supplement 1F). The complementary relationship between wakefulness and sleep implies that if one is rhythmic, the other must also be rhythmic. However, rhythmic wakefulness does not require every one of the four constituent behaviors to be rhythmic, only one of them (preferably one in which flies spend the most time). Since locomotion is already widely considered to be rhythmic, there was no prior mathematical burden on grooming, feeding or rest to vary rhythmically. Therefore, to test for the presence of circadian rhythmicity in grooming behavior, it was necessary to monitor the behavior in constant darkness and check for its rhythmicity.
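The constraint argument above can be made concrete with a toy example. The sketch below is not the simulation reported in the manuscript: all fractional times are invented, only locomotion is given a 24-hour oscillation, and the helper names are hypothetical. It checks that the five fractions sum to 1 in every bin and that wakefulness and sleep oscillate even though grooming is flat.

```python
import math

BIN_HOURS = 0.5                      # 30-minute bins
N_BINS = int(4 * 24 / BIN_HOURS)     # four full days

def fractions(t_hours):
    """Hypothetical fractional times in one bin; only locomotion oscillates."""
    locomotion = 0.25 + 0.15 * math.cos(2 * math.pi * t_hours / 24.0)
    grooming, feeding, rest = 0.08, 0.04, 0.10   # constant by construction
    wake = locomotion + grooming + feeding + rest
    sleep = 1.0 - wake                            # complement of wakefulness
    return {"locomotion": locomotion, "grooming": grooming, "feeding": feeding,
            "rest": rest, "wake": wake, "sleep": sleep}

series = [fractions(i * BIN_HOURS) for i in range(N_BINS)]

def amplitude_at_24h(values):
    """Amplitude of the 24-hour Fourier component of an evenly sampled series."""
    n = len(values)
    m = sum(values) / n
    c = sum((v - m) * math.cos(2 * math.pi * i * BIN_HOURS / 24.0)
            for i, v in enumerate(values))
    s = sum((v - m) * math.sin(2 * math.pi * i * BIN_HOURS / 24.0)
            for i, v in enumerate(values))
    return 2.0 * math.hypot(c, s) / n

for name in ("locomotion", "grooming", "wake", "sleep"):
    amp = amplitude_at_24h([row[name] for row in series])
    print(f"{name:>10}: 24-h amplitude = {amp:.3f}")
```

In this toy case rhythmic locomotion alone accounts for the rhythm in wakefulness, which is why a separate test in constant darkness is needed to decide whether grooming itself is rhythmic.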

Much of this reasoning was missing in the previous version. But, prompted by the referee’s question, we have now added additional text (subsection “Flies spend a significant portion of their awake time grooming”), Figure 4F, and Figure 5—figure supplement 2.

Previous Figure panel 4E has been moved to Figure 4—figure supplement 2D in order to make room for additional panels in Figure 4. The purpose of that panel was to simply demonstrate differences in circadian regulation of grooming vs locomotion and grooming vs feeding within the state of wakefulness.

The relationship between wakefulness and grooming might also explain why clkjrk and cyc01 mutants have increased grooming, as both mutants have decreased sleep overall (Hendricks et al., 2003; although note that cyc01 mutants used in this manuscript don't have reduced sleep amount as determined by the algorithm). It would be useful to look at grooming behavior in other short-sleeping mutants to see if it is similarly increased.

Our clkJRK flies groom much more but sleep significantly less than their controls (clkJRK groom 9% and sleep 32% as opposed to control grooming at 6% and sleep at 56%). In contrast, cyc01 flies groom more compared to their control strain (16% vs. 9%) but sleep about the same (51% vs. 52%). This indicates that increased grooming is not necessarily a result of increased wakefulness (or decreased sleep).

Following the referee’s suggestion, we looked at grooming behavior in other short sleep mutants, fumin and sleepless. Compared with wt flies, sleep time in fumin flies decreases significantly, while daily grooming increases only ~1.7% (Figure 6D). On the other hand, sleepless flies show significant decrease in both behaviors (Figure 6E). These trends are also visualized now in Figure 6—figure supplement 1F.

These data again suggest that grooming is not simply correlated with wakefulness.

[Editors' note: the author responses to the re-review follow.]

The manuscript has been improved but there are some remaining issues that need to be addressed as text changes before acceptance, as outlined below:

1) Independence of grooming cycles from locomotion/wakefulness cycles is challenging to show definitively. At this stage, the authors have provided evidence that grooming is not *trivially* rhythmic because of wakefulness rhythmicity (i.e. there's not a wake/sleep cycle and 1/nth of the time you're awake you groom). Clearly these are coupled processes in that they are hierarchical – you have to be awake to groom. The authors should restate their discussion to say that they haven't definitively demonstrated that grooming rhythms are independent of the sleep-wake cycle.

We agree with the reviewer that a more thorough study is required to show complete independence of rhythms in grooming and wakefulness. We simply wish to make the reader aware that rhythmic wakefulness does not directly imply rhythmic grooming – a level of independence permitted within the hierarchical relationship between the two behaviors – only to motivate a need for grooming experiments in constant darkness. Independence in the opposite direction is not critical to our work. We have added the following statements to further alert the reader of the incompleteness of our results (subsection “Temporal pattern of grooming is controlled by the circadian clock”):

“It should be noted here that our simulation results do not demonstrate bidirectional independence of rhythmicity in wakefulness and grooming but, only that rhythmicity of wakefulness does not depend on that of grooming. Demonstration of fully independent rhythms in the two behaviors is beyond the scope of the present study.”

2) Separability of the modes of regulating grooming is not so clear, since the genes that seem to be required for the separate modes are all part of the core clock machinery. The justification for the idea of two separate programs in the current paper rests primarily on the fact that mutations in the per gene result in loss of grooming rhythms but not grooming amount, while mutations in clock and cyc affect both the rhythm and amount of grooming. The main problem here is that clock and cyc are core elements of the clock, and thus can't be separated from their role in driving rhythmicity. Furthermore, per is part of the negative arm of the clock while clock and cyc are part of the positive arm. Couldn't it be that the positive components, when missing, lead to increased grooming, but the negative components do not? The data do not prove that there are separate programs controlling the timing and amount of grooming. Without that additional evidence, this conclusion should be toned down.

We agree with the reviewer that we have not shown conclusively that the two modes of grooming control are independent. The following changes have been made in response to this comment:

Abstract: “One of these programs regulates the timing of grooming and involves the core circadian clock components cycle, clock, and period. The second program regulates the duration of grooming and, while dependent on cycle and clock, appears to be independent of period.”

Subsection “Grooming duration is controlled by cycle and clock”: “Importantly, together with per0 data, the results raise the possibility of non-circadian roles for cyc and clk in setting the duration of internally driven grooming in Drosophila.”

Subsection “Grooming duration is controlled by cycle and clock”: “The apparent absence of per from the second regulatory mechanism is consistent with the possibility that the two control mechanisms <strike>are</strike> operate independently.”

Discussion section: “…implying that the changes in grooming level <strike>are not</strike> may not be due to circadian defects. These data are consistent with the hypothesis that clock-independent but cyc- and clk- dependent pathways regulate the amount of programmed grooming behavior”.

Discussion section: “…suggest that this innate behavior is driven by two possibly distinct sets of regulatory systems.”

3) To address concerns about binning their data, the authors now test 3 different bin sizes (1, 5, and 30 minute bins) and then search for low frequency rhythms – this seems okay, since for all cases the bin width is much smaller than the underlying rhythm they are trying to detect (24 hour circadian cycle). However, the authors should be able to run the Lomb-Scargle analysis on the data directly, without binning, and still uncover the rhythms, no? If so, this argues against a need for binning at all.

We thank the reviewer for highlighting this point. The time series in Figures 4, 5, Figure 5—figure supplement 1 and Figure 5—figure supplement 2 were binned to bring out long timescale patterns in the data. Display of the raw 5 Hz data would not demonstrate this point as effectively, as the raw data would be dominated by short timescale fluctuations. A 30-minute bin size was chosen as a balance between averaging over the short-time fluctuations and retaining the ~24 hr oscillations. As the reviewer correctly points out, periodogram analysis can be done on the raw data. We have now added an additional column in Figure 5—figure supplement 1C showing Lomb-Scargle analysis of the 5 Hz data. Compared to the binned data, the difference in power between the circadian peak and the statistical cut-off is even larger for the 5 Hz data, in accordance with our statement in subsection “Spectral analysis” that in the power spectrum the peak power grows as N while the cut-off power grows as log N.

4) In subsection “Automatic grooming detecting system”, the authors say: "We used a system that incorporates features from Drosophila Activity Monitors (DAMs) with a custom video set-up". Based on my understanding, the only feature of the DAM system that the authors have incorporated is the 5mm diameter glass tubes to house the flies. It seems misleading to say that they are using features of the DAM system, which would imply use of the monitoring technology (i.e. the actual monitor itself).

We concur with the reviewer that the sentence may lead to confusion. The statement has been replaced with a new one that does not refer to the DAM system:

“We used a custom-designed video set-up to monitor fly behavior.”

5) In Figure 2E, the word "periphery" is misspelled.

We thank the reviewer for pointing out the error. The spelling has been corrected.

6) There is no mention of the Pearson correlation test in the methods section. I would recommend that the authors include p values in addition to the correlation coefficient. In Figure 4—figure supplement 1, it would be nice for the order of the Pearson correlation graphs to match the order from Figure 4C.

We apologize for not reporting significance metrics for the Pearson correlation coefficient in the manuscript. We now include p-values for the Pearson coefficient in all correlation analyses. When we tested the bivariate normality of pairs of data in each correlation graph, we noticed that some pairs deviated strongly from a normal distribution. Thus, instead of testing significance of r values with Student’s t-test, we applied the bootstrap method for calculation of p-values. We now include an example empirical distribution of r values in Figure 4—figure supplement 1D and provide all p-values in Figure 4—figure supplement 1E. Details of the test are described in the Materials and methods subsection “Statistics” as follows:

“In Figure 4C, D and Figure 4—figure supplement 1B, C, the Pearson correlation coefficient r for each pair of data was calculated according to the standard definition

$$r_{X,Y} = \frac{E\left[(X-\mu_X)(Y-\mu_Y)\right]}{\sigma_X \sigma_Y}$$

where X and Y are time spent in two behaviors X and Y, r_{X,Y} is the Pearson correlation coefficient between the two behaviors, E[·] is the expectation value, and μ and σ are, respectively, the mean value and standard deviation of a behavior. The statistical significance of r was estimated through bootstrapping. For each pair of behaviors, we randomly paired data from n flies (n=84 for iso31+ and n=76 for Canton S) and calculated a correlation coefficient r. This process was repeated 100,000 times and the empirical distribution of the randomly paired r values was used for a two-tailed test (Figure 4—figure supplement 1D). p-values for all Pearson correlation coefficients are presented in Figure 4—figure supplement 1E.”
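A minimal sketch of this kind of random-pairing test is given below. It is not the authors' code. The quoted text does not say whether flies were re-paired with or without replacement, so the sketch shuffles one variable against the other (a permutation-style variant). The add-one correction in the p-value, the function names, and the example numbers are all assumptions made for illustration.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation coefficient from its standard definition."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def random_pairing_p_value(x, y, n_resamples=10000, seed=0):
    """Two-tailed p-value from the null distribution of randomly paired r values."""
    rng = random.Random(seed)
    observed = pearson_r(x, y)
    shuffled = list(y)
    extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(shuffled)                      # break the fly-by-fly pairing
        if abs(pearson_r(x, shuffled)) >= abs(observed):
            extreme += 1
    # Add-one correction (an assumption here) keeps the estimate above zero.
    return observed, (extreme + 1) / (n_resamples + 1)

# Hypothetical fractional times for eight flies (not data from the study).
grooming = [0.12, 0.15, 0.10, 0.14, 0.13, 0.11, 0.16, 0.12]
sleep = [0.55, 0.48, 0.60, 0.50, 0.52, 0.58, 0.45, 0.56]
r, p = random_pairing_p_value(grooming, sleep)
print(f"r = {r:.2f}, p = {p:.4f}")
```

Because the null distribution is built from the data themselves, the test does not rely on bivariate normality, which is the reason given above for preferring it to Student's t-test.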

Also, the order of the Pearson correlation graphs in Figure 4—figure supplement 1B, C is now matched with that in Figure 4C, D.

7) Are the data from Figure 5—figure supplement 2 collected in DD conditions?

Yes, they are. We noticed that this information was previously mentioned only in the main text. Now, it is specified in the figure caption as well.

“Time series in the four examples were taken in constant darkness (DD) and[…]”

8) This is a stylistic consideration, but some of the new figure captions (for example the caption for Figure 6), read like a Results section. They don't clearly explain what is being graphed and instead make conclusions about the data.

We regret not having included details. We have added more information about the figure in caption for Figure 6 and now it reads:

“In each panel, bar plots on left show average fractional time spent in grooming in mutant and control flies. Pie charts on right present average fractional time spent in grooming (green), locomotion (gray), sleep (dark gray), short rest (purple) and feeding (blue). Here, numerical values for fractional time spent in behavior are indicated only for grooming, locomotion and sleep[…]”

9) There are some residual typos, such as missing spaces before references and sporadic double spaces. There are almost certainly other typos that I failed to see in reading it. So careful copy editing should be done to reduce the number of these that make it into print.

We thank the reviewer for this comment and have now corrected 14 similar typos.

10) The authors might want to consider versions of the title along these lines: "Automated classification of grooming behavior in Drosophila reveals independent modes of genetic control". This reflects a few suggestions: (1) I don't think it's essential to mention kNN in the title. Other clustering approaches would presumably work in that space, (2) "internally programmed behavior" still strikes me as an odd framing. I'd rather see a quick summary of their major science finding. Those are my two cents. Happy to leave this up to the authors.

We appreciate the referee’s suggestion. Considering (1) that this is a methods/technique paper, (2) the referee comment above about further ‘toning down’ the grooming control model we propose, and (3) the suggestion to remove ‘internally programmed’ from the title, we decided to keep ‘kNN’, not mention ‘independent control’, and replace ‘internally programmed’ with ‘long-term’ in the title. The new title is:

“Automated analysis of long-term grooming behavior in Drosophila using a k-nearest neighbors classifier”

https://doi.org/10.7554/eLife.34497.042

Article and author information

Author details

  1. Bing Qiao

    Department of Physics, University of Miami, Coral Gables, United States
    Contribution
    Data curation, Software, Formal analysis, Validation, Investigation, Visualization, Writing—original draft
    Contributed equally with
    Chiyuan Li
    Competing interests
    No competing interests declared
  2. Chiyuan Li

    Department of Physics, University of Miami, Coral Gables, United States
    Contribution
    Conceptualization, Software, Formal analysis, Methodology, Writing—original draft
    Contributed equally with
    Bing Qiao
    Competing interests
    No competing interests declared
  3. Victoria W Allen

    Department of Genetics and Development, Columbia University, New York, United States
    Contribution
    Investigation, Writing—review and editing
    Competing interests
    No competing interests declared
  4. Mimi Shirasu-Hiza

    Department of Genetics and Development, Columbia University, New York, United States
    Contribution
    Funding acquisition, Writing—review and editing
    Competing interests
    No competing interests declared
  5. Sheyum Syed

    Department of Physics, University of Miami, Coral Gables, United States
    Contribution
    Conceptualization, Supervision, Funding acquisition, Project administration, Writing—review and editing
    For correspondence
    s.syed@miami.edu
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-4642-6678

Funding

National Science Foundation (IOS-1656603)

  • Sheyum Syed

National Institutes of Health (R01GM105775)

  • Mimi Shirasu-Hiza

National Institutes of Health (R01AG045842)

  • Mimi M Shirasu-Hiza

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

This work was partially supported by the National Science Foundation under grant IOS-1656603 to SS and by National Institutes of Health grants R01GM105775 and R01AG045842 to MSH. The authors are grateful to William Ja, F Rob Jackson, Amita Sehgal and Michael Young for providing fly strains, Juan Lopez and Manuel Collazo for technical support and Stanislav Lazopulo and Andrey Lazopulo for suggestions and assistance with experiments. We thank Alan Li and Gadi Trocki for helpful comments on the manuscript.

Reviewing Editor

  1. Kristin Scott, University of California, Berkeley, Berkeley, United States

Publication history

  1. Received: December 20, 2017
  2. Accepted: February 26, 2018
  3. Accepted Manuscript published: February 27, 2018 (version 1)
  4. Version of Record published: March 20, 2018 (version 2)

Copyright

© 2018, Qiao et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 1,544
    Page views
  • 228
    Downloads
  • 0
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.
