# Dynamic structure of locomotor behavior in walking fruit flies

1. Stanford University, United States
2. MIT Electrical Engineering and Computer Science Department, United States
3. University of Illinois at Urbana-Champaign, United States
7 figures and 4 tables

## Figures

Figure 1 Velocity-acceleration phase space captures aggregate constraints on behavior. (A) An example trajectory fragment of an individual fly walking on a surface. The animal’s orientation (red arrows) and displacement (gray arrows) are tracked at 30 frames per second. (B) A schematic illustration of the three velocity components v→=(vT,vR,vS) around the major body axis of a fly. These three components completely describe whole-animal movement in two dimensions. (C) Example trajectory in velocity phase space parameterized by v→=(vT,vR,vS). Red arrows denote the magnitude of the acceleration vector at each velocity. (D) A single plane in the 3-dimensional phase space, E[v→˙∣v→], v→=(vT,vR=0,vS) (black arrows), with an example trajectory (red arrows), projected into this plane (x-axis: vS; y-axis: vT). https://doi.org/10.7554/eLife.26410.002
Figure 2 Trajectory divergence in velocity phase space and definition of a behavior episode around vR extrema. (A) Starting from a small neighborhood of similar velocities (box), diverging trajectories spread throughout velocity phase space, increasing in a measure of their spread. (B) Starting from the same neighborhood, divergence of trajectories corresponding to modal patterns is bounded for the duration of each pattern. (C) Bounded or unbounded divergence can be seen when modal patterns overlap in velocity phase space, depending on when trajectories are sampled in the course of these patterns. (D) The normalized standard deviation of trajectories at time t after passage through a small neighborhood, averaged over a random sample of NB=200 neighborhoods, σ¯(t)=1σ[vT]σ[vR]σ[vS]⟨σt[vT]σt[vR]σt[vS]⟩NB (Equation 1, Materials and methods). Shaded areas represent 95% confidence intervals of the estimate of ¯⁢(t). Neighborhoods were sampled from anywhere in velocity space, and for each neighborhood all trajectories were included (dot-solid line), or only those with a turn peak 100 ms after passing the neighborhood (dashed line). (E) An example trajectory fragment depicting vR(τ) [° s−1], vT(τ) and vS(τ) [cm s−1]. (F) The autocorrelation functions of vR(τ),vT(τ) and vS(τ). https://doi.org/10.7554/eLife.26410.003
Figure 3 Iterative ICA segmentation reveals dynamic submodes. (A) Iterative ICA procedure. Local vR peaks are identified in a single trajectory (red dots). Fragments of trajectories are then selected around vR (dashed lines), and all velocity components are aligned in time on vR extrema in matrix M. In M, turn direction is normalized while preserving the relative direction of side slip by multiplying vR and vS by c, the sign of vR extremum at aligned time 0. M dimensionality is then reduced and independent components are identified. In these independent components, velocity trajectories are represented by a coefficient in each component. If a multimodal coefficient distribution is found, coefficient clusters corresponding to distinct trajectory fragment subsets are separated, and the entire procedure is repeated iteratively on each fragment subset to identify distinct submodes. (B) Segmentation tree showing decision points (triangles), and submodes (dashed circles). 15 submodes were identified, but the decision tree is drawn up to branching levels supported by Markov modeling (Figure 5, Figure 5—figure supplement 1). Submodes are described in Figure 4 and Figure 4—figure supplement 1. https://doi.org/10.7554/eLife.26410.004
Figure 4 with 1 supplement Submode velocity profiles for each submode. (1-15) Velocity distributions over time for each identified submode. The central 68% of the distribution is shown at each time point for up to 200 ms around peak vR. The time within each submode relative turn peak is color-coded. Submodes are shown in columns, by membership in modes I−V (Figure 5). (1’−15’) Example trajectory fragment of each submode type is illustrated. Three time points are shown per trajectory, each spaced 100 ms apart. A diagrammed fly illustrates the centroid location and orientation of an animal at each time point. Note that only one time point illustrates submode 8 because this submode consists of stops. https://doi.org/10.7554/eLife.26410.007
Figure 4—figure supplement 1 Example trajectory fragments from each submode class. Black arrows represent a fly’s body orientation, shown at 33 time steps. Gray horizontal scale bars = 3 mm, a representative length of an adult D. melanogaster female. Ten samples from each submode were drawn randomly. The time interval shown is between the two acceleration peaks flanking each submode, taken as a period between behavior adjustments as described in Main Text. https://doi.org/10.7554/eLife.26410.008
Figure 5 with 2 supplements Identified movement patterns as distributions in velocity phase space. (A–E) Top: Modes are described as velocity distributions over time, termed ‘velocity profiles.’The central 68% of the velocity distribution at each time point around peak vR is shown. The time relative to a turn peak within each mode is color-coded; green represents time of peak vR. Bottom: Example trajectory fragments from each mode are illustrated. Except for M⁢o⁢d⁢e⁢I, two trajectories are shown per mode, three time points per trajectory, each spaced 100 ms apart. A diagrammed fly illustrates the centroid location and orientation of an animal at each time point. Note that only one time point illustrates M⁢o⁢d⁢e⁢I because this mode consists of stops. (F) Projection of all mode velocity profiles into a vT−vR plane. Note that modes overlap in this joint velocity projection, and modes spanning smaller velocity regions are plotted on top of broader modes. (G) Residence time distributions, by mode, from real (light blue) and model-generated (black) trajectories. Two model-generated distributions are plotted for m⁢o⁢d⁢e⁢I (left panel): a Markov model without hidden states captures short residence times in the real distribution, while a Markov model that included an additional, non-communicating hidden state captures both short and long residence times. (H) The chosen 5-state Markov model, M⁢M⁢5. Edges are labeled with transition probabilities between modes. This model does not represent hidden substates of m⁢o⁢d⁢e⁢I. https://doi.org/10.7554/eLife.26410.009
Figure 5—figure supplement 1 Grouping submodes by kinetic criteria to define a behavioral model. (A) Statistical relationship of submodes assessed using sampling from generative models. Short trajectory fragments generated using statistics from all 15 submodes individually (gray bars) or submode groups (orange, red bars) were compared to real trajectory fragments (m⁢e⁢a⁢n±s.d. duration = 1.3±0.3 sec). To generate synthetic sequences, submode transition statistics were either estimated directly from submode sequences in real trajectories (models I,M1,M2), and then used to generate synthetic submode sequences, or transition statistics from state model fits to data were first used to generate state sequences, and then emission statistics fit by each model were used to sample submodes associated with each state, producing submode sequences (HMM,MM). In all cases, submode sequences were compared between real and model-generated sets of sequences. The matched fraction (f) of real sequences, quantifying set overlap, is the average number of exact matches between synthetic and real fragment sets, normalized by the average number of matches between two randomly sampled real sets (Materials and methods sections IV and V, Equation 10). First- and second-order Markov models (M1,M2) perform significantly better than a model that presumes independence between submodes (I) (fI=0.58±0.016;fM⁢1=0.93±0.009;fM⁢2=0.96±0.008). Five and six state Hidden Markov Models (HMM5,6), as well as 5 and 6 state Markov Models (MM5,6) without hidden states perform well on this generative test (fH⁢M⁢M⁢5=0.91±0.007;fH⁢M⁢M⁢6=0.94±0.009;fM⁢M⁢5=0.88±0.010;fM⁢M⁢6=0.89±0.006). (B) Illustration of the relevant difference between HMM and MM: in a MM (right) observable submodes (u1,u2,u3) unambiguously reflect an underlying state qi (termed a M⁢o⁢d⁢e,mi). In an HMM (left) a given observed submode uj may arise from more than one underlying state qi. (C) HMM mean log-likelihood values, plotted against performance on a generative test (fragment set matched fraction, f). Each HMM is represented by a dot and colored by the number of hidden states. (D) Emission probabilities, Pr⁡[s⁢u⁢b⁢m⁢o⁢d⁢e⁢uj∣s⁢t⁢a⁢t⁢e⁢qi] for a selected 5-state HMM (HMM5, starred in C). Bar colors represent emitting state qi. Legends list the submodes associated with each state. These are submodes u for which Pr⁡[uj∣qi] is significantly higher than 0. (E) MM performance on likelihood and generative tests as in (C). (F) Emission probabilities, Pr⁡[s⁢u⁢b⁢m⁢o⁢d⁢e⁢uj∣s⁢t⁢a⁢t⁢e⁢qi=M⁢o⁢d⁢e⁢mi], for the selected MM (MM5, starred in E, and E inset). Modes are colored as in Figure 5. https://doi.org/10.7554/eLife.26410.010
Figure 5—figure supplement 2 Information between modes in sequences. The fraction of uncertainty about mode identity in sequences, explained by the previous mode (ρ) and the mode before that (ρ2), plotted for different arbitrary models that group submodes into five groups in different ways. Each circle represents a unique grouping. The result of the actual grouping identified in Figure 5—figure supplement 1, in MM5, is shown in red. For mode sequences of any length k>2,(mi)i=1k,m∈{I⁢…⁢V},ρ=I⁢(mi;mi-1)/H⁢(m) and ρ2=I(mi;mi-2∣mi-1)/H(mi∣mi-1), where I is mutual information and H is Shannon entropy (Cover and Thomas, 1991). https://doi.org/10.7554/eLife.26410.011
Figure 6 with 1 supplement Submode segmentation in two datasets, S and L. Histograms of trajectory coefficients s(j) in independent component space (Box 1 and Materials and methods) are shown at decision points of the iterative segmentation procedure. Heat maps show counts in two dimensions of the independent component space, and line plots show counts in one dimension. Decision points (numbered triangles) correspond to decision points labeled on the segmentation tree in Figure 3B. Subsets of coefficients separated at each decision point are shown in the same dimensions, as noted. Plot dimensions do not necessarily correspond to independent components, as sometimes data was rotated in IC space to visualize multimodal coefficient distributions. Trajectories from two different experimental conditions, corresponding to datasets S and L, were projected in the same independent components, and their coefficients are shown on the same or adjacent plots, as noted below. The initial decision point (1) and all terminal decision points (2,4–7) are shown. The coefficient distribution at decision point three is not shown because it could not be represented in two dimensions. Decision pt. (1). (j) histograms in two dimensions, showing the multimodal distribution of coefficients from the entire dataset (Top plots), and coefficient clusters 1 and 2 after they were separated at this decision point (Bottom plots). Decision pt. (2) Data projection in two dimensions before segmentation, showing the segmentation cut (white oval), and the dwell time distributions for each resulting submode, along with single exponential fits. Note that dwell time distributions are consistent with one dominant component for each segmented submode, despite differences in the joint velocity distributions and dwell times between datasets S and L. Decision pts. (4-7) s(j) histograms in one dimension, as used for segmentation. Note that the distribution at decision point six is plotted on a log scale, where segmentation separates the long tail. Decision pt. seven is upstream of further segmentations that were not validated by Markov models. As these submodes are very low frequency, they may represent distinct submodes that can be validated using larger datasets. The 5-state MM5 groups submodes as follows: Mode I: u8; Mode II: u7,u3,u10−13; Mode III: u9,u4,u5,u14−15; Mode IV: u2; Mode V: u1,u6. https://doi.org/10.7554/eLife.26410.013
Figure 6—figure supplement 1 Model connectivity and transition frequencies in two datasets. Transition probabilities of the 5-state Markov Model MM5 are shown for dataset S (A, same as Figure 5H), and dataset L (B). https://doi.org/10.7554/eLife.26410.014
Figure 7 with 2 supplements Model prediction and trajectory divergence over time. (A) Matched fraction (f) of mode sequences from real trajectories as a function of sequence length. Synthetic mode sequences were generated using the transition component of the model MM5. Red dashed line marks f=90%, red tick and arrow mark the sequence length when f first crosses below 90%. (B) Trajectory divergence in velocity space from randomly sampled small neighborhoods, without and with information about current mode identity. Two divergence curves from Figure 2D are replotted over a longer time scale: starting from neighborhoods anywhere in velocity space, and including all trajectories (dot-solid line), or including only trajectories that attained a turn peak 100 ms later(dashed line), or only those with a turn peak 100 ms later corresponding to a given mode (solid line). Dashed and solid lines overlap for their durations. Note the short ¯ plateau, and rise after approximately 200 ms. (C) Predicted versus actual log-frequency of all mode sequences above measurement noise, up to trajectories that include 17 consecutive modes, covering 4.3 s. Markers are colored by sequence length, from shortest (2-mer, dark blue) to longest (17-mer, dark red). R=1.00,n=60,p≪0.001; Sampling error cutoff is at freq < 12(2.58N−12), N = number of sequences in each condition. Ten of these sequences (#2–11) were randomly selected for display in Figure 7—figure supplement 1, plus the most common sequence (#1). (D) Illustration of velocity trajectories specified by a stable dynamical system. Two trajectories starting from similar velocities (arrowheads, bottom), follow similar paths over time that show little divergence some time later (arrowheads, top). (E) Illustration of mode structure and transitions in velocity space. Trajectories from a small region of velocity phase space (black and red arrows) display bounded divergence for the duration of a mode (gray area) but display approximately no memory of past behavior at transitions (gaps between gray regions) beyond their current mode. Note divergence in the course of mode traversal may lead to path scrambling inside modes (jagged lines). https://doi.org/10.7554/eLife.26410.017
Figure 7—figure supplement 1 Examples of common mode sequences. Arrows represent a fly’s body orientation, shown at 33 time steps. Examples are colored by membership in a mode class of Markov model MM5, with each mode center marked by a circle on the trajectory. Gray horizontal scale bars = 3 mm, a representative length of an adult D. melanogaster female. Numbers reference locations in Figure 5C, ascending from the most common to the rarest sequence shown. https://doi.org/10.7554/eLife.26410.018
Figure 7—figure supplement 2 Trajectory divergence in velocity space, measured from the center or periphery of each mode’s velocity distribution. Trajectory divergence from small neighborhoods in velocity space (occupying 0.004% of space) was measured separately by mode, and by each neighborhood’s quartile in that mode’s velocity distribution Pr[vT,vR,vS∣t′], at t′=-100ms. Divergence from individual neighborhoods was then averaged in the center (P25.5-74.5) and in the periphery (P1-25.5⁢and⁢P74.5-99) for all modes (center, dot-solid line; periphery, solid line). Shaded areas represent 95% confidence intervals for estimates of ¯⁢(t). https://doi.org/10.7554/eLife.26410.019

## Tables

Table 1
Table 2
Table 3
Table 4

A two-part list of links to download the article, or parts of the article, in various formats.