Integrative models of visually guided steering in Drosophila

  1. Department of Electronic Engineering, Hanyang University, Seoul, Republic of Korea
  2. Department of Artificial Intelligence, Hanyang University, Seoul, Republic of Korea

Peer review process

Revised: This Reviewed Preprint has been revised by the authors in response to the previous round of peer review; the eLife assessment and the public reviews have been updated where necessary by the editors and peer reviewers.

Read more about eLife’s peer review process.

Editors

  • Reviewing Editor
    John Tuthill
    University of Washington, Seattle, United States of America
  • Senior Editor
    Albert Cardona
    University of Cambridge, Cambridge, United Kingdom

Reviewer #1 (Public review):

This study provides an integrative model of the visuomotor control in Drosophila melanogaster. This model presents an experimentally derived model based on visually evoked wingbeat pattern recordings of three strategically selected visual stimulus types with well-established behavioral response characteristics. By testing variations of these models, the authors demonstrate that the virtual model behavior can recapitulate the recorded wing beat behavioral results and those recorded by others for these specific stimuli when presented individually. Yet, the novelty of this study and their model is that it allows predictions for natural visual scenes in which multiple visual stimuli occur simultaneously and may have opposite or enhancing effects on behavior. Testing three models that would allow interactions of these visual modalities, the authors show that using a visual efference copy signal allows visual streams to interact, replicating behavior recorded when multiple stimuli are presented simultaneously. Importantly, they validated the prediction of this model in real flies using magnetically tethered flies, e.g., presenting moving bars with varying backgrounds. In conclusion, the presented manuscript presents a commendable effort in developing and demonstrating the validity of a mixture model that enables predictions of Drosophila behavior in natural visual environments.

The manuscript employs a thorough, logical approach, combining computational modeling with experimental behavioral validation using magnetically tethered flies. This iterative integration of simulation and empirical behavioral evidence enhances the credibility of the findings. The quantitative models and validating behavioral experiments make this a valuable contribution to the field. This study is well executed and addresses a significant gap in the modeling of fly behavior and holistic understanding of visuomotor behaviors.

The associated code base is well documented and readily produces all figures in the document.

Reviewer #2 (Public review):

Summary:

The fly visual circuit and its behavioral response to simple visual stimuli have been well investigated, yet how they respond to more complex visual patterns is less understood. Canelo et al. first characterized a fly's steering to simple stimuli and examined how the combination of those stimuli impacts behavior. Combining behavioral experiments and simulation, the authors found that, for some combinations, a behavioral response can be explained by a linear summation of responses to individual stimuli. However, for looming and background motion combinations, the behavioral response to one was suppressed by the other. Furthermore, the effect was dependent on the onset timing of the pair of stimuli.

Strength:

The authors tested various visual stimulus patterns and time delays between combinations of visual stimuli and found novel interactions in behavior. Their findings support the idea that, depending on the visual context, additional mechanisms kick into the visual-motor circuit to coordinate steering behavior flexibly.

Weakness:

The manuscript does not provide conclusive evidence on the presence of an efference copy signal, though there appears to be an intention to associate it with the result. However, demonstrating it is likely to be beyond the main scope of the revised version.

The goal of this manuscript is to understand how the fly's steering behavior is coordinated upon complex visual stimuli, and a number of experiments and simulations support their conclusion.

The behavioral findings presented in this paper will be helpful in further dissecting the underlying neural mechanisms of contextual sensory processing and in understanding visual processing in other species.

Reviewer #3 (Public review):

Summary:

Canelo et al. used a combination of mathematical modeling and behavioral experiments to ask how flies orient to visual features and stabilize their gaze. In particular, the authors propose three models of visuomotor control, which lead to specific experimental predictions. With the goal of teasing out the suggested models, the authors design three flight experiments: 1) a bar-background experiment, 2) a looming-background experiment, and 3) a bar-background statistics experiment. The authors claim that: experiment 1 data favor the addition-only and graded EC model; experiment 2 data favor the all-or-none EC model; experiment 3 appears to suggest a graded EC model.

While the study is interesting, there are major issues with the conceptual framework. In general, there is a major disconnect between model and animal data. The manuscript lacks a statistical framework to support or refute the proposed models. In the end, it is unclear what are the main conclusions of the manuscript and contributions to the field.

Strengths:

They ask a significant question related to efference copies during volitional movement.

The figures are overall clear and salient.

Weaknesses:

Comparison of model to fly data:
In general, the manuscript suffers from a lack of quantitative comparisons between proposed models and fly data, which compromises the main findings of the work. While Figure 1-Fig. supplement 1 shows a direct comparison between experiment and model predictions, puzzlingly there is no such quantitative comparison in the main manuscript for the faster moving stimuli. Please overlay model predictions and experimental data and provide statistical comparisons throughout. The 3 proposed models are hypotheses, but there is no statistical framework to reject or support the models/hypotheses. Further, there is a disconnect between the new flight experiments and models. In fact, we do not see the model predictions for the set of experimental conditions tested in Figs. 5-7.

Concerns about mechanical model: I have several concerns regarding the biomechanics block in Figure 2:

(1) The inertia coefficient, derived from free flight studies. does not take into account the fact that the center of rotation and center of mass do not align in the magnetic tether (see Bender & Dickinson, 2006 for estimates). This must be corrected using the parallel axis theorem. As the authors compare the model prediction to experimental data in a magnetic tether, it is critical that they revise their analysis.

(2) According to their chosen inertia and damping constants, they would estimate that the I/C time constant is ~1E-3 ms, which is much much smaller than what has been estimated for yaw turns in the magnetic tether (200 ms; Bender & Dickinson, 2006) or free flight saccades (~17 ms; see Cheng et al., 2010; 10.1242/jeb.038778). The bottom line is that the current model underestimates the influence of inertia in turn manoeuvres, i.e. the aerodynamic damping is cranked up too high relative to yaw inertia. This may explain the mismatch between data and model that the authors posit, "What causes the fly to undershoot the movement of the target object in the magnetically tethered assay? One hypothesis is that strong upward magnetic force or a blunt top end of the steel pin significantly dampens the flies' flight turns."

Loom response experiment:
As nicely shown by 10.1242/jeb.02369, visual stimulation of looming stimuli in the magnetic tether evokes saccades. Is it the case as well in Fig. 6? Without showing individual trials, it is not possible to know whether this is the case. If indeed saccades are present, then the authors will have to reframe their results given the physiological evidence for saccade-related cancellation signals and the three proposed models.

Minor comments:

Missing Equation 13 for saccade model in Methods.

For the discussion and results related to flight responses to the mismatch between expected and actual visual feedback, which is germane to the proposed models, the authors should integrate a discussion of a recent paper which directly tested this idea through an augmented reality system: 10.1016/j.cub.2023.11.045. In particular, the authors argue that the optomotor response is not particularly flexible because it may not rely on an internal model, as suggested by recent physiological evidence (Fenk et al.). How do these findings relate to the 3 proposed models within your work?

Author response:

The following is the authors’ response to the original reviews.

Reviewer #1 (Public Review):

Summary:

The manuscript "Drosophila Visuomotor Integration: An Integrative Model and Behavioral Evidence of Visual Efference Copy" provides an integrative model of the visuomotor control in Drosophila melanogaster. This model presents an experimentally derived model based on visually evoked wingbeat pattern recordings of three strategically selected visual stimulus types with well-established behavioral response characteristics. By testing variations of these models, the authors demonstrate that the virtual model behavior can recapitulate the recorded wing beat behavioral results and those recorded by others for these specific stimuli when presented individually. Yet, the novelty of this study and their model is that it allows predictions for natural visual scenes in which multiple visual stimuli occur simultaneously and may have opposite or enhancing effects on behavior. Testing three models that would allow interactions of these visual modalities, the authors show that using a visual efference copy signal allows visual streams to interact, replicating behavior recorded when multiple stimuli are presented simultaneously. Importantly, they validated the prediction of this model in real flies using magnetically tethered flies, e.g., presenting moving bars with varying backgrounds. In conclusion, the presented manuscript presents a commendable effort in developing and demonstrating the validity of a mixture model that allows predictions of the behavior of Drosophila in natural visual environments.

Strengths:

Overall, the manuscript is well-structured and clear in its presentation, and the modeling and experimental research are methodically conducted and illustrated in visually appealing and easy-to-understand figures and their captions.

The manuscript employs a thorough, logical approach, combining computational modeling with experimental behavioral validation using magnetically tethered flies. This iterative integration of simulation and empirical behavioral evidence enhances the credibility of the findings.

The associated code base is well documented and readily produces all figures in the document.

Suggestions:

However, while the experiments provide evidence for the use of a visual efference copy, the manuscript would be even more impressive if it presented specific predictions for the neural implementation or even neurophysiological data to support this model. Or, at the very least, a thorough discussion. Nonetheless, these models and validating behavioral experiments make this a valuable contribution to the field; it is well executed and addresses a significant gap in the modeling of fly behavior and holistic understanding of visuomotor behaviors.

We appreciate the reviewer’s thoughtful comments on the strengths and weaknesses of our manuscript. We agree that biophysically realistic model reflecting the structure of neural circuits as well as physiological data from them would be invaluable. However, we are currently unable to provide physiological evidence for EC-based suppression, nor provide circuit architecture for efference copy-based suppression of the stability circuit because the neural pathway underlying this behavior remains unidentified. Extensive recordings from the HS/VS system have revealed cell-type-specific motor-related inputs during both spontaneous and loom-evoked flight turns (Fenk et al., 2021; Kim et al., 2017, 2015). These studies predicted suppression of the optomotor stability response during such turns, and our new experiments confirmed this suppression specifically during loom-evoked turns (Figures 5, 6). However, these neurons are primarily involved in the head optomotor response, not the body optomotor response. We hope to extend our current model in future studies to incorporate more cellular-level detail, as the feedforward circuits underlying stability behavior become more clearly defined.

Here are a few points that should be addressed:

(1) The biomechanics block (Figure 2) should be elaborated on, to explain its relevance to behavior and relation to the underlying neural mechanisms.

We appreciate this suggestion. The mathematical representation of the biomechanics block has been developed by other groups in previous studies (Fry et al., 2003; Ristroph et al., 2010). We used exactly the same model, and its parameters were identical to those used in one of those studies (Fry et al., 2003; Ristroph et al., 2010), in which the parameters were estimated from the stabilizing response in response to magnetic “stumbling” pulses. In the previous version of the manuscript, we had a description of the biomechanics block in the Method section (see Equation 4). In response to the reviewer’s comment, we have made a few changes in Figure 2A and expanded the associated description in the main text, as follows.

(Line 160) “To test the orientation behavior of the model, we developed an expanded model, termed “virtual fly model” hereafter. In this model, we added a biomechanics block that transforms the torque response of the fly to the actual heading change according to kinematic parameters estimated previously (Michael H Dickinson, 2005; Ristroph et al., 2010) (Figure 2A, see Equation 4 in Methods and Movie S1). The virtual fly model, featuring position and velocity blocks that are conditioned on the type of the visual pattern, can now change its body orientation, simulating the visual orientation behavior of flies in the free flight condition.”

(2) It is unclear how the three integrative models with different strategies were chosen or what relevance they have to neural implementation. This should be explained and/or addressed.

Thank you for this valuable comment. We selected the three models based on previous studies investigating visuomotor integration across multiple species, under conditions where multiple sensory cues are presented simultaneously.

The addition-only model represents the simplest hypothesis, analogous to the “additive model” proposed by Tom Collett in his 1980 study (Collett, 1980). We used this model as a baseline to illustrate behavior in the absence of any efference copy mechanism. Notably, some modeling studies have proposed linear (additive) integration for multimodal sensory cues at the behavioral level (Liu et al., 2023; Van der Stoep et al., 2021). However, experimental evidence demonstrating strictly linear integration—either behaviorally or physiologically—remains limited. In our study, new data (Figure 5) show that bar-evoked and background movement-evoked locomotor responses are combined linearly, supporting the addition-only model.

The graded efference copy model has been most clearly demonstrated in the cerebellum-like circuit of Mormyrid fish during electrosensation (Bell, 1981; Kennedy et al., 2014). In this system, the efference copy signal forms a negative image of the predicted reafferent input and undergoes plastic changes as the environment changes—an idea that inspired our modifiable efference copy model (Figure 4–figure supplement 1). The all-or-none efference copy model is exemplified in the sensory systems of smaller organisms, such as the auditory neurons of crickets during stridulation (Poulet and Hedwig, 2006). Notably, in crickets, the motor-related input is referred to as corollary discharge rather than efference copy. Typically, “efference copy” refers to a graded, subtractive motor-related signal, while “corollary discharge” denotes an all-or-none signal, both counteracting the sensory consequences of self-generated actions. In this manuscript, we use the term efference copy more broadly, encompassing both types of motor-related feedback signals (Sommer and Wurtz, 2008).

In response to this comment, we have made the following changes in the main text to enhance its accessibility to general readers.

(Line#268) “This integration problem has been studied across animal sensory systems, typically by analyzing motor-related signals observed in sensory neurons (Bell, 1981; Collett, 1980; Kim et al., 2017; Poulet and Hedwig, 2006). Building on the results of these studies, we developed three integrative models. The first model, termed the “addition-only model”, assumes that the outputs of the object (bar) and the background (grating) response circuits are summed to control the flight orientation (Figure 4B, see Equation 14 in Methods).”

(Line#272) “In the second and third models, an EC is used to set priorities between different visuomotor circuits (Figure 4C,D). In particular, the EC is derived from the object-induced motor command and sent to the object response system to nullify visual input associated with the object-evoked turn (Bell, 1981; Collett, 1980; Poulet and Hedwig, 2006). These motor-related inputs fully suppress sensory processing in some systems (Poulet and Hedwig, 2006), whereas in others they selectively counteract only the undesirable components of the sensory feedback (Bell, 1981; Kennedy et al., 2014).”

(3) There should be a discussion of how the visual efference could be represented in the biological model and an evaluation of the plausibility and alternatives.

Thank you for this helpful comment. We have now added the following discussion to share our perspective on the circuit-level implementation of the visual efference copy in Drosophila.

(Line#481) “Efference copy in Drosophila vision

Under natural conditions, various visual features in the environment may concurrently activate multiple motor programs. Because these may interfere with one another, it is crucial for the central brain to coordinate between the motor signals originating from different sensory circuits. Among such coordination mechanisms, the EC mechanisms were hypothesized to counteract so-called reafferent visual input, those caused specifically by self-movement (Collett, 1980; von Holst and Mittelstaedt, 1950). Recent studies reported such EC-like signals in Drosophila visual neurons during spontaneous as well as loom-evoked flight turns (Fenk et al., 2021; Kim et al., 2017, 2015). One type of EC-like signals were identified in a group of wide-field visual motion-sensing neurons that were shown to control the neck movement for the gaze stability (Kim et al., 2017). The EC-like signals in these cells were bidirectional depending on the direction of flight turns, and their amplitudes were quantitatively tuned to those of the expected visual input across cell types. Although amplitude varies among cell types, it remains inconclusive whether it also varies within a given cell type to match the amplitude of expected visual feedback, thereby implementing the graded EC signal. A more recent study examined EC-like signal amplitude in the same visual neurons for loom-evoked turns, across events (Fenk et al., 2021). Although the result showed a strong correlation between wing response and the EC-like inputs, the authors pointed that this apparent correlation could stem from noisy measurement of all-or-none motor-related inputs.

Thus, these studies did not completely disambiguate between graded vs. all-or-none EC signaling. Another type of EC-like signals observed in the visual circuit tuned to a moving spot exhibited characteristics consistent with all-or-none EC. That is, it entirely suppressed visual signaling, irrespective of the direction of the self-generated turn (Kim et al., 2015; Turner et al., 2022).

Efference-copy (EC)–like signals have been reported in several Drosophila visual circuits, yet their behavioral role remains unclear. Indirect evidence comes from a behavioral study showing that the dynamics of spontaneously generated flight turns were unaffected by unexpected background motion (Bender and Dickinson, 2006a). Likewise, our behavioral experiments showed that, during loom-evoked turns, responses to background motion are suppressed in an all-or-none manner (Figures 6 and 7). Consistent with this, motor-related inputs recorded in visual neurons exhibit nearly identical dynamics during spontaneous and loom-evoked turns (Fenk et al., 2021). Together, these behavioral and physiological parallels support the idea that a common efference-copy mechanism operates during both spontaneous and loom-evoked flight turns.

Unlike loom-evoked turns, bar-evoked turn dynamics changed in the presence of moving backgrounds (Figure 5), a result compatible with both the addition-only and graded EC models. However, when the static background was updated just before a bar-evoked turn—thereby altering the amplitude of optic flow—the turn dynamics remained unaffected (Figures 5 and 7), clearly contradicting the addition-only model. Thus, the graded EC model is the only one consistent with both findings. If a graded EC mechanism were truly at work, however, an unexpected background change should have modified turn dynamics because of the mismatch between expected and actual visual feedback (Figure 4–figure supplement 1)—yet we detected no such effect at any time scale examined (Figure 7–figure supplement 1). This mismatch would be ignored only if the amplitude of the graded EC adapted to environmental changes almost instantaneously—a mechanism that seems improbable given the limited computational capacity of the Drosophila brain. In electric fish, for example, comparable adjustments take more than 10 minutes (Bell, 1981; Muller et al., 2019). Further investigation is needed to clarify how reorienting flies ignore optic flow generated by static backgrounds, potentially by engaging EC mechanisms not captured by the models tested in this study.

Why would Drosophila rely on the all-or-none EC mechanism instead of the graded one for loom-evoked turns? A graded EC must be adjusted adaptively depending on the environment, as the amplitude of visual feedback varies with both the dynamics of self-generated movement and environmental conditions (e.g., empty vs. cluttered visual backgrounds) (Figure 4—figure supplement 1). Recent studies on electric fish have suggested that a large array of neurons in a multi-layer network is crucial for generating a modifiable efference copy signal matched to the current environment (Muller et al., 2019). Given their small-sized brain, flies might opt for a more economical design for suppressing unwanted visual inputs regardless of the visual environment. Circuits mediating such a type of EC were identified in the cricket auditory system during stridulation (Poulet and Hedwig, 2006), for example. Our study strongly suggests the existence of a similar circuit in the Drosophila visual system.

We tested the hypothesis that efference-copy (EC) signals guide action selection by suppressing specific visuomotor reflexes when multiple visual features compete. An alternative motif with a similar function is mutual inhibition between motor pathways (Edwards, 1991; Mysore and Kothari, 2020). In Drosophila, descending neurons form dense lateral connections (Braun et al., 2024), offering a substrate for such competitive interactions. Determining whether—and how—EC and mutual inhibition operate will require recordings from the neurons that ensure visual stability, which remain unidentified. Mapping these pathways and assessing how they are modulated by visual and behavioral context are important goals for future work.”

Reviewer #2 (Public Review):

It has been widely proposed that the neural circuit uses a copy of motor command, an efference copy, to cancel out self-generated sensory stimuli so that intended movement is not disturbed by the reafferent sensory inputs. However, how quantitatively such an efference copy suppresses sensory inputs is unknown. Here, Canelo et al. tried to demonstrate that an efference copy operates in an all-or-none manner and that its amplitude is independent of the amplitude of the sensory signal to be suppressed. Understanding the nature of such an efference copy is important because animals generally move during sensory processing, and the movement would devastatingly distort that without a proper correction. The manuscript is concise and written very clearly. However, experiments do not directly demonstrate if the animal indeed uses an efference copy in the presented visual paradigms and if such a signal is indeed non-scaled. As it is, it is not clear if the suppression of behavioral response to the visual background is due to the act of an efference copy (a copy of motor command) or due to an alternative, more global inhibitory mechanism, such as feedforward inhibition at the sensory level or attentional modulation. To directly uncover the nature of an efference copy, physiological experiments are necessary. If that is technically challenging, it requires finding a behavioral signature that unambiguously reports a (copy of) motor command and quantifying the nature of that behavior.

We thank the reviewer for this insightful and constructive comment. We agree that our current behavioral evidence does not directly identify the underlying circuit mechanism, and that direct recordings from visual neurons modulated by an efference copy would be critical for distinguishing between potential mechanisms.

A prerequisite for such physiological investigations would be the identification of both (1) the feedforward neurons directly involved in the optomotor response, and (2) the neurons conveying motor-related signals to the optomotor circuit. Despite efforts by several research groups, the location of the feedforward circuit mediating the optomotor response remains elusive. This limitation has prevented us from obtaining direct cellular evidence of flight turn-associated suppression of optomotor signaling.

In light of the reviewer’s suggestion, we expanded our investigation to strengthen the behavioral evidence for efference copy (EC) mechanisms. In addition to our earlier experiments involving unexpected changes in the static background, we examined how object-evoked flight turns influence the optomotor stability reflex and vice versa (Figures 5 and 6). To quantify the interaction between different visuomotor behaviors, we systematically varied the temporal relationship between two types of visual motion—loom versus moving background, or moving bar versus moving background—and measured the resulting behavioral responses.

Our findings support pattern- and time-specific suppressive mechanisms acting between flight turns associated with the different visual patterns. Specifically:

The responses to a moving bar and a moving background add linearly, even when presented in close temporal proximity.

Loom-evoked turns and the optomotor stability reflex mutually suppress each other in a time-specific manner.

For both loom- and moving bar-evoked flight turns, changes in the static background had no measurable effect on the dynamics of the object-evoked responses.

These results provide a detailed behavioral characterization of a suppressive interaction between distinct visuomotor responses. This, in turn, offers correlative evidence supporting the involvement of an efference copy-like mechanism acting on the visual system. While similar efference copy mechanisms have been documented in other parts of the visual system, we acknowledge that our findings do not exclude alternative explanations. In particular, it is still possible that lateral inhibition within the central brain or ventral nerve cord contributes to the suppression we observed.

Ultimately, definitive proof will require identifying the specific neurons that convey efference copy signals and demonstrating that silencing these neurons abolishes the behavioral suppression. Until such experiments are feasible, our behavioral approach provides an important contribution toward understanding the nature of sensorimotor integration in this system.

Reviewer #3 (Public Review):

Summary:

Canelo et al. used a combination of mathematical modeling and behavioral experiments to ask whether flies use an all-or-none EC model or a graded EC model (in which the turn amplitude is modulated by wide-field optic flow). Particularly, the authors focus on the bar-ground discrimination problem, which has received significant attention in flies over the last 50-60 years. First, they use a model by Poggio and Reichardt to model flight response to moving small-field bars and spots and wide-field gratings. They then simulate this model and compare simulation results to flight responses in a yaw-free tether and find generally good agreement. They then ask how flies may do bar-background discrimination (i.e. complex visual environment) and invoke different EC models and an additive model (balancing torque production due to background and bar movement). Using behavioral experiments and simulation supports the notion that flies use an all-or-none EC since flight turns are not influenced by the background optic flow. While the study is interesting, there are major issues with the conceptual framework.

Strengths:

They ask a significant question related to efference copies during volitional movement.

The methods are well detailed and the data (and statistics) are presented clearly.

The integration of behavioral experiments and mathematical modeling of flight behavior.

The figures are overall very clear and salient.

Weaknesses:

Omission of saccades: While the authors ask a significant question related to the mechanism of bar-ground discrimination, they fail to integrate an essential component of the Drosophila visuomotor responses: saccades. Indeed, the Poggio and Reichardt model, which was developed almost 50 years ago, while appropriate to study body-fixed flight, has a severe limitation: it does not consider saccades. The authors identify this major issue in the Discussion by citing a recent switched, integrate-and-fire model (Mongeau & Frye, 2017). The authors admit that they "approximated" this model as a smooth pursuit movement. However, I disagree that it is an approximation; rather it is an omission of a motor program that is critical for volitional visuomotor behavior. Indeed, saccades are the main strategy by which Drosophila turn in free flight and prior to landing on an object (i.e. akin to a bar), as reported by the Dickinson group (Censi et al., van Breugel & Dickinson [not cited]). Flies appear to solve the bar-ground discrimination problem by switching between smooth movement and saccades (Mongeau & Frye, 2017; Mongeau et al., 2019 [not cited]). Thus, ignoring saccades is a major issue with the current study as it makes their model disconnected from flight behavior, which has been studied in a more natural context since the work of Poggio.

Thank you for this helpful comment. We agree that including saccadic turns is essential and qualitatively improves the model. In the revised manuscript, we therefore expanded our bar-tracking model to incorporate an integrate-and-saccade strategy, now presented in Figure 2—figure supplement

The manuscript now introduces this result as follows:

(Line#190) “Finally, one important locomotion dynamics that a flying Drosophila exhibits while tracking an object is a rapid orientation change, called a “saccade” (Breugel and Dickinson, 2012; Censi et al., 2013; Heisenberg and Wolf, 1979). For example, while tracking a slowly moving bar, flies perform relatively straight flights interspersed with saccadic flight turns (Collett and Land, 1975; Mongeau and Frye, 2017). During this behavior, it has been proposed that visual circuits compute an integrated error of the bar position with respect to the frontal midline and triggers a saccadic turn toward the bar when the integrated value reaches a threshold (Frighetto and Frye, 2023; Mongeau et al., 2019; Mongeau and Frye, 2017). We expanded our bar fixation model to incorporate this behavioral strategy (Figure 2--figure supplement 2). The overall structure of the modified model is akin to the one proposed in a previous study (Mongeau and Frye, 2017), and the amplitude of a saccadic turn was determined by the sum of the position and velocity functions (Figure 2--figure supplement 2A; see Equation 13 in Methods). When simulated, our model successfully reproduced experimental observations of saccade dynamics across different object velocities (Figure 2--figure supplement 2B-D) (Mongeau and Frye, 2017). Together, our models faithfully recapitulated the results of previous behavioral observations in response to singly presented visual patterns (Collett, 1980; Götz, 1987; H. Kim et al., 2023; Maimon et al., 2008; Mongeau and Frye, 2017).”

Apart from Figures 1 and 2, most of our data—whether from simulations or behavioral experiments—use brief visual patterns lasting 200 ms or less. These stimuli trigger a single, rapid orientation change reminiscent of a saccadic flight turn. In this part of the paper, we essentially have examined how multiple visuomotor pathways interact to determine the direction of object-evoked turns when several visual patterns occur simultaneously.

Critically, recent work showed that a group of columnar neurons (T3) appear specialized for saccadic bar tracking through integrate-and-fire computations, supporting the notion of parallel visual circuits for saccades and smooth movement (Frighetto & Frye, 2023 [not cited]).

Thanks for bringing up this critical issue. We have now added this paper in the following part of the manuscript.

(Line#193) “During this behavior, it has been proposed that visual circuits compute an integrated error of the horizontal bar position with respect to the frontal midline and triggers a saccadic turn toward the bar when the integrated value reaches a threshold (Frighetto and Frye, 2023; Mongeau and Frye, 2017).”

(Line#462) “Visual systems extract features from the environment by calculating spatiotemporal relationships of neural activities within an array of photoreceptors. In Drosophila, these calculations occur initially on a local scale in the peripheral layers of the optic lobe (Frighetto and Frye, 2023; Gruntman et al., 2018; Ketkar et al., 2020).”

A major theme of this work is bar fixation, yet recent work showed that in the presence of proprioceptive feedback, flies do not actually center a bar (Rimniceanu & Frye, 2023). Furthermore, the same study found that yaw-free flies do not smoothly track bars but instead generate saccades. Thus prior work is in direct conflict with the work here. This is a major issue that requires more engagement by the authors.

Thank you for your thoughtful comments and for drawing our attention to this important paper. In our experiments, bar fixation on oscillating vertical objects emerges during the “alignment” phase of the magneto-tether protocol. The pattern movement dynamics was similar those used by Rimniceanu & Frye (2023), yet the two studies differ in a key respect: Rimniceanu & Frye employed a motion-defined bar, whereas we presented a dark vertical bar against a uniform or random-dot background. The alignment success rate—defined as the proportion of trials in which the fly’s body angle is within ±25° of the target—was about 50 % (data not shown). Our alignment pattern consisted of three vertical stripes spanning ~40° horizontally; when we replaced it with a single, narrower stripe, the success rate was lowered (data not shown). These observations suggest that bar fixation in the magnetically tethered assay is less robust than in the rigid-tethered assay, although flies still orient toward highly salient vertical objects.

We also observed that bar-evoked turns were elicited more reliably when the bar moved rapidly (45° in 200 ms) in the magneto-tether assay, although the turn magnitude was significantly smaller than the actual bar displacement (Figure 3).

In response to the reviewer’s comment, we now added the following description in the paper regarding the bar fixation behavior, citing Rimniceanu&Frye 2023.

(Line#239) “Another potential explanation arises from recent studies demonstrating that proprioceptive feedback provided during flight turns in a magnetically tethered assay strongly dampens the amplitude of wing and head responses (Cellini and Mongeau, 2022; Rimniceanu et al., 2023).”

Relevance of the EC model: EC-related studies by the authors linked cancellation signals to saccades (Kim et al, 2014 & 2017). Puzzlingly, the authors applied an EC model to smooth movement, when the authors' own work showed that smooth course stabilizing flight turns do not receive cancellation signals (Fenk et al., 2021). Thus, in Fig. 4C, based on the state of the field, the efference copy signal should originate from the torque commands to initiate saccades, and not from torque to generate smooth movement. As this group previously showed, cancellation signals are quantitatively tuned to that of the expected visual input during saccades. Importantly, this tuning would be to the anticipated saccadic turn optic flow. Thus the authors' results supporting an all-or-none model appear in direct conflict with the author's previous work. Further, the addition-only model is not particularly helpful as it has been already refuted by behavioral experiments (Rimneceanu & Frye, Mongeau & Frye).

Thank you for this constructive comment. Efference copy is best established for brief, discrete actions like flight saccades. While motor-related modulation of visual processing has been reported across short- and long-duration behaviours (Chiappe et al., 2010; Fujiwara et al., 2017; Kim et al., 2015, 2017; Maimon et al., 2010; Turner et al., 2022), only flight saccade-associated signals exhibit the temporal profile appropriate to cancel reafferent input. However, von Holst & Mittelstaedt (1950) originally formulated efference copy to explain the smooth optomotor response of hoverflies. In HS/VS recordings in previous studies, however, we could not detect membrane-potential changes tied to baseline wing-beat amplitude (data not shown), but further work is needed.

Note that visually evoked flight turns analyzed in this paper have relatively fast dynamics. Fenk et al. (2021) showed that HS cells carry EC-like motor signals during both loom-evoked turns and spontaneous saccades. Building on this, we tested whether object-evoked rapid turns modulate other visuomotor pathways. Although Fenk et al. also found that optomotor turns lack motor input to HS cells, the authors did not test whether the optomotor pathway suppresses other reflexes, such as loom-evoked turns. Our new behavioral data (Figure 6) show that optomotor turns indeed suppress loom-evoked turns, suggesting a potential EC signal arising from the optomotor pathway that inhibits loom-responsive visual neurons.

In Kim et al. (2017), the authors argued that HS/VS neurons receive a “quantitatively tuned” efference copy that varies across cell types: yaw-sensitive LPTCs are strongly suppressed, roll-sensitive cells receive intermediate input, and pitch-sensitive cells receive little or none. We also showed that when the amplitude of ongoing visual drive changes, the amplitude of saccade-related potentials (SRPs) scales linearly. This proportionality does not imply a genuinely graded EC, however, because SRP amplitude could vary solely through changes in driving force (Vm – Vrest) with a fixed EC conductance. Crucially, SRPs do not fully suppress feed-forward visual signalling, arguing against an all-or-none EC mechanism.

How, then, can the cellular and behavioural data be reconciled? Silencing HS/VS neurons—or their primary inputs, the T4/T5 neurons—does not markedly diminish the optomotor response in flight (Fenk et al., 2014; Kim et al., 2017), indicating the presence of additional, as-yet-unidentified pathways.

Physiological recordings from other visual neurons that drive the optomotor response in flying Drosophila are therefore needed to determine how strongly they are suppressed during loom-evoked turns.

Behavioral evidence for all-or-none EC model: The authors state "unless the stability reflex is suppressed during the flies' object evoked turns, the turns should slow down more strongly with the dense background than the sparse one". This hypothesis is based on the fact that the optomotor response magnitude is larger with a denser background, as would be predicted by an EMD model (because there are more pixels projected onto the eye). However, based on the authors' previous work, the EC should be tuned to optic flow and thus the turning velocity (or amplitude). Thus the EC need not be directly tied to the background statistics, as they claim. For instance, I think it would be important to distinguish whether a mismatch in reafferent velocity (optic flow) links to distinct turn velocities (and thus position). This would require moving the background at different velocities (co- and anti-directionally) at the onset of bar motion. Overall, there are alternative hypotheses here that need to be discussed and more fully explored (as presented by Bender & Dickinson and in work by the Maimon group).

We appreciate the reviewer’s important suggestion. In response, we performed the recommended experiment. In Figures 5 and 6 of the revised manuscript, we now present how bar- or loom-evoked flight turns affect the response to a moving background pattern. These experiments revealed that bar-evoked turns do not suppress the optic flow response, whereas loom-evoked turns strongly suppress it. Specifically, when background motion began 100 ms after the onset of loom expansion, the response to the background was significantly suppressed. Although weak residual responses to the background motion were observed in this case, this could be due to background motion occurring outside of the suppression interval, which may correspond in duration to the duration of flight turns (Figure 6C,D).

The lack of suppression of the optic flow response during and after bar-evoked turns appears to suggest that the responses are added linearly (Figure 5), seemingly contradicting the lack of dynamic change when the background dot density was altered (Figure 7, Figure 7–figure supplement 1). That is, the experimental result in Figure 5 supports either an addition-only or a graded efference copy (EC) model. However, the result in Figure 7 supports an all-or-none EC model. If a graded EC were used, the amplitude of the EC should be updated almost instantaneously when the static background changes.

Another possibility is that the optic flow during self-generated turns in a static background is extremely weak compared to the optic flow input generated by physically moving the pattern, perhaps due to the rapid nature of head movements. Indeed, detailed kinematic analysis of head movement during spontaneous saccades in blow flies revealed that the head reaches the target angle before the body completes the orientation change, making the effective speed of reafferent optic flow higher than the speed of body rotation (Hateren and Schilstra, 1999). To test these hypotheses, further experiments will be needed for bar-evoked flight turns.

Publishing the reviewed preprint:

(1) The Reviewed Preprint (including the full text of the preprint we reviewed, the eLife assessment, and public reviews) will typically be published in two weeks' time.

Please let us know if you would like to provide provisional author responses to be posted at the same time (if so, please send these by email). Please do not resubmit within the next two/three weeks, as we will need to publish the first version of the Reviewed Preprint first.

If there are any factual errors in the eLife assessment or public reviews, or other issues we should be aware of, please let us know as soon as possible.

(2) After publication of the Reviewed Preprint, you can use the link below to submit a revised version. There is no deadline to resubmit. Before resubmitting, please ensure that you update the preprint at the preprint server to correspond with the revised version. Upon submitting a revised version, we will ask the editors and reviewers if it's appropriate to update their assessment and public reviews, which will be included alongside the revised Reviewed Preprint. At that time we will also post the recommendations to the authors and the author responses you provide with the revised version. In the author response, please respond to the public reviews (where relevant) and the recommendations to the authors.

(3) Alternatively, you can proceed with the current version of the Reviewed Preprint (once published), without revisions, and request an eLife Version of Record. See the Author Guide for further information: https://elife-rp.msubmit.net/html/elife-rp_author_instructions.html#vor. However, most authors decide to request a Version of Record after a round of revision.

(4) After publication of eLife's Reviewed Preprint, you also have the option to submit/publish in another journal instead: if you choose to do this, please let us know so we can update our records.

The reviewers identified two key revisions that could improve the assessment of the paper:

(1) Consideration of saccades within the model framework (outlined by reviewer 3).

(2) Addition of physiology data to support the conclusions of the paper (outlined by reviewer 2). If this is not feasible within the timescale of revisions, the paper would need to be revised to clarify that the model leads to a hypothesis that would need to be tested with future physiology experiments.

Thank you for these comments.

Regarding revision point #1, we have added Figure 2–figure supplement 2, where we incorporated our position-velocity model (estimated in Figure 1) into the framework of the integrate-and-saccade model. A detailed description of this model is now provided in the main text (Lines 190–203).

For revision point #2, obtaining electrophysiological evidence for efference copy remains challenging, as neither the visual neurons nor the efference-copy neuron has been identified for the wing optomotor response. As suggested by the reviewers, we have revised the title of the paper to reduce emphasis on efference copy and have noted electrophysiological recordings as a direction for future work.

old title: A visual efference copy-based navigation algorithm in Drosophila for complex visual environments

new title: Integrative models of visually guided steering in Drosophila

Specific recommendations are detailed below.

Reviewer #2 (Recommendations For The Authors):

To directly demonstrate if an efference copy is non-scaled, the following experiments can be helpful: record from HS/VS cells and examine the relation between the amplitude of the succade-suppression signal vs. succade amplitude.

Thanks for raising this important point. We previously carried out the suggested analysis for loom-evoked saccades in Fenk et al. (2021). There, significant correlations emerged between wing-response amplitude and saccade-related potentials (Figures 2F and 3C). However, we did not interpret the strong correlation (r ≈ 0.8) as evidence for a graded efference copy, because the amplitude of saccade-related potentials appeared to be bimodal. Upon presentation of the looming stimulus, flies either executed large evasive turns or showed minimal changes in wing-stroke amplitude. Large wing responses were accompanied by strong, saturated suppression of HS-cell membrane potential, whereas trials without wing responses produced only weak modulations—reflected in the bimodal distribution of saccade-related potential amplitudes (Figure 3C).

Importantly, in rigidly tethered preparations—where these potentials are typically measured—the absence of proprioceptive feedback can itself drive wingbeat amplitudes to saturation during saccades. We therefore reasoned that the lack of intermediate-sized flight saccades would naturally yield correspondingly saturated saccade-related potentials, even if a graded EC system is in play.

In Kim et al. (2017), we also performed a comprehensive analysis of spontaneous saccade-related potentials across all HS/VS cell types. When we later examined the relationship between saccade amplitude and the corresponding saccade-related potentials in each cell type, we could not find any statistically significant correlation (unpublished data).

measure how much a weak visual stimulus and a strong visual stimulus are suppressed by the suppression signal. If the signal is non-scaled, visual stimuli should always be suppressed independently of their intensities.

Thank you for this important suggestion. As mentioned in our response to the previous comment, we believe it is not feasible to record from neurons responsible for the body optomotor response at this point, as their identity remains unknown. Regarding the HS/VS cells, our previous study showed that HS cells are not always fully suppressed. The changes in saccade-related potential amplitude can be described as a linear function of the pre-saccadic visually-evoked membrane potential (Figure 7 in Kim et al., 2017).

As suggested by Fenk et al. 2014 (doi: 10.1016/j.cub.2014.10.042), HS cells might also be responsive to a moving bar. If that is the case, and if you present a bar and background (either sparse or dense) in a closed-loop manner to a head-fixed fly, HS cells might be sensitive only to the bar but not to the background (independently of the density).

Thanks for pointing out this important issue. HS cells indeed respond strongly to the horizontal movement of a vertical bar, as expected given that their receptive fields are formed by the integration of local optic flow vectors. In one of our previous studies (Supplemental Figure 1 in Kim et al., 2015), we showed that the response amplitude to a single vertical bar is roughly equivalent to that elicited by a vertical grating composed of 12 bars of the same size. Therefore, we believe that HS cells are likely to contribute to the head response to a moving vertical bar. In a body-fixed flight simulator, HS cells would respond only to the bar if the bar runs in a closed loop with a static background. In this scenario, HS cells are likely to play a role in the head optomotor response.

Note also that the role of HS cells in the wing optomotor response remains unresolved. Unilateral activation of HS cells has been shown to elicit locomotor turns in walking Drosophila (Fujiwara et al., 2017), as well as in flying individuals (unpublished data from our lab). However, a previous study also showed that strong silencing of HS/VS cells significantly reduced the head optomotor response, but not the wing optomotor response (Kim et al., 2017).

If neurophysiology is technically challenging, an alternative way might pay attention to a head movement that exclusively follows the background (Fox et al., 2014 (doi: 10.1242/jeb.080192)). Because HS cells are thought to promote head rotation to background motion, a non-scaled suppression signal on HS cells would always suppress the head rotation independently of the background density.

Thanks for this helpful comment. We have analyzed head movements during bar-evoked flight turns (Figure 7–figure supplement 1B) and found no significant changes across different background dot densities. We think that this might suggest that HS cells are unlikely to receive suppressive inputs during bar-evoked turns, akin to the lack of modulation during optomotor turns.

Another way to separate a potential efference copy from other mechanisms (more global inhibition) is the directionality. A global inhibition would suppress the response to the background even if the background moves in the same direction as self-motion, but the efference copy would not.

Thanks for this important point. In Heisenberg and Wolf, 1979, it was proposed that modulation might be bidirectional, with behavioral effects observed only for perturbations in the “unexpected” direction. In our new data on loom-evoked turns (Figure 6), the suppression appears equally strong for background motion in either direction, supporting an all-or-none suppression mechanism.

Besides, in general, it is unclear if you think an efference copy operates both in smooth pursuits and saccades or if such a signal is only present during saccades. Your previous neurophysiological work supports the latter. Are your behavioral results consistent with the previous saccade suppression idea, or do you propose a new type of efference copy that also operates in smooth pursuits?

Thanks for raising this important point. von Holst and Mittelstaedt (1950) originally introduced the concept of efference copy to explain the smooth optomotor response. We previously analyzed electrophysiological recordings from HS cells for membrane-potential changes associated with slow deviations in wing-steering angle but found none. However, this negative result does not entirely rule out modulation of visual processing during smooth flight turns, given the slow drift in membrane potential observed in most whole-cell recordings.

In this study, We examined only the interactions among visuomotor pathways during these rapid flight turns as the dynamics of visually evoked turns are almost as rapid as spontaneous saccades. Our data reveal that interactions between distinct visuomotor reflexes are more diverse than previously appreciated.

Minor comments:

Line 108, 109: match the description between here and the labels in Fig. 1F.

Thank you for indicating this issue. We have defined the general equation to obtain the position and velocity components in the main text lines 108,109, but due to a slight asymmetry in the data (Fig. 1E) we used the approach indicated in Fig. 1F. and explained in lines 113-117.

Fig.1 F: If the position-dependent component is due to fatigue, the tuning curve's shape is likely changed (shrunk or extended) depending on the stimulus speed. How can you generalize the tuning curve shown here? Does the result hold even if the stimulus speed/contrast/spatial frequency is changed?

We appreciate this indication. We believed that fatigue may be the reason why the wing response to the grating stimulus showed that significant decay (Fig. 1E). As you mention, the stimulus speed would increase the amplitude of the fly’s response up to a saturation point. We addressed this in our model by multiplying the derived value by the angular velocity of the grating.

Regarding the contrast, and spatial frequency we did not test it experimentally, instead, we simulated our model for changing visual feedback (Fig. 4A, B), which can be seen as increasing/decreasing contrast of a grating. An increase in the contrast would increase the response of the fly to the grating and so will contribute to dampening the response to the foreground object (Fig. 4C).

Line 233-255: Here, the description sounds like you will consider several parallel objects (e.g., two stripes) in the visual field instead of the combination of the figure and background (which is referred to in the following paragraph).

Thank you for pointing it out. Indeed it was slightly ambiguous. We have addressed this by explaining the specific situation of a combination of an object and the background in lines 231-233.

Figure 6C: you kept the foreground visual field between sparse and dense random dot backgrounds to keep the bar's saliency. Is it sure that this does not influence the difference in the fly's response to these two backgrounds (in Figure 6B)?

This is a good point that we have also discussed internally. We also carried out similar experiments with a fully covered background and found no significant differences (Figure 7–figure supplement 1).

Reviewer #3 (Recommendations For The Authors):

Identify and analyze flight saccade dynamics in the raw trajectories (e.g., Fig. 3B). There should be some since the bar is near the 'sweet spot' for triggering saccades (see Mongeau & Frye, 2017).

Thank you for bringing up this interesting point. In previous work, it was reported that the fly fixated on a vertical bar through saccadic turns rather than smooth-tracking (Mongeau & Frye, 2017). When the bar width was thin (<15 deg) there was barely one saccade per second (Mongeau & Frye, 2017, Fig. 4). In our magno tether essay (Fig. 3A, B) the object width was 11.25 degrees, and the object moved for a short time window, and so the fly only generated the saccade related to the onset of the object. It could not be considered as a saccade some small turns of a few degrees that are likely related to small perturbations in comparison to those previously reported (Mongeau & Frye, 2017). Additionally, in our protocol (Fig. 3A) from onset time (‘go’ mark), only a single object moved, within an empty background, so in principle there is no trigger for a switch to a smooth movement. We addressed this in lines x-x.

Consider updating the Poggio model with flight saccades (switched, integrate-and-fire).

We appreciate this suggestion. Following previous work (Mongeau et al., 2017), we expanded our model to include a saccade mechanism: the torque produced by the summed position- and velocity-dependent components is now replaced by an integrate-and-fire saccade (Figure 2—figure supplement 2). We optimized the saccade interval and amplitude so that both vary linearly with stimulus amplitude and faithfully reproduce the kinematic properties reported previously (Mongeau et al., 2017).

Please engage more with the literature, especially work that directly conflicts with your conclusions (see above). Also, highly relevant work by Bender & Dickinson was not sufficiently discussed. Spot results presented in Fig. 3 should be contextualized in light of the work of Mongeau et al., 2019, who performed similar experiments and identified a switch in saccade valence.

We appreciate your pointing out the relevant previous work. We have added references to the following papers and tried to describe the relationship between our data and previous ones.

Bender & Dickinson 2006

(Line#162) “This simulation experiment is reminiscent of the magnetically tethered flight assay, where a flying fly remains fixed at a position but is free to rotate around its yaw axis (Bender and Dickinson, 2006b; Cellini et al., 2022; G. Kim et al., 2023; Mongeau and Frye, 2017).”

(Line#218) “We tested the predictions of our models with flies flying in an environment similar to that used in the simulation (Figure 3A). A fly was tethered to a short steel pin positioned vertically at the center of a vertically oriented magnetic field, allowing it to rotate around its yaw axis with minimal friction (Bender and Dickinson, 2006b; Cellini et al., 2022; G. Kim et al., 2023).”

(Line#238) “To determine if our assay imposes additional friction compared to other assays used in previous studies, we analyzed the dynamics of spontaneous saccades during the “freeze” phase (Figure 3–figure supplement 1A). We found their duration and amplitude to be within the range reported previously (Bender and Dickinson, 2006b; Mongeau and Frye, 2017) (Figure 3–figure supplement 1B-D).

Mongeau et al., 2019

(Line#196) “During this behavior, it has been proposed that visual circuits compute an integrated error of the bar position with respect to the frontal midline and triggers a saccadic turn toward the bar when the integrated value reaches a threshold (Frighetto and Frye, 2023; Mongeau et al., 2019; Mongeau and Frye, 2017). We expanded our bar fixation model to incorporate this behavioral strategy (Figure 2–figure supplement 2).”

This paper shows that the dynamics of saccadic flight turns elicited by a rotating bar or spot determine whether flies display attraction or aversion. In that study, the visual stimulus—a bar or spot—rotated slowly at a constant 75 deg s⁻¹. By contrast, in our Figure 3 the object moves much faster, driving the neural “integrator” to saturation and triggering an almost immediate flight turn. In Mongeau et al. (2019), saccades occur at variable times and their amplitudes and directions are more stochastic, again reflecting the slower stimulus speed. Because these differences all arise from the disparity in object speed, we did not cite Mongeau et al. (2019) in Figure 3 or the associated text.

In addition to the two papers cited above, we have incorporated several relevant studies on the Drosophila visuomotor control identified through the reviewers’ insightful comments. Examples include:

Frighetto G, Frye MA. 2023 (Line#195, 464)

Rimniceanu et al., 2023 (Line#241)

Cellini & Mongeau 2020 (Line#91)

Cellini & Mongeau 2022 (Line#241)

Cellini et al., 2022 (LIne#91, 162, 218)

Many citations are not in the proper format (e.g. using numbers rather than authors' last name).

Thank you for letting us know. We have changed the remaining citations to the proper format.

  1. Howard Hughes Medical Institute
  2. Wellcome Trust
  3. Max-Planck-Gesellschaft
  4. Knut and Alice Wallenberg Foundation