Learning to predict target location with turbulent odor plumes
Abstract
Animal behavior and neural recordings show that the brain is able to measure both the intensity and the timing of odor encounters. However, whether intensity or timing of odor detections is more informative for olfactorydriven behavior is not understood. To tackle this question, we consider the problem of locating a target using the odor it releases. We ask whether the position of a target is best predicted by measures of timing vs intensity of its odor, sampled for a short period of time. To answer this question, we feed data from accurate numerical simulations of odor transport to machine learning algorithms that learn how to connect odor to target location. We find that both intensity and timing can separately predict target location even from a distance of several meters; however, their efficacy varies with the dilution of the odor in space. Thus, organisms that use olfaction from different ranges may have to switch among different modalities. This has implications on how the brain should represent odors as the target is approached. We demonstrate simple strategies to improve accuracy and robustness of the prediction by modifying odor sampling and appropriately combining distinct measures together. To test the predictions, animal behavior and odor representation should be monitored as the animal moves relative to the target, or in virtual conditions that mimic concentrated vs dilute environments.
Editor's evaluation
This paper explores the question of the optimum strategy for odor detection in a turbulent environment. The authors use highresolution simulations of turbulent flow to investigate the transport and detection of odors advected by the flow, comparing machine learning strategies based on the temporal dynamics of the signal with those based on intensity. The work should be of interest to researchers working on a broad range of problems in sensation and navigation across scales.
https://doi.org/10.7554/eLife.72196.sa0Introduction
Most macroscopic organisms detect odors in intermittent bursts, that may be separated by extended regions with no odor. Organisms leverage this complex dynamics efficiently for diverse tasks, including locating and identifying an odor source Murlis et al., 1992; MafraNeto and Cardé, 1994; Vickers, 2000; Riffell et al., 2014; Ache et al., 2016; Ackels et al., 2021. However, what are the most informative features of intermittent odor cues remains largely unclear. There are two broad classes of measures that quantify the dynamics of olfactory cues: those that depend on odor intensity including e.g. odor gradients in space or time, and those that do not depend on odor intensity but only on its timing, i.e. on whether the odor is on or off regardless of its concentration. To compute quantities that depend on odor intensity, an accurate representation of the odor is needed. In contrast, measuring the timing of odor detection simply requires to mark at all times whether the odor is on or off, thus a binary switch is sufficient.
Behavioral evidence suggests that animals use both intensity and timing of odor encounters for olfactory navigation Baker et al., 2018. At close range, mammals appear to compare odor intensity either across nostrils or across sniffs Catania, 2013; Gire et al., 2016; Findley et al., 2021. On the other hand, mounting evidence suggests timing of odor detection also plays a key role for olfactory navigation Ache et al., 2016: moths respond to odor pulsed at specific frequencies Riffell et al., 2014; Vickers et al., 2001; fruit flies respond to timing since last odor detection van Breugel and Dickinson, 2014; Demir et al., 2020; lobsters and sharks compare odor arrival time across their paired olfactory organs and orient toward the side that detected the odor first Basil and Atema, 1994; Gardiner and Atema, 2010; many organisms will move upwind upon detection of an odor Kennedy and Marsh, 1974; Murlis et al., 1992; Steck et al., 2012.
Neural recordings upon stimulation with intermittent odor cues confirm that the brain of many animals is able to record information both about intensity (and its derivatives) as well as timing of odor encounters (most information comes from work on arthropods Nagel and Wilson, 2011; Vickers et al., 2001; Brown et al., 2005; GorurShandilya et al., 2017; Jacob et al., 2017; Riffell et al., 2014, but see also Parabucki et al., 2019; Lewis et al., 2021). For example, when insects are presented with intermittent odor cues, information about intensity and timing is recorded in their antennal lobe (see e.g. Vickers et al., 2001; Brown et al., 2005). Odors that mimic natural intermittency elicit a response that preserves an accurate measure of timing in fruit flies and moths GorurShandilya et al., 2017; Jacob et al., 2017. In lobsters, bursting olfactory neurons encode specifically for the time between successive odor encounters, see Park et al., 2014; Park et al., 2016 and references therein. Interestingly, the neural activity varies considerably with the dynamics of the odor cues Nagel and Wilson, 2011; Vickers et al., 2001; Lewis et al., 2021, but how intermittency of an odor affects its neural representation is not well understood.
This evidence suggests animals are able to identify when they detect an odor as well as how intense it is; but whether they record and rely on both kinds of information is not understood. From a physical perspective, these two measures clearly provide information about source location. Indeed, we know from theoretical Shraiman and Siggia, 2000; Falkovich et al., 2001; Celani et al., 2014 and experimental Justus et al., 2002; Moore and Crimaldi, 2004 work that turbulence causes the odor to be distributed in highly intermittent patches separated by blanks with no odor. Both intensity and timing of these intermittent bursts vary depending on the location of the source Celani et al., 2014, as early recognized by Atema, 1996, thus can be used to infer source location or navigate to it Vergassola et al., 2007; Schmuker et al., 2016; Boie et al., 2018; Leathers et al., 2020; Michaelis et al., 2020.
Here, we ask what salient features of turbulent odor signals best predict the location of the odor source and specifically compare quantities related to intensity vs timing of odor encounters. We first compose a dataset of realistic odor fields at scales of several meters using accurate stateoftheart fluid dynamics simulations. We then develop machine learning algorithms that predict source location based on these synthetic odor fields.
We find that measures of odor temporal dynamics based on a short memory span (down to about 1 s) hold information about source location. Close to the source or close to the substrate, measures of intensity predict distance better than measures of timing; but this ranking is reversed at further distance from the source or from the substrate. Pairing the two kinds of measure improves dramatically the quality of the prediction robustly across all datasets, whereas pairing two measures of intensity or two measures of timing is either useless or detrimental.
Our results demonstrate that timing and intensity are complementary attributes of odor dynamics and are most effective in more dilute and concentrated conditions respectively. These different conditions exist in different portions of space because odor gets transported, mixed and diluted by the fluid. As a result, the spatial range of operation of a living organism constrains the solutions it may evolve to make predictions with turbulent odors.
Results
Odor cues at several meters from the source are often turbulent. Figure 1a–c and show snapshots of the velocity field and odor cues in space, resulting from direct numerical simulations of the turbulent flow in a channel of length L, width W and height H (also see Figure 1—video 1). Air flows from left to right at a mean speed ${U}_{b}$ and hits a cylindrical obstacle that generates turbulence. The height of the obstacle is $H/4$ and tunes the intensity of turbulent fluctuations relative to the mean velocity. To characterize the flow we show in Figure 1d that the mean velocity profile, for ${z}^{+}\gtrsim $ 30, follows the law of the wall $\frac{\u27e8U\u27e9}{{u}_{\tau}}=\frac{1}{\kappa}\mathrm{ln}{z}^{+}+B$, where $\kappa $ and $B$ are constants and ${z}^{+}=\frac{z}{{\delta}_{\nu}}$, recovering classical statistics for channel turbulence Pope, 1984. The odor field is emitted from a concentrated source downstream from the obstacle; it develops as a meandering filament that fluctuates as it travels downstream and soon breaks into discrete pockets of odor (whiffs) separated by odorless stretches (blanks) (Figure 1b–c and f). The Kolomogorov scaling for the spectra of odor fluctuations ${k}^{5/3}$ holds for $k\eta \lesssim 0.1$ (see Figure 1e), consistent with previous experimental results in channel flow (see e.g. Saddoughi and Veeravalli, 1994). Typical time courses of the odor are shown in Figure 1f. Note that depending on the sampling location, odor may be more or less sparse (compare for example Figure 1f left and right). All parameters and methods are summarized respectively in Table 1 and in Materials and Methods.
Do odor cues bear information about source location meters away from the source? To answer this question, we develop supervised machine learning algorithms that learn the relationship between the input (odor) and the distance from the source (output) from a large dataset of examples. In order to dissect what are the best predictors of source location and how ranking depends on the statistics of the odor, we need to detail more specifically the input and output of the algorithm.
To design the input we start with the odor concentration field $c(\mathbf{z},t)$ which varies stochastically in space and time as a result of turbulent transport. Here, $\mathbf{z}=({z}_{1},{z}_{2},{z}_{3})$ is a location in the three dimensional space and $t$ is time. We focus on a plane at a fixed height, and consider the conical region where odor can be detected, the ‘cone of detection’ (Figure 2a). We first compose time series of the odor field; each time series is indicated with ${\mathbf{c}}_{i}$ and consists of the odor sampled at $M$ equally spaced times with frequency $\omega $ at a discrete location ${\mathbf{z}}_{i}$ within the cone of detection. Thus, each time series is a vector ${\mathbf{c}}_{i}=(c({\mathbf{z}}_{i},{t}_{i}),\mathrm{\dots},c({\mathbf{z}}_{i},{t}_{i+M}))$, where ${t}_{i+M}{t}_{i}=M/\omega $ is the temporal span of the time series, or memory. From each time series, ${\mathbf{c}}_{i}$ we calculate five features ${x}_{i}^{1},\mathrm{\dots},{x}_{i}^{5}$, where ${x}_{i}^{1}$ is the temporal average of the concentration during whiffs in the time series $\mathbf{c}}_{i$; $x_i^2$ is its average slope (time derivative of odor upon detection, averaged across whiffs within ${\mathbf{c}}_{i}$); ${x}_{i}^{3}$ is the average duration of blanks (stretches of time when odor is below detection within ${\mathbf{c}}_{i}$); ${x}_{i}^{4}$ is the average duration of whiffs (stretches of time when odor is above threshold within ${\mathbf{c}}_{i}$); and ${x}_{i}^{5}$ is the intermittency factor (the fraction of time the time series ${\mathbf{c}}_{i}$ is above threshold). The detection threshold is defined adaptively as discussed in Materials and methods and Figure 2—figure supplement 1. Features ${x}^{1}$ and ${x}^{2}$ depend explicitly on odor concentration, whereas features ${x}^{3}$, ${x}^{4}$ and ${x}^{5}$ only depend on when the odor is on or off, but not on its intensity. To remark this difference, we refer to ${x}^{1}$ and ${x}^{2}$ as intensity features, and ${x}^{3}$, ${x}^{4}$ and ${x}^{5}$ as timing features. Our input ${\mathbf{x}}_{i}=({x}_{i}^{1},\mathrm{\dots},{x}_{i}^{d})$ is composed of ddimensional vectors of features and we will focus on $d=1,2,5$. We seek to infer distance from the source, thus our output $y$ is the coordinate of the sampling point $\mathbf{z}$ in the downwind direction, i.e. $y={z}_{1}$, with the source placed at the origin (see sketch in Figure 2a). We refer to the figure supplements for results in the crosswind direction, $y={z}_{2}$. We train the algorithm by providing $N$ examples of inputoutput pairs $({\mathbf{x}}_{i},{y}_{i})$ selected randomly from the full simulation, and obtain the function that connects input and output: $y\approx f(\mathbf{x})$.
We propose a machine learning approach where the different odor features are ranked based on their predictive power, rather than their fitting properties. Different datasets of odor/distance pairs are defined. The datasets differ in the way odor measurements are represented in terms of feature vectors. For each dataset we learn a function to predict the distance to target given the corresponding odor features. The predictive power of each function, and corresponding set of features, is then assessed. More precisely, each dataset is split in a training and a test set, as custom in machine learning. Training sets are used to learn functions connecting odor to target location, whereas test sets are used to assess their prediction properties. The training/test split is crucial since the goal is to make good predictions on new, unseen points, that are not within the training sets. From a modeling perspective, a flexible nonlinear/nonparametric approach based on kernel methods is contrasted and shown to be superior to a simpler linear model (Figure 2—figure supplement 2). A careful protocol based on holdout crossvalidation is used to select the hyperparameters of the considered learning models (we refer to Materials and methods and Figure 2—figure supplement 3 for more details).
To illustrate the results we pick the twodimensional plane at height $H/4$ that contains the source. The first result is that individual features ($d=1$) bear useful information for twodimensional source localization even at several meters from the source. Performance is quantified by the normalized squared error averaged over the ${N}_{t}$ points in the test set $\chi ={\sum}_{i=1}^{{N}_{t}}{[{y}_{i}f({\mathbf{x}}_{i})]}^{2}/{\sum}_{i=1}^{{N}_{t}}{[{y}_{i}\overline{y}]}^{2}$. For this dataset, intensity features rank higher than timing features (Figure 2b–c), consistent with previous work Atema, 1996 and predictions are more accurate in the crosswind than in the downwind direction (compare with Figure 2—figure supplement 4). For reference, a random guess with flat probability within the correct lower and upper bounds yields ${\chi}_{\text{random}}=2$, whereas a target function ${f}_{\text{trivial}}(x)={\u27e8y\u27e9}_{\text{test}}$ that learns the average of the output over the test set yields ${\chi}_{\text{trivial}}=1$.
In order to prove that the algorithm captures the dynamics of the scalar and not of the underlying velocity field, we realize three computational fluid dynamics simulations with the odor source at different locations downstream of the obstacle. Each source is placed at a different distance from the obstacle and thus feels different velocity fields, because the flow is inhomogeneous in the downstream direction. To perform a fair comparison we train the algorithm on points sampled from a conical domain based on the most downstream source; for the other two sources, we use an identical domain shifted upstream so that the vertex of the cone is located at the respective source location (see Figure 2—figure supplement 5 left). Performance at source height varies little over the three locations, demonstrating that the algorithm is learning the dynamics of the scalar and not of the carrying flow (see Figure 2—figure supplement 5 center). Additionally we demonstrate that, irrespective of source position, timing features acquire predictive power with height, whereas the opposite is observed for intensity features (see Figure 2—figure supplement 5 right). We discuss this property in further detail in the following. . Finally, we train the algorithm with the odor fields emanating from one source location, and test its performance over odor fields generated by the two other sources. Figure 2—figure supplement 6 shows that performance of individual features varies little when test and training are performed over different dataset. Learning from pairs of features appears somewhat more sensitive to the details of the dataset. In the aggregate, the analysis corroborates that the algorithm captures odor dynamics and is rather insensitive to details of the underlying flow. In the rest of the manuscript we focus on the leftmost source, so as to exploit the full spatial range available from the simulations.
Next we analyze whether and how the sampling strategy affects performance and ranking of the features. Most results are shown for a memory of $100{\tau}_{\eta}\approx 15s$. Performance improves with longer memory (Figure 3a), because this allows to better average out noise and obtain more stable estimates of the features. But improvement follows a slow power law so that waiting for example 20 times longer yields predictions only about twice as precise. On the other hand, waiting as little as $10{\tau}_{\eta}\approx 1.5$ seconds still allows to make predictions, albeit less precise. We then verify whether performance may improve with a larger training set. Because we infer distance from an individual (scalar) feature, the problem is one dimensional and we find that a small number of training points, which we indicate with $N$, is sufficient to reach a plateau in prediction performance (Figure 3b). We choose $N=5000$ training points, which is also robust to the case with more than one feature (Figure 3—figure supplement 1). Finally, sampling more frequently than once per Kolmogorov time does not essentially affect the results nor ranking (Figure 3c). Similar results hold for the crosswind direction (Figure 3—figure supplement 2).
Pairing two observables improves performance in some cases, but not always. In fact, pairing two features of the same category results in little to no improvement (Figure 4 and similarly for the crosswind direction, Figure 4—figure supplement 1). In contrast, combining one intensity and one timing feature improves performance considerably, up to 65%. This result can be understood by mapping the error done by individual features in space (Figure 5—figure supplement 1), showing that intensity and timing features are complementary, that is intensity features perform well in locations where timing features perform poorly.
We next seek to clarify whether the results depend on space. To this end we compose five different dataset, a to e, obtained by extracting odor snapshots from horizontal planes at source height (b), above the source (c to e), and below the source (a) (Figure 5a). From a to e, sparsity increases and intensity decreases (Figure 5b) simply because closer to the boundary, the air slows down and the odor accumulates. By analyzing performance across these dataset, we find that ranking of individual features shifts considerably. The two intensity features outperform all timing features when the dataset is not very sparse (dataset ab, Figure 5c and d left). In contrast, two timing features (intermittency factor and blank duration) outperform all others for the more sparse and less intense dataset de (Figure 5c and d right). Whiff duration performs poorly in de because intermittency is too severe and whiffs are short in duration thus bear little information (the average whiff duration is 1–7 time steps in over 90% of the time series). Although the ranking of individual features shifts with height, pairing one intensity and one timing feature remains the most successful strategy across all heights (Figure 5c and d). In contrast, combining all five features contributes little improvement (Figure 5—figure supplement 2).
Let us now focus on the plane at source height and separate locations based on their distance from the source. We assemble a distal dataset and a proximal dataset, composed of points that are further and closer than 2330η from the source respectively (Figure 6a). The odor is more intense and more sparse closer to the source and it becomes more dilute and less sparse with distance from the source (Figure 6b). Performance of individual features degrades with distance (Figure 6d). Intensity features clearly outperform timing features at close range, as seen both from various percentiles of the test error (Figure 6d, left) as well as the full distribution (Figure 6c, left). The disparity between timing and intensity features disappears in the distal problem: the error distribution for all individual features is essentially superimposed except for the tails (Figure 6c, right and inset), which cause small differences in the median and other percentiles of the error (Figure 6d, right). Remarkably, mixed pairs outperform all individual features in both the distal and proximal problems (Figure 6c–d). In the aggregate, results demonstrate that, even within a single turbulent flow, ranking shifts considerably. Namely, measuring timing of odor encounters is most useful in regions where the odor is dilute, that is far from the source and from the substrate, whereas measuring intensity is most useful in concentrated conditions, that is close to the source or the substrate.
Thus, the predictive power of different features varies greatly in space and time. Next, we show how this spatial variation is dictated by the statistical properties of the odor plume. To this end, we provide an analytical characterization of the test error of individual features, $\chi $, that connects directly to the physics of the problem. Such a characterization is consistent with the observed variation in performance. Note that for large enough samples, the test error $\chi $ approximates the following expected error
The latter is called expected error and is the ideal version of the test error. It can be interpreted as the error summed over all possible inputoutput pairs, weighted by their corresponding joint probability to be sampled. Here, ${f}^{*}(x)=\u27e8yx\u27e9={\int}_{0}^{R}yp(yx)dy$ is the so called regression function, which minimizes the expected error among all possible functions. The regression function can be approximated using Kernel ridge regression and sufficiently rich kernels. Indeed, kernel ridge regression is known to be a so called universal estimator Hastie et al., 2001; Steinwart, 2002. In the above expression, $R$ is the length of the cone, $p(yx)$ is the conditional probability distribution of the output given the input, $p(x,y)$ is the joint probability distribution of the input output pair $(x,y)$ is the prior distribution on the output $y$. The idea is to relate ${f}^{*}$, hence the expected error, to the distribution $p(xy)$ of observing feature $x$ depending on distance $y$. Indeed, the latter is dictated by the fluid dynamics of odor plumes from a concentrated source, and hence provides a more direct connection between the expected error and the physics of the problem. Since the considered features are sample averages, in the limit of large samples, their distribution is well approximated by a Gaussian, hence fully characterized by the mean and standard deviation. Then simple estimates can be computed empirically from our data. Equation (1), provided with these estimates, reproduces the predictive power of individual features showed in Figure 5c (see Figure 5—figure supplement 3). Note that whiffs deviate considerably from a normal distribution, hence the argument needs to be revised for this feature. To move beyond empirical estimates and extend the above reasoning to other flows and generic combinations of (possibly non Gaussian) features, a generalization of the asymptotic arguments proposed in Celani et al., 2014 is needed.
Discussion
Our results demonstrate that within the cone of detection, the time course of an odor bears useful information for source localization even at meters from the source. We find that the concentration and the slope of a turbulent odor signal, averaged over a memory lag, are particularly useful to predict source location at close range or near the boundary. These features quantify the intensity of the odor and its variation. The primacy of the intensity features wanes in more challenging conditions, for example moving away from the source or away from the boundary. In these portions of space, where the odor is scarcer, features that quantify timing of odor detection become as effective as intensity features, or more effective. One of the best studied example of olfactory search in dilute conditions is arguably the case of insects.
Interestingly, olfactory receptor neurons in insects appear to encode efficiently information about timing across a wide range of intensities GorurShandilya et al., 2017; Martelli et al., 2013.
Note that while the statistics of an odor plume clearly depends on all details of the flow and the source, see e.g. Justus et al., 2002; Celani et al., 2014; Fackrell and Robins, 1982, here we keep all of these parameters constant and demonstrate that even within a single flow, odor dynamics and the best predictors vary considerably in space. This begs the next question: do organisms switch between different modalities depending on attributes of odor dynamics, which will vary in space? This could be the case for mice, where the neural activity in the first relay of olfactory processing does in fact depend on how sparse is the odor Lewis et al., 2021. Specifically, sparse odor cues elicit individual responses that follow closely the ups and downs of the odor in time. In contrast, continuous signals elicit intense responses which are however uncorrelated to the temporal dynamics of the odor itself Lewis et al., 2021.
We find that features within the same class are redundant whereas features from different classes are complementary. Indeed, features of the same class have similar patterns of performance in space, but each class has a distinct pattern. As a consequence, measuring both timing and intensity is beneficial, but using more than one feature to quantify either timing or intensity provides no advantage. Combining all features does not improve over the performance of mixed pairs, consistent with redundancy within each class. Note that there is no fundamental reason to expect features from the same class to be redundant, and further work with a larger library of features is needed to prove or disprove this notion.
Importantly, mixed time/intensity pairs of features outrank individual features robustly, that is in all portions of space, regardless of distance from the source and from the ground. This is in contrast with individual features and suggests relying on simultaneous timing and intensity features is advantageous when odors are sensed at various distances from the source and from the substrate. Interestingly, the coexistence of bursting olfactory neurons and canonical olfactory neurons in lobsters suggests these animals are in fact able to measure simultaneously timing and intensity Park et al., 2014; Ache et al., 2016, which is consistent with the increased predictive power of the mixed pairs of features. Similarly, in mammals, optogenetic activation of the olfactory bulb Smear et al., 2013 demonstrates that both kinds of measures guide behavior (lick vs no lick).
In this work, we have investigated the problem of predicting the location of a target from measures of the time course of a turbulent odor. Previous work explored a related question, that is how to best represent instantaneous snapshots of the odor to encode maximum information about source location Victor et al., 2019. The two approaches are not immediately comparable: first, Victor et al., 2019 consider few snapshots of the odor, rather than measures of its time course. Second, maximizing information does not guarantee good predictions (to make predictions information needs to be extracted and processed, and importantly the focus is on new data that were not previously seen). We provide two comments that are relevant if information is the limiting factor for prediction accuracy: (i) binary representations were suboptimal in all conditions considered in Victor et al., 2019; Boie et al., 2018, that is at few tens of cm from the source. This is consistent with our results in concentrated conditions, where timing features accessible through binary representations are suboptimal. Our evidences suggest, however, that the result may not hold in more dilute conditions, where the gap between binary and more accurate representations should become increasingly small. (ii) Individual snapshots of odor from Victor et al., 2019; Boie et al., 2018 contained 1–2 bits of information about source location, but allocating more resources to represent how the odor varies in time was found informative Victor et al., 2019; Boie et al., 2018. Our mixed pairs of features at close range achieve precisions of 5–6%, corresponding to coding for position with words of 4–4.3 bits. Our results thus confirm that memory is indeed useful, but the gain does not increase indefinitely with further memory.
The literature on olfactory navigation is vast. Although a complete review of available algorithms is beyond the scope of the present work, we remark that recent results investigated gradient descent algorithms using either concentration alone Gire et al., 2016, or various measures of timing and intensity Park et al., 2016; Michaelis et al., 2020; Leathers et al., 2020. Overall, both intensity and timing appear to have a potential to lead to an odor source, consistent with our results on individual features. A combination of the two kinds of features was found beneficial in Leathers et al., 2020, consistent with our results on mixed pairs. Note that predicting source location and navigating to reach it are distinct tasks. Although they are often assumed to be intimately connected, whether good predictors may be good variables for navigation in more general contexts remains to be understood.
Here, we have analyzed the features that enable the most accurate prediction of source location. We add a few observations about the significance of the results for animal behavior. First: whether animals rely on features from either class will depend on what features best support behavior. It is often implicitly assumed that features that bear reliable information on source location are also the most useful for navigation. However, this connection between prediction and navigation is far from straightforward and more work is needed to establish whether accurate predictions imply efficient navigation. Second: animals are unlikely to have prior information on the details of the odor source, for example its intensity. Timing features are more robust than intensity features with respect to the intensity of the source and may thus be favored regardless of their performance, which was argued in Schmuker et al., 2016. In our work, timing features are precisely invariant with source intensity because we define the detection threshold adaptively (see Materials and methods). More realistic conditions will need to be evaluated, where dependence on source intensity emerges as a result of nonlinearities that we did not model in this work. These effects emerge for example, close to a boundary which partially absorbs the odor GorurShandilya et al., 2019, or in the case of fixed thresholds, although this dependence is weak in the far field where timing features are most useful Celani et al., 2014. Third: we have focused on predicting source location from within the cone of detection, where an agent will detect the odor quite often. However, a crucial difficulty of turbulent navigation is to find the cone itself. We cannot address the problem of predicting source location from outside the cone because detections are so rare that we lack statistics. The distinction between inside and outside the cone of detection is key for navigation with sparse cues (see Reddy et al., 2021) and deserves further attention.
Materials and methods
Direct numerical simulations of turbulent odor plumes
Request a detailed protocolTo reproduce a realistic odor landscape and generate the dataset showed in Figure 1, we solve the NavierStokes (2) and the advectiondiffusion equation for passive odor transport (3) at all relevant scales of motion from the smallest turbulent eddies (Kolmogorov scale $\eta $) to the integral scale ($L>600\eta$), using Direct numerical simulations (DNS):
where $\mathbf{u}$ is the velocity field, $\rho $ is the fluid density, $P$ is pressure, $\nu $ is the fluid kinematic viscosity, $c$ is the odor concentration, ${\kappa}_{c}$ is its diffusivity and $q$ an odor source. All parameters are listed in Table 1. Note that we use $Sc=1$ which is appropriate for typical odors in air but not in water. However, we expect a weak dependence on the Schmidt number as the Batchelor and Kolmogorov scales are below the size of the source and we are interested in the large scale statistics Falkovich et al., 2001; Celani et al., 2014; Duplat et al., 2010. We simulate a turbulent channel flow with a concentrated odor source and an obstacle that generates turbulence by customizing the opensource software Nek5000 Fischer et al., 2008 developed at Argonne National Laboratory, Illinois. Nek5000 employs a spectral element method (SEM) Patera, 1984 Orszag, 1980 based on Legendre polynomials for discretization Ho, 1989, and a 4th order RungeKutta scheme for time marching. The code is written in fortran77 and C and it uses MPI for parallelization.
The threedimensional channel is divided in $E=\mathrm{160\hspace{0.17em}000}$ discrete elements: $200\times 40\times 20$ (number of elements in length × width × height); within each element the solution is expanded in 8th grade tensorproduct polynomials so that the domain is effectively discretized in 81 920,000 elements. The average spatial resolution is equal in each direction $\mathrm{\Delta}x\approx 4\eta $. A cylindrical cap of height $=160\phantom{\rule{thinmathspace}{0ex}}\eta$ is added on the ground; the cylinder spans the entire width of the channel. The mesh is adapted to fit the cylinder. Fluid flows from left to right and the obstacle generates turbulence in the channel, in particular the height of the cylinder tunes the velocity fluctuations. The velocity fluctuations are defined as $\delta u(\mathbf{z},t)=u(\mathbf{z},t){\u27e8u(\mathbf{z},t)\u27e9}_{y}$; their intensity is ${u}^{\prime}=\sqrt{\u27e8{(\delta u)}^{2}\u27e9}$, where averages are intended in space and time. Table 1 summarizes the parameters that characterize turbulence.
Each simulation runs for 300,000 time steps where $\delta t={10}^{2}{\tau}_{\eta}$ and follows from a severe Courant criterium with $U\mathrm{\Delta}t/\mathrm{\Delta}x<0.4$ to ensure convergence of both the velocity and scalar fields. Snapshots of velocity and odor fields are saved at constant frequency $\omega =1/{\tau}_{\eta}$ (except for results in Figure 3c where snapshots are saved 10 times more frequently). Each DNS requires 2 weeks of computational time using 320 cpus.
Boundary conditions and odor source
Request a detailed protocolWe impose a Poiseuille velocity profile at the inlet: $\mathbf{u}=(u,0,0)$ and $u=6{U}_{b}(\zeta {\zeta}^{2})$, where $\zeta ={z}_{3}/H$ is the vertical coordinate normalized to the height of the channel and ${U}_{b}$ is the mean speed. We set a noslip condition $\mathbf{u}=0$ at the ground and on the obstacle; on the remaining boundaries we impose the turbulent outflow condition defined in Fischer et al., 2007 that imposes a positive exit velocity to avoid potential negative flux and the consequent instability it generates.
More precisely, the divergence ramps up from zero to a positive value along the element closest to the boundary: $\nabla \cdot \mathbf{u}=C[1{({z}_{\u27c2}/\mathrm{\Delta}x)}^{2}]$, where ${z}_{\u27c2}$ is the distance from the boundary and $C=2$ is the minimal value that ensures convergence. For the odor, we impose a Dirichlet condition ($c=0$) at the ground, on the obstacle and at the inlet; while an outflow condition is set at the top, on the sides and at the outlet: $k(\nabla c)\cdot \mathbf{n}=0$. We introduce a source located right above and downstream of the obstacle, at coordinates x_{s} =810η, y_{s} =650η, z_{s} =238η; odor intensity at the source is defined by a gaussian distribution $q={e}^{[{({z}_{1}{x}_{s})}^{2}+{({z}_{2}{y}_{s})}^{2}+{({z}_{3}{z}_{s})}^{2}]/(2{\sigma}^{2})}$, where $\sigma =5\eta $.
Machine learning
Request a detailed protocolTo learn the correct position of a target source given an odor, we propose to use supervised machine learning. We next review some key ideas, and refer to standard textbooks for further details for example Hastie et al., 2001.
The goal in supervised learning is to infer a function $f$ given a training set $({\mathbf{x}}_{1},{y}_{1}),\mathrm{\dots}({\mathbf{x}}_{N},{y}_{N})$ of input/output pairs. A good function estimate should allow to predict the outputs associated to new input points. In our setting each input $\mathbf{x}$ is a one, two, or fivedimensional vector whose entries are scalar features of odor time series, where the odor is sampled at a specific spatial location. From every sampling location, we compute the distance to the source and this distance is the output $y$.
To measure how close the prediction $f(\mathbf{x})$ is to the correct output $y$, we consider the square loss ${(f(\mathbf{x})y)}^{2}$. Following a statistical learning framework, the data are assumed to be sampled according to a fixed but unknown data distribution $P$. In this view, the ideal solution ${f}^{*}$ should minimize the expected loss $\u27e8l(f(\mathbf{x}),y)\u27e9$ over all data distributed according to $P$. In practice, only an empirical loss based on training data can be measured, and the search for a solution needs be restricted to a suitable class of hypothesis. Note that, the choice of the latter is critical since the nature of the function to be learnt is not known a priori. A basic choice is considering linear functions $f(\mathbf{x})=\mathbf{w}\cdot \mathbf{x}$. In this case, minimizing the training loss reduces to linear least squares $\text{min}\frac{1}{N}{YX\cdot \mathbf{w}}^{2}$, where $X$ is the matrix composed of the $N$ training data input $X={({\mathbf{x}}_{1},\mathrm{\dots},{\mathbf{x}}_{N})}^{T}$ and $Y$ is the vector composed of the $N$ labels of the training set $Y={({y}_{1},\mathrm{\dots},{y}_{N})}^{T}$. The corresponding solution is easily shown to be $\mathbf{w}={({X}^{T}X)}^{1}{X}^{T}Y$. In Figure 2—figure supplement 1, we show that the choice of linear models has limited predictive power and does not allow to rank features. To tackle this issue, we consider kernel methods Schölkopf and Smola, 2002, a more powerful class of nonlinear models corresponding to functions of the form $f(\mathbf{x})={\sum}_{i=1}^{N}k({\mathbf{x}}_{i},\mathbf{x}){\mathbf{c}}_{i}.$ Here, $k(\mathbf{x},{\mathbf{x}}^{\prime})$ is a so called kernel, that here we will choose to be the Gaussian kernel $k(\mathbf{x},{\mathbf{x}}^{\prime})={e}^{{\parallel \mathbf{x}{\mathbf{x}}^{\prime}\parallel}^{2}/2{\sigma}^{2}}$. The coefficients $\mathbf{c}=({\mathbf{c}}_{1},\mathrm{\dots},{\mathbf{c}}_{n})$ are given by the expression
which minimizes
In the above expression, $K$ is the $N$ by $N$ matrix with entries ${K}_{ij}=k({\mathbf{x}}_{i},{\mathbf{x}}_{j})$. The first term can be shown to be a data fit term whereas the second term can be shown to control the regularity of the obtained solution Schölkopf and Smola, 2002. The regularization parameter$\lambda $ balances out the two terms and needs be tuned, together with the kernel parameters (the Gaussian width $\sigma $ in our case).
Kernel methods offer a number of advantages. They are nonlinear, and hence can learn a wide range of complex input/output behavior. They are an example of nonparametric models, where the complexity of the model can adapt to the problem at hand and indeed learn any kind of continuous function provided enough data. This can be contrasted to linear models that clearly cannot learn any nonlinear function. Moreover, by tuning the hyperparameters $\lambda ,\sigma $ more or less complex shape can be selected. When $\lambda $ is small we are simply fitting the data, possibly at the price of stability, whereas for large $\lambda $ we are favoring simpler models. With small $\sigma $ we allow highly varying functions, whereas with large enough $\sigma $ we essentially recover linear models.
Indeed, the choice of these parameters is crucial and tested and visualized in Figure 2—figure supplement 3. Here it is shown that for $\lambda \to 0$, the solution incurs in the well known stability issues for large $\sigma $ and overfitting issues for small $\sigma $. We note that ideally one would want to choose these hyperparameters minimizing the test error; however, this would lead to overoptimistic estimates of the prediction properties of the obtained model. Hence, we consider a holdout cross validation protocol, where the training data are further split in a training and a validation sets. The new training set is used to compute solutions corresponding to different hyperparameters. The validation set is used as a proxy for the text error to select the hyperparameters with small corresponding error. The prediction properties of the model thus tuned is then assessed on the test set.
Dataset
Request a detailed protocolTo compose the dataset for regression we first extract twodimensional snapshots of odor at fixed height from the 3D simulation. Each snapshot from the simulation has dimensions $1600\times 320$ (number of points in the downwind direction × crosswind direction). The initial evolution up to $300{\tau}_{\eta}$ is excluded from the analysis as odor has not yet reached a stationary state. At stationary state we save 2700 frames at frequency $\omega =1/{\tau}_{\eta}$ per simulation. Thus at each spatial location we have the entire time evolution composed of 2700 time points at regular intervals of ${\tau}_{\eta}$. We partition each simulation in fragments with $M$ snapshots (duration $M{\tau}_{\eta}$). Most simulations are shown for $M=100$, thus for each spatial location we have 27 time series of the same duration (except for results leading to Figure 3a, where we vary memory from $10{\tau}_{\eta}$ to $250{\tau}_{\eta}$ resulting in 270 to 10 time series per location respectively).
The characteristic shape of the odor plume is a cone (Figure 2), that we defined as the region where the probability of detection computed over the entire simulation is larger than 0.35. The training set and test set are obtained by extracting $N=5000$ (unless otherwise stated) and ${N}_{\text{t}}=13500$ time series portions of duration $M{\tau}_{\eta}$. To select these $M$long time series, we extract random locations ${\mathbf{z}}_{i}$ to cover homogeneously the cone, i.e. with flat probability within the cone, and random initial times t_{i}, with the training in the first half of the time history and the test in the second half of the time history. Time series that remain entirely under threshold are excluded.
Each odor time series is further processed by computing five features, two of which quantify intensity of the odor and rely on a precise representation of odor concentration (average concentration and average peak slope) and three of which quantify timing of odor encounters and are computed after binarizing the odor (average whiff and blank duration and intermittency factor). The threshold ${c}_{thr}$ used for binarization is adaptive i.e. ${c}_{thr}=0.5\u27e8cc>0\u27e9$, where the average is computed over each time series separately. The threshold thus varies from ${c}_{thr}=0.5{c}_{0}$ at the source to ${c}_{thr}={10}^{6}{c}_{0}$ at the farthest edges of the cone, where c_{0} is the concentration at the source. The choice of an adaptive threshold was suggested in GorurShandilya et al., 2017. The precise value of the relative threshold has little effect on the results as shown in Figure 2—figure supplement 1, left. Fixed thresholds were tested and discarded because results depend sensibly on the threshold and the optimal threshold varies with the dataset in nontrivial ways (Figure 2—figure supplement 1, right). Finally, adaptive thresholds that are defined based on purely local information appear more plausible for a biological system that has no information on the intensity of the source.
The parameters $\lambda $ and $\sigma $ are obtained through 4folds cross validation: the training set is split in 4 equal parts, 3 are used for training and 1 for validation. The empirical risk is computed on the validation set and averaged over the 4 possible permutations, systematically varying the hyperparameters $\lambda ,\sigma $. The couple of hyperparameters that minimize the empirical risk over the validation set is selected through grid search using an $8\times 8$ regular grid and further refined with a $4\times 4$ subgrid. Results are insensitive to further refinement because there is a large plateau around the minimum, as shown in Figure 5—figure supplement 2. The optimal hyperparameters are used to compute the solution (4). The error $\chi $ used throughout the manuscript is simply the normalized test error $\chi ={\sum}_{i=1}^{{N}_{t}}{[{y}_{i}f({\mathbf{x}}_{i})]}^{2}/{\sum}_{i=1}^{{N}_{t}}{[{y}_{i}\overline{y}]}^{2}$. For most of the figures, we used 5000 training points and up to 13,500 testing points (we removed from the dataset all the points with null entries for the entire time span), we implemented Kernel ridge regression using FALKON Rudi et al., 2018, a fast algorithm for matrix inversion (the number of iterations is set to 5 and the number of Nystrom centers is equal to the number of points in the training set) and we used it both for training and test.
Modeling the expected performance of individual features
Request a detailed protocolThe importance of a feature is quantified by the corresponding optimal test error
where $R$ is the length of the cone. The target, or regression, function is given by ${f}^{*}$:
and can be shown to minimize the expected error among all possible functions. In practice, the optimal expected error cannot be computed exactly, since neither the target nor the data distribution are known. In this paper, we choose to estimate the regression function from data using Kernel ridge regression with a Gaussian kernel. The choice of this latter approach is due to its nonparametric nature, which ensures that any target function ${f}^{*}$ can be recovered provided large enough samples, and more generally that accurate estimates can be derived when only finite data are given Hastie et al., 2001; Steinwart, 2002. Provided with a kernel ridge regression estimator, an approximation to the optimal test error can then be computed on a holdout set of data.
Next, we are interested into developing a clearer connection between the above statistical approach and quantities with a direct physical meaning. Towards this end, note that the joint, marginal and posterior distributions, given the prior on $y$ and the likelihood $p(xy)$ are given by,
Assuming that points are sampled uniformly within the cone of detection, the prior is $p(y)=2y/{R}^{2}$ (thus $\overline{y}=2R/3$ and the denominator in Equation 5 is ${R}^{2}/18$). Importantly, the likelihood $p(xy)$ is dictated by the fluid dynamics of odor plumes from a concentrated source. Our features are sample averages of intensity or timing attributes of the odor signal, and in the limit of large samples, they are normally distributed, that is
Assuming the asymptotic limit to be well approximated, we can find empirical estimates for $g(y)$ and $s(y)$ given the data. This provides the desired characterization of the expected error and hence of the importance of each given feature in term of predictive power. Indeed, by numerically solving Equations 5–9 with the assumption (Equation 10) and using the empirical estimates for $g(y)$ and $s(y)$, we reproduce the predictive power of individual features showed in Figure 5c (see Figure 5—figure supplement 3). To move beyond empirical estimates of the likelihood and generalize predictions to other kinds of flows one could generalize asymptotic models of turbulent plumes developed in Celani et al., 2014 to account for zvariations of the sampling locations. Similarly, extension are needed to consider combination of possibly non Gaussian features.
Data availability
Request a detailed protocolSimulations of odor transport are generated through NEK5000 Fischer et al., 2008, freely available from Argonne National Laboratory (https://nek5000.mcs.anl.gov/). Outputs from DNS presented in Figure 1 are processed to extract time series and compute the five features described in the text (average concentration, slope, blank duration, whiff duration and intermittency factor). These data are available at the online repository https://osf.io/ja9xr/. The results of kernel ridge regression perfomed on these data are presented in Figures 2—6. We perform kernel ridge regression on these data with the freely available code FALKON Rudi et al., 2018 (https://github.com/LCSL/FALKON_paper, Rigolli, 2022 copy archived at swh:1:rev:480741cf1e7da0d1d7415309cd6f254080a6ca17).
Data availability
Simulations of odor transport are generated through NEK5000 (43), freely available from Argonne National Laboratory (https://nek5000.mcs.anl.gov/). Outputs from DNS presented in Figure 1 are processed to extract time series and compute the five features described in the text (average concentration, slope, blank duration, whiff duration and intermittency factor). These data are available at the online repository https://osf.io/ja9xr/. The results of kernel ridge regression performed on these data are presented in Figures 26. We perform kernel ridge regression on these data with the freely available code FALKON (https://github.com/LCSL/FALKON_paper, copy archived at swh:1:rev:480741cf1e7da0d1d7415309cd6f254080a6ca17).
References

Smelling time: a neural basis for olfactory scene analysisTrends in Neurosciences 39:649–655.https://doi.org/10.1016/j.tins.2016.08.002

Eddy chemotaxis and odor landscapes: exploration of nature with animal sensorsThe Biological Bulletin 191:129–138.https://doi.org/10.2307/1543074

Algorithms for olfactory search across speciesThe Journal of Neuroscience 38:9383–9389.https://doi.org/10.1523/JNEUROSCI.166818.2018

Informationtheoretic analysis of realistic odor plumes: What cues are useful for determining location?PLOS Computational Biology 14:e1006275.https://doi.org/10.1371/journal.pcbi.1006275

Encoding a temporally structured stimulus with a temporally structured neural representationNature Neuroscience 8:1568–1576.https://doi.org/10.1038/nn1559

Stereo and serial sniffing guide navigation to an odour source in a mammalNature Communications 4:1441.https://doi.org/10.1038/ncomms2444

Odor landscapes in turbulent environmentsPhysical Review X 4:041015.https://doi.org/10.1103/PhysRevX.4.041015

Entanglement rules for random mixturesPhysical Review Letters 105:034504.https://doi.org/10.1103/PhysRevLett.105.034504

Concentration fluctuations and fluxes in plumes from point sources in a turbulent boundary layerJournal of Fluid Mechanics 117:1–26.https://doi.org/10.1017/S0022112082001499

Particles and fields in fluid turbulenceReviews of Modern Physics 73:913–975.https://doi.org/10.1103/RevModPhys.73.913

Simulation of highReynolds number vascular flowsComputer Methods in Applied Mechanics and Engineering 196:3049–3060.https://doi.org/10.1016/j.cma.2006.10.015

Controlling and measuring dynamic odorant stimuli in the laboratoryThe Journal of Experimental Biology 222:207787.https://doi.org/10.1242/jeb.207787

BookThe Elements of Statistical LearningNew York, NY: Springer.https://doi.org/10.1007/9780387216065

ThesisA Legendre spectral element method for simulation of incompressible unsteady viscous freesurface flowsMassachusetts Institute of Technology, Cambridge, USA.

Olfactory coding in the turbulent realmPLOS Computational Biology 13:e1005870.https://doi.org/10.1371/journal.pcbi.1005870

Measurement of odorplume structure in a wind tunnel using a photoionization detector and a tracer gasEnvironmental Fluid Mechanics 2:115–142.https://doi.org/10.1023/A:1016227601019

Plume dynamics structure the spatiotemporal activity of mitral/tufted cell networks in the mouse olfactory bulbFrontiers in Cellular Neuroscience 15:633757.https://doi.org/10.3389/fncel.2021.633757

Intensity invariant dynamics and odorspecific latencies in olfactory receptor neuron responseThe Journal of Neuroscience 33:6285–6297.https://doi.org/10.1523/JNEUROSCI.042612.2013

Odor landscapes and animal behavior: tracking odor plumes in different physical worldsJournal of Marine Systems 49:55–64.https://doi.org/10.1016/j.jmarsys.2003.05.005

Odor plumes and how insects use themAnnual Review of Entomology 37:505–532.https://doi.org/10.1146/annurev.en.37.010192.002445

Biophysical mechanisms underlying olfactory receptor neuron dynamicsNature Neuroscience 14:208–216.https://doi.org/10.1038/nn.2725

Spectral methods for problems in complex geometriesJournal of Computational Physics 37:70–92.https://doi.org/10.1016/00219991(80)900054

Odor concentration change coding in the olfactory bulbENeuro 6:ENEURO.039618.2019.https://doi.org/10.1523/ENEURO.039618.2019

Intermittency coding in the primary olfactory system: A neural substrate for olfactory scene analysisThe Journal of Neuroscience 34:941–952.https://doi.org/10.1523/JNEUROSCI.220413.2014

Neurally encoding time for olfactory navigationPLOS Computational Biology 12:e1004682.https://doi.org/10.1371/journal.pcbi.1004682

A spectral element method for fluid dynamics: Laminar flow in A channel expansionJournal of Computational Physics 54:468–488.https://doi.org/10.1016/00219991(84)901281

BookTurbulent FlowsCambridge: Cambridge University Press.https://doi.org/10.1017/CBO9780511840531

SoftwareFALKON_paper, version swh:1:rev:480741cf1e7da0d1d7415309cd6f254080a6ca17Software Heritage.

ConferenceNeural information processing systemsProceedings of the 31st International Conference on Neural Information Processing Systems.

Local isotropy in turbulent boundary layers at high Reynolds numberJournal of Fluid Mechanics 268:333–372.https://doi.org/10.1017/S0022112094001370

Exploiting plume structure to decode gas source distance using metaloxide gas sensorsSensors and Actuators B 235:636–646.https://doi.org/10.1016/j.snb.2016.05.098

Multiple perceptible signals from a single olfactory glomerulusNature Neuroscience 16:1687–1691.https://doi.org/10.1038/nn.3519

Support vector machines are universally consistentJournal of Complexity 18:768–791.https://doi.org/10.1006/jcom.2002.0642

Mechanisms of animal navigation in odor plumesThe Biological Bulletin 198:203–212.https://doi.org/10.2307/1542524

Olfactory navigation and the receptor nonlinearityThe Journal of Neuroscience 39:3713–3727.https://doi.org/10.1523/JNEUROSCI.251218.2019
Decision letter

Raymond E GoldsteinReviewing Editor; University of Cambridge, United Kingdom

Aleksandra M WalczakSenior Editor; CNRS LPENS, France
In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.
Decision letter after peer review:
Thank you for submitting your article "Learning to predict target location with turbulent odor plumes" for consideration by eLife. Your article has been reviewed by 2 peer reviewers, one of whom is a member of our Board of Reviewing Editors, and the evaluation has been overseen by Aleksandra Walczak as the Senior Editor. The reviewers have opted to remain anonymous.
The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.
Essential revisions:
1) It would be good if the authors could provide evidence that their conclusions do not depend on the precise location of the odor source relative to the cylinder. We do not require an exhaustive study, but some evidence would be very helpful.
2) We think that more could have been understood from these data, had the authors tried to focus on dimensionless quantities. It would be important to understand what sets the distance from the source where the intensitybased search strategy becomes less effective than the timebased one, and how the results depend on the Schmidt number.
3) The authors should carefully specify how dimensionless parameters are introduced. They do it in places but not systematically. For instance, how are wavenumbers in Figure 1(d) are made dimensionless?
4) On a similar note, how does the range of k's in Figure 1(d) compare to the height of the channel? Are these k's taken along the streamwise direction only? What is the vertical axis in that graph? Would it integral over all k's correspond to the odour density variance? These need to be specified.
5) What is the value of odour diffusivity \kapppa_\theta? What is the Schmidt number in these simulations? Visual inspection of Figure 1(d) suggests that Sc>>1, which might explain why the 5/3 slope is so far from representing the data. A reference to passive scalar advection by turbulence is in order here. We would suggest K. Sreenivasan, Turbulent mixing: A perspective, PNAS (2019).
6) Perhaps T in line 335 should be replaced by \theta.
7) Please provide figure numbers in lines 377, 412, and 414.Reviewer #1 (Recommendations for the authors):
In this work, the authors combine high resolution numerical simulations of turbulent airflow that advects a passive scalar (an "odor") and machine learning algorithms to investigate the question of how a navigation strategy based on the temporal dynamics of odor detection compares with one based on the intensity of the plume. In the simulations, the odor is introduced downstream of a cylinder that induces turbulence in the flow and the machine learning algorithm is trained and evaluated at various points further downstream. The authors conclude that intensity/gradient based measurements work closer to the source, while temporal schemes are good throughout the range. The study is done to a very high standard and presented clearly, although there are general questions about the data analysis (see below) that should be improved. That said, the work should have significant impact on a broad range of fields, from sensing to navigation, across a range of organism length scales.
One concern is the nature of the turbulent profile and its advection of the passive scalar. It would help if the characteristic dimensionless numbers of the problem were specified (Peclet, Schmidt numbers), and it was made clear how the choice of odor release point affects the conclusions. The turbulence itself spreads and diffuses with distance from the source, and one would presume the location of the odor release can matter substantially.
Also, as mentioned briefly in the Discussion, this work examines algorithms based on evaluating accuracy of prediction at a given measurement point. Unless I have missed something in the presentation, the issue of navigation is left unexamined. As in bacterial chemotaxis or any related problem, no navigation strategy is perfect, especially in the face of such a fluctuating source of information, so the full problem involves making estimates of where the source is and then moving to a new location, estimating again, moving, etc. The authors should clarify under what circumstances an evaluation at a fixed point is sufficiently predictive of "learning".Reviewer #2 (Recommendations for the authors):
In their manuscript, Rigolli et al., studied how measurements of intensity of a passive scalar (odour), its spatial gradients, and its time variations can be used to efficiently find the spatial location of its source. To this end, the authors performed direct numerical simulations of highReynolds number pressuredriven channel flow, partially blocked by a cylinder to generate turbulent velocity fluctuations. At a fixed position downstream the cylinder, they introduce a source of odour that diffuses through the fluid and is advected by its turbulent motion. The authors then trained a supervised machine learning algorithm on a collection of spatial and temporal odour profiles thus obtained. The algorithm was subsequently tested on various odour signals originating from the same pool of simulations. By varying the spatial position of the measurement point and the temporal exposure, the authors concluded that detection strategies based on the odour concentration and its spatial gradient work best at small separations from the source (and also close to the zeroodour concentration boundaries), while timebased strategies work reasonably well everywhere within the domain selected for detection.
Within the geometry set by the authors, the conclusions of the paper are supported by data. However, this setup leads to a natural question whether the search strategies determined here pertain to the distance to the odour source or to the distance to the turbulence source (the halfcylinder). Turbulence generated by an obstacle has a particular spatial profile; its temporal profile also depends on the distance to the obstacle. It is therefore possible that the machine learning algorithm has indirectly picked up these features rather than the odour profile itself. This can be settled by feeding the existing algorithm signals from simulations with various distances between the source and the obstacle. In the absence of this check, it is impossible to make general statements about applicability of these search strategies in other situations.
Another weak point of this study is the use of a cone of detection as it dramatically reduces the search complexity to almost a quasi onedimensional problem. The authors appreciate it and point out that their choice is forced by a very limited by a very small number of detections outside the cone. This shortcoming should be addressed in future work.
https://doi.org/10.7554/eLife.72196.sa1Author response
Essential revisions:
1) It would be good if the authors could provide evidence that their conclusions do not depend on the precise location of the odor source relative to the cylinder. We do not require an exhaustive study, but some evidence would be very helpful.
We expect results to be robust to changes in source location, as long as this is within the bulk of the channel and not within the viscous layer, which generates substantially less intermittent odor plumes (see e.g. Fackrell and Robins, J Fluid Mech 117:1, 1982). We performed additional computational fluid dynamics simulations, moving the source to two further locations downstream of the obstacle, at the same height and performed inference over an identical conical domain shifted to match the source location (see Figure 2, Figure Supplement 5 left). Performance at source height varies little over the three locations, demonstrating that the algorithm learns from the dynamics of the scalar and not of the underlying velocity field (we chose one individual feature per class and their combination, see Figure 2, Figure Supplement 5 center). For all three sources, performance of the average degrades with height, whereas performance of intermittency improves with height (Figure 2, Figure Supplement 5 right), consistent with the results of the Figure 5c and despite using a different cone. Finally, we trained the algorithm with a dataset generated by one source and tested it on dataset obtained from the other sources: we found that pairs of features are more sensitive to the details of the dataset, whereas individual features maintain their full predictive power (Figure 2 Figure Suppl 6).
If and how animals adapt their models to deal with sources at different heights is a fascinating question that is relevant for animal behavior; we believe this point deserves an in depth investigation on its own and we are currently addressing these questions in a different study.
2) We think that more could have been understood from these data, had the authors tried to focus on dimensionless quantities. It would be important to understand what sets the distance from the source where the intensitybased search strategy becomes less effective than the timebased one, and how the results depend on the Schmidt number.
We thank the reviewers for this important observation that led us to develop a theoretical frame work that we believe deserves further attention. We highlight below the conceptual building blocks, which we have detailed in the manuscript. The predictive power χ of an individual feature x is its expected error: $\frac{{\int}_{0}^{R}(y{f}^{\ast}\left(x\right){)}^{2}p\left(x,y\right)\text{dy}}{{\int}_{0}^{R}(y\overline{y}{)}^{2}p(y)dy}$ where p(x,y) is the joint probability distribution of the input output pair (x,y)and p(y) is the prior on the output y.f*(x) is the optimal predictor and can be shown to be ${f}^{\ast}\left(x\right)={\int}_{0}^{R}\text{yp}\left(yx\right)dy,$ where R is the length of the cone (see Hastie et al., The elements of statistical learning: Datamining, inference, and prediction, 2001 and Steinwart, J of Complexity 18:768, 2002). The expected error is generally unavailable because p<milestonestart />(<milestoneend />yx<milestonestart />)<milestoneend /> and p(x,y) are unknown. However, we can connect equation (1) to the physics of the odor plume by computing the joint, marginal and posterior distributions using the likelihood p(x│y):$p\left(x,y\right)=p\left(xy\right)p(y)$ $p\left(x\right)={\int}_{0}^{\mathrm{\infty}}p\left(x,y\right)\text{dy}$ $p\left(yx\right)=\frac{p(x,y)}{p(x)}$ where the prior is p(y)=2y/R^{2} (thus $\overline{y}$= 2R/3 and the denominator in equation (1) is R^{2}/18). Importantly, the likelihood is dictated by the fluid dynamics of odor plumes from a concentrated source. Because our individual features are sample averages, their statistics is approximately Gaussian: $p\left(xy\right)N(xg\left(y\right),s\left(y\right))$ and we can then simply characterize the likelihood by data fitting to estimate their average and standard deviation. By solving equations (1) to (4) with the assumption (5) and the empirical estimates for g(y) and s(y) we recover the predictive power of individual features showed in Figure 5c of the manuscript (see Figure 1). (The argument needs to be better adapted to whiffs which deviate considerably from a normal distribution)
We believe this theoretical framework deserves further attention. Indeed, one can move
beyond empirical estimates of the likelihood and generalize predictions to pairs of features and other kinds of flows by leveraging the asymptotic arguments recently proposed by Celani et al., Phys Rev 4:041015, 2014. We do not further elaborate on this point here, but we are currently working to fully develop these asymptotic arguments not only to obtain scaling behaviors for g(y) and s(y) but also the prefactor, which is crucial to understand what nondimensional quantities dictate the switch from timing to intensity. Note that in this regime the Schmidt number plays a minor role, as we further elaborate on below.
3) The authors should carefully specify how dimensionless parameters are introduced. They do it in places but not systematically. For instance, how are wavenumbers in Figure 1(d) are made dimensionless?
Thank you for the comment, we have provided information on the Schmidt number used in the simulations (Sc = 1) and the definition of wavenumbers in Figure 1d (see next answer for more details on this point), as well as dimensional parameters that were either only mentioned in the text (viscosity) or not mentioned (diffusivity). All information is now summarized in Table 1.
4) On a similar note, how does the range of k’s in Figure 1(d) compare to the height of the channel? Are these k’s taken along the streamwise direction only? What is the vertical axis in that graph? Would it integral over all k’s correspond to the odour density variance? These need to be specified.
Thank you for the comment, we have included the missing information. Figure 1d shows the two dimensional spectra $Ek=\frac{d}{\text{dk}}({\int}_{\u2502kxy\u2502<k}\u2502\hat{c}\left({k}_{\text{xy}}\right){\u2502}^{2}{d}^{2}k)$ normalized with the scalar variance σc2, where ĉ(κ_{xy}) is the two dimensional Fourier transform of the scalar concentration at the height of the source. The integral of the spectra is indeed the scalar variance. We nondimensionalize the wavenumber with the inverse Kolmogorov scale η −1. We corrected a small discrepancy coming from the fact that the Legendre polynomials used by Nek5000 have sampling points that are not perfectly equally spaced. The k−5∕3 scaling holds for k η 0.1, consistent with previous experimental results in channel flow (see e.g. S.G. Saddoughi and S.V. Veeravalli J. Fluid Mech.(1994) 268:333372). To further characterize the flow we add a panel to figure 1, showing that the mean flow follows the well known law of the wall, recovering classical statistics for channel turbulence.
5) What is the value of odour diffusivity \kapppa_\theta? What is the Schmidt number in these simulations? Visual inspection of Figure 1(d) suggests that Sc>>1, which might explain why the 5/3 slope is so far from representing the data. A reference to passive scalar advection by turbulence is in order here. We would suggest K. Sreenivasan, Turbulent mixing: A perspective, PNAS (2019).
Here we work at Schmidt = 1, κθ = ν = 1.5 × 10−5m2∕s. The inertial range is short but consistent with previous literature on channel flow (see previous answer). Varying the Schmidt number affects the fine details between the Batchelor and Kolmogorov scales (Falkovich et al., Rev Mod Phys 105 73:913, 2001). These are below the size of the source in our simulations, as is also the case in 106 many situations of practical relevance. In this regime, the dynamics below the Kolmogorov scale 107 affects little the large scale statistics of odor plumes from concentrated sources which are dictated 108 by the separation of Lagrangian particles (as argued in Celani et al., Phys Rev X, 2014 and confirmed experimentally in Duplat et al., Phys Fluids, 22:035104, 2010). We included this comment at page 12.
6) Perhaps T in line 335 should be replaced by \theta.
Thank you, this is indeed a typo and should read c
7) Please provide figure numbers in lines 377, 412, and 414.
Thank you, we have added the references to the figures
Reviewer #1 (Recommendations for the authors):
In this work, the authors combine high resolution numerical simulations of turbulent airflow that advects a passive scalar (an "odor") and machine learning algorithms to investigate the question of how a navigation strategy based on the temporal dynamics of odor detection compares with one based on the intensity of the plume. In the simulations, the odor is introduced downstream of a cylinder that induces turbulence in the flow and the machine learning algorithm is trained and evaluated at various points further downstream. The authors conclude that intensity/gradient based measurements work closer to the source, while temporal schemes are good throughout the range. The study is done to a very high standard and presented clearly, although there are general questions about the data analysis (see below) that should be improved. That said, the work should have significant impact on a broad range of fields, from sensing to navigation, across a range of organism length scales.
One concern is the nature of the turbulent profile and its advection of the passive scalar. It would help if the characteristic dimensionless numbers of the problem were specified (Peclet, Schmidt numbers), and it was made clear how the choice of odor release point affects the conclusions. The turbulence itself spreads and diffuses with distance from the source, and one would presume the location of the odor release can matter substantially.
We thank the reviewer for the comment. We have now stated the nondimensional numbers and remarked that in our regime the Schmidt number is expected to play a minor role (see response to general comments above). To analyze how the results depend on source location, we have conducted a series of numerical simulations with the source located further downstream from its original location, and this does not affect our conclusion. We did not move the source laterally, as this cannot affect the results because the flow is homogeneous in the crosswind direction z2. We expect that moving the source to the ground will affect the results because intermittency of an odor plume from a source located at ground is much less intermittent (see response to general comment 1). Animals that would target both sources in air and on ground would clearly have to use qualitatively different predictive models. Ethological studies on dogs suggest that they do track both sources on ground and in air, and that they do this by alternating sniffing in air and on ground.
We are currently investigating this fascinating behavior.
Also, as mentioned briefly in the Discussion, this work examines algorithms based on evaluating accuracy of prediction at a given measurement point. Unless I have missed something in the presentation, the issue of navigation is left unexamined. As in bacterial chemotaxis or any related problem, no navigation strategy is perfect, especially in the face of such a fluctuating source of information, so the full problem involves making estimates of where the source is and then moving to a new location, estimating again, moving, etc. The authors should clarify under what circumstances an evaluation at a fixed point is sufficiently predictive of "learning".
We agree entirely with the referee on this point, which we now stress more in the Discussion. We are currently working to test whether the best predictors are also the best features to be used in navigation. Although it is often implicitly assumed that this is indeed the case, up until now this question is entirely unexplored.
Reviewer #2 (Recommendations for the authors):
In their manuscript, Rigolli et al., studied how measurements of intensity of a passive scalar (odour), its spatial gradients, and its time variations can be used to efficiently find the spatial location of its source. To this end, the authors performed direct numerical simulations of highReynolds number pressuredriven channel flow, partially blocked by a cylinder to generate turbulent velocity fluctuations. At a fixed position downstream the cylinder, they introduce a source of odour that diffuses through the fluid and is advected by its turbulent motion. The authors then trained a supervised machine learning algorithm on a collection of spatial and temporal odour profiles thus obtained. The algorithm was subsequently tested on various odour signals originating from the same pool of simulations. By varying the spatial position of the measurement point and the temporal exposure, the authors concluded that detection strategies based on the odour concentration and its spatial gradient work best at small separations from the source (and also close to the zeroodour concentration boundaries), while timebased strategies work reasonably well everywhere within the domain selected for detection.
Within the geometry set by the authors, the conclusions of the paper are supported by data. However, this setup leads to a natural question whether the search strategies determined here pertain to the distance to the odour source or to the distance to the turbulence source (the halfcylinder). Turbulence generated by an obstacle has a particular spatial profile; its temporal profile also depends on the distance to the obstacle. It is therefore possible that the machine learning algorithm has indirectly picked up these features rather than the odour profile itself. This can be settled by feeding the existing algorithm signals from simulations with various distances between the source and the obstacle. In the absence of this check, it is impossible to make general statements about applicability of these search strategies in other situations.
Thank you for your comment. To test robustness of the algorithm, we have performed two additional sets of simulations moving the source further downstream of the obstacle: the results are consistent with the results of the main simulation; source location may affect details about where exactly the transition between the timing vs intensity regimes occurs. First, we verified that performance of 2 individual features and their pair does not depend on source location, confirming that the algorithm learns solely from odor statistics. Second, we trained the algorithm with data from one simulation and tested its performance over datasets from the remaining simulations.
We found that performance of individual features is preserved even when training and test are performed over different dataset. Pairs of features are more sensitive to consistency between training and test dataset (see Figure 2 Figure supplement 6). Third, we analysed performance for each source location as a function of height of the sampling plane. We recovered that timing features improve with height whereas intensity features degrade with height, as showed in Figure 5c (Figure 2 Figure Supplement 5, right). We note that the transition between timing vs intensity may occur at different heights for the three source locations, we did not investigate this comparison in more detail.
Another weak point of this study is the use of a cone of detection as it dramatically reduces the search complexity to almost a quasi onedimensional problem. The authors appreciate it and point out that their choice is forced by a very limited by a very small number of detections outside the cone. This shortcoming should be addressed in future work.
We entirely agree with the referee, in fact olfactory searches first need to locate the plume and subsequently locate the source within the plume. Our manuscript defines the most informative features once the agent is within the plume. We believe that our results are not relevant for the previous stage when the agent searches for the plume. This is because there is hardly any detection at all outside of the cone. In fact, our ongoing work (currently under review) demonstrates a model based navigation strategy that uses absence of detection to narrow down the possible locations to search for the plume.
https://doi.org/10.7554/eLife.72196.sa2Article and author information
Author details
Funding
European Research Council (101002724)
 Agnese Seminara
Air Force Office of Scientific Research (FA86552017028)
 Lorenzo Rosasco
National Institutes of Health (R01DC018789)
 Agnese Seminara
Agence Nationale de la Recherche (ANR15IDEX01)
 Agnese Seminara
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
This work was supported by: the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 101002724 RIDING); the Air Force Office of Scientific Research under award number FA86552017028; the National Institutes of Health (NIH) under award number R01DC018789; the French government, through the UCAJEDI Investments in the Future project managed by the National Research Agency (ANR) under reference number #ANR15IDEX01. The authors are grateful to the OPAL infrastructure from Université Côte d’Azur and the Université Côte d’Azur’s Center for HighPerformance Computing for providing resources and support. N.M. and N.R. are thankful for the support of Instituto Nazionale di Fisica Nucleare (INFN) Scientific Initiative SFT: Statistical Field Theory, LowDimensional Systems, Integrable Models and Applications.
Senior Editor
 Aleksandra M Walczak, CNRS LPENS, France
Reviewing Editor
 Raymond E Goldstein, University of Cambridge, United Kingdom
Publication history
 Preprint posted: June 16, 2021 (view preprint)
 Received: July 14, 2021
 Accepted: July 18, 2022
 Version of Record published: August 12, 2022 (version 1)
Copyright
© 2022, Rigolli et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Metrics

 365
 Page views

 89
 Downloads

 1
 Citations
Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.
Download links
Downloads (link to download the article as PDF)
Open citations (links to open the citations from this article in various online reference manager services)
Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)
Further reading

 Physics of Living Systems
 Plant Biology
Dandelion seeds respond to wet weather by closing their plumes, which reduces dispersal when wind conditions are poor.

 Physics of Living Systems
The actomyosin cytoskeleton is a crucial driver of morphogenesis. Yet how the behavior of largescale cytoskeletal patterns in deforming tissues emerges from the interplay of geometry, genetics, and mechanics remains incompletely understood. Convergent extension in D. melanogaster embryos provides the opportunity to establish a quantitative understanding of the dynamics of anisotropic nonmuscle myosin II. Cellscale analysis of protein localization in fixed embryos suggests that gene expression patterns govern myosin anisotropy via complex rules. However, technical limitations have impeded quantitative and dynamic studies of this process at the whole embryo level, leaving the role of geometry open. Here we combine in toto live imaging with quantitative analysis of molecular dynamics to characterize the distribution of myosin anisotropy and the corresponding genetic patterning. We found pair rule gene expression continuously deformed, flowing with the tissue frame. In contrast, myosin anisotropy orientation remained approximately static, and was only weakly deflected from the stationary dorsalventral axis of the embryo. We propose that myosin is recruited by a geometrically defined static source, potentially related to the embryoscale epithelial tension, and account for transient deflections by cytoskeletal turnover and junction reorientation by flow. With only one parameter, this model quantitatively accounts for the time course of myosin anisotropy orientation in wildtype, twist, and evenskipped embryos as well as embryos with perturbed egg geometry. Geometric patterning of the cytoskeleton suggests a simple physical strategy to ensure a robust flow and formation of shape.