Learning to predict target location with turbulent odor plumes

  1. Nicola Rigolli  Is a corresponding author
  2. Nicodemo Magnoli
  3. Lorenzo Rosasco
  4. Agnese Seminara  Is a corresponding author
  1. Department of Physics, University of Genova, Italy
  2. Institut de Physique de Nice, Université Côte d’Azur, Centre National de la Recherche Scientifique, France
  3. National Institute of Nuclear Physics, Italy
  4. MalGa, Department of Civil, Chemical and Environmental Engineering, University of Genoa, Italy
  5. MaLGa, Department of computer science, bioengineering, robotics and systems engineering, University of Genova, Italy

Abstract

Animal behavior and neural recordings show that the brain is able to measure both the intensity and the timing of odor encounters. However, whether intensity or timing of odor detections is more informative for olfactory-driven behavior is not understood. To tackle this question, we consider the problem of locating a target using the odor it releases. We ask whether the position of a target is best predicted by measures of timing vs intensity of its odor, sampled for a short period of time. To answer this question, we feed data from accurate numerical simulations of odor transport to machine learning algorithms that learn how to connect odor to target location. We find that both intensity and timing can separately predict target location even from a distance of several meters; however, their efficacy varies with the dilution of the odor in space. Thus, organisms that use olfaction from different ranges may have to switch among different modalities. This has implications on how the brain should represent odors as the target is approached. We demonstrate simple strategies to improve accuracy and robustness of the prediction by modifying odor sampling and appropriately combining distinct measures together. To test the predictions, animal behavior and odor representation should be monitored as the animal moves relative to the target, or in virtual conditions that mimic concentrated vs dilute environments.

Editor's evaluation

This paper explores the question of the optimum strategy for odor detection in a turbulent environment. The authors use high-resolution simulations of turbulent flow to investigate the transport and detection of odors advected by the flow, comparing machine learning strategies based on the temporal dynamics of the signal with those based on intensity. The work should be of interest to researchers working on a broad range of problems in sensation and navigation across scales.

https://doi.org/10.7554/eLife.72196.sa0

Introduction

Most macroscopic organisms detect odors in intermittent bursts, that may be separated by extended regions with no odor. Organisms leverage this complex dynamics efficiently for diverse tasks, including locating and identifying an odor source Murlis et al., 1992; Mafra-Neto and Cardé, 1994; Vickers, 2000; Riffell et al., 2014; Ache et al., 2016; Ackels et al., 2021. However, what are the most informative features of intermittent odor cues remains largely unclear. There are two broad classes of measures that quantify the dynamics of olfactory cues: those that depend on odor intensity including e.g. odor gradients in space or time, and those that do not depend on odor intensity but only on its timing, i.e. on whether the odor is on or off regardless of its concentration. To compute quantities that depend on odor intensity, an accurate representation of the odor is needed. In contrast, measuring the timing of odor detection simply requires to mark at all times whether the odor is on or off, thus a binary switch is sufficient.

Behavioral evidence suggests that animals use both intensity and timing of odor encounters for olfactory navigation Baker et al., 2018. At close range, mammals appear to compare odor intensity either across nostrils or across sniffs Catania, 2013; Gire et al., 2016; Findley et al., 2021. On the other hand, mounting evidence suggests timing of odor detection also plays a key role for olfactory navigation Ache et al., 2016: moths respond to odor pulsed at specific frequencies Riffell et al., 2014; Vickers et al., 2001; fruit flies respond to timing since last odor detection van Breugel and Dickinson, 2014; Demir et al., 2020; lobsters and sharks compare odor arrival time across their paired olfactory organs and orient toward the side that detected the odor first Basil and Atema, 1994; Gardiner and Atema, 2010; many organisms will move upwind upon detection of an odor Kennedy and Marsh, 1974; Murlis et al., 1992; Steck et al., 2012.

Neural recordings upon stimulation with intermittent odor cues confirm that the brain of many animals is able to record information both about intensity (and its derivatives) as well as timing of odor encounters (most information comes from work on arthropods Nagel and Wilson, 2011; Vickers et al., 2001; Brown et al., 2005; Gorur-Shandilya et al., 2017; Jacob et al., 2017; Riffell et al., 2014, but see also Parabucki et al., 2019; Lewis et al., 2021). For example, when insects are presented with intermittent odor cues, information about intensity and timing is recorded in their antennal lobe (see e.g. Vickers et al., 2001; Brown et al., 2005). Odors that mimic natural intermittency elicit a response that preserves an accurate measure of timing in fruit flies and moths Gorur-Shandilya et al., 2017; Jacob et al., 2017. In lobsters, bursting olfactory neurons encode specifically for the time between successive odor encounters, see Park et al., 2014; Park et al., 2016 and references therein. Interestingly, the neural activity varies considerably with the dynamics of the odor cues Nagel and Wilson, 2011; Vickers et al., 2001; Lewis et al., 2021, but how intermittency of an odor affects its neural representation is not well understood.

This evidence suggests animals are able to identify when they detect an odor as well as how intense it is; but whether they record and rely on both kinds of information is not understood. From a physical perspective, these two measures clearly provide information about source location. Indeed, we know from theoretical Shraiman and Siggia, 2000; Falkovich et al., 2001; Celani et al., 2014 and experimental Justus et al., 2002; Moore and Crimaldi, 2004 work that turbulence causes the odor to be distributed in highly intermittent patches separated by blanks with no odor. Both intensity and timing of these intermittent bursts vary depending on the location of the source Celani et al., 2014, as early recognized by Atema, 1996, thus can be used to infer source location or navigate to it Vergassola et al., 2007; Schmuker et al., 2016; Boie et al., 2018; Leathers et al., 2020; Michaelis et al., 2020.

Here, we ask what salient features of turbulent odor signals best predict the location of the odor source and specifically compare quantities related to intensity vs timing of odor encounters. We first compose a dataset of realistic odor fields at scales of several meters using accurate state-of-the-art fluid dynamics simulations. We then develop machine learning algorithms that predict source location based on these synthetic odor fields.

We find that measures of odor temporal dynamics based on a short memory span (down to about 1 s) hold information about source location. Close to the source or close to the substrate, measures of intensity predict distance better than measures of timing; but this ranking is reversed at further distance from the source or from the substrate. Pairing the two kinds of measure improves dramatically the quality of the prediction robustly across all datasets, whereas pairing two measures of intensity or two measures of timing is either useless or detrimental.

Our results demonstrate that timing and intensity are complementary attributes of odor dynamics and are most effective in more dilute and concentrated conditions respectively. These different conditions exist in different portions of space because odor gets transported, mixed and diluted by the fluid. As a result, the spatial range of operation of a living organism constrains the solutions it may evolve to make predictions with turbulent odors.

Results

Odor cues at several meters from the source are often turbulent. Figure 1a–c and show snapshots of the velocity field and odor cues in space, resulting from direct numerical simulations of the turbulent flow in a channel of length L, width W and height H (also see Figure 1—video 1). Air flows from left to right at a mean speed Ub and hits a cylindrical obstacle that generates turbulence. The height of the obstacle is H/4 and tunes the intensity of turbulent fluctuations relative to the mean velocity. To characterize the flow we show in Figure 1d that the mean velocity profile, for z+ 30, follows the law of the wall Uuτ=1κlnz++B, where κ and B are constants and z+=zδν, recovering classical statistics for channel turbulence Pope, 1984. The odor field is emitted from a concentrated source downstream from the obstacle; it develops as a meandering filament that fluctuates as it travels downstream and soon breaks into discrete pockets of odor (whiffs) separated by odor-less stretches (blanks) (Figure 1b–c and f). The Kolomogorov scaling for the spectra of odor fluctuations k-5/3 holds for kη0.1 (see Figure 1e), consistent with previous experimental results in channel flow (see e.g. Saddoughi and Veeravalli, 1994). Typical time courses of the odor are shown in Figure 1f. Note that depending on the sampling location, odor may be more or less sparse (compare for example Figure 1f left and right). All parameters and methods are summarized respectively in Table 1 and in Materials and Methods.

Figure 1 with 1 supplement see all
Turbulent odor cues are patchy and intermittent.

Snapshot of streamwise velocity (a) in a vertical plain at mid channel; odor snapshot side view at mid channel (b) and top view at source height (c). White regions mark the cylindrical obstacle. Snapshots are obtained from direct numerical simulations of the Navier-Stokes equations and the equation for odor transport (see Materials and Methods and parameters summarized in Table 1). (d) The mean velocity profile follows the well known log law when z3+ = z3/δν > 30, where δν=ν/uτ where uτ is the friction velocity. (e) Two dimensional spectra of odor fluctuations E(k)=ddk(|kxy|<k|c^(kxy)|2d2k) normalized with the scalar variance σc2 ; c^(kx,y) is the 2D Fourier transform of the scalar concentration at source height; the integral of the spectra is the scalar variance. Wavenumbers are nondimensionalized with the inverse Kolmogorov scale η-1. Error bars show standard deviation calculated over N = 420 points. (f) Typical time courses of the odor cues at locations labeled with 1 and 2 in c, visualizing noise and sparsity, particularly at location 1.

Table 1
Parameters of the simulation.

Length L, width W, height H of the computational domain; horizontal speed along the centerline U; mean horizontal speed Ub=u; kinematic viscosity ν; diffusivity κc; Kolmogorov length scale η=(ν3/ϵ)1/4 where ϵ is the energy dissipation rate; mean size of gridcell Δx; Kolmogorov timescale τη=η2/ν; energy dissipation rate ϵ=ν/2(ui/xj+uj/xi)2; Taylor microscale λ=u2/(u/x)2; wall lengthscale y+=ν/uτ where the friction velocity is uτ=τ/ρ and the wall stress is τ=ρνdu/dz|z=0; Reynolds number Re=U(H/2)/ν based on the centerline speed U and half height; Reynolds number Reλ=Uλ/ν based on the centerline speed and the Taylor microscale λ; Schmidt number (Sc = ν/κc = Pe/Re); magnitude of velocity fluctuations u relative to the centerline speed; large eddy turnover time T=H/2u. First row reports results in non dimensional units; second and third rows correspond to dimensional parameters in air and water assuming the velocity of the centerline is 50 cm/s in air and 12 cm/s in water.

LWHUUbνκcηΔx
408432231/2501/2500.0060.025
air9.50 m1.90 m0.96 m50 cm/s36 cm/s1.510-5 m2/s1.510-5 m2/s0.15 cm0.6 cm
water2.66 m0.53 m0.27 m12 cm/s8.6 cm/s10-6 m2/s10-6 m2/s0.04 cm0.2 cm
τηελy+ReReλScu/UT
0.01390.170.0035160001360111%64τη
air0.15 s6.3e-4 m2/s34 cm0.09 cm
water0.18 s3e-5 m2/s31 cm0.02 cm

Do odor cues bear information about source location meters away from the source? To answer this question, we develop supervised machine learning algorithms that learn the relationship between the input (odor) and the distance from the source (output) from a large dataset of examples. In order to dissect what are the best predictors of source location and how ranking depends on the statistics of the odor, we need to detail more specifically the input and output of the algorithm.

To design the input we start with the odor concentration field c(z,t) which varies stochastically in space and time as a result of turbulent transport. Here, z=(z1,z2,z3) is a location in the three dimensional space and t is time. We focus on a plane at a fixed height, and consider the conical region where odor can be detected, the ‘cone of detection’ (Figure 2a). We first compose time series of the odor field; each time series is indicated with ci and consists of the odor sampled at M equally spaced times with frequency ω at a discrete location zi within the cone of detection. Thus, each time series is a vector ci=(c(zi,ti),,c(zi,ti+M)), where ti+M-ti=M/ω is the temporal span of the time series, or memory. From each time series, ci we calculate five features xi1,,xi5, where xi1 is the temporal average of the concentration during whiffs in the time series ci; $x_i^2$ is its average slope (time derivative of odor upon detection, averaged across whiffs within ci); xi3 is the average duration of blanks (stretches of time when odor is below detection within ci); xi4 is the average duration of whiffs (stretches of time when odor is above threshold within ci); and xi5 is the intermittency factor (the fraction of time the time series ci is above threshold). The detection threshold is defined adaptively as discussed in Materials and methods and Figure 2—figure supplement 1. Features x1 and x2 depend explicitly on odor concentration, whereas features x3, x4 and x5 only depend on when the odor is on or off, but not on its intensity. To remark this difference, we refer to x1 and x2 as intensity features, and x3, x4 and x5 as timing features. Our input xi=(xi1,,xid) is composed of d-dimensional vectors of features and we will focus on d=1,2,5. We seek to infer distance from the source, thus our output y is the coordinate of the sampling point z in the downwind direction, i.e. y=z1, with the source placed at the origin (see sketch in Figure 2a). We refer to the figure supplements for results in the crosswind direction, y=z2. We train the algorithm by providing N examples of input-output pairs (xi,yi) selected randomly from the full simulation, and obtain the function that connects input and output: yf(x).

Figure 2 with 6 supplements see all
Individual features enable inference in two dimensions.

(a) Sketch of the geometry. (b) Test error χ for inference using individual features as input. (c) Predicted vs actual distance for inference. Prediction for representative test points (grey circles); 30–70th percentile (patch, same color code as in (b)); trivial prediction f(x)=<y>test (solid horizontal line, corresponds to χ=1); exact prediction (bisector, corresponds to χ=0); dispersion away from the bisector visualizes the prediction error. Results are obtained with a supervised learning algorithm based on regularized empirical risk minimization (Materials and methods). Each input datum xi is one individual scalar feature computed from the time course of odor concentration measured at location zi at 100 evenly spaced time points with sampling frequency ω=1/τη, where τη is Kolmogorov time. The training/test set are composed of N=5000 and Nt=13500 data points, respectively.

We propose a machine learning approach where the different odor features are ranked based on their predictive power, rather than their fitting properties. Different data-sets of odor/distance pairs are defined. The data-sets differ in the way odor measurements are represented in terms of feature vectors. For each data-set we learn a function to predict the distance to target given the corresponding odor features. The predictive power of each function, and corresponding set of features, is then assessed. More precisely, each data-set is split in a training and a test set, as custom in machine learning. Training sets are used to learn functions connecting odor to target location, whereas test sets are used to assess their prediction properties. The training/test split is crucial since the goal is to make good predictions on new, unseen points, that are not within the training sets. From a modeling perspective, a flexible nonlinear/nonparametric approach based on kernel methods is contrasted and shown to be superior to a simpler linear model (Figure 2—figure supplement 2). A careful protocol based on hold-out cross-validation is used to select the hyper-parameters of the considered learning models (we refer to Materials and methods and Figure 2—figure supplement 3 for more details).

To illustrate the results we pick the two-dimensional plane at height H/4 that contains the source. The first result is that individual features (d=1) bear useful information for two-dimensional source localization even at several meters from the source. Performance is quantified by the normalized squared error averaged over the Nt points in the test set χ=i=1Nt[yi-f(xi)]2/i=1Nt[yi-y¯]2. For this dataset, intensity features rank higher than timing features (Figure 2b–c), consistent with previous work Atema, 1996 and predictions are more accurate in the crosswind than in the downwind direction (compare with Figure 2—figure supplement 4). For reference, a random guess with flat probability within the correct lower and upper bounds yields χrandom=2, whereas a target function ftrivial(x)=ytest that learns the average of the output over the test set yields χtrivial=1.

In order to prove that the algorithm captures the dynamics of the scalar and not of the underlying velocity field, we realize three computational fluid dynamics simulations with the odor source at different locations downstream of the obstacle. Each source is placed at a different distance from the obstacle and thus feels different velocity fields, because the flow is inhomogeneous in the downstream direction. To perform a fair comparison we train the algorithm on points sampled from a conical domain based on the most downstream source; for the other two sources, we use an identical domain shifted upstream so that the vertex of the cone is located at the respective source location (see Figure 2—figure supplement 5 left). Performance at source height varies little over the three locations, demonstrating that the algorithm is learning the dynamics of the scalar and not of the carrying flow (see Figure 2—figure supplement 5 center). Additionally we demonstrate that, irrespective of source position, timing features acquire predictive power with height, whereas the opposite is observed for intensity features (see Figure 2—figure supplement 5 right). We discuss this property in further detail in the following. . Finally, we train the algorithm with the odor fields emanating from one source location, and test its performance over odor fields generated by the two other sources. Figure 2—figure supplement 6 shows that performance of individual features varies little when test and training are performed over different dataset. Learning from pairs of features appears somewhat more sensitive to the details of the dataset. In the aggregate, the analysis corroborates that the algorithm captures odor dynamics and is rather insensitive to details of the underlying flow. In the rest of the manuscript we focus on the leftmost source, so as to exploit the full spatial range available from the simulations.

Next we analyze whether and how the sampling strategy affects performance and ranking of the features. Most results are shown for a memory of 100τη15s. Performance improves with longer memory (Figure 3a), because this allows to better average out noise and obtain more stable estimates of the features. But improvement follows a slow power law so that waiting for example 20 times longer yields predictions only about twice as precise. On the other hand, waiting as little as 10τη1.5 seconds still allows to make predictions, albeit less precise. We then verify whether performance may improve with a larger training set. Because we infer distance from an individual (scalar) feature, the problem is one dimensional and we find that a small number of training points, which we indicate with N, is sufficient to reach a plateau in prediction performance (Figure 3b). We choose N=5000 training points, which is also robust to the case with more than one feature (Figure 3—figure supplement 1). Finally, sampling more frequently than once per Kolmogorov time does not essentially affect the results nor ranking (Figure 3c). Similar results hold for the crosswind direction (Figure 3—figure supplement 2).

Figure 3 with 2 supplements see all
The sampling strategy affects performance but not ranking.

(a) Error χ as a function of memory in units of Kolmogorov times τη; memory is defined as the duration of the time series of odor concentration ci=(c(zi,ti),,c(zi,ti+M)) used to compute the five features xi1,,xi5, i.e. memory =ti+M-ti=M/ω. Red and pink: Performance using xi=xi1 (average concentration) and xi=xi5 (intermittency factor). The number of training points and the frequency of sampling are fixed, N=5000 and ω=1/τη. Dotted, dashed and solid grey lines are power laws with exponents -1/5, -1/10 and -1/4 respectively to guide the eye. (b) Error as a function of number of points in the training set N, with Nt=13500 points in the test set, memory =100τη and ω=1/τη. Color code as in (a). (c) Performance using the five individual features as input with N=5000, Nt=13500, memory =100τη sampling odor at frequency ω=1/τη (empty bars) and ω=10/τη (filled bars). Key shows color coding.

Pairing two observables improves performance in some cases, but not always. In fact, pairing two features of the same category results in little to no improvement (Figure 4 and similarly for the crosswind direction, Figure 4—figure supplement 1). In contrast, combining one intensity and one timing feature improves performance considerably, up to 65%. This result can be understood by mapping the error done by individual features in space (Figure 5—figure supplement 1), showing that intensity and timing features are complementary, that is intensity features perform well in locations where timing features perform poorly.

Figure 4 with 1 supplement see all
Pairing one timing feature and one intensity feature considerably improves performance.

(a) Error χ obtained with individual features (full bars) and pairs of features (empty bars). Grey and black indicate pairings of two intensity features and two timing features respectively; green indicates mixed pairs of one timing and one intensity feature. (b) Performance (left) and relative improvement over the best of the two paired features (right). Results for the median (bottom) and the 95th percentile (top). Within each table plot, rows from bottom to top and columns from left to right are labeled by the 5 individual features: A (average, x1), S (slope x2), B (blanks x3), W (whiffs x4), I (intermittency x5). Results with individual features are shown on the diagonal; results pairing feature i and feature j are shown at position (i,j). Mixed pairs provide both the best performance and the largest improvement over individual features.

We next seek to clarify whether the results depend on space. To this end we compose five different dataset, a to e, obtained by extracting odor snapshots from horizontal planes at source height (b), above the source (c to e), and below the source (a) (Figure 5a). From a to e, sparsity increases and intensity decreases (Figure 5b) simply because closer to the boundary, the air slows down and the odor accumulates. By analyzing performance across these dataset, we find that ranking of individual features shifts considerably. The two intensity features outperform all timing features when the dataset is not very sparse (dataset a-b, Figure 5c and d left). In contrast, two timing features (intermittency factor and blank duration) outperform all others for the more sparse and less intense dataset d-e (Figure 5c and d right). Whiff duration performs poorly in d-e because intermittency is too severe and whiffs are short in duration thus bear little information (the average whiff duration is 1–7 time steps in over 90% of the time series). Although the ranking of individual features shifts with height, pairing one intensity and one timing feature remains the most successful strategy across all heights (Figure 5c and d). In contrast, combining all five features contributes little improvement (Figure 5—figure supplement 2).

Figure 5 with 3 supplements see all
Ranking shifts with height from the ground.

(a) Datasets a to e correspond to data obtained at heights z/H=25%, 37.5%, 50%, 55% and 65% respectively. (b) Distribution of intensities (top) and intermittency factors (bottom) over the training set from a to e (left to right). Moving away from the boundary, the odor becomes less intense and more sparse. (c) Median performance as a function of average intermittency factor of the training set for individual intensity (grey) and timing (black) features, mixed pairs of one intensity and one timing feature (green) and all five features together (dark green). (d) Predicted vs actual distance, to visualize a representative subset of the results in (c), scale bar 103η. Ranking depends sensibly on height: intensity features outperform timing features near the substrate, where there is more odor and it is more continuous; timing features outperform intensity features further from the substrate where there is less odor and it is more sparse; mixed pairs perform best across all conditions; combining five features provides little to no improvement over mixed pairs.

Let us now focus on the plane at source height and separate locations based on their distance from the source. We assemble a distal dataset and a proximal dataset, composed of points that are further and closer than 2330η from the source respectively (Figure 6a). The odor is more intense and more sparse closer to the source and it becomes more dilute and less sparse with distance from the source (Figure 6b). Performance of individual features degrades with distance (Figure 6d). Intensity features clearly outperform timing features at close range, as seen both from various percentiles of the test error (Figure 6d, left) as well as the full distribution (Figure 6c, left). The disparity between timing and intensity features disappears in the distal problem: the error distribution for all individual features is essentially superimposed except for the tails (Figure 6c, right and inset), which cause small differences in the median and other percentiles of the error (Figure 6d, right). Remarkably, mixed pairs outperform all individual features in both the distal and proximal problems (Figure 6c–d). In the aggregate, results demonstrate that, even within a single turbulent flow, ranking shifts considerably. Namely, measuring timing of odor encounters is most useful in regions where the odor is dilute, that is far from the source and from the substrate, whereas measuring intensity is most useful in concentrated conditions, that is close to the source or the substrate.

Ranking depends on distance from the source.

(a) At source height, the dataset is split in proximal (distance <2330η) and distal (distance >2330η). (b) Distributions of average odor intensity (left) and intermittency factor (right) over the training set; closer to the source, the odor is more intense and more sparse. (c) Distribution of test error for the proximal (left) and distal (right) problem showing intensity features (grey) outperform timing features (black) at close range, but not in the distal problem where differences in the error distribution are limited to the tails (see insets). Mixed pairs of features (green) outperform individual features either marginally (left) or considerably (right). (d) Percentiles of the error distribution in (c) for the proximal (left) and distal (right) problems confirming the picture emerged from (c).

Thus, the predictive power of different features varies greatly in space and time. Next, we show how this spatial variation is dictated by the statistical properties of the odor plume. To this end, we provide an analytical characterization of the test error of individual features, χ, that connects directly to the physics of the problem. Such a characterization is consistent with the observed variation in performance. Note that for large enough samples, the test error χ approximates the following expected error

(1) 0dx0Rdy(y-f*(x))2p(x,y)0R(y-y¯)2p(y)dy.

The latter is called expected error and is the ideal version of the test error. It can be interpreted as the error summed over all possible input-output pairs, weighted by their corresponding joint probability to be sampled. Here, f*(x)=y|x=0Ryp(y|x)dy is the so called regression function, which minimizes the expected error among all possible functions. The regression function can be approximated using Kernel ridge regression and sufficiently rich kernels. Indeed, kernel ridge regression is known to be a so called universal estimator Hastie et al., 2001; Steinwart, 2002. In the above expression, R is the length of the cone, p(y|x) is the conditional probability distribution of the output given the input, p(x,y) is the joint probability distribution of the input output pair (x,y) is the prior distribution on the output y. The idea is to relate f*, hence the expected error, to the distribution p(x|y) of observing feature x depending on distance y. Indeed, the latter is dictated by the fluid dynamics of odor plumes from a concentrated source, and hence provides a more direct connection between the expected error and the physics of the problem. Since the considered features are sample averages, in the limit of large samples, their distribution is well approximated by a Gaussian, hence fully characterized by the mean and standard deviation. Then simple estimates can be computed empirically from our data. Equation (1), provided with these estimates, reproduces the predictive power of individual features showed in Figure 5c (see Figure 5—figure supplement 3). Note that whiffs deviate considerably from a normal distribution, hence the argument needs to be revised for this feature. To move beyond empirical estimates and extend the above reasoning to other flows and generic combinations of (possibly non Gaussian) features, a generalization of the asymptotic arguments proposed in Celani et al., 2014 is needed.

Discussion

Our results demonstrate that within the cone of detection, the time course of an odor bears useful information for source localization even at meters from the source. We find that the concentration and the slope of a turbulent odor signal, averaged over a memory lag, are particularly useful to predict source location at close range or near the boundary. These features quantify the intensity of the odor and its variation. The primacy of the intensity features wanes in more challenging conditions, for example moving away from the source or away from the boundary. In these portions of space, where the odor is scarcer, features that quantify timing of odor detection become as effective as intensity features, or more effective. One of the best studied example of olfactory search in dilute conditions is arguably the case of insects.

Interestingly, olfactory receptor neurons in insects appear to encode efficiently information about timing across a wide range of intensities Gorur-Shandilya et al., 2017; Martelli et al., 2013.

Note that while the statistics of an odor plume clearly depends on all details of the flow and the source, see e.g. Justus et al., 2002; Celani et al., 2014; Fackrell and Robins, 1982, here we keep all of these parameters constant and demonstrate that even within a single flow, odor dynamics and the best predictors vary considerably in space. This begs the next question: do organisms switch between different modalities depending on attributes of odor dynamics, which will vary in space? This could be the case for mice, where the neural activity in the first relay of olfactory processing does in fact depend on how sparse is the odor Lewis et al., 2021. Specifically, sparse odor cues elicit individual responses that follow closely the ups and downs of the odor in time. In contrast, continuous signals elicit intense responses which are however uncorrelated to the temporal dynamics of the odor itself Lewis et al., 2021.

We find that features within the same class are redundant whereas features from different classes are complementary. Indeed, features of the same class have similar patterns of performance in space, but each class has a distinct pattern. As a consequence, measuring both timing and intensity is beneficial, but using more than one feature to quantify either timing or intensity provides no advantage. Combining all features does not improve over the performance of mixed pairs, consistent with redundancy within each class. Note that there is no fundamental reason to expect features from the same class to be redundant, and further work with a larger library of features is needed to prove or disprove this notion.

Importantly, mixed time/intensity pairs of features outrank individual features robustly, that is in all portions of space, regardless of distance from the source and from the ground. This is in contrast with individual features and suggests relying on simultaneous timing and intensity features is advantageous when odors are sensed at various distances from the source and from the substrate. Interestingly, the coexistence of bursting olfactory neurons and canonical olfactory neurons in lobsters suggests these animals are in fact able to measure simultaneously timing and intensity Park et al., 2014; Ache et al., 2016, which is consistent with the increased predictive power of the mixed pairs of features. Similarly, in mammals, optogenetic activation of the olfactory bulb Smear et al., 2013 demonstrates that both kinds of measures guide behavior (lick vs no lick).

In this work, we have investigated the problem of predicting the location of a target from measures of the time course of a turbulent odor. Previous work explored a related question, that is how to best represent instantaneous snapshots of the odor to encode maximum information about source location Victor et al., 2019. The two approaches are not immediately comparable: first, Victor et al., 2019 consider few snapshots of the odor, rather than measures of its time course. Second, maximizing information does not guarantee good predictions (to make predictions information needs to be extracted and processed, and importantly the focus is on new data that were not previously seen). We provide two comments that are relevant if information is the limiting factor for prediction accuracy: (i) binary representations were suboptimal in all conditions considered in Victor et al., 2019; Boie et al., 2018, that is at few tens of cm from the source. This is consistent with our results in concentrated conditions, where timing features -accessible through binary representations- are suboptimal. Our evidences suggest, however, that the result may not hold in more dilute conditions, where the gap between binary and more accurate representations should become increasingly small. (ii) Individual snapshots of odor from Victor et al., 2019; Boie et al., 2018 contained 1–2 bits of information about source location, but allocating more resources to represent how the odor varies in time was found informative Victor et al., 2019; Boie et al., 2018. Our mixed pairs of features at close range achieve precisions of 5–6%, corresponding to coding for position with words of 4–4.3 bits. Our results thus confirm that memory is indeed useful, but the gain does not increase indefinitely with further memory.

The literature on olfactory navigation is vast. Although a complete review of available algorithms is beyond the scope of the present work, we remark that recent results investigated gradient descent algorithms using either concentration alone Gire et al., 2016, or various measures of timing and intensity Park et al., 2016; Michaelis et al., 2020; Leathers et al., 2020. Overall, both intensity and timing appear to have a potential to lead to an odor source, consistent with our results on individual features. A combination of the two kinds of features was found beneficial in Leathers et al., 2020, consistent with our results on mixed pairs. Note that predicting source location and navigating to reach it are distinct tasks. Although they are often assumed to be intimately connected, whether good predictors may be good variables for navigation in more general contexts remains to be understood.

Here, we have analyzed the features that enable the most accurate prediction of source location. We add a few observations about the significance of the results for animal behavior. First: whether animals rely on features from either class will depend on what features best support behavior. It is often implicitly assumed that features that bear reliable information on source location are also the most useful for navigation. However, this connection between prediction and navigation is far from straightforward and more work is needed to establish whether accurate predictions imply efficient navigation. Second: animals are unlikely to have prior information on the details of the odor source, for example its intensity. Timing features are more robust than intensity features with respect to the intensity of the source and may thus be favored regardless of their performance, which was argued in Schmuker et al., 2016. In our work, timing features are precisely invariant with source intensity because we define the detection threshold adaptively (see Materials and methods). More realistic conditions will need to be evaluated, where dependence on source intensity emerges as a result of non-linearities that we did not model in this work. These effects emerge for example, close to a boundary which partially absorbs the odor Gorur-Shandilya et al., 2019, or in the case of fixed thresholds, although this dependence is weak in the far field where timing features are most useful Celani et al., 2014. Third: we have focused on predicting source location from within the cone of detection, where an agent will detect the odor quite often. However, a crucial difficulty of turbulent navigation is to find the cone itself. We cannot address the problem of predicting source location from outside the cone because detections are so rare that we lack statistics. The distinction between inside and outside the cone of detection is key for navigation with sparse cues (see Reddy et al., 2021) and deserves further attention.

Materials and methods

Direct numerical simulations of turbulent odor plumes

Request a detailed protocol

To reproduce a realistic odor landscape and generate the dataset showed in Figure 1, we solve the Navier-Stokes (2) and the advection-diffusion equation for passive odor transport (3) at all relevant scales of motion from the smallest turbulent eddies (Kolmogorov scale η) to the integral scale (L>600η), using Direct numerical simulations (DNS):

(2) tu+uu=-1ρP+ν2u  u=0
(3) tc+uc=κc2c+q

where u is the velocity field, ρ is the fluid density, P is pressure, ν is the fluid kinematic viscosity, c is the odor concentration, κc is its diffusivity and q an odor source. All parameters are listed in Table 1. Note that we use Sc=1 which is appropriate for typical odors in air but not in water. However, we expect a weak dependence on the Schmidt number as the Batchelor and Kolmogorov scales are below the size of the source and we are interested in the large scale statistics Falkovich et al., 2001; Celani et al., 2014; Duplat et al., 2010. We simulate a turbulent channel flow with a concentrated odor source and an obstacle that generates turbulence by customizing the open-source software Nek5000 Fischer et al., 2008 developed at Argonne National Laboratory, Illinois. Nek5000 employs a spectral element method (SEM) Patera, 1984 Orszag, 1980 based on Legendre polynomials for discretization Ho, 1989, and a 4th order Runge-Kutta scheme for time marching. The code is written in fortran77 and C and it uses MPI for parallelization.

The three-dimensional channel is divided in E=160 000 discrete elements: 200×40×20 (number of elements in length × width × height); within each element the solution is expanded in 8th grade tensor-product polynomials so that the domain is effectively discretized in 81 920,000 elements. The average spatial resolution is equal in each direction Δx4η. A cylindrical cap of height =160η is added on the ground; the cylinder spans the entire width of the channel. The mesh is adapted to fit the cylinder. Fluid flows from left to right and the obstacle generates turbulence in the channel, in particular the height of the cylinder tunes the velocity fluctuations. The velocity fluctuations are defined as δu(z,t)=u(z,t)-u(z,t)y; their intensity is u=(δu)2, where averages are intended in space and time. Table 1 summarizes the parameters that characterize turbulence.

Each simulation runs for 300,000 time steps where δt=10-2τη and follows from a severe Courant criterium with UΔt/Δx<0.4 to ensure convergence of both the velocity and scalar fields. Snapshots of velocity and odor fields are saved at constant frequency ω=1/τη (except for results in Figure 3c where snapshots are saved 10 times more frequently). Each DNS requires 2 weeks of computational time using 320 cpus.

Boundary conditions and odor source

Request a detailed protocol

We impose a Poiseuille velocity profile at the inlet: u=(u,0,0) and u=6Ub(ζ-ζ2), where ζ=z3/H is the vertical coordinate normalized to the height of the channel and Ub is the mean speed. We set a no-slip condition u=0 at the ground and on the obstacle; on the remaining boundaries we impose the turbulent outflow condition defined in Fischer et al., 2007 that imposes a positive exit velocity to avoid potential negative flux and the consequent instability it generates.

More precisely, the divergence ramps up from zero to a positive value along the element closest to the boundary: u=C[1-(z/Δx)2], where z is the distance from the boundary and C=2 is the minimal value that ensures convergence. For the odor, we impose a Dirichlet condition (c=0) at the ground, on the obstacle and at the inlet; while an outflow condition is set at the top, on the sides and at the outlet: k(c)n=0. We introduce a source located right above and downstream of the obstacle, at coordinates xs =810η, ys =650η, zs =238η; odor intensity at the source is defined by a gaussian distribution q=e[(z1-xs)2+(z2-ys)2+(z3-zs)2]/(2σ2), where σ=5η.

Machine learning

Request a detailed protocol

To learn the correct position of a target source given an odor, we propose to use supervised machine learning. We next review some key ideas, and refer to standard textbooks for further details for example Hastie et al., 2001.

The goal in supervised learning is to infer a function f given a training set (x1,y1),(xN,yN) of input/output pairs. A good function estimate should allow to predict the outputs associated to new input points. In our setting each input x is a one-, two-, or five-dimensional vector whose entries are scalar features of odor time series, where the odor is sampled at a specific spatial location. From every sampling location, we compute the distance to the source and this distance is the output y.

To measure how close the prediction f(x) is to the correct output y, we consider the square loss (f(x)-y)2. Following a statistical learning framework, the data are assumed to be sampled according to a fixed but unknown data distribution P. In this view, the ideal solution f* should minimize the expected loss l(f(x),y) over all data distributed according to P. In practice, only an empirical loss based on training data can be measured, and the search for a solution needs be restricted to a suitable class of hypothesis. Note that, the choice of the latter is critical since the nature of the function to be learnt is not known a priori. A basic choice is considering linear functions f(x)=wx. In this case, minimizing the training loss reduces to linear least squares min1N||Y-Xw||2, where X is the matrix composed of the N training data input X=(x1,,xN)T and Y is the vector composed of the N labels of the training set Y=(y1,,yN)T. The corresponding solution is easily shown to be w=(XTX)-1XTY. In Figure 2—figure supplement 1, we show that the choice of linear models has limited predictive power and does not allow to rank features. To tackle this issue, we consider kernel methods Schölkopf and Smola, 2002, a more powerful class of nonlinear models corresponding to functions of the form f(x)=i=1Nk(xi,x)ci. Here, k(x,x) is a so called kernel, that here we will choose to be the Gaussian kernel k(x,x)=e-x-x2/2σ2. The coefficients c=(c1,,cn) are given by the expression

(4) c=(K+λNI)-1y

which minimizes

1NKc-y+λcKc.

In the above expression, K is the N by N matrix with entries Kij=k(xi,xj). The first term can be shown to be a data fit term whereas the second term can be shown to control the regularity of the obtained solution Schölkopf and Smola, 2002. The regularization parameterλ balances out the two terms and needs be tuned, together with the kernel parameters (the Gaussian width σ in our case).

Kernel methods offer a number of advantages. They are nonlinear, and hence can learn a wide range of complex input/output behavior. They are an example of nonparametric models, where the complexity of the model can adapt to the problem at hand and indeed learn any kind of continuous function provided enough data. This can be contrasted to linear models that clearly cannot learn any nonlinear function. Moreover, by tuning the hyper-parameters λ,σ more or less complex shape can be selected. When λ is small we are simply fitting the data, possibly at the price of stability, whereas for large λ we are favoring simpler models. With small σ we allow highly varying functions, whereas with large enough σ we essentially recover linear models.

Indeed, the choice of these parameters is crucial and tested and visualized in Figure 2—figure supplement 3. Here it is shown that for λ0, the solution incurs in the well known stability issues for large σ and overfitting issues for small σ. We note that ideally one would want to choose these hyper-parameters minimizing the test error; however, this would lead to overoptimistic estimates of the prediction properties of the obtained model. Hence, we consider a hold-out cross validation protocol, where the training data are further split in a training and a validation sets. The new training set is used to compute solutions corresponding to different hyper-parameters. The validation set is used as a proxy for the text error to select the hyper-parameters with small corresponding error. The prediction properties of the model thus tuned is then assessed on the test set.

Dataset

Request a detailed protocol

To compose the dataset for regression we first extract two-dimensional snapshots of odor at fixed height from the 3D simulation. Each snapshot from the simulation has dimensions 1600×320 (number of points in the downwind direction × crosswind direction). The initial evolution up to 300τη is excluded from the analysis as odor has not yet reached a stationary state. At stationary state we save 2700 frames at frequency ω=1/τη per simulation. Thus at each spatial location we have the entire time evolution composed of 2700 time points at regular intervals of τη. We partition each simulation in fragments with M snapshots (duration Mτη). Most simulations are shown for M=100, thus for each spatial location we have 27 time series of the same duration (except for results leading to Figure 3a, where we vary memory from 10τη to 250τη resulting in 270 to 10 time series per location respectively).

The characteristic shape of the odor plume is a cone (Figure 2), that we defined as the region where the probability of detection computed over the entire simulation is larger than 0.35. The training set and test set are obtained by extracting N=5000 (unless otherwise stated) and Nt=13500 time series portions of duration Mτη. To select these M-long time series, we extract random locations zi to cover homogeneously the cone, i.e. with flat probability within the cone, and random initial times ti, with the training in the first half of the time history and the test in the second half of the time history. Time series that remain entirely under threshold are excluded.

Each odor time series is further processed by computing five features, two of which quantify intensity of the odor and rely on a precise representation of odor concentration (average concentration and average peak slope) and three of which quantify timing of odor encounters and are computed after binarizing the odor (average whiff and blank duration and intermittency factor). The threshold cthr used for binarization is adaptive i.e. cthr=0.5c|c>0, where the average is computed over each time series separately. The threshold thus varies from cthr=0.5c0 at the source to cthr=10-6c0 at the farthest edges of the cone, where c0 is the concentration at the source. The choice of an adaptive threshold was suggested in Gorur-Shandilya et al., 2017. The precise value of the relative threshold has little effect on the results as shown in Figure 2—figure supplement 1, left. Fixed thresholds were tested and discarded because results depend sensibly on the threshold and the optimal threshold varies with the dataset in non-trivial ways (Figure 2—figure supplement 1, right). Finally, adaptive thresholds that are defined based on purely local information appear more plausible for a biological system that has no information on the intensity of the source.

The parameters λ and σ are obtained through 4-folds cross validation: the training set is split in 4 equal parts, 3 are used for training and 1 for validation. The empirical risk is computed on the validation set and averaged over the 4 possible permutations, systematically varying the hyperparameters λ,σ. The couple of hyperparameters that minimize the empirical risk over the validation set is selected through grid search using an 8×8 regular grid and further refined with a 4×4 subgrid. Results are insensitive to further refinement because there is a large plateau around the minimum, as shown in Figure 5—figure supplement 2. The optimal hyperparameters are used to compute the solution (4). The error χ used throughout the manuscript is simply the normalized test error χ=i=1Nt[yi-f(xi)]2/i=1Nt[yi-y¯]2. For most of the figures, we used 5000 training points and up to 13,500 testing points (we removed from the dataset all the points with null entries for the entire time span), we implemented Kernel ridge regression using FALKON Rudi et al., 2018, a fast algorithm for matrix inversion (the number of iterations is set to 5 and the number of Nystrom centers is equal to the number of points in the training set) and we used it both for training and test.

Modeling the expected performance of individual features

Request a detailed protocol

The importance of a feature is quantified by the corresponding optimal test error

(5) 0dx0Rdy(y-f*(x))2p(x,y)0R(y-y¯)2p(y)dy.

where R is the length of the cone. The target, or regression, function is given by f*:

(6) f*(x)=y|x=0Ryp(y|x)dy

and can be shown to minimize the expected error among all possible functions. In practice, the optimal expected error cannot be computed exactly, since neither the target nor the data distribution are known. In this paper, we choose to estimate the regression function from data using Kernel ridge regression with a Gaussian kernel. The choice of this latter approach is due to its nonparametric nature, which ensures that any target function f* can be recovered provided large enough samples, and more generally that accurate estimates can be derived when only finite data are given Hastie et al., 2001; Steinwart, 2002. Provided with a kernel ridge regression estimator, an approximation to the optimal test error can then be computed on a hold-out set of data.

Next, we are interested into developing a clearer connection between the above statistical approach and quantities with a direct physical meaning. Towards this end, note that the joint, marginal and posterior distributions, given the prior on y and the likelihood p(x|y) are given by,

(7) p(x,y)=p(x|y)p(y)
(8) p(x)=0p(x,y)dy
(9) p(y|x)=p(x,y)p(x).

Assuming that points are sampled uniformly within the cone of detection, the prior is p(y)=2y/R2 (thus y¯=2R/3 and the denominator in Equation 5 is R2/18). Importantly, the likelihood p(x|y) is dictated by the fluid dynamics of odor plumes from a concentrated source. Our features are sample averages of intensity or timing attributes of the odor signal, and in the limit of large samples, they are normally distributed, that is

(10) p(x|y)=N(xg(y),s(y)).

Assuming the asymptotic limit to be well approximated, we can find empirical estimates for g(y) and s(y) given the data. This provides the desired characterization of the expected error and hence of the importance of each given feature in term of predictive power. Indeed, by numerically solving Equations 5–9 with the assumption (Equation 10) and using the empirical estimates for g(y) and s(y), we reproduce the predictive power of individual features showed in Figure 5c (see Figure 5—figure supplement 3). To move beyond empirical estimates of the likelihood and generalize predictions to other kinds of flows one could generalize asymptotic models of turbulent plumes developed in Celani et al., 2014 to account for z-variations of the sampling locations. Similarly, extension are needed to consider combination of possibly non Gaussian features.

Data availability

Request a detailed protocol

Simulations of odor transport are generated through NEK5000 Fischer et al., 2008, freely available from Argonne National Laboratory (https://nek5000.mcs.anl.gov/). Outputs from DNS presented in Figure 1 are processed to extract time series and compute the five features described in the text (average concentration, slope, blank duration, whiff duration and intermittency factor). These data are available at the online repository https://osf.io/ja9xr/. The results of kernel ridge regression perfomed on these data are presented in Figures 26. We perform kernel ridge regression on these data with the freely available code FALKON Rudi et al., 2018 (https://github.com/LCSL/FALKON_paper, Rigolli, 2022 copy archived at swh:1:rev:480741cf1e7da0d1d7415309cd6f254080a6ca17).

Data availability

Simulations of odor transport are generated through NEK5000 (43), freely available from Argonne National Laboratory (https://nek5000.mcs.anl.gov/). Outputs from DNS presented in Figure 1 are processed to extract time series and compute the five features described in the text (average concentration, slope, blank duration, whiff duration and intermittency factor). These data are available at the online repository https://osf.io/ja9xr/. The results of kernel ridge regression performed on these data are presented in Figures 2-6. We perform kernel ridge regression on these data with the freely available code FALKON (https://github.com/LCSL/FALKON_paper, copy archived at swh:1:rev:480741cf1e7da0d1d7415309cd6f254080a6ca17).

References

  1. Thesis
    1. Ho L
    (1989)
    A Legendre spectral element method for simulation of incompressible unsteady viscous free-surface flows
    Massachusetts Institute of Technology, Cambridge, USA.
  2. Book
    1. Pope SB
    (1984) Turbulent Flows
    Cambridge: Cambridge University Press.
    https://doi.org/10.1017/CBO9780511840531
  3. Conference
    1. Rudi A
    2. Carratino L
    3. Rosasco L
    (2018) Neural information processing systems
    Proceedings of the 31st International Conference on Neural Information Processing Systems.
  4. Book
    1. Schölkopf B
    2. Smola A
    (2002)
    Learning with Kernels
    Cambridge, MA, USA: MIT Press.

Decision letter

  1. Raymond E Goldstein
    Reviewing Editor; University of Cambridge, United Kingdom
  2. Aleksandra M Walczak
    Senior Editor; CNRS LPENS, France

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Decision letter after peer review:

Thank you for submitting your article "Learning to predict target location with turbulent odor plumes" for consideration by eLife. Your article has been reviewed by 2 peer reviewers, one of whom is a member of our Board of Reviewing Editors, and the evaluation has been overseen by Aleksandra Walczak as the Senior Editor. The reviewers have opted to remain anonymous.

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

Essential revisions:

1) It would be good if the authors could provide evidence that their conclusions do not depend on the precise location of the odor source relative to the cylinder. We do not require an exhaustive study, but some evidence would be very helpful.

2) We think that more could have been understood from these data, had the authors tried to focus on dimensionless quantities. It would be important to understand what sets the distance from the source where the intensity-based search strategy becomes less effective than the time-based one, and how the results depend on the Schmidt number.

3) The authors should carefully specify how dimensionless parameters are introduced. They do it in places but not systematically. For instance, how are wavenumbers in Figure 1(d) are made dimensionless?

4) On a similar note, how does the range of k's in Figure 1(d) compare to the height of the channel? Are these k's taken along the streamwise direction only? What is the vertical axis in that graph? Would it integral over all k's correspond to the odour density variance? These need to be specified.

5) What is the value of odour diffusivity \kapppa_\theta? What is the Schmidt number in these simulations? Visual inspection of Figure 1(d) suggests that Sc>>1, which might explain why the -5/3 slope is so far from representing the data. A reference to passive scalar advection by turbulence is in order here. We would suggest K. Sreenivasan, Turbulent mixing: A perspective, PNAS (2019).

6) Perhaps T in line 335 should be replaced by \theta.

7) Please provide figure numbers in lines 377, 412, and 414.Reviewer #1 (Recommendations for the authors):

In this work, the authors combine high resolution numerical simulations of turbulent airflow that advects a passive scalar (an "odor") and machine learning algorithms to investigate the question of how a navigation strategy based on the temporal dynamics of odor detection compares with one based on the intensity of the plume. In the simulations, the odor is introduced downstream of a cylinder that induces turbulence in the flow and the machine learning algorithm is trained and evaluated at various points further downstream. The authors conclude that intensity/gradient based measurements work closer to the source, while temporal schemes are good throughout the range. The study is done to a very high standard and presented clearly, although there are general questions about the data analysis (see below) that should be improved. That said, the work should have significant impact on a broad range of fields, from sensing to navigation, across a range of organism length scales.

One concern is the nature of the turbulent profile and its advection of the passive scalar. It would help if the characteristic dimensionless numbers of the problem were specified (Peclet, Schmidt numbers), and it was made clear how the choice of odor release point affects the conclusions. The turbulence itself spreads and diffuses with distance from the source, and one would presume the location of the odor release can matter substantially.

Also, as mentioned briefly in the Discussion, this work examines algorithms based on evaluating accuracy of prediction at a given measurement point. Unless I have missed something in the presentation, the issue of navigation is left unexamined. As in bacterial chemotaxis or any related problem, no navigation strategy is perfect, especially in the face of such a fluctuating source of information, so the full problem involves making estimates of where the source is and then moving to a new location, estimating again, moving, etc. The authors should clarify under what circumstances an evaluation at a fixed point is sufficiently predictive of "learning".Reviewer #2 (Recommendations for the authors):

In their manuscript, Rigolli et al., studied how measurements of intensity of a passive scalar (odour), its spatial gradients, and its time variations can be used to efficiently find the spatial location of its source. To this end, the authors performed direct numerical simulations of high-Reynolds number pressure-driven channel flow, partially blocked by a cylinder to generate turbulent velocity fluctuations. At a fixed position downstream the cylinder, they introduce a source of odour that diffuses through the fluid and is advected by its turbulent motion. The authors then trained a supervised machine learning algorithm on a collection of spatial and temporal odour profiles thus obtained. The algorithm was subsequently tested on various odour signals originating from the same pool of simulations. By varying the spatial position of the measurement point and the temporal exposure, the authors concluded that detection strategies based on the odour concentration and its spatial gradient work best at small separations from the source (and also close to the zero-odour concentration boundaries), while time-based strategies work reasonably well everywhere within the domain selected for detection.

Within the geometry set by the authors, the conclusions of the paper are supported by data. However, this setup leads to a natural question whether the search strategies determined here pertain to the distance to the odour source or to the distance to the turbulence source (the half-cylinder). Turbulence generated by an obstacle has a particular spatial profile; its temporal profile also depends on the distance to the obstacle. It is therefore possible that the machine learning algorithm has indirectly picked up these features rather than the odour profile itself. This can be settled by feeding the existing algorithm signals from simulations with various distances between the source and the obstacle. In the absence of this check, it is impossible to make general statements about applicability of these search strategies in other situations.

Another weak point of this study is the use of a cone of detection as it dramatically reduces the search complexity to almost a quasi one-dimensional problem. The authors appreciate it and point out that their choice is forced by a very limited by a very small number of detections outside the cone. This shortcoming should be addressed in future work.

https://doi.org/10.7554/eLife.72196.sa1

Author response

Essential revisions:

1) It would be good if the authors could provide evidence that their conclusions do not depend on the precise location of the odor source relative to the cylinder. We do not require an exhaustive study, but some evidence would be very helpful.

We expect results to be robust to changes in source location, as long as this is within the bulk of the channel and not within the viscous layer, which generates substantially less intermittent odor plumes (see e.g. Fackrell and Robins, J Fluid Mech 117:1, 1982). We performed additional computational fluid dynamics simulations, moving the source to two further locations downstream of the obstacle, at the same height and performed inference over an identical conical domain shifted to match the source location (see Figure 2, Figure Supplement 5 left). Performance at source height varies little over the three locations, demonstrating that the algorithm learns from the dynamics of the scalar and not of the underlying velocity field (we chose one individual feature per class and their combination, see Figure 2, Figure Supplement 5 center). For all three sources, performance of the average degrades with height, whereas performance of intermittency improves with height (Figure 2, Figure Supplement 5 right), consistent with the results of the Figure 5c and despite using a different cone. Finally, we trained the algorithm with a dataset generated by one source and tested it on dataset obtained from the other sources: we found that pairs of features are more sensitive to the details of the dataset, whereas individual features maintain their full predictive power (Figure 2 Figure Suppl 6).

If and how animals adapt their models to deal with sources at different heights is a fascinating question that is relevant for animal behavior; we believe this point deserves an in depth investigation on its own and we are currently addressing these questions in a different study.

Author response image 1
Estimated test error of individual features at different heights using the theoretical framework Equation (1)-(5) outlined in the text with the empirical likelihood estimated from data (not shown).

Symbols as in Figure 5c of the manuscript (black / grey represent timing /intensity features; dashed lines: whiffs).

2) We think that more could have been understood from these data, had the authors tried to focus on dimensionless quantities. It would be important to understand what sets the distance from the source where the intensity-based search strategy becomes less effective than the time-based one, and how the results depend on the Schmidt number.

We thank the reviewers for this important observation that led us to develop a theoretical frame work that we believe deserves further attention. We highlight below the conceptual building blocks, which we have detailed in the manuscript. The predictive power χ of an individual feature x is its expected error: 0R(yf(x))2p(x,y)dy0R(yy¯)2p(y)dy where p(x,y) is the joint probability distribution of the input output pair (x,y)and p(y) is the prior on the output y.f*(x) is the optimal predictor and can be shown to be f(x)=0Ryp(y|x)dy, where R is the length of the cone (see Hastie et al., The elements of statistical learning: Datamining, inference, and prediction, 2001 and Steinwart, J of Complexity 18:768, 2002). The expected error is generally unavailable because p<milestone-start />(<milestone-end />yx<milestone-start />)<milestone-end /> and p(x,y) are unknown. However, we can connect equation (1) to the physics of the odor plume by computing the joint, marginal and posterior distributions using the likelihood p(x│y):p(x,y)=p(x|y)p(y) p(x)=0p(x,y)dy p(y|x)=p(x,y)p(x) where the prior is p(y)=2y/R2 (thus y¯= 2R/3 and the denominator in equation (1) is R2/18). Importantly, the likelihood is dictated by the fluid dynamics of odor plumes from a concentrated source. Because our individual features are sample averages, their statistics is approximately Gaussian: p(x|y)N(xg(y),s(y)) and we can then simply characterize the likelihood by data fitting to estimate their average and standard deviation. By solving equations (1) to (4) with the assumption (5) and the empirical estimates for g(y) and s(y) we recover the predictive power of individual features showed in Figure 5c of the manuscript (see Figure 1). (The argument needs to be better adapted to whiffs which deviate considerably from a normal distribution)

We believe this theoretical framework deserves further attention. Indeed, one can move

beyond empirical estimates of the likelihood and generalize predictions to pairs of features and other kinds of flows by leveraging the asymptotic arguments recently proposed by Celani et al., Phys Rev 4:041015, 2014. We do not further elaborate on this point here, but we are currently working to fully develop these asymptotic arguments not only to obtain scaling behaviors for g(y) and s(y) but also the prefactor, which is crucial to understand what non-dimensional quantities dictate the switch from timing to intensity. Note that in this regime the Schmidt number plays a minor role, as we further elaborate on below.

3) The authors should carefully specify how dimensionless parameters are introduced. They do it in places but not systematically. For instance, how are wavenumbers in Figure 1(d) are made dimensionless?

Thank you for the comment, we have provided information on the Schmidt number used in the simulations (Sc = 1) and the definition of wavenumbers in Figure 1d (see next answer for more details on this point), as well as dimensional parameters that were either only mentioned in the text (viscosity) or not mentioned (diffusivity). All information is now summarized in Table 1.

4) On a similar note, how does the range of k’s in Figure 1(d) compare to the height of the channel? Are these k’s taken along the streamwise direction only? What is the vertical axis in that graph? Would it integral over all k’s correspond to the odour density variance? These need to be specified.

Thank you for the comment, we have included the missing information. Figure 1d shows the two dimensional spectra Ek=ddk(kxy<kc^(kxy)2d2k) normalized with the scalar variance σc2, where ĉ(κxy) is the two dimensional Fourier transform of the scalar concentration at the height of the source. The integral of the spectra is indeed the scalar variance. We non-dimensionalize the wavenumber with the inverse Kolmogorov scale η −1. We corrected a small discrepancy coming from the fact that the Legendre polynomials used by Nek5000 have sampling points that are not perfectly equally spaced. The k−5∕3 scaling holds for k η 0.1, consistent with previous experimental results in channel flow (see e.g. S.G. Saddoughi and S.V. Veeravalli J. Fluid Mech.(1994) 268:333-372). To further characterize the flow we add a panel to figure 1, showing that the mean flow follows the well known law of the wall, recovering classical statistics for channel turbulence.

5) What is the value of odour diffusivity \kapppa_\theta? What is the Schmidt number in these simulations? Visual inspection of Figure 1(d) suggests that Sc>>1, which might explain why the -5/3 slope is so far from representing the data. A reference to passive scalar advection by turbulence is in order here. We would suggest K. Sreenivasan, Turbulent mixing: A perspective, PNAS (2019).

Here we work at Schmidt = 1, κθ = ν = 1.5 × 10−5m2∕s. The inertial range is short but consistent with previous literature on channel flow (see previous answer). Varying the Schmidt number affects the fine details between the Batchelor and Kolmogorov scales (Falkovich et al., Rev Mod Phys 105 73:913, 2001). These are below the size of the source in our simulations, as is also the case in 106 many situations of practical relevance. In this regime, the dynamics below the Kolmogorov scale 107 affects little the large scale statistics of odor plumes from concentrated sources which are dictated 108 by the separation of Lagrangian particles (as argued in Celani et al., Phys Rev X, 2014 and confirmed experimentally in Duplat et al., Phys Fluids, 22:035104, 2010). We included this comment at page 12.

6) Perhaps T in line 335 should be replaced by \theta.

Thank you, this is indeed a typo and should read c

7) Please provide figure numbers in lines 377, 412, and 414.

Thank you, we have added the references to the figures

Reviewer #1 (Recommendations for the authors):

In this work, the authors combine high resolution numerical simulations of turbulent airflow that advects a passive scalar (an "odor") and machine learning algorithms to investigate the question of how a navigation strategy based on the temporal dynamics of odor detection compares with one based on the intensity of the plume. In the simulations, the odor is introduced downstream of a cylinder that induces turbulence in the flow and the machine learning algorithm is trained and evaluated at various points further downstream. The authors conclude that intensity/gradient based measurements work closer to the source, while temporal schemes are good throughout the range. The study is done to a very high standard and presented clearly, although there are general questions about the data analysis (see below) that should be improved. That said, the work should have significant impact on a broad range of fields, from sensing to navigation, across a range of organism length scales.

One concern is the nature of the turbulent profile and its advection of the passive scalar. It would help if the characteristic dimensionless numbers of the problem were specified (Peclet, Schmidt numbers), and it was made clear how the choice of odor release point affects the conclusions. The turbulence itself spreads and diffuses with distance from the source, and one would presume the location of the odor release can matter substantially.

We thank the reviewer for the comment. We have now stated the non-dimensional numbers and remarked that in our regime the Schmidt number is expected to play a minor role (see response to general comments above). To analyze how the results depend on source location, we have conducted a series of numerical simulations with the source located further downstream from its original location, and this does not affect our conclusion. We did not move the source laterally, as this cannot affect the results because the flow is homogeneous in the crosswind direction z2. We expect that moving the source to the ground will affect the results because intermittency of an odor plume from a source located at ground is much less intermittent (see response to general comment 1). Animals that would target both sources in air and on ground would clearly have to use qualitatively different predictive models. Ethological studies on dogs suggest that they do track both sources on ground and in air, and that they do this by alternating sniffing in air and on ground.

We are currently investigating this fascinating behavior.

Also, as mentioned briefly in the Discussion, this work examines algorithms based on evaluating accuracy of prediction at a given measurement point. Unless I have missed something in the presentation, the issue of navigation is left unexamined. As in bacterial chemotaxis or any related problem, no navigation strategy is perfect, especially in the face of such a fluctuating source of information, so the full problem involves making estimates of where the source is and then moving to a new location, estimating again, moving, etc. The authors should clarify under what circumstances an evaluation at a fixed point is sufficiently predictive of "learning".

We agree entirely with the referee on this point, which we now stress more in the Discussion. We are currently working to test whether the best predictors are also the best features to be used in navigation. Although it is often implicitly assumed that this is indeed the case, up until now this question is entirely unexplored.

Reviewer #2 (Recommendations for the authors):

In their manuscript, Rigolli et al., studied how measurements of intensity of a passive scalar (odour), its spatial gradients, and its time variations can be used to efficiently find the spatial location of its source. To this end, the authors performed direct numerical simulations of high-Reynolds number pressure-driven channel flow, partially blocked by a cylinder to generate turbulent velocity fluctuations. At a fixed position downstream the cylinder, they introduce a source of odour that diffuses through the fluid and is advected by its turbulent motion. The authors then trained a supervised machine learning algorithm on a collection of spatial and temporal odour profiles thus obtained. The algorithm was subsequently tested on various odour signals originating from the same pool of simulations. By varying the spatial position of the measurement point and the temporal exposure, the authors concluded that detection strategies based on the odour concentration and its spatial gradient work best at small separations from the source (and also close to the zero-odour concentration boundaries), while time-based strategies work reasonably well everywhere within the domain selected for detection.

Within the geometry set by the authors, the conclusions of the paper are supported by data. However, this setup leads to a natural question whether the search strategies determined here pertain to the distance to the odour source or to the distance to the turbulence source (the half-cylinder). Turbulence generated by an obstacle has a particular spatial profile; its temporal profile also depends on the distance to the obstacle. It is therefore possible that the machine learning algorithm has indirectly picked up these features rather than the odour profile itself. This can be settled by feeding the existing algorithm signals from simulations with various distances between the source and the obstacle. In the absence of this check, it is impossible to make general statements about applicability of these search strategies in other situations.

Thank you for your comment. To test robustness of the algorithm, we have performed two additional sets of simulations moving the source further downstream of the obstacle: the results are consistent with the results of the main simulation; source location may affect details about where exactly the transition between the timing vs intensity regimes occurs. First, we verified that performance of 2 individual features and their pair does not depend on source location, confirming that the algorithm learns solely from odor statistics. Second, we trained the algorithm with data from one simulation and tested its performance over datasets from the remaining simulations.

We found that performance of individual features is preserved even when training and test are performed over different dataset. Pairs of features are more sensitive to consistency between training and test dataset (see Figure 2 Figure supplement 6). Third, we analysed performance for each source location as a function of height of the sampling plane. We recovered that timing features improve with height whereas intensity features degrade with height, as showed in Figure 5c (Figure 2 Figure Supplement 5, right). We note that the transition between timing vs intensity may occur at different heights for the three source locations, we did not investigate this comparison in more detail.

Another weak point of this study is the use of a cone of detection as it dramatically reduces the search complexity to almost a quasi one-dimensional problem. The authors appreciate it and point out that their choice is forced by a very limited by a very small number of detections outside the cone. This shortcoming should be addressed in future work.

We entirely agree with the referee, in fact olfactory searches first need to locate the plume and subsequently locate the source within the plume. Our manuscript defines the most informative features once the agent is within the plume. We believe that our results are not relevant for the previous stage when the agent searches for the plume. This is because there is hardly any detection at all outside of the cone. In fact, our ongoing work (currently under review) demonstrates a model based navigation strategy that uses absence of detection to narrow down the possible locations to search for the plume.

https://doi.org/10.7554/eLife.72196.sa2

Article and author information

Author details

  1. Nicola Rigolli

    1. Department of Physics, University of Genova, Genova, Italy
    2. Institut de Physique de Nice, Université Côte d’Azur, Centre National de la Recherche Scientifique, Nice, France
    3. National Institute of Nuclear Physics, Genova, Italy
    4. MalGa, Department of Civil, Chemical and Environmental Engineering, University of Genoa, Genoa, Italy
    Contribution
    Conceptualization, Software, Formal analysis, Validation, Investigation, Visualization, Methodology, Writing – original draft, Writing – review and editing
    For correspondence
    nicola.rigolli@edu.unige.it
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-0734-2105
  2. Nicodemo Magnoli

    1. Department of Physics, University of Genova, Genova, Italy
    2. National Institute of Nuclear Physics, Genova, Italy
    Contribution
    Supervision, Investigation, Project administration, Writing – review and editing
    Competing interests
    No competing interests declared
  3. Lorenzo Rosasco

    MaLGa, Department of computer science, bioengineering, robotics and systems engineering, University of Genova, Genova, Italy
    Contribution
    Software, Supervision, Funding acquisition, Methodology, Project administration, Writing – review and editing
    Competing interests
    No competing interests declared
  4. Agnese Seminara

    1. Institut de Physique de Nice, Université Côte d’Azur, Centre National de la Recherche Scientifique, Nice, France
    2. MalGa, Department of Civil, Chemical and Environmental Engineering, University of Genoa, Genoa, Italy
    Contribution
    Conceptualization, Software, Formal analysis, Supervision, Funding acquisition, Investigation, Visualization, Methodology, Writing – original draft, Project administration, Writing – review and editing
    For correspondence
    agnese.seminara@unige.it
    Competing interests
    Reviewing editor, eLife
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0001-5633-8180

Funding

European Research Council (101002724)

  • Agnese Seminara

Air Force Office of Scientific Research (FA8655-20-1-7028)

  • Lorenzo Rosasco

National Institutes of Health (R01DC018789)

  • Agnese Seminara

Agence Nationale de la Recherche (ANR-15-IDEX-01)

  • Agnese Seminara

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

This work was supported by: the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 101002724 RIDING); the Air Force Office of Scientific Research under award number FA8655-20-1-7028; the National Institutes of Health (NIH) under award number R01DC018789; the French government, through the UCAJEDI Investments in the Future project managed by the National Research Agency (ANR) under reference number #ANR-15-IDEX-01. The authors are grateful to the OPAL infrastructure from Université Côte d’Azur and the Université Côte d’Azur’s Center for High-Performance Computing for providing resources and support. N.M. and N.R. are thankful for the support of Instituto Nazionale di Fisica Nucleare (INFN) Scientific Initiative SFT: Statistical Field Theory, Low-Dimensional Systems, Integrable Models and Applications.

Senior Editor

  1. Aleksandra M Walczak, CNRS LPENS, France

Reviewing Editor

  1. Raymond E Goldstein, University of Cambridge, United Kingdom

Publication history

  1. Preprint posted: June 16, 2021 (view preprint)
  2. Received: July 14, 2021
  3. Accepted: July 18, 2022
  4. Version of Record published: August 12, 2022 (version 1)

Copyright

© 2022, Rigolli et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 270
    Page views
  • 71
    Downloads
  • 1
    Citations

Article citation count generated by polling the highest count across the following sources: PubMed Central, Crossref, Scopus.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Open citations (links to open the citations from this article in various online reference manager services)

Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)

  1. Nicola Rigolli
  2. Nicodemo Magnoli
  3. Lorenzo Rosasco
  4. Agnese Seminara
(2022)
Learning to predict target location with turbulent odor plumes
eLife 11:e72196.
https://doi.org/10.7554/eLife.72196

Further reading

    1. Immunology and Inflammation
    2. Physics of Living Systems
    Derek M Britain, Jason P Town, Orion David Weiner
    Research Advance Updated

    T cells use kinetic proofreading to discriminate antigens by converting small changes in antigen-binding lifetime into large differences in cell activation, but where in the signaling cascade this computation is performed is unknown. Previously, we developed a light-gated immune receptor to probe the role of ligand kinetics in T cell antigen signaling. We found significant kinetic proofreading at the level of the signaling lipid diacylglycerol (DAG) but lacked the ability to determine where the multiple signaling steps required for kinetic discrimination originate in the upstream signaling cascade (Tiseher and Weiner, 2019). Here, we uncover where kinetic proofreading is executed by adapting our optogenetic system for robust activation of early signaling events. We find the strength of kinetic proofreading progressively increases from Zap70 recruitment to LAT clustering to downstream DAG generation. Leveraging the ability of our system to rapidly disengage ligand binding, we also measure slower reset rates for downstream signaling events. These data suggest a distributed kinetic proofreading mechanism, with proofreading steps both at the receptor and at slower resetting downstream signaling complexes that could help balance antigen sensitivity and discrimination.

    1. Physics of Living Systems
    Ye Li, Shiqi Liu ... Yilin Wu
    Research Article

    Long-range material transport is essential to maintain the physiological functions of multicellular organisms such as animals and plants. By contrast, material transport in bacteria is often short-ranged and limited by diffusion. Here we report a unique form of actively regulated long-range directed material transport in structured bacterial communities. Using Pseudomonas aeruginosa colonies as a model system, we discover that a large-scale and temporally evolving open channel system spontaneously develops in the colony via shear-induced banding. Fluid flows in the open channels support high-speed (up to 450 µm/s) transport of cells and outer membrane vesicles over centimeters, and help to eradicate colonies of a competing species Staphylococcus aureus. The open channels are reminiscent of human-made canals for cargo transport, and the channel flows are driven by interfacial tension mediated by cell-secreted biosurfactants. The spatial-temporal dynamics of fluid flows in the open channels are qualitatively described by flow profile measurement and mathematical modeling. Our findings demonstrate that mechanochemical coupling between interfacial force and biosurfactant kinetics can coordinate large-scale material transport in primitive life forms, suggesting a new principle to engineer self-organized microbial communities.