Inferring characteristics of bacterial swimming in biofilm matrix from time-lapse confocal laser scanning microscopy

  1. Guillaume Ravel
  2. Michel Bergmann
  3. Alain Trubuil
  4. Julien Deschamps
  5. Romain Briandet
  6. Simon Labarthe  Is a corresponding author
  1. University of Bordeaux, INRAE, BIOGECO, France
  2. Inria, INRAE, France
  3. Memphis Team, INRIA, France
  4. University of Bordeaux, IMB, UMR 5251, France
  5. CNRS, IMB, UMR 5251, France
  6. Université Paris-Saclay, INRAE, MaIAGE, France
  7. Université Paris-Saclay, INRAE, AgroParisTech, Micalis Institute, France

Abstract

Biofilms are spatially organized communities of microorganisms embedded in a self-produced organic matrix, conferring to the population emerging properties such as an increased tolerance to the action of antimicrobials. It was shown that some bacilli were able to swim in the exogenous matrix of pathogenic biofilms and to counterbalance these properties. Swimming bacteria can deliver antimicrobial agents in situ, or potentiate the activity of antimicrobial by creating a transient vascularization network in the matrix. Hence, characterizing swimmer trajectories in the biofilm matrix is of particular interest to understand and optimize this new biocontrol strategy in particular, but also more generally to decipher ecological drivers of population spatial structure in natural biofilms ecosystems. In this study, a new methodology is developed to analyze time-lapse confocal laser scanning images to describe and compare the swimming trajectories of bacilli swimmers populations and their adaptations to the biofilm structure. The method is based on the inference of a kinetic model of swimmer populations including mechanistic interactions with the host biofilm. After validation on synthetic data, the methodology is implemented on images of three different species of motile bacillus species swimming in a Staphylococcus aureus biofilm. The fitted model allows to stratify the swimmer populations by their swimming behavior and provides insights into the mechanisms deployed by the micro-swimmers to adapt their swimming traits to the biofilm matrix.

Editor's evaluation

This paper nicely considers how the biofilm matrix impacts the organism's moving within that environment, connecting prior analyses of cell movements on/within abiotic substrates to those within a "living" substrate. Though there are instinctive descriptions for this motility, the strength of this manuscript is the development and implementation of a statistical model that quantifies critical parameters and incorporates interactions with the biofilm matrix itself. While the manuscript measures the differences between morphologically distinct bacteria, a long-term possibility is to achieve predictable and reliable delivery of antimicrobials (delivered by bacteria or an abiotic object) into the biofilm's center, thereby reducing a biofilm's recalcitrant responses to biocontrol chemicals.

https://doi.org/10.7554/eLife.76513.sa0

eLife digest

Anyone who has ever cleaned a bathroom probably faced biofilms, the dark, slimy deposits that lurk around taps and pipes. These structures are created by bacteria which abandon their solitary lifestyle to work together as a community, secreting various substances that allow the cells to organise themselves in 3D and to better resist external aggression.

Unwanted biofilms can impair industrial operations or endanger health, for example when they form inside medical equipment or water supplies. Removing these structures usually involves massive application of substances which can cause long-term damage to the environment.

Recently, researchers have observed that a range of small rod-shaped bacteria – or ‘bacilli’ – can penetrate a harmful biofilm and dig transient tunnels in its 3D structure. These ‘swimmers’ can enhance the penetration of anti-microbial agents, or could even be modified to deliver these molecules right inside the biofilm. However, little is known about how the various types of bacilli, which have very different shapes and propelling systems, can navigate the complex environment that is a biofilm. This knowledge would be essential for scientists to select which swimmers could be the best to harness for industrial and medical applications.

To investigate this question, Ravel et al. established a way to track how three species of bacilli swim inside a biofilm compared to in a simple fluid. A mathematical model was created which integrated several swimming behaviors such as speed adaptation and direction changes in response to the structure and density of the biofilm. This modelling was then fitted on microscopy images of the different species navigating the two types of environments.

Different motion patterns for the three bacilli emerged, each showing different degrees of adapting to moving inside a biofilm. One species, in particular, was able to run straight in and out of this environment because it could adapt its speed to the biofilm density as well as randomly change direction.

The new method developed by Ravel et al. can be redeployed to systematically study swimmer candidates in different types of biofilms. This would allow scientists to examine how various swimming characteristics impact how bacteria-killing chemicals can penetrate the altered biofilms. In addition, as the mathematical model can predict trajectories, it could be used in computational studies to examine which species of bacilli would be best suited in industrial settings.

Introduction

Biofilm is the most abundant mode of life of bacteria and archaea on earth (Flemming and Wuertz, 2019; Flemming et al., 2016b). They are composed of spatially organized communities of microorganisms embedded in a self-produced extracellular polymeric substances (EPS) matrix. EPS are typically forming a gel composed of a heterogenous mixture of water, polysaccharides, proteins, and DNA (Flemming et al., 2016a). The biofilm mode of life confers to the inhabitant microbial community strong ecological advantages such as resistance to mechanical or chemical stresses (Bridier et al., 2011) so that conventional antimicrobial treatments remain poorly efficient against biofilms (Bridier et al., 2015). Different mechanisms were invoked such as molecular diffusion-reaction limitations in the biofilm matrix and the cell type diversification associated with stratified local microenvironments (Bridier et al., 2017). Biofilms can induce harmful consequences in several industrial applications, such as water (Beech and Sunner, 2004), or agri-food industry (Doulgeraki et al., 2017), leading to significant economic and health burden (Köck et al., 2010). Indeed, it was estimated that the biofilm mode of life is involved in 80% of human infection and usual chemical control leads to serious environmental issues (Bridier et al., 2011). Hence, finding efficient ways to improve biofilm treatment represents important societal sustainable perspectives.

Motile bacteria have been observed in host biofilms formed by exogenous bacterial species (Houry et al., 2012; Li et al., 2014; Piard et al., 2016; Flemming et al., 2016a). These bacterial swimmers are able to penetrate the dense population of host bacteria and to find their way in the interlace of EPS. Doing so, they visit the 3D structure of the biofilm, leaving behind them a trace in the biofilm structure, that is a zone of extracellular matrix free of host bacteria (Figure 1a and Appendix 1—figure 3). Hence, bacterial swimmers are digging a network of capillars in the biofilm, enhancing the diffusivity of large molecules (Houry et al., 2012), allowing the transport of biocide at the heart of the biofilm, reducing islands of living cells. The potentiality of bigger swimmers has also been studied for biofilm biocontrol, including spermatozoa (Mayorga-Martinez et al., 2021), protozoans (Derlon et al., 2012), or metazoans (Klein et al., 2016). Recent results suggest a deeper role of bacterial swimmers in biofilm ecology with the concept of microbial hitchhiking: motile bacteria can transport sessile entities such as spores (Muok et al., 2021), phages (Yu et al., 2021) or even other bacteria (Samad et al., 2017), enhancing their dispersion within the biofilm. Hence, characterizing microbial swimming in the very specific environment of the biofilm matrix is of particular interest to decipher biofilm spatial regulations and their biocontrol, but more generally in an ecological perspective of microbial population dynamics in natural ecosystems.

Microscopy data and model outlines.

(a) Temporal stacks of 2D images are acquired, with different fluorescence colors for host bacteria (Staphylococcus aureus, green) and swimmers (Bacillus pumilus, Bacillus sphaericus or Bacillus cereus, red). Bacterial swimmers navigate in a host biofilm and are tracked in the different snapshots. Swimmer trajectories are represented with white lines. High density and low density zones of host cells are visible in the biofilm (green scale). (b) Additionally to speed and acceleration distributions, three trajectory descriptors are considered. Distance is the total length of the trajectory path. Displacement is the distance between the initial and final points of the trajectory. Visited area is the total area of the pores left by the swimmer during its path. Hence, when a swimmer retraces its steps, the displacement is incremented but not the visited area. (c) Three different mechanisms are considered in the mechanistic model. Biofilm-dependant speed. A target speed is defined accordingly to the local density of biofilm and asymptotically reached after a relaxation time. Biofilm-dependent direction. Swimming direction is defined accordingly to the local biofilm density gradient. Random walk. A Brownian motion is added. (d) The image acquisition workflow is composed of a first step at the wet lab where host biofilm and swimmer are plated and imaged in different color channels. Then a post-processing phase recomposes the swimmer trajectories with tracking algorithms. Finally, temporal positions, speeds and accelerations are computed. On the biofilm channel, density and density gradient maps are processed at each time step.

Bacterial swimming is strongly influenced by the micro-topography and bacteria deploy strategies to sense and adapt their motion to their environment (Lee et al., 2021), with specific implications for biofilm formation and dynamics (Conrad and Poling-Skutvik, 2018). Model-based studies were conducted to characterize bacterial active motion in interaction with an heterogeneous environment. An image and model-based analysis showed non-linear self-similar trajectories during chemotactic motion with obstacles (Koorehdavoudi et al., 2017). Theoretical studies explored Brownian dynamics of self-propelled particles in interaction with filamentous structures such as EPS (Jabbarzadeh et al., 2014) or with random obstacles, exhibiting continuous limits and different motion regimes depending on obstacle densities (Chepizhko and Peruani, 2013; Chepizhko et al., 2013). Image analysis characterized different swimming patterns in polymeric fluids (Patteson et al., 2015), completed by detailed comparisons between a micro-scale model of flagellated bacteria in polymeric fluids and high-throughput images (Martinez et al., 2014). Models of bacterial swimmers in visco-elastic fluids were also developed to study the force fields encountered during their run (Li and Ardekani, 2016). However, to our knowledge, no study tried to characterize swimming patterns in the highly heterogeneous environment presented by an exogenous biofilm matrix.

In this study, we aim to provide a quantitative characterization of the different swimming behaviours in adaptation to the host biofilm matrix observed by microscopy. We focus on identifying potential species-dependent swimming characteristics and quantifying the swimming speed and direction variations induced by the host biofilm structure. To address these goals, three different Bacillus species presenting contrasted physiological characteristics are selected. First, different trajectory descriptors accounting for interactions with the host biofilm are defined, allowing to discriminate the swim of these bacterial strains by differential analysis. Then, a mechanistic random-walk model including swimming adaptations to the host biofilm is introduced. This model is numerically explored to identify the sensitivity of the trajectory descriptors to the model parameters. An inference strategy is designed to fit the model to 2D+T microscopy images. The method is validated on synthetic data and applied to a microscopy dataset to decipher the swimming behaviour of the three Bacillus.

Results

Ultrastuctural bacterial morphology

To investigate how the shape and propelling mechanism of bacteria can affect the way they navigate in a porous media such as a biofilm, we first image three bacterial swimmers –Bacillus pumilus (B. pumilus), Bacillus sphaericus (B. sphaericus), and Bacillus cereus (B. cereus) – by Transmitted Electron Microscopy (TEM) (Figure 2) to seek for potential structural and physiological differences. Important discrepancies can be observed between these Bacillus. First, they show noticeable difference in length and diameter, B. sphaericus being the longest bacteria by a factor of approximatively 1.5, and B. cereus and B. pumilus having similar size, but B. cereus showing a higher aspect ratio. Secondly, they do not have the same type of flagella: B. pumilus and B. sphaericus present several long flagella distributed over the whole surface of the membrane while B. cereus shows a unique brush-like bundle of very thin flagella, at its back tip.

TEM images of the three Bacillus.

TEM images of the three Bacillus are acquired, scaled in the same dimension and aligned (left panel). Images at lower scale are made with a zoom in on the flagella insertion (right panel). Note that the zoom in is optical so that the zoomed in image do not correspond to a zone of the larger scale images.

We then used these three species to test if these ultrastructural differences could impact their swimming behaviour in a host biofilm or in a Newtonian control fluid: could the longer body of B. sphaericus be an impediment in a crowded environment such as a biofilm or on the contrary could its larger size give it a higher strength to cross the biofilm matrix? Is the unique brush-like flagella of B. cereus an advantage or a disadvantage to swim in a Newtonian fluid or in a host biofilm?

Characterizing bacterial swimming in a biofilm matrix through image descriptors

2D+T Confocal Laser Scanning Microscopy (CLSM) of the three Bacillus swimming in a Staphylococcus aureus (S. aureus) host biofilm or in a control Newtonian buffer are acquired (see Figure 1d). Swimmers and host biofilms are imaged with different fluorescent dyes, allowing their acquisition in different color channels, and to recover in the same spatio-temporal referential the swimmer trajectories and the host biofilm density (see Materials and methods, Figure 1 and Table 1). Namely, for each species s and individual swimmer i, we recover the initial (T0,is) and final (Tend,is) observation times (when the swimmer goes in and out the focal plane, see Materials and methods sect. Confocal Laser Scanning Microscopy [CLSM]), and the number Tis of time points in the trajectory. We then extract from the 2D+T images the observed position, instantaneous speed and acceleration time-series

tXis(t),tVis(t),tAis(t), for t(T0,is,Tend,is).
Table 1
Dataset characteristics.

We detailed, for each batch, the number of trajectories, the average number of time points by trajectory (and standard deviation), the total number of time points in the dataset, the total movie duration in seconds and the time interval between two snapshots in seconds.

SpeciesBatch# traject.traj. lengthtime pointsDuration [s]Δt[s]
B. pumilus112240 (7.4)4,590300.134
215225 (5.7)3,543300.134
324338 (6.9)8,825300.134
B. sphaericus19840 (7.6)3,762300.134
29143 (7.7)3,771300.134
34855 (7.9)2,543230.134
B. cereus110547 (7.9)4,766300.069
25336 (7.7)1,808300.069
312143 (7.1)5,006300.069

Noting bs(t,x) the dynamic biofilm density maps obtained from the biofilm images, we also compute the local biofilm density and density gradient along trajectories

tbs(t,Xis(t)), and tbs(t,Xis(t)).

The angle θis(t) and the average velocity V¯is(t) between two successive speed vectors are also collected (see Materials and methods sec. Post-processing of image data).

Different swimming patterns can be deciphered by qualitative observations of the trajectories Xis(t) (Figure 3) in the biofilm and in the control Newtonian buffer, and run-and-tumble swimming patterns are quantified with θis(t) and V¯is(t) (Figure 4). B. sphaericus has a similar run-and-reverse behaviour in the biofilm and the control buffer with trajectories divided between back and forth paths around the starting point and long runs, the biofilm strongly impairing its speed and increasing the number of reverse events. By contrast, B. pumilus clearly switches its swimming behaviour in the biofilm, from quasi-straight runs in the Newtonian buffer to a pronounced run-and-reverse behaviour in the biofilm with decreased speeds and chaotic trajectories. On the contrary, B. cereus swimmers manage to conserve comparable trajectories and distributions of swimming speed and direction in the biofilm compared to control. Interestingly, the number of reverse events is even reduced in the host biofilm for B. cereus.

Swimmer trajectories The whole set of trajectories of each species is displayed in the control Newtonian buffer (upper panel) and in the host biofilm (lower panel).

Note that the 3 batches of the different species are pooled on these images. Number of trajectories are n = 517 and 123 (B. pumilus), n = 237 and 94 (B. sphaericus) and n = 279 and 144 (B. cereus) for, respectively, the biofilm and the control buffer. The physical size of the domain is 147x147μm.

Assessing run-and-tumble with speed and direction distributions.

For each time point, the swimmer mean speed V¯is(t), defined as the mean between the incoming and outgoing velocity vectors V¯is(t)=(Vis(t)+Vis(tΔt))/2, for t(T0,is+Δt,Tend,is), is plotted versus the direction change, defined as the angle θis(t) between the incoming and outgoing velocity vectors θis(t)=arccos((Vis(t)Vis(tΔt))/(||Vis(t)||Vis(tΔt)||)). The left and bottom panels indicate the marginal distributions, with the mean (dashed line) and quantiles 0.05, 0.5, and 0.95 (plain lines). Number of times points are n = 6 848 and 6 509 (B.pumilus), n = 2 818 and 3 740 (B. sphaericus) and n = 3 526 and 4 435 (B. cereus) for, respectively, the biofilm and the control buffer.

For further quantitative analysis, trajectory descriptors are defined. We first investigate the distribution of the population-wide average acceleration and velocity norms 1Tis2tAis(t) and 1Tis1tVis(t), where denotes the Euclidian norm. We also quantify the swimming kinematics by computing the travelled distance distis along the path and the total displacement dispis, that is the distance between the initial and final trajectory points, with

distis=T0,isTend,isVis(t)dt and dispis=X(Tend,is)-X(T0,is)=T0,isTend,isVis(t)dt.

We finally compute the total biofilm area visited by a swimmer along its path (see Figure 1b). The same descriptors are computed in the control Newtonian buffer.

The three species present contrasted distributions for these descriptors (Figure 5). B. sphaericus has the smallest mean (||A||=0.58 and ||V||=0.70) and median (A=0.50 and V=0.53) values of acceleration and speed, while B. pumilus has the widest distributions (difference between 95% and 5% centiles of 2.76 for A and 2.45 for V compared to 1.00, 1.51 and 1.90, 1.49 for B. sphaericus and B. cereus respectively). B. cereus for its part shows the highest accelerations, indicating larger changes in swimming velocities, but median and mean speeds comparable to B. pumilus (Figure 5, A and V panels). We also note that B. sphaericus and to a lower extent B. pumilus trajectories have a significant amount of null or small average speeds, while B. cereus trajectories have practically no zero velocity, consistently with the qualitative analysis (Figure 5, V panels). Small velocities episodes of B. sphaericus and B. pumilus could occur during their back-and-forth trajectories, which produce small displacements and pull the displacement distribution towards lower values than B. cereus (Figure 5, Disp panel). B. pumilus displacement is intermediary. Conversely, back-and-forth trajectories can produce large swimming distances for B. sphaericus and B. pumilus (mean adimensioned value of 32.2 and 43.2 respectively) so that B. sphaericus has a distance distribution comparable to B. cereus (mean adimensioned value of 29.6, Figure 5, Dist panel), but lower than B. pumilus. Observing conjointly displacement and distance (Figure 5, lower-right panel) provides consistent insights: B. sphaericus shows a large variability of small displacement trajectories, from small to large distances, while B. cereus trajectory displacement seems to vary almost linearly with the distance at least for the points inside the isoline 50%. B. pumilus has again an intermediary distribution, with a large range of displacement-distance couples. The distributions of visited areas of B. pumilus and B. cereus are almost identical, and higher than B. sphaericus one. Compared to the control buffer, all descriptors are reduced in the biofilm. Consistently with previous observations, the displacement (disp) is strongly reduced for B. pumilus, and less impacted for B. sphaericus and B. cereus. These observations must be related to the behavioural switch for B. pumilus and to the identical swimming patterns for the two other Bacilii in the biofilm compared to the control fluid.

Analysis of swimming characteristics using trajectory descriptors.

Upper panel: normalized acceleration, speed, distance, displacement, and area distributions structured by species are displayed, together with quantile 0.05, 0.5, and 0.95 (vertical plain lines) and mean (vertical dashed line). The descriptor distribution in the control Newtonian buffer is indicated with the dotted line. All values are normalized by the corresponding reference value as indicated in Materials and methods. T-test pairwise comparison p-values are displayed in Appendix 1—table 2. Number of trajectories are n = 517 and 123 (B.pumilus), n = 237 and 94 (B. sphaericus) and n = 279 and 144 (B. cereus) for, respectively, the biofilm and the control buffer. Lower panel: we display the distribution of the instantaneous acceleration norm respectively to the local biofilm density gradient (i.e. ||Ai(t)|| function of b(Xi(t))) and of the instantaneous velocity norm respectively to the local biofilm density (i.e. ||Vi(t)|| function of b(Xi(t))), structured by population. The point cloud of each species is approximated by a gaussian kernel and gaussian kernel isolines enclosing 5, 50% and 95% of the points centered in the densest zones are displayed to facilitate comparisons between species (see Materials and methods Plots and statistics).

All together, this data depict (1) a long-range species, B. cereus, which moves efficiently in the biofilm during long, relatively straight, rapid runs, almost identically as in a Newtonian fluid (2) a short-range species, B. sphaericus, that moves mainly locally in small areas in the biofilm and in the control buffer with lower accelerations and speeds except few exceptions (only 6% of its trajectories induced a displacement higher than 10μm compared to 28% for B. cereus and 26% for B. pumilus) and (3) a medium-range species, B. pumilus, with a large diversity of rapid trajectories, from small to large displacement, and a behavioural change from straight runs in a Newtonian fluid to frequent run-and-reverse events in the biofilm. These kinematics discrepancies for B. pumilus and B. cereus allow them however to cover identical visited areas.

Though, these global descriptors do not inform about potential adaptations of the swimmers to the biofilm matrix. We first check if swimmer velocities are directly linked to the local biofilm density, and if the swimmers adapt their trajectory according to density gradients by plotting the points (b(t,Xis(t)),Ais(t)) and (b(t,Xis(t)),Vis(t)) (Figure 5, lower panel). Clear differences between the three species can be deciphered. First, the three Bacillus do not have the same distribution of visited biofilm density and gradient. B. pumilus swimmers visit denser biofilm with higher variations than the other species while B. sphaericus and B. cereus stay in less dense and smoother areas, the quantile 0.5 of these species being circumscribed in low gradient and low density values. Next, B. cereus has a wider distribution of accelerations, specially for small-density gradients, compared to B. pumilus and B. sphaericus. This could indicate that when the biofilm is smooth, B. cereus samples its acceleration in a large distribution of possible values. Finally, we observe that the speed distribution rapidly drops for increasing biofilm densities for B. sphaericus and B. cereus, while the decrease is much smoother for B. pumilus. These observations provide additional insights in the species swimming characteristics: B. pumilus swimmers seem to be less inconvenienced by the host biofilm density than the other species, while B. cereus and B. sphaericus bacteria appear to be particularly impacted by higher densities and to favor low densities where it can efficiently move. Though, B. sphaericus has lower motile capabilities than B. cereus when the biofilm is not dense.

Analysis of swimming data with an integrative swimming model

This descriptive analysis does not allow to clearly identify potential mechanisms by which the swimmers adapt their swim to the biofilm structure or to simulate new species-dependant trajectories. We then build a swimming model based on a Langevin-like equation on the acceleration that involves several swimming behaviours modelling the swimmer adaptation to the biofilm. Furthermore, after inference, new synthetic data can be produced by predicting swimmer random walks sharing characteristics comparable to the original data.

We consider bacterial swimmers as Lagrangian particles and we model the different forces involved in the update of their velocity v. We assume that the swimmer motion can be modelled by a stochastic process with a deterministic drift (Figure 1c):

(1) dv=γ(α(b)-v)vvdtspeed selection+βbbdtdirection selection+ηdtrandom term

where the right hand side is composed of two deterministic terms in addition to a gaussian noise, each weighted by the parameters γ, β and ϵ.

The first term implements the biological observation (Figure 5, lower central panel) that the bacterial swimmers adapt their velocity to the biofilm density. This term can be interpreted as a speed selection term that pulls the instantaneous speed of the swimmer towards a prescribed target velocity α(b) that depends on the host biofilm density b. The weight γ can be interpreted as a penalization coefficient. In such a formalism, the difference between the swimmer and the prescribed speed is divided by a relaxation time τ to be homogeneous to an acceleration. Hence, γ is proportionally inverse to τ, γ1τ. As a first-order approximation of the speed drop observed in Figure 5 for increasing b, the target speed α(b) is modeled as a linear variation between v0 and v1, where v0 is the swimmer characteristic speed in the lowest density regions, where b=0, and v1 in the highest density zones where b=1:

α(b)=v0(1-b)+bv1=v0+b(v1-v0)

The second term updates the velocity direction according to the local gradient of the biofilm density b. The sign of β indicates if the swimmer is inclined to go up (negative β) or down (positive β) the host biofilm gradient, while the weight magnitude indicate the influence of this mechanism in the swimmer kinematics. We note that this term does not depend on the gradient magnitude but only on the gradient direction: this reflects the implicit assumption that the bacteria are able to sense density variations to find favorable directions, but that the biological sensors are not sensitive enough to evaluate the variation magnitudes.

The third term is a stochastic two-dimensional diffusive process that models the dispersion around the deterministic drift modelled by the two first terms. We define

ηN(0,ϵ)

The term η can also be interpreted as a model of the modelling errors, tuned by the term ϵ. Equation 1 is supplemented by an initial condition by swimmer. For vanishing v or b leading to an indetermination, the corresponding term in the equation is turned off.

Equation 1 links the observed biofilm density and the swimmer trajectories trough mechanistic swimming behaviours. The model fitting can be seen as an ANOVA-like integrative statistical analysis of the image data. It decomposes the observed acceleration variance between mechanistic processes describing different swimming traits in order to decipher their respective influence on the swimmer trajectories while integrating heterogeneous data (density maps b and trajectories kinematics).

We can define characteristic speed and acceleration V* and A* in order to set a dimensionless version of Equation 1

(2) dv=γ(v0+b(v1-v0)-v)vvdt+βbbdt+ηdt

where γ=γV*A*, v0=v0V*, v1=v1V*, β=βA*, ηN(0,ϵ) and ϵ=ϵA*2.

This dimensionless version will strongly improve the inference process and will allow an analysis of the relative contribution of the different terms in the kinematics. An extended numerical exploration of this model is performed in Appendix 2 Sec. Numerical exploration on mock biofilm images to illustrate the impact of the different parameters on the trajectories, showing in particular the interplay between γ and ϵ: counter-intuitively, straight lines are induced when the stochastic part ϵ is high compared to the speed selection parameter γ (see also Appendix 2).

Inferring swimming parameters from trajectory data

For each bacterial swimmer population, we now seek to infer with a Bayesian method population-wide model parameters governing the swimming model of a given species from microscope observations.

Inference model setting

Equation (2) is re-written as a state equation on the acceleration for the bacterial strain s and the swimmer i

(3) Ais(t)=γs(v0s+b(t,Xis(t))(v1s-v0s)-Vis(t))Vis(t)Vis(t)+βsb(t,Xis(t))b(t,Xis(t))+ηs
(4) :=fA(θs,b(t,Xis(t)),Vis(t),Xis(t))+ηs

where

θs:=(γs,v0s,v1s,βs)

are species-dependant equation parameters. The function fA can be seen as the deterministic drift of the random walk, gathering all the mechanisms included in the model. The inter-individual variability of the swimmers of a same species comes from the swimmer-dependent initial condition, the resulting biofilm matrix they encounter during their run, and the stochastic term.

Inferring the parameters θs can then be stated in a Bayesian framework as solving the non linear regression problem

(5) Ais(t)N(fA(θs|b(t,Xis(t)),Vis(t),Xis(t)),ϵs)

from the data b(t,X), Xis(t), Vis(t) and Ais(t), with truncated normal prior distributions

(6) θsN(0,1),ϵsN(0,1),

and additional constrains on the parameters

γs0,v0s0,v1s0,ϵs0.

We note that Equation (5) can be seen as a likelihood equation of the parameter θs knowing Ais(t),b(t),Vis(t) and Xis(t). The parameter ϵs can now be seen as a corrector of both modelling errors in the deterministic drift and observation errors between the observed and the true instantaneous acceleration. Alternative settings where these uncertainties sources are separated and a true state for position and acceleration is inferred can be defined (see Annex Various inference models). The inference problem is implemented in the Bayesian HMC solver Stan (Stan Development Team, 2018) using its python interface pystan (Riddell et al., 2021). Inference accuracy is thoroughly assessed on synthetic data (see Appendix 1 Assessment of the inference with synthetic data and Figure 6).

Inference assessment on synthetic data.

(a) Predicted vs true trajectories. Trajectories are recovered by sampling the parameter posterior distribution starting from the same initial condition than in the data. We represent a ground truth trajectory extracted randomly from the original dataset in red, the corresponding sampled trajectories with thin gray lines, and the trajectory obtained with the posterior means in orange. Note that in this simulation, the stochastic part is the same for all simulations, so that the only source of uncertainties comes from the inference procedure. (b) Trajectory descriptors. Trajectories are re-computed replacing the original parameters (ground truth) by the inferred parameters. The trajectory descriptors introduced in Characterizing bacterial swimming in a biofilm matrix through image descriptors are computed on the synthetic data (blue curves) and on the data obtained with the inferred parameters (orange curves). Number of trajectories are n = 72 for the ground truth and n = 100 after inference. (c) Ground truth vs fitted trajectories. The ground truth, that is the original trajectories (blue) and fitted (red) trajectories are displayed and show common characteristics. (d-e) Qqplot of fitted model output vs ground truth. After inference, the fitted model is used to re-compute the synthetic dataset. We plot the x (d) and y (e) components of the accelerations in a qqplot: the fitted model output quantiles are plotted against the quantiles of the original dataset (ground truth) with blue dots, together with the y=x line (red).

Analysis of the confocal microscopy dataset

We now solve the inference problem (5)-(6) on the confocal microscopy dataset to identify population-wide swimming model parameters in order to decompose the swimmer kinematics in three mechanisms: biofilm-related speed selection, density-induced direction changes and random walk. The inference process is assessed by comparing the descriptors obtained on trajectories predicted by the fitted model (Figure 7a) with descriptors of real trajectories (Figure 5). The mean values of acceleration and speeds are accurately predicted for the three species ( Figure 7a panels A and V, dashed lines). Relative positions of distance, displacement and visited area mean values are also correctly simulated (Figure 5 and Figure 7a, upper panel). B. sphaericus presents the lowest predicted accelerations and speeds while B. pumilus has the widest speed and acceleration distributions and B. cereus shows the highest accelerations, consistently with the data. The visited area and the distances are slightly over estimated, but the relative position and the shape of the distributions are conserved. The amount of null velocities for B. sphaericus is under estimated by the fitted model and not rendered for B. pumilus. The distance distributions of the three species are accurately predicted by the fitted model. When displaying conjointly the distance and the displacement (Figure 7a, right lower panel), the distribution of B. sphaericus is correctly predicted by the simulations, but B. cereus and B. pumilus displacements are underestimated. Some qualitative features can be recovered, such as the higher distribution of distance-distribution couples for B. cereus or higher displacement for B. cereus compared to B. sphaericus.

Inference result on the experimental images.

(a) To validate the inference process, a synthetic dataset is assembled by computing Equation 1 with the inferred parameters and the trajectory descriptors introduced in section Characterizing bacterial swimming in a biofilm matrix through image descriptors are computed and can be compared to the data descriptors in Figure 5. Acceleration, speed, distance and displacement distributions are displayed in the upper panel, with quantiles 0.05, 0.5 and 0.95 (plain lines) and mean (dashed line). The mean values observed in the image data are also displayed for comparison (black dashed line). The number of trajectories are identical than in Figure 5: n = 517 (B. pumilus), n = 237 (B. sphaericus) and n = 279 (B. cereus). Interactions between the host biofilm and, respectively, acceleration and speed distributions are displayed in the lower panel with isolines enclosing 5, 50% and 95% of the points, centered in the densest zones. (b) Inferred parameter posterior distributions after analysis of the confocal swimmer images, and posterior mean (dashed line). We used 4000 points for the computation of the Gaussian KDE.

Descriptors of swimming adaptations to the host biofilm are also correctly preserved for the main part (Figure 5 and Figure 7 a, lower panel). B. pumilus is the species that crosses the highest biofilm densities in the fitted model simulations, showing the highest speeds in this crowded areas, and that visits the most frequently areas with high density gradients, consistently with the data. As in the confocal images, the simulated B. sphaericus and B. cereus favor smoother zones of the biofilm with lower biofilm densities. The B. cereus fitted model correctly renders the highest acceleration variance observed in the data for low biofilm gradients, while B. sphaericus speed and acceleration variance is the lowest for all ranges of biofilm densities and gradients, both in the data and in the fitted model predictions. The drop of speeds and accelerations for increasing biofilm densities and gradients is well predicted for B. pumilus, but is smoother in the simulation compared to the data for B. sphaericus and B. cereus. In particular, the sharp drop of speeds for b0.25 observed in the data for B. cereus and B. sphaericus is underestimated by the fitted model.

All together, the model reproduces very accurately the mean values of acceleration, speed and visited area, renders relative positions and the main characteristics of distributions for distance, displacement and interactions with the host biofilm matrix, but produces less variable outputs than observed in the data, meaning that the model is less accurate in the distribution tails. The main features of the swimmer adaptation to the underlying biofilm are however correctly predicted by the model.

To further inform the fitted model accuracy, the coefficient of determination Rdet2 of the deterministic components fA(θs,b(t),Vis,Xis(t)) of Equation 4 is computed (Table 2), in order to quantify the goodness of fit of the friction and gradient terms of (Equation 2) that represent interactions with the biofilm. These results highlight that B. cereus bacteria do present an important stochastic part in the accelerations, while the B. pumilus species is the best represented by our deterministic modelling.

Table 2
Reference acceleration and speed, and acceleration variance decomposition between stochastic and deterministic terms.

The number N of acceleration time points is indicated for each specie. Then, reference values for acceleration Aref and speed Vref used for adimensionalization are computed by averaging the corresponding values by specie. Descriptive statistics of acceleration variance decomposition are then computed in order to illustrate the contribution of the deterministic terms in the observed acceleration distribution, and the part of the residual mechanisms that are not included in the model. We indicate for each species the acceleration variance σ(A), the part of the variance explained by the deterministic terms Rdet2 (see Materials and methods sec.Inference validation on experimental data) and the variance of the stochastic term ϵ2. We note that in order to compare species at vizualisation step, they are re-normalized with the average of the species reference values: Aref= 78.31 and Vref= 6.55.

dataNArefVrefσ(A)Rdet2[%]ϵ2
B.pumilus33,91681.087.890.8758.800.36
B.sphaericus20,15244.934.740.5848.500.30
B.cereus23,160108.927.030.6332.720.42

The three species present very different inferred parameter values (Figure 7 b and Table 3), showing that the model inference captures contrasted swimming characteristics of these Bacillus. Due to the mechanistic terms introduced in Equation 1, these differences can be interpreted in term of speed and direction adaptations to the host biofilm. First, B. pumilus shows the highest v0 value, and the highest amplitude between v0 and v1, inducing a higher ability for B. pumilus to swim fast in low density biofilm zones and strong deceleration in crowded area. In comparison, B. sphaericus presents the smallest amplitude between v0 and v1 showing a poor adaptation to biofilm density. B. cereus has the highest γ value, showing a reduced relaxation time toward the density dependant speed: in other words, B. cereus is able to adapt its swimming speed more rapidly than the other species when the biofilm density varies. B. cereus swimmers are also better able to change their swimming direction in function of the biofilm variations they encounter along their way, their β distribution being markedly higher than the other species which have very low β. Finally, the stochastic parameter ϵ is also contrasted, from a low distribution for B. sphaericus to high values for B. cereus. All together, the inference complete the observations made in Figure 5: B. pumilus poorly adapts its swimming direction to the host biofilm (low β) but has a wide range of possible speeds when the biofilm density varies (high v0, low v1), that it can reach quite rapidly (intermediary γ) with intermediary stochastic correction (ϵ). In contrast, B. cereus reaches lower speed values (intermediary v0, low v1) but is more agile to adapt its swimming to its environment by changing rapidly its speed when the biofilm density is more favorable (highest γ) and adapting its swimming direction to biofilm variations, with higher stochastic variability (large ϵ). Finally, B. sphaericus is the less flexible of the three bacteria: less fast (smallest difference between v0 and v1), they are also less responsive to biofilm variations (small γ and β) with low random perturbations (small ϵ).

Table 3
Inference outputs for the three species.

The posterior mean, standard deviation and inferred confidence interval are indicated for each parameter and each specie. Convergence diagnosis index neff and Rhat are provided.

speciesparammeanstdconfidence interval [2.5%–97.5%]neffRhat
B. pumilusγ0.773.95×10–3[0.77–0.77]4,5071
v00.148.67×10–3[0.12–0.16]3,8791
v11.69×10–31.69×10–3[5.18×10–5−6.26×10–3]4,8211
β9.84×10–35.07×10–3[1.45×10–5−2.07×10–2]5,2231
ε0.622.48×10–3[0.61–0.62]5,3071
B. sphaericusγ0.614.53×10–3[0.60–0.62]4,9651
v02.75×10–42.75×10–4[4.91×10–6−1.01×10–3]4,0191
v14.84×10–34.77×10–3[9.39×10–5−1.45×10–2]5,0011
β4.25×10–33.33×10–3[−2.18×10–3−1.15×10–2]4,6681
ε0.321.55×10–3[0.31–0.32]5,9431
B. cereusγ0.831.11×10–2[0.80–0.86]2,7001
v06.44×10–21.07×10–2[3.22×10–2−9.66×10–2]2,5101
v16.65×10–36.33×10–3[1.50×10–4−2.15×10–2]4,0611
β2.78×10–29.04×10–3[1.39×10–2−5.56×10–2]4,2301
ε0.94.17×10–3[0.89–0.92]4,8521

Finally, after inference, the impact of each term in the overall acceleration data can be quantified and analyzed by displaying its relative contribution in a ternary plot (Appendix 2—figure 6). This relative contribution can be measured thanks to the swimming model which integrates these different mechanisms in the same inference problem. The direction selection is the least influential mechanism for the three species, with a slightly higher impact for B. cereus (50% and 95% isolines slightly shifted towards A(b) in Appendix 2—figure 6a). When zooming in, the three Bacillus show differences in the balance between speed selection and the random term (Appendix 2—figure 6b): while B. pumilus is slightly more influenced by the friction term than by stochasticity, these mechanisms are perfectly balanced in B. sphaericus accelerations, while B. cereus is more influenced by the random term.

Interpretation of the bacterial swimming at the light of their morphology

Kinematics descriptors and swimming parameters can then be reinterpreted through the insights provided by the morphology of each bacteria species as shown in Figure 2. As observed in Figure 2, B. pumilus and B. sphaericus are flagellated whereas B. cereus is equipped by a unique brush-like bundle of thin flagella at its tail. This morphology can be linked to their swimming patterns. The flagella could be linked to the run-and-tumble behaviour of B. pumilus and B. sphaericus, as shown for other flagellated bacteria such as E. coli, the tumbling events of which are induced by reverse rotation of the cellular motor of its multiple flagella (Patteson et al., 2015). Additional functional characteristics may discriminate B. pumilus and B. sphaericus, since run-and-reverse swimming is the natural behaviour of B. sphaericus even in the Newtonian control buffer, whereas B. pumilus drastically reduces its speed in high-density biofilms (Figure 7, a) and starts tumbling in the host biofilm (Figure 4). B. pumilus has the highest number of flagella and is the bacteria that reaches the highest speeds specially in the Newtonian buffer and in low-density areas, indicating that this characteristic may be an advantage for swimming fast in the extracellular matrix. The kind, size and disposition of the flagella bundle may help B. cereus swimmers to adapt their runs to their environment by changing directions to follow lower density areas (higher impact of direction selection term of the three Bacillus in Appendix 2—figure 6) or to adapt rapidly when biofilm density varies (largest γ). B. cereus being the bacteria with the strongest stochastic part (highest ϵ, density shifted towards A(ϵ) in Appendix 2—figure 6), this morphology could also help the swimmer to go through the biofilm by random navigation, which helps to maintain comparable straight trajectory with or without biofilm when the stochastic part is higher than the speed selection term (Appendix 1—figure 1, Appendix 2—figure 3 and Appendix 2—figure 6). Finally, B. sphaericus bacteria are much longer than the other two species, which may explain why this species is the least motile in terms of acceleration and kinematics, both in biofilms and in the Newtonian control buffer.

Discussion

Modelling and analysis of swimming trajectories

When analyzing microbial swimming trajectories, two general strategies can be found in the literature. The first one aims at designing statistical tests quantifying similarities with or deviations from typical motion of interest such as diffusion (Patteson et al., 2015). Another strategy consists in providing a generative model of the data, analyzing it (Chepizhko and Peruani, 2013; Chepizhko et al., 2013) and comparing model outputs with real data (Koorehdavoudi et al., 2017; Jabbarzadeh et al., 2014), possibly after inference. The model that is studied in this paper belong to the second category: the model includes deterministic mechanisms describing interactions with the host biofilm, together with a random correction counterbalancing the modelling errors. The parameter inference allows to interpret the data variance relatively to speed or direction adaptations to the host biofilm versus residual effects gathered in the stochastic term. This method is comparable to ANOVA-like multivariate analysis: the parametric phenomelogical mappings between explicative co-variables and a swimming behaviour (for example the function defining speed selection from biofilm density) are gathered in the same inference problem, enabling to decompose acceleration variability between the different swimming behaviours. This integrative method allows for multi-data integration and co-analysis. Furthermore, the fitted model allows to simulate typical swimming trajectories of a given species.

Population-wide swimming characteristics vs true-state inference

In this study, we do not aim to recover ’true’ swimmer trajectories (a.e. the blue trajectory in Appendix 2—figure 4), that is identifying through smoothing techniques an approximation of the specific realization of the stochastic modeling and observation errors that lead to a given ’observed’ trajectory. Rather, the goal is to identify common characteristics shared by a population of trajectories by inferring the ‘population-wide’ parameters (the parameters α, β, v0, v1, γ, and ϵ) that best explain the whole set of observed accelerations in a same population of swimmers. For this reason, we did not introduce swimmer-specific terms nor individual noise: they would have increased the model accuracy, but to the price of a blurrier characterization of the species specificities.

This choice determined our inference framework. Despite several alternative options for recovering hidden states, in particular SSM (space state models) which are common in spatial ecology (Auger‐Méthé et al., 2021), the Bayesian method we opted for is a simpler non-linear regression problem that proved to be sufficient to recover macroscopic features of swimmer trajectories and species stratification. We discuss in Appendix 3 Various inference models the different options that were tested and present in Materials and methods Sec. Inference the method for noise model selection. Among other interesting features, the Bayesian method provides confidence intervals on the final parameter estimation, and on the resulting trajectories as in Figure 6a.

Predictive capabilities of the model

The deterministic terms of the model explain only half of the variance (Table 2). A major part of the underlying mechanisms is not correctly described by our model which is a common feature since it is a phenomenological model which only considers interactions with the underlying biofilm at a macroscopic level, without taking into account nanoscale physical mechanisms. A more detailed description of the underlying physics could have been designed as in Martinez et al., 2014, but it would have made more complex the analysis of the interactions between the host biofilm and the swimmer trajectories and the extraction of species-specific patterns. However, we note that our model correctly renders observations made through macroscopic trajectory descriptors, even though the inference process has not been made based on these observables. Furthermore, several repetitions of the same models with different samples of the stochastic terms give very similar values for the trajectory descriptors (see Appendix 2—figure 5 and section Influence of inference and stochastic terms on the trajectory descriptors), showing that these descriptors are robust to stochastic perturbations. Hence, the model (2) can be used to produce synthetic data sharing the same global characteristics than the original ones specifically taking into accounts interactions between the swimmers and the host biofilm. Furthermore, these predictions also reproduce the species stratification observed in the original data using the global descriptors.

Biological interpretation of the fitted models

The direction selection term of the equation driven by β has little impact in the swimmer model fitted on real data. However, the parameter β can have a sensible impact on the kinematics as shown in the sensitivity analysis, and on the trajectories in mock biofilms (Appendix 2—figure 1). This could indicate that direction selection based on biofilm gradients is marginally effective in real-life swimming trajectories in a biofilm matrix. On the contrary, the speed selection term is more effective for the three Bacillus, showing that these micro-swimmer are able to adapt their swimming velocity to the biofilm density faced during their run. This term acts as an inertial term which enhances the stochastic term to provide direction and velocity changes.

The model has been used to decipher different adaptation strategies to the host biofilm of the three species during their swim. It confirms that B. sphaericus are the less motile bacteria in the biofilm, with reduced speeds and adaptation capabilities as indicated by the smallest model parameter values and a stereotypic run-and-reverse behaviour inside or outside the biofilm. B. pumilus on the contrary drastically changes its swimming behaviour in the biofilm compared to the Newtonian control buffer, which is reflected in the model by a high amplitude between v0 and v1 and a high γ that indicates a rapid adaptation for varying biofilm densities. B. cereus shows the highest adaptation ability to the biofilm matrix, with the highest γ and β reflecting biofilm-induced speed and direction changes. Furthermore, the high stochastic effects (highest ∈) higher than the speed selection term tuned by γ (see Appendix 2—figure 6) allows this swimmer to conserve straight runs in the biofilm (see Appendix 2 Sec. Friction and random term in Langevin equations.) in the same way than in the control Newtonian fluid.

This characterization methodology could be used to drive species selection for improved biofilm control. Furthermore, the model can be used to predict new trajectories and the resulting biofilm vascularization, in a similar framework as in Houry et al., 2012. Coupled with a model of biocide diffusion, these simulations could be used to test numerically the efficiency of mono- or multi-species swimmer pre-treatment to improve the removal of the host biofilm.

Flagellated bacteria in polymeric solutions

Characterization of flagellated bacteria motility in polymeric solutions is a very active research area (Martinez et al., 2014; Patteson et al., 2015; Zöttl and Yeomans, 2019; Qu and Breuer, 2020; Qu et al., 2018). Speed and direction variations have been measured for various polymeric fluids with different visco-elastic properties. For the model bacteria E. coli in polymeric solutions, enhanced viscosity decreases tumbling while increased elasticity speeds up the swimmers (Patteson et al., 2015; Zöttl and Yeomans, 2019). In our experiments on the contrary, we observed decreased speeds and strong enhancement of reverse events for the flagellated B. sphaericus and B. pumilus in the biofilm compared to the Newtonian control buffer. However, the experimental set-up shows strong differences: the complex rheology of S. aureus biofilms may strongly differ from polymeric fluids even if under certain condition they can be considered as visco-elastic fluids (Gloag et al., 2020), impacting differently the swimmer behaviours. Furthermore, the physiology of the motor cell in the Gram-positive Bacillus differs from the one of the Gram-negative E. coli (Terahara et al., 2020; Szurmant and Ordal, 2004; Subramanian and Kearns, 2019). Finally, the particular brush-like flagella bundle of B. cereus may allow this species to conserve the same swimming in Newtonian and crowded environments, by adapting its swimming speed to the local density and otherwise randomly selecting swimming directions across the host biofilm. To generalize this approach to other contexts, this study should be reproduced for other swimmers and other host biofilms, together with polymeric fluids and porous media, including biochemical interactions.

Materials and methods

Infiltration of host biofilms by bacilli swimmers

Request a detailed protocol

Infiltration of S. aureus biofilms by bacilli swimmers were prepared in 96-well microplates. Submerged biofilms were grown on the surface of polystyrene 96-well microtiter plates with a μ clear base (Greiner Bio-one, France) enabling high-resolution fluorescence imaging (Bridier et al., 2010). 200 μL of an overnight S. aureus RN4220 pALC2084 expressing GFP (Malone et al., 2009) cultured in TSB (adjusted to an OD 600 nm of 0.02) were added in each well. The microtiter plate was then incubated at 30°C for 60 min to allow the bacteria to adhere to the bottom of the wells. Wells were then rinsed with TSB to eliminate non-adherent bacteria and refilled with 200 μL of sterile TSB prior incubation at 30 celsius for 24 h. In parallel, B. sphaericus 9 A12, B. pumilus 3 F3 and B. cereus 10B3 were cultivated overnight planktonically in TSB at 30 °C. Overnight cultures were diluted 10 times and labelled in red with 5 μM of SYTO 61 (Molecular probes, France). After 5 min of contact, 50 μL of labelled fluorescent swimmers suspension were added immediately on the top of the S. aureus biofilm. All microscopic observations were collected within the following 30 min to avoid interference of the dyes with bacterial motility. Three replicates were conducted. The same protocol has been repeated without the host biofilm (control experiments): the swimmers are added to the buffer only which is a Newtonian fluid.

Confocal laser scanning microscopy (CLSM)

Request a detailed protocol

The 96 well microtiter plate containing 24 hr S. aureus biofilm and recently added bacilli swimmers were mounted on the motorized stage of a Leica SP8 AOBS inverter confocal laser scanning microscope (CLSM, LEICA Microsystems, Germany) at the MIMA2 platform (https://www6.jouy.inra.fr/mima2_eng/). Temperature was maintained at 30 celsius during all experiments. 2D+T acquisitions were performed with the following parameters: images of 147.62 × 147.62 μm were acquired at 8000 Hz using a 63×/1.2 N.A. To detect GFP, an argon laser at 488 nm set at 10% of the maximal intensity was used, and the emitted fluorescence was collected in the range 495–550 nm using hybrid detectors (HyD LEICA Microsystems, Germany). To detect the red fluorescence of SYTO61, a 633 nm helium-neon laser set at 25% and 2% of the maximal intensity was used, and fluorescence was collected in the range 650–750 nm using hybrid detectors. Images were collected during 30 s (see Table 1 for sampling period).

Bacterial swimmers navigate within a three-dimensional biofilm matrix and confocal microscope refreshment time is not small enough to allow 3D+T images. To limit 3D trajectories, a focal plane near the well edge has been selected, where the well wall physically constrains the swimmer trajectories in one direction, which select longer trajectories in the 2D plane that can be tracked in time. Therefore, experimental data are composed of two-dimensional trajectories captured between the swimmer arrival and departure times in the focal plane, and the associated 2D+T biofilm density images that change over time due to swimmer action.

To check that the host biofilm structure is identical near the well’s edge compared to other 2D slices, we took 4 replicates of S. aureus biofilms that were imaged in 3D using a stack of 6 horizontal images, starting from z=0 near the well’s edge, to z=6Δz, at the interface between the biofilm and the bulk solution. To study the between and within biofilm density variability in the horizontal images, we subsampled them with a regular Cartesian 4 × 4 grid, resulting in a 4 × 6 x(4 × 4)=384 2D images database supplemented by metadata (stack, z and x-y coordinate of the subsample), before computing a clustered pairwise correlation similarity matrix and a permanova.

Transmitted electron microscopy

Request a detailed protocol

Materials were directly adsorbed onto a carbon film membrane on a 300-mesh copper grid, stained with 1% uranyl acetate, dissolved in distilled water, and dried at room temperature. Grids were examined with Hitachi HT7700 electron microscope operated at 80 kV (Elexience – France), and images were acquired with a charge-coupled device camera (AMT).

Post-processing of image data

Request a detailed protocol

See Figure 1 for a sketch of the datastream from microscope raw images to model inputs and Appendix 1—figure 1 for data visualization at each step of the post-processing pipeline.

Swimmer tracking has been applied on the red channel of the raw temporal stacks with IMARIS software (Oxford Instruments) using the tracking function after automated spots detection to get position time-series for each swimmer. Time-series with less than 8 time steps were filtered out.

Then, swimmer speed and acceleration time-series were computed from their position by finite-difference approximations and trajectory descriptors were extracted. The RGB green channel corresponding to the biofilm density temporal images were converted into grayscale and rescalled between 0 and 1 (linear scalling).

Trajectory descriptors are defined as follows. The mean acceleration and speed values, distance and displacement are computed with Ais=1Tis-2tAis(t), Vis=1Tis-1tVis(t), distis=ΔtT0,isTend,is-ΔtVis(t) and dispis=X(Tend,is)-X(T0,is). To compute the visited area, each trajectory piece was subsampled by computing Xis(tk)=knsXis(t)+(1-kns)Xis(t+Δt) for k=0,ns, with ns=10 and the pixels included in the ball B(Xis(tk),r) with radius r=2 were labeled. The total area of the labelled pixels is defined as the visited area of the swimmer i of species s.

To assess run-and-tumble behaviour, the angle θis(t) and the mean velocity V¯is(t) between two consecutive speed vectors are defined with θis(t) = arccos((Vis(t)Vis(tΔt))/(Vis(t)Vis(tΔt))) and V¯is(t)=(Vis(t)+Vis(tΔt))/2, for t(T0,is+Δt,Tend,is).

Post-processed data are available at https://forgemia.inra.fr/bioswimmers/swim-infer/SwimmerData.

Computation of the forward swimming model

Request a detailed protocol

Time integration of equations (2) has been solved with an explicit Euler scheme regarding positions xi,ts and velocities vi,ts of the swimmer i of species s at time t:

(7) xi,t+1s=xi,ts+vi,tsdt
(8) vi,t+1s=vi,ts+dvi,ts

where dvi,ts is given by Equation 2, and depends on θs, Vi,ts, xi,ts, b(t,xi,ts) and b(t,xi,ts). In practice, the biofilm density and gradient maps b and b are discretized with a Cartesian grid corresponding to the image pixels.

During random walks, swimmer may exit the biofilm domain. When the swimmer reaches the domain boundary, a new swimmer is introduced with a velocity oriented towards the interior of the domain while the original trajectory is stopped at the boundary.

Sensitivity analysis

Request a detailed protocol

A local sensitivity analysis (Figure 1) is performed by comparing basal simulation obtained with γ=β=ϵ=1 (v0 and v1 where taken as in Appendix 1—table 3) with 3 simulations where γ, β and ϵ are alternatively set to 0, resulting in 3 alternative models where the speed or the direction selection or the random term is turned off. The interaction between the speed selection term (set by γ) and the random term is illustrated in Appendix 2—figure 3 where 5 repetitions of the same trajectory of a simplified Langevin equation (11) are displayed with or without friction (γ=1 or γ=0), but with the same random seed for the stochastic term so that the stochastic part is strictly identical.

To analyze the impacts of the non-dimensionalized swimming parameters γ, v0, v1, β, ϵ on the locomotion behaviour, a global sensitivity analysis has been performed. The parameter space [0,1]5 was uniformly sampled with n = 1000 points using the Fourier Amplitude Sensitivity Test (FAST) sampler of the SALib library that is the function SALib.sample.fast_sampler.sample (Cukier et al., 1973; Saltelli et al., 1999). We note that the interval [0,1] covers a large parameter domain for some parameters, in particular β which remains small after inference. For this parameter, the sensitivity analysis will show potential impact on the output, that may be ineffective in the parameter range of the inferred model.

For each point in the parameter space, a forward simulation is conducted on a population of swimmers on a representative biofilm extracted from the dataset (first batch of the B. pumilus dataset). Trajectory descriptors are then extracted and taken as observable of the sensitivity anaylsis that requires both the parameters sampling and the associated descriptors. Sobol indices of first order are then returned and pairwise partial correlations matrix has been calculated. Convergence of the Sobol indices has been checked by taking sub-samples containing less than 1,000 points.

Inference

Numerical implementation

Request a detailed protocol

The inverse problem (4)-(6) has been implemented using a Hamiltonian Monte Carlo (HMC) method to solve this Bayesian inference problem.

The three replicates for each swimmer species are pooled (trajectories and biofilm density maps) and the input data required for the inference procedure (velocity yV and acceleration yA times series for the whole batch of swimmers, biofilm densities yb and gradient yGb extracted at swimmer positions) were assembled in a customed data structured. Normal standard prior distributions were set for all swimming parameters θ=(γ,v0,v1,β,ϵ). Additional positivity constrained were imposed for all parameters but β. Therefore, the implemented model can be summarized as:

θN(0,1),γ0,v00,v10,ϵ0
yAN(fA(γ,v0,v1,β|yb,yV,yb,yGb,dt),ϵ)

A warmup of 1000 runs is followed by the Markov chains construction (4,000 iterations for 4 Markov chains). Markov chain convergence is assessed by direct visualization (Appendix 1—figure 4) by checking for biaised covariance structures in pair-plots (Appendix 1—figure 5). Standard convergence index were additionnaly computed: effective sample size per iteration (neff) and potential scale reduction factor (Rhat).

Noise model selection

Request a detailed protocol

Different noise models have been evaluated for the regression model (5) to take into account batch or individual effects. Namely, we decomposed the noise in Equation 5 by replacing ηs by ηsi and/or ηs,b for individual i and experimental batch b. Model selection has been conducted by computing the WAIC for the different noise models. A huge degradation of the WAIC has been observed for individual or batch dependant noises, indicating that the enhancement of the inference accuracy provided by the additional parameters can be considered as over-fitting and discarded.

Inference validation on synthetic data

Ground truth data construction

Request a detailed protocol

Ground truth synthetic data (see section Assessment of the inference with synthetic data) were computed by solving Equations 2; 8 with γ=10s-1, v0=5μm.s-1, v1=1μm.s-1, β=10μm.s-2, ϵ=40μm.s-2 and biofilm maps taken from the first batch of the B. pumilus dataset. The number of swimmers was fixed to N=50 and the number of time steps was taken identical to the experimental data that is Nt=224. Resulting mean speeds and accelerations were Aref=68.29μm.s-2, Vref=7.47μm.s-1 and were used to rescale the data before inference together with the ground truth parameters (Appendix 1—table 3). In total, the acceleration dataset contains 9,523 samples for each spatial direction.

Comparing ground truth data with the fitted model

Request a detailed protocol

After inference, a new dataset is obtained by solving Equation 8 with the fitted parameters. The same initial conditions for speeds and positions as the ground truth data are taken. Trajectories are stopped after the same number of time step as in the corresponding trajectory of the ground truth dataset. To discard spurious stochastic uncertainties, the same random seed as the ground truth simulations was taken, so that the unique uncertainty source was inference errors.

Checking the sensitivity to biofilm image noise

Request a detailed protocol

To produce Appendix 1—figure 6, the biofilm density and the biofilm density gradient maps have been noised with an additive gaussian noise with increasing variance, before inference: we set

ϵbN(0,lσb)andϵbN(0,2lΔxσb)

where σb is the variance observed in the original data, and ϵb and ϵb are respectively the noise applied to the biofilm density and the biofilm density gradient. The parameter l[0,0.01,0.02,0.03,0.04,0.05] is increased to apply a noise from 0% to 5%.

Inference validation on experimental data

Comparing microscopy data with the fitted model

Request a detailed protocol

The same procedure is repeated on the microscopy data: after inference, a new dataset is obtained by solving Equation 8 with the fitted parameter, taking the same initial conditions for speeds and positions. Trajectories are stopped after the same number of time step as in the corresponding trajectory of the ground truth experimental dataset.

Measuring the deterministic reconstruction

Request a detailed protocol

The deterministic coefficient of determination Rdet2 was computed to measure how much the dataset is explained by the deterministic part of the model. Setting Ais,det=fA(γ,v0,v1,β|yb,yV,yb,yGb,dt):

Rdet2,s=1-i(yAis-Ais,det)2i(yAis-yAs¯)2

where yAs¯ is the acceleration mean. Rdet2,s is expected to tend towards 1 when the stochastic term η=N(0,ϵ) becomes negligible with respect to Adet.

Plots and statistics

Request a detailed protocol

To allow inter-species comparisons in plots, the data and model outputs are re-normalized with common reference values Aref and Vref defined as the average of the species reference values (see Table 2 for values). Uni-dimensional distributions (Figure 5 upper panel, Figure 6b upper panel, Figure 7a, upper panel, and Figure 7b) were obtained with the gaussian_kde function of scipy.stats. T tests for mean comparison were performed using scipy.stats ttest_ind.

Two-dimensional distribution plots (Figures 5 and 6 b, Figure 7a lower panels) were obtained by first plotting the two-dimensional point cloud and approximating the point distribution with a gaussian KDE using scipy.stats gaussian_kde function. Then, the gaussian kde is evaluated at each point of the point cloud and quantiles 0.05, 0.5, and 0.95 of the resulting values are computed. Finally, quantile isovalues are plotted and the point cloud and the KDE are removed (see Appendix 4—figure 1 and Sec. KDE computation for details): this procedure ensures to enclose 5, 50% and 95% of the original points, centered in the densest zones of the initial point cloud.

Ternary plots (Appendix 2—figure 6) were obtained by first computing the contribution of each term of equation (4) to acceleration estimate. Namely, note

s(b)is=βsb(t,Xis(t))b(t,Xis(t)), and s(η)is=ηs.
s(Δb)is=βsΔb(t,Xis(t))Δb(t,Xis(t)),ands(η)is=ηs

We compute the proportions A(k)is for k{b,b,η},

A(k)is=s(k)iss(b)is+s(b)is+s(η)is.

Points (A(b)is,A(b)is,A(η)is) are then plotted in ternary plots using the Ternary python package (Weinstein et al., 2019) and approximated by gaussian KDE. Isolines are finally plotted as previously described.

To construct the plot in Appendix 1—figure 2, pairwise correlation of the biofilm density in the 384 samples has been computed (scikit-learn pairwise_distances, ‘correlation’ metric parameter Pedregosa et al., 2011), and the resulting similarity matrix has been displayed using Seaborn package clustermap function (Waskom, 2021) after hierarchical clustering (scipy.cluster.hierarchy linkage function Virtanen et al., 2020). Additional permanova has been computed to assess the significance of between-group dissimilarities using stats.distance package permanova function (scikit-bio development team, 2020 ).

Code availability

Request a detailed protocol

All the image pre- and post-processing, calculations and statistics have been performed with custom scripts using the standard python libraries numpy (Harris et al., 2020), scipy (Virtanen et al., 2020), imageio (Klein, 2021), and pandas (McKinney, 2010). The forward swimming problem computation is computed using customed scripts built upon numpy (Harris et al., 2020) and H5py (https://www.h5py.org). Sensitivity analysis has been conducted with the SALib library (Cukier et al., 1973; Saltelli et al., 1999) (Sobol index, function SALib.analyze.fast.analyze) and the pingouin library (Vallat, 2018) (PCC, pcorr method). The Bayesian inference has been conducted using the STAN library (Stan Development Team, 2018) through its python interface pystan (Riddell et al., 2021). All plots have been made with the matplotlib python library (Hunter, 2007).

The whole python code have been made available and accessible at the following git repository https://forgemia.inra.fr/bioswimmers/swim-infer.

Appendix 1

Illustration of the datastream

Data acquisition

Illustrations of the image data at different steps of the data stream are displayed in Appendix 1—figure 1, from raw microscopy data to rescalled biofilm density map with trajectories. The contrast of the original 2 chanel image has been enhanced for visualization. The RGB biofilm density temporal images (see Materials and methods) were converted into grayscale and rescalled between 0 and 1 (linear scalling). In this images, for illustrations, trajectories are mapped into the biofilm density map and rescaled density map at initial condition of the first B. pumilus batch. In the dataset, the trajectories are associated with the corresponding biofilm map: Xis(t) is associated with the value b(t,Xis(t)) for swimmer i of species s at time t. As the biofilm density map is also a time-series, the trajectories can hardly be represented on the underlying biofilm that also changes in time.

Appendix 1—figure 1
Illustration of image data along the post-processing process.

Raw data (2 chanel images) are first displayed. Then, trajectory tracking are obtained. Finally, the biofilm density map is rescalled, and mapped to grayscale. Images dimensions are 147x147μm.

Assessing the 3D structure of the biofilm

We check that the selection of a 2D focal plan does not induce an additional bias by over-selecting biofilm areas with specific structures near the well’s edge. To do so, we assembled an additional dataset of 4 replicates of S. aureus 3D images (see Materials and methods, section 4.2, and Appendix 1—figure 2.A for the dataset assembly) of horizontal image subsamples, and computed their within and between dissimilarities (see Materials and methods), section Plots and statistics. The resulting pairwise correlation matrix is displayed in Appendix 1—figure 2 after hierarchical clustering. It shows that the z direction does not structure the information, since the images are not clustered according to their z coordinates contrary to the stack or the x-y coordinate labels. Permanova analysis shows that the differences between stacks and x-y subsamples are significant (p-value=1e-4) but not between horizontal images (p-value=1).

Appendix 1—figure 2
Assessment that the biofilm structure does not strongly vary in the z direction.

(A) Subsampling procedure. We illustrate the subsampling procedure in one of the 4 replicates. The 2D images constituting the 3D stack are sampled with a 4 × 4 cartesian grid. We can also visually observe that the biofilm variation between horizontal images are weak. Images dimensions are 147x147μm. (B) Pairwise correlation matrix. The correlation dissimilarity between sample pairs is displayed (black = 0 value, indicating identical samples, to light orange >1, indicating dissimilar samples) after hierarchical clustering. We indicate the stack,z and x-y label in the 3 first columns with a color code. We can observe that the samples are not gathered by,z but rather by stacks and x-y groups, indicating that images with identical x-y labels are clustered together, showing that they are more similar to samples with the same x-y coordinates in other z slices, than other samples in the same z slice with other x-y coordinates.

Illustration of pore formation

As strongly documented in Houry et al., 2012, swimmers can dig pores in a exogenous biofilm, which enhance the biofilm innervation and facilitate the penetration of macromolecules. To illustrate the pore formation, we show two successive images taken from a 2D temporal stack of B. sphaericus swimmers in a S. aureus host biofilm in Figure 3. In the dashed ellipse, we can see a swimmer that has moved in the two successive images, letting behind it an empty space free from host bacteria.

Appendix 1—figure 3
Illustration of pore formation.

Extractions of two successive images of B. sphaericus swimming in a S. aureus biofilm are displayed. The dashed ellipse indicates a zone where a swimmer moves between the two successive images, which creates a pore along its swimming path. Images dimensions are 76 x 78 μm.

Statistical tests

T-tests were performed to compare mean differences between 1D distribution of Figure 5. Resulting p-values are displayed in Appendix 1—table 2.

Appendix 1—table 1
P-values of pairwise comparison between distributions in biofilms.

Pairwise comparison were performed between 1D distributions displayed in Figure 5 using T-test and p-values are displayed. Non-significant values are indicated in bold.

AVdistdispArea
B. cereusB. sphaericusB. cereusB. sphaericusB. cereusB. sphaericusB. cereusB. sphaericusB. cereusB. sphaericus
B.pumilus5.e-91.e-104.e-21.e-131.e-52.e-38.e-37.e-107.e-14.e-14
B.cereus5.e-512.e-133.e-15.e-151.e-12
Appendix 1—table 2
P-values of pairwise comparison between distributions in the control Newtonian buffer.

Pairwise comparison were performed between 1D distributions displayed in Figure 5 using T-test and p-values are displayed. Non-significant values are indicated in bold.

AVdistdispArea
B. cereusB. sphaericusB. cereusB. sphaericusB. cereusB. sphaericusB. cereusB. sphaericusB. cereusB. sphaericus
B.pumilus3.e-119.e-49.e-32.e-21.e-102.e-22.e-211.e-144.e-26.e-1
B.cereus9.e-12.e-72.e-128.e-11.e-1

Assessment of the inference with synthetic data

Appendix 1—table 3
Inference results on synthetic data.

The normalized ground-truth parameter values (i.e. ground truth parameter rescaled with Aref and Vref) are compared with the inference outputs on synthetic data: posterior distribution mean and standard deviation are indicated, together with the inferred confidence intervals for the true parameters. Convergence diagnosis indices are also given, with neff the effective sample size per iteration and Rhat the potential scale reduction factors, indicating that convergence occurred for all parameters.

parameterground truthmeanstdconfidence interval [2.5%–97.5%]neffRhat
γ1.0941.081×10-2[1.06-1.1]3,5691.0
v00.6690.661×10-2[0.64-0.68]3,7101.0
v10.1340.132×10-2[0.09-0.17]3,4311.0
β0.1460.166, 2×10-3[0.15-0.17]5,0501.0
ϵ0.5860.593×10-3[0.58-0.59]4,9061.0

To assess the inference method, synthetic data are built and will be used as reference for assessment. We arbitrarily fix a parameter vector and solve system (1) from random initial positions, in a host biofilm arbitrarily chosen in the image dataset. We then extract the swimmer positions at given time-steps and recover accelerations and speeds with the same post-processing pipeline as for microscopy images and solve the inverse problem (5)-(6). If the inference process correctly works, we expect to recover the original parameters (the ground truth).

The ground truth parameters are correctly recovered by the inference procedure (Appendix 1—table 3), indicating that the parameters are correctly identifiable and that the inverse problem is well-posed. An error of respectively 1.28, 1.34, 2.98% and 0.68% on the parameters γ, v0, v1 and ϵ is observed in this controlled situation, β being inferred with lower accuracy (9.59 %). This estimate is robust to noise on the biofilm data, with highest impact on β (Appendix 1—figure 6). To assess the impact of parameter inference uncertainties on trajectory computation, the posterior parameter distribution is sampled and new trajectories are computed, replacing the ground-truth parameters by the sampled ones. The swimmer ground truth trajectories are accurately recovered: the sampled trajectories tightly frame the original swimmer path as illustrated on a randomly chosen trajectory (Figure 6a). We note that an identical random seed has been taken for these simulations, including the ground truth trajectory, in order to turn off the stochastic uncertainties and only focus on the propagation of inference errors during simulations of swimmer trajectories.

Finally, we re-assemble a synthetic dataset by replacing the ground-truth parameters by the inferred ones, that is the posterior mean. Qqplot of the fitted model accelerations versus the ground truth accelerations give an excellent accuracy (Figure 6d–e), with all the points lying on the bisector, except slight divergences on the distribution tails. The fitted model trajectories visually reproduce the qualitative characteristics of the original dataset (Figure 6c). The trajectory descriptors of section Characterizing bacterial swimming in a biofilm matrix through image descriptors are then computed on both datasets (ground truth and inferred) and compared (Figure 6b). The kinematics descriptors, that is acceleration and speed distributions, are very accurately recovered with a relative error of 0.1%, 3.2%, 5% for respectively the mean, quantiles 0.05 and 0.95 of the acceleration (resp. 0.9%, 2.5%,2% for speed). Some small discrepancies can be observed on the distance and displacement distributions, even if the mean and the quantiles 0.05 and 0.95 are close. The interactions between the host biofilm and the acceleration and speed distribution are also recovered with high accuracy. We note that part of the observed discrepancies comes from an additional source of variability of the simulation framework: when a swimmer reaches a domain boundary during a simulation, its trajectory is stopped and a new swimmer is randomly introduced elsewhere in the biofilm (see Materials and methods for more details). This simulation strategy seems to be responsible of the over-representation of short trajectories in the inferred dataset, compared to the ground truth (Figure 6b upper panel, distance and displacement distributions).

Markov chains convergence and correlation

Markov chain (Appendix 1—figure 4) and markov chain pairplots (Appendix 1—figure 5) are displayed. Direct visualization of the posterior sampling allows to detect convergence failure (strong autocorrelation or stationnary markov chain). Markov chain pairplot informs on potential correlation between different parameters posterior samples, showing an interaction between parameter and an identification issue. In Appendix 1—figure 4, the markov chains correctly converged for all the parameter. No strong correlation can be observed in Appendix 1—figure 5.

Appendix 1—figure 4
Inference convergence validation.

The markov chain (upper panel) and the posterior distribution (lower panel) of each parameter is displayed, showing good convergence of the stochastic sampling of the posteriors.

Appendix 1—figure 5
Pairplot of parameter markov chains.

No strong covariance effect can be observed, showing that the model can not be reduced by analytical dependence between parameters. Slight correlation is observed between the parameters v0, v1 and γ: this feature is not surprising since γ, v0 and v1 are in the same term of equation (2). The correlation is however too low to expect a model reduction.

Impact of noise on biofilm data

The impact of noise on the parameter inference is assessed by noising the biofilm density and the biofilm density gradients with an additive gaussian noise with increasing variance (Appendix 1—figure 6). The noise variance is scaled with the variance observed in the original data. Namely, we set

(9) ϵbN(0,lσb)

and

(10) ϵbN(0,2lΔxσb)

where σb is the variance observed in the original data, ϵb and ϵb are respectively the noise applied to the biofilm density and the biofilm density gradient and Δx is a pixel width. The parameter l[0,0.01,0.02,0.03,0.04,0.05] is increased to apply a noise from 0% to 5%.

We can observe that the estimate of only two parameters is impacted by noising the biofilm inputs: the estimate of β and v1. The parameter β is also the parameter which is the less accurately inferred when no noise is added (5%). Its estimation relative error increases up to 35% when 5% noise is added. The parameter β tunes the direction selection, which is the less effective process in the swimmer model. The other parameters are recovered with correct accuracy (kept under 18% for v1, and under 6% otherwise).

Appendix 1—figure 6
Impact of noise level on parameter inference.

We plot the relative error of the estimate of the different equation parameters for increasing noise applied on the biofilm density and the biofilm density gradients input data.

Appendix 2

Numerical exploration

To illustrate the impact of each parameter on the interplay between the host biofilm and the swimmers trajectories, the model (2) was first computed on two mock biofilms. The first one is a square linear density gradient and the second is composed of large pores on a textured background mimicking the dense biofilm zones (Appendix 2—figure 1a). A basal simulation is computed with γ=β=ϵ=1 and will be used later on as reference for comparisons. These three parameters are alternatively set to zero to assess the resulting trajectories when the speed selection, the direction selection or the random term is shut down. Suppressing speed selection results in rectilinear trajectories (γ=0, Appendix 2—figure 1c), which is rather counter-intuitive since the remaining terms are designed to tune the direction. A discussion of this phenomenon is provided in Appendix 2 Influence of inference and stochastic terms on the trajectory descriptors. When suppressing direction selection (β=0, Appendix 2—figure 1d), the trajectories are no longer drifted downwards the gradient in the upper panel as in the basal simulation, and no longer follow the pores (lower panel). If the stochastic term is shut down (ϵ=0, Appendix 2—figure 1e), the trajectories directly go down the gradients and are trapped in the center of the image in the upper panel. When a pore is found along the run, the swimmer keeps following it without being able to escape the pore any longer unlike the basal situation (lower panel).

The link between the model parameters and the global trajectory descriptors introduced in Section Characterizing bacterial swimming in a biofilm matrix through image descriptors is less intuitive. A global sensitivity analysis of the trajectory descriptors (mean acceleration and speed, distance, displacement and visited areas) with respect to the parameters γ, v0, v1, β and ϵ is conducted in Model sensitivity analysis by computing their first order Sobol index (SI) and their pairwise correlation coefficient (PCC). The sensitivity analysis shows that the mean speed is mainly influenced by γ and ϵ with slightly negative and positive impact respectively, while acceleration is rather influenced by β and ϵ with positive impact. The link between the parameters and the other descriptors is more complex, including non linear effects (strong SI and small PCC) and parameter interactions (higher SI residuals, see Appendix 2—figure 2).

Appendix 2—figure 1
To illustrate the influence of each term of Equation 2, they are alternatively turned off (c to e), and swimmer trajectories are computed on mock biofilms.

(a) displaying marked density gradients (upper pannel) or marked pores (lower pannel). Trajectories can be compared to a basal simulation (b) when all the terms have the same intensity (1).

Model sensitivity analysis

The link between the model parameters and the global trajectory descriptors introduced in Section Characterizing bacterial swimming in a biofilm matrix through image descriptors is not intuitive. A global sensitivity analysis of the trajectory descriptors (mean acceleration and speed, distance, displacement and visited areas) with respect to the parameters γ, v0, v1, β and ϵ is conducted by computing their first order Sobol index (SI) and their pairwise correlation coefficient (PCC).

Appendix 2—figure 2
Sensitivity analysis of state observable respectively to state-equation parameters.

The sensitivity of different state observables to parameter shifts is systematically studied through sensitivity analysis methods. (a) Sobol indices are displayed for each output through barplots indicating the part of variance explained by a given parameter. The bars do not reach the value 1, indicating a residual variance reflecting interactions between parameters. (b) Pairwise correlation coefficient (PCC) of the observable respectively to the input parameters are displayed. A negative PCC indicates a negative impact on the output, and conversely. We note that the red dot indicating the PCC of β for V is confounded with the purple one indicating the PCC of ϵ.

The residual variance is small for the median speed and acceleration but slightly larger for the distance, displacement and visited area indicating larger effects of parameter interactions for these outputs, that is output variations induced by joint shifts of the parameters (Appendix 2—figure 2). The SI of the parameters v0 and v1 are negligible, except for the displacement and the visited area. The parameters γ, β and ϵ, that is the three weights associated to each component of the state Equation 2, are more influential. Distance and speed have several main drivers. The distance is impacted nearly equally by γ, β and ϵ and the PCC of these parameters is quite small, indicating that these parameters may induce indistinctly negative or positive variations of the travelled distance, except for ϵ which is slightly negatively correlated. The median speed is mainly impacted by ϵ (slightly positively) and γ (slightly negatively), with relatively small PCC (Appendix 2—figure 2). The mean acceleration, the displacement and the visited area are preponderantly impacted by a main driver: the mean acceleration and the visited area are particularly impacted by ϵ, the stochastic term weight, with positive influence. The displacement is mainly influenced by γ with no preponderant variation direction (null PCC, Appendix 2—figure 2).

Friction and random term in Langevin equations

To illustrate the interplay between the friction and the random term during a random walks, we solve the problem

(11) dv=-γvdtfriction+ηdtrandom term
(12) v(0)=(0,0)
(13) X(0)=(0,0)

in an unconstrained domain, with η a 2 dimensional white noise with unitary variance. The friction parameter γ is alternatively set to 1 (Appendix 2—figure 3, upper panel) or 0 (Appendix 2—figure 3, lower panel). We note that the random seed is the same for the simulations with or without the friction term, so that the stochastic contribution is completely identical in the upper and lower panels. The trajectories produced without the friction term are much more regular and rectilinear that those produced with the friction term, that are much chaotic.

The reason of that behaviour may come from the null mean of the white noise. Roughly speaking, in average, the acceleration shows small variations around zero which leads after temporal integration to regular speeds and rectilinear-like trajectories. By contrast, the friction term reduces the particle inertia, enhancing the impact of the stochastic term, which produces much more chaotic trajectories.

Appendix 2—figure 3
Illustration of the interplay between friction and stochastic terms in Langevin equation.

Trajectories produced by different repetitions of Equation 11 are displayed with γ=1 (upper panel) and γ=0 (lower pannel). We note that the same random seed has been taken for the simulations of the same column with or without the friction term, meaning that the stochastic term is strictly identical in both simulations.

Impact of the stochastic term

We illustrate the impact of the random walk term on the overall swimmer trajectory with Appendix 2—figure 4. In this figure, we display two trajectories computed from model (2) with identical parameters (α, β, v0, v1, γ and ϵ), initial condition, host biofilm and time length. Different random samplings of the stochastic term of Equation (2) lead to these very different trajectories. This example illustrate the difference between identifying population-wide characteristics and inferring true trajectories: while the later try to detect the differences between the two trajectories (i.e. in this example, identifying and smoothing the different stochastic samples leading to these trajectories), the former focuses on the common features between these apparently different trajectories.

Appendix 2—figure 4
Influence of the stochastic process on swimmer trajectories.

We plot two different trajectories computed with the model (2), including the same parameters α, β, v0, v1, γ and ϵ, the same initial condition and identical host biofilm. The only uncertainty source comes from the different random samplings of the stochastic term. In this simulation, the ground truth (with default random seed) is plotted in blue.

Influence of inference and stochastic terms on the trajectory descriptors

We wonder if the uncertainty sources involved in the inference process and in the stochastic term of the random walk have a decisive impact on the trajectory descriptors. To address this question, a first dataset is assembled by integrating in time Equation 2 for given parameters (see Appendix 1—table 3), initial conditions and host biofilm. Then, this dataset is used as inputs of the inference method to infer the initial parameters (ground truth). Another dataset is produced by replacing the initial parameters by the inferred parameters. We note that we take the same seed for the random number generator than for the initial dataset, so that the only uncertainty that has been introduced until this step comes from the inference procedure. Finally, we produce a last dataset by solving the model with the same inferred parameters as in the second dataset, but changing the seed of the random number generator. Hence, this last dataset involves uncertainties coming from the stochastic terms and from the inference process. This variation results in modifying the sampling of the stochastic terms and leads to strong modifications of the trajectories, like in Appendix 2—figure 4.

At end, the trajectory descriptors are computed and plotted in Appendix 2—figure 5. We can see that the trajectory descriptor distributions are very similar across the different dataset, except for the total distance and the displacement where discrepancies can be noted. However, these differences are relatively small compared to the mean and the width of the distributions. We can also observe that the interactions with the underlying biofilm is very well conserved, even when the sampling of the stochastic term is very different. This observation grounds the initial guess that these trajectory descriptors captures common global features of the different trajectories rather than specificities of given trajectories.

Appendix 2—figure 5
Low influence of the stochastic term on the trajectory descriptors.

To assess the influence of the random term on the population-wide trajectory descriptors and overall prediction accuracy, we repeated the experiment displayed in Figure 6a. A synthetic database was first assembled (ground truth) and prediction were performed with a fitted model (After inference). Then, a second repetition of the prediction of the fitted model was computed with another seed for the random number generator, resulting in modifying the sampling of the stochastic terms and strong modifications of individual trajectories, like in Figure 4. The population-wide trajectory descriptors are however slightly impacted by this random effect, indicating that the main characteristics of the trajectory populations marginally depend on the stochastic term.

Relative impact of the different swimming processes on the species swim

The ternary plot presented in Appendix 2—figure 6 shows the balance between the different swimming processes. The contribution of each term of equation (4) to acceleration estimate was first computed. Namely, the relative value of the speed selection term s(b)is, the direction selection term s(b)is and the stochastic term s(η)is where

s(b)is=γ(v0s+b(t,Xis(t))(v1s-v0s)-Vis(t))Vis(t)Vis(t),
s(b)is=βsb(t,Xis(t))b(t,Xis(t)), and s(η)is=ηs.

The proportions A(k)is of each process was computed with

A(k)is=s(k)iss(b)is+s(b)is+s(η)is.

for k{b,b,η}. As the contribution of the direction selection was limited compared to the other processes, we zoomed in the plot near the edge s(b)is=0 to allow inter-species comparisons.

Appendix 2—figure 6
Respective influence of stochastic effects, speed or direction adaptation to the host biofilm.

We plot in a ternary plot the respective influence of the speed selection (V), the direction selection (D) and the random term (ϵ) of Equation 1 in the acceleration distribution of each species. Each squared instantaneous acceleration is mapped in the ternary plot coordinates through the relative contribution of V2, D2 and ϵ2, and this point cloud is approximated in the ternary plot coordinates with a gaussian kernel to display the point distributions. The 0.05, 0.5 and 0.95 quantile isovalues of these distributions are plotted. (a) The entire ternary plot is displayed. The dashed line represents the zoom box represented in Fig. (b), where the same isolines are displayed, but with a zoom in in the y direction to highlight differences between species. The number of trajectories are identical than in Figure 5: n = 517 (B. pumilus), n = 237 (B. sphaericus) and n = 279 (B. cereus).

Appendix 3

Various inference models

Different inference models were designed and tested from the dimensionless state equation (2).

SSM model

The inference model can be stated as a space-state model (SSM) which is a framework commonly used in spatial ecology to infer a true state, that is true positions and trajectories, and population-wide random walk parameters from time-serie data (Auger‐Méthé et al., 2021). The SSM inference model is a generalization of Hidden Markov Models (HMM).

Note zis(t) the true (hidden) position of the individual i of the species s at time t. The state model on acceleration (4) can be rewritten as

(14) dvis(t)dt=γ(v0s+b(t,zis(t))(v1s-v0s)-νis(t))vis(t)vis(t)+βsb(t,zis(t))b(t,zis(t))+ηmods
(15) zis(t)dt=vis(t)

In this equation, vis is the true hidden swimmer velocity. Starting from observed initial conditions zis(0), vis(0), Equation 16 can be integrated in time to recover hidden zis(t), vis(t) for all times t.

Then, a likelihood equation can be written to compare the true hidden state to the observations.

(16) Xis(t)zis(t)+ηobss

We note a link between ηmod and ηobs in Equations 15 and 16 and the random state η in Equation 4. Namely, noting σmod and σobs the standard deviation of the gaussian noises ηmod and ηobs, direct finite-difference of Ais(t) from the true state gives an estimate of the noise variance on the acceleration of the non-linear regression model

ϵ=(σmodΔt)2+(6σobsΔt2)2.

Compared to problem (5), the main advantages are that the likelihood is written on the original data, that is the observed position, and not a post-processed observed acceleration, subject to finite-difference errors. Furthermore, the true trajectories are recovered and modelling errors ηmods and observation errors ηobss are separated. The main drawback of this methodology is that the state space is very large since it includes all the positions and speeds at every time for every swimmers, which leads to intractable computations.

Mixing SSM and non-linear inference models

An intermediary strategy has been designed by selecting swimmer trajectories that we want to infer by SSM, the remaining trajectories being kept to compute an acceleration dataset Ais(t). Namely, note Dssm the set of swimmer index kept for SSM, and DA the set of swimmer index kept for non-linear regression. We set, for iDssm

(17) dvis(t)dt=γ(v0s+b(t,zis(t))(v1s-v0s)-νis(t))vis(t)vis(t)+βsb(t,zis(t))b(t,zis(t))+ηmods
(18) zis(t)dt=vis(t)

for given initial conditions zis(0), vis(0), and for jDA

(19) Ajs(t)=γ(v0s+b(t,Xjs(t))(v1s-v0s)-νis(t))Vjs(t)Vjs(t)+βsb(t,Xjs(t))b(t,Xjs(t))+ηs

where Xjs(t), Vjs(t) and Ajs(t) are observed positions, speeds and accelerations. This model is completed by a likelihood equation

(20) Xis(t)zis(t)+ηobss, for iDSSM
(21) Ais(t)fA(θs|b,Xjs(t),Vjs(t),Ajs(t))+ηs

where fA is defined in equation (4).

This setting kept some advantages of the SSM, like inferring some true hidden trajectories or separating the estimate of modeling and observation errors, while limiting the computational load if DSSM is not too large.

We finally kept the regression model for several reasons. First, we are interested in recovering population wide parameters to characterize strain-specific swims, and not identifying true trajectories. Second, we can consider that the observation error with confocal microscopy is several order of magnitudes under the spatial characteristic lengths involved in equation (2), so that observation errors can be neglected. Hence, the objective of separating the uncertainty sources between model and observation errors, which is a main advantage of the SSM or mixed inference settings, becomes secondary. Furthermore, enhancing the state space dimension provided additional uncertainties, worsening the inference precision on synthetic data. We then opted for the simple regression model that provided sufficient parameter identifiability for limited computational load.

Appendix 4

KDE computation

We illustrate the process of visualization of multiple point distributions in the same graph using KDE and isolines enclosing specific proportions of the data in Appendix 4—figure 1. A point cloud is first approximated with a Gaussian KDE. Then, the value of the gaussian KDE is evaluated in each point of the original point cloud, which allows to map the 2D map into a 1D set where order relation can be defined. Specific quantiles of the resulting values are computed (namely quantile 0.05, 0.5 and 0.95). By definition, the quantile 0.05 separate 5% of the points of the original dataset (the 5% lowest Gaussian KDE values) from the remainder of the data set. The isoline corresponding to the quantile 0.05 then also separates in the 2D map the 5% lowest Gaussian KDE values from the others.

Appendix 4—figure 1
Illustration of the Gaussian KDE isovalues computation.

Starting from a random 2D point distribution (left panel), a gaussian KDE is computed using the scipy.stats function (middle panel). Then, the gaussian KDE is evaluated at the original point positions, and quantiles of the resulting values are computed (quantiles 0.05, 0.5 and 0.95). Gaussian KDE isolines corresponding to this quantiles are finally computed (right panel). This isolines enclose respectively 5, 50% and 95% of the points of the original distribution, centered in the densest area of the initial point cloud. This procedure gives a good representation of the shape of the data, but allows to display several distributions in the same graph, enabling comparison while avoiding superimposition issues.

Data availability

Data and code have been deposited at https://forgemia.inra.fr/bioswimmers/swim-infer and https://doi.org/10.5281/zenodo.6560673.

The following data sets were generated
    1. Labarthe S
    2. Ravel G
    3. Deschamps J
    4. Briandet R
    (2022) Zenodo
    Inferring characteristics of bacterial swimming in biofilm matrix from time-lapse confocal laser scanning microscopy: compagnon code and data.
    https://doi.org/10.5281/zenodo.6560673

References

  1. Conference
    1. Li Y
    2. Briandet R
    3. Trubuil A
    (2014) Tracking swimmers bacteria and pores within a biofilm
    2014 IEEE 11th International Symposium on Biomedical Imaging ISBI 2014.
    https://doi.org/10.1109/ISBI.2014.6867869
  2. Book
    1. Piard J
    2. Kim S
    3. Deschamps J
    4. Li Y
    5. Dorel C
    6. Gruss A
    7. Trubuil A
    8. Briandet R
    (2016)
    Travelling through Slime–Bacterial Movements in the Eps Matrix
    London, United kingdom: IWA Publishing.

Decision letter

  1. Karine A Gibbs
    Reviewing Editor; University of California, Berkeley, United States
  2. Wendy S Garrett
    Senior Editor; Harvard T.H. Chan School of Public Health, United States
  3. Iago Grobas
    Reviewer; University of Oxford, United Kingdom
  4. Karine Gibbs
    Reviewer; University of California, Berkeley, Berkeley, United States

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Decision letter after peer review:

Thank you for submitting your article "Inferring characteristics of bacterial swimming in biofilm matrix from time-lapse confocal laser scanning microscopy" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, one of whom is a member of our Board of Reviewing Editors, and the evaluation has been overseen by Wendy Garrett as the Senior Editor. The following individual involved in review of your submission has agreed to reveal their identity: Iago Grobas (Reviewer #2).

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

Essential revisions:

Overall, the presented data support the conclusions and provide insights into bacterial motility. However, there are concerns about the model (assumptions and extensions), the limited discussion/application of bacterial motility and morphology, and aspects of the experimental design. If you choose to revise this manuscript, please address the following items:

1) In its current form, the paper lacks a characterization of the swimming motility of the bacteria in a Newtonian fluid, an important aspect to rationalize the impact of the biofilm.

2) Clearly communicate how this model would help characterize bacteria motility in exogeneous biofilms (instead of performing microscopy).

3) The discussion of cell morphology and potential impacts on motility within the biofilm is inadequate. Cell morphology should also be considered when characterizing swimming motility in a Newtonian fluid.

See the detailed comments for specific concerns regarding each of these items.Reviewer #1 (Recommendations for the authors):

This manuscript by Ravel and colleagues connects prior analysis of bacterial movements on/within abiotic substrates to those within a "living" substrate. The primary outcome is developing a methodology to examine the overall swimming of a bacterial population within another bacterium's biofilm. Implementing these methods would allow for differentiation among bacterial species, identifying microbes that can travel more deeply within an existing biofilm. The authors consider that in the long-term, one could potentially use microbes to reliably deliver antimicrobials (or similar) into the heart of a biofilm, thereby hopefully reducing the biofilm's recalcitrant responses to these biocontrol chemicals.

Strengths

– These scientists develop an inference model and supportive methods to ascertain traits of the population of swimming cells. While similar methods and models exist, the specific examination within a "living" biofilm is intriguing and foundational for developing drug delivery methods or interpreting how cells of any type, e.g., immune cells, might penetrate these bacterial communities. These results also raise questions about the role of cell shape, size, and motility in infiltrating already formed colonies, and conversely, how the composition of the extracellular matrix and cell stacking could protect biofilms from invasion.

– The scientists sequentially test their inference model with and without experimental data. Similar results emerge, suggesting that the inference model could provide an initial step to evaluate different bacterial strains with various biofilm conditions rapidly.

– The authors do a good job of clearly explaining each variable's definition and directly addressing many of the assumptions and constraints of these experiments.

Overall, the presented data support the conclusions, except for the discussion about bacterial shape and motility.

Weaknesses

– An underlying thread is that the difference in shape and flagella size/motility contributes to how each Bacillus strain can navigate the S. aureus biofilm. Yet, the discussion of these contributions is not until the end and relies on abstract assertions of potential behavioral differences. This analysis could be more robust if it included analysis of each species' motility in the absence of biofilm so as to establish if and how these species swim differently from one another. As such, lines 416 – 419 are not strongly supported by the data in this manuscript or the current literature.

– There is a potential bias in the data due to the constraints of the experimental set-up: only focal planes near the well's edge are included in the analysis. The biofilm in these zones could have a distinct physical structure (due to the well's wall) than the remainder of the biofilm. While the datasets presented here are self-contained for analysis, any conclusions about the general/overall biofilm are more narrow (or should be taken with a caveat).

– The time-scale of 30 minutes at 30{degree sign}C could permit bacterial growth while cells are moving. The inference measurements appear to be reasonably robust against these potential impacts (perhaps incorporated as part of the "noise" variable). The possible cell growth is worth also considering if attributing differential swimming behaviors to cell morphology or flagella size/location.

I have included all of my scientific comments in the public review. A couple of typographical errors alter a sentence's meaning. The critical ones are as follows:

– line 96, dies --> dyes

– line 198, γ --> β (?)

Further, please clarify that in Table 1 that the video duration is in minutes.

Reviewer #2 (Recommendations for the authors):

This work aims to create a workflow to elucidate how the biofilm matrix affect the trajectories of exogeneous swimming bacteria. In principle, this could potentially be interesting to study biofilm spatial organization and also to characterize transport of synthetic particles such as nanoparticles or colloids carrying biocides into the biofilm. However, the statistical model presented here does not consider the surface interactions between the particles and the porous media which makes the model quite specific to the bacteria-biofilm interaction and, therefore, difficult to extend to other particles and porous materials.

The article is mainly focused on the statistical model which quantifies the key parameters of the bacteria swimming motility affecting their trajectory within the biofilm matrix. The main parameters studied are the acceleration, swimming speed, net displacement in the swimming trajectory and the area covered by the bacteria during the trajectory. The variables in the biofilm matrix that can affect those parameters are the local differences in cell density within the biofilm and the cell density itself. In this regard, the authors found a good agreement between the trajectories inferred by the model and the real trajectories, validating their model for the strains tested. The model allows to decipher how the different bacteria species adapt to the biofilm matrix. Nevertheless, I fail to understand how the model provides an advantage with respect to just observing the trajectories under the microscope. Following the bacteria with microscopy could identify when they slow down or speed up depending on the density of the host biofilm matrix as the model does.

Finally, the authors connect the bacteria morphology of the three species studied to the kinematic descriptors of the bacteria trajectories. The morphological features logically agree with what their model predicts, meaning that, long bacteria are slower in the biofilm matrix and bacteria with lower aspect ratio finds easier to go through the porous biofilm matrix. I find this like a nice way of checking that the model works but at the same time I wonder how useful this model can be since the mentioned morphological features and their impact in the bacteria trajectories could be inferred without the model, just by using microscopy. On the same line, the authors suggest that a brush-like group of thin flagella make changing directions easier but I do not see how this is checked experimentally.

In summary, the parameters affecting the model and the correlation between the kinematic variables of the swimming trajectories and the local conditions in the biofilm are thoroughly checked. However, my concern is that I do not see how this model can widen our knowledge about how bacteria navigate biofilm matrix since tracking bacteria in a biofilm matrix would give similar information. Furthermore, I do not know how limited this model is in terms of the surface chemistry interaction between the biofilm and the bacteria, or the particles in general, that are introduced in the biofilm. This surface chemistry interaction could totally change the trajectory of the swimmers independently on the parameters studied in this article which limits the model to bacteria swimming in a host biofilm and does not allow the extension of the model to synthetic particles or other porous media.

I do not understand very well Figure 1a. There are red bacteria (guest bacteria) in the last frame that do not appear in previous timestamps. Specifically, the trajectory at the bottom right has its origin in an area that is visible for all timepoints and I do not manage to see any bacteria in there in any of the timestamps.

I have also a problem with Figure 1b. I guess that in the caption, 'distance' refers to the shortest distance between the initial and final point of the trajectory. But this is what the authors seem to be drawing in the figure for 'displacement' which is defined as 'the total length of the trajectory path'. So I think these two should be swapped. Actually, I think the correct representation according to the capture is as drawn in Figure 1d.

In table 1, could the units of time interval be specified?

In Figure 2, is the whole set of trajectories for each species displayed for 1 batch or for the 3 batches? I think Figure 2, would be easier to follow if the order of the bacteria in (a) matches the order in (b). I think it would also benefit from a rough y axis in the first graph of the upper panel of (b). What are the units in the x axis? If the magnitudes are dimensionless because they are normalized, I think this should be said in the caption, also because dimensionless magnitudes are referred as V* and A* in the main text. The legend in the bottom panel (b) containing the names of the bacteria should be in italics and 'B' separated from the rest of the name. There is a mistake in the caption (2b), the magnitude 'Area' is not mentioned in the caption.

Do the authors know why the distributions in Figure 2 look quite similar for B. pumilus and B. cereus but different to B. sphaericus? It seems to me that this might be an artefact coming from the fact that less trajectories are plotted for B. sphaericus. I think it would be informative if they put a label with the number of total trajectories displayed in this panel.

I do not see very well the added value of the bottom panel in Figure 2b. I have read the Results section just looking at the 50% line, I don't think the others present much more additional data.

I think the penalization coefficient 'γ' being inversely proportional to the relaxation time should be further explained.

In line 316 says that B. pumilus shows the highest v0 value, indicating a higher ability to swim fast in low density. But v0 was defined as the speed at the highest density, so this means that B. pumilus swims fast at high density. Which one is right?

I do not know if it can be argued that B. sphaericus presents no difference between v0 and v1 as it is written in the main text. It seems that the v0 is extremely low, almost 0, indicating that it cannot swim at high cell density. However, this strain has the highest v1 mean.

Appendix 1: In Figure 1. I think the red channel's brightness should be increased. When the authors say 'rescalled biofilm density map'? what do they mean? Is it just the first image of the biofilm for a certain condition? Or do they alter the image in any way? In this same Figure 1, there is a white space in the gray frame enclosing the 'swimmer trajectories in normalized biofilm'.

I would put figure 6 in an appendix. I do not think it adds much value to the aim of the paper.

Not sure about Fig7. I think it's better to put a square in the zoomed in regions instead of the dashed line. For example, in the middle panel, the zoomed in version of B. pumilus seems like there are two bacterial bodies attached to each other. However, this is very difficult to see in the zoomed out version. I think this would be clearer if instead of a dashed line there was a dashed square around the region.

Why is there a huge change in the epsilon coefficient depending on the species? I thought random noise would affect roughly the same to all species.

I think in general, the paper would benefit if the figure 7 was changed to figure 1. I would motivate the paper by saying that there are three species with different hydrodynamic properties because of shape, number/type of flagella, etc. and this leads to the hypothesis that the behaviour would be different in the porous biofilm matrix. For example, the longer body of B. sphaericus would be an impediment when navigating the biofilm and therefore they expect lower motility, etc. As the paper is structured now, the whole story seems to be around checking whether their model is correct.

I do not understand the calculation of the visited area. What is the parameter k and ns? Why do they have those values? Why was the visited area not calculated using the trajectory multiplied by the size of the particle?

Reviewer #3 (Recommendations for the authors):

Ravel et al., investigate the swimming behavior of three different Bacillus species in the biofilm formed by Staphylococcus aureus. The biofilm structure and the swimmers' behavior are experimentally characterized using time-lapse confocal microscopy. The swimmers' behavior is described by several parameters measured from tracking their positions in time, including acceleration and speed. The biofilm structure is described in terms of the density of the biomass. The data highlight differences in the shape of the trajectories, speed distributions, and tendency to visit high-density areas of the biofilm in the three different species. These observations are reproduced using a generative model of the data and tentatively explained in terms of bacterial morphology. The model could, in principle, be used to predict the motility of other bacterial species in biofilms.

The conclusions of this paper are supported by the data and the model. Still, the explanation of the former in terms of bacterial motility and morphology needs to be extended and some aspects of the model need to be clarified.

1) The authors explain the different swimming behavior observed in the biofilm in terms of different bacterial morphology, i.e. shape and number of flagella. However, the impact of these differences in a Newtonian fluid should be evaluated and used to better understand the adaptation strategies to the biofilm. In addition, the observations should be commented in light of the literature studying bacterial motility in non-Newtonian fluids (Phys. Rev. Fluids, 5, 073103 (2020); Nat. Phys, 15, 554-558, (2019); Sci. Rep., 5, 15761 (2015); PNAS, 111, 17771-17776 (2014)), where the impact of the flagellar shape and motility are discussed.

2) While a lot of attention is dedicated to describing how bacterial motility is quantified, very few details are given about the measurement of biofilm density and the possible error sources. Since this is a primary ingredient for the interpretation of the results, I would recommend commenting on the procedure and the sensitivity of density measurement.

3) The authors suggest that bacterial motility could help create channels into the biofilm, affecting transport. Is this effect observed in the experiments? This aspect should be clarified and commented on.

4) The purpose of the model and its pros and cons should be discussed more clearly for a general audience. In particular, the authors should clarify the added value of the model with respect to the experiments.

5) Do these bacteria perform a run-and-tumble or a run-and-reverse type of motion in water? Could a back-and-forth trajectory be described as a run-and-reverse?

6) Table 1: The units are missing.

7) Line 125: It would be helpful if some numbers were given in the text to quantify the "large swimming distances" or the "widest swimming distribution".

8) Figure 1: The biofilm density map should be added in panel a. I would recommend avoiding the superimposition of the distributions of ||A||, ||V||,dist, disp and area in panel b.

9) Line 136: Could the term "few" be quantified? From Figure 1a the exceptions seem to be several.

10) Line 192: The bio-readership may not find clear the term "basal" used to identify the simulations. The same applies to "ground truth" in the following. These terms should be defined to improve clarity.

11) Line 303-306: The conclusion on the variability of the outputs should be clarified. Is it a positive feature or is the model missing something?

https://doi.org/10.7554/eLife.76513.sa1

Author response

Essential revisions:

Overall, the presented data support the conclusions and provide insights into bacterial motility. However, there are concerns about the model (assumptions and extensions), the limited discussion/application of bacterial motility and morphology, and aspects of the experimental design. If you choose to revise this manuscript, please address the following items:

We warmly thank the reviewers for this positive comment. Corrections are detailed below.

1) In its current form, the paper lacks a characterization of the swimming motility of the bacteria in a Newtonian fluid, an important aspect to rationalize the impact of the biofilm.

New experiments have been conducted to image the swimming motility of the studied strains in a Newtonian fluid. We took exactly the same culture protocol for the swimmers, but put them in the buffer which is a Newtonian fluid in the absence of host biofilm. Images have been obtained and post-processed with the same dataflow as previous experiments. We then obtained a control experiment characterizing the microbial swim without biofilm.

We first computed the same trajectory descriptors as for the trajectories inside the biofilm, allowing for pairwise comparisons between species swimming in the control buffer, but also to compare the free swim (control) with the swim in a crowded environment (biofilm). We also conducted an additional characterization by addressing a potential run-and tumble or run-and-reverse behaviour of the species (see Reviewer 3 remark): we computed for each trajectory time point the swimmer speed (defined as the mean between the incoming and outgoing velocity vectors (ǁVis(t)ǁ+ǁ Vis(t-Δt)ǁ)/2, for t∊ (Tis+1, Tend,is)) versus the direction change defined as the angle between the incoming and outgoing velocity vectors θis(t)=arccos((Vis(t)Vis(tΔt))/(||Vis(t)||Vis(tΔt)||)). An additional figure was added illustrating run-and-tumble in the host biofilm or in the control buffer.

These additional data provided very interesting insights. We could in particular observe that B.sphaericus has a similar run-and-reverse behaviour in the biofilm or the control buffer, the biofilm strongly impairing the swimming speed and increasing the number of reverse events whereas B.pumilus switches from rectilinear runs in the Newtonian buffer to a pronounced run-and-reverse behaviour in the biofilm with decreased speeds.

By contrast, B.cereus is weakly impacted by the host biofilm: it manages to keep approximately straight trajectories and comparable swimming velocities in the biofilm. Interestingly, the number of reverse events is even reduced in the biofilm.

– line 466 : Material and Methods, imaging protocol for the control experiments in the Newtonian fluid is added.

– Figure 2 of the initial submission has been split in two figures now numbered figures 3 and 5. In Figure 3, a panel has been added with the swimmer trajectories in the control buffer. In Figure 5, the descriptor distribution in the control buffer has been added. The figure interpretations has been modified accordingly in lines 110, 122, 133, 140, 165, 167, 169 and 173.

– A new figure (Figure 4) has been added to assess potential run-and tumble behaviours. The figure description has been added in lines 133. Precise definition of angle and average velocity between two consecutive speed vectors is given line 516 in the Material and Methods.

2) Clearly communicate how this model would help characterize bacteria motility in exogeneous biofilms (instead of performing microscopy).

We thank the reviewers for this fundamental remark. We agree with the reviewers, and in particular with Reviewer 2 who was sceptical about the need of an additional mathematical model, that a large part of the present study is made with direct observations of the trajectories without additional modelling: Figures 3, 4 and 5 of the present submission are nothing else than the extraction of direct descriptors from the microscopy images. In particular, these descriptors allow to quantify swimming characteristics (Figure 5, top panel) but also to link the host biofilm (density and gradient) to the swimmer behaviour (speed and acceleration) (Figure 5, lower panel) or to assess run and tumble (Figure 4).

However, the main goal of this study was to go beyond these descriptive analysis. First, we propose an integrative model able to analyse simultaneously heterogeneous data (density maps and trajectories) and to gather all the swimming behaviours that are considered (speed selection, direction changes and random walk) in the same quantification in order to measure their relative impact on the swimmer trajectories. This method is comparable to ANOVA-like multivariate analysis: parametric phenomelogical mappings are defined to link explicative co-variables to a phenomena (i.e. for example the function defining speed selection from biofilm density) and gathered in the same inference problem. Compared to direct observations, the model we propose additionally tells us for example that speed adaptation is preponderant compared to direction selection (Appendix 2 Figure 4). Secondly, we infer and validate a generative model able to reproduce observed trajectories. This model can be used in future works to predict the impact of a bacterial swimmer on the vascularisation of the host biofilm.

– Section title has been changed line 192 to better communicate that the mathematical model is actually a data analysis method. In the same paragraph, comments were added lines 220.

– Additional sentence is added lines 257 and 322 to clarify the fact that the measurement of the relative contribution of the different swimming mechanisms to the acceleration variance is allowed by the mathematical model.

– Additional discussion has been added line 369 to clearly communicate that the model inference can be seen as an analysis method allowing for multi-data integration and co-analysis.

3) The discussion of cell morphology and potential impacts on motility within the biofilm is inadequate. Cell morphology should also be considered when characterizing swimming motility in a Newtonian fluid.

A new discussion has been added linking the cell morphology to the swimming path in the control Newtonian buffer, but also discussing these results respectively to the literature, as suggested by Reviewer 3.

– The result section was rewritten and made more accurate by linking the presence of multiple flagella to a run-and-tumble behaviour, as observed in other flagellated bacteria such as E. coli, line 352.

– A new section has been added in the discussion to put the results in perspective, regarding previous studies of the literature and the new control experiments in the Newtonian buffer, line 447.

See the detailed comments for specific concerns regarding each of these items.

Reviewer #1 (Recommendations for the authors):

This manuscript by Ravel and colleagues connects prior analysis of bacterial movements on/within abiotic substrates to those within a "living" substrate. The primary outcome is developing a methodology to examine the overall swimming of a bacterial population within another bacterium's biofilm. Implementing these methods would allow for differentiation among bacterial species, identifying microbes that can travel more deeply within an existing biofilm. The authors consider that in the long-term, one could potentially use microbes to reliably deliver antimicrobials (or similar) into the heart of a biofilm, thereby hopefully reducing the biofilm's recalcitrant responses to these biocontrol chemicals.

Strengths

– These scientists develop an inference model and supportive methods to ascertain traits of the population of swimming cells. While similar methods and models exist, the specific examination within a "living" biofilm is intriguing and foundational for developing drug delivery methods or interpreting how cells of any type, e.g., immune cells, might penetrate these bacterial communities. These results also raise questions about the role of cell shape, size, and motility in infiltrating already formed colonies, and conversely, how the composition of the extracellular matrix and cell stacking could protect biofilms from invasion.

– The scientists sequentially test their inference model with and without experimental data. Similar results emerge, suggesting that the inference model could provide an initial step to evaluate different bacterial strains with various biofilm conditions rapidly.

– The authors do a good job of clearly explaining each variable's definition and directly addressing many of the assumptions and constraints of these experiments.

Overall, the presented data support the conclusions, except for the discussion about bacterial shape and motility.

We thank Reviewer 1 for his positive and encouraging comments. Remarks about the discussion about bacterial shape and motility have been addressed, as detailed in the previous section.

– Results and discussion of the links between shape and motility have been rewritten, see lines 352 and 447.

Weaknesses

– An underlying thread is that the difference in shape and flagella size/motility contributes to how each Bacillus strain can navigate the S. aureus biofilm. Yet, the discussion of these contributions is not until the end and relies on abstract assertions of potential behavioral differences. This analysis could be more robust if it included analysis of each species' motility in the absence of biofilm so as to establish if and how these species swim differently from one another. As such, lines 416 – 419 are not strongly supported by the data in this manuscript or the current literature.

As suggested by Reviewers 1 and 2 (see below), we moved the ultrastructural microscopy to the beginning of the result section, the biological interpretation of the swimming changes being devolved to the last result section, which has been re-written (see Point 3 above). This section title has been changed accordingly. Lines 416-419 of the initial submission have been reformulated.

– The image showing the bacterial physiology has been moved to the beginning of the results and additional descriptions of the bacteria have been included, line 102 to 107.

– Result subsection title has been changed, line 329.

– Lines 416-419 of the initial submission have been reformulated, line 425.

– There is a potential bias in the data due to the constraints of the experimental set-up: only focal planes near the well's edge are included in the analysis. The biofilm in these zones could have a distinct physical structure (due to the well's wall) than the remainder of the biofilm. While the datasets presented here are self-contained for analysis, any conclusions about the general/overall biofilm are more narrow (or should be taken with a caveat).

We performed an additional analysis to check that the physical structure of the biofilm is not drastically changed near the well’s edge, at the bottom of the well, where the focal plan has been chosen to screen the swimmers. To provide a quantitative analysis, we took 4 replicates of S. aureus biofilms that were imaged in 3D using a stack of 6 horizontal images, starting from z = 0 near the well’s edge, to z = 6∆z, at the interface between the biofilm and the bulk solution. To study the between and within biofilm variability in the horizontal images, we subsampled them with a regular Cartesian 4x4 grid, resulting in a 4*6*(4*4)=384 2D images database supplemented by metadata (stack, z and x−y coordinate of the subsample). Pairwise correlation of the biofilm density in the 384 samples has been computed (scikit-learn pairwise distances, ’correlation’ parameter), and the resulting similarity matrix has been displayed using Seaborn package (clustermap) after hierarchical clustering (scipy.cluster.hierarchy linkage function) and added to the annex. Additional permanova has been computed to assess the significance of between-group dissimilarities (skbio.stats.distance permanova function). This study shows that the z direction does not structure the data, contrary to the x−y and stack groups, indicating that the biofilm structure is not different near the well’s bottom edge.

– A description of this additional study has been added to the Material and Method, lines 492 and 628.

– A new section has been added in the annex 1, showing the results of the study, in Annex 1 Figure 2, and in the relative description.

– The time-scale of 30 minutes at 30{degree sign}C could permit bacterial growth while cells are moving. The inference measurements appear to be reasonably robust against these potential impacts (perhaps incorporated as part of the "noise" variable). The possible cell growth is worth also considering if attributing differential swimming behaviors to cell morphology or flagella size/location.

– The video duration was actually of 30 s. We really apologize for this misunderstanding induced by the fact that we forgot to mention the time unit in Table 1 which collects all the dataset characteristics, as highlighted by the three Reviewers. The duration was mentioned in the Material and Methods, but we rewrote the unit in plain letter in order to avoid any ambiguity. This time scale is relevant for motion screening and allow to reduce the number of division events in the dataset. Biofilm colonization at longer time scales has been studied for example in [Houry13088], but the influence of swimmer division in the process is interesting: it could be adressed in future works.

– We indicated the time unit in Table 1 and its caption.

– Video duration was indicated in plain letter in the Material and Methods line 478.

I have included all of my scientific comments in the public review. A couple of typographical errors alter a sentence's meaning. The critical ones are as follows:

– line 96, dies --> dyes

– line 198, γ --> β (?)

Further, please clarify that in Table 1 that the video duration is in minutes.

Corrections have been included.

– Modifications were made lines 111 and Appendix 1.

– units have been clarified in Table 1 and line 478 but are in seconds.

Reviewer #2 (Recommendations for the authors):

This work aims to create a workflow to elucidate how the biofilm matrix affect the trajectories of exogeneous swimming bacteria. In principle, this could potentially be interesting to study biofilm spatial organization and also to characterize transport of synthetic particles such as nanoparticles or colloids carrying biocides into the biofilm.

We thank Reviewer 2 for this description that fits the objectives of our work

However, the statistical model presented here does not consider the surface interactions between the particles and the porous media which makes the model quite specific to the bacteria-biofilm interaction and, therefore, difficult to extend to other particles and porous materials.

We agree with Reviewer 2 that biofilm materials are very complex, and can be related to a limited extent to polymeric fluids or porous materials. We added a discussion of these links and limitations and put our results in perspective relatively to other results on flagellated bacteria in polymeric fluids, as also suggested by Reviewer 3 (see below). Cell-to-cell interactions are an important determinant of the kinematic of the swimmer, but taking it into account would necessitate a precise characterization of the rheology of the host biofilm. We opted for a phenomenological description of the kinematic by decomposing the forces exerted on the swimmer into 3 blocks gathering respectively the net forces involved in the velocity selection, in the direction selection, and in random forces. This approach does not pretend to accurately describe the mechanics at the swimmer surface, but rather to gather information into interpretable blocks, the importance of which is given by the inference. This approach is comparable to ANOVA methods by its rationale.

– A discussion has been provided lines 447.

– The discussion on the overall mathematical approach has been enhanced line 369

The article is mainly focused on the statistical model which quantifies the key parameters of the bacteria swimming motility affecting their trajectory within the biofilm matrix. The main parameters studied are the acceleration, swimming speed, net displacement in the swimming trajectory and the area covered by the bacteria during the trajectory. The variables in the biofilm matrix that can affect those parameters are the local differences in cell density within the biofilm and the cell density itself. In this regard, the authors found a good agreement between the trajectories inferred by the model and the real trajectories, validating their model for the strains tested. The model allows to decipher how the different bacteria species adapt to the biofilm matrix. Nevertheless, I fail to understand how the model provides an advantage with respect to just observing the trajectories under the microscope. Following the bacteria with microscopy could identify when they slow down or speed up depending on the density of the host biofilm matrix as the model does.

We thank Reviewer 2 for this description that we endorse. Answers to the concerns about the need for a mathematical model have been provided above in Section 2, point 3. We hope that these modifications will fit his/her expectations.

– See above Section 2, point 3.

Finally, the authors connect the bacteria morphology of the three species studied to the kinematic descriptors of the bacteria trajectories. The morphological features logically agree with what their model predicts, meaning that, long bacteria are slower in the biofilm matrix and bacteria with lower aspect ratio finds easier to go through the porous biofilm matrix. I find this like a nice way of checking that the model works but at the same time I wonder how useful this model can be since the mentioned morphological features and their impact in the bacteria trajectories could be inferred without the model, just by using microscopy. On the same line, the authors suggest that a brush-like group of thin flagella make changing directions easier but I do not see how this is checked experimentally.

As detailed above in Section 2, point 3, the mathematical model is a way to integrate in the same quantification the interaction between the host biofilm and the swimmer velocity, direction and ”randomness”, in order to identify their relative impact on the overall swimmer trajectories. It can be seen as a specific statistical treatment of the microscopy images.

– Additional discussion of the links between kinematics and morphology has been added as indicated in Section 2, point 4 above.

In summary, the parameters affecting the model and the correlation between the kinematic variables of the swimming trajectories and the local conditions in the biofilm are thoroughly checked. However, my concern is that I do not see how this model can widen our knowledge about how bacteria navigate biofilm matrix since tracking bacteria in a biofilm matrix would give similar information. Furthermore, I do not know how limited this model is in terms of the surface chemistry interaction between the biofilm and the bacteria, or the particles in general, that are introduced in the biofilm. This surface chemistry interaction could totally change the trajectory of the swimmers independently on the parameters studied in this article which limits the model to bacteria swimming in a host biofilm and does not allow the extension of the model to synthetic particles or other porous media.

A comment about the generalization of our work has been added in the discussion

– A comment has been added line 449.

I do not understand very well Figure 1a. There are red bacteria (guest bacteria) in the last frame that do not appear in previous timestamps. Specifically, the trajectory at the bottom right has its origin in an area that is visible for all timepoints and I do not manage to see any bacteria in there in any of the timestamps.

Thank you for this remark. The figure has been remade.

– Figure 1a has been remade with a correct trajectory.

I have also a problem with Figure 1b. I guess that in the caption, 'distance' refers to the shortest distance between the initial and final point of the trajectory. But this is what the authors seem to be drawing in the figure for 'displacement' which is defined as 'the total length of the trajectory path'. So I think these two should be swapped. Actually, I think the correct representation according to the capture is as drawn in Figure 1d.

Thanks for pointing to this concern. The correct definition of distance and displacement was given in their mathematical definition line 135 of the revised version of the manuscript. We corrected accordingly Figure 1.d and its caption.

– Images have been swapped in image 1.d consistently with figure 1.b and mathematical definition.

– Caption has been modified.

In table 1, could the units of time interval be specified?

This is done.

– The unit is added in the corresponding column header

In Figure 2, is the whole set of trajectories for each species displayed for 1 batch or for the 3 batches? I think Figure 2, would be easier to follow if the order of the bacteria in (a) matches the order in (b). I think it would also benefit from a rough y axis in the first graph of the upper panel of (b). What are the units in the x axis? If the magnitudes are dimensionless because they are normalized, I think this should be said in the caption, also because dimensionless magnitudes are referred as V* and A* in the main text. The legend in the bottom panel (b) containing the names of the bacteria should be in italics and 'B' separated from the rest of the name. There is a mistake in the caption (2b), the magnitude 'Area' is not mentioned in the caption.

Do the authors know why the distributions in Figure 2 look quite similar for B. pumilus and B. cereus but different to B. sphaericus? It seems to me that this might be an artefact coming from the fact that less trajectories are plotted for B. sphaericus. I think it would be informative if they put a label with the number of total trajectories displayed in this panel.

I do not see very well the added value of the bottom panel in Figure 2b. I have read the Results section just looking at the 50% line, I don't think the others present much more additional data.

In previously figure 2 and now figure 3, the 3 batches are pooled. The y axis is not indicated since the distribution are normalized to sum to one. The x-axis are also dimensionless since they are normalized: we added a note in the figure caption. We modified the figures accordingly to the comments.

In our opinion, similar distributions for B.pumilus of B.cereus could come from similar trajectory characteristics for these macroscopic descriptors. We can not discard that discrepancies could come from different sample numbers. The artefact could be discarded by the fact that the number of trajectories for B.sphaericus is in the same order than B.cereus, although they show different descriptor distributions.

The bottom panel in the previously Figure 2b but now figure 3 shows the dependence of the acceleration and speed to the underlying host biofilm structure (density and density gradient). It is a way to present the result of direct observation by microscopy of the links between kinematic characteristics and the host biofilm.

– We agree that most of the information is given by the 50% isoline. We kept however the different lines to comply with eLife policy in distribution presentation, since the authors are asked to present the distribution spread in addition to means or medians.

– Plot order has been modified.

– Names of bacteria in the figure legend have been changed.

– Figure 3 caption has been amended with a note indicating that the 3 batches are pooled.

– The number of samples in the different densities has been indicated in the legend.

I think the penalization coefficient 'γ' being inversely proportional to the relaxation time should be further explained.

The fact that γ is dimensioned as an inverse of time is necessary to be homogeneous to the acceleration in the left hand side of the equation. The term ”relaxation” comes from ”penalization” modelling in fluid mechanics, where given vector fields can be prescribed by adding to the state equation a penalization term penalizing the difference to the prescribed vector fields. The term ”relaxation” refers to the time needed for the computed vector field to relax towards the constraint.

– Additional explanations have been added line 204.

In line 316 says that B. pumilus shows the highest v0 value, indicating a higher ability to swim fast in low density. But v0 was defined as the speed at the highest density, so this means that B. pumilus swims fast at high density. Which one is right?

We corrected the model description to clearly state that v0 is the prescribed speed when b = 0 and v1 when b = 1. The corresponding sentence line 316 was correct.

– The mathematical model is more clearly defined with additional explanation line 204.

I do not know if it can be argued that B. shaericus presents no difference between v0 and v1 as it is written in the main text. It seems that the v0 is extremely low, almost 0, indicating that it cannot swim at high cell density. However, this strain has the highest v1 mean.

The sentence has been reformulated to be more accurate. The same sentence was repeated later in the original submission: it has been also modified.

– Modifications have been added lines 303 and 317.

Appendix 1: In Figure 1. I think the red channel's brightness should be increased. When the authors say 'rescalled biofilm density map'? what do they mean? Is it just the first image of the biofilm for a certain condition? Or do they alter the image in any way? In this same Figure 1, there is a white space in the gray frame enclosing the 'swimmer trajectories in normalized biofilm'.

The figure has been remade. Indications have been added in Annex 1 to state that brightness has been changed, and to recall the Material and method sentence that detail the rescaling applied to the biofilm density map. No other post-processing has been applied to the raw biofilm maps than mapping in gray scale and color rescaling between 0 and 1.

– The figure has been remade with increased brightness.

I would put figure 6 in an appendix. I do not think it adds much value to the aim of the paper.

This figure shows the balance between the different swimming processes. It shows consistency with the other results. In particular, B.cereus is the bacteria which is the most influenced by the biofilm gradients. It is also the bacteria with the highest stochastic term, related to straight lines in the Langevin equation. These data are consistent with a higher hability of B.cereus to navigate in the biofilm and conserve straight runs.

– Figure 6 has been moved to Annex 2. Additional text has been added for the annex to be self-contained.

Not sure about Fig7. I think it's better to put a square in the zoomed in regions instead of the dashed line. For example, in the middle panel, the zoomed in version of B. pumilus seems like there are two bacterial bodies attached to each other. However, this is very difficult to see in the zoomed out version. I think this would be clearer if instead of a dashed line there was a dashed square around the region.

The zoomed in images were not obtained by digital zoom in but optical zoom in. The bacteria of zoomed in image do not correspond to the bacteria of the larger scale image. That is why we did not opt for a dashed square around the region that was zoomed in.

– We added a complementary sentence in the zoomed in legend to explain that fact.

Why is there a huge change in the epsilon coefficient depending on the species? I thought random noise would affect roughly the same to all species.

The huge change in the epsilon mainly come from the fact that the mechanistic part of the model do not explain the observed kinematics to the same extent, which is reflected in the different goodness of fit for the 3 bacteria (see Table 2). Note however that a larger stochastic part is also related to straighter trajectories (see Figure 1 and Annex 2 Figure 1): the higher stochastic part of B.cereus can be related to its straighter runs in the data (see Figure 3).

I think in general, the paper would benefit if the figure 7 was changed to figure 1. I would motivate the paper by saying that there are three species with different hydrodynamic properties because of shape, number/type of flagella, etc. and this leads to the hypothesis that the behaviour would be different in the porous biofilm matrix. For example, the longer body of B. sphaericus would be an impediment when navigating the biofilm and therefore they expect lower motility, etc. As the paper is structured now, the whole story seems to be around checking whether their model is correct.

We thank Reviewer 2 for this advice. We moved the ultimately figure 7 to the top of the paper. We added a brief description of the ultrastuctural bacterial morphology and linked that section to the remainder of the article by checking if these physiological discrepancies could give the bacteria different swimming behaviour in the biofilm matrix. See also above (Reviewer 1’s public review, point 2).

– We moved the image and added a new paragraph lines 102 and 107.

I do not understand the calculation of the visited area. What is the parameter k and ns? Why do they have those values? Why was the visited area not calculated using the trajectory multiplied by the size of the particle?

The visited area was not calculated as the trajectory multiplied by the size of the particle to take into account bacteria that visit several times the same zones. This metric would be a proxy of the biofilm vascularization induced by the swimmer : a swimmer that cross several times the same area does not dig additional pores in the host biofilm and does not increase the biofilm vascularization. We then opted for tagging all the pixels of the biofilm image that are crossed by a swimmer. But the displacement of the swimmers during a time step can be higher than several pixels : we then had to reconstruct the trajectory during consecutive swimmer positions to recover the pixels crossed by the swimmer during the time step. As a first order approximation of the trajectory during a time step, we computed a straight line between two successive positions.

The parameter ns was chosen so that Xis(t+Δt)Xis(t)/ns was always inferior to a pixel size. k is not a parameter but a variable varying from 1 to ns.

Reviewer #3 (Recommendations for the authors):

Ravel et al., investigate the swimming behavior of three different Bacillus species in the biofilm formed by Staphylococcus aureus. The biofilm structure and the swimmers' behavior are experimentally characterized using time-lapse confocal microscopy. The swimmers' behavior is described by several parameters measured from tracking their positions in time, including acceleration and speed. The biofilm structure is described in terms of the density of the biomass. The data highlight differences in the shape of the trajectories, speed distributions, and tendency to visit high-density areas of the biofilm in the three different species. These observations are reproduced using a generative model of the data and tentatively explained in terms of bacterial morphology. The model could, in principle, be used to predict the motility of other bacterial species in biofilms.

The conclusions of this paper are supported by the data and the model. Still, the explanation of the former in terms of bacterial motility and morphology needs to be extended and some aspects of the model need to be clarified.

We thank Reviewer 3 for this positive comments.

1) The authors explain the different swimming behavior observed in the biofilm in terms of different bacterial morphology, i.e. shape and number of flagella. However, the impact of these differences in a Newtonian fluid should be evaluated and used to better understand the adaptation strategies to the biofilm. In addition, the observations should be commented in light of the literature studying bacterial motility in non-Newtonian fluids (Phys. Rev. Fluids, 5, 073103 (2020); Nat. Phys, 15, 554-558, (2019); Sci. Rep., 5, 15761 (2015); PNAS, 111, 17771-17776 (2014)), where the impact of the flagellar shape and motility are discussed.

We thank Reviewer 3 for pointing out these references. We added additional discussions, in light of these studies.

– Additional discussion added line 447.

2) While a lot of attention is dedicated to describing how bacterial motility is quantified, very few details are given about the measurement of biofilm density and the possible error sources. Since this is a primary ingredient for the interpretation of the results, I would recommend commenting on the procedure and the sensitivity of density measurement.

We investigated the effect on parameter estimate of increasing noise on the biofilm data. The impact of noise on the parameter inference is assessed by noising the biofilm density and the biofilm density gradients with an additive gaussian noise with increasing variance. The noise variance is scaled with the variance observed in the original data. Namely, we set

(1) bN(0,lσb)

and

(2) ϵbN(0,2lΔxσb)

where σb is the variance observed in the original data, and b and ∊b and ∊∇bare respectively the noise applied to the biofilm density and the biofilm density gradient. The parameter l ∈ [0,0.01,0.02,0.03,0.04,0.05] is increased to apply a noise from 0 to 5%.

The parameter β is the most impacted (error of 35% for 5% noise). But it is also the parameter that has the minimal influence on the swimmer model when no noise is applied, showing a lower identifiability for this parameter. For the other parameters, the impact is limited.

– A new figure and its description is added in Annex 1 to show the impact of increasing noise on the parameter estimate.

– Sensitivity analysis method is detailed in Material and method line 591.

– A reference to the figure is added Appendix 2.

3) The authors suggest that bacterial motility could help create channels into the biofilm, affecting transport. Is this effect observed in the experiments? This aspect should be clarified and commented on.

This effect was observed during this experiment but not quantified. The impact of swimmer as facilitator of the penetration of macromolecules inside the biofilm has been published by co-authors in [Houry13088].

– A comment has been added line 54.

– A new image has been added (Annex 1, figure 3) to illustrate the pore formation.

4) The purpose of the model and its pros and cons should be discussed more clearly for a general audience. In particular, the authors should clarify the added value of the model with respect to the experiments.

This comment has been addressed (see essential Revisions above).

– see Section 2 point 3 above.

5) Do these bacteria perform a run-and-tumble or a run-and-reverse type of motion in water? Could a back-and-forth trajectory be described as a run-and-reverse?

As stated above, we analysed the new experiments in the control Newtonian buffer, together with the data in the biofilm, by searching for run-and-tumble or run-and-reverse episode. We can now state that B.sphaericus performs run-and-reverse swimming in a Newtonian fluid. This behaviour is not changed in a biofilm. Conversely, B.pumilus has a different swimming pattern in a Newtonian fluid (straight lines) and in the biofilm (increase of the number of tumbling). B.cereus has the same type of swimming in the Newtonian buffer or in the biofilm.

– An additional figure has been added (Figure 4), with comments and description (see above).

6) Table 1: The units are missing.

This is corrected

– corrections have been made in the table column header and in the material and methods (see above).

7) Line 125: It would be helpful if some numbers were given in the text to quantify the "large swimming distances" or the "widest swimming distribution".

Additional quantifications were provided.

– Value for average speed and acceleration were given: line 154 and 155

– Large swimming distances have been documented with numbers: line 142 and 142

– The widest distribution has been documented with numbers: line 145

8) Figure 1: The biofilm density map should be added in panel a. I would recommend avoiding the superimposition of the distributions of ||A||, ||V||,dist, disp and area in panel b.

The biofilm density map can hardly be added to panel a since (1) the trajectories of the 3 batches are pooled, corresponding to 3 different density map temporal stacks, (2) the density map is changing overtime due to the mechanical action of the swimmers. In the annex, we presented trajectories with an underlying biofilm density map for illustrative purpose, but we highlighted that the biofilm information is a time-series, and that the swimmer position is associated to the biofilm density at the corresponding time. The distributions were modified.

– Figure 5 has been modified

9) Line 136: Could the term "few" be quantified? From Figure 1a the exceptions seem to be several.

We gave quantification of the fact that few trajectories of B.sphaericus are leaving the neighbourhood of their starting point: we stated that 6% of the B.sphaericus induce a displacement of more than 10 µm compared to 28% for B.cereus and 26% for B.pumilus.

– Additional details are given line 171.

10) Line 192: The bio-readership may not find clear the term "basal" used to identify the simulations. The same applies to "ground truth" in the following. These terms should be defined to improve clarity.

Defintions were added.

– we added definitions for these terms Appendix 2.

– we added additional precisions in the legend of Figure 6.

11) Line 303-306: The conclusion on the variability of the outputs should be clarified. Is it a positive feature or is the model missing something?

The model is recovering the main characteristics of the swimmer population, but is not able to reproduce the whole variability of the data, meaning that it is less accurate at reproducing points in the distribution tail.

– Clarifications are given line 288.

https://doi.org/10.7554/eLife.76513.sa2

Article and author information

Author details

  1. Guillaume Ravel

    1. University of Bordeaux, INRAE, BIOGECO, Cestas, France
    2. Inria, INRAE, Talence, France
    Contribution
    Formal analysis, Investigation, Methodology, Project administration, Software, Visualization, Writing – original draft, Writing – review and editing
    Competing interests
    No competing interests declared
  2. Michel Bergmann

    1. Memphis Team, INRIA, Talence, France
    2. University of Bordeaux, IMB, UMR 5251, Talence, France
    3. CNRS, IMB, UMR 5251, Talence, France
    Contribution
    Methodology, Validation, Writing – review and editing
    Competing interests
    No competing interests declared
  3. Alain Trubuil

    Université Paris-Saclay, INRAE, MaIAGE, Jouy-en-Josas, France
    Contribution
    Conceptualization, Methodology, Validation, Writing – review and editing
    Competing interests
    No competing interests declared
  4. Julien Deschamps

    Université Paris-Saclay, INRAE, AgroParisTech, Micalis Institute, Jouy-en-Josas, France
    Contribution
    Data curation, Validation, Writing – review and editing
    Competing interests
    No competing interests declared
  5. Romain Briandet

    Université Paris-Saclay, INRAE, AgroParisTech, Micalis Institute, Jouy-en-Josas, France
    Contribution
    Conceptualization, Data curation, Funding acquisition, Investigation, Methodology, Validation, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-8123-3492
  6. Simon Labarthe

    1. University of Bordeaux, INRAE, BIOGECO, Cestas, France
    2. Inria, INRAE, Talence, France
    3. Université Paris-Saclay, INRAE, MaIAGE, Jouy-en-Josas, France
    Contribution
    Conceptualization, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Software, Supervision, Validation, Writing – original draft, Writing – review and editing
    For correspondence
    simon.labarthe@inrae.fr
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-5463-7256

Funding

Mathnum department - INRAe

  • Guillaume Ravel

Agence Nationale de la Recherche (ANR-12-ALID-0006)

  • Romain Briandet

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

This work has benefited from the facilities and expertise of MIMA2 MET – GABI, INRAE, AgroParistech, 78,352 Jouy-en-Josas, France. C Péchoux is warmly acknowledged for TEM observations. Financial support was provided by the French National Research Agency ANR-12-ALID-0006. Guillaume Ravel received funding from the Mathnum department at INRAE.

Senior Editor

  1. Wendy S Garrett, Harvard T.H. Chan School of Public Health, United States

Reviewing Editor

  1. Karine A Gibbs, University of California, Berkeley, United States

Reviewers

  1. Iago Grobas, University of Oxford, United Kingdom
  2. Karine Gibbs, University of California, Berkeley, Berkeley, United States

Publication history

  1. Received: December 19, 2021
  2. Preprint posted: January 12, 2022 (view preprint)
  3. Accepted: June 10, 2022
  4. Accepted Manuscript published: June 14, 2022 (version 1)
  5. Version of Record published: July 11, 2022 (version 2)

Copyright

© 2022, Ravel et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 423
    Page views
  • 199
    Downloads
  • 0
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Open citations (links to open the citations from this article in various online reference manager services)

Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)

  1. Guillaume Ravel
  2. Michel Bergmann
  3. Alain Trubuil
  4. Julien Deschamps
  5. Romain Briandet
  6. Simon Labarthe
(2022)
Inferring characteristics of bacterial swimming in biofilm matrix from time-lapse confocal laser scanning microscopy
eLife 11:e76513.
https://doi.org/10.7554/eLife.76513

Further reading

    1. Computational and Systems Biology
    2. Neuroscience
    Sergio Oscar Verduzco-Flores, Erik De Schutter
    Research Article Updated

    How dynamic interactions between nervous system regions in mammals performs online motor control remains an unsolved problem. In this paper, we show that feedback control is a simple, yet powerful way to understand the neural dynamics of sensorimotor control. We make our case using a minimal model comprising spinal cord, sensory and motor cortex, coupled by long connections that are plastic. It succeeds in learning how to perform reaching movements of a planar arm with 6 muscles in several directions from scratch. The model satisfies biological plausibility constraints, like neural implementation, transmission delays, local synaptic learning and continuous online learning. Using differential Hebbian plasticity the model can go from motor babbling to reaching arbitrary targets in less than 10 min of in silico time. Moreover, independently of the learning mechanism, properly configured feedback control has many emergent properties: neural populations in motor cortex show directional tuning and oscillatory dynamics, the spinal cord creates convergent force fields that add linearly, and movements are ataxic (as in a motor system without a cerebellum).

    1. Computational and Systems Biology
    2. Immunology and Inflammation
    Mingyao Pan, Bo Li
    Short Report Updated

    T cells are potent at eliminating pathogens and playing a crucial role in the adaptive immune response. T cell receptor (TCR) convergence describes T cells that share identical TCRs with the same amino acid sequences but have different DNA sequences due to codon degeneracy. We conducted a systematic investigation of TCR convergence using single-cell immune profiling and bulk TCRβ-sequence (TCR-seq) data obtained from both mouse and human samples and uncovered a strong link between antigen-specificity and convergence. This association was stronger than T cell expansion, a putative indicator of antigen-specific T cells. By using flow-sorted tetramer+ single T cell data, we discovered that convergent T cells were enriched for a neoantigen-specific CD8+ effector phenotype in the tumor microenvironment. Moreover, TCR convergence demonstrated better prediction accuracy for immunotherapy response than the existing TCR repertoire indexes. In conclusion, convergent T cells are likely to be antigen-specific and might be a novel prognostic biomarker for anti-cancer immunotherapy.