Inferring characteristics of bacterial swimming in biofilm matrix from timelapse confocal laser scanning microscopy
Abstract
Biofilms are spatially organized communities of microorganisms embedded in a selfproduced organic matrix, conferring to the population emerging properties such as an increased tolerance to the action of antimicrobials. It was shown that some bacilli were able to swim in the exogenous matrix of pathogenic biofilms and to counterbalance these properties. Swimming bacteria can deliver antimicrobial agents in situ, or potentiate the activity of antimicrobial by creating a transient vascularization network in the matrix. Hence, characterizing swimmer trajectories in the biofilm matrix is of particular interest to understand and optimize this new biocontrol strategy in particular, but also more generally to decipher ecological drivers of population spatial structure in natural biofilms ecosystems. In this study, a new methodology is developed to analyze timelapse confocal laser scanning images to describe and compare the swimming trajectories of bacilli swimmers populations and their adaptations to the biofilm structure. The method is based on the inference of a kinetic model of swimmer populations including mechanistic interactions with the host biofilm. After validation on synthetic data, the methodology is implemented on images of three different species of motile bacillus species swimming in a Staphylococcus aureus biofilm. The fitted model allows to stratify the swimmer populations by their swimming behavior and provides insights into the mechanisms deployed by the microswimmers to adapt their swimming traits to the biofilm matrix.
Editor's evaluation
This paper nicely considers how the biofilm matrix impacts the organism's moving within that environment, connecting prior analyses of cell movements on/within abiotic substrates to those within a "living" substrate. Though there are instinctive descriptions for this motility, the strength of this manuscript is the development and implementation of a statistical model that quantifies critical parameters and incorporates interactions with the biofilm matrix itself. While the manuscript measures the differences between morphologically distinct bacteria, a longterm possibility is to achieve predictable and reliable delivery of antimicrobials (delivered by bacteria or an abiotic object) into the biofilm's center, thereby reducing a biofilm's recalcitrant responses to biocontrol chemicals.
https://doi.org/10.7554/eLife.76513.sa0eLife digest
Anyone who has ever cleaned a bathroom probably faced biofilms, the dark, slimy deposits that lurk around taps and pipes. These structures are created by bacteria which abandon their solitary lifestyle to work together as a community, secreting various substances that allow the cells to organise themselves in 3D and to better resist external aggression.
Unwanted biofilms can impair industrial operations or endanger health, for example when they form inside medical equipment or water supplies. Removing these structures usually involves massive application of substances which can cause longterm damage to the environment.
Recently, researchers have observed that a range of small rodshaped bacteria – or ‘bacilli’ – can penetrate a harmful biofilm and dig transient tunnels in its 3D structure. These ‘swimmers’ can enhance the penetration of antimicrobial agents, or could even be modified to deliver these molecules right inside the biofilm. However, little is known about how the various types of bacilli, which have very different shapes and propelling systems, can navigate the complex environment that is a biofilm. This knowledge would be essential for scientists to select which swimmers could be the best to harness for industrial and medical applications.
To investigate this question, Ravel et al. established a way to track how three species of bacilli swim inside a biofilm compared to in a simple fluid. A mathematical model was created which integrated several swimming behaviors such as speed adaptation and direction changes in response to the structure and density of the biofilm. This modelling was then fitted on microscopy images of the different species navigating the two types of environments.
Different motion patterns for the three bacilli emerged, each showing different degrees of adapting to moving inside a biofilm. One species, in particular, was able to run straight in and out of this environment because it could adapt its speed to the biofilm density as well as randomly change direction.
The new method developed by Ravel et al. can be redeployed to systematically study swimmer candidates in different types of biofilms. This would allow scientists to examine how various swimming characteristics impact how bacteriakilling chemicals can penetrate the altered biofilms. In addition, as the mathematical model can predict trajectories, it could be used in computational studies to examine which species of bacilli would be best suited in industrial settings.
Introduction
Biofilm is the most abundant mode of life of bacteria and archaea on earth (Flemming and Wuertz, 2019; Flemming et al., 2016b). They are composed of spatially organized communities of microorganisms embedded in a selfproduced extracellular polymeric substances (EPS) matrix. EPS are typically forming a gel composed of a heterogenous mixture of water, polysaccharides, proteins, and DNA (Flemming et al., 2016a). The biofilm mode of life confers to the inhabitant microbial community strong ecological advantages such as resistance to mechanical or chemical stresses (Bridier et al., 2011) so that conventional antimicrobial treatments remain poorly efficient against biofilms (Bridier et al., 2015). Different mechanisms were invoked such as molecular diffusionreaction limitations in the biofilm matrix and the cell type diversification associated with stratified local microenvironments (Bridier et al., 2017). Biofilms can induce harmful consequences in several industrial applications, such as water (Beech and Sunner, 2004), or agrifood industry (Doulgeraki et al., 2017), leading to significant economic and health burden (Köck et al., 2010). Indeed, it was estimated that the biofilm mode of life is involved in 80% of human infection and usual chemical control leads to serious environmental issues (Bridier et al., 2011). Hence, finding efficient ways to improve biofilm treatment represents important societal sustainable perspectives.
Motile bacteria have been observed in host biofilms formed by exogenous bacterial species (Houry et al., 2012; Li et al., 2014; Piard et al., 2016; Flemming et al., 2016a). These bacterial swimmers are able to penetrate the dense population of host bacteria and to find their way in the interlace of EPS. Doing so, they visit the 3D structure of the biofilm, leaving behind them a trace in the biofilm structure, that is a zone of extracellular matrix free of host bacteria (Figure 1a and Appendix 1—figure 3). Hence, bacterial swimmers are digging a network of capillars in the biofilm, enhancing the diffusivity of large molecules (Houry et al., 2012), allowing the transport of biocide at the heart of the biofilm, reducing islands of living cells. The potentiality of bigger swimmers has also been studied for biofilm biocontrol, including spermatozoa (MayorgaMartinez et al., 2021), protozoans (Derlon et al., 2012), or metazoans (Klein et al., 2016). Recent results suggest a deeper role of bacterial swimmers in biofilm ecology with the concept of microbial hitchhiking: motile bacteria can transport sessile entities such as spores (Muok et al., 2021), phages (Yu et al., 2021) or even other bacteria (Samad et al., 2017), enhancing their dispersion within the biofilm. Hence, characterizing microbial swimming in the very specific environment of the biofilm matrix is of particular interest to decipher biofilm spatial regulations and their biocontrol, but more generally in an ecological perspective of microbial population dynamics in natural ecosystems.
Bacterial swimming is strongly influenced by the microtopography and bacteria deploy strategies to sense and adapt their motion to their environment (Lee et al., 2021), with specific implications for biofilm formation and dynamics (Conrad and PolingSkutvik, 2018). Modelbased studies were conducted to characterize bacterial active motion in interaction with an heterogeneous environment. An image and modelbased analysis showed nonlinear selfsimilar trajectories during chemotactic motion with obstacles (Koorehdavoudi et al., 2017). Theoretical studies explored Brownian dynamics of selfpropelled particles in interaction with filamentous structures such as EPS (Jabbarzadeh et al., 2014) or with random obstacles, exhibiting continuous limits and different motion regimes depending on obstacle densities (Chepizhko and Peruani, 2013; Chepizhko et al., 2013). Image analysis characterized different swimming patterns in polymeric fluids (Patteson et al., 2015), completed by detailed comparisons between a microscale model of flagellated bacteria in polymeric fluids and highthroughput images (Martinez et al., 2014). Models of bacterial swimmers in viscoelastic fluids were also developed to study the force fields encountered during their run (Li and Ardekani, 2016). However, to our knowledge, no study tried to characterize swimming patterns in the highly heterogeneous environment presented by an exogenous biofilm matrix.
In this study, we aim to provide a quantitative characterization of the different swimming behaviours in adaptation to the host biofilm matrix observed by microscopy. We focus on identifying potential speciesdependent swimming characteristics and quantifying the swimming speed and direction variations induced by the host biofilm structure. To address these goals, three different Bacillus species presenting contrasted physiological characteristics are selected. First, different trajectory descriptors accounting for interactions with the host biofilm are defined, allowing to discriminate the swim of these bacterial strains by differential analysis. Then, a mechanistic randomwalk model including swimming adaptations to the host biofilm is introduced. This model is numerically explored to identify the sensitivity of the trajectory descriptors to the model parameters. An inference strategy is designed to fit the model to 2D+T microscopy images. The method is validated on synthetic data and applied to a microscopy dataset to decipher the swimming behaviour of the three Bacillus.
Results
Ultrastuctural bacterial morphology
To investigate how the shape and propelling mechanism of bacteria can affect the way they navigate in a porous media such as a biofilm, we first image three bacterial swimmers –Bacillus pumilus (B. pumilus), Bacillus sphaericus (B. sphaericus), and Bacillus cereus (B. cereus) – by Transmitted Electron Microscopy (TEM) (Figure 2) to seek for potential structural and physiological differences. Important discrepancies can be observed between these Bacillus. First, they show noticeable difference in length and diameter, B. sphaericus being the longest bacteria by a factor of approximatively 1.5, and B. cereus and B. pumilus having similar size, but B. cereus showing a higher aspect ratio. Secondly, they do not have the same type of flagella: B. pumilus and B. sphaericus present several long flagella distributed over the whole surface of the membrane while B. cereus shows a unique brushlike bundle of very thin flagella, at its back tip.
We then used these three species to test if these ultrastructural differences could impact their swimming behaviour in a host biofilm or in a Newtonian control fluid: could the longer body of B. sphaericus be an impediment in a crowded environment such as a biofilm or on the contrary could its larger size give it a higher strength to cross the biofilm matrix? Is the unique brushlike flagella of B. cereus an advantage or a disadvantage to swim in a Newtonian fluid or in a host biofilm?
Characterizing bacterial swimming in a biofilm matrix through image descriptors
2D+T Confocal Laser Scanning Microscopy (CLSM) of the three Bacillus swimming in a Staphylococcus aureus (S. aureus) host biofilm or in a control Newtonian buffer are acquired (see Figure 1d). Swimmers and host biofilms are imaged with different fluorescent dyes, allowing their acquisition in different color channels, and to recover in the same spatiotemporal referential the swimmer trajectories and the host biofilm density (see Materials and methods, Figure 1 and Table 1). Namely, for each species $s$ and individual swimmer $i$, we recover the initial (${T}_{0,i}^{s}$) and final (${T}_{end,i}^{s}$) observation times (when the swimmer goes in and out the focal plane, see Materials and methods sect. Confocal Laser Scanning Microscopy [CLSM]), and the number ${T}_{i}^{s}$ of time points in the trajectory. We then extract from the 2D+T images the observed position, instantaneous speed and acceleration timeseries
Noting ${b}^{s}(t,x)$ the dynamic biofilm density maps obtained from the biofilm images, we also compute the local biofilm density and density gradient along trajectories
The angle ${\theta}_{i}^{s}(t)$ and the average velocity ${\overline{V}}_{i}^{s}(t)$ between two successive speed vectors are also collected (see Materials and methods sec. Postprocessing of image data).
Different swimming patterns can be deciphered by qualitative observations of the trajectories ${X}_{i}^{s}(t)$ (Figure 3) in the biofilm and in the control Newtonian buffer, and runandtumble swimming patterns are quantified with ${\theta}_{i}^{s}(t)$ and ${\overline{V}}_{i}^{s}(t)$ (Figure 4). B. sphaericus has a similar runandreverse behaviour in the biofilm and the control buffer with trajectories divided between back and forth paths around the starting point and long runs, the biofilm strongly impairing its speed and increasing the number of reverse events. By contrast, B. pumilus clearly switches its swimming behaviour in the biofilm, from quasistraight runs in the Newtonian buffer to a pronounced runandreverse behaviour in the biofilm with decreased speeds and chaotic trajectories. On the contrary, B. cereus swimmers manage to conserve comparable trajectories and distributions of swimming speed and direction in the biofilm compared to control. Interestingly, the number of reverse events is even reduced in the host biofilm for B. cereus.
For further quantitative analysis, trajectory descriptors are defined. We first investigate the distribution of the populationwide average acceleration and velocity norms $\frac{1}{{T}_{i}^{s}2}{\sum}_{t}\Vert {A}_{i}^{s}(t)\Vert$ and $\frac{1}{{T}_{i}^{s}1}{\sum}_{t}\Vert {V}_{i}^{s}(t)\Vert$, where $\parallel \cdot \parallel $ denotes the Euclidian norm. We also quantify the swimming kinematics by computing the travelled distance $dis{t}_{i}^{s}$ along the path and the total displacement $dis{p}_{i}^{s}$, that is the distance between the initial and final trajectory points, with
We finally compute the total biofilm area visited by a swimmer along its path (see Figure 1b). The same descriptors are computed in the control Newtonian buffer.
The three species present contrasted distributions for these descriptors (Figure 5). B. sphaericus has the smallest mean ($A=0.58$ and $V=0.70$) and median ($\parallel A\parallel =0.50$ and $\parallel V\parallel =0.53$) values of acceleration and speed, while B. pumilus has the widest distributions (difference between 95% and 5% centiles of 2.76 for $\parallel A\parallel $ and 2.45 for $\parallel V\parallel $ compared to 1.00, 1.51 and 1.90, 1.49 for B. sphaericus and B. cereus respectively). B. cereus for its part shows the highest accelerations, indicating larger changes in swimming velocities, but median and mean speeds comparable to B. pumilus (Figure 5, $\parallel A\parallel $ and $\parallel V\parallel $ panels). We also note that B. sphaericus and to a lower extent B. pumilus trajectories have a significant amount of null or small average speeds, while B. cereus trajectories have practically no zero velocity, consistently with the qualitative analysis (Figure 5, $\parallel V\parallel $ panels). Small velocities episodes of B. sphaericus and B. pumilus could occur during their backandforth trajectories, which produce small displacements and pull the displacement distribution towards lower values than B. cereus (Figure 5, Disp panel). B. pumilus displacement is intermediary. Conversely, backandforth trajectories can produce large swimming distances for B. sphaericus and B. pumilus (mean adimensioned value of 32.2 and 43.2 respectively) so that B. sphaericus has a distance distribution comparable to B. cereus (mean adimensioned value of 29.6, Figure 5, Dist panel), but lower than B. pumilus. Observing conjointly displacement and distance (Figure 5, lowerright panel) provides consistent insights: B. sphaericus shows a large variability of small displacement trajectories, from small to large distances, while B. cereus trajectory displacement seems to vary almost linearly with the distance at least for the points inside the isoline 50%. B. pumilus has again an intermediary distribution, with a large range of displacementdistance couples. The distributions of visited areas of B. pumilus and B. cereus are almost identical, and higher than B. sphaericus one. Compared to the control buffer, all descriptors are reduced in the biofilm. Consistently with previous observations, the displacement ($disp$) is strongly reduced for B. pumilus, and less impacted for B. sphaericus and B. cereus. These observations must be related to the behavioural switch for B. pumilus and to the identical swimming patterns for the two other Bacilii in the biofilm compared to the control fluid.
All together, this data depict (1) a longrange species, B. cereus, which moves efficiently in the biofilm during long, relatively straight, rapid runs, almost identically as in a Newtonian fluid (2) a shortrange species, B. sphaericus, that moves mainly locally in small areas in the biofilm and in the control buffer with lower accelerations and speeds except few exceptions (only 6% of its trajectories induced a displacement higher than $10\mu m$ compared to 28% for B. cereus and 26% for B. pumilus) and (3) a mediumrange species, B. pumilus, with a large diversity of rapid trajectories, from small to large displacement, and a behavioural change from straight runs in a Newtonian fluid to frequent runandreverse events in the biofilm. These kinematics discrepancies for B. pumilus and B. cereus allow them however to cover identical visited areas.
Though, these global descriptors do not inform about potential adaptations of the swimmers to the biofilm matrix. We first check if swimmer velocities are directly linked to the local biofilm density, and if the swimmers adapt their trajectory according to density gradients by plotting the points $(\parallel \nabla b(t,{X}_{i}^{s}(t))\parallel ,\parallel {A}_{i}^{s}(t)\parallel )$ and $(b(t,{X}_{i}^{s}(t)),\parallel {V}_{i}^{s}(t)\parallel )$ (Figure 5, lower panel). Clear differences between the three species can be deciphered. First, the three Bacillus do not have the same distribution of visited biofilm density and gradient. B. pumilus swimmers visit denser biofilm with higher variations than the other species while B. sphaericus and B. cereus stay in less dense and smoother areas, the quantile 0.5 of these species being circumscribed in low gradient and low density values. Next, B. cereus has a wider distribution of accelerations, specially for smalldensity gradients, compared to B. pumilus and B. sphaericus. This could indicate that when the biofilm is smooth, B. cereus samples its acceleration in a large distribution of possible values. Finally, we observe that the speed distribution rapidly drops for increasing biofilm densities for B. sphaericus and B. cereus, while the decrease is much smoother for B. pumilus. These observations provide additional insights in the species swimming characteristics: B. pumilus swimmers seem to be less inconvenienced by the host biofilm density than the other species, while B. cereus and B. sphaericus bacteria appear to be particularly impacted by higher densities and to favor low densities where it can efficiently move. Though, B. sphaericus has lower motile capabilities than B. cereus when the biofilm is not dense.
Analysis of swimming data with an integrative swimming model
This descriptive analysis does not allow to clearly identify potential mechanisms by which the swimmers adapt their swim to the biofilm structure or to simulate new speciesdependant trajectories. We then build a swimming model based on a Langevinlike equation on the acceleration that involves several swimming behaviours modelling the swimmer adaptation to the biofilm. Furthermore, after inference, new synthetic data can be produced by predicting swimmer random walks sharing characteristics comparable to the original data.
We consider bacterial swimmers as Lagrangian particles and we model the different forces involved in the update of their velocity $\mathbf{v}$. We assume that the swimmer motion can be modelled by a stochastic process with a deterministic drift (Figure 1c):
where the right hand side is composed of two deterministic terms in addition to a gaussian noise, each weighted by the parameters $\gamma $, $\beta $ and $\u03f5$.
The first term implements the biological observation (Figure 5, lower central panel) that the bacterial swimmers adapt their velocity to the biofilm density. This term can be interpreted as a speed selection term that pulls the instantaneous speed of the swimmer towards a prescribed target velocity $\alpha (b)$ that depends on the host biofilm density $b$. The weight $\gamma $ can be interpreted as a penalization coefficient. In such a formalism, the difference between the swimmer and the prescribed speed is divided by a relaxation time $\tau $ to be homogeneous to an acceleration. Hence, $\gamma $ is proportionally inverse to $\tau $, $\gamma \sim \frac{1}{\tau}$. As a firstorder approximation of the speed drop observed in Figure 5 for increasing $b$, the target speed $\alpha (b)$ is modeled as a linear variation between v_{0} and v_{1}, where v_{0} is the swimmer characteristic speed in the lowest density regions, where $b=0$, and v_{1} in the highest density zones where $b=1$:
The second term updates the velocity direction according to the local gradient of the biofilm density $\nabla b$. The sign of $\beta $ indicates if the swimmer is inclined to go up (negative $\beta $) or down (positive $\beta $) the host biofilm gradient, while the weight magnitude indicate the influence of this mechanism in the swimmer kinematics. We note that this term does not depend on the gradient magnitude but only on the gradient direction: this reflects the implicit assumption that the bacteria are able to sense density variations to find favorable directions, but that the biological sensors are not sensitive enough to evaluate the variation magnitudes.
The third term is a stochastic twodimensional diffusive process that models the dispersion around the deterministic drift modelled by the two first terms. We define
The term η can also be interpreted as a model of the modelling errors, tuned by the term $\u03f5$. Equation 1 is supplemented by an initial condition by swimmer. For vanishing $\parallel v\parallel $ or $\parallel \nabla b\parallel $ leading to an indetermination, the corresponding term in the equation is turned off.
Equation 1 links the observed biofilm density and the swimmer trajectories trough mechanistic swimming behaviours. The model fitting can be seen as an ANOVAlike integrative statistical analysis of the image data. It decomposes the observed acceleration variance between mechanistic processes describing different swimming traits in order to decipher their respective influence on the swimmer trajectories while integrating heterogeneous data (density maps $b$ and trajectories kinematics).
We can define characteristic speed and acceleration ${V}^{*}$ and ${A}^{*}$ in order to set a dimensionless version of Equation 1
where ${\gamma}^{\prime}=\frac{\gamma {V}^{*}}{{A}^{*}}$, ${v}_{0}^{\prime}=\frac{{v}_{0}}{{V}^{*}}$, ${v}_{1}^{\prime}=\frac{{v}_{1}}{{V}^{*}}$, ${\beta}^{\prime}=\frac{\beta}{{A}^{*}}$, ${\eta}^{\mathrm{\prime}}\sim \mathcal{N}(0,{\u03f5}^{\mathrm{\prime}})$ and ${\u03f5}^{\prime}=\frac{\u03f5}{{A}^{*2}}$.
This dimensionless version will strongly improve the inference process and will allow an analysis of the relative contribution of the different terms in the kinematics. An extended numerical exploration of this model is performed in Appendix 2 Sec. Numerical exploration on mock biofilm images to illustrate the impact of the different parameters on the trajectories, showing in particular the interplay between $\gamma $ and $\u03f5$: counterintuitively, straight lines are induced when the stochastic part $\u03f5$ is high compared to the speed selection parameter $\gamma $ (see also Appendix 2).
Inferring swimming parameters from trajectory data
For each bacterial swimmer population, we now seek to infer with a Bayesian method populationwide model parameters governing the swimming model of a given species from microscope observations.
Inference model setting
Equation (2) is rewritten as a state equation on the acceleration for the bacterial strain $s$ and the swimmer $i$
where
are speciesdependant equation parameters. The function ${f}_{A}$ can be seen as the deterministic drift of the random walk, gathering all the mechanisms included in the model. The interindividual variability of the swimmers of a same species comes from the swimmerdependent initial condition, the resulting biofilm matrix they encounter during their run, and the stochastic term.
Inferring the parameters ${\theta}^{s}$ can then be stated in a Bayesian framework as solving the non linear regression problem
from the data $b(t,X)$, ${X}_{i}^{s}(t)$, ${V}_{i}^{s}(t)$ and ${A}_{i}^{s}(t)$, with truncated normal prior distributions
and additional constrains on the parameters
We note that Equation (5) can be seen as a likelihood equation of the parameter ${\theta}^{s}$ knowing ${A}_{i}^{s}(t),b(t),{V}_{i}^{s}(t)$ and ${X}_{i}^{s}(t)$. The parameter ${\u03f5}^{s}$ can now be seen as a corrector of both modelling errors in the deterministic drift and observation errors between the observed and the true instantaneous acceleration. Alternative settings where these uncertainties sources are separated and a true state for position and acceleration is inferred can be defined (see Annex Various inference models). The inference problem is implemented in the Bayesian HMC solver Stan (Stan Development Team, 2018) using its python interface pystan (Riddell et al., 2021). Inference accuracy is thoroughly assessed on synthetic data (see Appendix 1 Assessment of the inference with synthetic data and Figure 6).
Analysis of the confocal microscopy dataset
We now solve the inference problem (5)(6) on the confocal microscopy dataset to identify populationwide swimming model parameters in order to decompose the swimmer kinematics in three mechanisms: biofilmrelated speed selection, densityinduced direction changes and random walk. The inference process is assessed by comparing the descriptors obtained on trajectories predicted by the fitted model (Figure 7a) with descriptors of real trajectories (Figure 5). The mean values of acceleration and speeds are accurately predicted for the three species ( Figure 7a panels $\parallel A\parallel $ and $\parallel V\parallel $, dashed lines). Relative positions of distance, displacement and visited area mean values are also correctly simulated (Figure 5 and Figure 7a, upper panel). B. sphaericus presents the lowest predicted accelerations and speeds while B. pumilus has the widest speed and acceleration distributions and B. cereus shows the highest accelerations, consistently with the data. The visited area and the distances are slightly over estimated, but the relative position and the shape of the distributions are conserved. The amount of null velocities for B. sphaericus is under estimated by the fitted model and not rendered for B. pumilus. The distance distributions of the three species are accurately predicted by the fitted model. When displaying conjointly the distance and the displacement (Figure 7a, right lower panel), the distribution of B. sphaericus is correctly predicted by the simulations, but B. cereus and B. pumilus displacements are underestimated. Some qualitative features can be recovered, such as the higher distribution of distancedistribution couples for B. cereus or higher displacement for B. cereus compared to B. sphaericus.
Descriptors of swimming adaptations to the host biofilm are also correctly preserved for the main part (Figure 5 and Figure 7 a, lower panel). B. pumilus is the species that crosses the highest biofilm densities in the fitted model simulations, showing the highest speeds in this crowded areas, and that visits the most frequently areas with high density gradients, consistently with the data. As in the confocal images, the simulated B. sphaericus and B. cereus favor smoother zones of the biofilm with lower biofilm densities. The B. cereus fitted model correctly renders the highest acceleration variance observed in the data for low biofilm gradients, while B. sphaericus speed and acceleration variance is the lowest for all ranges of biofilm densities and gradients, both in the data and in the fitted model predictions. The drop of speeds and accelerations for increasing biofilm densities and gradients is well predicted for B. pumilus, but is smoother in the simulation compared to the data for B. sphaericus and B. cereus. In particular, the sharp drop of speeds for $b\simeq 0.25$ observed in the data for B. cereus and B. sphaericus is underestimated by the fitted model.
All together, the model reproduces very accurately the mean values of acceleration, speed and visited area, renders relative positions and the main characteristics of distributions for distance, displacement and interactions with the host biofilm matrix, but produces less variable outputs than observed in the data, meaning that the model is less accurate in the distribution tails. The main features of the swimmer adaptation to the underlying biofilm are however correctly predicted by the model.
To further inform the fitted model accuracy, the coefficient of determination ${R}_{det}^{2}$ of the deterministic components ${f}_{A}({\theta}^{s},b(t),{V}_{i}^{s},{X}_{i}^{s}(t))$ of Equation 4 is computed (Table 2), in order to quantify the goodness of fit of the friction and gradient terms of (Equation 2) that represent interactions with the biofilm. These results highlight that B. cereus bacteria do present an important stochastic part in the accelerations, while the B. pumilus species is the best represented by our deterministic modelling.
The three species present very different inferred parameter values (Figure 7 b and Table 3), showing that the model inference captures contrasted swimming characteristics of these Bacillus. Due to the mechanistic terms introduced in Equation 1, these differences can be interpreted in term of speed and direction adaptations to the host biofilm. First, B. pumilus shows the highest v_{0} value, and the highest amplitude between v_{0} and v_{1}, inducing a higher ability for B. pumilus to swim fast in low density biofilm zones and strong deceleration in crowded area. In comparison, B. sphaericus presents the smallest amplitude between v_{0} and v_{1} showing a poor adaptation to biofilm density. B. cereus has the highest $\gamma $ value, showing a reduced relaxation time toward the density dependant speed: in other words, B. cereus is able to adapt its swimming speed more rapidly than the other species when the biofilm density varies. B. cereus swimmers are also better able to change their swimming direction in function of the biofilm variations they encounter along their way, their $\beta $ distribution being markedly higher than the other species which have very low $\beta $. Finally, the stochastic parameter $\u03f5$ is also contrasted, from a low distribution for B. sphaericus to high values for B. cereus. All together, the inference complete the observations made in Figure 5: B. pumilus poorly adapts its swimming direction to the host biofilm (low $\beta $) but has a wide range of possible speeds when the biofilm density varies (high v_{0}, low v_{1}), that it can reach quite rapidly (intermediary $\gamma $) with intermediary stochastic correction ($\u03f5$). In contrast, B. cereus reaches lower speed values (intermediary v_{0}, low v_{1}) but is more agile to adapt its swimming to its environment by changing rapidly its speed when the biofilm density is more favorable (highest $\gamma $) and adapting its swimming direction to biofilm variations, with higher stochastic variability (large $\u03f5$). Finally, B. sphaericus is the less flexible of the three bacteria: less fast (smallest difference between v_{0} and v_{1}), they are also less responsive to biofilm variations (small $\gamma $ and $\beta $) with low random perturbations (small $\u03f5$).
Finally, after inference, the impact of each term in the overall acceleration data can be quantified and analyzed by displaying its relative contribution in a ternary plot (Appendix 2—figure 6). This relative contribution can be measured thanks to the swimming model which integrates these different mechanisms in the same inference problem. The direction selection is the least influential mechanism for the three species, with a slightly higher impact for B. cereus (50% and 95% isolines slightly shifted towards $A(\nabla b)$ in Appendix 2—figure 6a). When zooming in, the three Bacillus show differences in the balance between speed selection and the random term (Appendix 2—figure 6b): while B. pumilus is slightly more influenced by the friction term than by stochasticity, these mechanisms are perfectly balanced in B. sphaericus accelerations, while B. cereus is more influenced by the random term.
Interpretation of the bacterial swimming at the light of their morphology
Kinematics descriptors and swimming parameters can then be reinterpreted through the insights provided by the morphology of each bacteria species as shown in Figure 2. As observed in Figure 2, B. pumilus and B. sphaericus are flagellated whereas B. cereus is equipped by a unique brushlike bundle of thin flagella at its tail. This morphology can be linked to their swimming patterns. The flagella could be linked to the runandtumble behaviour of B. pumilus and B. sphaericus, as shown for other flagellated bacteria such as E. coli, the tumbling events of which are induced by reverse rotation of the cellular motor of its multiple flagella (Patteson et al., 2015). Additional functional characteristics may discriminate B. pumilus and B. sphaericus, since runandreverse swimming is the natural behaviour of B. sphaericus even in the Newtonian control buffer, whereas B. pumilus drastically reduces its speed in highdensity biofilms (Figure 7, a) and starts tumbling in the host biofilm (Figure 4). B. pumilus has the highest number of flagella and is the bacteria that reaches the highest speeds specially in the Newtonian buffer and in lowdensity areas, indicating that this characteristic may be an advantage for swimming fast in the extracellular matrix. The kind, size and disposition of the flagella bundle may help B. cereus swimmers to adapt their runs to their environment by changing directions to follow lower density areas (higher impact of direction selection term of the three Bacillus in Appendix 2—figure 6) or to adapt rapidly when biofilm density varies (largest $\gamma $). B. cereus being the bacteria with the strongest stochastic part (highest $\u03f5$, density shifted towards $A(\u03f5)$ in Appendix 2—figure 6), this morphology could also help the swimmer to go through the biofilm by random navigation, which helps to maintain comparable straight trajectory with or without biofilm when the stochastic part is higher than the speed selection term (Appendix 1—figure 1, Appendix 2—figure 3 and Appendix 2—figure 6). Finally, B. sphaericus bacteria are much longer than the other two species, which may explain why this species is the least motile in terms of acceleration and kinematics, both in biofilms and in the Newtonian control buffer.
Discussion
Modelling and analysis of swimming trajectories
When analyzing microbial swimming trajectories, two general strategies can be found in the literature. The first one aims at designing statistical tests quantifying similarities with or deviations from typical motion of interest such as diffusion (Patteson et al., 2015). Another strategy consists in providing a generative model of the data, analyzing it (Chepizhko and Peruani, 2013; Chepizhko et al., 2013) and comparing model outputs with real data (Koorehdavoudi et al., 2017; Jabbarzadeh et al., 2014), possibly after inference. The model that is studied in this paper belong to the second category: the model includes deterministic mechanisms describing interactions with the host biofilm, together with a random correction counterbalancing the modelling errors. The parameter inference allows to interpret the data variance relatively to speed or direction adaptations to the host biofilm versus residual effects gathered in the stochastic term. This method is comparable to ANOVAlike multivariate analysis: the parametric phenomelogical mappings between explicative covariables and a swimming behaviour (for example the function defining speed selection from biofilm density) are gathered in the same inference problem, enabling to decompose acceleration variability between the different swimming behaviours. This integrative method allows for multidata integration and coanalysis. Furthermore, the fitted model allows to simulate typical swimming trajectories of a given species.
Populationwide swimming characteristics vs truestate inference
In this study, we do not aim to recover ’true’ swimmer trajectories (a.e. the blue trajectory in Appendix 2—figure 4), that is identifying through smoothing techniques an approximation of the specific realization of the stochastic modeling and observation errors that lead to a given ’observed’ trajectory. Rather, the goal is to identify common characteristics shared by a population of trajectories by inferring the ‘populationwide’ parameters (the parameters $\alpha $, $\beta $, v_{0}, v_{1}, $\gamma $, and $\u03f5$) that best explain the whole set of observed accelerations in a same population of swimmers. For this reason, we did not introduce swimmerspecific terms nor individual noise: they would have increased the model accuracy, but to the price of a blurrier characterization of the species specificities.
This choice determined our inference framework. Despite several alternative options for recovering hidden states, in particular SSM (space state models) which are common in spatial ecology (Auger‐Méthé et al., 2021), the Bayesian method we opted for is a simpler nonlinear regression problem that proved to be sufficient to recover macroscopic features of swimmer trajectories and species stratification. We discuss in Appendix 3 Various inference models the different options that were tested and present in Materials and methods Sec. Inference the method for noise model selection. Among other interesting features, the Bayesian method provides confidence intervals on the final parameter estimation, and on the resulting trajectories as in Figure 6a.
Predictive capabilities of the model
The deterministic terms of the model explain only half of the variance (Table 2). A major part of the underlying mechanisms is not correctly described by our model which is a common feature since it is a phenomenological model which only considers interactions with the underlying biofilm at a macroscopic level, without taking into account nanoscale physical mechanisms. A more detailed description of the underlying physics could have been designed as in Martinez et al., 2014, but it would have made more complex the analysis of the interactions between the host biofilm and the swimmer trajectories and the extraction of speciesspecific patterns. However, we note that our model correctly renders observations made through macroscopic trajectory descriptors, even though the inference process has not been made based on these observables. Furthermore, several repetitions of the same models with different samples of the stochastic terms give very similar values for the trajectory descriptors (see Appendix 2—figure 5 and section Influence of inference and stochastic terms on the trajectory descriptors), showing that these descriptors are robust to stochastic perturbations. Hence, the model (2) can be used to produce synthetic data sharing the same global characteristics than the original ones specifically taking into accounts interactions between the swimmers and the host biofilm. Furthermore, these predictions also reproduce the species stratification observed in the original data using the global descriptors.
Biological interpretation of the fitted models
The direction selection term of the equation driven by $\beta $ has little impact in the swimmer model fitted on real data. However, the parameter $\beta $ can have a sensible impact on the kinematics as shown in the sensitivity analysis, and on the trajectories in mock biofilms (Appendix 2—figure 1). This could indicate that direction selection based on biofilm gradients is marginally effective in reallife swimming trajectories in a biofilm matrix. On the contrary, the speed selection term is more effective for the three Bacillus, showing that these microswimmer are able to adapt their swimming velocity to the biofilm density faced during their run. This term acts as an inertial term which enhances the stochastic term to provide direction and velocity changes.
The model has been used to decipher different adaptation strategies to the host biofilm of the three species during their swim. It confirms that B. sphaericus are the less motile bacteria in the biofilm, with reduced speeds and adaptation capabilities as indicated by the smallest model parameter values and a stereotypic runandreverse behaviour inside or outside the biofilm. B. pumilus on the contrary drastically changes its swimming behaviour in the biofilm compared to the Newtonian control buffer, which is reflected in the model by a high amplitude between v_{0} and v_{1} and a high $\gamma $ that indicates a rapid adaptation for varying biofilm densities. B. cereus shows the highest adaptation ability to the biofilm matrix, with the highest $\gamma $ and $\beta $ reflecting biofilminduced speed and direction changes. Furthermore, the high stochastic effects (highest ∈) higher than the speed selection term tuned by $\gamma $ (see Appendix 2—figure 6) allows this swimmer to conserve straight runs in the biofilm (see Appendix 2 Sec. Friction and random term in Langevin equations.) in the same way than in the control Newtonian fluid.
This characterization methodology could be used to drive species selection for improved biofilm control. Furthermore, the model can be used to predict new trajectories and the resulting biofilm vascularization, in a similar framework as in Houry et al., 2012. Coupled with a model of biocide diffusion, these simulations could be used to test numerically the efficiency of mono or multispecies swimmer pretreatment to improve the removal of the host biofilm.
Flagellated bacteria in polymeric solutions
Characterization of flagellated bacteria motility in polymeric solutions is a very active research area (Martinez et al., 2014; Patteson et al., 2015; Zöttl and Yeomans, 2019; Qu and Breuer, 2020; Qu et al., 2018). Speed and direction variations have been measured for various polymeric fluids with different viscoelastic properties. For the model bacteria E. coli in polymeric solutions, enhanced viscosity decreases tumbling while increased elasticity speeds up the swimmers (Patteson et al., 2015; Zöttl and Yeomans, 2019). In our experiments on the contrary, we observed decreased speeds and strong enhancement of reverse events for the flagellated B. sphaericus and B. pumilus in the biofilm compared to the Newtonian control buffer. However, the experimental setup shows strong differences: the complex rheology of S. aureus biofilms may strongly differ from polymeric fluids even if under certain condition they can be considered as viscoelastic fluids (Gloag et al., 2020), impacting differently the swimmer behaviours. Furthermore, the physiology of the motor cell in the Grampositive Bacillus differs from the one of the Gramnegative E. coli (Terahara et al., 2020; Szurmant and Ordal, 2004; Subramanian and Kearns, 2019). Finally, the particular brushlike flagella bundle of B. cereus may allow this species to conserve the same swimming in Newtonian and crowded environments, by adapting its swimming speed to the local density and otherwise randomly selecting swimming directions across the host biofilm. To generalize this approach to other contexts, this study should be reproduced for other swimmers and other host biofilms, together with polymeric fluids and porous media, including biochemical interactions.
Materials and methods
Infiltration of host biofilms by bacilli swimmers
Request a detailed protocolInfiltration of S. aureus biofilms by bacilli swimmers were prepared in 96well microplates. Submerged biofilms were grown on the surface of polystyrene 96well microtiter plates with a μ clear base (Greiner Bioone, France) enabling highresolution fluorescence imaging (Bridier et al., 2010). 200 μL of an overnight S. aureus RN4220 pALC2084 expressing GFP (Malone et al., 2009) cultured in TSB (adjusted to an OD 600 $\mathrm{nm}$ of 0.02) were added in each well. The microtiter plate was then incubated at 30°C for 60 $\mathrm{min}$ to allow the bacteria to adhere to the bottom of the wells. Wells were then rinsed with TSB to eliminate nonadherent bacteria and refilled with 200 μL of sterile TSB prior incubation at 30 $\mathrm{celsius}$ for 24 h. In parallel, B. sphaericus 9 A12, B. pumilus 3 F3 and B. cereus 10B3 were cultivated overnight planktonically in TSB at 30 °C. Overnight cultures were diluted 10 times and labelled in red with 5 μM of SYTO 61 (Molecular probes, France). After 5 min of contact, 50 μL of labelled fluorescent swimmers suspension were added immediately on the top of the S. aureus biofilm. All microscopic observations were collected within the following 30 min to avoid interference of the dyes with bacterial motility. Three replicates were conducted. The same protocol has been repeated without the host biofilm (control experiments): the swimmers are added to the buffer only which is a Newtonian fluid.
Confocal laser scanning microscopy (CLSM)
Request a detailed protocolThe 96 well microtiter plate containing 24 hr S. aureus biofilm and recently added bacilli swimmers were mounted on the motorized stage of a Leica SP8 AOBS inverter confocal laser scanning microscope (CLSM, LEICA Microsystems, Germany) at the MIMA2 platform (https://www6.jouy.inra.fr/mima2_eng/). Temperature was maintained at 30 $\mathrm{celsius}$ during all experiments. 2D+T acquisitions were performed with the following parameters: images of 147.62 × 147.62 $\mu \mathrm{m}$ were acquired at 8000 $\mathrm{Hz}$ using a 63×/1.2 N.A. To detect GFP, an argon laser at 488 $\mathrm{nm}$ set at 10% of the maximal intensity was used, and the emitted fluorescence was collected in the range 495–550 $\mathrm{nm}$ using hybrid detectors (HyD LEICA Microsystems, Germany). To detect the red fluorescence of SYTO61, a 633 $\mathrm{nm}$ heliumneon laser set at 25% and 2% of the maximal intensity was used, and fluorescence was collected in the range 650–750 $\mathrm{nm}$ using hybrid detectors. Images were collected during 30 s (see Table 1 for sampling period).
Bacterial swimmers navigate within a threedimensional biofilm matrix and confocal microscope refreshment time is not small enough to allow 3D+T images. To limit 3D trajectories, a focal plane near the well edge has been selected, where the well wall physically constrains the swimmer trajectories in one direction, which select longer trajectories in the 2D plane that can be tracked in time. Therefore, experimental data are composed of twodimensional trajectories captured between the swimmer arrival and departure times in the focal plane, and the associated 2D+T biofilm density images that change over time due to swimmer action.
To check that the host biofilm structure is identical near the well’s edge compared to other 2D slices, we took 4 replicates of S. aureus biofilms that were imaged in 3D using a stack of 6 horizontal images, starting from $z=0$ near the well’s edge, to $z=6\mathrm{\Delta}z$, at the interface between the biofilm and the bulk solution. To study the between and within biofilm density variability in the horizontal images, we subsampled them with a regular Cartesian 4 × 4 grid, resulting in a 4 × 6 x(4 × 4)=384 2D images database supplemented by metadata (stack, $z$ and $xy$ coordinate of the subsample), before computing a clustered pairwise correlation similarity matrix and a permanova.
Transmitted electron microscopy
Request a detailed protocolMaterials were directly adsorbed onto a carbon film membrane on a 300mesh copper grid, stained with 1% uranyl acetate, dissolved in distilled water, and dried at room temperature. Grids were examined with Hitachi HT7700 electron microscope operated at 80 $\mathrm{kV}$ (Elexience – France), and images were acquired with a chargecoupled device camera (AMT).
Postprocessing of image data
Request a detailed protocolSee Figure 1 for a sketch of the datastream from microscope raw images to model inputs and Appendix 1—figure 1 for data visualization at each step of the postprocessing pipeline.
Swimmer tracking has been applied on the red channel of the raw temporal stacks with IMARIS software (Oxford Instruments) using the tracking function after automated spots detection to get position timeseries for each swimmer. Timeseries with less than 8 time steps were filtered out.
Then, swimmer speed and acceleration timeseries were computed from their position by finitedifference approximations and trajectory descriptors were extracted. The RGB green channel corresponding to the biofilm density temporal images were converted into grayscale and rescalled between 0 and 1 (linear scalling).
Trajectory descriptors are defined as follows. The mean acceleration and speed values, distance and displacement are computed with ${\parallel A\parallel}_{i}^{s}=\frac{1}{{T}_{i}^{s}2}{\sum}_{t}\parallel {A}_{i}^{s}(t)\parallel $, ${\parallel V\parallel}_{i}^{s}=\frac{1}{{T}_{i}^{s}1}{\sum}_{t}\parallel {V}_{i}^{s}(t)\parallel $, $dis{t}_{i}^{s}=\mathrm{\Delta}t{\sum}_{{T}_{0,i}^{s}}^{{T}_{end,i}^{s}\mathrm{\Delta}t}\parallel {V}_{i}^{s}(t)\parallel $ and $dis{p}_{i}^{s}=\parallel X({T}_{end,i}^{s})X({T}_{0,i}^{s})\parallel $. To compute the visited area, each trajectory piece was subsampled by computing ${X}_{i}^{s}({t}_{k})=\frac{k}{{n}_{s}}{X}_{i}^{s}(t)+(1\frac{k}{{n}_{s}}){X}_{i}^{s}(t+\mathrm{\Delta}t)$ for $k=0,{n}_{s}$, with ${n}_{s}=10$ and the pixels included in the ball $B({X}_{i}^{s}({t}_{k}),r)$ with radius $r=2$ were labeled. The total area of the labelled pixels is defined as the visited area of the swimmer $i$ of species $s$.
To assess runandtumble behaviour, the angle ${\theta}_{i}^{s}(t)$ and the mean velocity ${\overline{V}}_{i}^{s}(t)$ between two consecutive speed vectors are defined with ${\theta}_{i}^{s}(t)$ = $\mathrm{arccos}(({V}_{i}^{s}(t)\cdot {V}_{i}^{s}(t\mathrm{\Delta}t))/(\Vert {V}_{i}^{s}(t)\Vert \Vert {V}_{i}^{s}(t\mathrm{\Delta}t)\Vert ))$ and ${\overline{V}}_{i}^{s}(t)=(\Vert {V}_{i}^{s}(t)\Vert +\Vert {V}_{i}^{s}(t\mathrm{\Delta}t)\Vert )/2$, for $t\in ({T}_{0,i}^{s}+\mathrm{\Delta}t,{T}_{end,i}^{s})$.
Postprocessed data are available at https://forgemia.inra.fr/bioswimmers/swiminfer/SwimmerData.
Computation of the forward swimming model
Request a detailed protocolTime integration of equations (2) has been solved with an explicit Euler scheme regarding positions ${\mathbf{x}}_{i,t}^{s}$ and velocities ${\mathbf{v}}_{i,t}^{s}$ of the swimmer $i$ of species $s$ at time $t$:
where ${\mathbf{dv}}_{i,t}^{s}$ is given by Equation 2, and depends on ${\theta}^{s}$, ${V}_{i,t}^{s}$, ${x}_{i,t}^{s}$, $b(t,{x}_{i,t}^{s})$ and $\nabla b(t,{x}_{i,t}^{s})$. In practice, the biofilm density and gradient maps $b$ and $\mathrm{\nabla}b$ are discretized with a Cartesian grid corresponding to the image pixels.
During random walks, swimmer may exit the biofilm domain. When the swimmer reaches the domain boundary, a new swimmer is introduced with a velocity oriented towards the interior of the domain while the original trajectory is stopped at the boundary.
Sensitivity analysis
Request a detailed protocolA local sensitivity analysis (Figure 1) is performed by comparing basal simulation obtained with $\gamma =\beta =\u03f5=1$ (v_{0} and v_{1} where taken as in Appendix 1—table 3) with 3 simulations where $\gamma $, $\beta $ and $\u03f5$ are alternatively set to 0, resulting in 3 alternative models where the speed or the direction selection or the random term is turned off. The interaction between the speed selection term (set by $\gamma $) and the random term is illustrated in Appendix 2—figure 3 where 5 repetitions of the same trajectory of a simplified Langevin equation (11) are displayed with or without friction ($\gamma =1$ or $\gamma =0$), but with the same random seed for the stochastic term so that the stochastic part is strictly identical.
To analyze the impacts of the nondimensionalized swimming parameters $\gamma $, v_{0}, v_{1}, $\beta $, $\u03f5$ on the locomotion behaviour, a global sensitivity analysis has been performed. The parameter space ${[0,1]}^{5}$ was uniformly sampled with n = 1000 points using the Fourier Amplitude Sensitivity Test (FAST) sampler of the SALib library that is the function SALib.sample.fast_sampler.sample (Cukier et al., 1973; Saltelli et al., 1999). We note that the interval $[0,1]$ covers a large parameter domain for some parameters, in particular $\beta $ which remains small after inference. For this parameter, the sensitivity analysis will show potential impact on the output, that may be ineffective in the parameter range of the inferred model.
For each point in the parameter space, a forward simulation is conducted on a population of swimmers on a representative biofilm extracted from the dataset (first batch of the B. pumilus dataset). Trajectory descriptors are then extracted and taken as observable of the sensitivity anaylsis that requires both the parameters sampling and the associated descriptors. Sobol indices of first order are then returned and pairwise partial correlations matrix has been calculated. Convergence of the Sobol indices has been checked by taking subsamples containing less than $1,000$ points.
Inference
Numerical implementation
Request a detailed protocolThe inverse problem (4)(6) has been implemented using a Hamiltonian Monte Carlo (HMC) method to solve this Bayesian inference problem.
The three replicates for each swimmer species are pooled (trajectories and biofilm density maps) and the input data required for the inference procedure (velocity $\mathbf{yV}$ and acceleration $\mathbf{yA}$ times series for the whole batch of swimmers, biofilm densities $yb$ and gradient $\mathbf{yGb}$ extracted at swimmer positions) were assembled in a customed data structured. Normal standard prior distributions were set for all swimming parameters $\theta =(\gamma ,{v}_{0},{v}_{1},\beta ,\u03f5)$. Additional positivity constrained were imposed for all parameters but $\beta $. Therefore, the implemented model can be summarized as:
A warmup of 1000 runs is followed by the Markov chains construction (4,000 iterations for 4 Markov chains). Markov chain convergence is assessed by direct visualization (Appendix 1—figure 4) by checking for biaised covariance structures in pairplots (Appendix 1—figure 5). Standard convergence index were additionnaly computed: effective sample size per iteration (${n}_{eff}$) and potential scale reduction factor (${R}_{hat}$).
Noise model selection
Request a detailed protocolDifferent noise models have been evaluated for the regression model (5) to take into account batch or individual effects. Namely, we decomposed the noise in Equation 5 by replacing ${\eta}^{\mathbf{s}}$ by $\eta ^{\mathbf{s}}{}_{i}$ and/or ${\eta}^{\mathbf{s},\mathbf{b}}$ for individual $i$ and experimental batch $b$. Model selection has been conducted by computing the WAIC for the different noise models. A huge degradation of the WAIC has been observed for individual or batch dependant noises, indicating that the enhancement of the inference accuracy provided by the additional parameters can be considered as overfitting and discarded.
Inference validation on synthetic data
Ground truth data construction
Request a detailed protocolGround truth synthetic data (see section Assessment of the inference with synthetic data) were computed by solving Equations 2; 8 with $\gamma =10{s}^{1}$, ${v}_{0}=5\mu m.{s}^{1}$, ${v}_{1}=1\mu m.{s}^{1}$, $\beta =10\mu m.{s}^{2}$, $\u03f5=40\mu m.{s}^{2}$ and biofilm maps taken from the first batch of the B. pumilus dataset. The number of swimmers was fixed to $N=50$ and the number of time steps was taken identical to the experimental data that is ${N}_{t}=224$. Resulting mean speeds and accelerations were ${A}_{ref}=68.29\mu m.{s}^{2}$, ${V}_{ref}=7.47\mu m.{s}^{1}$ and were used to rescale the data before inference together with the ground truth parameters (Appendix 1—table 3). In total, the acceleration dataset contains 9,523 samples for each spatial direction.
Comparing ground truth data with the fitted model
Request a detailed protocolAfter inference, a new dataset is obtained by solving Equation 8 with the fitted parameters. The same initial conditions for speeds and positions as the ground truth data are taken. Trajectories are stopped after the same number of time step as in the corresponding trajectory of the ground truth dataset. To discard spurious stochastic uncertainties, the same random seed as the ground truth simulations was taken, so that the unique uncertainty source was inference errors.
Checking the sensitivity to biofilm image noise
Request a detailed protocolTo produce Appendix 1—figure 6, the biofilm density and the biofilm density gradient maps have been noised with an additive gaussian noise with increasing variance, before inference: we set
where ${\sigma}_{b}$ is the variance observed in the original data, and ${\u03f5}_{b}$ and ${\u03f5}_{\nabla b}$ are respectively the noise applied to the biofilm density and the biofilm density gradient. The parameter $l\in [0,0.01,0.02,0.03,0.04,0.05]$ is increased to apply a noise from 0% to 5%.
Inference validation on experimental data
Comparing microscopy data with the fitted model
Request a detailed protocolThe same procedure is repeated on the microscopy data: after inference, a new dataset is obtained by solving Equation 8 with the fitted parameter, taking the same initial conditions for speeds and positions. Trajectories are stopped after the same number of time step as in the corresponding trajectory of the ground truth experimental dataset.
Measuring the deterministic reconstruction
Request a detailed protocolThe deterministic coefficient of determination ${R}_{det}^{2}$ was computed to measure how much the dataset is explained by the deterministic part of the model. Setting ${A}_{i}^{s,det}={f}_{A}(\gamma ,{v}_{0},{v}_{1},\beta yb,\mathbf{yV},yb,\mathbf{yGb},dt)$:
where $\overline{y{A}^{s}}$ is the acceleration mean. ${R}_{det}^{2,s}$ is expected to tend towards 1 when the stochastic term $\eta =\mathcal{N}(0,\u03f5)$ becomes negligible with respect to ${A}^{det}$.
Plots and statistics
Request a detailed protocolTo allow interspecies comparisons in plots, the data and model outputs are renormalized with common reference values ${A}_{ref}$ and ${V}_{ref}$ defined as the average of the species reference values (see Table 2 for values). Unidimensional distributions (Figure 5 upper panel, Figure 6b upper panel, Figure 7a, upper panel, and Figure 7b) were obtained with the gaussian_kde function of scipy.stats. T tests for mean comparison were performed using scipy.stats ttest_ind.
Twodimensional distribution plots (Figures 5 and 6 b, Figure 7a lower panels) were obtained by first plotting the twodimensional point cloud and approximating the point distribution with a gaussian KDE using scipy.stats gaussian_kde function. Then, the gaussian kde is evaluated at each point of the point cloud and quantiles 0.05, 0.5, and 0.95 of the resulting values are computed. Finally, quantile isovalues are plotted and the point cloud and the KDE are removed (see Appendix 4—figure 1 and Sec. KDE computation for details): this procedure ensures to enclose 5, 50% and 95% of the original points, centered in the densest zones of the initial point cloud.
Ternary plots (Appendix 2—figure 6) were obtained by first computing the contribution of each term of equation (4) to acceleration estimate. Namely, note
We compute the proportions $A{(k)}_{i}^{s}$ for $k\in \{b,\nabla b,\eta \}$,
Points $(A{(b)}_{i}^{s},A{(\nabla b)}_{i}^{s},A{(\eta )}_{i}^{s})$ are then plotted in ternary plots using the Ternary python package (Weinstein et al., 2019) and approximated by gaussian KDE. Isolines are finally plotted as previously described.
To construct the plot in Appendix 1—figure 2, pairwise correlation of the biofilm density in the 384 samples has been computed (scikitlearn pairwise_distances, ‘correlation’ metric parameter Pedregosa et al., 2011), and the resulting similarity matrix has been displayed using Seaborn package clustermap function (Waskom, 2021) after hierarchical clustering (scipy.cluster.hierarchy linkage function Virtanen et al., 2020). Additional permanova has been computed to assess the significance of betweengroup dissimilarities using stats.distance package permanova function (scikitbio development team, 2020 ).
Code availability
Request a detailed protocolAll the image pre and postprocessing, calculations and statistics have been performed with custom scripts using the standard python libraries numpy (Harris et al., 2020), scipy (Virtanen et al., 2020), imageio (Klein, 2021), and pandas (McKinney, 2010). The forward swimming problem computation is computed using customed scripts built upon numpy (Harris et al., 2020) and H5py (https://www.h5py.org). Sensitivity analysis has been conducted with the SALib library (Cukier et al., 1973; Saltelli et al., 1999) (Sobol index, function SALib.analyze.fast.analyze) and the pingouin library (Vallat, 2018) (PCC, pcorr method). The Bayesian inference has been conducted using the STAN library (Stan Development Team, 2018) through its python interface pystan (Riddell et al., 2021). All plots have been made with the matplotlib python library (Hunter, 2007).
The whole python code have been made available and accessible at the following git repository https://forgemia.inra.fr/bioswimmers/swiminfer.
Appendix 1
Illustration of the datastream
Data acquisition
Illustrations of the image data at different steps of the data stream are displayed in Appendix 1—figure 1, from raw microscopy data to rescalled biofilm density map with trajectories. The contrast of the original 2 chanel image has been enhanced for visualization. The RGB biofilm density temporal images (see Materials and methods) were converted into grayscale and rescalled between 0 and 1 (linear scalling). In this images, for illustrations, trajectories are mapped into the biofilm density map and rescaled density map at initial condition of the first B. pumilus batch. In the dataset, the trajectories are associated with the corresponding biofilm map: ${X}_{i}^{s}(t)$ is associated with the value $b(t,{X}_{i}^{s}(t))$ for swimmer $i$ of species $s$ at time $t$. As the biofilm density map is also a timeseries, the trajectories can hardly be represented on the underlying biofilm that also changes in time.
Assessing the 3D structure of the biofilm
We check that the selection of a 2D focal plan does not induce an additional bias by overselecting biofilm areas with specific structures near the well’s edge. To do so, we assembled an additional dataset of 4 replicates of S. aureus 3D images (see Materials and methods, section 4.2, and Appendix 1—figure 2.A for the dataset assembly) of horizontal image subsamples, and computed their within and between dissimilarities (see Materials and methods), section Plots and statistics. The resulting pairwise correlation matrix is displayed in Appendix 1—figure 2 after hierarchical clustering. It shows that the $z$ direction does not structure the information, since the images are not clustered according to their $z$ coordinates contrary to the stack or the $xy$ coordinate labels. Permanova analysis shows that the differences between stacks and $xy$ subsamples are significant ($pvalue=1e4$) but not between horizontal images ($pvalue=1$).
Illustration of pore formation
As strongly documented in Houry et al., 2012, swimmers can dig pores in a exogenous biofilm, which enhance the biofilm innervation and facilitate the penetration of macromolecules. To illustrate the pore formation, we show two successive images taken from a 2D temporal stack of B. sphaericus swimmers in a S. aureus host biofilm in Figure 3. In the dashed ellipse, we can see a swimmer that has moved in the two successive images, letting behind it an empty space free from host bacteria.
Statistical tests
Ttests were performed to compare mean differences between 1D distribution of Figure 5. Resulting pvalues are displayed in Appendix 1—table 2.
Assessment of the inference with synthetic data
To assess the inference method, synthetic data are built and will be used as reference for assessment. We arbitrarily fix a parameter vector and solve system (1) from random initial positions, in a host biofilm arbitrarily chosen in the image dataset. We then extract the swimmer positions at given timesteps and recover accelerations and speeds with the same postprocessing pipeline as for microscopy images and solve the inverse problem (5)(6). If the inference process correctly works, we expect to recover the original parameters (the ground truth).
The ground truth parameters are correctly recovered by the inference procedure (Appendix 1—table 3), indicating that the parameters are correctly identifiable and that the inverse problem is wellposed. An error of respectively 1.28, 1.34, 2.98% and 0.68% on the parameters $\gamma $, v_{0}, v_{1} and $\u03f5$ is observed in this controlled situation, $\beta $ being inferred with lower accuracy (9.59 %). This estimate is robust to noise on the biofilm data, with highest impact on $\beta $ (Appendix 1—figure 6). To assess the impact of parameter inference uncertainties on trajectory computation, the posterior parameter distribution is sampled and new trajectories are computed, replacing the groundtruth parameters by the sampled ones. The swimmer ground truth trajectories are accurately recovered: the sampled trajectories tightly frame the original swimmer path as illustrated on a randomly chosen trajectory (Figure 6a). We note that an identical random seed has been taken for these simulations, including the ground truth trajectory, in order to turn off the stochastic uncertainties and only focus on the propagation of inference errors during simulations of swimmer trajectories.
Finally, we reassemble a synthetic dataset by replacing the groundtruth parameters by the inferred ones, that is the posterior mean. Qqplot of the fitted model accelerations versus the ground truth accelerations give an excellent accuracy (Figure 6d–e), with all the points lying on the bisector, except slight divergences on the distribution tails. The fitted model trajectories visually reproduce the qualitative characteristics of the original dataset (Figure 6c). The trajectory descriptors of section Characterizing bacterial swimming in a biofilm matrix through image descriptors are then computed on both datasets (ground truth and inferred) and compared (Figure 6b). The kinematics descriptors, that is acceleration and speed distributions, are very accurately recovered with a relative error of 0.1%, 3.2%, 5% for respectively the mean, quantiles 0.05 and 0.95 of the acceleration (resp. 0.9%, 2.5%,2% for speed). Some small discrepancies can be observed on the distance and displacement distributions, even if the mean and the quantiles 0.05 and 0.95 are close. The interactions between the host biofilm and the acceleration and speed distribution are also recovered with high accuracy. We note that part of the observed discrepancies comes from an additional source of variability of the simulation framework: when a swimmer reaches a domain boundary during a simulation, its trajectory is stopped and a new swimmer is randomly introduced elsewhere in the biofilm (see Materials and methods for more details). This simulation strategy seems to be responsible of the overrepresentation of short trajectories in the inferred dataset, compared to the ground truth (Figure 6b upper panel, distance and displacement distributions).
Markov chains convergence and correlation
Markov chain (Appendix 1—figure 4) and markov chain pairplots (Appendix 1—figure 5) are displayed. Direct visualization of the posterior sampling allows to detect convergence failure (strong autocorrelation or stationnary markov chain). Markov chain pairplot informs on potential correlation between different parameters posterior samples, showing an interaction between parameter and an identification issue. In Appendix 1—figure 4, the markov chains correctly converged for all the parameter. No strong correlation can be observed in Appendix 1—figure 5.
Impact of noise on biofilm data
The impact of noise on the parameter inference is assessed by noising the biofilm density and the biofilm density gradients with an additive gaussian noise with increasing variance (Appendix 1—figure 6). The noise variance is scaled with the variance observed in the original data. Namely, we set
and
where ${\sigma}_{b}$ is the variance observed in the original data, ${\u03f5}_{b}$ and ${\u03f5}_{\nabla b}$ are respectively the noise applied to the biofilm density and the biofilm density gradient and $\mathrm{\Delta}x$ is a pixel width. The parameter $l\in [0,0.01,0.02,0.03,0.04,0.05]$ is increased to apply a noise from 0% to 5%.
We can observe that the estimate of only two parameters is impacted by noising the biofilm inputs: the estimate of $\beta $ and v_{1}. The parameter $\beta $ is also the parameter which is the less accurately inferred when no noise is added (5%). Its estimation relative error increases up to 35% when 5% noise is added. The parameter $\beta $ tunes the direction selection, which is the less effective process in the swimmer model. The other parameters are recovered with correct accuracy (kept under 18% for v_{1}, and under 6% otherwise).
Appendix 2
Numerical exploration
To illustrate the impact of each parameter on the interplay between the host biofilm and the swimmers trajectories, the model (2) was first computed on two mock biofilms. The first one is a square linear density gradient and the second is composed of large pores on a textured background mimicking the dense biofilm zones (Appendix 2—figure 1a). A basal simulation is computed with $\gamma =\beta =\u03f5=1$ and will be used later on as reference for comparisons. These three parameters are alternatively set to zero to assess the resulting trajectories when the speed selection, the direction selection or the random term is shut down. Suppressing speed selection results in rectilinear trajectories ($\gamma =0$, Appendix 2—figure 1c), which is rather counterintuitive since the remaining terms are designed to tune the direction. A discussion of this phenomenon is provided in Appendix 2 Influence of inference and stochastic terms on the trajectory descriptors. When suppressing direction selection ($\beta =0$, Appendix 2—figure 1d), the trajectories are no longer drifted downwards the gradient in the upper panel as in the basal simulation, and no longer follow the pores (lower panel). If the stochastic term is shut down ($\u03f5=0$, Appendix 2—figure 1e), the trajectories directly go down the gradients and are trapped in the center of the image in the upper panel. When a pore is found along the run, the swimmer keeps following it without being able to escape the pore any longer unlike the basal situation (lower panel).
The link between the model parameters and the global trajectory descriptors introduced in Section Characterizing bacterial swimming in a biofilm matrix through image descriptors is less intuitive. A global sensitivity analysis of the trajectory descriptors (mean acceleration and speed, distance, displacement and visited areas) with respect to the parameters $\gamma $, v_{0}, v_{1}, $\beta $ and $\u03f5$ is conducted in Model sensitivity analysis by computing their first order Sobol index (SI) and their pairwise correlation coefficient (PCC). The sensitivity analysis shows that the mean speed is mainly influenced by $\gamma $ and $\u03f5$ with slightly negative and positive impact respectively, while acceleration is rather influenced by $\beta $ and $\u03f5$ with positive impact. The link between the parameters and the other descriptors is more complex, including non linear effects (strong SI and small PCC) and parameter interactions (higher SI residuals, see Appendix 2—figure 2).
Model sensitivity analysis
The link between the model parameters and the global trajectory descriptors introduced in Section Characterizing bacterial swimming in a biofilm matrix through image descriptors is not intuitive. A global sensitivity analysis of the trajectory descriptors (mean acceleration and speed, distance, displacement and visited areas) with respect to the parameters $\gamma $, v_{0}, v_{1}, $\beta $ and $\u03f5$ is conducted by computing their first order Sobol index (SI) and their pairwise correlation coefficient (PCC).
The residual variance is small for the median speed and acceleration but slightly larger for the distance, displacement and visited area indicating larger effects of parameter interactions for these outputs, that is output variations induced by joint shifts of the parameters (Appendix 2—figure 2). The SI of the parameters v_{0} and v_{1} are negligible, except for the displacement and the visited area. The parameters $\gamma $, $\beta $ and $\u03f5$, that is the three weights associated to each component of the state Equation 2, are more influential. Distance and speed have several main drivers. The distance is impacted nearly equally by $\gamma $, $\beta $ and $\u03f5$ and the PCC of these parameters is quite small, indicating that these parameters may induce indistinctly negative or positive variations of the travelled distance, except for $\u03f5$ which is slightly negatively correlated. The median speed is mainly impacted by $\u03f5$ (slightly positively) and $\gamma $ (slightly negatively), with relatively small PCC (Appendix 2—figure 2). The mean acceleration, the displacement and the visited area are preponderantly impacted by a main driver: the mean acceleration and the visited area are particularly impacted by $\u03f5$, the stochastic term weight, with positive influence. The displacement is mainly influenced by $\gamma $ with no preponderant variation direction (null PCC, Appendix 2—figure 2).
Friction and random term in Langevin equations
To illustrate the interplay between the friction and the random term during a random walks, we solve the problem
in an unconstrained domain, with $\eta $ a 2 dimensional white noise with unitary variance. The friction parameter $\gamma $ is alternatively set to 1 (Appendix 2—figure 3, upper panel) or 0 (Appendix 2—figure 3, lower panel). We note that the random seed is the same for the simulations with or without the friction term, so that the stochastic contribution is completely identical in the upper and lower panels. The trajectories produced without the friction term are much more regular and rectilinear that those produced with the friction term, that are much chaotic.
The reason of that behaviour may come from the null mean of the white noise. Roughly speaking, in average, the acceleration shows small variations around zero which leads after temporal integration to regular speeds and rectilinearlike trajectories. By contrast, the friction term reduces the particle inertia, enhancing the impact of the stochastic term, which produces much more chaotic trajectories.
Impact of the stochastic term
We illustrate the impact of the random walk term on the overall swimmer trajectory with Appendix 2—figure 4. In this figure, we display two trajectories computed from model (2) with identical parameters ($\alpha $, $\beta $, v_{0}, v_{1}, $\gamma $ and $\u03f5$), initial condition, host biofilm and time length. Different random samplings of the stochastic term of Equation (2) lead to these very different trajectories. This example illustrate the difference between identifying populationwide characteristics and inferring true trajectories: while the later try to detect the differences between the two trajectories (i.e. in this example, identifying and smoothing the different stochastic samples leading to these trajectories), the former focuses on the common features between these apparently different trajectories.
Influence of inference and stochastic terms on the trajectory descriptors
We wonder if the uncertainty sources involved in the inference process and in the stochastic term of the random walk have a decisive impact on the trajectory descriptors. To address this question, a first dataset is assembled by integrating in time Equation 2 for given parameters (see Appendix 1—table 3), initial conditions and host biofilm. Then, this dataset is used as inputs of the inference method to infer the initial parameters (ground truth). Another dataset is produced by replacing the initial parameters by the inferred parameters. We note that we take the same seed for the random number generator than for the initial dataset, so that the only uncertainty that has been introduced until this step comes from the inference procedure. Finally, we produce a last dataset by solving the model with the same inferred parameters as in the second dataset, but changing the seed of the random number generator. Hence, this last dataset involves uncertainties coming from the stochastic terms and from the inference process. This variation results in modifying the sampling of the stochastic terms and leads to strong modifications of the trajectories, like in Appendix 2—figure 4.
At end, the trajectory descriptors are computed and plotted in Appendix 2—figure 5. We can see that the trajectory descriptor distributions are very similar across the different dataset, except for the total distance and the displacement where discrepancies can be noted. However, these differences are relatively small compared to the mean and the width of the distributions. We can also observe that the interactions with the underlying biofilm is very well conserved, even when the sampling of the stochastic term is very different. This observation grounds the initial guess that these trajectory descriptors captures common global features of the different trajectories rather than specificities of given trajectories.
Relative impact of the different swimming processes on the species swim
The ternary plot presented in Appendix 2—figure 6 shows the balance between the different swimming processes. The contribution of each term of equation (4) to acceleration estimate was first computed. Namely, the relative value of the speed selection term $\parallel s{(b)}_{i}^{s}\parallel $, the direction selection term $\parallel s{(\nabla b)}_{i}^{s}\parallel $ and the stochastic term $\parallel s{(\eta )}_{i}^{s}\parallel $ where
The proportions $A{(k)}_{i}^{s}$ of each process was computed with
for $k\in \{b,\nabla b,\eta \}$. As the contribution of the direction selection was limited compared to the other processes, we zoomed in the plot near the edge $\parallel s{(\nabla b)}_{i}^{s}\parallel =0$ to allow interspecies comparisons.
Appendix 3
Various inference models
Different inference models were designed and tested from the dimensionless state equation (2).
SSM model
The inference model can be stated as a spacestate model (SSM) which is a framework commonly used in spatial ecology to infer a true state, that is true positions and trajectories, and populationwide random walk parameters from timeserie data (Auger‐Méthé et al., 2021). The SSM inference model is a generalization of Hidden Markov Models (HMM).
Note ${z}_{i}^{s}(t)$ the true (hidden) position of the individual $i$ of the species $s$ at time $t$. The state model on acceleration (4) can be rewritten as
In this equation, ${\mathbf{v}}_{i}^{s}$ is the true hidden swimmer velocity. Starting from observed initial conditions ${z}_{i}^{s}(0)$, ${\mathbf{v}}_{i}^{s}(0)$, Equation 16 can be integrated in time to recover hidden ${z}_{i}^{s}(t)$, ${\mathbf{v}}_{i}^{s}(t)$ for all times $t$.
Then, a likelihood equation can be written to compare the true hidden state to the observations.
We note a link between ${\eta}_{mod}$ and ${\eta}_{obs}$ in Equations 15 and 16 and the random state $\eta $ in Equation 4. Namely, noting ${\sigma}_{mod}$ and ${\sigma}_{obs}$ the standard deviation of the gaussian noises ${\eta}_{mod}$ and ${\eta}_{obs}$, direct finitedifference of ${A}_{i}^{s}(t)$ from the true state gives an estimate of the noise variance on the acceleration of the nonlinear regression model
Compared to problem (5), the main advantages are that the likelihood is written on the original data, that is the observed position, and not a postprocessed observed acceleration, subject to finitedifference errors. Furthermore, the true trajectories are recovered and modelling errors ${\eta}_{\mathbf{mod}}^{\mathbf{s}}$ and observation errors ${\eta}_{\mathbf{obs}}^{\mathbf{s}}$ are separated. The main drawback of this methodology is that the state space is very large since it includes all the positions and speeds at every time for every swimmers, which leads to intractable computations.
Mixing SSM and nonlinear inference models
An intermediary strategy has been designed by selecting swimmer trajectories that we want to infer by SSM, the remaining trajectories being kept to compute an acceleration dataset ${A}_{i}^{s}(t)$. Namely, note ${D}_{ssm}$ the set of swimmer index kept for SSM, and ${D}_{A}$ the set of swimmer index kept for nonlinear regression. We set, for $i\in {D}_{ssm}$
for given initial conditions ${z}_{i}^{s}(0)$, ${\mathbf{v}}_{i}^{s}(0)$, and for $j\in {D}_{A}$
where ${X}_{j}^{s}(t)$, ${V}_{j}^{s}(t)$ and ${A}_{j}^{s}(t)$ are observed positions, speeds and accelerations. This model is completed by a likelihood equation
where ${f}_{A}$ is defined in equation (4).
This setting kept some advantages of the SSM, like inferring some true hidden trajectories or separating the estimate of modeling and observation errors, while limiting the computational load if ${D}_{SSM}$ is not too large.
We finally kept the regression model for several reasons. First, we are interested in recovering population wide parameters to characterize strainspecific swims, and not identifying true trajectories. Second, we can consider that the observation error with confocal microscopy is several order of magnitudes under the spatial characteristic lengths involved in equation (2), so that observation errors can be neglected. Hence, the objective of separating the uncertainty sources between model and observation errors, which is a main advantage of the SSM or mixed inference settings, becomes secondary. Furthermore, enhancing the state space dimension provided additional uncertainties, worsening the inference precision on synthetic data. We then opted for the simple regression model that provided sufficient parameter identifiability for limited computational load.
Appendix 4
KDE computation
We illustrate the process of visualization of multiple point distributions in the same graph using KDE and isolines enclosing specific proportions of the data in Appendix 4—figure 1. A point cloud is first approximated with a Gaussian KDE. Then, the value of the gaussian KDE is evaluated in each point of the original point cloud, which allows to map the 2D map into a 1D set where order relation can be defined. Specific quantiles of the resulting values are computed (namely quantile 0.05, 0.5 and 0.95). By definition, the quantile 0.05 separate 5% of the points of the original dataset (the 5% lowest Gaussian KDE values) from the remainder of the data set. The isoline corresponding to the quantile 0.05 then also separates in the 2D map the 5% lowest Gaussian KDE values from the others.
Data availability
Data and code have been deposited at https://forgemia.inra.fr/bioswimmers/swiminfer and https://doi.org/10.5281/zenodo.6560673.

ZenodoInferring characteristics of bacterial swimming in biofilm matrix from timelapse confocal laser scanning microscopy: compagnon code and data.https://doi.org/10.5281/zenodo.6560673
References

A guide to state–space modeling of ecological time seriesEcological Monographs 91:1470.https://doi.org/10.1002/ecm.1470

Biocorrosion: towards understanding interactions between biofilms and metalsCurrent Opinion in Biotechnology 15:181–186.https://doi.org/10.1016/j.copbio.2004.05.001

The biofilm architecture of sixty opportunistic pathogens deciphered using a high throughput CLSM methodJournal of Microbiological Methods 82:64–70.https://doi.org/10.1016/j.mimet.2010.04.006

Biofilmassociated persistence of foodborne pathogensFood Microbiology 45:167–178.https://doi.org/10.1016/j.fm.2014.04.015

Spatial Organization Plasticity as an Adaptive Driver of Surface Microbial CommunitiesFrontiers in Microbiology 8:1364.https://doi.org/10.3389/fmicb.2017.01364

Optimal noise maximizes collective motion in heterogeneous mediaPhysical Review Letters 110:238101.https://doi.org/10.1103/PhysRevLett.110.238101

Diffusion, subdiffusion, and trapping of active particles in heterogeneous mediaPhysical Review Letters 111:160604.https://doi.org/10.1103/PhysRevLett.111.160604

Confined Flow: Consequences and Implications for Bacteria and BiofilmsAnnual Review of Chemical and Biomolecular Engineering 9:175–200.https://doi.org/10.1146/annurevchembioeng060817084006

Study of the sensitivity of coupled reaction systems to uncertainties in rate coefficients I TheoryThe Journal of Chemical Physics 59:3873–3878.https://doi.org/10.1063/1.1680571

Biofilms: an emergent form of bacterial lifeNature Reviews. Microbiology 14:563–575.https://doi.org/10.1038/nrmicro.2016.94

Bacteria and archaea on Earth and their abundance in biofilmsNature Reviews. Microbiology 17:247–260.https://doi.org/10.1038/s4157901901589

Matplotlib: A 2D Graphics EnvironmentComputing in Science & Engineering 9:90–95.https://doi.org/10.1109/MCSE.2007.55

Swimming fluctuations of microorganisms due to heterogeneous microstructurePhysical Review. E, Statistical, Nonlinear, and Soft Matter Physics 90:043021.https://doi.org/10.1103/PhysRevE.90.043021

Multifractal characterization of bacterial swimming dynamics: a case study on real and simulated Serratia marcescensProceedings. Mathematical, Physical, and Engineering Sciences 473:20170154.https://doi.org/10.1098/rspa.2017.0154

ConferenceTracking swimmers bacteria and pores within a biofilm2014 IEEE 11th International Symposium on Biomedical Imaging ISBI 2014.https://doi.org/10.1109/ISBI.2014.6867869

Collective Motion of Microorganisms in a Viscoelastic FluidPhysical Review Letters 117:118001.https://doi.org/10.1103/PhysRevLett.117.118001

Fluorescent reporters for Staphylococcus aureusJournal of Microbiological Methods 77:251–260.https://doi.org/10.1016/j.mimet.2009.02.011

Swarming Aqua Sperm Micromotors for Active Bacterial Biofilms Removal in Confined SpacesAdvanced Science (Weinheim, BadenWurttemberg, Germany) 8:e2101301.https://doi.org/10.1002/advs.202101301

ConferenceData Structures for Statistical Computing in PythonPython in Science Conference.https://doi.org/10.25080/Majora92bf192200a

Running and tumbling with E. coli in polymeric solutionsScientific Reports 5:1–11.https://doi.org/10.1038/srep15761

Scikitlearn: Machine learning in PythonJournal of Machine Learning Research 12:2825–2830.

BookTravelling through Slime–Bacterial Movements in the Eps MatrixLondon, United kingdom: IWA Publishing.

Swimming bacteria promote dispersal of nonmotile staphylococcal speciesThe ISME Journal 11:1933–1937.https://doi.org/10.1038/ismej.2017.23

Functional Regulators of Bacterial FlagellaAnnual Review of Microbiology 73:225–246.https://doi.org/10.1146/annurevmicro020518115725

Diversity in chemotaxis mechanisms among the bacteria and archaeaMicrobiology and Molecular Biology Reviews 68:301–319.https://doi.org/10.1128/MMBR.68.2.301319.2004

Dynamic exchange of two types of stator units in Bacillus subtilis flagellar motor in response to environmental changesComputational and Structural Biotechnology Journal 18:2897–2907.https://doi.org/10.1016/j.csbj.2020.10.009

Pingouin: statistics in PythonJournal of Open Source Software 3:1026.https://doi.org/10.21105/joss.01026

seaborn: statistical data visualizationJournal of Open Source Software 6:3021.https://doi.org/10.21105/joss.03021

Hitchhiking Behavior in Bacteriophages Facilitates Phage Infection and Enhances Carrier Bacteria ColonizationEnvironmental Science & Technology 55:2462–2472.https://doi.org/10.1021/acs.est.0c06969
Decision letter

Karine A GibbsReviewing Editor; University of California, Berkeley, United States

Wendy S GarrettSenior Editor; Harvard T.H. Chan School of Public Health, United States

Iago GrobasReviewer; University of Oxford, United Kingdom

Karine GibbsReviewer; University of California, Berkeley, Berkeley, United States
In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.
Decision letter after peer review:
Thank you for submitting your article "Inferring characteristics of bacterial swimming in biofilm matrix from timelapse confocal laser scanning microscopy" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, one of whom is a member of our Board of Reviewing Editors, and the evaluation has been overseen by Wendy Garrett as the Senior Editor. The following individual involved in review of your submission has agreed to reveal their identity: Iago Grobas (Reviewer #2).
The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.
Essential revisions:
Overall, the presented data support the conclusions and provide insights into bacterial motility. However, there are concerns about the model (assumptions and extensions), the limited discussion/application of bacterial motility and morphology, and aspects of the experimental design. If you choose to revise this manuscript, please address the following items:
1) In its current form, the paper lacks a characterization of the swimming motility of the bacteria in a Newtonian fluid, an important aspect to rationalize the impact of the biofilm.
2) Clearly communicate how this model would help characterize bacteria motility in exogeneous biofilms (instead of performing microscopy).
3) The discussion of cell morphology and potential impacts on motility within the biofilm is inadequate. Cell morphology should also be considered when characterizing swimming motility in a Newtonian fluid.
See the detailed comments for specific concerns regarding each of these items.Reviewer #1 (Recommendations for the authors):
This manuscript by Ravel and colleagues connects prior analysis of bacterial movements on/within abiotic substrates to those within a "living" substrate. The primary outcome is developing a methodology to examine the overall swimming of a bacterial population within another bacterium's biofilm. Implementing these methods would allow for differentiation among bacterial species, identifying microbes that can travel more deeply within an existing biofilm. The authors consider that in the longterm, one could potentially use microbes to reliably deliver antimicrobials (or similar) into the heart of a biofilm, thereby hopefully reducing the biofilm's recalcitrant responses to these biocontrol chemicals.
Strengths
– These scientists develop an inference model and supportive methods to ascertain traits of the population of swimming cells. While similar methods and models exist, the specific examination within a "living" biofilm is intriguing and foundational for developing drug delivery methods or interpreting how cells of any type, e.g., immune cells, might penetrate these bacterial communities. These results also raise questions about the role of cell shape, size, and motility in infiltrating already formed colonies, and conversely, how the composition of the extracellular matrix and cell stacking could protect biofilms from invasion.
– The scientists sequentially test their inference model with and without experimental data. Similar results emerge, suggesting that the inference model could provide an initial step to evaluate different bacterial strains with various biofilm conditions rapidly.
– The authors do a good job of clearly explaining each variable's definition and directly addressing many of the assumptions and constraints of these experiments.
Overall, the presented data support the conclusions, except for the discussion about bacterial shape and motility.
Weaknesses
– An underlying thread is that the difference in shape and flagella size/motility contributes to how each Bacillus strain can navigate the S. aureus biofilm. Yet, the discussion of these contributions is not until the end and relies on abstract assertions of potential behavioral differences. This analysis could be more robust if it included analysis of each species' motility in the absence of biofilm so as to establish if and how these species swim differently from one another. As such, lines 416 – 419 are not strongly supported by the data in this manuscript or the current literature.
– There is a potential bias in the data due to the constraints of the experimental setup: only focal planes near the well's edge are included in the analysis. The biofilm in these zones could have a distinct physical structure (due to the well's wall) than the remainder of the biofilm. While the datasets presented here are selfcontained for analysis, any conclusions about the general/overall biofilm are more narrow (or should be taken with a caveat).
– The timescale of 30 minutes at 30{degree sign}C could permit bacterial growth while cells are moving. The inference measurements appear to be reasonably robust against these potential impacts (perhaps incorporated as part of the "noise" variable). The possible cell growth is worth also considering if attributing differential swimming behaviors to cell morphology or flagella size/location.
I have included all of my scientific comments in the public review. A couple of typographical errors alter a sentence's meaning. The critical ones are as follows:
– line 96, dies > dyes
– line 198, γ > β (?)
Further, please clarify that in Table 1 that the video duration is in minutes.
Reviewer #2 (Recommendations for the authors):
This work aims to create a workflow to elucidate how the biofilm matrix affect the trajectories of exogeneous swimming bacteria. In principle, this could potentially be interesting to study biofilm spatial organization and also to characterize transport of synthetic particles such as nanoparticles or colloids carrying biocides into the biofilm. However, the statistical model presented here does not consider the surface interactions between the particles and the porous media which makes the model quite specific to the bacteriabiofilm interaction and, therefore, difficult to extend to other particles and porous materials.
The article is mainly focused on the statistical model which quantifies the key parameters of the bacteria swimming motility affecting their trajectory within the biofilm matrix. The main parameters studied are the acceleration, swimming speed, net displacement in the swimming trajectory and the area covered by the bacteria during the trajectory. The variables in the biofilm matrix that can affect those parameters are the local differences in cell density within the biofilm and the cell density itself. In this regard, the authors found a good agreement between the trajectories inferred by the model and the real trajectories, validating their model for the strains tested. The model allows to decipher how the different bacteria species adapt to the biofilm matrix. Nevertheless, I fail to understand how the model provides an advantage with respect to just observing the trajectories under the microscope. Following the bacteria with microscopy could identify when they slow down or speed up depending on the density of the host biofilm matrix as the model does.
Finally, the authors connect the bacteria morphology of the three species studied to the kinematic descriptors of the bacteria trajectories. The morphological features logically agree with what their model predicts, meaning that, long bacteria are slower in the biofilm matrix and bacteria with lower aspect ratio finds easier to go through the porous biofilm matrix. I find this like a nice way of checking that the model works but at the same time I wonder how useful this model can be since the mentioned morphological features and their impact in the bacteria trajectories could be inferred without the model, just by using microscopy. On the same line, the authors suggest that a brushlike group of thin flagella make changing directions easier but I do not see how this is checked experimentally.
In summary, the parameters affecting the model and the correlation between the kinematic variables of the swimming trajectories and the local conditions in the biofilm are thoroughly checked. However, my concern is that I do not see how this model can widen our knowledge about how bacteria navigate biofilm matrix since tracking bacteria in a biofilm matrix would give similar information. Furthermore, I do not know how limited this model is in terms of the surface chemistry interaction between the biofilm and the bacteria, or the particles in general, that are introduced in the biofilm. This surface chemistry interaction could totally change the trajectory of the swimmers independently on the parameters studied in this article which limits the model to bacteria swimming in a host biofilm and does not allow the extension of the model to synthetic particles or other porous media.
I do not understand very well Figure 1a. There are red bacteria (guest bacteria) in the last frame that do not appear in previous timestamps. Specifically, the trajectory at the bottom right has its origin in an area that is visible for all timepoints and I do not manage to see any bacteria in there in any of the timestamps.
I have also a problem with Figure 1b. I guess that in the caption, 'distance' refers to the shortest distance between the initial and final point of the trajectory. But this is what the authors seem to be drawing in the figure for 'displacement' which is defined as 'the total length of the trajectory path'. So I think these two should be swapped. Actually, I think the correct representation according to the capture is as drawn in Figure 1d.
In table 1, could the units of time interval be specified?
In Figure 2, is the whole set of trajectories for each species displayed for 1 batch or for the 3 batches? I think Figure 2, would be easier to follow if the order of the bacteria in (a) matches the order in (b). I think it would also benefit from a rough y axis in the first graph of the upper panel of (b). What are the units in the x axis? If the magnitudes are dimensionless because they are normalized, I think this should be said in the caption, also because dimensionless magnitudes are referred as V* and A* in the main text. The legend in the bottom panel (b) containing the names of the bacteria should be in italics and 'B' separated from the rest of the name. There is a mistake in the caption (2b), the magnitude 'Area' is not mentioned in the caption.
Do the authors know why the distributions in Figure 2 look quite similar for B. pumilus and B. cereus but different to B. sphaericus? It seems to me that this might be an artefact coming from the fact that less trajectories are plotted for B. sphaericus. I think it would be informative if they put a label with the number of total trajectories displayed in this panel.
I do not see very well the added value of the bottom panel in Figure 2b. I have read the Results section just looking at the 50% line, I don't think the others present much more additional data.
I think the penalization coefficient 'γ' being inversely proportional to the relaxation time should be further explained.
In line 316 says that B. pumilus shows the highest v0 value, indicating a higher ability to swim fast in low density. But v0 was defined as the speed at the highest density, so this means that B. pumilus swims fast at high density. Which one is right?
I do not know if it can be argued that B. sphaericus presents no difference between v0 and v1 as it is written in the main text. It seems that the v0 is extremely low, almost 0, indicating that it cannot swim at high cell density. However, this strain has the highest v1 mean.
Appendix 1: In Figure 1. I think the red channel's brightness should be increased. When the authors say 'rescalled biofilm density map'? what do they mean? Is it just the first image of the biofilm for a certain condition? Or do they alter the image in any way? In this same Figure 1, there is a white space in the gray frame enclosing the 'swimmer trajectories in normalized biofilm'.
I would put figure 6 in an appendix. I do not think it adds much value to the aim of the paper.
Not sure about Fig7. I think it's better to put a square in the zoomed in regions instead of the dashed line. For example, in the middle panel, the zoomed in version of B. pumilus seems like there are two bacterial bodies attached to each other. However, this is very difficult to see in the zoomed out version. I think this would be clearer if instead of a dashed line there was a dashed square around the region.
Why is there a huge change in the epsilon coefficient depending on the species? I thought random noise would affect roughly the same to all species.
I think in general, the paper would benefit if the figure 7 was changed to figure 1. I would motivate the paper by saying that there are three species with different hydrodynamic properties because of shape, number/type of flagella, etc. and this leads to the hypothesis that the behaviour would be different in the porous biofilm matrix. For example, the longer body of B. sphaericus would be an impediment when navigating the biofilm and therefore they expect lower motility, etc. As the paper is structured now, the whole story seems to be around checking whether their model is correct.
I do not understand the calculation of the visited area. What is the parameter k and ns? Why do they have those values? Why was the visited area not calculated using the trajectory multiplied by the size of the particle?
Reviewer #3 (Recommendations for the authors):
Ravel et al., investigate the swimming behavior of three different Bacillus species in the biofilm formed by Staphylococcus aureus. The biofilm structure and the swimmers' behavior are experimentally characterized using timelapse confocal microscopy. The swimmers' behavior is described by several parameters measured from tracking their positions in time, including acceleration and speed. The biofilm structure is described in terms of the density of the biomass. The data highlight differences in the shape of the trajectories, speed distributions, and tendency to visit highdensity areas of the biofilm in the three different species. These observations are reproduced using a generative model of the data and tentatively explained in terms of bacterial morphology. The model could, in principle, be used to predict the motility of other bacterial species in biofilms.
The conclusions of this paper are supported by the data and the model. Still, the explanation of the former in terms of bacterial motility and morphology needs to be extended and some aspects of the model need to be clarified.
1) The authors explain the different swimming behavior observed in the biofilm in terms of different bacterial morphology, i.e. shape and number of flagella. However, the impact of these differences in a Newtonian fluid should be evaluated and used to better understand the adaptation strategies to the biofilm. In addition, the observations should be commented in light of the literature studying bacterial motility in nonNewtonian fluids (Phys. Rev. Fluids, 5, 073103 (2020); Nat. Phys, 15, 554558, (2019); Sci. Rep., 5, 15761 (2015); PNAS, 111, 1777117776 (2014)), where the impact of the flagellar shape and motility are discussed.
2) While a lot of attention is dedicated to describing how bacterial motility is quantified, very few details are given about the measurement of biofilm density and the possible error sources. Since this is a primary ingredient for the interpretation of the results, I would recommend commenting on the procedure and the sensitivity of density measurement.
3) The authors suggest that bacterial motility could help create channels into the biofilm, affecting transport. Is this effect observed in the experiments? This aspect should be clarified and commented on.
4) The purpose of the model and its pros and cons should be discussed more clearly for a general audience. In particular, the authors should clarify the added value of the model with respect to the experiments.
5) Do these bacteria perform a runandtumble or a runandreverse type of motion in water? Could a backandforth trajectory be described as a runandreverse?
6) Table 1: The units are missing.
7) Line 125: It would be helpful if some numbers were given in the text to quantify the "large swimming distances" or the "widest swimming distribution".
8) Figure 1: The biofilm density map should be added in panel a. I would recommend avoiding the superimposition of the distributions of A, V,dist, disp and area in panel b.
9) Line 136: Could the term "few" be quantified? From Figure 1a the exceptions seem to be several.
10) Line 192: The bioreadership may not find clear the term "basal" used to identify the simulations. The same applies to "ground truth" in the following. These terms should be defined to improve clarity.
11) Line 303306: The conclusion on the variability of the outputs should be clarified. Is it a positive feature or is the model missing something?
https://doi.org/10.7554/eLife.76513.sa1Author response
Essential revisions:
Overall, the presented data support the conclusions and provide insights into bacterial motility. However, there are concerns about the model (assumptions and extensions), the limited discussion/application of bacterial motility and morphology, and aspects of the experimental design. If you choose to revise this manuscript, please address the following items:
We warmly thank the reviewers for this positive comment. Corrections are detailed below.
1) In its current form, the paper lacks a characterization of the swimming motility of the bacteria in a Newtonian fluid, an important aspect to rationalize the impact of the biofilm.
New experiments have been conducted to image the swimming motility of the studied strains in a Newtonian fluid. We took exactly the same culture protocol for the swimmers, but put them in the buffer which is a Newtonian fluid in the absence of host biofilm. Images have been obtained and postprocessed with the same dataflow as previous experiments. We then obtained a control experiment characterizing the microbial swim without biofilm.
We first computed the same trajectory descriptors as for the trajectories inside the biofilm, allowing for pairwise comparisons between species swimming in the control buffer, but also to compare the free swim (control) with the swim in a crowded environment (biofilm). We also conducted an additional characterization by addressing a potential runand tumble or runandreverse behaviour of the species (see Reviewer 3 remark): we computed for each trajectory time point the swimmer speed (defined as the mean between the incoming and outgoing velocity vectors (ǁV_{i}^{s}(t)ǁ+ǁ V_{i}^{s}(tΔt)ǁ)/2, for t∊ ($T}_{i}^{s$+1, $T}_{end,i}^{s$)) versus the direction change defined as the angle between the incoming and outgoing velocity vectors ${\theta}_{i}^{s}(t)=\mathrm{arccos}(({V}_{i}^{s}(t)\cdot {V}_{i}^{s}(t\mathrm{\Delta}t))/({V}_{i}^{s}(t){V}_{i}^{s}(t\mathrm{\Delta}t)))$. An additional figure was added illustrating runandtumble in the host biofilm or in the control buffer.
These additional data provided very interesting insights. We could in particular observe that B.sphaericus has a similar runandreverse behaviour in the biofilm or the control buffer, the biofilm strongly impairing the swimming speed and increasing the number of reverse events whereas B.pumilus switches from rectilinear runs in the Newtonian buffer to a pronounced runandreverse behaviour in the biofilm with decreased speeds.
By contrast, B.cereus is weakly impacted by the host biofilm: it manages to keep approximately straight trajectories and comparable swimming velocities in the biofilm. Interestingly, the number of reverse events is even reduced in the biofilm.
– line 466 : Material and Methods, imaging protocol for the control experiments in the Newtonian fluid is added.
– Figure 2 of the initial submission has been split in two figures now numbered figures 3 and 5. In Figure 3, a panel has been added with the swimmer trajectories in the control buffer. In Figure 5, the descriptor distribution in the control buffer has been added. The figure interpretations has been modified accordingly in lines 110, 122, 133, 140, 165, 167, 169 and 173.
– A new figure (Figure 4) has been added to assess potential runand tumble behaviours. The figure description has been added in lines 133. Precise definition of angle and average velocity between two consecutive speed vectors is given line 516 in the Material and Methods.
2) Clearly communicate how this model would help characterize bacteria motility in exogeneous biofilms (instead of performing microscopy).
We thank the reviewers for this fundamental remark. We agree with the reviewers, and in particular with Reviewer 2 who was sceptical about the need of an additional mathematical model, that a large part of the present study is made with direct observations of the trajectories without additional modelling: Figures 3, 4 and 5 of the present submission are nothing else than the extraction of direct descriptors from the microscopy images. In particular, these descriptors allow to quantify swimming characteristics (Figure 5, top panel) but also to link the host biofilm (density and gradient) to the swimmer behaviour (speed and acceleration) (Figure 5, lower panel) or to assess run and tumble (Figure 4).
However, the main goal of this study was to go beyond these descriptive analysis. First, we propose an integrative model able to analyse simultaneously heterogeneous data (density maps and trajectories) and to gather all the swimming behaviours that are considered (speed selection, direction changes and random walk) in the same quantification in order to measure their relative impact on the swimmer trajectories. This method is comparable to ANOVAlike multivariate analysis: parametric phenomelogical mappings are defined to link explicative covariables to a phenomena (i.e. for example the function defining speed selection from biofilm density) and gathered in the same inference problem. Compared to direct observations, the model we propose additionally tells us for example that speed adaptation is preponderant compared to direction selection (Appendix 2 Figure 4). Secondly, we infer and validate a generative model able to reproduce observed trajectories. This model can be used in future works to predict the impact of a bacterial swimmer on the vascularisation of the host biofilm.
– Section title has been changed line 192 to better communicate that the mathematical model is actually a data analysis method. In the same paragraph, comments were added lines 220.
– Additional sentence is added lines 257 and 322 to clarify the fact that the measurement of the relative contribution of the different swimming mechanisms to the acceleration variance is allowed by the mathematical model.
– Additional discussion has been added line 369 to clearly communicate that the model inference can be seen as an analysis method allowing for multidata integration and coanalysis.
3) The discussion of cell morphology and potential impacts on motility within the biofilm is inadequate. Cell morphology should also be considered when characterizing swimming motility in a Newtonian fluid.
A new discussion has been added linking the cell morphology to the swimming path in the control Newtonian buffer, but also discussing these results respectively to the literature, as suggested by Reviewer 3.
– The result section was rewritten and made more accurate by linking the presence of multiple flagella to a runandtumble behaviour, as observed in other flagellated bacteria such as E. coli, line 352.
– A new section has been added in the discussion to put the results in perspective, regarding previous studies of the literature and the new control experiments in the Newtonian buffer, line 447.
See the detailed comments for specific concerns regarding each of these items.
Reviewer #1 (Recommendations for the authors):
This manuscript by Ravel and colleagues connects prior analysis of bacterial movements on/within abiotic substrates to those within a "living" substrate. The primary outcome is developing a methodology to examine the overall swimming of a bacterial population within another bacterium's biofilm. Implementing these methods would allow for differentiation among bacterial species, identifying microbes that can travel more deeply within an existing biofilm. The authors consider that in the longterm, one could potentially use microbes to reliably deliver antimicrobials (or similar) into the heart of a biofilm, thereby hopefully reducing the biofilm's recalcitrant responses to these biocontrol chemicals.
Strengths
– These scientists develop an inference model and supportive methods to ascertain traits of the population of swimming cells. While similar methods and models exist, the specific examination within a "living" biofilm is intriguing and foundational for developing drug delivery methods or interpreting how cells of any type, e.g., immune cells, might penetrate these bacterial communities. These results also raise questions about the role of cell shape, size, and motility in infiltrating already formed colonies, and conversely, how the composition of the extracellular matrix and cell stacking could protect biofilms from invasion.
– The scientists sequentially test their inference model with and without experimental data. Similar results emerge, suggesting that the inference model could provide an initial step to evaluate different bacterial strains with various biofilm conditions rapidly.
– The authors do a good job of clearly explaining each variable's definition and directly addressing many of the assumptions and constraints of these experiments.
Overall, the presented data support the conclusions, except for the discussion about bacterial shape and motility.
We thank Reviewer 1 for his positive and encouraging comments. Remarks about the discussion about bacterial shape and motility have been addressed, as detailed in the previous section.
– Results and discussion of the links between shape and motility have been rewritten, see lines 352 and 447.
Weaknesses
– An underlying thread is that the difference in shape and flagella size/motility contributes to how each Bacillus strain can navigate the S. aureus biofilm. Yet, the discussion of these contributions is not until the end and relies on abstract assertions of potential behavioral differences. This analysis could be more robust if it included analysis of each species' motility in the absence of biofilm so as to establish if and how these species swim differently from one another. As such, lines 416 – 419 are not strongly supported by the data in this manuscript or the current literature.
As suggested by Reviewers 1 and 2 (see below), we moved the ultrastructural microscopy to the beginning of the result section, the biological interpretation of the swimming changes being devolved to the last result section, which has been rewritten (see Point 3 above). This section title has been changed accordingly. Lines 416419 of the initial submission have been reformulated.
– The image showing the bacterial physiology has been moved to the beginning of the results and additional descriptions of the bacteria have been included, line 102 to 107.
– Result subsection title has been changed, line 329.
– Lines 416419 of the initial submission have been reformulated, line 425.
– There is a potential bias in the data due to the constraints of the experimental setup: only focal planes near the well's edge are included in the analysis. The biofilm in these zones could have a distinct physical structure (due to the well's wall) than the remainder of the biofilm. While the datasets presented here are selfcontained for analysis, any conclusions about the general/overall biofilm are more narrow (or should be taken with a caveat).
We performed an additional analysis to check that the physical structure of the biofilm is not drastically changed near the well’s edge, at the bottom of the well, where the focal plan has been chosen to screen the swimmers. To provide a quantitative analysis, we took 4 replicates of S. aureus biofilms that were imaged in 3D using a stack of 6 horizontal images, starting from z = 0 near the well’s edge, to z = 6∆z, at the interface between the biofilm and the bulk solution. To study the between and within biofilm variability in the horizontal images, we subsampled them with a regular Cartesian 4x4 grid, resulting in a 4*6*(4*4)=384 2D images database supplemented by metadata (stack, z and x−y coordinate of the subsample). Pairwise correlation of the biofilm density in the 384 samples has been computed (scikitlearn pairwise distances, ’correlation’ parameter), and the resulting similarity matrix has been displayed using Seaborn package (clustermap) after hierarchical clustering (scipy.cluster.hierarchy linkage function) and added to the annex. Additional permanova has been computed to assess the significance of betweengroup dissimilarities (skbio.stats.distance permanova function). This study shows that the z direction does not structure the data, contrary to the x−y and stack groups, indicating that the biofilm structure is not different near the well’s bottom edge.
– A description of this additional study has been added to the Material and Method, lines 492 and 628.
– A new section has been added in the annex 1, showing the results of the study, in Annex 1 Figure 2, and in the relative description.
– The timescale of 30 minutes at 30{degree sign}C could permit bacterial growth while cells are moving. The inference measurements appear to be reasonably robust against these potential impacts (perhaps incorporated as part of the "noise" variable). The possible cell growth is worth also considering if attributing differential swimming behaviors to cell morphology or flagella size/location.
– The video duration was actually of 30 s. We really apologize for this misunderstanding induced by the fact that we forgot to mention the time unit in Table 1 which collects all the dataset characteristics, as highlighted by the three Reviewers. The duration was mentioned in the Material and Methods, but we rewrote the unit in plain letter in order to avoid any ambiguity. This time scale is relevant for motion screening and allow to reduce the number of division events in the dataset. Biofilm colonization at longer time scales has been studied for example in [Houry13088], but the influence of swimmer division in the process is interesting: it could be adressed in future works.
– We indicated the time unit in Table 1 and its caption.
– Video duration was indicated in plain letter in the Material and Methods line 478.
I have included all of my scientific comments in the public review. A couple of typographical errors alter a sentence's meaning. The critical ones are as follows:
– line 96, dies > dyes
– line 198, γ > β (?)
Further, please clarify that in Table 1 that the video duration is in minutes.
Corrections have been included.
– Modifications were made lines 111 and Appendix 1.
– units have been clarified in Table 1 and line 478 but are in seconds.
Reviewer #2 (Recommendations for the authors):
This work aims to create a workflow to elucidate how the biofilm matrix affect the trajectories of exogeneous swimming bacteria. In principle, this could potentially be interesting to study biofilm spatial organization and also to characterize transport of synthetic particles such as nanoparticles or colloids carrying biocides into the biofilm.
We thank Reviewer 2 for this description that fits the objectives of our work
However, the statistical model presented here does not consider the surface interactions between the particles and the porous media which makes the model quite specific to the bacteriabiofilm interaction and, therefore, difficult to extend to other particles and porous materials.
We agree with Reviewer 2 that biofilm materials are very complex, and can be related to a limited extent to polymeric fluids or porous materials. We added a discussion of these links and limitations and put our results in perspective relatively to other results on flagellated bacteria in polymeric fluids, as also suggested by Reviewer 3 (see below). Celltocell interactions are an important determinant of the kinematic of the swimmer, but taking it into account would necessitate a precise characterization of the rheology of the host biofilm. We opted for a phenomenological description of the kinematic by decomposing the forces exerted on the swimmer into 3 blocks gathering respectively the net forces involved in the velocity selection, in the direction selection, and in random forces. This approach does not pretend to accurately describe the mechanics at the swimmer surface, but rather to gather information into interpretable blocks, the importance of which is given by the inference. This approach is comparable to ANOVA methods by its rationale.
– A discussion has been provided lines 447.
– The discussion on the overall mathematical approach has been enhanced line 369
The article is mainly focused on the statistical model which quantifies the key parameters of the bacteria swimming motility affecting their trajectory within the biofilm matrix. The main parameters studied are the acceleration, swimming speed, net displacement in the swimming trajectory and the area covered by the bacteria during the trajectory. The variables in the biofilm matrix that can affect those parameters are the local differences in cell density within the biofilm and the cell density itself. In this regard, the authors found a good agreement between the trajectories inferred by the model and the real trajectories, validating their model for the strains tested. The model allows to decipher how the different bacteria species adapt to the biofilm matrix. Nevertheless, I fail to understand how the model provides an advantage with respect to just observing the trajectories under the microscope. Following the bacteria with microscopy could identify when they slow down or speed up depending on the density of the host biofilm matrix as the model does.
We thank Reviewer 2 for this description that we endorse. Answers to the concerns about the need for a mathematical model have been provided above in Section 2, point 3. We hope that these modifications will fit his/her expectations.
– See above Section 2, point 3.
Finally, the authors connect the bacteria morphology of the three species studied to the kinematic descriptors of the bacteria trajectories. The morphological features logically agree with what their model predicts, meaning that, long bacteria are slower in the biofilm matrix and bacteria with lower aspect ratio finds easier to go through the porous biofilm matrix. I find this like a nice way of checking that the model works but at the same time I wonder how useful this model can be since the mentioned morphological features and their impact in the bacteria trajectories could be inferred without the model, just by using microscopy. On the same line, the authors suggest that a brushlike group of thin flagella make changing directions easier but I do not see how this is checked experimentally.
As detailed above in Section 2, point 3, the mathematical model is a way to integrate in the same quantification the interaction between the host biofilm and the swimmer velocity, direction and ”randomness”, in order to identify their relative impact on the overall swimmer trajectories. It can be seen as a specific statistical treatment of the microscopy images.
– Additional discussion of the links between kinematics and morphology has been added as indicated in Section 2, point 4 above.
In summary, the parameters affecting the model and the correlation between the kinematic variables of the swimming trajectories and the local conditions in the biofilm are thoroughly checked. However, my concern is that I do not see how this model can widen our knowledge about how bacteria navigate biofilm matrix since tracking bacteria in a biofilm matrix would give similar information. Furthermore, I do not know how limited this model is in terms of the surface chemistry interaction between the biofilm and the bacteria, or the particles in general, that are introduced in the biofilm. This surface chemistry interaction could totally change the trajectory of the swimmers independently on the parameters studied in this article which limits the model to bacteria swimming in a host biofilm and does not allow the extension of the model to synthetic particles or other porous media.
A comment about the generalization of our work has been added in the discussion
– A comment has been added line 449.
I do not understand very well Figure 1a. There are red bacteria (guest bacteria) in the last frame that do not appear in previous timestamps. Specifically, the trajectory at the bottom right has its origin in an area that is visible for all timepoints and I do not manage to see any bacteria in there in any of the timestamps.
Thank you for this remark. The figure has been remade.
– Figure 1a has been remade with a correct trajectory.
I have also a problem with Figure 1b. I guess that in the caption, 'distance' refers to the shortest distance between the initial and final point of the trajectory. But this is what the authors seem to be drawing in the figure for 'displacement' which is defined as 'the total length of the trajectory path'. So I think these two should be swapped. Actually, I think the correct representation according to the capture is as drawn in Figure 1d.
Thanks for pointing to this concern. The correct definition of distance and displacement was given in their mathematical definition line 135 of the revised version of the manuscript. We corrected accordingly Figure 1.d and its caption.
– Images have been swapped in image 1.d consistently with figure 1.b and mathematical definition.
– Caption has been modified.
In table 1, could the units of time interval be specified?
This is done.
– The unit is added in the corresponding column header
In Figure 2, is the whole set of trajectories for each species displayed for 1 batch or for the 3 batches? I think Figure 2, would be easier to follow if the order of the bacteria in (a) matches the order in (b). I think it would also benefit from a rough y axis in the first graph of the upper panel of (b). What are the units in the x axis? If the magnitudes are dimensionless because they are normalized, I think this should be said in the caption, also because dimensionless magnitudes are referred as V* and A* in the main text. The legend in the bottom panel (b) containing the names of the bacteria should be in italics and 'B' separated from the rest of the name. There is a mistake in the caption (2b), the magnitude 'Area' is not mentioned in the caption.
Do the authors know why the distributions in Figure 2 look quite similar for B. pumilus and B. cereus but different to B. sphaericus? It seems to me that this might be an artefact coming from the fact that less trajectories are plotted for B. sphaericus. I think it would be informative if they put a label with the number of total trajectories displayed in this panel.
I do not see very well the added value of the bottom panel in Figure 2b. I have read the Results section just looking at the 50% line, I don't think the others present much more additional data.
In previously figure 2 and now figure 3, the 3 batches are pooled. The y axis is not indicated since the distribution are normalized to sum to one. The xaxis are also dimensionless since they are normalized: we added a note in the figure caption. We modified the figures accordingly to the comments.
In our opinion, similar distributions for B.pumilus of B.cereus could come from similar trajectory characteristics for these macroscopic descriptors. We can not discard that discrepancies could come from different sample numbers. The artefact could be discarded by the fact that the number of trajectories for B.sphaericus is in the same order than B.cereus, although they show different descriptor distributions.
The bottom panel in the previously Figure 2b but now figure 3 shows the dependence of the acceleration and speed to the underlying host biofilm structure (density and density gradient). It is a way to present the result of direct observation by microscopy of the links between kinematic characteristics and the host biofilm.
– We agree that most of the information is given by the 50% isoline. We kept however the different lines to comply with eLife policy in distribution presentation, since the authors are asked to present the distribution spread in addition to means or medians.
– Plot order has been modified.
– Names of bacteria in the figure legend have been changed.
– Figure 3 caption has been amended with a note indicating that the 3 batches are pooled.
– The number of samples in the different densities has been indicated in the legend.
I think the penalization coefficient 'γ' being inversely proportional to the relaxation time should be further explained.
The fact that γ is dimensioned as an inverse of time is necessary to be homogeneous to the acceleration in the left hand side of the equation. The term ”relaxation” comes from ”penalization” modelling in fluid mechanics, where given vector fields can be prescribed by adding to the state equation a penalization term penalizing the difference to the prescribed vector fields. The term ”relaxation” refers to the time needed for the computed vector field to relax towards the constraint.
– Additional explanations have been added line 204.
In line 316 says that B. pumilus shows the highest v0 value, indicating a higher ability to swim fast in low density. But v0 was defined as the speed at the highest density, so this means that B. pumilus swims fast at high density. Which one is right?
We corrected the model description to clearly state that v0 is the prescribed speed when b = 0 and v1 when b = 1. The corresponding sentence line 316 was correct.
– The mathematical model is more clearly defined with additional explanation line 204.
I do not know if it can be argued that B. shaericus presents no difference between v0 and v1 as it is written in the main text. It seems that the v0 is extremely low, almost 0, indicating that it cannot swim at high cell density. However, this strain has the highest v1 mean.
The sentence has been reformulated to be more accurate. The same sentence was repeated later in the original submission: it has been also modified.
– Modifications have been added lines 303 and 317.
Appendix 1: In Figure 1. I think the red channel's brightness should be increased. When the authors say 'rescalled biofilm density map'? what do they mean? Is it just the first image of the biofilm for a certain condition? Or do they alter the image in any way? In this same Figure 1, there is a white space in the gray frame enclosing the 'swimmer trajectories in normalized biofilm'.
The figure has been remade. Indications have been added in Annex 1 to state that brightness has been changed, and to recall the Material and method sentence that detail the rescaling applied to the biofilm density map. No other postprocessing has been applied to the raw biofilm maps than mapping in gray scale and color rescaling between 0 and 1.
– The figure has been remade with increased brightness.
I would put figure 6 in an appendix. I do not think it adds much value to the aim of the paper.
This figure shows the balance between the different swimming processes. It shows consistency with the other results. In particular, B.cereus is the bacteria which is the most influenced by the biofilm gradients. It is also the bacteria with the highest stochastic term, related to straight lines in the Langevin equation. These data are consistent with a higher hability of B.cereus to navigate in the biofilm and conserve straight runs.
– Figure 6 has been moved to Annex 2. Additional text has been added for the annex to be selfcontained.
Not sure about Fig7. I think it's better to put a square in the zoomed in regions instead of the dashed line. For example, in the middle panel, the zoomed in version of B. pumilus seems like there are two bacterial bodies attached to each other. However, this is very difficult to see in the zoomed out version. I think this would be clearer if instead of a dashed line there was a dashed square around the region.
The zoomed in images were not obtained by digital zoom in but optical zoom in. The bacteria of zoomed in image do not correspond to the bacteria of the larger scale image. That is why we did not opt for a dashed square around the region that was zoomed in.
– We added a complementary sentence in the zoomed in legend to explain that fact.
Why is there a huge change in the epsilon coefficient depending on the species? I thought random noise would affect roughly the same to all species.
The huge change in the epsilon mainly come from the fact that the mechanistic part of the model do not explain the observed kinematics to the same extent, which is reflected in the different goodness of fit for the 3 bacteria (see Table 2). Note however that a larger stochastic part is also related to straighter trajectories (see Figure 1 and Annex 2 Figure 1): the higher stochastic part of B.cereus can be related to its straighter runs in the data (see Figure 3).
I think in general, the paper would benefit if the figure 7 was changed to figure 1. I would motivate the paper by saying that there are three species with different hydrodynamic properties because of shape, number/type of flagella, etc. and this leads to the hypothesis that the behaviour would be different in the porous biofilm matrix. For example, the longer body of B. sphaericus would be an impediment when navigating the biofilm and therefore they expect lower motility, etc. As the paper is structured now, the whole story seems to be around checking whether their model is correct.
We thank Reviewer 2 for this advice. We moved the ultimately figure 7 to the top of the paper. We added a brief description of the ultrastuctural bacterial morphology and linked that section to the remainder of the article by checking if these physiological discrepancies could give the bacteria different swimming behaviour in the biofilm matrix. See also above (Reviewer 1’s public review, point 2).
– We moved the image and added a new paragraph lines 102 and 107.
I do not understand the calculation of the visited area. What is the parameter k and ns? Why do they have those values? Why was the visited area not calculated using the trajectory multiplied by the size of the particle?
The visited area was not calculated as the trajectory multiplied by the size of the particle to take into account bacteria that visit several times the same zones. This metric would be a proxy of the biofilm vascularization induced by the swimmer : a swimmer that cross several times the same area does not dig additional pores in the host biofilm and does not increase the biofilm vascularization. We then opted for tagging all the pixels of the biofilm image that are crossed by a swimmer. But the displacement of the swimmers during a time step can be higher than several pixels : we then had to reconstruct the trajectory during consecutive swimmer positions to recover the pixels crossed by the swimmer during the time step. As a first order approximation of the trajectory during a time step, we computed a straight line between two successive positions.
The parameter n_{s} was chosen so that $\Vert {X}_{i}^{s}\left(t+\mathrm{\Delta}t\right){X}_{i}^{s}(t)\Vert /{n}_{s}$ was always inferior to a pixel size. k is not a parameter but a variable varying from 1 to n_{s}.
Reviewer #3 (Recommendations for the authors):
Ravel et al., investigate the swimming behavior of three different Bacillus species in the biofilm formed by Staphylococcus aureus. The biofilm structure and the swimmers' behavior are experimentally characterized using timelapse confocal microscopy. The swimmers' behavior is described by several parameters measured from tracking their positions in time, including acceleration and speed. The biofilm structure is described in terms of the density of the biomass. The data highlight differences in the shape of the trajectories, speed distributions, and tendency to visit highdensity areas of the biofilm in the three different species. These observations are reproduced using a generative model of the data and tentatively explained in terms of bacterial morphology. The model could, in principle, be used to predict the motility of other bacterial species in biofilms.
The conclusions of this paper are supported by the data and the model. Still, the explanation of the former in terms of bacterial motility and morphology needs to be extended and some aspects of the model need to be clarified.
We thank Reviewer 3 for this positive comments.
1) The authors explain the different swimming behavior observed in the biofilm in terms of different bacterial morphology, i.e. shape and number of flagella. However, the impact of these differences in a Newtonian fluid should be evaluated and used to better understand the adaptation strategies to the biofilm. In addition, the observations should be commented in light of the literature studying bacterial motility in nonNewtonian fluids (Phys. Rev. Fluids, 5, 073103 (2020); Nat. Phys, 15, 554558, (2019); Sci. Rep., 5, 15761 (2015); PNAS, 111, 1777117776 (2014)), where the impact of the flagellar shape and motility are discussed.
We thank Reviewer 3 for pointing out these references. We added additional discussions, in light of these studies.
– Additional discussion added line 447.
2) While a lot of attention is dedicated to describing how bacterial motility is quantified, very few details are given about the measurement of biofilm density and the possible error sources. Since this is a primary ingredient for the interpretation of the results, I would recommend commenting on the procedure and the sensitivity of density measurement.
We investigated the effect on parameter estimate of increasing noise on the biofilm data. The impact of noise on the parameter inference is assessed by noising the biofilm density and the biofilm density gradients with an additive gaussian noise with increasing variance. The noise variance is scaled with the variance observed in the original data. Namely, we set
and
where σ_{b} is the variance observed in the original data, and b and ∊_{b} and ∊_{∇b}are respectively the noise applied to the biofilm density and the biofilm density gradient. The parameter l ∈ [0,0.01,0.02,0.03,0.04,0.05] is increased to apply a noise from 0 to 5%.
The parameter β is the most impacted (error of 35% for 5% noise). But it is also the parameter that has the minimal influence on the swimmer model when no noise is applied, showing a lower identifiability for this parameter. For the other parameters, the impact is limited.
– A new figure and its description is added in Annex 1 to show the impact of increasing noise on the parameter estimate.
– Sensitivity analysis method is detailed in Material and method line 591.
– A reference to the figure is added Appendix 2.
3) The authors suggest that bacterial motility could help create channels into the biofilm, affecting transport. Is this effect observed in the experiments? This aspect should be clarified and commented on.
This effect was observed during this experiment but not quantified. The impact of swimmer as facilitator of the penetration of macromolecules inside the biofilm has been published by coauthors in [Houry13088].
– A comment has been added line 54.
– A new image has been added (Annex 1, figure 3) to illustrate the pore formation.
4) The purpose of the model and its pros and cons should be discussed more clearly for a general audience. In particular, the authors should clarify the added value of the model with respect to the experiments.
This comment has been addressed (see essential Revisions above).
– see Section 2 point 3 above.
5) Do these bacteria perform a runandtumble or a runandreverse type of motion in water? Could a backandforth trajectory be described as a runandreverse?
As stated above, we analysed the new experiments in the control Newtonian buffer, together with the data in the biofilm, by searching for runandtumble or runandreverse episode. We can now state that B.sphaericus performs runandreverse swimming in a Newtonian fluid. This behaviour is not changed in a biofilm. Conversely, B.pumilus has a different swimming pattern in a Newtonian fluid (straight lines) and in the biofilm (increase of the number of tumbling). B.cereus has the same type of swimming in the Newtonian buffer or in the biofilm.
– An additional figure has been added (Figure 4), with comments and description (see above).
6) Table 1: The units are missing.
This is corrected
– corrections have been made in the table column header and in the material and methods (see above).
7) Line 125: It would be helpful if some numbers were given in the text to quantify the "large swimming distances" or the "widest swimming distribution".
Additional quantifications were provided.
– Value for average speed and acceleration were given: line 154 and 155
– Large swimming distances have been documented with numbers: line 142 and 142
– The widest distribution has been documented with numbers: line 145
8) Figure 1: The biofilm density map should be added in panel a. I would recommend avoiding the superimposition of the distributions of A, V,dist, disp and area in panel b.
The biofilm density map can hardly be added to panel a since (1) the trajectories of the 3 batches are pooled, corresponding to 3 different density map temporal stacks, (2) the density map is changing overtime due to the mechanical action of the swimmers. In the annex, we presented trajectories with an underlying biofilm density map for illustrative purpose, but we highlighted that the biofilm information is a timeseries, and that the swimmer position is associated to the biofilm density at the corresponding time. The distributions were modified.
– Figure 5 has been modified
9) Line 136: Could the term "few" be quantified? From Figure 1a the exceptions seem to be several.
We gave quantification of the fact that few trajectories of B.sphaericus are leaving the neighbourhood of their starting point: we stated that 6% of the B.sphaericus induce a displacement of more than 10 µm compared to 28% for B.cereus and 26% for B.pumilus.
– Additional details are given line 171.
10) Line 192: The bioreadership may not find clear the term "basal" used to identify the simulations. The same applies to "ground truth" in the following. These terms should be defined to improve clarity.
Defintions were added.
– we added definitions for these terms Appendix 2.
– we added additional precisions in the legend of Figure 6.
11) Line 303306: The conclusion on the variability of the outputs should be clarified. Is it a positive feature or is the model missing something?
The model is recovering the main characteristics of the swimmer population, but is not able to reproduce the whole variability of the data, meaning that it is less accurate at reproducing points in the distribution tail.
– Clarifications are given line 288.
https://doi.org/10.7554/eLife.76513.sa2Article and author information
Author details
Funding
Mathnum department  INRAe
 Guillaume Ravel
Agence Nationale de la Recherche (ANR12ALID0006)
 Romain Briandet
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
This work has benefited from the facilities and expertise of MIMA2 MET – GABI, INRAE, AgroParistech, 78,352 JouyenJosas, France. C Péchoux is warmly acknowledged for TEM observations. Financial support was provided by the French National Research Agency ANR12ALID0006. Guillaume Ravel received funding from the Mathnum department at INRAE.
Senior Editor
 Wendy S Garrett, Harvard T.H. Chan School of Public Health, United States
Reviewing Editor
 Karine A Gibbs, University of California, Berkeley, United States
Reviewers
 Iago Grobas, University of Oxford, United Kingdom
 Karine Gibbs, University of California, Berkeley, Berkeley, United States
Publication history
 Received: December 19, 2021
 Preprint posted: January 12, 2022 (view preprint)
 Accepted: June 10, 2022
 Accepted Manuscript published: June 14, 2022 (version 1)
 Version of Record published: July 11, 2022 (version 2)
Copyright
© 2022, Ravel et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Metrics

 423
 Page views

 199
 Downloads

 0
 Citations
Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.
Download links
Downloads (link to download the article as PDF)
Open citations (links to open the citations from this article in various online reference manager services)
Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)
Further reading

 Computational and Systems Biology
 Neuroscience
How dynamic interactions between nervous system regions in mammals performs online motor control remains an unsolved problem. In this paper, we show that feedback control is a simple, yet powerful way to understand the neural dynamics of sensorimotor control. We make our case using a minimal model comprising spinal cord, sensory and motor cortex, coupled by long connections that are plastic. It succeeds in learning how to perform reaching movements of a planar arm with 6 muscles in several directions from scratch. The model satisfies biological plausibility constraints, like neural implementation, transmission delays, local synaptic learning and continuous online learning. Using differential Hebbian plasticity the model can go from motor babbling to reaching arbitrary targets in less than 10 min of in silico time. Moreover, independently of the learning mechanism, properly configured feedback control has many emergent properties: neural populations in motor cortex show directional tuning and oscillatory dynamics, the spinal cord creates convergent force fields that add linearly, and movements are ataxic (as in a motor system without a cerebellum).

 Computational and Systems Biology
 Immunology and Inflammation
T cells are potent at eliminating pathogens and playing a crucial role in the adaptive immune response. T cell receptor (TCR) convergence describes T cells that share identical TCRs with the same amino acid sequences but have different DNA sequences due to codon degeneracy. We conducted a systematic investigation of TCR convergence using singlecell immune profiling and bulk TCRβsequence (TCRseq) data obtained from both mouse and human samples and uncovered a strong link between antigenspecificity and convergence. This association was stronger than T cell expansion, a putative indicator of antigenspecific T cells. By using flowsorted tetramer^{+} single T cell data, we discovered that convergent T cells were enriched for a neoantigenspecific CD8^{+} effector phenotype in the tumor microenvironment. Moreover, TCR convergence demonstrated better prediction accuracy for immunotherapy response than the existing TCR repertoire indexes. In conclusion, convergent T cells are likely to be antigenspecific and might be a novel prognostic biomarker for anticancer immunotherapy.