Learning developmental mode dynamics from single-cell trajectories

Abstract
Editor's evaluation
Introduction
Results
Discussion
Materials and methods
Appendix 1
Appendix 2
Appendix 3
Appendix 4
Data availability
References
Article and author information
Metrics

Abstract

Embryogenesis is a multiscale process during which developmental symmetry breaking transitions give rise to complex multicellular organisms. Recent advances in high-resolution live-cell microscopy provide unprecedented insights into the collective cell dynamics at various stages of embryonic development. This rapid experimental progress poses the theoretical challenge of translating high-dimensional imaging data into predictive low-dimensional models that capture the essential ordering principles governing developmental cell migration in complex geometries. Here, we combine mode decomposition ideas that have proved successful in condensed matter physics and turbulence theory with recent advances in sparse dynamical systems inference to realize a computational framework for learning quantitative continuum models from single-cell imaging data. Considering pan-embryo cell migration during early gastrulation in zebrafish as a widely studied example, we show how cell trajectory data on a curved surface can be coarse-grained and compressed with suitable harmonic basis functions. The resulting low-dimensional representation of the collective cell dynamics enables a compact characterization of developmental symmetry breaking and the direct inference of an interpretable hydrodynamic model, which reveals similarities between pan-embryo cell migration and active Brownian particle dynamics on curved surfaces. Due to its generic conceptual foundation, we expect that mode-based model learning can help advance the quantitative biophysical understanding of a wide range of developmental structure formation processes.

Editor's evaluation

This work proposes a method to obtain a reduced description of the collective dynamics of thousands of cells moving together during zebrafish gastrulation as a few fundamental modes, and to derive effective dynamics for these modes. This well-written work enables a simplified picture of the key features of cellular collective motion, that will be useful to physicists and biologists looking for a quantitative understanding of morphogenesis.

https://doi.org/10.7554/eLife.68679.sa0

Introduction

Embryogenesis, the development of a multicellular organism from a single fertilized egg cell, requires coordinated collective motions of thousands of cells across a wide range of length and time scales (Gilbert and Barresi, 2016; Solnica-Krezel, 2005). Understanding how a highly reproducible and robust tissue organization arises from the dynamics and interactions of individual cells presents a major interdisciplinary challenge (Collinet and Lecuit, 2021). Recent advances in high-resolution live imaging make it possible to track the internal biological states and physical movements of many individual cells on pan-embryonic scales throughout various stages of development (Stelzer, 2015; Power and Huisken, 2017; Hartmann et al., 2019; Shah et al., 2019). This unprecedented wealth of data poses two intertwined compression problems of equal practical and conceptual importance. The first concerns the efficient reduction of high-dimensional tracking data without loss of relevant information; the second relates to inferring predictive low-dimensional models for the developmental dynamics. Mathematical solutions to the first problem are aided by taking into account the geometry and symmetries of the developing embryo, which suggest suitable basis functions for a coarse-grained and sparse mode representation of raw data (Levy, 2006). Efficient algorithmic approaches tackling the second problem appear within reach thanks to recent advances in the direct inference of dynamical systems equations from data (Brunton et al., 2016; Rackauckas et al., 2021). Building on these ideas, we construct and demonstrate here a computational framework that translates developmental single-cell trajectory data on curved surfaces into quantitative models for the dominant hydrodynamic modes.

Widely applied in physics (Kac, 1966; Goldenfeld and Woese, 2011; Kantsler and Goldstein, 2012; Bhaduri et al., 2020), engineering (Soong and Grigoriu, 1993; Heydari et al., 2021), and spectral computing (Driscoll et al., 2014; Burns et al., 2020; Fortunato et al., 2021), mode representations (Schmid, 2010; Tu et al., 2014) provide a powerful tool to decompose and study system dynamics at and across different energetic, spatial and temporal scales. In quantum systems, for example, mode representations in the form of carefully constructed eigenstates are used to characterize essential energetic system properties (Slater and Koster, 1954; Jaynes and Cummings, 1963). Similarly, turbulence theory has seen significant progress by studying the coupling between Fourier modes that represent dynamics at different length scales. This approach enabled a better understanding of energy cascades (Kolmogorov, 1941; Wang et al., 2021) and provided insights into the nature of turbulence in non-living (Kraichnan and Montgomery, 1980; Pope, 2000) and in living (Dunkel et al., 2013; Bratanov et al., 2015; Ramaswamy and Jülicher, 2016; Alert et al., 2020) systems. Additionally, the multi-scale nature of many biological processes make them particularly amenable to a representation in terms of spatial and temporal modes (Marchetti et al., 2013). Despite this fact, however, mode representations are not yet widely used to characterize and compress cell tracking data, or to infer dynamic models from such data.

To demonstrate the practical potential of mode representations for the description of multicellular developmental processes, we develop here a computational framework that takes cell tracking data as inputs, translates these data into a sparse mode representation by exploiting symmetries of the biological system, and utilizes recently developed ODE inference techniques (Rackauckas et al., 2021) to infer a predictive dynamical model. The model will be specified in terms of a learned Green’s function that propagates initial cell density and flux data forward in time. To validate the approach, we demonstrate that it correctly recovers the hydrodynamic equations for active Brownian particle (ABP) dynamics on curved surfaces. Subsequently, as a first example application to experimental single-cell tracking data, we consider the pan-embryonic cell migration during early gastrulation in zebrafish (Shah et al., 2019), an important vertebrate model system for studying various morphogenetic events (Solnica-Krezel, 2005; Krieg et al., 2008; Morita et al., 2017). During gastrulation, complex migratory cell movements organize several thousand initially undifferentiated cells into different germlayers that lay out the primary body plan (Rohde and Heisenberg, 2007). The underlying high-dimensional single-cell data make this process a prototypical test problem for illustrating how spatio-temporal information can be efficiently compressed to analyze and model biological structure formation.

Results

Broadly, our goal is to translate experimentally measured single-cell trajectories on a curved surface into a quantitative model of collective cell migration dynamics. As a specific example, we consider recently published lightsheet microscopy data that captures the individual movements of thousands of cells during early zebrafish development from epiboly onset at 4 hours post-fertilization (hpf) to about 18 hpf (Shah et al., 2019). This developmental period is characterized by a collective symmetry breaking event during which cells collectively migrate over the yolk cell surface (Rohde and Heisenberg, 2007). Namely, they rearrange from an initial localization around the animal pole (AP) (Figure 1A, left) into a more elongated configuration that already indicates the basic geometry of the fully developed zebrafish larva (Figure 1A, right). Working with a two-dimensional (2D) sphere projection of the experimental data, we first describe a coarse-graining approach that faithfully captures cell-mass transport on a curved surface. We then construct a sparse mode representation of the resulting hydrodynamic fields in terms of scalar and vector spherical harmonic basis functions, discuss mode signatures of morphogenetic symmetry breaking events, and connect them to the dynamics of topological defects in the cellular flux. We validate this mode representation framework and the subsequent model inference using synthetic data of ABPs on a sphere, for which coarse-grained fields and learned models can be directly compared against analytical predictions. Finally, we infer a linear model for the mode dynamics of the experimental zebrafish data, which enables us to study the characteristics of cell interactions through kernels that couple cell density and flux and compare their features with the hydrodynamic mean-field signatures of ABPs on a sphere.

Figure 1 with 1 supplement see all

Download asset Open asset

From single-cell tracking data to sparse mode amplitude representations.

(A) Microscopic imaging data of early zebrafish development (adapted from Figure 1b in Kobitski et al., 2015) shows cell migration from an initially homogeneous pole of cells (left) toward an elongated structure that indicates the head-tail axis of the fully developed organism. Scale bar, $100 μ m$ . (B) Experimental single-cell tracking data from Shah et al., 2019 (blue dots) during similar developmental time points (±20 min) as in A. $t = 0$ min for the indicated time points in B corresponds to a developmental time of 4 hours post fertilization. The $z$ -axis points from the ventral pole (VP) to the animal pole (AP). (C) Coarse-grained relative cell density $ρ (r, t)$ (color) and associated coarse-grained flux $J (r, t)$ (streamlines) determined from single cell positions and velocities from data in B via Equation 2a. Thickness of streamlines is proportional to the logarithm of the spatial average of $| J |$ (see Video 1). (D) Dynamic harmonic mode representation of the relative density $ρ (r, t)$ (Equation 4, left panel) and of the flux $J (r, t)$ (Equation 5, middle and right panel) for fields shown in C. The modes $j_{l m}^{(1)}$ correspond to compressible, divergent cell motion, the modes $j_{l m}^{(2)}$ describe incompressible, rotational cell motion. Mode amplitudes become negligible for $l \geq 5$ (Video 2). For all panels, horizontal black lines delineate blocks of constant harmonic mode number $l$ and black triangles denote the end of epiboly phase.

Coarse-graining of cellular dynamics on a spherical surface

The experimentally observed cell motions are approximately two-dimensional (2D): The radius of the yolk cell surface on which the dynamics takes place is much larger than the average height changes of the evolving cell mass (Shah et al., 2019). We therefore adopt a thin film approximation, in which the cellular motion is represented on an effective spherical mid-surface (gray surface in Figure 1B); refined future models should aim to account for the full 3D dynamics. Focusing here on the in-plane dynamics, we project all cell positions and velocities onto a spherical mid-surface $S$ of radius $R_{s} = 300 μ m$ . On this spherical surface, each cell $α = 1, 2, \dots, N$ has a position $r_{α} (t)$ and in-plane velocity $v_{α} (t) = d r_{α} / d t$ .

As a second processing step, a coarse-grained representation of the single-cell dynamics on a spherical surface is determined. To facilitate the applicability of our framework to a wide range of experimental inputs, we propose a coarse-graining approach that can flexibly integrate cell number variations stemming from cell divisions, but also those from experimental uncertainties in cell imaging and tracking. Consequently, we first consider an idealized scenario in which the total cell number is approximately constant. In this case, mass conservation informs the construction of self-consistent coarse-graining kernels on a spherical surface. In a second step, we describe how this approach generalizes when there are variations in the total cell number.

Consistent coarse-graining of idealized microscopic data

Our specific aim is to translate microscopic cell positions $r_{α} (t)$ and velocities $v_{α} (t)$ into a continuous cell surface density $ρ (r, t)$ and an associated flux $J (r, t)$ at any point $r$ of the spherical mid-surface. For an approximately constant total number of cells, the fields $ρ$ and $J$ are related by the mass conservation equation

\frac{\partial ρ}{\partial t} + \nabla_{S} \cdot J = 0 .

Here, $\nabla_{S} \cdot J$ denotes the in-plane divergence of the cell number flux. To convert cell position $r_{α} (t)$ and velocities $v_{α} (t)$ into a normalized cell surface density $ρ (r, t)$ and an associated normalized flux $J (r, t)$ , we consider a kernel coarse-graining of the form (Appendix 1)

ρ (r, t) = \frac{1}{N} \sum_{α = 1}^{N} K [r, r_{α} (t)]

J (r, t) = \frac{1}{N} \sum_{α = 1}^{N} K [r, r_{α} (t)] \cdot {\bar{v}}_{α},

where $N$ is the total number of cells and ${\bar{v}}_{α} = v_{α} / | r_{α} |$ is the angular velocity of a given cell on a reference unit sphere (Appendix 1). The kernels $K (r, r^{'})$ and $K (r, r^{'})$ are given by a scalar and a matrix-valued function, respectively. The matrix kernel $K (r, r^{'})$ takes into account contributions of a particle with velocity $v_{α}$ at $r^{'} = r_{α}$ to nearby points $r$ on the sphere, which involves an additional projection to ensure that $J (r, t)$ is everywhere tangent to the spherical surface (Appendix 1). Importantly, the mass conservation Equation 1 implies a non-trivial consistency relation between the kernels $K (r, r^{'})$ and $K (r, r^{'})$ in Equation 2a, Equation 2b. The kernels that obey this condition represent different coarse-graining length scales (Appendix 1—figure 2). Throughout, we fix an intermediate coarse-graining length scale to enable a sparse representation of the experimental data, while ensuring that spatial details of the dynamics remain sufficiently well resolved. The final surface density $ρ (r, t)$ and the associated normalized flux $J (r, t)$ , computed from Equation 2a and Equation 2b using a kernel with an effective great-circle coarse-graining width of $\sim 70 μ m$ , are shown in Figure 1C (see also Video 1).

Video 1

Download asset

posterframe for video — Time evolution of the pre-processed cell tracking data (point cloud, see Materials and methods), and of the density field $ρ (r, t)$ (colormap) and associated flux $J (r, t)$ (streamlines) corresponding to the harmonic modes ${ρ_{l m}, j_{l m}^{(1)}, j_{l m}^{(2)}}$ shown in Figure 1D.

This mode representation was determined by the coarse-graining and projection procedure described in the main text. Streamline thickness is proportional to the logarithm of the average flux amplitude ${⟨ | J | ⟩}_{s}$ . For visualization purposes, cell distances to the origin were rescaled by a factor of $1.2 R_{s} / ⟨ R (t) ⟩$ , where $⟨ R (t) ⟩$ is the average cell distance from center at time $t$ and $R_{s} = 300 μ m$ is the mid-surface radius.

Consequences of cell number variations in experimental data

Because cell divisions are essential to most developmental processes, total cell numbers will in many cases – including early zebrafish gastrulation (Kobitski et al., 2015) – vary over time. True cell numbers and cell number changes are often difficult to measure due to experimental uncertainties arising from single-cell imaging and tracking within dense cellular aggregates. We therefore merely assume here that single cells are tracked in a representative fashion so that local relative surface densities found from Equation 2a reflect the probability that cells are present at a given point $r$ . In the absence of further information on cell deaths and cell divisions, we additionally make the more restrictive assumption that cell appearances or disappearances are everywhere proportional to the local cell density. With these assumptions, we can define a cell number surface density $\tilde{ρ} (r, t) = N (t) ρ (r, t)$ , where $N (t)$ is the cell number at time $t$ and $ρ (r, t)$ is the normalized surface density given in Equation 2a. Similarly, a cell number flux is given by $\tilde{J} (r, t) = N (t) J (r, t)$ , where the flux $J (r, t)$ is computed from the data as described by Equation 2b. Using these definitions in Equation 1, we find that the fields $\tilde{ρ} (r, t)$ and $\tilde{J} (r, t)$ obey a continuity equation

\frac{\partial \tilde{ρ}}{\partial t} + \nabla_{S} \cdot \tilde{J} = k (t) \tilde{ρ},

where $k (t) = \dot{N} (t) / N (t)$ denotes a time-dependent effective growth rate. Importantly, under the two above assumptions, Equation 3 encodes for any time-dependent total cell number $N (t) §gt; 0$ the same information as Equation 1 for coarse-grained normalized surface density $ρ (r, t)$ and associated flux $J (r, t)$ given by Equation 2a and Equation 2b, respectively. In the following analysis, we hence focus on these normalized fields.

Spatial mode representation on a spherical surface

To obtain a sparse mode representation of the hydrodynamic fields $ρ (r, t)$ and $J (r, t)$ on the spherical surface, we expand them in terms of scalar and vector spherical harmonics (SHs) (Arfken et al., 2013; Sandberg, 1978) (Appendix 2.A). SHs are defined on points $\hat{r} = r / R_{s}$ of the unit sphere, where $R_{s} = 300$ μm is the mid-surface radius. In this basis, the scalar density field is represented as

ρ (r, t) = \sum_{l = 0}^{l_{m a x}} \sum_{m = - 1}^{l} ρ_{l m} (t) Y_{l m} (\hat{r}),

which conveniently separates the time- and space-dependence of $ρ (r, t)$ into mode amplitudes $ρ_{l m} (t)$ and scalar harmonic functions $Y_{l m} (\hat{r})$ , respectively. The maximal mode number $l_{max}$ is a proxy for the maximal spatial resolution at which $ρ (r, t)$ is faithfully represented. Similarly, the vector-valued flux $J (r, t)$ can be decomposed into time-dependent mode amplitudes $j_{l m}^{(1)} (t)$ and $j_{l m}^{(2)} (t)$ , while its spatial dependence is described by vector SHs $Ψ_{l m} (\hat{r})$ and $Φ_{l m} (\hat{r})$ (Sandberg, 1978) (Appendix 2, Video 2).

J (r, t) = \sum_{l = 1}^{l_{max}} \sum_{m = - l}^{l} (j_{l m}^{(1)} (t) Ψ_{l m} (\hat{r}) + j_{l m}^{(2)} (t) Φ_{l m} (\hat{r})) .

Video 2

Download asset

Besides the in-plane divergence $\nabla_{S} \cdot J$ that leads to local density changes (see Equation 1), the cell number flux $J (r, t)$ also contains an in-plane curl component $\nabla_{S} \times J$ that is associated with locally rotational cell flux. The two sets of vector SHs ${Ψ_{l m}}$ and ${Φ_{l m}}$ conveniently decompose the flux into these contributions: Because $\nabla_{S} \cdot Φ_{l m} = \nabla_{S} \times Ψ_{l m} = 0$ , and $\hat{r} \cdot (\nabla_{S} \times Φ_{l m}) = \nabla_{S} \cdot Ψ_{l m} = - l (l + 1) Y_{l m} / R_{s}$ (Sandberg, 1978), we see from Equation 5 that $j_{l m}^{(1)} (t)$ corresponds to modes that drive density changes and $j_{l m}^{(2)} (t)$ represents modes of local rotational cell motion that change relative cell positions but do not change local density. Indeed, using harmonic mode representations of the cell number density Equation 4 and the cell number flux Equation 5 directly in the continuity Equation 1, we find a set of ordinary differential equation in mode space

\frac{d}{d t} ρ_{l m} (t) = \frac{l (l + 1)}{R_{s}} j_{l m}^{(1)} (t),

where $l = 0, 1, \dots, l_{max}$ and for each value of $l$ , $m = - l, - l + 1, \dots, l - 1, l$ . Equation 6 offers an alternative way of determining the modes $j_{l m}^{(1)} (t)$ directly from the modes $ρ_{l m} (t)$ of the coarse-grained cell number density (see Equation 4 and Equation 2a), while ensuring that the resulting fields obey mass conservation exactly. In practice, the modes $j_{l m}^{(1)} (t)$ found from a vector harmonic representation of the coarse-grained cell number flux (Equation 2b) will often deviate from modes $j_{l m}^{(1)} (t)$ determined from Equation 6, even if cell numbers are expected to be conserved. This can be, for example, due to limited accuracy in determining velocities $v_{α} (t)$ from noisy single-cell trajectories $r_{α} (t)$ , or due to spatially inhomogeneous appearances and disappearances of cells in tracking data. Consistent with our simplifying assumption that cell number changes in the data can be sufficiently well approximated by a globally homogeneous growth rate (compare Equation 1 with Equation 3), the subsequent analysis uses the modes $j_{l m}^{(1)} (t)$ as determined from the density modes $ρ_{l m} (t)$ via Equation 6, together with modes $j_{l m}^{(2)} (t)$ from the explicit velocity coarse-graining Equation 2b. The complete construction is detailed in Appendix 2 and the full coarse-grained dynamics is shown in Video 1.

The representation of $ρ (r, t)$ and $J (r, t)$ in terms of spherical harmonic modes with $l \leq l_{max}$ leads in total to $3 {(l_{max} + 1)}^{2}$ mode amplitude trajectories, displaying only a few dominant contributions (Figure 1D) with almost no signal remaining for $l \geq 5$ (Figure 1—figure supplement 1, Video 2). This demonstrates that the underlying coarse-grained experimental data is sufficiently smooth and implies that a spectral representations is indeed meaningful. Thus, the coarse-graining approach outlined above provides a sparse spectral representation of high-dimensional microscopic single-cell data. The associated harmonic basis functions and vectors have an intuitive physical meaning, convenient algebraic properties and, as we will see, encode information about the length scales and symmetries of the collective dynamics.

Temporal mode representation

We further compress the dynamical information by representing the time series of the modes in terms of Chebyshev polynomial basis functions $T_{n} (t)$ (Driscoll et al., 2014; Mason and Handscomb, 2002). To simplify notation, we define a dynamic mode vector $a (t) = {[ρ_{l m} (t), j_{l m}^{(1)} (t), j_{l m}^{(2)} (t)]}^{⊤}$ that collects all the modes up to $l = l_{max}$ determined in the previous section and consider an expansion

a (t) = \sum_{n = 0}^{n_{max}} T_{n} (t) {\hat{a}}_{n}

in terms of the spatio-temporal mode coefficients ${\hat{a}}_{n}$ with temporal mode number $n$ (Appendix 2). This compression allows us to accurately evaluate time derivatives of the mode amplitudes (Supekar et al., 2021), an important step when using Equation 6 to determine flux modes $j_{l m}^{(1)} (t)$ directly from density modes $ρ_{l m}$ . Fixing $l_{max} = 4$ and $n_{max} = 30$ in the remainder, the initial single-cell data set of about 1.4 million recorded cell position entries, or 4.2 million degrees of freedom, has thus been reduced to 2250 mode coefficients, corresponding to a compression ratio $≳ 1800$ .

Characterization of the developmental mode dynamics

A harmonic mode decomposition naturally integrates the geometry of the underlying domain and simultaneously provides useful insights into spatial scales and symmetries of the dynamics. For each mode $(l m)$ in the sets of SHs ${Y_{l m}}$ , ${Ψ_{l m}}$ and ${Φ_{l m}}$ , the integer index $l$ indicates the spatial scale of the harmonic, with $l = 0$ being a constant and larger $l$ indicating progressively finer spatial scales. The second index $m \in {- l, - l + 1, \dots, l}$ provides additional information about the orientation of the harmonic scalar function or vector field. The modes $l = 1$ and $l = 2$ are particularly useful for characterizing the symmetry of spatial patterns on a spherical surface (Mietke et al., 2019; Scholich et al., 2020): Modes with $l = 1$ indicate patterns with a global polar symmetry, whereas modes with $l = 2$ represent spatial patterns with a global nematic symmetry. We now exploit these features for a detailed characterization of the symmetry breaking that takes place during cellular rearrangements and to study the properties of the cellular flux in more detail. To this end, we discuss spatial averages

⟨ O ⟩_{s} (t) = \frac{1}{A_{s}} \int_{S} d A_{s} O (r, t)

of different real-space observables $O (r, t)$ over the mid-surface $S$ .

Mode signatures of developmental symmetry breaking

To study how different developmental stages and their associated symmetry breaking events are reflected in the mode representation, we first consider the average cell surface density fluctuations

{⟨ {(ρ - {⟨ ρ ⟩}_{s})}^{2} ⟩}_{s} = \sum_{l = 1}^{l_{max}} \sum_{m = - l}^{l} ρ_{l m}^{2} (t) .

For each mode $l$ , the power spectrum $P_{ρ, l} (t) = \sum_{m = - l}^{l} ρ_{l m}^{2} (t)$ in Equation 9 provides a rotationally invariant quantity (Çetingül et al., 2012; Schwab et al., 2013) that can effectively serve as an order parameter to characterize the symmetry of cell density patterns on the spherical surface. The dynamics of the density fluctuations given in Equation 9 broken down into contributions $P_{ρ, l} (t)$ from each mode $l \leq l_{max} = 4$ is shown in Figure 2B. Several features of this representation are particularly striking and can be directly related to specific developmental stages. First, patterns of cell surface density fluctuations evolve from a dominantly polar symmetry ( $l = 1$ ) into density patterns with a prominent nematic symmetry ( $l = 2$ ). These mode signatures intuitively reflect the essential symmetry breaking that takes place when cells collectively reorganize from an initially localized cell dome (Figure 1B, 52 min) into an elongated shape that wraps in an open ring-like pattern around the yolk cell (Figure 1B, 760 min). Second, during this transition at around 300 min (9 hpf) (black triangle in Figure 2B), the cell surface density is most homogeneous as fluctuations become minimal for all modes $l$ . Interestingly, this time point approximately marks the completion of epiboly, when the different cell layers have fully engulfed the yolk. Finally, although in a less pronounced manner, the power spectrum of the mode $l = 4$ also exhibits an increased amplitude towards later times, indicating the formation of structures at finer spatial scales as development progresses. We find that mode signatures of the symmetry breaking and progression through developmental stages are robust (Figure 2—figure supplement 1B, D), illustrating that mode-based analysis can provide a systematic and meaningful characterization of developmental symmetry breaking events.

Figure 2 with 4 supplements see all

Download asset Open asset

Mode signatures of developmental symmetry breaking and topological defects in cellular flux.

(A) Two-dimensional Mollweide projection of the compressed coarse-grained density field $ρ (r, t)$ (colormap) and of the coarse-grained cell flux $J (r, t)$ (streamlines) at different time points of zebrafish gastrulation.White circles depict topological defects of charge +1 in the flux vector field, red circles depict defects with charge -1. The total defect charge is 2 at all times. Defects are seen to ‘lead’ the large-scale motion of cells and later localize mostly along the curve defined by the forming spine. Animal pole (AP) and ventral pole (VP) are located at top and bottom, respectively. (B) Density fluctuations as a function of developmental time (see Equation 9), broken down in contributions from different harmonic modes $l$ . The underlying symmetry breaking is highlighted prominently by this representation: During the first 75% of epiboly (0–280 min) cells migrate away from, but are still mostly located near the animal pole, presenting a density pattern with polar symmetry ( $l = 1$ ). During the following convergent extension phase cells converge towards a confined elongated region that is ‘wrapped’ around the yolk, corresponding to a density pattern with nematic symmetry ( $l = 2$ ). Black triangles indicate transition from epiboly to convergent extension. (C) Comparison of surface averaged divergence $\nabla_{S} \cdot J$ and curl $\nabla_{S} \times J$ of the cellular flux computed via Equation 10a and Equation 10b, respectively (top). A relative curl amplitude $S_{c u r l}$ computed from these quantities via Equation 11 correlates with the appearance of an increased number of topological defects in the cell flux (bottom), suggesting that incompressible, rotational cell flux is associated with the formation of defects.

Mode signatures of emergent topological defects in cellular flux

The vectorial nature of the cell number flux $J (r, t)$ on a spherical surface implies the presence of topological defects (colored circles in Figure 2A, see Materials and methods) (Kamien, 2002). Several recent experimental results pertaining to the self-organization of multicellular systems suggest an important role of such topological defects in organizing morphogenetic events (Doostmohammadi et al., 2016; Saw et al., 2017; Guillamat et al., 2020; Copenhagen et al., 2020; Meacock et al., 2020; Maroudas-Sacks et al., 2020). We therefore analyze how defects within the cell number flux $J (r, t)$ are dynamically organized during early zebrafish gastrulation and if signatures of defect formation and annihilation are present in the mode representation Equation 5. We first consider the average squared divergence and curl of the cell number flux given by

{⟨ {(\nabla_{S} \cdot J)}^{2} ⟩}_{s} = \sum_{l = 1}^{l_{max}} \sum_{m = - l}^{m} {[\frac{l (l + 1)}{R_{s}} j_{l m}^{(1)} (t)]}^{2},

{⟨ {(\nabla_{S} \times J)}^{2} ⟩}_{s} = \sum_{l = 1}^{l_{max}} \sum_{m = - l}^{m} {[\frac{l (l + 1)}{R_{s}} j_{l m}^{(2)} (t)]}^{2},

which are shown in Figure 2C (top). The two contributions to the collective cellular dynamics – locally compressible, divergent flux quantified by the divergence $\nabla_{S} \cdot J$ and locally incompressible, rotational cell motion characterized by the curl $\nabla_{S} \times J$ – are independently determined by the modes $j_{l m}^{(1)} (t)$ and $j_{l m}^{(2)} (t)$ . Therefore, each contribution can be evaluated conveniently and with high accuracy from a representation of $J (r, t)$ in terms of vector SHs. From Figure 2C (top), we see that the most significant divergent flux (blue curve) occurs around 300 min at the transition from epiboly towards the convergence and extension stage. A quantification of the incompressible rotational flux relative to the total cell number flux is shown in Figure 2C (bottom), where we plotted the relative curl amplitude

S_{curl} (t) = \frac{\sum_{l, m} {[j_{l m}^{(2)} (t)]}^{2}}{\sum_{l, m} {[j_{l, m}^{(1)} (t)]}^{2} + \sum_{l, m} {[j_{l, m}^{(2)} (t)]}^{2}} .

This measure suggests a correlation between incompressible rotational cell motion and the occurrence of topological defects (circles in Figure 2A) in the cell flux $J (r, t)$ . The total number of topological defects present at any time point is depicted in Figure 2C (bottom, blue curve). Because the vector-valued flux is defined on a sphere, we observe that the total topological charge always sums to +2 (Kamien, 2002), while additional defect pairs with opposite charge (red +1 and white -1 circles in Figure 2A) can be created, resulting in total defect numbers greater than two (see Figure 2C, bottom). Interestingly, the relative curl amplitude $S_{curl}$ defined in Equation 11 indicates that increased contributions from incompressible rotational flux are associated with the formation of topological defects in the cell number flux, a feature that is robustly identified by our framework (Figure 2—figure supplement 1A, C, Figure 2—figure supplement 3, Figure 2—figure supplement 4). The appearance of additional defects at the end of epiboly, when the developing embryo begins to extrude more significantly in the radial direction, suggests that topological defects in the 2D projected cellular flux fields could signal the start of the formation of more complex structures in three dimensions.

Learning a linear hydrodynamic model of the developmental mode dynamics

The results in Figure 2 confirm that a low-dimensional mode representation can capture essential characteristics of developmental symmetry breaking processes. The mode representation therefore provides a natural starting point for the inference of hydrodynamic models from coarse-grained cell-tracking data. For a given time-dependent mode vector $a (t) = {[ρ_{l m} (t), j_{l m}^{(1)} (t), j_{l m}^{(2)} (t)]}^{⊤}$ that contains all modes up to $l = l_{max}$ , the simplest hydrodynamic model corresponds to the linear dynamical equation

\frac{d a (t)}{d t} = M \cdot a (t),

where the constant coefficient matrix $M$ encodes the couplings between different modes. Intuitively, Equation 12 aims to describe an experimentally observed density and flux dynamics in terms of a relaxation process, starting from inhomogeneous initial conditions represented by $a (0)$ . The mathematical learning problem is then to find a coefficient matrix $M$ such that the linear model Equation 12 holds for the mode vector time series $a (t)$ that was determined from the coarse-graining procedure described in the previous sections.

Validation of the learning framework using active Brownian particle dynamics

Before applying the combined coarse-graining and inference framework to experimental data, we illustrate and validate the learning approach on synthetic data for which coarse-graining results and hydrodynamic mean-field equations are analytically tractable. To this end, we consider the stochastic dynamics of non-interacting active Brownian particles (ABPs) on the unit sphere of radius $R_{0} = 1$ (Sknepnek and Henkes, 2015; Fily et al., 2016; Castro-Villarreal and Sevilla, 2018). Similar to a migrating cell, an ABP at position $x (t)$ moves across the unit sphere at constant speed v₀ in the direction of its fluctuating orientation unit vector $u (t)$ . The strength of the orientational Gaussian white noise is characterized by a rotational diffusion constant $D_{r}$ (Figure 3A, Appendix 3).

Figure 3

Download asset Open asset

Learning active Brownian particle (ABP) dynamics on a sphere.

(A) ABPs move on a unit sphere (radius $R_{0} = 1$ ) with angular speed $v_{0} = 1$ along a tangential unit vector $u (t)$ that is subject to stochastic in-plane fluctuations (see Appendix 3 for further details). Example single-particle trajectories are shown in the high-noise (orange, $D_{r} = 10$ in units of $R_{0} v_{0}$ ) and in the low-noise regime (blue, $D_{r} = 0.5$ ). Time $t$ is measured in units of $R_{0} / v_{0}$ in all panels. (B) Position correlation function $⟨ x (t) \cdot x (0) ⟩$ averaged over $3 \times 10^{4}$ independent ABP trajectories show distinct oscillations of period $\approx 2 π$ in the low-noise regime, as ABPs orbit the spherical surface more persistently (see Video 3). Standard error of the mean is smaller than symbol size. (C) Analytically predicted (left) and inferred (right) dynamical matrices $M$ (see Equation 12) describing the mean-field dynamics of a large collection of non-interacting ABPs (see Equation 13a, Equation 13b, Equation 13c and Appendix 3) show good quantitative agreement. (D) Mollweide projections of coarse-grained ABP simulations with $v_{0} = 1$ and $D_{r} = 0.5$ using cell positions from the first time point in the zebrafish data (Figure 1) as the initial condition: At each position 60 particles with random orientation were generated and their ABP dynamics simulated, amounting to approximately $1.2 \times 10^{5}$ particles in total. The density fields homogenize over time, where the maximum density at $t = 12.3$ has decayed to about 5% of the maximum density at $t = 1.02$ . Blue lines and arrows indicate streamlines of the cell flux $J (r, t)$ . (E) Simulation of the learned linear model, Equation 12 with $M$ shown in (C) (right), for the same initial condition as in D. Marked time points indicate intervals of learning, validation and prediction phases of the model inference (see Appendix 4).

Video 3

Download asset

Compared with conventional passive Brownian motion, self-propulsion of an ABP along its orientation direction $u$ introduces a persistence to the particle’s motion that is reduced as rotational noise $D_{r}$ is increased. Additionally, the topology of the spherical surface implies that in the low-noise regime, $R_{0} D_{r} / v_{0} §lt; 1$ , particles are expected to return to the vicinity of their starting points after a duration $Δ t \approx 2 π R_{0} / v_{0}$ . The conjunction of persistent motion and topology then leads to oscillatory dynamics in the positional correlation $⟨ x (t) \cdot x (0) ⟩$ (blue dots in Figure 3B, Appendix 3). Comparing correlations from stochastic ABP simulations in different noise regimes with theoretical predictions (solid lines in Figure 3B) validates our numerical ABP simulation scheme.

To generate a test data set for our coarse-graining and inference framework, we simulated non-interacting ABPs in both the low-noise ( $R_{0} D_{r} / v_{0} §lt; 1$ ) and the high-noise ( $R_{0} D_{r} / v_{0} §gt; 1$ ) regime with initial positions drawn from the experimental data shown in Figure 1. Specifically, at each cell position present in the data, we generated 60 particles with random orientation, amounting to approximately $1.2 \times 10^{5}$ particles in total, and simulated their dynamics on a unit sphere. The resulting trajectory data were coarse-grained following the procedure outlined in the previous sections, yielding dynamic density fields $ρ (r, t)$ and fluxes $J (r, t)$ (Video 3), together with their mode representations $ρ_{l m} (t), j_{l m}^{(1)} (t)$ and $j_{l m}^{(2)} (t)$ .

In the second ‘learning’ step, we infer a sparse mode coupling matrix $M$ that approximates the dynamics Equation 12 for the dynamical mode vectors $a (t) = {[ρ_{l m}, j_{l m}^{(1)}, j_{l m}^{(2)}]}^{⊤}$ obtained from the coarse-grained simulated ABP data. Our inference algorithm combines adjoint techniques (Rackauckas et al., 2021) and a multi-step sequential thresholding approach inspired by the Sparse Identification of Nonlinear Dynamics (SINDy) algorithm introduced by Brunton et al., 2016. The full algorithm is detailed in Appendix 4 and illustrated in the summary flowchart Appendix 4—figure 1. Importantly, we perform the sparse regression using dynamical mode vectors $a (t)$ rescaled by their median absolute deviation (MAD) to compensate for substantial scale variations between different modes. The final output matrix $M$ of this learning algorithm is shown in the right panel of Figure 3C and can be compared against the analytically coarse-grained dynamics of ABPs on curved surfaces (Fily et al., 2016; Castro-Villarreal and Sevilla, 2018). Under suitable closure assumptions (Appendix 3), the mean-field dynamics of ABPs on a unit sphere is given in harmonic mode space by

\frac{d j_{l m}^{(2)}}{d t} = - D_{r} j_{l m}^{(2)},

\frac{d j_{l m}^{(1)}}{d t} = - \frac{v_{0}^{2}}{2 R_{0}} ρ_{l m} - D_{r} j_{l m}^{(1)}

\frac{d j_{l m}^{(2)}}{d t} = - D_{r} j_{l m}^{(2)}

from which we can read off the mode coupling matrix $M$ shown in the left panel of Figure 3C. A direct comparison between the theoretical and the inferred matrices shows that our framework recovers both the structure and the quantitative values of $M$ with good accuracy. Due to the finite number of ABPs used to determine the coarse-grained fields, we do not expect that the theoretically predicted coupling matrix is recovered perfectly from the data. Instead, some mode couplings suggested by Equation 13a may not be present or modified in the particular realization of the ABP dynamics that was coarse-grained. Indeed, direct simulation of the learned model projected in real space (Figure 3E) reveals a density and flux dynamics that agrees very well with the dynamics of the the coarse-grained input data (Figure 3D). Altogether, these results demonstrate that the proposed inference framework enables us to to faithfully recover expected mean-field dynamics from coarse-grained fields of noisy particle-based data.

Learning developmental mode dynamics from experimental data

The same inference framework can now be directly applied to the coarse-grained experimental zebrafish embryo data shown in Figure 1C and D, yielding a sparse coefficient matrix $M$ (Figure 4A and B) that encodes the dynamics of the developmental mode vector $a (t) = {[ρ_{l m} (t), j_{l m}^{(1)} (t), j_{l m}^{(2)} (t)]}^{⊤}$ according to Equation 12. The inferred coupling between the time derivative of density modes $ρ_{l m}$ and flux modes $j_{l m}^{(1)}$ faithfully recovers mass conservation (Figure 4C; see Equation 6). Overall, the learned matrix $M$ has 395 non-zero elements, effectively providing further compression of the experimental data, which required 2,250 spatio-temporal mode coefficients collected in ${\hat{a}}_{n}$ (see Equation 7) for its representation. Using the mode vector $a (t = 0)$ of the first experimental time point as the initial condition, the inferred minimal model Equation 12 with $M$ shown in (Figure 4A and B) faithfully recovers both the mode and real-space dynamics seen in the coarse-grained fields of the experimental input data (Figure 4E–G, Video 4).

Figure 4 with 1 supplement see all

Download asset Open asset

Model learning for experimental data of collective cell motion during early zebrafish development.

(A) Visualization of the constant mode coupling matrix $M$ that was learned from experimental data (see Appendix 4) and describes the dynamics of the mode vector $a = {[ρ_{l m} (t), j_{l m}^{(1)} (t), j_{l m}^{(2)} (t)]}^{T}$ via Equation 12. Dimensionless fields are defined by ${\hat{ρ}}_{l m} = R_{s}^{2} ρ_{l m}$ and ${\hat{j}}_{l m}^{(i)} = R_{s} Δ t j_{l m}^{(i)}$ ( $i = 1, 2$ ) with $R_{s} = 300 μ m$ and $Δ t = 2$ min. (B) Scaling the learned matrix $M$ by the Mean Absolute Deviation (MAD) of the modes (see Appendix 4) reveals structures reminiscent of the mode coupling matrix learned for ABPs (Figure 3C). (C) The learned model recovers mass conservation in mode space (Equation 6). (D) Comparison of theoretical and inferred real-space kernels (see Equation 14 and Appendix 4) for the ABP dynamics and for the experimental data of collective cell motion. The trace of the non-dimensional kernel ${\hat{M}}^{J} (r, r^{'})$ (the only non-zero eigenvalue, Appendix 4—figure 2) indicates a localized flux-flux coupling with a similar profile among both systems. The oscillating magnitude of the non-dimensionalized density-flux kernel $| {\hat{m}}^{ρ} (r, r^{'}) |$ (insets) in the ABP system indicates a gradient-like coupling and is consequence of the persistent ABP motion. In the experimental data, a first peak around $ω = π / 4$ is also visible, but less pronounced. All kernel properties were computed by averaging over pairs of positions $r, r^{'}$ that are separated by the same angular distance $ω = \arccos (r \cdot r^{'}) \in [0, π]$ . Solid lines indicate mean, shaded areas indicate standard deviation. (E) Comparison of experimental mode dynamics (circles) with numerical solution (solid line) of the minimal model Equation 12 for learned matrix $M$ visualized in A. For clarity, the comparison is shown for the two dominant modes of each set of harmonic modes $ρ_{l m}, j_{l m}^{(1)}$ and $j_{l m}^{(2)}$ . (**F, G**) Mollweide projections of the experimental data (F) and of the numerical solution of the learned model (G) show very good agreement (Video 4). Blue lines and arrows illustrate streamlines defined by the cell flux $J (r, t)$ , circles depict defects with topological charge +1 (white) and -1 (red).

Video 4

Download asset

It is instructive to analyze the inferred matrix $M$ and the linear model it encodes in more detail. Comparing the MAD-rescaled matrix (see Appendix 4) learned for the experimental zebrafish data (Figure 4B) with the non-dimensionalized matrix learned for the active Brownian particle dynamics (Figure 3C), we find similar patterns of prominent diagonal and block-diagonal couplings. Consistent with the analysis of single cell trajectories (Shah et al., 2019), this suggests that a random, but persistent movement of cells akin to ABPs moving on a sphere partially contributes to the early gastrulation process in zebrafish. This is complemented in the minimal model of the experimental dynamics by significant off-diagonal contributions (Figure 4B), which are absent in the non-interacting ABP model. Such off-diagonal contributions represent effective linear approximations of cell-cell interactions, environmental influences or other external stimuli reflected in the experimental time-series data. Ultimately, such contributions to the mode coupling matrix $M$ help realize the symmetry breaking process observed in the underlying experimental data (Figure 2).

The inferred mode coupling matrix $M$ shown in Figure 4B together with Equation 12 provides a highly robust minimal model. Specifically, despite being linear, it is numerically stable over a period approximately four times as long as the input data from which the matrix $M$ was learned. Furthermore, simulations with modified initial conditions (see Figure 4—figure supplement 1) still exhibit a characteristic symmetry breaking and lead to the emergence of density and flux patterns similar to those seen in Figure 4F and G. For example, simulating Equation 12 using the initial condition of a different experimental data set (Figure 2—figure supplement 1) leads to final patterns with the same symmetry as in the original training data, further corroborating that the observed symmetry breaking is directly encoded in the interactions represented by the matrix $M$ . A similar robustness is observed under moderate perturbations of the initial condition, such as a rotation of initial cell density patterns relative to the coordinate system in which $M$ was inferred, or a local depletion of the initial density, emulating a partial removal of cells as experimentally realized in Morita et al., 2017. Taken together, these numerical experiments demonstrate that the inferred mode coupling matrix $M$ meaningfully captures the dynamics and interactions of cells that facilitate the symmetry breaking observed during early zebrafish development.

Green’s function representation of learned models in real space

To characterize the inferred spatial interactions in more detail, we can analyze the real-space representation of the learned mode coupling matrix $M$ . While the density dynamics represented by $M$ (the first row in Figure 4AB) simply reflects mass conservation Equation 1 in real space, the dynamics of the flux (the second and third row in Figure 4A and B) corresponds in real space to the integral equation (Appendix 4)

\frac{\partial}{\partial t} J (r, t) = \int d Ω^{'} [m^{ρ} (r, r^{'}) ρ (r^{'}, t) + M {}^{J}{(r, r^{'})} \cdot J (r^{'}, t)],

where $d Ω^{'} = \sin θ^{'} d θ^{'} d ϕ^{'}$ is the spherical surface area element. The vector-valued kernel $m^{ρ} (r, r^{'})$ in Equation 14 connects the distribution of cell density $ρ$ across the surface to dynamic changes of the flux $J$ at a given point $r$ . Similarly, the matrix-valued kernel $M^{J} (r, r^{'})$ describes how the distribution of cell fluxes at $r^{'}$ affects temporal changes of the flux at $r$ .

To analyze the spatial range of interactions between points $r$ and $r^{'}$ , we use the fact that the matrix-valued kernel $M^{J} (r, r^{'})$ has only one non-zero eigenvalue (Appendix 4—figure 2). Consequently, the trace $tr (M^{J})$ serves as a proxy for the distance-dependent interaction strength mediated by $M^{J}$ . Averages of $tr (M^{J})$ over point-pairs with the same angular distance $ω = acos (r \cdot r^{'})$ are shown for the ABP dynamics and for the minimal model inferred from experimental data in Figure 4D. Note that to make the models amenable to comparison, we compute $M^{J} (r, r^{'})$ from the known mean-field model of ABPs (Equation 13a) using the same finite number of modes as used to represent the ABP and the zebrafish data ( $l_{max} = 4$ ). In theory, one expects for the ABP dynamics a highly localized, homogeneous kernel $tr (M^{J}) \sim δ (r - r^{'})$ , so that an exact spectral representation would require an infinite number of modes (see Appendix 4). In practice, using a finite number of modes leads to a wider kernel range (Figure 4D ’ABP theory’) and introduces an apparent spatial inhomogeneity, as indicated by the non-zero standard deviation of $tr (M^{J})$ at fixed distance $ω$ (blue shades). Both the quantitative profile of $tr (M^{J})$ and its variation are successfully recovered by applying the inference framework to stochastic simulations of ABPs (Figure 4D ’ABP simulation’) where $M^{J} (r, r^{'})$ was computed from the learned mode coupling matrix $M$ shown in Figure 3C. For the inferred minimal model of the cell dynamics (Figure 4D ’Zebrafish experiment’), we find a similar short-ranged flux-flux coupling mediated by $M^{J}$ . However, the increased variability of $tr (M^{J})$ at fixed distances $ω$ indicates more substantial spatial inhomogeneities of the corresponding interactions. These inhomogeneities are absent in a non-interacting system of ABPs and represent an interpretable real-space signature of the symmetry-breaking mechanisms built into the underlying mode coupling matrix $M$ .

A similar analysis can be performed for the kernel $m^{ρ} (r, r^{'})$ that couples the density at position $r^{'}$ to dynamics of fluxes at position $r$ (see Equation 14), where we average the magnitude $| m^{ρ} (r, r^{'}) |$ over pairs ( $r$ , $r^{'}$ ) with the same angular distance $ω$ (Figure 4D insets). Using a finite number of modes to compute this kernel in the different scenarios again introduces apparent spatial inhomogeneities in all cases. Additionally, all kernel profiles exhibit a distinct maximum at short range, indicating a coupling between density gradients and the flux dynamics that emerges microscopically from a persistent ABP and cell motion (see Appendix 3 and 4) – an observations that is consistent with the similar block-diagonal structure of both inferred matrices $M$ (compare Figures 3C and 4B).

In conclusion, the real-space analysis and comparison of inferred interaction kernels further highlights potential ABP-like contributions to the collective cellular organization during early zebrafish development and reveals an effectively non-local coupling between density and flux dynamics. The latter could result, for example, from unresolved fast-evolving morphogens (Hannezo and Heisenberg, 2019), through mechanical interactions with the surrounding material (Münster et al., 2019) or due to other relevant degrees of freedom that are not explicitly captured in this linear hydrodynamic model. More generally, a real-space representation of kernels provides an alternative interpretable way to study the interactions and symmetry-breaking mechanisms encoded by models directly learned in mode space.

Discussion

Leveraging a sparse mode representation of collective cellular dynamics on a curved surface, we have presented a learning framework that translates single-cell trajectories into quantitative hydrodynamic models. This work complements traditional approaches to find quantitative continuum models of complex multicellular processes (Etournay et al., 2015; Hannezo et al., 2015; Morita et al., 2017; Streichan et al., 2018; Münster et al., 2019) that match problem-specific constitutive relations of active materials in real-space with experimental observations. We have demonstrated here that length scales and symmetries associated with a mode representation can directly inform about the character of symmetry breaking transitions and topological features of collective cellular motion even before a model is specified. The successful applications to synthetic ABP simulation data and experimental zebrafish embryo data show that model learning in mode space provides a promising and computationally feasible approach to infer quantitative interpretable models in complex geometries.

The learned linear minimal model for cell migration during early zebrafish morphogenesis quantitatively recapitulates the spatiotemporal dynamics of a complex developmental process (Figure 4F and G), and highlights similarities between collective cell migration and analytically tractable ABP dynamics on a curved surface. An extension to nonlinear mode-coupling models or an integration of additional, experimentally measured degrees of freedom, such as concentration fields of morphogens involved in mechanochemical feedbacks (Hannezo and Heisenberg, 2019), is in principle straightforward by including nonlinear terms in Equation 12. Furthermore, the above framework could be generalized to describe the dynamics within a spherical shell of finite height by complementing the surface vector SHs used in this work by their radial counterpart (Barrera et al., 1985).

To provide a concrete example, we focused here on applying the model learning framework to single-cell tracking data of early zebrafish morphogenesis. However, the essentially spherical organization of cells during gastrulation observed in zebrafish is shared by many species whose early development occurs through a similar discoidal cleavage (Gilbert and Barresi, 2016), and the framework introduced here is directly applicable once tracking data becomes available for these systems. More generally, as novel imaging technologies are being developed (Keller et al., 2010; Royer et al., 2016; Shah et al., 2019), we expect that even larger and more detailed imaging data will further facilitate the exploration of finer scales and length-scale bridging processes (Lenne et al., 2021) through learning approaches that directly built on mode-based data representations.

Materials and methods

Data pre-processing

Request a detailed protocol

We obtained two single-cell tracking data sets from the experiments described in Shah et al., 2019. These data consist of the Cartesian coordinates of each cell together with a tracking ID. Some of the data is accessible at https://idr.openmicroscopy.org with ID number idr0068. We first denoised each cell trajectory using MATLAB’s (Matlab, 2019) wavelet denoiser function wdenoise, and centered the cloud of cells by least-squares fitting a spherical surface through it and shifting the origin at each time to coincide with the center of this sphere. We then computed the velocity of each cell by using Tikhonov-regularized differentiation as described in Knowles and Renka, 2014 and implemented in the MATLAB third-party module rdiff (Wagner, 2020). After examination of the cells’ velocity distribution, we further removed outlier cells whose speed is in the 95th percentile or above and verified that this operation only removes aberrant cells. Finally, we rotated the data to align the animal pole of the embryo with the $z$ -axis, as determined by the direction of the center of mass of the initial cell distribution. The resulting single cell data are shown as point clouds in Figure 1B and in Video 1.

Topological defect tracking

Request a detailed protocol

We have developed a defect tracker that identifies topological defects in vector fields tangent to a spherical surface via integration along suitable Burger circuits. The corresponding software is available at (https://github.com/NicoRomeo/surf-vec-defects; Romeo, 2022, copy archived at swh:1:rev:6dc742c376b0d085e19ece65f932ac6935342aba).

Appendix 1

Consistent coarse-graining on curved surfaces

We describe the derivation of self-consistent coarse-graining kernels that are used in the main text to convert single cell information into a continuous density field and its associated fluxes on a spherical surface. We first motivate this problem for a flat surface and then proceed with a detailed derivation for the case of a spherical surface.

Kernel consistency in Euclidean space

It is instructive to first consider a set of particles $α = 1, 2, 3, \dots$ at positions $X_{α} (t)$ moving with velocities $V_{α} (t) = d X_{α} / d t$ , where capitalized vectors indicate position and velocity in Euclidean space, e.g. particles move on a flat surface or within some three-dimensional volume. A coarse-grained density $ρ (X, t)$ and a mass flux $J (X, t)$ can be defined from this microscopic information by

\begin{array}{ll} ρ (X, t) & = \sum_{α} K_{e} [X, X_{α} (t)], \end{array}

\begin{array}{ll} J (X, t) & = \sum_{α} K_{e} [X, X_{α} (t)] \cdot V_{α} (t), \end{array}

where $K_{e} (X, X^{'})$ and $K_{e} (X, X^{'})$ represent a scalar-valued and a matrix-valued kernel function, respectively. At the same time, in a system with a constant number of particles, mass conservation implies, in general,

\partial_{t} ρ (X, t) + \nabla_{X} \cdot J (X, t) = 0,

relating the density $ρ (X, t)$ and the mass flux $J (X, t)$ of particles. Using the coarse-graining prescriptions Equation 15a and Equation 15b directly in Equation 16 and assuming the resulting relation must hold for any set of particle trajectories, one finds a general kernel consistency relation

\nabla_{X^{'}} K_{e} (X, X^{'}) + \nabla_{X} \cdot K_{e} (X, X^{'}) = 0 .

This condition is automatically satisfied for any translationally invariant and isotropic pair of kernels $K_{e} (X, X^{'}) = K_{e} (X - X^{'})$ and $K_{e} (X, X^{'}) = K_{e} (X - X^{'}) I$ , where $I$ is the unit matrix. Coarse-graining with such kernels is frequently employed in practice: Positions and velocities can be, for example, simply convolved with a Gaussian function of mean zero (Supekar et al., 2021).

Kernel consistency on a curved surface

For a surface parameterized by $r (s^{1}, s^{2}) \in ℝ^{3}$ with generalized coordinates $s^{1}, s^{2}$ , two tangential basis vectors are defined by $e_{i} = \partial r / \partial s^{i}$ ( $i = 1, 2$ ). Partial derivatives are, in the following, denoted $\partial_{i} := \partial / \partial s^{i}$ . The metric tensor is given by $g_{i j} = e_{i} \cdot e_{j}$ . The mean curvature is defined by $H n = - \nabla_{i} e^{i} / 2$ , where $n = e_{1} \times e_{2} / | e_{1} \times e_{2} |$ denotes the unit surface normal and the Einstein summation convention is used. The covariant form of mass conservation Equation 1 (main text) on a curved surface reads

\partial_{t} ρ + \nabla_{i} J^{i} = 0,

with $J^{i} = e^{i} \cdot J$ and $\nabla_{i}$ denotes the covariant derivative. In general, we are interested in describing an effective dynamics for cell positions and velocities that are projected onto a common reference sphere of radius $R_{s}$ . Such a description can be found by first formulating the coarse-graining approach for a unit sphere, on which particle positions and velocities are fully determined by angular coordinates and corresponding angular velocities, and finally rescaling the density and flux fields by suitable factors of $R_{s}$ . The corresponding coarse-graining Equation 2b (main text) of in-plane angular velocities ${\bar{v}}_{α} (t) = v_{α} (t) / | r_{α} (t) |$ for particles $α$ on a unit sphere reads covariantly

J^{i} = \sum_{α} K {(r, r_{α})}_{j^{'}}^{i} {\bar{v}}_{α}^{j^{'}},

where ${\bar{v}}_{α}^{i} = e^{i} \cdot {\bar{v}}_{α}$ and we drop the dependence on time to simplify the notation. The two-point kernel tensor $K {(r, r^{'})}_{i j^{'}}$ (a ‘bitensor’) is evaluated in the tangent space of $r$ for its first index and in the tangent space of $r^{'}$ at the second, primed index (Appendix 1—figure 1). Mass conservation on a curved surface, Equation 18, together with the coarse-graining prescriptions Equation 2a (main text) and Equation 19 then implies a covariant kernel consistency relation

\partial_{j^{'}} K (r, r^{'}) + \nabla_{i} K {(r, r^{'})}_{j^{'}}^{i} = 0 .

Appendix 1—figure 1

Download asset Open asset

Illustration of the action of the coarse-graining tensor kernel $K {(r, r^{'})}_{i j^{'}}$ Equation 19.

Left: $K_{i j^{'}}$ acts in the two tangent space at points $r$ and $r^{'}$ that are separated by an angular distance $ω = acos (r \cdot r^{'})$ . Each tangent plane has corresponding basis vectors, $e_{i}$ $e_{i^{'}}$ for $i = 1, 2$ . Right: The tensor kernel $K_{i j^{'}} \sim e_{i} \cdot e_{j^{'}}$ projects vectors $u$ in the tangent space of $r^{'}$ and generates a vector $v$ tangent at $r$ .

Solving the kernel consistency relation on a sphere

We solve Equation 20 in the following on the unit sphere, such that $r = n$ corresponds to the surface normal. The final result can simply be rescaled to any spherical surface of radius $R_{s}$ . Furthermore, we note that the parameter

x = r \cdot r^{'}

provides a measure for the great circle distance $ω (x) = acos (x)$ between two points on a sphere. Hence, we consider an ansatz for the kernels in Equation 20 of the form

K (r, r^{'}) = f (x)

K {(r, r^{'})}_{i j^{'}} = g (x) e_{i} \cdot e_{j^{'}},

with two unknown scalar functions $f (x)$ and $g (x)$ . The relevant derivatives of the ansatz Equation 22a and Equation 22b can readily be evaluated to

\partial_{j^{'}} K (r, r^{'}) = \frac{d f (x)}{d x} r \cdot e_{j^{'}}

\nabla_{i} K {(r, r^{'})}_{j^{'}}^{i} = \frac{d g (x)}{d x} r^{'} \cdot (e_{i} \otimes e^{i}) \cdot e_{j^{'}} - 2 g (x) r \cdot e_{j^{'}} .

Here, ⊗ denotes a dyadic product and we use $\partial_{i} x = r^{'} \cdot e_{i}$ and $\partial_{i^{'}} x = r \cdot e_{i^{'}}$ , which follows from Equation 21, as well as $\nabla_{i} e^{i} = - 2 r$ in the second equation, which holds on a unit sphere and follows from the definition of the mean curvature. We then use the expansion of the identity matrix in $ℝ^{3}$ on the spherical basis $I = e_{i} \otimes e^{i} + n \otimes n$ , such that in our case with $r = n$ we have $e_{i} \otimes e^{i} = I - r \otimes r$ . Hence, Equation 23b becomes

\nabla_{i} K {(r, r^{'})}_{j^{'}}^{i} = - \frac{d g (x)}{d x} (r^{'} \cdot r) (r \cdot e_{j^{'}}) - 2 g (x) r \cdot e_{j^{'}} .

Using Equation 24 and Equation 23a in the kernel consistency relation Equation 20 and dividing by $r \cdot e_{j^{'}}$ (at $r = r^{'}$ , for which $r \cdot e_{j^{'}} = 0$ , Equation 20 is obeyed for any $f (x)$ , $g (x)$ ), we find that the scalar functions in the kernel ansatz Equation 22a and Equation 22b have to obey

x \frac{d g (x)}{d x} + 2 g (x) = \frac{d f (x)}{d x} .

Hence, the general covariant consistency relation Equation 20 implies for the kernel ansatz Equation 22a and Equation 22b that the weighting functions $g (x)$ and $f (x)$ must be related by

g (x) = \frac{1}{x^{2}} \int_{0}^{x} d u u \frac{d f (u)}{d u} .

Kernel functions with compact support

In the last step, we determine a family of kernel functions $g (x)$ and $f (x)$ defined on the interval $x \in [- 1, 1]$ that satisfy Equation 25, along with the requirements:

$f (x)$ and $g (x)$ must be $C^{1}$ regular on $[- 1, 1]$
$f \geq 0$ on $[- 1, 1]$
$f$ is normalized to one on the unit sphere.

Recalling $x = \cos [ω (r, r^{'})]$ with angular distance $ω$ between $r$ and $r^{'}$ , a family of functions fulfilling these conditions is given by

f_{k} (ω) = \frac{k + 1}{2 π} (\cos ω)^{k} 1_{{\cos ω > 0}}

g_{k} (ω) = \frac{k}{2 π} (\cos ω)^{k - 1} 1_{{\cos ω > 0}},

where $1_{{\cos ω > 0}}$ is an indicator function that is one if $\cos ω §gt; 0$ and vanishes otherwise (Appendix 1—figure 2). In this work, we have chosen the kernels Equation 22a and Equation 22b with $f = f_{k}$ and $g = g_{k}$ for $k = 6$ . For these kernels derived here, densities $ρ (r, t)$ and associated fluxes $J (r, t)$ that are coarse-grained on a unit sphere can be converted into effective densities and fluxes on a spherical surface of radius $R_{s}$ through the rescaling $ρ \to ρ / R_{s}^{2}$ and $J \to J / R_{s}$ . Equivalently, rescaled kernels $K (r, r^{'}) \to K (r, r^{'}) / R_{s}^{2}$ and $K {(r, r^{'})}_{i j^{'}} \to K {(r, r^{'})}_{i j^{'}} / R_{s}$ can be used directly, as was done in Equation 2a and Equation 2b of the main text to generate the data shown in Figure 1 (main text).

Appendix 1—figure 2

Download asset Open asset

Family of kernel functions $f_{k} (ω)$ and $g_{k} (ω)$ given in Equation 26a.

These functions represent weights of the coarse-graining kernels Equation 22a and Equation 22b that are defined such that the kernels satisfy the consistency relation Equation 19. $ω = acos (r \cdot r^{'})$ denotes angular distances between $r$ and. $r^{'}$ Coarse-graining of a conserved number of particles on a sphere to determine a density field $ρ$ (Equation 2a, main text) requires a different weighting – $f_{k} (ω)$ – than the coarse-graining of an associated flux $J$ (Equation 2b , main text), which requires a weighting $g_{k} (ω)$ instead to ensure that coarse-grained fields obey mass conservation Equation 18. A characteristic coarse-graining length scale associated with these kernels is the half-width at half-maximum (HWHM), which is related to $k$ by HWHM $= \arccos (2^{- 1 / k})$ .

Appendix 2

Spatio-temporal mode decomposition

In this section, we provide explicit expressions for the scalar and spherical harmonic basis functions (‘spatial modes’), as well as for the Chebyshev basis functions (‘temporal modes’) used in this work. Additionally, we describe a systematic approach to determine the minimal number of modes needed to describe the coarse-grained data, while preserving a high level of accuracy in the representation.

Spatial basis: spherical harmonics

In this work, we use the real spherical harmonics defined in spherical coordinates $(θ, ϕ)$ by Arfken et al., 2013 as

Y_{l m} (θ, ϕ) = \sqrt{\frac{2 l + 1}{4 π} \frac{(l - | m |)!}{(l + | m |)!}} P_{l}^{| m |} (\cos θ) N_{m} (ϕ),

where $P_{l}^{| m |} (x)$ is the associated Legendre polynomial of degree $l$ and order $| m |$ , and

N_{m} (ϕ) = {\begin{matrix} \sqrt{2} \cos (m ϕ) & if m §gt; 0 \\ 1 & if m = 0 \\ \sqrt{2} \sin (| m | ϕ) & if m §lt; 0 \end{matrix} .

Vector spherical harmonics can be defined and expressed as vector fields in 3D or covariantly as (Sandberg, 1978; Mietke et al., 2019)

Ψ_{l m} = \nabla_{S} Y_{l m} \Leftrightarrow Ψ_{(l m)}^{i} = g^{i j} \partial_{j} Y_{l m}

Φ_{l m} = \hat{r} \times Ψ_{l m} \Leftrightarrow Φ_{(l m)}^{i} = ϵ^{j i} \partial_{j} Y_{l m}

where $\nabla_{S} = e_{θ} \partial_{θ} + e_{ϕ} \sin^{- 1} θ \partial_{ϕ}$ denotes the gradient operator an a unit sphere, $ϵ_{i j}$ is the covariant Levi-Civita tensor, and $g_{i j}$ the metric tensor. Scalar harmonics $Y_{l m}$ and either vector harmonic $Λ_{l m} \in {Ψ_{l m}, Φ_{l m}}$ are orthogonal:

\int d Ω Y_{l m} Y_{l^{'} m^{'}} = δ_{l l^{'}} δ_{m m^{'}}

\int d Ω Λ_{l m} \cdot Λ_{l^{'} m^{'}} = l (l + 1) δ_{l l^{'}} δ_{m m^{'}},

where $d Ω = \sin θ d θ d ϕ$ . The increasing complexity of patterns and accuracy of reconstruction with larger $l$ is illustrated in Appendix 2—figure 1 and Video 2.

Appendix 2—figure 1

Download asset Open asset

Sequentially adding vector spherical harmonics $Ψ_{l m}$ and $Φ_{l m}$ – equivalent to increasing $l_{max}$ in Equation 5 – resolves increasing levels of details present in experimental flux fields ("Data").

Main features of the data are captured already by a relatively small number of modes ( $l_{max} = 4$ used throughout this work).

Temporal basis: Chebyshev polynomials

Chebyshev polynomials of the first kind $T_{n}$ are defined by Arfken et al., 2013.

T_{n} (\cos x) = \cos (n x) .

Chebyshev polynomials form an orthogonal basis of continuous functions on the interval $[- 1, 1]$ , such that an expansion

f (t) = \sum_{n = 0}^{n_{max}} c_{n} T_{n} (t)

uniformly converges as $n_{max} \to \infty$ (Driscoll et al., 2014). This representation also allows computing derivatives spectrally from

f^{'} (t) = \sum_{n = 0}^{n_{max}} c_{n} T_{n}^{'} (t) .

Information loss through coarse-graining

Coarse-graining microscopic data into smooth fields is an irreversible operation, during which some of the original particle information is irretrievably lost. The choice of coarse-graining scale is thus dictated by a trade-off between smoothness and information content - choosing larger coarse-graining scales leads to smoother fields but blurs finer scale structures which may be of interest. To inform our choice of coarse-graining scale, we quantify the loss of information incurred by the coarse-graining operation.

The measure we introduce to quantify information loss is based on the the well-known relationship between the smoothness of functions in real space and Fourier space (Stein and Shakarchi, 2011): A smooth function in real space should have a peaked, quickly decaying spectrum in Fourier space while a collection of point-like objects such as delta functions should have a uniform non-decaying spectrum. Specifically, we describe a uniformly sampled field as a $M \times N$ matrix with components being the field values $X_{i, j} = X (θ_{i}, ϕ_{j})$ . In our case, $X_{i, j}$ represents either the density field $ρ$ or any of the Cartesian components of the flux vector field $J$ at a given time point. We find the complex discrete Fourier spectrum ${\hat{X}}_{i, j}$ of this matrix using the two-dimensional fast Fourier transform. We then calculate the power spectral density (PSD) of the Fourier spectrum as $R_{i, j} = {| {\hat{X}}_{i, j} |}^{2}$ and interpret the normalized PSD.

P_{i, j} = \frac{R_{i, j}}{\sum_{a, b} R_{a, b}}

as a discrete probability distribution. The spectral entropy $S$ characterizing the information content of the field $X$ is then defined by

S (X) = - \frac{1}{\log_{2} N M} \sum_{i, j} P_{i, j} \log_{2} P_{i, j} .

Smooth fields are sharply peaked in Fourier space and have a low spectral entropy, whereas fields that resolve discrete single particle information are rather flat in Fourier space and have a large spectral entropy. The difference in entropy between particle data and smoothed fields then measures the information eliminated by the coarse-graining procedure. If we additionally normalize by the entropy of the spectral entropy $S_{0} (X)$ of the raw particle data, we finally obtain a relative measure of the information that is lost in the coarse-graining process. In general, a measure as given in Equation 34 can be defined for any transform with the property that smoothness in real space leads to a fast decaying spectrum in transform space.

We compute the spectral entropy of density and flux component fields at a representative time point and for varying coarse-graining length scales (Appendix 2—figure 2). Specifically, we coarse-grain density and flux through the procedure described in the main text and in Appendix 1 for different values of the kernel parameter $k$ (see Equation 26a). Large values of $k$ correspond to small coarse-graining length scales, with the effective half-width at half-maximum (HWHM) of the kernels given in Equation 22a and Equation 22b with weight functions Equation 26a and Equation 26b scaling as HWHM $= \arccos (2^{- 1 / k})$ . Normalized spectral entropies $S (X) / S_{0} (X)$ with $X \in {ρ, J}$ are then computed using Equation 34. For the flux field, we define $S (J) := S (J_{x}) + S (J_{y}) + S (J_{z})$ ("Flux sum" in Appendix 2—figure 2) and interpret the sum of these three contributions ("Flux x", "Flux y", "Flux z" in Appendix 2—figure 2) as the total information contained in the flux field. We find that the spectral entropies of all fields show similar features. In particular, an increasing coarse-graining width first results in a sharp loss of information as individual particle positions are blurred, followed by less steep information loss as continuous fields progressively lose details of finer structures. In this work, we use an intermediate value of the coarse-graining parameter $k = 6$ (yellow data in Appendix 2—figure 2).

Appendix 2—figure 2

Download asset Open asset

Normalized spectral entropy as a function of the coarse-graining kernel width (top) computed for density $ρ$ and flux field $J$ using Equation 34.

To evaluate the spectral entropy for the vector-valued flux, we define $S (J) := S (J_{x}) + S (J_{y}) + S (J_{z})$ (‘Flux sum’). The coarse-graining width – the half-width at half-maximum (HWHM) of the coarse-graining kernels Equation 22a and Equation 22b with weight functions Equation 26a – is varied by varying the kernel index $k$ , where $HWHM = \arccos (2^{- 1 / k})$ (see Appendix 1—figure 2). The fields $ρ$ and $| J |$ are shown in the two bottom rows for different values of $k$ i. $k = 5000$ (blue, data used to compute the reference spectral entropies $S_{0} (ρ)$ and $S_{0} (J)$ ) ii. $k = 60$ (brown) iii. $k = 6$ (yellow, used in main text) iv. $k = 2$ (purple).

Optimal compression in space and time

Spectral representations are exact in the limit of an infinite number of modes. In practice, we choose a maximal harmonic mode number $l_{max}$ and maximal Chebyshev mode number $n_{max}$ . A too large value of $l_{max}$ and $n_{max}$ provides little compression benefit, while too small values suffer accuracy penalties. Hence, there is a compression-accuracy trade-off that we seek to optimize. To evaluate the trade-off quantitatively, we define a heuristic compression metric $C$ by

1 / C = \frac{n_{max}}{N_{t}} + \frac{{(l_{max} + 1)}^{2}}{N_{s}},

where $N_{t}$ is the number of sampled time steps and $N_{s}$ is the number of spatial grid points used for coarse-graining. Larger values of $C$ correspond to a higher compression factor. To define accuracy metrics, we consider the norm

‖ f ‖^{2} = \sum_{i = 1}^{N_{t}} f (t_{i})^{2}

where the sum runs over $N_{t}$ regularly sampled time points t_i. We denote a particular mode representation ${{\tilde{ρ}}_{l m} (t), {\tilde{j}}_{l m}^{(1)} (t), {\tilde{j}}_{l m}^{(2)} (t)}$ of the data that was coarse-grained via Equation 2a and Equation 2b (main text) for $l = 0, \dots, l_{max}^{ref} = 20$ as the ‘uncompressed’ reference. A measure to characterize the accuracy of a mode-truncated ‘compressed’ data representation is then given by a relative average mode reconstruction error

E_{modes} (n_{max}, l_{max}) = \frac{1}{2 (l_{max}^{ref} + 1)^{2}} \sum_{l = 0}^{l_{max}} \sum_{m = - l}^{m = l} {(\frac{{‖ ρ_{l m} - {\tilde{ρ}}_{l m} ‖}^{2}}{‖ {\tilde{ρ}}_{l m} ‖^{2}} + \frac{{‖ j_{l m}^{(2)} - {\tilde{j}}_{l m}^{(2)} ‖}^{2}}{{‖ {\tilde{j}}_{l m}^{(2)} ‖}^{2}})}^{1 / 2} .

This measure compares the compressed mode representation ${ρ_{l m} (t), j_{l m}^{(2)} (t)}$ , truncated at maximal Chebychev mode number $n_{max}$ (temporal representation Equation 32, Appendix 2) and maximal harmonic mode number $l_{max}$ (spatial representation, Equations 4; 5, main text) with the reference modes ${{\tilde{ρ}}_{l m} (t), {\tilde{j}}_{l m}^{(2)} (t)}$ . To find a compromise between accuracy, characterized by $E_{modes} (n_{max}, l_{max})$ , and compression $C$ defined in Equation 35, the aim is to find a pair $(n_{max}, l_{max})$ on the Pareto front (Jin and Sendhoff, 2008) of $E_{modes}$ vs. $1 / C$ (red dots in Appendix 2—figure 3).

Appendix 2—figure 3

Download asset Open asset

Relative average mode reconstruction error $E_{modes} (n_{max}, l_{max})$ Equation 36.

As a function of the inverse of the compression $C$ defined in Equation 35. Red points indicate the Pareto front (Jin and Sendhoff, 2008) of this compression-accuracy approximation trade-off. Orange circle indicates the final value used for our analysis.

Note that the modes ${\tilde{j}}_{l m}^{(1)} (t)$ and $j_{l m}^{(1)} (t)$ are so far omitted from this analysis, because the latter are in practice found directly from density modes via Equation 6 (main text). However, taking temporal derivatives of $ρ_{l m} (t)$ using Equation 33 to determine $j_{l m}^{(1)} (t)$ introduces undesirable oscillations for too large Chebychev cut-offs $n_{max}$ . This implies an additional trade-off between the need for accuracy (higher $n_{max}$ ) and stability (lower $n_{max}$ ). In practice, we wish to find values of $(n_{max}, l_{max})$ such that relative amplitudes of pairs $({\tilde{j}}_{l m}^{(1)}, {\tilde{j}}_{l m}^{(2)})$ and $(j_{l m}^{(1)}, j_{l m}^{(2)})$ are preserved by the compression. This can be achieved by comparing the relative curl amplitude

S_{curl} (t) = \frac{\sum_{l m} {[j_{l m}^{(2)} (t)]}^{2}}{\sum_{l m} {[j_{l m}^{(1)} (t)]}^{2} + {[j_{l m}^{(2)} (t)]}^{2}}

to the analog quantity ${\tilde{S}}_{curl} (t)$ computed from the reference modes ${{\tilde{j}}_{l m}^{(1)} (t), {\tilde{j}}_{l m}^{(2)} (t)}$ and analyzing the curl reconstruction error

E_{curl} = \frac{∥ S_{curl} - {\tilde{S}}_{curl} ∥}{∥ {\tilde{S}}_{curl} ∥}

as a function of $n_{max}$ and $l_{max}$ (Appendix 2—figure 4). From this, we find a region of low error around $l_{max} = 4, n_{max} = 30$ , which also is on the Pareto front of the accuracy vs. compression trade-off (orange circles in Appendix 2—figures 3 and 4) and represents the final values used throughout this work.

Appendix 2—figure 4

Download asset Open asset

$S_{curl}$ reconstruction error landscape (log scale) as a function of $l_{max}$ and $n_{max}$ .

Black contour lines indicate iso-error lines (see Equation 37, $E_{curl} =$ const.), whereas white contour lines indicate iso-compression levels (see Equation 35, $C =$ const.). Orange circle indicates the final value used for our analysis.

Appendix 3

Active Brownian particles on the sphere

In this section, we describe the stochastic dynamics of non-interacting, active Brownian particles (ABPs) (Romanczuk et al., 2012) on curved surfaces and derive analytically coarse-grained mean-field equations, as well as a kernel representation of ABP dynamics. These results are used in the main text to validate our coarse-graining and inference framework.

We consider active Brownian particles at position $x \in ℝ^{3}$ that move with speed v₀ on the surface of a unit sphere (radius $R_{0} = 1$ ) in the direction of their unit orientation vector $u \in ℝ^{3}$ . Since $| x | = 1$ at all times, we can interpret v₀ as the particle’s angular speed on the unit sphere. The orientation vector is at all times tangential to the surface, but is subject to random in-plane fluctuations characterized by a rotational diffusion coefficient $D_{r}$ . The corresponding dynamics of $x (t)$ and $u (t)$ is given by the stochastic differential equations (in units $R_{0} = 1$ )

d x = v_{0} u d t

d u = - v_{0} x d t + (x \times u) \sqrt{2 D_{r}} \circ d ξ,

where the stochastic differential Equation 38b is interpreted in the Stratonovich sense, as denoted by the symbol "o" (Braumann, 2007). It follows from Equation 38a that $x (t)$ and $u (t)$ are normalized at all times. In the absence of rotational diffusion ( $D_{r} = 0$ ), the vectors $x$ and $u$ rotate over time by an angle $v_{0} t$ around the axis $u \times x$ . Consequently, particle trajectories in the absence of noise trace out great circles in the plane defined by $(u \times x)$ .

Spatial correlation of APBs on a sphere

$C (t) = ⟨ x (t) \cdot x (0) ⟩$ To illustrate how ABPs on a sphere differ from ABPs in Euclidean space, we study first the correlation function , where the angled brackets denote a Gaussian white-noise average. To this end, we rewrite the ABP dynamics Equation 38a in their equivalent Itô form given by

d x = v_{0} u d t

d u = - (v_{0} x + D_{r} u) d t + \sqrt{2 D_{r}} (x \times u) d ξ .

In the Itô formulation any smooth function $f (x, u)$ obeys $⟨ f (x, u) d ξ ⟩ = 0$ , such that (Winkler et al., 2015)

\frac{d}{d t} ⟨ x (t) \cdot x (0) ⟩ = v_{0} ⟨ u (t) \cdot x (0) ⟩

and

\frac{d}{d t} ⟨ u (t) \cdot x (0) ⟩ = - v_{0} ⟨ x (t) \cdot x (0) ⟩ - D_{r} ⟨ u (t) \cdot x (0) ⟩,

which yields a damped harmonic oscillator equation for the correlation function

\frac{d^{2}}{d t^{2}} C (t) + D_{r} \frac{d}{d t} C (t) + v_{0}^{2} C (t) = 0 .

Normalization and orthogonality of $x (t)$ and $u (t)$ imply the initial conditions $C = 1$ and $d C / d t = 0$ at $t = 0$ . The behavior of solutions of Equation 40 is a function of the rotational Péclet number ${Pe}_{r} := v_{0} / D_{r}$ that quantifies the ratio between active motion and orientational diffusion. For ${Pe}_{r} §lt; 1$ , (‘high-noise regime’), the position correlation function $C (t) = ⟨ x (t) \cdot x (0) ⟩$ decays according to Equation 40 monotonically to zero. For ${Pe}_{r} §gt; 1$ , (‘low -noise regime’) position correlations exhibit damped oscillations. To validate our simulation method (described in the following section), analytic predictions for $C (t)$ are in Figure 3B (main text) compared against the ensemble average $⟨ x (t) \cdot x (0) ⟩$ over $3 \times 10^{4}$ simulated ABPs.

Stochastic simulation of active Brownian particles on the sphere

To ensure a numerically exact normalization of the particle’s position and orientation vectors on the unit sphere, we simulated the dynamics

d x = \frac{v_{0}}{| u |} (u - \frac{u \cdot x}{{| x |}^{2}} x) d t

d u = - v_{0} \frac{x}{{| x |}^{2}} d t + \frac{(x \times u)}{| x \times u |} \sqrt{2 D_{r}} \circ d ξ .

We numerically solve the Itô formulation of this system using the Euler-Mayurama scheme (Higham., 2001), and confirm that this system reproduces the correlation dynamics predicted by Equation 40 (Figure 3B, main text).

Fokker-Planck equation

To study the continuum dynamics of a large number of non-interacting ABPs on a sphere, we determine the dynamics of the probability density $p (x, u, t)$ of particle positions $x$ and orientations $u$ at time $t$ . To do so, it is convenient to express particle positions in terms of a parameterisation $x (t) = x [x^{1} (t), x^{2} (t)]$ that defines tangential basis vectors by $e_{i} = \partial x / \partial x^{i}$ ( $i = 1, 2$ ) and a metric tensor $g_{i j} = e_{i} \cdot e_{j}$ . By definition, we have $d x = e_{i} d x^{i}$ and Equation 38a can be rewritten as

d x^{i} = v_{0} u^{i} d t .

General tangential vectors on the surface can be written as $u = u^{i} e_{i}$ and on a unit sphere the surface normal can be identified with particle positions $n = e_{1} \times e_{2} / | e_{1} \times e_{2} | = x$ . Hence, on the unit sphere the Gauss-Weingarten relation reads $\partial_{i} e_{j} = - C_{i j} x + Γ_{i j}^{k} e_{k}$ , where $Γ_{i j}^{k}$ denote Christoffel symbols and $C_{i j}$ is the curvature tensor. This implies together with Equation 42 the geometric relation

\begin{aligned} d u & = e_{i} d u^{i} + u^{i} (\partial_{j} e_{i}) d x^{j} \\ = e_{i} d u^{i} - C_{i j} u^{i} u^{j} v_{0} x d t + v_{0} u^{i} u^{j} Γ_{i j}^{k} e_{k} d t . \end{aligned}

Comparing this identity with the stochastic dynamics $d u$ in Equation 38b and using that $C_{i j} u^{i} u^{j} = g_{i j} u^{i} u^{j} = {| u |}^{2} = 1$ for unit vectors $u$ on the unit sphere, we find the covariant stochastic differential equation

d u^{i} = - v_{0} u^{j} u^{k} Γ_{j k}^{i} d t + ϵ_{k}^{i} u^{k} \sqrt{2 D_{r}} \circ d ξ .

In Equation 43, $ϵ_{i j} = x \cdot (e_{i} \times e_{j})$ denotes the Levi-Civita tensor on the unit sphere.

In this covariant basis, we define the scalar probability density

p (x, u, t) = ⟨ \frac{1}{\sqrt{g (x)}} \prod_{i} δ [x^{i} - x^{i} (t)] δ [u^{i} - u^{i} (t)] ⟩,

where $δ (x)$ denotes a Dirac function. Combining Equations 42; 43, standard methods (Fily et al., 2016; Castro-Villarreal and Sevilla, 2018) allow us to obtain the Fokker-Planck equation for $p (x, u, t)$ as

\frac{\partial}{\partial t} p (x, u, t) = D_{r} \frac{\partial}{\partial u^{i}} [ϵ_{k}^{i} u^{k} \frac{\partial}{\partial u^{j}} (ϵ_{l}^{j} u^{l} p)] - \nabla_{i} (v_{0} u^{i} p) + \frac{\partial}{\partial u^{i}} (v_{0} u^{j} u^{k} Γ_{j k}^{i} p)

Using the identity $ϵ_{k}^{i} ϵ_{l}^{j} = g^{i j} g_{k l} - δ_{l}^{i} δ_{k}^{j}$ , the dynamics of the probability density is finally given by

\frac{\partial}{\partial t} p (x, u, t) = D_{r} \frac{\partial}{\partial u^{i}} [(g^{i j} - u^{i} u^{j}) \frac{\partial p}{\partial u^{j}}] - v_{0} u^{i} \nabla_{i} p + \frac{\partial}{\partial u^{i}} (v_{0} u^{j} u^{k} Γ_{j k}^{i} p),

which agrees with the result in Castro-Villarreal and Sevilla, 2018.

Hydrodynamic expansion

To connect the Fokker-Planck dynamics given in Equation 46 to hydrodynamic fields, we define (probability) density and fluxes by $ρ (x, t) = \int d^{2} u p (x, u, t)$ , and $J^{i} (x, t) = v_{0} \int d^{2} u u^{i} p (x, u, t)$ . Their dynamics on the unit sphere is given by Castro-Villarreal and Sevilla, 2018.

\frac{\partial ρ}{\partial t} = - \nabla_{i} J^{i}

\frac{\partial J^{i}}{\partial t} = - \frac{v_{0}^{2}}{2} \nabla^{i} ρ - D_{r} J^{i},

where couplings to higher order fields are neglected, as they vanish at shorter time-scales due to the presence of rotational noise. Expressing Equation 47a and Equation 47b in terms of scalar and vector spherical harmonics (see Appendix 2) for an arbitrary sphere radius R₀ yields the mode dynamics Equation 13a, Equation 13b and Equation 13c the main text.

Appendix 4

Learning and interpreting the linear model

We describe details about the inference procedure used to learn the linear ordinary differential equation (ODE) model considered in the main text. We then discuss how the matrix $M$ found by this procedure can be further studied in terms of its real-space kernel representation and derive this kernel for the ABP dynamics introduced in Appendix 4.

Inference of the dynamical mode coupling matrix M

Given a dynamical mode vector $a (t) = {[ρ_{l m} (t), j_{l m}^{(1)} (t), j_{l m}^{(2)} (t)]}^{⊤}$ , the goal is to learn a linear minimal model

\frac{d a (t)}{d t} = M \cdot a (t)

of the mode dynamics. Here, $M$ is an unknown $n \times n$ mode coupling matrix, where generally $n = 3 {(l_{\max} + 1)}^{2} - 2$ . In systems with global mass conservation, as considered in this work, one can additionally use that the mode $ρ_{00}$ is constant and eliminate the corresponding couplings from $M$ .

To describe the algorithm that was used to infer the mode coupling matrix $M$ , we parameterize $M$ by a vector $p$ that contains all non-zero entries and introduce a function $ℳ$ that represents the underlying matrix structure. Together, they generate the explicit form $M = ℳ (p)$ of the mode coupling matrix. Imposing structure on the matrix, such as rank constraints, or sparsity leads to a shorter vector $p$ and modifies the definition of $ℳ$ accordingly. Denoting $A (t; M, p, a_{0})$ as the result of numerically integrating the system of ODEs Equation 48 up to time $t$ from initial condition $a_{0}$ with $M = ℳ (p)$ , we define the loss function

L (p; M, t_{I}, t_{N}) = \frac{1}{N - I} \sum_{i = I}^{N} ‖ a (t_{i}) - A (t_{i}; M, p, a (t_{I})) ‖_{2}^{2},

where the t_i are time points in an interval $[t_{I}, t_{N}]$ at which the data and the ODE solution are sampled. Using the ODE solvers and optimization functions provided by the Julia modules DifferentialEquations.jl and DiffEqFlux.jl (Rackauckas et al., 2021), we can differentiate through the ODE solver to calculate derivatives of the loss function Equation 49 with respect to parameters $p$ and subsequently apply gradient-based optimization algorithms. The loss function is minimized using the ADAM algorithm (Kingma and Ba, 2017), followed by the Broyden-Fletcher-Goldfarb-Shannon (BFGS) algorithm (Nocedal and Wright, 2006). To increase the robustness of the optimization and promote sparsity, we use a sequentially thresholded algorithm (Supekar et al., 2021; Brunton et al., 2016; Reinbold, 2020). A complete overview of this procedure is shown in Appendix 4—figure 1 and the details of the specific design decisions made in the algorithm are discussed in the following:

1. To account for the variation in scale between the different modes in the data $a (t)$ , each mode is normalized by its median absolute deviation (MAD) across the full time-span in which the data are available. Specifically, we scale each mode by

m a d (a_{i}) = {m e d i a n}_{k} (| a_{i} (t_{k}) - {\bar{a}}_{i} |) .

where ${\bar{a}}_{i} = {median}_{k} [a_{i} (t_{k})]$ and the median is taken over all time-points, giving rise to a scaled mode vector $\tilde{a} (t)$ . Losses analogous to Equation 49 that are computed using scaled data are denoted in the following by $\tilde{L}$ .

2. To prevent over-fitting, we divide the data into two regions, a learning region from $t_{I}$ to $t_{N}$ and a validation region from $t_{N}$ to $t_{F}$ . Only data from the learning region is used in the optimization of the loss function Equation 49. However, the model is integrated into the validation region, and a corresponding validation loss using only the data in the validation region is calculated. During each optimization run, we choose the model with the lowest loss in the validation region, lowering the likelihood of over-fitting to the specific data in the learning region.

3. To prevent the optimization from getting stuck in local minima, we incrementally increase the time-span of the data included in the optimization objective (blue box in Appendix 4—figure 1). We increase the time window backward from a fixed endpoint $t_{1} = t_{F}$ , choosing each iteration an earlier initial condition at time $t_{i} §lt; t_{i - 1}$ . The advantage of stepping backward rather than forward from a fixed initial condition is twofold: first, the validation region stays unchanged throughout the optimization, making comparisons of the validation loss easy. Second, because the initial condition changes with each run, the learned matrix tends to be more robust to fluctuations in the initial condition.

4. After the optimization step, sparsity is promoted by thresholding the elements in the matrix (Brunton et al., 2016), removing small magnitude elements that do not noticeably contribute to the mode dynamics (purple box in Appendix 4—figure 1). The optimization procedure is then repeated until the thresholding converges. The threshold is chosen to generate a sparse matrix that still reproduces the dynamics faithfully.

5. Once the sparsity pattern is obtained from the sequential thresholding and optimization procedure a final run of the optimization is performed on the unscaled mode data to find the final dynamical matrix $M$ , which removes any potential slight bias the MAD scaling might have introduced in the parameter values $p$ .

Finally, the numerical stability of the model can be checked by examining the eigenvalues of the learned matrix. For the ABP test data, we learn a matrix $M$ for which the largest real part of its eigenvalues is at machine precision. For the experimental data, the largest real part in the eigenvalues is $7.4 \times 10^{- 4}$ , which corresponds to a time scale of around 675 mins. While the corresponding dynamics will eventually become unstable, solutions remain bound over a period of approximately 45 hours, which is four times as long as the input data from which the mode coupling matrix was learned.

Learning and validation regions used in this work

For the ABP data, the first 15 frames are excluded, so that – consistent with coarse-graining assumptions (see Appendix 3, Equation 47a, Equation 47b) any remnants of higher orientational order introduced by the initial conditions have decayed. The subsequent 140 frames are used as the learning region, followed by a validation region of 20 frames. Each frame corresponds to a time interval of approximately 0.06 in units of $R_{0} / v_{0} = 1$ . We exclude the first and last 10 frames of the experimental zebrafish data and split the remaining data into a learning region of 360 frames, with the remaining 40 frames used for validation. Each frame corresponds to a time interval of 2 min.

Appendix 4—figure 1

Download asset Open asset

Initially the data is rescaled using the median absolute deviation (MAD) defined in Equation 50 to account for variation in scales across the modes.

Scaled variables are denoted by tildes. To avoid local minima of the optimization function, we iteratively feed more data into the cost function. Next we sequentially threshold the small terms in the matrix until convergence is reached. These procedures are repeated until the sparsity pattern converges. Finally the scaling is undone and the parameters are optimized on the unscaled data to produce the final matrix. Schematic of the learning procedure.

Green’s function representation of the learned matrix

The learned matrix $M$ consists of 9 blocks each with $[{(l_{max} + 1)}^{2} - 1] \times [{(l_{max} + 1)}^{2} - 1]$ entries. Each block relates a mode family to time derivatives of another and we write

M = (\begin{matrix} M^{ρ ρ} & M^{ρ 1} & M^{ρ 2} \\ M^{1 ρ} & M^{11} & M^{12} \\ M^{2 ρ} & M^{21} & M^{22} \end{matrix}) .

We denote the components of each block by ${(M^{m_{1} m_{2}})}_{l m, l^{'} m^{'}} \equiv M_{α β}^{m_{1} m_{2}}$ , where $m_{1}, m_{2} \in {ρ, 1, 2}$ , and $α$ , $β$ are multi-indices that represent the harmonic modes $(l m)$ . Using the mode representation Equation 5 and the form of the linear minimal model Equation 48, we find

\begin{array}{ll} \frac{\partial}{\partial t} J (r, t) & = \sum_{α = l m} (\frac{d j_{α}^{(1)} (t)}{d t} Ψ_{α} (\hat{r}) + \frac{d j_{α}^{(2)} (t)}{d t} Φ_{α} (\hat{r})) \\ = \sum_{α = l m} \sum_{β = l^{'} m^{'}} [M_{α β}^{1 ρ} ρ_{β} (t) + M_{α β}^{11} j_{β}^{(1)} (t) + M_{α β}^{12} j_{β}^{(2)} (t)] Ψ_{α} (\hat{r}) \\ + [M_{α β}^{2 ρ} ρ_{β} (t) + M_{α β}^{21} j_{β}^{(1)} (t) + M_{α β}^{22} j_{β}^{(2)} (t)] Φ_{α} (\hat{r}) . \end{array}

Using Equation 30a, Equation 51 can be cast into the dynamic kernel Equation 14 given in the main text, where we defined the vector kernel

m^{ρ} (r, r^{'}) = \sum_{α = l m} \sum_{β = l^{'} m^{'}} M_{α β}^{1 ρ} Ψ_{α} (\hat{r}) Y_{β} ({\hat{r}}^{'}) + M_{α β}^{2 ρ} Φ_{α} (\hat{r}) Y_{β} ({\hat{r}}^{'})

and the matrix kernel

\begin{array}{ll} M^{J} (r, r^{'}) & = \sum_{α = l m} \sum_{β = l^{'} m^{'}} \frac{1}{l (l + 1)} [M_{α β}^{11} Ψ_{α} (\hat{r}) \otimes Ψ_{β} ({\hat{r}}^{'}) \\ + M_{α β}^{12} Ψ_{α} (\hat{r}) \otimes Φ_{β} ({\hat{r}}^{'}) + M_{α β}^{21} Φ_{α} (\hat{r}) \otimes Ψ_{β} ({\hat{r}}^{'}) \\ + M_{α β}^{22} Φ_{α} (\hat{r}) \otimes Φ_{β} ({\hat{r}}^{'})], \end{array}

where ⊗ denotes a dyadic product. The matrix $M^{J} (r, r^{'})$ has a 0 eigenvalue with right eigenvector ${\hat{r}}^{'}$ and left eigenvector $\hat{r}$ , which implies $det (M^{J}) = 0$ . Numerical analysis of the matrix invariants shows that a second eigenvalue is 0 (Appendix 4—figure 2), leaving only a single non-zero eigenvalue that can be conveniently found from $tr [M^{J} (r, r^{'})]$ and is shown in Figure 4D (main text).

Real-space kernels of active Brownian particle dynamics

In the following we determine a real-space kernel representation in the form Equation 14 for the flux dynamics of ABPs given in Equation 47b. We can read off the kernel coefficients in Equation 52 and in Equation 53 from the coarse-grained ABP dynamics in mode space, given in Equation 13b and Equation 13c. For the kernel $m^{ρ} (r, r^{'})$ , we have $M_{α β}^{1 ρ} = - \frac{v_{0}^{2}}{2} δ_{α β}$ and $M_{α β}^{2 ρ} = 0$ ( $α, β = (l m)$ ), such that Equation 52 becomes

m^{ρ} (r, r^{'}) = - \frac{v_{0}^{2}}{2} \nabla_{S} \sum_{α = l m} Y_{α} (\hat{r}) Y_{α} ({\hat{r}}^{'}) = - \frac{v_{0}^{2}}{2} \nabla_{S} δ (r - r^{'}) .

Here, we have used in the first step the definition of $Ψ_{l m} (\hat{r})$ given in Equation 29a and in the second step the completeness of the spherical harmonic basis functions $Y_{l m} (\hat{r})$ , where $δ (r - r^{'}) = δ (ϕ - ϕ^{'}) δ (\cos θ - \cos θ^{'})$ denotes the delta function on a sphere. Note that a unit sphere was considered throughout this analysis, such that $r = \hat{r}$ . Similarly, Equation 13b and Equation 13c imply for the kernel coefficients in Equation 53 that $M_{α β}^{11} = M_{α β}^{22} = - D_{r} δ_{α β}$ and $M_{α β}^{12} = M_{α β}^{21} = 0$ . Consequently, we have

M^{J} (r, r^{'}) = - D_{r} \sum_{α = l m} \frac{1}{l (l + 1)} [Ψ_{α} (\hat{r}) \otimes Ψ_{α} ({\hat{r}}^{'}) + Φ_{α} (\hat{r}) \otimes Φ_{α} ({\hat{r}}^{'})] = - D_{r} δ (r - r^{'}) P_{∥},

where $P_{∥} = I - r \otimes r$ is the tangential projector on the unit sphere. The hydrodynamic flux Equation 47b of ABPs on a sphere can therefore be written in the equivalent integral kernel form

\partial_{t} J (r, t) = \int d Ω^{'} [- \frac{v_{0}^{2}}{2} \nabla_{S} δ (r - r^{'}) ρ (r^{'}, t) - D_{r} δ (r - r^{'}) J (r^{'}, t)] .

To make analytic kernel properties comparable to practical inference scenarios in which we work with a finite number of harmonic modes, we computed the sums in Equations 54; 55 up to a maximum mode number $l_{max} = 4$ . The resulting kernels – depicted in Figure 4D (main text) – approximate the Dirac delta function $δ (r - r^{'})$ and its derivative, leading to the finite range of $tr (M^{J})$ with amplitude maximum at $ω = 0$ , while $| m^{ρ} |$ vanishes at and peaks away from $ω = 0$ . Additionally, finite mode representations introduce an apparent kernel inhomogeneity across the spherical surface as evident from the non-zero standard deviation depicted in Figure 4D of the main text (blue shades).

Appendix 4—figure 2

Download asset Open asset

The $3 \times 3$ -matrix invariant $I_{2} = \frac{1}{2} (tr [{(M^{J})}^{2}] - {(tr [M^{J}])}^{2})$ sampled for pairs of positions $r$ , $r^{'}$ vanishes to machine precision for the dynamical matrix $M$ learned on the zebrafish data.

This invariant can be expressed in terms of matrix eigenvalues as $I_{2} = λ_{1} λ_{2} + λ_{1} λ_{3} + λ_{2} λ_{3}$ . Additionally, $λ_{1} λ_{2} λ_{3} = \det (M^{J}) = 0$ (see discussion below Equation 53), which implies only one eigenvalue is non-zero. Evaluating I₂ for the kernel matrix $M^{J}$ encoded by the theoretical (see Equation 13b and Equation 13c) and inferred (see Figure 3C, main text) dynamical matrix $M$ of the ABP dynamics yields similar results.

Data availability

Raw data used in this study can be obtained at https://doi.org/10.1038/s41467-019-13625-0 and https://imb-dev.gitlab.io/cell-flow-navigator/.

References

(2020) Universal scaling of active nematic turbulence
Nature Physics 16:682–688.

https://doi.org/10.1038/s41567-020-0854-4
- Google Scholar
Book
(2013) Mathematical Methods for Physicists: A Comprehensive Guide
Amsterdam, Netherlands: Elsevier Science.

https://doi.org/10.1016/C2009-0-30629-7
- Google Scholar
(1985) Vector spherical harmonics and their application to magnetostatics
European Journal of Physics 6:287–294.

https://doi.org/10.1088/0143-0807/6/4/014
- Google Scholar
(2020) Anomalous refraction of optical spacetime wave packets
Nature Photonics 14:416–421.

https://doi.org/10.1038/s41566-020-0645-6
- Google Scholar
(2015) New class of turbulence in active fluids
PNAS 112:15048–15053.

https://doi.org/10.1073/pnas.1509304112
- PubMed
- Google Scholar
1. Braumann CA
(2007) Itô versus Stratonovich calculus in random population growth
Mathematical Biosciences 206:81–107.

https://doi.org/10.1016/j.mbs.2004.09.002
- PubMed
- Google Scholar
(2016) Discovering governing equations from data by sparse identification of nonlinear dynamical systems
PNAS 113:3932–3937.

https://doi.org/10.1073/pnas.1517384113
- PubMed
- Google Scholar
1. Burns KJ
2. Vasil GM
3. Oishi JS
4. Lecoanet D
5. Brown BP
(2020) Dedalus: A flexible framework for numerical simulations with spectral methods
Physical Review Research 2:023068.

https://doi.org/10.1103/PhysRevResearch.2.023068
- Google Scholar
1. Castro-Villarreal P
2. Sevilla FJ
(2018) Active motion on curved surfaces
Physical Review. E 97:052605.

https://doi.org/10.1103/PhysRevE.97.052605
- PubMed
- Google Scholar
Conference
(2012) 9th IEEE International Symposium on Biomedical Imaging
An algebraic solution to rotation recovery in HARDI from correspondences of orientation distribution functions. pp. 38–41.

https://doi.org/10.1109/ISBI.2012.6235478
- Google Scholar
1. Collinet C
2. Lecuit T
(2021) Programmed and self-organized flow of information during morphogenesis
Nature Reviews. Molecular Cell Biology 22:245–265.

https://doi.org/10.1038/s41580-020-00318-6
- PubMed
- Google Scholar
(2020) Topological defects promote layer formation in Myxococcus xanthus colonies
Nature Physics 17:211–215.

https://doi.org/10.1038/s41567-020-01056-4
- Google Scholar
(2016) Defect-Mediated Morphologies in Growing Cell Colonies
Physical Review Letters 117:048102.

https://doi.org/10.1103/PhysRevLett.117.048102
- PubMed
- Google Scholar
Book
(2014)
Chebfun Guide

Oxford: Oxford University Press.
- Google Scholar
(2013) Fluid dynamics of bacterial turbulence
Physical Review Letters 110:228102.

https://doi.org/10.1103/PhysRevLett.110.228102
- PubMed
- Google Scholar
1. Etournay R
2. Popović M
3. Merkel M
4. Nandi A
5. Blasse C
6. Aigouy B
7. Brandl H
8. Myers G
9. Salbreux G
10. Jülicher F
11. Eaton S
(2015) Interplay of cell dynamics and epithelial tension during morphogenesis of the Drosophila pupal wing
eLife 4:e07090.

https://doi.org/10.7554/eLife.07090
- PubMed
- Google Scholar
Preprint
(2016) Active Particles on Curved Surfaces
arXiv.

https://arxiv.org/abs/1601.00324
- Google Scholar
(2021) The ultraspherical spectral element method
Journal of Computational Physics 436:110087.

https://doi.org/10.1016/j.jcp.2020.110087
- Google Scholar
Book
1. Gilbert SF
2. Barresi MJF
(2016)
Developmental Biology

Oxford: Oxford University Press.
- Google Scholar
1. Goldenfeld N
2. Woese C
(2011) Life is Physics: Evolution as a Collective Phenomenon Far From Equilibrium
Annual Review of Condensed Matter Physics 2:375–399.

https://doi.org/10.1146/annurev-conmatphys-062910-140509
- Google Scholar
Preprint
(2020) Integer Topological Defects Organize Stresses Driving Tissue Morphogenesis
bioRxiv.

https://doi.org/10.1101/2020.06.02.129262
- Google Scholar
1. Hannezo E
2. Dong B
3. Recho P
4. Joanny JF
5. Hayashi S
(2015) Cortical instability drives periodic supracellular actin pattern formation in epithelial tubes
PNAS 112:8620–8625.

https://doi.org/10.1073/pnas.1504762112
- PubMed
- Google Scholar
1. Hannezo E
2. Heisenberg CP
(2019) Mechanochemical Feedback Loops in Development and Disease
Cell 178:12–25.

https://doi.org/10.1016/j.cell.2019.05.052
- PubMed
- Google Scholar
1. Hartmann R
2. Singh PK
3. Pearce P
4. Mok R
5. Song B
6. Díaz-Pascual F
7. Dunkel J
8. Drescher K
(2019) Emergence of three-dimensional order and structure in growing biofilms
Nature Physics 15:251–256.

https://doi.org/10.1038/s41567-018-0356-9
- PubMed
- Google Scholar
(2021) Modal Analysis of Turbulent Flow near an Inclined Bank–Longitudinal Structure Junction
Journal of Hydraulic Engineering 147:04020100.

https://doi.org/10.1061/(ASCE)HY.1943-7900.0001856
- Google Scholar
1. Higham. DJ
(2001) An Algorithmic Introduction to Numerical Simulation of Stochastic Differential Equations
SIAM Review 43:525–546.

https://doi.org/10.1137/S0036144500378302
- Google Scholar
1. Jaynes ET
2. Cummings FW
(1963) Comparison of quantum and semiclassical radiation theories with application to the beam maser
Proceedings of the IEEE 51:89–109.

https://doi.org/10.1109/PROC.1963.1664
- Google Scholar
Conference
1. Jin Y
2. Sendhoff B
(2008) IEEE Transactions on Systems, Man, and Cybernetics, Part C
Pareto-Based Multiobjective Machine Learning: An Overview and Case Studies. pp. 397–415.

https://doi.org/10.1109/TSMCC.2008.919172
- Google Scholar
1. Kac M
(1966) Can One Hear the Shape of a Drum?
The American Mathematical Monthly 73:1.

https://doi.org/10.2307/2313748
- Google Scholar
1. Kamien RD
(2002) The geometry of soft materials: a primer
Reviews of Modern Physics 74:953–971.

https://doi.org/10.1103/RevModPhys.74.953
- Google Scholar
1. Kantsler V
2. Goldstein RE
(2012) Fluctuations, dynamics, and the stretch-coil transition of single actin filaments in extensional flows
Physical Review Letters 108:038103.

https://doi.org/10.1103/PhysRevLett.108.038103
- PubMed
- Google Scholar
1. Keller PJ
2. Schmidt AD
3. Santella A
4. Khairy K
5. Bao Z
6. Wittbrodt J
7. Stelzer EHK
(2010) Fast, high-contrast imaging of animal development with scanned light sheet-based structured-illumination microscopy
Nature Methods 7:637–642.

https://doi.org/10.1038/nmeth.1476
- PubMed
- Google Scholar
Preprint
1. Kingma DP
2. Ba J
(2017) Adam: A Method for Stochastic Optimization
arXiv.

https://arxiv.org/abs/1601.00324
- Google Scholar
Conference
1. Knowles I
2. Renka R
(2014) Electron J Diff Eqns
Methods for numerical differentiation of noisy data.

https://ejde.math.txstate.edu/conf-proc/21/k3/knowles.pdf
- Google Scholar
1. Kobitski AY
2. Otte JC
3. Takamiya M
4. Schäfer B
5. Mertes J
6. Stegmaier J
7. Rastegar S
8. Rindone F
9. Hartmann V
10. Stotzka R
11. García A
12. van Wezel J
13. Mikut R
14. Strähle U
15. Nienhaus GU
(2015) An ensemble-averaged, cell density-based digital model of zebrafish embryo development derived from light-sheet microscopy data with single-cell resolution
Scientific Reports 5:8601.

https://doi.org/10.1038/srep08601
- PubMed
- Google Scholar
1. Kolmogorov AN
(1941) The local structure of turbulence in incompressible viscous fluid for very large Reynolds numbers
C R (Dokl) Acad Sci URSS 30:301–305.

https://doi.org/10.1098/rspa.1991.0075
- Google Scholar
1. Kraichnan RH
2. Montgomery D
(1980) Two-dimensional turbulence
Reports on Progress in Physics 43:547–619.

https://doi.org/10.1088/0034-4885/43/5/001
- Google Scholar
(2008) Tensile forces govern germ-layer organization in zebrafish
Nature Cell Biology 10:429–436.

https://doi.org/10.1038/ncb1705
- PubMed
- Google Scholar
1. Lenne PF
2. Munro E
3. Heemskerk I
4. Warmflash A
5. Bocanegra-Moreno L
6. Kishi K
7. Kicheva A
8. Long Y
9. Fruleux A
10. Boudaoud A
11. Saunders TE
12. Caldarelli P
13. Michaut A
14. Gros J
15. Maroudas-Sacks Y
16. Keren K
17. Hannezo E
18. Gartner ZJ
19. Stormo B
20. Gladfelter A
21. Rodrigues A
22. Shyer A
23. Minc N
24. Maître JL
25. Di Talia S
26. Khamaisi B
27. Sprinzak D
28. Tlili S
(2021) Roadmap for the multiscale coupling of biochemical and mechanical signals during development
Physical Biology 18:041501.

https://doi.org/10.1088/1478-3975/abd0db
- PubMed
- Google Scholar
Conference
1. Levy B
(2006) IEEE International Conference on Shape Modeling and Applications 2006
Laplace-Beltrami Eigenfunctions Towards an Algorithm That “Understands” Geometry. pp. 13–19.

https://doi.org/10.1109/SMI.2006.21
- Google Scholar
1. Marchetti MC
2. Joanny JF
3. Ramaswamy S
4. Liverpool TB
5. Prost J
6. Rao M
7. Simha RA
(2013) Hydrodynamics of soft active matter
Reviews of Modern Physics 85:1143–1189.

https://doi.org/10.1103/RevModPhys.85.1143
- Google Scholar
(2020) Topological defects in the nematic order of actin fibres as organization centres of Hydra morphogenesis
Nature Physics 17:251–259.

https://doi.org/10.1038/s41567-020-01083-1
- Google Scholar
Book
1. Mason JC
2. Handscomb DC
(2002) Chebyshev Polynomials
CRC press.

https://doi.org/10.1201/9781420036114
- Google Scholar
Book
1. Matlab
(2019)
MathWorks

Natick, MA, USA: Matlab.
- Google Scholar
(2020) Bacteria solve the problem of crowding by moving slowly
Nature Physics 17:205–210.

https://doi.org/10.1038/s41567-020-01070-6
- Google Scholar
(2019) Minimal Model of Cellular Symmetry Breaking
Physical Review Letters 123:188101.

https://doi.org/10.1103/PhysRevLett.123.188101
- PubMed
- Google Scholar
1. Morita H
2. Grigolon S
3. Bock M
4. Krens SFG
5. Salbreux G
6. Heisenberg CP
(2017) The Physical Basis of Coordinated Tissue Spreading in Zebrafish Gastrulation
Developmental Cell 40:354–366.

https://doi.org/10.1016/j.devcel.2017.01.010
- PubMed
- Google Scholar
1. Münster S
2. Jain A
3. Mietke A
4. Pavlopoulos A
5. Grill SW
6. Tomancak P
(2019) Attachment of the blastoderm to the vitelline envelope affects gastrulation of insects
Nature 568:395–399.

https://doi.org/10.1038/s41586-019-1044-3
- PubMed
- Google Scholar
Book
1. Nocedal J
2. Wright S
(2006) Numerical Optimization
Berlin, Germany: Springer Science & Business Media.

https://doi.org/10.1007/978-0-387-40065-5
- Google Scholar
Book
1. Pope SB
(2000) The scales of turbulent motion
In: Pope SB, editors. Turbulent Flows. Cambridge: Cambridge University Press. pp. 182–263.

https://doi.org/10.1017/CBO9780511840531
- Google Scholar
1. Power RM
2. Huisken J
(2017) A guide to light-sheet fluorescence microscopy for multiscale imaging
Nature Methods 14:360–373.

https://doi.org/10.1038/nmeth.4224
- PubMed
- Google Scholar
Preprint
1. Rackauckas C
2. Ma Y
3. Martensen J
4. Warner C
5. Zubov K
6. Supekar R
7. Skinner D
8. Ramadhan A
9. Edelman A
(2021) Universal Differential Equations for Scientific Machine Learning
arXiv.

https://arxiv.org/abs/2001.04385
- Google Scholar
1. Ramaswamy R
2. Jülicher F
(2016) Activity induces traveling waves, vortices and spatiotemporal chaos in a model actomyosin layer
Scientific Reports 6:20838.

https://doi.org/10.1038/srep20838
- PubMed
- Google Scholar
1. Reinbold PAK
(2020) Using noisy or incomplete data to discover models of spatiotemporal dynamics
Physical Review. E 101:010203(R).

https://doi.org/10.1103/PhysRevE.101.010203
- Google Scholar
Conference
1. Reuther A
2. Kepner J
3. Byun C
4. Samsi S
5. Arcand W
6. Bestor D
7. Bergeron B
8. Gadepally V
9. Houle M
10. Hubbell M
11. Jones M
12. Klein A
13. Milechin L
14. Mullen J
15. Prout A
16. Rosa A
17. Yee C
18. Michaleas P
(2018)
2018 IEEE High Performance Extreme Computing Conference

Interactive Supercomputing on 40,000 Cores for Machine Learning and Data Analysis. pp. 1–6.
- Google Scholar
1. Rohde LA
2. Heisenberg C
(2007) Zebrafish gastrulation: cell movements, signals, and mechanisms
International Review of Cytology 261:159–192.

https://doi.org/10.1016/S0074-7696(07)61004-3
- Google Scholar
(2012) Active Brownian particles
The European Physical Journal Special Topics 202:1–162.

https://doi.org/10.1140/epjst/e2012-01529-y
- Google Scholar
Software
1. Romeo N
(2022) surf-vec-defects, version swh:1:rev:6dc742c376b0d085e19ece65f932ac6935342aba
Software Heritage.

https://archive.softwareheritage.org/swh:1:dir:ee914ebe4261f8ecd9656d8fede9b4b56bc67136;origin=https://github.com/NicoRomeo/surf-vec-defects;visit=swh:1:snp:fdca320e1b23108e79cc51d3ad08492d9baed6fe;anchor=swh:1:rev:6dc742c376b0d085e19ece65f932ac6935342aba
1. Royer LA
2. Lemon WC
3. Chhetri RK
4. Wan Y
5. Coleman M
6. Myers EW
7. Keller PJ
(2016) Adaptive light-sheet microscopy for long-term, high-resolution imaging in living organisms
Nature Biotechnology 34:1267–1278.

https://doi.org/10.1038/nbt.3708
- PubMed
- Google Scholar
1. Sandberg VD
(1978) Tensor spherical harmonics on S2 and S3 as eigenvalue problems
Journal of Mathematical Physics 19:2441–2446.

https://doi.org/10.1063/1.523649
- Google Scholar
1. Saw TB
2. Doostmohammadi A
3. Nier V
4. Kocgozlu L
5. Thampi S
6. Toyama Y
7. Marcq P
8. Lim CT
9. Yeomans JM
10. Ladoux B
(2017) Topological defects in epithelia govern cell death and extrusion
Nature 544:212–216.

https://doi.org/10.1038/nature21718
- PubMed
- Google Scholar
1. Schmid PJ
(2010) Dynamic mode decomposition of numerical and experimental data
Journal of Fluid Mechanics 656:5–28.

https://doi.org/10.1017/S0022112010001217
- Google Scholar
(2020) Quantification of nematic cell polarity in three-dimensional tissues
PLOS Computational Biology 16:e1008412.

https://doi.org/10.1371/journal.pcbi.1008412
- PubMed
- Google Scholar
(2013) Rotation invariant features for HARDI
Information Processing in Medical Imaging 23:705–717.

https://doi.org/10.1007/978-3-642-38868-2_59
- PubMed
- Google Scholar
1. Shah G
2. Thierbach K
3. Schmid B
4. Waschke J
5. Reade A
6. Hlawitschka M
7. Roeder I
8. Scherf N
9. Huisken J
(2019) Multi-scale imaging and analysis identify pan-embryo cell dynamics of germlayer formation in zebrafish
Nature Communications 10:5753.

https://doi.org/10.1038/s41467-019-13625-0
- PubMed
- Google Scholar
1. Sknepnek R
2. Henkes S
(2015) Active swarms on a sphere
Physical Review. E, Statistical, Nonlinear, and Soft Matter Physics 91:022306.

https://doi.org/10.1103/PhysRevE.91.022306
- PubMed
- Google Scholar
1. Slater JC
2. Koster GF
(1954) Simplified LCAO Method for the Periodic Potential Problem
Physical Review 94:1498–1524.

https://doi.org/10.1103/PhysRev.94.1498
- Google Scholar
1. Solnica-Krezel L
(2005) Conserved patterns of cell movements during vertebrate gastrulation
Current Biology 15:R213–R228.

https://doi.org/10.1016/j.cub.2005.03.016
- PubMed
- Google Scholar
Book
1. Soong TT
2. Grigoriu M
(1993)
Random Vibration of Mechanical and Structural Systems

NASA STI/Recon Technical Report A.
- Google Scholar
Book
1. Stein EM
2. Shakarchi R
(2011)
Fourier Analysis: An Introduction

New Jersey, United States: Princeton University Press.
- Google Scholar
1. Stelzer EHK
(2015) Light-sheet fluorescence microscopy for quantitative biology
Nature Methods 12:23–26.

https://doi.org/10.1038/nmeth.3219
- PubMed
- Google Scholar
(2018) Global morphogenetic flow is accurately predicted by the spatial distribution of myosin motors
eLife 7:e27454.

https://doi.org/10.7554/eLife.27454
- PubMed
- Google Scholar
Preprint
1. Supekar R
2. Song B
3. Hastewell A
4. Choi GPT
5. Mietke A
6. Dunkel J
(2021) Learning Hydrodynamic Equations for Active Matter from Particle Simulations and Experiments
arXiv.

https://arxiv.org/abs/2101.06568
- Google Scholar
(2014) On dynamic mode decomposition: Theory and applications
Journal of Computational Dynamics 1:391–421.

https://doi.org/10.3934/jcd.2014.1.391
- Google Scholar
Software
1. Wagner J
(2020) Regularised numerical differentiation
MATLAB.

https://www.mathworks.com/matlabcentral/fileexchange/74165-regularised-numerical-differentiation
Book
(2021)
Emergence of puffs, weak and strong slugs from a stochastic predator-prey model for transitional turbulence with stream-wise shear interactions

In: Shih HY, editors. Bulletin of the American Physical Society. New York, United States: Simons Foundation. pp. 65–69.
- Google Scholar
(2015) Virial pressure in systems of spherical active Brownian particles
Soft Matter 11:6680–6691.

https://doi.org/10.1039/c5sm01412c
- PubMed
- Google Scholar

Article and author information

Author details

Nicolas Romeo
1. Department of Mathematics, Massachusetts Institute of Technology, Cambridge, United States
2. Department of Physics, Massachusetts Institute of Technology, Cambridge, United States
Contribution
Data curation, Formal analysis, Investigation, Methodology, Software, Validation, Writing – original draft, Writing – review and editing

Contributed equally with
Alasdair Hastewell

Competing interests
No competing interests declared

"This ORCID iD identifies the author of this article:" 0000-0001-6926-5371
Alasdair Hastewell

Department of Mathematics, Massachusetts Institute of Technology, Cambridge, United States

Contribution
Methodology, Software, Writing – original draft, Writing – review and editing

Contributed equally with
Nicolas Romeo

Competing interests
No competing interests declared

"This ORCID iD identifies the author of this article:" 0000-0003-2633-380X
Alexander Mietke

Department of Mathematics, Massachusetts Institute of Technology, Cambridge, United States

Contribution
Conceptualization, Formal analysis, Methodology, Project administration, Supervision, Validation, Writing – original draft, Writing – review and editing

For correspondence
amietke@mit.edu

Competing interests
No competing interests declared

"This ORCID iD identifies the author of this article:" 0000-0003-1170-2406
Jörn Dunkel

Department of Mathematics, Massachusetts Institute of Technology, Cambridge, United States

Contribution
Conceptualization, Methodology, Project administration, Supervision, Writing – original draft, Writing – review and editing

For correspondence
dunkel@math.mit.edu

Competing interests
No competing interests declared

"This ORCID iD identifies the author of this article:" 0000-0001-8865-2369

Funding

European Molecular Biology Organization (ALTF 528-2019)

Alexander Mietke

Deutsche Forschungsgemeinschaft (431144836)

Alexander Mietke

James S. McDonnell Foundation (Complex Systems Scholar Award)

Jörn Dunkel

Alfred P. Sloan Foundation (G-2021-16758)

Jörn Dunkel

MathWorks

Nicolas Romeo
Alasdair Hastewell

Robert E Collins Distinguished Scholarship Fund

Jörn Dunkel

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank Nico Scherf and Ghopi Shah for providing single-cell tracking data and for giving helpful advice on zebrafish development and we thank Paul Matsudaira for discussions. We thank the MIT SuperCloud (Reuther et al., 2018) for providing us access to HPC resources. This work was supported by a MathWorks Science Fellowship (NR and ADH), a Longterm Fellowship from the European Molecular Biology Organization (ALTF 528–2019, AM), a Postdoctoral Research Fellowship from the Deutsche Forschungsgemeinschaft (Project 431144836, AM), a Complex Systems Scholar Award from the James S McDonnell Foundation (JD), the Robert E Collins Distinguished Scholarship Fund (JD) and the Alfred P Sloan Foundation (G-2021–16758, JD).

Copyright

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

2,135

views
414

downloads
17

citations

Views, downloads and citations are aggregated across all versions of this paper published by eLife.

Citations by DOI

17

citations for umbrella DOI https://doi.org/10.7554/eLife.68679

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Open citations (links to open the citations from this article in various online reference manager services)

Mendeley

Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)

Nicolas Romeo
Alasdair Hastewell
Alexander Mietke
Jörn Dunkel

(2021)

Learning developmental mode dynamics from single-cell trajectories

eLife 10:e68679.

https://doi.org/10.7554/eLife.68679

Categories and tags

Research organism

Zebrafish

Share this article

Cite this article

From single-cell tracking data to sparse mode amplitude representations.

Time evolution of the pre-processed cell tracking data (point cloud, see Materials and methods), and of the density field ρ(r,t) (colormap) and associated flux J(r,t) (streamlines) corresponding to the harmonic modes {ρlm,jlm(1),jlm(2)} shown in Figure 1D.

Reconstruction of the hydrodynamics fields in real space by adding consecutive scalar and vector spherical harmonic modes of progressively higher order l.

Mode signatures of developmental symmetry breaking and topological defects in cellular flux.

Learning active Brownian particle (ABP) dynamics on a sphere.

Coarse-grained dynamics of active Brownian particles on the unit sphere in the low-noise (Dr=0.5) and high-noise (Dr=10) regime.

Model learning for experimental data of collective cell motion during early zebrafish development.

Comparison of dynamics of the experimental and learned density ρ⁢(r,t) (colormap) and flux fields J(r,t) (streamlines) represented in a Mollweide projection.

Illustration of the action of the coarse-graining tensor kernel K⁢(r,r′)i⁢j′ Equation 19.

Family of kernel functions fk⁢(ω) and gk⁢(ω) given in Equation 26a.

Sequentially adding vector spherical harmonics Ψl⁢m and Φl⁢m – equivalent to increasing lmax in Equation 5 – resolves increasing levels of details present in experimental flux fields ("Data").

Normalized spectral entropy as a function of the coarse-graining kernel width (top) computed for density ρ and flux field J using Equation 34.

Relative average mode reconstruction error Emodes⁢(nmax,lmax) Equation 36.

Scurl reconstruction error landscape (log scale) as a function of lmax and nmax.

Initially the data is rescaled using the median absolute deviation (MAD) defined in Equation 50 to account for variation in scales across the modes.

The 3×3-matrix invariant I2=12⁢(tr⁢[(MJ)2]-(tr⁢[MJ])2) sampled for pairs of positions r, r′ vanishes to machine precision for the dynamical matrix M learned on the zebrafish data.

Author details

Nicolas Romeo

Contribution

Contributed equally with

Competing interests

Alasdair Hastewell

Contribution

Contributed equally with

Competing interests

Alexander Mietke

Contribution

For correspondence

Competing interests

Jörn Dunkel

Contribution

For correspondence

Competing interests

Citations by DOI

Downloads (link to download the article as PDF)

Open citations (links to open the citations from this article in various online reference manager services)

Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)

Categories and tags

Research organism

Time evolution of the pre-processed cell tracking data (point cloud, see Materials and methods), and of the density field $ρ (r, t)$ (colormap) and associated flux $J (r, t)$ (streamlines) corresponding to the harmonic modes ${ρ_{l m}, j_{l m}^{(1)}, j_{l m}^{(2)}}$ shown in Figure 1D.

Reconstruction of the hydrodynamics fields in real space by adding consecutive scalar and vector spherical harmonic modes of progressively higher order $l$ .

Coarse-grained dynamics of active Brownian particles on the unit sphere in the low-noise ( $D_{r} = 0.5$ ) and high-noise ( $D_{r} = 10$ ) regime.

Comparison of dynamics of the experimental and learned density $ρ (r, t)$ (colormap) and flux fields $J (r, t)$ (streamlines) represented in a Mollweide projection.

Illustration of the action of the coarse-graining tensor kernel $K {(r, r^{'})}_{i j^{'}}$ Equation 19.

Family of kernel functions $f_{k} (ω)$ and $g_{k} (ω)$ given in Equation 26a.

Sequentially adding vector spherical harmonics $Ψ_{l m}$ and $Φ_{l m}$ – equivalent to increasing $l_{max}$ in Equation 5 – resolves increasing levels of details present in experimental flux fields ("Data").

Normalized spectral entropy as a function of the coarse-graining kernel width (top) computed for density $ρ$ and flux field $J$ using Equation 34.

Relative average mode reconstruction error $E_{modes} (n_{max}, l_{max})$ Equation 36.

$S_{curl}$ reconstruction error landscape (log scale) as a function of $l_{max}$ and $n_{max}$ .

The $3 \times 3$ -matrix invariant $I_{2} = \frac{1}{2} (tr [{(M^{J})}^{2}] - {(tr [M^{J}])}^{2})$ sampled for pairs of positions $r$ , $r^{'}$ vanishes to machine precision for the dynamical matrix $M$ learned on the zebrafish data.