Abstract
Bacterial communities are pivotal to maintaining ecological function and preserving the rich tapestry of biological diversity. The rapid development of environmental sequencing technologies, such as metagenomics, has revolutionized our capacity to probe such diversity. However, despite these advances, a theoretical understanding connecting empirical data with ecosystem modelling, in particular in the framework of disordered systems akin to spin glasses, is still in its infancy. Here, we present a comprehensive framework using theories of disordered systems to decode microbiome data, which offers insight into the ecological forces that shape macroecological states. By employing the quenched disordered generalized Lotka-Volterra model, we analyze species abundance data in healthy and diseased human gut microbiomes. Results reveal the emergence of two distinct patterns of species-interaction networks, elucidating the pathways through which dysbiosis may drive microbiome instability. Interaction patterns thus provide a window into the systemic shifts accompanying the transition from health to disease, offering a new perspective on the dynamics of the microbial community. Our findings suggest the potential of disordered systems theory to characterize microbiomes by capturing the essence of ecological interactions and their consequences on stability and functioning, leveraging our understanding of the linkages of dysbiosis and microbial dynamics.
1 Introduction
Bacterial communities are a fundamental reservoir of ecological functions and biological diversity. They are relevant for any environmental and host-associated ecosystem, ranging from soil to the human gut. In particular, the human gut microbiome, by interacting with host metabolism and immune system Levy et al. [2016], Barroso-Batista et al. [2015], Barreto and Gordo [2023], is a key regulator of human health. Dysbioses are defined as alterations of the healthy microbiome composition associated with the gastrointestinal tract Lloyd-Price et al. [2019], Das and Nair [2019], Nishida et al. [2018]. Some recent studies have shown that gut dysbioses are associated with changes in microbial species interaction patterns Bashan et al. [2016], Seppi et al. [2023], while emergent patterns focusing on single species properties Azaele et al. [2016], Grilli [2020], such as species abundance distribution (SAD), fail to characterize altered states of microbiome Seppi et al. [2023], Pasqualini et al. [2023]. Microbiomes host various interactions, measured for in-vitro communities Kehe et al. [2021], Venturelli et al. [2018], and are expected to govern natural ones.
Recently, stochastic non-interacting neutral Zeng et al. [2015], Venkataraman et al. [2015], Sala et al. [2016], Sieber et al. [2019] and logistic models Descheemaeker and De Buyl [2020], Grilli [2020], Zaoli and Grilli [2022], have been applied to give a satisfactory description of empirical patterns in different types of communities, ranging from human to soil microbiomes. However, they completely neglect species interactions in the population dynamics, and therefore they do not seem to have the right modelling tools to obtain novel insights from dysbioses models.
For these reasons, several approaches have been proposed in recent years to infer microbial interactions Faust and Raes [2012], Xiao et al. [2017], Camacho-Mateu et al. [2024]. However, such inference remains very challenging and problematic in many respects Faisal et al. [2010], Angulo et al. [2017], Tu et al. [2019], Armitage and Jones [2019], Holt [2020]: a) the very high dimension of typical microbiome datasets and the lack of longitudinal (long-term) experiments; b) the time-dependent nature of microbial species interactions Pacciani-Mori et al. [2021]; c) abiotic factors such as resources, temperature, and pH, can introduce environmental filtering effects Sireci et al. [2022] and induce effective interactions; d) experimental and technical challenges that introduce sampling effects, false positive species Tovo et al. [2020], spurious correlation, and bias in the data Weiss et al. [2016], Gloor et al. [2017], Dohlman and Shen [2019]. Therefore, even in a scenario where metagenomic samples are noise- and bias-free, reconstructing species-interaction networks of the underlying microbial dynamics remains a hard task.
An alternative, theoretically more elegant approach was pioneered by May May [1972], proposing to use random matrices to model species interaction networks, i.e., each entry of the adjacency matrix is extracted at random from a given distribution. Given the impossibility of empirically measuring the interaction strengths, the advantage of such an approach is the reduction from ~ S2 (if S is the number of species) to just a few parameters (e.g., 2) used to parameterize the distribution Allesina and Tang [2012]. In recent years several investigations have applied this framework to the study of population dynamics through generalized Lotka-Volterra (gLV) equations, employing quenched disordered system techniques (also known as “glassy”), mutated from Statistical Physics Bunin [2017], Galla [2018], Biroli et al. [2018], Altieri et al. [2021], Lorenzana and Altieri [2022]. The main idea is thus to focus on the ensemble properties of the gLV, which will depend only on a few relevant control parameters (e.g., the mean μ and the variance σ2 of the random interactions strengths) Barbier et al. [2018]. However, despite a range of very interesting results, this approach has remained confined mainly to purely theoretical domains (but see Hatton et al. [2024]) and has only recently been used to give an interpretation of controlled experiments with microbial communities Hu et al. [2022].
In this work, we therefore provide a proof of concept for the applicability of this framework to human microbiomes. In particular, we will infer the parameters of the quenched dgLV model from healthy and unhealthy cohorts of gut microbiomes. We will show that the different physiological states of the human gut microbiome correspond to different interaction patterns of the dgLV model. We will then introduce quantitative metrics to quantify the role of stability and the contribution of ecological forces to the dynamics.
2 Results
2.1 Disordered generalized Lotka-Volterra model
The dgLV model describes the evolution in time of the concentration abundances of a local pool of S interacting species, i.e.,
where Ni is the populations of species i-th, Ki is its carrying capacity, and ri the growth rate where
The replica formalism, a well-known technique in disordered systems, comes into play and allows us to derive a non-interacting Hamiltonian corresponding to the dgLV symmetric interactions αij (see Methods and the SI, Section B for the complete derivation). In simple words, instead of trying to solve the problem for one random setup, the formalism considers many replicas of the system and averages their behaviors. This approach helps to smooth out the randomness and reveals the typical behavior of the system. We also assume a single equilibrium scenario, known as replica-symmetric (RS) regime. Although the SADs of empirical microbial communities display fat tails, a feature more compatible with the multi-attractors phase of the asymmetric 1RSB Mallmin et al. [2024] case or with the GLV with annealed disorder Suweis et al. [2023], we consider this simplification as it is the regime in which we can obtain explicit analytical relations between the model parameters and the data and where model inversion is feasible.
On the other hand, by using a cavity argument Mezard and Montanari [2009], Altieri and Baity-Jesi [2024], one can analytically derive the SAD of the model (see Methods and SI, Section D):
where the auxiliary variable
The RS ansatz can also be characterized by its stability to external perturbations, such as the external temperature, the immigration rate, or the interaction heterogeneity. To investigate the stability of the RS phase, we consider the Hessian matrix of the theory by performing a harmonic expansion of the replicated free energy de Almeida and Thouless [1978]. When the leading eigenvalue of the Hessian matrix – suitably diagonalized on the so-called replicon sector – goes below zero, unstable equilibria appear: either the system moves towards a Replica Symmetric Breaking (RSB) phase with multiple locally stable minima, or it develops a marginal full-RSB phase, leading to a hierarchical structure of states and a very complex landscape (also known as glass phase). Therefore, in the latter cases, the RS ansatz no longer represents a thermodynamically stable phase. Based on standard calculations in disordered systems, this leading eigenvalue of the Hessian matrix, called replicon R, can be analytically evaluated as
where
2.2 Data Through the Glass
We now consider cross-sectionally sampled gut microbiomes of two different cohorts: one of healthy and another one of diseased individuals. Here we focus on chronic inflammation syndromes, but, in principle, what we present holds for groups of phenotypically distinct populations. We can idealize the samples of each group as generated from the stationary distribution of the dgLV Eq.(1), with different realizations of disorder α, but with shared μ and ρ (see Methods). The approach we propose therefore considers all samples in the same group as coming from the same statistical ensemble but, on the other hand, it distinguishes ecological regimes characterized by distinct phenotypes through different statistics (e.g., dissimilar μ, and σ) of the random species interactions α. We aim to test the hypothesis that each cohort will be described by a different set of ecological parameters (θ = (μ, σ, T, λ)).

Inference of the dgLV generative model is performed by a moment matching optimization procedure. We aim to infer the free parameters θ = (μ, σ, T, λ) so that to match the order parameters and the average carrying capacity (h, q0, qd, K). This procedure allows us to extract ecological dynamics information for cross-sectional data of healthy (blue) and diseased (red) microbiomes, which are conceptualized as independent disorder realizations.
To calculate the order parameters (h, qd, q0, K) from the data, we thus need to specify how we perform the ensemble and the disorder averages from the data (see Figure 1). Given that we typically lack time series data, and leveraging on an effective mean-field description, we consider estimating the ensemble average as the average among species (that is, different species are realizations of the same underlying stochastic process Azaele et al. [2016], known as neutral hypothesis), i.e.,
where Na,j gives the population density of species j in sample a.
To evaluate the carrying capacities Ki from the data, as done in most of the works using dGLV Bunin [2017], Biroli et al. [2018], Altieri et al. [2021], Suweis et al. [2023], Mallmin et al. [2024], we assume that in each cohort, all species are characterized by the same carrying capacity Ki = K. Here, we assign the value K as the average maximum relative abundance of species among the available samples for each cohort, thus

Order parameters inferred from the data using Eqs. (2.2). Panel A) h B) q0 and C) qd in healthy (blue) and unhealthy (red) cohorts. The gray points show the values for the corresponding null models where sample labels have been randomized. Standard deviations have been computed over 5000 realizations.
Note also that, from metagenomics, we only have access to compositional data for species abundances Pasqualini et al. [2023]. This is crucial to properly treat different samples and end up with a consistent analysis. Moreover, this also allows us to give an ecological interpretation of some of the order parameters. In particular, thanks to the compositionality of the data, we have that
From the modelling point of view, it can be rigorously shown that constraining the abundances to be normalized to one corresponds to introducing an additional Lagrange multiplier in the expression of the free energy. However, this only acts on the linear term contributing to shifting the mean of the random interactions, μ, but does not have any influence on the heterogeneity σ, therefore on the phase diagram. For more details, we refer the interested reader to the SI (Section B1) along with Altieri et al. [2021]. In Altieri et al. [2021] the authors also showed how a global normalizing constraint on species abundances reflects in a one-to-one mapping with the random replicant model Biscari and Parisi [1995]. (The replicator equations originally introduced by Diederich and Opper [1989] and recast within the replica formalism Biscari and Parisi [1995], Altieri et al. [2021] describe the dynamics of an ensemble of replicants that evolve via random couplings.)
2.3 Species interactions patterns characterize the state of microbiomes
We thus collect all the parameters estimated from the data in a vector π = (h, qd, q0, K). As we will better detail in the Methods, we develop a moment matching inference algorithm to infer the model parameters θ. We therefore generated order parameters from our model that are as close as possible to the data, as depicted in Figure 1. The idea of the method is to introduce a cost function C(θ|π), representing a total relative error for some self-consistent equations. If the parameters θ are such that the right part of the self-consistent equation equals the left part, the problem can be considered solved. Since the landscape of this cost function presents several minima, we performed multiple optimization procedures to collect an ensemble of possible solutions, from which we retained the top 30.
First, we find that different solutions θ* of the optimization problem provides ecological insights into the underlying microbiome populations. As Figure 3A shows, the inferred demographic noise strength (T) and the heterogeneity of the interactions (σ) for healthy and diseased microbiomes are clustered in the T - σ plane. In particular, the SAD for the healthy cohort is robust among the different solutions of the inference procedure, as shown by the superposition of the different curves in the inset of Figure 3A. On the other hand, SADs inferred from unhealthy patients have high sensitivity to different solutions. In particular, some of them display a mode for high-abundance species (light red lines in Figure 3A), a signature of dominant strain in the gut. Consistently, the distribution of the interactions P(αi,j) generated through the inferred parameters μ and σ is different between healthy and diseased cohorts, giving a distinct pattern of interactions (see Figure 3B), a result that is compatible with that found by Bashan and collaborators Bashan et al. [2016]. In particular, we find that dysbiosis reduces the heterogeneity of interaction strengths, a result also observed when taking correlations as a proxy for interactions Seppi et al. [2023].

Panel a: inferred T (demographic noise strength) and σ (interactions heterogeneity) for healthy (blue) and diseased (red) microbiomes are clustered. Darker dots correspond to better solutions (i.e., solutions with a lower value of the cost function C), while the two points with hexagonal markers correspond to the best two (healthy and diseased, respectively) solutions. In the first panel inset, we also show (in log-log scale) the SADs corresponding to each solution. To have a more concise representation, we present each SAD fixing the disorder to its average
We then assess how close the inferred σ and β = 1/T (the inverse temperature in a statistical physics approach) are to the critical replica symmetry-breaking (RSB) line of the dgLV (R = 0), evaluated being all the other parameters constant (see Methods). We find again that the replicon values R corresponding to each solution of our optimization protocol are significantly different for the two investigated microbiome phenotypes (see Figure 4A). In particular, diseased microbiomes are closer to marginal stability within the replica-symmetric ansatz Altieri et al. [2021], Mézard et al. [1987], de Almeida and Thouless [1978]. Furthermore, by investigating the shape of the SAD given by Eq. (2), we can estimate the ratio between niche (represented by species interaction) and neutral (represented by birth/death and immigration) ecological forces, which can be captured by the quantity ψ Wu et al. [2021]. It detects the emergence of internal modes in the species abundance distribution as a hallmark of niche processes (see SI, Section E).
Inspired by field-theory arguments (see Methods section Free-energy landscape exploration: replica formulation), we can call mass of the theory the m parameter as defined above in Eq. (2). When m → 0 the theoretical order parameters diverge and the system enters a pathological regime of unbounded growth. As such, the mass term, can be considered as a complementary stability measure, capable of capturing the transition to the unbounded growth regime.
In the model, two kinds of effects compete to shape the community structure. On the one hand, we have niche effects, encoded in disordered interactions and thus tracked by the parameters μ, σ, and K. Their overall effect is selective and tends to concentrate the SAD around the typical abundance value. On the other hand, we have neutral effects encoded into the stochastic dynamics and immigration, governing the low abundance regime of the SAD. When the demographic noise amplitude is stronger than immigration (ν < 1, as in our case) the SAD exhibits a low-abundance integrable divergence. In the opposite scenario, ν > 1, there is no divergence and the SAD is modal. Since interactions are random, the probability of observing an internal mode can be estimated as the fraction of SADs realization having non-trivial solutions to the stationary point equation. Such quantity, dubbed as the niche-neutral ratio, can be analytically evaluated and reads:
where

The replicon eigenvalue corresponding to each solution of our optimization procedure (shaded dots). The solid hexagon represents the replicon corresponding to the best solutions that minimize the error in predicting the order parameters of the theory (minimum C). The two investigated microbiome phenotypes (healthy in blue, diseased in red) are significantly different. In particular, diseased microbiomes are closer to the marginal stability of replica-symmetric ansatz (grey horizontal line). Panel b: Solutions of the moment-matching objective function are shown as a function of ψ and m, which in turn depend on the SAD parameters (see main text). Healthy (blue) and diseased (red) microbiomes appear to be clustered. Therefore, distinct ecological organization scenarios (strong neutrality/emergent neutrality) emerge. Darker dots correspond to solutions with lower values of the cost function, while hexagonal markers correspond to the two best solutions.
Discussion and Conclusions
In our exploration of the gut microbiome through the lens of disordered systems and random matrix theory, we have embarked on an ambitious journey to connect intricate theoretical concepts to practical analyses of environmental microbiome data. In particular, we have characterized healthy and unhealthy gut microbiomes using the disordered generalized Lotka-Volterra model (dgLV) for population dynamics. This attempt to apply advanced theoretical methodologies from disordered systems to real-world biological data represents a significant stride in understanding the complex interplay within microbial communities.
One delicate aspect of our study lies in addressing the inherent limitations and the potential scope for improvement in future research. A notable challenge is the discrepancy between the empirical Species Abundance Distribution (SAD), which is observed in the data, and the theoretical SAD predicted by the quenched dgLV model Suweis et al. [2023]. While empirical data showcase a diverse range of species abundance, following a power-law distribution, the model predictions tend to exhibit exponential decay in SAD tails. This mismatch underscores the need for further refinement of the model to accurately capture the nuanced patterns observed in real-world data. For example, recently it has been proposed that introducing an annealed disorder can generate SADs that look closer to the empirical ones Suweis et al. [2023]. Another possibility is to set the dgLV model parameters in such a way as to reproduce the multi-attractor phase. In fact, in this region, the SADs display a more heterogeneous shape Arnoulx de Pirey and Bunin [2024]. We already intend to explore this follow-up direction by combining one-step replica-symmetry-breaking (1RSB) computations in the multiple-attractor phase with Dynamical Mean-Field Theory analysis for the asymmetric interaction case.
Another related limitation is the challenge of generating species abundance samples from the dgLV model that mirrors the statistical properties of the observed empirical data. In our current framework, each microbiome sample could be extracted from p(N|ζ), where ζ is a realization of the disorder. However, p(N|ζ) near N → 0 presents a power-law exponent ν - 1, with νH,D ≈ 10-3. This generates numerical instabilities and dominates the sampling process, posing difficulties in generating representative synthetic samples. Moreover, while microbiome data are inherently compositional Pasqualini et al. [2023], the dgLV model species populations Ni are positive real numbers. However, as already noted, it can be shown that introducing a normalization
In conclusion, our work opens up new avenues for future research, particularly in refining the theoretical models to better align with empirical observations and in exploring the nuances of species abundance distributions within the microbiome context. Moreover, the integration of other forms of environmental variability and species-specific traits could provide a more holistic view of ecosystem dynamics as also proposed by the One Health-One Microbiome framework Tomasulo et al. [2024]. The insights gained from this study not only enhance our understanding of the complex dynamics in microbial ecosystems but also pave the way for more sophisticated analytical tools in ecological modelling, contributing significantly to our ability to interpret and manage these vital biological systems.
Acknowledgements
S.S. acknowledges Iniziativa PNC0000002-DARE - Digital Lifelong Prevention. A.M. acknowledges financial support under the National Recovery and Resilience Plan (NRRP), Mission 4, Component 2, Investment 1.1, Call for tender No. 104 by the Italian Ministry of University and Research (MUR), funded by the European Union - NextGenerationEU – Project Title “Emergent Dynamical Patterns of Disordered Systems with Applications to Natural Communities” – CUP 2022WPHMXK - Grant Assignment Decree No. 2022WPHMXK adopted on 19/09/2023 by the Italian Ministry of University and Research (MUR). A.A. acknowledges the support received from the Agence Nationale de la Recherche (ANR) of the French government, under the grant ANR-23-CE30-0012-01 (project “SIDECAR”). We thank Silvia De Monte for insightful discussions.
References
- Metabolites: messengers between the microbiota and the immune systemGenes & development 30:1589–1597
- Adaptive immunity increases the pace and predictability of evolutionary change in commensal gut bacteriaNature communications 6:8945
- Intrahost evolution of the gut microbiotaNature Reviews Microbiology 21:590–603
- Multi-omics of the gut microbial ecosystem in inflammatory bowel diseasesNature 569:655–662
- Homeostasis and dysbiosis of the gut microbiome in health and diseaseJournal of biosciences 44:1–8
- Gut microbiota in the pathogenesis of inflammatory bowel diseaseClinical journal of gastroenterology 11:1–10
- Universality of human microbial dynamicsNature 534:259–262
- Emergent functional organization of gut microbiomes in health and diseasesBiomolecules 14:5
- Statistical mechanics of ecological systems: Neutral theory and beyondReviews of Modern Physics 88:035003
- Macroecological laws describe variation and diversity in microbial communitiesNature communications 11:4743
- Emergent ecological patterns and modelling of gut microbiomes in health and in diseasebioRxiv :2023.10.19.563037
- Positive interactions are common among culturable bacteriaScience advances 7:eabi7159
- Deciphering microbial interactions in synthetic human gut microbiome communitiesMolecular systems biology 14:e8157
- Neutral models of microbiome evolutionPLoS computational biology 11:e1004365
- Application of a neutral community model to assess structuring of the human lung microbiomeMBio 6:10–1128
- Stochastic neutral modelling of the gut microbiota’s relative species abundance from next generation sequencing dataBMC bioinformatics 17:179–188
- Neutrality in the metaorganismPLoS Biology 17:e3000298
- Stochastic logistic models reproduce experimental time series of microbial communitieseLife 9:e55650
- The stochastic logistic model with correlated carrying capacities reproduces beta-diversity metrics of microbial communitiesPLOS Computational Biology 18:e1010043
- Microbial interactions: from networks to modelsNature Reviews Microbiology 10:538–550
- Mapping the ecological networks of microbial communitiesNature communications 8:2042
- Sparse species interactions reproduce abundance correlation patterns in microbial communitiesProceedings of the National Academy of Sciences 121:e2309575121
- Inferring species interaction networks from species abundance data: A comparative evaluation of various statistical and machine learning methodsEcological Informatics 5:451–464
- Fundamental limitations of network reconstruction from temporal dataJournal of the Royal Society Interface 14:20160966
- Reconciling cooperation, biodiversity and stability in complex ecological communitiesScientific reports 9:5580
- How sample heterogeneity can obscure the signal of microbial interactionsThe ISME journal 13:2639–2646
- Some thoughts about the challenge of inferring ecological interactions from spatial dataBiodiversity Informatics 15:61–66
- Constrained proteome allocation affects coexistence in models of competitive microbial communitiesThe ISME Journal 15:1458–1477
- Environmental fluctuations explain the universal decay of species-abundance correlations with phylogenetic distancebioRxiv :2022.07.12.499693
- Taxonomic classification method for metagenomics based on core protein families with core-kaijuNucleic Acids Research 48:e93-e93
- Correlation detection strategies in microbial data sets vary widely in sensitivity and precisionThe ISME journal 10:1669–1681
- Microbiome datasets are compositional: and this is not optionalFrontiers in microbiology 8:2224
- Mapping the microbial interactome: Statistical and experimental approaches for microbiome network inferenceExperimental Biology and Medicine 244:445–458
- Will a large complex system be stable?Nature 238:413–414
- Stability criteria for complex ecosystemsNature 483:205–208
- Ecological communities with lotka-volterra dynamicsPhysical Review E 95:042414
- Dynamically evolved community size and stability of random lotka-volterra ecosystems (a)Europhysics Letters 123:48004
- Marginally stable equilibria in critical ecosystemsNew Journal of Physics 20:083051
- Properties of equilibria and glassy phases of the random lotka-volterra model with demographic noisePhysical Review Letters 126:258301
- Well-mixed lotka-volterra model with random strongly competitive interactionsPhysical Review E 105:024307
- Generic assembly patterns in complex ecological communitiesProceedings of the National Academy of Sciences 115:2156–2161
- Diversity begets stability: Sublinear growth and competitive coexistence across ecosystemsScience 383:eadg8488https://doi.org/10.1126/science.adg8488
- Emergent phases of ecological diversity and dynamics mapped in microcosmsScience 378:85–89
- Effects of intraspecific cooperative interactions in large ecosystemsSciPost Physics 12:013
- Chaotic turnover of rare and abundant species in a strongly interacting model communityProceedings of the National Academy of Sciences 121:e2312822121
- Generalized lotka-volterra systems with time correlated stochastic interactionsarXiv preprint arXiv:2307.02851
- Information, physics, and computationOxford University Press
- An introduction to the theory of spin glassesIn: Encyclopedia of Condensed Matter Physics Academic Press pp. 361–370
- Stability of the sherrington-kirkpatrick solution of a spin glass modelJournal of Physics A: Mathematical and General 11:983
- Replica symmetry breaking in the random replicant modelJournal of Physics A: Mathematical and General 28:4697
- Replicators with random interactions: A solvable modelPhysical Review A 39:4333
- Spin glass theory and beyond: An Introduction to the Replica Method and Its ApplicationsWorld Scientific Publishing Company
- Understanding species abundance distributions in complex ecosystems of interacting speciesarXiv preprint arXiv:2103.02081
- Many-species ecological fluctuations as a jump process from the brink of extinctionPhysical Review X 14:011037
- Deviation from neutral species abundance distributions unveils geographical differences in the structure of diatom communitiesScience Advances 10:eadh0477
- Microbiome one health model for a healthy ecosystemScience in One Health :100065
Article and author information
Author information
Version history
- Preprint posted:
- Sent for peer review:
- Reviewed Preprint version 1:
Copyright
© 2025, Pasqualini et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Metrics
- views
- 106
- downloads
- 0
- citations
- 0
Views, downloads and citations are aggregated across all versions of this paper published by eLife.