Computerguided design of optimal microbial consortia for immune system modulation
Abstract
Manipulation of the gut microbiota holds great promise for the treatment of diseases. However, a major challenge is the identification of therapeutically potent microbial consortia that colonize the host effectively while maximizing immunologic outcome. Here, we propose a novel workflow to select optimal immuneinducing consortia from microbiome compositicon and immune effectors measurements. Using published and newly generated microbial and regulatory Tcell (T_{reg}) data from germfree mice, we estimate the contributions of twelve Clostridia strains with known immunemodulating effect to T_{reg} induction. Combining this with a longitudinal dataconstrained ecological model, we predict the ability of every attainable and ecologically stable subconsortium in promoting T_{reg} activation and rank them by the T_{reg} Induction Score (TrIS). Experimental validation of selected consortia indicates a strong and statistically significant correlation between predicted TrIS and measured T_{reg}. We argue that computational indexes, such as the TrIS, are valuable tools for the systematic selection of immunemodulating bacteriotherapeutics.
https://doi.org/10.7554/eLife.30916.001eLife digest
The role of the immune system is to protect the body from infection. It does this by using a powerful toolkit that isolates pathogens and removes damaged tissue. When directed against bacteria and viruses, the system helps to keep the body safe, but an imbalance of the components of the immune system can lead to inflammatory or allergic diseases.
The body has builtin mechanisms to shut off the immune response. For example, regulatory Tcells are immune cells with an antiinflammatory effect, meaning they can switch off inflammation after the immune system has cleared an invasion. If their numbers are too low, it can contribute to unwanted inflammation. In ulcerative colitis, for instance, an unbalanced immune system mistakenly attacks healthy tissue. This causes inflammation and ulcers in the intestine, resulting in bleeding and weight loss.
Previous studies revealed that bacteria in the gut might help to control regulatory Tcell numbers. Certain combinations of bacteria can stimulate these regulatory Tcells and potentially dampen inflammation. Tweaking the different populations of microbes in the gut could provide a new way to treat diseases like ulcerative colitis. But, identifying the best combination to use is a major challenge. Testing them all one by one would be extremely challenging. Now, Stein, Tanoue et al. present a mathematical model designed to predict the best microbemix to use.
The model incorporated data from regulatory Tcells and microbes gathered from mice to estimate the contribution that different strains of bacteria make to regulatory Tcell numbers. This then fed into an ecological model predicted how different combinations of bacteria would behave in mice. The model aimed to fulfill two key criteria. First, to find combinations that can form stable, longterm bacterial colonies alone and when faced with competing microbes. Second, to boost regulatory Tcell numbers, helping them to expand to correct the immune imbalance.
To test the predictions, mice received combinations of bacteria suggested by the model. The model had predicted some combinations to be 'weak' at inducing regulatory Tcells, some 'intermediate' and some 'strong'. The results matched the predictions, validating the method.
This new model allows the rapid design of microbes that could dampen the immune response in the gut and paves the way for new treatments to correct imbalances of the immune system. By using already available data, it should be possible to expand the model to other diseases with imbalance in the immune system. In future, similar models could also find combinations for other medical uses. For example, to optimise the delivery of cancer immunotherapy, or to modulate the immune system after an organ transplant.
https://doi.org/10.7554/eLife.30916.002Introduction
The intestinal microbiota has been shown to critically influence a multitude of host physiological functions, often through modulation of the immune system (Gevers et al., 2014; Paun et al., 2016). Evidence includes studies in germfree mice, which show that both proinflammatory Thelper 17 (T_{h}17) cells and antiinflammatory regulatory Tcells (T_{reg}) are reduced in numbers in the intestinal lamina propria relative to conventional or specific pathogenfree mice (Atarashi et al., 2008), and that repopulation with specific bacteria can reconstitute them (Atarashi et al., 2015; Ivanov et al., 2009). One route to how the microbiota influences immune system activation is by the production of small molecules (Donia and Fischbach, 2015). Microbially produced shortchain fatty acids (SCFAs) were shown to facilitate extrathymic differentiation of immune system modulatory T_{reg} (Arpaia et al., 2013; Atarashi et al., 2013; Smith et al., 2013) and are implicated in T_{reg}dependent antiinflammatory properties of a mix of 17 humanderived Clostridia strains (Atarashi et al., 2013; Tanoue et al., 2016). Interestingly, individual Clostridia strains only have a modest effect on T_{reg} induction – optimal induction relies on the synergistic interplay of several strains (Atarashi et al., 2013; Belkaid and Hand, 2014). Because of these properties, there is currently great interest in the manipulation of this community for the treatment of inflammatory and allergic diseases (Honda and Littman, 2012; Olle, 2013; Vieira et al., 2016). Efforts involving transplantation of bacteria from healthy humans to humans with C. difficile infections or with metabolic diseases have provided evidence that microbiota repopulation could be used as possible strategy for disease prevention and treatment (van Nood et al., 2013; Vrieze et al., 2012). For this reason, intestinal supplementation of defined compositions of gut bacteria to treat a range of diseases is currently being pursued by multiple biopharmaceutical companies (Garber, 2015). Determining what microbial consortia can colonize the host and stably coexist with an already resident microbial community, while inducing the desired immune response, is still a major challenge that hinders the translation of these efforts into the clinic (MaldonadoGómez et al., 2016; Weil and Hohmann, 2015).
In previous work, some of us identified a set of 17 potent T_{reg}inducing Clostridia strains to select candidate subsets from (Atarashi et al., 2013). The determination of maximally immunephenotype inducing combinations from these 17 strains is, however, experimentally infeasible as it would require the testing of 2^{17}–1 = 131071 possible subsets in mice (Faith et al., 2014). Therefore, a computational approach to prioritize which subsets to validate experimentally would have great utility. To our knowledge, to date there has been no model that allows for the simultaneous prediction of the dynamics of both the microbiota and the host immune response. Building on the approach to select for optimal bacterial combinations from (Faith et al., 2014) and other efforts aimed at identifying potential immunemodulating microbes (GevaZatorsky et al., 2017; Schirmer et al., 2016), we overcome this problem by proposing a mathematical modelingbased framework that, by resolving microbiome–immune system interactions with parameters constrained to experimental observations of microbiome and immune effectors, allows computational optimization of immunestimulating bacterial combinations. To achieve this, we use a series of logically connected analyses that capture CD4^{+}FOXP3^{+} T_{reg} accumulation in the colonic lamina propria (Omenetti and Pizarro, 2015; Round and Mazmanian, 2010) in response to the dynamics of humanderived Clostridia strains in germfree mice, which had previously been shown to potently induce T_{reg} expansion (Atarashi et al., 2013) (Figure 1). To constrain this proposed model framework to experimental data, we combine previously published (Atarashi et al., 2013) and newly generated fluorescenceactivated cell scanning (FACS) data with simulated and newly generated timeseries colonization data from germfree mice (Figure 2A). Subsequently, by applying a microbiome–T_{reg} mathematical model to this combined data set, we infer each strain’s individual contribution to the CD4^{+}FOXP3^{+} T_{reg} pool (Figure 2B) and obtain an estimate of every possible consortium’s T_{reg} induction potential. To justify the usage of predicted monocolonization concentrations from a previously developed microbiome ecological model (Bucci et al., 2016), we validate its ability in predicting temporal dynamics in response to different subsets of T_{reg}inducing strains (Figure 3A) and quantify the deviation of data and predictions (Figure 3B). Introduction of the TrIS, which assigns a score to every predicted steadystate microbial composition, enables us to identify combinations that robustly maximize T_{reg} induction (Figure 4A). We analyze the relationship of TrIS and biomass (Figure 4B) as well as metabolic features of the consortia (Figure 4C and D). Most importantly, we demonstrate the utility of our approach in predicting the potency of selected microbial combinations by validating the T_{reg} induction and colonization ability of modelpredicted strong, intermediate and weak T_{reg}inducing consortia in vivo (Figure 4E).

Figure 2—source data 1
 https://doi.org/10.7554/eLife.30916.005

Figure 4—source data 1
 https://doi.org/10.7554/eLife.30916.010
We envision that our framework, while in this study tailored to finding combinations ameliorating autoinflammatory conditions, may also have direct relevance to other immunesystem enhancing applications such as the optimal delivery of probioticbased cancer immunotherapies (Garrett, 2015).
Results
Generation of multimodal microbiome–T_{reg} data set
The goal of this study is to develop a mathematical modelingbased framework to rapidly and systematically select microbial consortia that maximize a desired immune outcome when introduced in a specific host microbial background. To achieve this, we combine (i) a microbiome ecological model that accurately describes the dynamics of these bacteria in the host with (ii) a microbiome–T_{reg} mathematical model that characterizes the contribution of every strain to the immune phenotype of interest given corresponding microbiome colonization data (Figure 1). For (i), we gathered newly produced quantitative polymerase chain reaction (qPCR) colonization data from gnotobiotic mice and combined it with a previously published microbiome ecological model of the dynamics of 12 T_{reg}inducing Clostridia strains that are part of an original consortium of 17 T_{reg}inducing strains discovered by some of us (Atarashi et al., 2013) (see also Table 1 for a breakdown of strains used in each study). In contrast to the original study of (Bucci et al., 2016), we included only 12 of the 13 strains used there because, based on the modeling and analysis of Bucci et al. (2016), Strain 6 from the 13strain set was predicted to not stably colonize in the presence of the other 12 strains. The published colonization data have been reported in our previous work (Bucci et al., 2016) and include timeseries measurements of microbial abundances by qPCR under dietary perturbations. These are used to derive a predictive microbiome ecology model in gnotobiotic conditions based on an extension of the generalized Lotka–Volterra (gLV) equations (Hofbauer and Sigmund, 1998) as introduced in Stein et al. (2013). For the newly generated dataset, we gavaged 14 mice with one of three possible 11strain subsets from the 12strain subset of original 13 strains (Figure 2) and used the derived stool measurements to validate the ability of our mathematical model in predicting unseen conditions. The three 11strain subsets were chosen based on our ‘keystoneness’ definition, a measure describing the marginal predicted quantitative effect of removing each strain from the full community (Bucci et al., 2016). Specifically, we included two 11strain combinations each missing one of the two highest keystonenessscoring strains (VE202 Strain 15 and VE202 Strain 4) and one 11strain combination which lacks the lowest keystoneness scoring strain (VE202 Strain 29). In analogy to Bucci et al. (2016), each strain’s density was profiled over time by qPCR with strainspecific primers (see Materials and methods). To resolve the contribution of each of these to T_{reg} induction for point (ii) we coupled the colonization data from fecal content with newly collected and published FACS measurements of the CD4^{+}FOXP3^{+} T_{reg} population in the lamina propria of these mice (Figure 2A). As it is crucial to capture each strain’s contribution alone and in combination with others, we also included CD4^{+}FOXP3^{+} T_{reg} measurements from our previously published monocolonization experiments (Atarashi et al., 2013) (Figure 2A). Due to the fact that the singlestrain monocolonization concentrations were not measured in these experiments (Atarashi et al., 2013), we simulated them using the microbiome ecological model and parameters from (Bucci et al., 2016). This choice was supported by the model's capability in predicting unseen validation data (Figure 3A). Spearman’s rankorder correlation coefficient ranges from 0.92 to 0.98 (pvalue<10^{−16}) between observations and predictions depending on the time point (Figure 3B).
Derivation of the microbiome–T_{reg} mathematical model
We used the described microbiome colonization data and corresponding CD4^{+}FOXP3^{+} T_{reg} measurements to determine the contribution of each strain to the T_{reg} pool. We begin by subdividing the population of CD4^{+} Tcells into two major subpopulations depending on their intracellular FOXP3 expression: CD4^{+}FOXP3^{+} T_{reg} and the remainder among the CD4^{+} Tcells, the conventional CD4^{+}FOXP3^{−} Tcells (Bilate and Lafaille, 2012; Rudensky, 2011). The concentration of CD4^{+} Tcells at time $t$ in the colonic lamina propria, ${c}_{\text{T}}(t)$, is then the combination of these two Tcell populations, ${c}_{\text{T}}\left(t\right)={c}_{\text{FOXP}{3}^{+}}\left(t\right)+{c}_{\text{FOXP}{3}^{}}\left(t\right)$. To include a variety of effects into our model, we assume Tcell dynamics to follow the gLV equations (Gerber, 2014; Hofbauer and Sigmund, 1998), which also account for the effect of the microbial strains in the lumen on the CD4^{+}FOXP3^{−} Tcell subpopulation. In addition, we use an extension of the standard gLV equations to include the impact of the Clostridia strains in terms of external perturbations (Stein et al., 2013). The resulting microbiome–T_{reg} mathematical model is found as,
where ${\alpha}_{{\text{FOXP}3}^{+}}$ denotes the basal growth rate and ${\beta}_{{\text{FOXP}3}^{+}{\text{FOXP}3}^{+}}$ the selfinteraction term of the CD4^{+}FOXP3^{+} T_{reg} population. The interaction terms ${\beta}_{{\text{FOXP}3}^{+}{\text{FOXP}3}^{}}$ and ${\beta}_{{\text{FOXP}3}^{}{\text{FOXP}3}^{+}}$ represent the effect of the CD4^{+}FOXP3^{−} Tcell on the CD4^{+}FOXP3^{+} T_{reg} population and of the CD4^{+}FOXP3^{+} T_{reg} on the CD4^{+}FOXP3^{−} Tcell population, respectively (d’Onofrio, 2005). Consequently, positive interaction parameters correspond to activation, negative ones to inhibition. Moreover, ${\epsilon}_{{i}_{k}}$ denotes the effect of strain ${i}_{k}$ on the CD4^{+}FOXP3^{+} T_{reg} population. For longterm observations, $t\to \infty $, the dynamics of $\text{}{c}_{\text{FOXP}{3}^{+}}(t)$ are given by its (nontrivial) steadystate solution, which simplifies the then static microbiome–T_{reg} mathematical model of the relative CD4^{+}FOXP3^{+} proportion in steady state, ${r}_{{\text{FOXP}3}^{+}\text{,}\mathrm{}\text{ss}}$, to a linear regression problem,
Here, we use that the absolute and relative abundance of the CD4^{+}FOXP3^{+}T_{reg} population are coupled by $\text{}{c}_{\text{FOXP}{3}^{+/}\text{,}\phantom{\rule{thinmathspace}{0ex}}\text{ss}}={c}_{\text{T},\phantom{\rule{thinmathspace}{0ex}}\text{ss}}\cdot {r}_{\mathrm{F}\mathrm{O}\mathrm{X}\mathrm{P}{3}^{+/}\text{,}\phantom{\rule{thinmathspace}{0ex}}\text{ss}}$ (see Materials and methods). We assume that the steadystate CD4^{+} Tcell concentration is constant, $c}_{\text{T,}\phantom{\rule{thinmathspace}{0ex}}\text{ss}}=\text{const.$, across all microbial compositions. We justify this because we are dealing with genetically similar mice and a set of closely related Clostridia. This assumption is however not justifiable when comparing noncolonized and colonized germfree mice (Faith et al., 2014).
Assigning to $r}_{\text{FOXP}{3}^{+}\text{,}\phantom{\rule{thinmathspace}{0ex}}\text{ss}$ the measured FACSderived CD4^{+}FOXP3^{+} T_{reg} proportions after 35 days and to $c}_{{\text{strain}}_{{i}_{k}}\text{,}\phantom{\rule{thinmathspace}{0ex}}\text{ss}$ the corresponding microbial profiles, we infer each strain's contribution to the CD4^{+}FOXP3^{+} T_{reg} pool (Figure 2B) by solving Equation (2) with an $\ell}^{2$penalized leastsquare regression with one shrinkage parameter, which is determined in a leaveonesampleout crossvalidation (Stein et al., 2013). The resulting normalized rootmean square deviation on leftout samples was found to be 12%.
Derivation of the T_{reg} Induction Score (TrIS) and selection of T_{reg}inducing consortia
After deriving a model to predict our candidate strains dynamics in germfree conditions and having resolved each strain’s contribution to T_{reg} expansion, we aim to use this information to computationally select for consortia that maximize T_{reg} induction while being ecologically robust (Bucci et al., 2016; Stein et al., 2013). To specify a measure of ecological robustness as well as immune induction potential for microbial consortia in germfree mice, we define the T_{reg} Induction Score (TrIS) as the average predicted regulatory Tcell activation of a given consortium of K strains $\left\{{\text{strain}}_{{i}_{1}},\cdots ,{\text{strain}}_{{i}_{K}}\right\}$ while ignoring contributions from the host,
If the predicted steady state of the microbial consortium $\left({c}_{{\text{strain}}_{{i}_{1}}}^{\left(n\right)},\dots ,{c}_{{\text{strain}}_{{i}_{K}}}^{\left(n\right)}\right)}_{\text{ss}$ that is computed from the $n$th Markov Chain Monte Carlo (MCMC) parameter estimate (Bucci et al., 2016) is biologically meaningful, i.e. positive, and stable, then $c}_{{\text{strain}}_{{i}_{k}}\text{,}\phantom{\rule{thinmathspace}{0ex}}\text{ss}}^{\left(n\right)$ denotes the steadystate concentration of strain ${i}_{k}$; otherwise $c}_{{\text{strain}}_{{i}_{k}}\text{,}\phantom{\rule{thinmathspace}{0ex}}\text{ss}}^{\left(n\right)$ is set to 0. Hence, the value of the TrIS is indicative of the expected CD4^{+}FOXP3^{+} T_{reg} induction (after removing the host contribution) and it is of the same unit as the FACS measurements. We evaluated TrIS for every possible strain combination that would stably colonize the gut in germfree background. In our computation, of the 2^{12}–1 = 4095 possible steadystate strain configurations evaluated in N = 22,500 MCMC parameter estimates, 84% are found to be biologically meaningful and stable. Interestingly, while the average TrIS increases with consortium size, our analysis shows that a subset size of seven already contains bacterial combinations maximizing induction (Figure 4A). Furthermore, in addition to the strong correlation between TrIS and the predicted total bacterial abundance in the consortium, we observed that highinduction consortia display an especially large enrichment in the abundance of Strain 27 (Figure 4B, Figure 4—figure supplement 1). Because shortchain fatty acids have been previously associated with colonic T_{reg} induction (Arpaia et al., 2013) and increase in density upon supplementation with these strains (Atarashi et al., 2013), we decided to test if modelingpredicted high T_{reg}inducing consortia were also enriched in SCFAs. We therefore compared the top 5inducing microbial consortia of size seven against their samesize counterpart bottom 5 (Figure 4C). We predicted the SCFA concentration for each of the predicted compositions by summing the scaled metabolic outputs of each strain measured in monocolonization experiments (Atarashi et al., 2013; Narushima et al., 2014) and normalized by the strain’s modelpredicted monocolonization density. We performed a Welch twosamples ttest for each of the predicted SCFAs concentrations and found significant enrichment for all estimated SCFAs (p<0.05, one tailed) in the highTrIS consortia compared to the low ones (Figure 4D).
Experimental validation of mathematical model predictions
We decided to experimentally test our approach's ability to correctly predict consortium ranking with respect to T_{reg}induction. Due to regulatory constraints on the used probiotic strains – limiting us to a maximum of four strains at a time in followup experiments – we selected five 4strain combinations with a variety of predicted immune effects. We measured CD4^{+}FOXP3^{+} T_{reg} induction for five microbial consortia of size four and the germfree control. We chose the two highest TrIS consortia (H1, H2 with rank 1 and 2, respectively), the lowest one (L with rank 495) and two TrISintermediate consortia of interest (M1, M2 with rank 129 and 452, respectively). Strains contained in each of the five consortia are detailed in Table 1. We correlated the TrIS score with the mean observed CD4^{+}FOXP3^{+} T_{reg} percentage and found a significant Pearson correlation with coefficient of 0.97 and pvalue<0.01 (Figure 4E). Importantly, when using 16S rRNA sequencing to investigate the resulting colonization profiles for these combinations, we observed that the high TrISscoring consortia (H1, H2, M1) all stably colonized while the two lowscoring consortia only displayed a subset of the introduced strains. This result remarkably reflects the nature of our scoring system which, in addition to immune activation potential, also incorporates colonization success (Equation 3).
Because the SCFA enrichment analysis from in vivo measurements of individual strains (Figure 4C) predicted acetate to be significantly increased in the five high versus low T_{reg}inducing consortia of size seven, we decided to compare the measured acetic acid concentrations (Figure 4—figure supplement 2A) to predictions of acetate in the two high (H1 and H2) and low (L) T_{reg}inducing 4strain consortia (Figure 4—figure supplement 2B). ANOVA and subsequent posthoc Tukey test showed significance in the measured enrichment of H1 compared to L (adjusted pvalue<0.05) and of H1 compared to H2 (adjusted pvalue<0.05) (Figure 4—figure supplement 2A). However, no statistical difference was observed between measurements in H2 and L (adjusted pvalue<0.05). Remarkably, the modelpredicted germfree normalized acetate enrichment and the observed acetic acid concentration (normalized by the mean in germfree conditions) are well correlated (Figure 4—figure supplement 2B).
Discussion
Manipulation of the intestinal microbiota with defined bacterial consortia for the treatment of disease is a promising route for future therapeutics (Hansen and Sartor, 2015). However, choosing bacterial combinations from the vast combinatorial space of microbes that effectively colonize a host (possibly with a dysbiotic microbiota) and at the same time maximize a desired host phenotype requires an enormous number of expensive and timeconsuming experimental trials (Faith et al., 2014). While data mining and statistical approaches could aid in this process, the majority of available microbiome analysis methods is still based on correlations (Gerber, 2014; Morgan et al., 2015) and inept at predicting unseen phenotypes. Identification of causal associations between microbes and host phenotype has been recently achieved by using an experimentally based microbe–phenotype triangulation (Surana and Kasper, 2017). However, this approach remains impractical when the goal is to explore and rank all possible microbiome consortium combinations with respect to a host phenotype of interest. In this study, we leveraged our previous computational methods to forecast temporal microbiome dynamics and make predictions on stability in the context of infectious and inflammatory diseases including C. difficile colonization and inflammatory bowel disease (Bucci et al., 2016; Buffie et al., 2015; Stein et al., 2013).
Specifically, we have presented a novel modelingbased method that combines dynamical predictions from datadriven models of the microbiome and the interactions between microbes and the immune response (such as CD4^{+}FOXP3^{+} T_{reg}) and thereby enables the rational design of immunemodulating bacterial consortia. Limiting our scope to longterm steadystate dynamics, we were able to derive a microbiome–T_{reg} mathematical model which is constrained by experimental observations from multimodal data. In the presented work, the used data modalities are qPCR and 16S rRNA sequencing data to estimate microbial abundances and FACS to assess CD4^{+}FOXP3^{+} T_{reg} induction, respectively. However, the proposed framework is naturally extensible to other host data, for example data from metabolite profiling, immune readouts (e.g. CD8 Tcell activation, T_{h}1/T_{h}17 cell depletion) or host transcriptome profiling (Atarashi et al., 2017; Morgan et al., 2015). To evaluate candidate consortia that could be relevant to the treatment of autoinflammatory diseases, we introduced the T_{reg} Induction Score, TrIS, as a novel metric that accounts for both ecological stability and efficacy in immune modulation. This metric allowed us to rationally identify combinations that would stably colonize and at the same time produce substantial T_{reg} induction in germfree mice. Remarkably, in validation experiments of five distinct modelingguided consortia with predicted diverse induction potential, we proved the ability of our approach to successfully select microbial combinations with a desired therapeutic activity. To our knowledge, this is to date, the first study in which observationconstrained in silico modeling of microbiome and host phenotype has successfully guided the rational design of drugs of defined bacterial consortia.
Our work relies on the gLV model assumption for both microbiome and CD4^{+} regulatory Tcell dynamics which, despite being very versatile, has some limitations including the lack of third or higher order interactions or saturation effects (Wangersky, 1978). Moreover, with the data available to us, we needed to assume that the overall Tcell density at steady state is constant across mice and different microbiome compositions. While this assumption was reasonable for the data we analyzed (see Results), the addition of future measurements on total Tcell abundance will likely improve the prediction accuracy of our model as well as provide important insight into the variability of Tcell population size in response to microbial immune induction.
We performed our data and modeling analysis in germfree mice. While previous work by some of us showed that the amount of T_{reg} induction is independent of the germfree background (IQI, Balb/c or C57BL/6) (Atarashi et al., 2015), it is noteworthy that our model, which was trained on IQI mice data, robustly predicted both T_{reg} induction and acetate enrichment of C57BL/6 colonized mice as used in the validation experiments.
Our predictions suggest some degree of consistency in terms of functional outcome (e.g. enrichment in acetate) of highscoring consortia relative to lowscoring ones in germfree mice. However, before translating these findings into therapeutic development, future studies will need to be performed in not germfree settings (e.g. SPF mice, humanized mice) in order to account for the effects of an alreadyestablished flora. Individualspecific microbiome features may constrain the colonization potential of the selected strains due to specific ecological network effects (Smits et al., 2016), which suggests the need for a careful characterization of the ecological interactions between the proposed probiotic product and a specific recipient community (e.g. a single ulcerative colitis patient (Atarashi et al., 2013)).
Treatment of autoimmune diseases overall is not necessarily achieved by the exclusive optimization of one objective function (e.g. for ulcerative colitis, the maximization of T_{reg} activation), but may need the simultaneous manipulation of a multiprocess host immune spectrum, which could include the concomitant reduction of proinflammatory phenotypes (Atarashi et al., 2015). Given new data availability and careful experimental design, we believe that the modeling framework proposed in this study could overcome these problems by introducing constraints into the scoring metric that account for precolonized mouse model features. To the limit, in the presence of a large training data set that would include sufficient information on individual patient variation, our workflow will give us the unprecedented potential of testing microbiota manipulation on a personalized level in silico.
Mathematical modelingbased methods have the potential to greatly accelerate the development of treatment of human disease (Michor and Beal, 2015). In this work, we develop and demonstrate in a firstofakind experimental validation the usefulness of a mathematical microbiome–immune system model. It can be considered a steppingstone to the accelerated prototyping and rational design of microbiome therapies.
Materials and methods
Derivation of the microbiome–T_{reg} induction mathematical model
Request a detailed protocolWe assume the following dynamics for the CD4^{+}FOXP3^{+} regulatory Tcell population:
Here, ${\alpha}_{{\text{FOXP}3}^{+}}$ denotes the basal growth rate and ${\beta}_{{\text{FOXP}3}^{+}{\text{FOXP}3}^{+}}$ the selfinteraction term of the CD4^{+}FOXP3^{+} T_{reg} population, while the interaction parameters ${\beta}_{{\text{FOXP}3}^{+}{\text{FOXP}3}^{}}$ and ${\beta}_{{\text{FOXP}3}^{}{\text{FOXP}3}^{+}}$ characterize the effect of the CD4^{+}FOXP3^{− }Tcells on the CD4^{+}FOXP3^{+} T_{reg} population and of the CD4^{+}FOXP3^{+ }T_{reg} on the CD4^{+}FOXP3^{−} Tcell population, respectively (d’Onofrio, 2005). Moreover, ${\epsilon}_{{i}_{k}}$ denotes the effect of strain ${i}_{k}$ on the CD4^{+}FOXP3^{+} T_{reg} population. The nontrivial steadystate solution (i.e., the algebraic solution of the righthand side of Equation (1) set to 0 with ${c}_{{\text{FOXP}3}^{+}}\ne 0$) is found as,
Using ${c}_{\text{T,}\phantom{\rule{thinmathspace}{0ex}}\text{ss}}={c}_{\text{FOXP}{3}^{+}\text{,}\phantom{\rule{thinmathspace}{0ex}}\text{ss}}+{c}_{\text{FOXP}{3}^{}\text{,}\phantom{\rule{thinmathspace}{0ex}}\text{ss}},$ the steadystate concentrations of the CD4^{+}FOXP3^{+} T_{reg} and CD4+FOXP3^{−} populations, $c}_{\text{FOXP}{3}^{+/}\text{,}\phantom{\rule{thinmathspace}{0ex}}\text{ss}$, are derived from the FACSbased relative abundances $r}_{\text{FOXP}{3}^{+/}\text{,}\phantom{\rule{thinmathspace}{0ex}}\text{ss}$ by, $\text{}{c}_{\text{FOXP}{3}^{+/}\text{,}\phantom{\rule{thinmathspace}{0ex}}\text{ss}}=\text{}{c}_{\mathrm{T}\text{,}\phantom{\rule{thinmathspace}{0ex}}\text{ss}}\cdot {r}_{\mathrm{F}\mathrm{O}\mathrm{X}\mathrm{P}{3}^{+/}\text{,}\text{}\text{ss}}={c}_{\text{T,}\phantom{\rule{thinmathspace}{0ex}}\text{ss}}\left(1{r}_{\mathrm{F}\mathrm{O}\mathrm{X}\mathrm{P}{3}^{/+}\text{,}\phantom{\rule{thinmathspace}{0ex}}\text{ss}}\right)$. Finally, the linear relationship between the relative abundances, $r}_{\text{FOXP}{3}^{+}\text{,}\phantom{\rule{thinmathspace}{0ex}}\text{ss}$, and the strain densities, $c}_{{\text{strain}}_{{i}_{k}\text{,}\phantom{\rule{thinmathspace}{0ex}}\text{ss}}$, is found as,
assuming constant concentration of CD4^{+} Tcells, ${c}_{\text{T,}\phantom{\rule{thinmathspace}{0ex}}\text{ss}}=\text{const}.$, across all possible microbiome compositions. The unknown parameters $\stackrel{~}{\alpha}$ and $\stackrel{~}{\epsilon}}_{{i}_{k}$ are estimated in an $\ell}^{2$penalized leastsquare regression (socalled ridge regression) with a positive shrinkage parameter $\lambda$, which is determined in a leaveonesampleout crossvalidation as ${\lambda}^{*}=2$.
Collection of experimental data for training the microbiome–immune system model
Strain abundance profiling
Request a detailed protocolIn our previous work (Bucci et al., 2016), we have inferred a mathematical model describing the dynamics of a 13strain subset from the original 17strain humanderived Clostridia consortium from Atarashi et al. (2013) in germfree mice. Using newly generated data from the same experimental setup, we are now able to assess its predictive quality for three distinct 11strain subsets. The three 11strain compositions were selected based on their capability to maintain stability in the simulations when either one of two high (Strain 15 and Strain 4, here referred to as cases I and II, respectively) or one low (Strain 29; III) keystone strain was removed (Bucci et al., 2016). For the experimental validation, germfree IQI mice were purchased from Sankyo Laboratories (Japan), randomized and maintained in germfree vinyl isolators in the animal facility of RIKEN. Twelve T_{reg}inducing Clostridia strains were selected from the previously reported VE202 consortium consisting of 17 T_{reg}inducing strains (4, 7, 9, 13–16, 21, 26–29) and were individually cultured in modified Eggerth Gagnon broth under strictly anaerobic conditions (80% N_{2}, 10% H_{2}, 10% CO_{2}) at 37°C in an anaerobic chamber (Coy Laboratory Products, Grass Lake, MI) to confluence. The cultured bacterial strains were then mixed and the three mixtures of 11 strains (described above) were orally inoculated into five IQI germfree adult mice each. One mouse for condition II died and was therefore discarded from the study. After an initial 9day interval of acclimation, we collected fecal pellets at 2–4 days interval until day 35, time at which mice were euthanized and analyzed for CD4^{+}FOXP3 induction as in Atarashi et al. (2013). Colonization levels for each strain were assessed by amplifying strainspecific regions with qPCR as described in Bucci et al. (2016). Bacterial genomic DNA was extracted from 1 to 2 fecal pellets using QIAamp DNA Stool Mini Kit (Qiagen, Hilden, Germany). The amount of DNA was quantified using a Qubit dsDNA HS assay kit and Qubit fluorometer (Invitrogen, Carlsbad, CA). DNA was then subjected to qPCR using Thunderbird SYBR qPCR Mix (TOYOBO, Osaka, Japan) and a LightCycler 480 (Roche) with primers specific to 16S ribosomal RNA (rRNA) genes of the 12 Clostridia strains as in Bucci et al. (2016). Quantification of each strain in each sample was accomplished using standard curves of known concentrations of DNAs purified from each strain individually cultured in vitro. Strain densities in each sample were calculated by dividing the above absolute quantification numbers by the weight of the extracted fecal DNA. The 11strain experiments were ethically approved by RIKEN, Keio and Azabu Universities under protocol H249(14) from RIKEN.
Estimation of CD4^{+}FOXP3^{+} T_{reg} – Isolation of intestinal lamina propria lymphocytes and flow cytometry
Request a detailed protocolThe colons were collected and opened longitudinally, washed with PBS to remove all luminal contents and shaken in Hanks’ balanced salt solution (HBSS) containing 5 mM EDTA for 20 min at 37°C. After removing epithelial cells, muscle layers and fat tissue using forceps, the lamina propria layers were cut into small pieces and incubated with RPMI1640 containing 4% fetal bovine serum, 0.5 mg/ml collagenase D, 0.5 mg/ml dispase and 40 mg/ml DNase I (all Roche Diagnostics, RischRotkreuz, Switzerland) for 1 hr at 37°C in a shaking water bath. The digested tissues were washed with HBSS containing 5 mM EDTA, resuspended in 5 ml of 40% Percoll (GE Healthcare, Boston, MA) and overlaid on 2.5 ml of 80% Percoll in a 15 ml Falcon tube. Percoll gradient separation was performed by centrifugation at 850 g for 25 min at 25°C. The lamina propria lymphocytes were collected from the interface of the Percoll gradient and suspended in icecold PBS. For analysis of T_{reg}, isolated lymphocytes were labeled with the LIVE/DEAD fixable dead cell stain kit (Life Technologies, Carlsbad, CA) to exclude dead cells from the analysis. Then, surface and intracellular staining of CD3, CD4 and FOXP3 was performed using the BV605labeled antiCD3 (17A2, Biolegend, San Diego, CA), BV421labeled antiCD4 (RM45, Biolegend), Alexa700labeled antiFOXP3 antibody (FJK16 s, eBioscience, San Diego, CA), and FOXP3 staining buffer set (eBioscience). The antibodystained cells were analyzed with LSR Fortessa and data were analyzed using FlowJo software (Tree Star, Ashland, OR).
Measurement of organic acids
Request a detailed protocolOrganic acid concentrations in caecal contents were determined by gas chromatographymass spectrometry (GCMS). Caecal contents (10 mg) were disrupted using 3 mm zirconia/silica beads (BioSpec Products) and homogenized in extraction solution containing 100 ml of internal standard (100 mM crotonic acid), 50 ml of HCl and 200 ml of ether. After vigorous shaking using a Shakemaster neo (Bio Medical Science) at 1500 rpm for 10 min, homogenates were centrifuged at 1000 g for 10 min and then the top ether layer was collected and transferred into new glass vials. Aliquots (80 ml) of the ether extracts were mixed with 16 ml of NtertbutyldimethylsilylNmethyltrifluoroacetamide (MTBSTFA). The vials were sealed tightly by screwing and heated at 80°C for 20 min in a water bath, and left at room temperature for 48 hr for derivatization. The samples were then run through a 6890N Network GC System (Agilent Technologies) equipped with HP5MS column (0.25 mm 330 m 30.25 mm) and 5973 Network Mass Selective Detector (Agilent Technologies, Santa Clara, CA). Pure helium (99.9999%) was used as a carrier gas and delivered at a flow rate of 1.2 ml/min. The head pressure was set at 10 psi with split 10:1. The inlet and transfer line temperatures were 250 µC and 260 µC, respectively. The following temperature program was used: 60 µC (3 min), 60–120°C (5°C/min), 120–300°C (20°C/min). One microliter quantity of each sample was injected with a run time of 30 min. Organic acid concentrations were quantified by comparing their peak areas with the standards.
Numerical simulations of the three 11strain subsets used for microbiome–T_{reg} model training
Request a detailed protocolWe used 22,500 sets of Markov Chain Monte Carlo generalized Lotka–Volterra parameter sets determined by applying the Bayesian Variable Selection algorithm within MDSINE to the data of Bucci et al. (2016), and the first time point of measured microbial profiles for each of the 14 validation mice as initial condition, to simulate the gLV system of differential equations corresponding to each mouse microbiome (Figure 3A). Prediction accuracy was evaluated by calculating the Spearman correlation coefficient between observed and predicted data (Figure 3B).
Simulation of monocolonization abundances
Request a detailed protocolAs the experimental data from Atarashi et al. (2013) only provided CD4^{+}FOXP3^{+} Tcell levels for the monostrain colonization experiments at 35 days after inoculation but no measurement of the longterm microbial concentrations in the gut, we used instead the corresponding estimated monostrain colonization densities obtained from the gLV model (section above and Figure 2A). In general, the longterm behavior of the gLV system is determined by its steady states which are uniquely defined by the inferred model parameters (Stein et al., 2013). We computed the parameter median in each single model variable from the 22,500 MCMC parameter sets. These median parameters were then used to deduce the steadystate densities, which were together with the measured CD4^{+}FOXP3 proportions included into the training of the microbiome–T_{reg} model.
Collection of CD4^{+}FOXP3^{+} T_{reg} data from 4strain experiments for validation of the modelingbased predictions
Request a detailed protocolBacterial strains 4, 7, 9, 14, 15, 16, 27, 28, 29 were grown anaerobically in PYG broth (Peptone, Yeast and Glucose broth from Anaerobe Systems, Cat no: AS822) until they reached stationary phase (48 hr for strains 27 and 29, 24 hr for the remaining strains). Each 200 µlmouse dose of a 4strain LBP contained 50 µl of 20 times concentrated stationary phase culture. Germfree C57BL/6 mice aged 6–8 weeks were randomized and gavaged with a total dose of 5·10^{7}–2·10^{8} bacteria in a 200 µl, and maintained under gnotobiotic conditions for four weeks. Use of a C57BL/6 background for these experiments was motivated by availability of animals at the facility where we performed the validation and justified by the fact that previous work from us has shown that T_{reg} induction by our Clostridia strains does not differ between Balb/c, IQI, and C57BL/6 mice (Atarashi et al., 2015). Mice were then sacrificed, colons harvested, and lamina propria leukocytes isolated and stained for CD3^{+}CD4^{+}FOXP3^{+} T_{reg} as described above. Eight mice each were used for consortia High 1 (H1strains: 7, 27, 28, 29) and High 2 (H2strains: 4, 7, 27, 29). Five mice each were used for the intermediate high (M1strains: 4, 7, 14, 28), and intermediate low consortia (M2strains: 9, 16, 27, 29). Three mice were used for Low 1 (L1strains: 14, 15, 16, 29). Colonization profiling was determined through 16S rRNA sequencing (as above) and verified by blasting representative sequences to a 16S VE202 fasta database. The 4strain validation experiments were performed in the Massachusetts Host Microbiome Center under IACUC protocol 2016N000141.
Data availability

Th17 Cell Induction by Adhesion of Microbes to Intestinal Epithelial CellsPublicly available at the DDBJ Center (accession no. DRA003884).
References

Induced CD4+Foxp3+ regulatory T cells in immune toleranceAnnual Review of Immunology 30:733–758.https://doi.org/10.1146/annurevimmunol020711075043

Identifying gut microbehost phenotype relationships using combinatorial communities in gnotobiotic miceScience Translational Medicine 6:ra11.https://doi.org/10.1126/scitranslmed.3008051

The treatmentnaive microbiome in newonset Crohn's diseaseCell Host & Microbe 15:382–392.https://doi.org/10.1016/j.chom.2014.02.005

Therapeutic manipulation of the microbiome in IBD: Current results and future approachesCurrent Treatment Options in Gastroenterology 13:105–120.https://doi.org/10.1007/s1193801400427

The microbiome in infectious disease and inflammationAnnual Review of Immunology 30:759–795.https://doi.org/10.1146/annurevimmunol020711074937

Immune recognition and response to the intestinal microbiome in type 1 diabetesJournal of Autoimmunity 71:10–18.https://doi.org/10.1016/j.jaut.2016.02.004

Regulatory T cells and Foxp3Immunological Reviews 241:260–268.https://doi.org/10.1111/j.1600065X.2011.01018.x

Development and maintenance of intestinal regulatory T cellsNature Reviews Immunology 16:295–309.https://doi.org/10.1038/nri.2016.36

Duodenal infusion of donor feces for recurrent Clostridium difficileNew England Journal of Medicine 368:407–415.https://doi.org/10.1056/NEJMoa1205037

New insights into therapeutic strategies for gut microbiota modulation in inflammatory diseasesClinical & Translational Immunology 5:e87.https://doi.org/10.1038/cti.2016.38

Lotkavolterra population modelsAnnual Review of Ecology and Systematics 9:189–218.https://doi.org/10.1146/annurev.es.09.110178.001201

Fecal microbiota transplant: benefits and risksOpen Forum Infectious Diseases 2:ofv005.https://doi.org/10.1093/ofid/ofv005
Decision letter

Rob KnightReviewing Editor; University of California, San Diego, United States
In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.
Thank you for submitting your article "Computational prediction of immune effectors from the microbiome to select for optimal immune–modulating consortia" for consideration by eLife. Your article has been reviewed by three peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Wendy Garrett as the Senior Editor. The reviewers have opted to remain anonymous.
The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.
Summary:
In this manuscript, the authors colonize germ–free mice with individual bacterial strains, and test for induction of T_{regs}. They then use this information to construct a model for induction of T_{regs} by a complex microbial community, reducing the number of combinations that must be tested explicitly. Specifically, they predict strong– and weak–inducing consortia and provide experimental validation of fit to the model. The authors use existing qPCR data from 13 strains and a generalized Lotka–Volterra model they previously published as the basis for the experimental validation. They deploy this model on 11–strain and 4–strain consortia. The three 11–strain combinations consist of two that lack strains with the highest "keystone" scores, and one that lacks the strain with the lowest score. They also test five 4–strain consortia, again collecting information on the dynamics. They also provide additional predictions relevant to ulcerative colitis.
In our view, this manuscript provides a fascinating example of validation of computational predictions relating specific microbial strains to immune response. However, we believe that although the work is impressive, additional validation needs to be done and that the last disease test case is less convincing than the rest of the work. We recommend that this be either removed or that experimental validation be provided.
Overall this is an impressive body of work and, as the authors note, point the way towards better rational design of microbial communities that alter the immune system.
Essential revisions:
1) Several of the figures are misleading or unreadable. Figure 3 shows observed vs predicted data and r^{2} in log space, but the results are likely to look much worse in linear space given the pattern of scatter, and this should be disclosed to readers rather than concealed.
2) Figure 4B is unreadable, and per–strain correlations should be summarized in a different way that is easier to understand.
3) Figure 5A is also unreadable and it is impossible to match up the stacked bar and area presentation; some other form of showing the data is needed.
4) Because the pathway data are normalized, Mann–Whitney U is not appropriate (in related tasks, its false discovery rate has been shown to be up to 95%) and appropriate statistics for differential abundance that take compositionality into account should be used.
5) The theoretical predictions for the consortia transferred into mice with UC patient stool, while intriguing, should ideally be backed by experiment. The results showing model fit to data for the 4–strain communities are impressive but much weaker than the rhetoric in the text implies: Spearman rho of –0.47 means that more than three quarters of the variance in the rank order of the data, let alone the quantitative data, is not explained by their explanatory variable. So, although the association is statistically significant, its power to predict results is low. This component should be removed.
6) The authors have only moderate support for their primary hypothesis: that immune response as a function of microbial consortia can be predicted via a model. In the GF mouse case, I see the key data testing this hypothesis in Figure 4E. While the stated correlation is significant, I'm concerned this statistic is being used inappropriately. Incorporating the replicates into the correlation seems ill–advised (doing so, for example, could let you compute a correlation involving only two groups provided each had enough replicates within). Rather, if the authors want to use the replicates, an ANOVA with post–hoc tests seems more appropriate. Perhaps even better, the authors could correlate the mean observed percentage of CD4^{+}FOXP3^{+} to the mean predicted TrIS across the 5 groups. The trends look promising. But, this may be too small a number of groups to actually power such a correlation; hence, my overall concern that there's not enough data validating the model.
7) The manuscript needs to make clear when the authors are using new vs. previously published, and in silico vs. measured data. Examples include Figure 4, which analyses the insilico results using a mixture of metabolomic predictions and experimental validation. It seems Figure 1 and Figure 2 are efforts to clear up that confusion. Unfortunately, they didn't work in my case; perhaps it’s worth trying to merge those figures into a single flow chart?
8) Some variables in Equations 1–3 aren't explicitly defined in the text (and don't seem to be in the Materials and methods section either), including i, α, β, ε, and n. These need to be defined.
9) How many strains were available and how many strains were actually selected? The numbers mentioned in the text are inconsistent in different parts. I provide only a few examples below, but this inconsistency is present throughout the text.
– Introduction: "a mixture of 17 human–derived Clostridia (ref 9)".
– Introduction: "our previous work identified on the order of 15 possible candidate strains to select from (ref 9)".
– Results section: "we gavaged 14 mice with one of three possible 11–strain subsets from the original 13 strains (Figure 2)".
– Figure 1: "12 therapeutically relevant Clostridia strains in the lumen".
– Figure 2: "13 strain cocktail".
Bottom line, the authors should provide a clear layout/table of the strains that were identified in ref 9, of the strains that were used in the paper, and of the combinations. At the moment, this part is very confusing, and the numbers do not match throughout the text.
10) How many GF mice were actually used, and why do the authors use IQI GF mice for some experiments, and C57BL/6 GF mice for others? The mouse genetic background was changed for no apparent reason.
Also Introduction: "we gavaged 14 mice with one of three possible 11–strain subsets from the original 13 strains (Figure 2)".
Materials and methods section: "the three mixtures of 11 strains (described above) were orally inoculated into five IQI germ–free adult mice each."
How many mice were actually gavages with the 11 strains? 14 or 5? Please clarify the experimental design.
https://doi.org/10.7554/eLife.30916.016Author response
Essential revisions:
1) Several of the figures are misleading or unreadable. Figure 3 shows observed vs predicted data and r^{2} in log space, but the results are likely to look much worse in linear space given the pattern of scatter, and this should be disclosed to readers rather than concealed.
We edited Figure 3 by removing the old scatter plot (old panel B) and substituted it with one where we calculated the Spearman correlation between observations and predictions at different discrete time points in order to not violate the assumption of independence of observations for correlation tests. Calculation of Spearman’s correlation between predictions and observations follows our previous work (Stein et al., 2013). See also subsection “Generation of multimodal microbiome–T_{reg} data”.
2) Figure 4B is unreadable, and per–strain correlations should be summarized in a different way that is easier to understand.
We agree with the reviewer’s comment. Accordingly, we modified Figure 4B to only focus on Strain 27. We added a new supplementary figure (Figure 4—figure supplement 1) to display the same analysis for all other strains each in a separate panel.
3) Figure 5A is also unreadable and it is impossible to match up the stacked bar and area presentation; some other form of showing the data is needed
Following the suggestion from the editor we removed the UC biota analysis. Hence, Figure 5 is now removed from the manuscript.
4) Because the pathway data are normalized, Mann–Whitney U is not appropriate (in related tasks, its false discovery rate has been shown to be up to 95%) and appropriate statistics for differential abundance that take compositionality into account should be used.
We agree with the reviewer’s comment. Similarly to the response to comment #3, this analysis is not anymore included as the UCbiota related work has been removed from the manuscript.
5) The theoretical predictions for the consortia transferred into mice with UC patient stool, while intriguing, should ideally be backed by experiment. The results showing model fit to data for the 4–strain communities are impressive but much weaker than the rhetoric in the text implies: Spearman rho of –0.47 means that more than three quarters of the variance in the rank order of the data, let alone the quantitative data, is not explained by their explanatory variable. So, although the association is statistically significant, its power to predict results is low. This component should be removed.
As suggested by the editor, we removed the data and analysis pertaining the UC biota scenario. Regarding the analysis of the 4–strain validation experiments see the response to comment #6.
6) The authors have only moderate support for their primary hypothesis: that immune response as a function of microbial consortia can be predicted via a model. In the GF mouse case, I see the key data testing this hypothesis in Figure 4E. While the stated correlation is significant, I'm concerned this statistic is being used inappropriately. Incorporating the replicates into the correlation seems ill–advised (doing so, for example, could let you compute a correlation involving only two groups provided each had enough replicates within). Rather, if the authors want to use the replicates, an ANOVA with post–hoc tests seems more appropriate. Perhaps even better, the authors could correlate the mean observed percentage of CD4^{+}FOXP3^{+} to the mean predicted TrIS across the 5 groups. The trends look promising. But, this may be too small a number of groups to actually power such a correlation; hence, my overall concern that there's not enough data validating the model.
We fully agree with the reviewer’s comment and followed the reviewer’s suggestion to correlate the mean CD4^{+}FOXP3^{+} Treg level for each consortium with the modelingpredicted Treg induction score. Pearson’s correlation yields a coefficient of 0.97 and a pvalue < 0.01. This should sufficiently validate our modeling predictions. We accordingly modified the text (see subsection “Derivation of the Treg Induction Score (TrIS) and selection of T_{reg}inducing consortia”), the related figure panel and the corresponding caption.
7) The manuscript needs to make clear when the authors are using new vs. previously published, and in silico vs. measured data. Examples include Figure 4, which analyses the insilico results using a mixture of metabolomic predictions and experimental validation. It seems Figure 1 and Figure 2 are efforts to clear up that confusion. Unfortunately, they didn't work in my case; perhaps it’s worth trying to merge those figures into a single flow chart?
We agree with the reviewer and apologize for the confusion. We edited Figure 1 to be solely a clear conceptual diagram detailing the entire workflow that we present in the paper. We also rearranged Figure 2 and explicitly state what is new data and what data was published already. Also, we now explicitly specify (in Figure 2 and caption) what data was experimentally measured and what was derived from in silico simulations. We also better describe this in the main text (see Introduction and Results section). See also response to comment #9.
8) Some variables in Equations 1–3 aren't explicitly defined in the text (and don't seem to be in the Materials and methods section either), including i, α, β, ε, and n. These need to be defined.
Following the reviewer’s suggestion, we included a detailed description of the used parameters when they are introduced (see subsection “Derivation of the microbiome–T_{reg} mathematical model to select microbial consortia” and Discussion section).
9) How many strains were available and how many strains were actually selected? The numbers mentioned in the text are inconsistent in different parts. I provide only a few examples below, but this inconsistency is present throughout the text.
– Introduction: "a mixture of 17 human–derived Clostridia (ref 9)".
– Introduction: "our previous work identified on the order of 15 possible candidate strains to select from (ref 9)".
– Results section: "we gavaged 14 mice with one of three possible 11–strain subsets from the original 13 strains (Figure 2)".
– Figure 1: "12 therapeutically relevant Clostridia strains in the lumen".
– Figure 2: "13 strain cocktail".
Bottom line, the authors should provide a clear layout/table of the strains that were identified in ref 9, of the strains that were used in the paper, and of the combinations. At the moment, this part is very confusing, and the numbers do not match throughout the text.
We agree with the reviewer comment and again apologize for the confusion. Following the reviewers comment we made a table (Table 1) detailing membership for all the experiments performed for this work as well as from our previous studies. We also modified the text to clarify this point (Introduction).
10) How many GF mice were actually used, and why do the authors use IQI GF mice for some experiments, and C57BL/6 GF mice for others? The mouse genetic background was changed for no apparent reason.
Also Introduction: "we gavaged 14 mice with one of three possible 11–strain subsets from the original 13 strains (Figure 2)".
Materials and methods section: "the three mixtures of 11 strains (described above) were orally inoculated into five IQI germ–free adult mice each."
How many mice were actually gavages with the 11 strains? 14 or 5? Please clarify the experimental design.
We made consistent and clarified in throughout text the number of animals used. The use of a C57BL/6 background for these experiments was motivated by animal availability at the time of performing experimental validation. C57BL/6 use can be justified by the fact that previous work from us has shown that Treg induction by our Clostridia strains does not differ between Balb/c, IQI, and C57BL/6 mice (see Atarashi et al., 2015). We reworded the text and this statement and added a paragraph to the Discussion section.
https://doi.org/10.7554/eLife.30916.017Article and author information
Author details
Funding
National Institutes of Health (P41 GM103504)
 Richard R Stein
 Chris Sander
Defense Advanced Research Projects Agency (BRICS award HR001115C0094)
 Georg K Gerber
Human Frontier Science Program (RGP00055/2015)
 Chris Sander
Takeda Science Foundation
 Kenya Honda
National Institute of General Medical Sciences (5R01 GM106303)
 Chris Sander
Japan Agency for Medical Research and Development
 Kenya Honda
National Institute of Allergy and Infectious Diseases (R15AI11298501A1)
 Vanni Bucci
National Science Foundation (1458347)
 Vanni Bucci
Core Research for Evolutional Science and Technology
 Kenya Honda
Brigham and Women's Hospital (Precision Medicine Initiative)
 Georg K Gerber
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Ethics
Animal experimentation: 11strain timeseries mouse experiments were performed under ethical approval by RIKEN, Keio and Azabu Universities under protocol H249(14) (RIKEN). 4strain validation mouse work was performed at Brigham and Women's Hospital in Boston, MA in the Massachusetts Host Microbiome Center under IACUC protocol 2016N000141.
Reviewing Editor
 Rob Knight, University of California, San Diego, United States
Version history
 Received: August 1, 2017
 Accepted: March 31, 2018
 Accepted Manuscript published: April 17, 2018 (version 1)
 Version of Record published: May 11, 2018 (version 2)
Copyright
© 2018, Stein et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Metrics

 7,648
 Page views

 1,328
 Downloads

 52
 Citations
Article citation count generated by polling the highest count across the following sources: Scopus, Crossref, PubMed Central.