Metabolic model-based ecological modeling for probiotic design

  1. James D Brunner (corresponding author)
  2. Nicholas Chia (corresponding author)
  1. Biosciences Division, Los Alamos National Laboratory, United States
  2. Center for Nonlinear Studies, Los Alamos National Laboratory, United States
  3. Data Science and Learning, Argonne National Laboratory, United States
7 figures, 3 tables and 2 additional files

Figures

Population models such as Lotka-Volterra generally require dense time-longitudinal data to accurately parameterize, and directly fit interactions between species (solid arrow in right panel).

In this work, we leverage genome-scale metabolic modeling to parameterize population models with only genomic data from a single time-point. This is accomplished by modeling microbial interactions …
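For readers unfamiliar with the population model named above, the following is a minimal sketch of a generalized Lotka-Volterra simulation. The growth rates `r`, the interaction matrix `A`, the forward-Euler integrator, and the function names are illustrative assumptions, not the authors' implementation; in the approach described here the interaction terms would be derived from metabolic models rather than fitted to time-series data.

```python
def glv_step(x, r, A, dt):
    """One forward-Euler step of dx_i/dt = x_i * (r_i + sum_j A[i][j] * x_j)."""
    n = len(x)
    nxt = []
    for i in range(n):
        per_capita = r[i] + sum(A[i][j] * x[j] for j in range(n))
        # Clamp at zero so numerical error cannot produce negative abundance.
        nxt.append(max(x[i] + dt * x[i] * per_capita, 0.0))
    return nxt


def simulate_glv(x0, r, A, dt=0.01, steps=5000):
    """Integrate the model from x0 and return the final abundances."""
    x = list(x0)
    for _ in range(steps):
        x = glv_step(x, r, A, dt)
    return x


# Toy example with made-up numbers: species 1 inhibits species 0.
r = [1.0, 1.0]
A = [[-1.0, -0.5],
     [0.0, -1.0]]
final = simulate_glv([0.1, 0.1], r, A)
```

In this toy system the second species settles at its carrying capacity of 1.0 and holds the first at 0.5, which shows how a single negative interaction term changes the outcome for the species it targets.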

Comparison of AUC-ROC Between Inferred Parameters & Null Model.

(A, B) The power to predict engraftment versus non-engraftment of the B. longum probiotic was relatively robust to the joint flux balance analysis (FBA) hyperparameter setup, as shown and measured …

AUC-ROC of Standard Classifier Predictions.

(A) Our method (horizontal lines) significantly outperformed the support vector machine classifier, which was assessed with 1000 random train/test splits. (B) The random forest classifier, also …
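As a reminder of what the AUC-ROC values in these comparisons measure, here is a minimal sketch that computes the statistic directly from scores and binary labels. It uses the standard pairwise definition (the probability that a positive sample outscores a negative one, with ties counted as one half); the function name and the example numbers are hypothetical and are not taken from the study data.

```python
def auc_roc(scores, labels):
    """AUC-ROC as the fraction of (positive, negative) pairs ranked correctly.

    scores: predicted scores, higher meaning more likely positive.
    labels: truthy for positive (e.g. engraftment), falsy for negative.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Made-up scores for four samples, two of which are labeled positive.
example_auc = auc_roc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes.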

Effect of Uniform Parameter Shifts in the Lotka-Volterra Model.

(A, B) We altered the generalized Lotka-Volterra model with uniform shifts in parameters which added either antagonism or self-inhibition to the model. We tested self-inhibition with values from 0 …
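A uniform parameter shift of this kind can be sketched as an operation on the interaction matrix. The sketch below assumes that antagonism is added by subtracting a constant from every off-diagonal entry and self-inhibition by subtracting a constant from every diagonal entry; the sign convention, the function name, and the example matrix are assumptions for illustration and may differ from the authors' setup.

```python
def shift_parameters(A, antagonism=0.0, self_inhibition=0.0):
    """Return a copy of interaction matrix A with uniform shifts applied.

    Off-diagonal entries are lowered by `antagonism`; diagonal entries are
    lowered by `self_inhibition`. The input matrix is left unmodified.
    """
    n = len(A)
    return [
        [A[i][j] - (self_inhibition if i == j else antagonism) for j in range(n)]
        for i in range(n)
    ]


# Made-up 2x2 interaction matrix.
A = [[0.0, 0.5],
     [0.2, -1.0]]
shifted = shift_parameters(A, antagonism=0.1, self_inhibition=0.3)
```

Sweeping the two shift values over a grid and recomputing the predictions at each point is one way to check how sensitive the results are to interaction strength.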

Schematic of the modeling process.

In brief, we generate an interaction network of genome-scale models using pairwise joint flux balance analysis. To produce a prediction of engraftment for a given sample, we use the taxa present in …

Author response image 1
Author response image 2

Tables

Table 1
Area under the receiver operating characteristic curves for our method’s predictions of 22 samples from each of two time-points (TP) using six sets of parameters inferred from joint flux balance analysis (FBA) with six different sets of hyperparameters.

The first three sets of inferred parameters differ in the ‘resource allocation constraint (RAC)’ in joint FBA. We used values of 35 and 70 for this parameter, as well as using joint FBA without RAC. …

Parameter set               | Baseline TP (p-value) | Treatment TP (p-value)
EU average diet (RAC 35)    | 0.6161 (0.1020)       | 0.8482 (<0.001)
EU average diet (No RAC)    | 0.6161 (0.1020)       | 0.8571 (<0.001)
EU average diet (RAC 70)    | 0.6429 (0.0741)       | 0.8482 (<0.001)
EU average diet (C halved)  | 0.6071 (0.1107)       | 0.8393 (<0.001)
EU average diet (C doubled) | 0.6339 (0.0808)       | 0.8304 (0.0010)
Complete medium             | 0.6071 (0.1155)       | 0.7143 (0.0221)
Table 2
We experimented with simulated knock-outs for the top 5 taxa in average abundance in the data.

The ‘sample proportion’ column gives the proportion of samples in the data set that contain the organism that was knocked out. The ‘average score difference’ is the average effect of the knock-out …

Time-point   | Taxon                        | Sample proportion | Average score difference | AUC-ROC difference
Baseline TP  | Bifidobacterium adolescentis | 0.954545          | 0.010114                 | 0.017857
Baseline TP  | Uncultured Ruminococcus sp.  | 1.000000          | 0.012803                 | 0.026786
Baseline TP  | Uncultured Clostridium sp.   | 1.000000          | 0.006259                 | -0.008929
Baseline TP  | Eubacterium rectale          | 1.000000          | 0.006183                 | 0.017857
Baseline TP  | Faecalibacterium prausnitzii | 1.000000          | -0.002250                | 0.000000
Treatment TP | B. adolescentis              | 0.954545          | 0.015415                 | 0.000000
Treatment TP | Uncultured Ruminococcus sp.  | 1.000000          | 0.016707                 | 0.008929
Treatment TP | Uncultured Clostridium sp.   | 1.000000          | 0.014424                 | 0.017857
Treatment TP | E. rectale                   | 1.000000          | 0.011133                 | 0.026786
Treatment TP | F. prausnitzii               | 0.954545          | 0.013323                 | 0.035714
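The knock-out experiment summarized in Table 2 can be sketched as follows. This is a hedged illustration only: `score` stands in for whatever function maps a sample to an engraftment score, `toy_score` and the taxon names are hypothetical, and two choices are assumptions rather than facts from the source: the difference is taken as knocked-out score minus original score, and it is averaged over the samples that contain the taxon.

```python
def knockout_effect(samples, taxon, score):
    """Return (sample proportion, average score difference) for one knock-out.

    samples: list of dicts mapping taxon name -> relative abundance.
    taxon:   name of the taxon to remove.
    score:   hypothetical callable mapping a sample dict to a score.
    """
    containing = [s for s in samples if s.get(taxon, 0.0) > 0.0]
    proportion = len(containing) / len(samples)
    diffs = []
    for s in containing:
        knocked = {t: a for t, a in s.items() if t != taxon}
        diffs.append(score(knocked) - score(s))
    average = sum(diffs) / len(diffs) if diffs else 0.0
    return proportion, average


def toy_score(sample):
    """Made-up score: target abundance reduced by an inhibitor 'X'."""
    return sample.get("target", 0.0) - 0.5 * sample.get("X", 0.0)


samples = [{"target": 0.2, "X": 0.4}, {"target": 0.3}]
proportion, avg_diff = knockout_effect(samples, "X", toy_score)
```

In this toy data the inhibitor appears in half of the samples, and removing it raises the score, which is the kind of effect the "average score difference" column is meant to capture.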
Table 3
The average sensitivity of engraftment score across 8 parameters and the 22 baseline and 22 treatment samples, as well as the average (across samples) variance across the 8 edges.

The 8 edges were chosen because they were the 2 strongest positive edges, 2 strongest negative edges, the 2 strongest positive direct edges (i.e. with B. longum as a target) and the 2 strongest …

Quantity                                 | Baseline time-point | Treatment time-point
Variance across setups                   | 4.270608e-06        | 2.621430e-05
Average sensitivity (8 tested edges)     | 3.435107e+33        | 7.504735e+10
Variance of sensitivity (8 tested edges) | 6.250203e+68        | 1.626682e+23

Additional files

MDAR checklist
https://cdn.elifesciences.org/articles/83690/elife-83690-mdarchecklist1-v2.pdf
Supplementary file 1

Supplementary tables.

(Relative_Abundance_Table) Relative abundance of each taxon in each sample in the data set, produced by Bracken analysis of the original data.
(RefSeq_Genomes_Used) Genomes from the RefSeq database matched to taxa in the data, used to create models.
(Taxa_Names) Names of taxa in the data, matched with Taxa ID.
(Baseline_Sample_Coverage) Coverage (as a proportion of relative abundance) of models used in the analysis of Baseline time-point samples (i.e. taxa for which we could identify a high-quality, closely matching genome).
(Treatment_Sample_Coverage) Coverage (as a proportion of relative abundance) of models used in the analysis of Treatment time-point samples (i.e. taxa for which we could identify a high-quality, closely matching genome).
(EU_Average_Diet) Main media file used, from vmh.life.
(Probiotic_Cell_Counts) Cell counts of the B. longum probiotic, provided by Maldonado-Gomez et al.
(Paramater_Sensitivity) Summary of parameter sensitivity results (average and variance across samples).
(Baseline_Sample_Sensitivity) Parameter sensitivity in baseline time-point samples (2 strongest positive and 2 strongest negative edges, 2 strongest positive and 2 strongest negative edges with B. longum as target).
(Treatment_Sample_Sensitivity) Parameter sensitivity in treatment time-point samples (2 strongest positive and 2 strongest negative edges, 2 strongest positive and 2 strongest negative edges with B. longum as target).

https://cdn.elifesciences.org/articles/83690/elife-83690-supp1-v2.xlsx
