Metage2Metabo, microbiota-scale metabolic complementarity for the identification of key species

  1. Arnaud Belcour
  2. Clémence Frioux (corresponding author)
  3. Méziane Aite
  4. Anthony Bretaudeau
  5. Falk Hildebrand
  6. Anne Siegel
  1. Univ Rennes, Inria, CNRS, IRISA, France
  2. Inria Bordeaux Sud-Ouest, France
  3. Gut Microbes and Health, Quadram Institute, United Kingdom
  4. Digital Biology, Earlham Institute, United Kingdom
  5. Inria, UMR IGEPP, BioInformatics Platform for Agroecosystems Arthropods (BIPAA), France
  6. Inria, IRISA, GenOuest Core Facility, France
9 figures, 4 tables and 2 additional files

Figures

Figure 1
Overview of the Metage2Metabo (M2M) pipeline.

(a) Main steps of the M2M pipeline and associated tools. The software’s main pipeline (m2m workflow) takes as inputs a collection of annotated genomes that can be reference genomes or …

Figure 2 with 6 supplements
Power graph analysis of predicted microbial associations within communities for the human gut dataset.

Each category of metabolites predicted as newly producible in the gut was defined as a target set for community selection among the 1520 genome-scale metabolic networks (GSMNs) from the gut …

Figure 2—figure supplement 1
Sugar derivatives power graph.

Power graph associated with the minimal communities producing the sugar derivatives group of targets.

Figure 2—figure supplement 2
Lipid derivatives power graph.

Power graph associated with the minimal communities producing the lipid derivatives group of targets.

Figure 2—figure supplement 3
Amino acids and derivatives power graph.

Power graph associated with the minimal communities producing the amino acids and derivatives group of targets.

Figure 2—figure supplement 4
Aromatic compounds power graph.

Power graph associated with the minimal communities producing the aromatic compounds group of targets.

Figure 2—figure supplement 5
Carboxy-acids compounds power graph.

Power graph associated with the minimal communities producing the carboxy-acids group of targets.

Figure 2—figure supplement 6
Coenzyme A derivatives power graph.

Power graph associated with the minimal communities producing the coenzyme A derivatives group of targets.

Figure 3
Robustness analysis of Metage2Metabo (M2M) results on datasets of altered metagenome-assembled genomes (MAGs).

A proportion of genes were randomly removed from all or a random subset of the 913 rumen MAGs: 2% from all genomes (2pc100), 5% from 80% of the genomes (5pc80), 5% from all genomes (5pc100) and 10% …
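
The degradation scheme described in this caption (remove a given percentage of genes from a given share of the genomes) can be sketched as follows. This is a minimal illustration, not the authors' script: the function name `degrade_genomes`, the dictionary-of-gene-lists input, the rounding rule, and the fixed random seed are all assumptions.

```python
import random
from typing import Dict, List


def degrade_genomes(
    genomes: Dict[str, List[str]],
    gene_fraction: float,
    genome_fraction: float,
    seed: int = 0,
) -> Dict[str, List[str]]:
    """Randomly drop `gene_fraction` of the genes in `genome_fraction` of the genomes.

    Hypothetical helper: input format and rounding are illustrative assumptions.
    """
    rng = random.Random(seed)
    names = sorted(genomes)
    # Pick which genomes are degraded.
    n_affected = round(genome_fraction * len(names))
    affected = set(rng.sample(names, n_affected))

    degraded: Dict[str, List[str]] = {}
    for name in names:
        genes = genomes[name]
        if name in affected:
            # Pick which gene positions are removed from this genome.
            n_remove = round(gene_fraction * len(genes))
            removed = set(rng.sample(range(len(genes)), n_remove))
            degraded[name] = [g for i, g in enumerate(genes) if i not in removed]
        else:
            degraded[name] = list(genes)
    return degraded


# Toy input: 10 genomes with 100 genes each (made-up identifiers).
toy_genomes = {f"mag_{i}": [f"gene_{i}_{j}" for j in range(100)] for i in range(10)}

# "5pc80"-style scenario: 5% of the genes removed from 80% of the genomes.
toy_5pc80 = degrade_genomes(toy_genomes, gene_fraction=0.05, genome_fraction=0.80)
remaining = sum(len(genes) for genes in toy_5pc80.values())
print(remaining)  # 8 genomes lose 5 genes each: 1000 - 40 = 960
```

With this toy input, the overall gene loss is 4%, which is the product of the two fractions.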

Figure 4
Effect of the disease status on the metabolism of communities in MHD samples.

M2M was run on collections of genome-scale metabolic networks (GSMNs) associated with metagenome-assembled genomes (MAGs) identified in metagenomic samples from a cohort of healthy and diabetic …

Appendix 1—figure 1
Characteristics of the metabolic networks built for the gut and the rumen datasets.

(a) Distribution of the number of metabolic compounds in genome-scale metabolic networks (GSMNs) reconstructed for the gut dataset (purple) and the rumen dataset (green). (b) Distribution of the …

Appendix 2—figure 1
Shannon diversity index, richness, and metabolic distance of the samples.

(a) Histogram depicting the Shannon diversity index of the samples. (b) Histogram depicting the richness of the samples. (c and d) Principal coordinate analysis (PCoA) of the Bray-Curtis distance …
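
For readers unfamiliar with the two per-sample statistics named in panels (a) and (b), the sketch below computes them from a vector of abundances. It is illustrative only: the caption does not state the logarithm base or how abundances were obtained, so the natural logarithm and the toy counts are assumptions.

```python
import math
from typing import Sequence


def richness(abundances: Sequence[float]) -> int:
    """Number of taxa with a non-zero abundance in the sample."""
    return sum(1 for a in abundances if a > 0)


def shannon_index(abundances: Sequence[float]) -> float:
    """Shannon diversity H = -sum(p_i * ln(p_i)) over taxa present in the sample."""
    total = sum(abundances)
    proportions = [a / total for a in abundances if a > 0]
    return -sum(p * math.log(p) for p in proportions)


# Toy sample with made-up counts for five taxa, one of them absent.
toy_sample = [40, 30, 20, 10, 0]
print(richness(toy_sample))
print(shannon_index(toy_sample))
```

A perfectly even sample of n taxa gives H = ln(n), the maximum for that richness, while a sample dominated by one taxon gives a value close to 0.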

Appendix 2—figure 2
Taxonomic diversity of the genomes used for genome-scale metabolic network (GSMN) reconstruction using OTU mapping (at species level) to curated metabolic models (a), or reconstructed metagenomic species (MGS) (b).

Phyla composition of the genomes, and number of distinct representatives for each phylum.

Appendix 2—figure 3
Impact of the cohort when studying the metabolisms of individuals from the metagenomic dataset.

Panels (a) to (d) focus on the community scope, that is, the set of metabolites reachable by the community associated with a sample. Panel (d) shows the representation of a multidimensional scaling …

Appendix 2—figure 4
Impact of the status when studying the metabolisms of individuals from the MHD metagenomic dataset.

Panel (a) is the receiver operating characteristic (ROC) curve of the classification experiment aiming at determining the disease status for the MHD cohort (control n = 49, Type-1 Diabetes n = 31 or Type-2 …

Tables

Table 1
List and description of Metage2Metabo (M2M) commands.
| Command | Action |
| --- | --- |
| m2m workflow | Runs the whole m2m workflow |
| m2m metacom | Runs the workflow with already-reconstructed metabolic networks |
| m2m recon | Reconstructs metabolic networks using Pathway Tools |
| m2m iscope | Computes scopes for individual metabolic networks |
| m2m cscope | Computes the community scope |
| m2m addedvalue | Computes the cooperation potential |
| m2m mincom | Selects a minimal community and computes key species |
| m2m seeds | Creates an SBML file for nutrients |
| m2m test | Runs m2m workflow on a sample dataset |
| m2m_analysis | Runs additional analyses on community selection |
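
The quantities behind the iscope, cscope and addedvalue commands can be illustrated with a toy network-expansion sketch. This is not M2M's implementation: the reaction format, the organism and metabolite names, and the pooling of all reactions into a single community network are assumptions made for illustration.

```python
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple

# A reaction is (reactants, products); identifiers are toy names, not real metabolites.
Reaction = Tuple[FrozenSet[str], FrozenSet[str]]


def scope(reactions: Iterable[Reaction], seeds: Set[str]) -> Set[str]:
    """Network expansion: activate every reaction whose reactants are all available
    (seeds or previously produced), and repeat until no new metabolite appears.
    Returns the metabolites produced by at least one activated reaction."""
    reaction_list = list(reactions)
    produced: Set[str] = set()
    changed = True
    while changed:
        changed = False
        available = seeds | produced
        for reactants, products in reaction_list:
            if reactants <= available and not products <= produced:
                produced |= products
                changed = True
    return produced


def rxn(reactants: str, products: str) -> Reaction:
    """Build a reaction from space-separated metabolite names."""
    return frozenset(reactants.split()), frozenset(products.split())


# Two toy organisms: org_b needs A, which only org_a can make from the seed S.
networks: Dict[str, List[Reaction]] = {
    "org_a": [rxn("S", "A"), rxn("B", "C")],
    "org_b": [rxn("A", "B")],
}
seeds = {"S"}

# Individual scopes: each network expanded on its own.
individual = {name: scope(reactions, seeds) for name, reactions in networks.items()}

# Community scope: all reactions pooled (assumption: metabolites are freely shared).
pooled = [r for reactions in networks.values() for r in reactions]
community = scope(pooled, seeds)

# Cooperation potential: metabolites reachable only through cooperation.
added_value = community - set().union(*individual.values())
print(sorted(added_value))
```

In this toy case org_a alone reaches only A and org_b alone reaches nothing, whereas the pooled network reaches A, B and C, so B and C make up the cooperation potential.
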
Table 2
Results of the genome-scale metabolic network (GSMN) reconstruction step and metabolic potential analysis for the three datasets presented in the article (Avg = Average, '±' precedes standard deviation).
| | Gut dataset | Rumen dataset | Diabetes dataset |
| --- | --- | --- | --- |
| Initial data | Draft reference genomes | MAGs | MAGs |
| Number of genomes | 1520 | 913 | 778 |
| GSMN reconstruction | | | |
| All reactions | 3932 | 4418 | 5554 |
| All metabolites | 4001 | 4466 | 5386 |
| Avg reactions per GSMN | 1144 (±255) | 1155 (±199) | 1640 (±368) |
| Avg metabolites per GSMN | 1366 (±262) | 1422 (±212) | 1925 (±361) |
| Avg genes per mn | 596 (±150) | 543 (±107) | 1658 (±469) |
| % reactions associated to genes | 74.6 (±2.17) | 73.8 (±2.61) | 79.57 (±1.60) |
| Avg pathways per mn | 163 (±49) | 146 (±32) | 220 (±58) |
| Metabolic potential | | | |
| Number of seeds | 93 | 26 | 175 |
| Avg scope per mn | 286 (±70) | 101 (±44) | 508 (±83) |
| Union of individual scopes | 828 | 368 | 1326 |
Table 3
Community reduction analysis of the target categories in the gut.

All minimal communities were enumerated, starting from the set of 1520 genome-scale metabolic networks (GSMNs). KS: key species, ES: essential symbionts, AS: alternative symbionts, Firm.: …

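| Target category | Minimal community size | Minimal communities | Group | Firm. | Bact. | Acti. | Prot. | Fuso. | Total |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Aminoacids and derivatives (5 targets) | 4 bact. per community | 120,329 communities | KS | 142 | 52 | 0 | 27 | 6 | 227 |
| | | | ES | 0 | 0 | 0 | 0 | 0 | 0 |
| | | | AS | 142 | 52 | 0 | 27 | 6 | 227 |
| Aromatic compounds (11 targets) | 5 bact. per community | 950 communities | KS | 52 | 0 | 0 | 20 | 0 | 72 |
| | | | ES | 2 | 0 | 0 | 1 | 0 | 3 |
| | | | AS | 50 | 0 | 0 | 19 | 0 | 69 |
| Carboxyacids (14 targets) | 9 bact. per community | 48,412 communities | KS | 16 | 13 | 0 | 28 | 2 | 59 |
| | | | ES | 2 | 0 | 0 | 2 | 0 | 4 |
| | | | AS | 14 | 13 | 0 | 26 | 2 | 55 |
| CoA derivatives (10 targets) | 5 bact. per community | 95,256 communities | KS | 106 | 0 | 50 | 17 | 1 | 174 |
| | | | ES | 0 | 0 | 0 | 0 | 1 | 1 |
| | | | AS | 106 | 0 | 50 | 17 | 0 | 173 |
| Lipids (28 targets) | 7 bact. per community | 58,520 communities | KS | 3 | 140 | 22 | 20 | 0 | 185 |
| | | | ES | 3 | 0 | 0 | 1 | 0 | 4 |
| | | | AS | 0 | 140 | 22 | 19 | 0 | 181 |
| Sugar derivatives (58 targets) | 11 bact. per community | 7,860,528 communities | KS | 11 | 30 | 78 | 23 | 0 | 142 |
| | | | ES | 5 | 0 | 0 | 0 | 0 | 5 |
| | | | AS | 6 | 30 | 78 | 23 | 0 | 137 |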
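
The relationship between the three groups reported in Table 3 can be sketched with plain set operations on an enumeration of minimal communities. The definitions used here are assumptions, since the caption above is truncated: key species are taken as every species occurring in at least one minimal community, essential symbionts as those occurring in all of them, and alternative symbionts as the remainder. The function name and species identifiers are hypothetical.

```python
from typing import Iterable, List, Set, Tuple


def classify_species(
    minimal_communities: Iterable[Iterable[str]],
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return (key species, essential symbionts, alternative symbionts).

    Assumed definitions: KS = union of all minimal communities,
    ES = intersection of all minimal communities, AS = KS - ES.
    """
    communities: List[Set[str]] = [set(c) for c in minimal_communities]
    if not communities:
        raise ValueError("at least one minimal community is required")
    key_species = set().union(*communities)
    essential = set(communities[0]).intersection(*communities[1:])
    alternative = key_species - essential
    return key_species, essential, alternative


# Toy enumeration: three equivalent minimal communities of size 2 (made-up names).
toy_communities = [
    {"sp1", "sp2"},
    {"sp1", "sp3"},
    {"sp1", "sp4"},
]
ks, es, alt = classify_species(toy_communities)
print(sorted(ks), sorted(es), sorted(alt))
```

Under these definitions the groups partition the key species, which agrees with the totals in Table 3: for aromatic compounds, for example, 72 key species split into 3 essential and 69 alternative symbionts.
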
Appendix 1—table 1
Effect of MAG degradation on genome-scale metabolic network (GSMN) reconstructions.

Numbers are averages. '±' precedes standard deviation values. 'original': initial MAGs prior to degradation, '2pc100': 2% gene removal in all MAGs, '5pc80': 5% gene removal in 80% of MAGs, '5pc100': 5% …

| | Original | 2pc100 | 5pc80 | 5pc100 | 10pc70 |
| --- | --- | --- | --- | --- | --- |
| Genes in MAGs | 2100 (±501) | 2058 (±491) | 2016 (±484) | 1994 (±478) | 1954 (±480) |
| Reactions in GSMNs | 1155 (±199) | 1131 (±192) | 1116 (±192) | 1108 (±190) | 1094 (±192) |
| Metabolites in GSMNs | 1422 (±212) | 1402 (±207) | 1388 (±208) | 1381 (±206) | 1366 (±208) |
| Genes in GSMNs | 543 (±108) | 532 (±106) | 521 (±105) | 515 (±103) | 505 (±105) |
| % reactions with genes | 73.84% | 74.05% | 73.82% | 73.72% | 73.61% |
| Gene loss in MAGs | | 1.98% | 4.01% | 5.03% | 6.94% |
| Reaction loss in GSMNs | | 1.96% | 3.30% | 3.89% | 5.17% |
| Metabolite loss in GSMNs | | 1.37% | 2.41% | 2.91% | 3.92% |
| Gene loss in GSMNs | | 2.09% | 4.17% | 5.11% | 7.02% |
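
As a rough cross-check of the loss rows, the percentages can be recomputed from the rounded averages in the "Genes in MAGs" row. This is a sketch only: the table's own percentages were presumably computed on unrounded data (an assumption), so small discrepancies are expected. The 10pc70 scenario is left out of the design-based expectation because its definition is truncated above.

```python
def percent_loss(original: float, degraded: float) -> float:
    """Relative loss, in percent, of a degraded value with respect to the original."""
    return 100.0 * (original - degraded) / original


# Rounded averages copied from the "Genes in MAGs" row.
avg_genes_in_mags = {
    "Original": 2100, "2pc100": 2058, "5pc80": 2016, "5pc100": 1994, "10pc70": 1954,
}
# Percentages copied from the "Gene loss in MAGs" row.
reported_loss = {"2pc100": 1.98, "5pc80": 4.01, "5pc100": 5.03, "10pc70": 6.94}

recomputed_loss = {
    name: percent_loss(avg_genes_in_mags["Original"], count)
    for name, count in avg_genes_in_mags.items()
    if name != "Original"
}

# Loss expected from the design: (% of genes removed) x (% of MAGs affected) / 100.
design = {"2pc100": (2, 100), "5pc80": (5, 80), "5pc100": (5, 100)}
expected_loss = {name: pct * share / 100 for name, (pct, share) in design.items()}

for name, value in recomputed_loss.items():
    print(f"{name}: recomputed {value:.2f}%, reported {reported_loss[name]:.2f}%")
```

The recomputed values stay within a few hundredths of a percentage point of the reported ones, and the three fully defined scenarios land close to their design-based expectations of 2%, 4% and 5%.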
