A Chemical Reaction Similarity-Based Prediction Algorithm Identifies the Multiple Taxa Required to Catalyze an Entire Metabolic Pathway of Dietary Flavonoids

  1. Tufts University
  2. Texas A&M University


  • Reviewing Editor
    Axel Brakhage
    Hans Knöll Institute, Jena, 07743, Germany
  • Senior Editor
    Wendy Garrett
    Harvard T.H. Chan School of Public Health, Boston, United States of America

Reviewer #1 (Public Review):


Flavonoids are abundant in plant-based foods and have been widely recognized for their health-promoting properties. There is increasing evidence that the effects of dietary flavonoids depend on their metabolism by gut bacteria, which can enhance, reduce or otherwise alter the flavonoids' bioactivities. However, little is known regarding the enzymes and species that can utilize flavonoids as metabolic substrates.

In the current manuscript, the authors examined whether the degradation of dietary flavonoids by gut bacteria can be predicted. In contrast to plants, gut bacteria do not contain obvious flavonoid-degrading enzymes.


To predict such activities of enzymes with broad substrate specificity (enzyme promiscuity), the authors optimized and modified a bioinformatic tool that predicts whether a gut bacterial enzyme could catalyze a flavonoid reaction, based on the chemical similarity between the enzyme's native reaction and known flavonoid reactions in plants.
They predicted such enzyme activities in the genomes of bacteria that had been shown to occur in the human gut. They then cultivated selected bacteria with the predicted enzymatic activities and showed that these bacteria can indeed degrade parts of these flavonoids. Combining the bioinformatic predictions with mass spectrometry, they identified a metabolic pathway for the flavonoid tilianin that spans multiple species, i.e., Bifidobacterium longum subsp. animalis, Blautia coccoides, and Flavonifractor plautii. Lastly, the authors showed that tilianin metabolites protect against H2O2 through reactive oxygen species scavenging activity and thus improve the viability of a neuronal cell line, whereas the parent compound, tilianin, was ineffective. This protective effect might reflect gut microbiota-dependent physiological effects of dietary flavonoids.
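The matching idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each reaction is annotated with a set of reaction-class labels and scores an enzyme against a flavonoid reaction by the Jaccard overlap of those sets. The function names, the threshold, and all identifiers in the toy data are hypothetical and are not taken from KEGG or the manuscript.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two label sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def predict_promiscuous(
    enzyme_classes: Dict[str, Set[str]],
    flavonoid_reactions: Dict[str, Set[str]],
    threshold: float = 0.5,
) -> List[Tuple[str, str, float]]:
    """Return (enzyme, flavonoid reaction, score) triples with score >= threshold.

    The scoring rule is an assumption made for illustration only.
    """
    hits = []
    for enzyme, native in enzyme_classes.items():
        for reaction, classes in flavonoid_reactions.items():
            score = jaccard(native, classes)
            if score >= threshold:
                hits.append((enzyme, reaction, score))
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(hits, key=lambda h: (-h[2], h[0], h[1]))


# Hypothetical toy data: reaction-class labels of each enzyme's native reaction.
ENZYME_CLASSES = {
    "enzyme_A": {"class_1"},
    "enzyme_B": {"class_2", "class_3"},
    "enzyme_C": {"class_4"},
}

# Hypothetical toy data: reaction-class labels of known flavonoid reactions.
FLAVONOID_REACTIONS = {
    "flavonoid_reaction_1": {"class_1"},
    "flavonoid_reaction_2": {"class_2"},
}
```

With this toy data, calling `predict_promiscuous(ENZYME_CLASSES, FLAVONOID_REACTIONS)` pairs enzyme_A with flavonoid_reaction_1 and enzyme_B with flavonoid_reaction_2, while enzyme_C shares no label with either reaction and is not reported. Such a match is only a candidate; it does not show that the enzyme actually catalyzes the reaction.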


  1. To confirm the bioinformatic predictions, the authors used in vitro culture and LC-MS experiments. Although these in vitro experiments clearly add value to the bioinformatic predictions, they fall short of providing firm evidence because they do not show whether the predicted enzymes really catalyze the predicted reactions. In theory, other enzymes not identified bioinformatically could catalyze the reactions.

  2. It is not clear how the authors selected the bacterial species. Did they analyze metagenome sequences or hundreds of genomes of gut bacteria? Did they analyze bacteria isolated from the gut or rather type strains? What about other bacterial species in the gut? Do they also encode relevant enzymes? If so, how many do? This needs to be clarified.

  3. The reported data on E. coli are difficult to understand. Does E. coli have a different degradation pathway that leads to the observed disappearance of tilianin?

Reviewer #2 (Public Review):

The manuscript deals with an interesting topic in metabolism: the so-called underground metabolism enabled by enzymes with broad substrate specificity. This is mainly relevant in secondary metabolism. The authors deal, in particular, with the conversion of flavonoids, which have health-promoting effects. They present an algorithm for predicting the moonlighting activities of enzymes, which must be supplied as input. Moreover, the authors performed experiments on the antioxidant activities of the flavonoids under study.

My focus was on the bioinformatics part. Overall, the bioinformatics part is not a major scientific achievement in my eyes, or it is too poorly described to see its merits. Readers may have difficulty understanding the presented algorithm.


The prediction algorithm should be explained much better. Although the manuscript is quite long, it does not describe the approaches sufficiently well. It is quite hard to read.

As far as I can see, the method was only tested with a small sample of different flavonoid substances.

Major comments
(1) I see the following contradiction. Line 18/19: "As flavonoids are not natural substrates of gut bacterial enzymes" and lines 76/77: "commensal gut microorganisms do not have specialized enzymes that utilize flavonoids as their native substrates" versus lines 72-74: "flavonoids ..., which makes them available to be metabolized". How can they be metabolized given what is said in the first two phrases?
(2) It should be explained better what is meant by "reaction class" (e.g. in lines 97 and 99). Is this the same as the EC number (in the Enzyme Catalogue)? The term "reaction class" is indeed used in the KEGG database. On the webpage it seems indeed as if the terms "reaction class" and "EC number" are somehow equivalent. However, the term "RClass RC00392" in line 557 of the manuscript points to a difference in meaning.
(3) The prediction algorithm should be explained much better. For example, in the Figure showing the workflow, it is shown that an EC number should be given as input. However, if we search for enzymes which could potentially degrade a given flavonoid, we may not know any suitable EC number. Line 122: "To match a given enzyme with its non-native polyphenolic substrates..." However, where can we take the enzyme name/EC number from? Moreover, given that it is assumed that the reaction is performed by underground metabolism, should the enzyme given as input come from another organism, for example, a plant?
(4) Lines 521-523: "Our prediction tool can take either a single enzyme in the form of Enzyme Commission (EC) number (e.g. "ec:"), or a KEGG organism-identifier (e.g. "cpv") or a consortium, a list of different organism-identifiers, as input." I do not understand the wording "or a consortium". According to the Figure showing the workflow, it should read "and a consortium".
(5) In the Materials and Methods section, the KEGG PATHWAY database is mentioned. This comes somewhat out of the blue. What is the connection to the "reaction class" concept in KEGG? Or is the PATHWAY database only used for extracting the negative controls?
(6) Lines 142-143: "Our analysis shows that RClass-based similarity can predict the correct reactions for known flavonoid-metabolizing enzymes". How do the authors know that the results are correct? If this is easy to check, then I assume that testing whether a given enzyme can catalyze reactions with flavonoids can be done manually in KEGG, so that a computer algorithm is unnecessary.
(7) Elaborating on the previous point - I have the impression that the algorithm is a rather simple search routine for finding reactions in the KEGG database that match certain criteria. This might be a helpful tool to save time in comparison to doing the search manually. However, at least the bioinformatics part of the paper is not a major scientific achievement as far as I can see.
(8) It is not sufficiently clear whether the prediction algorithm only works for the example shown in the top figure (tilianin, acacetin etc), which would be quite a restricted application, or for many or even all flavonoids. In line 565, the authors say: "our tabulated 312 unique flavonoids", while in the upper part of the MS, e.g. in lines 26 and 109, only the pathway starting from tilianin is mentioned.
(9) In which programming language was the algorithm implemented?
(10) The connection between the theoretical and experimental parts of the paper is not fully clear. Some of the experiments serve to test the predictions, which is fine. The experiments on free radicals, however, seem to be somewhat unrelated.
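To make the ambiguity raised in comment (4) concrete, the sketch below shows one possible literal reading of the quoted sentence, in which the three input forms are mutually exclusive alternatives. It is an assumption made for illustration, not the authors' code: the function name `classify_input` and the dispatch rule (an "ec:" prefix for enzymes, a list for a consortium) are hypothetical.

```python
from typing import List, Union


def classify_input(query: Union[str, List[str]]) -> str:
    """Classify a query as 'enzyme', 'organism', or 'consortium'.

    Hypothetical dispatch rule: a list is a consortium of organism
    identifiers, a string starting with 'ec:' is an enzyme, and any other
    non-empty string is a single organism identifier.
    """
    if isinstance(query, list):
        if not query or not all(isinstance(q, str) and q for q in query):
            raise ValueError("a consortium must be a non-empty list of identifiers")
        return "consortium"
    if not isinstance(query, str) or not query:
        raise ValueError("query must be a non-empty string or a list of strings")
    if query.lower().startswith("ec:"):
        return "enzyme"
    return "organism"
```

Under this reading, `classify_input("cpv")` is treated as a single organism and a list of identifiers as a consortium, so an EC number and a consortium could never be supplied together. If the workflow figure is the correct description, the tool would instead need two arguments, an enzyme and a consortium, which is precisely what the manuscript should clarify.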
