Science Forum: Author-sourced capture of pathway knowledge in computable form using Biofactoid

  1. Jeffrey V Wong (corresponding author)
  2. Max Franz
  3. Metin Can Siper
  4. Dylan Fong
  5. Funda Durupinar
  6. Christian Dallago
  7. Augustin Luna
  8. John Giorgi
  9. Igor Rodchenkov
  10. Özgün Babur
  11. John A Bachman
  12. Benjamin M Gyori
  13. Emek Demir (corresponding author)
  14. Gary D Bader (corresponding author)
  15. Chris Sander (corresponding author)
  1. The Donnelly Centre, University of Toronto, Canada
  2. Computational Biology Program, Oregon Health and Science University, United States
  3. Computer Science Department, University of Massachusetts Boston, United States
  4. Department of Cell Biology, Harvard Medical School, United States
  5. Department of Systems Biology, Harvard Medical School, United States
  6. Department of Informatics, Technische Universität München, Germany
  7. Department of Data Sciences, Dana-Farber Cancer Institute, United States
  8. Broad Institute, Massachusetts Institute of Technology, Harvard University, United States
  9. Laboratory of Systems Pharmacology, Harvard Medical School, United States
  10. Department of Computer Science, Department of Molecular Genetics, University of Toronto, Canada
  11. The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Canada
  12. Princess Margaret Cancer Centre, University Health Network, Canada
3 figures, 2 tables and 1 additional file


The Biofactoid curation tool.

Curation in Biofactoid involves drawing a network of relationships between genes or chemicals. (A) Genes and chemicals are represented by circles (nodes, highlighted in blue), for which users provide a label, the type of gene product, and the organism. A search engine matches the label to a corresponding record from a database of genes or chemicals. (B) Relationships are represented by connecting genes, chemicals and/or rectangular complexes (shown in grey) with plain lines (when neither activation nor repression occurs), arrows (to indicate activation), or ‘T-bars’ (to represent repression). Users select the mechanism that best describes the interaction. Complexes are represented as genes and/or chemicals enclosed by a box (e.g. the grey box labelled "Activated ras").
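The representation described in this caption can be sketched as a small data model. This is a minimal illustration, not Biofactoid's actual schema: the class names, field names, the example entities and the mechanism string are all assumptions made for the example; only the three edge styles and the "Activated ras" label come from the caption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple, Union


class Sign(Enum):
    """Effect of an interaction; the value is the edge style named in the caption."""
    UNSIGNED = "plain line"    # neither activation nor repression
    ACTIVATION = "arrow"
    REPRESSION = "T-bar"


@dataclass(frozen=True)
class Entity:
    """A node: a gene product or a chemical (hypothetical field names)."""
    label: str                       # text typed by the author
    kind: str                        # e.g. "protein" or "chemical"
    organism: Optional[str] = None   # left empty for chemicals


@dataclass(frozen=True)
class Complex:
    """A box enclosing genes and/or chemicals."""
    label: str
    members: Tuple[Entity, ...]


@dataclass(frozen=True)
class Interaction:
    """An edge between two participants, with a sign and a chosen mechanism."""
    source: Union[Entity, Complex]
    target: Union[Entity, Complex]
    sign: Sign
    mechanism: str


def edge_style(interaction: Interaction) -> str:
    """Return how the interaction would be drawn."""
    return interaction.sign.value


# Illustrative entries only; they are not taken from the figure.
ras = Entity("RAS", "protein", "Homo sapiens")
gtp = Entity("GTP", "chemical")
raf = Entity("RAF1", "protein", "Homo sapiens")
activated_ras = Complex("Activated ras", (ras, gtp))
example = Interaction(activated_ras, raf, Sign.ACTIVATION, "binding")

print(edge_style(example))
```

Keeping the sign separate from the mechanism mirrors the caption, where the edge style encodes activation or repression and the mechanism is selected independently.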

Biofactoid data is connected to information sources and establishes a bridge between related pathways.

(A) The Biofactoid Explorer is an interactive web app that publicly presents each author-curated entry alongside its article. Yellow arrows indicate how curated information is connected to outside knowledge bases. A “Network overview” (left) displays information about the article and pathway as a whole; a “Network item view” (right) displays information for a selected item (e.g. interaction, protein). (B) Biofactoid data establishes a bridge between related pathways described by structured biological knowledge from distinct data sources. Two author-curated interactions submitted to Biofactoid (red edges) bridge previously distinct pathways from the Reactome Pathway Database involved in mitochondrial biogenesis (left) and provide a new, more direct regulatory route between two mitochondrial genes (right). Pathway and interaction information was provided by Pathway Commons, a web resource that provides a single point of access for multiple public interaction and pathway databases. Details regarding the generation of these networks can be found in the section “Visualization of network data across sources” in Materials and methods.
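The bridging effect in panel (B) can be illustrated by counting connected components before and after adding author-curated edges. This is a sketch under stated assumptions: the node names (A1, B1, and so on) are placeholders rather than genes from the figure, and the function is a generic graph traversal, not the method used to generate the published networks.

```python
def connected_components(nodes, edges):
    """Group nodes into connected components, treating edges as undirected."""
    adjacency = {node: set() for node in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    components = []
    for start in nodes:
        if start in seen:
            continue
        # Depth-first traversal from an unvisited node.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components


# Placeholder pathways standing in for two database pathways.
pathway_a = [("A1", "A2"), ("A2", "A3")]
pathway_b = [("B1", "B2")]
nodes = ["A1", "A2", "A3", "B1", "B2"]

# A hypothetical author-curated interaction linking the two pathways.
author_curated = [("A3", "B1")]

before = connected_components(nodes, pathway_a + pathway_b)
after = connected_components(nodes, pathway_a + pathway_b + author_curated)
print(len(before), len(after))  # 2 1
```

A drop in the number of components after adding the curated edge is one simple way to detect that an entry connects pathways that were previously separate.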

Biofactoid pilot study.

A three-phase pilot tested the feasibility of Biofactoid and involved journal editors and authors of research articles. Phases I and II involved editors and authors whose articles were recently published. In Phase III, 2,065 published articles were screened and authors of suitable articles were invited to Biofactoid. The articles screened were from 16 journals (Table 1).


Table 1
Prevalence of articles with pathway knowledge suitable for Biofactoid.
ISSN | Journal* | Coverage² [Vol. (Issue)] | Articles screened | Hits | % Hits
2211–1247 | Cell Reports | 30(1) - 32(11) | 953 | 109 | 10.3
1097–4164 | Molecular Cell | 73(1) - 79(6) | 725 | 85 | 10.5
1549–5477 | Genes & Development | 34(1-2) - 34(17-18) | 93 | 15 | 13.9
1476–4679 | Nature Cell Biology | 22(4) - 22(9) | 84 | 10 | 10.6
1083–351X | Journal of Biological Chemistry | 295(31) - 295(37) | 210 | 21 | 9.1
Weighted Average | - | - | - | - | 10.4
* Only journals in which at least 80 ‘hits’ were identified were included. A ‘hit’ is an article that provides direct evidence for a molecular interaction that can be captured by Biofactoid. The ability of Biofactoid to capture an interaction depends upon the type of bioentities, the relationship types and organisms described in the article. In total, articles from 16 journals were screened: EMBO; Molecular and Cellular Biology; Cell; Cancer Cell; iScience; J Biol Chem; Cell Metabolism; Science; Nature Genetics; Science Signaling; Science Advances; Immunity; Cell Reports; Molecular Cell; Genes & Development; Nature Cell Biology.
² Coverage indicates the span of journal issues that were included. Only primary research articles from each issue were screened.
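
The "Weighted Average" row of Table 1 can be reproduced from the table's own columns. The sketch below assumes that each journal's "% Hits" value is weighted by its number of articles screened; the table does not state the weighting explicitly, so this is an interpretation that happens to match the reported figure.

```python
# (journal, articles screened, % hits) as listed in Table 1.
rows = [
    ("Cell Reports", 953, 10.3),
    ("Molecular Cell", 725, 10.5),
    ("Genes & Development", 93, 13.9),
    ("Nature Cell Biology", 84, 10.6),
    ("Journal of Biological Chemistry", 210, 9.1),
]


def weighted_mean(values, weights):
    """Mean of `values`, each weighted by the matching entry in `weights`."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


total_articles = sum(articles for _, articles, _ in rows)
average = weighted_mean(
    [percent for _, _, percent in rows],
    [articles for _, articles, _ in rows],
)

print(total_articles)      # 2065
print(round(average, 1))   # 10.4
```

The article counts in the five rows sum to 2,065, the number of articles screened in Phase III of the pilot.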

Table 2
Comparison of non-centrally curated biocuration projects.

Comparison of projects that support community curation of pathway and interaction knowledge as their primary concern.

Project | Scope | Source | Integrated curation tool / Automatic entity recognition / Single-article oriented | Ref.
Biofactoid | Pathway | Author | | This study.
Structured Digital Abstract (FEBS Letters) | Protein-Protein | Author | -- | Ceol et al., 2008; Leitner et al., 2010; Gerstein et al., 2007
WikiPathways | Pathway | Anyone | --- | Slenter et al., 2018
SourceData | Figure | Author | - | Liechti et al., 2017


Jeffrey V Wong, Max Franz, Metin Can Siper, Dylan Fong, Funda Durupinar, Christian Dallago, Augustin Luna, John Giorgi, Igor Rodchenkov, Özgün Babur, John A Bachman, Benjamin M Gyori, Emek Demir, Gary D Bader, Chris Sander
Science Forum: Author-sourced capture of pathway knowledge in computable form using Biofactoid
eLife 10:e68292.