Abstract

The COVID-19 pandemic demands assimilation of all biomedical knowledge to decode mechanisms of pathogenesis. Despite the recent renaissance in neural networks, a platform for the real-time synthesis of the exponentially growing biomedical literature and deep omics insights is unavailable. Here, we present the nferX platform for dynamic inference from 45 quadrillion+ possible conceptual associations from unstructured text and triangulation with insights from single-cell RNA-sequencing, bulk RNA-seq and proteomics from diverse tissue types. A hypothesis-free profiling of ACE2 suggests tongue keratinocytes, olfactory epithelial cells, airway club cells and respiratory ciliated cells as potential reservoirs of the SARS-CoV-2 receptor. We find the gut as the putative hotspot of COVID-19, where a maturation-correlated transcriptional signature is shared in small intestine enterocytes among coronavirus receptors (ACE2, DPP4, ANPEP). A holistic data science platform triangulating insights from structured and unstructured data holds potential for accelerating the generation of impactful biological insights and hypotheses.

Data availability

All data used in this manuscript were obtained from published and freely available sources online. A complete list of these can be found in Supplementary File 1.

Article and author information

Author details

  1. AJ Venkatakrishnan

    R&D, nference, Cambridge, United States
    Competing interests
    AJ Venkatakrishnan is affiliated to nference. The author has financial interests in nference.
  2. Arjun Puranik

    Data Science, nference, San Francisco, United States
    Competing interests
    Arjun Puranik is affiliated to nference. The author has financial interests in nference.
  3. Akash Anand

    Data Science, nference, Bangalore, India
    Competing interests
    Akash Anand is affiliated to nference. The author has financial interests in nference.
  4. David Zemmour

    R&D, nference, Cambridge, United States
    Competing interests
    David Zemmour is affiliated to nference. The author has no financial interests to declare.
  5. Xiang Yao

    R&D Data Sciences, Janssen, San Diego, United States
    Competing interests
    Xiang Yao is affiliated to Janssen. The author has no financial interests to declare.
  6. Xiaoying Wu

    R&D Data Sciences, Janssen, Spring House, United States
    Competing interests
    Xiaoying Wu is affiliated to Janssen. The author has no financial interests to declare.
  7. Ramakrishna Chilaka

    Engineering, nference, Bangalore, India
    Competing interests
    Ramakrishna Chilaka is affiliated to nference. The author has financial interests in nference.
  8. Dariusz K Murakowski

    R&D, nference, Cambridge, United States
    Competing interests
    Dariusz K Murakowski is affiliated to nference. The author has financial interests in nference.
    ORCID: 0000-0002-9920-4980
  9. Kristopher Standish

    R&D Data Sciences, Janssen, San Diego, United States
    Competing interests
    Kristopher Standish is affiliated to Janssen. The author has no financial interests to declare.
  10. Bharathwaj Raghunathan

    Data Sciences, nference, Toronto, Canada
    Competing interests
    Bharathwaj Raghunathan is affiliated to nference. The author has financial interests in nference.
  11. Tyler Wagner

    R&D, nference, Cambridge, United States
    Competing interests
    Tyler Wagner is affiliated to nference. The author has financial interests in nference.
  12. Enrique Garcia-Rivera

    R&D, nference, Cambridge, United States
    Competing interests
    Enrique Garcia-Rivera is affiliated to nference. The author has financial interests in nference.
  13. Hugo Solomon

    R&D, nference, Cambridge, United States
    Competing interests
    Hugo Solomon is affiliated to nference. The author has financial interests to declare.
  14. Abhinav Garg

    Engineering, nference, Bangalore, India
    Competing interests
    Abhinav Garg is affiliated to nference. The author has financial interests in nference.
  15. Rakesh Barve

    Data Sciences, nference, Bangalore, India
    Competing interests
    Rakesh Barve is affiliated to nference. The author has financial interests in nference.
  16. Anuli Anyanwu-Ofili

    R&D Strategy & Operations, Janssen, Spring House, United States
    Competing interests
    Anuli Anyanwu-Ofili is affiliated to Janssen. The author has no financial interests to declare.
  17. Najat Khan

    R&D Data Sciences, R&D Strategy & Operations, Janssen, Spring House, United States
    Competing interests
    Najat Khan is affiliated to Janssen. The author has no financial interests to declare.
  18. Venky Soundararajan

    R&D, nference, Cambridge, United States
    For correspondence
    venky@nference.net
    Competing interests
    Venky Soundararajan is affiliated to nference. The author has financial interests in nference.
    ORCID: 0000-0001-7434-9211

Funding

No external funding was received for this work.

Copyright

© 2020, Venkatakrishnan et al.

This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 4,141 views
  • 526 downloads
  • 47 citations

Views, downloads and citations are aggregated across all versions of this paper published by eLife.

Cite this article

AJ Venkatakrishnan, Arjun Puranik, Akash Anand, David Zemmour, Xiang Yao, Xiaoying Wu, Ramakrishna Chilaka, Dariusz K Murakowski, Kristopher Standish, Bharathwaj Raghunathan, Tyler Wagner, Enrique Garcia-Rivera, Hugo Solomon, Abhinav Garg, Rakesh Barve, Anuli Anyanwu-Ofili, Najat Khan, Venky Soundararajan (2020)
Knowledge synthesis of 100 million biomedical documents augments the deep expression profiling of coronavirus receptors
eLife 9:e58040.
https://doi.org/10.7554/eLife.58040

