Meta-Research: COVID-19 research risks ignoring important host genes due to pre-established research patterns

  1. Thomas Stoeger (corresponding author)
  2. Luís A Nunes Amaral (corresponding author)
  1. Successful Clinical Response in Pneumonia Therapy (SCRIPT) Systems Biology Center, Northwestern University, United States
  2. Department of Chemical and Biological Engineering, Northwestern University, United States
  3. Center for Genetic Medicine, Northwestern University School of Medicine, United States
  4. Northwestern Institute on Complex Systems (NICO), Northwestern University, United States
  5. Department of Molecular Biosciences, Northwestern University, United States
  6. Department of Physics and Astronomy, Northwestern University, United States
  7. Department of Medicine, Northwestern University School of Medicine, United States
4 figures and 7 additional files

Figures

Figure 1 with 1 supplement
Most host genes implicated in COVID-19 identified by genome-wide approaches are not being investigated.

(A) Share of identified genes that are ignored (never tagged, blue) or tagged (at least once) within the COVID-19 literature. (B) Share of tagged genes identified by a single (orange) or multiple …

Figure 1—figure supplement 1
Share of identified genes that are ignored or tagged.

Share of identified genes that are ignored (never tagged, blue) or tagged at least once (red) within the COVID-19 literature after additionally including genes occurring in abstracts of preprints.

Figure 2 with 1 supplement
What the future holds?

Percentage of genes, grouped by the indicated level of support from the four genome-wide studies, that have been tagged at least once in the COVID-19 literature. (A) Analysis restricted to the 50% of genes with …

Figure 2—figure supplement 1
Temporal trends in the diversity of COVID-19 research.

(A) Gini coefficient within the COVID-19 literature up to the indicated day (green dots). (B) As Figure 2D but considering individual months (ochre dots). (C) As Figure 2F but considering individual months …
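
For readers unfamiliar with the measure in panel A, the following is a minimal sketch of a Gini coefficient computed over per-gene tag counts. The caption does not state which quantity the coefficient is computed over or which estimator the authors used, so the input (a list of per-gene counts), the function name `gini`, and the formula variant are assumptions for illustration only.

```python
def gini(counts):
    """Gini coefficient of a list of non-negative counts.

    Uses the common rank-based formula
        G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    with x sorted in ascending order and i running from 1 to n.
    0 means attention is spread evenly; values near 1 mean it is
    concentrated on very few items.
    """
    xs = sorted(counts)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one item and a non-zero total")
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical per-gene tag counts (not taken from the article).
even_attention = [5, 5, 5, 5]
concentrated_attention = [0, 0, 0, 20]

g_even = gini(even_attention)
g_concentrated = gini(concentrated_attention)
```

In this toy input the evenly spread counts give a coefficient of 0, while counts concentrated on a single gene give the maximum possible for four items, (n - 1) / n. A curve like the one in panel A could be obtained by recomputing the coefficient on the counts accumulated up to each day.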

Figure 3
Availability of reagents.

(A) Drugs studied in COVID-19-related clinical trials are frequently studied within the non-COVID-19 literature. We compare non-COVID-19 publications measured for human protein-coding genes that are …

Author response image 1

Additional files

Source code 1

Source code for curation and analysis of datasets.

https://cdn.elifesciences.org/articles/61981/elife-61981-code1-v1.zip
Supplementary file 1

Gene Ontology enrichment analysis for human protein-coding genes tagged in the COVID-19 literature.

https://cdn.elifesciences.org/articles/61981/elife-61981-supp1-v1.xlsx
Supplementary file 2

Identification of genes through multiple GWAS comparisons.

https://cdn.elifesciences.org/articles/61981/elife-61981-supp2-v1.xlsx
Supplementary file 3

Implicated host genes identified by multiple genome-wide studies.

https://cdn.elifesciences.org/articles/61981/elife-61981-supp3-v1.xlsx
Supplementary file 4

Extent of tags in COVID-19 literature compared to rate of identification in genome-wide datasets.

Per-gene average share of COVID-19 literature and per-gene average identification rate in genome-wide datasets (1 if identified, 0 if not). Shown are the ratios of the share and the rate in each group (the 100 first-tagged genes, and the 20% top-tagged genes) to the share and the rate among the other genes that have been tagged in the COVID-19 literature.

https://cdn.elifesciences.org/articles/61981/elife-61981-supp4-v1.xlsx
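
The ratios described for Supplementary file 4 can be sketched roughly as follows. This is an illustrative reading of the caption, not the authors' code (which is in Source code 1): the helper name `enrichment_ratio`, the gene labels, and all numbers are hypothetical.

```python
from statistics import mean


def enrichment_ratio(values, group):
    """Mean value among genes in `group` divided by the mean among
    all other genes present in `values`."""
    inside = [v for gene, v in values.items() if gene in group]
    outside = [v for gene, v in values.items() if gene not in group]
    if not inside or not outside:
        raise ValueError("group and its complement must both be non-empty")
    baseline = mean(outside)
    if baseline == 0:
        raise ValueError("mean over the other genes is zero; ratio undefined")
    return mean(inside) / baseline


# Hypothetical tagged genes with their share of the COVID-19 literature.
literature_share = {"GENE_A": 0.40, "GENE_B": 0.20, "GENE_C": 0.10, "GENE_D": 0.10}
# 1 if the gene was identified in a genome-wide dataset, 0 if not.
identified = {"GENE_A": 1, "GENE_B": 0, "GENE_C": 1, "GENE_D": 0}
# Hypothetical group, standing in for e.g. the top-tagged genes.
top_tagged = {"GENE_A", "GENE_B"}

share_ratio = enrichment_ratio(literature_share, top_tagged)
rate_ratio = enrichment_ratio(identified, top_tagged)
```

With these toy numbers the group receives about three times the literature share of the other genes while being identified at the same rate, which is the kind of contrast the two ratios are meant to expose.
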
Supplementary file 5

Number of laboratories working on individual genes, identified within one of the four genome-wide datasets, between 2006 and 2015.

https://cdn.elifesciences.org/articles/61981/elife-61981-supp5-v1.xlsx
Transparent reporting form
https://cdn.elifesciences.org/articles/61981/elife-61981-transrepform-v1.docx
