Reproducibility in Cancer Biology: Challenges for assessing replicability in preclinical cancer biology
Figures
![](https://iiif.elifesciences.org/lax/67995%2Felife-67995-fig1-v1.tif/full/617,/0/default.jpg)
Barriers to conducting replications – by experiment.
During the design phase of the project, the 193 experiments selected for replication were coded according to six criteria: whether data were available and shared; whether the statistical analysis was reported (i.e., did the paper describe the tests used in statistical analysis, and, if no such tests were used, did it report biological variation (e.g., graphs with error bars) or representative images?); whether analytic code was available and shared; whether the original authors offered to share key reagents; what level of protocol clarification was needed from the original authors; and how helpful the responses to those requests were. The 29 Registered Reports published by the project included protocols for 87 experiments, and these experiments were coded according to three criteria: whether reagents were shared by the original authors; whether the replication authors had to modify the protocol; and whether these modifications were implemented. A total of 50 experiments were completed.
![](https://iiif.elifesciences.org/lax/67995%2Felife-67995-fig1-figsupp1-v1.tif/full/617,/0/default.jpg)
Barriers to conducting replications – by paper.
This figure is similar to Figure 1, except that the results for the experiments have been combined to give results for the papers that contained the experiments. Experiments from 53 papers were coded during the design phase; experiments from 29 papers were initiated; and results from 23 papers were published. Two methods were used to convert the scores for experiments into scores for papers. For four criteria (protocol clarifications needed; authors helped; modifications needed; modifications implemented) the average of the scores from the experiments was assigned to the paper. For five criteria (data shared; analysis reported; code shared; reagents offered; reagents shared), scoring was conducted with a 'liberal' interpretation for sharing: for example, if data were shared for just one experiment in a paper, the paper was coded as data shared. See main text for further details.
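The aggregation described above is simple to express in code. The sketch below is a minimal illustration with hypothetical records and field names (it is not the project's analysis code, which is linked from the table at the end of this section): ordinal criteria are averaged across a paper's experiments, while sharing criteria follow the 'liberal' rule that sharing for any one experiment counts for the whole paper.

```python
from statistics import mean

# Hypothetical experiment-level records; field names are illustrative only.
# Ordinal criteria carry numeric codes; sharing criteria are booleans.
experiments = [
    {"paper": "Paper A", "clarifications_needed": 2, "data_shared": False},
    {"paper": "Paper A", "clarifications_needed": 3, "data_shared": True},
    {"paper": "Paper B", "clarifications_needed": 1, "data_shared": False},
]

# Group experiments by paper.
by_paper = {}
for exp in experiments:
    by_paper.setdefault(exp["paper"], []).append(exp)

paper_scores = {
    paper: {
        # Ordinal criteria: the paper score is the mean of its experiments.
        "clarifications_needed": mean(e["clarifications_needed"] for e in exps),
        # Sharing criteria ('liberal' interpretation): a paper is coded as
        # 'data shared' if data were shared for at least one experiment.
        "data_shared": any(e["data_shared"] for e in exps),
    }
    for paper, exps in by_paper.items()
}

print(paper_scores)
# {'Paper A': {'clarifications_needed': 2.5, 'data_shared': True},
#  'Paper B': {'clarifications_needed': 1, 'data_shared': False}}
```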
![](https://iiif.elifesciences.org/lax/67995%2Felife-67995-fig2-v1.tif/full/617,/0/default.jpg)
Relationship between extent of clarification needed and helpfulness of authors.
Fluctuation plots showing the coded ratings for the extent of clarification needed from the original authors and the degree to which the authors were helpful in providing feedback and materials for designing the replication experiments. The size of the square shows the number (Freq) of papers/experiments for each combination of clarification needed and helpfulness. (A) To characterize papers (N = 53), coded ratings were averaged across the experiments for each paper. The average number of experiments per paper was 3.6 (SD = 1.9; range = 1–11). The Spearman rank-order correlation between the extent of clarification needed and helpfulness was –0.24 (95% CI [–0.48, 0.03]) across papers. (B) For experiments (N = 193), the Spearman rank-order correlation between the extent of clarification needed and helpfulness was –0.20 (95% CI [–0.33, –0.06]).
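As an illustration of how such an estimate can be obtained, the sketch below computes a Spearman rank-order correlation and a percentile-bootstrap 95% confidence interval on made-up ratings. This is a hedged sketch, not the project's analysis code, and the bootstrap is only one common way to form such an interval; the data and scales are invented for the example.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)  # seeded for reproducibility

# Made-up coded ratings for illustration only (higher = more).
clarification_needed = np.array([1, 2, 2, 3, 3, 4, 4, 5, 5, 5])
helpfulness = np.array([5, 4, 5, 3, 4, 3, 2, 2, 1, 3])

rho, _ = spearmanr(clarification_needed, helpfulness)

# Percentile bootstrap: resample (x, y) pairs with replacement and
# recompute rho; the 2.5th and 97.5th percentiles give a 95% CI.
n = len(clarification_needed)
boot = [
    spearmanr(clarification_needed[idx], helpfulness[idx])[0]
    for idx in (rng.integers(0, n, n) for _ in range(10_000))
]
# nanpercentile ignores the rare degenerate resample (constant values).
lo, hi = np.nanpercentile(boot, [2.5, 97.5])

print(f"rho = {rho:.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
```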
![](https://iiif.elifesciences.org/lax/67995%2Felife-67995-fig2-figsupp1-v1.tif/full/617,/0/default.jpg)
Techniques used in the original experiments.
A total of 820 experimental techniques were identified in the 193 experiments selected for replication at the start of the project. These techniques were coded into 48 sub-categories, which were grouped into four categories (cell assays; immunoassays; animal assays; molecular assays).
![](https://iiif.elifesciences.org/lax/67995%2Felife-67995-fig2-figsupp2-v1.tif/full/617,/0/default.jpg)
Relationship between the category of techniques used and the extent of clarification needed or the helpfulness of authors.
Fluctuation plots showing the coded ratings for the category of techniques used in the original experiments and the extent of clarification needed from the original authors by paper (A) and by experiment (B). Fluctuation plots showing the category of techniques and the helpfulness of the original authors by paper (C) and by experiment (D). The size of the square shows the number (Freq) of papers/experiments for each combination. The average number of categories used in the 193 experiments was 2.2 (SD = 0.7; range = 1–4). To characterize papers (N = 53), coded ratings were averaged across the experiments for each paper; the average number of categories used per paper was 2.9 (SD = 0.7; range = 1–4).
![](https://iiif.elifesciences.org/lax/67995%2Felife-67995-fig3-v1.tif/full/617,/0/default.jpg)
Relationship between extent of modifications needed and implementation of modifications.
Fluctuation plots showing the coded ratings for the extent of modifications needed to conduct the replication experiments, and the extent to which the replication authors were able to implement these modifications for the experiments that were conducted. The size of the square shows the number (Freq) of papers/experiments for each combination. (A) To characterize papers (N = 29), coded ratings were averaged across the experiments conducted for each paper. The average number of experiments conducted per paper was 2.6 (SD = 1.3; range = 1–6), and the Spearman rank-order correlation between the extent of modifications needed and their implementation was –0.01 (95% CI [–0.42, 0.40]). (B) For the experiments that were started (N = 76), the Spearman rank-order correlation was 0.01 (95% CI [–0.27, 0.28]).
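For readers unfamiliar with fluctuation plots, the following minimal matplotlib sketch shows how such a plot can be built: each rating combination becomes a square whose size is proportional to its frequency. The rating pairs and the 1–4 scale are invented for the example and do not come from the project's data.

```python
from collections import Counter
import matplotlib.pyplot as plt

# Hypothetical (modifications needed, implemented) rating pairs on a 1-4 scale.
pairs = [(1, 4), (1, 4), (2, 3), (2, 4), (3, 2), (3, 3), (3, 3), (4, 1)]
freq = Counter(pairs)

xs, ys = zip(*freq)                      # unique rating combinations
sizes = [freq[p] * 200 for p in freq]    # marker area proportional to count

fig, ax = plt.subplots()
ax.scatter(xs, ys, s=sizes, marker="s")  # squares sized by frequency
ax.set_xlabel("Extent of modifications needed")
ax.set_ylabel("Implementation of modifications")
ax.set_xticks(range(1, 5))
ax.set_yticks(range(1, 5))
plt.show()
```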
![](https://iiif.elifesciences.org/lax/67995%2Felife-67995-fig4-v1.tif/full/617,/0/default.jpg)
The different phases of the replication process.
Graph showing the number of papers entering each of the six phases of the replication process, and the mean duration of each phase in weeks. 53 papers entered the design phase, which started with the selection of papers for replication and ended with the submission of a Registered Report (mean = 30 weeks; median = 31; IQR = 21–37). 32 papers entered the protocol peer review phase, which ended with the acceptance of a Registered Report (mean = 19 weeks; median = 18; IQR = 15–24). 29 papers entered the preparation phase (Prep), which ended when experimental work began (mean = 12 weeks; median = 3; IQR = 0–11); the mean for this phase was much higher than the median (and outside the IQR) because the phase took less than a week for many studies, but much longer for a small number of studies. The same 29 papers entered the conduct phase, which ended when the final experimental data were delivered (mean = 90 weeks; median = 88; IQR = 44–127); the analysis and writing phase then began, ending with the submission of a Replication Study (mean = 24 weeks; median = 23; IQR = 7–32). 18 papers entered the results peer review phase, which ended with the acceptance of a Replication Study (mean = 22 weeks; median = 18; IQR = 15–26). In the end, 17 Replication Studies were accepted for publication. The entire process had a mean length of 197 weeks and a median length of 181 weeks (IQR = 102–257).
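The prep-phase pattern, with a mean far above the median and outside the IQR, is the signature of a strongly right-skewed distribution. A small worked example with invented durations (not the project's data) shows how a few long waits pull the mean up while leaving the median near zero:

```python
import numpy as np

# Invented prep-phase durations in weeks: most papers start almost
# immediately, but a few take far longer.
durations = np.array([0, 0, 0, 1, 1, 2, 3, 5, 8, 11, 40, 75])

print("mean   =", round(durations.mean(), 1))          # 12.2
print("median =", np.median(durations))                # 2.5
print("IQR    =", np.percentile(durations, [25, 75]))  # [0.75 8.75]
```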
Tables
The 53 papers selected for replication in the RP:CB.
193 experiments in 53 papers were selected for replication. The papers are listed in column 1, and the number of experiments selected from each paper is listed in column 2. Registered Reports for 87 experiments from 29 papers were published in eLife. The Registered Reports are listed in column 3, and the number of experiments included in each Registered Report is listed in column 4. 50 experiments from 23 Registered Reports were completed. 17 Replication Studies reporting the results of 41 experiments were published in eLife; the results of another nine experiments from the six remaining Registered Reports were published in an aggregate paper (Errington et al., 2021a). The Replication Studies are listed in column 5, and the number of experiments included in each study is listed in column 6. Column 7 contains a link to data, digital materials, and code.