Replication Study: Intestinal inflammation targets cancer-inducing activity of the microbiota

  1. Kathryn Eaton
  2. Ali Pirani
  3. Evan S Snitkin
  4. Reproducibility Project: Cancer Biology (corresponding author)
  1. University of Michigan Medical School, United States
Replication Study
Cite this article as: eLife 2018;7:e34364 doi: 10.7554/eLife.34364

Abstract

As part of the Reproducibility Project: Cancer Biology we published a Registered Report (Eaton et al., 2015) that described how we intended to replicate selected experiments from the paper “Intestinal Inflammation Targets Cancer-Inducing Activity of the Microbiota” (Arthur et al., 2012). Here we report the results. We observed no impact on bacterial growth or colonization capacity when the polyketide synthase (pks) genotoxic island was deleted from E. coli NC101, similar to the original study (Supplementary Figure 7; Arthur et al., 2012). However, for the experiment that compared inflammation, invasion, and neoplasia in azoxymethane (AOM)-treated interleukin-10-deficient mice mono-associated with NC101 or NC101Δ pks, the experimental timing of the replication attempt was longer than that of the original study. This difference arose because the methodology was not clearly stated in the original study, and it likely led to the increased mortality and severity of inflammation observed in this replication attempt. Additionally, early death occurred during AOM treatment, with higher mortality observed in NC101Δ pks mono-associated mice compared to NC101, which was in the same direction as, but more severe than, the original study (Supplemental Figure 10; Arthur et al., 2012). A meta-analysis suggests that mice mono-associated with NC101Δ pks have higher mortality compared to NC101. While these data were unable to address whether, under the conditions of the original study, NC101 and NC101Δ pks differ in inflammation, invasion, and neoplasia, this replication attempt demonstrates that clear description of experimental methods is essential to ensure accurate reproduction of experimental studies.

https://doi.org/10.7554/eLife.34364.001

Introduction

The Reproducibility Project: Cancer Biology (RP:CB) is a collaboration between the Center for Open Science and Science Exchange that seeks to address concerns about reproducibility in scientific research by conducting replications of selected experiments from a number of high-profile papers in the field of cancer biology (Errington et al., 2014). For each of these papers a Registered Report detailing the proposed experimental designs and protocols for the replications was peer reviewed and published prior to data collection. The present paper is a Replication Study that reports the results of the replication experiments detailed in the Registered Report (Eaton et al., 2015) for a 2012 paper by Arthur et al., and uses a number of approaches to compare the outcomes of the original experiments and the replications.

In 2012, Arthur et al. reported that intestinal inflammation modifies the gut microbiota, affecting the progression of colorectal cancer (CRC). The model used in that study was one of a group of related models that are commonly used to study the role of inflammation in colon carcinogenesis (Kanneganti et al., 2011). These models combine treatment with azoxymethane (AOM), a proximate carcinogen, with an initiator of local inflammation. The models vary in the dose and duration of AOM treatment as well as in the treatment used to induce inflammation. Inflammatory insults used may be chemical, most commonly dextran sodium sulfate (DSS, a local irritant), genetic (e.g. engineered absence of a regulator of inflammation, such as Il10-/- ), infectious (e.g. Helicobacter hepaticus, Escherichia coli (E. coli), or Salmonella typhimurium), or a combination of these (for details, see Kanneganti et al., 2011). The model used by Arthur and colleagues was a combination of interleukin-10-deficient (Il10-/- ) mice and E. coli mono-association to produce a background of chronic inflammation followed by six weekly injections of AOM to induce neoplastic transformation. Using this inflammation-induced CRC model, Arthur and colleagues reported that germ-free mice mono-associated with the commensal mouse adherent-invasive E. coli strain NC101 developed invasive mucinous carcinomas which did not occur in mice mono-associated with Enterococcus faecalis, another colitis-inducing bacterial strain (Arthur et al., 2012). NC101 harbors the polyketide synthase (pks) pathogenicity island that encodes the biosynthetic machinery for synthesizing the genotoxin colibactin (Nougayrède et al., 2006). Mono-association of NC101 led to enhanced tumor multiplicity and invasion in AOM-treated germ-free Il10-/- mice, which was decreased in AOM-treated germ-free Il10-/- mice mono-associated with an isogenic mutant deficient for the pks island (NC101Δ pks), without altering colonic inflammation (Arthur et al., 2012).

The Registered Report for the 2012 paper by Arthur et al. described the experiments to be replicated (Figure 4A–F, and Supplemental Figure 7 and 10), and summarized the current evidence for these findings (Eaton et al., 2015). Since that publication there have been additional studies investigating whether pks-harboring E. coli strains enhance tumorigenesis. Similar observations were made using APCMin/+ mice (Bonnet et al., 2014), germ-free APCMin/+; Il10-/- mice (Tomkovich et al., 2017), or an AOM-DSS xenograft mouse model of CRC (Cougnoux et al., 2014). A follow-up study by Arthur and colleagues reported that colonic inflammation was necessary for the tumor-promoting activity of NC101 through modulation of specific microbial genes (Arthur et al., 2014).

The outcome measures reported in this Replication Study will be aggregated with those from the other Replication Studies to create a dataset that will be examined to provide evidence about reproducibility of cancer biology research, and to identify factors that influence reproducibility more generally.

Results and discussion

Impact of pks island deletion on bacterial growth

Using the same commensal mouse adherent-invasive E. coli NC101 strain and an isogenic pks-deficient (NC101Δ pks) strain as the original study (Arthur et al., 2012), we confirmed deletion of the pks island by PCR and whole genome sequencing, which revealed no variants or insertions/deletions other than the desired pks deletion between the two isogenic strains (Figure 1—figure supplement 1). The two bacterial strains were analyzed to determine if the absence of pks affected bacterial growth. This is comparable to what was reported in Supplemental Figure 7 of Arthur et al. (2012) and described in Protocol 1 in the Registered Report (Eaton et al., 2015). Similar to the original study, NC101 and NC101Δ pks growth curves were visually equivalent to each other (Figure 1, Figure 1—figure supplement 2), indicating deletion of pks does not affect E. coli growth in vitro. The intrinsic doubling time of the NC101 strain was 36 min, 95% CI [27-45], while the doubling time of the NC101Δ pks strain was 52 min, 95% CI [6-97]. This compares to the original study that had an estimated doubling time of ~53 min for the NC101 strain and ~64 min for the NC101Δ pks strain. To summarize, for this experiment we found results that were similar to the original study.
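As an illustration of how a doubling time can be derived from OD600 readings, the following is a minimal sketch that fits a straight line to the natural log of absorbance over time and converts the slope to a doubling time. It is not the analysis pipeline used in this study (see the OSF link in the Figure 1 legend for that); the function name and the synthetic readings are assumptions made for the example, and the readings are assumed to come from the exponential growth phase only.

```python
import math

def doubling_time(times_min, od600):
    """Estimate doubling time (minutes) from exponential-phase OD600 readings.

    Fits ln(OD600) = a + k * t by ordinary least squares and
    returns ln(2) / k, where k is the growth rate per minute.
    """
    logs = [math.log(v) for v in od600]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    slope = sxy / sxx  # growth rate k
    return math.log(2) / slope

# Synthetic (hypothetical) readings every 30 min for a culture
# that doubles every 36 min; not data from this study.
times = [0, 30, 60, 90, 120]
od = [0.01 * 2 ** (t / 36) for t in times]
estimate = doubling_time(times, od)
print(f"estimated doubling time: {estimate:.1f} min")
```

With real plate-reader data, the choice of which time points count as exponential phase strongly affects the estimate, which is one reason confidence intervals such as those reported above can be wide.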

Figure 1 (with 2 supplements)
Detection of pks island and impact of pks island deletion on bacterial growth in vitro.

In vitro growth curve of E. coli NC101 and E. coli NC101Δ pks. Overnight bacterial cultures were diluted 1:500 in Luria-Bertani (LB) broth and incubated at 37˚C. Absorbance at 600 nm (OD600) was measured every 30 min. Representative growth curves of 3 independent biological repeats. The intrinsic doubling time was determined to be 36 min, 95% CI [27-45] for the NC101 strain and 52 min, 95% CI [6-97] for the NC101Δ pks strain. Additional details for this experiment can be found at https://osf.io/54rgt/.

https://doi.org/10.7554/eLife.34364.002

Intestinal tumorigenesis and inflammation of germ-free Il10-/- mice mono-associated with E. coli NC101 or NC101 Δpks

AOM-colitis models of carcinogenesis have been comprehensively reviewed (Kanneganti et al., 2011); however, it is important to note the unique features of the model used in the original study and this replication attempt. First, the model used germ-free mice. Germ-free mice differ from conventional mice in that they do not normally develop colitis, even in the absence of IL-10 (Eaton et al., 2011). Il10-/- mice that are housed in the presence of enteric microbes develop varying levels of colitis that appear to depend on their specific microbiota. Germ-free mice have been used fairly extensively in a related AOM-DSS model of carcinogenesis, but rarely in AOM-infection models. Also, germ-free mice differ from specific-pathogen-free (SPF) mice in their hepatic metabolism. Because AOM is a hepatotoxin in addition to a carcinogen, its toxicity may be altered in mice without a normal microbiota (Selwyn et al., 2016; Tung et al., 2017). The second unique aspect of the current model is that it uses mutant mice on a 129 SvEv background. Mouse strains differ in their response to AOM, with the effect also modulated by non-genetic factors, like diet, which tend to vary widely across studies, making it difficult to directly compare results (Bissahoyo et al., 2005). For these reasons, we attempted to control both genetic and non-genetic factors between the original study and this replication attempt.

To test whether absence of the pks island reduced tumorigenic potential, but not inflammatory potential of E. coli, we attempted to independently replicate an experiment similar to the one reported in Figure 4A–F, and Supplemental Figure 10, of Arthur et al. (2012). The protocol used was described in Protocol 3 in the Registered Report (Eaton et al., 2015), which was based on the available information provided in the original published paper (Arthur et al., 2012) and through communication with the authors of the original study. Germ-free Il10-/- mice on a 129/SvEV background (derived from the same germ-free colony used in the original study) were mono-associated with either the NC101 or NC101Δ pks isolate described above, treated with AOM, and then assessed for colonic inflammation and tumorigenesis. The original study also reported cohorts of mice that did not receive AOM treatment (assessed at 12 weeks), or received AOM treatment and were assessed at 14 weeks. These additional cohorts were not included in the design of this replication attempt. Furthermore, the number of mice required for this study to have sufficient power to detect the originally reported effect sizes was determined a priori and took into account the number of anticipated animal deaths that would occur prior to the 18 week assessment based on the originally reported survival rates. To summarize, bacterial colonization was confirmed 4 weeks after mono-association, after which AOM was administered weekly for a total of 6 injections, and then mice were monitored for 18 weeks after the last AOM injection (experimental timing visualized in Figure 2A). The timing for this experiment was based on information available in the original paper (Arthur et al., 2012) and remained unchanged after informal review and feedback by the authors of the original paper during preparation of the Registered Report manuscript, peer review of the Registered Report, and post-publication peer review of the published Registered Report. During peer review of this Replication Study, however, one of the reviewers suggested that the experimental timing used in this replication attempt was different from the original study, which evaluated colonic inflammation and tumorigenesis 18 weeks after mono-association rather than 18 weeks after AOM treatment as was described in the Registered Report and performed in this study. Thus, this replication attempt had an experimental endpoint 9 weeks longer than the original study. This methodological error, which confounds the interpretation of the results of this replication attempt, was based on the methods derived from the original study and not corrected on review of the Registered Report (Eaton et al., 2015). Others have reported how assumptions in experimental timing or methods hindered their efforts to understand how seemingly similar experiments produced different results (Hines et al., 2014; Lithgow et al., 2017). One approach to mitigate the potential for misinterpreting complex study designs is to include a timeline diagram or flowchart as recommended by the ARRIVE Guidelines (Kilkenny et al., 2010).

Figure 2 (with 2 supplements)
Impact of pks island deletion on mouse survival.

Female and male Il10-/- germ-free mice were randomly assigned to be mono-associated with either E. coli NC101 or E. coli NC101Δ pks at age 7-12 weeks. Four weeks after mono-association, mice received six weekly azoxymethane (AOM) injections to induce colon tumors. Mice were monitored until euthanized due to health complications or the pre-specified study end-point of 18 weeks after the last AOM injection (Replication Study). This experimental end-point was longer than the original study, which euthanized mice at 18 weeks after mono-association (Arthur et al., 2012). Number of mice monitored: n=39 for NC101 mono-associated mice and n=45 for NC101Δ pks mono-associated mice. Early in the study a few isocages were contaminated and had to be removed from the study. This did not affect the other isocages which remained gnotobiotic. Animals where bacterial contamination was detected (n=7 in 3 cages) were censored in the plots (denoted by a cross line). (A) Kaplan-Meier plot of overall survival starting at time of mono-association until 18 weeks after the last AOM injection, which is the experimental timing used in this replication. Green bar indicates when euthanized mice were histopathologically (histo) evaluated for inflammation and tumorigenesis. Log-rank (Mantel-Cox) test of NC101Δ pks compared to NC101 (p = 0.0608); HR=1.57, 95% CI [0.98, 2.51]. (B) Kaplan-Meier plot of overall survival starting at time of mono-association until 18 weeks after mono-association, which is the experimental timing used in the original study (Arthur et al., 2012). Original and replication data are plotted for direct comparison. Exploratory analysis of replication data: Log-rank (Mantel-Cox) test of NC101Δ pks compared to NC101 (p = 0.0230); HR=1.95, 95% CI [1.10, 3.45]. Additional details for this experiment can be found at https://osf.io/pm5xa/.

https://doi.org/10.7554/eLife.34364.005

In this study, we found most mice did not survive to the planned 18-week post-AOM time point, and either died or were euthanized prior to the intended study end-point (Figure 2A). Despite efforts to include more animals into the study (n=39 for NC101 mono-associated mice; n=45 for NC101Δ pks mono-associated mice), we did not achieve the planned number of animals at the end point of 18 weeks post-AOM treatment. Only 11 mice survived to 17-18 weeks after AOM treatment (4 mono-associated with NC101; 7 mono-associated with NC101Δ pks), while a total of 25 mice survived 14 weeks post-AOM treatment or later. During the course of the entire study, there was a median survival of 154 days (range: 31-176 days) and 57 days (range: 29-188 days) for NC101 and NC101Δ pks mono-associated mice, respectively, which was not statistically significant (log-rank (Mantel-Cox) test; p = 0.0608). Regardless of mono-associated bacterial strain, mortality was highest during AOM treatment; however during this interval, mortality was greater in mice mono-associated with NC101Δ pks compared to mice mono-associated with NC101. In the NC101Δ pks group, fewer than half the animals survived beyond the last AOM treatment. Importantly, there were similar levels of colonization capacity in vivo for both strains of bacteria (Figure 2—figure supplement 1), similar to the original study, suggesting bacterial load was not a factor in the survival differences.

Interestingly, while a similar distribution of male and female mice were assigned to both strains for mono-association (NC101: female=21, male=18; NC101Δ pks: female=23, male=22), male mice became overrepresented during the course of this study: 14 weeks post-AOM treatment (NC101: female=5, male=9; NC101Δ pks: female=3, male=8); 17-18 weeks post-AOM treatment (NC101: female=0, male=4; NC101Δ pks: female=2, male=5). In the original study 4 female and 8 male mice were mono-associated with NC101 and 8 male mice were mono-associated with NC101Δ pks (Arthur, personal communication) of which 14 mice (9 mono-associated with NC101; 5 mono-associated with NC101Δ pks) survived to 18 weeks after mono-association (Arthur et al., 2012); however, the sex distribution of the surviving mice was not published or communicated.

To facilitate a direct comparison of these results to the original study we determined survival up to 18 weeks following mono-association, similar to the timing performed in the original study (Figure 2B). When treating 18 weeks following mono-association as the endpoint (i.e. ignoring events after this time point), we observed the absence of the pks island had an impact on survival (NC101: 53.8%; NC101Δ pks: 26.7%). This corresponds to a median survival of 57 days for NC101Δ pks while the median survival for NC101 mono-associated mice could not be determined since more than half of the animals were still alive at 18 weeks following mono-association. An exploratory analysis to compare the survival distributions between the two groups during this timeframe was statistically significant (log-rank (Mantel-Cox) test; p = 0.0230). The original study reported that the absence of the pks island had a small, but not statistically significant effect on mouse survival (NC101: 75% (9 of 12 mice); NC101Δ pks: 62.5% (5 of 8 mice)) (Arthur et al., 2012). When considering only the survival data up to 18 weeks after mono-association, which is the timing performed in the original study, we found results that were in the same direction as the original study.
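To make concrete why a median survival can be reported for one group but "could not be determined" for the other, here is a minimal Kaplan-Meier sketch with right-censoring. The animal-level times below are hypothetical toy values, not the study data (those are available at the OSF link in the Figure 2 legend), and the function names are assumptions for illustration. Day 126 stands in for the 18-week cutoff after mono-association.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct death time.

    times  -- follow-up time for each animal (e.g. days)
    events -- 1 if the animal died at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all animals that share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve

def median_survival(curve):
    """First time at which survival drops to 50% or below, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None

# Hypothetical toy groups; animals alive at day 126 are censored (event = 0).
group_a = kaplan_meier([31, 100, 126, 126, 126], [1, 1, 0, 0, 0])
group_b = kaplan_meier([29, 40, 57, 57, 90, 126], [1, 1, 1, 1, 0, 0])
print(median_survival(group_a))  # None: more than half still alive at cutoff
print(median_survival(group_b))  # 57
```

When the estimated survival never falls to 50% within the observation window, the median is undefined, which is the situation described above for the NC101 group at 18 weeks after mono-association.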

During this study, early death that occurred during AOM treatment was associated with widespread liver lesions. This outcome was not reported in the original study; however, the different outcomes between the two studies could be due to methodological details that were unaccounted for (Bramhall et al., 2015), or other unknown differences between the two studies. AOM is a known hepatotoxin, but the dose and protocol used were below the published toxic dose for AOM in mice (Bissahoyo et al., 2005), and were not expected to cause lesions. Four mice that were found dead 1-2 days following the first or second AOM treatment (NC101 = 1, NC101Δ pks = 3) had massive acute hepatic necrosis that appeared severe enough to have caused death (Figure 2—figure supplement 2A). The other mice that died during AOM treatment had liver lesions of less severity, including widespread hepatocellular vacuolation and multifocal hepatocellular necrosis (Figure 2—figure supplement 2B). Mice that died after the AOM treatment phase all had chronic liver lesions suggestive of a regenerative response to ongoing damage. These included anisocytosis often with giant hepatocytes well beyond what would be expected in aged mice, atrophy and disorganization of hepatic acini, and multifocal single-cell necrosis (Figure 2—figure supplement 2C). Surprisingly, there was no evidence of fibrosis even in the most chronic lesions. Although the previous, or ongoing, liver damage likely contributed to the poor condition of the mice that survived the AOM treatment, the chronic liver lesions did not appear severe enough to have caused death directly.

In the mice that died or were euthanized prematurely, colitis first appeared at the start of AOM treatment (5 weeks after mono-association) and was present in all mice that were examined histologically 9 weeks or more after mono-association. Most mice also had typhlitis, which was less severe than the colitis. Inclusion of AOM-only controls could have indicated any increased susceptibility of the mice used in this study to AOM and should be considered in the experimental design of future studies.

Colon adenocarcinoma was first detected in mice that died 8 weeks after AOM treatment (17 weeks after mono-association) and was present in all mice euthanized 13 weeks or more after AOM treatment. Notably, a few mice died at time points close to the time that mice were harvested in the original study: one with typhlitis (NC101 at 14 weeks), two with severe colitis (NC101Δ pks at 14 weeks; NC101Δ pks at 14 weeks), one with typhlitis and dysplasia (NC101Δ pks at 19 weeks), and three with colon adenocarcinoma (NC101Δ pks at 17 weeks; NC101 at 22 weeks; NC101 at 22 weeks). Furthermore, five mice (between 19 and 27 weeks after mono-association) had anal squamous cell carcinoma in addition to colon adenocarcinoma (NC101 = 3, NC101Δ pks = 2) (Figure 2—figure supplement 2D,E).

In addition to the increased early death rate during and immediately after AOM treatment in this replication attempt, severity of chronic lesions and extent of neoplasia were greater than what was reported in the original study. This is most likely due to the longer experimental timing that occurred in this replication attempt compared to the original study. Lesions were similar in morphology to those described in the original study, but more severe. Grossly evident colon thickening was present in mice examined 22 or more weeks after mono-association. These lesions were widespread and coalescing, and unlike in the original study, individual masses could not be distinguished either grossly or histologically (Figure 3—figure supplement 1). This morphology is typical of both neoplastic and non-neoplastic inflammation-associated proliferative lesions in mice (Boivin et al., 2003; Washington et al., 2013). The most severely affected mice had markedly irregular colon mucosa, sometimes extending from the anus to the mid-proximal part of the colon. The proximal colon and cecum were grossly normal, while the distal half to third of the colon had lesions. Lesions were also sometimes present in the mid-colon, but only when the distal colon was severely affected. These observations were the same whether the mouse was mono-associated with NC101 or NC101Δ pks. In the few cases (10 of 24 mice) where histologically detectable non-neoplastic tissue was present between neoplastic foci, an attempt was made to enumerate individual tumors in histologic sections (Figure 3A). The median count in sections in which individual tumors were detectable for NC101 mono-associated mice was 3.5, which was greater than in NC101Δ pks mono-associated mice (median count of 2). Because individual tumors could not be distinguished in most mice, tumor counts cannot be directly compared to the original study where the absence of the pks island resulted in a statistically significant decrease in macroscopic tumor burden (estimated median count: NC101 = 8, NC101Δ pks = 2), without an impact on tumor size (Arthur et al., 2012).

Figure 3 (with 3 supplements)
Impact of pks island deletion on colonic inflammation and tumorigenesis.

Female and male Il10-/- germ-free mice mono-associated with either E. coli NC101 or E. coli NC101Δ pks and treated with AOM were blindly assessed for inflammation and tumorigenesis at sacrifice. This is from the same experiment as in Figure 2. Results presented for mice that survived 14 weeks post-AOM treatment or later (Number of mice analyzed: n=14 for NC101, n=10 for NC101Δ pks). Dot plots where each symbol represents data from one mouse with medians reported as crossbars. One mouse inoculated with NC101Δ pks was found dead (186 days post-AOM) and was too autolyzed for interpretation, and thus was not included in plots. (A) Macroscopic tumor number where individual tumors were detectable. TNTC (too numerous to count) indicates mice where individual tumors could not be enumerated because of the coalescing nature of the lesions. (B) Histological inflammation scores. Exploratory analysis: Wilcoxon-Mann-Whitney test; U = 72, p = 0.892; Cliff’s d = 0.029, 95% CI [-0.28, 0.33]. (C) Histological invasion scores. Exploratory analysis: Wilcoxon-Mann-Whitney test; U = 80, p = 0.488; Cliff’s d = 0.14, 95% CI [-0.25, 0.49]. (D) Histological neoplasia scores. Exploratory analysis: Wilcoxon-Mann-Whitney test; U = 72, p = 0.892; Cliff’s d = 0.029, 95% CI [-0.28, 0.33]. Additional details for this experiment can be found at https://osf.io/pm5xa/.

https://doi.org/10.7554/eLife.34364.008

Colonic inflammation and tumorigenesis were scored as far as possible using the same scoring criteria as the original study (Arthur et al., 2012) while taking into account other published criteria (see Materials and methods section; Boivin et al., 2003; Rath et al., 1996; Washington et al., 2013). We found that there were no substantial differences in the inflammation, invasion, or neoplasia scores between mice mono-associated with NC101 or NC101Δ pks. When considering all the mice that survived 14 weeks post-AOM treatment (5 weeks beyond the endpoint in Arthur et al., 2012), or later, the median scores for each measure were at or near the maximum possible for both cohorts (inflammation = 6; invasion = 8; neoplasia = 5) (Figure 3B–D). These results are confounded by the increased severity of inflammation and mortality of animals, most likely due to the experimental timing that occurred in this replication, which was longer than what occurred in the original study. Thus, these results cannot be directly compared to the original study which reported that the absence of the pks island resulted in a statistically significant decrease in neoplasia scores (NC101: median = 4, range = 4-5; NC101Δ pks: median = 4, range = 3-4) and invasion scores (NC101: median = 2, range = 1-6; NC101Δ pks: median = 1, range = 0-2), but not inflammation scores (NC101: median = 4, range = 4-4; NC101Δ pks: median = 4, range = 3-4) 18 weeks after mono-association (Arthur et al., 2012). Although the original study analyzed the ordinal scoring data as interval measurements (by t test), which is not appropriate since the mean cannot be defined (Baker et al., 2014; Gibson-Corley et al., 2013), similar results were obtained when a non-parametric test (i.e. Mann Whitney test) was applied on the original data (inflammation: U = 27, p = 0.233; invasion: U = 39.5, p = 0.0240; neoplasia: U = 37.5, p = 0.0297).
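For readers less familiar with rank-based effect sizes, the sketch below shows how a Mann-Whitney U statistic relates to Cliff's d for ordinal scores such as those in Figure 3. It is an illustrative sketch, not the analysis code used in this study; the function names are assumptions, and the sign convention (which group is treated as x) is also an assumption. The relation d = 2U/(n_x n_y) - 1 is standard, and plugging in the U values and group sizes (n=14, n=10) from the Figure 3 legend gives values close to the reported Cliff's d.

```python
def mann_whitney_u(x, y):
    """U statistic for group x: pairs where x > y count 1, ties count 0.5."""
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u

def cliffs_delta_from_u(u, n_x, n_y):
    """Cliff's d = P(x > y) - P(x < y), expressed through U."""
    return 2.0 * u / (n_x * n_y) - 1.0

# U values and group sizes taken from the Figure 3 legend.
d_inflammation = cliffs_delta_from_u(72, 14, 10)
d_invasion = cliffs_delta_from_u(80, 14, 10)
print(round(d_inflammation, 3), round(d_invasion, 2))
```

A Cliff's d near zero, as seen here, means a score drawn from one group is about as likely to exceed as to fall below a score drawn from the other, which is expected when most scores sit at or near the maximum of the scale.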

As noted above, the differences in severity of inflammation, invasion, and neoplasia between the two studies are most likely explained by the longer experimental timing in this replication attempt compared to the original study. The absolute scores were greater in this replication attempt compared to the original study, particularly for inflammation and invasion, which, combined with the survival and histopathological observations described above, suggests that lesions were more severe and/or progressive over time in this replication attempt than in the original study. Any differences attributable to the absence of pks would have been masked by the greatly increased severity of the lesions. Additionally, over time it is possible the products of excessive inflammatory responses (e.g. reactive oxygen species), which promote tumorigenesis, become more important than pks status. Thus, it is possible that the experimental timing in this mouse model is crucial for differentiating the outcomes of NC101 and NC101Δ pks. While subjective differences in histologic interpretation could also account for differences between the studies (Cross, 1998; Gibson-Corley et al., 2013), the level of variation observed between these two studies is likely greater than would be expected due to differences in interpretation alone. To summarize, since this replication attempt did not model the kinetics of the mouse model as they occurred in the original study, these data are unable to address whether, under the conditions of the original study, NC101 and NC101Δ pks differ in inflammation, invasion, and neoplasia. These results highlight the importance of completeness and clarity in publication of experimental methodology, including experimental timing, to facilitate reproducibility between studies.

Meta-analyses of original and replicated effects

We performed a meta-analysis using a random-effects model, where possible, to combine each of the effects described above as pre-specified in the confirmatory analysis plan (Eaton et al., 2015). We excluded the comparisons of inflammation, invasion, and neoplasia scores since the experimental timing between the original study and this replication attempt were not the same, preventing a direct comparison of results. To provide a standardized measure of the effect calculated for survival, a common effect size was calculated for each effect from the original and replication studies. The hazard ratio (HR) is the ratio of the probability of a particular event, in this case death, in one group compared to the probability in another group. The estimate of the effect size of one study, as well as the associated uncertainty (i.e. confidence interval), compared to the effect size of the other study provides another approach to compare the original and replication results (Errington et al., 2014; Valentine et al., 2011). Importantly, the width of the confidence interval (CI) for each study is a reflection of not only the confidence level (e.g. 95%), but also variability of the sample (e.g. SD) and sample size.

A meta-analysis of the intrinsic doubling times of the NC101 and NC101Δ pks strains was not conducted since the original study reported a single growth curve for both strains. Comparing the original and replication results, the original value reported in Arthur et al. (2012) for NC101 fell outside the 95% CI of the values generated during this replication attempt, while the original value for NC101Δ pks was within the 95% CI (Figure 1).

The comparison of the overall survival distributions between NC101 mono-associated mice and those that were mono-associated with NC101Δ pks resulted in a HR of 1.95, 95% CI [1.10, 3.45] for this replication attempt compared to a HR of 1.69, 95% CI [0.31, 9.09] for the original study (Arthur et al., 2012). Importantly, the calculation of the HR for both studies used data during the same timeframe (i.e. 18 weeks from mono-association). Both results are consistent when considering the direction of the effect, that death occurred more often in mice mono-associated with NC101Δ pks compared to NC101, with both effect size point estimates falling within the confidence interval of the other study. A meta-analysis (Figure 4) of these effects resulted in a HR of 1.92, 95% CI [1.11, 3.30], which was statistically significant (p = 0.0188) and implies that the null hypothesis that the survival distributions for the two cohorts are the same can be rejected.
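The pooling step can be sketched from the two published hazard ratios alone. The example below is a minimal inverse-variance meta-analysis on the log-HR scale with a DerSimonian-Laird between-study variance; the choice of that estimator, the recovery of standard errors from the rounded 95% CIs, and the function names are all assumptions for illustration, not the documented procedure of this study. Because the inputs are rounded, the output is expected to agree with the reported pooled estimate only approximately.

```python
import math

Z95 = 1.959964  # two-sided 95% normal quantile

def log_hr_and_se(hr, ci_low, ci_high):
    """Log hazard ratio and its standard error recovered from a 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z95)
    return math.log(hr), se

def random_effects(effects, ses):
    """DerSimonian-Laird random-effects pooling of log effects.

    Returns (pooled HR, CI low, CI high, two-sided p-value).
    """
    w = [1 / s ** 2 for s in ses]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)  # between-study variance
    w_re = [1 / (s ** 2 + tau2) for s in ses]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    z = pooled / se
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return (math.exp(pooled), math.exp(pooled - Z95 * se),
            math.exp(pooled + Z95 * se), p)

# Hazard ratios and 95% CIs as reported in the text above.
replication = log_hr_and_se(1.95, 1.10, 3.45)
original = log_hr_and_se(1.69, 0.31, 9.09)
hr, low, high, p = random_effects(
    [replication[0], original[0]], [replication[1], original[1]])
print(f"pooled HR = {hr:.2f}, 95% CI [{low:.2f}, {high:.2f}], p = {p:.4f}")
```

Under these assumptions the estimated between-study variance is zero, so the random-effects result coincides with a fixed-effect one, and the replication, with its much narrower confidence interval, carries most of the weight.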

This direct replication provides an opportunity to understand the present evidence of these effects. Any known differences, including reagents and protocol differences, were identified prior to conducting the experimental work and described in the Registered Report (Eaton et al., 2015). However, this is limited to what was obtainable from the original paper and through communication with the original authors, which means there might be particular features of the original experimental protocol that could be critical, but unidentified. So while some aspects, such as bacteria strain, mouse strain, and AOM dose were maintained, one aspect, experimental timing, was revealed during peer review of this Replication Study to be incorrect due to the methodology not being clearly stated in the original study, which hindered efforts to reproduce the original methodology. Thus, this replication attempt illustrates the need for methodology to be reported in sufficient detail to allow published research to be accurately compared, reproduced, and interpreted (Glasziou et al., 2014). Furthermore, other factors were unknown or not easily controlled for. These include variables such as mouse sex (Clayton and Collins, 2014), genetic heterogeneity of mouse inbred strains (Casellas, 2011), housing temperature in mouse facilities (Kokolus et al., 2013), differing compound potency and purity resulting from different stock solutions (Davis et al., 2012; Kannt and Wieland, 2016; Neufert et al., 2007), and genetic differences in the bacterial strains (Kuo et al., 2009). Environmental differences such as husbandry staff, bedding type and source, light levels, and other intangibles, all of which, by necessity, differed between the studies, also affect experimental outcomes with mice (Howard, 2002; Jensen and Ritskes-Hoitinga, 2007; Nevalainen, 2014; Sorge et al., 2014). Additionally, in this replication attempt, mice were housed in isocages rather than in bubble isolators. While the difference in caging did not affect the gnotobiotic status of the mice, subtle differences in housing could result in different outcomes. Differences in pathologists’ interpretation and quantification of histologic lesions are another source of variability between studies, necessitating clear delineation of criteria and terminology used for diagnosis, preferably by illustrative photomicrographs (Elmore et al., 2017; Ward et al., 2017). Whether these or other factors influence the outcomes of this study is open to hypothesizing and further investigation, which is facilitated by direct replications and transparent reporting.

Materials and methods

Key resources table
Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information
--- | --- | --- | --- | ---
Strain, strain background (Escherichia coli, NC101) | NC101 | doi:10.1126/science.1224820 | |
Strain, strain background (E. coli, NC101Δpks) | NC101Δpks | doi:10.1126/science.1224820 | |
Strain, strain background (Mus musculus, 129/SvEv, Il10-/-) | Germ-free Il10-/- | doi:10.1126/science.1224820 | | Germ-free mice
Commercial assay or kit | MoBio PowerMag Microbial DNA Isolation Kit | Qiagen | cat# 27200–4 |
Commercial assay or kit | NEBNext Ultra DNA Library Prep Kit for Illumina | New England BioLabs | cat# E7370 |
Chemical compound, drug | AOM | Sigma-Aldrich | cat# A5486 | lot# SLBN5975V
Software, algorithm | FastQC | http://www.bioinformatics.babraham.ac.uk/projects/fastqc | RRID:SCR_014583 | version 0.11.5
Software, algorithm | Trimmomatic | doi:10.1093/bioinformatics/btu170 | RRID:SCR_011848 | version 0.36
Software, algorithm | SPAdes | doi:10.1089/cmb.2012.0021 | RRID:SCR_000131 | version 3.5.0
Software, algorithm | ABACAS | doi:10.1093/bioinformatics/btp347 | RRID:SCR_015852 | version 1.3.1
Software, algorithm | Prokka | doi:10.1093/bioinformatics/btu153 | RRID:SCR_014732 | version 1.11
Software, algorithm | short-read Burrows-Wheeler Aligner | doi:10.1093/bioinformatics/btp324 | RRID:SCR_015853 | version 0.7.13
Software, algorithm | Picard | http://broadinstitute.github.io/picard | RRID:SCR_006525 | version 1.130
Software, algorithm | SAMtools and BCFtools | doi:10.1093/bioinformatics/btr509 | RRID:SCR_005227 | version 1.2
Software, algorithm | GATK’s VariantFiltration | doi:10.1002/0471250953.bi1110s43 | RRID:SCR_001876 | version 3.3.0
Software, algorithm | Artemis Comparison Tool | doi:10.1093/bioinformatics/bti553 | RRID:SCR_004507 | version 13.0.0
Software, algorithm | R Project for statistical computing | https://www.r-project.org | RRID:SCR_001905 | version 3.4.4

As described in the Registered Report (Eaton et al., 2015), we attempted a replication of the experiments reported in Figure 4A–F, and Supplemental Figure 7 and 10 of Arthur et al. (2012). A detailed description of all protocols can be found in the Registered Report (Eaton et al., 2015) and are described below with additional information not listed in the Registered Report, but needed during experimentation.

Figure 4. Meta-analysis of survival.

Effect size and 95% confidence interval are presented for Arthur et al. (2012), this replication attempt (RP:CB), and a random effects meta-analysis of those two effects. To directly compare and combine the results of both studies, survival data from the same timeframe were used (i.e. 18 weeks from mono-association). A HR greater than 1 indicates death occurred more often in NC101Δ pks compared to NC101, while a HR less than 1 indicates the reverse. Sample sizes used in Arthur et al. (2012) and this replication attempt are reported under the study name. The random effects meta-analysis compares the HR for NC101 mono-associated mice with that for NC101Δ pks mono-associated mice (meta-analysis p = 0.0188). Additional details for this meta-analysis can be found at https://osf.io/2raud/.

https://doi.org/10.7554/eLife.34364.012

Bacterial strains and growth conditions

E. coli NC101, isolated from the feces of an inbred 129S6/SvEv background mouse raised in SPF conditions (Kim et al., 2005), and NC101Δ pks (Arthur et al., 2012) were shared by the Arthur lab (University of North Carolina at Chapel Hill). Bacteria from an overnight culture were washed and diluted to approximately 10^8 CFU prior to inoculation into Luria-Bertani (LB) broth. Bacteria were grown at 37˚C in room air with shaking for all experiments.

In vitro bacterial growth assay

E. coli strains that were grown overnight (12–16 hr) were used to inoculate 10 ml of LB broth at 1:5, 1:50, 1:500, and 1:5000 dilutions. Starting at the time of inoculation, cultures were plated in a 96 well plate in technical triplicate and absorbance at 600 nm was measured every 30 min using a microplate spectrophotometer. Plates were maintained at 37˚C during the assay. LB broth was used to determine the background, which was subtracted from the readings. Technical repeats were averaged for each biological repeat. To summarize the growth characteristics (e.g. doubling time), values for each biological repeat were fit to the standard form of the logistic equation common in ecology and evolution using the Growthcurver R package (Sprouffske and Wagner, 2016) and R software (RRID:SCR_001905), version 3.4.4 (R Core Team, 2018).
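The data reduction and the logistic model mentioned above can be illustrated with a short sketch. It shows background subtraction, averaging of technical replicates, the standard logistic growth equation, and the doubling time implied by the growth rate. It does not perform the curve fitting itself, which the study did with Growthcurver in R. All function names and numeric values below are hypothetical and are not taken from the study.

```python
import math


def corrected_mean(replicates, blank):
    """Average technical replicates after subtracting the LB-only background."""
    return sum(od - blank for od in replicates) / len(replicates)


def logistic(t, k, n0, r):
    """Logistic growth: K / (1 + ((K - N0) / N0) * exp(-r * t))."""
    return k / (1 + ((k - n0) / n0) * math.exp(-r * t))


def doubling_time(r):
    """Doubling time implied by the intrinsic growth rate r (ln 2 / r)."""
    return math.log(2) / r


# Hypothetical parameters for illustration only.
k, n0, r = 1.2, 0.01, 1.4                    # carrying capacity, initial OD600, rate per hour
times = [0.5 * i for i in range(25)]         # readings every 30 min for 12 hr
curve = [logistic(t, k, n0, r) for t in times]

print(corrected_mean([0.31, 0.29, 0.30], blank=0.05))
print(doubling_time(r))
```

Comparing the fitted `r` (or the derived doubling time) between strains at each starting dilution is one way to summarize whether deleting the pks island changes growth kinetics.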

PCR detection of pks island

Bacterial DNA was isolated from overnight cultures using a DNeasy Blood and Tissue kit according to manufacturer’s instructions (Qiagen, cat# 69504). Fecal material was processed for DNA extraction by resuspending pellets in lysis buffer supplemented with 20 mg/ml lysozyme, incubated at 37˚C for 30 min and then supplemented with 10% SDS and 350 µg/ml proteinase K. DNA was extracted with a DNeasy Blood and Tissue kit according to the manufacturer’s instructions. DNA was quantified using a NanoDrop spectrophotometer (Thermo Fisher Scientific, cat# 2000C). PCR was performed on an MJ Mini Gradient Thermal Cycler (BioRad, cat# PCT-1148) with Opticon Monitor software (RRID:SCR_014241). PCR reactions were performed using primers specific for the 5’ and 3’ end of the pks island, colibactin, and 16S rRNA, with sequences listed in the Registered Report (Eaton et al., 2015). Reaction volumes were 50 µl and consisted of 5 µl 10X Taq polymerase Master Mix supplemented with 1.5 mM Mg2+ and 0.5 µl Taq polymerase (Sigma-Aldrich, cat# D9307), 0.05 mM dNTPs, 0.05 µM forward and reverse primers, and 2 µl DNA diluted in water. A negative control without DNA was also included. PCR cycling conditions were: 1 cycle of 95°C for 5 min; 35 cycles (or 27 cycles) of 95°C for 45 s, 56°C for 45 s, and 72°C for 45 s; and 1 cycle of 72°C for 10 min. PCR reactions were run on a 1.5% agarose gel to determine whether a product of the expected size was produced.

Genome sequencing data processing and assembly

Bacterial DNA was extracted with the MoBio PowerMag Microbial DNA Isolation Kit (Qiagen, cat# 27200-4) according to the manufacturer’s instructions and prepared for sequencing on an Illumina MiSeq instrument (San Diego, California) (MiSeq run parameters can be found at https://osf.io/fnu62/) using the NEBNext Ultra DNA Library Prep Kit for Illumina (New England BioLabs, cat# E7370) and sample-specific barcoding. Library preparation and sequencing were performed at the Center for Microbial Systems at the University of Michigan. Samples were evaluated for contamination and excessive low quality sequence with FastQC (RRID:SCR_014583), version 0.11.5 (Andrews, 2016), and processed using Trimmomatic (RRID:SCR_011848), version 0.36 (Bolger et al., 2014), to trim low quality bases and remove reads with poor average quality scores. De novo genome assemblies were generated for each sample by running SPAdes (RRID:SCR_000131), version 3.5.0 (Bankevich et al., 2012), on trimmed sequencing reads. For comparison, the assembly for NC101Δ pks was ordered relative to NC101 using ABACAS (RRID:SCR_015852), version 1.3.1 (Assefa et al., 2009), and base genome annotation was assigned with Prokka (RRID:SCR_014732), version 1.11 (Seemann, 2014).

Variant detection

Variants were identified by: (1) mapping filtered reads from NC101Δ pks to the assembled NC101 reference genome (GenBank Accession number: AM229678.1) using the short-read Burrows-Wheeler Aligner (BWA) (RRID:SCR_015853), version 0.7.13 (Li and Durbin, 2009), (2) discarding PCR duplicates with Picard (RRID:SCR_006525), version 1.130 (http://broadinstitute.github.io/picard), and (3) calling variants with SAMtools and BCFtools (RRID:SCR_005227), version 1.2 (Li, 2011). Variants were filtered from raw results using GATK’s (RRID:SCR_001876) VariantFiltration, version 3.3.0 (QUAL > 100, MQ > 50, >10 reads supporting variant, FQ < 0.025) (Van der Auwera et al., 2013). In addition, a custom python script (https://osf.io/jgqdb/) was used to filter out single nucleotide variants that were: (1) < 5 bp in proximity to indels or (2) < 10 bp in proximity to another variant.
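The filtering logic described above can be sketched as follows. This is not the authors' custom script (which is available at the OSF link above). It is a minimal illustration that assumes variants have already been parsed into dictionaries with hypothetical keys (`pos`, `type`, `qual`, `mq`, `depth`, `fq`), that proximity is evaluated only among variants passing the quality thresholds, and that "another variant" refers to another single nucleotide variant.

```python
def passes_quality(v):
    """Thresholds quoted in the text: QUAL > 100, MQ > 50, > 10 supporting reads, FQ < 0.025."""
    return v["qual"] > 100 and v["mq"] > 50 and v["depth"] > 10 and v["fq"] < 0.025


def filter_snvs(variants, indel_window=5, variant_window=10):
    """Return positions of SNVs that pass quality and both proximity filters."""
    good = [v for v in variants if passes_quality(v)]
    indels = [v["pos"] for v in good if v["type"] == "indel"]
    snvs = [v["pos"] for v in good if v["type"] == "snv"]
    kept = []
    for pos in snvs:
        # Drop SNVs closer than indel_window bp to an indel
        near_indel = any(abs(pos - p) < indel_window for p in indels)
        # Drop SNVs closer than variant_window bp to another SNV
        near_snv = any(p != pos and abs(pos - p) < variant_window for p in snvs)
        if not (near_indel or near_snv):
            kept.append(pos)
    return kept


def make(pos, vtype="snv", qual=200):
    """Build a hypothetical variant record for the example."""
    return {"pos": pos, "type": vtype, "qual": qual, "mq": 60, "depth": 30, "fq": 0.0}


example = [make(100), make(106), make(503, "indel"), make(500),
           make(900), make(1500, qual=50), make(2000)]
kept_positions = filter_snvs(example)
print(kept_positions)
```

In the example, the SNVs at 100 and 106 remove each other, the SNV at 500 is removed because of the nearby indel, and the SNV at 1500 fails the quality threshold.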

Large indel detection

To identify genomic regions differing between NC101 and NC101Δ pks, bi-directional BLAST queries were performed between the contigs in the genome assemblies. Regions found to be unique to either genome were verified by mapping reads using BWA and visually verifying that no reads map to putative unique genomic regions using the Artemis Comparison Tool (RRID:SCR_004507), version 13.0.0 (Carver et al., 2005).

AOM/Il10-/- animal model

All animal procedures were approved by the University of Michigan IACUC# 7291 and were in accordance with the University of Michigan’s policies on the care, welfare, and treatment of laboratory animals. Histopathology scoring of inflammation and tumorigenesis was performed blinded. Mice were randomized for mono-association.

Germ-free (GF) Il10-/- mice of the 129S6/SvEv background were originally from the National Gnotobiotic Rodent Resource Center at the University of North Carolina, Chapel Hill and shipped in germ-free shipping containers (Taconic) to the Germ-Free & Gnotobiotic Mouse Facilities at the University of Michigan. Mice for this study were born and raised in GF isolators until they reached the age of 7–12 weeks. GF status was verified by bacterial culture, Gram stain, mold trap, and 16S bacterial PCR. Gram stain and bacterial culture were performed at every isolator entry. Mice were aseptically removed from the isolators, randomly assigned to be mono-associated with NC101 or NC101Δ pks, and housed in sterile isocages (Tecniplast), where they stayed throughout the study. After the mice were moved to the isocages, gnotobiotic status was verified weekly by Gram stain and bacterial culture, while mold traps were monitored daily. Early in the study a few isocages were contaminated and had to be removed from the study. This did not affect the other isocages, which remained gnotobiotic.

A similar distribution of male and female mice was assigned to both strains (NC101: female = 21, male = 18; NC101Δ pks: female = 23, male = 22). Mice of the same sex and mono-associated with the same bacteria strain were caged together (2–4 mice per isocage). Mice were mono-associated by oral gavage and rectal swabbing with 200 µl of an overnight log phase bacterial culture at a concentration of 2 × 10^9 colony forming units (CFU)/ml. Gnotobiotic status was verified weekly by Gram stain and culture of fecal contents. For culture, sterile swabs were used to transfer fecal material to sheep blood agar plates, which were incubated at 37°C under aerobic or anaerobic conditions. For Gram staining, swabs with fecal material were transferred to glass slides and spread evenly in a thin layer. Slides were air-dried, heat-fixed, and Gram stained using a BBL Gram Stain Kit according to manufacturer’s instructions (BD Biosciences, cat# BD 212539). Throughout the experiment, mice were observed at least once daily and weighed weekly. Moribund and dead animals were necropsied unless autolysis precluded any interpretation. Mice were offered Purina Lab Diet 3500 (the same diet that was used in the original study) and sterile water ad libitum. They were housed on Tek-Fresh bedding (Envigo) and offered Enviro-dri nesting material (Shepherd Specialty Papers) as enrichment.

Four weeks after mono-association, colonization and bacterial strain were verified by culture and PCR. At the same time, weekly intraperitoneal injections of 10 mg/kg AOM diluted to a final concentration of 2.5 mg/ml (Sigma-Aldrich, cat# A5486, lot# SLBN5975V) were initiated and continued until mice received six injections. The same lot of AOM was used for the entire study with 25 mg/ml aliquots stored at −80˚C until use. One vial was used for each injection day and any remainder discarded to avoid unnecessary freeze-thaw cycles. Mice were euthanized and necropsied 18 weeks after the sixth AOM injection as prespecified in the Registered Report (Eaton et al., 2015), or when they became moribund. At necropsy, gross lesions were recorded and photographed, if present, and as far as possible, samples were collected for culture and PCR of cecal contents, and for histopathologic evaluation of colon lesions. Colon lesions were scored for all mice that survived to 14 weeks or more after the last AOM injection. The experimental timeline is illustrated in Figure 2A.
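The dosing arithmetic implied by the figures above (10 mg/kg, 2.5 mg/ml working solution, 25 mg/ml stock aliquots) can be written out as a small worked example. The function names and the example body weight are hypothetical, and the sketch is not a dosing protocol.

```python
def aom_dose_mg(body_weight_g, dose_mg_per_kg=10.0):
    """Dose in mg for a mouse of the given weight."""
    return body_weight_g / 1000.0 * dose_mg_per_kg


def injection_volume_ul(body_weight_g, dose_mg_per_kg=10.0, conc_mg_per_ml=2.5):
    """Injection volume in microliters at the working concentration."""
    return aom_dose_mg(body_weight_g, dose_mg_per_kg) / conc_mg_per_ml * 1000.0


def stock_dilution_factor(stock_mg_per_ml=25.0, working_mg_per_ml=2.5):
    """Fold dilution needed to go from stock aliquot to working solution."""
    return stock_mg_per_ml / working_mg_per_ml


# Hypothetical 25 g mouse
print(aom_dose_mg(25.0), injection_volume_ul(25.0), stock_dilution_factor())
```

At these values the injection volume scales as 4 µl per gram of body weight, so a hypothetical 25 g mouse would receive 0.25 mg AOM in about 100 µl.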

Determination of E. coli CFU

To quantify cultures for mono-association, bacteria were cultured overnight (12–16 hr) at 37°C in LB broth to log phase growth and CFU/ml was estimated based on OD600 readings. Cultures were adjusted to the desired density based on the OD reading and mice were mono-associated. For precise determination of CFU/ml, an aliquot of the culture was quantified by serial dilution (10-fold dilutions in LB broth). 100 µl of each dilution was plated on LB agar and incubated at 37°C for 24 hr under aerobic conditions. On plates with discrete colonies, the number of colonies was counted and results were expressed as CFU/ml of contents.

To quantify colonization of mice, samples of feces were aseptically collected into pre-weighed, sterile tubes (average weight of feces was 0.05 g). Samples were resuspended as a slurry in 1 ml sterile LB broth and serially diluted (10-fold dilutions in LB broth). 100 µl of each dilution was plated on LB agar and incubated at 37°C for 24 hr under aerobic conditions. On plates with discrete colonies, the number of colonies was counted and results were expressed as CFU/ml of contents.
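The plate-count calculation used in both procedures above is standard arithmetic and can be sketched as follows. The function name and the example colony count are hypothetical. The dilution is expressed as a fraction (for example 1e-6 for the sixth 10-fold dilution), and the plated volume defaults to the 100 µl stated in the text.

```python
def cfu_per_ml(colonies, dilution, plated_volume_ml=0.1):
    """CFU/ml of the undiluted sample from a count on one countable plate.

    colonies: number of discrete colonies counted
    dilution: dilution of the plated sample as a fraction (e.g. 1e-6)
    plated_volume_ml: volume spread on the plate, 0.1 ml (100 µl) by default
    """
    return colonies / (dilution * plated_volume_ml)


# Hypothetical count: 45 colonies on the plate from the 10^-6 dilution
estimate = cfu_per_ml(45, 1e-6)
print(f"{estimate:.2e} CFU/ml")
```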

Histopathology

Mice were sacrificed at the indicated time points and colon (proximal, mid-proximal (transverse), distal) and cecum, liver, spleen, and any gross lesions were collected for histological evaluation. Colons were blindly examined macroscopically for tumors by a board-certified veterinary pathologist. Tissues were fixed in 10% neutral buffered formalin for 24–48 hr, with colon tissue Swiss-rolled from the proximal to the distal end. The fixed tissue was embedded in paraffin, sectioned at five microns, and stained with hematoxylin and eosin (H&E) as described in the Registered Report (Eaton et al., 2015). Individual sections were blindly examined microscopically (Olympus BX41) by a board-certified veterinary pathologist. Colon sections were scored for inflammation (Rath et al., 1996) and invasion using the same scoring criteria as the original study (Arthur et al., 2012) and specified in the Registered Report (Eaton et al., 2015). Histopathology scoring, images, and additional protocol details are available at https://osf.io/pm5xa/.

Scoring of inflammation was based on a study by Rath et al. (1996) where inflammation was semi-quantified. Criteria are summarized as follows: score 1 = increased inflammation in the lamina propria, decreased goblet cells and mucosal thickening (all mild); score 2 = moderately increased inflammation, decreased goblet cells and mucosal thickening with the addition of mild submucosal inflammation; score 3 = severely increased inflammation, decreased goblet cells and mucosal thickening with moderate submucosal inflammation and mild destruction of architecture; score 4 = severely increased inflammation, decreased goblet cells and mucosal thickening, and moderate destruction of architecture; score 4.5–6 was used based on the presence of ulcers and crypt abscesses. For the current study, increased mucosal thickening was interpreted to mean mucosal hypertrophy, and destruction of architecture was interpreted to mean atrophy or loss of epithelial cells or glands, fibrosis, or collapse of lamina propria. The entire length of the colon was examined and an overall score assigned. The scoring system for inflammation is illustrated in Figure 3—figure supplement 2.

Dysplasia was scored as described in the original study with minor clarifications. Low and high-grade dysplasia, intra-epithelial neoplasia, adenoma, herniation, invasion, and adenocarcinoma, not defined in the original publication, were here defined based on published criteria (Boivin et al., 2003; Washington et al., 2013). Invasion was distinguished from herniation based on the level of scirrhous response and cellular atypia (Boivin et al., 2003; Washington et al., 2013). Altered crypt foci, a gross characteristic, was not evaluated in this study. Gastrointestinal intraepithelial neoplasia was interpreted to mean carcinoma in situ, also referred to as small non-invasive adenomas or individually transformed crypts (Washington et al., 2013). The scoring system for neoplasia and invasion is illustrated in Figure 3—figure supplement 3. Neoplasia was scored taking into account the entire colon section and not simply the most severe lesion, and summarized as follows: 0 = no dysplasia; 1 = mild dysplasia characterized as aberrant crypt foci, +0.5 for multiples; 2 = moderate dysplasia characterized as gastrointestinal neoplasia, +0.5 for multiples; 3 = severe or high grade dysplasia characterized as adenoma, restricted to the mucosa; 4 = invasive adenocarcinoma, invading into or through the muscularis mucosa; and 5 = fully invasive adenocarcinoma, full invasion through the submucosa and into or through the muscularis propria (Arthur et al., 2014).

Statistical analysis

Statistical analysis was performed with R software (RRID:SCR_001905), version 3.4.4 (R Core Team, 2018). All data, csv files, and analysis scripts are available on the OSF (https://osf.io/y4tvd/). Confirmatory statistical analysis was pre-registered (https://osf.io/yt9ki/) before the experimental work began as outlined in the Registered Report (Eaton et al., 2015) with any other analysis indicated as exploratory. Data were checked to ensure assumptions of statistical tests were met. The nonparametric Wilcoxon-Mann-Whitney test was used for the inflammation and tumorigenesis scoring analysis because ordinal scoring data do not meet the assumption of a normal distribution (Gibson-Corley et al., 2013). That is, while a number is used for the scoring it represents non-numeric concepts like ‘severe or high grade dysplasia characterized as adenoma, restricted to the mucosa’. Thus, nonparametric approaches are the best way to describe results from these data, especially with small sample sizes (Baker et al., 2014). The asymmetric confidence intervals for the overall Cliff’s d estimate were determined using the normal deviate corresponding to the (1-alpha/2)th percentile of the normal distribution (Cliff, 1993). A meta-analysis of a common original and replication effect size was performed with a random effects model and the metafor R package (Viechtbauer, 2010) (https://osf.io/2raud/). The original study data presented in Figure 4A–E were extracted a priori from the published Figure by estimating the value of each symbol based on the scoring criteria described in the original study methods and shared by the original authors. The data were published in the Registered Report (Eaton et al., 2015) and used in the power calculations to determine the sample size for this study. To provide a comparison of the replication results to the original study for in vitro bacterial growth and animal survival, the values reported in the original study in Supplemental Figure 7 and 10 were estimated.
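As an illustration of the effect size used for the ordinal scoring data, the sketch below computes the Cliff's d point estimate and the Mann-Whitney U statistic by direct pairwise comparison, and shows how the two are related. It is not the authors' analysis code (available on OSF) and it does not reproduce the asymmetric confidence interval of Cliff (1993). The score vectors are hypothetical and are not data from either study.

```python
def cliffs_d(x, y):
    """Cliff's d: P(x > y) - P(x < y), estimated over all pairs."""
    greater = sum(1 for xi in x for yj in y if xi > yj)
    less = sum(1 for xi in x for yj in y if xi < yj)
    return (greater - less) / (len(x) * len(y))


def mann_whitney_u(x, y):
    """U statistic for x, counting each tie as one half."""
    return sum(1.0 if xi > yj else 0.5 if xi == yj else 0.0
               for xi in x for yj in y)


# Hypothetical ordinal scores for two groups
group_a = [3, 4, 4.5, 5]
group_b = [2, 3, 3, 4]

d = cliffs_d(group_a, group_b)
u = mann_whitney_u(group_a, group_b)
# Cliff's d and U carry the same information: d = 2U / (n * m) - 1
d_from_u = 2 * u / (len(group_a) * len(group_b)) - 1
print(d, u, d_from_u)
```

Because both quantities depend only on the ordering of scores, neither treats the numeric labels as interval-scaled measurements, which is the reason given above for preferring nonparametric methods.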

Data availability

Additional detailed experimental notes, data, and analysis are available on OSF (RRID:SCR_003238) (https://osf.io/y4tvd/; Eaton et al., 2018). This includes the R Markdown file (https://osf.io/ektn3/) that was used to compose this manuscript, which is a reproducible document linking the results in the article directly to the data and code that produced them (Hartgerink, 2017). The whole genome sequencing data generated during this study have been deposited at NCBI SRA under the BioProject accession PRJNA481682. The genome assemblies have been deposited at GenBank under the accessions QVAD00000000 and QVAE00000000.

Deviations from registered report

The in vitro bacterial growth assay was performed at multiple dilutions in addition to the 1:500 dilution specified in the Registered Report. This was done to test if there was an impact of the starting density on growth kinetics. The number of mice enrolled in the study was increased from an estimated total of 30 to 84. This was largely due to early deaths during AOM treatment and an attempt to obtain the prespecified number of mice 18 weeks after the last AOM injection. Scoring of colon sections for neoplasia was performed using a slight variation of the scale listed in the Registered Report and methods of the original study (Arthur et al., 2012) to ensure a direct comparison was made between the original study and this replication. The values assigned to ‘adenocarcinoma, invasion through the muscularis mucosa’ were changed from 3.5 to 4, while the values assigned to ‘adenocarcinoma, full invasion through the submucosa and into or through the muscularis propria’ were changed from 4 to 5. These changes account for the reported scores of 5 in Figure 4B of Arthur et al. (2012), which would not be possible under the unmodified scale, and match the scale described in a more recent paper by the original authors (Arthur et al., 2014). The statistical analyses proposed in the Registered Report for the scoring and survival data could not be performed due to differences in experimental timing that we were informed about during peer review of this Replication Study manuscript, but that were not revealed during informal review and feedback by the authors of the original paper during experimental planning or during peer review of the Registered Report. The exploratory analyses for survival took into account the entire study period (mono-association to 18 weeks after the last AOM treatment) and the same period as the original study (18 weeks from mono-association). The latter analysis was used in the meta-analysis of the two studies. The exploratory analyses for the scoring data were performed using nonparametric tests as described above. This differs from the original study, which used parametric tests to analyze ordinal scoring data. Since we observed a higher death rate for mice mono-associated with NC101Δ pks compared to NC101, we performed whole genome sequencing on the two E. coli strains to examine if any genetic differences existed beyond the deletion of the pks island. Additional materials and instrumentation not listed in the Registered Report, but needed during experimentation, are also listed.

References

  1. Andrews S (2016). FastQC: A Quality Control Tool for High Throughput Sequence Data. Cambridge, UK: Bioinformatics Group at the Babraham Institute.
  16. R Core Team (2018). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.
  22. Eaton K, Pirani A, Snitkin ES, Iorns E, Tsui R, Denis A, Perfito N, Errington TM (2018). Study 41: replication of Arthur et al., 2012 (Science). OSF, 10.17605/OSF.IO/Y4TVD.
  27. Hartgerink CHJ (2017). Composing reproducible manuscripts using R markdown. eLife. Accessed January 20, 2017.
  52. Van der Auwera GA, Carneiro MO, Hartl C, Poplin R, Del Angel G, Levy-Moonshine A, Jordan T, Shakir K, Roazen D, Thibault J, Banks E, Garimella KV, Altshuler D, Gabriel S, DePristo MA (2013). From FastQ data to high confidence variant calls: the genome analysis toolkit best practices pipeline. Current Protocols in Bioinformatics 43:11.10.1–11.1033.
  53. Viechtbauer W (2010). Conducting Meta-Analyses in R with the metafor package. Journal of Statistical Software, 36.

Decision letter

In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.

Thank you for submitting your article "Replication Study: Intestinal inflammation targets cancer-inducing activity of the microbiota" for consideration by eLife. Your article has been reviewed by Wendy Garrett as the Senior Editor, a Reviewing Editor, and three reviewers. The reviewers have opted to remain anonymous.

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

Summary:

In this replication work, several features of the original publication by Arthur et al. were replicated including similar mouse colonization with PKS+ or ΔPKS NC101 E. coli; no effect of the ΔPKS on bacterial growth; and strain confirmation including PKS island deletion. However, for the critical comparisons of inflammation, invasion and neoplasia, the replication attempt did not model the kinetics of the mouse model as reported by Arthur et al. This appeared to be due to at least two technical features including increased murine death likely associated with the AOM batch used in the experiments, a known but not highly visible understanding from the literature and use of an experimental evaluation time course differing from the Arthur et al., paper. Further, contamination of isolators occurred that further limited the data interpretation. Thus, as conducted, the study did not achieve replication of the conditions of the paper of Arthur et al. As presented by the reviewer comments, the paper discussion and conclusion should be revised to focus on critical technical and analytical issues that emerged and must be considered by other investigators if working with this model and these bacterial strains.

Essential revisions:

As pointed out by all reviewers, the attempt at replication primarily revealed the technical aspects that must be controlled for in working with this mouse model in future experiments. A careful standardization of the protocol/disease course is required in individual laboratories before comparing different E. coli strains. Key features of the model to be verified before strain testing include low mortality rates, distinguishable individual tumors at the endpoint and additional controls including non-colonized AOM-injected-only control mice and mono-colonization-only and possibly AOM-untreated control mice. Revising the paper to focus on the technical limitations rather than that the replication attempt led to results differing from the original paper seems prudent. Ultimately, since the biology and kinetics of the Arthur et al., model were not faithfully replicated, these data are unable to address whether, under the conditions of the original paper, NC101 and ΔPKS NC101 differ in inflammation, invasion and neoplasia. This final point should be made clear to the readers of the paper.

Reviewer #1:

Eaton et al., try to reproduce key findings from Arthur et al., 2012 paper.

All of the experiments proposed in the initial eLife Registered Report have been performed and analyzed. The observed results and conclusions according to the authors are the following:

1) Authors established colonization of germ free Il10-/- mice with the PKS+ or ΔPKS NC101 E. coli as in original paper.

2) AOM injection in the colonized mice induced colorectal tumorigenesis as in original paper.

3) Similarly to Arthur et al., deletion of PKS island in E. coli genome did not affect colonization and bacterial growth of E. coli in germ free mice.

4) Authors conclude that ΔPKS induces similar level of inflammation (agreement with original study) and similar level of tumorigenicity (multiplicity, invasion- in disagreement with original study) in comparison with PKS+ colonized animals.

5) Authors claim that overall inflammation and other histopathological observations were more severe in the replication attempt.

6) Overall mortality of mice throughout the experiment was much higher in the replication attempt.

7) Meta-analyses suggest that there is no difference between ΔPKS and PKS+ colonized mice, contrary to the original paper by Arthur et al., 2012.

8) Authors hint at the conclusion that the main observation of Arthur et al. paper is not reproduced (more neoplasia, tumors and invasion in PKS+ colonized mice) but also conclude that results seen in replication attempt may be confounded by the increased severity of inflammation and mortality of animals during the experiment.

Overall authors performed all of the experiments proposed. Data analysis is solid. This reviewer is not an expert biostatistician, so no professional comments on the statistics part can be made.

Specific comments:

1) Mortality seen on Figure 2A. These curves may represent two different types of mortalities (causes)- AOM induced mortality early on and then inflammation and/or tumor load induced mortality later.

Early mortality in the Replication experiment is clearly higher than in Arthur et al. (>50% vs less than 30%).

What is presented on Figure 2B may be very similar to the data on Supplementary Figure 10 by Arthur et al., (although original paper mortality curve is not well described in terms of what timepoints there are plotted), if mortality between day 0 and 10 is still attributed to the action of AOM. Then long time there is no mortality (days 20 to 80) in both Replication experiment and in Arthur et al. Then mice in Arthur et al., seemingly start to die, but they are immediately collected (all of them), and experiment stopped, so we do not know what happens to mortality. In Replication experiment past day 90 mortality starts but mice are euthanized according to symptoms or dying, so there is an impression of high mortality.

2) One important discrepancy here is that in Arthur et al., 14 wk and 18 wk point mice were collected when they still were not dying, indicating that probably there is still a room for a difference in inflammation and tumor load/invasion.

In Replication attempt, it seems from the Figure legend that mice were mostly collected and "declared dead" when they were already sick (right thing to do in terms of following IACUC regulation) but one could think that by the time PKS+ and ΔPKS mice are already equally very sick, tumor load and inflammation may be already too high and indistinguishable. Indeed, authors note that overall role of inflammation, tumor load and mortality in their experiment was higher than in initial experiment.

3) Another thing noted during initial planning of experiments and review of Registered Report is that AOM quality may vary. Indeed, different lots of AOM may have different amount of "acting" AOM and sometimes it's too much- more toxicity/mortality and more tumors in residual mice or poor decayed AOM- more toxicity/mortality and less tumors at the end. Seemingly, in Replication report higher mortality (presumably AOM induced) and higher tumor load is seen.

4) Since AOM metabolism and potentially bacterial and liver metabolism may be involved, the diets used for germ free mice should be compared and that should be discussed.

5) Using Chi- values for tumor numbers in Replication study when tumors cannot be enumerated (why?) further complicates the interpretation.

6) Another important caveat which should be acknowledged also stems from high mortality. When mortality is that high, there may be a natural selection for mice with lower inflammatory response and lower tumor load, and the mice where potentially there was (could be) a difference have been kept for too long under these conditions of inflammation and AOM lot. This is not a very scientific argument, I know, but after this Replication attempt is published, the field will need all of the possible scenarios/explanations discussed.

7) Inflammation is known to promote cancer. PKS+ bacteria is proposed to be more carcinogenic than PKSΔ. However, under the conditions of stronger inflammation (observed in Replication attempt), Inflammation arguably can become more important than PKS status. PKS is genotoxic, but so are the products of excessive inflammatory responses (ROS, RNI etc.). One could argue that under stronger inflammation the presence or absence of PKS becomes less relevant as for its effect towards tumor progression. Therefore, comparison of this (Replication) and Arthur et al. studies with regard to tumor characteristics only makes sense if the same levels of inflammation are achieved.

In my view, for these experiments the technical attempt to reproduce the results has been attempted but based on the confounded data it is difficult to judge whether important conceptual results are reproduced or not (i.e. whether PKS+ is superior at tumor progression).

The only thing which can be concluded, in my view, is that during the Replication attempt, the team did not achieve the same levels of AOM toxicity/action, the same levels of inflammation and tumor load (more); and observed higher mortality. Under these circumstances it is still not clear whether PKS+ E. coli induces more tumors and more aggressive tumors; than ΔPKS controls.

Reviewer #2:

The authors of this replication study of the paper by Arthur et al. discuss in detail all experiments performed in their study based on the initial registered replication plan and also justify all changes made compared to that protocol. They have also included in their Introduction all relevant articles published since the Registered Report was published.

The major experimental findings of the study are the following:

1) In Figure 1 the authors compare in vitro the growth of E. coli strains NC101 and NC101𝛥pks (both obtained from the laboratory of the original study) and observe that pks deletion does not affect the growth curve, in agreement with what was reported by Arthur et al. Therefore, the deletion of the pks island does not affect E. coli growth in vitro.

2) The authors verify that E. coli NC101𝛥pks bears a genetic deletion of the pks island and shows no other genetic variants or indels compared to the NC101 strain. (Figure 1—figure supplement 1).

3) The authors validate that E. coli strains NC101 and NC101𝛥pks display a similar growth in vivo in monocolonized germ-free Il10-/- mice, as assessed by CFU/ml in the feces. (Figure 1—figure supplement 1).

4) In Figure 2 the authors compare the survival curves of germ-free Il10-/- mice that are mono-associated with E. coli NC101 or E. coli NC101𝛥pks. Extensive lethality was observed in mice monocolonized with both strains, starting after the first AOM injection. No statistically significant differences were observed between mice colonized with the two strains in terms of survival over the course of the experiment.

5) In Figure 3 the authors count the number of tumors/mouse in surviving mice in which individual tumors could be distinguished and also evaluate the extent of inflammation in them.

6) In Figure 4 the authors present the results of meta-analyses of their results and those of Arthur et al., as described in the registered replication plan.

Major points:

The major aim of the present study is to compare the potential of the E. coli strains NC101 and NC101𝛥pks in inducing intestinal inflammation and tumor formation in monocolonized GF Il10-/- mice in order to evaluate the reproducibility of the experiments published by Arthur et al. For this comparison to be performed the kinetics of the model should be similar to what was reported by Arthur et al. In the present study mice mono-colonized with both E. coli strains display a much higher lethality and an exacerbated phenotype by histopathology as compared to the report by Arthur et al. A very low number of mice met the endpoint of the study. Because of these issues no conclusions can be drawn regarding the potential of E. coli NC101 vs NC101𝛥pks to induce intestinal inflammation and tumor formation in monocolonized GF Il10-/- mice.

In the Abstract the authors first mention that "Mono-association […] resulted in similar levels of intestinal inflammation and tumorigenesis; whereas the original study reported decreased tumor multiplicity […]" and secondly, they mention that this replication study showed more severe histopathological observations and a much lower survival rate compared to the original study. This structure of the Abstract is misleading to the reader. The strong differences in the kinetics of the experiment compared to the original report and the lack of power of the current study to assess the reproducibility of the original findings should be mentioned first. The observations regarding tumor number and histopathological features could be discussed as indicative of a lack of difference in the context of a severely exacerbated disease course.

In my opinion, the major conclusion of this study is that the AOM/E. coli NC101 Il10-/- mono-colonization model is characterized by important technical limitations that should be carefully controlled in future studies. A careful standardization of the protocol/disease course is required in individual laboratories before comparing different E. coli strains. This should lead to low mortality rates and distinguishable individual tumors at the endpoint. Including non-colonized AOM-injected-only control mice and mono-colonization-only, AOM-untreated control mice is essential for the correct interpretation of results in this model. Such factors should be considered for the experimental design of future studies.

Additional points:

The comparison of the death rates between this study and Arthur et al. is vague in this part of the manuscript despite the fact that this is the major limitation of this replication attempt. The authors should clearly mention the percentages of mice meeting the endpoint in the two studies.

Mortality was highest during AOM treatment independently of the bacterial strain used, and mice started dying right after the first AOM injection. The interpretation of these observations is difficult because of the absence of AOM-only controls: lethality over the injection period may be an effect of AOM toxicity alone or of AOM in combination with the effect of E. coli. It may also be an effect of the injection process itself (bleeding of internal organs for example). This is an additional technical variable that should be mentioned in the Discussion.

The authors mention that mortality during AOM injections was greater in mice colonized with NC101𝛥pks. Is this difference statistically significant? This should be mentioned and if significant further discussed.

The authors discuss sex-related differences which may confound the differences observed by Arthur et al. where 4 female and 8 male mice mono-associated with NC101 and 8 male mice mono-associated with NC101𝛥pks were analyzed. What is the result of these analyses if only male mice are considered? (n = 8 vs 8)?

The study is not powered enough to draw conclusions from the histopathological analyses because of the small number of mice resulting from an exacerbated disease course in the experiment.

Figure 2A: Log-rank test not shown in the legend.

Reviewer #3:

In this replication study, the authors conducted the experiments under conditions as close as possible to the original study; correctly reported everything they observed, including both the experiments that they were able and unable to replicate; and reasoned that their inability to reproduce the intestinal inflammation and tumorigenesis results may be because overall disease development in this replication study was much more severe than in the original study.

It appears that the authors of this replication study tried their best to conduct the experiments under conditions as close as possible to the original study. For instance, bacterial strains were shared by the Arthur Lab (subsection “Bacterial strains and growth conditions”), germ-free Il10-/- mice were derived from the same germ-free colony used in the original study (subsection “Intestinal tumorigenesis and inflammation of germ-free Il10-/- mice mono-associated with E. coli NC101 or NC101𝛥pks”), and the AOM dose used, 10 mg/kg, was the same as in the original study (the same vendor, same catalog number, but with a different lot number). AOM administration (the same vendor, same catalog number, same dosage used, but with a different lot number) resulted in a significantly higher mortality rate in this replication study compared with the original study. As cited by the authors (subsection “Intestinal tumorigenesis and inflammation of germ-free Il10-/- mice mono-associated with E. coli NC101 or NC101𝛥pks”), a previous study (Bissahoyo, 2005) showed that 33 mice were treated with 10 mg/kg of AOM, and "no premature loss of mice" was observed at "six months after the first AOM dose". This appears in line with the original study and indicates that 10 mg/kg of AOM should not cause a substantial mortality rate; in contrast, AOM administration resulted in a significantly higher mortality rate in this replication study, which could be due to differences in AOM lots, mouse facilities, microbiota, etc. It appears reasonable that the difference in the effects of NC101 and NC101 Δpks on intestinal inflammation and tumorigenesis, as reported in the original study, could not be replicated given the more severe disease development in this replication study.
However, in order to firmly evaluate the reproducibility of the original study, especially the effects of NC101 and NC101 Δpks on intestinal inflammation and tumorigenesis, the experimental conditions (such as the administered AOM dosage) could be adjusted in this replication study to get enough animals surviving through 18 weeks for further analyses.

Reviewer #4:

This current study is a replication of Arthur et al., 2012. The current study was able to reproduce many findings of the original 2012 study: no differences in bacterial growth and colonization between strains, no difference in histologic inflammation in vivo, no difference in detecting the pks island by PCR and sequencing. The authors also include a meta-analysis of the current and former studies. However, the main finding that E. coli pks promotes cancer could not be reproduced. Significant differences in experimental timeframe likely underlie this observation – see below. I also have a significant concern about mention of "bacterial contamination" and the extent of AOM toxicity.

1) In both studies, colonization was established for 4 weeks, after which AOM was administered for 6 weeks. However, the current study evaluated survival and tumorigenesis 18 weeks after the last (6th) AOM injection, whereas the original 2012 study evaluated survival and tumorigenesis 18 weeks after colonization. This equates to 10 weeks longer in the current study, which is 50% greater time than in the original, and could explain why no difference in tumorigenesis was observed. The authors in fact indicate that it was difficult to detect differences in tumorigenesis because it had advanced so far and many mice did not survive to this time point. This substantial difference in experimental timing – 18 weeks (2012 study) and 28 weeks (current study) – is a major flaw in this replication study. This difference should be clearly stated in the Abstract and manuscript text. Further, I'm not sure if the meta-analysis in Figure 4 is appropriate with such differences in time frame between the two studies.
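The timing arithmetic in this comment can be laid out as a short sketch. It uses only the durations stated above (4 weeks of colonization, 6 weeks of AOM, 18 weeks of follow-up) and assumes, as the comment does, that the AOM period counts as a full 6 weeks; the variable names are illustrative and not taken from either study.

```python
# Durations as stated in the reviewer's comment (in weeks).
COLONIZATION_WEEKS = 4   # mono-association before the first AOM injection
AOM_WEEKS = 6            # period over which AOM was administered
FOLLOW_UP_WEEKS = 18     # the "18 weeks" quoted in both studies

# Original study: 18 weeks counted from colonization.
original_endpoint = FOLLOW_UP_WEEKS

# Replication: 18 weeks counted from the last AOM injection,
# so the colonization and AOM periods come on top of it.
replication_endpoint = COLONIZATION_WEEKS + AOM_WEEKS + FOLLOW_UP_WEEKS

extra_weeks = replication_endpoint - original_endpoint
relative_increase = extra_weeks / original_endpoint

print(f"original endpoint:    week {original_endpoint}")
print(f"replication endpoint: week {replication_endpoint}")
print(f"extra time:           {extra_weeks} weeks ({relative_increase:.0%} longer)")
```

Under these assumptions the replication endpoint falls at week 28 rather than week 18. The relative increase is 10/18, or about 56%, which the comment rounds to 50%.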

2) Potential AOM toxicity: It is concerning that AOM treatment killed 30-70% of the mice, simply during the 6 week injection period. I am not aware of this extent of toxicity in AOM/DSS or AOM/Il10-/- studies. Perhaps the current authors injected mice with a very concentrated AOM solution? Survival curves from the original 2012 paper do not indicate this extent of toxicity. I recommend the authors amend their Abstract and manuscript text to indicate how different these results were between the current and former studies.

3) Figure 2 figure legend mentions "animals where bacterial contamination was detected (7 out of 84) were censored". What does this mean? Does this mean that some animals housed in gnotobiotic (mono-associated) isolators became contaminated? This is highly concerning. It also raises the possibility that these mice were also infected with other non-bacterial microorganisms, such as virus or fungus. Any contamination could alter results. Potential contamination is only mentioned in this figure legend and does not appear to be mentioned in Results section or Materials and methods section. The authors must explain in greater detail what this "bacterial contamination" means – both to this reviewer and to the readers.

[Editors' note: further revisions were requested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "Replication Study: Intestinal inflammation targets cancer-inducing activity of the microbiota" for further consideration at eLife. Your revised article has been favorably evaluated by Wendy Garrett (Senior Editor) and a Reviewing Editor.

The manuscript has been improved but there are some remaining issues that need to be addressed before acceptance, as outlined below:

The authors provided a highly responsive revision of the original paper. In re-reviewing Eaton, 2015 (study plan) and the original Arthur et al., paper (2012), we agree that the time line for the mouse experiments proposed in Eaton, 2015 was clear whereas the time line between bacterial inoculation, AOM administration and mouse harvest was hard to discern in the original paper. Besides the importance of clearer time line delineation in original manuscripts, this highlights flaws in the review of the original study plan, certainly a distributed responsibility in which several reviews failed to raise any questions.

Thus, in the review of the current manuscript, there are a number of places where the text should be modified to be clearer and more detailed, as listed below.

1) Introduction. The authors state incorrectly that in Arthur et al., a 100-fold increase in E. coli NC101 was detected in the lumen of Il10 KO mice relative to WT controls. This is Figure 1I in Arthur et al. In this experiment, conducted at 20 weeks after transfer to SPF conditions, only the presence of E. coli was examined, not NC101, which would have required specific testing for pks or colibactin. NC101 is referred to, but not specifically tested for, at this juncture in the Arthur et al. paper.

2) Subsection “Intestinal tumorigenesis and inflammation of germ-free Il10-/- mice mono-associated with E. coli NC101 or NC101𝛥pks”. It was difficult to find wording in the Arthur et al. paper that clearly delineated whether harvest occurred 18 weeks from monoassociation vs 18 weeks post-AOM (wording was “14 and 18 weeks with AOM”) and better definition was missed by several reviews of the Eaton et al. plan including, even perhaps the original authors, as such the following wording change is suggested: “based on methods derived from the original study and not corrected on review of Eaton et al.” as this may represent a more accurate presentation of what occurred.

3) Need to insert the starting N of mice (NC101 39, NC101Δpks 45 mice) since the reader can't interpret the mouse survival numbers without this information.

4) Please add the range of days of survival to augment the median survival stated.

5) Please expand “suggesting bacterial load was not a factor in the survival differences” as follows: “suggesting while bacterial load was not a factor in the survival differences, host:bacterial interactions could have contributed to the early demise” (or similar).

6) Why are half the mice surviving in the NC101Δpks group vs the NC101 group considered a “small impact”? The 57-day survival is for the NC101Δpks group, but it is attributed to the NC101 group, and then median survival is stated as not able to be calculated but was provided as 154 days previously. Please clarify for the reader.

7) The timing of the onset of colitis is unclear. Do the authors mean by 4 weeks of monocolonization or only following AOM?

8) What was the timing of detection of the anal carcinoma?

9) The mouse colon is well known to have a distal “smooth” part grossly and then the proximal half has a “feathered” gross structure. Please provide clarity on whether distal colon refers to the smooth section of the mouse colon or not. For example, in the enterotoxigenic B fragilis model of colon tumorigenesis, the transition between the smooth and feathered colon in the most severely affected animals is essentially a “hard stop” for further tumorigenesis (i.e., tumors push to this transition zone and then rarely penetrate into the feathered area of the mouse colon). Is that what is being described here?

10) Please clarify the statement: “These observations were the same whether the mouse was mono-associated with NC101 or NC101Δpks”. Was this true even when mice were harvested at a time point close to the time that mice were harvested in the original Arthur et al. paper (i.e., day ~126)? The tumorigenesis results of any mice harvested near the timing of the original Arthur et al. paper should be commented on specifically. It seems possible that the time line in this model is crucial for differentiating the outcomes of NC101 and NC101Δpks.

11) What is the timing of the pictures in Figure 3—figure supplement 1?

12) Same question as 10 above. Are there any mice analyzed at the time point of the harvests in the Arthur et al., paper?

13) Suggested edit for clarity: change “14 weeks post-AOM treatment” to “14 weeks post-AOM treatment (5 weeks beyond the endpoint in Arthur et al.,)”.

14) Are the median scores reported correct? They have exactly the same numbers and statistical significance seems doubtful as reported. This requires either more explanation or correction.

15) Suggest adding “more severe and/or progressive over time” to the sentence “The absolute scores were greater in this replication attempt compared to the original study, particularly for inflammation and invasion, which, combined with the survival and histopathological observations described above suggests that lesions were more severe in this replication attempt than in the original study.”

16) Suggest adding after “AOM treatment” the following sentence: “These results highlight the importance of experimental time lines in assessing differences in mouse models and between reports.” (or similar).

17) “[…], had a natural tendency to have a reduced inflammatory response and tumor load compared to ones that died, then the data reported would be distorted”: The word “distorted” does not seem apt. Mouse models and mice within experiments can be remarkably variable including littermates, mice caged together etc. Isn't this the nature of mouse experiments and the point is that investigators need to be aware that inbred mice by no means provide clear “smoothing” of the data?

18) “Additionally, under these conditions it is possible the products of excessive inflammatory responses (e.g. reactive oxygen species”: It would be clearer to state: “Additionally over time […]”.

19) “This increased severity confounds the ability to detect differences […]”: Again, suggest changing wording to “increased severity over time confounds […]”

20) Subsection “Meta-analyses of original and replicated effects”. Again, for clarity change “a common effect size was calculated for each effect from the original and replication studies” to “calculated for survival […]”.

21) Subsection “Genome sequencing data processing and assembly”. Both NC101 and NC101Δpks were sequenced. Are these data submitted to GenBank? The accession numbers should be provided.

22) Subsection “Determination of E. coli CFU”. What does “intestinal tissue feces” mean? Do the authors mean that intestinal luminal contents were removed from the colon at the time of mouse necropsy? Please clarify.

23) Subsection “Statistical Analysis”. Suggest “reported in the original study in Supplemental […]”.

24) Figure 1A,B should be revised to include much more data that would encapsulate the experiments and their contrast for the reader. Suggest adding numbers of mice for A,B experiments including number of males, females; mark the timing of mouse deaths with arrows; mark timing of onset of detection of colitis and tumors/invasive cancers (per the text the timing of onset of detection of carcinoma was at/near the time point at which the Arthur et al., experiments ended); add to legend what the cross lines added to the timeline represent (mouse censoring due to bacterial contamination). Ideally this figure would enable the reader to readily capture the contrasts between the studies and their timelines. Hopefully any needed information from Arthur et al. would be available to complete this.

25) Figure 2 legend. (n=7 in 4 cages), correct?

26) Figure 2—figure supplement 1. Add range of days/wks that sacrifices occurred.

27) Figure 3—figure supplement 2 legend. Please add the timing of harvest of each mouse displayed in B,C, D and E, ideally on the figure. In Figure 3E, the herniation described is not really visible to the reader. Please provide a higher power inset. Similarly, please clarify Figure 3—figure supplement and Figure 3. Are these sections all from one mouse or different mice? What is the timing of the necropsies leading to these images? Sizing bars are missing from images A and C. In A, the “mucus lakes” should be marked and likely a higher power image of these provided. C appears to be at higher magnification than B. It would be best to show B and C at the same magnification.

[Editors' note: further revisions were requested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "Replication Study: Intestinal inflammation targets cancer-inducing activity of the microbiota" for further consideration at eLife. Your revised article has been favorably evaluated by Wendy Garrett (Senior Editor) and a Reviewing Editor.

The manuscript has been improved and a highly responsive revision is again noted. However, there are some remaining issues that need to be addressed before acceptance, as outlined below:

Two data concerns, both seem like typos:

1) Subsection “Intestinal tumorigenesis and inflammation of germ-free Il10-/- mice mono-associated with E. coli NC101 or NC101𝛥pks”. Looking at Figure 2B, this reviewing editor thinks that NC101 and NC101Δpks are reversed. Namely, NC101 (not NC101Δpks) median survival cannot be determined because more than 1/2 the mice were alive at 18 weeks following monoassociation.

2) Subsection “Intestinal tumorigenesis and inflammation of germ-free Il10-/- mice mono-associated with E. coli NC101 or NC101𝛥pks”. This reviewer thinks the authors are repeating the data although the language differs (lesions and macroscopic tumor burden). The authors note that a “statistically significant decrease in neoplastic lesions” was reported in Arthur et al., but provide identical median scores. Please clarify.

https://doi.org/10.7554/eLife.34364.025

Author response

[…] Essential revisions:

As pointed out by all reviewers, the attempt at replication primarily revealed the technical aspects that must be controlled for in working with this mouse model in future experiments. A careful standardization of the protocol/disease course is required in individual laboratories before comparing different E. coli strains. Key features of the model to be verified before strain testing include low mortality rates, distinguishable individual tumors at the endpoint and additional controls including non-colonized AOM-injected-only control mice and mono-colonization-only and possibly AOM-untreated control mice. Revising the paper to focus on the technical limitations rather than that the replication attempt led to results differing from the original paper seems prudent. Ultimately, since the biology and kinetics of the Arthur et al., model were not faithfully replicated, these data are unable to address whether, under the conditions of the original paper, NC101 and ΔPKS NC101 differ in inflammation, invasion and neoplasia. This final point should be made clear to the readers of the paper.

Thank you for sharing this critical information. We were unaware of the differences in experimental timing supplied by reviewer 4. As raised by reviewer 1 the original paper was not well described in terms of the timepoints. The experimental timing was based on the information in the original paper and described in the Registered Report with the mice to be sacrificed “18 weeks after last AOM injection”. This remained after informal review and feedback by the original authors during preparation of the Registered Report manuscript, peer review of the Registered Report, and post-publication peer review of the published Registered Report. This was also not raised in the other independent reviews of this Replication Study manuscript. One approach to mitigate the potential for misinterpreting complex study designs is to include a timeline diagram or flowchart as recommended by the ARRIVE Guidelines. We included these points in the revised manuscript.

We also agree with the reviewer regarding the implications this has on the presentation and interpretation of the replication data. We have revised the figures, manuscript, and the Abstract to reflect the difference in experimental timing between the original study and this replication attempt. This includes removing the histopathological analysis from the meta-analysis and revising the survival meta-analysis to reflect the shared time frame between the two studies. We also revised the manuscript regarding the implications this has on the presentation and interpretation of the replication data.

Reviewer #1:

[…] Specific comments:

1) Mortality seen on Figure 2A. These curves may represent two different types of mortalities (causes)- AOM induced mortality early on and then inflammation and/or tumor load induced mortality later.

Early mortality in the Replication experiment is clearly higher than in Arthur et al. (>50% vs less than 30%).

What is presented in Figure 2B may be very similar to the data in Supplementary Figure 10 of Arthur et al. (although the original paper's mortality curve is not well described in terms of which timepoints are plotted), if mortality between day 0 and 10 is still attributed to the action of AOM. Then for a long time there is no mortality (days 20 to 80) in both the Replication experiment and Arthur et al. Then mice in Arthur et al. seemingly start to die, but they are immediately collected (all of them) and the experiment stopped, so we do not know what happens to mortality. In the Replication experiment, mortality starts past day 90, but mice are euthanized according to symptoms or dying, so there is an impression of high mortality.

We have revised this figure, the manuscript, and the Abstract to reflect the difference in experimental timing between the original study and this replication attempt in light of the information supplied by reviewer 4.

2) One important discrepancy here is that in Arthur et al., mice at the 14 wk and 18 wk points were collected when they were still not dying, indicating that there is probably still room for a difference in inflammation and tumor load/invasion.

In Replication attempt, it seems from the Figure legend that mice were mostly collected and "declared dead" when they were already sick (right thing to do in terms of following IACUC regulation) but one could think that by the time PKS+ and ΔPKS mice are already equally very sick, tumor load and inflammation may be already too high and indistinguishable. Indeed, authors note that overall role of inflammation, tumor load and mortality in their experiment was higher than in initial experiment.

We have revised the figures, the manuscript, and the Abstract to reflect the difference in experimental timing between the original study and this replication attempt in light of the information supplied by reviewer 4.

3) Another thing noted during initial planning of experiments and review of the Registered Report is that AOM quality may vary. Indeed, different lots of AOM may have different amounts of "acting" AOM: sometimes it's too much (more toxicity/mortality and more tumors in residual mice), sometimes it's poor, decayed AOM (more toxicity/mortality and fewer tumors at the end). Seemingly, in the Replication report higher mortality (presumably AOM induced) and higher tumor load are seen.

We agree the observed toxicity is unexpected, especially since the dose and treatment schedule used is below the published toxic dose for AOM in mice. We have revised the manuscript and the Abstract to reflect the difference in experimental timing between the original study and this replication attempt in light of the information supplied by reviewer 4.

4) Since AOM metabolism and potentially bacterial and liver metabolism may be involved, the diets used for germ free mice should be compared and that should be discussed.

We agree that diets could be a factor and have included this in the revised manuscript. We also used the same diet as the original study (as communicated by the original authors) and have included this in the revised manuscript.

5) Using Chi- values for tumor numbers in Replication study when tumors cannot be enumerated (why?) further complicates the interpretation.

We revised the figure to better distinguish the tumors that could not be enumerated from the ones that could. This is included in the figure to illustrate the number that could not be quantified in both groups because of the coalescing nature of the lesions.

6) Another important caveat which should be acknowledged also stems from high mortality. When mortality is that high, there may be a natural selection for mice with lower inflammatory response and lower tumor load, and the mice where potentially there was (could be) a difference have been kept for too long under these conditions of inflammation and AOM lot. This is not a very scientific argument, I know, but after this Replication attempt is published, the field will need all of the possible scenarios/explanations discussed.

We agree that the increased severity in the replication attempt complicates the interpretation of the data. Besides leading to information loss, if the mice that survived, and thus were quantified for inflammation and tumorigenesis, had a natural tendency to have a reduced inflammatory response and tumor load compared to ones that died, then the data reported would be distorted, limiting the opportunity to detect differences between the two groups. We have included this point in the revised manuscript.

7) Inflammation is known to promote cancer. PKS+ bacteria is proposed to be more carcinogenic than PKSΔ. However, under the conditions of stronger inflammation (observed in Replication attempt), Inflammation arguably can become more important than PKS status. PKS is genotoxic, but so are the products of excessive inflammatory responses (ROS, RNI etc.). One could argue that under stronger inflammation the presence or absence of PKS becomes less relevant as for its effect towards tumor progression. Therefore, comparison of this (Replication) and Arthur et al. studies with regard to tumor characteristics only makes sense if the same levels of inflammation are achieved.

We have revised the figures, the manuscript, and the Abstract to reflect the difference in experimental timing between the original study and this replication attempt in light of the information supplied by reviewer 4.

In my view, for these experiments a technical attempt to reproduce the results has been made, but based on the confounded data it is difficult to judge whether important conceptual results are reproduced or not (i.e. whether PKS+ is superior at tumor progression).

The only thing which can be concluded, in my view, is that during the Replication attempt, the team did not achieve the same levels of AOM toxicity/action, the same levels of inflammation and tumor load (more), and observed higher mortality. Under these circumstances it is still not clear whether PKS+ E. coli induces more tumors and more aggressive tumors than ΔPKS controls.

We have revised the figures, the manuscript, and the Abstract to reflect the difference in experimental timing between the original study and this replication attempt in light of the information supplied by reviewer 4.

Reviewer #2:

[…] Major points:

The major aim of the present study is to compare the potential of the E. coli strains NC101 and NC101𝛥pks in inducing intestinal inflammation and tumor formation in monocolonized GF Il10-/- mice in order to evaluate the reproducibility of the experiments published by Arthur et al. For this comparison to be performed the kinetics of the model should be similar to what was reported by Arthur et al. In the present study mice mono-colonized with both E. coli strains display a much higher lethality and an exacerbated phenotype by histopathology as compared to the report by Arthur et al. A very low number of mice met the endpoint of the study. Because of these issues no conclusions can be drawn regarding the potential of E. coli NC101 vs NC101𝛥pks to induce intestinal inflammation and tumor formation in monocolonized GF Il10-/- mice.

We have revised the figures, the manuscript, and the Abstract to reflect the difference in experimental timing between the original study and this replication attempt in light of the information supplied by reviewer 4.

In the Abstract the authors first mention that "Mono-association […] resulted in similar levels of intestinal inflammation and tumorigenesis; whereas the original study reported decreased tumor multiplicity […]" and secondly, they mention that this replication study showed more severe histopathological observations and a much lower survival rate compared to the original study. This structure of the Abstract is misleading to the reader. The strong differences in the kinetics of the experiment compared to the original report and the lack of power of the current study to assess the reproducibility of the original findings should be mentioned first. The observations regarding tumor number and histopathological features could be discussed as indicative of a lack of difference in the context of a severely exacerbated disease course.

We have revised the Abstract to reflect the difference in experimental timing between the original study and this replication attempt in light of the information supplied by reviewer 4.

In my opinion, the major conclusion of this study is that the AOM/E. coli NC101 Il10-/- mono-colonization model is characterized by important technical limitations that should be carefully controlled in future studies. A careful standardization of the protocol/disease course is required in individual laboratories before comparing different E. coli strains. This should lead to low mortality rates and distinguishable individual tumors at the endpoint. Including non-colonized AOM-injected-only control mice and mono-colonization-only, AOM-untreated control mice is essential for the correct interpretation of results in this model. Such factors should be considered for the experimental design of future studies.

We agree. Our findings clearly demonstrate the essential nature of clear and precise reporting of experimental details to ensure published research is accurately compared, reproduced, and interpreted. We have included these factors as considerations for the experimental design of future studies in the revised manuscript.

Additional points:

The comparison of the death rates between this study and Arthur et al. is vague in this part of the manuscript despite the fact that this is the major limitation of this replication attempt. The authors should clearly mention the percentages of mice meeting the endpoint in the two studies.

We have revised the figures, the manuscript, and the abstract to reflect the difference in experimental timing between the original study and this replication attempt in light of the information supplied by reviewer 4. We have also moved the percentages of mice earlier in the text, when comparing death rates.

Mortality was highest during AOM treatment independently of the bacterial strain used whereas mice started dying right after the first AOM injection. The interpretation of these observations is difficult because of the absence of AOM-only controls: lethality over the injection period may be an effect of AOM toxicity alone or of AOM in combination with the effect of E. coli. It may also be an effect of the injection process itself (bleeding of internal organs for example). This is an additional technical variable that should be mentioned in the Discussion.

We agree and have included these factors as considerations for the experimental design of future studies in the revised manuscript.

The authors mention that mortality during AOM injections was greater in mice colonized with NC101Δpks. Is this difference statistically significant? This should be mentioned and if significant further discussed.

We have revised the figures, the manuscript, and the abstract to reflect the difference in experimental timing between the original study and this replication attempt in light of the information supplied by reviewer 4. Further discussion of the increased mortality during AOM injections was included.

The authors discuss sex-related differences which may confound the differences observed by Arthur et al. where 4 female and 8 male mice mono-associated with NC101 and 8 male mice mono-associated with NC101Δpks were analyzed. What is the result of these analyses if only male mice are considered? (n = 8 vs 8)?

This is an interesting point. During preparation of the Registered Report we were informed of the ratio of female and male mice in their experiment; however, we did not receive the raw data from the authors that would allow us to conduct this exploratory analysis.

The study is not powered enough to draw conclusions from the histopathological analyses because of the small number of mice resulting from an exacerbated disease course in the experiment.

We agree we did not achieve the target number of 20 (instead reaching 10), which was based off the data from the 14 mice reported in the original study (at the 18 week timepoint). However, we have removed all analysis from the histopathological analysis in light of the information supplied by reviewer 4.

Figure 2A: Log-rank test not shown in the legend.

We included this exploratory test (of the entire timecourse) in the figure legend and manuscript in addition to the revised analysis of the comparable timecourse between the original study and this replication in light of the information supplied by reviewer 4.

Reviewer #3:

[…] It appears that the authors of this replication study tried their best to conduct the experiments under conditions as close as possible to the original study. For instance, bacterial strains were shared by the Arthur Lab (subsection “Bacterial strains and growth conditions”), germ-free Il10-/- mice were derived from the same germ-free colony used in the original study (subsection “Intestinal tumorigenesis and inflammation of germ-free Il10-/- mice mono-associated with E. coli NC101 or NC101𝛥pks”), and the AOM dose used, 10 mg/kg, was the same as the original study (the same vendor, same catalog number, but with different lot number). AOM administration (the same vendor, same catalog number, same dosage used, but with different lot number) had resulted in a significantly higher mortality rate in this replication study compared with the original study. As cited by the authors (subsection “Intestinal tumorigenesis and inflammation of germ-free Il10-/- mice mono-associated with E. coli NC101 or NC101𝛥pks”), a previous study (Bissahoyo, 2005) showed that 33 mice were treated with 10 mg/kg of AOM, and "no premature loss of mice" was observed at "six months after the first AOM dose". This appears in line with the original study and indicates that 10 mg/kg of AOM should not cause a substantial mortality rate; in contrast, AOM administration has resulted in a significantly higher mortality rate in this replication study, which could be due to the differences in AOM lots, mouse facilities, microbiota, etc. It appears reasonable that the difference in the effects of NC101 and NC101Δpks on the intestinal inflammation and tumorigenesis, as reported in the original study, could not be replicated, with the more severe disease development in this replication study.

However, in order to firmly evaluate the reproducibility of the original study, especially the effects of NC101 and NC101Δpks on the intestinal inflammation and tumorigenesis, the experimental conditions (such as the administered AOM dosage) could be adjusted in this replication study to get enough animals surviving through 18 weeks for the further analyses.

We have revised the figures, the manuscript, and the Abstract to reflect the difference in experimental timing between the original study and this replication attempt in light of the information supplied by reviewer 4.

Reviewer #4:

[…] Essential revisions:

1) In both studies, colonization was established for 4 weeks, after which AOM was administered for 6 weeks. However, the current study evaluated survival and tumorigenesis 18 weeks after the last (6th) AOM injection, whereas the original 2012 study evaluated survival and tumorigenesis 18 weeks after colonization. This equates to 10 weeks longer in the current study, which is 50% greater time than in the original, and could explain why no difference in tumorigenesis was observed. The authors in fact indicate that it was difficult to detect differences in tumorigenesis because it had advanced so far and many mice did not survive to this time point. This substantial difference in experimental timing – 18 weeks (2012 study) and 28 weeks (current study) – is a major flaw in this replication study. This difference should be clearly stated in the Abstract and manuscript text. Further, I'm not sure if the meta-analysis in Figure 4 is appropriate with such differences in time frame between the two studies.

Thank you for sharing this critical information. We were unaware of these differences in experimental timing. As raised by reviewer 1 the original paper was not well described in terms of the timepoints. The experimental timing was based on the information in the original paper and described in the Registered Report with the mice to be sacrificed “18 weeks after last AOM injection”. This remained after informal review and feedback by the original authors during preparation of the Registered Report manuscript, peer review of the Registered Report, and post-publication peer review of the published Registered Report. This was also not raised in the other independent reviews of this Replication Study manuscript. One approach to mitigate the potential for misinterpreting complex study designs is to include a timeline diagram or flowchart as recommended by the ARRIVE Guidelines. We included these points in the revised manuscript.

We also agree with the reviewer regarding the implications this has on the presentation and interpretation of the replication data. We have revised the figures, manuscript, and the Abstract to reflect the difference in experimental timing between the original study and this replication attempt. This includes removing the histopathological analysis from the meta-analysis and revising the survival meta-analysis to reflect the shared time frame between the two studies.
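The timing discrepancy raised in this point comes down to simple arithmetic, which can be sketched as follows. This is a minimal illustration that uses only the week counts quoted by reviewer 4 (4 weeks of colonization, 6 weeks of AOM, an 18-week endpoint). The variable names are hypothetical, and the exact totals depend on how the injection period is counted.

```python
# Week counts as quoted by reviewer 4; variable names are illustrative only.
COLONIZATION_WEEKS = 4   # mono-association before the first AOM injection
AOM_WEEKS = 6            # period of weekly AOM injections
ENDPOINT_WEEKS = 18      # the "18 weeks" that appears in both protocols

# Original study: the 18 weeks are counted from colonization.
original_total = ENDPOINT_WEEKS

# Replication: the 18 weeks are counted from the last AOM injection.
replication_total = COLONIZATION_WEEKS + AOM_WEEKS + ENDPOINT_WEEKS

extra_weeks = replication_total - original_total
relative_increase = extra_weeks / original_total

print(original_total, replication_total, extra_weeks, f"{relative_increase:.0%}")
```

Under these assumptions the replication runs 10 weeks longer than the original, which is roughly half again as long and matches the reviewer's 18-week versus 28-week comparison.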

2) Potential AOM toxicity: It is concerning that AOM treatment killed 30-70% of the mice, simply during the 6 week injection period. I am not aware of this extent of toxicity in AOM/DSS or AOM/Il10-/- studies. Perhaps the current authors injected mice with a very concentrated AOM solution? Survival curves from the original 2012 paper do not indicate this extent of toxicity. I recommend the authors amend their Abstract and manuscript text to indicate how different these results were between the current and former studies.

We agree the observed toxicity is unexpected, especially since the dose and treatment schedule used is below the published toxic dose for AOM in mice and, to our knowledge, is the same used in the original study. The cause of early death following AOM injection was confirmed to be acute liver failure, based on histologic evaluation. Inclusion of AOM-only controls could have indicated any increased susceptibility to AOM of the mice used in this study. We have also revised the Abstract to reflect the increased severity between the two studies during AOM treatment.

3) Figure 2 figure legend mentions "animals where bacterial contamination was detected (7 out of 84) were censored". What does this mean? Does this mean that some animals housed in gnotobiotic (mono-associated) isolators became contaminated? This is highly concerning. It also raises the possibility that these mice were also infected with other non-bacterial microorganisms, such as virus or fungus. Any contamination could alter results. Potential contamination is only mentioned in this figure legend and does not appear to be mentioned in Results section or Materials and methods section. The authors must explain in greater detail what this "bacterial contamination" means – both to this reviewer and to the readers.

We have revised the manuscript (Materials and methods section and Figure 2 figure legend) to provide a clearer description of the contaminated mice. In brief, (1) the mice were in isocages, not isolators, with only a few isocages contaminated during the course of the experiment, and (2) the rest of the isocages remained gnotobiotic.

[Editors' note: further revisions were requested prior to acceptance, as described below.]

The authors provided a highly responsive revision of the original paper. In re-reviewing Eaton, 2015 (study plan) and the original Arthur et al., paper (2012), we agree that the time line for the mouse experiments proposed in Eaton, 2015 was clear whereas the time line between bacterial inoculation, AOM administration and mouse harvest was hard to discern in the original paper. Besides the importance of clearer time line delineation in original manuscripts, this highlights flaws in the review of the original study plan, certainly a distributed responsibility in which several reviews failed to raise any questions.

Thus, in the review of the current manuscript, there are a number of places where the text should be modified to be clearer and more detailed as listed below.

1) Introduction. The authors state incorrectly that in Arthur et al. a 100-fold increase in E. coli NC101 was detected in the lumen of Il10 KO mice relative to WT controls. This is Figure 1I in Arthur et al. In this experiment, conducted at 20 weeks after transfer to SPF conditions, only the presence of E. coli was examined, not NC101, which would have required specific testing for pks or colibactin. NC101 is referred to, but not specifically tested for, at this juncture in the Arthur et al. paper.

Thank you for raising this error. We have revised the manuscript to remove this statement.

2) Subsection “Intestinal tumorigenesis and inflammation of germ-free Il10-/- mice mono-associated with E. coli NC101 or NC101𝛥pks”. It was difficult to find wording in the Arthur et al. paper that clearly delineated whether harvest occurred 18 weeks from monoassociation vs 18 weeks post-AOM (wording was “14 and 18 weeks with AOM”), and a better definition was missed by several reviews of the Eaton et al. plan, including perhaps even the original authors. As such, the following wording change is suggested: “based on methods derived from the original study and not corrected on review of Eaton et al.”, as this may represent a more accurate presentation of what occurred.

We agree and have revised this sentence as suggested.

3) Subsection “Intestinal tumorigenesis and inflammation of germ-free Il10-/- mice mono-associated with E. coli NC101 or NC101𝛥pks”. Need to insert here the starting N of mice (NC101 39, NC101Δpks 45 mice) since the reader can’t interpret the mouse survival numbers without this information.

We agree and have included this in the revised manuscript.

4) Please add the range of days of survival to augment the median survival stated.

We have included the range of survival times for each group.

5) Please expand “suggesting bacterial load was not a factor in the survival differences” as follows: “suggesting while bacterial load was not a factor in the survival differences, host:bacterial interactions could have contributed to the early demise” (or similar).

While we agree with the interpretation that bacterial load was not a factor, it is not clear that host:bacterial interactions necessarily contributed. The early death rate in response to AOM was not explained in this study and thus we feel it is not appropriate to speculate.

6) There appears to be an error. Why are half the mice surviving in the NC101Δpks group vs the NC101 group considered a “small impact”? The 57 day survival is the NC101Δpks group but it is stated to be the NC101 group, and then median survival is stated as not able to be calculated but provided as 154 days previously. Please clarify for the reader.

This analysis was to compare the replication results to the original study. In order to do that we treated 18 weeks after mono-association as the study end point (i.e. ignoring all events after this time point). We have revised the manuscript to better clarify what these results represent. We have also removed the word “small” as this was indeed an error.
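The effect of treating 18 weeks after mono-association as the study end point can be sketched with a small Kaplan-Meier calculation. This is a minimal illustration under stated assumptions: the survival times below are hypothetical and are not data from either study, and the helper names `censor_at` and `km_median` are invented for this example.

```python
from fractions import Fraction

def censor_at(times, events, cutoff):
    """Treat `cutoff` as the study end: deaths after it become censored at `cutoff`."""
    new_times = [min(t, cutoff) for t in times]
    new_events = [e and t <= cutoff for t, e in zip(times, events)]
    return new_times, new_events

def km_median(times, events):
    """Kaplan-Meier median: first time the survival estimate is <= 1/2, else None."""
    at_risk = len(times)
    surv = Fraction(1)
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei)
        if deaths:
            surv *= Fraction(at_risk - deaths, at_risk)
            if surv <= Fraction(1, 2):
                return t
        # Everyone with time t (dead or censored) leaves the risk set.
        at_risk -= sum(1 for ti in times if ti == t)
    return None

CUTOFF_DAYS = 18 * 7  # 18 weeks after mono-association

# Hypothetical days of death for two groups of six mice (all deaths observed).
group_a = [40, 60, 130, 150, 160, 170]
group_b = [30, 45, 50, 70, 140, 150]
observed = [True] * 6

full_a = km_median(group_a, observed)
full_b = km_median(group_b, observed)
cut_a = km_median(*censor_at(group_a, observed, CUTOFF_DAYS))
cut_b = km_median(*censor_at(group_b, observed, CUTOFF_DAYS))
print(full_a, full_b, cut_a, cut_b)
```

In this sketch the first hypothetical group has a defined median over the full time course, but the median cannot be determined once follow-up is truncated at 18 weeks, because more than half of that group is still alive at the cutoff.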

7) The timing of the onset of colitis is unclear. Do the authors mean by 4 weeks of monocolonization or only following AOM?

We have revised the manuscript to reflect the timing of the onset of colitis. Specifically, colitis was first observed in a mouse at 5 weeks after monocolonization.

8) What was the timing of detection of the anal carcinoma?

We have included the timing of the anal carcinoma in the revised manuscript. Specifically, this was observed between 19 and 27 weeks after monocolonization.

9) The mouse colon is well known to have a distal “smooth” part grossly and then the proximal half has a “feathered” gross structure. Please provide clarity on whether distal colon refers to the smooth section of the mouse colon or not. For example, in the enterotoxigenic B fragilis model of colon tumorigenesis, the transition between the smooth and feathered colon in the most severely affected animals is essentially a “hard stop” for further tumorigenesis (i.e., tumors push to this transition zone and then rarely penetrate into the feathered area of the mouse colon). Is that what is being described here?

We have revised this section to better clarify the observations. The most extensive tumors reached approximately mid-colon. The gross appearance of the proximal mucosa was not recorded.

10) Please clarify the statement: “These observations were the same whether the mouse was mono-associated with NC101 or NC101Δpks”. Was this true even when mice were harvested at a time point close to the time that mice were harvested in the original Arthur et al. paper (i.e., day ~126)? The tumorigenesis results of any mice harvested near the timing of the original Arthur et al. paper should be commented on specifically. It seems possible that the time line in this model is crucial for differentiating the outcomes of NC101 and NC101Δpks.

We have revised this section to clarify the range these observations were made. We also included in a previous paragraph the observations of the mice that were harvested at times close to the original study.

11) What is the timing of the pictures in Figure 3—figure supplement 1?

This is from a mouse that was mono-associated with NC101 and died 13 weeks after AOM treatment. The timing is described in the figure legend.

12) Are there any mice analyzed at the time point of the harvests in the Arthur et al., paper?

We included the available information in the paragraph included in response to point 10 above.

13) Suggested edit for clarity: change “14 weeks post-AOM treatment” to “14 weeks post-AOM treatment (5 weeks beyond the endpoint in Arthur et al.,)”.

We agree and included this in the revised manuscript.

14) Subsection “Intestinal tumorigenesis and inflammation of germ-free Il10-/- mice mono-associated with E. coli NC101 or NC101𝛥pks”. Are these scores correct? They have exactly the same numbers and statistical significance seems doubtful as reported. This requires either more explanation or correction.

These are the median scores reported in the original study. The original study analyzed the ordinal scoring data as interval measurements (by t test), which is not appropriate since the mean cannot be defined (Baker et al., 2014; Gibson-Corley et al., 2013). That is, while a number is used for the scoring it represents non-numeric concepts like “severe or high grade dysplasia characterized as adenoma, restricted to the mucosa”. We conducted a non-parametric test (i.e. Mann Whitney test) on the original data which gave similar results. We included this additional explanation in the revised manuscript.

15) Suggest adding “more severe and/or progressive over time” to the sentence “The absolute scores were greater in this replication attempt compared to the original study, particularly for inflammation and invasion, which, combined with the survival and histopathological observations described above suggests that lesions were more severe in this replication attempt than in the original study.”

We agree and included this in the revised manuscript. We have also revised this section to better clarify the observations in light of the differences in methodology between the two studies.

16) Subsection “Intestinal tumorigenesis and inflammation of germ-free Il10-/- mice mono-associated with E. coli NC101 or NC101𝛥pks”. Suggest adding after “AOM treatment” the following sentence: These results highlight the importance of experimental time lines in assessing differences in mouse models and between reports.” (or similar).

We have revised this section to highlight that it is possible that the experimental timing in this mouse model is crucial for differentiating the outcomes of NC101 and NC101Δpks. Additionally, we have included a sentence that methods should be clearly described and published to facilitate reproducibility.

17) “[…], had a natural tendency to have a reduced inflammatory response and tumor load compared to ones that died, then the data reported would be distorted”: The word “distorted” does not seem apt. Mouse models and mice within experiments can be remarkably variable including littermates, mice caged together etc. Isn’t this the nature of mouse experiments and the point is that investigators need to be aware that inbred mice by no means provide clear “smoothing” of the data?

We removed this sentence and revised this section to better clarify the observations in light of the differences in methodology between the two studies.

18) “Additionally, under these conditions it is possible the products of excessive inflammatory responses (e.g. reactive oxygen species”: It would be clearer to state: “Additionally over time […]”.

We agree and included this in the revised manuscript.

19) “This increased severity confounds the ability to detect differences […]”: Again, suggest changing wording to “increased severity over time confounds […]”

We removed this sentence and revised this section to better clarify the observations in light of the differences in methodology between the two studies.

20) Subsection “Meta-analyses of original and replicated effects”. Again, for clarity change “a common effect size was calculated for each effect from the original and replication studies” to “calculated for survival […]”.

We agree and included this in the revised manuscript.

21) Subsection “Genome sequencing data processing and assembly”. Both NC101 and NC101Δpks were sequenced. Are these data submitted to GenBank? The accession numbers should be provided.

We have included the accession numbers in the revised manuscript.

22) Subsection “Determination of E. coli CFU”. What does “intestinal tissue feces” mean? Do the authors mean that intestinal luminal contents were removed from the colon at the time of mouse necropsy? Please clarify.

This was a typographic error. “Intestinal tissue” has been deleted.

23) Subsection “Statistical Analysis”. Suggest “reported in the original study in Supplemental […]”.

We agree and included this in the revised manuscript.

24) Figure 1A,B should be revised to include much more data that would encapsulate the experiments and their contrast for the reader. Suggest adding numbers of mice for A,B experiments including number of males, females; mark the timing of mouse deaths with arrows; mark timing of onset of detection of colitis and tumors/invasive cancers (per the text the timing of onset of detection of carcinoma was at/near the time point at which the Arthur et al., experiments ended); add to legend what the cross lines added to the timeline represent (mouse censoring due to bacterial contamination). Ideally this figure would enable the reader to readily capture the contrasts between the studies and their timelines. Hopefully any needed information from Arthur et al. would be available to complete this.

We have revised Figure 2A,B to include more information as suggested and included what information was available from Arthur et al. The timing of mouse deaths is already indicated in the Kaplan-Meier plot in the vertical lines, thus we did not include additional marks.

25) Figure 2 legend. (n=7 in 4 cages), correct?

There were only 3 cages. We have revised the figure legend to accurately reflect this in addition to the number of mice (n=7).

26) Figure 2—figure supplement 1. Add range of days/wks that sacrifices occurred.

We have included the range of sacrifices.

27) Figure 3—figure supplement 2 legend. Please add the timing of harvest of each mouse displayed in B,C, D and E, ideally on the figure. In Figure 3E, the herniation described is not really visible to the reader. Please provide a higher power inset. Similarly, please clarify Figure 3—figure supplement 3 and Figure 3. Are these sections all from one mouse or different mice? What is the timing of the necropsies leading to these images? Sizing bars are missing from images A and C. In A, the “mucus lakes” should be marked and likely a higher power image of these provided. C appears to be at higher magnification than B. It would be best to show B and C at the same magnification.

The time of euthanasia has been added to the figure legends. The herniated gland in Figure 3—figure supplement 2E has been circled to clarify it. Size bars have been added to Figure 3—figure supplement 3. Figure 3—figure supplement 3B,C are at different magnifications because they demonstrate different distributions of lesions. Parts A and C are from the same mouse, while part B is from a different mouse.

[Editors' note: further revisions were requested prior to acceptance, as described below.]

The manuscript has been improved and a highly responsive revision is again noted. However, there are some remaining issues that need to be addressed before acceptance, as outlined below:

Two data concerns--both seem like typos:

1) Subsection “Intestinal tumorigenesis and inflammation of germ-free Il10-/- mice mono-associated with E. coli NC101 or NC101𝛥pks”. Looking at Figure 2B, this reviewing Editor thinks that NC101 and NC101Δpks are reversed. Namely, NC101 (not NC101Δpks) median survival cannot be determined because more than 1/2 the mice were alive at 18 weeks following monoassociation.

Thank you for catching this error. We have corrected it in the revised manuscript.

2) Subsection “Intestinal tumorigenesis and inflammation of germ-free Il10-/- mice mono-associated with E. coli NC101 or NC101𝛥pks”. This reviewer thinks the authors are repeating the data although the language differs (lesions and macroscopic tumor burden). The authors note that a “statistically significant decrease in neoplastic lesions” was reported in Arthur et al., but provide identical median scores. Please clarify.

Thank you for raising these. One reference was to the number of macroscopic tumors, while the other referred to the scoring of tissues for neoplasia. We have revised the latter sentence from “neoplastic lesions” to “neoplasia scores” to avoid any confusion and describe the invasion and inflammation scores in the same manner.

The original median scores for neoplasia, invasion, and inflammation are accurately reported and the reviewer is correct that the median scores for NC101 and NC101Δpks are identical (both are 4). While a t-test was not appropriate for these data, a U-test gave the same results, which we report. The statistical difference is because of the spread of the data, despite identical medians. We have included the range for each measure to help clarify this point.
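How a rank-based test can register a difference between two groups with identical medians can be sketched as follows. This is a minimal illustration under stated assumptions: the ordinal scores are hypothetical and are not data from either study, and `mann_whitney_u` is a hand-written helper that computes only the U statistic, not a p-value.

```python
from statistics import median

def mann_whitney_u(x, y):
    """U statistic for x: count pairs with x_i > y_j, with ties counted as 1/2."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u

# Hypothetical ordinal scores on an assumed 0-4 scale; not data from either study.
scores_a = [4, 4, 4, 4, 4, 4, 4, 4]
scores_b = [1, 2, 3, 4, 4, 4, 4, 4]

u_a = mann_whitney_u(scores_a, scores_b)
u_null = len(scores_a) * len(scores_b) / 2  # expected U if the groups do not differ

print(median(scores_a), median(scores_b), u_a, u_null)
```

Both hypothetical groups have a median of 4, yet U departs from its null expectation because the test uses the ranks of every observation, including the lower scores in the second group. Whether such a shift reaches statistical significance depends on the sample sizes and the spread; in practice a library routine such as `scipy.stats.mannwhitneyu` would supply the p-value.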

https://doi.org/10.7554/eLife.34364.026

Article and author information

Author details

  1. Kathryn Eaton

    Department of Microbiology and Immunology, University of Michigan Medical School, Ann Arbor, United States
    Contribution
    Acquisition of data, Analysis and interpretation of data, Drafting or revising the article
    Competing interests
    Germ-Free and Gnotobiotic Mouse Facilities, University of Michigan Medical School was a Science Exchange associated lab.
  2. Ali Pirani

    Department of Microbiology and Immunology, University of Michigan Medical School, Ann Arbor, United States
    Contribution
    Acquisition of data, Analysis and interpretation of data, Drafting or revising the article
    Competing interests
    No competing interests declared
  3. Evan S Snitkin

    Department of Microbiology and Immunology, University of Michigan Medical School, Ann Arbor, United States
    Contribution
    Acquisition of data, Analysis and interpretation of data, Drafting or revising the article
    Competing interests
    No competing interests declared
  4. Reproducibility Project: Cancer Biology

    Contribution
    Analysis and interpretation of data, Drafting or revising the article
    For correspondence
    1. tim@cos.io
    2. nicole@scienceexchange.com
    Competing interests
    EI, RT, NP: Employed by and hold shares in Science Exchange Inc.
    1. Elizabeth Iorns, Science Exchange, Palo Alto, United States
    2. Rachel Tsui, Science Exchange, Palo Alto, United States
    3. Alexandria Denis, Center for Open Science, Charlottesville, United States
    4. Nicole Perfito, Science Exchange, Palo Alto, United States
    5. Timothy M Errington, Center for Open Science, Charlottesville, United States

Funding

Laura and John Arnold Foundation

  • Reproducibility Project: Cancer Biology

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

The Reproducibility Project: Cancer Biology would like to thank Dr. Janelle C Arthur (University of North Carolina at Chapel Hill) for sharing critical information, data, and reagents, specifically the E. coli NC101 and NC101Δ pks strains, as well as the Il10-/- germ-free mice (grants: 5-P39-DK034987 and 5-P40-OD010995). We would like to thank Clinton Fontaine and Nicholas Pudlo for performing some of the in vitro work, and Sara Poe, Chriss Vowles, Trisha Denike, and Natalie Anderson, who performed the animal work. We would also like to thank the following companies for generously donating reagents to the Reproducibility Project: Cancer Biology: American Type Culture Collection (ATCC), Applied Biological Materials, BioLegend, Charles River Laboratories, Corning Incorporated, DDC Medical, EMD Millipore, Harlan Laboratories, LI-COR Biosciences, Mirus Bio, Novus Biologicals, Sigma-Aldrich, and System Biosciences (SBI).

Ethics

Animal experimentation: All animal procedures were approved by the University of Michigan IACUC# 7291 and were in accordance with the University of Michigan's policies on the care, welfare, and treatment of laboratory animals.

Publication history

  1. Received: December 15, 2017
  2. Accepted: September 19, 2018
  3. Version of Record published: October 3, 2018 (version 1)

Copyright

© 2018, Eaton et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
