Registered report: Fusobacterium nucleatum infection is prevalent in human colorectal carcinoma

  1. John Repass
  2. Nimet Maherali
  3. Kate Owen
  4. Reproducibility Project: Cancer Biology (corresponding author)
  1. ARQ Genetics, United States
  2. Harvard Stem Cell Institute, United States
  3. University of Virginia, United States

Abstract

The Reproducibility Project: Cancer Biology seeks to address growing concerns about reproducibility in scientific research by conducting replications of selected experiments from a number of high-profile papers in the field of cancer biology. The papers, which were published between 2010 and 2012, were selected on the basis of citations and Altmetric scores (Errington et al., 2014). This Registered Report describes the proposed replication plan of key experiments from 'Fusobacterium nucleatum infection is prevalent in human colorectal carcinoma' by Castellarin and colleagues published in Genome Research in 2012 (Castellarin et al., 2012). The experiment to be replicated is reported in Figure 2. Here, Castellarin and colleagues performed a metagenomic analysis of colorectal carcinoma (CRC) to identify potential associations between inflammatory microorganisms and gastrointestinal cancers. They conducted quantitative real-time PCR on genomic DNA isolated from tumor and matched normal biopsies from a patient cohort and found that the overall abundance of Fusobacterium was 415 times greater in CRC versus adjacent normal tissue. These results confirmed earlier studies and provide evidence for a link between tissue-associated bacteria and tumorigenesis. The Reproducibility Project: Cancer Biology is a collaboration between the Center for Open Science and Science Exchange and the results of the replications will be published in eLife.

https://doi.org/10.7554/eLife.10012.001

Introduction

The human intestine is populated by an estimated 10^14 microbes comprising over 1000 bacterial phylotypes (Ley et al., 2006). The overall composition of the intestinal microbiota is determined by a number of factors, including host genetics, environment, diet and hygiene (Arrieta et al., 2014; Keku et al., 2015). These bacteria play important roles in host biology by maintaining intestinal homeostasis, barrier function, immunity and metabolic function (Backhed et al., 2005; Jones et al., 2014). Perturbations or imbalances in the microbiome (microbial dysbiosis) are linked to a number of disease pathologies such as inflammatory bowel disease (Collins, 2014; Hold et al., 2014), obesity (Bajzer and Seeley, 2006; Brown et al., 2012), and colorectal cancers (CRCs; Dulal and Keku, 2014; Keku et al., 2015).

CRC is a complex disease arising from the sequential accumulation of somatic mutations and epigenetic alterations. Activating mutations in the K-ras oncogene, as well as the loss of tumor suppressor genes like p53 (TP53) and adenomatous polyposis coli (APC), contribute to the tumorigenic transformation of normal colonic epithelium (Vogelstein et al., 1988; Fearon, 2011; Mundade et al., 2014). In addition to genetic factors, microbial dysbiosis, such as altered bacterial diversity, is strongly associated with the development of CRC (Keku et al., 2015). However, despite numerous longitudinal studies comparing intestinal microbial communities over time (Rodriguez et al., 2015), and across various cancer stages (Kubota, 1990; Chen et al., 2013; Nugent et al., 2014), there is limited information on the contribution of specific bacteria to CRC development.

To identify potential associations between inflammatory microorganisms and gastrointestinal cancers, Castellarin et al. (2012) first performed RNA sequencing (RNA-seq) on a limited number of tumor and matched normal tissue samples. Initial observations indicated a striking overrepresentation of Fusobacterium nucleatum sequences in carcinoma samples compared to controls. To confirm these findings, Castellarin et al. (2012) assessed the relative abundance of Fusobacterium in a larger cohort of tumor and matched normal biopsy samples. In Figure 2, the authors performed quantitative real-time PCR (qPCR) on genomic DNA (gDNA) isolated from an additional 88 colorectal carcinoma (CRC) specimens and adjacent matched control tissues. Fusobacterium abundance was observed to be significantly higher in the tumor samples compared to matching control samples. This key experiment will be replicated in Protocol 1.

Similar findings confirming the higher relative abundance of Fusobacterium in CRC tumor tissues compared to control biopsies have been reported by other investigators (Kostic et al., 2012; McCoy et al., 2013; Warren et al., 2013; Tahara et al., 2014). In fact, the study by Kostic et al. (2012) is considered a co-discovery of this phenomenon. McCoy et al. (2013) successfully validated the association between Fusobacterium and CRC in a set of matched CRC tumor and normal human colon tissue samples using both pyrosequencing and qPCR analysis of the 16S bacterial rRNA gene. Findings by Mira-Pascual et al. (2015) further confirm this trend, as this group observed a significantly higher presence of F. nucleatum in mucosal samples from CRC patients compared to healthy subjects (as opposed to matched tissue biopsies). Recent studies have also reported a higher presence of Fusobacterium species in human colonic adenomas (polyps) and in stool samples from adenoma and carcinoma patients compared to healthy subjects (Kostic et al., 2012; 2013; McCoy et al., 2013). Furthermore, other studies have expanded these findings to identify potential mechanisms of action of F. nucleatum during tumorigenesis (Rubinstein et al., 2013; Gur et al., 2015). Rubinstein et al. (2013) also indirectly confirm a higher abundance of Fusobacterium in CRC patients by measuring higher F. nucleatum FadA mRNA expression relative to healthy controls.

Materials and methods

Unless otherwise noted, all protocol information was derived from the original paper, references from the original paper, or information obtained directly from the authors. An asterisk (*) indicates data or information provided by the Reproducibility Project: Cancer Biology core team. A hashtag (#) indicates information provided by the replicating lab.

Protocol 1: quantitative PCR for amplification of F. nucleatum from matched normal and tumor human colon cancer specimens

This protocol utilizes quantitative PCR to test the relative abundance of F. nucleatum DNA in gDNA isolated from matched normal and tumor human colon cancer specimens. It is a replication of Figure 2.

Sampling

  • This experiment will include 40 matched samples for a final power of 87.26%.

    • See power calculations for details.

  • Samples are divided into three cohorts:

    • Cohort 1: Colon tumor sample (n = 40)

    • Cohort 2: Matched normal tissue within the same individual (n = 40)

    • Cohort 3: Age/ethnicity-matched normal tissue from additional control individuals (n = 40)

  • Tissue is collected during surgery (partial colectomy, ileocolectomy, colorectal resection, or proctocolectomy) from tumor tissue, adjacent normal tissue, or normal controls. Samples are frozen in liquid nitrogen within 30 min of extraction. Diagnosis is confirmed by a pathologist using histological sections from each sample.

  • Quantitative PCR will be performed for each sample two independent times in technical triplicate for the following:

    • F. nucleatum DNA

    • Prostaglandin transporter—reference gene

Materials and reagents

Reagent | Manufacturer | Catalog # | Comments
--- | --- | --- | ---
Frozen human colon tumor samples and matched normal samples | #iSpecimen | | Data include age, gender, ethnicity, diagnosis, histopathology report
Gentra Puregene Genomic DNA extraction kit | Qiagen | 158667 | Replaces Qiagen 69504
PicoGreen Assay | #Life Technologies | P7589 |
Spectrophotometer | #NanoDrop | ND1000 |
384-well optical PCR plate | #Phoenix Research | MPS-3898 |
Fusobacteria forward qPCR primer | Part of a custom-designed Taqman primer/probe set (Applied Biosystems) | | CAACCATTACTTTAACTCTACCATGTTCA
Fusobacteria reverse qPCR primer | Part of a custom-designed Taqman primer/probe set (Applied Biosystems) | | GTTGACTTTACAGAAGGAGATTATGTAAAAATC
Fusobacteria FAM probe | Part of a custom-designed Taqman primer/probe set (Applied Biosystems) | | TCAGCAACTTGTCCTTCTTGATCTTTAAATGAACC
PGT forward qPCR primer | Part of a custom-designed Taqman primer/probe set (Applied Biosystems) | | ATCCCCAAAGCACCTGGTTT
PGT reverse qPCR primer | Part of a custom-designed Taqman primer/probe set (Applied Biosystems) | | AGAGGCCAAGATAGTCCTGGTAA
PGT FAM probe | Part of a custom-designed Taqman primer/probe set (Applied Biosystems) | | CCATCCATGTCCTCATCTC
TaqMan Universal Master Mix | ABI | #4304437 |
qPCR thermal cycling system | ABI | #4351405 | 7900HT system

  1. Note: The probe sequence in the original manuscript is incorrect. The correct sequence shown here is from Flanagan et al., 2014.

Procedure

  1. Obtain ~40 sets of frozen human CRC tumors with matched normal controls, and an additional control group of age/ethnicity-matched tissue from healthy individuals.

    1. Tissue will have been flash-frozen in liquid nitrogen very soon after harvest.

    2. Pathological data showing positive diagnosis for CRC will be included with samples.

  2. Extract gDNA using Gentra Puregene genomic DNA extraction kit according to manufacturer’s instructions.

  3. Quantify gDNA concentration by Nanodrop spectrophotometer.

  4. Assemble 20 μL qPCR reactions in a 384-well optical PCR plate. Each sample is assayed in triplicate for each primer/probe set. Each reaction contains:

    1. 5 ng of gDNA

    2. 18 μM of each primer

    3. 5 μM of probe

    4. 1 X final concentration of TaqMan Universal Master Mix

  5. Perform amplification and detection of DNA using the following reaction conditions:

    1. 2 min at 50°C

    2. 10 min at 95°C

    3. 40 cycles of 15 s at 95°C and 1 min at 60°C.

  6. Calculate cycle threshold using the automated settings. Analyze and compute ΔΔCT values by normalizing to prostaglandin transporter reference gene.

    1. The mean ΔΔCT values from the technical replicates from the tumor and normal sample will be used to calculate the ratio of tumor versus normal for each matched biopsy.

  7. Repeat steps 3–5 for each sample a second time.

    1. The mean ratios of ΔΔCT values in tumor versus normal sample from the two independent experimental replicates will be calculated for each matched biopsy.
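The normalization in step 6 can be illustrated with a minimal sketch. It assumes the conventional 2^(-ΔΔCT) relative quantification, which the protocol does not spell out, and uses hypothetical CT values and function names; it is not the replicating lab's analysis code.

```python
from statistics import mean

def delta_ct(target_cts, reference_cts):
    """Mean target CT minus mean reference CT across technical replicates."""
    return mean(target_cts) - mean(reference_cts)

def tumor_vs_normal_ratio(tumor, normal):
    """Relative abundance of the target in tumor versus matched normal tissue.

    Assumption: the standard 2^(-ddCT) convention with PGT as reference gene.
    """
    ddct = (delta_ct(tumor["fuso"], tumor["pgt"])
            - delta_ct(normal["fuso"], normal["pgt"]))
    return 2 ** (-ddct)

# Hypothetical CT values for one matched biopsy pair (technical triplicates).
tumor = {"fuso": [28.1, 28.0, 27.9], "pgt": [24.0, 24.1, 23.9]}
normal = {"fuso": [31.0, 31.1, 30.9], "pgt": [24.0, 23.9, 24.1]}

ratio = tumor_vs_normal_ratio(tumor, normal)
print(f"tumor/normal relative abundance: {ratio:.2f}")
```

With two independent experimental replicates, the same function would be applied to each run and the two ratios averaged per matched biopsy, as described in step 7.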

Deliverables

  • Data to be collected:

    • Descriptive data of gDNA samples including: patient sample age/sex, ethnicity, and % area of the tumor involved with necrosis.

    • Purity (A260/280 and A260/230 ratios) and concentration of isolated total gDNA from tumor biopsies.

    • Raw qRT-PCR values, as well as analyzed ΔΔCT values for each tumor and matched biopsy sample. Bar graph of mean relative abundance of F. nucleatum in tumor versus normal colorectal samples (compare to Figure 2A).

Confirmatory analysis plan

This replication attempt will perform the statistical analysis listed below:

  • Statistical analysis of replication data:

    • Note: At the time of analysis, we will perform the Shapiro–Wilk test and generate a quantile–quantile (Q–Q) plot to assess the normality of the data. If the data appear skewed, we will perform the appropriate transformation in order to proceed with the proposed statistical analysis. If this is not possible, we will perform the equivalent nonparametric test (e.g., Wilcoxon signed-rank test).

    • One-sample Student’s t-test using the log of the mean ratios of ΔΔCT values from the two independent experimental replicates, tumor ΔΔCT/matched within individual controls compared to a mean value of zero.

  • Additional exploratory analysis:

    • Two Student’s t-tests with Bonferroni correction comparing absolute values from:

      • Mean tumor Fusobacterium abundance versus within subject matched control (paired)

      • Mean tumor Fusobacterium abundance versus healthy matched control (unpaired)

  • Meta-analysis of original and replication attempt effect sizes:

    • Compute the effect size, compare it against the effect size in the original paper and use a random effects meta-analytic approach to combine the original and replication effects, which will be presented as a forest plot.
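As a rough illustration of the confirmatory test, the sketch below computes a one-sample t statistic and Cohen's d for log-transformed tumor/normal ratios against a mean of zero, plus the Bonferroni-adjusted alpha for the two exploratory tests. The ratios are hypothetical, the base-10 logarithm is an assumption (the plan says only "log"), and the p-value is left out because the Python standard library has no t distribution.

```python
import math
from statistics import mean, stdev

def one_sample_t(values, mu=0.0):
    """Return (t statistic, degrees of freedom, Cohen's d) against mean mu."""
    n = len(values)
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 denominator)
    t = (m - mu) / (s / math.sqrt(n))
    d = (m - mu) / s
    return t, n - 1, d

# Hypothetical mean tumor/normal ratios, one per matched biopsy.
ratios = [8.0, 1.5, 0.7, 12.0, 2.2, 0.9]
log_ratios = [math.log10(r) for r in ratios]  # log base is an assumption

t_stat, dof, d = one_sample_t(log_ratios, mu=0.0)

# Bonferroni correction for the two exploratory t-tests.
bonferroni_alpha = 0.05 / 2

print(f"t = {t_stat:.3f}, df = {dof}, d = {d:.3f}, alpha = {bonferroni_alpha}")
```

The t statistic would then be compared against the t distribution with the printed degrees of freedom in a statistics package.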

Known differences from the original study

All known differences are listed in the 'Materials and reagents' section with the originally used item listed in the comments section. All differences have the same capabilities as the original and are not expected to alter the experimental design. We have added an additional control of matched gDNA from healthy individuals.

Provisions for quality control

The sample purity (A260/280 and A260/230 ratios) of the isolated gDNA from each sample will be reported. All of the raw data, including the analysis files, will be uploaded to the project page on the OSF (https://osf.io/v4se2) and made publicly available.

Power calculations

For a detailed breakdown of all power calculations, see spreadsheet at https://osf.io/yadgq/

Protocol 1

Summary of original data

  • Note: Data estimated from graph reported in Figure 2.

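Sample | Log (mean) | N
--- | --- | ---
1 | -1.5787 | 2
2 | -1.1957 | 2
3 | -0.9277 | 2
4 | -0.8766 | 2
5 | -0.5192 | 2
6 | -0.4468 | 2
7 | -0.4128 | 2
8 | -0.3149 | 2
9 | -0.2936 | 2
10 | -0.2681 | 2
11 | -0.2766 | 2
12 | -0.2383 | 2
13 | -0.234 | 2
14 | -0.2 | 2
15 | -0.1787 | 2
16 | -0.1703 | 2
17 | -0.1617 | 2
18 | -0.1362 | 2
19 | -0.0681 | 2
20 | -0.0298 | 2
21 | 0.034 | 2
22 | 0.0128 | 2
23 | 0.0095 | 2
24 | 0.017 | 2
25 | 0.0213 | 2
26 | 0.0213 | 2
27 | 0.0255 | 2
28 | 0.0128 | 2
29 | 0.017 | 2
30 | 0.0128 | 2
31 | 0.017 | 2
32 | 0.0255 | 2
33 | 0.0213 | 2
34 | 0.0301 | 2
35 | 0.034 | 2
36 | 0.0555 | 2
37 | 0.1362 | 2
38 | 0.1447 | 2
39 | 0.1745 | 2
40 | 0.1915 | 2
41 | 0.2 | 2
42 | 0.2086 | 2
43 | 0.217 | 2
44 | 0.2213 | 2
45 | 0.2596 | 2
46 | 0.4043 | 2
47 | 0.4468 | 2
48 | 0.4511 | 2
49 | 0.4681 | 2
50 | 0.4979 | 2
51 | 0.5064 | 2
52 | 0.5021 | 2
53 | 0.549 | 2
54 | 0.5787 | 2
55 | 0.5787 | 2
56 | 0.5872 | 2
57 | 0.6085 | 2
58 | 0.6213 | 2
59 | 0.6553 | 2
60 | 0.6979 | 2
61 | 0.7234 | 2
62 | 0.7617 | 2
63 | 0.8043 | 2
64 | 0.8298 | 2
65 | 0.966 | 2
66 | 0.9617 | 2
67 | 1.0042 | 2
68 | 1.0128 | 2
69 | 1.017 | 2
70 | 1.0255 | 2
71 | 1.0681 | 2
72 | 1.0596 | 2
73 | 1.0851 | 2
74 | 1.1234 | 2
75 | 1.1958 | 2
76 | 1.3149 | 2
77 | 1.3149 | 2
78 | 1.4085 | 2
79 | 1.6298 | 2
80 | 1.7575 | 2
81 | 1.783 | 2
82 | 1.8723 | 2
83 | 1.9404 | 2
84 | 1.983 | 2
85 | 2 | 2
86 | 2.2553 | 2
87 | 2.4298 | 2
88 | 2.4723 | 2
89 | 2.4723 | 2
90 | 2.5532 | 2
91 | 2.6723 | 2
92 | 2.6893 | 2
93 | 2.9064 | 2
94 | 3.0596 | 2
95 | 3.2425 | 2
96 | 3.3447 | 2
97 | 3.5872 | 2
98 | 3.8 | 2
99 | 4.261 | 2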

Test family

  • Ratio one-sample t-test: α error = 0.05, µ = 0.

Power calculations

  • Ratio t-test and power calculations were performed with R software, version 3.1.2 (Team RC 2014).

Test | Mean | Effect size d | A priori power | Total sample size
--- | --- | --- | --- | ---
Ratio | 0.75893838 | 0.5024568 | 87.26% | 40*

  1. *Forty total ratios (40 tumor, 40 matched controls) will be used.
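The tabulated a priori power can be sanity-checked with a normal approximation to the two-sided one-sample t-test. This is only a sketch: the registered calculation was done in R, presumably with the noncentral t distribution, so the approximation below is expected to land slightly above the reported 87.26%. The function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def approx_power_one_sample(d, n, alpha=0.05):
    """Approximate power of a two-sided one-sample test for effect size d.

    Uses a normal approximation and ignores the negligible opposite-tail
    rejection probability; exact values need the noncentral t distribution.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    return z.cdf(d * sqrt(n) - z_crit)

# Effect size and sample size taken from the table above.
power = approx_power_one_sample(d=0.5024568, n=40, alpha=0.05)
print(f"approximate power: {power:.1%}")
```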

Additional exploratory analysis

Test family

  • Paired Student’s t-test (two-tailed): α error = 0.025.

Power calculations

  • Sensitivity calculations were performed with G*Power software, version 3.1.7 (Faul et al., 2007).

Group 1 | Group 2 | Detectable effect size d | A priori power | Total sample size
--- | --- | --- | --- | ---
Tumor sample | Adjacent matched control | 0.50384 | 80% | 40

Test family

  • Independent Student’s t-test (two-tailed): α error = 0.025.

Power calculations

  • Sensitivity calculations were performed with G*Power software, version 3.1.7 (Faul et al., 2007).

Group 1 | Group 2 | Detectable effect size d | A priori power | Total sample size
--- | --- | --- | --- | ---
Tumor sample | Healthy individual matched control | 0.7007 | 80% | 40
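The two sensitivity values can be approximated with the usual normal-theory formula for the smallest detectable effect size. The sketch assumes 40 pairs for the paired test and 40 samples per group for the independent test; the latter is an assumption, chosen because it is the reading that roughly reproduces the tabulated value. G*Power uses the noncentral t distribution, so its figures (0.50384 and 0.7007) are slightly larger than these approximations.

```python
from math import sqrt
from statistics import NormalDist

def detectable_d(n, alpha, power, two_groups=False):
    """Approximate smallest detectable Cohen's d for a two-tailed test.

    n is the number of pairs (paired test) or the per-group sample size
    (independent test). Normal approximation, not an exact calculation.
    """
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    scale = sqrt(2 / n) if two_groups else sqrt(1 / n)
    return z_sum * scale

d_paired = detectable_d(n=40, alpha=0.025, power=0.80)
d_independent = detectable_d(n=40, alpha=0.025, power=0.80, two_groups=True)
print(f"paired: d ~ {d_paired:.3f}, independent: d ~ {d_independent:.3f}")
```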

References

    1. Collins SM (2014) A role for the gut microbiota in IBS. Nature Reviews Gastroenterology & Hepatology 11:497–505. https://doi.org/10.1038/nrgastro.2014.40
    2. Keku TO, Dulal S, Deveaux A, Jovov B, Han X (2015) The gastrointestinal microbiota and colorectal cancer. American Journal of Physiology - Gastrointestinal and Liver Physiology 308:G351–G363. https://doi.org/10.1152/ajpgi.00360.2012
    3. Kubota Y (1990) [Fecal intestinal flora in patients with colon adenoma and colon cancer]. Nihon Shokakibyo Gakkai Zasshi 87:771–779.
    4. Team RC (2014) R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing.

Article and author information

Author details

  1. John Repass

    ARQ Genetics, Bastrop, United States
    Contribution
    JR, Drafting or revising the article
    Competing interests
    JR: ARQ Genetics is a Science Exchange-associated lab.
  2. Nimet Maherali

    Harvard Stem Cell Institute, Cambridge, United States
    Contribution
    NM, Drafting or revising the article
    Competing interests
    No competing interests declared.
  3. Kate Owen

    University of Virginia, Charlottesville, United States
    Contribution
    KO, Drafting or revising the article
    Competing interests
    No competing interests declared.
  4. Reproducibility Project: Cancer Biology

    Contribution
    RP:CB, Conception and design; Drafting or revising the article
    For correspondence
    nicole@scienceexchange.com
    Competing interests
    RP:CB: EI, FT, JL, NP: Employed by and hold shares in Science Exchange Inc.
    1. Elizabeth Iorns, Science Exchange, Palo Alto, United States
    2. William Gunn, Mendeley, London, United Kingdom
    3. Fraser Tan, Science Exchange, Palo Alto, United States
    4. Joelle Lomax, Science Exchange, Palo Alto, United States
    5. Nicole Perfito, Science Exchange, Palo Alto, United States
    6. Timothy Errington, Center for Open Science, Charlottesville, United States

Funding

Laura and John Arnold Foundation

  • Reproducibility Project: Cancer Biology

The Reproducibility Project: Cancer Biology is funded by the Laura and John Arnold Foundation, provided to the Center for Open Science in collaboration with Science Exchange. The funder had no role in study design or the decision to submit the work for publication.

Acknowledgements

The Reproducibility Project: Cancer Biology core team thank Courtney Soderberg at the Center for Open Science for assistance with statistical analyses. We also thank the following companies for generously donating reagents to the Reproducibility Project: Cancer Biology: American Type Culture Collection (ATCC), Applied Biological Materials, BioLegend, Charles River Laboratories, Corning, DDC Medical, EMD Millipore, Harlan Laboratories, LI-COR Biosciences, Mirus Bio, Novus Biologicals, Sigma–Aldrich, and System Biosciences (SBI).

Copyright

© 2016, Repass et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

  1. John Repass
  2. Nimet Maherali
  3. Kate Owen
  4. Reproducibility Project: Cancer Biology
(2016)
Registered report: Fusobacterium nucleatum infection is prevalent in human colorectal carcinoma
eLife 5:e10012.
https://doi.org/10.7554/eLife.10012
