Replication Study: Biomechanical remodeling of the microenvironment by stromal caveolin-1 favors tumor invasion and metastasis

  1. Mee Rie Sheen
  2. Jennifer L Fields
  3. Brian Northan
  4. Judith Lacoste
  5. Lay-Hong Ang
  6. Steven Fiering
  7. Reproducibility Project: Cancer Biology (corresponding author)
  1. Department of Microbiology and Immunology, United States
  2. MIA Cellavie Inc, Canada
  3. Harvard Medical School, United States
Replication Study
Cite this article as: eLife 2019;8:e45120 doi: 10.7554/eLife.45120

Abstract

As part of the Reproducibility Project: Cancer Biology we published a Registered Report (Fiering et al., 2015) that described how we intended to replicate selected experiments from the paper ‘Biomechanical remodeling of the microenvironment by stromal caveolin-1 favors tumor invasion and metastasis’ (Goetz et al., 2011). Here we report the results. Primary mouse embryonic fibroblasts (pMEFs) expressing caveolin 1 (Cav1WT) demonstrated increased extracellular matrix remodeling in vitro compared to Cav1 deficient (Cav1KO) pMEFs, similar to the original study (Goetz et al., 2011). In vivo, we found higher levels of intratumoral stroma remodeling, determined by fibronectin fiber orientation, in tumors from cancer cells co-injected with Cav1WT pMEFs compared to cancer cells only or cancer cells plus Cav1KO pMEFs; these differences were in the same direction as the original study (Supplemental Figure S7C; Goetz et al., 2011), but not statistically significant. Primary tumor growth was similar between conditions, like the original study (Supplemental Figure S7Ca; Goetz et al., 2011). We found metastatic burden was similar between Cav1WT and Cav1KO pMEFs, while the original study found increased metastases with Cav1WT (Figure 7C; Goetz et al., 2011); however, the duration of our in vivo experiments (45 days) was much shorter than in the study by Goetz et al. (2011) (70 days). This makes it difficult to interpret the difference between the studies, as it is possible that the cells required more time to manifest the difference between treatments observed by Goetz et al. We also found a statistically significant negative correlation of intratumoral remodeling with metastatic burden, while the original study found a statistically significant positive correlation (Figure 7Cd; Goetz et al., 2011), but again there were differences between the studies in terms of the duration of the metastasis studies and the imaging approaches that could have impacted the outcomes. 
Finally, we report meta-analyses for each result.

Introduction

The Reproducibility Project: Cancer Biology (RP:CB) is a collaboration between the Center for Open Science and Science Exchange that seeks to address concerns about reproducibility in scientific research by conducting replications of selected experiments from a number of high-profile papers in the field of cancer biology (Errington et al., 2014). For each of these papers a Registered Report detailing the proposed experimental designs and protocols for the replications was peer reviewed and published prior to data collection. The present paper is a Replication Study that reports the results of the replication experiments detailed in the Registered Report (Fiering et al., 2015) for a 2011 paper by Goetz et al., and uses a number of approaches to compare the outcomes of the original experiments and the replications.

In 2011, Goetz et al. reported that Caveolin-1 (Cav1), an activator of Rho/ROCK signaling (Joshi et al., 2008), remodels the intratumoral microenvironment, facilitating tumor invasion and correlating with increased metastatic burden. By regulating the Rho inhibitor p190RhoGAP, Cav1 expression results in cancer-associated fibroblasts (CAFs) that promote extracellular matrix (ECM) alignment and stiffening (Goetz et al., 2011). As the ECM is stiffened, it may direct cancer cell invasion into the surrounding stroma for eventual metastasis (Wang et al., 2016). To specifically address the role of Cav1 in the tumor stroma, primary mouse embryonic fibroblasts (pMEFs) derived from either wild-type (Cav1WT) or Cav1 knockout (Cav1KO) mice were co-injected with LM-4175 tumor cells, a lung-metastatic subline derived from MDA-MB-231 breast cancer cells (Minn et al., 2005). The number of metastases was increased when Cav1 was present compared to the Cav1KO condition (Goetz et al., 2011). Additionally, there was increased ECM remodeling (e.g. fibronectin fiber alignment) of the primary tumors when Cav1WT pMEFs were present compared to Cav1KO pMEFs (Goetz et al., 2011). Intratumoral fibronectin alignment was correlated with increased metastatic burden, suggesting that Cav1-positive stroma is permissive for tumor progression (Goetz et al., 2011).

The Registered Report for the paper by Goetz et al. described the experiments to be replicated (Figure 7C and Supplemental Figures S2A and S7C), and summarized the current evidence for these findings (Fiering et al., 2015). Since that publication, additional studies have reported that fibronectin assembly by CAFs stimulates cancer cell invasion (Attieh et al., 2017). Several studies have also reported the correlation of stromal Cav1 expression and clinical outcome, with some associating high expression of Cav1 with unfavorable outcome (Chatterjee et al., 2015; Sun et al., 2017) and some associating Cav1 expression with favorable clinical outcome (Eliyatkin et al., 2018; Neofytou et al., 2017). The reported tumor-promoting and tumor-suppressive functions of Cav1 are likely due to cell-specific effects, physiological context, and cancer stage (Celus et al., 2017).

The outcome measures reported in this Replication Study will be aggregated with those from the other Replication Studies to create a dataset that will be examined to provide evidence about reproducibility of cancer biology research, and to identify factors that influence reproducibility more generally.

Results and discussion

Isolation and characterization of Cav1 wild-type and Cav1 knockout primary MEFs

To test the effect of Cav1 in the tumor stroma, we isolated Cav1WT and Cav1KO pMEFs. The experimental approach to isolate and characterize the pMEFs was described in Protocols 1 and 2 of the Registered Report (Fiering et al., 2015). Isolated pMEFs were assessed for Smooth Muscle Actin (SMA) expression to determine if Cav1WT pMEFs had increased expression compared to Cav1KO pMEFs. This was suggested during peer review of the Registered Report as a marker of increased fibroblast activation and ECM remodeling capabilities of the pMEFs (Fiering et al., 2015). We observed similar SMA expression between Cav1WT and Cav1KO pMEFs (Figure 1A,B). The pMEFs were used within a few passages after isolation; however, growth on a stiff substrate (i.e. plastic) can lead to increased SMA expression (Jones and Ehrlich, 2011; Shi et al., 2013), which could have masked any subtle differences in expression between Cav1WT and Cav1KO pMEFs. Indeed, the original study observed that three-dimensional (3D) growth preferentially raised SMA expression in Cav1WT immortalized MEFs close to levels of SMA in pMEFs grown under two-dimensional (2D) conditions (Goetz et al., 2011). We therefore performed a collagen contraction assay to test if Cav1WT pMEFs have increased ECM remodeling capabilities compared to Cav1KO pMEFs. This was reported for immortalized MEFs in the original study; however, for pMEFs the data were ‘not shown’ in the published paper because of the journal policy at Cell restricting the number of supplemental figures allowed (del Pozo, personal communication). Although the data were not reported, the original study stated that the ECM remodeling capabilities of Cav1KO pMEFs were reduced compared to Cav1WT pMEFs, similar to the results reported with immortalized MEFs (Goetz et al., 2011). In this study, we also found Cav1KO pMEFs had decreased contraction compared to Cav1WT pMEFs (Figure 1C–E). This result is consistent with Cav1 contributing to fibroblast contractility. 
To summarize, we did not observe differences in SMA expression between Cav1WT and Cav1KO pMEFs in 2D conditions on a rigid substrate, but did observe contraction in Cav1WT pMEFs that was reduced in Cav1KO pMEFs, a result in the same direction as the original study.

Characterization of Cav1 wild-type and Cav1 knockout pMEFs.

Primary MEFs (pMEFs) from wild-type (WT) or knockout (KO) embryos were examined for increased fibroblast activation and extracellular matrix (ECM) remodeling capabilities in vitro. (A) Western blots of the indicated pMEFs probed with antibodies against caveolin-1 (CAV1), alpha-smooth muscle actin (SMA), and gamma-Tubulin. Numbers indicate individual pMEF clones. (B) Western blot bands were quantified, SMA levels were normalized to Tubulin, and protein expression is presented relative to Cav1WT. Dot plot with means reported as crossbars and error bars representing SD. Number of individual clones per group: Cav1WT = 5, Cav1KO = 4. Exploratory analysis: Student’s two-tailed t-test; t(7) = 1.791, p=0.116; Cohen’s d = 1.20, 95% CI [−0.52, 2.92]. (C) Representative images from collagen contraction assay of the indicated conditions at 24 or 48 hr after plating. (D) Line graph of contraction index, measured as the change in percent of gel area at time of plating, of Cav1WT and Cav1KO pMEFs at the indicated times after plating. Means reported and error bars represent SD. Number of individual clones per group: Cav1WT = 4, Cav1KO = 4. (E) The contraction index was used to calculate the area under the curve (AUC) for each clone. Bar plots for each individual clone tested (numbers indicate same clone number as in A). Dashed lines indicate means of each group. Exploratory analysis: Student’s two-tailed t-test; t(6) = 10.80, p=3.72 × 10⁻⁵; Cohen’s d = 7.64, 95% CI [2.66, 12.62]. Additional details for this experiment can be found at https://osf.io/na5h2/.
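
The effect sizes in this legend can be cross-checked from the reported test statistics. The sketch below assumes the standard identity for a pooled-variance (Student's) two-sample t-test, d = t × sqrt(1/n1 + 1/n2); the function name is illustrative and is not taken from the study's analysis scripts.

```python
import math

def cohens_d_from_t(t_stat, n1, n2):
    """Convert an independent-samples (pooled-variance) t statistic into Cohen's d."""
    return t_stat * math.sqrt(1.0 / n1 + 1.0 / n2)

# Values taken from the figure legend: SMA expression (5 vs 4 clones)
# and contraction AUC (4 vs 4 clones).
sma_d = cohens_d_from_t(1.791, 5, 4)
contraction_d = cohens_d_from_t(10.80, 4, 4)
print(round(sma_d, 2), round(contraction_d, 2))
```

Both values round to the effect sizes stated in the legend (1.20 and 7.64), which suggests that the reported t statistics and Cohen's d values are mutually consistent.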

Subcutaneous tumorigenicity assay of tumor cells co-injected with Cav1WT or Cav1KO primary MEFs

We next used the pMEFs to replicate an experiment to test whether stromal Cav1 remodels the intratumoral microenvironment and facilitates tumor cell metastasis. This experiment is similar to what was reported in Figure 7C and Supplemental Figure 7C of Goetz et al. (2011) and described in Protocols 3 and 4 in the Registered Report (Fiering et al., 2015). Tumor cells engineered to express luciferase (LM-4175) were mixed with Cav1WT or Cav1KO pMEFs, or not mixed with pMEFs (control group), and injected subcutaneously into female nude mice. While the original study also tested the role of p190RhoGAP by injecting LM-4175 cells mixed with p190RhoGAP-silenced Cav1KO pMEFs, this replication attempt did not include this condition. To determine the experimental endpoint, we first performed a pilot experiment and established that mice should be euthanized 45 days after cell injection to maximize the length of time for tumor growth while minimizing animal suffering. Importantly, while the original study did not report tumor sizes, in this pilot experiment tumor burden was determined to be excessive (1.5 cm³) at an experimental endpoint that was 25 days shorter than the original study, which maintained mice for 70 days after injection. Thus, it is possible that tumors grew more rapidly in this replication attempt than in the original study. Following the same time course as the pilot experiment, we injected female nude mice with LM-4175 cells with or without Cav1WT or Cav1KO pMEFs. Similar to the pilot study, we observed criteria warranting euthanasia in some mice (e.g. ulceration at the tumor site), confirming that 45 days after injection was the appropriate endpoint. Before euthanasia, each mouse was injected with luciferin to monitor primary tumor growth and metastasis formation. 
We found there was not a statistically significant difference in primary tumor growth between the three groups (Kruskal-Wallis: H(2) = 0.0439, p=0.978), with a median (Mdn) bioluminescence of 1.92 × 10⁹ photons/sec [n = 10] for LM-4175, 1.92 × 10⁹ photons/sec [n = 26] for LM-4175 plus Cav1WT pMEFs, and 1.64 × 10⁹ photons/sec [n = 25] for LM-4175 plus Cav1KO pMEFs (Figure 2A,B). This compares to a Mdn bioluminescence of 2.16 × 10¹⁰ photons/sec [n = 6] for LM-4175, 1.54 × 10¹⁰ photons/sec [n = 13] for LM-4175 plus Cav1WT pMEFs, and 2.08 × 10¹⁰ photons/sec [n = 15] for LM-4175 plus Cav1KO pMEFs reported in the original study (Goetz et al., 2011).
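
As a quick consistency check on the Kruskal-Wallis result, the sketch below assumes the usual large-sample approximation in which H is referred to a chi-square distribution with k − 1 degrees of freedom. With three groups there are 2 degrees of freedom, and the upper-tail probability then has the closed form exp(−H/2). The function name is illustrative only.

```python
import math

def chi2_sf_df2(h_stat):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-h_stat / 2.0)

# H statistic reported for primary tumor bioluminescence across the three groups.
p_value = chi2_sf_df2(0.0439)
print(round(p_value, 3))
```

The result rounds to the reported p=0.978. This shortcut applies only to the three-group case; other group counts would need a general chi-square tail function.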

Primary tumor growth and metastatic burden from subcutaneous tumorigenicity assay.

Female nude mice were subcutaneously injected with 1 × 10⁶ LM-4175 cells mixed with or without 1 × 10⁶ Cav1WT or Cav1KO pMEFs and monitored for 45 days. (A) At the end of the experiment primary tumors were imaged in vivo. Box and whisker plot of primary tumor photon flux with median represented as the line through the box and whiskers representing values within 1.5 IQR of the first and third quartile. Number of mice per group: LM-4175 only (control group) = 10, LM-4175 plus Cav1WT pMEFs = 26, LM-4175 plus Cav1KO pMEFs = 25. Kruskal-Wallis test on all three groups: H(2) = 0.0439, p=0.978. (B) Representative images of primary tumors in vivo and extracted organs ex vivo. (C) The indicated organs were dissected, imaged ex vivo, and individual metastatic foci were blindly quantified. Box and whisker plots of metastatic foci counts for each organ and total metastatic counts with median represented as the line through the box and whiskers representing values within 1.5 IQR of the first and third quartile (dots represent outliers). Note: the y-axes have been truncated for visualization purposes and exclude two outliers from Total (LM-4175 only), one from Lymph Nodes (LM-4175 plus Cav1WT), seven from Lung (two from LM-4175 only; five from LM-4175 plus Cav1KO), and one from Intestines (LM-4175 only). The excluded outliers were included in the statistical analysis below. Number of mice per group: LM-4175 only = 10, LM-4175 plus Cav1WT pMEFs = 26, LM-4175 plus Cav1KO pMEFs = 25. Planned Wilcoxon-Mann-Whitney comparison on total metastatic counts between LM-4175 only and LM-4175 plus Cav1WT pMEFs: U = 103, uncorrected p=0.318 with a priori alpha level of 0.0167, Bonferroni corrected p=0.954, Cliff’s delta = 0.21, 95% CI [−0.08, 0.46]. Planned Wilcoxon-Mann-Whitney comparison on total metastatic counts between LM-4175 only and LM-4175 plus Cav1KO pMEFs: U = 175.5, uncorrected p=0.062, Bonferroni corrected p=0.185, Cliff’s delta = 0.40, 95% CI [0.10, 0.64]. 
Planned Wilcoxon-Mann-Whitney comparison on total metastatic counts between LM-4175 plus Cav1WT pMEFs and LM-4175 plus Cav1KO pMEFs: U = 389.5, uncorrected p=0.219, Bonferroni corrected p=0.657, Cliff’s delta = −0.20, 95% CI [−0.47, 0.11]. Additional details for this experiment can be found at https://osf.io/bq54u/.
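
The relationship between the statistics quoted in this legend can be illustrated with a short sketch. It assumes the standard definitions: a Bonferroni-corrected p-value is the uncorrected p multiplied by the number of planned comparisons (capped at 1), and Cliff's delta equals 2U/(n1 × n2) − 1. The sign of delta depends on which group is treated as the first sample and on which of the two possible U values is reported, so only magnitudes are compared here. Function names are illustrative.

```python
def bonferroni(p_uncorrected, n_comparisons):
    """Bonferroni-corrected p-value, capped at 1."""
    return min(1.0, p_uncorrected * n_comparisons)

def cliffs_delta_from_u(u_stat, n1, n2):
    """Cliff's delta implied by a Mann-Whitney U statistic and the two group sizes."""
    return 2.0 * u_stat / (n1 * n2) - 1.0

# Three planned comparisons, so the per-test alpha is 0.05 / 3 (about 0.0167).
alpha_per_test = 0.05 / 3

# Values from the legend: (U, n1, n2, uncorrected p).
comparisons = {
    "LM-4175 vs Cav1WT": (103.0, 10, 26, 0.318),
    "LM-4175 vs Cav1KO": (175.5, 10, 25, 0.062),
    "Cav1WT vs Cav1KO": (389.5, 26, 25, 0.219),
}
for label, (u, n1, n2, p) in comparisons.items():
    delta = abs(cliffs_delta_from_u(u, n1, n2))
    print(label, round(delta, 2), round(bonferroni(p, 3), 3))
```

The magnitudes reproduce the reported effect sizes (0.21, 0.40, 0.20). The corrected p-value for the second comparison comes out at 0.186 rather than 0.185, presumably because the legend's correction was applied to the unrounded p-value.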

To assess metastatic burden, we excised the same organs examined in the original study and reimaged them ex vivo. Similar to the original study, we observed that the incidence of metastatic foci, when considering all the mice examined, was highest in the lymph node (RP:CB: 41% (25 out of 61 mice); Goetz et al., 2011: 77% (27 out of 35)). In this replication attempt we observed the lowest incidence of metastatic foci in the kidney and liver (13% (8 out of 61)) while the original study observed the lowest incidence in the spleen (37% (13 out of 35)). When considering the total number of metastatic foci detected among all the examined organs, we found mice injected with LM-4175 cells formed a median of 0 metastatic foci (range: 0–115; incidence: 40% (4 out of 10 mice)), mice injected with LM-4175 cells plus Cav1WT pMEFs formed a median of 3 metastatic foci (range: 0–25; incidence: 62% (16 out of 26)), and mice injected with LM-4175 cells plus Cav1KO pMEFs formed a median of 5 metastatic foci (range: 0–33; incidence: 84% (21 out of 25)) (Figure 2A,C). The original study reported a median of 3.5 (range: 0–4; incidence: 67% (4 out of 6 mice)) for LM-4175, 26 (range: 1–67; incidence: 100% (12 out of 12)) for LM-4175 plus Cav1WT pMEFs, and 8 (range: 0–34; incidence: 93% (14 out of 15)) for LM-4175 plus Cav1KO pMEFs (Goetz et al., 2011). There are multiple approaches that could be taken to explore these data; however, to provide a direct comparison to the original data, we conducted the analysis specified a priori in the Registered Report (Fiering et al., 2015). To test if the number of metastatic foci differed between the three groups we performed three planned comparisons, which were not statistically significant (see Figure 2 figure legend). Interpretation of the metastatic burden should take into consideration the shorter time from cell injection until euthanasia in this replication attempt, which was 25 days (36%) shorter than in the original study. 
To summarize, for assessment of metastasis formation we found results that were not in the same direction as the original study and not statistically significant.

There are a number of factors that can affect the evaluation of tumor growth and metastasis formation using bioluminescence imaging. For in vivo imaging, the depth and location of the tumor, as well as the thickness or color of the animal’s skin, can alter the bioluminescent signal (Baba et al., 2007). The type of anesthetic used can impact the luciferase reaction (Keyaerts et al., 2012), as can the route of injection of D-luciferin. While both studies used an intraperitoneal injection, there can be variation in the signal due to changes in the rate of absorption across the peritoneum (Close et al., 2010). Thus, intravenous and subcutaneous administration of D-luciferin have been suggested as alternatives (Keyaerts et al., 2008; Khalil et al., 2013). The imaging time postinjection, as well as differences in instrumentation settings, can also affect the sensitivity of the bioluminescent signal (Burgos et al., 2003; Rettig et al., 2006). Additionally, the animal diet can affect the background gut phosphorescence, with standard mouse chow containing plant material displaying greater phosphorescence than a diet without plant material (Zinn et al., 2008). Finally, an immune response against luciferase has also been reported to restrict tumor growth and metastatic potential of luciferase expressing tumor cells (Baklaushev et al., 2017).

As noted above, the difference in experimental timing could have had important effects on both the extent and patterns of metastases observed. There are numerous cellular processes that tumor cells must accomplish to form metastases, including evasion of immune responses and programmed cell death, invasion of the host stroma, escape through vasculature and/or lymphatics, and survival and growth in distant sites (Chambers et al., 2002). Thus, there are multiple steps during malignant progression that are influenced by a number of factors, particularly time. Most experimental systems, however, do not model all of the steps necessary for metastasis formation (Saxena and Christofori, 2013). Subcutaneous approaches, such as the model used in the original study and this replication, can robustly model in vivo tumor growth, as well as local invasion toward skin mesenchyme, but do not reliably recapitulate metastatic behavior, likely because of ectopic anatomical context (Antonello and Nucera, 2014; Pearson and Pouliot, 2013). Experimental timing of primary tumor growth and spontaneous metastasis is important to maintain to minimize confounding variables, especially since growth is nonlinear (Tyuryumina and Neznanov, 2018). This can be complicated when primary tumor growth necessitates the sacrifice of animals before sufficient time for metastatic development. Monitoring and reporting tumor growth, such as tumor volumes for each animal at the experimental endpoint, can allow for mitigation strategies if there are variations in the growth of tumors between studies. For example, the primary tumor could be resected at a specific time, or tumor size, to allow for a longer follow-up of metastasis development. Additionally, other growth monitoring criteria, such as using biomarkers to visualize metastatic burden in vivo without sacrificing animals, should be considered in the experimental design of future studies.

Intratumoral stroma remodeling

In addition to monitoring metastasis formation we blindly examined intratumoral stroma remodeling in a random subset of the primary tumors. Tumor sections were stained for fibronectin and SMA using the same antibodies and protocol as the original study. Fibronectin staining gave specific staining with little background in control conditions (Figure 3—figure supplement 1A); however, there was high non-specific staining observed with SMA (Figure 3—figure supplement 1B). In an attempt to reduce the non-specific staining we included a mouse-on-mouse blocking step since a mouse anti-SMA antibody was being used on mouse tissue. While this reduced background staining, we observed heterogeneity in the patterns (e.g. fibrillar structures, bright dots) and intensities within the tumors (Figure 3—figure supplement 1C). This introduced an unanticipated difficulty in needing to separate out the bright dots, which appeared specific based on the controls, from the fibrous SMA. As such, we did not conduct the SMA analysis that was outlined in the Registered Report.

We next quantified intratumoral orientation of fibronectin fibers from 10 random images per tumor. As specified in the Registered Report (Fiering et al., 2015), we attempted to determine fibronectin orientation using MetaMorph software as described in the original study, but found that the Integrated Morphometry Analysis (IMA) function to reveal objects of interest was unable to be executed as there were too many objects to process (see detailed approach in Materials and methods). Instead, we created a workflow using the KNIME analytics platform (Berthold et al., 2007) that allows the integration of ImageJ commands into a single workflow to ensure all images of the dataset are processed in an identical manner. We were unable to perform the exact methodology since there were thousands of objects remaining in control conditions after performing the 35% threshold at the maximum internal intensity as prespecified in the Registered Report. The number of objects detected at this step was higher than the number the MetaMorph IMA function could handle. A large portion of these objects were very small; therefore, we included an additional parameter that selected objects that were above a certain size (Figure 3—figure supplement 2A). The fibronectin fiber orientation among the various images was determined by arbitrarily setting the mode angle, which represents the angle with the most fibers observed, to 0° for each image and then calculating the average percentage of fibers oriented within 20° of the mode angle (i.e. −20° to 20°) (Amatangelo et al., 2005). We found the percentage of fibers within 20° of the mode was higher in tumors from LM-4175 plus Cav1WT pMEFs [Mdn = 44.1%, interquartile range (IQR) = 41.3–45.2%, n = 8] or LM-4175 cells [Mdn = 44.6%, IQR = 38.0–46.0%, n = 5] compared to tumors from LM-4175 plus Cav1KO pMEFs [Mdn = 41.2%, IQR = 40.0–41.8%, n = 7] (Figure 3A,B, Figure 3—figure supplement 3A,B). 
To test if the orientation of fibronectin fibers differed, we performed the two planned comparisons outlined in the Registered Report (LM-4175 vs LM-4175 plus Cav1WT pMEFs; LM-4175 plus Cav1WT pMEFs vs LM-4175 plus Cav1KO pMEFs), which were not statistically significant (see Figure 3 figure legend). This compares to the original study that reported a statistically significant increase in fibronectin fiber alignment when LM-4175 cells were co-injected with Cav1WT pMEFs [Mdn = 52.5%, IQR = 43.6–55.4%, n = 8] compared to LM-4175 cells [Mdn = 36.9%, IQR = 36.8–37.7%, n = 5] or LM-4175 plus Cav1KO pMEFs [Mdn = 42.2%, IQR = 40.8–43.2%, n = 10], suggesting stromal Cav1 remodels the intratumoral microenvironment (Goetz et al., 2011). To summarize, we found results that were in the same direction as the original study and not statistically significant where predicted. Interpretation of these results should take into consideration the changes in analysis workflow between the original and replication studies. It is unknown what the impact of the change in methods is, since the workflow used for the original study could not be implemented on the replication data and vice versa. Despite these differences, the median percentage of fibronectin fibers oriented within 20° of the mode angle across all tumors was similar between this replication attempt [Mdn = 42.1%, IQR = 39.1–44.7%, n = 20] and the original study [Mdn = 42.5%, IQR = 37.9–45.1%, n = 23]. Nonetheless, the fibrillar nature of fibronectin staining might not be fully captured due to variations in staining and imaging (e.g. image noise), a common challenge that affects the quality of fluorescence-based images because of the low-light nature of the signal. It was unclear if this occurred in the original study, and if so, what was performed to manage this. 
Others have suggested methods to evaluate noise (Heintzmann et al., 2018; Murray, 2007), with management steps implemented in either the equipment or in the image processing and analysis protocol. Additionally, data analysis should be done blinded to conditions and batch processed, with specific details of what will occur stated prior to data collection, such as in a pre-registered analysis plan, to minimize confirmation bias (Wagenmakers et al., 2012).
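
The alignment metric described above (percentage of fibers within 20° of the mode angle) can be sketched in a few lines. This is an illustration under stated assumptions, not the KNIME/ImageJ workflow used in this study: it assumes that per-fiber orientation angles have already been extracted from an image, that orientations are axial (0° and 180° are equivalent), and that the mode is taken as the center of the most populated histogram bin. The 10° bin width and the example angles are hypothetical.

```python
from collections import Counter

def percent_within_mode(angles_deg, window=20.0, bin_width=10.0):
    """Percentage of fibers oriented within +/- window degrees of the mode angle."""
    if not angles_deg:
        raise ValueError("no fiber angles supplied")
    # Fiber orientations are axial: 0 and 180 degrees describe the same direction.
    axial = [a % 180.0 for a in angles_deg]
    # Histogram the angles; the most populated bin defines the mode angle
    # (ties are broken in favor of the lowest bin).
    bins = Counter(int(a // bin_width) for a in axial)
    mode_bin = max(bins, key=lambda b: (bins[b], -b))
    mode_angle = (mode_bin + 0.5) * bin_width
    # Express each angle relative to the mode (mode set to 0), wrapped into [-90, 90).
    relative = [((a - mode_angle + 90.0) % 180.0) - 90.0 for a in axial]
    inside = sum(1 for r in relative if -window <= r <= window)
    return 100.0 * inside / len(axial)

# Hypothetical angles for one image (degrees), for illustration only.
example_angles = [0, 5, 8, 3, 175, 90, 45, 60, 120, 2]
print(percent_within_mode(example_angles))
```

In this toy example six of the ten fibers fall within the ±20° window. A per-tumor value would then be an average over that tumor's images, as described in the text.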

Figure 3 (with 4 supplements)
Intratumoral fibronectin fiber orientation and correlation to metastasis.

A random subset of the primary tumors from the subcutaneous tumorigenicity assay (20 of 61 mice) were stained for fibronectin and analyzed to determine the average percentage of fibers oriented within 20° of the modal angle. (A) Bar graphs of the frequency of fibronectin fiber angle plotted relative to the modal angle (set at 0°). Means reported and error bars represent s.e.m. Number of mice, and thus tumors, per group: LM-4175 only = 5, LM-4175 plus Cav1WT pMEFs = 8, LM-4175 plus Cav1KO pMEFs = 7. Values reported above bar graphs indicate the median and interquartile range (IQR) of percent of fibers oriented within 20° of the modal angle (−20° to 20°, represented as purple bars) for each tumor. Planned Wilcoxon-Mann-Whitney comparison on percent of fibers oriented within 20° of the modal angle between LM-4175 only and LM-4175 plus Cav1WT pMEFs: U = 19, uncorrected p=0.884 with a priori alpha level of 0.025, Bonferroni corrected p>0.99. Planned Wilcoxon-Mann-Whitney comparison on percent of fibers oriented within 20° of the modal angle between LM-4175 plus Cav1WT pMEFs and LM-4175 plus Cav1KO pMEFs: U = 13, uncorrected p=0.0826, Bonferroni corrected p=0.165. (B) Different fields of view (fov) of the primary tumors immunostained for fibronectin and Hoechst. Three independent tumors derived from LM-4175 only (first column, top to bottom: tumor 2 fov5, tumor 4 fov9, tumor 34 fov3), LM-4175 plus Cav1WT pMEFs (second column, top to bottom: tumor 16 fov7, tumor 51 fov7, tumor 61 fov2), and LM-4175 plus Cav1KO pMEFs (third column, top to bottom: tumor 20 fov4, tumor 30 fov1, tumor 47 fov6). Scale bar: 50 µm. Fibronectin signal is pseudo-colored in red (microscope peak emission wavelength: 614 nm) and Hoechst signal is pseudo-colored in blue (microscope peak emission wavelength: 454 nm). 
Images are maximum intensity projections of the Z-stacks, corrected for background (as described in Materials and methods - Fibronectin fiber analysis) and displayed in the same range of grey levels. (C) Scatter plot of percentage of fibers within 20° of the modal angle and total number of metastatic counts for 20 tumors analyzed for fibronectin orientation. Line represents Spearman rank correlation and light gray region represents 95% CI. Spearman rank-order correlation analysis: rs(18) = −0.50, p=0.025. Additional details for this experiment can be found at https://osf.io/bq54u/.

We also explored additional methods to examine fiber orientation. Anisotropy, a measure of orderly structure, was measured with FibrilTool (Boudaoud et al., 2014; proposed by the original authors during preparation of the Registered Report); coherency, the degree to which the local features are oriented, was measured with OrientationJ (Rezakhaniha et al., 2012); and a blinded manual scoring was performed to assess the frequency of parallel fibers. These additional measures were found to be well correlated with the percentage of fibers oriented within 20° of the mode angle (Figure 3—figure supplement 2B). The same statistical comparisons between the three groups that were performed above were also explored, which gave similar results (Figure 3—figure supplement 2C). Although the full range of possible methods was not explored, these concordant results indicate the robustness of the findings (Silberzahn et al., 2018; Steegen et al., 2016).

Finally, intratumoral fibronectin fiber alignment was examined to determine if there was a correlation with metastasis formation. Results of the Spearman’s rank-order correlation indicated that there was a statistically significant negative relationship between the number of metastatic foci and percentage of fibers within 20° of the mode (rs(18) = −0.50, p=0.025) (Figure 3C). The same type of analysis was reported in the original study, which indicated a statistically significant positive relationship (Goetz et al., 2011). Interpretation of this analysis should take into consideration the results above, especially since the shorter experimental timing could have impacted the number of metastases observed. Additionally, the primary tumors and metastatic foci counts used for the correlation analysis were a random subset of all the mice evaluated in this study. To summarize, we found results that were statistically significant but in the opposite direction to the original study.
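
For readers less familiar with the rank-order correlation used here, the following sketch computes Spearman's rho as the Pearson correlation of average ranks and converts a rho value into the t statistic (with n − 2 degrees of freedom) that is commonly used to obtain its p-value. The helper names and the toy data are hypothetical; only rs = −0.50 and n = 20 come from this study.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

def t_from_rho(rho, n):
    """t statistic (n - 2 degrees of freedom) corresponding to a correlation."""
    return rho * math.sqrt((n - 2) / (1.0 - rho ** 2))

# Toy data (hypothetical): perfectly inverse ordering gives rho = -1.
toy_rho = spearman_rho([1, 2, 3, 4], [10, 9, 8, 7])
# Reported replication result: rs = -0.50 across 20 tumors.
t_reported = t_from_rho(-0.50, 20)
print(round(toy_rho, 2), round(t_reported, 2))
```

The reported correlation corresponds to a t statistic of about −2.45 on 18 degrees of freedom. Converting that to a p-value requires a t-distribution tail function, which the Python standard library does not provide.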

Interpretation of the above results should take into account experimental differences between the original and replication studies. The shorter experimental endpoint (45 days instead of 70 days) could have had important effects on both the extent and patterns of metastases, and could also have affected the fibronectin pattern associated with intratumoral remodeling. That is, the shorter in vivo experimental timing might not have allowed for the same level of metastatic progression to occur in this replication compared to the original study. Additionally, the in vivo ECM remodeling capabilities of the primary fibroblasts used in this study are unknown due to cross reaction during SMA staining despite using the same protocol as the original study. Thus, while the in vitro contractility assay observed a difference between the pMEFs, a larger difference might be required to observe an effect on intratumoral orientation with this experimental design. An examination of the involvement of p190RhoGAP should also be considered in the experimental design of future studies. Importantly, observing different outcomes with similar experimental designs is informative for establishing the range of conditions under which a given effect can be observed (Bailoo et al., 2014).

Meta-analyses of original and replication effects

We performed a meta-analysis using a random-effects model, where possible, to combine each of the effects described above as pre-specified in the confirmatory analysis plan (Fiering et al., 2015). To provide a standardized measure of the effect, a common effect size was calculated for each effect from the original and replication studies. Cliff’s delta (d) is a non-parametric estimate of effect size that measures how often a value in one group is larger than the values from another group, while the effect size r is a standardized measure of the correlation (strength and direction) of the association between two variables. The estimate of the effect size of one study, as well as the associated uncertainty (i.e. confidence interval), compared to the effect size of the other study provides one approach to compare the original and replication results (Errington et al., 2014; Valentine et al., 2011). Importantly, the width of the confidence interval (CI) for each study is a reflection of not only the confidence level (e.g. 95%), but also variability of the sample (e.g. SD) and sample size.
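
To make the pooling step concrete, the sketch below combines two correlation effects with a random-effects model. It assumes the DerSimonian-Laird estimator of between-study variance and a Fisher z-transform of the correlations; the text above does not state which estimator was used, so this illustrates the general technique rather than reproducing the reported analysis. The input correlations and sample sizes are hypothetical.

```python
import math

def fisher_z(r, n):
    """Fisher z-transform of a correlation and its approximate sampling variance."""
    return math.atanh(r), 1.0 / (n - 3)

def random_effects(effects, variances):
    """Pool two or more effects with the DerSimonian-Laird random-effects estimator."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / total
    # Cochran's Q measures disagreement between studies beyond sampling error.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    c = total - sum(w ** 2 for w in weights) / total
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)
    # The between-study variance tau2 is added to each study's own variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1.0 / sum(re_weights))
    # Two-sided p-value from a normal approximation.
    p_value = math.erfc(abs(pooled / se) / math.sqrt(2.0))
    return pooled, se, tau2, p_value

# Hypothetical inputs: two studies reporting opposite correlations of equal size.
z_a, var_a = fisher_z(0.50, 20)
z_b, var_b = fisher_z(-0.50, 20)
pooled_z, se, tau2, p_value = random_effects([z_a, z_b], [var_a, var_b])
pooled_r = math.tanh(pooled_z)
print(round(pooled_r, 3), round(tau2, 3), round(p_value, 3))
```

When two studies point in opposite directions, the estimated between-study variance is large and the pooled effect sits near zero with a wide interval. This is the pattern one would expect for an effect such as a correlation that is positive in one study and negative in the other.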

The comparisons of the primary tumor growth between the three groups of mice, LM-4175 cells injected with or without Cav1WT or Cav1KO pMEFs, which were reported in Figure 2A of this study and Supplemental Figure 7Ca of Goetz et al. (2011), were in the same directions between the two studies and the effect size point estimate of each study was within the CI of the other study (Figure 4A). Furthermore, the meta-analysis was not statistically significant (p=0.467), suggesting primary tumor growth does not change when Cav1 is absent in the tumor stroma.

Figure 4. Meta-analyses of each effect.

Effect size and 95% confidence interval are presented for Goetz et al. (2011), this replication attempt (RP:CB), and a random effects meta-analysis to combine the two effects. The effect size r is a standardized measure of the correlation (strength and direction) of the association between two variables and Cliff’s delta is a standardized measure of how often a value in one group is larger than the values from another group. Sample sizes used in Goetz et al. (2011) and this replication attempt are reported under the study name. (A) Primary tumor growth between mice injected with LM-4175 cells with or without Cav1WT or Cav1KO pMEFs (meta-analysis p=0.467). (B) Total metastasis counts between mice injected with LM-4175 cells and LM-4175 plus Cav1WT pMEFs (meta-analysis p=0.107), mice injected with LM-4175 cells and LM-4175 plus Cav1KO pMEFs (meta-analysis p=0.0027), and mice injected with LM-4175 plus Cav1WT pMEFs and LM-4175 plus Cav1KO pMEFs (meta-analysis p=0.680). (C) Fibronectin fiber orientation (average percentage of fibers oriented within 20° of the mode angle) between tumors from mice injected with LM-4175 cells and LM-4175 plus Cav1WT pMEFs (meta-analysis p=0.269) and mice injected with LM-4175 plus Cav1WT pMEFs and LM-4175 plus Cav1KO pMEFs (meta-analysis p=1.11×10⁻⁴). (D) Rank-order correlation between fibronectin fiber orientation and total metastasis counts (meta-analysis p=0.837). Additional details for these meta-analyses can be found at https://osf.io/rvf57/.

There were three comparisons of total metastasis formation between the three groups, which were reported in Figure 2C of this study and Figure 7Cb of Goetz et al. (2011). The meta-analyses were not statistically significant for the LM-4175 vs LM-4175 plus Cav1WT pMEFs comparison (p=0.107) or the LM-4175 plus Cav1WT pMEFs vs LM-4175 plus Cav1KO pMEFs comparison (p=0.680), but the meta-analysis was statistically significant for the LM-4175 vs LM-4175 plus Cav1KO pMEFs comparison (p=0.0027) (Figure 4B). The direction of the LM-4175 vs LM-4175 plus Cav1KO pMEFs comparison was the same in both the original study and this replication attempt, with the CI of each study encompassing the effect size point estimate of the other study. The effect size point estimates of the LM-4175 vs LM-4175 plus Cav1WT pMEFs and LM-4175 plus Cav1WT pMEFs vs LM-4175 plus Cav1KO pMEFs comparisons for each study, however, were not within the CI of the other study. Additionally, for these two effects, the large CI of the meta-analyses along with statistically significant Cochran’s Q tests (LM-4175 vs LM-4175 plus Cav1WT pMEFs, p=1.54×10⁻⁴; LM-4175 plus Cav1WT pMEFs vs LM-4175 plus Cav1KO pMEFs, p=0.0077) suggest heterogeneity between the studies.
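To make the heterogeneity statement concrete, the following is a minimal sketch of Cochran's Q for two studies in its textbook inverse-variance form. This is an assumption for illustration: the meta-analyses in this study were run with the metafor R package, and the exact weighting used there may differ. The effect sizes and variances below are hypothetical.

```python
import math


def cochran_q(effects, variances):
    """Return (Q, pooled estimate) using inverse-variance weights."""
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need matching effects and variances for >= 2 studies")
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    return q, pooled


def chi2_sf_one_df(q):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom.

    With two studies, Q has k - 1 = 1 degree of freedom, so
    P(chi2 > q) = P(|Z| > sqrt(q)) = erfc(sqrt(q / 2)).
    """
    return math.erfc(math.sqrt(q / 2.0))


# Hypothetical effect sizes and sampling variances (not study data).
effects = [0.6, 0.1]      # original study, replication
variances = [0.05, 0.05]
q, pooled = cochran_q(effects, variances)
p_value = chi2_sf_one_df(q)
print(f"Q = {q:.3f}, pooled = {pooled:.3f}, p = {p_value:.4f}")
```

A small p-value indicates that the two effect estimates differ by more than their sampling error alone would explain.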

There were two comparisons of fibronectin fiber orientation, which were reported in Figure 3A of this study and Supplemental Figure 7Cc of Goetz et al. (2011). Both comparisons were consistent when considering the direction of the effect; however, results varied as to whether the effect size point estimate of one study fell within the CI of the other study (Figure 4C). The meta-analysis for the LM-4175 plus Cav1WT pMEFs vs LM-4175 plus Cav1KO pMEFs comparison was statistically significant (p=1.11×10⁻⁴), which suggests Cav1 expression is necessary for intratumoral fibronectin remodeling; however, the meta-analysis for the LM-4175 vs LM-4175 plus Cav1WT pMEFs comparison was not statistically significant (p=0.269), which along with a statistically significant Cochran’s Q test (p=0.023) suggests heterogeneity between the studies.

Finally, the rank-order correlation used in both studies to assess the association between fibronectin fiber orientation and total metastasis for all three groups, LM-4175 cells injected with or without Cav1WT or Cav1KO pMEFs, reported in Figure 3C of this study and Figure 7Cd of Goetz et al. (2011), was not consistent when considering the direction of the effect (Figure 4D). Furthermore, the meta-analysis was not statistically significant (p=0.837), with a large CI and a statistically significant Cochran’s Q test (p=2.13×10⁻⁵) that suggests heterogeneity between the original study and this replication attempt.

This direct replication provides an opportunity to understand the present evidence of these effects. Any known differences, including reagents and protocol differences, were identified prior to conducting the experimental work and described in the Registered Report (Fiering et al., 2015). However, this is limited to what was obtainable from the original paper and through communication with the original authors, which means there might be particular features of the original experimental protocol that could be critical, but unidentified. So while some aspects, such as cell line, mouse strain, antibodies, and the method to measure metastatic counts, were maintained, others that could affect results were changed during the execution of the replication, such as the time from cell injection until euthanasia, which was shorter in this replication attempt than in the original study. Additionally, other aspects were unknown or not easily controlled for. These include variables such as cell line genetic drift (Ben-David et al., 2018; Hughes et al., 2007; Kleensang et al., 2016), including subclonal drift in heterogeneous stable cells (Shearer and Saunders, 2015), genetic heterogeneity of mouse inbred strains (Casellas, 2011), the microbiome of recipient mice (Macpherson and McCoy, 2015), and housing temperature in mouse facilities (Kokolus et al., 2013). Mutations could also have accumulated during cell passage in vitro and driven cell lines towards a different phenotype that is observed in vivo (Gregoire et al., 2001; Hurlin et al., 1991). Environmental differences such as husbandry staff, bedding type and source, light levels, and other intangibles, all of which, by necessity, differed between the studies, along with bias during welfare assessment and measurement imprecision, can also affect experimental outcomes with mice (Howard, 2002; Jensen and Ritskes-Hoitinga, 2007; Nevalainen, 2014; Sorge et al., 2014).
Differences in imaging instruments are another source of variability that could affect the outcomes between studies. The implementation of standardization procedures for equipment performance (e.g. International Organization for Standardization/Draft International Standard for confocal microscopes currently under development [ISO/DIS 21073]) could provide metrics to compare one instrument to another, facilitating reproducibility. Furthermore, differences in image analysis and batch processing could be another source of variability between studies, illustrating the benefit of documenting all analysis configuration parameters and keeping results connected to the input data (Nanes, 2015). Also, there is the possibility that human cancer cells, such as the LM-4175 cells used in the original study and this replication attempt, may behave differently in mouse models compared to other studies where mouse cancer cells were injected in mice (Capozza et al., 2012). Whether these or other factors influence the outcomes of this study is open to hypothesizing and further investigation, which is facilitated by direct replications and transparent reporting.

Materials and methods

Key resources table
Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information
Cell line (Mus musculus) | Cav1WT pMEFs | This paper | | isolated from embryonic day 14.5 embryos from B6129SF2/J mice (Jackson Laboratory, Stock No. 101045, RRID:IMSR_JAX:101045)
Cell line (M. musculus) | Cav1KO pMEFs | This paper | | isolated from embryonic day 14.5 embryos from Cav1tm1Mls/J mice (Jackson Laboratory, Stock No. 004585, RRID:IMSR_JAX:004585)
Cell line (H. sapiens, female) | LM-4175 | doi: 10.1038/nature03799 | | Expresses HSV-tk1-GFP-Fluc; shared by del Pozo lab, CNIC
Strain, strain background (M. musculus, Athymic Nude-Foxn1nu, female) | athymic nude | Envigo | MGI:5652489 |
Other | Matrigel | Corning | cat# 356234 |
Other | D-luciferin | Promega | cat# P1042 |
Antibody | mouse anti-Caveolin 1 | BD Biosciences | cat# 610406; clone: 2297; RRID:AB_397789 | 1:1000 dilution
Antibody | mouse anti-alpha-SMA | Sigma-Aldrich | cat# A5228; clone: 1A4; RRID:AB_262054 | 1:100 or 1:1000 dilution
Antibody | mouse anti-gamma-tubulin | Sigma-Aldrich | cat# T6557; clone: GTU-88; RRID:AB_477584 | 1:1000 dilution
Antibody | rabbit anti-fibronectin | Sigma-Aldrich | cat# F3648; RRID:AB_476976 | 1:200 dilution
Antibody | HRP-conjugated goat anti-mouse | Thermo Fisher Scientific | cat# 32430; RRID:AB_1185566 | 1:5000 to 1:10,000 dilution
Antibody | Alexa Fluor 594-conjugated donkey anti-rabbit | Jackson ImmunoResearch Laboratories | cat# 711-585-152; RRID:AB_2340621 | 1:300 dilution
Antibody | Alexa Fluor 647-conjugated donkey anti-mouse | Jackson ImmunoResearch Laboratories | cat# 715-605-151; RRID:AB_2340863 | 1:300 dilution
Antibody | rabbit IgG isotype control | Sigma-Aldrich | cat# I5006; RRID:AB_1163659 | 1:200 dilution
Antibody | mouse IgG2a isotype control | Sigma-Aldrich | cat# M5409; clone: UPC-10; RRID:AB_1163691 | 1:100 dilution
Software, algorithm | Qcapture-pro | Teledyne Qimaging | RRID:SCR_014432 | version 6.0.0.605
Software, algorithm | Living Image | Perkin Elmer | RRID:SCR_014247 | version 4.3.1
Software, algorithm | Zen Black Acquisition | ZEISS | RRID:SCR_013672 | version 2.0
Software, algorithm | KNIME | KNIME | RRID:SCR_006164 | version 3.5.1
Software, algorithm | ImageJ | doi:10.1038/nmeth.2089 | RRID:SCR_003070 | version 1.50a
Software, algorithm | Fiji | doi:10.1038/nmeth.2019 | RRID:SCR_002285 | version 2.0.0-rc-34
Software, algorithm | FibrilTool | doi:10.1038/nprot.2014.024 | RRID:SCR_016773 |
Software, algorithm | OrientationJ | doi:10.1007/s10237-011-0325-z | RRID:SCR_014796 | version 2.0.3
Software, algorithm | MetaMorph | Molecular Devices | RRID:SCR_002368 | version 7.10.1
Software, algorithm | Bio-Formats Importer plugin | doi:10.1083/jcb.201004104 | RRID:SCR_000450 | version 5.1.9
Software, algorithm | R Project for statistical computing | https://www.r-project.org | RRID:SCR_001905 | version 3.5.1

As described in the Registered Report (Fiering et al., 2015), we attempted a replication of the experiments reported in Figure 7C and Supplemental Figures S2A and S7C of Goetz et al. (2011). A detailed description of all protocols can be found in the Registered Report (Fiering et al., 2015); the protocols are described below with additional information not listed in the Registered Report, but needed during experimentation.

Cell culture

Cav1WT and Cav1KO pMEFs were isolated from embryonic day 14.5 embryos from B6129SF2/J mice (Jackson Laboratory, Stock No. 101045, RRID:IMSR_JAX:101045) and Cav1tm1Mls/J mice (Jackson Laboratory, Stock No. 004585, RRID:IMSR_JAX:004585), respectively, following the procedure outlined in the Registered Report (Fiering et al., 2015). Multiple pMEF clones were isolated and tested (Figure 1) and clone #6 for Cav1WT and clones #12 and #13 for Cav1KO were used for the animal study. pMEFs were used in all experiments before passage 5. LM-4175 cells (lung metastasis derived from MDA-MB-231 cells) retrovirally infected with a triple-fusion protein reporter construct encoding herpes simplex virus thymidine kinase 1, green fluorescent protein (GFP), and firefly luciferase (HSV-tk1-GFP-Fluc) (Minn et al., 2005) were shared by Dr. Miguel A. del Pozo, Centro Nacional de Investigaciones Cardiovasculares Carlos III (CNIC) at passage 20 and used at passage 23 for experiments. Cav1WT pMEFs, Cav1KO pMEFs, and LM-4175 cells were grown in DMEM (Thermo Fisher Scientific, cat# 11054001) supplemented with 10% fetal bovine serum (FBS), 4 mM L-glutamine, 100 U/ml penicillin and 100 µg/ml streptomycin at 37°C in a humidified atmosphere at 5% CO2. Quality control data for the cell lines are available at https://osf.io/hkdwv/. This includes results confirming the cell lines are free of mycoplasma contamination and common mouse pathogens, as well as STR DNA profiling of the cell lines with LM-4175 matched to MDA-MB-231 (RRID:CVCL_0062) when queried against an STR profile database (IDEXX BioResearch, Columbia, Missouri).

Western blots

Cav1WT and Cav1KO pMEFs (at passage 3) were lysed in RIPA lysis buffer (50 mM Tris-HCl, pH 8.0, 150 mM NaCl, 1% Triton X-100, 0.1% SDS, 0.5% sodium deoxycholate, 1 mM NaF, and 1 mM Na3VO4), supplemented with protease (Roche, cat# 04693116001) and phosphatase inhibitors (Roche, cat# 04906845001) at manufacturer recommended concentrations. Lysed cells were scraped from plates and centrifuged at 14,000 × g for 15 min at 4°C before the protein concentration of the supernatant was quantified using a Bradford assay following manufacturer’s instructions. Lysate samples were separated by SDS-PAGE in 1X Tris-glycine SDS buffer run at 100 V through the stacking part of the gel and 180 V after the proteins had migrated through the resolving gel (15%) until the dye front was at the bottom of the gel, but had not migrated off. Gels were transferred to an Immobilon-P PVDF membrane (Millipore, cat# IPVH00010) and then incubated with 5% non-fat dry milk in 1X TBS with 0.1% Tween-20 (TBST). Membranes were probed with the following primary antibodies diluted in 5% non-fat dry milk in TBST: mouse anti-Caveolin 1 [clone 2297] (BD Biosciences, cat# 610406, RRID:AB_397789), 1:1000 dilution; mouse anti-alpha-SMA [clone 1A4] (Sigma-Aldrich, cat# A5228, RRID:AB_262054), 1:1000 dilution; mouse anti-gamma-tubulin [clone GTU-88] (Sigma-Aldrich, cat# T6557, RRID:AB_477584), 1:1000 dilution. Membranes were washed with TBST and incubated with secondary antibody diluted in 5% non-fat dry milk in TBST: HRP-conjugated goat anti-mouse (Thermo Fisher Scientific, cat# 32430, RRID:AB_1185566), 1:5000 to 1:10,000 dilution. Membranes were washed with TBST and incubated with ECL reagent (Santa Cruz Biotechnology, cat# sc-2048) to visualize signals. Scanned Western blots were quantified using ImageJ software (RRID:SCR_003070), version 1.50a (Schneider et al., 2012). Additional methods and data, including full Western blot images, are available at https://osf.io/na5h2/.

Collagen gel contraction assay

1.5 × 10⁵ Cav1WT or Cav1KO pMEFs were mixed with NaOH-titrated collagen I (Corning, cat# 354249) to a final collagen I concentration of 1 mg/ml in a total of 500 µl. The mixture was immediately transferred to a 24-well ultra-low attachment plate (Corning, cat# 3473) and allowed to solidify at room temperature for about 1 hr. After solidification, 500 µl of cell growth medium was added to each well and gels were dissociated from the well by gently running a 200 µl pipet tip along the gel edge without shearing or tearing the gel. Plates were swirled to ensure the gel was free from the plate and then incubated at 37°C in a humidified atmosphere at 5% CO2 for 48 hr. Images were taken at 24 hr and 48 hr to document contraction. The assay was performed in triplicate for each clone, and no-cell controls (cell growth medium only) were included. Images were acquired with a digital camera at a fixed distance above the gels (Leica MZ16 stereomicroscope and QCapture-pro software (Teledyne QImaging, RRID:SCR_014432), version 6.0.0.605). The gel contraction index was calculated from the gel surface area measured on the acquired images and reported as the percentage of contraction of the initial surface area. This experiment was pre-registered before experimental work began (https://osf.io/9cgk4/). Additional detailed methods and data, including images of gels, are available at https://osf.io/na5h2/.
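The contraction index described above can be sketched as a simple percentage of the initial gel area. The function name and the area values below are hypothetical illustrations, not measurements from this study, and the formula assumes the index is the area lost relative to the initial area.

```python
def contraction_percent(initial_area, current_area):
    """Percentage of the initial gel surface area lost to contraction."""
    if initial_area <= 0:
        raise ValueError("initial_area must be positive")
    return 100.0 * (initial_area - current_area) / initial_area


# Hypothetical triplicate gel areas in mm^2 (not study data).
initial_area = 190.0
areas_at_48h = [95.0, 104.5, 85.5]

per_gel = [contraction_percent(initial_area, a) for a in areas_at_48h]
mean_contraction = sum(per_gel) / len(per_gel)
print(f"per gel: {per_gel}, mean: {mean_contraction:.1f}%")
```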

Subcutaneous tumorigenicity assay

All animal procedures were approved by the Dartmouth College IACUC# 1133 and were in accordance with the Dartmouth College policies on the care, welfare, and treatment of laboratory animals. Eight- to ten-week-old female athymic nude mice (Envigo, Strain: Hsd:Athymic Nude-Foxn1nu, MGI:5652489) were housed (4-5 per cage) in standard ventilated filtered cages, with corn cob bedding and a nestlet for nesting (changed every other week), 12 hr light/dark cycles, and fed sterile rodent chow (Teklad global 18% protein rodent diet (Envigo, cat# 2918)) and acidified water (changed weekly) ad libitum. The mice were housed for approximately 2 weeks before being enrolled in the study. The individual mouse was considered the experimental unit within the studies and inclusion/exclusion criteria (e.g. mice were excluded if injection of tumor cells entered the peritoneum) are described in the Registered Report (Fiering et al., 2015). Housing and experimentation (e.g. injection, IVIS imaging, etc.) were conducted in the same facility, which was kept at 72°F ± 2°F with 30-70% relative humidity.

A pilot study was performed on five mice. Mice were anesthetized with 2-2.5% isoflurane (Patterson Veterinary, cat# 07-893-1389) mixed with 1 L/min of medical grade oxygen and injected subcutaneously with 1 × 10⁶ LM-4175 cells unmixed (four mice) or mixed 1:1 with Cav1WT pMEFs (one mouse) in 100 µl PBS mixed with 100 µl of Matrigel (Corning, cat# 356234) in the flank using a 25-gauge needle. Mice were monitored until visible tumors formed. Once tumor growth was detected in any animal, tumors were measured using precision calipers twice a week, and mice were monitored for signs of distress daily. Mice were euthanized between day 40 post-injection (due to ascites) and day 63 (due to excessive tumor burden). After reviewing tumor measurements, it was determined that day 45 was the target end date according to the IACUC approved protocol.

Following the pilot study, a total of 62 mice were randomized (simple randomization using a random number generator) to receive a subcutaneous injection with 1 × 10⁶ LM-4175 cells unmixed (the control group; 10 mice) or mixed 1:1 with Cav1WT pMEFs (26 mice) or Cav1KO pMEFs (26 mice) as described in the pilot study. Injections occurred on two separate days, with half the mice for each group injected on each day. Mice were euthanized at day 45 post-injection to ensure consistency of collected data and minimize animal suffering per IACUC guidelines. Of note, one mouse, LM-4175 plus Cav1KO pMEFs, was euthanized, and thus excluded, because the tumor was above the approved IACUC protocol tumor limit before the specified endpoint of 45 days. No other mice were excluded, although similar to the pilot study we observed adverse events (e.g. ulceration at the tumor site) confirming that 45 days after injection was the appropriate endpoint. To measure primary tumor growth and metastasis, mice were anesthetized and injected with 100 µl of 30 mg/ml D-luciferin (Promega, cat# P1042) intraperitoneally. Twenty minutes later, mice were placed into an IVIS Spectrum system (Caliper, Xenogen) for imaging of ventral views for photon flux quantification. Following imaging, mice were injected with 50 µl of 30 mg/ml D-luciferin intraperitoneally. Twenty minutes later, mice were euthanized and the primary tumor and the following organs were dissected: lymph nodes, spleen, lungs, liver, intestines, kidneys. Organs were placed, separated from one another, into a 100 mm dish and placed into the IVIS for imaging. The primary tumors were cut in half and frozen in O.C.T. compound by placing the cassette with the tumor into a dry ice/ethanol bath until frozen and then storing at −80°C until shipped on dry ice for image processing.
Anesthesia, luciferin injections, imaging, second luciferin injections, euthanasia, dissection, and imaging/freezing primary tumors were performed during daylight (afternoon hours) with mice from different groups in parallel so variations during the procedure were equal across groups.
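One way to picture the allocation described above (62 mice split 10/26/26, with half of each group injected on each of two days) is the sketch below. It is only an illustration of simple randomization: the seed, the mouse numbering, and the function name are hypothetical, and the random number generator actually used in the study is not specified here.

```python
import random


def randomize_mice(group_sizes, seed=2019):
    """Shuffle mouse IDs, then split them into groups and injection days."""
    n_mice = sum(group_sizes.values())
    rng = random.Random(seed)              # hypothetical seed, for repeatability
    mouse_ids = list(range(1, n_mice + 1))  # hypothetical ID scheme
    rng.shuffle(mouse_ids)

    allocation = {}
    start = 0
    for group, size in group_sizes.items():
        members = mouse_ids[start:start + size]
        half = size // 2
        allocation[group] = {
            "day 1": sorted(members[:half]),
            "day 2": sorted(members[half:]),
        }
        start += size
    return allocation


group_sizes = {
    "LM-4175 only": 10,
    "LM-4175 + Cav1WT pMEFs": 26,
    "LM-4175 + Cav1KO pMEFs": 26,
}
allocation = randomize_mice(group_sizes)
for group, days in allocation.items():
    print(group, {day: len(ids) for day, ids in days.items()})
```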

IVIS imaging

Images were acquired with a Xenogen IVIS Imaging System (Perkin Elmer, 200 Series) and Living Image software (RRID:SCR_014247), version 4.3.1 at a medium binning level and the field of view set at ‘E’. For in vivo imaging, mice were placed into the IVIS with front limbs taped above head and black shields used to block bioluminescence from the primary tumors to visualize metastases in vivo. Exposure time for photon flux quantification of primary tumors, a primary outcome measure, was 0.2 s. After in vivo imaging, dissected organs were imaged ex vivo to detect metastatic foci, a primary outcome measure. Images were taken at multiple exposures (0.2 s, 0.5 s, 1 s, 10 s, 20 s, 60 s, and 120 s) and used to manually quantify visible metastatic foci. Quantification was performed blinded to the cells the animals were injected with. Image files are available at https://osf.io/bq54u/.

Immunofluorescence and confocal microscopy

A random subset (simple randomization from each group) of the cryopreserved primary tumors from the subcutaneous tumorigenicity assay were sectioned (8 µm thick), fixed, permeabilized, and stained as described in the Registered Report (Fiering et al., 2015) with the following primary antibodies diluted in PBS supplemented with 2% BSA overnight at 4°C: rabbit anti-fibronectin (Sigma-Aldrich, cat# F3648, RRID:AB_476976), 1:200 dilution; mouse anti-alpha-SMA [clone 1A4] (Sigma-Aldrich, cat# A5228, RRID:AB_262054), 1:100 dilution. Sections were washed in PBS and incubated with the following secondary antibodies diluted in PBS supplemented with 2% BSA for 1 hr at 37°C: Alexa Fluor 594 conjugated donkey anti-rabbit (Jackson ImmunoResearch Laboratories, cat# 711-585-152, RRID:AB_2340621), 1:300 dilution; Alexa Fluor 647 conjugated donkey anti-mouse (Jackson ImmunoResearch Laboratories, cat# 715-605-151, RRID:AB_2340863), 1:300 dilution. Hoechst dye (1:5000 dilution) was used to counterstain nuclei. Additional controls were included on a subset of the primary tumor sections: rabbit IgG isotype control (Sigma-Aldrich, cat# I5006, RRID:AB_1163659), 1:200; mouse IgG2a isotype control [clone UPC-10] (Sigma-Aldrich, cat# M5409, RRID:AB_1163691), 1:100; secondary antibody only controls. Additionally, in an attempt to reduce the non-specific staining observed with the mouse anti-SMA antibody, we included a mouse-on-mouse blocking step (Vector lab, cat# MKB-2213) before incubation with the primary antibodies as a test on a subset of the samples (Figure 3—figure supplement 1C). Samples were imaged using an LSM 880 upright confocal microscope (ZEISS, Oberkochen, Germany) fitted with a 40X Plan Apochromat NA 1.3 oil immersion objective. Ten random (simple randomization) z-stacks, with a total of 26 slices at 0.3 µm intervals per z-stack, were acquired per sample.
Image acquisition was performed with the laser-scanning confocal microscope running Zen Black Acquisition Software (RRID:SCR_013672), version 2.0. Detailed image acquisition settings are contained in the metadata of the raw images. Image acquisition was performed blinded to the sample identity and the different fields of view were chosen randomly. Image files are available at https://osf.io/bq54u/.

Fibronectin fiber analysis

All image analysis was performed blinded to the sample identity. Image analysis output files are available at https://osf.io/bq54u/. Images were processed using KNIME (www.KNIME.com; RRID:SCR_006164), version 3.5.1 (Berthold et al., 2007). A screenshot of the processing/analysis steps is shown in Figure 3—figure supplement 4. Briefly, the fibronectin channel was selected and Z-stacks were processed for maximum intensity projections. The ImageJ (RRID:SCR_003070) command ‘Subtract background’, with a rolling setting of 50 (a value that was optimized for the current dataset), was applied. Images were then thresholded for 35% intensity, outputting the binary images necessary for subsequent measurements (Figure 3—figure supplement 3B). The ImageJ command ‘Analyze particles’ was then applied, with options set to ‘Iterations = 1’, ‘Count = 1’, ‘black’, ‘Set measurements: area, mean, fit, redirect = None, decimal = 3’, ‘Analyze particles: size = 25.0–infinity, circularity = 0.0–1.0, show=(Outlines)’. This was repeated for objects larger than 25, 50, 150, 300, and 357 pixels and the angle measurements of each object were exported for further analysis. To determine the percent of fibers within 20° of the modal angle, the primary outcome measure, the relative angles were rounded to the nearest 10° using the rounding base function of R, and the mode angle (i.e. the angle with the most fibers observed) was then determined for each image as described previously (Amatangelo et al., 2005; Fiering et al., 2015). The script used to determine the mode angle for each image is available at https://osf.io/qgjme/.
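The primary outcome calculation described above (round each fiber angle to the nearest 10°, find the mode angle, then report the percent of fibers within 20° of that mode) can be sketched as follows. This is not the study's R script, which is available on the OSF; the function names, the tie-breaking rule, the half-up rounding, and the absence of wrap-around handling near 0°/180° are all assumptions made for illustration, and the angles are hypothetical.

```python
import math
from collections import Counter


def round_to_base(angle, base=10):
    """Round an angle to the nearest multiple of `base` (halves round up)."""
    return base * math.floor(angle / base + 0.5)


def percent_within_mode(angles, base=10, window=20):
    """Return (mode angle, percent of fibers within `window` degrees of it)."""
    if not angles:
        raise ValueError("angles must be non-empty")
    rounded = [round_to_base(a, base) for a in angles]
    counts = Counter(rounded)
    # Assumption: ties for the mode are broken by taking the smallest angle.
    mode_angle = min(counts, key=lambda a: (-counts[a], a))
    # Assumption: plain absolute difference, no wrap-around at 0/180 degrees.
    within = sum(1 for a in rounded if abs(a - mode_angle) <= window)
    return mode_angle, 100.0 * within / len(rounded)


# Hypothetical fiber angles in degrees for one image (not study data).
fiber_angles = [12.0, 14.5, 8.0, 33.0, 47.0, 88.0, 91.0, 10.2, 29.9, 61.0]
mode_angle, percent_aligned = percent_within_mode(fiber_angles)
print(f"mode angle = {mode_angle} deg, within 20 deg = {percent_aligned:.1f}%")
```

A higher percentage indicates that more fibers share a common orientation in that image.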

Additionally, for each image, an anisotropy factor and the average angle of the fibers was measured by implementing the FibrilTool macro (RRID:SCR_016773) (Boudaoud et al., 2014) (FibrilTool was converted to a KNIME-node, see: https://osf.io/au4dx/), coherency was measured with the OrientationJ plugin (RRID:SCR_014796), version 2.0.3 (Rezakhaniha et al., 2012), and blinded manual scoring to assess ‘the frequency of parallel fibers’ used the following scale: (1) Not at all (in about 0%), (2) Occasionally (in about 30%), (3) Sometimes (in about 50%), (4) Usually (in about 80%), (5) All are (in about 100%). The values from the ten images for each tumor were averaged to generate a single score for each tumor.

An attempt to determine fibronectin orientation was made using MetaMorph (Molecular Devices, RRID:SCR_002368), version 7.10.1. Images were processed one at a time, not batch processed. In Fiji (RRID:SCR_002285) (Schindelin et al., 2012)/ImageJ, version 2.0.0-rc-34/1.50a (build 927ecc3c7a), the Bio-Formats Importer plugin (RRID:SCR_000450) (Linkert et al., 2010), version 5.1.9 was used to read/open the confocal microscopy raw data. The plugin was configured to split the channels to pursue the processing only on fibronectin. The resulting fibronectin Z-stacks were saved and then read/opened in MetaMorph, and subjected to a maximum intensity projection to generate a single 2D image (Note: this was our interpretation of the original description ‘Overlay Z-slices to make reconstituted views of the corresponding 3-D fibers for each region’). The MetaMorph Background and shading correction function was executed with a setting of 15 pixels (Note: this was our interpretation of the original description ‘Reduce non-specific background by selectively darkening objects with a pixel area greater than 15 using the flatten background function’). Using the MetaMorph internal threshold function, a binary image was created at the 35% setting (i.e. 35% of the pixels have the intensity). The resulting binary image was subjected to the MetaMorph IMA (Integrated Morphometry Analysis) function to reveal objects of interest, which was unable to be executed as there were too many objects to process.

Statistical analysis

Statistical analysis was performed with R software (RRID:SCR_001905), version 3.5.1 (R Development Core Team, 2018). All data, csv files, and analysis scripts are available on the OSF (https://osf.io/7yqmp/). Confirmatory statistical analysis was pre-registered (https://osf.io/s6ndp/) before the experimental work began as outlined in the Registered Report (Fiering et al., 2015). Data were checked to ensure assumptions of statistical tests were met. When described in the results, the Bonferroni correction, to account for multiple testing, was applied to the alpha error or the p-value. The Bonferroni corrected value was determined by dividing the uncorrected value (0.05) by the number of tests performed. A meta-analysis of a common original and replication effect size was performed with a random effects model and the metafor R package (Viechtbauer, 2010) (https://osf.io/rvf57/). Meta-analyses were performed without weighting for Cliff’s d, since unweighted Cliff’s d has been reported to reduce bias (Kromrey et al., 2005). The asymmetric confidence intervals for the overall Cliff’s d estimate were determined using the normal deviate corresponding to the (1 - alpha/2)th percentile of the normal distribution (Cliff, 1993). The raw data pertaining to Figure 7Cb, 7Cd, S7Ca, and S7Cc of Goetz et al. (2011) were shared by the original authors and compared back to the published summary data and figures. The summary data were published in the Registered Report (Fiering et al., 2015) and used in the power calculations to determine the sample sizes for this study.
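The Bonferroni correction described above can be sketched in two equivalent forms: dividing the alpha level by the number of tests, or multiplying each p-value by the number of tests (capped at 1). The p-values below are hypothetical and are not results from this study; the function names are illustrative only.

```python
def bonferroni_alpha(alpha, n_tests):
    """Corrected significance threshold: alpha divided by the number of tests."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def bonferroni_adjust(p_values):
    """Adjusted p-values: each p multiplied by the number of tests, capped at 1."""
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]


# Hypothetical raw p-values from three pairwise comparisons (not study data).
raw_p = [0.004, 0.030, 0.200]

corrected_alpha = bonferroni_alpha(0.05, len(raw_p))
adjusted_p = bonferroni_adjust(raw_p)

# Both forms lead to the same decisions.
significant_by_alpha = [p < corrected_alpha for p in raw_p]
significant_by_p = [p < 0.05 for p in adjusted_p]
print(corrected_alpha, adjusted_p, significant_by_alpha)
```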

Data availability

Additional detailed experimental notes, data, and analysis are available on OSF (RRID:SCR_003238) (https://osf.io/7yqmp/; Sheen et al., 2018). This includes the R Markdown file (https://osf.io/rd3yf/) that was used to compose this manuscript, which is a reproducible document linking the results in the article directly to the data and code that produced them (Hartgerink, 2017). The image analysis workflow generated during this study is available on Amazon Web Services (AWS) as an Amazon Machine Image (AMI). The machine image is located in the N. Virginia (us-east-1) region with the AMI ID: ami-09ee55780b0c19120, and AMI Name: rpcb-analysis-study20. Computation was performed on an Instance Type of m5.4xlarge (16 vCPU, 64 GiB Memory), with 500 GiB of Elastic Block Store (EBS) storage, and running Windows Server 2016. The administrator account password required to log in is ‘RPCB!Analysis’.

Deviations from registered report

Following completion of the Western blot analysis to assess SMA levels in Cav1WT and Cav1KO pMEF clones, we consulted with the original authors regarding the lack of observable change between the two types of pMEFs. As suggested by the original authors we conducted a collagen gel contraction assay to assess ECM remodeling capabilities, which was pre-registered before experimental work began (https://osf.io/9cgk4/). For the subcutaneous tumorigenicity assay, the planned study design indicated the mice would be euthanized 70 days after cell injection, or an earlier time point to not compromise the ability to obtain enough mice for analysis while ensuring no animal suffering. Following a pilot study this was determined to be 45 days after injection, which was confirmed in the experimental study. A different anesthesia than listed in the Registered Report was used during cell and luciferin injections (isoflurane instead of ketamine and xylazine due to availability) as well as a different dose of luciferin (100 µl of 30 mg/ml for the first injection instead of 150 µl of 17.5 mg/ml and 50 µl of 30 mg/ml for the second injection instead of 50 µl of 17.5 mg/ml). Also, as described above (Figure 3—figure supplement 1), we were unable to obtain SMA staining that was specific, based on controls, to allow for the quantification as specified in the Registered Report to be conducted. As such, we did not conduct the analysis that was dependent on the SMA staining outlined in the Registered Report. As described above, we attempted to determine fibronectin orientation using MetaMorph software, but found the processing could not be executed as previously described as there were too many objects to process. Instead, we created a workflow using the KNIME analytics platform. We also explored additional methods to examine fiber orientation as described above. Additional materials and instrumentation not listed in the Registered Report, but needed during experimentation are also listed.
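To put the luciferin deviation in terms of delivered mass, the short sketch below converts each volume and concentration listed above into milligrams per injection. The helper name is hypothetical; the volumes and concentrations are those stated in the text.

```python
def luciferin_mass_mg(volume_ul, concentration_mg_per_ml):
    """Mass of D-luciferin delivered, in mg, from volume (µl) and mg/ml."""
    return volume_ul * concentration_mg_per_ml / 1000.0


doses_mg = {
    "first injection, used": luciferin_mass_mg(100, 30),
    "first injection, registered": luciferin_mass_mg(150, 17.5),
    "second injection, used": luciferin_mass_mg(50, 30),
    "second injection, registered": luciferin_mass_mg(50, 17.5),
}
for label, mass in doses_mg.items():
    print(f"{label}: {mass:.3f} mg")
```

Under this arithmetic, the dose used was 3.0 mg rather than 2.625 mg for the first injection, and 1.5 mg rather than 0.875 mg for the second.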

References

  8. MR Berthold, N Cebron, F Dill, TR Gabriel, T Kötter, T Meinl, P Ohl, C Sieb, K Thiel, B Wiswedel (2007) KNIME: The Konstanz Information Miner. Studies in Classification, Data Analysis, and Knowledge Organization (GfKL 2007), Springer.
  22. L Gregoire, R Rabah, EM Schmelz, A Munkarah, PC Roberts, WD Lancaster (2001) Spontaneous malignant transformation of human ovarian surface epithelial cells in vitro. Clinical Cancer Research 7:4280–4287.
  23. CHJ Hartgerink (2017) Composing reproducible manuscripts using R markdown. eLife. Accessed October 20, 2017.
  36. JD Kromrey, KY Hogart, JM Ferron, CV Hines, MR Hess (2005) Robustness in Meta-Analysis: An Empirical Comparison of Point and Interval Estimates of Standardized Mean Differences and Cliff’s Delta. Statistical Meetings.
  40. JM Murray (2007) Practical Aspects of Quantitative Confocal Microscopy. Methods in Cell Biology, Elsevier.
  42. K Neofytou, E Pikoulis, A Petrou, G Agrogiannis, C Petrides, I Papakonstandinou, A Papalambros, A Aggelou, N Kavatzas, T Liakakos, E Felekouras (2017) Weak stromal Caveolin-1 expression in colorectal liver metastases predicts poor prognosis after hepatectomy for liver-only colorectal metastases. Scientific Reports 7, 10.1038/s41598-017-02251-9, 28515480.
  44. HB Pearson, N Pouliot (2013) Modeling Metastasis In Vivo. Metastatic Cancer: Clinical and Biological Perspectives, Molecular Biology Intelligence Unit, Austin, Landes Bioscience.
  45. R Development Core Team (2018) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.
  52. MR Sheen, JL Fields, B Northan, J Lacoste, L-H Ang, SN Fiering, E Iorns, R Tsui, A Denis, M Haselton, N Perfito, TM Errington (2018) Study 20: replication of goetz. Cell, 10.17605/OSF.IO/7YQMP.
  60. W Viechtbauer (2010) Conducting Meta-Analyses in R with the metafor Package. Journal of Statistical Software 36, 10.18637/jss.v036.i03.

Decision letter

  1. Sean J Morrison
    Senior Editor; Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, United States
  2. Joan Massagué
    Reviewing Editor; Memorial Sloan-Kettering Cancer Center, United States

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Thank you for submitting your article "Replication Study: Biomechanical remodeling of the microenvironment by stromal caveolin-1 favors tumor invasion and metastasis" for consideration by eLife. Your article has been reviewed by Sean Morrison as the Senior Editor, a Reviewing Editor, and four reviewers. The following individuals involved in review of your submission have agreed to reveal their identity: Miguel Del Pozo (Reviewer #1); Kenneth Yamada (Reviewer #4).

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

Summary:

This Replication Study has reproduced some parts of the original paper, but for other parts no statements about reproducibility can be drawn, since the experimental setup was different from the original study.

Essential revisions:

1) The original authors carried out in vivo experiments for 70 days, but this replication study terminated them at 45 days. While the rationale for shortening experiments by 25 days (a net reduction of 36% from original conditions!) is respectable, this makes it impossible to draw any firm conclusion with respect to the reproducibility of the original claims related to metastasis, since metastasis is a very time-dependent process. You need to more clearly acknowledge that you cut the metastasis assay short relative to the original study and therefore cannot draw a conclusion concerning the reproducibility of the metastasis results. This should be done in the Abstract.

2) You DID find similar SMA expression levels across genotypes when cultured in 2D plastic. The description of these experiments (subsection “Isolation and characterization of Cav1 wild-type and Cav1 knockout primary MEFs”) is misleading. The observations were not unexpected: the original study had already compared SMA expression between WT and Cav1KO MEFs in 2D and 3D conditions (Figure S2A, original study).

3) The data in Figure 3 exhibit poor staining quality and an overcompensation in digital levels. Hence, the quantitation is not comparable to that of the original study, and this should clearly be noted in the manuscript.

4) Clearly discuss the problem with the primary fibroblasts used in this study, which appear to be less contractile (see Figure 1C-E) than in the initial study (see Figure 1F). The issue is that these fibroblasts could not be further characterized because of antibody cross reactions during SMA staining (Figure 3). In the absence of further characterization of the fibroblasts used in this study, the authors' conclusions are undermined.

5) Acknowledge and discuss the fact that not using the specific steps of MetaMorph analysis to score fibronectin fiber orientation could have altered the results.

Reviewer #1:

The goal of this Reproducibility Project is undermined because differential results are derived from experiments failing to reproduce key aspects of the original study, thus unsuitable for fair comparison. Furthermore, a bias is put into highlighting such 'differences', belittling observations faithfully replicating conditions and yielding results similar to those of the original study (SMA expression levels; collagen contraction assays; even trends for lower ECM fiber order in tumors bearing Cav1KO fibroblasts despite substantial shortcomings).

Goetz et al. carried out in vivo experiments for 70 days, but this replication study terminated them at 45 days. While the rationale for shortening experiments by 25 days (a net reduction of 36% from original conditions!) is respectable, this fully invalidates this section as a fair assessment of the reproducibility of the original paper. Parameters hinting at how limited the comparison is include the different organ distribution of metastases and the failure to detect their luminescence in vivo. Such a key difference as experiment duration overrules other argued potential sources of variation (see subsection “Subcutaneous tumorigenicity assay of tumor cells co-injected with Cav1WT or Cav1KO primary MEFs”). Metastasis is a non-linear event, and conclusions cannot be drawn if the timespan allowed for growth and progression is drastically different.

Therefore, the authors should acknowledge that in vivo experiments simply cannot be compared. The authors briefly mention these pivotal pitfalls somewhere in the text themselves (see for example subsection “Meta-analyses of original and replication effects” and subsection “Subcutaneous tumorigenicity assay of tumor cells co-injected with Cav1WT or Cav1KO primary MEFs"), but nonetheless highlight such differential observations (even in the Abstract!) as relevant, and compare their potential interpretation. It should be explicitly stated in the Abstract, and clarified in the main text, that these sections are not valid attempts to reproduce the original report. I would even suggest moving upfront passages such as those listed above (subsection “Meta-analyses of original and replication effects” and subsection “Subcutaneous tumorigenicity assay of tumor cells co-injected with Cav1WT or Cav1KO primary MEFs"), as they address the core aim of the replication study: to faithfully reproduce original experimental conditions.

Other sections of the reproducibility study that are misleading, or for which clear limitations should be explicitly acknowledged, are the following:

First: the authors DID recapitulate similar SMA expression levels between genotypes when cultured in 2D plastic. The description of these incomplete experiments (subsection “Isolation and characterization of Cav1 wild-type and Cav1 knockout primary MEFs”) is, however, inaccurate and misleading. Their observations were NOT unexpected: the original study had already compared SMA expression between WT and Cav1KO MEFs in 2D and 3D conditions (Figure S2A, original study). The replication study would have benefited from performing CDM-based experiments (which is how Goetz et al. uncovered the reported changes) in addition to collagen contraction assays.

Second, experiments in Figure 3 exhibit poor staining quality and an overcompensation in digital levels. Hence, their quantitation is simply not comparable to that of the original study (where additional complementary techniques were used, such as SHG microscopy). Setbacks when trying to use the original image analysis tools (based on MetaMorph software) justify their use of a completely different image setting and analysis approach. While trends similar to the observations from Goetz et al. were recorded, these facts (which add to the dismal difference in the duration of the experiments) again preclude a fair comparison between both studies.

In summary, this report unintentionally distorts the purpose of the Reproducibility Project itself, because its experimental execution, interpretation and writing-up are biased, emphasizing differential results despite being irrelevant and not comparable. A major revision of the text should therefore be carried out at the least.

Reviewer #2:

In this replication study, the authors have attempted to validate some of the original findings published in 2011 by the Del Pozo laboratory (Goetz et al., 2011). Based on previous exchange with the reviewers on one hand, and the authors of the original study on the other hand, it was agreed that two experiments of the original study (mainly Figure 7) were worth replicating: it involved mainly a subcutaneous xenograft of breast cancer derived lung metastatic cells in mice where tumor growth and metastasis would be analyzed.

In the present study, some of the findings that were originally published by Goetz et al. could be reproduced. First, they could confirm that primary mouse embryonic fibroblasts (pMEF) that express Cav1 have increased extracellular matrix remodeling capability both in vivo and in vitro. They also found, in agreement with the initial study, that primary growth of the tumor was not affected by the presence of either Cav1 WT or Cav1 KO pMEFs. Unfortunately, they could not reproduce a key result of the original study, namely that metastatic dissemination was increased by fibroblasts expressing Cav1 WT. Indeed, no difference could be found here between Cav1 WT and Cav1 KO pMEFs.

This is the most annoying result of this replication study, as the controversy that exists about Cav1 and fibroblasts deals mainly with the role of Cav1 in metastasis. From the present study, it appears however that the exact experimental conditions could not be reproduced, and that is certainly the main explanation for this discrepancy. I see two main problems with the replicated experiments. First, the primary fibroblasts used in this study appear to be less contractile (see Figure 1C-E) than in the initial study (see Figure 1F). This is certainly an important issue as the study examines the effect of these fibroblasts on tumor growth and metastasis. I am not convinced that the difference that is shown in Figure 1C-E is significant. Another issue is that these fibroblasts could not be further characterized because of antibody cross reactions in SMA staining (Figure 3). I was anticipating these problems and this is why I had recommended in my initial review to also examine the involvement of p190RhoGAP in this process. For me, in the absence of further characterization of the fibroblasts used in this study, the authors cannot draw a conclusion.

The most important issue with this study is the considerable difference in the time of observation of metastatic foci in various organs. While the original study observed metastasis after 70 days, here the observation is done at a considerably earlier time i.e. 45 days that is almost half the time. Considering that metastasis is a secondary event that takes time to occur, it is likely that the observation has been done too prematurely.

In conclusion, the replication study shows several aspects similar to the original study by Goetz et al. regarding the role of Cav1 WT expressing cancer associated fibroblasts in remodeling the extracellular matrix, although this could not be statistically validated. The lack of reproducibility for the metastasis process is most likely due to the impossibility of following the same experimental protocol as the one in Goetz et al. In addition, the fibroblasts injected here present a weaker contractile activity, which may also explain the absence of effect on metastasis. This is clearly an experimental weakness of this study.

In the Discussion section, the authors should discuss these two important discrepancies in more detail rather than invoking a list of experimental features (subsection “Meta-analyses of original and replication effects”) which, if they were truly involved, would be present in all studies and would in fact prevent any experiment from being reproduced. Also, the possibility that human cancer cells (the ones injected here) may behave differently in a mouse model should be discussed, as it could explain the discrepancies observed with other studies where mouse cancer cells were injected in mice (see the Capozza reference cited in the original study and not cited here).

Reviewer #3:

I have attempted to review this from the perspective of the statistical analysis and presentation of the data.

This paper should follow the registered report by Fiering et al., 2015.

I have found the report difficult to follow as the methods employed, results and discussion are all presented in different places.

From what I can judge the authors have presented their results in a clear manner and appear to have carried out the analysis consistently with how it was proposed in the registered report.

Reviewer #4:

The goal of this manuscript to replicate a cancer biology study published in Cell appears well-intentioned and potentially valuable. Unfortunately, the authors make two major changes to the original study protocol that have unknown and potentially substantial consequences, making it impossible in my opinion to provide a valid replication test of the original study – even though multiple other aspects of the approach such as pre-registration and the careful attempts to identify specific reagents and materials were commendable.

1) The first major change, even though seemingly within the parameters of the loosely written original plan that allowed for adjusting study length, has potential major problems. In any spontaneous metastasis assay, extending the time for analysis by 55% (25 days beyond 45 days to reach the 70 days in the original paper) could have major effects on both the extent and patterns of metastases. The current authors used a markedly truncated experimental endpoint for all animals even if many may well have not been in distress. The result is that comparisons of results between the two studies, performed at 45 days versus 70 days, seem impossible. Effects seen in the original study may or may not have been missed, which is impossible to determine. In fact, if the original Registered Report had explicitly proposed substituting an experimental endpoint for spontaneous metastasis of 45 days in place of the original 70 days, it seems unlikely that it would have been approved.

2) This major change in experimental endpoint timing also likely compromised the comparisons of fibronectin fiber patterns. Although it is impossible to know, it is plausible that extending the time of tumor interactions with the local microenvironment by 55% might affect the fibronectin pattern associated with intratumoral remodeling. Unless the original conditions could be met at least approximately, it seems like comparing apples and oranges (or green vs. over-ripe apples).

3) A second potentially important change was complete substitution of alternative image analysis methods concerning fibronectin patterns compared to the Registered Report describing the use of MetaMorph and scoring SMA-positive cells. It is simply not clear whether scoring all cells rather than SMA-positive cells could have affected the results. More importantly, not using the specific steps of MetaMorph analysis to score fibronectin fiber orientation could have altered the results. The authors provide somewhat plausible reasons for not scoring SMA-positive cells (the staining did not work in their hands) and avoiding the use of MetaMorph (they found "too many objects to process") but changing the workflow completely appears to invalidate the attempted replication. Unless the current authors can demonstrate that their altered methods provide the same outputs on the original data as the original methods [e.g., by obtaining primary data from the original authors], these substitutions involving altering the cell population analyzed (all cells vs. SMA+) and image analysis (KNIME analytics vs. MetaMorph) fail to provide direct experimental replication. In fact, a major specific defect of the new analysis is apparent from Figure 3—figure supplement 3 in which the software fails to capture the fibrillar nature of fibronectin staining in 4 out of 9 samples in which average brightness is merely lower: the authors show dots and short lines rather than the interconnected fibronectin fibrillar staining patterns.

4) Although this was not the approach taken by these well-intentioned replicators, this reviewer believes that more direct exchanges of expertise and experience between the original research group and the replication group could have avoided the serious problems listed above that, at present, unfortunately invalidate this well-intentioned replication attempt.
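
The reviews quote the same 25-day gap in several ways (a 36% reduction, a 55% extension, and "almost half the time"), depending on which endpoint is taken as the baseline. The following short sketch works through that arithmetic, using only the 45-day and 70-day endpoints stated in the reviews; the variable names are illustrative.

```python
original_days = 70     # endpoint of the original study, as stated by the reviewers
replication_days = 45  # endpoint of the replication study

gap_days = original_days - replication_days

# Same 25-day gap expressed against each baseline
reduction_pct = 100 * gap_days / original_days          # relative to the 70-day original
extension_pct = 100 * gap_days / replication_days       # relative to the 45-day replication
coverage_pct = 100 * replication_days / original_days   # share of the original duration covered

print(f"gap: {gap_days} days")
print(f"reduction relative to original: {reduction_pct:.1f}%")
print(f"extension relative to replication: {extension_pct:.1f}%")
print(f"replication covers {coverage_pct:.1f}% of the original duration")
```

The reduction rounds to 36% and the extension to about 56% (quoted as 55% above). The 45-day endpoint covers roughly 64% of the original 70 days, which is closer to two-thirds than to half.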

[Editors' note: further revisions were requested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "Replication Study: Biomechanical remodeling of the microenvironment by stromal caveolin-1 favors tumor invasion and metastasis" for further consideration at eLife. Your revised article has been favorably evaluated by Sean Morrison (Senior Editor), a Reviewing Editor, and three reviewers.

The manuscript has been improved but there are some remaining issues that need to be addressed before acceptance. The strong consensus among the referees and reviewing editor is that the shortening of the metastasis period makes it difficult to compare this study with the original one, and the manner in which this point is presented in the revised manuscript is not satisfactory. In particular, you must include additional discussion on your understanding of the relevance of a shorter or longer followup period for metastasis outcome, and the manner in which variations in this important parameter may influence the outcome of the experiments and, with that, the robustness of the conclusions that can be drawn in your study about the reproducibility of the original results.

Reviewer #1:

Fiering et al. complied in this revised version with several previous recommendations concerning the summary text, as well as changes to the main text body of the manuscript. However, the rewriting of the Abstract, while in the requested direction, should still be improved. We cannot help but notice that the authors keep explicitly stating and 'presenting' certain divergent observations as results worth being considered:

"[…] We found metastatic burden was similar between Cav1WT and Cav1KO pMEFs, while the original study found increased metastases with Cav1WT (Figure 7C; Goetz et al., 2011). We also found a statistically significant negative correlation of intratumoral remodeling with metastatic burden, while the original study found a statistically significant positive correlation (Figure 7Cd; Goetz et al., 2011). Finally, we report meta-analyses for each result."

As they subsequently state, these differences can be explained alone by the fact that key experimental details were not reproduced (i.e. the replication study shortened the duration of in vivo experiments by more than 35% as compared to the original study, and they used a different image approach when assessing ECM architecture from microscopy images (apart from additional technical issues, already mentioned in the previous round of review)). Thus, we claim again that these parts of their replication study cannot be subject to fair comparison as such and should not be presented upfront with the same value as the rest of the observations. Moreover, the fact that key experimental aspects were not fully reproduced should be clearly stated before those descriptions. To provide an example of what, in our opinion, should be stated in the Abstract regarding these experiments:

"[…] Two key experimental parameters (experimental timing of in vivo metastasis experiments and image-based analysis of tumoral ECM architecture) did not match with those used in the original study, rendering some observations, such as metastatic burden or its correlation with intratumoral ECM remodeling, not suitable for comparison. Finally, we report meta-analyses for each result."

While we appreciate that an effort has been made trying to correct these issues according to the previous revision round, we think these details are paramount to avoid any ambiguous or misleading messages in the Abstract to potential readers. Besides, this was a common claim of reviewers 2, 4 and us (while the only other referee focuses on the statistical analysis and presentation of the data).

Reviewer #5:

This replication study addressed experiments performed by Goetz et al. (2011) which indicated a critical role of caveolin-1 expressed by myofibroblasts of the tumor stroma in ECM remodeling and its role in inducing distant metastasis formation. The results confirm the originally shown role of caveolin-1 in collagen contraction and, to some extent, in fibronectin matrix alignment in vivo, as well as a lack of involvement in tumor growth at the implantation site. Conversely, this replication study could not replicate the effect of caveolin-1 positive fibroblasts on metastatic evasion, and even indicated an inverse effect between metastasis outcome and fibronectin alignment in the primary tumor.

With the notable exception of a reduced followup period for metastasis after tumor implantation, the experiments were designed and performed with high fidelity and the quality of documentation is excellent. The meta-analysis further explores similarities and differences with competence and is delineated in a comprehensive manner. As a major shortcoming, the explanation justifying the shortened observation period for analyzing metastasis outcome and the discussion on the implications for similar replication work in general are not yet satisfactory.

Specific points:

Subsection “Subcutaneous tumorigenicity assay of tumor cells co-injected with Cav1WT or Cav1KO primary MEFs”: "45 days after cell injection to maximize the length of time for tumor growth while minimizing animal suffering" – it should be stated whether, in this replication study, the tumors grew more rapidly compared to Goetz et al., and whether a different tumor size at the humane endpoint mandated this deviation from the original study.

For a general audience, the authors should include additional discussion on their understanding of the relevance of a shorter or longer followup period for metastasis outcome. In particular, they should discuss how robust the results can be considered and how similar replication work in independent labs should be performed. Is it legitimate to adjust followup periods? Which rules should apply to a robust biological outcome? How can the inverse correlation of metastasis and ECM remodeling in the primary tumor site be explained when compared to the primary work? Which growth monitoring criteria should be applied to achieve high fidelity replication? How should researchers deal with differences in humane endpoint criteria when constrained by a given legal framework? How would the authors perform this part of the work in retrospect, to achieve higher concordance with the original study?

The discussion on intermittent parameters potentially affecting the bioluminescence imaging is of general interest. It is not clear, however, whether these points are relevant here, assuming that the same imaging approach was used as in Goetz et al. (whole-body bioluminescence). However, Goetz et al., additionally used bioluminescence analysis of excised organs. Can the authors clarify why this more sensitive approach was apparently not used here?

[Editors' note: further revisions were requested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "Replication Study: Biomechanical remodeling of the microenvironment by stromal caveolin-1 favors tumor invasion and metastasis" for further consideration at eLife. Your revised article has been favorably evaluated by Sean Morrison (Senior Editor), a Reviewing Editor, and two reviewers.

The manuscript has been improved but there are a few remaining issues that need to be addressed before acceptance, as outlined below:

1) The most critical issue is that the editors and reviewers were not satisfied with the clarity of the Abstract in terms of the way it acknowledges the potential impact of the shorter metastasis assay on the interpretability of the results. Based on extensive consultation among the editors and reviewers, their concerns would be resolved if you would be willing to replace the last four sentences in the Abstract with the following text:

We found metastatic burden was similar between Cav1WT and Cav1KO pMEFs, while the original study found increased metastases with Cav1WT (Figure 7C; Goetz et al., 2011); however, the duration of our in vivo experiments (45 days) were much shorter than in the study by Goetz et al. (2011) (75 days). This makes it difficult to interpret the difference between the studies as it is possible that the cells required more time to manifest the difference between treatments observed by Goetz et al. We also found a statistically significant negative correlation of intratumoral remodeling with metastatic burden, while the original study found a statistically significant positive correlation (Figure 7Cd; Goetz et al., 2011), but again there were differences between the studies in terms of the duration of the metastasis studies and the imaging approaches that could have impacted the outcomes. Finally, we report meta-analyses for each result.

Beyond this change to the Abstract, the only other changes that are required are minor changes to the text of the manuscript to address the specific points raised by reviewer #5 below. The comments from reviewer #1 are included below to provide context but would be entirely addressed by the change in the Abstract noted above.

Reviewer #1:

I regret to note that the authors have missed a key point that required their attention, even though it was explicitly stated in my previous comments and my guiding suggestions on the lines of change I and other reviewers considered appropriate.

I quote here again the point I aimed at getting across: […] We cannot help noticing the authors keep explicitly stating and 'presenting' certain divergent observations as results worth being considered […]

This clearly referred to two sentences in the Abstract, which were still included in that previous revision and in the latest version of the manuscript:

"[…] We found metastatic burden was similar between Cav1WT and Cav1KO pMEFs, while the original study found increased metastases with Cav1WT (Figure 7C; Goetz et al., 2011). We also found a statistically significant negative correlation of intratumoral remodeling with metastatic burden, while the original study found a statistically significant positive correlation (Figure 7Cd; Goetz et al., 2011) […]."

I contend again that these parts of their replication study cannot be subject to fair comparison, and should therefore not be presented upfront in this way, equating them with the rest of the observations, as if they were to be considered as data obtained through appropriate experimental replication. The way the Abstract keeps being presented is misleading (perhaps unintentionally), and leaves room for the wrong interpretation that those in vivo experiments deserve being considered as valid experiments that 'may have yielded a different outcome' for this reason or another. This is not useful for the readership of eLife and is damaging to the general aim of this reproducibility initiative, and I respectfully suggest again modifying this text following this example:

"[…] Two key experimental parameters (timing of in vivo metastasis experiments and image-based analysis of tumor ECM architecture) did not match those used in the original study, rendering some observations, such as metastatic burden or its correlation with intratumoral ECM remodeling, not suitable for comparison […]".

I did note and acknowledge that certain changes had partly been made in a good direction, such as the sentence at the end of the Abstract the authors bring up in their response, where they admit that key experimental aspects had not been reproduced and "could have impacted the outcomes [sic]". However, and notwithstanding the previous key point, this statement should at the very least be written before listing those experiments they performed under different conditions.

I am compelled to state again that these details are essential to preserve the original aim of the reproducibility initiative and provide a fair and non-misleading message to its readership. It must be noted this was a common claim of reviewers 2, 4 and us, and should therefore be fully complied with before publication.

Reviewer #5:

The authors have now included a specific justification for the humane endpoint, and it is now clear that the procedure was adequate.

The implications of shortening the observation period, and of the incomplete reporting of primary tumor burden at the endpoint in the original study, could still be discussed with more care.

Specific points:

1) The authors state in subsection “Subcutaneous tumorigenicity assay of tumor cells co-injected with Cav1WT or Cav1KO primary MEFs”: "Although experimental timing is important, maintaining it might not be sufficient to observe the same malignant progression between studies." It is not clear what is meant by "maintaining" timing: the same follow-up period until the metastatic endpoint? The authors should acknowledge that reproducing the timelines of primary tumor growth and spontaneous metastasis as exactly as possible is critical for minimizing confounding parameters, because the steps of the multi-step cascade to metastasis, including invasion, organ colonization and outgrowth, are strongly time-dependent processes. In addition, growth of metastases is nonlinear, i.e. a few days at later time points might impose major differences in aggregate tumor burden. This should be discussed in more detail.

2) In addition, it would be important to provide a recommendation on how such inconsistencies can be mitigated in future work, for example by (i) reporting the tumor volumes for each animal at the endpoint, (ii) resecting the primary tumor by a standardized measure (e.g., size or time), to allow for a follow-up of metastasis development. This would allow metastasis to be monitored independently of primary tumor load and of premature humane endpoints caused by variations in growth of the primary.

3) The readability of labels in Figures 2A, C and Figure 3C remains unacceptable and should be improved.
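
The nonlinearity raised in point 1 can be illustrated with a minimal sketch. It assumes idealized exponential growth and a purely hypothetical doubling time (`DOUBLING_TIME_DAYS`); neither the growth model nor the value comes from the original study or the replication, and only the 45- and 70-day endpoints are taken from the review.

```python
def relative_burden(days: float, doubling_time_days: float) -> float:
    """Burden relative to day 0 under idealized exponential growth."""
    return 2.0 ** (days / doubling_time_days)


# Hypothetical doubling time, chosen only for illustration (not a measured value).
DOUBLING_TIME_DAYS = 5.0

burden_45 = relative_burden(45, DOUBLING_TIME_DAYS)  # replication endpoint
burden_70 = relative_burden(70, DOUBLING_TIME_DAYS)  # original endpoint

# Under this model the ratio depends only on the 25-day gap: 2 ** (25 / doubling time).
fold_difference = burden_70 / burden_45
print(f"fold difference between day 70 and day 45: {fold_difference:.0f}x")
```

Under these assumptions a 25-day gap corresponds to five doublings, so even a modest change in the assumed doubling time would shift the expected difference considerably.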

https://doi.org/10.7554/eLife.45120.sa1

Author response

Summary:

This Replication Study has reproduced some parts of the original paper, but for other parts no statements about reproducibility can be drawn, since the experimental setup was different from the original study.

Essential revisions:

1) The original authors carried out in vivo experiments for 70 days, but this replication study terminated them at 45 days. While the rationale for shortening experiments by 25 days (a net reduction of 36% from original conditions!) is respectable, this makes it impossible to draw any firm conclusion with respect to the reproducibility of the original claims related to metastasis, since metastasis is a very time-dependent process. You need to more clearly acknowledge that you cut the metastasis assay short relative to the original study and therefore cannot draw a conclusion concerning the reproducibility of the metastasis results. This should be done in the Abstract.

We have revised the Abstract to state this key difference.

2) You DID find similar SMA expression levels across genotypes when cultured in 2D plastic. The description of these experiments (subsection “Isolation and characterization of Cav1 wild-type and Cav1 knockout primary MEFs”) is misleading. The observations were not unexpected: the original study had already compared SMA expression between WT and Cav1KO MEFs in 2D and 3D conditions (Figure S2A, original study).

We have revised this description to reflect this comment and underscore the ‘subtle differences’ between WT and KO pMEFs that were communicated to us by the original authors when we sought feedback about this result. Importantly, this experiment was suggested during peer review of the Registered Report as a method to examine ECM remodeling capabilities in vitro, before using the pMEFs for the in vivo experiment. Thus, we were not expecting the result we observed. However, it was communicated to us that a number of factors can influence whether a difference will be observable (e.g. plastic, time after isolation, 2D vs 3D, etc), which we included in this manuscript. And as described in subsection “Deviations from Registered Report”, a recommendation to conduct a collagen contraction assay was agreed with the original authors and pre-registered before experimental work began.

3) The data in Figure 3 exhibit poor staining quality and an overcompensation in digital levels. Hence, the quantitation is not comparable to that of the original study, and this should clearly be noted in the manuscript.

We are unsure what evidence was used by the reviewer to make the statement that the staining was of poor quality. The staining specificity, using two negative controls (no primary antibody or isotype control antibody) was reported in this replication attempt (Figure 3—figure supplement 1), although these staining controls do not appear to have been reported in the original study. This comment may be referring to image noise, the random fluctuation of light intensity contained in images, which is a common issue that affects the quality of fluorescence-based images because of the low-light nature of the signal. In microscopy, noise is produced by the sensor and circuitry of the detectors used (e.g. camera and PMTs) and by the unavoidable shot noise, which relates to the number of photons reaching the detector. Since fluorescence imaging demands a sufficient signal-to-noise ratio (SNR) to detect the features of interest, noise can have a negative impact on the subsequent processing and analysis steps. Therefore, it is best to adopt strategies to minimize noise, with management steps usually included in the image processing and analysis protocol. As such, there are several methods to measure noise from either CCD cameras or confocal PMT detectors (Heintzmann et al., 2018; van Vliet, Sudar and Young, 1998; Murray, 2007). Noise can be measured for particular instruments or from images themselves. Unfortunately, the original study, like many scientific reports publishing fluorescence microscopy images, did not mention any measurements of noise or SNR. Additionally, the original study did not report a strategy used to minimize noise and the original microscopy raw images were not available; therefore, we could not compare the amount of noise in the original study versus this replication attempt.

We also disagree that overcompensation in digital levels occurred since there was no overcompensation done as stated in the figure legends (the images for each staining are displayed in the same range of grey levels).
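
As a rough illustration of why low-light signals are noise-limited, the sketch below uses a simplified textbook detector model in which shot noise is Poisson-distributed (variance equal to the detected photon count) and read noise adds in quadrature. The function name and the example values are assumptions for illustration only; dark current and background are ignored, and this is not a measurement procedure used in either study.

```python
import math


def snr(photons: float, read_noise_e: float = 0.0) -> float:
    """Signal-to-noise ratio for a simplified detector model.

    photons      -- mean number of detected photons (signal)
    read_noise_e -- detector read noise in electrons RMS (assumed value)
    Shot-noise variance equals the photon count; read noise adds in quadrature.
    """
    return photons / math.sqrt(photons + read_noise_e ** 2)


# Shot-noise-limited case: quadrupling the signal only doubles the SNR.
print(snr(100))  # 10.0
print(snr(400))  # 20.0

# With read noise the same signal yields a lower SNR.
print(snr(64, read_noise_e=6))  # 6.4
```

In this model the SNR grows only with the square root of the signal, which is why dim fluorescence images are disproportionately affected by noise.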

4) Clearly discuss the problem with the primary fibroblasts used in this study, which appear to be less contractile (see Figure 1C-E) than in the initial study (see Figure 1F). The issue is that these fibroblasts could not be further characterized because of antibody cross reactions during SMA staining (Figure 3). In the absence of further characterization of the fibroblasts used in this study, the authors' conclusions are undermined.

We disagree that the primary fibroblasts used in this study were any less contractile than those in the original study. First, the original study figure (Figure 1F) that is referenced did not use pMEFs, but immortalized MEFs. The assay that tested pMEFs in the original study was not reported. Second, at the two time points we analyzed (24 hours and 48 hours after plating), the contraction indices reported in Figure 1F of the original study for MEFs (WT MEFs: 24 hours = ~70% and 48 hours = ~82%; KO MEFs: 24 hours = ~55% and 48 hours = ~63%) are quite similar to those we reported in Figure 1D for pMEFs (WT MEFs: 24 hours = 75% and 48 hours = 84%; KO MEFs: 24 hours = 56% and 48 hours = 63%).

We have revised the manuscript to further discuss the inability to further characterize the fibroblasts used in this study due to antibody cross reactions during SMA staining. It was unclear whether this was encountered in the original study and, if so, what was done to deal with the issue. As such, we emailed the original authors when we observed the cross reactions with SMA staining and confirmed that the protocol we followed, which was described in the Registered Report, was the same as the one performed in the original study. We also highlight the impact this unexpected complication has on interpretation of the results considering the absence of further characterization of the pMEFs.

5) Acknowledge and discuss the fact that not using the specific steps of MetaMorph analysis to score fibronectin fiber orientation could have altered the results.

We agree this could have altered the results obtained and in the previous version of this manuscript included this point, among others, in the final paragraph of the Discussion section. To further highlight this, we have included this in the Abstract and the relevant Results section.

Reviewer #1:

The goal of this Reproducibility Project is undermined because differential results are derived from experiments failing to reproduce key aspects of the original study, thus unsuitable for fair comparison. Furthermore, a bias is put into highlighting such 'differences', belittling observations faithfully replicating conditions and yielding results similar to those of the original study (SMA expression levels; collagen contraction assays; even trends for lower ECM fiber order in tumors bearing Cav1KO fibroblasts despite substantial shortcomings).

Goetz et al., carried out in vivo experiments for 70 days, but this replication study terminated them at 45 days. While the rationale for shortening experiments by 25 days (a net reduction of 36% from original conditions!) is respectable, this fully invalidates this section to provide a fair assessment of reproducibility of the original paper. Parameters hinting at how limited comparison is include different organ distribution of metastasis, and the failure to detect their luminescence in vivo. Such a key difference as experiment duration overrules other argued potential sources of variation (see subsection “Subcutaneous tumorigenicity assay of tumor cells co-injected with Cav1WT or Cav1KO primary MEFs"). Metastasis is a non-linear event, and conclusions cannot be drawn if timespan allowed for growth and progression is drastically different.

Therefore, the authors should acknowledge that in vivo experiments simply cannot be compared. The authors briefly mention these pivotal pitfalls somewhere in the text themselves (see for example subsection “Meta-analyses of original and replication effects” and subsection “Subcutaneous tumorigenicity assay of tumor cells co-injected with Cav1WT or Cav1KO primary MEFs"), but nonetheless highlight such differential observations (even in the Abstract!) as relevant, and compare their potential interpretation. It should be explicitly stated in the Abstract, and clarified in the main text, that these sections are not valid attempts to reproduce the original report. I would even suggest moving upfront passages such as those listed above (subsection “Meta-analyses of original and replication effects” and subsection “Subcutaneous tumorigenicity assay of tumor cells co-injected with Cav1WT or Cav1KO primary MEFs"), as they address the core aim of the replication study: to faithfully reproduce original experimental conditions.

We revised the manuscript to address this comment as described above.

Other sections of the reproducibility study that are misleading, or for which clear limitations should be explicitly acknowledged, are the following:

First: the authors DID recapitulate similar SMA expression levels between genotypes when cultured in 2D plastic. The description of these incomplete experiments (subsection “Isolation and characterization of Cav1 wild-type and Cav1 knockout primary MEFs”) is however inaccurate and misleading. Their observations were NOT unexpected: the original study had already compared SMA expression between WT and Cav1KO MEFs in 2D and 3D conditions (Figure S2A, original study). The replication study would have benefited from performing CDM-based experiments-which is how Goetz et al., uncovered the reported changes-in addition to collagen contraction assays.

We revised the manuscript to address this comment as described above. Regarding the last statement, we appreciate this point of view and also appreciate the original authors informing us about the collagen contraction assays in light of the data we observed during the course of the replication experimentation. However, if CDM-based experiments or collagen contraction assays are needed to identify pMEFs that achieve a specific outcome in vitro before they are used for the in vivo experiment, this replication attempt would have benefited from having this shared during preparation and/or review of the Registered Report. This type of feedback before conducting experiments helps to minimize confirmation bias and to maximize the quality of the methodology, so that there is no a priori reason to expect a result different from that of the original study.

Second, experiments in Figure 3 exhibit poor staining quality and an overcompensation in digital levels. Hence, their quantitation is simply not comparable to that of the original study (where additional complementary techniques were used, such as SHG microscopy). Setbacks when trying to use the original image analysis tools (based on MetaMorph software) justify their using a completely different image setting and analysis approach. While similar trends to the observations from Goetz et al. were recorded, these facts (which add up to the dismal difference in the duration of the experiments) again preclude a fair comparison between both studies.

We responded to this comment above and do not understand what evidence was used by the reviewer to make the statement that the staining was of poor quality. We also acknowledge that the original study used additional complementary techniques, such as SHG microscopy; however, importantly, these were not utilized for the experiment that was replicated and thus are not directly comparable.

In summary, this report unintentionally distorts the purpose of the Reproducibility Project itself, because its experimental execution, interpretation and writing-up are biased, emphasizing differential results despite being irrelevant and not comparable. A major revision of the text should therefore be carried out at the least.

Reviewer #2:

In this replication study, the authors have attempted to validate some of the original findings published in 2011 by the Del Pozo laboratory (Goetz et al., 2011). Based on previous exchange with the reviewers on one hand, and the authors of the original study on the other hand, it was agreed that two experiments of the original study (mainly Figure 7) were worth replicating: it involved mainly a subcutaneous xenograft of breast cancer derived lung metastatic cells in mice where tumor growth and metastasis would be analyzed.

In the present study, some of the findings that were originally published by Goetz et al. could be reproduced. First, they could confirm that primary mouse embryonic fibroblasts (pMEF) that express Cav1 have increased extracellular matrix remodeling capability both in vivo and in vitro. They also found, in agreement with the initial study, that primary growth of the tumor was not affected by the presence of either Cav1 WT or Cav1 KO pMEFs. Unfortunately, they could not reproduce a key result of the original study, namely that metastatic dissemination was increased by fibroblasts expressing Cav1 WT. Indeed, no difference could be found here between Cav1 WT and Cav1 KO pMEFs.

This is the most annoying result of this replication study as the controversy that exists about Cav1 and fibroblasts deals mainly with the role of Cav1 in metastasis. From the present study, it appears however that the exact experimental conditions of replication could not be respected, and that is certainly the main explanation for this discrepancy. I see two main problems with the replicated experiments. First, the primary fibroblasts used in this study appear to be less contractile (see Figure 1C-E) than in the initial study (see Figure 1F). This is certainly an important issue as the study examines the effect of these fibroblasts on tumor growth and metastasis. I am not convinced that the difference that is shown in Figure 1C-E is significant. Another issue is that these fibroblasts could not be further characterized because of antibody cross reactions during SMA staining (Figure 3). I was anticipating these problems and this is why I had recommended in my initial review to also examine the involvement of p190RhoGAP in this process. In my view, in the absence of further characterization of the fibroblasts used in this study, the authors cannot draw conclusions.

As stated above, we disagree that the primary fibroblasts used in this study were any less contractile than those in the original study. Nonetheless, we agree that there might be some ‘minimal’ effect size of in vitro contractility that might be needed to observe an in vivo effect. Importantly, observing different outcomes is informative for establishing a range of conditions under which a given effect can be observed, a point we’ve included in the revised manuscript. Additionally, we appreciate the view that additional information (i.e. examining p190RhoGAP) would be beneficial, and have included this in the revised manuscript as well.

The most important issue with this study is the considerable difference in the time of observation of metastatic foci in various organs. While the original study observed metastasis after 70 days, here the observation is done at a considerably earlier time i.e. 45 days that is almost half the time. Considering that metastasis is a secondary event that takes time to occur, it is likely that the observation has been done too prematurely.

We have revised the Abstract to state this key difference.

In conclusion, the replication study shows several similar aspects with the original study by Goetz et al., regarding the role of Cav1 WT expressing cancer associated fibroblasts in remodeling the extracellular matrix, although this could not be statistically validated. The lack of reproducibility for the metastasis process is most likely due to the impossibility of following the same experimental protocol as the one in Goetz et al. The fact that the fibroblasts injected here present a weaker contractile activity may also explain the absence of effect on metastasis. This is clearly an experimental weakness of this study.

In the Discussion section, the authors should discuss these two important discrepancies in more detail rather than invoking a list of experimental features (subsection “Meta-analyses of original and replication effects”) which, if they were truly involved, would be present in all studies and would in effect prevent any experiment from being reproduced. Also, the possibility that human cancer cells (the ones injected here) may behave differently in a mouse model should be discussed, as it could explain the discrepancies observed with other studies where mouse cancer cells were injected in mice (see Capozza reference cited in the original study and not cited here).

We have expanded the discussion on these two differences as well as the additional factor raised. Regarding the point of differences and their impact on replicating experiments, yes, it is possible any of these (among many others that can be hypothesized) could influence the outcome, which is true for any experiment. But whether they actually are is the open question. Replication plays a role here to test what is thought to matter, that is, this replication attempted to reproduce a previous finding with no a priori reason to expect a different outcome. When a different result is obtained, there are many possible factors that could explain the differences. Conversely, because it’s not possible to do the exact same experiment again, since there will always be a difference between the original and replication studies, when a similar result is observed in a replication, despite the known and unknown differences between studies, it increases confidence in the original finding as well as the generalizability of the result. This paper offers further insight on this topic: Errington and Nosek, 2017.

Reviewer #4:

The goal of this manuscript to replicate a cancer biology study published in Cell appears well-intentioned and potentially valuable. Unfortunately, the authors make two major changes to the original study protocol that have unknown and potentially substantial consequences, making it impossible in my opinion to provide a valid replication test of the original study – even though multiple other aspects of the approach such as pre-registration and the careful attempts to identify specific reagents and materials were commendable.

1) The first major change, even though seemingly within the parameters of the loosely written original plan that allowed for adjusting study length, has potential major problems. In any spontaneous metastasis assay, extending the time for analysis by 55% (25 days beyond 45 days to reach the 70 days in the original paper) could have major effects on both the extent and patterns of metastases. The current authors used a markedly truncated experimental endpoint for all animals even if many may well have not been in distress – the result is that comparisons of results between the two studies performed at 45 days versus 70 days seem impossible. Effects seen in the original study may or may not have been missed, which is impossible to determine. In fact, if the original Registered Report had explicitly proposed substituting an experimental endpoint for spontaneous metastasis of 45 days in place of the original 70 days, it seems unlikely that it would have been approved.

We appreciate this perspective and have revised the manuscript to address this comment as discussed above.

2) This major change in experimental endpoint timing also likely compromised the comparisons of fibronectin fiber patterns. Although it is impossible to know, it is plausible that extending the time of tumor interactions with the local microenvironment by 55% might affect the fibronectin pattern associated with intratumoral remodeling. Unless the original conditions could be met at least approximately, it seems like comparing apples and oranges (or green vs. over-ripe apples).

We agree and have revised the manuscript to reflect this.
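
The reviews describe the same 25-day gap as either a reduction of about 36% or an extension of about 55%, depending on the baseline. A minimal sketch of that arithmetic, using only the 70-day and 45-day endpoints quoted in the reviews (variable names are illustrative):

```python
original_days = 70     # endpoint quoted for the original study
replication_days = 45  # endpoint used in the replication

gap_days = original_days - replication_days  # 25 days

# Same gap, two baselines: shortening relative to 70 days,
# or extension relative to 45 days.
reduction_pct = 100 * gap_days / original_days
extension_pct = 100 * gap_days / replication_days

print(f"reduction relative to original: {reduction_pct:.1f}%")
print(f"extension relative to replication: {extension_pct:.1f}%")
```

Both percentages therefore refer to the same difference in protocol and should not be read as two separate deviations.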

3) A second potentially important change was complete substitution of alternative image analysis methods concerning fibronectin patterns compared to the Registered Report describing the use of MetaMorph and scoring SMA-positive cells. It is simply not clear whether scoring all cells rather than SMA-positive cells could have affected the results. More importantly, not using the specific steps of MetaMorph analysis to score fibronectin fiber orientation could have altered the results. The authors provide somewhat plausible reasons for not scoring SMA-positive cells (the staining did not work in their hands) and avoiding the use of MetaMorph (they found "too many objects to process") but changing the workflow completely appears to invalidate the attempted replication. Unless the current authors can demonstrate that their altered methods provide the same outputs on the original data as the original methods [e.g., by obtaining primary data from the original authors], these substitutions involving altering the cell population analyzed (all cells vs. SMA+) and image analysis (KNIME analytics vs. MetaMorph) fail to provide direct experimental replication. In fact, a major specific defect of the new analysis is apparent from Figure 3—figure supplement 3 in which the software fails to capture the fibrillar nature of fibronectin staining in 4 out of 9 samples in which average brightness is merely lower: the authors show dots and short lines rather than the interconnected fibronectin fibrillar staining patterns.

We agree that this was a challenging aspect of this replication attempt. Regarding the first point about SMA staining and the comparison of SMA-positive cells, the analysis conducted in the original study (in Figure S7Cc) and outlined in the Registered Report (Protocol 4) described two analyses on fibronectin orientation: one using all cells, which we report here, and one on just SMA-positive cells, which we were unable to conduct due to challenges we could not overcome with the SMA staining procedure used. Thus, we did not change the population of cells being investigated, but instead could not conduct a sub-analysis on just the SMA-positive cells as planned.

We agree that a change in workflow can have unexpected and not easily comparable aspects to it for assessing reproducibility. The suggestion to compare the workflow we generated on original primary data was something we explored, but we were unable to obtain original primary data or macros to do this (as a note, we asked this during the Registered Report because even details of the MetaMorph protocol were not explicit in all aspects). While we cannot provide a benchmark for how the workflow we had to implement compares to the original study protocol, we were encouraged that the overall median values of percent of fibronectin fibers oriented within 20° we observed [Mdn=42.1%, IQR=39.1-44.7%, n=20] were generally similar to the original values [Mdn=42.5%, IQR=37.9-45.1%, n=23]. And yes, we agree there is the possibility of additional optimization of the analysis we present, especially as the reviewer points out variations in signal intensity between images. Importantly, though, it is unclear how variable the original images were or how the original study handled these variations in their analysis pipeline as described. In any case, the analysis should be performed blinded to the conditions and batch processed to mitigate any potential for bias. We also conducted additional complementary analysis procedures using FibrilTool (as recommended by the original authors during preparation of the Registered Report as an alternative approach to perform the analysis), and OrientationJ. Although the full range of possible methods was not explored, we found the results of all these methods, plus a blinded manual scoring, were reasonably well correlated, indicating the robustness of the findings presented. These points are further discussed in the revised manuscript.
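
To make the orientation metric discussed above concrete, the following is a minimal sketch of one way to compute the percentage of fibers lying within a fixed angular window of the dominant orientation. It is not the MetaMorph procedure of the original study nor the KNIME workflow used in the replication; the function names, the 10-degree histogram bins, the tie-breaking rule, the 20-degree default window and the example angles are all assumptions made for illustration.

```python
from collections import Counter


def axial_distance(a: float, b: float) -> float:
    """Smallest angle between two orientations, treating 0 and 180 degrees as identical."""
    d = abs(a - b) % 180.0
    return min(d, 180.0 - d)


def percent_within_mode(angles_deg, window_deg=20.0, bin_width_deg=10.0):
    """Percent of fiber angles within window_deg of the modal orientation.

    The mode is taken as the center of the most populated histogram bin
    (ties go to the lowest bin), an assumed definition for this sketch.
    """
    bins = Counter(int((a % 180.0) // bin_width_deg) for a in angles_deg)
    mode_bin = max(bins, key=lambda b: (bins[b], -b))
    mode_angle = (mode_bin + 0.5) * bin_width_deg
    near = sum(1 for a in angles_deg if axial_distance(a, mode_angle) <= window_deg)
    return 100.0 * near / len(angles_deg)


# Hypothetical fiber angles in degrees (not data from either study).
angles = [82, 85, 88, 92, 95, 100, 10, 40, 130, 170]
print(percent_within_mode(angles))  # 60.0
```

Because choices such as bin width and the definition of the mode change the output, two pipelines can report different values for the same image, which is one reason a benchmark on shared primary data would be informative.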

4) Although not the approach taken by these well-intentioned replicators, this reviewer believes that more direct exchanges of expertise and experience between the original research group and the replication group could have avoided the serious problems listed above that, at present, unfortunately invalidate this well-intentioned replication attempt.

We appreciate this perspective from the reviewer and do not disagree with it in principle. To this point, we made much effort to do this during the course of this study and appreciated the immense effort of the original authors to help us both in preparation of the Registered Report and during the experimentation. However, it is also valuable to know under what circumstances more direct exchanges are needed and how those exchanges take place. We argue that increased transparency of process and outcome is important to mitigate these concerns, hence the approach we took with this project (see: Errington et al., 2014). We also recognize that in some circumstances further exchanges might be beneficial. However, it is not feasible or practical to do this for all replication attempts.

[Editors' note: further revisions were requested prior to acceptance, as described below.]

The manuscript has been improved but there are some remaining issues that need to be addressed before acceptance. The strong consensus among the referees and reviewing editor is that the shortening of the metastasis period makes it difficult to compare this study with the original one, and the manner in which this point is presented in the revised manuscript is not satisfactory. In particular, you must include additional discussion on your understanding of the relevance of a shorter or longer followup period for metastasis outcome, and the manner in which variations in this important parameter may influence the outcome of the experiments and, with that, the robustness of the conclusions that can be drawn in your study about the reproducibility of the original results.

We have included further discussion on the implications of a changed time course in metastatic studies in the revised manuscript and the impact this could have when comparing the two studies.

Reviewer #1:

Fiering et al., complied in this revised version with several previous recommendations concerning the summary text, as well as changes to the main text body of the manuscript. However, the rewriting of the Abstract, while in the requested direction, should still be improved. We cannot help but notice that the authors keep explicitly stating and 'presenting' certain divergent observations as results worth being considered:

"[…] We found metastatic burden was similar between Cav1WT and Cav1KO pMEFs, while the original study found increased metastases with Cav1WT (Figure 7C; Goetz et al., 2011). We also found a statistically significant negative correlation of intratumoral remodeling with metastatic burden, while the original study found a statistically significant positive correlation (Figure 7CD; Goetz et al., 2011). Finally, we report meta-analyses for each result."

As they subsequently state, these differences can be explained alone by the fact that key experimental details were not reproduced (i.e. the replication study shortened the duration of in vivo experiments by more than 35% as compared to the original study, and they used a different image approach when assessing ECM architecture from microscopy images (apart from additional technical issues, already mentioned in the previous round of review)). Thus, we claim again that these parts of their replication study cannot be subject to fair comparison as such and should not be presented upfront with the same value as the rest of the observations. Moreover, the fact that key experimental aspects were not fully reproduced should be clearly stated before those descriptions. To provide an example of what, in our opinion, should be stated in the Abstract regarding these experiments:

"[…] Two key experimental parameters (experimental timing of in vivo metastasis experiments and image-based analysis of tumoral ECM architecture) did not match with those used in the original study, rendering some observations, such as metastatic burden or its correlation with intratumoral ECM remodeling, not suitable for comparison. Finally, we report meta-analyses for each result."

While we appreciate that an effort has been made trying to correct these issues according to the previous revision round, we think these details are paramount to avoid any ambiguous or misleading messages in the Abstract to potential readers. Besides, this was a common claim of reviewers 2, 4 and us (while the only other referee focuses on the statistical analysis and presentation of the data).

The quoted text of our manuscript Abstract above is from the first version of the manuscript and does not include the revisions that were made. Following the last round of peer review, we included an additional sentence that specific factors (experimental timing and image analysis approach) could have impacted the outcomes.

Reviewer #5:

This replication study addressed experiments performed by Goetz et al., (2011) which indicated a critical role of caveolin-1 expressed by myofibroblasts of the tumor stroma in ECM remodeling and its role in inducing distant metastasis formation. The results confirm the originally shown role of caveolin-1 in collagen contraction and, to some extent, in fibronectin matrix alignment in vivo, as well as a lack of involvement in tumor growth at the implantation site. Conversely, this replication study could not replicate the effect of caveolin-1 positive fibroblasts on metastatic evasion, and even indicated an inverse effect between metastasis outcome and fibronectin alignment in the primary tumor.

With the notable exception of a reduced followup period of metastasis after tumor implantation, the experiments were designed and performed with high fidelity and the quality of documentation is excellent. The meta-analysis further explores similarities and differences with competence and is delineated in a comprehensive manner. As a major shortcoming, the explanation justifying the shortened observation period for analyzing metastasis outcome and the discussion on the implications for similar replication work in general are not yet satisfactory.

We have included additional information in the body of the text for the shortened observation period in the revised manuscript, which is addressed in the specific point about subsection “Subcutaneous tumorigenicity assay of tumor cells co-injected with Cav1WT or Cav1KO primary MEFs” below. Additionally, we have included further discussion on the implications of a changed time course in metastatic studies.

Specific points:

Subsection “Subcutaneous tumorigenicity assay of tumor cells co-injected with Cav1WT or Cav1KO primary MEFs": "45 days after cell injection to maximize the length of time for tumor growth while minimizing animal suffering" – it should be stated whether in this replication study, the tumors grew more rapidly, compared to Goetz et al., and whether a different tumor size until humane endpoint mandated this deviation compared to the original study.

It was not reported in the original paper, or shared with us, what the tumor growth rates were for the original study. However, the shortened time course followed in this replication, to minimize animal suffering, suggests the possibility of a faster tumor growth rate observed in this replication than the original study. This section of the manuscript has been revised to reflect this.

For a general audience, the authors should include additional discussion on their understanding of the relevance of a shorter or longer follow-up period for metastasis outcome. Particularly, they should discuss how robust the results can be considered and how similar replication work in independent labs should be performed. Is it legitimate to adjust follow-up periods? Which rules should apply to a robust biological outcome? How can the inverse correlation of metastasis and ECM remodeling in the primary tumor site be explained when compared to the primary work? Which growth monitoring criteria should be applied to achieve high fidelity replication? How should researchers deal with differences in humane endpoint criteria when constrained by a given legal framework? How would the authors perform this part of the work in retrospect, to achieve higher concordance with the original study?

We have included additional discussion on the impact of experimental time for metastasis outcome. While we addressed some of the specific questions raised, not all were addressed as we think they would be better addressed in an insight to this paper or in a global assessment of all replications that were conducted as part of the Reproducibility Project: Cancer Biology, not just this single replication.

The discussion on intermittent parameters potentially affecting the bioluminescence imaging is of general interest. It is not clear, however, whether these points are relevant here, assuming that the same imaging approach was used as in Goetz et al. (whole-body bioluminescence). However, Goetz et al. additionally used bioluminescence analysis of excised organs. Can the authors clarify why this more sensitive approach was apparently not used here?

The same approach as Goetz et al. was used in this replication. As described in subsection “Subcutaneous tumorigenicity assay of tumor cells co-injected with Cav1WT or Cav1KO primary MEFs” and the methods section ‘subcutaneous tumorigenicity assay’, tumors were imaged by whole-body bioluminescence (measurements reported in Figure 2A with representative images in lower panel of Figure 2B) and then organs and tumors were excised and imaged ex vivo (measurements reported in Figure 2C with representative images in upper panel of Figure 2B).

[Editors' note: further revisions were requested prior to acceptance, as described below.]

The manuscript has been improved but there are a few remaining issues that need to be addressed before acceptance, as outlined below:

1) The most critical issue is that the editors and reviewers were not satisfied with the clarity of the Abstract in terms of the way it acknowledges the potential impact of the shorter metastasis assay on the interpretability of the results. Based on extensive consultation among the editors and reviewers, their concerns would be resolved if you would be willing to replace the last four sentences in the Abstract with the following text:

We found metastatic burden was similar between Cav1WT and Cav1KO pMEFs, while the original study found increased metastases with Cav1WT (Figure 7C; Goetz et al., 2011); however, the duration of our in vivo experiments (45 days) were much shorter than in the study by Goetz et al., (2011) (75 days). This makes it difficult to interpret the difference between the studies as it is possible that the cells required more time to manifest the difference between treatments observed by Goetz et al. We also found a statistically significant negative correlation of intratumoral remodeling with metastatic burden, while the original study found a statistically significant positive correlation (Figure 7Cd; Goetz et al., 2011), but again there were differences between the studies in terms of the duration of the metastasis studies and the imaging approaches that could have impacted the outcomes. Finally, we report meta-analyses for each result.

Beyond this change to the Abstract, the only other changes that are required are minor changes to the text of the manuscript to address the specific points raised by reviewer #5 below. The comments from reviewer #1 are included below to provide context but would be entirely addressed by the change in the Abstract noted above.

We have revised the Abstract as suggested and addressed the specific points raised below by reviewer #5.

Reviewer #1:

I regret to note that the authors have missed a key point that required their attention, even though it was explicitly stated in my previous comments and my guiding suggestions on the lines of change I and other reviewers considered appropriate.

I quote here again the point I aimed at getting across: […] We cannot help noticing the authors keep explicitly stating and 'presenting' certain divergent observations as results worth being considered […]

This clearly referred to two sentences in the Abstract, which were still included in that previous revision and in the latest version of the manuscript:

"[…] We found metastatic burden was similar between Cav1WT and Cav1KO pMEFs, while the original study found increased metastases with Cav1WT (Figure 7C; Goetz et al., 2011). We also found a statistically significant negative correlation of intratumoral remodeling with metastatic burden, while the original study found a statistically significant positive correlation (Figure 7Cd; Goetz et al., 2011) […]."

I contend again that these parts of their replication study cannot be subject to fair comparison, and should therefore not be presented upfront in this way, equating them with the rest of the observations, as if they were to be considered as data obtained through appropriate experimental replication. The way the Abstract keeps being presented is misleading (perhaps unintentionally), and leaves room for the wrong interpretation that those in vivo experiments deserve being considered as valid experiments that 'may have yielded a different outcome' for this reason or another. This is not useful for the readership of eLife and is damaging to the general aim of this reproducibility initiative, and I respectfully suggest again modifying this text following this example:

"[…] Two key experimental parameters (timing of in vivo metastasis experiments and image-based analysis of tumor ECM architecture) did not match those used in the original study, rendering some observations, such as metastatic burden or its correlation with intratumoral ECM remodeling, not suitable for comparison […]".

I did note and acknowledge that certain changes had partly been made in a good direction, such as the sentence at the end of the Abstract the authors bring up in their response, where they admit that key experimental aspects had not been reproduced and "could have impacted the outcomes [sic]". However, and notwithstanding the previous key point, this statement should at the very least be written before listing those experiments they performed under different conditions.

I am compelled to state again that these details are essential to preserve the original aim of the reproducibility initiative and provide a fair and non-misleading message to its readership. It must be noted this was a common claim of reviewers 2, 4 and us, and should therefore be fully complied with before publication.

Reviewer #5:

The authors have now included a specific justification for the humane endpoint, and it is clear that the procedure was adequate.

The implications of shortening the observation period and of the incomplete reporting of primary tumor burden at the endpoint in the original study could still be discussed with more care.

Specific points:

1) The authors state in subsection “Subcutaneous tumorigenicity assay of tumor cells co-injected with Cav1WT or Cav1KO primary MEFs”: "Although experimental timing is important, maintaining it might not be sufficient to observe the same malignant progression between studies." – it is not clear what is meant by "maintaining timing" – the same follow-up period until the metastatic endpoint? The authors should acknowledge that reproducing the timelines of primary tumor growth and spontaneous metastasis as exactly as possible is critical for minimizing confounding parameters, because the steps of the metastatic cascade, including invasion, organ colonization and outgrowth, are strongly time-dependent processes. In addition, growth of metastases is nonlinear, i.e. a few days towards later time points might impose major differences in aggregate tumor burden. This should be discussed in more detail.

We have revised the text to remove this sentence and to expand the importance of time and non-linear growth.

2) In addition, it would be important to provide a recommendation on how such inconsistencies can be mitigated in future work, for example by (i) reporting the tumor volumes for each animal at the endpoint, and (ii) resecting the primary tumor by a standardized measure (e.g., size or time), to allow for a follow-up of metastasis development. This would allow metastasis to be monitored independently of primary tumor load and of a premature humane endpoint caused by variations in growth of the primary tumor.

We have included additional text to discuss these recommendations in the revised manuscript.

3) The readability of labels in Figure 2A, C and Figure 3C remains unacceptable and should be improved.

We have revised the labels in these figures to increase their readability and to make them consistent with the other figure labels.

https://doi.org/10.7554/eLife.45120.sa2

Article and author information

Author details

  1. Mee Rie Sheen

    Geisel School of Medicine at Dartmouth, Department of Microbiology and Immunology, Lebanon, United States
    Present address
    Department of Surgery, Massachusetts General Hospital, Harvard Medical School, Boston, United States
    Contribution
    Acquisition of data, Analysis and interpretation of data, Drafting or revising the article, Performed isolation and characterization of pMEFs
    Contributed equally with
    Jennifer L Fields
    Competing interests
    Transgenics and Genetic Constructs Shared Resource Center, Geisel School of Medicine at Dartmouth is a Science Exchange associated lab
  2. Jennifer L Fields

    Geisel School of Medicine at Dartmouth, Department of Microbiology and Immunology, Lebanon, United States
    Contribution
    Acquisition of data, Analysis and interpretation of data, Drafting or revising the article, Performed subcutaneous tumorigenicity assay
    Contributed equally with
    Mee Rie Sheen
    Competing interests
    Transgenics and Genetic Constructs Shared Resource Center, Geisel School of Medicine at Dartmouth is a Science Exchange associated lab
  3. Brian Northan

    MIA Cellavie Inc, Montreal, Canada
    Contribution
    Acquisition of data, Analysis and interpretation of data, Drafting or revising the article, Performed image analysis
    Contributed equally with
    Judith Lacoste
    Competing interests
    Cellavie Inc is a Science Exchange associated lab
  4. Judith Lacoste

    MIA Cellavie Inc, Montreal, Canada
    Contribution
    Acquisition of data, Analysis and interpretation of data, Drafting or revising the article, Performed image analysis
    Contributed equally with
    Brian Northan
    Competing interests
    Cellavie Inc is a Science Exchange associated lab
  5. Lay-Hong Ang

    Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, United States
    Contribution
    Acquisition of data, Analysis and interpretation of data, Drafting or revising the article, Performed staining and imaging of tumors
    Competing interests
    Confocal Imaging Core Facility, Beth Israel Deaconess Medical Center was a Science Exchange associated lab
  6. Steven Fiering

    Geisel School of Medicine at Dartmouth, Department of Microbiology and Immunology, Lebanon, United States
    Contribution
    Analysis and interpretation of data, Drafting or revising the article
    Competing interests
    Transgenics and Genetic Constructs Shared Resource Center, Geisel School of Medicine at Dartmouth is a Science Exchange associated lab
  7. Reproducibility Project: Cancer Biology

    Contribution
    Acquisition of data, Analysis and interpretation of data, Drafting or revising the article, Performed image analysis
    For correspondence
    1. tim@cos.io
    2. nicole@scienceexchange.com
    Competing interests
    EI, RT, NP: Employed by and hold shares in Science Exchange Inc.
    1. Elizabeth Iorns, Science Exchange, Palo Alto, United States
    2. Rachel Tsui, Science Exchange, Palo Alto, United States
    3. Alexandria Denis, Center for Open Science, Charlottesville, United States
    4. Nicole Perfito, Science Exchange, Palo Alto, United States
    5. Timothy M Errington, Center for Open Science, Charlottesville, United States

Funding

Laura and John Arnold Foundation

  • Reproducibility Project: Cancer Biology

The funder had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

The Reproducibility Project: Cancer Biology would like to thank Dr. Miguel A del Pozo (Centro Nacional de Investigaciones Cardiovasculares Carlos III (CNIC), Madrid, Spain), for sharing critical information, data, and reagents, specifically the LM-4175 cells expressing HSV-tk1-GFP-Fluc. We want to thank Claire Brown (Director of the Advanced Bio-Imaging Facility, McGill University, Montréal, Canada) and Vincent Pelletier (Implementation Specialist, Quorum Technologies, Puslinch, Canada) for sharing their expertise with the MetaMorph image analysis software. We thank Stefan Helfrich (Academic Alliance Manager, KNIME GmbH, Konstanz, Germany) for assistance with KNIME Analytics Platform. This paper was included in the 'Road testing for ARRIVE 2019', which helped improve our reporting of the animal experimental details. We would also like to thank the following companies for generously donating reagents to the Reproducibility Project: Cancer Biology: American Type Culture Collection (ATCC), Applied Biological Materials, BioLegend, Charles River Laboratories, Corning Incorporated, DDC Medical, EMD Millipore, Harlan Laboratories, LI-COR Biosciences, Mirus Bio, Novus Biologicals, Sigma-Aldrich, and System Biosciences (SBI).

Ethics

Animal experimentation: All animal procedures were approved by the Dartmouth College IACUC# 1133 and were in accordance with the Dartmouth College policies on the care, welfare, and treatment of laboratory animals.

Senior Editor

  1. Sean J Morrison, Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, United States

Reviewing Editor

  1. Joan Massagué, Memorial Sloan-Kettering Cancer Center, United States

Publication history

  1. Received: January 14, 2019
  2. Accepted: November 6, 2019
  3. Version of Record published: December 17, 2019 (version 1)

Copyright

© 2019, Sheen et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 729
    Page views
  • 52
    Downloads
  • 0
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.
