Extent, impact, and mitigation of batch effects in tumor biomarker studies using tissue microarrays

  1. Konrad H Stopsack (corresponding author)
  2. Svitlana Tyekucheva
  3. Molin Wang
  4. Travis A Gerke
  5. J Bailey Vaselkiv
  6. Kathryn L Penney
  7. Philip W Kantoff
  8. Stephen P Finn
  9. Michelangelo Fiorentino
  10. Massimo Loda
  11. Tamara L Lotan
  12. Giovanni Parmigiani
  13. Lorelei A Mucci
  1. Harvard T.H. Chan School of Public Health, United States
  2. Dana-Farber Cancer Institute, United States
  3. Moffitt Cancer Center, United States
  4. Memorial Sloan Kettering Cancer Center, United States
  5. Trinity College, Ireland
  6. University of Bologna, Italy
  7. Weill Cornell Medical Center, United States
  8. Johns Hopkins University, United States

Abstract

Tissue microarrays (TMAs) have been used in thousands of cancer biomarker studies. To what extent batch effects, that is, measurement error in biomarker levels between slides, affect TMA-based studies has not been assessed systematically. We evaluated 20 protein biomarkers on 14 TMAs with prospectively collected tumor tissue from 1,448 primary prostate cancers. In half of the biomarkers, more than 10% of biomarker variance was attributable to between-TMA differences (range, 1-48%). We implemented different methods to mitigate batch effects (R package batchtma) and tested them in plasmode simulation. Biomarker levels were more similar between mitigation approaches compared to uncorrected values. For some biomarkers, associations with clinical features changed substantially after addressing batch effects. Batch effects and resulting bias are not an error of an individual study but an inherent feature of TMA-based protein biomarker studies. They always need to be considered during study design and addressed analytically in studies using more than one TMA.
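
As a rough illustration of the two ideas in the abstract, the sketch below estimates the share of biomarker variance attributable to between-batch differences with a one-way ANOVA intraclass correlation, and then applies simple per-batch mean centering. This is a minimal sketch under stated assumptions, not the authors' method and not the batchtma API: the function names and the toy data are hypothetical, and mean centering is used only as the simplest conceivable correction. Mean centering implicitly assumes that the tumors on each TMA are comparable, which may not hold in practice.

```python
from statistics import mean


def between_batch_variance_share(values_by_batch):
    """One-way ANOVA estimate of the intraclass correlation (ICC):
    the proportion of total variance attributable to batch.
    `values_by_batch` maps a batch label to a list of measurements."""
    groups = [v for v in values_by_batch.values() if v]
    k = len(groups)                       # number of batches
    n_total = sum(len(g) for g in groups)
    grand = mean(x for g in groups for x in g)
    # Mean squares between and within batches.
    msb = sum(len(g) * (mean(g) - grand) ** 2 for g in groups) / (k - 1)
    msw = sum((x - mean(g)) ** 2 for g in groups for x in g) / (n_total - k)
    # Effective batch size, which allows for unequal batch sizes.
    n0 = (n_total - sum(len(g) ** 2 for g in groups) / n_total) / (k - 1)
    var_between = max(0.0, (msb - msw) / n0)
    return var_between / (var_between + msw)


def mean_center_batches(values_by_batch):
    """Hypothetical simplest correction: shift every batch so that its
    mean equals the grand mean across all batches."""
    grand = mean(x for v in values_by_batch.values() for x in v)
    return {
        batch: [x - mean(v) + grand for x in v]
        for batch, v in values_by_batch.items()
    }


# Toy data (made up for illustration): two TMAs with shifted levels.
toy = {"TMA_A": [1, 2, 3], "TMA_B": [4, 5, 6]}
icc_before = between_batch_variance_share(toy)
corrected = mean_center_batches(toy)
icc_after = between_batch_variance_share(corrected)
```

In this toy example, all between-batch variance is removed by construction. Real data would call for approaches that also account for differences in tumor characteristics between TMAs.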

Data availability

The batchtma R package is available at https://stopsack.github.io/batchtma and the Comprehensive R Archive Network (CRAN). Code used to produce the results in this manuscript is at https://github.com/stopsack/batchtma_manuscript. Data are available for analysis on the Harvard FAS computing cluster through a project proposal for the Health Professionals Follow-up Study (https://sites.sph.harvard.edu/hpfs/for-collaborators).

Article and author information

Author details

  1. Konrad H Stopsack

    Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, United States
    For correspondence
    stopsack@mail.harvard.edu
    Competing interests
    No competing interests declared.
    ORCID: 0000-0002-0722-1311
  2. Svitlana Tyekucheva

    Department of Data Science, Dana-Farber Cancer Institute, Boston, United States
    Competing interests
    No competing interests declared.
  3. Molin Wang

    Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, United States
    Competing interests
    No competing interests declared.
  4. Travis A Gerke

    Department of Cancer Epidemiology, Moffitt Cancer Center, Tampa, United States
    Competing interests
    No competing interests declared.
  5. J Bailey Vaselkiv

    Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, United States
    Competing interests
    No competing interests declared.
    ORCID: 0000-0001-7702-9504
  6. Kathryn L Penney

    Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, United States
    Competing interests
    No competing interests declared.
  7. Philip W Kantoff

    Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, United States
    Competing interests
    Philip W Kantoff reports the following disclosures for the last 24-month period: he has investment interest in Convergent Therapeutics Inc, Cogent Biosciences, Context Therapeutics LLC, DRGT, Mirati, Placon, PrognomIQ, SynDevRx and XLink, he is a company board member for Context Therapeutics LLC and Convergent Therapeutics, he is a company founder for XLink and Convergent Therapeutics, and is/was a consultant/scientific advisory board member for Anji, Candel, DRGT, Immunis, AI (previously OncoCellMDX), Janssen, Progenity, PrognomIQ, Seer Biosciences, SynDevRx, Tarveda Therapeutics, and Veru, and serves on data safety monitoring boards for Genentech/Roche and Merck. He reports spousal association with Bayer.
  8. Stephen P Finn

    Trinity College, Dublin, Ireland
    Competing interests
    No competing interests declared.
  9. Michelangelo Fiorentino

    Pathology Unit, Addarii Institute, University of Bologna, Bologna, Italy
    Competing interests
    No competing interests declared.
  10. Massimo Loda

    Department of Pathology, Weill Cornell Medical Center, New York, United States
    Competing interests
    No competing interests declared.
  11. Tamara L Lotan

    Department of Pathology, Johns Hopkins University, Baltimore, United States
    Competing interests
    No competing interests declared.
  12. Giovanni Parmigiani

    Department of Data Science, Dana-Farber Cancer Institute, Boston, United States
    Competing interests
    Giovanni Parmigiani reports the following disclosures for the last 24-month period: he had investment interest in CRA Health; he is a co-founder and company board member of Phaeno Biotechnology; he is a consultant / scientific advisory board member for Konica-Minolta, Delfi Diagnostics and Foundation Medicine; he serves on a data safety monitoring board for Geisinger. None of these activities are related to the content of this article.
  13. Lorelei A Mucci

    Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, United States
    Competing interests
    No competing interests declared.

Funding

National Cancer Institute (U01 CA167552)

  • Lorelei A Mucci

DOD Prostate Cancer Research Program (W81XWH-18-1-0330)

  • Konrad H Stopsack

Prostate Cancer Foundation (Young Investigator Award)

  • Konrad H Stopsack
  • Kathryn L Penney
  • Stephen P Finn
  • Tamara L Lotan
  • Lorelei A Mucci

National Cancer Institute (P50 CA090381)

  • Philip W Kantoff
  • Massimo Loda
  • Lorelei A Mucci

National Cancer Institute (P50 CA211024)

  • Massimo Loda

National Cancer Institute (P30 CA008748)

  • Konrad H Stopsack
  • Philip W Kantoff

National Cancer Institute (P30 CA006516)

  • Massimo Loda
  • Lorelei A Mucci

National Cancer Institute (5R37 CA227190-02)

  • Svitlana Tyekucheva
  • Kathryn L Penney
  • Giovanni Parmigiani
  • Lorelei A Mucci

National Cancer Institute (R03 CA212799)

  • Molin Wang

National Cancer Institute (R35 CA212799)

  • Molin Wang

National Cancer Institute (R01 CA131945)

  • Massimo Loda

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Ethics

Human subjects: Written informed consent was obtained from all participants, and the study protocol was approved by the institutional review boards of the Brigham and Women's Hospital and Harvard T.H. Chan School of Public Health (IRB 19-1430), and those of participating registries as required.

Copyright

© 2021, Stopsack et al.

This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.

https://doi.org/10.7554/eLife.71265
