Extent, impact, and mitigation of batch effects in tumor biomarker studies using tissue microarrays

  1. Konrad H Stopsack (corresponding author)
  2. Svitlana Tyekucheva
  3. Molin Wang
  4. Travis A Gerke
  5. J Bailey Vaselkiv
  6. Kathryn L Penney
  7. Philip W Kantoff
  8. Stephen P Finn
  9. Michelangelo Fiorentino
  10. Massimo Loda
  11. Tamara L Lotan
  12. Giovanni Parmigiani
  13. Lorelei A Mucci
  1. Harvard T.H. Chan School of Public Health, United States
  2. Dana-Farber Cancer Institute, United States
  3. Moffitt Cancer Center, United States
  4. Memorial Sloan Kettering Cancer Center, United States
  5. Trinity College, Ireland
  6. University of Bologna, Italy
  7. Weill Cornell Medical Center, United States
  8. Johns Hopkins University, United States

Abstract

Tissue microarrays (TMAs) have been used in thousands of cancer biomarker studies. To what extent batch effects, that is, measurement error in biomarker levels between slides, affect TMA-based studies has not been assessed systematically. We evaluated 20 protein biomarkers on 14 TMAs with prospectively collected tumor tissue from 1,448 primary prostate cancers. For half of the biomarkers, more than 10% of biomarker variance was attributable to between-TMA differences (range, 1-48%). We implemented different methods to mitigate batch effects (R package batchtma) and tested them in plasmode simulation. Biomarker levels were more similar across mitigation approaches than they were to uncorrected values. For some biomarkers, associations with clinical features changed substantially after addressing batch effects. Batch effects and resulting bias are not an error of an individual study but an inherent feature of TMA-based protein biomarker studies. They always need to be considered during study design and addressed analytically in studies using more than one TMA.
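
To make the idea of "variance attributable to between-TMA differences" concrete, the following is a minimal Python sketch, not the authors' method and not the batchtma implementation. It assumes a one-way random-effects ANOVA estimator of the between-batch variance share, and it uses mean-centering per batch as an illustrative stand-in for a mitigation step. The function names and the toy data are hypothetical. Mean-centering also assumes that batches do not differ in their true biomarker levels, which may not hold in real studies.

```python
from collections import defaultdict
from statistics import mean


def _group_by_batch(values, batches):
    """Collect biomarker values per batch (e.g., per TMA)."""
    groups = defaultdict(list)
    for value, batch in zip(values, batches):
        groups[batch].append(value)
    return groups


def between_batch_variance_fraction(values, batches):
    """Estimate the share of variance attributable to batch.

    Uses the one-way random-effects ANOVA estimator (an assumption made
    for illustration). Requires at least two batches and more
    observations than batches.
    """
    groups = _group_by_batch(values, batches)
    k = len(groups)
    n_total = len(values)
    grand_mean = mean(values)

    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)

    # Effective batch size, which handles batches of unequal size.
    n0 = (n_total - sum(len(g) ** 2 for g in groups.values()) / n_total) / (k - 1)

    # Negative variance estimates are truncated at zero.
    var_between = max((ms_between - ms_within) / n0, 0.0)
    total = var_between + ms_within
    return var_between / total if total > 0 else 0.0


def center_by_batch(values, batches):
    """Shift each batch so that its mean equals the overall mean."""
    groups = _group_by_batch(values, batches)
    grand_mean = mean(values)
    offsets = {b: mean(g) - grand_mean for b, g in groups.items()}
    return [v - offsets[b] for v, b in zip(values, batches)]


# Toy data: two hypothetical TMAs with a large shift between them.
levels = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]
tma = ["A", "A", "A", "B", "B", "B"]

fraction_before = between_batch_variance_fraction(levels, tma)
corrected = center_by_batch(levels, tma)
fraction_after = between_batch_variance_fraction(corrected, tma)
```

In this toy example nearly all variance sits between the two batches before centering and none afterwards. With real data, a mixed model and approaches that account for differences in tumor characteristics between TMAs would likely be preferable to this simple estimator.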

Data availability

The batchtma R package is available at https://stopsack.github.io/batchtma and from the Comprehensive R Archive Network (CRAN). Code used to produce the results in this manuscript is at https://github.com/stopsack/batchtma_manuscript. Data are available for analysis on the Harvard FAS computing cluster through a project proposal for the Health Professionals Follow-up Study (https://sites.sph.harvard.edu/hpfs/for-collaborators).

Article and author information

Author details

  1. Konrad H Stopsack

    Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, United States
    For correspondence
    stopsack@mail.harvard.edu
    Competing interests
    No competing interests declared.
    ORCID: 0000-0002-0722-1311
  2. Svitlana Tyekucheva

    Department of Data Science, Dana-Farber Cancer Institute, Boston, United States
    Competing interests
    No competing interests declared.
  3. Molin Wang

    Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, United States
    Competing interests
    No competing interests declared.
  4. Travis A Gerke

    Department of Cancer Epidemiology, Moffitt Cancer Center, Tampa, United States
    Competing interests
    No competing interests declared.
  5. J Bailey Vaselkiv

    Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, United States
    Competing interests
    No competing interests declared.
    ORCID: 0000-0001-7702-9504
  6. Kathryn L Penney

    Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, United States
    Competing interests
    No competing interests declared.
  7. Philip W Kantoff

    Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, United States
    Competing interests
    Philip W Kantoff reports the following disclosures for the last 24-month period: he has investment interests in Convergent Therapeutics Inc, Cogent Biosciences, Context Therapeutics LLC, DRGT, Mirati, Placon, PrognomIQ, SynDevRx and XLink; he is a company board member for Context Therapeutics LLC and Convergent Therapeutics; he is a company founder for XLink and Convergent Therapeutics; he is or was a consultant or scientific advisory board member for Anji, Candel, DRGT, Immunis.AI (previously OncoCellMDX), Janssen, Progenity, PrognomIQ, Seer Biosciences, SynDevRx, Tarveda Therapeutics, and Veru; and he serves on data safety monitoring boards for Genentech/Roche and Merck. He reports spousal association with Bayer.
  8. Stephen P Finn

    Trinity College, Dublin, Ireland
    Competing interests
    No competing interests declared.
  9. Michelangelo Fiorentino

    Pathology Unit, Addarii Institute, University of Bologna, Bologna, Italy
    Competing interests
    No competing interests declared.
  10. Massimo Loda

    Department of Pathology, Weill Cornell Medical Center, New York, United States
    Competing interests
    No competing interests declared.
  11. Tamara L Lotan

    Department of Pathology, Johns Hopkins University, Baltimore, United States
    Competing interests
    No competing interests declared.
  12. Giovanni Parmigiani

    Department of Data Science, Dana-Farber Cancer Institute, Boston, United States
    Competing interests
    Giovanni Parmigiani reports the following disclosures for the last 24-month period: he had an investment interest in CRA Health; he is a co-founder and company board member of Phaeno Biotechnology; he is a consultant or scientific advisory board member for Konica-Minolta, Delfi Diagnostics and Foundation Medicine; he serves on a data safety monitoring board for Geisinger. None of these activities are related to the content of this article.
  13. Lorelei A Mucci

    Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, United States
    Competing interests
    No competing interests declared.

Funding

National Cancer Institute (U01 CA167552)

  • Lorelei A Mucci

DOD Prostate Cancer Research Program (W81XWH-18-1-0330)

  • Konrad H Stopsack

Prostate Cancer Foundation (Young Investigator Award)

  • Konrad H Stopsack
  • Kathryn L Penney
  • Stephen P Finn
  • Tamara L Lotan
  • Lorelei A Mucci

National Cancer Institute (P50 CA090381)

  • Philip W Kantoff
  • Massimo Loda
  • Lorelei A Mucci

National Cancer Institute (P50 CA211024)

  • Massimo Loda

National Cancer Institute (P30 CA008748)

  • Konrad H Stopsack
  • Philip W Kantoff

National Cancer Institute (P30 CA006516)

  • Massimo Loda
  • Lorelei A Mucci

National Cancer Institute (5R37 CA227190-02)

  • Svitlana Tyekucheva
  • Kathryn L Penney
  • Giovanni Parmigiani
  • Lorelei A Mucci

National Cancer Institute (R03 CA212799)

  • Molin Wang

National Cancer Institute (R35 CA212799)

  • Molin Wang

National Cancer Institute (R01 CA131945)

  • Massimo Loda

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Ethics

Human subjects: Written informed consent was obtained from all participants, and the study protocol was approved by the institutional review boards of the Brigham and Women's Hospital and Harvard T.H. Chan School of Public Health (IRB 19-1430), and those of participating registries as required.

Copyright

© 2021, Stopsack et al.

This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

  1. Konrad H Stopsack
  2. Svitlana Tyekucheva
  3. Molin Wang
  4. Travis A Gerke
  5. J Bailey Vaselkiv
  6. Kathryn L Penney
  7. Philip W Kantoff
  8. Stephen P Finn
  9. Michelangelo Fiorentino
  10. Massimo Loda
  11. Tamara L Lotan
  12. Giovanni Parmigiani
  13. Lorelei A Mucci
(2021)
Extent, impact, and mitigation of batch effects in tumor biomarker studies using tissue microarrays
eLife 10:e71265.
https://doi.org/10.7554/eLife.71265
