1. Computational and Systems Biology
  2. Human Biology and Medicine
Download icon

Mitochondrial DNA copy number variation across human cancers

Research Article
Cited
31
Views
6,348
Comments
0
Cite as: eLife 2016;5:e10769 doi: 10.7554/eLife.10769

Abstract

Mutations, deletions, and changes in copy number of mitochondrial DNA (mtDNA), are observed throughout cancers. Here, we survey mtDNA copy number variation across 22 tumor types profiled by The Cancer Genome Atlas project. We observe a tendency for some cancers, especially of the bladder, breast, and kidney, to be depleted of mtDNA, relative to matched normal tissue. Analysis of genetic context reveals an association between incidence of several somatic alterations, including IDH1 mutations in gliomas, and mtDNA content. In some but not all cancer types, mtDNA content is correlated with the expression of respiratory genes, and anti-correlated to the expression of immune response and cell-cycle genes. In tandem with immunohistochemical evidence, we find that some tumors may compensate for mtDNA depletion to sustain levels of respiratory proteins. Our results highlight the extent of mtDNA copy number variation in tumors and point to related therapeutic opportunities.

https://doi.org/10.7554/eLife.10769.001

eLife digest

Within each cell of your body lie hundreds or thousands of mitochondria. These structures are perhaps best known for making energy, but mitochondria also play roles in processes like the immune response and cell signaling. However, in the mutant cells that form cancerous tumors, these roles can be subverted and altered.

Mitochondria contain their own DNA, which is distinct from the DNA stored in the nucleus of the cell, and codes for the proteins that the mitochondria need to produce energy. Reznik et al. used next-generation DNA sequencing data produced by The Cancer Genome Atlas consortium to estimate the number of copies of mitochondrial DNA in tumor cells and the adjacent normal tissue. This revealed that in many types of cancer, tumor cells have fewer copies of mitochondrial DNA than the cells that make up normal tissue. In many cases, the depletion of mitochondrial DNA was accompanied by a reduction of the expression of mitochondrial genes, suggesting that mitochondrial activity may be suppressed in these tumor types.

Reznik et al. also found that the number of copies of mitochondrial DNA in certain tumor types is related to the incidence of key 'driver' mutations that cause cells to become cancerous. This knowledge may help to develop new treatments for these tumors.

https://doi.org/10.7554/eLife.10769.002

Introduction

Human cells contain many copies of the 16-kilobase mitochondrial genome, which encodes 13 essential components of the mitochondrial electron transport chain and ATP synthase. Alterations of mitochondrial DNA (mtDNA), via inactivating genetic mutations or depletion of the number of copies of mtDNA in a cell, can impair mitochondrial respiration and contribute to pathologies as diverse as encephelopathies and neuropathies (El-Hattab and Scaglia, 2013), and the process of aging (Balaban et al., 2005; Finkel and Holbrook, 2000). In cancer, a number of studies have examined the role of mtDNA mutations in carcinogenesis (Wallace, 2012; Ju et al., 2014; Larman et al., 2012; He et al., 2010). However, the contribution of changes in the gross number of mtDNA genomes in a tumor (i.e. the ‘mtDNA copy number’) to tumor development and progression has not been adequately investigated.

In contrast to the fixed (diploid) copy number of the nuclear genome, many copies of mtDNA exist within each cell, and these levels can fluctuate. Because mitochondria undergo a constant process of fusion and fission, it is difficult to meaningfully determine the number of mtDNA molecules per mitochondrion. Instead, studies have focused on measuring mtDNA copy number per cell, with estimates for humans that vary between a few hundred and over one hundred thousand copies, depending on the tissue under examination (Wai et al., 2010). Furthermore, because mtDNA serves as a template for the transcription of essential electron transport chain complexes, the quantity of mtDNA in a cell may serve a surrogate marker for the cell’s capacity to conduct oxidative phosphorylation if the copy number of mtDNA is rate-limiting. For instance, a recent study estimated that energy-intensive tissues such as cardiac and skeletal muscle contained between 4000 and 6000 copies of mtDNA per cell, while liver, kidney, and lung tissues averaged between 500 and 2000 copies (D'Erchia et al., 2015).

Mitochondrial dysfunction plays several distinct roles in cancer (Schon et al., 2012; Wallace, 2012; Larman et al., 2012). First, the normal functions of mitochondria (e.g. respiration) may be subverted to support the growth of the tumor. A canonical example of this is the observation that many tumors suppress mitochondrial respiration in favor of increased uptake of glucose and secretion of lactate (‘the Warburg effect’), a phenomenon which has found clinical utility for imaging of tumors using FDG-PET (Vander Heiden et al., 2009). Second, mitochondria are susceptible to mutations in nuclear- and mitochondrially-encoded genes, and a subset of tumors are known to be caused by mutations of the mitochondrial enzymes FH, SDH, and IDH (King et al., 2006). Furthermore, mtDNA dysfunction affecting the electron transport chain can lead to generation of excess reactive oxygen species (ROS), contributing to tumor cell metastasis (Ishikawa et al., 2008).

To date, no comprehensive analysis of mtDNA copy number changes in tumors has been completed, despite a large literature of isolated reports (Yu, 2011). Large-scale studies of mtDNA in cancer have instead focused on the analysis of mutations and heteroplasmy, largely ignoring the contribution of mtDNA copy number variation to the development and progression of tumors. Here, we use whole-genome and whole-exome sequencing data to examine changes in mtDNA copy number across a panel of cancer types profiled by The Cancer Genome Atlas (TCGA) consortium. Using the resulting mtDNA copy number estimates, we ask fundamental questions about mtDNA and cancer. We investigate whether evidence of the Warburg effect can be found in patterns of mtDNA accumulation or depletion. We further examine the connection between gene expression levels and mtDNA copy number, and identify a subset of mitochondrially-localized metabolic pathways exhibiting a high degree of co-expression with mtDNA levels. Finally, we ask whether gross variations of mtDNA copy number are linked to the incidence of somatic alterations (including mutations and copy number alterations) across cancer types. Altogether, our results shed light on the contribution of aberrant mitochondrial function, through changes in mtDNA content, to cancer.

Summary of methods. 

(A) Reads were analyzed to determine the number aligning to each chromosome. Relative abundance of mitochondrial DNA was calculated as the ratio of mtDNA reads to nuclear DNA reads, and corrected for tumor purity and ploidy. The results of these calculations were employed in three different types of analysis. (B) Comparisons across samples profiled by both whole exome and whole genome sequencing provided validation of mtDNA copy number estimates. (C) Pairs of matched tumor/adjacent-normal samples were compared to uncover patterns of mtDNA accumulation and depletion. (D) Using all data available (including tumor samples lacking matched normal samples), statistical associations between mtDNA copy number and (1) mutation/copy number alterations, and (2) gene expression, were calculated.

https://doi.org/10.7554/eLife.10769.003

Results

Calculation of mtDNA abundance

To estimate the copy number of mtDNA in a tumor sample, we implemented a computationally efficient and fast approach based on comparing the number of sequencing reads aligning to (1) the mitochondrial (MT) genome and (2) the nuclear genome. Comparable approaches have been used to estimate somatic copy number alterations within the nuclear genome in cancer [for a review, see Zhao et al. (2013)]. The approach assumes that regions of the genome of equal ploidy should be sequenced to comparable depth. In a normal human cell, the autosomal nuclear genome is at a fixed (diploid) copy number. Thus, by calculating the ratio of reads aligning to the mitochondrial and nuclear genomes, respectively, it is possible to estimate mtDNA ploidy relative to a diploid standard. This approach to assaying mtDNA copy number has been proposed and implemented by others in prior work (Guo et al., 2013; D'Erchia et al., 2015; Samuels et al., 2013).

To estimate mtDNA copy number, we calculated the ratio of (1) the number of sequencing reads mapping to the MT genome (rm) to (2) the number of reads mapping to the nuclear genome (rn) (Equation 1). Because tumor cells can exhibit large-scale genomic amplifications and deletions, and may be infiltrated by stromal and immune cells, we applied a ploidy/purity correction (‘R’), described in detail in the Materials and methods. This calculation yields the relative mtDNA copy number m. Assuming two samples have been processed in identical manners, the sample with a higher value of m contains more copies of mtDNA (Guo et al., 2013; D'Erchia et al., 2015). In line with previous studies (e.g. [Ju et al., 2014]), we observed significant variation in mean mtDNA copy number between sequencing centers, as well as between each batch (i.e., each TCGA plate ID) within a single sequencing center. We applied a batch correction to control for this effect (see Materials and methods).

(1) m=rmrn×R

We applied this method to whole exome sequencing (WXS) and whole genome sequencing (WGS) data from 22 distinct TCGA studies (Figure 2, see Materials and methods for further details on data collection). To validate estimates of mtDNA copy number, we compared estimates from samples submitted to both WXS and WGS. Although mitochondrial reads are abundant in both WGS and WXS, the two sequencing methods capture mtDNA at different efficiencies: exome sequencing involves the targeted enrichment of exonic regions prior to sequencing and does not target mtDNA (Samuels et al., 2013), while WGS sequences cellular DNA in an unbiased manner. If our approach to estimating mtDNA copy number is accurate, then we expect that the two sequencing platforms should offer comparable estimates of mtDNA copy number across a panel of samples, i.e., samples with high mtDNA copy number in WGS should have similarly high mtDNA copy number in WXS. We compared mtDNA copy number estimates in 1110 samples across 8 tumor types profiled by both WXS and WGS, controlling for sequencing center and TCGA plate ID. We confirmed that across all combinations of cancer types and sequencing centers, WXS and WGS offer significantly correlated estimates of mtDNA copy number (Figure 2—figure supplement 1).

Figure 2 with 1 supplement see all
Summary of data.

Whole-exome and whole-genome sequencing data were obtained from 22 TCGA studies. Abbreviations for each cancer type follow the standard TCGA nomenclature. The data were processed at four different sequencing centers, each of which was analyzed separately. Over 1000 samples were paired instances of tumor/adjacent-normal tissue from the same patient, which were used to quantify changes in mtDNA content across tumors.

https://doi.org/10.7554/eLife.10769.004

Gross changes in mtDNA content are evident in many cancers

Do tumors have different numbers of copies of mtDNA compared to normal tissue? We investigated whether tumor samples showed a significant change in mtDNA content, relative to matched normal tissues. To do so, for each pair of tumor/adjacent-normal samples collected from the same patient, sequenced at a single sequencing center and within the same batch (1090 pairs in total), we calculated the ratio

(2) r=log2(mTmN)

where mT and mN are the mtDNA copy number estimates in tumor and normal tissues, respectively. We then used non-parametric Wilcoxon signed rank tests to assess whether each cancer type was signficantly enriched for tumor samples with higher or lower mtDNA content than matched normal tissue. The analysis was restricted to 15 cancer types for which we had at least 10 matched tumor/normal pairs. To ensure a meaningful comparison, we only used adjacent-normal tissue (and not blood) for the analysis. We elected to focus on analyzing whole-exome sequencing data, for which we had the largest number of samples. A complete list of all calculations is available in Supplementary file 1.

Strikingly, seven of the fifteen tumor types analyzed showed a statistically significant (BH-corrected Mann-Whitney p-value <0.05) decrease in mtDNA abundance in tumor samples (Figure 3A). These ‘mtDNA-depleted’ tumor types included bladder (BLCA), breast (BRCA), esophogeal (ESCA), head and neck squamous cell (HNSC), kidney (both clear-cell, KIRC, and papillary, KIRP, but not chromophobe, KICH, subtypes), and liver (LIHC) cancers. Despite a tendency towards mtDNA depletion, all tumor types contained at least one sample with higher mtDNA content than adjacent normal tissue. Nevertheless, the depletion effect was exceptionally strong in several tumor types: except for a handful of WGS samples, nearly all bladder tumors were depleted of mtDNA. Similarly, 87% of clear-cell kidney tumor samples contained less mtDNA than their normal tissue counterparts. In contrast, a single tumor type, lung adenocarcinoma (LUAD), showed statistically significant mtDNA accumulation. In cases where sufficient numbers of both WGS and WXS data were available (bladder, breast, head and neck, clear-cell kidney, thyroid, endometrial, and lung adenocarcinomas), we observed a consistent effect across samples processed by both platforms (Figure 3—figure supplement 1).

Figure 3 with 5 supplements see all
Many tumor types show depletion of mtDNA in tumor samples, relative to adjacent normal tissue.

Normalized histograms and density plots illustrate log2 ratio of mtDNA content in tumor tissue, to mtDNA content in normal tissue. Each row is a different tumor type. Statistical significance of trends is assessed using a Wilcoxon sign rank test, and p-values are corrected using the Benjamini-Hochberg procedure. Cancer types displaying significant depletion/accumulation of mtDNA are colored in blue/red. Seven of fiteen tumor types show a significant depletion of mtDNA content (a shift of the distribution to the left of the dashed line), relative to normal tissue. One tumor type, lung adenocarcinomas, shows an increase in mtDNA content, relative to normal tissue.

https://doi.org/10.7554/eLife.10769.006

Our estimates of tumor mtDNA content are based on sequencing DNA of bulk tumor tissue, which includes stromal and immune cell infiltration. While the effect of purity on mtDNA copy number is partially accounted for by the correction factor R, it is still possible that tumor impurity may bias the calculation of copy number (for example, if immune cells have lower mtDNA copy number than adjacent-normal tissue). Although our computational method is unable to deconvolve sequencing traces arising from tumor cells vs. infiltrate, we nevertheless investigated the statistical association between tumor mtDNA content and stromal/immune infiltration. We obtained estimates of stromal and immune cell infiltration based on gene expression data for eight tumor types calculated using the ESTIMATE algorithm (Yoshihara et al., 2013). We correlated these values with estimates of mtDNA copy number, with full results reported in Figure 3—figure supplements 25 and Figure 3—source data 1, with p-values reported as uncorrected for multiple hypothesis testing. We observed a recurrent but weak negative correlation between tumor mtDNA content and immune infilitration score. Further detailed study is required to trace the contribution of infiltrating cells to mtDNA content.

To further relate changes in mtDNA content to clinical progression of disease, we used Cox regression to determine if tumor mtDNA copy number was a significant predictor of patient survival (Figure 4). In total, five cancer types showed statistically significant association between patient survival and mtDNA content. In three cancer types (adrenocortical carcinoma, p-value 0.026; chromophobe renal cell carcinoma, p-value 0.053; and low-grade glioma, p-value 0.009), high tumor mtDNA content was associated with better survival. The opposite trend, of poor survival in patients with high tumor mtDNA, was observed in clear-cell renal cell carcinoma (p-value 0.023) and melanoma (p-value 0.043). The finding regarding KICH is particularly intriguing given the central role mitochondrial dysfunction has been proposed to play in the disease (Davis et al., 2014). That mtDNA copy number correlates with better or worse survival, depending on cancer type, suggests that other confounding factors strongly tied to survival, such as the presence of somatic mutations, may influence mtDNA levels. In a later section, we will investigate this hypothesis.

mtDNA content is significantly associated with patient survival in (A) adrenocortical (ACC) and (B) kidney chromophobe carcinoma (KICH).

For visualization purposes, patients are partitioned into two groups, based on tumor mtDNA copy number relative to the median mtDNA copy number across all tumor samples in the cancer type. Cox regression identified a significant association between high tumor mtDNA and better survival in these two tumor types (ACC, p-value 0.026; KICH, p-value 0.053).

https://doi.org/10.7554/eLife.10769.013

mtDNA copy number is correlated to the expression of mitochondrial metabolic genes

Proteins encoded in mtDNA localize exclusively to the mitochondrial electron transport chain and ATP synthase, and fluctuations in mtDNA copy number are well-known to influence the level of transcription of these genes. It has also been observed that complete depletion of mtDNA in cell lines by exposure to ethidium bromide affects a number of additional signaling pathways (Chandel and Schumacker, 1999). Thus, we were compelled to ask if changes in mtDNA content narrowly influenced changes in the expression of oxidative phosphorylation genes, or if they were more broadly connected to the other functions of mitochondria.

Our approach to this question was to search for gene sets whose transcriptional signatures were highly correlated to mtDNA copy number. To do so, we calculated the non-parametric Spearman correlation between the expression of each gene and mtDNA copy number, and then used the mean-rank gene set test implemented in limma (Law et al., 2014) to identify gene sets which were significantly enriched for highly correlated genes.The approach was applied in an unbiased manner to all Reactome gene sets in the Canonical Pathways group from the MSigDB database (Liberzon et al., 2011).

In general, each tissue exhibited specific gene sets which were strongly correlated to mtDNA copy number levels. However, when aggregating across all cancer types, mitochondrially-localized metabolic pathways showed the most frequent significant correlation with mtDNA abundance (Figure 5 and Supplementary file 2, Worksheet Fig5Data). This recurrent positive correlation between expression of mitochondrial genes and mtDNA copy number across many tumor types served as a second, independent validation that estimates of mtDNA copy number reflected in vivo mtDNA ploidy. We also calculated the correlation between mtDNA copy number and the expression of TFAM, a critical transcription and replication factor which binds to mtDNA in nucleoids, and found a significant positive correlation (Spearman p-value <0.05) in 34% of studies (Figure 5—source data 1).

Figure 5 with 1 supplement see all
Gene set analysis identifies pathways correlated to mtDNA content.

(A) Correlations between all genes and mtDNA content are calculated. Then, gene sets enriched for high/low correlation coefficients are identified. (B) mtDNA copy number is most strongly correlated to metabolic pathways including respiratory electron transport and the TCA cycle, which are localized to the mitochondria. Enrichment score corresponds to the -log10 p-value of the statistical enrichment test, accounting for the sign of the correlation (i.e. positive or negative correlation). Red blocks indicate an enrichment for positive correlation, blue blocks indicate an enrichment for negative correlation. The top ten most frequently positively correlated gene sets across all studies are depicted. Full results are available in Supplementary file 2.

https://doi.org/10.7554/eLife.10769.014

In line with expectation, we found that the ‘TCA Cycle and Respiratory Electron Transport’ gene set was the most frequently correlated to mtDNA copy number (1st out of 674 gene sets). Among the remaining top positively correlated gene sets, many were metabolism-related, and included mitochondrial beta oxidation of fatty acids, and branched chain amino acid (BCAA) catabolism. BCAAs (valine, leucine, isoleucine) are essential amino acids whose catabolism depends on the activity of an enzyme complex, branched chain α-keto acid dehydrogenase, located in the inner mitochondrial membrane (Hutson et al., 1988), and prior work has demonstrated that dietary supplementation of BCAA’s promotes mitochondrial biogenesis (Valerio et al., 2011; D'Antona et al., 2010). Furthermore, a recent study has shown that elevated plasma levels of BCAAs are found 2 to 5 years before a cohort of patients developed pancreatic ductal adenocarcinoma (Mayers et al., 2014).

A number of gene sets showed recurrent negative correlation to mtDNA copy number (Figure 5—figure supplement 1 and Supplementary file 2, Worksheet Fig5Data). Several of these gene sets, including those related to mRNA processing and the cell cycle, are associated with known non-metabolic functions of mitochondria in the cell. In particular, the replication of mitochondria and mtDNA is intimately linked to the cell cycle (Chatre and Ricchetti, 2013), and the nucleotide precursors to mtDNA are in part produced de novo, via a pathway that is only active during the S phase of the cell cycle (Sigoillot et al., 2003). Several immune pathways, including those related to interferon signaling, are also frequently negatively correlated with mtDNA content. This is interesting in light of the role that mitochondria play in innate immunity (West et al., 2011; Weinberg et al., 2015). Of particular interest is a recent report by West and colleagues (West et al., 2015), demonstrating that mtDNA stress induced by depletion of TFAM triggered the innate immune response via interferon-stimulated genes and anti-viral signaling. Of the seven tumor types shown to be depleted of mtDNA in Figure 3, five (BLCA, BRCA, ESCA, HNSC, KIRC ) exhibit a negative correlation between expression of immune system genes and tumor tissue (but not necessarily normal tissue) mtDNA content.

A subset of tumor types did not show strong positive correlation between mtDNA copy number and expression of mitochondrial metabolic genes. In some cases, this was the result of an apparently dominant correlation with another pathway. Interestingly, in prostate adjacent normal tissue, the expression of mitochondrial respiratory genes was anti-correlated to mtDNA content (see Supplementary file 2). We speculate that this effect may be associated with the unique mitochondrial metabolism of prostate epithelia, which secrete large amounts of citrate generated in the mitochondria, rather than oxidizing it further and using the resulting NADH in the respiratory electron transport chain (Costello et al., 1997; 2004).

Association with mutations and copy number alterations

The landscape of genetic events driving tumors is diverse, and the presence and activity of these genetic lesions is now being used in design of clinical trials and development of new treatments (Rubio-Perez et al., 2015). We sought to understand whether mtDNA abundance was associated with the incidence of particular mutations/copy number alterations (CNAs) in patient samples. To do so, we evaluated whether patients with a particular genetic lesion showed statistically significant increases or decreases in tumor mtDNA abundance, compared to wild-type samples. We restricted analysis to whole-exome sequencing data and which were not under embargo by the TCGA as of March 2015. All results for the analysis are reported in Figure 6 and Supplementary files 3 and 4.

Figure 6 with 1 supplement see all
mtDNA content is correlated to the incidence of certain mutations and copy number alterations.

Each point corresponds to a single alteration (e.g TP53 mutation). Direction of arrow indicates whether alteration increases or decreases mtDNA content. X-axis in (A) and (C) indicates the fraction of samples in a cancer-type that contained the alteration (i.e., 20% of LGG samples were 10q deleted). (A) 73 out of 1896 copy number alterations (CNAs) tested were found to be significantly associated with mtDNA content (Mann-Whitney p-value <0.05). CNAs from the UCEC cancer type were excluded because of strong association between mtDNA copy number and the ‘copy-number high’ serous-like subtype in UCEC, shown in (B). (B) The UCEC serous-like subtype displays a marked increase in tumor mtDNA copy number, relative to endometrial tumors of other subtypes. (C) Relatively few mutations (3 out of 3954 tested) associate significantly with tumor mtDNA content (Mann-Whitney p-value <0.05). The UCEC associations are likely the result of the correlation to the serous-like subtype. (D) IDH1 and PTEN mutation status is linked to tumor mtDNA copy number in LGG.

https://doi.org/10.7554/eLife.10769.017

The most apparent result of our analysis was the association of a large number of CNAs in endometrial carcinomas (UCEC) with increased mtDNA abundance. Recent work by the TCGA proposed a subtype stratification of endometrial carcinomas based on mutation and CNA frequency (Kandoth et al., 2013). Among these subtypes is a serous-like ‘copy-number-high’ subtype with large numbers of somatic CNAs. We obtained the UCEC subtype classifications and confirmed that serous-like endometrial carcinomas exhibited substantially higher mtDNA copy number than all other subtypes (Mann-Whitney p-value 7×10-6, Figure 6), explaining the large number of associations we observed. TP53 mutations are enriched in the serous-like subtype, and these mutations also showed statistically significant association with mtDNA abundance (BH-corrected p-value 0.012).

After removing associations in UCEC, we were left with a small number of statistically significant mutations and CNAs associated with mtDNA abundance. Among these, the strongest signal arose from increased tumor mtDNA content in IDH1-mutant low grade gliomas (Figure 6, BH-corrected p-value 0.012). Both IDH1 and IDH2 activating mutations induce production of the so-called ‘onco-metabolite’ 2-hydroxyglutarate, which competitively inhibits α-ketoglutarate-dependent histone demethylases and 5-methylcytosine hydroxylases, inducing a hypermethylation phenotype (Turcan et al., 2012; Xu et al., 2011). Surprisingly, IDH2 mutations showed no statistically significant change in mtDNA abundance, suggesting that the effect is specific to the cytosolic isoform IDH1. Notably, mutations in PTEN were associated with a significant decrease in mtDNA abundance (BH-corrected p-value 0.033). These results echo a complementary finding by Navis and colleagues (Navis et al., 2013), who reported that a mutant IDH1 R132H oligodendroglioma xenograft model displayed high densities of mitochondria and increased levels of mitochondrial metabolic activity. They proposed that an increase in mitochondrial mass would increase activity of mitochondrial IDH2 and compensate for loss of activity introduced by mutant IDH1.

Finally, prompted by a recent report implicating mutations in mtDNA itself with the pathology of kidney chromophobe carcinomas (KICH) (Davis et al., 2014), we investigated the connection between mtDNA copy number and mtDNA mutations in KICH. Using somatic mtDNA mutation calls provided by the TCGA (Davis et al., 2014), we examined whether mtDNA-mutated samples were likely to have more or fewer mtDNA copies than unmutated samples. We found that samples with mtDNA indels contained much higher quantities of mtDNA than unmutated samples (Mann-Whitney U-test p-value 0.002, Figure 6—figure supplement 1). The same effect was not found when examining only single nucleotide variants. These results suggest that the presence of inactivating mtDNA mutations may induce increased mtDNA replication, perhaps as a response to inadequate mitochondrial energy production.

Immunohistochemical investigation of respiratory protein content

So far, our findings have indicated that a number of tumor types appear to be depleted of mtDNA relative to normal tissue, and that in some (but not all) cases, the amount of mtDNA in a sample is correlated to the expression of respiratory genes. However, in some cancer types (e.g. bladder), tumors exhibited depletion of mtDNA (Figure 3), but expression of mitochondrial genes was not correlated to mtDNA copy number (Figure 5). This discrepancy is reminiscent of prior work describing mtDNA depletion which was not accompanied by a drop in respiratory activity or mitochondrial protein expression. Instead, a compensation of respiratory activity was described in cases of mtDNA depletion caused by either genetic alterations (Seidel-Rogol and Shadel, 2002; Barthélémy et al., 2001; Dorado et al., 2011) or reverse-transcriptase inhibitors (Kim et al., 2008; Miró et al., 2004; Stankov et al., 2007).

To investigate whether mtDNA depletion was associated with a concurrent decrease of mitochondrial protein expression, we examined the abundance of a mitochondrial protein using immunohistochemistry (IHC) (Thermo Fisher Scientific Mitochondria Ab-2, Clone MTC02, see Materials and methods) in 3 tumor/normal pairs of clear-cell renal cell carcinoma, papillary renal cell carcinoma, and high-grade muscle-invasive urothelial bladder carcinoma (corresponding to TCGA studies KIRC, KIRP, and BLCA respectively; Figure 7, and Figure 7—source data 1). In KIRC, which was the most strongly mtDNA-depleted tumor type in Figure 3, we found significant depletion of mitochondrial protein in all tumor samples compared to adjacent normal renal parenchyma. In KIRP, for which 69% of paired samples were depleted of mtDNA in Figure 3, we observed a more subtle depletion of mitochondrial protein in 2/3 tumor samples, compared to adjacent normal renal parenchyma. In BLCA, we found that 2/3 BLCA tumors showed increased levels of mitochondrial protein, which contrasted with Figure 3, where nearly all samples showed evidence of mtDNA depletion.

Top panel depicts H&E stains, and bottom panel depicts immunohistochemistry with antibody against mitochondrial protein.

In all H&E stains, red ‘T’ indicates tumor tissue, while blue ‘N’ indicates normal tissue. Orientation of tumor/normal tissue is mirrored in bottom panel. (A) H&E-stained section shows clear cell renal cell carcinoma (top left, KIRC Sample 1 from Figure 7—source data 1) with the classical features of tumor nests with clear cytoplasm, separated by intricate, branching vascular septae, and adjacent non-neoplastic renal parenchyma (lower right). (B) KIRC Sample 1 immunohistochemical staining with MITO Ab2 antibody reveals markedly lower mitochondrial content (cytoplasmic, brown granular positivity) in clear cell RCC compared to normal tubules. (C) H&E-stained section shows papillary renal cell carcinoma type 1 (KIRP Sample 3) with tumor (top right) and normal tubules (lower left). (D) KIRP Sample 3 immunohistochemical stain with MITO Ab2 antibody shows KIRP with a slightly weaker positivity compared to normal tubules. (E) H&E-stained section showing invasive high grade urothelial carcinoma (lower left) with sheets of tumor cells in the lamina propria and the overlying normal urothelium (top right). (F) Immunohistochemical staining with MITO Ab2 antibody reveals slightly higher mitochondrial staining in urothelial carcinoma compared to normal urothelium.

https://doi.org/10.7554/eLife.10769.019

Collectively, our results from IHC regarding mitochondrial protein expression agree with those from sequencing in 2/3 cancer types (KIRC and KIRP). In a third cancer type (BLCA), mtDNA depletion as quantified by sequencing is not mirrored by a synchronous down-regulation of mitochondrial protein levels. As mentioned earlier, our results from gene expression analysis (Figure 5) indicate that mtDNA copy number is correlated to mitochondrial respiratory gene expression in KIRC and KIRP, but not BLCA. In fact, in BLCA, the gene sets most strongly correlated to mtDNA copy number were associated with the cell cycle and immune response. This suggests that other mechanisms compensate for the depletion of mtDNA in BLCA (and potentially in other cancer types), which is further discussed in the concluding section. Taken together, these results support the notion that factors besides mtDNA copy number can determine the rate of mitochondrial transcription, and that mtDNA depletion is not sufficient evidence to conclude that mitochondrial respiration is down-regulated in a tumor.

Discussion

In this study, we have investigated the variation of mtDNA copy number levels across many tumor types, arriving at several intriguing observations. Across nearly half of the tumor types we studied, we found evidence for depletion of mtDNA, relative to adjacent normal tissues. Orthogonal measurements of transcription levels (via RNA-Seq) and mitochondrial protein levels (via IHC) in a subset of these samples linked this variation to downregulation of mitochondrially-localized metabolic pathways, in some but not all tumor types.

Our findings of gross changes in mtDNA content in tumors echo a number of prior but isolated observations, largely based on quantitative PCR measurements and with substantially smaller sample sizes, of mtDNA copy number changes in cancers (see [Yu, 2011] for a thorough review). For example, oncocytomas (not analyzed in this work) are well-known to be characterized by the excessive accumulation of mitochondria (Tickoo et al., 2000). Furthermore, decreases in mtDNA copy number have been reported in breast cancer (Mambo et al., 2005; Fan et al., 2009), liver cancer (Lee, 2004), and clear-cell kidney cancers (Meierhofer et al., 2004; Nilsson et al., 2015). While the majority of our observations agree with prior work (when comparing to [Yu, 2011]), some of our results are in contradiction to prior studies. The discordance between findings seems in part due to inadequate sample sizes, and incomplete or unavailable matched normal tissue. For example, in contrast to (Mambo et al., 2005) and (Wang et al., 2005), we find no clear increase or decrease in mtDNA content in thyroid or endometrial carcinomas, respectively. However, (Mambo et al., 2005) profiled 20 paired thyroid tumors, versus 66 paired thyroid tumors in this report; and (Wang et al., 2005) utilized unpaired samples of tumor and normal endometrial tissue (Wang et al., 2005), versus 32 paired samples here.

We further showed that mtDNA ploidy alone cannot be used as a surrogate for the respiratory activity of a tumor sample. The literature contains several reports of mtDNA copy number depletion without reduction in mitochondrial transcription/respiratory activity, both in vitro and in vivo. In (Seidel-Rogol and Shadel, 2002), HeLa cells depleted of mtDNA by culture in ethidium bromide showed substantial mitochondrial transcription despite the fact that mtDNA, TFAM, and mitochondrial RNA polymerase were all at depleted levels. There, the authors suggest that an excess of TFAM and mitochondrial RNA polymerase prior to depletion may ensure that, even once depleted, transcription is sustained. Another report examined mtDNA depletion as a result of thymidine kinase 2 deficiency in mice, and observed a down-regulation of the mitochondrial transcriptional terminator MTERF3 in heart tissue. As a result, the expression of mitochondrial transcripts (ND6 and COX1) increased in heart tissue, as did the ratio of the levels of these transcripts to mtDNA levels. The consequence of this transcriptional compensation was that the heart tissue was spared from respiratory deficiency (Dorado et al., 2011). In tandem with our report, these findings emphasize a nuanced connection between mtDNA copy number and respiratory gene expression. We would argue strongly that future studies investigating changes in mtDNA in tumors should quantify mtDNA protein expression in parallel with estimating mtDNA copy number. A number of related open questions remain to be resolved, including what mechanisms determine the incidence and/or extent of compensation to mtDNA depletion, and what the consequences of mtDNA depletion may be when such compensation takes place (e.g. upregulation of the immune response).

While mtDNA depletion or accumulation may typify certain cancer types, we further identified that subsets of patient samples, characterized by the presence of particular somatic mutations/copy number alterations, were enriched/depleted in mtDNA. The presence of activating IDH1 mutations (in low grade gliomas) or a large number of copy number alterations (in serous-like endometrial carcinomas) is strongly correlated to high tumor mtDNA content. If these tumors (and others with increased mtDNA content) have an increased dependence on mitochondrial metabolism to proliferate, using mitochondrially-targeted therapies (e.g. metformin) may be a therapeutic opportunity. Similarly, vulnerability to mitochondrially-targeted therapies might arise from disabling passenger mutations in genes required for mtDNA copy number maintenance (e.g. DNA polymerase gamma). Both hypotheses should be amenable to investigation in carefully chosen cell line models of cancer.

A number of reports have now described extensive genetic heterogeneity of some tumor types (e.g. kidney cancers [Gerlinger et al., 2012]), where spatially distinct biopsies isolated from the same patients have non-overlapping somatic alterations. However, no reports have examined how mitochondrial DNA mutations and copy number vary spatially across a tumor. Variation of this kind, if it exists, might reflect functional diversity in mitochondrial metabolic activity and signaling in different regions of a tumor. Alternately, it would be of particular interest to trace the time-evolution of mtDNA content in a single patient over the course of treatment. As critical players in immunity, signaling, and metabolism, we suspect that mitochondria will inevitably play a role in the evolution of resistance to therapeutic intervention.

Materials and methods

Data acquisition

Whole exome sequencing (WXS) and whole genome sequencing (WGS) BAM files for 22 distinct TCGA studies were obtained from the TCGA CGHub repository (Figure 2) (Wilks et al., 2014). We restricted our analyses to sequence data aligned to GRCH37 using the mitochondrial Cambridge Reference Sequence (CRS). We focused only on primary tumor, adjacent normal tissue, and normal blood samples (‘01’, ‘11’, and ‘10’ in the sample type field of the TCGA barcode). We further restricted our analyses to samples which were not whole-genome amplified prior to sequencing (i.e., we only used samples containing ’D’ in the analyte field of the TCGA barcode), because such amplification could potentially bias the relative abundances of mitochondrial and nuclear DNA in the sample.

Samtools (Li et al., 2009) was used to extract reads aligning to the mitochondrial genome meeting the following critieria: (1) passed quality-control, (2) were not marked as duplicate reads, (3) were properly paired, and (4) were aligned with Phred-scaled mapping quality (MAPQ) >30. The number of such reads aligning to the mitochondrial genome was compared to the number of such reads aligning to the nuclear genome.

The pipeline described above includes a number of controls to ensure that mtDNA copy number estimates are not influenced by nuclear integrations of mitochondrial sequences (NUMTs) (Hazkani-Covo et al., 2010). A direct result of restricting analysis to properly paired reads is that reads whose mate mapped to a different chromosome are removed prior to copy number calculation. Furthermore, by requiring a conservative Phred-scaled minimal mapping quality of 30 (equivalent to a 99.9% likelihood that reads are aligned to the correct genomic location), reads with homology to nuclear-encoded NUMTs are removed prior to copy number calculations. Prior work has established that more lenient mapping quality thresholds of 20 are sufficient for accurately calling mtDNA copy number (Ding et al., 2015)

A complete list of all copy number estimates is available in Supplementary file 1.

Purity and ploidy calculation and correction

Affymetrix SNP6 arrays for tumor and normal samples were acquired for 22 cancer types from the TCGA. Arrays for each individual cancer type were processed together, quantile-normalized and median polished with Affymetrix power tools using the birdseed algorithm to obtain allele-specific intensities. PennCNV (Wang et al., 2007) was used to generate log R ratio and B-allele frequencies for each tumor. ASCAT (Van Loo et al., 2010) was used to generate allele-specific copy number and estimate tumor ploidy and purity using matched arrays from tumor and normal tissue.

In order to estimate mtDNA copy number in Equation 1, we compared the number of reads aligning to the mitochondrial genome to the number of reads aligning to a genome of known ploidy. For samples of normal tissue, we assumed this known ploidy was equal to 2. For tumor tissue which may be infilitrated by stromal/immune cells and copy-number altered, we need to correct for the ‘effective ploidy’ of the sample. We define this correction factor to be

(3) RTumor=Purity×Ploidy+(1-Purity)×22

where the purity and ploidy values are obtained from ASCAT, as described above. When a sample is composed of pure normal tissue, R=1.

Correction for sequencing center and plate ID

Inspection of mtDNA copy number results indicated a potential association between mtDNA copy number and processing batch. This is consistent with prior reports, e.g. (Ju et al., 2014), which described large variation in efficiency of mtDNA depletion in exome sequencing in a sequencing-center-dependent manner. We separately examined the log10 mtDNA copy number for each TCGA plate ID for (1) blood, and (2) tissue-derived (tumor and adjacent-normal tissue) samples. Kruskal-Wallis tests using either blood or tissue-derived mtDNA copy number indicated significant differences in median mtDNA copy number between TCGA plates in 21/22 whole exome sequencing (WXS) datasets (p-value <0.05). In contrast 3/10 whole genome sequencing (WGS) datasets showed significant differences between TCGA plates using tissue-derived mtDNA copy number (5/10 using blood-derived mtDNA copy number). Manual inspection further indicated that the magnitude of the batch effect was smaller in WGS compared to WXS.

We also calculated, for each TCGA plate i in a given cancer type, the mean mtDNA copy number in (1) blood (mib) and (2) tumor/adjacent-normal tissue (mit). We observed a statistically significant positive linear correlation (Pearson p-value <0.1 ) between mt and mb in 17/19 cancer types profiled with WXS, but in 0/7 cancer types profiled with WGS (analysis restricted to studies with adequate numbers of samples, defined as at least 3 different TCGA plate IDs with at least 3 blood and 3 tissue samples). Importantly, in many cancer types, tumor and blood from the same patient were often processed in different TCGA plates (e.g. 30% in BLCA, 30% in BRCA, and 48% in STAD), and there was no reason a priori to expect that blood and tumor mtDNA estimates should be correlated across batches. The existence of such a correlation suggested a TCGA-plate-dependent contribution to the observed copy number of all samples (both blood and tissue-derived) processed within that plate.

Taken together, the results above suggested that a batch effect associated with TCGA plate ID/sequencing center was present in WXS, and possibly also in WGS. Based on these considerations, the small magnitude of the batch effect in WGS, and the prior literature describing significant variation in mtDNA sequencing depth in exome sequencing due to differences in exonic enrichment (Ju et al., 2014), we elected to control for a batch effect in WXS, but not WGS. Importantly, in Supplementary file 1, we report uncorrected mtDNA copy number, corrected mtDNA copy number, and plate IDs for all samples analyzed, so that future work by others may model the effect in whatever manner they see fit.

The set of observed log10 mtDNA copy numbers M are indexed by the TCGA plate i, the tissue type (either tumor/adjacent-normal or blood) j, and the sample k. We fit the following linear model to the mtDNA copy numbers Mijk:

(4) Mijk=μ+plateαiPi+tissueβjTj+ϵijk

where Pi and Tj are indicator variables corresponding to the plate and tissue that a sample comes from. The parameter μ is the grand mean of mtDNA copy number, α the effect attributable to the plate and β that attributable to the tissue. The residual ϵ is the sum of the true log10 mtDNA copy number and measurement error. We constructed a linear model corresponding to Equation 4 for each cancer type. Then, we corrected the observed log10 mtDNA copy number for each WXS sample according to the contribution α^ from the corresponding TCGA plate ID. Corrected copy number values were used for subsequent survival, gene expression, and somatic alteration analysis.

Survival analysis

Survival analysis was performed with univariate Cox proportional hazards regression models where the independent and dependent variables were the log10-transformed corrected mtDNA copy number and the overall survival respectively. The p-values for the significance of mtDNA copy number as a predictor of survival were obtained from Wald tests.

Gene set analysis

RSEM normalized RNA-Seq gene expression data were downloaded from the Broad Firehose, using the most recent data as of November 4, 2014. Data were filtered to remove genes with average read count less than 16. We calculated the nonparametric Spearman correlation and associated p-value between the expression of each gene and the copy number of mtDNA. To remove putatively spurious correlations, we identified all genes with a p-value greater than 0.05, and set the correlation coefficient for those genes equal to zero. We then use the geneSetTest function in the limma (Law et al., 2014) package to test whether particular gene sets showed an enrichment for positive/negative correlations, relative to the distribution of correlations across all genes. Thus, a single p-value was calculated for each alternative hypothesis. All p-values were adjusted using the Benjamini-Hochberg procedure. The method described above was applied to every Reactome pathway in the MSigDB canonical pathways gene set. The enrichment score depicted in Figure 5 is the -log10 corrected p-value of this statistical test, with the color (red or blue) indicating the direction of enrichment.

The analysis was run separately for tumor and normal tissues. We applied our gene set analysis pipeline to all studies for which we had at least 20 samples of RNA-Seq data (in order to retain sufficient statistical power). Analyses were run for each combination of tumor type and tissue, and ensuing results were then aggregated across all studies. All results from the analyses are provided in the Supplementary file 2.

Mutation and copy number alteration analysis

For each study, Gistic2 and MutSigCV results were downloaded from the Broad Firehose (most recent data as of Nov 14, 2014). From Gistic, we retained all arm-level and focal alterations with q-value less than 0.1. For mutations, we obtained the MAF file from the output of MutSig. For each gene, we calculated the number of patients in which this gene exhibited a nonsynonymous, coding mutation (i.e., missense, non-sense, frameshift, in-frame insertion/deletions, and splice-site mutations), excluding those with greater than 600 non-synonnymous coding mutations). We then retained any genes which were mutated in greater than 4% of patients.Non-parametric Mann-Whitney U-tests were used to evaluate whether tumors bearing a particular somatic alteration contained significantly higher/lower amounts of mtDNA in tumor samples. After testing all associations, p-values obtained from the U-tests were corrected using the Benjamini-Hochberg procedure.

Histology

All tissues were fixed in 10% neutral-buffered formalin and paraffin embedded as part of a routine surgical pathology procedure and 5-micron-thick sections stained with Hematoxylin and eosin (H&E) were reviewed. Immunohistochemical (IHC) analysis was performed on 5-micron-thick sections by Ventana, Discovery XT immunohistochemical stainer. The sections were deparaffinized and subjected to heat induced antigen retrieval using CC1 at high pH before primary incubation with MITO Ab2 (mouse monoclonal, clone MTC02, Neomarkers, 1:50 dilution). Slides were then counterstained with hematoxylin, dehydrated and cover-slipped.

References

  1. 1
  2. 2
  3. 3
  4. 4
  5. 5
  6. 6
  7. 7
  8. 8
  9. 9
  10. 10
  11. 11
  12. 12
  13. 13
  14. 14
  15. 15
  16. 16
  17. 17
  18. 18
  19. 19
    Role of mitochondrial transamination in branched chain amino acid metabolism
    1. SM Hutson
    2. D Fenstermacher
    3. C Mahar
    (1988)
    The Journal of Biological Chemistry 263:3618–3625.
  20. 20
  21. 21
  22. 22
  23. 23
  24. 24
  25. 25
  26. 26
  27. 27
  28. 28
  29. 29
  30. 30
  31. 31
  32. 32
  33. 33
  34. 34
  35. 35
  36. 36
  37. 37
  38. 38
  39. 39
  40. 40
  41. 41
  42. 42
    Relationship of mitochondrial DNA depletion and respiratory chain activity in preadipocytes treated with nucleoside reverse transcriptase inhibitors
    1. MV Stankov
    2. T Lücke
    3. AM Das
    4. RE Schmidt
    5. GM Behrens
    6.  German Competence Network HIV/AIDS
    (2007)
    Antiviral Therapy 12:205–216.
  43. 43
  44. 44
  45. 45
  46. 46
  47. 47
  48. 48
  49. 49
  50. 50
  51. 51
  52. 52
  53. 53
  54. 54
  55. 55
  56. 56
  57. 57
  58. 58
  59. 59

Decision letter

  1. Chi Van Dang
    Reviewing Editor; University of Pennsylvania, United States

In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.

Thank you for submitting your work entitled "Mitochondrial DNA Copy Number Variation Across Human Cancers" for peer review at eLife. Your submission has been favorably evaluated by Randy Schekman as Senior editor and two reviewers, one of whom, Chi Dang, is a member of our Board of Reviewing Editors.

The reviewers have discussed the reviews with one another and the Reviewing editor has drafted this decision to help you prepare a revised submission.

Summary:

The manuscript by Reznik et al. reports a comprehensive informatic analysis of mitochondrial DNA content in ~13,000 tumor samples from the TCGA effort. This study is potentially of high significance as it addresses systematically mtDNA copy number changes in various solid tumors for the first time. Other studies have done this before, but usually only on one tumor type and often with not enough samples to draw strong conclusions. This remarkable study uncovers the broad depletion of mtDNA in tumor vs normal samples across many, but not all tumor types. This study goes further to try and relate mtDNA copy number changes to gene expression and mutational changes, which is lauded and interesting. Informatic analyses also reveal that mtDNA content correlates with specific changes such as IDH1 mutation. Further, mtDNA content correlates with clinical outcome for certain tumor types. If the mtDNA copy number changes found are indeed valid, the study does break interesting new ground and moves the field forward.

Essential revisions:

1) There are hundreds of insertions of mtDNA sequences in the nuclear genome (so called NUMTs), some are even full-length mtDNA insertions. The authors do not mention these or acknowledge that they would be incorrectly assigned as actual mtDNA reads in the whole-genome sequencing method of mtDNA copy number. This may lead to false positive results regarding mtDNA copy number in the tumors and need to be subtracted from the analysis.

2) The ASCAT algorithm seems quite good at factoring out ploidy differences, but less robust with regard to the purity issue. Thus, there remains concern that the differences in mtDNA copy number observed somehow reflects the different cell composition of the tumor versus control normal tissue. For example, immune and stromal cells might have lower basal mtDNA copy number than resident tissue. The tumor samples assayed almost certainly have a high percentage of these cell types compared to normal tissue and hence the trend toward depletion in many of the tumor samples may simply reflect this. Can the authors provide some histology to bolster their conclusions? It is too much to do this for all the samples, but one or two representative cases to support their conclusions seem necessary.

3) TFAM is an abundant mtDNA-binding protein and transcription/replication factor that often correlates with mtDNA copy number. Can the authors address whether mutations, gene expression changes, or amounts of TFAM correlate with the mtDNA copy number changes they report?

4) Recently mtDNA stress (depletion and TFAM reductions) has been shown to activate innate immune pathways (e.g. interferon stimulated genes) (West et al. Nature 2015). Is there a gene expression profile indicative of this in the tumors that show mtDNA depletion?

[Editors' note: further revisions were requested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "Mitochondrial DNA Copy Number Variation Across Human Cancers" for further consideration at eLife. Your revised article has been favorably evaluated by Randy Schekman as Senior editor and two reviewers, one of whom is a member of our Board of Reviewing Editors. The manuscript has been improved but, as requested, we are granting you an extra month for the additional data analysis.

https://doi.org/10.7554/eLife.10769.027

Author response

Summary:

The manuscript by Reznik et al. reports a comprehensive informatic analysis of mitochondrial DNA content in ~13,000 tumor samples from the TCGA effort. This study is potentially of high significance as it addresses systematically mtDNA copy number changes in various solid tumors for the first time. Other studies have done this before, but usually only on one tumor type and often with not enough samples to draw strong conclusions. This remarkable study uncovers the broad depletion of mtDNA in tumor vs normal samples across many, but not all tumor types. This study goes further to try and relate mtDNA copy number changes to gene expression and mutational changes, which is lauded and interesting. Informatic analyses also reveal that mtDNA content correlates with specific changes such as IDH1 mutation. Further, mtDNA content correlates with clinical outcome for certain tumor types. If the mtDNA copy number changes found are indeed valid, the study does break interesting new ground and moves the field forward.

Thank you. A small comment: we analyze mtDNA copy number in ~13000 tissue samples, of which ~6000 are tumors. The remaining samples are blood or adjacent normal tissue.

Essential revisions:

1) There are hundreds of insertions of mtDNA sequences in the nuclear genome (so called NUMTs), some are even full-length mtDNA insertions. The authors do not mention these or acknowledge that they would be incorrectly assigned as actual mtDNA reads in the whole-genome sequencing method of mtDNA copy number. This may lead to a false positive results regarding mtDNA copy number in the tumors and need to be subtracted from the analysis.

We agree with the reviewer. Our computational pipeline includes a number of features that subtract the contribution of nuclear insertions of mtDNA to calculations of mtDNA copy number. These filters are based on those implemented by others in prior work (e.g.Ding 2015), although we have actually increased their stringency in this work (detailed below). We failed to describe these “hidden features” of our approach in the initial manuscript. We now include a discussion of these filters in the Methods section:

The pipeline described above includes a number of controls to ensure that mtDNA copy number estimates are not influenced by nuclear integrations of mitochondrial sequences (NUMTs) (Hazkani-Covo 2010). A direct result of restricting analysis to properly paired reads is that reads whose mate mapped to a different chromosome are removed prior to copy number calculation. Furthermore, by requiring a conservative Phred-scaled minimal mapping quality of 30 (equivalent to a 99.9% likelihood that reads are aligned to the correct genomic location), reads with homology to nuclear-encoded NUMTs are removed prior to copy number calculations. Prior work has established that more lenient mapping quality thresholds of 20 are sufficient for accurately calling mtDNA copy number (Ding 2015).”

The filter most relevant to the reviewers’ comment is that which requires reads to be properly paired, which ensures that the mate of a read is located on the same chromosome as the read, and within a reasonable distance. We used the samtools flagstat command to confirm that the mates of MT reads were also aligning to the MT genome.

Interestingly, in 7% of samples, we observed an extremely small proportion of MT reads (never more than 0.04%, or 4 out of every 10000, MT reads, and thus not affecting mtDNA copy number calculations) that, despite our filters, whose mates mapped to a different chromosome as reported by BWA. Upon further investigation, all such reads appeared to map to the terminal end of the Y chromosome. It appears that the flagging of such reads as properly paired despite mapping to different chromosomes may result from concatenation of the Y and MT chromosome during the indexing of the reference sequence by BWA. These artifacts affected our results to a completely negligible extent. However, we thought it useful to include this finding in the response document, as a guide for others who may encounter the artifact in future studies.

2) The ASCAT algorithm seems quite good at factoring out ploidy differences, but less robust with regard to the purity issue. Thus, there remains concern that the differences in mtDNA copy number observed somehow reflects the different cell composition of the tumor versus control normal tissue. For example, immune and stromal cells might have lower basal mtDNA copy number than resident tissue. The tumor samples assayed almost certainly have a high percentage of these cell types compared to normal tissue and hence the trend toward depletion in many of the tumor samples may simply reflect this. Can the authors provide some histology to bolster their conclusions? It is too much to do this for all the samples, but one or two representative cases to support their conclusions seem necessary.

The reviewers raise an intriguing point that occurred to us in the course of our investigation. We agree that when calculating tumor mtDNA copy number, we are in fact reporting the average aggregate copy number of the tumor sample, including contributions from stromal and immune cells. With this in mind, we felt that the reviewers’ remark rested on two separate questions:

First, can we provide orthogonal evidence that some tumor types are depleted of mtDNA? Second, more broadly, is mtDNA content associated with infiltration of the tumor by stroma and immune cells?

To address the first question, we completed immunohistochemistry on three clear-cell renal cell carcinoma (KIRC) tumor/normal pairs and three papillary renal cell carcinoma (KIRP) tumor/normal pairs using an antibody for an mtDNA-encoded protein (MTCO2, a component of Complex IV). In both cases, we ensured that adjacent-normal tissue was visible in the field of view.

We remind the reviewers that KIRC was found to be the most strongly mtDNA-depleted tumor type in our analysis, while KIRP was observed to be depleted but to a much lesser extent. As shown in Figure 3—figure supplement 2 and Figure 3—source data 1, we found that all 3 clear-cell tumor showed very poor staining for MTCO2, compared to adjacent normal tissue. In contrast, 2/3 papillary tumors also showed less staining for MTCO2 than adjacent normal tissue, but the magnitude of this effect was much smaller/more subtle than for the clear-cell samples. These findings are in line with those reported in Figure 3.

To address the second question, we obtained estimates of stromal and immune cell infiltration for 8 cancer types calculated using the ESTIMATE algorithm published in 2013. We calculated the correlation between each of [stromal score, immune score], and [tumor mtDNA content, log2 ratio of tumor:normal mtDNA content]. These results are reported in Figure 3—figure supplements 35, Figure 3—source data 1, Figure 5—source data 1, and Figure 7—source data 1. In most cases, we found no correlation between mtDNA content and these “purity parameters.” However, there were notable cases where we did observe a weak correlation: for example, bladder, breast, and endometrial tumors exhibit a negative correlation between immune cell infilitration and tumor mtDNA content. Importantly, even in these cases, the correlation between mtDNA content and stromal/immune infilitration is insufficient to explain the broad pattern of mtDNA depletion that we observe. They do, however, raise the interesting possibility that changes in mtDNA content may synchronously drive (or be driven by) changes in immune/stromal content (as evidenced by the relationship between interferon signaling and mtDNA depletion, described in Comment 4 below). We report these findings in the subsection “Gross Changes in mtDNA Content are Evident in Many Cancers”.

3) TFAM is an abundant mtDNA-binding protein and transcription/replication factor that often correlates with mtDNA copy number. Can the authors address whether mutations, gene expression changes, or amounts of TFAM correlate with the mtDNA copy number changes they report?

The reviewers raise an intriguing question, as TFAM is critical both to mtDNA transcription and packaging into nucleoids. As of October 8, 2015, cbioportal.org reported only a handful of TFAM missense/nonsense mutations across all studies in the TCGA. Rather than try to tease apart the contribution of these rare mutations to mtDNA copy number, we instead focused on the association of changes in TFAM gene expression and mtDNA copy number.

After separating cancer studies by sequencing center (e.g. Broad Institute, Baylor) and tumor/normal tissue identity, we computed the correlation between TFAM gene expression and mtDNA copy number for all cases where we had at least 30 samples. The results are reported in Table 5, with additional information on the number of samples available and the Log10 range of mtDNA copy number (i.e. a value of 1 indicates the difference between the maximal and minimal mtDNA copy number was a factor of 10). Of the 29 cases tested, 13 of them (~45%) showed statistically significant positive correlation between mtDNA copy number and TFAM expression (Spearman rho p-value <0.05).

Interestingly, we found that the studies that did not exhibit a positive correlation between mtDNA content and TFAM expression tended to have very little variation in mtDNA content (i.e. the Log10 range was small). In fact, there was a strong correlation between the significance of the correlation (-log10(P-value)) and the range of mtDNA content (see Author response image 1, Spearman rho p-value 0.0007). Based on this observation, we speculate that the poor correlation between mtDNA content and TFAM expression may, in some cases, arise because of biological effects (e.g.small variation in mtDNA content may not translate to comparable changes in TFAM expression) or technical issues (e.g.limits to the detection sensitivity of our mtDNA copy number calculations).

4) Recently mtDNA stress (depletion and TFAM reductions) has been shown to activate innate immune pathways (e.g. interferon stimulated genes) (West et al. Nature 2015). Is there a gene expression profile indicative of this in the tumors that show mtDNA depletion?

In our initial submission, we briefly mentioned that several of the pathways most negatively correlated with mtDNA content were immune pathways. We revisited and expand on this finding, and now report the results in Figure 5—figure supplement 1. Of the top 10 most frequently negatively correlated pathways, two were associated with interferon signaling. Among the 7 tumor types observed to be depleted of mtDNA relative to normal tissue and with available RNA-Seq data (no RNA-Seq for stomach adenocarcinomas), 6/7 exhibited statistically significant anti-correlation between interferon signaling and tumor mtDNA copy number. In other words, in these tumor types, the interferon signaling pathway was most highly expressed in samples with the lowest mtDNA copy number.

Upon investigating further, we found that the reviewers’ comment had exposed a peculiar and quite interesting pattern in the results of our gene set analysis. For 4 tumor types with depletion of mtDNA and sufficiently numerous adjacent-normal samples (BRCA, HNSC, KIRC, LIHC), the negative correlation between interferon signaling expression and mtDNA content was not present in expression data for normal tissue. In other words, in these tissues, mtDNA content was anti-correlated with interferon signaling in tumor samples, but not in normal samples. This finding is now reported in the subsection “mtDNA Copy Number is Correlated to the Expression of Mitochondrial Metabolic Genes”.

[Editors' note: further revisions were requested prior to acceptance, as described below.]

The manuscript has been improved but, as requested, we are granting you an extra month for the additional data analysis.

Thank you for the opportunity to revise our manuscript. Following submission of our revised manuscript in November 2015, we became aware of a potential batch effect in our mtDNA copy number calculations using whole exome sequencing data. We found that whole exome sequencing batches (defined by the sequencing center and TCGA plate ID) displayed significant variation in the efficiency of targeted exonic enrichment (and, therefore, removal of mtDNA reads). While we had partially controlled for this effect in our initial submission by separating analysis by sequencing center, inspection of the data indicated that controlling for plate ID was also necessary. Importantly, this batch effect was not evident for whole-genome sequencing data, supporting the hypothesis that it arose because of targeted enrichment of exonic regions.

We have completed several revised analyses after controlling for the batch effect, which are detailed in the attached manuscript. In summary, our results remain largely unchanged, and in fact improve in several respects.Changes to prior results are detailed below.

We have also included one additional analysisin the manuscript, which is detailed in Comment 7below. There, we complete additional immunohistochemical staining of a third cancer type (bladder) to examine the effects of mtDNA depletion on respiratory protein expression. Unlike the other mtDNA-depleted cancer types which we have stained (kidney clear-cell and kidney papillary), we found that bladder tumors did not show depletion of MT-CO2 (a mtDNA-encoded protein), suggesting that bladder tumors may compensate for mtDNA depletion to sustain mitochondrial transcription.

1) We implemented a batch correction using a linear model. By jointly modeling the contribution of batch to mtDNA copy number in blood and tissue (i.e.tumor/adjacent-normal), we were able to normalize mtDNA copy number estimates across all batches. A detailed description of the batch effect and our methodology to correct for it are in the subsection “Correction for Sequencing Center and Plate ID”.

2) To facilitate use of our results by others, and to promote transparency, we have augmented our primary data table (Supplementary file 1)to include the original (un-corrected) mtDNA copy number estimates, batch IDs, and the newly batch-corrected mtDNA copy number estimates.

3) We repeated the association analysis comparing mtDNA copy number estimates from whole-genome and whole-exome sequencing, controlling for batch. Our results are now significantly improved, and no longer need to be separated by sequencing center. To simplify the presentation, we have removed the table reporting the p-values of the correlations associated with this analysis, and now report them directly in the caption of the figure. All correlations between WXS and WGS are highly significant, and generally improve compared to the prior submission.A comparison of the two results (pre-batch correction vs. post-batch correction) is below. The effect of the correction should be evident when comparing the same study across the two figures (e.g.BLCA, top left corner of both). Please see Figure 2—figure supplement 1 too.

Author response image 2
Comparison of mtDNA copy number estimates from samples profiled using both WGS and WXS.

Results above are those prior to batch-correction of mtDNA copy number.

https://doi.org/10.7554/eLife.10769.026

4) We repeated the analysis comparing mtDNA copy number in tumor and adjacent-normal tissues, restricting ourselves to pairs of samples processed in the same batch. We found that ~90% of all tumor-normal pairs were processed in the same batch, and our results are largely unchanged, except that the depletion effect in one cancer type (stomach adenocarcinomas) now loses statistical significance upon multiple hypothesis correction.

5) We repeated analysis associating mtDNA copy number with gene expression data. Our results are largely unchanged, with identical “major findings.”Using batch-corrected mtDNA copy number estimates, more cancer types now exhibit correlation between respiratory gene expression and mtDNA copy number. In terms of expression data, mitochondrial respiratory genes remain the most recurrently correlated gene set with mtDNA copy number. Interestingly, some tumor types which did not show correlation with respiratory genes, now do (e.g.HNSC tumors). Of note, while the anti-correlation between expression of 2 different interferon-related genesets (α/β interferon and total interferon) and mtDNA copy number is still retained, it is no longer in the top-10 most anti-correlated gene sets and does not appear in Figure 5—figure supplement 1. However, the 2 gene sets are still among the most anti-correlated, ranking 27th and 40th respectively, out of 674 total gene sets (Supplementary file 2). Notably, other immune-related pathways remain in the top 10 in Figure 5—figure supplement 1.

We also repeated the analysis comparing mtDNA copy number estimates with ESTIMATE scores in the subsection “Gross Changes in mtDNA Content are Evident in Many Cancers”, Figure 3—figure supplements 25, and Figure 5—source data 1. We observed a weak negative correlation between immune scores in ESTIMATE and mtDNA copy number across several cancer types, which echoed our results from the gene expression analysis described above.

6) We repeated the analysis associating mtDNA content with somatic alterations. The results are qualitatively unchanged, although in some cases p-values did become less significant(i.e. the p-value for the association between IDH1 mutations in gliomas and mtDNA content is now larger/less significant than in prior versions). NF1 mutations are no longer statistically significantly associated with lower mtDNA content because of multiple hypothesis correction (un-corrected p-value 0.006), but we still feature the alteration in Figure 6.

7) A major new analysis in our manuscript is additional immunohistochemical staining of three high-grade muscle-invasive urothelial (bladder) cancers. This addition was not prompted by reviewers’ comments, but rather by our own curiosity and desire to understand a strange, counterintuitive phenomenon in our results. We noted that (1) bladder cancers are depleted of mtDNA content relative to normal tissue both in WXS and WGS (Figure 3), but (2) using gene expression data, levels of mtDNA in bladder tumors are not correlated to expression of respiratory genes (Figure 5). This left us asking whether the depletion in mtDNA in bladder tumors actually translated to a depletion in respiratory protein levels.

As we explain in detail in the subsection “Immunohistochemical Investigation of Respiratory Protein Content” and newly added Figure 7,and discuss in the Discussion, we used immunohistochemistry to stain 3 bladder tumors and adjacent-normal urothelium for MT-CO2. Two of the three cases showed stronger staining in tumor than in normal, although the magnitude of the effect was relatively small compared to the difference in staining in KIRC. These results suggest that there may be a compensatory effect in bladder cancers, such that respiratory capacity is sustained despite reduced mtDNA copy number. As we describe in the newly added text in third paragraph of the Discussion, a number of other groups have reported similar compensation in other instances mtDNA depletion (in vitro, and in vivo).

We feel this new data should be reported in the manuscript. It emphasizes, as one of the reviewers pointed out in an earlier reviewers’ comment, that our observation of mtDNA depletion across many solid tumor types should not be misconstrued for a widespread “Warburg-effect.” It highlights the nuanced regulation of mitochondrial transcription, and it opens several avenues for further research as to what the causes and consequences of mtDNA depletion are.

[Editors' note: During proofing of their manuscript, the authors noticed that they had incorrectly described an antibody using for staining in Figure 7 and asked for this to be corrected. The authors stained with mitochondrial antibody MTC02, which recognizes a 60 kDa mitochondrial protein. In the manuscript and in the response to reviewers, the authors describe this antibody as MT-CO2, and write that it recognizes a mitochondrial DNA-encoded protein. The manuscript has been corrected accordingly.]

https://doi.org/10.7554/eLife.10769.028

Article and author information

Author details

  1. Ed Reznik

    Computational Biology Program, Memorial Sloan Kettering Cancer Center, New York, United States
    Contribution
    ER, Conception and design, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article
    For correspondence
    reznike@mskcc.org
    Competing interests
    The authors declare that no competing interests exist.
    ORCID icon 0000-0002-6511-5947
  2. Martin L Miller

    Cancer Research UK, Cambridge Institute, Cambridge, United Kingdom
    Contribution
    MLM, Analysis and interpretation of data, Drafting or revising the article
    Competing interests
    The authors declare that no competing interests exist.
  3. Yasin Şenbabaoğlu

    Computational Biology Program, Memorial Sloan Kettering Cancer Center, New York, United States
    Contribution
    YŞ, Analysis and interpretation of data, Drafting or revising the article
    Competing interests
    The authors declare that no competing interests exist.
    ORCID icon 0000-0003-0958-958X
  4. Nadeem Riaz

    Department of Radiation Oncology, Memorial Sloan Kettering Cancer Center, New York, United States
    Contribution
    NR, Analysis and interpretation of data, Drafting or revising the article
    Competing interests
    The authors declare that no competing interests exist.
  5. Judy Sarungbam

    Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, United States
    Contribution
    JS, Analysis and interpretation of data, Drafting or revising the article
    Competing interests
    The authors declare that no competing interests exist.
  6. Satish K Tickoo

    Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, United States
    Contribution
    SKT, Analysis and interpretation of data, Drafting or revising the article
    Competing interests
    The authors declare that no competing interests exist.
  7. Hikmat A Al-Ahmadie

    Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, United States
    Contribution
    HAA-A, Analysis and interpretation of data, Drafting or revising the article
    Competing interests
    The authors declare that no competing interests exist.
  8. William Lee

    1. Computational Biology Program, Memorial Sloan Kettering Cancer Center, New York, United States
    2. Department of Radiation Oncology, Memorial Sloan Kettering Cancer Center, New York, United States
    Contribution
    WL, Analysis and interpretation of data, Drafting or revising the article
    Competing interests
    The authors declare that no competing interests exist.
  9. Venkatraman E Seshan

    Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, United States
    Contribution
    VES, Analysis and interpretation of data, Drafting or revising the article
    Competing interests
    The authors declare that no competing interests exist.
  10. A Ari Hakimi

    1. Computational Biology Program, Memorial Sloan Kettering Cancer Center, New York, United States
    2. Urology Service, Department of Surgery, Memorial Sloan Kettering Cancer Center, New York, United States
    Contribution
    AAH, Analysis and interpretation of data, Drafting or revising the article
    Competing interests
    The authors declare that no competing interests exist.
  11. Chris Sander

    Computational Biology Program, Memorial Sloan Kettering Cancer Center, New York, United States
    Contribution
    CS, Analysis and interpretation of data, Drafting or revising the article
    Competing interests
    The authors declare that no competing interests exist.

Funding

National Institutes of Health (5U24 CA143840-05 (Sander))

  • Eduard Reznik
  • Yasin Şenbabaoğlu
  • Chris Sander

National Institutes of Health (P30 CA008748)

  • Ed Reznik
  • Martin L Miller
  • Yasin Şenbabaoğlu
  • Nadeem Riaz
  • Judy Sarungbam
  • Satish K Tickoo
  • William Lee
  • Venkatraman E Seshan
  • A Ari Hakimi
  • Chris Sander
  • Hikmat A Al-Ahmadie

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank Deborah S Marks, Nick Gauthier, Arman Aksoy, Nils Weinhold, and Alessandro Pastore for thoughtful discussions and feedback.

Reviewing Editor

  1. Chi Van Dang, Reviewing Editor, University of Pennsylvania, United States

Publication history

  1. Received: August 10, 2015
  2. Accepted: January 8, 2016
  3. Version of Record published: February 22, 2016 (version 1)
  4. Version of Record updated: June 9, 2016 (version 2)

Copyright

© 2016, Reznik et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 6,348
    Page views
  • 1,440
    Downloads
  • 31
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.

Comments

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Download citations (links to download the citations from this article in formats compatible with various reference manager tools)

Open citations (links to open the citations from this article in various online reference manager services)

Further reading

    1. Computational and Systems Biology
    2. Developmental Biology and Stem Cells
    Francis Corson, Eric D Siggia
    Research Article