
Origins and functional consequences of somatic mitochondrial DNA mutations in human cancer

  1. Young Seok Ju
  2. Ludmil B Alexandrov
  3. Moritz Gerstung
  4. Inigo Martincorena
  5. Serena Nik-Zainal
  6. Manasa Ramakrishna
  7. Helen R Davies
  8. Elli Papaemmanuil
  9. Gunes Gundem
  10. Adam Shlien
  11. Niccolo Bolli
  12. Sam Behjati
  13. Patrick S Tarpey
  14. Jyoti Nangalia
  15. Charles E Massie
  16. Adam P Butler
  17. Jon W Teague
  18. George S Vassiliou
  19. Anthony R Green
  20. Ming-Qing Du
  21. Ashwin Unnikrishnan
  22. John E Pimanda
  23. Bin Tean Teh
  24. Nikhil Munshi
  25. Mel Greaves
  26. Paresh Vyas
  27. Adel K El-Naggar
  28. Tom Santarius
  29. V Peter Collins
  30. Richard Grundy
  31. Jack A Taylor
  32. D Neil Hayes
  33. David Malkin
  34. ICGC Breast Cancer Group
  35. ICGC Chronic Myeloid Disorders Group
  36. ICGC Prostate Cancer Group
  37. Christopher S Foster
  38. Anne Y Warren
  39. Hayley C Whitaker
  40. Daniel Brewer
  41. Rosalind Eeles
  42. Colin Cooper
  43. David Neal
  44. Tapio Visakorpi
  45. William B Isaacs
  46. G Steven Bova
  47. Adrienne M Flanagan
  48. P Andrew Futreal
  49. Andy G Lynch
  50. Patrick F Chinnery
  51. Ultan McDermott
  52. Michael R Stratton
  53. Peter J Campbell (corresponding author)
  1. Wellcome Trust Sanger Institute, United Kingdom
  2. Cambridge University Hospitals NHS Foundation Trust, United Kingdom
  3. University of Cambridge, United Kingdom
  4. University of New South Wales, Australia
  5. National Cancer Centre, Singapore
  6. Duke-NUS Graduate Medical School, Singapore
  7. Dana-Farber Cancer Institute, United States
  8. Institute of Cancer Research, Sutton, United Kingdom
  9. University of Oxford, United Kingdom
  10. MD Anderson Cancer Center, United States
  11. University of Nottingham, United Kingdom
  12. National Institute of Health, United States
  13. University of North Carolina, United States
  14. University of Toronto, Canada
  15. University of Liverpool, United Kingdom
  16. HCA Pathology Laboratories, United Kingdom
  17. University of East Anglia, United Kingdom
  18. University of Tampere and Tampere University Hospital, Finland
  19. Johns Hopkins University, United States
  20. Royal National Orthopaedic Hospital, United Kingdom
  21. University College London, United Kingdom
  22. The University of Texas, MD Anderson Cancer Center, Houston, United States
  23. Newcastle University, United Kingdom
Research Article
Cite this article as: eLife 2014;3:e02935 doi: 10.7554/eLife.02935
7 figures, 1 table and 6 additional files


Figure 1 with 5 supplements
Mitochondrial somatic substitutions identified from 1675 Tumor–Normal pairs.

mtDNA genes and intergenic regions are shown. Each gene is assigned to the mtDNA strand whose sequence is equivalent to the transcribed RNA. Substitution categories (silent, non-silent (missense and nonsense), non-coding (tRNA and rRNA), and intergenic) are indicated by the shape of each marker, and the six substitution classes are color-coded. Substitutions on the H and L strands (when six substitution classes are considered) are shown outside and inside the mtDNA genes, respectively. The vertical axes for H and L strand substitutions represent the VAF of each variant.

Figure 1—figure supplement 1
Correlation in amount of mtDNA reads between whole-genome and whole-exome sequencing.

139 DNA samples, from either tumors or blood, that had been sequenced by whole-genome sequencing were additionally sequenced by whole-exome sequencing. We compared the number of mtDNA reads between whole-genome and whole-exome sequencing and found a strong positive correlation. CGP, Cancer Genome Project, Wellcome Trust Sanger Institute; WUGSC, Washington University Genome Sequencing Center.

Figure 1—figure supplement 2
Correlation of heteroplasmy levels between whole-genome and whole-exome sequencing.

To validate the sensitivity and specificity of variant calling in this study, 19 tumor and normal pairs (which were originally whole-genome sequenced) were whole-exome sequenced, and mtDNA variants were assessed independently. We correlated the heteroplasmy levels of the 20 mutations detected by both methods.

Figure 1—figure supplement 3
Validation of mtDNA somatic substitutions.

Figure 1—figure supplement 4
Amount of off-target mtDNA reads across four sequencing centers.

CGP, Cancer Genome Project, Wellcome Trust Sanger Institute (n = 855); WUGSC, Washington University Genome Sequencing Center (n = 140); BCM, Baylor College of Medicine (n = 85); BI, Broad Institute (n = 435).

Figure 1—figure supplement 5
Filtering of samples with potential DNA contamination.

(A) A histogram presenting potential sample swaps in tumor–sample pairs. (B) A histogram presenting potential minor DNA cross-contamination in tumor samples. Cross-contamination levels were taken into account when filtering substitutions (see the “Minor cross-contamination of DNA samples” section in Materials and Methods). (C) Histogram showing the number of somatic substitutions overlapping with known inherited polymorphisms. (D) Histogram showing the number of back mutations.

Figure 2 with 1 supplement
mtDNA somatic substitutions of human cancer.

(A) Number of somatic substitutions per tumor sample. (B) Average number of somatic substitutions per sample across 31 tumor types. (C) Age at diagnosis and number of mtDNA somatic substitutions in breast cancers.

Figure 2—figure supplement 1
VAFs of phased somatic mtDNA substitutions.

This figure presents VAF pairs for co-clonal, sub-clonal, and different strand mtDNA substitutions. We expect similar VAFs for co-clonal pairs, a lower VAF for sub-clonal mutations than for clonal ones, and a sum of at most 1.0 for a different strand VAF pair.
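
The three expectations in this caption can be written as simple checks. The following is a minimal sketch and not the authors' code: the function name `vaf_pair_is_consistent`, the relation labels, and the tolerance `tol` are hypothetical choices made for illustration.

```python
def vaf_pair_is_consistent(vaf_a, vaf_b, relation, tol=0.05):
    """Check a pair of VAFs against the expectation for its phasing relation.

    relation is one of these (hypothetical) labels:
      "co-clonal"        -> the two VAFs should be similar
      "sub-clonal"       -> vaf_b (sub-clonal) should not exceed vaf_a (clonal)
      "different-strand" -> the two VAFs should sum to at most 1.0
    tol is an assumed allowance for sampling noise, not a value from the paper.
    """
    if relation == "co-clonal":
        return abs(vaf_a - vaf_b) <= tol
    if relation == "sub-clonal":
        return vaf_b <= vaf_a + tol
    if relation == "different-strand":
        return vaf_a + vaf_b <= 1.0 + tol
    raise ValueError(f"unknown relation: {relation}")
```

A pair that fails its check (for example, two different strand substitutions whose VAFs sum to well above 1.0) would point to a phasing or calling problem rather than a biological signal.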

Figure 3 with 1 supplement
Replicative strand bias for mtDNA somatic substitutions.

(A) Replicative strand-specific substitution rate (number observed/number expected) by 96 trinucleotide contexts. Substitutions in a specific mtDNA segment (from Ori-b to OH) are not included, because they present a different substitutional signature. (B) Mutational signature across tumor types. The eighteen tumor types with at least 25 mtDNA mutations are shown. (C) Inverted substitution signature in the Ori-b–OH segment.
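
For readers unfamiliar with the 96 trinucleotide contexts, the sketch below enumerates them (six substitution classes, each with four possible 5' and four possible 3' neighbors) and computes an observed/expected ratio per context. It is illustrative only: the label format and both function names are assumptions, and how the expected counts were derived is not stated in this caption, so they are passed in as plain input.

```python
from itertools import product

BASES = "ACGT"
# Six substitution classes, written from the pyrimidine of the mutated base pair.
CLASSES = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]


def trinucleotide_contexts():
    """Return the 96 context labels, e.g. 'A[C>T]G' (5' base, substitution, 3' base)."""
    return [
        f"{five}[{sub}]{three}"
        for sub in CLASSES
        for five, three in product(BASES, BASES)
    ]


def observed_over_expected(observed, expected):
    """Return observed/expected per context; contexts with zero expectation are skipped."""
    return {
        ctx: observed.get(ctx, 0) / exp
        for ctx, exp in expected.items()
        if exp > 0
    }
```

For a strand-specific analysis, the same ratio would be computed separately from the counts assigned to each replicative strand.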

Figure 3—figure supplement 1
Replicative strand bias observed in mtDNA substitutions.

(A) Mutational signature of mtDNA somatic substitutions on the 12 L strand genes by replicative strand (L/H strand). It agrees very well with the background mutational signature (Chi-square p = 0.99999). (B) Mutational signature of mtDNA somatic substitutions on the H strand gene (MT-ND6) by replicative strand. It is very close to the expected background signature (Chi-square p = 0.027). When the signature is considered by transcriptional strand instead, it differs clearly from the background (Chi-square p = 1 × 10⁻²¹). These results suggest that the strand bias is not transcription-coupled but replication-coupled. (C) Mutational spectrum of mtDNA somatic substitutions on the 22 tRNA genes by replicative strand. Again, it agrees very well with the background mutational signature (Chi-square p = 0.71). (D) Mutational spectrum of mtDNA somatic substitutions on the 22 tRNA genes by non-transcribed (coding) and transcribed (non-coding) strand. The strand bias is greatly reduced because somatic substitutions on the 14 L strand and 8 H strand tRNAs cancel out each other's strand bias (CH > TH and TL > CL). As a result, the signature of tRNA mutations by transcriptional strand is significantly different from the background signature (Chi-square p = 3.3 × 10⁻¹²). Taken together, we conclude that the strand bias is not transcription-coupled but replication-coupled.
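
The comparisons in this caption test whether the substitution counts in a subset of genes follow the background signature. A minimal sketch of such a goodness-of-fit statistic is given below. It assumes that the background is supplied as proportions per substitution class, and the function name is hypothetical; the exact test configuration the authors used is not given in this caption.

```python
def chi_square_statistic(observed_counts, background_proportions):
    """Pearson chi-square statistic of observed counts against background proportions.

    Both arguments are sequences in the same category order (for example the
    strand-specific substitution classes). Categories with zero expectation
    are skipped.
    """
    if len(observed_counts) != len(background_proportions):
        raise ValueError("inputs must have the same number of categories")
    total = sum(observed_counts)
    statistic = 0.0
    for observed, proportion in zip(observed_counts, background_proportions):
        expected = total * proportion
        if expected > 0:
            statistic += (observed - expected) ** 2 / expected
    return statistic
```

A p-value is then read from the chi-square distribution with (number of categories − 1) degrees of freedom. A small statistic, and therefore a large p-value, means the subset is indistinguishable from the background.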

Figure 4 with 2 supplements
Mutational signature similar to processes shaping human mtDNA sequence over evolutionary time.

(A) Triplet codon depletion in human mtDNA by equivalent (CH > TH and TL > CL) mutational pressure. The relative frequency of each triplet codon within synonymous pairs (NNT–NNC or NNA–NNG) is shown by color. The arrows beside the box highlight the T > C (red) and G > A (blue) substitutional pressures on the L strand in germline mtDNA. (B) Correlation of triplet codon frequencies between observed human mtDNA and simulated evolution of a random mtDNA sequence under the mtDNA somatic mutational signature, with mitochondrial protein sequences constrained.

Figure 4—figure supplement 1
TC and GA skew for L strand mtDNA genes across 8 animal species.

C. elegans (a nematode) and D. melanogaster (fruit fly) mtDNA appear to have GL << AL (due to CH > TH mutational pressure) and CL >> TL (due to CL > TL mutational pressure) in the third base of triplet codons in L strand genes. They therefore seem to have a predominant C > T mutational pressure without strand bias. D. rerio (zebrafish), X. laevis (frog), and M. musculus (mouse) present GL << AL (due to CH > TH mutational pressure), but similar numbers of CL and TL. The mtDNA of these species is therefore thought to have CH > TH pressure with strand bias; whether TL > CL pressure exists is not clear. Finally, the mtDNA of H. sapiens, P. troglodytes (chimpanzee), and G. domesticus (chicken) shows clear CH > TH and TL > CL, as mentioned in the main manuscript. Interestingly, TL > CL seems to be slightly stronger in chicken mitochondria than in human (or chimpanzee) mitochondria. We suggest that the mechanism of mtDNA replication differs to some extent across the evolutionary tree.

Figure 4—figure supplement 2
Correlation of triplet codon frequencies between observed mtDNA and evolution simulated under the mtDNA somatic mutational signature.

Figure 5 with 3 supplements
Selection and mutational process for mtDNA somatic substitutions.

(A) Truncating mutations (nonsense substitutions and frame-shifting (FS) coding indels) have significantly lower VAFs. (B) Change in VAF of mtDNA somatic mutations between primary and metastatic (or late) cancer tissues. (C) Mutational signature of mtDNA across various tumor types. Neither the three highlighted mechanisms nor the nuclear DNA double-strand break repair mechanism (BRCA) matches the mtDNA mutational signature. * Only substitutions in protein-coding genes are considered. (D) A proposed model of the mtDNA mutational process.

Figure 5—figure supplement 1
Number of recurrent substitutions between silent and missense substitutions.

100 sites were randomly selected from silent substitutions (at third base of triplet codon) and missense substitutions (at first and second base of triplet codon). No significant difference was observed among these three groups.

Figure 5—figure supplement 2
Comparison of VAF of protein-truncating mutations (nonsense substitution and indels) across tumor types.

Four tumor types with more than 10 protein-truncating mutations are shown. Fisher's exact tests were applied between breast and each of the other tissue types.
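
As an illustration of the test named in this caption, the sketch below computes a two-sided Fisher's exact p-value for a 2×2 table using only the standard library. How the authors built their tables is not stated here. One plausible layout, assumed purely for illustration, is tissue type (breast vs other) by number of truncating mutations above and below a VAF threshold.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denominator = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum over all tables that are no more probable than the observed one;
    # the small factor guards against floating-point ties.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
```

For example, `fisher_exact_two_sided(3, 1, 1, 3)` sums the probabilities of the four tables at least as extreme as the observed one and returns 34/70.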

Figure 5—figure supplement 3
Negligible impact of external mutagens (UV and tobacco smoking) on somatic mtDNA mutations.

No evidence of UV or tobacco smoking signatures was identified, even in melanomas and lung cancers, respectively. (Left) We compared the proportion of C > T (and G > A) substitutions in the CpC (GpG) context (the mutational signature of UV [Alexandrov et al., 2013]) between melanomas and breast cancers (controls). Because UV has a trivial impact on the nuclear DNA somatic mutations of breast cancers (Alexandrov et al., 2013), the vast majority of mtDNA C > T substitutions in the CpC context in breast cancers were not generated by UV. (Right) We compared the proportion of C > A (G > T) substitutions between lung and breast (control) cancers. C > A (G > T) substitutions are predominantly generated by tobacco smoking. As with UV, the impact of tobacco smoking on the somatic mutations of breast cancers is trivial (Alexandrov et al., 2013).

Author response image 1

Somatic mutational signature of mitochondrial DNA

Author response image 2

dN/dS ratio with 600 somatic substitutions randomly generated


Table 1

Summary statistics of mtDNA sequence data

Cancer type              WGS   WXS   Average mt RD (WGS)   Average mt RD (WXS)   Total
Mesothelioma               0     6   -                     106.3                     6
Adenoid cystic ca.         1    60   714.7                 75.6                     61
Head & Neck               43     3   1369.1                18.8                     46
Multiple myeloma           0    69   -                     43.2                     69
Ewing sarcoma              0    27   -                     69.5                     27
Kaposi sarcoma             0     9   -                     181.0                     9
Total; 31 cancer types   704   971                                                1675
WGS, whole-genome sequencing; WXS, whole-exome sequencing; mt RD, mitochondrial read depth; MPD, myeloproliferative disease; MDS, myelodysplastic syndrome; ALL, acute lymphoblastic leukemia; CLL, chronic lymphocytic leukemia; AML, acute myeloid leukemia; AMKL, acute megakaryoblastic leukemia.

Additional files

Supplementary file 1

Sequencing information of 1675 tumor–normal pairs.

Supplementary file 2

Catalogs of somatic mutations (substitutions and indels) and inherited polymorphisms identified in this study.

Supplementary file 3

List of phased somatic substitutions.

Supplementary file 4

dN/dS for 13 protein-coding genes in mitochondria.

Supplementary file 5

List of somatic substitution with higher recurrent rate than expected.

Supplementary file 6

Data accession numbers.

