Explosive mutation accumulation triggered by heterozygous human Pol ε proofreading-deficiency is driven by suppression of mismatch repair
Abstract
Tumors defective for DNA polymerase (Pol) ε proofreading have the highest tumor mutation burden identified. A major unanswered question is whether loss of Pol ε proofreading by itself is sufficient to drive this mutagenesis, or whether additional factors are necessary. To address this, we used a combination of next generation sequencing and in vitro biochemistry on human cell lines engineered to have defects in Pol ε proofreading and mismatch repair. Absent mismatch repair, monoallelic Pol ε proofreading deficiency caused a rapid increase in a unique mutation signature, similar to that observed in tumors from patients with biallelic mismatch repair deficiency and heterozygous Pol ε mutations. Restoring mismatch repair was sufficient to suppress the explosive mutation accumulation. These results strongly suggest that concomitant suppression of mismatch repair, a hallmark of colorectal and other aggressive cancers, is a critical force for driving the explosive mutagenesis seen in tumors expressing exonuclease-deficient Pol ε.
https://doi.org/10.7554/eLife.32692.001
eLife digest
New cells are made when an existing cell divides in two. Each time a cell divides, it duplicates its DNA so that each new cell inherits a complete copy. Molecular machines called DNA polymerases make these DNA copies. The main DNA polymerases, known as delta and epsilon, can “proofread” the new DNA, which ensures that the genetic information stored in the DNA is correctly copied. Cells also use another system, called mismatch repair, to catch any errors that get missed by the polymerases.
Cancer cells contain many mutations in genes that regulate the growth and production of new cells, which is why cancers grow out of control and produce tumors. Research shows that many cancer cells with high numbers of mutations have lost their proofreading ability. Yet it is not clear if the loss of proofreading is enough to cause cancers, or if other systems, such as mismatch repair, must also be defective.
Hodel, de Borja, Henninger et al. examined human cells grown in the laboratory to understand the importance of proofreading in cancer. It turns out that even the partial loss of polymerase epsilon proofreading could lead to distinctive mutations. Yet, these mutations were repaired by mismatch repair, so they actually are only found in cells when mismatch repair is also defective. This result demonstrates that the lack of proofreading is not enough to cause a large number of mutations. These cancers only happen when other systems are damaged too.
These new findings add to the current understanding of the origins of mutations in cancers and how mutations accumulate over time. It should lead scientists to further investigate the patterns of mutations that happen in the absence of proofreading. It may also enhance our knowledge of proofreading-deficient cancers.
https://doi.org/10.7554/eLife.32692.002
Introduction
Human cancers share common features of genome instability and mutagenesis (Hanahan and Weinberg, 2011) that are the sources of the 10³ to 10⁶ somatic mutations observed in the genomes of most types of adult tumors (Stratton, 2011; Wheeler and Wang, 2013). The total mutation burden in a tumor is the result of multiple mutational pathways operating within the cells at varying rates over time. This can complicate attempts to assign the relative contributions of each pathway to the mutation spectrum of a tumor. One essential tool for understanding how mutations accumulate and influence tumor progression is the computational extraction of multiple individual signatures from many tumor genomes (Alexandrov et al., 2013a; Alexandrov and Stratton, 2014; Haradhvala et al., 2016). This is proving to be instrumental in resolving the relative extents to which pathways contribute to the ultimate mutation spectrum in a tumor (Nik-Zainal et al., 2016; Roberts et al., 2013). Comparing these tumor mutation signatures to those generated in experimental cell lines is another critical tool for understanding the relative rates and causality of mutation acquisition (Fox et al., 2016; Helleday et al., 2014). Traditionally, these measurements have relied on assays using reporter genes, which necessarily look at a tiny fraction of the genome and may miss global contributions to genome instability. Advances in next generation sequencing now allow for detailed genome-wide analyses of mutation accumulation over defined periods of cellular growth. Since each nucleotide in the genome is subject to the three major determinants of replication fidelity - nucleotide selection, proofreading and mismatch repair (MMR) - during every round of replication, tumors and cells with defects in replication fidelity are uniquely poised to address these issues.
Proofreading defects are now known to occur in a wide variety of tumors, with significant enrichment in colorectal and endometrial tumors (Cancer Genome Atlas Network, 2012; Kandoth et al., 2013; Heitzer and Tomlinson, 2014; Rayner et al., 2016). Mutations in DNA polymerase (Pol) ε cluster in the exonuclease proofreading domain and the tumors are clinically characterized by several criteria, including being ultrahypermutated, having a unique mutation spectrum, containing a heterozygous Pol ε mutation with no evidence of loss of heterozygosity (LOH) and being microsatellite stable (MSS) (Briggs and Tomlinson, 2013; Church et al., 2013; Palles et al., 2013; Zhao et al., 2013; Henninger and Pursell, 2014; Shinbrot et al., 2014; Shlien et al., 2015; Barbari and Shcherbakova, 2017). Whole genome and whole exome analyses of tumors have been the primary means to establish the ultrahypermutated (>100 mutations per megabase), unique mutational signature that distinguishes Pol ε tumors from other cancers (Alexandrov et al., 2013a; Alexandrov and Stratton, 2014; Shinbrot et al., 2014; Shlien et al., 2015; Alexandrov et al., 2013b; Campbell et al., 2017). While there is a rich history of studies on the effects of exonuclease defects on mutagenesis in model organisms, the extent to which Pol ε proofreading-deficiency by itself drives each of these criteria remains poorly understood.
It is clear from studies in model organisms that complete, biallelic inactivation of Pol ε proofreading activity causes mutagenesis and carcinogenesis, and mutation rates in these systems have been precisely measured using reporter genes. For example, mutation rates are increased in haploid or diploid yeast strains expressing only proofreading-deficient alleles of Pols ε (Morrison et al., 1991; Morrison and Sugino, 1994; Shcherbakova et al., 2003) or δ (Morrison et al., 1993; Simon et al., 1991; Herr et al., 2011a). These rates are further elevated when combined with defects in mismatch repair, indicating that these errors are made during replication (Morrison and Sugino, 1994; Tran et al., 1999; Tran et al., 1997; Kennedy et al., 2015). In mouse models, homozygous inactivation of both copies of either Pol ε or δ exonuclease activity (Pol εexo-/exo- or Pol δexo-/exo-) causes increased mutation rates and cancer (Albertson et al., 2009; Goldsby et al., 2002; Goldsby et al., 2001). Interestingly, their tumor spectra are different, with gastrointestinal tumors predominant in Pol εexo-/exo- mice while thymic lymphomas are the major tumor in Pol δexo-/exo- mice.
However, mice with a heterozygous inactivation of a single Pol ε proofreading allele (the monoallelic Pol εwt/exo- genotype) fail to develop tumors when mismatch repair is functional (Albertson et al., 2009). The equivalent diploid heterozygous Pol ε exonuclease mutant in yeast is also a mutator, but the effect is modest and partially dominant to the wild type allele and lacks the unique mutation spectrum seen in human tumors (Morrison and Sugino, 1994; Shcherbakova et al., 2003; Morrison et al., 1993; Kane and Shcherbakova, 2014). These results raise critical questions as to the source of the unique, ultrahypermutant phenotype in human tumors with heterozygous Pol ε exonuclease-deficiency.
Mismatch repair is responsible for the recognition and removal of replication errors and deficiencies in this activity cause genome instabilities that can lead to cancer (Kunkel and Erie, 2005; Li, 2008; Jiricny, 2013; Modrich, 2006). Mismatch repair is normally an extremely efficient process, correcting more than 99% of replication errors. However, genome-wide studies have recently shown that MMR efficiencies can vary by over two orders of magnitude and are influenced by a number of factors, including the strand on which the mismatch occurs, the polymerase that made the error, the nature of the mismatch, local sequence context, distance from the origin and replication timing (Hawk et al., 2005; Hombauer et al., 2011; Lujan et al., 2014; Lujan et al., 2012; Supek and Lehner, 2015). Patients with biallelic mismatch repair disorder (bMMRD) have biallelic germline inactivating mutations in a mismatch repair gene and are completely lacking mismatch repair and develop a number of early-onset tumors in which microsatellite instability (MSI) is readily detectable (Durno et al., 2017; Wimmer et al., 2014). A subset of these patients acquires a later somatic mutation in a single allele of Pol ε, leading to very aggressive tumor development. Mutation rates from these Pol εwt/exo- MMR−/− tumors have been estimated on the order of several hundred per genome duplication (Shlien et al., 2015). This is consistent with results from model systems as mice with the equivalent genotype (heterozygous Pol εwt/exo- combined with homozygous MMR−/−) develop tumors within 1–2 months (Treuting et al., 2010). The equivalent yeast strains are strong mutators as well (Shcherbakova et al., 2003; Morrison et al., 1993; Kennedy et al., 2015).
However, since sporadic POLE tumors are generally microsatellite stable, the role of MMR in Pol ε proofreading-deficiency in the development of these MSS tumors remains a critical unanswered question. Whether MMR and POLE defects together are required for ultramutation, elevated mutation rates or for establishing the unique mutation signature is unknown. Understanding how MMR function or dysfunction affects proofreading-dependent mutagenesis is essential to understanding the mechanisms of mutagenesis during cancer development.
In the current study, we constructed a human cell line model system to address the roles of Pol ε proofreading in driving the clinical characteristics that define Pol ε tumors. Critically, we used a targeted knock-in approach to inactivate one copy of Pol ε 3'−5' exonuclease activity, since human tumors contain heterozygous, monoallelic Pol ε mutations. Using mutation rates measured at a reporter gene in combination with whole-exome and whole-genome sequencing we found a rapid accumulation of large numbers of Pol ε-specific mutations in mismatch repair-deficient cells. This confirms results suggested by observations in Pol ε mutant bMMRD tumors. We further show that mismatch repair is able to suppress exonuclease-deficient Pol ε-induced mutation rates back to wild type levels using a combination of reporter gene and whole-exome sequencing (WES). These results support the idea that additional unique features beyond a single exonuclease active site inactivation are helping facilitate the massive mutation acquisition seen in microsatellite stable tumors containing mutant Pol ε.
Results
Inactivation of Pol ε proofreading causes a mutator phenotype in human cells
Tumors with mutations in the exonuclease domain of POLE are generally microsatellite stable and show no or low loss of heterozygosity, suggesting that inactivation of exonuclease activity in one allele is sufficient to drive mutagenesis and tumor development, though this has not been directly tested previously. To test whether inactivation of a single allele of Pol ε proofreading was sufficient to cause a mutator phenotype in human cells, we used recombinant adeno-associated virus (rAAV)-mediated gene targeting to engineer a diploid human cell line to express one allele of Pol ε with the D275A/E277A double substitution (Figure 1—figure supplements 1–2; Figure 1—source data 1). We chose the D275A/E277A mutation because it inactivates exonuclease proofreading in vitro (Shcherbakova et al., 2003; Korona et al., 2011). The parental cell line, HCT-116, is constitutively mismatch repair-deficient due to an inactivating mutation in Mlh1, thus allowing us to first define the contribution of proofreading deficiency to mutagenesis separately. We then measured the mutation rate at the hypoxanthine-guanine phosphoribosyltransferase (HPRT1) locus using 6-thioguanine (6-TG) resistance and a fluctuation assay. The measurements were repeated in clones derived from independent exonuclease-deficient (exo-) allele integration events. A moderate mutator effect was seen in Pol εwt/exo- heterozygotes (Figure 1A), indicating the exo- allele was partially dominant over the endogenous exo+ allele, similar to what is seen in a mismatch repair-deficient diploid cell line heterozygous for a Pol ε proofreading mutation, pol2-4/+ pms1/pms1 (Pavlov et al., 2004). Mutation rates were not measured in cells from the comparable heterozygous Pol εwt/exo- mice lacking mismatch repair (Albertson et al., 2009).
Figure 1—source data 1: https://doi.org/10.7554/eLife.32692.006
Figure 1—source data 2: https://doi.org/10.7554/eLife.32692.007
To begin measuring the effect of inactivating a single Pol ε exonuclease allele on mutation rates in cells, we sequenced the HPRT1 gene from twenty and twenty-five independently derived 6-TG-resistant (and thus HPRT1 mutant) clones from mismatch repair-deficient Pol εwt/wt and Pol εwt/exo- cells, respectively (Figure 1—source data 2). This allowed comparison to previously measured mutation rates from different groups using the same parental cell line. Mutation rates from the Pol εwt/wt cells were similar to the spontaneous mutation rates reported by three previous studies (Bhattacharyya et al., 1995; Glaab and Tindall, 1997; Ohzeki et al., 1997). These results suggest that the baseline rates of mutagenesis are an accurate measure of comparison for the Pol εwt/exo- cell lines.
The increase in mutation rate seen in the Pol εwt/exo- mismatch repair-deficient cells was primarily due to base pair substitutions (Figure 1B). Frameshift error rates did not change, in agreement with previous findings in vitro that Pol ε proofreading primarily corrects base-base mispairs with little effect on frameshift fidelity (Korona et al., 2011). However, the number of mutational events scored by this method is insufficient to make statistical claims regarding individual mutations, reinforcing the need for genome sequencing to examine mutations in all possible sequence contexts.
Using an in vitro lacZ reversion substrate that specifically measures TCT→TAT transversions (Shinbrot et al., 2014; Shlien et al., 2015), we found that the D275A/E277A mutant made these errors at a significantly higher rate than the wild type exonuclease-proficient Pol ε enzyme (Figure 1C). We used a construct comprising the N-terminal 140 kDa of Pol ε, which contains the DNA polymerase and exonuclease domains and has similar fidelity and catalytic activity to the complete four subunit holoenzyme (Aksenova et al., 2010; Ganai et al., 2015; Zahurancik et al., 2015). Importantly, the elevated TCT→TAT error rate we observed with the D275A/E277A mutant was not statistically different from those measured with the S459F and S461P Pol ε cancer mutants previously (Shinbrot et al., 2014; Shlien et al., 2015), suggesting a common mechanism of mutagenesis for these hotspot mutations.
Mutation rates calculated using reporter genes (μL) can be used to extrapolate to genome-wide per base pair mutation rates (μBS) (Drake, 1991; Lynch, 2010). The availability of high-throughput DNA sequencing now allows for empirical validation of these calculations in addition to providing insight into the influence of genomic context on mutagenesis. To address this we performed whole-genome sequencing (2.8 × 10⁹ bp at an average depth of 36.1x) on genomic DNA prepared from Pol εwt/exo- cells. Based on our measured mutation rate for HPRT1 (μL) in Pol εwt/exo- cells lacking mismatch repair (180 × 10⁻⁷), we calculated a μBS value of 0.23 × 10⁻⁷ mutations per base pair per genome duplication.
Because the parental HCT-116 cell line already carries a significant number of single nucleotide variants (SNVs) relative to the human reference sequence ([Abaan et al., 2013] and see Discussion), we needed a way of measuring de novo mutations resulting from Pol ε-dependent replication errors. To do this we first performed whole genome sequencing (WGS) on genomic DNA prepared from mismatch repair-deficient Pol εwt/exo- cells, which we then used as a matched normal control. We termed this mutation spectrum P0. We then passaged these cells through a calculated 13.9 population doublings and then performed WGS again on the passaged population, which we termed P14. Mutations unique to P14 arose during the defined number of population doublings. The P0 and P14 samples contained 140.3 and 141.4 Mut/Mb, respectively. Given the calculated μBS and the 2.8 × 10⁹ bp sequenced, we predicted the accumulation of 906 novel genome-wide mutations after 14 population doublings. Whole-genome sequencing revealed 5,282 SNVs unique to the P14 population, 5.8-fold higher than that predicted from the μL at HPRT1. Mutations observed in HPRT1 in this cell line may thus underrepresent those found genome-wide. This difference is consistent with what is seen in microbes, where reporter gene mutation rates are consistently 6–8-fold lower than concurrently measured whole-genome mutation rates, likely due to phenotypic lag, strong selective pressure and transcription in the reporter gene (Drake, 2012; Jee et al., 2016; Lee et al., 2012).
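The comparison between predicted and observed mutation counts can be retraced with a few lines of arithmetic. The following is a minimal sketch, not the authors' analysis code: the numbers are those quoted in the text, while the variable and function names are illustrative assumptions. Because μBS is reported only to two significant figures, the prediction recomputed from the rounded inputs (about 902) differs slightly from the reported 906.

```python
# Values quoted in the text; all names below are illustrative assumptions.
MU_BS = 0.23e-7             # mutations per bp per genome duplication (rounded, as reported)
GENOME_BP = 2.8e9           # base pairs covered by whole-genome sequencing
DOUBLINGS_NOMINAL = 14      # doublings used for the prediction in the text
DOUBLINGS_MEASURED = 13.9   # calculated population doublings between P0 and P14
PREDICTED_REPORTED = 906    # predicted novel mutations, as reported
OBSERVED_SNVS = 5282        # SNVs unique to P14, as reported


def expected_mutations(mu_bs, genome_bp, doublings):
    """Expected de novo mutations = per-bp rate x bases covered x doublings."""
    return mu_bs * genome_bp * doublings


# Prediction recomputed from the rounded published inputs.
predicted_from_rounded = expected_mutations(MU_BS, GENOME_BP, DOUBLINGS_NOMINAL)

# Observed excess over the reported prediction.
fold_over_prediction = OBSERVED_SNVS / PREDICTED_REPORTED

# Observed accumulation per population doubling and per bp per doubling.
observed_per_doubling = OBSERVED_SNVS / DOUBLINGS_MEASURED
observed_per_bp_per_doubling = observed_per_doubling / GENOME_BP

print(f"predicted (rounded inputs): {predicted_from_rounded:.1f}")
print(f"fold over reported prediction: {fold_over_prediction:.2f}")
print(f"observed SNVs per doubling: {observed_per_doubling:.1f}")
print(f"observed rate per bp per doubling: {observed_per_bp_per_doubling:.2e}")
```

Dividing the observed SNVs by the measured 13.9 doublings rather than the nominal 14 is a choice made here for illustration; either denominator gives a rate of a few hundred mutations per doubling.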
A C→A transversion fraction exceeding 20% of all base pair substitutions is a primary characteristic of mutation spectra from tumors containing Pol ε exonuclease domain mutations (Rayner et al., 2016; Shinbrot et al., 2014). C→A transversions were increased significantly in the Pol εwt/exo- cells as compared to the control Pol εwt/wt spectrum, accounting for 46% of all base pair substitutions (Figure 2A, χ² = 11.874, p<0.0001). These were not cell line artifacts, as whole exome sequencing from HCT-116 cells from two independent studies ([Abaan et al., 2013] and this study) showed roughly 10% C→A transversions (Figure 2A, p>0.5). HCC2998 cells, which harbor the Pol εwt/P286R mutation, also showed a significant increase in C→A transversions relative to Pol εwt/wt cells (Figure 2A, p<0.0001).
Two sequence context mutational hotspots were observed that are consistent with Pol ε exonuclease domain mutant spectra: C→A transversions in TCT context and T→G transversions in TTT context and, to a lesser extent, ATT and GTT contexts (Figure 2B). These hotspots are seen in Pol ε tumors from patients with bMMRD (Shlien et al., 2015), colorectal and endometrial cancer (Alexandrov and Stratton, 2014; Cancer Genome Atlas Network, 2012; Kandoth et al., 2013; Shinbrot et al., 2014), as well as in the Pol ε-P286R HCC2998 cells (Figure 2—figure supplement 1, data extracted from [Abaan et al., 2013]). These are not mutational hotspots in HCT-116 cells, which contain wild type Pol ε (Figure 2—figure supplement 1). The largest number of mutations that arose during the 14 doublings were C→A transversions in triplet contexts containing adjacent cytosines: CCA, CCT, CCC and CCG. Triplet nucleotide occurrences can vary in the regions captured by WGS and WES. In order to address this we reanalyzed each sample relative to the number of times each trinucleotide is found in the relevant sample and found the hotspot patterns are all retained (Figure 2—figure supplements 3–4). The increase in C→A mutations in the CCT context was also seen in Pol ε exonuclease domain (EDM) tumors from bMMRD patients (Shlien et al., 2015), suggesting that these mutations arise from Pol ε replication errors left uncorrected by mismatch repair. C→A mutations in CCA, CCC and CCG contexts are slightly elevated in Mutation Signature 20, which has been associated with loss of mismatch repair (Alexandrov et al., 2013b). These transversions were seen in the HCT-116 cell line with wild type Pol ε (Figure 2—figure supplement 1), though to a lesser extent. The lack of C→T transitions in TCG contexts is significantly different from colorectal and endometrial Pol ε tumors, but consistent with their absence from bMMRD tumors with Pol ε EDM mutations (Cancer Genome Atlas Network, 2012; Kandoth et al., 2013; Shinbrot et al., 2014).
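The normalization described above, in which mutation counts are expressed relative to how often each trinucleotide occurs in the sequenced regions, can be illustrated as follows. This is a hedged sketch on toy data, not the pipeline used in the study: the function names, the toy sequence and the toy mutation counts are all hypothetical, and the sketch omits the strand collapsing (counting each context together with its reverse complement) that a full analysis would normally apply.

```python
from collections import Counter


def trinucleotide_counts(sequence):
    """Count every overlapping 3-mer made only of A, C, G and T."""
    seq = sequence.upper()
    counts = Counter()
    for i in range(len(seq) - 2):
        tri = seq[i:i + 3]
        if set(tri) <= set("ACGT"):
            counts[tri] += 1
    return counts


def normalized_context_rates(mutation_counts, context_counts):
    """Mutations per occurrence of each trinucleotide context.

    mutation_counts maps (context, alternate_base) to the number of
    mutations observed at the middle base of that context.
    Contexts that never occur in the captured sequence are skipped.
    """
    rates = {}
    for (context, alt), n_mut in mutation_counts.items():
        occurrences = context_counts.get(context, 0)
        if occurrences > 0:
            rates[(context, alt)] = n_mut / occurrences
    return rates


# Hypothetical toy inputs, for illustration only.
toy_region = "TCTTCTCCATTTCCT"
toy_mutations = {("TCT", "A"): 1, ("CCA", "A"): 1}

contexts = trinucleotide_counts(toy_region)
rates = normalized_context_rates(toy_mutations, contexts)

for (context, alt), rate in sorted(rates.items()):
    print(f"{context} (middle base -> {alt}): {rate:.2f} mutations per site")
```

In this toy case TCT occurs twice and CCA once, so equal raw counts translate into different per-site rates, which is the effect the normalization is meant to expose when WGS and WES capture different trinucleotide compositions.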
Expression of MMR suppresses Pol εwt/exo- mutagenesis
While it is clear that Pol ε-dependent mutagenesis in the absence of functional MMR accounts for the ultramutated phenotype in bMMRD tumors with Pol ε mutations, the role of MMR in Pol ε somatic tumors is less clear. In order to measure the effects of MMR on Pol ε exonuclease-dependent replication errors, we wanted to measure error rates in both the presence and absence of MMR. Previous studies have restored MMR by stably adding Mlh1-expressing chromosome 3 to cells (Glaab and Tindall, 1997). We made Mlh1-encoding lentivirus and used this to infect Mlh1-deficient HCT-116 cells containing wild type and mutant Pol ε (Figure 3A). Lentiviral Mlh1 expression reduced mutation rates at the HPRT1 locus by 14- to 20-fold in the wild type polymerase background (Figure 3B), similar to the 12-fold reduction reported when the Mlh1-encoding chromosome 3 was added back to HCT-116 cells ([Glaab and Tindall, 1997; Tindall et al., 1998]; 73 × 10⁻⁷ and 5.9 × 10⁻⁷; 12.4-fold reduction), indicating that the expressed Mlh1 is functional.
Mlh1 expression in Pol εwt/exo- cells caused an over 50-fold decrease in the mutation rate (to 2.3 and 3.0 × 10⁻⁷, Figure 3B), making the rates indistinguishable from those measured in Pol εwt/wt cells with Mlh1 expressed (Figure 3B). This result also suggests that Msh3 is unlikely to play a significant role in correcting the exonuclease-deficient Pol ε errors since HCT-116 cells are deficient in this factor and it was not added back in these experiments (Papadopoulos et al., 1994).
When fluctuation assay mutation rates are very low due to a significant number of independent isolates giving rise to zero HPRT1-mutant colonies, as was the case here, an alternative method to measure mutation rates can be used. We chose to periodically measure HPRT1 mutant frequencies at increasing population doubling level (PDL), where the slope of the plotted line is equal to the mutation rate (Glaab and Tindall, 1997). We measured HPRT1 mutant frequencies at several population doublings from PDL = 0 to PDL = 70 or 71 in Pol εwt/wt and Pol εwt/exo- cells expressing Mlh1, respectively (Figure 4A). At each PDL we scored between 1 and 19 6-TG-resistant colonies. However, when we sequenced the HPRT1 ORF from all 6-TG-resistant colonies we saw many instances of repeat mutations in a collection from a single PDL, indicative of a single mutational event that expanded throughout the population. Plotting mutant frequency values calculated for the indicated PDL using only the unique HPRT1 mutations (Figure 4—source data 1) returned a line with slope of ~1, suggesting that the mutation rates were at or near the level of detection of this assay. The Pol εwt/exo- mutant frequencies were consistently higher than those from the Pol εwt/wt cells, but this difference was not statistically significant (Figure 4A).
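The slope-based estimate described above can be sketched as an ordinary least-squares fit of mutant frequency against population doubling level. This is an illustrative sketch only: the source does not state how the line was fitted, so ordinary least squares is an assumption, and the data points below are hypothetical values chosen to make the arithmetic easy to follow, not measurements from the study.

```python
def least_squares_slope(x, y):
    """Return (slope, intercept) of the ordinary least-squares line y = slope*x + intercept."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need at least two paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: population doubling level and mutant frequency
# (mutants per 10^7 cells). These are NOT values from the study.
pdl = [0, 10, 20, 30, 40]
mutant_frequency = [1.0, 3.0, 5.0, 7.0, 9.0]

slope, intercept = least_squares_slope(pdl, mutant_frequency)

# The slope carries the units of the frequency axis per doubling, so here it
# would be read as a mutation rate of `slope` x 10^-7 per population doubling.
print(f"slope: {slope:.3f} per doubling, intercept: {intercept:.3f}")
```

As the text notes, repeated mutations from a single clonal expansion inflate the mutant frequency, so only unique HPRT1 mutations should contribute to the frequencies entered into such a fit.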
Figure 4—source data 1: https://doi.org/10.7554/eLife.32692.023
Figure 4—source data 2: https://doi.org/10.7554/eLife.32692.024
To determine if this phenomenon held throughout the genome, we carried out whole-exome sequencing to an average depth of 100x on the early (PDL = 0) and late (PDL = 70) samples from both Pol εwt/wt and Pol εwt/exo- mismatch repair-proficient cell lines (Figure 4B). Using the PDL = 0 samples as matched normal controls, we measured similar low mutation rates in Pol εwt/wt and Pol εwt/exo- cells (13 × 10⁻⁹ Mut/bp/doubling and 18 × 10⁻⁹ Mut/bp/doubling, respectively). The total numbers of all mutations acquired were essentially no different from those with wild type Pol ε. Interestingly, there was a statistically significant increase in C→A transversions (p=0.0002) between the mismatch repair-proficient Pol εwt/exo- cells and the mismatch repair-proficient Pol εwt/wt cells, while no statistically significant difference was found in any other class of base pair substitution (p>0.2 for each of the six classes, Fisher’s Exact Test). Further, all triplet context mutations were observed in insufficient numbers to evaluate statistically. C→A mutations were, however, observed in all triplet contexts seen as hotspots in the MMR-deficient cells (CCA, CCC, CCG, CCT and TCT, Figure 4—figure supplement 1). Mutation signature 10, the unique Pol ε mutation signature, was extracted from Pol ε exonuclease-deficient mutation spectra from cells with and without mismatch repair (Figure 2—figure supplement 2 and Figure 4—source data 2). The relative contribution of signature 10 in Pol ε exo-deficient cells is closer to that seen in bMMRD patients (Figure 2—figure supplement 5), most likely due to the relative absence of C→T transitions in TCG context. These results indicate that the majority of replication errors made by the Pol ε-D275A/E277A mutant are in fact corrected by mismatch repair.
Discussion
In the current study we examined the relative contributions of two essential determinants of replication fidelity, proofreading and mismatch repair, to mutagenesis in human cells. We used a combination of gene editing, reporter gene studies and next generation sequencing to measure mutation rates and specificities in human cells engineered to model proofreading-deficient tumors with and without mismatch repair. This is equivalent to what occurs in human tumors with mutations in the Pol ε exonuclease domain and genomic mutation frequencies exceeding 100 mutations per Mb (Cancer Genome Atlas Network, 2012; Rayner et al., 2016; Shinbrot et al., 2014; Shlien et al., 2015). We show that large and rapid mutation accumulation occurs when Pol ε exonuclease domain mutations occur along with inactivation of mismatch repair. Most of these are specific transversion mutations known to be hotspots of exonuclease-deficient Pol ε mutagenesis. We further show that this large increase in mutation rate is largely suppressed by functional mismatch repair. Taken together, these results suggest that the mechanism of replication error mutagenesis in sporadic tumors with heterozygous Pol ε mutations likely requires an additional feature, several of which are described below, including suppression of MMR and alternative effects on Pol ε activity.
We used rAAV-mediated gene targeting to replace two exonuclease active site residues, D275 and E277, with alanines on a single POLE allele. The single allelic inactivation was chosen to model the case in tumors with heterozygous Pol ε mutations. This double amino acid substitution has been shown to inactivate exonuclease proofreading in vitro and cause increased reporter gene mutation rates in yeast and mammalian cells (Morrison et al., 1991; Morrison and Sugino, 1994; Tran et al., 1999; Albertson et al., 2009; Korona et al., 2011; Shcherbakova and Pavlov, 1996; Agbor et al., 2013). Next generation sequencing on these cells in the presence or absence of mismatch repair over defined numbers of population doublings allowed us to compare genome-wide mutation rates and spectra to the mutation spectra from patient tumors.
Unbiased whole-genome sequencing confirmed the rapid accumulation of Pol ε-specific mutations seen in POLE tumors lacking functional mismatch repair (Shlien et al., 2015). The total number of measured SNVs suggests a mutation rate of 380 mutations per population doubling, similar to the 608 mutations per cell cycle calculated for a mismatch repair-deficient brain tumor harboring a Pol ε exonuclease domain mutation. Our cellular mutation rate values possibly underestimate the true Pol ε exonuclease-deficient mutation rate for several reasons. Our data were generated from a cancer cell line with a large number of pre-existing mutations (Abaan et al., 2013), as well as additional mutations that have assuredly arisen during passaging in the laboratory. These could conceivably include suppressor mutations functioning to restrain elevated mutation rates (Morrison and Sugino, 1994; Herr et al., 2011a; Williams et al., 2013). Importantly, no additional mutations in POLE were sequenced, suggesting that viability of this cell line is not due to an acquired mutation elsewhere in POLE acting to suppress the mutation rate, as occurs frequently in yeast (Herr et al., 2011a; Williams et al., 2013; Herr et al., 2011b; Dennis et al., 2017). While we cannot formally exclude the possibility that a de novo mutation in another gene acted to suppress the mutation rate in trans, no obvious candidates were identified.
An additional reason that our mutation rates may underestimate the true mutation rate is that mutations that arise in the last several rounds of replication and those that fall below 5% allele frequency would not meet the threshold for scoring as a true SNV. The genome data were generated using an instrument with high accuracy (<1% error rate) and variants were called using an established algorithm. However, a small number of areas in the genome are inaccessible, either due to gaps in the reference assembly or to excessive numbers of repeats that prevent proper alignment. Experiments using single-cell sequencing could address these issues, ideally by selecting single cells, expanding subclones and then measuring mutations at higher stringency values than used here. These rates are also similar to the per base pair mutation rates in haploid yeast with complete Pol ε exonuclease deficiency and disrupted MMR (Kennedy et al., 2015). This similarity is striking considering our measurements were made in a heterozygous diploid human cell line. A key finding from the yeast study was that individual cell mutation rates could vary by an order of magnitude. We are currently unable to measure mutation rates in individual cells, but this remains a critical issue to address in future studies.
The unique mutation spectrum seen in POLE tumors was recapitulated in our gene-targeted cell lines, with one notable exception. In tumors, many C→A transversions occur in a highly specific triplet sequence context, TCT, which we also see in the cell lines, though not to the same proportion as in the tumor genomes. Interestingly, this particular mutation is also enriched in yeast with the P286R equivalent allele (Barbari and Shcherbakova, 2017). We also observe increased T→G transversions in TTT (and to a lesser extent ATT and CTT) context, similar to Pol ε tumors. Because of the limited number of mutational target sites we cannot at this time draw conclusions as to Pol ε strand usage during replication. Experiments designed to assess strand bias in these errors are currently underway. What is notable, however, is the lack of TCG→TTG transitions in our dataset. This is the second most frequent Pol ε-specific mutation in the TCGA database. TCG→TTG transitions were also not found elevated in the Pol ε bMMRD brain tumor mutation spectrum. This difference may reflect interesting, but as-yet undefined tissue differences.
Another possible explanation for these differences is that the Pol ε mutants found in tumors are somehow intrinsically different biochemically from the double alanine substitution mutant used in the current study. Depending on the reporter gene used, the monoallelic Pol ε-P286R mutant is a 2.3- to 12-fold stronger mutator than the pol2-4 mutant (equivalent to the human Pol ε D275A/E277A studied here) when measured in a diploid yeast strain (Kane and Shcherbakova, 2014). However, a number of direct biochemical comparisons of activity and fidelity between several cancer mutant constructs and the D275A/E277A construct (Figure 1C; Shinbrot et al., 2014; Shlien et al., 2015; and unpublished observations) have not yet shown any significant differences that could account for this. Certain DNA Pol mutants, including some found in human tumors, can cause increased mutagenesis by inducing expansions of normal dNTP pools in yeast and human cells (Dennis et al., 2017; Mertz et al., 2015; Williams et al., 2015). Interestingly, the pol2-4 allele has no effect on dNTP pools in yeast, suggesting one explanation for possible allelic differences with functional MMR.
In heterozygous Pol εwt/exo- cells with functional mismatch repair, mutation rates were suppressed to the levels seen in cells with wild type Pol ε. These rates would be insufficient to give rise to ultrahypermutated tumors in a matter of months. In addition, there is no explosive accumulation of triplet context-specific mutations in the MMR-proficient Pol εwt/exo- cells that is seen in these tumors.
Given that the HCT-116 cells used in these studies are mutators themselves, it is possible that pre-existing deficiencies in other DNA repair or replication proteins could contribute to the observed mutagenesis. While direct contribution is unlikely given the absence of POLE mutation spectrum in the wild type Pol ε cells, cooperation with exonuclease-deficient Pol ε remains a formal possibility. To address this we used gene ontology to identify 58 DNA repair and replication proteins mutated in our HCT-116 cells, including 38 non-synonymous and 20 indel mutations. While several interesting candidates with known links to mutagenesis were identified, all have been shown by other groups to be expressed in this cell line and each, when tested, is functional (e.g. ATM, SETD2, Pol η, Pol ζ [Bhat et al., 2013; Hahn et al., 2011; Nicolay et al., 2012; Zhou et al., 2013; Zhu et al., 2009]). No mutation that arose during the population doubling experiments showed any obvious link to mutagenesis.
Our results support a model in which simple heterozygous loss of two Pol ε exonuclease metal chelating residues on a single allele of POLE is insufficient to drive Pol ε ultramutational specificity. Additional factors are likely required to help drive the ultramutated phenotype observed in POLE tumors, including suppression of mismatch repair, discussed below. In bMMRD, the complete lack of mismatch repair prior to Pol ε mutation leads to the moderate accumulation of Pol ε-independent replication errors (Figure 5). Mutation rates then increase dramatically upon loss of proofreading in one allele, with the Pol ε error signature representing a smaller fraction of the total errors, which is seen in these tumors (Shlien et al., 2015). Our results suggest that Pol ε mutations in somatic tumors can occur first and early, but later suppression of MMR would then accelerate overall mutation rates to that seen in the ultramutated tumors, while the signature mutation proportion remains high (Cancer Genome Atlas Network, 2012; Kandoth et al., 2013).
Analysis of the mutational status of all mismatch repair genes in Pol ε tumors sequenced by TCGA supports the model of mismatch repair loss dramatically accelerating the acquisition of Pol ε-specific mutations. 85% (22/26) of the TCGA Pol ε tumors also have a mutation in at least one mismatch repair gene, most of which (18/22) harbor at least one nonsense mutation, which are more likely to be inactivating mutations (Figure 5—figure supplement 1). This predicts that at least some tumors would show evidence of MSI. In the original TCGA studies, several POLE tumors were actually first classified as MSI (three as MSI-H; five as MSI-L) (Cancer Genome Atlas Network, 2012; Kandoth et al., 2013). Analysis of sequencing reads from 46 homonucleotide runs in the POLE endometrial tumors showed no evidence of instability, so the POLE tumors were then reclassified as MSS (Shinbrot et al., 2014). However, the initial TCGA studies used both homo- and di-nucleotide loci to score MSI, raising the possibility that a subset of POLE tumors have a microsatellite instability defect at repeats more complex than homonucleotides. Indeed, the repeat unit size, the number of repeats and the repeat sequence composition are known to have very strong influences on the variability of microsatellite mutagenesis (Shah et al., 2010). Curiously, however, most (15/18) of the MMR gene nonsense mutations are the result of TCT→TAT transversions, raising the possibility that Pol ε mutation occurs first and possibly even promotes subsequent mutational inactivation of MMR.
Of all the Pol ε mutant colorectal and endometrial tumors sequenced in the TCGA studies, 15% (4/26) lacked a mutation in any mismatch repair gene and also showed no evidence of MLH1 promoter hypermethylation, demonstrating that the ultramutated phenotype can arise when mismatch repair is intact at both the genetic and epigenetic level. An alternative possibility is that mismatch repair activity is suppressed at some point during POLE tumor development. In this scenario, mutations introduced by the mutant Pol ε could accumulate slowly even in the presence of genotypically and epigenetically wild type mismatch repair. A number of conditions have been shown to transiently and reversibly lower mismatch repair protein levels and inhibit mismatch repair activity, including hypoxia, oxidative damage, inflammation, reduced pH, exposure to adriamycin or cadmium and treatment with mutagenic dNTP analogs (Banerjee and Flores-Rozas, 2005; Francia et al., 2005; Larson and Drummond, 2001; Mihaylova et al., 2003; Chang et al., 2002; Hile et al., 2013; Iwaizumi et al., 2013; Lu et al., 2014; Negishi et al., 2002). The variable nature and duration of such a suppression event would be expected to result in a complex effect on microsatellite instability. Perhaps even more intriguingly, transient mismatch repair suppression has been seen in the context of proofreading-deficiency in E. coli (Fijalkowska and Schaaper, 1996; Schaaper and Radman, 1989). While replication errors made by the proofreading-deficient allele tested here were clearly insufficient to suppress MMR, it is possible that the nature and rate of errors made by cancer-associated alleles might be sufficient to saturate and overwhelm MMR pathways.
Our results support the idea that loss of a single Pol ε proofreading allele is sufficient to drive a subset of the observed clinical characteristics of Pol ε tumors, provided mismatch repair is compromised in some way. These observations further support the idea that, in the presence of fully functional MMR, the appearance of the ultrahypermutated mutation signature may be more directly related to some as yet uncharacterized additional defect in the mutant polymerase (Barbari and Shcherbakova, 2017). These ideas are not mutually exclusive.
Given the recent success of immune checkpoint therapies in treating tumors with high mutation burden (Shlien et al., 2015; Bouffet et al., 2016; Hodi et al., 2010; Le et al., 2015; Santin et al., 2016), it is of great interest to understand the mechanisms that result in ultrahypermutated tumors harboring DNA polymerase mutations.
Materials and methods
Materials
Trypsin-EDTA was from Life Technologies and Geneticin was from Invitrogen. Antibodies against Mlh1 (mouse α-human Mlh1, G168-728) and β-actin (mouse α-human beta-actin, A1978) were from Pharmingen and Sigma, respectively.
Cell culture
The human colorectal cancer cell line HCT-116 (a kind gift from Dr. Prescott Deininger) was grown in HyClone MEM/EBSS (Thermo Scientific) supplemented with 10% fetal bovine serum (Atlanta Biologicals), 1% sodium pyruvate (Invitrogen) and 1% MEM-NEAA (Invitrogen). The HCT-116 cells used in this study were validated via analysis of genome-wide mutation signature, microsatellite instability and biomarker expression. HCT-116 cells lack Mlh1, resulting in a well-characterized MSI phenotype (Lynch et al., 1993; Parsons et al., 1993; Boland and Goel, 2010). They further have a unique mutational spectrum that can be evaluated via next-generation sequencing (Abaan et al., 2013). Western blot analyses (Figure 3A) showed a lack of Mlh1 protein. The mutation spectrum from our whole-exome sequencing of HCT-116 cells (Figure 2A and Figure 2—figure supplement 1) is identical to that reported by Abaan et al. (2013). Lastly, we performed microsatellite stability analysis in our HCT-116 cells at five mononucleotide homopolymeric run loci (NR27, NR21, NR24, BAT25, BAT26) using capillary electrophoresis, which showed instability at these loci, providing a phenotypic readout consistent with the lack of Mlh1 expression in our cells (data not shown). The HCT-116 cell line is also not among the 488 commonly misidentified cell lines in the most recent ICLAC database (Version 8.0) and tested negative for mycoplasma.
Generation of targeting constructs
In order to target the proofreading inactivating mutations to the POLE locus in vivo, we used rAAV with a synthetic exon promoter trap (Rago et al., 2007). A 1045 bp fragment containing POLE exons 7 and 8 along with intron 7 (termed HA1) was PCR amplified from HCT-116 genomic DNA using primers designed to add unique NotI and SacI sites to the 5' and 3' ends, respectively. A 1057 bp fragment containing exons 9, 10 and 11 along with introns 9 and 10 (termed HA2) was PCR amplified from HCT-116 genomic DNA using primers designed to add unique EcoRI and NotI sites to the 5' and 3' ends, respectively. Both HA1 and HA2 were first cloned into pCR-TOPO and sequence verified. The catalytic exonuclease DIE residues located in HA2 (exon 9) were changed to AIA using site-directed mutagenesis and sequence verified. The Pol ε rAAV shuttle vector was assembled by four-way ligation using the restriction enzyme-digested gene-specific HA1 and HA2 fragments, along with the SEPT/loxP cassette digested with NotI-EcoRI and the ITR-containing pAAV shuttle vector digested with NotI (SEPT/loxP cassette and pAAV shuttle vectors were kind gifts of Dr. Fred Bunz, Johns Hopkins University). The Exo-targeting vector was used to package high-titer (1.6 × 10⁶ PFU/ml) recombinant adeno-associated virus into AAV2 serotype capsids.
Gene targeting and isolation of recombinant cell lines
Cells were grown in 100 mm dishes and infected with rAAV when ~75–80% confluent. At the time of infection, cells were washed with 1x Hanks buffered saline solution (Invitrogen) before adding 3 ml of media containing 75 μl of a 1:250 dilution of rAAV lysate. 3 hr after infection an additional 6 ml of media was added to plates and allowed to incubate at 37°C for 48 hr. After 48 hr, media was changed and Geneticin was added to a final concentration of 400 μg/ml. Plates were then incubated under selection for an additional 14 days. At the end of the selection period, colonies from plates were isolated using glass cloning rings and 0.05% trypsin (Invitrogen) was used to transfer colonies to 6-well plates for subsequent expansion. Genomic DNA was extracted from expanded clones using the DNeasy Blood and Tissue Kit (Qiagen) according to the manufacturer’s protocol and eluted in 100 μl of elution buffer. Locus-specific integration was assessed by PCR using a primer that annealed outside the homology region and another that annealed within the neo cassette.
Cre-mediated excision
To remove the SEPT cassette from correctly targeted clones, cells were infected in a 25 cm² flask with adenovirus that expresses the Cre recombinase (1.0 × 10⁶ PFU/ml, Vector Biolabs, Philadelphia, PA). Cells were plated at a limiting dilution in nonselective medium 24 hr after infection. 12 days after infection, single cell colonies were plated in duplicate and Geneticin was added to one set of wells at a final concentration of 400 μg/ml to test for sensitivity. During this time, genomic DNA was extracted as previously described and screened using primers that annealed across both homology arms. PCR products were digested with SacI to distinguish between the wild type and recombinant locus.
Southern blot analysis
Genomic DNA was harvested from the knock-in cell lines using the DNeasy Blood and Tissue Kit (Qiagen), and double digested with SacI and SalI. Hoechst fluorimetry was used to determine the concentration of DNA samples for accurate loading of samples. 4 μg of each sample was run on a 0.8% agarose gel in TBE. DNA was transferred to Hybond N+ membrane (Amersham), blotted with a probe to HA2 at 65°C overnight, and washed at 65°C. To make the probe, a 300 bp sequence was amplified from the HA2-pCR-TOPO clone using the primers: 5ʹ-GCATCTGCCCCACTGTTAGT-3ʹ and 5ʹ-CTCCCTGTTGGTGATGAGGT-3ʹ. The PCR product was labeled using the Prime-It II Random Primer Labeling Kit (Agilent) and α-32P-dCTP (Perkin Elmer). The membrane was blocked in Denhardt’s pre-hybridization buffer [6x SSC, 0.5% SDS, 0.1% Ficoll 70, 0.1% Ficoll 400, 0.2% PVP, and 0.2%] at 65°C for 1 hr. The probe was added to hybridization buffer [6x SSC, 0.5% SDS, and 10% Dextran Sulfate] and incubated overnight at 65°C. To wash off excess probe, the blot was washed 2 × 15 min in wash 1 [10x SSC, 0.5% SDS], 2 × 15 min in wash 2 [1x SSC, 1% SDS], and 2 × 30 min in wash 3 [0.1x SSC, 1% SDS]. The blot was exposed to a PhosphorImage screen and scanned on a Typhoon Imager.
Purification of human Pol ε
An expression vector encoding residues 1–1189 of the catalytic subunit of human Pol ε containing the D275A/E277A substitution was prepared as described (Korona et al., 2011). Briefly, the human Pol ε was coexpressed in autoinduction medium with pRK603, which allows coexpression of TEV protease, at 25°C until the culture was saturated. Peak fractions from the HisTrap column were pooled, dialyzed into 50 mM HEPES, pH 7.5, 1 mM DTT, 5% glycerol and bound to SP Sepharose. Bound protein was eluted with a 0–1 M NaCl gradient. Peak fractions were pooled, dialyzed into 50 mM Tris, pH 7.5, 1 mM DTT, 5% glycerol, 100 mM NaCl and bound to Q Sepharose. Bound protein was eluted with a 100 mM–M M NaCl gradient. Peak fractions were pooled, concentrated and passed through a pre-equilibrated Superdex200 size exclusion column. Fractions containing the purified 140 kDa protein were pooled, dialyzed into 50 mM Tris, pH 8.0, 1 mM DTT, 5% glycerol and aliquots were frozen and stored at −80°C.
TCT→TAT in vitro error rate
We previously reported that the lacZ forward mutation assay template lacks sites at which TCT→TAT transversions are phenotypically detectable (Shlien et al., 2015). To overcome this limitation we previously made a reversion substrate that reports only this mutation by using site-directed mutagenesis to change A-11 to C-11. Double-stranded M13mp2 DNA containing the TC-11T sequence was used as a substrate in reactions containing 0.15 nM DNA, 50 mM Tris-Cl, pH 7.4, 8 mM MgCl2, 2 mM DTT, 100 μg/ml BSA, 10% glycerol, 250 μM dNTPs and 1.5 nM Pol ε at 37°C. Completely filled product was transfected into Escherichia coli cells, which were used to determine the frequency of dark blue revertant plaques that occurred as a result of TCT→TAT transversions arising during DNA synthesis. In this assay, accurate DNA synthesis yields colorless plaques. Error rates were calculated according to the following equation: error rate (per nucleotide synthesized) = ((number of mutants of a particular class) × (mutant frequency)) / ((number of mutations sequenced) × (0.6) × (number of detectable sites)).
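The error rate equation above can be written as a short function. This is a minimal sketch rather than the authors' analysis code: the function and parameter names are hypothetical, the numbers in the usage line are illustrative only, and the 0.6 term is taken from the equation as stated (the name `expression_factor` is an assumption about its meaning).

```python
def error_rate(n_class, mutant_frequency, n_sequenced, detectable_sites,
               expression_factor=0.6):
    """Error rate per nucleotide synthesized, following the equation in the text.

    n_class           -- number of sequenced mutants of the class of interest
    mutant_frequency  -- observed mutant (revertant plaque) frequency
    n_sequenced       -- total number of mutations sequenced
    detectable_sites  -- number of sites at which the class is detectable
    expression_factor -- the 0.6 term from the equation (name is an assumption)
    """
    if n_sequenced <= 0 or detectable_sites <= 0:
        raise ValueError("n_sequenced and detectable_sites must be positive")
    return (n_class * mutant_frequency) / (
        n_sequenced * expression_factor * detectable_sites
    )


# Illustrative values only: every sequenced revertant carries the scored
# change, and the reversion substrate reports a single site.
example = error_rate(n_class=10, mutant_frequency=6e-4,
                     n_sequenced=10, detectable_sites=1)
```

For a reversion substrate that reports one mutation at one site, the calculation reduces to the mutant frequency divided by 0.6.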
Mlh1 lentivirus construction
The Mlh1 ORF was PCR amplified using the pCMV-XL5-Mlh1 vector (kindly provided by Victoria Belancio, Tulane University), forward and reverse primers (fwd 5'-TCGACTCGAGTCCACCATGTCGTTCGTGGCAGG-3'; rev 5'-TCGAGGATCCGTTACTTAACACCTCTCAAAGAC-3') and Q5 DNA polymerase (NEB). After gel purification, dA was added to the 3' ends with Taq and the Mlh1 ORF was cloned into pLenti6.3/V5-TOPO (Invitrogen). Mlh1 was found to have a common I219V SNP that does not affect Mlh1 function (Plotz et al., 2008). Mlh1 lentiviral particles were made using the ViraPower Lentiviral Expression System (Invitrogen). Briefly, 293FT cells were transfected with pLenti6.3/V5-TOPO-Mlh1 and a mixture of plasmids encoding lentiviral packaging factors. Viral supernatant was harvested 48 hr after transfection, filter sterilized and stored in aliquots at −80°C. After titering, HCT-116 cells were transduced with Mlh1 lentivirus at MOI of 1.0. Cells were selected for 1 week in 10 μg/ml blasticidin. Blasticidin-resistant clones were identified and cells were harvested, lysed and probed by Western blot (mouse α-human Mlh1, G168-728, Pharmingen) to confirm Mlh1 expression.
Mutation rate and mutant frequency measurements
Prior to mutation rate measurements, preexisting HPRT1 mutants were eliminated from cell populations by incubating cells in HAT medium (1x Hypoxanthine-Aminopterin-Thymidine) for five passages. For each cell line analyzed, 500 cells were seeded and grown to confluence in 12 wells across two 6-well plates. Cells from one well were harvested and counted to estimate cell number in the remaining 11 wells. For mutation rate measurement, 500 cells from each of the remaining eleven wells were seeded per dish in 3 × 100 mm dishes in media lacking 6-TG to be used to measure plating efficiency. At the same time, 5 × 10⁵ cells from each of the remaining eleven wells were plated in 5 × 100 mm dishes in media containing 6-TG. After 7 days, colonies on the plating efficiency dishes were stained with crystal violet and counted. After 12–14 days, the 6-TG resistant colonies were also stained with crystal violet and counted. Mutation rate was calculated using the Ma-Sandri-Sarkar Maximum Likelihood Estimator (MSS-MLE) method (Rosche and Foster, 2000).
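For readers unfamiliar with the MSS-MLE method, the sketch below shows its core: the Ma-Sandri-Sarkar recursion for the Luria-Delbrück distribution and a one-dimensional search for the number of mutations per culture (m) that maximizes the likelihood of the observed mutant counts. It is an illustration under simplifying assumptions, not the analysis used in this study: it ignores plating efficiency and partial plating, the search bounds and the golden-section optimizer are arbitrary choices, and the counts shown are invented.

```python
import math


def luria_delbruck_pmf(m, r_max):
    """Return [p_0, ..., p_r_max] from the Ma-Sandri-Sarkar recursion:
    p_0 = exp(-m);  p_r = (m / r) * sum_{i<r} p_i / (r - i + 1)."""
    p = [math.exp(-m)]
    for r in range(1, r_max + 1):
        p.append(m / r * sum(p[i] / (r - i + 1) for i in range(r)))
    return p


def log_likelihood(m, counts):
    """Log-likelihood of observed mutant counts per culture given m."""
    p = luria_delbruck_pmf(m, max(counts))
    return sum(math.log(p[r]) for r in counts)


def mss_mle(counts, lo=1e-6, hi=50.0, tol=1e-9):
    """Golden-section search for the m that maximizes the log-likelihood.
    Assumes the likelihood has a single maximum between lo and hi."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = log_likelihood(c, counts), log_likelihood(d, counts)
    while b - a > tol:
        if fc > fd:            # maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = log_likelihood(c, counts)
        else:                  # maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = log_likelihood(d, counts)
    return (a + b) / 2


def mutation_rate(counts, final_cells):
    """Mutations per cell per division, approximated as m / final cell number."""
    return mss_mle(counts) / final_cells


# Invented counts of 6-TG resistant colonies from eleven parallel cultures.
counts = [0, 1, 0, 3, 2, 0, 5, 1, 0, 2, 1]
m_hat = mss_mle(counts)
```

Dividing m by the final number of cells per culture gives the rate; a full analysis would also correct the counts for plating efficiency and for the fraction of each culture plated.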
For mutant frequency measurement, 500 cells per clone were seeded in duplicate in 6-well plates in media lacking 6-TG and allowed to grow for 5–7 days to determine plating efficiency. The remaining wells were seeded with 5 × 10⁴ cells in media containing 6-TG and allowed to grow for 12–14 days. After the indicated time, colonies were stained with crystal violet and counted. Mutant frequency was calculated by the following equation: (# 6-TG resistant colonies) / ([(# colonies scored_PE)/(# cells seeded_PE)] × (# cells seeded_6-TG)), where the subscript PE refers to the plating efficiency wells and the subscript 6-TG to the selection wells. Colonies were defined as ≥50 cells.
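The mutant frequency equation can be expressed directly in code. The sketch below is illustrative: the function and argument names are hypothetical and the example numbers are not data from this study.

```python
def mutant_frequency(resistant_colonies, colonies_scored_pe,
                     cells_seeded_pe, cells_seeded_6tg):
    """6-TG resistant colonies per viable cell plated under selection.

    The plating efficiency (colonies scored / cells seeded without 6-TG)
    converts the number of cells seeded in 6-TG into viable cells.
    """
    plating_efficiency = colonies_scored_pe / cells_seeded_pe
    return resistant_colonies / (plating_efficiency * cells_seeded_6tg)


# Illustrative numbers: 250 of 500 cells form colonies without selection,
# and 20 resistant colonies arise from 5e4 cells seeded in 6-TG.
mf = mutant_frequency(20, 250, 500, 5e4)
```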
HCT-116 and HCT-116 + Mlh1 cells were seeded into T75 flasks and grown at 37°C/5% CO₂ until 80% confluency was reached. Cells were counted using the Countess Automated Cell Counter (Invitrogen) and 1 × 10⁶ cells were seeded into new T75 flasks and incubated until 80% confluency was reached. The above protocol was repeated at regular intervals (3–4 days) and population doubling (PDL) numbers calculated using the following equation: PDL = [ln(Nt) − ln(N0 × PE)]/ln(2), where Nt = number of viable cells counted after passage; N0 = number of cells seeded prior to passage; PE = plating efficiency. At PDL ~6, 44 and 69, cells were trypsinized and counted. For mutant frequency measurement, 300 cells were seeded into each of 3 × 100 mm dishes in media lacking 6-TG to be used to measure plating efficiency. Concurrently, 2 × 10⁵ cells were seeded into each of 10 × 100 mm dishes in media supplemented with 6-TG to a final concentration of 5 μg/mL. After 7 days, colonies on the plating efficiency dishes were stained with crystal violet and counted. After 12–14 days, 6-TG resistant colonies were isolated using glass cloning rings and 0.05% trypsin and transferred into 24-well plates for expansion and RNA isolation. Additionally, at the above PDLs an aliquot of cells was harvested, lysed and probed by Western blot (mouse α-human Mlh1, G168-15, Abcam) to confirm maintenance of Mlh1 expression.
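The population doubling equation lends itself to a small helper that can be summed over passages. This is a sketch with hypothetical names and made-up passage data, assuming doublings from successive passages are simply added.

```python
import math


def population_doublings(n_final, n_seeded, plating_efficiency):
    """PDL for one passage: [ln(Nt) - ln(N0 * PE)] / ln(2)."""
    return (math.log(n_final) - math.log(n_seeded * plating_efficiency)) / math.log(2)


def cumulative_pdl(passages):
    """Sum PDL over an iterable of (n_final, n_seeded, plating_efficiency)."""
    return sum(population_doublings(nt, n0, pe) for nt, n0, pe in passages)


# Made-up example: 1e6 cells seeded at 50% plating efficiency grow to 8e6,
# i.e. 5e5 viable founders undergo four doublings.
one_passage = population_doublings(8e6, 1e6, 0.5)
three_passages = cumulative_pdl([(8e6, 1e6, 0.5)] * 3)
```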
Genomic per base pair mutation rates (μBS) were calculated using the method of Drake (Drake, 1991) with modifications as applied in Lynch (Lynch, 2010). The equation used was: μBS = (μL • fT • fBS) / (L • fL • [x (nm + nn)/nn]), where μL is the measured mutation rate at the HPRT1 reporter gene, fT is the fraction of mutants found after sequencing, fBS is the fraction of mutations due to base pair substitutions, L is the length (in nt) of the reporter gene, fL is the fraction of HPRT1 that gives rise to detectable mutations, x is the fraction of mutations that would give rise to chain terminator mutations, nm is the observed number of missense mutations and nn is the observed number of nonsense mutations. We used 126 HPRT1 mutations from three independent studies (Bhattacharyya et al., 1995; Glaab and Tindall, 1997; Ohzeki et al., 1997) to calculate μBS. The values used were: fT = 1.0, fBS = 79/126 = 0.627; L = 627 nt; fL = 1; x = 3/64 = 0.047; nm = 74; nn = 5. The μL value for Pol ε mutant cell lines was determined empirically using fluctuation analysis.
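As a worked illustration of the per base pair calculation, the sketch below evaluates the equation with the parameter values listed above. The function name is hypothetical, and the reporter gene mutation rate passed in (1 × 10⁻⁶) is a placeholder rather than a measured value from this study.

```python
def per_base_substitution_rate(mu_L, f_T=1.0, f_BS=79 / 126, L=627,
                               f_L=1.0, x=3 / 64, n_m=74, n_n=5):
    """mu_BS = (mu_L * f_T * f_BS) / (L * f_L * [x * (n_m + n_n) / n_n]).

    Defaults are the values stated in the text; mu_L is the mutation rate
    measured at the HPRT1 reporter gene.
    """
    detectable_fraction = x * (n_m + n_n) / n_n
    return (mu_L * f_T * f_BS) / (L * f_L * detectable_fraction)


# Placeholder reporter rate of 1e-6 per cell division; with the stated
# parameters this gives roughly 1.35e-9 substitutions per base pair.
mu_bs = per_base_substitution_rate(1e-6)
```

With these parameters the conversion factor from the reporter gene rate to the per base pair rate is about 1.35 × 10⁻³, so the genomic estimate scales linearly with the measured HPRT1 rate.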
HPRT1 sequencing
Total RNA was isolated using the Qiagen RNeasy kit (Qiagen) according to the manufacturer’s protocol. RT-PCR was performed with SuperScript III Reverse Transcriptase (Invitrogen) according to the manufacturer’s protocol using 1 μg of RNA as a template. Primer-specific cDNA was amplified for 32 cycles at an annealing temperature of 60°C using the following HPRT1 primers: 123(fwd) CTTCCTCCTCCTGAGCAGTC and 1041 (rev) GCCCAAAGGGAACTGATAGTC. From the HPRT1 sequencing of 6-TG resistant colonies, one clone was found to have exon 2 completely deleted. Exon deletions in HPRT1 have been shown to be caused by splice site mutations (Bhattacharyya et al., 1995). We therefore amplified exon 2 and its flanking region from genomic DNA prepared from the appropriate clone using the following primers: Forward: TTGTTTTCTTACATAATTCATTATCATACC; Reverse: TTACTTTGTTCTGGTCCCTACAGAG.
Whole genome and exome sequencing
Next generation sequencing was performed as per the published protocols. Whole genome sequencing (WGS) was performed on an Illumina HiSeq X Ten instrument with libraries prepared using the manufacturer’s TruSeq Nano DNA Library Prep kit and sequenced to a depth of 36.1x. For exome sequencing, DNA was enriched using the Agilent SureSelect Human Exome Library Preparation V5 kit, then sequenced to a mean depth of 101.38x (range 96.61x–108.19x).
Substitution detection from next generation sequenced data
All samples were processed from raw reads (FASTQ files) from paired end libraries. The reads were aligned to the human reference (GRCh37 with decoy sequences) using BWA-MEM v0.7.8 (Li and Durbin, 2009). Duplicate reads were identified and marked using Picard v1.108 (https://broadinstitute.github.io/picard/). The Genome Analysis Toolkit (GATK) v2.8.1 (McKenna et al., 2010) was used to locally realign reads to known indels and recalibrate base quality scores. Quality metrics were generated from the final BAM files to ensure high quality alignment. These include:
average coverage >90 x in whole exome data (Figure 2—figure supplement 6 Mean_Coverage_Per_Sample.pdf)
alignment rate to the reference genome >99% across whole exome data (Figure 2—figure supplement 7 Proportion_of_properly_paired_reads.pdf) with >60M reads per sample (Figure 2—figure supplement 8 Total_Reads_Exome.pdf)
>90% of bases in the genome at >20 x coverage and >90% of bases in the exome at >30 x coverage (Figure 2—figure supplement 9)
Limitations in the genome due to low-complexity regions and incomplete areas in the genome (Li, 2014) prevent proper alignment resulting in sources of error.
Somatic point mutations between the tumour and matched normal were identified using MuTect v1.1.4 (Cibulskis et al., 2013). In addition, we used MuTect v1.1.4 in single sample mode to detect all mutations in each sample. All mutations were annotated using ANNOVAR v20130823 (Wang et al., 2010). Subsequent filtering was performed to reduce potential false positives and allow only high confidence mutations in the dataset using a custom R package (ShlienLab.Core.SNV v0.09). Mutations were retained if they met the following criteria:
not identified in common mutation databases including: dbSNP (138), 1000 genomes (1000g2012feb), complete genomics (CG69), Exome sequencing project (ESP 6500si)
for exome data, must have at least 20x normal and 30x tumour
for WGS data, must have at least 10x normal and 10x tumour (Figure 2—figure supplement 10)
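The retention criteria above amount to a simple predicate on each candidate mutation. The sketch below is not the ShlienLab.Core.SNV package; the record fields and function names are hypothetical, and only the depth thresholds and the database exclusion rule come from the text.

```python
from dataclasses import dataclass

# Minimum (normal, tumour) read depth per assay, as listed in the criteria.
MIN_DEPTH = {"exome": (20, 30), "wgs": (10, 10)}


@dataclass
class Call:
    """One candidate somatic mutation (field names are illustrative)."""
    in_common_db: bool   # found in dbSNP, 1000 Genomes, CG69 or ESP6500
    normal_depth: int    # read depth at the site in the matched normal
    tumour_depth: int    # read depth at the site in the tumour sample


def keep(call, assay):
    """Return True if the call passes the database and depth filters."""
    min_normal, min_tumour = MIN_DEPTH[assay]
    return (not call.in_common_db
            and call.normal_depth >= min_normal
            and call.tumour_depth >= min_tumour)


def filter_calls(calls, assay):
    """Keep only high confidence calls for the given assay type."""
    return [c for c in calls if keep(c, assay)]


calls = [Call(False, 25, 40), Call(True, 50, 60), Call(False, 15, 40)]
exome_kept = filter_calls(calls, "exome")
wgs_kept = filter_calls(calls, "wgs")
```

In this example the third call fails the exome threshold for normal depth but would be retained under the lower WGS thresholds.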
To investigate the quality of somatic mutations, we also identified key metrics including:
Average alternate base quality to reference base quality of ~1.0 (Figure 2—figure supplement 11, mean_ratio_tumour_alt_ref_base_quality.pdf)
Data access
DNA sequencing data from this study have been submitted to the NCBI Gene Expression Omnibus (GEO; http://www.ncbi.nlm.nih.gov/geo/) under accession number PRJNA327240.
Data availability
- Homo sapiens Raw sequence reads (BioProject). Publicly available at NCBI BioProject (Accession no. PRJNA327240).
- NCI-60 mutation dataset. Publicly available in the Catalogue Of Somatic Mutations In Cancer (COSMIC) database at the Wellcome Sanger Institute (file labelled: 'CosmicMutantExport.tsv.gz').
References
- Mutational signatures: the patterns of somatic mutations hidden in cancer genomes. Current Opinion in Genetics & Development 24:52–60. https://doi.org/10.1016/j.gde.2013.11.014
- Cadmium inhibits mismatch repair by blocking the ATPase activity of the MSH2-MSH6 complex. Nucleic Acids Research 33:1410–1419. https://doi.org/10.1093/nar/gki291
- Rev3, the catalytic subunit of Polζ, is required for maintaining fragile site stability in human cells. Nucleic Acids Research 41:2328–2339. https://doi.org/10.1093/nar/gks1442
- Molecular analysis of mutations in mutator colorectal carcinoma cell lines. Human Molecular Genetics 4:2057–2064. https://doi.org/10.1093/hmg/4.11.2057
- Microsatellite instability in colorectal cancer. Gastroenterology 138:2073–2087. https://doi.org/10.1053/j.gastro.2009.12.064
- Oxidative stress inactivates the human DNA mismatch repair system. American Journal of Physiology-Cell Physiology 283:C148–C154. https://doi.org/10.1152/ajpcell.00422.2001
- DNA polymerase ε and δ exonuclease domain mutations in endometrial cancer. Human Molecular Genetics 22:2820–2828. https://doi.org/10.1093/hmg/ddt131
- Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nature Biotechnology 31:213–219. https://doi.org/10.1038/nbt.2514
- Contrasting mutation rates from specific-locus and long-term mutation-accumulation procedures. G3: Genes|Genomes|Genetics 2:483–485. https://doi.org/10.1534/g3.111.001842
- Yeast DNA polymerase ϵ catalytic core and holoenzyme have comparable catalytic rates. Journal of Biological Chemistry 290:3825–3835. https://doi.org/10.1074/jbc.M114.615278
- Replicative DNA polymerase mutations in cancer. Current Opinion in Genetics & Development 24:107–113. https://doi.org/10.1016/j.gde.2013.12.005
- Mechanisms underlying mutational signatures in human cancers. Nature Reviews Genetics 15:585–598. https://doi.org/10.1038/nrg3729
- DNA polymerase ε and its roles in genome stability. IUBMB Life 66:339–351. https://doi.org/10.1002/iub.1276
- Antimutator variants of DNA polymerases. Critical Reviews in Biochemistry and Molecular Biology 46:548–570. https://doi.org/10.3109/10409238.2011.620941
- Tumor-specific microsatellite instability: do distinct mechanisms underlie the MSI-L and EMAST phenotypes? Mutation Research/Fundamental and Molecular Mechanisms of Mutagenesis 743-744:67–77. https://doi.org/10.1016/j.mrfmmm.2012.11.003
- Improved survival with ipilimumab in patients with metastatic melanoma. New England Journal of Medicine 363:711–723. https://doi.org/10.1056/NEJMoa1003466
- Acidic tumor microenvironment downregulates hMLH1 but does not diminish 5-fluorouracil chemosensitivity. Mutation Research/Fundamental and Molecular Mechanisms of Mutagenesis 747-748:19–27. https://doi.org/10.1016/j.mrfmmm.2013.04.006
- Postreplicative mismatch repair. Cold Spring Harbor Perspectives in Biology 5:a012633. https://doi.org/10.1101/cshperspect.a012633
- The high fidelity and unique error signature of human DNA polymerase epsilon. Nucleic Acids Research 39:1763–1773. https://doi.org/10.1093/nar/gkq1034
- DNA mismatch repair. Annual Review of Biochemistry 74:681–710. https://doi.org/10.1146/annurev.biochem.74.082803.133243
- Human mismatch repair and G*T mismatch binding by hMutSalpha in vitro is inhibited by adriamycin, actinomycin D, and nogalamycin. Journal of Biological Chemistry 276:9775–9783. https://doi.org/10.1074/jbc.M006390200
- PD-1 blockade in tumors with mismatch-repair deficiency. New England Journal of Medicine 372:2509–2520. https://doi.org/10.1056/NEJMoa1500596
- Mechanisms and functions of DNA mismatch repair. Cell Research 18:85–98. https://doi.org/10.1038/cr.2007.115
- Decreased expression of the DNA mismatch repair gene Mlh1 under hypoxic stress in mammalian cells. Molecular and Cellular Biology 23:3265–3273. https://doi.org/10.1128/MCB.23.9.3265-3273.2003
- Mechanisms in eukaryotic mismatch repair. Journal of Biological Chemistry 281:30305–30309. https://doi.org/10.1074/jbc.R600022200
- Pathway correcting DNA replication errors in Saccharomyces cerevisiae. The EMBO Journal 12:1467–1473.
- Saturation of DNA mismatch repair and error catastrophe by a base analogue in Escherichia coli. Genetics 161:1363–1371.
- Evaluation of the MLH1 I219V alteration in DNA mismatch repair activity and ulcerative colitis. Inflammatory Bowel Diseases 14:605–611. https://doi.org/10.1002/ibd.20358
- Genetic knockouts and knockins in human somatic cells. Nature Protocols 2:2734–2746. https://doi.org/10.1038/nprot.2007.408
- A panoply of errors: polymerase proofreading domain mutations in cancer. Nature Reviews Cancer 16:71–81. https://doi.org/10.1038/nrc.2015.12
- The extreme mutator effect of Escherichia coli mutD5 results from saturation of mismatch repair by excessive DNA replication errors. The EMBO Journal 8:3511–3516.
- 3'→5' exonucleases of DNA polymerases epsilon and delta correct base analog induced DNA replication errors on opposite DNA strands in Saccharomyces cerevisiae. Genetics 142:717–726.
- Unique error signature of the four-subunit yeast DNA polymerase epsilon. Journal of Biological Chemistry 278:43770–43780. https://doi.org/10.1074/jbc.M306893200
- The 3' to 5' exonuclease activity located in the DNA polymerase delta subunit of Saccharomyces cerevisiae is required for accurate replication. The EMBO Journal 10:2165–2170.
- Complementation of mismatch repair gene defects by chromosome transfer. Mutation Research/Fundamental and Molecular Mechanisms of Mutagenesis 402:15–22. https://doi.org/10.1016/S0027-5107(97)00277-7
- Hypermutability of homonucleotide runs in mismatch repair and DNA polymerase proofreading yeast mutants. Molecular and Cellular Biology 17:2859–2865. https://doi.org/10.1128/MCB.17.5.2859
- From human genome to cancer genome: the first decade. Genome Research 23:1054–1062. https://doi.org/10.1101/gr.157602.113
Article and author information
Funding
Tulane University (Stem Cell and Regenerative Medicine Faculty Grant)
- Bruce A Bunnell
National Institute of Environmental Health Sciences (NIH R01ES028271)
- Zachary F Pursell
National Institute of Environmental Health Sciences (NIH R56ES026821)
- Zachary F Pursell
National Institute of Environmental Health Sciences (NIH R00 ES016780)
- Zachary F Pursell
National Institute of Environmental Health Sciences (NIH P20 RR020152)
- Zachary F Pursell
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
The authors would like to thank Dr. Fred Bunz (Johns Hopkins University) and Drs. Prescott Deininger and Victoria Belancio (Tulane University) for the kind sharing of reagents. Thanks are also due to Christine McBride for her contribution to the rAAV construction. Additionally, the authors would like to thank Dr. Art Lustig and Dr. Stuart Linn for insightful comments and advice.
Copyright
© 2018, Hodel et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Metrics
-
- 2,574
- views
-
- 327
- downloads
-
- 34
- citations
Views, downloads and citations are aggregated across all versions of this paper published by eLife.