Enhanced expression of the MYC transcription factor is observed in the majority of tumors. Two seemingly conflicting models have been proposed for its function: one proposes that MYC enhances expression of all genes, while the other model suggests gene-specific regulation. Here, we have explored the hypothesis that specific gene expression profiles arise since promoters differ in affinity for MYC and high-affinity promoters are fully occupied by physiological levels of MYC. We determined cellular MYC levels and used RNA- and ChIP-sequencing to correlate promoter occupancy with gene expression at different concentrations of MYC. Mathematical modeling showed that binding affinities for interactions of MYC with DNA and with core promoter-bound factors, such as WDR5, are sufficient to explain promoter occupancies observed in vivo. Importantly, promoter affinity stratifies different biological processes that are regulated by MYC, explaining why tumor-specific MYC levels induce specific gene expression programs and alter defined biological properties of cells.https://doi.org/10.7554/eLife.15161.001
Genes with the potential to cause tumors and cancer are commonly called oncogenes. One example of an oncogene encodes for a protein called MYC and many tumors contain high levels of this protein. MYC is a transcription factor and studies of aggressive tumors suggested that, like most other transcription factors, MYC binds to and regulates the activity of a small number of genes in tumors. However, other studies went on to show that MYC actually binds to thousands of genes and somehow only regulates a subset of them during tumor development.
Lorenzin et al. set out to understand how this process works by generating human cells in which the concentration of MYC protein could be altered. In the experiments, the concentration was varied from normal healthy levels to the high levels found in aggressive tumors. The amount of MYC bound to genes and the extent to which it activated the genes inside these cells was also measured.
Lorenzin et al. found that increasing MYC levels from normal to tumor-specific levels did not affect MYC binding at genes where the transcription factor was already strongly bound in normal cells. Rather, MYC binding increased only at genes that were weakly bound in normal cells. Consistent with this observation, only genes at which MYC was weakly bound in normal cells were activated by increasing MYC levels. This observation suggests that increasing the concentration of MYC protein from normal to tumor-specific levels “fills up” previously empty binding sites around these genes with the transcription factor.
Lorenzin et al. also used mathematical modeling to understand how the concentrations of MYC in normal and tumor cells might explain how MYC behaves in cells. Together, the results imply that the MYC transcription factor regulates distinct sets of genes in normal and tumor cells according to how much MYC is present. Further studies may show that the altered regulation of a tumor-specific set of genes is important for tumor development and could use this new information to identify new targets for treating MYC-driven tumors.https://doi.org/10.7554/eLife.15161.002
Deregulated expression of one of the three members of the MYC gene family (MYC, MYCN or MYCL) is observed in the majority of human tumors (Dang, 2012; Meyer and Penn, 2008). A broad body of evidence establishes that deregulated MYC expression causally contributes to multiple aspects of tumor development and that tumors depend on enhanced expression of MYC for growth and survival (Casey et al., 2016; Soucek et al., 2013).
MYC proteins are transcription factors that bind DNA as part of several protein complexes; best understood is a dimeric complex of MYC with its partner protein MAX that activates transcription via binding to a specific DNA motif termed E-box with a consensus sequence of CACGTG (Conacci-Sorrell et al., 2014). To repress transcription, the binary MYC/MAX complex associates with a zinc finger protein termed MIZ1 (Wiese et al., 2013). ChIP (chromatin-immunoprecipitation) (Fernandez et al., 2003) and ChIP-sequencing experiments (Lin et al., 2012; Nie et al., 2012; Sabò et al., 2014) demonstrate the presence of MYC on virtually all promoters with an open chromatin structure as well as on thousands of enhancers and intergenic sites in multiple cell types, raising the question what the functional relevance of this broad binding might be.
Given this global binding pattern, it is surprising that MYC-driven tumors can be recognized by a specific set of up- and down-regulated MYC target genes that holds considerable prognostic and therapeutic value (Sabò et al., 2014; Walz et al., 2014). One hypothesis to explain this observation suggests that MYC proteins globally enhance transcription. This has been termed the general amplifier model (Lin et al., 2012; Nie et al., 2012) and is supported by observations that MYC can cause an increase in total RNA and mRNA levels (Grandori et al., 2005; Hsu et al., 2015; Lin et al., 2012; Nie et al., 2012). In this model, specific gene expression patterns arise indirectly due to feedback and feedforward loops induced by general amplification. The alternative viewpoint suggests that MYC proteins regulate specific genes and that global changes in RNA and mRNA levels occur indirectly as a consequence of MYC-driven cell growth (Sabò and Amati, 2014). To explain the contrast between global binding and specific gene regulation, the latter model proposes that much of MYC binding to chromatin is non-productive in terms of transcriptional regulation (Kress et al., 2015).
We show here that the divergent models can be reconciled with experimental observations without the need to invoke productive and non-productive modes of DNA binding. We analyzed U2OS cells that express a doxycycline-inducible allele of MYC (Elkon et al., 2015; Walz et al., 2014) and characterized DNA binding and gene expression patterns at different levels of MYC. We showed previously that doxycycline-induced overexpression of MYC in these cells establishes a gene expression signature, which closely resembles multiple established signatures of MYC target genes and identifies expression signatures of patients with MYC amplification in tumors (Walz et al., 2014). Hence, these cells represent a simple model system, in which the effect of physiological and tumor-specific MYC levels can be compared and provide a tool to elucidate the mechanism(s) by which activation of a globally binding transcription factor can result in regulation of specific and functionally relevant gene expression patterns.
To determine the effect of changes in MYC levels on DNA binding and gene expression, we have previously engineered U2OS cells to express a doxycycline-inducible allele of MYC (Figure 1A). We chose U2OS cells because they have relatively low levels of endogenous MYC, despite being tumor cells. To illustrate this point, we determined MYC levels in a number of normal and transformed cells. Lysates of equal numbers of exponentially growing cells were probed by immunoblotting (Figure 1B). The results show that U2OS cells express levels of MYC that are comparable to non-transformed cells (IMEC, HMLE, MCF10A) and lower than those found in other tumor cell lines (HeLa and HCT116). Notably, prolonged exposure (>3 days) to doxycycline and hence long-term ectopic expression of MYC induces apoptosis in U2OS cells (Walz et al., 2014); therefore, all subsequent analyses were performed 28–30 hr after addition of doxycycline.
ChIP-sequencing of MYC in U2OSTet-on cells had previously shown that endogenous MYC binds to about 5,500 promoters and that this number increases to about 8,400 MYC-bound promoters upon addition of doxycycline (Walz et al., 2014). Different peak calling programs (MACS14, SICER, CCAT) with default parameters result in very similar peak numbers (Figure 1—figure supplement 1A,B). Reducing the stringency of peak calling resulted in a moderate increase in peak number but a large increase in the number of negative peaks, suggesting that an analysis using default parameters does not overlook a large number of significant peaks (Figure 1—figure supplement 1C). In agreement with reports from other systems, we concluded that the promoters of the majority of all expressed genes are bound by MYC. Surprisingly, individual promoters showed a wide range of occupancies for endogenous MYC (Figure 1C–E), whereas the differences in occupancy by MYC among promoters were much smaller after induction with doxycycline (Figure 1D,E). Promoters, which are least strongly bound by endogenous MYC recruit most MYC upon overexpression (i.e. VEGFA, Figure 1E), whereas the most strongly bound genes recruit no additional MYC (i.e. RPL8). This prompted us to analyze whether the anti-correlation between occupancy by endogenous MYC and recruitment of exogenous MYC is evident globally (Figure 1F). To this end, we determined the relative MYC recruitment at each bound promoter by calculating the fold-change of MYC occupancy in cells with exogenous and endogenous MYC levels. Genes were ranked according to these MYC recruitment values and plotted against the respective occupancy of endogenous MYC (Figure 1F, blue dots). Strikingly, genes, which are most weakly bound (mean: 89 tags), recruit MYC most strongly (5.3-fold), whereas the most strongly bound genes (mean: 580 tags) show on average no further MYC recruitment. Importantly, when MYC occupancy at exogenous MYC levels is analyzed (Figure 1F, orange dots), all bins of genes are bound to a high extent. One way to explain this observation is the hypothesis that genes strongly bound by endogenous MYC levels are fully occupied ('saturated') and hence exogenous MYC is preferentially recruited to weakly bound genes.
To test whether this explanation is correct, we made use of the observation that MYC/MAX heterodimers compete with MXD/MAX heterodimers and MAX homodimers for binding to their target sites (Conacci-Sorrell et al., 2014). If promoters were fully occupied by MYC/MAX heterodimers, one would predict that MXD proteins are completely displaced from these promoters. We tested this prediction by ChIP assays on four genes using an anti-MXD6 (MNT) antibody (Figure 1G). Binding of MXD6 was barely detectable above background for the two genes (NPM1, NCL), which are strongly occupied by endogenous MYC, and did not further decrease upon addition of doxycycline. In contrast, MXD6 occupancy was higher for two genes (HSPBAP1, FBX32), which are poorly bound by endogenous MYC, but strongly decreased upon induction of MYC. Taken together, the easiest model to explain the data is to suggest that a few hundred of promoters are saturated by endogenous MYC in U2OS cells and that overexpression of MYC leads to saturation of the majority of MYC-binding sites in promoters (Figure 1F).
To understand whether the number of MYC molecules per cell in U2OS cells is able to saturate the numerous genomic binding sites, we quantified the absolute expression levels of MYC. A carboxy-terminal fragment of human MYC comprising amino acids 353 to 434 was purified to homogeneity and its amount was determined using spectrometric methods (Figure 2—figure supplement 1A). We then used defined quantities of this fragment to calibrate a series of immunoblots. After complete transfer of both, the recombinant protein and cellular MYC (Figure 2—figure supplement 1B), membranes were probed with the 9E10 monoclonal antibody, which recognizes an epitope (EQKLISEEDL) corresponding to amino acids 410 to 419 of human MYC (Figure 2A, Figure 2—figure supplement 1D–F). From triplicate experiments, we estimated that U2OS cells express approximately 100,000 molecules of endogenous MYC per cell and that this number increases to approximately 3x106 molecules of MYC upon induction with 1 μg/ml doxycycline (see calculations in Supplementary file 1). We performed immunofluorescence to estimate the cell-to-cell variation in MYC levels (Figure 2—figure supplement 2). Induction of MYC expression by doxycycline was also observed in situ by immunostaining for MYC and showed that MYC is overexpressed in all cells (Figure 2—figure supplement 2A). Staining and quantification with three different antibodies demonstrated that endogenous MYC levels vary less than +/− 3.7-fold in 80% of all cells (EtOH, Figure 2—figure supplement 2B,C) and less than +/− 2.9-fold upon overexpression (doxycycline, Figure 2—figure supplement 2D,E). Previous estimates found that two human tumor cell lines derived from small cell lung cancer and multiple myeloma express up to 880,000 molecules of MYC per cell (Lin et al., 2012), confirming that upon maximal induction with doxycycline most U2OS cells express MYC levels comparable or slightly higher to levels found in human tumor cells.
We used the estimated number of cellular MYC molecules and the nuclear volume of U2OS cells (Koch et al., 2014) to calculate the nuclear concentration of MYC. Knowing both the nuclear MYC concentration and the occupancy of every promoter in cells at endogenous and exogenous MYC levels (ChIP-sequencing +/− doxycycline) allowed us to estimate affinities for all MYC bound promoters. We calculated the concentration of MYC required for half-maximal occupancy of each promoter (EC50) and used it as a measure for the apparent binding affinity (Figure 2B). Promoters with low EC50 values showed a high, and those with high EC50 values a low binding affinity toward MYC. A density plot illustrates that EC50 values of individual promoters vary over a large concentration range (Figure 2C).
To better understand the functional significance of the differences in promoter affinity, we used a modified gene set enrichment (GSE) analysis (Subramanian et al., 2005), which uses EC50 rather than expression values to test whether different biological processes can be stratified by promoter affinity towards MYC. In this analysis, promoters of MYC target genes that encode functionally related proteins are enriched if they exhibit similar EC50 values. Notably, genes encoding ribosomal proteins (RPs) showed the lowest EC50 values, followed by genes encoding proteins involved in biosynthetic processes, translation and ribosome biogenesis (Figure 2D; gene sets are shown in Supplementary file 2). These genes are thought to comprise a core signature of highly expressed MYC target genes (Ji et al., 2011). At the other extreme, genes encoding metabolite transporters, G-protein coupled receptors and genes involved in TGF-beta signaling and in the response to hypoxia are among those with the highest EC50 values (Figure 2E). We hypothesized that differences in promoter affinity enable distinct concentrations of MYC to regulate functionally different sets of target genes.
Given the high variation in EC50 values, we wondered which factors account for promoter affinity towards MYC. In a complex with MAX, MYC directly contacts E-box sequences in DNA. We initially tested whether the known DNA binding properties of MYC/MAX heterodimers can explain the EC50 values measured in ChIP-sequencing experiments. The heterodimer makes both base-specific contacts and contacts to the phosphate backbone of DNA and hence binds to canonical E-boxes (CACGTG), non-canonical E-boxes (CANNTG) and DNA with a random sequence (Nair and Burley, 2003). We modeled the binding behavior of MYC assuming canonical E-boxes and unspecific binding sites at random DNA sequences, which are in excess over the canonical E-boxes (model 1, Figure 3A, Appendix 1). This model ignores competition of MYC/MAX heterodimers with other E-box binding proteins with the same binding specificity, such as MXD/MAX complexes, as well as MITF, USF and TFE-3 (Conacci-Sorrell et al., 2014). We explored how the occupancy of canonical E-boxes in the experimentally determined range of cellular MYC concentrations depends on the dissociation constants of MYC reversibly binding to canonical E-boxes (KEbox) or unspecific binding sites (KNNNNNN) (Figure 3B; note that all simulations show steady state solutions discussed in Appendix 1). The simulations demonstrate that occupancies of canonical E-boxes above 90% can be observed for certain combinations of dissociation constants. In contrast, occupancy of unspecific binding sites is much smaller than 1% in the considered parameter space (Figure 3C) illustrating that the number of unspecific binding sites strongly exceeds the number of canonical E-boxes (Appendix 1). The model assumes that the entire genome is accessible and not blocked due to heterochromatin formation. An extended analysis investigating the impact of heterochromatin (Appendix 2) confirms that our assumption on genome accessibility hardly affects our presented results and conclusions. Values of dissociation constants have been previously determined for canonical E-boxes and for DNA with a non-E-box sequence. We fixed the dissociation constants in our subsequent model analyses to one pair of published values (Guo et al., 2014) (see red and blue lines in Figure 3B,C). The respective occupancies of canonical E-boxes and unspecific binding sites in the experimentally determined range of total MYC abundance are shown in Figure 3D (red and blue lines, respectively). Under these assumptions, MYC/MAX complexes are predicted to bind canonical E-boxes with a calculated EC50 value of 1x102 μM (Figure 3D). Comparison with the experimentally determined EC50 values showed that only 363 of 8,425 promoters show this or a higher EC50 value.
The majority of E-boxes located in promoters showed considerably lower EC50 values than those predicted by the affinity to E-box-DNA, indicating that binding to DNA alone is not sufficient to account for the chromatin occupancy of MYC observed by ChIP-sequencing. Model simulations predict that a reduction of the dissociation constant of MYC and E-boxes by about one order of magnitude shifts the EC50 value into the measured range of MYC molecules (Figure 3E). To explore potential underlying molecular mechanisms, we searched for features, which identify promoters with high affinity for MYC. Consistent with the DNA binding properties, we found a strong correlation between the occupancy of endogenous MYC and the occurrence of canonical E-box sequences in the binding region (Figure 4A, Figure 4—figure supplement 1A–F). In contrast, non-consensus E-box sequences are only moderately enriched in MYC peaks and their frequency does not positively correlate with MYC binding (Figure 4—figure supplement 1B,D-F). In addition to canonical E-box sequences in the binding region, occupancy by endogenous MYC positively correlated with overall expression of the respective gene (Figure 4B) and with features of open chromatin, such as trimethylation of histone H3K4 (Figure 4C). This observation is in agreement with several previous reports (Guccione et al., 2006; Guo et al., 2014; Lin et al., 2012; Nie et al., 2012). Recent work has identified WDR5, a WD40-repeat-containing protein, which is a part of the MLL/SET methyltransferases that methylate H3K4 and the MOF/NSL histone acetyltransferases that acetylate histone H4, as a direct interaction partner of MYC (Thomas et al., 2015). MYC binds to WDR5 with a KD of 9.3 µM via MYC BoxIII (Thomas et al., 2015), a domain that is not part of the DNA-binding domain, suggesting that binding of MYC to WDR5 occurs independently of binding to DNA. A modified model (model 2; Figure 3F; see also Appendix 1) that assumes (i) that WDR5 is constantly bound to its target sites (ii) that MYC and WDR5 are free to bind to each other when both are bound to chromatin in close proximity predicts an EC50 value of 0.014 µM for MYC occupancy of an E-box in the presence of WDR5 (Figure 3G). This value is lower than the one estimated experimentally for the large majority of promoters, arguing that the MYC/WDR5 interaction is of sufficient high affinity to explain the high occupancy of promoters with low EC50 values (7,963/8,425).
The model predicts that occupancy by MYC is strongest at promoters containing both E-boxes and WDR5-binding sites but is also high for promoters bound by WDR5 but lacking E-boxes (EC50 value of 0.27 μM; Figure 3G, yellow curve). This prediction could be confirmed by stratifying all MYC bound promoters in U2OS cells by these features and analyzing the individual groups for occupancy by endogenous MYC, functional annotation and expression (Figure 4—figure supplement 2A–C). Promoters bound by WDR5 and containing consensus E-box sequences are most strongly bound by MYC (Figure 4—figure supplement 2A), enriched in genes encoding for nuclear proteins (Figure 4—figure supplement 2B) and are associated with high expression (Figure 4—-figure supplement 2C). The two central predictions of the model are that (i) binding sites are little occupied at cellular MYC concentrations if only the affinity of MYC to E-boxes is considered (Figure 3G, red curve), and that (ii) the occupancy strongly increases if the interaction of MYC with WDR5 is considered in addition (Figure 3G, green curve). These predictions are valid for at least 80% of cells when the cell-to-cell variation in the population is taken into account (Figure 2—figure supplement 2F).
To test the notion that the interaction between MYC and WDR5 is critical for the high occupancy of some promoters, we re-analyzed published ChIP-sequencing data, performed in HEK293 cells, for WDR5, wild-type MYC and a mutant allele of MYC ('MYCWBM'), which is strongly compromised in binding to WDR5 (Thomas et al., 2015). In agreement with the published data, our re-analysis showed that (i) the majority of promoters is bound by WDR5, (ii) almost all binding sites of MYC in promoters overlap with WDR5 binding and (iii) disruption of the interaction between MYC and WDR5 strongly decreases MYC binding to chromatin (Figure 4D). Strikingly, analysis of MYCWBM binding in HEK293 cells showed that eliminating MYC’s ability to interact with WDR5 most strongly affects high-affinity binding sites with low EC50 values (Figure 4E). We conclude that known interactions of MYC with DNA together with promoter-bound factors can account for the range of promoter occupancies observed in vivo.
Since target gene expression, H3K4me3 and the occurrence of E-box sequences stratify high-affinity binding sites (Figure 4A–C, Figure 4—figure supplement 1G,H), we wondered whether this correlation also holds true for MYC that is recruited to promoters upon doxycycline induction. Importantly, we found a strong anti-correlation between the recruitment of exogenous MYC on one hand and the occurrence of E-boxes (Figure 4F), the expression level of target genes (Figure 4G) and trimethylation of histone H3K4 (Figure 4H) on the other hand. This highlights that the major change in occupancy is due to MYC binding to low-affinity binding sites and to weakly expressed genes when MYC levels increase.
Previously, we found that genes encoding ribosomal proteins and proteins involved in ribosomal biogenesis are not regulated in response to doxycycline-mediated MYC overexpression (Walz et al., 2014), supporting the hypothesis that MYC binding sites in their promoters are fully occupied by MYC at endogenous levels. To exclude the possibility that these genes are not regulated by MYC in U2OS cells, we depleted endogenous MYC in U2OS cells by siRNA and analyzed gene expression by RNA-sequencing (Figure 5A–D) and MYC binding by ChIP-sequencing (Figure 5—figure supplement 1A–G). RNA-sequencing identified 753 genes being down- and 1224 genes being significantly up-regulated (log2FC < −1 or > 1, q-value < 0.01, Figure 5B). Expression data were analyzed by GSEA to identify functional groups of MYC-regulated genes. Strikingly, ribosomal protein genes and genes encoding proteins involved in ribosomal biogenesis are most strongly down-regulated upon siRNA treatment (Figure 5C,D). Accordingly, ChIP-sequencing upon depletion of MYC (Figure 5—figure supplement 1A,C) demonstrated global loss of MYC binding to high-affinity binding sites (Figure 5—figure supplement 1B,D) including promoters of ribosomal protein genes (Figure 5—figure supplement 1E,F). Strikingly, genes, which show the highest affinity for MYC, most strongly respond to depletion of endogenous MYC (Figure 5—figure supplement 1G). We confirmed our previous observations that these gene sets are not regulated when MYC is overexpressed (Figure 5E). To test whether similar dose-dependent transcriptional responses to MYC are also observed during tumorigenesis, we utilized data from a murine model for MYC-driven B cell lymphomagenesis (Figure 5F–H) (Sabò et al., 2014). Expression profiles are published for three different conditions, (i) resting B cells expressing only low levels of endogenous MYC ('control'), (ii) proliferating B cells expressing intermediate levels of MYC from the Eµ-MYC transgene ('pre-tumor') and (iii) transformed B cells selected for very high MYC expression (Sabò et al., 2014). Strikingly, ribosomal protein genes, hence the genes with lowest EC50 values, are strongly induced when the pre-tumor condition is compared to control cells (Figure 5F,H). In contrast, expression of ribosomal protein genes does not increase further when tumors are compared with pre-tumor cells (Figure 5G,H), although MYC protein levels increase about eight fold (Sabò et al., 2014). Conversely, some genes that are regulated by exogenous MYC in U2OS cells and have low-affinity promoters, like those encoding transmembrane transporters, are also regulated when comparing the tumor to pre-tumor condition in B cell lymphomagenesis (Figure 5—figure supplement 1H,I). Therefore, we propose that selective transcriptional responses by MYC at physiological and oncogenic levels are similar in vivo and in vitro.
To further explore the correlation between binding affinity and selective transcriptional responses, we compared median EC50 DNA binding values of gene sets that respond to depletion of endogenous MYC to EC50 values of gene sets that are regulated in response to MYC overexpression (Figure 5I). Gene sets, which are strongly regulated upon overexpression, for example, genes involved in angiogenesis and cell migration (Figure 5I, orange boxplots), show a significantly higher median EC50 value than gene sets derived from the depletion of endogenous MYC (Figure 5I, blue boxplots). We noted that also genes involved in circadian rhythm are regulated upon MYC overexpression and showed high EC50 values, consistent with the recent demonstration that disruption of the circadian clock is a critical transforming function of MYC (Altman et al., 2015). We concluded that binding affinity determines specific MYC-mediated transcriptional responses at physiological and oncogenic MYC concentrations. To further support this conclusion, we analyzed expression data from embryonic fibroblasts isolated from mice that carry floxed MYC alleles (Perna et al., 2012). Strikingly, EC50 values of genes that are down-regulated upon deletion of endogenous MYC are very low (Figure 5I, green boxplot) confirming that, as in U2OS cells, at physiological levels MYC regulates genes with the highest binding affinity. In view of reports, which demonstrate that altering cellular metabolism is a transforming mechanism of MYC (Stine et al., 2015), we analyzed the affinity of promoters of genes encoding proteins involved in selected metabolic processes and indeed observed elevated EC50 values for lipid- and purine metabolism and potentially glycolysis (Figure 5I, violet boxplots).
Our data suggest that MYC regulates different sets of genes at physiological and oncogenic levels and that these gene sets can be stratified by promoter affinity. We wondered, if there are indeed two distinct modes of regulation or if MYC-mediated responses rather vary continually upon changes in cellular MYC concentrations. To test this, we gradually manipulated MYC levels in U2OSTet-On cells by titrating the amount of doxycycline and calculated the absolute number of MYC molecules in triplicate experiments (Figure 6A, Figure 2—figure supplement 1D–F). Analysis of gene expression profiles at five different concentrations of doxycycline revealed different response patterns for different gene sets (Figure 6B), ranging from hardly any response to MYC overexpression ('structural constituent of ribosome'), medium-level responses ('ribonucleoprotein complex biogenesis'), to gene sets, which responded predominantly at very high MYC levels ('substrate-specific transporter activity'). The RNA-sequencing data enabled us to calculate EC50 values for regulation and to correlate them with EC50 values for binding. Importantly, the observed differences in dose response correlate with the promoter affinity of the genes involved in these processes (Figure 6C). We conclude that differences in promoter affinity can account for the distinct transcriptional response programs observed in biological systems exhibiting different MYC expression levels (Figure 7).
Finally, a prominent discrepancy among reports analyzing the transcriptional consequences of MYC level manipulation is the presence or absence of direct MYC-mediated repression (Lin et al., 2012; Nie et al., 2012; Sabò et al., 2014; Walz et al., 2014). As direct repression is most apparent at high levels of MYC (Wiese et al., 2015), we hypothesized that differences in MYC expression levels might explain this seeming contradiction. Direct MYC-mediated repression largely depends on complex formation with MIZ1 (Peukert et al., 1997), which is strongly reduced when expressing a point mutant of MYC (MYCV394D). By using previously published datasets (Walz et al., 2014), we identified direct MYC-repressed sets of genes assuming that they are re-expressed if MIZ1 is depleted (shMIZ1) or if MYCV394D instead of wild type MYC is expressed. Interestingly MYC/MIZ1-mediated repression takes place at high levels of MYC, both in a global analysis (Figure 6D) as well as for individual genes (Figure 6E). Accordingly, genes directly repressed by MYC ('Down (direct)' in Figure 6F), show significantly higher EC50 values than transactivated genes ('Up'), whereas indirectly repressed genes ('Down (indirect)' in Figure 6F) show affinities comparable to that of transactivated genes. We concluded that direct MYC-mediated repression is a predominant form of transcriptional regulation at high (oncogenic) levels of MYC, which may not be reached during normal cell growth (Figure 7A).
The aim of this study was to understand whether an analysis of MYC function at different protein levels in combination with mathematical modeling can reconcile two seemingly conflicting observations: Whereas MYC regulates specific genes in some biological settings ('specifier model'), in others MYC enhances expression of all genes transcribed by RNA polymerase II ('amplifier model'). Our study has led to two major conclusions.
The first conclusion is that direct interaction with DNA alone is not sufficient to account for the observed occupancies of core promoters by MYC/MAX complexes. This result is in agreement with experimental observations showing that the positions occupied by MYC/MAX complexes across the human genome closely correlate with the RNA polymerase II transcription machinery even on promoters, which do not contain E-boxes (Guccione et al., 2006; Guo et al., 2014). The conclusion is based on a modeling approach that assumes two classes of binding sites, a high-affinity class of binding sites with consensus E-boxes and a second class of binding sites containing non-specific DNA. This assumption is an over-simplification, since MYC/MAX complexes have low affinity for binding sites containing non-consensus E-box sequences, which exceeds the affinity for non-specific DNA (Blackwell et al., 1993) (Figure 4—figure supplement 1E,F). Furthermore, multiple other transcription factors recognize the same consensus sequence as MYC/MAX proteins, suggesting that there is significant competition for E-box binding among these complexes in cells (Wolf et al., 2015). As a consequence, the current model overestimates the fraction of MYC/MAX complexes bound to consensus E-boxes. These considerations reinforce the conclusion that DNA binding alone cannot account for the high occupancies of core promoters by MYC/MAX complexes in cells.
The mathematical analysis shows that the interaction of MYC/MAX with promoter-bound WDR5 can strongly enhance occupancy of core promoters. Consistent with this finding, a mutant allele of MYC unable to bind to WDR5 shows much reduced binding to high-affinity promoters. MYC does not only interact with WDR5, but with several components of the transcription machinery including P-TEFb, TBP and TRRAP (Bouchard et al., 2004; Eberhardy and Farnham, 2002; Hateboer et al., 1993; McMahon et al., 1998; Rahl et al., 2010). Additionally, we recently identified the interaction of another conserved domain of MYC, MYC BoxI, with the PAF1C complex that is located at core promoters (Jaenicke et al., 2016). We restricted our analysis to WDR5, since no dissociation constants for other protein/protein interactions of MYC have been published. However, MYC high-affinity binding sites can be observed at genomic regions, which are not in proximity to WDR5 binding sites, suggesting that other factors are involved in establishing MYC genomic binding pattern. Future ChIP-sequencing studies of MYC upon depletion of MYC interaction proteins will clarify the impact of individual factors on chromatin binding of MYC.
Most likely, therefore, genome-wide occupancies of MYC/MAX complexes result both from interactions of MYC/MAX complexes with DNA and from protein/protein interactions with resident chromatin proteins. We propose that the dependence of MYC/MAX chromatin binding on interactions with proteins that are themselves influenced by existing gene expression patterns at different physiological states of cells, contributes to specific gene regulation by MYC even though MYC´s mechanism of action is the same at all promoters (as an example, see Liu et al. (2015)). Our analysis leads to the hypothesis that conserved protein/protein interactions of MYC are critical for targeting MYC/MAX complexes to specific classes of binding sites such as core promoters in vivo.
The second conclusion that can be drawn from our data is that promoter affinity for MYC confers specificity to gene regulation. Our results show that promoters differ in their apparent affinity for MYC and demonstrate a close correlation between apparent promoter affinity measured by ChIP-sequencing and response to different concentrations of MYC measured by RNA-sequencing at different MYC levels. This correlation leads to the conclusion that a gene is regulated by MYC in a given experimental setting, if the occupancy of its promoter (or a relevant enhancer) by MYC changes in this setting. Importantly, measurements of MYC levels and modeling approaches show that promoters can be saturated by levels of MYC that are reached in proliferating cells (Figure 7A). As a result, expression of genes may not respond to MYC in a certain experimental setting (such as a doxycycline-mediated induction of high MYC levels) since the promoters are fully occupied before addition of doxycycline. This is likely to be true for strongly transcribed genes as many ribosomal protein genes and genes encoding proteins involved in ribosome biogenesis and translation. The experimentally observed specificity in gene regulation can therefore be explained without the need to invoke productive and non-productive modes of interaction of MYC with chromatin.
Consistent with this notion, the analysis of MYC-driven lymphomagenesis shows that MYC can contribute to oncogenic transformation via two different mechanisms: low levels of constitutive MYC enhance expression of genes with high-affinity promoters, such as genes encoding proteins involved in ribosome function. This is seen when pre-tumorigenic B-cells are compared with control lymphocytes. MYC levels further increase at the transition to frank lymphomas and there is clear evidence for further increases in MYC levels during progression of several tumor entities, as for example in APC-mutant colorectal tumors (Kress et al., 2011; Myant and Sansom, 2011; Siemens et al., 2011). This additional increase in MYC levels will enhance occupancy of promoters that are lowly occupied in normal and proliferating cells. Since differences in apparent affinity and in EC50 values stratify classes of promoters of genes that encode proteins that differ in function, we propose that this will alter the physiological and metabolic state (Stine et al., 2015) of the tumor cells (Figure 7B). Notably, promoters with the lowest apparent affinity are found in genes that encode proteins involved in nutrient transport, G-protein coupled receptors and in the response to hypoxia. Expression of these genes responds to further increases in MYC levels arguing that they are the basis for the selective pressure for increases in MYC levels. Since many tumors depend on ('are addicted to') high MYC levels, inhibition of proteins encoded by this class of target genes can be a rational strategy for selective inhibition of MYC-dependent tumor growth.
Finally, we note that virtually all genes that are repressed by MYC/MAX in complex with MIZ1 display low apparent affinities and high EC50 values, suggesting that MIZ1-mediated repression occurs predominantly at high MYC levels. MIZ1-mediated repression is required for tumor formation in MYC-driven lymphoma (van Riggelen et al., 2010) and medulloblastoma (Vo et al., 2016), suggesting that targeting the complex may provide a second strategy with a significant therapeutic window for MYC-dependent tumors.
U2OS, HeLa, HEK293 and HCT116 cells were cultivated in DMEM (Sigma-Aldrich, St. Louis, USA) supplemented with 10% FCS (EMD Millipore, Billerica, USA) and 1% penicillin/streptomycin (Sigma-Aldrich). IMECs and HMLE cells were cultivated in DMEM/F12 (co) supplemented with 1% penicillin/streptomycin (Sigma-Aldrich), 10 µg/ml insulin (Sigma-Aldrich), 0.5 µg/ml hydrocortisone (Sigma-Aldrich) and 20 ng/ml EGF (Life Technologies, Carlsbad, CA). MCF10A cells were cultivated in DMEM/F12 (Thermo Fisher Scientific, Waltham, USA) supplemented with 5% horse serum (Sigma-Aldrich), 10 µg/ml insulin (Sigma-Aldrich), 0.5 µg/ml hydrocortisone (Sigma-Aldrich) and 20 ng/ml EGF (Life Technologies) and 100 ng/ml choleratoxin (Sigma-Aldrich). Cell lines were tested for mycoplasma and authenticated by STR profiling (HCT116, MCF10A, HeLa, U2OS). They were not found to be on the list of commonly misidentified cell lines (International Cell Line Authentication Committee, with the exception of HEK cells) and received from research labs, companies (HCT116: LGC genomics) or public repositories (HEK293: ATCC).
For siRNA transfections, the RNAiMAX reagent (Thermo Fisher Scientific) and the Opti-MEM medium (Thermo Fisher Scientific) were used following the manufacturers’ instructions. A pool of siRNA against MYC (ON-TARGETplus SMARTpool MYC, L-003282-02-0020) and a control pool of siRNA (ON-TARGETplus Non-targeting Pool, D-001810-10-20) were purchased from GE Healthcare. Cells were harvested 48 hr after transfection.
The mRNA sequence of the dimerization and DNA-binding domain of human MYC was cloned into the pETM11 (EMBL-Heidelberg) vector. The protein was expressed as an N-terminally His-tagged protein in ArcticExpress (DE3)RIL cells (Agilent Technologies, Santa Clara, USA) after an OD of 0.8 was reached by induction with 0.1 mM isopropyl-β-thiogalactoside (IPTG) at 14°C for 18 hr. The protein was purified to homogeneity by metal affinity chromatography (Ni-NTA, Thermo Fisher Scientific) followed by anion exchange chromatography (AEC). AEC was performed using a MonoQ 10/100 GL column (GE Healthcare, Chicago, USA) equilibrated with 20 mM HEPES pH 8.0 and 0.25 M NaCl. The protein did not bind to the column but eluted in the flow through. Size exclusion chromatography was performed using a HiLoad 16/60 Superdex 200 prep grade column (GE Healthcare) to confirm its monomeric state and exclude aggregation. The protein was concentrated after AEC to 0.59 mg/ml based on the calculated extinction coefficient using ProtParam (SwissProt) and then flash frozen adding 10% glycerol for storage.
Cells were either lysed in lysis buffer (50 mM HEPES pH 7.9, 140 mM NaCl, 1 mM EDTA, 1% Triton-X-100, 0.1% Sodium deoxycholate, 0.1% SDS) with protease and phosphatase inhibitor cocktails (Sigma-Aldrich) or harvested by trypsinization, counted and lysed directly in SDS sample buffer. The same number of cells or total protein amount was loaded for each sample on Bis-Tris or Tris-Glycine gels and transferred to PVDF membranes (EMD Millipore, Billerica, USA).
To detect the MYC protein the 9E10 antibody (amino acids: 410–419) or the anti-MYC antibody (Y69: N-terminus, Abcam, Cambridge, UK) were used. To detect Vinculin the V9131 anti-Vinculin antibody (Sigma-Aldrich) was used.
The immunoblots were visualized and quantified using the LAS-4000 mini (Fujifilm, Tokyo, Japan) or the Odyssey CLx Infrared Imaging System (LI-COR, Lincoln, USA).
The number of MYC molecules/cell was calculated by comparing the signal of recombinant MYC protein with cell lysates from U2OS cells in biological triplicates in quantitative immunoblots (Odyssey CLx Infrared Imaging System). To determine the nuclear MYC concentration, the size of U2OS nuclei was obtained from (Koch et al., 2014). Biological replicates are defined by individual U2OS cell harvestings.
ChIP and ChIP-sequencing was performed as described previously (Walz et al., 2014). Briefly, cells were crosslinked with 1% formaldehyde for 10 min at room temperature. After extraction, nuclei were lysed in lysis buffer (50 mM HEPES pH 7.9, 140 mM NaCl, 1 mM EDTA, 1% Triton-X-100, 0.1% Sodium deoxycholate, 0.1% SDS). The cell lysates were sonicated to reach an average DNA fragment size of 140-300 bp. Immunoprecipitation was performed 6 hours or overnight using Dynabeads (Thermo Fisher Scientific). For MYC and MNT immunoprecipitation, the anti-MYC antibody (N262, Santa Cruz, Dallas, USA) and the anti-MNT antibody (sc-769, Santa Cruz, Dallas, USA) were used, respectively. After washing, chromatin was eluted, de-crosslinked and treated with RNase A and Proteinase K. The DNA was purified by phenol/chloroform extraction and ethanol precipitation and analyzed by qPCR using MX3000P with Sybr green Mix (Thermo Fisher Scientific, ChIP) or sequenced on a Next-Seq500 after library preparation (NEBNext ChIP-Seq Library Prep Master Mix Set, NEB, Ipswich, USA).
For RT-qPCR analysis, total cellular RNA was extracted using the peqGOLD TriFast (Peqlab, Erlangen, Germany) reagent. The first strand synthesis was performed using M-MLV Reverse Transcriptase (Thermo Fisher Scientific) and random hexamer primers (Roche, Basel, Switzerland). Results were normalized to β2-microglobulin expression. Primers are listed in Supplementary file 3. Technical replicates are defined by performing individual qPCR-analyses (CT-value estimation) with the same cDNA. Biological replicates are defined by individual U2OS cell harvestings.
For RNA-sequencing, total RNA was extracted using the RNeasy mini columns (Qiagen, Hilden, Germany) including on-column DNase I digestion. PolyA+-RNA was isolated from total RNA with the NEBNext Poly(A) mRNA Magnetic Isolation Module (NEB, Ipswich, USA). The NEBNext Ultra RNA Library Prep Kit for Illumina was used for library preparation following the manufacturer’s instruction. Size selection of the libraries was performed with the AMPure XP Beads (Beckman Coulter, Brea, USA), followed by 12 PCR cycles for amplification. Library quantification and quality was assessed using the Experion Automated Electrophoresis System (Bio-Rad, Hercules, USA). The libraries were sequenced with the Illumina Genome Analyzer IIx (Illumina, San Diego, USA).
U2OS cells were treated with doxycycline/ethanol and were fixed with 4% paraformaldehyde. Fixed cells were treated with blocking solution (10% horse serum, 2% BSA and 5% sucrose in PBS) for 45 min after washing. Primary antibodies were incubated overnight at 4°C and incubated for 1 hr with the corresponding fluorescently labeled secondary antibodies at room temperature after washing (Tris-buffered saline with 0.1% Tween-20). Cells were mounted on object glass slides using aqua-fluoromount (Sigma) after washing and imaged under a confocal microscope (Nikon Ti-Eclipse). For fluorescent intensity measurement, ImageJ 1.50hr (Schneider et al., 2012) was used. Briefly, z stacked confocal images were converted to 12-bit images via maximum intensity projection and ROI assigned using ROI tool. The intensity was measured per cell nuclei for Hoechst and MYC. The fold change distribution of MYC was calculated by first taking MYC intensity normalized to Hoechst and then further normalizing the values to the median of the set of measurements. Finally Log2 of the median normalized values were displayed as density plots on R using 'sm' library density function.
For RNA-sequencing, reads were aligned to the human reference genome (hg19) using BOWTIE v.0.12.8. The obtained BAM files were used for further analysis with R/BioConductor. For differential gene expression analysis EdgeR was employed. The rpkm values were calculated by mapping the read counts to the exons of the corresponding gene. Heat maps showing the gene expression changes were obtained using R.
ChIP-sequencing data were obtained from (Walz et al., 2014) and analyzed similarly. The MYC peaks were visualized using the wig files from MACS v.1.4.2 (Feng et al., 2012) and the Integrated Genome Browser software (Nicol et al., 2009). ChIP-sequencing samples from (Walz et al., 2014) were sequenced deeper for visualizing purposes.
Peak calling was performed with MACS v.1.4.2 (parameters: --keep-dup 3, variable p-value cut-off), SICER v.1.1 (parameters: redundancy 1, window 200, fragment size 90, effective genome fraction 0.74, gap 0, FDR 0.01) (Zang et al., 2009) and CCAT v.3.0 (parameters: default for transcription factors) (Xu et al., 2010). Overlapping peaks were determined with the 'intersectBed' function from BEDTools v.2.17.0 (Quinlan, 2014). Promoter regions are defined as –1.5 kb to +0.5 kb relative to the transcriptional start site if not indicated otherwise.
To analyze the occurrence of E-boxes in the MYC peak the CentriMo tool from the MEME Suite (Bailey and Machanick, 2012) was used. The empirical probability of consensus and non-consensus E-boxes was plotted with a moving average of 20 bp.
To calculate the distribution around the TSSs (heat maps) or the center of the MYC peak (occupancy), Seqminer v.1.3.3 was used. Heat maps were visualized using Treeview. MYC and H3K4me3 occupancies (in – and + Dox) were calculated using a window of ± 100 bp for the center of the MYC peak and used to obtain MYC recruitment. DNA binding of MYC and a mutant of MYC that is compromised in binding to WDR5, was analyzed similar: 5x107 reads were randomly chosen from fastq files (GSE60897), mapped with BOWTIE and reads in a window of ± 100 bp around the center of the MYC peak were counted using Seqminer. For Venn diagrams, publicly available peak annotations (GSE60897) were used and the overlap was calculated using Bedtools. For binned plots, genes were sorted and divided in equally sized-groups and the median or mean values are shown. As few genes show very high EC50 values, sometimes one or two bins were not shown in the plots, for clarity.
EC50 values for MYC binding at each promoter were calculated using the MYC nuclear concentrations and MYC occupancy in – and +Dox. MYC occupancy was calculated by subtracting the tag values of the input from the tag values of the MYC ChIP and resulting negative values were set to 0. The Michaelis-Menten model was used to perform the fitting of the data in GraphPad Prism and to calculate maximal occupancy of MYC and the EC50 values.
For the estimation of the EC50 values for regulation of MYC target genes, the MYC depletion RNA-sequencing experiment and the Dox titration RNA-sequencing experiment were used. The fold change of expression for each gene was calculated relative to siMYC. Only activated genes were considered. The EC50 values were obtained using the fold change values and the nuclear MYC concentrations. Michaelis-Menten model was used to perform the fitting of the data in GraphPad Prism. To statistically test the difference of the distribution of the EC50 values of different gene sets, the Mann-Whitney-Wilcoxon test was used. For functional analysis of the genes, Gene Set Enrichment Analyses (GSEA) were performed with the C5 collection of gene sets from the MSigDB (http://www.broadinstitute.org/gsea/msigdb) using the GSEAPreranked tool, where indicated. Gene ontology analysis was performed using DAVID (Huang et al., 2009) with default settings and gene ontology (GO) terms from the domains 'cellular component', 'molecular function' and 'biological process' as database. Sample size was not estimated by statistical methods but chosen based on common standards.
The temporal changes of the system’s components are described by sets of ordinary differential equations (ODEs) and algebraic equations describing conservation relations. Model design, the particular equations and parameters are listed in the Appendix 1. An extended model analysis is provided in Appendix 2. Steady-state solutions were calculated by setting all time derivatives to zero and solving the resulting algebraic equation system numerically. All calculations were performed using Mathematica 10.2 (Wolfram Research, Inc.).
The data sets supporting the results of this article are available in the Gene Expression Omnibus repository, GEO: GSE77356. Data access: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE77356
Appendix 1 provides the detailed description of the mathematical models 1 and 2. Schemes of models 1 and 2 are provided in Appendix 1—figure 1.
The hexamer sequence NNNNNN excludes the CACGTG sequence.
The ratios of dissociation and association rate constant are substituted by the dissociation constants and .
MYC can reversibly bind to four independent classes of DNA binding sites: CACGTG sequences (Ebox), unspecific DNA sequences (NNNNNN), WDR5 and CACGTG sequence simultaneously (EboxWDR5), as well as WDR5 and unspecific DNA sequences simultaneously (NNNNNNWDR5).
The hexamer sequence NNNNNN excludes the CACGTG sequence. Eboxtotal excludes CACGTG sequences in the vicinity of WDR5 (EboxWDR5total).
The ratios of dissociation and association rate constant are substituted by the dissociation constants , , , and .
Occupancy of E-boxes is defined by the proportion of E-boxes bound by MYC in total number of E-boxes in percentage at steady state:
Likewise, occupancy of NNNNNN sequences, of Ebox-WDR5 and NNNNNN-WDR5 binding sites are given by
Our modelling approach partially builds on data of ChIP-sequencing experiments. ChIP-sequencing experiments are based on the general notion that the number of DNA-bound molecules (tags) presumably does not change (at least) for the time frame of conducting the experiment. We therefore use the steady state assumption for calculating apparent dissociation constants, which imply ratios on dissociation and association rate constants.
The amount of MYCtotal is varied up to 4.6·106 molecules, which is the maximal amount experimentally measured in U2OS cells under different treatments (EtOH, Dox, and siRNA; see Supplementary file 1).
Walz, Lorenzin et al. show that Dox-induced MYC is predominantly localized in U2OS cell nuclei ([Walz et al., 2014], extended data Figure 1B). We therefore consider only the nuclear volume of an U2OS cell. The nucleus dimensions were quantified as approximately 16 µm in width, 23 µm in length, and 4 µm in height (Koch et al., 2014). We assumed an ellipsoid to calculate the volume of an U2OS cell nucleus.
Genometotal is the number of considered base pairs estimated for the female hypertriploid karyotype of U2OS cells. Total base pairs are counted in the haploid unmasked human reference genome hg19 without taking the Y chromosome into account and excluding 'N' and 'chrN_random-sequences'. The number is multiplied by 2 to account for diploid cells and subsequently multiplied by a factor of 1.7 (Ben-Shoshan et al., 2014) to account for the hypertriploid karyotype of U2OS cells (http://www.lgcstandards-atcc.org/products/all/HTB-96.aspx#characteristics) (Janssen and Medema, 2013).
Eboxtotal is calculated by counting CACGTG sequences in the human reference genome hg19 without taking the Y chromosome into account and excluding 'N' and 'chrN_random-sequences'. The number is multiplied by 2 and 1.7 to account for the karyotype of U2OS cells.
The total number of possible hexameric sequences (NNNNNNtotal) in the genome of U2OS cells is estimated by subtracting Eboxtotal from Genometotal.
To convert the published values of dissociation constants given in [nM] and [µM] into [molecules] the average volume of a single U2OS cell nucleus listed in Appendix 1—table 1 is used.
To obtain Eboxtotal, NNNNNNtotal, EboxWDR5total, and NNNNNNWDR5total, we first downloaded processed data of a WDR5-ChIPseq experiment (Thomas et al., 2015) from Gene Expression Omnibus database (GSE60897), which comprised 17,171 WDR5 peaks. Next, we identified 27,853 unique RefSeq-TSS (USCS, hg19.refGene), which enumerated to 22,346 promoters assuming –1.5 kb to +0.5kb as promoter region and joining overlapping promoter regions. 3,408 of these 22,346 promoters contain WDR5 as well as a CACGTG sequence. This number (3,408) is multiplied by 2 and 1.7 to account for the karyotype of U2OS cells yielding EboxWDR5total. Eboxtotal is given by subtracting EboxWDR5total from Eboxtotal listed in Appendix 1—table 2. NNNNNNWDR5total is calculated by subtracting EboxWDR5total from all called WDR5 peaks (i.e., 17,171 WDR5 peaks multiplied by 2 and 1.7 to account for U2OS cell karyotype). NNNNNNtotal results from subtracting the sum of Eboxtotal, EboxWDR5total, and NNNNNNWDR5total from Genometotal.
The dissociation constant of MYC and EboxWDR5 is estimated by multiplying the individual dissociation constants of MYC and WDR5 (9.3·10–6 M; [Thomas et al., 2015) as well as MYC and E-boxes (0.1·10–9 M; [Guo et al., 2014]). This approach is based on our presumption that under constant pressure the standard change of Gibbs free energy at equilibrium of reversible binding of MYC to EboxWDR5 is equal to the sum of the standard change of Gibbs free energies at equilibria of reversible MYC-WDR5-association and reversible MYC- CACGTG-association. Similarly, the dissociation constant of MYC and NNNNNNWDR5 is estimated by multiplying the individual dissociation constants of MYC and WDR5 (9.3·106 M; (Thomas et al., 2015]) as well as MYC and unspecific DNA (20·109 M; [Guo et al., 2014]).
Appendix 2 provides an extended analysis and discussion of the modelling approach.
The mathematical models introduced in the manuscript consider two kinds of parameters: dissociation constants (, , , and ) and total numbers of molecules and binding sites (, ,, , and ). In Appendix 1, we provide a particular set of parameter values for each model. Here, in Appendix 2, we use model 1 to comprehensively dissect the impact of model parameters on the occupancy of E-boxes by MYC. We investigate:
how the occupancy of E-boxes by MYC depends on the total number of MYC molecules allowing for a wide range of dissociation constants and , and
to what extend this dependency is affected by variations of E-box and NNNNNN sequence counts to estimate the impact of heterochromatin or altered karyotype.
Our comprehensive analysis reveals that a higher affinity of MYC to E-boxes than NNNNNN sequences is sufficient to observe higher occupancy of E-boxes than of NNNNNN sequences by MYC in the measured range of MYC abundance. Second, we demonstrate that the dependence of E-boxes occupancy and NNNNNN sequence occupancy on MYC in the measured range of MYC abundance is strongly influenced by the ratio of the respective dissociation constants. Finally, we show that a variation of the number of E-boxes and NNNNNN sequences to account for heterochromatin or deviation of karyotype hardly affects our results.
As mentioned above, five parameters characterize model 1: , , , , and . Since the values of and are relatively defined for a given genome, we will initially fix to 9.3·105 and to 9.5·109 in our analysis to reduce parameter space. These values are calculated from human reference genome (hg19) assuming the karyotype of U2OS cells (see comments in Appendix 1). In a second step, we will extend our analysis to investigate the impact of other values of total number of E-boxes and NNNNNN sequences to take heterochromatin and karyotype alterations into account.
The values of were experimentally determined to range from 7.1·103 to 4.6·106 molecules (Supplementary file 1). Therefore, is varied in our analyses over orders of magnitude on a logarithmic scale from 3·103 to molecules.
The values of the dissociation constants and are varied over a very wide range (from 10–7 to 107 molecules) in this analysis to cover all possible conditions.
Appendix 2–figure 1 illustrates regions of certain occupancy in the three dimensional parameter space of dissociation constants and as well as total amount of MYC . It shows that higher occupancy of E-boxes (Appendix 2–figure 1A) than of NNNNNN sequences (Appendix 2–figure 1B) is observable for most combinations of dissociation constants and in the considered range of total amount of MYC. For instance, occupancy of E-boxes can reach 90% at values in the order of 106 molecules. In contrast, the occupancy of NNNNNN sequences is generally low in the order of 0.1% or less. This is mainly due to the vast excess of NNNNNN sequences compared to E-boxes.
Appendix 2–figure 2 shows the parameter space of Appendix 2–figure 1A from two different perspectives, i.e. rotated into the - -plane and into the --plane. For simplification only the surface of 50% occupancy of E-boxes by MYC is shown. Looking from these two perspectives, it seems that a critical total amount of MYC is necessary to observe occupancies above 50% and the dissociation constant must be larger than the dissociation constant . Indeed, an analytical analysis proofs (see below) that is both a necessary and a sufficient condition to observe . This implies that even in the case of small differences in the dissociation constants of MYC to E-boxes and NNNNNN sequences, differential occupancies occur.
Next, we investigate how strongly the occupancies of E-boxes and NNNNNN sequences can differ. To explore whether absolute values or ratios of the dissociation constants and play an important role to separate the two occupancy curves, we vary the value of as well as the ratio of to in a systematic fashion (Appendix 2–figure 3). We observe that for a given value of , the difference between the occupancy of E-boxes and NNNNNN sequences is the greater, the greater the ratio of to . Moreover, for a given ratio of the dissociation constants, the value of hardly impacts the occupancy curves. This demonstrates that the ratio of the dissociation constants but not their absolute values predominantly determines whether distinct occupancies of E-boxes and NNNNNN sequences can be observed. Please note that the changes of in the calculations for Figure 3E simultaneously changed the ratio of to and therefore affected the EC50 value.
So far, a fixed number of E-boxes and NNNNNN sequences was considered in the simulations ( and ). Taking into account that the euchromatic sequence makes up approximately 90% of the human genome (International Human Genome Sequencing Consortium, 2004) and the karyotype of U2OS cells substantially differs from that of normal human cells (Ben-Shoshan et al., 2014), the impact of the number of E-boxes and NNNNNN sequences on occupancy of E-boxes is now investigated. The analysis shows that a change in the numbers of E-boxes and NNNNNN sequences hardly affects the occupancy of E-boxes with respect to total MYC (Appendix 2–figure 4).
Using equations 1–11 (Appendix 1), we derive the following equation system:
This yields expressions of and :
Assuming holds, then
Similarly it can be proven that if then holds.
MYC disrupts the circadian clock and metabolism in cancer cellsCell Metabolism 22:1009–1019.https://doi.org/10.1016/j.cmet.2015.09.003
Binding of myc proteins to canonical and noncanonical DNA sequencesMolecular and Cellular Biology 13:5216–5224.https://doi.org/10.1128/MCB.13.9.5216
Myc recruits P-TEFb to mediate the final step in the transcriptional activation of the cad promoterJournal of Biological Chemistry 277:40156–40162.https://doi.org/10.1074/jbc.M207441200
Myc-binding-site recognition in the human genome is determined by chromatin contextNature Cell Biology 8:764–770.https://doi.org/10.1038/ncb1434
TATA-binding protein and the retinoblastoma gene product bind to overlapping epitopes on c-Myc and adenovirus E1A proteinProceedings of the National Academy of Sciences of the United States of America 90:8489–8493.https://doi.org/10.1073/pnas.90.18.8489
Confinement and deformation of single cells and their nuclei inside size-adapted microtubesAdvanced Healthcare Materials 3:1753–1758.https://doi.org/10.1002/adhm.201300678
MYC: connecting selective transcriptional control to global RNA productionNature Reviews. Cancer 15:593–607.https://doi.org/10.1038/nrc3984
Redeployment of Myc and E2f1-3 drives Rb-deficient cell cyclesNature Cell Biology 17:1036–1048.https://doi.org/10.1038/ncb3210
BEDTools: The swiss-army tool for genome feature analysisCurrent Protocols in Bioinformatics 47:1–34.https://doi.org/10.1002/0471250953.bi1112s47
Inhibition of Myc family proteins eradicates KRas-driven lung cancer in miceGenes & Development 27:504–513.https://doi.org/10.1101/gad.205542.112
Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profilesProceedings of the National Academy of Sciences of the United States of America 102:15545–15550.https://doi.org/10.1073/pnas.0506580102
Taming of the beast: shaping Myc-dependent amplificationTrends in Cell Biology 25:241–248.https://doi.org/10.1016/j.tcb.2014.10.006
Chi Van DangReviewing Editor; University of Pennsylvania, United States
In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.
Thank you for submitting your article "Different promoter affinities account for specificity in MYC-dependent gene regulation" for consideration by eLife. Your article has been favorably evaluated by Jessica Tyler (Senior editor) and three reviewers, one of whom, Chi Dang, is a member of our Board of Reviewing Editors. The following individual involved in review of your submission has agreed to reveal their identity: David Levens (peer reviewer).
The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission. The requested revision mostly requires additional analysis of extant data and does not necessarily require new experiments. For major point #1, it is possible that you may have the data already as alluded to in the Walz et al. paper (Nature 2014; extended Figure 1 immunofluorescent microscopy study).
The manuscript by Lorenzin et al. addresses the conundrum of widespread MYC binding to the genome and its specific regulation of various cell biological functions in the context of its general gene activation function, seemingly without any specificity. By measuring MYC DNA binding affinities, its association with WDR5, the authors were able to stratify promoters by change in MYC occupancy. The data indicate that low-affinity promoters show the greatest increase in MYC recruitment at high levels of MYC whereas the high affinity promoters show very little change over the same range. This result is similar to that of Fernandez et al. (Fernandez PC Genes & Dev 2003) that employed large-scale ChIP to propose MYC occupancy dependance by promoter affinity. Using mathematical modeling, Lorenzin et al. are able to assign a binding affinity to each promoter (EC50) and make the intriguing conclusion that E box binding alone is not strong enough to explain how MYC is tethered to the majority of its DNA targets. Finally, the authors measure gene expression by titrating the level of MYC which, consistent with their model, reveals a correlation between promoter affinity and the level to which expression from these genes was increased. Specifically, they demonstrate that high-affinity binding sites are almost fully occupied by MYC and could not be further bound, resulting in a seemingly lack of gene activation by ectopic MYC. This is best illustrated in an artificial system of U2OS cells that have relatively low MYC levels despite being a cancer cell line. Upon activation of ectopic MYC, genes encoding ribosome biogenesis genes, which have high-affinity MYC binding sites, could not be further bound and hence do not seem to be activated. However, knockdown of endogenous MYC in U2OS cells resulted in diminished binding of MYC and expression of ribosome biogenesis genes. By monitoring MXD at E-boxes, the authors show an in vivo competition with MYC that is sensitive to MYC levels. The authors also show that MIZ repression explains most of the negative gene regulation with MYC that is again dependent on the level of MYC binding. Overexpression of MYC, on the other hand, results in MYC binding to lower affinity sites, resulting in ectopic activation of genes such as those involved in angiogenesis and suppression of cell adhesion genes. The authors also use publicly available data for the MYC-mediated mouse lymphoma and also documented MYC's role in inducing ribosome biogenesis genes (which was previously documented by Ji et al. PLoS One 2011). Using data deposited on WDR5, the authors corroborate the role of WDR5 in recruiting MYC to DNA for gene regulation. In addition, the experimental data, the authors also provide mathematical modeling, providing further insights into how MYC binding affinities affect gene expression and drive specific biological functions.
Overall, this body of work is compelling, although the authors should openly acknowledge that the U2OS cell line is complicated by its intolerance of high expression of MYC (documented in Walz et al. Nature 2014; extended Figure 1), resulting in cell death upon prolonged exposure to high MYC levels (>3days). This nuance could well skew MYC's ectopic activities that are specific for this cell line and hence not generalizable to cancer cells driven by high MYC such as Burkitt's lymphoma or neuroblastoma. For example, the authors should compare the 'ectopic' MYC targets among the different datasets (U2OS, Emu-MYC normal-pretumor-tumor samples, and other datasets referenced in the manuscript). Are there cell-type-independent MYC ectopic target genes similar to cell-type-independent core MYC targets that are involved in ribosome biogenesis?
Most of the conclusion presented in the manuscript i.e., the EC50 values assigned to promoter and the necessity of secondary contacts of MYC to DNA via protein-protein interactions derive from the author's seminal experiments to determine the number of MYC molecules per cell. If these initial numbers are wrong, then the values used for calculations in subsequent experiments are dubious. The authors should address the following points, that if dealt with would increase the significance and rigor of the work.
1) The authors use recombinant, purified protein to calibrate immunoblots in order to calculate the absolute number of MYC molecules per cell using cell lysate. This method however will only indicate the average level of MYC for the entire population. It remains possible that some cells express no MYC while others express levels of MYC far exceeding the average. The authors should employ immunofluorescence or a similar technique to demonstrate MYC expression in their system is relatively homogenous.
2) In Figure 1A, the authors define the amount of MYC in their U2OS cells -/+ Dox. It appears that the amount of MYC in ETOH treated cells is beyond the range of their lowest standard (compare Figure 1A to Figure 2A). Indeed, if the fold range of expression is truly 14-fold as presented in the manuscript, then standards covering only a four-fold range (2.4 ng to 9.6 ng in Figure 2A) is insufficient. The authors should repeat the experiment with an expanded range of standards to insure that they have assigned the correct number of MYC molecules to the 'control' condition.
3) The author's analysis of transcription upon knock-down of MYC revealed that high-affinity promoters showed the greatest change in gene expression upon MYC depletion. This finding seems to be in disagreement with the finding from Nie et al. (Nie Z, 2012) in which B cells expressing very low levels of MYC show a proportional decrease of MYC binding to all genomic targets. Can the authors resolve this discrepancy? At the very least, the authors should perform ChIPseq upon MYC knock-down in their U2OS model and determine promoter occupancy to complement their gene expression experiments.
4) There appears to be somewhat of a disconnect between the accounting of MYC binding at promoters and the heat map shown in Figure 1C. From the heat map across all genes it appears that MYC follows a monotonic binding pattern all the way to the bottom (I don't see a negative region in the heat map) – in contrast the accounting of MYC positive versus negative promoters is highly dependent on the selection of p-values and/or false discovery rates to create an arbitrary distinction between + and –. Some discussion of this may be warranted. Perhaps the negative peaks simply reflect a continuum of MYC binding as it slides below the level of the technical background.
5) There are several issues related to the model. The authors only model the influence of WDR in concert with E-boxes; what is the level of binding and the influence of MYC at WDR+, E-box- promoters? This class should be considered, too. What is the extent of MYC binding, expression levels and functional GSEA for this class of promoters?
6) The model assumes equilibrium (or at least steady state) binding. While this is often a reasonable assumption, at promoters, in vivo, there are many energy-dependent and kinetic steps, that might skew that apparent on and off rates. For example, the authors' own previous work showed that MYC degradation during the transcription-cycle might be required to increase expression. Such degradation will increase the apparent MYC off-rate, artificially depressing apparent affinity of MYC for the promoter. The kinetic limitations of the model should be discussed in the text or in the supplemental material.
7) Some discussion of expression levels for the various functional classes of MYC-target promoters is warranted. It has previously been suggested that MYC binding is related to output levels. It may be that some cellular processes (such as translation) demand high output, and that it is the high level of transcription at these genes that licenses MYC binding, that in turn stimulates more transcription. Accompanying Tables of genes in groups identified in the Figures should be provided. (e.g., ribosome, extracellular matrix, solute transporter etc.). Further, the authors should provide an analysis of selected metabolic pathways, such as glycolysis and glutaminolysis, that may be regulated in U2OS cells.https://doi.org/10.7554/eLife.15161.035
- Apoorva Baluapuri
- Elmar Wolf
- Lisa Anna Jung
- Jana Wolf
- Martin Eilers
- Martin Eilers
- Elmar Wolf
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
We are grateful to Bruno Amati for sharing unpublished data. We thank Renate Metz, Angela Grün, Barbara Bauer and André Kutschke for excellent technical assistance and Laura Jänicke for her help in culturing non-transformed cell lines. We acknowledge all members of the ME/EW laboratory for helpful discussions and comments during the course of these experiments. This work was supported by a grant from the Deutsche Forschungsgemeinschaft to EW (Emmy Noether Research Group, WO 2108/1-1) and to ME (Ei 222/12-1) and, the German Excellence Initiative via the Graduate School of Life Sciences (University of Würzburg) to EW and AB, the Studienstiftung des deutschen Volkes to LAJ as well as by the German Federal Ministry of Education and Research (BMBF) within the framework of the e:Med research and funding concept to ME (01ZX1307G) and JW (01ZX1307F).
- Chi Van Dang, Reviewing Editor, University of Pennsylvania, United States
- Received: February 10, 2016
- Accepted: June 13, 2016
- Version of Record published: July 27, 2016 (version 1)
© 2016, Lorenzin et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.