Author Response
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
The authors have previously employed micrococcal nuclease tethered to various Mcm subunits to cut the DNA to which the Mcm2-7 double hexamers (DH) bind. Using this assay, they found that Mcm2-7 DH are located on many more sites in the S. cerevisiae genome than previously shown. They then demonstrated that these sites have characteristics consistent with origins of DNA replication: they contain ARS consensus sequences, coincide with very inefficient sites of initiation of DNA replication in vivo, are free of nucleosomes, contain a G-C skew, and are located in intergenic regions of the genome. The authors suggest, consistent with published single molecule results, that there are many more potential origins in the S. cerevisiae genome than previously annotated.
The results are convincing and are consistent with prior observations. The analysis of the origin associated features is informative.
Reviewer #2 (Public Review):
By mapping the sites of the Mcm2-7 replicative helicase loading across the budding yeast genome using high-resolution chromatin endogenous cleavage or ChEC, Bedalov and colleagues find that these markers for origins of DNA replication are much more broadly distributed than previously appreciated. Interestingly, this is consistent with early reconstituted biochemical studies that showed that the ACS was not essential for helicase loading in vitro (e.g. Remus et al., 2009, PMID: 19896182). To accomplish this, they combined the results of 12 independent assays to gain exceptionally deep coverage of Mcm2-7 binding sites. By comparing these sites to previous studies mapping ssDNA generated during replication initiation, they provide evidence that at least a fraction of the 1600 most robustly Mcm2-7-bound sequences act as origins. A weakness of the paper is that the group-based (as opposed to analyzing individual Mcm2-7 binding sites) nature of the analysis prevents the authors from concluding that all of the 1,600 sites mentioned in the title act as origins. The authors also show that the locations of Mcm2-7 after loading are highly similar in the top 500 binding sites, although the mobile nature of loaded Mcm2-7 double hexamers prevents any conclusions about the location of initial loading. Interestingly, by comparing subsets of the Mcm2-7 binding sites, they find that there is a propensity of at least a subset of these sites to be nucleosome depleted, to overlap with at least a partial match to the ACS sequence (found at all of the most well-characterized budding yeast origins), and to show a GC-skew, each of which is a characteristic of previously characterized origins of replication.
Overall, this manuscript greatly broadens the number of sites that are capable of loading Mcm2-7 in budding yeast cells and shows that a subset of these additional sites act as replication origins. Although these sites do have a propensity to include a match to the ACS, these studies suggest that the mechanism of helicase loading in yeast and multicellular organisms is more similar than previously thought.
Reviewer #1 (Recommendations For The Authors):
Specific Comments:
- The proposal, based on this study, that replication in S. cerevisiae is similar to that in Human cells (mentioned in the abstract, introduction and end of discussion) is not supported by the evidence, either in this paper or elsewhere. The authors suggest that even these inefficient origins are directed by specific sequences that load Mcm2-7 DH, but there is no evidence that this occurs outside a limited clade of budding yeasts and certainly not in human cells. Furthermore, the distribution and efficiency of origins of replication in Human cells have not been shown to parallel the findings in this paper. Thus, the conclusion should be removed since it makes a statement that S. cerevisiae and Human cells have similar mechanisms for origin location. This might confuse non-specialists who do not appreciate the subtleties.
The reviewer's concern that we could confuse non-specialists is well-founded. We have made the following changes to emphasize the point that, while a wider distribution of origins makes S phase in yeast more like that in humans, the genome replication programs in the two organisms remain distinctly different:
- The last sentence of the abstract was changed as follows:
a. These results shed light on recent reports that as many as 15% of replication events initiate outside of known origins, and they reveal S phase in yeast to be surprisingly similar to that in humans.
b. These results shed light on recent reports that as many as 15% of replication events initiate outside of known origins, and this broader distribution of replication origins suggests that S phase in yeast may be less distinct from that in humans than is widely assumed.
- A sentence in the results was changed as follows:
a. Another characteristic of known origins that we could use as a criterion to assess the nature of Mcm binding sites is the presence of an ACS.
b. Another characteristic of known origins in S. cerevisiae (although not in most other organisms) that we could use as a criterion to assess the nature of Mcm binding sites is the presence of an ACS.
- We changed the last sentence of the Discussion as follows:
a. On the other hand, the sharply focused nature of its replication origins made S phase in yeast appear distinct from that in other organisms. Our discovery that sites of replication initiation in yeast are much more widely dispersed than previously believed, with at least 1600 and possibly as many as 5500 origins, emphasizes its continued relevance to understanding genome duplication in humans.
b. On the other hand, the sharply focused nature of its replication origins made S phase in yeast appear distinct from that in other organisms. Although by no means eliminating this distinction, our discovery that sites of replication initiation in yeast are much more widely dispersed than previously believed, with at least 1600 and possibly as many as 5500 origins, emphasizes yeast's continued relevance to understanding S phase in humans.
- The authors discuss in the introduction that origins in S. cerevisiae are equivalent to ARS sequences. Why didn't they ask if the inefficient origins also confer ARS activity? This would be a valuable addition and a very simple experiment.
The inefficient origins are not expected to confer ARS activity, because origins that are not licensed in essentially every G1 will be diluted out by cell division. We confirmed the absence of our inefficiently licensed origins in a data set generated by high throughput sequencing of a genomic library that was selected for origin activity (PMID: 23241746), but we did not note the results of this analysis in our manuscript, because the low complexity of the library used made this negative result uninformative. To clarify this point, we added the bolded clauses to the following sentences in the Introduction and Discussion:
- Origins vary widely in their efficiency, with some being used in almost every cell cycle while others may be used in only one in one thousand S phases (Boos and Ferreira, 2019), with only the former being capable of supporting plasmid replication in the traditional ARS assay.
- "Thus, we can detect Mcm complexes that are loaded in as few as 1 in 500 cells (Foss et al., 2021), even though such low affinity Mcm binding sites are not expected to be capable of supporting autonomous replication of a plasmid."
- While the authors have shown that Mcm2-7 is loaded adjacent to the principal ARS consensus sequence, consistent with biochemical studies on pre-RC assembly, two reports have shown that the Mcm2-7 ChIP is dependent on the B2 element of ARS1, but the ORC ChIP is not, suggesting that Mcm2-7 is loaded there (See Lipford and Bell, Mol. Cell 2007 and Zou and Stillman, Mol. Cell. Biol. 2000).
We have added the following two sentences in the Results section to note these reports:
"Furthermore, in the case of ARS1, two reports have demonstrated a requirement for the B2 element for Mcm loading, though not for Orc binding, suggesting that Orc may bind to the ACS but then load Mcm at the B2 element (Zou and Stillman 2000; Lipford and Bell 2001). This would still leave Mcm loaded downstream of the ACS, but we note this result to emphasize that not all details of Mcm loading in vitro have been definitively established."
Reviewer #2 (Recommendations For The Authors):
Specific points:
- The authors state "It is notable that the Mcm-ChEC panel of Figure 3A shows no obvious change in Mcm stoichiometry across the entire range, from low abundance, at the bottom, to high abundance, at the top." The ChEC method does not intrinsically measure stoichiometry so this conclusion needs more explanation. The authors appear to be referring to the distribution of Mcm2-7 reads being similar across all origins, but this does not measure how many double hexamers are present at an origin. If the stoichiometry argument is based on a finding that each origin has only a single 60 bp region that is protected by Mcm2-7 (rather than a distribution of 60 bp regions spread across the origin), then the authors should provide more compelling evidence than what is shown in Fig. 3A.
We agree with the reviewer that our conclusion needs more explanation, and we have therefore made the following change, which we believe clarifies the point that we were trying to convey:
Original version: It is notable that the Mcm-ChEC panel of Figure 3A shows no obvious change in Mcm stoichiometry across the entire range, from low abundance, at the bottom, to high abundance, at the top. This argues against models in which higher replication activity at more active origins reflect the loading of more Mcm double-hexamers at those origins within a single cell.
Updated version: It is notable that, when Mcm is present, it is present predominantly as a single double-hexamer (right panel of Figure 3A), and that this remains true across the entire range of abundance shown in Figure 3A. This argues against models in which higher replication activity at more active origins is caused by the loading of more Mcm double-hexamers at those origins within a single cell, since such models predict that multiple Mcm footprints should be more prevalent at the top (high abundance) of the Mcm-ChEC heat map in Figure 3A than at the bottom.
- The authors state "we estimate that ~1-2 % cells have an Mcm complex loaded at the Mcm binding sites in the eighth cohort (ranks 1401-1600)" but it is not clear how this estimate is calculated. An explanation would help the reader to understand this statement.
We have expanded on our earlier statement to clarify how we arrived at the estimate:
Original version: Based on our previous analysis of MCM occupancy (Foss et al., 2021), which showed that approximately 90% cells have an MCM complex loaded at one of the most active known replication origins, we estimate that ~1-2 % cells have an Mcm complex loaded at the Mcm binding sites in the eighth cohort (ranks 1401-1600).
Updated version: We have previously used Southern blotting to demonstrate that approximately 90% of the DNA at one of the most active known origins (ARS1103) is cut by Mcm-MNase (Foss et al., 2021), and to thereby infer that 90% of cells have a double-helicase loaded at this origin. Using this as a benchmark, we estimate that ~1-2 % cells have an Mcm complex loaded at the Mcm binding sites in the eighth cohort (ranks 1401-1600).
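The benchmark logic in the updated text can be illustrated with a minimal sketch. It assumes that occupancy scales linearly with ChEC signal relative to the benchmark origin; the response does not spell out the calculation, so this proportionality, the function name `estimate_occupancy`, and the signal values are all hypothetical. Only the 90% benchmark figure comes from the text.

```python
# Benchmark from the text: ~90% of cells carry a loaded complex at ARS1103.
BENCHMARK_OCCUPANCY = 0.90


def estimate_occupancy(site_signal, benchmark_signal,
                       benchmark_occupancy=BENCHMARK_OCCUPANCY):
    """Scale the benchmark occupancy by the ratio of ChEC signals.

    ASSUMPTION: occupancy is proportional to signal. This is an
    illustrative simplification, not the authors' stated procedure.
    """
    if benchmark_signal <= 0:
        raise ValueError("benchmark_signal must be positive")
    return benchmark_occupancy * site_signal / benchmark_signal


# Hypothetical signal values chosen only to illustrate the arithmetic.
estimate = estimate_occupancy(site_signal=15.0, benchmark_signal=1000.0)
print(f"Estimated fraction of cells with Mcm loaded: {estimate:.2%}")
```

With these made-up inputs, a site carrying 1.5% of the benchmark signal maps to an occupancy in the 1-2% range quoted for the eighth cohort.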
- Although there is evidence that some subset of the CMBS sites exhibit nucleosome depletion, an ACS, and a GC-skew, the authors should do a better job of making the reader aware that it is likely that a decreasing percentage of the individual origins in a group include these characteristics and that this is a likely factor explaining the increasingly rare use of these sites as Mcm2-7 loading sites and origins of replication.
We have added the following text to the Discussion to draw the reader's attention to this possibility, while also noting that we do not believe it to be a major factor in the increasingly rare use of sites within the first 5,500 CMBSs as replication origins:
Furthermore, it is possible that, as one moves to lower abundance groups of CMBSs within the most abundant 5500 sites, a smaller fraction of sites within those groups have any origin function at all. If one takes this model to the extreme, it would suggest that the continuous decline in replication activity seen in Figure 2B between the group comprised of ranks 1-200 and that comprised of ranks 1401-1600 reflects an ever increasing fraction of CMBSs with zero origin activity. At the other extreme, the decline in replication activity could be interpreted within a framework in which 100% of CMBSs in each group function as replication origins, but that their replication activity declines with rank, perhaps because continuously decreasing fractions of cells in the population contain a single double-hexamer. While the truth presumably lies between these two extremes, we favor a model that tilts toward the latter view, because of the abruptness of the transition that appears around rank 5,000 in (1) nucleosomal architecture (Figures 3A, 3B and S3); (2) intergenic versus genic localization and transcription levels (Figure 4A); (3) EACS position weight matrix scores (Figure 5B); and (4) GC skew (Figure 6B). By these criteria, the CMBSs below rank 5000 appear relatively homogeneous, while still showing a gradual decline in replication activity with MCM abundance within the range of detection (1-1600). Our assumption is that the qualitative homogeneity is more consistent with a quantitative, but not qualitative, change in CMBSs with declining MCM abundance among the top 5000 CMBSs.
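The two extreme models described above cannot be told apart from a group average alone. The toy sketch below makes that point: a group in which a small fraction of sites is fully active, and a group in which every site is weakly active, yield the same mean. The function names, group size, and activity values are hypothetical and are not taken from the manuscript.

```python
def group_mean(activities):
    """Average replication activity over all sites in a group."""
    return sum(activities) / len(activities)


def fraction_model(n_sites, fraction_active, activity):
    """Extreme 1: only some sites are origins; the rest have zero activity."""
    n_active = round(n_sites * fraction_active)
    return [activity] * n_active + [0.0] * (n_sites - n_active)


def uniform_model(n_sites, activity):
    """Extreme 2: every site is an origin, each with the same low activity."""
    return [activity] * n_sites


# Hypothetical group of 200 sites (the cohort size used in the text).
few_strong = fraction_model(200, fraction_active=0.10, activity=1.0)
all_weak = uniform_model(200, activity=0.10)

print(group_mean(few_strong), group_mean(all_weak))
```

Because the averages coincide, distinguishing the models requires additional evidence, which is the role the authors assign to the qualitative features (nucleosome architecture, localization, EACS scores, GC skew).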
- The argument that there are as many as 5,500 origins is not well justified. Similarly, the evidence that there are even 1,600 origins is not compelling. As the authors state, to see the peaks observed in the various analyses (ssDNA association, nucleosome depletion, etc.) of the increasingly less populated CMBSs (e.g. those with fewer ChEC reads), only a small subset of the CMBS are likely to have a given characteristic. Given that the loading of a Mcm2-7 double hexamer makes any site a potential origin, it would be more appropriate to say that there could be as many as 5,500 potential origins but many if not most are unlikely to ever direct initiation.
The reviewer is correct that, because many of our analyses rely on group averages rather than individual measurements, we are often unable to make statements that can be applied to every member of a group. We had tried to emphasize this point in our original manuscript with the following two sentences (in bold), which were in the Results and Discussion, respectively:
First, clear peaks of ssDNA signal extend down to the eighth cohort (brown line), which corresponds to CMBSs ranked 1401-1600. Of course, this does not imply that all of these sites function as replication origins, and nor does it imply that no sites below that rank do so, since we have reached the limits of detection of this ssDNA-based assay. Nonetheless, it suggests that replication activity is common among sites extending at least down to rank 1600.
Of course, we do not conclude that all CMBSs with ranks lower than 5500 function as replication origins, nor that none with ranks above 5500 do so, but only that the number of replication origins is likely to be approximately an order of magnitude higher than widely believed.
We have now added a third sentence to further underline this point (in bold):
Second, by averaging signals of replication from multiple Mcm binding sites, we were able to extract weak signals of replication. This is due to the fact that noise, which is randomly distributed, will tend to cancel itself out, while signals of replication will consistently augment the signal at the midpoint of the origin (Figure 2). An inevitable shortcoming to this approach is that it precludes analysis of specific sites; in other words, not every member of the group will share the average characteristic of that group.
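The averaging argument can be sketched with simulated data. The example below is a toy model, not the authors' pipeline: each site gets a small signal at the midpoint of its window plus independent Gaussian noise, and the profiles are averaged position by position. The peak height, noise level, window width, and number of sites are arbitrary assumptions.

```python
import random
import statistics


def simulate_profile(n_positions, peak_height, noise_sd, rng):
    """One site's signal: a peak at the midpoint buried in Gaussian noise."""
    mid = n_positions // 2
    return [(peak_height if i == mid else 0.0) + rng.gauss(0.0, noise_sd)
            for i in range(n_positions)]


def average_profiles(profiles):
    """Position-by-position mean across all sites in a group."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n_positions, mid = 101, 50
profiles = [simulate_profile(n_positions, 2.0, 3.0, rng) for _ in range(200)]
avg = average_profiles(profiles)

# Noise is measured on the flanking positions, excluding the midpoint.
single_noise = statistics.pstdev(profiles[0][:mid] + profiles[0][mid + 1:])
avg_noise = statistics.pstdev(avg[:mid] + avg[mid + 1:])
print(f"noise in one site: {single_noise:.2f}, after averaging: {avg_noise:.2f}")
```

Averaging 200 independent profiles should shrink the noise by roughly the square root of 200, so a midpoint peak that is invisible in any single profile stands out in the mean. The same averaging is why the result says nothing about any individual site.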
A separate issue that this touches on is the distinction between a replication origin and a site at which Mcm2-7 has been loaded. While it strikes us as unlikely that a loaded Mcm complex would be completely recalcitrant to activation, it is a formal possibility. To alert the reader to this issue, we have added the following clause, in bold, to the Abstract, and we have also added the sentence below that to the Discussion:
We conclude that, if sites at which Mcm double-hexamers are loaded can function as replication origins, then DNA replication origins are at least 3-fold more abundant than previously assumed, and we suggest that replication may occasionally initiate in essentially every intergenic region.
Finally, it is important to note that, in equating Mcm binding sites with potential replication origins, we are assuming that if an Mcm double-hexamer is loaded onto the DNA, then it is conceivable that that complex can be activated.
- The authors' discussion of the relationship between Mcm2-7 location relative to the ACS and the mechanism of Mcm2-7 loading does not consider that Mcm2-7 double hexamers can slide on DNA after loading (for example, Remus et al., 2009 PMID: 19896182). Thus, the authors are not looking at sites of loading, only at the distribution of Mcm2-7 molecules after loading. In addition, biochemical experiments do not predict a particular Mcm2-7 position relative to the ACS. Indeed, at ARS1, one would predict that the close proximity of the second weak match to the ACS (the B2 element) to the primary ACS would lead to the Mcm2-7 double hexamer being initially formed at a site overlapping the ARS1 ACS. It is much more likely that the explanation for the distribution of Mcm2-7 locations relative to the ACS is that the ORC-bound ACS and the nucleosomes immediately flanking the origin prevent Mcm2-7 from occupying the right-side of the origin as illustrated in Fig. 5D.
We have tried to emphasize this point more clearly. In our original manuscript, we had brought up the possibility of Mcms sliding after being loaded in the following context (see bolded clause):
Specifically, in 112 out of 146 instances in which a peak of Mcm signal was within 100 base pairs of a known ACS, that peak was downstream of the ACS. The 34 exceptions may reflect (1) incorrect identification of the ACS; (2) incorrect inference of the directionality of the site; or (3) sliding of the Mcm complex after it has been loaded.
We have now added the following to further emphasize the point:
In interpreting the results above, it is important to remember that the locations at which we are detecting Mcm complexes by ChEC do not necessarily reflect the locations at which those complexes were loaded, since Mcm double-hexamers can slide along the DNA after loading (Remus et al. 2009; Gros et al. 2015; Foss et al. 2019).
We have also softened the following conclusion by changing "confirmation of" to "support for":
"...our results...provide in vivo support for in vitro predictions of the directionality of Mcm loading by Orc..."
There are missing references in several places:
- "For example, 15 of the 56 genes that contained a high abundance site have been implicated in meiosis and sporulation and are not expressed during vegetative growth (~5 out of 56 expected from random sampling), consistent with previous observations (Mori and Shirahige, 2007)." Should include Blitzblau et al., 2012 (PMC3355065) which showed that Mcm2-7 loading was impacted by differences in meiotic and mitotic transcription.
- "In contrast to the low abundance sites, the most abundant 500 sites showed a preference for convergent over divergent transcription (left of vertical dotted line in Figure 4B), in agreement with a previous report (Li et al., 2014)." This preference was first pointed out in MacAlpine and Bell, 2005 (PMID: 15868424).
- "This sequence is recognized by the Origin Recognition Complex (Orc), a 6-protein complex that loads MCM (Broach et al., 1983; Deshpande and Newlon, 1992; Eaton et al., 2010; Kearsey, 1984; Newlon and Theis, 1993; Singh and Krishnamachari, 2016; Srienc et al., 1985)." This list should include a reference to Bell and Stillman, 1992 (PMID: 1579162), which first described ORC and showed that it recognized the ACS. It would also be more helpful to the reviewer to distinguish the references that identified that ACS from those concerning ORC binding to it.
We thank the reviewer for pointing out these missing references, and we have added them. We have also separated the references that note the identification of the ACS sequence from those that demonstrate Orc binding to that sequence.