Genome-wide identification of stable RNA-chromatin interactions

  1. Program in Bioinformatics and Systems Biology, University of California San Diego, La Jolla, USA
  2. Shu Chien-Gene Lay Department of Bioengineering, University of California San Diego, La Jolla, USA

Peer review process

Not revised: This Reviewed Preprint includes the authors’ original preprint (without revision), an eLife assessment, public reviews, and a provisional response from the authors.


Editors

  • Reviewing Editor
    Arun Kumar Ganesan
    University of New Mexico, Albuquerque, United States of America
  • Senior Editor
    Yamini Dalal
    National Cancer Institute, Bethesda, United States of America

Reviewer #1 (Public review):

Summary:

This manuscript constitutes further analysis of a dataset generated for a previously published study from the same group. The experiments in the previous work use an RNA-DNA proximity assay to capture RNAs that interact with chromatin, especially beyond their site of transcription, by crosslinking and proximity ligation. The previous work added one novel feature to this treatment, compared to other studies by the same group: the nuclei were treated with RNase A prior to crosslinking. The initial study concluded that long-range chromatin interaction via chromatin looping is affected by RNase treatment. In the current manuscript, the group analyzes the data from this experiment in more detail. They describe some notable features of RNAs that remain after RNase treatment and where they are associated within the genome. Overall, the further analyses are somewhat useful, with some exceptions for specific analyses that are not clear in the current manuscript. The work is very complementary to the previously published original study, to the point that it is surprising it was not included in that study.

Strengths:

(1) The analyses are a useful complement that fill in gaps from the Calandrelli et al paper. Some of the findings are suggestive of RNA-protein networks that operate at long distances to regulate promoters.

Weaknesses:

(1) The beginning of the Results section, and elsewhere, describes steps that likely were performed in the previous publication from which the data are being further analyzed and possibly partially reanalyzed. The current manuscript should more clearly describe if there are any aspects of the pipeline that have been modified from the Calandrelli study (which does not have much detail regarding iMARGI parameters in the published paper) for the further analysis in this manuscript.

(2) The RNase treatment approach is similar to that addressed in recent papers from the Jenner and Davidovich groups (https://doi.org/10.1016/j.celrep.2024.113856; https://doi.org/10.1016/j.celrep.2024.113858) where these groups found RNase treatment significantly affected solubility of chromatin, causing aggregation. The authors should address this work and place it in light of their current study.

(3) Figure 1f: it is not clear what it means for genes to be "non-differentially expressed" in this context. Isn't this also RNase-insensitive? And how is the "Ctrl specific" RNA set determined? This is confusing, since RNase is assumed to degrade most of the RNA in these samples.

(4) Figure 2a: The results are somewhat surprising, given that protein-coding genes are depleted more in the RNase treatment. Is the Ctrl set the same as in 1f? This emphasizes the importance of defining that population better.

(5) Figure 3a: The text references this figure in ways that do not match the figure, referencing at least nine column clusters when there are only six. Heatmaps of certain TFs and "RAH explained" percentages don't seem to match the Results section description, either. The authors claim EZH2 binding sites are the top TF overlap with RAHs and yet do not include EZH2 in Figure 3a. Suz12 (EZH2 binding partner) and H3K27me3 (EZH2 product) are also referenced in the text for this figure, but not included in the figure itself.

(6) The manuscript uses the term "non-diffusive RNA-chromatin interactome" which is not directly supported by data. The authors use the term initially to describe the RNase-resistant species in their previous work, but through the current study, they support a model where the RNase resistance is simply due to protection by protein binding, not by any constraints on diffusion in particular chromatin environments.

Reviewer #2 (Public review):

Summary:

In this manuscript, the authors re-analyze RNase-treated iMARGI data to systematically identify and analyze RNase-resistant RNA-chromatin interactions. In general

Strengths:

Analyses are well-thought-out and generally solid.

Weaknesses:

Conclusions are massively overstated, and though the analytical pipelines used are solid, the conclusions deriving from them lack the backing of solid computational and molecular controls.

Reviewer #3 (Public review):

Summary:

The study investigated stable RNA-chromatin interactions by applying RNase treatment before the iMARGI (in situ mapping of RNA-genome interactome) procedure to remove promiscuous, unprotected RNA transcripts and selectively enrich for RNA-inaccessible, potentially functional RNA-chromatin interactions (RNA-Transcription factor and RNA-histone). The researchers found that short-range interactions (<1kb) are RNase resistant, possibly due to the protection from RNA polymerases. They noticed that long-range RNA-chromatin interactions (>2Mbp or interchromosomal) were also enriched after RNase treatment, hypothesizing that these interactions are stabilized by chromatin-binding proteins. They found that genic caRNAs were sensitive, while repeat-derived caRNAs, such as rRNA and satellite repeats, were resistant to RNase. Long non-coding RNAs (lncRNAs), particularly those associated with diseases, were over-represented among RNase-insensitive transcripts, indicating their potential regulatory significance. Additionally, RNase-insensitive caRNAs exhibited higher evolutionary conservation, implying that they are protected by protein complexes, especially in long-range interactions. RNA Attachment Hot Zones (RAHs) enriched post-RNase treatment were found to localize in functional genomic regions such as promoters, transcription factor binding sites (TFBS), and histone modification sites. Importantly, RNase treatment amplified specific RNA-transcription factor interactions, with caRNA signals being preserved at TFBS for factors with RNA-binding capabilities, suggesting that direct RNA-protein binding helps protect caRNAs from degradation. They also found that different TFs are enriched with specific caRNA species, distinguishing them from their genomic footprints. In addition, transcripts with higher abundance tend to enrich at more TFBS. 
Overall, the study highlights the role of RNase-inaccessible caRNAs in chromatin regulation and provides insight into their functional significance in genome organization.
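
The distance classes used in this summary (short-range <1kb, long-range >2Mbp, interchromosomal) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name, the "middle" label for everything in between, and the example coordinates are hypothetical, and the thresholds are simply the ones quoted above.

```python
from collections import Counter

# Thresholds taken from the summary text; how boundary values are
# treated here is an assumption made for illustration only.
SHORT_MAX_BP = 1_000        # short-range: distance < 1 kb
LONG_MIN_BP = 2_000_000     # long-range: distance > 2 Mbp


def classify_pair(rna_chrom, rna_pos, dna_chrom, dna_pos):
    """Assign one RNA-DNA read pair to a distance class."""
    if rna_chrom != dna_chrom:
        return "interchromosomal"
    distance = abs(rna_pos - dna_pos)
    if distance < SHORT_MAX_BP:
        return "short"
    if distance > LONG_MIN_BP:
        return "long"
    return "middle"


# Hypothetical read pairs: (RNA chrom, RNA pos, DNA chrom, DNA pos)
pairs = [
    ("chr1", 10_000, "chr1", 10_400),
    ("chr1", 10_000, "chr1", 500_000),
    ("chr1", 10_000, "chr1", 5_000_000),
    ("chr1", 10_000, "chr7", 10_000),
]
counts = Counter(classify_pair(*pair) for pair in pairs)
```

Comparing such per-class counts between control and RNase-treated libraries is one way to make the "short-range" versus "long-range" statements explicit about the distances involved.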

Strengths:

This study involves rigorous and comprehensive data analysis involving datasets with very high sequencing depth and appropriate statistical tests (e.g., chi-square tests to validate the association between caRNAs and TFBS statistically). This analysis was further strengthened by comparing their results with orthogonal datasets, such as RedChIP and fRIP-seq, providing robust, cross-validated evidence for the caRNA-TFBS associations. In addition to examining broad interactions, the authors identified specific long-range RNA-chromatin interactions and pinpointed specific transcription factors and histone modification markers that are associated with these interactions. The authors explored the evolutionary implications of RNase-insensitive caRNAs and their potential medical relevance, particularly by identifying caRNAs linked to disease-associated genes and long non-coding RNAs (lncRNAs). This combination of detailed analysis, along with functional relevance, broadens the scope of the research, making it a significant contribution to chromatin biology.
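
For readers unfamiliar with the kind of association test mentioned above, a minimal sketch of a chi-square test on a 2x2 contingency table follows. It is an illustration under stated assumptions, not the authors' analysis: the table layout (caRNA-attached versus not, overlapping a TFBS versus not), the function name, and the counts are all hypothetical, and no continuity correction is applied.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (1 degree of freedom) for the table

                      overlaps TFBS   no overlap
        caRNA region        a             b
        other region        c             d

    Returns (chi2, p_value). No Yates continuity correction.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must have a non-zero total")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts, chosen only to show the calculation.
chi2, p_value = chi_square_2x2(30, 70, 10, 90)
```

A small p-value here would only indicate that attachment and TFBS overlap are not independent in the table; it says nothing about the direction of causation or about function.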

Weaknesses:

However, I have the following concerns regarding the studies:
(1) I don't understand the logic behind calling promoters, enhancers, and similar regions "functionally important regions" when describing the enrichment of RNase-insensitive interactions. Genic regions that are RNase-sensitive are also functionally relevant. So, what makes promoters, enhancers, etc, unique in terms of functionality?
(2) While the study offers strong evidence for associations between caRNAs, transcription factors, and chromatin markers, it lacks direct functional validation experiments, such as RNA knockdown or CRISPR interference, to confirm the specific roles of these RNAs in gene regulation or chromatin structure modifications.
(3) Another limitation is the incomplete investigation of caRNAs with short-range interactions (<1kb). The authors hypothesized that these are protected by RNA polymerases but did not provide supporting experimental evidence or references to previous studies. Offering either experimental validation or a rationale for excluding these short-range interactions would strengthen this hypothesis. The authors' conclusion that "chromatin-associated RNAs (caRNAs) involved in short- to middle-range interactions are more susceptible to RNase treatment" did not make clear what distance "short-range" refers to. The data shown in Supplementary Figure 2a contradicted the conclusion in the discussion that "long-distance RNA-chromatin interactions are preferentially preserved after RNase treatment, while short-range interactions are depleted" as well as the link the paper suggests between RNase inaccessibility and evolutionary conservation.
(4) The study heavily relies on RNase treatment to isolate stable RNA-chromatin interactions, which might neglect important transient or weak interactions and overlook the functional relevance of RNase-sensitive interactions, hence missing the dynamic nature of RNA-chromatin interactions.
(5) The analysis is limited to human embryonic stem cells (H1 cells), which might restrict the generalizability of the findings. Expanding the study to include additional cell types or tissues would strengthen the conclusions.
(6) The term "RNase A treatment" in the methods section could be clearer if specified as "RNase-treated iMARGI," which encompasses the standard iMARGI protocol.
(7) There is some ambiguity regarding whether the researchers generated new data or reanalyzed existing datasets. While it is mentioned early on that previously published RNase-treated iMARGI datasets were reanalyzed, the text later states that "three biological replicates were generated for the RNase-treated samples." Clarifying whether the data were newly generated in this study or obtained from public datasets would improve the clarity.
(8) The color scheme should be the same for the heatmaps of control and RNase-treated samples in Figure 4.

Author response:

We thank the editors and reviewers for their thorough evaluation of our manuscript. We appreciate the constructive feedback and insights provided.

We acknowledge that some of our conclusions would benefit from more measured statements and additional computational controls. We will revise the manuscript to better reflect the scope and limitations of our analytical approach. While we cannot add new experimental validations at this stage, we will strengthen our computational analyses and clarify our methodology.

Below, we outline our planned revisions to address the major points raised in the public reviews:

Clarification of Terms and Definitions:

(1) We will state more clearly in our manuscript that we reuse the same raw datasets from our previous study, as described in Calandrelli et al., 2023, and that there is no modification to the experimental methods or data.

(2) We will provide clear definitions for:

- "Non-differentially expressed" genes

- "Ctrl specific" RNA sets

- The composition of control populations in different analyses

(3) We will revise the use of "non-diffusive RNA-chromatin interactome" and “RNase-resistant” terminology to better reflect our actual findings.

(4) We will also improve clarity regarding:

- The rationale for focusing on specific genomic regions

- The interpretation of evolutionary conservation data

(5) We will provide additional rationale on the exclusion of short-range interactions.

Figure Revisions:

(1) Figure 3a: We will correct any discrepancy between text references and figure content.

(2) Figure 4: We will standardize the color scheme between control and RNase-treated samples.

(3) We will follow the reviewer's suggestion to move Figure 1g to the supplementary file.

Additional Computational Analyses:

(1) We will consider adding controls for RNA length effects and integrating any existing knowledge on how the extent of protection varies across different RBPs.

Discussions:

(1) We will carefully rephrase our conclusions to more accurately reflect the scope and limitations of our computational findings, ensuring we do not overstate the implications.

(2) We will expand the discussion of limitations, including:

- The focus on RNase-resistant interactions only

- The cell-type specificity of our findings

- The lack of functional validation

- The limited ability to discern and study the transient or weak RNA-chromatin interactions using the current dataset

(3) Regarding the recent papers from Jenner and Davidovich groups about RNase treatment effects on chromatin solubility:

- We will discuss these findings in our revised manuscript

- We will address potential limitations this may impose on our interpretations
