An anciently diverged family of RNA binding proteins maintain correct splicing of ultra-long exons through cryptic splice site repression

  1. Biosciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle, United Kingdom
  2. Bioinformatics Support Unit, Faculty of Medical Sciences, Newcastle University, Newcastle, United Kingdom
  3. Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, United States
  4. School of Biosciences, University of Sheffield, Sheffield, United Kingdom
  5. School of Computing, Newcastle University, Newcastle, United Kingdom


  • Reviewing Editor
    Gene Yeo
    University of California, San Diego, La Jolla, United States of America
  • Senior Editor
    Kevin Struhl
    Harvard Medical School, Boston, United States of America

Reviewer #1 (Public Review):

The article by Siachisumo, Luzzi and Aldalaquan et al. describes studies of RBMX and its role in maintaining proper splicing of ultra-long exons. They combine CLIP, RNA-seq, and individual example validations with manipulation of RBMX and its family members RBMY and RBMXL2 to show that the RBMX family plays a key role in maintaining proper splicing of these exons.

I think one of the main strengths of the manuscript is its ability to explore a unique but interesting question (splicing of ultra-long exons), and derive a relatively simple model from the resulting genomics data. The results shown are quite clean, suggesting that RBMX plays an important role in the proper regulation of these exons. The ability of family members to rescue this phenotype (as well as only particular domains) is also quite intriguing and suggests that the mechanisms for keeping these exons properly spliced may be a quite important and highly conserved mechanism.

I think my main critique is that a lot of the conclusions in the text are written with very broad and general claims, but these are often based on either a small number of examples or a non-transcriptome-wide analysis that I think would be necessary for such broad conclusions. For example, pg 5 - "The above data indicated that RBMX has a major role in repressing cryptic splicing patterns in human somatic cells." is based on seeing ~120 exons with a 2:1 ratio of exclusion to inclusion (and doesn't include analysis confirming which of these are cryptic). Similarly, on page 15, line 31 "The above experiments showed that RBMXL2 is able to globally replace RBMX activity" and Fig 5 as well - I think the 'globally' term here is a bit too much with most of the analysis derived from n=3 events.

Another weakness I think is the lack of context in the paper, where the basic description of how many 'long' and 'ultra-long' exons there are and what percent of those are bound by and/or regulated by RBMX is missing. The text (including the title, which to me is written to imply a general role for 'ultra-long exons') seems to imply that RBMX maintains proper splicing of ultra-long exons globally (which to me implies a significant percent of them), but I don't think the manuscript succeeds in strongly proving this global role.

Along those same lines, I think the manuscript claims that the described principles are specific to the RBMX family, but I think the lack of negative examples weakens the strength of this claim. For example, for Figure 2 (D/E/F/G), what I think would make this more powerful is a negative example - if you take CLIP + knockdown/RNA-seq data for a different splicing regulator (RBFOX2, PTBP1, etc.), do you not see this size difference? I think the analysis described here is sufficient to show that there is an overlap for RBMX, but I think this would make it much stronger in confirming it's not just a 'longer genes get more CLIP tags' artifact.
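
One way to frame the negative-control comparison suggested above is sketched below in Python. It is illustrative only: the function names, the example exon lengths, and the choice of a median-based permutation test are assumptions made for this sketch, not methods taken from the manuscript. The idea is to normalise CLIP tag counts by exon length, and to test whether exons regulated by a given factor are longer than a background set; the same test could then be repeated for a different splicing regulator as a negative example.

```python
import random
from statistics import median


def tag_density(n_tags, exon_length_nt):
    """Return CLIP tags per kilobase of exon, so long exons are not favoured by size alone."""
    return 1000.0 * n_tags / exon_length_nt


def length_permutation_test(regulated, background, n_perm=2000, seed=0):
    """One-sided permutation test on exon lengths (hypothetical helper).

    Asks whether the median length of `regulated` exons exceeds that of
    `background` exons more than expected when group labels are shuffled.
    Returns (observed median difference, empirical p-value).
    """
    rng = random.Random(seed)
    observed = median(regulated) - median(background)
    pooled = list(regulated) + list(background)
    k = len(regulated)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = median(pooled[:k]) - median(pooled[k:])
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the empirical p-value strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)


# Made-up exon lengths in nucleotides, for illustration only.
regulated_exons = [1000, 2000, 3000]
background_exons = [100, 150, 200]
observed_diff, p_value = length_permutation_test(regulated_exons, background_exons)
```

If a length bias of similar size appeared for an unrelated regulator analysed in the same way, that would point to a 'longer exons get more CLIP tags' artifact rather than an RBMX-specific effect.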

Reviewer #2 (Public Review):

One of the greatest challenges for the spliceosome is to be able to repress the many cryptic splice sites that can occur in both the intronic and exonic sequences of genes. Although many studies have focused on cryptic signals in introns (because of their common involvement in disease), the question still remained open as to the factors that repress cryptic splice sites in exons. Because exons are normally much shorter than introns, in many cases the problem does not exist. However, in human genes, a significant proportion of exons can be considerably longer than the average 150 nt length and this raises the question of how cryptic splicing can be prevented in long exons. To address this question, the authors have focused on the possible role played by an ancient mammalian RBD protein called RBMX. Using a combination of high-throughput and classic splicing methodologies, they have shown that there is a class of RBMX-dependent ultra-long exons connected where the RBMX, RBMXL2, and RBMY paralogs have closely related functional activity in repressing cryptic splice site selection.

In general, the present work sheds light on what has been a rather understudied process in splicing research. The use of iCLIP and RNA-seq data has not only allowed us to identify the long exons where cryptic splicing is prevented by the RBMX proteins but has also allowed us to identify a network of genes mostly involved in genome stability and transcriptional control where these proteins seem to play a prominent role. This can therefore also shed additional light on the way splicing has shaped evolutionary processes in the mammalian lineage and will therefore be of interest to many researchers in this field.

There are no major weaknesses.

Reviewer #3 (Public Review):

The manuscript by Siachisumo et al builds upon a previous publication from the same group of collaborators that showed that depletion of mouse RBMXL2 leads to a block in spermatogenesis associated with mis-splicing, particularly of large exons in genes associated with genome stability (Ehrmann et al eLife 2019). RBMXL2 is an RNA-binding protein and an autosomal retrotransposed paralog of the X-chromosomally encoded RBMX. RBMXL2 is expressed during meiosis when RBMX and the more distantly related RBMY (on the Y chromosome) are silenced. It is therefore an appealing hypothesis that RBMXL2 might provide cover for RBMX function during meiosis. To address this hypothesis the authors analysed the transcriptomic consequences of RBMX depletion by RNA-Seq in human cells (MDA-MB-231 and existing RNA-Seq data from HEK293 cells), complemented by iCLIP to analyze the binding targets of FLAG-tagged RBMX in HEK293 cells. The findings convincingly demonstrate that - like RBMXL2 - RBMX mainly acts as a splicing repressor and that it particularly acts to protect the integrity of very long ("ultra-long") exons. Upon RBMX depletion, many of these exons are shortened due to the use of cryptic 5' and/or 3' splice sites. Moreover, affected genes are particularly enriched for functions associated with genome integrity - indeed "comet assays" show that RBMX depletion leads to DNA damage defects.

The manuscript therefore delivers a clear affirmative answer to the question of whether the two highly related proteins have similar molecular functions, particularly with respect to suppressing cryptic splicing that affects ultra-long exons. This conclusion is reinforced by the ability of induced expression of either RBMXL2 or RBMY to fully complement the effects of RBMX knockdown upon three target events in the ETAA1, REV3L, and ATRX genes.

The manuscript also includes some experiments that address more mechanistic questions, such as the potential for RBMX to block access of spliceosome components to splice site elements and structure-function analyses of RBMX. These areas have a distinctly "preliminary" feel to them. For example, for one target (ETAA1) it is shown that CLIP tags are close to mapped branchpoints. However, no attempt is made to integrate the RNA-Seq and iCLIP data-sets to look for more generalized relationships between binding and activity. Likewise, one experiment shows that the RRM domain of RBMXL2 is not necessary for activity. Given that the RRM domain represents only ~25% of the total RBMXL2 sequence, this is a somewhat preliminary, albeit interesting, observation. Another surprising omission was that there was no global comparison of the consequences of RBMX depletion and complementation by RBMXL2, despite the fact that the relevant RNA-Seq data-sets had been generated (Figure 4 supplement 1 shows RNA-Seq IGV tracks that confirm the effects on ETAA1, REV3L and ATRX shown by RT-PCR in Figure 4).

In summary, this manuscript provides clear evidence to support the role of RBMX as a repressor of cryptic splice sites in ultra-long exons, similar to RBMXL2.
