Introduction

V(D)J recombination is the central process of early lymphocyte development and generates diversity in antigen receptors. This process involves double-strand DNA cleavage at gene segments by the V(D)J recombinase, which comprises RAG1 and RAG2. RAG recognizes conserved recombination signal sequences (RSSs) positioned adjacent to V, D, and J gene segments. A bona fide RSS contains a conserved palindromic heptamer (consensus 5′-CACAGTG) and an A-rich nonamer (consensus 5′-ACAAAAACC) separated by a degenerate spacer of either 12 or 23 base pairs (Schatz and Ji, 2011; Hirokawa, et al., 2020). Efficient recombination requires a pair of RSSs with different spacer lengths, as dictated by the “12/23 rule” (Eastman, et al., 1996; Banerjee and Schatz, 2014). Following cleavage, the DNA ends are joined via non-homologous end joining (NHEJ), resulting in the joining of the two coding segments and of the signal ends (Rooney, et al., 2004). V(D)J recombination promotes B cell development, but aberrant V(D)J recombination can lead to precursor B-cell malignancies through RAG-mediated off-target effects (Thomson, et al., 2020; Mendes, et al., 2014; Onozawa and Aplan, 2012).
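The RSS architecture and the 12/23 rule described above can be illustrated with a minimal sketch. It scans only the forward strand for exact matches to the consensus heptamer and nonamer; real RSSs tolerate mismatches, so this is a simplification and not a method used in this study. The function names are hypothetical.

```python
import re

HEPTAMER = "CACAGTG"    # consensus heptamer quoted in the text
NONAMER = "ACAAAAACC"   # consensus nonamer quoted in the text


def find_consensus_rss(seq):
    """Return sorted (start, spacer_length) pairs for exact consensus RSSs.

    Assumption: forward strand only, no mismatches allowed.
    """
    seq = seq.upper()
    hits = []
    for spacer in (12, 23):
        # A lookahead lets overlapping candidates be reported as well.
        pattern = re.compile(
            "(?=" + HEPTAMER + "[ACGT]{%d}" % spacer + NONAMER + ")"
        )
        for match in pattern.finditer(seq):
            hits.append((match.start(), spacer))
    return sorted(hits)


def obeys_12_23_rule(spacer_a, spacer_b):
    """True when one RSS has a 12-bp spacer and its partner a 23-bp spacer."""
    return {spacer_a, spacer_b} == {12, 23}


example = "TT" + HEPTAMER + "A" * 12 + NONAMER + "GG"
print(find_consensus_rss(example))
print(obeys_12_23_rule(12, 23), obeys_12_23_rule(12, 12))
```

A production scanner would also search the reverse complement and score partial matches; both are omitted here for brevity.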

The regulation of RAG expression and activity is multifactorial and serves to ensure proper V(D)J recombination and B cell development (Gan, et al., 2021; Kumari, et al., 2021). Each RAG protein consists of a core region and a non-core region. Although the non-core RAG regions appear dispensable for recombination, they are evolutionarily conserved, indicating a potential significance in vivo that may involve quantitative or qualitative modifications of RAG activity and expression (Liu, et al., 2022; Curry and Schlissel, 2008; Sekiguchi, et al., 2001). Specifically, the non-core RAG2 region (amino acids 384–527 of 527 residues) contains a plant homeodomain (PHD) that can recognize histone H3K4 trimethylation, as well as a T490 residue that mediates a cell cycle-regulated protein degradation signal at the proliferating pre-B cell stage (Matthews, et al., 2007; Liu, et al., 2007). The off-target V(D)J recombination frequency is significantly higher when RAG2 is C-terminally truncated, thereby establishing a mechanistic connection between the PHD domain, H3K4me3-modified chromatin, and the suppression of off-target V(D)J recombination (Lu, et al., 2015; Mijušković, et al., 2015). RAG2 destruction is linked to the cell cycle through the cyclin-dependent kinase cyclinA/Cdk2, which phosphorylates T490. Failure to degrade RAG during S phase poses a threat to the genome (Zhang, et al., 2011). A T490A mutation at the phosphorylation site contributes to lymphomagenesis in a p53-deficient background. The non-core region of RAG1 (amino acids 1–383 of 1040 residues) has been identified as a RAG1 regulator. While core RAG1 maintains its catalytic activity, its in vivo recombination efficiency and fidelity are reduced in comparison to full-length RAG1 (fRAG1). In addition, RAG binding to the genome is more indiscriminate (Silver, et al., 1993; Beilinson, et al., 2021; Sadofsky, et al., 1993).
The N-terminal domain (NTD), which is evolutionarily conserved, is predicted to contain multiple zinc-binding motifs, including a Really Interesting New Gene (RING) domain (aa 287 to 351) that can ubiquitylate various targets, including RAG1 itself (Deng, et al., 2015). While the ubiquitylation activity has been characterized in vitro, its in vivo relevance to V(D)J recombination and off-target V(D)J recombination remains uncertain. Furthermore, the NTD contains a specific region (amino acids 1 to 215) that facilitates interaction with DCAF1, leading to the degradation of RAG1 in a CRL4-dependent manner (Schabla, et al., 2018). Additionally, the NTD plays a role in chromatin binding and the genomic targeting of the RAG complex (Schatz and Swanson, 2011). Despite increasing evidence emphasizing the significance of non-core RAG regions, particularly RAG1’s non-core region, the function of non-core RAG regions in off-target V(D)J recombination and the underlying mechanistic basis have not been fully clarified.

Typically, genomic DNA is safeguarded against inappropriate RAG cleavage by the inaccessibility of cryptic RSSs (cRSSs), which are estimated to occur once per 600 base pairs (Lewis, et al., 1997; Teng, et al., 2015). However, recent research has demonstrated that epigenetic reprogramming in cancer can result in heritable alterations in gene expression, including the accessibility of cRSSs (Khoshchehreh, et al., 2019; Becker, et al., 2020; Fatma, et al., 2022; Goel, et al., 2022). We selected the BCR-ABL1+ B-ALL model, which is characterized by ongoing V(D)J recombinase activity and BCR-ABL1 gene rearrangement in pre-B leukemic cells (Schjerven, et al., 2017; Wong and Witte, 2004). Genome-wide structural variation (SV) analysis was conducted on leukemic cells from fRAG, cRAG1, and cRAG2 BCR-ABL1+ B-ALL mice to examine the involvement of non-core RAG regions in off-target V(D)J recombination events. Deletion of the non-core domain in either RAG1 or RAG2 led to accelerated leukemia onset and progression, as well as increased off-target V(D)J recombination. Our analysis showed a reduction in RAG binding accuracy in cRAG cells and a decrease in recombinant size in cRAG1 cells, which may be responsible for the increased off-target V(D)J recombination in cRAG leukemia cells. In conclusion, our results highlight the potential importance of the non-core RAG regions, particularly RAG1’s non-core region, in maintaining the accuracy of V(D)J recombination and genomic stability in BCR-ABL1+ B-ALL.

Method

Mice

The C57BL/6 mice were procured from the Experimental Animal Center of Xi’an Jiaotong University, while cRAG1 (amino acids 384-1040) and cRAG2 (amino acids 1-383) were obtained from David G. Schatz (Yale University, New Haven, Connecticut, USA). The mice were bred and maintained in a specific pathogen-free (SPF) environment at the Experimental Animal Center of Xi’an Jiaotong University. All animal-related procedures were in accordance with the guidelines approved by the Xi’an Jiaotong University Ethics Committee for Animal Experiments.

Generation of Retrovirus Stocks

The pMSCV-BCR-ABL1-IRES-GFP vector co-expresses the human BCR-ABL1 fusion protein and green fluorescent protein (GFP), while the pMSCV-GFP vector serves as a negative control by expressing GFP alone. To produce viral particles, 293T cells were transfected with either the MSCV-BCR-ABL1-IRES-GFP or MSCV-GFP vector, along with the packaging vector PKAT2, using the X-tremeGENE HP DNA Transfection Reagent from Roche (Basel, Switzerland). After 48 hours, the viral supernatants were collected, filtered, and stored at -80°C.

Bone Marrow Transduction and Transplantation

Experiments were conducted using mice aged 6 to 10 weeks. BCR-ABL1+ B-ALL was induced using marrow from donors that had not undergone 5-FU treatment. The donor mice were euthanized through CO2 asphyxiation, and the bone marrow was harvested by flushing the femur and tibia with a syringe and 26-gauge needle. Erythrocytes were not removed, and 1 × 10^6 cells per well were plated in six-well plates. A single round of cosedimentation with retroviral stock was performed in medium containing 5% WEHI-3B-conditioned medium and 10 ng/mL IL-7 (Peprotech, USA). After transduction, cells were either transplanted into syngeneic female recipient mice (1 × 10^6 cells each) that had been lethally irradiated (2 × 450 cGy), or cultured in RPMI-1640 (Hyclone, Logan, UT) medium supplemented with 10% fetal calf serum (Hyclone), 200 mmol/L L-glutamine, 50 mmol/L 2-mercaptoethanol (Sigma, St Louis, MO), and 1.0 mg/ml penicillin/streptomycin (Hyclone). Subsequently, recipient mice were monitored daily for indications of morbidity, weight loss, failure to thrive, and splenomegaly. The percentage of GFP+ cells in peripheral blood was assessed weekly by FACS analysis of tail vein blood. Hematopoietic tissues and cells were used for histopathology, in vitro culture, FACS analysis, secondary transplantation, genomic DNA preparation, protein lysate preparation, or lineage analysis, depending on the characteristics of the mice under study.

Secondary Transplants

Thawed BM cells were sorted using a BD FACS Aria II (Becton Dickinson, San Jose, California, USA). GFP-positive leukemic cells (1 × 10^6, 1 × 10^5, 1 × 10^4, and 1 × 10^3) were then resuspended in 0.4 mL Hank’s Balanced Salt Solution (HBSS) and intravenously administered to unirradiated syngeneic mice.

Flow cytometry analysis and sorting

Bone marrow, spleen cells, and peripheral blood were harvested from leukemic mice. Red blood cells were eliminated using NH4Cl RBC lysis buffer, and the remaining nucleated cells were washed with cold PBS. For in vitro cell surface receptor staining, 1 × 10^6 cells were stained with antibody for 20 minutes at 4°C in 1× phosphate-buffered saline (1× PBS) containing 3% BSA. Cells were then washed with 1× PBS and analyzed using a CytoFLEX Flow Cytometer (Beckman Coulter, Miami, FL) or sorted on a BD FACS Aria II. Apoptosis was analyzed by resuspending the cells in Binding Buffer (BD Biosciences, Baltimore, MD, USA) and subsequently labeling them with anti-annexin V-AF647 antibody (BD Biosciences) and propidium iodide (BD Biosciences) for 15 minutes at room temperature. The lineage analysis was performed using the following antibodies, which were procured from BD Biosciences: anti-BP-1-PerCP-Cy7, anti-CD19-PerCP-Cy5.5, anti-CD43-PE, anti-B220-APC, and anti-μHC-APC.

BrdU incorporation and analysis

Cells obtained from primary leukemic mice were cultured in six-well plates containing RPMI-1640 medium supplemented with 10% FBS and 50 mg/ml BrdU. After a 30-minute incubation at 37°C, cells were harvested and intranuclearly stained with anti-BrdU antibody and 7-AAD, as per the manufacturer’s instructions.

Western blotting analysis

Over 1 × 10^6 leukemic cells were centrifuged and washed with ice-cold PBS. The cells were then treated with ice-cold RIPA buffer, consisting of 50 mM Tris-HCl (pH 7.4), 0.15 M NaCl, 1% Triton X-100, 0.5% NaDoc, 0.1% sodium dodecyl sulfate (SDS), 1 mM ethylenediaminetetraacetic acid (EDTA), 1 mM phenylmethanesulfonyl fluoride (PMSF) (Amresco), and fresh protease inhibitor cocktail Pepstatin A (Sigma). After sonication using a Bioruptor UCD-200 (Diagenode, Seraing, Belgium), the suspension was centrifuged at 14,000 g for 3 minutes at 4°C. The total cell lysate was either used immediately or stored at -80°C. Protein concentrations were determined using the DC Protein Assay (Bio-Rad Laboratories, Hercules, California, USA). Subsequently, the protein samples (20 μg) were incubated with α-RAG1 (mAb 23) and α-RAG2 (mAb 39) antibodies, with GAPDH serving as the loading control. The signal was further detected using a goat anti-rabbit IgG secondary antibody conjugated with horseradish peroxidase (Thermo Scientific, Waltham, MA). The band signal was developed with Immobilon™ Western Chemiluminescent HRP substrate (Millipore, Billerica, MA). The bands were analyzed using GELPRO ANALYZER software (Media Cybernetics, Bethesda, MD).

Genomic PCR

Genomic PCR was performed using the following primers (Schlissel, et al., 1991):

DhL-5′-GGAATTCGMTTTTTGTSAAGGGATCTACTACTGTG-3′; J3-5′-GTCTAGATTCTCACAAGAGTCCGATAGACCCTGG-3′; VQ52-5′-CGGTACCAGACTGARCATCASCAAGGACAAYTCC-3′; Vh558-5′-CGAGCTCTCCARCACAGCCTWCATGCARCTCARC-3′; Vh7183-5′-CGGTACCAAGAASAMCCTGTWCCTGCAAATGASC-3′.
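Several of the primers above contain IUPAC degenerate bases (M, S, R, Y, W), so each primer name refers to a small pool of concrete oligonucleotides. As an illustration only, the following sketch counts and enumerates the sequences in such a pool; the helper names are hypothetical and this is not part of the published protocol.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


dhl = "GGAATTCGMTTTTTGTSAAGGGATCTACTACTGTG"   # DhL primer from the text
vq52 = "CGGTACCAGACTGARCATCASCAAGGACAAYTCC"   # VQ52 primer from the text
print(degeneracy(dhl), degeneracy(vq52))
```

Enumerating the pool can be useful when checking individual primer variants against a reference sequence.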

RNA-seq library preparation and sequencing

GFP+CD19+ cells were sorted from the spleen of cRAG1 (n=3, 1 × 10^6 cells/sample), cRAG2 (n=3, 1 × 10^6 cells/sample), and fRAG (n=3, 1 × 10^6 cells/sample) B-ALL mice. Total RNA was extracted using Trizol reagent (Invitrogen, CA, USA) following the manufacturer’s guidelines. RNA quantity and purity were analyzed using the Bioanalyzer 2100 and RNA 6000 Nano LabChip Kit (Agilent, CA, USA), requiring a RIN >7.0. RNA-seq libraries were prepared from 200 ng total RNA with the TruSeq RNA sample prep kit (Illumina). Oligo(dT)-enriched mRNAs were fragmented randomly with fragmentation buffer, followed by first- and second-strand cDNA synthesis. After end repair, the double-stranded cDNA library was obtained through PCR enrichment and size selection. cDNA libraries were sequenced with the Illumina HiSeq 2000 sequencer (Illumina HiSeq 2000 v4 Single-Read 50 bp) after pooling according to expected data volume and effective concentration. Two biological replicates were performed in the RNA-seq analysis. Raw reads were then aligned to the mouse genome (GRCm38) using Tophat2 RNA-seq alignment software, and unique reads were retained to quantify gene expression counts from the Tophat2 alignment files. Differentially expressed mRNAs and genes were selected with log2 (fold change) >1 or log2 (fold change) <-1 and with statistical significance (p value < 0.05) using an R package. Bioinformatic analysis was performed using the OmicStudio tools at https://www.omicstudio.cn/tool.
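The selection thresholds stated above (|log2 fold change| > 1 and p value < 0.05) amount to a simple two-condition filter. The sketch below shows that filter on made-up values; the gene names and numbers are hypothetical placeholders, and the original analysis was performed with an R package that is not named in the text.

```python
def is_differentially_expressed(log2_fold_change, p_value,
                                lfc_cutoff=1.0, p_cutoff=0.05):
    """Apply the thresholds quoted in the text: |log2FC| > 1 and p < 0.05."""
    return abs(log2_fold_change) > lfc_cutoff and p_value < p_cutoff


def select_de_genes(results):
    """results maps gene name -> (log2 fold change, p value)."""
    return sorted(
        gene for gene, (lfc, p) in results.items()
        if is_differentially_expressed(lfc, p)
    )


# Hypothetical values, for illustration only.
toy_results = {
    "geneA": (2.3, 0.001),    # up-regulated and significant
    "geneB": (-1.5, 0.03),    # down-regulated and significant
    "geneC": (0.4, 0.0001),   # significant, but fold change too small
    "geneD": (3.0, 0.2),      # large fold change, not significant
}
print(select_de_genes(toy_results))
```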

Preparation of tumor DNA samples

GFP+CD19+ splenic cells, tail and kidney tissue were obtained from cRAG1, cRAG2 and fRAG BCR-ABL1+ B-ALL mice, and genomic DNA was extracted using a TIANamp Genomic DNA Kit (TIANGEN-DP304). Subsequently, paired-end libraries were constructed from 1 µg of the initial genomic material using the TruSeq DNA v2 Sample Prep Kit (Illumina, #FC-121-2001) as per the manufacturer’s instructions. The size distribution of the libraries was assessed using an Agilent 2100 Bioanalyzer (Agilent Technologies, #5067-4626), and the DNA concentration was quantified using a Qubit dsDNA HS Assay Kit (Life Technologies, #Q32851). The Illumina HiSeq 4000 was utilized to sequence the samples, with two to four lanes allocated for sequencing the tumor and one lane for the control DNA library of the kidney or liver, each with 150 bp paired end reads.

Read alignment and structural variant calling

Fastq files were generated using Casava 1.8 (Illumina), and BWA 37 was employed to align the reads to mm9. PCR duplicates were eliminated using Picard’s MarkDuplicates tool (sourceforge.net/apps/mediawiki/picard). Our custom scripts (http://sourceforge.net/projects/svdetection) were used to eliminate BWA-designated concordant read pairs and read pairs with low BWA mapping quality scores. Intrachromosomal and inter-chromosomal rearrangements were identified using SVDetect from discordant, quality-prefiltered read pairs. The mean insert size and standard deviation for this analysis were obtained through Picard’s InsertSizeMetrics tool (sourceforge.net/apps/mediawiki/picard). Tumor-specific structural variants (SVs) were identified using the Manta software (https://github.com/Illumina/manta/blob/master/docs/userGuide/README.md#introduction).

Validation of high confidence off-target candidates

Identification of tumor-specific structural variants required the elimination of non-specific structural mutations found in the kidney or tail. Subsequently, the 21-bp CAC-to-breakpoint method was employed to filter for RAG-mediated off-target genes. High-confidence off-target candidates were validated by PCR. Oligonucleotide primers were designed to hybridize within the “linking” regions reported by SVDetect, in the appropriate orientation. The PCR product was subjected to Sanger sequencing and aligned to the mouse mm9 reference genome using BLAST (https://blast.ncbi.nlm.nih.gov/Blast.cgi).

Statistics

Statistical analysis was conducted using SPSS 20.0 (IBM Corp.) and GraphPad Prism 6.0 (GraphPad Software). Descriptive statistics were reported as means ± standard deviation for continuous variables. The equality of variances was assessed using Levene’s test. Two-group comparisons, multiple group comparisons, and survival comparisons were performed using independent-samples t-test, one-way analyses of variance (ANOVA) with post hoc Fisher’s LSD test, and log-rank Mantel-Cox analysis, respectively. Kaplan-Meier survival curves were utilized to depict the changes in survival rate over time. Statistical significance was set at P<0.05.
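For readers unfamiliar with the Kaplan-Meier curves mentioned above, the following minimal sketch implements the product-limit estimator in plain Python. It is an illustration under simple assumptions (right-censored data, events coded as 1 for an observed death and 0 for censoring) and is not the SPSS or GraphPad Prism procedure used in the study; the input values are made up.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    times:  follow-up time for each animal (for example, days)
    events: 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all animals sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Hypothetical cohort: four animals, one censored at day 20.
print(kaplan_meier([10, 20, 20, 30], [1, 1, 0, 1]))
```

Comparing two such curves with a log-rank test requires additional bookkeeping of expected deaths per group, which is omitted here.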

Results

cRAG gives rise to more aggressive leukemia in a mouse model of BCR-ABL1+ B-ALL

In order to assess the impact of RAG activity on the clonal evolution of BCR-ABL1+ B-ALL through a genetic experiment, we utilized bone marrow transplantation (BMT) to compare disease progression in fRAG, cRAG1, and cRAG2 BCR-ABL1+ B-ALL (Yu, et al., 2019). Bone marrow cells transduced with a BCR-ABL1/GFP retrovirus were administered into syngeneic lethally irradiated mice, and CD19+ B cell leukemia developed within 30-80 days (Fig. 1A, Fig. S1). Western blot results confirmed equivalent transduction efficiencies of the retroviral BCR-ABL1 in all three cohorts (Fig. S2A). To investigate potential differences in leukemia outcome across distinct genetic backgrounds, we used Mantel-Cox estimation to assess survival rates in fRAG, cRAG1, or cRAG2 mice that were transplanted with BCR-ABL1-transformed bone marrow cells. Our findings indicate that, compared to fRAG BCR-ABL1+ B-ALL mice, cRAG1 or cRAG2 BCR-ABL1+ B-ALL mice exhibited reduced survival during the primary transplant phase (median 74.5 days versus 39 or 57 days, P < 0.0425, Fig. 1A). This survival difference was also observed during the secondary transplant phase, in which leukemic cells were extracted from the spleens of primary recipients and subsequently purified via GFP+ cell sorting. A total of 10^5, 10^4, and 10^3 GFP+ leukemic cells that originated from fRAG, cRAG1, or cRAG2 leukemic mice were transplanted into corresponding non-irradiated immunocompetent syngeneic recipient mice (median survival 11-26, 10-16, and 11-21 days, respectively; P < 0.0023-0.0299, Fig. S2B). Additionally, the cRAG mice exhibited a significantly higher leukemia burden in the bone marrow, spleen, and peripheral blood compared to the fRAG mice (Fig. 1B-D). To investigate the cellular process underlying the increased growth rate in cRAG BCR-ABL1+ B-ALL, flow cytometry analysis was conducted to examine cell cycle and apoptosis status.
The results demonstrated that cRAG BCR-ABL1+ B-ALL had an elevated proportion of cells in S/G2-M phase compared to fRAG (Fig. 1E). Furthermore, we observed increased growth due to decreased apoptosis in cRAG leukemic cells (Fig. S2C). RNA-seq analysis revealed changes in cell differentiation and proliferation/apoptotic pathways (Fig. S3). These results suggest that the absence of non-core RAG regions expedites the progression of malignant transformation and leukemic growth, resulting in an aggressive disease phenotype in the cRAG BCR-ABL1+ B-ALL model.

cRAGs give rise to more aggressive leukemia in a mouse model of BCR-ABL1+ B-ALL (A) Kaplan-Meier survival curve for fRAG (n=8), cRAG1 (n=6), and cRAG2 (n=10) recipient mice. The survival was calculated by Mantel–Cox test (P<0.0425). (B) The spleen weights of fRAG, cRAG1 and cRAG2 leukemic mice (fRAG, n=8, cRAG1, n=7, cRAG2, n=9; fRAG vs cRAG1, P<0.0001, fRAG vs cRAG2, P=0.1352). (C) The spleen cell numbers of fRAG, cRAG1 and cRAG2 leukemic mice (fRAG, n=7, cRAG1, n=8, cRAG2, n=13; fRAG vs cRAG1, P=0.0047, fRAG vs cRAG2, P=0.0180). (D) The percentage of GFP+ cells in peripheral blood (PB, fRAG, n=6, cRAG1, n=6, cRAG2, n=6; fRAG vs cRAG1, P=0.0003, fRAG vs cRAG2, P=0.0035), bone marrow (BM, fRAG, n=5, cRAG1, n=5, cRAG2, n=6; fRAG vs cRAG1, P=0.0341, fRAG vs cRAG2, P=0.0008), and spleen (SP, fRAG, n=9, cRAG1, n=4, cRAG2, n=9; fRAG vs cRAG1, P=0.0016, fRAG vs cRAG2, P<0.0001) of fRAG, cRAG1 and cRAG2 leukemic mice. (E) Representative flow cytometry plots of cell cycle arrest of leukemic cells in fRAG, cRAG1 and cRAG2 mice. In the graph, the percentages of each phase of the cell cycle are summarized below (fRAG, n=3, cRAG1, n=5, cRAG2, n=5; G0/G1, fRAG vs cRAG1, P=0.0082, fRAG vs cRAG2, P=0.0279; S, fRAG vs cRAG1, P=0.0146, fRAG vs cRAG2, P=0.0370; G2/M, fRAG vs cRAG1, P=0.0134, fRAG vs cRAG2, P=0.1507). In figures B, C, D and J, error bars represent the mean ± s.d., P values were calculated by Student’s t test and *P < 0.05, **P < 0.01, ***P < 0.001.

The loss of non-core RAG regions corresponds to a less mature cell surface phenotype but does not impede IgH VDJ recombination

To identify the B cell developmental stages from which the accumulated leukemic B cells originated, we stained single cells with B cell-specific surface markers and analyzed the samples by flow cytometry. We observed that 91%-98% of GFP+ cells from cRAG mice were CD19+BP-1+B220+CD43+, suggesting that the majority of leukemic cells belonged to the large pre-B cell stage. However, in fRAG leukemic mice, there were 65% large pre-B (GFP+CD19+BP-1+B220+CD43+) and 35% small pre-B cells (GFP+CD19+BP-1+B220+CD43-) (Fig. 2A). The lineage results were in accordance with the immunoglobulin μ heavy chain (μHC) expression. Specifically, 5% of fRAG leukemic cells exhibited μHC expression, while cRAG leukemic cells lacked μHC expression, indicating a deficiency in the pre-BCR checkpoint (Fig. 2B). These results suggest that leukemic cells derived from cRAG mice arrest at an early B cell developmental stage. Typically, IgH rearrangement initiates with D-J joining in pro-B cells, followed by V-DJ joining in large pre-B cells, and ultimately, V-J rearrangements occur at the IgL loci in small pre-B cells. Genomic PCR of DNA from GFP+CD19+ cells was used to investigate VDJ rearrangement. Notably, cRAG leukemic cells exhibited a high degree of oligoclonality, as all tumors analyzed consistently displayed the rearrangement of a limited number of VH family members. In contrast, the fRAG leukemias exhibited a significant degree of polyclonality, as confirmed by the recurrent rearrangement of multiple VH family members investigated, each rearranged to all possible JH1-3 segments (Fig. S4A, B). This finding is consistent with the more aggressive disease phenotype observed in cRAG BCR-ABL1+ B-ALL. The progression of BCR-ABL-induced leukemia in cRAG mice necessitates the selection of secondary oncogenic events, ultimately leading to the emergence of one or a few dominant leukemic clones. Loss of the non-core RAG region thus results in fewer leukemic clones, which generate oligoclonal tumors.

The non-core RAG region loss corresponds to a less mature cell surface phenotype (A) Flow cytometry analysis of the B cell markers CD19, BP-1, B220, and CD43 on BCR-ABL1-transformed fRAG, cRAG1 and cRAG2 leukemic bone marrow cells. The percentages of each phase of the B cell stage are summarized in the bottom graph (fRAG, n=9, cRAG1, n=4, cRAG2, n=9; Large-preB, fRAG vs cRAG1, P=0.0349, fRAG vs cRAG2, P=0.0017; Small-pre-B, fRAG vs cRAG1, P=0.0141, fRAG vs cRAG2, P=0.0005). The expression of the cytoplasmic μ chain was analyzed by flow cytometry. Representative samples are shown in (B), and the results from multiple samples analyzed in independent experiments are summarized in the bottom graph as the fraction of cells expressing cytoplasmic factors (fRAG, n=11, cRAG1, n=8, cRAG2, n=8; fRAG vs cRAG1, P=0.3020, fRAG vs cRAG2, P=0.2267). Error bars represent the mean ± s.d., P values were calculated by Student’s t test and *P < 0.05, **P < 0.01, ***P < 0.001.

The loss of non-core RAG regions highlights genomic DNA damage

The aforementioned findings indicate that leukemic cells derived from the three kinds of mice were arrested at the large pre-B stage to varying extents rather than developing uniformly. During this stage, typical B cells show reduced RAG1 expression to accommodate DNA replication and rapid cell proliferation. The non-core RAG regions contain the RING finger domain and the T490 residue, which are responsible for regulating RAG degradation in B cell development. Therefore, it is imperative to investigate the potential consequences of non-core region deletion on RAG expression and function in these leukemic cells. Western blotting was performed to address this question. The results revealed that RAG1 (cRAG1) and RAG2 (cRAG2) were expressed in GFP+CD19+ splenic leukemic cells derived from BCR-ABL1+ B-ALL mice with varying genetic backgrounds (Fig. 3A). Notably, upregulation of the RAG1 (cRAG1) protein was observed in cRAG1 leukemic cells compared to fRAG (Fig. 3A, Fig. S5A). The in vitro V(D)J recombination assay confirmed that RSS-mediated rearrangements occurred in leukemic cells. This finding suggests that the different forms of RAG exhibited cleavage activity (Fig. 3B and Fig. S5B).

The non-core RAG region loss highlights genomic DNA damage (A) Western blotting analysis showed RAG1 and RAG2 expression in GFP+CD19+ leukemic cells originating from BCR-ABL1+ B-ALL in different genetic backgrounds. (B) Rearrangement substrate retrovirus was transduced into leukemic cells. Flow cytometry was used to analyze the percentage of CD90.1 and hCD4 positive cells, and the percentage populations are shown in the bottom graph (fRAG, n=3, cRAG1, n=3, cRAG2, n=3; fRAG vs cRAG1, P=0.0002, fRAG vs cRAG2, P=0.5865). (C) Flow cytometry analysis of γ-H2AX levels in fRAG, cRAG1 and cRAG2 leukemic cells and the percentage of γ-H2AX-positive cell populations shown in the bottom graph (fRAG, n=11, cRAG1, n=8, cRAG2, n=8; fRAG vs cRAG1, P=0.0505, fRAG vs cRAG2, P=0.0094). Error bars represent the mean ± s.d., P values were calculated by Student’s t test and *P < 0.05, **P < 0.01, ***P < 0.001.

To investigate the potential correlation between aberrant RAG activities and increased DNA double-strand breaks (DSBs), we evaluated the levels of phosphorylated H2AX (γ-H2AX), a DSB response factor, in fRAG, cRAG1, and cRAG2 leukemic cells (gated on GFP+). This served as a measure of both DNA DSBs and global genomic instability. Our flow cytometry findings revealed that cRAG leukemic cells had increased γ-H2AX compared to the fRAG compartment (Fig. 3C). This suggests that cRAG plays a more significant role in mediating somatic structural variants in BCR-ABL1+ B lymphocytes. These results indicate that the stalled B-precursors exhibit high expression of RAG endonucleases and increased DNA damage.

Off-target recombination mediated by RAG in BCR-ABL1+ B lymphocytes

Genome-wide sequencing and analysis were performed to compare somatic structural variants (SVs) in BCR-ABL1+ B lymphocytes derived from fRAG, cRAG1, and cRAG2 mice. The leukemic cells were sequenced with an average coverage of 25× (Table S1). The SVs generated by RAG were screened based on two criteria: the presence of a CAC to the right (or GTG to the left) of both breakpoints, and its occurrence within 21 bp of the breakpoint (Mijušković, et al., 2015). Further elaboration on these criteria can be found in Supplementary Figure 6. Five validated abnormal rearrangements at Ig loci in cRAG leukemic mice comprised aberrant V-to-V junctions and V-to-intergenic junctions (Table S2). Additionally, seven samples had 24 somatic structural variations, with an average of 3.4 coding region mutations per sample (range of 0-9), which is consistent with the limited number of acquired somatic mutations observed in hematological cancers and childhood malignancies (Fig. 4 and Table S3). fRAG cells had few SVs (0-1 per sample), cRAG1 cells exhibited more SVs (6-9 per sample), and cRAG2 cells had an intermediate SV incidence (1-4 per sample) (Fig. 4, Table S3). These findings suggest that cRAG may lead to elevated off-target recombination, eventually posing a threat to the genome of BCR-ABL1+ B lymphocytes.
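The two screening criteria above can be sketched as a small filter. This is an assumption-laden illustration: the window orientation and the coordinate convention (0-based breakpoint index into a reference string) are guesses made for the example, the exact definition is given in Supplementary Figure 6, and the function names are hypothetical.

```python
def has_cac_near_breakpoint(seq, breakpoint, window=21):
    """Check one breakpoint for a CAC to its right or a GTG to its left.

    Assumption: `breakpoint` is a 0-based index into `seq`, and the motif
    must lie entirely inside the `window` bases flanking the breakpoint.
    """
    seq = seq.upper()
    right_flank = seq[breakpoint:breakpoint + window]
    left_flank = seq[max(0, breakpoint - window):breakpoint]
    return "CAC" in right_flank or "GTG" in left_flank


def is_rag_candidate(seq_a, bp_a, seq_b, bp_b, window=21):
    """An SV is kept only if both of its breakpoints pass the check."""
    return (has_cac_near_breakpoint(seq_a, bp_a, window)
            and has_cac_near_breakpoint(seq_b, bp_b, window))


# Toy sequences, for illustration only.
with_cac = "T" * 30 + "CAC" + "T" * 30
with_gtg = "T" * 10 + "GTG" + "T" * 30
print(is_rag_candidate(with_cac, 20, with_gtg, 20))   # both flanks qualify
print(is_rag_candidate(with_cac, 5, with_gtg, 20))    # first breakpoint fails
```

Because a CAC trinucleotide is common by chance, a filter of this kind is permissive, which is one reason the candidates in the study were further validated by PCR.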

Structural alterations in BCR-ABL1+ B lymphocytes

(A-C) Circos plot representation of all off-target recombination detected in the genome-wide analyses of fRAG, cRAG1 and cRAG2 leukemic cells. See also Table S3.

Off-target V(D)J recombination characteristics in BCR-ABL1+ B lymphocytes

We further analyzed the characteristics of the identified SVs. Specifically, we assessed the exon-intron distribution profiles of the 42 breakpoints generated by the 24 SVs. The results indicated that while 57% of the breakpoints were situated in the gene body, 43% were located within flanking sequences, the majority of which were identified as transcriptional regulatory sequences (Figure 5A). P and N nucleotides are recognized as distinctive characteristics of V(D)J recombination. Consequently, the length of P and N nucleotides remains consistent during RSS-to-RSS and cRSS-to-cRSS recombination.

Overview and characteristics of off-target recombination in BCR-ABL1+ B-ALL leukemic cells from fRAG and cRAG mice (A) Exon-intron distribution profiles of 42 breakpoints generated by 24 SVs. Gene body includes exon (n = 9; 17.3%) and intron (n = 20; 38.5%). Flanking sequence includes 3’UTR (n = 6; 11.5%), 5’UTR (n = 2; 3.8%), promoter (n = 6; 11.5%), and downstream (n = 9; 17.3%). (B) The off-target recombination events were filtered by whole-genome sequencing and verified by PCR. P nucleotides and N nucleotides of RSS-to-RSS and cRSS-to-cRSS junctions were calculated in BCR-ABL1+ B-ALL. (C) Hybrid joint percentage generated by fRAG, cRAG1 or cRAG2 in BCR-ABL1+ B-ALL. It was 0, 100%, and 93% in fRAG, cRAG1 and cRAG2 leukemic cells, respectively. (D) The 24 off-target recombination genes were searched against the COSMIC Cancer Gene Census (http://cancer.sanger.ac.uk/census/). On average per sample, fRAG leukemic cells had 0.5 genes and 0.5 cancer genes; cRAG1 leukemic cells had 8 genes and 4.5 cancer genes; cRAG2 leukemic cells had 3.3 genes and 0.3 cancer genes.

However, the frequency of P and N sequences was 50%/50% (P/N) in RSS-to-RSS recombination, while it was 4%/8% (P/N) in cRSS-to-cRSS recombination (Fig. 5B). The notably reduced frequency of P and N sequences indicates that DNA repair at off-target sites in BCR-ABL1+ B lymphocytes differs from classical V(D)J recombination repair.

Hybrid joints were especially frequent in cRAG1 and cRAG2 leukemic cells (100% and 93%, respectively), indicating that the non-core regions might be involved in suppressing potentially harmful transposition events (Figure 5C). To ascertain the potential impact of non-core RAG region deletion on the occurrence of oncogenic mutations, a comparative analysis of cancer genes was performed across the three leukemic cell backgrounds. The results revealed that the number of cancer genes affected in cRAG1 leukemic cells was significantly higher than in the other two. This observation aligns with the most aggressive leukemic phenotype and the concurrent alterations in mRNA transcription in cRAG1 BCR-ABL1+ B-ALL mice.

The non-core regions have effects on RAG binding accuracy and off-target recombination size in BCR-ABL1+ B lymphocytes

Sequence logos were used to visually compare the RSSs and cRSSs in Ig loci and non-Ig loci, respectively. The RSS elements in Ig loci exhibited the closest match to the canonical RSS (CACAGTG [12/23 spacer] ACAAAAACC), particularly at functionally important positions. It is noteworthy that the first 5 bases (CACAG) of the perfect heptamer sequence CACAGTG served as the binding motif of fRAG, while the first 4 bases of the heptamer sequence, the CACA tetranucleotide, were identified as the cRAG binding motif in leukemic cells (Fig. 6A). Although both motifs (CACAG and CACA) correspond to the highly conserved portion of the RSS heptamer sequence, variations in the cRSS sequences among off-target genes in fRAG and cRAG mice indicate that the removal of RAG’s non-core region reduces binding precision and increases off-target recombination in BCR-ABL1+ B lymphocytes.

The non-core regions have effects on RAG binding accuracy and recombinant size in BCR-ABL1+ B lymphocytes (A) Sequence logos were used to compare the RSS and cRSS in Ig loci and non-Ig loci. Top panel: V(D)J recombination at Ig locus; the next three panels: RAG-mediated off-target recombination at non-Ig locus from fRAG, cRAG1 and cRAG2 leukemic cells respectively. The scale of recombinant size was categorized into three ranges: <1000bp, 1000-10000bp, and >10000bp. The distribution of different recombinant sizes in fRAG, cRAG1, and cRAG2 leukemic cells is presented in (B), while the number of different recombinant sizes in fRAG, cRAG1, and cRAG2 leukemic cells is displayed in (C). (D) A schematic depiction of the mechanism of cRAG-accelerated off-target V(D)J recombination. Deletion of the non-core region of either RAG1 or RAG2 decreases RAG binding accuracy in cRAG1 and cRAG2 BCR-ABL1+ B-ALL. Additionally, RAG1’s non-core region deletion significantly reduces the size and scale of off-target V(D)J recombination in cRAG1 BCR-ABL1+ B-ALL.

Antigen receptor genes are assembled by large-scale deletions and inversions. Our investigation revealed that 100% and 92% of the off-target recombinations in fRAG and cRAG2 leukemic cells, respectively, exceeded 10000 bp. In contrast, only 6% of the off-target recombinations in cRAG1 leukemic cells exceeded 10000 bp, with 48% being <1000 bp and 46% being 1000-10000 bp (Fig. 6BC). These results indicate that cRAG1 generates smaller off-target recombinations in BCR-ABL1+ B lymphocytes and that the non-core RAG1 region influences the size of off-target recombination. Additionally, the reduced off-target recombination size caused by non-core RAG1 region deletion may account for the higher incidence of off-target V(D)J recombination in cRAG1 leukemic cells (Fig. 6D).
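The three size ranges used above can be illustrated with a minimal Python sketch. This is not the analysis code used in this study; the function names `size_class` and `size_distribution` are hypothetical, and the handling of the exact boundaries (1000 bp and 10000 bp are placed in the middle range) is an assumption, since the text only gives the three ranges.

```python
from collections import Counter

# Labels follow the three ranges given in the text.
SIZE_LABELS = ("<1000bp", "1000-10000bp", ">10000bp")


def size_class(length_bp: int) -> str:
    """Assign a recombinant length (in bp) to one of the three size ranges.

    Assumption: lengths of exactly 1000 bp or 10000 bp fall in the middle range.
    """
    if length_bp < 1000:
        return "<1000bp"
    if length_bp <= 10000:
        return "1000-10000bp"
    return ">10000bp"


def size_distribution(lengths_bp):
    """Return the percentage of recombinants in each size range."""
    counts = Counter(size_class(n) for n in lengths_bp)
    total = sum(counts.values())
    if total == 0:
        return {label: 0.0 for label in SIZE_LABELS}
    return {label: 100.0 * counts[label] / total for label in SIZE_LABELS}
```

Applied to the list of recombinant lengths from one genotype, `size_distribution` yields the kind of percentage breakdown reported in Fig. 6B, and the underlying counts correspond to Fig. 6C.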

Discussion

In this study, we have demonstrated that non-core region deletion of both RAG1 and RAG2 leads to accelerated development of leukemia and increased off-target V(D)J recombination in BCR-ABL1+ B lymphocytes. Furthermore, we report reduced binding accuracy of cRAG and reduced off-target recombination size in cRAG1 leukemic cells, which might contribute to the exacerbated off-target V(D)J recombination in cRAG BCR-ABL1+ B lymphocytes. These findings suggest that the non-core regions, particularly the non-core region of RAG1, play a crucial role in maintaining the accuracy of V(D)J recombination and genomic stability in BCR-ABL1+ B lymphocytes.

Our observations indicate that cRAG leukemic cells exhibit heightened production of hybrid joints, and that the non-core RAG regions might suppress hybrid joint generation in vivo. Post-cleavage synaptic complexes (PSCs) consist of the RAGs, coding ends, and RSS ends (Fugmann, et al., 2000; Libri, et al., 2021). It is likely that RAG evolution has resulted in the formation of PSCs with optimal conformation and/or stability for standard coding and RSS end-joining. Conversely, cRAG PSCs may facilitate RAG-mediated hybrid joints by bringing coding and RSS ends into close proximity or by increasing PSC stability. Furthermore, fRAGs may recruit disassembly/remodeling factors to PSCs, a process that may facilitate the involvement of NHEJ factors in the completion of the normal reaction (Fugmann, et al., 2000). Within this context, cRAGs might exhibit reduced recruitment capacity due to alterations in overall conformation or the absence of specific motifs, resulting in the formation of more stable PSCs and an increased potential for the accumulation of incomplete hybrid joints. However, our results showed that over 90% of junctions were hybrid joints in cRAG leukemic cells, surpassing the frequency previously reported in the literature (Raghavan, et al., 2006; Talukder, et al., 2004). Studies have indicated that deficiency in non-homologous end joining (NHEJ) may lead to chromosomal instability and lymphomagenesis (Wiegmans, et al., 2021; Gaymes, et al., 2002; Rassool, 2003; Scully, et al., 2019). Notably, our findings revealed significant variations in the NHEJ repair pathway among leukemic cells with different genetic backgrounds, suggesting aberrant expression of DNA repair pathways in cRAG leukemic cells (Fig. S3B). This observation suggests that cRAGs may generate elevated levels of hybrid joints, particularly in the absence of a competing normal pathway for efficient formation of coding and RSS joints in an NHEJ-aberrant background.

The off-target V(D)J recombination process mediated by RAG has the potential to generate oncogenic rearrangements (Mijušković, et al., 2015; Greaves, 2018; Thomson, et al., 2020). Our study reveals that deletion of the non-core RAG regions increased the number of oncogenic genes affected by off-target recombination, particularly in cRAG1 leukemic cells. These genes have the capacity to influence cell proliferation, differentiation, or survival. In line with this, the leukemic cells exhibit augmented proliferation and reduced apoptosis with alterations in the cell cycle pathway, particularly in cRAG1 leukemic cells (Fig. S3DF). CDKN2B is susceptible to recurrent breakage in cRAG1 leukemic cells, and its function in impeding proliferation and enhancing apoptosis of leukemic cells is attributed to its capacity to block CDK6 (Lopes-Ventura, et al., 2019; Suzuki, et al., 1995). CDKN2B reduction and CDK6 elevation in cRAG1 leukemic cells have been verified (Fig. S7AB), suggesting that off-target V(D)J recombination generates known or suspected oncogenic mutations. Nevertheless, the degree to which RAG-mediated oncogenic recombination contributes to leukemia requires further examination.

In human ETV6-RUNX1 ALL, the ETV6-RUNX1 fusion gene is believed to arise prenatally, yet the disease remains clinically latent until critical secondary events occur, leading to leukemic transformation (“pre-leukemia to leukemia”) (Mori, et al., 2002; Bateman, et al., 2010; Bhojwani, et al., 2012). Genomic rearrangement, mediated by aberrant RAG recombinase activity, is a frequent driver of these secondary events in ETV6-RUNX1 ALL (Papaemmanuil, et al., 2014). Similarly, RAG-mediated off-target V(D)J recombination is observed in BCR-ABL1+ B-ALL. These oncogenic structural variations can also be considered secondary events that promote the transition from “leukemia to aggressive leukemia”. The enhancement of BCR-ABL1+ B-ALL deterioration and progression by cRAG in the mouse model is consistent with our previous finding that RAG enhances BCR-ABL1-positive leukemic cell growth through its endonuclease activity. Additionally, we showed that non-core RAG1 region deletion leads to increased cRAG1 expression and that high RAG expression is associated with low survival in pediatric acute lymphoid leukemia (Fig. 3A and Fig. S8). Therefore, more attention should be paid to non-core RAG region mutations in BCR-ABL1+ B-ALL, given the role of the non-core region in leukemia suppression and in limiting off-target V(D)J recombination.

Disclosure of Potential Conflicts of Interest

The authors declare no potential conflicts of interest.

Authors’ Contributions

Yanghong Ji: Conceptualization, resources, data curation, funding acquisition, validation, writing-review, and editing. Xiaozhuo Yu and Wen Zhou: Conceptualization, validation, visualization, methodology, writing-original draft, writing-review, and editing. Xiaodong Chen: validation, writing-review, and editing. Shunyu He: methodology, writing-review, and editing. Mengting Qin: writing-review, and editing. Meng Yuan: validation, writing-review, and editing. Yang Wang: validation, writing-review, and editing. Woodvine Otieno Odhiambo: writing-review, and editing. YinSha Miao: funding, validation, writing-review, and editing.

Acknowledgements

This study was supported by grants (no. 31170821, no. 31370874 and no. 81670157) from the National Natural Science Foundation of China and by a grant (no. 2016JZ030) from the Natural Science Foundation of Shaanxi. The authors would like to thank Professor Shaoguang Li from the Division of Hematology/Oncology, University of Massachusetts Medical School, for providing the MSCV-BCR-ABL1-IRES-GFP construct. The authors would also like to thank Mr. Xiaofei Wang (Xi’an Jiaotong University Health Science Centre) for providing expert technical assistance with cell sorting.

Supplementary Figure 1. Construction of fRAG, cRAG1 and cRAG2 BCR-ABL1+ B-ALL mouse models using bone marrow transplantation (BMT). To establish the BCR-ABL1+ B-ALL mouse models, lethally irradiated fRAG, cRAG1 and cRAG2 syngeneic recipient mice were transplanted with corresponding donor bone marrow cells transduced with MSCV-BCR-ABL1-IRES-GFP or MSCV-GFP retroviral supernatants. (A) Gross appearance of the spleen in fRAG, cRAG1, and cRAG2 leukemic mice and corresponding control mice. (B) Peripheral blood (PB) and bone marrow (BM) lymphoblastic cells were stained with Wright-Giemsa. The scale bars represent 10 µm. (C) Bone marrow cells from fRAG, cRAG1 and cRAG2 leukemic mice were examined by flow cytometry for the expression of GFP and CD19.

Supplementary Figure 2. Biological behavior of leukemia in the fRAG, cRAG1 and cRAG2 BCR-ABL1+ B-ALL mouse models. (A) BCR-ABL1 expression in GFP+CD19+ leukemic cells was determined by western blot. GAPDH protein was used as a loading control. The K562 and 293T cell lines served as the positive and negative controls, respectively. (B) Survival in the secondary transplant setting. Leukemia cells from primary recipients were recovered from the spleens and purified by GFP+ cell sorting. A total of 10^5, 10^4 and 10^3 GFP+ leukemia cells originating from fRAG, cRAG1 or cRAG2 BCR-ABL1+ B-ALL were transplanted into corresponding nonirradiated immunocompetent syngeneic recipient mice (fRAG, n=3, cRAG1, n=3, cRAG2, n=3; P < 0.0023-0.0299 by Mantel–Cox test). (C) Apoptosis was measured by flow cytometry (Annexin V and 7-AAD). Annexin V+ and 7-AAD- cells were defined as early apoptotic cells, while Annexin V+ and 7-AAD+ cells were defined as late apoptotic cells (fRAG, n=11, cRAG1, n=6, cRAG2, n=9; early apoptotic cells: fRAG vs cRAG1, P=0.0002, fRAG vs cRAG2, P=0.0026; late apoptotic cells: fRAG vs cRAG1, P=0.0026, fRAG vs cRAG2, P<0.0001). Error bars represent the mean ± s.d. P values were calculated by Student’s t test; *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001.

Supplementary Figure 3. The genetic pathways in fRAG, cRAG1, and cRAG2 BCR-ABL1+ lymphocytes. mRNA sequencing was performed in GFP and CD19 double-positive cells. (A) Principal Component Analysis (PCA) showing the distribution of fRAG, cRAG1, and cRAG2 BCR-ABL1+ B-ALL samples. (B) Heatmap of representative differentially expressed genes related to non-homologous end joining repair. The scale ranges from minimum (blue) to medium (yellow) to maximum (red) relative expression. (C) Volcano plot depicting log2 (fold change) (x-axis) and −log10 (p value) (y-axis) for differentially expressed genes (FC > 2, p < 0.05) in GFP+ CD19+ leukemic cells sorted from fRAG and cRAG1 BCR-ABL1+ B-ALL mice; upregulated (red) and downregulated (blue). n = 3 per group. (D) Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis of the differentially expressed genes in cRAG1 BCR-ABL1+ B-ALL. The top 15 pathways that exhibited significant differences are shown. The pathways related to cell proliferation, apoptosis, and differentiation are highlighted in red squares. (E) Volcano plot depicting log2 (fold change) (x-axis) and −log10 (p value) (y-axis) for differentially expressed genes (FC > 2, p < 0.05) in GFP+ CD19+ leukemic cells sorted from fRAG and cRAG2 BCR-ABL1+ B-ALL mice; upregulated (red) and downregulated (blue). n = 3 per group. (F) KEGG analysis of the differentially expressed genes in cRAG2 BCR-ABL1+ B-ALL. The top 15 pathways that exhibited significant differences are shown. The pathways related to cell proliferation, apoptosis, and differentiation are highlighted in red squares.

Supplementary Figure 4. VDJ recombination in leukemic cells with different genetic backgrounds. (A) VDJ recombination was analyzed by genomic PCR using DNA from GFP+CD19+ fRAG, cRAG1 or cRAG2 leukemic cells. Genomic DNA from RAG1-/- bone marrow cells and WT spleen was used as the negative and positive control, respectively.

Supplementary Figure 5. RAG protein expression levels and schematic diagram of the recombinant substrate vector. (A) RAG1/cRAG1 and RAG2 protein levels in fRAG1 and cRAG1 B-ALL cells were compared by western blot and quantified with ImageJ software. Error bars represent the mean ± s.d. The P value was calculated by t test, ***P<0.001, ns P>0.05. (B) The B-ALL cells were subjected to transformation with the recombinant substrate vector. If RAG recombinase is expressed in the leukemic cells, the RSS sequences flanking CD90.1 are cleaved by RAG, thereby facilitating the positioning and expression of both CD90.1 and hCD4. In the absence of RAG expression, only hCD4 is expressed.

Supplementary Figure 6. The criteria for identifying off-target recombination. We adopted the criteria used in previous studies. First, a CAC must exist to the right (or GTG to the left) of both breakpoints, which includes the four RAG-mediated DNA fragmentation cases mentioned above. Second, the CAC must occur within a specified distance from the breakpoint; the CAC distance-to-breakpoint value was set at 21 bp.
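The two criteria can be expressed as a minimal Python sketch. This is an illustration under stated assumptions, not the pipeline used in this study: the function names are hypothetical, a breakpoint is assumed to be a 0-based position between `seq[breakpoint - 1]` and `seq[breakpoint]` on a single reference strand, and the distance is assumed to be measured from the breakpoint to the nearest edge of the CAC (or GTG) trinucleotide.

```python
def has_cac_near_breakpoint(seq: str, breakpoint: int, max_dist: int = 21) -> bool:
    """Check one breakpoint for a CAC to its right or a GTG to its left.

    Assumptions (not taken from the text): `breakpoint` is a 0-based position
    between seq[breakpoint - 1] and seq[breakpoint]; the motif must start
    (CAC) or end (GTG) at most `max_dist` bp away from the breakpoint.
    """
    seq = seq.upper()
    # Window in which a CAC may start 0..max_dist bp to the right.
    right_window = seq[breakpoint:breakpoint + max_dist + 3]
    # Window in which a GTG may end 0..max_dist bp to the left.
    left_window = seq[max(0, breakpoint - max_dist - 3):breakpoint]
    return "CAC" in right_window or "GTG" in left_window


def is_off_target_candidate(seq: str, bp_a: int, bp_b: int, max_dist: int = 21) -> bool:
    """Apply the criterion to both breakpoints of a structural variant."""
    return (has_cac_near_breakpoint(seq, bp_a, max_dist)
            and has_cac_near_breakpoint(seq, bp_b, max_dist))
```

A structural variant is kept as a candidate RAG-mediated event only when both of its breakpoints pass the check; the default of 21 bp mirrors the distance-to-breakpoint value stated above.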

Supplementary Figure 7. CDKN2B and CDK6 mRNA levels in leukemic cells. (AB) Leukemic cells were harvested from fRAG and cRAG1 B-ALL mice with CDKN2B deletion. CDKN2B and CDK6 levels were determined by mRNA sequencing. Error bars represent the mean ± s.d.; fRAG B-ALL mice, n = 3; cRAG1 B-ALL mice, n = 3. *P < 0.05, **P < 0.01.

Supplementary Figure 8. The relationship between RAG1 mRNA levels and survival in pediatric acute lymphoid leukemia. The relationship between RAG1 mRNA levels and survival in pediatric acute lymphoid leukemia was analyzed using cBioPortal (https://www.cbioportal.org/). RAG1 mRNA levels were studied in pediatric patients with ALL at the time of diagnosis. The patients were separated into two groups based on RAG1 mRNA levels (mRNA expression z score relative to diploid samples, RNA sequencing RPKM; RAG1 upregulated group, n=8; RAG1 unaltered group, n=146). The P value was calculated by the log-rank test, P=0.0732.