Introduction

The vast majority of Cas systems explored as genome editors originate from mesophilic hosts. The emergence of the thermophilic GeoCas9, with DNA cleavage function up to 85 °C, can expand CRISPR technology to higher temperature regimes and stabilities,1,2 but its regulatory mechanism relative to canonical Cas9s must be established. SpCas9, which originates from the mesophilic Streptococcus pyogenes, and GeoCas9 are both effectors of Type II CRISPR systems. The Type II-A SpCas9 has been by far the most used Cas enzyme, including in ongoing clinical trials.3,4 However, Cas9 homologs of the Type II-C class, to which GeoCas9 belongs, such as those from Neisseria meningitidis (NmeCas9) and Campylobacter jejuni (CjeCas9), have been validated for mammalian genome editing,2,5,6 reinforcing the need to better understand this CRISPR class.

The similar domain arrangements of GeoCas9 and SpCas9 led us to initially speculate that these could share atomic level mechanistic similarities.7 GeoCas9 utilizes a guide RNA (gRNA) to localize and unwind a double-stranded DNA (dsDNA) target after recognition of its 5’-NNNNCRAA-3’ protospacer adjacent motif (PAM).2,8 Upon recognition of the PAM sequence by the PAM-interacting (PI) domain, the Cas9-bound gRNA forms an RNA:DNA hybrid with the target DNA strand. Initially thought to be part of the PI domain,2 the wedge (WED) domain recognizes the repeat:anti-repeat region of the gRNA and the dsDNA upstream of the target region.9 The Rec lobe of Cas9 is responsible for orienting the RNA:DNA hybrid, as well as the adjacent nuclease domains, into their active conformations.10-13 Coordinated cleavage of the target and non-target DNA strands then occurs via the HNH and RuvC nucleases, respectively. The GeoCas9 nuclease active sites within HNH and RuvC are spatially distinct from the PAM recognition site in the PI domain, necessitating structural and dynamic changes that allosterically couple dsDNA binding to cleavage. Biochemical11,14,15 and structural16,17 experiments using the extensively studied SpCas9 have revealed that its function is governed by a sophisticated allosteric mechanism that transfers gRNA and dsDNA binding information from the Rec lobe to the distal catalytic sites. A dynamically driven allosteric signal spans the HNH domain of SpCas9, enabled by the plasticity of the Rec lobe, which orchestrates the conformational activation required for DNA cleavage.14,18 Our prior work revealed a divergence in the timescales of allosteric motions in the SpCas9 and GeoCas9 HNH domains,7,17 suggesting an unusually flexible HNH and unique allosteric mode of regulation for GeoCas9. It is therefore also possible that docking of the gRNA with GeoCas9, and thus its interaction with the RNA:DNA hybrid, may differ from the SpCas9 system, as GeoCas9 contains a truncated Rec lobe with only two of the three canonical subdomains.

The high thermal stability and more compact size of GeoCas9 (it is 281 residues shorter than SpCas9) can be especially important for in vivo delivery applications, since promising viral vectors (e.g. adeno-associated virus, AAV) have cargo capacities of ∼4.7 kb,19 which prevents SpCas9-gRNA packaging into a single AAV vector but permits “all-in-one” delivery of GeoCas9-sgRNA.12 Until the very recent cryo-EM structures of GeoCas9,20,21 little was known about specific residues that influence its structure, gRNA binding, or function. Our recent NMR work with SpCas9 uncovered pathways of micro-millisecond timescale motions that propagate chemical information related to allostery and specificity through SpRec and its RNA:DNA hybrid,16,17 prompting us to investigate this phenomenon in GeoRec.

An atomic-level structural understanding of specificity in large multi-domain protein-nucleic acid complexes like Cas9 is often difficult to address by NMR spectroscopy. Although dynamic ensembles in DNA repair enzymes have provided some insight,22 many efforts to improve Cas9 specificity and reduce off-target activity have relied on large mutational screens23 or error-prone PCR,24 which are less intuitive. Inter-subunit allosteric communication between the catalytic HNH domain and the Rec lobe is critical to Cas9 specificity, as the binding of off-target DNA sequences at Rec alters HNH dynamics to affect DNA cleavage.8,25,26 To further probe the fundamental role of protein motions in the function and specificity of GeoCas9, as well as the effect of protein-nucleic acid interactions on its structural signatures, we engineered two mutations in GeoRec (K267E and R332A, housed within GeoRec2). We hypothesized that these variants could enhance GeoCas9 specificity (i.e. limit its off-target cleavage) for two reasons. First, the chosen mutation sites are homologous to those of specificity-enhancing variants of SpCas9.24,27 Second, altered Cas9-gRNA interactions have been shown to be a consequence of specificity enhancement, and these charged residues appear to directly interact with the gRNA.10,11,15,28 Balancing these two points is the fact that Type II Cas systems generally have conserved nuclease domains, but are delineated by highly varied Rec lobes.12 This implies that the structural and dynamic properties of Rec may play an outsized role in differentiating the functions of SpCas9 and GeoCas9, which may not be identical. Nevertheless, our work provides new insight into the biophysical, biochemical, and functional role of the GeoRec lobe and how mutations modulate the domain itself and its interaction with gRNA in full-length GeoCas9.

Results

The structural similarity of GeoRec1, GeoRec2, and GeoRec facilitates NMR analysis of protein dynamics and RNA affinity

GeoCas9 is a 1087 amino acid polypeptide, thus we employed a “divide and conquer” approach for NMR studies, which we previously showed to be useful for quantifying allosteric structure and motion in SpCas9.16,17,29,30 The GeoRec lobe comprises the subdomains GeoRec1 and GeoRec2, which likely work together to recognize nucleic acids. We engineered constructs of the GeoRec1 (136 residues, 16 kDa) and GeoRec2 (212 residues, 25 kDa) subdomains and solved the X-ray crystal structure of GeoRec2 at 1.49 Å, which aligns remarkably well with the structure of the GeoRec2 domain within the AlphaFold model (RMSD 1.03 Å) and the new cryo-EM structure of GeoCas9 (RMSD 1.10 Å, Figure 1A). We were unable to crystallize either GeoRec1 or full-length GeoCas9 in the apo state, but our GeoRec2 crystal structure represents the structure of the subdomain within the full-length GeoCas9 protein quite well. Our previous studies of GeoHNH also show identical superpositions of X-ray crystal structures with full-length Cas complexes.7 In addition to the individual subdomains, we also generated an NMR construct of the intact GeoRec (370 residues, 43 kDa).

(A) Arrangement of GeoCas9 domains across the primary sequence. The cryo-EM structure of GeoCas9 in complex with gRNA (PDB: 8JTR) shows poor resolution of HNH. The GeoRec2 domain from PDB: 8JTR (gray) is overlaid with our X-ray structure of GeoRec2 (red, PDB: 9B72). (B) 1H-15N TROSY HSQC NMR spectrum of GeoRec collected at 850 MHz. Overlays of this spectrum with resonances from spectra of GeoRec1 (black) and GeoRec2 (blue) demonstrate a structural similarity between the isolated subdomains and intact GeoRec.

Despite only 22% sequence identity, the structures of SpRec3 and GeoRec2 are highly similar (RMSD 2.00 Å, Figure S1). The structure of GeoRec1, in contrast, does not align perfectly with SpRec1; instead, it partially aligns with both SpRec1 and SpRec2 (Figure S1). Thus, the nearly identical SpRec3 and GeoRec2 architectures and their intrinsic dynamics may be a common thread among Type II Cas9s of different size and PAM preference. To capture atomic-level signatures of GeoRec, we obtained well-resolved 1H-15N NMR fingerprint spectra for all three protein constructs and assigned the amide backbones (Figure S2). 1H-15N amide and 1H-13CH3 Ile, Leu, and Val (ILV)-methyl NMR spectra (Figure 1B, S3) of GeoRec overlay very well with those of its individual subdomains, suggesting that the linkage of subdomains within the full-length GeoRec polypeptide does not alter their individual folds. Consistent with this observation, circular dichroism (CD) thermal unfolding profiles of GeoRec1 (Tm ∼ 34 °C) and GeoRec2 (Tm = 61.50 °C) are distinct and occur as separate events in the unfolding profile of GeoRec (Figure S4). The dumbbell shape of GeoRec, with its two globular subdomains connected by a short flexible linker, is a likely contributor to these biophysical properties.

Rationally designed GeoRec2 mutants do not substantially impact the GeoRec structure

To understand how the structure and gRNA interactions of GeoCas9 can be modulated at the level of GeoRec, we engineered two charge-altering point mutants in the GeoRec2 subdomain, K267E and R332A. Based on the AlphaFold2 model of GeoCas9, both of these residues are < 5Å from the bound RNA:DNA hybrid and were predicted to interface with the nucleic acids directly (Figure 2A/B). A new experimental cryo-EM structure of GeoCas9 confirmed the interaction between K267 and the gRNA, but does not report a < 5Å interaction of R332 with the gRNA.31 The rationale for our designed mutations was also that removal of positive charge would weaken the interactions between GeoCas9 and the gRNA, affecting Kd via the electrostatics or dynamics of the GeoRec lobe. Studies of SpCas9 revealed that interaction of SpRec3 (analogous to GeoRec2) with its RNA:DNA hybrid triggers conformational rearrangements that allow the catalytic HNH domain to sample its active conformation.11 Thus, SpRec3 acts as an allosteric effector that recognizes the RNA:DNA hybrid to activate HNH. Mismatches (i.e. off-target DNA sequences) in the target DNA generally prevent SpRec3 from undergoing the full extent of its required conformational rearrangements, leaving HNH in a “proofreading” state with its catalytic residues too far from the DNA cleavage site. Off-target DNA cleavage by Cas9 remains an area of intense study and substantial effort from various groups has gone into mitigating such effects.15,23,28,32,33 Indeed, many high-specificity SpCas9 variants contain mutations within SpRec3 that increase the threshold for its conformational activation, reducing the propensity for HNH to sample its active state in the presence of off-target DNA sequences.11,15,23 Studies of flexibility within Rec itself, as well as its gRNA interactions in the presence of mutations, are therefore essential to connecting biophysical properties to function and specificity in related Cas9s.

(A, B) Sites of selected mutations within GeoRec2, K267 and R332, are highlighted as purple sticks directly facing the RNA and DNA modeled from NmeCas9 (PDB ID: 6JDV), allowing for prediction of the binding orientation within GeoCas9. NMR chemical shift perturbations caused by the K267E (C) or R332A (D) mutations are plotted for each residue of GeoRec. Gray bars denote sites of line broadening, the blue bar denotes an unassigned region of GeoRec corresponding to the native Rec1-Rec2 linker, and the red bar indicates the mutation site. The red dashed line indicates 1.5σ above the 10% trimmed mean of the data. Chemical shift perturbations 1.5σ above the 10% trimmed mean are mapped onto K267E (E) and R332A (F) GeoRec (red spheres). Resonances that have broadened beyond detection are mapped as yellow spheres and the mutation sites are indicated by a black sphere and green arrow.

The K267E GeoRec2 variant is sequentially and structurally similar to a specificity-enhancing site in SpCas9 (K526E), within the evoCas9 system.27 The SpCas9 K526E mutation substantially reduced off-target activity alone, but was even more effective in conjunction with three other single-point mutations in SpRec3.27 The R332A GeoRec2 variant also resembles one mutation within a high-specificity SpCas9 variant, an early iteration of HiFi SpCas9 called HiFi Cas9-R691A.24 We assessed mutation-induced changes to local structure in GeoRec via NMR chemical shift perturbations in 1H-15N HSQC backbone amide spectra. Consistent with experiments using GeoRec2 alone, chemical shift perturbations and line broadening are highly localized to the mutation sites (Figure 2C-F). Perturbation profiles of the GeoRec2 subdomain and intact GeoRec also implicate the same residues as sensitive to the mutations (Figure S5).
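The significance criterion used for the chemical shift perturbation plots (a cutoff set 1.5 standard deviations above the 10% trimmed mean) can be sketched as follows. This is a minimal illustration, not the analysis script used in this work. It assumes that the trimmed mean discards 10% of the points from each tail, that the standard deviation is taken over the trimmed data, and that 1H and 15N shift changes are combined with a 15N weighting of 0.2. None of these choices is specified in the text, and the function names and example values are hypothetical.

```python
import math
import statistics


def combined_csp(delta_h, delta_n, n_weight=0.2):
    """Combined 1H/15N shift change in ppm; n_weight is an assumed 15N scaling."""
    return math.sqrt(delta_h ** 2 + (n_weight * delta_n) ** 2)


def trim_tails(values, fraction=0.10):
    """Drop `fraction` of the points from each end of the sorted data."""
    ordered = sorted(values)
    k = int(len(ordered) * fraction)
    return ordered[k:len(ordered) - k]


def csp_cutoff(csps, fraction=0.10, n_sigma=1.5):
    """Trimmed mean plus n_sigma standard deviations of the trimmed data."""
    core = trim_tails(csps, fraction)
    return statistics.mean(core) + n_sigma * statistics.pstdev(core)


def perturbed_residues(csp_by_residue, fraction=0.10, n_sigma=1.5):
    """Residue numbers whose perturbation exceeds the cutoff, in order."""
    cutoff = csp_cutoff(list(csp_by_residue.values()), fraction, n_sigma)
    return sorted(res for res, csp in csp_by_residue.items() if csp > cutoff)


# Hypothetical values for illustration only (not data from this study).
example = {res: (0.01 if res % 2 else 0.02) for res in range(1, 19)}
example[19] = 0.15
example[20] = 0.30
print(perturbed_residues(example))
```

Trimming before computing the mean and spread keeps a few strongly shifted resonances from inflating the cutoff that is meant to detect them.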

CD spectroscopy revealed that wild-type (WT), K267E, and R332A GeoRec2 maintained similar alpha-helical secondary structure, though the thermostability of both variants was slightly reduced from that of WT GeoRec2 (Figure S6). The Tm of WT GeoRec2 is ∼62 °C, consistent with the Tm of the full-length GeoCas9, while that of K267E GeoRec2 was decreased to ∼55 °C. Though the R332A GeoRec2 Tm remains ∼62 °C, this variant underwent a smaller unfolding event near 40 °C before completely unfolding. These data suggest that despite small structural perturbations, both mutations are destabilizing to GeoRec2, which led us to expect a change in NMR-detectable protein dynamics.

Mutations enhance and redistribute molecular motions within GeoRec2

Due to the high molecular weight of the intact GeoRec lobe, decays in NMR signal associated with spin relaxation experiments were significant and hampered data quality. Thus, we focused on quantifying the molecular motions of the GeoRec2 subdomain, where the K267E and R332A mutations reside, and the chemical shift perturbations are most apparent. To obtain high-quality per-residue information representative of GeoRec, we measured longitudinal (R1) and transverse (R2) relaxation rates and heteronuclear 1H-[15N] NOEs (Figure S7), then used these data in a Model-free analysis of per-residue order parameters (S2). Previous measurements of S2 across the adjacent GeoHNH nuclease revealed substantial ps-ns timescale flexibility,7 leading us to wonder whether a similar observation would be made for GeoRec2, which abuts GeoHNH. Such a finding could suggest that HNH-Rec2 crosstalk in GeoCas9 is driven primarily by rapid bond vector fluctuations. However, unlike GeoHNH, S2 values for GeoRec2 are globally elevated, suggesting that the ps-ns motions of this subdomain arise primarily from global tumbling of the protein in solution. We therefore carried out Carr-Purcell-Meiboom-Gill (CPMG) relaxation dispersion NMR experiments to assess the flexibility of GeoRec2 on slower timescales, which has been linked to chemical information transfer in the well-studied SpCas9.10,16,17,29 Evidence of μs-ms motions (i.e. curved relaxation dispersion profiles) is observed in 17 residues within the GeoRec2 core, spanning its interfaces to Rec1 and HNH (Figure 3B). Such motions are completely absent from GeoHNH, thus two neighboring domains, GeoRec2 and GeoHNH, diverge in their intrinsic flexibility (at least in isolation), raising questions about the functional implications of these motions in GeoRec2. 
We previously showed that heightened flexibility of SpRec3 via specificity-enhancing mutations concomitantly narrowed the conformational space sampled by SpHNH, highlighting a “motional trade-off” between the domains. Manipulation of the flexibility of SpCas9 and GeoCas9 domains by mutagenesis also impacts aspects of nucleic acid binding and cleavage,1,10,14,15,29 which led us to investigate similar perturbations in GeoRec2.

(A) CPMG relaxation dispersion profiles of all residues with evidence of μs-ms motion, fit to a global kex of 147 ± 41 s-1 (WT GeoRec2, left), 376 ± 89 s-1 (K267E GeoRec2, center), and 142 ± 28 s-1 (R332A GeoRec2, right) collected at 25 °C and 600 MHz. Residues are colored in accordance with Table S1. Relaxation dispersion profiles for individual resonances are shown in Figures S8-S10. (B) Sites exhibiting CPMG relaxation dispersion in (A) are mapped to GeoRec as blue spheres. Adjacent domains within the cryo-EM structure of GeoCas9 are also shown. (C) Per-residue NMR order parameters of WT (black), K267E, and R332A (red, separate plots) GeoRec.

Since SpRec3 and GeoRec2 have similar structures and μs-ms flexibility, we speculated that charge-altering mutations would modulate the biophysical properties of GeoRec and the function of GeoCas9, as observed for SpCas9. We investigated K267E and R332A GeoRec2 with NMR spin relaxation, as described for WT GeoRec2 (vide supra). An analysis of chemical exchange rates, kex, derived from dual-field CPMG relaxation dispersion shows a global kex for WT GeoRec2 of 147 ± 41 s-1. The K267E mutation, at a residue that directly contacts the nucleic acids, shifts the globally fitted kex to 376 ± 89 s-1, while the R332A variant maintains a global kex similar to that of WT GeoRec2 (142 ± 28 s-1), consistent with its similar thermal stability. The global fit of the K267E variant is based on CPMG profiles of 33 residues, while that of R332A is derived from 18 residues (Table S1). Interestingly, the residues participating in the global motions of both variants are distinct from those of WT GeoRec2, demonstrating that residue-specific flexibility is redistributed throughout GeoRec2, which suggests an altered intradomain molecular crosstalk within the larger GeoRec. Indeed, perturbation to NMR-detectable motions in SpCas9 rewired its allosteric signaling and enzymatic function.16,29 A similar dynamic modulation of GeoCas9 may fine-tune its DNA cleavage, which has been demonstrated within the GeoHNH nuclease1 and wedge (WED) domains.32 We also assessed the ps-ns fluctuations of GeoRec2 variants (a negligible contribution to the WT GeoRec2 dynamic profile) and calculated order parameters from R1, R2 and 1H-[15N] NOE relaxation measurements (Figure 3C). Bond vector fluctuations on the ps-ns timescale are only locally altered, thus the mutation-induced reshuffling of these motions is negligible (<ΔS2> ≤ 0.1) and suggests that, like WT GeoRec2, ps-ns motion arises primarily from global tumbling in solution.
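To illustrate how an exchange rate shapes a "curved" relaxation dispersion profile, the sketch below evaluates the two-site fast-exchange (Luz-Meiboom) expression for the effective transverse relaxation rate as a function of CPMG frequency. It is not the fitting procedure used here: the text does not state which exchange model was fit, and exchange rates near 150 s-1 may fall outside the regime where the fast-exchange approximation is accurate. The intrinsic rate `r2_0` and the amplitude `phi_ex` (pA·pB·Δω²) are hypothetical values chosen only for display; the kex values are the global fits quoted above.

```python
import math


def r2eff_fast_exchange(nu_cpmg, r2_0, phi_ex, kex):
    """Luz-Meiboom fast-exchange R2,eff (s^-1) at CPMG frequency nu_cpmg (Hz).

    r2_0   : exchange-free transverse relaxation rate (s^-1)
    phi_ex : pA * pB * delta_omega^2 (rad^2 s^-2)
    kex    : exchange rate constant (s^-1)
    """
    x = kex / (4.0 * nu_cpmg)
    return r2_0 + (phi_ex / kex) * (1.0 - math.tanh(x) / x)


# kex values are the global fits quoted in the text; r2_0 and phi_ex are
# hypothetical and serve only to draw the shape of the curve.
frequencies = [50, 100, 200, 400, 800, 1000]
for label, kex in [("WT", 147.0), ("K267E", 376.0)]:
    profile = [r2eff_fast_exchange(nu, 15.0, 2000.0, kex) for nu in frequencies]
    print(label, [round(value, 2) for value in profile])
```

In this model the exchange contribution is largest at low CPMG frequency and is refocused as the pulsing rate increases, so a residue without μs-ms exchange gives a flat profile at `r2_0`.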

Mutations within GeoRec alter its affinity for guide RNA

The role of the Rec lobe in orienting the RNA:DNA hybrid within Cas9 is crucial to its function.10-13 Thus, the structure, motions, and nucleic acid interactions of Rec represent a critical piece of the Cas9 signaling machinery. Previous studies of SpCas9 revealed that gRNA binding to the Rec lobe induces a global structural rearrangement of the protein that positions the adjacent HNH into its “proofreading” state,15 after which target DNA binding positions the nucleases into active conformations for cleavage.15,34 We wondered if the atomistic details of the apo GeoCas9-to-RNP transition could be captured by NMR using the GeoRec construct. In our previous studies, we used an in vitro DNA cleavage assay with GeoCas9 and a 141nt gRNA containing a 21nt spacer targeting the mouse Tnnt2 gene locus.1 Since this assay was already established, we utilized the same gRNA sequence. However, truncating this gRNA was necessary to optimize binding studies for NMR analysis. We focused on the 5’ end of the gRNA, which includes the spacer sequence, based on the GeoCas9 AlphaFold2 model and structural data from NmeCas9 and SpCas9 showing interactions between the Rec lobe and this region of the gRNA. The subsequent cryo-EM structure of GeoCas9 corroborated this interaction.20 Initial attempts using a truncated 101nt gRNA resulted in poor NMR spectra. An overlay of the 1H-15N HSQC NMR spectra of apo GeoRec and GeoRec-RNP at a 1:1 molar ratio showed extensive line broadening (Figure S11), likely due to the large size of the complex (75.5 kDa). To mitigate this issue, a 39nt gRNA containing the 21nt spacer sequence was selected for its ability to maintain the NMR signal while being long enough to interact fully with the Rec lobe, as suggested by prior structures. When bound to GeoRec, this complex is 55.6 kDa and a 1H-15N NMR spectral overlay of apo GeoRec and the domain bound to 39nt gRNA shows clear, resolved resonances with significant chemical shift perturbations and line broadening (Figure 4A/B, S11). The strongest chemical shift perturbations are localized to the GeoRec2 subdomain that interfaces with the RNA:DNA hybrid at the PAM distal end, where previous studies of specificity-enhancing variants of SpCas9 have identified alterations in nucleic acid binding to SpRec3.16 It is not known whether specific residues at the PAM distal binding interface of GeoRec2 play a similar role. Line broadening is evident in both GeoRec1 and GeoRec2, primarily localized to the RNA:DNA hybrid interface revealed in recent GeoCas9 structures. Microscale thermophoresis (MST) experiments quantified the affinity of GeoRec for gRNA, producing a Kd = 3.3 ± 1.5 µM that is consistent with the concentration-dependent NMR chemical shift perturbations (Figure 5A).

(A) NMR chemical shift perturbations caused by gRNA binding to WT, K267E, and R332A GeoRec. Gray bars denote sites of line broadening, and the blue bar denotes an unassigned region of GeoRec corresponding to the flexible Rec1-Rec2 linker. The red dashed line indicates 1.5σ above the 10% trimmed mean of the data. (B) Representative NMR resonance shifts caused by titration of 39nt gRNA into WT GeoRec. (C) NMR titration of 39nt gRNA into K267E (top) and R332A (bottom) GeoRec. The left panel of each pair demonstrates that minimal change in NMR chemical shift or resonance intensity is apparent at gRNA concentrations mimicking the WT titration. The right panel of each pair depicts the titration over a three-fold wider concentration range of gRNA, where shifts and line broadening are visible. Representative resonances are colored by increasing gRNA concentration in the legend.

Representative MST-derived profiles of WT (A), K267E, and R332A (B) GeoRec binding to a Cy5-labeled 39nt gRNA, yielding Kd = 3.3 ± 1.5 µM, Kd = 7.2 ± 1.0 µM and Kd = 7.2 ± 1.5 µM, respectively. Bar graphs comparing Kd values across n ≥ 3 replicate samples are shown for Tnnt2 gRNA (C) and 8UZA gRNA from a recent cryo-EM structure (D). *p < 0.05, **p < 0.004

To understand how the K267E and R332A mutants impact gRNA binding to GeoRec, we conducted gRNA titration experiments via NMR and observed that chemical shift perturbations were attenuated in both variants, relative to WT GeoRec. Despite this muted structural effect, the impact from gRNA-induced line broadening remains substantial in the GeoRec1 subdomain. Our NMR data revealed that a three-fold greater concentration of gRNA was required to induce the maximal structural and dynamic effects in the variants than is required for WT GeoRec (Figure 4A/C), suggesting that the variants have a reduced gRNA affinity. MST experiments showed statistically significant reductions in gRNA affinity for the K267E and R332A constructs, relative to WT GeoRec, where K267E GeoRec produced a Kd = 7.2 ± 1.0 µM and R332A GeoRec produced a Kd = 7.2 ± 1.5 µM (Figure 5B). The ∼2-fold increase in Kd (i.e. weaker binding) may also be due, in part, to a change in the binding mode of the gRNA, such as a faster koff. Collectively, these data reveal that mutations within GeoRec primarily alter its structure around the mutation site with weaker distal effects, but more significantly impact protein dynamics and, in turn, the gRNA interaction. NMR experiments also demonstrate that the presence of gRNA impacts both subdomains of GeoRec, providing a significant structural interface for additional molecular tuning of nucleic acid binding.
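To put the measured dissociation constants in concrete terms, the following sketch computes the fraction of labeled gRNA bound under a simple 1:1 binding model that accounts for depletion of free protein. It is an illustration under stated assumptions rather than the MST fitting routine: the 1:1 stoichiometry, the labeled gRNA concentration (`RNA_TOTAL_UM`), and the protein concentrations are hypothetical choices, while the Kd values are those reported above.

```python
import math


def complex_conc(p_total, l_total, kd):
    """Concentration of 1:1 complex from total concentrations (same units as kd)."""
    b = p_total + l_total + kd
    return (b - math.sqrt(b * b - 4.0 * p_total * l_total)) / 2.0


def fraction_rna_bound(p_total, rna_total, kd):
    """Fraction of the labeled gRNA that is in complex with protein."""
    return complex_conc(p_total, rna_total, kd) / rna_total


KD_UM = {"WT": 3.3, "K267E": 7.2, "R332A": 7.2}  # reported Kd values, µM
RNA_TOTAL_UM = 0.02  # hypothetical labeled gRNA concentration, µM

for protein_um in (1.0, 3.3, 7.2, 20.0):  # hypothetical protein concentrations
    row = {
        name: round(fraction_rna_bound(protein_um, RNA_TOTAL_UM, kd), 2)
        for name, kd in KD_UM.items()
    }
    print(f"{protein_um:5.1f} µM protein: {row}")

print("Kd ratio (variant / WT):", round(KD_UM["K267E"] / KD_UM["WT"], 1))
```

When the labeled gRNA is far below Kd, half of it is bound at a protein concentration equal to Kd, so a larger Kd shifts the whole binding curve to higher protein concentrations.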

To investigate the impact of gRNA binding on GeoRec2 in greater detail, we conducted NMR titration experiments using the isolated domain, which yielded even clearer NMR spectra. Figure S12 shows NMR spectra of WT, K267E, and R332A GeoRec2 overlaid with their corresponding gRNA-bound spectra (39nt Tnnt2 RNA). At protein:RNA molar ratios used for full-length GeoRec studies, the WT GeoRec2 spectrum exhibited significant line broadening across the GeoRec2 sequence. Plots of NMR peak intensities (Ibound/Ifree) show substantial resonance intensity losses (Figure S12), with many residues likely in the intermediate exchange regime, in addition to the assumed changes in rotational correlation of the domain. In comparison, spectra of the gRNA-bound K267E and R332A GeoRec2 variants showed less pronounced signal decay at the same levels of titrant, retaining nearly double the Ibound/Ifree ratio across these spectra (Figure S12). These data are consistent with the results of NMR experiments with the 43 kDa GeoRec, supporting the premise that GeoRec2 mutations weaken its interaction with gRNA. The less crowded NMR spectrum of isolated GeoRec2 facilitated the resolution of distinct structural features that explain the impact of the mutations on gRNA binding (Figure S9). For example, residue I53 adopts a similar conformation in gRNA-bound WT and K267E GeoRec2 but assumes a different structural state in R332A. Conversely, residue R25 populates a WT-like structure in gRNA-bound R332A GeoRec2, unlike K267E. Additionally, two resonances are observed for residue K71 in the gRNA-bound R332A NMR spectrum, indicating real-time equilibration between two structural states. This effect is unique to the R332A variant and underscores subtle structural and dynamic changes to GeoRec during gRNA binding.

We further examined the NMR data to attempt to identify residues most critical for gRNA binding to GeoRec. In an overlay of the WT and mutant gRNA-induced chemical shift perturbations (Δδ, Figure S13), it became clear that the effect of gRNA binding to GeoRec variants was muted, where even at saturating concentrations, the chemical shift perturbations across the K267E and R332A GeoRec2 sequences were weaker than those of the same residues in WT GeoRec. The residual Δδ (WT - mutant) was plotted (Figure S13), where positive values indicate that residues in a GeoRec variant are weakly affected by gRNA, relative to WT. Negative residual Δδ values denote sites where GeoRec variants experience a greater structural impact from gRNA than corresponding sites in WT. Of particular interest are the positive residuals that hint at the sites in GeoRec most critical for tight gRNA binding. These residues were mapped onto the GeoRec structure (Figure S13) and termed allosteric hotspots, as many are not at the RNA interface. Mutation of these hotspots in future studies offers a potential means of precisely tuning the affinity of GeoRec for gRNA. Notably, residues with positive residual Δδ (suggested as critical for tight RNA binding) largely overlap in the analysis of both variants. Specifically, residues F170, R192, H264, R269, L270, L279, H300, D301, E368, D376, D403, E405, E408, and I429 appear as allosteric hotspots (with CPMG relaxation dispersion) critical to WT-like gRNA interaction.
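The residual analysis described above amounts to a per-residue subtraction followed by a sign test, which can be sketched as follows. The function names, the placeholder residue indices, and the Δδ values are hypothetical and serve only to show the bookkeeping; any significance threshold applied to the residuals in the actual analysis is not stated in the text, so the tolerance `tol` is an assumption.

```python
def residual_csp(wt, mutant):
    """Residual (WT - mutant) for residues observed in both data sets."""
    return {res: wt[res] - mutant[res] for res in wt.keys() & mutant.keys()}


def split_by_sign(residuals, tol=0.0):
    """Return (weaker_in_variant, stronger_in_variant) residue lists."""
    weaker = sorted(r for r, v in residuals.items() if v > tol)
    stronger = sorted(r for r, v in residuals.items() if v < -tol)
    return weaker, stronger


def shared_hotspots(weaker_a, weaker_b):
    """Residues with positive residuals in both variants."""
    return sorted(set(weaker_a) & set(weaker_b))


# Hypothetical gRNA-induced shift changes (ppm) keyed by placeholder indices.
wt = {1: 0.10, 2: 0.05, 3: 0.02, 4: 0.08}
variant_a = {1: 0.04, 2: 0.05, 3: 0.06, 4: 0.03}
variant_b = {1: 0.02, 2: 0.07, 3: 0.02, 4: 0.05}

weaker_a, stronger_a = split_by_sign(residual_csp(wt, variant_a))
weaker_b, stronger_b = split_by_sign(residual_csp(wt, variant_b))
print("candidate hotspots shared by both variants:", shared_hotspots(weaker_a, weaker_b))
```

Restricting the subtraction to residues present in both data sets avoids treating a resonance that broadened beyond detection in one spectrum as a zero perturbation.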

Having observed a reduced affinity of GeoRec variants for gRNA by NMR and MST, we next quantified the impact of the K267E and R332A mutations on RNP formation and stability in full-length GeoCas9. The thermal unfolding midpoint of full-length WT GeoCas9 determined by CD is ∼60 °C and the K267E and R332A mutations do not change the Tm of the apo protein (Figure S14). Upon formation of an RNP, the Tm of WT GeoCas9 increases to 73°C. K267E GeoCas9 retains a similar Tm increase to 70 °C, while R332A GeoCas9 forms a less stable RNP with Tm of 61 °C. The trend of these data is consistent with NMR and MST, which highlight that although K267E and R332A mutations within GeoRec have somewhat muted structural effects, these changes alter protein dynamics and the interaction with gRNA.

Mutations in full-length GeoCas9 alter its structural dynamics and interaction with gRNA

To further investigate the effects of mutations on protein dynamics, we performed molecular dynamics (MD) simulations based on the cryo-EM structure of full-length GeoCas9 (PDB: 8UZA) in complex with gRNA and target DNA. We simulated the full-length WT GeoCas9 and its K267E and R332A mutants as well as a double mutant combining K267E and R332A (Figure 6A), in three replicates of approximately 2 μs each. Multi-microsecond simulations revealed substantial changes in the dynamics of the GeoCas9 mutants compared to the WT (Figure S15). Specifically, we observed that mutations in the REC domain significantly altered the dynamics of both the Rec and the adjacent HNH (Figure S15). Differential root-mean-square fluctuations (ΔRMSF) analysis of protein residues between the WT and variants further highlighted these alterations, showing increased dynamics in the HNH and Rec domains induced by the mutations (Figure 6B). To quantify the impact of these mutations, we analyzed protein-RNA interactions by calculating the number of contacts between GeoCas9 and gRNA in the WT and mutant systems. A contact was defined as a distance between two atoms of ≤ 4.5 Å. The number of contacts was significantly reduced in all variants compared to the WT, with the most pronounced reduction observed in the K267E variant (Figure 6C). We next quantified gRNA-Rec domain binding by calculating the binding free energy using the MM-GBSA method over ∼200 ns of stable simulation trajectories (details in Materials and Methods). Consistent with a reduction in gRNA contacts, variants with the K267E mutation exhibited a substantial reduction in binding free energy (>60 kcal mol-1) relative to the WT, whereas the R332A variant displayed a smaller reduction (<20 kcal mol-1, Figure 6D). Notably, protein-DNA interactions remained largely unaffected, suggesting that these mutations do not impair GeoCas9 DNA cleavage ability.
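The contact criterion used in the simulations (two atoms within 4.5 Å) can be expressed compactly, as sketched below. This is a brute-force illustration on plain coordinate tuples, not the trajectory analysis code used for this work. Whether contacts were tallied per atom pair or per residue pair, and whether hydrogens were included, is not stated here, so counting every protein-RNA atom pair is an assumption, and the coordinates are hypothetical.

```python
import math


def count_contacts(protein_atoms, rna_atoms, cutoff=4.5):
    """Count protein-RNA atom pairs separated by at most `cutoff` (in Å)."""
    return sum(
        1
        for p in protein_atoms
        for r in rna_atoms
        if math.dist(p, r) <= cutoff
    )


def contacts_per_frame(frames, cutoff=4.5):
    """Contact count for each (protein_atoms, rna_atoms) frame of a trajectory."""
    return [count_contacts(protein, rna, cutoff) for protein, rna in frames]


# Hypothetical coordinates (Å) for two frames of a toy system.
frame_1 = ([(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)], [(3.0, 0.0, 0.0), (4.5, 0.0, 0.0)])
frame_2 = ([(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)], [(6.0, 0.0, 0.0), (20.0, 0.0, 0.0)])
print(contacts_per_frame([frame_1, frame_2]))
```

Collecting the per-frame counts over all replicates gives the kind of distribution compared between WT and variants, although a production analysis would use a neighbor search rather than this all-pairs loop.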

Effects of mutations in full-length GeoCas9 revealed by MD simulations. (A) The structure of GeoCas9 (PDB: 8UZA, protein in gray) bound to gRNA (orange) and DNA (magenta) is shown. Mutations studied include K267E (red), R332A (blue), and the 10 mutations of iGeoCas9 (lime green), all highlighted in surface representation. (B) Differential root-mean-square fluctuations (ΔRMSF) of protein residues computed between WT GeoCas9 and the K267E (red), R332A (blue), and double mutant (pink). (C) Distribution of protein-RNA contacts for WT and GeoCas9 variants computed over the 6 μs simulation ensemble. (D) Comparison of gRNA binding free energy to the Rec domain in WT GeoCas9 and variants. (E) Representative snapshots from MD simulations illustrating structural changes in Rec-gRNA association in WT GeoCas9 (left) and variants (right).

Additionally, we simulated a novel variant, iGeoCas9 (PDB: 8UZB), containing mutations in the Rec1 and WED domains (Figure 6A, mutations highlighted in lime green). This variant was recently demonstrated to have enhanced specificity in genome-editing.21 Intriguingly, iGeoCas9 exhibited increased dynamics in the HNH and Rec domains, along with a reduction in gRNA binding free energy similar to the K267E and R332A mutants. These results suggest a distinct allosteric pathway involving additional residues that enables iGeoCas9 to maintain improved DNA cleavage activity despite reduced gRNA binding affinity. In fact, iGeoCas9 samples the greatest conformational space of any variant tested (Figure S15), suggesting a high level of flexibility is critical to enhanced specificity in GeoCas9. Collectively, our simulations reveal that mutations K267E and R332A destabilize the GeoCas9 interaction with gRNA, consistent with NMR observations. Furthermore, the enhanced dynamics and altered binding affinities observed in iGeoCas9 indicate potential allosteric mechanisms that optimize its genome-editing functionality, where K267E and R332A evoke a similar, but lesser degree of biophysical change.

DNA cleavage assays suggest the highly stable GeoCas9 is resistant to functional changes by K267E or R332A mutations

The dynamic impact of the GeoRec mutations and their altered gRNA interactions at the biophysical level led us to speculate that either mutation incorporated into full-length GeoCas9 would also alter its DNA cleavage function, especially at elevated temperatures where WT GeoCas9 is most active. Temperature-dependent functional alterations were previously observed for single-point mutations within GeoHNH.1 Although the K267E and R332A mutations slightly diminished on-target DNA cleavage by GeoCas9, the effect was very subtle and these overall cleavage activities followed the temperature dependence of WT GeoCas9 quite closely (Figure S16).

To assess the impact of the K267E and R332A mutations on GeoCas9 specificity, we assayed the propensity for off-target cleavage using DNA substrates with mismatches 5-6 or 19-20 base pairs from the PAM site (Figure S17, Table S2). As a control for on- and off-target activity, we assayed WT SpCas9 alongside the widely used high-specificity HiFi-SpCas9 variant24 (Figure S17, Table S3) and found that HiFi-SpCas9 digested a lower percentage of off-target (mismatched) DNA sequences than WT SpCas9. As expected, WT GeoCas9 was increasingly sensitive to mismatched target sequences closer to the seed site, which has been demonstrated with SpCas9 and other Cas systems.2,6,8,35 No significant differences in activity were observed with digestion durations ranging from 1-60 minutes,2 implying that a 1-minute digestion is sufficient for in vitro activity of GeoCas9 with the target DNA template. While these findings generally align with prior investigations of off-target DNA cleavage,2,6,8 there are nuanced differences. Specifically, a previous study reported ∼10% cleavage of off-target DNA with a mismatch 5-6 base pairs from the PAM by WT GeoCas9.2 Our results showed nearly 50% cleavage for the same off-target mismatch, which is still a significant decrease relative to cleavage of on-target DNA or of substrates with mismatches 19-20 base pairs from the PAM. 
This could be due to the relatively high RNP concentrations (600-900 nM) in our assay (for clear visibility on the gel), compared to prior studies with RNP concentrations ≤ 500 nM.2 Our results corresponded closely to those of prior studies with a 19-20 base pair mismatch, where off-target cleavage is tolerated by WT GeoCas9.2 The single-point mutations K267E and R332A have negligible impact on GeoCas9 specificity (both variants follow the trend of WT GeoCas9, Figure S17), which contrasts with prior work on SpCas9 that demonstrated robust specificity enhancement with single-point mutations in Rec.24,27 Additionally, the GeoCas9 double mutant K267E/R332A exhibits decreased on-target cleavage efficiency, which has been noted in high-specificity Cas systems.15,23,24,27,36–38 However, the additive effect of the K267E/R332A double mutant still does not enhance GeoCas9 specificity in our assay. MST-derived binding affinities using full-length Tnnt2 gRNA and full-length WT, K267E, or R332A GeoCas9 indicate that all three proteins have similar affinities for the gRNA used in the functional assays (Figure 7). Thus, the mutations do not substantially alter full-length GeoCas9 binding to Tnnt2 gRNA, consistent with the similar cleavage activities of these proteins. We repeated the MST experiments using a gRNA derived from the new cryo-EM structure of GeoCas9 (PDB: 8UZA), which has a different spacer sequence. We observed a similar trend, as all GeoCas9 variants exhibited comparable affinities for this gRNA. Our functional studies illustrate an apparent resilience of GeoCas9 to major functional changes at the level of Rec, despite comparable mutations having profound functional impacts in mesophilic Cas9s.

(A) Representative MST-derived profiles of WT, K267E, and R332A GeoCas9 binding to a Cy5-labeled full-length Tnnt2 gRNA. (B) Bar graph comparing Kd values across n ≥ 3 replicate samples for the Tnnt2 gRNA and for the 8UZA gRNA derived from a recent cryo-EM structure of GeoCas9. *p < 0.01

Discussion

CRISPR-Cas9 is a powerful tool for targeted genome editing with high efficiency and modular specificity.2,15,24,27 Allosteric signals propagate DNA binding information to the HNH and RuvC nuclease domains, facilitating their concerted cleavage of double-stranded DNA.8,10,14,15 The intrinsic flexibility of the nucleic acid recognition lobe plays a critical role in this information transfer, exerting a measure of conformational control over catalysis.10 This study provides new insights into the structural, dynamic, and functional role of the thermophilic GeoCas9 recognition lobe. Novel constructs of subdomains GeoRec1 and GeoRec2, as well as intact GeoRec show a high structural similarity to the domains in full-length GeoCas9, facilitating solution NMR experiments that captured the intrinsic allosteric motions across GeoRec2. These studies revealed the existence of μs-ms timescale motions that are classically associated with allosteric signaling and enzyme function, which span the entire GeoRec2 domain to its interfaces with GeoRec1 and the adjacent GeoHNH domain.

Based on homology to specificity-enhancing variants of the better studied SpCas9, the biophysical and biochemical consequences of two mutations were tested in GeoRec2, the larger GeoRec lobe, and full-length GeoCas9. We speculated that removing positively charged residues with potential to interact with negatively charged nucleic acids could disrupt GeoCas9-gRNA complex formation, stability, and subsequent function by altering the protein or nucleic acid motions. Indeed, CPMG relaxation dispersion experiments revealed that mutations enhanced and reorganized the μs-ms flexibility of GeoRec2. Further, NMR titrations showed the affinity of K267E and R332A GeoRec for gRNA to be weaker than that of WT GeoRec, consistent with MST-derived Kd values using the isolated domain. The mutations also diminished the stability of the full-length GeoCas9 RNP complex.

The collective changes to protein dynamics, gRNA binding, and RNP thermostability suggested that mutations could modulate GeoCas9 function, as observed in similar studies of SpCas9 reporting that gRNA dynamics, which affect the potential for the RNA:DNA hybrid to dissociate, altered function.11,15,39 Yet, the functional impact of single-point and double mutations in this work was negligible, despite homologous K-to-E and R-to-A single-point mutations enhancing specificity of the mesophilic SpCas9. The biophysical impact of mutations within GeoRec2 and GeoRec may be tempered by its evolutionary resilience and the highly stable neighboring domains in the context of full-length GeoCas9, reflected in an unchanged affinity for target DNA once the RNP was formed.40,41 Thus, a greater number of additive (or synergistic) mutations within GeoRec would be required to fine-tune activity or specificity to a large degree.

It should be noted that the effects of these and other GeoRec mutations may vary in vivo or with alternative target cleavage sites and cell types. Such studies will be the subject of future work, as will biochemical assays of homologous mutations across diverse Cas9s, which have contributed to the wide use of CRISPR technology.42 We also note that despite the homology between GeoRec2 and SpRec3 and the latter’s role in evo- and HiFi-SpCas9 variants that inspired the K267E and R332A mutations, the maximally enhanced SpCas9 variants contain four mutations each. Presumably each individual substitution plays a small role in modulating specificity. However, there is no consistent pattern that predicts whether multiple mutations will have additive or synergistic impacts on Cas9 function. NMR and MD studies of high-specificity SpCas9 variants (HF-1, Hypa, and Evo, each with distinct mutations in the SpRec3 domain) reveal universal structural and dynamic variations in regions of SpRec3 that interface with the RNA:DNA hybrid.16 Notably, a recently published variant, iGeoCas9,32 demonstrated enhanced genome-editing capabilities in HEK293T cells with eight mutations, though none in the Rec2 subdomain. This study highlighted the functional adaptability of iGeoCas9 under low magnesium conditions, a trait beneficial in mammalian cells, distinguishing it from WT GeoCas9. Together, these recently published data and the findings reported here advance our molecular understanding of the functional handles in GeoCas9, which is relevant to the design of new enhanced variants.

This study marks the first phase of mapping allosteric motions and pathways of information flow in the GeoRec lobe with solution NMR experiments. Such information transfer is critical to the crosstalk between Rec and HNH in several Cas9s. Despite NMR advancements in perdeuteration,43 transverse relaxation-optimized spectroscopy (TROSY),44 and sparse isotopic labeling,45 per-residue dynamics underlying allosteric signaling in large multi-domain proteins such as GeoCas9 (∼126 kDa) have remained challenging to characterize. Novel cryo-EM structures of GeoCas932 will facilitate the merging of future NMR and MD simulation studies to report on RNP dynamics and atomic level networks of communication. The identification of additional (or synergistic) allosteric hotspots within GeoRec using an integrated workflow will help to further resolve the balance between structural flexibility and the unusually high stability of GeoCas9, leading to new insight into targeted manipulation of RNA affinity and enhanced variants.

Here, we set out to biophysically characterize the Rec lobe of GeoCas9 to obtain new understanding of its function (in the context of well-studied mesophilic Cas9s). Using an AlphaFold2 model, and later a cryo-EM structure of GeoCas9, we introduced mutations based on proximal gRNA interactions and homology to specificity-enhancing sites in SpCas9. However, the mutations did not affect GeoCas9 function as expected, highlighting the complicated interplay between the biophysics of mesophilic and thermophilic Cas enzymes and the difficulty of applying universal functional predictions to Cas9. The very recent report of the iGeoCas9 variant further reinforces this point.21 While high-specificity SpCas9 variants are heavily mutated in Rec3 (analogous to GeoRec2), iGeoCas9 lacks mutations in Rec2 entirely, raising new questions about the functional role of GeoRec. MD simulations of WT GeoCas9, iGeoCas9, and the Rec variants revealed that while K267E and R332A induce dynamic effects on a similar trajectory to iGeoCas9, a true high-specificity variant samples a very wide conformational space with displacements of both Rec and HNH (Figures 6 and S15). Through further study of the fundamental mechanism of GeoCas9, it remains possible that engineering of GeoRec may produce high-specificity variants.

Materials and methods

Expression and purification of GeoRec1, GeoRec2, GeoRec, and GeoCas9

The GeoRec1 (residues 90-225) and GeoRec2 (residues 245-456) subdomains, as well as the entire GeoRec lobe (residues 90-456) of G. stearothermophilus Cas9 were engineered into a pET28a vector with an N-terminal His6-tag and a TEV protease cleavage site. The K267E and R332A mutations were separately introduced into the GeoRec2 plasmid. Plasmids were transformed into BL21 (DE3) cells (New England Biolabs). Protein samples for CD spectroscopy, MST, and functional assays were grown in Lysogeny Broth (LB, Fisher), while isotopically labeled samples for NMR were grown in M9 minimal media (deuterated for GeoRec2 and GeoRec) containing CaCl2, MgSO4, MEM vitamins, and 1.0 g/L 15N ammonium chloride and 2.0 g/L 13C glucose (Cambridge Isotope Laboratories) as the sole nitrogen and carbon sources, respectively. Cells were induced with 1 mM IPTG after reaching an OD600 of 0.8−1.0 and grown for 4 hours at 37 °C post induction. The cells were harvested by centrifugation, resuspended in a buffer of 50 mM Tris-HCl, 250 mM NaCl, 5 mM imidazole, and 1 mM PMSF at pH 7.4, lysed by ultrasonication, and purified by Ni−NTA affinity chromatography. Following TEV proteolysis of the terminal His-tag, the samples were further purified on a Superdex75 size exclusion column. NMR samples were dialyzed into a buffer containing 20 mM NaPi, 80 mM KCl, 1 mM DTT, and 1 mM EDTA at pH 7.4.

The full-length GeoCas9 plasmid was acquired from Addgene (#87700), and the protein was expressed in TB media and purified as previously described.2 The K267E, R332A, and K267E/R332A mutations were introduced into full-length GeoCas9 by modifying the original plasmid acquired from Addgene.

NMR spectroscopy

Backbone resonance assignments of GeoRec1 and GeoRec2 were carried out on a Bruker Avance NEO 600 MHz spectrometer at 25 °C. The following triple resonance experiments were collected for each sample: 1H-15N TROSY-HSQC, HNCA, HN(CO)CA, HN(CA)CB, HN(COCA)CB, HN(CA)CO and HNCO. All spectra were processed in NMRPipe46 and analyzed in Sparky47. Three-dimensional correlations and assignments were made in CARA48 and GeoRec1 and GeoRec2 backbone assignments were deposited in the BMRB under accession numbers 52363 and 51197, respectively. Backbone resonance assignments of GeoRec were completed by transferring assignments from the individually assigned spectra of GeoRec1 and GeoRec2, as done previously for other large Cas9 fragments.49,50

NMR spin relaxation experiments were carried out in a temperature-compensated manner at 600 and 850 MHz on Bruker Avance NEO and Avance III HD spectrometers, respectively. CPMG experiments were adapted from the report of Palmer and coworkers51 with a constant relaxation period of 20 ms and νCPMG values of 0, 25, 50, 75, 100, 150, 250, 500, 750, 800, 900, and 1000 Hz. Exchange parameters were obtained from global fits of the data carried out with RELAX52 using the R2eff, NoRex, and CR72 models, as well as in-house fitting in GraphPad Prism with the following models:

Model 1: No exchange

Model 2: Two-state, fast exchange (Meiboom equation 53)

Global fitting of CPMG profiles was determined to be superior to individual fits based on the Akaike Information Criterion.54 Uncertainties in these rates were determined from replicate spectra with duplicate νCPMG values of 0, 25, 50 (×2), 75, 100, 150, 250, 500 (×2), 750, 800 (×2), 900, and 1000 Hz.
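The two fitting models and the model comparison can be sketched in a few lines. The block below is a minimal illustration, not the RELAX or GraphPad Prism implementation: it assumes the standard Luz-Meiboom form for two-state fast exchange and the least-squares form of the Akaike Information Criterion, and the function and parameter names (`r2_0`, `phi_ex`, `k_ex`) are hypothetical labels chosen here.

```python
import math

def r2eff_no_exchange(nu_cpmg, r2_0):
    """Model 1: flat dispersion profile, R2eff equals the intrinsic rate."""
    return r2_0

def r2eff_fast_exchange(nu_cpmg, r2_0, phi_ex, k_ex):
    """Model 2: Luz-Meiboom two-state fast-exchange form (assumed here).

    nu_cpmg in Hz, r2_0 and k_ex in s^-1, phi_ex in rad^2 s^-2.
    R2eff = R2_0 + (phi_ex / k_ex) * (1 - (4 nu / k_ex) * tanh(k_ex / (4 nu)))
    """
    if nu_cpmg <= 0:
        # Limit of the expression as nu_cpmg approaches zero.
        return r2_0 + phi_ex / k_ex
    x = k_ex / (4.0 * nu_cpmg)
    return r2_0 + (phi_ex / k_ex) * (1.0 - math.tanh(x) / x)

def aic_least_squares(residuals, n_params):
    """AIC for a least-squares fit with Gaussian errors: n*ln(RSS/n) + 2k."""
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    return n * math.log(rss / n) + 2 * n_params

# Example profile over a subset of the nu_CPMG values listed in the text.
profile = [r2eff_fast_exchange(nu, 10.0, 1.0e4, 1000.0)
           for nu in (25, 50, 100, 250, 500, 1000)]
```

The model with the lower AIC is preferred; the same comparison applies when weighing a global fit (shared `k_ex`) against per-residue fits, with `n_params` counting all fitted parameters in each case.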

Longitudinal and transverse relaxation rates were measured with randomized T1 delays of 0, 20, 60, 100, 200, 600, 800, and 1200 ms and T2 delays of 0, 16.9, 33.9, 50.9, 67.8, 84.8, and 101.8 ms. Peak intensities were quantified and the resulting decay profiles were fit in Sparky, with errors determined from the fitted parameters. Uncertainties in these rates were determined from replicate spectra with duplicate relaxation delays of 20 (×2), 60 (×2), 100, 200, 600 (×2), 800, and 1200 ms for T1 and 16.9, 33.9 (×2), 50.9 (×2), 67.8 (×2), 84.8, and 101.8 (×2) ms for T2. Steady-state 1H-[15N] NOEs were measured with a 6 second relaxation delay followed by a 3 second saturation (delay) for the saturated (unsaturated) experiments and calculated as Isat/Iref. All relaxation experiments were carried out in a temperature-compensated interleaved manner.
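As an illustration of how rates follow from the decay profiles, the sketch below estimates a relaxation rate from peak intensities and computes the NOE ratio. It assumes a mono-exponential decay I(t) = I0·exp(−R·t) and uses a simple log-linear least-squares fit, which is a stand-in for, not a reproduction of, the fitting done in Sparky; the function names are hypothetical.

```python
import math

def fit_decay_rate(delays_s, intensities):
    """Estimate R (s^-1) in I(t) = I0 * exp(-R * t) by log-linear regression."""
    ys = [math.log(i) for i in intensities]
    n = len(delays_s)
    mean_t = sum(delays_s) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in delays_s)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(delays_s, ys))
    return -sxy / sxx  # the slope of ln(I) versus t is -R

def steady_state_noe(i_sat, i_ref):
    """Heteronuclear NOE as the intensity ratio Isat / Iref."""
    return i_sat / i_ref

# T2 delays from the text, converted from ms to seconds.
t2_delays_s = [d / 1000.0 for d in (0, 16.9, 33.9, 50.9, 67.8, 84.8, 101.8)]
# Synthetic, noise-free intensities for a hypothetical R2 of 12.5 s^-1.
synthetic = [1000.0 * math.exp(-12.5 * t) for t in t2_delays_s]
r2_estimate = fit_decay_rate(t2_delays_s, synthetic)
```

With real data, a nonlinear fit weighted by the measured noise is generally preferable, because the log transform distorts the error at long delays.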

Model-free analysis was carried out by fitting relaxation rates to five different forms of the spectral density function with local τm, spherical, prolate spheroid, oblate spheroid, or ellipsoid diffusion tensors.55–60 The criteria for inclusion of resonances in the diffusion tensor estimate were based on the method of Bax and coworkers.61 N-H bond lengths were assumed to be 1.02 Å and the 15N chemical shift anisotropy tensor was −160 ppm. Diffusion tensor parameters were optimized simultaneously in RELAX under the full automated protocol.52 Model selection was iterated until tensor and order parameters did not deviate from the prior iteration.

NMR titrations were performed on a Bruker Avance NEO 600 MHz spectrometer at 25 °C by collecting a series of 1H-15N TROSY HSQC spectra with increasing ligand (i.e. gRNA) concentration. The 1H and 15N carrier frequencies were set to the water resonance and 120 ppm, respectively. Samples of WT, K267E, and R332A GeoRec were titrated with gRNA until no further spectral perturbations were detected. NMR chemical shift perturbations were calculated as:
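One widely used combined 1H/15N perturbation is sketched below for illustration. The weighting applied to the 15N shift change (`n_scale = 0.2`) is an assumption and is not taken from this study; other scalings are also in common use, and the function name is hypothetical.

```python
import math

def combined_csp(delta_h_ppm, delta_n_ppm, n_scale=0.2):
    """Combined chemical shift perturbation in ppm.

    delta_h_ppm: change in 1H chemical shift between free and bound states.
    delta_n_ppm: change in 15N chemical shift between free and bound states.
    n_scale: assumed weighting that puts 15N changes on the 1H scale.
    """
    return math.sqrt(delta_h_ppm ** 2 + (n_scale * delta_n_ppm) ** 2)

# Example: a residue shifting 0.03 ppm in 1H and 0.20 ppm in 15N.
example_csp = combined_csp(0.03, 0.20)
```

Residues are then typically ranked by this value, with a threshold (for example, a multiple of the standard deviation above the mean) marking significant perturbations.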

Microscale thermophoresis (MST)

MST experiments were performed on a Monolith X instrument (NanoTemper Technologies), quantifying WT, K267E, R332A, and K267E/R332A GeoRec binding to a 39-nt Cy5-labeled gRNA at a concentration of 20 nM in a buffer containing 20 mM sodium phosphate, 150 mM KCl, 5 mM MgCl2, and 0.1% Triton X-100 at pH 7.6. The GeoRec proteins were serially diluted from a 200 µM stock into 16 microcentrifuge tubes and combined in a 1:1 molar ratio with serially diluted gRNA from a 40 nM stock. After incubation for 5 minutes at 37 °C in the dark, each sample was loaded into a capillary for measurement. Kd values for the various complexes were calculated using the MO Control software (NanoTemper Technologies). Statistical significance was calculated using a two-tailed t-test.
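The concentrations in the capillaries and the expected shape of the binding curve can be sketched as follows. Two assumptions are made that the text does not state: the dilution series is two-fold, and the protein and gRNA solutions are mixed in equal volumes (so both stock concentrations are halved, giving the stated 20 nM gRNA). The 1:1 binding model is the standard quadratic solution of the law of mass action and is not claimed to be the exact routine used by the MO Control software.

```python
import math

def serial_dilution(stock, n_points=16, factor=2.0):
    """Concentrations of an n-point serial dilution starting at the stock."""
    return [stock / factor ** i for i in range(n_points)]

def fraction_bound(protein_total, ligand_total, kd):
    """Fraction of labeled ligand bound for 1:1 binding (same units throughout)."""
    s = protein_total + ligand_total + kd
    complex_conc = (s - math.sqrt(s * s - 4.0 * protein_total * ligand_total)) / 2.0
    return complex_conc / ligand_total

# Final concentrations after equal-volume mixing (assumed), in micromolar.
protein_final_uM = [c / 2.0 for c in serial_dilution(200.0)]
grna_final_uM = 0.040 / 2.0

# Simulated binding curve for a hypothetical Kd of 1 uM.
curve = [fraction_bound(p, grna_final_uM, 1.0) for p in protein_final_uM]
```

Under these assumptions the protein concentration spans 100 µM down to roughly 3 nM, which brackets a micromolar-range Kd on both sides.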

Circular dichroism (CD) spectroscopy

All GeoCas9 and GeoRec proteins were buffer exchanged into a 20 mM sodium phosphate buffer at pH 7.5, diluted to 1 μM, and loaded into a 2 mm quartz cuvette (JASCO instruments). A CD spectrum was first measured between 200 - 250 nm, after which the sample was progressively heated from 20 – 90 °C in 1.0 °C increments while ellipticity was monitored at 222 and 208 nm. Phosphate buffer baseline spectra were subtracted from the sample measurements. Prior to CD measurements, GeoCas9-RNP was formed by incubating 3 μM GeoCas9 with gRNA at a 1:1.5 molar ratio at 37 °C for 10 minutes. The unfolding CD data was fit in GraphPad Prism to:
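A common choice for fitting thermal unfolding curves of this kind is a two-state sigmoid, sketched below in the form of GraphPad Prism's built-in Boltzmann sigmoidal equation. Treating this as the fitted model is an assumption made for illustration, and the parameter names are hypothetical.

```python
import math

def boltzmann_sigmoid(temp_c, bottom, top, tm_c, slope):
    """Two-state sigmoid: Y = bottom + (top - bottom) / (1 + exp((Tm - T) / slope)).

    temp_c: temperature in degrees C.
    bottom, top: ellipticity plateaus of the folded and unfolded baselines.
    tm_c: midpoint of the transition (melting temperature) in degrees C.
    slope: steepness of the transition in degrees C.
    """
    return bottom + (top - bottom) / (1.0 + math.exp((tm_c - temp_c) / slope))

# Temperature ramp from the text: 20 to 90 degrees C in 1.0 degree increments.
ramp_c = [20.0 + i for i in range(71)]
# Simulated melt with hypothetical parameters (ellipticity in arbitrary units).
melt = [boltzmann_sigmoid(t, -30.0, -5.0, 70.0, 2.0) for t in ramp_c]
```

The fitted `tm_c` is the quantity compared between the apo protein and the RNP, so a shift in the midpoint reports directly on changes in thermostability.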

X-ray crystallography

GeoRec protein purified as described above was crystallized by sitting drop vapor diffusion at room temperature by mixing 1.0 µL of 15 mg/mL GeoRec in a buffer of 20 mM HEPES and 100 mM KCl at pH 7.5 with 2.0 µL of crystallizing condition: 0.15 M calcium chloride, 15 % polyethylene glycol 6000, 0.1 M HEPES at pH 7.0. Crystals were cryoprotected in crystallizing condition supplemented with 30% ethylene glycol. Diffraction images were collected at the NSLS-II AMX beamline at Brookhaven National Laboratory under cryogenic conditions. Images were processed using XDS62 and Aimless in CCP4.63 Chain A of the N. meningitidis Cas9 X-ray structure (residues 249-445 only, PDB ID: 6JDQ) was used for molecular replacement with Phaser followed by AutoBuild in Phenix.64 Electron density was only observed for the GeoRec2 subdomain. The GeoRec2 structure was finalized through manual building in Coot65 and refinement in Phenix.

Molecular dynamics (MD) simulations

Molecular Dynamics (MD) simulations were based on the cryo-EM structure of full-length GeoCas9 (PDB: 8UZA, resolution 3.17 Å) in complex with gRNA and target DNA with two mutations in GeoCas9 (at residues 8 and 582). Five systems were considered for the MD studies: WT, K267E, R332A, K267E/R332A, and iGeoCas9. We generated the WT GeoCas9 by back-mutating A8D and A582H from the cryo-EM structure (PDB: 8UZA), followed by introducing the mutations K267E, R332A, or a double mutation (with both K267E and R332A) for the variant systems. Subsequently, we performed MD simulations of iGeoCas9 (PDB: 8UZB, resolution 2.63 Å), which contains 10 mutations (D8A, E149G, T182I, N206D, P466Q, H582A, Q817R, E843K, E854G, K908R). All systems were solvated with explicit water in a periodic box of ∼ 134 Å x ∼ 154 Å x ∼ 151 Å resulting in ∼ 276,000 atoms. Counter ions were added to neutralize the systems. MD simulations were performed using a protocol tailored for protein-nucleic acid complexes,66 previously applied in studies of CRISPR-Cas systems.67–69 All the simulations were performed using the Amber ff19SB force field for protein,70 with the ff99bsc1 and χOL3 corrections for DNA and RNA, respectively.71,72 Water molecules were described by the TIP3P model.73 All bonds involving hydrogens were constrained using the SHAKE algorithm. A particle mesh Ewald method (PME) with a 10 Å cutoff was used to calculate electrostatics. Energy minimization was performed to relax the water molecules and counterions, keeping the protein-nucleic acid complex fixed with harmonic potential restraints of 100 kcal/mol Å2. Equilibration was performed by gradually increasing the temperature from 0 to 100 K and then to 200 K in the canonical NVT ensemble and the isothermal-isobaric NPT ensemble. A final temperature of 300 K was maintained via Langevin dynamics with a collision frequency γ = 1/ps and a reference pressure of 1 atm was achieved through the Berendsen barostat. 
Production runs were carried out in NVT ensemble for 2 µs for each system in three replicates, resulting in 6 µs per system (totaling 30 µs for all systems). The equations of motion were integrated with the leapfrog Verlet algorithm with a time step of 2 fs. All simulations were conducted using the GPU-empowered version of AMBER 22.74 Analysis was performed on the aggregate ensemble (i.e., ∼6 μs per system).
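The sampling totals quoted above follow from simple bookkeeping, sketched here using only the values given in the text (the variable names are illustrative).

```python
# Systems simulated, as listed in the text.
systems = ["WT", "K267E", "R332A", "K267E/R332A", "iGeoCas9"]
replicates = 3        # independent production runs per system
length_us = 2.0       # length of each production run, in microseconds
timestep_fs = 2.0     # integration time step, in femtoseconds

per_system_us = replicates * length_us          # aggregate sampling per system
total_us = per_system_us * len(systems)         # aggregate sampling overall
# 1 microsecond = 1e9 femtoseconds
steps_per_replicate = round(length_us * 1e9 / timestep_fs)

print(per_system_us, total_us, steps_per_replicate)
```

Each 2 µs replicate therefore corresponds to one billion integration steps at a 2 fs time step.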

To characterize the protein-nucleic acid interactions in all the systems under investigation, we performed contact analysis. A contact was considered to form between two atoms within a cutoff distance of ≤4.5 Å. The binding free energy of gRNA and DNA with GeoCas9 was calculated using the Molecular Mechanics Generalized Born Surface Area (MM-GBSA) method.75–77 This approach was used to compare the Rec-gRNA binding affinity of WT GeoCas9 with its mutants. For each system, the binding energies were calculated over the ∼200 ns ensemble of the stable trajectories at an interval of ∼20 ns.
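The contact criterion can be illustrated with a minimal, dependency-free sketch that counts atom pairs within the 4.5 Å cutoff in a single frame. This is not the analysis code used in the study: production analyses typically rely on a trajectory toolkit with neighbor searching, and the function name and toy coordinates below are hypothetical.

```python
import math

def count_contacts(coords_a, coords_b, cutoff=4.5):
    """Count atom pairs (one atom from each group) separated by <= cutoff (in Å)."""
    n_contacts = 0
    for a in coords_a:
        for b in coords_b:
            if math.dist(a, b) <= cutoff:
                n_contacts += 1
    return n_contacts

# Toy frame: two "protein" atoms and three "RNA" atoms, coordinates in Å.
protein_atoms = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
rna_atoms = [(3.0, 0.0, 0.0), (4.5, 0.0, 0.0), (20.0, 0.0, 0.0)]
n = count_contacts(protein_atoms, rna_atoms)
```

Applying the same count to every frame of the aggregate ensemble yields the contact distributions compared between WT and variant systems.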

DNA cleavage assays

GeoCas9 gRNA templates containing 21-nt spacers targeting the mouse Tnnt2 gene locus were introduced into EcoRI and BamHI sites in pUC57 (Genscript). The plasmid was transformed into BL21(DE3) cells (New England BioLabs) and subsequent restriction digest of the plasmid DNA was carried out using the BamHI restriction enzyme (New England BioLabs) according to the manufacturer’s instructions. Linearized plasmid DNA was immediately purified using the DNA Clean and Concentrator-5 kit (Zymo Research) according to the manufacturer’s instructions. RNA transcription was performed in vitro with the HiScribe T7 High Yield RNA Synthesis Kit (New England BioLabs). DNA substrates containing the target cleavage site (479 base pairs, Figure S13) were produced by polymerase chain reaction (PCR) using mouse genomic DNA as a template and primer pairs 5’-CAAAGAGCTCCTCGTCCAGT-3’ and 5’-ATGGACTCCAGGACCCAAGA-3’ followed by a column purification using the NucleoSpin® Gel and PCR Clean-up Kit (Macherey-Nagel). For the in vitro activity assay, RNP formation was achieved by incubating 3 µM GeoCas9 (WT, K267E, R332A, or K267E/R332A mutant) and 3 µM gRNA at either 37 °C, 60 °C, 75 °C, or 85 °C for 30 minutes in a reaction buffer of 20 mM Tris, 100 mM KCl, 5 mM MgCl2, 1 mM DTT, and 5% glycerol at pH 7.5. The 10 µL cleavage reactions were set up by mixing RNP at varying concentrations with 149 nanograms of PCR products on ice followed by incubation at 37 °C for 30 minutes. The reaction was quenched with 1 µL of proteinase K (20 mg/mL) and subsequent incubation at 56 °C for 10 minutes. 6x DNA loading buffer was added to each reaction and 10 µL of reaction mixture per lane was loaded onto an agarose gel. DNA band intensity measurements were carried out with ImageJ.
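Band intensities are commonly converted to a percent cleavage by dividing the product signal by the total signal in the lane. The sketch below assumes that convention, which is not spelled out in the text, and takes background-subtracted intensities as input; the function name is hypothetical.

```python
def percent_cleaved(uncut_intensity, product_intensities):
    """Percent of substrate cleaved, from background-subtracted band intensities.

    uncut_intensity: intensity of the full-length substrate band.
    product_intensities: intensities of the cleavage product bands in the same lane.
    """
    cut = sum(product_intensities)
    total = uncut_intensity + cut
    if total <= 0:
        raise ValueError("lane has no measurable signal")
    return 100.0 * cut / total

# Example lane: substrate band of 50 and two product bands of 30 and 20 (arbitrary units).
example = percent_cleaved(50.0, [30.0, 20.0])
```

Because ethidium-type stains scale with fragment length, summing both product bands (rather than using one) keeps the ratio comparable to the uncut substrate.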

For in vitro off-target activity assays, RNP formation was achieved by incubating 10 µM GeoCas9 (WT, K267E, R332A, or K267E/R332A mutant) and 10 µM gRNA at 37 °C for 30 minutes in the reaction buffer described above. The 10 µL cleavage reactions were set up by mixing 1 µM RNP with 150 nanograms of PCR products (off-target DNA sequences listed in Table S2) on ice followed by incubation at 37 °C for varying time points. The reaction was quenched with 1 µL of proteinase K (20 mg/mL) and subsequent incubation at 56 °C for 10 minutes. 6x DNA loading buffer was added to each reaction and 10 µL of reaction mixture per lane was loaded onto an agarose gel. DNA band intensity measurements were carried out with ImageJ. WT and HiFi SpCas9 control proteins were purchased from Integrated DNA Technologies (IDT, cat. No. 108158 and No. 108160, respectively), as was the associated SpCas9 gRNA, Alt-R™ CRISPR-Cas9 gRNA, with an RNA spacer sequence complementing 5’-TGGACAGAGCCTTCTTCTTC-3’. The on-target and off-target DNA sequences used for the SpCas9 in vitro cleavage assay can be found in Table S3.

Acknowledgements

This work was supported by NIH grant R01 GM 136815 (to GP and GPL) and NSF grant MCB 2143760 (to GPL). GP acknowledges support from the NIH (Grant No. R01GM141329) and the NSF (CHE-2144823), as well as from the Sloan Foundation (FG-2023-20431) and the Camille and Henry Dreyfus Foundation (TC-24-063). This research used the AMX beamline of the National Synchrotron Light Source II, a U.S. Department of Energy (DOE) Office of Science User Facility operated for the DOE Office of Science by Brookhaven National Laboratory under Contract No. DE-SC0012704. The Center for BioMolecular Structure (CBMS) is primarily supported by NIGMS through a Center Core P30 Grant (P30 GM133893), and by the DOE Office of Biological and Environmental Research (KP1607011). Computational studies were carried out using Expanse at the San Diego Supercomputing Center through allocation MCB160059 and Bridges2 at the Pittsburgh Supercomputer Center through allocation BIO230007 from the Advanced Cyberinfrastructure Coordination Ecosystem: Services & Support (ACCESS) program, which is supported by NSF grants #2138259, #2138286, #2138307, #2137603, and #2138296.

Additional information

Author Contributions

HBB and ALK produced GeoRec2, GeoRec1, GeoRecFL, and GeoCas9 proteins and sgRNA, conducted the NMR and biophysical experiments, analyzed the data, and wrote the original draft of the manuscript. AMD solved the X-ray crystal structure of GeoRec2. CP carried out MD simulations and analyzed the data. ZF and JL conducted GeoCas9 functional assays and analyzed the data. GP supervised the MD studies and obtained funding. GJ supervised collection of X-ray crystallographic data. GPL conceived the study, supervised collection of NMR spectroscopic data, obtained funding, and wrote the original draft. The final manuscript was written and edited with contributions from all authors.

Additional files

Supporting Information