xCas9 variant of the S. pyogenes Cas9 (SpCas9) protein bound to a guide RNA (gray) and a target DNA (charcoal) including the 5’-AAG-3’ PAM recognition sequence (red) (PDB 6AEB) (Guo et al., 2019). xCas9 includes seven amino acid substitutions (blue) with respect to SpCas9. Close-up views of the PAM recognition region are shown for xCas9 bound to AAG (PDB 6AEB, left) (Guo et al., 2019) and GAT (PDB 6AEG, right) (Guo et al., 2019). The PAM nucleobases (red) and the PAM-interacting residues (R1333 and R1335, blue) are shown as sticks. The E1219V mutation is also shown.

Binding of SpCas9 and xCas9 to PAM sequences that are recognized and ignored.

A. Binding of SpCas9 and xCas9 to PAM sequences that are recognized (TGG, top; AAG, centre; GAT, bottom). B. Interaction pattern established by R1333 (left) and R1335 (right) with PAM nucleobases (NB), PAM backbone (BB) and non-PAM nucleotides in SpCas9 bound to TGG (i.e., the wild-type system) and in xCas9 bound to PAM sequences that are recognized (TGG, AAG, GAT) and ignored (CCT, TTA, ATC). Interaction frequencies are averaged over the ∼6 μs collective ensemble of each system. Errors are computed as the standard deviation of the mean over four simulation replicates. C. Root mean square fluctuations (RMSF) of the R1333 and R1335 side chains in SpCas9 bound to its TGG PAM, compared to xCas9 bound to recognized and ignored PAMs. D. Frequencies of hydrogen bond formation between the arginine side chains and the PAM NB, BB and non-PAM nucleotides (details in the SI). E. Specificity index, representing the frequency of hydrogen bond formation between a given arginine and the PAM nucleotides relative to the frequency of forming hydrogen bonds with non-PAM residues. Data are reported with the standard deviation of the mean over four simulation replicates. F. PAM recognition region in xCas9 bound to PAM sequences that are ignored (CCT, top; TTA, centre; ATC, bottom). Figure 2–figure supplement 1.
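The specificity index and its error in panel E can be illustrated with a minimal sketch. It assumes the index is a simple per-replicate ratio of PAM to non-PAM hydrogen-bond frequencies and that the "standard deviation of the mean" is the sample standard deviation divided by the square root of the number of replicates. The function names and the input frequencies are hypothetical and are not taken from the study.

```python
import math
from statistics import mean, stdev

def mean_and_sem(values):
    """Return the mean and the standard deviation of the mean (SEM)."""
    return mean(values), stdev(values) / math.sqrt(len(values))

def specificity_index(pam_freqs, non_pam_freqs):
    """Average the per-replicate ratio of PAM to non-PAM H-bond frequencies.

    Assumption: the index is a plain ratio; a replicate with a non-PAM
    frequency of zero would need special handling and is not covered here.
    """
    ratios = [p / q for p, q in zip(pam_freqs, non_pam_freqs)]
    return mean_and_sem(ratios)

# Hypothetical H-bond frequencies from four replicates (illustrative only).
pam = [0.8, 0.6, 0.9, 0.7]
non_pam = [0.2, 0.2, 0.3, 0.2]
index, error = specificity_index(pam, non_pam)
print(f"specificity index = {index:.2f} +/- {error:.2f}")
```

An index above 1 would indicate that the arginine contacts the PAM nucleotides more often than non-PAM residues.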

DNA binding preference of R1335 in SpCas9 vs. xCas9.

A. Free energy surface (FES) describing the preference of R1335 for binding either the G3 nucleobase or E1219 in SpCas9. The FES is plotted along the distances between the R1335 guanidine and either the E1219 carboxylic group (CV1) or the G3 nucleobase (CV2, details in Materials and Methods). A well-defined minimum indicates that R1335 is “sandwiched” between G3 and E1219 (right). B. FES describing the binding of R1335 to G3 and the DNA backbone in SpCas9 (left) and xCas9 (centre). The FES is plotted along the distances between the R1335 guanidine and either the backbone phosphate (CV1) or G3 nucleobase (CV2). In SpCas9, R1335 mainly binds the G3 nucleobase, while in xCas9, it alternates interactions between the nucleobase and backbone (right). The free energy, ΔG, is expressed in kcal/mol. Figure 3–figure supplements 1-3.
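As a rough illustration of how a free energy surface relates to sampling along two collective variables, the sketch below applies Boltzmann inversion, ΔG = -k_B T ln(P/P_max), to a two-dimensional histogram. It assumes unbiased samples, a temperature of 300 K and a bin width of 0.5 Å, none of which are stated in this caption. Surfaces obtained from enhanced sampling would require the reweighting appropriate to that method, so this is not a description of the procedure used in the study.

```python
import math
from collections import Counter

KB_KCAL = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def free_energy_surface(cv1, cv2, bin_width=0.5, temperature=300.0):
    """Estimate dG(CV1, CV2) = -kT ln(P / P_max) from unbiased samples.

    Returns a dict mapping the lower edge of each populated bin to its
    free energy in kcal/mol; the most populated bin is set to zero.
    """
    counts = Counter(
        (math.floor(a / bin_width), math.floor(b / bin_width))
        for a, b in zip(cv1, cv2)
    )
    kt = KB_KCAL * temperature
    max_count = max(counts.values())
    return {
        (i * bin_width, j * bin_width): -kt * math.log(c / max_count)
        for (i, j), c in counts.items()
    }

# Hypothetical distances in angstroms (illustrative only).
cv1 = [3.1, 3.2, 3.3, 3.4, 6.1]
cv2 = [3.0, 3.1, 3.2, 3.3, 3.1]
fes = free_energy_surface(cv1, cv2)
```

Bins that were never visited are absent from the result rather than assigned an infinite free energy.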

DNA binding free energy difference (ΔΔG) between SpCas9 and its xCas9₁₋₃ mutants.

Relative changes in the DNA binding free energy (ΔΔG) upon transitioning from SpCas9 to xCas9₁, from xCas9₁ to xCas9₂, and from xCas9₂ to xCas9₃ in the presence of TGG (red) and AAG (pink) PAM sequences. Binding free energies were obtained from alchemical free energy calculations and are reported with the associated error, computed through the Multistate Bennett Acceptance Ratio (MBAR) method (Shirts and Chodera, 2008) (details in Materials and Methods). Figure 4–figure supplements 1-2.
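Because free energy is a state function, the stepwise ΔΔG values can in principle be summed to give the overall change from SpCas9 to the final mutant. The sketch below shows that bookkeeping under the assumption that the per-step errors are independent and therefore combine in quadrature. The function name and the numerical values are hypothetical and do not reproduce the reported results.

```python
import math

def cumulative_ddg(steps):
    """Sum stepwise (ddG, error) pairs; errors combine in quadrature.

    Assumption: the errors of the individual steps are uncorrelated.
    """
    total = sum(ddg for ddg, _ in steps)
    error = math.sqrt(sum(err ** 2 for _, err in steps))
    return total, error

# Hypothetical stepwise values in kcal/mol (illustrative only), standing in
# for the three successive transformations described in the caption.
steps = [(-1.0, 0.3), (0.5, 0.4), (-2.0, 0.0)]
total, error = cumulative_ddg(steps)
print(f"overall ddG = {total:.2f} +/- {error:.2f} kcal/mol")
```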

Enthalpic contribution to the DNA binding free energy for the SpCas9 → xCas9₁ transformation.

A. Enthalpic contribution to the ΔΔG of DNA binding while transitioning from SpCas9 to xCas9₁ in the presence of the TGG (red) and AAG (pink) PAM sequences, computed as the average changes in the interaction energy (ΔE) between selected amino acid residues and the DNA. The ΔE is also computed as the overall interaction energy between the rest of the protein (Prot-rest, without the selected residues) and the DNA. Error bars were computed by averaging the results from different segments of the given trajectory (details in Materials and Methods). B. Probability density of the distance (r) between the centers of mass (COM) of the REC3 and HNH domains from molecular dynamics simulations of SpCas9 (PDB 4UN3) incorporating the xCas9 mutations in the presence of TGG (red) and AAG (pink) PAM sequences. Data are from ∼6 μs of aggregate sampling for each system. The values of the distance r in the X-ray structures of SpCas9 (PDB 4UN3, 31.6 Å) and xCas9 bound to AAG (PDB 6AEB, 43.8 Å) are indicated by vertical dashed bars. The statistical significance of the difference between the two distributions was evaluated using Z-score statistics with a two-tailed hypothesis (P < 0.0001). The distance (r) between the REC3 and HNH COMs is indicated on the three-dimensional structure of Cas9 (right), with an arrow highlighting the AAG-induced conformational change. Figure 5–figure supplements 1-3.
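The two-tailed Z-score comparison in panel B can be sketched as a two-sample z-test on the mean COM distances. The exact form of the test is an assumption here, as are the function name and the sample values. Consecutive molecular dynamics frames are correlated, so a real analysis would use the effective number of independent samples rather than the raw frame count.

```python
import math
from statistics import mean, variance

def two_sample_z_test(a, b):
    """Return the z-score and two-tailed P-value for the difference in means.

    Assumption: samples are independent and large enough for the normal
    approximation to hold.
    """
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = (mean(a) - mean(b)) / se
    # Two-tailed P-value of a standard normal: P(|Z| > |z|) = erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Hypothetical REC3-HNH COM distances in angstroms (illustrative only).
r_tgg = [31.0, 32.0, 33.0, 32.0]
r_aag = [43.0, 44.0, 45.0, 44.0]
z, p = two_sample_z_test(r_tgg, r_aag)
```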