The CRISPR-associated endonuclease Cas9 from Streptococcus pyogenes (SpyCas9), along with a programmable single-guide RNA (sgRNA), has been exploited as a significant genome-editing tool. Despite the recent advances in determining the SpyCas9 structures and DNA cleavage mechanism, the cleavage-competent conformation of the catalytic HNH nuclease domain of SpyCas9 remains largely elusive and debatable. By integrating computational and experimental approaches, we unveiled and validated the activated Cas9-sgRNA-DNA ternary complex in which the HNH domain is neatly poised for cleaving the target DNA strand. In this catalysis model, the HNH employs the catalytic triad of D839-H840-N863 for cleavage catalysis, rather than previously implicated D839-H840-D861, D837-D839-H840, or D839-H840-D861-N863. Our study contributes critical information to defining the catalytic conformation of the HNH domain and advances the knowledge about the conformational activation underlying Cas9-mediated DNA cleavage.https://doi.org/10.7554/eLife.46500.001
The DNA inside human cells provides instructions for all of the processes that happen inside the body. Errors in the DNA may lead to cancer, sickle cell disease, cystic fibrosis, Huntington’s disease, or other genetic disorders. Medical researchers are exploring whether it is possible to replace or repair the faulty DNA (an approach known as gene therapy) to reduce the symptoms, or even cure individuals, of these conditions.
Over the last ten years, a new technology known as CRISPR-Cas9 gene editing has proved to be a reliable and efficient way to make small and precise changes to DNA in living cells. First, an enzyme called Cas9 searches for a segment of target DNA segment that matches a template molecule the enzyme carries. Cas9 then cuts the target DNA, which is repaired to match a new customized DNA sequence: this changes the genetic information of the cell.
The Cas9 protein is made of a succession of building blocks called amino acids that create long chains which then fold to form the final three-dimensional shape of the enzyme. A region of Cas9 known as the HNH domain is responsible for cutting the target DNA. However, it remains unclear exactly which amino acids within this domain work together to sever the DNA.
Here, Zuo et al. combined computational and experimental approaches to reveal the three-dimensional structure of the Cas9 enzyme when the HNH domain is poised to cut the target DNA. The findings were used to generate a computational model of Cas9 and this model predicted that the HNH domain relies on a group of three amino acids known collectively as D839-H840-N863 to cleave DNA strands.
This knowledge is useful to understand exactly how Cas9 modifies genetic information. Ultimately, this may help to improve CRISPR-Cas9 technology so it could be safely used in geneediting therapies.https://doi.org/10.7554/eLife.46500.002
The clustered regularly interspaced short palindromic repeats (CRISPR)-associated endonuclease Cas9 from Streptococcus pyogenes (SpyCas9) has become a gene-editing tool that holds an immense promise for the development of novel therapeutic approaches for human diseases (Cong et al., 2013; Jinek et al., 2012; Knott and Doudna, 2018; Mali et al., 2013). Two, magnesium (Mg2+) ion-dependent, nuclease domains (i.e. HNH and RuvC domains) in Cas9 cleave the target DNA strand (tDNA) complementary to the guide region of a dual single-guide RNA (sgRNA) and the non-target DNA strand (ntDNA), respectively (Gasiunas et al., 2012; Jinek et al., 2012). The conformational and mechanistic knowledge of Cas9 activation to achieve DNA cleavage is essential for the rational engineering of Cas9 to possibly ensure minimal off-target effects while retaining high gene-editing efficiency (Chen et al., 2017a; Kleinstiver et al., 2016; Slaymaker et al., 2016; Sternberg et al., 2015).
Several structures of the SpyCas9-sgRNA-DNA ternary complex to depict the HNH nuclease domain in a ‘cleavage-competent’ state have been reported (Huai et al., 2017; Palermo et al., 2018; Palermo et al., 2017; Zuo and Liu, 2017). Notably, the amino acid residue D861 in the HNH domain of SpyCas9 pointed towards the catalytic center in the absolute majority of resolved crystal structures (Anders et al., 2016; Anders et al., 2014; Dong et al., 2017; Hirano et al., 2016; Jiang et al., 2015; Jiang et al., 2016; Jinek et al., 2014; Liu et al., 2019; Nishimasu et al., 2014; Olieric et al., 2016; Yang and Patel, 2017), molecular dynamic simulation models (Palermo et al., 2018; Zuo and Liu, 2017), and cryo-electron microscopy (cryo-EM) structures (Huai et al., 2017; Jiang et al., 2019; Shin et al., 2017) (Figure 1a–1d). Despite lacking experimental evidence, it is generally believed that D861 directly participates in Mg2+ chelation and tDNA cleavage (Huai et al., 2017; Palermo et al., 2018; Palermo et al., 2017; Zuo and Liu, 2017). An in silico model (Yoon et al., 2019) that was recently reported also suggested that D861 and N863 are potentially involved in chelating the Mg2+ ion at the catalytic center of the HNH domain, although this discovery also remained untested in an experimental setting. In the diverse homologous structures of DNA/RNA nucleases from other species (Yang, 2008; Yang, 2011), however, the residues spatially equivalent to the D861 of SpyCas9 are conserved as an asparagine. The substitution (N62D) of the corresponding asparagine in the active center of bacteriophage T4 Endonuclease VII (T4 Endo VII) has been shown to abrogate its DNA cleavage activity (Biertümpfel et al., 2007). These observations motivated us to examine the potential role of D861 in the HNH domain of Cas9. We mutated D861 to alanine and tested the activity of the D861A variant using an experimental approach based on Cas9-mediated disruption of the egfp gene in EGFP-expressing human cells. Unexpectedly, this Cas9 variant exhibited DNA-cleavage activity level similar to that of the wild-type protein (Figure 1e and Figure 1—figure supplement 1). To further validate our finding, we performed in vitro cleavage assays using either plasmid or oligo DNA as a substrate and observed that the D861 variant retained similar activity as the wild type, given enough reaction time (Figure 1—figure supplement 2 and Figure 1—figure supplement 3). Our experiments thus demonstrate that D861 is not critical for HNH domain-catalyzed tDNA cleavage, unlike what would be expected from the reported Cas9 complex structures (Anders et al., 2014; Huai et al., 2017; Jiang et al., 2016; Palermo et al., 2018; Palermo et al., 2017; Zuo and Liu, 2017). In other words, the previously reported structures of the HNH domain of DNA-bound Cas9 (Huai et al., 2017; Palermo et al., 2018; Palermo et al., 2017; Zuo and Liu, 2017) potentially represent a conformation that is incompetent for tDNA cleavage. Hence, we refer to this cleavage-incompetent conformation with an inward-facing D861 as ‘psuedoactive state’ hereafter. This report aims to unmask the catalytic state of the HNH nuclease domain in Cas9 and explore the underlying mechanism of activation.
Intrigued by the above paradoxical findings between the structural and functional experiments, we performed molecular modeling and molecular dynamics (MD) simulations to further investigate the residues that may participate in the catalysis of tDNA cleavage. We first examined the apo-state crystal structure of Cas9 (Jinek et al., 2014). One noticeable feature in this apo-Cas9 structure is that the α-helical element in the HNH domain ββα-metal (ββα-Me) fold appears to pose a unique conformation with N863 pointing toward the catalytic center (Figure 1—figure supplement 4a). This inward orientation of N863 was distinct from the inward orientation of D861 observed in the Cas9 structures co-crystalized with sgRNA and/or DNA (Figure 1a–1d). However, the N-terminal of the α-helical segment is disordered in this apo-Cas9 structure (Figure 1—figure supplement 4a), suggesting a high conformational flexibility around this region. We completed the disordered regions of the apo-Cas9 structure by homology modeling and performed molecular dynamics (MD) simulation with a Mg2+ ion placed in the catalytic ββα-Me motif (Figure 1—figure supplement 4b). As a result, the Mg2+ ion formed an octahedral coordination with D839, N863, and four water molecules (Figure 1—figure supplement 4c), which closely resembled the Mg2+ coordination in the X-ray crystal structure of Actinomyces naeslundii Cas9 (AnaCas9, Figure 1—figure supplement 5a) (Jinek et al., 2014). Encouraged by this finding, we next applied the above Mg2+-bound α-helical conformation in our optimized pseudoactive Cas9 complex structure (Zuo and Liu, 2017) (Figure 2a–2b) and performed MD-based refinement on the entire structure (see details in Materials and methods). In our final structure, D839, H840 and N863 on the ββα-Me motif formed the catalytic triad that was poised for cleaving the tDNA. The Mg2+ ion was coordinated with the residues D839 and N863 of SpyCas9, a scissile phosphate (the pro-Sp oxygen and leaving O3’ engaged), and two water molecules in a strained geometry (Figure 2d–2e). This structure highly resembled the catalytic configuration present in the X-ray structure of T4 Endo VII/DNA complex (Biertümpfel et al., 2007) (Figure 2c), indicating the N863 of Cas9 could be engaged in the formation of the catalytic center for tDNA cleavage.
To validate the role of N863 in SpyCas9 function, we examined the gene-editing activities of an N863A variant and a D861A/N863A double-mutant variant using cell-based assays. Both SpyCas9 variants virtually lost the entire gene-editing capability compared to the wild-type protein (Figure 1f and Figure 1—figure supplement 1). Consistently, our in vitro assays also showed an effective loss of tDNA cleavage activity for the N863A variant, essentially producing only nicked products in the plasmid cleavage assay (Figure 1—figure supplement 2) and causing the disappearance of the 23-nucleotide tDNA cleavage product in the oligo assay (Figure 1—figure supplement 3). We note that the DNA nicks produced in the cell-based assay could be efficiently repaired by the cellular machinery, thereby causing the apparent loss of gene-editing capability in the above N863A variants. Together, our results clearly show N863, instead of D861, is the functional catalytic residue. Some studies have also suggested the possibility of N863 in forming the catalytic center of the Cas9 HNH domain (Nishimasu et al., 2014; Yoon et al., 2019). In contrast to the observation from another in silico model that was recently published (Yoon et al., 2019), our computational model indicated that D861 and N863 are unlikely to be simultaneously involved in the coordination of Mg2+ ion for cutting the tDNA. Besides N863, the functional significance of H840 (acting as a general base) has been confirmed by experiments (Jinek et al., 2012; Nishimasu et al., 2014). Another putative active residue in the HNH domain is D839, which could be inferred from the structures of homologous nucleases (Yang, 2008), despite lacking experimental data to support its catalytic role. Here, we discovered that the D839A mutation substantially compromised the gene-editing activity of Cas9 (Figure 1f), which is consistent with our simulation studies (Zuo and Liu, 2017) and directly supports the significance of D839 for Cas9 activity. Collectively, our structural and functional data corroborate the essence of D839, H840 and N863 for the cleavage function of the HNH domain.
Our MD simulations also excluded the likelihood of D837 and N854 (at the two ends of the β-sheet of the ββα-Me motif) to function as the catalytic residues previously postulated to be directly engaged in the coordination of the Mg2+ ion (Chen and Doudna, 2017b; Huai et al., 2017). The simulations showed that the side chain of N854 is ~10 Å distant from the tDNA, making it unlikely be a residue that directly mediates catalysis. However, an activity assay demonstrated the partial loss of activity in the N854A variant (Nishimasu et al., 2014), suggesting that N854 may play an auxiliary role in catalysis. According to our MD models and the solved Cas9 structures (Figure 1a–1d), N854 may stabilize the activated conformation of the HNH domain by making several intra-molecular hydrogen bonds within the ββα-Me motif. Moreover, our structural analyses revealed that the SpyCas9 N854 is spatially and functionally equivalent to the AnaCas9 N597 and Staphylococcus aureus Cas9 (SauCas9) N571 (Figure 1—figure supplement 5), indicating this stabilization role may exist in multiple Cas9 orthologs (Jinek et al., 2014; Nishimasu et al., 2015).
The D837A Cas9 moderately lost gene-editing activity (Figure 1f). From the solved Cas9 structures and our simulations, D837 may contribute to the structural integrity of the ββα-Me motif, although only one hydrogen bond involving its carboxyl group appear to be established (Figure 1a–1d). It is also possible that D837 may aid in proper positioning of the tDNA relative to the HNH domain (as described below).
The comparison between the active and pseudoactive Cas9-nucleic acid complex structures revealed distinct interactions that stabilize these two conformational states. In the pseudoactive conformation, R864 appeared to stabilize the catalytic core by forming ionic and hydrogen-bonding interactions within the ββα-Me motif (Figure 2a), as observed in various bound forms of Cas9 (Figure 1a–1d), whereas K866 is engaged to E371 to form a salt bridge (Figure 2b). In contrast, the corresponding residue of R864 is surrogated by K866 in the active state (Figure 2d), which is in line with the structures of AnaCas9 (K609) and SauCas9 (K583) (Figure 1—figure supplement 5), while R864, on the other hand, is tethered to a negative pocket lined by E370, E396 and D406 on the REC1 domain (Figure 2e). Further structural and sequence analysis indicated that the arginine residue, which is at an equivalent position of R864 in the pseudoactive state, is highly conserved across the ββα-Me nucleases beyond Cas9 (Figure 2—figure supplement 1), whereas both the Type II-A and II-C Cas9 orthologs feature an invariant lysine residue (Figure 2—figure supplement 2) that is spatially equivalent to K866 in the active state. Therefore, a basic amino acid residue (Lys/Arg) seemed to be required at the catalytic core for the ββα-Me superfamily. Moreover, K862 formed aliphatic-aromatic or cation-π interactions with F405 in the pseudoactive state (Figure 2b), whereas it appeared to interact with D837 and/or the +5 phosphate (relative to the protospacer adjacent motif) in one of our simulations for the active state (Figure 2—figure supplement 3). Despite these differences in the interactions with the REC1 domain, other regions of the HNH domain, especially the helical segment preceding the ββα motif, showed similar patterns of binding to the REC2 domain in the pseudoactive and active states (Figure 2f and Figure 2—figure supplement 4).
The removal of the interfacial Mg2+ ion from our models substantially destabilized the pseudoactive and active conformations of the Cas9-nucleic acid complex by ~600 kcal/mol (Supplementary file 1), supporting the indispensable role of the Mg2+ ion for Cas9 conformational activation that has been previously demonstrated (Dagdas et al., 2017; Osuka et al., 2018; Zuo and Liu, 2017). The pseudoactive conformation (denoted as N863-OUT hereafter) appeared to be unique to SpyCas9 since the corresponding asparagine residues in other ββα-Me nucleases, regardless of the apo or bound form, typically oriented toward the catalytic center (Figure 1—figure supplement 5 and Figure 2—figure supplement 1). End-state free energy calculations revealed that the N863-OUT SpyCas9 complex is thermodynamically more stable than its catalytically active (N863-IN) form (Supplementary file 1). Long time-scale MD simulations based on the isolated HNH domain of Cas9 also showed that the N863-IN conformation is remarkably flexible and tends to transition to the N863-OUT conformation, but not vice versa (Figure 2—figure supplement 5). These results, at least partially, explain why the N863-OUT SpyCas9 conformation was captured in the majority of experimental structures (Anders et al., 2016; Anders et al., 2014; Hirano et al., 2016; Huai et al., 2017; Jiang et al., 2015; Jiang et al., 2016; Jinek et al., 2014; Nishimasu et al., 2014).
To further enhance the confidence in our MD-derived structure models, QM/MM simulations were performed. The metal centers shown in Figure 2a and d were optimized by DFTB3 QM/MM simulations (Figure 2—figure supplement 6). Compared to our MD simulations, we identified very limited differences from the QM/MD simulations in terms of the active site configuration. Encouragingly, this observation is consistent with our most recent benchmark work (Zuo and Liu, 2018a) demonstrating the good performance of the used nonbonded Mg2+ ion model (Liao et al., 2017) in maintaining challenging enzyme metal centers.
Based on the findings reported here and in the previous studies (Dagdas et al., 2017; Osuka et al., 2018; Sternberg et al., 2015; Yang et al., 2018), we present a working model to illustrate how the activation of SpyCas9-mediated DNA cleavage is achieved (Figure 2g). The binding of DNA to Cas9-sgRNA induces the interconversion of the HNH domain between the checkpoint intermediate (I) state and the pre-cleavage (P) state. By bridging a gap between the catalytic center and the opposite scissile phosphate, the ambient Mg2+ ion subsequently facilitates a conformational change to transition the HNH domain from the P state to the tDNA docked pseudoactive (D, N863-OUT) or active (D*, N863-IN) state. Before the cleavage reaction occurs, both states exist and the conformational population of the two states reaches an equilibrium. The lower-energy pseudoactive state has larger conformational population than the active state. Once the irreversible DNA cleavage reaction rapidly occurs, the active state conformational population decreases rapidly. The equilibrium between these two states is lost and the conformational population begins to move from the pseudoactive state to the active state until the reaction is completed. The coexistence of multiple Cas9 ternary complexes has important implications for deciphering the biphasic kinetics of Cas9-catalyzed DNA cleavage that has been shown recently by Raper et al. (2018). Taking their results into considerations, we surmise that the early, fast phase characterizes a direct transition from the precatalytic state to the active state, while the slow transition from the pseudoactive to the active state likely contributes to the late, slow phase. Our conclusion would benefit from structural determination of the activated Cas9 ternary complex with metal ions.
Meanwhile, we note that the possibility cannot be ruled out that the N863-IN conformation may be stably formed when enough Mg2+ ions are present in the solvent. In other words, the N863-IN conformation could be induced by Mg2+ even without engaging a DNA substrate. To test this hypothesis, we have performed additional simulations on the pre-cleavage Cas9 complex (Jiang et al., 2016) with multiple Mg2+ ions positioned around N863, but were unable to observe the trend toward the N863-IN conformation over µs time scales (data not shown). Consistently, the HNH domain in 4UN3 (Figure 1b, right panel) and other crystals (like 5B2R, 5B2T, 5B2S, 5FQ5, 5FW1, 5FW2, 5FW3 and 6AI6) remained in the N863-OUT conformation even though as high as eight Mg2+ ions were cocrystalized (Anders et al., 2016; Hirano et al., 2016; Nishimasu et al., 2014; Olieric et al., 2016; Shin et al., 2017). Therefore, the presence of metal ions alone appears to be insufficient for the HNH domain conformational activation. Nevertheless, further experimental and computational research is needed to elucidate the transition process between the two states shown herein as well as the chemical principles defining the requirement for asparagine instead of aspartic acid.
Overall, our study has delineated a molecular framework underlying the catalytic conformation of the HNH nuclease domain of SpyCas9. The findings presented here advance our knowledge of conformational activation that enables Cas9-mediated DNA cleavage, set an important foundation for future studies to further understand the structure-function relationships of Cas9, and facilitate the rational design of Cas9 variants in the future.
Human embryonic kidney 293T (HEK293T) cells (ATCC, Manassas, VA) as a subclone of the HEK293 cell line were cultured in DMEM (Thermo Fisher Scientific, Carlsbad, CA) containing 10% fetal bovine serum (FBS; Thermo Fisher Scientific, Carlsbad, CA) at 37°C. All cells were periodically tested using the MycoAlert mycoplasma detection kit (Lonza, Walkersville, MD) and free of mycoplasma. The HEK293T cells were used to established EGFP-expressing cells by the lentivirus-mediated transduction of pLenti-CMV-GFP-Puro expression plasmid (Addgene, Cambridge, MA) into the cells followed by the selection of single-cell clones that stably express EGPF and fluorescence green (Campeau et al., 2009). The stable clone A2 was used in this study.
For testing Cas9-mediated editing of the egfp gene in the HEK293T-EGFP (A2) cells, an EGFP-targeting sgRNA sequence (EGFP sgRNA1: 5’GGGCGAGGAGCTGTTCACCG3’) was cloned into a lentiCRISPR plasmid (Addgene, Cambridge, MA) and resulted in a construct of a one-vector system for co-expression of sgRNA and wild-type SpyCas9 (Addgene, Cambridge, MA) (Shalem et al., 2014). The site-directed mutagenesis was performed to specifically introduce mutations into the cas9 gene open reading frame (ORF) in the expression construct to generate the expression vectors of different Cas9 variants with the EGFP sgRNA sequence. After mutagenesis, the DNA sequencing of each expression construct was performed to confirm the mutations of the Cas9 gene ORF. HEK293T-EGFP (A2) cells transduced with the Cas9 and sgRNA expression constructs were selected using 5 ug/ml puromycin for two weeks prior to the downstream analysis to determine the editing efficiencies of different Cas9 variants.
The general procedure for immunoblotting was described in previously published reports (Wang et al., 2008; Zolekar et al., 2018). The primary antibody against SpyCas9 (catalog# ab191468) was obtained from Abcam (Cambridge, MA). HRP-conjugated secondary antibodies were from Jackson ImmunoResearch Laboratories (West Grove, PA).
For determining fluorescence intensity and quantifying the percentages of EGFP fluorescence-positive cells in cell samples, samples (~5×105 cells per sample) harvested and resuspended in phosphate-buffered saline (PBS) were analyzed using a SH800Z cell sorter (Sony Biotechnology, San Jose, CA).
The protein variants, SpyCas9D861A and SpyCas9N863A were produced by sequence independent cloning method (SLIC) using SpyCas9WT template plasmid (Addgene: pMJ806) (Jinek et al., 2012) and mutagenic primers (Supplementary file 2) (Scholz et al., 2013). Sequence confirmed clones were transformed into Escherichia coli Rosetta strain 2 (DE3) for protein expression. Overexpression and protein purification were carried out using previously published protocols (Babu et al., 2019; Jinek et al., 2012).
The template for in vitro transcription of sgRNA (98-nucleotide long) contained a 20 nt long spacer as previously described (Babu et al., 2019; Nishimasu et al., 2014). The protocols used for in vitro transcription and sgRNA annealing were as reported previously (Babu et al., 2019). For creating target DNA, a 30 nt long protospacer flanked by a PAM (GGG) was introduced into pUC19 (Babu et al., 2019).
The proteins were diluted in 20 mM HEPES pH 7.5, 150 mM KCl, and 2 mM TCEP, and the cleavage assays were carried out in a final volume of 10 μL. The reaction mix contained 20 mM Tris-HCl pH 7.5, 100 mM KCl, 5 mM MgCl2, 5% (v/v) glycerol, 0.5 mM TCEP, 100 ng of substrate plasmid, 50 nM SpyCas9, and 60 nM sgRNA (protein and RNA at a ratio of 1:1.2 molar). The reaction mixture was incubated at 37°C and stopped at different time points (15 s, 30 s, 1 min, 2.5 min, 5 min, 7.5 min, 10 min, 15 min, 30 min, 45 min, and 60 min) by the addition of 50 mM EDTA and 1% SDS. The reaction products were resolved on 1% agarose gel and products were visualized by ethidium bromide staining. The gel was imaged using a BioRad ChemiDoc MP apparatus. To quantify the cleavage activities, each gel image was analyzed using the ImageJ software (Schneider et al., 2012). The bands of nicked (N), linear (L), and supercoiled (SC) DNA were quantified and designated as IN, IL, and ISC respectively. The nicked, linear and total activity (TA) was calculated using the following equations:
For each reported data point, average values were obtained from a minimum of three replications that were performed using proteins produced from two independent preparations to account for variations in active protein fraction between different preparations. Standard deviation (SD) and standard error of mean (SEM) were calculated based on the number of replications using the following equations:
where R is a data value from each replication, RAV is average of data values of all the replications, and n is the number of replications (a total of three for each protein variant).
Two separate oligo DNA strands used for the radioactive assay were ordered from Integrated DNA technology (IDT, Supplementary file 2). Target (T) and non-target (NT) strands were mixed at equimolar concentration in the annealing buffer (30 mM HEPES pH 7.5, 100 mM potassium acetate) and heated at 95°C for 2 min and allowed for slow cooling. The annealed duplex oligo DNA was 5’ end labeled with 32P (γ−32P ATP purchased from PerkinElmer) using T4 polynucleotide kinase (New England Biolabs). The labeled oligo DNA was purified using BioSpin column P-30 (BioRad). The reaction buffer contained 20 mM Tris-HCl pH 7.5, 100 mM KCl, 10 mM MgCl2, 5% (v/v) glycerol, 0.5 mM TCEP. Approximately 5 nM of labeled oligo duplex was incubated with 250 nM SpyCas9 and 300 nM sgRNA (protein and RNA at a ratio of 1:1.2 molar) at 37°C and stopped at different time points (15 min, 30 min, and 60 min) using EDTA at 10 mM final concentration. Then the samples were treated with proteinase K (New England Biolabs) for 15 min at 50° C to remove SpyCas9. This was followed by addition of equal volume of loading dye (2X concentration is 20 mM EDTA, 95% formamide, 2% SDS, and 0.025% bromophenol blue). The reaction samples were resolved on a 16% poly-acrylamide gel containing 20% formamide and 6.4 M urea. The bands were visualized by phosphor imaging with Typhoon FLA 7000 system (GE life sciences). Three independent replications were performed using proteins from two independent preparations.
The initial coordinates of the apo-state SpyCas9 were taken from the Protein Data Bank (PDB) under accession number 4CMP (solvated at 2.6 Å resolution) (Jinek et al., 2014). This X-ray structure contains two Cas9 monomers, and the molecule B was considered for modeling here (Figure 1—figure supplement 4a). The disordered regions were built up with the tool SWISS-MODEL (Waterhouse et al., 2018) and the missing heavy atoms and hydrogens were added by using the leap program within AMBERTOOL16 (Salomon-Ferrer et al., 2013). The complete structure was then solvated in a cubic water box with a minimal thickness of 13.5 Å from each edge, leading to a periodic boundary box of 138 × 153×126 Å3. The system was neutralized by Na+, and additional NaCl was added to generate a physiological ionic strength of 150 mM. The resulting simulation box contains ~230,000 atoms.
The above system was simulated by the CUDA-accelerated version of AMBER16 pmemd engine (pmemd.cuda; Salomon-Ferrer et al., 2013) using the amber force field ff14SBonlysc for protein, the TIP3P model for water (Jorgensen et al., 1983), and the Joung-Cheatham parameter sets for monovalent ions (Joung and Cheatham, 2008). The non-bonded interactions were truncated at 10 Å, and the long-range electrostatics were calculated through the particle mesh Ewald (PME) summation method (Darden et al., 1993), with a grid spacing of 1 Å. The bonds involving hydrogens were constrained via the SHAKE algorithm (Miyamoto and Kollman, 1992), allowing use of 2-fs time step of integration. After thorough energy minimization, the system underwent slow heating over 50 ps from 0 K to the target 310.15 K in the isothermal-isochoric (NVT) ensemble, followed by a 20-ns equilibration in the isothermal-isobaric (NPT) condition. The protein backbone atoms were restrained in the heating and equilibration stages. Finally, the production run was performed under the NPT ensemble without restraints, extending up to 100 ns. The pressure was controlled at 1.013 bar via the Monte Carlo barostat, and the temperature was maintained at 310.15 K through the Langevin thermostat implemented in AMBER16.
The final structural snapshot from the above simulation was then extracted, and a Mg2+ ion was placed at its HNH domain active center to set up the Mg2+-bound system (Figure 1—figure supplement 4) by reference to the AnaCas9 crystal structure bound with a Mg2+ ion (Figure 1—figure supplement 5a) (Jinek et al., 2014). Also, extra Mg2+ ions were introduced into the system to obtain a physiological concentration of 5 mM. The parameter set developed by Li et al. (Li et al., 2013b) was selected for Mg2+. In the equilibration stage, the distances between the Mg2+ and the coordinating oxygens on D839 and N863 was restrained to 2.1 Å (i.e., the experimental ion-oxygen distance; Zheng et al., 2008). The production run without restrains was extended to 50 ns.
The starting structure of the pseudoactive Cas9 complex (Figure 1d, right panel) was obtained from our recent work (Zuo and Liu, 2017), which was derived by employing the unbiased, brute-force MD simulations on the crystal structure of Cas9-sgRNA-DNA (PDB code: 5F9R) that was captured in the pre-cleavage state (Jiang et al., 2016). The structural model has been validated by different experiments, yet the Mg2+ ion at the HNH domain catalytic center appeared to lose one critical coordination bond with the leaving group O3’ of the scissile phosphate (Zuo and Liu, 2017) (Figure 2—figure supplement 7), as compared to the homologous T4 Endo VII structure complexed with a DNA junction (Biertümpfel et al., 2007) (Figure 2c). We reasoned that this issue may be due to the deficiency with the simple point-charge Mg2+ model used. Most recently, we systematically evaluated the performance of all four types of non-bonded Mg2+ ion models in terms of maintaining a challenging metal center configuration in a nuclease system (Nowotny et al., 2005). Our benchmark calculations demonstrated that the multisite models based a 12-6-4 Lennard-Jones potential (Li and Merz, 2014; Liao et al., 2017), which take charge-induced dipole effects into account, are the only ones that are capable of reproducing the experimental coordination patterns (Zuo and Liu, 2018a). Accordingly, the 12-6-4 type multisite model (Jorgensen et al., 1983) (here the midC4 set) was considered for the Cas9 complex simulation, along with the TIP4PEw model for water, the Joung-Cheatham parameter sets for monovalent ions (Joung and Cheatham, 2008), and the amber force fields ff14SBonlysc, ff99bsc0_chiOL3, ff99bsc0_OL15 for protein, RNA and DNA, respectively. Basically, the complex system was set up and simulated following the above protocol for the apo-Cas9 systems. The generated simulation box is approximately 109 × 145×166 Å3 sized, containing ~282,000 atoms. With different random seed numbers, two parallel simulations were carried out by using the latest AMBER18 (Salomon-Ferrer et al., 2013) that enables GPU calculations of the 12-6-4 ion potential. The simulation length was set to 200 ns each.
The initial model for the active Cas9 complex was constructed by replacing the α-helical segment of the ββα-Me motif in the optimized pseudoactive Cas9 complex (Figure 1a) with the corresponding part in the Mg2+-bound apo-Cas9 structure (Figure 1—figure supplement 4c). The pseudoactive Cas9 complex structure was taken from the above production simulation near 100 ns (i.e. about half of the simulation time). The Mg2+-bound apo-Cas9 structure from the simulation trajectory was selected based on the observation of reasonable bonding with the connecting residues and minimal steric clashes after replacement of the α-helical segment. After sufficient energy minimization, the structural model was subjected to multi-stage equilibration: an initial 20-ns relaxation of the α-helical segment and surrounding residues, an another 20-ns equilibration with the inter-atomic distances within the metal center retrained relative to the T4 Endo VII system (Biertümpfel et al., 2007), followed by an additional 20-ns equilibration with the restraints gradually released. Subsequently, two independent replicas were performed (250 ns/run) under the same simulation conditions set for the pseudoactive system above.
Additional MD simulations were performed to investigate the relative stability of the two conformational states (i.e. N863-IN and N863-OUT) of the α structure element containing N863 and D861. The starting coordinates were taken from the respective structure models above, and only the HNH domain of Cas9 (residues 781 to 905) was included in our simulation to enhance the sampling efficiency. Each isolated HNH domain in the two states was immersed in a truncated octahedral water box, with a minimal thickness of 14.5 Å. The ionic centration was set to 100 mM by adding an appropriate number of K+ and Cl- ions in the aqueous solution. The amber force field ff14SBonlysc and the TIP3P model were used for describing the protein and the water molecules, respectively, and the parameter sets for the monovalent ion were derived from the work by Joung and Cheatham (2008). For each system, five independent simulations were performed under the NPT ensemble with different initial velocities, using a timestep of 2-ps. Each replica was extended to ~10 us, yielding a total of ~50 us of sampling for each system.
The semiempirical DFTB3 QM/MM simulations were further implemented to improve the reliability of our MD-derived structure models. DFTB3 is the third-order variant of density functional theory tight binding (DFTB) that is formulated in a DFT framework (Gaus et al., 2012). According to the extensive benchmark calculations, DFTB3 in its current form is most reliably for structural properties, including for fairly complex bimetallic motifs in diverse metalloenzymes (particularly the phosphoryl-transfer enzymes) (Gaus et al., 2012; Lu et al., 2015; Lu et al., 2016; Roston et al., 2018). The QM region includes the catalytic Mg2+ ion, the protein residues that coordinate the metal ion (i.e., D839 and D861/N863), the general base H840, part of the scissile phosphate and nearby atoms on the target strand, and the water molecules surrounding the metal ions, H840, and the scissile phosphate (cf. Figure 2a and d). Only the side chains of protein and the backbone of DNA are kept in the QM region, and link atoms are added between the Cα and Cβ atoms for the amino acids or between the C4’ and C5’ atoms for the nucleotides. The partitioning results in a total of 72 and 75 QM atoms for the pseudoactive and active Cas9 models, respectively. The dummy complex for the Mg2+ ions employed in pure MD simulations is changed back to the realistic single-atom form. The MM part of the protein and nucleic acids are described using the same AMBER force fields as mentioned above, and the water molecules are described with the TIP3P model. After the stages of energy minimization and slow heating, each system was subjected to two parallel 1,000-ps QM/MM simulations performed with the AMBER program.
The free energies of the Cas9-nucleic acid complex conformers were estimated through the end-point Molecular Mechanics-Generalized Born Surface Area (MM-GBSA) approach (Miller et al., 2012). Compared to the alternative Molecular mechanics-Poisson Boltzmann Surface Area (MM-PBSA), MM-GBSA has been proven to be give comparable or even better accuracy in ranking ligand binding affinities as well as in calculating the relative stability of multiple conformations of a biomolecular system, though MM-PBSA is physically more rigorous (Li et al., 2013a; Zuo and Liu, 2016a; Zuo et al., 2018b). For each state, the MM-GBSA calculations were performed over an ensemble of 2000 snapshots extracted from the last 50 ns of the simulation trajectories using the program MMPBSA.py in AmberTools16. The pairwise GB model of Hawkins, Cramer, and Truhlar (GBHCT) (Hawkins et al., 1995; Hawkins et al., 1996) was used, with the parameters described by Tsui and Case (2000). The default values of the surface tension and the offset to correct the non-polar contribution to the solvation free energy were adopted and the salt concentration was set to 150 mM. Following our recent works (Zuo and Liu, 2016a; Zuo et al., 2018b), the two water molecules closest to the Mg2+ ion at the HNH domain active center were retained as part of the complex, considering the importance of the interfacial water for binding. The entropic contribution was not taken into account due to high computational demand and potential convergence problem, yet omission of this term does not qualitatively affect the results as previously suggested (Hou et al., 2011; Li et al., 2013a; Zuo et al., 2018b; Zuo et al., 2016b).
Particle mesh Ewald: An N⋅log(N) method for Ewald sums in large systemsThe Journal of Chemical Physics 98:10089–10092.
DFTB3: extension of the self-consistent-charge density-functional tight-binding method (SCC-DFTB)Journal of Chemical Theory and Computation 7:931–948.https://doi.org/10.1021/ct100684s
Pairwise solute descreening of solute charges from a dielectric mediumChemical Physics Letters 246:122–129.https://doi.org/10.1016/0009-2614(95)01082-K
Parametrized models of aqueous free energies of solvation based on pairwise descreening of solute atomic charges from a dielectric mediumThe Journal of Physical Chemistry 100:19824–19839.https://doi.org/10.1021/jp961710n
Assessing the performance of the MM/PBSA and MM/GBSA methods. 1. the accuracy of binding free energy calculations based on molecular dynamics simulationsJournal of Chemical Information and Modeling 51:69–82.https://doi.org/10.1021/ci100275a
Structural insights into DNA cleavage activation of CRISPR-Cas9 systemNature Communications 8:1375.https://doi.org/10.1038/s41467-017-01496-2
Comparison of simple potential functions for simulating liquid waterThe Journal of Chemical Physics 79:926–935.https://doi.org/10.1063/1.445869
Determination of alkali and halide monovalent ion parameters for use in explicitly solvated biomolecular simulationsThe Journal of Physical Chemistry B 112:9020–9041.https://doi.org/10.1021/jp8001614
The interactions and recognition of cyclic peptide mimetics of tat with HIV-1 TAR RNA: a molecular dynamics simulation studyJournal of Biomolecular Structure & Dynamics 31:276–287.https://doi.org/10.1080/07391102.2012.698248
Rational design of particle mesh ewald compatible Lennard-Jones parameters for +2 metal cations in explicit solventJournal of Chemical Theory and Computation 9:2733–2748.https://doi.org/10.1021/ct400146w
Taking into account the Ion-induced dipole interaction in the nonbonded model of ionsJournal of Chemical Theory and Computation 10:289–297.https://doi.org/10.1021/ct400751u
Extending the nonbonded cationic dummy model to account for Ion-Induced dipole interactionsThe Journal of Physical Chemistry Letters 8:5408–5414.https://doi.org/10.1021/acs.jpclett.7b02358
Parametrization of DFTB3/3OB for magnesium and zinc for chemical and biological applicationsThe Journal of Physical Chemistry B 119:1062–1082.https://doi.org/10.1021/jp506557r
QM/MM free energy simulations: recent progress and challengesMolecular Simulation 42:1056–1078.https://doi.org/10.1080/08927022.2015.1132317
MMPBSA.py: An Efficient Program for End-State Free Energy CalculationsJournal of Chemical Theory and Computation 8:3314–3321.https://doi.org/10.1021/ct300418h
Settle: an analytical version of the SHAKE and RATTLE algorithm for rigid water modelsJournal of Computational Chemistry 13:952–962.https://doi.org/10.1002/jcc.540130805
Data-collection strategy for challenging native SAD phasingActa Crystallographica Section D Structural Biology 72:421–429.https://doi.org/10.1107/S2059798315024110
Real-time observation of flexible domain movements in CRISPR-Cas9The EMBO Journal 37:e96941.https://doi.org/10.15252/embj.201796941
Functional insights revealed by the kinetic mechanism of CRISPR/Cas9Journal of the American Chemical Society 140:2971–2984.https://doi.org/10.1021/jacs.7b13047
Analysis of Phosphoryl-Transfer enzymes with QM/MM free energy simulationsMethods in Enzymology 607:53–90.https://doi.org/10.1016/bs.mie.2018.05.005
An overview of the amber biomolecular simulation packageWiley Interdisciplinary Reviews: Computational Molecular Science 3:198–210.https://doi.org/10.1002/wcms.1121
SWISS-MODEL: homology modelling of protein structures and complexesNucleic Acids Research 46:W296–W303.https://doi.org/10.1093/nar/gky427
An equivalent metal ion in one- and two-metal-ion catalysisNature Structural & Molecular Biology 15:1228–1231.https://doi.org/10.1038/nsmb.1502
Nucleases: diversity of structure, function and mechanismQuarterly Reviews of Biophysics 44:1–93.https://doi.org/10.1017/S0033583510000181
Data mining of metal ion environments present in protein structuresJournal of Inorganic Biochemistry 102:1765–1776.https://doi.org/10.1016/j.jinorgbio.2008.05.006
Insights into the inhibitory mechanism of D13-9001 to the multidrug transporter AcrB through molecular dynamics simulationsThe Journal of Physical Chemistry B 120:2145–2154.https://doi.org/10.1021/acs.jpcb.5b11942
Structure and dynamics of Cas9 HNH domain catalytic stateScientific Reports 7:17271.https://doi.org/10.1038/s41598-017-17578-6
Assessing the performance of the nonbonded Mg2+ Models in a Two-Metal-Dependent RibonucleaseJournal of Chemical Information and Modeling 59:399–408.https://doi.org/10.1021/acs.jcim.8b00627
Blake WiedenheftReviewing Editor; Montana State University, United States
Cynthia WolbergerSenior Editor; Johns Hopkins University School of Medicine, United States
In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.
Thank you for submitting your article "Structural and functional insights into the bona fide catalytic state of the CRISPR-Cas9 HNH nuclease domain" for consideration by eLife. Your article has been reviewed by three peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Cynthia Wolberger as the Senior Editor. The reviewers have opted to remain anonymous.
The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.
Despite considerable effort, there is still no consensus about which residues are necessary for Cas9-mediated cleavage of DNA. In this focused study, the authors combine structural modeling and MD simulations to identify putative residues (i.e., D839-H840-N863) in the HNH domain of Cas9 that are necessary for cleavage of the complementary DNA. The importance of these residues, as well as previously implicated residues, is tested using an in vivo reporter assay. While the reviewers agree that this study helps clarify the catalytic mechanism, the experiments do not provide direct evidence that D839, H840 and N863 are necessary for DNA cleavage.
The reviewers raise a few concerns that must be adequately addressed before the paper can be accepted. Some of the required revisions will require further experimentation.
1) Use direct methods to measure the efficiency of DNA cleavage for WT and mutant Cas9s. In vitro biochemistry performed using the preferred enzymes is preferred.
2) MM-GBSA calculations are not accurate. The reviewers suggest more rigorous methods (e.g. umbrella sampling, metadynamics, etc.) to obtain the free energy landscape between the two different conformational states of the loop containing N863 and D861.https://doi.org/10.7554/eLife.46500.023
The reviewers raise a few concerns that must be adequately addressed before the paper can be accepted. Some of the required revisions will require further experimentation.
1) Use direct methods to measure the efficiency of DNA cleavage for WT and mutant Cas9s. In vitro biochemistry performed using the preferred enzymes is preferred.
To fully address this concern, we set up a new collaboration with Dr. Rakhi Rajan. Rajan lab performed in vitro plasmid cleavage assays using the two variants of SpyCas9 (viz. SpyCas9D861A and SpyCas9N863A), with the wild-type enzyme (SpyCas9WT) as a control. The results are shown in Figure 1—figure supplement 2. SpyCas9WT cleaved >95% of the plasmid within 15 seconds and ~80% linearization was obtained within 2.5 mins (Figure 1—figure supplement 2A). The activity profile of SpyCas9D861A is similar to that of SpyCas9WT, although SpyCas9D861A appeared to require slightly longer time to linearize 80% of the same plasmid. Based on our results, SpyCas9D861A cleaved >95% of the plasmid in 5 mins, with ~80% linearization within 30 mins (Figure 1—figure supplement 2B). Interestingly, SpyCas9N863A produced only nicked products even after 60-minitue incubation (Figure 1—figure supplement 2C). Time required for SpyCas9N863A to nick ~80% of the plasmid was estimated around 10 mins. The in vitro cleavage assays here strongly support the conclusion from our cell-based experiments and in silico simulations that N863, instead of D861, is the essential for the catalytic reaction performed by the HNH domain of SpyCas9.
Furthermore, we performed radioactive assays using duplex oligos labeled with 32P at 5’ end of both strands to determine which endonuclease domain is inactive. SpyCas9WT predominantly produced one band for target strand cleavage by HNH (23 nt long) and one band for non-target strand cleavage by RuvC (31 nt long) (Figure 1—figure supplement 3). SpyCas9D861A also cleaved both target and non-target strands, although the rate of cleavage is relatively lower compared to SpyCas9WT. In contrast, there was no product formation from target strand cleavage by SpyCas9N863A, clearly indicating that the HNH domain is impaired in this variant. Interestingly, the SpyCas9N863A has significantly reduced total activity, as indicated by the amount of uncut duplex present in the reaction even after one-hour incubation (Figure 1—figure supplement 3).
Altogether, our results indicate that N863 is indispensable for the HNH nuclease activity, whereas D861 appears to provide a supporting role to enable a faster reaction rate but is nonessential for target DNA strand cleavage. A closer examination of our simulation trajectories led us to propose that D861 might have a role in stabilizing the catalytic ββα-Me motif by forming an intra-molecular salt bridge (Figure 2D and Figure 2—figure supplement 6A), or aid the initial recruitment of metal ions around the active center along with other negative species like D839 and D837 (Figure 2A and Figure 2—figure supplement 6B) (Picot et al., 2017, Nat. Commun.). The in vitro experimental results are illustrated in Figure 1—figure supplement 2 and Figure 1—figure supplement 3, and other relevant information have been included in our revised manuscript.
2) MM-GBSA calculations are not accurate. The reviewers suggest more rigorous methods (e.g. umbrella sampling, metadynamics, etc.) to obtain the free energy landscape between the two different conformational states of the loop containing N863 and D861.
We agree that the end-state MM-GBSA approach is physically not so rigorous as umbrella sampling (US) and metadynamics in estimating free energy and should be regarded as semiquantitative in nature. In this study, the difference in free energy between the two conformational states (denoted N863-IN and N863-OUT) is of our most interest, and MM-GBSA is especially advantageous in treating such kinds of calculations due to its good balance of computational efficiency and accuracy as well as due to the absence of the need of reaction coordinate definition (Hou et al., 2011; Miller et al., 2012). As demonstrated by extensive benchmark calculations (Hou et al., 2011), the accuracy of MM-GBSA is generally desirable for problems where trends are of special interest, despite being system-dependent in some cases. Specifically, with this method, we and other groups have obtained reasonable results in (relative) free energy calculations on the same Cas9 system and other homologous systems in recent studies (Zuo and Liu, 2016; Zuo and Liu, 2017; Zheng, Shi and Mu, 2018; Zuo and Liu, 2018).
We fully agree with the reviewer that determining the free energy profile between the two conformational states would benefit a deeper mechanistic understanding of the conformational activation in CRISPR-Cas9. Concerning the case here, however, it is quite challenging to construct the expected PMF profiles with an acceptable accuracy in two months, the given revision time. The loop of ~14 amino acids (denoted α structure element in the text due to historical reason) undergoes a substantial conformational reorganization in between the N863-OUT and N863-IN state (RMSD of ~6 Å). It is nontrivial to define an appropriate collective variable (CV) to describe this process. More importantly, the conformational activation of the HNH domain is aided by concerted interactions with the metal ions and the target DNA, and it is almost infeasible to characterize the complex and subtle interactions properly with a single CV or even a set of CVs. Moreover, the systems here are quite large (containing ~230,000 atoms), and the computational burden needed to derive a converged, multi-dimensional PMF (assuming ideal CVs are available) would be quite massive. Therefore, we think this open topic might warrant a separate, more focused research in the future (as we had noted in Discussion of the manuscript).
Despite the difficulties, we have conducted a set of five unconstrained, long-time scale MD simulations starting from each conformational state using the isolated HNH domain (see details for the simulation protocols in Materials and methods). The resultant free energy landscapes shown in Figure 2—figure supplement 5A-5B were constructed based on the principal component analysis over 50-μs of sampling and are plotted against the first two principalcomponents (PC1 and PC2). We also calculated the free energy landscape against the RMSD of the α structure element and the difference in distances of D839-D861 (dD839-D861) and D839-N863 (dD839-N863) using the combined simulation trajectories (see details in the legend).
The N863-OUT energy landscape is characterized by a single minimum around the simulation initial point (Figure 2—figure supplement 5B), indicating the N863-OUT conformation is quite stable. By contrast, the N863-IN system is remarkably flexible as shown by the accessibility of much broader conformational space (Figure 2—figure supplement 5A). The observation is well consistent with the X-ray crystallographic data, adding another support to our finding presented in this study.
The entire transition between the N863-OUT and N861 conformations have not been captured in our simulations. Yet we observed that the N863-IN conformation appears to trend toward the N863-OUT conformation via a metastable and a prominent intermediate state (Figure 2—figure supplement 5C). The metastable state is characterized by hydrogen bond interactions between N863 and D861 in the α structure element, with N863 being at a position between the IN and OUT conformation. In contrast, the α structure element in the intermediate state is largely extended, where N863 faces outward and N861 orients partially inward. In the meanwhile, we note that care should be exercised in interpreting the current free energy results, as the use of two CVs only still has difficulty in accounting for such a complex transition process. Overall, the free energy calculations qualitatively support our main conclusion from this work.https://doi.org/10.7554/eLife.46500.024
- Zhicheng Zuo
- Rakhi Rajan
- Rakhi Rajan
- Yu-Chieh Wang
- Jin Liu
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
We appreciate the assistance from Dr. Qinghua Liao (Uppsala University) in preparing the Mg2+ ion model for MD simulations and thank the Texas Advanced Computing Center (TACC) at the University of Texas at Austin and the University of North Texas (UNT) High Performance Computing Services (a division of the University Information Technology with additional support from UNT Office of Research and Economic Development) for providing us with computational powers that our study has leveraged.
- Cynthia Wolberger, Johns Hopkins University School of Medicine, United States
- Blake Wiedenheft, Montana State University, United States
© 2019, Zuo et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.