Comprehensive interrogation of the ADAR2 deaminase domain for engineering enhanced RNA editing activity and specificity

  1. Dhruva Katrekar
  2. Yichen Xiang
  3. Nathan Palmer
  4. Anushka Saha
  5. Dario Meluzzi
  6. Prashant Mali (corresponding author)
  1. Department of Bioengineering, University of California San Diego, United States
  2. Division of Biological Sciences, University of California San Diego, United States

Abstract

Adenosine deaminases acting on RNA (ADARs) can be repurposed to enable programmable RNA editing. However, their exogenous delivery leads to transcriptome-wide off-targeting, and their enzymatic activity on certain RNA motifs, especially those flanked by a 5’ guanosine, is very low, which limits their utility as a transcriptome engineering toolset. Towards addressing these issues, we first performed a novel deep mutational scan of the ADAR2 deaminase domain, directly measuring the impact of every amino acid substitution across 261 residues on RNA editing. This enabled us to create a domain-wide mutagenesis map while also revealing a novel hyperactive variant with improved enzymatic activity at 5’-GAN-3’ motifs. As overexpression of ADAR enzymes, especially hyperactive variants, can lead to significant transcriptome-wide off-targeting, we next engineered a split-ADAR2 deaminase, which resulted in >100-fold more specific RNA editing as compared to full-length deaminase overexpression. Taken together, we anticipate this systematic engineering of the ADAR2 deaminase domain will enable broader utility of the ADAR toolset for RNA biotechnology applications.

Editor's evaluation

This manuscript provides a deep mutational scanning of the deaminase domain of human ADAR2 to provide a comprehensive assessment of amino acids that alter editing activity at a specific adenosine flanked by preferred nucleotides (UAG). The results are quite important in terms of impact on precision medicine.

https://doi.org/10.7554/eLife.75555.sa0

Introduction

Adenosine to inosine (A-to-I) editing is a common post-transcriptional modification in RNA that occurs in a variety of organisms, including humans. This A-to-I deamination of specific adenosines in double-stranded RNA is catalyzed by enzymes called adenosine deaminases acting on RNA (ADARs) (Melcher et al., 1996; Bass and Weintraub, 1988; Peng et al., 2012; Nishikura, 2016; Eggington et al., 2011; Wagner et al., 1989; Bass and Weintraub, 1987; Mannion et al., 2014; Tan et al., 2017; Tomaselli et al., 2015; Levanon et al., 2004; Schoft et al., 2007). Since inosine is structurally similar to guanosine, it is interpreted as a guanosine during the cellular processes of translation and splicing, thereby making ADARs powerful systems for altering protein sequences.

Correspondingly, adenosine deaminases have been repurposed for site-specific RNA editing by recruiting them to target RNA sequences using engineered ADAR-recruiting RNAs (adRNAs) (Woolf et al., 1995). Recently, several studies have demonstrated the potential of both genetically encodable and chemically modified RNA-guided adenosine deaminases for the correction of point mutations and the repair of premature stop codons both in vitro (Montiel-Gonzalez et al., 2013; Stafforst and Schneider, 2012; Cox et al., 2017; Wettengel et al., 2017; Merkle et al., 2019; Sinnamon et al., 2017; Monteleone et al., 2019; Fukuda et al., 2017; Qu et al., 2019) and in vivo (Katrekar et al., 2019; Sinnamon et al., 2020). These studies have primarily relied on exogenous ADARs which introduce a significant number of transcriptome-wide off-target A-to-I edits (Cox et al., 2017; Katrekar et al., 2019; Vallecillo-Viejo et al., 2018; Vogel et al., 2018). One solution to this problem is the engineering of adRNAs to enable the recruitment of endogenous ADARs. In this regard, we recently showed that using simple long antisense RNA (>60 bp) can suffice to recruit endogenous ADARs and these adRNAs are both genetically encodable and chemically synthesizable (Katrekar et al., 2019); and Merkle and colleagues showed that using engineered chemically synthesized antisense oligonucleotides (Merkle et al., 2019) could also lead to robust RNA editing via endogenous ADAR recruitment. Although this modality allows for highly specific editing, its applicability is restricted to editing adenosines in certain RNA motifs preferred by the native ADARs, and in tissues with high endogenous ADAR activity. Additionally, it cannot be utilized for novel functionalities such as deamination of cytosine to uracil (C-to-U) editing which requires exogenous delivery of ADAR2 variants (Abudayyeh et al., 2019). 
Thus, engineering a genetically encodable RNA editing tool that efficiently edits RNA with high specificity and activity is essential for enabling broader use of this toolset for biotechnology and therapeutic applications.

In this regard, the crystal structure of the ADAR2 deaminase domain (ADAR2-DD) (Macbeth et al., 2005; Matthews et al., 2016; Thuy-Boun et al., 2020) and several key biochemical and computational studies (Ohman et al., 2000; Kuttan and Bass, 2012; Daniel et al., 2017; Daniel et al., 2012; Dawson et al., 2004; Wang and Beal, 2016; Roth et al., 2019; Schaffer et al., 2020; Stefl et al., 2010; Riedmann et al., 2008) have laid the foundation for understanding its catalytic mechanism and target preferences, but we still lack comprehensive knowledge of how mutations and fragmentation affect the ability of the ADAR2-DD to edit RNA. To address this, we first carried out a quantitative deep mutational scan (DMS), measuring the effect of every possible single amino acid substitution across 261 residues of the ADAR2-DD, on enzyme function. We utilized the sequence-function map thus generated, to identify novel enhanced variants for A-to-I editing. Additionally, combining information from these sequence-function maps with existing knowledge of the structure and residue conservation scores, we also engineered a genetically encodable split-ADAR2 system that enabled efficient and highly specific RNA editing.

Results

DMS of the ADAR2-DD

To gain comprehensive insight into how mutations affect the ADAR2-DD, we used DMS, a technique that enables simultaneous assessment of the activities of thousands of protein variants (Fowler and Fields, 2014; Araya and Fowler, 2011). Typically, this approach relies on phenotypic selection methods such as cell fitness or fluorescent reporters that result in an enrichment of beneficial variants and a depletion of deleterious variants. However, as RNA editing yields are not precisely quantifiable using surrogate readouts, we focused on directly measuring RNA editing activity in the screens. To do so, we linked genotype to phenotype by placing the RNA editing sites on the same transcript encoding the deaminase variant. These RNA editing sites were chosen within the deaminase domain such that they met two criteria: (1) the sites were outside the 261 residue region where single amino acid substitutions were created for the DMS; and (2) an A-to-I(G) change at these sites resulted in a synonymous alteration at the codon level. By ensuring every cell in the pooled screen received a single library element, we could perform a quantitative DMS of the core 261 amino acids (residues 340–600) of the ADAR2-DD via 4959 (261 × 19) single amino acid variants, directly measuring the effect of each mutation on A-to-I editing yields (Figure 1a, b, and c).
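As a rough illustration of the library arithmetic and of criterion (2), the sketch below enumerates every single amino acid substitution over a 261-residue window and checks whether an A-to-G change within a codon is synonymous. It is not the authors' pipeline: the placeholder sequence, the function names, and the use of DNA letters (T in place of U) are assumptions made only for this example; the codon table is the standard genetic code.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AA_BY_CODON = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AA_BY_CODON)
}

def single_substitutions(wild_type, first_residue):
    """List every single amino acid substitution as (position, wt, mutant)."""
    variants = []
    for offset, wt in enumerate(wild_type):
        for mutant in AMINO_ACIDS:
            if mutant != wt:
                variants.append((first_residue + offset, wt, mutant))
    return variants

def synonymous_a_to_g_sites(codon):
    """Return codon positions (0-2) where an A-to-G change keeps the amino acid."""
    sites = []
    for i, base in enumerate(codon):
        if base == "A":
            edited = codon[:i] + "G" + codon[i + 1:]
            if CODON_TABLE[edited] == CODON_TABLE[codon]:
                sites.append(i)
    return sites

# Placeholder 261-residue sequence (NOT the real ADAR2-DD sequence).
placeholder = "A" * 261
library = single_substitutions(placeholder, first_residue=340)
print(len(library))  # 261 positions x 19 alternative residues
```

For instance, `synonymous_a_to_g_sites("TTA")` reports the third position, because TTA and TTG both encode leucine, whereas no position in `"ATG"` qualifies.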

Figure 1 with 3 supplements
Overview of the ADAR2 deaminase domain (ADAR2-DD) deep mutational scan (DMS) and 5’-GAN-3’ enhancer editing screen.

(a) Lentiviral vectors comprising two copies of MS2-adRNA both targeting either a 5’ or a 3’ UAG target site for the DMS were created. For the 5’-GAN-3’ enhancer screen, the MS2-adRNAs targeted a 5’ or 3’ GAC site. The lentiviral vector also contained an mCherry and a hygromycin resistance marker. Libraries of single amino acid variants of the ADAR2-DD were created in a second lentiviral vector with a puromycin selection marker. For the 5’-GAN-3’ enhancer screen, the corresponding library was created in the hyperactive ADAR2-DD(E488Q) backbone. (b) HEK293FT cells were transduced with the MS2-adRNA lentiviruses at a high multiplicity of infection (MOI) and upon hygromycin selection, a single clone with high mCherry expression was selected. Four independent clonal cell lines were created, harboring MS2-adRNA targeting the 5’ and 3’ UAG and GAC sites. The clonal cell line bearing the MS2-adRNA was then transduced with the lentiviral library of MCP-ADAR2-DD variants at a low MOI to ensure delivery of a single variant per cell. Cells were then selected with puromycin. (c) During the processes of cellular transcription and translation, each cell produces the MS2-adRNA as well as an MCP-ADAR2-DD variant. Upon translation in the cell, each MCP-ADAR2-DD variant, in combination with the MS2-adRNA, edits its own transcript, creating a synonymous change. Cells were harvested, mRNA was isolated, and regions of the ADAR2-DD were amplified and sequenced. The fraction of edited reads was then computed for each mutant. (d) Replicate correlation for the ADAR2-DD DMS. The X and Y axes represent the log2 fold change in editing as compared to the ADAR2. (e) Structure of the ADAR2-DD bound to its substrate (PDB 5HP3) with the degree of mutability of each residue as measured by the DMS highlighted. Residues that are highly intolerant to mutations are colored red while residues that are highly mutable are colored yellow. Residues not assayed in this DMS are colored white.
(f) Using the library chassis of the DMS, a screen of deaminase domain mutants (in an E488Q background) was performed to mine variants with improved activity against 5’-GAN-3’ RNA motifs. Replicate correlation for the 5’-GAN-3’ enhancer mutant screen. The X and Y axes represent the log2 fold change in editing as compared to the ADAR2. (g) Structure of the ADAR2-DD(E488Q) bound to its substrate (PDB 5ED1) with the N496 residue highlighted in red, the E488Q residue in cyan, the target adenosine in green, the orphaned cytosine in magenta, and the adenosine on the unedited strand that base pairs with the 5’ uracil flanking the target adenosine in orange. (h) The E488Q, N496F double mutant was validated by editing a GAA motif in the GAPDH CDS, a GAG motif in the KRAS CDS and GAC, GAT motifs in the RAB7A 3’ UTR. Values represent mean ± SEM (n = 3). p-Values were computed using a two-tailed unpaired t-test. All experiments were carried out in HEK293FT cells.

Figure 1—source data 1

ADAR2 deaminase domain deep mutational scan and 5’-GAN-3’ enhancer editing screen.

https://cdn.elifesciences.org/articles/75555/elife-75555-fig1-data1-v2.xlsx

Given the large size of the deaminase domain at >750 bp, the library of ADAR variants was created using six tiling oligonucleotide pools (Figure 1—figure supplement 1a). These pools were cloned into a lentiviral vector containing a sequence encoding the MS2 coat protein (MCP) and the remainder of the deaminase domain bearing a nuclear export signal (NES) followed by a self-cleaving peptide (P2A) and a puromycin resistance gene (Figure 1a). To ensure read length coverage in next generation sequencing, members of the first three library pools were assayed for editing at the 5’ end of the ADAR2-DD while the remaining members were assayed at the 3’ end (Figure 1c, Figure 1—figure supplement 1a). To minimize assay noise resulting from varying expression of the guide RNA in each cell, two HEK293FT clonal cell lines were created following a high multiplicity of infection (MOI) transduction with lentivirus bearing two copies of MS2-adRNAs targeting either the 5’ or the 3’ UAG sites integrated into them (Figure 1b). The DMS was carried out in cell lines harboring these MS2-adRNAs by transducing them with the corresponding ADAR mutant libraries at a low MOI of 0.2–0.4. Two biological replicates were performed in independent plates of cells transduced with the lentiviral libraries. Following lentiviral transduction and puromycin selection, RNA was extracted from the harvested cells and reverse transcribed. Relevant regions of the deaminase domain were amplified from the cDNA and sequenced (Figure 1—figure supplement 1a); 4958 of the 4959 possible variants were successfully detected, of which 4931 elements had over 50 reads in each biological replicate and were included in the subsequent analysis (Figure 1—figure supplement 1b). The deaminase domain transcripts for each variant also contained the associated A-to-I editing yields, which were then quantified for both replicates of the DMS (R² = 0.687) (Figure 1d).
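The read filtering and per-variant scoring described above can be sketched as follows. This is a minimal illustration rather than the analysis code used in the study: the toy read counts, the variant labels, the reference (wild-type) editing yield, and the pseudocount used to avoid taking the logarithm of zero are all assumptions introduced here.

```python
import math

def shared_well_covered(rep1, rep2, min_reads=50):
    """Variants with more than min_reads total reads in both replicates."""
    return sorted(
        v for v in rep1
        if v in rep2 and rep1[v][1] > min_reads and rep2[v][1] > min_reads
    )

def log2_fold_change(edited, total, reference_yield, pseudocount=0.5):
    """log2 of the variant's editing yield relative to a reference yield.

    The pseudocount is an assumption of this sketch; it only keeps the
    logarithm finite when a variant has zero edited reads.
    """
    variant_yield = (edited + pseudocount) / (total + 2 * pseudocount)
    return math.log2(variant_yield / reference_yield)

# Toy counts: {variant: (edited_reads, total_reads)}; all values are made up.
replicate_1 = {"variant_A": (80, 100), "variant_B": (5, 100), "variant_C": (10, 20)}
replicate_2 = {"variant_A": (150, 200), "variant_B": (12, 150), "variant_C": (30, 60)}
reference_yield = 0.40  # hypothetical wild-type editing yield

kept = shared_well_covered(replicate_1, replicate_2)
scores = {
    v: (log2_fold_change(*replicate_1[v], reference_yield),
        log2_fold_change(*replicate_2[v], reference_yield))
    for v in kept
}
for variant, (rep1_score, rep2_score) in scores.items():
    print(variant, round(rep1_score, 2), round(rep2_score, 2))
```

In this toy input, `variant_C` is dropped because it has only 20 reads in the first replicate, mirroring the coverage filter applied before analysis.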

The scans revealed both intrinsic domain properties, and also several mutations that enhanced RNA editing (Figure 1e, Figure 1—figure supplement 1c). Specifically: (1) As expected, most mutations in conserved regions 442–460 and 469–495 that bind the RNA duplex near the editing site led to a significant decrease in editing efficiency of the enzyme (Matthews et al., 2016). (2) However, mutating the negatively charged E488 residue, which recognizes the cytosine opposite the flipped adenosine by donating hydrogen bonds, to a positively charged or most polar-neutral amino acids resulted in an improvement in editing efficiency. This is consistent with the previously discovered E488Q mutation which has been shown to improve the catalytic activity of the enzyme (Kuttan and Bass, 2012). (3) Furthermore, most mutations to residues that contact the flipped adenosine (V351, T375, K376, E396, C451, R455) were observed to be detrimental to enzyme function (Matthews et al., 2016). (4) Similarly, the residues of the ADAR2-DD that interact with the zinc ion in the active site and the inositol hexakisphosphate (R400, R401, K519, R522, S531, W523, D392, K483, C451, C516, H394, and E396) were all also extremely intolerant to mutations (Macbeth et al., 2005). (5) Additionally, as expected, surface exposed residues in general readily tolerated mutations as compared to buried residues (Matthews et al., 2016).

To independently validate the results from the DMS, we individually examined 33 mutants from the DMS whose editing efficiencies ranged from very low to very high as compared to the wild-type ADAR2-DD. The mutants were assayed for their ability to repair a premature amber stop codon (UAG) in the cypridina luciferase (cluc) transcript (Cox et al., 2017). The Pearson correlation between the arrayed validations and the data obtained in the screen was 0.818 while the Spearman (rank) correlation was 0.824 (Figure 1—figure supplement 2a). Additionally, we also validated several of these mutants for their ability to edit UAG motifs in the GAPDH and KRAS CDS (Figure 1—figure supplement 2b). These validations suggest that the trends in RNA editing activity (high vs. low) are well predicted using this novel screening approach and this enabled us to create a mutability map of the ADAR2-DD where residues that tolerate mutations are highlighted in yellow (Figure 1e). Additionally, we compared the efficiency of variants in our ADAR2-DD DMS at editing UAG triplets, to published mutants (Matthews et al., 2016; Thuy-Boun et al., 2020; Kuttan and Bass, 2012) and again observed similar agreement in the trends of activity of most variants, confirming the efficacy of this DMS at accurately indicating whether a mutation is beneficial, neutral, or detrimental for enzymatic activity.
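For readers less familiar with the two statistics quoted above, the following sketch computes a Pearson correlation and a Spearman (rank) correlation in plain Python. The paired values are toy numbers, not measurements from the screen or the arrayed validation, and the helper names are chosen only for this example.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def ranks(values):
    """1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman correlation: the Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

# Toy paired measurements (screen score vs. arrayed validation); made up.
screen_scores = [1.0, 2.0, 3.0, 4.0]
arrayed_scores = [1.0, 4.0, 9.0, 16.0]
print(round(pearson(screen_scores, arrayed_scores), 3))
print(round(spearman(screen_scores, arrayed_scores), 3))
```

The toy data are monotonic but not linear, so the rank correlation is perfect while the Pearson correlation is slightly below 1, which illustrates why the two values can differ.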

Enhancing enzyme activity at 5’-GAN-3’ motifs

Building on this platform (Figure 1a, b, and c), we next screened for domain variants that improved editing at refractory RNA motifs such as adenosines flanked by a 5’ guanosine (Vogel et al., 2018; Kuttan and Bass, 2012). Toward this, two HEK293FT clonal cell lines were created with two copies of MS2-adRNAs targeting either the 5’ or 3’ GAC sites integrated into them. A screen was carried out in cell lines harboring these MS2-adRNAs by transducing them with the corresponding MCP-ADAR2-DD(E488Q) libraries at a low MOI (0.2–0.4), evaluating the potential of 3287 mutants to edit a GAC motif. As above, following lentiviral transduction and selection, RNA was extracted, reverse transcribed, and relevant regions of the deaminase domain amplified, sequenced, and analyzed (R² = 0.476) (Figure 1f). Via this approach, we discovered a novel double mutant, E488Q, N496F, that enhanced editing at a 5’-GAC-3’ motif. Interestingly, in the ADAR2-DD(E488Q) crystal structure, the N496 residue is in close proximity to the adenosine on the unedited strand that base pairs with the 5’ uracil flanking the target adenosine (Figure 1g; Matthews et al., 2016). We validated this mutant using a cypridina luciferase (cluc) reporter bearing a premature opal stop codon (UGA) and confirmed that the E488Q, N496F double mutant was 3-fold better at restoring luciferase activity as compared to E488Q alone (Figure 1—figure supplement 3a). To further confirm that the E488Q, N496F double mutant could be used to efficiently edit adenosines flanked by a 5’ guanosine, we tested the ability of this mutant to edit GAC, GAT, GAG, and GAA motifs in the endogenous RAB7A (3’ UTR), KRAS (CDS), and GAPDH (CDS) transcripts.
We observed that the double mutant E488Q, N496F was 1.1- to 2.1-fold more efficient at editing various 5’-GAN-3’ motifs as compared to the E488Q (Figure 1h, Figure 1—figure supplement 3b and c), confirming the ability of this novel screening format to discover variants with enhanced activity toward refractory RNA motifs. We also confirmed that this new variant was at least as efficient as the E488Q at editing all other motifs, with improved editing also observed at 5’-NAC-3’ motifs (Figure 1—figure supplement 3c).
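The rationale for transducing the libraries at a low MOI (0.2–0.4) can be made concrete with a short calculation. The sketch below assumes that the number of integrations per cell follows a Poisson distribution with mean equal to the MOI, and that cells surviving selection are exactly those with at least one integration; both are simplifying assumptions of this example, not statements from the study.

```python
import math

def poisson_pmf(k, moi):
    """Probability that a cell receives exactly k integrations."""
    return math.exp(-moi) * moi ** k / math.factorial(k)

def fraction_single_among_transduced(moi):
    """Among cells with at least one integration, fraction with exactly one."""
    p_zero = poisson_pmf(0, moi)
    p_one = poisson_pmf(1, moi)
    return p_one / (1 - p_zero)

for moi in (0.2, 0.3, 0.4):
    print(moi, round(fraction_single_among_transduced(moi), 3))
```

Under these assumptions, roughly 90% of transduced cells carry a single variant at an MOI of 0.2 and roughly 81% at an MOI of 0.4, which is why lower MOIs give a cleaner link between genotype and measured editing.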

Improving specificity via splitting of the ADAR2-DD

In addition to increasing the on-target activity of ADARs at editing adenosines in non-preferred motifs, another challenge toward unlocking their utility as an RNA editing toolset is that of improving specificity. Due to their intrinsic dsRNA binding activity, overexpression of ADARs leads to promiscuous transcriptome-wide off-targeting, and thus, when relying on exogenous ADARs, it is important to restrict the catalytic activity of the overexpressed enzyme only to the target mRNA. We hypothesized that it might be possible to achieve this by splitting the deaminase domain into two catalytically inactive fragments that come together to form a catalytically active enzyme only at the intended target (Figure 2a). Since we and others have utilized the MCP and Lambda N (λN) systems to efficiently recruit ADARs, we first decided to utilize these systems to recruit the two split halves, that is, the N- and C-terminal fragments of the ADAR2-DD (Montiel-Gonzalez et al., 2013; Katrekar et al., 2019). Specifically, constructs were created with cloning sites for N-terminal fragments located downstream of the MCP, while those for the C-terminal fragments were located upstream of the λN. Chimeric adRNAs were designed to bear a BoxB and an MS2 stem loop along with an antisense domain complementary to the target. Examining the results of the DMS (focusing on sites with high mutability), as well as the crystal structure of the ADAR2-DD (focusing on high solvent accessible surface area), and residue conservation scores across species (focusing on low scores of conservation), we identified 18 putative regions for splitting the protein (Figure 2b; Matthews et al., 2016; Dagliyan et al., 2018). The resulting 18 different split-ADAR2 pairs were assayed for their ability to repair a premature amber stop codon (UAG) in the cypridina luciferase (cluc) transcript in the presence of the recruiting adRNA bearing BoxB and MS2 stem loops (Figure 2c).
Of these, pairs 9–12 showed the best editing efficiency, and notably were all located within residues 465–468, which have low conservation scores across species (Matthews et al., 2016). Interestingly, this region is flanked by highly conserved amino acids (442–460 and 469–495). The split-ADAR2 pair 12 is hereafter referred to as ADAR2-DDN and ADAR2-DDC.
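One way to picture how the three criteria (high mutability in the DMS, high solvent accessible surface area, low conservation) might be combined is a simple threshold filter, sketched below. This is purely illustrative: the source does not describe a scoring formula, so the 0 to 1 scales, the threshold values, and every per-residue number here are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Residue:
    position: int
    mutability: float    # toy scale 0-1; higher = more substitutions tolerated
    sasa: float          # toy relative solvent accessibility, 0-1
    conservation: float  # toy conservation score, 0-1; higher = more conserved

def candidate_split_sites(residues, min_mutability=0.7, min_sasa=0.5,
                          max_conservation=0.3):
    """Positions passing all three (hypothetical) thresholds."""
    return [
        r.position for r in residues
        if r.mutability >= min_mutability
        and r.sasa >= min_sasa
        and r.conservation <= max_conservation
    ]

# Made-up values for a handful of positions, for illustration only.
toy_residues = [
    Residue(455, mutability=0.05, sasa=0.10, conservation=0.95),
    Residue(466, mutability=0.85, sasa=0.70, conservation=0.15),
    Residue(467, mutability=0.80, sasa=0.65, conservation=0.20),
    Residue(488, mutability=0.60, sasa=0.30, conservation=0.90),
]
print(candidate_split_sites(toy_residues))
```

A filter like this only nominates candidates; as in the text, each candidate split would still need to be tested experimentally.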

Figure 2 with 1 supplement
Engineering split-ADAR2 deaminase domains (ADAR2-DD).

(a) Schematic of the split-ADAR2 engineering approach. (b) Sequence of the ADAR2-DD. The protein was split between residues labelled in red, and a total of 18 pairs were evaluated. (c) The ability of each split pair from (b) to correct a premature stop codon when transfected with a chimeric BoxB-MS2 ADAR-recruiting RNA (adRNA) was assayed via a luciferase assay. The pairs 1–18 correspond to the residues in red in (b) in the order in which they appear. The residues in (b) in bold red correspond to pairs 9–12. Values represent mean (n = 2). All experiments were carried out in HEK293FT cells.

We also confirmed that every component of the split-ADAR2 system was essential for RNA editing. Specifically, we assayed all components and pairs of components for their ability to edit the RAB7A transcript and also restore luciferase activity. The MCP-ADAR2-DD was included as a control. We observed editing of the RAB7A transcript and restoration of luciferase activity only when every component of the split-ADAR2 system was delivered, confirming that the individual components lacked enzymatic activity (Figure 3a, Figure 2—figure supplement 1a). Additionally, we also confirmed the importance of fragment orientation for the formation of a functional enzyme. Toward this, we swapped the positions of the N- and C-terminal fragments and created ADAR2-DDN-MCP and λN-ADAR2-DDC in addition to the working MCP-ADAR2-DDN and ADAR2-DDC-λN pair. We then tested each pair of N- and C-terminal fragments and observed functionality only for the MCP-ADAR2-DDN paired with ADAR2-DDC-λN (Figure 2—figure supplement 1b).

Figure 3 with 3 supplements
Characterizing the split-ADAR2 deaminase domains.

(a) The components of the split-ADAR2 system based on pair 12 were tested for their ability to edit the RAB7A transcript. Editing was observed only when every component was delivered. Values represent mean ± SEM (n = 3). (b) 2D histograms comparing the transcriptome-wide A-to-G editing yields observed with each construct (y-axis) to the yields observed with the control sample (x-axis). Each histogram represents the same set of reference sites, where read coverage was at least 10 and at least one putative editing event was detected in at least one sample. Bins highlighted in red contain sites with significant changes in A-to-G editing yields when comparing treatment to control sample. Red crosses in each plot indicate the 100 sites with the smallest adjusted p-values. Blue circles indicate the intended target A-site within the RAB7A transcript. (c) The split-ADAR2 system was assayed for editing the KRAS and CKB transcripts. Values represent mean ± SEM (n = 3). All experiments were carried out in HEK293FT cells.

Figure 3—source data 1

Characterizing the split-ADAR2 deaminase domains.

https://cdn.elifesciences.org/articles/75555/elife-75555-fig3-data1-v2.xlsx

Since MCP and λN are proteins of viral origin we next replaced these with the human TAR binding protein and the stem loop binding protein, respectively, to create a humanized split-ADAR2 system with improved translational relevance (Rauch et al., 2019). In the presence of a chimeric adRNA containing a histone stem loop and a TAR stem loop, we observed restoration of luciferase activity (Figure 2—figure supplement 1c). This also confirmed that the split-ADAR2-DD could indeed be recruited for RNA editing using two independent sets of protein-RNA binding systems.

Finally, we investigated the specificity profiles via analysis of the transcriptome-wide off-target A-to-I(G) editing effected by this system (Figure 3b and Figure 3—figure supplements 1 and 2). Each condition from Figure 3a (where a UAG in the endogenous RAB7A transcript was targeted) was analyzed by RNA-seq. From each sample, we collected 25 million uniquely aligned sequencing read pairs. We then used Fisher’s exact test to quantify significant changes in A-to-G editing yields, relative to untransfected cells, at each reference adenosine site having sufficient read coverage. As expected with hyperactive enzyme variants, the ADAR2-DD(E488Q) showed significantly higher off-target editing as compared to the ADAR2-DD (Figure 3b). Notably, by splitting the deaminase domain, we observed a 1000- to 1300-fold reduction in the number of off-targets as compared to the full-length ADAR2-DD or ADAR2-DD(E488Q) (Figure 3b). Excitingly, the specificity profiles of the split-ADAR2 system were comparable to those seen when using endogenous recruitment of ADARs via long antisense RNA (Katrekar et al., 2019).
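The per-site comparison can be illustrated with a small, self-contained implementation of the two-sided Fisher’s exact test on a 2×2 table of read counts (G reads and A reads in a treated sample versus the control). This is a sketch of the general test, not the authors' code; the read counts are invented, and the multiple-testing adjustment that a transcriptome-wide analysis requires is deliberately left out because its details are not given in this section.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Rows: treated sample, control sample.
    Columns: reads showing G at the site, reads showing A at the site.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denominator = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of x G reads falling in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    p_observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Sum all tables at least as unlikely as the observed one; the small
    # relative tolerance guards against floating-point ties.
    p_value = sum(
        prob(x) for x in range(low, high + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )
    return min(1.0, p_value)

# Invented example: 30 of 100 reads edited in the treated sample,
# 2 of 100 reads edited in the untransfected control.
p = fisher_exact_two_sided(30, 70, 2, 98)
print(p)
```

Because `comb` works on exact integers, the sketch stays accurate for the modest read depths typical of a single site, although a vectorized library routine would be preferable when testing many sites.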

To confirm generalizability of the results, we also tested the split-ADAR2 at two additional endogenous loci: an adenosine in the 3’UTR of CKB and an adenosine in the CDS of KRAS, and observed robust editing efficiency of the split-ADAR2 system (Figure 3c). To enable convenient delivery of the split-ADAR2 system, we also created an all-in-one vector bearing a bicistronic ADAR2-DDC-λN-P2A-MCP-ADAR2-DDN which also enabled higher editing efficiencies across all three loci tested (Figure 3a and c). We also confirmed transcriptome-wide specificity of targeting using the all-in-one vector while targeting RAB7A (Figure 3—figure supplement 2) and also while targeting the KRAS locus (Figure 3—figure supplement 3). A closer look at the off-targets revealed that in case of the split-ADAR2 system, highly edited off-targets were indeed guide RNA sequence dependent. This is in contrast to full-length deaminase domain overexpression where off-targets were predominantly deaminase domain driven (Supplementary file 1). The entire split-ADAR2 system consisting of CMV promoter-driven ADAR2-DDC-λN-P2A-MCP-ADAR2-DDN and a human U6 promoter-driven BoxB-MS2 adRNA is ~3500 bp in size and can easily be packaged into a single adeno-associated virus (AAV).

Lastly, to test if the split-ADAR2 chassis could be expanded to highly active ADAR variants that enable efficient editing of 5’-GAN-3’ motifs as well as those that enable C-to-U editing, we created a split-ADAR2-DD(E488Q, N496F) and a split-RESCUE (RNA Editing for Specific C-to-U Exchange) system (Abudayyeh et al., 2019). We confirmed that the split-ADAR2-DD(E488Q, N496F) was indeed better at editing a GAC motif as compared to the split-ADAR2-DD(E488Q) (Figure 4a). Although the full-length ADAR2-DD(E488Q, N496F) was highly promiscuous, splitting it enabled high transcriptome-wide specificity while targeting a UAG in the RAB7A transcript (Figure 4b, Figure 3—figure supplement 2). Additionally, we noted comparable C-to-U RNA editing levels of the endogenous RAB7A transcript using the split-RESCUE and the full-length MCP-RESCUE (Figure 4c). However, the split-RESCUE system was also highly specific, both in the A-to-I(G) and C-to-U space as compared to the MCP-RESCUE (Figure 4d).

Optimizing and expanding the utility of split-ADAR2 deaminase domains (ADAR2-DD).

(a) A split-ADAR2-DD(E488Q, N496F) was engineered and used to edit a GAC motif in the RAB7A transcript. Values represent mean ± SEM (n = 3). (b) 2D histograms comparing the transcriptome-wide A-to-G editing yields observed with full-length and split-ADAR2-DD(E488Q, N496F) constructs. Blue circles indicate the intended target UAG site within the RAB7A transcript. (c) A split-RESCUE was engineered and assayed for cytosine to uracil (C-to-U) editing of the RAB7A transcript. Values represent mean ± SEM (n = 3), quantified by NGS. (d) 2D histograms comparing the transcriptome-wide A-to-G and C-to-U editing yields observed with full-length and split RESCUE constructs. Blue circles indicate the intended target C site within the RAB7A transcript. All experiments were carried out in HEK293FT cells.

Figure 4—source data 1

Optimizing and expanding the utility of split-ADAR2 deaminase domains.

https://cdn.elifesciences.org/articles/75555/elife-75555-fig4-data1-v2.xlsx

Discussion

Toward addressing two of the fundamental challenges in using ADARs for programmable RNA editing, specifically, (1) poor enzymatic activity on certain RNA motifs such as those flanked by a 5’ guanosine (Vogel et al., 2018; Kuttan and Bass, 2012), and (2) exogenous delivery leading to massive transcriptome-wide off-targeting (Cox et al., 2017; Katrekar et al., 2019; Vallecillo-Viejo et al., 2018; Vogel et al., 2018), we have explored ADAR2 deaminase protein engineering via two distinct approaches. First, we performed a novel DMS, comprehensively assaying all possible single amino acid substitutions of 261 residues of the deaminase domain for their impact on RNA editing yields. We created a sequence-function map of the deaminase domain that complements existing knowledge derived from prior structure and biochemistry-based studies and improves our understanding of the enzyme. This can serve as a map for engineering novel ADAR2 variants with tailored activity for specific applications. For instance, while utilizing the ADAR enzyme to create transcriptomic timestamps, tailored variants might enable capture of information at different time scales (McMahon et al., 2016). Additionally, use of tailored variants might help better understand kinetics of protein-RNA interactions (Rodriques et al., 2021). This novel screening approach also enabled us to identify variants such as the ADAR2-DD(E488Q, N496F) that demonstrated increased activity at 5’-GAN-3’ motifs. Specifically, this mutant was 1.1- to 2.1-fold more efficient at editing adenosines with a 5’ guanosine than the classic hyperactive ADAR2-DD(E488Q) and also maintained similar activity levels against all other motifs. However, like the ADAR2-DD(E488Q), the ADAR2-DD(E488Q, N496F) showed increased bystander editing and transcriptome-wide off-targeting as compared to the ADAR2-DD. 
While this novel screening approach was useful for creating an average mutagenesis map of the ADAR2-DD and identifying highly active variants, we believe that replicate correlation values were negatively impacted given that the screen was carried out in a lentiviral format. As lentivirus integrates randomly into the cell’s genome, this results in variability in ADAR2-DD transcript levels between different cells.

Second, we engineered split deaminases, consisting of two inactive enzyme fragments that formed a functional enzyme upon combining at the target site. Due to this requirement of the split-domains to assemble, the efficiency of this system was ~50–70% of that of full-length domain overexpression, but the split-ADAR2 tool was highly transcript specific (~1000-fold more specific than full-length domain overexpression), and notably showed off-target profiles similar to those seen via recruitment of endogenous ADARs (Katrekar et al., 2019). We further demonstrated the applicability of these split-deaminases toward editing 5’-GAN-3’ motifs and toward C-to-U editing via creation of a split-ADAR2-DD(E488Q, N496F) and a split-RESCUE, both of which exhibited high transcriptome-wide specificity as compared to their full-length counterparts. In summary, this study enables broader utility of the ADAR toolset for biotechnology and therapeutic applications. Additionally, these approaches could also be applied to the study and engineering of other RNA modifying enzymes (Rosenberg et al., 2011; Liu et al., 2014).

Materials and methods

DMS and screen

Oligonucleotide pools

To create the library of single amino acid substitutions in the ADAR2-DD, we ordered an oligonucleotide chip (CustomArray) consisting of six oligonucleotide pools (each 168 bp in length). These pools, in combination, spanned residues 340–600 of the ADAR2-DD. Each of these pools was amplified in a 50 μl PCR using Kapa HiFi HotStart PCR Mix (Kapa Biosystems), 40 ng of synthesized oligonucleotide as template and pool-specific primers. The six PCR products were purified using the QIAquick PCR Purification Kit (Qiagen) to eliminate byproducts.

Creation of vectors for cloning oligonucleotide pools

We ordered a gene block (IDT) for MCP-ADAR2-DD-NES and used mutagenesis PCR to create the MCP-ADAR2-DD(E488Q)-NES. These fragments were then used as templates to generate six PCR fragments from which deletions of the MCP-ADAR2-DD-NES and the MCP-ADAR2-DD(E488Q)-NES were created. The deleted regions corresponded to the sequence covered by each of the six oligonucleotide pools and were replaced instead with an Esp3I digestion site. To create the plasmid library, we began by mutating the two Esp3I digestion sites in the LentiCRISPR v2 plasmid (gift from Feng Zhang, Addgene #52961) (Sanjana et al., 2014) using PCR mutagenesis followed by Gibson Assembly. Next, we created six cloning vectors for the MCP-ADAR2-DD-NES and MCP-ADAR2-DD(E488Q)-NES, cloning the PCR fragments generated above into the LentiCRISPR v2 vector digested with BamHI and XbaI using Gibson Assembly. All PCRs in this section were carried out using Kapa HiFi HotStart PCR Mix (Kapa Biosystems), 20 ng template and appropriate primers in 20 μl reactions. All digestions in this section were carried out in 50 μl reactions for 3 hr at 37°C using 2 μg of plasmid and 10 units of enzyme(s). All Gibson Assembly reactions in this section were carried out using 50 ng backbone and 30 ng of insert in a 10 μl volume and incubated at 50°C for 1 hr. Digestions and PCRs were purified using the QIAquick PCR Purification Kit (Qiagen).

Creation of plasmid library

Once we had six cloning vectors corresponding to the MCP-ADAR2-DD-NES ready, we digested these with Esp3I. These digestions were carried out in 50 μl reactions for 6 hr at 37°C using 2 μg of plasmid and 10 units of enzyme followed by heat inactivation at 65°C for 20 min. The digestion reaction was then purified using the QIAquick PCR Purification Kit (Qiagen). This was followed by cloning of the six oligonucleotide pools into their respective cloning vectors via Gibson Assembly using 50 ng of the digested backbone and 10 ng of the purified oligonucleotide PCR products in a 10 μl reaction, incubated at 50°C for 80 min. The Gibson Assembly reaction was purified by dialysis and used to electroporate ElectroMAX Stbl4 cells (Thermo Fisher) as per the manufacturer’s instructions. A small fraction (1–10 µl) of cultures was spread on carbenicillin LB plates to calculate the library coverage, and the rest of the cultures were amplified overnight in 150 ml LB medium containing carbenicillin. A library coverage of at least 400× was ensured before proceeding. Plasmid libraries were sequenced using the MiSeq (300 bp PE run).
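The coverage estimate from plated colonies can be sketched as below. This is a minimal illustration, not the authors' calculation: the colony count, culture volume, and library size are hypothetical placeholders, and the function name `library_coverage` is introduced here only for the example.

```python
def library_coverage(colonies: int, plated_ul: float,
                     total_ul: float, library_size: int) -> float:
    """Estimate transformants per library member from a plated aliquot.

    colonies     -- colonies counted on the carbenicillin plate
    plated_ul    -- culture volume spread on the plate (1-10 ul in the text)
    total_ul     -- total recovered culture volume (assumed value)
    library_size -- number of designed variants in the pool (assumed value)
    """
    if plated_ul <= 0 or total_ul <= 0 or library_size <= 0:
        raise ValueError("volumes and library size must be positive")
    # Scale the plated colony count up to the whole culture.
    total_transformants = colonies * (total_ul / plated_ul)
    return total_transformants / library_size


# Hypothetical numbers: 500 colonies from 1 ul of a 1000 ul recovery,
# for a pool of 880 designed variants.
coverage = library_coverage(colonies=500, plated_ul=1.0,
                            total_ul=1000.0, library_size=880)
meets_threshold = coverage >= 400  # the text requires at least 400x
```

With these placeholder values the estimate is about 568×, which would pass the 400× threshold stated above.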

Creation of MS2-adRNA vectors

We began by replacing the Cas9-P2A-Puromycin from the LentiCRISPR v2 with a mCherry-P2A-Hygromycin by digesting the backbone with XbaI and PmeI. We used fusion PCRs to create the mCherry-P2A-Hygromycin-WPRE-3’LTR(Delta U3) insert which was then cloned into the digested backbone via Gibson Assembly. We used PCRs to create a MS2-adRNA-mU6-MS2-adRNA cassette which was cloned into the Esp3I digested backbone via Gibson Assembly. Four vectors with 2× MS2-adRNAs were created targeting 5’ and 3’ TAG and GAC. All PCRs in this section were carried out using Kapa HiFi HotStart PCR Mix (Kapa Biosystems) in 20 μl reactions. All digestions in the section were carried out in 50 μl reactions for 3 hr at 37°C using 2 μg of plasmid and 10 units of enzymes. All Gibson Assembly reactions in this section were carried out using 50 ng backbone and 20–40 ng of insert in a 10 μl volume and incubated at 50°C for 1 hr. Digestions and PCRs were purified using the QIAquick PCR Purification Kit (Qiagen).

Lentivirus production

HEK293FT (Thermo Fisher) cells were maintained in DMEM supplemented with 10% FBS (Thermo Fisher) and 1% Antibiotic-Antimycotic (Thermo Fisher) in an incubator at 37°C and 5% CO2 atmosphere. These cells were authenticated by STR and tested for mycoplasma contamination by the vendor. To produce lentivirus particles, HEK293FT cells were seeded in 15 cm tissue culture dishes 1 day before transfection and were 60% confluent at the time of transfection. Before transfection, the culture medium was changed to prewarmed DMEM supplemented with 10% FBS. For each 15 cm dish, 36 µl of Lipofectamine 2000 (Thermo Fisher) was diluted in 1.2 ml OptiMEM (Thermo Fisher). Separately, 3 µg pMD2.G (gift from Didier Trono, Addgene #12259), 12 µg of pCMV delta R8.2 (gift from Didier Trono, Addgene #12263), and 9 µg of lentiviral vector were diluted in 1.2 ml OptiMEM. After incubation for 5 min, the Lipofectamine 2000 mixture and DNA mixture were combined and incubated at room temperature for 30 min. The mixture was then added dropwise to HEK293FT cells. Viral particles were harvested 48 and 72 hr after transfection, further concentrated to a final volume of 500–1000 µl using 100 kDa filters (Millipore), divided into aliquots and frozen at −80°C. Lentivirus was produced individually for all MS2-adRNA vectors and in a pooled format for the libraries. While producing lentivirus, libraries were grouped together as 1 + 2, 3, 4, and 5 + 6 to facilitate sequencing using the NovaSeq 6000 (250 bp PE run).

Creation of a clonal cell line with MS2-adRNA

HEK293FT cells grown in a six-well plate were transduced with lentiviruses (high MOI) carrying 2× MS2-adRNA targeting 5’ and 3’ TAG and GAC to create four different cell lines. For transductions, the lentivirus was mixed with DMEM supplemented with 10% FBS (Thermo Fisher) and Polybrene Transfection reagent (Millipore) at a concentration of 5 µg/ml and added to HEK293FT cells at 40–50% confluency. Hygromycin (Thermo Fisher) was added to the media at a concentration of 100 µg/ml, 48 hr post transduction. The top 1% of mCherry-expressing cells for each line were then sorted into a 96-well plate. Three clones of each of the four cell lines were then frozen down.

Screen

Lentiviral libraries 1 + 2 and 3 were used to transduce clones with the 5’ TAG and GAC MS2-adRNA and libraries 4 and 5 + 6 were used to transduce clones with the 3’ TAG and GAC MS2-adRNA stably integrated. Transductions were carried out in duplicate. The lentiviral libraries were mixed with DMEM supplemented with 10% FBS (Thermo Fisher), Hygromycin (Thermo Fisher) at 100 µg/ml, and Polybrene Transfection reagent (Millipore) at a concentration of 5 µg/ml and added to the stable clones harboring the MS2-adRNA in a 15 cm dish at 40–50% confluency. To ensure most cells received 0 or 1 ADAR2 variant, cells were transduced at a low MOI of 0.2–0.4. Twenty-four hours post transduction, cells were passaged 1:4 into a new 15 cm dish and grown in DMEM supplemented with 10% FBS (Thermo Fisher) and Hygromycin (Thermo Fisher) at 100 µg/ml. Forty-eight hours post transduction, the growth medium was changed to DMEM supplemented with 10% FBS (Thermo Fisher) and Puromycin (Thermo Fisher) at 3 µg/ml. Seventy-two hours post transduction, fresh growth medium with Puromycin was added to the cells. Ninety-six hours post transduction, the growth medium was removed and cells were washed with PBS and then harvested. Cell pellets were stored at –80°C until RNA extraction. At least 1000× coverage was maintained at all steps of the screen.
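Why an MOI of 0.2–0.4 keeps most cells at 0 or 1 variant can be illustrated with a short calculation. The sketch below assumes that the number of integrations per cell follows a Poisson distribution with mean equal to the MOI; this is a common modelling assumption and is not stated in the text. The function names are introduced here for illustration only.

```python
import math


def poisson_pmf(k: int, moi: float) -> float:
    """Probability that a cell receives exactly k integrations (Poisson model)."""
    return math.exp(-moi) * moi ** k / math.factorial(k)


def fraction_single_among_transduced(moi: float) -> float:
    """Among cells with at least one integration, the fraction with exactly one."""
    p0 = poisson_pmf(0, moi)
    p1 = poisson_pmf(1, moi)
    return p1 / (1.0 - p0)


# Evaluate both ends of the MOI range used in the screen.
single_low = fraction_single_among_transduced(0.2)   # about 0.90
single_high = fraction_single_among_transduced(0.4)  # about 0.81
```

Under this model, roughly 81–90% of the cells that carry any integration carry exactly one, so most cells surviving selection would be expected to express a single variant.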

RNA, cDNA, amplifications, indexing

RNA was extracted using the RNeasy Mini Kit (Qiagen) as per the manufacturer’s instructions. cDNA was synthesized from RNA using the Protoscript II First Strand cDNA synthesis Kit (NEB). To ensure library coverage of 500×, 5 ng of RNA was converted to cDNA per library element in every sample of the screen. The volume of each cDNA reaction was 90 µl with 4.5 µg RNA, 45 µl of the reaction mix, 9 µl random primers, and 9 µl enzyme. Samples were incubated in a thermocycler at 25°C for 5 min; 42°C for 80 min; 80°C for 5 min. The entire volume of the cDNA reaction was used to set up PCRs. The volume of each PCR was 100 µl with 44 µl cDNA, 6 µl primers (10 µM), and 50 µl Q5 high fidelity master mix (NEB). The thermocycling parameters were: 98°C for 30 s; 24–28 cycles of 98°C for 10 s, 62°C for 15 s, and 72°C for 35 s; and 72°C for 2 min. The numbers of cycles were tested to ensure that they fell within the linear phase of amplification. The amplicons were 440–570 bp in length and were purified using the QIAquick PCR Purification Kit (Qiagen). To continue maintaining at least 500× coverage, a minimum of 0.15 ng of the PCR product per library element was used to set up a second PCR adding indices onto the libraries. This was done in 50 µl reactions using 3 µl dual index primers (NEB), 135 ng purified PCR product from the previous reaction and 25 µl Q5 high fidelity master mix (NEB). The thermocycling parameters were: 98°C for 30 s; 5–8 cycles of 98°C for 10 s, 65°C for 20 s, and 72°C for 35 s; and 72°C for 2 min. The numbers of cycles were tested to ensure that they fell within the linear phase of amplification. Amplicons were purified with Agencourt AMPure XP beads (Beckman Coulter) at a 0.8 ratio. The libraries were quantified using the Qubit dsDNA HS assay kit (Thermo Fisher) and pooled together at a concentration of 10 nM for sequencing on a 250 bp PE run on the NovaSeq 6000.
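The per-element input amounts above can be checked with simple arithmetic. The sketch below only rearranges numbers given in the text (5 ng RNA and 0.15 ng PCR product per library element, 4.5 µg RNA and 135 ng PCR product per reaction). The resulting figure of about 900 elements per sample is inferred here and is not stated by the authors; the function name is hypothetical.

```python
def elements_supported(input_ng: float, ng_per_element: float) -> float:
    """Number of library elements a given input mass covers at the stated per-element amount."""
    if ng_per_element <= 0:
        raise ValueError("ng_per_element must be positive")
    return input_ng / ng_per_element


# cDNA synthesis: 4.5 ug (4500 ng) RNA at 5 ng per library element.
elements_cdna = elements_supported(4500.0, 5.0)

# Indexing PCR: 135 ng of first-round product at 0.15 ng per library element.
elements_index_pcr = elements_supported(135.0, 0.15)
```

Both steps work out to approximately 900 elements, which suggests the two input amounts were chosen against the same library size.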

Sequencing analysis

Raw fastq reads were aligned to the ADAR2 reference sequence using minimap2 (Li, 2018) in short-read mode with default parameters. For libraries with overlapping paired-end reads, the reads were first combined using FLASH (Magoč and Salzberg, 2011). The aligned reads were then classified into library members using strict filtering, that is, reads were only included if they perfectly matched exactly one library member, aside from the target ADAR editing site. The editing rate at this target site was then quantified for each library member and averaged across two replicates with weights for differential coverage. To analyze the degree to which each library member differed in editing rate from the wild type, we performed a two-proportion Z-test using a pooled sample proportion to calculate the standard error of the sampling distribution, and a two-tailed procedure to calculate p-values. Note that the wild-type rate was restricted to the rate measured within each library, such that each library member was compared only to the wild-type rate measured in the same biological context. Z-scores were calculated as follows, where x is the RNA editing rate and n is the number of counts:

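$$\bar{x} = \frac{x_{wt} n_{wt} + x_i n_i}{n_{wt} + n_i}$$
$$SE = \sqrt{\bar{x}\,(1 - \bar{x})\left(\frac{1}{n_i} + \frac{1}{n_{wt}}\right)}$$
$$Z_i = \frac{x_i - x_{wt}}{SE}$$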
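The Z-score calculation can be written out in a few lines. The following is a minimal sketch of a two-proportion Z-test with a pooled proportion and a two-tailed p-value, following the definitions above; it is not the authors' analysis code, and the function name and example counts are hypothetical.

```python
import math
from typing import Tuple


def two_proportion_z(x_i: float, n_i: int,
                     x_wt: float, n_wt: int) -> Tuple[float, float]:
    """Return (Z, two-tailed p) comparing a variant's editing rate to wild type.

    x_i, x_wt -- editing rates (fractions between 0 and 1)
    n_i, n_wt -- read counts supporting each rate
    """
    # Pooled sample proportion across variant and wild type.
    pooled = (x_wt * n_wt + x_i * n_i) / (n_wt + n_i)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_i + 1.0 / n_wt))
    if se == 0.0:
        raise ValueError("standard error is zero; rates are all 0 or all 1")
    z = (x_i - x_wt) / se
    # Two-tailed p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Hypothetical example: variant edited in 50% of 1000 reads,
# wild type edited in 30% of 1000 reads from the same library.
z_example, p_example = two_proportion_z(0.5, 1000, 0.3, 1000)
```

In this example the pooled proportion is 0.4 and the Z-score is about 9.13, so the variant would be called significantly more active than wild type.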

After library classification and editing quantification, heatmap plotting was done with modified code from Enrich2 (https://github.com/FowlerLab/Enrich2; Rubin, 2021; Rubin et al., 2017).
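The strict filtering and coverage-weighted averaging described in this section can be illustrated with a simplified sketch. It assumes that reads have already been aligned and trimmed to the same coordinates as the reference variants, and that "weights for differential coverage" means weighting each replicate by its read count; both are assumptions made here for illustration. All names and sequences are hypothetical toy data.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def mask(seq: str, pos: int) -> str:
    """Replace the target editing position with N so it is ignored in matching."""
    return seq[:pos] + "N" + seq[pos + 1:]


def classify_read(read: str, library: Dict[str, str],
                  target_pos: int) -> Optional[str]:
    """Return the one library member the read matches perfectly, else None."""
    masked_read = mask(read, target_pos)
    hits = [name for name, seq in library.items()
            if mask(seq, target_pos) == masked_read]
    return hits[0] if len(hits) == 1 else None


def count_editing(reads: Iterable[str], library: Dict[str, str],
                  target_pos: int) -> Dict[str, Tuple[int, int]]:
    """Return {variant: (edited reads, total reads)}; unmatched reads are dropped."""
    counts = {name: [0, 0] for name in library}
    for read in reads:
        name = classify_read(read, library, target_pos)
        if name is None:
            continue
        counts[name][1] += 1
        if read[target_pos] == "G":  # A-to-I editing is read out as G
            counts[name][0] += 1
    return {name: (e, t) for name, (e, t) in counts.items()}


def weighted_rate(replicates: List[Tuple[int, int]]) -> float:
    """Read-count-weighted mean editing rate, i.e. pooled edited / pooled total."""
    edited = sum(e for e, _ in replicates)
    total = sum(t for _, t in replicates)
    if total == 0:
        raise ValueError("no reads for this library member")
    return edited / total


# Toy data: target adenosine at index 4.
toy_library = {"WT": "ACGTAGCA", "V1": "ACCTAGCA"}
toy_reads = ["ACGTAGCA", "ACGTGGCA", "ACCTGGCA", "ACTTAGCA"]
toy_counts = count_editing(toy_reads, toy_library, target_pos=4)
toy_rate = weighted_rate([(1, 2), (3, 6)])
```

The last toy read differs from every library member at a non-target position, so it is discarded, which mirrors the strict-match rule in the text.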

Cloning individual mutants

We began by creating a cloning vector with the MCP inserted into the LentiCRISPR v2 vector digested with BamHI and XbaI using Gibson Assembly. This vector was then digested with BamHI to clone the DD mutants. All mutants were created using mutagenesis PCR followed by Gibson Assembly. All PCRs in this section were carried out using Q5 PCR Mix (NEB), 5 ng template and appropriate primers in 20 μl reactions. All digestions in this section were carried out in 50 μl reactions for 3 hr at 37°C using 3 μg of plasmid and 20 units of enzyme(s). All Gibson Assembly reactions in this section were carried out using 30 ng backbone and 15 ng of insert in a 6 μl volume and incubated at 50°C for 1 hr. Digestions and PCRs were purified using the QIAquick PCR Purification Kit (Qiagen).

Luciferase assay

All HEK293FT cells were grown in DMEM supplemented with 10% FBS and 1% Antibiotic-Antimycotic (Thermo Fisher) in an incubator at 37°C and 5% CO2 atmosphere. All in vitro luciferase experiments for DMS validations were carried out in HEK293FT cells seeded in 96-well plates, at 25–30% confluency, using 250 ng total plasmid and 0.5 μl of commercial transfection reagent Lipofectamine 2000 (Thermo Fisher). Specifically, every well received 100 ng of the Cluc-W85X(TAG) or Cluc-W85X(TGA) reporters, 50 ng of MCP-ADAR2-DD mutants, and 100 ng of the MS2-adRNA plasmids. In cases where fewer than three plasmids were needed, a balancing plasmid was added to keep the total amount per well at 250 ng. Forty-eight hours post transfection, 20 μl of supernatant from cells was added to a Costar black 96-well plate (Corning). For the readout, 50 μl of Cypridina Glow Assay buffer was mixed with 0.5 μl Vargulin substrate (Thermo Fisher) and added to the 96-well plate in the dark. The luminescence was read within 10 min on Spectramax i3x or iD3 plate readers (Molecular Devices) with the following settings: 5 s mix before read, 5 s integration time, 1 mm read height.

RNA editing

RNA editing experiments for targeting 5’-GA-3’ were carried out in HEK293FT cells seeded in 24-well plates using 1000 ng total plasmid and 2 µl of commercial transfection reagent Lipofectamine 2000 (Thermo Fisher). Specifically, every well received 500 ng each of the MCP-ADAR2-DD variant and the adRNA plasmids. Cells were transfected at 25–30% confluence and harvested 48 hr post transfection for quantification of editing. RNA from cells was extracted using the RNeasy Mini Kit (Qiagen). cDNA was synthesized from 500 ng RNA using the Protoscript II First Strand cDNA synthesis Kit (NEB). One µl of cDNA was amplified by PCR with primers that amplify about 200 bp surrounding the sites of interest using OneTaq PCR Mix (NEB). The numbers of cycles were tested to ensure that they fell within the linear phase of amplification. PCR products were purified using a PCR Purification Kit (Qiagen) and sent out for Sanger sequencing. The RNA editing efficiency was quantified using the ratio of peak heights G/(A + G). For a comprehensive analysis of the ADAR2-DD(E488Q, N496F) mutant, several adenosines within the RAB7A 3’UTR and GAPDH CDS were targeted. The editing efficiency at each of these adenosines was normalized to that of the ADAR2-DD(E488Q) so that all 16 motifs could be represented on a single heatmap.
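The Sanger-based quantification and the normalization to ADAR2-DD(E488Q) reduce to two small ratios. The sketch below is illustrative only: the function names are hypothetical and the peak heights are made-up values, not data from the study.

```python
def editing_efficiency(peak_a: float, peak_g: float) -> float:
    """Editing efficiency from Sanger peak heights at the target site: G / (A + G)."""
    total = peak_a + peak_g
    if total <= 0:
        raise ValueError("peak heights must sum to a positive value")
    return peak_g / total


def fold_over_reference(efficiency: float, reference_efficiency: float) -> float:
    """Editing efficiency of a variant relative to a reference such as E488Q."""
    if reference_efficiency <= 0:
        raise ValueError("reference efficiency must be positive")
    return efficiency / reference_efficiency


# Made-up peak heights for one adenosine.
eff_e488q = editing_efficiency(peak_a=300.0, peak_g=100.0)   # 0.25
eff_double = editing_efficiency(peak_a=200.0, peak_g=200.0)  # 0.5
fold_change = fold_over_reference(eff_double, eff_e488q)     # 2.0
```

Expressing each motif as a fold change over E488Q puts sites with very different baseline editing on a common scale, which is what allows them to share one heatmap.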

Split-ADAR2

Vector design and construction

We began by digesting the pAAV_hU6_mU6_CMV_GFP with AflII to clone the NES-FLAG-MCP-linker and linker-4xλN-HA-NES downstream of the CMV promoter; these were amplified from the MCP-ADAR2-DD-NLS (Katrekar et al., 2019) and 4x-λN-cdADAR2 (Montiel-Gonzalez et al., 2013), respectively. AvrII digestion sites were included downstream of the NES-FLAG-MCP-linker and upstream of the linker-4xλN-HA-NES to facilitate cloning of the split fragments. All split fragments were amplified from the MCP-ADAR2-DD-NLS or MCP-ADAR2-DD(E488Q)-NLS (Katrekar et al., 2019). For each split-ADAR2 pair, the N-terminal DD fragment was cloned downstream of the NES-FLAG-MCP-linker and the C-terminal DD fragment was cloned upstream of the linker-4xλN-HA-NES using Gibson Assembly. MS2-MS2, MS2-BoxB, BoxB-MS2, and BoxB-BoxB adRNA were created by annealing primers and cloned downstream of the hU6 promoter into the AgeI + NheI digested pAAV_hU6_mU6_CMV_GFP using Gibson Assembly. All PCRs in this section were carried out using Kapa HiFi HotStart PCR Mix (Kapa Biosystems) in 20 μl reactions. All digestions in this section were carried out in 50 μl reactions for 3 hr at 37°C using 3 μg of plasmid and 20 units of enzyme(s). All Gibson Assembly reactions in this section were carried out using 40 ng backbone and 5–20 ng of insert in a 10 μl volume and incubated at 50°C for 1 hr. Digestions and PCRs were purified using the QIAquick PCR Purification Kit (Qiagen).

Luciferase assay

All HEK293FT cells were grown in DMEM supplemented with 10% FBS and 1% Antibiotic-Antimycotic (Thermo Fisher) in an incubator at 37°C and 5% CO2 atmosphere. All in vitro luciferase experiments for the split-ADAR2 were carried out in HEK293FT cells seeded in 96-well plates, at 25–30% confluency, using 400 ng total plasmid and 0.6 μl of commercial transfection reagent Lipofectamine 2000 (Thermo Fisher). Specifically, every well received 100 ng each of the Cluc-W85X(TAG) reporter, N- and C-terminal ADAR2 fragments and the adRNA plasmids. In cases where fewer than four plasmids were needed, a balancing plasmid was added to keep the total amount per well at 400 ng. Forty-eight hours post transfection, 20 μl of supernatant from cells was added to a Costar black 96-well plate (Corning). For the readout, 50 μl of Cypridina Glow Assay buffer was mixed with 0.5 μl Vargulin substrate (Thermo Fisher) and added to the 96-well plate in the dark. The luminescence was read within 10 min on Spectramax i3x or iD3 plate readers (Molecular Devices) with the following settings: 5 s mix before read, 5 s integration time, 1 mm read height.
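The amount of balancing plasmid needed for a control well follows directly from the fixed total. The sketch below is a small helper for that arithmetic; the function name and the plasmid labels are hypothetical, while the 100 ng per plasmid and 400 ng total come from the text.

```python
from typing import Dict


def balancing_plasmid_ng(total_ng: float, plasmids_ng: Dict[str, float]) -> float:
    """Mass of balancing plasmid needed to reach the fixed total per well."""
    used = sum(plasmids_ng.values())
    if used > total_ng:
        raise ValueError("plasmids already exceed the total per well")
    return total_ng - used


# Control well lacking the C-terminal fragment (labels are illustrative).
control_well = {"Cluc-W85X(TAG)": 100.0, "N-terminal fragment": 100.0, "adRNA": 100.0}
filler_ng = balancing_plasmid_ng(400.0, control_well)
```

Keeping total DNA constant across wells means differences in luciferase signal are less likely to reflect differences in transfection load.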

RNA editing

All in vitro RNA editing experiments were carried out in HEK293FT cells seeded in 24-well plates using 1500 ng total plasmid and 2 µl of commercial transfection reagent Lipofectamine 2000 (Thermo Fisher). Specifically, every well received 500 ng each of the N- and C-terminal ADAR2 fragments and the adRNA plasmids. In cases where fewer than three plasmids were needed, a balancing plasmid was added to keep the total amount per well at 1500 ng. Cells were transfected at 25–30% confluence and harvested 48 hr post transfection for quantification of editing. RNA from cells was extracted using the RNeasy Mini Kit (Qiagen). cDNA was synthesized from 500 ng RNA using the Protoscript II First Strand cDNA synthesis Kit (NEB). One µl of cDNA was amplified by PCR with primers that amplify about 200 bp surrounding the sites of interest using OneTaq PCR Mix (NEB). The numbers of cycles were tested to ensure that they fell within the linear phase of amplification. PCR products were purified using a PCR Purification Kit (Qiagen) and sent out for Sanger sequencing. The RNA editing efficiency was quantified using the ratio of peak heights G/(A + G). RNA-seq libraries were prepared from 250 ng of RNA, using the NEBNext Poly(A) mRNA magnetic isolation module and NEBNext Ultra RNA Library Prep Kit for Illumina. Samples were pooled and loaded on an Illumina NovaSeq 6000 (100 bp paired-end run) to obtain 40–45 million reads per sample.

Quantification of RNA-seq A-to-G editing

RNA-seq analysis for quantification of transcriptome-wide A-to-G editing was carried out as described in Katrekar et al., 2019.

Data availability

Sequencing data will be accessible via NCBI GEO under accession GSE158656. Source data has been made available with the submission.

The following data sets were generated
    1. Katrekar D
    2. Meluzzi D
    3. Palmer N
    4. Mali P
    (2022) NCBI Gene Expression Omnibus
    ID GSE158656. Comprehensive interrogation of the ADAR2 deaminase domain for engineering enhanced RNA base-editing activity, functionality and specificity.

References

Decision letter

  1. Timothy W Nilsen
    Reviewing Editor; Case Western Reserve University, United States
  2. James L Manley
    Senior Editor; Columbia University, United States
  3. Nina Papavasiliou
    Reviewer; Deutsche Krebsforschungszentrum (DKFZ), Germany
  4. Heather A Hundley
    Reviewer; Indiana University, United States

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Decision letter after peer review:

[Editors’ note: the authors submitted for reconsideration following the decision after peer review. What follows is the decision letter after the first round of review.]

Thank you for submitting your work entitled "Comprehensive interrogation of the ADAR2 deaminase domain for engineering enhanced RNA editing activity and specificity" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and a Senior Editor. The following individuals involved in review of your submission have agreed to reveal their identity: Timothy A Whitehead (Reviewer #2); Nina Papavasiliou (Reviewer #3).

Our decision has been reached after consultation between the reviewers. Based on these discussions and the individual reviews below, we regret to inform you that your work will not be considered further for publication in eLife.

All three reviewers raised substantive concerns regarding description of the methods and statistical analyses. Given these concerns, we had no choice but to decline the paper, but encourage you to resubmit if and when you can address the points raised by the referees as thoroughly as possible. The issues are particularly concerning since this manuscript was submitted as a research method or tool paper.

Reviewer #1:

The authors aim to enhance the efficiency and specificity of RNA engineering mediated by the ADAR2 deaminase domain (ADAR2-DD) by three approaches. First, they interrogated the residues in a portion of ADAR2-DD by deep mutational scanning (DMS), providing a comprehensive map of residues important for the core region of ADAR2-DD. Second, they identified a novel mutation N496F that increases editing efficiency for a 5'-GA-3' motif. Third, they showed the split ADAR2-DD design with good editing efficiency and specificity. This work provides deeper understanding of the ADAR2 deaminase domain as a tool for RNA directed engineering. Although the work presents advances, I have several concerns that the authors need to address in more detailed analysis.

1) The DMS screen described in Figure 1a is not intuitive to understand. Where exactly are the editing sites in the ADAR2-DD? How many sites are there in total? How are the editing sites and mutations identified in the sequencing reads? A better drawing, explanation, figure legend and methods are needed.

2) For the DMS method, since the editing site is in ADAR2-DD itself, when making mutations the structure of the RNA substrate might change, especially the ones near the editing site. Then the effect might not be sole consequence of the protein mutant. If RNA substrate might change in structure, is the DMS result still consistent with the validation result using the cluc assays?

3) In addition to measuring the editing level of targeted sites, can the authors measure off target in the deep sequencing data? This would result in both editing efficiency and editing specificity measurement.

4) ADAR1 is perhaps expressed in the chosen human cells. Would any of the results from the DMS and the validation be complicated by the editing from endogenous ADAR1?

5) It is very interesting that many mutants showed equal or higher activity than the well-known E488Q mutant. What are the efficiency and specificity for these (at least for some representative ones from the validation)?

6) For the N496F mutant, what are the transcriptome wide off-targeting data? What about other non-GA sequences?

7) How was the split ADAR2-DD chosen? Current writing is very simple. It would be useful for the community if the authors provide more details of the reasoning.

8) What is the correlation of the luciferase assay signal to the actual editing level of the transcript? What if comparing all designs (mutations and split-ADAR2-DD) in the same assay so we can see direction comparison of the editing efficiency and specificity? Preferably, using editing level of a target site as a readout to compare all designs and the off-target analysis by RNA-seq.

9) To validate all the findings in this work, it would be desired to show how an engineered ADAR2 DD, in a split fashion, would edit an endogenous substrate with a non-UAG motif (such as GAC). What would be the editing efficiency (% editing level) and the transcriptome-wide specificity?

Reviewer #2:

Katrekar and colleagues developed a screen for deaminase acting on RNA (ADAR) and screened most single amino acid substitutions across the catalytically active domain for RNA editing and for activity at 5'-GA-3' motifs. Separately, they developed a split ADAR and evaluated specific and off-target RNA editing using whole transcriptomes. The paper does not read like a coherent story and instead is two separate papers: Figure 1 – 2 involve the screen and evaluation of single clones resulting from the screen, whereas Figure 3-4 involve the split ADAR. The strengths of the paper involve the novelty of the genetic screen and, separately, the development and validation of the split ADAR system. There exist major concerns about the representation of the results from the screen, along with minor suggestions on the split ADAR story.

1. The statistical underpinning of the validity of the screening results is unexplored in the main text and needs to be described accurately. The authors split between Z scores (Figure 1), fold change in DMS relative to ADAR2-DD (Figure 2), % edited (in supporting information files and Figure 1d), and DMS log2 fold change (SI Figure 2). Each of these screening outputs (if all are included) needs to be described in the main text and justified. My personal opinion is that one or at most two metrics can be used in the paper to avoid confusion. I have particular concerns in this section about the following:

a. Replicates for the screen. The paper mentions replicates in only three places, and nowhere describes how they were performed. How were they performed? Biological replicates? Technical replicates? Different days? These experimental details need to be discussed explicitly.

b. Of concern for the replicates are the relatively low correlation between replicates (R2 = 0.48 by my calculation). The correlation is not discussed at all in the main text – this data needs to be explicit for the reader to judge for herself the validity of the data presented.

c. The replicate shown in SI Figure 1d has a correlation missing for the worst performing sample ("wt-X-TAG") and the meaning of wt-X, wt-Y, etc. is not described.

d. The validation performed isogenically involves cherrypicked samples with low variance between them (R2 for the variants described in figure 2b are 0.87) and don't represent a fair comparison. The authors state that "We observed that a majority of the mutants (85%) followed the same trend in our arrayed validation as seen in the pooled screens" but the meaning behind the sentence is not clear. What does the same trend mean and how is it calculated? Determine the statistical significance using a t test and show comparisons between isogenic datasets using rank correlation or R2 correlation.

e. Points a-d lead to the following conclusion that the screen, while clever and well implemented, has relatively high error and the data should not be presented as a heat map as the authors present in Figure 1. Deep mutational scanning experiments where data is presented as heat maps typically have R2 values of 0.8 or higher. This data is useful – the data for conservation at each position should be relatively robust even with the error in the screen reported in the paper. This screen can also be used to identify 'hits'.

Reviewer #3:

The paper by Mali and colleagues is an interesting mix of experiments on ADAR2 functionality: on mutations that increase activity, on a split ADAR2 construct that appears to decrease off-target effects, and on a split RESCUE construct that is said to also increase the specificity of C to U editing.

The notion of a split ADAR is certainly novel (brought together by the binding in situ of two elements on the RNA – an MS2 and boxB element, plus a second pair). However the paper would really benefit from being more explicit on some of the results. For instance, the "decrease of off target effects" though apparently significant, would benefit from some nuance – for example are there commonalities to the off-target targets? (are there RAB7A-specific off-targets vs KRAS specific ones and what would that imply?) In other words how generalizable to other transcripts are these findings?

Continuing with the notion of tradeoffs, are GAC/GAG-focused mutants "worse" on other triplets? and which?

Finally, given how little has been published on targeted C to U editing (excepting RESCUE), it is important to treat figure 4d as a little more than an afterthought – with a comprehensive analysis equivalent to the treatment of A to I editing.

[Editors’ note: further revisions were suggested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "Comprehensive interrogation of the ADAR2 deaminase domain for engineering enhanced RNA editing activity and specificity" for further consideration by eLife. Your revised article has been evaluated by James Manley (Senior Editor) and Timothy Nilsen (Reviewing Editor).

The manuscript has been improved but there are some remaining issues that need to be addressed, as outlined below:

While both reviewers were quite positive about your revised paper, they felt that additional textual changes and/or additions as outlined below would further improve the manuscript. Please address these comments as thoroughly as possible.

Reviewer #1:

The manuscript by Mali and colleagues is substantially improved from the earlier version. If I had a single comment to make on the revision (and only because they now have the data to look), it is this. Normally, a guide RNA bound to the coding region (as will be the case for the cAg in Rab7a) would be expected to reduce transcript abundance (through RNAi-like effects). In view of the data in figure 4a/4b, is this true? If so, it would be important to point out, because this is not normally something that would be observed in a "restore of a stop codon" situation, but it is something we need to worry about in terms of therapeutic efficiency.

Reviewer #2:

This revised manuscript provides a deep mutational scanning of the deaminase domain of human ADAR2 to provide a comprehensive assessment of amino acids that alter editing activity at a specific adenosine flanked by preferred nucleotides (UAG). The authors recover 33 individual mutations that either increase or decrease editing, including several mutations that were previously known to impact editing. The authors perform a second DMS starting with a known hyperactive mutant and seeking to obtain a mutant that has altered preferences for the nearest neighbors of the target adenosine. This second goal is quite important in terms of impact on precision medicine. The last goal of the paper is to develop a split ADAR method of editing target adenosines with the goal of reducing off-target editing, again an important technological advancement for therapeutic use of ADARs. Overall, the revisions adequately address the concerns about clarity, experimental approaches and statistical analysis.

Two areas that need to be addressed are listed below.

1. It would be beneficial if the authors specifically identified the novel mutations identified in the deaminase domain determined from the initial DMS experiment on the UAG codon (SI Figure 2). Furthermore, the initial reviews noted that there were several mutations (ex. D419W, D362R, D365R, etc) that exhibit a similar elevated activity as the well-described E488Q hyperactive mutant on the UAG substrate. The authors were asked (by reviewer 1) whether these hyperactive mutants are specific to UAG or also behave similar to E488Q (exhibit increased editing at less preferred codons). This was not addressed in the revised manuscript.

2. The second DMS screen identified only one mutant, N496F, that could significantly enhance editing of a GAC codon. The authors did not recover this mutant in the initial screen, is that due to the lack of N496F affecting editing at UAG codons?

The abstract describes this mutant as "greatly increased enzymatic activity at 5' GAN-3' motifs". This language is overstated both in terms of the activity (which is simply 1.1-2 fold enhanced (Figure 1h)) and with regards to the specific motif. The comprehensive assessment of the specificity of this mutant (requested in the initial review, SI Figure 3c) indicates this mutant has enhanced activity not only for GAN motifs but also CAN motifs, with CAC being the second most edited codon after GAA (and above several other GAN codons).

The authors should both tone down the language in the abstract and discuss the lack of specificity of the E488Q, N496F mutant, especially in terms of what may occur with off-targets. The authors have already performed the experiment (Figure 3b and Figure 4b) but do not discuss the data in this regard, despite the initial request by Reviewer 1. This is particularly important as the authors stress that finding mutants with altered preference is important for precision medicine, but if these ADAR mutants also increase off-target editing, the findings are less exciting, albeit with more rationale for using the split ADAR technology developed.

The methodology for the comprehensive analysis of editing preferences should be added to the manuscript.

https://doi.org/10.7554/eLife.75555.sa1

Author response

[Editors’ note: the authors resubmitted a revised version of the paper for consideration. What follows is the authors’ response to the first round of review.]

Reviewer #1:

The authors aim to enhance the efficiency and specificity of RNA engineering mediated by the ADAR2 deaminase domain (ADAR2-DD) by three approaches. First, they interrogated the residues in a portion of ADAR2-DD by deep mutational scanning (DMS), providing a comprehensive map of residues important for the core region of ADAR2-DD. Second, they identified a novel mutation N496F that increases editing efficiency for a 5'-GA-3' motif. Third, they showed the split ADAR2-DD design with good editing efficiency and specificity. This work provides a deeper understanding of the ADAR2 deaminase domain as a tool for RNA directed engineering. Although the work presents advances, I have several concerns that the authors need to address in more detailed analysis.

(1) The DMS screen described in Figure 1a is not intuitive to understand. Where exactly are the editing sites in the ADAR2-DD? How many sites are there in total? How are the editing sites and mutations identified in the sequencing reads? A better drawing, explanation, figure legend and methods are needed.

To improve clarity, we have now split Figure 1 into Figures 1a, 1b and 1c. The novelty of the DMS lies in coupling RNA editing and mutant information by linking them on the same transcript, thereby enabling direct measurement of biochemical activity (vs. via a surrogate assay). To enable this, the editing sites are located in the ADAR2 deaminase domain (ADAR2-DD) outside of the region where single amino acid substitutions are created (residues 340-600). Specifically, we assayed UAG and GAC editing yields for ADAR2-DD mutants between amino acid positions 340-468 via sites located at amino acid positions 332 and 329. Similarly, we assayed UAG and GAC editing yields for ADAR2-DD mutants between amino acid positions 469-600 via sites located at amino acid positions 621 and 626. At all of these sites, an A-to-G substitution creates a synonymous amino acid change, so editing does not alter the protein coding sequence. By transducing cells at a low multiplicity of infection (MOI), we ensure that each cell receives a single mutant. The schematic for the location of editing sites is highlighted in SI Figure 1a. When corresponding transcription and translation occur in a cell, the MCP-ADAR2 mutant complexes with the MS2 guide RNA to edit its own transcript. If a mutant is highly active, a higher fraction of the mutant transcripts will contain the A-to-G substitution at the editing site, as compared to a less functional mutant. By isolating RNA, converting it to cDNA and amplifying the ADAR2-DD, we could quantify the total number of reads arising from each mutant and determine the fraction of reads that contain the targeted A-to-G edit. UAG and GAC editing yields were assayed in independent experiments via use of ADAR2 or ADAR2 (E488Q) mutant libraries, and in cells harboring distinct MS2 guide RNAs.
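The read-counting step described in this response can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `editing_yield`, the input format and the toy counts are hypothetical, and it assumes that each read has already been assigned to a mutant and that the base at the editing site has been called.

```python
from collections import defaultdict

def editing_yield(reads):
    """Return {mutant: fraction of reads carrying G at the editing site}.

    `reads` is an iterable of (mutant_label, base_at_editing_site) pairs
    (a hypothetical, pre-processed input format). Reads whose base is
    neither A (unedited) nor G (edited) are skipped.
    """
    edited = defaultdict(int)
    total = defaultdict(int)
    for mutant, base in reads:
        if base not in ("A", "G"):
            continue  # ambiguous call or sequencing error
        total[mutant] += 1
        if base == "G":
            edited[mutant] += 1
    return {mutant: edited[mutant] / total[mutant] for mutant in total}

# Toy data: the counts below are illustrative only, not measured values.
toy_reads = (
    [("WT", "G")] * 30 + [("WT", "A")] * 70
    + [("E488Q", "G")] * 60 + [("E488Q", "A")] * 40
    + [("E488Q", "N")] * 5  # ignored by editing_yield
)
yields = editing_yield(toy_reads)
for mutant, fraction in sorted(yields.items()):
    print(f"{mutant}: {fraction:.2f}")
```

Because the mutation and the editing site sit on the same transcript, a single pass over the reads is enough to obtain both the read depth and the edited fraction for every mutant.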

(2) For the DMS method, since the editing site is in ADAR2-DD itself, when making mutations the structure of the RNA substrate might change, especially the ones near the editing site. Then the effect might not be sole consequence of the protein mutant. If RNA substrate might change in structure, is the DMS result still consistent with the validation result using the cluc assays?

We understand the reviewer’s concern regarding the local structure of the target being altered by the mutations in the ADAR2-DD sequence. The DMS assay was purposefully designed such that: (1) the editing sites were chosen to be substantially outside of the region where the mutations were made, with the minimum distance between the editing site and mutation being >20 bp; (2) additionally, the mutations were created outside the binding site of the guide RNA; (3) furthermore, we confirmed the effects of these mutations on the overall free energy of RNA folding are predicted to be minimal; and (4) importantly, the DMS results were consistent with the validation results, confirming the efficacy of the screen. See Figure 1-figure supplement 2.

(3) In addition to measuring the editing level of targeted sites, can the authors measure off target in the deep sequencing data? This would result in both editing efficiency and editing specificity measurement.

While we have assayed activity and transcriptome-wide specificity for individual mutants in this study, the pooled nature of the screen makes it impossible to measure, at scale, both the on-target and the off-target editing associated with each mutant in the library. This would in principle require single-cell RNA-seq measurements. However, these assays in current formats (such as via 10X Genomics): (1) yield very sparse data, and (2) provide sequence information only for RNA regions close to poly-A sites.

(4) ADAR1 is perhaps expressed in the chosen human cells. Would any of the results from the DMS and the validation be complicated by the editing from endogenous ADAR1?

ADAR1 is indeed expressed in the HEK293FT cells in which the experiment was carried out. However, when using the short MS2-adRNA antisense sequences (length 20 bp) there is no recruitment of endogenous ADAR1 (data reproduced below from Katrekar et al. Nature Methods, 2019).

Author response image 1
Specifically, on-target RNA editing by MCP–ADAR2 DD-NLS required co-expression of the MS2 adRNA.

GluR2 adRNA and MS2 adRNA used in this experiment had an antisense domain of length 20. Values represent mean ± s.e.m. (n = 3). All experiments were carried out in HEK293T cells.

(5) It is very interesting that many mutants showed equal or higher activity than the well-known E488Q mutant. What are the efficiency and specificity for these (at least for some representative ones from the validation)?

We agree and present corresponding detailed characterization data for ADAR2(E488Q, N496F) - notably, this novel mutant showed enhanced activity against GAC motifs which wt-ADAR2 is unable to efficiently edit. We have rigorously characterized this hit via editing and specificity analysis, including all 16 5’-NAN-3’ motifs, and have additionally also created all-in-one-split versions of the same. See Figures 4a and b, Figure 1-figure supplement 3.

We have also validated several additional mutants identified in the screen via both a luciferase reporter assay and direct RNA editing measurements. See Figure 1—figure supplement 2.

(6) For the N496F mutant, what are the transcriptome wide off-targeting data? What about other non-GA sequences?

As suggested by the reviewer, we have now carried out deep RNA-seq to quantify the transcriptome wide off-target editing observed with the ADAR2(E488Q, N496F) double mutant expressed both as a full domain and a split-domain. Per our engineered designs, the latter format eliminates off-target RNA editing while still retaining robust on-target activity. See Figure 4b.

Additionally, we also comprehensively examined editing across all possible 5’ and 3’ flanking nucleotides, and corresponding data comparing the profiles with ADAR2(E488Q), see Figure 1—figure supplement 3c.

(7) How was the split ADAR2-DD chosen? Current writing is very simple. It would be useful for the community if the authors provide more details of the reasoning.

We agree, and the rationale for choosing the residues is now more explicitly detailed in the main text: “Examining the results of the DMS (focusing on sites with high mutability), as well as the crystal structure of the ADAR2-DD (focusing on high solvent accessible surface area), and residue conservation scores across species (focusing on low scores of conservation), we identified 18 putative regions for splitting the protein.”

(8) What is the correlation of the luciferase assay signal to the actual editing level of the transcript? What if comparing all designs (mutations and split-ADAR2-DD) in the same assay so we can see a direct comparison of the editing efficiency and specificity? Preferably, using editing level of a target site as a readout to compare all designs and the off-target analysis by RNA-seq.

As suggested, we have now plotted the editing levels of the transcript against the luciferase signal (RLU). The Pearson correlation for the scatter plot is 0.818 while the Spearman correlation is 0.824. See Figure 1—figure supplement 2.
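For readers who want to reproduce this kind of comparison, the two correlation coefficients can be computed as in the sketch below. It uses only the standard library; the helper names and the paired values are hypothetical and are not the data behind the reported coefficients.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical paired measurements for five constructs.
percent_edited = [5.0, 12.0, 20.0, 35.0, 50.0]
luciferase_rlu = [1.0, 4.0, 9.0, 40.0, 45.0]
print(f"Pearson:  {pearson(percent_edited, luciferase_rlu):.3f}")
print(f"Spearman: {spearman(percent_edited, luciferase_rlu):.3f}")
```

Pearson measures linear agreement, whereas Spearman only asks whether the two readouts rank the constructs in the same order, so reporting both guards against a reporter signal that is monotonic but non-linear in the editing level. If SciPy is available, `scipy.stats.pearsonr` and `scipy.stats.spearmanr` provide the same quantities.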

Additionally, as requested, we have also compared the full-length ADAR2, the E488Q mutant and the E488Q, N496F double mutant and their respective all-in-one-split counterparts via RNA-seq (while editing the same UAG site in the RAB7A 3’ UTR). See Figures 3b, 4b.

(9) To validate all the findings in this work, it would be desired to show how an engineered ADAR2 DD, in a split fashion, would edit an endogenous substrate with a non-UAG motif (such as GAC). What would be the editing efficiency (% editing level) and the transcriptome-wide specificity?

As suggested by the reviewer, we have now created an all-in-one-split-ADAR2DD(E488Q, N496F) and compared it with the all-in-one-split-ADAR2(E488Q), and indeed can confirm a significant increase in editing efficiency of the GAC motif. Additionally, we show that splitting the enzyme does significantly reduce off-target editing as compared to the full-length MCP-ADAR2-DD(E488Q, N496F). This RNA-seq was carried out in the context of the same UAG editing site of the RAB7A used in all other samples so as to enable a side-by-side comparison between them. See Figures 3b and 4b.

Reviewer #2:

Katrekar and colleagues developed a screen for deaminase acting on RNA (ADAR) and screened most single amino acid substitutions across the catalytically active domain for RNA editing and for activity at 5'-GA-3' motifs. Separately, they developed a split ADAR and evaluated specific and off-target RNA editing using whole transcriptomes. The paper does not read like a coherent story and instead is two separate papers: Figure 1 – 2 involve the screen and evaluation of single clones resulting from the screen, whereas Figure 3-4 involve the split ADAR. The strengths of the paper involve the novelty of the genetic screen and, separately, the development and validation of the split ADAR system. There exist major concerns about the representation of the results from the screen, along with minor suggestions on the split ADAR story.

1. The statistical underpinning of the validity of the screening results is unexplored in the main text and needs to be described accurately. The authors split between Z scores (Figure 1), Fold change in DMS relative to ADAR2-DD (Figure 2), % edited (in supporting information files and Figure 1d), DMS log2 fold change (SI Figure 2). Each of these screening outputs (if all are included) needs to be described in the main text and justified. My personal opinion is that one or at most two metrics can be used in the paper to avoid confusion.

As suggested by the reviewer, we have now switched to log2 fold change for all the replicate correlations and individual hit validations, and used Z-scores for the screen heatmap.
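The two metrics can be made concrete with a short sketch. The exact definitions used in the manuscript are not restated in this response, so the following assumes that fold change is taken relative to a reference enzyme's editing yield and that Z-scores are computed across a set of mutant values; the function names and the optional `pseudocount` parameter are hypothetical.

```python
import math
import statistics

def log2_fold_change(mutant_yield, reference_yield, pseudocount=0.0):
    """log2 of the mutant editing yield relative to a reference yield.

    `pseudocount` is an assumed guard against zero yields; it is not
    taken from the manuscript.
    """
    return math.log2((mutant_yield + pseudocount) / (reference_yield + pseudocount))

def z_scores(values):
    """Standardize values to mean 0 and (population) standard deviation 1.

    Raises ZeroDivisionError if all values are identical.
    """
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]

# Illustrative numbers only.
print(log2_fold_change(0.4, 0.2))   # a mutant with twice the reference yield
print(z_scores([0.1, 0.2, 0.3]))
```

A log2 scale treats a twofold gain and a twofold loss symmetrically, which suits replicate correlations and hit validation, while Z-scores place every substitution on a common scale for a heatmap.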

I have particular concerns in this section about the following:

a. replicates for the screen. The paper only lists replicates in three places, and nowhere describes how the replicates were performed. How were they performed? Biological replicates? Technical replicates? Different days? These experimental details need to be discussed explicitly.

Here are the specific details (also included in the manuscript): “Two biological replicates were performed in independent plates of cells transduced with independent vials of lentivirus.”

b. Of concern for the replicates are the relatively low correlation between replicates (R2 = 0.48 by my calculation). The correlation is not discussed at all in the main text – this data needs to be explicit for the reader to judge for herself the validity of the data presented.

We agree. To address this, we have now increased the sequencing depth and also used more stringent filtering. Updated metrics are in Figure 1d and Figure 1-figure supplement 1b.

We have also included a statement regarding the R2 value in the main text which now reads: “The deaminase domain transcripts for each variant also contained the associated A-to-I editing yields, which were then quantified for both replicates of the DMS (R2 = 0.687).”

c. The replicate shown in SI Figure 1d has a correlation missing for the worst performing sample ("wt-X-TAG") and the meanings of wt-X, wt-Y, etc. are not described.

This section has been updated.

d. The validation performed isogenically involves cherrypicked samples with low variance between them (R2 for the variants described in figure 2b are 0.87) and don't represent a fair comparison. The authors state that "We observed that a majority of the mutants (85%) followed the same trend in our arrayed validation as seen in the pooled screens" but the meaning behind the sentence is not clear. What does the same trend mean and how is it calculated? Determine the statistical significance using a t test and show comparisons between isogenic datasets using rank correlation or R2 correlation.

The samples for validation were picked based on Z-score (low, medium, high). We have determined the statistical significance using an independent t-test with unequal variance (Welch’s t-test). 24 out of the 33 mutants that were validated are not significantly different from the screening data (p>0.05). For the 9 samples that are significantly different, it is an issue of magnitude rather than direction, with the screen tending to overestimate the effect seen in the validation. The Pearson correlation between the arrayed validations and data from the screen is 0.818. The Spearman correlation is 0.824. Taken together, the screen is accurate in predicting whether a particular mutation will impair, improve or not alter RNA editing. See Figure 1-figure supplement 2a.
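As a reminder of what Welch's test computes, the sketch below derives the t statistic and the Welch-Satterthwaite degrees of freedom for two small samples. The function name and the sample values are hypothetical; the standard library has no Student t distribution, so the final conversion to a p-value is left out here.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean (sample variance / n).
    se2_a = statistics.variance(sample_a) / n_a
    se2_b = statistics.variance(sample_b) / n_b
    t = (statistics.fmean(sample_a) - statistics.fmean(sample_b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df

# Hypothetical log2 fold changes for one mutant: screen vs. arrayed validation.
screen_values = [1.0, 2.0, 3.0]
validation_values = [2.0, 3.0, 4.0]
t_stat, dof = welch_t(screen_values, validation_values)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

Unlike Student's t-test, this form does not pool the two variances, which is why it is the safer choice when a pooled screen and an arrayed assay may have different noise levels. With SciPy installed, `scipy.stats.ttest_ind(a, b, equal_var=False)` runs the same test and also returns the two-sided p-value.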

e. Points a-d lead to the following conclusion that the screen, while clever and well implemented, has relatively high error and the data should not be presented as a heat map as the authors present in Figure 1. Deep mutational scanning experiments where data is presented as heat maps typically have R2 values of 0.8 or higher. This data is useful – the data for conservation at each position should be relatively robust even with the error in the screen reported in the paper. This screen can also be used to identify 'hits'.

By further increasing the sequencing depth, and also using more stringent filtering, our data now has an R2 = 0.687. However, per the reviewer's suggestion, we have now deemphasized the DMS screen, moved the corresponding heat map to the Supplementary Information section, and primarily utilized it in the manuscript as a screen for identifying novel hits. We would however like to highlight that this novel screening format directly measures RNA editing yields and thus enzymatic activity. This is unlike the large majority of mutagenesis screens in literature that rely primarily on surrogate readouts.

Reviewer #3:

The paper by Mali and colleagues is an interesting mix of experiments on ADAR2 functionality: on mutations that increase activity, on a split ADAR2 construct that appears to decrease off-target effects, and on a split RESCUE construct that is said to also increase the specificity of C to U editing.

The notion of a split ADAR is certainly novel (brought together by the binding in situ of two elements on the RNA – an MS2 and boxB element, plus a second pair). However the paper would really benefit from being more explicit on some of the results. For instance, the "decrease of off target effects" though apparently significant, would benefit from some nuance – for example are there commonalities to the off-target targets? (are there RAB7A-specific off-targets vs KRAS specific ones and what would that imply?) In other words how generalizable to other transcripts are these findings?

We thank the reviewer for this suggestion. We have now carried out RNA-seq analysis on the KRAS targeting samples. A closer look at the off-targets confirms that they are guide RNA dependent for the split-ADAR system. This is in contrast to the overexpressed full length deaminases where off-target edits are primarily driven by the non-specific dsRNA binding activity of the enzyme. A statement towards this has now also been included in the main text: “A closer look at the off-targets revealed that in case of the splitADAR2 system, highly edited off-targets were guide RNA sequence dependent. This is in contrast to full-length deaminase domain overexpression where off-targets were predominantly deaminase domain driven.”

A summary of full length deaminase domain overexpression is shown in Author response image 2 both in the presence and absence of guide RNA.

Author response image 2

1753 off-targets were shared between all constructs. This indicates that enzyme preferences dictate off-targets as 2 of the latter constructs lack a guide RNA.

We have also closely examined the off-targets in 4 of our split-deaminase samples. A summary is shown in Author response image 3.

Author response image 3

68 off-targets were shared only between the KRAS targeting constructs. 3 of 68 shared off-targets are seen in the PAICS transcript, and an alignment of the reverse complement of the guide and the off-target site is shown:

Reverse complement of guide: TCCTCATGTTACAAACTTGTGGTGC

Highly edited off-target: TGCTAGGGTTACAGACATGAGCCAC
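The degree of similarity between the two sequences can be checked with a short sketch. It assumes the sequences are compared position by position without gaps, exactly as they are printed in this response; the original alignment image may have been laid out differently, and the helper name is hypothetical.

```python
def matching_positions(seq_a, seq_b):
    """Return the 0-based positions at which two equal-length sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length for an ungapped comparison")
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a == b]

# Sequences copied from the response text.
guide_rc = "TCCTCATGTTACAAACTTGTGGTGC"
off_target = "TGCTAGGGTTACAGACATGAGCCAC"

matches = matching_positions(guide_rc, off_target)
match_line = "".join("|" if a == b else " " for a, b in zip(guide_rc, off_target))

print(guide_rc)
print(match_line)
print(off_target)
print(f"{len(matches)} of {len(guide_rc)} positions identical")
```

Printing the match line under the guide makes the shared stretches easy to see at a glance, which is the point the alignment is meant to convey about guide-dependent off-targets.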

Continuing with the notion of tradeoffs, are GAC/GAG-focused mutants "worse" on other triplets? and which?

We agree this is an important characterization of the mutant. We have now looked at editing across other motifs and compared it to the ADAR2(E488Q). See Figure 1-figure supplement 3c.

Finally, given how little has been published on targeted C to U editing (excepting RESCUE), it is important to treat figure 4d as a little more than an afterthought – with a comprehensive analysis equivalent to the treatment of A to I editing.

We fully agree, and as suggested by the reviewer, have now delved deeper into C-to-U editing via RESCUE. Specifically, we carried out RNA-seq analyses which confirmed that splitting the enzyme reduces off-targets both in the A-to-I space as well as the C-to-U space. See Figures 4c and 4d.

[Editors’ note: what follows is the authors’ response to the second round of review.]

Reviewer #1:

The manuscript by Mali and colleagues is substantially improved from the earlier version. If I had a single comment to make on the revision (and only because they now have the data to look) is this. Normally, a guide RNA bound to the coding region (as will be the case for the cAg in Rab7a) would be expected to reduce transcript abundance (through RNAi like effects). In view of the data in figure 4a/4b, is this true? if so it would be important to point out, because this is not normally something that would be observed in a "restore of a stop codon" situation, but it is something we need to worry about in terms of therapeutic efficiency.

This is an important point brought up by the reviewer. RNAi like effects could potentially be seen while targeting a CDS. We have now carried out qPCRs on two targets, in the presence or absence of a guide RNA targeting a CAG in the RAB7A transcript and a CAT in the GAPDH transcript. We do not observe any RNAi like effects in presence of the guide RNA.

Author response image 4

Reviewer #2:

This revised manuscript provides a deep mutational scanning of the deaminase domain of human ADAR2 to provide a comprehensive assessment of amino acids that alter editing activity at a specific adenosine flanked by preferred nucleotides (UAG). The authors recover 33 individual mutations that either increase or decrease editing, including several mutations that were previously known to impact editing. The authors perform a second DMS starting with a known hyperactive mutant and seeking to obtain a mutant that has altered preferences for the nearest neighbors of the target adenosine. This second goal is quite important in terms of impact on precision medicine. The last goal of the paper is to develop a split ADAR method of editing target adenosines with the goal of reducing off-target adenosines, again an important technological advancement for therapeutic use of ADARs. Overall, the revisions adequately address the concerns about clarity, experimental approaches and statistical analysis.

Two areas that need to be addressed are listed below.

1. It would be beneficial if the authors specifically identified the novel mutations identified in the deaminase domain determined from the initial DMS experiment on the UAG codon (SI Figure 2). Furthermore, the initial reviews noted that there were several mutations (ex. D419W, D362R, D365R, etc) that exhibit a similar elevated activity as the well-described E488Q hyperactive mutant on the UAG substrate. The authors were asked (by reviewer 1) whether these hyperactive mutants are specific to UAG or also behave similar to E488Q (exhibit increased editing at less preferred codons). This was not addressed in the revised manuscript.

While we do understand the importance of further characterizing the mutants identified in the initial DMS, we want to point out that the goal of this DMS was primarily to create a mutagenesis map of the ADAR2 and thereby understand which residues tolerate mutations and which ones do not. An exhaustive list of mutants with associated RNA editing activities against the UAG is provided in the supporting tables. We have now also evaluated a handful of these mutants against CAG and GAC motifs. Given that the screen was carried out against a UAG motif, the effects of these mutations on editing of CAG and GAC motifs were not as pronounced as on the UAG motif.

Author response image 5

2. The second DMS screen identified only one mutant, N496F, that could significantly enhance editing of a GAC codon. The authors did not recover this mutant in the initial screen, is that due to the lack of N496F affecting editing at UAG codons?

The second DMS was carried out in the E488Q background while the first was carried out in the wild type ADAR2. The second screen helped uncover a double mutant E488Q, N496F which had enhanced activity against a GAC triplet. When this double mutant was evaluated against a UAG, the improvement in editing was 1.12 fold compared to the E488Q as can be seen in Figure 1—figure supplement 3c. This N496F mutant was on an average 1.2 fold better than the ADAR2 at editing a UAG motif in the two replicates of the first screen. This tells us that the N496F itself does not greatly alter editing at UAG motifs.

The abstract describes this mutant as "greatly increased enzymatic activity at 5' GAN-3' motifs". This language is overstated both in terms of the activity (which is simply 1.1-2 fold enhanced (Figure 1h)) and with regards to the specific motif. The comprehensive assessment of the specificity of this mutant (requested in the initial review, SI Figure 3c) indicates this mutant has enhanced activity not only for GAN motifs but also CAN motifs, with CAC being the second most edited codon after GAA (and above several other GAN codons).

We have now toned down the abstract and the relevant statement reads: “This enabled us to create a domain wide mutagenesis map while also revealing a novel hyperactive variant with improved enzymatic activity at 5’-GAN-3’ motifs.”

5’-CAC-3’ definitely is highly edited, however, this is not the case with other 5’-CAN-3’ motifs but instead with 5’-NAC-3’ motifs. We believe that since the screen was carried out while targeting a 5’-GAC-3’ motif, we have selected for a mutant with improved editing in the context of a 5’-G and a 3’-C. This is now included in the main text and reads: We also confirmed that this new variant was at least as efficient as the E488Q at editing all other motifs, with improved editing also observed at 5’-NAC-3’ motifs.

The authors should both tone down the language in the abstract and discuss the lack of specificity of the E488Q, N496F mutant, especially in terms of what may occur with off-targets. The authors have already performed the experiment (Figure 3b and Figure 4b) but do not discuss the data in this regard, despite the initial request by Reviewer 1. This is particularly important as the authors stress that finding mutants with altered preference is important for precision medicine, but if these ADAR mutants also increase off-target editing, the findings are less exciting, albeit with more rationale for using the split ADAR technology developed.

We agree with the reviewer and have made the necessary changes to the manuscript.

We have altered the abstract and the relevant statement reads: However, exogenous delivery of ADAR enzymes, especially hyperactive variants, leads to significant transcriptome wide off-targeting.

We have also included the following statement in the main text: Although the full-length ADAR2-DD(E488Q, N496F) was highly promiscuous, splitting it enabled high transcriptome-wide specificity while targeting a UAG in the RAB7A transcript.

Additionally, we have also added the following statement to the Discussion section: However, like the ADAR2-DD(E488Q), the ADAR2-DD(E488Q, N496F) showed increased bystander editing and transcriptome wide off-targeting as compared to the ADAR2-DD.

The methodology for the comprehensive analysis of editing preferences should be added to the manuscript.

This statement has now been added to the methods section and reads: For a comprehensive analysis of the ADAR2-DD(E488Q, N496F) mutant, several adenosines within the RAB7A 3’UTR and GAPDH CDS were targeted. The editing efficiencies for each of these adenosines were normalized with that of the ADAR2-DD(E488Q) so as to represent all of the 16 motifs on a single heatmap.
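A minimal sketch of this normalization follows. It assumes that "normalized with" means dividing the double mutant's editing efficiency by that of ADAR2-DD(E488Q) at the same motif, and that the 16 motifs are laid out with the 5' base on rows and the 3' base on columns; the function name, the layout and all numbers are hypothetical.

```python
BASES = "ACGU"

def motif_heatmap(double_mutant, e488q):
    """Return a 4x4 grid of efficiency ratios (double mutant / E488Q).

    Rows follow the 5' neighbor and columns the 3' neighbor, both in
    the order given by BASES. Inputs map motifs such as "GAC" to
    editing efficiencies.
    """
    grid = []
    for five_prime in BASES:
        row = []
        for three_prime in BASES:
            motif = f"{five_prime}A{three_prime}"
            row.append(double_mutant[motif] / e488q[motif])
        grid.append(row)
    return grid

# Illustrative efficiencies: identical everywhere except a doubled GAC value.
e488q_eff = {f"{f}A{t}": 0.2 for f in BASES for t in BASES}
double_eff = dict(e488q_eff)
double_eff["GAC"] = 0.4

heatmap = motif_heatmap(double_eff, e488q_eff)
for five_prime, row in zip(BASES, heatmap):
    print(five_prime, " ".join(f"{value:.2f}" for value in row))
```

Expressing each cell as a ratio to E488Q puts motifs with very different absolute editing levels on one scale, so values above 1 mark motifs where the double mutant improves on E488Q.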

https://doi.org/10.7554/eLife.75555.sa2

Article and author information

Author details

  1. Dhruva Katrekar

    Department of Bioengineering, University of California San Diego, San Diego, United States
    Contribution
    Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Writing – original draft, Writing – review and editing
    Competing interests
    has filed a patent pertaining to the screening methodology, novel mutants and splitting of the ADAR2-DD (application number: 63/075,717). Is now an employee of Shape Therapeutics
    ORCID: 0000-0002-8028-3244
  2. Yichen Xiang

    Department of Bioengineering, University of California San Diego, San Diego, United States
    Contribution
    Investigation
    Competing interests
    has filed a patent pertaining to the screening methodology, novel mutants and splitting of the ADAR2-DD (application number: 63/075,717). Is now an employee of Shape Therapeutics
  3. Nathan Palmer

    Division of Biological Sciences, University of California San Diego, San Diego, United States
    Contribution
    Formal analysis
    Competing interests
    No competing interests declared
    ORCID: 0000-0001-6347-9379
  4. Anushka Saha

    Department of Bioengineering, University of California San Diego, San Diego, United States
    Contribution
    Investigation
    Competing interests
    No competing interests declared
  5. Dario Meluzzi

    Department of Bioengineering, University of California San Diego, San Diego, United States
    Contribution
    Formal analysis
    Competing interests
    No competing interests declared
  6. Prashant Mali

    Department of Bioengineering, University of California San Diego, San Diego, United States
    Contribution
    Conceptualization, Funding acquisition, Investigation, Methodology, Project administration, Supervision, Writing – original draft, Writing – review and editing
    For correspondence
    pmali@ucsd.edu
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-3383-1287

Funding

National Human Genome Research Institute (R01HG009285)

  • Prashant Mali

National Cancer Institute (R01CA222826)

  • Prashant Mali

National Institute of General Medical Sciences (R01GM123313)

  • Prashant Mali

U.S. Department of Defense (PR210085)

  • Prashant Mali

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank members of the Mali lab for discussions, advice and help with experiments. This work was generously supported by UCSD Institutional Funds, NIH grants (R01HG009285, R01CA222826, R01GM123313, 1K01DK119687), and Department of Defense Grant (DOD PR210085). This publication includes data generated at the UC San Diego IGM Genomics Center utilizing an Illumina NovaSeq 6000 that was purchased with funding from a National Institutes of Health SIG grant (#S10 OD026929).

Senior Editor

  1. James L Manley, Columbia University, United States

Reviewing Editor

  1. Timothy W Nilsen, Case Western Reserve University, United States

Reviewers

  1. Nina Papavasiliou, Deutsche Krebsforschungszentrum (DKFZ), Germany
  2. Heather A Hundley, Indiana University, United States

Publication history

  1. Preprint posted: September 9, 2020
  2. Received: November 14, 2021
  3. Accepted: January 18, 2022
  4. Accepted Manuscript published: January 19, 2022 (version 1)
  5. Version of Record published: February 2, 2022 (version 2)

Copyright

© 2022, Katrekar et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.


Cite this article

  1. Dhruva Katrekar
  2. Yichen Xiang
  3. Nathan Palmer
  4. Anushka Saha
  5. Dario Meluzzi
  6. Prashant Mali
(2022)
Comprehensive interrogation of the ADAR2 deaminase domain for engineering enhanced RNA editing activity and specificity
eLife 11:e75555.
https://doi.org/10.7554/eLife.75555
