Boosting targeted genome editing using the hei-tag

Precise, targeted genome editing by CRISPR/Cas9 is key for basic research and translational approaches in model and non-model systems. While active in all species tested so far, editing efficiencies still leave room for improvement. The bacterial Cas9 needs to be efficiently shuttled into the nucleus, which is typically attempted by fusing it to nuclear localization signals (NLSs). Additional peptide tags such as FLAG- or myc-tags are usually added for immediate detection or straightforward purification. Immediate activity is usually granted by administration of preassembled protein/RNA complexes. We present the ‘hei-tag (high efficiency-tag)’ which boosts the activity of CRISPR/Cas genome editing tools even when supplied as mRNA. The addition of the hei-tag, a myc-tag coupled to an optimized NLS via a flexible linker, to Cas9 or a C-to-T (cytosine-to-thymine) base editor dramatically enhances the respective targeting efficiency. This results in an increase in bi-allelic editing, yet a reduction in allele variance, indicating an immediate activity even at early developmental stages. The hei-tag boost is active in model systems ranging from fish to mammals, including tissue culture applications. The simple addition of the hei-tag allows existing and potentially highly adapted systems to be upgraded instantly, and novel highly efficient tools that are immediately applicable at the mRNA level to be established.


Introduction
In the last decade, the CRISPR/Cas9 system and its derivatives facilitated and revolutionized genome editing across all phyla (Nidhi et al., 2021). The efficiency of editing crucially depends on the on-site activity of the particular Cas9 enzymes used (usually Streptococcus pyogenes Cas9, SpCas9) in the nucleus. State-of-the-art Cas9 variants differ by peptide tags added to the N- and C-termini of the respective endonuclease, resulting in different reported activities (Liu et al., 2021; Zhang et al., 2014). Employed tags usually comprise diverse nuclear localization signals (NLSs) and epitope tags (e.g. FLAG, Myc, HA) for potential protein purification or visualization. To achieve nuclear localization of the Cas9 enzyme, the monopartite NLS originating from the SV40 large T-antigen (Kalderon et al., 1984) or a bipartite NLS discovered in Xenopus nucleoplasmin is routinely employed (Dingwall et al., 1988). However, the nuclear localization activity of commonly used NLSs is tightly controlled during early development (Poon and Jans, 2005) and is first detectable during gastrulation. In fish embryos, an optimized artificial NLS (oNLS; Inoue et al., 2016) facilitates prominent nuclear localization immediately after fertilization, while the SV40 NLS acts most prominently much later and facilitates nuclear localization approximately at the 1000-cell stage. For high targeting efficiency with low mosaicism, a peak activity should be achieved in the zygote or at early cleavage stages. Here, we present the hei-tag, a short bipartite tag composed of a myc-tag and optimized NLSs at the N- and C-termini, that boosts Cas9 or cytosine-to-thymine (C-to-T) base editor-mediated targeted genome editing in organismo and cell culture.

Results
Assessing the genome editing efficiency requires a reliable and quantitative readout based on an apparent phenotype. We established a quantitative assay for loss-of-eye pigmentation to address the activity of different Cas9 variants in two teleost model systems, medaka (Oryzias latipes) and zebrafish (Danio rerio), covering a wide evolutionary distance of 200 million years (Furutani-Seiki and Wittbrodt, 2004). Our assay on retinal pigmentation provides a highly reproducible quantitative readout for the loss of the conserved transporter protein oculocutaneous albinism type 2 (oca2), required for melanin biosynthesis (Figure 1a). Only its bi-allelic inactivation results in the loss of pigmentation of eyes and skin (Lischik et al., 2019). A prominent knock-out phenotype thus can either result from a single to few early events, or from many events at subsequent developmental stages. Although phenotypically indistinguishable, the two scenarios differ in allele variance (genetic mosaicism), which reflects the time point of action.

eLife digest
The genetic code stored within DNA provides cells with the instructions they need to carry out their role in the body. Any change to these genes, or the DNA sequence around them, has the potential to completely alter how a cell behaves.
Scientists have developed various tools that allow them to experimentally modify the genome of cells or even entire living organisms. This includes the popular Cas9 enzyme which cuts DNA at specific sites, and base editors which can precisely change bits of genetic code without cutting DNA. While there are lots of Cas9 enzymes and base editors currently available, these often differ greatly in their activity depending on which cell type or organism they are applied to.
Finding a tool that can effectively modify the genome of an organism at the right time during development also poses a challenge. All the cells in an organism arise from a single fertilized cell. If this cell is genetically edited, all its subsequent daughter cells (which make up the entire organism) will contain the genetic modification. However, most genome editing tools only work efficiently later in development, resulting in an undesirable mosaic organism composed of both edited and non-edited cells.
Here, Thumberger et al. have developed a new 'high efficiency-tag' (also known as hei-tag for short) that can enhance the activity of gene editing tools and overcome this barrier. The tag improves the efficiency of gene editing by immediately shuttling a Cas9 enzyme to the nucleus, the cellular compartment that stores DNA. In all cases, gene editing tools with hei-tag worked better than those without in fish embryos and mouse cells grown in the laboratory. When Cas9 enzymes connected to a hei-tag were injected into the first fertilized cell of a fish embryo, this resulted in an even distribution of edited genes spread throughout the whole organism.
The online version of this article includes the following source data and figure supplement(s) for Figure 1: Source data 1. Raw data for quantifications shown in Figure 1d and g.
State-of-the-art protocols employ high concentrations of Cas9 and respective sgRNAs to ensure efficient on-site editing. To facilitate uniform Cas9 action, we followed our successful mRNA injection protocol (Gutierrez-Triana et al., 2018). One-cell stage medaka embryos were co-injected with sgRNAs targeting the oca2 gene (OlOca2 T1, T2) together with mRNA encoding a Cas9 endonuclease and mRNA encoding the green fluorescent protein (GFP) as injection tracer. Injected embryos were fixed at 4.5 days post fertilization (dpf) (Iwamatsu, 2004), well after the onset of pigmentation in control injections, and subjected to image analysis (Figure 1b). In brief, the eyes were segmented, (residual) pigmentation was thresholded (Figure 1c-c') and quantified according to mean gray values (0, i.e. fully pigmented; 255, i.e. completely unpigmented; Figure 1d).
We first established the base activity level for the assay at standard conditions with high molar excess (150 ng/µl concentration) and determined the activity of a Cas9 variant codon optimized for zebrafish, that is, a Cas9 carrying an SV40 NLS at the N- and C-termini (nls-zCas9-nls, hereinafter: zCas9; Addgene plasmid #47929; Jao et al., 2013). The analysis of medaka oca2 knock-out embryos injected with zCas9 revealed bi-allelic inactivation events of the oca2 gene, yet with a strong overall variability as apparent by patchy unpigmented domains in the eyes (median of mean gray values = 134.5 compared to uninjected controls, median = 0.4; Figure 1d). This patchy distribution of small, unpigmented areas indicated that bi-allelic targeting occurred in only a few cells at later stages of development. To address whether different peptide domains (NLSs, Myc-tag, amino acid linkers) flanking the Cas9 enzyme enhance the targeting efficiency, we performed a permutation screen with Cas9 variants carrying these domains at different positions, which resulted in the identification of the 'hei-tag' (Figure 1-figure supplement 1). The hei-tag comprises a myc-tag connected via a flexible linker to an oNLS at the N-terminus, complemented by a second oNLS fused to the C-terminus of a mammalian codon-optimized Cas9 (see Supplementary file 1 for sequence), and in this conformation displayed the highest editing activity. Any alteration of those domains in relative order or sequence reduced editing efficiency compared to the hei-tag (Supplementary file 2).
When assessing the activity of the resulting heiCas9 at high molar excess (standard conditions, 150 ng/µl), heiCas9 displayed a 70% increase in bi-allelic targeting efficiency vs. the reference zCas9 (median zCas9 = 134.5, heiCas9 = 225.3; Figure 1d) in medaka. Embryos co-injected with heiCas9 mRNA and sgRNAs against oca2 essentially lost pigmentation. The observed absence of pigmentation argues for an early time point of action due to high activity and efficient nuclear translocation of the tagged heiCas9 variant already at the earliest cleavage stages. In developing organisms, the time point of genome editing essentially impacts the allele variance, that is, the number of alleles established by the targeting attempt. To immediately provide a functional editing machinery, preassembled ribonucleoproteins (RNPs) containing Cas9 protein and guide RNA are popular, employing high molar excess/high concentrations of Cas9 (Kroll et al., 2021; Wu et al., 2018). Strikingly, the editing efficiency of injected heiCas9 mRNA was fully comparable to such RNP approaches.
To address whether the enhancement by hei-tag fusion to Cas9 is applicable to different models, we next compared the activities of the zCas9 and heiCas9 in a second, evolutionarily distant fish species, D. rerio (zebrafish), targeting the orthologous oca2 gene (sgRNAs DrOca2 T1, T2; Hammouda et al., 2019). Injected and control embryos were fixed well after the onset of pigmentation at 2.5 dpf (Kimmel et al., 1995; Figure 1e-f) and subjected to the quantitative assay for eye pigmentation described above. Taking the activity of zCas9 as base level (median = 14.7), heiCas9 delivered an outstanding targeting efficiency (median = 254.6), reflecting a 17-fold increase (p = 2.1e-56) (Figure 1g, Figure 1-figure supplement 2). Similar to the results in medaka, yet even more pronounced, nearly unpigmented embryos were obtained with the heiCas9, arguing for highly efficient, early targeting.
Taken together, addition of the hei-tag to a mammalian codon-optimized Cas9 resulted in the highly efficient heiCas9, which boosted the targeting efficiency by up to 17-fold (zebrafish), even when used at saturating concentrations. It prominently inactivated both alleles of the targeted oca2 locus, with a putatively early onset of action upon injection of heiCas9 mRNA and the respective sgRNAs at the one-cell stage.
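The 70% and 17-fold figures follow directly from the reported medians of the mean gray values. The short Python sketch below only illustrates that arithmetic; the function names and the dictionary layout are our own assumptions for illustration and are not part of the analysis pipeline used in the study.

```python
# Median mean gray values as reported in the text
# (0 = fully pigmented, 255 = completely unpigmented).
medians = {
    "medaka": {"zCas9": 134.5, "heiCas9": 225.3},
    "zebrafish": {"zCas9": 14.7, "heiCas9": 254.6},
}


def fold_change(reference, test):
    """Ratio of the test median to the reference median."""
    return test / reference


def percent_increase(reference, test):
    """Relative increase of the test median over the reference, in percent."""
    return (test - reference) / reference * 100.0


# summary[species] = (fold change, percent increase) of heiCas9 over zCas9
summary = {}
for species, m in medians.items():
    summary[species] = (
        fold_change(m["zCas9"], m["heiCas9"]),
        percent_increase(m["zCas9"], m["heiCas9"]),
    )
    print(f"{species}: {summary[species][0]:.2f}-fold, "
          f"+{summary[species][1]:.1f}%")
```

Because the readout saturates at 255, ratios of medians understate differences once the tested variant approaches complete loss of pigmentation.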

To address whether the high targeting efficiency of heiCas9 was conveyed by the high molar excess employed or was possibly restricted to the oca2 locus, we turned to a multiplexing regime at 10-fold reduced concentrations of the Cas9 variants employed. We targeted four different genomic loci with four different sgRNAs: exonic targeting of oca2 (OlOca2 T2), targeting of the start codon of the retina-specific transcription factor 2 (rx2; Stemmer et al., 2015), and the crystallin alpha a (cryaa; Stemmer et al., 2015) as well as intronic targeting of rx3 (Zilova et al., 2021). Medaka one-cell stage embryos were co-injected with a mix of 12.5 ng/µl per sgRNA, the 10-fold reduced (15 ng/µl) zCas9 or heiCas9 mRNA and 20 ng/µl mCherry mRNA as injection tracer.
For each multiplexing experiment, the genomic DNA of three pools each containing eight randomly picked crispants was extracted at 4 dpf and subjected to allele-specific genotyping via Illumina sequencing. In the multiplexing approaches, a total of 823,898 reads for the zCas9 and 824,817 reads for the heiCas9, compared to 711,739 control reads, were analyzed (Supplementary file 3, Figure 2-figure supplement 1). In all cases, heiCas9 performed dramatically better than the reference zCas9 (Figure 2a; mean percentage of modified alleles zCas9 [black dots] vs. heiCas9 [red dots]: OlOca2: 3.38% vs. 54.59%, p = 0.026; OlRx2: 20.82% vs. 95.85%, p = 3.2e-06; OlRx3: 16.61% vs. 49.36%, p = 0.0041; OlCryaa: 83.50% vs. 98.44%, p = 0.039). Strikingly, although the overall targeting efficiency was consistently higher as reflected by the high percentage of edited alleles (Figure 2a), at the same time the allele variance was reduced in all cases when using heiCas9 (Figure 2b). This reduced allele variance for all multiplexed loci indicates an early editing by heiCas9. Given this and the overall higher targeting efficiency in all loci analyzed in the multiplexing approach, heiCas9 outperformed zCas9. It resulted in a massive performance boost, which was partially masked at saturating conditions, and now became fully apparent. The high efficiency of heiCas9 thus allows efficient editing at low concentrations with the potential to reduce off-target effects. Whether this putative reduction of off-targets is (over-)compensated by the efficient nuclear localization needs to be assessed by whole-genome sequencing approaches in the future.
Figure 2. Increased knock-out activity and reduced allele variance in heiCas9 crispants. Multiplexed injections with 15 ng/µl mRNA of zCas9 (black) or heiCas9 (red) and 12.5 ng/µl per sgRNA targeting exonic sequences in oculocutaneous albinism type 2 (oca2; OlOca2 T2), the start codons of the retina-specific homeobox transcription factor 2 (rx2; OlRx2) and of the alpha a crystallin (cryaa; OlCryaa), as well as an intronic sequence in rx3 (OlRx3). Illumina sequencing performed on three biological replicates (eight embryos each) per targeted locus. (a) Increased knock-out efficiency in heiCas9 crispants as shown by proportion of modified over all Illumina sequencing reads per replicate and locus. (b) Reduced allele variance in heiCas9 crispants as shown by abundance of specific allele divided by all modified alleles per replicate and locus. Bold line, mean values of zCas9 (black) and heiCas9 (red). Total aligned Illumina reads analyzed: OlOca2: zCas9 = 194,931, heiCas9 = 180,222; OlRx2: zCas9 = 224,146, heiCas9 = 269,103; OlRx3: zCas9 = 195,248, heiCas9 = 175,044; OlCryaa: zCas9 = 209,573, heiCas9 = 200,448. Statistical analysis performed in R, Student's t-test. The online version of this article includes source data and figure supplement(s) for Figure 2.
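The two readouts of Figure 2 are simple ratios: the proportion of modified reads among all reads (editing efficiency) and the abundance of each allele among the modified reads (allele variance). The Python sketch below restates them, assuming per-allele read counts are available as a dictionary. The per-locus read totals are taken from the Figure 2 legend; the function names and the example allele counts are hypothetical and serve only as an illustration.

```python
# Total aligned Illumina reads per locus, as listed in the Figure 2 legend.
reads = {
    "OlOca2": {"zCas9": 194_931, "heiCas9": 180_222},
    "OlRx2": {"zCas9": 224_146, "heiCas9": 269_103},
    "OlRx3": {"zCas9": 195_248, "heiCas9": 175_044},
    "OlCryaa": {"zCas9": 209_573, "heiCas9": 200_448},
}

# Sum over loci; these should match the totals quoted in the main text.
totals = {
    variant: sum(locus[variant] for locus in reads.values())
    for variant in ("zCas9", "heiCas9")
}


def editing_efficiency(modified_reads, total_reads):
    """Proportion of modified reads among all aligned reads (Figure 2a)."""
    return modified_reads / total_reads


def allele_fractions(allele_counts):
    """Abundance of each allele divided by all modified reads (Figure 2b)."""
    modified = sum(allele_counts.values())
    return {allele: n / modified for allele, n in allele_counts.items()}


# Hypothetical allele counts for one replicate, for illustration only.
example_counts = {"del_7bp": 600, "del_2bp": 300, "ins_1bp": 100}
example_fractions = allele_fractions(example_counts)

print(totals)
print(example_fractions)
```

A replicate dominated by one or two alleles yields a few large fractions, which is the pattern expected when editing happens at the earliest cleavage stages.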
While the early onset of action is required for uniform editing in developing organisms, cell culture approaches demand efficient translocation of the sgRNA/Cas9 complex in a large number of cells.
To validate the range of action and to address the relevance of the hei-tag in a mammalian setting, we expanded the scope of the analysis to mammalian cell culture. We focused on mRNA-based assays and compared the activity of heiCas9 to state-of-the-art Cas9 variants, that is, the commercially available GeneArt CRISPR nuclease as well as a mammalian codon-optimized Cas9 (JDS246-Cas9, Addgene #43861) in mouse SW10 cells. We assessed the respective genome editing efficiencies by independent and complementary tools, the Tracking of Indels by Decomposition (TIDE) analysis (Brinkman et al., 2014) as well as Inference of CRISPR Editing (ICE) (Hsiau et al., 2018). Both approaches decompose the mixed Sanger reads of PCR products spanning the CRISPR target site and compute an efficiency score as well as the distribution of expected indels. To target the murine Periaxin (Prx) locus, mouse SW10 cells were co-transfected with MmPrx crRNA/ATTO-550-linked tracrRNA and the mRNAs of either JDS246-Cas9, GeneArt CRISPR nuclease, or heiCas9. The Prx locus was PCR amplified and sequenced. Similar to targeting in organismo, heiCas9 also exhibited the highest genome editing efficiency in mammalian cell culture when compared to JDS246-Cas9 (relative efficiency, TIDE: 123.6%, ICE: 113%) and GeneArt CRISPR nuclease (relative efficiency, TIDE: 123.1%, ICE: 111%) (Figure 3, Figure 3-figure supplement 1; R² > 0.9 for both TIDE and ICE for all mRNAs tested). Notably, the KO-score efficiencies (ICE) amounted to 173% compared to JDS246-Cas9 and to 167% compared to GeneArt CRISPR nuclease, indicating a higher abundance of frameshifts (Hsiau et al., 2018) at this genomic locus.
Remarkably, heiCas9-transfected cells showed a highly increased number of mutant alleles with an increased abundance of a 26 nt deletion when compared to GeneArt CRISPR nuclease and JDS246-Cas9 (Figure 3-figure supplement 1).
Given the observed boosting of Cas9 activity by the simple addition of the hei-tag, we next tested if the hei-tag also improves further Cas9-based techniques. Base editing is an increasingly applied method with a potential for therapeutics (Antoniou et al., 2021). Base editors are composed of a modified Cas9 that only nicks one DNA strand and does not introduce a double-strand break (Cas9 nickase or Cas9n) and a nucleotide deaminase for precisely targeted nucleotide editing (Anzalone et al., 2020). To increase the efficiency of base editors, several iterative rounds of optimization of the employed deaminases and linkers have been undertaken, yielding optimal performance with the newest variants (Carrington et al., 2020; Cornean et al., 2022; Rosello et al., 2021; Zhao et al., 2020). To investigate if the addition of the hei-tag provides an easy and straightforward alternative route for increasing the activity of a nuclear protein of interest, we selected a C-to-T base editor version with intermediate efficiency (BE4-Gam; Komor et al., 2017) to introduce nonsense or severe missense mutations into the pigmentation gene oca2. We employed our tool ACEofBASEs (Cornean et al., 2022) to design and evaluate sgRNA target sites that introduce non-synonymous codon mutations and/or premature STOP codons upon editing of the respective open reading frame (ORF). We compared three different sgRNAs (OlOca2 T1, T3, and T4) employing the original BE4-Gam and the hei-tag fused variant (heiBE4-Gam). In the oca2 ORF, the transition of cytosines 766, 922, and 997 to thymine all convert the respective codon to a premature STOP (OlOca2 T3: C766T, leading to Q256*; OlOca2 T4: C922T, leading to Q308*; OlOca2 T1: C995-997T, leading to T332I and Q333*).
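The correspondence between the edited ORF positions and the affected codons can be checked with integer arithmetic. The following Python sketch assumes 1-based nucleotide numbering from the A of the start codon; the function names are hypothetical, and the oca2 sequence itself is not reproduced here, so the two glutamine codons are tested generically rather than read from the ORF.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
GLN_CODONS = ("CAA", "CAG")  # the two codons encoding glutamine (Q)


def codon_position(nt_pos):
    """Map a 1-based ORF nucleotide position to
    (1-based codon number, 0-based offset within that codon)."""
    return (nt_pos - 1) // 3 + 1, (nt_pos - 1) % 3


def c_to_t(codon, offset):
    """Apply a C-to-T transition at the given offset of a codon."""
    if codon[offset] != "C":
        raise ValueError("no cytosine at this position")
    return codon[:offset] + "T" + codon[offset + 1:]


# Positions named in the text: C766T (Q256*), C922T (Q308*),
# C995T (T332I) and C997T (Q333*).
positions = {nt: codon_position(nt) for nt in (766, 922, 995, 997)}
for nt, (codon_no, offset) in positions.items():
    print(f"nt {nt}: codon {codon_no}, offset {offset}")

# A C-to-T transition at the first base turns either Gln codon into a STOP.
gln_to_stop = all(c_to_t(codon, 0) in STOP_CODONS for codon in GLN_CODONS)
print("Gln -> STOP for both codons:", gln_to_stop)
```

Positions 766, 922, and 997 each fall on the first base of a codon, which is why the corresponding glutamine codons become premature STOP codons, whereas position 995 is the second base of codon 332.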
Again, the loss of pigmentation was used as proxy for bi-allelic targeting efficiency following medaka one-cell stage injections with either one of the three sgRNAs (OlOca2 T1, T3, or T4, 30 ng/µl) as well as 150 ng/µl mRNA of either BE4-Gam or heiBE4-Gam. Screening and analysis were performed at 4.5 dpf as described above. For each sgRNA employed, heiBE4-Gam resulted in more pronounced loss of pigmentation in comparison to BE4-Gam (Figure 4a; control median = 0.0; medians BE4-Gam vs. heiBE4-Gam: OlOca2 T1, 0.6 vs. 28.0, p = 1.737e-20; OlOca2 T3, 0.0 vs. 0.8, p = 0.0471; OlOca2 T4, 93.8 vs. 170.1, p = 5.215e-12). Quantification of Sanger sequencing reads confirmed an increase of all C-to-T transitions at the OlOca2 T1 target site when heiBE4-Gam was used (74.1% ± 8.9% for heiBE4-Gam vs. 44.2% ± 6.8% for BE4-Gam; Figure 4-figure supplement 1, three replicates containing five randomly picked embryos each). In particular, the C997T transition introducing a premature STOP codon was increased 1.7-fold with heiBE4-Gam (68% vs. 41% with BE4-Gam; Figure 4b and c).
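Transition rates can be compared either as an absolute difference (percentage points) or as a ratio (fold change), and the two should not be confused. The Python sketch below computes both for the values reported above; the helper name `compare_rates` is a hypothetical choice for illustration.

```python
def compare_rates(reference_pct, test_pct):
    """Return the absolute difference in percentage points and the
    fold change of a test rate relative to a reference rate."""
    return {
        "percentage_points": test_pct - reference_pct,
        "fold": test_pct / reference_pct,
    }


# All C-to-T transitions at the OlOca2 T1 site: BE4-Gam vs. heiBE4-Gam.
all_c_to_t = compare_rates(44.2, 74.1)

# C997T transition introducing the premature STOP codon.
stop_c997t = compare_rates(41.0, 68.0)

for label, result in (("all C-to-T", all_c_to_t), ("C997T", stop_c997t)):
    print(f"{label}: +{result['percentage_points']:.1f} percentage points, "
          f"{result['fold']:.2f}-fold")
```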
In conclusion, using the hei-tag to extend the ORFs of a mammalian codon-optimized SpCas9 or a C-to-T base editor (BE4-Gam) strongly enhanced the respective genome targeting efficiency.

Discussion
While the use of the optimized NLS in the hei-tag explains the earlier and better performance of the hei-tagged versions of Cas9 and base editors in developing organisms, the impact of the specific topology of domains contained in the hei-tag remains elusive. It is speculated that the addition of certain peptide tags influences the efficacy and specificity of the fused protein of interest, due to their different isoelectric points and charge distributions (Zhang et al., 2014). Interestingly, our permutation screen demonstrated that although comprising the exact same peptides (for instance, compare MFO-Cas9-O [heiCas9] vs. OMF-Cas9-O and MSF-Cas9-S vs. SMF-Cas9-S in Figure 1-figure supplement 1), position of the particular tags relative to each other conveyed different genome editing efficiencies.
The hei-tag renders the resulting heiCas9 into a highly efficient endonuclease with broad applicability, overcoming the limitations of current SpCas9 variants by dramatically increasing the efficiency of targeted genome editing in organismo, as demonstrated in two evolutionarily distant fish models, as well as in mouse cell culture. In those systems, heiCas9 leads to a high abundance of identical mutant alleles, important for testing specific hypotheses or introducing site-specific modifications by homology-directed repair (Gutierrez-Triana et al., 2018). Conversely, Cas9 variants without the hei-tag are better suited for targeted screening approaches since they introduce a large number of different mutant alleles. heiCas9 markedly increased the (bi-allelic) targeting rate alongside a decrease in allele variance, indicating a high targeting efficiency already at the earliest stages of development. Previously, such early targeting in developing organisms was reported mostly using RNPs (Kroll et al., 2021; Wu et al., 2018), yet mRNA injection of heiCas9 is fully comparable to these protein approaches. The benefits of using mRNA over protein are apparent: new Cas9 variants can easily be generated and produced cost-efficiently by highly reproducible in vitro transcription, a standard method in molecular biology labs.
In light of the ever-expanding CRISPR tool kit, the addition of the hei-tag provides the means to boost current specialized and future variants, as the simple addition of the hei-tag sequence also potentiated the activity of a cytosine base editor, with heiBE4-Gam resulting in an overall increase of about 30 percentage points in C-to-T transition rates (Figure 4 and Figure 4-figure supplement 1). Taken together, the boosting activity of the hei-tag is limited neither by the species nor by the approach, making it a powerful tweak to swiftly upgrade any specifically adapted Cas-based genome editing approach (Anzalone et al., 2020).

Fish maintenance
Zebrafish (D. rerio) and medaka (O. latipes) fish were bred and maintained as previously described (Koster et al., 1997;Westerfield, 2000). The animal strains used in the present study were zebrafish AB/back and medaka Cab. All experimental procedures were performed according to the guidelines of the German animal welfare law and approved by the local government (Tierschutzgesetz §11, Abs. 1, Nr. 1, husbandry permit number 35-9185.64/BH Wittbrodt).
Embryos were screened for GFP or mCherry expression 4-7 hr or 1 day after injection using a Nikon SMZ18 stereomicroscope, and uninjected specimens were discarded.

Image acquisition and phenotype analysis
Medaka 4.5 dpf embryos (Iwamatsu, 2004) and zebrafish 2.5 dpf embryos (Kimmel et al., 1995) were fixed with 4% paraformaldehyde in 2× PTW (2× PBS pH 7.3, 0.1% Tween 20). Images of medaka embryos were acquired with the high content screening ACQUIFER Imaging Machine (DITABIS AG, Pforzheim, Germany). Images of zebrafish embryos were acquired with a Nikon digital sight DS-Ri1 camera mounted onto a Nikon Microscope SMZ18 and the Nikon Software NIS-Elements F version 4.0. Only properly developed embryos were included in the following analysis. Image analysis was performed with Fiji (Schindelin et al., 2012): minimum intensity projections were locally thresholded (Phansalkar algorithm with parameters r = 20, p = 0.4, k = 0.4), and mean gray values were obtained from an elliptical selection for each individual eye. The mean gray value per eye was used for the boxplot and statistical analysis (pairwise comparisons using Wilcoxon rank sum test, Bonferroni corrected) in RStudio (Team, 2020).
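The per-eye readout described above is the mean gray value inside an elliptical selection of a thresholded image. The Python sketch below is a minimal, library-free illustration of that single step, not the Fiji workflow used in the study; the function name, the nested-list image representation, and the toy image are assumptions made for this example, and the local thresholding itself is not reimplemented.

```python
def ellipse_mean_gray(image, cx, cy, rx, ry):
    """Mean gray value of all pixels whose centers lie inside the ellipse
    centered at (cx, cy) with semi-axes rx (x direction) and ry (y direction).

    `image` is a list of rows, each a list of gray values (0-255)."""
    total = 0
    count = 0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if ((x - cx) / rx) ** 2 + ((y - cy) / ry) ** 2 <= 1.0:
                total += value
                count += 1
    if count == 0:
        raise ValueError("selection contains no pixels")
    return total / count


# Toy thresholded image: left half pigmented (0), right half unpigmented (255).
toy_eye = [[0, 0, 255, 255] for _ in range(4)]

# An ellipse large enough to cover every pixel of the toy image.
half_pigmented = ellipse_mean_gray(toy_eye, cx=1.5, cy=1.5, rx=3.0, ry=3.0)
print(half_pigmented)
```

On a binary thresholded image, this value scales linearly with the unpigmented fraction of the selection, from 0 (fully pigmented) to 255 (completely unpigmented).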

Targeted amplicon sequencing via Illumina
The multiplex approach was genotyped on DNA extractions of pools with each replicate containing eight randomly picked crispants per zCas9 or heiCas9 injection or six control specimens. DNA was prepared by grinding and lysis in DNA extraction buffer (0.4 M Tris/HCl pH 8.0, 0.15 M NaCl, 0.1% SDS, 5 mM EDTA, pH 8.0, 1 mg/ml proteinase K) at 60°C overnight. Proteinase K was inactivated at
Table 3. Locus-specific primers with 5' partial Illumina adapter sequences.
Locus-specific primers with Illumina adapter sequence underscored.

Genotyping of editants
Genotyping was performed on DNA extractions (see above) of three replicates containing five randomly picked editants each of BE4-Gam and heiBE4-Gam injections. Q5 polymerase (NEB) and primers fwd 5'-GTTAAAACAGTTTCTTAAAAAGAACAGGA-3' and rev 5'-AGCAGAAGAAATGACTCAACATTTTG-3' (annealing at 62°C) were used on 1 µl of diluted DNA sample according to the manufacturer's instructions with 30 PCR cycles. PCR products were analyzed on a 1% agarose gel, bands were excised, and DNA was extracted using the innuPREP Gel Extraction Kit (Analytik Jena) according to the manufacturer's instructions and subjected to Sanger sequencing (see below).

Sanger sequencing
Sanger sequencing was performed by Eurofins Genomics using fwd 5'-GTTAAAACAGTTTCTTAAAAAGAACAGGA-3' to evaluate base editing at the OlOca2 T1 target site and using fwd 5'-GAGACACTCACTCCAGACCC-3' and rev 5'-ACTCAGTAACCCAACAGCCA-3' to evaluate genome editing of the Prx locus in SW10 cells. Quantification of base editing from Sanger sequencing reads was performed with EditR (Kluesner et al., 2018). Genome editing efficiency was assessed by sequence analysis using the TIDE web tool (Brinkman et al., 2014) and by ICE (Hsiau et al., 2018) using default parameters and an indel size range of up to 30 bp.

Additional files
Supplementary files
• Supplementary file 1. Nucleotide and translated amino acid sequence of heiCas9.

• Transparent reporting form
Data availability
All data generated or analysed during this study are included in the manuscript and supporting files.