Endogenous protein tagging in medaka using a simplified CRISPR/Cas9 knock-in approach
Abstract
The CRISPR/Cas9 system has been used to generate fluorescently labelled fusion proteins by homology-directed repair in a variety of species. Despite its revolutionary success, there remains an urgent need for increased simplicity and efficiency of genome editing in research organisms. Here, we establish a simplified, highly efficient, and precise strategy for CRISPR/Cas9-mediated endogenous protein tagging in medaka (Oryzias latipes). We use a cloning-free approach that relies on PCR-amplified donor fragments containing the fluorescent reporter sequences flanked by short homology arms (30–40 bp), a synthetic single-guide RNA and Cas9 mRNA. We generate eight novel knock-in lines with high efficiency of F0 targeting and germline transmission. Whole genome sequencing results reveal single-copy integration events only at the targeted loci. We provide an initial characterization of these fusion protein lines, significantly expanding the repertoire of genetic tools available in medaka. In particular, we show that the mScarlet-pcna line has the potential to serve as an organismal-wide label for proliferative zones and an endogenous cell cycle reporter.
Introduction
The advent of gene editing tools (Wang et al., 2016; Jinek et al., 2012; Cong et al., 2013) in conjunction with the expansion of sequenced genomes and engineered fluorescent proteins (Chudakov et al., 2010; Shaner et al., 2013; Bindels et al., 2017; Campbell et al., 2020) has revolutionized the ability to generate endogenous fusion protein knock-in (KI) lines in a growing number of organisms (Paix et al., 2015; Paix et al., 2017a; Gratz et al., 2014; Kanca et al., 2019; Wierson et al., 2020; Gutierrez-Triana et al., 2018; Auer and Del Bene, 2014; Yoshimi et al., 2016; Yao et al., 2017; Cong et al., 2013; Dickinson et al., 2015; Leonetti et al., 2016; Wierson et al., 2019). These molecular markers expressed at physiological levels are central to our understanding of cellular- and tissue-level dynamics during embryonic development (Gibson et al., 2013). To this end, researchers have utilized the Streptococcus pyogenes CRISPR-associated protein 9 (Cas9) and a programmed associated single-guide RNA (sgRNA) to introduce a double-strand break (DSB) at a pre-defined genomic location (Jinek et al., 2012). The DSB triggers cellular DNA repair mechanisms, and it has been shown that providing DNA repair donors with homology arms that match the sequences flanking the targeted locus can lead to integration of donor constructs containing fluorescent reporter sequences into the genome by homology-directed repair (HDR) (Danner et al., 2017; Jasin and Haber, 2016; Ceccaldi et al., 2016; Lisby and Rothstein, 2004). Despite its success, HDR-mediated precise single-copy KI efficiencies in vertebrate models can still be low, and the process of generating KI lines remains cumbersome and time-consuming.
Recent reports have improved the methodology by the usage of 5′ biotinylated long homology arms that prevent concatemerization of the injected dsDNA (Gutierrez-Triana et al., 2018) or by linking the repair donor to the Cas9 protein (Gu et al., 2018; Savic et al., 2018; Carlson-Stevermer et al., 2017; Aird et al., 2018). In addition, repair donors with shorter homology arms in combination with in vivo linearization of the donor plasmid have been shown to mediate efficient knock-ins in zebrafish and in mammalian cells (Wierson et al., 2020; Hisano et al., 2015; Cristea et al., 2013; Yao et al., 2017).
In this work, we establish a simplified, highly efficient, and precise strategy for CRISPR/Cas9-mediated endogenous protein tagging in medaka (Oryzias latipes). Our approach relies on the use of biotinylated PCR-amplified donor fragments that contain the fluorescent reporter sequences flanked by short homology arms (30–40 bp), by-passing the need for cloning or in vivo linearization. We use this approach to generate and characterize a series of novel KI lines in medaka fish (Supplementary files 3a and 4). By utilizing whole genome sequencing (WGS) with high coverage in conjunction with Sanger sequencing of edited loci, we provide strong evidence for precise single-copy integration events only at the desired loci. In addition to generating an endogenous ubiquitous nuclear label and novel tissue-specific reporters, the KI lines allow us to record cellular processes, such as intra-cellular trafficking and stress granule formation in 4D during embryonic development, significantly expanding the genetic toolkit available in medaka. Finally, we provide proof-of-principle evidence that the endogenous mScarlet-pcna KI we generate serves as a bona fide proliferative cell label and an endogenous cell cycle reporter, with broad application potential in a vertebrate model system.
Results
A simplified, highly efficient strategy for CRISPR/Cas9-mediated fluorescent protein knock-ins in medaka
To simplify the process of generating fluorescent protein knock-ins in medaka we utilized PCR-amplified dsDNA repair donors with short homology arms (30–40 bp). In addition, biotinylated 5′ ends were used to prevent in vivo concatemerization of DNA (Gutierrez-Triana et al., 2018). We used a streptavidin-tagged Cas9 (Cas9-mSA), with the goal of enhancing its binding to the biotinylated repair donor constructs (Gu et al., 2018). This approach by-passes the need for cloning, as the short homology arms are added during PCR amplification. Also, because a linear PCR repair donor is used, there is no need for a second gRNA for in vivo plasmid linearization (Hoshijima et al., 2016; Shin et al., 2014; Zu et al., 2013; Yao et al., 2017; Cristea et al., 2013; Auer et al., 2014; Hisano et al., 2015; Li et al., 2019; Wierson et al., 2020; Kimura et al., 2014). The three-component mix (biotinylated PCR-amplified dsDNA donors, synthetic sgRNA, and Cas9-mSA mRNA; Supplementary file 3b-e) was injected into one-cell-stage medaka embryos (Figure 1 and Figure 1—figure supplement 1); for a detailed protocol, see Supplementary files 1 and 2. We targeted eight genes with a variety of fluorescent proteins (Figure 1 and Figure 1—figure supplement 1, Supplementary files 3a-d and 4), attempting both N- and C-terminal tags (a list of all genomic loci targeted can be found in Supplementary file 3a). Targeting efficiency in F0 ranged from 11% to 59% of embryos showing mosaic expression (Supplementary files 3a and 4). Control injections with the actb sgRNA, Cas9-mSA mRNA, and the donor eGFP construct without homology arms showed no evidence of eGFP-positive cell clones in F0 (Supplementary file 3a), while the same construct with homology arms resulted in 39% of surviving injected embryos showing mosaic expression of eGFP (Supplementary files 3a and 4).
The germline transmission efficiency of fluorescent F0 fish ranged from 25% to 100% for the different targeted loci (Supplementary files 3a and 4). For F0 adults with germline transmission, the percentage of positive F1 embryos ranged between 6.6% (2/30) and 50% (25/50). Using this method, we were able to establish eight stable KI lines. Importantly, a single injection round was sufficient to generate a KI line for most targeted loci (7/8; Supplementary files 3a-f and 4). As previously reported, the actb-eGFP tag was embryonic lethal (Gutierrez-Triana et al., 2018) and we could not obtain a KI line for that locus. We also performed an initial comparison (using fluorescent screening in F0) between different Cas9 designs, that is, with and without mSA. Our results indicate comparable efficiencies of KI insertions in F0s, irrespective of whether a streptavidin tag was included (Supplementary file 3g). Combined, our results provide evidence that highly efficient targeting of endogenous loci with large inserts (~800 bp) is obtained in medaka using the simplified KI approach presented here (Figure 1 and Figure 1—figure supplement 1). In addition to being highly efficient, this protocol is rapid and simple-to-implement, as it relies on a PCR-amplified repair construct and hence alleviates the need for any additional cloning or in vivo plasmid linearization (Figure 1—figure supplement 1).
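The way short homology arms are added to a donor during PCR can be illustrated with a minimal sketch. It assumes that each primer consists of a 30–40 nt homology arm followed by roughly 20 nt that anneal to the reporter template, with no linker between arm and reporter. All sequences below are made-up placeholders, and the function names and the 20 nt annealing length are illustrative choices rather than the design used in this study (see Supplementary files 1 and 2 for the actual protocol). The 5′ biotin is a synthesis modification and is not encoded in the sequence.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def design_donor_primers(left_arm: str, right_arm: str, insert: str,
                         anneal_len: int = 20) -> tuple[str, str]:
    """Build forward/reverse primers that append homology arms to an insert.

    left_arm / right_arm: genomic sequence immediately flanking the
    insertion site (top strand, 5'->3'); insert: reporter sequence.
    """
    for arm in (left_arm, right_arm):
        if not 30 <= len(arm) <= 40:
            raise ValueError("homology arms are expected to be 30-40 nt")
    forward = left_arm + insert[:anneal_len]
    # The reverse primer is written 5'->3' on the bottom strand.
    reverse = reverse_complement(insert[-anneal_len:] + right_arm)
    return forward, reverse


def expected_donor(left_arm: str, right_arm: str, insert: str) -> str:
    """Top-strand sequence of the PCR product: arm + reporter + arm."""
    return left_arm + insert + right_arm


# Placeholder sequences, NOT real medaka or reporter sequences.
left_arm = "ACGT" * 8 + "ACG"   # 35 nt
right_arm = "TTGCA" * 7         # 35 nt
insert = "GATTACA" * 10         # stands in for a ~700-800 bp reporter

forward, reverse = design_donor_primers(left_arm, right_arm, insert)
donor = expected_donor(left_arm, right_arm, insert)
print(len(forward), len(reverse), len(donor))
```

In this layout the only locus-specific reagents are the two primers, so the same reporter template can be reused for every target.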
Precise, single-copy knock-ins of fluorescent protein reporters
We next assessed the specificity and precision of the approach. It is possible that either concatemerization of inserts or off-target integrations could occur after foreign DNA delivery and CRISPR/Cas9-mediated DSBs (Gutierrez-Triana et al., 2018; Doench et al., 2016; Fu et al., 2013; Paix et al., 2017a; Won and Dawid, 2017; Yan et al., 2013; Hackett et al., 2007; Wierson et al., 2020; Auer et al., 2014; Shin et al., 2014; Winkler et al., 1991; Hoshijima et al., 2016; Kimura et al., 2014). To identify off-target insertions genome wide and verify single-copy integration we performed next generation WGS with high coverage (for details of WGS, see Materials and methods) on three KI lines (Figure 1B–D): eGFP-cbx1b, mScarlet-pcna, and mNeonGreen-myosinhc. For the eGFP-cbx1b KI line, we could only identify paired-end eGFP reads anchored to the endogenous cbx1b locus and nowhere else in the genome (Figure 1—figure supplement 2A). Likewise, in the mScarlet-pcna line, mScarlet reads only mapped to the endogenous pcna locus (Figure 1—figure supplement 2B). For the mNeonGreen-myosinhc line, mNeonGreen sequences mapped to the myosinhc locus (Figure 1—figure supplement 2C), but paired-end analysis yielded a second, weakly supported partial insertion of mNeonGreen at an intronic region in the edf1 gene. We were not able to confirm the latter insertion by subsequent PCR and hence it remains unclear whether this is a false-positive prediction or a rare insertion of very low frequency. Combined, the WGS results therefore provide strong evidence that the method we report results in single-copy insertions only at the targeted locus. In addition to WGS, genotyping F1 adults followed by Sanger sequencing confirmed the generation of single-copy in-frame fusion proteins in the eGFP-cbx1b, mGreenLantern-cbx1b, mScarlet-pcna, mNeonGreen-myosinhc, cdh2-eGFP, mapre1b-mScarlet, and eGFP-rab11a lines (Figure 1—figure supplement 3, Supplementary file 3f, and Materials and methods).
Only 2/14 Sanger-sequenced junctions showed evidence of imprecise repair following HDR: one was a partial duplication within the 5′ homology arm, 22 base pairs upstream of the start codon, and the other was a partial duplication within the 3′ homology arm, four base pairs after the stop codon (for details see Materials and methods). In both cases, the coding sequence of the targeted gene is unaffected. Overall, the method we present here shows high precision and specificity, enabling the rapid generation of endogenously tagged alleles in a vertebrate model.
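The logic of anchoring paired-end reporter reads to the genome can be sketched as follows. This is a toy illustration, not the analysis pipeline used for the WGS data (which is described in Materials and methods). It assumes that reads were aligned to a reference containing the reporter as an extra contig, and that each read pair is reduced to two (contig, position) tuples. The contig name `chr_target`, the coordinates, and the 10 kb bin size are hypothetical.

```python
from collections import Counter


def tally_reporter_anchors(read_pairs, reporter="eGFP", bin_size=10_000):
    """Count read pairs with one mate on the reporter and one on the genome.

    Returns a Counter keyed by (genomic contig, position bin). A single
    on-target integration is expected to give one well-supported bin;
    additional bins would flag candidate off-target insertions.
    """
    anchors = Counter()
    for (contig_a, pos_a), (contig_b, pos_b) in read_pairs:
        if contig_a == reporter and contig_b != reporter:
            anchors[(contig_b, pos_b // bin_size)] += 1
        elif contig_b == reporter and contig_a != reporter:
            anchors[(contig_a, pos_a // bin_size)] += 1
    return anchors


# Hypothetical read pairs: ((contig, position), (contig, position)).
read_pairs = [
    (("eGFP", 120), ("chr_target", 1_204_310)),
    (("chr_target", 1_204_950), ("eGFP", 610)),
    (("chr_target", 1_203_800), ("chr_target", 1_204_100)),  # ordinary genomic pair
    (("eGFP", 300), ("eGFP", 520)),                          # both mates in reporter
]

anchors = tally_reporter_anchors(read_pairs)
print(anchors)
```

A weakly supported second bin, as in the edf1 case described above, would appear here as an extra key with a low count and would still need independent confirmation, for example by PCR.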
Visualization of endogenous protein dynamics enables in vivo recording of cellular processes in medaka
As a proof of principle, we employed the simplified CRISPR/Cas9 strategy to generate a series of endogenous fusion protein KI medaka lines (Supplementary files 3a-f and 4; Figures 1 and 2, and Figure 2—figure supplements 2–4). Here, we provide an initial characterization of eight of these novel KI lines that are made available to the community, to label cell compartments (nucleus), cell processes (cell cycle, intra-cellular trafficking, stress granule formation), cell adhesion (adherens junctions), microtubules (plus-ends), and specific cell types (muscle cells).
Ubiquitous nuclear marker
To generate a ubiquitously expressed nuclear label reporter line, we targeted the cbx1b (Chromobox protein homolog) locus with eGFP and mGreenLantern (mGL). Cbx1b is a member of the chromobox DNA-binding protein family and is a known component of heterochromatin that is expressed ubiquitously (Lomberk et al., 2006; Nielsen et al., 2001). Chromobox proteins are involved in several important functions within the nucleus, such as transcription, nuclear architecture, and DNA damage response (Luijsterburg et al., 2009; Gilmore et al., 2016). We generated two KI lines eGFP-cbx1b and mGL-cbx1b by targeting either eGFP or mGL to the N-terminus of the cbx1b coding sequence in medaka. The resulting lines express the fluorescent reporter in all nuclei of every tissue examined, and serve as endogenous ubiquitous nuclear labels in teleosts (Figure 1B and Figure 2A, Figure 2—figure supplements 1 and 2, and Video 1, n > 10 embryos).
Proliferative cell marker
With the goal of generating an endogenous cell cycle reporter, we targeted the pcna (proliferating cell nuclear antigen) locus to generate a mScarlet-pcna fusion protein. Pcna is an essential protein regulator of DNA replication and integrity in eukaryotic cells (Moldovan et al., 2007; Maga and Hubscher, 2003; Mailand et al., 2013). It has been previously shown that cells that exit the cell cycle, for example post-mitotic differentiated cell types, express very low levels of Pcna (Zerjatke et al., 2017; Thacker et al., 2003; Yamaguchi et al., 1995; Buttitta et al., 2010; Alunni et al., 2010). This has led researchers to utilize Pcna as a highly conserved marker for proliferating cells (Zerjatke et al., 2017; Barr et al., 2016; Leonhardt et al., 2000; Leung et al., 2011; Piwko et al., 2010; Alunni et al., 2010; Thacker et al., 2003; Santos et al., 2015; Held et al., 2010). In addition to being a specific label for cycling cells, the appearance of nuclear speckles of Pcna within the nucleus is a hallmark of cells in late S phase of the cell cycle (Zerjatke et al., 2017; Barr et al., 2016; Leonhardt et al., 2000; Leung et al., 2011; Piwko et al., 2010; Santos et al., 2015; Held et al., 2010). More recently, endogenously tagged Pcna has been used in mammalian cell lines to dynamically score all the different cell cycle phases (Zerjatke et al., 2017). We targeted the first exon of pcna with mScarlet with high efficiency (28% mosaic expression in F0s, and 50% germline transmission) and generated the mScarlet-pcna KI line (Figure 1C and Figure 2B). Using stage 40 medaka embryos, we detected mScarlet-Pcna-positive cells within the epidermis, specifically in supra-basal epidermal cells (Figure 2B, n = 10 embryos). A subset of these cells showed nuclear speckles of mScarlet-Pcna that likely represent replication foci and are a characteristic marker for late S phase (Figure 2B, yellow arrowheads). 
We validate the use of this line both as an organismal-wide label for proliferative zones, and an endogenous cell cycle reporter in later sections.
Intra-cellular trafficking
To generate a reporter line for monitoring the subcellular trafficking of endosomes and exosomes, we targeted Rab11a (Ras-Related Protein), a small GTPase and known marker of intra-cellular trafficking organelles in vertebrates (Welz et al., 2014; Cullen and Steinberg, 2018; Stenmark, 2009). We generated an N-terminus tagged eGFP-rab11a fusion protein that shows punctate intra-cellular signal most likely corresponding to trafficking organelles (Figure 2C, Figure 2—figure supplement 3, and Videos 2–4, n = 4 embryos). As a proof of principle, we detected high levels of eGFP-rab11a in cells of the spinal cord (Figure 2C, yellow arrowhead) and in neuromasts of the lateral line (Figure 2C, magenta arrowhead, Figure 2—figure supplement 3, and Video 4, n = 6 embryos). Using the eGFP-rab11a KI line, we were also able to observe what appears to be intra-cellular organelle trafficking in vivo, both in individual skin epithelial cells in the mid-trunk region and in the caudal fin region of developing medaka embryos (Videos 2 and 3, n = 4 embryos), providing initial evidence of the utility of this line as a possible subcellular trafficking marker in medaka.
Stress granule marker
We were able to generate a g3bp1-eGFP KI line by targeting eGFP to the 11th exon of the medaka g3bp1 gene. G3bp1 (GTPase activating protein SH3-domain-binding protein) is a DNA/RNA-binding protein and an initiating factor involved in stress granule formation (Irvine et al., 2004; Yang et al., 2020). Stress granules are non-membrane bound cell compartments, which form under cellular stress and accumulate non-translating mRNA and protein complexes, and play an important role in cellular protection by regulating mRNA translation and stability (Decker and Parker, 2012; Protter and Parker, 2016). Under normal conditions G3bp1-eGFP is expressed in the cytoplasm (Figure 2F, Video 5, n = 8 embryos) but upon stress (temperature shock), we observe that the protein changes its localization and accumulates in cytoplasmic foci corresponding to forming stress granules (Figure 2F’, yellow arrowheads, Video 5, n = 8 embryos). This is consistent with previous reports showing similar changes in the localization of G3bp1 in response to stress in a number of organisms (Guarino et al., 2019; Wheeler et al., 2016; Kuo et al., 2020). The initial characterization of the g3bp1-eGFP line shows its potential to serve as a real-time in vivo reporter for the dynamics of stress granule formation in a vertebrate model.
Muscle cell marker
To label muscle cells, we targeted muscular myosin heavy chain with mNeonGreen (mNG). Myosins are a highly conserved class of motor proteins implicated in actin microfilament reorganization and movement (Sellers, 2000; Hartman and Spudich, 2012). We generated an N-terminus fusion of mNG-myosinhc KI that exclusively labels muscle cells (Figure 1D and Figure 2D and Figure 2—figure supplement 4, n > 10 embryos). In the medaka myotome, we were able to observe mNG-myosinhc chains of individual sarcomeres (A-bands separated by the I-bands), indicating that tagged Myosinhc is incorporated correctly in muscle fibers (Taylor et al., 2015; Loison et al., 2018). We use this line to record, for the first time in a vertebrate model to the best of our knowledge, the endogenous dynamics of Myosinhc during muscle growth in vivo (Video 6, n = 9 embryos). The mNG-myosinhc line therefore enables the in vivo recording of endogenous Myosinhc dynamics during myogenesis in medaka.
Cell adhesion marker
Cadherins are a highly conserved class of transmembrane proteins that are essential components of cell–cell adhesion and are thus expressed on cellular membranes (Leckband and de Rooij, 2014). A large number of cadherin genes exist in vertebrates where they exhibit tissue-specific expression patterns and are implicated in various developmental processes (Halbleib and Nelson, 2006). We decided to tag the C-terminus of medaka cadherin 2 (cdh2, n-cadherin) with eGFP. cdh2 is known to be expressed primarily in neuronal tissues in a number of vertebrates (Harrington et al., 2007; Suzuki and Takeichi, 2008). The cdh2-eGFP KI line shows cellular membrane expression in a variety of neuronal and non-neuronal tissues including the spinal cord, the eye, and the notochord (Figure 2E, n = 5 embryos) and neuromasts of the lateral line (Figure 2—figure supplement 3, n = 5 embryos, Video 7, n = 3 embryos), in addition to the developing heart (data not shown) (Chopra et al., 2011). The high expression of cdh2 in both differentiated notochord cell types (Figure 2E, Figure 2—figure supplement 3) has not been previously reported in medaka but is not unexpected as this tissue experiences a high level of mechanical stress and requires strong cell–cell adhesion (Lim et al., 2017; Adams et al., 1990; Garcia et al., 2017; Seleit et al., 2020). The cdh2-eGFP KI can be used to study dynamics of n-cadherin distribution in vivo during vertebrate embryogenesis (Video 8, n = 2 embryos).
Microtubule marker
Microtubule plus-end binding proteins are conserved regulators of microtubule dynamics, acting as a scaffold to recruit several additional proteins to ensure essential cell functions such as cell polarity, intra-cellular transport, and mitosis (Nehlig et al., 2017; Tirnauer and Bierer, 2000; Galjart, 2010). We successfully targeted the microtubule plus-end binding protein mapre1b (eb1), generating a C-terminal fusion protein with mScarlet. mapre1b-mScarlet is widely expressed in medaka embryos: epithelial cells, muscle cells, the notochord, and neuromasts all show mapre1b-mScarlet expression (Figure 2G, Video 9, n = 5 embryos), and the highest level of expression occurs in the spinal cord (Figure 2G, yellow arrowhead). We were also able to record microtubule dynamics in the spinal cord of living embryos (Video 10, n = 5 embryos), highlighting the utility of this line for exploring the dynamics of microtubules in vivo during development.
mScarlet-pcna: an organismal-wide marker for proliferative zones
We reasoned that the novel mScarlet-pcna line can act as an organismal-wide bona fide marker for the location of proliferative cells within any tissue or organ of interest. We therefore decided to generate double transgenic animals with eGFP-cbx1b as a ubiquitous nuclear marker and mScarlet-pcna as a label for cycling cells (Figure 3). As a proof of principle, we set out to investigate the location of proliferative zones in a number of organs and tissues in medaka. We began by assessing the position of proliferative cells in neuromast organs of the lateral line (Seleit et al., 2017b; Pinto-Teixeira et al., 2015; Romero-Carvajal et al., 2015). Neuromasts are small rosette shaped sensory organs located on the surface of teleost fish that sense the direction of water flow and relay the information back to the central nervous system (Seleit et al., 2017a; Romero-Carvajal et al., 2015; Jones and Corwin, 1993; Wada et al., 2013). They consist of four cell types: differentiated hair cells (HCs) in the very centre, underlying support cells (SCs), a ring of mantle cells (MCs), and neuromast border cells (nBCs) (Seleit et al., 2017b; Dufourcq et al., 2006). Previous work in medaka has established MCs to be the true life-long neural stem cells within mature neuromast organs (Seleit et al., 2017b). While the eGFP-cbx1b labels all neural cells within a mature neuromast organ (HCs, SCs, and MCs) (Figure 3A), mScarlet-pcna expression matches the previously reported location of proliferative MCs (Seleit et al., 2017b; Figure 3A’–A’’, white arrowhead). Neither the differentiated HCs nor the SCs directly surrounding them show evidence of Pcna expression in mature neuromast organs under homeostatic conditions in medaka (Figure 3A–A’’, n = 10 neuromast organs). Our results validate the utility of mScarlet-pcna as an in vivo marker of proliferative cells. 
Previous work has shown that nBCs are induced to form from epithelial cells that come into contact with neuromast precursors during organ formation and that these induced cells become the stem cell niche of mature neuromast organs (Seleit et al., 2017b). However, an open question is whether transformed nBCs are differentiated, post-mitotic cells or whether they remain cycling. Utilizing the mScarlet-pcna line we were able to observe nBCs (4/42) in late S phase of the cell cycle, as evidenced by the presence of nuclear speckles, in mature neuromast organs (Figure 3A’–A’’, yellow arrowheads). This provides direct evidence that nBCs retain the ability to divide and are thus not post-mitotic cells.
Next, we turned our attention to the optic tectum, which is essential for integrating visuomotor cues in all vertebrates (Lavker and Sun, 2003; Alunni et al., 2010; Nguyen et al., 1999). We show that proliferative cells in the optic tectum of medaka are located at the lateral, caudal, and medial edge of the tectum in a crescent-like topology (Figure 3B–B’’, n = 4 embryos). Moreover, mScarlet-pcna expression is graded, with the more central cells gradually losing expression of Pcna (Figure 3B’). This is in line with previous histological findings using BrdU/IdU stainings in similarly staged medaka embryos (Nguyen et al., 1999; Alunni et al., 2010). We next analyzed the expression of mScarlet-pcna in the developing pectoral fin (Figure 3C–C’’, n = 4 embryos). We found that cells located proximally expressed the highest levels of mScarlet-pcna, with mScarlet-pcna expression decreasing gradually along the proximo-distal axis (Figure 3C’). To the best of our knowledge this proliferation pattern has not been previously reported and our data provide evidence that the differentiation axis of the pectoral fin is spatially organized from proximal to distal in medaka. Lastly, we reveal that proliferative cells are present in the spinal cord of stage 40 medaka embryos, a finding that has not been previously reported, and we show that these mScarlet-Pcna-positive cells occur in clusters preferentially located on the dorsal side of the spine (Figure 3—figure supplement 1, n = 4 embryos). The newly developed mScarlet-pcna line therefore acts as a stable label of proliferative cells and as such can be used to uncover the location of proliferation zones in vivo within organs or tissues of interest in medaka.
mScarlet-pcna: an endogenous cell cycle reporter
In addition to its use as a marker for cells in S phase, it has been shown that endogenously tagged Pcna can be used to determine all other cell cycle phases. This is based on the fact that both the levels and dynamic distribution of Pcna show reproducible characteristics in each phase of the cell cycle (Held et al., 2010; Piwko et al., 2010; Santos et al., 2015; Zerjatke et al., 2017; Leonhardt et al., 2000; Leung et al., 2011). To assess whether the endogenous mScarlet-pcna line recapitulates these known characteristic expression features during the cell cycle, we aimed to quantitatively analyze endogenous mScarlet-Pcna levels in individual cells during their cell cycle progression. To this end, we imaged skin epithelial cells located in the mid-trunk region of medaka embryos (Figure 4, Figure 4—figure supplements 1 and 2). Cells in the G1 phase of the cell cycle have been shown to decrease the levels of Pcna within the nucleus over time (Figure 4—figure supplement 1, Video 11, n = 9 epithelial cells) (Zerjatke et al., 2017). On the other hand, cells progressing through to S phase have been shown to increase the levels of Pcna expression within the nucleus over time (Leonhardt et al., 2000; Piwko et al., 2010; Leung et al., 2011; Santos et al., 2015; Barr et al., 2016; Zerjatke et al., 2017; Held et al., 2010). Indeed, we found that all tracked epithelial cells that eventually underwent a cellular division showed an increase in nuclear intensity of mScarlet-Pcna prior to the appearance of nuclear speckles (Figure 4A–B’’, Figure 4—figure supplement 1, Video 12, n = 9 epithelial cells). Nuclear speckles of Pcna mark the presence of replication foci in late S phase of the cell cycle (Leonhardt et al., 2000; Piwko et al., 2010; Leung et al., 2011; Barr et al., 2016; Zerjatke et al., 2017; Held et al., 2010). 
Previous work has also shown that the S/G2 transition can be identified as the point of peak pixel intensity distribution of endogenous Pcna within the nucleus (Zerjatke et al., 2017). We were able to determine the peak pixel intensity distribution within nuclei by a combination of 3D surface plots and histograms of pixel intensity distributions over time (Figure 4B–D, Figure 4—figure supplements 2 and 3, and Videos 12–14, n = 9 epithelial cells). Finally, onset of M phase is marked by a sharp decrease in nuclear levels of Pcna (Zerjatke et al., 2017; Leung et al., 2011; Piwko et al., 2010; Held et al., 2010), which we could consistently detect in the endogenous mScarlet-Pcna intensity tracks of epithelial cells undergoing division (Figure 4A–C, Figure 4—figure supplements 1 and 2, and Videos 12–14, n = 9 epithelial cells). We therefore provide initial evidence that the mScarlet-pcna line recapitulates known dynamics of Pcna within the nucleus (Held et al., 2010; Piwko et al., 2010; Santos et al., 2015; Zerjatke et al., 2017; Barr et al., 2016; Leonhardt et al., 2000; Leung et al., 2011) and that it can therefore be utilized as an endogenous ‘all-in one’ cell cycle reporter in vertebrates.
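Two of the features described in this section, the rise in nuclear mScarlet-Pcna intensity before division and the sharp drop at the onset of M phase, lend themselves to a simple illustration on a single-cell intensity track. The sketch below is a toy example under stated assumptions and is not the quantification used for Figure 4. It assumes one mean nuclear intensity value per time frame, takes the largest frame-to-frame decrease as a proxy for M-phase onset, and uses made-up intensity values in arbitrary units. It does not attempt to locate the S/G2 transition, which relies on the pixel intensity distribution within the nucleus.

```python
def m_phase_onset(track):
    """Return the index of the frame at which the largest
    frame-to-frame drop in nuclear intensity lands."""
    drops = [track[i] - track[i + 1] for i in range(len(track) - 1)]
    return drops.index(max(drops)) + 1


def rises_before(track, frame):
    """True if the intensity just before `frame` exceeds the starting
    intensity, i.e. the cell accumulated signal before the drop."""
    return track[frame - 1] > track[0]


# Hypothetical mean nuclear intensities per frame (arbitrary units).
track = [100, 110, 125, 140, 160, 175, 60, 55]

onset = m_phase_onset(track)
print(onset, rises_before(track, onset))
```

On real tracks, noise and photobleaching would likely call for smoothing and a threshold on the size of the drop, which this sketch omits.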
Discussion
Despite the CRISPR/Cas9 system being repurposed as a broad utility genome editing tool almost a decade ago (Jinek et al., 2012; Cong et al., 2013; Wang et al., 2016) and despite its revolutionary impact as a method to generate knock-ins by HDR (Danner et al., 2017; Jasin and Haber, 2016; Ceccaldi et al., 2016; Lisby and Rothstein, 2004), there is still a paucity of precise, single-copy fusion protein lines in vertebrates, in general, and in teleost fish in particular. In fact, in medaka there are a total of three validated single-copy fusion protein lines by CRISPR/Cas9 reported prior to this work (Gutierrez-Triana et al., 2018), and in zebrafish, a handful of lines have been reported so far (Wierson et al., 2020; Hisano et al., 2015; Auer et al., 2014; Kimura et al., 2014; Li et al., 2019). This underscores the complexity of generating and validating precise single-copy fusion protein KI lines in teleost models. Previous techniques to generate large KIs (such as fluorescent reporters) required the usage of plasmid vectors commonly containing long homology arms (>200 bp) (Zu et al., 2013; Shin et al., 2014; Hoshijima et al., 2016; Kimura et al., 2014; Li et al., 2019). Problems arising during and after injection include DNA concatemerization of the donor construct (Gutierrez-Triana et al., 2018; Auer et al., 2014; Winkler et al., 1991; Hoshijima et al., 2016; Shin et al., 2014), in addition to possible imprecise and off-target integration of either the fluorescent protein sequence or the plasmid backbone (Auer et al., 2014; Gutierrez-Triana et al., 2018; Won and Dawid, 2017; Wierson et al., 2020; Shin et al., 2014; Hoshijima et al., 2016; Kimura et al., 2014; Li et al., 2019; Yan et al., 2013; Hackett et al., 2007). The vast majority of reported HDR-mediated knock-ins in teleosts rely on in vivo linearization of the plasmid donors. 
This strategy is utilized due to the observation that, although linear dsDNA donors can drive HDR, they might be prone to degradation and concatemerization, and are generally thought to be more toxic than plasmid donors (Auer et al., 2014; Cristea et al., 2013; Auer and Del Bene, 2014; Hisano et al., 2015; Hoshijima et al., 2016; Wierson et al., 2020; Yao et al., 2017; Winkler et al., 1991). Plasmid donors therefore contain an additional guide RNA sequence to drive in vivo linearization in order to synchronize the availability of the linear DNA donor with Cas9 activity (Auer et al., 2014; Cristea et al., 2013; Hisano et al., 2015; Hoshijima et al., 2016; Kimura et al., 2014; Li et al., 2019; Wierson et al., 2020; Yao et al., 2017). We reasoned that directly injecting PCR-amplified linear DNA with short homology arms (~35 bp) could be highly effective since these donors are relatively small (~780 bp) compared to plasmids (several kb), and therefore a small quantity of donor (~10 ng/µl) will provide a large number of molecules (~20 nM) available to engage the HDR machinery following the Cas9-induced DSB.
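The conversion from mass concentration to molarity behind the ~20 nM figure can be checked with a short calculation. The sketch below assumes the common approximation of about 650 g/mol per base pair of double-stranded DNA. The 5 kb plasmid is a hypothetical comparison, included only to show how donor size affects the number of molecules at equal mass concentration.

```python
# Approximate average mass of one base pair of dsDNA (g/mol); an assumption.
AVG_BP_MASS_G_PER_MOL = 650.0


def dsdna_molarity_nM(conc_ng_per_ul: float, length_bp: int) -> float:
    """Convert a dsDNA mass concentration (ng/µl) to molarity (nM)."""
    grams_per_litre = conc_ng_per_ul * 1e-3  # 1 ng/µl == 1e-3 g/l
    molar_mass = length_bp * AVG_BP_MASS_G_PER_MOL
    return grams_per_litre / molar_mass * 1e9


donor_nM = dsdna_molarity_nM(10, 780)      # PCR donor at ~10 ng/µl
plasmid_nM = dsdna_molarity_nM(10, 5000)   # hypothetical 5 kb plasmid

print(round(donor_nM, 1), round(plasmid_nM, 1))
```

At the same 10 ng/µl, the ~780 bp donor is present at roughly six times the molar concentration of the hypothetical 5 kb plasmid.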
Building on recent improvements in CRISPR/Cas9 KI strategies, we used 5′ biotinylated primers in order to limit in vivo concatemerization of the donor construct (Gutierrez-Triana et al., 2018), and synthetic sgRNAs were used to increase the efficiency of DSBs by Cas9 (Paix et al., 2015; Kroll et al., 2021; Hoshijima et al., 2019). In addition, we utilized a monomeric streptavidin-tagged Cas9 that has a high affinity to the biotinylated donor fragments to increase targeting efficiency (Gu et al., 2018). While 7/8 KI lines were generated and validated using Cas9-mSA and 5′ biotinylated donor fragments, the benefit of using the mSA/biotin system to increase targeting efficiency in F0 needs to be further evaluated: in our initial comparison, using (2xNLS) Cas9 with and without mSA, and repair donors with and without biotinylation, we found comparable lethality and mosaic KI efficiency rates in F0s (Supplementary file 3g). We also tested another Cas9 containing only one NLS and no mSA (Gagnon et al., 2014) and found a lower lethality rate and a lower mosaic KI efficiency in F0s compared to Cas9 with two NLSs (Supplementary file 3g). Irrespective of these considerations, the approach reported here is a highly efficient, precise, and scalable strategy for generating single-copy fusion proteins (Supplementary files 3g and 4). The fact that the repair donors are synthesized by PCR amplification eliminates the need for both cloning and a second guide RNA for in vivo linearization. Therefore, the strategy we utilize significantly simplifies the process of endogenous protein tagging in a vertebrate model.
Very recently, an approach similar to the one we present here showed the potential to generate CRISPR/Cas9-mediated KI lines in zebrafish by targeting non-coding genomic regions with PCR-amplified donor constructs (Levic et al., 2021). Together with our present work in medaka, this supports the notion that donors with short homology arms are sufficient to drive HDR when they are in the form of linear dsDNA. One possible explanation is that following the Cas9-induced DSB, the donor is integrated by synthesis-dependent strand annealing (SDSA). During SDSA, the 3′ ends of chromosomal DNA strands at the DSB can anneal with the repair donor and drive its replication and insertion at the site of the DSB (Lisby and Rothstein, 2004; Danner et al., 2017; Jasin and Haber, 2016; Ceccaldi et al., 2016). Short homologies (30–40 bp) appear to be sufficient to anneal with chromosomal DNA and engage the SDSA machinery (Paix et al., 2017a; Paix et al., 2016; Grzesiuk and Carroll, 1987).
An important aspect of any KI strategy to generate fusion proteins is its precision. The validation of single-copy insertions is complicated in approaches that use long homology arms (>200 bp) to generate knock-ins, as concatemerization (Winkler et al., 1991; Gutierrez-Triana et al., 2018) and the formation of episomes (Aljohani et al., 2020; Wade-Martins et al., 1999; Udvadia and Linney, 2003; Winkler et al., 1991; Wierson et al., 2020) cannot be easily ruled out. Locus genotyping by PCR and Sanger sequencing is difficult when using primers external to the repair donor due to the large size of the expected fragment and competition for amplification with the wild-type allele. Internal primers within the donor (junction PCR) have been used to avoid this limitation, but this can lead to PCR artefacts and, crucially, does not rule out concatemerization of the injected dsDNA (Won and Dawid, 2017; Gutierrez-Triana et al., 2018; Auer et al., 2014; Hoshijima et al., 2016; Shin et al., 2014). Southern blotting is considered the gold standard to assess single-copy integration (Wierson et al., 2020; Gutierrez-Triana et al., 2018; Won and Dawid, 2017; Auer et al., 2014; Shin et al., 2014; Zu et al., 2013). While it has its advantages, Southern blotting depends on experimental design (genomic DNA preparation and digestion strategy) and probe design/sensitivity, and therefore cannot exclude that part of the donor construct or part of the vector backbone integrates elsewhere in the genome. Indeed, it has been reported that plasmid donors can lead to additional unwanted insertions in the genome (Won and Dawid, 2017; Auer et al., 2014; Hoshijima et al., 2016; Kimura et al., 2014; Li et al., 2019; Shin et al., 2014). We address these issues by performing WGS with high coverage on KI lines and provide evidence that our approach yields single-copy integration only at the desired locus.
In addition, utilizing repair donors with short homology arms on both ends (30–40 bp) simplifies the validation of the insertion because primers can be placed outside the targeting donor fragment. These external primers can then be used for genotyping of the full insertion by simple PCR followed by Sanger sequencing to determine the precise nature of the edit. We show that the use of donor fragments with short homology arms, in combination with high-coverage WGS, is an important aspect of validating the precision of single-copy CRISPR/Cas9-mediated KI lines in vertebrate models.
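The reasoning behind external-primer genotyping can be illustrated with a small calculation of expected band sizes. This is a minimal sketch under stated assumptions: the insertion is scarless, so the knock-in amplicon is the wild-type amplicon plus the insert; the 400 bp amplicon, the 714 bp insert, the 30 bp gel-reading tolerance, and both function names are hypothetical and not taken from the study.

```python
def expected_amplicons(wt_amplicon_bp: int, insert_bp: int) -> dict:
    """Band sizes expected from one external primer pair (scarless insertion assumed)."""
    return {"wild_type": wt_amplicon_bp, "knock_in": wt_amplicon_bp + insert_bp}


def genotype_from_bands(bands_bp, expected, tolerance_bp=30):
    """Classify a sample from observed gel band sizes (bp)."""
    def seen(size):
        # A band counts as present if it lies within the reading tolerance.
        return any(abs(band - size) <= tolerance_bp for band in bands_bp)

    has_wt = seen(expected["wild_type"])
    has_ki = seen(expected["knock_in"])
    if has_wt and has_ki:
        return "heterozygous"
    if has_ki:
        return "homozygous knock-in"
    if has_wt:
        return "wild type"
    return "undetermined"


# Hypothetical example: 400 bp wild-type amplicon, 714 bp reporter insert.
expected = expected_amplicons(400, 714)
print(expected, genotype_from_bands([405, 1110], expected))
```

Because both amplicons stay in a size range that a standard PCR covers, the whole insertion, including both junctions, can be amplified and sequenced in one product.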
We were able to generate eight novel endogenous protein fusion lines that significantly expand the repertoire of genetic tools to track cellular dynamics in medaka. The eGFP-cbx1b and mGL-cbx1b KI lines serve as endogenous ubiquitous nuclear markers (Nielsen et al., 2001; Lomberk et al., 2006). The generation of truly ubiquitous lines by transgene overexpression in teleost fish (Centanin et al., 2014; Burket et al., 2008) is a difficult endeavour and requires constant monitoring for variegation and silencing (Goll et al., 2009; Akitake et al., 2011; Burket et al., 2008; Stuart et al., 1990). Yet these ubiquitous fluorescent reporter lines are invaluable tools for researchers. Ubiquitous fusion proteins expressed from the endogenous locus avoid potential issues with transgene overexpression and variegation. The highly conserved cbx1b locus could therefore provide an alternative strategy to generate faithful ubiquitous nuclear markers in other teleosts and non-model organisms. In addition, this locus could serve as a landing site for ubiquitous expression of genetic constructs (e.g. utilizing a 2A self-cleaving peptide) in medaka (Li et al., 2019; Kim et al., 2011). Next, we validate the use of g3bp1-eGFP KI as a stress granule formation marker, and utilizing 4D live imaging show the formation of stress granules in response to temperature shock in real time, as previously shown in other models using a variety of stress conditions (Guarino et al., 2019; Kuo et al., 2020; Wheeler et al., 2016; Decker and Parker, 2012; Protter and Parker, 2016). This line can therefore be used both as an in vivo marker of stress conditions and to study the process of stress granule formation. The eGFP-rab11a line serves as an intra-cellular trafficking (Welz et al., 2014; Cullen and Steinberg, 2018; Stenmark, 2009) marker that allows us to dynamically follow exosomes and endosomes in vivo. 
We report that both neuromasts and the spinal cord show substantially higher expression of rab11a than other tissues. The basis of this remains unclear, but it could indicate that these tissues exhibit higher levels of protein turnover. Although myosin is a highly conserved protein involved in myogenesis (Sellers, 2000; Hartman and Spudich, 2012), no endogenous KI of any myosin family member has been reported in teleosts. The mNeonGreen-myosinhc KI enables the detection and recording of endogenous myosin dynamics in vivo during muscle growth in a vertebrate model. We also generate a cdh2-eGFP KI line (Leckband and de Rooij, 2014; Halbleib and Nelson, 2006) and show that it is expressed in a tissue-specific manner, primarily in the spinal cord, neuromasts and the notochord. Since N-cadherin has been shown to be involved in epithelial–mesenchymal transition (EMT) (Harrington et al., 2007; Suzuki and Takeichi, 2008; Desclozeaux et al., 2008), this line can be used to study dynamic changes in N-cadherin distribution in vivo, facilitating our understanding of EMT and other fundamental cell adhesion processes in vertebrates. Lastly, we generate and characterize the mScarlet-pcna KI line and discuss its usage and implications across teleosts below.
An overarching goal of developmental and stem cell biology is to discover the location of stem and progenitor cells in different organs and tissues, followed by a molecular characterization of their properties (Rhee et al., 2006; Nowak et al., 2008; Snippert et al., 2010; Buczacki et al., 2013; Lu et al., 2012; Lavker and Sun, 2003). Major advances have relied on finding resident stem cell markers that differentiate stem cells from other cell types within the same tissue, followed by BrdU/IdU staining to confirm their proliferative abilities (Nguyen et al., 1999; Rhee et al., 2006; Nowak et al., 2008; Nowak and Fuchs, 2009; Alunni et al., 2010; Snippert et al., 2010; Lu et al., 2012; Buczacki et al., 2013; Stolper et al., 2019; Tsingos et al., 2019). However, BrdU/IdU staining requires the sacrifice of the animal, precluding 4D live imaging to analyze stem cell behaviour in vivo over time. The medaka KI line with endogenously labelled Pcna that we present here helps to circumvent this limitation. In addition, since Pcna is expressed exclusively in cycling cells (Yamaguchi et al., 1995; Thacker et al., 2003; Buttitta et al., 2010; Zerjatke et al., 2017; Alunni et al., 2010), it has the potential to be used to discover the location of proliferative zones in vivo within any organ or tissue of interest. We provide proof-of-principle evidence that the mScarlet-pcna KI line acts as a bona fide marker for proliferative zones in a variety of tissues in medaka fish. This line therefore represents an important new tool for stem cell research in medaka. A similar strategy could be adopted to generate endogenously tagged Pcna both in the teleost field and in other organisms.
In addition to its use as a bona fide marker for proliferative zones, we provide evidence that the mScarlet-pcna line can be used as an endogenous cell cycle reporter in medaka. It has previously been shown that both the levels and dynamic distribution of Pcna are indicative of the different cell cycle phases (Held et al., 2010; Piwko et al., 2010; Santos et al., 2015; Zerjatke et al., 2017; Leonhardt et al., 2000; Barr et al., 2016; Leung et al., 2011). This led researchers to successfully utilize it as an ‘all-in-one’ cell cycle reporter in mammalian cells (Held et al., 2010; Piwko et al., 2010; Santos et al., 2015; Zerjatke et al., 2017; Barr et al., 2016; Leonhardt et al., 2000). By quantitatively tracking endogenous Pcna levels during one cell cycle in epidermal cells of medaka fish, we were able to confirm the dynamic nature of mScarlet-pcna expression, which correlated with the previously described behavior of the Pcna protein within the nucleus of other vertebrates (Held et al., 2010; Piwko et al., 2010; Santos et al., 2015; Zerjatke et al., 2017; Barr et al., 2016; Leonhardt et al., 2000; Leung et al., 2011). As such, we provide proof-of-principle evidence that the mScarlet-pcna line can be successfully used as an endogenous cell cycle reporter in a teleost model. Using the visualization of endogenous Pcna for cell cycle phase classification offers an attractive alternative to cell cycle reporters that rely on the insertion of two-colour transgenes, such as the FUCCI system (Sugiyama et al., 2009; Dolfi et al., 2019; Araujo et al., 2016; Bajar et al., 2016; Oki et al., 2014; Sakaue-Sawano et al., 2008). First, by using endogenous fusion proteins there is no requirement for overexpression of cell cycle regulators. Second, the potential issue with transgene variegation and silencing is avoided (Akitake et al., 2011; Goll et al., 2009; Burket et al., 2008; Stuart et al., 1990). 
Finally, utilizing a single-colour cell cycle reporter allows its simultaneous use with other fluorescent reporters during live-imaging experiments. Due to the high conservation of Pcna in eukaryotes, developing Pcna reporters in other model organisms using a similar strategy is an attractive possibility to pursue.
Materials and methods
Animal husbandry and ethics
Medaka (O. latipes, Cab strain) (Iwamatsu, 2004; Naruse et al., 2004; Kasahara et al., 2007) were maintained as closed stocks in a fish facility built according to the European Union animal welfare standards and all animal experiments were performed in accordance with European Union animal welfare guidelines. Animal experimentation was approved by The EMBL Institutional Animal Care and Use Committee (IACUC) project code: 20/001_HD_AA. Fish were maintained in a constant recirculating system at 27–28°C with a 14 hr light/10 hr dark cycle.
Cloning-free CRISPR/Cas9 knock-ins
A detailed step-by-step protocol for the cloning-free approach is provided in Supplementary files 1 and 2. A detailed list of all repair donors, PCR primers, fluorescent protein sequences, and sgRNAs used is provided in Supplementary file 3a-e. Briefly, for the preparation of Cas9-mSA mRNA: the pCS2+ Cas9 mSA plasmid was a gift from Janet Rossant (Addgene #103882, Supplementary file 3d; Gu et al., 2018). 6–8 µg of Cas9-mSA plasmid was linearized with NotI-HF restriction enzyme (NEB #R3189S). The 8.8 kb linearized fragment was cut out from a 1.5% agarose gel and DNA was extracted using QIAquick Gel Extraction Kit (Qiagen #28115). In vitro transcription was performed using mMESSAGE mMACHINE SP6 Transcription Kit (Invitrogen #AM1340) following the manufacturer’s guidelines. RNA cleanup was performed using RNeasy Mini Kit (Qiagen #74104). Other Cas9 encoding plasmids used were pCS2+ Cas9 (Addgene #122948) (Gu et al., 2018) and pCS2-Cas9 (Addgene #47322) (Gagnon et al., 2014; Supplementary file 3d,g). sgRNAs were manually selected using previously published recommendations (Paix et al., 2017a; Paix et al., 2019; Doench et al., 2016; Gagnon et al., 2014; Paix et al., 2017b) and validated in silico using CCTop and CHOPCHOP (Labun et al., 2019; Stemmer et al., 2015; Supplementary file 3e). The genomic coordinates of all genes targeted can be found in Supplementary file 3a. Synthetic sgRNAs used in this study were ordered from Sigma-Aldrich (spyCas9 sgRNA, 3 nmol, HPLC purification, no modification). PCR repair donor fragments were designed and prepared as described previously (Paix et al., 2014; Paix et al., 2017b; Paix et al., 2015; Paix et al., 2016; Paix et al., 2017a) and a detailed protocol is provided in Supplementary file 1. Briefly, the design includes homology arms of approximately 30–40 bp and a fluorescent protein sequence with no ATG or stop codon (Supplementary file 3b, d).
PCR amplifications were performed using Phusion or Q5 high fidelity DNA polymerase (NEB Phusion Master Mix with HF buffer #M0531L or NEB Q5 Master Mix #M0492L). MinElute PCR Purification Kit (Qiagen #28004) was used for PCR purification. Primers were ordered from Sigma-Aldrich (25 nmol scale, desalted) and contained a biotin moiety on the 5′ ends for repair donor synthesis. A list of all primers and fluorescent protein sequences used in this study can be found in Supplementary file 3c, d. The injection mix in medaka contains the sgRNA (15–20 ng/µl) + Cas9 mSA mRNA (or Cas9 mRNA without mSA) (150 ng/µl) + repair donor template (8–10 ng/µl). For injections, male and female medaka are placed in the same tank and fertilized eggs are collected 20 min later. The mix is injected into one-cell stage medaka embryos (Iwamatsu, 2004), and embryos are raised at 28°C in 1× ERM (Seleit et al., 2017a; Seleit et al., 2017b; Rembold et al., 2006). A list of KI lines generated and maintained in this study can be found in Supplementary file 3f.
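The donor design described above (homology arms of roughly 30–40 bp flanking a reporter sequence without ATG or stop codon) can be sketched in a few lines. This is an illustrative sketch only: the function names and the toy sequences are hypothetical, and locus-specific choices such as linkers or the exact insertion position are covered in Supplementary file 1 and are not reproduced here.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def strip_start_and_stop(cds: str) -> str:
    """Remove a leading ATG and a trailing stop codon from a reporter CDS."""
    cds = cds.upper()
    if cds.startswith("ATG"):
        cds = cds[3:]
    if cds[-3:] in STOP_CODONS:
        cds = cds[:-3]
    if len(cds) % 3 != 0:
        raise ValueError("reporter sequence is not a whole number of codons")
    return cds


def build_donor(left_arm: str, reporter_cds: str, right_arm: str,
                min_arm: int = 30, max_arm: int = 40) -> str:
    """Concatenate left arm, trimmed reporter, and right arm into one donor."""
    for name, arm in (("left", left_arm), ("right", right_arm)):
        if not min_arm <= len(arm) <= max_arm:
            raise ValueError(f"{name} homology arm is {len(arm)} bp, "
                             f"expected {min_arm}-{max_arm} bp")
    return left_arm.upper() + strip_start_and_stop(reporter_cds) + right_arm.upper()


# Toy sequences for illustration only (not real medaka or reporter sequences).
toy_left = "ACGT" * 9          # 36 bp
toy_right = "TGCA" * 9         # 36 bp
toy_reporter = "ATGGTGAGCAAGGGCTAA"

donor = build_donor(toy_left, toy_reporter, toy_right)
print(len(donor), donor)
```

Keeping the trimmed reporter a whole number of codons is what preserves the reading frame of the tagged gene; in practice the arms are added through the 5′ tails of the PCR primers rather than by string concatenation.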
Live-imaging sample preparation
Embryos were prepared for live imaging as previously described (Seleit et al., 2017a; Seleit et al., 2017b). 1× Tricaine (Sigma-Aldrich #A5040-25G) was used to anaesthetize dechorionated medaka embryos (20 mg/ml – 20× stock solution diluted in 1× ERM). Anaesthetized embryos were then mounted in low melting agarose (0.6–1%) (Biozyme Plaque Agarose #840101). Imaging was done on glass-bottomed dishes (MatTek Corporation Ashland, MA, USA). For g3bp1-eGFP live imaging, temperature was changed from 21 to 34°C after 1 hr of imaging.
Immunofluorescence
Immunohistochemistry was performed as previously described (Centanin et al., 2014). Primary rabbit anti-GFP antibody (Torrey Pines Biolabs #TP401) was used at a 1:500 dilution from the stock solution. Secondary goat anti-rabbit antibody (Abcam AlexaFluor 488 #ab150077) was used at a 1:500 dilution from the stock. Hoechst 33342 (Thermo Fisher #H3570) was used at a 1:500 dilution of the 10 mg/ml stock solution.
Microscopy and data analysis
For all embryo screening, a Nikon SMZ18 fluorescence stereoscope was used. All live imaging, except for g3bp1-eGFP and cdh2-eGFP embryos, was done on a laser-scanning confocal Leica SP8 (CSU, White Laser) microscope; ×20 and ×40 objectives were used during image acquisition depending on the experimental sample. For the SP8 confocal equipped with a white laser, the laser emission was matched to the spectral properties of the fluorescent protein of interest. g3bp1-eGFP line live imaging was performed using a Zeiss LSM780 laser-scanning confocal with a temperature control box and an Argon laser at 488 nm, imaged through a ×20 plan apo objective (numerical aperture 0.8). For cdh2-eGFP, 4D live imaging was performed on a Luxendo TruLive SPIM system using a ×30 objective. Open-source standard ImageJ/Fiji software (Schindelin et al., 2012) was used for analysis and editing of all images post-image acquisition. Stitching was performed using standard 2D and 3D stitching plug-ins on ImageJ/Fiji. For quantitative values on endogenous mScarlet-Pcna dynamics, the ROI manager in ImageJ/Fiji was used to define fluorescence intensity within the nucleus of tracked cells (yellow circle in Figure 4 and Videos 10 and 11); fluorescence intensity measurements were then extracted from the time series and the data were normalized by dividing by the initial intensity value in each time-lapse movie. Data were plotted using R software. Pixel intensity distributions within nuclei were analyzed using a custom Python-based script (Source code file 1). Individual live-cell tracks were plotted using PlotTwist (Goedhart, 2020).
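The normalization step described above, in which each intensity trace is divided by its initial value, can be sketched as follows. This is an illustrative sketch and not the script in Source code file 1; the function name, track names, and intensity values are hypothetical.

```python
def normalize_to_initial(intensities):
    """Divide every value in one time-lapse track by the track's first value."""
    if not intensities:
        raise ValueError("empty intensity track")
    first = intensities[0]
    if first == 0:
        raise ValueError("initial intensity is zero; cannot normalize")
    return [value / first for value in intensities]


# Hypothetical nuclear ROI intensities (arbitrary units), one list per tracked cell.
tracks = {
    "cell_1": [200.0, 220.0, 300.0, 150.0],
    "cell_2": [80.0, 100.0, 120.0, 60.0],
}

normalized = {name: normalize_to_initial(values) for name, values in tracks.items()}
for name, values in normalized.items():
    print(name, [round(v, 2) for v in values])
```

Normalizing each track to its own starting value puts cells with different absolute brightness on a common scale, so relative changes over the cell cycle can be compared between movies.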
Fin-clips, genotyping, and Sanger sequencing
Individual adult F1 fish were fin clipped for genotyping PCRs. Briefly, fish were anaesthetized in 1× Tricaine solution. A small part of the caudal fin was cut with sharp scissors and placed in a 2 ml Eppendorf tube containing 50 µl of fin-clip buffer. The fish were allowed to recover in small beakers and were transferred back to their tanks. Eppendorf tubes were then incubated overnight at 65°C. 100 µl of H2O was then added to each tube and the tubes were incubated for 10–15 min at 90°C. Tubes were then centrifuged for 30 min at 10,000 rpm in a standard micro-centrifuge. Supernatant was used for subsequent PCRs. Fin-clip buffer is composed of 0.4 M Tris–HCl pH 8.0, 5 mM EDTA pH 8.0, 0.15 M NaCl, 0.1% SDS in H2O. 50 µl of proteinase K (20 mg/ml) was added to 1 ml fin-clip buffer before use. 2 µl of genomic DNA from fin-clips was used for genotyping PCRs. A list of all genotyping primers used in this study can be found in Supplementary file 3c. After PCR, the edited and wild-type amplicons were sent for Sanger sequencing (Eurofins Genomics). Sequences were analyzed using Geneious software (Figure 1—figure supplement 3). In-frame integrations were confirmed by sequencing for eGFP-cbx1b, mScarlet-pcna, mNG-myosinhc, eGFP-rab11a, mapre1b-mScarlet, mGL-cbx1b, and cdh2-eGFP. We were able to detect an internal partial duplication of the 5′ homology arm in the mScarlet-pcna line that does not affect the protein coding sequence nor the 5′ extremity of the homology arm itself. Specifically, 22 basepairs upstream of the start codon of pcna (and within the 5′ homology arm), we detect a 21-bp partial duplication of the 5′ homology arm (CGCAACCCTCCACAGAATAAC) and a 7-bp insertion (GGTCGAC), indicating that the repair mechanism involved can lead to errors (Paix et al., 2017a; Wierson et al., 2020). The 5′ homology junction itself is unaltered and precise.
We were also able to detect a partial duplication (26 basepairs) of the 3′ homology arm in the cdh2-eGFP line (TTTCCTCGGTGTGGACCTTCCTACTT) that does not affect the protein coding sequence and occurs four basepairs after the stop codon.
Whole genome sequencing
Five to ten positive F1 medaka embryos (originating from the same F0 founder) of the eGFP-cbx1b, mScarlet-pcna, and mNeonGreen-myosinhc lines were snap frozen in liquid nitrogen and kept at −80°C in 1.5 ml Eppendorf tubes. Genomic DNA was extracted using DNeasy Blood and Tissue Kit (Qiagen #69504) according to the manufacturer’s guidelines. The libraries were prepared on a liquid handling system (Beckman i7 series) from 200 ng of sheared gDNA with 10 PCR cycles using the NEBNext Ultra II DNA Library Prep Kit for Illumina (NEB #E7645S). The DNA libraries were indexed with unique dual barcodes (8 bp long), pooled together and then sequenced on an Illumina NextSeq 550 instrument in mid-output mode, paired-end with a read length of 150 bp. Sequenced reads were aligned to the O. latipes reference genome (Ensembl assembly version ASM223467v1) using BWA mem version 0.7.17 with default settings (Li and Durbin, 2009). The reference genome was augmented with the known inserts for eGFP, mScarlet, and mNeonGreen to facilitate direct integration discovery using standard inter-chromosomal structural variant (SV) predictions. The insert sequences are provided in Supplementary file 3b, d. After the genome alignment, reads were sorted and indexed using SAMtools (Li et al., 2009). Quality control and coverage analyses were performed using the Alfred qc subcommand (Rausch et al., 2019). For SV discovery, aligned reads were processed with DELLY v0.8.7 (Rausch et al., 2012) using paired-end mapping and split-read analysis. SVs were filtered for inter-chromosomal SVs with one breakpoint in one of the additional insert sequences (eGFP, mScarlet, and mNeonGreen). Plots shown in Figure 1—figure supplement 2 were made using Integrative Genomics Viewer (IGV) (Thorvaldsdóttir et al., 2013). The estimated genomic coordinates for integration are: eGFP-cbx1b (chr19:19,074,552), mScarlet-pcna (chr9:6,554,003), and mNeonGreen-myosinhc (chr8:8,975,799).
Coverage of eGFP-cbx1b_gDNA1 is 20.4× and eGFP-cbx1b_gDNA2 is 23.6×. Coverage of mScarlet-pcna is 14.4×. Coverage of mNeonGreen-myosinhc is 14.5×. Raw sequencing data were deposited in European Nucleotide Archive (ENA) under study number ERP127162. Accession numbers are: eGFP-cbx1b(1) ERS5796960 (SAMEA8109891), eGFP-cbx1b(2) ERS5796961 (SAMEA8109892), mScarlet-pcna ERS5796962 (SAMEA8109893), and mNeonGreen-myosinhc ERS5796963 (SAMEA8109894).
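The filtering logic used for integration discovery (keep inter-chromosomal SVs with exactly one breakpoint on an added insert contig) can be sketched as follows. This is a simplified illustration, not the pipeline used in the study: the `Breakend` record, the contig names, and the toy calls are hypothetical and do not reproduce DELLY's output format; only the chr19 coordinate is taken from the text.

```python
from typing import NamedTuple

# Assumed names of the insert contigs appended to the reference genome.
INSERT_CONTIGS = {"eGFP", "mScarlet", "mNeonGreen"}


class Breakend(NamedTuple):
    """Simplified SV call: two joined breakpoints (hypothetical record layout)."""
    chrom_a: str
    pos_a: int
    chrom_b: str
    pos_b: int


def is_integration_candidate(sv: Breakend) -> bool:
    """True for inter-contig SVs with exactly one breakpoint on an insert contig."""
    inter_chromosomal = sv.chrom_a != sv.chrom_b
    exactly_one_in_insert = (sv.chrom_a in INSERT_CONTIGS) != (sv.chrom_b in INSERT_CONTIGS)
    return inter_chromosomal and exactly_one_in_insert


def integration_sites(svs):
    """Return sorted, de-duplicated (genomic contig, position, insert) tuples."""
    sites = set()
    for sv in svs:
        if not is_integration_candidate(sv):
            continue
        if sv.chrom_a in INSERT_CONTIGS:
            sites.add((sv.chrom_b, sv.pos_b, sv.chrom_a))
        else:
            sites.add((sv.chrom_a, sv.pos_a, sv.chrom_b))
    return sorted(sites)


# Toy calls: one genome-to-insert junction, one unrelated translocation,
# and one call that lies entirely within the insert contig.
toy_calls = [
    Breakend("chr19", 19074552, "eGFP", 1),
    Breakend("chr3", 100, "chr7", 200),
    Breakend("eGFP", 10, "eGFP", 700),
]
print(integration_sites(toy_calls))
```

In this representation, a single-copy integration at the targeted locus shows up as junctions between the insert contig and one genomic position only; any additional genomic position paired with an insert contig would flag an off-target or extra integration.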
Data availability
Sequencing data have been deposited in European Nucleotide Archive (ENA) under study number ERP127162. Accession numbers are: eGFP-cbx1b(1) ERS5796960 (SAMEA8109891), eGFP-cbx1b(2) ERS5796961 (SAMEA8109892), mScarlet-pcna ERS5796962 (SAMEA8109893) and mNeonGreen-myosinhc ERS5796963 (SAMEA8109894).
- European Nucleotide Archive, ID PRJEB43219. WGS on CRISPR mediated Knock-ins in medaka.
References
- Transgenerational analysis of transcriptional silencing in zebrafish. Developmental Biology 352:191–201. https://doi.org/10.1016/j.ydbio.2011.01.002
- Evidence for neural stem cells in the medaka optic tectum proliferation zones. Developmental Neurobiology 70:693–713. https://doi.org/10.1002/dneu.20799
- A robust cell cycle control mechanism limits E2F-induced proliferation of terminally differentiated cells in vivo. The Journal of Cell Biology 189:981–996. https://doi.org/10.1083/jcb.200910006
- Repair Pathway Choices and Consequences at the Double-Strand Break. Trends in Cell Biology 26:52–64. https://doi.org/10.1016/j.tcb.2015.07.009
- Cardiac myocyte remodeling mediated by N-cadherin-dependent mechanosensing. American Journal of Physiology. Heart and Circulatory Physiology 300:1252–1266. https://doi.org/10.1152/ajpheart.00515.2010
- Fluorescent proteins and their applications in imaging living cells and tissues. Physiological Reviews 90:1103–1163. https://doi.org/10.1152/physrev.00038.2009
- In vivo cleavage of transgene donors promotes nuclease-mediated targeted integration. Biotechnology and Bioengineering 110:871–880. https://doi.org/10.1002/bit.24733
- To degrade or not to degrade: mechanisms and significance of endocytic recycling. Nature Reviews. Molecular Cell Biology 19:679–696. https://doi.org/10.1038/s41580-018-0053-7
- Control of gene editing by manipulation of DNA repair mechanisms. Mammalian Genome 28:262–274. https://doi.org/10.1007/s00335-017-9688-5
- P-bodies and stress granules: possible roles in the control of translation and mRNA degradation. Cold Spring Harbor Perspectives in Biology 4:a012286. https://doi.org/10.1101/cshperspect.a012286
- Active Rab11 and functional recycling endosome are required for E-cadherin trafficking and lumen formation during epithelial morphogenesis. American Journal of Physiology. Cell Physiology 295:545–556. https://doi.org/10.1152/ajpcell.00097.2008
- Optimized sgRNA design to maximize activity and minimize off-target effects of CRISPR-Cas9. Nature Biotechnology 34:184–191. https://doi.org/10.1038/nbt.3437
- Mechano-sensory organ regeneration in adults: the zebrafish lateral line as a model. Molecular and Cellular Neurosciences 33:180–187. https://doi.org/10.1016/j.mcn.2006.07.005
- High-frequency off-target mutagenesis induced by CRISPR-Cas nucleases in human cells. Nature Biotechnology 31:822–826. https://doi.org/10.1038/nbt.2623
- The transience of transient overexpression. Nature Methods 10:715–721. https://doi.org/10.1038/nmeth.2534
- Recombination of DNAs in Xenopus oocytes based on short homologous overlaps. Nucleic Acids Research 15:971–985. https://doi.org/10.1093/nar/15.3.971
- Cadherins in development: cell adhesion, sorting, and tissue morphogenesis. Genes & Development 20:3199–3214. https://doi.org/10.1101/gad.1486806
- Cadherin-mediated adhesion regulates posterior body formation. BMC Developmental Biology 7:130. https://doi.org/10.1186/1471-213X-7-130
- The myosin superfamily at a glance. Journal of Cell Science 125:1627–1632. https://doi.org/10.1242/jcs.094300
- Precise Editing of the Zebrafish Genome Made Simple and Efficient. Developmental Cell 36:654–667. https://doi.org/10.1016/j.devcel.2016.02.015
- Rasputin, more promiscuous than ever: a review of G3BP. The International Journal of Developmental Biology 48:1065–1077. https://doi.org/10.1387/ijdb.041893ki
- Stages of normal development in the medaka Oryzias latipes. Mechanisms of Development 121:605–618. https://doi.org/10.1016/j.mod.2004.03.012
- CHOPCHOP v3: expanding the CRISPR web toolbox beyond genome editing. Nucleic Acids Research 47:W171–W174. https://doi.org/10.1093/nar/gkz365
- Cadherin adhesion and mechanotransduction. Annual Review of Cell and Developmental Biology 30:291–315. https://doi.org/10.1146/annurev-cellbio-100913-013212
- Dynamics of DNA replication factories in living cells. The Journal of Cell Biology 149:271–280. https://doi.org/10.1083/jcb.149.2.271
- The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078–2079. https://doi.org/10.1093/bioinformatics/btp352
- DNA damage checkpoint and repair centers. Current Opinion in Cell Biology 16:328–334. https://doi.org/10.1016/j.ceb.2004.03.011
- Heterochromatin protein 1 is recruited to various types of DNA damage. The Journal of Cell Biology 185:577–586. https://doi.org/10.1083/jcb.200810035
- Proliferating cell nuclear antigen (PCNA): a dancer with many partners. Journal of Cell Science 116:3051–3060. https://doi.org/10.1242/jcs.00653
- Regulation of PCNA-protein interactions for genome stability. Nature Reviews. Molecular Cell Biology 14:269–282. https://doi.org/10.1038/nrm3562
- Medaka genomics: a bridge between mutant phenotype and gene function. Mechanisms of Development 121:619–628. https://doi.org/10.1016/j.mod.2004.04.014
- Regulation of end-binding protein EB1 in the control of microtubule dynamics. Cellular and Molecular Life Sciences 74:2381–2393. https://doi.org/10.1007/s00018-017-2476-2
- Isolation and culture of epithelial stem cells. Methods in Molecular Biology 482:215–232. https://doi.org/10.1007/978-1-59745-060-7_14
- Rapid Tagging of Human Proteins with Fluorescent Reporters by Genome Engineering using Double-Stranded DNA Donors. Current Protocols in Molecular Biology 129:e102. https://doi.org/10.1002/cpmb.102
- Principles and Properties of Stress Granules. Trends in Cell Biology 26:668–679. https://doi.org/10.1016/j.tcb.2016.05.004
- Cyclebase 3.0: a multi-organism database on cell-cycle regulation and phenotypes. Nucleic Acids Research 43:1140–1144. https://doi.org/10.1093/nar/gku1092
- Fiji: an open-source platform for biological-image analysis. Nature Methods 9:676–682. https://doi.org/10.1038/nmeth.2019
- Development and regeneration dynamics of the Medaka notochord. Developmental Biology 463:11–25. https://doi.org/10.1016/j.ydbio.2020.03.001
- Myosins: a diverse superfamily. Biochimica et Biophysica Acta 1496:3–22. https://doi.org/10.1016/s0167-4889(00)00005-7
- Rab GTPases as coordinators of vesicle traffic. Nature Reviews. Molecular Cell Biology 10:513–525. https://doi.org/10.1038/nrm2728
- Cadherins in neuronal morphogenesis and function. Development, Growth & Differentiation 50 Suppl 1:S119–S130. https://doi.org/10.1111/j.1440-169X.2008.01002.x
- Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Briefings in Bioinformatics 14:178–192. https://doi.org/10.1093/bib/bbs017
- EB1 proteins regulate microtubule dynamics, cell polarity, and chromosome stability. The Journal of Cell Biology 149:761–766. https://doi.org/10.1083/jcb.149.4.761
- Long-term stability of large insert genomic DNA episomal shuttle vectors in human cells. Nucleic Acids Research 27:1674–1682. https://doi.org/10.1093/nar/27.7.1674
- CRISPR/Cas9 in Genome Editing and Beyond. Annual Review of Biochemistry 85:227–264. https://doi.org/10.1146/annurev-biochem-060815-014607
- Orchestration of cell surface proteins by Rab11. Trends in Cell Biology 24:407–415. https://doi.org/10.1016/j.tcb.2014.02.004
- Expanding the CRISPR Toolbox with ErCas12a in Zebrafish and Human Cells. The CRISPR Journal 2:417–433. https://doi.org/10.1089/crispr.2019.0026
- Transient expression of foreign DNA during embryonic and larval development of the medaka fish (Oryzias latipes). Molecular & General Genetics 226:129–140. https://doi.org/10.1007/BF00273596
- A nucleotide sequence essential for the function of DRE, a common promoter element for Drosophila DNA replication-related genes. The Journal of Biological Chemistry 270:15808–15814. https://doi.org/10.1074/jbc.270.26.15808
- Mechanism of random integration of foreign DNA in transgenic mice. Transgenic Research 22:983–992. https://doi.org/10.1007/s11248-013-9701-z
- ssODN-mediated knock-in with CRISPR-Cas for large genomic regions in zygotes. Nature Communications 7:10431. https://doi.org/10.1038/ncomms10431
Article and author information
Funding
H2020 European Research Council (866537)
- Alexander Aulehla
- Ali Seleit
EMBL interdisciplinary Postdoc (847543)
- Ali Seleit
The funders had no role in study design, data collection, and interpretation, or the decision to submit the work for publication.
Acknowledgements
We would like to thank all members of the Aulehla lab for the fruitful discussions on the work presented here. We would like to thank Aissam Ikmi for input on the manuscript. We would like to thank Takehito Tomita for help with Python scripts. The European Molecular Biology Laboratory (EMBL-Heidelberg) Genecore is acknowledged for support in WGS data acquisition and analysis. We would like to thank Vladimir Benes and members of his team at Genecore EMBL Heidelberg for continuous help and support, Tobias Rausch for computational work on the WGS data and Mireia Osuna Lopez for help in library preparation of WGS DNA. In addition, we would like to thank all animal caretakers at EMBL Heidelberg and in particular Sabine Goergens for excellent support. We would also like to thank Addgene for access to plasmids. This work was supported by the European Molecular Biology Laboratory (EMBL-Heidelberg) and by an EMBL interdisciplinary Postdoc (EIPOD4) fellowship under Marie Sklodowska-Curie Actions Cofund (grant agreement number 847543) to Ali Seleit. This work also received support from the European Research Council under an ERC consolidator grant agreement no. 866537 to AA.
Copyright
© 2021, Seleit et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.