Circular synthesized CRISPR/Cas gRNAs for functional interrogations in the coding and noncoding genome

Abstract
Introduction
Results
Discussion
Materials and methods
Data availability
References
Article and author information
Metrics

Abstract

Current technologies used to generate CRISPR/Cas gene perturbation reagents are labor intense and require multiple ligation and cloning steps. Furthermore, increasing gRNA sequence diversity negatively affects gRNA distribution, leading to libraries of heterogeneous quality. Here, we present a rapid and cloning-free mutagenesis technology that can efficiently generate covalently-closed-circular-synthesized (3Cs) CRISPR/Cas gRNA reagents and that uncouples sequence diversity from sequence distribution. We demonstrate the fidelity and performance of 3Cs reagents by tailored targeting of all human deubiquitinating enzymes (DUBs) and identify their essentiality for cell fitness. To explore high-content screening, we aimed to generate the largest up-to-date gRNA library that can be used to interrogate the coding and noncoding human genome and simultaneously to identify genes, predicted promoter flanking regions, transcription factors and CTCF binding sites that are linked to doxorubicin resistance. Our 3Cs technology enables fast and robust generation of bias-free gene perturbation libraries with yet unmatched diversities and should be considered an alternative to established technologies.

https://doi.org/10.7554/eLife.42549.001

Introduction

CRISPR/Cas has rapidly become the gold standard for unbiased high-throughput experiments, outperforming preexisting technologies such as RNAi (Evers et al., 2016; Morgens et al., 2016). A fundamentally important aspect of high-fidelity CRISPR/Cas screening is the quality of the gRNA library that is interrogated, with its diversity and distribution primarily influencing downstream experimental scales (Sanson et al., 2018). Conventionally used methods to generate gRNA libraries in pooled or arrayed formats include T4 ligase or homology-based cloning techniques, which require the PCR-based amplification of gRNA-encoding oligonucleotides as well as the presence of open plasmid DNA for successful gRNA sequence cloning (Arakawa, 2016; Koike-Yusa et al., 2014; Ong et al., 2017; Schmidt et al., 2015; Shalem et al., 2014; Vidigal and Ventura, 2015; Wang et al., 2014). Owing to these technical constraints, conventional libraries contain an unwanted PCR and cloning-dependent bias in their gRNA distribution that influences the experimental scale required for statistically significant hit calling (Shalem et al., 2014; Wang et al., 2014). CRISPR libraries have become ubiquitously used in functional genomics efforts, underscoring relevance and utility of new PCR- and cloning-free technologies.

The rod-shaped filamentous phage M13 differs from other bacteriophages in that its genome-packaging capacity is variable and in that it is present as single-stranded (ss) DNA. Kunkel mutagenesis utilizes M13’s malleable coat and the ease of ssDNA purification from M13 phage and enables rapid site-specific mutagenesis to construct high-quality phage display libraries (Handa and Varshney, 1998; Huang et al., 2012; Kunkel, 1985; Kunkel, 2001). Kunkel mutagenesis has significantly contributed to the great success of phage display technologies (Ernst et al., 2013; Sidhu, 2001).

Here, we demonstrate the applicability of Kunkel mutagenesis in the generation of high-quality and high-fidelity CRISPR/Cas and RNAi gene perturbation reagents. In more detail, we developed a highly reproducible improved Kunkel mutagenesis technology that is designed to generate 3Cs CRISPR/Cas gRNA libraries robustly over a broad range of gRNA diversities. We demonstrate the high fidelity of 3Cs gRNA libraries by targeting all human DUBs and then determining their proliferative depletion phenotype, confirming previously known and discovering hitherto unknown DUB phenotypes. In an effort to enable unbiased screening within coding and noncoding regions, we encoded SpCas9 nucleotide preferences into a degenerated oligonucleotide and generated a highly complex CRISPR/Cas gRNA library (truly genome-wide (TGW)). Doxorubicin-positive selection screens with the TGW library in unperturbed human telomerase-immortalized retinal pigmented epithelial cells (hTERT-RPE1) were used to identify coding and noncoding regions, emphasizing the relevance of noncoding sequence elements in drug-resistance mechanisms. To enable high-content functional interrogations on a truly genome-wide scale, we introduce an optimized version of this library (oTGW). In summary, we establish the 3Cs technology as a robust alternative method for the generation of high-quality CRISPR/Cas gene perturbation libraries.

Results

Circular synthesized gRNAs are high-quality CRISPR/Cas reagents

In classical Kunkel mutagenesis (Kunkel, 1985; Kunkel, 2001), the circular ssDNA isolated from filamentous phage is hybridized with a complementary oligonucleotide that is extended and ligated to obtain a double-stranded DNA plasmid. As Kunkel mutagenesis is performed on ssDNA, we anticipated that it would be insensitive to the secondary DNA structures of viral sequence elements and therefore should enable the PCR and cloning-free generation of lentiviral gene perturbation reagents (Huang et al., 2012; Kunkel, 1985). We therefore hypothesized that the generation of lentiviral CRISPR/Cas gRNA libraries using circular ssDNA and Kunkel mutagenesis would reduce the coupling of gRNA diversity to gRNA distribution and would generate reagents of high quality (Figure 1A).

Figure 1 with 1 supplement see all

Download asset Open asset

The 3Cs technology - covalently-closed-circular-synthesized (3Cs) CRISPR/Cas gRNA reagents.

(A) The general 3Cs workflow. The individual steps of the protocol (grey arrows), time requirements (on top of arrow) and used or expected DNA yields (below arrow) are highlighted. Time requirements are separated by total versus hands-on time (grey scaled bars). Please note that the protocol contains two possible break points (red stop signs) at which purified phages can be stored at 4°C (break point #1) or bacterial pellets/purified plasmid DNA can be stored at −20°C (break point #2). In more detail, f1-origin containing double-stranded CRISPR/Cas plasmids are converted to dU-containing circular ssDNA. Guide RNA sequence (orange triangle) containing oligonucleotides (orange arrows) are annealed to ssDNA, and extended and ligated by T7 DNA polymerase and T4 ligase, respectively. Heteroduplex dU-3Cs-DNA is transformed into base-excision-repair-sufficient bacteria to deplete template DNA (grey strand) and to amplify the newly synthesized DNA (orange) selectively. (B) Lentiviral CRISPR/Cas plasmids (pLentiGuide, pLentiCRISPRv2) and the mammalian cDNA expression plasmid pcDNA3 (positive control) were converted to dU-containing circular ssDNA and analyzed by gel electrophoresis. Although identical in size, circular ssDNA appears as a single band and migrates faster than the corresponding dsDNA form. (C) The lentiviral circular ssDNA of panel (B) was annealed with a pool of six oligonucleotides, encoding six GFP-targeting gRNAs, to generate a pool of 3Cs-dsDNA and analyzed by gel electrophoresis. A successful 3Cs in vitro reaction is indicated by three distinct 3Cs-dsDNA product bands (Huang et al., 2012). (D) Bar graph showing the degree of template remnants in the final 3Cs products in the presence and absence of additional Uridine in the phage culture medium as well as an I-SceI clean-up step. The gRNA libraries from panel (C) were sequenced by NGS before and after I-SceI restriction enzyme digest. Although the effect of Uridine is marginal, an enzymatic digest with I-SceI removes template plasmid remnants. (**E, F**) gRNA distribution displayed as raw read count data points (E) and normalized values in box plot format (F). The coefficient of variation was calculated by dividing the standard deviation by the mean of the library’s read counts and is displayed as percentage above the box plot (F). Data were derived from the NGS data shown in panel (D). The final GFP-targeting 3Cs-gRNA library is free of sequence bias, as demonstrated by the low coefficient of variation of 33.18%, and by the uniform sequence distribution ((E), also see Figure 1—figure supplement 1H). (G) GFP-expressing hTERT–RPE1 cells were transduced with lentiviral 3Cs-gRNA constructs (non-targeting control gRNA (non-human target sequences (NHT)), a single GFP-targeting 3Cs-gRNA (GFP#1) or a pool of six GFP-targeting 3Cs-gRNAs (GFP#1–6)), and selected with puromycin before GFP gene editing was analyzed by T7 endonuclease I assay (Guschin et al., 2010). Individual band intensities were quantified (black numbers). An empty control (–) served as the reference. (H) A dose-dependent reduction of GFP fluorescence was determined by the flow cytometry of GFP-expressing hTERT–RPE1 cells and transduced with increasing volumes of lentiviral supernatant containing a pool of six GFP-targeting 3Cs-gRNAs (GFP#1–6). Error bars represent standard deviations (SDs) over three biological replicates (n = 3). (I) Immunoblot analysis of hTERT–REP1 cells treated as in panel (G) demonstrates that GFP-targeting 3Cs-gRNAs induce a 3- to 4-fold reduction in total GFP protein levels over three biological replicates (n = 3, for quantification see also Figure 1—figure supplement 1I).

https://doi.org/10.7554/eLife.42549.002

To demonstrate its general applicability to lentiviral CRISPR/Cas plasmids, we transformed Escherichia coli CJ236 bacteria with the commonly used pLentiGuide and pLentiCRISPRv2 plasmids (Sanjana et al., 2014), both of which contain a U6 promoter-controlled non-human targeting (NHT) placeholder gRNA followed by a SpCas9 tracrRNA sequence (Doench et al., 2014; Sanjana et al., 2014). Importantly, F-factor-containing CJ236 bacteria lack dUTPase (dut^–) and uracil-glycosylase (ung^–), and therefore tolerate the presence of deoxyuridine (dU) in genomic and plasmid DNA (Kim and Wilson, 2012). Superinfection of single-colony CJ236 culture with M13KO7 bacteriophage (10⁸ cfu/mL) facilitated the generation of >30 µg of dU-containing circular ssDNA. Although circular ssDNA is identical in length to dsDNA, the circular ssDNA of lentiviral CRISPR/Cas plasmids migrated faster and appeared as a single band in gel electrophoresis (Figure 1B). Circular dU-ssDNA was hybridized with a gRNA-encoding complementary oligonucleotide that contained sequence homology regions (3Cs homology) at its 5′ and 3′ ends, and then extended and ligated with T7 polymerase and T4 ligase, respectively (Figure 1A). This resulted in heteroduplexed 3Cs DNA (3Cs-dsDNA), which were composed of dU-template ssDNA and deoxythymidine-containing newly synthesized complementary DNA that also includes the gRNA-encoding oligonucleotide (Figure 1A) (Huang et al., 2012; Kunkel, 1985; Kunkel, 2001). To gain insights into the oligonucleotide requirements and kinetics of 3Cs reactions, we tested different 3Cs homology lengths of 10, 13, 15, and 18 nucleotides, performed a 3Cs reaction time series, and demonstrated that 18 nucleotides of homology (above 45°C annealing temperature) and 2 hr of 3Cs reaction time were sufficient (Figure 1—figure supplement 1A–C) (Kunkel, 2001).

Using rule set 2 (RS2) (Doench et al., 2014, Doench et al., 2016), we designed six GFP-targeting gRNA sequences and extended them by 5′ and 3′ 3Cs homology. Synthesized gRNA-encoding oligonucleotides were hand-pooled in equimolar ratios, phosphorylated and used in a 1:5 ratio (2 µg ssDNA to 60 ng oligonucleotide) to generate heteroduplex dU-3Cs-sDNA (Figure 1C). To remove NHT/dU-containing template and to amplify the gRNA-encoding complementary strand, 3Cs products were column-purified and transformed in dut⁺/ung⁺bacteria. Bacterial clones were grown and their plasmid DNA Sanger sequenced, revealing that 81% of pLentiGuide and 82% of plentiCRISPRv2 contained GFP-targeting gRNAs (Figure 1D and Figure 1—figure supplement 1D). To test whether dU supplementation reduces the amount of NHT-containing template plasmid by improving dU-incorporation during ssDNA production, CJ236 culture medium was supplemented with 2.5 µM dU. In addition, the gRNA placeholder sequence of pLentiGuide and plentiCRISPRv2 was changed to contain an I-SceI restriction enzyme recognition site. Although the effect of increased dU concentrations was negligible, I-SceI-mediated removal of wildtype plasmids reduced their level to below our detection limit (Figure 1D and Figure 1—figure supplement 1D–F). We performed next-generation sequencing (NGS) on the plentiCRISPRv2 sample with an average read count of 1.15 million per GFP sequence and identified a wildtype rate of below 0.3% in the absence of any apparent sequence bias (Figure 1E and Supplementary file 1). A one-sided Chi-squared test for goodness of fit identified a uniform distribution (p=0.1) of all six gRNA sequences. The uniform gRNA distribution was supported by a low coefficient of variation (CV) of 33.18% and an area under the curve (AUC, Lorenz curve) of only 0.56 (Figure 1E–F and Figure 1—figure supplement 1G).

To test for the cellular functionality of 3Cs gRNAs, we used the plentiCRISPRv2 GFP-targeting 3Cs gRNA constructs to generate infectious lentiviral particles and transduced GFP-positive human telomerase-immortalized retinal pigmented epithelial (hTERT–RPE1) cells. Seven days post-transduction, we performed a T7 Endonuclease I assay and observed robust GFP gene editing, both by a single GFP-targeting 3Cs gRNA (3Cs-gRNA) and by the pool of six 3Cs-gRNAs, whereas un-transduced (–) and an NHT control gRNA failed to edit the GFP gene (Figure 1G). GFP gene editing translated to a lentiviral dose-dependent loss of GFP protein when analyzed by fluorescence flow-cytometry and immunoblotting (Figure 1H–I and Figure 1—figure supplement 1H). Taken together, we demonstrate that the 3Cs technology enables the rapid and cloning-free generation of high-quality single and pooled CRISPR/Cas gRNAs.

3Cs uncouples sequence diversity from sequence distribution

The absence of PCR amplification and cloning steps, in combination with the uniform distribution of the six GFP-targeting 3Cs-gRNAs, led us to reason that 3Cs may uncouple sequence diversity from sequence distribution during gRNA library generation. To test this hypothesis, we designed six degenerated 3Cs oligonucleotides with increasing numbers of randomized nucleotides to mimic gRNA sequence pools with diversities ranging from 256 to 262,144 individual sequences (Figure 2A). The six pools were applied in parallel 3Cs syntheses on a dU-ssDNA template of pLentiCRISPRv2 (Figure 2B and Figure 2—figure supplement 1A). Independent of an oligonucleotide’s diversity, NGS and computational analyses identified all of the individual sequences and uniform distributions with area under the curve values between 0.6 and 0.73 (Figure 2—figure supplement 1B and Supplementary file 2). Despite the uniform distribution, we observed a prominent cytosine (C) bias in the randomized libraries, with C contents of above 40% within the top 10% of the most abundant gRNAs (Figure 2C). We reasoned that the C bias is probably due to incomplete phosphoramidite mixing during oligonucleotide synthesis and should therefore be absent from gRNA libraries containing nonrandom gRNA sequences (Ellington and Pollard, 2009). To test this hypothesis, we designed and generated a nonrandom 3Cs-gRNA library targeting all 105 human DUBs, each with three gRNAs (DUB library). NGS and nucleotide content analysis confirmed our hypothesis and revealed the absence of C bias from the nonrandom DUB library (Figure 2C and Supplementary file 3). To correct the randomized libraries for the C bias, we determined the individual nucleotide frequency at every randomized position and used these frequencies to normalize the original read counts, leading to improved AUC values and sequence distributions (Figure 2D and Supplementary file 2) and further confirming the uncoupling of sequence diversity and distribution in 3Cs reactions. Taken together, these findings lead us to conclude that 3Cs is a robust technology that uncouples sequence distribution from sequence diversity and, therefore, is a powerful alternative technology to conventional gRNA cloning methods for generating gRNA libraries.

Figure 2 with 1 supplement see all

Download asset Open asset

3Cs is a robust technology that uncouples sequence diversity from sequence distribution.

(A) To determine the sequence distribution of 3Cs-gRNA libraries with increasing gRNA diversity, an increasing number of randomized nucleotides (orange) were incorporated into 3Cs oligonucleotides to mimic gRNA diversities ranging from 256 to 262,144 sequences (4–9N libraries). A range of four to nine randomized nucleotides (orange) were introduced into an NHT gRNA sequence. Randomization of the central nucleotides ensures the replacement of the template I-SceI restriction site in order to prevent the digestion of correctly synthesized 3Cs synthesis products. (B) The 3Cs synthesis products of the combination of randomized primers and pLentiGuide were resolved by gel electrophoresis. (C) The Scatter plot displays ranked gRNA abundances per library against the gRNA cytosine content (C). The gRNA libraries that are shown are derived from (A) and (B) and the library with gRNAs targeted against DUBs (DUBs library). All libraries were processed by I-SceI-dependent removal of template plasmid remnants and subjected to NGS and computational analysis. Importantly, all gRNA libraries were complete, irrespective of their individual gRNA diversity. However, the partially randomized gRNA libraries displayed a strong C bias within the most abundant gRNA sequences. In fact, the top 5% of most abundant gRNAs had a C content of above 60%. The DUBs library did not show this C bias, strongly suggesting incomplete phosphoramidite mixing during oligonucleotide synthesis as the main cause of the C bias. (D) Lorenz curves displaying the cumulative fraction of represented NGS reads versus the gRNAs ranked by abundance of each partially randomized (4–9N) and nonrandomized (DUB) library revealed a uniform distribution of gRNA sequences. Area under the curve values (AUC, number next to library name) confirm the uniform gRNA distribution of these libraries and demonstrate that 3Cs uncouples sequence diversity from sequence distribution.

https://doi.org/10.7554/eLife.42549.004

3Cs-gRNA libraries are of high fidelity: the proliferative essentiality of human DUBs

Next, we investigated the performance of 3Cs-gRNA reagents in cellular screenings. To do so, we generated infectious lentiviral particles of the 3Cs-gRNA DUB library and applied them to a proliferation screen in non-transformed hTERT–RPE1 cells in biological duplicates (multiplicity of infection (MOI) 0.2, coverage 1,000). Two days after lentiviral transduction, cells were either collected (day 0, reference time point) or selected by puromycin and kept in culture for 11 days (day 11) or 21 days (day 21) in cycling conditions representing at least a 1,000-fold library coverage (Figure 3A). Cells collected at day 0, 11, or 21 were subjected to genomic DNA extraction and amplicon-based NGS library preparation, as has been reported previously (Doench et al., 2016; Koike-Yusa et al., 2014). We performed single-read sequencing on an Illumina NextSeq500 with an averaged read count per gRNA of above 35,000 across all biological samples and replicates (Supplementary file 4). As in previously reported CRISPR analysis algorithms, and to enable a comparison of individual time points, we summed all individual gRNA read counts per gene and normalized each gene read count per sample to the total number of read counts within that sample (Supplementary file 4) (Li et al., 2014; Spahn et al., 2017). In line with reports of the high experimental reproducibility of CRISPR/Cas screenings (Evers et al., 2016; Morgens et al., 2016), we determined R² values of 0.95, 0.88, and 0.90 for time points day 0, 11, and 21, respectively (Figure 3—figure supplement 3A–C). Spearman correlation and Shapiro–Wilk confidence tests revealed correlations of above 0.88 and p-values of below 0.001, respectively, demonstrating a high experimental confidence in the level of gRNA representation (Figure 3B and Figure 3—figure supplement 3A–C) (Shapiro and Wilk, 1965). To analyze the reproducibility of our screen in terms of the level of gene phenotypes, we applied MAGeCK and PinAPL-Py, two established algorithms for the analysis of CRISPR/Cas screens, to raw gRNA read counts of both replicates and calculated aggregated positive and negative proliferation phenotypes by means of log₂-fold changes with associated p-values (Figure 3C–E and Figure 3—figure supplement 3D–H and Supplementary file 5–6) (Li et al., 2014; Spahn et al., 2017). Consistent over both time points, cells depleted of USP28 or BRCC3 proliferated more rapidly than cells harboring non-human target sequences (NHTs), identifying both as negative regulators of hTERT–RPE1 proliferation (Figure 3C–D and Figure 3F and Supplementary file 4–5). By contrast, cells that were depleted of PSMD14, USP7 or COPS6 proliferated less rapidly than cells harboring non-human target sequences (NHTs), identifying them as positive regulators of hTERT–RPE1 proliferation (Figure 3C–D and Figure 3F and Supplementary file 4–5).

Figure 3 with 3 supplements see all

Download asset Open asset

3Cs reagents are of high fidelity — the essentiality of human DUBs for cell fitness.

(A) Schematic of the performed CRISPR screen. Highlighted are the experimental conditions under which the screen was performed (MOI of 0.2, library coverage of 1,000). In brief, hTERT–RPE1 cells were transduced with lentivirus for 48 hr in duplicates, after which the cells of one duplicate were harvested (day 0) to extrapolate the baseline gRNA distribution. Simultaneously, cells of the second duplicate were subject to puromycin selection for 11 days, after which time all cells were harvested (day 11), counted and plated back in low density to the original library representation of 1,000-fold coverage. Plated cells remained in cycling conditions until day 21, when all cells were collected (day 21). After harvesting the cells, their genomic DNA was extracted and processed for gRNA NGS. (B) Graph showing the distribution of individual sgRNAs. Means ± standard deviation are highlighted. (**C–E**) Volcano plots visualizing log_2-fold changes of gene phenotypes and their associated p-values. Data are derived from MAGeCK analyses, corresponding to day 11 (C), day 21 (D) and day11/21 (E). The dashed red line shows p=0.05 with points above the line having p<0.05 and points below the line having p>0.05. Data points with p>0.05 are displayed as translucent symbols. Genes of interest are highlighted. (F) Venn diagram of significantly enriched (blue) or depleted (orange) DUBs. The time point overlap visualizes DUB genes with time independent (overlap of three) and time dependent (overlap of two) proliferation phenotypes. (G) Fold increase in cell number after shRNA-mediated depletion of target genes. Data are means of duplicates.

https://doi.org/10.7554/eLife.42549.006

CRISPR/Cas drop out screens are performed with varying experimental durations, ranging from 5 to 15 days (Joung et al., 2017; Potting et al., 2018). However, recent work demonstrates that CRISPR/Cas induces a G₁ phase arrest in p53 proficient hTERT–RPE1 cells that impacts hit calling (Haapaniemi et al., 2018), suggesting that later screening time points are beneficial for hit calling. Indeed, when comparing normalized gene ranks, we observed a trend of increased phenotype resolution among negative gene ranks over time, although this effect was largely absent from positive gene ranks (Figure 3—figure supplement 1I and Figure 3—figure supplement 2 and 3). This effect has been reported previously and can potentially be explained by the disproportional assay window of positive and negative cell proliferation phenotypes, leading to a higher phenotypic resolution among negative proliferative effects (Shalem et al., 2014; Wang et al., 2014).

Stable and robust proliferative phenotypes are time-independent, so the phenotype resolution enhances over time (Figure 3—figure supplement 3I). However, multiple mechanisms and cellular backgrounds can influence phenotype strength, timing and orientation. To identify time-dependent phenotypes, we analyzed the MAGeCK-derived log_2-fold changes with associated p-values for time points day 11 and day 21, and identified genes whose deletion phenotype significantly changed between day 11 and day 21 (Figure 3E–F). Depletion of USP28 and USP46 induced the strongest positive change, whereas deletion of USP22, USP48 or TNFAIP3 induced the most significant negative change, in phenotype between day 11 and day 21, suggesting a time-dependent absence of compensatory mechanisms to accommodate an early loss-of-function phenotype (Figure 3E–F and Figure 3—figure supplement 3I). In order to validate our findings, we chose two positive and negative proliferation-inducing DUBs, generated lentiviral supernatant to deliver shRNA sequences targeting the selected DUBs, and transduced hTERT–RPE1 cells. Over the course of the 2 weeks after transduction, we measured cell numbers by AlamarBlue staining. When compared to negative- (Luciferase) and positive-control (Plk1) shRNA sequences, depletion of USP28 and BRCC3 induced a rapid positive proliferation effect (Figure 3G). By contrast, depletion of USP7 and COPS6 induced an instant and strong negative proliferation effect (Figure 3G), validating the gRNA-mediated knockout phenotypes. Collectively, our analysis demonstrates the quality and fidelity of 3Cs reagents in functional genomics applications.

3Cs is versatile and generates arrayed and pooled 3Cs-shRNA reagents

Owing to its versatility,CRISPR/Cas technology has become the method of choice for gene perturbation experiments, yet classical short hairpin RNAs (shRNA, RNAi) are still widely used. However, shRNA oligonucleotides contain complementary sequences that form stable secondary structures that render the generation of shRNA reagents inefficient (McIntyre and Fanning, 2006). A crucial step in our improved Kunkel mutagenesis technology is the denaturation of the gRNA-encoding oligonucleotides and their subsequent annealing to template ssDNA (see 'Materials and methods' section). Owing to this denaturation and annealing step, we anticipated that improved Kunkel mutagenesis would circumvent the secondary structures of shRNA oligonucleotide and enable the generation of shRNA reagents. We chose pLKO.1 (Stewart et al., 2003), one of the most widely used lentiviral and shRNA-expressing plasmids to generate circular dU-ssDNA. Similar to circular ssDNA of CRISPR/Cas plasmids, the circular ssDNA of pLKO.1 migrated as a single band in agarose gel electrophoresis (Figure 3—figure supplement 2A). Next, we designed a GFP-targeting 3Cs shRNA (3Cs-shRNA) primer consisting of 5′ and 3′ 3Cs homology and two complementary GFP–shRNA sequences separated by a six-nucleotide hairpin sequence (Figure 3—figure supplement 2B). We performed two parallel 3Cs reactions using 60 ng and 120 ng of ssDNA, and both reactions yielded the characteristic 3Cs-DNA band pattern with no major difference in bacterial transformation efficiency between the two tested scales (Figure 3—figure supplement 2C).

To demonstrate the successful integration of the GFP–shRNA sequence into pLKO.1, we amplified single bacterial clones carrying 3Cs-DNA of shRNA reactions and analyzed their plasmid DNA by SANGER sequencing. This resulted in the expected GFP–shRNA sequence and the absence of adjacent nucleotide changes (Figure 3—figure supplement 2D), from which we concluded that 3Cs is a versatile technology that generates high-quality gRNA and shRNA reagents. To demonstrate 3Cs-shRNA fidelity, we generated infectious lentiviral particles of the GFP-targeting 3Cs-shRNA and transduced GFP-positive hTERT–RPE1 cells. Strikingly, 96 hr after lentiviral transduction, we observed a reduction in GFP-fluorescence, confirming the functionality of the 3Cs-shRNAs in cells (Figure 3—figure supplement 2E). Moreover, we investigated the performance of improved Kunkel mutagenesis in generating 3Cs-shRNA libraries. On the basis of the principles described above, we designed a 3Cs-shRNA library targeting all human ubiquitin-conjugating E2 enzymes (E2s), each with two shRNAs (Supplementary file 7). To generate the library, individually synthesized oligonucleotides were pooled in equimolar ratios and applied to a pooled 3Cs reaction. The resulting products were resolved by gel electrophoresis (Figure 3—figure supplement 3A). Like the I-SceI-mediated depletion of wildtype remnants from CRISPR/Cas 3Cs-gRNA constructs, a Bsu36I restriction enzyme clean-up step removed pLKO.1 wildtype remnants, and SANGER sequencing of the final E2 3Cs-shRNA library (E2.2) confirmed a randomization of forward- and reverse-complement shRNA sequences (Figure 3—figure supplement 3B–C). To determine the E2 3Cs-shRNA distribution more accurately, we performed NGS sequencing with an average shRNA read count of >8,300 and determined a wildtype remnant level of 0.04%, a CV of 37.9% and an AUC of 0.68, demonstrating an almost uniform distribution (Figure 3—figure supplement 3D–E). To correlate 3Cs-shRNA abundance and the distribution of the E2 3Cs-shRNA libraries before and after Bsu36I enzyme digest, we determined the ratios of their respective normalized read counts. Importantly, all ratios were close to one and lined up close to the respective diagonal with a linear regression R² of 0.9687 (Figure 3—figure supplement 5F), demonstrating a high correlation of individual data points and no influence of the Bsu36I digest on 3Cs-shRNA sequence distribution. In summary, this demonstrates that our 3Cs technology can be adapted to generate high-quality shRNA reagents in single and pooled formats.

A partially randomized 3Cs gRNA library to target the coding and noncoding genome simultaneously

The 3Cs method does not require the PCR-amplification of gRNA-encoding oligonucleotides, is free of conventional cloning steps and uncouples sequence diversity from sequence distribution. Thus, we hypothesized that 3Cs gRNA library diversity is mostly limited by the number of distinguishable oligonucleotides within a 3Cs reaction and the subsequent bacterial electroporation efficiencies. Limitations in electroporation efficiencies can be overcome by accumulating the individual efficiencies of multiple parallel reactions, as routinely performed to amplify phage libraries with diversities beyond 10⁹ (Smith and Scott, 1993). The number of distinguishable oligonucleotides is limited by the capacity of synthetic oligonucleotide synthesis, rendering truly genome-wide gene perturbation libraries unfeasible.

Previously identified SpCas9 nucleotide preferences included a preference for 3′ pyridine bases whereas thymidine nucleotides are disfavored (Doench et al., 2014; Doench et al., 2016). In an exploratory effort, we translated SpCas9 gRNA nucleotide preferences into a degenerated oligonucleotide sequence (truly genome-wide, TGW) of 20 nucleotides, representing a theoretical diversity of 7.3 × 10¹⁰ (Figure 4A) and maximally targeting 1.65 × 10⁷ sites in the human coding and noncoding genome (Figure 4A and Supplementary file 8). As mentioned above, randomized positions in DNA oligonucleotides can contain a strong single nucleotide cytosine bias, we therefore used hand-mixed phosphoramidite pools to generate this oligonucleotide pool. In eight parallel large-scale 3Cs reactions (each involving 20 µg ssDNA and 600 ng oligonucleotide), we applied this oligonucleotide to dU-ssDNA of pLentiGuide and pLentiCRISPRv2 and resolved the 3Cs products by gel electrophoresis (Figure 4B). Importantly, we note that the amplification of this degenerated library is limited by the number of transformed bacteria. As complete generation of this reagent is currently unfeasible, we limited our efforts to eight parallel electroporation reactions and achieved a cumulative transformation efficiency of 1.2 × 10¹⁰, accounting for ~16% of TGW sequences, assuming a stringent uniform sequence distribution. In order to approximate the gRNA distribution, we generated 14.4 million NGS reads and found 94.19% to be unique (Figure 4C and Supplementary file 9). We went on to extract the nucleotide frequencies for each gRNA position from TGW NGS reads, translated them to IUPAC nomenclature, and identified the identical degenerated sequence that we initially applied in the form of the degenerated oligonucleotide pool (Figure 4D). Furthermore, the distribution of TGW read counts had a CV of 0.26% and an AUC of 0.52, suggesting a nearly uniform distribution of sequenced and represented gRNA sequences (Figure 4E–F) (Makowski and Soares, 2003).

Figure 4

Download asset Open asset

A truly genome-wide (TGW) CRISPR/Cas 3Cs-gRNA library to interrogate the coding and noncoding genome.

(A) Previously reported SpCas9 nucleotide preferences were translated into a degenerated oligonucleotide sequence (TGW) representing a total sequence diversity of 7.3 × 10¹⁰ (Doench et al., 2014). (B) The TGW oligonucleotide shown in panel (A) was used in a 3Cs reaction on template ssDNA derived from pLentiGuide and plentiCRISPRv2 plasmids to generate 3Cs-dsDNA, which was analyzed by gel electrophoresis. (C) Scatter plot visualizing TGW library NGS data from 14,448,469 total reads. Displayed are the log₁₀ values of gRNA abundance (reads) against the log₁₀ of the respective percentage of identified TGW gRNAs. 94.2% of all identified gRNAs were found only once (see also Supplementary file 9). (D) High-throughput sequencing data from panel (C) were used to compute the nucleotide frequency at each gRNA nucleotide position in order to determine the nucleotide profile of the TGW library. The identified nucleotide frequencies closely resemble the pattern of the degenerated TGW oligonucleotide from panel (A). Color code represents nucleotide frequency as indicated by the color gradient on the right. (E) Box plot of TGW gRNA distribution with data derived from panel (C). The coefficient of variation of 0.26% suggests a uniform distribution of represented sequences. (F) The gRNA distribution of the TGW library as derived from panel (C) plotted as a Lorenz curve. TGW NGS data derived from pane; (C). The area under the curve (AUC) of 0.52 suggests a uniform distribution of gRNA sequences.

https://doi.org/10.7554/eLife.42549.010

The 3Cs technology enables the generation of gRNA libraries with sequence diversities exceeding those that can be captured by coverage-based screenings. Being aware of the coverage limitations, we explored a TGW library screen in the context of a strong positive selection pressure. In hTERT–RPE1 cells, doxorubicin induces a robust, irreversible and dose-dependent reduction of cell viability within 4 days (Figure 5—figure supplement 1A). We therefore generated 5.5 × 10⁸ infectious lentiviral particles of the TGW library and applied them to screen for resistance to doxorubicin (Figure 5A–B). In three biologically independent experiments, we transduced a total of 5.5 × 10⁸ hTERT–RPE1 cells with a MOI of 1, added 1 µM doxorubicin 7 d post transduction and replaced the media every 7 d for 21 consecutive days. Cells that survived the treatment were harvested and their genomic DNA was extracted for NGS (Figure 5B). Although the experimental reproducibility was low (0.004%), we identified an experimental overlap of 4,232 gRNAs, with associated Spearman ranking and Pearson correlations of above 0.75 (Figure 5C and Figure 5—figure supplement 1B). To validate these sequences, we designed and generated a new 3Cs-gRNA validation library consisting of the identified 4,232 gRNAs and repeated the doxorubicin resistance screen with established experimental parameters (coverage of 1,000 and MOI of 0.2) (Figure 5—figure supplement 2A–B). As a result, we reidentified 2,716 gRNAs of which 795 were more than two-fold enriched after 21 d of doxorubicin treatment when compared to the untreated control (Figure 5D and Supplementary file 11). In order to map the 795 gRNA sequences to a location within the human genome, we applied Cas-OFFinder and used the Ensembl, ENCODE, Roadmap Epigenomics and Blueprint databases for sequence annotation (Bernstein et al., 2010; Dunham et al., 2012; Fernández et al., 2016; Zerbino et al., 2018). We identified seven gRNAs to target five genes (PDE8B, AVPR2, CYSLTR2, IL3RA, and POLE2), of which PDE8B and AVPR2 were targeted by two gRNAs, and a single gRNA sequence matched a noncoding location within chromosome 8 (chr8:93022800) (Figure 5E and Figure 5G and Supplementary file 12–13). The coding hits that we identified included CysLTR2, a Leukotriene C4 G-protein-coupled eicosanoid receptor that was recently reported to induce doxorubicin resistance by abolishing the accumulation of reactive oxygen species (Dvash et al., 2015). To validate CysLTR2 as a doxorubicin-resistance inducing hit, we chemically inhibited CysLTR2 with increasing concentrations of Bay-CysLT2 or Bay-u9773 in the presence of doxorubicin and quantified cell viability by AlamarBlue staining. Importantly, both drugs reverted the doxorubicin-induced toxicity in a dose-dependent manner (Figure 5F), suggesting that the loss of CysLTR2 causes doxorubicin resistance.

Figure 5 with 2 supplements see all

Download asset Open asset

TGW-based identification of coding and noncoding sequences that are associated with doxorubicin resistance.

(A) Scheme illustrating the workflow used to generate the pooled lentivirus of the TGW library. The DNA of eight independent TGW 3Cs syntheses was pooled and used to transfect HEK293T cells to produce 5.5 × 10⁸ infectious lentiviral particles. (B) Experimental workflow of the doxorubicin screen in hTERT–RPE1 cells. hTERT–RPE1 cells were transduced with TGW lentivirus with an MOI of 1, selected with puromycin for 7 days, and treated with 1 µM doxorubicin. After three weeks of continuous doxorubicin treatment, all surviving cells were collected and processed for further analysis. (C) Genomic DNA derived from three independent experiments (n = 3), performed according to the scheme illustrated in panel (B), was used to perform NGS and gRNA sequence identification. Computational analysis identified an experimental overlap of 4,232 gRNAs (see also Supplementary file 10). (D) A 3Cs library containing the experimental overlap of 4,232 gRNAs (the validation library) was generated and screened with an experimental coverage of 1,000 and an MOI of 0.1 (similar to the workflow shown in panel (B); see also Figure 5—figure supplement 2). NGS of all surviving cells and computational analysis identified 795 gRNAs that were enriched more than two-fold (orange) when compared to an untreated control. (E) Pie chart visualizing the distribution of coding target regions with respect to SpCas9 off-target rate (0 to 2 mismatches). A total of 192 gRNAs (22.38% of 795 gRNAs) could be mapped to coding regions. Color code represents degree of nucleotide mismatch. (F) Chemical inhibition of cells rescued by CysLTR2 from doxorubicin-mediated toxicity. hTERT–RPE1 cells were treated for 4 d with increasing concentrations of doxorubicin and two chemical inhibitors of CysLTR2 (Bay-CysLT2 and Bay-u9773) before cellular viability was determined by AlamarBlue assays. Averaged values over three biological replicates (n = 3) in arbitrary units (AU) are displayed. (G) Pie chart visualizing the distribution of noncoding target regions with respect to SpCas9 off-target rate (0 to 2 mismatches). A total of 222 gRNAs (27.92% of 795 gRNAs) could be mapped to noncoding regions. Color codingshows the degree of nucleotide mismatch. (H) Molecular signature analysis of coding gRNA target sites identifies a set of genes that are downregulated in cells expressing mutant forms of ERCC3 as the top hit. From among the 178-coding gRNA target site-associated genes, 25 genes are part of the ERCC3 group (which has a total of 855 genes) with high confidence (p=3.7e-17). (I) A list of the 25 ‘down in ERCC3 mutated cells’ genes (light green), as well as their known first- and second-degree interacting genes (grey), identifies cytokinesis (DOCK1/4 genes) and vesicle transport (SEC24B/TRAPPC8 genes) gene interactions. Interaction data adapted from String 10.5. (J) Pie chart visualizing the distribution of noncoding gRNA target site annotations, including their frequency (as percentages of total noncoding hits). Please note: for 52.7% of all noncoding gRNA target sites, no annotation is available. (K) Molecular signature analysis of noncoding gRNA target sites, using adjacently located genes (one for each, 5′ and 3′). 33 genes, out of the 211 genes analyzed, are part of the ‘Biosynthetic process’ group (which includes a total of 1,805 genes) with high confidence (p=3.4e-10).

https://doi.org/10.7554/eLife.42549.011

To account for sequence differences between rRNAs from hTERT–RPE1 cells and the reference genome (GRCh38.86), as well as SpCas9 off-target activity (Cho et al., 2014; Hsu et al., 2013; Pattanayak et al., 2013), we extended our computational analysis by allowing up to two mismatches during Cas-OFFinder-based target sequence identification. As expected, the number of gRNAs that could be mapped to the reference genome increased to 192 and 222 for coding and noncoding target sites, respectively, accounting for 50.3% of the 795 gRNAs (Figure 5E and Supplementary file 12). Interestingly, when mismatches are allowed, we identified three gRNAs that targeted two different coding positions within the AKAP6 gene (chr14:32671632, chr14:32784395), as well as five different gRNAs that targeted the exact same coding position within the ASPA2 gene (chr2:9229295) (Figure 5E and Supplementary file 12). Within the noncoding gRNA target sites, we identified four gRNAs that targeted four different positions on chromosome X (56546543, 57766898, 63133046, 63245878), all of which are in close proximity to the SPIN2A gene (Figure 5G and Supplementary file 13), suggesting a doxorubicin-tolerance-inducing function in this locus.

In order to reveal whether the identified set of coding genes correlated with reported phenotypes or gene ontologies, we performed a molecular signature analysis of the 178 coding target regions and identified 25 genes that match the UV_RESPONSE_VIA_ERCC3 (downregulated in mutant ERCC3-expressing fibroblasts) group as the most significant hit (p-value of 6.11E-17) (Figure 5H–I and Supplementary file 14). Importantly, doxorubicin-induced interstrand crosslinks are repaired by ERCC3-dependent nucleotide excision repair (NER), and NER-deficient cells have been shown to display greater tolerance to adduct-forming anthracycline treatment, connecting these 25 genes to an increased doxorubicin tolerance (Bret et al., 2013; Spencer et al., 2008; van Brabant et al., 2000). Furthermore, mutations in noncoding sequences have been linked to the misregulation of adjacently located genes by disrupting cis-regulatory elements (Hnisz et al., 2016; Katainen et al., 2015; Weinhold et al., 2014). Therefore, we searched for available biotypes that are associated with the identified noncoding target regions and were able to identify target sites matching ‘predicted promoter’ (12.6%), ‘lincRNAs’ (11.3%) as well as ‘processed pseudogenes’ (4.5%) and ‘CTCF binding sites’ (3.6%) annotations (Figure 5J and Supplementary file 13). However, no biotype or genomic annotation was available for 52.7% of the noncoding gRNAs, and we therefore identified the nearest 5′ and 3′ located genes and used them to perform a molecular signature analysis (Figure 5K and Supplementary file 15). Among the four most enriched molecular signatures, we identified genes that are regulated by the transcription factors FOXO4, KLF1, and NFAT, noting that FOXO4 and NFAT downregulation has previously been reported to increase doxorubicin tolerance (Figure 5K and Supplementary file 15).

In summary, we explored the possibility of generating a partially degenerated SpCas9-gRNA library and its application in positive selection screens. Despite the limitations attributed to the generation of such a reagent and its applicability in cellular screens, we identified previously known and unknown genes that are presumably linked to doxorubicin resistance. In addition, we identified noncoding sequence regions and their neighboring genes for which gene set enrichment analyses revealed an enrichment for transcription factors that are connected to increased doxorubicin tolerance.

An optimized truly genome-wide 3Cs gRNA library

A library’s sequence diversity and distribution directly dictates the experimental scale for positive and negative selection screens. Therefore, reducing the size of the TGW library to enable coverage-based screens is highly desirable. In line with this, gRNAs that are truncated to 17 nt have been demonstrated to maintain on-target efficiencies while reducing off-target effects (Fu et al., 2014; Wyvekens et al., 2015). We therefore truncated the degenerated TGW oligonucleotide sequence to 17 nt(optimized TGA, oTGW), approximating to a a total oligonucleotide diversity of 1.5 × 10⁹(Figure 6A). Importantly, the oTGW sequence diversity is 50-times smaller than the TGW sequence diversity, while the 1.65 × 10⁷ unique target sequences in the human genome remain identical (Figure 6A and Supplementary file 8). As for the TGW oligonucleotide, we used hand-mixed phosphoramidite pools to synthesize the oTGW oligonucleotide and performed 3Cs reactions by combining this nucleotide with a ssDNA dU-template of the three conventionally used lentiviral CRISPR/Cas plasmids: pLentiGuide, pLentiCRISPRv2(Puro) and pLentiCRISPR(GFP-Puro). We then determined successful 3Cs reactions by gel electrophoresis (Figure 6B). Subsequent to bacterial amplification, an I-SceI clean-up step was performed before the three oTGW libraries were analyzed by NGS with an average of 28.4 million reads per library (Figure 6C–E and Supplementary file 8). Importantly, extracted gRNA-position nucleotide frequencies were extracted and translated to IUPAC nomenclature, revealing the initial oTGW degenerated oligonucleotide sequence (Figure 6F–H). Furthermore, an average wildtype remnant rate of 0.2% was determined and AUC values were 0.54 or below (Figure 6—figure supplement 1), suggesting a uniform distribution of represented gRNA sequences in all three oTGW libraries (Makowski and Soares, 2003). Thus, these oTGW libraries are the first of their kind, have the potential to elevate functional genomics approaches and will be made available to the scientific community by the Goethe University Depository (http://www.innovectis.de/INNOVECTIS-Frankfurt/Technologieangebote/Depository).

Figure 6 with 1 supplement see all

Download asset Open asset

Optimized TGW (oTGW) libraries for functional interrogations in the coding and noncoding genome.

(A) oTGW oligonucleotide sequence, based on reported SpCas9 nucleotide preferences. The truncation of three 5′ nucleotides results in 17-mer gRNAs with a total oligonucleotide diversity of 1.5 × 10⁹. (B) oTGW 3Cs-dsDNA was synthesized on a ssDNA-template of pLentiGuide, pLentiCRISPRv2-Puro and pLentiCRISPRv2-GFP/Puro. 3Cs products are analyzed by gel electrophoresis on a 0.8% TAE/agarose gel. (**C–E**) Removal of template plasmid remnants with an I-SceI restriction enzyme digest. oTGW 3Cs-dsDNA was electroporated with efficiencies above 6.31 × 10⁹ and amplified for DNA purification (P1). A subsequent I-SceI restriction enzyme digest and an electroporation of P1 yielded the final 3Cs libraries containing no detectable template plasmid (P2). An analytical restriction enzyme digest with I-SceI and EcoRV removes a 2.5-kb DNA fragment from the template plasmid (empty) and to a minor degree from P1 DNA pools. No 2.5-kb fragment could be observed in the final P2 DNA library pools, demonstrating the high purity of the final libraries (see also Figure 6—figure supplement 1). (**F–H**) High-throughput sequencing data derived from panels (**C–E**) were used to compute the nucleotide frequency of each gRNA nucleotide position, which are visualized as heat maps. The identified nucleotide frequencies closely resemble the pattern of the degenerated oTGW oligonucleotide shown in panel (A). Color coding illustrates the nucleotide frequencies (0% in blue to 50% in red).

https://doi.org/10.7554/eLife.42549.014

Discussion

In the present study, we describe the 3Cs technology, an improved Kunkel mutagenesis protocol that facilitates the one-step and cloning-free generation of high-fidelity CRISPR/Cas and RNAi gene perturbation reagents. 3Cs uncouples sequence diversity from sequence distribution, making it useful for the generation of CRISPR/Cas gRNA libraries of arbitrary sequence diversities.

The 3Cs technology has several unique features. First, the bacteriophage-mediated generation of ssDNA makes the technology applicable to all plasmids containing a f1-origin of replication. Second, ssDNA-mutagenic oligonucleotides are annealed to the ssDNA of the template DNA, thereby circumventing the need for two oligonucleotides per gRNA and amplification by PCR, reducing associated costs and sequence bias. In line with this, T7 DNA polymerase, which is used in the 3Cs reaction, has an error rate of approximately 15 × 10⁻⁶, resulting in as little as 0.0015% of mutated heteroduplex 3Cs product assuming 2 µg ssDNA of a 10 kb plasmid (Kong et al., 1993). Third, the presence of a gRNA placeholder sequence enables the near-complete removal of wildtype plasmid remnants. Last, we demonstrate 3Cs applicability and performance using the example of lentiviral plasmids. However, we foresee the plasmid range to be expanded to recombinant Adeno-Associated Virus (rAAV) plasmids and adenoviruses, as well as coding sequences for protein mutagenesis that enable in-cell and in-vivo functional screenings.

We demonstrate the fidelity and performance of 3Cs reagents by identifying the proliferative phenotype of human DUBs, validating previously known and uncovering hitherto unknown DUB phenotypes. We show that depletion of the DUBs USP28 and BRCC3 induces positive proliferation phenotypes, suggesting that they have tumor suppressive functions. In line with this, USP28 was recently identified as preventing p53 elevation in response to centrosome loss resulting from Plk4 inhibition, thereby preventing growth arrest in response to prolonged mitosis (Meitinger et al., 2016). On the other hand, we identify DUB enzymes whose depletion reduces cell fitness dramatically. Among them are COPS6 and USP7, both of which have been implicated in DNA damage response and enhanced p53 stability, leading to a prolonged G₁-phase cell cycle arrest (Li et al., 2004). In line with this, recent work demonstrates a direct connection between CRISPR/Cas gene editing and a p53-dependent DNA damage response that is associated with G₁ cell cycle arrest (Haapaniemi et al., 2018), suggesting that DNA damage-associated DUBs (e.g. COPS6 or USP7) can be used as p53 alternatives to control for DNA-damage-induced cell cycle arrest and negative hit calling in CRISPR/Cas functional genomic screens.

CRISPR/Cas functional genomic screens are widely used to interrogate protein-coding regions, but only few studies have used CRISPR/Cas gene editing to investigate the noncoding genome (Canver et al., 2015; Diao et al., 2016; Korkmaz et al., 2016; Sanjana et al., 2016). Although these studies have the potential to open a new area of functional genomics, their general applicability is limited to a predefined set of gRNAs and therefore to a small subset of genomic regions. A unique feature of the 3Cs technology is the uncoupling of gRNA sequence diversity from gRNA sequence distribution, facilitating the generation of partially randomized gRNA libraries. A fully randomized library with oligonucleotides of length 20 (20N) resembles the entire space of possible gRNA sequences and comprises 4²⁰ = 1.1*10¹² different sequences. Although an oligonucleotide pool covering this huge sequence space could theoretically be synthesized, there are at least two major reasons that render the experimental application unfeasible. The first reason is that the fraction of gRNA sequences that have a target site in the genome of interest would be very low. The ~300 million SpCas9 target sites in the human genome would be covered by a fully randomized library, but would represent only 0.027% of the library. Consequently, the second reason is that the experimental scale would have to be extremely high to cover all naturally occurring target sites, including all non-human targeting gRNAs that also need to be included, with sufficient coverage. Screening the 20N library with a coverage of 100 and a MOI of 0.5 would require 2.2*10¹⁴ cells, a cell number that is clearly not feasible in current experimental setups.

By focusing on SpCas9 nucleotide preferences, we introduce the partially randomized TGW library, which preferentially targets active gRNA sequences in the entire genome, including both coding and noncoding regions. The size of the TGW library is dramatically smaller than a fully randomized library but still comprises 7.3*10¹⁰ different gRNA sequences. We sought to explore the experimental application of such a large library and chose to screen for resistance to the cytotoxic agent doxorubicin in hTERT–RPE1 cells. Although insufficient TGW coverage led to low biological reproducibility, we were still able to retrieve gRNA overlap of three experiments. We identified protein-coding genes that have previously been associated with doxorubicin resistance such as CysLTR2, whose inhibition by two small chemical compounds reverted the doxorubicin-induced toxicity.

Interestingly, about half of the gRNAs for which we were able to identify a matching sequence in the human genome map to regions in the noncoding genome. Noncoding mutations have been shown to play pivotal roles in tumorigenesis by disrupting the function of cis-regulatory elements (e.g. promoters, enhancer, or transcription- factor binding sites) and topologically associating domains (TADs), thereby directly affecting the transcriptional regulation of adjacently located genes (Katainen et al., 2015; Weinhold et al., 2014). We annotated noncoding regions that are associated with hits from our TGW screen by mapping the corresponding gRNA sequences against the human reference genome. By allowing up to two mismatches, we attempted to account not only for exact matches but also for mismatched target sites. This approach yielded a number of potential hits that should be interpreted with caution because cut sites might be called incorrectly and more stringent validation criteria are necessary. Nevertheless, we found a small set of gRNAs that presumably target predicted promoter sequences as well as CTCF and transcription factor binding sites, although further validation is necessary to gain mechanistic insights into how these sequences are linked to doxorubicin resistance.

Furthermore, our computational approach for coding and noncoding hit-calling is sensitive to incorrect calling of cut sites that can lead to false-positive target regions. We therefore suggest that the actual genome sequence of the cell line of interest is used in order to limit false-positive hit calling. We used the human reference genome (GRCh38.86), which might differ from the hTERT–RPE1 genome, potentially giving misleading conclusions in terms of hit calling. Another strategy to increase the rate of true-positive hits is to increase library coverage in the experiment. However, it is currently not feasible to screen either the 20N or the TGW library with sufficient coverage, at least not in adherent cell lines. To enable coverage-based truly genome-wide screenings, we reduced the TGW library diversity by truncating the library to 17-mers, yielding optimized TGW (oTGW) libraries that are more suited for high-throughput experiments with suspension cells. We propose that the oTGW CRISPR/Cas gRNA libraries are suitable for broad biological screenings with the highest genetic target complexity. Such screens can be followed by targeted validation screens using newly synthesized libraries that are tailored to the initially identified cut sites, which will enable rapid functional validation of such complex genetic experiments.

Materials and methods

Key resources table

Reagent type (species) or resource	Designation	Source or reference	RRID identifiers
Cell line (human)	HEK293T	ATCC	RRID:CVCL_0063
Cell line (human)	hTERT-RPE1	ATCC	RRID:CVCL_4388
Cell line (human)	RPE1	Ian Cheeseman
Antibody	Anti-GFP (B-2)	Santa Cruz Biotechnology	RRID:AB_627695
Antibody	Anti-alpha Tubulin	DSHB	RRID:AB_2315509
Antibody	Goat anti-Mouse IgG (H + L) Secondary Antibody	Thermo Fisher Scientific	RRID:AB_228307
Antibody	Goat anti-Rabbit IgG (H + L) Secondary Antibody	Thermo Fisher Scientific	RRID:AB_228341
Bacteria (E. coli)	K12 CJ236	NEB (E4141)
Bacteria (E. coli)	10 beta	NEB (C3020K)
Recombinant DNA reagent	pLentiGuide	Addgene (52963)
Recombinant DNA reagent	pLentiCRISPRv2	Addgene (52961)
Recombinant DNA reagent	PLKO.1	Addgene (8453)
Recombinant DNA reagent	pPax2	Addgene (12260)
Recombinant DNA reagent	pMD2.G	Addgene (12259)
Commercial kit	E.Z.N.A. M13 DNA Mini Kit	Omega Bio-Tek (D69001-01)
Commercial kit	GeneJET Gel extraction kit	Thermo Fisher (K0692)
Commercial kit	Plasmid Maxi Kit	Qiagen (12163)
Commercial kit	PureLink Genomic DNA Mini Kit	Invitrogen (K1820-01)
Chemical compound, drug	Ampicillin	Roth (K029.2)
Chemical compound, drug	Chloramphenicol	Roth (3886.1)
Chemical compound, drug	Kanamycin	Roth (T832.3)
Chemical compound, drug	NaCl	Roth (31434)
Chemical compound, drug	ATP	NEB (756)
Chemical compound, drug	DTT	Cell Signaling Technology Europe (7016)
Chemical compound, drug	dNTP mix	Roth (0178.1/2)
Chemical compound, drug	Penicillin-streptomycin	Sigma-Aldrich (P4333)
Chemical compound, drug	Hygromycin	Capricorn Scientific (HYG-H)
Chemical compound, drug	Lipofectamin 2000	Thermo Fisher (11668019)
Chemical compound, drug	Polybrene	Sigma Aldrich (H9268)
Chemical compound, drug	Doxorubicin	Selleckchem (S1208)
Chemical compound, drug	Bay-CysLT2	Cayman Chemical (10532)
Chemical compound, drug	Bay-u9773	Tocris Bioscience (3138)
Other	T4 DNA ligase	NEB (M0202)
Other	T7 DNA polymerase (unmodified)	NEB (M0274)
Other	2 mm electroporation cuvette	BTX (45–0125)
Other	Gene Pulser electroporation system	BioRad (164–2076)
Other	I-SceI	NEB (R0694)
Other	DMEM	Thermo Fisher (41965–039)
Other	DMEM/F12	Thermo Fisher (11320–074)
Other	FBS	Thermo Fisher (10270)
Other	M13KO7 helper phage	NEB (N0315)
Other	Polyethylene glycol	Roth (263.2)
Other	SOC outgrowth medium	Thermo Fisher (15544034)
Other	2YT medium	Roth (6676.2)
Other	T4 polynucleotide kinase	NEB (M0201)
Other	T7 endonuclease	NEB (M0302)
Other	OneTaq DNA polymerase	NEB (M0480)
Other	Next High-Fidelity 2x PCR Master Mix	NEB (M0541)
Other	NextSeq 500	Illumina
Software, algorithm	bcl2fastq	Illumina	RRID:SCR_015058
Software, algorithm	cutadapt v.1.15	Martin, 2011	RRID:SCR_011841
Software, algorithm	CasOFFinder v2.4	Bae et al., 2014
Software, algorithm	SnpEff 4.3T	Cingolani et al., 2012	RRID:SCR_005191
Software, algorithm	MAGeCK	Li et al., 2014

Share this article

Cite this article

The 3Cs technology - covalently-closed-circular-synthesized (3Cs) CRISPR/Cas gRNA reagents.

3Cs is a robust technology that uncouples sequence diversity from sequence distribution.

3Cs reagents are of high fidelity — the essentiality of human DUBs for cell fitness.

A truly genome-wide (TGW) CRISPR/Cas 3Cs-gRNA library to interrogate the coding and noncoding genome.

TGW-based identification of coding and noncoding sequences that are associated with doxorubicin resistance.

Optimized TGW (oTGW) libraries for functional interrogations in the coding and noncoding genome.

Author details

Martin Wegner

Contribution

Contributed equally with

Competing interests

Valentina Diehl

Contribution

Contributed equally with

Competing interests

Verena Bittl

Contribution

Competing interests

Rahel de Bruyn

Contribution

Competing interests

Svenja Wiechmann

Contribution

Competing interests

Yves Matthess

Contribution

Competing interests

Marie Hebel

Contribution

Competing interests

Michael GB Hayes

Contribution

Competing interests

Simone Schaubeck

Contribution

Competing interests

Christopher Benner

Contribution

Competing interests

Sven Heinz

Contribution

Competing interests

Anja Bremm

Contribution

Competing interests

Ivan Dikic

Contribution

Competing interests

Andreas Ernst

Contribution

Competing interests

Manuel Kaulich

Contribution

For correspondence

Competing interests

Citations by DOI

Downloads (link to download the article as PDF)

Open citations (links to open the citations from this article in various online reference manager services)

Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)

Categories and tags

Research organism