High-throughput library transgenesis in Caenorhabditis elegans via Transgenic Arrays Resulting in Diversity of Integrated Sequences (TARDIS)

eLife assessment

This manuscript provides a description of an approach for efficiently integrating diverse libraries into the C. elegans genome and tools that enable researchers to use the method. It is a valuable contribution for researchers carrying out experiments that would benefit from easy generation of such libraries, and the data for the effectiveness of the method is solid. The advantages of this approach in terms of ease and effectiveness relative to others with similar aims will emerge as they are put to more general use in addressing biological problems.

https://doi.org/10.7554/eLife.84831.3.sa0

Significance of the findings:

Valuable: Findings that have theoretical or practical implications for a subfield

Strength of evidence:

Solid: Methods, data and analyses broadly support the claims with only minor weaknesses

During the peer-review process the editor and reviewers write an eLife Assessment that summarises the significance of the findings reported in the article (on a scale ranging from landmark to useful) and the strength of the evidence (on a scale ranging from exceptional to inadequate).

Abstract

High-throughput transgenesis using synthetic DNA libraries is a powerful method for systematically exploring genetic function. Diverse synthesized libraries have been used for protein engineering, identification of protein–protein interactions, characterization of promoter libraries, developmental and evolutionary lineage tracking, and various other exploratory assays. However, the need for library transgenesis has effectively restricted these approaches to single-cell models. Here, we present Transgenic Arrays Resulting in Diversity of Integrated Sequences (TARDIS), a simple yet powerful approach to large-scale transgenesis that overcomes typical limitations encountered in multicellular systems. TARDIS splits transgenesis into two steps: creation of individuals carrying experimentally introduced sequence libraries, followed by inducible extraction and integration of individual sequences/library components from the larger library cassette into engineered genomic sites. Thus, transformation of a single individual, followed by lineage expansion and functional transgenesis, gives rise to thousands of genetically unique transgenic individuals. We demonstrate the power of this system using engineered, split selectable TARDIS sites in Caenorhabditis elegans to generate (1) a large set of individually barcoded lineages and (2) transcriptional reporter lines from predefined promoter libraries. We find that this approach increases transformation yields up to approximately 1000-fold over current single-step methods. While we demonstrate the utility of TARDIS using C. elegans, in principle the process is adaptable to any system where experimentally generated genomic loci landing pads and diverse, heritable DNA elements can be generated.

eLife digest

Transgenesis – the ability to insert foreign genetic material (known as transgenes) into the genome of an organism – has revolutionized biological research. This approach has made it possible for scientists to study the role of specific genes and to produce animal models which mimic aspects of human diseases.

For transgenes to be maintained and passed down to future generations, they must be introduced into germ cells which will go on to form the egg and sperm of the organism. However, despite advances in genetic engineering, this process (called ‘specific transgenesis’) is still laborious and time-consuming, and limits researchers to working with only a small number of known DNA sequences at a time.

In contrast, ‘exploratory transgenesis’ – where dozens of transgenes from a library of DNA sequences are introduced simultaneously into multiple individuals – is more efficient and allows for more large-scale experiments. However, this approach can only be done with single-celled organisms like bacteria, and remains virtually impossible in laboratory animals like worms or mice.

Stevenson et al. therefore set out to boost the efficiency of exploratory transgenesis in a commonly used laboratory animal, the roundworm Caenorhabditis elegans. To do this, they used the ‘library’ principle of exploratory transgenesis to develop a new resource called TARDIS (short for Transgenic Arrays Resulting in Diversity of Integrated Sequences).

First, Stevenson et al. genetically engineered worms to carry a ‘landing site’ for foreign DNA. Next, a library of transgenes and a mechanism which cuts pieces of DNA and pastes them into the landing site were introduced into the germ cells of these worms using traditional methods. The worms were then bred to generate a large population of offspring that had inherited this array of foreign DNA sequences. Finally, the ‘cut and paste’ mechanism was switched on and a random transgene was inserted into the landing site in the genome. This resulted in thousands of worms which each had a unique genetic modification that can be passed on to future generations.

These results show for the first time that larger-scale transgenesis experiments are possible in multi-cellular animals. In the future, Stevenson et al. hope that TARDIS can be adapted to different organisms and allow researchers to carry out experiments that were not previously possible.

Introduction

Transgenesis, which is the specific and heritable introduction of foreign DNA into genomes, has been a central tool for functional analysis and genetic engineering for nearly 40 years. The power of transgenesis is due in part to the wide variety of assays and techniques that are built upon controlled introduction of novel DNA sequences into a native genome. While there are many uses for transgenesis, in practice most can be grouped into those inserting a small number of known sequences (specific transgenesis) and those introducing many sequence variants from experimental libraries (exploratory transgenesis). While the ability to perform specific transgenesis has become a de facto requirement for all model organisms, exploratory transgenesis remains effectively limited to single-cell models (both prokaryotic and eukaryotic) because of biological limitations generated by inheritance in multicellular organisms. In single-cell models, high-throughput transgenesis has been used for exploratory sampling of sequence space using protein interaction libraries (Joung et al., 2000), barcode-lineage tracking libraries (Levy et al., 2015; Nguyen Ba et al., 2019), directed evolution (Packer and Liu, 2015), synthetic promoter library screens (Wu et al., 2019), and mutagenesis screens (Bock et al., 2022; Erwood et al., 2022; Kim et al., 2022; Sánchez-Rivera et al., 2022). Despite the usefulness of such experiments in single-celled systems, either in microorganisms or in cell culture, increasing transgenic throughput in multicellular models holds the potential to expand the impact of exploratory transgenesis in functional domains, such as inter-tissue signaling, neuronal health, and animal behavior, that are dependent on multicellular interactions and therefore difficult to replicate in single-cell models.

Exploratory transgenesis in single-cell models has been facilitated by the availability of in vitro-generated DNA libraries, selectable markers, plasmids, in vivo homologous recombination, and most importantly, the ability to massively parallelize transgenesis using microbial transformation or eukaryotic cell transfection/transduction. Currently, there is no practical means to make populations of uniquely transgenic individuals from sequence libraries at a similar scale in animal systems due to the Weismann barrier (Weismann, 1893): the split between soma and germline. The requirement that the germline be accessible and editable has forced animal systems into a transgenic bottleneck compared to single-cell systems because it is very difficult to introduce exogenous DNA directly into the germline in a high-throughput manner, relying instead on injection, bombardment, or some other physical intervention. This low-throughput limitation in animals dramatically reduces the sequence diversity that can be sampled, effectively preventing large-scale exploratory experiments from being performed. Attempts have been made to parallelize transgenic creation in multicellular model organisms, for example, the development of Brainbow (Livet et al., 2007; Weissman and Pan, 2015), ifgMosaic analysis (Pontes-Quero et al., 2017), P[acman] libraries in Drosophila (Venken et al., 2009), and multiple types of transformation in plants (Ismagul et al., 2018; Xu et al., 2022). 
In Caenorhabditis elegans, CRISPR technology combined with custom engineered sites within the genome (‘landing pads’) has facilitated the generation of single-copy integrations (Malaiwong et al., 2023; Nonet, 2020; Nonet, 2021; Silva-García et al., 2019; Stevenson et al., 2020; Vicencio et al., 2019), and attempts have been made to multiplex transgenesis using traditional integration methods in conjunction with specialized landing pad systems (Gilleland et al., 2015; Kaymak et al., 2016; Mouridi et al., 2022; Radman et al., 2013). While these efforts have increased throughput over standard single-copy integration methods, throughput still remains too low for effective exploratory transgenesis, and in some cases requires significant additional labor, cost, equipment, and/or expertise.

Here, we present ‘Transgenic Arrays Resulting in Diversity of Integrated Sequences’ (TARDIS) (Stevenson et al., 2021), a simple yet powerful alternative to traditional single-copy transgenesis. TARDIS greatly expands throughput by explicitly separating and reordering the conceptual steps of transgenesis (Figure 1). To increase throughput, TARDIS begins with an in vitro-generated DNA sequence library that is introduced into germ cells via traditional low-throughput methods (i.e., germline transformation, Figure 1). While traditional transgenesis typically couples the physical introduction of DNA into cells with the integration of a selected sequence from the original library, the DNA sequences in TARDIS are designed to be incorporated in large numbers into diverse, heritable sub-libraries (TARDIS libraries), rather than be directly integrated into the desired genomic locus. In addition to the sequence library, a functioning selectable marker is also included to stabilize the inheritance of the TARDIS library over generations. These TARDIS libraries function to create ‘metaploidy’ – expanding the total number of alleles available for inheritance, essentially making the worm genetically ‘bigger on the inside.’ TARDIS library-bearing animals are then allowed to propagate under selection to generate a large population of TARDIS library carriers. After population expansion, genome integration of a single-sequence unit is performed by inducing a double-strand break at a genetically engineered landing pad. This landing pad is designed to both integrate a sequence unit and act as a second selectable marker. We chose C. elegans to validate the TARDIS approach because C. elegans naturally forms extrachromosomal arrays that can be several megabases in size (Carlton et al., 2022; Lin et al., 2021; Mello et al., 1991; Stinchcomb et al., 1985) from injected DNA, which simplifies the generation of heritable ‘TARDIS library arrays’ (TLA) that encompass significant sequence diversity.

Figure 1

Transformation compared to Transgenic Arrays Resulting in Diversity of Integrated Sequences (TARDIS).

For transformation, a large population of cells are individually transformed with a DNA library, resulting in a diverse population of individuals. TARDIS achieves a diversity of individuals by splitting transgenesis into two separate processes: (1) the introduction of a diverse library, which is formed into a TARDIS library array, passed down to future generations and thus replicated; and (2) an event that triggers the integration of a sequence from the library at random, resulting in a diversity of integrated sequences.

We demonstrate the functionality of TARDIS for two use cases: unique animal barcoding and promoter library transgenesis. Barcoding has been widely adopted in microbial systems for evolutionary lineage tracking (Jahn et al., 2018; Levy et al., 2015; Nguyen Ba et al., 2019) and for developmental lineage tracking in animals (Kebschull and Zador, 2018; McKenna et al., 2016). In microbial systems, barcode libraries have relied on highly diverse randomized oligo libraries, compared to animal systems, which have relied on CRE recombinases or randomized Cas9-induced mutations. Here, we present a novel TARDIS barcoding system for an animal model that mimics the scope and diversity previously only possible using microbial systems. Our results show that large, heritable libraries containing thousands of barcodes can be created and maintained as extrachromosomal arrays. Individual sequences are selected and removed from the library upon experimental induction of Cas9 in a proportion consistent with the composition of the TLA with rare overrepresented sequences. We found that TARDIS is also compatible with the integration of large promoters and can be used to simultaneously integrate promoters into multiple genomic locations, providing a tool for multiple insertions at defined locations across the genome. While we demonstrate the system’s advantages in C. elegans, in principle, the system is adaptable for any situation where the sequences for integration can be introduced with high diversity and heritability, and where a genomic site for integration can be made or is available.

Results

Generation of barcode landing pad

We designed a specific landing pad for the introduction and selection of small barcode fragments from high-diversity, multiplexed barcode libraries (Figure 2). This landing pad was designed to be targeted by Cas9 and requires perfect integration on both the 5′ and 3′ ends of a synthetic intron for functional hygromycin B resistance. Current split selection landing pads only provide selection on one side of the double-strand break, which can result in a small percentage of incomplete integrations (Stevenson et al., 2020). To fully test a large library approach, the requirement of genotyping to identify correct integrations must be overcome. A split-selection, hygromycin resistance (HygR) system was chosen for its simplicity and integration-specific selection. A unique synthetic CRISPR guide RNA target sequence was created by removing coding sequence on both sides of an artificial intron, resulting in a nonfunctional HygR gene. By removing critical coding sequence on both sides of the gene, only ‘perfect’ integration events will result in hygromycin resistance (Figure 2A). The synthetic landing pad was integrated at Chromosome II: 8,420,157, which has previously been shown to be permissive for germline expression (Dickinson et al., 2015; Frøkjær-Jensen et al., 2012; Frøkjaer-Jensen et al., 2008).

Figure 2 (with 1 supplement)

Barcode landing pad and diverse donor library.

(A) Schematic design for the barcode landing pad and integration. A broken hygromycin resistance gene is targeted by Cas9, which repairs off the Transgenic Arrays Resulting in Diversity of Integrated Sequences (TARDIS) array, integrating a barcode and restoring the functionality of the gene. (B) The TARDIS multiplex library was created from a randomized oligo library, which underwent 10 cycles of PCR to make a dsDNA template. The barcode fragment was then added into a three-fragment overlap PCR to add homology arms and make the final library for injection.

Generation of high-diversity donor library and TARDIS arrays

Transgenes or DNA sequences can be cloned into plasmid vectors for injections in C. elegans. However, the cloning process is laborious, and the plasmid vector is unnecessary for integration into an array or the genome. We sought to provide a protocol for library generation that maximized diversity and eliminated the requirement of cloning (Figure 2B). Oligo libraries have been used for barcoding (Levy et al., 2015) and for identification of promoter elements (de Boer et al., 2020) in yeast, but practical implementation of large synthetic libraries for transgenesis has never been performed in an animal system. We used randomized synthesized oligos to build a highly diverse library of barcodes, similar to the one described by Levy et al., 2015, via complexing PCR. Given randomized bases present at the 11 nucleotide positions centrally located within the barcode, our base library can yield a theoretical maximum of approximately 4.2 million sequences. Our overlap PCR approach achieves high levels of diversity with minimal ‘jackpotting’ – sequences with higher representation than expected (Figure 3—figure supplement 1). With low-coverage sequencing, we found almost 800,000 unique barcode sequences, providing a large pool of potential sequences that can be incorporated into TARDIS arrays. Only 472 sequences were overrepresented (counts greater than 50), accounting for approximately 6.7% of the total reads and only approximately 0.06% of the unique barcodes detected.
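The diversity figures quoted above can be reproduced with a few lines of arithmetic. The sketch below assumes that each of the 11 randomized positions can hold any of the four bases with no constraints, and it treats "almost 800,000" as a round number; the function and variable names are illustrative only and do not come from the original analysis.

```python
def theoretical_diversity(n_random_positions: int, alphabet_size: int = 4) -> int:
    """Number of distinct sequences possible with fully randomized positions."""
    return alphabet_size ** n_random_positions


# Values taken from the text; 800_000 is an approximation of "almost 800,000".
RANDOM_POSITIONS = 11
UNIQUE_BARCODES_DETECTED = 800_000
OVERREPRESENTED_BARCODES = 472  # barcodes with counts greater than 50

max_barcodes = theoretical_diversity(RANDOM_POSITIONS)
overrepresented_pct = 100 * OVERREPRESENTED_BARCODES / UNIQUE_BARCODES_DETECTED

print(f"Theoretical maximum: {max_barcodes:,} barcodes")
print(f"Overrepresented share of unique barcodes: {overrepresented_pct:.2f}%")
```

Under these assumptions the theoretical maximum is 4^11, which is consistent with the approximately 4.2 million sequences stated in the text.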

We injected our complexed barcodes and isolated individual TARDIS array lines, each containing a subset of the barcode library (Figure 3). Individual injected worms were singled, and we identified four arrays from three plates. Arrays 1 and 2 were identified on separate plates, and were therefore derived from independent array formation events, while array 3, profile 1 and array 3, profile 2 were both identified on the same plate. Analysis of array diversity within these lines shows, somewhat unexpectedly, that during array formation a subset of barcode sequences tended to increase in frequency (Figure 3A and B). Higher frequency barcodes in arrays tend to be independent of the jackpotted sequences of the injection mix as very few are represented in the set of high-frequency barcodes from the injection mix. The high-frequency barcodes also varied between arrays.

Figure 3 (with 3 supplements)

Transgenic Arrays Resulting in Diversity of Integrated Sequences (TARDIS) library arrays can contain large barcode diversity.

(A) Frequency distribution of 1319 unique barcodes in array 1 (PX816). (B) Frequency distribution of the 3001 unique barcode sequences in array 2 (PX817). (C) Sequence logo probabilities of the 15 base pair positions of the barcodes in the injection mix, array 1 and array 2.

We found that array formation does not seem to favor any particular barcode sequence motif (Figure 3C) and that arrays can range considerably in diversity. Array 1 had 1319 unique barcode sequences, array 2 had 3001 unique barcode sequences, array 3 profile 1 had 91 unique barcode sequences, and array 3 profile 2 had 204 unique barcode sequences (Figure 3—figure supplement 2). Across the four arrays, we found a total of 4395 unique barcode sequences. When we compared the individual sequences incorporated during the three independent injections, we found little overlap. 96.5% (4395/4553) of the identified sequences were unique to one injection, 3.0% (136) were incorporated twice, and 0.5% (22) were recovered from all three injections. In contrast to the diversity between injection events, a similar comparison of the two profiles derived from a single injection for array 3 showed considerable overlap, with 68% (62/91) of the profile 1 sequences also being present in profile 2. Overall, our results suggest our complexing PCR oligo library can produce a highly diverse library and that arrays can store a large diversity of unique sequences.
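A comparison of this kind reduces to set operations on the barcodes recovered from each injection or profile. The following is a minimal sketch, assuming each injection is available as a collection of barcode strings; the helper names and the toy barcodes are hypothetical and are not the authors' pipeline.

```python
from collections import Counter


def sharing_profile(injections):
    """Map 'number of injections a barcode appears in' to 'how many barcodes'."""
    appearances = Counter()
    for barcodes in injections:
        appearances.update(set(barcodes))  # count each barcode once per injection
    return Counter(appearances.values())


def overlap_fraction(reference, other):
    """Fraction of barcodes in `reference` that are also present in `other`."""
    reference, other = set(reference), set(other)
    return len(reference & other) / len(reference)


# Toy data only; real inputs would be the barcode sets from each array.
toy_injections = [
    {"AAA", "CCC"},
    {"CCC", "GGG"},
    {"CCC", "GGG", "TTT"},
]
print(sharing_profile(toy_injections))
print(overlap_fraction({"A", "B", "C", "D"}, {"A", "B", "C", "X"}))
```

Applied to the two array 3 profiles, `overlap_fraction` would correspond to the 62/91 comparison reported above.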

The distribution of element frequencies within a given array varies with the diversity of the array. Arrays 1 and 2 show more diversity, with barcode frequencies more similar to one another than the two profiles isolated from array 3 (Figure 3—figure supplement 2). The null assumption is that the array is formed from a simple sample of the injected barcodes in equal proportions. However, arrays have already been reported to jackpot certain sequences. For example, when Lin et al., 2021 injected fragmented DNA, they found that larger fragments were favored in the assembly. In our case, we find some barcode sequences become jackpotted, despite being identical in size. A possible explanation is that early in formation, arrays are replicating sequences, possibly to reach a size threshold. Consistent with this hypothesis, arrays with higher barcode diversity had frequencies closer to one another, while arrays with lower diversity had wider frequency ranges.

Integration from TARDIS array to F1

Our primary motivation in developing the TARDIS method was to utilize individual sequences from the TARDIS array as integrated barcodes. To assay the integration efficiency, we performed TARDIS integration on two biological replicates from a TARDIS array line (PX786) synchronized in the presence of G-418. Of the 100 L1s per plate initially plated on antibiotic-free plates, an average of 41 worms (N = 255 plates) for replicate 1 and 62 worms for replicate 2 (N = 125 plates) survived to the next day. These surviving individuals contained the array, allowing them to survive early-life G-418 exposure, and generally showed fluorescent co-marker expression as well. Following heat shock to induce Cas9, replicate 1 produced 104 plates with hygromycin-resistant individuals, indicating barcode integration, and replicate 2 produced 71. Dividing the surviving worms by the number of plates with integrants, these results suggest that approximately 100–110 worms need to be heat-shocked to obtain an integrated line when using 150 bp homology arms and relatively small inserts such as the barcodes. To assay the integration frequency from the array to the F1, we performed TARDIS integration on four biological replicates derived from PX786. We found that the frequency of integration for barcodes in F1 individuals was strongly correlated with the barcodes’ frequency in the TLA (Figure 4A; R ≈ 0.96, p ≈ 5.7 × 10^–154). Notably, there are two replicated outliers across the four biological replicates. One barcode (TTAAATTATCACATG) tended to integrate more often than would be predicted by its frequency in the array, while another barcode (GCTCATTCTGACGTA) integrated less frequently than expected (Figure 4—figure supplement 1). In general, however, we did not observe any noticeable bias in sequence motif selection following integration (Figure 4B). 
Several individual lineages were isolated from the population with hygromycin selection, validating functional restoration of the HygR gene, and three were randomly chosen for Sanger sequencing to confirm perfect barcode integration. As expected, these sequenced barcodes were also found amongst the barcode sequences of the array.
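The relationship between a barcode's frequency in the array and its frequency among F1 integrants can be illustrated with a simple correlation. The sketch below assumes a Pearson correlation computed on per-barcode read frequencies; the text does not state which correlation statistic was used, and the function names and toy reads are hypothetical.

```python
import math
from collections import Counter


def frequencies(reads):
    """Convert a list of barcode reads into per-barcode relative frequencies."""
    counts = Counter(reads)
    total = sum(counts.values())
    return {barcode: n / total for barcode, n in counts.items()}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def array_vs_f1_correlation(array_reads, f1_reads):
    """Correlate array frequencies with F1 integration frequencies.

    Barcodes present in the array but absent from the F1 reads are
    assigned an F1 frequency of zero.
    """
    array_freq = frequencies(array_reads)
    f1_freq = frequencies(f1_reads)
    barcodes = sorted(array_freq)
    return pearson_r(
        [array_freq[b] for b in barcodes],
        [f1_freq.get(b, 0.0) for b in barcodes],
    )


# Toy reads only, chosen so the F1 pool roughly mirrors the array.
toy_array = ["A"] * 6 + ["B"] * 3 + ["C"] * 1
toy_f1 = ["A"] * 5 + ["B"] * 4 + ["C"] * 1
print(round(array_vs_f1_correlation(toy_array, toy_f1), 3))
```

A value close to 1 indicates that barcodes integrate roughly in proportion to their abundance in the array, which is the pattern described for PX786.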

Figure 4 (with 1 supplement)

Integration frequency from Transgenic Arrays Resulting in Diversity of Integrated Sequences (TARDIS) library array to F1.

(A) Frequency of integration from TARDIS library array to the F1, R ≈ 0.96, p ≈ 5.7 × 10^–154. Different colors represent four biological replicates. Line shading represents 95% confidence interval. (B) Sequence probabilities of PX786 compared to the F1 integrations (91 unique barcodes were identified in the array and 118 in the F1s, with a five-read threshold).

Generation and integration of TARDIS promoter library

For testing insertion of promoter libraries via TARDIS, two separate landing pad sites utilizing split selection were engineered in chromosome II (Figure 5A). The first contained the 3′ portions of both the mScarlet-I and the HygR genes in opposite orientation to each other and separated by a previously validated synthetic Cas9 target (Stevenson et al., 2020). Similarly, the second landing pad site contained the 3′ portions of mNeonGreen and Cbr-unc-119(+) separated by the same synthetic Cas9 target, allowing both sites to be targeted by the same guide. These landing pads were engineered into an unc-119(ed3) background to allow for selection via rescue of the uncoordinated (Unc) phenotype. A strain containing only the split mScarlet-I/split HygR landing pad was also constructed, in which case a copy of Cbr-unc-119(+) was retained at the landing pad site. Repair templates contained the 5′ portion of the respective selective gene, a lox site allowing for optional removal of the selective gene after integration (by expression of Cre) and the chosen promoters in front of the 5′ portion of the respective fluorophore. The selective gene and fluorophore fragments contained >500 bp overlaps with the landing pad to facilitate homology directed repair. Correct homology directed repair at both junctions resulted in worms that were fluorescent, hygromycin resistant, and had wild-type movement.

Figure 5 (with 1 supplement)

Transgenic Arrays Resulting in Diversity of Integrated Sequences (TARDIS) promoter library.

(A) Overview of the two split landing pads and their associated promoter insertion vectors. Both the selective marker and the fluorophore expression are restored upon correct integration. (B) Transcriptional reporters for nine genes were recovered from a single heatshock of a single TARDIS array line (PX819). Integration was into the single mScarlet-I/HygR landing pad. Main images show mScarlet-I expression for the indicated reporter while insets show polarized image of the same region. (C) Example simultaneous, dual integration from a single TARDIS array into the double landing pad strain with PEST. ceh-10p::mNeonGreen::PEST is false-colored green and ceh-40p::mScarlet-I::PEST is false-colored magenta. All scale bars represent 20 µm.

The initial promoter library tested was composed of 13 promoters targeted to a single landing pad site with split mScarlet-I and split HygR (Table 1). These promoters ranged in size from 330 to 5545 bp (total repair template length of 2238–7453 bp). Seven different array lines were generated, which exhibited distinct profiles when probed by PCR as a crude measure of array composition and diversity (Figure 5—figure supplement 1A). Promoter-specific PCR showed these arrays to contain 2–13 of the 13 injected promoters, with a mean of 10.7 and a median of 12 (Figure 5—figure supplement 1B). For the selected line (PX819), 12 promoters were incorporated into the TARDIS array. From this line, approximately 200 G-418-resistant L1s (i.e., those containing the array) were plated onto each of 60 plates and then heat-shocked as L2/L3s to initiate integration. Hygromycin-resistant individuals were recovered from 59 of the 60 plates, indicating one or more integration events on each of those plates. Four individuals were singled from each of these plates, with the intent of maximizing the diversity of fluorescent profiles and analyzed by PCR to identify the integrated promoters (Figure 5—figure supplement 1B). Based on the banding patterns, 83 of these PCR products were sequenced with nine different promoters confirmed as integrated (Table 1 and Figure 5B). This included both the smallest (aha-1p) and the largest promoter (nhr-67p) in the set. Notably, two of the three promoters that were in the array but not recovered as integrants were found to be integrated in a subsequent experiment (see below), suggesting the failure to be recovered in this case was likely due to the array composition rather than any properties of these particular promoters. For approximately half of the plates, two or more promoters were identified from the four worms chosen. 
Of the 83 PCR products sequenced, 5 had incorrect sequences and/or product sizes inconsistent with the promoter identified and 3 failed to prime. Additionally, several samples failed to amplify or gave a nonspecific banding pattern and likely also represent incorrect integrations.

Table 1

Characteristics of injected promoters and presence in tested array line (PX819) and integrated lines derived from that array.

Promoter	Promoter size (bp)	Expected expression location	Array	Integrated
aha-1	330	Neurons, hypodermis, intestine, pharynx (Jiang et al., 2001)	Y	Y
hlh-16	514	Head neurons (Bertrand et al., 2011)	Y	N
ceh-40	965	Dopaminergic neurons (Sarov et al., 2012)	Y	Y
ceh-10	1172	Neurons, seam cells (Reece-Hoyes et al., 2007)	Y	Y
ahr-1	1387	ALM and RME neurons (Huang et al., 2004)	Y	N
mdl-1	2000	Neurons, body wall, pharynx (Reece-Hoyes et al., 2007)	Y	Y
egl-43	2001	Neurons, gonad (Hwang et al., 2007)	Y	N
ceh-20	2015	Neurons, seam cells, vulva (Reece-Hoyes et al., 2007)	Y	Y
ceh-43	2096	Neurons, anterior hypodermis (Reece-Hoyes et al., 2007)	Y	Y
daf-7	2524	Head neurons, coelomocytes, pharynx (Klabonski et al., 2016)	Y	Y
lin-11	2857	Neurons, uterus, vulva, head muscle (Gupta et al., 2003)	Y	Y
egl-46	4477	Neurons (Wu et al., 2001)	N	N
nhr-67	5545	Neurons, excretory cell, rectal valve cell, vulva (Fernandes and Sternberg, 2007)	Y	Y

Y, yes; N, no.

To test whether TARDIS could be used to target multiple sites simultaneously, a second promoter library containing seven promoters targeted to each site (ahr-1p, ceh-10p, ceh-20p, ceh-40p, ceh-43p, hlh-16p, mdl-1p) was injected into worms containing both landing pad sites. Five plates of mixed stage worms were heat-shocked, and worms that were both hygromycin resistant and had wild-type movement were found on three of those plates. Worms that were hygromycin resistant but retained the Unc phenotype were also observed on some plates, representing individuals with integrations at a single site. For two of the plates, a single pair of integrations was observed, in both cases being ahr-1p::mScarlet plus hlh-16p::mNeonGreen. For the third plate, two different combinations were recovered: ahr-1p::mScarlet plus mdl-1p::mNeonGreen and ceh-40p::mScarlet plus ceh-10p::mNeonGreen (Figure 5C). While multi-site CRISPR is known to be possible (Arribere et al., 2014), these results suggest that TARDIS provides a unique way to engineer multiple locations using a single injection.

When transcriptional reporter lines were examined by fluorescence microscopy, expression of the fluorophores was concentrated in but not exclusive to the nucleus, consistent with the presence of nuclear localization signals (NLS) on the fluorophores. For all promoters, expression was seen in at least one previously reported tissue (Table 1) but was absent in one or more tissues for several of the promoters. Expression of single-copy reporters is frequently more spatially restricted than that from integrated or extrachromosomal arrays (Aljohani et al., 2020). The differences in expression pattern may also reflect the differences in the region used as the promoter or the fact that only a single developmental stage (late L4/early adult) was examined. Overall, we find that TARDIS can be used to screen functional libraries, either individually or in combination.

Discussion

Here, we present the first implementation of a practical approach to large-scale library transgenesis in an animal system (Figure 1). Building on over a half century of advancements in C. elegans genetics, we can now make thousands of specific, independent genomic integrations from single microinjection events that traditionally yield at most a small handful of transgenic individuals. Increasing transgenesis throughput has long been desired, and in C. elegans several attempts have been made to multiplex transgenic protocols. Library mosSCI and RMCE both introduce a multiplexed injection mixture and do indeed achieve multiple integrations (Kaymak et al., 2016; Nonet, 2020). However, just as in the case of standard mosSCI or single-donor injections for RMCE, anti-array screening, genotyping, and the direct integration of the process substantially limit the multiplex potential of these methods. One group has adopted arrays with small pools of guides coupled with heatshock-inducible Cas9 to produce randomized mutations at targeted locations (Froehlich et al., 2021). This protocol shares similarities with TARDIS, in that diverse arrays are coupled with inducible Cas9. However, the focus of that technology was to produce randomized genomic edits, and it does not produce precise library integrations into the genome. Recently, another group (Mouridi et al., 2022) built on the utility of heatshock Cas9 and integrated three individual sequences from an array. While these prior multiplexed methods made substantial contributions in improving the efficiency of specific transgenesis, none have yet demonstrated multiplexing beyond tens of unique sequences – orders of magnitude below what would be needed for exploratory transgenesis. TARDIS therefore provides the first true library-based approach for multiplexing transgenesis in C. elegans.

TARDIS as a method for creating barcoded individuals

Genetic barcode libraries have been applied to many high-throughput investigations to reduce sequencing costs and achieve a higher resolution within complex pools of individuals. By focusing the sequencing reads on a small section of the genome, a larger number of individual variants can be identified or experimentally followed. This critical advancement has led to the widespread use of barcoding for evolutionary lineage tracking in microbial systems (Blundell and Levy, 2014; Kasimatis et al., 2021; Levy et al., 2015; Levy, 2016; Nguyen Ba et al., 2019; Venkataram et al., 2016) – uncovering the fitness effects of thousands of individual lineages without requiring large coverage depth of the whole genome. Beyond lineage tracking, barcoded individuals can facilitate any application that involves screening a large pool of diverse individuals within a shared environment. For example, barcodes have been used in microbial studies investigating pharmaceutical efficacy (Smith et al., 2011) and barcoded variant screening (Emanuel et al., 2017). The TARDIS-based system presented here provides an approximately 1000-fold increase in barcoding throughput in C. elegans, making it a unique resource among multicellular models that allows the large diversity pool and design logic of microbial systems to be adapted to animal models.
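As a minimal sketch of how barcode reads translate into lineage frequencies, the Python example below counts barcodes that fit the degenerate layout NNNCNNTNTNANNNN listed for the barcode arrays in the key resources table. It assumes that reads have already been trimmed to the barcode itself; the function names and the absence of any error correction or clustering are illustrative simplifications, not the pipeline used in this study.

```python
import re
from collections import Counter

# Degenerate barcode layout as listed in the key resources table
# (N = any base; C, T, T and A are fixed positions).
BARCODE_LAYOUT = "NNNCNNTNTNANNNN"
BARCODE_REGEX = re.compile(BARCODE_LAYOUT.replace("N", "[ACGT]"))


def count_barcodes(trimmed_reads):
    """Count reads whose full sequence fits the barcode layout.

    Assumes each read was already trimmed down to the barcode region;
    reads that do not fit the layout are silently skipped.
    """
    counts = Counter()
    for read in trimmed_reads:
        sequence = read.strip().upper()
        if BARCODE_REGEX.fullmatch(sequence):
            counts[sequence] += 1
    return counts


def lineage_frequencies(counts):
    """Convert raw barcode counts into relative lineage frequencies."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {barcode: n / total for barcode, n in counts.items()}


# Hypothetical toy input: two valid barcodes and one read that breaks the layout.
example_reads = [
    "AAACGGTATAAGGGG",
    "AAACGGTATAAGGGG",
    "TTTCAATCTGACCCC",
    "AAAAAAAAAAAAAAA",
]
example_counts = count_barcodes(example_reads)
example_frequencies = lineage_frequencies(example_counts)
```

Comparing such frequency tables between time points or treatments is what allows many lineages to be followed in a shared environment from a single short amplicon.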

While we designed our barcode sequence units for the purpose of barcoding individuals, this approach could also prove useful in future optimization and functional understanding of array-based processes. In particular, the high sequence diversity but identical physical design of the synthetic barcode library may provide a unique window into extrachromosomal array biology that would be helpful in optimizing sequence units for incorporation into heritable TLAs. For example, an unexpected result of the barcoding experiment was the discovery that a small minority of sequences were overrepresented, or ‘jackpotted,’ in the TLA relative to their frequency in the injection mix (Figure 3 and Figure 3—figure supplement 1). Our expectation was that arrays would form in an equimolar fashion proportional to the injection mix based on the model that arrays are formed by physical ligation of the injected DNA fragments (Mello et al., 1991). Deviations from random array incorporation have been observed before, and a bias for incorporating larger fragments has been proposed as an explanatory mechanism (Lin et al., 2021). Our results suggest that the ultimate array composition is not directly proportional to the molarity of the injected fragments or strictly weighted towards the size of the fragment as has been suggested. In contrast, we propose that array size affects the maintenance of extrachromosomal arrays. As such, selection can act to increase the rate of recovery for arrays that have increased their size through random amplification of some sequences by an unknown process early in the formation of the array or by expansion of similar sequences by DNA polymerase slippage during replication, as has been well documented for native chromosomes (Levinson and Gutman, 1987). These hypotheses would be consistent with the observations of Lin et al., 2021 if the underlying mechanism for their observation is that inclusion of larger fragments tends to be positively correlated with ultimate array size, and therefore likelihood of maintenance.
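One simple way to make ‘jackpotting’ concrete is to compare each barcode's frequency in the array with its frequency in the injection mix. The Python sketch below does this for two count tables; the function names, the toy counts, and the enrichment threshold are hypothetical choices for illustration and do not reproduce the analysis behind Figure 3.

```python
def fold_enrichment(mix_counts, array_counts):
    """Return array frequency divided by injection-mix frequency per barcode.

    Barcodes absent from the injection-mix counts are skipped, because
    their enrichment cannot be estimated without a mix frequency.
    """
    mix_total = sum(mix_counts.values())
    array_total = sum(array_counts.values())
    enrichment = {}
    for barcode, array_n in array_counts.items():
        mix_n = mix_counts.get(barcode, 0)
        if mix_n == 0:
            continue
        enrichment[barcode] = (array_n / array_total) / (mix_n / mix_total)
    return enrichment


def jackpotted(enrichment, threshold=10.0):
    """List barcodes whose enrichment meets an arbitrary, user-chosen threshold."""
    return sorted(bc for bc, fold in enrichment.items() if fold >= threshold)


# Hypothetical toy data: an equimolar mix and an array dominated by one barcode.
toy_mix = {"bc1": 100, "bc2": 100, "bc3": 100, "bc4": 100}
toy_array = {"bc1": 970, "bc2": 10, "bc3": 10, "bc4": 10}
toy_enrichment = fold_enrichment(toy_mix, toy_array)
toy_jackpots = jackpotted(toy_enrichment, threshold=3.0)
```

Under strictly equimolar incorporation every ratio would sit near 1, so a handful of large ratios alongside many small ones is the signature described above.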

TARDIS as a method for the introduction of promoters and other large constructs

While the barcode approach demonstrates the potential for using TARDIS to integrate large numbers of 433 bp PCR products, previous work using CRISPR/Cas9-initiated homology-directed repair has suggested that integration efficiencies decrease with the size of the insert (Dickinson and Goldstein, 2016). We therefore implemented TARDIS for integrating promoters cloned into a vector backbone and ranging in size from 330 bp to 5.5 kb to determine TARDIS functionality under a physically different use case directed specifically at functional analysis. We found that promoter libraries could be integrated into either single sites or two sites simultaneously. Unsurprisingly, the frequency at which various promoters were recovered varied from array to array (e.g., ahr-1p was never recovered in the single-site integration experiment despite being present in the array, while it was the most common promoter recovered in the two-site integration experiment) and likely reflects the same relationship between integration frequency and prevalence in the array, as was seen with the distribution of insert abundance for the barcodes. While we showed that plasmid donors can be used in the TARDIS pipeline, not all arrays contained all 13 plasmids. Given that the estimated 1–13 Mb size of arrays (Carlton et al., 2022) would be adequate to hold copies of each of the plasmids, as well as the extreme diversity obtained when using smaller DNA fragments, differential presence of a given promoter fragment was somewhat unexpected. This may reflect a preferential use of linear fragments in the in situ assembly of arrays. Future use of linear fragments where feasible may increase incorporation and overall diversity (Priyadarshini et al., 2022).

For both the one- and two-site promoter library integrations, transgenic individuals were readily detected, suggesting that the TARDIS method for integration was highly efficient. It has long been understood that successful CRISPR editing at one site significantly increases the chances of successful editing at a second site. This is the premise behind commonly used co-conversion screening strategies (also referred to as co-CRISPR), such as the dpy-10 screen commonly used in C. elegans (Arribere et al., 2014; Ward, 2015). Here, we show that the same type of co-conversion also occurs when using only ‘large’ (>1 kb), plasmid-based repair templates containing gene-sized repair constructs. Additionally, we have simultaneously targeted the same two landing pads presented here using standard CRISPR techniques and find that approximately half of hygromycin-resistant individuals also have rescue of the Unc phenotype (i.e., editing has occurred at both sites; data not shown). Given the high rate of co-conversion, this work demonstrates multiplex integrations are possible not only by targeting multiple repair templates to a single site but also by simultaneously utilizing multiple insertion sites.

To recover individual edits most efficiently, given the high frequency of integration using TARDIS, we recommend either heat-shocking small cohorts of array-bearing individuals, such that most cohorts yield only one edited individual, or screening multiple individuals per cohort. Additionally, while split-selection methods allow for direct verification of integration, depending on the downstream use case, integrations should be confirmed by sequencing as errors can still occur, including internal deletions within the insert.
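The cohort-size recommendation can be illustrated with a small back-of-the-envelope calculation. The Python sketch below assumes, purely for illustration, that independent integration events in a cohort follow a Poisson distribution whose mean is the cohort size multiplied by a per-individual event rate. Both the Poisson model and the example rate are assumptions made here, not values measured in this study.

```python
import math


def prob_single_given_any(cohort_size, events_per_individual):
    """P(exactly one integration event | at least one event) under a Poisson model.

    The mean number of events is cohort_size * events_per_individual;
    both the model and the rate are illustrative assumptions.
    """
    mean_events = cohort_size * events_per_individual
    if mean_events <= 0:
        raise ValueError("cohort size and event rate must be positive")
    p_exactly_one = mean_events * math.exp(-mean_events)
    p_at_least_one = -math.expm1(-mean_events)  # 1 - exp(-mean), numerically stable
    return p_exactly_one / p_at_least_one


def largest_cohort(events_per_individual, target=0.9, max_size=1000):
    """Largest cohort for which productive cohorts are mostly single-event.

    The conditional probability falls as the cohort grows, so the search
    stops at the first cohort size that misses the target.
    """
    best = 0
    for size in range(1, max_size + 1):
        if prob_single_given_any(size, events_per_individual) >= target:
            best = size
        else:
            break
    return best


# Hypothetical rate of 0.01 independent integration events per heat-shocked individual.
suggested_cohort = largest_cohort(0.01, target=0.9)
```

In practice the event rate would need to be estimated empirically for each array line, and larger cohorts remain usable if multiple individuals per cohort are screened.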

Expansion of TARDIS to other multicellular systems

Unlocking the investigative potential of transgenesis in animal systems would enable exploratory experiments normally restricted to single-cell models. Examples include alanine scanning libraries and protein–protein interactions (Cunningham and Wells, 1989; Matthews, 1996; Wells, 1991), CRISPR library screening (Bock et al., 2022), and promoter library generation (Delvigne et al., 2015; Zaslaver et al., 2006). While we demonstrate the use of TARDIS in C. elegans here, the intellectual underpinnings of the approach are agnostic to the research model used. Conceptually, TARDIS facilitates high-throughput transgenesis by using two engineered components: a heritable TARDIS library containing multiplexed transgene units and a genomic split selection landing pad that facilitates integration of single-sequence units from the library. To generate the first TARDIS libraries, we capitalized on the endogenous capacity of C. elegans to assemble experimentally provided DNA into heritable extrachromosomal arrays. Extrachromosomal arrays are formed from exogenous DNA, are megabases in size (Lin et al., 2021; Woglar et al., 2020), do not require specific sequences to form and replicate, and can be maintained in a heritable manner via selection (Mello et al., 1991). These qualities make them suitable for use as a heritable library upon which TARDIS can be based. To adopt TLAs in systems beyond C. elegans, methods must be adopted to introduce large heritable libraries into the germline as most systems do not maintain extrachromosomal arrays. In mice, the locus H11 has been used for large transgenic insertions (Liu et al., 2022), while in Drosophila, the use of PhiC31-mediated transgenesis coupled with bacterial artificial chromosomes (BACs) has allowed for many fragments of approximately 10 kb or larger to be integrated into their respective genomes (Venken et al., 2006). Each of these large integration strategies can provide a vehicle for stable inheritance of a TLA.

The second component of the TARDIS integration system is a pre-integrated landing pad sequence. We have generally favored split selectable landing pads (SSLPs) that use HygR for its effectiveness (Mouridi et al., 2022; Stevenson et al., 2020; Stevenson et al., 2021). The SSLPs are engineered to accept experiment-specific units from the array. For example, here we used SSLPs designed to accept barcodes for experimental lineage tracking and promoters for generation of transcriptional reporters. To translate TARDIS to other systems, a genomic site needs to be engineered to act as a landing pad that can utilize sequence units from the TLA and can be customized to the specific system and use. Because TLAs allow the experimenter to design the library of interest and the landing pad to recapitulate the strengths of single-cell systems, adoption of TARDIS in multicellular animal experiments can leverage the high-resolution, high-diversity exploratory space of DNA synthesis. In addition to adapting assays currently restricted to single-cell models, TARDIS also opens the door to animal-specific uses, such as developmental biology, neurobiology, endocrinology, and cancer research.

In developmental genetics, the lack of large-library transgenesis has resulted in ‘barcode’ libraries in a different form, utilizing randomized CRISPR-induced mutations to form a unique indel. For example, GESTALT (McKenna et al., 2016) creates a diversity of barcodes in vivo via random indel formation at a synthetic target location. LINNAEUS (Spanjaard et al., 2018) similarly utilizes randomized targeting of multiple RFP transgenes to create indels, allowing for cells to be barcoded for single-cell sequencing. TARDIS barcodes do not rely on randomized indel generation and thus can be much simpler to implement with sequencing approaches outlined above.

In vivo cancer models have also adopted the high-resolution, high-variant detection of barcodes for the study of tumor growth and evolution. Rogers et al. developed Tuba-seq (Rogers et al., 2017; Winslow, 2022), a pipeline that takes advantage of small barcodes allowing for in vivo quantification of tumor size. In Tuba-seq, barcodes are introduced via lentiviral infection, leading to the barcoding of individual tumors. TARDIS brings the multiplexed library into the animal context without requiring viral vectors or intermediates, thereby allowing large in vivo library utilization and maintenance. Capitalizing on the large-sequence diversity possible within synthesized DNA libraries with a novel application in multicellular systems generates new opportunities for experimental investigation in animal systems heretofore only possible within microbial models.

Conclusion

In conclusion, here we have presented TARDIS, a simple yet powerful approach to transgenesis that overcomes the limitations of multicellular systems. TARDIS uses synthesized sequence libraries and inducible extraction and integration of individual sequences from these heritable libraries into engineered genomic sites to increase transgenesis throughput up to 1000-fold. While we demonstrate the utility of TARDIS using C. elegans, the process is adaptable to any system where experimentally generated genomic loci landing pads and diverse, heritable DNA elements can be generated.

Materials and methods

Key resources table

Reagent type (species) or resource	Designation	Source or reference	Identifiers	Additional information
Genetic reagent (Caenorhabditis elegans)	aha-1p	wormbase.org	WBGene00000095
Genetic reagent (C. elegans)	hlh-16p	wormbase.org	WBGene00001960
Genetic reagent (C. elegans)	ceh-40p	wormbase.org	WBGene00000461
Genetic reagent (C. elegans)	ceh-10p	wormbase.org	WBGene00000435
Genetic reagent (C. elegans)	ahr-1p	wormbase.org	WBGene00000096
Genetic reagent (C. elegans)	mdl-1p	wormbase.org	WBGene00003163
Genetic reagent (C. elegans)	egl-43p	wormbase.org	WBGene00001207
Genetic reagent (C. elegans)	ceh-20p	wormbase.org	WBGene00000443
Genetic reagent (C. elegans)	ceh-43p	wormbase.org	WBGene00000463
Genetic reagent (C. elegans)	daf-7p	wormbase.org	WBGene00000903
Genetic reagent (C. elegans)	lin-11p	wormbase.org	WBGene00003000
Genetic reagent (C. elegans)	egl-46p	wormbase.org	WBGene00001210
Genetic reagent (C. elegans)	nhr-67p	wormbase.org	WBGene00003657
Strain, strain background (C. elegans)	N2	Caenorhabditis Genetics Center
Strain, strain background (C. elegans)	N2-PD1073	doi:10.17912/micropub.biology.000518		Available from the Caenorhabditis Intervention Testing Program upon request (https://citp.squarespace.com/)
Strain, strain background (C. elegans)	PX740	This paper		N2-PD1073 fxIs47 [rsp-0p:: 5′ ΔHygR:: GCGAAGTGACGGTAGACCGT:: 3′ ΔHygR::unc-54 3′::loxP]
Strain, strain background (C. elegans)	GT331	This paper		aSi9[lox2272 Cbr-unc-119(+) lox2272+loxP 3′3′ ΔHygR +3′ ΔmScarlet-I::PEST]; unc-119(ed3)
Strain, strain background (C. elegans)	GT332	This paper		aSi10[lox2272 Cbr-unc-119(+) lox2272+loxP 3′ ΔHygR +3′ ΔmScarlet-I]; unc-119(ed3)
Strain, strain background (C. elegans)	GT336	This paper		aSi12[lox2272 rps-0p::HygR+hsp−16.41p::Cre::tbb-2 3′UTR+sqt-1(e1350) lox2272+loxN 3′ ΔCbr-unc-119(+)::tjp2a_guide:: 3′ ΔmNeonGreen::PEST::egl-13nls::tbb-2 3′UTR] aSi9[lox2272 Cbr-unc-119(+) lox2272+loxP 3′ΔHygR::tjp2a guide::3′ΔmScarlet-I::PEST::egl-13nls::tbb-2 3′UTR] II; unc-119(ed3) III
Strain, strain background (C. elegans)	GT337	This paper		aSi13[lox2272+loxN 3' ΔCbr-unc-119(+)+3' ΔmNeonGreen::PEST] aSi14[lox2272+loxP 3′ ΔHygR +3′ ΔmScarlet-I::PEST]; unc-119(ed3)
Strain, strain background (C. elegans)	QL74	Gift from QueeLim Ch’ng		oxEx1578 [eft-3p::GFP+Cbr-unc-119(+)] 6x outcross EG4322
Strain, strain background (C. elegans)	PX786	This paper		fxEx23 [TARDIS #5 5′ ΔHygR::Intron5'::Read1::NNNCNNTNTNANNNN::Read2::Intron3':: 3' ΔHygR (89 Unique Sequences) hsp-16.41p::piOptCas9::tbb-2 34' UTR+rsp-27p::NeoR::unc-54 3' UTR+U6p:: GCGAAGTGACGGTAGACCGT]; fxSi47[ rsp-0p:: 5' ΔHygR:: GCGAAGTGACGGTAGACCGT:: 3' ΔHygR::unc-54 3′::loxP]
Strain, strain background (C. elegans)	PX816	This paper		fxEx25 [TARDIS #1 5' ΔHygR::Intron5'::Read1::NNNCNNTNTNANNNN::Read2::Intron3':: 3' ΔHygR (1,319 Unique Sequences) hsp-16.41p::piOptCas9::tbb-2 34' UTR+rsp-27p::NeoR::unc-54 3' UTR+U6p:: GCGAAGTGACGGTAGACCGT]; fxSi47[ rsp-0p:: 5' ΔHygR:: GCGAAGTGACGGTAGACCGT:: 3' ΔHygR::unc-54 3′::loxP]
Strain, strain background (C. elegans)	PX817	This paper		fxEx26 [TARDIS #2 5' ΔHygR::Intron5'::Read1::NNNCNNTNTNANNNN::Read2::Intron3':: 3' ΔHygR (3,001 Unique Sequences) hsp-16.41p::piOptCas9::tbb-2 34' UTR+rsp-27p::NeoR::unc-54 3' UTR+U6p:: GCGAAGTGACGGTAGACCGT]; fxSi47[ rsp-0p:: 5' ΔHygR:: GCGAAGTGACGGTAGACCGT:: 3' ΔHygR::unc-54 3′::loxP]
Strain, strain background (C. elegans)	PX818 profile 1	This paper		fxEx27 [TARDIS #3 5' ΔHygR::Intron5'::Read1::NNNCNNTNTNANNNN::Read2::Intron3':: 3' ΔHygR (91 Unique Sequences) hsp-16.41p::piOptCas9::tbb-2 34' UTR+rsp-27p::NeoR::unc-54 3' UTR+U6p:: GCGAAGTGACGGTAGACCGT]; fxSi47[ rsp-0p:: 5' ΔHygR:: GCGAAGTGACGGTAGACCGT:: 3' ΔHygR::unc-54 3′::loxP]
Strain, strain background (C. elegans)	PX818 profile 2	This paper		fxEx28 [TARDIS #4 5' ΔHygR::Intron5'::Read1::NNNCNNTNTNANNNN::Read2::Intron3':: 3' ΔHygR (204 Unique Sequences) hsp-16.41p::piOptCas9::tbb-2 34' UTR+rsp-27p::NeoR::unc-54 3' UTR+U6p:: GCGAAGTGACGGTAGACCGT]; fxSi47[ rsp-0p:: 5' ΔHygR:: GCGAAGTGACGGTAGACCGT:: 3' ΔHygR′::unc-54 3′::loxP]
Strain, strain background (C. elegans)	PX819	This paper		N2 fxEx24 [(rps-0p:: 5′ ∆HygR+loxP + aha-1p::SV40 NLS:: 5′ ∆mScarlet-I) + (rps-0p:: 5′ ∆HygR+loxP + ahr-1p::SV40 NLS::5′ ∆mScarlet-I) + (rps-0p:: 5′ ∆HygR+loxP + ceh-10-1p::SV40 NLS::5′ ∆mScarlet-I) + (rps-0p:: 5′ ∆HygR+loxP + ceh-20p::SV40 NLS::5′ ∆mScarlet-I) + (rps-0p:: 5′ ∆HygR+loxP + ceh-40p::SV40 NLS::5′ ∆mScarlet-I) + (rps-0p:: ∆HygR+loxP + ceh-43p::SV40 NLS::5′ ∆mScarlet-I) + (rps-0p:: 5′ ∆HygR+loxP + daf-7p::SV40 NLS::5′ ∆mScarlet-I) + (rps-0p:: ∆HygR+loxP + egl-43p::SV40 NLS::5′ ∆mScarlet-I) + (rps-0p:: 5′ ∆HygR+loxP + hlh-16p::SV40 NLS::5′ ∆mScarlet-I) + (rps-0p:: 5′ ∆HygR+loxP + lin-11p::SV40 NLS::5′ ∆mScarlet-I) + (rps-0p:: 5′ ∆HygR+loxP + mdl-1p::SV40 NLS::5′ ∆mScarlet-I) + (rps-0p:: 5′ ∆HygR+loxP + nhr-67p::SV40 NLS::5′ ∆mScarlet-I)+hsp−16.41p::piOptCas9::tbb-2 34′ UTR+prsp-27::NeoR::unc-54 3′ UTR]; aSi10[lox2272+Cbr-unc-119(+)+lox2272+loxP + 5′ ∆HygR::unc-54 3' UTR+5′ ∆mScarlet-I::egl-13 NLS::tbb-2 3' UTR, II:8420157]; unc-119(ed3) III
Strain, strain background (C. elegans)	EG4322	doi.org/10.1038/ng.248; Caenorhabditis Genetics Center
Strain, strain background (Escherichia coli)	PXKR1	This paper		NA22 transformed with pUC19
Recombinant DNA reagent	Plasmid pDSP15	This paper	193853 (Addgene)	5′ ΔHygR::loxP::MCS::5′ Δ mScarlet-I
Recombinant DNA reagent	Plasmid pDSP16	This paper	193854 (Addgene)	5′ ΔCbr-unc-119(+)::loxN::MCS::5′ Δ 5′mNeonGreen
Recombinant DNA reagent	Plasmid pMS84	This paper	193852 (Addgene)	U6p::GGACAGTCCTGCCGAGGTGG
Recombinant DNA reagent	Plasmid pZCS36	This paper	193048 (Addgene)	hsp16.41p::Cas9(dpiRNA)::tbb-2 3′ UTR
Recombinant DNA reagent	Plasmid pZCS38	This paper	193049 (Addgene)	rsp-27p::NeoR::unc-54 3′ UTR
Recombinant DNA reagent	Plasmid pZCS41	This paper	193050 (Addgene)	U6p::GCGAAGTGACGGTAGACCGT
Sequence-based reagent	ZCS422	This paper		Design and construction of barcode donor library
Commercial assay or kit	DNA Clean and Concentrator	Zymo Research	Cat# D4004
Commercial assay or kit	Genomic DNA Clean and Concentrator	Zymo Research	Cat# D4011
Commercial assay or kit	Zymoclean Gel DNA Recovery Kit	Zymo Research	Cat# D4008
Commercial assay or kit	Zyppy Plasmid Miniprep Kit	Zymo Research	Cat# D4019
Software, algorithm	Cutadapt	doi.org/10.14806/ej.17.1.200	Version 4.1
Software, algorithm	AmpUMI	doi.org/10.1093/bioinformatics/bty264	Version 1.2
Software, algorithm	Starcode	doi.org/10.1093/bioinformatics/btv053	Version 1.4
Software, algorithm	Google Colab	colab.research.google.com
Software, algorithm	Python (version)	Guido van Rossum, 1991	Version 3.7.13
Software, algorithm	Jupyter Notebook (IPython)	doi:10.3233/978-1-61499-649-1-87	Version 7.9.0
Software, algorithm	matplotlib	doi:10.5281/zenodo.3898017	Version 3.7.13
Software, algorithm	Fiji	imagej.net/software/fiji/	Version 2.9.0/1.53t
Chemical compound, drug	G-418	GoldBio (CAS number 108321-42-2)	Cat# G-418-5
Chemical compound, drug	Hygromycin B	GoldBio (CAS number 31282-04-9)	Cat# H-270-10-1

Transformation compared to Transgenic Arrays Resulting in Diversity of Integrated Sequences (TARDIS).

Barcode landing pad and diverse donor library.

Transgenic Arrays Resulting in Diversity of Integrated Sequences (TARDIS) library arrays can contain large barcode diversity.

Integration frequency from Transgenic Arrays Resulting in Diversity of Integrated Sequences (TARDIS) library array to F1.

Transgenic Arrays Resulting in Diversity of Integrated Sequences (TARDIS) promoter library.

Characteristics of injected promoters and presence in tested array line (PX819) and integrated lines derived from that array.

Author details

Zachary C Stevenson
Megan J Moerdyk-Schauwecker
Stephen A Banse
Dhaval S Patel
Hang Lu
Patrick C Phillips
