
A confinable home-and-rescue gene drive for population modification

  1. Nikolay P Kandul
  2. Junru Liu
  3. Jared B Bennett
  4. John M Marshall
  5. Omar S Akbari (corresponding author)
  1. Section of Cell and Developmental Biology, University of California, San Diego, United States
  2. Biophysics Graduate Group, University of California, Berkeley, United States
  3. Division of Epidemiology and Biostatistics, School of Public Health, University of California, Berkeley, United States
Research Article
Cite this article as: eLife 2021;10:e65939 doi: 10.7554/eLife.65939

Abstract

Homing-based gene drives, engineered using CRISPR/Cas9, have been proposed to spread desirable genes throughout populations. However, invasion of such drives can be hindered by the accumulation of resistant alleles. To limit this obstacle, we engineer a confinable population modification home-and-rescue (HomeR) drive in Drosophila targeting an essential gene. In our experiments, resistant alleles that disrupt the target gene function were recessive lethal and therefore disadvantaged. We demonstrate that HomeR can achieve an increase in frequency in population cage experiments, but that fitness costs due to the Cas9 insertion limit drive efficacy. Finally, we conduct mathematical modeling comparing HomeR to contemporary gene drive architectures for population modification over wide ranges of fitness costs, transmission rates, and release regimens. HomeR could potentially be adapted to other species, as a means for safe, confinable, modification of wild populations.

Introduction

Effective insect control strategies are necessary for preventing human diseases, such as malaria and dengue virus, and protecting crops from pests. These challenges have fostered the development of innovative population control technologies such as Cas9/guideRNA (Cas9/gRNA) homing-based gene drives (HGDs) (Champer et al., 2016; Esvelt et al., 2014) which have been laboratory-tested for either population modification (Adolfi et al., 2020; Carballar-Lejarazú et al., 2020; Gantz et al., 2015; Li et al., 2020; Pham et al., 2019) to spread desirable traits that can impair the mosquitoes’ ability to transmit pathogens (e.g. Buchman et al., 2020; Buchman et al., 2019; Hoermann et al., 2020; Isaacs et al., 2012; Marshall et al., 2019) or population suppression (Hammond et al., 2016; Kyrou et al., 2018; Simoni et al., 2020) to reduce and eliminate wild disease-transmitting populations of mosquitoes. Despite significant progress, HGDs are still an emerging technology that can suffer from the formation of resistant alleles, hindering their efficacy (Adolfi et al., 2020; Carballar-Lejarazú et al., 2020; Gantz et al., 2015; Hammond et al., 2016; Kandul et al., 2020; Kyrou et al., 2018; Li et al., 2020; Pham et al., 2019; Simoni et al., 2020).

In CRISPR/Cas9, the Cas9 endonuclease cuts a programmed DNA sequence complementary to a user-defined short guide RNA molecule (gRNA). To engineer an HGD, leveraging creative designs originally proposed by Burt, 2003, CRISPR components are integrated at the target site in the genome. These components are configured so that when they cut the recipient wildtype (wt) allele, it is repaired via homology-directed repair (HDR) in heterozygotes, using the donor allele (i.e. allele harboring the HGD) as a template for DNA repair. This enables the HGD to home, or copy, itself into the recipient allele (Alphey et al., 2020; Champer et al., 2016; Esvelt et al., 2014) (referred to as homing from hereon). This general architecture for HGD was quickly adopted, and many HGDs were developed in several insect species (Gantz et al., 2015; Hammond et al., 2016; Kandul et al., 2020; Kyrou et al., 2018; Li et al., 2020; Simoni et al., 2020; Verkuijl et al., 2020). However, it soon became widely apparent that HGDs unintentionally promote the formation of resistant alleles through mutagenic repair. When these alleles are positively selected, they can hinder HGD spread in laboratory cage populations (Champer et al., 2017; Hammond et al., 2017; Kandul et al., 2020; KaramiNejadRanjbar et al., 2018; Oberhofer et al., 2018), with one exception that targeted a conserved sex determination gene for population suppression (Kyrou et al., 2018; Simoni et al., 2020). This resistance arises from Cas9/gRNA-directed DNA cuts being repaired by alternative DNA end-joining (EJ) repair pathways, including non-homologous (NHEJ) and microhomology-mediated end-joining (MMEJ), which can introduce insertions or deletions (indels) at the target site(s). Many of these indels produce loss-of-function (LOF) alleles, which can be selected against if deleterious to the organism. 
However, functional in-frame EJ-induced indel alleles can also be generated, which are unrecognized by the same Cas9/gRNA complex and become drive resistant alleles. When resistant alleles are induced in germ cells, they are heritable and can hinder spread of HGDs (Champer et al., 2017; Hammond et al., 2017; Kandul et al., 2020; KaramiNejadRanjbar et al., 2018; Oberhofer et al., 2018). Both induced and naturally existing resistant alleles can pose significant challenges to engineering a stable HGD capable of spreading and persisting long term in a population.

To overcome the accumulation of drive resistant alleles, CRISPR-based toxin-antidote (TA) drives, in which embryos are essentially ‘poisoned’ and only those embryos harboring the TA genetic cassette are rescued, were described (Figure S8 in Kandul et al., 2019) and engineered (Champer et al., 2020a; Oberhofer et al., 2020a, Oberhofer et al., 2020b, Oberhofer et al., 2019). Generally these designs utilize a toxin consisting of a non-HGD harboring multiple gRNAs targeting a vital gene, and an ‘addictive’ antidote that is a re-coded, cleavage-immune version of the targeted gene. These TA-based drives are Mendelianly transmitted and spread instead by killing progeny that fail to inherit the drive (e.g. 50% perish from heterozygous mother). Alternative HDR-based TA designs were also described (Champer et al., 2016; Esvelt et al., 2014), modeled (Noble et al., 2017), and recently tested in mosquitoes (Adolfi et al., 2020) targeting a recessive non-essential eye pigmentation gene, and in Drosophila melanogaster targeting either haplosufficient (i.e. the non-functional allele is recessive as a single functional copy of the target gene is sufficient for normal function) genes (Terradas et al., 2021) or a rare haploinsufficient (i.e. the non-functional allele is dominant as a single functional copy of the target gene is not sufficient for normal function) gene (Champer et al., 2020b), each demonstrating drive capacity.

Building upon prior work, here we describe the development of a home-and-rescue (HomeR) split-drive (i.e. Cas9 separated from the drive) targeting an essential, haplosufficient gene in D. melanogaster. We demonstrate that the accumulation of EJ-induced resistant alleles can be reduced by following four design criteria. First, the HGD targets the 3’ coding sequence of an essential gene required for insect viability. Second, HomeR encodes a dominant rescue of the endogenous target gene. Third, an exogenous 3’ UTR prevents expected deleterious recombination events between the drive and the endogenous target gene. Fourth, the design exploits a process we previously described as lethal biallelic mosaicism (LBM), in which maternal carryover of Cas9/gRNA complexes contributes to RNA-guided dominant biallelic disruption of an essential target gene throughout development, thereby ensuring that recessive non-functional resistant alleles result in dominant deleterious/lethal mutations that can be negatively selected out of a population (Kandul et al., 2019). Importantly, individuals that inherit the drive allele express a dominant re-coded rescue and are protected from LBM. We demonstrate that efficient cleavage of the target sequence by HomeR and rescue are requisites to achieve nearly 100% transmission in the presence of Cas9, which is accomplished mostly by homing in trans-heterozygous females. Further, we perform multigenerational population drive experiments demonstrating long-term stability and efficient Cas9-dependent drive. Finally, we conduct comprehensive mathematical modeling to demonstrate that HomeR can outperform contemporary gene drive systems for population modification over wide ranges of fitness, transmission rates, introduction frequencies, and release regimens. Given its simple design, this system could be adapted to other species.

Results

Selection of PolG2 as a HomeR drive target

To develop a HomeR-based drive, we first identified an essential haplosufficient gene to target. We chose DNA Polymerase gamma subunit 2 (PolG2, DNA polymerase γ 35 kDa, CG33650), required for the replication and repair of mitochondrial DNA (mtDNA; Carrodeguas and Bogenhagen, 2000; Carrodeguas et al., 2001), whose LOF results in lethality (Iyengar et al., 2002). PolG2 encodes the small subunit of the mtDNA polymerase gamma, acting together with the large subunit 1 (PolG1, DNA polymerase γ 125 kDa, CG8987) for the replication and repair of the mitochondrial genome (Carrodeguas and Bogenhagen, 2000; Carrodeguas et al., 2001). PolG2 is a short conserved gene with a ~130 amino acid (AA) C-terminal domain (cd02426; Lu et al., 2020) sharing ~55% AA identity with the human PolG2 (Lecrenier et al., 1997; Figure 1—figure supplement 1A–B). Importantly, Drosophila PolG2 LOF mutations are known to confer lethality at the early pupal stage (Iyengar et al., 2002). The C-terminal location of the functional domain in PolG2 facilitates minimal re-coding, making PolG2 an optimal target for a HomeR gene drive (Figure 1A, Figure 1—figure supplement 1A).

Figure 1 (with 4 supplements)
HomeR homes in the presence of Cas9 and biases its transmission in females to nearly 100%.

(A) Schematic maps of the PolG2 donor allele harboring HomeR integrated at the gRNA#1PolG2 cut site (PolG2HomeR, Figure 1—figure supplement 1A), and the recipient wildtype (wt) allele (PolG2WT) encompassing the area spanning PolG2 and Orc5 (CG7833) genes. To facilitate site-specific integration, the HomeR element is surrounded by the left and right homology arms (LHA and RHA) from the Cas9/guideRNA (Cas9/gRNA) cut site (red arrows and lines) in the wt allele. The re-coded 3’ end sequence of PolG2 is shown in dark blue. The yellow line under the PolG2WT recipient allele depicts the location of the C-terminal domain (Figure 1—figure supplement 1A). (B) Embryonic lethality of PolG2WT alleles cannot result in the nearly 100% transmission of PolG2HomeR. (C) The egg-to-adult survival rate indicates that the developmental lethality of PolG2WT alleles also cannot account for the preferential transmission of PolG2HomeR. Therefore, the homing of PolG2HomeR into PolG2WT alleles causes the super-Mendelian transmission of HomeR. (D) PolG2HomeR supports super-Mendelian transmission in conjunction with different Cas9 transgenes and/or maternal carryover of Cas9 protein (Figure 1—figure supplement 3). Trans-heterozygous females (♀) and males (♂) harboring paternal Cas9 expressed under different promoters were mated to wt flies of the opposite sex, and F1 progeny were scored for the GFP dominant marker of PolG2HomeR. The transmission rate was compared to that in PolG2HomeR/PolG2WT; +/+ flies without Cas9 (control) of the corresponding sex (statistical significance indicated above data points). In addition, the transmission rate by trans-heterozygous females was compared to that of trans-heterozygous males for each Cas9 promoter (statistical significance indicated below data points). Notably, while PolG2HomeR can bias its transmission in both sexes, the highest transmission rate is achieved in Drosophila females: 99.6 ± 0.6% in ♀ vs. 75.0 ± 6.1% in ♂. 
Plots show the mean ± SD over at least five biological replicates. Statistical significance was estimated using a two-sided Student’s t test with equal variance (ns: p ≥ 0.05; *: p < 0.05; **: p < 0.01; ***: p < 0.001).

Assessment of gRNAs targeting PolG2

Given that separate gRNAs can result in varying degrees of cleavage efficiencies (Kandul et al., 2019), we tested two separate gRNAs targeting the C-terminal domain of PolG2 (gRNA#1PolG2 and gRNA#2PolG2) (Figure 1—figure supplement 1A–B). According to the D. melanogaster Genetic Reference Panel 2 (DGRP2), which includes natural variation in genome architecture among 205 D. melanogaster lines (Huang et al., 2014; Mackay et al., 2012), both gRNA target sequences are completely devoid of single nucleotide polymorphisms (SNPs), indicating a high degree of conservation. To genetically assess the efficiency of Cas9/gRNA-mediated cleavage induced by each gRNA, we crossed these established gRNA lines to two separate Cas9-expressing lines: (i) a previously characterized ubiquitously expressing Cas9 line (Port et al., 2014) in the DNA ligase 4 null genetic background (Act5C-Cas9; Lig4–/–) (Zhang et al., 2014), and (ii) a germline-enriched Cas9 driven by the nanos promoter (nos-Cas9) (Kandul et al., 2020; Kandul et al., 2019; Figure 1—figure supplement 1C–D). We tested the Cas9/gRNA-mediated cleavage in a Lig4–/– background to decrease the activity of DNA repair by the NHEJ pathway (McVey et al., 2004). As the Lig4 gene is located on the X chromosome, maternal Lig4– alleles will be inherited by all male progeny, making them hemizygous Lig4– mutants, while females are heterozygous Lig4–/+.

We observed that the genetic cross between either gRNA#1PolG2 or gRNA#2PolG2 homozygous males and Act5C-Cas9, Lig4–/– homozygous females was lethal for all male progeny (Figure 1—figure supplement 1C, Supplementary file 2). Notably, gRNA#1PolG2 also induced lethality in trans-heterozygous females harboring Act5C-Cas9 in the Lig4+/– genetic background, suggesting that gRNA#1PolG2 is likely more potent. Furthermore, we found that the Cas9 protein deposited by nos-Cas9/+ females without inheritance of the nos-Cas9 transgene, referred to as maternal carryover (Kandul et al., 2019; Lin and Potter, 2016), was sufficient to ensure lethality of the F1 progeny harboring gRNA#1PolG2, while gRNA#2PolG2 induced lethality only in a fraction of the F1 gRNA#2PolG2/nos-Cas9 trans-heterozygous flies, independent of sex (Figure 1—figure supplement 1D, Supplementary file 3). Sanger sequencing of trans-heterozygous pupae revealed expected mutations at PolG2 gRNA target sites. As we previously described, the mechanism ensuring lethality results from a dominant process we termed LBM (Kandul et al., 2019), in which maternal carryover/zygotic expression results in mosaic target gene cleavage throughout development, leading to wide-scale loss of target gene function, which can be detrimental to viability of the organism if essential genes are targeted. Taken together, these results indicate that both tested gRNAs induced cleavage of the PolG2 target sequences, though gRNA#1PolG2 induced greater cleavage than gRNA#2PolG2, as evidenced by complete lethality of females and males with both sources of Cas9.

Development of split HomeR drives with encoded rescue

Using the gRNAs characterized above (gRNA#1PolG2 or gRNA#2PolG2), we engineered two PolG2 HomeR drives (HomeRPolG2 and HomeR(B)PolG2, respectively). Fitting with the split-drive (i.e. two-locus) design (Champer et al., 2020a; Kandul et al., 2020; Li et al., 2020), neither HomeRPolG2 nor HomeR(B)PolG2 includes the Cas9 gene, and thus both are inherently confinable drives (Akbari et al., 2015; Champer et al., 2016; Esvelt et al., 2014; Marshall and Akbari, 2018). To mediate HDR, both HomeR constructs include left and right homology arms (LHA and RHA) matching the sequences surrounding the cut site of the corresponding gRNA. The LHA includes a carefully re-coded sequence of 22 or 27 AA downstream from the cut site #1 or #2 (Figure 1—figure supplement 1A–B), and a p10 3’ UTR to support robust expression of the re-coded PolG2 and to prevent gene conversion between the rescue HomeR allele and the endogenous allele, which proved problematic in previous drive design architectures (Champer et al., 2020a; Champer et al., 2020b). Additionally, we included a dominant 3xP3-eGFP-SV40 marker gene to visually track the presence of HomeR (Figure 1—figure supplement 2).

Two different approaches were used to generate transgenic lines harboring site-specific integrations of HomeRPolG2 and HomeR(B)PolG2 constructs at the corresponding cut sites in PolG2 (termed PolG2HomeR and PolG2HomeR(B) when integrated into genome). In the first approach, the constructs were initially randomly integrated into the genome and then relocated precisely into the PolG2 cut sites via Homology Assisted CRISPR Knock-in (HACK; Gantz and Akbari, 2018; Lin and Potter, 2016). In the second approach, the constructs were directly integrated into the PolG2 cut sites by injecting them into nos-Cas9 embryos (Kandul et al., 2019; Figure 1—figure supplement 2B). Using both approaches, multiple independent transgenic lines of each PolG2HomeR and PolG2HomeR(B) were generated. To confirm that PolG2HomeR or PolG2HomeR(B) lines were indeed inserted precisely at the corresponding target site in PolG2, we assessed their ability for super-Mendelian inheritance in the presence of Cas9 in trans. Establishment of pure breeding, viable homozygous stocks of PolG2HomeR/PolG2HomeR and PolG2HomeR(B)/PolG2HomeR(B), demonstrated a functional rescue of wt PolG2 function. Finally, we Sanger-sequenced the junction sites (Figure 1—figure supplement 2C) and molecularly confirmed the precision of HDR-mediated insertions.

Assessment of germline transmission and cleavage rates

To assess the effect of gRNA-mediated cleavage efficiency on the inheritance of HomeR, we compared the two HomeRs, as they harbored two distinct gRNA sequences that differed in cleavage efficiencies. The PolG2HomeR and PolG2HomeR(B) lines encode gRNA#1PolG2 and gRNA#2PolG2, respectively, with slightly different LHA and RHA corresponding to their respective gRNA cut sites (Figure 1—figure supplement 2A). We found that PolG2HomeR/+; nos-Cas9/+ trans-heterozygous females crossed to wt males transmitted PolG2HomeR to 99.5 ± 0.6% of progeny, while PolG2HomeR(B)/+; nos-Cas9/+ females transmitted the corresponding PolG2HomeR(B) to a significantly lower fraction of F1 progeny (68.7 ± 6.2%, two-sided Student’s t test with equal variances, p<0.0001; Figure 1—figure supplement 3). Genetic crosses of either PolG2HomeR/+; nos-Cas9/+ or PolG2HomeR(B)/+; nos-Cas9/+ trans-heterozygous males to wt females did not result in significantly biased transmission to F1 progeny (60.7 ± 5.3% vs. 52.9 ± 4.0% or 54.3 ± 4.0% vs. 51.5 ± 1.8%, respectively, two-sided Student’s t test with equal variances, p>0.05; Figure 1—figure supplement 3, Supplementary files 4 and 5). Maternal carryover of Cas9 protein by nos-Cas9/+ females significantly increased transmission of PolG2HomeR by F1 PolG2HomeR/CyO females, 66.1 ± 0.8% vs. 52.9 ± 4.0% (two-sided Student’s t test with equal variances, p<0.001; Figure 1—figure supplement 3, Supplementary file 4). These results suggest that the higher efficiency of gRNA#1PolG2 in guiding Cas9-mediated PolG2 disruption, which results in lethality (Figure 1—figure supplement 1C–D), likely contributes to the higher transmission rates of PolG2HomeR, and they underscore the importance of selecting an efficient gRNA for engineering gene drives.
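The equal-variance (pooled) Student's t statistic used throughout these comparisons can be sketched from the reported means and standard deviations. The example below is illustrative only: the function name is hypothetical, and the replicate count of five per group is an assumption (the figure legend states "at least five"), so the result only approximates the published analysis of the raw data.

```python
import math

def pooled_t_statistic(mean1, sd1, n1, mean2, sd2, n2):
    """Equal-variance two-sample t statistic computed from summary statistics."""
    df = n1 + n2 - 2
    # Pooled variance weights each group's variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / standard_error, df

# Transmission (% of F1 progeny) by trans-heterozygous females:
# PolG2HomeR (99.5 +/- 0.6) vs. PolG2HomeR(B) (68.7 +/- 6.2).
# ASSUMPTION: n = 5 replicates per group; the actual counts may be larger.
t_stat, df = pooled_t_statistic(99.5, 0.6, 5, 68.7, 6.2, 5)
print(f"t = {t_stat:.2f}, df = {df}")
```

Under these assumed replicate counts the statistic is roughly 11 on 8 degrees of freedom, which is in line with the very small p value reported for this comparison.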

HomeR biases its inheritance predominantly by homing

We hypothesized that either homing (indicating allelic conversion) in germ cells (Figure 1A) or ‘destruction’ of wt alleles in the progeny of trans-heterozygous PolG2HomeR/+; Cas9/+ females via LBM (Figure 1—figure supplement 4; Kandul et al., 2019) could contribute to biased PolG2HomeR transmission rates. LBM contributes to dominant biallelic disruption of the target gene throughout development thereby ensuring recessive non-functional resistant alleles (R2 type) result in dominant deleterious/lethal mutations that can get selected out of a population (Figure 1—figure supplement 4). Previously, destruction of the wt allele in conjunction with maternal carryover of a ‘toxin’ was used to engineer gene drives based on an ‘addictive’ TA approach (Champer et al., 2020a; Oberhofer et al., 2020a, Oberhofer et al., 2019). In these TA drives, one half of the F1 progeny did not inherit the TA cassette, that is, not rescued, and were killed—ensuring survival of only progeny inheriting the drive resulting in a rapid spread of the genetic cassette throughout laboratory populations.

To explore the mechanism resulting in the super-Mendelian inheritance of PolG2HomeR, we determined the egg hatching and egg-to-adult survival rates for the progeny of trans-heterozygous females and compared it to those of females heterozygous for PolG2HomeR or just Cas9 (Figure 1B,C). The hatching rate of F1 eggs generated by PolG2HomeR/PolG2WT; nos-Cas9/+ trans-heterozygous females crossed to wt males was reduced by 5.6% as compared to that of PolG2HomeR/PolG2WT; +/+ heterozygous females (88.8 ± 2.4% vs. 94.4 ± 1.2%; two-sided Student’s t test with equal variances, p<0.004, Figure 1B) and slightly lower than +/+; nos-Cas9/+ heterozygous females (88.8 ± 2.4% vs. 93.4 ± 3.7%; two-sided Student’s t test with equal variances, p=0.052; Figure 1B, Supplementary file 6), suggesting some degree of embryo killing. Furthermore, we observed no significant difference among egg-to-adult survival rates estimated for four female types crossed to wt males: 72.2 ± 9.4% for PolG2HomeR/PolG2WT; nos-Cas9/+ ♀, 72.2 ± 15.2% for PolG2HomeR/+ ♀, 70.0 ± 17.6% for nos-Cas9/+ ♀, and 71.5 ± 7.4% for wt ♀ (Figure 1C, Supplementary file 7). Taken together, these data indicate that only a small fraction of PolG2WT alleles transmitted by trans-heterozygous females were ‘destroyed’ via LBM—meaning mutated and not complemented by the paternal PolG2WT allele, since it was also mutated by Cas9/gRNA maternal carryover (Figure 1—figure supplement 4). Therefore, the HomeR transmission of 99.5% by the PolG2HomeR/PolG2WT; nos-Cas9/+ females could not be explained simply by the ‘destruction’ of PolG2WT alleles, which would result in the lethality of 50% progeny as in cleave and rescue (ClvR; Oberhofer et al., 2020a; Oberhofer et al., 2020b; Oberhofer et al., 2019) and toxin-antidote recessive embryo (TARE; Champer et al., 2020a) drives. Instead, HomeR biases its transmission predominantly by homing (i.e. allelic conversion of PolG2WT into PolG2HomeR) from trans-heterozygous females (Figure 1A).
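The reasoning above can be made concrete with a deliberately simplified calculation. The sketch below considers two idealized extremes (bias produced only by killing progeny that inherit the wt allele, or only by homing) and ignores mixtures of the two, fitness costs, and sampling error. The function names are hypothetical and the model is an illustration, not the analysis performed in this study.

```python
def survival_if_killing_only(transmission):
    """Relative progeny survival if bias arose only from killing wt-inheriting progeny.

    Half of the progeny inherit the drive and survive; a fraction k of the
    other half dies. Then transmission = 0.5 / (0.5 + 0.5 * (1 - k)), so the
    surviving fraction of all progeny equals 0.5 / transmission.
    """
    return 0.5 / transmission

def conversion_if_homing_only(transmission):
    """Fraction of wt alleles converted if bias arose only from homing.

    transmission = 0.5 + 0.5 * c, so c = 2 * transmission - 1.
    """
    return 2.0 * transmission - 1.0

observed_transmission = 0.995  # PolG2HomeR/PolG2WT; nos-Cas9/+ females
# Egg-to-adult survival of progeny from trans-heterozygous vs. wt mothers.
observed_relative_survival = 72.2 / 71.5

print(f"killing only: survival ~ {survival_if_killing_only(observed_transmission):.3f}")
print(f"homing only: conversion ~ {conversion_if_homing_only(observed_transmission):.3f}")
print(f"observed relative survival ~ {observed_relative_survival:.3f}")
```

A killing-only mechanism would require about half of the progeny to die, whereas the observed relative survival is close to 1, so the 99.5% transmission is better accounted for by conversion of roughly 99% of wt alleles.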

HomeR exhibits strong transmission bias from females

The split-drive design facilitates testing of different Cas9 promoters. Therefore, we quantified the transmission of PolG2HomeR from either trans-heterozygous females or males harboring PolG2HomeR in combination with four Cas9 promoters active in germ cells of both sexes (Figure 1D). Nanos (nos) and vasa (vas) promoters were previously described as germline-specific promoters active in both sexes (Hay et al., 1988; Sano et al., 2002; Van Doren et al., 1998), though recent evidence indicates ectopic expression in somatic tissues from both nos-Cas9 and vas-Cas9 (Kandul et al., 2020; Kandul et al., 2019). The Ubiquitin 63E (ubiq) and Actin 5C (Act5C) promoters support strong expression in both somatic and germ cells (Kandul et al., 2020; Kandul et al., 2019; Port et al., 2014; Preston et al., 2006). Since maternal carryover of the Cas9 protein was shown to induce a ‘shadow drive’ two generations later (Guichard et al., 2019; Kandul et al., 2020), we used trans-heterozygous flies that inherited paternal Cas9 to quantify the transmission of PolG2HomeR. Trans-heterozygous females harboring PolG2HomeR together with nos-Cas9, vas-Cas9, ubiq-Cas9, or Act5C-Cas9 crossed to wt males biased transmission of PolG2HomeR to nearly 100% of F1 progeny (99.5 ± 0.6%, 97.6 ± 2.6%, 99.6 ± 0.6%, and 99.0 ± 0.4%, respectively, vs. 52.9 ± 4.0% by PolG2HomeR/PolG2WT; +/+ females, two-sided Student’s t test with equal variances, p<0.001; Figure 1D). Note that the corresponding trans-heterozygous males only modestly biased PolG2HomeR transmission from 55.3 ± 5.0% of F1 progeny to 60.7 ± 5.3% (p>0.05), 63.2 ± 6.6% (p<0.03), 66.1 ± 4.6% (p<0.004), and 62.0 ± 1.7% (p<0.017, two-sided Student’s t test with equal variances, Figure 1D, Supplementary file 4), respectively.

To assess whether males could support robust homing, we investigated three alternative male-specific promoters. We established the Drosophila exuperantia (CG8994) large fragment (exuL) promoter for an early male-specific expression. The Rcd-1 related (Rcd1r, CG9573; Chan et al., 2013) and βTubulin 85D (βTub; Chan et al., 2011; Michiels et al., 1989) promoters support an early and late, respectively, testis-specific expression in Drosophila males. We found that only exuL-Cas9 induced the male-specific super-Mendelian inheritance of PolG2HomeR; trans-heterozygous males, but not females, transmitted PolG2HomeR to more than 50% of F1 progeny (75.0 ± 6.1% vs. 55.3 ± 5.0% in ♂, p<0.0001; and 50.7 ± 3.4% vs. 52.8 ± 4.0% in ♀, p>0.05, two-sided Student’s t test with equal variances; Figure 1D, Supplementary file 4). To our surprise, Rcd1r-Cas9 induced super-Mendelian inheritance of PolG2HomeR in both trans-heterozygous males and females (68.2 ± 3.8% vs. 55.3 ± 5.0% in ♂, p<0.002; and 90.8 ± 0.5% vs. 52.8 ± 4.0% in ♀, p<0.0001, two-sided Student’s t test with equal variances; Figure 1D). Finally, βTub-Cas9 did not induce changes in transmission of PolG2HomeR by either trans-heterozygous males or females (55.6 ± 5.7% vs. 55.3 ± 5.0% in ♂, p=0.55; and 51.5 ± 2.1% vs. 52.9 ± 4.0% in ♀, p=0.94, two-sided Student’s t test with equal variances; Figure 1D, Supplementary file 4). These results suggest that Drosophila males bias PolG2HomeR transmission; however, this bias is substantially lower than the nearly 100% transmission of PolG2HomeR in females.

Induced resistant alleles do not impede drive invasion

We reasoned that insertion of HomeR into an essential gene could enable spread into a population by biasing transmission while also inhibiting the accumulation of LOF resistant alleles (R2 type, PolG2R2) through a combination of slow-acting Mendelian selection and LBM (Figure 1—figure supplement 4). However, functional resistant alleles (R1) that alter the AA sequence of the target can still be generated (e.g. by EJ repair resulting in in-frame indels, or nonsynonymous base substitutions) and may hinder drive spread. While these mutations may partially rescue the function of the target gene, resulting in viability, they may still confer fitness costs on their carriers because the altered AA sequence(s) in a critical domain incompletely preserve the function of the essential target gene, and they may therefore be negatively selected within a population. We therefore refer to these here as ‘non-silent R1’ or ‘non-silent PolG2R1’ mutations. In contrast, resistant alleles that alter the nucleotide sequence, but not the AA sequence, can also be generated, and these are referred to here as ‘silent R1’ or ‘silent PolG2R1’ mutations. These silent R1 mutations are expected to be especially problematic to drive spread, as they would not be predicted to impose fitness costs on homozygous carriers and would therefore be expected to spread at the expense of the drive. Notwithstanding, the generation of both types of R1 functional resistant alleles (non-silent/silent) is a shared problem of all HGDs developed to date.

To explore this potential, in addition to spread and stability, we initiated three multigenerational cage populations of heterozygous PolG2HomeR/+ flies (50% allelic frequency) in the nos-Cas9/nos-Cas9 genetic background and assessed if induced resistant alleles could impede drive invasion and stability in 10 discrete generations (Figure 2A). Functional resistant alleles (R1) were expected to be generated, especially at the earlier generations initiated with 50% wt alleles, and would be straightforward to score in our assay by a simple loss of the dominant GFP marker. Note that any viable GFP-negative (GFP–) adult flies must have at least one functional PolG2 allele to survive (either PolG2WT or PolG2R1).

Figure 2 (with 1 supplement)
Induced resistant alleles do not impede the Cas9-dependent spread of PolG2HomeR.

(A) To explore the fate of induced functional resistant alleles (R1), three population cages were seeded with PolG2HomeR/PolG2WT heterozygous flies in nos-Cas9/nos-Cas9 genetic background and run for 10 discrete generations. PolG2HomeR and nos-Cas9 were tagged by dominant eye-specific GFP and body-specific dsRed, respectively. Images of an individual male (♂), female (♀), and a group of flies are shown. In total, nine PolG2HomeR-negative flies, as determined by the absence of eye-specific GFP, were identified at generations 2 and 3 in populations #1 and #3. After these flies participated in seeding the next generation, they were isolated and genotyped. Two functional PolG2R1 resistant alleles identified in these flies (Figure 2—figure supplement 1A) were not sampled in flies collected at generation 10 by Illumina amplicon sequencing (Figure 2—figure supplement 1B). (B) The PolG2HomeR allele spread efficiently in the homozygous Cas9 genetic background. Population drives were seeded with 50 PolG2HomeR/PolG2HomeR ♂, 50 wt ♂, and 100 wt virgin ♀ in the presence (green points) or absence (blue points) of nos-Cas9, and the carrier frequency of PolG2HomeR was scored at each discrete generation. The model for the HomeR population replacement drive (gray diamonds and a black dotted line) was fitted to the empirical data of the PolG2HomeR spread in the presence of nos-Cas9 (green points). After 10 generations, the PolG2HomeR allele spread from the introductory frequency of 25% to the carrier frequency of 99.9 ± 0.3% in the presence of nos-Cas9 (green points) or continued to drift at 60.8 ± 3.8% without the Cas9 transgene (blue points; p<0.0001, a two-sided Student’s t test with equal variance). (C) The invasion of PolG2HomeR is limited by the fitness of the Cas9 transgene. 
Double homozygous males at frequencies of 75% or 50% were released into the wt genetic background to establish four drive populations with mixtures of trans-heterozygous and wt flies at above or below 50%, respectively, in generation 0. Both PolG2HomeR and nos-Cas9 transgenes were scored at each generation by the GFP (green points) and dsRed (purple points) markers. The carrier frequency of nos-Cas9 decreased from 69.1% to 9.9% or from 36.4% to 4.0% in six generations confining the spread of PolG2HomeR.

In early generations 2 and 3, we sampled nine such flies lacking PolG2HomeR in two out of three populations (#1 and #3; Figure 2A, Supplementary file 8). To ensure that the PolG2R1 alleles had a chance to transmit and compete with PolG2HomeR alleles, these flies were transferred among the subsequent generation and allowed to mate with other flies and lay eggs in each population lineage before genotyping. As expected, we determined that each fly indeed harbored at least one PolG2R1 allele that rescued viability. Two distinct non-silent PolG2R1 alleles were identified, with one non-silent PolG2R1 type induced independently in two populations (Figure 2—figure supplement 1A). Since we did not find any fly without the PolG2HomeR allele after generation 3, it can be noted that the identified non-silent PolG2R1 resistant alleles did not impede drive invasion/stability. Despite this, we cannot rule out the possibility that these and other resistant alleles were still present at low frequencies in populations masked by PolG2HomeR alleles.

To further explore the diversity of resistant alleles remaining after 10 generations, we performed next-generation sequencing on 60 randomly selected GFP+ flies (note that each fly had at least one copy of PolG2HomeR) from each population to identify and quantify mutant PolG2 alleles that did not harbor the large insert of HomeRPolG2 (~2.5 kb; Figure 1A). From nearly 150,000 sequence reads generated, we did not identify the two previously sampled functional non-silent PolG2R1 alleles (Figure 2—figure supplement 1A). Instead, we found two novel non-silent PolG2R1 in-frame indel alleles, 18 and 9 bp deletions, in populations #2 and #3 (Figure 2—figure supplement 1B). Additionally, we found 11 LOF PolG2R2 alleles harboring out-of-frame indels ranging from a 1 bp insertion to a 23 bp deletion (Figure 2—figure supplement 1B). Two PolG2R2 alleles, 2 and 4 bp deletions, were also seen in the genotyped flies at generations 2 and 3, suggesting that these may have persisted in the populations. The relative abundance of each allele can be used to extrapolate the minimum number of resistant alleles sampled in the 60 heterozygous and/or homozygous flies. We inferred that at least 9, 5, and 17 resistant alleles persisted for 10 generations and were rescued by the PolG2HomeR allele in 60 sampled flies from populations #1, #2, and #3, respectively (Figure 2—figure supplement 1B). Since a single PolG2HomeR allele rescues the wt function of PolG2 and can mask the opposite allele at the PolG2 locus from slow-acting purifying selection, it is not surprising that LOF resistant alleles can be found persisting in the population.

HomeR spreads in a Cas9-dependent manner

To evaluate drive efficacy in the presence of Cas9, we established five drive and three control (‘no-drive’) populations by seeding 50 homozygous PolG2HomeR males and 50 wt males together with 100 wt virgin females in the presence or absence of Cas9, respectively (Figure 2B). The introduction ratio of PolG2HomeR to PolG2WT alleles was 1:3 (or 25% allele frequency) in the parental generation (P). Both types of homozygous PolG2HomeR males, with and without Cas9, were able to compete with the corresponding wt males and sired 41.0 ± 12.4% (Supplementary file 9) and 70.0 ± 5.5% (Supplementary file 10) of progeny, respectively. Notably, the PolG2HomeR males were significantly less competitive with wt males for female mates in the nos-Cas9 genetic background than in the wt genetic background (p=0.01, two-sided Student’s t test with equal variances; Figure 2B). Nevertheless, the PolG2HomeR allele spread to the carrier frequency of 96.7 ± 4.4% in the presence of nos-Cas9 vs. 56.9 ± 11.6% without the Cas9 transgene within four generations (p=0.0004, a two-sided Student’s t test with equal variance; Supplementary files 9 and 10). At generation 10, the PolG2HomeR allele was fixed in four out of five drive populations and continued to drift at moderate frequency in three control populations in the absence of Cas9: 99.9 ± 0.3% vs. 60.8 ± 3.8%, respectively (p<0.0001, a two-sided Student’s t test with equal variance; Figure 2B, Supplementary files 9 and 10), underscoring the Cas9 dependence of the drive.
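
The 25% introduction frequency follows from counting alleles in the seeded flies. A minimal Python sketch, assuming every fly is diploid at the PolG2 locus and that only the homozygous PolG2HomeR males carry the drive allele (the function name is illustrative and does not come from the study):

```python
def introduction_allele_frequency(homozygous_drive_males, wt_males, wt_females):
    """Fraction of PolG2 alleles that are PolG2HomeR in the seeded (P) generation."""
    # Each homozygous drive male contributes two drive alleles.
    drive_alleles = 2 * homozygous_drive_males
    # Every seeded fly contributes two alleles at the locus.
    total_alleles = 2 * (homozygous_drive_males + wt_males + wt_females)
    return drive_alleles / total_alleles


# Seeding used for the drive and control cages: 50 drive males, 50 wt males, 100 wt females.
print(introduction_allele_frequency(50, 50, 100))  # 0.25
```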

A few GFP– flies harboring PolG2R1 alleles appeared over multiple generations in drive populations #4 and #5 (Supplementary file 9). To assess the fertility of viable PolG2R1 carriers, we collected 7 GFP– females and 7 GFP– males at generation 9 and individually crossed them to wt flies of the corresponding sex. Interestingly, we found that the GFP– males were fertile, while each tested GFP– female died within 3 days without producing progeny, suggesting that the sampled PolG2R1 allele(s) incurred fitness costs to female carriers. Dead females were genotyped, and we identified four non-silent R1 alleles that rescued their ‘short-lived’ viability (Figure 2—figure supplement 1C). Notably, one non-silent R1 allele was already sampled in the GFP– flies from heterozygous population #1 (R1#2 in Figure 2—figure supplement 1C). Each tested GFP– male was fertile, and four genotyped males had the same non-silent allele identified in the females (R1#3 in Figure 2—figure supplement 1C). In summary, all in-frame resistant alleles identified in this study resulted in AA changes and are therefore non-silent; importantly, we did not sample any silent PolG2R1 mutant alleles.

Fitting a mathematical model of CRISPR/Cas9-based homing drive to the observed cage data in Figure 2B (see 'Materials and methods'), we found the data to be consistent with cleavage efficiencies in females and males of 99.2% (95% credible interval [CrI] 96.4–100%) and 99.6% (95% CrI 98.2–100%), respectively, and a frequency of accurate HDR, given cleavage, in females and males of 99.5% (95% CrI 97.8–100%) and 9.6% (95% CrI 8.0–10.0%), respectively. When accurate HDR did not occur, the data were consistent with 2.9% (95% CrI 1.9–4.0%) of resistant alleles being in-frame, and the remainder being out-of-frame or otherwise costly LOF alleles. Individuals having the HomeR system were found to have a negligible fitness cost of 0.3% (95% CrI 0.0–1.4%), while individuals homozygous for the LOF allele were modeled as completely unviable. The fitted parameter estimates are consistent with parameters estimated from individual pair crossings (Figure 1D).
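
To see what these estimates imply for inheritance, a common back-of-the-envelope relation for homing drives can be applied: a heterozygous parent passes the drive to half of its gametes by Mendelian inheritance, and the remaining wt alleles are converted with probability cleavage × HDR. The sketch below assumes this simplified relation (it is not the fitted model itself) and plugs in the point estimates quoted above; the function name is hypothetical.

```python
def expected_drive_transmission(cleavage, hdr_given_cleavage):
    """Expected fraction of gametes carrying the drive from a heterozygous parent.

    Assumes conversion happens only in the germline and that unconverted
    alleles are transmitted normally (a simplification).
    """
    homing = cleavage * hdr_given_cleavage  # probability a wt allele is converted
    return 0.5 * (1.0 + homing)


female = expected_drive_transmission(0.992, 0.995)
male = expected_drive_transmission(0.996, 0.096)
print(f"female: {female:.3f}, male: {male:.3f}")  # female: 0.994, male: 0.548
```

Under these assumptions, nearly all gametes of heterozygous females would carry the drive, whereas males would transmit it only slightly above the Mendelian 50%.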

The spread of HomeR is confined by the fitness of Cas9

In the split-drive (i.e. two-locus) design, the continued spread of HomeR is contingent on the availability of Cas9, which ensures confinability. Therefore, to explore the invasion potential of HomeR under a limited supply of Cas9, we seeded additional drive populations with double homozygous and wt flies, and scored both PolG2HomeR and Cas9 at each generation. Four populations with PolG2HomeR/PolG2HomeR; nos-Cas9/nos-Cas9 and wt (PolG2WT/PolG2WT; +/+) males mixed at 1:1 (two replicates; 25% allele frequency) or 3:1 ratios (two replicates; 37.5% allele frequency) and 100 wt virgin females were seeded (Figure 2C). These ratios generated two population types with frequencies of trans-heterozygous flies below and above 50% in the subsequent generation: 36.4 ± 4.8% and 69.1 ± 3.0% (generation 0 in Figure 2C, Supplementary file 11). After tracking these populations for six generations, we found that frequencies of the PolG2HomeR allele gradually increased to 67.9 ± 1.3% and 73.7 ± 9.8%, respectively, and persisted near these frequencies. Notably, we also observed that the frequency of nos-Cas9 decreased each generation down to 4.1 ± 0.2% or 9.9 ± 1.6% in two population types by generation 6 (Figure 2C, Supplementary file 11). These results suggest that nos-Cas9 incurred fitness costs on its carriers and was therefore selected out of the population. Furthermore, these experiments underscore the significant fitness cost Cas9 can impose, and indicate that a single release, at the introduction thresholds used here, would be insufficient to achieve fixation of PolG2HomeR given such substantial costs to Cas9. We therefore use mathematical modeling to explore multi-release scenarios (below).

Modeling indicates that HomeR is an efficacious gene drive

To compare the performance of HomeR against contemporary gene drive systems for population modification, we modeled one- (i.e. autonomous, linked-Cas9) and two-locus (i.e. split-drive) versions of ClvR (Faber et al., 2020; Oberhofer et al., 2020a, Oberhofer et al., 2020b, Oberhofer et al., 2019), the one-locus TARE system from Champer et al., 2020a, as well as a two-locus TARE configuration based on their design, an HGD targeting a non-essential gene (Gantz et al., 2015; Hammond et al., 2016), and HomeR (for mechanistic comparisons of these systems, see Figure 3—figure supplement 1). In each case, we first simulated population spread of each gene drive system for an ideal parameterization (see 'Materials and methods' for more details) and included additional simulations for HomeR under current experimentally derived parameters (HomeR-exp; Figure 3A and C, parameters consistent with Figure 2C). To gauge behavior across a range of scenarios, we performed simulations for a range of fitness costs (implemented as female fecundity reduction) and drive system transmission rates (implemented by varying the cleavage rate), providing heatmaps of the expected performance for each drive system at each parameter combination (Figure 3B and D). Drive efficacy, the outcome in these comparisons, is defined as the expected fraction of individuals that carry the effector allele, in either heterozygous or homozygous form, at 20 generations following a 25% release of male homozygotes for each drive system.
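
As a rough illustration of the efficacy metric (carrier frequency after a 25% release of homozygous males), the following Python sketch iterates a deterministic, resistance-free model of a one-locus homing drive. It is a deliberately simplified stand-in, not the stochastic model used for Figure 3: it assumes random mating, discrete generations, an infinite population, no fitness costs, and no resistant alleles. The function and parameter names are hypothetical.

```python
def carrier_frequency_after(generations, homing_female, homing_male,
                            release_fraction=0.25):
    """Fraction of individuals carrying >= 1 drive allele (H) after n generations.

    homing_female / homing_male: probability that a wt allele is converted
    to H in the germline of a heterozygous female / male.
    """
    # Generation 0: all females are W/W; males (half the population) are a
    # mix of released H/H males and W/W males.
    h_female_gametes = 0.0
    h_male_gametes = release_fraction / 0.5
    carriers = release_fraction
    for _ in range(generations):
        # Random union of gametes gives offspring genotype frequencies.
        hh = h_female_gametes * h_male_gametes
        ww = (1.0 - h_female_gametes) * (1.0 - h_male_gametes)
        hw = 1.0 - hh - ww
        carriers = 1.0 - ww
        # Gametes produced by this generation; homing acts in heterozygotes.
        h_female_gametes = hh + hw * 0.5 * (1.0 + homing_female)
        h_male_gametes = hh + hw * 0.5 * (1.0 + homing_male)
    return carriers


# Mendelian control (no homing) versus perfect homing in both sexes.
print(carrier_frequency_after(20, 0.0, 0.0))  # 0.4375
print(carrier_frequency_after(20, 1.0, 1.0))
```

Without homing the carrier frequency settles at the Hardy-Weinberg value for a 25% allele frequency, whereas any sustained homing pushes it upward; fitness costs and resistant alleles, which this sketch omits, are what separate the drive systems in the full simulations.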

Figure 3. Performance of contemporary gene drive systems for population modification with a single release.

(A) Simulations of carrier frequency trajectories (i.e. heterozygotes and homozygotes) for one-locus versions of ClvR, TARE, HomeR, and HGD for ideal parameters (see 'Materials and methods'), and HomeR for experimental parameters (HomeR-Exp, see 'Materials and methods'). Twenty-five repetitions (lighter lines) were used to calculate the average behavior of each drive (thicker, dashed lines). Populations were initialized with 50% wildtype (+/+) adult females, 25% wildtype (+/+) adult males, and 25% drive homozygous (drive/drive) males. (B) Heatmaps depicting drive efficacy for one-locus versions of ClvR, TARE, HomeR, and HGD for a range of fitness and transmission rate parameter values. Fitness costs were incorporated as a dominant, female-specific fecundity reduction. Transmission rate was varied based on cleavage rate, using HDR rates consistent with ideal parameters, when applicable (see 'Materials and methods'). Drive efficacy is defined as the average carrier frequency at generation 20 (approximately 1 year, given a generation period of 2-3 weeks) based on 100 stochastic simulations with the same initial conditions as (A). (C) Simulations of carrier frequency trajectories for two-locus (split-drive) versions of ClvR, TARE, HomeR, and HGD for ideal parameters (see 'Materials and methods'), and HomeR for experimental parameters (HomeR-Exp, see 'Materials and methods'). Twenty-five repetitions (lighter lines) were used to calculate the average behavior of each drive (thicker, dashed lines). Populations were initialized with 50% wildtype (+/+; +/+) adult females, 25% wildtype (+/+; +/+) males, and 25% drive homozygous (Cas9/Cas9; gRNA/gRNA) males. (D) Heatmaps depicting drive efficacy for two-locus versions of ClvR, TARE, HomeR, and HGD for a range of fitness and transmission rate parameter values, implemented as in panel (B), with initial conditions given in (C).

When one-locus gene drive (GD) systems are compared for ideal parameter values, HomeR outperforms all other GDs in terms of speed of spread, and reaches near fixation in terms of carrier frequency, as do ClvR and TARE (Figure 3A). HGD displays a similar speed of spread to HomeR initially; however, fitness costs from the targeted gene disruption and LOF (R2) alleles slow the introgression and allow functional resistance alleles (R1) to build up over time, preventing fixation. The HomeR design overcomes this fitness reduction and R2 allele build-up by rescuing the wt function of a targeted essential gene (Figure 3—figure supplement 1). ClvR and TARE perform similarly to each other for ideal parameter values, but reach near carrier fixation ~4 generations after HomeR does for ideal parameter values (Figure 3A). When experimental parameters are used for HomeR (HomeR-Exp, in Figure 3A), it reaches near carrier fixation a generation after ClvR; almost on-par with ideal ClvR and TARE systems and significantly better than HGD. HomeR also reaches near carrier fixation for the widest range of fitness and transmission rate parameter values (Figure 3B). As an HGD, HomeR drives to high carrier frequencies provided its inheritance bias (or transmission rate) exceeds its associated fitness cost. In contrast, drive efficacy of ClvR and TARE is strongly dependent on fitness cost and weakly dependent on transmission rate. Indeed, ClvR and TARE can each only tolerate fitness costs less than ~20% (Figure 3B). This is a consequence of their design, employing a TA scheme, which induces a significant fecundity reduction (Figure 3—figure supplement 1A–B) in addition to other fitness costs. A one-locus HGD also exhibits efficacy across a wide range of parameter combinations, but its efficacy is reduced compared to HomeR due to the build-up of R2 and R1 alleles (Figure 3—figure supplement 1C–D), which can potentially block spread of the HGD in large populations (Figure 3B).

In two-locus simulations, Cas9 is separated from drive in all designs and undergoes independent assortment during gametogenesis. The effects of this design change are evident (Figure 3C). Under the same experimental conditions as one-locus simulations, there is significantly more variation in behavior of two-locus GDs, with a reduced speed of introgression into the population and slightly reduced overall efficacy. Nevertheless, general trends remain the same; HomeR with ideal parameters is more capable than comparable drives, though current experimental realizations require improvement. TARE performs significantly worse in a split configuration (Champer et al., 2020a). ClvR, when completely unlinked, also performs significantly worse, in agreement with results from Oberhofer et al., 2020a. Exploring the performance under a range of parameters, we found reduced overall efficacy for all drives (Figure 3D), but an increased range of lower efficacy for ClvR, HomeR, and TARE. This is consistent with fitness costs applied to the Cas9 locus, which is now separated from the effector gene and gRNAs. HomeR demonstrates the widest range of achieving efficacy as well as the widest range of high expected efficacy.

Exploration of multi-release scenarios

To probe the ability of these drive designs to modify field populations, we implemented an overlapping generation model (Sanchez et al., 2019), performing weekly male releases into a naive population, and tested if the frequency of females carrying the effector allele reached 95% of the female population, and how long that carrier frequency remained above 95%. One-locus constructs of ClvR, TARE, and HomeR were consistently able to reach this threshold, though HomeR achieved these thresholds over the widest range of transmission rates and fitness costs (Figure 4A). HGD never reached this threshold because of R2 allele build-up. This does not indicate that HGD cannot be effective at lower thresholds (indeed, during testing it was), but that even low rates of resistance generation are problematic. HomeR was the only one to consistently remain above a 95% carrier frequency for over 100 days (Figure 4B).

Figure 4. Performance of contemporary gene drive systems for population modification with multiple releases.

(A) Simulations of one-locus designs of ClvR, TARE, HomeR, and HGD in an ecologically consistent model (see 'Materials and methods'). Weekly releases of drive homozygous males (20% of the population size) were performed for up to 13 weeks (3 months, approximately one field season), and the female population was then tested for carrier frequency above 95% at any point within the subsequent 4 years. (B) Same setup as (A), but now the female populations were measured for how many days the carrier frequency remained above 95%, starting at the first release, lasting up to 4 years. This indicates the window where a disease-refractory allele could provide protection (window-of-protection [WOP]). (C) Simulations of two-locus (split-drive) versions of ClvR, TARE, HomeR, and HGD in an ecologically consistent model (see 'Materials and methods'). This time, only the first release was homozygous for Cas9 and the gene drive. Supplementary releases included only Cas9. Male releases were 20% of the total population size, and the female population was measured for drive carrier frequency, not Cas9 frequency, above 95%. (D) Using the data from (C), we applied the method from (B) to measure the WOP, where the drive carrier frequency remained above 95% in the female population. This only measures the drive frequency, not females who carry the Cas9 allele. All simulations contained 100 stochastic repetitions.
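
The two outcomes in this figure, whether the 95% threshold is reached and how long the carrier frequency stays above it, can be computed from a daily trajectory of female carrier frequencies. A minimal Python sketch, assuming the window-of-protection is the total number of days strictly above the threshold (the text does not state whether the days must be contiguous); function names and the example trajectory are made up for illustration:

```python
def reached_threshold(daily_carrier_frequency, threshold=0.95):
    """True if the female carrier frequency exceeds the threshold on any day."""
    return any(freq > threshold for freq in daily_carrier_frequency)


def window_of_protection(daily_carrier_frequency, threshold=0.95):
    """Number of days on which the female carrier frequency exceeds the threshold."""
    return sum(1 for freq in daily_carrier_frequency if freq > threshold)


# Hypothetical six-day trajectory, one value per day.
trajectory = [0.20, 0.60, 0.96, 0.97, 0.94, 0.96]
print(reached_threshold(trajectory))     # True
print(window_of_protection(trajectory))  # 3
```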

Two-locus designs showed significantly reduced ability to reach 95% carrier frequency in females, often requiring more and larger (20% of the total population size) releases to be effective (Figure 4C). For split-drive designs, only the first release was homozygous Cas9 and gene drive, while supplementary releases were homozygous Cas9 only (Faber et al., 2020; Oberhofer et al., 2020a, Oberhofer et al., 2020b, Oberhofer et al., 2019). A similar pattern of efficacy is seen for ClvR, TARE, and HomeR, but by splitting the HGD and maintaining the fitness effects on the Cas9 allele, it is now able to reach 95% introgression over a small parameter range. Additionally, as fitness costs are predominantly associated with the Cas9 allele and not the effector gene, all constructs were adequate at maintaining effector allele frequency in the population over a long period of time (Figure 4D). Taken together, these results suggest that multi-releases would be sufficient to ensure HomeR spreads and persists stably in a population.

Discussion

We have engineered a system we term HomeR, for population modification that mitigates some issues related to drive resistance. To limit the potential for inducing functional resistant alleles, an essential gene required for insect viability was strategically targeted. Multigenerational population drive experiments indicate that PolG2HomeR can spread and persist efficiently in the presence of Cas9, and this persistence is not impacted by induced resistant alleles, including functional resistant alleles (non-silent R1), mitigating a major challenge for population modification HGDs that are not designed to target essential genes.

The re-coded rescue strategy that we used to develop HomeR was also used in previous Drosophila TA non-HGDs (Champer et al., 2020a; Oberhofer et al., 2020a, Oberhofer et al., 2020b, Oberhofer et al., 2019; Figure 3—figure supplement 1A–B) and recent HGDs in both Drosophila (Champer et al., 2020b) and Anopheles stephensi (Adolfi et al., 2020), though each of these examples suffered from potential drawbacks. For example, both the haplolethal HGD (Champer et al., 2020b) and the TARE design (Champer et al., 2020a) share similar problematic design architectures that can be unstable as they are susceptible to functional resistant alleles induced via recombination between the promoter including sequences 5’ of the coding sequence and 3’ UTR regions, which are identical between the re-coded sequence and the wt sequence (Figure 2—figure supplement 1C in Champer et al., 2020a and Figure 2 in Champer et al., 2020b). Moreover, a haplolethal HGD (Champer et al., 2020b) that biases transmission in females requires a strict germline-specific promoter that limits maternal carryover, otherwise LBM, either mono- or biallelic, may result in dominant negative fitness costs to its carrier (Figure 3—figure supplement 1E) and impede drive spread. In fact, our efforts to find such promoters in Drosophila proved exceedingly difficult—with previously tested ‘germline-specific’ promoters such as nanos and vasa showing significant somatic activity at multiple insertion sites (Kandul et al., 2020; Kandul et al., 2019). The recent HGD in A. stephensi (Adolfi et al., 2020) was designed to target and rescue a non-essential gene for viability (i.e. the eye pigmentation kynurenine hydroxylase gene), whose disruption was pleiotropic and only partially costly to female fecundity and survival (Adolfi et al., 2020; Gantz et al., 2015; Pham et al., 2019). 
Notwithstanding, this drive spread efficiently in small, multigenerational laboratory population cages under several release thresholds; however, many drives did not reach, nor maintain, complete fixation, presumably due to the viability and partial fertility of drive-generated homozygous LOF resistant alleles (Figure 3—figure supplement 1D), underscoring the critical importance of targeting a recessive essential gene for such drives, especially for larger releases. By comparison, the ClvR system is quite stable; however, it can be cumbersome to engineer: it requires re-coding of the essential rescue gene, including all target sequences within the coding sequence (lacking introns), and uses an exogenous promoter and 3’ UTR, necessitating precise titration of expression from a distal genomic location with exogenous sequences to guarantee rescue without imposing deleterious fitness costs. These features may be difficult to accomplish for essential genes requiring complex regulatory elements and networks not directly adjacent to the target gene. In contrast to the aforementioned drives, HomeR (i) relies on the endogenous promoter sequence of the target gene to facilitate rescue expression, which significantly simplifies the design and ensures endogenous expression of the rescue using native regulatory machinery; (ii) targets the 3’ end of the essential gene to limit the degree of re-coding required for the rescue; (iii) uses an exogenous 3’ UTR to prevent deleterious recombination; and (iv) exploits LBM (Kandul et al., 2019) by targeting an essential gene to ensure recessive non-functional resistant alleles result in dominant deleterious/lethal mutations that are actively selected out of a population (Figure 1—figure supplement 4). These four design features distinguish HomeR from other population modification drives.

Our findings are congruent with previous studies demonstrating reduced homing in Drosophila males (Chan et al., 2013; Chan et al., 2011; Windbichler et al., 2011). We tested multiple Cas9 lines supporting Cas9 expression in early and/or late germ cells with different levels of specificity and have not achieved high levels of homing as reported in Anopheline mosquito males (Gantz et al., 2015; Kyrou et al., 2018). Achiasmatic meiosis in Drosophila males likely correlates with the weak activity of the HDR pathway (Preston et al., 2006), which in turn results in inefficient homing in Drosophila males. Mosquito males have chiasmatic meiosis and recombination (Kitzmiller, 1976) that require active HDR machinery in primary spermatocytes, possibly contributing to efficient homing. Reduced homing efficacy in Drosophila males should be accounted for when designing HGDs in other species exhibiting achiasmatic meiosis, such as Drosophila suzukii, an invasive fruit pest.

Results from independent multigenerational population cage experiments indicate that HomeR spreads and persists efficiently in the nos-Cas9 genetic background (Figure 2B). As expected, a single copy of the HomeR inserted at an essential gene provides sufficient rescue and complements the corresponding LOF allele. The PolG2HomeR allele persisted for 10 generations in control cage populations without Cas9 and its frequency drifted >50%, underscoring the lack of major fitness costs to PolG2HomeR. The LOF alleles complemented by PolG2HomeR also persisted for many generations after a carrier frequency reached 100%. Once LOF alleles are complemented by the PolG2HomeR allele, it takes several generations for LOF alleles to combine as lethal homozygotes and be negatively selected out of the population. The slow-acting elimination of LOF alleles takes an especially long time for HGDs targeting non-essential genes or genes whose disruption does not cause complete lethality or sterility of homozygous carriers (Figure 3—figure supplement 1D; Adolfi et al., 2020; Gantz et al., 2015), underscoring the importance of targeting essential genes.
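
The slow purging of masked LOF alleles can be illustrated with the classical population-genetics recursion for a fully recessive lethal allele, q' = q / (1 + q), whose closed form is q_n = q_0 / (1 + n·q_0). The Python sketch below applies it under simplifying assumptions that are not taken from the study: random mating, an infinite population, no new LOF alleles being generated, and all non-LOF alleles (including PolG2HomeR) treated as one functional class. The starting frequency is hypothetical.

```python
def lof_frequency_over_time(q0, generations):
    """Frequency of a recessive lethal (LOF) allele over successive generations."""
    frequencies = [q0]
    q = q0
    for _ in range(generations):
        # Homozygous LOF zygotes (frequency q*q) die; renormalize among survivors.
        q = q / (1.0 + q)
        frequencies.append(q)
    return frequencies


# Starting from a hypothetical 10% LOF allele frequency.
trajectory = lof_frequency_over_time(0.10, 10)
print(f"{trajectory[-1]:.3f}")  # 0.050
```

Even with complete lethality of homozygotes, ten generations only halve a 10% starting frequency in this idealized setting, which is consistent with LOF alleles remaining detectable at generation 10.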

Functional resistant (R1) alleles are a problematic feature shared by many kinds of gene drives. These alleles can still be induced even when an essential gene required for insect viability is targeted (Figure 2—figure supplement 1). However, it should be noted that here we did not identify any silent R1 mutations (i.e. mutations that change the DNA sequence but not the protein AA sequence), which would be expected to be fitness neutral. Each identified in-frame non-silent PolG2R1 allele we found changed at least one AA and thus may still affect the fitness of its carrier, especially since we are targeting an essential gene, preventing such alleles from accumulating at the expense of the drive. Indeed, we observed that functional R1 alleles imposed fitness costs on seven female carriers sampled in drive populations #4 and #5. This fitness cost likely limits their accumulation and results in negative selection out of the population, in favor of the PolG2HomeR alleles, over multiple generations (Figure 2A–B), again underscoring the importance of targeting an essential gene. Nevertheless, multiplexing by encoding additional gRNAs into HomeR may further diminish the probability of inducing functional resistant alleles and further increase drive stability, spread, and persistence (Champer et al., 2018; KaramiNejadRanjbar et al., 2018; Marshall et al., 2017; Oberhofer et al., 2018).

Splitting HomeR into two genetic loci (HomeR and Cas9) integrated on different chromosomes serves as an important molecular containment mechanism (Akbari et al., 2015; Long et al., 2020). The HomeR element is able to home into wt alleles and bias its transmission. However, the Cas9 element, which is inherited in a Mendelian fashion, is required for its homing. Therefore, the independent assortment of Cas9 and HomeR limits the spread of HomeR and acts as a genetic ‘brake’ for the invasion of HomeR. The spread dynamic of split-HGDs resembles that of high-threshold drives and thus requires a high introduction rate for HomeR to spread into a local population and prevents its spread into neighboring populations, which is an important feature for confining drive spread and may be necessary for initial field testing of gene drives (Adelman et al., 2017; Akbari et al., 2015; Friedman et al., 2020; Kandul et al., 2020; Li et al., 2020; Raban and Akbari, 2017; Raban et al., 2020). Moreover, HomeR can be further confined by fitness costs to either the HomeR drive itself or to the Cas9 element, and our experiments revealed that the Cas9 element imposed significant fitness costs that can impede drive invasion (Figure 2C). Nevertheless, even with significant fitness costs, multiple releases of HomeR could still enable drive spread and long-term persistence, as evidenced by mathematical models (Figure 4). As an additional safety measure, if unintended consequences arise, HomeR’s spread can be reversed by reintroduction of insects harboring wt alleles of the gene targeted. If desired, HomeR could readily be converted into a non-localized gene drive by incorporating the Cas9 into the HomeR drive cassette, and our modeling illustrates that it could perform quite well under this configuration (Figure 3A,B, Figure 4A,B). 
Taken together, the split-drive design of HomeR is a safe localized gene drive technology that could be widely adopted and implemented for local population control, and if a non-localized drive is desired for more wide scale spread, HomeR could be adapted for that purpose too.

In sum, HomeR combines promising aspects of current population modification drives: confinability, the high transmission of HGDs, and resilience to EJ-generated resistant alleles (R2 type and R1 type that induces a fitness cost) similar to TA drives (Figure 3—figure supplement 1). Modeling illustrates success of both design aspects in linked or split-drive form, demonstrating robust behavior over a range of parameter combinations (Figures 3 and 4). This underscores its stability and resilience to EJ alleles, overcoming a significant hurdle for current HGD designs. Given the simplicity of the HomeR design, it could be universally adapted to a wide range of species including human disease vectors in the future.

Materials and methods

Key resources table

Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information
Strain, strain background (D. melanogaster) | gRNA#1PolG2 | 159674 | 91378 | This publication
Strain, strain background (D. melanogaster) | gRNA#2PolG2 | 159675 | n/a | This publication
Strain, strain background (D. melanogaster) | HomeRPolG2 | 159676 | Gene drives cannot be deposited at BDSC | This publication
Strain, strain background (D. melanogaster) | HomeR(B)PolG2 | 159677 | Gene drives cannot be deposited at BDSC | This publication
Strain, strain background (D. melanogaster) | nos-Cas9 | 112685 | 79004 | 30622266
Strain, strain background (D. melanogaster) | vas-Cas9 | 112686 | 79005 | 30622266
Strain, strain background (D. melanogaster) | Ubiq-Cas9 | 112687 | 79006 | 30622266
Strain, strain background (D. melanogaster) | Act5C-Cas9 | n/a | 54590 | 25002478
Strain, strain background (D. melanogaster) | exuL-Cas9 | 159671 | 91375 | This publication
Strain, strain background (D. melanogaster) | Rcd1r-Cas9 | 159673 | 91377 | This publication
Strain, strain background (D. melanogaster) | bTub-Cas9 | 159672 | 91376 | This publication

Selection of Cas9/gRNA target sites

We inserted a HomeR in DNA Polymerase gamma subunit 2 (PolG2 or Pol-γ35, CG33650). PolG2 is an essential gene required for insect viability. The C-terminal domain of PolG2 is located at the end of the coding sequence, which facilitates its re-coding (Figure 1—figure supplement 1A–B). We PCR-amplified a 413-base fragment of the domain with 1073A.S1F and 1073A.S2R from multiple Drosophila strains (w1118, Canton S, Oregon R, nos-Cas9; Kandul et al., 2019) and used the consensus sequence along with the tool CHOPCHOP v2 (Labun et al., 2016) to choose two gRNA target sites that minimize off-target cleavage. In addition, we used the DGRP2 (http://dgrp2.gnets.ncsu.edu), which includes natural variation in genome architecture among 205 D. melanogaster genetic reference panel lines (Huang et al., 2014; Mackay et al., 2012), to explore SNPs found inside both gRNA target sequences.
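
The first step of target-site selection, listing candidate SpCas9 sites in a consensus sequence, can be sketched in a few lines of Python. This is only an illustration of scanning for 20-nt protospacers followed by an NGG PAM on either strand; it does not reproduce CHOPCHOP, which additionally scores candidates for off-target cleavage. The function names and test sequence are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_spcas9_sites(seq, protospacer_len=20):
    """Return (strand, protospacer, PAM) for every NGG-adjacent site on both strands."""
    seq = seq.upper()
    sites = []
    for strand, template in (("+", seq), ("-", reverse_complement(seq))):
        # A site needs protospacer_len bases plus a 3-base PAM.
        for start in range(len(template) - protospacer_len - 2):
            pam_start = start + protospacer_len
            pam = template[pam_start:pam_start + 3]
            if pam[1:] == "GG":
                sites.append((strand, template[start:pam_start], pam))
    return sites


# Toy sequence: one 20-nt protospacer followed by a TGG PAM.
print(find_spcas9_sites("A" * 20 + "TGG"))
```

Each candidate returned this way would still need to be checked against the genome for off-target matches and against population variation data for SNPs within the target.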

Design and assembly of genetic constructs

We used Gibson enzymatic assembly to build all genetic constructs (Gibson et al., 2009). To assemble both gRNA constructs, we used the previously described sgRNASxl plasmid (Kandul et al., 2019; Addgene #112688) harboring the mini-white gene and attB docking site. We removed the fragment encompassing the U6.3 promoter and gRNA scaffold by AscI and SacII digestion, and cloned it back as two fragments overlapping at a novel gRNA sequence (Figure 1—figure supplement 1A). Both gRNA#1PolG2 and gRNA#2PolG2 plasmids targeting PolG2 are deposited at http://www.addgene.org/ (#159674 and #159675).

We assembled two HomeRPolG2 constructs using two tested gRNAs (Figure 1—figure supplement 2A). Each HomeRPolG2 was built around a specific gRNA, with matching LHA and RHA: HomeRPolG2 harbored gRNA#1PolG2, and HomeR(B) had gRNA#2PolG2. We digested the nos-Cas9 plasmid (Kandul et al., 2019; Addgene #112685) with AvrII and AscI, preserving the backbone containing the piggyBac left and right sequences that encompass the Opie-dsRed-SV40 marker gene. The HomeR construct was assembled between Opie-dsRed-SV40 and piggyBacR in three steps. First, we cloned the gRNA#1 or #2 from the corresponding plasmid together with the 3xP3-eGFP-SV40 marker gene, to tag site-specific insertion of GDe. Then, we cloned three fragments: (i) LHA, which was amplified from the Drosophila genomic DNA; (ii) the re-coded fragment downstream from the gRNA cut site, which was PCR-amplified from the dePolG2 gBlock custom synthesized by IDT (Supplementary file 1); (iii) the p10 3’ UTR to provide robust expression (Pfeiffer et al., 2012) of the re-coded PolG2 rescue. Finally, we cloned RHA, which was PCR-amplified from genomic DNA, corresponding to each specific gRNA cut site. Importantly, the re-coding was carefully designed to ensure the translation of the re-coded DNA sequence into the wt amino acid sequence of PolG2 with respect to Drosophila codon usage bias. Both HomeRPolG2 and HomeR(B)PolG2 plasmids, targeting the PolG2 locus, are deposited at http://www.addgene.org/ (#159676 and #159677).
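
The key property of a re-coded rescue, that it differs from the wt DNA sequence while encoding the same amino acids, can be checked programmatically. The Python sketch below uses the standard genetic code and two short made-up sequences; it is not the dePolG2 gBlock, and it does not model Drosophila codon usage bias. All names are illustrative.

```python
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate an in-frame DNA sequence (length divisible by 3)."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def check_recoding(wt_dna, recoded_dna):
    """Return (same_protein, number_of_nucleotide_differences)."""
    same_protein = translate(wt_dna) == translate(recoded_dna)
    differences = sum(1 for x, y in zip(wt_dna.upper(), recoded_dna.upper()) if x != y)
    return same_protein, differences


# Hypothetical 4-codon example: both sequences encode M-A-L-K.
print(check_recoding("ATGGCTCTGAAA", "ATGGCCTTAAAG"))  # (True, 4)
```

Synonymous differences that fall inside the gRNA target and PAM are what make a re-coded rescue resistant to further cleavage, so a practical check would also confirm that the gRNA no longer matches the re-coded sequence.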

To assemble the three constructs for testis-specific Cas9 expression, we used a plasmid harboring the hSpCas9-T2A-GFP, the Opie2-dsRed transformation marker, and both piggyBac and attB-docking sites, which were previously used to establish Cas9 transgenic lines in Aedes aegypti (Li et al., 2017) and D. melanogaster (Kandul et al., 2020; Kandul et al., 2019). We removed the Ubiquitin 63E promoter from the ubiq-Cas9 plasmid (Addgene #112686) (Kandul et al., 2019) by digesting it with SwaI at +27°C and then with NotI at +37°C, and cloned a promoter fragment amplified from the Drosophila genomic DNA. The Drosophila exuperantia (CG8994) 783 bp fragment (exuL) upstream of the exuperantia gene was amplified with ExuL.1F and ExuL.2R primers (Supplementary file 1) and cloned to assemble the exuL-Cas9 plasmid. The Rcd-1 related (Rcd1r, CG9573; Chan et al., 2013) and β-Tubulin 85D (βTub; Chan et al., 2011; Michiels et al., 1989) promoters support early and late, respectively, testis-specific expression in Drosophila males. The 937-base-long fragment upstream of Rcd1r was amplified with 1095.C1F and 1095.C2R primers and cloned to assemble the Rcd1r-Cas9 plasmid. The 481-base-long fragment upstream of βTub was amplified with βTub.1F and βTub.2R primers (Supplementary file 1) and cloned to build the βTub-Cas9 plasmid. Three plasmids for testis-specific Cas9 expression are deposited at http://www.addgene.org/ (#159671–159673).

Fly maintenance and transgenesis

Flies were maintained under standard conditions: 26°C with a 12 hr/12 hr light/dark cycle. Embryo injections were performed by Rainbow Transgenic Flies, Inc. We used φC31-mediated integration (Groth et al., 2004) to insert the gRNA#1 and gRNA#2 constructs at the P{CaryP}attP1 site on the second chromosome (BDSC #8621), and the exuL-Cas9, βTub-Cas9, and Rcd1r-Cas9 constructs at the PBac{y+-attP-3B}KV00033 site on the third chromosome (BDSC #9750). Two methods were used to generate the site-specific insertion of HomeRPolG2 or HomeR(B)PolG2 constructs at the gRNA#1PolG2 or gRNA#2PolG2 cut sites, respectively, inside the PolG2 gene via HDR. First, we injected a mixture of the HomeR plasmid and the helper phsp-pBac plasmid, which carries the piggyBac transposase (Handler and Harrell, 1999), into w1118 embryos (500 and 250 ng/µl, respectively, in 30 µl). Flies carrying random insertions of HomeRPolG2 and HomeR(B)PolG2 from this injection, identified by double (eye-specific GFP and body-specific dsRed) fluorescence (Figure 1—figure supplement 2B), were genetically crossed to nos-Cas9/nos-Cas9 (BDSC #79004; Kandul et al., 2019) flies to ‘relocate’ HomeRPolG2 or HomeR(B)PolG2 to the corresponding gRNA cut site via HACK (Lin and Potter, 2016). A few site-specific PolG2HomeR and PolG2HomeR(B) lines tagged with only eye-specific GFP fluorescence were recovered. Second, we injected HomeRPolG2 or HomeR(B)PolG2 plasmids directly into nos-Cas9/nos-Cas9 (BDSC #79004; Kandul et al., 2019) embryos, generating multiple independent, site-specific insertions for each PolG2HomeR (Figure 1—figure supplement 2B). Recovered transgenic lines were balanced on the second and third chromosomes using single-chromosome balancer lines (w1118; CyO/snaSco for II and w1118; TM3, Sb1/TM6B, Tb1 for III) or a double-chromosome balancer line (w1118; CyO/Sp; Dr/TM6C, Sb1).
While both techniques (random insertion/HACK and HDR) worked to generate site-directed insertions, all subsequent analysis was performed exclusively on lines derived from the HDR-based transgenesis approach.

We established three homozygous lines of PolG2HomeR and PolG2HomeR(B) from independent insertion lines, and confirmed the precision of site-specific insertions by sequencing the borders between HomeR constructs and the Drosophila genome (Figure 1—figure supplement 2C). The 1118-base-long fragment overlapping the left border was PCR-amplified with 1076B.S9F and 1076B.S2R and was sequenced with 1076B.S3F and 1076B.S4R primers. The same-length fragment at the right border was amplified with 1073A.S1F and 1076B.S10R and was sequenced with 1076B.S7F and 1076B.S8R primers (Supplementary file 1).

Fly genetics and imaging

Flies were examined, scored, and imaged on a Leica M165FC fluorescent stereomicroscope equipped with a Leica DMC2900 camera. We assessed the transmission rate of HomeR by following its eye-specific GFP fluorescence, while the inheritance of Cas9 was tracked via body-specific dsRed fluorescence (Figure 2A). All genetic crosses were done in fly vials using groups of 10 males and 10 females.

gRNAPolG2 cleavage assay

To assess the cleavage efficiency of each gRNA targeting the C-terminal domain of PolG2, we genetically crossed 10 w1118; gRNA#1PolG2 or w1118; gRNA#2PolG2 homozygous males to 10 y1, Act5C-Cas9, w1118, Lig4 (Zhang et al., 2014) (BDSC #58492) homozygous females, and scored the lethality of F1 males (Figure 1—figure supplement 1C). F1 males inherit the X chromosome from their mothers and therefore express gRNA#1PolG2 or gRNA#2PolG2 together with Act5C-Cas9 in a Lig4-null genetic background, which results in male lethality when the tested gRNA directs cleavage of the PolG2 locus. To assess the induced lethality in the Lig4+/+ genetic background, we crossed 10 y1, Act5C-Cas9, w1118 (BDSC #54590; Port et al., 2014) flies to 10 U6.3-gRNA#1PolG2 flies in both directions, and scored survival of trans-heterozygous and heterozygous F1 progeny. To measure the Cas9/gRNA-directed cleavage of PolG2 by maternally deposited Cas9 protein in the Lig4+ background, the same homozygous males were genetically crossed to w1118/w1118; nos-Cas9/CyO females (Figure 1—figure supplement 1D), and the F1 progeny, harboring gRNA#1PolG2 or gRNA#2PolG2, were scored and compared to each other.

Assessment of PolG2HomeR transmission rates

To compare transmission rates of PolG2HomeR and PolG2HomeR(B), we first established trans-heterozygous parent flies by genetically crossing PolG2HomeR/PolG2HomeR; +/+ or PolG2HomeR(B)/PolG2HomeR(B); +/+ females to +/+; nos-Cas9/nos-Cas9 males. We then assessed the transmission rates by trans-heterozygous parent females and males crossed to wt flies. For controls, we estimated the transmission rates of HomeRPolG2 and HomeR(B)PolG2 in the absence of Cas9, by heterozygous PolG2HomeR/+ or PolG2HomeR(B)/+ females and males crossed to wt flies (Figure 1—figure supplement 3). To explore the effect of maternally deposited Cas9 protein on transmission of PolG2HomeR, we generated heterozygous PolG2HomeR/CyO embryos containing Cas9 protein deposited by nos-Cas9/CyO mothers and estimated the transmission of PolG2HomeR by females and males raised from these embryos and crossed to wt flies. We tested six different Cas9 lines, supporting germline (vas-Cas9), ubiquitous (ubiq-Cas9, Act5C-Cas9), and early (exuL-Cas9, Rcd1r-Cas9) or late testes-specific expression (βTub-Cas9), together with the strongest HomeR, PolG2HomeR. To control for position effect variegation, each Cas9 transgene was inserted at the same attP docking site on the third chromosome, except for Act5C-Cas9 that was integrated on the X chromosome (Port et al., 2014). Ten trans-heterozygous females or males, generated by crossing homozygous PolG2HomeR females to homozygous Cas9 males, were genetically crossed to wt flies and the transmission of PolG2HomeR was quantified in their F1 progeny (Figure 1D).
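The transmission rate scored in these crosses is the fraction of F1 progeny carrying the eye-specific GFP marker, compared against the 50% expected for a heterozygous parent under Mendelian inheritance. A minimal sketch of that calculation follows; the function name and the example counts are hypothetical and are not data from this study.

```python
MENDELIAN_EXPECTATION = 0.5  # heterozygous parent crossed to wt


def transmission_rate(gfp_positive: int, total_scored: int) -> float:
    """Fraction of scored F1 progeny that carry the eye-specific GFP marker."""
    if total_scored <= 0:
        raise ValueError("total_scored must be positive")
    if not 0 <= gfp_positive <= total_scored:
        raise ValueError("gfp_positive must lie between 0 and total_scored")
    return gfp_positive / total_scored


# Hypothetical counts, for illustration only.
rate = transmission_rate(gfp_positive=90, total_scored=100)
print(f"transmission = {rate:.1%}, above Mendelian: {rate > MENDELIAN_EXPECTATION}")
```

A rate above 0.5 alone does not distinguish homing from selective loss of non-carrier progeny, which is why the egg hatching and survival assays are scored separately.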

Egg hatching and egg-to-adult survival rates

To identify the mechanism of the super-Mendelian transmission of PolG2HomeR, we assessed the percentage of F1 hatched eggs laid by trans-heterozygous PolG2HomeR/+; nos-Cas9/+ females genetically crossed to wt males and compared it to those hatched from two types of heterozygous females: PolG2HomeR/+; +/+ ♀ and +/+; nos-Cas9/+ ♀ (Figure 1B). We collected virgin females and aged them for 3 days inside food vials supplemented with a yeast paste, then five groups of 25 virgin females of each type were transferred into vials with fresh food containing 25 wt males and allowed to mate overnight (12 hr) in the dark. Then, all males were removed from the vials, while females were transferred into small embryo collection cages (Genesee Scientific 59–100) with grape juice agar plates. After 12 hr of egg laying, a batch of at least 200 laid eggs was counted for each sample group and incubated for 24 hr at 26°C before the number of unhatched eggs was counted. To assess the egg-to-adult survival rate, at least 12 groups of 75 eggs were collected for each type of tested progeny and transferred to individual vials. The emerged flies from each vial were counted (Figure 1C), and their sex and fluorescence were scored (Supplementary file 7).

‘Fishing’ for functional resistant alleles, PolG2R1

To explore the generation and accumulation of functional resistant alleles induced by EJ, we initiated three populations by crossing 50 +/+; nos-Cas9/nos-Cas9 females and 50 PolG2HomeR/PolG2HomeR; nos-Cas9/nos-Cas9 (Figure 2A) males in 0.3 l plastic bottles (VWR Drosophila Bottle 75813–110). Parent (P) flies were removed after 6 days, and their progeny were allowed to develop, eclose, and mate for 13–15 days. This established a 100% heterozygous PolG2HomeR/+; nos-Cas9/nos-Cas9 population in every bottle at the next generation (G0). Each generation, around 250–350 emerged flies were anesthetized using CO2, and their genotypes with respect to PolG2HomeR (presence or absence) were determined using the dominant eye-specific GFP marker. Then they were transferred to a fresh bottle and allowed to lay eggs for 6 days before removing them, and the cycle was repeated. Three populations were maintained in this way for 11 generations, which corresponds to 10 generations of gene drive. Note that any fly scored without the PolG2HomeR allele was transferred into a fresh bottle to ensure any PolG2 resistant or wt alleles could be passed to the next generation. We retrieved and froze the flies for genotyping only after 6 days to ensure sufficient time for breeding. We expected that if PolG2R1 alleles were frequently generated and did not incur fitness costs, they would persist and accumulate over a few generations at the expense of PolG2HomeR. However, as we did not find any fly without the PolG2HomeR allele after G3, we stopped populations after 10 generations and froze 60 flies after G10 for further sequence analysis.

HomeR population drives in the Cas9 and wt genetic backgrounds

For HomeR drives in the nos-Cas9/nos-Cas9 genetic background, we seeded five experimental (Cas9+) drives and three control (Cas9–) drives with 50 homozygous PolG2HomeR/PolG2HomeR males and 50 wt males together with 100 wt virgin females in 0.3 l plastic bottles. Seeded flies either encoded Cas9 (experimental drive) or not (control or ‘no-drive’, Figure 2B). For HomeR drives in the wt genetic background, we seeded four drive populations with 100 wt virgin females and double homozygous (PolG2HomeR/PolG2HomeR; nos-Cas9/nos-Cas9) males mixed with wt males at the ratios of 1:1 (two populations with 25% of PolG2HomeR) or 3:1 (two populations with 37.5% of PolG2HomeR, Figure 2C). Note that the PolG2HomeR males were competing with wt males for female mates, and their mating competitiveness could be scored by the dominant 3xP3-GFP marker of PolG2HomeR in their progeny at generation 0 (G0). Both types of homozygous PolG2HomeR males with and without Cas9 were able to compete with the corresponding wt males for female mates resulting in the increase of PolG2HomeR from 25% or 37.5% in parents to nearly 50% or 70%, respectively, at generation 0 (Figure 2B–C). The discrete-generation populations were maintained and scored as described above. Each generation, around 250–350 emerged flies were anesthetized using CO2, and their genotypes were scored for the presence or absence of PolG2HomeR (eye-specific GFP) and nos-Cas9 (body-specific dsRed). Then they were transferred to a fresh bottle and allowed to lay eggs for 6 days before removing them, and the cycle was repeated.
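The seeding frequencies quoted above follow from simple allele counting. The sketch below reproduces that arithmetic, assuming 100 males in total per bottle (50:50 for the 1:1 ratio, and 75:25 for the 3:1 ratio, which is an assumption consistent with the stated 37.5%). The function names are hypothetical.

```python
from fractions import Fraction


def seeded_allele_frequency(homer_males: int, wt_males: int, wt_females: int) -> Fraction:
    """HomeR allele frequency among diploid founders when only the
    released males carry HomeR and they are homozygous for it."""
    total_alleles = 2 * (homer_males + wt_males + wt_females)
    return Fraction(2 * homer_males, total_alleles)


def expected_g0_carrier_frequency(homer_males: int, wt_males: int) -> Fraction:
    """Expected fraction of G0 progeny carrying HomeR if every male is
    equally likely to sire offspring; each offspring of a homozygous
    HomeR father and a wt mother is a heterozygous carrier."""
    return Fraction(homer_males, homer_males + wt_males)


for label, homer, wt in (("1:1", 50, 50), ("3:1", 75, 25)):
    allele = seeded_allele_frequency(homer, wt, wt_females=100)
    carriers = expected_g0_carrier_frequency(homer, wt)
    print(f"{label}: founder allele frequency {float(allele):.3f}, "
          f"expected G0 carriers {float(carriers):.2f}")
```

Under equal mating competitiveness the expected G0 carrier frequencies are 50% and 75%, which is the benchmark against which the observed values of nearly 50% and 70% can be read.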

Sequencing of induced resistant alleles

To analyze the molecular changes that caused functional in-frame (R1) and LOF (R2) resistant mutations in PolG2, we PCR-amplified the 232-base-long genomic region containing both gRNA#1PolG2 and gRNA#2PolG2 cut sites using 1073A.S3F and 1073A.S4R primers (Supplementary file 1). For PCR genotyping from a single fly, we followed the single-fly genomic DNA prep protocol (Kandul et al., 2019). PCR amplicons were purified using the QIAquick PCR purification kit (QIAGEN), subcloned into the pCR2.1-TOPO plasmid (Thermo Fisher), and at least seven clones were sequenced in both directions by Sanger sequencing at Retrogen and/or Genewiz to identify both alleles in each fly. Sequence AB1 files were aligned against the corresponding wt sequence of PolG2 in SnapGene 4.

To explore the diversity of resistant alleles persisting after 10 generations of PolG2HomeR in a 100% heterozygous population, we froze 60 flies (30 ♀ and 30 ♂), each harboring at least one copy of the dominant marker of PolG2HomeR, from each lineage after G10 (Figure 2A). Using these flies, we quantified any resistant and wt alleles remaining in the population via Illumina sequencing of heterogeneous PCR amplicons at the PolG2 locus. Note that PCR amplicons did not include the PolG2HomeR allele due to its length (Figure 1A). Additionally, this assay cannot accurately distinguish between germline and somatic mutations because whole flies were used. DNA was extracted using the DNeasy Blood and Tissue Kit (QIAGEN). To analyze heterogeneous PCR products, we used the Amplicon-EZ service by Genewiz and followed the Genewiz guidelines for sample preparation. In brief, Illumina adapters were added to the 1073A.S3F and 1073A.S4R primers to simplify the library preparation, PCR products were purified using QIAquick PCR purification kit (QIAGEN), around 50,000 one-direction reads covering the entire amplicon length were generated, and relative abundances of recovered SBS and indel alleles at the gRNA#1PolG2 cut site were inferred using Galaxy tools (Afgan et al., 2018). Amplicon-EZ data from Genewiz were first uploaded to Galaxy.org. A quality control was performed using FASTQC. Sequence data were then paired and aligned against the PolG2WT sequence using Map with BWA-MEM under ‘Simple Illumina mode’. The SBS and indel alleles were detected using FreeBayes, with the parameter selection level set to ‘simple diploid calling’.

Model fitting to cage experiment data

Empirical data from the HomeR population replacement experiments were used to parameterize a model of CRISPR-based homing gene drive including resistant allele formation. Model fitting was carried out for all five gene drive cage experiments using Markov chain Monte Carlo (MCMC) methods in which estimated parameters related to cleavage efficiencies in females and males, accurate HDR frequencies given cleavage in females and males, the proportion of resistant alleles that are in-frame and cost-free, and the fitness cost associated with having the HomeR system. We considered discrete generations, random mixing, and Mendelian inheritance rules at the gene drive locus, with the exception that for adults heterozygous for the homing allele (denoted by ‘H’) and wt allele (denoted by ‘W’), a proportion, c, of the W alleles are cleaved, while a proportion, 1-c, remain as W alleles. Of those that are cleaved, a proportion, pHDR, are subject to accurate HDR and become H alleles, while a proportion, (1-pHDR), become resistant alleles. Of those that become resistant alleles, a proportion, pRES, become in-frame, functional, cost-free resistant alleles (denoted by ‘R’), while the remainder, (1-pRES), become out-of-frame, non-functional, or otherwise costly resistant alleles (denoted by ‘B’). The values of c and pHDR were allowed to vary depending on whether the HW individual is female or male. The fitness cost associated with the HomeR system, sH,F, was assumed to be female-specific. These considerations allowed us to calculate expected genotype frequencies in the next generation, and to explore the parameter values that maximize the likelihood of the experimental data. The model fitting framework is described in full in S1 text of Pham et al., 2019.
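The allele conversion rules for HW heterozygotes can be written out as expected gamete proportions. The following is a minimal sketch under the rules stated above, ignoring fitness costs and sex-specific parameter values. The function name is hypothetical, and the example values are the "ideal" values used in the comparative modeling section (99% cleavage, 90% HDR, 0.5% in-frame resistance), not fitted estimates.

```python
def heterozygote_gametes(c: float, p_hdr: float, p_res: float) -> dict:
    """Expected allele proportions among gametes of an HW adult.

    Half of the gametes carry the original H allele. The other half carry
    the former W allele, which is cleaved with probability c and then
    becomes H (accurate HDR), R (in-frame resistant), or B (costly resistant).
    """
    for name, value in (("c", c), ("p_hdr", p_hdr), ("p_res", p_res)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be a proportion between 0 and 1")
    return {
        "W": 0.5 * (1 - c),
        "H": 0.5 + 0.5 * c * p_hdr,
        "R": 0.5 * c * (1 - p_hdr) * p_res,
        "B": 0.5 * c * (1 - p_hdr) * (1 - p_res),
    }


# Illustrative parameter values only (see lead-in).
gametes = heterozygote_gametes(c=0.99, p_hdr=0.90, p_res=0.005)
for allele, proportion in gametes.items():
    print(f"{allele}: {proportion:.6f}")
```

Setting c to 0 recovers Mendelian segregation (50% W, 50% H), which is a quick sanity check on the rules.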

Comparative modeling of gene drive systems

Comparative gene drive simulations were performed using a discrete-generation version of the Mosquito Gene Drive Explorer (MGDrivE) modeling framework (Sanchez et al., 2019). The first generation was seeded with 400 adults, 50% wt females, 25% wt males, and 25% homozygous gene drive males. At each generation, adult females mate with males, thereby obtaining a composite mated genotype (their own, and that of their mate) with mate choice following a multinomial distribution determined by adult male genotype frequencies. Egg production by mated adult females then follows a Poisson distribution, proportional to the genotype-specific lifetime fecundity of the adult female. Offspring genotype follows a multinomial distribution informed by the composite mated female genotype and the inheritance pattern of the gene drive system. Sex distribution of offspring follows a binomial distribution, assuming equal probability for each sex. Female and male adults from each generation are then sampled equally to seed the next generation, with a sample size of 400 individuals (200 female and 200 male), following a multivariate hypergeometric distribution. Twenty-five repetitions were run for each drive in the trace plots (Figure 3A and C), and 100 repetitions were run for each parameter combination in the heatmaps (Figure 3B and D).
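The sequence of sampling steps in one simulated generation can be sketched as follows. This is not the MGDrivE code, which is written in R. It is a standard-library illustration that assumes a single locus with plain Mendelian inheritance and one constant mean egg number for all females; the `mean_eggs` value and all function names are hypothetical.

```python
import math
import random
from collections import Counter


def poisson(rng: random.Random, lam: float) -> int:
    """Poisson draw by Knuth's method (adequate for small means)."""
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def mendelian_offspring(mother: str, father: str) -> Counter:
    """Placeholder inheritance rule: genotypes are two-letter allele strings."""
    dist = Counter()
    for a in mother:
        for b in father:
            dist["".join(sorted(a + b))] += 0.25
    return dist


def next_generation(females, males, rng, mean_eggs=20, n_per_sex=200):
    daughters, sons = [], []
    for mother in females:
        # Mate choice: one draw in proportion to male genotype frequencies.
        father = rng.choice(males)
        dist = mendelian_offspring(mother, father)
        genotypes, weights = zip(*dist.items())
        for _ in range(poisson(rng, mean_eggs)):          # egg production
            child = rng.choices(genotypes, weights=weights)[0]
            # Sex assignment with equal probability for each sex.
            (daughters if rng.random() < 0.5 else sons).append(child)
    # Sampling without replacement is a multivariate hypergeometric draw.
    return (rng.sample(daughters, min(n_per_sex, len(daughters))),
            rng.sample(sons, min(n_per_sex, len(sons))))


rng = random.Random(1)
founders_f = ["WW"] * 200                  # 50% wt females
founders_m = ["WW"] * 100 + ["HH"] * 100   # 25% wt males, 25% homozygous drive males
new_f, new_m = next_generation(founders_f, founders_m, rng)
print(Counter(new_f + new_m))
```

Replacing `mendelian_offspring` with a drive-specific rule, and `mean_eggs` with a genotype-specific fecundity, would be the natural extension points.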

The inheritance pattern is captured by the ‘inheritance cube’ module of MGDrivE (Sanchez et al., 2019). ClvR and TARE constructs were implemented to match their published descriptions (Champer et al., 2020a; Oberhofer et al., 2020a; Oberhofer et al., 2019). HomeR and HGD were implemented as one- or two-locus systems following equivalent inheritance rules. When Cas9 and gRNAs co-occur in the same individual, wt alleles are cleaved at a rate cF (cM) (female- (male-) specific cleavage), with 1-cF (1-cM) remaining wt. Given cleavage, successful HDR occurs at a rate chF (chM), with 1-chF (1-chM) alleles undergoing some form of EJ. Of these, a proportion, crF (crM), are in-frame EJ alleles, while the remainder, 1-crF (1-crM), are LOF alleles. Maternal carryover (maternal deposition, or maternal perdurance) was modeled to occur in zygotes of mothers having both Cas9 and gRNAs, impacting a proportion, dF, of zygotes. Of the wt alleles in impacted zygotes, a proportion, drF, become in-frame EJ alleles, while the remainder, 1-drF, become LOF alleles. These inheritance rules apply to both HomeR and HGD, with differing fitness costs.

ClvR (Oberhofer et al., 2020a; Oberhofer et al., 2019) was modeled using a 99% cleavage rate in female and male germ cells, as well as in embryos from maternal carryover. For two-locus ClvR, the two loci were assumed to undergo independent assortment (≥50 cM separation), as was assumed for all two-locus systems in this analysis. For both configurations, it was assumed that 0.1% of cleaved alleles were converted to functional resistant alleles (R1 type), and the rest became LOF alleles (R2 type). In addition to the 50% egg-hatching reduction due to the non-homing drive (Figure 3—figure supplement 1A–B), an additional 5% reduction in fecundity was applied to females that harbored Cas9. For consistency, TARE, HGD, and HomeR (for ideal parameters) also used a cleavage rate of 99% in females and males, though TARE demonstrated lower maternal carryover (Champer et al., 2020a), and was modeled with 95% cleavage. HGD and HomeR (for ideal parameters), which rely on HDR, were simulated with 90% HDR rates in females and males. Cleaved alleles that did not undergo HDR were assumed to be R1 alleles with proportion 0.5%, and R2 LOF alleles the remainder of the time. TARE and HomeR were also modeled with a small (5%) fitness reduction, applied as a reduction of female fecundity. Since an HGD does not provide a rescue for a disrupted target gene, its carriers demonstrate higher fitness costs and were assigned a 20% fitness reduction with the assumption that the HGD is inserting into a non-lethal gene that imposes a low/moderate fitness cost. Experimentally derived parameters for HomeR differed from ideal parameters in two ways: (i) there was no HDR in males (although cleavage remained the same) and (ii) 1% of EJ-repaired wt alleles were converted into R1 alleles (cf. 0.5% for the ideal case).

To determine the number of releases required to introgress effector genes into 95% of the female portion of a population, and ascertain how long that introgression could be effective (up to 4 years; Figure 4), we performed simulations using the full version of MGDrivE (Sanchez et al., 2019), implementing overlapping generations and density-dependent growth effects on aquatic stages. All gene drive characteristics were maintained as stated above. For one-locus designs, male releases, up to 13, were performed at 20% of the total population size (10,000). For two-locus designs, the first release was males homozygous for Cas9 and gene drive, but subsequent releases were only homozygous for Cas9. The two-locus releases were 20% of the total population size. Life cycle parameters are: 2 days for egg maturation, 5 days for larval maturation, 1 day for pupal maturation, and an expected adult lifespan of 11 days (Li et al., 2020). All simulations were performed, analyzed, and plotted in R (R Development Core Team, 2017). Code is available upon request.
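The release and life cycle settings above can be collected in a small configuration object, which makes derived quantities such as the number of males per release explicit. This is a sketch only: the class and field names are hypothetical, and it assumes "20% of the total population size" means 20% of 10,000 individuals per release.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReleaseScenario:
    population_size: int = 10_000
    release_percent: int = 20       # each release, as a percentage of population size
    max_releases: int = 13          # upper bound for one-locus designs
    egg_days: int = 2
    larva_days: int = 5
    pupa_days: int = 1
    expected_adult_days: int = 11

    @property
    def males_per_release(self) -> int:
        return self.population_size * self.release_percent // 100

    @property
    def aquatic_days(self) -> int:
        """Total time spent in the egg, larval, and pupal stages."""
        return self.egg_days + self.larva_days + self.pupa_days


scenario = ReleaseScenario()
print(scenario.males_per_release, scenario.aquatic_days)
```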

Statistical analysis

Statistical analysis was performed in JMP 8.0.2 by SAS Institute Inc, and graphs were constructed in Prism 8.4.1 for MacOS by GraphPad Software LLC. At least three biological replicates were used to generate statistical means for comparison. p-values were calculated using a two-sample Student’s t test with equal or unequal variance.

Gene drive safety measures

All gene drive crosses were performed in accordance with protocols approved by the Institutional Biosafety Committee at UCSD, in which gene drive experiments were performed in a high-security ACL2 barrier facility in plastic vials that were autoclaved prior to being discarded, in accordance with currently suggested guidelines for the laboratory confinement of gene drive systems (Akbari et al., 2015; National Academies of Sciences, Engineering, and Medicine et al., 2016).

Ethical conduct of research

We have complied with all relevant ethical regulations for animal testing and research and conformed to the UCSD institutionally approved biological use authorization protocol (BUA #R2401).

Data availability

All data are represented fully within the tables and figures. The gRNA#1PolG2, gRNA#2PolG2, HomeRPolG2, HomeR(B)PolG2, exuL-Cas9, Rcd1r-Cas9, and βTub-Cas9 plasmids and corresponding fly lines are deposited at http://www.addgene.org/ (159671-159677) and the Bloomington Drosophila Stock Center (91375-91378), respectively.

Decision letter

  1. Claude Desplan
    Reviewing Editor; New York University, United States
  2. Patricia J Wittkopp
    Senior Editor; University of Michigan, United States

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Acceptance summary:

The paper describes and tests a gene drive system by inserting a synthetic rescue gene into an essential gene that also contains the gRNA (gene drive) but not the Cas9 (safe split gene drive). Although there are clearly some limitations to the effectiveness of the drive, to a large extent due to the fitness cost of Cas9, this new strategy will be a useful path to follow in order to thwart evolution of resistance to the drive. This will move the field forward in getting researchers to broaden their efforts in development of gene drives for specific purposes.

Decision letter after peer review:

[Editors’ note: the authors submitted for reconsideration following the decision after peer review. What follows is the decision letter after the first round of review.]

Thank you for submitting your work entitled "A home and rescue gene drive efficiently spreads and persists in populations" for consideration by eLife. Your article has been reviewed by a Senior Editor, a Reviewing Editor, and four reviewers. The following individual involved in review of your submission has agreed to reveal their identity: Ernst A Wimmer (Reviewer #2).

Our decision has been reached after consultation between the reviewers. Based on these discussions and the individual reviews below, we feel that the paper is of significant conceptual and practical interest but it is lacking in several points that should absolutely be addressed before the paper can be considered. We therefore regret to inform you that your work will not be considered for publication in eLife, but we strongly encourage you to submit a new manuscript once these points have been addressed carefully, which should take a significant amount of time. I wish to emphasize that the reviewers would like this paper to be published in eLife but it needs this additional work. The reviewers all also strongly believe that the presentation of the paper needs to be seriously fixed to make it simpler and targeted to a wide audience, which will be very interested in these results.

The following points must be addressed as they are the two main points the manuscript is trying to make but they are not well supported:

– The major claim of the paper, the release of the drive into a wild type population with no Cas9 must be investigated as there is no practical demonstration of a split drive.

– There is also no final demonstration of the repression of R1 alleles: The strategy for sampling resistance appears to not be appropriate and must be changed as suggested by reviewer #4.

– The paper claims to have used an ultra conserved gene. As so much importance is placed on the drive design overcoming resistance, this must be demonstrated. A gene that is considered as ultra conserved would require that there is little or no nucleotide variation (e.g. dsx in Kyrou et al.). Conservation at the protein level does little to protect against synonymous mutants that would constitute resistance. A better justification for this choice and a demonstration that it was indeed the right choice must be presented.

– The data are really presented in a very difficult way and the paper must be extensively revised with a much broader audience in mind.

– The reviewers did a very thorough review of the paper that should help you to improve it. The individual reviews are included below.

Reviewer #1:

This paper shows that Cas9-mediated homology-directed repair can be used to insert a synthetic rescue gene into an essential gene; here mitochondrial Pol-γ35 was chosen. The insertion is marked by an eyeless-GFP reporter and also contains the gRNA (gene drive) but not the Cas9 (considered as a safe split gene drive). 'Homing' of the eye-GFP is assayed to detect insertion at the homologous locus when Cas9 is present by HDR.

The authors show that this works well in the female germline with various tested Cas9 lines (vas, nos, Act5C and ubiq-Cas9). In all cases close to 100% transmission to the homologous locus on the homologous chromosome is achieved when an effective guide RNA is used. Hence, eye GFP transmits (“homes”) in a “super-Mendelian” ratio at the chosen target. A male specific transmission works less well (exuL-Cas9).

The reason why it works well appears to be that the chosen target is an essential gene (Pol-γ35) in which small changes caused by NHEJ that result in homing “resistant” alleles will be loss of function alleles and hence will not spread in the population.

Unfortunately, the authors did not test, how the drive could spread in a wild type population (no Cas9 expression). I am also missing a test relevant for pest studies that would achieve the spread of a potentially deleterious or beneficial insertion that could kill a population or make it resistant to a disease.

1) This paper is very hard to read. Sentences are excessively long and complicated. References to the Figures appear not always correct.

2) Figure 1. Genotypes in Figure 1A are unreadable in the print version because of the small font. Are the 2 crossing schemes required that only differ in gRNA1 or gRNA2? The surviving progeny should be quantified as in Figure 1B. Figure 1B shows nos-Cas9 and not act-Cas9 results (several typos in subsection “Design and testing of gRNAs targeting an essential gene”).

Figure 1C: the incidence of heterozygous, homozygous and “resistant” cells is schematic and not supported by data, hence questionable if Figure 1C should be shown in results.

3) Figure 2. Genotypes not readable in print. Is it necessary to show schemes of the procedure how transgenic flies were generated and how the Pol-γ 35 HomeR were made with all chromosomes detailed (Figure 1D)? This could move to the Materials and methods, as it is standard and we learn not much new.

4) More typos: Subsection “Assessment of germline transmission and cleavage rates”: Figure 2B is the wrong reference; Actic 5C should read Actin 5C. Figure 4B GGG codes for Gly (not Gla). sixth paragraph of the Discussion should refer to Figure 6?

5) Figure 5 – as Figure 1 Figure 2, only readable on the computer.

6) It would be interesting to see how the gene drive would spread if HomeR and Cas9 were introduced in a competitive way into wild type populations. This is similar to Figure 4C, but only the HomeR males or females would carry the Cas9. This would be a more realistic test of how the gene drive could spread in a wild population that obviously does not express Cas9.

Reviewer #2:

Kandul et al. present an interesting study that could lead to important improvements on the use of homing-based gene drives. However, before publication can be supported there are a number of things that should be addressed to improve the manuscript for better comprehension by readers.

Overall the manuscript presents a load of data. But the presentation of these data could be made more digestible. The authors should go over their manuscript with a reader in mind who is interested but does not necessarily know all the relevant literature in detail.

Abstract: Please remove "inherently confinable" from the Abstract. The drive is indeed designed as a split drive; however, all the experiments were done in a homozygous Cas9 background. Therefore, there are no experimental data for a split drive provided in this manuscript. The split situation seems to be here more of a practical reason to be allowed to do the experiments in a less stringent laboratory environment. Thus there are no experimental data that would support the confinable nature of this drive. Actually there are not even modelling data for this. Thus, such a statement should not be put in the Abstract. This manuscript is not a demonstration of a confinable drive.

Results: How was Pol-γ35 identified? It would be interesting to the reader to get to know about the exact reasoning, why this gene was chosen. Or were there several ones chosen before and this turned out to work the best or was the easiest to design. This could be very interesting considerations important to the field.

Results (Figure 1B; Figure 1C) and Materials and methods and Figure 1 (both Figure and legend):

The references to the Figure panels and the accompanying text don't match. Has there been a rearrangement of the Figure that was not carried through the text?

When referring to "B" in the text, it is still about Act5C-Cas9, and the nos-Cas9 data are referred to in the text as Figure 1C. But Figure 1C is LBM.

In current panel Figure 1B, what does "all" mean below the X-axis? This is not comprehensible. Panel C is not really described in the Figure legend.

Results, Discussion, and Figure 1—figure supplement 4 legend. "converting recessive non-functional resistant alleles into dominant deleterious/lethal mutations" is completely misleading. There is no "conversion", and how should that be done molecularly? There is a continuous removal of such alleles from the population because of lethal transheterozygous conditions caused in the drive. However, there is no active conversion of such alleles into dominant lethal ones. This needs to be clearly rewritten to avoid the misleading idea.

Figure 1—figure supplement 4 also seems to have a slight conceptual problem. What are "cells" (rectangles) with a red frame and a green core? Green means at least one wt allele (this must include the recoded rescue allele). Red means biallelic knock-out: thus a red cell cannot have a wt allele. Thus what is a red-framed green core cell?

To explain the removal of R2 alleles, a depiction of yellow framed red core cells in the germ line would be helpful, since this would explain how R2 alleles are selected against and might be continuously removed from the population.

Results: Before going into the modelling, the reader should be clearly informed about all the different approaches that are now to be compared. This is currently not done well, if at all. Thus moving current Figure 6 before current Figure 5 might clearly help. Also a better explanation of the panels in Figure 6 is necessary as well as a correction of Figure 6 Panel E.

A comparison of a great number of the currently approached toxin-antidote (gene destruction – rescue, but not killer-rescue) systems is greatly appreciated. However, the authors cannot expect the general reader to know about the small detailed differences between the systems that are compared here. Thus the authors need to do some explanation and categorization of the different approaches here and also cite all the respective literature.

– First subdivision: Non-homing (interference-based drives) VERSUS Homing (thus overreplication-based drives). This will also help then to better understand, why the interference-based drives (TARE and ClvR) are more sensitive to fitness parameters than overreplication drives.

– Second subdivision: same-site VERSUS distant site. This is important to understand the difference between the here modelled TARE and the ClvR. Actually ClvR is a TARE, but you use TARE here more specifically as the results in the respective paper are demonstrating only a same-site TARE. But this needs to be clearly stated here.

– Third subdivision: viable VERSUS haplosufficient VERSUS haploinsufficient. This also needs to be clearly depicted in labelling panels C to F of Figure 6, where it is currently hard to grasp what the essential differences are before looking at the panels in detail:

C: HGD of viable gene (HGD)

D: HGD of viable gene with rescue (HGD+R)

E: HGD of haploinsufficient gene with rescue (HGD-hi+R). This panel needs major correction.

F: HGD of haplosufficient (essential) gene with rescue (HomeR)

– Fourth subdivision: split VERSUS non-split. Here for the split HGD situation, the respective papers of which the current authors are co-authors should be cited: Kandul et al., 2020 and Li et al., 2020. In addition, it is also important to state clearly that "split or two locus" is completely independent of the "distant site" concept.

The reader needs to understand the differences between the systems that are compared here, without having to go to the respective publications and then try to find out what the differences really are. This is not so obvious, and the current authors have a clear chance here to help the reader in the midst of all these similar but still distinct approaches.

Figure 6 Panel E: This depiction is not consistent within itself, not consistent with the legend, and not consistent with the cited literature.

– Why should the rescuing drive construct over the wt allele be lethal as indicated in the right two boxes?

– The cited paper Champer et al., 2020b clearly states that there is maternal carryover, which actually makes it so hard to use and is probably only working via male propagation. In the Figure legend it is said that "maternal carryover and somatic expression … are empirically unavoidable", which is in contrast to the depiction. The legend then also states that this is "unachievable". This should be replaced by "hard to achieve", since the approach is published and seems to drive, even though probably just via the males. Thus the depiction of panel E needs to be thoroughly revised.

Discussion: The haplolethal HGD works (admittedly poorly) despite the maternal carryover (Champer et al., 2020b). Therefore, your statement needs to be refined or deleted: "requires germline-specific promoter that lacks maternal carryover" is not consistent with the published paper. The drive could go via the males because then you do not have maternal carryover. And homing-based drives can go via males and do not necessarily have to be promoted through females, see also KaramiNejadRanjbar et al., 2018.

Discussion. This sentence is based on an old but clearly overruled idea. NHEJ repair is not restricted to a time before the fusion of the paternal and maternal genetic material. It has been clearly demonstrated that R1 and R2 alleles are generated in the early embryo also after the zygote state (Champer et al., 2017, KaramiNejadRanjbar et al., 2018). Actually, all of the authors' Figure 1C and Figure 1—figure supplement 4 are about NHEJ mutation in the early embryo causing "LBM". Thus this sentence is inconsistent with current beliefs and also with the authors' own writing.

Figure 4: Panel C graph: Why, in the controls, is the transgene consistently and significantly more frequently inherited in the next generation (0)? About 75% of the progeny appear to be sired by the transgenic fathers rather than the wild-type fathers. Was there an age advantage of the transgenic ones or some other fitness factor? This is surprising and no explanation is given at all.

In contrast, in the Cas9 background, in generation 0 less than 50% carry the drive allele, which is probably due to induced lethality. But this should also be commented on.

In the legend it is stated that 7 of 9 flies carried an R1 allele heterozygous to an R2 allele. What about the other two?

Reviewer #3:

The authors are to be commended for the effort put into careful experimental design and clear presentation of methods and results.

My main concern with the manuscript is that the claim about their specific polymerase gene being "ultraconserved" is not backed up with their own data or by citations from the literature. If the gene sequence was ultra-conserved, I wouldn't have expected the authors to be able to do so much recoding of the gene without fitness consequences. Furthermore, it is clear that homozygous-viable NHEJ mutations did develop in the experiment. Without explanation, this seems to be a fatal flaw in the design.

This manuscript describes a modification of the general homing gene drive concept by use of a split drive system that increases the frequency of a recoded polymerase gene that replaces a cleavage susceptible, naturally occurring, haplosufficient, conserved polymerase gene. This approach is taken in order to limit the evolution of cleavage resistance in the naturally occurring gene.

As mentioned in the summary, I am not convinced that the research presented achieves the intended goals. I did a quick look for literature on the "ultraconserved" polymerase pol-y35 gene and could find none. I am not sure if the conservation is at the DNA sequence level or at the amino acid level. If at the amino acid level, then it makes sense that resistance alleles can form at the DNA level that don't impact the protein at all. Figure 2A shows the 22 and 27 recoded nucleotides for the two guide RNA sites. The authors say that these changes to the sequences didn't seem to impede fitness. Did the authors try many other recodings and finally decide on these because all others caused loss of fitness, or is it just that this gene is robust to substitutions even though the protein is conserved?

Figure 4C shows that the frequency of flies with at least one copy of the pol-y35home R1 increased from about 25% to about 50% between the parental and F0 generation when there was no Cas9 present. As long as the transgenic males were competitive with the wild flies this makes sense, because the released flies were homozygous for that allele and their offspring should all have inherited one copy of the gene. What doesn't make sense is that when the work was done with all flies harboring the Cas9, the frequency of flies with the pol-y35home R1 increased less from the parental generation to generation F0 than in the former case. In some replicates the frequency of such flies didn't increase at all. It should be noted that the parents were always homozygous. This certainly indicates a fitness cost to the flies with a combination of Cas9 and the homing construct.
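The Mendelian expectation behind the 25% to 50% step can be written out as a minimal sketch. It assumes, as an illustration only, that the released homozygous males make up 25% of the whole parental population, that the sex ratio is 1:1, and that mating is random with equal male competitiveness; the function name and parameters are hypothetical and do not come from the manuscript.

```python
def expected_f0_carrier_fraction(released_share_of_population, male_share=0.5):
    """Expected fraction of F0 offspring with at least one transgene copy.

    Assumptions (illustrative, not from the manuscript): random mating,
    equal mating success of all males, and every released male is
    homozygous for the transgene.
    """
    transgenic_share_of_males = released_share_of_population / male_share
    if not 0.0 <= transgenic_share_of_males <= 1.0:
        raise ValueError("released males cannot exceed the male share")
    # Each offspring of a homozygous father inherits exactly one copy,
    # so the carrier fraction equals the transgenic fathers' share of matings.
    return transgenic_share_of_males


print(expected_f0_carrier_fraction(0.25))
```

Under these assumptions an F0 carrier frequency below this value points to reduced mating success or viability, and a value above it points to some advantage of the released males or their progeny.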

In this same Figure, results from the model are plotted. It seems like the model assumes no fitness cost because it shows an exact increase from 25% to 50% flies carrying at least one copy of the pol-y35home R1 theoretical construct. In later generations the experimental results outperform the model. Presumably, this model is used to construct Figure 6. This mismatch needs to be addressed in the manuscript.

The fact that in all three replicates of the experiment without Cas9, the F0 is above 50% indicates that something else may be going on that is unrelated to gene drive. It could be due to heterosis between the two slightly different strains of flies. When wildtype males mate with wildtype females, the offspring are more inbred than when a transgenic male mates with a wildtype female. This is just a hypothesis.

Reviewer #4:

Gene drives can be used for sustainable control of disease vectors, and there is a need for different gene drive strategies that can be tailored to the particular species, timescale, and desired spatial spread. Kandul and colleagues present a welcome new addition to the growing number of strategies for gene drive, called HomeR, that combines elements of killer-rescue and homing-based drive to exert spatiotemporal control over its spread, whilst counteracting the rise of resistant mutations. Whilst it is extremely promising, some major claims of this manuscript are inaccurate or unsupported by the evidence. The authors could easily address the most important concerns by expanding their sequencing analysis to better detect and quantify resistant mutations, paying careful attention not to overstress the potential of this drive to mitigate resistance, and by comparing the relative strengths of different drive strategies instead of focussing only on features that are most flattering to the HomeR strategy.

1) The drive release strategies of Figure 4A and 4C are primed to underestimate and potentially mask resistance. In Figure 4A, where the authors search for signs of resistance, the population was seeded with males that were all homozygous for the drive, meaning that 100% of their G0 progeny will inherit it. As the rate of homing is close to 99%, only a small fraction of their G1 could have inherited a non-drive (potentially resistant) allele. In a realistic release scenario, resistant alleles will have ample opportunity to be generated and subsequently selected. Though still far from adequate, resistance testing would have been better performed on samples collected from the lower frequency releases in panel C. This experiment should not be used to draw strong conclusions about resistance to pHomeR, but should be used to make broader observations regarding the spread and stability of the construct.
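To make the "small fraction" in point 1 concrete, here is a minimal sketch of the standard relation between conversion efficiency and transmission in a drive heterozygote. It assumes the 99% figure is the per-allele conversion (homing) efficiency; if it is instead the observed transmission rate, the non-drive fraction is simply 1%. The function names are hypothetical.

```python
def non_drive_gamete_fraction(conversion_efficiency):
    """Fraction of a drive/wild-type heterozygote's gametes lacking the drive.

    Half of the gametes carry the drive by Mendelian segregation; the other
    half carry the target allele, which escapes conversion with probability
    (1 - conversion_efficiency). Escaped alleles may be wild type or resistant.
    """
    if not 0.0 <= conversion_efficiency <= 1.0:
        raise ValueError("conversion_efficiency must be between 0 and 1")
    return 0.5 * (1.0 - conversion_efficiency)


def drive_transmission_rate(conversion_efficiency):
    """Fraction of gametes that carry the drive allele."""
    return 1.0 - non_drive_gamete_fraction(conversion_efficiency)


# With an assumed 99% conversion efficiency, roughly 0.5% of gametes
# carry a non-drive allele.
print(non_drive_gamete_fraction(0.99))
```

Either reading supports the reviewer's point: at most about one G1 allele in a hundred could be a non-drive allele, so resistance has little opportunity to be seen in this release design.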

2) The strategy for sampling resistance will obscure almost all resistance in the population, and would fail to detect even a strong selection for it. Flies were only selected for resistance genotyping if they lacked GFP, meaning they carry two non-HomeR alleles (i.e. homozygous for the R1 allele or transheterozygous with another R1/R2/WT). One would expect most resistant alleles to be heterozygous in a population that was seeded with almost complete drive homozygosity. The authors could, and should, have done more to identify and quantify these. Amplicon sequencing was used to sample the full diversity of alleles in a larger pool of individuals (including GFP+ flies) collected at G10, why was this approach not used throughout? By adopting the approach earlier they would have been able to track the changing frequencies of R1 and R2 alleles over time.

3) The impression given in the Figure and main text is that R1 alleles were rare (or entirely absent), when they were not. In spite of the incredible advantage given to the drive, and a bias in sampling method that would mask the presence of resistant alleles, resistance was observed in every generation tested (G2, G3 and G10). The authors claim that because GFP- individuals were not observed in later generations, the resistant alleles had not come under positive selection. This logic is flawed, and indeed their own amplicon sequencing analysis performed on G10 flies revealed several resistant alleles, including an R1 present in 80% of non-drive alleles. The two most frequent mutant alleles detected were in frame, and I do not agree that these are likely to be deleterious recessive (as the authors speculated). These could be functionally resistant mutations. I believe there were many more R1 alleles in heterozygosity with the HomeR allele, these alleles could have been spreading, but were excluded from the genotyping analysis. Could these putative R1 individuals not have been specifically tested to see if they do, or do not confer resistance?

4) The modelling takes a very limited approach to comparing different drive strategies, and by comparing proof-of-principle designs, important differences are obscured. For example, simple modifications that would mitigate resistance are likely to be included in many designs – such as multiplexing gRNAs. The nuances of each design are lost in a discussion focused on the rate of spread, which is largely irrelevant now because all of the drives are predicted to spread well.

5) The authors did not discuss the relevance of having performed releases in a population that was already homozygous for Cas9. Do the release experiments and model really suggest the drive could spread if released into an otherwise WT population? I'm not sure the data presented in this manuscript can support that claim.

[Editors’ note: further revisions were suggested prior to acceptance, as described below.]

Thank you for submitting your revised article "A home and rescue gene drive efficiently spreads and persists in populations" for consideration by eLife. Your article has been reviewed by Patricia Wittkopp as the Senior Editor, a Reviewing Editor, and two reviewers. The reviewers have opted to remain anonymous.

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

We would like to draw your attention to changes in our policy on revisions we have made in response to COVID-19 (https://elifesciences.org/articles/57162). Specifically, when editors judge that a submitted work as a whole belongs in eLife but that some conclusions require a modest amount of additional new data, as they do with your paper, we are asking that the manuscript be revised to either limit claims to those supported by data in hand, or to explicitly state that the relevant conclusions require additional supporting data.

Our expectation is that the authors will eventually carry out the additional experiments and report on how they affect the relevant conclusions either in a preprint on bioRxiv or medRxiv, or if appropriate, as a Research Advance in eLife, either of which would be linked to the original paper.

This paper describes a new gene drive system that appears to have some advantage over existing systems. It shows that Cas9-mediated homology-directed repair can be used to insert a synthetic rescue gene into an essential gene. Although there are clearly some limitations to the effectiveness of the drive, to a large extent due to the fitness cost of Cas9, this new strategy will be a useful path to follow in order to thwart evolution of resistance to the drive.

Essential revisions:

The reviewers appreciated the huge efforts you made to address their initial concerns, and in particular the release into Cas9-negative populations, even though the fitness cost of Cas9 is a real issue that limits the applicability of the approach. We therefore ask you to strongly decrease your claims to better reflect the results described, and in particular to change the title of the paper. You might want to mention that mosquitoes could be better suited for this type of approach than Drosophila. I therefore expect to see an amended manuscript in the very near future where, we sincerely hope, you will have represented the results without unnecessary hype.

Reviewer #1:

I commend the authors for conducting additional experiments that enable assessment of the drive dynamics of their strategy under conditions when the split drive is introduced into a lab population without Cas9 and testing the Cas9 independently. The finding that the Cas9 has a fitness cost explains some of the previous results. This must have been a lot of work, but I think it was worth the effort.

I appreciate that the authors have removed the term "ultra-conserved", but I am still not comfortable with their use of the term "conserved" in relationship to the focus of the manuscript on gene drive. It's not just the term, but the expectation that this will be a stable drive system. Even with the small sample size in the current laboratory experiments (compared to what would be expected for the size of the target population in a field release) mutations arose that seemed to have no fitness consequence in males even as single copies with an LOF copy. Isn't it therefore reasonable to expect mutations to arise due to NHEJ that wouldn't have fitness effects on males and females? Beyond that, wouldn't a natural population be expected to already harbor some genotypes that would be immediately resistant to this drive? The authors should clearly address why they don't expect this problem with using their design outside of the lab.

In Figure 1—figure supplement 4, the authors show amino acid sequences that appear to be consensus sequences. What is important for this paper is understanding how much variation exists in the DNA sequence for the 3' end part of the domain of the gene for D. melanogaster and other potential targets of gene drive. At least for D. melanogaster there are many sequences available. Such data may also be available for some other pest insects. Before this paper is accepted, I think it behooves the authors to provide information on this issue that could predict whether this drive would really thwart resistance evolution.

I commend the authors for having done quite a bit of work to simplify the presentation, although, as they say, there is a limit to how much simplification can be done.

Reviewer #2:

1) The main point from my last review concerned the significance of this study. I suggested testing whether this gene drive can spread in wild-type populations not expressing Cas9. The authors have now included data that test this and find that the gene drive does not spread, possibly because expressing Cas9 comes at a cost to fitness.

Considering this negative result, I see limited impact of the presented data, as the method does not work in wild populations. This new result contrasts with what the authors state at the end of their abstract, that HomeR would work for wild populations. Hence, I feel this paper would be more suitable for a specialized journal. However, I am not a population geneticist. I leave this issue of impact to the other reviewers/editor. I am also not able to judge the usefulness and accuracy of the new simulations presented in Figures 3 and 4, which compare HomeR to other methods without any experiments.

2) I appreciate that the authors tried to make this paper more readable. However, I feel there is still a long way to go. Several sentences are still excessively long. The second sentence in the introduction extends across 9 lines. The fourth-last sentence of the introduction extends across 11 lines. Many non-standard abbreviations are used throughout the paper (HG, LBM, EJ, GD, MMEJ, HACK).

Figure 1—figure supplement 1C. The genotypes on the crosses shown are still very small and hence unreadable without zooming in. Why do the authors need to show 2 identical crossing schemes with the only difference that gRNA#1 or #2 was used? This information could simply be listed in a table or as done in Fig S1D. The authors describe in an extremely complicated way in the text the simple fact that expression of gRNA#1 PolG2 in the presence of Act-Cas9 is killing flies more effectively than gRNA#2 PolG2.

Fig S2B. Why do we need a figure that shows how to make transgenes in 2 different ways for both HomeR drives? This should be in the methods. There is no discovery shown. In the end, only one line for each HomeR construct in the PolG2 gene is used for the population experiments. Which method was applied to generate this one is not clear. Again, this distracts from the message and makes the paper hard to read.

https://doi.org/10.7554/eLife.65939.sa1

Author response

[Editors’ note: the authors resubmitted a revised version of the paper for consideration. What follows is the authors’ response to the first round of review.]

The following points must be addressed as they are the two main points the manuscript is trying to make but they are not well supported:

– The major claim of the paper, the release of the drive into a wild type population with no Cas9 must be investigated as there is no practical demonstration of a split drive.

We appreciate this suggestion and have now done additional multigenerational population cage experiments (15 in total), including releases of the HomeR drives into the WT genetic background at two different release rates, and demonstrated that the spread of HomeR is limited by Cas9 – a desired self-limiting safety feature inherent to the split-gene drive design of HomeR.

– There is also no final demonstration of the repression of R1 alleles: The strategy for sampling resistance appears to not be appropriate and must be changed as suggested by reviewer #4.

We have carefully described our results as to the representation of resistant alleles in our multigenerational population cage experiments in which HomeR:Cas9 males were released at 25% into Cas9 background (5 replicates, followed for 10 generations – reaches fixation). We have provided 12 new multigenerational population cage experiments. Moreover, we assessed fitness of R1 alleles sampled from drive populations and found that these non-silent R1 incurred fitness costs on female carriers. Taken together, these 15 drive experiments illustrate that HomeR is stable, and can spread and persist in a Cas9-dependent manner. See more specific comments below.

– The paper claims to have used an ultraconserved gene. As so much importance is placed on the drive design overcoming resistance, this must be demonstrated. A gene that is considered as ultraconserved would require that there is little or no nucleotide variation (e.g. dsx in the Kyrou et al.,). Conservation at the protein level does little to protect against synonymous mutants that would constitute resistance. A better justification for this choice and a demonstration that it was indeed the right choice must be presented.

We have removed the term “ultraconserved” and added more detail as to how and why we chose PolG2 (note: the gene’s name was changed during the review process). We have also provided amino acid alignments to demonstrate just how well conserved this target site is across diverse species from Humans – frogs – chickens – mice – insects (Figure 1—figure supplement 1B).

– The data are really presented in a very difficult way and the paper must be extensively revised with a much broader audience in mind.

We have edited and simplified the language of the paper to target a broader audience. We have moved many of the figures to the supplement to help streamline the paper.

– The reviewers did a very thorough review of the paper that should help you to improve it. The individual reviews are included below.

We agree and to be honest this is one of the most constructive sets of reviews on a paper that we have ever received. We tremendously appreciate all the reviewers for their hard work.

Reviewer #1:

This paper shows that Cas9-mediated homology-directed repair can be used to insert a synthetic rescue gene into an essential gene; here mitochondrial Pol-γ35 was chosen. The insertion is marked by an eyeless-GFP reporter and also contains the gRNA (gene drive) but not the Cas9 (considered as a safe split gene drive). 'Homing' of the eye-GFP is assayed to detect insertion at the homologous locus when Cas9 is present by HDR.

The authors show that this works well in the female germline with various tested Cas9 lines (vas, nos, Act5C and ubiq-Cas9). In all cases close to 100% transmission to the homologous locus on the homologous chromosome is achieved when an effective guide RNA is used. Hence, eye GFP transmits (“homes”) in a “super-Mendelian” ratio at the chosen target. A male specific transmission works less well (exuL-Cas9).

The reason why it works well appears to be that the chosen target is an essential gene (Pol-γ35) in which small changes caused by NHEJ that result in homing 'resistant' alleles will be loss of function alleles and hence will not spread in the population.

Unfortunately, the authors did not test, how the drive could spread in a wild type population (no Cas9 expression). I am also missing a test relevant for pest studies that would achieve the spread of a potentially deleterious or beneficial insertion that could kill a population or make it resistant to a disease.

We thank this reviewer for this comment. To strengthen this paper, we have now included a total of three separate multigenerational population cage experiments (15 experiments with replicates in total). These include release of HomeR:Cas9 males at 50% in Cas9 background (3 replicates, followed for 10 generations – reaches fixation), and release of HomeR:Cas9 males at 25% in Cas9 background (5 replicates, followed for 10 generations – reaches fixation). We also perform a negative control experiment by releasing HomeR males at 25% in WT background (3 replicates, followed for 10 generations, does not reach fixation as expected since drive is Cas9 dependent and inherently confinable). In addition to these experiments, we also have included additional experimental data illustrating the behavior of the HomeR system in a wildtype population released at two introduction frequencies. These include the release of HomeR:Cas9 males at 50% into a WT background (2 replicates, followed for 6 generations) and the release of HomeR:Cas9 males at 75% into a WT background (2 replicates, followed for 6 generations). As expected from all these experiments, the HomeR drive persists/spreads in a Cas9-dependent manner, making the drive inherently confinable. Moreover, we have provided further modelling to illustrate the behavior with multi-releases of HomeR drive with varied fitness costs to both the drive and to the Cas9 alleles.

1) This paper is very hard to read. Sentences are excessively long and complicated. References to the Figures appear not always correct.

We thank this reviewer for this comment. We have extensively revised this manuscript and have moved many of the non-essential results (e.g. second gRNA construct details) to the supplement to make the manuscript easier to digest.

2) Figure 1. Genotypes in Figure 1A are unreadable in the print version because of the small font.

We appreciate this comment – we have increased the font/readability of this panel and also moved this Figure to Figure 1—figure supplement 1C.

Are the 2 crossing schemes required that only differ in gRNA1 or gRNA2?

Yes, there are two crossing schemes differing by gRNA (1 or 2). We wanted to illustrate that gRNA#1/Cas9 is lethal to all trans-hets (male/female) while gRNA#2 is lethal only to trans-het males. We hope that by increasing the font size this distinction will now be clearer.

The surviving progeny should be quantified as in Figure 1B. Figure 1B shows nos-Cas9 and not act-Cas9 results (several typos in subsection “Design and testing of gRNAs targeting an essential gene”).

Corrected. The data for both Nos-Cas9 and Act5C-Cas9 can be found in Supplementary file 2 and Supplementary file 3.

Figure 1C: the incidence of heterozygous, homozygous and “resistant” cells is schematic and not supported by data, hence questionable if Figure 1C should be shown in results.

Corrected. We appreciate this suggestion and have moved this panel to Figure 1—figure supplement 4. LBM is a mechanism explaining the data. The expression of Cas9/gRNA during development induces independent mutations in somatic and germ cells resulting in lethality at the organism level.

3) Figure 2. Genotypes not readable in print. Is it necessary to show schemes of the procedure how transgenic flies were generated and how the Pol-γ35 HomeR were made with all chromosomes detailed (Figure 1D)? This could move to the Materials and methods as it is standard and we learn not much new.

Corrected. We appreciate this suggestion and have moved this from the main figures to Figure 1—figure supplement 2.

4) More typos: Subsection “Assessment of germline transmission and cleavage rates”:

Figure 2B is the wrong reference.

Corrected.

Actic 5C should read Actin 5C.

Corrected.

Figure 4B GGG codes for Gly (not Gla).

Corrected, now Figure 2—figure supplement 1.

Sixth paragraph of the Discussion should refer to Figure 6?

Corrected.

5) Figure 5 – as Figure 1 Figure 2, only readable on the computer.

We have updated and modified the figures to make them more readable.

6) It would be interesting to see how the gene drive would spread if HomeR and Cas9 were introduced in a competitive way into wild-type populations. This is similar to Figure 4C, but only the HomeR males or females would carry the Cas9. This would be a more realistic test of how the gene drive could spread in a wild population that obviously does not express Cas9.

We agree, and we have now included those experiments. In total we now have 15 multi-generational drive experiments. We have also included additional mathematical modelling to support these new data and make predictions related to fitness costs (to either allele: Cas9 or HomeR) in addition to multi-release scenarios.

Reviewer #2:

Kandul et al., present an interesting study that could lead to important improvements on the use of homing-based gene drives. However, before publication can be supported there are a number of things that should be addressed to improve the manuscript for better comprehension by readers.

Overall the manuscript presents a load of data. But the presentation of these data could be made in a better digestible way. The authors should go over their manuscript with a reader in mind, that is interested but not necessarily knows all the relevant literature in the very detail.

We thank the reviewer for this comment. We have significantly revised the manuscript and moved much of the nonessential material to the methods/supplement to make the paper more digestible.

Abstract: Please remove "inherently confinable" from the abstract. The drive is indeed designed in a split drive design, however, all the experiments were done in a homozygous Cas9 background. Therefore, there are no experimental data for a split drive provided in this manuscript. The split situation seems to be here more of a practical reason to be allowed to do the experiments in a less stringent laboratory environment. Thus there are no experimental data that would support the confineable nature of this drive. Actually there are not even modelling data to this. Thus, such a statement should not be put in the Abstract. This manuscript is not a demonstration of a confineable drive.

We thank the reviewer for this comment. However, we have provided new population cage data (see the response to reviewer 1's comments above) demonstrating that HomeR spreads in a Cas9-dependent manner. These included multiple experimental releases of HomeR into WT populations demonstrating drive capacity, persistence, and confinability. Therefore, given these ample new data, we prefer to leave in the term “inherently confinable,” as we have clearly demonstrated this potential.

Results: How was Pol-γ35 identified? It would be interesting to the reader to get to know about the exact reasoning, why this gene was chosen. Or were there several ones chosen before and this turned out to work the best or was the easiest to design. This could be very interesting considerations important to the field.

We have added in more detail as to why we chose this target gene.

“Selection of PolG2 as a HomeR drive target.

To develop a HomeR-based drive, we first identified an essential haplosufficient gene to target. We chose DNA Polymerase γ subunit 2 (PolG2, DNA polymerase Ɣ 35-kDa, CG33650), required for the replication and repair of mitochondrial DNA (mtDNA) (Carrodeguas, 2000; Carrodeguas et al., 2001) whose LOF results in lethality (Iyengar et al., 2002). PolG2 encodes the small subunit of the mitochondrial DNA polymerase γ, acting together with the large subunit 1 (PolG1, DNA polymerase Ɣ 125-kDa, CG8987) for the replication and repair of the mitochondrial genome (Carrodeguas, 2000; Carrodeguas et al., 2001). PolG2 is a short and conserved gene, and its C-terminal domain (cd02426, (Lu et al., 2020)) is roughly 130 amino acids (AA) and shares 55% AA identity with the Human PolG2 (Lecrenier et al., 1997) (Figure 1—figure supplement 1A,B). Importantly, Drosophila PolG2 loss-of-function (LOF) mutations are known to cause lethality at the early pupal stage (Iyengar et al., 2002). The C-terminal location of the functional domain in PolG2 facilitates its re-coding, making PolG2 an optimal target for a HomeR gene drive (Figure 1A).”

Moreover, we want to point out that since receiving this paper back from peer review, FlyBase has updated the name of this gene to PolG2, and we have therefore updated the gene name throughout the manuscript.

Results (Figure 1B; Figure 1C) and Materials and methods and Figure 1 (both Figure and legend):

The addressing of the Figure panels and the writing to it don't fit. Has there been a rearrangement of the Figure that was not worked through the text?

When referring to "B" in the text, it is still about Act5C-Cas9 and the nos-Cas9 data are in the text referred to Figure 1C. But Figure 1C is BLM.

Corrected. We have updated the figures to correct this issue.

In current panel Figure 1B, what does "all" mean below the X-axis? This is not comprehensible.

Corrected. We have edited the data presentation in this figure panel to make it easier to comprehend. We have also indicated where the raw data for this panel can be found.

Panel C is not really described in the Figure legend.

Corrected. We have removed this panel from this figure. BLM is now described in Figure 1—figure supplement 4.

Results, Discussion, and Figure 1—figure supplement 4 legend. "converting recessive non-functional resistant alleles into dominant deleterious /lethal mutations" is completely misleading. There is no "conversion" and how should that be done molecularly. There is a continuous removal of such alleles from the population because of lethal transheterozygous conditions caused in the drive. However, there is no active conversion of such alleles into dominant lethal ones. This needs to be clearly rewritten to avoid the misleading idea.

Corrected. We appreciate this comment and have revised our wording to better describe the mechanism of BLM.

Figure 1—figure supplement 4 also seems to have a slight conceptional problem. What are "cells" (rectangles) with a red frame and a green core? Green means at least one wt allele (this must include the recoded rescue allele.). Red means biallelic knock-out: thus a red cell cannot have a wt allele. Thus what is a red-framed green core cell?

To explain the removal of R2 alleles, a depiction of yellow framed red core cells in the germ line would be helpful, since this would explain how R2 alleles are selected against and might be continuously removed from the population.

Corrected. We appreciate this comment and have revised this figure.

Results: Before going into the modelling, the reader should be clearly informed about all the different approaches that are now to be compared. This is currently not done well, if at all. Thus moving current Figure 6 before current Figure 5 might clearly help. Also a better explanation of the panels in Figure 6 is necessary as well as a correction of Fig6 Panel E.

A comparison of a great number of the currently approached toxin-antidote (gene destruction – rescue, but not killer-rescue.) systems is greatly appreciated. However, the authors cannot expect the general reader to know about the small detailed differences between the systems that are compared here. Thus the authors need to do some explanation and categorization of the different approaches here and also cite all the respective literature.

We appreciate this comment and agree that a detailed description of each system would be great. However, we prefer to focus on the results from our study and have therefore directed the reader to Figure 3—figure supplement 1 for mechanistic comparisons of each of these systems. What would be useful at this point would be a detailed review article that covers and compares each of these approaches – perhaps we could take that on later.

– First subdivision: Non-homing (interference-based drives) VERSUS Homing (thus overreplication-based drives). This will also help then to better understand, why the interference-based drives (TARE and ClvR) are more sensitive to fitness parameters than overreplication drives.

This distinction is present in the figure (see top right): non-homing (empty circle); homing (green circle).

– Second subdivision: same-site VERSUS distant site. This is important to understand the difference between the here modelled TARE and the ClvR. Actually ClvR is a TARE, but you use TARE here more specifically as the results in the respective paper are demonstrating only a same-site TARE. But this needs to be clearly stated here.

We appreciate the suggestion. We have added in same-site (teal diamond) and distant-site (red ellipse) into the schematic. This should help demonstrate that we are depicting a same-site TARE as the reviewer points out.

– Third subdivision: viable VERSUS haplosufficient VERSUS haploinsufficient. This also needs to be clearly depicted in labelling panels C to F of Figure 3—figure supplement 1, which are currently hard to grasp what the essential differences are, before looking at the panels in detail:

This distinction is present in the figure (see top right): non-essential gene (C, D; yellow hexagon); haplosufficient (F; orange square); haploinsufficient (E; blue triangle).

C: HGD of viable gene (HGD)

D: HGD of viable gene with rescue (HGD+R)

E: HGD of haploinsufficient gene with rescue (HGD-hi+R). This panel needs major correction.

F: HGD of haplosufficient (essential) gene with rescue (HomeR)

Corrected. We appreciate these suggestions and have updated our figure for clarity. All of this information can now be found in the figure itself or in the figure legend.

– Fourth subdivision: split VERSUS non-split. Here for the split HGD situation, the respective papers of which the current authors are co-authors should be cited: Kandul et al., 2020 and Li et al. 2020. In addition, it is also important to state clearly that "split or two locus" is completely independent of the "distant site" concept.

Corrected. All the drives in this figure are split designs, and we have added that descriptor to the title: “Mechanistic comparison of contemporary split-drives for population modification.” Comparing non-split designs would require a new figure, and given that HomeR is a split system, we do not think that is necessary here. Those papers (Kandul et al., 2020 and Li et al., 2020) have been cited.

The reader needs to understand the differences of the systems that are compared here, without having the reader to go to the respective publications themselves and then try to find out what the differences really are. This is not so obvious and the current authors have a clear chance here to do that and help the reader in the mists of all this similar but still distinct approaches.

We have tried our best to articulate these systems and make clear what the differences are. This underscores that what is really needed is a detailed review on this topic that covers these various systems. Please reach out to us if you might be interested in writing that piece with us in the future.

Figure 6 Panel E: This depiction is not consistent within itself, not consistent with the legend, and not consistent with the cited literature.

– Why should the rescuing drive construct over the wt allele be lethal as indicated in the right two boxes?

Thank you for pointing this out. We have modified the figure legend to match the depiction in panel E. The reviewer asks, “Why should the rescuing drive construct over the wt allele be lethal as indicated in the right two boxes?” This results from maternal deposition or somatic expression of Cas9/gRNA acting on the “WT” allele and disrupting it. Because the target is a haplolethal gene, one functional copy is not sufficient, hence the lethality. In our experience with the Nanos-Cas9 line, which is the same line used by Champer et al., 2020, we find significant maternal deposition and somatic activity. That said, the reviewer is correct that Champer et al., 2020 reported “minimal somatic expression” from Nanos-Cas9, and we have therefore updated this figure to reflect these findings.

– The cited paper Champer et al., 2020b clearly states that there is maternal carry over, which actually makes it so hard to use and is probably only working via male propagation. In the Figure legend it is said that "maternal carryover and somatic expression.… are empirically unavoidable", which is contrast to the depiction. The legend then also states that this is "unachievable". This should be better replaced by "hard to achieve", since the approach is published and seems to drive, even though probably just via the males. Thus the depiction of panel E needs to be thoroughly revised.

Corrected.

Discussion: The haplolethal HGD works (admittingly poorly) despite the maternal carryover (Champer et al., 2020b). Therefore, your statement needs to be refined or deleted: "requires germline-specific promoter that lacks maternal carryover" is not consistent with the published paper. The drive could go via the males because then you do not have maternal carry over. And homing based drives can go via males and do not necessarily have to be promoted through females, see also KaramiNejadRanjbar et al., 2018.

We thank the reviewer for this comment and have modified this sentence.

Discussion. This sentence is based on an old but clearly overruled idea. NHEJ repair is not restricted to a time before the fusion of the paternal and maternal genetic material. It has been clearly demonstrated that R1 and R2 alleles are generated in the early embryo also after the zygote state (Champer et al., 2017, KaramiNejadRanjbar et al., 2018). Actually, all of the authors' Figure 1C and Figure 1—figure supplement 4 are about NHEJ mutation in the early embryo causing "BLM". Thus this sentence is inconsistent with current believes and also with the authors' own writing.

We thank the reviewer for this comment and have modified this sentence.

Figure 4: Panel C graph: Why is in the controls the transgene consistently and significantly higher inherited to the next generation (0). It is about 75% progeny sired by the transgenic fathers compared to the wild type fathers? Was there an age advantage of the transgenic ones or whatever other fitness factor? This is surprising and no explanation is given at all.

In contrast, in the Cas9 background, in generation 0 less than 50% carry the drive allele, which is probably due to induced lethality. But this should also be commented on.

We thank the reviewer for pointing this out. We have mentioned the higher mating competitiveness of HomeR males without Cas9 relative to WT males in the text (Figure 2B). These data suggest that HomeR provides a good rescue and does not incur fitness costs without Cas9. We now clearly show that nos-Cas9 alone does incur fitness costs to its carriers (Figure 2C), so when HomeR+Cas9 males are released together with Cas9-only males, the former are less competitive (Figure 2B).

In the legend it is stated that 7 of 9 flies carried an R1 allele heterozygous to an R2 allele. What about the other two?

We thank the reviewer for this comment and have clarified these results in the manuscript. The current legend of Figure 3—figure supplement 1 reads: “Seven out of nine flies were heterozygous, harboring one of the identified PolG2R1 alleles together with an out-of-frame indel (LOF) allele. The remaining two GFP- flies were likely homozygous for the R1#1 allele, because ten randomly sequenced clones harbored the same allele.” We describe in the Methods section that PCR amplicons from a single fly were subcloned into the pCR2.1-TOPO plasmid, and at least 7 clones were sequenced in both directions by Sanger sequencing. In general, we were able to identify both alleles from sequencing 7 clones. If all 7 clones were identical, we sequenced 3 additional clones.
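As a rough check on the clone-sampling scheme described above, one can estimate how often a truly heterozygous fly would yield only one allele among the sequenced clones. The sketch below assumes, purely for illustration, that both alleles are amplified and subcloned with equal probability and that clones are independent draws; the function name and the `allele_fraction` parameter are hypothetical and are not taken from the manuscript.

```python
def prob_single_allele_only(n_clones: int, allele_fraction: float = 0.5) -> float:
    """Probability that every sequenced clone from a heterozygous fly
    carries the same allele, so the fly would look homozygous.

    allele_fraction is the assumed share of clones derived from allele A
    (0.5 means no amplification or cloning bias; this is an assumption).
    """
    if n_clones < 1:
        raise ValueError("n_clones must be at least 1")
    if not 0.0 <= allele_fraction <= 1.0:
        raise ValueError("allele_fraction must be between 0 and 1")
    p = allele_fraction
    # Either all clones are allele A, or all clones are allele B.
    return p ** n_clones + (1.0 - p) ** n_clones


if __name__ == "__main__":
    for n in (7, 10):
        print(f"{n} clones: {prob_single_allele_only(n):.4%}")
```

Under these assumptions, seven identical clones from a heterozygote would be expected in roughly 1.6% of flies, and ten identical clones in roughly 0.2%, which is why sequencing three additional clones makes a homozygous call considerably more secure. Any amplification bias between alleles would raise these values.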

Reviewer #3:

The authors are to be commended for the effort put into careful experimental design and clear presentation of methods and results.

My main concern with the manuscript is that the claim about their specific polymerase gene being "ultraconserved" is not backed up with their own data or by citations from the literature. If the gene sequence was ultra-conserved, I wouldn't have expected the authors to be able to do so much recoding of the gene without fitness consequences. Furthermore, it is clear that homozygous-viable NHEJ mutations did develop in the experiment. Without explanation, this seems to be a fatal flaw in the design.

Corrected. We thank the reviewer for this comment and have removed the term “ultraconserved” from the manuscript. We have also expanded the discussion of the viable resistant alleles and assessed their fitness. It should be noted that every drive system is susceptible to resistant alleles, and these cannot be 100% avoided by any design. We have not identified any silent R1 allele: every sampled in-frame resistant allele changes the amino acid sequence of PolG2.

This manuscript describes a modification of the general homing gene drive concept by use of a split drive system that increases the frequency of a recoded polymerase gene that replaces a cleavage susceptible, naturally occurring, haplosufficient, conserved polymerase gene. This approach is taken in order to limit the evolution of cleavage resistance in the naturally occurring gene.

As mentioned in the summary, I am not convinced that the research presented achieves the intended goals. I did a quick look for literature on the "ultraconserved" polymerase pol-y35 gene and could find none. I am not sure if the conservation is at the DNA sequence level or at the amino acid level. If at the amino acid level, then it makes sense that resistance alleles can form at the DNA level that don't impact the protein at all. Figure 2A shows the 22 and 27 recoded nucleotides for the two guide RNA sites. The authors say that these changes to the sequences didn't seem to impede fitness. Did the authors try many other recodings and finally decide on these because all others caused loss of fitness, or is it just that this gene is robust to substitutions even though the protein is conserved?

We thank the reviewer for this comment; the term “ultraconserved” was removed from the manuscript. On the fitness question, we did not observe any significant decreases in fitness. Moreover, when HomeR was released at 25% allele frequency, it remained relatively stable in 3 multigenerational population cage experiments observed for 10 generations (Figure 2B). Additionally, we targeted the 3’ end of the gene to minimize the degree of recoding required and thereby limit potential fitness impacts. To help illustrate the degree of conservation we are referring to, we have provided the sequence alignments in Figure 1—figure supplement 1B.

Figure 4C shows that the frequency of flies with at least one copy of the pol-y35home R1 increased from about 25% to about 50% between the parental and F0 generation when there was no Cas9 present. As long as the transgenic males were competitive with the wild flies this makes sense because the released flies were homozygous for that allele and the offspring should all have inherited one copy of the gene. What doesn't make sense is that when the work was done with all flies harboring the Cas9, the pol-y35home R1 increased less than in the former case, from the parental to generation F0, the frequency of flies with the pol-y35home R1. In some replicates the frequency of such flies didn't increase at all. It should be noted that the parents were always homozygous. This certainly indicates a fitness cost to the flies with a combination of Cas9 and the homing construct.

We appreciate this comment and agree that there seems to be a major fitness cost when Cas9 is present. In this revision we have provided additional multigenerational population cage experiment data to support this claim. For example, Figure 2C shows HomeR:Nos-Cas9 male releases at two thresholds (25% and 37.5% allele frequencies) into a WT background. In the first couple of generations we see some stochasticity, but from generation 2 onward the Nos-Cas9 allele rapidly declines in frequency, heading toward elimination. This illustrates the inherent fitness cost carried by the Cas9 allele, which can explain the discrepancy this reviewer highlights.
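The reviewer's expectation in the comment above (carriers rising from about 25% of parents to about 50% of offspring without any drive) can be written as a short calculation. The sketch below assumes a single round of mating in which released males are homozygous for HomeR, all females are wild type, the sex ratio is equal, and there is no homing or viability selection; the `relative_competitiveness` parameter and the function name are hypothetical illustrations, not values fitted in the manuscript.

```python
def expected_carrier_fraction(transgenic_male_fraction: float,
                              relative_competitiveness: float = 1.0) -> float:
    """Expected fraction of offspring carrying at least one HomeR copy.

    With homozygous transgenic fathers and wild-type mothers, every
    offspring sired by a transgenic male is a heterozygous carrier, so the
    carrier fraction equals the share of matings won by transgenic males.
    """
    m = transgenic_male_fraction
    c = relative_competitiveness
    if not 0.0 <= m <= 1.0:
        raise ValueError("transgenic_male_fraction must be between 0 and 1")
    if c <= 0.0:
        raise ValueError("relative_competitiveness must be positive")
    weighted = c * m  # mating weight of transgenic males
    return weighted / (weighted + (1.0 - m))


if __name__ == "__main__":
    # Transgenic males make up half of all males, i.e. 25% of all flies.
    for c in (0.5, 1.0, 3.0):
        frac = expected_carrier_fraction(0.5, c)
        print(f"competitiveness {c}: {frac:.1%} carriers in offspring")
```

With equal competitiveness the carrier fraction is 50%. A value above 50% corresponds to released males out-competing wild-type males, and a value below 50% to a mating disadvantage, which is the direction of the effect discussed for males carrying Cas9.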

In this same Figure, results from the model are plotted. It seems like the model assumes no fitness cost because it shows an exact increase from 25% to 50% flies carrying at least one copy of the pol-y35home R1 theoretical construct. In later generations the experimental results outperform the model. Presumably, this model is used to construct figure 6. This mismatch needs to be addressed in the manuscript.

Our original fits indicated that males experienced very few issues due to expression of Cas9, leading to the near-perfect inheritance noted by the reviewer. However, we believe that was an artifact of releasing into a Cas9 population, and we have since updated our work to reflect the new cage experiments (current Figure 3C), in which HomeR:Nos-Cas9 was released into a WT background. These experiments allowed us to uncover a significant mating reduction in males due to expression of Cas9, which is also reflected in the HomeR-Exp. simulations in Figure 5A and 5C. This fitness cost aligns our model with the current experimental work. The model does take fitness costs into account, and these are more impactful later in the drive spread, but it does not take into account mating competition or inbreeding.

The fact that in all three replicates of the experiment without Cas9, the F0 is above 50% indicates that something else may be going on that is unrelated to gene drive. It could be due to heterosis between the two slightly different strains of flies. When wildtype males mate with wildtype females, the offspring are more inbred than when a transgenic male mates with a wildtype female. -just a hypothesis.

Heterosis is an interesting hypothesis. We think this is more likely due to stochasticity resulting from small caged populations, in addition to the fitness costs associated with Cas9 as articulated above. Regardless, we have provided ample additional multigenerational population data (15 experiments in total) further supporting the conclusion that HomeR can spread in a Cas9-dependent manner.

Reviewer #4:

Gene drives can be used for sustainable control of disease vectors, and there is a need for different gene drive strategies that can be tailored to the particular species, timescale, and desired spatial spread. Kandul and colleagues present a welcome new addition to the growing number of strategies for gene drive, called HomeR, that combines elements of killer-rescue and homing-based drive to exert spatiotemporal control over its spread, whilst counteracting the rise of resistant mutations. Whilst it is extremely promising, some major claims of this manuscript are inaccurate or unsupported by the evidence. The authors could easily address the most important concerns by expanding their sequencing analysis to better detect and quantify resistant mutations, paying careful attention not to overstress the potential of this drive to mitigate resistance, and by comparing the relative strengths of different drive strategies instead of focussing only on features that are most flattering to the HomeR strategy.

1) The drive release strategy of Figure 4A and 4C are primed to underestimate and potentially mask resistance. In Figure 4A, where the authors search for signs of resistance, the population was seeded with males that were all homozygous for the drive, meaning that 100% of their G0 progeny will inherit it. As the rate of homing is close to 99%, only a small fraction of their G1 could have inherited a non-drive (potentially resistant allele) allele. In a realistic release scenario, resistant alleles will have ample opportunity to be generated and subsequently selected. Though still far from adequate, resistance testing would have been better performed on samples collected from the lower frequency releases in panel C. This experiment should not be used to draw strong conclusions about resistance to pHomeR, but should be used to make broader observations regarding the spread and stability of the construct.

We agree with this reviewer and appreciate this comment. The experiments in Figure 4A (new Figure 2A) were designed mostly to explore the stability of HomeR and to “fish out” any frequent R1 resistant allele(s) induced in generation 0. In this revision we have provided ample new multigenerational population cage experiment data assessing the performance of HomeR when seeded at lower frequencies (e.g. 25% and 37%). These lower release frequencies did provide opportunities for resistant alleles to be generated and selected at the expense of the HomeR allele, but this did not happen. In the presence of Cas9, HomeR spread to fixation (new Figure 2B). The drives in new Figure 2C also spread or were able to persist at moderate frequency, but the Cas9 allele had too high a fitness cost and quickly fell out of the populations, limiting the ability of the drive to spread. That said, we did sample additional R1 resistant alleles from the drives in new Figure 2B, indicating again that these alleles can indeed be generated. However, we crudely assessed the fertility of flies carrying the sampled R1 alleles before genotyping them, and found that the sampled R1 alleles imposed a fitness cost on the viable HomeR- females.

2) The strategy for sampling resistance will obscure almost all resistance in the population, and would fail to detect even a strong selection for it. Flies were only selected for resistance genotyping if they lacked GFP, meaning they carry two non-HomeR alleles (i.e. homozygous for the R1 allele or transheterozygous with another R1/R2/WT). One would expect most resistant alleles to be heterozygous in a population that was seeded with almost complete drive homozygosity. The authors could, and should, have done more to identify and quantify these. Amplicon sequencing was used to sample the full diversity of alleles in a larger pool of individuals (including GFP+ flies) collected at G10, why was this approach not used throughout? By adopting the approach earlier they would have been able to track the changing frequencies of R1 and R2 alleles over time.

We agree with the reviewer that sequencing every generation would be an ideal approach to assess the fate of all resistant alleles, and would be extremely beneficial for the gene drive field. However, it was beyond the scope and goal of this project. This would be a large, very interesting, and expensive endeavour that we would prefer to save for work in an actual pest species (i.e. mosquitoes). Illumina amplicon sequencing was undertaken primarily to see if the R1 alleles identified in generations 2 and 3 persisted until generation 10. As a note, we think that due to the HomeR design R2 alleles are not going to block its spread; they can only slow its genotypic fixation. R1 alleles, in contrast, can block the spread of any homing-based gene drive system, including HomeR. We understand that our approach with 320 heterozygous flies at generation 0 is limited, because R1 alleles are hidden by HomeR alleles. WT alleles in heterozygous females will be converted into HomeR alleles, but many WT alleles in heterozygous males will be passed to the next generation as WT, R1 or R2 alleles. Moreover, we have provided additional multigenerational population cage experiments to support the drive propensity, stability and confinability of HomeR. We have also sampled additional R1 resistant alleles from these cage experiments and crudely assessed the fitness of their carriers.

3) The impression given in the Figure and main text is that R1 alleles were rare (or entirely absent), when they were not. In spite of the incredible advantage given to the drive, and a bias in sampling method that would mask the presence of resistant alleles, resistance was observed in every generation tested (G2, G3 and G10). The authors claim that because GFP- individuals were not observed in later generations, the resistant alleles had not come under positive selection. This logic is flawed, and indeed their own amplicon sequencing analysis performed on G10 flies revealed several resistant alleles, including an R1 present in 80% of non-drive alleles. The two most frequent mutant alleles detected were in frame, and I do not agree that these are likely to be deleterious recessive (as the authors speculated). These could be functionally resistant mutations. I believe there were many more R1 alleles in heterozygosity with the HomeR allele, these alleles could have been spreading, but were excluded from the genotyping analysis. Could these putative R1 individuals not have been specifically tested to see if they do, or do not confer resistance?

We removed any indication that these R1 alleles were rare events, since they were observed at each generation tested. Moreover, we removed the argument that because GFP- individuals were not observed in later generations, the resistant alleles had not come under positive selection. We also added a note to the description of these data indicating the limitation of this analysis: “we cannot rule out the possibility that there were many other potential resistant alleles that were present in the population – that could have been masked by the HomeR allele.”

4) The modelling takes a very limited approach to comparing different drive strategies, and by comparing proof-of-principle designs, important differences are obscured. For example, simple modifications that would mitigate resistance are likely to be included in many designs – such as multiplexing gRNAs. The nuances of each design are lost in a discussion focused on the rate of spread, which is largely irrelevant now because all of drives are predicted to spread well.

While we appreciate the reviewer’s comments, we consider such “proof-of-principle” constructs demonstrative of any mitigating modifications to these designs. We also think that the rate of gene drive spread is an important parameter, since a slower spread can be blocked by the immigration of existing R1 alleles from neighboring populations. Moreover, we have added modelling to this manuscript that takes into consideration multiple releases, fitness costs and transmission rates. We hope these models will help the reader understand and compare the HomeR system to other contemporary drive designs (see response to #5 below).

5) The authors did not discuss the relevance of having performed releases in a population that was already homozygous for Cas9. Do the release experiments and model really suggest the drive could spread if released into an otherwise WT population? I'm not sure the data presented in this manuscript can support that claim.

The reviewer is correct: we did not provide experimental data of HomeR releases into a WT population in the first version of this paper. That said, this revision includes new multigenerational population cage experiments (15 experiments in total; this was A LOT of work) demonstrating HomeR’s drive potential, stability and confinability. Additionally, we have provided additional mathematical models determining the number of releases of HomeR into a naive population required to reach a population frequency of 95%. These simulations implement overlapping generations and density-dependent effects in larval stages (reflecting biological conditions), along with drive efficacy and fitness parameters estimated from the new cage trials, to accurately estimate the behavior of HomeR in the field.

[Editors’ note: what follows is the authors’ response to the second round of review.]

Reviewer #1:

I commend the authors for conducting additional experiments that enable assessment of the drive dynamics of their strategy under conditions when the split drive is introduced into a lab population without Cas9 and testing the Cas9 independently. The finding that the Cas9 has a fitness cost explains some of the previous results. This must have been a lot of work, but I think it was worth the effort.

Thank you for this feedback. We agree, this was a significant amount of work – but it was essential to include this additional data.

I appreciate that the authors have removed the term "ultra-conserved", but I am still not comfortable with their use of the term "conserved" in relationship to the focus of the manuscript on gene drive. It's not just the term, but the expectation that this will be a stable drive system. Even with the small sample size in the current laboratory experiments (compared to what would be expected for the size of the target population in a field release) mutations arose that seemed to have no fitness consequence in males even as single copies with an LOF copy. Isn't it therefore reasonable to expect mutations to arise due to NHEJ that wouldn't have fitness effects on males and females? Beyond that, wouldn't a natural population be expected to already harbor some genotypes that would be immediately resistant to this drive? The authors should clearly address why they don't expect this problem with using their design outside of the lab.

We appreciate this comment and have removed the word "conserved" throughout the manuscript. In the revised manuscript, we stress further that some functional resistant alleles can exist in natural populations and that, together with induced functional resistant alleles, they can slow the spread of HomeR. We agree with the reviewer that functional resistant alleles can eventually be sampled or induced, and that no homing gene drive system is completely stable in the face of natural selection. We offer suggestions on how to improve HomeR drive stability by multiplexing gRNAs.

In Figure 1—figure supplement 4, the authors show amino acid sequences that appear to be consensus sequences. What is important for this paper is understanding how much variation exists in the DNA sequence for the 3' end part of the domain of the gene for D. melanogaster and other potential targets of gene drive. At least for D. melanogaster there are many sequences available. Such data may also be available for some other pest insects. Before this paper is accepted, I think it behooves the authors to provide information on this issue that could predict whether this drive would really thwart resistance evolution.

Thank you, this is a great suggestion. We previously sequenced the targeted area in PolG2 in a few D. melanogaster lines available in the lab (Oregon, Canton S, etc.), but did not explore the SNP datasets available for D. melanogaster. To explore these datasets, we downloaded the Drosophila melanogaster Genetic Reference Panel 2 (http://dgrp2.gnets.ncsu.edu), which includes natural variation in genome architecture among 205 Drosophila melanogaster Genetic Reference Panel lines (Mackay et al., 2012; Huang et al., 2014), and searched for SNPs mapped to the target sequences. Interestingly, we did not find any SNPs mapped to the target sequences. The nearest SNP is located 8 bases downstream from the PAM sequence of gRNA#1 (the blue C in Author response image 1). A description of this analysis was added to the revised manuscript. Author response image 1 shows an alignment of 63 bases, including both gRNA target sites, across the (consensus) sequences of 13 Drosophila species.

Author response image 1
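The SNP search described above amounts to an interval-overlap check between variant positions and the gRNA target sites. The following is a minimal sketch of that kind of check, not the analysis pipeline used for the manuscript. The `Target` type, both function names, and all coordinates are hypothetical and chosen only for illustration; they are not DGRP2 data or real PolG2 positions.

```python
from typing import List, NamedTuple, Optional, Tuple


class Target(NamedTuple):
    """A target site (protospacer plus PAM) as a closed, 1-based interval."""
    name: str
    chrom: str
    start: int
    end: int


def snps_in_target(target: Target, snps: List[Tuple[str, int]]) -> List[int]:
    """Return positions of SNPs that fall inside the target interval."""
    return [pos for chrom, pos in snps
            if chrom == target.chrom and target.start <= pos <= target.end]


def nearest_snp_distance(target: Target,
                         snps: List[Tuple[str, int]]) -> Optional[int]:
    """Distance in bases from the target to the closest SNP on the same
    chromosome (0 if a SNP lies inside the target, None if there is none)."""
    distances = []
    for chrom, pos in snps:
        if chrom != target.chrom:
            continue
        if pos < target.start:
            distances.append(target.start - pos)
        elif pos > target.end:
            distances.append(pos - target.end)
        else:
            distances.append(0)
    return min(distances) if distances else None


# Hypothetical coordinates for illustration only.
targets = [Target("target_A", "chrX", 1000, 1022),
           Target("target_B", "chrX", 1030, 1052)]
snps = [("chrX", 950), ("chrX", 1060), ("chr2L", 1010)]

for t in targets:
    print(t.name, snps_in_target(t, snps), nearest_snp_distance(t, snps))
```

In practice the SNP list would be read from the variant file distributed with the reference panel, and the target intervals would come from the gRNA design. Both steps are omitted here because their formats are not described in this response.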

I commend the authors for having done quite a bit of work to simplify the presentation, although, as they say, there is a limit to how much simplification can be done.

Thank you for this feedback.

Reviewer #2:

1) The main point from my last review was considering the significance of this study. I suggested to test if this gene drive can spread in wild-type populations not expressing Cas9. The authors have now included data that test this and find the gene drive does not spread, possibly because expressing Cas9 comes at cost of fitness. Considering this negative result, I do see limited impact of the presented data as the method does not work in wild populations. This new result contrasts what the authors state at the end of their abstract, that HomeR would work for wild populations. Hence, I feel this paper should be more suitable for a specialized journal. However, I am not a population geneticist. I leave this issue of impact to the other reviewers/editor. I am also not able to judge the usefulness and accuracy of the new simulations presented in Figure 3 and 4, comparing to other methods without doing any experiments.

We have described a novel homing gene drive design (HomeR) and showed thoroughly that it spreads via homing by scoring the egg-to-adult rate. Assessing the egg-to-adult rate has only recently become a gold standard for evaluating the drive mechanism. We built HomeR as a split drive (two-locus) to engineer biocontainment (a safety feature) into its design. The HomeR drive is dependent upon Cas9. When Cas9 levels are high in a population, HomeR spreads to fixation (Figure 2A,B). When Cas9 is introduced at low frequency (25%) and continues to decline in frequency in subsequent generations (nearing 0 by Gen 6) because of its inherent fitness costs, HomeR spreads in the first few generations (while Cas9 is at higher frequency) and then persists (above 50% frequency in all 4 independent populations tested). This should not be misinterpreted as “authors have now included data that test this and find the gene drive does not spread, possibly because expressing Cas9 comes at cost of fitness.” This result demonstrates that HomeR can spread and that its spread is limited by the availability of Cas9. This result was expected, since it reflects a biocontainment safety switch incorporated into the split-drive design, and is not a “negative result.” Moreover, we used only one release threshold here, and the mathematical modeling indicates that HomeR would perform well under a multi-release scenario (Figure 4) against a WT population. Limiting the spread of a split drive has been a desirable outcome, and the HomeR drive performed well for this purpose in a WT population.

2) I appreciate that the authors tried to make this paper more readable. However, I feel there is still a long way to go. Several sentences are still excessively long. 2nd sentence in the introduction extends across 9 lines. Fourth last sentence of intro: 11 lines. Many non-standard abbreviations are used throughout the paper (HG, LBM, EJ, GD, MMEJ, HACK .).

We appreciate this feedback. These sentences have been corrected. As for the non-standard abbreviations, these are all terms that are used throughout the literature, either to describe processes related to DNA repair, such as End Joining (EJ), Microhomology Mediated End Joining (MMEJ), loss-of-function (LOF) alleles, and Homology assisted CRISPR Knock-In (HACK), or in the gene drive field, such as Homing Gene Drive (HGD) and Lethal Biallelic Mosaicism (LBM). None of the terms listed is original to this paper; all have been adopted from other published work. Therefore, these are standard abbreviations in this field.

Figure 1-figure supplement 1C. The genotypes on the crosses shown are still very small and hence unreadable without zooming in. Why do the authors need to show 2 identical crossing schemes with the only difference that gRNA#1 or #2 was used? This information could simply be listed in a table or as done in FigS1D. The authors describe in an extremely complicated way in the text the simple fact that expression of gRNA#1 PolG2 in the presence of Act-Cas9 is killing flies more effectively than gRNA#2 PolG2.

We appreciate this feedback and have increased the font size in Figure 1-figure supplement 1C. We opted to show both crossing schemes because we wanted to make clear which genotypes are perishing. Given that this is a supplemental figure and we are not limited by space, we prefer to keep this figure as is. We have modified the text to make it less complicated, as the reviewer suggests.

Fig S2B. Why do we need a figure that shows how to make transgenes in 2 different ways for both HomeR drives? This should be in the methods. There is no discovery shown. In the end, only one line for each HomeR construct in the PolG2 gene is used for the population experiments. Which method was applied to generate this one is not clear. Again, this distracts from the message and makes the paper hard to read.

We appreciate this comment and have added the relevant information to the methods. We think that the method may be difficult for some readers and hence have made this figure a supplemental figure. We think this figure is important to help readers understand the methods we used to generate the gene drive lines and would therefore prefer to keep this supplemental figure.

https://doi.org/10.7554/eLife.65939.sa2

Article and author information

Author details

  1. Nikolay P Kandul

    Section of Cell and Developmental Biology, University of California, San Diego, San Diego, United States
    Contribution
    Conceptualization, Formal analysis, Supervision, Investigation, Methodology, Writing - original draft, Writing - review and editing
    Competing interests
    is a consultant for Agragene.
    ORCID: 0000-0001-7347-5558
  2. Junru Liu

    Section of Cell and Developmental Biology, University of California, San Diego, San Diego, United States
    Contribution
    Data curation, Investigation, Writing - review and editing
    Competing interests
    No competing interests declared
  3. Jared B Bennett

    Biophysics Graduate Group, University of California, Berkeley, Berkeley, United States
    Contribution
    Formal analysis, Investigation, Visualization, Methodology, Writing - review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0003-4718-257X
  4. John M Marshall

    Division of Epidemiology and Biostatistics, School of Public Health, University of California, Berkeley, Berkeley, United States
    Contribution
    Formal analysis, Supervision, Funding acquisition, Investigation, Methodology, Writing - review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0003-0603-7341
  5. Omar S Akbari

    Section of Cell and Developmental Biology, University of California, San Diego, San Diego, United States
    Contribution
    Conceptualization, Formal analysis, Supervision, Funding acquisition, Investigation, Methodology, Writing - original draft, Project administration, Writing - review and editing
    For correspondence
    oakbari@ucsd.edu
    Competing interests
    is a founder of Agragene, Inc, has an equity interest, and serves on the company's Scientific Advisory Board.
    ORCID: 0000-0002-6853-9884

Funding

Defense Advanced Research Projects Agency (HR0011-17-2-0047)

  • Omar S Akbari

National Institutes of Health (R21RAI149161A)

  • Omar S Akbari

National Institutes of Health (R01AI151004)

  • Omar S Akbari

National Institutes of Health (DP2AI152071)

  • Omar S Akbari

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

This work was supported in part by funding from a DARPA Safe Genes Program Grant (HR0011-17-2-0047), and NIH awards (R21RAI149161A, R01AI151004, DP2AI152071) awarded to OSA. The functional characterization of the ExuL promoter was done by OSA while at Caltech working with Bruce A Hay.

Senior Editor

  1. Patricia J Wittkopp, University of Michigan, United States

Reviewing Editor

  1. Claude Desplan, New York University, United States

Publication history

  1. Received: December 19, 2020
  2. Accepted: March 4, 2021
  3. Accepted Manuscript published: March 5, 2021 (version 1)
  4. Version of Record published: March 17, 2021 (version 2)

Copyright

© 2021, Kandul et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

