Mapping the functional landscape of the receptor binding domain of T7 bacteriophage by deep mutational scanning

Abstract

The interaction between a bacteriophage and its host is mediated by the phage's receptor binding protein (RBP). Despite its fundamental role in governing phage activity and host range, molecular rules of RBP function remain a mystery. Here, we systematically dissect the functional role of every residue in the tip domain of T7 phage RBP (1660 variants) by developing a high-throughput, locus-specific, phage engineering method. This rich dataset allowed us to cross compare functional profiles across hosts to precisely identify regions of functional importance, many of which were previously unknown. Substitution patterns showed host-specific differences in position and physicochemical properties of mutations, revealing molecular adaptation to individual hosts. We discovered gain-of-function variants against resistant hosts and host-constricting variants that eliminated certain hosts. To demonstrate therapeutic utility, we engineered highly active T7 variants against a urinary tract pathogen. Our approach presents a generalized framework for characterizing sequence–function relationships in many phage–bacterial systems.

eLife digest

Bacteria can cause diseases, but they also battle their own microscopic enemies: a group of viruses known as bacteriophages. For instance, the T7 bacteriophage preys on various strains of Escherichia coli, a type of bacteria often found in the human gut. While many E. coli strains are harmless or even beneficial to human health, some can be deadly. Finding a way to kill harmful strains while sparing the helpful ones would be a valuable addition to the medical toolkit.

Bacteriophages identify and interact with their specific target through a structure known as the receptor binding protein, or RBP. However, it is still unclear exactly how the RBP helps the viruses recognize which type of bacteria to infect. Here, Huss et al. set out to map and modify this structure in T7 bacteriophage so that the virus is more efficient and specific about which strain of E. coli it kills.

First, the role of each building block in the tip of the RBP was meticulously dissected; this generated the knowledge required to genetically engineer a large number of different T7 bacteriophages, each with a slight variation in its RBP. These viruses were then exposed to various strains of bacteria. Monitoring the bacteriophages that survived and multiplied the most after infecting different strains of E. coli revealed which RBP building blocks are important for efficiency and specificity. This was then confirmed by engineering highly active T7 bacteriophage variants against an E. coli strain that causes urinary tract infections.

These findings demonstrate that even small changes to the bacteriophages can make a big difference to their ability to infect their prey. The approaches developed by Huss et al. help to understand exactly how the RBP allows a virus to infect a specific type of bacteria; this could one day pave the way for new therapies that harness those viruses to fight increasingly resistant bacterial infections.

Introduction

Bacteriophages (or ‘phages’) shape microbial ecosystems by infecting and killing targeted bacterial species. As a result, they are promising tools for treatment of antibiotic-resistant bacterial infections and microbiome manipulation (Canfield and Duerkop, 2020; Chen et al., 2014; Clokie et al., 2011; Dedrick et al., 2019; Kilcher and Loessner, 2019; Kutter et al., 2015; Mizuno et al., 2020; Sausset et al., 2020; Schooley et al., 2017; Shen et al., 2015; Shkoporov and Hill, 2019; De Sordi et al., 2019). Interaction of phages with their bacterial receptors is a key determinant of their host range and virulence (Bertozzi Silva et al., 2016; de Jonge et al., 2019; Rousset et al., 2018). This interaction is primarily mediated by the receptor binding proteins (RBPs) of the phage (Nobrega et al., 2018). RBPs enable phages to adsorb to diverse cell surface molecules, including proteins, polysaccharides, lipopolysaccharides (LPS), and carbohydrate-binding moieties. Phages exhibit high functional plasticity through genetic alterations to RBPs and by natural and laboratory-guided evolution, which can modulate activity and host range to different hosts and environments (Ando et al., 2015; Chen et al., 2017; Dedrick et al., 2019; Dunne et al., 2019; Garcia et al., 2003; Gebhart et al., 2017; Holtzman et al., 2020; Lin et al., 2012; Meyer et al., 2012; Yehl et al., 2019; Yosef et al., 2017). In essence, survivability of a phage is intimately linked to the adaptability of its RBP. The challenge now is to understand the molecular code of RBPs in sufficient depth to enable predictable manipulation of host range and virulence. We sought to do so by combining deep mutational scanning (DMS) of the RBP with powerful selections on multiple hosts.

Although RBPs remain the focus of many mechanistic, structural, and evolutionary studies and are a prime target for engineering, we currently lack a systematic and comprehensive understanding of how RBP mutations influence phage activity and host range. Though insightful, directed evolution enriches only a small group of ‘winners’, which makes it difficult to glean a comprehensive mutational landscape of the RBP (Holtzman et al., 2020). Random mutagenesis-based screens generate multi-mutant variants whose individual effects cannot be easily deconvolved (Dunne et al., 2019; Yehl et al., 2019). Other approaches including swapping homologous RBPs lead to gain of function; however, the underlying molecular determinants of function can be difficult to explain (Ando et al., 2015; Chen et al., 2017; Gebhart et al., 2017; Yosef et al., 2017). In summary, despite the extraordinary functional potential of phage RBPs, how systematic changes to their sequence shape the overall functional landscape of a phage remains unknown.

Here, we carried out DMS, a high-throughput experimental technique, of the tip domain of the T7 phage RBP (tail fiber) to uncover molecular determinants of activity and host range. The tip domain is the distal region of the tail fiber that makes primary contact with the host receptor (González-García et al., 2015; Molineux, 2001; Qimron et al., 2006). We developed ORACLE (Optimized Recombination, Accumulation, and Library Expression), a high-throughput, locus-specific, phage genome engineering method to create a large, unbiased library of phage variants at a targeted gene locus. Using ORACLE, we systematically and comprehensively mutated the tip domain by making all single amino acid substitutions at every site (1660 variants) and quantified the functional role of all variants on multiple bacterial hosts. We generated high-resolution functional maps delineating regions concentrated with function-enhancing substitutions and host-specific substitutional patterns, many of which were previously unknown. We discovered T7 variants with far greater virulence than wildtype T7, demonstrating that even those natural phages that are well adapted to a host can be engineered for higher efficacy.

However, many variants highly adapted to one host performed poorly on others, underscoring a tradeoff between activity and host range. This functional screening highlights ideal regions of the tip domain for engineering host range. Furthermore, we demonstrated the functional potential of RBPs by discovering gain-of-function variants against resistant hosts and host-constriction variants that selectively eliminate specific hosts. To demonstrate the therapeutic value of ORACLE, we engineered T7 variants that avert emergence of spontaneous resistance in pathogenic Escherichia coli causing urinary tract infections (UTIs).

Our study explains the molecular drivers of adaptability of the tip domain and identifies key functional regions determining activity and host range. ORACLE provides a generalized framework to describe sequence–function relationships in phages to elucidate the molecular basis of phages, the most abundant life form on earth.

Results

Creating an unbiased library of phage variants using ORACLE

ORACLE is a high-throughput precision phage genome engineering technology designed to create a large, unbiased library of phage variants to investigate sequence–function relationships in phages. ORACLE overcomes three major hurdles. First, phage variants are created during the natural infection cycle of the phage, which eliminates a common bottleneck from transforming DNA libraries. By recombining a donor cassette containing prespecified variants to a targeted site on the phage genome, ORACLE allows sequence programmability and generalizability to phages with transformable bacterial hosts capable of maintaining a plasmid library. Second, ORACLE minimizes library bias that can rapidly arise due to fitness advantage or deficiency of any variant on the propagating host that may then be amplified due to exponential phage growth. Minimizing bias is critical because variants that perform poorly on a propagating host but well on targeted hosts may disappear during propagation. Third, ORACLE prevents extreme abundance of wildtype over variants, which allows for resolving and scoring even small functional differences between variants. The development of this technology was necessary to overcome challenges with existing engineering approaches for creating a large, unbiased phage library. Direct transformation of phage libraries, while ideal for creating one or small groups of synthetic phages, will not work because phage genomes are typically too large for library transformation (Ando et al., 2015; Kilcher et al., 2018; Marinelli et al., 2008; Marinelli et al., 2019). Homologous recombination has low, variable recombination rates and high levels of wildtype phage are retained, which mask library members (Pires et al., 2016; Yehl et al., 2019). Libraries of lysogenic phages could potentially be made using conventional bacterial genome engineering tools as the phage integrates into the host genome. However, this approach is not applicable to obligate lytic phages. 
Our desire to develop ORACLE for obligate lytic phages is motivated by their mandated use for phage therapy. Any phage, including lysogenic phages, with a sequenced genome and a transformable host that can maintain a plasmid library should be amenable to ORACLE.

ORACLE is carried out in four steps: (a) making acceptor phage, (b) inserting gene variants through recombination, (c) accumulating recombined phages, and (d) expressing the library for selection (Figure 1A). An ‘acceptor phage’ is a synthetic phage genome where the gene of interest (i.e., tail fiber) is replaced with a fixed sequence flanked by Cre recombinase sites to serve as a landing site for inserting variants (Figure 1—figure supplement 1). We created T7 acceptor phages by assembling PCR fragments of the phage genome in yeast (Ando et al., 2015; Jaschke et al., 2012) (see Materials and methods). T7 acceptor phages lacking a wildtype tail fiber gene cannot plaque on E. coli and do not spontaneously reacquire the tail fiber during propagation (Figure 1B, Figure 1—figure supplement 2A). Furthermore, the T7 acceptor phages have no plaquing deficiency relative to wildtype when the tail fiber gene is provided from a helper plasmid (Figure 1—figure supplement 2A). Thus, the tail fiber gene is decoupled from the rest of the phage genome for interrogation of function. Next, phage variants are generated within the host during the infection cycle by Optimized Recombination, in which tail fiber variants are inserted from a donor plasmid into the landing sites in the acceptor phage using site-specific recombination. To minimize biasing of variants during propagation, a helper plasmid constitutively provides the wildtype tail fiber in trans such that all progeny phages can amplify comparably regardless of the fitness benefit or deficiency of any variant. At this stage, we typically have approximately 1 recombined phage among 1000 acceptor phages (Figure 1C). To enrich recombined phages in this pool, we passage all progeny phages on E. coli expressing Cas9 and a gRNA targeting the fixed sequence flanked by recombinase sites that we introduced into the acceptor phage. The helper plasmid is retained during this stage to continue minimizing bias by providing the wildtype tail fiber in trans. 
As a result, only unrecombined phages will be inhibited while recombined phages with tail fiber variants are Accumulated without bias. The Cas9-gRNA system successfully inhibits acceptor phages but has no effect on plaquing of untargeted phages (Figure 1—figure supplement 2A–D). Recombined phages were enriched by over a thousandfold in the phage population when an optimized gRNA targeting the fixed sequence was used, whereas a randomized control gRNA yielded no enrichment of recombined phages (Figure 1D, Figure 1—figure supplement 2E, F). In the final step, phages are propagated on E. coli that lacks the helper plasmid that previously provided the wildtype tail fiber in trans to prevent bias. In this Library Expression step, propagation on this host allows full expression of the library variant; this is the first time during library creation that the variant is fully expressed on the phage particle. We sequenced the distribution of the library of tail fiber variants integrated on the phage genome after ORACLE. We compared this distribution to the distribution of variants on the recombination plasmid library to evaluate how effective ORACLE was at integrating variants and preventing bias during library creation. The post-ORACLE phages were mildly skewed toward more abundant members but remained generally evenly distributed and comparable to the distribution of variants in the input donor plasmid library, retaining 99.8% coverage (Figure 1E). Variant libraries with and without DNase treatment were well correlated (R = 0.994), indicating that unencapsidated phage genomes did not influence library distribution (Figure 1—figure supplement 3). In summary, ORACLE is a generalizable tool for creating large, unbiased variant libraries of obligate lytic phages. 
These phage variants, including those that have a fitness deficiency on the host used to create the library, can all be characterized in a single selection experiment by deep sequencing phage populations before and after selection in a host. Compared to traditional plaque assays, this represents increased throughput by nearly 3–4 orders of magnitude.

Figure 1 (with 3 supplements)

Optimized Recombination, Accumulation, and Library Expression (ORACLE) workflow for creating phage variant libraries.

(A) Schematic illustration of the four steps of ORACLE: creation of acceptor phage, inserting gene variants (Optimized Recombination), enriching recombined phages (Accumulation), and expressing library for selection (Library Expression). Color notations are as follows: yellow triangles, Cre recombinase sites; blue colored segments, gene variants; orange colored segment, Cas9; grey colored segments, wildtype phage parts including the wildtype tail fiber from the helper plasmid. (B) Ability of different versions of T7 to infect E. coli 10G in stationary (dark gray bar) and exponential (light gray bar) phases by Efficiency of Plating (EOP) using exponential 10G with gp17 tail fiber helper plasmid as reference host. T7 without tail fiber (T7Δgp17) and T7 Acceptor phages (T7 Acc) cannot visibly plaque, but wildtype T7 (T7 WT) and T7 with gp17 recombined into the acceptor locus (T7 Rec) plaque efficiently. (C) Concentration of total (Total T7) and recombined (T7 Rec) phages after a single passage on host containing Cre recombinase system. Recombination rate is estimated to be ~7.19 × 10^-4. (D) Percentage of recombined phages in total phages when using gRNA targeting fixed sequence at acceptor site T7 Acc (Targeted) or randomized gRNA (Random). (E) Histogram of abundance of variants in the input plasmid library (left) and on the phage genome after ORACLE (right) binned using log proportion centered on equal representation. All data represented as mean ± SD of biological triplicate.

Figure 1—source data 1 Deep sequencing summary for phage variant expression library with and without DNAse treatment after Optimized Recombination, Accumulation, and Library Expression (ORACLE). Related to Figure 1E and Figure 1—figure supplement 3.: https://cdn.elifesciences.org/articles/63775/elife-63775-fig1-data1-v2.xlsx
Figure 1—source data 2 Percentage distribution of each variant in the expression library. Related to Figure 1E and Figure 1—figure supplement 3.: https://cdn.elifesciences.org/articles/63775/elife-63775-fig1-data2-v2.xlsx

DMS of the tip domain shows phage adaptation at molecular resolution

DMS is a high-throughput experimental technique to characterize sequence–function relationships through large-scale mutagenesis coupled to selection and deep sequencing. The scale and depth of DMS is used to reveal sites critical for activity, host specificity, and stability in a protein. DMS has been employed to study many proteins, including enzymes, transcription factors, signaling domains, and viral surface proteins (Fowler and Fields, 2014; Lee et al., 2018; Raman et al., 2014; Romero et al., 2015).

Bacteriophage T7 is a podovirus that infects E. coli. T7 has a short non-contractile tail made up of three proteins, including the tail fiber encoded by gp17. Each of the six tail fibers is a homotrimer composed of a relatively rigid shaft ending with a β-sandwich tip domain connected by a short loop (Garcia-Doval and van Raaij, 2012). The tip domain is likely the very first region of the tail fiber to interact with host LPS and position the phage for successful, irreversible binding with the host (González-García et al., 2015; Molineux, 2001; Qimron et al., 2006). The tip domain is a major determinant of host range and activity and is often naturally exchanged between phages to readily adapt to new hosts (Fraser et al., 2006; Fraser et al., 2007; Lin et al., 2012). Even single amino acid substitutions to this domain are sufficient to alter host range between E. coli strains (Heineman et al., 2008). Due to its critical functional role, we chose the tip domain to comprehensively characterize phage activity and host range by DMS.

We generated a library of 1660 single mutation variants of the tip domain, prespecified as chip-based oligonucleotides, where all 19 non-synonymous and 1 nonsense substitution were made at each codon spanning residue positions 472–554 (Figure 2A, residue numbering based on PDB 4A0T). Using ORACLE, the library was inserted into T7 to generate variants to be selected and deep sequenced (Figure 2B) on three laboratory E. coli hosts: B strain derivative BL21, K-12 derivative BW25113, and DH10B derivative 10G. Each variant was given a functional score, F, based on the ratio of their relative abundance before and after selection consisting of an estimated four infection cycles, which was then normalized to wildtype to yield F_N, where wildtype F_N = 1 (Figure 2C–E, see Materials and methods). Selection on each host gave excellent correlation across biological triplicates (Figure 2—figure supplement 1). To validate the functional relevance of the screen, we hypothesized that the flexible C-terminal end (residue positions 552–554 and a three-residue extension if the stop codon is substituted) is unlikely to have any structural or host recognition role. As expected, these positions broadly tolerated nearly all substitutions across all three hosts, indicating that the functional scores likely reflect true biological effects (Figure 2C–E).
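
The library size and the scoring scheme described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function names, the `"WT"` key, and the toy read counts are all assumptions, and the exact scoring procedure is the one given in Materials and methods. The sketch assumes F is the ratio of a variant's fractional abundance after selection to its fractional abundance before selection, and F_N is F divided by the wildtype F.

```python
# Illustrative sketch only: names and toy counts are assumptions,
# not the authors' analysis code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
STOP = "*"


def enumerate_substitutions(wildtype):
    """List every single substitution (19 amino acids + stop) per position.

    `wildtype` maps residue number -> wildtype one-letter amino acid.
    """
    variants = []
    for position in sorted(wildtype):
        wt_residue = wildtype[position]
        for residue in AMINO_ACIDS + STOP:
            if residue != wt_residue:
                variants.append(f"{wt_residue}{position}{residue}")
    return variants


def normalized_scores(pre_counts, post_counts, wildtype_key="WT"):
    """Return F_N = F / F_wildtype, with F = post fraction / pre fraction."""
    pre_total = sum(pre_counts.values())
    post_total = sum(post_counts.values())
    raw = {}
    for variant, pre in pre_counts.items():
        if pre == 0:
            continue  # a variant absent before selection cannot be scored
        pre_fraction = pre / pre_total
        post_fraction = post_counts.get(variant, 0) / post_total
        raw[variant] = post_fraction / pre_fraction
    wildtype_f = raw[wildtype_key]
    return {variant: f / wildtype_f for variant, f in raw.items()}


# Library size implied by the text: positions 472-554, 20 substitutions each.
n_positions = 554 - 472 + 1
library_size = n_positions * 20

# Toy read counts (made up) for wildtype and two hypothetical variants.
pre = {"WT": 100, "variant_a": 100, "variant_b": 100}
post = {"WT": 200, "variant_a": 600, "variant_b": 0}
scores = normalized_scores(pre, post)
```

With these toy counts, `variant_a` is three times more enriched than wildtype and `variant_b` is lost entirely. Calling `enumerate_substitutions({501: "N", 542: "R"})` yields 40 names, 20 for each of the two positions.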

Figure 2 (with 3 supplements)

Deep mutational scanning of tip domain shows phage adaptation at molecular resolution.

(A) Crystal structure and secondary structure topology of the tip domain color coded as interior loops (red), β-sheets (beige), and exterior loops (blue). (B) Functional analysis of variants by comparing their abundances pre- and post-selection on a host. (C–E) Heat maps showing normalized functional scores (F_N) of all substitutions (red gradient) and wildtype amino acid (F_N = 1 and black dot upper left) at every position for E. coli 10G (C), BL21 (D), and BW25113 (E). Residue numbering (based on PDB 4A0T), wildtype amino acid, and secondary structure topology are shown above left to right, substitutions listed top to bottom. (F) Parallel plot showing F_N for enriched (F_N ≥ 2) variants on 10G, BL21, and BW25113. Coloring indicates enrichment only on 10G (grey), only on BL21 (red), only on BW25113 (blue), and on both 10G and BL21 (green). Connecting lines indicate F_N of the same variant on other hosts.

Figure 2—source data 1 Deep sequencing summary for the phage variant library after selection on different hosts. Related to Figure 2C–E.: https://cdn.elifesciences.org/articles/63775/elife-63775-fig2-data1-v2.xlsx
Figure 2—source data 2 Variant-specific F_N for phage variants after selection on E. coli 10G, BL21, and BW25113 and physicochemical statistics. Related to Figure 2C–F, Figure 2—figure supplement 2, and Figure 3.: https://cdn.elifesciences.org/articles/63775/elife-63775-fig2-data2-v2.xlsx

We compared the activities of phage variants across hosts to assess their fitness and evolutionary adaptation to each host. Between the three hosts, T7 variants appeared most and least adapted to BW25113 and 10G, respectively, as evidenced by the fraction of depleted variants (F_N < 0.1) after selection on each host (10G: 0.66 ± 0.03; BL21: 0.59 ± 0.01; and BW25113: 0.51 ± 0.01; all significantly different from each other with p<0.05) (Figure 2—figure supplement 2A–C). Furthermore, wildtype T7 fared relatively poorly on 10G (F = 0.77 ± 0.05), indicating a fitness impediment, but performed significantly better on BL21 (F = 2.92 ± 0.2, p < 0.01) and BW25113 (F = 2.26 ± 0.1, p < 0.01) (Figure 2—source data 1). The fitness impediment gave many more variants competitive advantage, resulting in greater enrichment (F_N > 2) over wildtype on 10G (48 variants) compared to BL21 (2 variants) and BW25113 (16 variants) (Figure 2—figure supplement 2A–C). In fact, the best performing variants on 10G were 10 times more enriched than wildtype, suggesting substantially higher activity (Figure 2F, Figure 2—figure supplement 2D). Examining enriched variants on each host (F_N > 2) provides compelling evidence of the tradeoff between activity and host range (Figure 2F, Figure 2—figure supplement 2E). The top ranked variants on each host were remarkably distinct from those on other hosts (except G479Q shared between 10G and BL21). Hierarchical clustering of F_N across all three hosts revealed grouping of similar variants that performed better selectively on some hosts but not others (Figure 2—figure supplement 3). No variant performed exceptionally well on all hosts (F_N > 2, Figure 2F); however, 406 variants were tolerated on all three hosts (Figure 2—source data 2). Thus, specialization toward a host comes at the cost of sacrificing breadth, mirroring observations made of natural phage populations (Elena et al., 2009).
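
The cross-host comparison above reduces to a few threshold queries on per-host F_N tables. The following Python sketch shows one way to express them, using the thresholds quoted in the text (F_N < 0.1 for depleted, F_N > 2 for enriched). The function names, host labels, variant labels, and scores are hypothetical and are not taken from the source data.

```python
# Illustrative sketch only: toy scores and all names are assumptions.
DEPLETED_BELOW = 0.1  # F_N < 0.1 counts as depleted
ENRICHED_ABOVE = 2.0  # F_N > 2 counts as enriched


def fraction_depleted(scores):
    """Fraction of variants with F_N below the depletion threshold."""
    return sum(1 for s in scores.values() if s < DEPLETED_BELOW) / len(scores)


def enriched(scores):
    """Set of variants with F_N above the enrichment threshold."""
    return {v for v, s in scores.items() if s > ENRICHED_ABOVE}


def enriched_on_all(by_host):
    """Variants enriched on every host."""
    return set.intersection(*(enriched(s) for s in by_host.values()))


def tolerated_on_all(by_host):
    """Variants that are not depleted on any host."""
    per_host = [
        {v for v, s in scores.items() if s >= DEPLETED_BELOW}
        for scores in by_host.values()
    ]
    return set.intersection(*per_host)


# Toy F_N tables for two hypothetical hosts and four hypothetical variants.
toy = {
    "host_1": {"v1": 5.0, "v2": 0.05, "v3": 1.0, "v4": 0.0},
    "host_2": {"v1": 0.02, "v2": 3.0, "v3": 0.8, "v4": 0.0},
}
```

In this toy example each host has a different top variant, no variant is enriched on both hosts, and only `v3` is tolerated on both, which mirrors the qualitative tradeoff described in the text.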

We investigated the global physicochemical properties and topological preferences of substitutions after selection on each host (Figure 2—figure supplement 2F–H). On 10G, there was enrichment of larger and more hydrophilic amino acids and depletion of hydrophobic amino acids (all p < 0.001, r > 0.12), which is visually striking on the heatmap (see R, K, and H substitutions in Figure 2C). In contrast, no significant enrichment or depletion was observed on BL21 (Figure 2—figure supplement 2F–H). This is consistent with our earlier observation that wildtype T7 is generally well adapted to BL21, since BL21 had the fewest variants outperforming wildtype. We reasoned that since BL21 has been used to propagate T7, the phage may have already adapted well to this host over time. On BW25113, hydrophobic residues were modestly enriched (all p < 0.034, r > 0.07) (Figure 2—figure supplement 2H), a trend opposite to 10G. This provides a molecular explanation as to why high-scoring substitutions on one host fare poorly on others (Figure 2F). We mapped positions of enriched substitutions (F_N ≥ 2) on each host onto the structure to determine topologically distinct patterns of substitution that may be masked in global comparisons of the entire tip domain (Figure 2—figure supplement 2E). These fall predominantly on four exterior loops (BC, DE, FG, and HI), the adjoining region (β-strand H) close to exterior loop HI, and less frequently on the ‘side’ of the tip domain. This suggests directionality to phage–bacterial interactions and orientational bias of the tip domain with respect to the bacterial surface. Directionality and orientational bias are particularly valuable information since no high-resolution structure of this phage bound to receptor exists.

Several key lessons emerged from these host screens. First, single amino acid substitutions alone can generate broad functional diversity, highlighting the evolutionary adaptability of the RBP. Second, T7 can be optimized and activity can be increased, even on hosts that T7 is already considered to grow well on. Third, enrichment patterns on each host follow broad trends but have nuance at each position.

Comparison across hosts reveals regions of functional importance

Next, we sought to elucidate features of each residue unique to each host or common across all hosts. There were over 30 residues with contrasting substitution patterns between different hosts, revealing fascinating features of receptor recognition for T7 (Figure 2—source data 2). Here, we focus on five of these residues, N501, R542, G479, D540, and D520, which showed starkly contrasting patterns of selection (Figure 3A). N501 and R542 are located on exterior loops oriented away from the phage and toward the receptor (Figure 3C). In fact, R542 forms a literal ‘hook’ to interact with the receptor (Garcia-Doval and van Raaij, 2012). On 10G and BL21, only positively charged residues (R, K, and H) were tolerated at residues 501 and 542, while in contrast many more substitutions were tolerated at both residues on BW25113. One such substitution, R542Q, is the best performing variant on BW25113 (F_N = 3.31) but is conspicuously depleted on 10G and BL21, suggesting that even subtle molecular disparities can lead to large biases in activity. The substitution profiles of G479 and D540 are loosely the inverse of N501 and R542 as many substitutions are tolerated on BL21 and 10G, but very few are tolerated on BW25113 (Figure 3A). We hypothesize that D540 is critical for host recognition on BW25113. Since D540, a receptor-facing position on an exterior loop, is only 6 Å from G479, it is likely that any substitution at G479 may sterically hinder D540, resulting in the noted depletion of G479 substitutions on BW25113. This hypothesis is further supported by enrichment of adjacent S541D on BW25113 (F_N = 2.82, the third highest scoring substitution), while this substitution is depleted on 10G and BL21 (Figure 2—source data 2). D520 displays a third variation in substitution patterns where substitutions are generally tolerated on 10G and BW25113, but not tolerated on BL21 (Figure 3A). 
This loop is also oriented downward toward the receptor, and we hypothesize that D520 or the local region around this exterior loop is more important for receptor recognition in BL21 than it is for the other two hosts, mirroring the result for D540 for BW25113. Another stark contrast can be drawn at adjacent S519, where no substitutions are tolerated in BL21 or 10G but several substitutions are enriched on BW25113, indicating that substitutions can improve receptor binding on one host while reducing function on another host. Overall, these host-specific substitution patterns reveal a nuanced relationship between the tip domain composition and receptor preferences.

Figure 3 (with 3 supplements)

Comparison across hosts reveals regions of functional importance.

(A) Host-specific differences in substitution patterns at five positions in the tip domain recapitulated from Figure 2. (B) Role of each position determined by aggregating scores of all substitutions in all hosts at that position. Substitutions are classified as intolerant (F_N < 0.1 in all hosts), tolerant (F_N ≥ 0.1 in all hosts), or functional (F_N < 0.1 in one host, F_N ≥ 0.1 in another host) and bar plots are shown as proportion of classified variants at that position. (C) Crystal structure of the tip domain (center) with each residue colored as intolerant, tolerant, or functional based on the dominant effect at that position, β-sheets and residues listed in (A) are labeled. Key interactions defining function and orientation are highlighted in peripheral panels.

Figure 3—source data 1 Functional comparison for each variant on susceptible hosts. Related to Figure 3B, C.: https://cdn.elifesciences.org/articles/63775/elife-63775-fig3-data1-v2.xlsx

We quantitatively characterized the role of every residue by integrating selection data across all hosts to reveal a functional map of the tip domain at granular resolution (Figure 3B, C, Figure 3—source data 1). We classified every residue as ‘intolerant’, ‘tolerant’, or ‘functional’ based on aggregated F_N scores of all substitutions across all three hosts at every residue. Our method of classifying functional regions was robust to adjusting the F_N threshold used to identify functional variants (Figure 3—figure supplement 2). Residues where the majority of substitutions were depleted were considered intolerant to substitution, while residues where at least a third of substitutions were depleted in one host and tolerated or enriched in another host were considered functional; the remaining positions were considered tolerant (see Materials and methods). The hydrophobic core comprising W474, I495, W496, I497, Y515, W523, L524, F526, I528, F535, and I548 is essential for stability and therefore is highly intolerant to substitutions (Figure 3B). Other intolerant positions include an elaborate network of salt bridge interactions involving D489, R491, R493, R508, and D512 in the interior loops, which likely constrain the orientation of the tip domain relative to the shaft (Figure 3C). Glycines generally provide conformational flexibility between secondary structure elements and normally tend to be mutable. Interestingly, several glycines (G476, G510, G522, and G532) are highly intolerant to substitutions. These glycines may be essential to minimize steric obstruction to adjacent larger residues, similar to G479 and D540 on BW25113 (Figure 3C). For example, G510 and G532 may facilitate formation of salt bridges in the interior loop, while G476 and G522 may facilitate a required receptor interaction in exterior loops for all three hosts.
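
The classification rule described above can be written as a short Python sketch. It is a hedged reading of the text rather than the authors' code: it assumes a substitution is ‘intolerant’ when F_N < 0.1 on every host, ‘tolerant’ when F_N ≥ 0.1 on every host, and ‘functional’ otherwise (as in the Figure 3 legend), and it assumes the position-level checks are applied in the order intolerant, then functional, then tolerant. The function names and the default threshold argument are illustrative.

```python
# Illustrative sketch only: the precedence of the checks is an assumption.
def classify_substitution(scores_by_host, threshold=0.1):
    """Label one substitution from its F_N scores on each host."""
    depleted = [score < threshold for score in scores_by_host]
    if all(depleted):
        return "intolerant"
    if not any(depleted):
        return "tolerant"
    return "functional"  # depleted on some hosts, tolerated on others


def classify_position(substitutions, threshold=0.1):
    """Label one residue position from all of its substitutions.

    `substitutions` is a list of per-host F_N tuples, one per substitution.
    """
    labels = [classify_substitution(s, threshold) for s in substitutions]
    total = len(labels)
    if labels.count("intolerant") * 2 > total:  # majority depleted
        return "intolerant"
    if labels.count("functional") * 3 >= total:  # at least a third
        return "functional"
    return "tolerant"


# Toy positions (made-up scores on three hosts, 20 substitutions each).
core_like = [(0.0, 0.0, 0.0)] * 15 + [(1.0, 1.0, 1.0)] * 5
loop_like = [(0.0, 1.0, 1.0)] * 7 + [(1.0, 1.0, 1.0)] * 13
permissive = [(1.0, 1.0, 1.0)] * 18 + [(0.0, 1.0, 1.0)] * 2
```

Passing a different `threshold` to `classify_position` is one simple way to probe how sensitive the labels are to the F_N cutoff.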

It has been previously assumed that exterior loops are the primary functional region of the tip domain (Garcia-Doval and van Raaij, 2012; Yehl et al., 2019). We found that functional positions did typically point outward and are densely concentrated along exterior loops BC, DE, FG, and HI, as well as adjacent β-sheet residues. This is consistent with two specificity-switching substitutions found in a previous study, D520Q and V544A, which are both located in exterior loops (Heineman et al., 2008). However, several residues in exterior loops, such as G476 and S543, were notably intolerant, indicating that these residues may be poor targets for engineering or future combinatorial studies. Functional positions were also found in regions other than exterior loops, such as I514, Q527, and K536, which are β-sheet residues located along one side of the tip domain (Figure 3C). This suggests the phage can use the ‘side’ of the tip domain to engage the receptor, increasing the apparent functional area of the tip domain and highlighting several new regions as valuable engineering targets.

We also asked whether the functionally important regions could be predicted computationally, because the ability to predict such regions without DMS could rapidly accelerate engineering efforts. We used Rosetta, a state-of-the-art protein modeling software, to calculate the change in Gibbs free energy (ΔΔG) for each of the 1660 mutations and compared this distribution to our DMS results (Figure 3—figure supplement 1, also see Materials and methods). We then generated a truth table to summarize the predictions against our functional data (Figure 3—figure supplement 3). Predicted thermodynamic changes in stability mapped very well, with over 93% of tolerated or functional positions having a substitution that was predicted to be stabilizing. The remaining 7% of tolerant or enriched substitutions were predicted to be destabilizing, and we hypothesize that these substitutions may improve dynamic or induced-fit positioning of the tip domain for productive infection. Incorporating stability estimations could further improve the engineering power of the assay. For example, substitutions predicted to be stable but that are intolerant in the DMS assay may indicate that the substituted residue is necessary for all three hosts.
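A truth table of this kind can be sketched as a simple cross-tabulation of predicted stability against observed tolerance. The sketch below is an assumption-laden illustration: both cutoffs, the labels, and the function name `truth_table` are hypothetical and are not taken from the study's Rosetta protocol or R scripts.

```python
from collections import Counter
from typing import Iterable, Tuple

# Both cutoffs are illustrative assumptions, not the paper's values.
DDG_DESTABILIZING_ABOVE = 0.0  # ddG > 0 is treated as predicted destabilizing
FN_DEPLETED_BELOW = 0.1        # F_N below this is treated as depleted


def truth_table(records: Iterable[Tuple[float, float]]) -> Counter:
    """Count (prediction, observation) pairs for (ddG, F_N) records."""
    table = Counter()
    for ddg, f_n in records:
        predicted = "destabilizing" if ddg > DDG_DESTABILIZING_ABOVE else "stable"
        observed = "depleted" if f_n < FN_DEPLETED_BELOW else "tolerated"
        table[(predicted, observed)] += 1
    return table
```

In this layout, the cell `("stable", "depleted")` collects the substitutions highlighted at the end of the paragraph: predicted to fold properly yet lost in selection, which may point to a residue needed for function rather than stability.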

Overall, these results paint a complex enrichment profile for each host with some broad trends but subtle host-specific effects. These results suggest that exterior loops and some outward-facing positions in β-sheets act as a reservoir of function-switching and function-enhancing mutations, likely promoting host-specific and orientation-dependent interactions between phage and bacterial receptors. Functional positions identified by this comparison are ideal engineering targets to customize host range and activity.

Discovery of gain-of-function variants against resistant hosts

The tail fiber is considered a reservoir of gain-of-function variants due to its principal role in determining fitness of a phage through host adsorption (Holtzman et al., 2020; Yehl et al., 2019). We hypothesized that novel gain-of-function variants against a resistant host could be discovered by subjecting our tail fiber variant library to selection on a resistant host. To identify a resistant host, we focused on host genes rfaG and rfaD involved in the biosynthesis of surface LPS, which is a known receptor for T7 in E. coli (González-García et al., 2015; Molineux, 2001; Qimron et al., 2006). Gene rfaG (synonyms WaaG or pcsA) transfers glucose to the outer core of LPS and deletion strains lack the outer core of LPS (Pagnout et al., 2019), while rfaD (synonyms gmhD or WaaD) encodes a critical epimerase required for building the inner core of LPS (Valvano et al., 2002; Figure 4A). Deletion of either gene reduces the ability of T7 to infect E. coli by several orders of magnitude (Figure 4F). We challenged the library of T7 variants against E. coli deletion strains BW25113ΔrfaG and BW25113ΔrfaD through pooled selection and deep sequencing as before (Figure 2) and determined an F_N score for each substitution on both strains (Figure 4B, C). Independent replicates showed good correlation for BW25113ΔrfaG (R = 0.99, 0.93, 0.93) but only adequate correlation for BW25113ΔrfaD (R = 0.51, 0.68, 0.39) (Figure 4—figure supplement 1). Although the scale of F_N was inconsistent across replicates on BW25113ΔrfaD, the same substitutions were largely enriched in all three replicates, suggesting reproducibility of results (Figure 4—figure supplement 3). Inconsistencies in F_N scores may arise due to severe loss of diversity causing stochastic differences in enrichment to become magnified across independent experiments and the four infection cycles used for selection. 
Separately, we examined correlation after selection using only a single infection cycle, which produced more highly correlated results for BW25113ΔrfaD (R = 0.89, 0.90, 0.89) (Figure 4—figure supplement 4), indicating that fewer infection cycles may be ideal for future work with highly resistant hosts.

Figure 4

Discovery of gain-of-function variants against resistant hosts.

(A) Schematic view of the LPS on wildtype BW25113, BW25113Δ*rfaG* and BW25113Δ*rfaD*. (B-C) Heat maps showing normalized functional scores (F_N) of all substitutions (red gradient) and wildtype amino acid (F_N = 1 and black dot upper left) at every position for BW25113Δ*rfaG* (B) and BW25113Δ*rfaD* (C). (D-E) Among highly enriched variants (F_N ≥ 10), targeted amino acids (left), their substitutions (middle) and topological location on the structure (right) on BW25113Δ*rfaG* (D) and BW25113Δ*rfaD* (E), with each alluvial colored based on the substituted amino acid and scaled by F_N. (F) EOP (mean ± SD, biological triplicates) for wildtype phage and select variants on BW25113 (Wild Type), BW25113Δ*rfaG* and BW25113Δ*rfaD* in exponential (dark gray) and stationary phases (light gray) using BW25113 as a reference host.

Figure 4—source data 1. Deep sequencing summary for phage variant library after selection on different hosts. Related to Figure 4B, C. https://cdn.elifesciences.org/articles/63775/elife-63775-fig4-data1-v2.xlsx
Figure 4—source data 2. Variant-specific F_N for phage variants after selection on E. coli BW25113ΔrfaG and BW25113ΔrfaD and physicochemical statistics. Related to Figure 4B, C and Figure 4—figure supplement 4. https://cdn.elifesciences.org/articles/63775/elife-63775-fig4-data2-v2.xlsx

We engineered several gain-of-function T7 variants that could infect both deletion strains with activity comparable to wildtype T7 infecting susceptible BW25113 (Figure 4F). Low-sequence diversity and high enrichment scores of T7 variants indicate a strong selection bottleneck, which is consistent with diminished activity of wildtype T7 on the deletion strains. This is reflected in the significantly lower functional score of wildtype T7 on BW25113ΔrfaG and BW25113ΔrfaD (F = 0.09 ± 0.3 and F = 0.03 ± 0.2, respectively) in comparison to BW25113 (F = 2.26 ± 0.1, p < 0.001) (Figure 2—source data 1). The number of enriched variants outperforming wildtype T7 (F_N ≥ 2) on the deletion strains (BW25113ΔrfaG: 55 variants, 3.3% and BW25113ΔrfaD: 68 variants, 4.1%) was over three times higher than BW25113 (16 variants, 1%) but comparable to 10G (48 variants, 2.9%) (Figure 4—figure supplement 2A, B). However, the enrichment scores of top performing variants such as G521H and G521R on BW25113ΔrfaG and S541K and N501H on BW25113ΔrfaD were over 100 times greater than wildtype T7, suggesting strong gain of function on the deletion strains (Figure 4—figure supplement 2C). Of the 78 variants with F_N ≥ 2 on either deletion strain, 45 variants had F_N ≥ 2 on both strains, indicating that variants that performed well on one strain typically performed well on the other strain. This implies that the enriched variants may have broad affinity for truncated LPS but cannot discriminate based on the length of the LPS. Nonetheless, hydrophilic substitutions were more strongly enriched on BW25113ΔrfaG (p < 0.001, r > 0.11), but not as significantly on BW25113ΔrfaD (p < 0.033, r < 0.10), suggesting subtle differences in surface chemical properties of deletion strains leading to host-specific enrichments (Figure 4—figure supplement 2D–F). 
Indeed, there were several variants with contrasting F_N scores on the two strains, such as S541T (BW25113ΔrfaD F_N = 44.8, BW25113ΔrfaG F_N = 0.6) and G521E (BW25113ΔrfaD F_N = 0, BW25113ΔrfaG F_N = 17.4), suggesting potential host preference. Most substitutions were concentrated in the exterior loops BC, FG, HI, and β-strand H, all pointing downward toward the bacterial surface, reinforcing the functional importance of these regions of the tip domain (Figure 4D, E). Notably, the most enriched variants had large positively charged substitutions (K, R, and H) akin to the enrichment pattern on 10G, suggesting that the bacterial surface of these truncated mutants likely resembles that of 10G. Our results are consistent with a recent continuous evolution study, which identified G480E and G521R as possible gain-of-function variants on a strain similar to BW25113ΔrfaD and G479R and G521S as possible gain-of-function variants on BW25113ΔrfaG (Holtzman et al., 2020), although these variants only represent a small fraction of the gain-of-function variants discovered in our study.
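The set comparison behind these counts can be illustrated with a short Python sketch. It uses only the F_N values quoted above for S541T and G521E and the reported variant counts; the helper name `enriched` and the dictionary layout are assumptions made for illustration, not the study's analysis code.

```python
ENRICHED_AT = 2.0  # F_N cutoff used in the text for outperforming wildtype


def enriched(f_n_by_variant):
    """Return the set of variants with F_N >= ENRICHED_AT."""
    return {v for v, f_n in f_n_by_variant.items() if f_n >= ENRICHED_AT}


# F_N values quoted in the text for two variants with contrasting behavior.
rfaG = {"S541T": 0.6, "G521E": 17.4}
rfaD = {"S541T": 44.8, "G521E": 0.0}

only_rfaG = enriched(rfaG) - enriched(rfaD)  # enriched on the rfaG deletion only
only_rfaD = enriched(rfaD) - enriched(rfaG)  # enriched on the rfaD deletion only
both = enriched(rfaG) & enriched(rfaD)       # enriched on both strains

# Consistency check on the reported counts: |A or B| = |A| + |B| - |A and B|
n_rfaG, n_rfaD, n_both = 55, 68, 45
n_either = n_rfaG + n_rfaD - n_both
```

With the reported counts, `n_either` evaluates to 78, which agrees with the number of variants stated to have F_N ≥ 2 on either deletion strain.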

We validated the results of the pooled selection experiment by clonally testing the ability of phage variants with high F_N (A539R, G521H, and D540S) to plaque on the deletion strains based on a standard efficiency of plating (EOP) assay. Indeed, EOP results showed significant gain of function in these variants on the deletion strains (Figure 4F). D540S was particularly noteworthy as it performed better on the deletion strain BW25113ΔrfaG than on wildtype BW25113 by 1–2 orders of magnitude. Based on these results, we conclude that D540 is critical for infecting wildtype BW25113 (Figure 3) likely by interacting with the outer core of LPS. When the outer core of the LPS is missing (BW25113ΔrfaG), a substitution at this position becomes necessary for adsorption either to a different LPS moiety or to an alternative receptor.
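For readers unfamiliar with the EOP readout, the arithmetic can be sketched as follows: a titer is computed from a plaque count, the dilution plated, and the plated volume, and EOP is the titer on the test host divided by the titer on the reference host. The function names and the toy plaque counts are assumptions for illustration and are not data from this study.

```python
import math


def titer_pfu_per_ml(plaques: int, dilution: float, volume_ml: float) -> float:
    """Phage titer from a plaque count at a given dilution and plated volume."""
    if dilution <= 0 or volume_ml <= 0:
        raise ValueError("dilution and volume must be positive")
    return plaques / (dilution * volume_ml)


def eop(titer_test: float, titer_reference: float) -> float:
    """Efficiency of plating: test-host titer relative to the reference host."""
    if titer_reference <= 0:
        raise ValueError("reference titer must be positive")
    return titer_test / titer_reference


# Toy numbers (not from the paper): 30 plaques at a 1e-6 dilution on the
# reference host, 30 plaques at a 1e-4 dilution on the test host, 0.1 mL plated.
reference = titer_pfu_per_ml(30, 1e-6, 0.1)
test = titer_pfu_per_ml(30, 1e-4, 0.1)
log10_eop = math.log10(eop(test, reference))  # about -2, a hundredfold drop
```

Reporting EOP on a log10 scale makes statements such as "1–2 orders of magnitude" directly readable as differences of 1 to 2 units.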

We introduced a stop codon at every position to systematically evaluate the function of tip domains truncated to different lengths. Many truncated variants performed well, especially on BW25113ΔrfaG, which included some with F_N ≥ 10 (Figure 4—source data 2). Truncated variants that performed well are distributed throughout the tip domain and are not localized to any one region (Figure 4—figure supplement 5). We clonally tested variant R525*, the best performing truncated library member (BW25113ΔrfaG F_N = 9.55, BW25113ΔrfaD F_N = 75.7), and found that this mutant showed no ability to plaque on any host unless provided the tail fiber in trans. These truncated phages, detectable here only using deep sequencing, may demonstrate how obligate lytic phages could become less active in a bacterial population, slowly replicating alongside their bacterial hosts, requiring only a single mutation to become fully active again. In fact, acceptor phages altogether lacking a tail fiber were present at extremely low abundance (Figure 4—source data 1). These phages are not artifacts from library creation as some ability to replicate is required to produce detectable concentrations of each phage. We concluded that these are viable phage variants albeit with a much slower infection cycle, resulting in their inability to form visible plaques.

Targeting pathogenic E. coli causing UTIs using T7 variants

Phage therapy is emerging as a promising solution to the antibiotic resistance crisis. Recent clinical success stories against multidrug-resistant Acinetobacter and Mycobacterium showcase the enormous potential of phage therapy (Dedrick et al., 2019; Schooley et al., 2017). Despite notable exceptions, in general development of effective phage-based therapeutics is hindered by onset of bacterial resistance, resulting in low phage susceptibility. Although initial application of phages in a laboratory setting may reduce bacterial levels, the residual bacterial load remains high, causing bacteria to quickly recover after phage application (Fister et al., 2016; Huss and Raman, 2020; Silva et al., 2014). A high ratio of phage to bacteria (multiplicity of infection [MOI]) may productively kill bacteria in a laboratory setting by overwhelming a host with many phages (Abedon, 2011). However, ensuring an overwhelming amount of phages in a clinical setting is not always feasible (Principi et al., 2019). Engineering highly active phages that overcome bacterial insensitivity and can therefore productively eliminate bacterial populations at low MOI in a laboratory setting would greatly enhance phage-based therapeutics. We hypothesized that engineered tip domain variants may abate bacterial insensitivity and be active even at low MOI by better adsorbing to the native receptor or recognizing a new bacterial receptor altogether.

To test this hypothesis, we chose pathogenic E. coli strain UTI473, isolated from a patient with a UTI (Arthur et al., 1990). Although T7 can infect this UTI strain, insensitivity arises rapidly, a phenomenon all too common with the use of natural phages. EOP assays for wildtype T7 showed insensitive plaque morphology consisting of small, slow-growing plaques. No visible lysis was detected after overnight incubation when wildtype T7 was applied in liquid culture (MOI = 1), indicating onset of insensitivity. However, the variant library applied on the UTI strain cleared the culture (MOI = 1) after overnight incubation, suggesting that T7 variants are capable of lysing and attenuating insensitivity existing in the pool. We clonally characterized three variants (N501H, D520A, and G521R) isolated from plaques. All three variants vastly outperformed wildtype T7 in terms of onset of insensitivity. Insensitivity emerged approximately 11–13 hr after initial lysis for the three variants, whereas it took merely 1–2 hr after initial lysis for wildtype (Figure 5A). In particular, the N501H variant lysed cells faster and produced a lower bacterial load post lysis, suggesting far greater activity compared to wildtype T7. Next, we compared the effect of phage MOI (MOI = 10² to 10⁻⁵) on the lysing activity of N501H and wildtype T7 (Figure 5B). At all MOIs, wildtype phages lysed UTI473 significantly more slowly compared to N501H phages (all p < 0.05). At lower MOI, time to lysis of N501H was half that of wildtype T7, though they were more comparable at higher MOI.

Figure 5

Targeting pathogenic *E. coli* causing UTIs using T7 variants.

(A) Growth time course of UTI473 strain subject to wildtype T7 and select variants. Phages were applied after an hour at an MOI of ~10⁻². (B) Estimated time to lysis of UTI473 strain incubated with wildtype T7 and N501H variant over a range of MOIs, derived from time course experiments. (C) Cell density (OD₆₀₀) of UTI473 strain when incubated with wildtype T7 and N501H variant at select timepoints after initial lysis. All data represented as mean ± SD of biological triplicate.

A striking contrast between N501H and wildtype T7 is evident in reduced bacterial insensitivity at progressively lower MOI (Figure 5C). Between an MOI of 100 and 1, application of both N501H and wildtype phage resulted in similar bacterial insensitivity. However, between an MOI of 10⁻¹ and 10⁻⁵, application of N501H phage reduced insensitivity over a 10 hr window, while application of wildtype phages resulted in rapid onset of insensitive bacteria. We postulate that at high MOI wildtype T7 simply overwhelms the host before insensitivity arises, while at lower MOI insensitivity can emerge and only variants adapted to the host can effectively kill the host. These results indicate that ORACLE can generate phage variants superior to wildtype phage that could then become starting points for further engineering therapeutic phages. Further experiments will be required to assess the in vivo efficacy of the T7 variants.

Host range constriction emerges from global comparison across variants

Most phages are specialists that selectively target a narrow range of hosts but are unable to productively infect other closely related hosts (Hyman and Abedon, 2010). We wanted to assess differences in the host range of individual variants on 10G, BL21, and BW25113 and identify variants with constricted host ranges. Ideally, host specificities can be determined by subjecting a co-culture of all three hosts to the phage library. However, deconvolving specificities of thousands of variants from a pooled co-culture experiment can be technically challenging. Instead, we sought to estimate specificities by comparing F_N of a phage variant on all three hosts. Although F_N compares activity of variants within a host, it could nonetheless be a useful proxy for estimating specificities across hosts. For instance, a phage variant with high F_N on BL21 but completely depleted on BW25113 is more likely to specifically lyse BL21 than BW25113 in a co-culture experiment. Based on this rationale, we considered different metrics of comparison of F_N and settled on difference in F_N of a variant with reduced weight for enrichment (or F_D, see Materials and methods) between any two hosts as an approximate measure of host preference. This metric is not an absolute measure of host specificity, but one devised to reveal broad trends in specificity to prioritize variants for downstream validation.

To assess if variants preferred one host over another, we computed F_D for all three pairwise combinations and plotted functional substitutions as points on or above/below a ‘neutral’ line (Figure 6A–C). Variants above the line favor lysis of the noted host, and vice versa for variants below the line. To check if this F_D-based approach is suitable for assessing host specificity, we compared our results with previously published data. Two substitutions, D520Q and V544A, that were reported to have a preference for BW25113 and BL21, respectively, in head-to-head comparisons (Heineman et al., 2008) were placed correctly in our plots, confirming the validity of our F_D-based classification scheme. We identified 118 out of 1660 variants as good candidates for constricting host range (|F_D| ≥ 1, see Figure 6—source data 1). Of the 118 variants, 53 variants favor BW25113 over BL21 and 98 variants favor BW25113 over 10G in pairwise comparisons (Figure 6A, C). Between BL21 and 10G, there are 15 variants that favor BL21 but none that favor 10G (Figure 6B).
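The idea of a pairwise preference score with reduced weight for enrichment can be sketched as follows. The exact F_D conversion is defined in Materials and methods and is not reproduced here; the `compress` function below is a hypothetical stand-in (a log-scaled, capped transform chosen only for illustration), and all function names are assumptions. Only the |F_D| ≥ 1 cutoff comes from the text.

```python
import math


def compress(f_n: float) -> float:
    """Hypothetical down-weighting of enrichment (NOT the paper's formula).

    Scores at or below wildtype (F_N <= 1) are kept as-is; scores above
    wildtype grow only logarithmically and are capped at 2.
    """
    if f_n < 0:
        raise ValueError("F_N must be non-negative")
    if f_n <= 1.0:
        return f_n
    return min(2.0, 1.0 + math.log10(f_n))


def host_preference(f_n_host_a: float, f_n_host_b: float) -> float:
    """Positive values favor host A, negative values favor host B."""
    return compress(f_n_host_a) - compress(f_n_host_b)


def constricts_host_range(f_n_host_a: float, f_n_host_b: float,
                          cutoff: float = 1.0) -> bool:
    """Flag a variant whose preference magnitude reaches the cutoff."""
    return abs(host_preference(f_n_host_a, f_n_host_b)) >= cutoff
```

Compressing enrichment keeps a few very highly enriched variants from dominating the comparison, so a large score mostly reflects depletion on one host paired with at least wildtype-like activity on the other.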

Figure 6

Host range constriction emerges from global comparison across variants.

(A-C) Pairwise comparison of differences in functional scores of variants between hosts (see Methods). Variants above the line favor lysis of host noted above the line, and vice versa for variants below the line. (D) EOP (mean ± SD, biological triplicates) for wildtype T7 and select variants on BW25113, 10G and BL21 in exponential (dark gray) and stationary phases (light gray) using exponential 10G with gp17 tail fiber helper plasmid (10G_H) as a reference host. R542Q plaques are atypically small until EOP ~10⁻².

Figure 6—source data 1. ΔΔG and F_D conversion for all variants. https://cdn.elifesciences.org/articles/63775/elife-63775-fig6-data1-v2.xlsx

Certain key positions, including G479, D540, R542, and D520, which we previously identified as functionally important (Figure 3A), are the molecular drivers of specificity between hosts (Figure 6A–C). Taken together, our data suggest that it would be easier to find a variant capable of specifically lysing BW25113, less so for BL21, and most challenging for 10G.

To validate our analysis, we clonally tested variant R542Q, which had a greater preference for BW25113 than BL21 or 10G (BW25113 F_D = 2.0, BL21 F_D = 0, 10G F_D = 0), and variant D540S, which had a greater preference for BL21 and 10G than BW25113 (10G F_D = 1.03, BL21 F_D = 0.62, BW25113 F_D = 0.03). Indeed, R542Q showed a significant approximately hundredfold decrease in the ability to plaque on BL21 compared to BW25113 while 10G plaques were atypically small, indicating a severe growth defect (Figure 6D). In contrast, D540S showed a significant approximately hundredfold decrease in the ability to plaque on BW25113 compared to BL21 and 10G (Figure 6D), confirming the host constriction properties of these variants. In summary, pairwise comparison is a powerful tool to map substitutions that constrict host range and can be leveraged to tailor engineered phages for targeted hosts.

Discussion

In this study, we used ORACLE to create a large, unbiased library of T7 phage variants to comprehensively characterize the mutational landscape of the tip domain of the tail fiber. Our study identified hundreds of novel function-enhancing substitutions that had not been previously characterized. We mapped regions of function-enhancing substitutions onto the crystal structure to rationalize how sequence and structure influence activity and host range. Several important insights emerged from these results. Cross-comparison between different hosts and selection on resistant hosts allowed us to map key substitutions that lead to host discrimination and gain of function. Single amino acid substitutions are sufficient to enhance activity and host range, including some that confer dramatic increases in activity or specificity. The functional landscape on each host is unique, reflecting both different molecular preferences of adsorption and the fitness of wildtype T7 on these hosts. For instance, hydrophilic substitutions were enriched in 10G while hydrophobic substitutions were enriched in BW25113. Notably, substitutions on 10G (an E. coli K-12 derivative lacking LPS components) mirrored substitutions that recovered function on BW25113 mutants with truncated LPS, which shows convergence of selection. Function-enhancing substitutions were densely concentrated in the exterior loops, indicating an orientational preference for receptor recognition. However, they were also found on other surface residues, albeit less frequently, suggesting alternative binding modes of the tip domain for host recognition, and several intolerant residues were located in exterior loops. Taken together, these results highlight the extraordinary functional potential of the tip domain and rationalize the pervasive use of this structural fold in nature for molecular recognition.
Comparison of these functional profiles precisely reveals the regions that are ideal engineering targets for customizing host range and activity and identifies intolerant residues that should be avoided when engineering synthetic phages.

These results also highlight the power of deep sequencing to detect and resolve small functional effects over traditional low-throughput plaque assays. This is best illustrated in the case of truncated variants visible only to deep sequencing, but incapable of plaque formation without a helper plasmid. The truncated variants are likely not experimental artifacts as some ability to replicate is required to survive multiple rounds of selection on the host. Truncation of the tip domain may misorient the phage relative to the receptor, likely resulting in slower growth and deficiency in plaquing, while still capable of replicating. Since plaque formation is a complex process, inability to plaque may not imply a functionally incompetent phage.

ORACLE is designed as a foundational technology to elucidate sequence–function relationships in phage genes. On T7, ORACLE can be used to investigate the function of several important genes, including the remainder of the tail fiber and tail structure, capsid components, or lysins and holins. Together, these will provide a comprehensive view of the molecular determinants of the structure, function, and evolution of a phage. Once the phage variants are created, scaling up ORACLE to investigate potentially tens of hosts merely scales up sequencing volume, not experimental complexity. Such a large-scale study will lead to a detailed molecular understanding of phage–bacterial interactions and their adaptability. Any phage with a sequenced genome and a transformable host capable of maintaining a plasmid library should be amenable to ORACLE because the phage variants are created during the natural infection cycle. This approach can be leveraged to tune activity for known phages with high activity, such as T7, or to identify engineering targets that dramatically increase activity for newly isolated natural phages.

The confluence of genome engineering, high-throughput DNA synthesis, and sequencing enabled by ORACLE together with viral metagenomics could transform phage biology. Phages constitute unparalleled biological variation found in nature and are aptly called the ‘dark matter’ of the biosphere. Their sequence diversity and richness are coming to light in the growing volume of viral metagenome databases. However, what functions these sequences encode remains largely unknown. For instance, fecal viromes estimate 10⁸–10⁹ virus-like particles per gram of feces, but less than a quarter of sequence reads align to existing databases (Reyes et al., 2012). While this knowledge gap is daunting, it also presents an opportunity to mine metagenomic sequences to characterize their function and engineer programmable phages. By enabling sequence programmability, we envision ORACLE as a powerful tool to discover new phage ‘parts’ from metagenomic sequences.

Materials and methods

Key resources table

Reagent type (species) or resource	Designation	Source or reference	Identifiers	Additional information
Strain, strain background (Escherichia coli)	E. coli 10G	Lucigen	Lucigen:60107-1
Strain, strain background (Escherichia coli)	E. coli BL21	ATCC	ATCC:BAA-1025
Strain, strain background (Escherichia coli)	E. coli 10-beta	NEB	NEB:C3020
Strain, strain background (Escherichia coli)	E. coli BW25113	Baba et al., 2006	BW25113
Strain, strain background (Escherichia coli)	E. coli BW25113ΔrfaG	Baba et al., 2006	BW25113ΔrfaG
Strain, strain background (Escherichia coli)	E. coli BW25113ΔrfaD	Baba et al., 2006	BW25113ΔrfaD
Strain, strain background (Escherichia coli)	E. coli UTI473	Arthur et al., 1990	UTI473
Strain, strain background (T7 bacteriophage)	T7 bacteriophage	ATCC	ATCC:BAA-1025-B2
Strain, strain background (T7 bacteriophage)	T7 bacteriophage variants	This paper	Available on request	DMS variants, available from the Raman lab.
Commercial assay or kit	KAPA HiFi PCR Kit	Roche	Roche:KK2101
Commercial assay or kit	KAPA2G Robust PCR Kit with dNTPs	Roche	Roche:KK5005
Commercial assay or kit	Golden Gate Assembly Kit (BsaI-HFv2)	NEB	NEB:E1601L
Recombinant DNA reagent	pHT7Helper1 (plasmid)	This paper		Helper with T7 gp17. See Materials and methods for full details.
Recombinant DNA reagent	pHRec1 and derivatives (plasmids)	This paper		Recombination plasmid. See Materials and methods for details.
Recombinant DNA reagent	pHCas9 and derivatives (plasmids)	This paper		Plasmid with Cas9 targeting acceptor phage. See Materials and methods for full details.
Software, algorithm	R scripts for DMS analysis	This paper	N/A	Available here https://github.com/raman-lab/oracle; Huss, 2021; copy archived at swh:1:rev:657e8eef12e4ee886f5d188b745ff0b38f94f479
Software, algorithm	R scripts for physicochemical comparisons	This paper	N/A	Available here https://github.com/raman-lab/oracle.
Software, algorithm	R scripts for Rosetta ΔΔG calculations	This paper	N/A	Available here https://github.com/raman-lab/oracle.

Author details

Phil Huss
Anthony Meger
Megan Leander
Kyle Nishikawa
Srivatsan Raman
