Comprehensive mutagenesis library design and production assay. (A) Organization of the AAV2 genome and Rep protein domains. Top: single-stranded DNA genome, middle: RNA transcripts, bottom: Rep proteins. Dotted lines indicate mutated regions. (B) Density plot of barcode counts in the pCMV-Rep78/68 plasmid library. (C) Overview of production assay for the pCMV-Rep78/68 library and calculation of wild-type normalized production fitness values (s’). (D) Amino acid level production fitness values from replicate transfections of the pCMV-Rep78/68 library. Pearson R correlation coefficient calculated after log transformation. (E) Density plot of production fitness values for wild-type (black) and premature stop codon (blue) controls for the pCMV-Rep78/68 library. *p < 10^-20 (Mann-Whitney U test).

Effects of all single amino acid substitutions and deletions in the Rep78/68 proteins on AAV2 production. Amino acid level production fitness values from the pCMV-Rep78/68 production assay were calculated as in Figure 1C by summing barcode counts for synonymous mutations. Rectangles are colored by mutational effect on the production of genome-containing particles, with black indicating deleterious mutations and red indicating beneficial mutations. Colored bars above the heatmaps indicate protein domains. Black: origin-binding domain, light blue: helicase domain, gray: nuclear localization signal, and navy blue: zinc-finger domain. Black dots indicate wild-type amino acid identity.
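As a rough illustration of the aggregation described in this caption, the sketch below sums barcode counts across synonymous variants before computing a fitness value. The enrichment formula used here (frequency in the viral library divided by frequency in the plasmid library, then divided by the wild-type value) is an assumption, since the exact definition appears only in Figure 1C. The function name, the data layout, the variant labels, and the counts are all hypothetical.

```python
from collections import defaultdict


def amino_acid_fitness(barcodes, wt_key="WT"):
    """Return wild-type normalized fitness per amino acid variant.

    barcodes: iterable of (amino_acid_variant, plasmid_count, virus_count).
    Several barcodes (and several synonymous codons) may map to one variant.
    """
    plasmid = defaultdict(int)
    virus = defaultdict(int)
    # Sum counts over every barcode that encodes the same amino acid change.
    for variant, plasmid_count, virus_count in barcodes:
        plasmid[variant] += plasmid_count
        virus[variant] += virus_count

    total_plasmid = sum(plasmid.values())
    total_virus = sum(virus.values())

    # ASSUMPTION: fitness is the ratio of library frequencies (virus / plasmid).
    raw = {
        variant: (virus[variant] / total_virus) / (plasmid[variant] / total_plasmid)
        for variant in plasmid
        if plasmid[variant] > 0
    }
    # Normalize so that the wild-type value equals 1.
    wt_value = raw[wt_key]
    return {variant: value / wt_value for variant, value in raw.items()}


# Hypothetical counts: two barcodes each for wild type and one substitution,
# plus one premature stop codon control that yields no viral reads.
example = [
    ("WT", 100, 200),
    ("WT", 100, 200),
    ("A10K", 50, 300),
    ("A10K", 50, 100),
    ("Y156*", 100, 0),
]
fitness = amino_acid_fitness(example)
```

Summing counts before taking the ratio, rather than averaging per-barcode ratios, keeps low-count barcodes from dominating the estimate; whether the original analysis made the same choice is not stated in the caption.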

Beneficial substitutions cluster in DNA-interacting residues. (A) Average production fitness values for all substitutions at each position mapped onto the structure of the origin-binding domain in complex with the AAVS1 Rep binding site (PDB 4ZQ9). (B) Close-up view of origin-binding domain-Rep binding site interactions. Residues where substitutions to positively charged residues are enriched are shown as sticks. (C) Average production fitness values mapped onto the structure of the origin-binding domain in complex with the single-stranded inverted terminal repeat hairpin (PDB 6XB8). DNA-interacting residues are shown as sticks. Residues are colored by mutability, with red indicating higher mutability and black indicating lower mutability.

Comparison of comprehensive mutagenesis measurements to variation in nature. (A) Distribution of wild-type normalized production fitness values for conserved variants (blue) and variants found only in the library (gray). (B) Total number of variants and number of conserved variants with s’ greater than wild-type (s’ > 1). *p < 10^-20 (Mann-Whitney U test).

Validation of AAV2 library production assay results. (A) DNase-resistant particle titers for fourteen single amino acid pCMV-Rep78/68-inverted terminal repeat variants produced individually. Titers for previously characterized variants are plotted to the left of the dotted line. (B) Relationship between normalized production fitness values from library experiments and DNase-resistant particle titers from individual transfections. (C) DNase-resistant particle titers for two rAAV genomes produced with the indicated Rep variants. Titers for previously characterized variants are plotted to the left of the dotted line. (D) Relationship between rAAV DNase-resistant particle titers and normalized production fitness values from library experiments. (E) Expression of Rep and VP proteins from variant pRepCap plasmids by Western blot.

Mutations to AAV2 rep have similar effects on AAV2, AAV5, and AAV9 capsid production. Wild-type normalized production fitness values from the library production assay with (A) AAV5 and AAV2 capsids and (B) AAV9 and AAV2 capsids. Pearson R correlation coefficient calculated after log transformation.
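The correlation reported in these panels can be sketched as follows: both sets of fitness values are log transformed, and the Pearson coefficient is computed on the transformed values. The choice of base 10 is an assumption (the coefficient does not depend on the base), as is the decision to drop pairs containing a non-positive value, which the caption does not address. The function name and the example values are hypothetical.

```python
import math


def pearson_r_log(x, y):
    """Pearson R between log10(x) and log10(y) for paired positive values."""
    # ASSUMPTION: pairs with a non-positive value are dropped, because the
    # logarithm is undefined for them.
    pairs = [(math.log10(a), math.log10(b)) for a, b in zip(x, y) if a > 0 and b > 0]
    n = len(pairs)
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a, _ in pairs))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for _, b in pairs))
    return covariance / (spread_x * spread_y)


# Hypothetical fitness values for the same variants measured with two capsids.
aav2_fitness = [0.01, 0.1, 1.0, 10.0]
aav5_fitness = [0.02, 0.2, 2.0, 20.0]
r = pearson_r_log(aav2_fitness, aav5_fitness)
```

Because the second series is a constant multiple of the first, the log-transformed values differ only by an offset and the coefficient is 1 up to floating-point error.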

Comprehensive mutagenesis library design and production assay results for WT AAV2 format library. (A) Density plot of barcode counts in the WT AAV2 plasmid library. (B) Amino acid level production fitness values from replicate transfections for the WT AAV2 library. Pearson R correlation coefficient calculated after log transformation. (C) Density plot of wild-type (black) and premature stop codon (gray) controls for the WT AAV2 library. *p < 10^-20 (Mann-Whitney U test).

Percent of expected variants sequenced in plasmid and viral libraries.

Effects of all single amino acid substitutions and deletions in Rep78, Rep68, Rep52, and Rep40 on AAV2 production. Amino acid level production fitness values from the WT AAV2 library production assay were calculated as in Figure 1C by summing barcode counts for synonymous variants. Rectangles are colored by mutational effect on the production of genome-containing particles, with black indicating deleterious mutations and red indicating beneficial mutations. Colored bars above the heatmaps indicate protein domains. Black: origin-binding domain, light blue: helicase domain, gray: nuclear localization signal, and navy blue: zinc-finger domain. Black dots indicate wild-type amino acid identity.

Mutations to AAV2 rep have similar effects on AAV2 production in pCMV-Rep78/68 and WT AAV2 format libraries. (A) Production fitness values for library production assay with pCMV-Rep78/68 and WT AAV2 format libraries. Pearson R correlation coefficient calculated after log transformation. (B) DNase-resistant particle titers for AAV2 capsids produced with pCMV-Rep78/68-inverted terminal repeat or p5-RepCap-inverted terminal repeat plasmids containing the indicated Rep substitutions. *p < 0.05, **p < 0.01 (Welch’s t test).
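For readers unfamiliar with the test named in panel B, a minimal sketch of Welch's t statistic and its Welch-Satterthwaite degrees of freedom is given below. It uses only the standard library and stops short of the p-value, which requires the t distribution. The function name and the titer values are hypothetical and are not taken from the figure.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for Welch's unequal-variance t test."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Per-group variance of the mean, using the unbiased sample variance.
    term_a = statistics.variance(sample_a) / n_a
    term_b = statistics.variance(sample_b) / n_b
    t_stat = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(term_a + term_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    dof = (term_a + term_b) ** 2 / (term_a ** 2 / (n_a - 1) + term_b ** 2 / (n_b - 1))
    return t_stat, dof


# Hypothetical titers (arbitrary units) for wild type and one Rep variant.
wild_type_titers = [1.0, 2.0, 3.0]
variant_titers = [2.0, 4.0, 6.0]
t_stat, dof = welch_t(wild_type_titers, variant_titers)
```

Unlike Student's t test, this form does not assume equal variances in the two groups, which suits titer measurements whose spread can differ between variants.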

Average production fitness values from the pCMV-Rep78/68 library production assay mapped onto (A) the structure of the origin-binding domain active site (PDB 5DCX) and (B) the structure of the helicase domain (PDB 1S9H). H90, H92, and the active site nucleophile, Y156, are shown as sticks in (A). Residues are colored by mutability, with red indicating higher mutability and black indicating lower mutability.

Codon level production fitness values for the pCMV-Rep78/68 format library. Codon level production fitness values were calculated as in Figure 1C by summing counts for all barcodes corresponding to a given codon variant. Rectangles are colored by mutational effect on the production of genome-containing particles, with black indicating deleterious mutations and red indicating beneficial mutations. Black dots indicate wild-type codon identity.

Codon level production fitness values for the WT AAV2 format library. Codon level production fitness values were calculated as in Figure 1C by summing counts for all barcodes corresponding to a given codon variant. Rectangles are colored by mutational effect on the production of genome-containing particles, with black indicating deleterious mutations and red indicating beneficial mutations. Black dots indicate wild-type codon identity.

Inclusion of synonymous codon variants enables interrogation of nucleotide-level effects. Analysis of pCMV-Rep78/68 variants with premature stop codons in +1 (A) and +2 (B) reading frames does not identify any frameshifted open reading frames. Pink dots: average production fitness for mutations that introduce stop codons into the +1 or +2 frame. Black dots: average production fitness for mutations synonymous to +1 or +2 stop codon mutations in the Rep open reading frame. Lines indicate 10 bp rolling averages. (C) Average production fitness at each nucleotide position for variants that introduce the indicated nucleotide change in the pCMV-Rep78/68 library. Orange: T, blue: G, green: C, red: A. (D) Average production fitness for variants that introduce the indicated nucleotide change in the WT AAV2 library.
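The 10 bp rolling averages drawn as lines in panels A and B can be sketched as a simple sliding-window mean. The caption does not say whether the window is centered or trailing, or how positions without data are handled, so the version below is an assumption: it takes one value per consecutive nucleotide position and returns one mean for each full window. The function name and the example values are hypothetical.

```python
def rolling_mean(values, window=10):
    """Mean of each run of `window` consecutive values (full windows only)."""
    if window <= 0 or len(values) < window:
        return []
    running_total = sum(values[:window])
    means = [running_total / window]
    # Slide the window one position at a time: add the entering value and
    # subtract the one that leaves.
    for i in range(window, len(values)):
        running_total += values[i] - values[i - window]
        means.append(running_total / window)
    return means


# Hypothetical per-position average fitness values for 12 consecutive positions.
per_position = list(range(1, 13))
smoothed = rolling_mean(per_position, window=10)
```

Twelve positions with a 10 bp window give three full windows, so `smoothed` has three entries.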

Effect of single amino acid Rep substitutions on the viral genome and physical particle titers of affinity-purified rAAV. Capsid (gray) and viral genome (black) titers for pHef1a-EGFP-IRES-Luc2 rAAV2 produced with the indicated Rep variant and affinity purified with AVB Sepharose. For each variant, samples A and B represent replicate transfections and affinity purifications.

Viral genome and physical particle titers for rAAV2 produced with Rep variants

Effects of all single amino acid substitutions and deletions in the Rep78/68 proteins on AAV5 capsid production. Amino acid level production fitness values from the pCMV-Rep78/68 production assay were calculated as in Figure 1C by summing barcode counts for synonymous mutations. Rectangles are colored by mutational effect on the production of genome-containing particles, with black indicating deleterious mutations and red indicating beneficial mutations. Colored bars above the heatmaps indicate protein domains. Black: origin-binding domain, light blue: helicase domain, gray: nuclear localization signal, and navy blue: zinc-finger domain. Black dots indicate wild-type amino acid identity.

Effects of all single amino acid substitutions and deletions in the Rep78/68 proteins on AAV9 capsid production. Amino acid level production fitness values from the pCMV-Rep78/68 production assay were calculated as in Figure 1C by summing barcode counts for synonymous mutations. Rectangles are colored by mutational effect on the production of genome-containing particles, with black indicating deleterious mutations and red indicating beneficial mutations. Colored bars above the heatmaps indicate protein domains. Black: origin-binding domain, light blue: helicase domain, gray: nuclear localization signal, and navy blue: zinc-finger domain. Black dots indicate wild-type amino acid identity.

Codon level production fitness values for the AAV5 capsid production assay. Codon level production fitness values were calculated as in Figure 1C by summing counts for all barcodes corresponding to a given codon variant. Rectangles are colored by mutational effect on the production of genome-containing particles, with black indicating deleterious mutations and red indicating beneficial mutations. Black dots indicate wild-type codon identity.

Codon level production fitness values for the AAV9 capsid production assay. Codon level production fitness values were calculated as in Figure 1C by summing counts for all barcodes corresponding to a given codon variant. Rectangles are colored by mutational effect on the production of genome-containing particles, with black indicating deleterious mutations and red indicating beneficial mutations. Black dots indicate wild-type codon identity.

DNase-resistant particle titers for AAV2, AAV5, and AAV9 capsids produced individually with the indicated Rep variants.