
Comprehensive mapping of adaptation of the avian influenza polymerase protein PB2 to humans

  1. YQ Shirleen Soh
  2. Louise H Moncla
  3. Rachel Eguia
  4. Trevor Bedford
  5. Jesse D Bloom (corresponding author)
  1. Fred Hutchinson Cancer Research Center, United States
  2. Howard Hughes Medical Institute, United States
Research Article
Cite this article as: eLife 2019;8:e45079 doi: 10.7554/eLife.45079
7 figures, 1 table and 5 additional files

Figures

Figure 1 with 3 supplements
Deep mutational scanning of avian influenza PB2 in human and avian cells.

(A) We mutagenized all codons of PB2 from an avian influenza strain. We generated mutant virus libraries using a helper-virus approach, and passaged libraries at low MOI in human (A549) or duck (CCL-141) cells to select for functional PB2 variants. (B) We deep sequenced PB2 mutants from the initial mutant plasmid library and the mutant virus library after passage through each cell type. We computed the ‘preference’ for each amino acid in each cell type by comparing the frequency of each mutation before and after selection. In the logo plots, the height of each letter is proportional to the preference for that amino acid at that site. (C) To identify mutations that are adaptive in one cell type versus the other, we computed the differential selection by comparing the frequency of each amino-acid mutation in human versus avian cells. Letter heights are proportional to the log enrichment of the mutation in human versus avian cells. Figure 1—figure supplement 1 shows the phylogenetic relation of the chosen avian influenza strain to other influenza strains. Figure 1—figure supplement 2 shows further details of the deep mutational scanning experiment. Figure 1—figure supplement 3 shows relative amplification of full-length PB2 versus PB2-GFP and PB2-deletion gene segments.
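As a rough illustration of the two quantities in panels B and C, the sketch below computes preferences as site-normalized enrichment ratios, and differential selection as the log2 ratio of wild-type-relative enrichment in human versus avian cells. It is a simplified sketch, not the dms_tools2 implementation: the function names, the pseudocount, and the toy counts are assumptions, and the error correction based on the wild-type controls is omitted.

```python
import math

PSEUDOCOUNT = 1.0  # assumed value; guards against zero counts


def preferences(pre, post):
    """Site-normalized enrichment of each amino acid after selection.

    pre, post: dicts mapping amino acid -> count at one site,
    before and after selection.
    """
    total_pre = sum(pre.values())
    total_post = sum(post.values())
    enrichment = {}
    for aa in pre:
        freq_pre = (pre[aa] + PSEUDOCOUNT) / total_pre
        freq_post = (post.get(aa, 0) + PSEUDOCOUNT) / total_post
        enrichment[aa] = freq_post / freq_pre
    norm = sum(enrichment.values())
    return {aa: e / norm for aa, e in enrichment.items()}


def differential_selection(human, avian, wildtype, mutant):
    """log2 of mutant enrichment relative to wild type, human over avian."""
    ratio_human = (human[mutant] + PSEUDOCOUNT) / (human[wildtype] + PSEUDOCOUNT)
    ratio_avian = (avian[mutant] + PSEUDOCOUNT) / (avian[wildtype] + PSEUDOCOUNT)
    return math.log2(ratio_human / ratio_avian)


# Toy counts at a single site (invented for illustration only).
plasmid = {"E": 9000, "K": 500, "D": 500}
human = {"E": 8000, "K": 1900, "D": 100}
avian = {"E": 9500, "K": 400, "D": 100}

prefs_human = preferences(plasmid, human)
diffsel_K = differential_selection(human, avian, "E", "K")
```

With these toy counts, K is enriched in human cells relative to avian cells, so its differential selection is positive and its letter would sit above the center line in a logo plot.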

https://doi.org/10.7554/eLife.45079.003
Figure 1—figure supplement 1
Phylogenetic relationship of PB2 sequence of chosen avian influenza strain to other influenza strains.

(A) Phylogenetic tree of influenza PB2. We used PB2 sequences from the following influenza strains: A/Green-winged Teal/Ohio/175/1986 (indicated with a green dot), diverse strains sampled across years and hosts (Doud et al., 2015), and representatives of lineage-defining strains (human H3N2, human pandemic H1N1) and recent sporadic human cases of avian influenza strains (H5N1, H7N9). PB2 nucleotide sequences (of the coding sequence) were aligned using MAFFT and the phylogenetic tree was built using RAxML using the GTRCAT substitution model. Scale bar: mean nucleotide substitutions per site. (B) Pairwise amino-acid identity between all PB2 sequences shown in the tree, between just avian strains, between just human strains, and between human and avian strains.
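The pairwise amino-acid identity in panel B can be sketched as the fraction of aligned positions at which two sequences carry the same residue. This is a minimal sketch under stated assumptions: the function names and toy sequences are invented, and skipping columns that contain a gap is a choice made here, since the caption does not say how gaps were handled.

```python
from itertools import combinations


def pairwise_identity(seq_a, seq_b):
    """Fraction of identical residues over aligned, gap-free columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not columns:
        raise ValueError("no gap-free columns to compare")
    return sum(a == b for a, b in columns) / len(columns)


def all_pairwise_identities(seqs):
    """Identity for every pair of named, aligned sequences."""
    return {
        (name_1, name_2): pairwise_identity(seqs[name_1], seqs[name_2])
        for name_1, name_2 in combinations(sorted(seqs), 2)
    }


# Toy aligned fragments (invented; not real PB2 sequences).
toy_alignment = {
    "avian_1": "MERIKELRDL",
    "avian_2": "MERIKELRNL",
    "human_1": "MERIKELR-L",
}
identities = all_pairwise_identities(toy_alignment)
```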

https://doi.org/10.7554/eLife.45079.004
Figure 1—figure supplement 2
Details of deep mutational scanning experiment.

(A) Experiments were performed in biological triplicate, starting from plasmid mutagenesis. All experimental steps were also performed on a wild-type PB2 gene (blue) to estimate error rates during deep sequencing and other experimental steps. (B–F) We picked 48 clones across the three replicate mutant plasmid libraries for Sanger sequencing. (B) There was an average of 1.4 codon mutants per clone, with the number of mutations per clone roughly following a Poisson distribution. (C) Distribution of number of nucleotide changes for each codon mutation. (D) Nucleotide frequencies in the mutant versus parent codons. (E) Mutations were distributed uniformly across the PB2 gene. (F) Cumulative distribution of pairwise distance between pairs of codon mutations, for clones with multiple mutations. The observed distribution is close to the distribution expected if pairs of mutations occurred independently. (G) Cumulative distribution of the fraction of mutations that are found less than or equal to the indicated number of times. ‘DNA’ refers to the mutant plasmid library. (H) Per codon frequencies of nonsynonymous, stop, and synonymous mutations for each mutant library replicate and wild type, measured either in the DNA plasmid library or after passaging in human (A549) or avian (CCL-141) cells. The top plot shows all mutations; the bottom plot shows only mutations accessible by 2 and 3 nucleotide substitutions. (I) Correlations among experimental replicates of all amino-acid preferences. Correlations compare replicates passaged in human cells (orange), avian cells (green), and between the two cell types (black).
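The Poisson comparison in panel B can be sketched by computing how many of the 48 sequenced clones would be expected to carry k codon mutations when the mean is 1.4 per clone. The mean and clone count come from the caption; the function and variable names are assumptions, and this is not the authors' analysis code.

```python
import math

MEAN_MUTATIONS = 1.4  # average codon mutations per clone (from the caption)
N_CLONES = 48         # clones picked for Sanger sequencing (from the caption)


def poisson_pmf(k, mean):
    """Probability of exactly k events under a Poisson distribution."""
    return math.exp(-mean) * mean**k / math.factorial(k)


# Expected number of clones carrying 0..5 codon mutations.
expected_clones = {
    k: N_CLONES * poisson_pmf(k, MEAN_MUTATIONS) for k in range(6)
}

for k, n in expected_clones.items():
    print(f"{k} mutations: {n:.1f} clones expected")
```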

https://doi.org/10.7554/eLife.45079.005
Figure 1—figure supplement 3
Relative amplification of full-length PB2 versus PB2-GFP and PB2-deletion gene segments.

Following passaging of PB2 mutant virus libraries and RNA extraction, PB2 vRNA was reverse transcribed and then amplified using primers annealing to the ends of the vRNA. The PCR products were separated by gel electrophoresis. Bands likely corresponding to full-length PB2, PB2-GFP, and PB2-deletion gene segments are labeled. The PB2-GFP comes from residual helper virus (which packages GFP in the PB2 segment) that was not purged by the low MOI passage, while internal deletions in PB2 are well known to arise spontaneously when virus is passaged in cell culture (Xue et al., 2016).

https://doi.org/10.7554/eLife.45079.006
Figure 2 with 1 supplement
Functional constraints on PB2.

(A) The amino acid preferences measured in human and avian cells for key regions of PB2: the start codon, sites involved in cap-binding, and sites comprising the nuclear localization sequence (NLS). The height of each letter is proportional to the preference for that amino acid at that site. Known critical amino acids are generally strongly preferred in both cell types. (B) Correlation of the site entropy of the amino-acid preferences measured in each cell type. (C) Sites of high variability (as measured by entropy) in natural human influenza sequences occur at sites of high entropy as experimentally measured in human cells. (D) Sites with high variability in natural avian influenza sequences occur at sites of high entropy as experimentally measured in duck cells. Figure 2—figure supplement 1 shows the complete map of amino acid preferences as measured in human and avian cells. Preferences (as well as mutation effect and differential selection for all mutations as calculated for Figure 3) are in Figure 2—source data 1.
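The site entropy used in panels B to D can be sketched as the Shannon entropy of the amino-acid preferences at a site: a site that tolerates many amino acids has high entropy, and a site that strongly prefers one has entropy near zero. The function name and the choice of natural logarithm are assumptions made for this sketch.

```python
import math


def site_entropy(prefs):
    """Shannon entropy (in nats) of a site's amino-acid preferences.

    prefs: dict mapping amino acid -> preference; values sum to 1.
    """
    return -sum(p * math.log(p) for p in prefs.values() if p > 0)


AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# A fully tolerant site: all 20 amino acids equally preferred.
tolerant = {aa: 1 / 20 for aa in AMINO_ACIDS}

# A fully constrained site: only one amino acid is tolerated.
constrained = {aa: (1.0 if aa == "M" else 0.0) for aa in AMINO_ACIDS}

entropy_tolerant = site_entropy(tolerant)
entropy_constrained = site_entropy(constrained)
```

The same function applied to amino-acid frequencies in an alignment of natural sequences gives the variability measure that panels C and D compare against.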

https://doi.org/10.7554/eLife.45079.007
Figure 2—source data 1

Preference, mutation effect, and differential selection results for all mutations.

https://doi.org/10.7554/eLife.45079.009
Figure 2—figure supplement 1
Complete map of amino acid preferences measured in human and avian cells.

(A) Measurements in human (A549) cells. (B) Measurements in avian (CCL-141) cells. The height of the letter at each site is proportional to the rescaled preference for that amino acid at that site. Domains of PB2 (Pflug et al., 2017) are indicated by the top color bar. The wild-type S009 PB2 sequence is indicated by the letters above each site’s logoplot.

https://doi.org/10.7554/eLife.45079.008
Figure 3 with 1 supplement
Deep mutational scanning identifies known and novel host-adaptive mutations.

(A) Distribution of experimentally measured differential selection for previously characterized human adaptive mutations and all other possible mutations to PB2. Positive differential selection means a mutation is favored in human versus avian cells. (B) Scatterplot of each mutation’s effect in human versus avian cells, showing the top adaptive mutations identified in the deep mutational scanning. (C) Logoplots showing the differential selection at the sites of mutations that we chose for functional validation. The height of each letter above the line indicates how strongly it was selected in human versus avian cells. Top adaptive mutations are colored in orange (human-adaptive) or green (avian-adaptive). Mutations chosen for functional validation are indicated by an asterisk (*). Additional mutations chosen for validation are colored light orange (differentially selected in human) or light green (differentially selected in bird). Mutations observed in H7N9 avian-to-human transmission are indicated by #. Note that not all mutations with high differential selection in human versus avian cells are classified as top adaptive mutations because we also filtered for mutations that are substantially beneficial relative to wildtype. (D) Logoplots showing amino acid preferences at sites we chose for functional validation. Top mutations beneficial in both human and avian cells are colored purple. Mutations chosen for validation are indicated by *. Figure 3—figure supplement 1 shows the complete map of differential selection in human versus avian cells. A catalog of previously described human/mammalian adaptive mutations is in Figure 3—source data 1.

https://doi.org/10.7554/eLife.45079.010
Figure 3—source data 1

Catalog of previously described human/mammalian adaptive mutations.

https://doi.org/10.7554/eLife.45079.012
Figure 3—figure supplement 1
Complete map of differential selection in human versus avian cells.

(A) Differential selection for PB2. The height of the letter at each site is proportional to the differential selection in human versus avian cells for that amino acid at that site. Letters above the center line are favored in human cells. The wild-type avian influenza (S009) PB2 sequence is indicated above each site’s logoplot. (B) Scatter plots of differential selection versus mutation effect as measured in human or avian cells. Top experimentally adaptive mutations identified in our deep mutational scanning (orange, green, purple dots) are indicated on the plots.

https://doi.org/10.7554/eLife.45079.011
Figure 4
Validation of top experimentally adaptive mutations.

The polymerase activity of selected PB2 mutants as measured using minigenome assays in A549 (A) and HEK293T (B) human cells. The mutations chosen for characterization include previously known human adaptive mutations, top adaptive mutations identified by our deep mutational scanning (orange = human adaptive, green = avian adaptive), and additional mutations differentially selected in human (light orange) or avian (light green) cells. E627E is a synonymous mutation at site 627 used as a negative control. Minigenome activity is represented as percent of transfected cells that expressed a viral GFP reporter. The gray horizontal line indicates the mean value measured for the wild type avian PB2. Minigenome assays were performed in biological triplicate. Mutations that have significantly different minigenome activity from wild type are indicated by asterisks (unpaired t-test, p<0.05). (C) Competition of virus bearing the indicated mutant PB2 against virus with wild-type PB2. For each competition, human A549 and avian CCL-141 cells were infected with mutant and wild-type viruses mixed at a 1:1 ratio of transcriptionally active particles, and the frequency of each variant after viral replication was measured by deep sequencing viral RNA. For samples collected at 10 hr post infection, we infected cells at MOI of 0.1, and sequenced vRNA from cellular extract. For samples collected at 48 hr post infection, we infected cells at MOI of 0.01, and sequenced vRNA from the supernatant. The plots show the ratio of the mutant over wild-type variant in A549, divided by the same ratio in CCL-141 cells. A ratio >1 indicates that a viral mutant grows better in human than avian cells. Competition assays were performed in biological duplicate; circle and cross represent replicate experiments. Flow data for minigenome activity and mutation counts for viral competition are provided in Figure 4—source data 1 and 2.
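The ratio plotted in panel C can be written out as a short calculation: the mutant-to-wild-type ratio in A549 cells divided by the same ratio in CCL-141 cells. The function name and the read counts below are hypothetical; this is a sketch of the arithmetic, not the authors' analysis code.

```python
def host_specific_ratio(mut_human, wt_human, mut_avian, wt_avian):
    """(mutant / wild type in human cells) / (mutant / wild type in avian cells).

    Arguments are sequencing read counts (or frequencies) of each variant.
    A value above 1 means the mutant fares better in human than avian cells.
    """
    if min(mut_human, wt_human, mut_avian, wt_avian) <= 0:
        raise ValueError("all counts must be positive")
    return (mut_human / wt_human) / (mut_avian / wt_avian)


# Hypothetical read counts after competition in each cell type.
ratio = host_specific_ratio(
    mut_human=600, wt_human=400,  # mutant outgrows wild type in A549
    mut_avian=500, wt_avian=500,  # no advantage in CCL-141
)
```

Dividing by the avian ratio cancels any deviation from the intended 1:1 input mix, because the same virus mixture seeds both cell types.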

https://doi.org/10.7554/eLife.45079.013
Figure 4—source data 1

Flow cytometry data for minigenome assays.

https://doi.org/10.7554/eLife.45079.014
Figure 4—source data 2

Mutant frequency data for competition assay.

https://doi.org/10.7554/eLife.45079.015
Figure 5 with 1 supplement
Locations of top human-adaptive mutations on the structure of the influenza polymerase.

Overall structure of the influenza polymerase complex comprising PB2, PB1 and PA in (A, B) the transcription pre-initiation form (PDB: 4WSB) and (C, D) the apo form (PDB: 5D98). PB2 domains defined as in Pflug et al. (2017). (B, D) Sites of top human-adaptive mutations identified by deep mutational scanning are shown in red on the PB2 subunit of the structure. Sites of previously experimentally verified human-adaptive mutations are in blue (25 sites as listed in Figure 3—source data 1). Sites identified by deep mutational scanning and which were also previously known are in purple. A subset of sites are labeled and/or circled for referencing in the main text, to indicate surfaces that might mediate host-interactions. Similar results are obtained if we instead analyze the structures in terms of a continuous variable representing the extent of human-specific adaptation at each site (Figure 5—figure supplement 1B, C). (E) Structure of PB2 C-terminal fragment co-crystalized with importin-α7 (PDB: 4UAD). Sites on PB2 interacting with major and minor NLS binding surfaces of importin-α7 are in green and cyan respectively. Importin-α7 is depicted in ribbon form in tan. We used the deep mutational scanning to define a continuous variable indicating the extent of host-specific adaptation at each site of PB2. Specifically, for each site, we computed the positive site differential selection by summing all positive mutation differential selection values at that site (i.e., the total height of the letter stack in the positive direction in logoplots such as in Figure 3D). We mapped this differential selection onto the PB2 C-terminal fragment in red; PB2 sites with high differential selection are numbered. Regions of importin-α7 that differ from importin-α3 are colored in orange, those near PB2 sites with high differential selection are shown as spheres. 
For all structures, the avian influenza (S009) PB2 amino acid sequence was mapped onto the PB2 chain by one-2-one threading using Phyre2 (Kelley et al., 2015) (confidence in the models for 4WSB, 5D98, and 4UAD is 100%, 100%, and 99%, respectively). Sites are numbered according to the S009 PB2 sequence. Figure 5—figure supplement 1 shows relative solvent accessibility of human-adaptive mutations, as well as positive site differential selection mapped onto structures of influenza polymerase.
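The positive site differential selection defined in this caption is the sum of all positive mutation differential selection values at a site. A minimal sketch follows; the function name and the toy values are assumptions.

```python
def positive_site_diffsel(mutation_diffsel):
    """Sum of the positive mutation differential selection values at a site.

    mutation_diffsel: dict mapping amino acid -> differential selection.
    Corresponds to the height of the letter stack above the center line.
    """
    return sum(value for value in mutation_diffsel.values() if value > 0)


# Toy per-mutation values at two sites (invented for illustration).
diffsel_by_site = {
    627: {"K": 2.0, "R": 0.5, "D": -1.5},
    100: {"A": -0.25, "S": -0.5},
}

site_scores = {
    site: positive_site_diffsel(values)
    for site, values in diffsel_by_site.items()
}
```

Negative values are ignored rather than netted out, so a site with one strongly human-favored mutation scores high even if most other mutations there are disfavored.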

https://doi.org/10.7554/eLife.45079.016
Figure 5—figure supplement 1
Solvent accessibility of sites of human-adaptive mutations, and positive site differential selection mapped onto structures of influenza polymerase.

(A) Scatterplot of relative solvent accessibility (RSA) for all sites in the transcription pre-initiation form (PDB: 4WSB), and the apo form (PDB: 5D98) of the influenza polymerase complex. Sites of top experimentally identified human-adaptive mutations are in orange with site position labeled; all other sites are in gray. Blue lines indicate the RSA cut-off of >0.2 for surface exposed sites. Positive site differential selection mapped onto the PB2 subunit of the influenza polymerase complex in (B) the transcription pre-initiation form (PDB: 4WSB), and (C) the apo form (PDB: 5D98). (D) Positive site differential selection mapped onto PB2 (PDB: 6F5O). Sites in PB2 involved in RNA Pol II CTD binding are indicated in green. Sites with high differential selection are numbered. The avian influenza (S009) PB2 amino acid sequence was mapped onto the PB2 chain by one-2-one threading using Phyre2 (Kelley et al., 2015) (confidence in the model for 6F5O: 100%). Sites are numbered according to the S009 PB2 sequence.

https://doi.org/10.7554/eLife.45079.017
Figure 6 with 5 supplements
Experimentally identified human-adaptive mutations are enriched in avian-human transmission of H7N9 influenza.

(A) Phylogeny of H7N9 influenza PB2 sequences. Branches in human and avian hosts are colored black and gray respectively. Orange or red dots indicate where a mutation was inferred to have occurred. Branch lengths are scaled by annotated and inferred dates of origin of each sequence. (B) Distribution of experimentally measured differential selection values for all mutations occurring during H7N9 evolution in human and avian hosts. A positive differential selection value means that our experiments measured the mutation to be beneficial in human versus avian cells. A subset of top differentially selected mutations that occur frequently are labeled and plotted in orange. Enlarged phylogenetic trees are in Figure 6—figure supplements 1–5. Counts of mutations identified in phylogenetic analysis are in Figure 6—source data 1. Mutations plotted in each bin of the histogram are in Figure 6—source data 2.

https://doi.org/10.7554/eLife.45079.018
Figure 6—source data 1

H7N9 human and avian mutation counts.

https://doi.org/10.7554/eLife.45079.024
Figure 6—source data 2

H7N9 human and avian mutation differential selection values and counts in each histogram bin.

https://doi.org/10.7554/eLife.45079.025
Figure 6—figure supplement 1
Phylogeny of H7N9 influenza PB2 sequences showing where mutations at site 627 were inferred to have occurred.

Branches in human and avian hosts are colored black and gray respectively. Orange or red dots indicate where a mutation was inferred to have occurred. Branch lengths are scaled by annotated and inferred dates of origin of each sequence.

https://doi.org/10.7554/eLife.45079.019
Figure 6—figure supplement 2
Phylogeny of H7N9 influenza PB2 sequences showing where mutations at site 701 were inferred to have occurred.

Branches in human and avian hosts are colored black and gray respectively. Orange dots indicate where a mutation was inferred to have occurred. Branch lengths are scaled by annotated and inferred dates of origin of each sequence.

https://doi.org/10.7554/eLife.45079.020
Figure 6—figure supplement 3
Phylogeny of H7N9 influenza PB2 sequences showing where mutations at site 534 were inferred to have occurred.

Branches in human and avian hosts are colored black and gray respectively. Orange dots indicate where a mutation was inferred to have occurred. Branch lengths are scaled by annotated and inferred dates of origin of each sequence.

https://doi.org/10.7554/eLife.45079.021
Figure 6—figure supplement 4
Phylogeny of H7N9 influenza PB2 sequences showing where mutations at site 355 were inferred to have occurred.

Branches in human and avian hosts are colored black and gray respectively. Orange or red dots indicate where a mutation was inferred to have occurred. Branch lengths are scaled by annotated and inferred dates of origin of each sequence.

https://doi.org/10.7554/eLife.45079.022
Figure 6—figure supplement 5
Phylogeny of H7N9 influenza PB2 sequences showing where mutations at site 521 were inferred to have occurred.

Branches in human and avian hosts are colored black and gray respectively. Orange dots indicate where a mutation was inferred to have occurred. Branch lengths are scaled by annotated and inferred dates of origin of each sequence.

https://doi.org/10.7554/eLife.45079.023
Figure 7
Evolutionary accessibility of mutations from current avian influenza PB2 sequences.

Distribution of mean nucleotide substitutions required to access all amino-acid mutations, previously characterized human-adaptive mutations, and top human-adaptive mutations identified in our deep mutational scanning. Mean nucleotide substitution is calculated by averaging over all avian influenza PB2 sequences collected from 2015 to 2018. Most previously characterized human-adaptive mutations are accessible by single nucleotide substitution, whereas many of the new adaptive mutations that we identified require multiple nucleotide substitutions. Mean nucleotide substitutions for each mutation are in Figure 7—source data 1.
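The accessibility measure can be sketched as follows: for each sequence, take the codon at the site of interest, find the minimum number of nucleotide changes needed to reach any codon for the target amino acid, and average that minimum over sequences. This is a minimal sketch using the standard genetic code; the function names and example codons are assumptions, not the authors' code.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def min_substitutions(codon, target_aa):
    """Fewest nucleotide changes turning `codon` into a codon for `target_aa`."""
    return min(
        sum(a != b for a, b in zip(codon, candidate))
        for candidate, aa in CODON_TABLE.items()
        if aa == target_aa
    )


def mean_substitutions(codons, target_aa):
    """Average of min_substitutions over the codons seen at one site."""
    return sum(min_substitutions(c, target_aa) for c in codons) / len(codons)


# Hypothetical codons at one site across avian sequences, all encoding E.
site_codons = ["GAA", "GAG", "GAG"]
mean_to_K = mean_substitutions(site_codons, "K")  # E -> K, as in E627K
mean_to_W = mean_substitutions(site_codons, "W")
```

In this toy case lysine is one substitution away from every glutamate codon, whereas tryptophan (TGG only) needs two or three, which is the kind of difference the histogram summarizes.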

https://doi.org/10.7554/eLife.45079.026
Figure 7—source data 1

Mean nucleotide substitutions from avian sequences of all mutations.

https://doi.org/10.7554/eLife.45079.027

Tables

Key resources table

Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information
--- | --- | --- | --- | ---
Cell line (Homo sapiens) | A549 | ATCC | CCL-185; RRID:CVCL_0023 |
Cell line (Homo sapiens) | HEK293T | ATCC | CRL-3216; RRID:CVCL_0063 |
Cell line (Canis familiaris) | MDCK-SIAT1 | Sigma-Aldrich | 5071502; RRID:CVCL_Z936 |
Cell line (Anas platyrhynchos domesticus) | CCL-141 | ATCC | CCL-141; RRID:CVCL_T281 |
Cell line (Canis familiaris) | MDCK-SIAT1-tet-S009-PB2-E627K | this paper | | MDCK-SIAT1 cells expressing S009 PB2-E627K under control of a doxycycline-inducible promoter
Recombinant DNA reagent | pHW_noCMV_S009_PB2; pHW_noCMVnoTerm_BsmBI | this paper | | Plasmids for generating mutant plasmid library; see Supplementary file 1
Recombinant DNA reagent | pHW_S009_PB2; pHW_S009_PB1; pHW_S009_PA; pHW_S009_NP | this paper | | Plasmids for generating helper virus; see Supplementary file 1
Recombinant DNA reagent | HDM_S009_PB2; HDM_S009_PB1; HDM_S009_PA; HDM_S009_NP | this paper | | Plasmids for protein expression of S009 polymerase complex; see Supplementary file 1
Recombinant DNA reagent | pHH_PB2_S009_flank_99_eGFP_100 | this paper | | Plasmids for generating helper virus; see Supplementary file 1
Recombinant DNA reagent | pHW184_HA; pHW186_NA; pHW187_M; pHW188_NS | Hoffmann et al. (2000) | |
Recombinant DNA reagent | pHH-PB1-flank-eGFP | Bloom et al. (2010) | | Reporter plasmid for minigenome assay; see Supplementary file 1
Recombinant DNA reagent | pcDNA3.1_mCherry | this paper | | Transfection control for minigenome assay; see Supplementary file 1
Recombinant DNA reagent | pSBtet_RP_S009_PB2_E627K | this paper | | Plasmid for generating PB2-expressing cell line; see Supplementary file 1
Sequence-based reagent | primers | this paper | | See Supplementary file 2
Commercial assay or kit | NEBuilder HiFi DNA Assembly Master Mix | New England Biolabs | E2621S |
Commercial assay or kit | ElectroMAX DH10B competent cells | Invitrogen | 18290015 |
Commercial assay or kit | RNeasy Mini Kit | Qiagen | 74104 |
Commercial assay or kit | Accuscript Reverse Transcriptase | Agilent | 200820 |
Commercial assay or kit | KOD Hot Start Master Mix | EMD Millipore | 71842 |
Commercial assay or kit | QIAamp Viral RNA Mini Kit | Qiagen | 52904 |
Commercial assay or kit | SuperScript III | ThermoFisher Scientific | 18080051 |
Chemical compound, drug | BioT | Bioland Scientific | B01-01 |
Chemical compound, drug | Lipofectamine 3000 | ThermoFisher Scientific | L3000015 |
Antibody | H17-L19 | Gerhard et al. (1981) | |
Software, algorithm | dms_tools2 | https://jbloomlab.github.io/dms_tools2, version 2.3.0 | |
Software, algorithm | Jupyter notebooks that perform all steps of analyses | this paper | | See Supplementary file 3; https://github.com/jbloomlab/PB2-DMS

Additional files

Supplementary file 1

Plasmid sequences.

https://doi.org/10.7554/eLife.45079.028
Supplementary file 2

Primer sequences.

https://doi.org/10.7554/eLife.45079.029
Supplementary file 3

Jupyter notebooks documenting computational analyses.

https://doi.org/10.7554/eLife.45079.030
Supplementary file 4

Comparison of ExpCM to standard phylogenetic substitution models.

https://doi.org/10.7554/eLife.45079.031
Transparent reporting form
https://doi.org/10.7554/eLife.45079.032
