Long read sequencing reveals poxvirus evolution through rapid homogenization of gene arrays

Thomas A Sasani, Kelsey R Cone, Aaron R Quinlan (corresponding author), Nels C Elde (corresponding author)
University of Utah, United States
11 figures and 4 tables

Figures

Figure 1 with 3 supplements
A single nucleotide variant accumulates following increases in K3L copy number.

(A) Following 20 serial infections of the ΔE3L strain (MOI 0.1 for 48 hr) in HeLa cells (see Materials and methods for further details), replication was measured in triplicate in HeLa cells for …

https://doi.org/10.7554/eLife.35453.002
Figure 1—source data 1

Data used to generate Figure 1A.

https://doi.org/10.7554/eLife.35453.006
Figure 1—source data 2

Data used to generate Figure 1D.

https://doi.org/10.7554/eLife.35453.007
Figure 1—source data 3

Statistics for Figure 1D: one-way ANOVA followed by Dunnett’s multiple comparison test.

https://doi.org/10.7554/eLife.35453.008
Figure 1—figure supplement 1
K3LHis47Arg and K3L CNV are non-adaptive in the permissive BHK cell line.

Replication was measured for plaque purified clones (as in Figure 1D) in BHK cells by 48 hr infection (MOI 0.1) in triplicate. All titers were measured multiple times in BHK cells by plaque assay, …

https://doi.org/10.7554/eLife.35453.003
Figure 1—figure supplement 1—source data 1

Data used to generate Figure 1—figure supplement 1.

https://doi.org/10.7554/eLife.35453.009
Figure 1—figure supplement 1—source data 2

Statistics for Figure 1—figure supplement 1: one-way ANOVA followed by Dunnett’s multiple comparison test.

https://doi.org/10.7554/eLife.35453.010
Figure 1—figure supplement 2
Allele frequencies of the two high-frequency SNVs identified in vaccinia populations.

Population-level K3LHis47Arg and E9LGlu495Gly allele frequencies were estimated using freebayes on Illumina MiSeq reads from different passages.

https://doi.org/10.7554/eLife.35453.004
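
The allele frequency estimate described for Figure 1—figure supplement 2 can be illustrated with a minimal Python sketch. It assumes that per-site counts of reference and alternate observations are already available (freebayes reports such counts in its RO and AO fields); the function name and the toy counts are hypothetical and are not taken from the paper.

```python
def allele_frequency(ref_count: int, alt_count: int) -> float:
    """Fraction of observations at one site that support the alternate allele."""
    if ref_count < 0 or alt_count < 0:
        raise ValueError("counts must be non-negative")
    depth = ref_count + alt_count
    if depth == 0:
        raise ValueError("no observations at this site")
    return alt_count / depth


# Toy counts (reference, alternate) per passage; NOT data from the paper.
toy_counts = {"early passage": (90, 10), "late passage": (25, 75)}
for passage, (ref, alt) in toy_counts.items():
    print(passage, allele_frequency(ref, alt))
```

Working from raw counts rather than called genotypes suits a mixed virus population, where a variant can sit at any frequency between 0 and 1.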
Figure 1—figure supplement 3
The E9LGlu495Gly variant does not contribute to virus replication.

A virus clone containing the E9LGlu495Gly variant as the only genetic change relative to ΔE3L was isolated following four rounds of plaque purification in BHK cells (clone a in Figure 1C). …

https://doi.org/10.7554/eLife.35453.005
Figure 1—figure supplement 3—source data 4

Data used to generate Figure 1—figure supplement 3.

https://doi.org/10.7554/eLife.35453.012
Figure 1—figure supplement 3—source data 5

Statistics for Figure 1—figure supplement 3: unpaired two-tailed t test with Welch’s correction.

https://doi.org/10.7554/eLife.35453.013
Figure 2 with 3 supplements
ONT reads capture SNVs and copy number expansions in individual virus genomes.

(A) Representative structure of the K3L locus in the VC-2 reference genome is shown on top, with representative Illumina MiSeq and ONT MinION reads shown to scale below. The K3LHis47Arg variant …

https://doi.org/10.7554/eLife.35453.014
Figure 2—source data 1

Single nucleotide variants in virus populations from Illumina or ONT datasets, used to generate Figure 2B.

https://doi.org/10.7554/eLife.35453.018
Figure 2—figure supplement 1
E9LGlu495Gly variant dynamics.

Population-level E9LGlu495Gly allele frequencies were estimated using freebayes and nanopolish on Illumina or ONT reads, respectively, from different passages as in Figure 2B.

https://doi.org/10.7554/eLife.35453.015
Figure 2—figure supplement 2
Error rate profiles in ONT reads.

The proportions of non-reference bases aligned to the 5-mers containing the K3LWT (A), K3LHis47Arg (B), E9LWT (C), or E9LGlu495Gly (D) sequences were calculated from alignments of ONT reads from the …

https://doi.org/10.7554/eLife.35453.016
Figure 2—figure supplement 3
ONT reads capture high K3L copy number in vaccinia genomes.

K3L copy number was assessed in ONT reads from P10, P15, and P20 as described in Figure 2C. Stacked bar plots indicate overall proportions of sequencing reads that contain between 6 and 16 copies of …

https://doi.org/10.7554/eLife.35453.017
Figure 3 with 4 supplements
The K3LHis47Arg variant homogenizes within multicopy arrays throughout experimental evolution.

Stacked bar plots representing the proportions of mixed and homogeneous K3L arrays were generated from ONT reads for the indicated virus populations (passages are listed above each plot). The …

https://doi.org/10.7554/eLife.35453.024
Figure 3—figure supplement 1
Simulated accumulation of the K3LHis47Arg SNV.

The K3LHis47Arg allele was uniformly distributed in simulated vaccinia populations with copy number distributions identical to passages P10, P15, and P20 (see Materials and methods for further …

https://doi.org/10.7554/eLife.35453.025
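
The idea of distributing an allele uniformly across a population with a fixed copy number distribution can be sketched as follows. This is a hedged illustration, not the authors' simulation (their procedure is described in the Materials and methods): it assumes the variant is placed at random among all K3L copies in the population, and the function name, the 'W'/'H' encoding, and the seed parameter are hypothetical.

```python
import random


def simulate_mixed_fraction(copy_numbers, allele_freq, seed=0):
    """Fraction of multicopy arrays that are mixed when the variant allele
    is scattered at random over every K3L copy in the population.

    copy_numbers: one integer per genome (number of K3L copies it carries)
    allele_freq:  fraction of all copies that carry the variant allele
    """
    rng = random.Random(seed)
    total_copies = sum(copy_numbers)
    n_variant = round(allele_freq * total_copies)
    # 'H' marks a variant copy and 'W' a wild-type copy (hypothetical encoding).
    pool = ["H"] * n_variant + ["W"] * (total_copies - n_variant)
    rng.shuffle(pool)

    mixed = multicopy = 0
    start = 0
    for n in copy_numbers:
        array = pool[start:start + n]
        start += n
        if n > 1:
            multicopy += 1
            if len(set(array)) > 1:
                mixed += 1
    return mixed / multicopy if multicopy else 0.0


# Toy copy number distribution; NOT data from the paper.
print(simulate_mixed_fraction([1, 2, 2, 3, 4, 5] * 100, allele_freq=0.5))
```

Comparing such a random expectation with the observed proportion of mixed arrays is one way to ask whether homogeneous arrays are more common than chance placement would predict.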
Figure 3—figure supplement 2
ONT flowcell chemistries do not affect observed proportions of homogeneous and mixed K3L arrays.

The P15 population was sequenced using R7.3, R9, and R9.4 ONT flowcell chemistries, and stacked bar plots representing the proportions of mixed and homogeneous K3L arrays were generated as in Figure …

https://doi.org/10.7554/eLife.35453.026
Figure 3—figure supplement 3
ONT sequencing error rates do not affect observed proportions of homogeneous and mixed K3L arrays.

(A) Using reads from passage 15 populations sequenced with R7.3, R9, and R9.4 flowcell chemistries, all mixed arrays were converted into homogeneous arrays (see Materials and methods for further …

https://doi.org/10.7554/eLife.35453.027
Figure 3—figure supplement 4
Multicopy K3L arrays contain diverse combinations of K3LWT and K3LHis47Arg alleles.

(A) The proportions of 3-copy K3L arrays containing each possible combination of K3LWT and K3LHis47Arg alleles at P10, P15, and P20 were counted. Dotted red lines separate mixed and homogeneous …

https://doi.org/10.7554/eLife.35453.028
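
The distinction between mixed and homogeneous arrays, and the counting of allele combinations within arrays of a given copy number, can be illustrated with a short Python sketch. It assumes each read has already been reduced to a string with one character per K3L copy ('W' for K3LWT, 'H' for K3LHis47Arg); that encoding and the function names are hypothetical and do not come from the authors' code.

```python
from collections import Counter
from itertools import product


def classify_array(alleles: str) -> str:
    """Return 'homogeneous' if every copy carries the same allele, else 'mixed'."""
    if not alleles or set(alleles) - {"W", "H"}:
        raise ValueError("expected a non-empty string of 'W' and 'H'")
    return "homogeneous" if len(set(alleles)) == 1 else "mixed"


def combination_proportions(arrays, copy_number):
    """Proportion of each possible allele combination among arrays
    that contain exactly `copy_number` copies."""
    selected = [a for a in arrays if len(a) == copy_number]
    counts = Counter(selected)
    total = len(selected)
    combos = ["".join(p) for p in product("WH", repeat=copy_number)]
    return {c: (counts[c] / total if total else 0.0) for c in combos}


# Toy arrays; NOT data from the paper.
reads = ["WWW", "HHH", "WHW", "HHH", "WH"]
print([classify_array(r) for r in reads])
print(combination_proportions(reads, 3))
```

For a 3-copy array there are 2^3 = 8 possible combinations, of which only two (all wild-type and all variant) are homogeneous.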
Figure 4
The K3LHis47Arg variant homogenizes in K3L arrays regardless of copy number.

(A) ONT reads from every 5th passage were grouped by K3L copy number, and each K3L copy was assessed for the presence or absence of the K3LHis47Arg SNV. Reads containing 1–5 K3L copies are shown. (B)…

https://doi.org/10.7554/eLife.35453.029
Figure 5
K3LHis47Arg homogenization within multicopy arrays is independent of intergenomic recombination rate.

The P10 population was serially passaged in HeLa cells at different MOIs (listed above each plot), and each of the resulting P15 populations was sequenced with ONT. Stacked bar plots representing …

https://doi.org/10.7554/eLife.35453.030
Figure 6
K3LHis47Arg variant homogenization is dependent on selection.

The P10 population was serially passaged five times in BHK cells (MOI = 0.1, 48 hr; P15-BHK). P10 and P15 data are included from previous figures for comparison with P15-BHK. (A) K3L copy number was …

https://doi.org/10.7554/eLife.35453.031
Figure 6—source data 2

Data used to generate Figure 6D.

https://doi.org/10.7554/eLife.35453.032
Figure 6—source data 1

Single nucleotide variants in all sequenced virus populations from Illumina or ONT datasets.

Data used to generate Figure 6B.

https://doi.org/10.7554/eLife.35453.033
Figure 7
Model of K3LHis47Arg homogenization within K3L CNV via gene conversion.
https://doi.org/10.7554/eLife.35453.034
Author response image 1
Variety of mixed vaccinia genomes in P15 sequencing data.

We selected a random set of 20 vaccinia genomes of the specified copy number from the P15 sequencing data (a passage with a large proportion of mixed genomes), and plotted the distribution of K3LHis4…

https://doi.org/10.7554/eLife.35453.036
Author response image 2
Error rate distribution within K3LHis47Arg sequence context.

(A) Kernel density plots representing the distribution of error rates for T>C, T>A, T>G, and T>deletion error across all TATGC 5-mers in data from each flowcell chemistry. (B) Kernel density plots …

https://doi.org/10.7554/eLife.35453.037
Author response image 3
Oxford Nanopore sequencing error does not impact observed patterns of K3L copy number or K3LHis47Arg allele heterogeneity.

The P15 vaccinia population was sequenced with R7.3, R9, and R9.4 chemistry ONT flowcells. (A) Stacked bar plots representing the diversity of allele combinations within single-copy and multicopy …

https://doi.org/10.7554/eLife.35453.038

Tables

Table 1
Summary of ONT sequencing datasets
https://doi.org/10.7554/eLife.35453.020
Population* | Total sequenced reads | Mean read length (bp) | Read length N50 (bp) | Total sequenced bases (Gbp) | Reads containing K3L
P5 | 239,737 | 2168 | 5932 | 0.52 | 1190
P10 | 91,815 | 3523 | 7693 | 0.32 | 912
P15 | 388,502 | 4493 | 6908 | 1.75 | 4317
P20 | 94,050 | 2893 | 7702 | 0.27 | 789
*ONT sequencing datasets for all populations are available in Table 1-source data 1.

Table 1-source data 1

Complete summary of ONT sequencing datasets

https://doi.org/10.7554/eLife.35453.021
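
The read length N50 reported in Table 1 is a standard summary statistic: the length L such that reads of length L or longer contain at least half of all sequenced bases. A minimal Python sketch of that definition follows; the function name and the toy lengths are illustrative only, and this is not the tool the authors used to compute the table.

```python
def read_length_n50(lengths):
    """Length L such that reads of length >= L contain at least half of all bases."""
    if not lengths:
        raise ValueError("need at least one read length")
    total = sum(lengths)
    running = 0
    # Walk from the longest read down until half of the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Toy read lengths in bp; NOT data from the paper.
print(read_length_n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

Because long reads contribute disproportionately many bases, the N50 is typically larger than the mean read length, which matches the pattern in every row of Table 1.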
Table 2
Median sequencing error rates using various ONT flowcell chemistries
https://doi.org/10.7554/eLife.35453.022
Mutation and context (amino acid change) | R7.3 | R9 | R9.4
TA[T > C]GC (His47Arg) | 0.023 | 0.023 | 0.005
TA[C > T]GC (Arg47His) | 0.015 | 0.024 | 0.026
AT[T > C]CG (Glu495Gly) | 0.014 | 0.018 | 0.009
AT[C > T]CG (Gly495Glu) | 0.025 | 0.009 | 0.009
Table 3
Structural variant breakpoint frequencies during passaging
https://doi.org/10.7554/eLife.35453.023
Breakpoint frequency* (columns P5 to P20)
Breakpoint | K2L break | K4L break | P5 | P10 | P15 | P20
1 | 30,284 | - | 0.76 | 0.69 | 0.76 | 0.66
1 | - | 30,837 | 0.76 | 0.63 | 0.72 | 0.62
2 | 30,287 | - | 0.14 | 0.06 | 0.10 | 0.08
2 | - | 30,840 | 0.12 | 0.04 | 0.09 | 0.05
*Due to sequencing errors, a proportion of reads do not match either breakpoint.

Key resources table
Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information
Gene (Vaccinia virus) | K3L | NA | NCBI_Gene ID:3707649 |
Strain, strain background (Vaccinia virus) | VC-2, Copenhagen | (Goebel et al., 1990) PMID: 2219722 | NCBI_txid:10249; NCBI_GenBank:M35027.1 |
Strain, strain background (Vaccinia virus) | ΔE3L, Copenhagen | (Beattie et al., 1995) PMID: 7527085 | |
Cell line (Homo sapiens) | HeLa | Other | | Obtained from Geballe lab, University of Washington
Cell line (Mesocricetus auratus) | BHK | Other | | Obtained from Geballe lab, University of Washington
Commercial assay or kit | Covaris g-TUBE | Covaris, Inc. | Catalog no: 520079 |
Commercial assay or kit | DIG High-Prime DNA Labeling and Detection Starter Kit II | Roche | Catalog no: 11585614910 |
Commercial assay or kit | Nextera XT DNA library preparation kit | Illumina | Catalog no: FC-131–1024 |
Commercial assay or kit | SQK-NSK007; SQK-LSK208; SQK-LSK308; SQK-RAD002 | Oxford Nanopore Technologies | Catalog no: SQK-NSK007; SQK-LSK208; SQK-LSK308; SQK-RAD002 |
Commercial assay or kit | FLO-MIN104; FLO-MIN106; FLO-MIN107 | Oxford Nanopore Technologies | Catalog no: FLO-MIN104; FLO-MIN106; FLO-MIN107 |
Chemical compound, drug | DMEM | HyClone, VWR | Catalog no: 16777–129 |
Chemical compound, drug | FBS | HyClone, VWR | Catalog no: 26-140-079 |
Chemical compound, drug | Penicillin-streptomycin | GE Life Sciences, VWR | Catalog no: 16777–164 |
Chemical compound, drug | SG-2000 | GE Life Sciences, VWR | Catalog no: 82024–258 |
Software, algorithm | GraphPad Prism | GraphPad Software | |
Software, algorithm | BWA-MEM | (Li, 2013) | v0.7.15 | arxiv.org/abs/1303.3997
Software, algorithm | samblaster | (Faust and Hall, 2014) PMID: 24812344 | v0.1.24 | https://github.com/GregoryFaust/samblaster
Software, algorithm | freebayes | (Garrison and Marth, 2012) | v1.0.2–14 | arxiv.org/abs/1207.3907
Software, algorithm | Metrichor | Oxford Nanopore Technologies | v2.40 |
Software, algorithm | Albacore | Oxford Nanopore Technologies | v1.2.4 |
Software, algorithm | poretools | (Loman and Quinlan, 2014) PMID: 25143291 | v0.6.0 | https://github.com/arq5x/poretools
Software, algorithm | Porechop | Other | v0.2.3 | https://github.com/rrwick/Porechop
Software, algorithm | nanopolish | (Loman et al., 2015) PMID: 26076426 | v0.8.4 | https://github.com/jts/nanopolish
Software, algorithm | source code | this paper | | See Materials and methods, https://github.com/tomsasani/vacv-ont-manuscript; copy archived at https://github.com/elifesciences-publications/vacv-ont-manuscript
Software, algorithm | raw sequencing data | this paper | SRP128569; SRP128573; DOI: 10.5281/zenodo.1319732 | See Materials and methods
Software, algorithm | raw sequencing data | (Elde et al., 2012) PMID: 22901812 | SRP013146 |
