Figures and data
![](https://prod--epp.elifesciences.org/iiif/2/98102%2Fv1%2Fcontent%2F581995v1_fig1.tif/full/max/0/default.jpg)
Basepairing is predictive of synonymous substitution frequency.
Distribution of frequencies of synonymous substitutions for the most common substitution types (each accounting for approximately 5% or more of observed substitutions), expressed as the estimated mutational fitness, which is a logarithmic comparison of the observed versus the expected number of occurrences of each type of substitution in the SARS-CoV-2 phylogenetic tree⁴. Distributions are grouped by substitution type and by whether or not positions are basepaired in a full-genome secondary structure model of SARS-CoV-2 in Huh7 cells².
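As a minimal sketch of the logarithmic comparison described in this caption, the function below computes a log ratio of observed to expected counts. The function name, the natural logarithm, and the pseudocount of 0.5 (used so that zero counts stay finite) are assumptions made for illustration; the caption does not specify them.

```python
import math


def mutational_fitness(observed: int, expected: float, pseudocount: float = 0.5) -> float:
    """Log ratio of observed to expected substitution counts.

    The pseudocount and the use of the natural logarithm are assumptions,
    not details taken from the caption.
    """
    return math.log((observed + pseudocount) / (expected + pseudocount))


# A substitution seen as often as expected scores zero; one seen less
# often than expected scores below zero, and more often above zero.
neutral = mutational_fitness(observed=10, expected=10.0)
depleted = mutational_fitness(observed=0, expected=10.0)
enriched = mutational_fitness(observed=20, expected=10.0)
```

Under these assumptions, values near zero indicate that a substitution occurs about as often as expected, negative values indicate depletion, and positive values indicate enrichment.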
![](https://prod--epp.elifesciences.org/iiif/2/98102%2Fv1%2Fcontent%2F581995v1_fig2.tif/full/max/0/default.jpg)
Estimated mutational fitness correlates with secondary structure for nonsynonymous C→T substitutions.
Scatter plots compare mutational fitness to average DMS reactivity for positions with potential nonsynonymous C→T substitutions. Positions lacking data are assigned the minimum observed DMS reactivity value. Points are colored by basepairing in the full-genome secondary structure model. Highlighted are nonsynonymous C→T substitutions at basepaired positions that rank highly for mutational fitness and characterize major SARS-CoV-2 lineages. Synonymous C29095T at an unpaired position is also highlighted. Left: Estimated mutational fitness based only on observed versus expected occurrences of C→T at each position. Right: Mutational fitness adjusted by constants derived from the medians of mutational fitness for synonymous substitutions at basepaired, unpaired, and all potential C→T positions.
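The two data-handling steps in this caption can be sketched as follows. This is one plausible reading, not the authors' procedure: the caption does not state how the median-derived constants are combined, so the sketch assumes each constant is the median for the position's pairing group minus the median over all synonymous C→T positions, and that it is subtracted from the raw estimate. The function names `impute_min` and `adjust_fitness` are hypothetical.

```python
from statistics import median


def impute_min(reactivities):
    """Assign the minimum observed DMS reactivity to positions lacking data (None)."""
    floor = min(r for r in reactivities if r is not None)
    return [floor if r is None else r for r in reactivities]


def adjust_fitness(fitness, paired, syn_fitness, syn_paired):
    """Subtract a per-group constant from each fitness estimate.

    ASSUMPTION: constant = median(synonymous C->T fitness in the same pairing
    group) - median(synonymous C->T fitness at all positions). The caption
    only says the constants are derived from these medians.
    """
    overall = median(syn_fitness)
    offsets = {}
    for flag in (True, False):
        group = [f for f, p in zip(syn_fitness, syn_paired) if p == flag]
        offsets[flag] = median(group) - overall
    return [f - offsets[p] for f, p in zip(fitness, paired)]


# Toy values for illustration only (not data from the study).
filled = impute_min([0.2, None, 0.05])
adjusted = adjust_fitness(
    fitness=[3.0, 0.5],
    paired=[True, False],
    syn_fitness=[2.0, 2.0, 0.0, 0.0],
    syn_paired=[True, True, False, False],
)
```

With this reading, the adjustment shifts the basepaired and unpaired groups so that their synonymous medians coincide, which removes the part of the estimate attributable to pairing state alone.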