Basepairing is predictive of synonymous substitution frequency.

Distributions of synonymous substitution frequencies for the most common substitution types (each accounting for approximately 5% or more of observed substitutions), expressed as estimated mutational fitness, a logarithmic comparison of the observed versus the expected number of occurrences of each substitution type in the SARS-CoV-2 phylogenetic tree [4]. Distributions are grouped by substitution type and by whether positions are basepaired in a full-genome secondary structure model of SARS-CoV-2 in Huh7 cells [2].
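The "logarithmic comparison" of observed and expected counts can be illustrated with a minimal sketch. The caption does not give the exact formula, so the natural logarithm, the pseudocount of 0.5, and the function name `mutational_fitness` are assumptions made here for illustration only.

```python
import math


def mutational_fitness(observed, expected, pseudocount=0.5):
    """Log ratio of observed to expected substitution counts.

    Assumption: natural log with a small pseudocount so that sites with
    zero observed occurrences still yield a finite value. The caption only
    says "logarithmic comparison"; the exact form is not stated there.
    """
    if observed < 0 or expected < 0:
        raise ValueError("counts must be non-negative")
    return math.log((observed + pseudocount) / (expected + pseudocount))


# A substitution seen as often as expected scores 0; one seen less often
# than expected scores below 0, and one seen more often scores above 0.
neutral = mutational_fitness(10, 10)
depleted = mutational_fitness(2, 10)
enriched = mutational_fitness(30, 10)
```

Under this sketch, negative values indicate substitutions observed less often than expected and positive values indicate substitutions observed more often.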

Estimated mutational fitness correlates with secondary structure for nonsynonymous C→T substitutions.

Scatter plots compare mutational fitness to average DMS reactivity for positions with potential nonsynonymous C→T substitutions. Positions lacking data are assigned the minimum observed DMS reactivity value. Points are colored by basepairing in the full-genome secondary structure model. Nonsynonymous C→T substitutions at basepaired positions that rank highly for mutational fitness and characterize major SARS-CoV-2 lineages are highlighted. Synonymous C29095T at an unpaired position is also highlighted. Left: Estimated mutational fitness based only on observed versus expected occurrences of C→T at each position. Right: Mutational fitness adjusted by constants derived from the medians of mutational fitness for synonymous substitutions at basepaired, unpaired, and all potential C→T positions.
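One plausible reading of the median-based adjustment in the right panel is sketched below. The caption does not state how the constants are combined, so this sketch assumes that each constant is the difference between a group median (basepaired or unpaired) and the median over all synonymous C→T positions, and that it is subtracted from the raw value. The names `median_offsets` and `adjust_fitness` are hypothetical.

```python
from statistics import median


def median_offsets(synonymous_sites):
    """Derive per-group constants from synonymous C->T sites.

    synonymous_sites: list of (fitness, is_basepaired) tuples.
    Assumption: offset = group median minus overall median. Both groups
    must be non-empty, otherwise statistics.median raises an error.
    """
    overall = median(f for f, _ in synonymous_sites)
    paired = median(f for f, p in synonymous_sites if p)
    unpaired = median(f for f, p in synonymous_sites if not p)
    return {True: paired - overall, False: unpaired - overall}


def adjust_fitness(fitness, is_basepaired, offsets):
    """Remove the structure-dependent shift from a raw fitness value."""
    return fitness - offsets[is_basepaired]


# Toy values, not data from the study.
synonymous = [(-1.0, True), (-0.5, True), (0.5, False), (1.0, False)]
offsets = median_offsets(synonymous)
adjusted_paired = adjust_fitness(0.25, True, offsets)
adjusted_unpaired = adjust_fitness(0.25, False, offsets)
```

With these toy values, the same raw estimate is shifted upward at a basepaired position and downward at an unpaired one, which is the kind of correction needed when basepairing alone lowers the apparent C→T rate.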