Basic organization of N-protein and RNPs.

N-protein (1-419) has two folded domains, NTD (45-180) and CTD (248-363), and three IDRs: the N-arm (1-44), the central linker (181-247), and the C-arm (364-419). (A) AF2 structure in which the disordered N-arm, linker, and C-arm are artificially stretched for clarity. Residues are color-coded according to the number of different amino acids that have been observed at each position in the mutational landscape, replacing the Wuhan-Hu-1 sequence. The bioinformatic analysis was carried out as described previously (Zhao et al., 2023), updated to August 26, 2024, using a threshold of >5 genomes for each mutation. (B) Schematic of protein-protein and protein-RNA interfaces in RNP assembly. The dimerization domain (CTD) is indicated in green. The linker is subdivided into a serine- and arginine-rich region (180-205, SR) and an L-rich region (206-247, LRS). The LRS can transiently fold into helices (yellow) that create a hydrophobic patch for promiscuous self-association (indicated as yellow pattern). For clarity, the cartoon shows only three neighboring N-protein dimers, although higher-order oligomers assemble in RNPs. Nucleic acid binding sites (purple triangles) preferentially bind single-stranded RNA at the NTD (grey lines) and double-stranded RNA at the two sites per CTD dimer, with the ability to cross-link neighboring dimers, potentially in various configurations. New inter-dimer interactions that evolved in VOCs are indicated by red connectors, including the promotion of beta-sheet oligomerization through the N:P13L mutation in the N-arm (as in Omicron and Lambda variants), and the introduction of cysteines at the base of the LRS helices in N:G214C (as in Lambda variants) and N:G215C (as in Delta variants). (C) Three-dimensional cartoon of the circular organization of N-protein domains in RNPs, with one dimer shaded slightly darker to highlight the dimeric building blocks. For clarity, subunit sizes are not drawn to scale.
(D) CD spectra of N-protein in the presence of SL7 under near-physiological salt conditions, in which the majority of the protein assembles into RNPs. Spectra are corrected for free SL7 contributions. Shown are spectra of ancestral N-protein alone (black) and in the presence of SL7 forming RNPs (red), and of Nλ alone (cyan) and in the presence of SL7 forming RNPs (magenta). Spectra are truncated below 205 nm due to limited buffer transparency. For comparison, the dotted line shows a previously published CD spectrum of ancestral N-protein in low-salt buffer, which permits measurement at shorter wavelengths (Nguyen et al., 2024).

Overview of N-protein mutant species studied

The P13L mutation creates self-association interfaces in the N-arm through stabilization of β-sheets.

(A) Electron micrograph of negatively stained Omicron N-arm Nο,1-43:P13L,Δ31-33 after equilibration at 10 μM in 20 mM HEPES, 150 mM NaCl, pH 7.50. The magnified regions are examples of twisted (*) and straight (**) fibrils. (B) CD spectra of N1-43:P13L (red), N1-43:Δ31-33 (blue), and Omicron Nο,1-43:P13L,Δ31-33 (magenta) at 0.4 mM (dashed) and 1.0 mM (solid), in comparison with ancestral N-arm (black). (C) Subset of ColabFold predictions of multimers of N10-20:P13L, highlighting hydrogen bonds. The P13L residue is highlighted in red in the middle peptide.

MD simulation of ancestral LRS and comparison with the G214C and G215C mutants of the Lambda and Delta variants, respectively.

Snapshots at equal time intervals (4 ns) taken from the 200-ns MD simulations of the ancestral peptide N210-246 and single-point G→C mutants (all monomers were oriented by superimposing the helical region and are displayed in orthographic view). The upper row shows a view from the N-terminal side (with the helix axes perpendicular to the plane of the figure), while the lower row presents a side view (with the axes in the plane). For clarity, the disordered C-terminal segment (residues 236-246) has been removed. The glycine and cysteine residues at positions 214 and 215 are colored blue and red, respectively, in the ancestral peptide and rendered as ball-and-stick in the mutants. The cysteine substitutions restructure the flexible backbone around the GG motif into a well-defined α-helical turn, directing the sulfhydryl group in specific orientations (illustrated schematically by the blue and red arches). These orientations can be quantified relative to the Leu-rich central region (indicated by arrows and rendered as gray van der Waals spheres), which forms the hydrophobic interfaces of the oligomers (Zhao et al., 2023). Under reducing conditions, this reorientation of the N-terminus relative to the helix can influence helix binding during the early stages of oligomerization or alter the conformation and physicochemical properties of the resulting oligomers, as illustrated in Supplementary Figure S5.

Impact of mutations in self-association interfaces on oligomeric state.

(A) Sedimentation coefficient distributions of cysteine mutants N:G215C (reduced, in yellow) and N:G215C* (oxidized, in brown), as well as Nλ (reduced, in magenta) and Nλ* (oxidized, in violet), with Nλ data offset by 2. For each sample, data were acquired at high (solid lines) and low (dashed lines) concentration. (B) Sedimentation coefficient distributions of 2 μM N:P13L,Δ31-33, ancestral N, and N:P13L,Δ31-33,L222P in the presence of 10 μM T10 in low-salt buffer. The inset shows DLS autocorrelation data of the same samples (symbols) and single-species fits (lines).

Size distributions and stability of ancestral and mutant RNPs.

Shown are SV (A,B) and MP data (C-F) for mixtures of N-protein with stem-loop RNA SL7 at a molar ratio of 1(N):1.15(SL7), at a high concentration of 3 μM (A,D) or a low concentration of 0.3 μM (B,C,E,F) in reducing buffer conditions. All panels use the same color scheme for N-protein: ancestral (black), N:P13L,Δ31-33 (red), N:G215C (orange), Nο (cyan), N:R203K/G204R (blue), Nλ (magenta). (A) Sedimentation coefficient distributions of mixtures equilibrated at high concentration, with reaction boundary peaks magnified in the inset. For reference, the ancestral RNP s-value is drawn as a dotted vertical line. Absorbance data are recorded at 260 nm and are weighted by the SL7 content of sedimenting species. Higher reaction boundary s-values signify greater affinity or lifetime of the mutant RNPs. (B) Sedimentation coefficient distributions of the same samples as in (A), diluted tenfold and equilibrated, highlighting dissociation of most RNPs into a range of intermediate-size complexes. (C) MP experiments of equilibrated 0.3 μM mixtures. The measured number distributions are presented as cumulative distributions, in which higher percentages of large species appear as shifts to the right. Most samples are largely dissociated into dimers, with the remaining peaks corresponding to populations of dimers to hexamers of N₂/SL7₂ subunits. As an example of the resolution of distinct species, the inset shows the differential distributions (histograms) for N:R203K/G204R (blue), ancestral N (black), and N:P13L,Δ31-33 (red). (D) Mass distributions acquired in stopped-flow configuration applied to 3 μM mixtures. Larger (negative) contrasts correspond to higher molecular weights, with major peaks corresponding to species containing 1, 4, 5, and 6 N₂/SL7₂ subunits. (E) For kinetic experiments, mass distributions were acquired in different time intervals after tenfold dilution of 3 μM mixtures, here showing data collected from 3 sec to 23 sec.
(F) Number-average molecular weights of assembled RNPs between 500 and 1,500 kDa observed in consecutive 20 sec data acquisition intervals after tenfold dilution of 3 μM mixtures (circles). The dashed horizontal lines are number-averages determined from the equilibrated 0.3 μM mixtures in (C). The solid lines are best-fit single exponentials constrained to decay to the measured equilibrium values, yielding the RNP lifetimes listed in Table 1.
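The constrained single-exponential analysis in (F) can be illustrated with a minimal sketch. It assumes the model M(t) = M_eq + A·exp(-t/τ), with the equilibrium value M_eq held fixed at the separately measured plateau. For simplicity it estimates τ by linear regression of ln(M - M_eq) against time rather than by nonlinear least squares, which may differ from the fitting procedure actually used. The function name `fit_lifetime` and all numbers are hypothetical.

```python
import math

def fit_lifetime(times, masses, m_eq):
    """Estimate tau in M(t) = m_eq + A * exp(-t / tau) with m_eq held fixed.

    Simplification: a log-linear least-squares fit of ln(M - m_eq) versus t.
    Points at or below m_eq cannot be log-transformed and are skipped.
    """
    pts = [(t, math.log(m - m_eq)) for t, m in zip(times, masses) if m > m_eq]
    if len(pts) < 2:
        raise ValueError("need at least two points above the equilibrium value")
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in pts)
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("data do not decay toward m_eq")
    amplitude = math.exp(mean_y - slope * mean_t)
    return -1.0 / slope, amplitude

# Hypothetical, noise-free data: midpoints of consecutive 20 s windows
# starting 3 s after dilution, decaying toward an assumed plateau of 700 kDa.
m_eq = 700.0
true_tau = 60.0
times = [13.0 + 20.0 * i for i in range(6)]
masses = [m_eq + 250.0 * math.exp(-t / true_tau) for t in times]

tau, amplitude = fit_lifetime(times, masses, m_eq)
print(f"tau = {tau:.1f} s, amplitude = {amplitude:.1f} kDa")
```

Fixing M_eq leaves only the amplitude and the lifetime as free parameters, which is what makes a lifetime estimate feasible from a handful of 20 sec intervals. With noisy data, a weighted nonlinear fit would be preferable because the log transform distorts the errors of points close to the plateau.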

Impact of LRS disulfide bonds on the size and stability of RNPs.

N-proteins with cysteine mutations in the LRS were oxidized to form disulfide-linked oligomers (as shown in Figure 4A) and mixed with stem-loop RNA SL7 at a molar ratio of 1(N):1.15(SL7). Shown are data for oxidized N:G215C* (brown) and oxidized Nλ* (green), and for comparison, reduced N:G215C (yellow) and reduced Nλ (magenta). (A) Sedimentation coefficient distributions at 3 μM (solid lines) and 0.3 μM (dashed lines) protein. (B) Molecular weight distributions in MP experiments of the same mixtures rapidly diluted to 0.3 μM protein, acquired from 3 to 23 sec after dilution. (C) Time course of number-average RNP molecular weights between 500 and 1,500 kDa (circles), determined from the rapid dilution experiments in (B) for consecutive 20 sec data acquisition intervals. The solid lines are best-fit single-exponential decays constrained to attain the separately measured equilibrium values at 0.3 μM protein (dashed lines), with lifetimes listed in Table 1.
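The number-average molecular weight in (C) can be sketched as follows. The sketch assumes that each single-particle event contributes one mass value and that the 500 to 1,500 kDa window is inclusive at both ends; both are assumptions for illustration. The function name `number_average_mass` and the event list are hypothetical.

```python
def number_average_mass(masses_kda, low=500.0, high=1500.0):
    """Number-average mass of counted particles with low <= mass <= high.

    Each event counts once, so the number average is the plain arithmetic
    mean of the masses inside the window. Returns None if the window is empty.
    """
    window = [m for m in masses_kda if low <= m <= high]
    if not window:
        return None
    return sum(window) / len(window)

# Hypothetical single-particle masses (kDa) from one 20 s acquisition interval.
events = [95.0, 110.0, 480.0, 520.0, 640.0, 760.0, 1510.0]
avg = number_average_mass(events)
print(avg)
```

Restricting the average to the mass window excludes the abundant dissociated species and any rare large aggregates, so the value reports only on the assembled RNP population.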

Mutation effect on packaging and cell entry in a VLP assay.

Error bars are standard deviations from n = 4. Stars indicate significant differences at the 95% confidence level in a two-sided Kolmogorov-Smirnov test comparing the ancestral control measurements with those of the mutants.
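For readers unfamiliar with the two-sample Kolmogorov-Smirnov test at such small sample sizes, the following sketch computes the KS statistic and an exact two-sided p-value by enumerating all relabelings of the pooled measurements. This is a permutation formulation chosen for illustration and is not necessarily the implementation used for the figure. It assumes four replicates in each group; the function names and data values are hypothetical.

```python
from itertools import combinations

def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    d = 0.0
    for x in sorted(set(a) | set(b)):
        cdf_a = sum(v <= x for v in a) / len(a)
        cdf_b = sum(v <= x for v in b) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d

def ks_permutation_p(a, b):
    """Exact two-sided p-value from all splits of the pooled data."""
    pooled = list(a) + list(b)
    observed = ks_statistic(a, b)
    hits = total = 0
    for idx in combinations(range(len(pooled)), len(a)):
        chosen = set(idx)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point round-off.
        if ks_statistic(group_a, group_b) >= observed - 1e-12:
            hits += 1
    return observed, hits / total

# Hypothetical normalized signals, n = 4 per group.
control = [1.00, 0.92, 1.08, 0.97]
mutant = [2.10, 1.85, 2.40, 1.95]
d, p = ks_permutation_p(control, mutant)
print(f"D = {d:.2f}, p = {p:.4f}")
```

With four untied values per group there are only 70 possible splits, so the smallest attainable two-sided p-value is 2/70 (about 0.029). Under these assumptions, only complete separation of the two groups reaches significance at the 95% level.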

Replication kinetics of recombinant SARS-CoV-2 reporter viruses in cell lines.

(A) Representative images of Vero-TMPRSS2 and A549-ACE2 cells infected with SARS-CoV-2 P13L or WT at different time points post-infection. (B) Quantification of fluorescence intensity from the P13L and WT virus infections shown in (A). (C) Viral titers in the supernatant from infected cells. Error bars are standard deviations (n = 3), and stars indicate significant differences at the 95% confidence level.