Research Advance

Evolution of a fuzzy ribonucleoprotein complex in viral assembly

Laboratory of Dynamics of Macromolecular Assembly, National Institute of Biomedical Imaging and Bioengineering, National Institutes of Health, United States
Cellular Biology Section, Laboratory of Viral Diseases, National Institute of Allergy and Infectious Diseases, National Institutes of Health, United States
Bioinformatics and Computational Biosciences Branch, National Institute of Allergy and Infectious Diseases, National Institutes of Health, United States
Electron Microscopy Unit, Trans-NIH Shared Resource on Biomedical Engineering and Physical Science, National Institute of Biomedical Imaging and Bioengineering, National Institutes of Health, United States
Biophysics Core Facility, National Heart, Lung, and Blood Institute, National Institutes of Health, United States
Laboratory of Cellular Imaging and Macromolecular Biophysics, National Institute of Biomedical Imaging and Bioengineering, National Institutes of Health, United States
HIV Dynamics and Replication Program, Center for Cancer Research, National Cancer Institute, United States

Dec 30, 2025

https://doi.org/10.7554/eLife.108922.3

Open access

12 figures, 2 tables and 2 additional files

Figures

Figure 1 with 2 supplements

Basic organization of N-protein and RNPs.

N-protein (1-419) has two folded domains, NTD (45-180) and CTD (248-363), and three intrinsically disordered regions: the N-arm (1-44), the central linker (181-247), and the C-arm (364-419). (A) Displayed is an AlphaFold2 structure where the disordered N-arm, linker, and C-arm are artificially stretched for clarity. The residues are color-coded according to the number of different amino acids that have been observed at this position in the mutational landscape replacing the Wuhan-Hu-1 sequence. The bioinformatic analysis was carried out as described previously (Zhao et al., 2023), updated to August 26, 2024, using a threshold of >5 genomes for each mutation. (B) Schematic of protein-protein and protein-RNA interfaces in RNP assembly. The nucleic acid binding domain at the N-terminus (NTD) is indicated in blue, the LRS in yellow, and the dimerization domain (CTD) in green. Regions of self-association are indicated by shaded backgrounds. The linker is subdivided into a serine- and arginine-rich region (180–205, SR) and an L-rich region (206–247, LRS). The LRS can transiently fold into helices that create a hydrophobic patch for promiscuous self-association (indicated as yellow background). For clarity, the cartoon only shows three neighboring N-protein dimers, although higher-order oligomers assemble in RNPs. Nucleic acid binding sites (purple triangles) preferentially bind single-stranded RNA at the NTD (gray lines), and double-stranded RNA at the two sites per CTD dimer, with the ability to cross-link neighboring dimers, potentially in various configurations. New inter-dimer interactions that evolved in variants of concern are indicated by red connectors, including the promotion of beta-sheet oligomerization through the N:P13L mutation in the N-arm (as in Omicron and Lambda variants), and the introduction of cysteines at the base of the LRS helices in N:G214C (as in Lambda variants) and N:G215C (as in Delta variants).
(C) Three-dimensional cartoon of the circular organization of N-protein domains in RNPs, with one dimer shaded slightly darker to highlight the dimeric building blocks. For clarity, subunit sizes are not drawn to scale. Alternate arrangements are depicted in Figure 1—figure supplement 1. (D) CD spectra of N-protein in the presence of SL7 under near-physiological salt conditions leading to majority assembly of RNPs. Spectra are corrected for free SL7 contributions. Shown are spectra of ancestral N-protein alone (black), in the presence of SL7 forming RNPs (red), N_λ alone (cyan), and in the presence of SL7 forming RNPs (magenta). Spectra are truncated at <205 nm due to limited buffer transparency. For comparison, the dotted line shows a previously published CD spectrum of ancestral N-protein in low-salt buffer that permits measurement at shorter wavelengths (Nguyen et al., 2024). Triplicate scans yield average standard deviations of 0.13 (N), 0.17 (N+SL7), 0.16 (N_λ), and 0.21 (N_λ+SL7) 10³ deg cm²/dmol, respectively, with non-overlapping confidence bands for the different species, for example, between 215 and 220 nm.
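
The domain boundaries quoted in this legend can be written down as a small lookup table, which makes it easy to see where the mutations discussed in the later figures fall. The following is only an illustrative sketch: the names `REGIONS`, `region_of`, and `parse_position` are hypothetical, and because the legend gives the SR region as 180–205 while the NTD ends at 180, the sketch assumes the SR region starts at residue 181.

```python
# Region boundaries of SARS-CoV-2 N-protein as quoted in the Figure 1 legend.
# Assumption of this sketch: the SR region starts at 181 (the legend says 180-205)
# so that it does not overlap the NTD (45-180).
REGIONS = [
    ("N-arm", 1, 44),
    ("NTD", 45, 180),
    ("linker/SR", 181, 205),
    ("linker/LRS", 206, 247),
    ("CTD", 248, 363),
    ("C-arm", 364, 419),
]


def region_of(residue: int) -> str:
    """Return the name of the region containing a 1-based residue number."""
    for name, start, end in REGIONS:
        if start <= residue <= end:
            return name
    raise ValueError(f"residue {residue} is outside N-protein (1-419)")


def parse_position(mutation: str) -> int:
    """Extract the residue number from a label such as 'P13L' or 'G215C'."""
    digits = "".join(ch for ch in mutation if ch.isdigit())
    return int(digits)


if __name__ == "__main__":
    # Point mutations named in the figure legends of this article.
    for mut in ["P13L", "R203K", "G204R", "G214C", "G215C", "L222P"]:
        print(mut, "->", region_of(parse_position(mut)))
```

With these boundaries, P13L maps to the N-arm, R203K and G204R to the SR part of the linker, and G214C, G215C, and L222P to the LRS.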

Figure 1—figure supplement 1

Possible heterogeneity of RNPs from alternate configurations of N dimer subunits.

Only the N-protein scaffold is shown for clarity. The nucleic acid binding domain at the N-terminus (NTD) is indicated in blue, the dimerization domain (CTD) in green, and the L-rich region (206–247, LRS) folded into helices is shown as yellow rods. The other disordered regions are shown as black lines. (A) Based on the cartoon of Figure 1C, an RNP is sketched where two of the six dimer subunits have flipped LRS orientations, resulting in the swapped position of NTD and CTD subunits. (B) Arrangement of five dimer subunits. (C) Arrangement of seven dimer subunits. (D) N_210-419* constructs lacking the NTD are capable of forming RNPs of similar size, conceivably substituting NTD binding sites for nucleic acid by additional CTD subunits in upside-down orientation. (E) Mixed RNPs combining N_210-419* and full-length N-protein in different orientations.

Figure 1—figure supplement 2

Structural prediction of an RNP.

Top view (left) and side view (right) of an AlphaFold3 model of 5 N-protein dimers (ancestral Wuhan-Hu-1) with 10 copies of SL7 RNA. One N-protein dimer is highlighted, showing the two chains in blue and cyan. Three copies of SL7 adjacent to the highlighted dimer are shown in yellow; other copies are omitted for clarity. The top view shows the symmetrical arrangement of LRS helices forming a decameric core. The remainder of the linker, as well as the entire N-arm and C-arm, is disordered and not meaningfully predicted. Due to significant disordered regions, the predicted model is not unique and is limited by the fact that multiple copies of SL7 are used as RNA ligand mimicking the biophysical RNP model.

Figure 2 with 2 supplements

The P13L mutation creates self-association interfaces in the N-arm through stabilization of β-sheets.

(A) Electron micrograph of negatively stained Omicron N-arm N_o,1-43:P13L,Δ31-33 after equilibration at 10 μM in 20 mM HEPES, 150 mM NaCl, pH 7.50. The magnified regions are examples of twisted (*) and straight (**) fibrils. (B) CD spectra of N_1-43:P13L (red), N_1-43:Δ31-33 (blue), and Omicron N_o,1-43:P13L,Δ31-33 (magenta) at 0.4 mM (dashed) and 1.0 mM (solid), in comparison with ancestral N-arm (black). (C) Subset of ColabFold prediction of multimers of N_10-20:P13L highlighting hydrogen bonds. The P13L residue is highlighted in red in the middle peptide.

Figure 2—figure supplement 1

Electron micrographs of N_1-43:P13L and control.

(A) Electron micrograph of negatively stained N-arm peptide N_1-43:P13L after equilibration at 10 μM in 20 mM HEPES, 150 mM NaCl, pH 7.50. (B) As a control, electron micrograph of negatively stained C-arm N_364-419 under the same conditions.

Figure 2—figure supplement 2

Comparison of WT and P13L N-arm structure predictions.

Structures were predicted in ColabFold for a 12mer of the N-arm peptide N_10-20 of ancestral N (Wuhan-Hu-1) and N:P13L.

Figure 3 with 2 supplements

MD simulation of ancestral LRS and comparison with 214C and 215C mutants of Delta and Lambda variants.

Snapshots at equal time intervals (4 ns) taken from the 200-ns MD simulations of the ancestral peptide N_210-246 and single-point G→C mutants (all monomers were oriented by overlaying the helical region and displayed in orthographic view). The upper row shows a view from the N-terminus side (with the helix axes perpendicular to the plane of the figure), while the lower row presents a side view (with axes on the plane). For clarity, the disordered C-terminal segment (residues 236–246) has been removed. The glycine and cysteine residues at positions 214 and 215 (colored blue and red, respectively, in the ancestral peptide and rendered as ball-and-stick in the mutants) restructure the flexible backbone around the GG motif into a well-defined α-helix turn, directing the sulfhydryl group in specific orientations (illustrated schematically by the blue and red arches). These orientations can be quantified relative to the Leu-rich central region (indicated by arrows and rendered as gray van der Waals spheres), which forms the hydrophobic interfaces of the oligomers (Zhao et al., 2023). Under reducing conditions, this reorientation of the N-terminus relative to the helix can influence helix binding during the early stages of oligomerization or alter the conformation and physicochemical properties of the resulting oligomers, as illustrated in Figure 3—figure supplement 2.

Figure 3—figure supplement 1

Size distribution of G214C cysteine mutant LRS peptides.

Shown are peptides N_{LRS, 210-246}:G214C reduced (blue) vs unreduced (red). (A) Autocorrelation functions (circles) and best fits from the size-distribution analysis (solid lines). (B) Best-fit hydrodynamic radius distributions.

Figure 3—figure supplement 2

Structural and physicochemical effects of LRS mutants G214C and G215C under reducing conditions.

Snapshots taken at equal time intervals (4 ns) from the 200-ns MD simulations of the trimeric complex of the ancestral (WT) peptide N_210-246 and single-point G→C mutants (trimers oriented by overlaying the helical regions of the three monomers; the panels are shown in orthographic view for ease of comparison). The upper row presents a view from the N-terminus side (compare with Figure 3). The second row provides a side view, showing the tightly packed hydrophobic core that drives and stabilizes monomer self-association. The conformational changes and reorientation of the N-terminal segment observed in the mutant monomers (see Figure 3) influence the interactions between the helices and affect the complex structure and stabilization (see text). A consequence of this rearrangement is shown in the third row, where E216 (the oxygen atoms of the carboxyl group depicted as red van der Waals spheres), expected to be anionic under the experimental conditions (neutral pH; mimicked in the simulations; see Materials and methods), becomes more spatially concentrated at the helix end. The varying local electric fields (less negative in the WT, slightly more negative in the G214C mutant, and much more negative in the G215C mutant), illustrated on the surface electrostatic potential in the fourth row, may alter, among other features, the binding of cations in the solution, potentially influencing further aggregation of the peptide or full-length N-protein, phase separation, or particle assembly in vivo. Oligomerization under oxidative conditions (not investigated in this study) may differ significantly, as the hydrophobic moieties that drive wild-type helix dimerization could reorient outward in disulfide-bridged dimers, thereby affecting the oligomerization mechanism and the complex’s properties.

Figure 4 with 3 supplements

Impact of mutations in self-association interfaces on oligomeric state.

(A) Sedimentation coefficient distributions of cysteine mutants N:G215C (reduced, yellow) and N:G215C* (oxidized, brown), as well as N_λ (reduced, magenta) and N_λ* (oxidized, violet), with N_λ data offset by 2. For each sample, data were acquired at high (solid lines) and low (dashed lines) concentration. (B) Sedimentation coefficient distributions of 2 μM N:P13L,Δ31-33, ancestral N, and N:P13L,Δ31-33,L222P in the presence of 10 μM T₁₀ in low-salt buffer. The inset shows DLS autocorrelation data of the same samples (symbols) and single-species fits (lines).

Figure 4—figure supplement 1

Non-reducing SDS-PAGE of reduced and oxidized N:G215C* and N_λ*.

Lanes: (1) oxidized N_λ*; (2) reduced N_λ; (3) oxidized N:G215C*; (4) reduced N:G215C.

Figure 4—figure supplement 1—source data 1: Labeled non-reducing SDS-PAGE of reduced and oxidized N:G215C* and N_λ*. https://cdn.elifesciences.org/articles/108922/elife-108922-fig4-figsupp1-data1-v1.zip
Figure 4—figure supplement 1—source data 2: Unlabeled non-reducing SDS-PAGE of reduced and oxidized N:G215C* and N_λ*. https://cdn.elifesciences.org/articles/108922/elife-108922-fig4-figsupp1-data2-v1.zip

Figure 4—figure supplement 2

N-arm mutations rescue LLPS of LRS-helix deficient mutants.

Optical microscopy images of 10 μM protein with 5 μM T₄₀ in phosphate buffer with 10 mM NaCl, pH 7.4, acquired under conditions as previously reported (Nguyen et al., 2024). Images show (A) LLPS with N:L222P and (B) LLPS with N:P13L,Δ31-33,L222P mutants.

Figure 4—figure supplement 3

Measurement of the affinity of N:P13L,Δ31-33 for oligonucleotide T₁₀ by SV-AUC titration.

Sedimentation coefficient distributions are shown for 1.5 or 3.0 μM protein with T₁₀ in different molar ratios. Integration of the distributions yields the overall weight-average s-values, which form the isotherm shown in the inset (circles). The isotherm is globally fitted with a binding model (lines), resulting in a K_D of 0.88 (0.70–1.1) μM. Experiments are in 20 mM HEPES, 150 mM NaCl, pH 7.5. In comparison, for ancestral N, the best-fit K_D value and 95% confidence interval are 1.1 [0.8–1.6] μM (Nguyen et al., 2024).
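
As a reminder of what one point on such an isotherm represents, the signal-weighted average s-value of a distribution c(s) is the ratio of the integrals of s·c(s) and c(s) over the chosen range. The following minimal sketch uses trapezoidal integration on a hypothetical grid. The function name `weight_average_s` and the example numbers are illustrative assumptions, not the authors' analysis code or data.

```python
def weight_average_s(s_values, c_values, s_min=None, s_max=None):
    """Signal-weighted average s over [s_min, s_max] by trapezoidal integration.

    s_values: increasing grid of sedimentation coefficients (S)
    c_values: distribution amplitudes c(s) on the same grid
    """
    pts = [
        (s, c)
        for s, c in zip(s_values, c_values)
        if (s_min is None or s >= s_min) and (s_max is None or s <= s_max)
    ]
    if len(pts) < 2:
        raise ValueError("need at least two grid points in the integration range")
    numerator = 0.0    # integral of s * c(s) ds
    denominator = 0.0  # integral of c(s) ds
    for (s0, c0), (s1, c1) in zip(pts, pts[1:]):
        ds = s1 - s0
        numerator += 0.5 * (s0 * c0 + s1 * c1) * ds
        denominator += 0.5 * (c0 + c1) * ds
    if denominator == 0:
        raise ValueError("distribution integrates to zero in the selected range")
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical two-peak distribution: a small peak at 2 S and a larger one at 6 S.
    s_grid = [1, 2, 3, 4, 5, 6, 7]
    c_grid = [0, 1, 0, 0, 0, 3, 0]
    print(weight_average_s(s_grid, c_grid))
```

Repeating this integration for each loading concentration gives the data points to which a binding model is then fitted.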

Figure 5

Size distributions and stability of ancestral and mutant RNPs.

Shown are SV (A, B) and MP data (C–F) for mixtures of N-protein with stem-loop RNA SL7 in a molar ratio of 1(N):1.15(SL7) at high concentration of 3 μM (A, D) or low concentration of 0.3 μM (B, C, E, F) in reducing buffer conditions. All panels use the same color scheme for N-protein: ancestral (black), N:P13L,Δ31-33 (red), N:G215C (orange), N_ο (cyan), N:R203K/G204R (blue), N_λ (magenta). All panels are subdivided into two plots for clarity, with each showing the ancestral trace in black for comparison. (A) Sedimentation coefficient distributions of mixtures equilibrated at high concentration, with reaction boundary peaks magnified in the inset. For reference, the ancestral RNP s-value is drawn as a dotted vertical line. Absorbance data are recorded at 260 nm and are weighted by SL7 content of sedimenting species. Higher reaction boundary s-values signify greater affinity or lifetime of the mutant RNPs. (B) Sedimentation coefficient distributions of the same samples as in (A), tenfold diluted and equilibrated, highlight dissociation of most RNPs into a range of intermediate-size complexes. (C) MP experiments of equilibrated 0.3 μM mixtures. The measured number distributions are presented as cumulative distributions, which display higher percentages of large species as shifts to the right. Most samples are largely dissociated into dimers, with remaining peaks corresponding to populations of dimers to hexamers of N₂/SL7₂ subunits. As an example of the resolution of distinct species, the inset shows the differential distribution (histogram) for N:R203K/G204R (blue), ancestral N (black), and N:P13L,Δ31-33 (red), with the peak labels indicating the number of N-dimer/2SL7 subunits. (D) Mass distributions acquired in stopped-flow configuration applied to 3 μM mixtures. Larger (negative) contrasts correspond to higher molecular weights, with major peaks corresponding to species containing 1, 4, 5, and 6 N₂/SL7₂ subunits. (E) For kinetic experiments, mass distributions were acquired in different time intervals after tenfold dilution of 3 μM mixtures, here showing data collected from 3 s to 23 s. (F) Number-average molecular weights of assembled RNPs between 500 and 1500 kDa observed in consecutive 20 s data acquisition intervals after tenfold dilution of 3 μM mixtures (circles). The dashed horizontal lines are number-averages determined from the equilibrated 0.3 μM mixtures in (C). The solid lines are best-fit single exponentials constrained to decay to the measured equilibrium values, yielding RNP lifetimes listed in Table 1.
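
The constrained fit described for panel (F) corresponds to the model M(t) = M_eq + A·exp(−t/τ), with the plateau M_eq fixed to the separately measured equilibrium value. The legend does not state how the fit was carried out, so the sketch below substitutes a simple log-linear regression (a straight-line fit of ln(M − M_eq) against t), which weights noise differently from a nonlinear least-squares fit. The function name `fit_lifetime` and all numbers in the example are hypothetical.

```python
import math


def fit_lifetime(times, masses, m_eq):
    """Fit M(t) = m_eq + A*exp(-t/tau) with m_eq held fixed; return (tau, A).

    Uses linear least squares on ln(M - m_eq) versus t. Points at or below
    the plateau are skipped because their logarithm is undefined.
    """
    xs, ys = [], []
    for t, m in zip(times, masses):
        excess = m - m_eq
        if excess > 0:
            xs.append(t)
            ys.append(math.log(excess))
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two points above the equilibrium value")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct time points")
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("data do not decay toward m_eq")
    tau = -1.0 / slope
    amplitude = math.exp(y_mean - slope * x_mean)
    return tau, amplitude


if __name__ == "__main__":
    # Synthetic, noise-free example (not data from the article): midpoints of
    # consecutive 20 s intervals starting 3 s after dilution.
    t_mid = [13 + 20 * i for i in range(6)]
    m_obs = [600 + 80 * math.exp(-t / 50) for t in t_mid]
    tau, amplitude = fit_lifetime(t_mid, m_obs, m_eq=600.0)
    print(f"tau = {tau:.1f} s, amplitude = {amplitude:.1f} kDa")
```

Fixing the plateau leaves only the amplitude and the lifetime as free parameters, which is what makes a fit to a handful of 20 s intervals feasible.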

Figure 6

Impact of LRS disulfide bonds on the size and stability of RNPs.

N-protein with cysteine mutations in the LRS was oxidized to form disulfide-linked oligomers (as shown in Figure 4A) and mixed with stem-loop RNA SL7 in molar ratio of 1(N):1.15(SL7). Shown are data for oxidized N:G215C* (brown) and oxidized N_λ* (green), and for comparison, reduced N:G215C (yellow) and reduced N_λ (magenta). (A) Sedimentation coefficient distributions at 3 μM (upper panel) and 0.3 μM (lower panel) protein. (B) Molecular weight distributions in MP experiments of the same mixtures rapidly diluted to 0.3 μM protein, acquired from 3 to 23 s after dilution, with peak labels reflecting the multiples of N dimer/2SL7 subunits. (C) Time course of number-average RNP molecular weights between 500 and 1500 kDa (circles), determined from rapid dilution experiments in (B) for consecutive 20 s data acquisition intervals. The solid lines are best-fit single-exponential decays constrained to attain the separately measured equilibrium values at 0.3 μM protein (dashed lines), with lifetimes listed in Table 1.
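
The number-average molecular weight plotted in panel (C) is the plain mean over all counted particles whose mass falls in the 500–1500 kDa window, evaluated separately for each 20 s acquisition interval. The sketch below assumes, hypothetically, that the MP data are available as a list of (time in s, mass in kDa) single-particle events. The function names and example events are illustrative and not taken from the article.

```python
def number_average_mass(masses_kda, lower=500.0, upper=1500.0):
    """Mean mass of counted particles whose mass lies in [lower, upper] kDa."""
    selected = [m for m in masses_kda if lower <= m <= upper]
    if not selected:
        raise ValueError("no events in the selected mass window")
    return sum(selected) / len(selected)


def binned_number_averages(events, interval_s=20.0, t_start=3.0,
                           lower=500.0, upper=1500.0):
    """Group (time_s, mass_kda) events into consecutive intervals.

    Returns a list of (interval midpoint in s, number-average mass in kDa).
    Intervals without any event inside the mass window are skipped.
    """
    bins = {}
    for t, m in events:
        if t < t_start:
            continue  # before the first acquisition interval
        index = int((t - t_start) // interval_s)
        bins.setdefault(index, []).append(m)
    result = []
    for index in sorted(bins):
        try:
            average = number_average_mass(bins[index], lower, upper)
        except ValueError:
            continue
        midpoint = t_start + (index + 0.5) * interval_s
        result.append((midpoint, average))
    return result


if __name__ == "__main__":
    # Hypothetical events: (seconds after dilution, mass in kDa).
    demo = [(5, 600), (10, 800), (12, 100), (25, 700), (30, 2000)]
    print(binned_number_averages(demo))
```

Each particle counts equally regardless of its mass, which distinguishes this number-average from the signal-weighted averages used for the sedimentation data.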

Figure 7

Mutation effect on packaging and cell entry in a VLP assay.

Error bars are standard deviations from n=4. Stars indicate significance at the 95% confidence level in a two-sided Kolmogorov-Smirnov test comparing each mutant with the ancestral control measurements.
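
For samples as small as n=4, the two-sample Kolmogorov-Smirnov statistic and its exact two-sided p-value can be obtained by enumerating every way of splitting the pooled measurements into two groups. The sketch below illustrates this with made-up numbers. The legend does not say which software the authors used, and the function names and data here are hypothetical.

```python
from itertools import combinations


def ks_statistic(a, b):
    """Maximum absolute difference between the two empirical CDFs."""
    d = 0.0
    for x in sorted(set(a) | set(b)):
        fa = sum(v <= x for v in a) / len(a)
        fb = sum(v <= x for v in b) / len(b)
        d = max(d, abs(fa - fb))
    return d


def ks_exact_p(a, b):
    """Return (D, p): the KS statistic and its exact two-sided p-value.

    The p-value is the fraction of all relabelings of the pooled data whose
    statistic is at least as large as the observed one.
    """
    pooled = list(a) + list(b)
    observed = ks_statistic(a, b)
    hits = 0
    total = 0
    for idx in combinations(range(len(pooled)), len(a)):
        chosen = set(idx)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if ks_statistic(group_a, group_b) >= observed - 1e-12:
            hits += 1
    return observed, hits / total


if __name__ == "__main__":
    # Hypothetical signals, n=4 each (not data from the article).
    control = [1.0, 1.1, 0.9, 1.05]
    mutant = [2.0, 2.2, 1.9, 2.1]
    d, p = ks_exact_p(control, mutant)
    print(f"D = {d:.2f}, p = {p:.4f}")
```

With four measurements per group there are only 70 possible splits, so the smallest attainable two-sided p-value is 2/70 (about 0.029), reached when the two groups do not overlap at all.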

Figure 8

Replication kinetics of recombinant SARS-CoV-2 reporter viruses in cell lines.

(A) Representative images of Vero-TMPRSS2 and A549-ACE2 cells infected with SARS-CoV-2 P13L or WT at different time points post-infection. (B) Quantification of fluorescence intensity from P13L and WT virus infections shown in (A). (C) Viral titers in the supernatant from infected cells. Error bars are standard deviations (n=3), and stars indicate significant differences at the 95% confidence level.

Figure 9

Mutations of N:P13 across the phylogenetic tree of SARS-CoV-2.

Shown are all-time global sequence samples with clade labels and color-coded amino acid at position 13, with the ancestral P13 in green and P13L in yellow. The blue arrow points to the Lambda sequences. Additionally, a cluster of P13L mutations occurred in India in clade 19A. The phylogenetic tree was generated by Nextstrain (Hadfield et al., 2018).

Figure 10

Mutations of N:G215 across the phylogenetic tree of SARS-CoV-2.

Shown are all-time global sequence samples with clade labels and color-coded amino acid at position 215, with the ancestral G215 in green and G215C in yellow. The phylogenetic tree was generated by Nextstrain (Hadfield et al., 2018).

Figure 11

Mutations of N:G214 and N:G215 across the phylogenetic tree of SARS-CoV-2.

Shown are all-time sequence samples in South America with clade labels and color-coded amino acids at positions 214 and 215. The combination 214C/G215 of strain 21G (Lambda) is shown in blue, whereas the combination G214/215C of strain 21J (Delta) is shown in yellow. The phylogenetic tree was generated by Nextstrain (Hadfield et al., 2018).

Figure 12

Mutations of N:R203 and N:G204 across the phylogenetic tree of SARS-CoV-2.

Shown are global sequence samples mostly representing sequences of the recent 6 months, with clade labels and color-coded amino acids at positions 203 and 204. The ancestral combination R203/G204 is shown in green, the mutation 203M of the Delta VOC in blue, the combination 203K/204R common to Alpha and Omicron VOCs in yellow, and the combination 203K/204P defining the Omicron XEC variant in orange. The phylogenetic tree was generated by Nextstrain (Hadfield et al., 2018).

Tables

Table 1

Overview of N-protein mutant species studied.

Designation	N-protein mutations	In set of defining VOC mutations*	Predominant oligomeric state at low μM concentrations	Reaction boundary s-value in RNP assay (S)	MP final average Mw in 500–1500 kDa range (kDa)	Best-fit mass increase from 0.3 to 3 μM (kDa)	Best-fit effective RNP lifetime (s)
N (ancestral)	none	Wuhan-Hu-1	Dimer	19.7	614	63.8	66.3
N:P13L	P13L	λ, ο (all)
N:Δ31–33	Δ31–33	ο (all)
N:P13L/Δ31–33	P13L, Δ31–33	ο (all)	Dimer	20.2	625	118.2	43.6
N:R203K/G204R	R203K, G204R	α, γ, λ, ς, and ο (except XEC)	Dimer	19.5	596	70.3	54.6
N_ο	P13L, Δ31–33, R203K, G204R	ο (except XEC)	Dimer	20.5	617	77.2	58.5
N:G215C	G215C (reduced)	δ (all 21J)	Dimer/tetramer	20.4	619	47.8	231
N:G215C*	G215C (oxidized)	δ (all 21J)	Tetramer	20.8	649	56.1	41.8
N_λ	P13L, R203K, G204R, G214C (reduced)	λ	Dimer/tetramer	21.0	669	20.0	67.4
N_λ*	P13L, R203K, G204R, G214C (oxidized)	λ	Tetramer	21.5	660	51.6	98.2
N:R203M	R203M	δ, κ
N_210-419*	Δ1–209	ο (not XEC)

* Referring to the most common mutations in the variants of concern, excluding sporadic spontaneous reversions or other variations.
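
To compare the kinetic stability of the mutant RNPs at a glance, the effective lifetimes in the last column of Table 1 can be expressed relative to the ancestral value. The short sketch below does only this arithmetic on the tabulated numbers. The dictionary and function names are illustrative, and the ratios carry no uncertainty estimate because the table lists none.

```python
# Effective RNP lifetimes in seconds, copied from the last column of Table 1.
# Rows without a lifetime entry in the table are omitted.
LIFETIMES_S = {
    "N (ancestral)": 66.3,
    "N:P13L/Δ31–33": 43.6,
    "N:R203K/G204R": 54.6,
    "N_ο": 58.5,
    "N:G215C": 231.0,
    "N:G215C*": 41.8,
    "N_λ": 67.4,
    "N_λ*": 98.2,
}


def relative_lifetimes(lifetimes, reference="N (ancestral)"):
    """Return each lifetime divided by the lifetime of the reference species."""
    ref = lifetimes[reference]
    return {name: value / ref for name, value in lifetimes.items()}


if __name__ == "__main__":
    ratios = relative_lifetimes(LIFETIMES_S)
    for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name:16s} {ratio:5.2f} x ancestral")
```

On these numbers, reduced N:G215C stands out with a lifetime about 3.5 times the ancestral value, while the other listed species fall between roughly 0.6 and 1.5 times.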

Key resources table

Reagent type (species) or resource	Designation	Source or reference	Identifiers	Additional information
Cell line (Chlorocebus sabaeus)	Vero E6-TMPRSS2	JCRB cell bank	JCRB1819
Cell line (Homo sapiens)	A549-hACE2	BEI	NR-53522
Cell line (Mesocricetus auratus)	BHK21-ACE	Li et al., 2023a
Cell line (Homo sapiens)	293T	ATCC	RRID:CVCL_1926
Strain, strain background (Escherichia coli)	BL21(DE3)pLysS	Thermo Fisher	C606003
Recombinant DNA reagent	pLVX-EF1alpha-SARS-CoV-2-E-2xStrep-IRES-Puro	Addgene	RRID:Addgene_141385
Recombinant DNA reagent	pLVX-EF1alpha-SARS-CoV-2-M-2xStrep-IRES-Puro	Addgene	RRID:Addgene_141386
Recombinant DNA reagent	pLVX-EF1alpha-SARS-CoV-2-N-2xStrep-IRES-Puro	Addgene	RRID:Addgene_141391
Recombinant DNA reagent	pET29a(+) (plasmid)	GenScript
Recombinant DNA reagent	pIRES2-EGFP	NovoPro	V011106
Sequence-based reagent	N1-43 (N-arm) and N210-246	ABI Scientific		For sequences see Supplementary file 1
Sequence-based reagent	T10, SL7	Integrated DNA Technologies		For sequences see Supplementary file 1
Software, algorithm	SEDFIT	Biophys. J. (2000) 1606–1619	RRID:SCR_018365	Can be retrieved from https://doi.org/10.7910/DVN/4JPARC