Evolution of a fuzzy ribonucleoprotein complex in viral assembly
Figures
Basic organization of N-protein and RNPs.
N-protein (1-419) has two folded domains, NTD (45-180) and CTD (248-363), and three intrinsically disordered regions: the N-arm (1-44), the central linker (181-247), and the C-arm (364-419). (A) Displayed is an AlphaFold2 structure where the disordered N-arm, linker, and C-arm are artificially stretched for clarity. The residues are color-coded according to the number of different amino acids that have been observed at each position in the mutational landscape replacing the Wuhan-Hu-1 sequence. The bioinformatic analysis was carried out as described previously (Zhao et al., 2023), updated to August 26, 2024, using a threshold of >5 genomes for each mutation. (B) Schematic of protein-protein and protein/RNA interfaces in RNP assembly. The nucleic acid binding domain at the N-terminus (NTD) is indicated in blue, the LRS in yellow, and the dimerization domain (CTD) in green. Regions of self-association are indicated by shaded backgrounds. The linker is subdivided into a serine and arginine-rich region (180–205, SR) and an L-rich region (206–247, LRS). LRS can transiently fold into helices that create a hydrophobic patch for promiscuous self-association (indicated as yellow background). For clarity, the cartoon only shows three neighboring N-protein dimers, although higher-order oligomers assemble in RNPs. Nucleic acid binding sites (purple triangles) preferentially bind single-stranded RNA at the NTD (gray lines), and double-stranded RNA at the two sites per CTD dimer, with the ability to cross-link neighboring dimers potentially in various configurations. New inter-dimer interactions evolved in variants of concern are indicated by red connectors, including the promotion of beta-sheet oligomerization through the N:P13L mutation in the N-arm (as in Omicron and Lambda variants), and the introduction of cysteines at the base of the LRS helices in N:G214C (as in Lambda variants) and N:G215C (as in Delta variants). 
(C) Three-dimensional cartoon of the circular organization of N-protein domains in RNPs, with one dimer shaded slightly darker to highlight the dimeric building blocks. For clarity, subunit sizes are not drawn to scale. Alternate arrangements are depicted in Figure 1—figure supplement 1. (D) CD spectra of N-protein in the presence of SL7 under near physiological salt conditions leading to majority assembly of RNPs. Spectra are corrected for free SL7 contributions. Shown are spectra of ancestral N-protein alone (black), in the presence of SL7 forming RNPs (red), Nλ alone (cyan), and in the presence of SL7 forming RNPs (magenta). Spectra are truncated at <205 nm due to limited buffer transparency. For comparison, the dotted line shows a previously published CD spectrum of ancestral N-protein in low-salt buffer that permits measurement at shorter wavelengths (Nguyen et al., 2024). Triplicate scans yield average standard deviations of 0.13 (N), 0.17 (N+SL7), 0.16 (Nλ), and 0.21 (Nλ+SL7) × 10³ deg cm²/dmol, respectively, with non-overlapping confidence bands for the different species, for example, between 215 and 220 nm.
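The color coding in panel (A) rests on counting, for each residue position, how many different replacement amino acids have been observed above the genome threshold. The following is only a minimal sketch of that counting step, not the published pipeline of Zhao et al., 2023. The input format (a mapping from labels such as "P13L" to genome counts), the function name, and the example numbers are all assumptions made for illustration, and deletions are not handled.

```python
from collections import defaultdict

def distinct_substitutions(mutation_counts, min_genomes=5):
    """Count distinct replacement amino acids per position.

    mutation_counts maps a label such as "P13L" (ancestral residue,
    position, replacement) to the number of genomes carrying it.
    This input format is hypothetical. Only mutations observed in
    more than `min_genomes` genomes are counted.
    """
    replacements = defaultdict(set)
    for label, n_genomes in mutation_counts.items():
        if n_genomes <= min_genomes:
            continue  # at or below the threshold: ignore
        position = int(label[1:-1])      # digits between the two letters
        replacements[position].add(label[-1])
    return {pos: len(aas) for pos, aas in sorted(replacements.items())}

# Made-up counts, for illustration only
example = {"P13L": 120000, "P13S": 40, "P13T": 3, "G215C": 90000, "G214C": 5}
print(distinct_substitutions(example))  # {13: 2, 215: 1}
```

In this toy input, P13T and G214C fall at or below the threshold of 5 genomes and therefore do not contribute to the per-position count.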
Possible heterogeneity of RNPs from alternate configurations of N dimer subunits.
Only the N protein scaffold is shown for clarity. The nucleic acid binding domain at the N-terminus (NTD) is indicated in blue, the dimerization domain (CTD) in green, and the L-rich region (206–247, LRS) folded into helices is shown as yellow rods. The other disordered regions are shown as black lines. (A) Based on the cartoon of Figure 1C, an RNP is sketched where two of the six dimer subunits have flipped LRS orientations, resulting in the swapped position of NTD and CTD subunits. (B) Arrangement of five dimer subunits. (C) Arrangement of seven dimer subunits. (D) N210-419* constructs lacking the NTD are capable of forming RNPs of similar size, conceivably substituting NTD binding sites for nucleic acid by additional CTD subunits in upside-down orientation. (E) Mixed RNPs combining N210-419* and full-length N protein in different orientations.
Structural prediction of an RNP.
Top view (left) and side view (right) of an AlphaFold3 model of 5 N-protein dimers (ancestral Wuhan Hu-1) with 10 copies of SL7 RNA. One N-protein dimer is highlighted showing the two chains in blue and cyan. Three copies of SL7 adjacent to the highlighted dimer are shown in yellow, other copies are omitted for clarity. The top view shows the symmetrical arrangement of LRS helices forming a decameric core. The remainder of the linker, as well as the entire N-arm and C-arm, is disordered and not meaningfully predicted. Due to significant disordered regions, the predicted model is not unique and is limited by the fact that multiple copies of SL7 are used as RNA ligand mimicking the biophysical RNP model.
The P13L mutation creates self-association interfaces in the N-arm through stabilization of β-sheets.
(A) Electron micrograph of negatively stained Omicron N-arm No,1-43:P13L,Δ31-33 after equilibration at 10 μM in 20 mM HEPES, 150 mM NaCl, pH 7.50. The magnified regions are examples of twisted (*) and straight (**) fibrils. (B) CD spectra of N1-43:P13L (red), N1-43:Δ31-33 (blue), and Omicron No,1-43:P13L,Δ31-33 (magenta) at 0.4 mM (dashed) and 1.0 mM (solid), in comparison with ancestral N-arm (black). (C) Subset of ColabFold prediction of multimers of N10-20:P13L highlighting hydrogen bonds. The P13L residue is highlighted in red in the middle peptide.
Electron micrographs of N1-43:P13L and control.
(A) Electron micrograph of negatively stained N-arm peptide N1-43:P13L after equilibration at 10 μM in 20 mM HEPES, 150 mM NaCl, pH 7.50. (B) As a control, electron micrograph of negatively stained C-arm N364-419 under the same conditions.
Comparison of WT and P13L N-arm structure predictions.
Structures were predicted in ColabFold for a 12mer of the N-arm peptide N10-20 of ancestral N (Wuhan-Hu-1) and N:P13L.
MD simulation of ancestral LRS and comparison with 214C and 215C mutants of Delta and Lambda variants.
Snapshots at equal time intervals (4 ns) taken from the 200-ns MD simulations of the ancestral peptide N210-246 and single-point G→C mutants (all monomers were oriented by overlying the helical region and displayed in orthosteric view). The upper row shows a view from the N-terminus side (with the helix axes perpendicular to the plane of the figure), while the lower row presents a side view (with axes on the plane). For clarity, the disordered C-terminal segment (residues 236–246) has been removed. The glycine and cysteine residues at positions 214 and 215 (colored blue and red, respectively, in the ancestral peptide and rendered as ball-and-stick in the mutants) restructure the flexible backbone around the GG motif into a well-defined α-helix turn, directing the sulfhydryl group in specific orientations (illustrated schematically by the blue and red arches). These orientations can be quantified relative to the Leu-rich central region (indicated by arrows and rendered as gray van der Waals spheres), which forms the hydrophobic interfaces of the oligomers (Zhao et al., 2023). Under reducing conditions, this reorientation of the N-terminus relative to the helix can influence helix binding during the early stages of oligomerization or alter the conformation and physicochemical properties of the resulting oligomers, as illustrated in Figure 3—figure supplement 2.
Size distribution of G214C cysteine mutant LRS peptides.
Shown are peptides NLRS, 210-246:G214C reduced (blue) vs unreduced (red). (A) Autocorrelation functions (circles) and best-fit size-distribution fits (solid lines). (B) Best-fit hydrodynamic radius distributions.
Structural and physicochemical effects of LRS mutants G214C and G215C under reducing conditions.
Snapshots taken at equal time intervals (4 ns) from the 200-ns MD simulations of the trimeric complex of the ancestral (WT) peptide N210-246 and single-point G→C mutants (trimers oriented by overlaying the helical regions of the three monomers; the panels are shown in orthosteric view for ease of comparison). The upper row presents a view from the N-terminus side (compare with Figure 3). The second row provides a side view, showing the tightly packed hydrophobic core that drives and stabilizes monomer self-association. The conformational changes and reorientation of the N-terminal segment observed in the mutant monomers (see Figure 3) influence the interactions between the helices and affect the complex structure and stabilization (see text). A consequence of this rearrangement is shown in the third row, where E216 (the oxygen atoms of the carboxyl group depicted as red van der Waals spheres), expected to be anionic under the experimental conditions (neutral pH; mimicked in the simulations; see Materials and methods), becomes more spatially concentrated at the helix end. The varying local electric fields (less negative in the WT, slightly more negative in the G214C mutant, and much more negative in the G215C), illustrated on the surface electrostatic potential in the fourth row, may alter, among other features, the binding of cations in the solution, potentially influencing further aggregation of the peptide or full-length N-protein, phase separation, or particle assembly in vivo. Oligomerization under oxidative conditions (not investigated in this study) may differ significantly, as the hydrophobic moieties that drive wild-type helix dimerization could reorient outward in disulfide-bridged dimers, thereby affecting the oligomerization mechanism and the complex’s properties.
Impact of mutations in self-association interfaces on oligomeric state.
(A) Sedimentation coefficient distributions of cysteine mutants N:G215C (reduced, in yellow) and N:G215C* (oxidized, in brown), as well as Nλ (reduced, in magenta) and Nλ* (oxidized, in violet), with Nλ data offset by 2. For each sample, data were acquired at high (solid lines) and low (dashed lines) concentration. (B) Sedimentation coefficient distributions of 2 μM N:P13L,Δ31-33, ancestral N, and N:P13L,Δ31-33,L222P in the presence of 10 μM T10 in low-salt buffer. The inset shows DLS autocorrelation data of the same samples (symbols) and single-species fits (lines).
Non-reducing SDS-PAGE of reduced and oxidized N:G215C* and Nλ*.
Lanes: (1) oxidized Nλ*; (2) reduced Nλ; (3) oxidized N:G215C*; (4) reduced N:G215C.
Figure 4—figure supplement 1—source data 1
Labeled non-reducing SDS-PAGE of reduced and oxidized N:G215C* and Nλ*.
- https://cdn.elifesciences.org/articles/108922/elife-108922-fig4-figsupp1-data1-v1.zip
Figure 4—figure supplement 1—source data 2
Unlabeled non-reducing SDS-PAGE of reduced and oxidized N:G215C* and Nλ*.
- https://cdn.elifesciences.org/articles/108922/elife-108922-fig4-figsupp1-data2-v1.zip
N-arm mutations rescue LLPS of LRS-helix deficient mutants.
Optical microscopy images of 10 μM protein with 5 μM T40 in phosphate buffer with 10 mM NaCl, pH 7.4, acquired under conditions as previously reported (Nguyen et al., 2024). Images show (A) LLPS with N:L222P and (B) LLPS with N:P13L,Δ31-33,L222P mutants.
Measurement of the affinity of N:P13L,Δ31-33 for oligonucleotide T10 by SV-AUC titration.
Sedimentation coefficient distributions are shown for 1.5 or 3.0 μM protein with T10 in different molar ratios. Integration of the overall weight-average s-value leads to the isotherm shown in the inset (circles), which is globally fitted with a binding model (lines), resulting in a KD of 0.88 (0.70–1.1) μM. Experiments were carried out in 20 mM HEPES, 150 mM NaCl, pH 7.5. In comparison, for ancestral N, the best-fit KD and 95% confidence interval are 1.1 (0.8–1.6) μM (Nguyen et al., 2024).
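To give a feel for what a KD of about 0.9 μM means at the protein concentrations used here, the sketch below evaluates a simple 1:1 mass-action model. This is an assumption made for illustration: the legend does not specify the binding model used in the global fit, which may involve multiple sites per N-protein dimer. The function names are hypothetical, and all concentrations are in μM.

```python
import math

def complex_concentration(p_total, l_total, kd):
    """Concentration of a 1:1 complex PL from mass action.

    Solves (P_tot - PL) * (L_tot - PL) = KD * PL for PL, taking the
    physically meaningful (smaller) root. All inputs share one unit.
    """
    b = p_total + l_total + kd
    return (b - math.sqrt(b * b - 4.0 * p_total * l_total)) / 2.0

def fraction_protein_bound(p_total, l_total, kd):
    """Fraction of protein sites occupied by ligand."""
    return complex_concentration(p_total, l_total, kd) / p_total

# Illustrative titration: 1.5 uM protein, KD = 0.88 uM (value from the legend)
kd_um = 0.88
for molar_ratio in (0.5, 1.0, 2.0, 5.0):
    bound = fraction_protein_bound(1.5, 1.5 * molar_ratio, kd_um)
    print(f"T10:N = {molar_ratio:>3}: fraction bound = {bound:.3f}")
```

In an SV-AUC isotherm, the weight-average s-value would additionally depend on the s-values and signal contributions of the free and bound species, which this sketch leaves out.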
Size distributions and stability of ancestral and mutant RNPs.
Shown are SV (A, B) and MP data (C–F) for mixtures of N-protein with stem-loop RNA SL7 at a molar ratio of 1(N):1.15(SL7) at high concentration of 3 μM (A, D) or low concentration of 0.3 μM (B, C, E, F) in reducing buffer conditions. All panels use the same color scheme for N-protein: ancestral (black), N:P13L,Δ31-33 (red), N:G215C (orange), Nο (cyan), N:R203K/G204R (blue), Nλ (magenta). All panels are subdivided into two plots for clarity, with each showing the ancestral trace in black for comparison. (A) Sedimentation coefficient distributions of mixtures equilibrated at high concentration, with reaction boundary peaks magnified in the inset. For reference, the ancestral RNP s-value is drawn as a dotted vertical line. Absorbance data are recorded at 260 nm and are weighted by SL7 content of sedimenting species. Higher reaction boundary s-values signify greater affinity or lifetime of the mutant RNPs. (B) Sedimentation coefficient distributions of the same samples as in (A), tenfold diluted and equilibrated, highlight dissociation of most RNPs into a range of intermediate size complexes. (C) MP experiments of equilibrated 0.3 μM mixtures. The measured number distributions are presented as cumulative distributions, which display higher percentages of large species as shifts to the right. Most samples are largely dissociated into dimers, with remaining peaks corresponding to populations of dimers to hexamers of N₂/SL7₂ subunits (as highlighted in the differential distributions in the inset). As an example for the resolution of distinct species, the inset shows the differential distribution (histogram) for N:R203K/G204R (blue), ancestral N (black), and N:P13L,Δ31-33 (red), with the peak labels indicating the number of N-dimer/2SL7 subunits. (D) Mass distributions acquired in stopped-flow configuration applied to 3 μM mixtures. Larger (negative) contrasts correspond to higher molecular weights, with major peaks corresponding to species containing 1, 4, 5, and 6 N₂/SL7₂ subunits. (E) For kinetic experiments, mass distributions were acquired in different time intervals after tenfold dilution of 3 μM mixtures, here showing data collected from 3 s to 23 s. (F) Number-average molecular weights of assembled RNPs between 500 and 1500 kDa observed in consecutive 20 s data acquisition intervals after tenfold dilution of 3 μM mixtures (circles). The dashed horizontal lines are number-averages determined from the equilibrated 0.3 μM mixtures in (E). The solid lines are best-fit single exponentials constrained to decay to the measured equilibrium values, yielding RNP lifetimes listed in Table 1.
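The analysis in panel (F) combines two steps: a number-average molecular weight taken over the 500–1500 kDa window, and a single-exponential decay whose endpoint is fixed at the separately measured equilibrium value. The sketch below illustrates both under stated assumptions. It is not the authors' fitting code, and it swaps in a log-linear least-squares fit as a simple stand-in for whatever fitting procedure was actually used. The function names and the synthetic data are hypothetical.

```python
import math

def number_average_mw(masses_kda, lo=500.0, hi=1500.0):
    """Number-average mass of the events that fall inside [lo, hi] kDa."""
    window = [m for m in masses_kda if lo <= m <= hi]
    if not window:
        raise ValueError("no events in the selected mass window")
    return sum(window) / len(window)

def fit_lifetime(times_s, mw_kda, mw_eq_kda):
    """Fit M(t) = M_eq + A * exp(-t / tau) with M_eq held fixed.

    Log-linear least squares on ln(M - M_eq), so every M(t) must
    exceed M_eq. Returns (tau in seconds, amplitude A in kDa).
    """
    logs = [math.log(m - mw_eq_kda) for m in mw_kda]
    n = len(times_s)
    t_mean = sum(times_s) / n
    y_mean = sum(logs) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times_s, logs))
    sxx = sum((t - t_mean) ** 2 for t in times_s)
    slope = sxy / sxx                      # equals -1 / tau
    amplitude = math.exp(y_mean - slope * t_mean)
    return -1.0 / slope, amplitude

# Synthetic, noise-free trace (illustrative numbers only):
# midpoints of consecutive 20 s acquisition windows starting at 3 s
times = [13.0, 33.0, 53.0, 73.0, 93.0]
trace = [600.0 + 50.0 * math.exp(-t / 60.0) for t in times]
tau, amp = fit_lifetime(times, trace, mw_eq_kda=600.0)
print(f"lifetime = {tau:.1f} s, amplitude = {amp:.1f} kDa")
```

With noisy data, points close to the equilibrium value dominate the error of a log-linear fit, so a weighted or nonlinear least-squares fit would be the more robust choice.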
Impact of LRS disulfide bonds on the size and stability of RNPs.
N-protein with cysteine mutations in the LRS was oxidized to form disulfide-linked oligomers (as shown in Figure 4A) and mixed with stem-loop RNA SL7 in molar ratio of 1(N):1.15(SL7). Shown are data for oxidized N:G215C* (brown) and oxidized Nλ* (green), and for comparison, reduced N:G215C (yellow) and reduced Nλ (magenta). (A) Sedimentation coefficient distributions at 3 μM (upper panel) and 0.3 μM (lower panel) protein. (B) Molecular weight distributions in MP experiments of the same mixtures rapidly diluted to 0.3 μM protein, acquired from 3 to 23 s after dilution, with peak labels reflecting the multiples of N dimer/2SL7 subunits. (C) Time course of number-average RNP molecular weights between 500 and 1500 kDa (circles), determined from rapid dilution experiments in (B) for consecutive 20 s data acquisition intervals. The solid lines are best-fit single-exponential decays constrained to attain the separately measured equilibrium values at 0.3 μM protein (dashed lines), with lifetimes listed in Table 1.
Mutation effect on packaging and cell entry in a VLP assay.
Error bars are standard deviations from n=4. Stars indicate significance (p>0.95) of a two-sided Kolmogorov-Smirnov test comparing the control ancestral measurements with mutants.
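The two-sample Kolmogorov-Smirnov test compares the empirical cumulative distributions of two samples and uses their maximum vertical distance as the test statistic. The sketch below computes only that statistic, in plain Python; the function name and the measurement values are made up for illustration and are not data from this study. In practice, a statistics library routine such as `scipy.stats.ks_2samp` would also provide the p-value.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: max distance between empirical CDFs."""
    def ecdf(sample, x):
        # Fraction of observations less than or equal to x
        return sum(v <= x for v in sample) / len(sample)

    # The maximum distance is always attained at an observed value
    points = sorted(set(sample_a) | set(sample_b))
    return max(abs(ecdf(sample_a, x) - ecdf(sample_b, x)) for x in points)

# Hypothetical normalized readouts, n=4 each (not measured values)
control = [1.00, 1.10, 0.90, 1.05]
mutant = [2.00, 2.20, 1.90, 2.10]
print(ks_statistic(control, mutant))  # 1.0: the samples do not overlap
```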
Replication kinetics of recombinant SARS-CoV-2 reporter viruses in cell lines.
(A) Representative images of Vero-TMPRSS2 and A549-ACE2 cells infected with SARS-CoV-2 P13L or WT at different time points post-infection. (B) Quantification of fluorescence intensity from P13L and WT virus infections shown in (A). (C) Viral titers in the supernatant from infected cells. Error bars are standard deviations (n=3), and stars indicate significant differences on a p=0.95 confidence level.
Mutations of N:P13 across the phylogenetic tree of SARS-CoV-2.
Shown are all-time global sequence samples with clade labels and color-coded amino acid at position 13, with the ancestral P13 in green and P13L in yellow. The blue arrow points to the Lambda sequences. Additionally, a cluster of P13L mutations occurred in India in clade 19A. The phylogenetic tree was generated by Nextstrain (Hadfield et al., 2018).
Mutations of N:G215 across the phylogenetic tree of SARS-CoV-2.
Shown are all-time global sequence samples with clade labels and color-coded amino acid at position 215, with the ancestral G215 in green and G215C in yellow. The phylogenetic tree was generated by Nextstrain (Hadfield et al., 2018).
Mutations of N:G214 and N:G215 across the phylogenetic tree of SARS-CoV-2.
Shown are all-time sequence samples in South America with clade labels and color-coded amino acids at positions 214 and 215. The combination 214C/G215 of strain 21G (Lambda) is shown in blue, whereas the combination G214/215C of strain 21J (Delta) is shown in yellow. The phylogenetic tree was generated by Nextstrain (Hadfield et al., 2018).
Mutations of N:R203 and N:G204 across the phylogenetic tree of SARS-CoV-2.
Shown are global sequence samples mostly representing sequences of the recent 6 months, with clade labels and color-coded amino acids at positions 203 and 204. The ancestral combination of R203/G204 is shown in green, the mutation 203M of the Delta VOC in blue, the combination 203K/204R common to Alpha and Omicron VOCs in yellow, and the combination 203K/204P defining the Omicron XEC variant in orange. The phylogenetic tree was generated by Nextstrain (Hadfield et al., 2018).
Tables
Overview of N-protein mutant species studied.
| Designation | N-protein mutations | In set of defining VOC mutations* | Predominant oligomeric state at low μM concentrations | Reaction boundary s-value in RNP assay (S) | MP final average Mw in 500–1500 kDa range (kDa) | Best-fit mass increase from 0.3 to 3 μM (kDa) | Best-fit effective RNP lifetime (s) |
|---|---|---|---|---|---|---|---|
| N (ancestral) | none | Wuhan-Hu-1 | Dimer | 19.7 | 614 | 63.8 | 66.3 |
| N:P13L | P13L | λ, ο (all) | | | | | |
| N:Δ31–33 | Δ31–33 | ο (all) | | | | | |
| N:P13L/Δ31–33 | P13L, Δ31–33 | ο (all) | Dimer | 20.2 | 625 | 118.2 | 43.6 |
| N:R203K/G204R | R203K, G204R | α, γ, λ, ς, and ο (except XEC) | Dimer | 19.5 | 596 | 70.3 | 54.6 |
| Nο | P13L, Δ31–33, R203K, G204R | ο (except XEC) | Dimer | 20.5 | 617 | 77.2 | 58.5 |
| N:G215C | G215C (reduced) | δ (all 21J) | Dimer/tetramer | 20.4 | 619 | 47.8 | 231 |
| N:G215C* | G215C (oxidized) | δ (all 21J) | Tetramer | 20.8 | 649 | 56.1 | 41.8 |
| Nλ | P13L, R203K, G204R, G214C (reduced) | λ | Dimer/tetramer | 21.0 | 669 | 20.0 | 67.4 |
| Nλ* | P13L, R203K, G204R, G214C (oxidized) | λ | Tetramer | 21.5 | 660 | 51.6 | 98.2 |
| N:R203M | R203M | δ, κ | | | | | |
| N210-419* | Δ1–209 | ο (not XEC) | | | | | |
* Referring to the most common mutations in the variants of concern, excluding sporadic spontaneous reversions or other variations.
| Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information |
|---|---|---|---|---|
| Cell line (Chlorocebus sabaeus) | Vero E6-TMPRSS2 | JCRB cell bank | JCRB1819 | |
| Cell line (Homo sapiens) | A549-hACE2 | BEI | NR-53522 | |
| Cell line (Mesocricetus auratus) | BHK21-ACE | Li et al., 2023a | | |
| Cell line (Homo sapiens) | 293T | ATCC | RRID:CVCL_1926 | |
| Strain, strain background (Escherichia coli) | BL21(DE3)pLysS | Thermo Fisher | C606003 | |
| Recombinant DNA reagent | pLVX-EF1alpha-SARS-CoV-2-E-2xStrep-IRES-Puro | Addgene | RRID:Addgene_141385 | |
| Recombinant DNA reagent | pLVX-EF1alpha-SARS-CoV-2-M-2xStrep-IRES-Puro | Addgene | RRID:Addgene_141386 | |
| Recombinant DNA reagent | pLVX-EF1alpha-SARS-CoV-2-N-2xStrep-IRES-Puro | Addgene | RRID:Addgene_141391 | |
| Recombinant DNA reagent | pET29a(+) (plasmid) | GenScript | | |
| Recombinant DNA reagent | pIRES2-EGFP | NovoPro | V011106 | |
| Sequence-based reagent | N1-43 (N-arm) and N210-246 | ABI Scientific | For sequences see Supplementary file 1 | |
| Sequence-based reagent | T10, SL7 | Integrated DNA Technologies | For sequences see Supplementary file 1 | |
| Software, algorithm | SEDFIT | Biophys. J. (2000) 1606–1619 | RRID:SCR_018365 | Can be retrieved from https://doi.org/10.7910/DVN/4JPARC |
Additional files
MDAR checklist
- https://cdn.elifesciences.org/articles/108922/elife-108922-mdarchecklist1-v1.pdf
Supplementary file 1
Peptide and oligonucleotide sequences used in the present study.
- https://cdn.elifesciences.org/articles/108922/elife-108922-supp1-v1.docx