Revealing druggable cryptic pockets in the Nsp1 of SARS-CoV-2 and other β-coronaviruses by simulations and crystallography

  1. Alberto Borsatto
  2. Obaeda Akkad
  3. Ioannis Galdadas
  4. Shumeng Ma
  5. Shymaa Damfo
  6. Shozeb Haider
  7. Frank Kozielski
  8. Carolina Estarellas (corresponding author)
  9. Francesco Luigi Gervasio (corresponding author)
  1. School of Pharmaceutical Sciences, University of Geneva, Switzerland
  2. ISPSO, University of Geneva, Switzerland
  3. School of Pharmacy, University College London, United Kingdom
  4. UCL Centre for Advanced Research Computing, University College London, United Kingdom
  5. Department of Nutrition, Food Science and Gastronomy, Faculty of Pharmacy and Food Sciences, and Institute of Theoretical and Computational Chemistry, University of Barcelona, Spain
  6. Chemistry Department, University College London, United Kingdom
  7. Institute of Structural and Molecular Biology, University College London, United Kingdom
10 figures, 3 tables and 1 additional file

Figures

Figure 1
Structure of the full-length SARS-CoV-2 Nsp1.

Cartoon representation of the full-length non-structural protein 1 (Nsp1) structure from the AlphaFold (Jumper et al., 2021; Varadi et al., 2022) model, showing the N-terminus (in yellow, aa Met1-Asn9), the Nsp1N core (in gray, aa Glu10-Asn126), and the C-terminus (in blue, aa Gly127-Gly180).

Figure 2 with 1 supplement
Cavities identified by the ProteinPlus server on the Nsp1N crystal structure (PDB entry 7K7P): the concave pocket-like structure between the β-barrel and the α-helix, and the groove-like topology.
Figure 2—figure supplement 1
Cavities identified by the Fpocket software on the Nsp1N crystal structure (PDB entry 7K7P) (Le Guilloux et al., 2009).
Figure 3 with 1 supplement
Pockets revealed from unbiased simulations.

(A) Cavities identified on Nsp1N along the 1 μs unbiased molecular dynamics (MD) simulation, namely pocket 1 (purple) and pocket 2 (green), with the main residues displayed in sticks. (B) The volume distribution of each pocket along the unbiased MD simulations.

Figure 3—figure supplement 1
Time average of the root-mean-square deviation (RMSD) for Nsp1N.

The black shaded area delimits the RMSD values within two standard deviations from the time average.
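
The shaded band described above can be sketched as follows. This is a minimal illustration, assuming the band is the time average plus or minus two standard deviations of the RMSD series; the function name `rmsd_band`, the use of the population standard deviation, and the sample values are assumptions and are not taken from the article.

```python
from statistics import fmean, pstdev


def rmsd_band(rmsd_values, n_sd=2.0):
    """Return (mean, lower, upper) for a series of RMSD values.

    lower and upper delimit the band within n_sd standard deviations
    of the time average (n_sd=2 matches the caption's description).
    """
    values = [float(v) for v in rmsd_values]
    if not values:
        raise ValueError("rmsd_values must not be empty")
    mean = fmean(values)
    sd = pstdev(values)  # population SD over all frames (an assumption)
    return mean, mean - n_sd * sd, mean + n_sd * sd


# Hypothetical RMSD samples in nm, for illustration only
mean, lower, upper = rmsd_band([0.10, 0.12, 0.14, 0.12, 0.12])
```

Frames whose RMSD falls outside `lower` and `upper` would lie outside the shaded area in such a plot.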

Figure 4 with 2 supplements
Pockets revealed from SWISH simulations.

(A) Pockets sampled during the 500 ns of replica-exchange SWISH (sampling water interfaces through scaled Hamiltonians) simulations. Volume distributions of the cryptic binding sites pocket 3 (B) and pocket 4 (C) along the six replicas of the SWISH simulations.

Figure 4—figure supplement 1
Time averages of the root-mean-square deviation (RMSD) for Nsp1N across the six SWISH (sampling water interfaces through scaled Hamiltonians) replicas.

The blue shaded area delimits the RMSD values within two standard deviations from the time average.

Figure 4—figure supplement 2
Violin plots of the pocket volumes.

Volume distributions of the cryptic binding sites pocket 1 (A) and pocket 2 (B) along the six replicas of the SWISH (sampling water interfaces through scaled Hamiltonians) simulation of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) Nsp1N.

Figure 5
Distribution of the binding hotspots on the Nsp1N surface around (A) pocket 1 and (B) pockets 2, 3, and 4.

Multiple consensus clusters are shown in sticks. Each consensus cluster is represented in different colors.

Figure 5—source data 1

Consensus cluster information obtained from the FTMap program for the 11 selected Nsp1N structures.

https://cdn.elifesciences.org/articles/81167/elife-81167-fig5-data1-v1.docx
Figure 6 with 2 supplements
Characterization of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) non-structural protein 1 (Nsp1)-2E10 complex.

(A) Binding pose of the fragment hit obtained by crystal soaking and structure determination methods. The fragment is located in pocket 1. (B) Chemical structure and name of the fragment hit. (C) Magnification of the Nsp1N-2E10 binding pocket showing the interactions the fragment establishes with residues of Nsp1. Hydrophobic interactions are shown by red half-moons and the hydrogen bond interaction is displayed with a dotted green line.

Figure 6—source data 1

Crystallographic data and model refinement statistics for the SARS-CoV-2 Nsp1N-2E10 complex.

https://cdn.elifesciences.org/articles/81167/elife-81167-fig6-data1-v1.docx
Figure 6—figure supplement 1
Pan-Dataset Density Analysis (PanDDA) event map and standard single dataset map focused on the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) Nsp1N binding site of fragment hit 2E10.

PanDDA event map (A) without ligand and (B) in the presence of 2E10. Standard single dataset map (C) without and (D) with ligand 2E10. (E) Refined standard single dataset map with ligand 2E10. The sigma level for both the standard single dataset maps and the PanDDA maps is 1.

Figure 6—figure supplement 2
Microscale thermophoresis (MST) measurements of the binding affinity between fragment hit 2E10 and severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) non-structural protein 1 (Nsp1).

(A) Concentration-response plot of fragment hit 2E10 and the SARS-CoV-2 Nsp1N domain. (B) Concentration-response curve of 2E10 and SARS-CoV-2 Nsp1 FL. All measurements were conducted at least in triplicate.

Figure 7
Crystal packing of the Nsp1N-fragment complex (PDB entry 8A4Y) obtained from X-ray soaking experiments.

The direct crystal contacts around (A) pocket 1 and (B) pockets 2, 3, and 4 are highlighted with squares.

Figure 8 with 3 supplements
Nsp1FL-RNA complex obtained from the HADDOCK program.

(A) Residues in the proximity of pocket 1 involved in crucial Nsp1FL-SL1 contacts are displayed in sticks. (B) Crystal structure from soaking experiments (orange transparent cartoon) is superimposed over the Nsp1FL of the Nsp1FL-RNA model. The most important residues for the Nsp1FL-RNA interaction are highlighted in white (Nsp1FL model) and orange (PDB entry 8A4Y from our soaking experiments) sticks. Figures obtained from model A.

Figure 8—figure supplement 1
Different Nsp1FL-RNA models proposed in this work.

The Nsp1FL structures from models A (blue) and B (red) are shown.

Figure 8—figure supplement 2
RMSD for models A and B of the Nsp1-RNA complex.

Root-mean-square deviation (RMSD, nm) along the 500 ns of unbiased molecular dynamics (MD) simulations considering (A) the backbone atoms of the Nsp1FL protein from the Nsp1FL-RNA complex, (B) only the RNA atoms of the stem loop 1 (SL1) from the complex and (C) the backbone atoms of Nsp1FL and all the RNA atoms of the full Nsp1FL-RNA complex.

Figure 8—figure supplement 3
Ligand accessibility to pocket 1 in Nsp1FL.

(A) Volume distribution of pocket 1 along the three unbiased molecular dynamics (MD) simulations of the Nsp1FL. (B) Docked poses of the fragment to the representative structure from the most populated cluster in replica 3 (left) and to the second most populated cluster in replica 2 (right). The original crystallographic pose is depicted in orange, whereas the docked poses are in pink and yellow. The flexible loop comprising aa Arg124-Gly137 is displayed in cyan. The N- and C-termini are depicted in purple and blue, respectively.

Figure 9 with 2 supplements
The conservation of Nsp1 sequence.

(A) Phylogenetic tree based on 283 sequences from distinct α- and β-coronaviruses (CoVs) of different subgenera. The sequences were obtained from the Conserved Domain Database (CDD) with accession number cl41742. The scale bar indicates the number of substitutions per site in the amino acid sequence. Six different Nsp1 domain models can be identified: two for the α-CoV genus (transmissible gastroenteritis virus [TGEV]-like and PDEV-like) and four for the β-CoV genus (MERS-like, HKU9-like, SARS-like, and murine hepatitis virus [MHV]-like). (B) Multiple sequence alignment of the four selected homologues. The alignment was performed with the MUSCLE algorithm (Edgar, 2004). The residues of the different pockets are highlighted in the corresponding colors: pocket 1 in purple, pocket 2 in green, pocket 3 in orange, and pocket 4 in cyan. (C) Representation of the pockets found in the severe acute respiratory syndrome coronavirus 1 (SARS-CoV-1) Nsp1N variant.

Figure 9—figure supplement 1
Time averages of the backbone root-mean-square deviation (RMSD) along the 1 μs unbiased molecular dynamics (MD) simulation for Nsp1N of (A) severe acute respiratory syndrome coronavirus 1 (SARS-CoV-1), (B) SARS-HKU3, and (C) Bat-CoV-RaTG13.

The shaded area delimits the RMSD values within two standard deviations from the time average.

Figure 9—figure supplement 2
Representation of the pockets found in the Bat-CoV-HKU3 (A) and Bat-CoV-RaTG13 (B) Nsp1N variants.
Author response image 1
Model of the full-length Nsp1 based on the cryo-EM structure of the C-terminal bound to the ribosome (PDB entry 7JQB).

Tables

Table 1
SMILES and identification number of the 59 tested fragments and corresponding predicted binding.

The predicted pose for each fragment can be found at https://github.com/Gervasiolab/Gervasio-Protein-Dynamics/tree/master/nsp1/virtual_screening (Gervasiolab, 2022; copy archived at swh:1:rev:936e929db11aff39faed53e5fbe6f902f1456a6d). The crystal hit reported above (lig_1427) is highlighted in bold.

Identification Number | SMILES | Predicted pocket(s)
lig_30 | C9H7NO2 | 3, 4
lig_83 | C12H13NO2 | 1, 3, 4
lig_113 | C12H13ClF3NO | 1, 3, 4
lig_168 | C13H10O3 | 1, 3, 4
lig_171 | C11H10O3 | 1, 4
lig_194 | C8H6F2O3 | 1, 3, 4
lig_212 | C11H12O2 | 1, 3, 4
lig_243 | C14H17NO2S | 1
lig_262 | C15H16N2O | 1, 3, 4
lig_286 | C13H13NO2S | 1, 3, 4
lig_329 | C9H8F3NOS | 3, 4
lig_335 | C13H11NO2S | 1, 3
lig_349 | C11H9NO2 | 1, 4
lig_355 | C13H13NO2S | 4
lig_369 | C11H12O2 | 1, 3, 4
lig_377 | C12H10O2S | 1, 4
lig_394 | C16H13NO2 | 1, 3, 4
lig_400 | C14H12O2 | 1, 3, 4
lig_422 | C14H9NO2 | 1, 3, 4
lig_490 | C8H5NO2S2 | 1, 4
lig_502 | C12H10O3 | 1, 3, 4
lig_507 | C12H12N2O | 1, 4
lig_552 | C12H15NO2 | 1, 3, 4
lig_570 | C12H9NO2 | 1, 4
lig_575 | C10H8O3 | 1, 4
lig_579 | C12H9NO2 | 1, 3, 4
lig_685 | C14H12O3 | 1, 3, 4
lig_706 | C12H14N2O | 1, 4
lig_752 | C9H7ClFN3S2 | 1, 3, 4
lig_783 | C12H11FO2 | 1, 3, 4
lig_806 | C14H13NO2 | 1, 3
lig_812 | C13H11ClN2O | 1
lig_864 | C15H12O3S | 1, 3, 4
lig_892 | C7H6N2OS | 1, 3
lig_897 | C8H6FNO2S | 1, 3, 4
lig_907 | C10H11NOS2 | 1, 4
lig_910 | C10H6F3NOS | 1, 3, 4
lig_924 | C14H12O3S2 | 1
lig_1009 | C11H10ClNO2 | 1, 4
lig_1037 | C10H7ClO2S | 1, 4
lig_1054 | C12H13NO2 | 1, 3, 4
lig_1057 | C12H16N2O | 1, 3, 4
lig_1064 | C13H12O2 | 1, 3, 4
lig_1149 | C11H10F3NO | 1, 3, 4
lig_1157 | C13H13NO2 | 1, 3, 4
lig_1195 | C11H10N2O | 1, 3, 4
lig_1209 | C9H6F3N3S2 | 1, 3, 4
lig_1216 | C12H11F2NO2 | 1, 3, 4
lig_1220 | C11H12O2S | 3, 4
lig_1223 | C8H9ClN2OS | 1, 3
lig_1228 | C15H13FN2OS | 1, 3, 4
lig_1281 | C9H7F3O3 | 1, 3, 4
lig_1310 | C12H13NO2 | 1, 4
lig_1315 | C13H11F3O2 | 1, 3, 4
lig_1381 | C9H6F3NO | 3, 4
lig_1382 | C12H8F3NO2S | 1, 3, 4
lig_1410 | C9H11NO | 3, 4
lig_1427 | C11H13NO | 1, 4
lig_1428 | C11H11NO2 | 1, 4
Table 2
Thermal shift assay (TSA) results for 59 potential fragment hits from computational screening using severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) Nsp1N.

Fragments showing atypical curves are labelled as atypical curve in the table.

Identification number | Fragments | Ti ± SD [°C] | ΔTi [°C]
lig_575 | 1B9 | 52.5 ± 0.6 | –2.7
lig_490 | 1C6 | 51.5 ± 0.4 | –3.4
lig_507 | 1H6 | 50.5 ± 0.1 | –4.1
lig_1427 | 2E10 | 53.8 ± 0.3 | –1.3
lig_1428 | 2F10 | 53.5 ± 0.4 | –1.7
lig_1037 | 2F5 | 53.7 ± 0.6 | –1.6
lig_685 | 3E8 | 54.8 ± 0.2 | –0.3
lig_924 | 4B5 | 54.7 ± 0.5 | –0.5
lig_502 | 4C6 | Atypical curve |
lig_369 | 4E4 | 53.6 ± 0.4 | –1.4
lig_400 | 5B7 | 53.9 ± 0.1 | –0.9
lig_194 | 5D3 | Atypical curve |
lig_335 | 5E8 | 54.8 ± 0.2 | –0.4
lig_30 | 6E3 | Atypical curve |
lig_113 | 6G8 | 54.3 ± 0.0 | –0.6
lig_168 | 7A2 | 51.2 ± 0.4 | –3.8
lig_212 | 7A4 | Atypical curve |
lig_552 | 7B6 | 55.0 ± 0.2 | 0.1
lig_1220 | 7D10 | 54.7 ± 0.3 | –0.2
lig_171 | 7D2 | 54.6 ± 0.2 | –0.3
lig_570 | 7E6 | 51.5 ± 0.4 | –3.1
lig_579 | 7F11 | 51.8 ± 0.2 | –3
lig_1054 | 7F6 | 54.3 ± 0.3 | –0.6
lig_1310 | 8A4 | Atypical curve |
lig_1281 | 8A5 | Atypical curve |
lig_422 | 8B4 | 51.2 ± 0.1 | –3.7
lig_349 | 8C4 | Atypical curve |
lig_1410 | 8D3 | 53.2 ± 0.3 | –2.4
lig_1009 | 8G7 | 51.5 ± 0.4 | –3.3
lig_355 | 9A11 | 54.9 ± 0.1 | 0
lig_243 | 9C10 | 54.3 ± 0.1 | –0.9
lig_1216 | 9D8 | 54.5 ± 0.4 | –0.7
lig_812 | 9E11 | 53.1 ± 0.1 | –1.9
lig_329 | 9E2 | 54.4 ± 0.3 | –0.5
lig_262 | 9G9 | 53.2 ± 0.3 | –1.7
lig_783 | 10A6 | 54.4 ± 0.2 | –1.2
lig_910 | 10A8 | Atypical curve |
lig_806 | 10A9 | 54.6 ± 0.1 | –1
lig_1381 | 10B7 | 50.7 ± 0.3 | –4.9
lig_286 | 10G8 | 54.1 ± 0.2 | –0.7
lig_1209 | 11A5 | 55.8 ± 0.1 | 0.2
lig_1057 | 11B2 | Atypical curve |
lig_907 | 11E4 | 55.6 ± 0.0 | 0.7
lig_1064 | 11E8 | 54.6 ± 0.1 | –0.3
lig_377 | 11F10 | 54.8 ± 0.1 | –0.8
lig_1195 | 11F2 | 54.8 ± 0.2 | –0.1
lig_1149 | 11G3 | 55.4 ± 0.5 | 0.5
lig_83 | 12A11 | Atypical curve |
lig_1315 | 12A3 | Atypical curve |
lig_864 | 12B10 | 55.1 ± 0.2 | –0.1
lig_394 | 12B5 | 55.2 ± 0.2 | 0.1
lig_1382 | 12C4 | 55.5 ± 0.4 | 0.6
lig_752 | 12D5 | 54.9 ± 0.1 | –0.3
lig_1223 | 12F7 | 52.4 ± 0.3 | –2.7
lig_1157 | 12F9 | 55.0 ± 0.2 | –0.2
lig_1228 | 12G7 | 52.6 ± 0.5 | –2.6
lig_892 | 12H2 | 54.8 ± 0.1 | –0.4
lig_706 | 13D6 | 51.8 ± 0.4 | –3.3
lig_897 | 13G5 | 54.3 ± 0.1 | –0.9
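
As a reading aid for the ΔTi column, the sketch below assumes that ΔTi is the fragment Ti minus the Ti of a reference measurement without fragment. The table does not list the reference values, so the definition, the function name `delta_ti`, and the numbers used here are hypothetical.

```python
def delta_ti(ti_fragment, ti_reference):
    """Shift in inflection temperature relative to a reference, in °C.

    ti_fragment is None for fragments reported as "Atypical curve",
    in which case no shift is computed.
    """
    if ti_fragment is None:
        return None
    # One decimal place, matching the precision used in the table
    return round(ti_fragment - ti_reference, 1)


# Hypothetical values, not taken from the article
shift = delta_ti(52.0, 55.0)
```

Under this assumed definition, a negative shift means the fragment lowers the inflection temperature relative to the reference.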
Table 3
Pairwise identity percentages of selected non-structural protein 1 (Nsp1) sequences for representative α- and β-coronaviruses (CoVs) subfamilies.

National Center for Biotechnology Information (NCBI) accession numbers of the sequences used for the analysis are as follows: transmissible gastroenteritis virus (TGEV) 6IVC_A, PDEV 5XBC_A, severe acute respiratory syndrome coronavirus 1 (SARS-CoV-1) NP_828860.2, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) YP_009725297.1, HKU9 P0C6T6, MERS YP_009047229, and murine hepatitis virus (MHV) YP_209244.

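CoV-like | SARS-1 | SARS-2 | HKU9 | MERS | MHV | TGEV | PDEV
SARS-1 | 100
SARS-2 | 86.1 | 100
HKU9 | 20.9 | 19.5 | 100
MERS | 20.9 | 17.1 | 24.5 | 100
MHV | 16.9 | 16.4 | 4.1 | 18.0 | 100
TGEV | 17.4 | 16.9 | 13.3 | 7.9 | 9.0 | 100
PDEV | 17.9 | 16.1 | 15.2 | 8.7 | 11.5 | 18.2 | 100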
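
A pairwise identity percentage of this kind can be computed from two aligned sequences roughly as sketched below. This is an illustration only: the article does not state which denominator convention was used, so counting every alignment column except double gaps is an assumption, as are the function name `percent_identity` and the toy sequences.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity between two sequences taken from one alignment.

    Columns where both sequences have a gap are ignored; all remaining
    columns count toward the denominator (an assumed convention).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(seq_a, seq_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no comparable alignment columns")
    # A gap paired with a residue never matches, so a == b is sufficient
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Toy aligned sequences, for illustration only
identity = percent_identity("MESL-V", "MDSLGV")
```

Other conventions (for example, dividing by the length of the shorter sequence) give different percentages for the same alignment, so values are only comparable when computed the same way.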

Cite this article

Alberto Borsatto, Obaeda Akkad, Ioannis Galdadas, Shumeng Ma, Shymaa Damfo, Shozeb Haider, Frank Kozielski, Carolina Estarellas, Francesco Luigi Gervasio (2022)
Revealing druggable cryptic pockets in the Nsp1 of SARS-CoV-2 and other β-coronaviruses by simulations and crystallography
eLife 11:e81167.
https://doi.org/10.7554/eLife.81167