A pH-dependent cluster of charges in a conserved cryptic pocket on flaviviral envelopes
Abstract
Flaviviruses are enveloped viruses which include human pathogens that are predominantly transmitted by mosquitoes and ticks. Some, such as dengue virus, exhibit the phenomenon of antibody-dependent enhancement (ADE) of disease, making vaccine-based routes of fighting infections problematic. The pH-dependent conformational change of the envelope (E) protein required for fusion between the viral and endosomal membranes is an attractive point of inhibition by antivirals as it has the potential to diminish the effects of ADE. We examined six flaviviruses by employing large-scale molecular dynamics (MD) simulations of raft systems that represent a substantial portion of the flaviviral envelope. We utilised a benzene-mapping approach that led to a discovery of shared hotspots and conserved cryptic sites. A cryptic pocket previously shown to bind a detergent molecule exhibited strain-specific characteristics. An alternative conserved cryptic site at the E protein domain interfaces showed a consistent dynamic behaviour across flaviviruses and contained a conserved cluster of ionisable residues. Constant-pH simulations revealed cluster and domain-interface disruption under low pH conditions. Based on this, we propose a cluster-dependent mechanism that addresses inconsistencies in the histidine-switch hypothesis and highlights the role of cluster protonation in orchestrating the domain dissociation pivotal for the formation of the fusogenic trimer.
Editor's evaluation
Using state-of-the-art molecular dynamics simulations, the authors discuss the potential binding sites of drug molecules to the flaviviral envelope. Moreover, using constant pH simulations, they discuss the functional relevance of a cluster of ionizable residues in a cryptic site at the domain interface. These results have provided novel mechanistic insights into the pH-dependent conformational changes of the envelope protein and cryptic binding sites in the envelope protein that can be targeted for inhibiting viral infection.
https://doi.org/10.7554/eLife.82447.sa0Introduction
Flaviviruses belong to a family of enveloped positive-sense, single-stranded RNA viruses which are transmitted by arthropod vectors, predominantly mosquitoes and ticks. The family includes human pathogens such as dengue (DENV), yellow fever (YFV), Zika, tick-borne encephalitis (TBEV), Japanese encephalitis (JEV) and West Nile virus (WNV), which can be subdivided into a neurotropic and non-neurotropic group of infectious agents (Gaunt et al., 2001). The non-neurotropic group of flaviviruses, associated with haemorrhagic disease, is primarily transmitted via Aedes mosquitoes and is mainly present in the tropical and subtropical regions of the world (Bhatt et al., 2013; Kraemer et al., 2015). In recent years, partly as a result of global warming, the mosquito distribution has been rapidly expanding into temperate regions and is now co-distributed with the majority of the human population (Brady et al., 2012; Kamal et al., 2018; Kraemer et al., 2019).
Flaviviral particles consist of an internal nucleocapsid and a surface-facing envelope, which is composed of a membrane and two types of embedded viral proteins. An envelope (E) protein is exposed to the surface of the virus and appears as a first point of contact between the virus and host immune system, whereas a smaller membrane (M) protein, which plays a role in viral maturation, localizes underneath the E protein and is concealed from the exterior. Both proteins are embedded in the viral membrane via transmembrane (TM) helices, and together they form stable heterotetrameric (E2M2) complexes (Zhang et al., 2013a; Kostyuchenko et al., 2013; Sirohi et al., 2016; Sevvana et al., 2018). In the mature virion, 90 E2M2 units are organised in a tight-knit herringbone-like pattern with three heterotetramers arranged in parallel to form a raft. In total, 30 rafts form a complete viral envelope and are likely a functional unit for immune recognition (Zhang et al., 2013a; Sevvana et al., 2018; Figure 1).

Flaviviral raft structure and domain organisation.
(A) A top-down view on the flaviviral envelope protein raft consisting of three E2M2 heterotetramers organised in a parallel fashion. Chains differ in the context of their immediate environment and are described in terms of the symmetry fold, which is repeated across the entire viral particle. In this model, the majority of interchain contacts is retained for the central dimer (consisting of two 2-fold symmetry chains), while the edge 3-fold symmetry chains are not bordering neighbouring rafts as in the full virion. (B) The curvature of the envelope membrane is imposed by that of the protein rafts. See also Figure 1—figure supplement 1. (C) Top and side views of the E2M2 heterotetramer shown in ribbon representation and coloured by functional domains. Protein surface is shown as a transparent outline. The six flaviviral strains differ in length, and residue numbers on the key correspond to the alignment of all strains and to DENV1, 2, and 4. By convention, individual residues mentioned in the text are always matched with DENV2 numbering.
During the viral entry process, the virion is internalised via clathrin-mediated endocytosis (Mosso et al., 2008; Acosta et al., 2008) and is contained within endosomes. A gradually acidifying environment inside the compartment (~pH 5–6.5; Randolph and Stollar, 1990) triggers a large-scale conformational change within the envelope whereupon E proteins dissociate from their E2 dimeric state, which lies flat on the surface of the virus, and instead form trimeric spikes with conserved fusion loops positioned at the tips (Modis et al., 2004). The fusion loops anchor the virus to the host membrane, effectively creating a bridge that spans the gap between the viral and endosomal membranes. A series of putative conformational changes in the spike trimer catalyse membrane fusion, penetrating the endosome and allowing for the viral nucleocapsid with its RNA genome to be released into the host cytoplasm (Harrison, 2015).
The identity of the exact residues responsible for triggering this dramatic sequence of conformational changes remains unclear, but the majority of research has been focussed on several conserved histidine residues as the most likely pH-sensing candidates, guided by the principles of the ‘histidine switch hypothesis’. This hypothesis presupposes histidines to be pH-sensing residues responsible for triggering conformational changes in viral fusion proteins that are dependent on pH. The hypothesis hinges on the property of the histidine side chain having its pKa close to physiological pH (pKa = 6.0), and therefore presumes that only histidines are able to change protonation state under the relatively mild pH changes occurring inside the living cell. Site-directed mutagenesis experiments on recombinant subviral particles of TBEV demonstrated that a single His317 mutation (residue numbers corresponding to DENV2) was sufficient for inhibiting a membrane fusion step, likely because of impaired dissociation of E protein dimers. Double His244/His282 mutants had a reduced stability of fusogenic trimers, which also impaired fusion (Fritz et al., 2008; Stiasny et al., 2011). A mutation in His144 could not yield stable subviral particles and its effect on the fusion step could therefore not be determined. However, subsequent research on WNV reporter virus particles has brought into question the histidine-switch hypothesis of the pH-dependent conformational change (Nelson et al., 2009). Even though mutations of His144 and His244 significantly affected the fusion process, substitutions to non-titratable Met/Asn and aromatic residues, respectively, resulted in partial rescue of infectivity.
One of the best researched flaviviruses is DENV (serotypes 1–4) which poses a serious health risk because it can lead to dengue haemorrhagic fever or dengue shock syndrome (Gould, 1998). More serious forms of the disease have been linked to antibody-dependent enhancement (ADE), where a first infection produces anti-DENV antibodies specific against one strain, which then imperfectly cross-react with another strain during a subsequent infection (Halstead, 1979; Balsitis et al., 2010; Guzman et al., 2013). This unexpected detrimental effect of the host’s immune system has been linked to subtle differences in E protein sequences among DENV serotypes, inefficient viral maturation resulting in a high number of immature particles presenting in the extracellular space, as well as to a striking heterogeneity of DENV particle morphology observed under various environmental conditions associated with temperature and pH (Modis et al., 2004; Junjhon et al., 2010; Fibriansah et al., 2013; Zhang et al., 2013b; Lim et al., 2019; Morrone et al., 2020; Fibriansah et al., 2021). The mechanism of ADE is most likely multifaceted (Morrone and Lok, 2019; Wirawan et al., 2019), but predominantly includes anti-DENV antibodies interacting with the Fcγ receptor (FcγR) leading to enhanced viral internalisation into the endosome, from which the virus can escape by utilising its fusogenic spikes (Balsitis et al., 2010; Rey et al., 2018; Williams et al., 2013). The aggravation of disease caused by antibodies puts into question conventional routes of fighting viral infection, which usually involve vaccines designed to promote humoral immunity and a strong antibody response to the presence of a virus. Sanofi’s Dengvaxia, the only dengue vaccine currently in use, has been shown to be serotype-dependent and can lead to ADE in those who have not contracted the virus prior to vaccination (Guy et al., 2017).
An alternative to antibody-reliant approaches for fighting viral infections are antiviral drugs which may act upon various stages of the viral life cycle. There have been numerous attempts to apply drug design to target flaviviruses, and they have usually focussed on the non-structural viral proteins involved in viral replication or polyprotein processing, as well as some host targets involved in viral assembly (reviewed in Boldescu et al., 2017). Recent studies have demonstrated that cyanohydrazone molecules bind to a pocket on the E protein to inhibit viral entry at micromolar concentrations for DENV2, Zika, and JEV (de Wispelaere et al., 2018; Li et al., 2019). The pocket in question is located at the domain I-domain II interface of the E protein, is cryptic in nature, and was fortuitously observed in crystal structures of the DENV2 soluble portion of the E protein (sE) bound to a single n-Octyl-β-D-Glucoside (β-OG) detergent molecule (Modis et al., 2003). The effectiveness of these antivirals in blocking the early stages of the viral life cycle without causing ADE encourages exploration of other possible druggable sites on flaviviral envelopes. It also highlights the potential of utilizing drug design approaches targeting cryptic sites that may be of functional importance to the virus.
One of the methods focussed on detection of potential drug-binding sites is a co-solvent simulation approach that exploits the property of preferential binding of small organic molecules onto protein surfaces with which they can establish favourable interactions. Such surfaces are known as hotspots and their detection is a critical step for rational drug design (DeLano, 2002). To that aim, the probes resembling common functional drug moieties (such as benzene or hydrocarbons) are preferably used, as their binding location and mode of interaction may be instructive for a subsequent drug design step (Guvench and MacKerell, 2009; Tan et al., 2016). On top of being a frequent component of drug-like molecules (Kolb and Caflisch, 2006), benzene is also apolar in nature. This property is useful for uncovering cryptic sites which are at least partially hydrophobic in character and therefore only transiently appear in a detectable ‘open’ conformation (Cheng et al., 2007; Schmidtke and Barril, 2010; Oleinikovas et al., 2016; Kuzmanic et al., 2020).
Here, we use large-scale co-solute molecular dynamics (MD) simulations to explore the behaviour of six flaviviral rafts (DENV1-4, YFV, and Zika) and reveal conserved cryptic pockets in the presence of benzene probes. To that aim, we establish a raft model as a realistic representation of the viral envelope that successfully reproduces the features of curvature and interchain contacts, while being attainable in an all-atom level description and thus providing a precise insight into physicochemical properties applicable to the envelope as a whole. The dynamics of the ~400,000 atom raft systems are explored using a total simulation sampling time of over 14 µs. We investigate a landscape of potentially druggable sites on the envelope surface, compare pocket conservation across viral strains in relation to their sequence similarity, and propose conserved and functionally important sites with potential for drug targeting and inhibition of the early stages of the viral life cycle. Finally, we elaborate on the functional importance of a proposed cryptic site by highlighting a conserved charged cluster of residues that includes the key histidine residue, His144, explore its behaviour in different pH-environments using a constant pH-simulation approach, and propose a cluster-dependent mechanism of DI-DIII dissociation essential for formation of the fusogenic trimer.
Results and discussion
Benzene binding increases solvent protein accessibility without affecting the secondary structure
High concentrations of apolar probes can potentially lead to undesirable unfolding events by invading and disrupting the hydrophobic core of the protein, resulting in a shift towards biologically irrelevant conformations (Schmidt et al., 2019). Thus, in the current work, we first confirmed that the addition of benzene to flaviviral raft systems increases the total SASA without affecting protein secondary structures (Figure 2A–B). The unperturbed secondary structure suggests that the increased SASA of the sampled conformations of the rafts is not due to protein unfolding, but instead is a result of subtle side chain rearrangements, benzene incorporation into protein-protein interfaces, and cryptic pocket opening events. Additionally, very similar RMSF profiles for water-only and benzene simulations demonstrated that the overall dynamic fluctuations were comparable in behaviour despite the differences in solvent composition (Figure 2C). Highly dynamic loop regions (such as the N153 glycan loop) had fractionally lower RMSFs compared to water-only simulations. This effect might be due to stabilisation imposed by benzene interacting with the disordered loop regions and the ordered surface beneath, thus providing a stable interaction surface for the flexible portions of the protein.

Global raft properties for all six viral serotypes showing the effect of benzene probes added to the solvent.
(A) The total SASA across all viral systems increases with the addition of hydrophobic benzene probes to the solvent. The first 50 ns of each simulation was considered as equilibration and therefore omitted from the analysis. (B) The number of residues participating in secondary structure (defined as constituent elements of ⍺-helices, β-sheets, β-bridges, or turns) remains largely unaffected in benzene systems and closely corresponds to the values describing the experimentally derived cryo-EM structures. (C) The RMSF values describing whole-residue fluctuations are similar across solvent types and viral serotypes. The gaps in lines are due to the sequence alignment which accounts for differences in length or deletions/insertions between different flavivirus envelope proteins, ensuring that the corresponding residues are vertically aligned. All values displayed in panels are averaged across repeats and, in the case of RMSF, across all chains of the system. Standard error is shown as a transparent ribbon around the mean value.
Benzene binding patterns are sequence-dependent
Flaviviral serotypes, although similar in structure, exhibit subtle serotype-specific differences in envelope dynamics. Even though the polyprotein sequence similarity between the presently explored flaviviruses is relatively high (42% for all six serotypes and 71% for DENV serotypes only), a large number of mutations localise on the surface-exposed portions of the envelope protein (Figure 3—figure supplement 1A–B). This is unsurprising in the context of viral evolution, as mutations arise in the surface-exposed areas in order for the virus to evade the host immune response. However, certain exposed regions of the E protein are highly conserved because they play specific functional roles in the viral life cycle and are therefore less likely to accumulate mutations.
Shared sites of benzene binding reflect residue conservation patterns on the E protein and mostly localise around the conserved fusion loop and at the pr-binding interface (Figure 3 and Figure 3—figure supplement 1C). The fusion loop (residues 98–111) is conserved among flaviviruses as it inserts into the endosomal membrane, which is essential for catalysing the membrane fusion event leading to genome release into the host cell cytoplasm (Allison et al., 2001). Similarly, the pr portion of the prM protein protects the fusion loop from premature insertion during the maturation process, and its interaction surface on the E protein is therefore also conserved to a high degree (Heinz et al., 1994; Figure 3—figure supplement 1A and C). It is interesting to note that protein-protein interfaces have similar characteristics to drug-binding hotspots (DeLano, 2002; Mattos and Ringe, 1996; Zerbe et al., 2012) and the methods used for detecting one are also likely to detect the other type of protein surface. This was well demonstrated by benzene mapping, as we found that benzene molecules preferentially interacted with the pr-binding interface, highlighting it as a potential hotspot.

Benzene contacts and changes in SASA shown for all six viral serotypes.
(Left) The average number of benzene contacts per residue is shown on viral raft structures. The values for individual chains are averaged across repeats. For patterns of residue conservation and phylogenetic relationships between strains, see Figure 3—figure supplement 1. (Right) In each upper graph, the mean value of benzene contacts per residue averaged across all chains and repeats is shown in black line, with standard error displayed as a grey ribbon. In each lower graph, the average change in SASA is shown in nm (Bhatt et al., 2013), calculated as (SASA+bnz - SASA-bnz), with an increase shown in green and decrease in red. As with Figure 2, the gaps in lines are due to flavivirus-specific sequence alignment.
Overall, though, the surface patterns of benzene binding and, consequently, increased SASA, were variable and serotype-dependent (Figure 3). The external E protein surface is mostly unconserved and correspondingly showed a significant degree of variability in benzene binding patterns across serotypes. Notably, a central portion of DII was benzene-depleted (conservatively, residues 58–65, 120–125, 215–235, 252–267), suggesting that it is unlikely to be an attractive drug-binding hotspot. The width of this depleted band was sequence-dependent, with it being the narrowest in DENV4 and the widest in YFV. There are currently no known flaviviral antibodies that specifically target the central band of the raft, corroborating the observation that poor benzene binding sites are also unlikely to be involved in the formation of protein-protein interfaces. The benzene interaction sites slightly differed depending upon the position of the chain within the raft (Figure 1A). Protein-protein interfaces that were positioned on the raft edges were more easily accessible to benzene molecules than the interfaces engaged in interchain interactions. Despite this, benzene mapping was highly similar across chains and displayed clear serotype-specific patterns of binding.
The areas of frequent benzene binding also matched increases in SASA, showcasing the role of benzene in exposing protein sites that predominantly remain hidden in water-only simulations. The TM regions of the E proteins and the M proteins showed little increase in SASA, which is explained by the fact that those regions are sequestered within the membrane and are, for the most part, out of reach of benzene molecules.
Correlation analyses of serotype properties, especially SASA, recapitulated phylogenetic relationships between the viruses (Figure 3—figure supplement 1D). Serotypes that are more closely related generally exhibited more similar dynamic behaviour, highlighting the importance of sequence conservation in yielding shared characteristics such as SASA, benzene binding, or pocket formation. Interestingly, not all residue-based properties showed an equally high degree of correlation with sequence – notably, the RMSF correlations were poorly reflective of flaviviral evolutionary relationships, suggesting a more complex interplay of properties causing residue fluctuations than sequence alone (Figure 3—figure supplement 1D). Instead, the RMSF values showed remarkable similarity across all six viral serotypes (Figure 2C).
The open/closed state ratio of β-OG pocket is specific to the viral serotype
The β-OG cryptic pocket was first detected in DENV2 sE protein crystal structures (Modis et al., 2003) where a single detergent molecule intercalated within the hydrophobic pocket located at the DI-DII interface of each chain (Figure 4A). The binding pocket was absent in apo-structures as the position of the kl loop obscured the entry to the site (Figure 1C). Functionally, the DI-DII interface is a hinge region and its movement is essential for the E protein to undergo pH-dependent conformational changes leading to the formation of fusogenic trimers (Zhang et al., 2004). However, the co-crystallised β-OG molecule was absent from sE dimer structures of other flaviviruses (Kanai et al., 2006; Nybakken et al., 2006), and even from the E protein structures of the same viral serotype (Zhang et al., 2004) despite the addition of the detergent in the crystallisation buffer. Nonetheless, other studies have shown that the β-OG pocket is likely present in at least some flaviviruses. The JEV apo-sE crystal structure showed that the kl loop is positioned in a way to reveal a pocket large enough to accommodate β-OG, even in the absence of the ligand (Luca et al., 2012). Indeed, pyrimidine compounds and cyanohydrazones (both groups exhibiting inhibitory effects on the fusion events of DENV2, WNV, JEV, and Zika viruses) appear to bind inside the β-OG pocket, as demonstrated by loss-of-binding site-directed mutagenesis and photocrosslinking experiments (de Wispelaere et al., 2018; Li et al., 2019).

β-OG pocket in the presence of the β-OG ligand or benzene, and its SASA across viral serotypes.
(A) The β-OG detergent molecule co-crystallised inside the cryptic pocket of the DENV2 sE protein (PDB code: 1OKE) (Modis et al., 2003) . The protein is shown as a combination of white ribbon and transparent surface representations. The β-OG molecule is shown in yellow sticks, with oxygen atoms highlighted in red. The residues surrounding the ligand are accentuated in stick representation and coloured based on their conservation scores calculated in Consurf (Ashkenazy et al., 2016). Residue labels correspond to DENV2. (B) A representative simulation snapshot of benzene molecules occupying the β-OG pocket in the DENV2 system. Benzenes are shown in yellow stick representation. Other elements of the system are visualised in the same way as in panel A. (C) Cumulative SASA for all residues lining the pocket (labelled in panel A) across six flaviviral serotypes. The data was visualised as a swarm plot, where a SASA measurement for each sampled frame was represented as a horizontally stacked non-overlapping point. The spread of points therefore describes the distribution of SASA measurements for systems in absence (-bnz) or presence of benzene (+bnz). Pocket SASA values are shown separately depending on the solvent composition. Open and closed conformations of the pockets were differentiated by applying a k-means clustering algorithm (details in the Methods section). The percentage of the β-OG pocket observed in an open conformation is specified for each simulation system. The indicated experimental SASA value of the pocket is that measured for the crystal structure (panel A) averaged across the two chains. See also Figure 4—figure supplement 1.
Using the benzene mapping simulation approach on raft systems, we explored the characteristics of the β-OG pocket in different flaviviral serotypes. We observed that the ratios of open/closed states of the pocket differ greatly depending upon the particular virus (Figure 4C), with the biggest pocket exposure attributed to DENV4 followed by DENV2, with YFV exhibiting comparatively the lowest pocket SASA even in the presence of benzene. Pocket properties were also affected by solvent composition, with simulations containing benzene manifesting, on average, higher SASA values. Benzene molecules can enter the hydrophobic pocket and transiently interact with residues lining the site (Figure 4B). The number of benzenes that occupied the pocket during the course of the trajectory was in direct correlation with the pocket SASA (Figure 4—figure supplement 1B), with DENV2 and DENV4 showing the highest propensity for accommodating benzene molecules within the pocket, both in terms of number and frequency. In contrast, YFV β-OG pockets were the least likely to contain any benzene. Therefore, the presence of benzene probes in the system modified pocket exposure and, consequently, its SASA, either by locking the pocket in an open conformation or by inducing the opening of the pocket in the first place.
The observed open/closed ratios of the β-OG pocket were not equivalent for all chains in the system; instead, they were dependent on the position of the chain within the raft. The edge 3-fold chains appeared to have the highest median SASA value, an effect that was in particular exaggerated for DENV2 and DENV4 systems (Figure 4—figure supplement 1A). This might be due to the fact that the edge chains were unconstrained by the neighbouring elements on one side, allowing for greater flexibility of the protein and easier opening of the pocket.
We clustered pockets based on SASA and contact properties to separate the sampled conformations into two clusters corresponding to open and closed states (Figure 4C). The DENV4 pocket appeared to be predominantly sampled in an open conformation; the DENV2 pocket was in an open state only half as often as compared to DENV4. Contrastingly, DENV1 and DENV3 exhibited very few open conformations. This variability is reflective of a relatively low conservation of residues (53%) among the DENV group of viruses – an attribute that also highlights the potential difficulty in developing pan-flaviviral therapeutics targeting the β-OG site.
The evident differences in pocket exposure can only be partially explained by the differences in the residues lining the pocket, and it is likely that the pocket opening is influenced in the wider context of protein residues outside of the binding site. de Wispelaere et al., 2018 identified a mutation in DENV2 that is detrimental to antiviral compound binding, even though the point-mutation M196V is outside of the β-OG pocket. These findings suggest that the pocket opening might be allosterically affected by residues outside the binding cavity, either by modifying its shape, frequency of opening, or by affecting the stability of the interactions with its binding ligands.
Remarkably, the SASA value measured for the open β-OG pocket in the crystal DENV2 sE structure (Modis et al., 2003) closely corresponded to the mean pocket value in simulated DENV2, suggesting that the benzene simulations successfully sampled biologically relevant open conformations of the pocket that match experiment. Despite some of the serotypes exhibiting predominantly closed states of the pocket even with the addition of benzene, those β-OG pockets are not necessarily absent. Instead, the kinetics of site opening might be serotype-dependent, resulting in pockets of some of the viruses predominantly residing in closed states. Thus, successful inhibition of the fusion events in Zika in the presence of β-OG pocket-binding inhibitors suggests that the pocket is sufficiently druggable, despite its presently observed low fraction (13.1%) of open states.
The ⍺ pocket located on the domain interfaces is conserved and contains a buried cluster of charges
Our previous studies of a single DENV2 E2M2 heterotetramer identified a novel cryptic pocket (referred to as ⍺) located on domain interfaces (Zuzic et al., 2020; Figure 5). Here, we found that this pocket was consistently present in all explored flaviviral serotypes and for all raft chains (Figure 5—figure supplement 1). The increase in SASA in the presence of benzene was evident for all mapped ⍺ pockets, confirming its cryptic character. Interestingly, properties of the ⍺ pocket across serotypes appeared to be more consistent than those of the β-OG pocket, and this uniformity could be linked to a higher degree of residue conservation (47% for all explored flaviviruses; 70% for the DENV group). SASA was also less affected by the location of the chain within the raft for the ⍺ pocket, likely due to the fact that it is positioned peripherally and is therefore less dependent on intra-raft contacts. We cannot exclude that the dynamics of the ⍺ pocket opening might be affected by inter-raft contacts not present in our models.

The location of the ⍺ pocket and the selection of all conserved, buried and titratable residues on the flaviviral envelope, showcasing a charge cluster associated with the pocket.
(A) Left side: The ⍺ pocket density displayed as a transparent grey surface located on the intrachain DI-DIII and interchain DII-DIII interface. The E protein is shown in ribbon representation and coloured according to domain assignments (as seen in Figure 1). The plane represents a directional slice of the pocket shown on the right side of the panel. Right side: The protein is shown in white surface representation and the residues lining the pocket are displayed as sticks and coloured according to their Consurf conservation score. The pocket is highlighted in yellow. (B) The selection of E and M protein residues according to the properties of conservation, burial, and ionizability. Residues characterised by a high degree of conservation (Consurf score ≤–1.0) and solvent inaccessibility (SASA ≤0.2 nm2) are shown in the inset. Additionally, titratable residues are labelled and displayed on the E2M2 structure, where it becomes apparent that almost all selected residues occupy the DI-DIII interface region associated with the ⍺ pocket. For ⍺ pocket SASA measurements across flaviviral strains, see Figure 5—figure supplement 1.
The pocket is elongated and functionally divided into two parts - the fusion loop interface, representing the interchain interaction surface; and the DI-DIII cryptic pocket segment that is located underneath the N153-glycosylation loop (Figure 5A). A closer inspection of the cryptic segment reveals that it is well conserved, with surface residues showing a greater degree of variability, and buried residues appearing overall more conserved. The N153-glycan loop obstructing the entrance to the pocket was highly flexible in our simulations for all DENV serotypes (Figure 1C). Although we omitted the glycan component from our models, the simulations were in agreement with the hydrogen-deuterium exchange experiments performed on the whole DENV2 particle that demonstrate a high degree of flexibility in the glycosylation loops (Lim et al., 2017). Furthermore, cryo-EM structures of the whole viral particles (Zhang et al., 2013a; Sevvana et al., 2018; Renner et al., 2021), as well as crystal structures of E proteins expressed in insect cells (Modis et al., 2003; Zhang et al., 2004; Modis et al., 2005; Barba-Spaeth et al., 2016) indicate that the N153-loop is unstructured. Interestingly, E protein dimers expressed in bacterial systems (Luca et al., 2012; Lu et al., 2019) appear to have the N153-loop structured as an ⍺-helix, suggesting a role of the glycan moiety in loop secondary structure organisation. The flexibility of the N153-loop indicates that it might aid cryptic pocket opening and transient solvent accessibility of residues lining the ⍺ pocket.
The DI-DIII cryptic segment of the pocket contains a conserved His144 residue (Figure 5A) that has been implicated in fusogenic pH-dependence (Nelson et al., 2009) and viral particle stability (Fritz et al., 2008). In a static structure (Zhang et al., 2013a), the residue appears to be solvent-inaccessible and with a predicted pKa outside the biological range (pKa <0), which would make it an unlikely candidate for a pH-sensing residue (Figure 6—figure supplement 2A). However, site-directed mutagenesis experiments indicate its importance in the pH-dependent conformational change (Nelson et al., 2009), although its exact role is as of yet unknown. The dynamic exploration of the flaviviral envelopes in our simulations shows that the residue could be transiently exposed to the solvent, concurrent with the cryptic pocket opening events, which in turn might affect its pKa.
Generally, the only residues that have tended to be considered in the context of pH-sensing roles in the flaviviral E proteins have been histidines as their intrinsic sidechain pKa values (~6.0, as determined from titration experiments of compounds outside the protein environment) can rapidly protonate under relatively mild acidic conditions. However, this approach disregards the effect of the surroundings on residue pKa (García-Moreno et al., 1997; Isom et al., 2011) and the fact that the residues, especially buried ones, are located in a low dielectric environment. Under such conditions, pKa values can be dramatically altered – specifically, Asp/Glu residues have the potential to be upshifted (Srivastava et al., 2007) and act as pH-sensing or pH-coupled residues (Sazanavets and Warwicker, 2015). When E and M protein residues were explored in the context of their burial, conservation, and potential for being titrated, only six of them fulfilled the criteria and, remarkably, all but one were located on the DI-DIII interface associated with the ⍺ pocket region (Figure 5B). Four conserved residues – Arg9, Asp42, His144, and Glu368 – were found to closely interact and formed a distinct cluster of charges bridging the DI-DIII interface. In addition, the N-terminus (N-ter) – characterised by its burial and ionisability, and unaffected in terms of conservation – was in close proximity to the residues in question and was also a part of the cluster. The presence of closely coupled residues within the cluster suggests that any pH-dependent protonation event on His144 would have to be considered in the context of other charges within the cluster. The effects of pH on the cluster protonation and dynamic behaviour were next explored in constant-pH simulations.
Disruption of the ⍺ pocket charge cluster at low pH
We performed 0.88 µs of constant-pH simulations on the sE dimer of DENV2 across a broad pH range of 0–9. The convergence of the examined pKa values is shown in Figure 6—figure supplement 1. The focus of our analysis was the ⍺ pocket charged cluster (N-ter, Arg9, Asp42, His144, Glu368) and its behaviour at different pH conditions (Figure 6). Interestingly, even though preliminary pKa calculations based on the static structure obtained from cryo-EM predicted His144 to be inaccessible for protonation (Figure 6—figure supplement 2A), conformational sampling during constant-pH simulations revealed transient solvent accessibility of the charge cluster that likely allows for histidine protonation at endosomal pH (pKa(His144)=4.84; Figure 6—figure supplement 2B).

The response of the charge cluster to varying pH conditions.
For convergence of pKa values in constant-pH simulations, see Figure 6—figure supplement 1. (A) Disruption of the charge cluster (assessed via its radius of gyration) in relation to the charge pattern (violin plots in the foreground) and overall cluster charge (boxplots in the background). The measurements were obtained over a range of pH values (pH 0–9). The violin plots represent the distribution of y-values (radius of gyration measurements) for each protonation state, where the total area of violin(s) for each protonation state is constant. The codes in the legend describe residue charges for the residues that were titrated in the simulation system: Asp42, His144, and Glu368. N-ter and Arg9 had a fixed positive charge and were therefore omitted. For predicted pKa values of selected residues, see Figure 6—figure supplement 2. (B) Occurrence of distinct protonation states at different pH conditions. Point sizes correspond to the population sizes at each value. D-H0E- is a dominant pattern of charge in high and neutral pH conditions, whereas the remaining seven patterns (representing +1,+2, and+3 total cluster charge) appear at low pH conditions. See also Figure 6—figure supplement 3. (C) PCA describing cluster conformations based on the intra-cluster distances. The axes corresponding to the original variables are shown on the right side of the plot (labels for the smallest axes His144-Glu368 and Asp42-His144 omitted for clarity). Clustering analysis of the five residues (details in the Methods section) generated nine distinct conformations which are projected onto the PCA and shown in insets around the plot. Conformations 8 and 9 had a negligible contribution to the total number of structures and were therefore excluded from the visualisation. The snapshots corresponding to structures undergoing a greater conformational change at low pH are visualised in ribbon representation (trajectories shown in Figure 6—videos 1; 2). Transparent surfaces represent the starting sE dimer conformation, demonstrating the extent of the conformational shift. Ribbon colours correspond to the specific charge patterns under which the conformational change had occurred (green and yellow for D-H+E0 and D0H+E0, respectively). For SASA of the ⍺ pocket across different charge states of the cluster, see Figure 6—figure supplement 4. For the effects of benzene on the conserved cluster behaviour, see Figure 6—figure supplement 5.
The disruption of cluster contacts (assessed by measuring the radius of gyration of the cluster residues, with higher values corresponding to overall greater disorder) was assessed in relation to the patterns of charges. Considering that the charge cluster is broken apart in the postfusion trimer (Modis et al., 2004), it is expected that a decrease in pH would be accompanied by disruption of the cluster. The cluster underwent a charge shift as consequence of lowering pH (Figure 6B), which indeed led to a gradual increase in disorder (Figure 6A), most noticeable following a total cluster charge shift from 0 to +3.
In high and neutral pH environments, the cluster was mainly in the D-H0E- state, corresponding to protonated N-ter and Arg9 and deprotonated Asp42, His144 and Glu368, resulting in a total cluster charge of 0. The residues within the cluster were predominantly in close contact and formed a tightly interacting unit on the DI-DIII interface. When assessed across all pH values, this close-contact state dominated the conformational landscape as it occurred in 95% of all sampled structures (Figure 6C). Notably, the remaining conformations, characterised by looser residue association, were enriched at positive cluster charges occurring at low pH (Figure 6—figure supplement 3). This uneven distribution of conformations across charge states suggests a dissociation effect linked with acidic pH conditions, despite its relatively low occurrence as compared to the compact conformation. The constant-pH simulations did not include titrations of the N-ter or Arg9 and instead contained a fixed +1 charge at all pH values. However, given that the intrinsic pKa values for both residues are in the basic range (pKa = 9.6 and pKa = 12.0) and their predicted pKa values are 7.4±0.2 and 9.45±1.9, respectively (Figure 6—figure supplement 2A), it is likely that both residues remain largely protonated in the acidic environment of the endosome.
Specific patterns of charge distribution across residues had different effects on cluster stability, even when the total charge remained unchanged: D-H+E- (protonated His144) had little effect on cluster disruption, while D-H0E0 (protonated Glu368) introduced a charge imbalance effect leading to the disturbance of the overall cluster structure. The cluster at +1 charge was a significant presence in the pH 5–6 range, relevant in the context of the endosomal pH-dependent conformational change (Figure 6B).
Transition of the total cluster charge to the +2 value resulted in substantial destabilisation of the DI-DIII interface, specifically in a D-H+E0 state (protonated His144 and Glu368). The switch to this charge state was accompanied by a protein-wide conformational change in the sE dimer with DIII shifting upwards from its initial position (Figure 6C). The disruption was linked to Arg9 decoupling from the rest of the residues to point away from the cluster. At very low pH (0–2), the cluster exhibited a total charge of +3 (D0H+E0, corresponding to all-residue protonation), and this was associated with even greater disruption of cluster contacts and conformational changes that manifested as DIII shifting away from DI (Figure 6C). Notably, this change appears to be different from the conformational switch observed in the D-H+E0 state as it was predominantly linked to Glu368 distancing from the remainder of the cluster residues.
Even though we observed protein-wide conformational changes only in more extreme pH environments than are observed within the endosome, it is evident that at least one associated pattern of charge (D-H+E0) can occur at pH 5, suggesting that this route of DI-DIII disruption at endosomal pH is possible. Alternatively, the observed increase in intra-cluster distances at total charge zero suggests that it might also lead to DI-DIII dissociation in the context of better sampling or a more expansive representation of the viral particle surface.
Based on our findings, we propose that the drivers of conformational change at low pH are more complex than the simplistic histidine switch hypothesis, which limits possibilities for pH-sensing residues exclusively to histidines. As shown in our simulations, the solitary His144 protonation event has little effect upon overall sE dimer stability; instead, the conformational change appears to be inextricably linked to an interconnected network of conserved titratable residues that jointly contribute to the pH-dependent formation of the fusogenic spike. The cluster effect might also explain incongruous results in loss-of-function experiments which have shown that E proteins with an H144N mutation can still undergo the fusion event (Nelson et al., 2009), suggesting that His144 plays a part, but is not indispensable for triggering trimer formation.
Limitations
In this work, we modelled the portion of the flaviviral envelope with the aim of detecting cryptic pockets on its protein components, but we omitted 1–2 glycan molecules (at position N67 for DENV1-4 only, and at N153 for all addressed flaviviral species). At the time of conducting the research, the benzene mapping method had not been verified against systems containing sugar moieties, and we hypothesised that the shielding effect of glycans might hinder effective benzene exploration of the protein surface. Furthermore, the experimental data available regarding glycan types on each glycosylation site is scarce and is dependent on the viral species and the examined host (Hacker et al., 2009). However, the ⍺ pocket that contains the cluster of charges implicated in pH-dependent conformational change is located underneath the N153-glycan loop and as such, it is expected to at least partially affect the dynamics of pocket opening. This aspect of flaviviral envelope dynamics remains unaddressed at this time, and a topic for the future.
The membrane was modelled under geometric and compositional constraints. The curvature of the membrane was induced by introducing position restraints on C⍺ atoms of the TM helices. The membrane consisted of three phospholipid species only, and it did not include sphingolipids or cholesterol, both of which are suggested to be functionally important for flaviviral infection (Martín-Acebes et al., 2014; Carro and Damonte, 2013). Taking into consideration the simplicity of the membrane composition model, as well as the application of position restraints in the lipid-proximal areas of proteins, detailed analysis of protein-lipid interactions was not pursued. Instead, the membrane served as a low-dielectric and solvent-excluding ‘shield’ that mimicked the approximate environment of the viral envelope and disallowed benzene to interact with the underside of the modelled raft.
The constant-pH simulations were performed on a truncated model of the E protein dimer, and the titrations were performed only on a small number of ionisable residues (specifically, the identified cluster charge residues and conserved histidines). The constant-pH method utilised the Generalised Born solvation model to attempt protonation, while the conformational sampling was performed in an explicit water environment. These aspects of the constant-pH simulation setup may be limiting in terms of accuracy of the used model and the comprehensiveness of the protonation sampling. Therefore, the future continuation of this research may involve constant-pH simulation approaches that allow for simulations of bigger systems and with titration coordinates applied to all acid-ionisable residues, such as a recently developed method reported by Aho et al., 2022.
Perspectives on cryptic hotspot discovery in flaviviruses
Prophylactic or therapeutic approaches for fighting flaviviral disease that utilise vaccines or antibodies might be suboptimal because of the danger of ADE. Instead, antiviral drugs that could target the flaviviral life cycle could be an effective alternative. Specifically, a potential drug targeting the early stages of the life cycle would not have to be membrane-soluble as it would target the viral particle in the extracellular environment. Furthermore, inhibiting the fusion step would contribute to prevention of ADE, because it would pre-empt viral escape from the endosome prior to degradation. Hence, antiviral drugs would not only offer protection on their own, but could also act in concert with vaccines. The inhibitory action of antiviral drugs by binding to the β-OG pocket has already been demonstrated for a range of flaviviruses (de Wispelaere et al., 2018; Li et al., 2019), likely by preventing a hinge-like movement of the DI-DII interface necessary for formation of the fusogenic trimer. Similarly, targeting drug-like molecules to the ⍺ pocket could modulate charge cluster behaviour, effectively disrupting the pH-dependent dissociation of DI-DIII that is an essential step in the endosomal conformational change. Consistent with the cluster disruption and domain dissociation observed under low pH conditions, the surface area of the ⍺ pocket also increases with the net charge of the cluster (Figure 6—figure supplement 4). A drug molecule binding inside the pocket and stabilising it – both under neutral or low pH conditions – could conceivably prevent domain dissociation and pH-induced conformational changes of the viral envelope.
His144 is the most accessible to benzene probes in all examined flaviviral serotypes and thus likely to be a primary interaction moiety for potential antiviral drugs (Figure 6—figure supplement 5A). Our simulations show that benzene interactions with His144 also lead to an increased disorder within the cluster (Figure 6—figure supplement 5), which is an effect that would need to be considered in the context of drug design. Nonetheless, the proximity of other cluster residues suggests that all of them have the potential to play a role in drug-binding and in turn be modulated in their response to a low pH environment.
Conclusions
Our simulation studies of DENV1-4, YFV and Zika rafts provide a detailed perspective of the dynamic behaviour of flaviviral envelopes in the presence of benzene. We elaborated on a wider context of shared hotspot characteristics and potentially druggable cryptic pockets that could be utilised in the inhibition of the early stages of the viral life cycle. We have demonstrated that conserved surfaces share not only residue identity, but also similar patterns of benzene binding and increased SASA, highlighting the importance of residue conservation in the context of detection of shared binding hotspots. We also provided a detailed view of the β-OG pocket behaviour in different flaviviral species that exhibited a high degree of serotype specificity when examined for pocket opening events.
Furthermore, the ⍺ pocket was mapped across flaviviral serotypes and revealed a relatively consistent profile in terms of conservation, size and opening kinetics. It contained a cluster of conserved ionisable residues, and its position at the domain interfaces prompted further exploration of its role in pH-dependent conformational changes using constant-pH simulations. The conserved cluster displayed increased disorder at lower pH values and interdomain dissociation (a prerequisite for the formation of the fusogenic trimer). Protonation-driven disruption of the DI-DIII interface suggests its critical role in the endosomal pH-dependent conformational change. We also hypothesise that the cluster as a whole – as opposed to the usually considered histidine residues only – plays a role in the conformational shift, corroborating previous experimental observations that histidines alone might not be the only pH-sensing residues responsible for the orchestration of the pH-dependent conformational switch. Collectively, our findings provide a link between the sequence-dependent dynamic behaviour of hotspots and the mechanistic overview of their functional roles, and offer a comprehensive foundation for rational drug design targeting the envelope of flaviviruses.
Methods
Benzene mapping of flaviviral rafts
Raft modelling
A raft encompasses a large surface area of the viral envelope and therefore exhibits a distinct curvature in its quaternary structure, which manifests itself in angled contacts between neighbouring E proteins (Figure 1B). In comparison, crystal structures of ectodomain dimers (Modis et al., 2003; Lu et al., 2019; Dai et al., 2016) are invariably assembled outside the confines of a viral envelope, resulting in a ‘flat’ conformation of the dimeric structure. Such crystal structures are thus unable to fully replicate native interchain contacts present in the spherical particle, highlighting the necessity of using cryo-electron microscopy (cryo-EM) structures to produce a reliable raft assembly. Currently, there are only two published cryo-EM structures of the mature DENV particle at atomic resolution, both of which are of the DENV2 serotype (Zhang et al., 2013a; Renner et al., 2021). We selected a raft segment from a viral envelope cryo-EM structure solved at 3.6 Å resolution (PDB code: 3J27) and used it as a model for the DENV2 raft. We then created the remaining DENV serotypes (DENV1, 3, and 4) by mutating individual residues in Modeller 9.19 (Sali and Blundell, 1993), based on the DENV2 raft template. Sequence differences between YFV and DENV were deemed too great to create a reliable model of YFV based on the DENV2 template alone (Figure 3—figure supplement 1B). Instead, we used an existing crystal structure of the YFV ectodomain (PDB code: 6IW4; Lu et al., 2019) and modelled the missing loops (residues 269–272) with Modeller 9.19. Next, we used homology modelling based on the DENV2 template to create the missing TM regions and M proteins. Finally, the chains were aligned with the cryo-EM-derived DENV raft to reproduce the quaternary arrangement and curvature of a spherical viral particle. The Zika raft structure was taken from the existing cryo-EM Zika virus structure (PDB code: 6CO8) solved at 3.1 Å resolution10.
Protein rafts were embedded within a membrane (see below) and equilibrated during a series of simulations with position restraints of gradually decreasing force constants applied to the stem and TM regions of the proteins. Position restraints on the TM regions were necessary to retain the curvature of the raft. All structures were initially equilibrated for 200 ns with position restraints on the backbone atoms of stem and TM regions (E protein residues: 396–495; M protein residues: 21–72; numbering scheme corresponds to DENV2) using a force constant of 1000 kJ mol–1 nm–2, followed by another 200 ns equilibration with weaker position restraints (Fc = 50 kJ mol–1 nm–2) on the TM region Cα atoms only (E protein residues: 450–495; M protein residues: 39–72; numbering scheme corresponds to DENV2).
Membrane modelling
We used CHARMM-GUI (Sunhwan et al., 2008; Wu et al., 2015) to embed the DENV2 raft in a membrane containing 1-palmitoyl-2-oleoyl-phosphatidylcholine (POPC), 1-palmitoyl-2-oleoyl-phosphatidylethanolamine (POPE), and 1-palmitoyl-2-oleoyl-phosphatidylserine (POPS) lipids in 6:3:1 ratio, which reflected a simplified membrane composition of flaviviral envelopes expressed in a C6/36 mosquito cell line (Perera et al., 2012; Zhang et al., 2012). The produced membrane was initially modelled as a flat sheet, which resulted in lateral raft dimers being incorrectly submerged in the bilayer. In order to induce membrane curvature, we simulated the raft-membrane system with position restraints applied on all backbone atoms of the raft. Within ~400 ns of the simulation, the membrane gradually curved and accommodated the shape of the raft (Figure 1—figure supplement 1), with the TM regions transversing the bilayer and the amphipathic stem helices localising between hydrophilic headgroup and hydrophobic tail regions of the membrane (Figure 1B). In total, the membrane was equilibrated for 785 ns of unrestrained simulation time.
Benzene mapping simulation setup
The strategy using benzene probes for mapping pockets in membrane-bound proteins was adapted from our previous work (Zuzic et al., 2020). A massless virtual site was placed at the centre of mass of a benzene molecule, which was used as a point of repulsion between hydrophobic benzene molecules so as to prevent aggregation. Additionally, the same site was used as a point of repulsion between benzene probes and membrane lipids in order to hinder benzene sequestration into the bilayer. Benzene probes were added to a simulation box at 0.6 M concentration (1148 benzene molecules per system). We tested this concentration for signs of protein unfolding or probe aggregation and found no evidence of either process occurring in our simulation systems (Figure 2B). The raft-membrane system containing benzene was solvated in TIP3P water (Jorgensen et al., 1983) and neutralised with NaCl, to a final salt concentration of 0.15 M.
Benzene mapping simulation protocol
All simulations were performed with the CHARMM36m force field (Huang and MacKerell, 2013) and in Gromacs 2018.2 simulation software (Abraham et al., 2015). Initially, systems were minimized using the steepest descent algorithm until forces converged or reached a maximum value of 100 kJ mol–1 nm–1. Temperature was set to 310 K and regulated using a velocity-rescaling thermostat (Bussi et al., 2007) with a time constant of 0.1 ps and applied on two coupled groups, one containing proteins and lipids, and the other, solute components. The Parrinello-Rahman barostat (Parrinello and Rahman, 1981) with semi-isotropic coupling and with a time constant of 2 ps was used to maintain a pressure of 1 atm. The compressibility was set to 4.5×10–5 bar–1 for both x/y and z directions. The leapfrog algorithm integration step was set to 2 fs, while the LINCS constraint algorithm (Hess et al., 1997) was applied to all bonds associated with hydrogen atoms in the system. Computation of long-range electrostatic interactions was performed using the particle mesh Ewald (PME) method (Darden et al., 1993; Essmann et al., 1995), with a 1.2 nm cutoff for real-space interactions and with the integration of a Coulomb potential-shift modifier. The cutoff value for Van der Waals interactions was set to 1.2 nm, with a force-switch modifier applied at 1.0 nm.
The system was run in the canonical (NVT) ensemble for 1 ns and with strong position restraints (Fc = 1000 kJ mol–1 nm–2) applied in all three dimensions to heavy atoms of both the protein and membrane components of the system. A subsequent 5-ns NPT equilibration involved position restraints on protein heavy atoms only, while the membrane was allowed to relax unrestrained. Finally, a production run was performed for 300 ns, with weak position restraints (Fc = 50 kJ mol–1 nm–2) applied to the TM region Cα atoms of the proteins (E protein residues: 450–495; M protein residues: 39–72; in DENV2 numbering system) in order to retain the curvature in the membrane model. A summary of all simulations and the corresponding number of repeats is listed in Table 1. In total, 14.4 µs of production runs were generated.
Flaviviral envelope raft simulations in presence or absence of benzene in the simulation system.
Strain | Solvent* | Time / ns | Repeats |
---|---|---|---|
DENV1-PVP159 | water | 300 | 3 |
DENV1-PVP159 | 0.6 M benzene | 300 | 5 |
DENV2-NGC | water | 300 | 3 |
DENV2-NGC | 0.6 M benzene | 300 | 5 |
DENV3-H87 | water | 300 | 3 |
DENV3-H87 | 0.6 M benzene | 300 | 5 |
DENV4-H241 | water | 300 | 3 |
DENV4-H241 | 0.6 M benzene | 300 | 5 |
YFV-17D | water | 300 | 3 |
YFV-17D | 0.6 M benzene | 300 | 5 |
ZIKA-H/PF/2013 | water | 300 | 3 |
ZIKA-H/PF/2013 | 0.6 M benzene | 300 | 5 |
-
*
Both types of solvent contained charge-neutralising 0.15 M NaCl.
Constant-pH MD simulations on DENV2 soluble E protein dimer
For constant-pH MD simulations we used a soluble portion of the E protein (sE) in a dimeric form (residues 1–395). The structure of the sE was taken from the DENV2 cryo-EM structure (PDB: 3J27) (Zhang et al., 2013a). As the E protein was truncated for the purposes of simulation, we capped the C-terminus with an N-methyl amide (NME) group. Residues His27, Asp42, His144, His244, His317, and Glu368 in both chains were selected to titrate. His282, although present in the structure and implicated in pH-dependent conformational change, was excluded from the selection as it is natively interacting with the stem region (not present in this setup) and was therefore in a biologically irrelevant environment. Native disulfide bonds were included in the structure. The sE dimer was placed in an octahedron box and solvated with ~84,000 TIP3P water molecules (Jorgensen et al., 1983) and neutralising 0.15 M NaCl.
The simulations were performed with the Amber ff99SB force field (Hornak et al., 2006) modified to include additional parameters relevant for constant-pH. Bond radii of Asp and Glu residues were reduced as carboxylate oxygens were defined with two hydrogens on each protonation site. All constant-pH simulations were performed using the Amber20 simulation software (Case et al., 2005). The system was initially minimised using a combination of steepest descent and conjugate gradient algorithms performed in 5,000 cycles and with protein backbone position restraints defined with a force constant of 4184 kJ mol–1nm–2 (10 kcal mol–1Å–2). The system was then gradually heated to 310 K over 400 ps by applying Langevin thermostat with a collision frequency of 5.0 ps–1. The NPT equilibration step was performed for 4 ns with Langevin dynamics maintaining a constant pressure of 1 atm defined by a collision frequency of 1.0 ps–1. The production steps of constant-pH simulations were run under 10 pH conditions across the scale of pH = 0, 1, 2, … 9. Production runs were simulated for 22 ns each in the NPT ensemble, with attempted protonation state changes every 100 simulation steps (0.2 ps). Protonation attempts were performed by Monte Carlo sampling (Mongan et al., 2004) based on transition free energies derived from generalised Born electrostatics (Bashford and Case, 2000) and with the Metropolis criterion used to determine if the protonation change should be accepted or not. All simulations were run with a 2 fs integration step, applying a SHAKE constraint algorithm (Ryckaert et al., 1977) to all bonds involving hydrogen atoms. Short-range electrostatic and Van der Waals interaction cutoffs were both set to 0.8 nm. Long-range electrostatics interactions were calculated using the PME method (Darden et al., 1993; Essmann et al., 1995).
Constant-pH simulations were run in quadruplicates, resulting in a total production time of 880 ns.
Analysis procedures
Sequence analysis
Flaviviral polyprotein sequences were accessed from GenBank using following accession codes: DENV1 strain PVP159: AEM92304.1; DENV2 strain NGS-C: P14340.2; DENV3 strain H87: P27915.1; DENV4 strain H241: Q58HT7.1; YFV strain 17D: P03314.1; Zika strain H/PF/2013: A0A024B7W1.1. Sequence similarity was determined using the Sequence Identity and Similarity (SIAS) software. Multiple sequence alignment of the E and M proteins was performed using MUSCLE (Edgar, 2004). Alignment visualisation and generation of the neighbour-joining (NJ) tree was performed using Jalview 2.11.1.4 (Waterhouse et al., 2009). Conservation assessment and scoring was carried out using the Consurf Server (Ashkenazy et al., 2016), where a multiple sequence alignment was generated using the HMMER homolog search algorithm in the UNIREF-90 database with an E-value cutoff of 0.0001.
MD simulations analysis
After removing the position restraints from benzene probes, the system required additional equilibration time during which benzene established interactions with protein surfaces. We therefore excluded the first 50 ns of all production runs from the analysis. Simulation properties, including solvent-accessible surface area (SASA) (Eisenhaber et al., 1995), secondary structure content (Kabsch and Sander, 1983), radius of gyration, select residue distances, root-mean-square deviations (RMSDs) and root-mean-square fluctuations (RMSFs), were analysed using Gromacs 2018.2 analysis tools (Abraham et al., 2015). Contact distances between benzene and residues of interest were calculated using VMD 1.9.3 (Humphrey et al., 1996). In order to avoid counting contacts with the β-OG pocket residues that occur when the probes are outside the binding site, we counted only benzene proximal to at least two oppositely positioned pocket residues. The initial detection of protein pockets was performed with the MDpocket software (Schmidtke et al., 2011). Correlation of raft pocket densities across serotypes was calculated using the Fit in Map tool in ChimeraX 1.0 (Pettersen et al., 2021).
Protonation states of all titrated residues were extracted using the chpstats tool implemented in AmberTools21 (Case et al., 2005). pKa predictions of titratable residues based on a static structure were calculated on the DENV2 E proteins (PDB: 3J27) (Zhang et al., 2013a) in an asymmetric unit using the Finite difference Poisson-Boltzmann/Debye-Hückel (FD/DH) method (Warwicker, 2004).
Statistical analysis
All residue-based analyses across serotypes were performed after alignment, ensuring that the residues corresponded to their shared evolutionary origin. All correlation analyses were calculated with a non-parametric Spearman’s rank correlation method. β-OG pocket clustering was performed with a k-means clustering algorithm (k=2) based on the measurements of pocket SASA and the number of contacts established between all the residues lining the pocket. All pairwise significance levels were determined with a Wilcoxon signed-rank test used in comparison of two non-parametric samples. Cluster analysis of the charge cluster residues, applied to all atoms within the group, was based on the gromos algorithm (Daura et al., 1999) implemented in Gromacs with a specified cutoff value of 0.26 nm. Principal component analysis (PCA) of the charge cluster descriptors – all combinations of intra-cluster distances – was performed on a scaled and centered sample containing all structures across pH values, chains, and constant-pH simulation repeats. Visualisation of the original variable axes used the ggbiplot package in R 3.6.3. Additional data processing, analysis and visualisation were carried out in R 3.6.3.
Data availability
The full simulation trajectories have not been uploaded to a sharing server due to the size of the datasets (several terabytes of data in total). However, we have uploaded all simulation setup data such that a researcher could re-run our calculations (i.e. coordinates including initial and final frames, parameters, checkpoints, and portable binary run input files, for a total of 88 simulation systems) at Zenodo: https://doi.org/10.5281/zenodo.7037906. More extensive simulation trajectory files will be made available upon request to the corresponding author, without the need to e.g. submit a project proposal. There will be no restrictions on who can access this data. Code/software used to analyse the data are freely available and in the public domain.
References
-
Functional entry of dengue virus into Aedes albopictus mosquito cells is dependent on clathrin-mediated endocytosisThe Journal of General Virology 89:474–484.https://doi.org/10.1099/vir.0.83357-0
-
Scalable constant pH molecular dynamics in GROMACSJournal of Chemical Theory and Computation 18:6148–6160.https://doi.org/10.1021/acs.jctc.2c00516
-
Generalized born models of macromolecular solvation effectsAnnual Review of Physical Chemistry 51:129–152.https://doi.org/10.1146/annurev.physchem.51.1.129
-
Broad-Spectrum agents for flaviviral infections: dengue, Zika and beyondNature Reviews. Drug Discovery 16:565–586.https://doi.org/10.1038/nrd.2017.33
-
Refining the global spatial limits of dengue virus transmission by evidence-based consensusPLOS Neglected Tropical Diseases 6:e1760.https://doi.org/10.1371/journal.pntd.0001760
-
Canonical sampling through velocity rescalingThe Journal of Chemical Physics 126:014101.https://doi.org/10.1063/1.2408420
-
The amber biomolecular simulation programsJournal of Computational Chemistry 26:1668–1688.https://doi.org/10.1002/jcc.20290
-
Structure-Based maximal affinity model predicts small-molecule druggabilityNature Biotechnology 25:71–75.https://doi.org/10.1038/nbt1273
-
Particle mesh ewald: an N·log (N) method for ewald sums in large systemsThe Journal of Chemical Physics 98:12.https://doi.org/10.1063/1.464397
-
Peptide folding: when simulation meets experimentAngewandte Chemie International Edition 38:236–240.https://doi.org/10.1002/(SICI)1521-3773(19990115)38:1/2<236::AID-ANIE236>3.0.CO;2-M
-
Unraveling hot spots in binding interfaces: progress and challengesCurrent Opinion in Structural Biology 12:14–20.https://doi.org/10.1016/s0959-440x(02)00283-x
-
Inhibition of flaviviruses by targeting a conserved pocket on the viral envelope proteinCell Chemical Biology 25:1006–1016.https://doi.org/10.1016/j.chembiol.2018.05.011
-
A smooth particle mesh ewald methodThe Journal of Chemical Physics 103:8577–8593.https://doi.org/10.1063/1.470117
-
Structural changes in Dengue virus when exposed to a temperature of 37 CJournal of Virology 87:7585–7592.https://doi.org/10.1128/JVI.00757-13
-
Identification of specific histidines as pH sensors in flavivirus membrane fusionThe Journal of Cell Biology 183:353–361.https://doi.org/10.1083/jcb.200806081
-
Phylogenetic relationships of flaviviruses correlate with their epidemiologyThe Journal of General Virology 82:88.https://doi.org/10.1099/0022-1317-82-8-1867
-
Dengue haemorrhagic fever: diagnosis, treatment, prevention and controlTransactions of the Royal Society of Tropical Medicine and Hygiene 92:470.https://doi.org/10.1016/S0035-9203(98)91100-2
-
Computational fragment-based binding site identification by ligand competitive saturationPLOS Computational Biology 5:e1000435.https://doi.org/10.1371/journal.pcbi.1000435
-
A recombinant live attenuated tetravalent vaccine for the prevention of dengueExpert Review of Vaccines 16:1–13.https://doi.org/10.1080/14760584.2017.1335201
-
N-Linked glycans on dengue viruses grown in mammalian and insect cellsJournal of General Virology 90:2097–2106.https://doi.org/10.1099/vir.0.012120-0
-
In vivo enhancement of dengue virus infection in rhesus monkeys by passively transferred antibodyJournal of Infectious Diseases 140:527–533.https://doi.org/10.1093/infdis/140.4.527
-
LINCS: a linear constraint solver for molecular simulationsJournal of Computational Chemistry 18:1463–1472.https://doi.org/10.1002/(SICI)1096-987X(199709)18:12<1463::AID-JCC4>3.0.CO;2-H
-
Charmm36 all-atom additive protein force field: Validation based on comparison to NMR dataJournal of Computational Chemistry 34:2135–2145.https://doi.org/10.1002/jcc.23354
-
VMD: visual molecular dynamicsJournal of Molecular Graphics 14:33–38.https://doi.org/10.1016/0263-7855(96)00018-5
-
Comparison of simple potential functions for simulating liquid waterThe Journal of Chemical Physics 79:926–935.https://doi.org/10.1063/1.445869
-
Influence of PR-M cleavage on the heterogeneity of extracellular Dengue virus particlesJournal of Virology 84:8353–8358.https://doi.org/10.1128/JVI.00696-10
-
Crystal structure of West Nile virus envelope glycoprotein reveals viral surface epitopesJournal of Virology 80:11000–11008.https://doi.org/10.1128/JVI.01735-06
-
Automatic and efficient decomposition of two-dimensional structures of small molecules for fragment-based high-throughput dockingJournal of Medicinal Chemistry 49:7384–7392.https://doi.org/10.1021/jm060838i
-
Investigating cryptic binding sites by molecular dynamics simulationsAccounts of Chemical Research 53:654–661.https://doi.org/10.1021/acs.accounts.9b00613
-
Conformational changes in intact Dengue virus reveal Serotype-specific expansionNature Communications 8:14339.https://doi.org/10.1038/ncomms14339
-
Crystal structure of the Japanese encephalitis virus envelope proteinJournal of Virology 86:2337–2346.https://doi.org/10.1128/JVI.06072-11
-
Locating and characterizing binding sites on proteinsNature Biotechnology 14:595–599.https://doi.org/10.1038/nbt0596-595
-
Constant pH molecular dynamics in generalized born implicit solventJournal of Computational Chemistry 25:2038–2048.https://doi.org/10.1002/jcc.20139
-
Structural perspectives of antibody-dependent enhancement of infection of dengue virusCurrent Opinion in Virology 36:1–8.https://doi.org/10.1016/j.coviro.2019.02.002
-
Crystal structure of the West Nile virus envelope glycoproteinJournal of Virology 80:11467–11474.https://doi.org/10.1128/JVI.01125-06
-
Understanding cryptic pocket formation in protein targets by enhanced sampling simulationsJournal of the American Chemical Society 138:14257–14263.https://doi.org/10.1021/jacs.6b05425
-
Polymorphic transitions in single crystals: a new molecular dynamics methodJournal of Applied Physics 52:7182–7190.https://doi.org/10.1063/1.328693
-
Low PH-induced cell fusion in Flavivirus-infected Aedes Albopictus cell culturesThe Journal of General Virology 71 ( Pt 8):1845–1850.https://doi.org/10.1099/0022-1317-71-8-1845
-
Numerical integration of the Cartesian equations of motion of a system with constraints: molecular dynamics of n-alkanesJournal of Computational Physics 23:327–341.https://doi.org/10.1016/0021-9991(77)90098-5
-
Comparative protein modelling by satisfaction of spatial restraintsJournal of Molecular Biology 234:779–815.https://doi.org/10.1006/jmbi.1993.1626
-
Cosolvent-enhanced sampling and Unbiased identification of cryptic pockets suitable for structure-based drug designJournal of Chemical Theory and Computation 15:3331–3343.https://doi.org/10.1021/acs.jctc.8b01295
-
A high-throughput method for detection of drug binding sitesJournal of Medicinal Chemistry 53:574.https://doi.org/10.1021/jm100574m
-
Molecular mechanisms of flavivirus membrane fusionAmino Acids 41:1159–1163.https://doi.org/10.1007/s00726-009-0370-4
-
CHARMM-GUI: A web-based graphical user interface for CHARMMJournal of Computational Chemistry 29:1859–1865.https://doi.org/10.1002/jcc
-
Benzene probes in molecular Dynamics Simulations reveal novel binding sites for ligand designThe Journal of Physical Chemistry Letters 7:3452–3457.https://doi.org/10.1021/acs.jpclett.6b01525
-
CHARMM-GUI membrane builder toward realistic biological membrane SimulationsJournal of Computational Chemistry 35:1997–2004.https://doi.org/10.1002/jcc.23702.CHARMM-GUI
-
Relationship between hot spot residues and ligand binding hot spots in protein–protein interfacesJournal of Chemical Information and Modeling 52:2236–2244.https://doi.org/10.1021/ci300175u
-
The stem region of premembrane protein plays an important role in the virus surface protein rearrangement during dengue maturationJournal of Biological Chemistry 287:40525–40534.https://doi.org/10.1074/jbc.M112.384446
-
Cryo-Em structure of the mature dengue virus at 3.5-å resolutionNature Structural & Molecular Biology 20:105–110.https://doi.org/10.1038/nsmb.2463
-
A benzene-mapping approach for Uncovering cryptic pockets in membrane-bound proteinsJournal of Chemical Theory and Computation 16:5948–5959.https://doi.org/10.1021/acs.jctc.0c00370
Decision letter
-
Qiang CuiReviewing Editor; Boston University, United States
-
Detlef WeigelSenior Editor; Max Planck Institute for Biology Tübingen, Germany
-
Jana ShenReviewer; University of Maryland, Baltimore, United States
-
Mikael LundReviewer; Lund University, Sweden
Our editorial process produces two outputs: (i) public reviews designed to be posted alongside the preprint for the benefit of readers; (ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.
Decision letter after peer review:
Thank you for submitting your article "A pH-dependent cluster of charges in a conserved cryptic pocket on flaviviral envelopes" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, one of whom is a member of our Board of Reviewing Editors, and the evaluation has been overseen José Faraldo-Gómez as the Senior Editor. The following individuals involved in review of your submission have agreed to reveal their identity: Jana Shen (Reviewer #2); Mikael Lund (Reviewer #3).
The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.
Essential revisions:
1) A more thorough analysis of the constant pH simulations, especially whether the cryptic binding site(s) identified with fixed-protonation-state simulations is (are) likely to change when pH dependence is included. Analysis of charge capacitance is also recommended.
2) Further clarification of some methodological details, such as the treatment of membrane.
Reviewer #1 (Recommendations for the authors):
Overall, the study has been conducted and analyzed rather carefully. I have only a few questions.
1. Glycan has been shown to be important in several recent SARs-Cov-2 spike protein simulations. The authors didn't include the glycan component explicitly. It will be useful to further comment on potential technical challenges that motivated this approximation and its limitations.
2. Many viral envelop proteins (e.g. the HIV) contain a rather high concentration of cholesterol, which appears to play a major role in modulating not only the membrane property but also the structure and assembly of the envelope proteins. The authors didn't include cholesterol in their model. It will be useful to comment on the potential implications.
3. The cryptic binding sites were characterized mainly by the SASA values. Are there other characterizations that might be more relevant for future drug design, such as the volume of the cavity?
Reviewer #2 (Recommendations for the authors):
– A major weakness of the paper appears to be the flawed design of the simulation. The benzene mapping simulations were conducted using fixed protonations which are not specified in the paper, and thus they do not address any pH response. Why wouldn't the authors conduct the constant pH simulations first to determine protonation states and then use these protonation states to conduct conventional MD with benzenes?
– Another major weakness is that the conclusions are not very clear at least in my assessment. The authors claimed the α pocket to the cryptic pocket, but this conclusion can not be discerned from the presented figures. In fact, in Figure 2, 3 and others, the y values of the region around residue 144 is discontinuous. Not sure why. The conclusion of the constant pH part is very vague, and I can't understand what it is exactly.
– Overall, there are too many plots that are tangentially relevant. I suggest they can be moved to SI and instead focus the plots on the data that support the conclusions.
– It is unclear what exactly the histidine switch hypothesis is. The discussion needs to be more specific. Related to this, it is unclear how the determined protonation states address the hypothesis.
– The convergence of protonation state sampling should be included in the SI.
Reviewer #3 (Recommendations for the authors):– p5, line 32. One or two commas would elevate readability– p6, line 4. I wonder if "mixed-solvent" is the right term here. The benzene concentration is 0.6 M and is, compared to >50 M water, rather a co-solute.
– Membrane curvature: Curvature is imposed by MD constraints as detailed in the Method section. I understand that the membrane is not the focus of the analysis, but how does this curvature co-exist with the PBC of the simulation box? The scheme implies that the membrane + raft is surrounded by replicas.
– Figure 2A: Why do the +bnz plots initially drift? If part of the equilibration, I would have expected them to start at the -bnz levels.
– Figure 4C: It's not obvious how to interpret the "violin" plots: the distributions have no scale; and showing two mirrored halves seems redundant. The same comment applies to Figure 6A.
p13-14. Here I suggest revising the use of "rate(s)" as it took me a while to realise that the discussion is not about kinetics, but statics. I suspect that to many readers, "rate" would imply a dynamic property. Perhaps "ratio", "quotient", or "fraction" could be alternatives.
– Figure 6: This plot is packed with information and is used to support a detailed discussion about the role of the found charge cluster. I think that it works well. At pH is data in Fig6a obtained? I understand that the PCA is based on the indicated residue-residue distances. Another way to describe charge-charge interactions is via an electric multipole expansion of the cluster charges. Perhaps an analysis involving the dipole and quadrupole moment could be revealing. Merely a thought.
– Cluster response to pH changes. Figure 6a/b analyses the net-charge of the cluster which probes the overall protonation state. To judge how a pH change would affect the cluster, the charge capacitance, C=-^2 should be straightforward to extract. Measuring the charge fluctuations, it can be directly linked to a charge response due to a pH change. See e.g. doi:10.1017/S003358351300005X. Along the same lines, I wonder of fluctuations in SASA or Rg would reveal information of how easily the cluster is perturbed. Finally, I would have liked to see how the PCA would change in the presence of benzene.
p22, line 19: I think it would be useful to know exactly how many benzene molecule. Is it sufficient to saturate the protein surfaces? Or is there a deficit which could affect the number of observed contacts.
p22, Section 4.1.1. I would prefer to have slightly more information about the setup: "semi-isotropic" could be more specific as well as details about update intervals of the barostat and thermostat.
p24, line 17: Does this mean that solvent and other solutes are not part of the (de)protonation acceptance criterion? That is, is all explicit solvent replaced by a continuum? If so, could one not use a much cheaper constant pH scheme for conformational sampling?
p26, line 13: I much appreciate that the authors have made the effort to deposit the electronic material on Zenodo. This is a nice and very helpful gesture to the community!
https://doi.org/10.7554/eLife.82447.sa1Author response
Essential revisions:
1) A more thorough analysis of the constant pH simulations, especially whether the cryptic binding site(s) identified with fixed-protonation-state simulations is (are) likely to change when pH dependence is included. Analysis of charge capacitance is also recommended.
We conducted a more detailed examination of cryptic pocket behaviour under titration regime of constant-pH simulations and included it in section 2.7 (Perspectives on cryptic hotspot discovery in flaviviruses), supported by Figure 6 —figure supplement 4. We also assessed the convergence of protonation state sampling in Figure 6 —figure supplement 1.
Although we examined using charge capacitance to further elaborate on the effect of the charge cluster protonation on protein conformation, we believe that this analysis would not be appropriate for our specific case. Proton-exchange in the used constant-pH simulation method was allowed only for small number of residues, so the changes in protonation states across the pH values do not accurately describe the changes in the overall charge. We provide more details on this topic in response to the reviewer below.
2) Further clarification of some methodological details, such as the treatment of membrane.
We added clarifications to the Methods section: namely, we specified position-restrained residues in the TM domain (4.1.1 Raft modelling); we included the number of benzene molecules used in our simulation systems (4.1.3 Benzene mapping simulation setup); we also added more details about the used barostat (4.1.4 Benzene mapping simulation protocol). Finally, we reasoned about our chosen system setup (regarding glycosylation and membrane composition) in the 2.6 Limitations section. There we also reflected on the possible limitations of the chosen constant-pH simulation method. Further details are provided below in response to the reviewers.
Reviewer #1 (Recommendations for the authors):
Overall, the study has been conducted and analyzed rather carefully. I have only a few questions.
We thank the reviewer for their positive comments, and appreciate the opportunity to reply to the questions/suggestions below.
1. Glycan has been shown to be important in several recent SARs-Cov-2 spike protein simulations. The authors didn't include the glycan component explicitly. It will be useful to further comment on potential technical challenges that motivated this approximation and its limitations.
We thank the reviewer for highlighting the role of glycans in the viral envelopes. It is indeed one of the key components of the envelope, and its omission from the model was due to challenges in using the benzene mapping in the presence of glycans. At that point, the applicability of the benzene flooding approach had been verified only against systems containing protein and membrane components, but not with glycans (since then, we have successfully applied this method to the glycosylated SARS-CoV-2 spike protein (Zuzic et al., 2022: https://doi.org/10.1016/j.str.2022.05.006)). We were also concerned that the addition of glycans would hinder benzene access to the tightly-packed protein surface, necessitate an even bigger simulation box and, consequently, require more benzene probes, which would result in an unmanageable amount of exclusions between all probe molecules. Motivated by this important comment, we have now included a “Limitations” section in our paper and elaborated on the reasoning behind using the unglycosylated form of the E protein.
2. Many viral envelop proteins (e.g. the HIV) contain a rather high concentration of cholesterol, which appears to play a major role in modulating not only the membrane property but also the structure and assembly of the envelope proteins. The authors didn't include cholesterol in their model. It will be useful to comment on the potential implications.
The reviewer makes an excellent point on the weaknesses of the modelled membrane, and we acknowledge the potential drawbacks of our chosen membrane composition. The used lipid composition data was based on the lipidomics analysis of DENV2 expressed in C6/36 mosquito cells (Zhang et al., 2012: https://doi.org/10.1074/jbc.M112.384446), but this research failed to address lipid types other than phospholipids. Other studies have highlighted the abundance of sphingolipids in West Nile virus (Martin-Acebes et al., 2014: https://doi.org/10.1128/JVI.02061-14) and the critical role of cholesterol for DENV infection in mammalian cells (Carro and Damonte, 2013: https://doi.org/10.1016/j.virusres.2013.03.005). Additionally, ordered lipid densities (possibly PE) have been observed in the cryo-EM structure of Spondweni, Zika, and DENV2 envelope (DiNunno et al., 2020: https://doi.org/10.1038/s41467-020-18747-4), and the conservation of the pocket across flaviviral species suggests functional relevance of the bound lipid, possibly linked to envelope stabilisation.
In the benzene mapping segment of this research, the membrane predominantly served to provide a hydrophobic medium for the TM helices, shielding the TMs and the underside of the proteins from interacting with the solvent molecules. Considering that the main aim of the benzene mapping was to examine transient cryptic pockets accessible from the viral surface, we put forward that the membrane appropriately served its purpose in this case.
Nevertheless, in light of the key point highlighted by the reviewer, we have also included a paragraph addressing the membrane composition under the 2.6 Limitations section.
3. The cryptic binding sites were characterized mainly by the SASA values. Are there other characterizations that might be more relevant for future drug design, such as the volume of the cavity?
The volume measurements of the cryptic sites can be problematic due to the fact that: (i) they are highly dynamic and transient in nature, (ii) surface edges of both observed pockets are defined by flexible loops, the positions of which can excessively affect the value of the reported volume; (iii) the pocket-mapping algorithms that are applicable to simulation trajectories are for the most part imprecise, as the pocket coordinates and boundaries are insufficiently well defined. Specifically, pocket-mapping algorithms rely on extrinsic properties — namely, 3D coordinates within a simulation box — to define a pocket of interest and to calculate its volume. This makes the alignment of the pocket (essentially, an “empty space”) within and across trajectories a non-trivial task, particularly taking into consideration the highly dynamic properties of cryptic pockets. Additionally, defining the pocket edges towards the surface is arbitrary and for the most part requires human input to filter excessively mapped or “spilled” pocket edges. In practice, this is a noteworthy issue when attempting to make a meaningful comparison between different simulation systems.
Considering the factors above, we refrained from reporting on volume measurements in this paper, as we deemed them to be too imprecise. Instead, we opted for a more comparable measure of the pocket expanse that is SASA, which circumvents some of these issues by using the measurement relating to the “filled” space (surface areas of the residues lining the pocket). Previously, we reported the volume of the ⍺ pocket (2.20 ± 0.51 nm3) based on the smaller DENV2 E2M2 model (Zuzic et al., 2020: https://doi.org/10.1021/acs.jctc.0c00370).
Reviewer #2 (Recommendations for the authors):
– A major weakness of the paper appears to be the flawed design of the simulation. The benzene mapping simulations were conducted using fixed protonations which are not specified in the paper, and thus they do not address any pH response. Why wouldn't the authors conduct the constant pH simulations first to determine protonation states and then use these protonation states to conduct conventional MD with benzenes?
In fact, the initial aim of the study was entirely unmotivated from elucidating the pHdependent mechanism of conformational changes in flaviviral envelopes. Instead, our primary goal was to detect cryptic pockets on the flaviviral rafts, which is why we set out to utilise the benzene mapping method on the raft with fixed protonation states from the start. Since the aim was to discover envelope pockets that could potentially be a target prior to cellular entry, the ionization state of amino acids were set according to the neutral pH of the extracellular environment.
Only after the benzene mapping study was conducted, we observed the unusual cluster of conserved residues in the α pocket that merited further investigation. Therefore, the pH-dependence became relevant post factum, which then motivated our constant-pH simulations to further probe the role of the discovered cluster.
– Another major weakness is that the conclusions are not very clear at least in my assessment. The authors claimed the α pocket to the cryptic pocket, but this conclusion can not be discerned from the presented figures. In fact, in Figure 2, 3 and others, the y values of the region around residue 144 is discontinuous. Not sure why. The conclusion of the constant pH part is very vague, and I can't understand what it is exactly.
The cryptic nature of the α pocket is presented in Figure 5 —figure supplement 1, where we compared the pocket SASA across simulations in the absence (-bnz) and presence of benzene (+bnz). This is also indicated in the main text (under section 2.4 The ⍺ pocket located on the domain interfaces is conserved and contains a buried cluster of charges).
The apparent discontinuities in the figures showing properties measured on a sequence residue-basis are due to the sequence alignment – since different flavivirus envelope proteins vary in length and/or contain deletions/insertions – thereby ensuring that the corresponding key residues are vertically aligned. We attempted to make that point clearer in the figure legends.
We thank the reviewer for pointing out the weaknesses in our explanations of the conclusions regarding constant pH simulations. We have now reformulated our conclusions (section 3) in order to make it clearer for the reader.
– Overall, there are too many plots that are tangentially relevant. I suggest they can be moved to SI and instead focus the plots on the data that support the conclusions.
Although we understand that the title mainly highlights exploration of the pH-dependence of the cluster of charges, the conclusions drawn from this research are multi-faceted, and require understanding of both the benzene-based cryptic pocket discovery, as well as the pH-dependent conformational effect of the charged cluster. The figures follow the “trajectory” of the conducted research and assume that the reader is not entirely familiar with the benzene mapping approach or its implications, which is why three out of six figures are dedicated to addressing this research segment.
We are suggesting that the figures are relevant for understanding the overall results of the research, including model features (Figure 1), benzene mapping method verification (Figure 2), effects of benzene on the raft surface (Figure 3), analysis of the positive control β-OG pocket (Figure 4), main features of the ⍺ pocket (Figure 5), and finally, the effect of pH on the charged cluster behaviour (Figure 6).
– It is unclear what exactly the histidine switch hypothesis is. The discussion needs to be more specific. Related to this, it is unclear how the determined protonation states address the hypothesis.
We thank the reviewer for highlighting this important omission of the histidine-switch hypothesis explanation. We have now included more details of the hypothesis in the introduction, and also reinstated it in the conclusion of section 2.5 (Disruption of the ⍺ pocket charge cluster at low pH) to hopefully make things clearer.
– The convergence of protonation state sampling should be included in the SI.
We are grateful to the reviewer for this important suggestion and we have now included the convergence of pKa values for all examined residues in the Supplementary Information (Figure 6 —figure supplement 1).
Reviewer #3 (Recommendations for the authors):
– p5, line 32. One or two commas would elevate readability
Indeed, the sentence was clumsily formulated. We have now split the statement into two parts.
– p6, line 4. I wonder if "mixed-solvent" is the right term here. The benzene concentration is 0.6 M and is, compared to >50 M water, rather a co-solute.
We thank the reviewer for the insightful comment as we had not considered the accuracy of the used term. Indeed, the hydrophobicity of benzene, as well as the ratio of the two components, would most accurately classify benzene as a co-solute. We have introduced this correction in p6, line 12. When addressing a wider group of pocket detection methods that are based on different solvent components (including, but not limited to benzene), we used the term “co-solvent” as a more frequently used and broader description of the approach.
– Membrane curvature: Curvature is imposed by MD constraints as detailed in the Method section. I understand that the membrane is not the focus of the analysis, but how does this curvature co-exist with the PBC of the simulation box? The scheme implies that the membrane + raft is surrounded by replicas.
This is the correct understanding of the simulation setup. The membrane exists in the PBC context, which means that it forms an undulating landscape. This is not ideal or physiologically relevant for the context of viral envelopes (although under certain dengue viral morphologies something like it has been observed), but the membrane analysis was not focused on any great detail, as it was not deemed sufficiently accurate. Instead, the curvature was induced by retaining position restraints on C⍺ atoms of the TM helices in order to preserve accurate inter-chain contacts and angles between envelope monomers, which are the key factors in sampling the external cryptic surface.
– Figure 2A: Why do the +bnz plots initially drift? If part of the equilibration, I would have expected them to start at the -bnz levels.
The first 50 ns that were excluded from the analysis were a part of the production run, during which the benzene probes were free to move and bind onto the protein surface. Therefore, SASA of the +bnz systems drifts towards higher values as compared to the -bnz systems. We opted to exclude this from our plot visualisation in Figure 2A and B to remain consistent with all the other conducted analyses (all of them excluded the first 50 ns of the production run).
– Figure 4C: It's not obvious how to interpret the "violin" plots: the distributions have no scale; and showing two mirrored halves seems redundant. The same comment applies to Figure 6A.
Perhaps the most convenient way to interpret violin plots is to imagine them as vertical density plots in which the area represents the abundance of data points at any given value of y. Unlike “standard” density plots, where the density is represented by an area that occupies only the positive values of the y-axis, violin plots utilise both positive and negative values of the central axis, giving them a symmetrical shape.
Figure 4C is a swarm plot, which means that individual measurements across frames are represented as non-overlapping points which, as a result, approximate a violin plot shape (but does not fully correspond to it). The area covered by data points is therefore affected by the number of sampled points. In 4C, +bnz systems are sampled more as compared to -bnz (they consist of 5 and 3 repeats, respectively), which also means that the total area covered by points in +bnz is bigger as compared to -bnz systems.
This is unlike “true” violin plots (such as Figure 6A), in which the number of samples is not directly correlated with the total area of the violin(s) at a given discrete value of x. In other words, even though the majority of the samples in Figure 6A belongs to a cluster of charge 0, the area of this violin plot is the same as the much less sampled state with cluster charge of +3.
In conclusion, we thank the reviewer for pointing out the issues with interpretability of these plots. We have therefore added clarifications in figure texts (both for Figure 4C and 6A).
p13-14. Here I suggest revising the use of "rate(s)" as it took me a while to realise that the discussion is not about kinetics, but statics. I suspect that to many readers, "rate" would imply a dynamic property. Perhaps "ratio", "quotient", or "fraction" could be alternatives.
We are grateful to the reviewer for this comment, as we believe that the suggested term modification improved the accuracy of our claims. Instead of using the term “rate”, we instead now refer to the “ratio of open/closed states” of the pocket.
– Figure 6: This plot is packed with information and is used to support a detailed discussion about the role of the found charge cluster. I think that it works well. At pH is data in Fig6a obtained? I understand that the PCA is based on the indicated residue-residue distances. Another way to describe charge-charge interactions is via an electric multipole expansion of the cluster charges. Perhaps an analysis involving the dipole and quadrupole moment could be revealing. Merely a thought.
The data in Figure 6A contains cumulative radius of gyration measurements across all simulated pH values (pH 0-9). We have now added this clarification in the Figure text.
We considered expanding our analyses by determining the dipole moment for the cluster, but we were hindered by some analysis implementation issues. Namely, it is unclear to us how to extract information about partial charge changes of our titratable residues from the constant-pH simulations (implemented in Amber). Calculating the dipole moments from simulations with fixed charges would be relatively easy, as the dipole moment would be (only) a function of atom positions. However, the constant-pH simulations allow for the changes of partial charges of atoms included in the titratable residue. As far as we understand, to calculate this we would require information about the fluctuations in partial charge for all relevant atoms and across all trajectory frames. Obtaining this information from the run simulations was, sadly, not obvious to us.
– Cluster response to pH changes. Figure 6a/b analyses the net-charge of the cluster which probes the overall protonation state. To judge how a pH change would affect the cluster, the charge capacitance, C=-^2 should be straightforward to extract. Measuring the charge fluctuations, it can be directly linked to a charge response due to a pH change. See e.g. doi:10.1017/S003358351300005X. Along the same lines, I wonder of fluctuations in SASA or Rg would reveal information of how easily the cluster is perturbed. Finally, I would have liked to see how the PCA would change in the presence of benzene.
We thank the reviewer for pointing out an interesting approach to quantifying molecular charge fluctuations expressed as a protein property of capacitance, which we have not considered before. We used the Eq. 12 from the linked article to calculate protein charge capacitance (Author response image 1), as our constant-pH simulations were run in a titration regime (pH 0-9).

sE protein dimer capacitance as a function of pH.
The charge fluctuation was allowed on twelve residues of the simulation system (six on each monomer): His27, Asp42, His144, His244, His317, Glu368.
However, we are reluctant to include this analysis in the publication, as the main limitation of the constant-pH method is the fact that the proton-exchange was allowed only for a limited number of titratable residues (as detailed in the Methods section 4.2. Constant-pH MD simulations on DENV2 soluble E protein dimer). This results in an incorrect picture of the net charge on the simulated protein, as only a selection of residues underwent titration, whereas the other residues were assigned a fixed protonation state. Furthermore, the envelope model used for constant-pH simulations was truncated (it contained only the soluble E protein dimer), which would also be a limitation in assessing the capacitance of a much larger multimer forming the viral envelope.
The constant-pH simulations are therefore useful for elaborating on the shorterrange effects of cluster residue (de)protonations and their effect on inter-domain stability. However, they cannot give us an accurate picture of the overall variance in mean charge, as it is a property that hinges upon the “titration regime” of the whole protein – or in this case, the viral envelope.
The mapping of cluster changes in the presence of benzene cannot be provided within the scope of this study, as benzene-flooding simulations were performed under the fixed protonation state regime only. We instead included an additional analysis of the ⍺ pocket, namely the mapping of the pocket surface area across different protonation states of the cluster (Figure 6 —figure supplement 4). As the constant-pH simulations were run with a reduced-size envelope model and only on DENV2, the absolute numbers of pocket SASA are not as informative as with the raft simulations (Figure 5 —figure supplement 1). Nonetheless, the general trend of pocket expansion is consistent with our observations of cluster disruption and domain dissociation under low-pH conditions. Using this information, we have further elaborated on the implications of the ⍺ pocket targeting by drug molecules and their potential mechanism of inhibition in Section 2.7 Perspectives on cryptic hotspot discovery in flaviviruses.
p22, line 19: I think it would be useful to know exactly how many benzene molecule. Is it sufficient to saturate the protein surfaces? Or is there a deficit which could affect the number of observed contacts.
We added the information about the number of added benzene molecules (1148 per system) in the manuscript under the section 4.1.3 Benzene mapping simulation setup.
Our best guess is that the saturation level was not reached with this number of benzenes (in preliminary studies of DENV2 rafts with 0.8 M benzene, we did not observe unfolding effects), but there was a concern that the addition of too many benzene molecules would have undesirable effects in terms of membrane behaviour, subsequent unfolding, or the abundance of nonspecific “noise” contacts between proteins and probes. Therefore, we did not pursue the upper bounds of the saturation value (which is unknown for our specific system).
p22, Section 4.1.1. I would prefer to have slightly more information about the setup: "semi-isotropic" could be more specific as well as details about update intervals of the barostat and thermostat.
We have added more detail about compressibility and time constants of both thermostat and barostat to the manuscript (under the section 4.1.4 Benzene mapping simulation protocol).
p24, line 17: Does this mean that solvent and other solutes are not part of the (de)protonation acceptance criterion? That is, is all explicit solvent replaced by a continuum? If so, could one not use a much cheaper constant pH scheme for conformational sampling?
This understanding of the (de)protonation criterion being assessed in the generalised Born implicit solvent model is correct, and we acknowledge it as a downside of the applied method. We did not consider use of the cheaper implicit solvent for conformational sampling (as implemented by Baptista et al., https://doi.org/10.1063/1.1497164), as the main bottleneck of the constant-pH simulation approach was not the computational expense, but method availability. The computational cost of the simulation was not considered as a major factor in our method selection, and we decided to use the explicit water model where possible. This was presumed to provide enhanced credibility in conformational sampling. Therefore, the method which combines an explicit water model during fixed protonation states and an implicit model for (de)protonation updates was deemed appropriate at the time.
If we were to conduct similar research in future, we may opt for an all-explicit water model in constant pH simulations, as was recently developed by Groenhoff and Hess groups and implemented in Gromacs (Aho et al., 2022: https://doi.org/10.1021/acs.jctc.2c00516). Motivated by this comment, we included a remark about the constant-pH simulation approach and the future direction in the 2.6 Limitations section of the paper.
p26, line 13: I much appreciate that the authors have made the effort to deposit the electronic material on Zenodo. This is a nice and very helpful gesture to the community!
We thank the reviewer for the kind acknowledgement!
https://doi.org/10.7554/eLife.82447.sa2Article and author information
Author details
Funding
National Research Foundation Singapore (NRF2017NRF-CRP001-027)
- Jan K Marzinek
- Ganesh S Anand
- Peter J Bond
Agency for Science, Technology and Research
- Lorena Zuzic
- Jan K Marzinek
- Peter J Bond
University of Manchester
- Lorena Zuzic
- Jim Warwicker
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
The research was supported by the A*STAR Bioinformatics Institute (BII) core funds, the A*STAR Graduate Academy (A*GA), and the University of Manchester. PJB, JKM, and GSA acknowledge NRF (NRF2017NRF-CRP001-027) for funding. The computational work for this article was performed on resources of the National Supercomputing Centre, Singapore (https://www.nscc.sg), A*STAR Computational Resource Centre, A*STAR Bioinformatics Institute, and the N8 Centre of Excellence in Computationally Intensive Research (N8 CIR) provided and funded by the N8 research partnership and EPSRC (Grant No. EP/T022167/1). We are grateful to Ana Damjanovic for helpful advice regarding constant-pH simulations.
Senior Editor
- Detlef Weigel, Max Planck Institute for Biology Tübingen, Germany
Reviewing Editor
- Qiang Cui, Boston University, United States
Reviewers
- Jana Shen, University of Maryland, Baltimore, United States
- Mikael Lund, Lund University, Sweden
Version history
- Preprint posted: July 13, 2022 (view preprint)
- Received: August 4, 2022
- Accepted: April 18, 2023
- Version of Record published: May 5, 2023 (version 1)
Copyright
© 2023, Zuzic et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Metrics
-
- 358
- Page views
-
- 48
- Downloads
-
- 0
- Citations
Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.
Download links
Downloads (link to download the article as PDF)
Open citations (links to open the citations from this article in various online reference manager services)
Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)
Further reading
-
- Computational and Systems Biology
- Microbiology and Infectious Disease
Candida albicans, an opportunistic human pathogen, poses a significant threat to human health and is associated with significant socio-economic burden. Current antifungal treatments fail, at least in part, because C. albicans can initiate a strong drug tolerance response that allows some cells to grow at drug concentrations above their minimal inhibitory concentration. To better characterize this cytoprotective tolerance program at the molecular single-cell level, we used a nanoliter droplet-based transcriptomics platform to profile thousands of individual fungal cells and establish their subpopulation characteristics in the absence and presence of antifungal drugs. Profiles of untreated cells exhibit heterogeneous expression that correlates with cell cycle stage with distinct metabolic and stress responses. At 2 days post-fluconazole exposure (a time when tolerance is measurable), surviving cells bifurcate into two major subpopulations: one characterized by the upregulation of genes encoding ribosomal proteins, rRNA processing machinery, and mitochondrial cellular respiration capacity, termed the Ribo-dominant (Rd) state; and the other enriched for genes encoding stress responses and related processes, termed the Stress-dominant (Sd) state. This bifurcation persists at 3 and 6 days post-treatment. We provide evidence that the ribosome assembly stress response (RASTR) is activated in these subpopulations and may facilitate cell survival.
-
- Immunology and Inflammation
- Microbiology and Infectious Disease
Pyroptosis and apoptosis are two forms of regulated cell death that can defend against intracellular infection. When a cell fails to complete pyroptosis, backup pathways will initiate apoptosis. Here, we investigated the utility of apoptosis compared to pyroptosis in defense against an intracellular bacterial infection. We previously engineered Salmonella enterica serovar Typhimurium to persistently express flagellin, and thereby activate NLRC4 during systemic infection in mice. The resulting pyroptosis clears this flagellin-engineered strain. We now show that infection of caspase-1 or gasdermin D deficient macrophages by this flagellin-engineered S. Typhimurium induces apoptosis in vitro. Additionally, we engineered S. Typhimurium to translocate the pro-apoptotic BH3 domain of BID, which also triggers apoptosis in macrophages in vitro. During mouse infection, the apoptotic pathway successfully cleared these engineered S. Typhimurium from the intestinal niche but failed to clear the bacteria from the myeloid niche in the spleen or lymph nodes. In contrast, the pyroptotic pathway was beneficial in defense of both niches. To clear an infection, cells may have specific tasks that they must complete before they die; different modes of cell death could initiate these ‘bucket lists’ in either convergent or divergent ways.