Valid molecular dynamics simulations of human hemoglobin require a surprisingly large box size

Abstract
Introduction
Results and discussion
Materials and methods
Appendix
Data availability
References
Article and author information
Metrics

Abstract

Recent molecular dynamics (MD) simulations of human hemoglobin (Hb) give results in disagreement with experiment. Although it is known that the unliganded (T $_{0}$ ) and liganded (R $_{4}$ ) tetramers are stable in solution, the published MD simulations of T $_{0}$ undergo a rapid quaternary transition to an R-like structure. We show that T $_{0}$ is stable only when the periodic solvent box contains ten times more water molecules than the standard size for such simulations. The results suggest that such a large box is required for the hydrophobic effect, which stabilizes the T $_{0}$ tetramer, to be manifested. Even in the largest box, T $_{0}$ is not stable unless His146 is protonated, providing an atomistic validation of the Perutz model. The possibility that extra large boxes are required to obtain meaningful results will have to be considered in evaluating existing and future simulations of a wide range of systems.

https://doi.org/10.7554/eLife.35560.001

Introduction

Human hemoglobin is the paradigmatic model system for cooperativity in proteins. It transports oxygen from the lungs to the tissues and it is composed of two identical $α$ -chains ( $α 1$ and $α 2$ ) and two identical $β -$ chains ( $β 1$ and $β 2$ ). They form two identical dimers ( $α 1 β 1$ and $α 2 β 2$ ), whose relative orientation differs significantly in the unliganded (T $_{0}$ ) and liganded (R $_{4}$ ) tetramer. (Baldwin and Chothia, 1979) Although there is a vast literature on Hb and its cooperative mechanism (Schechter, 2008), how it functions at the atomistic level is still not fully understood (Cui and Karplus, 2008). The first insight into the mechanism was obtained from the low-resolution structures (5.5 Å) of the hemoglobin tetramer (Muirhead and Perutz, 1963), which showed that the heme groups were too distant to be able to interact directly. Monod, Wyman and Changeux formulated the allosteric (MWC) model (Monod et al., 1965; Cui and Karplus, 2008) based on the structural transition between two quaternary structures (T and R) to explain the indirect interaction between the heme groups required for cooperative oxygen binding.

Higher resolution (2.8 Å) X-ray structures of unliganded and liganded hemoglobin (Perutz, 1970) confirmed that there exist two quaternary structures (deoxy (T $_{0}$ ) and oxy (R $_{4}$ )) for the tetramer and two tertiary structures for each individual chain (liganded and unliganded). Based on the structural results, as well as mutant data, Perutz, 1970) proposed a stereochemical mechanism for cooperativity, in which salt bridges (some with ionizable protons in the neutral pH range) provide the link between ligand-induced tertiary changes and the relative stability of the two quaternary structures. Shortly afterwards, the elements of the Perutz mechanism were incorporated into a structure-based statistical mechanical model (Szabo and Karplus, 1972). The model provides a quantitative framework for the effects of specific tertiary structural changes induced by ligand binding on the relative stability of the T and R structures (Szabo and Karplus, 1972; Gelin and Karplus, 1977). The shift of the equilibrium from T to R as a function of ligand concentration results in the sigmoidal (cooperative) ligand binding curve.

Several papers using different force fields and simulation conditions have been published recently (Hub et al., 2010; Yusuff et al., 2012) describing molecular dynamics (MD) simulations of the T and R states, including T $_{0}$ and R $_{4}$ for which 1.25 Å resolution X-ray structures are available (T $_{0}$ , 2DN2; R $_{4}$ , 2DN3) (Park et al., 2006). Although R $_{4}$ was found to be stable for several hundred nanoseconds, the T $_{0}$ state was not. It was found to make a transition to an R-like state in the same time period (see also Figure 1—figure supplement 2). This occurs in spite of the fact that, experimentally, the T $_{0}$ state is about seven kcal/mol more stable than the R $_{0}$ state (Edelstein, 1971), which is derived from the ratio of the dissociation constants of liganded and unliganded Hb of $6.7 \times 10^{5}$ (Thomas and Edelstein, 1972).

Although the rate of the T $_{0}$ to R $_{0}$ transition has not been measured directly, it can be estimated from the experimentally determined R $_{0}$ to T $_{0}$ transition rates and the R $_{0}$ /T $_{0}$ equilibrium constant. For the unliganded hemoglobin tetramer, the R $_{0}$ to T $_{0}$ rate is about 20 $μ$ s (Sawicki and Gibson, 1976). With the equilibrium constant of [T $_{0}$ /R $_{0}$ ] equal to $6.7 \times 10^{5}$ (Edelstein, 1971; Thomas and Edelstein, 1972), the T $_{0}$ to R $_{0}$ rate is estimated to be on the order of seconds with the major contribution to the activation barrier arising from the equilibrium free energy difference (7 kcal/mol) between T $_{0}$ and R $_{0}$ . There is, thus, a striking disagreement between the transition time observed in the simulations and the estimate from experiment.

Results and discussion

The instability of T $_{0}$ in the published simulations raises a fundamental question: What is wrong with them? In search for an answer, we focused on the hydrophobic effect (Chothia et al., 1976; Lesk et al., 1985), which arises from the disruption of the bulk water hydrogen bond network around nonpolar groups (Rossky et al., 1979; Cheng and Rossky, 1998). The theoretical analysis of Chandler and coworkers (Chandler, 2005; Chandler and Varilly, 2012) indicated that for large molecules, there was a ‘dewetting’ phenomenon that stabilizes a more compact structure. Chothia et al. (1976) noted that "A larger surface area is buried in deoxy- than in methemoglobin as a result of tertiary and quaternary structure changes. [..] This implies that hydrophobicity stabilizes the deoxy structure, the free energy spent in keeping the subunits in a low-affinity conformation being compensated by hydrophobic free energy due to the smaller surface area accessible to solvent.’ Such a stabilizing effect of T $_{0}$ should appear naturally in a MD simulation, but evidently did not do so in the published simulations (Hub et al., 2010; Yusuff et al., 2012).

Figure 1 with 6 supplements see all

Download asset Open asset

Global structural changes depending on box size.

Panel A: Overall structure of the $α 1 β 1 α 2 β 2$ hemoglobin tetramer. The His146 side chains (green spheres) are specifically indicated. Panel B: Temporal change of the C $α$ –C $α$ distance between His146 $_{β 1}$ and His146 $_{β 2}$ . The dashed lines (black, orange, red) indicate transition points for the 75, 90, and 120 box, respectively. Cyan and blue arrows indicate the values of the corresponding observable found for the deoxy T $_{0}$ (2DN2) and oxy R $_{4}$ (2DN3) states, respectively.

https://doi.org/10.7554/eLife.35560.002

Having exhausted other possibilities (see Appendix for more details), we wondered whether the box size used for the MD simulations might be too small for the hydrophobic stabilization to be manifested. Because most of the simulation time is consumed by the waters, rather than the protein itself, which is of primary interest, a ‘lore’ has grown up about the minimal box size that can be used with periodic boundary conditions to carry out meaningful simulations. The standard requirement is that there must be at least five water molecules between any protein atom and the box boundary. (Brooks et al., 1983, 2009) The box we first used was 75 Å; the T $_{0}$ tetramer dimensions are approximately $54 \times 49 \times 50$ Å, and there were 10,763 water molecules, as well as enough Na $^{+}$ and Cl $^{-}$ ions to yield a 0.15 m/L molar concentration. All MD simulations were done in the $N P T$ ensemble (see SI for details).

To investigate the possibility that larger boxes were required for stabilizing T $_{0}$ , we carried out 1 $μ$ s simulations with four cubic boxes, 75 Å (10,543 water molecules) 90 Å (20,756 water molecules), 120 Å (53,287) and 150 Å (105,073), see Figure 1—figure supplement 1. In all these simulations, His146 $_{β 1}$ and His146 $_{β 2}$ , which play an essential role in the Perutz model (Perutz, 1970), were protonated. A comprehensive overview of the structural changes observed in the simulations of the four box sizes is provided in Figure 1. The essential result is that T $_{0}$ is stable for the entire simulation in the 150 Å box, while it is not in the smaller boxes (for details, see SI). Figure 2 shows the structures obtained at the end of the 1 $μ$ s simulations, superposed on the X-ray structure that is more similar; that is, the oxy R $_{4}$ structure (2DN3) for the 90 and 120 Å simulations and the deoxy T $_{0}$ structure (2DN2) for the 150 Å box simulation (see also Table 1).

Figure 2

Download asset Open asset

Global conformational rearrangement of tetrameric hemoglobin as a function of the box size.

(A) Superposition of the 2DN3 structure and the Hb structure in the 90 Å box at $t = 1000$ ns. (B) Superposition of the 2DN3 structure and the Hb structure in the 120 Å box at $t = 1000$ ns. (C) Superposition of the 2DN2 structure and the structure in the 150 Å box at $t = 1000$ ns. In each case the RMSDs (in Å) to the 2DN2 and 2DN3 are also given.

https://doi.org/10.7554/eLife.35560.009

Table 1

C $α$ RMSD (in Å) relative to the 2DN2 (T $_{0}$ ) and 2DN3 (R $_{4}$ ) X-ray structure (Park et al., 2006) of the end point structures at 1 $μ$ s.

In bold structure to which a computed Hb structure is closest.

https://doi.org/10.7554/eLife.35560.010

Structure	C $_{α}$ -C $α$ RMSD to 2DN2	C $_{α}$ -C $α$ RMSD to 2DN3
2DN2	$-$	2.43
2DN3	2.43	$-$
75 Å box	2.37	2.59
90 Å box	3.35	0.64
120 Å box	2.44	0.30
150 Å box	0.39	3.09

Figure 3 with 1 supplement see all

Download asset Open asset

The average number of H-bonds per water molecule $< N_{H - bonds} >$ /molecule, from analyzing water-water hydrogen bonds, during 1 $μ$ s MD simulations for all four box sizes and for three different donor-acceptor distance cutoffs: 2.8 (strong, top panels), 3.0 (medium, middle panels) and 3.3 Å (weak, bottom panels).

The solid black lines are averages for time intervals corresponding to the lifetime of each state in each of the simulation boxes (see Figure 1): For the 75 Å box: 0 to 140 ns (first transition), 140 to 530 ns (second transition) and 530 to 1000 ns. For the 90 Å box: 0 to 470 ns, 470 to 780 ns and 780 to 1000 ns. For the 120 Å box: 0 to 620 ns, 620 to 920 ns, and 920 to 1000 ns. For the 150 Å box, the average is over the entire simulation since no transition occurs. The average reduction in $< N_{H - bonds} >$ /molecule is maximum (0.1) when weak H-bonds are included (cutoff 3.3 Å, bottom row) and smallest (0.015) if only strong H-bonds (cutoff 2.8 Å, top row) are analyzed. In the 120 Å box (third column) and for the strong H-bonds the loss in $< N_{H - bonds} >$ /molecule is insignificant but clearly increases when weak H-bonds are included. We also observe a significant decrease in the fluctuation of $< N_{H - bonds} >$ /molecule between simulations in the smallest and the largest box sizes and a pronounced drop in $< N_{H - bonds} >$ /molecule for the 75 and 90 Å boxes prior to the transitions, see insets.

https://doi.org/10.7554/eLife.35560.011

If the free energy difference between the T $_{0}$ and R $_{0}$ states (7 kcal/mol, see above) were to arise entirely from the relative stability of the water-water network, this value corresponds to an energy difference of $10^{- 4}$ kcal/mol per water molecule when comparing the 90 and 150 Å boxes, which differ by $\approx 80, 000$ water molecules. Such small energy differences are very difficult to capture in MD simulations and is not attempted here. However, it is interesting to note that the average number of hydrogen bonds per water molecule, $< N_{H - bonds} >$ /molecule (see Figure 3) shows such an effect: for the three smaller boxes the $< N_{H - bonds} >$ /molecule decreases by $10^{- 3}$ to $10^{- 4}$ with every transition. This is consistent with the estimated energy change per water molecule. Furthermore, Figure 3 demonstrates that the fluctuation of $< N_{H - bonds} >$ /molecule decreases with increasing box size which is the behaviour expected from statistical mechanics. It should be pointed out that the running averages were evaluated over time intervals between which transitions were observed, see Figure 1.

Based on hydrodynamic arguments, Yeh and Hummer (2004) showed that the water self-diffusion coefficient, $D$ , calculated from an MD simulation of pure water (without ions) in a periodic box, scales as $N^{- 1 / 3}$ , where $N$ is the number of particles. For the largest box they studied (40 Å) the size correction was negligible. Interestingly, our simulation of the 75 Å box containing Hb yielded a value of $D$ that is much too small ( $D = 4.25 \times 10^{- 5}$ cm $^{2}$ /s vs $D = 5.95 \times 10^{- 5}$ cm $^{2}$ /s); the latter is the correct value for the TIP3P water model. In Figure 4A, we show the results for the value of $D$ as a function of box size, plotted versus $1 / L$ (nm $^{- 1}$ ). As expected, all the pure water boxes are large enough so the calculated value agrees with the extrapolated value of Yeh and Hummer ( $D = 5.9 \times 10^{- 5}$ cm $^{2}$ /s) within statistical error. However, the calculated self-diffusion coefficient from MD simulations with Hb present, is identical to that of pure water only for the largest 150 Å box.

Figure 4

Download asset Open asset

System-size dependence of the water diffusion coefficient.

(A) Water diffusion coefficients $D$ as a function of system size for systems with and without protein. Also, results from Hummer et al. are shown for comparison and validation. (B) The water $D$ as a function of time for the first instability to occur. For the 75, 90, and 120 Å boxes instabilities were observed (see Figure 1) and scale linearly with the water $D$ . The yellow symbol for the 150 Å box is the extrapolated value (700 ns).

https://doi.org/10.7554/eLife.35560.013

The results described here suggest that the correct free energy of hemoglobin, at least to the extent that T $_{0}$ is stable relative to R $_{0}$ , requires that the simulation be done in a box large enough so that the water environment behaves like bulk water. In Figure 4B, we plot the calculated $D -$ value for boxes containing Hb versus the time in ns when the first transition from the T $_{0}$ structure takes place. As can be seen, there is a linear relationship between the two. Of most interest is the fact that an extrapolation of the line indicates that in the 150 Å box, the first transition away from T $_{0}$ should take place at 700 ns. However, we have shown that in the 150 Å box, T $_{0}$ is still present at 1.4 $μ$ s. This provides strong evidence that in a 150 Å box, T $_{0}$ is in fact stable. This linear dependence was not expected. It is an interesting result whose origin, though, still requires explanations at a molecular level. The result that the lifetime of the T $_{0}$ state increases systematically with the increase in box size effectively corresponds to multiple simulations. The T-state was finally found to be stable in the 150 Å box for 1.4 $μ$ s, significantly longer than the extrapolated lifetime value (700 ns). The idea that $μ$ s-plus simulations are needed has become a ‘lore’ (similar to the box size-related ‘rule’ investigated here) with the availability of bigger computers, even when they are not required for a particular problem, as is the case here.

Simulations for all box sizes have also been done with His146 deprotonated. The results for that system showed that T $_{0}$ is stabilized in larger boxes, but after less than 100 ns a transition to an R-like state occurs. This provides direct evidence that His146 protonation is essential for stabilizing T $_{0}$ , in accord with the Perutz model (Perutz, 1970).

To relate the above results to the hydrophobic effect, we use the construct of Chandler (Chandler, 2005), who showed (Figure 5A) that significant water depletion around a spherical hydrophobic solute is expected when its radius is larger than 1 nm (10 Å). Since Hb is nearly spherical with a 2.8 nm radius, we calculated the $g (r)$ for the 90 and 150 Å box (see Figure 5B). The behavior in the 150 Å box is consistent with the expected water structuring, while for the 90 Å box it is not.

Figure 5 with 4 supplements see all

Download asset Open asset

System-size dependence of the average equilibrium water density.

(A) From Ref. (Chandler, 2005) (Figure 3); the average equilibrium water density a distance $r + R$ from spherical cavities. $R$ is the (spherical) size of the solute. Solid lines for the ideal hydrophobe, dashed line when van der Waals interactions are present. (B) Results for Hb in water for the 90 Å (red) and 150 Å (green) box. The radius of Hb (2.8 nm) is intermediate between the two cases in panel A.

https://doi.org/10.7554/eLife.35560.014

The full radial distribution function is reported in Figure 5—figure supplement 1 and Figure 5—figure supplement 2B shows the number of water molecules derived from it. It is found that up to a distance of 8.5 Å from the central cylinder $\sim 150$ water molecules are present which is consistent with explicit counting, see Figure 5—figure supplement 2A, and Figure 5—figure supplement 3 for the corresponding probability distribution function. Furthermore, the structural transitions are accompanied by pronounced dewetting and water penetration as shown in Figure 5—figure supplement 4 for the 90 Å box.

To summarize, the T $_{0}$ state is only found to be thermodynamically stable if (i) the hydration water behaves like bulk water as judged from the self-diffusion coefficient, (ii) the number of hydrogen bonds per water molecule is large enough and its fluctuation around the average sufficiently small, see Figure 3—figure supplement 1. Therefore, if water is not engaged in an adequate number of water-water H-bonds, solvent water is prone to attack the protein salt bridges, destabilizing them and eventually to break them.

As a more local measure of the effect of undersolvation the radial distribution function $g (r)$ around the (His146 $_{β 1}$ )CG–O(water) was determined for all box sizes, see Figure 6. For the 150 Å box the $g (r)$ remains almost invariant whereas for the 120 Å box an appreciable change occurs during the time when the major structural transition at 920 ns takes place. For the smallest boxes, the local $g (r)$ is very variable, which supports the importance of locally structured water molecules for stabilizing the T $_{0}$ state.

Figure 6

Download asset Open asset

Averaged radial distribution functions $g (r)$ between His146 and water for the different box sizes and for simulation windows before, during and after the structural transitions (see Figure 1).

Analysis for (A) the C-terminal (CT) of His146 and Water H (Hw) and (B) for C of His146 and water O (Ow). For the largest (150 Å) box no appreciable change in $g (r)$ is found. The two peaks at 2.7 and 4.1 Å in (A) and the two peaks at 5 and 7.5 Å in (B) are due to a water network as indicated in the sketches. The sketches in A and B represent different views of the same snapshot taken from the MD simulation of the 150 Å box at $t = 960$ ns and is used here as reference to describe the water network because this box represented a stable $g (r)$ . Sketches in (A) emphasise the water network at the C-terminal of His146, and sketches in (B) emphasise the water network at the C $_{γ}$ of His146. Waters W0, W1 and W2 are key structural waters (used as anchor) in the stability of the T state (they are framed with a black line). This network is stable for simulations in the largest box but unstable in the other boxes. His143 is singly protonated ( $N ϵ$ ) and His146 is doubly protonated (N $δ$ and N $ϵ$ ).

https://doi.org/10.7554/eLife.35560.019

In addition to the global behavior, local structural changes involving the two His146 residues were examined. The results (Figure 1—figure supplement 3–5) show that for the largest box, the parameters examined (e.g. the salt bridge to Lys40, discussed by Perutz) fluctuate around the equilibrium values near those of the T $_{0}$ structure while for the smaller boxes there are abrupt changes, correlated with global structural transitions. The transition away from the T $_{0}$ structure extends over approximately 10 ns during which several important contacts are broken or formed, see Figure 1—figure supplement 5A. This value suggests that the apparent activation energy from the simulations in the smaller solvent boxes is near zero, in contrast to an estimate on the order of seconds for the transition if the barrier arose from the 7 kcal/mol difference in stability.

Importantly, the ability of the 150 Å box to stabilize the T-state starting from the R-like structure in the 120 Å box after 1 $μ$ s was also explored. Solvating this structure in the 150 Å box, minimizing, heating and equilibrating it (see Materials and methods) and following the equilibrium dynamics for 1 $μ$ s, the final RMSD compared to the 2DN2 (T-state) structure is 1.97 Å (2.55 compared to 2DN3 (R-state)) starting from an RMSD of 2.73 Å (with respect to 2DN2) and 1.42 Å (2DN3) at the beginning of the simulation (after heating and equilibration), respectively. Concomitantly, the His146 $_{β 1}$ –His146 $_{β 2}$ and His143 $_{β 1}$ –His143 $_{β 2}$ distances change from values typical for an R-state structure (both $\sim 12.6$ Å) to 14.0 Å (for His143) and 18.0 (for His146) and approach separations indicative of a T-state structure (18.6 Å and 30.9 Å) without, however, fully completing the transition to a T-state structure. As mentioned above, the R $_{0}$ to T $_{0}$ transition time is about 20 $μ$ s, much longer than the simulation time.

Given the increasing use of molecular dynamics simulations to study conformational transitions in large proteins and in an explicit solvent environment, the present result that much larger boxes than those used standardly are required for what appears to be the hydrophobic effect to be manifested is of general interest. It has wide ranging implications for the interpretation and validity of previous simulations, as well as those to be undertaken in the future. Given that the magnitude of the effect is expected to depend on the size (and shape) of the molecule and its hydrophobicity, as well as possibly other properties (Chandler, 2005), the requirement for the use of larger boxes in simulations will have to be investigated in each case. One particularly relevant situation where capturing the correct diffusional dynamics of the environment will play a crucial role is that in atomistic simulations for the crowded conditions (Feig et al., 2017) that exist in entire cells or parts thereof (Zhou et al., 2008). Possible errors in molecular dynamics simulation results for other phenomena, such as protein folding, for example, as well as for polymeric materials more generally have to be considered as well.

Materials and methods

The influence of solvent layers on the structural stability of hemoglobin tetramer is investigated using Molecular Dynamics (MD) simulations. Extended large-scale simulation were performed with the CHARMM36 all atom force-field (Best et al., 2012) and the TIP3P water model (Jorgensen et al., 1983) using version 5.1.4 of the GROMACS package (Abraham et al., 2015) on GPUs.

The coordinates of the starting structure are taken from the X-ray structure of tetrameric human hemoglobin in the deoxy form, PDB code 2DN2 (1.25 Å resolution) (Park et al., 2006). The protonation states of the histidines were based on the 2013 study of Zheng et al. and the terminal $β$ histidines (His146) were both doubly protonated (Zheng et al., 2013). The protein was solvated in four different cubic boxes of increasing size: 75, 90, 120 and 150 Å. The system was neutralized by adding counter ions and the salt concentration of 0.15 m/L was achieved using Na $^{+}$ and Cl $^{-}$ . The total number of atoms is: 39,432 for the 75 Å box, having 10,543 water molecules and 42 Na $^{+}$ and 38 Cl $^{-}$ ions; 72,142 for the 90 Å box, having 20,756 water molecules and 70 Na $^{+}$ and 66 Cl $^{-}$ ions; 163,480 for the 120 Å box, having 53,287 water molecules and 160 Na $^{+}$ and 156 Cl $^{-}$ ions; 318,911 for the 150 Å box, having 105,073 water molecules and 309 Na $^{+}$ and 305 Cl $^{-}$ ions.

For the electrostatic interactions, particle-mesh Ewald (PME) was used with a grid spacing of 1 Å, a relative tolerance of $10^{- 6}$ and a cutoff of 10 Å, together with a 10 Å switching for the Lennard-Jones (LJ) interactions. The LINCS algorithm (Hess et al., 1997) was used for constraining bonds involving H-atoms. Each system was first energy minimized for 50,000 steps using steepest descent, heated from 0 to 300 K in increments of 10 K in $N V T$ for 300 ps, followed by 500 ps ( $N V T$ ) and 500 ps of $N p T$ equilibration at $p = 1$ atm with a time step of 2 fs. The center of mass of the protein was restrained to the center of mass of the simulation box. The Velocity Rescaling (Bussi et al., 2007) (with $τ = 0.1$ ps) and Parrinello-Rahman (Parrinello and Rahman, 1981) methods were used for temperature and pressure control, respectively. The velocity rescaling method is an extension of the Berendsen thermostat to which a stochastic force chosen such as to generate a correct canonical distribution is added (Bussi et al., 2007). The MD simulations for all systems were carried out for at least 1 $μ$ s at constant temperature and pressure $(N p T)$ at 300 K and one atm with a time step of 2 fs and the $1 σ$ temperature fluctuations over the 1 $μ$ s trajectories were 0.1 K for the 150 Å box and 0.5 K for the 90 Å box.

Water self-diffusion coefficients $D$ were calculated for box sizes 75, 90, 120 and 150 Å, in the presence and in the absence of Hb, over the entire 1 $μ$ s trajectory. In the absence of the protein, the simulation boxes contained pure water systems (no ions were included). Including ions at physiological concentrations will typically change the water self-diffusion coefficient by 1% to 2% (Kim et al., 2012). First, the mean square displacement (MSD) of all oxygen atoms from a set of initial positions was calculated using mass-weighted averages. Then, the diffusion constant was calculated from the slope of the mean-squared displacement $D_{PBC} = {Lim}_{t \to \infty} \frac{\partial}{\partial t} \frac{⟨ | r (t) - r (0) |^{2} ⟩}{6}$ averaged over all water molecules of a particular trajectory. Errors are estimated from the difference of the diffusion coefficients obtained from separate fits over the two halves of the fit interval.

The TIP3P self diffusion coefficient calculated in the simulation by Yeh et al. (using periodic boundary conditions) is $D_{PBC} = 5.8 \times 10^{- 5}$ cm $^{2}$ /s (Yeh and Hummer, 2004). However, simulations of water at ambient conditions and a Lennard-Jones (LJ) fluid show that the diffusion coefficients depend on system size (Allen and Tildesley, 1987). Thus, the diffusion coefficient corrected for system-size effects is $D_{0} = 6.1 \times 10^{- 5}$ cm $^{2}$ /s for an infinite system of TIP3P water at 298 K and ambient temperature (Yeh and Hummer, 2004). For direct comparison of our values the error $κ = L \times (D_{0} - D_{PBC})$ was subtracted from $D_{0}$ . The resulting TIP3P value is $D_{0, corr} = 5.95 \times 10^{- 5}$ cm $^{2}$ /s.

In order to directly compare with the literature (Yeh and Hummer, 2004) the 40 Å box was also considered here. The literature value of $D_{PBC} = 5.65 \pm 0.16$ ( $10^{- 5}$ cm $^{2}$ /s) compares with $D_{PBC} = 5.59 \pm 0.013$ ( $10^{- 5}$ cm $^{2}$ /s) computed here which validates the present simulations.

Appendix

Structural analyses

The distance between the two C $_{α}$ atoms of His146 of the two $β$ chains is one meaningful coordinate to distinguish the T and the R configuration. The distance between the two C $_{α}$ atoms of His143 present at the opening of the central cavity can be used to monitor the width of the opening. The RMSD between the C $_{α}$ s relative to the 2DN2-Xray structure was also evaluated.

For the 75 Å box the collapse after 130 ns leads to a concomitant increase in the C $α$ -C $α$ RMSD from 2DN2 by more than 1 Å, a repositioning of the $β 2$ chain relative to the other three chains and a decrease in the degree of hydration of the central channel by up to 20%. Similar observations apply to the 90 (120) Å boxes for which three structural transitions are observed: one after 480 (630) ns (Step 1), the next one after 780 (820) ns (Step 2), and Step 3 after 880 (920) ns. The transitions in the smaller boxes involve different changes (see Figure 1—figure supplement 2). Also, the end point structures after 1 $μ$ s are not the same (see Table 1).

In order to evaluate and monitor the opening of the central cavity the angle between $α$ 1 $β$ 1 and $α$ 2 $β$ 2 sub-units groups is calculated as the angle formed between their planes (Figure 1—figure supplement 2). The plane is defined by three atoms: C $α$ of His146 $_{β}$ , Fe $_{β}$ and Fe $_{α}$ . The smaller the angle the closer are the subunits and the tighter is the central cavity. This affects the number of water molecules that can access the channel. The water count was thus evaluated and confirmed this observation.

The number of water molecules inside the central cylinder of the protein was also analyzed see Figure 5—figure supplements 3 and 4. The water count includes only waters in the central cylinder, which was defined as a cylinder of 8.5 Å base area, 22 Å of height and having the four heme moieties as the center of mass, and not waters at the tetramerization interface.

The ion distribution was considered, as well. The concentration of ions was 0.15 mol/L for all box sizes. It is known that the ionic strength of the solution plays a role in the oxygenation behaviour. (Szabo and Karplus, 1972) Also, the local ion concentration has an indirect influence by modulating the water activity, which in turn affects allosteric regulation of Hb. (Salvay et al., 2003) The ion diffusivity changes from 2.1 to $2.8 \times 10^{- 5}$ cm $^{2}$ /s for simulations in the 75 to 150 Å boxes which also reflects the increased diffusivity of the bulk water molecules with increasing box size, see Figure 4A. The average number of ions within a 20 Å sphere around the center of mass of Hb is 9.6 (75 Å), 8.5 (90 Å), 8.7 (120 Å), and 9.8 (150 Å). Hence, the ion distribution does not drive the transition directly, which is consistent with additional test simulations carried out, as described below.

Additional modifications considered

Further modifications for which the stability of the T-state was investigated included (i) strategic placements (by using constraints) of 4 Cl $^{-}$ ions at the $β_{1} β_{2}$ interface where the ligand 2,3-diphosphoglycerate (DPG) binds, (ii) reduced force constants for the Fe-N $_{His}$ stretching to render the heme group more adaptable to global structural changes of the protein, (iii) placement of the DPG ligand which is known to bind to deoxy-Hb, (iv) modified charges of the hydrogen bonds to stabilize the hinge contacts (Jones et al., 2014), (v) different protonation state of His146 $_{β 1}$ and His146 $_{β 2}$ ( $δ -$ , $ϵ -$ , and doubly protonated) and (vi) scaling the water-protein nonbonded interactions, specifically the van der Waals well depths. (Best et al., 2014) All these simulations were run starting from the T-state in the 75 Å box and were found to become unstable within 70 to 200 ns. The outcome of the last two modifications considered (v and vi) is reported in Figure 1—figure supplement 6. The figure shows the C $α$ –C $α$ distance between residues His146 $_{β 1}$ and His146 $_{β 2}$ for the T-state Hb in the 75 Å box during 400 ns of 4 MD simulations for the variants (1) protonated His146 (protonated N $_{δ}$ and N $_{ϵ}$ ) with (black) and (2) without (red) scaled protein-water interactions; (3) unprotonated His146 (proton on N $_{ϵ}$ green) and (4) unprotonated His146 (proton on N $_{δ}$ blue) with scaled protein-water interactions. In all cases the T-state structure becomes unstable to various degrees and decays more or less completely towards the R-state. The red trace in Figure 1—figure supplement 6 is identical to the black trace in Figure 1 of the main manuscript.

Particular observations made for these additional simulations are as follows: In the presence 4 Cl $^{-}$ ions in the cavity where DPG binds the T-state is stable for $\sim 45$ ns. Constraining the chloride ions does not significantly extend the time during which the protein is in its T-state. The rationale behind reducing the force constants for the Fe-N bonds is that by softening the internal degrees of freedom the protein as a whole can adapt more easily to perturbations induced by strain. However, even reducing the stretching force constant by a factor of 10 only increases the lifetime of the T-state out to 60 ns. The DPG ligand was placed at the $β_{1} β_{2}$ interface and harmonic constraints were applied to keep it close to this initial position. Despite the present of the ligand, the C $α$ –C $α$ distance between His146 $_{β 1}$ and His146 $_{β 2}$ decreases to 20 Å within 20 ns and hence the T-state decays. Recent experimental work (Jones et al., 2014) suggested that special hydrogen bond contacts (so called ‘hinge’ contacts) exist at the $α_{1} β_{2}$ interface and their presence and absence have been related to the T $\to$ R transition. In order to stabilize these contacts (Tyr42–Asp99 and Trp37–Asp94), the partial charges of the hydrogen and acceptor atoms were increased by 10 and 20%, respectively. With this, the T-state lifetime was 50 ns or less, compared to 40 ns if the charges were unmodified. Such tests had also been done for the T-state by constraining the Perutz H-bonds for 100 ns in equilibrium simulations. (Hub et al., 2010) However, after removing the constraints, the T-structure decayed again during the following 100 ns simulation. Finally, it had been found that a strengthening of the protein-water van der Waals interaction by 10% is sufficient to recover the correct dimensions of intrinsically disordered and unfolded proteins. (Best et al., 2014) Applying this to the apparent instability of the T-state of Hb does, however, not extend its lifetime. After 75 to 120 ns the T-state decays towards the R-state structure (see Figure 1—figure supplement 6). We also note that, although in the initial simulation of the 75 Å box with His 146 doubly protonated, the C $_{α}$ -C $_{α}$ distance between the His 146 residues decreases by only a small amount (from 31 Å to 25 Å in step 1), by other criteria the decaying T-state approaches the R-state (see Figure 1—figure supplement 2).

Data availability

All data generated or analysed during this study are included in the manuscript and supporting files.

References

1. Abraham MJ
2. Murtola T
3. Schulz R
4. Páll S
5. Smith JC
6. Hess B
7. Lindahl E
(2015) GROMACS: high performance molecular simulations through multi-level parallelism from laptops to supercomputers
SoftwareX 1-2:19–25.

https://doi.org/10.1016/j.softx.2015.06.001
- Google Scholar
Book
1. Allen MP
2. Tildesley DJ
(1987)
Computer Simulation of Liquids

Oxford, U.K: Clarendon Press.
- Google Scholar
1. Baldwin J
2. Chothia C
(1979) Haemoglobin: the structural changes related to ligand binding and its allosteric mechanism
Journal of Molecular Biology 129:175–220.

https://doi.org/10.1016/0022-2836(79)90277-8
- PubMed
- Google Scholar
1. Best RB
2. Zheng W
3. Mittal J
(2014) Balanced Protein-Water interactions improve properties of disordered proteins and Non-Specific protein association
Journal of Chemical Theory and Computation 10:5113–5124.

https://doi.org/10.1021/ct500569b
- PubMed
- Google Scholar
1. Best RB
2. Zhu X
3. Shim J
4. Lopes PE
5. Mittal J
6. Feig M
7. Mackerell AD
(2012) Optimization of the additive CHARMM all-atom protein force field targeting improved sampling of the backbone φ, ψ and side-chain χ(1) and χ(2) dihedral angles
Journal of Chemical Theory and Computation 8:3257–3273.

https://doi.org/10.1021/ct300400x
- PubMed
- Google Scholar
1. Brooks BR
2. Brooks CL
3. Mackerell AD
4. Nilsson L
5. Petrella RJ
6. Roux B
7. Won Y
8. Archontis G
9. Bartels C
10. Boresch S
11. Caflisch A
12. Caves L
13. Cui Q
14. Dinner AR
15. Feig M
16. Fischer S
17. Gao J
18. Hodoscek M
19. Im W
20. Kuczera K
21. Lazaridis T
22. Ma J
23. Ovchinnikov V
24. Paci E
25. Pastor RW
26. Post CB
27. Pu JZ
28. Schaefer M
29. Tidor B
30. Venable RM
31. Woodcock HL
32. Wu X
33. Yang W
34. York DM
35. Karplus M
(2009) CHARMM: the biomolecular simulation program
Journal of Computational Chemistry 30:1545–1614.

https://doi.org/10.1002/jcc.21287
- PubMed
- Google Scholar
(1983) CHARMM: A program for macromolecular energy, minimization, and dynamics calculations
Journal of Computational Chemistry 4:187–217.

https://doi.org/10.1002/jcc.540040211
- Google Scholar
(2007) Canonical sampling through velocity rescaling
The Journal of Chemical Physics 126:014101.

https://doi.org/10.1063/1.2408420
- PubMed
- Google Scholar
1. Chandler D
2. Varilly P
(2012)
Proceedings of the International School of Physics Enrico Fermi

75–111, Lectures on molecular- and nano-scale fluctuations in water, Proceedings of the International School of Physics Enrico Fermi, 176.
- Google Scholar
1. Chandler D
(2005) Interfaces and the driving force of hydrophobic assembly
Nature 437:640–647.

https://doi.org/10.1038/nature04162
- PubMed
- Google Scholar
1. Cheng YK
2. Rossky PJ
(1998) Surface topography dependence of biomolecular hydrophobic hydration
Nature 392:696–699.

https://doi.org/10.1038/33653
- PubMed
- Google Scholar
(1976) Role of subunit interfaces in the allosteric mechanism of hemoglobin
PNAS 73:3793–3797.

https://doi.org/10.1073/pnas.73.11.3793
- PubMed
- Google Scholar
1. Cui Q
2. Karplus M
(2008) Allostery and cooperativity revisited
Protein Science 17:1295–1307.

https://doi.org/10.1110/ps.03259908
- PubMed
- Google Scholar
1. Edelstein SJ
(1971) Extensions of the allosteric model for haemoglobin
Nature 230:224–227.

https://doi.org/10.1038/230224a0
- PubMed
- Google Scholar
1. Feig M
2. Yu I
3. Wang PH
4. Nawrocki G
5. Sugita Y
(2017) Crowding in cellular environments at an atomistic level from computer simulations
The Journal of Physical Chemistry B 121:8009–8025.

https://doi.org/10.1021/acs.jpcb.7b03570
- PubMed
- Google Scholar
1. Gelin BR
2. Karplus M
(1977) Mechanism of tertiary structural change in hemoglobin
PNAS 74:801–805.

https://doi.org/10.1073/pnas.74.3.801
- PubMed
- Google Scholar
(1997) LINCS: A linear constraint solver for molecular simulations
Journal of 18:1463–1472.

https://doi.org/10.1002/(SICI)1096-987X(199709)18:12<1463::AID-JCC4>3.0.CO;2-H
- Google Scholar
(2010) Spontaneous quaternary and tertiary T-R transitions of human hemoglobin in molecular dynamics simulation
PLoS Computational Biology 6:e1000774.

https://doi.org/10.1371/journal.pcbi.1000774
- PubMed
- Google Scholar
1. Jones EM
2. Monza E
3. Balakrishnan G
4. Blouin GC
5. Mak PJ
6. Zhu Q
7. Kincaid JR
8. Guallar V
9. Spiro TG
(2014) Differential control of heme reactivity in alpha and beta subunits of hemoglobin: a combined Raman spectroscopic and computational study
Journal of the American Chemical Society 136:10325–10339.

https://doi.org/10.1021/ja503328a
- PubMed
- Google Scholar
(1983) Comparison of simple potential functions for simulating liquid water
The Journal of Chemical Physics 79:926–935.

https://doi.org/10.1063/1.445869
- Google Scholar
1. Kim JS
2. Wu Z
3. Morrow AR
4. Yethiraj A
5. Yethiraj A
(2012) Self-diffusion and viscosity in electrolyte solutions
The Journal of Physical Chemistry B 116:12007–12013.

https://doi.org/10.1021/jp306847t
- PubMed
- Google Scholar
1. Lesk AM
2. Janin J
3. Wodak S
4. Chothia C
(1985) Haemoglobin: the surface buried between the alpha 1 beta 1 and alpha 2 beta 2 dimers in the deoxy and oxy structures
Journal of Molecular Biology 183:267–270.

https://doi.org/10.1016/0022-2836(85)90219-0
- PubMed
- Google Scholar
(1965) On the nature of allosteric transitions: a plausible model
Journal of Molecular Biology 12:88–118.

https://doi.org/10.1016/S0022-2836(65)80285-6
- PubMed
- Google Scholar
1. Muirhead H
2. PERUTZ MF
(1963) Structure of haemoglobin. a three-dimensional fourier synthesis of reduced human haemoglobin at 5-5 a resolution
Nature 199:633–638.

https://doi.org/10.1038/199633a0
- PubMed
- Google Scholar
1. Park SY
2. Yokoyama T
3. Shibayama N
4. Shiro Y
5. Tame JR
(2006) 1.25 A resolution crystal structures of human haemoglobin in the oxy, deoxy and carbonmonoxy forms
Journal of Molecular Biology 360:690–701.

https://doi.org/10.1016/j.jmb.2006.05.036
- PubMed
- Google Scholar
1. Parrinello M
2. Rahman A
(1981) Polymorphic transitions in single crystals: a new molecular dynamics method
Journal of Applied Physics 52:7182–7190.

https://doi.org/10.1063/1.328693
- Google Scholar
1. Perutz MF
(1970) Stereochemistry of cooperative effects in haemoglobin: Haem–Haem Interaction and the Problem of Allostery
Nature 228:726–734.

https://doi.org/10.1038/228726a0
- Google Scholar
(1979) A model for the simulation of an aqueous dipeptide solution
Biopolymers 18:825–854.

https://doi.org/10.1002/bip.1979.360180407
- Google Scholar
(2003) Hydration effects on the structural properties and haem–haem interaction in haemoglobin
Phys. Chem. Chem. Phys. 5:192–197.

https://doi.org/10.1039/B209560B
- Google Scholar
1. Sawicki CA
2. Gibson QH
(1976)
Quaternary conformational changes in human hemoglobin studied by laser photolysis of carboxyhemoglobin

The Journal of Biological Chemistry 251:1533–1542.
- PubMed
- Google Scholar
1. Schechter AN
(2008) Hemoglobin research and the origins of molecular medicine
Blood 112:3927–3938.

https://doi.org/10.1182/blood-2008-04-078188
- PubMed
- Google Scholar
1. Szabo A
2. Karplus M
(1972) A mathematical model for structure-function relations in hemoglobin
Journal of Molecular Biology 72:163–197.

https://doi.org/10.1016/0022-2836(72)90077-0
- PubMed
- Google Scholar
1. Thomas JO
2. Edelstein SJ
(1972)
Observation of the dissociation of unliganded hemoglobin

The Journal of Biological Chemistry 247:7870–7874.
- PubMed
- Google Scholar
1. Yeh I-C
2. Hummer G
(2004) System-Size dependence of diffusion coefficients and viscosities from molecular dynamics simulations with periodic boundary conditions
The Journal of Physical Chemistry B 108:15873–15879.

https://doi.org/10.1021/jp0477147
- Google Scholar
(2012) Role of the subunit interactions in the conformational transitions in adult human hemoglobin: an explicit solvent molecular dynamics study
The Journal of Physical Chemistry B 116:11004–11009.

https://doi.org/10.1021/jp3022908
- PubMed
- Google Scholar
(2013) Hemoglobin Bohr effects: atomic origin of the histidine residue contributions
Biochemistry 52:8539–8555.

https://doi.org/10.1021/bi401126z
- PubMed
- Google Scholar
(2008) Macromolecular crowding and confinement: biochemical, biophysical, and potential physiological consequences
Annual Review of Biophysics 37:375–397.

https://doi.org/10.1146/annurev.biophys.37.032807.125817
- PubMed
- Google Scholar

Article and author information

Author details

Krystel El Hage

Department of Chemistry, University of Basel, Basel, Switzerland

Contribution
Conceptualization, Data curation, Formal analysis, Validation, Investigation, Visualization, Methodology, Writing—original draft, Writing—review and editing

Competing interests
No competing interests declared

"This ORCID iD identifies the author of this article:" 0000-0003-4837-3888
Florent Hédin

Department of Chemistry, University of Basel, Basel, Switzerland

Contribution
Data curation, Formal analysis, Investigation, Writing—original draft

Competing interests
No competing interests declared

"This ORCID iD identifies the author of this article:" 0000-0001-6341-7557
Prashant K Gupta

Department of Chemistry, University of Basel, Basel, Switzerland

Contribution
Data curation, Formal analysis, Investigation, Writing—original draft

Competing interests
No competing interests declared
Markus Meuwly

Department of Chemistry, University of Basel, Basel, Switzerland

Contribution
Conceptualization, Resources, Software, Formal analysis, Supervision, Funding acquisition, Validation, Investigation, Visualization, Methodology, Writing—original draft, Project administration, Writing—review and editing

For correspondence
m.meuwly@unibas.ch

Competing interests
No competing interests declared

"This ORCID iD identifies the author of this article:" 0000-0001-7930-8806
Martin Karplus
1. Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts, United States
2. Laboratoire de Chimie Biophysique, ISIS, Université de Strasbourg, Strasbourg, France
Contribution
Conceptualization, Investigation, Methodology, Writing—original draft, Project administration, Writing—review and editing

For correspondence
marci@tammy.harvard.edu

Competing interests
No competing interests declared

Funding

Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung (NCCR MUST)

Markus Meuwly

Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung (200020-169079)

Markus Meuwly

Charmm Development Project

Martin Karplus

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank David Chandler $†$ , to whom we dedicate this paper, for many fruitful discussions. The work in Switzerland was supported by the Swiss National Science Foundation through grants 200021–117810, and the NCCR MUST. The work at Harvard was supported in part by the CHARMM Development Project.

Copyright

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.