Introduction

For decades, cellular organisation was primarily understood through the lens of membrane-bound compartmentalisation. However, it has become increasingly clear that membrane-less organelles —such as stress granules, nuclear speckles, and Cajal bodies— are also widespread within cells (Banani et al., 2017). At least in some cases, these organelles form by phase separation of their components (Shin and Brangwynne, 2017), primarily disordered regions of proteins and nucleic acids, which partition into dense and dilute phases. The condensates thus formed retain properties of liquids, like fusion, dripping and wetting (Brangwynne et al., 2009). This phenomenon has generated significant interest in the physical mechanisms governing the phase behaviour of biomolecular mixtures (Choi et al., 2020).

Many proteins undergoing phase separation share common characteristics, including devia-tions from typical compositions of folding proteins and simple sequences with multiple amino acid repeats. Specifically, in the sequences of proteins that experience upper critical solution temperature (UCST) transitions, like the low-complexity regions of FUS, hnRNPA1, Ddx4, LAF-1 or atGRP7, we observe long stretches rich in polar residues interspersed with aromatics or positively charged residues (Martin and Mittag, 2018). Borrowing language from the theory of associative polymers (Semenov and Rubinstein, 1998), these units of sequence have been termed “stickers”, in the case of aromatics and positively charged residues, and “spacers”, for the polar residue repeats (Martin et al., 2020; Bremer et al., 2022; Mittag and Pappu, 2022). Spacers act as linkers that lend flexibility to the polypeptide mesh in the protein-dense phase. In contrast, stickers play a key role in determining both the single-chain properties of the polymer and the phase behaviour of the condensate through interactions involving their aromatic or charged groups (Wang et al., 2018). Recent experiments have quantified the influence of different types of stickers in the polymer properties and phase behaviour (Bremer et al., 2022). This raises a crucial question: what is the fundamental origin of the relative strengths of different stickers?

This matter has recently been investigated in the context of the cationic amino acids lysine (Lys) and arginine (Arg) (Wang et al., 2018; Das et al., 2020; Schuster et al., 2020; Fisher and Elbaum-Garfinkle, 2020; Greig et al., 2020; Paloni et al., 2021; Hong et al., 2022). Despite having the same net charge, these residues turn out not to be interchangeable. Mutagenesis experiments on LAF-1 have shown that substituting Arg with Lys completely suppresses phase separation (Schuster et al., 2020). This distinction is functionally relevant, as Arg to Lys substitutions affect speckle formation (Greig et al., 2020). Additionally, experiments on the intrinsically disordered region (IDR) of Ddx4 indicate that phase separation is favoured by Arg relative to Lys (Brady et al., 2017; Das et al., 2020; Schuster et al., 2020). Das and co-workers attempted to explain arginine’s greater propensity to phase separate in Ddx4 variants using coarse-grained simulations with two different energy functions (Das et al., 2020). The model was first parametrized using a hydrophobicity scale, aimed to capture the “stickiness” of different amino acids (Dignon et al., 2018), but this did not recapitulate the correct rank order in the stability of the simulated condensates (Das et al., 2020). By replacing the hydrophobicity scale with interaction energies from amino acid contact matrices—derived from a statistical analysis of the PDB (Dignon et al., 2018; Miyazawa and Jernigan, 1996; Kim and Hummer, 2008)— they recovered the correct trends (Das et al., 2020). A key to the greater propensity to phase separate in the case of Arg may derive from the pseudo-aromaticity of this residue, which results in a greater stabilization relative to the more purely cationic character of Lys (Gobbi and Frenking, 1993; Wang et al., 2018; Hong et al., 2022).

Here we focus on the distinct roles of the main aromatic residues acting as stickers —tyrosine (Tyr) and phenylalanine (Phe)— excluding tryptophan due to its much lower abundance (Maraldo et al., 2024). Given that they only differ in a hydroxyl group, one could expect Tyr and Phe to be equally relevant to phase separation, especially considering the finding from a double-mutant cycle that Tyr-Tyr and Phe-Phe pairs make nearly identical contributions to protein stability (Serrano et al., 1991). However, experimental evidence on various proteins suggests that Tyr is a stronger driver of condensation (Lin et al., 2017; Wang et al., 2018; Schuster et al., 2020; Bremer et al., 2022). This is demonstrated by the reduced propensity to phase separate of Tyr-to-Phe mutants of FUS and LAF-1 (Lin et al., 2017; Wang et al., 2018; Schuster et al., 2020) and the opposite effect in Phe-to-Tyr mutants of hnRNPA1 (Wang et al., 2018; Bremer et al., 2022). Understanding the origin of the different sticker strengths of Phe and Tyr is important due to its functional relevance. In a large sample of hnRNPA1 variants, the fractions of Phe and Tyr have been found to co-vary, suggesting that evolutionary control of composition fine-tunes the properties of condensates (Bremer et al., 2022).

The molecular properties of Phe and Tyr may give important insights about their distinct be-haviour as stickers in condensates. Hydrophobicity scales typically rank Phe as the most hydrophobic residue (Kyte and Doolittle, 1982; Tesei et al., 2021), consistent with the greater hydration free energy of Tyr relative to Phe (Wolfenden et al., 1981; Chang et al., 2007) (see Figure 1A-B). As in the case of Arg and Lys, using hydrophobicity as a proxy for interaction energy in bead simulation models proved insufficient to explain the relative strengths of Phe and Tyr as stickers (Das et al., 2020). However, in this case, statistical contact matrices also could not capture the correct order of stickiness. In the Miyazawa and Jernigan statistical potential, Tyr-Tyr contacts are weaker than Phe-Phe (with energies of −4.17 and −7.26 in RT units, respectively; see Figure 1C) (Miyazawa and Jernigan, 1996). We note that a different hydrophobicity scale based on peptides undergoing inverse temperature transitions —i.e. the Urry scale (Urry et al., 1992), where Tyr is more hydrophobic— can account for the correct rank order in saturation concentration (Regy et al., 2021). On the other hand, the potentials of mean force between amino acids calculated with atomistic force fields have a deeper free energy well for the Tyr-Tyr pair than for Phe-Phe (Chelli et al., 2002; Joseph et al., 2021). A final data point of interest is the extremely low solubility of Tyr, over an order of magnitude smaller than that of Phe (0.045 and 2.79 g/100 g of water, respectively; see Figure 1D) (Nozaki and Tanford, 1971). This low solubility has led to the suggestion that tyrosine-tyrosine hydrogen bonds are stronger than those formed by tyrosine and water (Pace et al., 2001).

Bibliographic values for different properties of phenylalanine and tyrosine.

(A) Solvation free energy (ΔGsolv) (Chang et al., 2007). (B) Probability distributions of min-maxed normalized hydropathy values λ from bibliographic hydrophobicity scales (Tesei et al., 2021). (C) Self-interaction energy (εii) from the Miyazawa-Jernigan contact matrix (Miyazawa and Jernigan, 1996). (D) Solubility in water at 25°C (Nozaki and Tanford, 1971).

In summary, experimental results on condensates containing Phe/Tyr variants do not seem to align with either solvation free energies, most hydrophobicity scales or statistical contact potentials, but are consistent with calculations with atomistic force fields in solution and solubilities in water. One possible explanation for these conflicting findings is that, due to their level of hydration, molecular condensates may differ significantly from the tightly-packed cores of folded protein structures (Lin et al., 2017; Das et al., 2020). Here we address this paradox using a combination of classical molecular dynamics (MD) simulations and quantum chemical calculations. First, we estimate trans-fer free energies of peptides including these aromatic residues into model peptide condensates and find that they are more favourable for Tyr than for Phe, an effect that is reversed when we perform the same transformation in apolar media. DFT calculations confirm that the interaction energies in Tyr-Tyr pairs are stronger than those between Phe residues. However, the transfer free energy contribution dominates at sufficiently low dielectric constants, making Phe-Phe pairs more favourable. These findings recapitulate the right rank order of interaction strengths of aromatic stickers in biomolecular condensates but also their crossover in low dielectric media like the hydrophobic cores of folded proteins.

Materials and Methods

Classical molecular dynamics simulations

Molecular models

In this work, we report simulations of terminally-capped GGXGG peptides with X=F/Y in different media, including water, organic solvents and peptide condensates. This family of peptides has been characterised extensively in the past, both from experiment (Plaxco et al., 1997) and simulation (Workman and Pettitt, 2021). We have built structures for the peptides using Ambertools (Salomon-Ferrer et al., 2013). Our condensates are formed by an equimolar mixture of Gly, Ser and either Tyr or Phe “dipeptides” (in fact, terminally capped amino acids), as before (De Sancho, 2022). For most of our simulations we have used the Amber ff99SB⋆-ILDN force field, including corrections for backbone and side chain torsions (Best and Hummer, 2009; Lindorff-Larsen et al., 2010), and TIP3P water (Jorgensen et al., 1983). To evaluate the impact of the water model in our results, we have also prepared simulations with the TIP4P-Ew water model (Horn et al., 2004). For the simulations with different organic solvents, we used the GAFF parameters derived by Mobley and co-workers (Bannan et al., 2016). Tables with a comprehensive description of the simulation systems considered in this work can be found in Appendix 1.

Thermodynamic cycle

To estimate differences in the transfer free energy (ΔΔGTransfer) of our short peptides into a different medium —either a condensate or a different solvent— we use an alchemical (i.e. non-physical) transformation (Mey et al., 2020). Specifically, the thermodynamic cycle we construct involves four different states for the GGXGG peptide, with X=F (λ = 0) or X=Y (λ = 1), which is inserted either in water or in the alternative medium. We illustrate this thermodynamic cycle in the case of transfer into a condensate in Figure 2A. The vertical branches of the thermodynamic cycle are associated to the free energy of transfer of the model peptides from water into the condensate and ,while the horizontal branches involve the alchemical transformation (ΔGF→Y(H2O) and ΔGF→Y(Cond)). Using simulations, we can easily calculate the free energies corresponding to the horizontal branches. This enables us to estimate the desired quantity as

(a) Thermodynamic cycle used in this study for the estimation of free energy differences upon mutation for the insertion of a peptide in a molecular condensate. (B) Schematic of the process of contact formation between two molecules i and j used in the quantum chemical calculations. We consider that contact formation involves the transfer from water (blue) to a different medium (orange) and the interaction in this medium between the entities involved.

Negative ΔΔGTransfer values indicate that transferring the Tyr peptide into the condensate is more favourable than transferring the Phe peptide, whereas positive values indicate the opposite.

Simulation setup

For the simulation runs in pure solvents, we simply inserted the GGXGG peptide in an octahedral box, leaving at least 1 nm of distance to each wall. Then the boxes were filled with the corresponding solvent molecules. For the simulations in the ternary model condensates, we first inserted the peptide at the central position of a rectangular box with dimensions 8.5 nm × 4.5 nm × 4.5 nm, containing an equimolar mixture of Gly, Ser and either Tyr or Phe dipeptides, which we term GSY and GSF, respectively. Then, the box was expanded up to 20 nm and filled with water molecules. This results in a simulation box where the peptide of interest is immersed in a protein-dense slab in contact with a dilute phase.

The equilibration protocol is identical in all cases. First, we ran an energy minimization using a steepest descent algorithm, followed by short simulations in the NVT and NPT ensembles, using the Berendsen (Berendsen et al., 1984) and velocity rescaling (Bussi et al., 2007) thermostats, respectively, to set the temperature at 298 K, and the Berendsen barostat (Berendsen et al., 1984) to set the pressure at 1 bar on the latter. Finally, we performed equilibrium simulations in the NPT ensemble using a leap-frog stochastic dynamics integrator with a time-step of 2 fs at 298 K and setting the pressure to 1 bar with the Parrinello-Rahman barostat (Parrinello and Rahman, 1980). The duration of the equilibrium runs was 1 μs for the runs with a condensate slab configuration and 500 ns for all other solvents (see Appendix 1). We run triplicates of the GSY and GSF simulations. Note that for each system-solvent combination, two different sets of simulations had to be performed, one for each of the alchemical (λ) states corresponding to the central peptide residue being Phe (λ = 0) or Tyr (λ = 1).

For the alchemical transitions, we use a non-equilibrium switching method (Seeliger and de Groot, 2010; Gapsys et al., 2015; Aldeghi et al., 2019) often used for the calculation of differences in binding affinities (Aldeghi et al., 2018; Gapsys et al., 2020) or free energy changes upon mutation (Gapsys et al., 2016; Aldeghi et al., 2019; Martinez-Martin et al., 2023). After discarding the first 100-200 ns from the long equilibrium runs, we selected 100 evenly spaced snapshots as initial states. Then, we run short (50 ps) simulation trajectories where the λ-state was switched forward (λ = 0 → 1) or backward (λ = 1 → 0).

We have run additional simulations of the GSY and GSF condensates in the absence of the longer peptide using the temperature replica exchange method (REMD) (Sugita and Okamoto, 1999). For these simulations, we first set up systems using the protocol described above for the slab simulations. Then, we ran a 50 ns, high-temperature simulation at 500 K in the NVT ensemble. From this trajectory, we dumped 48 randomly selected snapshots that were used as initial conformations for the different replicas. We ran REMD for 500 ns at temperatures ranging between

293.5 and 411.6 K. Replica swaps were attempted every 2 ps. The first 200 ns for each replica were discarded from the analysis. To study interactions in the biomolecular condensates, we isolated the dense phase in the slab from the REMD replica at room temperature. This was then briefly re-equilibrated as before, and production simulations of the condensates were run for 1 μs.

For all simulations, we have used the Gromacs software package (Abraham et al., 2015) (v.2024). To generate hybrid topologies for the different λ-states, we have used the PMX software (Seeliger and de Groot, 2010; Gapsys et al., 2015; Aldeghi et al., 2019) (v. 3.0).

Analysis

The simulations have been analysed using a combination of analysis programs in the PMX and Gromacs packages (Abraham et al., 2015) and in-house Python scripts that make extensive use of the MDTraj library (McGibbon et al., 2015). For our free energy estimates, we have analysed the non-equilibrium switching simulations using the PMX tools (Seeliger and de Groot, 2010; Gapsys et al., 2015; Aldeghi et al., 2019). Specifically, we calculate the work values from the forward and backward transitions and then estimate free energy differences between the initial and final states using Bennet’s Acceptance Ratio (Bennett, 1976). For more details, we refer to the PMX papers (Seeliger and de Groot, 2010; Gapsys et al., 2015; Aldeghi et al., 2019). Errors were estimated from bootstrapping, except in the case of the GSY and GSF condensates, where we estimate the standard error of the mean from the three replicates.

To obtain the phase diagrams, we calculate the peptide densities of the dilute and dense phases (ρL and ρH, respectively) from the REMD trajectories. These were obtained after fitting the density profile to the expression

where xDS and t are, respectively, the position and width of the dividing surface between dense and dilute phases (Tesei et al., 2021). Then, using data only for temperatures where we could clearly distinguish between dense and dilute phases, we obtained the critical temperature (Tc) from a fit to the equation

where β = 0.325 is the critical exponent (Dignon et al., 2018). The critical density (ρc) was derived from the law of rectilinear diameters

We used gmx dipoles to calculate the dielectric constant (ε) for different condensates and solvents from the fluctuations in the dipole moment of the simulation box, M, which is defined as (Neumann, 1983)

In this expression, ε0 is the vacuum permittivity, kB is Boltzmann’s constant, and V is the volume of the simulation box. For the calculation of the dielectric constant of condensates, we used the simulations of isolated dense phases mentioned above. Error analysis has been performed by blocking or determined as the standard error of the mean when derived from multiple trajectories.

Quantum chemical calculations

We performed quantum mechanical calculations using the Gaussian 16 program to study the interaction energy of the different complexes. All calculations were made at the Density Functional Theory (DFT) level, using the ωB97XD functional (Chai and Head-Gordon, 2008) and Pople’s 6-311++G(d,p) basis set (Hehre et al., 1972) for geometry optimization. Energetics were refined with the 6-311++G(3df,2p) basis set. Geometry optimizations were done in different solvents, using the Polarizable Continuum Model approach (Mennucci, 2012). The interaction energy in a given solvent was calculated according to

where i and j specify the interacting monomers. We consider p-cresol to represent the side chain of tyrosine (Y, hereafter), toluene to represent the side chain of phenylalanine (F hereafter) and N-methyl-acetamide to represent an amide bond group (A, hereafter). Dimers ij include different combinations of Y, F, and A in various orientations.

To calculate the free energy transfer of a monomer from water to a given solvent, we used the SMD solvation model (Marenich et al., 2009) to calculate accurate solvation-free energies of the monomers in different solvents, using B3LYP/6-31+G(d,p) level of theory (Becke, 1993; Chengteh et al., 1988). Thus, the transfer free energy from water to a given solvent s is calculated according to

As solvents, we have considered three different groups:

  1. Alcohols: methanol, ethanol, 1-propanol, 1-butanol, 1-poentanol, 1-hexanol, 1-nonanol and 1-decanol.

  2. Aliphatic solvents: 2-nitropropane, 2-bromopropane, 1-bromopropane, 1-bromopentane, 1-fluorooctane, n-pentane, n-decane nitrobenzene, o-dichlorobenzene, nromobenzene, iodobenzene, ethylbenzene, benzene, toluene and cyclohexane.

  3. Water with varying dielectric values.

These solvents span a broad range of values of the dielectric constant and can be viewed to represent different chemical environments where contact formation can take place.

Results and Discussion

Transfer free energies into a condensate slab

Our first goal is to find whether differences in transfer free energy for Phe and Tyr variants capture the hierarchy in sticker strength derived from experiments. Ideally one would calculate the transfer free energies for the protein of interest inside a realistic protein condensate. However, this would involve simulations of multiple copies of an IDR with tens or hundreds of amino acids, resulting in a system with very slow conformational dynamics (Paloni et al., 2020; Zheng et al., 2020; Galvanetto et al., 2023). As a computationally efficient alternative, we use a simple model peptide with sequence GGXGG, with X being either F or Y, which is transferred into the dense mixtures of short peptides we have recently characterized (De Sancho, 2022) (see Figure 3A). Peptide condensates are useful model systems as they retain many properties of protein condensates but allow for much greater computational efficiency (Paloni et al., 2020; Tang et al., 2021; De Sancho, 2022; Brown and Potoyan, 2024).

(A) Representative simulation box with a fully solvated GSY condensate in slab geometry including a GGXGG peptide (spheres) and the capped amino acid mixture (G: white, S: yellow and Y: green). (B) Time series for the solvent accessible surface area (SASA) in a representative trajectory of the GGXGG peptide within the GSY condensate for different values of λ. (C) Time evolution of the density profiles calculated across the longest dimension of the simulation box (L) in the coexistence simulations. In blue we show the density in mg/mL of all the peptides, and in dark red that of the F/Y residue in the GGXGG peptide.

We define the thermodynamic cycle in Figure 2 and estimate free energies associated to its horizontal branches using alchemical transformations, which we performed using a state-of-theart non-equilibrium switching method (Seeliger and de Groot, 2010; Gapsys et al., 2015; Aldeghi et al., 2019) (see Methods). The first step in this procedure involves running long equilibrium simulations of the peptide of interest at the initial λ-states (i.e. where X=F for λ = 0 and X=Y for λ = 1). While these calculations are straightforward for a peptide in solution, within the condensate, we must first ensure that the condensate slab has converged to a stable density. Also, the peptide that experiences the transformation, which is not restrained, must remain buried within the condensate for all the snapshots that we use as initial frames, to avoid averaging the work in the dilute and dense phases. In Figure 3B, we show the solvent accessible surface area (SASA) of all protein residues in a representative trajectory of the GSY condensate with the model peptide submerged. After an initial increase in SASA, the values stabilise and remain flat for the remainder of the simulation, which is indicative of stable dense and dilute phases. We also show the density profiles for all the amino acid residues in the condensate and for the mutated residue (Figure 3B). For both λ values, we find that during the whole duration of the simulation, the GGXGG peptide remains buried within the condensate. Very similar results were obtained in three independent replicates started from different initial configurations of the peptide slab (see Figure S1 in Appendix 2).

We discarded the initial 100 ns of these long trajectories and selected initial configurations for non-equilibrium switching from evenly spaced snapshots of the λ = 0 and λ = 1 datasets. For each of the replicates, we ran 100 independent simulations forward and backwards, collecting the values of (∂H/∂λ). Alchemical transformations were also performed in the same way for the peptide in TIP3P water (see Methods). Then we estimate transfer free energy differences by combining data from the transformations within the condensate and in water (see Appendix 2, Figure S2A and B). We find that ΔΔGTransfer is small and negative (−2.9±0.5 kJ/mol), indicating that transfer into the peptide condensate consisting of Gly, Ser and Tyr is more favourable for the Tyr-containing peptide than for the Phe variant. This result is consistent with mounting experimental evidence that suggests a greater propensity for Tyr to form condensates compared to Phe in variants of proteins such as FUS, LAF1, and hnRNPA1.

A limitation of our model peptide condensates is their high protein densities, which are larger than those of full-length IDRs (Zheng et al., 2020). This may be influenced by the propensity of the TIP3P water model to induce compact states (Piana et al., 2015). We have repeated the same procedure using the TIP4P-Ew water model (Horn et al., 2004), to estimate ΔΔGTransfer for the GGXGG peptide into a GSY condensate (see Figure S3 in Appendix 2). In combination with force fields from the Amber ff99SB family, this water model resulted in improved properties for peptides in dilute (Nerenberg and Head-Gordon, 2011) and dense peptide solutions (Miller et al., 2016). With this force field-water model combination, we have obtained a value of −2.6±0.6 kJ/mol (see Figure S4), which is in good agreement with the result in TIP3P water.

One possible explanation for these results is the formation of preferential sticker-sticker interactions between the GGYGG peptide and the Tyr residues in the GSY condensate. While the hydroxyl group of Tyr can form hydrogen bonds, such interactions are expected to be depleted for the GGFGG variant. Conversely, if the condensate were composed of Phe instead of Tyr, there would be no reason to favour GGYGG. Moreover, both hydrophobicity considerations and interaction energies from contact matrices suggest that transfer into a GSF condensate should be more favourable for the GGFGG peptide than for GGYGG. To test this hypothesis, we have run the same procedure in a Gly/Ser/Phe (GSF) peptide condensate. We show the time series data for SASA and protein density for the GSF simulations in Figure S5 of Appendix 2. Combining these results with those from the simulations in water, we obtain a value for the transfer free energy difference of ΔΔGTransfer =-2.5±0.5 kJ/mol (see Appendix 2, Figure S2C). Hence, irrespective of the aromatic within the condensate, transfer is more favourable for the Tyr-containing peptide.

Similar protein-protein interaction patterns in Tyr and Phe condensates

To gain further insight into the dominant interactions, we have run additional simulations of isolated dense phases for GSY and GSF condensates, without the GGXGG peptide (see Figure 4A). First, we look at the statistics for the different interaction pairs inside the condensates. We find that the interaction patterns are very similar in both cases, with a greater number of contacts formed by the aromatic stickers and smaller contributions from Gly and Ser (see Figure 4B). This result recapitulates our previous observations (De Sancho, 2022) and also those from atomistic simulations of low complexity regions of FUS and Ddx4 (Zheng et al., 2020). The greater contact probability of Tyr/Phe relative to Gly/Ser is consistent with the scaffold-client relationship between aromatic and polar residues (Dignon et al., 2020).

(A) Representative snapshots from the simulations of dense phases. G/S/F/Y dipeptides are shown as transparent surfaces (G: white, S: yellow, Y: green, F: purple). (B) Interaction matrix for the normalized number of contacts between different pairs of amino acid residues in the condensates (top) and for each type of amino acid (bottom). (C) Density plots for side chain interactions between aromatic side chains, as characterised by the mean inter-residue distance and the angle θ between the vectors normal to the rings (Calinsky and Levy, 2024). (D) Density plots for sp2-π interactions between amide bonds and aromatic side chains, as characterised by the mean inter-group distance and the angle θ between the vectors normal to the peptide bond and ring planes. In all panels, results for GSY and GSF condensates are shown on the top and bottom, respectively. Representative snapshots of relevant interactions for each type of pair are shown.

To focus on ππ interactions between pairs of aromatic side chains, we have also calculated the mean inter-side chain distances and the angles between vectors normal to the aromatic rings (θ). In Figure 4C, we show their distribution, which is very similar in the GSY and GSF condensates. Even though planar ππ stacking is feasible, it does not play a dominant role (see the regions with low inter-residue distance and θ angles close to 0 or π). We also examine sp2π contacts formed between aromatic and peptide bond groups, which are common in our peptide dense phases. This is consistent with previous simulation work (Zheng et al., 2020; Murthy et al., 2021; Rekhi et al., 2024) and also with a statistical analysis of experimental structures from the PDB, which showed that these interactions are particularly prevalent in disordered regions of folded proteins (Vernon et al., 2018). Interestingly, also for these sp2π interactions, we find no significant differences in interaction patterns between GSY and GSF condensates.

Same interaction patterns, different stabilities

Our results indicate that Tyr’s greater ability to promote condensate formation, relative to Phe, is not primarily due to differences in protein–protein interaction patterns between GSY and GSF condensates. Importantly, this similarity does not imply that the two systems have an equal propensity for phase separation, as the energetics of the interactions may differ. To interrogate this point, we have calculated the phase diagrams for the GSY and GSF condensates. We have used REMD simulations to estimate the peptide density of the dilute and dense phases at the temperatures where these can be resolved (see snapshots at high and low temperatures in Figure 5A-B and density profiles in Figure S6 of Appendix 2). In Figure 5C, we show the calculated phase diagrams for both systems, which have remarkably similar shapes. However, the saturation density (ρsat), i.e. the amount of peptide required for the system to phase separate, is slightly lower in the GSY condensate than in GSF. Additionally, GSY has a slightly greater critical temperature (Tc).

(A) Representative snapshots of REMD simulations of the GSY condensate at high (top) and low (bottom) temperatures. Colour code as in Figure 3. (B) Same for the GSF condensate. (C) Phase diagrams for GSY (green) and GSF (purple). Empty circles correspond to simulations and filled circles correspond to fitted critical temperatures (Tc) and densities (ρc).

For simplicity, we have calculated the densities of all the peptides in the mixture as if they were partitioning uniformly. However, a more rigorous analysis would involve treating stickers and spacers separately, as we did in our more systematic study for mixtures with different concentrations of stickers and spacers (De Sancho, 2022). Given the role as a “scaffold” of the aromatics in the dense phases, we expect them to be overrepresented in the condensate relative to the spacers (“clients”). In Figure S7 we show the density profiles of aromatics at room temperature, where we find that the saturation density of Tyr is lower than that of Phe. This overemphasises, that, even if Phe and Tyr play similar roles as scaffolds in peptide condensates, Tyr has a stronger propensity to phase separate, in agreement with the hierarchy of sticker strenghts observed experimentally.

Condensates as hydrated peptide solvents

We have thus far found that condensate formation by Tyr and Phe is mediated by similar contact patterns and interaction modes, dominated by contacts between aromatic residue pairs, which are more favourable for Tyr. An important driver of the observed behaviour may involve differences in protein-solvent interactions by the Phe/Tyr peptides. To investigate this, we begin by analysing the distribution of water molecules around the aromatic side chains in the dense phase simulations of GSF and GSY (see Fig. 4A), revealing notable differences. In Figure 6B, we show the radial distribution function (g(r)) for water oxygen around the Cζ in the aromatic side chains. In the case of Tyr, there is a prominent peak in g(r) at short distances due to the presence of water molecules involved in hydrogen bonds. These interactions are enthalpically favourable and entropically unfavourable due to the water structuring around the Tyr hydroxyl group (Lin et al., 2017). The different thermo-dynamic contributions are intertwined in the calculated ΔΔGTransfer. The net balance will depend on whether the peptide is transferred from water into a condensate, into the core of a folded protein or an altogether different solvent, capable or not of hydrogen bonding.

(A) Radial distribution function for water oxygen around the Cζ in Phe/Tyr for GSF and GSY condensates. We show a representative overlay of simulation snapshots where water molecules are hydrogen-bonded to the Tyr side chain. (B) Transfer free energy differences from water to a different medium between Tyr and Phe. We consider condensates (green), polar solvents (orange) and apolar solvents (blue). (C) Same as a function of calculated dielectric constant, ε.

The picture that emerges from our transfer free energy calculations is one where condensates act as media capable of interacting with the peptide via different types of interactions, including ππ and sp2π contacts. To investigate the effect of different environments, we have performed the same type of calculations using a battery of solvents with diverse properties. These include cyclohexane or octanol —used in the past as models for the interior of folded proteins (Kauzmann, 1959; Nozaki and Tanford, 1971; Radzicka and Wolfenden, 1988; Wimley et al., 1996; Pace et al., 2011)— and also shorter alcohols or acetone. The results of these calculations are shown, together with those of the different peptide condensates and solutions, in Figure 6B. In the case of a purely non-polar solvent like cyclohexane, the difference in free energy from Phe to Tyr is large and positive (ΔΔGTransfer = 13.2±0.3 kJ/mol), indicating that Phe is favoured. This is also the case for aromatic solvents like benzene and toluene (with ΔΔGTransfer of 6.8±0.3 and 7.5±0.3 kJ/mol, respectively). Lacking partners for the interactions that favour Tyr in the target solvent, the greater hydrophobicity of Phe dominates in the net ΔΔGTransfer. This is consistent with the stronger interaction strength of Phe-Phe interactions in the Miyazawa and Jernigan potential (Miyazawa and Jernigan, 1996) (see Figure 1C), derived from contacts in the apolar cores of proteins. The difference in free energy is smaller for alcohols and decreases gradually as the length of the aliphatic chain decreases. For acetone, transfer of Tyr is slightly more favourable (ΔΔGTransfer =-1.2±0.2 kJ/mol).

A parameter that we can determine from our MD trajectories characterising the properties of the different solvents, and also the condensates, is the dielectric constant (ε, see Methods). Clearly, the ΔΔGTransfer values have a dependence on the value of ε (see Figure 6C). For the GSY and GSF condensates, ε values range between 30 and 50, surprisingly close to that derived from experiments on Ddx condensates using Flory-Huggins theory (45±13) (Nott et al., 2015) and from atomistic simulations of Ddx4 (~35−50 at a volume fraction of ϕ =0.4) (Das et al., 2020). In summary, for non-polar solvents with low ε, Phe is preferred to Tyr, while for polar solvents, including the condensates, the trend is reversed. At intermediate values of ε, there is a crossover between a regime dominated by hydrophobicity and one favoured by the stronger interactions established by Tyr.

Quantum calculations confirm a crossover in interaction strength

To characterise these subtle effects in greater detail, we resort to quantum chemical calculations, which can more accurately determine interaction strengths for specific configurations involving aromatic residues (Carter-Fenk et al., 2023; Calinsky and Levy, 2024). We consider different modes of interaction between the side chains of Phe and Tyr, represented by toluene (F) and p-cresol (Y), respectively. We considered various orientations in the optimization process and selected the two most stable structures for each case. For F-F, the most stable structures correspond to textitcross and antiparallel configurations, whereas in the case of Y-Y, H-bonded and antiparallel are most favoured (see Figure 7A and C). Our results are consistent with previous work on the toluene dimer, where the energy minima also corresponded to the cross and antiparallel structures, with only minor differences in their energies (Tsuzuki et al., 2005; Rogers et al., 2006). Other structures, like parallel or T-shaped, are less stable for toluene, although in the benzene dimer, T-shaped conformers can be favoured (Chipot et al., 1996). In our optimisations, T-shaped F-F structures tended to collapse to less stable parallel structures, and hence we do not include them here. The interaction energy values we obtained are qualitatively similar to those in the literature (Calinsky and Levy, 2024), though slightly more negative, and they follow the same rank order (i.e. Y-Y H-bonded < antiparallel Y-Y < cross F-F). We also calculate the energies for Phe and Tyr side chains interacting with an amide group (A) representing a peptide bond in the prevalent sp2-π interactions (Vernon et al., 2018) (see Figure 7C). In the case of Tyr, we have also analyzed the configuration where it forms a hydrogen bond with the amide through the alcohol group of p-cresol.

(A) and (B) for different interaction pairs ij and different solvents as a function of the dielectric constant, ε. (C) Optimized geometries from quantum chemical calculations for Phe and Tyr interaction pairs. (D) with respect to the cross-F-F interaction for different solvents as a function of the dielectric. Different symbols correspond to the different solvent types. Filled: alcohols; empty: alkanes/benzenes; lines: water with different dielectric.

To explore the environmental dependence in contact formation for amino acid pairs, we have calculated values of the interaction energies considering a set of protic and non-protic solvents (see Figure 7A). Additionally, we have run the same calculations using a modified water with the dielectric constant as a free parameter. The overall trends in interaction energies are very similar across interaction types. In all cases, the interaction energy tends to become stronger as the value of the dielectric decreases. Also, the relative strength of interactions is consistent across solvents. The strongest interaction corresponds to the Y-A hydrogen-bonded interaction. Next comes a series of partial π-stacking configurations with contributions of hydrogen bonds and sp2-π interactions, namely, Y-Y H-bond, Y-A sp2-π and F-A sp2-π structures, which we have found to be particularly abundant in our MD runs. Finally, the π-stacked F-F cross is the weakest interaction among those included in this analysis.

Therefore, if we consider these interaction energies as the primary driving force for condensate formation, Tyr should be more adhesive than Phe independently of the environment. However, as realized by Miyazawa and Jernigan (Miyazawa and Jernigan, 1996; Thomas and Dill, 1996), contact formation, either within the core of folded proteins or in the case of molecular condensates, involves two processes: first, the transfer of these groups from water to an environment with a lower dielectric; and second, the interaction between side chain groups (or with the backbone), which is captured by (see Figure 2B). Hence, for a given solvent, the total energy can be calculated from the sum of these two contributions as

The last two terms on the right-hand side of this expression capture the propensity of a monomer to be transferred from water to the solvent, which can be modelled using polarizable continuum models such as SMD (Marenich et al., 2009). Notice that when the solvent is a low dielectric non-protic solvent such as cyclohexane, recapitulates the results from a standard hydropho-bicity scale (Kyte and Doolittle, 1982).

When both contributions are considered, the picture changes dramatically (see Figure 7B). In non-protic aliphatic solvents, such as alkanes and benzene derivatives, interaction for F-F pairs becomes more favourable than in Y-Y pairs, irrespective of the dielectric constant. On the other hand, in alcohols, tends to favour Y-A interactions at high ε and F-F pairs at low ε. In other words, at intermediate dielectrics, we find a crossover between F and Y in the propensity to form pairwise contacts in solvents. In our calculations, we have also considered a water-like solvent where the dielectric constant can be tuned as a free parameter (see Figure 7C). Also, we find a similar tendency in this case, but now the crossover is obtained at slightly lower values of ε. In Figure 7D, we show values, where we are using as reference the energy of the F-F cross configuration, for three selected interactions: Y-Y H-bond, Y-A H-bond, and F-A, for the three types of environments. Negative values of indicate a preference of the corresponding interaction relative to F-F cross, while positive values indicate that cross-F-F is stronger. Clearly, there is a crossover between a regime where toluene forms stronger interactions and one where p-cresol is preferred, which is highly influenced by the type of environment. In a non-protic solvent, where hydrogen bonding is not possible, or in a protic solvent with low dielectric constant, we find that F-F interactions are favoured. In contrast, in high dielectric environments, Y-Y and Y-A interactions dominate.

Conclusions

In this work, we have aimed to reconcile the apparent contradictions between the lower solvation-free energy (Wolfenden et al., 1981; Chang et al., 2007) and stronger hydrophobicity of Phe (Kyte and Doolittle, 1982; Tesei et al., 2021) and the greater contact energies assigned to Phe-Phe interactions in proteins (Miyazawa and Jernigan, 1996) with the higher sticker strength of Tyr in biomolecular condensates (Lin et al., 2017; Wang et al., 2018; Schuster et al., 2020; Bremer et al., 2022). In the past, the different propensities of positively charged stickers to form condensates had been explained from the fundamental properties of Lys and Arg (Wang et al., 2018; Das et al., 2020; Gobbi and Frenking, 1993; Hong et al., 2022). However, an explanation for the different strengths of Phe and Tyr remained elusive (Das et al., 2020).

Through a combination of classical MD simulations and quantum mechanical calculations, we have found that, in addition to the distinct types of interactions formed by Phe and Tyr, a crucial factor in their overall propensity for contact formation across different environments is the transfer free energy. At one extreme, we find environments such as the well-packed core of a protein, where the dielectric constant can be as low as 2-4 (Pitera et al., 2001). This type of environment can be modelled using aliphatic solvents, an approach followed extensively in the past (Kauzmann, 1959; Nozaki and Tanford, 1971; Radzicka and Wolfenden, 1988; Wimley et al., 1996; Pace et al., 2011). In this low-dielectric regime, Phe-Phe interactions are the most favourable, as evidenced by both our classical transfer free energies (Figure 6C) and the estimated quantum values (Figure 7).

This explains why Phe forms the strongest interactions in contact matrices like that by Miyazawa and Jernigan, which were derived from statistics from 3D structures of proteins (Miyazawa and Jernigan, 1996) and are known to be dominated by hydrophobicity effects (Pokarowski et al., 2005). At the other extreme, in environments with high dielectric constants such as aqueous or polar solvent conditions, Tyr-Tyr interactions become more favourable, as inferred from potentials of mean force derived from atomistic molecular simulations (Chelli et al., 2002; Joseph et al., 2021). Biomolecular condensates, due to their high water content (~200-300 mg/mL in our simulations, but up to ~600 mg/mL in experiments, (Zheng et al., 2020)), are best modelled as a solvent with an intermediate dielectric and favourable interactions between stickers. Hence, they operate in the second regime, although the influence of the dielectric is subtle, due to the possibility of crossover. We note that, although we have not included in our analysis positively charged residues that form cation-π interactions with aromatics, the observed crossover will also be relevant to Arg/Lys contacts with Phe and Tyr. Following the rationale of our findings, within condensates, cation-Tyr interactions are expected to promote phase separation more strongly than cation-Phe pairs. Our findings align with a recent report from Rekhi et al. using the equilibrium alchemical techniques that emphasizes the role of transfer free energy in biomolecular condensates (Rekhi and Mittal, 2025).

The conclusions from our work are necessarily limited due to our reductionist approach. First, in our calculations, we are considering very short peptides including a single sticker, and hence we cannot account for the constraints imposed by chain connectivity (Shimizu and Chan, 2001) or the influence of multivalency in protein IDRs (Martin et al., 2020). Also, we have used a minimal alphabet of possible interaction pairs limited to two types of amino acid residues. We do not consider effects relative to cation-π interactions that have been characterised by others (Brady et al., 2017; Das et al., 2020; Aldeghi et al., 2018; Hazra and Levy, 2023; Calinsky and Levy, 2024). In our calculations, the minimal peptides result in high protein densities in the dense phases, which are nonetheless consistent with those found by other authors in peptides with greater variation in length and sequence space (Workman et al., 2024; Brown and Potoyan, 2024; Emelianova et al., 2025). Additionally, results may vary somewhat when considering different classical force fields. Recent work points to a substantial variation in hydrophobicity across force fields (Lobo et al., 2025). Another potential limitation may arise from the choice of water model, as the dielectric constant —which plays an important role in our observations— is known to vary significantly between water models. In the case of the TIP3P model used in most of our calculations, it exceeds the experimental value (ε =100±5, (Braun et al., 2014)). However, using the TIP4P-Ew water model (with ε =64±1, (Horn et al., 2004)), we have found consistent results. Careful benchmarking of force fields against properties of peptide condensates offers a promising avenue for future efforts in force-field calibration.

The methodology proposed in the present work can be easily extended to other types of interactions. The overall qualitative conclusions on the crossover in interaction strengths as a function of the environment are likely to be maintained if different quantum approaches are used for contact calculations or different force fields are used for the classical simulations. The context dependence of interaction strengths that we observe is related to a growing body of work in the study of peptide self-assembly (Kaygisiz et al., 2025). The effect has deep roots in the concepts of “pair” and “bulk” hydrophobicity (Wood and Thompson, 1990; Shimizu and Chan, 2002). While pair hydrophobicity is connected to dimerisation equilibria (i.e. the second step in Figure 2B), bulk hydrophobicity is related to transfer processes (the first step). Our work stresses the importance of considering both the pair contribution that dominates at high solvation, and the transfer free energy contribution, which overwhelms the interaction strength at low dielectrics.

Acknowledgements

The authors gratefully acknowledge conversations with Priya R. Banerjee, Rohit Pappu and Tanja Mittag, which inspired this research. The work has been financed by the Spanish Ministry of Science and Innovation through project PID2021-127907NB-I00, and the Basque Government (Project IT1584-22). The authors also thank the IZO-SGI SGIker (UPV/EHU/ERDF,EU) and DIPC for technical and human support and for the allocation of computational resources. The authors thankfully acknowledge RES resources provided by the Barcelona Supercomputing Center in MareNostrum5 (BCV-2025-2-0002).

Appendix 1

Supporting Tables

Copies of each of the terminally capped amino acid residues in the condensate simulations.

For GSY and GSF condensates, we use an equimolar mixture of all three components.

Number of molecules used in pure solvent simulations for the determination of dielectric constants.

Simulation times for each of the families of systems considered in this study.

For GSY and GSF condensates, we run triplicates. For the simulations in pure solvents (including water), a single simulation run was performed for each system.

Appendix 2

Supporting Figures

Replicates of the GGXGG peptide simulations in the GSY condensate.

(A) Time series for the accessible surface area for different values of λ. (B) Density profiles of peptide (solid lines) and F/Y residue (right Y-axis) averaged over the simulation runs. (C) Time series for the density profiles of all peptides (blue) and the F2Y residue (red).

Work values for the forward and reverse alchemical transitions for different initial snapshots (bottom) and their distributions (top).

The BAR estimator is shown in the legend. Each panel corresponds to a different solvent. (A) TIP3P water. (B) Replicates of GSY condensates. (C) Replicates of GSF condensates.

Results for the GSY condensate in the TIP4P-Ew water model.

(A) Representative simulation box with a fully solvated GSY condensate in slab geometry including a GGXGG peptide (spheres) and the capped amino acid mixture (G: white, S: yellow and Y: green). (B) Time series for the solvent accessible surface area (SASA) for different values of λ. (C) Time evolution of the density profiles calculated across the longest dimension of the simulation box (L) in the coexistence simulations.

Results for the GSY condensate in the TIP4P-Ew water model.

Work values and their distribution for forward and backward transitions within water (A) and the GSY condensate (B).

Replicates of the GGXGG peptide simulations in the GSF condensate.

(A) Time series for the accessible surface area for different values of λ. (B) Density profiles of peptide (solid lines) and F/Y residue (right Y-axis) averaged over the simulation runs. (C) Time series for the density profiles of all peptides (blue) and the F2Y residue (red).

Density profiles at different temperatures for the GSY (A) and GSF (B) condensates.

Density profiles for sticker residues across the simulation box from the REMD simulations at room temperature.

Work values for the forward and reverse alchemical transitions for different initial snapshots (left) and their distributions (right).

The BAR estimator is shown in the legend. Each panel corresponds to a different solvent: (A) Acetone. (B) Benzene. (C) Cyclohexane. (D) Ethanol. (E) Hexanol. (F) Methanol. (G) Octanol. (H) Toluene.