Evolution towards simplicity in bacterial small heat shock protein system

Piotr Karaś; Klaudia Kochanowicz; Marcin Pitek; Przemyslaw Domanski; Igor Obuchowski; Bartlomiej Tomiczek; Krzysztof Liberek

doi:10.7554/eLife.89813.2

eLife assessment

This valuable study advances our understanding of the evolution of protein complexes and their functions. Through convincing experimental and computational methodologies, the authors show that the specialization of protein function following gene duplication can be reversible. The work will be of interest to investigators working in biochemical evolution and those working on heat shock proteins.

https://doi.org/10.7554/eLife.89813.2.sa2

Significance of findings

valuable: Findings that have theoretical or practical implications for a subfield

Strength of evidence

convincing: Appropriate and validated methodology in line with current state-of-the-art

During the peer-review process the editor and reviewers write an eLife assessment that summarises the significance of the findings reported in the article (on a scale ranging from landmark to useful) and the strength of the evidence (on a scale ranging from exceptional to inadequate).

Abstract

Evolution can tinker with multi-protein machines and replace them with simpler single-protein systems performing equivalent functions in an equally efficient manner. It is unclear how, on a molecular level, such simplification can arise. With ancestral reconstruction and biochemical analysis we have traced the evolution of bacterial small heat shock proteins (sHsp), which help to refold proteins from aggregates using either two proteins with different functions (IbpA and IbpB) or a secondarily single sHsp that performs both functions in an equally efficient way. Secondarily single sHsp evolved from IbpA, an ancestor specialized in strong substrate binding. Evolution of an intermolecular binding site drove the alteration of substrate binding properties, as well as formation of higher-order oligomers. Upon two mutations in the α-crystallin domain, secondarily single sHsp interacts with aggregated substrates less tightly. Paradoxically, less efficient binding positively influences the ability of sHsp to stimulate substrate refolding, since the dissociation of sHsps from aggregates is required to initiate Hsp70-Hsp100-dependent substrate refolding. After the loss of a partner, IbpA took over its role in facilitating the sHsp dissociation from an aggregate by weakening the interaction with the substrate, which became beneficial for the refolding process. We show that the same two amino acids introduced in a modern-day system define whether the IbpA acts as a single sHsp or obligatorily cooperates with an IbpB partner. Our discoveries illuminate how one sequence has evolved to encode functions previously performed by two distinct proteins.

Introduction

Gene birth and loss is a hallmark of protein family evolution; however, the molecular determinants and genetic mechanisms enabling that process are not well understood (Fernandez & Gabaldon, 2020; Worth et al, 2009). Gene loss and differential retention of paralogues reshape the divergence of organisms in Animalia, Archaea and Bacteria alike (Fernandez & Gabaldon, 2020; Iranzo et al, 2019; Puigbo et al, 2014). Examples of gene loss often involve adaptive changes in response to changing environmental niches, like loss of genes encoding olfactory receptors in primates (Demuth & Hahn, 2009) or differential retention of paralogous genes encoding venom toxins in different rattlesnake lineages (Dowell et al, 2016). In prokaryote genomes, gene loss is one of the main evolutionary processes accelerating sequence divergence leading to functional innovations (Puigbo et al., 2014). In complex protein systems execution of a cellular function can be shared between several proteins. In bacteria the cost of maintaining an additional gene copy is very high, and maintaining a low gene count is important for keeping the replication energy costs low (Kempes et al, 2017; Lever et al, 2015; Lynch & Marinov, 2015). Still, it remains unclear how a multi-protein system can undergo simplification. Here we asked which molecular events enabled the gene loss, and how one of the biochemical functions was taken over by the other protein. We investigated these questions using the small heat shock protein (sHsp) system, which underwent simplification within Enterobacterales (which include common bacterial species such as Escherichia coli, Salmonella enterica and Erwinia amylovora), as a model (Fig. 1).

sHsp systems in *Enterobacteriaceae* and *Erwiniaceae*
Left - schematic phylogeny of sHsps in *Enterobacterales*. Gene duplication resulting in IbpA + IbpB two-protein system is marked with a star, while the loss of *ibpB* gene in *Erwiniaceae* clade is marked with a cross. AncA₀ – reconstructed last common ancestor of IbpA from *Erwiniaceae* and *Enterobacteriaceae*, expressed as a part of two-protein system. AncA₁ – reconstructed last common ancestor of secondarily single IbpA from *Erwiniaceae*. Right - representative extant sHsps’ ability to stimulate luciferase refolding. sHsps were present during the luciferase thermal denaturation step. Refolding of denatured luciferase was performed by the Hsp70-Hsp100 chaperone system. Activity of luciferase was measured after 1h refolding at 25 °C and shown as an average of at least three repeats ± standard deviation.

sHsps are a family of ATP-independent molecular chaperones present in all living organisms with various copy numbers (ten representatives in human) (Haslbeck & Vierling, 2015). They bind misfolded proteins and sequester them into refolding-prone assemblies, preventing uncontrolled aggregation and helping to maintain proteostasis under stress conditions. sHsps are composed of a highly conserved α-crystallin domain (ACD), in the form of a so-called β-sandwich, flanked by less conserved, unstructured N- and C-terminal regions (Haslbeck & Vierling, 2015; Haslbeck et al, 2019; Reinle et al, 2022). Their smallest functional unit is usually a dimer, formed by the interaction between ACDs of two neighboring sHsps. Stable sHsp dimers in turn tend to form variable and dynamic higher-order oligomers, stabilized by N- and C-terminal region interactions. In particular, the interaction between the IXI motif, highly conserved in sHsp C-termini, and the cleft formed by the β4 and β8 strands of the ACD is critical for oligomer formation (Haslbeck & Vierling, 2015; Kennaway et al, 2005; Mani et al, 2016; Strozecka et al, 2012). Oligomers of bacterial sHsps reversibly dissociate into smaller forms when the temperature increases. This is considered to be their activation mechanism, probably uncovering substrate interaction sites. The mechanism of sHsps’ interaction with misfolded substrates is not yet fully understood, but both the N-termini and the β4-β8 cleft region have been found to play a role in this process (Basha et al, 2006; Fuchs et al, 2009; Jaya et al, 2009; Lee et al, 1997; Reinle et al., 2022).

In most Enterobacterales a two-protein sHsp system exists, consisting of IbpA and IbpB proteins (Mogk et al, 2003; Obuchowski et al, 2019). IbpA and IbpB originated via duplication, form a heterodimer partnership and are functionally divergent from one another (Obuchowski et al., 2019; Pirog et al, 2021) (Fig. 1A,B). IbpA is specialized in tight substrate binding (sequestrase activity), while IbpB is required for dissociation of both sHsps from the aggregates, a step necessary to initiate Hsp70-Hsp100-dependent substrate disaggregation and refolding (Obuchowski et al, 2021; Ratajczak et al, 2009). In a subset of Enterobacterales (Erwiniaceae), as a result of ibpB gene loss, the secondarily single-protein sHsp (IbpA) system has emerged (Fig. 1). The term “secondarily single” is used in order to distinguish it from single-protein IbpA from clades in which the duplication did not occur (for example Vibrionaceae) (Obuchowski et al., 2019). How did IbpA evolve to become independent of its partner? In this study, using ancestral reconstruction, we identify mutations that allowed the secondarily single IbpA to be fully functional without its partner in substrate sequestration and handover to Hsp70-Hsp100-mediated disaggregation and refolding.

Results

New activity of Erwiniaceae IbpA has evolved in parallel to ibpB gene loss

To better understand the evolution of sHsps after gene loss, we reconstructed the IbpA ancestors from before and after the loss of its IbpB partner. This technique uses multiple sequence alignments of modern-day proteins from different species to infer amino acid sequences of their common ancestors (Ashkenazy et al, 2012; Pupko et al, 2002) and is widely used to investigate various evolutionary questions (Gaucher et al, 2008; Longo et al, 2020; Thomson et al, 2005; Thornton et al, 2003). We created a multiple sequence alignment of 77 IbpA sequences from Enterobacterales (supplementary file 1), from which we inferred the phylogeny of IbpA using the maximum likelihood method (Fig. 2, supplementary file 2 – phylogenetic tree in newick format). From that we inferred the ancestral sequences that have the highest probability of producing the modern-day sequences, using the empirical Bayes method (Ashkenazy et al., 2012; Cohen & Pupko, 2011; Cohen et al, 2008; Pupko et al., 2002; Simmons & Ochoterena, 2000). Next, we resurrected (i.e., expressed and purified) the last ancestor of IbpA present before (AncA₀) and after (AncA₁) the differential gene loss (Fig. 2, supplementary file 3).

IbpA phylogeny in *Enterobacterales*:
Phylogeny was reconstructed from 77 IbpA orthologs from *Enterobacterales* using Maximum Likelihood algorithm with JTT + R3 substitution model. AncA₀ – node representing the last common ancestor of IbpA from *Erwiniaceae* and *Enterobacteriaceae*. AncA₁ – node representing the last common ancestor of IbpA from *Erwiniaceae*. Bootstrap support is noted for the major nodes. Extant IbpAs from *E. coli* and *E. amylovora* are marked with a red frame. Scale bar – substitutions per position.

Similarly to modern-day IbpA proteins, both AncA₀ and AncA₁ were fully folded and reversibly deoligomerized into smaller species under elevated temperature (Fig. 3 – figure supplement 1). Moreover, both ancestral proteins were able to sequester aggregating firefly luciferase in sHsp-substrate assemblies. AncA₀ exhibited sequestrase activity at a level comparable to that of IbpA from Escherichia coli (IbpA_E.coli). AncA₁ was moderately efficient in this process and IbpA from Erwinia amylovora (IbpA_E.amyl) was the least efficient sequestrase (Fig. 3A). The differences in sequestrase activity were especially pronounced at lower sHsp concentrations. Next, we tested their ability to bind protein aggregates in real time (Fig. 3B). The ancestral proteins’ interaction with the aggregated substrates was stronger than that of extant E. amylovora IbpA, but weaker than that of extant E. coli IbpA (Fig. 3B).

Functional changes during the evolution of secondarily single sHsp in *Erwiniaceae*.
(A) Sequestrase activity of extant and ancestral sHsps. Luciferase was heat denatured in the presence of different concentrations of sHsps and the size of the formed sHsp-substrate assemblies was measured by DLS. Results are shown as average hydrodynamic diameter ± standard deviation. (B) Binding of extant and ancestral sHsps to heat-aggregated *E. coli* proteins. *E. coli* proteins were heat aggregated and immobilized on a BLI sensor. sHsps were heat activated before the binding step. (C) Extant and ancestral sHsps’ ability to stimulate luciferase refolding. Experiment was performed at 25 °C. Luciferase activity at each timepoint is shown as an average of at least three repeats ± standard deviation.

Finally, we asked how the modification of the substrate aggregation process by reconstructed proteins influences subsequent substrate refolding by the Hsp100 and Hsp70 chaperones. AncA₁ stimulated luciferase refolding; however, its effectiveness was around half that of both analyzed extant sHsp systems (single IbpA from E. amylovora or the IbpA + IbpB system from E. coli), similar to extant IbpA from E. coli without its IbpB partner. AncA₀, in contrast, inhibited luciferase refolding in comparison to the control (no sHsps at the substrate aggregation step) (Fig. 3C).

To test the robustness of our observations, we repeated the analysis for the alternative ancestors, which have the second highest probability of producing the modern-day sequences (AltAll) (Fig. 3 – figure supplement 2, supplementary file 3 – posterior probabilities) (Eick et al, 2017). Both AltAll variants behaved similarly to the most likely (ML) variants in reversible deoligomerization, sequestrase activity and stimulation of substrate refolding assays (Fig. 3 – figure supplement 3 A-D). However, for the last property (the influence on refolding), a higher Hsp70 system concentration was required to observe the activity of the AltAll variants (Fig. 3 – figure supplement 3 C,D). Together, these data show that the reconstructed ML and AltAll ancestors are functional. Of particular interest, these data indicate that the ability of reconstructed sHsps to stimulate Hsp70-Hsp100-dependent substrate refolding arose between the A₀ and A₁ nodes.

We performed a molecular evolution analysis to test for positive selection across the IbpA phylogeny using both branch models and branch-site models in codeml (Jeffares et al, 2015; Yang, 1998, 2007; Yang & Nielsen, 2002). With both tests, the analysis shows a significantly increased ratio of nonsynonymous to synonymous substitutions at the branch leading to A₁, after the gene loss. This result indicates that the new IbpA functionality likely arose due to an episode of positive selection rather than genetic drift (Fig. 4A, Supplementary file 4 A,B). The result of the branch-site test indicates possible positive selection acting on all sites substituted at the branch leading to A₁ with pp>0.5; therefore, we aimed to identify the minimum number of mutations that are responsible for the change in functional properties of IbpA.

Substitutions at positions 66 and 109 that occurred between nodes A₀ and A₁ are crucial for ancestral sHsps to work as a single protein.
Luciferase refolding assay was performed as in Fig. 1. Activity of luciferase was measured after 1h refolding at 25 °C and shown as an average of at least three repeats ± standard deviation. (A) Schematic phylogeny of *Enterobacterales* IbpA showing increased ratio of nonsynonymous to synonymous substitutions (ω) on the branch between nodes AncA₀ and AncA₁. Loss of cooperating IbpB is marked on the tree. Value of Likelihood Ratio Test (LRT) is given for the selection model. (B) Identification of substitutions necessary for AncA₀ to obtain AncA₁-like activity in luciferase disaggregation; seven candidate mutations were introduced into AncA₀ (AncA₀ +7); subsequently, in a series of six mutants, each of the candidate positions was reversed to a more ancestral state (AncA₀ + 6* variants). (C) Effect of substitutions at positions 66 and 109 on the ability of AncA₀ and AncA₁ to stimulate luciferase refolding.

Identification of residues defining ancestral sHsps activities

In order to identify amino acids responsible for the observed new functionality of AncA₁, we compared the sequences of the two ancestral proteins, selecting seven out of ten substitutions as probable candidates. Three substitutions were removed from the analysis based on the low conservation of these positions in extant proteins (Fig. 4 – figure supplement 1). The remaining seven were introduced into AncA₀. The resulting AncA₀ +7 protein stimulated Hsp70-Hsp100-dependent luciferase refolding at a level comparable to AncA₁ (Fig. 4B). To further specify key mutations, we prepared seven additional variants. In each variant a different position in AncA₀ +7 was reversed to a more ancestral state. A substantial decrease in luciferase refolding stimulation was observed for positions 66 and 109 (Fig. 4B). Next, each of these substitutions on its own (Q66H or G109D) was separately introduced into AncA₀. This was not sufficient to increase the ability of AncA₀ to stimulate luciferase refolding. However, when both substitutions were introduced simultaneously, the resulting sHsp exhibited activity similar to AncA₁ (Fig. 4C). What is more, when these two positions in AncA₁ were reversed to the AncA₀-like state, the resulting sHsp lost the ability to stimulate luciferase refolding (Fig. 4C). All analyzed proteins, namely AncA₀ +7, AncA₀ Q66H G109D and AncA₁ H66Q D109G, possess biochemical properties characteristic of sHsps, exhibiting reversible thermal deoligomerization and sequestrase activity (Fig. 4 – figure supplement 2 A,B). Together, these results show that substitutions Q66H and G109D are both sufficient and necessary for the increase in activity observed for ancestral sHsps between the A₀ and A₁ nodes.

Identified substitutions influence α-crystallin domain properties

Substitutions Q66H and G109D, responsible for gaining single sHsp activity, are located in the α-crystallin domain (ACD) within the β4 and β8 strands, which form a cleft responsible for the interaction with the unstructured C-terminal peptide of the neighboring sHsp dimer (Fig. 5A). In order to identify possible structural underpinnings of the single sHsp activity, we predicted structures of AncA₀ and AncA₀ Q66H G109D ACD dimers in complex with the C-terminal peptide using AlphaFold2 and in silico mutagenesis and subjected them to 0.5 μs equilibrium molecular dynamics (MD) simulations. Analysis of the C-terminal peptide interface contact probabilities in MD trajectories showed that both substituted residues contact the C-terminal peptide, although the overall contact pattern remains similar upon their introduction (Fig. 5 – figure supplement 1) and no major differences in the overall ACD structure were observed (Fig. 5 – figure supplement 2). To explore the possibility that the identified substitutions affect the strength of this interaction, we analyzed binding of purified ACDs of AncA₀ and AncA₀ Q66H G109D to the C-terminal peptide using biolayer interferometry. Titrations of immobilized C-terminal peptide by different ACDs (Figs. 5B, 5 – figure supplement 3) allowed us to determine the dissociation constants. These two substitutions increased the K_0.5 of ACD binding to the C-terminal peptide from 4.3 μM to 7.1 μM, at the same time increasing the Hill coefficient of the interaction from 2.3 to 3.7, indicating a modest decrease in affinity accompanied by an increase in binding cooperativity.
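To make the reported change in K_0.5 and Hill coefficient more tangible, the following minimal sketch evaluates the Hill equation, θ = cⁿ / (K_0.5ⁿ + cⁿ), with the fitted parameters quoted above. The function name, the dictionary and the example concentrations are illustrative assumptions and are not taken from the study; the sketch is not the fitting procedure used by the authors.

```python
def hill_fraction_bound(conc_um: float, k_half_um: float, n: float) -> float:
    """Fraction bound under the Hill equation: c^n / (K_0.5^n + c^n)."""
    if conc_um < 0 or k_half_um <= 0:
        raise ValueError("concentration must be >= 0 and K_0.5 must be > 0")
    c_n = conc_um ** n
    return c_n / (k_half_um ** n + c_n)


# Fitted values quoted in the text: (K_0.5 in uM, Hill coefficient n).
FITS = {
    "AncA0 ACD": (4.3, 2.3),
    "AncA0 Q66H G109D ACD": (7.1, 3.7),
}

if __name__ == "__main__":
    # Example concentrations (uM) are arbitrary and chosen only for illustration.
    for name, (k_half, n) in FITS.items():
        for conc in (2.0, 5.0, 10.0):
            theta = hill_fraction_bound(conc, k_half, n)
            print(f"{name}: c = {conc:4.1f} uM -> fraction bound = {theta:.3f}")
```

By construction, each curve reaches half saturation at its own K_0.5, and a larger Hill coefficient makes the transition around that concentration steeper.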

Substitutions at positions 66 and 109 decreased the affinity of AncA₀ ACD to C-terminal peptide and aggregated substrate.
(A) Structural model of the complex formed by AncA₀ Q66H G109D α-crystallin domain dimer (purple and lilac) and AncA₀ C-terminal peptide (orange). (B) Effect of Q66H G109D substitutions (green) on AncA₀ (purple) ACD’s affinity to the C-terminal peptide assayed by BLI. Biolayer thickness at the end of the association step was used to calculate the fraction of bound peptide. Filled circles represent means of triplicate measurements, individual data points are shown as hollow circles and were fitted to a cooperative binding model (Hill equation). Values of fitted binding affinities [K_0.5] (AncA₀ 4.3 ± 0.2 μM, AncA₀ Q66H G109D 7.1 ± 0.2 μM) and Hill coefficients [n] (AncA₀ 2.3 ± 0.17, AncA₀ Q66H G109D 3.7 ± 0.34) are indicated on the plot. (C) Effect of Q66H G109D substitutions on AncA₀ ACD’s affinity to aggregated *E. coli* proteins bound to BLI sensor. Analysis was performed as in Fig. 3A.

As this interaction is known to play a crucial role in formation of sHsp oligomers (Fu et al, 2005; Mani et al., 2016; Strozecka et al., 2012), we used dynamic light scattering to investigate how Q66H and G109D substitutions influence the size of oligomers formed by AncA₀ at different temperatures. In agreement with decreased affinity between ACD and the C-terminal peptide, we have shown that these substitutions slightly decrease the oligomer size and facilitate AncA₀ deoligomerization (Fig. 5 – figure supplement 4 A,B).

The β4-β8 cleft in certain sHsps, in addition to its interaction with the C-terminal peptide, was also shown to participate in interactions with substrates and partner proteins (Fuchs et al., 2009; Jaya et al., 2009; Lee et al., 1997; Reinle et al., 2022). Therefore, we tested whether the ACD of AncA₀ binds protein aggregates and whether this interaction is influenced by the Q66H G109D substitutions. We observed that AncA₀ ACD efficiently binds to either aggregated E. coli lysate or aggregated luciferase, and this binding was weakened by the analyzed substitutions (Figs. 5C, 5 – figure supplement 5). This suggests that the ACD of bacterial sHsps interacts with the substrate, most likely through the β4-β8 cleft.

These results allow us to conclude that substitutions Q66H and G109D in AncA₀ substantially increased the sHsp ability to stimulate Hsp70-Hsp100-dependent substrate refolding by weakening the interaction of the β4-β8 cleft with both the C-terminal peptide and the aggregated substrates. Despite their ability to bind aggregated substrates in the biolayer interferometry assay, the analyzed ACDs did not exhibit sequestrase activity and were unable to positively influence substrate refolding by the Hsp70-Hsp100 system (Fig. 5 – figure supplement 5 B,C).

Identified substitutions define the mode of action of extant sHsps

As the more ancestral, AncA₀-like state in positions 66 and 109 is conserved in IbpA of E. coli, while the more modern, AncA₁-like state is conserved in IbpA of E. amylovora, we asked whether this difference is sufficient to explain functional differences between the two extant proteins. Therefore, we introduced AncA₁-like substitutions into IbpA_E.coli and AncA₀-like substitutions into IbpA_E.amyl. The resulting IbpA_E.coli Q66H G109D, in comparison to wild type IbpA_E.coli, exhibited increased ability to stimulate Hsp70-Hsp100-dependent luciferase refolding, as well as a decreased ability to bind aggregated substrates, becoming more similar to modern E. amylovora IbpA. At the same time, IbpA_E.amyl H67Q D110G stimulated luciferase refolding significantly less efficiently, while exhibiting increased ability to bind aggregated substrates in comparison to wild type IbpA_E.amyl (Fig. 6A-C). Still, both new IbpA variants exhibited properties characteristic of sHsps, namely reversible thermal deoligomerization and sequestrase activity (Fig. 6 – figure supplement 1 A,B).

Differences at positions 66 and 109 determine functional differences between extant IbpA proteins from *E. coli* and *E. amylovora*.
(A) Effect of substitutions at positions 66 and 109 (and homologous) on the ability of IbpA from E. amylovora and E. coli to stimulate luciferase refolding. Assay was performed as in Fig. 1B. Activity of luciferase was measured after 1h refolding at 25 °C and shown as an average of at least three repeats ± standard deviation. (B, C) Effect of substitutions at analyzed positions on binding of IbpA from E. coli (B) and E. amylovora (C) to heat-aggregated E. coli proteins. Assay was performed as in Fig. 3A. (D-H) Effect of substitutions at analyzed positions on inhibition of Hsp70 system binding to aggregates by extant sHsps. (D) Experimental scheme. (E-H) Aggregate-bound sHsps differently inhibit Hsp70 binding. BLI sensor with immobilized aggregated luciferase and aggregate-bound sHsps was incubated with Hsp70 or buffer (spontaneous dissociation curve). Grey traces present Hsp70 binding to immobilized aggregates in the absence of sHsps. Results are presented as an average of at least three repeats ± standard deviation.

The above results suggest that tight sHsp binding to aggregates negatively affects the subsequent Hsp70-Hsp100-dependent substrate refolding process. This process is initiated by binding of the Hsp70 system (DnaK and cochaperones DnaJ and GrpE) to aggregates, which requires sHsps to be outcompeted from the aggregates (Zwirowski et al, 2017). To gain insight into the competition between sHsps and Hsp70, we modified the biolayer interferometry experiments and introduced the sensor with sHsps bound to luciferase aggregates into a buffer containing the Hsp70 system (Fig. 6D). Although biolayer interferometry cannot distinguish between proteins bound to the sensor, we took advantage of the differences in the thickness of the protein layers specific for sHsp or Hsp70 binding and also in the binding kinetics. The analysis of Hsp70 binding to the aggregates covered with sHsps shows that the presence of IbpA_E.amyl or IbpA_E.coli Q66H G109D on aggregates only weakly inhibits Hsp70 binding (Fig. 6E,H). In contrast, the inhibition is much more pronounced when IbpA_E.amyl H67Q D110G or IbpA_E.coli are present on aggregates (Fig. 6F,G).

All of the above experiments indicate that two specific amino acids at positions 66 and 109 in the ACD of IbpA proteins define the mode of IbpA activity. Glutamine 66 and glycine 109 are characteristic of IbpA proteins which bind tightly to substrates and thus are not easily outcompeted from the aggregates by Hsp70s. Such IbpAs require IbpB partner cooperation to function properly. Substitutions at these positions to histidine (position 66) and aspartic acid (position 109) allowed for the emergence of a single sHsp which binds to aggregating substrate less tightly and can be outcompeted from the aggregates by Hsp70s in the absence of IbpB.

Discussion

In this study we traced, at the molecular level, how the two-protein sHsp system, a part of the cellular protein refolding machinery, underwent simplification such that its biochemical functions are performed by a single protein. In most Enterobacterales two sHsps (IbpA and IbpB) drive sequestration of misfolded proteins into reactivation-prone assemblies (Obuchowski et al., 2019). Together, IbpA and IbpB form a functional heterodimer, in which IbpA specializes in substrate binding, preventing the substrates from creating large aggregates (sequestrase activity), while IbpB promotes sHsp dissociation from the aggregates, which is required for subsequent Hsp70-Hsp100-dependent substrate refolding (Obuchowski et al., 2019; Pirog et al., 2021). We showed that, in parallel to the ibpB gene loss in Erwiniaceae, new properties of IbpA have emerged, i.e. a lower substrate sequestrase activity, which correlates with efficient substrate refolding. We have identified two amino acid substitutions (Q66H and G109D) responsible for this new IbpA functionality. Selection analysis shows that these two substitutions were likely driven by positive selective pressure. This indicates that this change possibly had an adaptive character in the ancestral background. It is important to note, however, that the models used for the analysis do not account for variation of synonymous substitution rate and multinucleotide substitution events, which in some cases might lead to false positive results (Lucaci et al, 2023). Because of that, the alternative hypothesis that substitutions Q66H and G109D occurred in the common ancestor of Erwiniaceae due to genetic drift, enabling the subsequent loss of the ibpB gene, cannot be fully discounted. Functional differences observed between modern-day sHsps from E. coli and E. amylovora are at least partially defined by the presence of specific amino acids in these two positions and can be diminished by swapping them between the extant proteins.
The occurrence of these substitutions in the last common ancestor of Erwiniaceae IbpA resulted in a decreased affinity of the ACD’s β4-β8 cleft to aggregated substrates as well as to the C-termini of other sHsps. These interactions might be of particular importance for stabilization of sHsps on the surface of sequestered aggregated substrates, leading to the formation of a so-called protective shell preventing further uncontrolled aggregation. The apparent role of the interaction between C-termini and ACD is in agreement with earlier studies, showing that addition of free C-terminal peptide causes E. coli IbpA and IbpB dissociation from the outer shell of the sHsp-substrate complex (Zwirowski et al., 2017). It was also shown that, in the case of Hsp16.6 from the cyanobacterium Synechocystis, substitutions that slightly weakened the interaction between ACD and C-termini led to increased stimulation of luciferase refolding. However, abolishing this interaction resulted in a non-functional protein, most likely due to the loss of the sequestrase activity (Giese & Vierling, 2002). Destabilization of the protective shell of secondarily single IbpA by weakening these interactions had an effect functionally analogous to the role of IbpB in the two-protein system, facilitating IbpA dissociation from the substrate (Obuchowski et al., 2019). The two identified substitutions also weaken the ACD interaction with aggregated substrates, which is an additional factor shifting the sHsp balance towards dissociation, a step necessary to initiate Hsp70-Hsp100-dependent disaggregation and refolding.

Our results show how ACD substitutions can fine-tune the sHsp system by exerting pleiotropic effects on ACD-C-terminal peptide and ACD-substrate interactions. These relatively small changes strongly influence the effectiveness of sHsp functioning in the complex process of aggregated protein rescue by molecular chaperones. Such fine-tuning might be particularly important in the case of a conserved interaction, like the one between the C-terminal peptide and ACD, where excessive changes of affinity may be detrimental to the overall protein function (Giese & Vierling, 2002). Our approach enabled us to find functional residues in the sHsp system, which would not have been possible using conventional mutagenesis, and highlights the importance of using a vertical approach in biochemical studies. This study closely follows the evolutionary process in which mutations in one of the two cooperating proteins tinker with it to the point where it becomes independent of its partner, enabling the simplification of a more complex system by partner loss while maintaining its overall function.

Following, with molecular precision, the genetic events associated with the gene loss allowed us to answer several questions about protein family evolution. The first question concerns how the lost function is incorporated into a remaining partner protein. The results of our experiments indicate that even though the primary function of a partner protein is maintained, it is altered so that it does not interfere with the newly incorporated one. In our case substrate binding properties, as well as formation of higher-order oligomers, were altered in a way that maintains the sequestrase function but allows for more efficient stimulation of Hsp70-Hsp100-dependent substrate refolding. The second question concerns the context-dependence of mutations in protein evolution. We successfully transplanted the mutations that appeared in E. amylovora into the E. coli IbpA ortholog, artificially creating an efficient single-protein system from a protein that normally needs a partner for efficient substrate refolding. In contrast to other studies (Natarajan et al, 2023), the context of the E. coli IbpA protein did not influence the ability of IbpA_E.coli Q66H G109D to work without the IbpB partner. This raises the question of the accessibility of adaptive solutions. In the case of E. coli IbpA there was no need for adaptive changes because the gene loss did not occur in this clade, suggesting that the loss of a protein can push another one towards an adaptation, which leads to finding efficient molecular innovations.

Materials and Methods

Reconstruction of IbpA phylogeny

Amino acid sequences of 77 IbpA orthologs from Enterobacterales were obtained from NCBI and UniProt databases and aligned using Clustal Omega (Sievers et al, 2011). The alignment was trimmed manually. JTT+R3 was identified as the best-fit model by iq-tree, using the Bayesian Information Criterion, and was used in the analysis (Kalyaanamoorthy et al, 2017; Nguyen et al, 2015). The phylogenetic tree was inferred using iq-tree on the basis of 328 iterations of ML search with 100 rapid bootstrap replicates (Nguyen et al., 2015).
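As a reminder of how model selection by the Bayesian Information Criterion works in general, the sketch below computes BIC = k·ln(n) − 2·lnL for a set of candidate models and picks the one with the lowest score. The model names, log-likelihoods, parameter counts and alignment length are invented for illustration only; they are not values from this study, and the snippet does not reproduce the internals of iq-tree.

```python
import math
from typing import Dict, Tuple


def bic(log_likelihood: float, n_free_params: int, n_sites: int) -> float:
    """Bayesian Information Criterion: k * ln(n) - 2 * lnL (lower is better)."""
    return n_free_params * math.log(n_sites) - 2.0 * log_likelihood


def best_model(candidates: Dict[str, Tuple[float, int]], n_sites: int) -> str:
    """Return the name of the candidate with the lowest BIC.

    `candidates` maps a model name to (log-likelihood, number of free parameters).
    """
    return min(candidates, key=lambda name: bic(*candidates[name], n_sites))


# Hypothetical candidates; every number here is made up for the example.
CANDIDATES = {
    "model_simple": (-5230.0, 151),
    "model_rich": (-5080.0, 155),
}
N_SITES = 140  # hypothetical alignment length

if __name__ == "__main__":
    for name, (lnl, k) in CANDIDATES.items():
        print(f"{name}: BIC = {bic(lnl, k, N_SITES):.1f}")
    print("selected:", best_model(CANDIDATES, N_SITES))
```

The penalty term k·ln(n) means that a richer model is preferred only when its gain in log-likelihood outweighs the cost of its additional parameters.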

Reconstruction of ancestral IbpA amino acid sequences

Ancestral sequence reconstruction was performed on the basis of a multiple sequence alignment of 77 amino acid sequences of IbpA orthologs from Erwiniaceae and Enterobacteriaceae as well as a phylogenetic tree of those orthologs (see above). Marginal reconstruction of ancestral sequences was performed with the FastML program, based on the ML algorithm and a Bayesian approach, using the JTT substitution matrix with gamma parameter (Ashkenazy et al., 2012; Cohen & Pupko, 2011; Cohen et al., 2008; Jones et al., 1992; Pupko et al., 2002; Simmons & Ochoterena, 2000).

Alternative ancestral sequences for the AncA₀ and AncA₁ proteins were obtained by substituting the most likely amino acid at every uncertain position (defined as a position with more than one amino acid with posterior probability ≥ 0.2) with the amino acid with the second highest posterior probability (Eick et al., 2017).
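The rule for building alternative ancestors can be illustrated with a minimal Python sketch. The function name, the input layout (one dictionary of posterior probabilities per site) and the three-site toy data are assumptions made for illustration; they are not taken from FastML output or from the reconstructed AncA₀ and AncA₁ sequences.

```python
def alt_all_sequence(posteriors, threshold=0.2):
    """Build an alternative ancestor from per-site posterior probabilities.

    posteriors: list of dicts, one per alignment site, mapping
    amino acid (one-letter code) -> posterior probability.
    (Hypothetical input layout, chosen for this sketch.)
    """
    sequence = []
    for site in posteriors:
        # Rank the states at this site from most to least probable.
        ranked = sorted(site.items(), key=lambda kv: kv[1], reverse=True)
        plausible = [aa for aa, p in ranked if p >= threshold]
        if len(plausible) > 1:
            # Uncertain site: take the second most likely amino acid.
            sequence.append(plausible[1])
        else:
            # Unambiguous site: keep the most likely amino acid.
            sequence.append(ranked[0][0])
    return "".join(sequence)


# Toy posteriors for three sites (made-up numbers).
toy = [
    {"M": 0.99, "L": 0.01},
    {"Q": 0.55, "H": 0.40, "N": 0.05},
    {"G": 0.70, "D": 0.15, "S": 0.15},
]
ml = "".join(max(site, key=site.get) for site in toy)  # most likely sequence
alt = alt_all_sequence(toy)                            # alternative sequence
print(ml, alt)
```

Only the second toy site has two states at or above the 0.2 threshold, so it is the only position at which the alternative sequence differs from the most likely one.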

Analysis of natural selection

Analysis of natural selection was performed using codeml. First, PAL2NAL was used to obtain a codon alignment based on the multiple sequence alignment of amino acid sequences of IbpA orthologs from Enterobacterales (see above) and the corresponding nucleotide sequences obtained from the NCBI database. The resulting codon alignment was then trimmed manually and used together with the phylogenetic tree obtained earlier (see above) for the selection analysis.
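The idea behind converting a protein alignment into a codon alignment can be sketched in a few lines of Python. This is an illustration of the principle only, not the tool's implementation: the function name and the toy sequences are hypothetical, and the sketch assumes the coding sequence is in frame and matches the protein residue by residue.

```python
def thread_codons(aligned_protein, cds, gap="-"):
    """Map an ungapped coding sequence onto one gapped protein alignment row.

    Each residue consumes one codon (3 nt); each alignment gap becomes
    a three-character gap, so codon columns stay in register.
    """
    n_residues = sum(1 for aa in aligned_protein if aa != gap)
    if len(cds) < 3 * n_residues:
        raise ValueError("coding sequence is shorter than the protein requires")
    codons, pos = [], 0
    for aa in aligned_protein:
        if aa == gap:
            codons.append(gap * 3)
        else:
            codons.append(cds[pos:pos + 3])
            pos += 3
    return "".join(codons)


# Toy example: protein row "M-KV" and its (made-up) coding sequence.
row = thread_codons("M-KV", "ATGAAAGTT")
print(row)
```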

For branch model analysis, models M0 (null hypothesis) and two-ratio (with either the AncA₀-AncA₁ branch or the Erwiniaceae clade as foreground) were used. For branch-site model analysis, models A null (null hypothesis) and A were used, with foreground branches selected as above. Statistical significance of different models was estimated with the Likelihood Ratio Test (LRT) (Jeffares et al., 2015; Yang, 1998, 2007; Yang & Nielsen, 2002).
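As a sketch of how such a likelihood ratio test is commonly evaluated, the snippet below compares the log-likelihoods of a null and an alternative model against a χ² distribution. The log-likelihood values are invented for illustration, and the choice of one degree of freedom is an assumption that matches nested comparisons differing by a single free parameter; the actual values and statistics are those reported in Supplementary file 4.

```python
from scipy.stats import chi2


def likelihood_ratio_test(lnl_null, lnl_alt, df=1):
    """Return the LRT statistic 2*(lnL_alt - lnL_null) and its chi-square p-value."""
    # Numerical noise can make the statistic slightly negative; clamp at zero.
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    p_value = chi2.sf(stat, df)
    return stat, p_value


# Made-up log-likelihoods for a null and an alternative model.
stat, p = likelihood_ratio_test(lnl_null=-1234.56, lnl_alt=-1230.12, df=1)
print(f"2*dlnL = {stat:.2f}, p = {p:.4f}")
```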

Protein Purification

Purification of IbpA proteins

pET3a plasmids containing ancA₀, ancA₁, ancA₀ +7, ancA_{0 alt_all}, ancA_{1 alt_all} and ibpA_Ea genes were ordered from GenScript. Point mutations were introduced using site-directed mutagenesis and confirmed by sequencing. Proteins were overproduced in E. coli BL21(DE3). Cells were then lysed by sonication in a Qsonica sonicator (13% amplitude, 2 min 30 s process time, 15 s pulse-ON time, 45 s pulse-OFF time) in lysis buffer L1 (50 mM Tris pH 7.5, 50 mM NaCl, 5 mM EDTA, 10% glycerol, 5 mM β-mercaptoethanol). The insoluble fraction containing the proteins of interest was separated by centrifugation (75 000 x g, 30 min, 4°C), resolubilized in buffer A (40 mM Tris pH 7.5, 50 mM NaCl, 10% glycerol, 5 mM β-mercaptoethanol, 6 M urea) and then centrifuged again (75 000 x g, 30 min, 4°C). The supernatant was loaded on a Q-Sepharose chromatography column equilibrated with buffer A and eluted with a 50 mM - 500 mM NaCl gradient. Fractions containing the proteins of interest were then dialyzed against buffer B (40 mM Tris pH 8.5, 50 mM NaCl, 10% glycerol, 5 mM β-mercaptoethanol) and loaded on a Q-Sepharose chromatography column equilibrated with buffer B. The flow-through fraction was collected and dialyzed against buffer C (50 mM Tris pH 7.5, 150 mM KCl, 5% (v/v) glycerol, 5 mM β-mercaptoethanol).

Purification of ACD domains

ACDs of IbpA_Ec, AncA₀ and AncA₀ Q66H G109D were purified as described previously (Pirog et al., 2021) and, as a final step, dialyzed against buffer G (50 mM Tris pH 7.5, 150 mM KCl, 5 mM β-mercaptoethanol).

Purification of His₆-SUMO and His₆-SUMO-C-terminal peptide of AncA₀ construct

His₆-SUMO was purified using the Champion™ pET SUMO Expression System. A pET28a plasmid containing the gene encoding His₆-SUMO fused with the C-terminal peptide of AncA₀ (PEAMKPPRIEIN) was ordered from GenScript. Proteins were overproduced in E. coli BL21(DE3). Cells were then lysed by sonication in a Qsonica sonicator (20% amplitude, 2 min process time, 5 s pulse-ON time, 10 s pulse-OFF time) in lysis buffer L2 (40 mM Tris pH 7.5, 100 mM NaCl, 10% glycerol, 10 mM imidazole, 2 mM β-mercaptoethanol). Insoluble fractions were separated by centrifugation for 30 min at 70 000 x g, and the supernatants, containing the proteins of interest, were incubated for 1 h with Ni-NTA resin equilibrated with buffer L2. Resins were then washed with buffer D (40 mM Tris pH 7.5, 100 mM NaCl, 10% glycerol, 40 mM imidazole, 2 mM β-mercaptoethanol). Proteins of interest were eluted from the columns with buffer E (40 mM Tris pH 7.5, 100 mM NaCl, 10% glycerol, 400 mM imidazole, 2 mM β-mercaptoethanol) and then dialyzed against buffer C (as above).

DnaK, DnaJ, GrpE, ClpB and IbpA_Ec proteins were purified as described previously (Ratajczak et al., 2009). IbpB_Ec protein was purified as described previously (Pirog et al., 2021). His-tagged luciferase used for BLI measurements was purified as described previously (Obuchowski et al., 2019).

Protein purity was assessed by SDS-PAGE with Coomassie Blue staining. Protein concentrations were measured using the Bradford assay, with bovine serum albumin as a standard. In the case of His₆-SUMO fused with the C-terminal peptide of AncA₀, concentration was measured by SDS-PAGE with Coomassie Blue staining coupled with densitometric analysis, with bovine serum albumin used as a standard.

QuantiLum® Recombinant Luciferase was purchased from Promega. Creatine kinase from rabbit muscle was purchased from Sigma-Aldrich.

Luciferase refolding assay

Recombinant firefly luciferase (1.5 μM) in buffer F (50 mM Tris pH 7.5, 150 mM KCl, 20 mM MgCl₂, 2.5 mM DTT) was denatured by incubation for 10 min at 44°C alone or in the presence of 10 μM sHsps (3 μM IbpA_Ec + 7 μM IbpB_Ec in the case of the two-protein system from E. coli). Denatured luciferase was then incubated at 25°C with the Hsp70 system (1 μM DnaK, 0.4 μM DnaJ and 0.3 μM GrpE), 2 μM ClpB, and an ATP regeneration system (5 mM ATP, 0.1 mg/ml creatine kinase and 18 mM creatine phosphate). For the experiment presented in Fig.3 – figure supplement 3D, a higher concentration of the Hsp70 system was used (2 μM DnaK, 0.8 μM DnaJ and 0.6 μM GrpE). At different timepoints, luciferase activity was measured with a GLOMAX™ 20/20 luminometer, using the Luciferase Assay System from Promega. Results are presented as averages of at least three independent repeats ± standard deviation.

DLS measurements

Dynamic Light Scattering measurements were performed using Malvern Instruments ZetaSizer Nano S instrument, at 40 μl sample volume, scattering angle 173° and wavelength of 633 nm. For every measurement, minimum ten subsequent series of ten 10-s runs were averaged and particle size distribution was calculated by fitting to 70 size bins between 0.4 and 10,000 nm, as previously described (Zwirowski et al., 2017).

For the reversible deoligomerization assay, the size of oligomers formed by 10 μM sHsps in buffer F (50 mM Tris pH 7.5, 150 mM KCl, 20 mM MgCl₂, 2.5 mM DTT) was measured by DLS. The first measurement was performed at 25°C, and the sample was then heated to 44°C, cooled to 25°C, heated to 44°C and cooled to 25°C, with measurements performed after each change in temperature. Results are presented as size distribution by volume.

To measure the influence of substitutions on oligomer formation, the size of oligomers formed by either 10 μM AncA₀ or 10 μM AncA₀ Q66H G109D in buffer F (as above) was measured by DLS at 25, 27, 29, 31, 33, 35, 37, 39, 41, 43 and 45°C. Results are presented as size distribution by intensity (for temperatures 25°C, 35°C and 45°C) or as an average hydrodynamic diameter corresponding to the maximum of the dominant peak of the size distribution by volume, plotted against temperature ± standard deviation.

For assembly formation (sequestrase) assay, 1.5 μM firefly luciferase in buffer F was denatured alone or in the presence of different sHsp concentrations by incubation at 44°C for 10 min. Size of obtained luciferase aggregates was then measured by DLS at 25°C. Results are presented as an average hydrodynamic diameter of measured particles weighted by intensity (Z-average) ± standard deviation.

Biolayer interferometry (BLI) measurements

sHsp interactions with aggregated luciferase or aggregated E. coli lysate were measured using the Octet® K2 system. An anchoring layer of His-tagged luciferase was attached to Octet® NTA Biosensors by 5 min incubation in 0.6 mg/ml His-tagged luciferase under denaturing conditions in buffer UF (50 mM Tris pH 7.5, 4.5 M urea, 150 mM KCl, 20 mM MgCl₂, 2.5 mM DTT) at 25°C with 350 rpm shaking. Sensors were then incubated for 5 min as above in buffer H (50 mM Tris pH 7.5, 150 mM KCl, 20 mM MgCl₂, 5 mM β-mercaptoethanol) to remove urea and unbound luciferase. The protein aggregate was then formed on the sensor by incubation for 10 min in 0.5 mg/ml His-tagged luciferase or 0.2 mg/ml E. coli lysate in buffer H at 44°C (in the case of luciferase) or 55°C (in the case of the lysate). Sensors were then again incubated in buffer H for 5 min at 25°C with 350 rpm shaking to remove excess protein. Sensors with attached aggregate were placed in the Octet® system in buffer H for a 60 s baseline measurement and then placed for 1 h in 5 μM sHsp solution in buffer H to measure sHsp association. Sensors were then moved for 1 h to buffer H to measure protein dissociation. Measurements were performed with 1000 rpm shaking at 44°C (in the case of full-length sHsps) or at 25°C (in the case of ACDs). Full-length proteins were preincubated at 44°C for 10 min before measurement.

sHsp displacement by the Hsp70 system was measured using ForteBio® BLItz. Sensors with attached aggregates were prepared as described above, with buffer F (50 mM Tris pH 7.5, 150 mM KCl, 20 mM MgCl₂, 2.5 mM DTT) instead of buffer H. The biolayer baseline was measured for 60 s. Sensors were then placed in 5 μM sHsp solution in buffer F that had been preincubated for 10 min at 44°C. sHsp association was measured for 10 min. Sensors were then moved to either buffer F or the Hsp70 system in buffer F (0.7 μM DnaK, 0.28 μM DnaJ, 0.21 μM GrpE, 5 mM ATP, 0.1 mg/ml creatine kinase, 18 mM creatine phosphate). Hsp70 system binding and sHsp dissociation were measured for 1 h. Measurements were performed at room temperature with 2000 rpm shaking.

ACD interactions with the C-terminal peptide were measured using the Octet® K2 system. Octet® NTA Biosensors were placed in buffer C (50 mM Tris pH 7.5, 150 mM KCl, 5% glycerol, 5 mM β-mercaptoethanol) and the baseline signal was measured for 60 s. Sensors were then placed in 2.5 μM His₆-SUMO-C-peptide solution in buffer C and incubated for 15 min. Surplus His₆-SUMO-C-peptide was then removed by incubation in buffer G for 15 min. Sensors were then moved to the ACD solution and association was measured for 48 min. After that, ACD dissociation was measured in buffer G for 10 min. Measurements were performed with 1000 rpm shaking at 25°C, for different ACD concentrations, in triplicate. Biolayer thickness at the end of the association stage was corrected for nonspecific binding using a control in which His₆-SUMO was substituted for His₆-SUMO-C-peptide, and converted to fraction bound by min-max scaling between 0 and 1 using the minimal and maximal triplicate averages of biolayer thickness. To determine the dissociation constant, fraction bound as a function of ACD concentration was fitted to the Hill equation using the SciPy implementation (Virtanen et al., 2020) of the dogbox algorithm:

$$f_b = \frac{[X]^n}{K_{0.5}^n + [X]^n}$$

where [X] is the total ACD concentration, K_0.5 is the ACD concentration required to reach half-maximum binding at equilibrium, n is the Hill coefficient and f_b is the fraction bound. Standard deviations of the fitted K_0.5 and n values were derived from the diagonal of the covariance matrix of the optimized parameters.
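A fit of this kind can be sketched with `scipy.optimize.curve_fit`, which accepts `method="dogbox"`. The sketch assumes the standard Hill form f_b = [X]^n / (K_0.5^n + [X]^n), which is consistent with the symbol definitions above. The concentrations, noise level, starting values and bounds are made-up illustration values, not the measured data or the settings used in the study.

```python
import numpy as np
from scipy.optimize import curve_fit


def hill(x, k_half, n):
    """Fraction bound as a function of total ACD concentration x."""
    return x**n / (k_half**n + x**n)


# Synthetic data: made-up concentrations (in μM) and a known "true" curve
# with a little reproducible noise, standing in for scaled BLI readouts.
rng = np.random.default_rng(0)
conc = np.array([0.5, 1, 2, 4, 8, 16, 32, 64], dtype=float)
fraction_bound = hill(conc, 6.0, 1.5) + rng.normal(0.0, 0.01, conc.size)

# Bounded least-squares fit with the dogbox trust-region algorithm.
popt, pcov = curve_fit(
    hill,
    conc,
    fraction_bound,
    p0=[5.0, 1.0],                       # assumed starting guess
    bounds=([1e-6, 0.1], [1e3, 10.0]),   # assumed parameter bounds
    method="dogbox",
)
k_half_fit, n_fit = popt

# Standard deviations from the diagonal of the parameter covariance matrix.
k_half_sd, n_sd = np.sqrt(np.diag(pcov))
print(f"K_0.5 = {k_half_fit:.2f} ± {k_half_sd:.2f}, n = {n_fit:.2f} ± {n_sd:.2f}")
```

Bounds are included here only to keep K_0.5 positive during optimization; whether bounds were used in the original analysis is not stated.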

Molecular Dynamics (MD)

All MD simulations were performed using GROMACS 2019.2 (Van Der Spoel et al., 2005) with CHARMM36-jul2021 as the force field (Huang et al., 2017). Simulations were performed in the isothermal-isobaric (NPT) ensemble, where the temperature was kept at 310 K using the v-rescale thermostat (Bussi et al., 2007) with a time constant of 0.1 ps and the pressure was held at 1 bar using the Parrinello-Rahman barostat (Parrinello & Rahman, 1981). A Lennard-Jones potential with a cut-off of 1.0 nm was used to describe van der Waals interactions. Long-range electrostatic interactions were computed using the particle mesh Ewald (PME) method (Essmann et al., 1995) with a Fourier grid spacing of 0.12 nm and a real-space cut-off of 1.0 nm. Bonds between hydrogen and protein heavy atoms were constrained by P-LINCS (Hess et al., 1997) and the geometry of water molecules was constrained by SETTLE (Miyamoto & Kollman, 1992). The equations of motion were integrated with the leap-frog algorithm (Van Gunsteren & Berendsen, 1988) with a time step of 2 fs. Periodic boundary conditions were applied in all dimensions.

The initial conformation of the AncA₀ ACD - C-terminal peptide complex was predicted by the ColabFold implementation (Mirdita et al., 2022) of AlphaFold-Multimer (Richard et al., 2022). Substitutions Q66H and G109D were introduced into the complex using the PyMOL Mutagenesis Wizard (http://www.pymol.org). Complexes were then placed in rhombic dodecahedral boxes measuring 10.4 nm in all dimensions and solvated with the CHARMM-modified TIP3P water model (Jorgensen et al., 1983). Concentrations of sodium and chloride ions were adjusted to 0.15 M and to a net zero charge of the system. Each system was subjected to a 3-step energy minimization protocol, where during the first step the protein conformation was constrained, during the second step the constraint was reduced to the protein backbone only, and during the third step the positions of all protein heavy atoms were restrained with a force constant of 1000 kJ mol⁻¹ nm⁻². Minimized systems were equilibrated for 10 ns while the positions of protein backbone atoms were kept constrained. Equilibration was then continued without constraints for a further 500 ns. The first 100 ns of equilibration was discarded, and the rest was used for ACD - C-terminal peptide contact determination using the GetContacts tool (https://getcontacts.github.io/). Contact between interfacial residues was defined as any of the following types of interaction: hydrogen bond, ionic, π-stacking, π-cation or van der Waals between purely hydrophobic residues. The default GetContacts interaction criteria were used for all interaction types except for hydrogen bond detection, where a more stringent 30° cut-off for the hydrogen-donor-acceptor angle was used. Contact heatmaps were prepared using seaborn (Waskom, 2021).

The representative conformations of the ACD - C-terminal peptide complexes were chosen by clustering the last 400 ns of equilibrium MD trajectories (frames spaced every 0.5 ns) using the “gmx cluster” tool and the Jarvis-Patrick clustering method (Jarvis & Patrick, 1973). An RMSD cut-off of 0.2 nm was used for the Jarvis-Patrick algorithm, and conformations possessing at least 3 neighbors in common out of the 15 closest conformations were assigned to the same cluster. RMSD calculation was based on the coordinates of heavy backbone atoms of the C-terminal peptide-interacting ACD monomer without the dimerization loop (residues 40-74 and 95-126) and of the stably interacting region of the C-terminal peptide (residues 132-137). For both simulated complexes the biggest cluster contained more than 90% of simulation frames (92.3% for the AncA₀ ACD complex and 93.3% for the AncA₀ Q66H G109D complex) and its middle frame was chosen as the representative conformation. Visual Molecular Dynamics (VMD) (William et al., 1996) and Blender (https://www.blender.org/) were used for structure visualization.
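The shared-neighbor criterion can be made concrete with a small Python sketch operating on a pairwise distance matrix. It is not the “gmx cluster” implementation. In particular, it assumes that a frame's neighbors are at most the 15 closest frames that also lie within the 0.2 nm cut-off, and that two frames are linked when each is a neighbor of the other and they share at least 3 neighbors. The one-dimensional toy "RMSD" values are invented.

```python
import numpy as np


def jarvis_patrick(dist, n_neighbors=15, min_shared=3, cutoff=0.2):
    """Cluster frames from a symmetric distance matrix (shared-neighbor rule)."""
    n = dist.shape[0]
    neighbors = []
    for i in range(n):
        # Closest frames first, excluding the frame itself and anything
        # beyond the cut-off (assumed combination of both criteria).
        order = [int(j) for j in np.argsort(dist[i])
                 if j != i and dist[i, j] <= cutoff]
        neighbors.append(set(order[:n_neighbors]))

    # Union-find structure to merge linked frames into clusters.
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(n):
        for j in neighbors[i]:
            mutual = i in neighbors[j]
            shared = len(neighbors[i] & neighbors[j])
            if mutual and shared >= min_shared:
                parent[find(i)] = find(j)
    return [find(i) for i in range(n)]


# Toy data: two tight groups of five "frames" on a line (distances in nm).
x = np.array([0.00, 0.01, 0.02, 0.03, 0.04, 1.00, 1.01, 1.02, 1.03, 1.04])
dist = np.abs(x[:, None] - x[None, :])
labels = jarvis_patrick(dist)
print(len(set(labels)), "clusters")
```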

Supporting information

Figure supplements

Supplementary file 1 - Multiple sequence alignment of Enterobacterales IbpA orthologs

Supplementary file 2 - Phylogenetic tree of Enterobacterales IbpA orthologs in Newick format

Supplementary file 3 - Posterior probability statistics

Supplementary file 4 - Branch and branch-site model statistics

Acknowledgements

This work was supported by a grant of the Polish National Science Centre (OPUS 17 2019/33/B/NZ1/00352). We gratefully acknowledge Poland’s high-performance Infrastructure PLGrid (HPC Centers: ACK Cyfronet AGH, PCSS, CI TASK, WCSS) for providing computer facilities and support. We thank Prof. Max Telford, Prof. Jaroslaw Marszalek and Dr. Agnieszka Kłosowska for helpful discussions.

Additional Files

Figure supplements

Supplementary file 1: Multiple sequence alignment of Enterobacterales IbpA orthologs used for phylogenetic analysis and ancestral reconstruction, in FASTA format.

Supplementary file 2: Phylogenetic tree of the Enterobacterales IbpA protein family in Newick format (see Figure 2).

Supplementary file 3: Posterior probability statistics for ancestral sequence reconstruction of the AncA₀ and AncA₁ nodes. For each position, amino acids reconstructed with posterior probability higher than 0.2 are shown. Single-letter symbols of reconstructed amino acids are followed by the posterior probability of reconstruction (in brackets). Positions at which the most likely amino acids differ between AncA₀ and AncA₁ are marked in bold and italics. Posterior probabilities were estimated using the FastML program based on the Maximum Likelihood and Empirical Bayes method.

Supplementary file 4: Statistics for selection analysis. A) Branch model statistics for IbpA orthologs from Enterobacterales. Models assumed either the branch between nodes AncA₀ and AncA₁ or the entire Erwiniaceae clade as foreground; NS – not significant. B) Branch-site model statistics for IbpA orthologs from Enterobacterales. Models assumed either the branch between nodes AncA₀ and AncA₁ or the entire Erwiniaceae clade as foreground; NS – not significant.

References

Ashkenazy H, Penn O, Doron-Faigenboim A, Cohen O, Cannarozzi G, Zomer O, Pupko T (2012) FastML: a web server for probabilistic reconstruction of ancestral sequences. Nucleic Acids Res 40:W580–584. https://doi.org/10.1093/nar/gks498
Basha E, Friedrich KL, Vierling E (2006) The N-terminal arm of small heat shock proteins is important for both chaperone activity and substrate specificity. J Biol Chem 281:39943–39952. https://doi.org/10.1074/jbc.M607677200
Bussi G, Donadio D, Parrinello M (2007) Canonical sampling through velocity rescaling. Journal of Chemical Physics 126. https://doi.org/10.1063/1.2408420
Cohen O, Pupko T (2011) Inference of gain and loss events from phyletic patterns using stochastic mapping and maximum parsimony--a simulation study. Genome Biol Evol 3:1265–1275. https://doi.org/10.1093/gbe/evr101
Cohen O, Rubinstein ND, Stern A, Gophna U, Pupko T (2008) A likelihood framework to analyse phyletic patterns. Philos Trans R Soc Lond B Biol Sci 363:3903–3911. https://doi.org/10.1098/rstb.2008.0177
Demuth JP, Hahn MW (2009) The life and death of gene families. Bioessays 31:29–39. https://doi.org/10.1002/bies.080085
Dowell NL, Giorgianni MW, Kassner VA, Selegue JE, Sanchez EE, Carroll SB (2016) The Deep Origin and Recent Loss of Venom Toxin Genes in Rattlesnakes. Curr Biol 26:2434–2445. https://doi.org/10.1016/j.cub.2016.07.038
Eick GN, Bridgham JT, Anderson DP, Harms MJ, Thornton JW (2017) Robustness of Reconstructed Ancestral Protein Functions to Statistical Uncertainty. Mol Biol Evol 34:247–261. https://doi.org/10.1093/molbev/msw223
Essmann U, Perera L, Berkowitz ML, Darden T, Lee H, Pedersen LG (1995) A Smooth Particle Mesh Ewald Method. Journal of Chemical Physics 103:8577–8593. https://doi.org/10.1063/1.470117
Fernandez R, Gabaldon T (2020) Gene gain and loss across the metazoan tree of life. Nat Ecol Evol 4:524–533. https://doi.org/10.1038/s41559-019-1069-x
Fu X, Zhang H, Zhang X, Cao Y, Jiao W, Liu C, Song Y, Abulimiti A, Chang Z (2005) A dual role for the N-terminal region of Mycobacterium tuberculosis Hsp16.3 in self-oligomerization and binding denaturing substrate proteins. J Biol Chem 280:6337–6348. https://doi.org/10.1074/jbc.M406319200
Fuchs M, Poirier DJ, Seguin SJ, Lambert H, Carra S, Charette SJ, Landry J (2009) Identification of the key structural motifs involved in HspB8/HspB6-Bag3 interaction. Biochem J 425:245–255. https://doi.org/10.1042/BJ20090907
Gaucher EA, Govindarajan S, Ganesh OK (2008) Palaeotemperature trend for Precambrian life inferred from resurrected proteins. Nature 451:704–707. https://doi.org/10.1038/nature06510
Giese KC, Vierling E (2002) Changes in oligomerization are essential for the chaperone activity of a small heat shock protein in vivo and in vitro. J Biol Chem 277:46310–46318. https://doi.org/10.1074/jbc.M208926200
Haslbeck M, Vierling E (2015) A first line of stress defense: small heat shock proteins and their function in protein homeostasis. J Mol Biol 427:1537–1548. https://doi.org/10.1016/j.jmb.2015.02.002
Haslbeck M, Weinkauf S, Buchner J (2019) Small heat shock proteins: Simplicity meets complexity. Journal of Biological Chemistry 294:2121–2132. https://doi.org/10.1074/jbc.REV118.002809
Hess B, Bekker H, Berendsen HJC, Fraaije JGEM (1997) LINCS: A linear constraint solver for molecular simulations. Journal of Computational Chemistry 18:1463–1472. https://doi.org/10.1002/(Sici)1096-987x(199709)18:12<1463::Aid-Jcc4>3.3.Co;2-L
Huang J, Rauscher S, Nawrocki G, Ran T, Feig M, de Groot BL, Grubmüller H, MacKerell AD (2017) CHARMM36m: an improved force field for folded and intrinsically disordered proteins. Nature Methods 14:71–73. https://doi.org/10.1038/nmeth.4067
Iranzo J, Wolf YI, Koonin EV, Sela I (2019) Gene gain and loss push prokaryotes beyond the homologous recombination barrier and accelerate genome sequence divergence. Nat Commun 10:5376. https://doi.org/10.1038/s41467-019-13429-2
Jarvis RA, Patrick EA (1973) Clustering Using a Similarity Measure Based on Shared near Neighbors. IEEE T Comput C-22:1025–1034. https://doi.org/10.1109/T-C.1973.223640
Jaya N, Garcia V, Vierling E (2009) Substrate binding site flexibility of the small heat shock protein molecular chaperones. Proc Natl Acad Sci U S A 106:15604–15609. https://doi.org/10.1073/pnas.0902177106
Jeffares DC, Tomiczek B, Sojo V, dos Reis M (2015) A beginners guide to estimating the non-synonymous to synonymous rate ratio of all protein-coding genes in a genome. Methods Mol Biol 1201:65–90. https://doi.org/10.1007/978-1-4939-1438-8_4
Jones DT, Taylor WR, Thornton JM (1992) The rapid generation of mutation data matrices from protein sequences. Comput Appl Biosci 8:275–282. https://doi.org/10.1093/bioinformatics/8.3.275
Jorgensen WL, Chandrasekhar J, Madura JD, Impey RW, Klein ML (1983) Comparison of Simple Potential Functions for Simulating Liquid Water. Journal of Chemical Physics 79:926–935. https://doi.org/10.1063/1.445869
Kalyaanamoorthy S, Minh BQ, Wong TKF, von Haeseler A, Jermiin LS (2017) ModelFinder: fast model selection for accurate phylogenetic estimates. Nat Methods 14:587–589. https://doi.org/10.1038/nmeth.4285
Kempes CP, van Bodegom PM, Wolpert D, Libby E, Amend J, Hoehler T (2017) Drivers of Bacterial Maintenance and Minimal Energy Requirements. Frontiers in Microbiology 810.
Kennaway CK, Benesch JL, Gohlke U, Wang L, Robinson CV, Orlova EV, Saibil HR, Keep NH (2005) Dodecameric structure of the small heat shock protein Acr1 from Mycobacterium tuberculosis. J Biol Chem 280:33419–33425. https://doi.org/10.1074/jbc.M504263200
Lee GJ, Roseman AM, Saibil HR, Vierling E (1997) A small heat shock protein stably binds heat-denatured model substrates and can maintain a substrate in a folding-competent state. EMBO J 16:659–671. https://doi.org/10.1093/emboj/16.3.659
Lever MA, Rogers KL, Lloyd KG, Overmann J, Schink B, Thauer RK, Hoehler TM, Jørgensen BB (2015) Life under extreme energy limitation: a synthesis of laboratory- and field-based investigations. FEMS Microbiology Reviews 39:688–728. https://doi.org/10.1093/femsre/fuv020
Longo LM, Despotovic D, Weil-Ktorza O, Walker MJ, Jablonska J, Fridmann-Sirkis Y, Varani G, Metanis N, Tawfik DS (2020) Primordial emergence of a nucleic acid-binding protein via phase separation and statistical ornithine-to-arginine conversion. Proc Natl Acad Sci U S A 117:15731–15739. https://doi.org/10.1073/pnas.2001989117
Lucaci AG, Zehr JD, Enard D, Thornton JW, Kosakovsky Pond SL (2023) Evolutionary Shortcuts via Multinucleotide Substitutions and Their Impact on Natural Selection Analyses. Mol Biol Evol 4010.
Lynch M, Marinov GK (2015) The bioenergetic costs of a gene. Proceedings of the National Academy of Sciences 112:15690–15695. https://doi.org/10.1073/pnas.1514974112
Mani N, Bhandari S, Moreno R, Hu L, Prasad BVV, Suguna K (2016) Multiple oligomeric structures of a bacterial small heat shock protein. Sci Rep 6:24019. https://doi.org/10.1038/srep24019
Mirdita M, Schütze K, Moriwaki Y, Heo L, Ovchinnikov S, Steinegger M (2022) ColabFold: making protein folding accessible to all. Nature Methods 19:679–682. https://doi.org/10.1038/s41592-022-01488-1
Miyamoto S, Kollman PA (1992) Settle - an Analytical Version of the Shake and Rattle Algorithm for Rigid Water Models. Journal of Computational Chemistry 13:952–962. https://doi.org/10.1002/jcc.540130805
Mogk A, Deuerling E, Vorderwulbecke S, Vierling E, Bukau B (2003) Small heat shock proteins, ClpB and the DnaK system form a functional triade in reversing protein aggregation. Mol Microbiol 50:585–595.
Natarajan C, Signore AV, Bautista NM, Hoffmann FG, Tame JRH, Fago A, Storz JF (2023) Evolution and molecular basis of a novel allosteric property of crocodilian hemoglobin. Curr Biol 33:98–108. https://doi.org/10.1016/j.cub.2022.11.049
Nguyen LT, Schmidt HA, von Haeseler A, Minh BQ (2015) IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol Biol Evol 32:268–274. https://doi.org/10.1093/molbev/msu300
Obuchowski I, Karas P, Liberek K (2021) The Small Ones Matter-sHsps in the Bacterial Chaperone Network. Front Mol Biosci 8:666893. https://doi.org/10.3389/fmolb.2021.666893
Obuchowski I, Pirog A, Stolarska M, Tomiczek B, Liberek K (2019) Duplicate divergence of two bacterial small heat shock proteins reduces the demand for Hsp70 in refolding of substrates. PLoS Genet 15:e1008479. https://doi.org/10.1371/journal.pgen.1008479
Parrinello M, Rahman A (1981) Polymorphic Transitions in Single-Crystals - a New Molecular-Dynamics Method. Journal of Applied Physics 52:7182–7190. https://doi.org/10.1063/1.328693
Pirog A, Cantini F, Nierzwicki L, Obuchowski I, Tomiczek B, Czub J, Liberek K (2021) Two Bacterial Small Heat Shock Proteins, IbpA and IbpB, Form a Functional Heterodimer. J Mol Biol 433:167054. https://doi.org/10.1016/j.jmb.2021.167054
Puigbo P, Lobkovsky AE, Kristensen DM, Wolf YI, Koonin EV (2014) Genomes in turmoil: quantification of genome dynamics in prokaryote supergenomes. BMC Biol 12:66. https://doi.org/10.1186/s12915-014-0066-4
Pupko T, Pe’er I, Hasegawa M, Graur D, Friedman N (2002) A branch-and-bound algorithm for the inference of ancestral amino-acid sequences when the replacement rate varies among sites: Application to the evolution of five gene families. Bioinformatics 18:1116–1123. https://doi.org/10.1093/bioinformatics/18.8.1116
Ratajczak E, Zietkiewicz S, Liberek K (2009) Distinct activities of Escherichia coli small heat shock proteins IbpA and IbpB promote efficient protein disaggregation. J Mol Biol 386:178–189. https://doi.org/10.1016/j.jmb.2008.12.009
Reinle K, Mogk A, Bukau B (2022) The Diverse Functions of Small Heat Shock Proteins in the Proteostasis Network. J Mol Biol 434:167157. https://doi.org/10.1016/j.jmb.2021.167157
Richard E, Michael ON, Alexander P, Natasha A, Andrew S, Tim G, Augustin Ž, Russ B, Sam B, Jason Y, et al (2022) Protein complex prediction with AlphaFold-Multimer. bioRxiv. https://doi.org/10.1101/2021.10.04.463034
Sievers F, Wilm A, Dineen D, Gibson TJ, Karplus K, Li W, Lopez R, McWilliam H, Remmert M, Soding J, et al (2011) Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol Syst Biol 7:539. https://doi.org/10.1038/msb.2011.75
Simmons MP, Ochoterena H (2000) Gaps as Characters in Sequence-Based Phylogenetic Analyses. Systematic Biology 49:369–381.
Strozecka J, Chrusciel E, Gorna E, Szymanska A, Zietkiewicz S, Liberek K (2012) Importance of N- and C-terminal regions of IbpA, Escherichia coli small heat shock protein, for chaperone function and oligomerization. J Biol Chem 287:2843–2853. https://doi.org/10.1074/jbc.M111.273847
Thomson JM, Gaucher EA, Burgan MF, De Kee DW, Li T, Aris JP, Benner SA (2005) Resurrecting ancestral alcohol dehydrogenases from yeast. Nat Genet 37:630–635. https://doi.org/10.1038/ng1553
Thornton JW, Need E, Crews D (2003) Resurrecting the ancestral steroid receptor: ancient origin of estrogen signaling. Science 301:1714–1717. https://doi.org/10.1126/science.1086185
Van Der Spoel D, Lindahl E, Hess B, Groenhof G, Mark AE, Berendsen HJ (2005) GROMACS: fast, flexible, and free. J Comput Chem 26:1701–1718. https://doi.org/10.1002/jcc.20291
Van Gunsteren WF, Berendsen HJC (1988) A Leap-Frog Algorithm for Stochastic Dynamics. Mol Simulat 1:173–185. https://doi.org/10.1080/08927028808080941
Virtanen P, Gommers R, Oliphant TE, Haberland M, Reddy T, Cournapeau D, Burovski E, Peterson P, Weckesser W, Bright J, et al (2020) SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat Methods 17:261–272. https://doi.org/10.1038/s41592-019-0686-2
Waskom ML (2021) seaborn: statistical data visualization. Journal of Open Source Software 6:3021. https://doi.org/10.21105/joss.03021
William H, Andrew D, Klaus S (1996) VMD: Visual molecular dynamics. Journal of Molecular Graphics 14:33–38. https://doi.org/10.1016/0263-7855(96)00018-5
Worth CL, Gong S, Blundell TL (2009) Structural and functional constraints in the evolution of protein families. Nat Rev Mol Cell Biol 10:709–720. https://doi.org/10.1038/nrm2762
Yang Z (1998) Likelihood ratio tests for detecting positive selection and application to primate lysozyme evolution. Mol Biol Evol 15:568–573. https://doi.org/10.1093/oxfordjournals.molbev.a025957
Yang Z (2007) PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol 24:1586–1591. https://doi.org/10.1093/molbev/msm088
Yang Z, Nielsen R (2002) Codon-substitution models for detecting molecular adaptation at individual sites along specific lineages. Mol Biol Evol 19:908–917. https://doi.org/10.1093/oxfordjournals.molbev.a004148
Zwirowski S, Klosowska A, Obuchowski I, Nillegoda NB, Pirog A, Zietkiewicz S, Bukau B, Mogk A, Liberek K (2017) Hsp70 displaces small heat shock proteins from aggregates to initiate protein refolding. EMBO Journal 36:783–796. https://doi.org/10.15252/embj.201593378

Article and author information

Author information

Piotr Karaś
Intercollegiate Faculty of Biotechnology UG-MUG, University of Gdansk, Abrahama 58, 80-307 Gdansk, Poland
ORCID iD: 0000-0001-5270-7938
Klaudia Kochanowicz
Intercollegiate Faculty of Biotechnology UG-MUG, University of Gdansk, Abrahama 58, 80-307 Gdansk, Poland
ORCID iD: 0000-0002-2041-0301
Marcin Pitek
Intercollegiate Faculty of Biotechnology UG-MUG, University of Gdansk, Abrahama 58, 80-307 Gdansk, Poland
ORCID iD: 0000-0002-1300-4364
Przemyslaw Domanski
Intercollegiate Faculty of Biotechnology UG-MUG, University of Gdansk, Abrahama 58, 80-307 Gdansk, Poland
Igor Obuchowski
Intercollegiate Faculty of Biotechnology UG-MUG, University of Gdansk, Abrahama 58, 80-307 Gdansk, Poland
ORCID iD: 0000-0001-6350-1273
Bartlomiej Tomiczek
Intercollegiate Faculty of Biotechnology UG-MUG, University of Gdansk, Abrahama 58, 80-307 Gdansk, Poland
ORCID iD: 0000-0001-9295-663X
- Correspondence to B.T. (bartlomiej.tomiczek@biotech.ug.edu.pl) or K.L. (krzysztof.liberek@ug.edu.pl)
Krzysztof Liberek
Intercollegiate Faculty of Biotechnology UG-MUG, University of Gdansk, Abrahama 58, 80-307 Gdansk, Poland
ORCID iD: 0000-0002-7532-9279
- Correspondence to B.T. (bartlomiej.tomiczek@biotech.ug.edu.pl) or K.L. (krzysztof.liberek@ug.edu.pl)

Version history

Preprint posted: May 30, 2023
Sent for peer review: June 22, 2023
Reviewed Preprint version 1: August 21, 2023
Reviewed Preprint version 2: November 23, 2023
Version of Record published: December 8, 2023

Cite all versions

You can cite all versions using the DOI https://doi.org/10.7554/eLife.89813. This DOI represents all versions, and will always resolve to the latest one.

Copyright

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

