Introduction

Ice crystals grow from ice embryos, which are crystalline aggregates of water molecules that spontaneously form (homogeneous nucleation) in pure H2O at approximately –38 °C (1). Ice can arise in nature at much warmer temperatures because various surfaces act as stabilizers of ice embryos (heterogeneous nucleation). Only once an ice embryo reaches a critical number of organized water molecules will it become stable enough to spontaneously grow at elevated temperatures, a process called ice nucleation (2). The most active heterogeneous ice nucleators are bacterial ice nucleation proteins (INPs), which can stabilize an ice embryo at temperatures as warm as −2 °C (3). INP-producing bacteria are widespread in the environment where they are responsible for initiating frost (4) and atmospheric precipitation (5). As such, these bacteria play a significant role in the Earth’s hydrological cycle and in agricultural productivity.

As described in the literature, INPs are large proteins (up to ∼150 kDa) that are thought to form multimers on the surface of the bacteria that express them (6, 7). AlphaFold predictions have provided some insight into the INP monomer structure (Fig. 1A) (8). For the INP from Pseudomonas borealis (PbINP), AlphaFold predicted a folded domain of ∼100 residues at the N terminus followed by a flexible linker of ∼50 residues, a repetitive domain composed of 65 tandem repeats of 16 residues each, and a small 41-residue C-terminal capping structure (supported by model confidence metrics, Fig. S1). The predicted fold of the repetitive domain agrees with some previous homology-based models in which each 16-residue repeat forms a single coil of a β-solenoid structure (9, 10).

Classification of unique INP tandem arrays into either WO-coil or R-coil subdomains.

A) AlphaFold 2 model of PbINP coloured by domain. Purple: N-terminal domain, pink: flexible linker, green: water-organizing (WO) coils, blue: arginine-rich (R) coils, yellow: C-terminal cap. The inset shows a cross section through the solenoid coil. B) 16-residue tandem repeat forming one coil from the β-solenoid with positions numbered from N- to C-terminus. C) The number of repeats in the WO-coils and R-coils for each unique sequence. PbINP is included and indicated in orange. n = 121. D) Sequence logos constructed from each 16-residue repeat present in the dataset.

In INP sequences, most coils of the β-solenoid contain putative water-organizing motifs like Thr-Xaa-Thr (TxT) that occupy the same position in each coil to form long parallel arrays, and where Xaa is an inward pointing amino acid residue (Fig. 1B). Shorter versions of similar arrays have convergently evolved in insects to form the ice-binding sites of several hyperactive antifreeze proteins (AFPs) (11–13). These arrays are thought to organize sufficient ice-like water molecules on their surface to facilitate AFP adsorption to the ice crystal surface (14). In the much longer INP arrays, which further form multimers, the organizing effect on nearby water molecules is thought to increase to the point where ice embryos can be sufficiently stabilized to cause spontaneous growth of ice at high sub-zero temperatures. Consistent with this idea, we recently demonstrated that interrupting these water-organizing motifs decreased the ice nucleation temperature by the same amount as extensive deletions of the water-organizing coils (8).

Previously we showed that the 12 C-terminal coils lack the water-organizing motifs and that deleting these coils resulted in a near total loss of ice nucleation activity (8). Interestingly, the necessity for these C-terminal coils was demonstrated by Green and Warren in 1985 in the first publication of an INP sequence (15) but was not further investigated. While the water-organizing coils (WO-coils) are characterized by their conserved TxT, SxT, and Y motifs, the defining feature of the C-terminal-most coils, other than the lack of these motifs, is that position 12 of the 16-residue coil is typically occupied by arginine. Thus, we refer to these non-WO-coils as R-coils. Since both coil types maintain the same predicted fold but serve different functions, we consider them subdomains of the same β-solenoid (16). In the WO-coils, position 12 is usually occupied by residues of the opposite charge, Asp and Glu. This charge inversion is noteworthy as it has been shown that electrostatic interactions contribute to the formation of INP multimers (17, 18). It has also been shown that INP activity is affected by pH, which is consistent with a role for electrostatic interactions (19). We and others have suggested that INPs may multimerize through salt-bridging of the sidechains in these positions of the coil (8, 20).

Radiation inactivation analysis suggests a multimer size of 19 MDa (>100 monomers) (6). Computational estimates predict increased activity upon the assembly of up to a 5-MDa (34 INPs) multimer, which is on the same order of magnitude as that determined experimentally (21). The tendency to form such large structures is one of many factors that make these proteins difficult to work with (3) and may be part of why, despite many attempts, very little is known about them at a molecular level (22). The size of these structures does, however, make them amenable to size-based separation from other proteins (23, 24). INP multimers are also large enough to be visible using negative stain transmission electron microscopy (TEM) on enriched samples, revealing a fibril-like morphology (24, 25).
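As a rough consistency check on these size estimates, the arithmetic can be written out under the assumption that one monomer weighs about 150 kDa (the upper bound quoted above; the mass of any particular INP may differ):

```latex
\begin{align*}
N_{\text{radiation}} &\approx \frac{19\ \text{MDa}}{0.15\ \text{MDa}} \approx 127 \quad (>100\ \text{monomers}), \\
m_{\text{implied}} &\approx \frac{5\ \text{MDa}}{34} \approx 0.147\ \text{MDa} \approx 147\ \text{kDa per monomer}.
\end{align*}
```

Both figures are compatible with a monomer of roughly 150 kDa, and the two multimer estimates differ by a factor of about four.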

In nature, these multimers form on the surface of bacteria, anchored to the outer membrane by the N-terminal domain (26). When expressed recombinantly in E. coli, INPs have full activity suggesting that multimers are still able to form and that they are the product of self-assembly. Remarkably, INPs with N-terminal truncations are only slightly less active, suggesting that assembly on the cell surface is not mandatory, and it can occur in the cytoplasm whether anchored to the inner surface of the plasma membrane or free in solution (27, 28).

Here, we have studied the role of the R-coils in INPs through a series of mutations and rearrangements. Additionally, using cryo-focused-ion-beam (cryo-FIB) milling and cryo-electron tomography (cryo-ET), we have observed the fibrillar morphology of INP multimers in situ within cells recombinantly expressing INPs. The R-coils’ length, location, and sequence are critical for INP multimerization and hence INP activity. Although we report results using PbINP, the bioinformatic analysis presented here indicates that these findings are universally applicable to the INP family, including the more commonly studied InaZ from Pseudomonas syringae.

Results

Bioinformatic analysis reveals conservation in the number of R-coils across all INPs

A bioinformatic analysis of bacterial INPs was undertaken to identify their variations in size and sequence to understand what is common to all that could guide experiments to probe higher order structure and help develop a collective model of the INP multimer. In PbINP, there are 53 WO-coils and 12 R-coils (Fig. 1A), each composed of 16 residues (Fig. 1B). To determine whether this ratio of coil types is consistent across known INPs, we analyzed INP and INP-like solenoid sequences in the NCBI’s non-redundant protein (nr) database. The long tandem arrays of coils in INPs make them prone to mis-assembly when using short-read DNA sequencing (29) so we opted to limit our dataset to sequences obtained by long-read technologies (Oxford Nanopore and Pacific Biosciences SMRT sequencing). From this bioinformatics study, it is apparent that the number of WO-coils varies considerably from over 70 coils to around 30, with a median length of 58 coils (Fig. 1C). In contrast, the length of the R-coil region is much less variable across sequences, with 107 of 120 sequences containing either 10 or 12 R-coils (Fig. 1C). The stark difference in length variation between the numbers of WO-coils and the R-coils supports the hypothesis that these two regions have different functions.
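The coil counting described above can be illustrated with a minimal sketch. It cuts a solenoid sequence into 16-residue repeats and labels each one with a deliberately crude rule based on features named in the text (a TxT motif at positions 6 to 8, a basic residue at position 12). The rule, the function names, and the two toy repeats are assumptions made for illustration; they are not the classification procedure or the sequences used in this study.

```python
from typing import Dict, List

REPEAT_LEN = 16  # residues per coil, as stated in the text

# Invented 16-residue repeats for illustration only (not real PbINP sequence).
WO_TOY = "AGYGSTQTAGEDSSLT"  # T at positions 6 and 8, acidic residue at position 12
R_TOY = "AGYGSAQAAGKRSSLT"   # no TxT motif, arginine at position 12


def split_repeats(solenoid: str, repeat_len: int = REPEAT_LEN) -> List[str]:
    """Cut a solenoid sequence into consecutive fixed-length repeats."""
    if len(solenoid) % repeat_len != 0:
        raise ValueError("sequence length must be a multiple of the repeat length")
    return [solenoid[i:i + repeat_len] for i in range(0, len(solenoid), repeat_len)]


def classify_coil(repeat: str) -> str:
    """Label one repeat 'R' or 'WO' using a simplified, assumed rule."""
    has_txt = repeat[5] == "T" and repeat[7] == "T"  # positions 6 and 8 (1-based)
    basic_at_12 = repeat[11] in "RKH"                # position 12 (1-based)
    # Anything that is not clearly an R-coil defaults to 'WO' in this sketch.
    return "R" if (basic_at_12 and not has_txt) else "WO"


def count_coils(solenoid: str) -> Dict[str, int]:
    """Count WO-coils and R-coils in a solenoid sequence."""
    labels = [classify_coil(r) for r in split_repeats(solenoid)]
    return {"WO": labels.count("WO"), "R": labels.count("R")}


# A toy solenoid with the PbINP coil counts reported in the text (53 WO + 12 R).
toy_solenoid = WO_TOY * 53 + R_TOY * 12
print(count_coils(toy_solenoid))
```

Real repeats are far less regular than these toy strings, so a practical classifier would need alignment-aware repeat boundaries and a tolerance for motif variants.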

The differences observed between the PbINP WO-coil consensus sequence and the R-coil consensus sequence, which include the loss of putative WO motifs and the appearance of basic residues at position 12 in the latter, are consistent across the entire dataset, as is apparent from the sequence logos comparing them (Fig. 1D). Also worth noting is the similarity of the sequence logos for WO-coils and R-coils in PbINP (8) with those based on 120 sequences from the database.

Incremental replacement of R-coils with WO-coils severely diminishes ice nucleating activity

Given the remarkable conservation of the R-coil count compared to the variability of the WO-coil numbers, we measured the functional impact of shortening the R-coil region. We designed mutants in which the R-coils were incrementally replaced with WO-coils, shortening the R-coil subdomain from 12 to 10, 8, 6, 4, or 1 coil(s), while retaining the same overall length as wild-type PbINP (Fig. 2A). To avoid disrupting any potential interaction between the C-terminal cap structure and the R-coils, one R-coil was left in place to produce the 1 R-coil mutant. As previously described, these constructs were tagged with GFP as an internal control for INP production, and its addition had no measured effect on ice nucleation activity (8).

Ice nucleation activity of mutant INPs in which R-coils are incrementally replaced with WO-coils.

A) Diagram indicating the domain map of PbINP and from it, the design of the INP mutants. Sections of the WO-coils used in the replacement of the R-coils are indicated. wt: wild type. B) Ice nucleation temperatures measured by WISDOM for E. coli cells expressing PbINP and mutants with different numbers of R-coils. The temperature at which fifty percent of the droplets froze (T50) is indicated with a hollow circle, and its corresponding value written nearby. The shaded region indicates standard deviation. C) The same constructs assayed for ice nucleation using a nanoliter osmometer. The grey box indicates temperatures beyond the lower limit of the NLO apparatus for detecting ice nucleation in this experiment.

Ice nucleation assays were performed on intact E. coli expressing PbINP to assess the activity of the incremental replacement mutants. In theory, replacement of R-coils by WO-coils could result in a gain of function as the water-organizing surface increased in area. However, as the number of R-coils was reduced, the nucleation temperature decreased (Figs. 2B, 2C). Replacing two or four R-coils to leave ten or eight in place resulted in a slight loss of activity compared to the wild-type protein (T50 = –9.1 °C and –10.2 °C, respectively, where T50 is the temperature at which 50% of droplets have frozen, p < 0.001). Reducing the R-coil count to six dramatically decreased the activity (T50 = –35.7 °C). The construct with only four R-coils in place showed only the slightest amount of activity, and activity was entirely lost in the construct containing only one R-coil. Evidently, small decreases in the R-coil region length produce disproportionately large decreases in activity. Halving the length of the R-coils by replacing just six coils reduced ice nucleation activity by 26.9 °C, whereas reducing the WO-coil length in half decreased the T50 by less than 2 °C (8). Since the R-coils mostly lack the motifs required for water-organizing, we attribute the observed changes in nucleation temperature to changes in INP multimer formation.
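The T50 values quoted above can be related to raw droplet data with a short sketch. It assumes that one freezing temperature is recorded per droplet and that every droplet freezes within the measured range, so that T50 is simply the median freezing temperature; how T50 was actually derived from the WISDOM data is not described here, and the droplet values below are invented. The second part only re-derives the wild-type T50 implied by the reported numbers.

```python
from statistics import median
from typing import Sequence


def t50(freezing_temps_c: Sequence[float]) -> float:
    """Return T50 as the median of per-droplet freezing temperatures (°C).

    Assumption: every droplet froze, so the median corresponds to the
    temperature at which half of the droplets are frozen.
    """
    if not freezing_temps_c:
        raise ValueError("need at least one droplet")
    return median(freezing_temps_c)


def delta_t50(mutant_t50: float, reference_t50: float) -> float:
    """Shift in T50 relative to a reference; negative means colder (less active)."""
    return round(mutant_t50 - reference_t50, 1)


# Hypothetical droplet freezing temperatures, for illustration only.
toy_droplets = [-8.1, -8.6, -8.8, -9.0, -9.4]
print(t50(toy_droplets))

# Values reported in the text: six R-coils gave T50 = -35.7 °C, a 26.9 °C drop.
implied_wild_type = round(-35.7 + 26.9, 1)
print(implied_wild_type)
print(delta_t50(-35.7, implied_wild_type))
```

If some droplets never freeze, a median over only the frozen droplets would overestimate activity, so a frozen-fraction curve over all droplets would be the safer basis.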

The location of the R-coil subdomain is crucial

In addition to its length, we investigated whether the location of the R-coil subdomain is important for ice nucleation activity. We produced constructs where 11 of the 12 R-coils were relocated to either the N-terminal end of the solenoid or the approximate midpoint of the solenoid (Fig. 3A). As before, the C-terminal R-coil was left in place adjacent to the cap structure. We also produced an R-coil deletion construct, where the same 11 R-coils were deleted entirely from the protein.

Ice nucleation activity of mutant INPs with entirely relocated or deleted R-coil subdomain.

A) Diagram indicating the design of the constructs. 11 of the 12 R-coil repeats were either moved within the construct or deleted. B) Freezing curves with T50 and number of unfrozen droplets indicated where applicable.

The N-terminal relocation construct displayed markedly lower activity than wild-type PbINP (T50 = –22.4 °C). The midpoint relocation construct displayed almost no activity (T50 = –36.1 °C) and was indistinguishable in activity from the construct in which the R-coils were deleted (Fig. 3B).

Targeted mutations reveal that positively charged residues are important for R-coil function

Having established the importance of R-coil position and length for high activity, we next investigated the features of this subdomain that are required for its activity. Looking at the charge distribution along the solenoid from N terminus to C terminus (Fig. 4A), we noted a switch at the start of the R-coils from an abundance of acidic residues to their replacement by basic residues. To probe the significance of this observation, we mutated all basic residues (R/K/H) in the R-coils to match those found in the same repeat positions of the WO-coils (D/G/E for positions 11 and 12, and S for position 14) (Fig. 1D). In total, 17 basic residues – 10 Arginine (R), 4 Lysine (K), 3 Histidine (H) – were replaced in the R-coils to generate the RKH replacement mutant. The side chains at these positions are predicted by the AlphaFold model to point outward from the solenoid, so these mutations are unlikely to compromise the stability of the solenoid core.
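The charge switch along the solenoid can be illustrated by tallying net charge per coil. The sketch below counts Arg and Lys as +1 and Asp and Glu as −1 (His is ignored by default because its charge depends on pH); this scoring scheme and the toy repeats are assumptions for illustration, not the analysis behind Fig. 4A.

```python
from typing import List

REPEAT_LEN = 16  # residues per coil, as stated in the text

# Invented repeats for illustration only (not real PbINP sequence).
WO_TOY = "AGYGSTQTAGEDSSLT"  # contains one Glu and one Asp
R_TOY = "AGYGSAQAAGKRSSLT"   # contains one Lys and one Arg


def net_charge(repeat: str, count_his: bool = False) -> int:
    """Net formal charge of one repeat: (R + K [+ H]) minus (D + E)."""
    positive = repeat.count("R") + repeat.count("K")
    if count_his:
        positive += repeat.count("H")
    negative = repeat.count("D") + repeat.count("E")
    return positive - negative


def charge_profile(solenoid: str, repeat_len: int = REPEAT_LEN) -> List[int]:
    """Net charge of each consecutive repeat, N terminus to C terminus."""
    if len(solenoid) % repeat_len != 0:
        raise ValueError("sequence length must be a multiple of the repeat length")
    return [
        net_charge(solenoid[i:i + repeat_len])
        for i in range(0, len(solenoid), repeat_len)
    ]


# Two toy WO-coils followed by two toy R-coils: the sign flips at the boundary.
print(charge_profile(WO_TOY * 2 + R_TOY * 2))
```

A profile like this shows only formal charge per coil; it says nothing about which face of the solenoid the side chains occupy, which the text infers from the AlphaFold model.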

Site-specific mutagenesis of noteworthy motifs in the R-coil subdomain.

A) Diagram indicating the design of the constructs. Translucent bars indicate continuity of three conserved motifs along the length of the central repetitive domain. Neg. charge: Negative residues present in positions 11, 12, and 14 of the repeat. TxT: Thr-Xaa-Thr motif in positions 6-8 of the repeat. Y-ladder: An entirely conserved Tyr in position 3 of the repeat. Three mutants were created in which these motifs were extended into the R-coils. K-coils: All arginines in the R-coils were replaced with lysines. B) Freezing curves with T50 and number of unfrozen droplets indicated where applicable.

There was a 10.3 °C drop in T50 from wild-type activity after RKH replacement (Fig. 4B) (T50 = –19.1 °C). Although this is a large decrease in activity, it was not as deleterious as the relocation or deletion mutations (Fig. 3). The prominent, entirely conserved tyrosine in position 3 of the WO-coils is only present in the first three R-coils and is missing from the following nine coils, making it another candidate for mutation. Upon extending this “tyrosine ladder” through the R-coils (Fig. 4A), there was a 1.3-°C loss in activity. However, when combining the RKH replacement with the tyrosine ladder extension, an almost total loss of activity was observed (T50 = –36.1 °C on WISDOM) (Fig. 4B).

In the final mutated PbINP construct in this series, all arginines in the R-coils were replaced by lysines (K-coils). This mutant nucleated ice formation at essentially the same temperature as the wild type (Fig. 4B) (p = 0.89), suggesting that positive charges in these locations are more important than side chain geometry.

Droplet freezing assays show recombinant cell lysate supernatant has ice nucleation activity that is affected by pH

The experiments described above were performed using whole recombinant bacteria rather than extracted INPs. In E. coli, the vast majority of the expressed INP is intracellular (27). Indeed, with our GFP-tagged constructs, we observe intense green fluorescence in the cytoplasm. To see how important electrostatic interactions were in the multimerization of PbINP as reflected by its ice nucleation activity, it was necessary to lyse the E. coli to change the pH surrounding the INP multimers. After centrifuging the sonicate to remove cell debris and passing the supernatant through a 0.2-μm filter to remove any unbroken cells, the extracts were tested to see how ice nucleation activity is affected by pH between 2.0 and 11.0. The activity of the filtered supernatant was only a few degrees lower than that of whole bacteria (T50 = –9.6 °C) (Fig. 5A), which agrees with the results of Kassmannhuber et al. (28). This indicates that large INP structures are present within the bacterial cytoplasm.

A) A comparison of the nucleation temperatures of PbINP when assayed using intact E. coli cells and when assayed with filtered supernatant. B) A box and whisker plot showing the nucleation temperatures of filtered supernatant containing PbINP under different pH conditions. Boxes and bars indicate quartiles, with medians indicated by a centre line. Outliers are indicated by diamonds.

The effect of pH on Snomax activity has been previously reported (18). However, Snomax consists of freeze-dried P. syringae cells in which the INPs are thought to be membrane bound. Our assays on bacterial lysate tested free, cytoplasmic PbINP complexes, producing a similar trend regarding the effect of pH but with somewhat greater loss of activity on the lower end of the optimal range (Fig. 5B). Ice nucleation activity decreased by a few degrees below pH 5.0, and by ∼8 °C at pH 2.0. The loss of activity in the alkaline buffers up to pH 11.0 was minimal. Similar to the findings of Chao et al. (30), we did not observe a major change in activity (i.e. ΔT50 > 10 °C) even at the extremes of pH 2.0 and 11.0, suggesting that the mechanism of ice nucleation is largely insensitive to pH.

INP activity is remarkably heat resistant

Having access to lysate also provided an opportunity to examine the heat stability of the INP complexes. The filtered lysate was heat-treated to 60, 70, 80, 90, or 99 °C for 10 min in sealed tubes before being chilled and assayed for ice nucleation activity (Fig. 6A). The activity of the 60 °C sample (T50 = –9.9 °C) was nearly identical to the non-treated wild-type control (T50 = –9.6 °C), and the 70 °C sample only displayed a minor loss of activity (T50 = –10.2 °C). From 80 °C to 99 °C the activity incrementally decreased (T50 = –11.3 °C, –12.7 °C, –14.6 °C, respectively), but the activity loss never exceeded 6 °C. Indeed, the heat resistance of the INP complex is remarkable. The C-terminal GFP tag provided an internal control for the effectiveness of heat treatment, as GFP denatures at around 73 °C (31). The green colour of the bacteria was robust at 65 °C and, with very few exceptions, gone at 75 °C (Fig. 6B). There was no fluorescence at 90 °C.
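The activity losses implied by these T50 values can be tabulated directly. The short sketch below uses only the numbers reported in the paragraph above; the variable names are illustrative.

```python
CONTROL_T50 = -9.6  # untreated lysate, °C (from the text)

# T50 after 10 min at each treatment temperature (°C), as reported in the text.
heat_treated_t50 = {60: -9.9, 70: -10.2, 80: -11.3, 90: -12.7, 99: -14.6}

# Loss of activity = how much colder the T50 is than the untreated control.
activity_loss = {
    treatment: round(CONTROL_T50 - t50_value, 1)
    for treatment, t50_value in heat_treated_t50.items()
}

for treatment, loss in activity_loss.items():
    print(f"{treatment} °C treatment: {loss:.1f} °C loss")

largest_loss = max(activity_loss.values())
```

The largest loss, at 99 °C, is 5.0 °C, consistent with the statement that the loss never exceeded 6 °C.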

A) Measured freezing temperatures of heat-treated droplets containing either PbINP or PbINP with the first 32 repeats (counting from the N-terminal end) deleted. Wild-type PbINP freezing without heat treatment (kept at roughly 10 °C) is indicated on the left. B) Fluorescent microscopy images of recombinant E. coli cells expressing PbINP tagged with GFP viewed under bright-field (BF) or fluorescent excitatory (GFP) light. Representative images are shown (n = 3). Note: cells that retain their fluorescence after 75 °C treatment are rarely observed.

To assess what role the WO-coils play in multimer stability, we also assayed the lysate of a construct from our previous study in which repeats 16-47 (residues 411-923) of the solenoid had been deleted, leaving 33 coils (8). The overall freezing profile remained the same, indicating that this construct is also extremely resistant to heat denaturation, with each heat-treated sample freezing at a slightly lower temperature than its full-length counterpart (Fig. 6A). While the Δ411-923 construct had slightly lower overall activity, its heat resistance was not affected by the truncation, suggesting that the R-coil and C-terminal cap subdomains are mainly responsible for multimer stability.

The β-solenoid of INPs is stabilized by a capping structure at the C terminus, but not at the N terminus

There is a clear C-terminal capping structure in the AlphaFold model (Fig. 1A), but a possible N-terminal cap was more nebulous. Most protein solenoids are N- and/or C-terminally capped to help maintain the fold and/or prevent end-to-end associations (32). Looking at the N-terminal sequence, we tested if any part of the extended linker region serves as an N-terminal capping motif. To investigate this, we made a series of incremental N-terminal deletions starting at residues Asp150 (Truncation 1), Gln159 (Truncation 2), and Gln175 (Truncation 3) (Fig. 7A).

A) Sites of N-terminal truncations to PbINP, indicating the location of the starting residue in the shortened construct. B) Alignment of representative INP C-terminal domains from the genus Pseudomonas. Mutated residues and their one-letter codes are indicated above. Symbols at the bottom indicate consensus (* for fully conserved, : for conservation of strongly similar chemical properties, . for conservation of weakly similar chemical properties). C) Predicted location of mutated residues in the PbINP C-terminal cap with sidechains shown and predicted H-bonds for D1208 shown as dashed lines. D) The ice nucleation curves for the N- and C-terminal cap mutants.

Truncation 1 lacked most of the N-terminal domain, leaving the last few residues of the unstructured linker. Truncation 2 removed those linker residues so that the putative cap (a single β-strand) was located at the very N-terminal end of the protein. Truncation 3 removed the β-strand along with the rest of the first coil of the solenoid. When tested, there was no difference between the activities of the three truncations and the wild type (p = 0.82) (Fig. 7D). This result is in line with those from Kassmannhuber et al. (28), which showed that deletion of the N-terminal domain does not significantly affect ice nucleation activity.

Previously, we demonstrated that the C-terminal cap is essential for ice nucleation activity (8). Bioinformatic analysis showed a high degree of conservation in the C-terminal cap residues (Fig. 7B). Rather than deleting the cap, we made targeted mutations (F1204D, D1208L, and Y1230D) to disrupt the structure predicted by AlphaFold. Residues for mutation were chosen based on their putative key roles in the AlphaFold model. To enhance the effect of the mutations, hydrophobic residues were replaced with charged ones and vice versa. F1204 sits atop the final R-coil to cover its hydrophobic core. D1208 helps to maintain a tight loop through strategic hydrogen bonds, and Y1230 fills a gap in the surface of the cap (Fig. 7C). When comparing these selections to the aligned C-terminal cap sequences, we see that all three residues are highly conserved. The resulting triple mutant displayed greatly reduced activity (T50 = –27.8 °C), which helps validate the AlphaFold-predicted structure of the cap and its importance to the stability of the solenoid it covers.

Cryo-electron tomography reveals INP multimers form bundled fibres in recombinant cells

The idea that INPs must assemble into larger structures to be effective at ice nucleation has persisted since their discovery (6). In the interim, the resolving power of cryo-EM has immensely improved. Here we elected to use cryo-electron tomography to view the INP multimers in situ and avoid any perturbation of their superstructure during isolation. E. coli cells recombinantly overexpressing INPs were plunge-frozen and milled into ∼150-nm-thick lamellae using cryo-FIB (Fig. 8A). Grids containing lamellae were transferred into either a 200- or a 300-kV transmission electron microscope for imaging under cryogenic conditions. Many E. coli cells were observed within the low-magnification cryo-TEM overview image of the lamella (Fig. 8B). Tilt series were collected near individual E. coli cells, and 3-D tomograms were reconstructed to reveal cellular and extracellular features. Strikingly, E. coli cells overproducing wild-type INPs appear to be lysed after three days of cold acclimation at 4 °C and contain clusters of fibres in the cytoplasm (Fig. 8 C, D, E; tomograms in Movies S2 and S3). Individual fibres are up to a few hundred nanometers in length but only a few nanometers in width. Intriguingly, these fibre clusters were not observed in E. coli that overexpress INP mutants lacking R-coils, and the cell envelopes of these cells remained intact after cold acclimation over the same period as the wild-type INP-producing E. coli (Fig. S4 A, B, C, D).

Fibrous bundles observed by cryo-FIB and cryo-ET in E. coli cells expressing INP.

A) Ion-beam image of a thin lamella containing E. coli cells expressing INP obtained from cryo-FIB milling. B) Zoomed-in view of a cryo-TEM image of the lamella in A). Boxes with dashed lines indicate areas where tilt series were collected. C) and D) Snapshots from 3-D cryo-tomograms reconstructed from tilt series collected in the boxed regions in B) showing striking fibrous bundles (yellow arrowheads). The E. coli cell envelopes are indicated with thick dashed lines. E) Further examples of the fibrous bundles produced by INP-expressing E. coli. Scale bars: 10 μm in A), 2 μm in B), and 100 nm in C), D), and E).

Discussion

Previously, we showed that the PbINP solenoid domain is made up of two subdomains: the larger N-terminal region of WO-coils accounting for 80-90% of the total length; and the smaller C-terminal R-coil region accounting for the remaining 10-20% (8). The length of the WO-coil region and the continuity of the water-organizing motifs were shown to directly affect ice nucleation temperature. Although the R-coil region lacks water-organizing motifs, its presence was critical for ice nucleation activity, which led us to propose a key role for this region in INP multimer formation. Here we have characterized the R-coil subdomain in terms of the attributes it needs to support INP multimerization and have shown by cryo-ET the first in situ view of what these multimers look like. In addition, we have advanced a working model for the INP multimer structure that is compatible with all of the known INP properties.

In the aforementioned work, we showed that removal of up to half of the PbINP solenoid (reducing the number of WO-coils from 53 to 21) only dropped the ice nucleation activity by ∼2 °C. Here we have confirmed this tolerance of WO-coil count variation through bioinformatic analysis of natural INPs. The majority of bacterial INPs have WO-coil counts between 30 and 70. PbINP is average in this respect with 53 WO-coils. It seems counterintuitive that these bacteria have not been uniformly selected for the highest WO-coil count, which might give them an advantage in causing frost damage to plants at the highest possible temperature (33). However, it is clear that INPs are not functioning as monomers but rather as large multimers so any loss of water-organizing surface can potentially be compensated for by simply adding more monomers to the multimer.

The ability to form superstructures is a key property of INPs and centers on the R-coil subdomain. This was shown here in the same bioinformatic analysis, where there is remarkably little variation in the R-coil length of 10-12 coils. The importance of a minimal R-coil region length is supported by experiments. Whereas over 30 of the WO-coils can be removed with slight loss of activity, when six of the 12 PbINP R-coils were replaced by WO-coils there was a catastrophic loss of ice nucleation activity, and little or no activity remained with further shortening of the R-coils. We postulate that at least eight R-coils are required for efficient multimer formation and that the ice nucleation activity of a monomer is inconsequential in the natural environment.

In the absence of detailed structural information, we have probed the properties of the multimers to help develop feasible models for their structure and assembly. The location of the R-coils at the C-terminal end of the solenoid next to the highly conserved cap structure is critical, as they do not function in the middle of the WO-coil region, and only poorly at the N-terminal end. These R-coils have a strong positive charge from the Arg and Lys residues, whereas the WO-coils are negatively charged, and their interaction potentially provides an electrostatic component to the fibre assembly. As expected, changing the charge on the R-coils from positive to negative caused some loss of ice nucleation activity (∼10 °C), consistent with charge repulsion between these two solenoid regions weakening the multimer structure. In wild-type INPs, the negative charges of the WO-coils are consistent throughout their length, which offers no clue as to where on the WO-coils the R-coils might interact. One possible advantage of this uniformity is that multimer assembly could still happen if the WO-coil length is appreciably shortened as it can be in nature and by experimentation (8).

The minimal effects of pH change on native INP activity are reminiscent of the insensitivity of antifreeze activity to pH (30, 34). The ice-binding sites of AFPs are typically devoid of charged residues and there should be no effect of pH on the ability of these sites to organize ice-like waters. The same can be said for the water-organizing motifs in INPs. We noted the extraordinary heat stability of INP multimers. Even after heating to 99 °C for 10 min the bacterial extracts only lost 5 °C of ice nucleation activity, whereas the heat-stable internal GFP control was denatured at 75 °C. We cannot rule out the possibility that the INP multimers were also denatured by heat treatment but could reassemble on cooling.

Working model of the INP multimer

The fundamental unit of the INP multimer in this hypothetical model is a dimer (Fig. 9). The dimerization interface involves an interaction of the stacked tyrosine ladders from the two INP monomers as previously suggested (10, 24). However, in this model the INPs are aligned antiparallel to each other (Fig. 9B). This orientation is more likely than a parallel alignment since the R-coils and C-terminal cap structure appear to clash when modelled parallel to each other (Movie S5). The antiparallel dimer would not be a rigid, flat sheet but could hinge at the tyrosine ladder. Another advantage of the antiparallel arrangement is that the two dimer termini are identical allowing end-to-end linking to form a long fibre.

Filamentous multimer model for bacterial INPs.

A) A possible assembly of INP solenoids to form long fibres composed of antiparallel INP dimers (indicated by orange and yellow pairs). B) Dimers are formed along the tyrosine ladder, a previously proposed dimerization interface. They are joined end to end by forming electrostatic interactions between negatively (red) and positively (blue) charged surfaces. All threonines are coloured light green, displaying the arrays of TxT WO-motifs. The termini of the INP solenoids are labeled N and C and coloured to match panel A. This illustration uses a manually flattened AlphaFold model of PbINP. C) Cross sections of the model at positions indicated in A. Monomers are rotated approximately 90° to each other and dimerized along their tyrosine ladders (purple). Toward their termini, a pair of dimers can be matched by oppositely charged electrostatic surfaces (teal).

The end-to-end dimer associations involve electrostatic interactions between the basic side of the R-coils and the acidic side of the WO-coils. If these interactions can also form with the proteins at an approximate right-angle, it should be possible for end-linked dimers to form a compact fibre (Fig. 9B) with a diameter close to that seen by cryo-ET (Fig. 8). The antiparallel arrangement of the dimers gives a sidedness to the multimer where TxT motifs (light green) face outwards and inwards in an alternating pattern with SxT motifs (on the underside) in the opposite phase (Fig. 9B). Cross-sectional views of the INP fibre (Fig. 9C) show the interactions between the negatively (red) and positively (blue) charged regions where the dimers overlap to form a ring of four solenoids, while maintaining the interaction of the two monomers through the tyrosine ladder pairing.

Working model of the INP multimer is consistent with the properties of INPs and their multimers

We previously showed that the length of the WO-coil region can be shortened by ∼60% with only a few °C decrease in ice nucleation temperature (8). The working model can accommodate these huge deletions simply by closing the gap between the dimers. For example, the deletion of 32 WO-coils leaving just 21 along with the 12 R-coils retains all the molecular interactions seen in the longer fibre but with fewer stacked tyrosine interactions. This can help explain the heat stability of the INP multimers and the minimal difference (2-3 °C) in activity loss between full-length PbINP with 65 coils and the truncated version with 33 coils (Fig. 6). Similarly, longer WO-coil regions can be accommodated by lengthening the gap. This can explain the wide range of WO-coil lengths seen in nature (Fig. 2). They all fit in the same model.

Our model also shows how the interaction between the R-coils and the WO-coils of the adjacent dimers supports fibre formation. Any shortening of the R-coil subdomain jeopardizes the ability to link up the dimers. The catastrophic loss of ice nucleation activity seen below 8 R-coils is because the interacting length of R-coils and WO-coils has too few electrostatic and other interactions to bridge the dimers together. The importance of electrostatic interaction has been illustrated in this study in two ways. First, when the R-coil basic residues were replaced by acidic residues, the ice nucleation activity was severely compromised but was fully restored when the mutated residues were all converted to lysines. Second, in cell-free extracts of lysed INP-producing E. coli ice nucleation activity decreased by a few degrees Celsius at low pH values where the charge on acidic residue side chains was reduced or eliminated. When the carboxyl groups of aspartate and glutamate involved in electrostatic pairing lose their negative charges at low pH, they can still form hydrogen bonds with basic amino acid partners, which can explain why the lowering of pH was not as disruptive as reversing the charge on these residues. Another useful test of the electrostatic component to the multimer model would be to study the effects of increasing salt concentration on ice nucleation activity of the E. coli extracts.

The observation that low, variable levels of ice nucleation activity remained in the construct where the R-coil basic residues were replaced by acidic residues suggests that there are additional binding interactions between the dimers other than electrostatic ones. We suggest the involvement of the highly conserved C-terminal capping structure. When three mutations designed to disrupt the cap fold were introduced, all ice nucleation activity was lost. Also of note is the disruptive effect of extending the tyrosine ladder further into the R-coil subdomain in the mutant where the acidic residues replaced the basic ones. The subtle details of the R-coil region will require detailed structural analysis for their elucidation.

The relocation of the R-coils to the N-terminal end of the solenoid caused a loss of just over 50% of activity, and it is possible to accommodate such a change in the model while retaining a charge interaction between the R-coils and WO-coils. However, the separation from the cap structure might account for some of the activity loss. Movement of the R-coils to the centre of the WO-coil region is not compatible with the model and, as expected, this construct was devoid of ice nucleation activity (Fig. 3B).

Other features supporting the model are that the dimer’s C- and N-terminal ends are exposed and can accommodate tags and extensions without disrupting the fibre. Thus, the addition of a C-terminal GFP tag has no detrimental effect on ice nucleation activity. Nor is there any difference in activity whether or not the N-terminal INP domain and linker region are present (Fig. 7) (8, 28). Even the incorporation of a bulky protein like mRuby into the WO-coils (8) can be accommodated because the fibre is just a dimer rather than a bundle of solenoids.

Electron microscopy of newly synthesized INPs in a cell-free system shows them as thin molecules of dimensions 4-6 nm in diameter by a few hundred nm in length (25). Negatively stained images of recombinantly produced INP multimers isolated by centrifugation and chromatography show an elongated structure ∼5-7 times longer than a monomer but not much wider (24). The fibres seen in situ in INP-expressing E. coli (Fig. 8) are similarly long but slightly thinner, consistent with the absence of negative staining. The model in Fig. 9 is the thinnest structure we can project for a fibrillar multimer.

Solving the structure of the INP fibres at atomic detail will be the key to understanding the remarkable ability of biological ice nucleators to start the freezing process at high sub-zero temperatures. Structures of this type offer the promise of cell-free ice nucleation for use in biotechnological and food applications where there is a need to avoid the use of bacteria.

Methods

AlphaFold prediction

The AlphaFold model for PbINP was generated by Forbes et al. as described (8).

Bioinformatic analysis of INPs

NCBI’s BLAST was accessed using the BioPython library v1.81 (35). The consensus sequence for the 16-residue coil ‘AGYGSTQTAGEDSSLT’ was used as the query against the non-redundant protein database. The PAM30 scoring matrix was used due to the short query. Quality control (QC) was performed using custom Python scripts, making use of BioPython’s Entrez module to fetch information on the protein, BioProject, and assembly method for each BLAST result (Fig. S6). Custom Python scripts were used to automatically identify the tandem repeats and classify them as WO-coils or R-coils.
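The repeat classification step can be illustrated with a minimal sketch. This is not the authors' pipeline: the rule used here (a coil is called an R-coil if any of positions 11, 12, or 14 holds Arg, Lys, or His) is an assumption drawn from the mutant design described below, the function names are hypothetical, and the input is assumed to be a repeat region already trimmed so that it starts at position 1 of a coil.

```python
# Illustrative sketch only; the classification rule is an assumption,
# not the custom scripts used in this study.
COIL_LENGTH = 16                  # residues per tandem repeat (one solenoid coil)
BASIC_RESIDUES = set("RKH")       # Arg, Lys, His
CHARGE_POSITIONS = (11, 12, 14)   # 1-based positions within a coil


def split_into_coils(repeat_region):
    """Cut an in-register repeat region into consecutive 16-residue coils.

    Any trailing partial coil is ignored.
    """
    n_full = len(repeat_region) // COIL_LENGTH
    return [
        repeat_region[i * COIL_LENGTH:(i + 1) * COIL_LENGTH]
        for i in range(n_full)
    ]


def classify_coil(coil):
    """Return 'R' if a basic residue sits at position 11, 12, or 14, else 'WO'."""
    if len(coil) != COIL_LENGTH:
        raise ValueError("a coil must contain exactly 16 residues")
    has_basic = any(coil[pos - 1] in BASIC_RESIDUES for pos in CHARGE_POSITIONS)
    return "R" if has_basic else "WO"


def count_coils(repeat_region):
    """Count WO-coils and R-coils in an in-register repeat region."""
    labels = [classify_coil(coil) for coil in split_into_coils(repeat_region)]
    return {"WO": labels.count("WO"), "R": labels.count("R")}


if __name__ == "__main__":
    wo_consensus = "AGYGSTQTAGEDSSLT"   # consensus coil used as the BLAST query
    r_like = "AGYGSTQTAGRDSSLT"         # hypothetical coil with Arg at position 11
    print(count_coils(wo_consensus * 3 + r_like * 2))
```

A real pipeline would also need to locate the start of the repeat array and tolerate imperfect repeats, which this sketch does not attempt.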

Sequence logos were made using the Logomaker package v0.8 (36). Alignment of C-terminal cap sequences was performed using JalView software v2.11.2.6 (37).

Synthesis of PbINP genes

Experiments for this project used a synthetic PbINP gene previously developed by our group. This codon-optimized gene encodes the P. borealis INP gene (GenBank accession: EU573998). Additionally, the DNA sequence for enhanced green fluorescent protein (eGFP) (GenBank accession: AAB02572) was fused to the 3ʹ-end of the PbINP gene using a hexanucleotide encoding two linker residues (Asn-Ser). More details about the PbINP-eGFP sequence are provided in Forbes et al. (8).

All mutants for this study were designed by modifying the aforementioned synthetic gene. GenScript (Piscataway, NJ, USA) performed all gene syntheses, which we subsequently cloned into the pET-24a expression vector.

Five PbINP mutants were designed to test the effect of replacing the R-coils. The R-coils were incrementally replaced with the sequences of WO-coils adjacent to the R-coil region as indicated, resulting in constructs containing 10, 8, 6, 4, and 1 R-coil (Fig. 2A). Replacements were designed such that they maintained the periodicity of the tandem repeats. The C-terminal R-coil was left untouched to avoid disturbing possible interactions with the putative C-terminal cap structure.

The R-coils were either relocated within the protein or deleted (Fig. 3A), while again leaving the N- and C-terminal coils untouched to avoid disturbing interactions with adjacent domains.

Targeted mutations were introduced into the R-coil region of the gene to produce four additional constructs (Fig. 4A). For the first construct (RKH replacement), any positively charged residues (Arg, Lys, His) in positions 11, 12, or 14 in the R-coils were replaced with residues commonly found in those locations in the WO-coils (Asp, Glu, and Gly for positions 11 and 12, Ser for position 14). The second construct (Y extend in Fig. 4A) extends the stacked tyrosine ladder present at position 3 of the coils through 7 additional coils toward the C terminus of the solenoid. The third construct (RKH replacement + Y extend in Fig. 4A) is a combination of both mutants. The fourth construct (K-coils in Fig. 4A) converted every Arg residue in the R-coil section to a Lys residue.

Protein expression in E. coli

Each PbINP construct was transformed into the ArcticExpress strain of E. coli, since its expression of two cold-adapted chaperones, Cpn10 and Cpn60, promotes the correct folding of proteins at low temperatures (38). Transformation and induction with IPTG were performed according to the supplier’s instructions (Agilent Technologies, Catalog #230192). Expression was carried out at 10 °C for 24 h post-induction. The eGFP tag allowed expression to be confirmed using fluorescence microscopy (8).

Ice nucleation assays by WISDOM

Constructs were assayed on WISDOM (WeIzmann Supercooled Droplets Observation on a Microarray) (39) in a manner similar to that described in Forbes et al. (8).

Ice nucleation assays by nanoliter osmometer

Ice nucleation activity was quantified using a droplet freezing assay protocol (40) that makes use of a LabVIEW-operated nanoliter osmometer (Micro-Ice, Israel) (41). Briefly, following induction and cold incubation, nanoliter-sized droplets of liquid cultures were pipetted into oil-filled wells resting on a cold stage. The temperature of the cold stage was lowered at a rate of 1 °C/min while a video recording was taken of the sample grid. Freezing was characterized by a distinct change in droplet appearance. After assay completion, the videos were analyzed to record the temperatures of all freezing events. The fraction of frozen droplets (f_ice) was plotted as a function of temperature, generating ice nucleation curves for each sample. This apparatus could not reach temperatures as low as those achieved on WISDOM, but results are in agreement between the two approaches (Fig. 2B).
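As an illustration of how recorded freezing events translate into an ice nucleation curve, the following minimal sketch computes the fraction of frozen droplets at a given temperature. It is not the analysis code used in this study; the function names and the example freezing temperatures are hypothetical.

```python
# Illustrative sketch only; names and data are hypothetical.

def frozen_fraction(freezing_temps_c, n_droplets, temperature_c):
    """Fraction of droplets already frozen once the stage has cooled to temperature_c.

    freezing_temps_c lists the temperatures (in deg C) at which droplets were
    seen to freeze. Droplets that never froze are absent from the list but
    still count toward n_droplets.
    """
    if n_droplets <= 0:
        raise ValueError("n_droplets must be positive")
    n_frozen = sum(1 for t in freezing_temps_c if t >= temperature_c)
    return n_frozen / n_droplets


def nucleation_curve(freezing_temps_c, n_droplets, temperatures_c):
    """Return (temperature, frozen fraction) pairs for plotting."""
    return [
        (t, frozen_fraction(freezing_temps_c, n_droplets, t))
        for t in temperatures_c
    ]


if __name__ == "__main__":
    # Hypothetical run: five droplets, four of which froze during the ramp.
    events = [-8.5, -9.0, -9.5, -12.0]
    for temperature, fraction in nucleation_curve(events, 5, [-5, -9, -10, -20]):
        print(temperature, fraction)
```

Counting unfrozen droplets in the denominator keeps curves comparable between samples that do and do not freeze completely within the accessible temperature range.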

Heat treatment and pH

To obtain cell lysates, E. coli cultures were centrifuged at 3,200 × g for 30 min post-induction. Cell pellets were then resuspended in a lysis buffer of 50 mM Tris-HCl, 150 mM NaCl, containing Pierce Protease Inhibitor (Thermo Scientific, Canada) before sonication at 70% amplitude for 30-s rounds. Lysate was centrifuged at 31,000 × g and the resulting supernatant was passed through a 0.2 μm filter.

For heat treatment, filtered lysate in sealed Eppendorf tubes was heated at 60 °C, 70 °C, 80 °C, 90 °C, or 99 °C for 10 min in a thermocycler and then quenched on ice prior to being assayed for activity.

For the pH experiments, aliquots of filtered lysate were diluted 50-fold in pH-adjusted buffer of 100 mM sodium citrate, 100 mM sodium phosphate, and 100 mM sodium borate following the protocol by Chao et al. (30). Before assaying, we verified using universal indicator strips that addition of lysate to the buffer mixtures did not meaningfully affect the pH of the final mixtures.

Preparation of the cryo-EM grids

After confirming eGFP-INP expression, the E. coli cultures were incubated at 4 °C for an additional 3 days. The E. coli cells were spun down and resuspended in PBS to an OD600 nm of ∼3. These concentrated E. coli samples were deposited onto freshly glow-discharged QUANTIFOIL holey carbon grids (Electron Microscopy Sciences). The grids were then blotted from the back side with filter paper for ∼5 s before being plunge-frozen in liquid ethane, using a manual plunge-freezing apparatus as described previously (42, 43).

Cryo-FIB milling

The plunge-frozen grids with E. coli cells were clipped into cryo-FIB AutoGrids and mounted into the specimen shuttle under liquid nitrogen. An Aquilos2 cryo-FIB system (Thermo Fisher Scientific) was used to mill the thick bacterial samples into lamellae of < 200 nm in thickness. The milling process was completed using a protocol as previously described (44).

Cryo-ET data acquisition and tomogram reconstruction

Grids containing the lamellae obtained from cryo-FIB milling were loaded into either a 300-kV Titan Krios electron microscope (Thermo Fisher Scientific) equipped with a Direct Electron Detector and energy filter (Gatan) or a 200-kV Glacios Electron Microscope at Yale University. The FastTOMO script was used with the SerialEM software to collect tilt series with defocus values of approximately −6 μm (45), and a cumulative dose of ∼70 e⁻/Å² covering angles from −48° to 48° (3° tilt step). Images were acquired at 42,000 × magnification with an effective pixel size of 2.148 Å. All recorded images were first drift corrected by MotionCor2 (46), stacked by the software package IMOD (47), and then aligned by IMOD using Pt particles as fiducial markers. TOMO3D was used to generate tomograms by the simultaneous iterative reconstruction technique (SIRT) (48). In total, 10 tomograms were reconstructed with TOMO3D for the WT INP while 5 tomograms were produced for the R-coil mutant.
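The tilt scheme implies a fixed number of images per tilt series, which the short sketch below works out. The per-image dose it reports assumes that the cumulative dose of about 70 electrons per unit area was divided evenly among the tilts; that even split is an assumption for illustration and is not stated in the acquisition protocol. The function name is hypothetical.

```python
# Illustrative arithmetic only; the even dose split is an assumption.

def tilt_angles(start_deg, stop_deg, step_deg):
    """List every tilt angle from start_deg to stop_deg inclusive."""
    n_steps = round((stop_deg - start_deg) / step_deg)
    return [start_deg + i * step_deg for i in range(n_steps + 1)]


angles = tilt_angles(-48, 48, 3)

# Range and step from the acquisition parameters above.
n_tilts = len(angles)

# Assumed: the cumulative dose is shared equally by all tilt images.
cumulative_dose = 70.0
dose_per_tilt = cumulative_dose / n_tilts

if __name__ == "__main__":
    print(n_tilts, round(dose_per_tilt, 2))
```

With a 3° step across −48° to 48°, each tilt series contains 33 images, so an even split would give roughly 2.1 electrons per unit area per image.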

Data Availability

The full dataset of long-read sequences used is available in Supplementary Data 1.

Contributions

T.H., J.C.L., and P.L.D. planned these experiments based on the identification of the R-coil subdomain by T.H. and P.L.D. in a prior work. T.H., J.C.L., and P.L.D. wrote the original manuscript for peer review. T.H. and P.L.D. designed the model for multimerization. T.H. coded the pipeline and performed the bioinformatic analysis of INPs.

T.H. and J.C.L. designed all mutants and prepared all samples except where otherwise noted. J.C.L. prepared the buffers and samples for the pH experiments, and performed all measurements of ice nucleation activity on the nanoliter osmometer. T.H. and J.C.L. performed the heat treatment experiments.

G.O. and I.B. prepared samples whose activity was assayed by N.R. and Y.R. on WISDOM. S.G., W.G. and J.L. performed cryo-FIB and cryo-ET experiments. S.G. prepared Fig. 8 and Movies S2 and S3. T.H. prepared all other figures.

Acknowledgements

This work was supported by CIHR Foundation Grant FRN 148422 to P.L.D., who holds the Canada Research Chair in Protein Engineering, and by an Israel Science Foundation grant to I.B. Y.R. acknowledges support by a research grant from the Yotam project and the Weizmann Institute sustainability and energy research initiative. S.G. was supported by a CIHR Post-Doctoral Fellowship and an NIH R01 grant (R01AI087946) of J.L. W.G. was also supported by the NIH R01 grant (R01AI087946) of J.L. We thank Virginia K. Walker for the gift of the Pseudomonas borealis strain.

Supplemental information

AlphaFold shows high confidence in overall fold of the model.

A) Predicted local distance difference test (pLDDT) shows the residue-by-residue confidence of the model generated by AlphaFold. Low values may indicate low confidence or intrinsic disorder. B) Predicted aligned error (pAE) plots indicate the confidence in the relative orientation of different parts of the model. The x- and y-axes indicate residue position in the model, with low (blue) values indicating high confidence and high (red) values indicating low confidence. Rigid domains often appear as squares along the diagonal axis.

E. coli expressing an INP mutant lacking R-coils show no fibre clusters like those observed in cells overexpressing wild-type INP.

A-D) Representative snapshots from 3-D cryo-tomograms showing cytoplasmic and extracellular features of various E. coli cells overexpressing an INP mutant in which all but the C-terminal R-coil have been replaced by WO-coils. All four images are at the same scale and the scale bar represents 100 nm.

Flowchart and quality control steps in sequence selection for bioinformatic analysis.

Ten known INPs from literature were used to generate a consensus sequence for WO-coils which was then used as a query in a BLAST against NCBI’s non-redundant protein database to identify INPs. NCBI E-Utils were used to generate a data set using only genes from long-read DNA sequencing data.