A select subset of DFDs have intrinsic nucleation barriers enabling persistent supersaturation

A. Schematic diagram illustrating two models for signal amplification through protein self-assembly. Top left: Extrinsic model, where PAMP-binding coupled with nucleotide hydrolysis stabilizes active assemblies (red glow) relative to solute precursors (blue glow). This model is exemplified by localized actin polymerization downstream of many cell surface receptors,76 but could also occur indirectly by, for example, phosphorylation-mediated release of solubilizing factors. Bottom left: DFDs that function in this way will assemble promptly and monotonically above their saturation concentration (Csat). Top right: Intrinsic model, where the protein is supersaturated at rest but prevented from assembling by a sequence-encoded nucleation barrier. PAMP-binding eliminates the barrier, releasing the energy of supersaturation to drive assembly. The models are not mutually exclusive. Bottom right: DFDs that function in this way will remain soluble above Csat until stochastic nucleation, creating a discontinuous relationship of assembly to concentration across a population of cells.

B. Illustration showing how the concentration-dependence of self-assembly as classified by DAmFRET relates to the subcellular morphology of self-assemblies classified by high-throughput confocal microscopy. “Continuous” and “discontinuous” classifications describe the relationship of self-assembly (AmFRET, y-axis) to expression level (x-axis) for each DFD. Discontinuous DFDs exhibit a range of concentrations where self-assembly occurs stochastically, indicating an intrinsic nucleation barrier. The four instances of visible assemblies despite no AmFRET-positive cells are presumed to result from those DFDs partitioning with other cellular components or endogenous condensates wherein they remain too dilute to FRET.

C. Distribution of DAmFRET classifications across the four subfamilies of DFDs.

D. Schematic diagram of our experimental design to assess the ability of each DFD to seed itself. Top: Biological activation of an exemplary signalosome –– the AIM2 inflammasome –– occurs when the receptor AIM2 oligomerizes on the multivalent PAMP, dsDNA, and then templates the assembly of the adaptor protein, ASC. Bottom: Experimental paradigm to test for supersaturation mimics biological activation, by expressing each DFD in trans with the same DFD expressed as a fusion to μNS, a modular self-condensing protein. AmFRET-positivity will only occur if the μNS fusion templates subsequent self-assembly by the non-μNS-fused DFD.

E. Representative DAmFRET data contrasting two self-assembling DFDs –– one that is supersaturable (left) and the other that is not (right). The plot for the supersaturated protein exhibits a discontinuous distribution of AmFRET across the expression range (top and bottom). The discontinuity is eliminated, with all cells moving to the AmFRET-positive population, by expressing the protein in the presence of genetically encoded seeds (middle). The dashed horizontal lines approximate the mean AmFRET value for monomeric mEos3. Procedure defined units (p.d.u.).

F. Contingency table showing that discontinuous DFDs tend to be self-seedable. Chi-square test revealed a strong association between continuity and self-seedability. χ² (1, n = 83) = 40.71; p < 0.001. Cramér’s V = 0.700.
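
The effect size in panel F can be checked from the reported chi-square statistic. The following is a minimal sketch, assuming the standard definition V = sqrt(χ² / (n · min(r − 1, c − 1))) for a 2 × 2 contingency table; the function name is illustrative and is not taken from the original analysis.

```python
import math

def cramers_v(chi2, n, n_rows=2, n_cols=2):
    """Cramér's V from a chi-square statistic (standard definition).

    n_rows and n_cols default to a 2 x 2 table, which is an assumption
    based on the single degree of freedom reported in the legend.
    """
    k = min(n_rows - 1, n_cols - 1)
    return math.sqrt(chi2 / (n * k))

# Values reported in the legend: chi-square = 40.71, n = 83
v = cramers_v(40.71, 83)
print(f"Cramér's V = {v:.3f}")
```

With the reported values this gives V of approximately 0.700, in agreement with the legend.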

G. Boxplot comparing the Csat values (as approximated by C50seeded) of continuous and discontinuous DFDs. Discontinuous DFDs have significantly lower Csat, indicating greater stability of the assemblies. Mann-Whitney U = 457, ncontinuous = 26, ndiscontinuous = 20 (p < 0.001).
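
For readers unfamiliar with the statistic, the Mann-Whitney U reported here can be sketched by direct pairwise counting. This is an illustrative sketch only: the function name and the toy values are hypothetical, and which of the two complementary U values a given software package reports is an assumption that should be checked against the tool actually used.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y) by pairwise comparison; ties count one half.

    U_x counts the pairs in which a value from x exceeds a value from y.
    The two statistics always sum to len(x) * len(y).
    """
    u_x = 0.0
    for a in x:
        for b in y:
            if a > b:
                u_x += 1.0
            elif a == b:
                u_x += 0.5
    u_y = len(x) * len(y) - u_x
    return u_x, u_y

# Toy values, NOT data from the figure
continuous = [5.1, 6.3, 7.0, 8.2]
discontinuous = [1.2, 2.5, 5.1]
print(mann_whitney_u(continuous, discontinuous))
```

Because the two statistics sum to the product of the group sizes (26 × 20 = 520 in panel G), the reported U = 457 is complementary to U = 63.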

H. Boxplot comparing supersaturability, represented as the fold change reduction in C50 by seeding (C50stochastic – C50seeded), of continuous and discontinuous DFDs. The C50 values were more strongly reduced by seeding for discontinuous DFDs than for continuous DFDs. Mann-Whitney U = 164, ncontinuous = 58, ndiscontinuous = 21 (p < 0.001).

See also Figures S1, S2, and Table S1.

Nucleation barriers are a characteristic feature of inflammatory signalosome adaptors

A. Boxplot of DFD-containing protein abundances in monocytes, showing that discontinuous DFDs have higher endogenous expression levels. Mann-Whitney U = 53, ncontinuous = 26, ndiscontinuous = 8 (p = 0.039). Protein abundance values are from PAXdb.18

B. Scatter plot of DFD gene expression in monocytes (normalized transcripts per million) and Csat values. Spearman R = –0.285 (p = 0.03). Adaptor DFDs are labeled. Dataset obtained from the Human Protein Atlas.

C. Top: box plots of degree centrality (left) and betweenness centrality (right) of continuous and discontinuous DFDs in the endogenous network of physically interacting DFD proteins, showing that the latter are more centrally positioned. Degree centrality Mann-Whitney U = 242.0 (p = 0.010); betweenness centrality Mann-Whitney U = 274.0 (p = 0.030); ncontinuous = 46, ndiscontinuous = 18. Bottom: box plots of centrality measures of non-seedable and seedable DFDs, showing that the latter are more centrally positioned. Degree centrality Mann-Whitney U = 167.5 (p = 0.022); betweenness centrality Mann-Whitney U = 172.5 (p = 0.023); nnon-seedable = 35, nseedable = 16.

D. Visualization of how the DAmFRET profiles of isolated DFD domains (left) change in their full-length contexts (right), showing that only adaptor proteins (green connections) tend to retain discontinuous transitions in their full-length context.

E. Subnetworks of prominent signalosome adaptor proteins that were found to be supersaturable. Edges connect nodes with experimentally determined physical interactions with confidence > 0.9 in STRING. All proteins shown have DFDs except TRAFs. Each adaptor’s node size is proportional to its supersaturability score.

F. Comparison of protein abundances at the whole body level for the signalosome components in Figure 2E (left) and Figure 2G (right), showing that adaptors are more highly expressed for the former. Protein abundance values are from PAXdb.18 P-values are from Mann-Whitney test. For supersaturable signalosomes: nsensor = 13, nadaptor = 6, neffector = 4; sensors and adaptors, U = 4.0 (p < 0.001); sensors and effectors, U = 6.0 (p = 0.023); adaptors and effectors, U = 18.0 (p = 0.257). For non-supersaturable signalosomes: nsensor = 3, nadaptor = 2, neffector = 2; sensors and adaptors, U = 1.0 (p = 0.400); sensors and effectors, U = 0.0 (p = 0.200); adaptors and effectors, U = 0.0 (p = 0.333).

G. Subnetworks of signalosomes lacking supersaturable DFDs. Edges connect nodes with experimentally determined physical interactions with confidence > 0.9 in STRING. All proteins shown have DFDs except TRAF6.

See also Figures S3 and S4, and Tables S1 and S2.

Nucleation barriers facilitate signal amplification in human cells

A. Schematic diagram of experiment in HEK293T cells to reconstitute the apoptosome with optogenetic control, in either a non-supersaturable or supersaturable format. The non-supersaturable format comprises the typical APAF1CARD and CASP9 pair; the supersaturable format comprises the chimeric APAF1 with NLRC4CARD in place of APAF1CARD and chimeric CASP9 with CASP1CARD replacing CASP9CARD (CASP9CASP1CARD). Blue light will trigger assembly in both cases, but subsequent disassembly in the dark will only occur for the non-supersaturated apoptosome.

B. Caspase 3/7 activity reporter fluorescence intensities in the absence of stimulation or after one minute of 488 nm stimulation for cell lines expressing the non-supersaturable or supersaturable pairs, showing that both pairs comparably activate caspase 3/7 while oligomerized. APAF1CARD-Cry2 + CASP9-mScarlet-I, dark n = 163, pulse n = 375, Mann-Whitney U = 11362 (p < 0.0001). NLRC4CARD-Cry2 + CASP9CASP1CARD, dark n = 46, pulse n = 305, Mann-Whitney U = 4253 (p < 0.0001).

C. Coefficient of variation (CV) of fluorescence distribution in HEK293T cells expressing the indicated protein pairs after a single one minute 488 nm laser activation. Top, APAF1CARD-Cry2 and CASP9-mScarlet-I display rapid cluster formation that dissociates by 20 min. Bottom, NLRC4CARD-Cry2 and chimeric CASP9CASP1CARD cluster less rapidly but the clusters continue to grow indefinitely.

D. Representative images from experiment in C. Clusters of APAF1CARD-Cry2 and CASP9-mScarlet-I form then dissociate while NLRC4CARD-Cry2 and CASP9CASP1CARD clusters only get larger.

E. Quantification of cell death of the HEK293T chimeric cells (as in A) using Annexin V-Alexa 488 staining, either two hours after a single one minute pulse of 488 nm laser, or after two hours of “constant” stimulation whereby cells were subjected to a one second pulse every one minute. P-values derived from t-test.

See also Figure S5, Table S3, and Movies S1 and S2.

Innate immune adaptors are endogenously supersaturated

A. Time course of apoptotic cell death of THP-1 cells following exposure to AIM2 ligand, poly(dA:dT). P-value obtained from ANOVA followed by pairwise comparison.

B. Schematic diagram of the experiment to transiently optogenetically stimulate AIM2PYD to monitor ASCPYD assembly. This experiment was conducted in HEK293T cells because they do not undergo pyroptosis.

C. Top, Time course of fluorescence intensity distribution in THP-1 cells following 10 seconds of optogenetic activation, showing that WT AIM2PYD forms clusters (high CV) that persist and induces cell death, while the F27G solubilizing mutant16 forms clusters that subsequently disperse. Bottom, normalized Sytox Orange fluorescence intensity for the experiment in the top panel.

D. Representative confocal microscopy images from a timelapse of THP-1 monocytes showing that transient optogenetic stimulation of WT but not F27G mutant of AIM2PYD causes it to form puncta that coincide with cell death. Sytox Orange was used for this experiment because it can be excited without activating Cry2.

E. Time course of cell death of THP-1 cells when subjected to a blue light pulse every 5 minutes (“repeated”), showing rapid cell death (violet trace) only when AIM2PYD is WT and when ASC is present. The absence of ASC results in slower death (green trace), consistent with apoptosis. The F27G mutation of AIM2PYD blocks cell death irrespective of ASC (black and golden traces).

F. Coefficient of variation (CV) of fluorescence distribution of AIM2PYD-Cry2 and ASC-mScarlet-I in THP-1 PYCARD-KO cells following a 10 s blue light pulse. This shows that AIM2PYD and ASC-mScarlet-I (with slightly delayed kinetics) rapidly form clusters that persist well after stimulus removal. ASC-mScarlet-I was induced to only ∼20% of the ASC expression in WT cells using 1.0 µg/mL doxycycline (dox).

G. Quantification of CellTox staining in individual ASC-mScarlet-I THP-1 PYCARD-KO cells 30 minutes after a 10 second blue laser pulse, at different levels of dox-induced ASC-mScarlet-I expression. Green dotted line indicates 95% confidence interval (CI) for background fluorescence intensity, above which cells were considered CellTox-positive. Error bars denote standard deviation. Control, n = 37. 0.25 µg/mL dox, n = 36. 0.5 µg/mL dox, n = 47. 0.75 µg/mL dox, n = 113. 1 µg/mL dox, n = 180.

H. Top: The metastability of supersaturation implies that cells will occasionally inflame and/or die from stochastic (without PAMPs) DFD nucleation, which creates a tradeoff between innate immunity and lifespan. Bottom: Scatter plot showing the relationship between geometric mean of adaptor supersaturation including ASC, FADD, BCL10, TRADD, MAVS (as approximated by the ratio of transcription levels and Csat values) and mean lifespan for each cell type in the human body for which data is available.46 Cell types with greater DFD supersaturation have shorter lifespans. The red line represents the best-fit power-law regression, obtained by performing linear regression in log-log space. The shaded region represents the 95% confidence interval for the trend line. Spearman R = –0.8375 (two-tailed p = 0.000027).
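
The power-law regression described in panel H can be sketched as an ordinary least-squares fit of log10(y) on log10(x). This is a minimal illustration under that assumption; the function name and the input values are hypothetical and are not the data plotted in the figure.

```python
import math

def fit_power_law(x, y):
    """Fit y = a * x**b by least squares on (log10 x, log10 y).

    Returns (a, b). All x and y values must be positive.
    """
    lx = [math.log10(v) for v in x]
    ly = [math.log10(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (w - mean_y) for u, w in zip(lx, ly))
    b = sxy / sxx                       # slope in log-log space = exponent
    a = 10 ** (mean_y - b * mean_x)     # intercept in log-log space = log10(a)
    return a, b

# Made-up values, NOT data from the figure
supersaturation = [0.5, 1.0, 2.0, 4.0, 8.0]
lifespan_days = [400.0, 200.0, 100.0, 50.0, 25.0]
a, b = fit_power_law(supersaturation, lifespan_days)
print(f"lifespan ≈ {a:.1f} * supersaturation^{b:.2f}")
```

A negative exponent b corresponds to the trend described in the legend, in which greater supersaturation accompanies shorter lifespan.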

See also Figure S5 and Table S3.

The nucleating interactome is highly specific

A. Matrix of all nucleating interactions (gray-shaded circles) detected in a comprehensive DAmFRET screen of > 10,000 DFD pairs. Each DFD-mEos3 (columns) was separately expressed with each DFD-μNS seed (rows). Darker shading of the circle denotes increased seedability. Interactions among members of the same signaling pathway (in legend) appear in color shaded squares. Asterisk denotes seeds that were screened in a separate experiment from the rest. The matrix was clustered on seedability values, on a log scale, using the linkage and dendrogram functions of the Python module scipy.cluster.hierarchy (SciPy v1.11.1), with the Ward variance minimization algorithm to calculate distances. Procedure defined units (p.d.u.).

B. Circos plot of the nucleating interactions summarized by DFD subfamily. Each subfamily is represented with a segment proportional to the number of DFDs with a nucleating interaction, as indicated by ribbons within and between segments. Inner stacked bars around the perimeter show the numbers of DFDs in each subfamily seeded by the subfamily in that segment. Middle stacked bars around the perimeter show the numbers of DFDs in each subfamily that seed the subfamily in that segment. Outer stacked bars around the perimeter show total nucleating interactions involving the subfamily in that segment.

C. Nucleating interactions involving DFDs in extrinsic apoptosis and pyroptosis, with blue edges highlighting the direct nucleating effect of AIM2 on FADD and ASC that is explored in Figure 4. The network was created in Cytoscape with node size corresponding to betweenness centrality and grouped by reported function. Interactions between FL proteins (Table S2) were included. Edge darkness indicates the seedability score of the corresponding interaction.

See also Figure S6 and Table S4.

DFD nucleation barriers are deeply conserved

A. DAmFRET classifications for DFD-only and FL components of the DISC from the model sponge, Amphimedon queenslandica, and of the inflammasome from the model fish Danio rerio, showing that adaptors are specifically supersaturable. *D. rerio CASP1FL exhibits a high Csat in the mid-micromolar range based on prior calibrations of DAmFRET plots,12 which greatly exceeds the nanomolar concentration expected for endogenous procaspase-1,77 making it unlikely to supersaturate at endogenous concentrations.

B. Phylogenetic tree illustrating evolutionary relationships between DFD signaling pathways from bacteria to humans.

C. DAmFRET of DFDs extracted from pairs of proteins in innate immunity operons from the indicated multicellular bacterial species. Operon schema show domain architectures in the corresponding genes, as adapted from Kaur et al. 55. One bacterial DFD (bDLD3) from each of the putative signaling pathways is seen to be supersaturable.

D. Physical logic of DFD function. Left: Cells experience thermodynamic perturbations either from stochastic fluctuations (noise) or PAMP binding to innate immune receptors. These perturbations can nucleate supersaturated signaling proteins (dashed horizontal lines) with a probability that depends on the type of phase transition and specifically, whether it is accompanied by structural ordering. Middle: For phase separation in the absence of structural ordering (LLPS), the nucleation barrier (ΔΔG(nucleus - solute)) declines sharply with concentration beyond Csat,66,78 which increases its susceptibility to noise. This limits the level of supersaturation that can be maintained by a cell (vertical dashed line), and therefore, the extent to which assembly (ΔΔG(solute - assembly)) can power signal amplification (tiny battery schematic). Right: For phase separation with structural ordering (paracrystallization as in adaptor DFD assemblies), the dependence of nucleation on concomitant intramolecular fluctuations buffers the barrier against concentration (as indicated by a shallower curve relative to LLPS), which allows cells to maintain much higher levels of supersaturation.12,14 Following nucleation, the assemblies grow and deplete soluble protein until it is no longer supersaturated, driving amplification (diagonal orange arrow) through proximity-dependent effector activation. The intrinsic nucleation barriers encoded by solution phase DFD ensembles therefore allow them to function as phase change batteries (giant battery schematic) to power innate immune signal amplification.

See also Figure S7 and Table S2.

Sequence, imaging and DAmFRET analysis reveal diverse sequence-encoded phase behaviours of DFDs, related to Figure 1

A. Schematic diagram of all DFDs characterized in this paper, and their classification into structural subfamilies. Tandem DFDs are highlighted in red, but were analysed with their corresponding single DFD subfamilies.

B. Matrices of predicted alignment error (PAE) for the indicated regions of proteins containing two DFDs, as reported in the AlphaFold Protein Structure Database,79 grouped into two categories according to interdomain PAE values consistent with either independent (left) or dependent (right) relative geometries of the DFDs.

C. DAmFRET profiles of representative DFDs classified as continuous or discontinuous.

D. Classification of the proteins as entirely diffuse, fibrillar or punctate based on boundaries on the scatter plot of coefficient of variation vs aspect ratio. The colored circles represent the mean and covariance of the values for each category.

E. Classification of continuous DFDs as “low”, “low to high” or “high” by thresholding on the minimum and ending AmFRET values of a fitted spline, normalized to that of a control DFD.

F. Images of yeast expressing representative DFDs classified as fibrillar that produced continuous (low to high) DAmFRET profiles.

G. Images of yeast expressing representative DFDs classified as fibrillar that produced discontinuous DAmFRET profiles.

H. Representative confocal microscopy images of yeast expressing the indicated DFD constructs in the presence of the ASC or CARD14CARD seeds. The images show the emergence of filaments only from matching μNS-DFD seeds.

Self-assembly involves subunit interfaces shared with solved DFD polymer structures, related to Figure 1

A. Representative DAmFRET plots for the indicated DFDs with the indicated point mutations. The horizontal line approximates the mean AmFRET value for monomeric mEos3. Procedure defined units (p.d.u.).

B. Image of SDD-AGE showing the size distribution of detergent-resistant multimers (where present) of mEos3-fused proteins expressed in yeast. The amyloid-forming protein, RIPK1RHIM, formed detergent-resistant multimers whereas all DFD multimers were detergent-labile.

Proteins with DFDs that have seedable and/or discontinuous DAmFRET are central to their physical interaction networks and are more likely to be supersaturated in vivo, related to Figure 2

A. Transcripts encoding proteins with discontinuous DFDs have higher expression in immune cells. P values are from Mann-Whitney test (see also Table S5). Transcripts per million (TPM) values are from the immune cell data of the Human Protein Atlas, comprising 18 cell types and total Peripheral Blood Mononuclear Cells (PBMC).

B. Heatmap of protein abundance relative to reference, of discontinuous and continuous DFD containing proteins for the indicated tissues. Tissues are ordered by significance. P values are from Mann-Whitney test (see also Table S5). Protein abundance values are from the Proteome Map of the Human Body.20

C. Bar plot of the Spearman R correlation between immune cell type transcript abundance and Csat values of DFDs, showing a consistently negative and significant correlation across immune cell types. Data were obtained from the immune cell section of the Human Protein Atlas.

D. Boxplot comparing the betweenness (left) and degree centrality (right) of DFD-containing proteins that are either non-seedable or continuous (n = 37) to those that are both seedable and discontinuous (n = 14). Seedable, discontinuous proteins were found to have a significantly higher betweenness and degree centrality than non-seedable or continuous proteins. Mann-Whitney U = 145.5 (p = 0.012) and U = 146.5 (p = 0.017), respectively.

Proteins characterized as signaling adaptors display discontinuity in both their DFD and FL contexts, related to Figure 2

A. Pairs of DAmFRET plots comparing the behaviours of representative DFDs and their corresponding FL proteins. Dashed horizontal lines approximate the mean AmFRET value for monomeric mEos3. FL MAVS has appreciable AmFRET in the supersaturated state that we attribute to its mitochondrial localization signal.

B. Left, DAmFRET plot of ASCPYD expressed alone. Right, DAmFRET plot of ASCPYD co-expressed with FL NLRP3 showing persistence of the supersaturated bottom population indicating that FL NLRP3 oligomers are not active (in the absence of stimulation).

Characterization of engineered THP-1 cell lines and apoptosome assembly, and correlation of DFD supersaturation with cell mortality in the human body, related to Figures 3 and 4

A. DAmFRET plots of APAF1CARD and CASP9CARD measured in the presence of the indicated “seeds” expressed in trans. Both proteins fail to populate a high-AmFRET state.

B. Western blot verifying the knock-out status of ASC and/or FADD in the respective engineered stable THP-1 cell lines. Actin is the loading control.

C. Cartoon depicting the doxycycline-inducible ASC-mScarlet-I that replaced endogenous ASC in THP-1 PYCARD-KO cells.

D. Representative capillary western blots comparing expression levels of the dox-inducible ASC-mScarlet-I construct alongside endogenous ASC. Actin is the loading control.

E. Quantification of the data showing significantly lower-than-endogenous levels of ASC in the engineered construct at even the highest level of induction by doxycycline. For each condition, one million cells were sorted and lysed.

F. Scatter plot showing the relationship between ASC supersaturation (as approximated by the ratio of transcription levels and Csat values) and mean lifespan for each cell type as indicated in Figure 4H. The red line represents the best-fit power-law regression, obtained by performing linear regression in log-log space. The shaded region represents the 95% confidence interval for the trend line. Spearman R = –0.87 (two-tailed p = 0.0000057).

G. Bar plot of the Spearman R correlation between supersaturation and cell mean lifespan of cell types as shown in Figure 4H for each DFD. Transcript levels for each cell type were obtained from the single cell RNA dataset of the Human Protein Atlas.

Generation and validation of the DFD nucleating interactome, related to Figure 5

A. Illustration of how the library of all pairs of DFDs was created. An arrayed sublibrary of yeast transformed with 105 DFD-mEos3 fusions was mated to a separate arrayed sublibrary of yeast strains expressing 107 chromosomally integrated DFD-μNS-mCardinal fusions, to create a library of 12,660 diploid strains representing all pairwise combinations.

B. Left, DAmFRET was run on all pairwise combinations. Only high quality datasets –– having a total cell count greater than or equal to 2500 and a mean acceptor intensity greater than or equal to 3.5 p.d.u. –– were used in the analysis. Right, DAmFRET plots of FADDFL either null-seeded (lacking a DFD) or self-seeded. Nucleating interactions (as shown by the self-seeded example) are indicated by a reduced C50 and increased percentage of cells with self-assemblies (those above the gate delimiting low FRET, shown in orange).

C. Hits are determined by a multiparameter combination of the degree of C50 outlier and the degree of fraction assembled outlier as defined by how many interquartile ranges (IQR) a plot is below or above the median, respectively. Points on the graph are shaded by this parameter. The leftmost set of boxplots show the distribution of standardized log10 C50 and fraction assembled for all seeds for a representative protein, FADDFL. The middle boxplot shows the average outlier degree value of these two parameters. This value is referred to as “seedability” throughout the text and is used in determining hits. The cutoff value for hit determination was set to be 3 standard deviations above the mean of all seedability values across the screen. The scatter plot on the right shows the standardized log10 C50 and fraction assembled values, depicting the contribution of both to the scoring value and positive nucleating interactions within the green box.
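
The scoring described in panel C can be sketched as follows. This is an illustrative reading of the legend, not the original pipeline: the function names, the quartile method (Python's default "exclusive" method), the use of the sample standard deviation, and the example values are all assumptions.

```python
import statistics

def iqr_outlier_degree(values, direction):
    """Distance of each value from the median, in units of the IQR.

    direction = -1 scores values BELOW the median as positive (used for C50);
    direction = +1 scores values ABOVE the median as positive
    (used for fraction assembled). Raises ZeroDivisionError if the IQR is 0.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    median = statistics.median(values)
    iqr = q3 - q1
    return [direction * (v - median) / iqr for v in values]

def seedability(log10_c50, fraction_assembled):
    """Average of the two outlier degrees for each seed of one DFD."""
    c50_degree = iqr_outlier_degree(log10_c50, -1)
    frac_degree = iqr_outlier_degree(fraction_assembled, +1)
    return [(c + f) / 2 for c, f in zip(c50_degree, frac_degree)]

def hit_cutoff(all_scores, n_sd=3):
    """Mean plus n_sd standard deviations of all seedability values."""
    return statistics.mean(all_scores) + n_sd * statistics.stdev(all_scores)

# Hypothetical values for one DFD across seven seeds; the last seed nucleates it
log10_c50 = [2.0, 2.1, 1.9, 2.0, 2.05, 1.95, 1.0]
fraction = [0.05, 0.04, 0.06, 0.05, 0.05, 0.04, 0.9]
scores = seedability(log10_c50, fraction)
print(scores.index(max(scores)))
```

In the screen, the cutoff would be computed over the seedability values of all DFD + seed combinations rather than over those of a single DFD.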

D. Top, scatter plot of seedability values from the two replicate experiments containing 3478 DFD + seed combinations. Points are colored according to agreement between the two experiments. Gray rectangles indicate seedability values of negative interactions, with the dark gray square containing the interactions found to be negative in both experiments. Pearson correlation R between the two experiments = 0.91 (p < 0.0001). Bottom, the same replicates after omitting the instances found to be negative in both experiments, to reduce the effects of random variation among the negatives and of outliers. The line of best fit is shown in red. In this case, the Pearson correlation R = 0.90 (p < 0.0001).

E. Seventeen DFD + seed combinations that had inconsistent hit-calling out of the 3478 combinations reassessed. From this we determine that our assay had a consistency of 99.51% with a 95% confidence interval of 99.28 – 99.74%.
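
The consistency estimate in panel E can be reproduced with a normal-approximation (Wald) interval for a proportion. The legend does not state which interval method was used, so the Wald form here is an assumption; it is shown because it matches the reported bounds. The function name is illustrative.

```python
import math

def consistency_with_wald_ci(n_inconsistent, n_total, z=1.96):
    """Proportion of consistent calls with a Wald confidence interval.

    z = 1.96 corresponds to a two-sided 95% interval.
    Returns (proportion, lower bound, upper bound).
    """
    p = 1 - n_inconsistent / n_total
    se = math.sqrt(p * (1 - p) / n_total)
    return p, p - z * se, p + z * se

# Counts reported in the legend: 17 inconsistent out of 3478 combinations
p, lower, upper = consistency_with_wald_ci(17, 3478)
print(f"{100 * p:.2f}% ({100 * lower:.2f} - {100 * upper:.2f}%)")
```

With these counts the sketch gives 99.51% with bounds of 99.28% and 99.74%, in agreement with the legend.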

F. Bar plots showing the number of hits per experiment, as well as total number of hits, determined for each replicated DFD. Blue and yellow indicate separate hit counts for each experiment. Gray indicates the number of unique hits found across both experiments. The percentage of seeds consistently called hits is shown for each DFD. The left bar plot shows the overall summary of all DFDs included in both sets.

Demonstration of conserved energy storage capacity of DFDs, related to Figure 6

A. DAmFRET data of DFD-only and full length inferred DISC components from the model sponge, Amphimedon queenslandica. The distant homolog to human FADD exhibits supersaturability in its FL and isolated DFDs.

B. DAmFRET data of DFD-only and full length inflammasome components from the model fish Danio rerio.