Figures and data

The harmful effects of each Poly10X are generally conserved across species.
A) Experimental setup. Poly10X sequences were fused to the C-terminus of EGFP and expressed in S. cerevisiae under the TDH3 or WTC846 promoter, or in E. coli under the lac promoter. Cellular fitness (growth rate) and expression level (fluorescence intensity) of cells overexpressing each construct were measured. Relative neutrality was calculated by comparison with the corresponding control EGFP construct lacking Poly10X (Δ). B, C) Maximum growth rate and maximum fluorescence intensity (B) of S. cerevisiae cells overexpressing EGFP–Poly10X in SC–LU medium at 30 °C, and the calculated relative neutrality (C). Growth and fluorescence curves are shown in Figure S2B. D, E) Maximum growth rate and maximum fluorescence intensity (D) of E. coli expressing EGFP–Poly10X in LB + ampicillin medium supplemented with 1 mM IPTG at 37 °C, and the calculated relative neutrality (E). Growth and fluorescence curves are shown in Figure S3B. F–H) Cross-species comparison of Poly10X harmful effects. Average relative neutrality values obtained in S. cerevisiae (C) and E. coli (E) were compared with each other and with previously reported Poly10X cytotoxicity ranks in COS-7 cells (ref. 26). Cytotoxicity ranks were calculated as described in Materials and Methods. I) Precise determination of relative neutrality in S. cerevisiae. EGFP–Poly10X expression was titrated across seven concentrations of aTc, and the maximum neutrality point was used for final normalization to the EGFP control (Δ). Growth and fluorescence curves are shown in Figure S4B and C. J) Correlation between the maximum relative neutrality of Poly10X in S. cerevisiae (I) and the Poly10X cytotoxicity ranking in COS-7 cells (ref. 26). Single-letter codes indicate the amino acid repeated in PolyX. K) Hierarchical clustering analysis of relative neutrality values across varying expression levels. On the y-axis, TDH3_U and TDH3_LU indicate experiments using TDH3pro under SC–U (Figure S1) and SC–LU (C and Figure S2) conditions, respectively. The numbers indicate the concentration of aTc in experiments using WTC846 (I and Figure S4). Each dataset was normalized to the neutrality of the EGFP control. Euclidean distance and average linkage were used. The heatmap represents relative neutrality levels. In B–E and I, bars, dots, and error bars represent the mean, individual values, and standard deviation from at least three biological replicates. In B, C, and D, asterisks indicate significant differences in maximum growth rate, maximum fluorescence intensity, and relative neutrality, respectively, compared with the control (Δ) (p < 0.05, Student’s t-test with Bonferroni correction). In F–H and J, Spearman’s rank correlation coefficient (ρ) and its p-value are shown.
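The rank correlations reported in panels F–H and J can be illustrated with a minimal sketch. The code below is not the authors' analysis script; it is a standard-library implementation of Spearman's ρ (Pearson correlation of average ranks, with ties sharing the mean rank), and the function names `average_ranks` and `spearman_rho` are hypothetical. It does not compute the accompanying p-value.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied values that starts at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (var_x * var_y) ** 0.5
```

In this setting, `x` and `y` would be the per-amino-acid average relative neutrality values from two systems, listed in the same amino acid order.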

Poly10X harmful and beneficial effects are associated with amino acid polarity and hydrophobicity.
A) Schematic of the constructs used in this study. Each Poly10X was attached to the C-terminus of each fluorescent protein (FP), and expressed in S. cerevisiae under the control of the WTC846 promoter. EGFP constructs were expressed under the TDH3 promoter (see Figure 1B, C). B) Relative neutrality of Poly10X measured using various fluorescent proteins in SC–LU medium at 30 °C. Bars represent the mean relative neutrality for each amino acid across six different fluorescent proteins, and each dot represents the value for one fluorescent protein (the mean of at least three biological replicates). Growth and fluorescence curves are shown in Figures S2B and S5B–S9B. C) Correlation matrix comparing the trends of Poly10X neutrality across different fluorescent proteins. Each value represents the Spearman’s rank correlation coefficient based on relative neutrality values measured for each FP–Poly10X construct. D) Relationships between Poly10X neutrality and amino acid physicochemical properties (energy cost, polar requirement, and hydrophobicity). Spearman’s rank correlation coefficient (ρ) and its p-value are shown. E) Hierarchical clustering of relative neutrality values obtained from different fluorescent proteins. Clustering was performed using Euclidean distance and average linkage. Single-letter codes indicate the amino acid repeated in Poly10X. F) Comparison of the three Poly10X clusters identified in E with amino acid physicochemical properties (energy cost, polar requirement, and hydrophobicity). Statistical analyses were performed using the Mann–Whitney U test between each cluster and the remaining total samples, followed by Bonferroni correction for multiple comparisons. G) Maximum growth rate and maximum fluorescence intensity of S. cerevisiae cells overexpressing EGFP without Poly10X (Δ) or with C-terminal Poly10F, Poly10I, Poly10W, or Poly10Y fusions in SC–LU medium at 30 °C. The indicated amino acid was supplemented to the medium at standard (×1), ×2, or ×4 concentrations. Bars, dots, and error bars represent the mean, individual values, and standard deviation from four biological replicates. Growth and fluorescence curves are shown in Figure S11B.
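The clustering described for panel E (Euclidean distance, average linkage) can be sketched as follows. This is an illustrative standard-library implementation, not the software used for the figure; the function name `average_linkage`, the choice to stop at a fixed number of clusters, and the example profile layout (one list of relative neutrality values per amino acid, one entry per fluorescent protein) are assumptions made here.

```python
from itertools import combinations
from math import dist


def average_linkage(profiles, n_clusters):
    """Agglomerative clustering with Euclidean distance and average linkage.

    profiles: dict mapping a label (e.g. an amino acid code) to a list of
              values of equal length (e.g. neutrality per fluorescent protein).
    Returns the clusters as sorted lists of labels.
    """
    if not 1 <= n_clusters <= len(profiles):
        raise ValueError("n_clusters must be between 1 and the number of labels")

    clusters = [[label] for label in profiles]

    def linkage(a, b):
        # Average linkage: mean of all pairwise Euclidean distances
        # between the members of cluster a and the members of cluster b.
        total = sum(dist(profiles[p], profiles[q]) for p in a for q in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Merge the two clusters with the smallest average distance.
        a, b = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        clusters = [c for c in clusters if c is not a and c is not b]
        clusters.append(a + b)

    return sorted(sorted(c) for c in clusters)
```

Calling `average_linkage(profiles, 3)` would return three groups of amino acid codes. How the dendrogram was actually cut into the three clusters used in panel F is not stated in the legend, so the fixed cluster count is only one plausible reading.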

Structural context modulates the effect of Poly10X, while its overall neutrality trend is conserved.
A) Schematic representation of the constructs used to examine the effect of Poly10X in different structural contexts. In the “Poly10X between two FPs” construct, a Poly10X–GS linker sequence was inserted between EGFP and mCherry. In the “internal Poly10X” construct, Poly10X was inserted between residues 173 and 174 of EGFP. In the “detached Poly10X” construct, a P2A self-cleaving sequence was placed between EGFP and Poly10X. B, C) Maximum growth rate and maximum fluorescence intensity (B), and relative neutrality (C) of S. cerevisiae cells overexpressing the Poly10X between two FPs construct in SC–LU medium at 30 °C. Growth and fluorescence curves are shown in Figure S12B. D, E) Maximum growth rate and maximum fluorescence intensity (D), and relative neutrality (E) of cells expressing the internal Poly10X construct in SC–LU medium at 30 °C. Growth and fluorescence curves are shown in Figure S13B. F, G) Maximum growth rate and maximum fluorescence intensity (F), and relative neutrality (G) of cells overexpressing the detached Poly10X construct in SC–LU medium at 30 °C. Growth and fluorescence curves are shown in Figure S14B. H) Cross-comparison of Poly10X neutrality profiles among different structural contexts. Each value represents the Spearman’s rank correlation coefficient calculated from the relative neutrality indices obtained in each construct. I) Pairwise comparison of Poly10X neutrality trends between different structural contexts. Scatter plots show correlations of average relative neutrality values across constructs. Spearman’s rank correlation coefficient (ρ) and its p-value are shown. Single-letter codes indicate the amino acid repeated in Poly10X. In B–G, bars, dots, and error bars represent the mean, individual values, and standard deviation from at least three biological replicates.

Poly10X induces protein relocalization and aggregate formation.
A) Fluorescence microscopy images of S. cerevisiae cells expressing EGFP–Poly10X under the control of TDH3pro. Cells were pre-cultured in SC–U medium and then cultured overnight in SC–U or SC–LU medium before imaging (shown as –U and –LU, respectively). Brightness and contrast were adjusted to allow clear visualization of cell morphology. The indicated subcellular localization and aggregation categories were assigned by visual inspection. B) Fluorescence microscopy images of S. cerevisiae cells expressing EGFP–mCherry, EGFP–Poly10K–mCherry, and EGFP–Poly10P–mCherry under the control of the WTC846 promoter, observed 6 hours after aTc induction. No mCherry fluorescence was detected in cells expressing EGFP–Poly10K–mCherry or EGFP–Poly10P–mCherry. Single-letter codes indicate the amino acid repeated in Poly10X. Δ represents the protein without a Poly10X fusion.

Poly10E reduces the harmful effects of protein overexpression through aggregation suppression.
A) Relative neutrality of EGFP-Poly10X under heat stress. Cells were cultured in SC–LU medium at 38 °C. Growth and fluorescence data are shown in Figure S20. B) Comparison of the relative neutrality of EGFP–Poly10X at 30 °C (from Figure 1C) and 38 °C (from A). Average values are used, and the plot of EGFP-Poly10E is shown as Poly10E. C) Relative neutrality of moxGFP-Poly10X under heat stress. Cells were cultured in SC–LU medium at 38 °C. Growth and fluorescence data are shown in Figure S21. D) Comparison of the relative neutrality of moxGFP–Poly10X at 30 °C (from Figure S5D) and 38 °C (from C). Average values are used, and the plot of moxGFP-Poly10E is shown as Poly10E. E) Representative image of SDS–PAGE of total, soluble, and insoluble protein fractions from cells overexpressing vector, EGFP, and EGFP–Poly10E cultured in SC–LU medium. Proteins were visualized by fluorescent dye staining. The arrowhead indicates the band corresponding to EGFP or EGFP–Poly10E. F) Quantification of target protein levels (% of total protein) based on the band intensity in E and Figure S22. Only monomeric bands were quantified; therefore, polymerized EGFP or EGFP–Poly10E may not be accurately represented. Bars, dots, and error bars represent the mean, individual values, and standard deviation from five biological replicates. Statistical comparisons were performed by Welch’s t-test. The raw gel images used for the quantification are shown in Figure S22. G) Percentage of cells forming Hsp70 aggregates among those overexpressing vector, EGFP, or EGFP–Poly10E. The p-values were calculated by Welch’s t-test with Bonferroni correction. For each of three biological replicates, 200 cells per sample were randomly selected, and aggregate formation was visually assessed under blinded conditions. H) Fluorescent microscopic images of Hsp70 foci in cells overexpressing EGFP or EGFP–Poly10E. Hsp70/Ssa1–mScarlet-I was genomically integrated to monitor Hsp70 aggregate formation. 
Image brightness and contrast were adjusted to enhance visibility of aggregates. Enlarged images are shown in Figure S23. I) Microscopic observation of morphological phenotypes in various yeast strains overexpressing vector, EGFP, EGFP–Poly10E, or moxGFP. Wild-type strain BY4741 and mutant strains (cdc24-5, rpl19aΔ) were cultured in SC–LU medium for 18 h before imaging. Representative images, the morphological quantification method, and the quantitative results for all seven mutants are shown in Figures S24 and S25. J–L) Transcriptomic analysis of S. cerevisiae cells overexpressing EGFP or EGFP–Poly10X constructs by RNA-seq. Volcano plots show differential gene expression between cells expressing EGFP and EGFP–Poly10E (J), EGFP–Poly10D (K), and EGFP–Poly10I (L). Genes regulated by the heat-shock transcription factor Hsf1 and the proteasome stress-response gene RPN4 are highlighted. Cells expressing EGFP–Poly10E and EGFP–Poly10D were cultured in SC–LU medium, while those expressing EGFP–Poly10I were cultured in SC–U together with the EGFP control.

The neutrality of Poly10X mirrors its evolutionary usage in proteomes.
A) Analysis of PolyX occurrence patterns (PolyXmax and Num–Poly10X) in S. cerevisiae. The bar graph shows the maximum number of consecutive identical residues (PolyXmax) for each amino acid in S288C ORFs. For shuffled S288C ORFs, the minimum, mode, and maximum values obtained from 10,000 random simulations are shown. Asterisks indicate amino acids that exhibit significantly longer homorepeat lengths than expected from simulation (q < 0.001, Monte Carlo test with FDR correction). Numbers below the graph indicate the counts of proteins containing ≥10 consecutive identical residues (Num–Poly10X) for S288C ORFs, shuffled S288C ORFs, and pan-Sc ORFs/isolates (from top to bottom). Values for pan-Sc ORFs/isolates are rounded to one decimal place. B) Spearman’s rank correlation coefficients (ρ) of PolyXmax values among different species. The underlying numerical data used for the calculation are provided in Figure S27. C) Spearman’s rank correlation coefficients (ρ) of Num–Poly10X/Gene_number across species. The underlying numerical data used for the calculation are provided in Figure S27. D) Relationship between PolyXmax in S288C ORFs and experimentally measured relative neutrality. E) Relationship between Num–Poly10X in pan-Sc ORFs/isolates and experimentally measured relative neutrality. For cysteine (C) and tryptophan (W), values of 0 were plotted as 0.001 for visualization purposes. In D and E, Spearman’s rank correlation coefficient (ρ) and its p-value are shown. Colors indicate the physicochemical properties of each amino acid.
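The quantities in panel A (PolyXmax, Num–Poly10X, and a Monte Carlo comparison against shuffled ORFs) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, each sequence is shuffled independently so that its own amino acid composition is preserved (the legend does not say how shuffling was done), the p-value is a simple one-sided empirical estimate, and the FDR correction is omitted.

```python
import random
from itertools import groupby

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def polyx_max(sequences):
    """Longest run of consecutive identical residues, per amino acid."""
    longest = {aa: 0 for aa in AMINO_ACIDS}
    for seq in sequences:
        for residue, run in groupby(seq):
            if residue in longest:
                longest[residue] = max(longest[residue], sum(1 for _ in run))
    return longest


def num_poly10x(sequences, min_run=10):
    """Number of sequences containing a run of >= min_run identical residues."""
    counts = {aa: 0 for aa in AMINO_ACIDS}
    for seq in sequences:
        hits = {res for res, run in groupby(seq)
                if res in counts and sum(1 for _ in run) >= min_run}
        for res in hits:
            counts[res] += 1
    return counts


def monte_carlo_p(sequences, n_sim=1000, seed=0):
    """Empirical p-value per amino acid: share of shuffles whose PolyXmax
    is at least as long as the observed PolyXmax."""
    rng = random.Random(seed)
    observed = polyx_max(sequences)
    exceed = {aa: 0 for aa in AMINO_ACIDS}
    for _ in range(n_sim):
        shuffled = []
        for seq in sequences:
            chars = list(seq)
            rng.shuffle(chars)  # assumption: shuffle residues within each ORF
            shuffled.append("".join(chars))
        simulated = polyx_max(shuffled)
        for aa in AMINO_ACIDS:
            if simulated[aa] >= observed[aa]:
                exceed[aa] += 1
    # Add-one correction avoids reporting an impossible p-value of exactly 0.
    return {aa: (exceed[aa] + 1) / (n_sim + 1) for aa in AMINO_ACIDS}
```

For an amino acid that never appears, the observed and simulated PolyXmax are both 0, so the estimate is 1 by construction.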

Neutrality of C-terminal Poly10X fusions to EGFP in yeast (low-copy conditions).
A) Schematic representation of the expression constructs. Poly10X was fused to the C-terminus of EGFP and expressed under the TDH3 promoter in S. cerevisiae. B) Growth and fluorescence curves of S. cerevisiae cells expressing EGFP–Poly10X, measured using the gTOW method in SC–U medium at 30 °C. Curves represent the mean values from at least three biological replicates, and shaded regions indicate the standard deviation (SD). C, D) Maximum growth rate and maximum fluorescence intensity of S. cerevisiae cells low-level overexpressing EGFP–Poly10X in SC–U medium at 30 °C, and the calculated relative neutrality (D). Bars, dots, and error bars represent the mean, individual data points, and standard deviation from at least three biological replicates. Asterisks indicate significant differences in maximum growth rate, maximum fluorescence intensity, and relative neutrality, respectively, compared with the control (Δ) (p < 0.05, Student’s t-test with Bonferroni correction).

Neutrality of C-terminal Poly10X fusions to EGFP in yeast (high-copy conditions).
A) Schematic representation of the expression constructs. Poly10X was fused to the C-terminus of EGFP and expressed under the TDH3 promoter in S. cerevisiae. B) Growth and fluorescence curves of S. cerevisiae cells expressing EGFP–Poly10X, measured using the gTOW method in SC–LU medium at 30 °C. Curves represent the mean values from at least three biological replicates, and shaded regions indicate the standard deviation (SD). C, D) Maximum growth rate and maximum fluorescence intensity of S. cerevisiae cells overexpressing EGFP–Poly10X in SC–LU medium at 30 °C, and the calculated relative neutrality (D). Bars, dots, and error bars represent the mean, individual data points, and standard deviation from at least three biological replicates. Asterisks indicate significant differences in maximum growth rate, maximum fluorescence intensity, and relative neutrality, respectively, compared with the control (Δ) (p < 0.05, Student’s t-test with Bonferroni correction). E) Comparison of Poly10X harmfulness trends under low-level (SC–U) and high-level (SC–LU) overexpression conditions. Spearman’s rank correlation coefficient (ρ) is shown.

Neutrality of C-terminal Poly10X fusions to EGFP in E. coli.
A) Schematic representation of the expression constructs. Poly10X was fused to the C-terminus of EGFP and expressed under the lac promoter in E. coli. B) Growth and fluorescence curves of E. coli cells expressing EGFP–Poly10X, measured in LB + ampicillin medium at 37 °C. Curves represent the mean values from four biological replicates, and shaded regions indicate the standard deviation (SD). C, D) Maximum growth rate and maximum fluorescence intensity of E. coli cells expressing EGFP–Poly10X in LB + ampicillin medium at 37 °C, and the calculated relative neutrality (D). Bars, dots, and error bars represent the mean, individual data points, and standard deviation from four biological replicates. Asterisks indicate significant differences in maximum growth rate, maximum fluorescence intensity, and relative neutrality, respectively, compared with the control (Δ) (p < 0.05, Student’s t-test with Bonferroni correction). E) Spot assay for plasmid retention. Cells of several E. coli strains expressing EGFP–Poly10X were collected after the cultivation shown in B, serially diluted 10-fold, and spotted (5 µl each) onto LB or LB + ampicillin agar plates. Plates were incubated overnight at 37 °C and photographed the following day. These data suggest plasmid loss after cultivation of cells expressing harmful Poly10X variants. Single-letter codes indicate the amino acid repeated in Poly10X. Δ represents the protein without a Poly10X fusion.

Neutrality of C-terminal Poly10X fusions to EGFP under different induction levels in yeast.
A) Schematic representation of the expression constructs. Poly10X was fused to the C-terminus of EGFP and expressed under the control of the WTC846 promoter in S. cerevisiae. B, C) Stepwise growth (B) and fluorescence (C) curves of S. cerevisiae cells expressing EGFP–Poly10X measured using the gTOW method in SC–LU medium at 30 °C. Gradual induction of expression was achieved under the control of the WTC846 promoter by stepwise adjustment of aTc concentration. Curves represent the mean values from four biological replicates.

Neutrality of C-terminal Poly10X fusions to moxGFP in yeast.
A) Schematic representation of the expression constructs. Poly10X was fused to the C-terminus of moxGFP and expressed under the control of the WTC846 promoter in S. cerevisiae. B) Growth and fluorescence curves of S. cerevisiae cells expressing moxGFP–Poly10X, measured using the gTOW method in SC–LU medium at 30 °C. Curves represent the mean values from at least three biological replicates, and shaded regions indicate the standard deviation (SD). C, D) Maximum growth rate and maximum fluorescence intensity of S. cerevisiae cells overexpressing moxGFP–Poly10X in SC–LU medium at 30 °C, and the calculated relative neutrality (D). Bars, dots, and error bars represent the mean, individual data points, and standard deviation from at least three biological replicates. Asterisks indicate significant differences in maximum growth rate, maximum fluorescence intensity, and relative neutrality, respectively, compared with the control (Δ) (p < 0.05, Student’s t-test with Bonferroni correction).

Neutrality of C-terminal Poly10X fusions to mNeonGreen in yeast.
A) Schematic representation of the expression constructs. Poly10X was fused to the C-terminus of mNeonGreen and expressed under the control of the WTC846 promoter in S. cerevisiae. B) Growth and fluorescence curves of S. cerevisiae cells expressing mNeonGreen–Poly10X, measured using the gTOW method in SC–LU medium at 30 °C. Curves represent the mean values from at least three biological replicates, and shaded regions indicate the standard deviation (SD). C, D) Maximum growth rate and maximum fluorescence intensity of S. cerevisiae cells overexpressing mNeonGreen–Poly10X in SC–LU medium at 30 °C, and the calculated relative neutrality (D). Bars, dots, and error bars represent the mean, individual data points, and standard deviation from at least three biological replicates. Asterisks indicate significant differences in maximum growth rate, maximum fluorescence intensity, and relative neutrality, respectively, compared with the control (Δ) (p < 0.05, Student’s t-test with Bonferroni correction).

Neutrality of C-terminal Poly10X fusions to Gamillus in yeast.
A) Schematic representation of the expression constructs. Poly10X was fused to the C-terminus of Gamillus and expressed under the control of the WTC846 promoter in S. cerevisiae. B) Growth and fluorescence curves of S. cerevisiae cells expressing Gamillus–Poly10X, measured using the gTOW method in SC–LU medium at 30 °C. Curves represent the mean values from at least three biological replicates, and shaded regions indicate the standard deviation (SD). C, D) Maximum growth rate and maximum fluorescence intensity of S. cerevisiae cells overexpressing Gamillus–Poly10X in SC–LU medium at 30 °C, and the calculated relative neutrality (D). Bars, dots, and error bars represent the mean, individual data points, and standard deviation from at least three biological replicates. Asterisks indicate significant differences in maximum growth rate, maximum fluorescence intensity, and relative neutrality, respectively, compared with the control (Δ) (p < 0.05, Student’s t-test with Bonferroni correction).

Neutrality of C-terminal Poly10X fusions to mScarlet-I in yeast.
A) Schematic representation of the expression constructs. Poly10X was fused to the C-terminus of mScarlet-I and expressed under the control of the WTC846 promoter in S. cerevisiae. B) Growth and fluorescence curves of S. cerevisiae cells expressing mScarlet-I–Poly10X, measured using the gTOW method in SC–LU medium at 30 °C. Curves represent the mean values from at least three biological replicates, and shaded regions indicate the standard deviation (SD). C, D) Maximum growth rate and maximum fluorescence intensity of S. cerevisiae cells overexpressing mScarlet-I–Poly10X in SC–LU medium at 30 °C, and the calculated relative neutrality (D). Bars, dots, and error bars represent the mean, individual data points, and standard deviation from at least three biological replicates. Asterisks indicate significant differences in maximum growth rate, maximum fluorescence intensity, and relative neutrality, respectively, compared with the control (Δ) (p < 0.05, Student’s t-test with Bonferroni correction).

Neutrality of C-terminal Poly10X fusions to mCherry in yeast.
A) Schematic representation of the expression constructs. Poly10X was fused to the C-terminus of mCherry and expressed under the control of the WTC846 promoter in S. cerevisiae. B) Growth and fluorescence curves of S. cerevisiae cells expressing mCherry–Poly10X, measured using the gTOW method in SC–LU medium at 30 °C. Curves represent the mean values from at least three biological replicates, and shaded regions indicate the standard deviation (SD). C, D) Maximum growth rate and maximum fluorescence intensity of S. cerevisiae cells overexpressing mCherry–Poly10X in SC–LU medium at 30 °C, and the calculated relative neutrality (D). Bars, dots, and error bars represent the mean, individual data points, and standard deviation from at least three biological replicates. Asterisks indicate significant differences in maximum growth rate, maximum fluorescence intensity, and relative neutrality, respectively, compared with the control (Δ) (p < 0.05, Student’s t-test with Bonferroni correction).

Correlation analysis between Poly10X neutrality and amino acid properties.
A) Correlation between Poly10X neutrality and various amino acid indices (physicochemical and usage-related properties). Each dot and bar represents individual Spearman’s rank correlation coefficients and their mean values calculated across six fluorescent proteins. B) Relationships among amino acid indices that showed strong correlations with Poly10X relative neutrality. Spearman’s rank correlation coefficient (ρ) and its p-value are shown.

Effects of supplemented amino acids on the harmfulness of high-biosynthetic-cost Poly10X repeats.
A) Schematic representation of the expression constructs. Poly10F, Poly10I, Poly10W, and Poly10Y were fused to the C-terminus of EGFP and expressed under the control of the WTC846 promoter in S. cerevisiae. B) Growth and fluorescence curves of S. cerevisiae cells expressing EGFP–Poly10X, measured using the gTOW method in SC–LU medium supplemented with additional amino acids (×1, ×2, ×4) at 30 °C. Curves represent the mean values from four biological replicates, and shaded regions indicate the standard deviation (SD). C) Maximum growth rate and maximum fluorescence intensity of S. cerevisiae cells overexpressing EGFP without Poly10X (Δ) or with C-terminal Poly10F, Poly10I, Poly10W, or Poly10Y fusions in SC–LU medium at 30 °C. The indicated amino acid was supplemented to the medium at standard (×1), ×2, or ×4 concentrations. Bars, dots, and error bars represent the mean, individual data points, and standard deviation from four biological replicates.

Neutrality of Poly10X insertions between two FPs in yeast.
A) Schematic representation of the expression constructs. Poly10X was inserted between EGFP and mCherry and expressed under the control of the WTC846 promoter in S. cerevisiae. B) Growth and fluorescence curves of S. cerevisiae cells expressing EGFP–Poly10X–GSlinker–mCherry, measured using the gTOW method in SC–LU medium at 30 °C. Curves represent the mean values from at least three biological replicates, and shaded regions indicate the standard deviation (SD). C, D) Maximum growth rate and maximum fluorescence intensity of S. cerevisiae cells overexpressing EGFP–Poly10X–GSlinker–mCherry in SC–LU medium at 30 °C, and the calculated relative neutrality (D). Bars, dots, and error bars represent the mean, individual data points, and standard deviation from at least three biological replicates. Asterisks indicate significant differences in maximum growth rate, maximum fluorescence intensity, and relative neutrality, respectively, compared with the control (Δ) (p < 0.05, Student’s t-test with Bonferroni correction).

Neutrality of Poly10X insertions within EGFP in yeast.
A) Schematic representation of the expression constructs. Poly10X was inserted into an internal loop of EGFP between residues 173 and 174, and expressed under the control of the WTC846 promoter in S. cerevisiae. B) Growth and fluorescence curves of S. cerevisiae cells expressing EGFP173–Poly10X–174EGFP, measured using the gTOW method in SC–LU medium at 30 °C. Curves represent the mean values from at least three biological replicates, and shaded regions indicate the standard deviation (SD). C, D) Maximum growth rate and maximum fluorescence intensity of S. cerevisiae cells overexpressing EGFP173–Poly10X–174EGFP in SC–LU medium at 30 °C, and the calculated relative neutrality (D). Bars, dots, and error bars represent the mean, individual data points, and standard deviation from at least three biological replicates. Asterisks indicate significant differences in maximum growth rate, maximum fluorescence intensity, and relative neutrality, respectively, compared with the control (Δ) (p < 0.05, Student’s t-test with Bonferroni correction).

Neutrality of Poly10X detached from EGFP via P2A in yeast.
A) Schematic representation of the expression constructs. A self-cleaving P2A sequence was inserted between EGFP and Poly10X, allowing Poly10X to be detached from EGFP during translation. The construct was expressed under the control of the TDH3 promoter in S. cerevisiae. B) Growth and fluorescence curves of S. cerevisiae cells expressing EGFP–P2A–Poly10X, measured using the gTOW method in SC–LU medium at 30 °C. Curves represent the mean values from four biological replicates, and shaded regions indicate the standard deviation (SD). C, D) Maximum growth rate and maximum fluorescence intensity of S. cerevisiae cells overexpressing EGFP–P2A–Poly10X in SC–LU medium at 30 °C, and the calculated relative neutrality (D). Bars, dots, and error bars represent the mean, individual data points, and standard deviation from four biological replicates. Asterisks indicate significant differences in maximum growth rate, maximum fluorescence intensity, and relative neutrality, respectively, compared with the control (Δ) (p < 0.05, Student’s t-test with Bonferroni correction).

Fluorescence microscopy of yeast cells expressing EGFP–Poly10X (low-copy conditions).
Fluorescence microscopy images of S. cerevisiae cells expressing EGFP–Poly10X under the control of TDH3pro. Cells were pre-cultured in SC–U medium and subsequently cultured overnight in the same medium before imaging. Brightness and contrast were adjusted to clearly visualize cell morphology. For each sample, the bright-field image (left) and the corresponding GFP fluorescence image (right) are shown. Single-letter codes indicate the amino acid repeated in Poly10X. Δ represents the protein without a Poly10X fusion.

Fluorescence microscopy of yeast cells expressing EGFP–Poly10X (high-copy conditions).
Fluorescence microscopy images of S. cerevisiae cells expressing EGFP–Poly10X under the control of TDH3pro. Cells were pre-cultured in SC–U medium and subsequently cultured overnight in SC–LU medium before imaging. Brightness and contrast were adjusted to clearly visualize cell morphology. For each sample, the bright-field image (left) and the corresponding GFP fluorescence image (right) are shown. Single-letter codes indicate the amino acid repeated in Poly10X. Δ represents the protein without a Poly10X fusion.

Fluorescence microscopy of EGFP–Poly10X–mCherry expression immediately after induction in yeast.
Fluorescence microscopy images of S. cerevisiae cells expressing EGFP–Poly10X–GSlinker–mCherry under the control of the WTC846 promoter immediately after aTc induction. Cells were pre-cultured in SC–U medium and subsequently cultured overnight in SC–LU medium before imaging. Brightness and contrast were adjusted to clearly visualize cell morphology. For each sample, the bright-field image (left), GFP fluorescence image (center left), RFP fluorescence image (center right), and the merged GFP/RFP image (right) are shown. Single-letter codes indicate the amino acid repeated in Poly10X. Δ represents the protein without a Poly10X fusion.

Fluorescence microscopy of EGFP–Poly10X–mCherry expression six hours after aTc induction in yeast.
Fluorescence microscopy images of S. cerevisiae cells expressing EGFP–Poly10X–GSlinker–mCherry under the control of the WTC846 promoter six hours after aTc induction. Cells were pre-cultured in SC–U medium and subsequently cultured overnight in SC–LU medium before imaging. Brightness and contrast were adjusted to clearly visualize cell morphology. For each sample, the bright-field image (left), GFP fluorescence image (center left), RFP fluorescence image (center right), and the merged GFP/RFP image (right) are shown. Single-letter codes indicate the amino acid repeated in Poly10X. Δ represents the protein without a Poly10X fusion.

Fluorescence microscopy of EGFP–Poly10X–mCherry expression overnight after aTc induction in yeast.
Fluorescence microscopy images of S. cerevisiae cells expressing EGFP–Poly10X–GSlinker–mCherry under the control of the WTC846 promoter overnight after aTc induction. Cells were pre-cultured in SC–U medium and subsequently cultured overnight in SC–LU medium before imaging. Brightness and contrast were adjusted to clearly visualize cell morphology. For each sample, the bright-field image (left), GFP fluorescence image (center left), RFP fluorescence image (center right), and the merged GFP/RFP image (right) are shown. Single-letter codes indicate the amino acid repeated in Poly10X. Δ represents the protein without a Poly10X fusion.

Neutrality of C-terminal Poly10X fusions to EGFP in yeast under heat-stress conditions.
A) Schematic representation of the expression constructs. Poly10X was fused to the C-terminus of EGFP and expressed under the TDH3 promoter in S. cerevisiae. B) Growth and fluorescence curves of S. cerevisiae cells expressing EGFP–Poly10X, measured using the gTOW method in SC–LU medium at 38 °C. Curves represent the mean values from at least three biological replicates, and shaded regions indicate the standard deviation (SD). C, D) Maximum growth rate and maximum fluorescence intensity of S. cerevisiae cells overexpressing EGFP–Poly10X in SC–LU medium at 38 °C, and the calculated relative neutrality (D). Bars, dots, and error bars represent the mean, individual data points, and standard deviation from at least three biological replicates. Asterisks indicate significant differences in maximum growth rate, maximum fluorescence intensity, and relative neutrality, respectively, compared with the control (Δ) (p < 0.05, Student’s t-test with Bonferroni correction).

Neutrality of C-terminal Poly10X fusions to moxGFP in yeast under heat-stress conditions.
A) Schematic representation of the expression constructs. Poly10X was fused to the C-terminus of moxGFP and expressed under the WTC846 promoter in S. cerevisiae. B) Growth and fluorescence curves of S. cerevisiae cells expressing moxGFP–Poly10X, measured using the gTOW method in SC–LU medium at 38 °C. Curves represent the mean values from at least three biological replicates, and shaded regions indicate the standard deviation (SD). C, D) Maximum growth rate and maximum fluorescence intensity of S. cerevisiae cells overexpressing moxGFP–Poly10X in SC–LU medium at 38 °C, and the calculated relative neutrality (D). Bars, dots, and error bars represent the mean, individual data points, and standard deviation from at least three biological replicates. Asterisks indicate significant differences in maximum growth rate, maximum fluorescence intensity, and relative neutrality, respectively, compared with the control (Δ) (p < 0.05, Student’s t-test with Bonferroni correction).
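The significance criterion quoted in the two heat-stress legends above (p < 0.05 after Bonferroni correction) can be illustrated with a minimal sketch. It assumes that each Poly10X construct is compared once against the control (Δ), so the number of comparisons equals the number of constructs tested. The function name and the p-values are hypothetical and are not taken from this study.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag comparisons that remain significant after Bonferroni correction.

    p_values: dict mapping a construct label to its raw p-value versus the control.
    Assumption: one comparison per construct, so m = len(p_values).
    """
    m = len(p_values)
    flags = {}
    for label, p in p_values.items():
        adjusted = min(1.0, p * m)  # Bonferroni-adjusted p-value, capped at 1
        flags[label] = adjusted < alpha
    return flags


# Hypothetical raw p-values, for illustration only (not data from this study).
raw = {"D": 0.0004, "E": 0.0100, "I": 0.0300, "K": 0.2000}
flags = bonferroni_significant(raw)
print(flags)
```

With four comparisons, a raw p-value of 0.03 no longer passes the threshold, because its adjusted value is 0.12.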

SDS–PAGE and Western blot analysis of EGFP and EGFP–Poly10E proteins in yeast.
A) Schematic overview of the protein extraction procedure. A 25 mL culture was collected at approximately OD660 = 1. One milliliter was used for total protein extraction, while the remaining 5 or 24 mL was fractionated into soluble and insoluble fractions. Each fraction was analyzed by SDS–PAGE, followed by western blotting. Protein concentrations were normalized based on the total protein amount extracted from cells in 1 mL of culture at OD660 = 1, which was defined as 1 unit (1 U). B) SDS–PAGE images of proteins extracted from cells with the control vector (Vector) or overexpressing EGFP or EGFP–Poly10E (Poly10E). Cells were cultured in SC–LU medium and collected at approximately OD660 = 1. Lanes correspond to total, soluble (Sol), and insoluble (Insol) fractions (from left to right). Protein bands were visualized using a chemiluminescent detection reagent. Five biological replicates were analyzed. Protein loading concentrations are indicated in the figure. C) Quantification method for %protein level from SDS–PAGE images shown in B. Signal intensity corresponding to the EGFP band (red) was corrected by subtracting the background intensity from the same region (blue), and the resulting value was normalized to the total protein amount in the sample (green) to calculate the percentage. D) Western blot images of proteins extracted from cells with the control vector (Vector) or overexpressing EGFP or EGFP–Poly10E (Poly10E). The same gels used in B were transferred to membranes and probed with anti-GFP antibodies. Lanes correspond to total, soluble, and insoluble fractions (from left to right).
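The quantification described in panel C reduces to a background subtraction followed by a normalization. The sketch below shows that arithmetic only. The function name, argument names, and intensity values are hypothetical, and how the three intensities are extracted from the gel image is not specified here.

```python
def percent_protein_level(band_signal, background_signal, total_lane_signal):
    """Return the EGFP band as a percentage of total protein in the lane.

    band_signal: intensity of the EGFP band region (red in panel C).
    background_signal: background intensity of the same region (blue).
    total_lane_signal: intensity of total protein in the sample (green).
    """
    if total_lane_signal <= 0:
        raise ValueError("total_lane_signal must be positive")
    corrected = band_signal - background_signal  # background-corrected band
    return 100.0 * corrected / total_lane_signal


# Hypothetical intensities in arbitrary units, for illustration only.
level = percent_protein_level(1500.0, 300.0, 8000.0)
print(level)
```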

Fluorescence imaging of Hsp70 foci in yeast cells overexpressing EGFP or EGFP–Poly10E.
Fluorescence microscopy images of Hsp70 foci in cells overexpressing EGFP or EGFP–Poly10E under the control of TDH3pro in SC–LU conditions. Hsp70–mScarlet-I was genomically integrated to visualize Hsp70 aggregate formation. Representative images of nine individual cells are shown, along with the corresponding GFP and RFP fluorescence images. Green indicates EGFP or EGFP–Poly10E, and magenta represents the distribution of Hsp70. Image brightness and contrast were adjusted to enhance the visibility of aggregates.

Morphological analysis of yeast strains overexpressing EGFP or EGFP–Poly10E.
A) Microscopy images of S. cerevisiae strain BY4741 and seven mutant strains (pre7-ph, rpl18bΔ, rpl19aΔ, cdc24-5, mac1Δ, mmr1Δ, and psk1Δ) with the control vector (Vector) or overexpressing EGFP, EGFP–Poly10E, or moxGFP under the control of TDH3pro. Cells were pre-cultured in SC–U medium and then cultured for 18 h in SC–LU medium before imaging. Brightness and contrast were adjusted to clearly visualize cell morphology. B) Schematic workflow of image analysis. After culturing the cells for 18 h in SC–LU medium, microscopic images were acquired and processed using Cellpose 2.2.3 64 for cell segmentation. The segmented images were then analyzed with the MeasureObjectSizeShape module in CellProfiler 4.2.6 65 to measure the major and minor axes of each cell. Elongation ratio (major axis/minor axis) values were calculated using a Python script or Excel and visualized. C) Schematic representation of cell elongation analysis. In this study, cells with an elongation ratio ≥ 1.5 were defined as morphologically abnormal.
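The elongation-ratio step in panels B and C can be sketched as follows. This is a minimal illustration, not the script used in the study. It assumes the CellProfiler measurements were exported as a per-object CSV with the MeasureObjectSizeShape columns AreaShape_MajorAxisLength and AreaShape_MinorAxisLength. The example rows are hypothetical.

```python
import csv
import io

ABNORMAL_THRESHOLD = 1.5  # elongation ratio >= 1.5 is defined as abnormal


def elongation_ratios(csv_file):
    """Compute major axis / minor axis for each segmented cell in a CSV."""
    ratios = []
    for row in csv.DictReader(csv_file):
        major = float(row["AreaShape_MajorAxisLength"])
        minor = float(row["AreaShape_MinorAxisLength"])
        if minor > 0:  # skip degenerate objects to avoid division by zero
            ratios.append(major / minor)
    return ratios


# Hypothetical per-object measurements, for illustration only.
example = io.StringIO(
    "ObjectNumber,AreaShape_MajorAxisLength,AreaShape_MinorAxisLength\n"
    "1,40.0,32.0\n"
    "2,60.0,30.0\n"
    "3,45.0,30.0\n"
)
ratios = elongation_ratios(example)
abnormal = [r >= ABNORMAL_THRESHOLD for r in ratios]
print(ratios, abnormal)
```

A cell whose ratio is exactly 1.5 is counted as abnormal, because the threshold is inclusive.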

Quantification of cell elongation defects in yeast strains overexpressing EGFP or EGFP–Poly10E.
A) Distribution of elongation ratios in S. cerevisiae strain BY4741 and seven mutant strains (pre7-ph, rpl18bΔ, rpl19aΔ, cdc24-5, mac1Δ, mmr1Δ, and psk1Δ) with the control vector (Vector) or overexpressing EGFP, EGFP–Poly10E, or moxGFP under the control of TDH3pro. Each violin plot represents the elongation ratio (major axis/minor axis) of individual cells. Wider regions indicate a higher frequency of cells with the corresponding elongation ratio. The dashed horizontal line represents the threshold for morphological abnormality (elongation ratio ≥ 1.5). Morphological parameters were calculated from microscopic images analyzed using Cellpose and CellProfiler as described in Figure S24B and C. B) Proportion of morphologically abnormal cells (elongation ratio ≥ 1.5) in each strain. The analysis was performed on a single biological replicate, with at least 197 cells analyzed per condition. Bars indicate the percentage of abnormal cells relative to the total number of segmented cells.
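The bar heights in panel B are the percentage of segmented cells at or above the threshold, computed per condition. A minimal sketch is given below. It assumes the elongation ratios have already been grouped by condition, and both the condition labels and the ratio values are hypothetical.

```python
def percent_abnormal(ratios_by_condition, threshold=1.5):
    """Return the percentage of cells with elongation ratio >= threshold."""
    result = {}
    for condition, ratios in ratios_by_condition.items():
        if not ratios:
            continue  # no segmented cells, so no percentage is defined
        n_abnormal = sum(1 for r in ratios if r >= threshold)
        result[condition] = 100.0 * n_abnormal / len(ratios)
    return result


# Hypothetical elongation ratios, for illustration only.
example_ratios = {
    "Vector": [1.1, 1.2, 1.3, 1.6],
    "EGFP-Poly10E": [1.2, 1.7, 1.9, 2.4],
}
percentages = percent_abnormal(example_ratios)
print(percentages)
```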

Expression changes of Hsf1-regulated genes and RPN4 in yeast expressing EGFP–Poly10D, EGFP–Poly10E, or EGFP–Poly10I.
A, B) Comparison of expression ratios (vs. Vector) for genes regulated by Hsf1 and for RPN4, quantified by RNA-seq. In A, bars indicate the expression ratios of cells overexpressing EGFP, EGFP–Poly10D, and EGFP–Poly10E, respectively. In B, bars indicate the expression ratios of cells with low-level overexpression of EGFP and EGFP–Poly10I, respectively. Outlines denote genes showing significant differential expression compared with the vector control (FDR < 0.05).

PolyX occurrence across diverse species.
A) Heatmap of PolyXmax values (maximum homorepeat length for each amino acid) across multiple species. B) Heatmap of Num-Poly10X values across the same species. Each value represents the number of proteins containing ≥10-residue homorepeats per amino acid, normalized by the total number of genes in the species and multiplied by 1000. Gray boxes indicate amino acids for which no homorepeats of length ≥10 were detected.
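The two quantities in this legend can be computed from a set of protein sequences as sketched below. This is an illustrative sketch under stated assumptions, not the pipeline used for the figure. It assumes one sequence per gene, so the number of sequences stands in for the total number of genes. The function names and the toy proteome are hypothetical.

```python
from itertools import groupby


def longest_runs(sequence):
    """Return the longest homorepeat length for each amino acid in a sequence."""
    longest = {}
    for residue, run in groupby(sequence):
        length = sum(1 for _ in run)
        longest[residue] = max(longest.get(residue, 0), length)
    return longest


def polyx_statistics(proteome, min_length=10):
    """Compute PolyXmax and Num-Poly10X from a dict of protein sequences.

    Assumption: len(proteome) is used as the total number of genes.
    """
    poly_x_max = {}      # longest homorepeat seen for each amino acid
    protein_counts = {}  # proteins with a homorepeat of length >= min_length
    for sequence in proteome.values():
        for residue, length in longest_runs(sequence).items():
            poly_x_max[residue] = max(poly_x_max.get(residue, 0), length)
            if length >= min_length:
                protein_counts[residue] = protein_counts.get(residue, 0) + 1
    n_genes = len(proteome)
    num_poly10x = {r: 1000.0 * c / n_genes for r, c in protein_counts.items()}
    return poly_x_max, num_poly10x


# Toy proteome, for illustration only.
toy = {
    "P1": "M" + "Q" * 12 + "A",
    "P2": "MK" + "Q" * 10 + "S" * 3,
    "P3": "MAAAK",
    "P4": "MSSSSK",
}
poly_x_max, num_poly10x = polyx_statistics(toy)
print(poly_x_max, num_poly10x)
```

Each protein is counted at most once per amino acid, which matches the legend's definition of counting proteins rather than repeat tracts. Amino acids missing from `num_poly10x` correspond to the gray boxes in panel B.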

Correlation analysis between experimental Poly10X harmfulness patterns and proteome-wide PolyX occurrence trends.
Correlation matrix comparing Poly10X harmfulness patterns observed across different fluorescent proteins with amino acid–level PolyXmax and Num-Poly10X values obtained from proteome-wide computational analyses. Each value represents the Spearman’s rank correlation coefficient for the corresponding pairwise comparison.
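Each cell of such a matrix is a Spearman's rank correlation coefficient, which is the Pearson correlation of the ranks of the two variables. A self-contained sketch is shown below. It assumes that tied values receive their average rank, and the function names are hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the block of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Toy vectors, for illustration only.
rho_up = spearman([1, 2, 3, 4], [10, 20, 30, 40])
rho_down = spearman([1, 2, 3, 4], [40, 30, 20, 10])
print(rho_up, rho_down)
```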

Structural context of PolyX homorepeats in the S. cerevisiae proteome based on AlphaFold pLDDT scores.
A) Distribution plots showing the relationship between amino acid homorepeat length in the S. cerevisiae proteome and the corresponding pLDDT scores obtained from the AlphaFold structural prediction dataset (https://alphafold.ebi.ac.uk/). Each dot represents a sequence region within a protein. The pLDDT score indicates prediction confidence (100–90: very high; 90–70: confident; 70–50: low; 50–0: very low). Regions with low pLDDT scores likely correspond not only to low model confidence but also to intrinsically disordered regions (IDRs), which do not adopt a fixed 3D structure. This analysis was performed to examine whether naturally occurring PolyX regions tend to reside within well-structured protein cores or within flexible, disordered regions. For most amino acids, longer homorepeats were associated with lower pLDDT scores, indicating enrichment in unstructured regions. In contrast, PolyQ (glutamine) repeats maintained relatively high pLDDT scores even at longer lengths. B) Representative structures of PolyQ-containing proteins, with PolyQ segments that exhibit pLDDT ≥ 70 highlighted in red. Although these PolyQ regions show high prediction confidence, they appear to be located away from the structured protein core. This suggests that PolyX tracts, including PolyQ, are generally enriched in flexible or intrinsically disordered regions rather than in tightly structured core domains.
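The pLDDT confidence bands listed in panel A can be expressed as a simple classifier. The sketch below makes two assumptions that the legend does not state: boundary values (90, 70, 50) are assigned to the higher band, and a repeat region is summarized by the mean of its per-residue pLDDT scores. The function names and example scores are hypothetical.

```python
def plddt_category(score):
    """Map a pLDDT score (0-100) to its confidence band.

    Assumption: boundary values belong to the higher band.
    """
    if not 0 <= score <= 100:
        raise ValueError("pLDDT must be between 0 and 100")
    if score >= 90:
        return "very high"
    if score >= 70:
        return "confident"
    if score >= 50:
        return "low"
    return "very low"


def repeat_confidence(per_residue_plddt):
    """Summarize a repeat region by its mean pLDDT (an assumed summary)."""
    mean_score = sum(per_residue_plddt) / len(per_residue_plddt)
    return mean_score, plddt_category(mean_score)


# Hypothetical per-residue scores for one repeat region, for illustration only.
mean_score, band = repeat_confidence([72.0, 68.0, 75.0, 71.0])
print(mean_score, band)
```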