Mass-spectrometry (MS)-based screening revealed several proteins bound to FMR1 RNA with expanded CGG repeats; some affect the yield of polyglycine-containing proteins, alleviating their toxicity.

(A) Scheme of two RNA molecules used to perform MS-based screening. The FMR1 RNA contains the entire length of the 5’ untranslated region (5’UTR) of FMR1 with expanded CGG (x99) repeats forming a hairpin structure (red). The open reading frame for the polyglycine-containing protein starts at a repeat-associated non-AUG-initiated (RAN) translation-specific ACG codon. RAN translation can also be initiated from a GUG near-cognate start codon, which is not indicated in the scheme. The GC-rich RNA contains TMEM107 mRNA enriched with G and C nucleotide residues (GC content > 70%; similar to FMR1 RNA) with the open reading frame starting at the canonical AUG codon (blue). Both RNAs are tagged with three MS2 stem-loop aptamers (grey) interacting with an MS2 protein tagged with an in vivo biotinylating peptide used to pull down proteins interacting with the RNAs.

(B) The volcano plot representing proteins captured during MS-based screening showing the magnitude of enrichment (log2 fold change) and the statistical significance (−log P-value); red dots indicate proteins significantly enriched (P < 0.05) on FMR1 RNA compared to GC-rich RNA. Three proteins, RPS26, LUC7L3, and DHX15, tested in a subsequent validation experiment are marked. DDX3X is also indicated as this protein has been previously described in the context of interaction with FMR1 mRNA.

(C) The scheme of RNA used for the transient overexpression of FMR99xG (mutant, long, 99 polyglycine tract-containing protein). The construct contains the entire length of the 5’UTR of FMR1 with expanded CGG repeats forming a hairpin structure (red) tagged with enhanced green fluorescent protein (eGFP) (green). Western blot analysis of FMR99xG and Vinculin in HEK293 cells with DHX15, RPS26, or LUC7L3 insufficiency induced by specific short interfering RNA (siRNA) treatment. To detect FMR99xG, the 9FM antibody was used. The upper bands were used for quantification. The graph presents the mean signal for FMR99xG normalized to Vinculin from N = 3 biologically independent samples with the standard deviation (SD). An unpaired Student’s t-test was used to calculate statistical significance: *, P < 0.05; ***, P < 0.001; ns, non-significant.

(D) Results of microscopic quantification of FMR99xG-positive aggregates in HeLa cells upon RPS26 silencing. The graph presents a normalized number of GFP-positive aggregates per nucleus in N = 10 biologically independent samples with the SD. An unpaired Student’s t-test was used to calculate statistical significance: *, P < 0.05.

(E) The influence of RPS26 silencing on apoptosis evoked in HeLa cells after 28h or 43h of FMR99xG overexpression. Apoptosis was measured as luminescence signals (relative luminescence units; RLU). The graph presents relative mean values from N = 6 biologically independent samples treated with either siCtrl or siRPS26 with the SD normalized to mock control (cells transfected only with the delivering reagent). An unpaired Student’s t-test was used to calculate statistical significance: *, P < 0.05; **, P < 0.01.
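The normalization and the unpaired Student’s t-test named in panels C to E can be illustrated with a minimal sketch. It assumes equal variances in both groups (the classical pooled-variance form of the test); the function names and the example values are hypothetical and are not taken from the study. The sketch returns only the t statistic and the degrees of freedom; the P-value would then be read from the t distribution with those degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance


def normalize(target_signals, loading_signals):
    """Divide each target band signal by its loading-control signal (e.g., Vinculin)."""
    return [t / l for t, l in zip(target_signals, loading_signals)]


def student_t(group_a, group_b):
    """Unpaired Student's t statistic with pooled variance (equal variances assumed).

    Returns (t, degrees_of_freedom).
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df


# Hypothetical densitometry values for N = 3 samples per group.
control = normalize([10.0, 12.0, 11.0], [5.0, 6.0, 5.0])
silenced = normalize([5.0, 6.0, 4.0], [5.0, 6.0, 5.0])
t_stat, dof = student_t(control, silenced)
```

A positive t statistic here means that the normalized signal is higher in the first group than in the second.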

RPS26 insufficiency induces a lower production of polyglycine, but not FMRP, in multiple FXTAS cellular models.

(A) Representative microscopic images showing inducible expression of FMR95xG fused with eGFP in a stable transgenic cell line containing a single copy of the 5’UTR FMR1 99xCGG-eGFP transgene under the control of a doxycycline-inducible promoter: S-95xCGG model. Nuclei stained with Hoechst are shown in blue; the green signal is derived from FMR95xG tagged with eGFP; scale bar 50 µm; +DOX and −DOX indicate cells treated or not treated with doxycycline to induce transcription of the transgene from the doxycycline-dependent promoter.

(B & C) Results of western blot analyses of FMR95xG (with long, mutant polyglycine stretches) or FMR16xG (with short, normal polyglycine stretches) normalized to Vinculin and an RT-qPCR analysis of FMR1-GFP transgene expression normalized to GAPDH upon RPS26 silencing in S-95xCGG and S-16xCGG models, respectively. siRPS26_I and siRPS26_II indicate two different siRNAs used for RPS26 silencing. The graphs present means from N = 3 (B) or N = 5 biologically independent samples (C) with SDs. An unpaired Student’s t-test was used to calculate statistical significance: *, P < 0.05; **, P < 0.01; ns, non-significant.

(D) Representative microscopic images of cells stably expressing FMR99xCGG obtained after transduction with lentivirus containing the 5’UTR FMR1 99xCGG-eGFP transgene: L-99xCGG model. The green field image showing the signal from FMR99xG fused with eGFP was merged with the bright field image; scale bar 100 µm.

(E) Results of western blot analyses of FMR99xG and FMRP normalized to Vinculin upon RPS26 silencing in the L-99xCGG model. The graphs present means from N = 3 biologically independent samples with SDs. An unpaired Student’s t-test was used to calculate statistical significance: *, P < 0.05; **, P < 0.01; ns, non-significant.

(F) Results of western blot analyses of FMRP levels normalized to Vinculin upon RPS26 silencing in fibroblasts derived from either a healthy individual (CGGnorm) or an FXTAS patient (CGGexp). The graphs present means from N = 3 biologically independent samples with SDs. An unpaired Student’s t-test was used to calculate statistical significance: ns, non-significant.

Changes in the proteome of RPS26-deficient cells are not robust.

(A) The volcano plot represents a stable isotope labeling using amino acids in cell culture (SILAC)-based quantitative proteomic analysis identifying proteins sensitive to RPS26 insufficiency. It shows the magnitude of protein-level changes (log2 fold change) vs. the statistical significance (−log P-value) 48h post siRPS26 treatment. Data were collected from three independent biological replicates for each group. Grey dots indicate proteins non-responding to RPS26 depletion (N = 1506). Red dots indicate proteins responding to RPS26 insufficiency (P < 0.05); EIF5 and PDCD4 are examples of Negative (N = 223; green) and Positive (N = 158; orange) responders, respectively. Protein groups were further analyzed in B, C, and D.

(B) Gene ontology (GO) analysis performed for positive responders of RPS26 insufficiency from the proteomic experiment shown in A. The graph presents significantly enriched GO terms (P < 0.05); statistical significance was calculated using Fisher’s Exact test with the Bonferroni correction. Note that for negative responders, no GO terms were significantly enriched.

(C) Data validation from the proteomic experiment described in A. Western blot analyses of PDCD4, ILF3, FMRP, and EIF5 proteins normalized to Vinculin for HEK293 cells treated with siRPS26. The graphs present means from N = 4 biologically independent samples with SDs. An unpaired Student’s t-test was used to calculate statistical significance: *, P < 0.05; **, P < 0.01; ***, P < 0.001; ns, non-significant.

(D) The percentage of GC content in 5’UTRs and coding sequences across extending sequence windows initiated from the start codon within three groups of transcripts: Negative responders (green), Positive responders (orange), and Background (BG) understood as the total transcriptome (red). To avoid biases in the GC-content mean, transcripts with 5’UTR sequences shorter than 20 nucleotides were excluded from the analysis, yielding the following sample sizes: Negative responders (N = 213), Positive responders (N = 147), and BG (N = 20,862). For example, position −6 in 5’ UTR sequences corresponds to a 6-nucleotide fragment (window from −6 to −1 positions upstream of ATG), while position −7 corresponds to a 7-nucleotide fragment (window from −7 to −1 positions). The solid line shows the mean GC content at a given position (i.e., within the window), and the shade indicates the standard error of the mean. P-values < 0.01 are denoted by green or yellow dots. These reflect pairwise comparisons of GC content between transcript groups (compared to BG) and were determined using a two-tailed paired t-test with Bonferroni correction.
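The extending-window calculation described in panel D can be sketched as follows. This is a minimal illustration, assuming each 5’UTR is given as a plain string written 5’ to 3’ and ending immediately before the start codon; the function names are hypothetical, and the paired t-test with Bonferroni correction is not reproduced here.

```python
from math import sqrt
from statistics import mean, stdev

MIN_UTR_LENGTH = 20  # transcripts with shorter 5'UTRs are excluded, as in the legend


def gc_percent(seq):
    """Percentage of G and C nucleotides in a non-empty sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def upstream_window_gc(utr5, position):
    """GC content of the window from `position` to -1 (e.g., -6 gives a 6-nt window)."""
    return gc_percent(utr5[position:])


def group_profile(utrs, max_window=20):
    """Mean GC content and standard error of the mean for each window size.

    Returns {position: (mean, sem)} for positions -1 .. -max_window.
    max_window should not exceed MIN_UTR_LENGTH, so every window is full length.
    """
    kept = [u for u in utrs if len(u) >= MIN_UTR_LENGTH]
    profile = {}
    for k in range(1, max_window + 1):
        values = [upstream_window_gc(u, -k) for u in kept]
        sem = stdev(values) / sqrt(len(values)) if len(values) > 1 else 0.0
        profile[-k] = (mean(values), sem)
    return profile
```

For instance, `upstream_window_gc(utr, -6)` evaluates the six nucleotides directly upstream of the start codon, and `upstream_window_gc(utr, -7)` extends that window by one nucleotide, matching the positions −6 and −7 described in the legend.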

Insufficiency of the TSR2 chaperone protein lowers the level of RPS26 and FMRpolyG but not FMRP and selected 40S components.

(A) Western blot analyses of RPS26, FMR95xG, FMRP, and Histone H3.3 (a sensor of RPS26-depleted ribosomes), normalized to Vinculin, in stable S-95xCGG cells treated with siTSR2. The graphs present means from N = 5 biologically independent samples with SDs. An unpaired Student’s t-test was used to calculate statistical significance: **, P < 0.01; ***, P < 0.001; ns, non-significant.

(B) Western blot analyses of selected 40S ribosomal proteins RACK1, RPS6, and RPS15, normalized to Vinculin upon TSR2 silencing in the S-95xCGG model. The graphs present means from N = 5 biologically independent samples with SDs. An unpaired Student’s t-test was used to calculate statistical significance: *, P < 0.05; ns, non-significant.

Silencing RPS25, another component of the 43S PIC, reduces the level of polyglycine produced from mutant FMR1 mRNA.

(A) A western blot analysis of FMR95xG normalized to Vinculin and an RT-qPCR analysis of the FMR1-GFP transgene expression normalized to GAPDH upon RPS25 silencing in the S-95xCGG model. The graphs present means from N = 3 biologically independent samples with SDs. An unpaired Student’s t-test was used to calculate statistical significance: *, P < 0.05; ns, non-significant.

(B) A western blot analysis of FMR99xG normalized to Vinculin from the human neuroblastoma cell line (SH-SY5Y) upon silencing of either RPS26 or RPS25. siRPS26_I and siRPS26_II indicate two different siRNAs used for RPS26 silencing. The upper bands were used for quantification. The graph presents means from N = 3 biologically independent samples with SDs. An unpaired Student’s t-test was used to calculate statistical significance: *, P < 0.05; **, P < 0.01; ns, non-significant.

MS-based screening revealed proteins interacting with FMR1 RNA.

(A) Sequences of FMR1 RNA and GC-rich RNA constructs used in MS2-based screening. Note that the sequences presented are a fragment of the plasmids used in the experiment.

(B) Results of western blot representing FMRpolyG derived from the FMR1 RNA construct (lanes 1 and 2) in comparison to FMRpolyG tagged with GFP (lanes 3 and 4) derived from the template construct (Addgene #63089). Tubulin was used as a loading control.

(C) Gene ontology (GO) analysis performed on proteins detected in the FMR1 RNA sample group (common among three technical replicates). The graph presents selected, significantly enriched GO terms.

(D) Results of western blot analysis of FMR99xG normalized to Vinculin upon siRNA-based silencing of NOP58, HSHP1, NAF1, MYBBP1A, and DHX9 in HEK293 cells. The graphs present means from N = 3 biologically independent samples with SDs. An unpaired Student’s t-test was used to calculate statistical significance: ns, non-significant. Note that the presented blots are cropped.

(E) RT-qPCR analysis of FMR1-GFP transgene expression derived from the transient transfection system, normalized to GAPDH, upon RPS26 and DHX15 depletion in HEK293 cells. siRPS26_I and siRPS26_II indicate two independent siRNAs. The graphs present means from N = 3 biologically independent samples with SDs. An unpaired Student’s t-test was used to calculate statistical significance: ns, non-significant.

(F) Results of western blot demonstrating enrichment of RPS26 detected in 5’UTR 99xCGG and 23xCGG eluates compared to GC-rich RNA in a biochemical assay. 50% of the eluate fraction was loaded on the gel. All RNA molecules in the experiment were biotinylated (b) in order to perform the bRNA-protein pull-down procedure from HEK293 cell extract. The sample without RNA served as a negative control.

(G) Results of western blot demonstrating enrichment of SAM68 protein detected in the FMR1 RNA sample eluate after the MS2-based, in cellulo RNA-protein pull-down procedure. The following fractions were loaded on the gel: input, 5% of total lysate; eluate, 20% of the immunoprecipitated fraction. Note that the presented blot is cropped.

Characterization of S-95xCGG and S-16xCGG models and RPS26-responders.

(A) Results of western blot of FMR95xG and FMR16xG expression in S-95xCGG and S-16xCGG models, respectively, after induction of the doxycycline-inducible promoter. The upper image represents the titration of doxycycline (in ng/ml). The following images demonstrate the time course of FMR95xG/FMR16xG expression after promoter induction. The same amount of protein sample was loaded on the gels. Vinculin serves as a loading control.

(B) Representative confocal microscopic images showing homogeneous expression of FMR95xG fused with eGFP in the S-95xCGG model after 35 days of doxycycline treatment; scale bar 10 µm. Note that no GFP-positive aggregates of polyglycine were detected.

(C) RT-qPCR analysis of endogenous FMR1 and FMR1-GFP transgene expression in the S-95xCGG model with (+DOX) or without (−DOX) doxycycline induction. Bars represent mean cycle threshold (Ct) values.

(D) Results of western blot analysis of SUZ12 and Histone H3.3 proteins normalized to Vinculin upon RPS26 silencing in COS7 cells. The graphs present means from N = 3 biologically independent samples with SDs. An unpaired Student’s t-test was used to calculate statistical significance: *, P < 0.05; **, P < 0.01.

Validation of SILAC-MS results and bioinformatic analysis of mRNAs encoding RPS26-sensitive proteins.

(A) Validation of SILAC-MS data. Results of western blot analysis of RPS6 and FUS (results in line with the proteomic analysis) and PCBP2 and EIF3J (results not in line with the proteomic analysis) normalized to Vinculin upon RPS26 silencing in HEK293 cells. The graphs present means from N = 4 biologically independent samples with SDs. An unpaired Student’s t-test was used to calculate statistical significance: *, P < 0.05; ns, non-significant.

(B) Frequency of nucleotides across transcript sequence positions in the close vicinity of the start codon within three groups of transcripts: negative responders (N = 218), positive responders (N = 158), and background transcripts (N = 22,160). The analyzed sequence positions span the last 20 nucleotides of the 5’UTR sequence (from −20 to −1, upstream of the start codon) and the first 20 nucleotides of the coding sequence (from +1 to +20, downstream of the start codon). Each nucleotide symbol’s height on the graph is proportional to its relative frequency at the corresponding sequence position. The cumulative height of the stack at each position sums to one, encompassing the frequencies of all four nucleotides.

(C) The percentage of GC content of 5’UTRs across extending sequence windows initiated upstream from the start codon within groups of transcripts. The graph combines analyses of transcripts selected with two P-value cutoffs (P < 0.05 and P < 0.01), yielding different sample sizes for each group. The color corresponding to each group, as well as the group size, is indicated in the legend. The solid line shows the mean GC content at a given position (i.e., within the window), and the shade indicates the standard error of the mean. Adjusted P-values (right panel) reflect pairwise comparisons of GC content between transcript groups and were determined using a two-tailed paired t-test with Bonferroni correction.

(D) Results of western blot analysis of FMRpolyG protein derived from the wild-type (WT) or mutated (−4G>A) construct normalized to Vinculin upon RPS26 silencing in HEK293 cells. The graphs present means from N = 3 biologically independent samples with SDs. An unpaired Student’s t-test was used to calculate statistical significance: ns, non-significant.

(E) Comparison of overrepresented hexamers in 5’UTR full-length sequences within two transcript groups: negative (N = 218) and positive responders (N = 158). The frequency of each hexamer is compared to the corresponding hexamer frequency in 5’UTR sequences of background transcripts (BG, N = 22,160) (indicated as log2 fold change). The P-value associated with each hexamer signifies the degree of overrepresentation (enrichment) and is calculated from the binomial distribution. Only hexamers with P-values < 10⁻⁵ are displayed.
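One possible way to obtain the hexamer enrichment values described in panel E is sketched below. The legend does not specify how hexamers were counted or how the binomial test was parameterized, so the following choices are assumptions: all overlapping hexamers are counted, the background hexamer frequency serves as the null probability, and the P-value is the one-sided upper tail of the binomial distribution. All function names are hypothetical.

```python
import math
from collections import Counter


def count_hexamers(sequences, k=6):
    """Count all overlapping k-mers (k = 6 by default) across the given sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p) with 0 < p < 1, summed in log space."""
    total = 0.0
    for i in range(k, n + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                   + i * math.log(p) + (n - i) * math.log1p(-p))
        total += math.exp(log_pmf)
    return min(total, 1.0)


def hexamer_enrichment(group_seqs, background_seqs, p_cutoff=1e-5):
    """Return {hexamer: (log2 fold change vs. background, P-value)} below the cutoff."""
    group = count_hexamers(group_seqs)
    background = count_hexamers(background_seqs)
    n_group = sum(group.values())
    n_bg = sum(background.values())
    results = {}
    for hexamer, observed in group.items():
        bg_count = background.get(hexamer, 0)
        if bg_count == 0 or bg_count == n_bg:
            continue  # assumption: skip hexamers whose null probability is 0 or 1
        p_bg = bg_count / n_bg
        log2_fc = math.log2((observed / n_group) / p_bg)
        p_value = binomial_upper_tail(observed, n_group, p_bg)
        if p_value < p_cutoff:
            results[hexamer] = (log2_fc, p_value)
    return results
```

Summing the tail in log space avoids overflow of the binomial coefficients when the total number of hexamers in a group is large.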

Silencing of TSR2 reduces the FMR16xG level in the S-16xCGG model.

Results of western blot analysis of FMR16xG normalized to Vinculin upon TSR2 silencing in the S-16xCGG model. The graphs present means from N = 5 biologically independent samples with SDs. An unpaired Student’s t-test was used to calculate statistical significance: ***, P < 0.001.

Biotinylated RNA-protein pull down of RPS25.

Results of western blot demonstrating enrichment of RPS25 detected in samples obtained with biotinylated 5’UTR 99xCGG RNA and, to a lesser extent, in 5’UTR (no repeats) and GC-rich RNA eluates (ca. 20% of the eluate fraction was loaded on the gel). Input: 5% of total protein lysate was loaded. All RNA molecules in the experiment were biotinylated (b) in order to perform the bRNA-protein pull-down procedure. The beads-only sample (with no bRNA) served as a negative control.