Introduction

As the main constituent of terrestrial plant biomass, lignocellulose represents a huge reservoir of fixed carbon and a renewable resource for bioproducts and energy that is essential for human use1. Lignocellulosic biomass, enclosed mainly within the secondary cell wall (SCW), reinforces vessels for long-distance transport and provides fibers with the physical properties that allow upright growth of plants1, 2. SCWs are composite materials made of high-molecular-weight biopolymers, including cellulose, hemicelluloses, and lignin, as well as cell wall proteins, whose composition and structure can differ markedly between dicots and monocots2, 3. Cellulose is synthesized at the plasma membrane by the cellulose synthase complex and is composed of linear chains of β-(1,4)-glucans that aggregate to form highly crystalline cellulose microfibrils2, 3. Xylans, the main hemicellulosic polysaccharides in the SCW, are synthesized in the Golgi apparatus by a complex biosynthetic machinery and comprise a group of polysaccharides that share a common backbone of β-(1,4)-linked xylose (Xyl) units but differ in their side chains, whose nature depends on tissue and species2. In particular, glucuronoxylan, the dominant xylan in the SCW of dicots, is substituted at the O-2 position with α-D-glucuronic acid (GlcA) or (4-O-methyl)-α-D-glucuronic acid (MeGlcA) groups, and with O-acetyl moieties at the C-2 or C-3 positions2. By interacting with cellulose and lignin, xylans contribute to the strengthening of the SCW and are considered to be one of the main factors contributing to biomass recalcitrance to enzymatic hydrolysis3, 4. Finally, lignin polymers are made up of H, G, and S units, which result from the oxidation and polymerization of p-coumaryl, coniferyl, and sinapyl alcohols, respectively. These monomers, also called monolignols, are synthesized on the ER surface/cytoplasm prior to their transport to the cell wall. Lignins confer stiffness, strength, and hydrophobicity to the SCW24.

The metabolic load imposed by the coordinated production of SCW biopolymers emphasizes how important it is for plants to precisely control the temporal and spatial expression of the corresponding SCW biosynthetic genes. Based on foundational studies performed in the Arabidopsis model, it is generally recognized that SCW synthesis is primarily controlled at the transcriptional level by a multilayered and interconnected network of transcription factors (TFs)5, 6. Key players of this regulatory network are related NAC (NAM-ATAF1/2-CUC) and MYB-type TFs that induce, in a redundant and combinatorial manner, the expression of SCW biosynthetic genes by binding to their promoter regions5. In addition, a family of Class III homeodomain leucine zipper (HD-ZIPIII) TFs, whose members control several aspects of Arabidopsis development, was also shown to interact with the SCW regulatory network, contributing to xylem cell specification and SCW synthesis5, 6. Although this transcriptional level of control can gain complexity in perennial plants, functional orthologs of the main regulators have been identified in many different species, suggesting that SCW deposition proceeds via a conserved mechanism in all vascular plants7, 8. During the last decade, engineering the SCW regulatory network to deposit modified lignin and/or improve polysaccharide composition has become a promising strategy to optimize biomass processability9, 10. However, despite considerable progress in understanding the regulation of SCW synthesis, engineering the SCW regulatory network to improve biomass yield and digestibility has proven difficult and often impairs plant growth because of altered vascular tissue development11, 12.

Downstream of transcriptional regulation by TFs, RNA-binding proteins (RBPs) are essential post-transcriptional modulators of gene expression across all kingdoms of life13. However, although a large cohort of RBPs exists in plants14, there is little information about RBP-mediated post-transcriptional regulation of SCW biosynthetic genes. So far, only a few microRNA (miRNA) families have been implicated in the post-transcriptional regulation of genes involved in both regulatory and enzymatic aspects of SCW biosynthesis6. In particular, miR165/166 were shown to target HD-ZIPIII transcriptional regulators, having a direct impact on the transcriptional network associated with SCW biosynthesis6. On the other hand, miRNA397/857/408 were shown to affect lignin content by targeting laccase (LAC) genes, which control lignin polymerization from monolignol precursors15, 16. In addition, two related tandem CCCH zinc-finger proteins, which exhibit both DNA- and RNA-binding abilities, have been proposed to modulate SCW formation by regulating the expression of genes associated with cell wall metabolism17. Despite this mounting evidence, our current knowledge of the role of post-transcriptional regulators in SCW synthesis is still very limited.

In this work, we report that two RRM-domain-containing RNA-binding proteins homologous to the animal translational regulator Musashi, Musashi-like2/MSIL2 and Musashi-like4/MSIL4, function redundantly to control various aspects of development, including the stiffness of the inflorescence stem in Arabidopsis. We show that RRM-dependent RNA-binding activity is essential for MSIL2/4 functions in vivo, and that the MSIL2/4 interactomes are similar, being enriched in proteins involved in 3’-UTR binding and translational regulation. MSIL2/4 mutations alter the formation of the SCW in the interfascicular fiber cells, leading to a reduction in lignin deposition and a change in the decoration pattern of glucuronoxylan that is associated with an increase in 4-O-methylation of the GlcA substituent. In accordance, quantitative mass-spectrometry-based protein analysis reveals an overaccumulation of the glucuronoxylan biosynthetic machinery, including the GlucuronoXylan Methyltransferase3/GXM3, in the msil2/4 mutant stem. We show that MSIL4 immunoprecipitates GXM3 mRNA in vivo, likely regulating its expression at the translational level. Our results demonstrate that MSILs regulate SCW synthesis in interfascicular fiber cells and point to a novel aspect of SCW regulation linking translational repression to the regulation of SCW biosynthesis genes.

Results

Musashi-like MSIL2/4 proteins redundantly control specific Arabidopsis development processes

While many studies have implicated the MSIs in the control of gene expression during cellular proliferation, cell fate determination and cancer in animals18, 19, little is known about the presence and activity of MSI-type proteins in plants. In this work, we describe an RNA-binding protein family in Arabidopsis, whose members, hereafter named MUSASHI-Like1 to 4/MSIL1-4, share sequence, domain organization, and model-based structural similarities with animal MSIs (Fig. 1a, b). Phylogenetic analysis revealed that Arabidopsis harbors a clade of seven MSIL-type genes with a prominent sub-clade containing the MSIL1-4 genes (Fig. 1b). Notably, a neighboring sub-clade contains two genes, RBGD2 and RBGD4 (Fig. 1b), that encode heat-inducible RBPs implicated in the response to heat stress in Arabidopsis20. MSIL1-4 proteins share a common domain architecture consisting of two N-terminal RNA recognition motifs (RRM1 and RRM2) followed by a poorly conserved, intrinsically unstructured, carboxy-terminal region (Fig. 1a; and Supplementary Fig. 1a). A survey of the ARAPORT11 gene expression databases (https://araport.org/; http://fgcz-pep2pro.uzh.ch) revealed that the MSIL1-4 genes are widely expressed in Arabidopsis, with MSIL1 showing the lowest levels of expression (Fig. 1c). Database searches further indicated that MSIL orthologs are widely distributed in land plants, from bryophytes to angiosperms (Supplementary Fig. 1b). To gain a mechanistic understanding of MSIL function in Arabidopsis, we determined the subcellular localization of these proteins by performing fluorescence microscopy on GFP-tagged versions of MSIL2 and MSIL4 (MSIL2G and MSIL4G). Both proteins showed a diffuse cytoplasmic distribution in root cells of stable Arabidopsis transgenic plants, in contrast to the nuclear and cytoplasmic distribution of the free GFP protein control (Fig. 1d).

Musashi-like MSIL2/4 proteins redundantly control development in Arabidopsis.

a) Schematic representation of the Musashi-like (MSIL) protein family in Arabidopsis thaliana. The invariant RNP1 and RNP2 motifs within the conserved RRM domains are indicated. Numbers refer to amino acid identities between the mouse Musashi/MSI RRM1/2 domains and the corresponding domains in Arabidopsis MSIL homologs. b) Evolutionary relationships between MSILs and related RNA-binding proteins. The scale bar indicates the rate of evolutionary change expressed as number of amino acid substitutions per site. c) RNA-seq expression map of MSIL genes extracted from ARAPORT11. Shading is a log2 scale of transcripts per million (TPM). d) General overview of the roots of 6-day-old wild-type Arabidopsis seedlings (Col-0) or transgenic seedlings expressing either a free GFP protein or GFP-tagged versions of MSIL2/MSIL2G and MSIL4/MSIL4G. Scale bar, 10 μm. e) Photographs of representative rosettes of Col-0, msil2/4, msil2/4-MSIL2F1 and msil2/4-MSIL4F1 plants. f) Photographs of representative inflorescence stems of Col-0, msil2/4, msil2/4-MSIL2F1 and msil2/4-MSIL4F1 plants. Abbreviations: Mus musculus Musashi2 (MmMsi2); human heterogeneous nuclear ribonucleoprotein A1 (hnRNPA1).

To further investigate the function of the MSILs in Arabidopsis, we raised specific antibodies against non-conserved regions of MSIL1-4 and characterized T-DNA insertion msil mutant lines that lack full-length MSIL mRNA and MSIL proteins, as judged by RT-PCR and western blot experiments, respectively (Supplementary Fig. 2a-c). However, none of the four characterized single msil mutants showed any obvious phenotype (Supplementary Fig. 3a and data not shown). To assess the potential functional redundancy between MSIL genes, we crossed the single msil mutants together and characterized the six possible double mutant combinations (Supplementary Fig. 3a). Only the msil2-1/msil4-1 (msil2/4) double mutant showed discernible developmental abnormalities, including enlarged and curled leaves, early leaf senescence and a pendant inflorescence stem (Fig. 1e, f; and Supplementary Fig. 3a, b). To assess whether MSIL1 and MSIL3 also contribute to Arabidopsis development in the msil2/4 double mutant background, we generated a msil1-1/msil2-1/msil3-1/msil4-1 (msil1/2/3/4) quadruple mutant, which showed no additional or aggravated developmental phenotypes with respect to the msil2/4 double mutant (Supplementary Fig. 3a, c, d). To test the functional redundancy of MSIL2 and MSIL4 proteins in vivo, we transformed the msil2/4 double mutant with constructs coding for a Flag/HA-tagged version of either MSIL2 (MSIL2F) or MSIL4 (MSIL4F), and selected two independent transgenic plants expressing similar and near-physiological levels of the tagged protein, as evaluated by western blot using our custom-made antibodies (Supplementary Fig. 3e). The expression of either MSIL2F or MSIL4F was able to rescue all the phenotypes associated with the MSIL2/4 defect (Fig. 1e, f; and Supplementary Fig. 3b). This confirms that MSIL2F and MSIL4F are functional proteins and, more importantly, that MSIL2 and MSIL4 redundantly control various aspects of development in Arabidopsis.

RRM-dependent RNA-binding activity is essential for MSIL2/4 functions in vivo

Animal MSIs interact with target transcripts through their RRM domains, and mutations in these domains have been shown to impair MSI activity in vivo21, 22. MSILs are very similar to metazoan MSIs over their RRM sequences (Fig. 1a; and Supplementary Fig. 4a), a similarity that was also observed when homology models of the MSIL4 RRM motifs were generated using the SWISS-MODEL server23 (Fig. 2a). Homology-based structure prediction proposed that these domains adopt the characteristic RRM fold, with a four-stranded β-sheet structure bearing the conserved phenylalanine residues involved in the specific recognition of RNA bases24 (Fig. 2a; and Supplementary Fig. 4a). This supports the idea that MSILs have retained the ability to interact with RNA, a notion further supported by the identification of MSIL2/4 proteins in the experimentally determined Arabidopsis mRNA-binding proteome14. To examine the requirement for the interaction of MSIL with RNA in developmental control, we first generated an RNA-binding mutant form of MSIL4, MSIL4RRM, by introducing phenylalanine-to-aspartate (F→D) mutations in both RRM domains (Fig. 2b). Protein-RNA interaction assays using RNA homopolymers immobilized on agarose beads confirmed that MSIL4, but not MSIL4RRM, exhibited intrinsic RNA-binding activity in vitro (Fig. 2c). To assess the impact of these mutations in vivo, the MSIL4G and MSIL4GRRM constructs were expressed under the control of the MSIL4 endogenous promoter in the msil2/4 mutant, and two independent transformants expressing near-physiological levels of the WT (MSIL4G-3 and MSIL4G-8) and mutant (MSIL4GRRM-3 and MSIL4GRRM-10) proteins were selected for further complementation analysis (Supplementary Fig. 4b). MSIL4G, but not MSIL4GRRM, proteins were able to rescue the developmental defects caused by MSIL2/4 mutations (Fig. 2d; and Supplementary Fig. 5a, b), indicating that these proteins are likely to exert their functions by interacting with RNA targets in vivo.

RRM-dependent RNA-binding activity is essential for MSIL function in planta.

a) Experimentally determined structures of the RRM1 domain of MmMSI2 (PDB ID: 6C8U) and the RRM2 domain of MmMSI1 (PDB ID: 5X3Y), and models of the RRM domains of MSIL4 generated using the homology-modeling server SWISS-MODEL. The α-helices and β-sheets of the RRM domain as well as the phenylalanine residues that contact RNA bases are highlighted. b) Schematic representation of the mutations introduced in the RRM domains of MSIL4. c) Binding assays of MSIL4 RRM domains on ssRNA homopolymers in vitro. Single-stranded polyA (pA), polyG (pG), polyC (pC) or polyU (pU) RNAs were subjected to binding with the His-tagged recombinant RRM domains from MSIL4 (upper part) or its mutated version (MSIL4RRM, lower part). An anti-His antibody was used for detection. In. represents the input fraction. d) RRM-dependent RNA-binding activity is essential for MSIL2/4 function in stem. Photographs of representative inflorescence stems of Col-0 and msil2/4 mutants complemented or not with a WT version of MSIL4 (MSIL4G-8 and 3) or an RRM mutant (MSIL4GRRM-3 and 10).

MSIL2/4 protein interactomes are enriched in proteins involved in 3’-UTR binding and translational regulation

Animal MSIs modulate mRNA expression mostly at the translational level through binding to conserved motifs in the 3’-UTR of the target mRNA and further interactions with various mRNA-binding partners18, 25, 26. To further characterize the components of the MSIL2/4 network in Arabidopsis, we performed affinity purification coupled to LC-MS/MS, as described in Scheer et al.27, using the complemented MSIL2F1 and MSIL4F1 lines. Consistent with their functional redundancy, MSIL2F1 and MSIL4F1 exhibit a similar protein interaction network, with 27 proteins significantly enriched in both MS analyses (Fig. 3a, b; and Supplementary Table 1). Examination of the PANTHER database (http://www.pantherdb.org) indicated that MSIL2F1/4F1 are part of a protein-protein interaction network that is principally enriched in four categories of GO molecular functions, namely poly(A) binding protein (GO:0008143), mRNA 3’-UTR binding (GO:0003730), single-stranded RNA binding (GO:0003727), and translation initiation factor activity (GO:0003743), highlighting the relationship between MSIL2/4 and the mRNA-binding and regulatory proteome (Fig. 3c).

MSIL2/4 protein interactomes are enriched in proteins involved in 3’-UTR binding and translation regulation.

a) The semi-volcano plots show the enrichment of proteins co-purified with MSIL2F1 and MSIL4F1 as compared with control IPs. Y- and X-axes display adjusted p-values and fold changes, respectively. The dashed lines indicate the threshold above which proteins are significantly enriched/depleted (fold change > 2; adjP < 0.05). b) Venn diagram showing the overlap between the MSIL2 and the MSIL4 interactomes. c) PANTHER-based classification of the GO molecular functions that are overrepresented among the MSIL-interacting proteins. d) Confocal monitoring of the colocalization of MSIL2G, MSIL4G and PAB2R in root tips of 8-d-old seedlings after 30 min of exposure to 38°C (heat stress) or 20°C (control treatment). e) Cycloheximide treatment inhibits the formation of MSILG and PAB2R cytosolic foci. Transgenic plants expressing the stress granule marker PAB2R were used as a control. Roots of 7-day-old seedlings expressing either MSIL4G, MSIL2G, or PAB2R were monitored after 1 hour of exposure to 38°C in the presence of DMSO (control treatment) or cycloheximide (+CHX). Scale bar, 10 μm.

Many of the proteins present in the MSIL2/4 interactome, such as PABs, PUMs, and UBP1, are involved in translation regulation and are targeted to stress granules (SGs), membraneless cytoplasmic organelles that form as a consequence of the global inhibition of translation caused by heat stress28, 29. To assess the potential functional link between MSILs and those proteins, we crossed the MSIL2G/4G plants with a functional PAB2-RFP/PAB2R line29, and assessed the localization of the fusion proteins before and after heat stress in both parental and crossed lines. Fluorescence analysis revealed that MSIL2G/4G, like PAB2R, show a dispersed cytoplasmic signal under normal conditions, and are recruited to cytoplasmic granules that overlap with PAB2R-containing SGs upon heat stress (Fig. 3d; and Supplementary Fig. 6). Treatment with cycloheximide (CHX), a translation inhibitor that prevents SG assembly by trapping mRNAs at polysomes30, confirmed that those foci correspond to bona fide SGs (Fig. 3e). These results are consistent with the RNA-binding protein-based signature and the nature of the MSIL interactomes, further supporting a functional relationship between MSILs and the translation machinery in Arabidopsis.

MSIL2/4 proteins regulate the molecular architecture of SCW in Arabidopsis fibers

A pendant inflorescence stem is a common phenotypic trait of Arabidopsis mutants defective in SCW biosynthesis and results from a lack of mechanical support and rigidity6, 31. Given the current lack of knowledge of the role of post-transcriptional mechanisms in the control of SCW biosynthesis in plants, we henceforth focused our analysis on the pendant stem phenotype of the msil2/4 double mutant. A survey of the Arabidopsis inflorescence stem tissue-specific transcriptome database32 indicated that the MSIL2/4 genes are highly expressed in the SCW-forming xylary fiber (F) and xylem vessel (X) cells (Fig. 4a). Toluidine blue O staining and microscopic analysis of Col-0 and msil2/4 mutant stem cross-sections revealed no major changes in the architecture of either the interfascicular or the xylary fibers in the mutant stem. Notably, xylem cell deformation or collapse, which is characteristic of mutants deficient in SCW synthesis, was not observed in the msil2/4 mutant (Fig. 4b). However, examination of the cell wall thickness in stem cross-sections revealed that the interfascicular fiber cells display a significantly thinner SCW in the msil2/4 mutant (Fig. 4c). UV autofluorescence and phloroglucinol staining of the Col-0, msil2/4, msil2/4+MSIL2F1 and msil2/4+MSIL4F1 stem sections revealed a reduced lignin signal in the interfascicular fibers of the msil2/4 mutant compared to Col-0 or complemented plants (Fig. 4d; and Supplementary Fig. 7a). In agreement with these observations, a significant decrease of 30% in AcBr lignin content was observed in the msil2/4 mutant compared to the Col-0 control plants (Fig. 4e). Despite a tendency to decrease, the difference in lignin content was not significant in the complemented lines msil2/4+MSIL2F1 and msil2/4+MSIL4F1 compared to Col-0 (Fig. 4e).
To further investigate potential changes in lignin structure and composition, we evaluated the impact of MSIL2/4 deficiency on lignin composition by thioacidolysis, a degradation method that allows the relative amounts of H, G and S lignin units linked by β-O-4 linkages to be determined. Total monomer yield did not significantly differ between Col-0 and the msil2/4 mutant (Supplementary Fig. 7b), suggesting that the structure of the lignin polymer is not markedly affected in the mutant. Further examination of β-glucan composition using the fluorescent dye calcofluor white33 revealed an increased staining in the interfascicular fiber cells of the msil2/4 mutant, supporting changes in the composition/accessibility of cellulose (Fig. 4f; and Supplementary Fig. 7c).

MSIL proteins regulate the molecular architecture of SCW in Arabidopsis fibers.

a) Expression profiles of the MSIL2 and MSIL4 genes in the inflorescence stem tissues as retrieved from the Arabidopsis inflorescence stem tissue-specific transcriptome database40 (https://arabidopsis-stem.cos.uni-heidelberg.de/). The results of two replicates are shown. F, fibers; X, xylem vessels; Cx, cambium (xylem side); Cp, cambium (phloem side); P, phloem; S, starch sheath; E, epidermis. b) Cross-sections of Col-0 and msil2/4 stems showing vascular bundle and interfascicular fiber cells stained with Toluidine blue. c) Measurements of the SCW thickness of xylary fiber and interfascicular fiber cells. Values are means (n > 400) ± SEM. Data were analyzed by unpaired Student’s t-test. Asterisks indicate significant differences relative to Col-0; ****p < 0.0001. d) Top: lignin autofluorescence under ultraviolet (UV) using confocal microscopy in Col-0, msil2/4 mutant, and complemented msil2/4-MSIL2F1 and msil2/4-MSIL4F1 plant stem sections. xy: xylem fibers; if: interfascicular fibers. Bottom: Phloroglucinol-HCl staining of Col-0, msil2/4, and complemented msil2/4-MSIL2F1 and msil2/4-MSIL4F1 plant stem sections. Scale bar, 100 μm. e) Histograms represent acetyl bromide lignin content, expressed as % of dry weight. Lignin content was measured in the bottom section of mature stems (3 biological replicates per line). Significant differences between Col-0, msil2/4+MSIL2F1, msil2/4+MSIL4F1, and msil2/4 plant lines are indicated by asterisks; *P-value < 0.01 according to Student’s t-test (n = 6 to 10). f) Confocal microscopy of Calcofluor-white staining of cross-sections of Col-0 and msil2/4 mutant inflorescence stems. Scale bar, 100 μm. g) Glucose yield after incubation of Col-0 and msil2/4 lignocellulosic material with cellulases either without pre-treatment (-Tr) or after sodium hydroxide pre-treatment (+Tr).

To extend this analysis, we then subjected purified Col-0 and msil2/4 SCW materials to Fourier-Transform Infrared (FTIR) spectroscopy34. PLS-DA analyses performed on FTIR absorbance data clearly separated msil2/4 mutant lines from Col-0 plants using the first two components (explaining 74% and 10% of total variability, respectively) (Supplementary Fig. 7d, left panel). The large-scale chemotyping of cell wall (CW) composition in stem by FTIR spectroscopy suggested that not only lignin deposition but also polysaccharide composition was impaired in msil2/4 mutants (Supplementary Fig. 7d, right panel). The lowest absorbance values in msil2/4 lines were observed in the 1800-1500 cm-1 region, which is mostly associated with the absorbance of lignin-related compounds. In contrast, a higher absorbance was detected in msil2/4 samples within the 1050-900 cm-1 region, which is mostly associated with polysaccharide compounds34. Because the optimal interaction between the SCW lignin and polysaccharide polymers is believed to contribute to SCW recalcitrance, we used saccharification as a proxy to compare cell wall sugar accessibility between Col-0, msil2/4 and complemented mutant lines (Fig. 4g). A higher amount of reducing sugars was released from the native cell wall (without pre-treatment/-Tr) of msil2/4 samples compared to Col-0 samples. After cell wall loosening using alkali pre-treatment/+Tr, an even greater amount of sugars (+5%) was released upon enzymatic digestion of the double mutant CW (Fig. 4g). Together, our data indicate that MSIL2/4 regulate the molecular architecture of the SCW in the interfascicular fibers, contributing significantly to the setting of biomass recalcitrance in Arabidopsis.

The accumulation and activity of the glucuronoxylan decoration machinery are mis-regulated in msil2/4 mutant

To elucidate the impact of MSIL2/4 on the expression of SCW-related biosynthesis genes in the Arabidopsis inflorescence stem, we performed mRNA sequencing (mRNA-seq) on both wild-type and msil2/4 mutant backgrounds. In total, we found 234 genes to be significantly differentially expressed (DEGs) in the msil2/4 mutant background (stringent thresholds: logarithm of fold change >1.5 or <-1.5 and false discovery rate (FDR) ≤ 0.05), including 156 up-regulated and 78 down-regulated mRNAs (Supplementary Fig. 8a; and Supplementary Table 2). Interestingly, genes encoding regulatory or enzymatic components of the SCW biosynthesis pathway were absent from the DEG list, and their expression was not significantly changed in the msil2/4 mutant (Supplementary Fig. 8a). Gene Ontology analysis of DEGs revealed significantly enriched signaling pathways (FDR ≤ 0.05) involved in plant defense or responses to biotic and abiotic stresses (Supplementary Fig. 8b), encoding products that are preferentially targeted to the extracellular region and the cell wall (Supplementary Fig. 8c). In this regard, the constitutive activation of defense pathways has been well documented in mutant plants defective in cell wall formation, including the SCW35.
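The thresholding step described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this work: the record type `GeneResult`, the function `classify_deg`, and the example gene identifiers and values are hypothetical, and the fold change is assumed to be expressed as log2 because the base of the logarithm is not stated in the text.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class GeneResult:
    """One row of a differential expression table (hypothetical layout)."""
    gene_id: str
    log2_fc: float  # assumed log2 fold change, msil2/4 versus Col-0
    fdr: float      # false discovery rate (adjusted p-value)


def classify_deg(results: List[GeneResult],
                 lfc_cutoff: float = 1.5,
                 fdr_cutoff: float = 0.05) -> Tuple[List[str], List[str]]:
    """Split genes into up- and down-regulated sets.

    A gene is retained when FDR <= fdr_cutoff and the log fold change is
    > lfc_cutoff (up) or < -lfc_cutoff (down), mirroring the thresholds
    quoted in the text.
    """
    up = [r.gene_id for r in results
          if r.fdr <= fdr_cutoff and r.log2_fc > lfc_cutoff]
    down = [r.gene_id for r in results
            if r.fdr <= fdr_cutoff and r.log2_fc < -lfc_cutoff]
    return up, down


# Made-up values for illustration only; these are not data from this study.
example = [
    GeneResult("geneA", 2.1, 0.01),    # passes both filters, up
    GeneResult("geneB", -1.8, 0.03),   # passes both filters, down
    GeneResult("geneC", 3.0, 0.20),    # large change but FDR too high
    GeneResult("geneD", 0.4, 0.001),   # significant but change too small
]
up_genes, down_genes = classify_deg(example)
print(len(up_genes), "up-regulated;", len(down_genes), "down-regulated")
```

Requiring both filters at once is what excludes genes with a large but statistically unsupported change, as well as genes with a well-supported but small change.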

The absence of SCW biosynthesis genes in the DEG list suggests that MSIL2/4 could regulate SCW formation at a post-transcriptional/translational level, a hallmark of MSI function in animal models18, 19. To investigate proteome-level changes that could potentially occur upon MSIL2/4 mutations, we performed a comparative proteomic analysis of Col-0 and msil2/4 inflorescence stems using MS-based quantitative proteomics, and represented ≈4300 reliably identified and quantified proteins in a volcano plot according to their statistical p-value and their relative difference in abundance. This analysis revealed 267 proteins with significantly different abundances in msil2/4 mutant versus Col-0 inflorescence stems (fold change ≥2, Benjamini-Hochberg FDR < 1%), including 156 up-regulated and 111 down-regulated proteins (Fig. 5a; and Supplementary Table 3). While no specific GO terms were enriched in the set of down-regulated proteins, inspection of the up-regulated proteins revealed a strong enrichment in enzymes involved in the glucuronoxylan biosynthetic pathway (Fig. 5a, b; and Supplementary Table 3). Notably, the up-regulated glucuronoxylan-related proteins included glycosyltransferases required for xylan backbone synthesis (IRX9/10/14), as well as enzymes involved in its substitution (ESK1, GUX1, and GXM3)2 (Fig. 5a, c). Visual inspection of the RNA-seq aligned reads of the up-regulated glucuronoxylan-related genes using the Integrative Genomics Viewer (IGV) and quantitative RT-PCR analysis of the glucuronoxylan-related decoration genes confirmed that the observed proteomic changes in msil2/4 were not associated with significant variations in mRNA levels (Supplementary Fig. 9a, b).
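For readers unfamiliar with the Benjamini-Hochberg (BH) correction mentioned above, the following Python sketch shows the standard step-up adjustment of a list of p-values. It is a generic illustration under stated assumptions, not the exact procedure applied to the proteomic dataset: the function name `benjamini_hochberg` and the example p-values are hypothetical.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return BH-adjusted p-values in the same order as the input.

    For the p-value of rank k (1 = smallest) among m tests, the raw
    adjustment is p * m / k. Walking from the largest p-value down to the
    smallest while keeping a running minimum makes the adjusted values
    monotone and caps them at 1.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values for illustration only.
raw_p = [0.01, 0.04, 0.03, 0.005]
for p, q in zip(raw_p, benjamini_hochberg(raw_p)):
    print(f"p = {p:.3f} -> BH-adjusted = {q:.3f}")
```

A protein would then be called differentially abundant when its adjusted p-value falls below the chosen FDR and its fold change exceeds the chosen cutoff.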

The accumulation and activity of the glucuronoxylan decoration machinery are altered in msil2/4 mutant.

a) MS-based quantitative comparison of Col-0 and msil2/4 inflorescence stem proteomes. Volcano plot displaying the differential abundance of proteins in Col-0 and msil2/4 cells analyzed by MS-based label-free quantitative proteomics. The volcano plot represents the -log10 (limma p-value) on the y axis plotted against the log2 (Fold Change msil2/4 vs Col-0) on the x axis for each quantified protein. Green and red dots represent proteins found significantly enriched in msil2/4 and Col-0 Arabidopsis stems, respectively (log2(Fold Change) ≥ 1 and -log10(p-value) ≥ 2.29, leading to a Benjamini-Hochberg FDR = 1.01 %). The up-regulated proteins involved in the glucuronoxylan biosynthetic pathway are indicated. b) Gene Ontology (GO) analysis of the proteins that are significantly up-regulated in the msil2/4 mutant. The gene ontology analysis of DEGs was performed using ShinyGO v0.76 software. Lollipop diagrams provide information about GO fold enrichment, significance (FDR in log10), and number of genes in each pathway. c) Schematic model of glucuronoxylan substitution patterns in Arabidopsis and the enzymes involved. IRX9/10/14, glycosyltransferases involved in the synthesis of the xylan (Xy) backbone. ESK1, eskimo1; GUX1, glucuronic acid substitution of xylan1; and GXM3, glucuronoxylan methyltransferase3, are involved in glucuronoxylan decoration. d) Neutral monosaccharide composition of alcohol insoluble residue (AIR) extracted from inflorescence stems of wild-type and msil2/4 mutant plants that have been pre-hydrolyzed or not with acid. Rha, rhamnose; Fuc, fucose; Ara, arabinose; Xyl, xylose; Man, mannose; Gal, galactose; Glc, glucose. e) Left: Immunofluorescence labeling of transverse sections of Col-0 and msil2/4 stems with the LM10 and LM11 antixylan antibodies. Xy, xylem fibers; if, interfascicular fibers. Right: Quantification of the fluorescence was done using ImageJ software and processed according to Supplementary Table 4.

The outcomes of our histological and proteomic analyses suggest potential variations in polysaccharide content and/or composition in the SCW of the msil2/4 mutant. To address this point, we first performed glycome profiling on either untreated (-Tr) or H2SO4-treated (+Tr) stem samples. The levels of glucose (Glc) and xylose (Xyl), which are the building blocks of cellulose and xylan polysaccharides in the SCW, were not substantially affected in the msil2/4 mutant (Fig. 5d). Similarly, no significant changes were observed in the levels of the primary cell wall-related monosaccharides galactose, rhamnose, arabinose, mannose, fucose and galacturonic acid (Fig. 5d), indicating that the MSIL2/4 mutations did not significantly alter cellulose or glucuronoxylan levels in the SCW.

To assess possible changes in glucuronoxylan decoration in msil2/4 mutant, we performed cytochemical analysis on Col-0 and msil2/4 stem sections using two antixylan antibodies (LM10 and LM11), whose binding specificities depend on the degree of glucuronoxylan substitution36. LM11, which binds with similar affinity both unsubstituted and substituted glucuronoxylan, shows a similar signal in the interfascicular and xylary fibers of Col-0 and msil2/4 mutant stems, confirming that the glucuronoxylan content was not significantly modified in msil2/4 compared to Col-0 (Fig. 5e). In contrast, LM10, which binds less efficiently the substituted glucuronoxylan36, shows a significant signal decrease in the interfascicular fibers of msil2/4 mutant (Fig. 5e; and Supplementary Table 4). Together, our data indicate that the MSIL2/4 mutations impact the accumulation and activity of the glucuronoxylan decoration machinery in the interfascicular fiber cells of Arabidopsis stem.

MSILs restrain the degree of 4-O-methylation of glucuronoxylan in Arabidopsis

In Arabidopsis, the glucuronoxylan backbone is decorated predominantly by acetyl, GlcA, and MeGlcA groups that are deposited in a regular and controlled manner by a specific enzymatic machinery2, 37, 38. Interestingly, the ESK1, GUX1, and GXM3 enzymes, which are involved in the deposition of the acetyl and GlcA groups and in the methylation of the GlcA group, respectively, are over-accumulated in the msil2/4 stem proteome, providing a possible explanation for the modification in glucuronoxylan decoration observed in the interfascicular fiber cells of the msil2/4 mutant. To further investigate the variations in glucuronoxylan decoration caused by the msil2/4 mutation, we analyzed the patterns of xylanase-released SCW oligosaccharides using matrix-assisted laser desorption ionization-time-of-flight mass spectrometry (MALDI-TOF)39. Consistent with previous observations40, 41, MALDI-TOF analysis of Col-0 glucuronoxylan oligosaccharides showed the release of xylo-oligosaccharide pairs, evenly spaced by their degrees of polymerization and acetylation, bearing a GlcA (m/z 1705.5/1747.5) or methylated GlcA (m/z 1719.5/1761.5) substitution group (Fig. 6a). Interestingly, the level of GlcA-branched xylo-oligomers was strongly decreased in the msil2/4 double mutant, regardless of the complexity of the oligomers considered (Fig. 6a; and Supplementary Fig. 10a). Importantly, the molecular phenotypes observed in the msil2/4 mutant were rescued in both msil2/4+MSIL2F1 and msil2/4+MSIL4F1 complemented lines, confirming the specificity of the mutation (Fig. 6a). In agreement with the MALDI-TOF outcome, quantification of the acidic sugar levels in untreated (-Tr) or H2SO4-treated (+Tr) WT and msil2/4 stem samples confirmed that the release of the GlcA moiety was strongly reduced in the msil2/4 double mutant, as opposed to that of the galacturonic acid/GalA group located on pectins (Fig. 6b).
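The spacing between the four ions quoted above can be checked with simple arithmetic: one additional acetyl group adds C2H2O (about 42.01 Da) and 4-O-methylation of GlcA adds CH2 (about 14.02 Da). The short Python sketch below performs this check on the m/z values reported in the text; the species labels follow the assignments given in the legend of Fig. 6a, whereas the helper names and the 0.1 Da tolerance are assumptions chosen for illustration.

```python
# Mass shifts expected for one added group (monoisotopic, rounded).
ACETYL_SHIFT = 42.011  # C2H2O added per O-acetyl group
METHYL_SHIFT = 14.016  # CH2 added by 4-O-methylation of GlcA

# m/z values reported in the text, labeled as in the Fig. 6a legend.
PEAKS = {
    "GlcA-Xyl10-Ac4": 1705.5,
    "GlcA-Xyl10-Ac5": 1747.5,
    "MeGlcA-Xyl10-Ac4": 1719.5,
    "MeGlcA-Xyl10-Ac5": 1761.5,
}


def mz_shift(lighter: str, heavier: str) -> float:
    """Difference in m/z between two labeled ions."""
    return PEAKS[heavier] - PEAKS[lighter]


def matches(observed: float, expected: float, tol: float = 0.1) -> bool:
    """True when an observed shift agrees with the expected one.

    The tolerance is an assumed value reflecting the one-decimal
    precision of the reported m/z values.
    """
    return abs(observed - expected) <= tol


# Within each pair, the two ions differ by one acetyl group.
print(matches(mz_shift("GlcA-Xyl10-Ac4", "GlcA-Xyl10-Ac5"), ACETYL_SHIFT))
print(matches(mz_shift("MeGlcA-Xyl10-Ac4", "MeGlcA-Xyl10-Ac5"), ACETYL_SHIFT))
# Between pairs, the ions differ by one methyl group on GlcA.
print(matches(mz_shift("GlcA-Xyl10-Ac4", "MeGlcA-Xyl10-Ac4"), METHYL_SHIFT))
print(matches(mz_shift("GlcA-Xyl10-Ac5", "MeGlcA-Xyl10-Ac5"), METHYL_SHIFT))
```

The 14 Da offset between the two pairs is what allows GlcA- and MeGlcA-bearing oligosaccharides of the same length and acetylation state to be distinguished in the spectra.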

MSILs restrain the degree of 4-O-methylation of glucuronoxylan in Arabidopsis inflorescence stem.

a) MALDI-TOF mass spectra of xylooligosaccharides generated by xylanase digestion of xylan from Col-0, msil2/4, and complemented msil2/4-MSIL2F1 and msil2/4-MSIL4F1 inflorescence stem materials. The ions at m/z 1705/1747 and 1719/1761 correspond to acetylated xylo-decapolysaccharides bearing a GlcA residue (GlcA-Xyl10-Ac4//GlcA-Xyl10-Ac5) or a methylated GlcA residue (MeGlcA-Xyl10-Ac4//MeGlcA-Xyl10-Ac5). b) Acidic monosaccharide composition of alcohol insoluble residue (AIR) extracted from inflorescence stems of wild-type and msil2/4 mutant plants that have been pre-hydrolyzed or not with acid. GalA, galacturonic acid; GlcA, glucuronic acid. c) Relative changes in unmethylated/methylated GlcA decapolysaccharide ratio in Col-0 (black) and msil2/4 mutant (red) as controlled by the addition of an external spike-in control corresponding to pentaacetyl-chitopentaose. d) GFP-based RNA-IP assays. Top panel (WB): western blots performed using anti-GFP or anti-UGPase antibodies on protein fractions from inputs or anti-GFP immunoprecipitates from WT and the msil2/4 mutant plants expressing the WT (MSIL4G-3) or RRM mutant (MSIL4GRRM-3) MSIL4-GFP fusions. Bottom panel (RT): corresponding RT-PCR using GXM1- or GXM3-specific primers on RNA fractions.

It has been previously proposed that the intermediate level of glucuronoxylan methylation observed in Arabidopsis is due to an intrinsic rate-limiting activity of GXM enzymes in vivo42. We noticed that, in addition to GXM3, a second member of the GXM family, GXM142, also tends to be over-expressed in the msil2/4 stem, although at a fold change of 1.77, slightly below the fixed significance threshold in the proteomic analysis (Fig. 5a; and Supplementary Table 3). These observations led us to ask whether the apparent decrease of GlcA-branched xylooligosaccharides in msil2/4 could be due to an overmethylation of the GlcA substituent. To address this question, we performed the xylanase digestion and MALDI-TOF analysis in the presence of an added spike-in oligosaccharide control (pentaacetyl-chitopentaose) for normalization of the MALDI-TOF data (Supplementary Fig. 10a). Upon normalization, the MALDI-TOF data indicated that the decrease of the peaks corresponding to the GlcA-branched xylooligosaccharides was correlated with an increase in the corresponding MeGlcA-branched xylooligosaccharides (Fig. 6c; Supplementary Fig. 10b; and Supplementary Table 5), suggesting that the MSIL2/4 mutations unleash the activity of GXM1/3 in the stem. RNA immunoprecipitation assays further confirmed that the GXM3 and GXM1 mRNAs are efficiently pulled down from MSIL4G-3, but not MSIL4GRRM-3, plant extracts (Fig. 6d), suggesting that the MSIL2/4 interaction network controls the 4-O-methylation of glucuronoxylan by repressing the activity of the GXM1/3 mRNAs in the stem.
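The spike-in normalization described above can be illustrated with a minimal Python sketch. It assumes that each peak intensity is simply divided by the intensity of the spike-in peak recorded in the same spectrum; the function names and all intensity values are hypothetical and are not measurements from this study.

```python
def normalize_to_spike_in(peaks, spike_in_key="spike_in"):
    """Divide each peak intensity by the spike-in intensity of the same spectrum."""
    reference = peaks[spike_in_key]
    if reference <= 0:
        raise ValueError("spike-in intensity must be positive")
    return {
        name: intensity / reference
        for name, intensity in peaks.items()
        if name != spike_in_key
    }


def glca_to_meglca_ratio(normalized):
    """Ratio of unmethylated to methylated GlcA-branched oligosaccharide signal."""
    return normalized["GlcA-Xyl10-Ac4"] / normalized["MeGlcA-Xyl10-Ac4"]


# Hypothetical intensities (arbitrary units), for illustration only.
col0 = {"spike_in": 1000.0, "GlcA-Xyl10-Ac4": 400.0, "MeGlcA-Xyl10-Ac4": 600.0}
mutant = {"spike_in": 500.0, "GlcA-Xyl10-Ac4": 50.0, "MeGlcA-Xyl10-Ac4": 450.0}

col0_norm = normalize_to_spike_in(col0)
mutant_norm = normalize_to_spike_in(mutant)

print(glca_to_meglca_ratio(col0_norm), glca_to_meglca_ratio(mutant_norm))
```

Dividing by the spike-in signal makes spectra acquired with different overall ion yields comparable, so that a drop in the GlcA peak can be distinguished from a general loss of signal.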

Discussion

RNA-binding proteins (RBPs) are essential components of the gene regulation machinery that govern the fate and expression of cellular RNA at post-transcriptional levels, including processing, splicing, base modification, translation and degradation13. Sequence-based bioinformatic, reverse and forward genetic, and affinity-based proteomic analyses have converged to show that plants, like animals and fungi, harbor a large diversity of RBPs whose functions in gene regulation and plant development are ill-defined14. In this study, we have identified an RRM motif-containing RBP family, hereafter named MSIL, whose members share sequence, structural and functional similarities with the animal translational regulator Musashi/MSI. We found that two of the four MSIL members, MSIL2 and MSIL4, function redundantly to control various aspects of Arabidopsis development, including leaf senescence, morphology of rosette leaves, and the rigidity of the inflorescence stem. We provide evidence that the RRM domains of the MSIL2/4 proteins exhibit RNA-binding ability and are essential for the activity of these proteins in vivo, an observation that is corroborated by the presence of the MSIL2/4 proteins in an experimentally determined Arabidopsis mRNA-binding proteome14. Further supporting a role for MSIL2/4 in mRNA-related transactions, affinity purification-mass spectrometry analysis revealed that these proteins share a common network of protein-protein interactions with well-known mRNA-binding proteins, including PAB2/4/8, RBP45/47, and the translation factor EIF4G. Notably, the animal polyA-binding protein/PABP (homolog of the plant PABs) was previously shown to be a functional partner of MSI that is essential for its activity in translation repression18, 26. Interestingly, the MSIL2/4-associated proteins also include the phylogenetically related RBGD2/4 proteins, two constitutively expressed RBPs that are involved in heat resistance in Arabidopsis20.
The comparison of the MSIL2/4 and RBGD2/4 interactomes indicates only a partial overlap, suggesting a certain level of subfunctionalization. In accordance with this observation, the MSILs and RBGDs clearly diversify in function, since the rbgd2/4 double mutant exhibits no developmental phenotypes under normal growth conditions.

A penetrant developmental phenotype associated with the MSIL2/4 mutations is a loss in stem rigidity that we could trace back to a defect in the formation of the SCW, an important reservoir of fixed carbon and a renewable resource of major environmental and economic importance1. Yet, with the exception of a few microRNAs involved in the control of lignin polymer synthesis6, 15, 16, no information about RBP-mediated post-transcriptional regulation of SCW biosynthetic genes was previously reported. In this study, we show that MSIL2 and MSIL4 function redundantly to promote SCW formation in the interfascicular fibers and that their ability to bind RNA is essential to fulfill this function. Our conclusions are supported by independent approaches which highlight changes in SCW thickness, lignin content, and xylan decoration profile in the interfascicular fiber cells of the msil2/4 mutant. The fiber-specific effect of the MSIL2/4 mutations on SCW formation is further supported by our evidence showing that the msil2/4 mutant grows normally and lacks the xylem collapse usually observed in more pleiotropic SCW-deficient mutants. Previous studies have shown that specific master transcriptional regulators, such as NST1 and NST3, control cell-lineage-specific SCW formation in Arabidopsis5, 31. Our study extends this observation by showing that RBP-mediated post-transcriptional mechanisms are also involved in cell-type-specific control of SCW formation in Arabidopsis. Notably, the presumed role of MSIL2/4 in interfascicular fiber cells contrasts with the rather constitutive expression of the MSIL2/4 genes, suggesting that their activity depends on the cellular context. This is in accordance with the reported cell-type-specific activities of the ubiquitous animal MSI2 in cell renewal, cell differentiation and cancer control18, 19.
Understanding how the activity of the MSIL2/4 genes is controlled by the cellular context will be essential from a basic point of view, providing a unique model for studying RBP regulation in specific cell type in plants.

Our data indicate that the accumulation of the transcripts encoding SCW biosynthesis pathway enzymes and regulators is unaffected in msil2/4, suggesting that MSIL2/4 proteins are likely to act at the translational level to regulate SCW formation in interfascicular fiber cells, an observation that is supported by the nature of the MSIL2/4 interactome. In this regard, the stem proteomics gives us a glimpse into the changes in protein abundance that could account for the observed msil2/4 SCW phenotype. Indeed, our MS data reveal that the glucuronoxylan biosynthesis machinery, including enzymes involved both in the synthesis and substitution of the xylan backbone, is specifically over-accumulated in the msil2/4 double mutant. The molecular specificity of the observed glucuronoxylan phenotype with respect to the changes in glucuronoxylan enzymatic machinery content raises several interesting questions about the impact of the MSIL2/4 mutations on SCW formation. First, we noticed that the outcome of our proteomic data contrasts with the observed normal level of glucuronoxylan deposition in the SCW of the interfascicular fibers, suggesting that independent regulatory mechanisms act downstream of MSIL2/4 to control the final glucuronoxylan content in the SCW. One possible explanation for this observation could be that the activities of the GUX, ESK, and IRX glucuronoxylan biosynthesis enzymes, in contrast to that of GXM3, are not rate-limiting for glucuronoxylan synthesis in vivo. Alternatively, and in agreement with studies that have reported that glucuronoxylan and cellulose contents are correlated in vivo in xylan-deficient mutants43, one could imagine that cellulose is the primary determinant of the glucuronoxylan content in the SCW, and that the glucuronoxylan produced in excess in the Golgi compartment of the msil2/4 mutant is targeted for degradation upon secretion if unbound to cellulose.
In contrast, the concomitance between the glucuronoxylan hypermethylation and the GXM3 over-accumulation phenotypes observed in msil2/4 clearly supports the previous assumption that GXM-mediated methylation is a rate-limiting step in glucuronoxylan decoration that can be overcome by GXM protein overexpression42. Our data support the idea that MSIL2/4 play a specific role in the developmental control of GXM protein accumulation/activity, actively restraining the level of glucuronoxylan methylation in the Arabidopsis interfascicular fiber cells. We currently posit that MSIL2/4 could control the level of the GXM enzymes by directly regulating the translation of the GXM mRNAs or by indirectly regulating the translation of a yet unknown translational regulator of the glucuronoxylan biosynthesis pathway (XTRe) (Fig. 7). Future biochemical and functional studies will be necessary to precisely understand the mechanism by which MSIL2/4 proteins impact gene expression during SCW synthesis in Arabidopsis.

Model of MSIL-dependent control of glucuronoxylan methylation in Arabidopsis and its consequence for SCW architecture.

In the interfascicular fiber cells of Col-0, MSIL2/4 restrain the translation of the glucuronoxylan biosynthesis machinery, including the rate-limiting GXM3 enzyme. This activity would keep the level of glucuronoxylan methylation at an intermediate level, therefore providing a biochemical environment that favors the interactions between the glucuronoxylan and lignin polymers. In the interfascicular fiber cells of the msil2/4 mutant, the translation of the glucuronoxylan biosynthesis machinery, including GXM3, is increased, leading to the deposition of an over-methylated form of glucuronoxylan that has detrimental effects on glucuronoxylan-lignin interaction and SCW formation. In a non-exclusive manner, MSIL2/4 could also have a positive role in lignin synthesis, whose defect in msil2/4 would lead to a decrease in lignin content. The model was inspired by Grantham et al37.

Our data indicate that, in addition to the reported change in glucuronoxylan methylation, the MSIL2/4 mutations are also associated with a specific decrease of lignin content in the SCW of the interfascicular fiber cells, whose origin remains unclear. On the basis of the previous observations, we envision two possible mechanisms by which the MSIL2/4 mutations could impact lignin deposition in the fiber cells. In the first model, we propose that MSIL2/4 positively control the lignin synthesis pathway in the interfascicular cells by promoting the translation of lignin biosynthesis genes or that of a putative lignin translational activator (LTAc) (Fig. 7). In this regard, although no GO enrichment could be observed for lignin biosynthesis-related genes in the proteomic analysis, two lignin biosynthetic enzymes, CAD4 and PAL2, are significantly down-accumulated in the msil2/4 mutant (Supplementary Table 3), an observation that could account for the observed decrease in lignin content. However, previously published biochemical and genetic analyses do not support this conclusion, as the single cad4 and pal2 knock-out mutants display no significant alteration of lignin content due to genetic redundancy9, 44. Moreover, thioacidolysis analyses showed that the ratio and degree of polymerization of the different lignin monomers remain unchanged in the msil2/4 mutant, reinforcing the idea that the lignin biosynthesis pathway is functional in this mutant and leaving open the question of the specificity of MSIL2/4 action on the lignin pathway (Fig. 7). An alternative explanation for the decreased lignin phenotype observed in the msil2/4 mutant would be that it results from an indirect cascade effect primarily triggered by the msil2/4-dependent changes in glucuronoxylan methylation.
In this regard, it has been proposed that the 4-O-methylation of the GlcA substituent results in the xylan being more hydrophobic, possibly impacting its non-covalent interaction with lignin and the capacity of lignin to assemble in the SCW41, 45, 46. Meanwhile, the hypermethylation of the carbon-4 hydroxyl of the GlcA side chains could affect the capacity of glucuronoxylan to form ether linkages to lignin (Fig. 7). Indeed, a study recently reported the existence of an ether linkage between a carbon-6 hydroxyl of a mannosyl residue of a galactoglucomannan hemicellulose and lignin in pine wood47. Although xylan has been proposed to bind covalently to lignin via ether bonds involving xylose or arabinose residues, it remains unclear to date whether glucuronoxylan can form GlcA-dependent ether linkages with lignin, and the functional relevance of such linkages remains to be demonstrated48.

Methods

Plant material

Arabidopsis thaliana Col-0 and the mutants msil1-1 (GABI_462A04), msil2-1 (Salk_066670), msil3-1 (Salk_002477) and msil4-1 (Salk_094167), obtained from the SALK Institute genomic analysis laboratory, were used in this study. The PAB2-RFP/PAB2R line was provided by Cécile Bousquet-Antonelli (LGDP, Université Perpignan). The primers used to genotype lines are listed in Supplementary Table 4. Double mutants and quadruple mutants were generated by crossing plants. Plants were either grown in soil or cultivated in vitro on plates containing 2.20 g/l synthetic Murashige and Skoog (MS) (Duchefa) medium, 0.5 g/l MES, pH 5.7, and 7 g/l agar. To break dormancy, seeds were incubated for 48 h at 6°C in the dark. Germination and development were performed in growth chambers at 20°C, 60–75% hygrometry, with a 16-h light/8-h dark photoperiod (100 μE m−2 s−1 light [fluorescent bulbs with white 6500K spectrum, purchased from Sylvania]). For in vitro culture, the seeds were surface-sterilized before being sown on plates, incubated for 48 h at 6°C in the dark, and placed in a growth cabinet at 20°C with a 16-h light/8-h dark cycle and 120 μE m−2 s−1 light (LEDs with white 4500K spectrum, purchased from Vegeled).

Transgenic lines construction

MSIL2F- and MSIL4F-tagged versions were produced from Phusion-generated genomic DNA PCR products (promoter and coding region) using primers TL976(SalI)-TL924(PstI) for MSIL2F and TL974(SalI)-TL975(BamHI) for MSIL4F. PCR products were cloned in the CTL235 binary vector containing a Flag-HA tag in front of a NosT terminator and the hygromycin resistance gene driven by a 35S promoter. To obtain the MSIL2G and MSIL4G constructs used for fluorescence microscopy, the same PCR products were cloned in the CTL579 binary vector in front of an EGFP tag. The obtained plasmids were then used to transform mgl2/4 plants via Agrobacterium transformation. The MSIL4-GFP/MSIL4G-tagged version used for complementation and RIP corresponds to the fusion of a genomic PCR product containing the MSIL4 promoter (primers TL3527(HindIII)-TL3528(SalI)) with a second PCR cDNA fragment (primers TL3529(SalI)-TL3530(BamHI)), cloned initially in pGEM T Easy (Promega). After sequencing, the fusion DNA was cloned in the binary vector CTL579 containing the GFP cDNA sequence. The MSIL4G-tagged construct was finally introduced in msil2/4 plants via Agrobacterium transformation. The RRM-mutated MSIL4GΔr version was produced as MSIL4G, except that the cDNA fragment was gene-synthesized (Genecust) to modify RNA-binding ability by replacing phenylalanine with aspartic acid at positions 9, 49 and 51 in the RRM1 domain and at positions 111, 151 and 153 in the RRM2 domain. Transgenic seed selection was performed in the presence of 30 μg/l hygromycin and the progenies were screened by PCR.

Protein extraction and immunodetection

Total plant extracts (up to 100 mg) were ground in liquid nitrogen and proteins were extracted according to Hurkman and Tanaka49. Before migration, SDS-PAGE loading buffer was added and Coomassie staining was used to calibrate loadings. Proteins were separated on SDS-PAGE gels and blotted onto Immobilon-P PVDF membrane (Cat. No. IPVH00010; MerckMillipore). Protein blot analysis was performed using the Immobilon Western Chemiluminescent HRP Substrate (Cat. No. WBKLS0500; MerckMillipore). Specific MSIL antibodies were raised in rabbits (Eurogentec) against defined epitopes: MSIL1 (SCDGTSSTFGYNRIPS), MSIL2 (RLQEYFGKYGDLVE), MSIL3 (GYGVKPEVRYSPAVGN) and MSIL4 (TWRSPTPETEGPAPFS).

Protein production and purification

The MSIL4 RRM binding domains were amplified using primers TL4128 and TL4129, either from the MSIL4G construct for the wild-type version or from MSIL4GΔr for the mutated version. PCR products were then cloned in pET41 to obtain His-tagged RRM domains. Tagged proteins were produced by induction in Escherichia coli BL21 cells. Cultures of 200 mL of an ampicillin-resistant (100 mg mL–1) colony were grown at 37°C and induced by 100 mM isopropyl-β-D-galactopyranoside in the exponential phase (optical density 0.5 at 500 nm). After induction, bacteria were harvested by centrifugation and pellets were resuspended in 5 ml of binding buffer (His-bind buffer kit, Millipore). Proteins were extracted using a Constant cell disrupter system (Constant Systems) with a disruption pressure of 2.35 kbar and then purified using the His-bind buffer kit (Millipore) following the manufacturer’s recommendations.

RNA binding assays

Binding assays were performed using homopolymer RNA conjugated to CNBr-activated Sepharose beads (Sigma C9142). Poly(A), poly(G), poly(C) or poly(U) from Merck (P9403, P4903, P4404 and P9528, respectively) were used following Sigma’s recommendations. Between 0.7 and 1 mg of RNA was used for 200 mg of Sepharose beads. Purified protein (0.45 mg; MSIL4 RRM binding domain or mutated version) in 0.5 ml of binding buffer (100 mM NaCl, 10 mM Tris-HCl, pH 7.5, 2.5 mM MgCl2, 2.5% Triton X-100) was incubated with 5 mg of RNA-conjugated Sepharose beads for 1 hour. After incubation, beads were washed 5 times for 10 min each in 0.5 ml of binding buffer. Beads were then pelleted at 12,000 rpm, resuspended in SDS sample buffer, boiled for 20 min, and pelleted again. Supernatants were transferred to fresh tubes and loaded on a 12% SDS-PAGE minigel for electrophoresis (Bio-Rad) and protein blot analysis.

RNA immunoprecipitation

RIP was performed as described previously in Merret et al.50 using 400 mg of stem powder as starting material.

Expression analysis

Total RNA was isolated using the TRI Reagent (Cat. No. TR-118; Euromedex) according to the manufacturer’s instructions. Genomic DNA was then digested using the RQ1 RNase-Free DNase (Cat. No. M6101; Promega). cDNAs were obtained from 400 ng of RQ1-treated RNA using 1 μl of GoScript Reverse Transcriptase (Cat. No. A5003; Promega) in a 20-μl final reaction volume using random primers (Cat. No. C1181; Promega) in the presence of 20 units of RNasin (Recombinant Ribonuclease Inhibitor, Cat. No. C2511; Promega) and dNTP mix at a final concentration of 0.5 mM of each dNTP (Cat. No. U1511; Promega). Semi-quantitative RT–PCR amplifications were initially performed on 1 μl of cDNA in a 12.5-μl reaction volume. The amplification of EF1α transcripts was used to equilibrate. qRT–PCR was performed on a Light Cycler 480 II machine (Roche Diagnostics) using the Takyon No ROX SYBR MasterMix blue dTTP kit (Cat. No. UF-NSMT-B0701; Eurogentec). Each amplification reaction was set up in a 10-μl reaction containing each primer at 300 nM and 1 μl of RT template in the case of total RNA, with a thermal profile of 95°C for 10 min and 40 amplification cycles of 95°C, 15 s; 60°C, 60 s. Relative transcript accumulation was calculated using the ΔΔCt methodology51, using ACTIN2 as internal control. Average ΔΔCt represents three experimental replicates with standard errors. PolyA+ RNAs were purified from RNA obtained by the TRI Reagent extraction, using the PolyATtract mRNA Isolation System III (Cat. No. Z5310; Promega) according to the manufacturer’s instructions. cDNAs were then synthesized according to the above protocol using 50 ng of polyA+ RNA as starting material. Semi-quantitative and quantitative RT–PCRs were performed as described above with a 1/25 dilution of cDNAs from polyA+ RNA as template for qRT–PCR. The primers used for RT and qRT-PCR are listed in Supplementary Table 6.
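As an illustration of the ΔΔCt calculation mentioned above, the following Python sketch computes a relative transcript level from threshold cycles. It assumes the common 2^(−ΔΔCt) form, which presumes a primer efficiency close to 100%; the function name and the Ct values are hypothetical and do not come from this study.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Relative transcript level of a target gene by the 2^(-ddCt) method.

    ct_ref_* are the threshold cycles of the internal control gene
    (ACTIN2 in the protocol above) in the same cDNA sample.
    """
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target crosses threshold two cycles earlier
# in the mutant than in the wild type, with an unchanged reference gene.
fold = relative_expression(
    ct_target_sample=22.0, ct_ref_sample=18.0,    # mutant
    ct_target_control=24.0, ct_ref_control=18.0,  # wild type
)
print(fold)
```

A difference of two cycles corresponds to a fourfold difference in template under the assumed doubling per cycle.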

RNA Sequencing

Total RNA was extracted from the basal stem section of mature Col-0 and msil2/4 mutant plants using the RNeasy Plant Mini Kit (Qiagen, Cat.#74904) and treated on-column with the RNase-Free DNase Set (Qiagen, Cat.#79254). Two replicates were performed per plant line. PolyA purification, library preparation using the Illumina TruSeq stranded mRNA kit, and library quality controls were performed by the BioEnvironment Illumina Platform. The sequencing of Col-0 and msil2/4 mRNAs was done with single reads (1 × 75 bp) and 30–35 × 10^6 reads per library.

RNASeq analysis

For each library, 30–35 × 10^6 reads were obtained, with 85–90% of the bases displaying a Q-score ≥ 30 and a mean Q-score of 38, as assessed with FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/). Reads were mapped against the TAIR10 genome using a GTF annotation file and the default parameters of TopHat2 v2.0.752. Assembly and transcript quantification were performed with Cufflinks v2.2.153. Finally, weakly expressed transcripts (less than 1 RPKM [reads per kilobase per million mapped reads] in one of the libraries) were filtered out. The differential analysis was conducted with the Cuffdiff software, which belongs to the Trapnell suite. Fold change (FC) was determined as the ratio of the normalized read counts of the mutant to those of the wild type.
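The RPKM filter and fold-change definition above can be sketched in Python as follows. This is not the Cufflinks/Cuffdiff implementation; it assumes that a transcript is discarded as soon as it falls below 1 RPKM in any library, and all transcript names and values are hypothetical.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


def filter_low_expression(rpkm_table, min_rpkm=1.0):
    """Keep transcripts that reach min_rpkm in every library (assumed rule)."""
    return {
        transcript: values
        for transcript, values in rpkm_table.items()
        if all(value >= min_rpkm for value in values.values())
    }


def fold_change(mutant_value, wild_type_value):
    """Ratio of normalized expression, mutant over wild type."""
    return mutant_value / wild_type_value


# Hypothetical example: 1500 reads on a 1-kb transcript in a 30-million-read library.
example_rpkm = rpkm(1500, 1000, 30_000_000)

# Hypothetical RPKM table with one weakly expressed transcript.
table = {
    "tx_A": {"Col-0": 10.0, "msil2/4": 12.0},
    "tx_B": {"Col-0": 0.4, "msil2/4": 3.0},
}
kept = filter_low_expression(table)
fc_a = fold_change(kept["tx_A"]["msil2/4"], kept["tx_A"]["Col-0"])
print(example_rpkm, sorted(kept), fc_a)
```

Filtering before computing ratios avoids inflated fold changes that arise when the wild-type value in the denominator is close to zero.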

Fluorescence detection and tissue colorations

GFP detection was performed on roots of one-week-old seedlings grown in vitro and fixed in fixation buffer (50 mM Pipes, 5 mM EGTA, 5 mM MgSO4, pH 7, supplemented with 2% paraformaldehyde and 0.2% Triton). After 5 min of vacuum infiltration and 10 min of agitation at room temperature, roots were washed twice with fixation buffer and stored at 4°C until confocal observation. Confocal imaging was done using a Zeiss Axio Observer Z1 LSM 700. Roots were observed in water; for GFP detection, samples were excited at 488 nm and the signal was captured at 495–555 nm. For RFP detection, excitation was performed at 555 nm and the signal was captured at 600–700 nm. Images were analyzed with ImageJ. At least 15 samples per line were analyzed.

Stem staining was performed on 5-cm-long basal stem pieces stored in 80% ethanol. Stems were rehydrated first in 50% ethanol and then in water, embedded in 5% agarose, and cut transversally using a vibratome (Leica VT 1000S) to obtain 120-μm-thick sections, which were stored in water. The presence of lignin was determined by staining with phloroglucinol for 30 seconds, giving a red coloration in the presence of lignin cinnamaldehyde groups. Sections were observed using an inverted microscope (Leitz DMRIBE, Leica Micro-systems, Wetzlar, Germany) and images were recorded using a CCD camera (Color Collview, Photonic Science, Milham, UK).

For cellulose detection, the brightener Calcofluor White ST [4,4’-bis(anilino-6-bis(2-hydroxyethyl)-amino-s-triazine-ylamino)-2,2’-stilbene disulfonic acid; Cyanamide Co., Bound Brook, N.J] was used at a final concentration of 0.1%. The samples were incubated with calcofluor for 3 min and washed three times with water before being observed. For detection, excitation was performed at 405 nm and the signal was captured at 434–496 nm. For auto-fluorescence analysis, stem sections were excited at 405 nm and the blue signal was captured at 410–470 nm. Pictures were analyzed with LAS X (Leica Application Suite X).

Biochemical analyses of SCWs

Arabidopsis stems (5 cm sections at the bottom part of the stems) were harvested from Col-0 and the msil2/4 double mutant as four independent biological replicates. Soluble extractives were eliminated as described by Ployet et al.54. Briefly, stems were freeze-dried for 48 h, ground using a Mixer Mill MM 400 (Retsch), and extracted with hot solvents (successively water, ethanol, ethanol/toluene (1:1 v/v) and acetone) to obtain extractive-free stem residues (ESR). The determination of acetyl bromide (AcBr) lignin was performed on 10 mg of ESR as described by Ployet et al.55. All analyses were done in triplicate. Fourier transform infrared spectroscopy (FT-IR) analysis was performed on 100–200 mg of ESR. Spectra were recorded from ten technical replicates, in the range 400–4000 cm−1 with a 4 cm−1 resolution and 32 scans per spectrum, using an attenuated total reflection (ATR) Nicolet 6700 FT-IR spectrometer (Thermo Fisher, Illkirch-Graffenstaden, France) equipped with a deuterated-triglycine sulfate (DTGS) detector. Spectral analyses were performed as described in Dai et al.56. Saccharification yield was estimated with or without alkali pretreatment, as described by Van Acker et al.57. Reducing sugar concentration was assessed with dinitrosalicylic acid (DNS) reagent using 10 µl of the supernatant after 9 h of reaction. Enzyme activity was assessed at 0.25 filter paper units (FPU) ml−1 in our conditions.

Mass spectrometry (MS)-based proteomic analyses

Proteins from total inflorescence stem extracts of three biological replicates of Col-0 and msil2/4 mutant plants were solubilized in Laemmli buffer and heated for 10 min at 95°C. They were then stacked in the top of a 4–12% NuPAGE gel (Invitrogen) and stained with Coomassie blue R-250 (Bio-Rad) before in-gel digestion using modified trypsin (Promega, sequencing grade) as previously described58. The resulting peptides were analyzed by online nanoliquid chromatography coupled to MS/MS (Ultimate 3000 RSLCnano and Orbitrap Exploris 480, Thermo Fisher Scientific) using a 180-min gradient. For this purpose, the peptides were sampled on a precolumn (300 μm x 5 mm PepMap C18, Thermo Scientific) and separated in a 75 μm x 250 mm C18 column (Aurora Generation 2, 1.6 µm, IonOpticks). The MS and MS/MS data were acquired using Xcalibur 4.4 (Thermo Fisher Scientific).

Peptides and proteins were identified by Mascot (version 2.8.0, Matrix Science) through concomitant searches against the Viridiplantae database (from Uniprot, Arabidopsis thaliana (thale-cress) taxonomy, 136447 sequences) and a homemade database containing the sequences of classical contaminant proteins found in proteomic analyses (human keratins, trypsin…, 126 sequences). Trypsin/P was chosen as the enzyme and two missed cleavages were allowed. Precursor and fragment mass error tolerances were set at 10 and 20 ppm, respectively. Peptide modifications allowed during the search were: Carbamidomethyl (C, fixed), Acetyl (Protein N-term, variable) and Oxidation (M, variable). The Proline software59 (version 2.2.0) was used for the compilation, grouping, and filtering of the results (conservation of rank 1 peptides, peptide length ≥ 6 amino acids, false discovery rate of peptide-spectrum-match identifications < 1%60, and minimum of one specific peptide per identified protein group). MS data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository61. Proline was then used to perform an MS1 label-free quantification of the identified protein groups based on razor and specific peptides.

Statistical analysis was performed using the ProStaR software62 based on the quantitative data obtained with the four biological replicates analyzed per condition. Proteins identified in the contaminant database, proteins identified by MS/MS in fewer than two replicates of one condition, and proteins quantified in fewer than three replicates of one condition were discarded. After log2 transformation, abundance values were normalized using the variance stabilizing normalization (vsn) method, before missing value imputation (SLSA algorithm for partially observed values in the condition and DetQuantile algorithm for totally absent values in the condition). Statistical testing was conducted with limma, whereby differentially expressed proteins were selected using a log2(Fold Change) cut-off of 1 and a p-value cut-off of 0.00513, which yields a false discovery rate close to 1% according to the Benjamini-Hochberg estimator. Proteins found differentially abundant but identified by MS/MS in fewer than two replicates, and detected in fewer than three replicates, in the condition in which they were found to be more abundant were invalidated (p-value = 1).
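The selection rule above can be illustrated with a short Python sketch. It is not the ProStaR/limma implementation: it only applies the two cut-offs, assumes that the log2 fold-change threshold is applied symmetrically to increases and decreases, and uses hypothetical p-values. The fold change of 1.77 is the value reported for GXM1 in the Results.

```python
import math


def is_differential(fold_change, p_value, p_cutoff, log2_fc_cutoff=1.0):
    """Apply a symmetric log2 fold-change cut-off and a p-value cut-off."""
    passes_fc = abs(math.log2(fold_change)) >= log2_fc_cutoff
    passes_p = p_value <= p_cutoff
    return passes_fc and passes_p


P_CUTOFF = 0.00513  # value as printed in the Methods text

# GXM1 fold change from the Results; the p-value here is hypothetical.
gxm1_log2_fc = math.log2(1.77)
gxm1_selected = is_differential(1.77, p_value=0.001, p_cutoff=P_CUTOFF)

print(round(gxm1_log2_fc, 2), gxm1_selected)
```

A fold change of 1.77 corresponds to a log2 value of about 0.82, which is why GXM1 falls just below a log2(Fold Change) cut-off of 1 whatever its p-value.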

MALDI-TOF MS analyses of the patterns of xylanase-released SCW oligosaccharide

Ten milligrams of dried alcohol insoluble cell wall residues were incubated in 1 ml with 10 U ml−1 of endo-1,4-D-xylanase (E-XYLNP) from Megazyme, previously desalted with centrifugal filters (Amicon Ultra-0.5ml), for 12 h at 40°C. After boiling for 5 min to inactivate the enzyme, samples were filtered with 50 kDa centrifugal filters (Amicon Ultra-0.5ml). As a positive control, 2 mg of wheat AX (Megazyme) were treated in a similar manner. The samples were then analyzed by matrix-assisted laser desorption ionization-time of flight (MALDI-TOF) MS using DMA/DHB (N,N-dimethylaniline/2,5-dihydroxybenzoic acid) matrix39. The samples (1 μL) were deposited and then covered by the matrix (1 μL) on a polished steel MALDI target plate. MALDI measurements were then performed on a rapifleX MALDI-TOF spectrometer (Bruker Daltonics, Bremen, Germany) equipped with a Smartbeam 3D laser (355 nm, 10000 Hz) and controlled using the Flex Control 4.0 software package. The mass spectrometer was operated with positive polarity in reflectron mode. Spectra were acquired in the range of 560–3200 m/z. The detected species correspond to ions in the form of sodium adducts.
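As a consistency check on the peak assignments, the expected m/z of the sodium adducts can be computed from standard monoisotopic masses. The Python sketch below is only illustrative: the function name is hypothetical, and it assumes singly charged [M+Na]+ ions of a linear xylo-oligosaccharide carrying one GlcA or MeGlcA residue and a given number of O-acetyl groups.

```python
# Monoisotopic masses (Da) of the elements involved.
C, H, O = 12.0, 1.00782503, 15.99491462
NA_ION = 22.98976928 - 0.00054858      # sodium atom minus one electron

XYL_RESIDUE = 5 * C + 8 * H + 4 * O    # anhydro-xylose, C5H8O4
GLCA_RESIDUE = 6 * C + 8 * H + 6 * O   # anhydro-glucuronic acid, C6H8O6
ACETYL = 2 * C + 2 * H + O             # net gain per O-acetyl group, C2H2O
METHYL = C + 2 * H                     # net gain per O-methyl group, CH2
WATER = 2 * H + O                      # terminal H and OH of the chain


def sodium_adduct_mz(n_xyl, n_acetyl, methylated):
    """m/z of [M+Na]+ for (Me)GlcA-Xyl(n_xyl)-Ac(n_acetyl)."""
    mass = n_xyl * XYL_RESIDUE + WATER + GLCA_RESIDUE + n_acetyl * ACETYL
    if methylated:
        mass += METHYL
    return mass + NA_ION


for label, acetyls, methylated in [
    ("GlcA-Xyl10-Ac4", 4, False),
    ("GlcA-Xyl10-Ac5", 5, False),
    ("MeGlcA-Xyl10-Ac4", 4, True),
    ("MeGlcA-Xyl10-Ac5", 5, True),
]:
    print(label, round(sodium_adduct_mz(10, acetyls, methylated), 1))
```

Under these assumptions the four species come out at m/z 1705.5, 1747.5, 1719.5 and 1761.5, matching the ions assigned in the MALDI-TOF spectra: each acetyl group adds about 42 and the methyl group about 14 mass units.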

Thioacidolysis

This method is adapted from Lapierre et al.63. Briefly, 10 mg of alcohol insoluble cell wall residues were incubated in 3 ml of dioxane with ethanethiol (10%) and BF3 etherate (2.5%), containing 0.1% of heneicosane (C21) diluted in CH2Cl2, at 100°C for 4 hours. After cooling, 3 ml of NaHCO3 (0.2 M) were added and mixed prior to the addition of 0.1 mL of HCl (6 M). The tubes were vortexed after addition of 3 mL of dichloromethane and the whole lower organic phase was collected in a new tube before concentration under nitrogen atmosphere to approximately 0.5 ml. Then, 10 μL were trimethylsilylated (TMS) with 100 μL of N,O-bis(trimethylsilyl) trifluoroacetamide and 10 μL of ACS-grade pyridine. The trimethylsilylated samples were injected (1 μL) onto an Agilent 5973 Gas Chromatography–Mass Spectrometry system. Specific ion chromatograms reconstructed at m/z 239, 269 and 299 were used to quantify H, G and S lignin monomers, respectively, and compared to the internal standard (m/z 57, 71, 85).

Cell wall sugar analysis

Neutral monosaccharide content was determined by gas chromatography after acid hydrolysis and conversion of monomers into alditol acetates as described in Hoebler et al.64 and Blakeney et al.65. Gas chromatography was performed on a DB 225 capillary column (J&W Scientific, Folsom, CA, USA; temperature 205 °C, carrier gas H2). Calibration was made with a standard sugar solution and inositol as internal standard.

For acid sugar analysis, the uronic acid content was automatically quantified after hydrolysis of polysaccharides in concentrated sulfuric acid (18 M), with or without sodium tetraborate (12.5 mM), followed by m-hydroxydiphenyl colorimetric determination66, 67. Galacturonic and glucuronic acids were used as standards for quantification.
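
Quantification against standards in a colorimetric assay of this kind typically relies on a linear standard curve. The sketch below shows that general idea only; it does not reproduce the automated procedure used here and does not model the comparison of readings obtained with and without tetraborate. The function names and all concentrations and absorbance values are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration(absorbance, slope, intercept):
    """Invert the standard curve to estimate a sample concentration."""
    return (absorbance - intercept) / slope


if __name__ == "__main__":
    # Hypothetical standard series (ug/ml) and absorbance readings.
    std_conc = [0.0, 20.0, 40.0, 60.0, 80.0]
    std_abs = [0.02, 0.22, 0.42, 0.62, 0.82]
    slope, intercept = fit_line(std_conc, std_abs)
    print("sample estimate (ug/ml):", concentration(0.52, slope, intercept))
```

A separate curve would be fitted for each standard (galacturonic and glucuronic acid) and for each reagent condition, since the colour yield can differ between them.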

Data availability

The high-throughput sequencing data generated in this study are accessible at NCBI’s Gene Expression Omnibus (GEO) via GEO Series accession number GSE223732. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXD040207 (proteomic identification of MSIL-interacting partners), and PXD040020 (proteomic analysis of total inflorescence stem extracts).

Acknowledgements

We thank Aurélie Le Ru for the histological staining and cell wall thickness measurements, Hua Cassan-Wang for help with the preparation of stem sections for staining, and Michele Laudié for preparation of the mRNA libraries. This work was supported by the Centre National de la Recherche Scientifique (CNRS), and grants ANR-08-BLAN-0206 and 12-BSV6-0010 from the Agence Nationale de la Recherche (ANR) to TL. This study was also supported by the “Ecole Universitaire de Recherche” (EUR) TULIP-GS (ANR-18-EURE-0019), and the EPIPLANT Groupement de Recherche (CNRS, France). The proteomic experiments performed by LB and YC were partially supported by the Agence Nationale de la Recherche under projects ProFI (Proteomics French Infrastructure, ANR-10-INBS-08) and GRAL, a program from the Chemistry Biology Health (CBH) Graduate School of University Grenoble Alpes (ANR-17-EURE-0003). The funders had no role in study design, data collection and analysis, decision to publish or preparation of the paper.

Author contributions

Study design: T.L., F.M., R.S., M.F., D.P., and N.B.E.; mutant characterization and molecular biology: A.K., D.P., C.P., R.M., and J.A.; RNA binding analysis: N.B.E.; MSIL co-purification experiments, mass spectrometry and statistical analysis of co-IP data: D.G. and P.H.; total inflorescence stem proteomics and statistical analysis: L.B. and Y.C.; polysaccharide biochemical analysis: L.L-B. and R.S.; lignin biochemical and FTIR analysis: F.M.; immunohistochemical staining: Y.M.; MALDI-TOF analysis and statistical analysis: R.S. and M.F.; writing: T.L., F.M., R.S., M.F., D.P., and N.B.E.; visualisation: T.L.; supervision: T.L. and N.B.E.

Competing interests

The authors declare no competing interests.