
Hfq CLASH uncovers sRNA-target interaction networks linked to nutrient availability adaptation

  1. Ira Alexandra Iosub
  2. Robert Willem van Nues
  3. Stuart William McKellar
  4. Karen Jule Nieken
  5. Marta Marchioretto
  6. Brandon Sy
  7. Jai Justin Tree
  8. Gabriella Viero
  9. Sander Granneman (corresponding author)
  1. Centre for Synthetic and Systems Biology, University of Edinburgh, United Kingdom
  2. Institute of Cell Biology, University of Edinburgh, United Kingdom
  3. Institute of Biophysics, CNR Unit, Italy
  4. School of Biotechnology and Biomolecular Sciences, University of New South Wales, Australia
Research Article
Cite this article as: eLife 2020;9:e54655 doi: 10.7554/eLife.54655

Abstract

By shaping gene expression profiles, small RNAs (sRNAs) enable bacteria to efficiently adapt to changes in their environment. To better understand how Escherichia coli acclimatizes to nutrient availability, we performed UV cross-linking, ligation and sequencing of hybrids (CLASH) to uncover Hfq-associated RNA-RNA interactions at specific growth stages. We demonstrate that Hfq CLASH robustly captures bona fide RNA-RNA interactions. We identified hundreds of novel sRNA base-pairing interactions, including many sRNA-sRNA interactions and interactions involving 3’UTR-derived sRNAs. We rediscovered known and identified novel sRNA seed sequences. The sRNA-mRNA interactions identified by CLASH have strong base-pairing potential and are highly enriched for complementary sequence motifs, even those supported by only a few reads. Yet, steady state levels of most mRNA targets were not significantly affected upon over-expression of the sRNA regulator. Our results reinforce the idea that the reproducibility of the interaction, not base-pairing potential, is a stronger predictor of a regulatory outcome.

Introduction

Microorganisms are renowned for their ability to adapt to environmental changes by rapidly rewiring their gene expression program. These responses are mediated through integrated transcriptional and post-transcriptional networks. Transcriptional control dictates which genes are expressed (Balleza et al., 2009; Martínez-Antonio et al., 2008) and is well-characterised in Escherichia coli. Post-transcriptional regulation is key for controlling adaptive responses. By using riboregulators and RNA-binding proteins (RBPs), cells can efficiently integrate multiple pathways and incorporate additional signals into regulatory circuits. E. coli employs many post-transcriptional regulators, including small regulatory RNAs (sRNAs (Waters and Storz, 2009)), cis-acting RNAs (Kortmann and Narberhaus, 2012), and RNA binding proteins (RBPs) (Holmqvist and Vogel, 2018). The sRNAs are the largest class of bacterial regulators, working in tandem with RBPs to regulate their RNA targets (Storz et al., 2011; Waters and Storz, 2009). By base-pairing with their targets, small RNAs can repress or stimulate translation and transcription elongation and control the stability of transcripts (Sedlyarova et al., 2016; Updegrove et al., 2016; Vogel and Luisi, 2011; Waters and Storz, 2009).

Base-pairing interactions are often mediated by RNA chaperones such as Hfq and ProQ, which help to anneal or stabilize the sRNA and sRNA-target duplex (Melamed et al., 2020; Melamed et al., 2016; Smirnov et al., 2017; Smirnov et al., 2016; Updegrove et al., 2016). Although Hfq is most frequently mentioned in association with sRNA-mediated regulation, it can also control gene expression independently of sRNAs in response to environmental changes (Salvail et al., 2013; Sonnleitner and Bläsi, 2014). In Pseudomonas aeruginosa, Hfq directly binds to mRNAs to repress translation in response to changes in nutrient availability, which relies on a protein co-factor Crc that acts cooperatively with Hfq to inhibit translation (Pei et al., 2019; Sonnleitner and Bläsi, 2014).

During growth in rich media, E. coli are exposed to continuously changing conditions, such as fluctuations in nutrient availability, pH and osmolarity. Consequently, E. coli elicit complex responses that result in physiological and behavioural changes such as envelope composition remodelling, quorum sensing, nutrient scavenging, swarming and biofilm formation. Even subtle changes in the growth conditions can trigger rapid adaptive responses.

Accordingly, each stage of the growth curve is characterised by different physiological states driven by the activation of different transcriptional and post-transcriptional networks. Moreover, growth phase dependency of virulence and pathogenic behaviour has been demonstrated in both Gram-positive and Gram-negative bacteria. In some cases, a particular growth stage is non-permissive for the induction of virulence (Mäder et al., 2016; El et al., 2018). Although the exponential and stationary phases have been characterised in detail (Navarro Llorens et al., 2010; Pletnev et al., 2015), little is known about the transition between these two phases. During this transition, the cell population starts to scavenge alternative carbon sources, which requires rapid remodelling of their transcriptome (Baev et al., 2006a; Baev et al., 2006b; Sezonov et al., 2007).

To understand sRNA-mediated adaptive responses, detailed knowledge of the underlying post-transcriptional circuits is required. In E. coli, hundreds of sRNAs have been discovered, and only a small fraction of these have been characterised. A key step to unravel the roles of sRNAs in regulating adaptive responses is to identify their target mRNAs. To tackle this at genome-wide level, high-throughput methods have been developed to uncover sRNA base-pairing interactions (Han et al., 2017; Hör et al., 2018; Hör and Vogel, 2017; Lalaouna et al., 2015a; Melamed et al., 2016; Waters et al., 2017).

To unravel sRNA base-pairing interactions taking place during the entry into stationary phase, we applied UV cross-linking, ligation and sequencing of hybrids (CLASH) (Helwak et al., 2013; Kudla et al., 2011) to E. coli. Firstly, we demonstrate that the highly stringent purification steps make CLASH a robust method for direct mapping of Hfq-mediated sRNA base-pairing interactions in E. coli. This enabled us to significantly expand on the sRNA base-pairing interactions found by RNase E CLASH (Waters et al., 2017) and RIL-seq (Melamed et al., 2016). Additionally, we identified a plethora of sRNA-sRNA interactions and potentially novel 3’UTR-derived sRNAs, confirming that this class of sRNAs is highly prevalent (Chao et al., 2012; Chao et al., 2017; Chao and Vogel, 2016; Miyakoshi et al., 2015a). The sRNA-mRNA interactions identified by CLASH have a strong base-pairing potential and are highly enriched for complementary sequence motifs, even those supported by only a few chimeric reads. We rediscovered known and identified novel sRNA seed sequences in the CLASH data, implying they represent genuine in vivo interactions. However, in many cases, over-expression of the sRNA did not significantly impact the steady state levels of putative mRNA targets. Although base-pairing potential is important, our results reinforce the notion that reproducibly detected interactions are more likely to impact target steady-state levels (Faigenbaum-Romm et al., 2020).

Results

Hfq CLASH in E. coli

To unravel the post-transcriptional networks that underlie the transition between exponential and stationary growth phases in E. coli, we performed CLASH (Helwak et al., 2013; Kudla et al., 2011) using Hfq as bait (Figure 1A). To generate high-quality Hfq CLASH data, we made a number of improvements to the original protocol used for RNase E CLASH (Waters et al., 2017). Our Hfq CLASH protocol has several advantages over the related RIL-seq method (see Materials and methods and Discussion). As negative controls, replicate CLASH experiments were performed on the untagged parental strain. When combined, the control samples had ~10 times fewer single-mapping reads and contained only 297 unique chimeric reads, compared to over 50,000 chimeras identified in the tagged Hfq data. This result demonstrates that the CLASH purification method produced very low background levels.

Figure 1 (with 3 supplements)
Hfq CLASH experiments at different growth phases in E. coli.

(A) Overview of the critical experimental steps for obtaining the Hfq CLASH data. E. coli cells expressing an HTF (His6-TEV-3xFLAG)-tagged Hfq (Tree et al., 2014) were grown in LB and an equal number of cells were harvested at different optical densities (OD600) and UV cross-linked. Hfq cross-linked to sRNA-RNA duplexes is purified under stringent and denaturing conditions, and RNA ends that are in close proximity are ligated together. After removal of the protein with Proteinase K, cDNA libraries are prepared and sequenced. The single reads can be used to map Hfq-RNA interactions, whereas the chimeric reads can be traced to sRNA-target interactions. (B) A growth curve of the cultures used for the Hfq CLASH experiments, with the OD600 at which cells were cross-linked. Each growth stage is indicated above the plot. The results show the mean and standard deviations of two biological replicates. Source data are provided as a Source Data file. (C) Cultures at the indicated OD600 were cross-linked, harvested by filtration and analysed by Hfq CLASH, RNA-seq and western blotting to detect Hfq. All the experiments were done in duplicate.

Cell samples from seven different optical densities were subjected to Hfq CLASH. Based on the growth curve analysis shown in Figure 1B, we categorized OD600 densities 0.4 and 0.8 as exponential growth phase, 1.2, 1.8, 2.4 as the transition phase from exponential to stationary, and 3.0 and 4.0 as early stationary phase. To complement the CLASH data, RNA-seq and western blot analyses were performed on UV-irradiated cells to quantify steady state RNA and Hfq protein levels, respectively (Figure 1C, Figure 1—figure supplement 1). Western blot analyses revealed that Hfq levels were very modestly increased during growth (Figure 1—figure supplement 1A–B). To determine the cross-linking efficiency, Hfq-RNA complexes immobilised on nickel beads were radiolabelled, resolved on NuPAGE gels and analysed by autoradiography. The data show that the recovery of Hfq and radioactive signal was comparable at each optical density studied (Figure 1—figure supplement 1C). Comparison of normalised read counts of replicate CLASH and RNA-seq experiments showed that the results were highly reproducible (Figure 1—figure supplement 2). Meta-analyses of the Hfq CLASH sequencing data revealed that the distribution of Hfq binding across mRNAs was very similar at each growth stage. We observed the expected Hfq enrichment at the 5’UTRs and at the 3’UTRs at each growth stage (Figure 1—figure supplement 3A and B for examples). After identifying significantly enriched Hfq-binding peaks (FDR <= 0.05; see Materials and methods for details), we used the genomic coordinates of these peaks to search for Hfq binding motifs in mRNAs. The most enriched k-mer included poly-U stretches (Figure 1—figure supplement 3C) that resemble the poly-U tracts characteristic to Rho-independent terminators found at the end of many bacterial transcripts (Wilson and Von, 1995), and confirms the motif uncovered in CLIP-seq studies in Salmonella (Holmqvist et al., 2016).

Hfq CLASH robustly detects RNA-RNA interactions

To get the complete catalogue of the RNA-RNA interactions captured by Hfq CLASH, we merged the data from the two biological replicates of CLASH growth phase experiments (Supplementary file 1). Overlapping paired-end reads were merged and unique chimeric reads were identified using the hyb pipeline (Travis et al., 2014). To select RNA-RNA interactions for further studies, we applied a probabilistic analysis pipeline previously used for detecting RNA-RNA interactions in human cells (Sharma et al., 2016) and adapted for the analyses of RNase E CLASH data (Waters et al., 2017). This pipeline tests the likelihood that observed chimeras could have formed spuriously. Strikingly, 87% of the chimeric reads had a Benjamini-Hochberg adjusted p-value of 0.05 or less, indicating that it is highly unlikely that these chimeras were generated by random ligation of RNA molecules. A complete overview of statistically significantly enriched chimeras is provided in Supplementary file 2.
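The multiple-testing step mentioned above can be illustrated with a minimal sketch of the Benjamini-Hochberg adjustment. This is not the probabilistic pipeline used in the study; the function name `benjamini_hochberg` and the raw p-values are hypothetical and serve only to show how adjusted p-values are compared against the 0.05 threshold.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotonic.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for five chimeras (illustrative only).
raw = [0.001, 0.008, 0.039, 0.041, 0.60]
adj = benjamini_hochberg(raw)
# A chimera is retained when its adjusted p-value is 0.05 or less.
significant = [p <= 0.05 for p in adj]
```

In this toy example only the first two chimeras pass the filter, although four of the five raw p-values are below 0.05.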

We next analysed the distribution of combinations of transcript classes found in the statistically filtered chimeric reads. Hfq CLASH identified over 2000 unique sRNA-mRNA target interactions represented by 18783 chimeras (Figure 2A; Supplementary file 3). These chimeras included sRNAs derived from 3’UTRs and were the most frequently recovered Hfq-mediated interaction type (65.7%; Figure 2A). We suspect that this number might be higher, as 1.7% of the chimeras contained fragments of sRNAs fused to short sequences from intergenic regions (Figure 2A). Manual inspection of several of these indicated that some of the intergenic sequences were located near genes for which the UTRs were either unannotated or too short. Interestingly, 10.5% of the intermolecular chimeras contained fragments from two different mRNAs (Figure 2A). Based on analyses presented below, we speculate that many of these could be interactions between novel 3’UTR-derived sRNAs and mRNA substrates. Around 1% of the chimeras represented sRNA-tRNA interactions. In E. coli, external transcribed spacers of tRNAs can base-pair with sRNAs to absorb transcriptional noise (Lalaouna et al., 2015a). In many cases, the predicted base-pairing interactions between the tRNA and sRNA halves in chimeras are quite extensive (Supplementary file 2). Hence, it is possible that this group contains biologically relevant interactions.

Figure 2 (with 5 supplements)
Hfq CLASH detects RNA-RNA interactions in E. coli.

(A) Intermolecular RNA interactions found in chimeras captured by Hfq CLASH. Chimera counts for all the uniquely annotated hybrids that mapped to genomic features. *tRNA-tRNA and rRNA-rRNA chimeras originating from different genomic regions were removed because tRNA and rRNA gene copies are very similar and therefore we could not unambiguously determine if these represented intermolecular or intramolecular interactions. (B) Venn diagram comparing the sRNA-mRNA interactions found in RIL-seq S-chimera data (log and stationary) and Hfq CLASH data. (C) Venn diagram showing the intersection between interactions from statistically filtered CLASH data from two biological replicates, recovered at three main growth stages: exponential (OD600 0.4 and 0.8), transition (OD600 1.2, 1.8, 2.4) and early stationary (OD600 3.0 and 4.0). (D) Same as in (C) but for sRNA-mRNA interactions. (E) Distribution of mRNA fragments in sRNA-mRNA chimeras over all E. coli protein-coding genes. Each gene was divided into 100 bins and the number of mRNA fragments that mapped to each bin (hit density; y-axis) was calculated. (F) Distribution of the mRNA fragments of sRNA-mRNA chimeras around the translational start codon (AUG). The pink line indicates the position of the start codon. (G–H) Enriched motifs in mRNA fragments of chimeras that uniquely overlap 5’UTRs and 3’UTRs; the logos were drawn using the top 20 k-mers.
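The binning described for panel (E) can be sketched as follows. This is an illustrative simplification, not the analysis code used for the figure: it assumes forward-strand genes with 0-based, end-exclusive coordinates and assigns each fragment to a bin by its midpoint. The function name `hit_density`, the gene names and the coordinates are all hypothetical.

```python
def hit_density(genes, fragments, n_bins=100):
    """Count fragments per relative-position bin, pooled over all genes.

    genes: dict mapping gene name -> (start, end), end exclusive.
    fragments: list of (gene_name, frag_start, frag_end).
    Assumption: a fragment is assigned to the bin holding its midpoint.
    """
    density = [0] * n_bins
    for name, f_start, f_end in fragments:
        g_start, g_end = genes[name]
        midpoint = (f_start + f_end) // 2
        if g_start <= midpoint < g_end:
            # Integer arithmetic avoids floating-point edge effects.
            bin_index = (midpoint - g_start) * n_bins // (g_end - g_start)
            density[bin_index] += 1
    return density


# Hypothetical genes and mRNA fragments (illustrative only).
genes = {"geneA": (1000, 2000), "geneB": (5000, 5500)}
fragments = [
    ("geneA", 1000, 1020),  # near the 5' end of geneA
    ("geneA", 1005, 1025),  # near the 5' end of geneA
    ("geneB", 5490, 5500),  # near the 3' end of geneB
]
density = hit_density(genes, fragments)
```

Because every gene is rescaled to the same 100 bins, genes of different lengths contribute to a single profile, which is what makes enrichment near the 5’ and 3’ ends visible.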

Most of the interactions, including sRNA-mRNA interactions, were identified in the transition phase (Figure 2C–D). The mRNA fragments found in chimeric reads were strongly enriched in 5’UTRs peaking near the translational start codon (Figure 2E–F), consistent with the canonical mode of translational inhibition by sRNAs (Bouvier et al., 2008). Enrichment was also found in 3’UTRs of mRNAs, although to a lesser extent compared to 5’UTRs (Figure 2E). Motif analyses revealed a distinct sequence preference in 5’UTR and 3’UTR binding sites (Figure 2G–H, Supplementary files 8–9). The motifs enriched in the 5’UTR chimeric fragments are more consistent with Hfq binding to Shine-Dalgarno-like (ARN)n sequences (Tree et al., 2014; Supplementary file 8) and U-tracts, whereas the 3’UTR-containing chimera consensus motif corresponds to poly-U transcription termination sites (Figure 2G–H and Supplementary file 9).

To further test the quality of our CLASH data, we focussed on the 24 experimentally verified sRNA-mRNA interactions recovered in our data, which we used as a ‘ground truth’ for known interactions. Strikingly, 92% of the sRNAs in our chimeras with experimentally verified interactions were fused to the cognate mRNA fragments (Figure 2—figure supplement 1A). Vice versa, ~87% of the mRNAs in our chimeras known to be regulated by sRNAs, were fused to cognate sRNA fragments (Figure 2—figure supplement 1B). Except for the GcvB-sstT chimeras, all the experimentally verified interactions in our data had the known mRNA and sRNA seeds (Figure 2—figure supplement 1C–D). This implies that the false negative rate in our data is very low. When we extended these analyses to all sRNAs and mRNAs identified in our data, we obtained very similar results (Figure 2—figure supplement 2A–B). Only the known MicC seed sequence was absent in MicC chimeras (Figure 2—figure supplement 2C).

As a proxy for noise, we quantified intermolecular chimeras containing rRNA sequences. Ribosomal RNA represents up to 80% of total cellular RNA and therefore often contributes significantly to noise in sequencing data. Although Hfq is known to interact with rRNA, this interaction appears to be sRNA independent (Andrade et al., 2018). Therefore, chimeras containing rRNA fragments likely represent background. sRNAs or mRNAs were fused to rRNA sequences in less than 4% of the chimeras, suggesting that the CLASH data has low background (Figure 2—figure supplements 1–2).

We recovered around 20% of the sRNA-mRNA networks found with RIL-seq (Figure 2B) and 37 experimentally verified interactions (Supplementary file 7). These results suggest that while the CLASH data contained many known interactions, the analyses were clearly not exhaustive (also see Discussion). A large number of sRNA-mRNA interactions (~1700) were uniquely found in the CLASH data (Figure 2B) and many were supported by a relatively low number of reads compared to those found both in RIL-seq and CLASH (Supplementary file 2; Figure 2—figure supplement 3). This raises the question whether these chimeras represent bona fide interactions or were merely generated through random/stochastic ligation events. To address this, we repeated the previous bioinformatics analyses on the chimeras unique to the CLASH data. This gave almost identical results. The vast majority of the chimeras were fusions between sRNA and mRNA fragments (Figure 2—figure supplement 4A–B) and again in almost all cases the experimentally verified sRNA seeds were recovered (Figure 2—figure supplement 4B). Next, we analysed the chimeras unique to the CLASH data that were supported by less than four reads (Figure 2—figure supplement 5). The majority of these chimeras in this group represented sRNA-mRNA and mRNA-mRNA interactions (Figure 2—figure supplement 5A–B) and again in almost all cases the known sRNA seed sequences were recovered (Figure 2—figure supplement 5C). We do note the slightly higher percentage of sRNA-rRNA and mRNA-rRNA chimeras (12–13%) in this group, suggesting higher background levels (Figure 2—figure supplement 5A–B). However, considering again the sheer abundance of rRNA in bacterial cells, we argue that also the background in this group of low abundance chimeras is remarkably low.

To provide additional evidence that the low-abundance interactions identified with CLASH represent genuine interactions and not weak or stochastic interactions, we calculated the base-pairing potential between the two halves of the chimeras. For this purpose, we used RNAduplex (Lorenz et al., 2011) to compute the hybridization potential (in kcal/mol) of the two halves in each chimera. We focussed on sRNA-mRNA chimeras as this group represented the largest number of interactions (Figure 3). These analyses revealed that all sRNA-mRNA chimeras in the CLASH data, even those supported by only a few reads (Figure 3D), had a significantly higher propensity to form stable duplexes when compared to in silico shuffled chimeric reads (p-value < 6 × 10⁻¹⁶). These data imply that a large fraction of the chimeras represent genuine base-pairing interactions and not random ligation events.

Figure 3
In silico folding of sRNA-mRNA chimeras shows Hfq CLASH sRNA-mRNA interactions are significantly more structured than randomly matched pairs.

(A) Cumulative distribution of the predicted folding energy (ΔG) values between sRNA and matching mRNA found in all statistically filtered sRNA-mRNA interactions. Chimera folding energies were calculated using RNAduplex (Lorenz et al., 2011), and their distribution was compared to the control distributions of chimeric reads in which the fragments were randomly shuffled over the same gene, or over genes belonging to the same class of genes (e.g. sRNAs or mRNAs), respectively. Significance was tested with the Kolmogorov-Smirnov test. (B) As in (A) but now for the chimeras unique to the CLASH data. (C) As in (A) but now for chimeras that are supported by less than four reads. (D) As in (A) but now for chimeras unique to the CLASH data and supported by less than four reads.
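To make the comparison of cumulative distributions concrete, the sketch below computes the two-sample Kolmogorov-Smirnov statistic, i.e. the largest vertical gap between two empirical cumulative distributions. It does not call RNAduplex and does not compute a p-value; the ΔG values and the function name `ks_statistic` are hypothetical and only illustrate the principle.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: maximum gap between the empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d_max = 0.0
    # The gap can only change at observed values, so check each of them.
    for value in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, value) / len(a)
        cdf_b = bisect_right(b, value) / len(b)
        d_max = max(d_max, abs(cdf_a - cdf_b))
    return d_max


# Hypothetical folding energies in kcal/mol (illustrative only).
observed_dg = [-18.2, -15.0, -14.1, -12.7, -11.3]  # chimeras as sequenced
shuffled_dg = [-12.0, -9.8, -8.4, -7.7, -6.1]      # shuffled controls
d = ks_statistic(observed_dg, shuffled_dg)
```

A statistic close to 1 means the two distributions barely overlap; here the observed chimeras are shifted towards more negative (more stable) energies than the shuffled controls.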

If the recovered interactions indeed represent bona fide interactions, then it may be expected that the putative mRNA targets found in CLASH chimeras are enriched for sequence motifs complementary to the sRNA seed sequences. To test this, we performed motif analyses on targets of 38 sRNAs that showed at least five unique interactions with different mRNAs (Figure 4A). Some sRNAs appeared to utilize multiple and independent seed sequences to base-pair with mRNAs. In these cases, we first performed a K-means clustering analysis to group those chimeras that contained similar sRNA sequences. For each of the resulting clusters (usually 4–5), we subsequently extracted the corresponding mRNA fragments from the chimeras and performed motif analyses using the MEME tool suite (Bailey et al., 2009). This enabled us to detect mRNA sequence motifs that are associated with specific sRNA seed sequences. The results are shown in Figure 4—figure supplements 1–12. The motif analyses were performed for all the mRNA fragments found in sRNA-mRNA chimeras, mRNA fragments from sRNA-mRNA interactions uniquely identified by CLASH, and mRNA fragments found in sRNA-mRNA interactions supported by less than four reads. In the majority of cases we recovered previously identified mRNA sequence motifs (Faigenbaum-Romm et al., 2020; Melamed et al., 2016; Waters et al., 2017). The majority of the sRNA-mRNA interactions involving RyjB, ChiX, SdsR and GadY were supported by less than four reads and only found in our CLASH data. Regardless, the mRNA fragments in these chimeras were significantly enriched for sequence motifs complementary to the sRNA including known seed sequences (Figure 4B, Figure 4—figure supplements 1–3). We also identified novel mRNA sequence motifs for RyjB, GadY, ArcZ, CyaR and GcvB (Figure 4B, Figure 4—figure supplements 3–6).
GcvB was previously reported to recognize the consensus motif CACAaCAY in mRNAs through interactions with the GU-rich R1 seed region located at bases 66–89 (Gulliver et al., 2018; Sharma et al., 2011). Consistent with this, we found a similar motif in chimeras from cluster 2, although these were less frequently recovered in the interactions only identified by CLASH and chimeras supported by less than four reads. Our analyses also identified a well-defined sequence motif in putative mRNA targets that is highly complementary to the R3 seed, consistent with the idea that this seed is also very frequently used to regulate mRNAs (Lalaouna et al., 2019). The R3 complementary sequence motif was most highly enriched in the interactions uniquely identified in CLASH (Figure 4—figure supplement 6B). In all but one case (CyaR motif in cluster 3; Figure 4—figure supplement 5B), the mRNA sequence motifs showed significant complementarity to known seed sequences (Figure 4—figure supplements 1–12). In addition, these analyses indicated that sequences in the 3’ ends of ArcZ and CyaR can also function as seeds (Figure 4—figure supplements 4–5). Certain motifs were more frequently found in sRNA-mRNA interactions uniquely identified by CLASH: the MgrR mRNA motif found in the RIL-seq data was not frequently detected in our data, but the novel MgrR interactions recovered by CLASH showed a significant enrichment of G-rich motifs in mRNA fragments (Figure 4—figure supplement 7).

Figure 4 (with 13 supplements)
Total number of interactions for sRNAs and in how many cases enriched sequence motifs were found.

(A and C) The heatmaps show the number of different mRNA interactions identified with independently transcribed sRNAs (A) or (putative) 3’UTR-derived sRNAs (C). Only the sRNA for which we recovered at least five different interactions with mRNAs (highlighted in black) were further analysed for enriched motifs in the putative mRNA targets. The black-and-white heatmaps indicate if enriched motifs were identified in predicted mRNA targets (black means Yes and white means No). Motif analysis was performed using the MEME suite (Bailey et al., 2009). The number of target sequences that contained the common motif and the E-value of MEME are shown. The identified motifs in the mRNA targets also show sequence complementarity to the sRNA sequence. The Motif Alignment Search Tool (MAST) was used to determine the degree of complementarity between the identified motifs in putative mRNA targets and the putative sRNA. An sRNA was considered to have an enriched motif if a motif identified by MEME had an E-value <= 0.1 and/or the MAST p-value of the motif, which indicates the overall match between the identified motifs and the sRNA sequence (Bailey et al., 2009), was <= 0.001. (B–D) Motif analyses of mRNA sequences found in RyjB sRNA-mRNA and ahpF-3’UTR-mRNA interactions. All of the RyjB and ahpF-3’UTR interactions with mRNAs we found were uniquely detected in our CLASH data.

We also reasoned that genuine interactions should be enriched in RNA-RNA interaction data generated by alternative experimental approaches. To test this, we compared our data to recent GcvB and CyaR MS2 Affinity Purification coupled with RNA Sequencing (MAPS) datasets (Lalaouna et al., 2019; Lalaouna et al., 2018; Figure 4—figure supplement 13A and B). The CyaR and GcvB datasets were chosen as we had a large number of different mRNA interactions with these sRNAs (>200), which enabled us to do a statistically meaningful comparison of the datasets. Indeed, the results show that CLASH mRNA targets were significantly more highly enriched compared to the other genes in the MAPS datasets. This was even the case for those interactions supported by a relatively low number of chimeric reads, including many interactions uniquely found in our CLASH data.

Collectively, these analyses strongly suggest that the predicted interactions found in our CLASH data, even those supported by a relatively low number of chimeras, are highly enriched for bona fide sRNA-mRNA interactions and less likely to be formed by random/stochastic events.

What is the biological significance of these interactions? Because sRNAs can influence the stability of their mRNA targets, we asked how many of the putative mRNA targets showed changes in gene expression in existing sRNA over-expression datasets (Figure 5, Figure 5—figure supplements 1–4). We initially analysed previously published E. coli microarray datasets (Beisel and Storz, 2011; De Lay and Gottesman, 2009; Sharma et al., 2011) similar to what was performed to validate RIL-seq interactions (Melamed et al., 2016). For these analyses, we also focussed on sRNAs that had a very high number of different mRNA interactions (>200) in our CLASH data (ArcZ, GcvB, CyaR and Spot42; Figure 5—figure supplements 1–4). While this work was under revision, RNA-seq data from several sRNA over-expression analyses in E. coli became available (Faigenbaum-Romm et al., 2020), which we subsequently included in our analyses (Figure 5A). Only a subset of the predicted sRNA targets showed significant changes in gene expression. GcvB CLASH mRNA targets were most highly enriched for differentially expressed genes, although this was lower for the less abundant interactions uniquely found in the CLASH data (Figure 5A, Figure 5—figure supplement 1). Surprisingly, although the CyaR targets were highly enriched in the MAPS data, only a few of the mRNAs were significantly differentially expressed in the CyaR over-expression data (Figure 5A, Figure 5—figure supplement 2). The Spot42 mRNA targets predicted by CLASH showed larger (albeit modest) changes in gene expression compared to the other genes in the dataset (Figure 5—figure supplement 3).

Figure 5 (with 4 supplements)
A subset of putative mRNA targets identified by CLASH show gene expression changes upon over-expression of the sRNA.

(A) The Venn diagrams show how many of the predicted mRNA targets were also found to be differentially expressed in sRNA over-expression RNA-seq data (Faigenbaum-Romm et al., 2020). The GcvB and MicA CLASH mRNA targets are highly enriched for genes that are differentially expressed in the over-expression RNA-seq data (p-value<0.001). The statistical significance was calculated using a hypergeometric test. Interactions that are generally represented by a relatively low number of reads (‘CLASH unique’ and ‘less than four reads’ categories) are not significantly enriched for differentially expressed genes. (B) The mRNA targets found in GcvB and MicA interactions found in both RIL-seq and CLASH show significantly higher fold-changes in the over-expression data compared to the interactions uniquely found in the CLASH data. The violin plots show the distribution of fold-changes in mRNA target expression (y-axis) in the over-expression RNA-seq data for chimeras supported by CLASH and RIL-seq and those found in CLASH only (x-axis). Statistical significance between the two groups was calculated using a Mann-Whitney U test.
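The hypergeometric enrichment test named in the legend can be sketched as below. The sketch assumes a one-sided test for over-representation, which the legend does not specify; the gene counts are invented for illustration and are not taken from the study, and `hypergeom_enrichment_p` is a hypothetical helper.

```python
from math import comb


def hypergeom_enrichment_p(population, successes, draws, observed):
    """P(X >= observed) when `draws` genes are sampled without replacement
    from `population` genes, of which `successes` are differentially expressed."""
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return tail / total


# Hypothetical counts (illustrative only):
# 4000 genes in total, 200 differentially expressed upon over-expression,
# 100 predicted CLASH targets, 20 of which are differentially expressed.
p_value = hypergeom_enrichment_p(4000, 200, 100, 20)
enriched = p_value < 0.001
```

With these made-up numbers only 5 overlapping genes would be expected by chance, so an overlap of 20 yields a p-value far below 0.001.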

Previous work implied that those interactions that impact mRNA steady-state levels are mostly found in multiple replicate RIL-seq experiments and are generally more abundant (Faigenbaum-Romm et al., 2020). The interactions recovered by both RIL-seq and CLASH were supported by a significantly higher number of chimeras compared to those uniquely identified in the CLASH data (Figure 2—figure supplement 3). Therefore, we asked if this group of interactions was more likely to alter mRNA levels. This was the case for the GcvB and MicA mRNA interactions, but not ArcZ and CyaR interactions (Figure 5B).

In conclusion, similar to what was observed for RIL-seq mRNA targets (Faigenbaum-Romm et al., 2020), many of the sRNA-mRNA interactions do not appear to significantly affect mRNA steady-state levels, and for some sRNAs reproducible interactions have a higher likelihood of impacting mRNA target levels (also see Discussion).

Hfq CLASH predicts sRNA-sRNA interactions as a widespread layer of post-transcriptional regulation

Surprisingly, we uncovered a large number of sRNA-sRNA chimeras, representing 200 unique interactions (Figure 2A; 2.1%; Supplementary file 4). Many of the sRNA-sRNA interactions were uniquely found in our Hfq CLASH data (Figure 6A), were growth-stage specific and the sRNA-sRNA networks show extensive rewiring across the exponential, transition and stationary phases (Figure 6—figure supplement 1). The sRNA-sRNA network is dominated by several abundant sRNAs that appear to act as hubs with many interacting partners: ChiX, Spot42 (spf), ArcZ and GcvB. Again, in many cases, the experimentally validated sRNA seed sequences were found in the chimeric reads, for both established and novel interactions. For example, the majority of ArcZ sRNA-sRNA chimeras contained the known and well conserved seed sequence (Figure 6B, Figure 6—figure supplement 2).

Figure 6 with 2 supplements
sRNA-RNA interactions identified by CLASH.

(A) Hfq CLASH uncovers sRNA-sRNA interaction networks: comparison between statistically filtered sRNA-sRNA interactions in the Hfq CLASH data, RIL-seq S-chimeras (Melamed et al., 2016) (log and stationary) and RNase E CLASH (Waters et al., 2017). Only independently transcribed sRNAs were considered. (B–C) Heatmaps showing the read density (log2(chimera count+1)) of chimeric fragments mapping to ArcZ (B) and CyaR (C). The location of the known sRNA seed sequences as well as the predicted new CyaR seed is indicated above the heatmap. Note that the ArcZ processing site is located just upstream of the seed sequence.
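The read density plotted in the heatmaps, log2(chimera count+1), can be illustrated with a short sketch. The coordinate convention (0-based, end-exclusive) and the function name are assumptions made for this example; the actual pipeline may count fragments differently.

```python
from math import log2


def chimera_density(length, fragments):
    """Per-nucleotide log2(count + 1) density of chimeric fragments.

    length:    length of the sRNA in nucleotides
    fragments: iterable of (start, end) positions on the sRNA,
               0-based and end-exclusive (assumed convention)
    """
    counts = [0] * length
    for start, end in fragments:
        # Clip fragments to the sRNA boundaries before counting.
        for pos in range(max(start, 0), min(end, length)):
            counts[pos] += 1
    return [log2(c + 1) for c in counts]


# Toy example: two overlapping fragments on a 5-nt RNA.
density = chimera_density(5, [(0, 3), (2, 5)])
```

Adding 1 before taking the logarithm keeps positions without any chimeric reads at a density of 0 instead of an undefined value.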

The sRNA-sRNA chimeras containing CyaR fragments were of particular interest, as the sRNA is primarily expressed during the transition from late exponential to stationary phase (De Lay and Gottesman, 2009). While 30% of the CyaR chimeras contained the known seed sequence (De Lay and Gottesman, 2009; Papenfort et al., 2008), the majority of the chimeras contained a ~25 nt fragment in the 5’ region of CyaR, which was also frequently recovered in RNase E CLASH data (Waters et al., 2017; Figure 6C; Figure 6—figure supplement 2), suggesting that this region represents a bona fide interaction site. Notably, the ArcZ-CyaR chimeras contained the seed sequences from both sRNAs (Figure 6—figure supplement 2) and these were detected specifically in the transition phase (Figure 6A; Figure 6—figure supplement 1).

To validate the predicted in vivo interaction between ArcZ and CyaR (Figure 7A), we used an E. coli plasmid-based assay that is routinely used to monitor sRNA-sRNA interactions and expression of their target mRNAs (Melamed et al., 2016; Miyakoshi et al., 2015b; Tree et al., 2014). An advantage of this system is that each sRNA is uncoupled from the chromosomally encoded regulatory networks (which were thought to act largely in a 1:1 stoichiometry), allowing the specific effects of the sRNA on its target RNA to be assessed (Miyakoshi et al., 2015b). Importantly, these sRNAs were induced during early exponential growth phase when the endogenous (processed) ArcZ and CyaR sRNAs are only detectable at very low levels (Figure 7B, lanes 1, 2, 5, 7). The RT-qPCR data for each pZE-expressed sRNA were subsequently normalized to the results obtained with the pJV300 control to calculate fold changes in expression levels. It has recently been shown that sRNAs can also function as ‘decoys’ or ‘sponges’ that can divert other sRNAs away from their mRNA targets (Azam and Vanderpool, 2015; Figueroa-Bossi and Bossi, 2018; Kavita et al., 2018). This mode of ‘regulating the regulator’ often results in cross-talk between pathways (reviewed in Figueroa-Bossi and Bossi, 2018). We hypothesized that the ArcZ-CyaR interaction may represent such a sponging activity. However, since it is difficult to predict directly from the CLASH data which sRNA in each pair acts as the decoy/sponge, we tested both directions. ArcZ over-expression not only decreased the expression of its mRNA targets (tpx, sdaC) by more than 50%, but also that of CyaR (Figure 7C, panel I; Figure 7D, panel I). Concomitantly, we observed a substantial increase in CyaR targets nadE and yqaE (Figure 7C, panel I). CyaR over-expression reduced the level of a direct mRNA target (nadE) by ~40% but it did not significantly alter the level of ArcZ or ArcZ mRNA targets (tpx and sdaC; Figure 7C, panel II). Notably, in this two-plasmid assay CyaR was not expressed at levels higher than ArcZ (Figure 7D, panel II). Therefore, it is plausible that under the tested conditions the CyaR over-expression was not sufficient to see an effect on ArcZ. However, we find this unlikely, as over-expression of CyaR also did not significantly affect endogenous ArcZ levels, which were ~80 fold less abundant than CyaR in this experiment (Figure 7D, panel III). The qPCR results were also confirmed by Northern blot analyses (Figure 7B, lanes 1–8), which confirmed the reduction in CyaR levels upon ArcZ over-expression and demonstrated that ArcZ processing was not affected upon CyaR over-expression. These results suggest that the regulation is unidirectional, reminiscent of what has been described for Qrr3 in Vibrio harveyi (Feng et al., 2015).
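The normalization of RT-qPCR data to a reference transcript and to the pJV300 control can be sketched as below. The text does not state which quantification model was used, so this example assumes the common 2^-ΔΔCt method with ~100% amplification efficiency; the function names and Ct values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def fold_change(ct_target, ct_ref, ct_target_ctrl, ct_ref_ctrl):
    """Fold change by the 2^-ddCt method (an assumption, see lead-in).

    ct_target / ct_ref:           Ct of the target and of the reference
                                  (e.g. 5S rRNA) in the over-expression strain
    ct_target_ctrl / ct_ref_ctrl: the same measurements in the control
                                  strain (e.g. carrying pJV300)
    """
    d_ct_sample = ct_target - ct_ref
    d_ct_control = ct_target_ctrl - ct_ref_ctrl
    return 2 ** -(d_ct_sample - d_ct_control)


def mean_and_sem(values):
    """Mean and standard error of the mean across biological replicates."""
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical Ct values for three biological replicates.
replicates = [
    fold_change(25.0, 10.0, 24.0, 10.0),
    fold_change(25.2, 10.1, 24.0, 10.0),
    fold_change(24.9, 10.0, 24.1, 10.1),
]
avg, sem = mean_and_sem(replicates)
```

A fold change below 1 corresponds to a reduction relative to the control; a value of 0.5, for instance, is a 50% decrease.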

ArcZ can influence CyaR levels.

(A) Base-pairing interactions predicted from the ArcZ-CyaR chimeras using RNACofold. The nucleotide substitutions for experimental validation of direct base-pairing are shown as red or green residues. (B) Northern blot analysis of ArcZ and CyaR. The cells containing both the empty pZA and pJV300 plasmids (lanes 1, 5, 7) do not express ArcZ and CyaR at detectable levels. (C) Validation of ArcZ-CyaR interaction by over-expression analyses. ArcZ (panel I) or CyaR (panel II) was over-expressed and the levels of their targets were monitored by RT-qPCR. The tpx and sdaC mRNAs are ArcZ targets (panel I). The nadE and yqaE mRNAs are CyaR targets (panel II). The dashed horizontal line indicates the level in the control plasmid (pJV300) that expresses a ~50 nt randomly generated RNA sequence. Panel III: The sRNAs and mutants (as in (A)) were ectopically co-expressed in E. coli and CyaR and CyaR 38–39 levels were quantified by RT-qPCR. Experiments were performed in biological and technical triplicates; error bars indicate the standard error of the mean (SEM) of the three biological replicates. (D) ArcZ and CyaR were overexpressed from a plasmid-borne IPTG inducible promoter (pZE-ArcZ and pZE-CyaR) and the data were compared to data from cells carrying plasmid pJV300. The co-expressed candidate target sRNAs (expressed from pZA-derived backbone) were induced with anhydrotetracycline hydrochloride (panels I and II). The bars indicate the mean fold-change in expression relative to the level of 5S rRNA (rrfD) in cells with the indicated vector. In panel III endogenous ArcZ levels were measured upon over-expression of CyaR. Error bars indicate the standard error of the mean from three biological replicates and three technical replicates per experiment. Source data are provided as a Source Data file.

To provide additional support for direct interactions between these sRNAs, we generated mutations in the seed sequences of the sRNAs analysed here (Figure 7A). We found that two G to C nucleotide substitutions in ArcZ were sufficient to disrupt ArcZ regulation of CyaR (Figure 7C panel III; ArcZ 70–71 + CyaR). Unexpectedly, the wild-type ArcZ was also able to effectively suppress the CyaR seed mutant (Figure 7C panel III; ArcZ + CyaR 38–39). We predict that the wild-type ArcZ can still form stable base-pairing interactions with the CyaR mutant. Nevertheless, regulation by the ArcZ 70–71 mutant was almost fully restored when complementary mutations were introduced in the CyaR region (Figure 7C panel III; ArcZ 70–71 + CyaR 38–39), providing additional evidence that these sRNAs base-pair in vivo. Furthermore, the data also demonstrate that it is very unlikely that the observed changes in CyaR levels were the result of Hfq redistribution due to over-expression of ArcZ (Moon and Gottesman, 2011; Papenfort et al., 2009), as the ArcZ seed mutant stably accumulated (and therefore effectively binds Hfq), but did not affect CyaR levels (Figure 7C panel III).

These results, together with the CLASH data, imply that ArcZ and CyaR base-pair in vivo, and that this interaction could lead to a reduction in CyaR levels but not vice versa.

Hfq CLASH identifies novel sRNAs in untranslated regions

Two lines of evidence from our data indicate that many other mRNAs may be harbouring sRNAs in their UTRs or be involved in base-pairing among themselves. First, around 10% of the unique intermolecular chimeras mapped to mRNA-mRNA interactions (Figure 2A). Secondly, we observed extensive binding of Hfq in 3’UTRs near transcriptional terminators (Figure 1—figure supplement 3A–B), indicating that, as in Salmonella, the E. coli 3’UTRs may harbour many functional sRNAs (Chao et al., 2017). We identified 116 3’UTR-containing mRNA fragments that were involved in 507 interactions (represented by a total of 3149 unique chimeras). Eighteen of these 3’UTR fragments were also identified in 3’UTR-mRNA chimeric reads in the RIL-seq S-chimeras data (Melamed et al., 2016) and 10 appeared stabilised upon transient inactivation of RNase E performed in Salmonella (TIER-seq data; Chao et al., 2017; Figure 8A, Supplementary files 5 and 6). For several of the putative 3’UTR-derived sRNAs, complementary sequence motifs in the mRNA fragments were identified, including motifs for the putative sRNA derived from the 3’UTR of ahpF (Figure 4C–D; Figure 8—figure supplements 1–3). Out of the 507 3’UTR-mRNA interactions, 75 were 3’UTRs fused to 5’UTRs of mRNAs, suggesting that these may represent 3’UTR-derived sRNAs that base-pair with 5’UTRs of mRNAs, a region frequently targeted by sRNAs (Supplementary files 5 and 6). Strikingly, 233 interactions (2094 unique chimeras) contained the 3’UTR fragment of cpxP, 51 (812 chimeras) of which were also found in the RIL-seq data (Supplementary file 6). In Salmonella, cpxP harbours the CpxQ sRNA (Chao and Vogel, 2016). Our analyses greatly increased the number of potential CpxQ mRNA targets and show that the vast majority of CpxQ interactions take place during the transition and stationary phases (Supplementary file 6). Motif analyses of the putative CpxQ mRNA targets, including those identified in the interactions unique to CLASH, revealed two highly enriched G-rich sequence motifs that showed strong sequence complementarity to the known seed sequences (Figure 8—figure supplement 2).

Figure 8 with 6 supplements
Hfq CLASH uncovers novel 3’UTR-derived sRNAs.

(A) Genes with their 3’UTRs found fused to mRNAs were selected from the statistically filtered CLASH data and RIL-seq S-chimera data. The RIL-seq S-chimeras (Melamed et al., 2016) (log and stationary phases) were filtered for 3’UTR/EST3UTR annotations on either orientation of the mRNA-mRNA pairs. Both were intersected with the set of mRNAs that were predicted by TIER-seq studies (Chao et al., 2017) to harbour sRNAs that get released from 3’UTRs by RNase E processing. Known (CpxQ, SdhX, MicL, GadF, glnA-3’UTR and SroC) and novel 3’UTR-derived sRNAs (MalH, flgL 3’UTR, ahpF-3’UTR and YgaN) are indicated. See Supplementary file 5 for the detailed comparison. (B) MalH is transiently expressed during the transition from exponential to stationary phase. RybB was probed as a sRNA positive control and 5S rRNA as the loading control. See Figure 8—figure supplement 4 for full-size blots. (C) Genome-browser snapshots of several regions containing candidate sRNAs for optical densities at which the RNA steady-state was maximal; the mRNA names and OD600 are indicated at the left side of the y-axes; the y-axis shows the normalized reads (RPM: reads per million); red: RPM of RNA steady-states from an RNA-seq experiment, blue: Hfq cross-linking from a CLASH experiment; black: unique chimeric reads found in this region.

We identified six mRNA 3’UTRs that were uncovered in all three (Hfq CLASH, RIL-seq and TIER-seq) datasets (Figure 8A), suggesting they likely contain sRNAs released from 3’UTRs by RNase E processing. Northern blot analyses confirmed the presence of sRNAs in malG, ygaM and gadE 3’UTRs (Figure 8B, Figure 8—figure supplement 4). We predict that the 3’UTR of ygaM harbours a ~100 nt sRNA (hereafter referred to as YgaN; Figure 8—figure supplement 4) and robust Hfq cross-linking could be detected in this region (Figure 8C).

The gadE 3’UTR was also detected in the RIL-seq data and experimentally confirmed and annotated as GadF (Melamed et al., 2016). Remarkably, even though we only recovered 23 unique GadF-mRNA interactions, two distinct complementary sequence motifs (CCAGGGG and CUGGUG) were identified in mRNA fragments of these chimeras, the former of which was not previously detected (Figure 8—figure supplement 3). Again, these complementary mRNA motifs were also enriched in interactions uniquely identified by CLASH (Figure 8—figure supplement 3). For two other 3’UTR-derived sRNAs (MicL and SdhX), we recovered 13 and 9 interactions with mRNAs, respectively (Figure 8—figure supplement 5). MicL was previously shown to repress the synthesis of the Lpp outer membrane protein (Guo et al., 2014). Lpp mRNA fragments were most frequently found in MicL chimeras (15; Figure 8—figure supplement 5A). The in silico folded structure of the MicL-lpp chimeras is in excellent agreement with the previously proposed interaction between MicL and lpp (Figure 8—figure supplement 5B; Guo et al., 2014). SdhX is involved in linking acetate metabolism with the TCA cycle (De Mets et al., 2019; Miyakoshi et al., 2018). Our data predict over a dozen SdhX interactions, several of which had not been previously described (Figure 8—figure supplement 5C). We recovered two SdhX interactions with known mRNA targets (ackA and katG; Figure 8—figure supplement 5D; De Mets et al., 2019; Miyakoshi et al., 2018). Interestingly, the SdhX-ackA interaction was detected in the exponential phase, whereas the SdhX-katG interaction appeared specifically during stationary phase. Although the number of chimeras supporting these interactions was relatively low (katG: two chimeras; ackA: three chimeras), the in silico predicted interactions between the two halves of these chimeras are fully consistent with previously published work (De Mets et al., 2019; Miyakoshi et al., 2018). These results reinforce the idea that Hfq CLASH recovers genuine interactions.

To substantiate our 3’UTR-derived sRNA candidate prediction, we analysed RNA-seq data from a study that used Terminator 5’-Phosphate Dependent Exonuclease (TEX) to map transcription start sites (TSS) of coding and non-coding RNAs in E. coli (Thomason et al., 2015). TEX degrades processed transcripts that have 5’ monophosphates, but not primary transcripts with 5’ triphosphates. Therefore, these data enabled us to determine whether (a) a short RNA was detected in the 3’UTR and (b) these were generated by RNase-dependent processing (TEX sensitive) or originated from an independent promoter (TEX insensitive). For 47 of the 126 predicted 3’UTR-derived sRNAs we found strong evidence for the presence of sRNAs in the TEX data (Figure 8—figure supplement 6, Supplementary file 5 and see Data and Code availability). The TEX data indicate that ygaM has (at least) two promoters, one of which is located near the 3’ end of the gene that we predict is the TSS for YgaN (Figure 8—figure supplement 6A). Furthermore, we speculate that YgaN is processed by RNases. This is based on the observation that multiple YgaN species were detected in the Northern blot analyses (Figure 8—figure supplement 4) and the TEX data indicate that shorter YgaN RNAs are sensitive to TEX treatment (Figure 8—figure supplement 6A).
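The logic of the TEX comparison can be sketched as a simple rule on read counts at a candidate 5’ end. The thresholds, labels and function name below are hypothetical choices made for illustration; they are not the criteria used in the analysis described above.

```python
def classify_5prime_end(reads_minus_tex, reads_plus_tex,
                        min_reads=10, ratio_cutoff=0.5):
    """Classify a candidate sRNA 5' end from untreated and TEX-treated libraries.

    reads_minus_tex: reads starting at this position without TEX treatment
    reads_plus_tex:  reads starting at this position after TEX treatment
    min_reads, ratio_cutoff: hypothetical thresholds (assumptions)
    """
    if reads_minus_tex < min_reads:
        # Too few reads to say that a short RNA is present at all.
        return "undetected"
    ratio = reads_plus_tex / reads_minus_tex
    if ratio >= ratio_cutoff:
        # Signal survives TEX: consistent with a 5' triphosphate,
        # i.e. a primary transcript from its own promoter.
        return "primary (TEX insensitive)"
    # Signal is lost upon TEX treatment: consistent with a
    # 5' monophosphate generated by RNase processing.
    return "processed (TEX sensitive)"
```

Under these assumptions, a 5’ end with 200 reads in the untreated library and 180 after treatment would be called primary, whereas one dropping from 200 to 20 reads would be called processed.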

The majority of the sRNAs we analysed are more abundant at higher cell densities (including GadF, YgaN and RybB; see Figure 8B). In sharp contrast, the sRNA derived from the 3’UTR of the malG mRNA (MalH) was expressed very transiently and peaked at an OD600 of 1.8 (Figure 8B). We envisage that the particularly transient expression of this sRNA may be associated with a role in the adaptive responses triggered during transition from exponential to stationary phases of growth.

Discussion

Microorganisms need to constantly adapt their transcriptional program to meet changes in their environment, such as changes in temperature, cell density and nutrient availability. In bacteria, small RNAs (sRNAs) and their associated RNA-binding proteins play a key role in this process. By controlling translation and degradation rates of mRNAs in response to stress (Holmqvist and Wagner, 2017; Nitzan et al., 2017; Shimoni et al., 2007), they can regulate the kinetics of gene expression as well as suppress noisy signals (Beisel and Storz, 2011), enabling organisms to more efficiently adapt to environmental changes. A major challenge for bacteria is the transition from exponential growth to stationary phase, when the most favourable nutrients become limiting. To counteract this challenge, cells need to rapidly remodel their transcriptome to efficiently metabolize alternative carbon sources. This transition is highly dynamic and involves both activation and repression of diverse metabolic pathways. However, it is unclear to what degree sRNAs contribute to this transition. The most useful piece of information would be to know what sRNAs are upregulated during this transition phase and to identify their RNA targets. This would help to uncover the regulatory networks that govern this adaptation, as well as provide a starting point for more detailed functional analyses on sRNAs predicted to play a key role in this process. For this purpose, we performed UV cross-linking, ligation and sequencing of hybrids (CLASH; Kudla et al., 2011) to unravel the sRNA base-pairing interactions during this transition. Using Hfq as a bait we uncovered thousands of unique sRNA base-pairing interactions. We identified almost 1700 novel sRNA-mRNA interactions represented by over 18000 unique chimeras, and 200 novel sRNA-sRNA interactions, compared to previously published work (Melamed et al., 2016; Waters et al., 2017).

Hfq CLASH

Our earlier S. cerevisiae Cross-linking and cDNA analysis data (CRAC; Granneman et al., 2009) showed that a percentage of the cDNAs were formed by intermolecular ligations of two RNA fragments (chimeras) known to base pair in vivo (Kudla et al., 2011). These findings prompted us to develop a refined protocol to enrich for sRNA-target chimeric reads using Hfq as an obvious bait. The initial Hfq UV cross-linking data (CRAC; Tree et al., 2014) did not yield sufficiently high numbers of chimeric reads to extract new biological insights. In line with observations from other groups (Bandyra et al., 2012; Bruce et al., 2018; Morita et al., 2005), it was proposed that duplexes formed by Hfq are rapidly transferred to the RNA degradosome. This can cause an extensive reduction in the likelihood of capturing sRNA-target interactions with Hfq using CLASH (Waters et al., 2017). However, a recent study demonstrated that Hfq can be used effectively as a bait to enrich for sRNA-target duplexes under lower stringency purification conditions suggesting that sRNA-mRNA duplexes are sufficiently stable on Hfq during purification (Melamed et al., 2016). This encouraged us to further optimize the CLASH method. We made a number of changes to the protocol that enabled us to recover a large number of chimeric reads, many of which represent sRNAs base-paired to potential targets (detailed in Materials and methods). We shortened various incubation steps to minimize RNA degradation and performed very long and stringent washes after bead incubation steps to remove any background binding of non-specific proteins and RNAs. Crucially, we very carefully controlled the RNase digestion step that is used to trim the cross-linked RNAs prior to making cDNA libraries, ensuring the recovery of longer chimeric RNA fragments. The resulting cDNA libraries were paired-end sequenced to increase the recovery of chimeric reads with high mapping scores from the raw sequencing data. 
These modifications led to a substantial improvement in the recovery of chimeric reads (8.6% compared to 0.001%; 0.47% were intermolecular chimeras).

Both RIL-seq and Hfq CLASH have advantages and disadvantages and are highly complementary. A major strength of CLASH, however, is that the purification steps are performed under highly stringent and denaturing conditions. During the first FLAG affinity purification steps, the beads are extensively washed with high-salt buffers and the second Nickel-affinity purification step is done under denaturing conditions (6 M guanidinium hydrochloride). These stringent purification steps can significantly reduce noise by strongly enriching for RNAs covalently cross-linked to the bait protein (Granneman et al., 2009). Indeed, we show that Hfq CLASH can generate high-quality RNA-RNA interaction data with low background: only a few hundred chimeric reads were found in the control datasets, compared to the over 50,000 chimeras that co-purified with Hfq. The RIL-seq library preparation protocol uses an rRNA depletion step to remove contaminating ribosomal RNA. For Hfq CLASH this is not necessary, and we show that chimeras containing rRNA fragments, which presumably represent noise, are not very abundant in our data (Figure 2—figure supplements 1, 2, 4 and 5). Our library preparation protocol also includes the use of random nucleotides in adapter sequences to remove potential PCR duplicates (‘collapsing’) from the data.
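The idea behind ‘collapsing’ can be illustrated with a minimal sketch: reads that share both the random adapter nucleotides and the insert sequence are treated as PCR copies of one original molecule. The input format and function name are assumptions made for this example and do not describe the software used for the actual data processing.

```python
def collapse_duplicates(reads):
    """Keep one read per (random barcode, insert sequence) combination.

    reads: iterable of (barcode, insert) tuples, where `barcode` is the
           stretch of random nucleotides read from the adapter and
           `insert` is the cDNA sequence (assumed input format).
    Returns the unique reads in their order of first appearance.
    """
    seen = set()
    unique = []
    for barcode, insert in reads:
        key = (barcode, insert)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


# Toy example: the first two reads are identical, including the barcode,
# and are counted once; the third has a different barcode and is kept.
example_reads = [("ACG", "GGATCCTTA"), ("ACG", "GGATCCTTA"), ("TTA", "GGATCCTTA")]
collapsed = collapse_duplicates(example_reads)
```

Identical inserts with different random barcodes are retained because they most likely derive from independent ligation events, which is why the counts of unique chimeras remain meaningful after PCR amplification.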

The very stringent purification conditions used in CLASH could, in some cases, also be a disadvantage as it completely relies on UV cross-linking to isolate directly bound RNAs. In cases where the efficiencies of protein-RNA cross-linking are low (for example, in the case of proteins that only recognize double-stranded RNA), RIL-seq may be a better approach as it does not completely rely on UV cross-linking (Melamed et al., 2016).

A large number of interactions were unique to either the RIL-seq or the Hfq CLASH datasets, which we believe can be explained by a number of technical and experimental factors. The denaturing purification conditions used with CLASH completely disrupt the Hfq hexamer (Tree et al., 2014 and this work). Therefore, during the adapter ligation reactions the RNA ends are likely more accessible for ligation. In support of this, in the RIL-seq data, the sRNAs are mostly found in the second half of the chimeras (Melamed et al., 2016), whilst in the Hfq CLASH data we observe sRNA fragments almost equally distributed between both sides (45% in the left fragment and 55% in the right fragment). Indeed, it was proposed that in RIL-seq the 3’ end of the sRNA is buried in the hexamer and therefore not always accessible for ligation (Melamed et al., 2016).

For the RIL-seq experiments, the authors harvested the cells at 4°C and resuspended them in ice-cold PBS prior to UV irradiation (Melamed et al., 2018; Melamed et al., 2016). This procedure results in a cold-shock that can affect the sRNA-interactome as well as sRNA stability. We cross-link actively growing cells in their growth medium and we UV irradiate our cells within seconds using the Vari-X-linker we recently developed (van Nues et al., 2017). We use filtration devices to rapidly harvest our cells (less than 30 seconds) and the cells are subsequently stored on the filters at −80°C. We previously showed that filtration, combined with short UV cross-linking times, dramatically reduces noise introduced by the activation of the DNA damage response and significantly increases the recovery of short-lived RNA species (van Nues et al., 2017). We speculate that many of the interactions that are unique to our Hfq CLASH data represent short-lived RNA duplexes that are preferentially captured with our UV cross-linking and rapid cell filtration setup.

Biological significance of the interactions

One important question that needs to be addressed in the field is how many of the interactions that are recovered by high-throughput RNA-RNA interactome methodologies represent physiologically or biologically relevant base-pairing interactions. The analysis of the RIL-seq (Melamed et al., 2016) and our CLASH data showed that the predicted mRNA targets did not frequently show significant changes in gene expression upon over-expression of the sRNA. It is, of course, possible that sRNA base-pairing mostly affects mRNA translation and mRNA stability to a lesser extent. Hence, approaches other than over-expression analyses may need to be included to verify the interaction networks. Ribosome profiling analyses on mutant strains should be helpful in determining whether the absence of the sRNA alters the association of mRNA targets with ribosomes (Guo et al., 2014; Wang et al., 2015), however, this is also a method not without challenges (Mohammad et al., 2019). Whilst this work was in progress, the Margalit group presented compelling evidence suggesting that many mRNA targets compete for Hfq and that the binding efficiency of Hfq to the targets primarily determines the regulatory outcome (Faigenbaum-Romm et al., 2020). Those mRNAs that were significantly affected by sRNA over-expression were also more frequently and reproducibly found in chimeras with the sRNA. This offers a plausible explanation for why we did not always observe enrichment of differentially expressed genes in putative mRNA targets recovered in a relatively low number of chimeras. Another aspect to consider is that over-expression of sRNAs will not only impact the direct targets. For example, over-expression of ArcZ in Salmonella revealed widespread changes in gene expression, presumably as a result of redistribution of Hfq over the transcriptome (Papenfort et al., 2009). 
As a result, a relatively small fraction of the differentially expressed genes will be represented in the CLASH/RIL-seq data, resulting in poor p-values.

One could argue that some of the interactions we present here may represent weak or stochastic interactions that do not have a biological function. For example, sRNAs can cycle on Hfq (reviewed in Santiago-Frangos and Woodson, 2018) and it is therefore conceivable that some of the sRNA-sRNA chimeras detected in our CLASH data happen to be two sRNAs that were in close proximity during their exchange on Hfq. Although it is not possible to quantify the number of such interactions, we would argue that they are not very abundant in our data. We purified Hfq and cross-linked RNAs under very stringent and completely denaturing conditions before performing the intermolecular ligation reactions. Because our purification conditions completely disrupt the Hfq hexamer (this work and Tree et al., 2014), transient interactions that do not involve (significant) base-pairing would only be detected if an Hfq monomer was UV cross-linked to both sRNAs simultaneously and if the available 5’ and 3’ ends are in close proximity. Considering the poor efficiency of UV cross-linking, the likelihood of this happening is very low. Secondly, we show that our chimeras, including those that are supported by only a few reads, have a high propensity to form stable duplexes in silico (Figure 3). Finally, for many sRNAs, we identified enriched sequence motifs in predicted mRNA targets that have significant sequence complementarity to sRNA seeds (Figure 4, Figure 4—figure supplements 1–12, Figure 8—figure supplements 1–3). Thus, we conclude that with the CLASH protocol weaker or stochastic interactions are not easily recovered. While the CLASH and the RIL-seq analyses agree that for some sRNAs the more frequent interactions are more likely to affect target mRNA stability, they also highlight that low-abundance interactions have strong complementarity and base-pairing potential, and are thus genuine. The biological significance of these is yet to be determined, but one possibility is that many low-frequency interactions occur to confer robustness to the regulation of a few principal targets (Jost et al., 2013) and we speculate that these principal targets are condition-specific.

Surprisingly, for ArcZ and CyaR, even some of the mRNA targets found in a larger number of chimeric fragments were not differentially expressed. Possible explanations include their regulation at the protein synthesis level, but not at the RNA level, or control by fine-tuning, which would result in modest or undetectable changes in transcript levels.

sRNA-sRNA interactions; ArcZ regulation of CyaR

One of the most striking observations of our global study was the abundance of sRNA-sRNA interactions in E. coli, many of which were growth-stage dependent. We experimentally validated the ArcZ-CyaR interaction, which involves the known seed sequence of ArcZ and the 5’ end of CyaR. We demonstrate that ArcZ over-expression can reduce CyaR steady state levels but not vice versa, implying the regulation is unidirectional. Consistent with our findings, in Salmonella, over-expression of ArcZ showed a dramatic reduction in CyaR bound to Hfq and upregulation of CyaR targets, such as nadE (Papenfort et al., 2009). This suggests that this activity is conserved between these two Gram-negative bacteria. A similar type of unidirectional regulation has also been elegantly demonstrated for the Qrr3 sRNA of Vibrio harveyi (Feng et al., 2015). The fate of these sRNA-sRNA duplexes may depend on the position of the interaction: it was shown that if the interaction with Qrr3 involves its stabilizing 5’ stem-loop structure, the sRNA will be preferentially degraded (Feng et al., 2015). Consistent with this, folding of the chimeric reads suggests that ArcZ preferentially base-pairs with the 5’ end of CyaR (Figure 6C and Figure 7A). This may destabilize secondary structures that normally help to stabilize the sRNA.

The biological significance of ArcZ regulating CyaR remains unclear; however, a possible function could be to reduce noise in CyaR expression by preventing CyaR levels from overshooting during the transition phase. ArcZ and CyaR target mRNAs are associated with many different processes. Thus, these interactions are expected to connect multiple pathways. For example, ArcZ regulation of CyaR may connect adaptation to stationary phase/biofilm development (De Lay and Gottesman, 2009; Monteiro et al., 2012) to quorum sensing and cellular adherence (De Lay and Gottesman, 2009). CyaR expression is controlled by the global regulator Crp. Most of the genes controlled by Crp are involved in transport and/or catabolism of amino acids or sugar. Interestingly, ArcZ downregulates the sdaCB dicistron, which encodes proteins involved in serine uptake and metabolism (Papenfort et al., 2009). This operon has been shown to be regulated by Crp as well, suggesting that ArcZ can counteract the activity of Crp.

Materials and methods

Bacterial strains and culture conditions

An overview of the bacterial strains used in this study is provided in the Key Resources Table. The E. coli MG1655 (Blattner et al., 1997) and TOP10F’ strains served as parental strains. The E. coli K12 strain used for CLASH experiments, MG1655 hfq::HTF was previously reported (Tree et al., 2014). Cells were grown in Lysogeny Broth (LB) at 37°C under aerobic conditions with shaking at 200 rpm. The media were supplemented with antibiotics where required at the following concentrations: chloramphenicol (Corning, –S, C239RI) - 25 µg/ml and kanamycin (Gibco, US,–11815–024) - 50 µg/ml. For induction of sRNA expression from plasmids, 1 mM IPTG, or 200 nM anhydrotetracycline hydrochloride (Sigma, 1035708–25 MG) were used.

Construction of sRNA expression plasmids

The plasmids used in this study are listed in the Key Resources Table. The gene fragments and primers used for cloning procedures in this work are provided in Supplementary file 10. For the sRNA over-expression constructs, the sRNA gene of interest was cloned at the transcriptional +1 site under PlacO control by amplifying the pZE12luc plasmid (Expressys) by inverse PCR using Q5 DNA Polymerase (NEB). The sRNA genes and seed mutants were synthesized as ultramers (IDT; Supplementary file 10) which served as the forward primers. The reverse primer (oligo pZE12_5P_rev, Supplementary file 10) bears a monophosphorylated 5’ end to allow blunt-end self-ligation. The PCR reaction was digested with 10U DpnI (NEB) for 1 hr at 37°C and purified by ethanol precipitation. The linear PCR product was circularized by self-ligation, and transformed in E. coli TOP10F’ competent cells. Positive transformants were screened by Sanger sequencing (Edinburgh Genomics, Edinburgh, UK). Small RNA over-expression constructs derived from the pZA21MCS (Expressys) were generated identically, using the indicated ultramers in Supplementary file 10 as forward primers and oligo pZA21MCS_5P_rev as the reverse primer.

Hfq UV cross-linking, ligation and analysis of hybrids (Hfq-CLASH)

CLASH was performed essentially as described (Waters et al., 2017), with a number of modifications including changes in incubation steps, cDNA library preparation, reaction volumes and UV cross-linking. E. coli expressing the chromosomal Hfq-HTF were grown overnight in LB at 37°C with shaking (200 rpm), diluted to starter OD600 0.05 in fresh LB, and re-grown with shaking at 37°C in 750 ml LB. A volume of culture equivalent to 80 OD600 per ml was removed at the following cell-densities (OD600): 0.4, 0.8, 1.2, 1.8, 2.4, 3.0 and 4.0, and immediately subjected to UV (254 nm) irradiation for 22 s (~500 mJ/cm2) in the Vari-X-linker (van Nues et al., 2017) (https://www.vari-x-link.com). Cells were harvested using a rapid filtration device (van Nues et al., 2017) (https://www.vari-x-link.com) onto 0.45 μm nitrocellulose filters (Sigma, UK, HAWP14250) and flash-frozen on the membrane in liquid nitrogen. Membranes were washed with ~15 ml ice-cold phosphate-buffered saline (PBS), and cells were harvested by centrifugation. Cell pellets were lysed by bead-beating in 1 vol per weight TN150 buffer (50 mM Tris pH 8.0, 150 mM NaCl, 0.1% NP-40, 5 mM β-mercaptoethanol) in the presence of protease inhibitors (Roche, A32965), and 3 volumes 0.1 mm Zirconia beads (Thistle Scientific, 11079101z), by performing five cycles of 1 min vortexing followed by 1 min incubation on ice. One additional volume of TN150 buffer was added. To reduce the viscosity of the lysate and remove contaminating DNA, the lysate was incubated with RQ1 DNase I (10 U/ml Promega, M6101) for 30 min on ice. Two additional volumes of TN150 were added and mixed with the lysates by vortexing. The lysates were centrifuged for 20 min at 4000 rpm at 4°C and subsequently clarified by a second centrifugation step at 13.4 krpm, for 20 min at 4°C. Purification of the UV cross-linked Hfq-HTF-RNA complexes and cDNA library preparation were performed as described (Granneman et al., 2009).
Cell lysates were incubated with 50 μl of pre-equilibrated M2 anti-FLAG beads (Sigma, M8823-5ML) for 1–2 hr at 4°C. The anti-FLAG beads were washed three times 10 min with 2 ml TN1000 (50 mM Tris pH 7.5, 0.1% NP-40, 1 M NaCl) and three times 10 min with TN150 without protease inhibitors (50 mM Tris pH 7.5, 0.1% NP-40, 150 mM NaCl). For TEV cleavage, the beads were resuspended in 250 μl of TN150 buffer (without protease inhibitors) and incubated with home-made GST-TEV protease at room temperature for 1.5 hr. The TEV eluates were then incubated with a fresh 1:100 dilution preparation of RNace-It (RNase A and T1 mixture; Agilent, 400720) for exactly 5 min at 37°C, after which they were mixed with 0.4 g GuHCl (6 M, Sigma, G3272-100G), NaCl (300 mM), and imidazole (10 mM, I202-25G). Note that the RNase digestion needs to be carefully optimized to obtain high-quality cDNA libraries. The samples were then transferred to 50 μl Nickel-NTA agarose beads (Qiagen, 30210), equilibrated with wash buffer 1 (6 M GuHCl, 0.1% NP-40, 300 mM NaCl, 50 mM Tris pH 7.8, 10 mM imidazole, 5 mM β-mercaptoethanol). Binding was performed at 4°C overnight with rotation. The following day, the beads were transferred to Pierce SnapCap spin columns (Thermo Fisher, 69725), washed three times with wash buffer 1 and three times with 1xPNK buffer (10 mM MgCl2, 50 mM Tris pH 7.8, 0.1% NP-40, 5 mM β-mercaptoethanol). The washes were followed by on-column treatment with TSAP (Thermosensitive alkaline phosphatase, Promega, M9910) for 1 hr at 37°C with 8 U of phosphatase in 60 μl of 1xPNK, in the presence of 80U RNasin (Promega, N2115). The beads were washed once with 500 μl wash buffer 1 and three times with 500 μl 1xPNK buffer. To add 3’-linkers (App-PE – Key Resources Table), the Nickel-NTA beads were incubated in 80 μl 3’-linker ligation mix (1xPNK buffer, 1 µM 3’-adapter, 10% PEG8000, 30U Truncated T4 RNA ligase 2 K227Q (NEB, M0351L), 60U RNasin). The samples were incubated for 4 hr at 25°C.
The 5’ ends of bound RNAs were radiolabelled with 30U T4 PNK (NEB, M0201L) and 3 μl 32P-γATP (1.1 µCi; Perkin Elmer, NEG502Z-500) in 1xPNK buffer for 40 min at 37°C, after which ATP (Roche, 11140965001) was added to a final concentration of 1 mM, and the incubation prolonged for another 20 min to complete 5’ end phosphorylation. The resin was washed three times with 500 μl wash buffer 1 and three times with an equal volume of 1xPNK buffer. For on-bead 5’-linker ligation, the beads were incubated for 16 hr at 16°C in 1xPNK buffer with 40U T4 RNA ligase I (NEB, M0204L), and 1 μl 100 μM L5 adapter (Key Resources Table), in the presence of 1 mM ATP and 60U RNasin. The Nickel-NTA beads were washed three times with wash buffer 1 and three times with wash buffer 2 (50 mM Tris–HCl pH 7.8, 50 mM NaCl, 10 mM imidazole, 0.1% NP-40, 5 mM β-mercaptoethanol). The protein-RNA complexes were eluted in two steps in new tubes with 200 μl of elution buffer (wash buffer 2 with 250 mM imidazole). The protein-RNA complexes were precipitated on ice by adding TCA (T0699-100ML) to a final concentration of 20%, followed by a 20 min centrifugation at 4°C at 13.4 krpm. Pellets were washed with 800 μl acetone, and air dried for a few minutes in the hood. The protein pellet was resuspended and incubated at 65°C in 20 μl 1x NuPage loading buffer (Thermo Scientific, NP0007), resolved on 4–12% NuPAGE gels (Thermo Scientific, NP0323PK2) and visualised by autoradiography. The cross-linked protein-RNA complexes were cut directly from the gel and incubated with 160 μg of Proteinase K (Roche, 3115801001) in 600 μl wash buffer 2 supplemented with 1% SDS and 5 mM EDTA at 55°C for 2–3 hr with mixing. The RNA was subsequently extracted by phenol-chloroform extraction and ethanol precipitated. The RNA pellet was directly resuspended in RT buffer and was reverse transcribed in a single reaction with the SuperScript IV system (Invitrogen, 18090010) according to manufacturer’s instructions using the PE_reverse oligo as primer.
The cDNA was purified with the DNA Clean and Concentrator 5 kit (Zymo Research) and eluted in 11 μl DEPC water. Half of the cDNA (5 μl) was amplified by PCR using Pfu Polymerase (Promega, M7745) with the following cycling conditions: 95°C for 2 min; 20–24 cycles of 95°C for 20 s, 52°C for 30 s and 72°C for 1 min; final extension at 72°C for 5 min. The PCR primers are listed in the Key Resources Table. PCR products were treated with 40U Exonuclease I (NEB, M0293L) for 1 hr at 37°C to remove free oligonucleotide and purified by ethanol precipitation or with the DNA Clean and Concentrator 5 kit (Zymo Research, D4003T). Libraries were resolved on a 2% MetaPhor agarose (Lonza, LZ50181) gel and 175–300 bp fragments were gel-extracted with the MinElute kit (Qiagen, 28004) according to manufacturer’s instructions. All libraries were quantified on a 2100 Bioanalyzer using the High-Sensitivity DNA assay and a Qubit 4 (Thermo Scientific, Q33226). Individual libraries were pooled based on concentration and barcode sequence identity. Paired-end sequencing (75 bp) was performed by Edinburgh Genomics on an Illumina HiSeq 4000 platform.

RNA-seq

E. coli MG1655 was cultured, UV-irradiated and harvested as described for the CLASH procedure. Total RNA was extracted using the guanidinium thiocyanate-phenol method. RNA integrity was assessed with the Prokaryote Total RNA Nano assay on a 2100 Bioanalyzer (Agilent, G2939BA). Sequencing libraries from two biological replicates were prepared by NovoGene using the TruSeq library preparation protocol and 150 bp paired-end sequencing was performed on an Illumina NovaSeq 6000 system. This yielded ~7–8 million paired-end reads per sample.

Small RNA over-expression studies

Individual TOP10F’ clones carrying combinations of pZA21- and pZE12-derived sRNA constructs and control plasmids (Key Resources Table) were cultured to OD600 0.1 and expression of sRNAs was induced with IPTG and anhydrotetracycline hydrochloride (Sigma, I6758-1G and 1035708–25 MG) for 1 hr. Cells were collected by centrifugation for 30 s at 14,000 rpm and flash-frozen in liquid nitrogen, and total RNA was isolated as above. Gene expression was quantified by RT-qPCR (see below) using 10 ng total RNA as template, and expressed as fold change relative to the reference sample containing pJV300 (Sittka et al., 2007) or empty pZA21.

RT-qPCR

Total RNA (10 µg) was treated with 2 U of Turbo DNase (Thermo Scientific, AM2238) for 1 hr at 37°C in a 10 μl reaction in the presence of 2 U SUPERase-In RNase inhibitor (Thermo Scientific, AM2694). The RNA was purified with RNAClean XP beads (Beckman Coulter, A63987). Quantitative PCR was performed on 10 ng of DNase-treated total RNA using the Luna Universal One-Step RT-qPCR Kit (NEB, E3005E) according to manufacturer’s instructions. The qPCRs were run on a LightCycler 480 (Roche), and the specificity of the product was assessed by generating melt curves, as follows: 65°C for 60 s, then 95°C (0.11 ramp rate with five acquisitions per °C, continuous). The data analyses were performed with the IDEAS2.0 software, at default settings: Absolute Quantification/Fit Points for Cp determination and Melt Curve Genotyping. The RT-qPCR for all samples was performed in technical triplicate. Outliers from the samples with technical triplicate standard deviations of Cp >0.3 were discarded from the analyses. To calculate the fold-change relative to the control, the 2^(-ΔΔCp) method was employed, using 5S rRNA (rrfD) as the reference gene. Experiments were performed for three biological replicates, and the mean fold-change and standard error of the mean were computed. Unless otherwise stated, significance of the fold-change difference compared to the reference sample control (for which fold-change = 1 by definition) was tested with a one-sample t-test.
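The fold-change calculation described above can be sketched as follows. This is a minimal illustration, not the IDEAS2.0 workflow: the function names and the input layout (lists of technical-replicate Cp values) are assumptions, and outlier removal and the one-sample t-test are not shown.

```python
from math import sqrt
from statistics import mean, stdev


def fold_change_ddcp(target_sample, ref_sample, target_control, ref_control):
    """Fold change by the 2^(-ddCp) method.

    Each argument is a list of technical-replicate Cp values
    (hypothetical layout): target gene and reference gene (e.g. 5S rRNA)
    in the over-expression sample and in the control sample.
    """
    d_sample = mean(target_sample) - mean(ref_sample)      # dCp, sample
    d_control = mean(target_control) - mean(ref_control)   # dCp, control
    return 2 ** -(d_sample - d_control)                    # 2^(-ddCp)


def mean_and_sem(fold_changes):
    """Mean and standard error of the mean across biological replicates."""
    n = len(fold_changes)
    return mean(fold_changes), stdev(fold_changes) / sqrt(n)
```

A target whose Cp rises by one cycle relative to the control, with an unchanged reference gene, gives a fold change of 0.5.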

Northern blot analysis

Total RNA was extracted from cell lysates by GTC-Phenol extraction. 10 μg total RNA was separated on an 8% polyacrylamide TBE-Urea gel and transferred to a nylon membrane (HyBond N+, GE Healthcare, RPN1210B) by electroblotting for 4 hr at 50 V. Membranes were pre-hybridised in 10 ml of UltraHyb Oligo Hyb (Thermo Scientific, AM8663) for 1 hr and probed with 32P-labeled DNA oligo at 42°C for 12–18 hr in a hybridization oven. The sequences of the probes used for Northern blot detection are detailed in Supplementary file 10. Membranes were washed twice with 2xSSC + 0.5% SDS solution for 10 min and visualized using a Phosphor imaging screen and FujiFilm FLA-5100 Scanner (IP-S mode). For detection of highly abundant species (5S rRNA), autoradiography was used for exposure.

Western blot analyses

Lysates containing 40 µg protein, prepared from E. coli MG1655 hfq::HTF cells that were cultured, cross-linked, harvested and lysed under the same conditions as in the CLASH experiments, were resolved on PAGE gels and transferred to a nitrocellulose membrane. The membranes were blocked for 1 hr in blocking solution (5% non-fat milk in PBST (1X phosphate-buffered saline, 0.1% Tween-20)). To detect Hfq-HTF protein, the membrane was probed overnight at 4°C with the rabbit anti-TAP polyclonal primary antibody (Thermo Fisher, 1:5000 dilution in blocking solution), which recognizes an epitope in the region between the TEV-cleavage site and His6. For the loading control we used a rabbit polyclonal anti-GroEL primary antibody (Abcam, 1:150,000 dilution, ab82592) for 2 hr at room temperature. After 3 × 10 min PBST washes, the membranes were blotted for one hour with a goat anti-rabbit IgG H and L (IRDye 800) secondary antibody (Abcam, ab216773, 1:10,000 in blocking solution) at room temperature. Finally, after three 10 min PBST washes, the blot was rinsed in PBS, and the proteins were visualised with a LI-COR (Odyssey CLx) using the 800 nm channel and scan intensity 4. Image acquisition and quantifications were performed with the Image Studio Software.

Computational analysis

Pre-processing of the raw sequencing data

Raw sequencing reads in fastq files were processed using a pipeline developed by Sander Granneman, which uses tools from the pyCRAC package (Webb et al., 2014). The entire pipeline is available at https://bitbucket.org/sgrann/. The CRAC_pipeline_PE.py pipeline first demultiplexes the data using pyBarcodeFilter.py and the in-read barcode sequences found in the L5 5’ adapters. Flexbar then trims the reads to remove 3’-adapter sequences and poor-quality nucleotides (Phred score <23). Using the random nucleotide information present in the L5 5’ adaptor sequences, the reads are then collapsed to remove potential PCR duplicates. The reads were then mapped to the E. coli MG1655 genome using Novoalign (www.novocraft.com). To determine which genes the reads mapped to, we generated an annotation file in the Gene Transfer Format (GTF). This file contains the start and end positions of each gene on the chromosome as well as the genomic feature class (i.e. sRNA, protein-coding, tRNA) it belongs to. To generate this file, we used the Rockhopper software (Tjaden, 2015) on E. coli rRNA-depleted total RNA-seq data (generated by Christel Sirocchi) and a minimal GTF file obtained from ENSEMBL (without UTR information). The resulting GTF file contained information not only on the coding sequences, but also complete 5’ and 3’ UTR coordinates. We then used pyReadCounters.py with Novoalign output files as input and the GTF annotation file to count the total number of unique cDNAs that mapped to each gene.
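The duplicate-collapsing step can be illustrated with a minimal sketch. It assumes that each read has already been split into its random barcode (taken from the L5 adaptor) and the remaining insert sequence; the function name and input layout are hypothetical, and this is not the pyCRAC implementation.

```python
def collapse_pcr_duplicates(reads):
    """Keep one read per (random barcode, insert sequence) combination.

    `reads` is an iterable of (random_barcode, sequence) tuples
    (hypothetical layout). Reads sharing both values are treated as
    PCR duplicates; the first occurrence is kept and order is preserved.
    """
    seen = set()
    unique = []
    for barcode, sequence in reads:
        key = (barcode, sequence)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique
```

Identical inserts carrying different random barcodes are retained, because they most likely derive from independent ligation events rather than from PCR amplification.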

Normalization steps

To normalize the read count data generated with pyReadCounters.py and to correct for differences in library depth between time-points, we calculated Transcripts Per Million reads (TPM) for each gene. Briefly, for each time-point the raw counts for each gene were first divided by the gene length and then divided by the sum of these length-normalized values for all genes in that time-point (scaled to one million) to normalize for differences in library depth. The TPM values for each OD600 studied were then log2-transformed.
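The TPM calculation can be sketched as below. This is an illustration under stated assumptions: the function names and dictionary inputs are hypothetical, and the pseudocount used before the log2 transformation is an assumption that the text does not specify.

```python
import math


def tpm(counts, lengths):
    """Transcripts Per Million for one time-point.

    `counts` maps gene -> raw read count and `lengths` maps
    gene -> gene length in nucleotides (hypothetical inputs).
    """
    rates = {gene: counts[gene] / lengths[gene] for gene in counts}
    total = sum(rates.values())
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}


def log2_tpm(tpm_values, pseudocount=1.0):
    """log2-transform TPM values.

    The pseudocount is an assumption made here to keep genes with
    zero counts defined; it is not taken from the source.
    """
    return {gene: math.log2(value + pseudocount)
            for gene, value in tpm_values.items()}
```

Because every gene is divided by the same per-time-point total, TPM values sum to one million within each sample, which makes them comparable across time-points with different library depths.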

Hfq-binding coverage plots

For the analysis of the Hfq binding sites the pyCRAC package (Webb et al., 2014) was used (versions 1.3.2–1.4.3). The pyBinCollector tool was used to generate Hfq cross-linking distribution plots over genomic features. First, pyCalculateFDRs.py was used to identify the significantly enriched Hfq-binding peaks (minimum 10 reads, minimum 20 nucleotide intervals). Next, pyBinCollector was used to normalize gene lengths by dividing their sequences into 100 bins and calculate nucleotide densities for each bin. To generate the distribution profile for all genes individually, we normalized the total number of read clusters (assemblies of overlapping cDNA sequences) covering each nucleotide position by the total number of clusters that cover the gene. Motif searches were performed with pyMotif.py using the significantly enriched Hfq-binding peaks (FDR intervals). The 4–8 nucleotide k-mers with Z-scores above the indicated threshold were used for making the motif logo with the k-mer probability logo tool (Wu and Bartel, 2017) with the -ranked option (http://kplogo.wi.mit.edu/).
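The length normalization into 100 bins can be illustrated with a short sketch. It assumes a per-nucleotide coverage vector for a single gene and averages the coverage in each bin; the function name and the exact binning rule are assumptions and may differ from what pyBinCollector does internally.

```python
def bin_coverage(coverage, n_bins=100):
    """Rescale a per-nucleotide coverage vector to a fixed number of bins.

    `coverage` is a list with one value per nucleotide of a gene
    (hypothetical input). Each bin holds the mean coverage of the
    nucleotides assigned to it, so genes of different lengths can be
    compared on a common axis.
    """
    length = len(coverage)
    if length < n_bins:
        raise ValueError("gene is shorter than the number of bins")
    binned = []
    for i in range(n_bins):
        start = i * length // n_bins
        end = (i + 1) * length // n_bins
        window = coverage[start:end]
        binned.append(sum(window) / len(window))
    return binned
```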

Analysis of chimeric reads

Chimeric reads were identified using the hyb package with default settings (Travis et al., 2014) and further analysed using the pyCRAC package (Webb et al., 2014). To apply this single-end specific pipeline to our paired-end sequencing data, we joined forward and reverse reads using FLASH (https://github.com/dstreett/FLASH2) (Magoč and Salzberg, 2011), which merges overlapping paired reads into a single read. Paired reads that were not considered overlapping were subsequently concatenated into a single sequence and again filtered for overlapping reads that were missed by FLASH. These were then analysed using hyb. The -anti option for the hyb pipeline was used to be able to use a genomic E. coli hyb database, rather than a transcript database. Uniquely annotated hybrids (.ua.hyb) were used in subsequent analyses. To visualise the hybrids in the genome browser, the .ua.hyb output files were converted to the GTF format. To generate distribution plots for the genes to which the chimeric reads mapped, the parts of the chimeras were clustered with pyClusterReads.py, and BEDtools (Quinlan and Hall, 2010) (intersectBed) was used to remove clusters that map to multiple regions. To produce the coverage plots with pyBinCollector, each cluster was counted only once, and the number of reads belonging to each cluster was ignored.

Statistical filtering of the data

The uniquely annotated chimeras from the merged CLASH experiments were statistically scored using available pipelines (Waters et al., 2017). Only chimeras with a Benjamini-Hochberg adjusted p-value lower than 0.05 were considered; these are referred to as statistically filtered chimeras.
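The multiple-testing correction used for this filter can be sketched as follows. The sketch covers only the standard Benjamini-Hochberg adjustment; how the raw p-values are derived for each chimera is handled by the cited pipelines and is not reproduced here, and the function name is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Chimeras whose adjusted value falls below 0.05 would then be retained, for example with `[i for i, q in enumerate(adjusted) if q < 0.05]`.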

Predicted folding energy analyses

Cumulative distributions of minimum folding energy were generated using the minimum folding energies predicted with RNAduplex (Lorenz et al., 2011) for all statistically filtered sRNA-mRNA chimeras. To generate the data for the shuffled chimeras, the fragments were randomly shuffled over the same gene, or over genes belonging to the same class of genes (e.g. sRNAs or mRNAs), respectively. Significance was tested with the Kolmogorov-Smirnov test.
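The comparison of two cumulative distributions can be illustrated with the two-sample Kolmogorov-Smirnov statistic, which is the largest vertical distance between the two empirical cumulative distribution functions. The sketch below computes only this statistic and not the associated p-value; the function name is hypothetical and the folding energies themselves would come from RNAduplex.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic D.

    `sample_a` and `sample_b` are lists of numbers, e.g. predicted
    folding energies of observed and shuffled chimeras.
    """
    a = sorted(sample_a)
    b = sorted(sample_b)
    d = 0.0
    # The empirical CDFs only change at observed values, so it is
    # enough to evaluate the distance at every data point.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d
```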

Motif analyses for sRNA targets

For each sRNA with at least five different putative targets, we clustered the chimeras based on the similarity of sRNA sequences using K-means clustering. The clustering step was skipped for those sRNAs for which almost all chimeric reads overlapped the same region. The sequences of the fused mRNA fragments in each cluster were extracted and motif searches were performed using MEME (Bailey et al., 2009). To calculate complementarity between the identified motifs in putative mRNA targets and the sRNA we used MAST (Bailey et al., 2009). Only motifs that had a MAST p-value<=0.001 were considered.

Microarray analyses

ArcZ, Spot42 and GcvB microarray data were processed by GEO2R using the limma package (Ritchie et al., 2015). The accession numbers for these datasets are GSE17771, GSE24875 and GSE26573. The processed CyaR data were obtained from the Supplementary data provided in the paper describing the CyaR over-expression in E. coli (De Lay and Gottesman, 2009). Cumulative distribution plots were generated using the T-statistics calculated by the limma package. Average expression levels were calculated by averaging the expression of genes in the parental and over-expression strain.

sRNA density plots

To visualize the nucleotide read density of sRNA-target pairs for a given sRNA, the hit counts at each nucleotide position for all statistically filtered chimeras were summed. The count data were log2-transformed as log2(chimera count + 1), which avoids undefined values at nucleotide positions with 0 hits.
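A minimal sketch of this density calculation is given below. It assumes that the sRNA fragment of each chimera is available as a 0-based, end-exclusive interval on the sRNA; the function name and the input layout are hypothetical.

```python
import math


def srna_density(fragment_intervals, srna_length):
    """log2(count + 1) of chimera hits at each sRNA nucleotide.

    `fragment_intervals` is a list of (start, end) tuples, 0-based and
    end-exclusive (hypothetical layout), one per chimera fragment that
    maps to the sRNA.
    """
    counts = [0] * srna_length
    for start, end in fragment_intervals:
        # Clip intervals to the sRNA boundaries before counting.
        for position in range(max(start, 0), min(end, srna_length)):
            counts[position] += 1
    return [math.log2(count + 1) for count in counts]
```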

To make distributions of the chimeric reads around known sRNA and mRNA seeds, we manually retrieved the experimentally validated sRNA and mRNA seed sequences from sRNATarbase 3.0 (Wang et al., 2015) and literature. We converted the FASTA sequences to the genomic coordinates of our reference genome. Next, we normalized the length of all sequences to eight nucleotides with pyNormalizeIntervalLengths.py, then used the pyBinCollector tool to calculate the overlap of the intervals corresponding to statistically filtered chimeric reads with the seed sequence interval of each sRNA and sRNA-mRNA interaction.

sRNA-sRNA network visualization

Only the sRNA-sRNA chimeric reads representing statistically filtered chimeras in the merged CLASH dataset were considered. For each such interaction, chimera counts in either orientation were summed, log2-transformed and visualized with the igraph Python package.
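The edge weights for such a network can be sketched as follows. The sketch only sums the counts of both orientations of each sRNA pair and log2-transforms the totals; the function name and input layout are hypothetical, and the plotting step with igraph is omitted.

```python
import math
from collections import Counter


def srna_network_edges(chimeras):
    """Orientation-independent, log2-transformed edge weights.

    `chimeras` is an iterable of (rna_5prime, rna_3prime, count) tuples
    (hypothetical layout). A-B and B-A chimeras are pooled into one edge.
    """
    totals = Counter()
    for left, right, count in chimeras:
        pair = tuple(sorted((left, right)))  # ignore orientation
        totals[pair] += count
    return {pair: math.log2(total) for pair, total in totals.items()}
```

With made-up counts, `srna_network_edges([("ArcZ", "CyaR", 3), ("CyaR", "ArcZ", 5)])` pools the two orientations into a single ArcZ-CyaR edge of weight log2(8) = 3.0.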

Data and code availability

The next generation sequencing data have been deposited on the NCBI Gene Expression Omnibus (GEO) with accession number GSE123050. The Python pyCRAC (Webb et al., 2014), kinetic-CRAC and GenomeBrowser software packages used for analysing the data are available from https://bitbucket.org/sgrann (pyCRAC up to version 1.4.3), https://git.ecdf.ed.ac.uk/sgrannem/ and pypi (https://pypi.org/user/g_ronimo/). The hyb pipeline for identifying chimeric reads is available from https://github.com/gkudla/hyb. The scripts for statistical analysis of hyb data are available from https://bitbucket.org/jaitree/hyb_stats/. The FLASH algorithm for merging paired reads is available from https://github.com/dstreett/FLASH2. Bedgraph and Gene Transfer Format (GTF) files generated from the analysis of the Hfq CLASH, RNA-seq and TEX RNA-seq data (Thomason et al., 2015) are available from the Granneman lab DataShare repository (https://datashare.is.ed.ac.uk/handle/10283/2915).

Decision letter

  1. Joseph T Wade
    Reviewing Editor; Wadsworth Center, New York State Department of Health, United States
  2. James L Manley
    Senior Editor; Columbia University, United States
  3. Ben F Luisi
    Reviewer; University of Cambridge, United Kingdom

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Acceptance summary:

This study uses a genome-scale approach, CLASH, to identify many RNA-RNA interactions in Escherichia coli. The interacting RNA pairs identified in this work represent a valuable resource for groups studying RNA-based regulation in bacteria. Moreover, the data reveal many interacting pairs of small, regulatory RNAs (sRNAs), suggesting complex regulatory cross-talk among sRNAs.

Decision letter after peer review:

[Editors’ note: the authors submitted for reconsideration following the decision after peer review. What follows is the decision letter after the first round of review.]

Thank you for submitting your work entitled "Hfq CLASH uncovers sRNA-target interaction networks involved in adaptation to nutrient availability" for consideration by eLife. Your article has been reviewed by two peer reviewers, and the evaluation has been overseen by a Reviewing Editor and a Senior Editor. The following individual involved in review of your submission has agreed to reveal their identity: Ben F Luisi (Reviewer #2).

Our decision has been reached after consultation between the reviewers and the Reviewing Editor. Based on these discussions and the individual reviews below, we regret to inform you that your work cannot be considered further for publication in eLife.

While the reviewers are enthusiastic about the potential of the resource that the CLASH data represent, concerns were raised about the validation of these data. Additionally, the reviewers felt that the follow-up studies are interesting, but that some of the conclusions need to be softened. With additional validation of the CLASH data, the manuscript would likely be suitable for publication in eLife, without the need for much in the way of new experimental data. Nonetheless, the required analyses will likely take some time. We encourage you to resubmit if you can make a more compelling case that the CLASH data represent physiological RNA-RNA interactions.

The major concern is that there is currently insufficient evidence to conclude that the RNA-RNA pairs identified by CLASH represent bona fide RNA-RNA pairs inside cells. Many of the reported RNA-RNA pairs appear to have been identified only once, and many include highly abundant RNAs (i.e. tRNA, rRNA). Moreover, overlap with RIL-seq data is fairly limited. While the discussion clearly lays out why overlap with RIL-seq data might be low, this also raises the bar for validating the RNA-RNA pairs not found by RIL-seq. It should be possible to use bioinformatic analyses to further test whether the novel RNA-RNA pairs are genuine. For example, are novel sRNA targets enriched for sequences complementary to the sRNA seeds? Are known sRNA-regulated genes enriched for sRNA-mRNA pairs, and vice versa? These analyses are most important for the novel RNA-RNA pairs identified by CLASH (i.e. not found by RIL-seq).

Reviewer #1:

Overall, this is very interesting work, and the manuscript obviously represents a great deal of effort. My major criticisms of the work are two-fold. First, the manuscript seems very diffuse – touching on too many topics at a rather surface level. Second, the biological implications of the experimental results are overstated. I hope my comments will be useful to the authors as they consider how to revise their manuscript.

1) The title of the manuscript is misleading. The main functional characterization of MdoR and its targets is intriguing and hints at a physiological function related to carbon source adaptation, but there is a long way to go to say that this is truly the function of this sRNA.

2) ArcZ-CyaR experiment in the middle is not well connected to the rest of the manuscript. The inclusion in the model figure doesn't really help shed light on the biological role for this interaction.

3) Subsection “Hfq CLASH predicts sRNA-sRNA interactions as a widespread layer of post transcriptional regulation”, third paragraph. Figure 5D, the wild-type ArcZ still affects mutant CyaR levels. The authors provide a hand-waving explanation that could be tested. Moreover, the authors state that ArcZ promotes CyaR degradation, but there is no direct evidence for this. It could be tested. Not sure it's the highest priority for this manuscript, given that this experiment in general is not well integrated. But at least the authors should modulate their statement to reflect the actual data.

4) Abstract – there is no direct evidence that MdoR enhances maltose uptake.

5) I did not understand the logic behind the analyses in Figure 2. The authors state that it was "logical to assume that changes in Hfq binding would also be reflected in changes in sRNA steady state levels." However, there are numerous studies showing that different sRNAs bind Hfq via different modes, and that there is a great deal of variability regarding the role of Hfq in stabilizing sRNAs. Moreover, the competition among RNAs for binding to a limiting pool of Hfq will certainly change over time, and be influenced by the total sRNA abundance and any given sRNA's proportion of the total RNA pool. There seems to be no overall conclusion from the figure, and no follow up, so I would recommend deleting it.

6) Subsection “MdoR directly regulates the expression of major outer membrane porins and represses the envelope stress response pathway”, fourth paragraph: The authors state hypotheses in this section that are not further tested, and are not supported by data shown. These are more appropriate for modest speculation in the Discussion.

7) Figure 7F: Is the effect of MdoR SM on MicA significant?

8) The only MdoR-target interaction that was definitively demonstrated was MdoR-ompC, and indeed, the authors went above and beyond with evidence here. It is interesting that ompC levels are reduced in maltose (Figure 8B), but this is clearly NOT MdoR-dependent (Figure 8D). The differences in MicA and lamB RNA levels in the mdoR mutant grown in maltose are intriguing, but these effects can't be linked to a specific MdoR-target regulation. Minimally, the authors should try to make the link between molecular interaction of MdoR and a target (rpoE?) and the differences in MicA/lamB more clear.

9) Subsection “MdoR enhances maltoporin expression during maltose fermentation”, last paragraph: It would be very exciting if the data directly supported this statement. However, the experiments presented fall short. More physiological evidence is needed – growth phenotypes, maltose uptake assays, etc. In the absence of these, the authors must tone down their claims.

10) Because there is so little investigation of the physiology, the discussion of the physiological relevance of these findings is very superficial. The transition from exponential to stationary phase growth has been studied in E. coli growing in LB. What becomes limiting? The authors say very generically "the most favorable nutrients" become limiting. The finding that malEFG and MdoR are specifically expressed during a very narrow window of time in LB grown cells is very interesting. Given this expression pattern, and given that the main carbon source in LB is peptides/amino acids, there must be more to the story of their regulation than malT-dependent maltose-inducible expression. The authors should work on improving the quality of the discussion of these issues, and be up front about the limitations of their study in this regard.

11) One key issue that should be addressed in the Discussion is the fact that these global approaches have so little overlap. I did appreciate the thorough description of the relative advantages provided by the Hfq-CLASH method as compared to RIL-seq. However, I think the field as a whole needs to find a way to discern direct, physiologically-relevant interactions from those that may be transient, weaker, and stochastic. I don't expect the authors to solve this issue, but it should be acknowledged. The sensitivity and accuracy of various methods need a thorough investigation. At least, the authors could consider their Hfq-CLASH results in light of their total expression profiles (RNA-seq) of well characterized sRNAs and their regulons. What's the false negative rate for known interactions?

Reviewer #2:

This manuscript analyses the RNAs associated with the RNA chaperone Hfq in Escherichia coli at different growth stages, and in particular during the transition between stages. There has been other work published on this topic, but the new aspect of the work presented here is the depth of analysis of the transitions and the in-depth characterisation of the associated RNAs. One important finding from this study is that sRNA expression does not correlate strongly with Hfq binding profile, suggesting that there must be context-dependent binding of the RNA to Hfq. Another is the model for the regulatory network involving the processed transcript from the mal operon. The experimental work is extensive and there are many interesting new findings reported. There are several comments listed below that will hopefully be useful for the authors to consider:

1) Hfq for CLASH has two large tags on the C-terminus. As the C-terminus has been proposed to participate in binding of RNA/protein partners and in Hfq autoinhibition (work from the Woodson group), have the authors done any controls to make sure this does not interfere with RNA binding or introduce false results?

2) "Hfq binds to sRNA-target RNA duplexes" – are RNA duplexes the only Hfq targets? For example, sRNAs were shown to cycle on Hfq, therefore one can imagine a situation in which one sRNA is not fully displaced and the second one already bound. Could some of the sRNA hybrids represent such state?

3) 'tRNA-tRNA and rRNA-rRNA chimeras originating from different coding regions were removed' why?

4) Can the authors please comment on the other chimeras isolated, other than sRNA-mRNA and mRNA-mRNA? Would these represent Hfq targets in the cell?

5) Figure 3 – it is not clear what the enriched motifs are showing, 5' end of the chimera? 5' end of both RNAs in the chimera? Only mRNAs?

6) Explain in more detail what is meant by scrambled RNA.

7) Figure 4 – as only mutations in ArcZ cause disruption of the regulation, can it be an indirect effect, not the result of direct sRNA-sRNA regulation?

8) Figure 5B – It is difficult to see expression of ygaM (or YgaN, which the blot may be showing). Where is MdoR on the blot? If it is labelled malG it is somewhat confusing, as the blot presumably shows the sRNA fragments, not the whole mRNAs?

9) The signal for RyhB in Figure 5B is quite strong for OD 1.2 and 1.8, however in Figure 6C it is very weak. Can the authors explain? MdoR intensities seem to match, so presumably the RNA quantities used are similar?

10) Is there evidence that RyhB is in the cell as 5'PPP RNA? Perhaps it is not processed, but has the possibility of it harbouring a different 5' end been excluded?

11) Figure 7C – It is confusing that the authors label 5' and 3' ends which are not real ends, and are different for each panel for MdoR. Could they mark, e.g. with dots, that these are not real ends of RNAs? Or indicate positions of the nucleotides shown? It would make analysing the results much easier.

12) Figure 7D – If an empty plasmid is used as a control and the blot probed for MdoR, can the authors explain what is being expressed in their control after 20 minutes? There may be a typo in the legend as it states that the samples were harvested 15 minutes after induction, but the blot shows the results for 20. What is the meaning of the red rectangle over the 15 minutes into MdoR expression?

13) Figure 7F – explain MdoR SM, it also isn't introduced in the text. Why does the seed mutation cause higher target levels? For RyeA it doesn't seem like the seed mutation has abolished regulation. Is RybB regulating MicA as well?

14) Have the authors tested how their substantial MdoR seed mutation influences RNA structure? Is it possible that, as the seed seems internal, the overall structure of the sRNA is disrupted and therefore the regulation lost? Can the mutant still bind to Hfq? The structure change would also explain problems with RNase E processing.

15) 'Notably, the fully-processed mutant MdoR sRNA is less abundant than the wild-type (Figure 9C) and longer (unprocessed) fragments that contain upstream malG regions could be readily detected (Figure 9E)' – should be 8C and 8E.

16) 'We conclude that the dynamics of sRNA expression and binding to Hfq are not always highly correlated.' Any thoughts why?

17) Polysome preps used cycloheximide, but this acts by blocking the peptide exit channel in the ribosome and may not trap polysomes except by blocking the last ribosome on the assembly. Another antibiotic or non-hydrolysable GTP might be better.

18) These references have related information that may be useful to comment on in the manuscript: de Mets, van Melderen and Gottesman, 2018; Miyakoshi et al., 2018.

Also, Hfq has been known to be involved in nutrient uptake regulation in Pseudomonas aeruginosa, where it inhibits translation of certain mRNAs depending on which nutrients are available. Pei et al., 2019, have solved high resolution structures of Hfq in complex with a target mRNA and other effector molecules to show how this Hfq based regulatory complex works. This research may be related to the theme of the report here and it might be helpful to comment on these findings.

[Editors’ note: further revisions were suggested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "Hfq CLASH uncovers sRNA-target interaction networks linked to nutrient availability adaptation" for further consideration by eLife. Your revised article has been reviewed by three peer reviewers, one of whom is a member of our Board of Reviewing Editors, and the evaluation has been overseen by James Manley as the Senior Editor.

All the reviewers were enthusiastic about the manuscript and recommend acceptance pending some edits to the text. In particular, the reviewers felt that the new analyses of the CLASH data make a strong case that the identified RNA-RNA interactions are real, and thus greatly expand the known set of interactions for E. coli, and reveal important insights such as the abundance of sRNA-sRNA interactions. To better focus the manuscript, we recommend removing the section on MdoR. While the reviewers found this work to be of interest, they also felt that it was peripheral to the main theme of the study, and would be better suited to an independent publication in a more specialized journal. This would free up some space in the paper to move some of the supplementary figure panels into the main figures, improving readability. Reviewer 3 has some specific suggestions for supplementary figure panels that could be moved into the main set of figures. The detailed reviews are listed below:

Reviewer #1:

The authors have provided further experimental data and analysis and made compelling responses to most of the points raised in the review. The manuscript has been improved and the support for the conclusions strengthened considerably.

One minor issue is that the Figure 2E legend does not explain the figure very clearly.

Reviewer #2:

This is a much improved revised version of a manuscript describing a global method for characterization of RNA-RNA interactions. The authors have nicely addressed my previous concerns and I have no additional major issues.

Reviewer #3:

The new analyses of the CLASH data make a very convincing case that the novel RNA-RNA pairs reflect real in vivo interactions. My preference would be to remove the MdoR story, which is interesting but peripheral to the main theme of the paper, and does not look at a novel sRNA (MdoR was identified previously by RIL-seq). Moreover, I suggest moving some of the more important supplementary figure panels into the main part of the paper.

Figure 2—figure supplement 1C. The "distance from sRNA seed" numbers appear to be similar to the length of the sRNAs. The authors should indicate the sRNA lengths.

Figure 2—figure supplement 6 (predicted base-pairing strength for identified interactions). This is an important analysis and should be moved to the main figures.

Figure 2—figure supplement 7 (number of enriched sequence motifs from mRNA targets that match the paired sRNA) also belongs in the main figures. I suggest combining this with a couple of the most interesting examples of newly found motifs (i.e. unique to this study).

Figure 2—figure supplement 7. The criteria used to make the yes/no calls should be described in the legend.

Figure 3—figure supplement 3. This could be moved to the main figures. The legend needs to be expanded for panel C.

Figure 4—figure supplement 4. I would not expect to see sufficient overlap in regulation between E. coli and Salmonella for this analysis to be informative. I suggest removing this figure.

Figure 4—figure supplement 8. Panel labels are wrong in the legend.

https://doi.org/10.7554/eLife.54655.sa1

Author response

While the reviewers are enthusiastic about the potential of the resource that the CLASH data represent, concerns were raised about the validation of these data. Additionally, the reviewers felt that the follow-up studies are interesting, but that some of the conclusions need to be softened. With additional validation of the CLASH data, the manuscript would likely be suitable for publication in eLife, without the need for much in the way of new experimental data. Nonetheless, the required analyses will likely take some time. We encourage you to resubmit if you can make a more compelling case that the CLASH data represent physiological RNA-RNA interactions.

We were extremely pleased with the opportunity to submit a revised version of the manuscript. We have performed a large number of additional bioinformatics analyses to demonstrate the robustness of the CLASH data, focussing also on chimeras that were uniquely identified in our CLASH data and those that are supported by a relatively low number of reads.

The requested analyses took longer than expected because we discovered an annoying bug in our bioinformatics pipeline that resulted in some reads being incorrectly assigned as chimeras. In brief, we used a software package called FLASH (https://github.com/dstreett/FLASH2) that merges overlapping paired reads into a single contig; the merged reads are then sent to the hyb pipeline to detect chimeric reads. Paired reads that FLASH did not consider to be overlapping were subsequently concatenated by our pipeline to make sure that we would also be able to recover chimeras from non-overlapping reads. However, we recently discovered that FLASH is less sensitive in detecting overlapping reads than we expected. As a result, the pipeline concatenated many overlapping paired reads, which hyb subsequently identified as chimeras. Consequently, 10% of the chimeras called by hyb were false positives, as they contained almost identical sequences that were frequently annotated as intramolecular interactions. We therefore removed these false-positive chimeras and reanalysed the data. This did not change the interpretation of the data. On the contrary, it improved the results substantially. We have included these additional filtering steps in the Materials and methods section of the revised manuscript.
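The failure mode described above can be illustrated with a minimal Python sketch. It is not the filtering code used in the study: the function names, the 10-nucleotide minimum overlap and the 10% mismatch tolerance are all assumptions chosen for illustration. The idea is simply to test whether the two mates of a pair overlap before concatenating them, so that overlapping mates are merged rather than passed on as an apparent chimera.

```python
def suffix_prefix_overlap(left: str, right: str,
                          min_overlap: int = 10,
                          max_mismatch_frac: float = 0.1) -> int:
    """Length of the longest suffix of `left` matching a prefix of `right`.

    Thresholds are illustrative assumptions, not values from the study.
    Returns 0 when no overlap of at least `min_overlap` nt is found.
    """
    longest = min(len(left), len(right))
    for size in range(longest, min_overlap - 1, -1):
        suffix = left[len(left) - size:]
        prefix = right[:size]
        mismatches = sum(1 for x, y in zip(suffix, prefix) if x != y)
        if mismatches <= max_mismatch_frac * size:
            return size
    return 0


def join_pair(forward: str, reverse_rc: str, min_overlap: int = 10):
    """Merge overlapping mates; concatenate only truly non-overlapping ones.

    `reverse_rc` is assumed to be the reverse mate already reverse-complemented.
    """
    size = suffix_prefix_overlap(forward, reverse_rc, min_overlap)
    if size:
        # Overlapping mates: keep the shared region once (no false chimera).
        return forward + reverse_rc[size:], "merged"
    # No detectable overlap: concatenate so chimeras can still be recovered.
    return forward + reverse_rc, "concatenated"
```

In this sketch, a pair that is concatenated despite sharing an overlap would contain the same sequence twice, which is the kind of near-identical, apparently intramolecular chimera described above.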

The major concern is that there is currently insufficient evidence to conclude that the RNA-RNA pairs identified by CLASH represent bona fide RNA-RNA pairs inside cells. Many of the reported RNA-RNA pairs appear to have been identified only once, and many include highly abundant RNAs (i.e. tRNA, rRNA). Moreover, overlap with RIL-seq data is fairly limited. While the discussion clearly lays out why overlap with RIL-seq data might be low, this also raises the bar for validating the RNA-RNA pairs not found by RIL-seq. It should be possible to use bioinformatic analyses to further test whether the novel RNA-RNA pairs are genuine. For example, are novel sRNA targets enriched for sequences complementary to the sRNA seeds? Are known sRNA-regulated genes enriched for sRNA-mRNA pairs, and vice versa? These analyses are most important for the novel RNA-RNA pairs identified by CLASH (i.e. not found by RIL-seq).

To address these points, we performed a number of additional bioinformatics analyses. These analyses and the new results are described in the Results section and presented in Figure 2—figure supplements 1 to 5. In the first supplementary figure we show that for almost all the sRNAs and mRNAs identified in our data that have experimentally verified seed regions we indeed recovered the known seed sequences in the chimera fragments. Moreover, we show that the data have low background levels, as judged by the percentage of rRNA chimeras in the dataset. We obtained very similar results when we repeated these analyses for all the chimeras (Figure 2—figure supplement 2) and the chimeras uniquely identified in the CLASH data (Figure 2—figure supplement 3). The group of chimeras supported by a low number of reads (<4) also largely consisted of sRNA-mRNA fragments as well as sRNA-sRNA and mRNA-mRNA fragments. Moreover, for the vast majority of sRNA chimeras with low read counts we again recovered the known sRNA seed sequences. The number of chimeras containing rRNA fragments is slightly higher in this group (12-13%), suggesting higher background. However, considering the sheer abundance of rRNAs in cells (up to 80%) and the fact that we do not perform any rRNA depletion step before library preparation, we would argue that the background is remarkably low. Finally, to test whether the novel sRNA targets are enriched for complementary sequences, we folded the sRNA-mRNA chimeras in silico using RNADuplex from the Vienna package and compared them to chimeric reads in which the fragments were randomly shuffled over the same gene or the same class of genes (i.e. all sRNAs or mRNAs). These analyses revealed that the vast majority of chimeras, even those supported by only a few reads (Figure 2—figure supplement 5B-D), had a significantly higher propensity to form stable duplexes compared to randomly generated chimeric reads (p-value < 6 × 10⁻¹⁶).
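The logic of the shuffled-control comparison can be sketched in Python as follows. This is a deliberately simplified stand-in and not the analysis performed in the study: instead of duplex free energies from the Vienna package, it scores each pair by the longest ungapped stretch of consecutive base pairs (Watson-Crick plus G-U wobble), and the control draws random windows of the same length from the same gene. All function names and parameters are hypothetical.

```python
import random

# Watson-Crick and G-U wobble pairs (RNA alphabet).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def longest_pairing_run(srna: str, target: str) -> int:
    """Longest ungapped, antiparallel stretch of consecutive base pairs.

    A crude proxy for base-pairing potential (assumption: no bulges or loops).
    """
    rev = target[::-1]  # antiparallel orientation
    best = 0
    for offset in range(-(len(rev) - 1), len(srna)):
        run = 0
        for i in range(len(srna)):
            j = i - offset
            if 0 <= j < len(rev) and (srna[i], rev[j]) in PAIRS:
                run += 1
                best = max(best, run)
            else:
                run = 0
    return best


def empirical_p(srna: str, target: str, gene: str, n: int = 1000, seed: int = 0):
    """Compare the observed fragment with random same-length windows of `gene`.

    Returns (observed score, empirical p-value with a +1 pseudocount).
    """
    rng = random.Random(seed)
    observed = longest_pairing_run(srna, target)
    size = len(target)
    hits = 0
    for _ in range(n):
        start = rng.randrange(len(gene) - size + 1)
        if longest_pairing_run(srna, gene[start:start + size]) >= observed:
            hits += 1
    return observed, (hits + 1) / (n + 1)
```

A small empirical p-value in this sketch means that the observed fragment pairs with the sRNA better than most randomly placed fragments from the same transcript, which is the comparison the paragraph above describes in terms of duplex stability.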

Collectively, these data strongly suggest that the vast majority of chimeras that we recovered, including the new interactions and the less abundant interactions, represent genuine base-pairing interactions rather than random ligations.

Reviewer #1:

Overall, this is very interesting work, and the manuscript obviously represents a great deal of effort. My major criticisms of the work are two-fold. First, the manuscript seems very diffuse – touching on too many topics at a rather surface level. Second, the biological implications of the experimental results are overstated. I hope my comments will be useful to the authors as they consider how to revise their manuscript.

1) The title of the manuscript is misleading. The main functional characterization of MdoR and its targets is intriguing and hints at a physiological function related to carbon source adaptation, but there is a long way to go to say that this is truly the function of this sRNA.

We have now changed the title to “Hfq CLASH uncovers sRNA-target interaction networks linked to nutrient availability adaptation”. We hope that this title now better reflects the experimental data.

2) ArcZ-CyaR experiment in the middle is not well connected to the rest of the manuscript. The inclusion in the model figure doesn't really help shed light on the biological role for this interaction.

We agree that this experiment does not blend in well with the rest of the manuscript, but we felt we had to validate at least one of the sRNA-sRNA interactions that we identified to demonstrate that they are biologically relevant. We have moved the results describing the validation of the ArcZ-CyaR interaction to the Supplementary data.

3) Subsection “Hfq CLASH predicts sRNA-sRNA interactions as a widespread layer of post transcriptional regulation”, third paragraph. Figure 5D, the wild-type ArcZ still affects mutant CyaR levels. The authors provide a hand-waving explanation that could be tested.

We agree that it is strange that the wild-type ArcZ can still affect mutant CyaR levels, but we believe this is because we did not sufficiently disrupt the base-pairing potential. However, the observation that the compensatory mutations in CyaR restore the regulatory activity of the ArcZ seed mutant does support the idea that the two sRNAs physically interact in vivo. As stated above, we have now moved these data to the Supplementary data.

Moreover, the authors state that ArcZ promotes CyaR degradation, but there is no direct evidence for this. It could be tested. Not sure it's the highest priority for this manuscript, given that this experiment in general is not well integrated. But at least the authors should modulate their statement to reflect the actual data.

We have removed the text where we state that ArcZ promotes CyaR degradation. We now state in the Results section: “These results, together with the CLASH data, imply that ArcZ and CyaR base-pair in vivo, and that this interaction could lead to a reduction in CyaR levels but not vice versa.”

4) Abstract – there is no direct evidence that MdoR enhances maltose uptake.

We now state in the Abstract that: “We hypothesize that MdoR contributes to the rearrangements in the outer membrane necessary for efficient uptake of maltose/maltodextrins”.

5) I did not understand the logic behind the analyses in Figure 2. The authors state that it was "logical to assume that changes in Hfq binding would also be reflected in changes in sRNA steady state levels." However, there are numerous studies showing that different sRNAs bind Hfq via different modes, and that there is a great deal of variability regarding the role of Hfq in stabilizing sRNAs. Moreover, the competition among RNAs for binding to a limiting pool of Hfq will certainly change over time, and be influenced by the total sRNA abundance and any given sRNA's proportion of the total RNA pool. There seems to be no overall conclusion from the figure, and no follow up, so I would recommend deleting it.

As requested by the reviewer, we have removed these analyses from the manuscript.

6) Subsection “MdoR directly regulates the expression of major outer membrane porins and represses the envelope stress response pathway”, fourth paragraph: The authors state hypotheses in this section that are not further tested, and are not supported by data shown. These are more appropriate for modest speculation in the Discussion.

As requested by the reviewer, we have moved this to the Discussion section.

7) Figure 7F: Is the effect of MdoR SM on MicA significant?

In this dataset the effect of MdoR SM on MicA was not statistically significant (p-value = 0.06). However, to generate more convincing results, we repeated the qPCRs and included an additional biological replicate experiment. We also repeated the other qPCRs as the rpoE and RyeA data were noisier than the other samples. We have now also added p-values to all the bar plots. We feel that the new results more convincingly show that MdoR suppression of MicA relies on the MdoR seed sequence. We do note that the MdoR seed mutation still results in down-regulation of RyeA, suggesting that the regulation is indirect or relies on other sequences within MdoR. We now discuss this in the subsection “MdoR directly regulates the expression of major outer membrane porins and represses the envelope stress response pathway”.

8) The only MdoR-target interaction that was definitively demonstrated was MdoR-ompC, and indeed, the authors went above and beyond with evidence here. It is interesting that ompC levels are reduced in maltose (Figure 8B), but this is clearly NOT MdoR-dependent (Figure 8D). The differences in MicA and lamB RNA levels in the mdoR mutant grown in maltose are intriguing, but these effects can't be linked to a specific MdoR-target regulation. Minimally, the authors should try to make the link between molecular interaction of MdoR and a target (rpoE?) and the differences in MicA/lamB more clear.

Our initial hypothesis was that MdoR would help to suppress MicA levels by directly targeting rpoE, but we did not find any evidence for direct interactions in our data or the RIL-seq data. However, to make the link between MdoR, rpoE and MicA clearer, we repeated the qPCRs of the data shown in Figure 6F and included a third biological replicate so that we could get more reliable statistics. These data show that, consistent with the DESeq analyses, rpoE levels do go down upon over-expression of MdoR; however, the changes in rpoE levels are not statistically significant. Secondly, we also performed qPCR on rpoE levels in RNA samples extracted from the strain that has seed mutations in the chromosomal copy of MdoR. We assumed that if the increase of MicA in the MdoR seed mutant strain was directly linked to rpoE, we would also see higher rpoE mRNA levels in this strain. This was not the case (see revised Figure 7E). Therefore, our current model is that MdoR enhances LamB expression by suppressing MicA independently of rpoE. This then raises the question of whether MdoR directly targets MicA. Unfortunately, MdoR-MicA chimeras were not found in our Hfq CLASH data and the base-pairing interactions predicted by RNA co-fold and RNADuplex from the Vienna package were not at all convincing, so we have not yet been able to address this question. We now discuss these new findings in the Discussion section.

9) Subsection “MdoR enhances maltoporin expression during maltose fermentation”, last paragraph: It would be very exciting if the data directly supported this statement. However, the experiments presented fall short. More physiological evidence is needed – growth phenotypes, maltose uptake assays, etc. In the absence of these, the authors must tone down their claims.

We have removed this sentence from the manuscript and now present a hypothesis about the role of MdoR in nutrient adaptation: “Based on these results, we hypothesise that when cells decide to use maltose as a main carbon source, subsequent MdoR expression enhances the uptake of maltose by suppressing MicA expression, independently of rpoE. This in turn enhances the production of the LamB maltoporin (Figure 8A)”. We believe that, based on the data, this is a reasonable hypothesis. With respect to physiological evidence, we now acknowledge in the Discussion section that this is indeed lacking and present an idea on how to pursue this.

10) Because there is so little investigation of the physiology, the discussion of the physiological relevance of these findings is very superficial. The transition from exponential to stationary phase growth has been studied in E. coli growing in LB. What becomes limiting? The authors say very generically "the most favorable nutrients" become limiting. The finding that malEFG and MdoR are specifically expressed during a very narrow window of time in LB grown cells is very interesting. There must be more to the story of their regulation than malT-dependent maltose-inducible expression given this expression pattern in LB given that the main carbon source in LB is peptides/amino acids. The authors should work on improving the quality of the discussion of these issues, and be up front about the limitations of their study in this regard.

We acknowledge that we have not presented evidence demonstrating the physiological relevance of MdoR. We agree that the transient MdoR expression in LB may not depend exclusively on the availability of maltodextrins, given the complexity of E. coli physiology in LB (e.g. pH, cell density, concurrent metabolization of other substrates etc.), and that this is worth investigating. We initially prioritised the study of the nutrient-dependent aspect of MdoR function because MdoR accumulates after glucose depletion and the few remaining carbohydrates in LB are utilised sequentially, and during this process MalT is induced. We now discuss in detail which nutrients become limiting in the Discussion section. Moreover, the MdoR expression pattern is similar to that of the other MalT-regulated genes; thus MalT control is key for understanding MdoR physiology. We therefore considered it important to dissect separately the role of MdoR when maltose is the sole carbon source. It is also known that induction of the mal operon is not as effective in a medium that contains other carbon sources (Zhou et al. BMC Systems Biology 2013).

11) One key issue that should be addressed in the Discussion is the fact that these global approaches have so little overlap. I did appreciate the thorough description of the relative advantages provided by the Hfq-CLASH method as compared to RIL-seq. However, I think the field as a whole needs to find a way to discern direct, physiologically-relevant interactions from those that may be transient, weaker, and stochastic. I don't expect the authors to solve this issue, but it should be acknowledged. The sensitivity and accuracy of various methods needs a thorough investigation.

We completely agree that a thorough comparison between the various methods should be done to test the sensitivity and accuracy of each method. In our opinion, this should ideally be done with multiple identical samples that are then analysed in several labs that are using these methods. We would be very keen to contribute to such a study so that the field as a whole can come to a consensus on best practices for performing these types of studies.

It is not possible to quantify the number of interactions that are the result of transient/stochastic/weak interactions, however, we acknowledge that they will certainly be present in our data. We do predict that the frequency of such interactions in our data will be low. This is now discussed in detail in the Discussion section, but we would like to point out here that the CLASH data is highly enriched for chimeras that form stable duplexes, even those chimeras supported by only a few reads (Figure 2—figure supplement 5). Therefore, we would predict that the majority of interactions that we recover represent genuine base-pairing interactions but to what extent these are functional is of course not possible to deduce from the data. We now also discuss ways that would enable us to systematically test which sRNA-mRNA interactions could be biologically relevant.

At least, the authors could consider their Hfq-CLASH results in light of their total expression profiles (RNA-seq) of well characterized sRNAs and their regulons.

This is an excellent point and we had looked at the correlation of CLASH hits and target expression in detail (see Author response image 1). For these analyses we specifically focussed on known interactions with GcvB and CyaR (Author response image 1A) as well as novel interactions with these sRNAs (Author response image 1B). Although in some individual cases we could see a positive or negative correlation between steady state levels (RNA-seq; TPM values) and number of chimeras, the overall results were unsatisfying as we could not detect a clear pattern. This may have to do with the fact that for a number of mRNA interactions we did not identify many chimeras, which would make the comparison with RNA-seq steady state levels problematic. We decided therefore not to include these analyses in the manuscript.

Author response image 1

What's the false negative rate for known interactions?

In response to the comments, we have performed a substantial number of additional bioinformatics analyses to evaluate the quality of our data in more detail (see Figure 2—figure supplements 1-5). Here, we also focussed on experimentally verified sRNA-mRNA interactions identified in our data (Figure 2—figure supplement 1). These results show that in all but one case (the GcvB-sstT interaction), we recover the known sRNA and mRNA seeds (as well as some potentially new seed sequences). Therefore, the false-negative rate, which we define as the number of incorrect seed sequences identified for known interactions in our data, seems to be low. We now discuss these data in the Results section.

Reviewer #2:

This manuscript analyses the RNAs associated with the RNA chaperone Hfq in Escherichia coli at different growth stages, and in particular during the transition between stages. There has been other work published on this topic, but the new aspect of the work presented here is the depth of analysis of the transitions and the in-depth characterisation of the associated RNAs. One important finding from this study is that sRNA expression does not correlate strongly with Hfq binding profile – suggesting that there must be context-dependent binding of the RNA to Hfq. Another is the model for the regulatory network involving the processed transcript from the mal operon. The experimental work is extensive and there are many interesting new findings reported. There are several comments listed below that will hopefully be useful for the authors to consider:

1) Hfq for CLASH has two large tags on the C-terminus. As the C-terminus has been proposed to participate in RNA/protein partner binding and Hfq autoinhibition (work from the Woodson group), have the authors done any controls to make sure this does not interfere with RNA binding/introduce false results?

This is an important point and we addressed this in a previous paper (Tree et al. Molecular Cell; see Figure 1—figure supplement 1 and first paragraph of the Results section). The data indicated that HTF tagged Hfq is functional and facilitates MicF repression of OmpF.

2) "Hfq binds to sRNA-target RNA duplexes" – are RNA duplexes the only Hfq targets? For example, sRNAs were shown to cycle on Hfq, therefore one can imagine a situation in which one sRNA is not fully displaced and the second one already bound. Could some of the sRNA hybrids represent such state?

It is certainly possible that such chimeras can be formed during the ligation step. Although it is not possible to quantify the number of such interactions, we would argue, for the following reasons, that they are probably not very abundant in our data. Firstly, we purify Hfq and cross-linked RNAs under very stringent and completely denaturing conditions before we do the intermolecular ligation reactions. We would predict that most such interactions, including weak and stochastic interactions, would dissociate under these conditions. Furthermore, because our purification conditions completely disrupt the Hfq hexamer (this work and (Tree et al., 2014)), such transient interactions would only be detected if an Hfq monomer was UV cross-linked to both sRNAs simultaneously and if the available 5’ and 3’ ends were in close proximity. Considering the low efficiency of UV cross-linking, the likelihood of this happening is very low. Secondly, we show that our chimeras, including those that are supported by only a few reads, are highly enriched for stable duplexes as well as sRNA seed sequences. It therefore seems unlikely that the CLASH protocol would efficiently recover weak and/or transient interactions. We now discuss this in the Discussion section.

3) 'tRNA-tRNA and rRNA-rRNA chimeras originating from different coding regions were removed' why?

There are many copies of tRNA and rRNA genes in the genome that are very similar, and the sequence aligner cannot always determine from which copy the two halves of a chimera came. So, in many cases our software will label these rRNA-rRNA and tRNA-tRNA chimeras as “intermolecular” because each half was mapped to a different copy. But we cannot rule out the possibility that these fragments originated from the same gene. Therefore, we consider these to be false positives as they are very likely intramolecular interactions. We now mention this in the legend of Figure 2A.
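The filtering rule explained above can be expressed as a short Python sketch. The record layout (`Chimera` with a gene name and an RNA class per half) is a hypothetical schema invented for illustration and does not reflect the output format of the hyb pipeline; only the rule itself, dropping tRNA-tRNA and rRNA-rRNA chimeras whose halves map to different gene copies, comes from the text.

```python
from typing import List, NamedTuple


class Chimera(NamedTuple):
    """Hypothetical chimera record: gene and RNA class for each half."""
    gene_a: str
    class_a: str
    gene_b: str
    class_b: str


# RNA classes encoded by multiple near-identical gene copies.
MULTICOPY_CLASSES = {"tRNA", "rRNA"}


def is_ambiguous_multicopy(ch: Chimera) -> bool:
    """True for tRNA-tRNA or rRNA-rRNA chimeras mapped to different copies.

    These look intermolecular but may come from a single gene copy.
    """
    return (ch.class_a == ch.class_b
            and ch.class_a in MULTICOPY_CLASSES
            and ch.gene_a != ch.gene_b)


def filter_chimeras(chimeras: List[Chimera]) -> List[Chimera]:
    """Remove likely false-positive intermolecular tRNA/rRNA chimeras."""
    return [ch for ch in chimeras if not is_ambiguous_multicopy(ch)]
```

Note that in this sketch sRNA-tRNA and sRNA-rRNA chimeras are retained, since the ambiguity only arises when both halves belong to the same multi-copy class.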

4) Can the authors please comment on the other chimeras isolated, other than sRNA-mRNA and mRNA-mRNA? Would these represent Hfq targets in the cell?

A few percent of the chimeras appear to represent sRNA-tRNA interactions. Although it is unclear whether these are biologically relevant, it is worth noting here that in E. coli external transcribed spacers of tRNAs can base-pair with sRNAs to absorb transcriptional noise (Lalaouna et al., 2015). Moreover, the predicted base-pairing interactions between the tRNA and sRNA halves in chimeras are in many cases quite extensive (Supplementary file 2). Between 4-5% of the chimeras represented sRNA-rRNA interactions. Binding of Hfq to rRNA has also been demonstrated (Andrade et al., 2018); however, this interaction appears to be independent of sRNAs. Therefore we predict that sRNA-rRNA interactions likely represent noise. However, considering the sheer abundance of ribosomal RNA in a cell (80%), we would argue that the noise is quite low.

We now discuss this in more detail in the Results section and we have included more extensive analyses of all the types of interactions in Figure 2A and Figure 2—figure supplements 1-5.

5) Figure 3 – it is not clear what the enriched motifs are showing, 5' end of the chimera? 5' end of both RNAs in the chimera? Only mRNAs?

We apologise for the confusion. Only the mRNA fragments in chimeras were considered for the motif analyses. The left panel shows motifs of the 5’UTRs of mRNAs found in chimeras, the right panel, the motifs of the 3’UTRs of mRNAs found in chimeras. We have now added a sentence to the figure as well as the figure legend to make this clearer.

6) Explain in more detail what is meant by scrambled RNA.

This is an sRNA that has a random nucleotide sequence. We have now made this clearer in the figure legend. The control plasmid is pJV300, the standard control plasmid for pL-driven sRNA expression. It expresses a ~50-nucleotide nonsense RNA derived from the rrnB terminator region of the backbone plasmid.

7) Figure 4 – as only mutations in ArcZ cause disruption of the regulation, can it be an indirect effect, not the result of direct sRNA-sRNA regulation?

We cannot rule out that the regulation is indirect. However, the fact that the compensatory mutations in CyaR restore the regulation of the ArcZ mutant is strong evidence that the regulation is direct. Note that we have now moved these analyses to the supplementary information, as requested by reviewer #1.

8) Figure 5B – It is difficult to see expression of ygaM (or YgaN, which the blot may be showing).

(Note Figure 5 is now Figure 4) We apologize for the fact that YgaN is poorly detectable. It is clearly a very low-abundance fragment and even after exposing the blot for two weeks with several probes, we were not able to get a stronger signal. This is why we added a larger scan of the same blot in Figure 4—figure-supplement 1A, where ygaN is more visible.

Where is MdoR on the blot? If it is labelled malG it is somewhat confusing, as the blot presumably shows the sRNA fragments, not the whole mRNAs?

We apologise for the confusion. What is shown in Figure 5B (now Figure 4B) are indeed the sRNAs, but we had labelled those sRNAs derived from 3’UTRs with the names of the host mRNA. We now included the sRNA names.

9) The signal for RyhB on Figure 5B is quite strong for OD 1.2 and 1.8, however on Figure 6C it is very weak. Can the authors explain? MdoR intensities seem to match, so presumably the RNAs quantities used are similar?

We had to expose the blot for about 4-5 days to get a good MdoR signal and by the time we hybridized the blot with the RybB probe the 32P label was already about two weeks old. This explains why the signal is a bit weaker. CpxQ and 5S rRNA probing was performed with fresh label once the signals on the blot had sufficiently decayed.
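As a rough check on the label-age argument above, the sketch below estimates how much 32P activity remains after a given time. The half-life of about 14.3 days is a standard physical constant and is not stated in the response; the function name `fraction_remaining` is illustrative. The calculation covers radioactive decay only and ignores any other contribution to signal strength.

```python
# Approximate physical half-life of 32P in days (assumption: a textbook
# value, not taken from the response itself).
P32_HALF_LIFE_DAYS = 14.3


def fraction_remaining(age_days, half_life_days=P32_HALF_LIFE_DAYS):
    """Fraction of the original activity left after age_days of decay."""
    return 0.5 ** (age_days / half_life_days)


# A label that is about two weeks old retains roughly half its activity.
print(round(fraction_remaining(14), 2))
```

On this estimate, a two-week-old label would give about half the signal of fresh label for the same amount of target RNA.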

10) Is there any evidence that RyhB is in the cell as 5'PPP RNA? Perhaps it is not processed, but has the possibility of it harbouring a different 5' end been excluded?

RybB was not reported to be processed by RNase E for maturation (Chao et al., 2017) and it does not have a 5’NAD modification (Cahová et al., 2015). RybB is also TEX insensitive (Thomason et al., 2015). In this study, the 5’PPP ends were converted using TAP to 5’P after the TEX step. RybB was also detected in the TEX untreated sample, so we can infer it originally harboured a 5’PPP.

11) Figure 7C – It is confusing that the authors label 5' and 3' ends which are not real ends, and are different for each panel for MdoR. Could they mark, e.g. with dots, that these are not real ends of RNAs? Or indicate positions of the nucleotides shown? It would make analysing the results much easier.

As suggested by the reviewer we have now marked the ends with dashed lines to indicate that these are not the beginning and ends of the RNAs. Note that the original Figure 7 is now Figure 6.

12) Figure 7D- If an empty plasmid is used as a control and the blot probed for MdoR, can the authors explain what is being expressed in their control after 20 minutes? There may be a typo in the legend as it states that the samples were harvested 15 minutes after induction, but the blot shows the results for 20. What is the meaning of the red rectangle over the 15 minutes into MdoR expression?

The RNA expressed in the control after induction is the endogenous MdoR (15 min after induction, cells are near OD600 0.8, the OD600 at which MdoR starts being expressed). The plasmid that we used to over-express MdoR has two transcriptional terminators, one from the operon itself and one already present in the plasmid. We believe that the longer band represents MdoR terminated at the second terminator. The red rectangle indicates the time-point for which RNAseq and differential expression analyses were performed. We have made this clearer in the main text and the figure legend.

13) Figure 7F – explain MdoR SM, it also isn't introduced in the text. Why does the seed mutation cause higher target levels? For RyeA it doesn't seem like the seed mutation has abolished regulation. Is RybB regulating MicA as well?

MdoR SM indicates the MdoR seed mutant. We apologise for not including this in the text. This has now been fixed. As stated above (# reviewer 1 point 7), we have repeated the qPCRs presented in this figure and added additional biological replicates to make the data more robust. The new data confirm that the MdoR seed mutant does not abolish RyeA regulation. We now discuss this in the subsection “MdoR directly regulates the expression of major outer membrane porins and represses the envelope stress response pathway”. We do not believe RybB directly regulates MicA (or rseA; Figure 6F) but that the decrease in their RNA is the result of a homeostatic negative feedback loop, as proposed by the Gottesman lab (Thompson et al., Journal of Bacteriology, 2007; https://jb.asm.org/content/189/11/4243.long). A recent paper from the Sarah Ades lab (Nicoloff et al., Journal of Bacteriology, 2017) suggests that RybB levels need to be carefully controlled, as too high expression can be toxic under some circumstances. One way to respond to RybB over-expression would be to reduce transcription of the sigmaE operon, which includes rpoE and RseA.

14) Have the authors tested how their substantial MdoR seed mutation influences RNA structure? Is it possible that, as the seed seems internal, the overall structure of the sRNA is disrupted and therefore the regulation lost? Can the mutant still bind to Hfq? The structure change would also explain problems with RNase E processing.

This is an interesting point and we have not tested it. Our initial idea was to make a knock-out of MdoR, but we were worried that this would also impair MalG expression. Therefore, we decided to make a mutant version that we predicted would completely disrupt the base-pairing interaction of MdoR with its targets but would not affect accumulation of MalG. The fact that it is no longer cleaved by RNase E was actually a bonus because we wanted to remove the sRNA entirely. We therefore did not test the secondary structure of the transcript or whether it was still bound by Hfq.

15) 'Notably, the fully-processed mutant MdoR sRNA is less abundant than the wild-type (Figure 9C) and longer (unprocessed) fragments that contain upstream malG regions could be readily detected (Figure 9E) '- should be 8C and 8E

We thank the reviewer for pointing this out. This has been corrected. Note that the data is now presented in Figure 7.

16) 'We conclude that the dynamics of sRNA expression and binding to Hfq are not always highly correlated.' Any thoughts why?

We hypothesized that this may be linked to the availability of Hfq, as the protein is ~15 times less abundant at exponential phase as compared to stationary. Similar changes in Hfq expression at different growth phases have also been observed in pathogenic bacteria. It is conceivable that at different growth stages sRNAs are packaged in different RNPs and that the composition of these complexes is dynamic. The E. coli sRNA IsrA/McaS has been shown to associate with a large number of different proteins, including Hfq, ProQ and CsrA; RNA chaperones that are known to bind and stabilize sRNAs and regulate sRNA-target interactions. It is tempting to speculate that the composition of this RNP may be growth phase dependent. Therefore, Hfq may not be essential to stabilize all sRNAs at low cell densities and their stability may vary at different growth stages. A plausible model is that some sRNAs are sequestered and sufficiently stabilized by other RBPs (such as ProQ and CsrA) during early growth stages and that Hfq can only stably associate with these RNPs once expression levels are sufficiently high. However, reviewer #1 recommended that we remove these results from the manuscript and therefore these data are no longer included.

17) Polysome preps used cycloheximide, but this acts by blocking the peptide exit channel in the ribosome and may not trap polysomes except by blocking the last ribosome on the assembly. Another antibiotic or non-hydrolysable GTP might be better.

We trapped ribosomes on mRNAs using two combined approaches. One was to add cycloheximide; the other was to flash-freeze E. coli samples and pulverize them under liquid nitrogen. The latter procedure is known to be most conservative of polysomes without introducing biases or artefacts. Indeed, the use of the eukaryotic elongation inhibitor cycloheximide (CHX) is under debate for ribosome profiling (not for polysomal profiling), as it may introduce artefacts in ribosome positioning along the transcripts (Gerashchenko and Gladyshev, 2014). In our manuscript we performed polysomal profiling and not ribosome profiling. Moreover, we analysed the data as differential uploading of transcripts between two conditions, making it reasonable to assume that any possible bias induced by the drug is negligible.

18) These references have related information that may be useful to comment on in the manuscript: de Mets, van Melderen and Gottesman, 2018; Miyakoshi et al.,.

We now discuss the 3’UTR-derived SdhX sRNA in the subsection “Hfq CLASH identifies novel sRNAs in untranslated regions” where we describe chimeras supported by a low number of reads. We found 2-3 chimeras with SdhX and known interactions (katG and ackA) and we show that the predicted secondary structure of the chimeric reads matches the known interaction reported in these papers.

Also, Hfq has been known to be involved in nutrient uptake regulation in Pseudomonas aeruginosa, where it inhibits translation of certain mRNAs depending on which nutrients are available. Pei et al., 2019, have solved high resolution structures of Hfq in complex with a target mRNA and other effector molecules to show how this Hfq based regulatory complex works. This research may be related to the theme of the report here and it might be helpful to comment on these findings.

We agree that this is a very interesting example of Hfq-dependent regulation of gene expression and relevant to our work. We now mention in the Introduction that Hfq can also regulate gene expression independently of sRNAs and we mention the Hfq-Crc example in Pseudomonas.

[Editors’ note: further revisions were suggested prior to acceptance, as described below.]

All the reviewers were enthusiastic about the manuscript and recommend acceptance pending some edits to the text. In particular, the reviewers felt that the new analyses of the CLASH data make a strong case that the identified RNA-RNA interactions are real, and thus greatly expand the known set of interactions for E. coli, and reveal important insights such as the abundance of sRNA-sRNA interactions. To better focus the manuscript, we recommend removing the section on MdoR. While the reviewers found this work to be of interest, they also felt that it was peripheral to the main theme of the study, and would be better suited to an independent publication in a more specialized journal. This would free up some space in the paper to move some of the supplementary figure panels into the main figures, improving readability. Reviewer 3 has some specific suggestions for supplementary figure panels that could be moved into the main set of figures. The detailed reviews are listed below:

Reviewer #1:

The authors have provided further experimental data and analysis and made compelling response to most of the points raised in the review. The manuscript has been improved and the support for the conclusions strengthened considerably.

One minor issue is the Figure 2E legend does not explain the figure very clearly.

We apologise for not explaining this properly. We have improved the explanation in the figure legend.

Reviewer #3:

The new analyses of the CLASH data make a very convincing case that the novel RNA-RNA pairs reflect real in vivo interactions. My preference would be to remove the MdoR story, which is interesting but peripheral to the main theme of the paper, and does not look at a novel sRNA (MdoR was identified previously by RIL-seq).

We have now removed all the data referring to MdoR and will include this in a different manuscript. Because of this, we had to make a number of changes to the text, including rewriting the Abstract and the last paragraphs of the Introduction. In these sections we now focus more on our findings that interactions that are more reproducibly recovered are more likely to have a regulatory outcome and that base-pairing potential is important but it does not have the strongest predictive power. Note that we have now renamed MdoR MalH in the main text and all of the figures as this is more in line with the nomenclature in the field.

Moreover, I suggest moving some of the more important supplementary figure panels into the main part of the paper.

Figure 2—figure supplement 1C. The "distance from sRNA seed" numbers appear to be similar to the length of the sRNAs. The authors should indicate the sRNA lengths.

We have added the lengths of the sRNAs to the heat maps shown in the supplementary figures associated with Figure 2.

Figure 2—figure supplement 6 (predicted base-pairing strength for identified interactions). This is an important analysis and should be moved to the main figures.

We have now included this in the main figures as Figure 3.

Figure 2—figure supplement 7 (number of enriched sequence motifs from mRNA targets that match the paired sRNA) also belongs in the main figures. I suggest combining this with a couple of the most interesting examples of newly found motifs (i.e. unique to this study).

This figure, as well as two examples of complementary sequence motifs identified specifically in our study is now included in the main figures as Figure 4.

Figure 2—figure supplement 7. The criteria used to make the yes/no calls should be described in the legend.

We have now added this to the legend of Figure 4 where these data are described. An sRNA was considered to have an enriched motif if a motif identified by MEME had an E-value <= 0.1 and/or the MAST p-value of the motif, which indicates the overall match between the identified motifs and the sRNA sequence, was <= 0.001.
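The yes/no rule stated above can be written as a short predicate. This is a minimal sketch, not the authors' code: the function name `has_enriched_motif` and its argument names are illustrative, and only the two cutoffs (MEME E-value <= 0.1, MAST p-value <= 0.001) come from the response. The "and/or" is read here as a logical OR, so meeting either threshold is sufficient.

```python
def has_enriched_motif(meme_evalue, mast_pvalue,
                       evalue_cutoff=0.1, pvalue_cutoff=0.001):
    """Return True if either criterion from the figure legend is met.

    A value of None means the corresponding statistic is not available
    for this sRNA (an assumption made for this sketch).
    """
    meme_ok = meme_evalue is not None and meme_evalue <= evalue_cutoff
    mast_ok = mast_pvalue is not None and mast_pvalue <= pvalue_cutoff
    return meme_ok or mast_ok


print(has_enriched_motif(0.05, 0.5))    # passes on the MEME E-value
print(has_enriched_motif(1.0, 0.0005))  # passes on the MAST p-value
print(has_enriched_motif(1.0, 0.5))     # passes on neither
```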

Figure 3—figure supplement 3. This could be moved to the main figures. The legend needs to be expanded for panel C.

This is now Figure 7 in the main figures.

Figure 4—figure supplement 4. I would not expect to see sufficient overlap in regulation between E. coli and Salmonella for this analysis to be informative. I suggest removing this figure.

We have removed the figure from the manuscript.

Figure 4—figure supplement 8. Panel labels are wrong in the legend.

We have corrected the legend.

https://doi.org/10.7554/eLife.54655.sa2

Article and author information

Author details

  1. Ira Alexandra Iosub

    Centre for Synthetic and Systems Biology, University of Edinburgh, Edinburgh, United Kingdom
    Contribution
    Conceptualization, Resources, Data curation, Software, Formal analysis, Supervision, Validation, Investigation, Visualization, Methodology, Writing - original draft, Project administration, Writing - review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-2924-2471
  2. Robert Willem van Nues

    Institute of Cell Biology, University of Edinburgh, Edinburgh, United Kingdom
    Contribution
    Conceptualization, Resources, Supervision, Writing - review and editing
    Competing interests
    No competing interests declared
  3. Stuart William McKellar

    Centre for Synthetic and Systems Biology, University of Edinburgh, Edinburgh, United Kingdom
    Contribution
    Resources, Methodology
    Competing interests
    No competing interests declared
    ORCID: 0000-0003-0792-9878
  4. Karen Jule Nieken

    Institute of Cell Biology, University of Edinburgh, Edinburgh, United Kingdom
    Contribution
    Formal analysis, Validation
    Competing interests
    No competing interests declared
  5. Marta Marchioretto

    Institute of Biophysics, CNR Unit, Trento, Italy
    Contribution
    Investigation, Methodology
    Competing interests
    No competing interests declared
  6. Brandon Sy

    School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, Australia
    Contribution
    Investigation, Methodology
    Competing interests
    No competing interests declared
  7. Jai Justin Tree

    School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, Australia
    Contribution
    Resources, Software, Methodology, Writing - review and editing
    Competing interests
    No competing interests declared
  8. Gabriella Viero

    Institute of Biophysics, CNR Unit, Trento, Italy
    Contribution
    Resources, Formal analysis, Investigation, Methodology, Writing - review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-6755-285X
  9. Sander Granneman

    Centre for Synthetic and Systems Biology, University of Edinburgh, Edinburgh, United Kingdom
    Contribution
    Conceptualization, Resources, Data curation, Software, Formal analysis, Supervision, Funding acquisition, Validation, Investigation, Visualization, Methodology, Writing - original draft, Project administration, Writing - review and editing
    For correspondence
    sgrannem@ed.ac.uk
    Competing interests
    No competing interests declared
    ORCID: 0000-0003-4387-1271

Funding

Wellcome (102334)

  • Ira Alexandra Iosub

Wellcome (091549)

  • Sander Granneman

Medical Research Council (MR/R008205/1)

  • Sander Granneman

National Health and Medical Research Council (GNT1067241)

  • Jai J Tree

National Health and Medical Research Council (GNT1139313)

  • Jai J Tree

Axonomix

  • Gabriella Viero
  • Marta Marchioretto

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We are grateful to Lionello Bossi and Meriem El Karoui for their valuable feedback on the project and fruitful discussions. We thank Jörg Vogel and Yanjie Chao for providing the Salmonella CpxQ microarray data, Alasdair Ivens for help with the microarray data analysis, Christel Sirocchi for help with preparing E. coli RNA-seq libraries, Erica de Leau for expert technical assistance and the members of the Granneman lab for critically reading the manuscript. This work was supported by grants from the Wellcome Trust (091549 to SG and 102334 to IAI), the Wellcome Trust Centre for Cell Biology core grant (092076), a Medical Research Council non Clinical Senior Research Fellowship (MR/R008205/1 to SG), the Australian National Health and Medical Research Council Project grants (GNT1067241 and GNT1139313 to JJT) and the Autonomous Province of Trento (Axonomix to GV and MM). Next Generation Sequencing was in part carried out by Edinburgh Genomics that is supported through core grants from NERC (R8/H10/56), MRC (MR/K001744/1) and BBSRC (BB/J004243/1).

Senior Editor

  1. James L Manley, Columbia University, United States

Reviewing Editor

  1. Joseph T Wade, Wadsworth Center, New York State Department of Health, United States

Reviewer

  1. Ben F Luisi, University of Cambridge, United Kingdom

Publication history

  1. Received: December 21, 2019
  2. Accepted: April 30, 2020
  3. Accepted Manuscript published: May 1, 2020 (version 1)
  4. Version of Record published: May 11, 2020 (version 2)

Copyright

© 2020, Iosub et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 754
    Page views
  • 160
    Downloads
  • 1
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.
