Development of RiboD-PETRI and validation of its technical performance in studying population heterogeneity

(A) Graphic summary of the RiboD-PETRI method, illustrating the incorporation of RiboD after cell pooling and lysis in PETRI-seq. The RiboD protocol is shown in the dashed-line box. First, template switching with template-switching oligonucleotides (TSOs) is performed on the mixture of RNA-cDNA hybrid strands; the RNA strand is then removed with RNase H, leaving a single-stranded mixture of r-cDNA and m-cDNA. Next, the r-cDNA probes are added and bind specifically to the r-cDNA. The probes are then bound to streptavidin magnetic beads, allowing the r-cDNA-probe-bead complexes to be separated from the rest of the library and removed. The remaining libraries are amplified and sent for sequencing. We designed separate probe sets for E. coli, C. crescentus, and S. aureus. Each set was specifically constructed to be reverse complementary to the r-cDNA sequences of its respective bacterial species. This species-specific approach ensures high efficiency and specificity in rRNA depletion for each organism. (B) Comparison of non-rRNA (tRNA, mRNA and other non-rRNA) and rRNA UMI count ratios among different bacterial scRNA-seq methods. Data for PETRI-seq (E. coli), MicroSPLIT-seq (E. coli) and M3-seq (E. coli) are taken from previous studies. Error bars represent standard deviations of biological replicates. The “ΔΔ” label represents the RiboD-PETRI protocol; the “Ctrl” label represents the classic PETRI-seq protocol that we performed. (C) Comparison of UMI counts per cell between RiboD-PETRI (Table S7) and PETRI (Table S8) at the same unsaturated sequencing depth. (D) Assessment of the effect of rRNA depletion on transcriptional profiles. The Pearson correlation coefficient (r) of UMI counts per gene (log2 UMIs) between RiboD-PETRI (Table S7) and PETRI (Table S9) was calculated for 3,790 out of 4,141 total genes, excluding those with zero counts in either library. Each point represents a gene. (E) Evaluation of the correlation between RiboD-PETRI data (Table S7) and bulk RNA-seq results (Table S10). The Pearson correlation coefficient (r) between UMI counts per gene (log2 UMIs) in the RiboD-PETRI data and reads per gene (log2 reads) in the bulk RNA-seq data was calculated for 3,814 out of 4,141 total genes, excluding those with zero counts in either library. Each point represents a gene. All data presented in Fig. 1C, D and E were from our own sequencing experiments.
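
The per-gene comparison in panels D and E can be illustrated with a minimal sketch, assuming per-gene totals are available as dictionaries keyed by gene name. The function name, the dictionary layout and the toy numbers are hypothetical and are not taken from the study; only the log2 transform, the exclusion of genes with zero counts in either library and the Pearson coefficient follow the legend.

```python
import math

def log2_pearson(counts_a, counts_b):
    """Pearson r of log2 counts over genes with nonzero counts in both libraries.

    counts_a, counts_b: dicts mapping gene name -> total count (hypothetical layout).
    Returns (r, number_of_genes_used).
    """
    # Exclude genes with zero (or missing) counts in either library.
    genes = [g for g in counts_a
             if counts_a[g] > 0 and counts_b.get(g, 0) > 0]
    if len(genes) < 2:
        raise ValueError("need at least two shared nonzero genes")
    x = [math.log2(counts_a[g]) for g in genes]
    y = [math.log2(counts_b[g]) for g in genes]
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / math.sqrt(var_x * var_y), len(genes)

# Toy per-gene totals (illustrative only, not data from the study).
ribod_petri = {"geneA": 8, "geneB": 64, "geneC": 0, "geneD": 512}
petri = {"geneA": 4, "geneB": 32, "geneC": 16, "geneD": 256}
r, n_genes = log2_pearson(ribod_petri, petri)
```

In this toy input geneC is dropped because its count is zero in one library, so the coefficient is computed over the three remaining genes.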

Comprehensive Analysis of single-cell mRNA Transcriptomic Profiles in Exponential Phase E. coli using RiboD-PETRI

(A) The number of UMIs detected per cell in recovered cells of exponential phase E. coli (≥15 UMIs/cell). The cells are ranked from highest to lowest based on the number of detected UMIs, and cells with ≥15 UMIs are selected for plotting. The median number of UMIs is calculated for these selected cells. (B) Distribution of mRNA UMIs captured per cell in RiboD-PETRI data of exponential phase E. coli, presented as violin plots showing the upper quartile, median, and lower quartile lines. The cells are ranked from highest to lowest based on the number of UMIs detected. Then, specific numbers of cells (indicated above the panel) are selected for plotting. The median number of UMIs is calculated for these selected cells. (C) The number of genes detected per cell in exponential phase E. coli. The cells are ranked from highest to lowest based on the number of genes detected. Then, specific numbers of cells (indicated above the panel) are selected for plotting. The median number of genes is calculated for these selected cells. (D) Uniform Manifold Approximation and Projection (UMAP) visualization of exponential phase E. coli. Data were filtered for cells with UMIs between 200 and 5,000, resulting in 1,464 cells. Each dot represents a cell. (E) Heatmap illustrating the normalized expression levels of marker genes in different clusters of exponential phase E. coli. Marker genes with relatively high expression levels are depicted in yellow, while lower expression levels are shown in purple. Each row represents a gene, and each column represents a cell. (F) Functional enrichment analysis of marker genes in cluster 2 of exponential phase E. coli. Marker genes were selected based on screening criteria of p-value < 0.001 and log2 fold change (FC) > 0.2. The color of each data point represents its p-value, on a scale ranging from red to blue. Red indicates smaller p-values (higher statistical significance) and blue indicates larger p-values (lower statistical significance). Count is the number of genes enriched in the pathway. (G) Expression levels of cluster 2 marker genes in 3-hour exponential phase E. coli, overlaid on the UMAP plot. Cells with high expression levels are depicted in blue. Marker genes were selected based on a p-value less than 0.001 and a log2 FC greater than 3. (H) Principal Component Analysis (PCA) performed on screened data of exponential phase E. coli. The resulting scatterplots show heterogeneity among the populations, with each point representing a cell. (I) Distribution of UMIs on the UMAP results for exponential phase E. coli. UMAP results reveal heterogeneity among populations, with each point representing a cell and color shading indicating UMI counts (Table S11).
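
The cell ranking, the ≥15 UMI threshold, the median calculation and the UMI window used before UMAP can be sketched as follows. This is an illustrative sketch rather than the authors' pipeline: the function names and toy values are hypothetical, and treating the 200 to 5,000 window as inclusive at both ends is an assumption.

```python
from statistics import median

def rank_and_summarize(umis_per_cell, min_umis=15, top_n=None):
    """Rank cells from highest to lowest UMI count and report the median.

    Cells below min_umis are dropped; top_n optionally keeps only the
    highest-ranked cells (the "specific numbers of cells" in the legend).
    """
    ranked = sorted((u for u in umis_per_cell if u >= min_umis), reverse=True)
    if top_n is not None:
        ranked = ranked[:top_n]
    return ranked, (median(ranked) if ranked else None)

def filter_cells(umis_per_cell, low=200, high=5000):
    """Keep cells whose UMI count lies in [low, high] (inclusive bounds assumed)."""
    return [u for u in umis_per_cell if low <= u <= high]

# Toy UMI counts per cell (illustrative only).
toy_umis = [3, 15, 40, 250, 900, 6000, 12]
ranked, med = rank_and_summarize(toy_umis)
top3, med_top3 = rank_and_summarize(toy_umis, top_n=3)
kept = filter_cells(toy_umis)
```

Because the median is taken over the selected cells only, it depends on how many top-ranked cells are kept, which is why the number of cells is indicated alongside each reported median.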

Single-cell Transcriptomic Analysis and Characterization of Static E. coli Biofilm using RiboD-PETRI

(A-F, H) RiboD-PETRI data from static E. coli biofilm (E. coli 24 h static culture) (Tables S12 and S13). The data were screened for cells with UMIs between 100 and 2,000, resulting in 1,621 and 3,999 cells. (A) The number of UMIs detected per cell in recovered cells of static E. coli biofilm (≥15 UMIs/cell). The cells are ranked from highest to lowest based on the number of detected UMIs, and cells with ≥15 UMIs are selected for plotting. (B) Distribution of mRNA UMIs captured per cell in RiboD-PETRI data of static E. coli biofilm. (C) The number of genes detected per cell in static E. coli biofilm. (D) UMAP visualization of static E. coli biofilm, revealing two small populations of heterogeneous cells in clusters 2 and 3. (E) Inferred expression levels of marker genes across different clusters of static E. coli biofilm. (F) Enrichment pathways for marker genes in cluster 2 of the static E. coli biofilm data, selected based on screening criteria of p-value < 0.001 and log2 fold change (FC) > 0.2. The color of each data point represents its p-value. (G & H) Dot plots displaying scaled expression levels of marker genes in different clusters of exponential phase E. coli (G) and static E. coli biofilm (H). These genes were markers of cluster 2 in static E. coli biofilm, identified with screening criteria of p-value < 0.001 and log2 FC > 3. Dot size represents the percentage of cells in the cluster expressing the gene, while color indicates the average expression level normalized from 0 to 1 across all clusters for each gene.
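
The two quantities encoded in the dot plots (G & H) can be sketched for a single gene as below. This is a minimal sketch under stated assumptions: a cell is counted as expressing the gene when its value is above zero, and the 0-to-1 normalization is implemented as min-max scaling of the per-cluster averages. The function name and toy values are hypothetical.

```python
def dotplot_stats(expr, clusters):
    """Per-cluster dot size and color value for one gene.

    expr: expression value per cell; clusters: cluster label per cell.
    Returns (percent_expressing, scaled_mean), both dicts keyed by cluster.
    """
    by_cluster = {}
    for value, label in zip(expr, clusters):
        by_cluster.setdefault(label, []).append(value)
    # Dot size: percentage of cells in the cluster with nonzero expression.
    pct = {c: 100.0 * sum(v > 0 for v in vals) / len(vals)
           for c, vals in by_cluster.items()}
    # Dot color: cluster mean, min-max scaled to [0, 1] across clusters.
    means = {c: sum(vals) / len(vals) for c, vals in by_cluster.items()}
    lo, hi = min(means.values()), max(means.values())
    span = hi - lo
    scaled = {c: ((m - lo) / span if span > 0 else 0.0)
              for c, m in means.items()}
    return pct, scaled

# Toy expression values for one gene across eight cells (illustrative only).
expr = [0, 2, 4, 0, 0, 1, 6, 6]
clusters = [0, 0, 0, 0, 1, 1, 2, 2]
pct, scaled = dotplot_stats(expr, clusters)
```

Scaling each gene separately means that colors are comparable across clusters for the same gene, not between different genes.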

Functional Investigation of Marker Gene PdeI in Static E. coli Biofilm

(A & B) UMAP plots showing the distribution of pdeI in single-cell data of exponential phase E. coli (A) and static E. coli biofilm (B). Each dot represents a cell, colored by the normalized expression level of the gene. (C) Subcellular localization of PdeI-GFP and GFP. Scale bar, 1 μm. (D) c-di-GMP levels (R-1 score) in static E. coli biofilm cells with low or high expression of BFP, PdeI-BFP or PdeI(G412S)-BFP under the control of the pdeI native promoter. c-di-GMP levels were measured using the c-di-GMP sensor system integrated into E. coli cells. The R-1 score was determined from the fluorescence intensities of mVenusNB and mScarlet-I in the system, measured by flow cytometry. (E) Determination of cellular concentrations of c-di-GMP by HPLC-MS/MS in cells overexpressing PdeI under the control of the arabinose promoter, with 0.002% arabinose induction for 2 h (n=3). (F & G) Localization of PdeI-high cells in the biofilm matrix. Cells expressing PdeI-BFP under the control of the pdeI native promoter were grown in a glass-bottom cell culture dish and stained with SYTO™ 24 for bacterial DNA. Cells expressing BFP under the control of the arabinose promoter were grown with 0.00001% arabinose induction for 24 h in a glass-bottom cell culture dish and stained with SYTO™ 24 for bacterial DNA. (H & I) Heterogeneous expression of PdeI in single-cell data of exponential phase E. coli (H) and static E. coli biofilm (E. coli 24 h static culture) (I). Biofilm cells with high or low expression levels of PdeI-BFP were sorted by flow cytometry. (J) Persister counting assay using 150 μg/ml ampicillin on cells with high or low expression levels of BFP, PdeI-BFP and PdeI(G412S)-BFP from static E. coli biofilm, sorted by flow cytometry. In these strains, expression was under the control of the pdeI native promoter. (K) Time-lapse images of the persister assay observed under a microscope. Static biofilm cells of the PdeI-GFP strain were spotted on a gel pad and treated with 150 μg/ml ampicillin in LB broth. Images were captured over 6 hours at 37 °C, followed by replacement with fresh LB broth to allow persister cell resuscitation. Scale bar, 2 μm. Error bars represent standard deviations of biological replicates. Significance was ascertained by unpaired Student’s t-test. Statistical significance is denoted as *P < 0.05, **P < 0.01, ***P < 0.001, and ****P < 0.0001.
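
As an illustration of the statistics named in this legend, the sketch below computes the pooled-variance (Student's) t statistic for two unpaired groups and maps a P value to the asterisk notation. It is a sketch only: converting t to a P value requires the t distribution, which is left to a statistics package; the "ns" label for P ≥ 0.05 is an assumption, and the function names and toy replicate values are hypothetical.

```python
import math
from statistics import mean, variance

def student_t(a, b):
    """Unpaired Student's t statistic (equal variances assumed) and degrees of freedom."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    # Pooled sample variance across the two groups.
    pooled = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / n_a + 1 / n_b))
    return t, df

def significance_stars(p):
    """Map a P value to the asterisk notation used in the legend."""
    for cutoff, label in ((0.0001, "****"), (0.001, "***"),
                          (0.01, "**"), (0.05, "*")):
        if p < cutoff:
            return label
    return "ns"  # assumed label; the legend does not define one for P >= 0.05

# Toy replicate measurements (illustrative only, not data from the study).
low_group = [1.0, 2.0, 3.0]
high_group = [4.0, 5.0, 6.0]
t_stat, dof = student_t(low_group, high_group)
```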

(A, B) The number of UMIs detected per cell in recovered cells in different samples (≥15 UMIs/cell): (A) PETRI, (B) RiboD-PETRI at the same unsaturated sequencing depth. The cells are ranked from highest to lowest based on the number of detected UMIs, and cells with ≥15 UMIs are selected for plotting. The median number of UMIs is calculated for these selected cells. (C) Scatterplot illustrating the relationship between reads per cell and UMI counts per cell detected in exponential phase E. coli data. Each dot represents a cell. (D) Sequencing saturation of the exponential phase E. coli (3 h) data. We extracted 20%, 40%, 60%, 80% and 100% of the data and tested their saturation using the saturation calculation method of 10x Genomics. (E & F) Sequencing saturation analysis. We took 20%, 40%, 60%, 80% and 100% of the sequencing data for single-cell analysis and counted the number of genes and UMIs for each cell in these subsets. The cells were then sorted from largest to smallest values, and the median number of genes (E) and UMIs (F) was calculated for the selected cells.
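
The saturation analysis in panel D can be sketched as below, using the 10x Genomics definition of sequencing saturation as 1 minus the ratio of deduplicated reads (unique cell barcode, UMI and gene combinations) to total reads. The representation of reads as tuples, the random subsampling without replacement, the fixed seed and the function names are assumptions made for illustration.

```python
import random

def sequencing_saturation(reads):
    """Saturation = 1 - (unique (barcode, UMI, gene) combinations / total reads).

    reads: list of (cell_barcode, umi, gene) tuples, one per mapped read.
    """
    if not reads:
        raise ValueError("no reads supplied")
    return 1.0 - len(set(reads)) / len(reads)

def saturation_curve(reads, fractions=(0.2, 0.4, 0.6, 0.8, 1.0), seed=0):
    """Saturation after randomly subsampling the given fractions of reads."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    curve = {}
    for frac in fractions:
        k = max(1, round(frac * len(reads)))
        curve[frac] = sequencing_saturation(rng.sample(reads, k))
    return curve

# Toy reads: 10 reads covering 4 distinct molecules (illustrative only).
toy_reads = ([("cell1", "AAAC", "geneA")] * 4
             + [("cell1", "GGTT", "geneA")] * 3
             + [("cell2", "AAAC", "geneB")] * 2
             + [("cell2", "CCGA", "geneC")])
full = sequencing_saturation(toy_reads)
curve = saturation_curve(toy_reads)
```

A saturation value close to 0 means most reads still reveal new molecules, whereas a value close to 1 means additional sequencing mostly re-reads molecules that were already observed.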

Comprehensive Single-Cell Transcriptomic Analysis of S. aureus and C. crescentus using RiboD-PETRI

Technical application of RiboD-PETRI in S. aureus (SA) (A-F), cultured for 9 hours in MHB medium at 37 °C (Table S14), and C. crescentus (CC) (G-L), incubated at 37 °C for 3 hours (Table S15). “SA” denotes S. aureus, and “CC” denotes C. crescentus. (A, G) The number of UMIs detected per cell in different samples (≥15 UMIs/cell): (A) S. aureus and (G) C. crescentus. (B, H) Distribution of mRNA UMIs captured per cell in RiboD-PETRI data of (B) S. aureus and (H) C. crescentus, presented as violin plots showing the upper quartile, median, and lower quartile lines. The cells are ranked from highest to lowest based on the number of UMIs detected. Then, specific numbers of cells (indicated above the panel) are selected for plotting. The median number of UMIs is calculated for these selected cells. (C, I) The number of genes detected per cell in different samples: (C) S. aureus and (I) C. crescentus. The cells are ranked from highest to lowest based on the number of genes detected. Then, specific numbers of cells (indicated above the panel) are selected for plotting. The median number of genes is calculated for these selected cells. (D, J) UMAP visualization of (D) S. aureus and (J) C. crescentus, demonstrating the ability of RiboD-PETRI to distinguish population heterogeneity. (E, K) Principal Component Analysis (PCA) performed on normalized, screened data of (E) S. aureus and (K) C. crescentus. The resulting scatterplots show heterogeneity among the populations, with each point representing a cell. (F, L) Distribution of UMIs on the UMAP results for (F) S. aureus and (L) C. crescentus. UMAP results reveal heterogeneity among populations, with each point representing a cell and color shading indicating UMI counts.

Evaluation of Transcriptomic Consistency and Batch Effect Analysis in Static Biofilm E. coli Samples

(A) Scatterplot demonstrating the relationship between reads per cell and UMI counts per cell detected in static biofilm E. coli data. Two replicates of the sample are included. (B) Pearson correlation coefficient (r) of UMI counts per gene between replicate 1 and replicate 2 of static biofilm E. coli. The analysis involved 4,062 out of 4,141 total genes and showed a significant correlation (p-value < 0.0001, r = 0.96), indicating good reproducibility between samples. Each dot represents a gene. (C) UMAP plot colored by the original identity of static biofilm E. coli samples (replicate 1 and replicate 2) before batch effects were removed. Each dot represents a cell, with red indicating replicate 1 and green indicating replicate 2. (D) UMAP plot colored by the original identity of static biofilm E. coli samples (replicate 1 and replicate 2) after batch effects were removed using Harmony. (E) Principal Component Analysis (PCA) performed on screened data of two replicates of static biofilm E. coli. The resulting scatterplots show heterogeneity among the populations, with each point representing a cell. (F) Distribution of UMIs on the UMAP results for two replicates of static biofilm E. coli. UMAP results reveal heterogeneity among populations, with each point representing a cell and color shading indicating UMI counts.

Profiling of Marker Genes in exponential phase E. coli culture by RiboD-PETRI

Expression levels of diverse marker genes across distinct clusters in exponential phase E. coli culture, visualized through violin plots. Each individual dot represents a single cell, demonstrating the high-resolution, single-cell nature of the RiboD-PETRI analysis.

Marker Genes Identified in stationary phase S. aureus culture by RiboD-PETRI

Expression levels of different marker genes across different clusters in stationary phase S. aureus culture overlaid on the UMAP plot. Marker genes were selected based on a p-value less than 0.001 and a log2 FC greater than 0.2. Each dot represents a cell, with color shading indicating UMI counts.

Marker Genes Identified in exponential phase C. crescentus culture by RiboD-PETRI

Expression levels of different marker genes across different clusters in exponential phase C. crescentus culture overlaid on the UMAP plot. Marker genes were selected based on a p-value less than 0.001 and a log2 FC greater than 0.2. Each dot represents a cell, with color shading indicating UMI counts.

Marker Genes Identified in static E. coli biofilms by RiboD-PETRI

Expression levels of different marker genes across different clusters in static E. coli biofilms overlaid on the UMAP plot. Marker genes were selected based on a p-value less than 0.001 and a log2 FC greater than 3. Each dot represents a cell, with color shading indicating UMI counts.

Schematic chart for the structure of E. coli PdeI