A major challenge in biology is identifying distinct cell classes and mapping their interactions in vivo. Tissue-dissociative technologies enable deep single cell molecular profiling but do not provide spatial information. We developed a proximity ligation in situ hybridization technology (PLISH) with exceptional signal strength, specificity, and sensitivity in tissue. Multiplexed data sets can be acquired using barcoded probes and rapid label-image-erase cycles, with automated calculation of single cell profiles, enabling clustering and anatomical re-mapping of cells. We apply PLISH to expression profile ~2900 cells in intact mouse lung, which identifies and localizes known cell types, including rare ones. Unsupervised classification of the cells indicates differential expression of ‘housekeeping’ genes between cell types, and re-mapping of two sub-classes of Club cells highlights their segregated spatial domains in terminal airways. By enabling single cell profiling of various RNA species in situ, PLISH can impact many areas of basic and medical research.https://doi.org/10.7554/eLife.30510.001
The human body contains several hundred types of specialized cells that have different roles. The cells form tissues, and each tissue can only work if it has the right cells and if they are correctly organized and distributed to build working structures. This is how the same body parts, e.g. lungs, brains, hearts, end up looking and working the same way in almost everyone.
Cells organize themselves into tissues by exchanging short-range messages between nearby cells. Understanding how cells communicate to form, maintain and repair tissues is a challenge for biologists. Finding ways to examine different signals at the same time would improve our understanding of these important processes.
Now, Nagendran, Riordan et al. developed a microscopy technique that can tackle this issue. The cells can be stained and tagged with already established dyes and markers to measure the location and signals of all cell groups at the same time. By calculating how these cells are distributed in space it is then possible to estimate how they interact with each other. Based on this, Nagendran, Riordan et al. then successfully tested their tool in lung tissues from mice and humans.
This cheap, high-speed technology works on tissue samples from any animal, including humans, and can be easily combined with existing technologies and so be adapted for a wide range of uses. A deeper knowledge of how combinations of signals guide tissue formation and maintenance could help us to better understand what causes developmental diseases, organ failures and cancer. Tools like this could even help to identify key targets for new treatments.https://doi.org/10.7554/eLife.30510.002
In parallel with the development of single-cell RNA sequencing (scRNA-seq), there have been rapid advances in single-molecule in situ hybridization (smISH) techniques that localize RNAs of interest directly in fixed cells (Shah et al., 2016; Huss et al., 2015; Chen et al., 2015; Wang et al., 2012; Larsson et al., 2010; Raj et al., 2008; Femino et al., 1998). These smISH techniques involve hybridization of fluorescently-labeled oligonucleotide probes, typically 24–96 per gene, to mark individual RNA molecules with a discrete, diffraction-limited punctum that can be quantitatively analyzed by fluorescence microscopy. smISH has been used in cultured cells to study the subcellular distribution of RNAs (reviewed in [Buxbaum et al., 2015]), the consequences of stochastic noise on gene expression (Raj et al., 2010; Raj et al., 2006), and the impact of cell shape and environment on expression programs (Moffitt et al., 2016a; Battich et al., 2015).
An increasingly important application for smISH is the simultaneous localization of customized panels of transcripts in tissue, which is used to validate putative cell subtypes identified by scRNA-seq studies (Grün and van Oudenaarden, 2015). Performing smISH in intact tissue can also reveal the spatial relationship between the cells expressing secreted signaling factors and the cells expressing the corresponding receptors, information that current scRNA-seq approaches cannot resolve because they require tissue dissociation with irretrievable loss of spatial context. Finally, when applied on a genome-wide scale in tissues, smISH has the potential to entirely bypass scRNA-seq as an upfront discovery tool.
The development of multiplexed smISH for use in tissue has been challenging due to autofluorescent background and light scattering (Shah et al., 2016; Sylwestrak et al., 2016; Moffitt et al., 2016b; Chen et al., 2016; Choi et al., 2014; Lyubimova et al., 2013). One strategy for addressing this problem is to amplify probe signals by the hybridization chain reaction (HCR, reviewed in (Choi et al., 2016); see also (Wang et al., 2012) for branched-DNA amplification), which provides up to five orthogonal detection channels. Higher levels of multiplexing can be achieved by repeated cycles of RNA in situ hybridization followed by a re-amplification step (Shah et al., 2016), but because a single round of probe hybridization in tissue sections takes hours, multiplexing with HCR is laborious. Unamplified smISH techniques have the practical advantage that hundreds of endogenous RNA species can be barcoded in a single reaction, and then read out with rapid label-image-erase cycles (Moffitt et al., 2016b; Moffitt et al., 2016c), but these do not provide adequate signal in tissues.
Ideally, a technique for high-throughput profiling would combine all of the RNA probe hybridization and signal amplification steps into a single reaction. Previously, Nilsson and colleagues presented an elegant enzymatic solution to this problem (Larsson et al., 2010; Ke et al., 2013). They used barcoded padlock probes to label cDNA molecules in cells and tissues, and rolling-circle amplification (RCA) to transform the circularized probes into long tandem repeats. The approach worked in tissues and handled an unbounded number of orthogonal amplification channels. The only limitations were that the RNA-detection efficiency was capped at about 15% (each transcript could only be probed at a single site because the 3' end of the cDNA served as the replication primer), and that the approach required an in situ reverse transcription step with specialized and costly locked nucleic-acid primers.
Here, we report an in situ hybridization technique with performance characteristics that enable rapid and scalable single-cell expression profiling in tissue. Our approach is a simplified variant of the padlock/RCA technique which replaces padlock probes with RNA-templated proximity ligation; hence, we term it proximity ligation in situ hybridization (PLISH). As demonstrated below, PLISH generates data of exceptionally high signal-to-noise. Multiplexed hybridization and signal amplification of all target RNA species is carried out in a single parallel reaction, and the RNAs are then localized with rapid label-image-erase cycles. PLISH exhibits high detection efficiency because it probes multiple sites in each target RNA, and high specificity because of the proximity ligation mechanism. PLISH utilizes only commodity reagents, so it can be scaled up inexpensively to cover many genes. It works well on conventional formalin-fixed tissues that have been cryo- or paraffin-embedded, and can be performed concurrently with immunostaining, making it extremely versatile. Using the murine lung as a characterized model tissue, we show that multiplexed PLISH can rediscover and spatially map the distinct cell types of a tissue in an automated and unsupervised fashion. An unexpected discovery from this experiment is that murine Club cells separate into two populations that differ molecularly and segregate anatomically. PLISH constitutes a novel, single cell spatial-profiling technology that combines high performance, versatility and low cost. Because of its technical simplicity, it will be accessible to a broad scientific community.
Proximity ligation at Holliday junctions offers a simple mechanism for the amplified detection of RNA. First, a transcript is targeted with a pair of oligonucleotide 'H' probes designed to hybridize at adjacent positions along its sequence (Figure 1A). The left H probe includes a single-stranded 5' overhang while the right probe includes a 3' overhang. Importantly, target RNAs can be tiled with H probe pairs at multiple sites, which is critical for efficient detection of low abundance transcripts (Figure 1B). The overhangs are then hybridized to 'bridge' and linear 'circle' oligonucleotides with embedded barcode sequences to form a Holliday junction structure, after which ligation at the nick sites creates a closed circle. Finally, the 3' end of the right H probe primes rolling-circle replication, which generates a long single-stranded amplicon of tandem repeats. Addition of fluorescently-labeled 'imager' oligonucleotides complementary to the barcodes generates an extremely bright punctum at the site of each labeled transcript. Because each barcode sequence is unique, the puncta derived from different target RNAs can be labeled with different colors (Figure 1C).
To implement PLISH, we adapted protocols for antibody-based proximity ligation (Söderberg et al., 2006). The technique utilizes conventional oligonucleotides, two commercially available enzymes, and procedures familiar to molecular biologists. The ligase and polymerase enzymes are less than half the size of an immunoglobulin G, and they diffuse at least as rapidly as the 60mer DNA hairpins used for HCR amplification (Choi et al., 2014; Joubert et al., 2003; Lapham et al., 1997; Modrich et al., 1973). Our initial studies produced bright puncta that were absent if any of the oligonucleotide or enzyme reagents was withheld. The signal from the individual RCA amplicons exceeded cellular and tissue fluorescence background by more than 30-fold, rendering autofluorescence inconsequential (Figure 1—figure supplements 1A–B and [Jarvius et al., 2006; Blab et al., 2004]). Histograms of puncta intensities fit to a negative binomial distribution, as expected for a DNA replication process that terminates stochastically and irreversibly (Figure 1—figure supplement 1C–F). The coefficients of variation for the puncta intensity distributions were typically between one and two.
The requirement for coincident hybridization of two probes at adjacent sites in an RNA transcript should make PLISH highly specific. To evaluate this, we performed several experiments. First, we used PLISH to detect the transcription factor SRY-box 4 (SOX4) in cultured HCT116 cells. A pool of ten H probesets exhibited much higher RNA detection efficiency than a single H probeset, as expected (Figure 1D). However, when the RNA-recognition sequence of either the left or right H probe in each set was scrambled, there were no detectable puncta. Thus, both H probes had to be correctly targeted to generate a signal. Second, we tested the sequence-specificity of the PLISH signal in tissue by pre-incubating samples with antisense 'blocking' oligonucleotides complementary to the target RNA at the H probe hybridization sites. For these experiments, we stained mouse lung sections for secretoglobin 1a1 (Scgb1a1), a marker of airway Club cells. Antisense oligonucleotides drastically attenuated the number of PLISH puncta, whereas scrambled blocking oligonucleotides of the same length had no apparent effect (Figure 1E). Third, we analyzed murine lung sections for the co-localization of the mRNA transcript and protein product of surfactant protein C (Sftpc), which is expressed in alveolar epithelial type II (AT2) cells. Of the cells that were positive for PLISH signal, 98.5% were also positive for antibody staining (n = 184, Figure 1—figure supplement 2). This level of specificity is excellent relative to HCR-amplified smISH, where off-target binding of hybridization probes can account for a quarter of the observed puncta (Shah et al., 2016).
To quantify the sensitivity and accuracy of RNA detection, we benchmarked PLISH measurements against a reference-standard dataset of single-cell, quantitative reverse transcription polymerase chain reaction (qPCR) and RNA-seq measurements on HCT116 cells (Wu et al., 2014). For genes with fragment-per-kilobase-per-million-read (FPKM) values greater than one, the single-cell qPCR technique detected mRNA in >90% of the cells (Figure 1F). However, the fraction of transcript-positive cells dropped quickly between FPKM values of 1 and 0.1. A fit of the qPCR data to a Poisson sampling model suggested that an FPKM value of one corresponded to 2.5 copies per cell (see also [Marinov et al., 2014; Battich et al., 2013]). The PLISH technique detected RNA transcripts with a sensitivity comparable to single-cell qPCR. For example, Caspase-9 (CASP9) has an FPKM value of 2, and it was observed in 100% of the cells by PLISH. We detected an average of 8 puncta per cell, which is consistent with the prediction of 5 copies per cell from the fit to the qPCR data (Figure 1F, inset). For a set of ten genes covering the full spectrum of expression levels in HCT116 cells, the number of PLISH puncta per cell correlated with bulk FPKM values (Figure 1G).
To quantify RNA-detection efficiency in tissue, we marked a set of axin 2 (Axin2) transcripts in mouse lung sections using an HCR-amplified smISH procedure (Choi et al., 2014) and then determined the fraction of the marked transcripts that could be identified by PLISH. We chose the Axin2 gene because of its low expression level in the lung. HCR detected a sparse population of cells with one to two puncta each (the HCR detection efficiency was low because we used a single HCR probe rather than 24). PLISH puncta generated with a pool of four H probe pairs co-localized with 32% of the HCR puncta (Figure 1—figure supplement 3). Thus, the four PLISH probesets detected Axin2 transcripts with a composite efficiency of 32% and an average per-site efficiency of 9%. This probe efficiency matches or exceeds that of other smISH techniques. The PLISH detection efficiency can be tuned on a per gene basis by altering the number of H probe pairs. Decreasing the number of probesets pro-rates the number of puncta from highly-expressed genes, while increasing the number of probesets can facilitate sensitive detection of very low-abundance transcripts.
We next characterized the performance of PLISH for low-plex RNA localization in tissues. This experimental format uses a disposable hybridization chamber that is sealed to a coverslip or slide surrounding a tissue section (Figure 2A). PLISH detection of up to 5 RNA species is accomplished by stepwise application of reagents through the inlet and outlet ports of the chamber. The puncta from each RNA species are then labeled in a unique color by hybridization to 'imager' oligonucleotides with spectrally-distinct fluorophores. After imaging, the fluorescence micrographs are interpreted by direct visual inspection. We aimed to test whether PLISH provides single-molecule and single-cell resolution in tissues, whether it robustly detects low-abundance RNA species, whether the spatial distribution of RNA is consistent with prior knowledge, whether PLISH is compatible with simultaneous immunostaining, and whether it is compatible with formalin-fixed, paraffin-embedded (FFPE) samples.
First, we analyzed murine lung sections for RNA expression of the ciliated-cell marker Forkhead box J1 (Foxj1), and the Club-cell marker Scgb1a1. Foxj1 is a low-abundance transcript with an FPKM value of 10 in ciliated cells, as measured by scRNA-seq (Treutlein et al., 2014). We observed single cells with multiple discrete Foxj1 puncta in the terminal bronchiolar epithelium, surrounded by numerous strongly Scgb1a1 positive cells (Figure 2B). These data establish PLISH's single-molecule and single-cell resolution in tissues, and its ability to detect low-abundance transcripts.
Second, we analyzed human lung FFPE sections for RNA expression of SCGB1A1, and for protein expression of the basal cell marker, Keratin 5 (KRT5). To do this, we appended two antibody incubation steps to the standard PLISH protocol. Strongly SCGB1A1 positive cells were localized to the lumen of the airways, overlying KRT5 positive cells (Figure 2C), matching the known anatomical distribution of Club and basal cells, respectively. These data establish PLISH's compatibility with simultaneous immunostaining, and with FFPE samples.
Third, we analyzed murine lung sections for RNA expression of three genes: the AT2 cell marker Sftpc, the macrophage-enriched marker Lysozyme 2 (Lyz2), and Scgb1a1. Overlays of the three channels provided a striking visual depiction of the different cell types. Macrophages were bright in the Lyz2 channel, but absent in the other channels (Figure 2D). AT2 cells were bright in the Sftpc channel, moderately bright in the Lyz2 channel and absent in the Scgb1a1 channel (Figure 2D, white cells in the overlay). Club cells were very bright in the Scgb1a1 channel, but otherwise absent (Figure 2E). Finally, putative bronchioalveolar stem cells (Kim et al., 2005) (BASCs) were bright in the Sftpc channel with a weak punctate signal in the Scgb1a1 channel (Figure 2E). Thus, raw PLISH data can be interpreted without any computational processing, made possible by PLISH's exceptional signal-to-noise in tissues.
We also evaluated how PLISH performs in primary samples of diseased human tissue, to assess whether it will be useful for molecular analysis of the many human diseases that cannot be accurately modeled in animals. One example is idiopathic pulmonary fibrosis (IPF), a fatal lung disease of unknown pathogenesis (Travis et al., 2013). The diagnosis of IPF is based on the presence of specific histological features, including clusters of spindle-shaped fibroblasts, stereotyped 'honeycomb' cysts, and epithelial cell hyperplasia. In this regard, single-cell profiling approaches that operate on dissociated tissue (Xu et al., 2016) are intrinsically limited because they cannot correlate molecular data with cytologic and spatial features. As a preliminary test, we used PLISH to analyze RNA expression of the AT2 cell marker SFTPC in resected lung tissue from control and IPF patients. In contrast to the uniformly cuboidal SFTPC-expressing AT2 cells distributed throughout alveoli of non-IPF lungs (Figure 2F), we observed clusters of SFTPCHi cells of heterogeneous size and varying degrees of flattening lining the airspace lumen of IPF lungs. Surprisingly, many cells that did not appear to be epithelial (i.e., they were not lining an airway lumen) expressed SFTPC at low levels. Based on this pilot experiment, PLISH should be a suitable tool for building atlases of RNA expression in human disease. The PLISH data can be overlaid with monoclonal antibody staining patterns that are the mainstay of pathologic diagnosis and classification.
Highly multiplexed measurement of different RNA species requires iterated data collection cycles, since conventional fluorescence microscopy only provides up to five channels (Figure 3A). The data collection cycles include fluorescent labeling of a subset of the 'barcodes' (i.e., unique nucleotide sequences complementary to fluorescently labeled 'imager' oligonucleotides) in a sample, imaging of the labeled transcripts, and erasure of the fluorescent signal. Ideally, the cycles should be fast, and the erasure should not cause any mechanical or chemical damage to the sample. Consistent with prior work, we found that PLISH puncta could be imaged in the presence of a 3 nM background of freely-diffusing imager oligonucleotides (Figure 3—figure supplements 1A and [Blab et al., 2004]). This allowed us to streamline data collection by eliminating a wash step, and also presented a simple erasure strategy. By using short imager oligonucleotides that equilibrate rapidly on and off of RCA amplicons (Jungmann et al., 2014), we could erase fluorescence from a previous cycle by a simple buffer exchange (Figure 3—figure supplement 1B). We also established an erasure method based on uracil-containing imager oligonucleotides, which were removed with a 15 min enzymatic digestion (Figure 3—figure supplement 1C). Thus, we could image PLISH puncta in five different color channels with spectrally-distinct fluorophores, and we were able to complete cycles in as little as 20 min, which approaches the cycle time of an Illumina MiSeq instrument (https://support.illumina.com).
To demonstrate and validate the multiplexing capacity of PLISH, we co-localized the mRNA of eight selected genes in ~2900 single cells from an adult mouse lung (Figure 3B). Our panel included four commonly used lung cell type markers that have been previously characterized, and four ubiquitously expressed genes. The targeted transcripts were Sftpc (AT2 cells), advanced glycosylation end product-specific receptor (Ager, AT1 cells), Scgb1a1 (Club cells), Lyz2 (macrophage and AT2 cell subset), ferritin light polypeptide 1 (Ftl1), beta actin (Actb), inactive X specific transcripts (Xist), and glyceraldehyde-3-phosphate dehydrogenase (Gapdh). The eight RNA species were barcoded in a single PLISH reaction, and the data were collected with a pair of label-image-erase cycles using the enzymatic erasure approach described above (Figure 3A). A nuclear counterstain (DAPI) and transmitted light micrograph were also obtained.
To quantify the expression of all eight genes on a per-cell basis, we created a PLISH-specific pipeline in CellProfiler, an open-source software package (Kamentsky et al., 2011). The pipeline first identified nuclei in the DAPI channel, which were used as anchor points for expansion to full-cell assignments. Fortuitously, the bulk of the detected mRNAs in AT1 cells, which have an extremely flat and broad morphology, were clustered around the nuclei. We summed the PLISH signal for each gene in the nuclear and peri-nuclear regions of each cell, and saved the results as single-cell expression profiles indexed on anatomical location. We also created a utility to pseudocolor cells in a transmitted light micrograph according to their inferred cell type (see below), so that we could visualize the relationship between cellular gene expression and anatomical localization.
An important scientific challenge is to identify and map all of the molecularly distinct cell types that make up complex tissues, and in situ single-cell profiling should be a powerful tool for working towards this goal. As a proof-of-concept for this, we asked whether known lung cell types could be rediscovered by an automated and unsupervised analysis of our multiplexed PLISH data set. We used two standard data analysis tools, K-means clustering (Figure 3C) and t-distributed stochastic neighbor embedding (van der Maaten and Hinton, 2008) (t-SNE, Figure 3D), to classify and visualize the entire population of cells. The automated analysis identified ten cell classes, four of which were labeled 'other' because they were defined primarily by 'signature' profiles of ubiquitously-expressed genes. The remaining six classes were associated with a known lung cell type based on marker-gene expression. The Sftpc and Scgb1a1 positive cell classes were labeled as AT2 and Club, respectively, while the Lyz2 positive class was labeled as macrophage (one of the two AT2 cell classes was also Lyz2 positive as previously reported in (Desai et al., 2014), Figure 4—figure supplement 1A). The cell class with the highest Ager expression was labeled as AT1, but Ager mRNA was also detected in a subset of AT2 and Club cells, and in one of the four 'other' cell classes, indicating it is not particularly specific for AT1 cells. We validated the PLISH results by indirect immunohistochemistry (Figure 4—figure supplement 1B) and by comparison with previously published scRNA-seq data (Figure 4—figure supplement 1C), which confirmed the low specificity of Ager for AT1 cells. We also analyzed the RNA expression of Akap5, another transcript that is highly-enriched in AT1 cells (Treutlein et al., 2014), and found that its localization correlated closely with Ager's (Figure 4—figure supplement 1D).
For a higher-resolution analysis of cellular gene expression, we examined the expression pattern of individual genes in re-colored t-SNE plots (Figure 4A). We found a small cloud of cells between the Club and AT2 clusters that expressed both Sftpc and Scgb1a1. On the basis of this dual expression, we assigned them as the BASC type (Kim et al., 2005). We also noted that Lyz2 expression partitioned the AT2 cells into two classes designated Lyz2+ and Lyz2-, while Actb segregated Club cells into two classes designated ActbHi and ActbLo. Gapdh was the most uniformly expressed transcript, consistent with its role as a 'housekeeping' gene (Figure 4B). Ftl1 expression was highest in alveolar macrophages, as expected, where it is believed to play a role in processing iron from ingested red blood cells (McGowan and Henley, 1988). Unexpectedly, Ftl1 was also highly expressed in Club cells. Actb expression was highest in macrophages, presumably because of its functional role in motility, and in AT1 cells, which must maintain a flat morphology and expansive cytoskeleton (Foster et al., 2010).
To validate the PLISH results, we pseudocolored the cells in transmitted-light images according to their class (Figure 4C–D). Importantly, no spatial information was included in the k-means clustering. Several observations confirmed the accuracy of the automated classification. First, the Club cell class mapped perfectly onto the bronchial epithelium, while cells from the AT1 and AT2 classes were distributed throughout the alveolar compartment. The rare BASCs also localized precisely to the bronchioalveolar junctions, where they have been shown to reside by immunostaining (Kim et al., 2005) (Figure 4C and Figure 4—figure supplement 1E). The macrophage class was primarily found inside the alveolar lumen, and many exhibited a characteristic rounded cell shape. The Otherd class of cells was enriched in pulmonary arteries, and therefore might represent endothelial or perivascular cells. We further observed a striking spatial segregation of the two Club cell classes. ActbHi Club cells clustered together at the bronchial terminus, while ActbLo Club cells populated more proximal domains (Figure 4D–E). While the significance of this pattern is not immediately obvious, it emphasizes how PLISH can readily integrate molecular and spatial features of single cells to generate insights that would be missed with either piece of information alone.
PLISH represents a practical technology for multiplexed expression profiling in tissues. It combines high performance in four key areas: specificity, detection efficiency, signal-to-noise and speed. The specificity derives from coincidence detection, which requires two probes to hybridize next to one another for signal generation. Efficient detection of low-abundance transcripts is accomplished by targeting multiple sites along the RNA sequence. Enzymatic amplification produces extremely bright puncta, and allows many different RNA transcripts to be marked with unique barcodes in one step. The different RNA transcripts can then be iteratively detected to rapidly generate high dimensional data.
While low-plex PLISH on a handful of different genes can be valuable, the PLISH technology is also scalable, without requiring specialized microscopes (or other equipment), software, or computational expertise. The oligonucleotides and enzymes are inexpensive and commercially available from multiple vendors. The H probes are the cost-limiting reagent, but can be synthesized in pools (Murgha et al., 2014; Beliveau et al., 2012). Assuming five pairs of H probes for each target RNA species, and 20 cents for a 40mer oligonucleotide, the cost of PLISH reagents amounts to $3 per gene. It should therefore be practical to simultaneously interrogate entire molecular systems, such as signaling pathways or super-families of adhesion receptors. The high specificity and signal-to-noise of PLISH will be advantageous for deep profiling, where non-specific background increases with increasingly complex mixtures of hybridization probes (Moffitt et al., 2016c).
Our initial studies demonstrate PLISH's capacity for rapid, automated and unbiased cell-type classification, and illustrate how it can complement single-cell RNA sequencing (sc-RNAseq). Sc-RNAseq offers greater gene depth than in situ hybridization approaches, but it is less sensitive, fails to capture spatial information, and induces artefactual changes in gene expression during tissue dissociation (van den Brink et al., 2017; Lee et al., 2015). PLISH provides the missing cytological and spatial information, and it is applied to intact tissues. Going forward, sequencing can be used to nominate putative cell types and molecular states based on the coordinate expression of 'signature genes', and multiplexed PLISH can be used to distinguish true biological variation from technical noise and experimentally-induced perturbations. Importantly, multiplexed PLISH provides the tissue context of distinct cell populations, which is essential for understanding the higher-order organization of intact systems like solid tumors and developing organs. In diseases like IPF where morphology and gene expression are severely deranged (Xu et al., 2016), histological, cytological and spatial features may even be essential for making biological sense of sequencing data.
Currently, efforts are underway to more deeply characterize cellular states by integrating diverse types of molecular information. We have already demonstrated the combined application of PLISH with conventional immunostaining. Going one step further, oligonucleotide-antibody conjugates make it possible to mix and match protein and RNA targets in a multiplexed format (Weibrecht et al., 2013). The generation of comprehensive, multidimensional molecular maps of intact tissues, in both healthy and diseased states, will have a fundamental impact on basic science and medicine.
During the course of this work, Frei et al. published a related RNA detection approach (Frei et al., 2016) that utilizes the classical proximity ligation architecture (Gullberg et al., 2004; Söderberg et al., 2006) rather than a Holliday Junction architecture (Labib et al., 2013), and applied it to multiplexed RNA detection in fixed cells.
Unless otherwise specified, all reagents were from Thermo-Fisher and Sigma-Aldrich. Oligonucleotides were purchased from Integrated DNA Technologies. T4 polynucleotide kinase, T4 ligase, USER enzyme and their respective buffers were purchased from New England Biolabs. Nxgen phi29 polymerase and its buffer were purchased from Lucigen.
Abbreviations: BSA, bovine serum albumin; DAPI, 4,6-diamidino-2-phenylindole; DEPC, diethylpyrocarbonate; EDTA, ethylenediaminetetraacetic acid; min, minutes; PBS, phosphate buffered saline; PFA, paraformaldehyde; RCA, rolling circle amplification; RT, room temperature. All oligonucleotide sequences are listed in Supplementary file 1.
HCT116 cells (ATCC; CCL-247) were authenticated by HLA typing and confirmed negative for Mycoplasma contamination using PCR. Cells were grown on poly-lysine coated #1.5 coverslips (Fisherbrand 12–544 G) using standard cell culture protocols until they reached the desired confluency. The cells were rinsed in 1X PBS and fixed in 3.7% formaldehyde with 0.1% DEPC at RT for 20 min. The fixed cells were treated with 10 mM citrate buffer (pH 6.0) at 70°C for 30 min, dehydrated in an ethanol series, then enclosed by application of a seal chamber (Grace Biolabs 621505) to the coverslip.
Lungs were collected from adult B6 mice (Jackson Labs) and fixed by immersion in 4% PFA as previously described (Desai et al., 2014). Non-IPF human lung tissue was obtained from a surgical resection, and IPF tissue from an explant. All mouse and human research were approved by the Institutional Animal Care and Use Committee and Internal Review Board, respectively, at Stanford University. The tissues were fixed by immersion in 10% neutral buffered formalin in PBS at 4°C overnight under gentle rocking, cryoprotected in 30% sucrose at 4°C overnight, submerged in OCT (Tissue Tek) in an embedding mold, frozen on dry ice, and stored at −80°C. 20 μm sections were cut on a cryostat (LeicaCM 3050S) and collected on either poly-lysine coated #1.5 coverslips or glass slides (Fisherbrand Superfrost), air dried for 10 min, and post-fixed with 4% PFA at RT for 20 min. The human lung tissue in Figure 2A was formalin-fixed and paraffin-embedded (FFPE) according to standard protocols, and 20 μm sections were cut on a microtome and collected on glass slides. The FFPE sections were deparaffinized by immersion in Histoclear (National Diagnostics, HS-200) for 3 × 5 min, then dehydrated in an ethanol series and post-fixed with 4% PFA at RT for 20 min. Tissue sections were treated with 10 mM citrate buffer (pH 6.0) containing 0.05% lithium dodecyl sulfate at 70°C for 30 min, or in some experiments, with 0.1 mg/ml Pepsin in 0.1M HCl for 8 min at 37°C and dehydrated in an ethanol series. Following treatment, sections were air dried for 10 min and enclosed by application of a seal chamber.
Target RNAs were probed at ~40 nucleotide detection sites, with 1 to 10 sites per RNA species depending on expression level. NCBI BLAST searches were used to eliminate detection sites that shared 10 or more contiguous nucleotides with a non-target RNA. The detection sites were also selected to minimize self-complementarity as indicated by the IDT oligo analyzer. Each detection site was targeted with a pair of H probes designated HL (left H probe) and HR (right H probe). The HL and HR probes included ~20 nucleotide binding sequences that were complementary respectively to the 5' and 3' halves of the detection site. The binding sequences were chosen so that the 5' end of the HL binding sequence and the 3' end of the HR binding sequence would abut at a 5’-AG-3’ or a 5’-TA-3’ dinucleotide in the target RNA. The lengths of the binding sequences were adjusted so that the melting temperature of the corresponding DNA duplex would fall between 45–65°C as computed by IDT Oligo analyzer using default settings of 0.25 μM oligo concentration and 50 mM salt concentration. To generate H probes, suitable HL and HR binding sequences were catenated at their respective 5' and 3' ends with overhang sequences taken from one of eight modular design templates (Supplementary file 1). The left and right overhang sequences in each design template were complementary to a specific bridge (B) and circle (C) oligonucleotide, which directed a desired fluorescent readout. The design templates reported here utilized a common 31 base oligonucleotide for the bridge. Following previous work (Söderberg et al., 2006), the circle oligonucleotides were ~60 bases long with 11 base regions of complementarity to cognate H probes on either end. The circle sequences were chosen to minimize self-complementarity. Each imager oligonucleotide was complementary to a barcode embedded in one of the C oligonucleotides, allowing unique detection of the corresponding RCA amplicon.
The H-probe oligonucleotides were ordered on a 25 nanomole scale with standard desalting. The B and C oligonucleotides were ordered on a 100 nanomole scale with HPLC purification, and phosphorylated with T4 polynucleotide kinase according to the manufacturer recommendations. Imager oligonucleotides were purchased either as HPLC-purified fluorophore conjugates (A488, Texas Red, Cy3, Cy5), or as amine-modified oligonucleotides that were subsequently coupled to Pacific Blue-NHS ester according to the manufacturer recommendations.
Six buffers were used for PLISH barcoding: H-probe buffer (1M sodium trichloroacetate, 50 mM Tris pH 7.4, 5 mM EDTA, 0.2 mg/mL Heparin), bridge-circle buffer (2% BSA, 0.2 mg/mL heparin, 0.05% Tween-20, 1X T4 ligase buffer in RNAse-free water), PBST (PBS + 0.1% Tween-20), ligation buffer (10 CEU/μl T4 DNA ligase, 2% BSA, 1X T4 ligase buffer, 1% RNaseOUT and 0.05% Tween-20 in RNAse-free water), labeling buffer (2x SSC/20% formamide in RNAse-free water), and RCA buffer (1 U/μl Nxgen phi29 polymerase, 1X Nxgen phi29 polymerase buffer, 2%BSA, 5% glycerol, 10 mM dNTPs, 1% RNaseOUT in RNAse-free water).
An H cocktail was prepared by mixing H probes in H-probe buffer at a final concentration of 100 nM each. If an RNA was targeted with more than five probe sets, the concentrations of the H probes for that RNA were pro-rated so that their sum did not exceed 1000 nM. A BC cocktail was also prepared by mixing B and C oligonucleotides in bridge-circle buffer at a final concentration of 6 μM each.
Single-step barcoding was performed in sealed chambers. The workflow consisted of three steps: (i) The sample was incubated in the H cocktail at 37°C for 2 hr. The sample was then washed 4 × 5 min with H-probe buffer at RT, and incubated in the BC cocktail at 37°C for 1 hr. (ii) Following a 5 min wash with PBST at RT, the sample was incubated in ligation buffer at 37°C for 1 hr. (iii) The sample was washed 2 × 5 min with labeling buffer at RT, and washed with 1X Nxgen phi29 polymerase buffer at RT for 5 min. The sample was then incubated in RCA buffer at 37°C for 2 hr (typical for cultured cells) to overnight (typical for tissue). Finally, the sample was washed 2 × 5 min with labeling buffer.
Barcoded PLISH samples were fluorescently labeled by two different procedures, designated 'washout' and 'fast'. In the washout procedure, the sample was incubated with imager oligonucleotides in imager buffer (labeling buffer with 0.2 mg/mL heparin) at a final concentration of 100 nM each for 30 min, and then washed 2 × 5 min with PBST at RT. In the fast procedure, the sample was incubated for 5 min with imager oligonucleotides in imager buffer at a final concentration of 3 nM each, and then imaged immediately. Samples that did not require label-image-erase cycles were stained with DAPI (stock 1 mg/ml; final concentration - 1:1000 in PBS) for 5 min and mounted in H-1000 Vectashield mounting medium (Vector).
Data were collected by confocal microscopy (Leica Sp8 and Zeiss LSM 800) using a 40X oil immersion or a 25X water immersion objective lens. 20 μm z-stacks were scanned, and maximum projection images were saved for analysis. For 5-color experiments, DAPI was added after the Pacific Blue channel had been imaged, and the Texas Red and Cy3 channels were linearly unmixed using Zeiss software. Transmitted light images were acquired on a Leica Sp8 confocal microscope using the 488 nm Argon laser and the appropriate PMT-TL detector. Images from serial rounds of data collection were aligned using the nuclear stain from each round as a fiducial marker. Unless otherwise stated, imaging data of cells and mouse lung tissue are representative of three independent experiments with ≥4 fields of view each. Imaging data of human lung tissue are representative of two independent experiments with ≥4 fields of view each.
HCR was performed following a published protocol (Choi et al., 2014) with probes that targeted two sites covering nucleotides 621–670 and 1159–1208 in the mouse Axin2 transcript, and AlexaFluor 488-/AlexaFluor 647-labeled amplifier oligonucleotides. The samples were then processed for PLISH with H probes targeting four sites covering nucleotides 347–386, 1878–1917, 2412–2451 and 2956–2995 in the Axin2 transcript, and imaged using a Cy3-labeled imager oligonucleotide.
PLISH barcoding was performed as described above. Subsequently, the sample was washed 3 × 5 min with PBST at RT, and incubated in blocking solution (50 μl/ml [5%] normal goat serum, 1 μl/ml [0.1%] Triton X-100, 5 mM EDTA and 0.03 g/ml [3%] BSA in PBS) at RT for 1 hr. The sample was then incubated with primary antibody (Rabbit anti-pro-Sftpc, Millipore, 1:500 or Rabbit anti-Cytokeratin 5, Abcam Ab193895, 1:400) in blocking solution at 37°C for 2 hr under gentle rocking, washed 4 × 5 min with PBST at RT, and incubated with secondary antibody (Goat anti-Rabbit-Cy5, Jackson Lab, 1:250) and DAPI (1:1000) in blocking solution at RT for 1 hr. The sample was washed 3 × 5 min in PBST at RT and mounted in H-1000 Vectashield.
Mouse lung tissue cryosections were collected on slides, post-fixed and processed as described above. The samples were incubated with a 60-base oligonucleotide complementary to nucleotides 219–278 in the Scgb1a1 mRNA, or with a scrambled 60-base oligonucleotide, at 100 nM final concentration in H-probe buffer at 37°C for 2 hr. The samples were then washed 2 × 5 min with H-probe buffer at RT, and processed for PLISH using H probes that targeted nucleotides 229–268 in the Scgb1a1 transcript.
To perform enzymatic erasure, 15–20 base imager oligonucleotides were ordered with the dT nucleotides replaced by dU nucleotides. Following imaging, the signal was erased by incubating the sample with 0.1 U/μL USER enzyme in 1X USER enzyme buffer at 37°C for 20 min, followed by washing 2 × 3 min with PBST at RT. To perform rapid erasure, short 10–11 base oligonucleotides were ordered. Following imaging, the signal was erased by incubating the sample with PBST at 37°C for 15 min.
Lungs collected from B6 and the Lyz2+/EGFP mouse strains (Faust et al., 2000) were fixed and immunostained as whole mounts as previously described (Desai et al., 2014). Primary antibodies were chicken anti-GFP (Abcam ab13970), rat anti-Ecad/Cdh1 (Invitrogen ECCD-2), goat anti-Scgb1a1 (gift from Barry Stripp), rabbit anti-pro-Sftpc (Chemicon AB3786), and rat anti-Ager (R and D MAB1179). Fluorophore-conjugated secondary antibodies raised in Goat (Invitrogen) or Donkey (Jackson Labs) were used at 1:250 and DAPI at 1:1000.
FIJI was used to pseudocolor unprocessed micrographs for display as three-color overlays. A custom CellProfiler (Kamentsky et al., 2011) pipeline (Source code 1) was created to measure RNA signal intensities at the single-cell level. Briefly, the centers of cell nuclei were first identified as maxima in a filtered DAPI image, and associated with a numerical index. Nuclear boundaries were assigned by a propagation algorithm, and then expanded by ~1 micron to define sampling areas. The following data were then recorded: (i) average pixel intensities for each data channel over each sampling area; (ii) the coordinates of the sampling areas; (iii) shape metrics for the corresponding nuclei; and (iv) an image with the boundary pixels of each nucleus set equal to the associated index value. For each RNA species, the PLISH data were first normalized onto a 0:10 scale by dividing through by the largest value observed in any cell over all of the fields of view, and then multiplying by ten. The data were then log-transformed onto a −1:1 scale by the operation: transformed_data = log(0.1 + normalized_data). Custom Matlab scripts were used to perform hierarchical clustering of the log-transformed single-cell expression profiles, to generate heatmaps, and to create images with the boundary pixels of each nucleus colored according to a cluster assignment (Source code 1). Custom R scripts were used for k-means clustering and to make t-SNE projection plots (Source code 1).
Homogeneous detection of single rolling circle replication productsAnalytical Chemistry 76:495–498.https://doi.org/10.1021/ac034987+
In the right place at the right time: visualizing and understanding mRNA localizationNature Reviews Molecular Cell Biology 16:95–109.https://doi.org/10.1038/nrm3918
Nanoscale imaging of RNA with expansion microscopyNature Methods 13:679–684.https://doi.org/10.1038/nmeth.3899
Combinatorial analysis of mRNA expression patterns in mouse embryos using hybridization chain reactionCold Spring Harbor Protocols 2015:259–268.https://doi.org/10.1101/pdb.prot083832
Digital quantification using amplified single-molecule detectionNature Methods 3:725–727.https://doi.org/10.1038/nmeth916
Global conformations, hydrodynamics, and X-ray scattering properties of Taq and Escherichia coli DNA polymerases in solutionJournal of Biological Chemistry 278:25341–25347.https://doi.org/10.1074/jbc.M302118200
In situ sequencing for RNA analysis in preserved tissue and cellsNature Methods 10:857–860.https://doi.org/10.1038/nmeth.2563
Four-way junction formation promoting ultrasensitive electrochemical detection of microRNAAnalytical Chemistry 85:9422–9427.https://doi.org/10.1021/ac402416z
Measurement of diffusion constants for nucleic acids by NMRJournal of Biomolecular NMR 10:255–262.https://doi.org/10.1023/A:1018310702909
In situ detection and genotyping of individual mRNA moleculesNature Methods 7:395–397.https://doi.org/10.1038/nmeth.1448
Single-molecule mRNA detection and counting in mammalian tissueNature Protocols 8:1743–1758.https://doi.org/10.1038/nprot.2013.109
An official American Thoracic Society/European Respiratory Society statement: Update of the international multidisciplinary classification of the idiopathic interstitial pneumoniasAmerican Journal of Respiratory and Critical Care Medicine 188:733–748.https://doi.org/10.1164/rccm.201308-1483ST
Visualizing data using t-SNEJournal of Machine Learning Research: JMLR 9:2579–2605.
RNAscope: a novel in situ RNA analysis platform for formalin-fixed, paraffin-embedded tissuesThe Journal of Molecular Diagnostics: JMD 14:22–29.https://doi.org/10.1016/j.jmoldx.2011.08.002
Jay RajagopalReviewing Editor; Harvard University, United States
In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.
[Editors’ note: this article was originally rejected after discussions between the reviewers, but the authors were invited to resubmit after an appeal against the decision.]
Thank you for submitting your work entitled "Automated cell type classification in intact tissues by single-cell molecular profiling" for consideration by eLife. Your article has been evaluated by a Senior Editor and three reviewers, one of whom is a member of our Board of Reviewing Editors. The reviewers have opted to remain anonymous.
Our decision has been reached after consultation between the reviewers. Based on these discussions and the individual reviews below, we regret to inform you that your work will not be considered further for publication in eLife.
While all three reviewers agree that the techniques presented in this manuscript are valuable to the community, all three note there is no substantive biological advance in terms of new knowledge. Furthermore, the imaging data requires improvement as below and low-level transcript detection might add to this paper that should be published in a methods journal. Additionally, novel markers and their expression pattern should be described even in a methods paper.
In this study, Nagendran et al. validate the use of an improved version of RNA in situ hybridization that they name PLISH, which utilizes rolling circle amplification rather than the proprietary amplification procedures commercially available (i.e. RNAScope, ViewRNA, etc.), thus providing an accessible tool for the broad scientific community. They demonstrate sensitivity and specificity of PLISH, as well as an application in tissue as opposed to only cultured cells.
However, they make many claims that are not supported by the data and are not particular improvements on previous technology. For example, PLISH puncta in tissues are not on par with IF, and do not offer any subcellular resolution. They go on to use a clever approach to multiplexing and image analysis that overly complicates the presentation of a few data points attempting to mimic the computational approaches necessary for scRNASeq data. The novelty suggested in the cover letter with regard to new cell types is not reflected in the manuscript or the data.
This work is much better suited to a methods journal, as there is no exploratory work, and only validation and confirmation of what was already known. Overall, PLISH seems to be a poor-man's RNAScope/ViewRNA/etc… without the commercialization and therefore cost. While this is a great thing, some of the technical terms thrown around seem to be an attempt to differentiate PLISH from technologies upon which it is only building, not pioneering.
1) Pixels in tissue images look highly saturated/blurry, even DAPI – why?
-Does not help support claims of superior resolution/is not subcellular.
2) PLISH gives puncta, as described, in cells, but looks like weak IF in tissues with no clear subcellular resolution (which is a major claim in the Introduction).
3) Scale bars need to be in every separate image, especially in diseased vs. normal e.g. Figure 2.
4) Dotted lines outlining relative structures, such as basement membrane/lumen, would make it easier to follow.
5) Define "technical terms" if this is to be "biologist friendly" – "imager oligonucleotide" is not clear without a search in the Materials and methods. Workflow in Figure 4 is not readily apparent or obvious. As a major aspect of PLISH, this would be easier appreciated if described more thoroughly.
6) Usage of terms like "barcode" and "imager nt" make something that is not complex much more complicated and "jargon-y" than necessary. "RCA followed by hybridization of a fluorescently labeled nt probe" would be much simpler.
7) The inclusion of number of cells (2900) and visualization of the data in tSNE plots deceivingly equates PLISH to a sequencing approach, rather than a standard snapshot visualization tool.
8) Higher mag images are needed for Figure 3A-C to support the claim that this is "on par with IF" – the current images are not on par with IF and hardly support the presence of a putative BASC, while the IF in Figure 4—figure supplement 2E is much clearer.
9) Figure 3DE – what is the "other"? How does is fit in with what we already know, since these are a set of well-characterized genes? Should rework the text/figure in 3D to better lead into the application of this aspect of PLISH in Figure 4.
Overall, this is a highly technical paper that defines a tool. The authors clearly realize this, as they have filed a patent on PLISH. The concept is interesting and its application has great potential, yet it is not as excitingly novel as the text implies. While indeed an improvement on current technologies, this is fit for a methods report rather than a scientific article. The data are confirmatory rather than exploratory, and serve mainly as tool validation.
The focus on BASCs does not help the authors' case, as these are a controversial cell type not found in humans, and the IF images were more convincing than the PLISH data.
If the authors tackled more profound and subtle questions with PLISH, and in human samples (such as identifying the human BASC), then this would be less a tools paper and more a scientific article.
Finally, the overall tone describing this technology is overly ambitious and exceedingly strong. While this is fantastic work at method development and optimization (as well as creating accessibility), it falls short of being an original research article. The cover letter overstates the novelty.
This manuscript tackles an important question: how to obtain spatial information about RNA expression in a high-throughput fashion. It presents a technique the authors call PLISH which is an improved variation of the "in situ sequencing" technique published in 2013. The manuscript convincingly demonstrates that in cultured cells the PLISH technique can detect a variety of transcripts which are expressed at different levels and that PLISH detection levels correlate extremely well with FPKM values obtained by RNA-seq. In addition, they provide proof-of-principle that in tissue sections their technique can be used in a multiplex fashion to detect highly-expressed transcripts with very good specificity e.g. the blocking of the Scgb1a1 signal by antisense oligos, but not scrambled oligos, in Figure 1F is very impressive given how highly Scgb1a1 is transcribed.
Beyond Figure 1, the additional experiments presented all show that the PLISH technique works well in mouse and human lung sections (which are complex tissues) when highly-expressed mRNAs are detected.
However, no biological insights are presented. For example, Figure 2 shows that the technique basically works in formalin-fized paraffin embedded human samples using extremely highly expressed transcripts, but these results had previously been obtained by immunostaining so no biological insights are presented. Note: the cell shape changes that are described in the text for SpC+ cells are not visible in the image on the merged pdf.
The analysis of adult mouse lungs – Figures 3 and 4 is a bit disappointing. This is solid proof-of-principle that the PLISH technique works. However, these transcripts are all highly expressed – no real biological insights are gained at all here. To publish in eLife I would at least expect to see low level transcripts detected, or some new biology, in addition to the proof of principal work shown here. No prior evidence springs to mind as to what the 2 types of club cells could be – prox and dist, but this change in B-actin could be purely structural to do with cell size/shape in different parts of the airways.
"Proper" analysis of a tissue section using probes to transcripts that are thought to be expressed at low levels, in addition to highly expressed genes such as Sftpc, is required. Plus some biological insight.
The present study by the Desai lab shows the development of a proximity ligation in situ hybridization technology and its use to examine various cell type markers in the mouse and human lung. The technique appears to work well and if employed correctly could enhance studies on human tissue samples where histology is possible but isolation of single cells is limited. One limitation of the current study is that it does not extend our understanding of lung cell heterogeneity as most if not all of the markers examined have already been extensively characterized in the lung.
1) Much of the histology shown is at far too low of a magnification to assess the patterns of expression. The authors should provide high mag pictures in all situations, especially for the alveolar region as it is difficult to assess where the positive cells are located. Use of 3D reconvolution (i.e. IMARIS) would be helpful.
2) It would be nice for the authors to show data from some new novel markers that were isolated from the current study. This would provide novel insight into how this technique has advantages over other techniques.
3) A lot of the co-staining is difficult to assess. These data may be helped by higher mag imaging.
4) In the scRNA-seq experiment, was this done on whole lung cell suspension or Epcam+? It is not mentioned in the paper that I can see.
5) Many of the figures need better annotation. It’s very hard to interpret what some of the panels are trying to convey.https://doi.org/10.7554/eLife.30510.018
- Pehr B Harbury
- Tushar J Desai
- Tushar J Desai
- Pehr B Harbury
- Tushar J Desai
- Monica Nagendran
- Pehr B Harbury
- Tushar J Desai
- Pehr B Harbury
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
We thank Andres Andalon and Yana Kazadaeva for technical assistance; Barry Stripp for the goat anti-Scgb1a1 antibody. This work was supported by the NHLBI Progenitor Cell Biology Consortium 5U01HL09999507 (PH, TD), NHLBI 1R56HL1274701 (TD), Stanford BIO-X IIP-130 (PH, TD), Stanford ChEM-H (MN, PH, TD), and by a Stanford Discovery Innovation Fund award (PH).
Human subjects: Adult human lung was obtained from Stanford Healthcare with patient informed consent and consent to publish in strict accordance with protocol 18891, approved by the Institutional Review Board Administrative Panel on Human Subjects in Medical Research of Stanford University, in compliance with requirements for protection of human subjects.
Animal experimentation: This study was performed in strict accordance with the recommendations in the Guide for the Care and Use of Laboratory Animals of the National Institutes of Health. All of the animals were handled according to approved institutional animal care and use committee (IACUC) protocols (#22988) of Stanford University. The protocol was approved by the Administrative Panel on Laboratory Animal Care (APLAC) of Stanford University. Every effort was made to minimize suffering.
- Jay Rajagopal, Reviewing Editor, Harvard University, United States
© 2018, Nagendran et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.