Hearing and balance rely on small sensory hair cells that reside in the inner ear. To explore dynamic changes in the abundant proteins present in differentiating hair cells, we used nanoliter-scale shotgun mass spectrometry of single cells, each ~1 picoliter, from utricles of embryonic day 15 chickens. We identified unique constellations of proteins or protein groups from presumptive hair cells and from progenitor cells. The single-cell proteomes enabled the de novo reconstruction of a developmental trajectory using protein expression levels, revealing proteins that greatly increased in expression during differentiation of hair cells (e.g., OCM, CRABP1, GPX2, AK1, GSTO1) and those that decreased during differentiation (e.g., TMSB4X, AGR3). Complementary single-cell transcriptome profiling showed corresponding changes in mRNA during maturation of hair cells. Single-cell proteomics data thus can be mined to reveal features of cellular development that may be missed with transcriptomics.https://doi.org/10.7554/eLife.50777.001
Hair cells, the sensory cells of the inner ear, carry out a finely orchestrated construction of an elaborate actin cytoskeleton during differentiation. Progenitors of vestibular hair cells, the supporting cells (Roberson et al., 1992), have an unremarkable actin cytoskeleton. By contrast, differentiating hair cells express a wide array of actin-associated proteins, including crosslinkers, membrane-to-actin linkers, and capping molecules, and use them to rapidly assemble mechanically sensitive hair bundles on their apical surfaces (Shin et al., 2013; Ellwanger et al., 2018). Hair bundles consist of ~100 stereocilia each, filled with filamentous actin (F-actin) and arranged in multiple rows of increasing length; by maturity, stereocilia contain >90% of the F-actin in a hair cell (Tilney and Tilney, 1988).
Along the axis of the chicken cochlea, stereocilia systematically decrease in maximum height (from >5 to 1.5 µm) and increase in number per cell (from 30 to 300) as the frequency encoded increases (Tilney et al., 1992). Despite these changes, a quantitative analysis suggested that within experimental error, hair cells use the same amount of actin to build these disparate hair bundles (Tilney and Tilney, 1988). There is little evidence of substantially increased expression of actin genes during hair cell differentiation (Ellwanger et al., 2018), suggesting that hair cells use the existing monomeric actin that is available during differentiation to build a bundle (Tilney and Tilney, 1988).
The number of hair cells in the chick utricle increases 15-fold from embryonic day 7 (E7) to posthatch day 2, with hair-cell production peaking at E12 (Goodyear et al., 1999). Because of this asynchronous development, at any given age, cells are in distinct states along the pathway from progenitors to hair cells. While bulk sampling of cells averages over these developmental distinctions, sampling and analyzing individual cells of the E15 chick utricle allowed for transcriptomics examination of a large portion of the developmental trajectory for forming hair cells from supporting cells (Ellwanger et al., 2018), and should allow for a corresponding trajectory analysis using proteomics.
We sought to understand how supporting cells, with their modest F-actin cytoskeletons, could transform rapidly into stereocilia-endowed hair cells without significant upregulation of actin gene transcription. We used a highly sensitive single-cell proteomics approach to assess the concentrations of the abundant proteins in cells isolated from the E15 chick utricle. We found that hair cells were readily distinguished from supporting cells based on only 50–75 proteins identified in each. Notably, the actin monomer binding protein thymosin β4 (TMSB4X) was abundant in supporting cells but not in hair cells, and its expression decreased as progenitors developed into hair cells, which we revealed using trajectory analysis based on the proteomics data. Single-cell RNA-seq (scRNA-seq) analysis showed that TMSB4X transcripts are downregulated when transcription of ATOH1, a key regulator of hair cell differentiation, is activated. These data are consistent with a model that suggests that existing monomeric actin is made available for hair-bundle assembly by the degradation of TMSB4X.
Recent reports demonstrated that extensive shotgun mass spectrometry characterization of the proteins of single cells is possible when samples are processed in nanoliter-scale volumes (Zhu et al., 2018b; Zhu et al., 2018c; Zhu et al., 2018a; Zhu et al., 2018d). About 700 proteins or protein groups could be detected from a single HeLa cell (Zhu et al., 2018a), which is estimated to have a volume of a few picoliters (Zhao et al., 2008; Park et al., 2008). We analyzed single cells from E15 chicken utricle; by using peeled epithelia for cell dissociation, we limited the cell types analyzed to hair cells and supporting cells (Herget et al., 2013), each of which is smaller than a HeLa cell. To distinguish the two cell types, we labeled utricle cells with FM1-43, which labels hair cells more strongly than supporting cells (Herget et al., 2013; Ellwanger et al., 2018). After cell dissociation, we collected single cells and pools of 3, 5, and 20 cells in nanowells using fluorescence-activated cell sorting (FACS; Figure 1A and Figure 1—figure supplement 1). As expected, collected cells with high levels of FM1-43 (FM1-43high) had hair bundles and elongated cell bodies (Figure 1B), both of which are characteristic of hair cells; FM1-43low cells—mostly supporting cells—were round after FACS (Figure 1C). We measured cell volumes after cell sorting. FM1-43high cells averaged 1.01 ± 0.20 picoliters (mean ± SD; N = 9), while FM1-43low cells averaged 0.67 ± 0.15 picoliters (N = 8). For comparison, using the same method, we measured the volume of HeLa cells to be 4.9 ± 1.1 picoliters (N = 5).
We used the nanoPOTS (nanodroplet processing in one-pot for trace samples) approach to carry out all sample processing steps in single nanowells (Zhu et al., 2018b). After protein extraction, reduction, alkylation, and proteolysis, digested peptides were collected and separated using nano-liquid chromatography on a 30-µm-i.d. column (Zhu et al., 2018b) (Figure 1A). The separated peptides were delivered to an Orbitrap Fusion Lumos Tribrid mass spectrometer and were analyzed using data-dependent acquisition. Peptides were identified, quantified, and assembled into proteins with Andromeda and MaxQuant (Cox and Mann, 2008; Cox et al., 2011), using Match Between Runs (Tyanova et al., 2016; Zhu et al., 2018b) to identify unmatched peptides based on their accurate masses and liquid chromatography retention times. To quantify proteins based on molar abundance, we used intensity-based absolute quantification (iBAQ), calculated from the sum of peak intensities of all peptides matching to a specific protein divided by the number of theoretically observable peptides (Schwanhäusser et al., 2011). All mass spectrometry data are deposited at ProteomeXchange, and analyzed data are reported in Figure 1—source data 1.
About 200 unique peptides were identified in each single FM1-43high cell; about half were identified by MS/MS scans and half by matching (Figure 1D). The total number of unique peptides increased to ~2500 in pools of 20 cells, with about 70% identified by MS/MS scans (Figure 1D). Only ~1200 peptides were identified in pools of 20 FM1-43low cells, with about 60% identified by MS/MS scans (Figure 1E). Total iBAQ rose nonlinearly with the number of cells (Figure 1F), suggesting that some protein was lost to surface adsorption; while small relative to typical sample wells, the volume of the nanowell is still 50,000-fold larger than the volume of a utricle cell. Because sample processing occurred in a protected nanowell environment using robotic liquid handling, the total iBAQ attributed to keratins (e.g., human skin contamination) was only ~0.1% of the total, far less than >50% occurring in some mass-spectrometry experiments with small amounts of protein. The number of proteins or protein groups identified increased from ~60 for FM1-43high single cells to nearly 600 for pools of 20 cells (Figure 1G); fewer proteins were identified in supporting cells, likely because of their smaller volume.
Comparison of single FM1-43high and FM1-43low cells to wells with collection triggered to noise allowed us to confirm the presence of single cells, even without visual inspection of the wells. Total iBAQ did not accurately indicate which wells contained single cells (Figure 1H), presumably because the total signal can be dominated by incorrect assignment of contaminant signals to proteins. By contrast, the number of proteins or protein groups identified distinguished most FM1-43high or FM1-43low samples from noise (Figure 1I); the samples with low numbers of identifications could represent sorting events where cells missed their target nanowell.
To examine the composition of FM1-43high and FM1-43low samples, we used relative iBAQ (riBAQ) for quantitation (Shin et al., 2013; Krey et al., 2014) and displayed the 60 most abundant proteins of the 20 cell samples (Figure 2A). All identified proteins from the 20 cell samples are displayed in Figure 2—figure supplement 1, and all identified proteins from the single-cell samples are displayed in Figure 2—figure supplements 2 and 3. We used a volcano plot analysis to show those proteins that had statistically significant enrichment in the 20 cell FM1-43high and FM1-43low samples (Figure 2B). Well-known hair-cell proteins were enriched significantly in FM1-43high cells, including the mobile Ca2+ buffers OCM and CALB2, as well as the molecular motor MYO6. Several proteins were highly enriched in the FM1-43low samples, including TMSB4X, STMN2, SH3BGRL, and MARCKS.
The proteomics experiments also revealed several abundant proteins that had not been previously found to be hair-cell specific, including GSTO1, GPX2, CRABP1, and AK1; TMSB4X and AGR3 were examples of proteins that were much more abundant in supporting cells (Figure 1—source data 1; Figure 2A–B). We examined several of these proteins in E15 chick utricles using immunocytochemistry. Antibody labeling for AGR3 and the hair-cell marker OTOF labeling did not overlap, and the elongated cell bodies labeled for AGR3 indicated that it marked supporting cells (Figure 3A–C and Figure 3—figure supplement 1). By contrast, CRABP1 was specific for hair cells, seen by the overlap with OTOF (Figure 3D–F and Figure 3—figure supplement 2).
The thymosin-beta family of proteins, which includes TMSB4X, are actin monomer binding proteins that sequester substantial fractions of actin in many cell types (Nachmias, 1993; Sun et al., 1995). Five TMSB4X peptides were identified by mass spectrometry, which covered 75% of the ~5 kD protein; one of the peptides was shared by TMSB15B, another member of the family (Figure 1—figure supplement 2). Analysis of transcript expression in mouse inner ear using gEAR (https://gear.igs.umaryland.edu) indicated that Tmsb4x expression was considerably higher than that of another paralog, Tmsb10, and much higher than the two Tmsb15 isoforms, justifying our focus on TMSB4X.
To localize TMSB4X in the E15 chick utricle, we used an antibody that has been validated previously with knock-down experiments against mouse TMSB4X (Zhou et al., 2013; Li et al., 2018); chicken TMSB4X differs from mouse and human TMSB4X by only two serine-to-threonine substitutions out of 44 total amino acids. TMSB4X immunoreactivity was cytoplasmic and strong in supporting cells and substantially reduced in hair cells (Figure 3G–I and Figure 3—figure supplement 3), which was consistent with the mass-spectrometry results. Because TMSB4X maintains actin in a monomeric form (G-actin), probes for G-actin like the JLA20 antibody (Lin, 1981) provide another way of localizing the pool of unpolymerized actin. JLA20 immunoreactivity was comparable in most cells, although there was an increased level of signal at the base of the hair cells (Figure 3J–L and Figure 3—figure supplement 4).
The concentration of TMSB4X relative to total actin should indicate how much free actin is available for assembling filamentous structures like stereocilia (Weber et al., 1992). Analyzing the 20 cell samples, we found that the ACTG1 protein group—total actin—accounted for a relative molar fraction (riBAQ) of 0.043 ± 0.001 (mean ± SEM) in FM1-43high cells and 0.060 ± 0.005 in FM1-43low cells (Figure 2C). A mixed-effects model accounting for intra-sample correlations indicated that these concentrations differed significantly, albeit only at an alpha level of 0.05 (summary statistics with confidence intervals are reported in Table 1). While TMSB4X accounted for a relative molar fraction of only 0.006 ± 0.002 in FM1-43high cells, it was 0.056 ± 0.012 in FM1-43low cells, ten-fold higher (Figure 2C) and significantly different (p<0.001). Critically, the concentration of hair-cell TMSB4X differed significantly from that of hair-cell actin (p=0.001), while the concentration of supporting cell TMSB4X did not differ from that of supporting cell actin (p=0.660). Because TMSBX and actin interact with a 1:1 stoichiometry (Goldschmidt-Clermont et al., 1992), and no other actin-binding proteins are detected at similar high levels, our quantitation suggests that there is enough TMSB4X to bind most actin monomers in supporting cells.
In wholemount preparations, we counted 72 ± 8 stereocilia per utricle hair cell (mean ± SD; N = 26 from striolar and extrastriolar regions). The actin quantitation suggested that each E15 hair cell contains ~15,000,000 actin molecules (G- and F-actin combined). If nearly all actin is in stereocilia (Tilney and Tilney, 1988), then each stereocilium would contain ~200,000 actin molecules. While fewer than the 400,000 molecules estimated per E20 chick stereocilium (Shin et al., 2013), the value is consistent with the relative immaturity of E15 cells.
While our results showed that the expression profile of pooled-cell samples distinguished between supporting cells and hair cells, a single contaminating cell in a pool could distort the pool’s expression pattern. We therefore examined whether we could achieve similar discrimination based on the 30 single-cell profiles, despite the low numbers of identifications in each cell. We used 75 proteins or protein groups that were detected in at least five cells, and normalized and batch-corrected (Johnson et al., 2007; Büttner et al., 2019; Luecken and Theis, 2019) data from individual cells. The resulting expression values, referred to as log2 normalized iBAQ (niBAQ) units, comprised an expression matrix having similar characteristics to single-cell transcriptomics data. Non-detects accounted for 62% of the values, while for 83% of identifications the detected values followed a log-normal distribution (Shapiro-Wilk normality test p>0.05).
Because methods developed for single-cell transcript analysis (Luecken and Theis, 2019) should be suitable for dissection of the single-cell proteomics results, we applied CellTrails, which we previously used to uncover the branching trajectory from progenitors to hair cells in the chicken utricle using transcript data (Ellwanger et al., 2018). To interpret the latent structure in the single-cell mass spectrometry data, its lower-dimensional manifold was investigated using CellTrails’ robust nonlinear spectral embedding on the submatrix of the 37 highest variable identifications (Figure 4A). Appropriately, most cells segregated according to their FM1-43 uptake (Figure 4B). We noted that the protein pattern of three cells classified as FM1-43high appeared to match better to the FM1-43low (supporting cell) pattern. Similarly, two FM1-43low cells were embedded in the neighborhood of cells with a high FM1-43 uptake (hair cells). While FM1-43 is useful for labeling hair cells, transcript analysis showed that FM1-43 levels are not a perfect proxy for hair-cell maturity (Ellwanger et al., 2018); for example, hair cells with damaged mechanotransduction will not load with the dye and such cells would be classified as FM1-43low. Alternatively, cells with relatively low FM1-43 could be transitional cells between progenitors and mature hair cells (Ellwanger et al., 2018). We therefore surmised that we could elicit a developmental trajectory from the single-cell protein expression patterns. The chronological ordering of the cells was learned in the lower-dimensional manifold and a pseudotime value was assigned to each cell (Figure 4C).
The 75 proteins sufficiently detected on single-cell level are all relatively highly expressed and largely do not include those expected to distinguish different classes of hair cells (Ellwanger et al., 2018). Moreover, we expect that our sample is dominated by type II hair cells, especially those from extrastriola regions, as they are much more numerous than type I hair cells (Ellwanger et al., 2018). Because of the gating strategy used (Figure 1—figure supplement 1), we primarily sampled cells from either end of the developmental trajectory, which was apparent from the gap approximately at the midpoint of the pseudotime axis (Figure 4D). Nevertheless, we sampled sufficient numbers of differentiating cells to establish a developmental trajectory.
We examined protein expression dynamics as a function of developmental pseudotime (Figure 4E). Protein expression changed systematically along the developmental pseudotime axis with the expected trends: proteins enriched in supporting cells, including AGR3 and TMSB4X, decreased in expression along the pseudotime axis (Figure 4E–F). By contrast, proteins known to be enriched in hair cells, including OCM, CALB2, MYO6, CKB, and GAPDH, all increased as pseudotime progressed (Figure 4E–F).
We predicted that the decrease in TMSB4X as hair cells mature arose from downregulation of TMSB4X transcript expression during differentiation of hair cells. We therefore used transcriptomic profiling of single cells isolated from E15 chick utricle to examine gene expression during the bifurcating trajectory that describes the development of progenitor cells to mature striolar and extrastriolar hair cells (Ellwanger et al., 2018). We carried out scRNA-seq transcriptomic profiling using the Smart-seq protocol (Picelli et al., 2014) on 384 FACS-sorted E15 chick utricle epithelial cells. To provide maximum correlation of TMSB4X expression changes with chicken utricle hair cell maturation, we reconstructed the trajectory in similar fashion as previously described (Ellwanger et al., 2018), carrying out the analysis with 182 assay genes already including GSTO1 and CRABP1 from that previous study, supplemented with TMSB4X, AGR3, GPX2 and AK1. Nine cellular subgroups emerged, each of which was distinguished by distinct marker gene sets (Figure 5A). Based on their expression profiles, for example the lack of TECTA and especially high levels of TMSB4X (Figure 5A; see also Figure 3), two subgroups (S8 and S9) appeared to be stromal cells; to focus on the developmental progression of progenitor (supporting) cells to hair cells, we removed S8 and S9 for subsequent analysis.
We mapped the remaining 254 individual cells of subgroups S1-S7 along developmental trajectories, plotting CellTrails maps (Ellwanger et al., 2018) to demonstrate the branching nature of the trajectory (Figure 5B). Our assay was biased for hair-bundle genes, and at least half of the cells isolated were hair cells, so it is unsurprising that the final trajectory revealed not only the transition from progenitor (supporting) cells to hair cells, but also further developmental branching. One major branch was supporting cells, as these cells expressed markers like TECTA and OTOA (Figure 5H; Figure 5—source data 1). The other two major branches were hair cells, as they occurred after the peak of ATOH1 expression and also showed high levels of hair-cell markers like MYO7A (Figure 5C; Figure 5—source data 1).
Enrichment of LOXHD1, ATP2B2, TMC2, TNNC2, MYO3A and OCM indicated that the branch projecting down and to the right was from striolar cells (Figure 5F; Figure 5—source data 1), which we named TrS (‘trail striola’) as previously defined (Ellwanger et al., 2018). Similarly, enrichment of SYN3, SKOR2, CALB2, and TMC1 indicated the branch projecting up and to the right is from extrastriolar cells, which we named TrES. TrES branched again, and relative enrichment of ATP2B2 (Figure 5F; Figure 5—source data 1) and reduced expression of SKOR2 (Figure 5G; Figure 5—source data 1) indicated that the left-hand branch was equivalent to the novel hair cell type TrES* found in our previous study (Ellwanger et al., 2018).
To confirm the spatial identity of the two major hair cell branches, all experiments were carried out with E15 chicken utricles split apart into lateral halves, which contain striolar and extrastriolar cells, and medial halves, which contain only extrastriolar cells. The lower-right branch was populated nearly entirely by lateral cells, which confirms that it represents striolar hair cells (Figure 5D–E). We conclude that the scRNA-seq experiment accurately replicated our previous experiment using a multiplex RT-qPCR approach (Ellwanger et al., 2018).
We next examined the genes highlighted in the proteomics experiments, including TMSB4X, AGR3, GSTO1, GPX2, CRABP1, and AK1. As predicted from the proteomics and localization experiments, the CellTrails analysis showed that TMSB4X and AGR3 were specific to progenitor (supporting) cells (Figure 5H), while GSTO1, GPX2, CRABP1, and AK1 were specific to hair cells (Figure 5I). The CellTrails maps indicated that GPX2 and AK1 were concentrated in striolar hair cells, while CRABP1 was enriched in extrastriolar cells, particularly TrES* (Figure 5I). Cells observed with high levels of CRABP1 in immunocytochemistry experiments could be the TrES* cells (Figure 3G,I). GSTO1 was expressed at similar levels in both hair cell types. Available antibodies against GPX2, AK1, and GSTO1 were insufficiently reliable to check their hair-cell specificity. Examining databases in gEAR, however, we noted that Gpx2 and Ak1 are predicted to be substantially enriched in hair cells as compared to non-hair cells in mouse utricle; by contrast, Gsto1 is expressed at higher levels in mouse utricle non-hair cells than in hair cells.
The scRNA-seq results corroborated the expression dynamics of TMSB4X on transcriptional level. High in progenitor cells, its transcriptional activity decreased substantially during hair cell differentiation (Figure 5H), and was nearly undetectable in striolar hair cells. Interestingly, TMSB4X was expressed at detectable levels, albeit relatively low, in cells along TrES as compared to TrES* (Figure 5H).
We also noted a striking decrease in ACTB expression as hair cells differentiated; the CellTrails maps suggested that ACTB was >10 fold higher in progenitor cells than in hair cells (Figure 5K; Figure 5—source data 1). ACTG1 increased modestly in expression, especially in TrES cells, but overall was present at lower levels than ACTB (Figure 5K; Figure 5—source data 1). Similar trends for these actin isoforms were also seen in our previous data (Ellwanger et al., 2018). ACTB and ACTG1 differ by only four amino acids, however, and we only detected one of the peptides that distinguish the isoforms (Ac-EEEIAALVIDNGSGMCK from ACTG1) in single mass spectrometry run. We were therefore unable to accurately measure the relative abundance of the two actin isoforms in our protein mass spectrometry experiments.
Global analysis of proteins from single cells has previously been thwarted by nonspecific adsorption of proteins to surfaces and the lack of a scheme to amplify proteins (Couvillion et al., 2019). Recent development of nanowell sample processing with nanoPOTS, coupled with extremely sensitive mass spectrometers, permits detection of abundant proteins of even very small cells (<1 picoliter). We exploited this method to characterize the abundant proteins of FACS-sorted supporting cells, the hair cell progenitors, and hair cells from embryonic chick utricles. Remarkably, we were able to use the mass spectrometry data from single utricle cells to reconstruct a developmental trajectory from protein-expression values alone.
In addition, we identified several proteins not previously highlighted as specific for hair cells (CRABP1, GSTO1, GPX2, AK1) or for supporting cells (AGR3, TMSB4X). TMSB4X was present at nearly equimolar levels with respect to actin in supporting cells and thus may sequester actin monomers there. By contrast, in hair cells, TMSB4X was only one-tenth as abundant as actin. This developmental change was characterized in more depth using single-cell RNA sequencing, which showed that the drop in TMSB4X was greater in extrastriolar hair cells than in striolar hair cells. Together, these data are consistent with the hypothesis that downregulation of TMSB4X allows differentiating hair cells to construct their hair bundles with newly available actin monomers.
While fewer proteins were identified from single hair cells than were identified in a recent report using the same technique with single HeLa cells and primary lung cells (Zhu et al., 2018a), hair cells have a volume of ~1 picoliter, while HeLa cells are five times larger in volume. Given this difference, detection of fewer proteins in hair cells than in HeLa cells was not surprising. The 700 proteins detected in HeLa cells are also a substantial underestimate of the total that are present there; using two-dimensional separation prior to mass spectrometry, over 11,000 proteins or protein groups can be detected from these cells (Geiger et al., 2012; Kulak et al., 2017), and analyses with a one-dimensional separation, like that used here, typically yield ~3000 proteins. The total number of proteins in hair cells and supporting cells is similarly large; using separation in only one chromatographic dimension, we identified ~3000 proteins from E15 chick utricle sensory epithelium (Shin et al., 2013). Note that even these numbers are underestimates of the total number of genes expressed; for example, we detected expression from 13,643 genes in our scRNA-seq experiments. The range of protein expression is enormous and the dynamic range of mass spectrometers is limited, which prevents proteomic detection of all expressed genes.
While the nanoPOTS approach is useful for characterizing abundant proteins in small cells or more proteins in larger cells, without further increases in sensitivity, the small number of proteins detected in single hair cells will prevent characterization of low-abundance proteins or deeply categorizing developmental pathways using protein expression. Because the relationship between cell number and total protein signal intensity was nonlinear, especially for 1–3 cell samples (Figure 1F), we concluded that we lost significant amounts of protein to adsorption to the nanowells. These results indicate that further improvement of protein recovery is critical to increase proteome coverage and quantification performance of single-cell proteomics technology employing nanoPOTS. Fabricating nanowells with smaller dimensions would be straightforward; however, dispensing single cells by FACS to yet-smaller wells would be difficult to carry out reproducibly. In addition, the nanowell surfaces could be coated chemically with antifouling materials such as polyethylene glycol or poly(2-methyl-2-oxazoline) polymers (Weydert et al., 2017). An alternative strategy for single-cell protein analysis uses TMT multiplex labeling with one channel utilized by a sample of several hundred carrier cells, which will reduce the relative error due to protein loss (Budnik et al., 2018). Indeed, coupling of TMT multiplex labeling approach with nanoPOTS significantly increases proteome coverage and analysis throughput of single cell proteomics (Dou et al., 2019).
We detected much higher levels of small proteins (<20 kD) here than in previous experiments using the same tissue (Shin et al., 2007; Shin et al., 2010; Shin et al., 2013; Herget et al., 2013; Wilmarth et al., 2015). We attribute this improved detection to the processing of samples in nanoPOTS nanowell without transfer steps; other sample preparation methods included steps (e.g., SDS-PAGE gel separation or filter washes) that facilitated loss of small proteins (Krey et al., 2016). Our new data indicate that on a molar basis, OCM (Figure 2A, Figure 3M–O), which was formerly known as parvalbumin 3 in the chick (Heller et al., 2002), is the most abundant protein in E15 chick hair cells, accounting for >10% of the total number of protein molecules.
Although we see high levels of small proteins, we do not think that they are over-represented in our data. Moreover, levels of cytoskeletal proteins like actin and tubulin are similar to those seen in bulk proteomics experiments (Shin et al., 2013). By contrast, as in most shotgun mass spectrometry experiments, membrane proteins are underrepresented relative to their presence in the genome (Kar et al., 2017). Membrane proteins are generally expressed at low levels, however, which also contributes to our inability to see more than a few such proteins (e.g., ATP1A1, SLC17A8, and ATP2B2). Challenges in detecting membrane proteins are not restricted to single-cell proteomics, however.
We report here the first example of the use of single-cell protein mass spectrometry data to construct a developmental trajectory, in this case from progenitors to hair cells. While limited by having only 30 cells for this analysis, CellTrails nevertheless was able to generate a developmental trajectory that discriminated distinct expression patterns of transitioning cells. The proposed trajectory for the single cells studied here accounted for protein expression levels at the endpoints, which represented supporting cells and hair cells; moreover, expression levels of many of the endpoint-selective proteins systematically increased or decreased across the trajectory, as expected. Although we did not detect the branching trajectories of hair cells along the TrS, TrES, and TrES* trails that were seen using multiplex RT-qPCR approach (Ellwanger et al., 2018) or scRNA-seq (Figure 5), the small number of cells analyzed for protein-based trajectories likely precluded their detection.
Although application of trajectory-analysis methods to single-cell proteomics is very much in its infancy, we show here that CellTrails is suitable for this purpose. The analysis was limited by sensitivity, as proteins detected by mass spectrometry were limited to those expressed at relatively high levels. Improvements of the nanoPOTS method that lead to increased sensitivity and reproducibility will enhance future protein-based trajectory analyses. Nevertheless, while robotic manipulation allows for increased sample-preparation output, the number of cells analyzed presently must remain low because of slow throughput of the mass spectrometry steps. Single-cell RNA-seq approaches are likely to continue to offer much higher throughput and depth for the foreseeable future. That said, analysis of developmental pathways using single-cell proteomics allows the identification of key proteins that change in protein expression level without alternations in transcript levels. Moreover, future single-cell proteomics approaches will allow analysis of posttranslational modifications like phosphorylation, which will expand our ability to probe developmental cascades.
Interestingly, GAPDH was found to increase during hair cell maturation (Figure 4E–F), while its mRNA was reported to remain at a constant level (Avenarius et al., 2014; Ellwanger et al., 2018); this discrepancy suggests that GAPDH undergoes post-transcriptional regulation, either from increased translation efficacy or by protein stabilization following translation. Because GAPDH concentrates in stereocilia (Shin et al., 2007; Shin et al., 2013), protein stabilization there is the most likely explanation for the increasing GAPDH levels during hair-cell differentiation. This observation highlights the power of the protein-based trajectory analysis; because of the poor correlation between transcript levels and protein levels (Liu et al., 2016), understanding how proteins change during a developmental process will require their direct measurement, not inferring their levels as is done with scRNA-seq experiments.
During formation of hair cells, we observed a substantial downregulation of ACTB during differentiation to hair cells, with the largest decrease in extrastriolar cells; this decrease was partially compensated by an increase in ACTG1, especially in extrastriolar cells. If scRNA-seq provides accurate relative measurements of transcript levels, the sum of ACTB and ACTG1 transcripts in E15 hair cells was one-quarter that in supporting cells. Total actin protein levels were similar in hair cells and supporting cells, however, suggesting that actin synthesis may be reduced during differentiation of hair cells.
In chicken auditory hair cells, ACTB is found nearly exclusively in stereocilia F-actin cores, while the more abundant ACTG1 is distributed to all F-actin assemblies (Höfer et al., 1997). Nevertheless, Actb or Actg1 knockout mice have normal stereocilia development, suggesting that the two isoforms are interchangeable in mice, at least initially (Belyantseva et al., 2009; Perrin et al., 2010). Phenotypes of aging hair cells from the two mutants differ, however, and these differences arise from the ACTB and ACTG1 proteins themselves, not expression dynamics (Perrin et al., 2010; Patrinostro et al., 2018).
The reduced expression of actin genes is broadly consistent with Tilney’s suggestion that each hair cell uses an equivalent-sized bolus of actin to build their hair bundles (Tilney and Tilney, 1988). An alternative model dictates that the final amount of actin used in a hair bundle depends on the expression level of crosslinkers like PLS1 and ESPN (Höfer et al., 1997; Sekerková et al., 2011; Krey et al., 2016). A plausible hypothesis incorporating observations reported here is that ACTB is sequestered with TMSB4X in supporting cells; upon differentiation to hair cells, ACTB is made immediately available for stereocilia elongation by degradation of the actin buffer TMSB4X, while ACTG1 expression is increased to provide actin for other assemblies, including the cuticular plate and circumferential actin belt (Höfer et al., 1997).
Single cells were collected from utricles of E15 chick embryos using methods previously described (Ellwanger et al., 2018). Utricles were incubated with 10 µM FM1-43FX (Thermo Fisher, Waltham, MA) for 20 s, then treated with thermolysin (Sigma-Aldrich, St. Louis, MO) to remove the sensory epithelium; cells in the epithelium were dissociated by treating with Accutase (Innovative Cell Technologies, San Diego, CA) with mechanical trituration (Ellwanger et al., 2018). Cells were sorted with a BD Influx Instrument (BD Biosciences) set to ‘‘single cell’’ mode and equipped with a 85 µm nozzle. To enable the direct cell sorting into nanowells, a customized template was built with the BD FACS software environment (Sortware) to match the format of nanowell array. SPHERO Drop Delay Calibration Particles (Spherotech, Lake Forest, IL) were used to confirm the drop targeting, as well as the parameter optimization. Prior to sorting onto the final collection chip, a separate preparation of E15 chick cells was used to optimize drop delay settings in order to verify alignment of droplets within the well and cell numbers within each well by visual inspection under an epifluorescence microscope. Debris was removed based on forward scatter (FSC) versus side scatter (SSC) and doublets were excluded based on forward scatter (FSC) versus trigger pulse width. SYTOX Red Dead Cell Stain (Thermo Fisher) was added to cells prior to sorting and was used to identify live and dead cells. Based on our final gating approach, which compared 638–1 (SYTOX Red) versus 488–1 (FM1-43), single SYTOX Red-negative FM1-43high or FM1-43low cells were deposited into individual nanowells.
NanoPOTS chips were fabricated on standard microscopy slides as described (Zhu et al., 2018b; Zhu et al., 2018a). An arrangement of 5 × 13 hydrophilic nanowells was created, each 1 mm diameter with 2.25 mm on-center spacing, and the surrounding chip surface was treated with 2% (v/v) heptadecafluoro-1,1,2,2-tetrahydrodecyl)-dimethylchlorosilane (PFDS) in 2,2,4-trimethylpentane to render it hydrophobic. A glass spacer and cover plate were fabricated for each nanowell chip, allowing the chip to be sealed so that evaporation was minimized during sample incubation. A home-built robotic liquid handling system, capable of subnanoliter dispensing, was used to dispense sample preparation reagents into nanowells (Zhu et al., 2018b; Zhu et al., 2018a). To lyse cells, and to extract and reduce proteins, 100 nl of 0.2% dodecyl β-D-maltoside containing 5 mM DTT in 0.5x PBS and 25 mM ammonium bicarbonate were added to a nanowell containing FACS-sorted single cells or pools of cells; the chip was then incubated at 70°C for 1 hr. Next, proteins were alkylated using 50 nl of 30 mM iodoacetamide in 50 mM ammonium bicarbonate in the dark at 37°C for 30 min. Lys-C and trypsin (each from Promega, Madison, WI, USA) were added sequentially using 50 nl of 5 ng/μl enzyme solutions in 50 mM ammonium bicarbonate for 4 hr and 6 hr, respectively. Finally, the peptide sample was acidified with 50 nl of 5% formic acid and then collected into a fused-silica capillary (200 μm i.d., 5 cm long). To maximize sample recovery, each nanowell was re-extracted twice, each with 200 nl of 0.1% formic acid in water. The sample collection capillaries were sealed on both ends with Parafilm and stored at −70°C until use.
Two single-cell proteomics experiments were carried out. For Experiment 1, the FM1-43high samples included five samples with 1 cell, four samples with 3 cells, four samples with 5 cells, and four samples with 20 cells; the FM1-43low samples included five samples with 1 cell, three samples with 3 cells, four samples with 5 cells, and four samples with 20 cells. For Experiment 2, the FM1-43high samples included ten samples with 1 cell and four samples with 20 cells; the FM1-43low samples also included ten samples with 1 cell and four samples with 20 cells.
Images of individual FM1-43high and FM1-43low cells (Figure 1) were acquired by sorting 1000 cells of either cell type into a drop of 4% paraformaldehyde, which had been placed on a Superfrost Plus glass slide (Fisher Scientific). We used FM1-43 to label the isolated cells after fixation; here, the dye’s propensity to insert into the extracellular leaflet of the membrane was used rather than its ability to enter transduction channels. Cells were fixed for 10 min, then a PBS solution containing DAPI (4,6-diamidino-2-phenylindole, 1 µg/ml, Thermo Fisher, D1306) and 10 µM FM1-43FX was added to the slide for 10 min. Fixed and labeled cells were washed, then covered with Vectashield mounting medium (Vector Labs) and imaged using a 63x lens on a Zeiss Elyra PS.1 microscope.
To quantify cell volume, cells were FACS-sorted (for hair cells and supporting cells), fixed, stained with DAPI and phalloidin, and imaged as described above. For each slice of the z-stack, the Threshold and Make Binary tools of Fiji/ImageJ were used to generate a binary stack, which defined the cell perimeter. The Analyze Particles tool was then used to determine the cell area in each slice. The volume for a slice was calculated as the product of the single-slice area multiplied by the z-stack interval; all slice volumes were added together to estimate total cell volume.
To count stereocilia per hair bundle, Airyscan z-stack images of E15 chicken utricles stained with Alexa Fluor 488-conjugated phalloidin (1:1000, Invitrogen, Carlsbad, CA, A12379) were obtained using a Zeiss LSM 880 microscope; images were acquired near the base of bundles to ensure that all stereocilia were in each image. Stereocilia were manually counted from single x-y images.
A capillary solid phase extraction (SPE) column (75 μm i.d., with 3 μm C18 particles of 300 Å pore size; Phenomenex, Torrance, USA) was used for initial sample loading and desalting, and was then connected to a 50 cm, 30 μm i.d. column packed with the same material. Mobile phase was delivered at 60 nl/min with a Dionex UltiMate NCP-3200RS pump system (Thermo Fisher). Peptides were separated with a linear 8–22% Buffer B (0.1% formic acid in acetonitrile) gradient over 60 min, followed by a 10 min increase to 45%. The column was washed with 80% Buffer B for 10 min and then equilibrated with 2% Buffer B for 15 min. An Orbitrap Fusion Lumos Tribrid mass spectrometer (Thermo Fisher) was used for data collection. Peptides were ionized at a spray voltage of 2 kV and ions were collected into an ion transfer capillary set at 150°C. The RF lens was set at 30%. MS1 scans used a 375–1575 mass range, a scan resolution of 120,000, an AGC target of 3 × 106, and a maximum injection time of 246 ms. Precursor ions were selected for MS/MS sequencing if they had charges of +2 to +7 and intensities > 8,000; precursors were isolated with an m/z window of 2 and fragmented by high energy dissociation (HCD) set at 30%. Repeat sampling was reduced by using an exclusion duration of 40 s and m/z tolerance of ± 10 ppm. MS2 scans were carried out in the Orbitrap with an AGC target of 2 × 105. The maximum injection time and MS2 scan resolution were set as 502 ms and 120,000, respectively.
Andromeda and MaxQuant (version 188.8.131.52) were employed for database searching and label-free protein quantification (Cox and Mann, 2008; Cox et al., 2011). All MS2 spectra were searched against the NCBI Genome Reference Consortium Chicken Build 6a (GRCg6a) database (49,673 protein sequences; released 2018-03-27). The default MaxQuant contaminants file was edited as described (Wilmarth et al., 2015). Carbamidomethylation was selected as fixed modification, and N-terminal protein acetylation and methionine oxidation were set as variable modifications. Peptides were required to contain >5 amino acids and peptide masses must be <4600 Da; two missed cleavages were allowed for each peptide. Peptides and proteins were each filtered with a false discovery rate (FDR) of 0.01. The Match Between Runs algorithm was used to improve proteome coverage; we used an alignment window of 15 min and a match time window of 0.5 min. iBAQ protein intensities were used for quantification.
For enrichment analysis of FM1-43high (N = 4 of 20 cell samples) and FM1-43low (N = 4 of 20 cell samples) cells, the riBAQ data were transformed into log2 scale; linear models with empirical Bayes statistics were fitted to the transformed data using the limma R package (Ritchie et al., 2015) (10.18129/B9.bioc.limma). For the enrichment analysis, we only used the 345 proteins that were measured in at least two replicates in each group. To correct for multiple tests (Benjamini and Hochberg, 1995), the FDR was used to correct two-sided p-values from a moderated t-test (Ritchie et al., 2015); an enrichment of >1.5 fold and a FDR-adjusted p-value less than 0.05 was considered significant.
For statistical comparisons of ACTG1 and TMSB4X mass spectrometry results, to account for potential intra-sample correlations, a mixed-effects model with a random intercept for samples was fitted to the data and used t-tests of contrasts to assess differences between groups (Pinheiro and Bates, 2000). The lmerTest R package (version 3.1–0) was used for the computation (Kuznetsova et al., 2017). A p-value less than 0.05 was considered significant.
For single-cell analysis, we filtered all identifications that were robustly detected in at least five cells, leaving a total of 75 proteins (or protein groups). The iBAQ values for proteins in the 30 single cells were normalized to the median of the cells’ mean expression and log2-transformed; nondetected values were kept as zeros to avoid imputation artifacts. The empirical Bayes framework available in the sva R package (10.18129/B9.bioc.sva) was used to remove the batch covariate, while accounting for the FM1-43 levels as covariate of interest (Johnson et al., 2007; Büttner et al., 2019; Luecken and Theis, 2019). The same framework has been used previously to correct for batch effects in protein mass spectrometry data (Carlyle et al., 2017). The resulting values are referred to as log2-normalized iBAQ (niBAQ) units. The relationship between protein expression variance and its average expression was fitted using a log-log cubic smoothing spline with four degrees of freedom; 37 proteins with a higher average expression than a log2 niBAQ value of 1.0 and a higher variance than the fit (Figure 4A) were kept for dimensionality reduction. The lower-dimensional manifold (four dimensions) was computed and a trajectory was inferred using the CellTrails R package (Ellwanger et al., 2018) (10.18129/B9.bioc.CellTrails). Protein expression dynamics were calculated by fitting the rolling mean (Haghverdi et al., 2016; Keren-Shaul et al., 2017) with a window size of five of the log2 niBAQ values and the pseudotime axis using a cubic smoothing spline function with 4-degrees of freedom.
E15 chicken utricles were dissected in ice-cold Medium 199 containing Hanks’ salts (Thermo Fisher) and were fixed with 4% formaldehyde in PBS at 4°C overnight. Utricles were treated with 0.2 M EDTA in PBS until the otoconia became invisible. For embedding, utricles were equilibrated at 50°C in PBS, followed by incubations in 2.5% and 5% low-melt agarose in PBS for 15 min. Utricles were appropriately positioned while the 5% agarose solution was still hot. Cross-sections from E15 utricles were cut to 80 µm thickness with a vibratome (Leica VT1200). Immunocytochemistry was performed as previously described (Scheibinger et al., 2018) using the following primary antibodies: rabbit anti-TMSB4X (1:250, Proteintech, Rosemont, IL, 19850–1-AP); rabbit anti-AGR3 (1:250, Proteintech, 11967–1-AP); rabbit anti-CRABP1 (1:250, Proteintech, 12588–1-AP); mouse anti-G-actin (1:250, Developmental Studies Hybridoma Bank, JLA20); mouse anti-otoferlin (1:250, Developmental Studies Hybridoma Bank, HCS-1); goat anti-SOX2 (1:100, Santa Cruz Biotechnology, Dallas, TX, sc-17320); mouse anti-tubulin beta-3 (TUJ1. 1:250, BioLegend, San Diego, CA, 801202); rabbit anti-parvalbumin3/oncomodulin (Heller et al., 2002) (1:1000, Heller laboratory); and rabbit anti-MYO7A (1:1000, Proteus Biosciences, Ramona, CA, 25–6790). Secondary antibodies: Alexa Fluor 546 donkey anti-rabbit (1:250, Thermo Fisher, A10040); Alexa Fluor 647 donkey anti-mouse (1:100, Thermo Fisher, A31571); Alexa Fluor 488 donkey anti-goat (1:250, Thermo Fisher, A11055). Validation of antibodies is reported in Table 2.
DAPI was used to visualize nuclei and Alexa Fluor 488-phalloidin to visualize F-actin filaments. Sections were imaged with a Plan-Apochromat 40x/1.3 NA oil DIC UV-IRM27 objective on a Zeiss LSM 880 Airyscan laser scanning confocal microscope and Zen Black software. For whole utricle vibratome cross-sections, confocal z-stacks were imaged with the tiling (6% overlap; mode: bounding grid) and stitching function. The 40x Plan-Apochromat objective was used with 0.9x zoom setting. For the extrastriolar and striolar regions, confocal z-stacks were collected separately with the 40x Plan-Apochromat objective used with 1.7 x zoom setting. Extrastriolar and striolar regions in whole utricle vibratome cross-sections were identified using SOX2, TUJ1 and MYO7A antibody labeling as previously described (Ellwanger et al., 2018) (Figure 3—figure supplement 6). Maturing or mature striolar type I hair cells express MYO7A but lack SOX2 expression and harbor calyx type terminals. Bouton-innervated extrastriolar Type II hair cells express both, MYO7A and SOX2. All supporting cells express SOX2 and lack MYO7A as well as TUJ1 expression (Figure 3—figure supplement 6). Maximum intensity projections were generated by a subset of the z-stacks to preserve single cell resolution.
Single cells from E15 chicken utricles were collected and sorted as previously described (Ellwanger et al., 2018). In this study, FM1-43 labeling was not performed and single cells were sorted with a BD FACSAria Fusion instrument (BD Biosciences). Two independent batches of 270 cells were deposited into individual wells of 96-well plates, with prefilled wells of 4 µl lysis solution with 1 U/µl of recombinant RNase inhibitor (Clontech #2313B), 0.1% Triton X-100 (Thermo #85111), 2.5 mM dNTP (Thermo Fisher #10297018), 2.5 µM oligo d(T)30 VN (5’-AAGCAGTGGTATCAACGCAGAGTACT30VN-3’, IDT). Plates containing sorted cells were immediately sealed, frozen on dry ice and stored at −80°C.
Single-cell RNA-seq was performed via the method of Picelli and colleagues (Picelli et al., 2014) using SMARTscribe (Clontech #639538) for reverse transcription. Kapa Biosystems Hifi HotStart ReadyMix (2X) (#KK2602) was used for 22 cycles of amplification. Amplified cDNAs were purified by bead cleanup using a Biomek FX automated platform and assessed with a fragment analyzer for quantitation and quality assurance. Barcoded libraries were synthesized using a scaled-down Nextera XT protocol (Mora-Castilla et al., 2016) in a total volume of 4 µl. A total of 384 libraries were pooled and paired-end sequenced (2 × 150 bp) on a NextSeq 500/550 High Output flow cell. Raw reads in FASTQ format were aligned to the NCBI Gallus gallus v5.0 (GCA_000002315.3) reference genome using custom scripts on the Sherlock Supercomputer Cluster (Stanford). The FastQC tool (version 0.11.6) was used to run an initial quality control check on the raw sequence data. Sequencing reads were mapped by STAR aligner and the transcriptome BAM files were quantified by RSEM. The results were summarized into counts, fragments per kilobase of transcript per million (FPKM), and transcript per million (TPM) expression matrices.
Scater (10.18129/B9.bioc.scater) was used to perform the quality control of the count expression matrix of 384 cells and 19,153 genes. ERCC spike-in transcripts, 56 information-poor cells and 3422 low level expressed genes were removed from the count matrix before read count normalization using SCnorm (Bacher et al., 2017). The CellTrails R package (10.18129/B9.bioc.CellTrails) was then utilized following the strategy described in our previous study (Ellwanger et al., 2018). The variable trajFeatureNames was set to the 182 previously used assay genes with the addition of AGR3, AK1, GPX2, and TMSB4X, which restricted to the analysis to 186 genes (Table 3).
In all cases, samples were biological replicates—none of the biological samples were split to be run separately as multiple technical replicates. Figure 1. B-C, Confocal imaging of FACS-sorted cells. Experiment was carried out 3 times. D-G, Characterization of mass spectrometry results. Number of samples for FM1-43high: 1 cell, N = 5; 3 cells, N = 4; 5 cells, N = 4; 20 cells, N = 4. Number of samples for FM1-43low: 1 cell, N = 5; 3 cells, N = 3; 5 cells, N = 4; 20 cells, N = 4. H-I, Number of samples for FM1-43high: 1 cell, N = 10; 20 cells, N = 3. Number of samples for FM1-43low: 1 cell, N = 10; 20 cells, N = 3. Figure 2. Characterization of FM1-43high and FM1-43low samples. A-C, Experiment was carried out two times. Panels were based on Experiments 1 and 2. Figure 3. A-C, Immunofluorescence detection of AGR3. Experiment was carried out 3 times. D-F, CRABP1. Experiment was carried out 3 times. G-I, TMSB4X. Experiment was carried out 5 times. J-L, G-actin. Experiment was carried out 3 times. M-O, OCM. Experiment was carried out 3 times. Figure 4. Developmental trajectory based on proteomics of single cells. Analysis used 30 single cells from two experiments. Figure 5. Developmental trajectory based on scRNA-seq of single cells. Analysis used two independent batches of 270 cells.
The mass spectrometry proteomics data, including raw data from the mass spectrometry runs, have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository (Perez-Riverol et al., 2019) with the dataset identifier PXD014256. The analyzed data are reported in Figure 1—source data 1. The analyzed single-cell RNA-seq data are reported in Figure 5—source data 1.
Correlation of actin crosslinker and capper expression levels with stereocilia growth phasesMolecular & Cellular Proteomics 13:606–620.https://doi.org/10.1074/mcp.M113.033704
SCnorm: robust normalization of single-cell RNA-seq dataNature Methods 14:584–586.https://doi.org/10.1038/nmeth.4263
Controlling the false discovery rate: a practical and powerful approach to multiple testingJournal of the Royal Statistical Society: Series B 57:289–300.https://doi.org/10.1111/j.2517-6161.1995.tb02031.x
A multiregional proteomic survey of the postnatal human brainNature Neuroscience 20:1787–1795.https://doi.org/10.1038/s41593-017-0011-2
Andromeda: a peptide search engine integrated into the MaxQuant environmentJournal of Proteome Research 10:1794–1805.https://doi.org/10.1021/pr101065j
Comparative proteomic analysis of eleven common cell lines reveals ubiquitous but varying expression of most proteinsMolecular & Cellular Proteomics 11:M111.014050.https://doi.org/10.1074/mcp.M111.014050
Hair-cell numbers continue to increase in the utricular macula of the early posthatch chickJournal of Neurocytology 28:851–861.https://doi.org/10.1023/a:1007070121751
Identification of the hair cell soma-1 antigen, HCS-1, as otoferlinJournal of the Association for Research in Otolaryngology 11:573–586.https://doi.org/10.1007/s10162-010-0231-6
Diffusion pseudotime robustly reconstructs lineage branchingNature Methods 13:845–848.https://doi.org/10.1038/nmeth.3971
Parvalbumin 3 is an abundant Ca2+ buffer in hair cellsJARO - Journal of the Association for Research in Otolaryngology 3:488–498.https://doi.org/10.1007/s10162-002-2050-x
Integral membrane proteins: bottom-up, top-down and structural proteomicsExpert Review of Proteomics 14:715–723.https://doi.org/10.1080/14789450.2017.1359545
Accurate label-free protein quantitation with high- and low-resolution mass spectrometersJournal of Proteome Research 13:1034–1044.https://doi.org/10.1021/pr401017h
Plastin 1 widens stereocilia by transforming actin filament packing from hexagonal to liquidThe Journal of Cell Biology 215:467–482.https://doi.org/10.1083/jcb.201606036
Loss-less Nano-fractionator for high sensitivity, high coverage proteomicsMolecular & Cellular Proteomics 16:694–705.https://doi.org/10.1074/mcp.O116.065136
lmerTest Package: Tests in Linear Mixed Effects ModelsJournal of Statistical Software 82:1–26.https://doi.org/10.18637/jss.v082.i13
Current best practices in single-cell RNA-seq analysis: a tutorialMolecular Systems Biology 15:e8746.https://doi.org/10.15252/msb.20188746
Miniaturization technologies for efficient Single-Cell library preparation for Next-Generation sequencingJournal of Laboratory Automation 21:557–567.https://doi.org/10.1177/2211068216630741
Small actin-binding proteins: the beta-thymosin familyCurrent Opinion in Cell Biology 5:56–62.https://doi.org/10.1016/S0955-0674(05)80008-0
The PRIDE database and related tools and resources in 2019: improving support for quantification dataNucleic Acids Research 47:D442–D450.https://doi.org/10.1093/nar/gky1106
Full-length RNA-seq from single cells using Smart-seq2Nature Protocols 9:171–181.https://doi.org/10.1038/nprot.2014.006
Aminoglycoside damage and hair cell regeneration in the chicken utricleJournal of the Association for Research in Otolaryngology 19:17–29.https://doi.org/10.1007/s10162-017-0646-4
Molecular architecture of the chick vestibular hair bundleNature Neuroscience 16:365–374.https://doi.org/10.1038/nn.3312
Actin filaments, Stereocilia, and hair cells: how cells count and measureAnnual Review of Cell Biology 8:257–274.https://doi.org/10.1146/annurev.cb.08.110192.001353
Nanog suppresses cell migration by downregulating thymosin β4 and Rnd3Journal of Molecular Cell Biology 5:239–249.https://doi.org/10.1093/jmcb/mjt002
Proteomic analysis of single mammalian cells enabled by microfluidic nanodroplet sample preparation and ultrasensitive NanoLC-MSAngewandte Chemie International Edition 57:12370–12374.https://doi.org/10.1002/anie.201802843
Proteome profiling of 1 to 5 spiked circulating tumor cells isolated from whole blood using immunodensity enrichment, laser capture microdissection, nanodroplet sample processing, and ultrasensitive nanoLC-MSAnalytical Chemistry 90:11756–11759.https://doi.org/10.1021/acs.analchem.8b03268
Subnanogram proteomics: impact of LC column selection, MS instrumentation and data analysis strategy on proteome coverage for trace samplesInternational Journal of Mass Spectrometry 427:4–10.https://doi.org/10.1016/j.ijms.2017.08.016
Matthew W KelleyReviewing Editor; National Institute on Deafness and Other Communication Disorders, United States
Anna AkhmanovaSenior Editor; Utrecht University, Netherlands
In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.
Thank you for submitting your article "Single-cell proteomics reveals downregulation of TMSB4X to drive actin release for stereocilia assembly" for consideration by eLife. Your article has been reviewed by two peer reviewers and the evaluation has been overseen by Anna Akhmanova as the Senior Editor. The reviewers have opted to remain anonymous.
The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.
The manuscript "Single-cell proteomics reveals downregulation of TMSB4X to drive actin release for stereocilia assembly" presents a series of state-of-the-art protein spectrometry experiments on inner-ear hair cells that provide single-cell proteomes enabling the identification of important steps in hair-cell differentiation. The authors focus on E15 chicken utricle tissue, that includes both supporting and hair cells. Various sample preparation strategies and analysis protocols were used to identify about 600 proteins or protein groups for FM1-43high samples, and about 300 or so for FM1-43low. Several proteins specific to either presumptive hair cells or supporting cells were identified, and further experiments were done for some of these proteins to validate their localization. In addition, TMSB4X, found predominantly in supporting cells but not hair-cells, is suggested to play a key role in capturing actin in supporting cells and then releasing it during differentiation to facilitate the assembly of the characteristic actin-based sterocilia bundles of hair cells. Proteomics data also permitted the classification of cells according to a pseudotemporal ordering, and single cell RNA-seq measurements identified developmental trajectories for extrastriolar and striolar hair cells.
Overall, the reviewers felt that this study had merit and was worthy of consideration for publication in eLife. In general, the most significant concern of both reviewers was the conclusion that TMSB4X actually regulates actin release related to stereocilia assembly. While this is a very reasonable hypothesis, in the absence of experimental data, it seems premature to include this idea in the title of the study.
In addition, one of the reviewers requested some additional clarification regarding the types of proteins that might not be detected in hair cells and supporting cells.
1) While the authors suggest that their results show that TMSB4X plays a role in releasing actin that can then be used to build stereocilia bundles, there is no evidence for this beyond changes in TMSB4X expression data. In light of this, the study then fundamentally represents the demonstration of a technique, single cell proteomic analysis. The title is a clear overstep based on the data provided. There is no direct evidence that TMSB4X actually drives actin release related to stereocilia assembly. This may in fact be what occurs, but without data, it seems inappropriate to include this in the title.
The enrichment of TMSB4X in supporting cells is intriguing and well documented, yet lack of TMSB4X in hair cells is only correlated with the appearance of a hair bundle, causality is not really demonstrated although it is implied, which might be problematic. Whether TMSB4X really functions in capturing actin in supporting cells is an attractive hypothesis, and it should be referred as such (where is the immunostaining of G-actin and TMSB4X showing overlap? Is the antibody for TMSB4X really reliable? It seems to stain a lot of other stuff. Could it be that the hair bundle forms even in the presence of TMSB4X and that other factors are really governing this transition?). Mechanistic details about how the TMSB4X proteins function and get degraded during development are lacking. One can see how obtaining these mechanistic insights is likely beyond the scope of the paper, but perhaps clarifying that this is only a correlation, and that the TMSB4X function in supporting cells is hypothesized but that it has not really been established, would be helpful.
2) There are about 600 proteins identified for hair-cells, and significantly less for supporting cells. An estimate of how much is missing can be obtained by comparing to results from HeLa cells (700 proteins), and the authors do acknowledge that some proteins are missing. This needs to be discussed in more depth, both in terms of what types of proteins might be missing, and how many. For instance, are membrane/soluble proteins well represented in the list? Is the method biasing the detections of some, but not others? Is the mass spec analysis biasing the detection towards small proteins? (I assume orbitrap limits on mass are bypassed by proteolysis treatments, but this could be discussed a bit more). In addition, an upper boundary on the number of different proteins expressed in a cell could also be obtained from mRNA transcripts identified (with obvious caveats, but a good upper limit nonetheless), which could help provide another estimate of how many proteins are missing in the analyses.https://doi.org/10.7554/eLife.50777.028
1) While the authors suggest that their results show that TMSB4X plays a role in releasing actin that can then be used to build stereocilia bundles, there is no evidence for this beyond changes in TMSB4X expression data. In light of this, the study then fundamentally represents the demonstration of a technique, single cell proteomic analysis. The title is a clear overstep based on the data provided. There is no direct evidence that TMSB4X actually drives actin release related to stereocilia assembly. This may in fact be what occurs, but without data, it seems inappropriate to include this in the title.
We agree with the reviewers and have changed the title.
The enrichment of TMSB4X in supporting cells is intriguing and well documented, yet lack of TMSB4X in hair cells is only correlated with the appearance of a hair bundle, causality is not really demonstrated although it is implied, which might be problematic.
Agreed. We have also downplayed the TMSB4X model throughout (see below).
Whether TMSB4X really functions in capturing actin in supporting cells is an attractive hypothesis, and it should be referred as such (where is the immunostaining of G-actin and TMSB4X showing overlap?
We have referred to this idea as a hypothesis throughout the manuscript.
Is the antibody for TMSB4X really reliable? It seems to stain a lot of other stuff.
TMSB4X is expressed widely. The reviewers may be noting the strong staining in the stroma cells in Figure 3H-I and Figure 3—figure supplement 2. However, we also note that there are two groups of cells in the scRNA-seq experiment (clusters S8 and S9 in Figure 5A) that have high expression of TMSB4X (Figure 5A). These cells were removed for the subsequent analysis of scRNA-seq data, but they at least partially corroborate our suggestion that the antibody is specific. The antibody has been validated in knock-down experiments, as indicated in the text.
Could it be that the hair bundle forms even in the presence of TMSB4X and that other factors are really governing this transition?). Mechanistic details about how the TMSB4X proteins function and get degraded during development are lacking. One can see how obtaining these mechanistic insights is likely beyond the scope of the paper, but perhaps clarifying that this is only a correlation, and that the TMSB4X function in supporting cells is hypothesized but that it has not really been established, would be helpful.
We agree with the reviewers that there are several very interesting questions to address beyond the scope of this paper. We also are convinced by the reviewers to indicate clearly that the role we propose for TMSB4X is hypothetical as well.
Last half of Abstract changed to: "The single-cell proteomes enabled the de novo reconstruction of a developmental trajectory using protein expression levels, revealing proteins that greatly increased in expression during differentiation of hair cells (OCM, CRABP1, GPX2, AK1, GSTO1) and those that decreased during differentiation (TMSB4X, AGR3). […] Single-cell proteomics data thus can be mined to reveal features of cellular development that may be missed with transcriptomics."
Last sentence of Introduction changed to: "These data are consistent with a model that suggests that existing monomeric actin is made available for hair-bundle assembly by the degradation of TMSB4X."
Discussion:"TMSB4X was present at nearly equimolar levels with respect to actin in supporting cells and thus may sequester actin monomers there. […] Together, these data are consistent with the hypothesis that downregulation of TMSB4X allows differentiating hair cells to construct their hair bundles with newly available actin monomers."
Discussion: The section titled "TMSB4X dynamics and assembly of stereocilia" was nearly completely removed.
2) There are about 600 proteins identified for hair-cells, and significantly less for supporting cells. An estimate of how much is missing can be obtained by comparing to results from HeLa cells (700 proteins), and the authors do acknowledge that some proteins are missing. This needs to be discussed in more depth, both in terms of what types of proteins might be missing, and how many.
Most cells are thought to express ~10,000 different proteins, not accounting for splice variants; mass spectrometry only detects this many proteins with substantial protein loads, however. Using 20 µg of starting material and two-dimensional separation, the Mann group identified 11,148 proteins or protein groups from HeLa cells (Geiger et al., 2012; Kulak et al., 2017). With a single separation dimension, ~3000 proteins can usually be detected.
For example, in our experiments carrying out proteomics measurements on whole chick utricle sensory epithelia at E20 (Shin et al., 2013), which used far more material, we detected 2753 proteins in at least two of four replicates. We used ~10 utricles per sample, which represents ~400,000 cells. These preparations will contain both hair cells and supporting cells, but very few other cell types.
Of course, the dynamic range of protein expression is enormous (107- to 1010-fold) and so most techniques only a capture a slice of the range. The single cell experiments, because they use so little input protein, are necessarily biased against detecting rare proteins. Nevertheless, we can gain significant insight into the function of a cell just by looking at its abundant proteins, as we show here.
For instance, are membrane/soluble proteins well represented in the list? Is the method biasing the detections of some, but not others?
Despite accounting for ~1/3 of all proteins, as with most mass spectrometry approaches, membrane proteins are not represented well on our list of detected proteins. Their hydrophobic peptides do not perform well on reverse-phase liquid chromatography, are not efficiently ionized before introduction into the mass spectrometer, or both. Soluble and cytoskeletal proteins are well represented, but they are also typically of much higher concentration than are membrane proteins.
Is the mass spec analysis biasing the detection towards small proteins? (I assume orbitrap limits on mass are bypassed by proteolysis treatments, but this could be discussed a bit more).
Most mass spectrometry sample preparation methods are biased against small proteins, as those methods typically include sieving steps like gel or filter-based purification. Thus we suggest that our sample prep method provides a more accurate estimate of the levels of these small proteins.
Because of high intrinsic variability for detection efficiency from peptide to peptide, where only a fraction of peptides (5-10%) perform exceptionally well in the mass spectrometer, small proteins can be missed if none of their tryptic peptides are well behaving. Again, normally mass spectrometry analysis is biased against small proteins, so anything that levels the field – like single-pot sample prep – gives a better representation of small proteins.
In addition, an upper boundary on the number of different proteins expressed in a cell could also be obtained from mRNA transcripts identified (with obvious caveats, but a good upper limit nonetheless), which could help provide another estimate of how many proteins are missing in the analyses.
In the scRNA-seq experiments, we detected expression of 13,643 genes, not accounting for splice variants, again in the combination of hair cells and supporting cells. Note that all of the concerns raised in point #2 are good ones, and so we have added an additional text to the Discussion that discusses the depth of protein sequencing and biases for and against certain classes of proteins.https://doi.org/10.7554/eLife.50777.029
- Peter G Barr-Gillespie
- Stefan Heller
- Ryan T Kelly
- Ying Zhu
- Stefan Heller
- Stefan Heller
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
We thank Pamela Canaday and Matthew Lewis for their assistance with flow cytometry. We received support from the following OHSU core facilities: FACS from the OHSU Flow Cytometry Shared Resource (supported in part by the OHSU Knight Cancer Institute and the NCI Cancer Center Support Grant P30 CA069533) and confocal microscopy from the OHSU Advanced Light Microscopy Core (P30 NS061800 provided support for imaging). We also received support from the following Stanford core facilities: the Stanford Shared FACS Facility, the Stanford Functional Genomics Facility, and the Otolaryngology Imaging Core. Some of the computing for this project was performed on the Sherlock cluster; we thank the Stanford Research Computing Center for providing computational resources and support. YZ was supported by Earth and Biological Sciences Directorate Mission Seed under the Laboratory Directed Research and Development Program at PNNL, and the Precision Medicine Innovation Co-Laboratory (PMedIC), a joint research collaboration of OHSU and PNNL. A portion of the research at PNNL was performed using EMSL (grid.436923.9), a DOE Office of Science User Facility sponsored by the Office of Biological and Environmental Research. SH was supported by R01 DC015201, by the Hearing Health Foundation’s Hearing Restoration Project, and through the Stanford Initiative to Cure Hearing Loss. RTK was supported by R33 CA225248. PGBG was supported by NIH grant R01 DC011034.
- Anna Akhmanova, Utrecht University, Netherlands
- Matthew W Kelley, National Institute on Deafness and Other Communication Disorders, United States
This is an open-access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.