Global, quantitative and dynamic mapping of protein subcellular localization

Abstract

Subcellular localization critically influences protein function, and cells control protein localization to regulate biological processes. We have developed and applied Dynamic Organellar Maps, a proteomic method that allows global mapping of protein translocation events. We initially used maps statically to generate a database with localization and absolute copy number information for over 8700 proteins from HeLa cells, approaching comprehensive coverage. All major organelles were resolved, with exceptional prediction accuracy (estimated at >92%). Combining spatial and abundance information yielded an unprecedented quantitative view of HeLa cell anatomy and organellar composition, at the protein level. We subsequently demonstrated the dynamic capabilities of the approach by capturing translocation events following EGF stimulation, which we integrated into a quantitative model. Dynamic Organellar Maps enable the proteome-wide analysis of physiological protein movements, without requiring any reagents specific to the investigated process, and will thus be widely applicable in cell biology.

https://doi.org/10.7554/eLife.16950.001

eLife digest

The interior of every cell is highly organised, and contains many compartments, called organelles, that are dedicated to specific roles. Proteins are the tools and machines of the cell, and each organelle has its own set of proteins that it requires to work correctly. Each cell contains ten or more organelles, and several thousand different types of proteins. The exact location of proteins in the cell is important; once we know what compartment a protein is in, it is easier to narrow down what it might be doing.

The location of many proteins in a cell is unclear or simply not known. Moreover, since changing the location of a protein can change its activity, it is also important to be able to detect changes in the location of proteins under different circumstances, such as before and after drug treatment.

Itzhak et al. set out to develop a method that reveals the locations of all the proteins in a cell at any given time. The resulting technique maps the location of most of the proteins in a human cancer cell line and, in addition, determines how many copies of each protein there are. Combining these two types of information produces a model of the cell’s architecture. Importantly, Itzhak et al. were able to compare such a model of the cell under normal circumstances to a model made after the cell had been stimulated with a growth factor. This revealed which proteins had changed location, identifying these proteins as important for the cell’s response to the growth factor.

The new mapping method could be used in the future to analyse the anatomy of different cell types, such as nerve cells and cells of the immune system. Itzhak et al. also want to investigate the differences between healthy cells and cells from people with neurological disorders to understand how such diseases arise.

https://doi.org/10.7554/eLife.16950.002

Introduction

The hallmark of eukaryotic cells is their compartmentalization into distinct membrane-bound organelles. Protein function is critically determined by subcellular localization, as organelles offer different chemical environments and interaction partners. In order to regulate protein activity, many biological processes involve changes in protein subcellular localization. Prominent examples include the endocytic uptake of activated plasma membrane signalling receptors, to terminate the signalling process (Jones and Rappoport, 2014), and the nucleo-cytoplasmic shuttling of many transcription factors, to regulate their access to DNA (Plotnikov et al., 2011).

The ability to monitor changes in organellar composition would provide a powerful tool to investigate cell biological processes at the systems level. While transcriptomic (Curtis et al., 2012) and proteomic abundance profiling approaches (Deeb et al., 2015) have yielded valuable insights into changes in gene or protein expression, they lack the important spatial dimension. Microscopy-based approaches can provide spatial information on individual proteins (Uhlen et al., 2015), but are limited by the availability of specific antibodies, and are very labour-intensive for analysing complete proteomes (Marx, 2015). Genome-wide GFP-tagging in yeast circumvents the need for antibodies (Huh et al., 2003), but tags may inadvertently alter protein subcellular localization, which is difficult to control for; in addition, serial imaging of cells for comparative purposes remains experimentally challenging (Breker et al., 2013).

Mass spectrometry-based proteomics has much enhanced our understanding of cellular composition (Larance and Lamond, 2015). Although sophisticated approaches for organellar proteomics have been available for over a decade (Andersen et al., 2003; Christoforou et al., 2016; Dunkley et al., 2004; Foster et al., 2006; Gilchrist et al., 2006; Smirle et al., 2013), there is currently no proteomic method that allows global dynamic mapping of protein subcellular localization. The main reason for this deficiency is the high variability between spatial proteomics experiments, which renders the identification of genuine organellar transitions very difficult (Gatto et al., 2014).

Here, we have developed a rapid proteomic profiling workflow for the generation of highly reproducible organellar maps. We use the method to assemble a comprehensive database of protein subcellular localization and abundance information from HeLa cells, allowing us to build a quantitative model of cellular anatomy. We then apply organellar maps to capture the protein translocation events triggered by EGF stimulation, demonstrating the dynamic capabilities of our approach.

Results

Organellar maps through fractionation profiling

The principle of our approach is to separate organelles partially with a minimum number of fractionation steps, and to generate organellar profiles by high-accuracy quantification of each fraction against an invariant reference. Metabolically labelled HeLa cells, SILAC light or heavy (Ong et al., 2002), were mechanically lysed following gentle hypo-osmotic swelling (Figure 1A). Damage to organelles was minimal, as assessed by leakage of lumenal contents (Figure 1—figure supplement 1A). Post-nuclear supernatant from light cells was fractionated by a series of five differential centrifugation steps, whereas a total organellar ‘reference’ fraction was obtained in a single centrifugation step from heavy post-nuclear supernatant. This procedure is highly reproducible, as assessed by protein recovery (Figure 1—figure supplement 1B,C). Each light sub-fraction was then combined with an equal amount of the heavy reference, subjected to tryptic digest and analysed by LC-MS/MS. For each protein, we obtained an abundance distribution profile across the sub-fractions. In a typical experiment, approximately 4500 proteins were profiled. Proteins associated with the same organelle have similar profiles, and organelles can be distinguished from one another (Figure 1B). In parallel, the global distributions of proteins across the nuclear, organellar, and cytosolic fractions were obtained by label-free quantification mass-spectrometry, typically covering 8000 proteins (Figure 1C).
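
The conversion of per-fraction ratios into a profile can be illustrated with a minimal sketch. It assumes that each value is a light/heavy SILAC ratio for one sub-fraction and that a profile is obtained by scaling the ratios to sum to 1; the exact normalization is described in the Methods and is not reproduced here. The function name and the example ratios are hypothetical.

```python
def fractionation_profile(light_over_heavy):
    """Scale per-fraction light/heavy SILAC ratios so that they sum to 1.

    Assumption: this simple sum-normalization stands in for the
    normalization described in the Methods.
    """
    if any(r < 0 for r in light_over_heavy):
        raise ValueError("SILAC ratios must be non-negative")
    total = sum(light_over_heavy)
    if total == 0:
        raise ValueError("protein was not quantified in any fraction")
    return [r / total for r in light_over_heavy]


# Hypothetical ratios for one protein across the five light sub-fractions.
ratios = [0.2, 0.6, 1.8, 1.0, 0.4]
profile = fractionation_profile(ratios)
peak_fraction = profile.index(max(profile))  # zero-based index of the peak
```

Proteins whose profiles peak in the same fractions with similar shapes would then be candidates for the same organellar cluster.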

Figure 1. Generation of organellar maps through fractionation profiling.

(A) Metabolically labelled HeLa cells were mechanically lysed to release organelles. Light labelled lysate was then subjected to differential centrifugation at the indicated speeds (RCF_MAX) and times (in minutes). Heavy-labelled lysate was centrifuged twice, once at low speed to generate a nuclear-enriched pellet, and again at high speed to generate the organellar pellet; the supernatant was the cytosolic fraction. The heavy organellar ‘reference’ fraction was combined with equal protein amounts of each of the five light membrane sub-fractions and analysed by mass spectrometry, generating SILAC ratios for each protein in all fractions. (B) The SILAC ratios were converted to enrichment over reference. Median values of organellar marker proteins were plotted, showing clearly distinct profiles. (C) In a parallel analysis, the heavy-labelled nuclear, organellar and cytosolic fractions were subjected to label-free mass spectrometric analysis, revealing the global distribution of proteins across these three fractions. Examples of normalized profiles of marker proteins for the nucleus (Histone H3), lysosome (Cathepsin D) and the cytosol (Pyruvate Kinase) are shown. Bars show mean ± SD (n = 6). Please refer to Figure 1—figure supplement 1 for organellar leakage analysis and evaluation of fractionation yield reproducibility.

https://doi.org/10.7554/eLife.16950.003

Following this scheme, we prepared six replicate fractionations, in batches of two, on three different days. We first considered the set of 3766 proteins common to all replicates, and applied principal component analysis (PCA) to their abundance profiles. The PCA scores plot was then overlaid with established organellar markers (Supplementary file 1), which clustered into distinct regions of the plot (Figure 2). Resolved compartments included plasma membrane, endoplasmic reticulum, ERGIC, Golgi apparatus, endosome, lysosome, peroxisomes and mitochondria, as well as a cluster of diverse large protein complexes, such as ribosomes and proteasomes. Closer inspection of the marker proteins suggested further sub-organellar resolution, revealing a partial divide between ER membrane and lumen, as well as division of mitochondria into matrix, inner membrane, and outer membrane (Figure 2—figure supplement 1A). For independent validation, we overlaid the scores plot with UniProt annotation for subcellular targeting features, including signal peptides, mitochondrial transit peptides and transmembrane domains, and observed near-complete agreement with our maps (Figure 2—figure supplement 1B). To assess the reproducibility of the method, we next analysed the six individual maps by PCA; all had very similar patterns (Figure 2—figure supplement 2A), with organellar clusters occupying stable positions between maps. The SILAC ratios of replicate fractions were also highly reproducible (average correlation > 0.95; Figure 2—figure supplement 2B).
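
As an illustration of the dimensionality reduction step, the sketch below computes the first principal component of a small set of profiles by power iteration on the covariance matrix, using only the Python standard library. It is not the software used in this study; the function name, the random starting vector and the toy profiles are assumptions made for the example.

```python
import math
import random


def first_principal_component(rows, iterations=200, seed=0):
    """Return (unit loading vector, per-row scores) for PC1 of the data."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d) of the mean-centred profiles.
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Random start: a uniform vector lies in the null space of the
    # covariance matrix when every profile sums to 1.
    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centred]
    return v, scores


# Toy profiles: three proteins peaking in the first fraction,
# three peaking in the last (hypothetical values).
profiles = [
    [0.70, 0.20, 0.10], [0.68, 0.22, 0.10], [0.72, 0.18, 0.10],
    [0.10, 0.20, 0.70], [0.12, 0.20, 0.68], [0.10, 0.22, 0.68],
]
loadings, scores = first_principal_component(profiles)
```

In this toy case the two groups receive scores of opposite sign along PC1, which is the kind of separation that appears as distinct clusters in a scores plot. The sign of a principal component is arbitrary, so only the relative positions are meaningful.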

Figure 2. Visualization of an organellar map.

Thirty SILAC ratios from six replicate fractionation experiments were combined and subjected to principal component analysis to achieve dimensionality reduction. Projections along the first (x-axis) and third (y-axis) principal components (PCs) provided the optimal separation of clusters. Each scatter point represents a protein. Proximity of proteins indicates similar fractionation behaviour. Marker proteins for organelles are coloured as indicated in the legend, and reveal clustering of proteins belonging to the same organelle. Non-marker proteins are shown as small grey dots. PCs 1–3 account for 64%, 21%, and 12% of the variability in the data, respectively. Please note that the actual resolution of the map is much higher than is apparent in this 2D representation of the full 30-dimensional data set, and most of the seemingly overlapping clusters are in fact separated. Please refer to Figure 2—figure supplement 1 for more detailed cluster annotation, and overlays with external protein sequence feature predictions. Figure 2—figure supplement 2 shows the reproducibility analysis of six replicate organellar maps. The complete organellar assignments, spatial and abundance information can be found in Supplementary file 1 (compact format) and 4 (interactive database).

https://doi.org/10.7554/eLife.16950.005

For the rigorous assignment of proteins to organellar clusters, we used a support vector machine (SVM)-based supervised learning approach. Briefly, SVMs allow non-linear separation of clusters (Varmuza and Filzmoser, 2009). Optimal boundaries between organellar clusters are determined using marker proteins, with cross-validation to prevent over-fitting. Non-marker proteins falling within the boundaries of a particular cluster are then assigned to that organelle. Since a suitable canonical set of organellar markers was not available, we manually curated a set of over 1000 proteins (Supplementary file 1). We chose markers based on their expression in HeLa cells and their well-documented (and ideally unimodal) localization to a particular organelle. Clustering of these markers was visually confirmed with several PCA-based pilot maps (not included in this study). Where necessary, we specifically chose further established markers near the edges of organellar clusters, as these are particularly important for defining boundaries. We applied SVM classification to all six maps individually. The mean prediction accuracy for marker proteins was 94.7% (with full cross validation), demonstrating the exceptionally high level of organellar resolution achieved (Supplementary file 2). While marker prediction accuracy does not provide a direct measure of overall prediction accuracy, it nevertheless serves as a useful estimate (see Methods for further details). The average proportion of identical organellar assignments between maps, referred to as concordance, was 93.7% for all proteins, and >98% for three-quarters of the proteins (Figure 2—figure supplement 2C).
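
The concordance measure can be sketched as follows. The sketch assumes that concordance for one protein is the share of replicate maps that agree with its most frequent assignment; the exact definition used for the reported values may differ. The protein names and labels are hypothetical.

```python
from collections import Counter


def concordance(assignments):
    """Share of replicate maps agreeing with the most common assignment."""
    counts = Counter(assignments)
    return counts.most_common(1)[0][1] / len(assignments)


# Hypothetical SVM assignments for two proteins across six replicate maps.
per_protein = {
    "protein_a": ["Plasma membrane"] * 6,
    "protein_b": ["ER", "ER", "ER", "ER", "Golgi", "ER"],
}
per_protein_concordance = {name: concordance(labels)
                           for name, labels in per_protein.items()}
mean_concordance = sum(per_protein_concordance.values()) / len(per_protein)
```

A protein assigned identically in all six maps scores 1.0, whereas a single dissenting map lowers the value to 5/6.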

Collectively, these data show that fractionation profiling is effective for generating high resolution organellar maps. The remarkable level of reproducibility enables comparative applications (see below).

A database of protein subcellular localization

We combined the predictions from the six replicate maps into a single output (see Methods for details). In total, we derived organellar profiles for 5265 proteins, of which 2423 were assigned to 9 membranous organelles, with 96.5% of marker proteins predicted correctly (92.7% average per membrane-bound organelle; Table 1). To validate the novel predictions, we removed the organellar markers from the set, and annotated the remaining proteins with UniProt subcellular localization information. A Fisher’s exact test showed that for eight of the nine compartments, the most significantly enriched localization term corresponded to our own organellar classification (Supplementary file 3). Furthermore, we compared our mitochondrial predictions with the MitoCarta database of experimentally validated mitochondrial proteins (Calvo et al., 2016); the overall concordance was 97% (92.3% for non-marker proteins). These data provide strong independent support for the high quality of our organellar assignments.
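
The enrichment test can be illustrated with a one-sided Fisher's exact test, computed here as the upper tail of the hypergeometric distribution. This is a generic sketch rather than the analysis pipeline used for Supplementary file 3; the function name and the counts are hypothetical.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided p-value P(X >= k) for term enrichment.

    N: annotated proteins in total; K: proteins carrying the term;
    n: proteins predicted in the compartment; k: overlap of the two sets.
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


# Hypothetical counts: 80 predicted proteins, 40 of which carry a term
# that annotates 150 of 2000 proteins overall.
p_value = fisher_enrichment_p(k=40, n=80, K=150, N=2000)
```

Repeating the test for every localization term and ranking the p-values identifies the most significantly enriched term for a compartment.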

Organellar maps deliberately exclude the cytosolic fraction, since numerous peripheral membrane proteins have a soluble as well as a membrane-bound pool. Inclusion of the cytosol in the maps would reveal which proteins are predominantly cytosolic but sacrifice information on the precise localization of the membrane-associated fraction. Therefore, the maps were augmented by an auxiliary workflow, which reveals the nuclear-organellar-cytosolic distribution (Figure 1C). In total, this global profile analysis extends to 8710 proteins, including 1999 cytosolic, 1133 nuclear, and 672 nucleo-cytosolic proteins (Supplementary file 1).

Table 1

Prediction output and performance of HeLa organellar maps. The table shows the combined organellar prediction output from six replicate maps from HeLa cells. Prediction performance is judged by the proportion of correctly assigned organellar marker proteins. Please also refer to Supplementary file 1 (compact format) and 4 (complete database), which contain detailed information for all 8710 proteins covered in this study, including nuclear and cytosolic predictions. Supplementary file 2 shows the performance of each individual map.

https://doi.org/10.7554/eLife.16950.008

Compartment	Number of marker proteins	Correctly predicted markers (number)	Correctly predicted markers (%)	All proteins predicted in this compartment
Endosome	85	75	88.2%	304
ER	127	127	100.0%	530
ER, high curvature	11	11	100.0%	45
ERGIC/cisGolgi	26	25	96.2%	73
Golgi	33	29	87.9%	190
Lysosome	43	41	95.3%	88
Mitochondrion	242	239	98.8%	658
Peroxisome	21	15	71.4%	25
Plasma membrane	127	123	96.9%	510
All organellar proteins	715	685	95.8%	2423
Average per organelle			92.7%	
Large Protein Complexes	361	353	97.8%	2739
Total	1076	1038	96.5%	5162

We combined all data into a database, which contains three layers of information. At the global level, it includes the distribution of each protein between nuclear, organellar, and cytosolic pools, as well as copy numbers per cell and cellular concentrations (calculated with the ‘proteomic ruler’ approach [Wisniewski et al., 2014]). At the organellar level, predictions of compartment associations are provided, with confidence scores. Furthermore, maps have high local resolution; this third level of information provides the ‘neighbourhood’ of a protein, revealing which other proteins have similar fractionation profiles. In many cases, this allows identification of stable protein complexes. The database is accessible via a web interface (www.MapOfTheCell.org), and as an interactive Excel file (Supplementary file 4); Supplementary file 1 contains a compact summary of the organellar predictions and copy numbers. The website allows visual exploration of the individual organellar maps.

Quantitative anatomy of a HeLa cell

Combined knowledge of protein subcellular localization and abundance enables construction of a model of HeLa cell composition. We calculated the protein mass of each organelle by multiplying the molecular weights of constituent proteins by their estimated copy numbers (Figure 3). This revealed that the membranous organelles (excluding the nucleus) contribute approximately 16% to total cellular protein mass, dominated by mitochondria (6.6%), ER (4.4%), and plasma membrane (3.1%), with relatively minor contributions from endosomes, lysosomes, peroxisomes and Golgi (Figure 3A). The mitochondria, ER and plasma membrane are themselves dominated by a few highly abundant proteins (Figure 3B). In each case, the 20 most abundant proteins constitute at least 40% of organellar protein mass (Figure 3C–E, and Supplementary file 5). For example, the most abundant plasma membrane protein is the 4F2 cell-surface antigen heavy chain (SLC3A2), with 15 million copies/cell. This versatile protein can heterodimerize with several other proteins (e.g. SLC7A5, another very abundant protein, three million copies) to form amino acid transporters. This predominance probably reflects the adaptation of HeLa cells for fast nutrient uptake to support rapid growth. Supporting this view, all plasma membrane transporters combined (40 million copies) contribute approximately 25% of the total compartment protein mass. Other integral membrane proteins (such as adhesion and signalling receptors) contribute ~30 million copies. Assuming a cell surface area of ~1600 μm², typical of adherent HeLa cells (Fisher and Cooper, 1967), this yields an estimated density of 4–5 integral membrane proteins per 100 nm², in excellent agreement with a previous biochemically derived estimate of 3 for baby hamster kidney (BHK) fibroblasts (Quinn et al., 1984). Within the endoplasmic reticulum, proteins involved in protein folding and quality control predominate (20% chaperones, 10% protein disulfide isomerases).
A similar abundance of chaperones was observed in the mitochondria (14%), which exceeds the collective contribution of citric acid cycle components (9%). We detected the five members of the mitochondrial ATP synthase F₁ catalytic complex with the expected stoichiometry of ~3:1 for subunits A/B to C/D/E, and estimate the number of complexes at ~3 million per cell (5% of mitochondrial protein mass). Thus, a picture of HeLa cell anatomy emerges from the quantitative subcellular localization information.
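
The arithmetic behind these estimates can be written out as a short sketch. The mass calculation multiplies copy number by molecular weight, and the density calculation uses the copy numbers and surface area quoted above. The function names are hypothetical, and the protein in the mass example is invented for illustration.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def protein_mass_pg(copies, mw_da):
    """Mass in picograms of `copies` molecules of a protein of `mw_da` daltons."""
    grams = copies * mw_da / AVOGADRO
    return grams * 1e12


def density_per_100_nm2(n_proteins, area_um2):
    """Proteins per 100 nm² of membrane (1 μm² = 1e6 nm²)."""
    return n_proteins / (area_um2 * 1e6) * 100


# Invented protein: one million copies of a ~60 kDa protein weigh ~0.1 pg.
example_mass = protein_mass_pg(1_000_000, 60_221.4076)

# Values from the text: 40 million transporters plus ~30 million other
# integral membrane proteins on a surface of ~1600 μm².
density = density_per_100_nm2(40e6 + 30e6, 1600)
```

With the quoted values the density is about 4.4 proteins per 100 nm², consistent with the stated range of 4–5.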

Figure 3. Quantitative anatomy of a HeLa cell.

(A) Schematic diagram of a cell where compartments are approximately scaled by their relative contributions to total cell protein mass (not by their volumes). All membranous organelles combined (excluding the nucleus) contribute ca. 16%. For comparison, ribosomes and proteasomes contribute 6% and 1.3%, respectively. The proportion of the ER fraction would increase from 4.4% to ca. 5.4% if attached ribosomes were included. (B) Proteins of major organelles were ranked in order of decreasing abundance, and plotted against their cumulative mass. Very few proteins contribute the majority of organellar protein mass in all three cases. (C–E) Top 20 most abundant proteins in each of the three major organelles, plotted against their contribution to organellar protein mass. The complete quantitative compositions of ER, mitochondria, and plasma membrane are shown in Supplementary file 5.

https://doi.org/10.7554/eLife.16950.009

Systems-wide detection of protein translocation events – dynamic organellar maps

The very high reproducibility of our approach opens the possibility to compare maps under different physiological conditions, to identify protein translocation events. To test this, we investigated the well-characterized process of epidermal growth factor receptor (EGFR) uptake. Following stimulation with EGF, EGFR autophosphorylates, binds downstream factors, and is rapidly endocytosed from the plasma membrane to an endosomal compartment (Jones and Rappoport, 2014). The translocation process is readily imaged using fluorescently-labelled EGF (Figure 4A,B). We prepared organellar maps from untreated (control) HeLa cells and from HeLa cells continuously stimulated with EGF for 20 min, in biological triplicate (Figure 4C,D; Figure 4—figure supplement 1 provides a schematic of the experimental design; Figure 4—figure supplement 2A shows all six maps). Overall map morphology from treated and control cells was unchanged; however, EGFR, which localized to the plasma membrane cluster in control cells, was localized to the endosomal cluster upon EGF treatment, as expected. To identify subcellular translocations in an unbiased manner, we developed a two-stage statistical analysis. For each protein, the magnitude of translocation (Movement score) as well as the consistency of the direction of the translocation across biological repeats (Reproducibility score) were assessed. The two metrics were then combined in an ‘MR’ plot (Figure 4E,F) to identify proteins undergoing consistent translocations. To derive stringent score cut-offs, we took advantage of the maps used to generate our subcellular localization database (Figure 2—figure supplement 2A). We treated these six maps as a mock experiment in which we would not expect to detect any specific changes, by assigning three maps as 'controls' and three as 'mock-treated'. We determined the most stringent score cut-offs from the MR plot of the mock experiment by defining a region where no false positives were obtained.
Applying these cut-offs to the EGF treatment experiment identified four proteins as significantly translocating: EGFR, GRB2, SHC1 and PKN2. Both GRB2 and SHC1 are recruited to EGFR upon EGF stimulation and constitute the first step in EGFR signalling (Oda et al., 2005). Inspection of the maps (Figure 4C,D, Figure 4—figure supplement 2A), and classification with support vector machines (Supplementary file 6) showed that all four proteins had moved to the endosome/lysosomal compartment, as expected. Therefore, our approach correctly and specifically identified the major translocation events following EGF stimulation. For a deeper exploratory analysis, we then relaxed score cut-offs to allow an FDR of 10%, and identified 14 further significantly translocating proteins (Supplementary file 7). These included numerous known downstream targets of EGF, such as RPS6KA3, PIK3C2B and ROCK2, as well as several new candidates (see Discussion). These results validate the use of dynamic organellar maps for the systematic detection of subcellular translocation events.
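
The logic of the MR analysis can be sketched with simplified stand-in scores. The scoring formulas are not given in this section, so the definitions below are assumptions made for illustration only: the M score is taken as the smallest Euclidean length of the treated-minus-control difference profile across replicates, and the R score as the lowest pairwise Pearson correlation between those difference profiles. The cut-off helper mirrors the idea of choosing a threshold at which a mock experiment yields no hits. All names and values are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x)
                    * sum((b - my) ** 2 for b in y))
    return num / den if den else 0.0


def mr_scores(control_reps, treated_reps):
    """Return (M, R) for one protein from paired replicate profiles."""
    deltas = [[t - c for c, t in zip(cr, tr)]
              for cr, tr in zip(control_reps, treated_reps)]
    m = min(math.sqrt(sum(d * d for d in delta)) for delta in deltas)
    r = min(pearson(a, b) for a, b in combinations(deltas, 2))
    return m, r


def zero_fp_m_cutoff(mock_scores, r_cut):
    """Largest mock M score among proteins with R > r_cut.

    Requiring M to exceed this value calls no mock protein a hit.
    """
    return max((m for m, r in mock_scores if r > r_cut), default=0.0)


def is_hit(score, m_cut, r_cut):
    m, r = score
    return m > m_cut and r > r_cut


# Hypothetical three-fraction profiles, three replicates per condition.
mover = mr_scores(
    [[0.60, 0.30, 0.10], [0.58, 0.31, 0.11], [0.62, 0.29, 0.09]],
    [[0.10, 0.30, 0.60], [0.12, 0.29, 0.59], [0.11, 0.31, 0.58]],
)
static = mr_scores(
    [[0.30, 0.40, 0.30], [0.31, 0.39, 0.30], [0.29, 0.41, 0.30]],
    [[0.31, 0.39, 0.30], [0.30, 0.40, 0.30], [0.30, 0.40, 0.30]],
)
mock_scores = [(0.05, 0.95), (0.20, 0.10), (0.08, 0.85)]  # hypothetical
r_cut = 0.8
m_cut = zero_fp_m_cutoff(mock_scores, r_cut)
```

In this toy example the moving protein shifts by a large, consistently oriented amount and passes both cut-offs, whereas the static protein shows only small shifts in inconsistent directions and is rejected.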

Figure 4. Dynamic organellar maps reveal protein localization changes following EGF stimulation.

(A, B) Fluorescently tagged EGF (green) was pre-bound to HeLa cells on ice, and imaged by confocal microscopy. Lysosomal compartments were visualized with Lysotracker (red). Most of the EGF was at the cell surface (A). Cells were then shifted to 37°C, and incubated for 30 min. EGF had been cleared off the cell surface, and localized predominantly to an endosomal compartment, with little lysosomal co-localization (B). Scale bars = 10 μm. (C) Organellar maps were prepared from untreated HeLa cells, and (D) from cells following 20 min of continuous stimulation with 20 ng/ml EGF. The translocation of EGFR (star symbol) from plasma membrane to endosomes was captured. Colours indicate organelles as in Figure 2. Maps show the combined data from three replicates each. (E, F) Unbiased identification of significant translocation events triggered by EGF stimulation. Each protein is scored for magnitude of translocation (M score, x-axis) and reproducibility of translocation direction (R score, y-axis) across the three replicates. An MR plot reveals significant translocations in the top right quadrant. Score cut-offs for FDR-control were determined by analysis of a triplicate mock experiment where no genuine translocations are expected (E). Ultra-stringent cut-offs (corresponding to an FDR of 0) were then applied to the EGF treatment experiment (F). Four significant translocations were detected, including EGFR and two known binding partners, GRB2 and SHC1. As the maps in C, D reveal, all move to the endolysosomal cluster. Figure 4—figure supplement 1 provides a schematic of the experimental design. Please refer to Figure 4—figure supplements 2 and 3 for further in-depth analysis of protein localization changes following EGF stimulation.

https://doi.org/10.7554/eLife.16950.010

In addition to inter-organellar translocation events, EGF signalling also involves cytosol/membrane as well as cytosol/nuclear transitions. To capture these events, we compared the abundance of proteins in membrane, nuclear and cytosolic fractions (as prepared in Figure 1C) from control and EGF-stimulated cells, based on label-free quantification (Cox et al., 2014) (Figure 4—figure supplement 2B). Stringent FDR controls were derived using a mock experiment of our six database maps, as above, identifying 26 significant changes in the EGF experiment (Supplementary file 7, and Figure 4—figure supplement 3). In agreement with the organellar maps, we detected substantial recruitment of GRB2 and SHC1 to membranes. In addition, we observed recruitment of CBL and UBASH3B, which are also known to bind to activated EGFR (Grovdal et al., 2004; Raguz et al., 2007). Consistent with this, CBL and UBASH3B were not detected in control maps but were found in individual EGF-treated maps, in the endosome/lysosome (Figure 4—figure supplement 2A). Among others, we identified the known translocation of RPS6KA3 into the nucleus, as well as a surprising number of transcriptional regulators leaving the nucleus (Supplementary file 7). These included ZCCHC8 and RBM7, the unique components of the nuclear exosome targeting (NEXT) complex (Lubas et al., 2011), which targets the exosome to promoter upstream transcripts (PROMPTs) for their degradation. Therefore, our data suggest that EGF may induce modulation of the non-coding transcriptome.

Quantitative modelling of EGF-triggered subcellular translocations

Finally, we combined the identified translocation events with our estimates of absolute protein abundances. For each translocating protein, we calculated the number of molecules in cytosol, nuclear, and organellar fractions, before and after EGF treatment. Differences were then interpreted as the number of proteins moving between compartments (summarized in Supplementary file 7). For example, our data show a significant overall loss of EGFR upon EGF treatment (from 700,000 to 620,000 copies per cell, p=0.0022), suggesting that a proportion of endocytosed EGFR has already been degraded in lysosomes. Approximately 500,000 copies of GRB2 are recruited onto endosomes/EGFR, suggesting a stoichiometry of ~1:1 with EGFR. In contrast, CBL (10,000 copies) and UBASH3B (30,000 copies) are recruited sub-stoichiometrically, as would be expected of enzymatically acting proteins. SHC1 (100,000 copies) is also recruited sub-stoichiometrically. The cell loses three quarters of its Vasorin (a negative regulator of TGFB signalling; 30,000 copies), most likely through plasma membrane shedding (Malapeira et al., 2011). Over 300,000 copies of the actin regulator Palladin are released into the cytosol, and the Rho-effector PKN2 is shifted to endosomes, indicating major cytoskeletal rearrangements. Our data thus begin to provide a quantitative, integrated view of EGF-triggered subcellular translocations at the protein level (Figure 5).
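
The bookkeeping described above can be sketched as follows, assuming that the copies in each compartment are obtained by multiplying the total copy number by the fraction of the protein found in that compartment. The function names, the example protein and its fractional distributions are hypothetical; only the EGFR totals are taken from the text.

```python
def copies_per_compartment(total_copies, fractions):
    """Split a total copy number according to fractional distribution."""
    return {comp: total_copies * frac for comp, frac in fractions.items()}


def net_movement(before, after):
    """Change in copies per compartment (positive means a gain)."""
    return {comp: after[comp] - before[comp] for comp in before}


# Hypothetical adaptor protein with one million copies per cell.
before = copies_per_compartment(
    1_000_000, {"cytosol": 0.80, "organellar": 0.15, "nuclear": 0.05})
after = copies_per_compartment(
    1_000_000, {"cytosol": 0.30, "organellar": 0.65, "nuclear": 0.05})
moved = net_movement(before, after)

# EGFR totals quoted in the text: 700,000 copies before, 620,000 after.
egfr_lost = 700_000 - 620_000
egfr_lost_fraction = egfr_lost / 700_000
```

In the hypothetical case about 500,000 copies move from the cytosol to the organellar fraction. For EGFR, the quoted totals correspond to a net loss of 80,000 copies, roughly 11% of the starting pool.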

Figure 5. Quantitative mapping of EGF-triggered subcellular translocation events.

Summary of key protein translocations in HeLa cells following 20 min of continuous stimulation with EGF. All depicted changes were detected by organellar maps in this study; they include numerous previously known as well as novel translocation events. Numbers on arrows indicate how many copies of a protein undergo the indicated movement (per cell). These estimates were also calculated from the mass spectrometry data, using the proteomic ruler approach (Wisniewski et al., 2014). Figure 4—figure supplement 3 and Supplementary file 6 (interactive database) and 7 (compact summary) show additional translocations not included here.

https://doi.org/10.7554/eLife.16950.014

Discussion

We have developed and applied a powerful new method for making quantitative organellar maps, to generate an extensive database of human protein subcellular localizations and organellar composition. Furthermore, the ease and reproducibility of our approach permits comparative applications and process modelling, as we demonstrate using EGF signalling.

The HeLa spatial proteome

Here, we provide localization information for 8710 proteins in HeLa cells. The database is accessible via an Excel file (Supplementary file 4) and a website (www.MapOfTheCell.org), which provide complementary features for analysing the data. Both contain information on protein abundance (copy numbers per cell), global cellular distributions (eg cytosolic vs membrane pools), and predicted organellar associations. In addition, the website offers visualization and interactive exploration of the maps. Supplementary file 4 provides an extra local ‘neighbourhood analysis’ identifying proteins with highly similar fractionation profiles (useful for identifying potential protein complexes), and also allows easy annotation of whole protein families via its batch submission option.

The complexity of the HeLa proteome has been estimated at around 10,000 proteins (Beck et al., 2011; Nagaraj et al., 2011); a substantial proportion of this is covered by our database. Importantly, it accounts for the vast majority of protein cell mass (as can for example be seen from the cumulative mass plots in Figure 3B, which all reach a stable plateau); further identifications would mostly correspond to low abundance proteins, with minimal contributions to organellar composition. In this respect, our database approaches comprehensive coverage, and offers a quantitative view of cellular architecture (Figure 3). The relative sizes of organelles differ significantly between cell types; the approach presented here allows a comparatively rapid characterization at a level previously only achievable through extensive morphological studies. A future comparison of different cell types will substantially enhance our understanding of cellular identity, by uncovering universal features and specific adaptations. In addition to the organellar level, it will also give new insights for individual proteins, by revealing cell- or species-specific localization differences, and thus potentially new regulatory or functional aspects.

Accurate, quantitative and reproducible organellar maps

The profiling approach presented here maximizes speed and simplicity of the subcellular fractionation procedure. This ensures reproducibility, and at the same time keeps organelles as intact as possible. Since the preparative aspects are straightforward, several fractionations can be carried out in parallel on the same day, allowing multiplexing and complex experimental designs. Relative to the previous LOPIT approach (localization of organelle proteins by isotope tagging; Christoforou et al., 2016), our fractionation protocol is five times faster (4 hr vs ~20 hr), and requires an order of magnitude less starting material (10⁷ cells vs 10⁸ cells). Most importantly, our method can be used comparatively, and also offers quantitative data on protein abundance; a comparative application of LOPIT has yet to be demonstrated. The peptide labelling strategy of LOPIT allows very flexible use, whereas our method requires metabolic labelling (SILAC), currently rendering it most suitable for dividing cells in culture. However, an application of fractionation profiling to mammalian tissues is possible, since mice can be kept on a SILAC diet (Zanivan et al., 2012); alternatively, a representative mix of SILAC-labelled cell lines may be used to generate the reference fraction (SuperSILAC approach [Geiger et al., 2010]). In addition, mass tagging is in principle also compatible with our approach, and may thus extend its range of applications in future. A detailed comparison of the methods’ relative advantages and requirements is presented in Supplementary file 8.

Our organellar assignments are in excellent agreement with independent external data (Figure 2—figure supplement 1, Supplementary file 3). Furthermore, we made a direct comparison with a recent analysis of the mouse stem cell spatial proteome using LOPIT (Christoforou et al., 2016). In total, 2397 homologous proteins were classified in both studies, of which 2196 had identical compartment predictions (91.6%; Supplementary file 3). This exceptionally high level of agreement, across species and cell types, reciprocally supports the very high accuracy of predictions in both datasets.
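The agreement figure quoted above is the share of proteins classified in both studies that received the same compartment label. A minimal sketch of that comparison follows, assuming each dataset is represented as a mapping from protein identifier to predicted compartment. The toy identifiers and labels are hypothetical; only the counts 2196 and 2397 come from the text.

```python
def concordance(predictions_a, predictions_b):
    """Return (n_shared, n_identical, percent_identical) for two protein -> compartment maps."""
    shared = predictions_a.keys() & predictions_b.keys()
    identical = sum(1 for p in shared if predictions_a[p] == predictions_b[p])
    percent = 100.0 * identical / len(shared) if shared else float("nan")
    return len(shared), identical, percent


# Hypothetical toy data (not taken from either study)
hela = {"P1": "ER", "P2": "Mitochondrion", "P3": "Lysosome", "P4": "Golgi"}
mouse = {"P1": "ER", "P2": "Mitochondrion", "P3": "Plasma membrane", "P5": "ER"}
print(concordance(hela, mouse))

# Counts reported in the text: 2196 identical predictions out of 2397 shared proteins
print(f"{100 * 2196 / 2397:.1f}%")  # 91.6%
```

Only proteins present in both mappings enter the denominator, which mirrors the restriction to homologous proteins classified in both studies.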

Organellar maps based on subcellular fractionation profiles reflect protein steady state localizations. Proteins predominantly associated with a single organelle have closely matching profiles, and can be assigned unambiguously. In contrast, proteins equally split over two (or more) compartments have mixed profiles, which may be difficult to interpret (Gatto et al., 2014). Here, we assign each protein to the most likely compartment, but potential secondary assignments are also indicated (Supplementary file 4). Furthermore, our two-tiered profiling approach considerably alleviates the dual-localization problem, by separating organellar predictions from quantifying a protein’s nuclear, cytosolic and membrane pools. This allows, for example, the accurate characterization of nuclear-cytosolic shuttling proteins: instead of showing an ambiguous ‘in between’ state, our approach precisely determines how these proteins are distributed over the two compartments. Similarly, for proteins with a cytosolic and an organellar pool, it allows quantification of the distribution, in addition to identification of the membrane compartment. Of note, our dynamic implementation of maps is generally unaffected by multiple localization difficulties, since we uncouple the detection of protein translocations from organellar assignments (Figure 4). Thus, our approach allows the identification of translocation events, even if they only involve partial organellar transitions.

Dynamic organellar maps applied to EGF signalling

Here, we have used organellar maps to analyse cellular events following EGF stimulation. We correctly captured the endosomal transition of EGF receptor, and recruitment of signalling adaptors. Remarkably, the translocations were detected with extremely stringent FDR control, using cut-offs where we expect no false positives. This indicates that our approach is capable of identifying translocation events de novo, without having to filter results based on prior knowledge. Furthermore, in combining the translocation data with protein copy number estimations, we provide a genuine systems-biology approach to EGF signalling at the protein level (Figure 5). Unlike transcriptomic or proteomic profiling, our approach allows detection of cellular rearrangements at very early time points after stimulation, long before changes in protein abundance occur. The entire experiment (triplicate comparisons, six maps) required only five days of mass spectrometry measuring time.

In total, our analysis identified 40 translocation events, including numerous previously unreported movements. Among them are ten major regulators of actin dynamics, such as the kinases ROCK2, PKN2, PIK3C2B, and their downstream targets ADD1 and CTNN1, as well as PALLD, LASP1, and UTRN, suggesting that re-arrangement of the cytoskeleton is one of the major immediate effects of EGF signalling in HeLa cells. For several of these proteins, this study provides the first experimental evidence that they are targets of the EGF pathway (Supplementary file 7). Our data also reveal an unexpected cross-talk with other signalling pathways; Vasorin-shedding, AHNK and PDCD4 rerouting are all likely to counteract anti-proliferative TGFB signalling, and may serve to enhance EGF activity. Strikingly, we observed several transcriptional regulators leaving the nucleus. While nuclear import of proteins, such as ERK2/MAPK1, is a common downstream effect of signalling, nuclear protein export has been reported comparatively rarely. A possible explanation is that this type of movement is more difficult to detect with conventional approaches, such as microscopy: protein import concentrates the signal in the nucleus, whereas export diffuses it. Taken together, these observations highlight the power of the holistic proteomic approach, which identifies the co-ordinated behaviour of functionally linked groups of proteins, and thus uncovers cellular response modules.

Outlook

This study demonstrates that dynamic organellar maps can shed new light even on relatively well-studied processes, such as EGF uptake. We propose that they will be similarly suitable in the fields of autophagy, membrane trafficking and cellular differentiation, providing a powerful complement to imaging-based techniques. Since they offer an unbiased approach to studying cellular dynamics that does not require prior knowledge, they will also be an effective tool for exploratory investigations of poorly characterized processes. The possibility of combining maps with high-throughput phosphoproteomics data (Humphrey et al., 2015) promises to provide unprecedented views of signalling, by linking the movement of substrates to their phosphorylation status. Moreover, as we have shown here, organellar maps will pave the way for quantitative process modelling in cell biology.

Materials and methods

Quick start guide - from cells to organellar maps in 6 easy steps
Experimental protocols
Bioinformatic analysis
Detection and interpretation of translocation events – dynamic organellar maps
A guide to interpreting organellar maps
- Understanding the prediction output – reading organellar maps
- The Large Protein Complex (LPC) cluster
How to use the website www.mapofthecell.org

Step	Description	Time requirements
	Starting material: SILAC heavy and light labelled HeLa cells (1 x 15 cm dish each, 50% confluent, i.e. 2 x 10 million cells).
1	Mechanical cell lysis, and differential centrifugation subcellular fractionation → the actual ‘Fractionation Profiling’	4 hr
2	Protein assay of fractions, overnight tryptic digest, peptide clean-up	4 hr hands-on, + overnight digestion
3	Mass spectrometry analysis (Thermo Q-Exactive HF)	20 hr (fast protocol)
4	MaxQuant data processing (free software), data filtering	< 24 hr (processor with 8 cores, e.g. Intel i7)
5	Visualization of maps by PCA, check clustering (using e.g. SIMCA software (free demo) or Perseus software (free))	1 hr
6	Prediction of protein subcellular localization by SVM classification (Perseus, free software)	1 hr
→ From cells to map in 3 days
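Step 6 assigns proteins to compartments by comparing their fractionation profiles with those of organellar marker proteins. The sketch below illustrates that idea only. It deliberately swaps the SVM classification performed in Perseus for a much simpler nearest-centroid rule so that it runs with the Python standard library alone, and it is not the classifier used in this study. The five-fraction profiles, compartment choices and function names are hypothetical.

```python
import math


def centroid(profiles):
    """Average a list of equal-length profiles column by column."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]


def train(markers):
    """markers: compartment -> list of marker profiles. Returns compartment -> centroid."""
    return {compartment: centroid(profiles) for compartment, profiles in markers.items()}


def predict(centroids, profile):
    """Assign a profile to the compartment with the closest centroid (Euclidean distance)."""
    return min(centroids, key=lambda c: math.dist(centroids[c], profile))


# Hypothetical marker profiles: relative abundance across five fractions
markers = {
    "Mitochondrion": [[0.60, 0.25, 0.10, 0.04, 0.01], [0.55, 0.30, 0.10, 0.04, 0.01]],
    "ER": [[0.10, 0.20, 0.40, 0.20, 0.10], [0.12, 0.18, 0.42, 0.18, 0.10]],
    "Endosome": [[0.02, 0.05, 0.13, 0.30, 0.50], [0.03, 0.07, 0.10, 0.35, 0.45]],
}

centroids = train(markers)
print(predict(centroids, [0.58, 0.27, 0.10, 0.04, 0.01]))
```

A nearest-centroid rule yields no confidence score; an SVM, as used in the actual workflow, additionally provides scores on which prediction confidence classes can be based.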

	Fast Map	Deep Map
Number of subcellular fractions	8 (5xSILAC, 3xLFQ)	8 (5xSILAC, 3xLFQ)
Peptide fractionations	1	3
Measuring time/sample	2 hr 30 min	2 hr 30 min
Total measuring time per map	20 hr	60 hr
Proteins mapped (average)	2800	4500
Global protein profiles (average)	6000	8000
Prediction performance (on markers)	93.4%	94.7%
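The total measuring times in the table are consistent with a simple product of the number of subcellular fractions, the number of peptide fractionations and the measuring time per sample. The short sketch below makes that arithmetic explicit. The formula is our reading of the table rather than a stated definition, and the function name is hypothetical.

```python
def total_measuring_time_hr(n_fractions, n_peptide_fractionations, hours_per_sample):
    """Instrument time per map, assuming every fraction/peptide-fraction pair is one run."""
    return n_fractions * n_peptide_fractionations * hours_per_sample


fast_map = total_measuring_time_hr(8, 1, 2.5)  # 20.0 hr
deep_map = total_measuring_time_hr(8, 3, 2.5)  # 60.0 hr

# Assumption: six maps measured with the fast protocol
six_fast_maps_days = 6 * fast_map / 24  # 5.0 days

print(fast_map, deep_map, six_fast_maps_days)
```

Under the assumption that fast maps were used, six maps amount to 120 hr, which matches the five days of measuring time quoted for the EGF experiment.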

Prediction Confidence	Score	Map Concordance	Maps disagree in:	Predictions in this class
Very low	<1	94%	1:13	657
Low	1-4	96%	1:25	573
Medium	4-16	98%	1:50	999
High	16-52	99%	1:100	1229
Very High	52-100	>99%	<1:100	1807
Total				5265
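The confidence classes above partition the prediction score range into five bins. A minimal sketch of that binning follows. The table does not state how scores falling exactly on a boundary (1, 4, 16, 52) are treated, so assigning them to the higher class is an assumption, as are the function and variable names.

```python
# Upper bounds (exclusive, by assumption) for each class below "Very High"
CLASS_BOUNDS = [(1, "Very low"), (4, "Low"), (16, "Medium"), (52, "High")]


def confidence_class(score):
    """Map a prediction score in [0, 100] to its confidence class."""
    if not 0 <= score <= 100:
        raise ValueError("score must lie between 0 and 100")
    for upper, label in CLASS_BOUNDS:
        if score < upper:
            return label
    return "Very High"


# Numbers of predictions per class, copied from the table
predictions_per_class = {
    "Very low": 657,
    "Low": 573,
    "Medium": 999,
    "High": 1229,
    "Very High": 1807,
}

print(confidence_class(0.5), confidence_class(30), confidence_class(75))
print(sum(predictions_per_class.values()))  # 5265
```

Summing the per-class counts reproduces the total of 5265 predictions given in the table.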

Compartment	Sub-Compartment	No of markers	Correctly predicted	%
ER	Membrane	78	72	92.3%
ER	Lumen	49	40	81.6%
ER	Total	127	112	88.1%
Mitochondria	Matrix	173	171	98.8%
Mitochondria	Outer membrane	20	15	75.0%
Mitochondria	Inner membrane	46	37	80.4%
Mitochondria	Total	239	223	93.3%

Figure 1. Generation of organellar maps through fractionation profiling.

Figure 2. Visualization of an organellar map.

Figure 3. Quantitative anatomy of a HeLa cell.

Figure 4. Dynamic organellar maps reveal protein localization changes following EGF stimulation.

Figure 5. Quantitative mapping of EGF-triggered subcellular translocation events.

Author details

Daniel N Itzhak

Stefka Tyanova

Jürgen Cox

Georg HH Borner (for correspondence)
