
Synthetic protein interactions reveal a functional map of the cell

  1. Lisa K Berry
  2. Guðjón Ólafsson
  3. Elena Ledesma-Fernández
  4. Peter H Thorpe (corresponding author)
  1. The Francis Crick Institute, Mill Hill Laboratory, United Kingdom
Tools and Resources
Cite this article as: eLife 2016;5:e13053 doi: 10.7554/eLife.13053

Abstract

To understand the function of eukaryotic cells, it is critical to understand the role of protein-protein interactions and protein localization. Currently, we do not know the importance of global protein localization nor do we understand to what extent the cell is permissive for new protein associations – a key requirement for the evolution of new protein functions. To answer this question, we fused every protein in the yeast Saccharomyces cerevisiae with a partner from each of the major cellular compartments and quantitatively assessed the effects upon growth. This analysis reveals that cells have a remarkable and unanticipated tolerance for forced protein associations, even if these associations lead to a proportion of the protein moving compartments within the cell. Furthermore, the interactions that do perturb growth provide a functional map of spatial protein regulation, identifying key regulatory complexes for the normal homeostasis of eukaryotic cells.

https://doi.org/10.7554/eLife.13053.001

eLife digest

Our actions often depend on who we interact with: parents, teachers, friends, colleagues. So it is for proteins in the cell: their function depends on which other proteins they work with. If a protein interacts with new partners or ends up in a new neighborhood of the cell, it can perform an entirely unexpected role, rewiring how that cell works.

There are millions of possible protein-protein interactions, but it is not known how cells behave if their proteins are forced into new associations. For example, how many of these associations affect how well the cell can grow?

Using budding yeast, Berry et al. were able to associate every protein in the cell with proteins from each of the major areas of the cell such as the nucleus, cell membrane or mitochondria. These new associations and relocations were then examined to see how many of them caused problems, slowing the cell’s growth or killing it.

Unexpectedly, most forced associations had no detectable effect, indicating that the cell is remarkably tolerant of new protein-protein interactions. This contradicts a common idea that proteins are very fussy about their partner proteins, and will not work properly if they are forced into new interactions.

The associations that do cause a growth defect are often between proteins that normally work together, indicating that their association is carefully controlled during normal cell growth. In some cases these forced associations identified previously unknown regulators of cell behavior.

Proteins that interact with the wrong partners or are in the wrong place within cells cause a number of diseases. Future forced association experiments will allow us to examine such interactions and possibly search for drugs that will correct the problem.

https://doi.org/10.7554/eLife.13053.002

Introduction

Post-translational protein modifications such as phosphorylation or ubiquitylation often alter the affinity of one protein for other proteins or cellular components, which in turn drives protein movement within the cell (Scott and Pawson, 2009). Protein relocalization is critical for many cellular processes, including the asymmetric division of adult stem cells, which underlies metazoan development. The importance of protein localization is also highlighted by diseases ranging from cystic fibrosis to cancer that result, in part, from protein mislocalization (Hung and Link, 2011). The evolution of new modes of protein regulation requires new associations to form, but currently we do not know how tolerant the cell is of novel protein interactions. For example, can a nuclear kinase relocate to the cytoplasm without consequence?

Various methodologies have been developed to allow specific affinity-based relocation of proteins in vivo. For example, some systems are designed to disable a location-specific function by sequestering proteins to a specific compartment (Haruki et al., 2008; Robinson et al., 2010). Alternatively, a leucine zipper-based system was developed to screen for pairwise protein associations, provided that selection for a phenotype is possible (Devit et al., 2005). However, none of these approaches has systematically assessed the effects of creating pairwise protein associations, one at a time, across the entire proteome. To address this, we made use of the Synthetic Physical Interaction (SPI) system (Olafsson and Thorpe, 2015) to create high-affinity interactions between each of the ~6000 members of the yeast proteome and target proteins in each of the major cellular compartments. This has allowed us to assay the effect of each of these in vivo binary protein interactions individually upon the normal growth of cells. We find that most protein-protein interactions are benign to the normal growth of cells, but that specific interactions do perturb growth; these interactions are termed Synthetic Physical Interactions or SPIs (Olafsson and Thorpe, 2015). The SPIs are enriched for functional regulators, indicating that constitutive colocalization of a regulator with its target causes a growth defect. We are able to use SPIs to identify novel regulatory proteins; for example, we examine SPIs between the kinetochore protein Nuf2 and both Hmo1 and Sgf29 and find that these two proteins are required to regulate the levels of outer kinetochore proteins. Furthermore, the SPIs correlate with the quaternary structure of large protein complexes such as the kinetochore or nuclear pore. As such, the SPIs provide a powerful tool to complement existing physical and genetic interactions.

Results

The SPI system uses a GFP-binding protein (GBP) derived from an alpaca antibody (Rothbauer et al., 2006), which when fused to a target protein of interest creates binary associations in vivo with GFP-tagged proteins (Rothbauer et al., 2006; Rothbauer et al., 2008; Grallert et al., 2013). We define a target protein as one fused with the GBP and a query protein as one tagged with GFP. By introducing GBP-target proteins into strains encoding GFP-query proteins, we induce an affinity between the target and query proteins via the strong binding of GBP to GFP. We used the Selective Ploidy Ablation technique (Reid et al., 2011) to introduce a plasmid encoding the GBP-target protein into the collection of ~6000 GFP strains, each of which has a chromosomally integrated GFP introduced at the 3’ end of a specific open-reading frame (Huh et al., 2003). In each resulting haploid strain, the GBP-target protein is plasmid-encoded and the GFP-query protein is endogenously encoded; we are therefore able to create a binary protein-protein interaction and assess the effects of this interaction upon growth. We used two independent controls, which were separately transferred into the GFP collection. The first control encodes the GBP alone, and the second encodes the target protein alone. These two constructs control both for the effects of binding a protein to the GFP tag and also for the ectopic expression of the target gene in each GFP strain. We chose 23 different target proteins that represent 18 of the major cellular compartments (Figure 1A and Figure 1—source data 1), such as the nucleus (Pus1 and Rad52), the cell membrane (Psr1), and the endoplasmic reticulum (Sec63). The genes encoding these target proteins were fused with GBP and transferred into every strain of the GFP collection (Figure 1—source data 1). Thus, for each target protein, we create ~6000 strains, each of which contains the target GBP-tagged protein together with a specific GFP-query protein.
The effect on growth was assayed by comparing the colony sizes of strains containing the GBP-GFP interaction with the two controls (Figure 1B,C; Dittmar et al., 2010). The two controls gave equivalent results (Figure 1—figure supplement 1; as previously reported in Olafsson and Thorpe, 2015) and consequently an average growth score was used.
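The colony-size comparison can be illustrated with a minimal sketch. It assumes that growth is scored as the natural log of the control-to-experiment colony size ratio, averaged over the two controls, and then standardized across a screen; the text does not spell out the normalization, so the function names, the scoring formula and the toy colony sizes below are all hypothetical.

```python
import math
import statistics

def mean_log_growth_ratio(experiment, control_gbp, control_target):
    """Average of ln(control / experiment) over the two controls, per strain.

    Positive values mean the experimental colony is smaller than the controls.
    Assumes every colony size is greater than zero.
    """
    ratios = {}
    for strain, size in experiment.items():
        r_gbp = math.log(control_gbp[strain] / size)
        r_target = math.log(control_target[strain] / size)
        ratios[strain] = (r_gbp + r_target) / 2
    return ratios

def z_scores(ratios):
    """Standardize the ratios across all strains of one screen."""
    values = list(ratios.values())
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return {strain: (value - mean) / sd for strain, value in ratios.items()}

# Toy colony sizes in arbitrary units (hypothetical, not data from the study).
experiment = {"A": 100, "B": 100, "C": 100, "D": 25}
control_gbp = {"A": 100, "B": 100, "C": 100, "D": 100}
control_target = {"A": 100, "B": 100, "C": 100, "D": 100}

z = z_scores(mean_log_growth_ratio(experiment, control_gbp, control_target))
worst = max(z, key=z.get)
print(worst)  # D
```

With this sign convention a positive z-score marks a growth defect relative to controls, which matches the convention used for the screens.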

Figure 1 (with 1 supplement)
Quantitative analysis of the effects of binding proteins throughout the cell.

(A) A schematic diagram of S. cerevisiae indicating the cellular compartments and target proteins within the cell that were associated with each member of the proteome. (B) A 1536-colony plate from the Sec63 screen. The inset below shows the highlighted row from the Sec63-GBP plate, the Sec63 control plate and the GBP-only control plate, respectively. Growth defects are indicated with a black line. (C) The z-scores of all 5734 proteins in each of the 23 screens. For each screen, the strains are ranked by z-score; positive z-scores indicate a growth defect relative to controls. The inset highlights the strains with the largest growth defects in each screen.

https://doi.org/10.7554/eLife.13053.003

We expected that many of the forced associations would disrupt cellular homeostasis, but unexpectedly, we found that 98% of GBP–GFP combinations (129,098 out of 131,882) do not affect the growth of cells (Figure 1C). These data imply that cells are surprisingly permissive for most protein-protein interactions and as a corollary that cells are broadly tolerant of proteins being relocated within the cell.
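The counts behind the 98% figure can be checked with a few lines of arithmetic. This is only a worked example using the numbers quoted in the text and in the Figure 1 legend; the variable names are illustrative.

```python
# Counts taken from the text and the Figure 1 legend; names are illustrative.
n_targets = 23        # GBP-tagged target proteins (screens)
n_queries = 5734      # GFP-tagged query proteins scored in each screen
n_benign = 129_098    # combinations with no detectable effect on growth

n_total = n_targets * n_queries
n_defect = n_total - n_benign
pct_benign = round(100 * n_benign / n_total, 1)

print(n_total)     # 131882
print(n_defect)    # 2784
print(pct_benign)  # 97.9
```

The remainder of 2784 combinations agrees with the total number of confirmed SPIs reported for the 23 screens.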

In cases where fluorescent imaging was able to detect protein relocalization, we confirmed that ~72% of interactions do occur. Since the GBP tag is linked to red fluorescent protein (RFP), we were able to assay colocalization with GFP. We examined 552 GBP-GFP combinations (each of the 23 GBP-tagged target proteins separately combined with a random selection of 24 GFP-tagged query proteins) using live cell imaging (Figure 2—source data 1; for examples see Figure 2 and Figure 2—figure supplement 1). Of the 524 GBP-GFP combinations that we could score, 210 (40%) are already in the same compartment, so we cannot determine whether GFP and GBP associate. Of the remaining 314, 225 (~72%) were detectably colocalized (Figure 2C), indicating that in the majority of cases the protein-protein associations do occur (Figure 2, Figure 2—figure supplement 1 and Figure 2—source data 1). These observations are therefore consistent with the notion that most synthetic protein-protein interactions do not cause a growth defect.
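The imaging percentages follow directly from the counts in the paragraph above. The short worked example below uses only those counts; the variable names are illustrative.

```python
# Counts quoted in the text; names are illustrative.
n_imaged = 23 * 24                       # target x query combinations examined
n_scored = 524                           # combinations that could be scored
n_same_compartment = 210                 # microscopy uninformative for these
n_informative = n_scored - n_same_compartment
n_colocalized = 225

pct_same = round(100 * n_same_compartment / n_scored)
pct_colocalized = round(100 * n_colocalized / n_informative)

print(n_imaged)         # 552
print(n_informative)    # 314
print(pct_same)         # 40
print(pct_colocalized)  # 72
```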

Figure 2 (with 2 supplements)
Colocalization of target GBP protein and query GFP proteins.

(A) Cdc11-GBP relocalizes to the Golgi when bound to Sec26-GFP. (B) Cdc55-GFP relocalizes to the mitochondria when bound to Om45-GBP. (C) Bar chart of the proportion of colocalization (n=552); note that the colocalized category includes 210 combinations where the target and query proteins are within the same compartment and so protein-protein association will not be apparent from this microscopy analysis. (D) Bar chart of the direction of movement of GFP and GBP (n=225). ‘To query protein - GFP’ indicates relocation of the majority of the GBP target protein to GFP (see A); ‘To target protein - RFP’ denotes relocation of the majority of the GFP query proteins to the GBP-RFP target (see B). ‘Both locations’ indicates that GBP and GFP proteins are in both their normal location and those of the other protein (e.g. Figure 2—figure supplement 1B). ‘Neither location’ denotes both GFP and GBP proteins are colocalized, but not to either of their normal locations (e.g. Figure 2—figure supplement 1C), whereas ‘Regionally colocalized’ indicates one protein is in the same region of the cell as the second protein, but not completely colocalized (see E). ‘Foci only’ designates that the proteins relocalized to discrete foci (see Figure 2—figure supplement 1E). Two categories are omitted from this analysis: first, cells that were uncharacterized, typically because the cells were dead; second, cells in which the target and query protein reside in the same cellular location, such that microscopy is not informative on whether or not they are associated. This latter category makes up ~40% of our combinations. (E) Hta2-GBP is displaced from the nucleus when bound to Spt6. The scale bars are 5 µm.

https://doi.org/10.7554/eLife.13053.007

The microscopy analysis also allows us to examine whether the GBP-tagged target protein recruits the GFP protein to its location or vice versa. We anticipated that each binary protein association would create a ‘tug-of-war’ between the target protein and the query protein. The image data support this notion; where it is possible to distinguish the location of two proteins in the cell, we observed that there are roughly equal instances of the GBP protein recruiting the GFP protein as the reverse (Figure 3 and Figure 3—figure supplement 1). However, this generalization is not true for some classes/types of proteins. When we look at individual GBP or GFP proteins, we find that structural components more often recruit proteins to their location than enzymes that are not anchored to a specific location (Figure 3, Figure 3—figure supplement 1 and Figure 2—source data 1). For example, GFP-tagged cytosolic query proteins such as Cdc55 and Snf1 mostly relocalize to their target proteins (Figure 2B and Figure 3—figure supplement 1), whereas the GFP-tagged nucleolar proteins Rpa49 and Pwp2 more often recruit GBP-tagged target proteins to their location (Figure 3—figure supplement 1). There are some rare cases where the two proteins localize to both locations and also where one or both proteins mislocalize to a new location that is foreign to both (Figure 2D). An example of the latter is the recruitment of the nucleosome remodeling protein, Spt6, to the histone subunit Hta2. Constitutive recruitment of a nucleosome remodeler to the chromatin might be expected to give a phenotype and indeed we find that the histone subunit Hta2-GBP is strikingly no longer restricted to the nucleus (Figure 2E) concomitant with a strong growth defect. It is possible that we are overestimating the extent of relocalization caused by the GFP-GBP interaction. 
First, since the target and query proteins are not stoichiometrically matched, some of the GFP or GBP protein will likely remain at its native location. Second, it is possible that in some cases either the GFP tag or the GBP tag is cleaved from its query or target protein respectively, thus giving a false indication of colocalization. It is also possible that imaging underestimates the proportion of relocalization, since we could not score the 210 combinations where proteins are already in the same compartment; these are perhaps more likely to associate via the GFP-GBP interaction. Furthermore, it should be noted that in some cases where we could not detect that the GFP and GBP proteins were colocalized, there was nevertheless either a growth phenotype or a change in the location of one of the proteins. For example, of the 15 Iqg1 associations that failed to show protein colocalization (Figure 3), 14 show mislocalization of either the Iqg1 target protein or the GFP query protein.

Figure 3 (with 1 supplement)
Direction of colocalization.

(A) The proportion of the 24 query proteins that colocalized in the direction indicated. Categories used to characterize the direction of colocalization are described in Figure 2. The ‘Uncharacterized’ category includes strains where there were no cells to image, which is often the case if the interaction perturbs growth.

https://doi.org/10.7554/eLife.13053.011

Around 2% of the forced interactions restrict growth (Figure 1C, 4A and Figure 1—source data 1). However, we note that of the 6000 GFP-tagged proteins used in this study, only ~4000 have been validated and are clearly observable (Huh et al., 2003). We therefore reanalyzed the proteome-wide data using only 3905 GFP strains with unambiguous fluorescence signal (Tkach et al., 2012) and find that ~3% restrict growth (Figure 4—source data 1), consistent with the notion that most protein-protein associations do not restrict growth. We did not use a specific threshold cutoff to define a SPI; rather, we confirmed the SPIs with the greatest impact on cell growth for each GBP by repeating the assay, starting with the strongest interaction and proceeding sequentially through the SPIs until the false discovery rate (FDR) reached 40% (Figure 4—figure supplement 1). Associations that produced a growth defect relative to controls with 16 replicates in the confirmation experiments are considered SPIs. Thus, some SPIs result from relatively mild growth defects, as outlined in Figure 4—source data 1. We note that the false negative rate may be significant, since we did not test further than the 40% FDR and due to the limitations of measuring growth by colony size. Using this approach, we confirmed 2784 SPIs in total produced by 727 GFP-tagged query proteins with one or more of the 23 target proteins (Figure 4—figure supplement 2 and Figure 4—source data 1).
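The sequential confirmation strategy can be sketched as a simple loop over a ranked candidate list. This sketch assumes that the FDR is tracked cumulatively over everything retested so far; the text does not say how the rate was windowed, and the function name, the `retest` callback and the toy strain labels are hypothetical.

```python
def confirm_until_fdr(ranked_candidates, retest, fdr_limit=0.40):
    """Retest candidates from strongest to weakest growth defect and stop
    once the cumulative fraction of failed retests reaches fdr_limit.

    ranked_candidates: strains ordered from strongest to weakest defect.
    retest: callable returning True when the growth defect is reproduced.
    """
    confirmed = []
    tested = 0
    false_hits = 0
    for strain in ranked_candidates:
        tested += 1
        if retest(strain):
            confirmed.append(strain)
        else:
            false_hits += 1
        if false_hits / tested >= fdr_limit:
            break
    return confirmed

# Toy example: ten ranked candidates, five of which would reproduce.
ranked = list("abcdefghij")
reproducible = {"a", "b", "c", "e", "h"}
confirmed = confirm_until_fdr(ranked, lambda strain: strain in reproducible)
print(confirmed)  # ['a', 'b', 'c', 'e']
```

In the toy run the loop stops at candidate "g", so the reproducible candidate "h" is never tested, which illustrates why a stopping rule of this kind leaves false negatives further down the ranking.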

Figure 4 (with 6 supplements)
Comparisons of synthetic physical interaction screens.

(A) Cluster analysis of the SPI data. The 23 screens are arranged horizontally and the 727 GFP strains clustered vertically. High z-scores (positive; >2) are shown in yellow and low scores (negative; < -1) in blue. Three distinct clusters are highlighted (a, b, and c) and described in Figure 4—figure supplement 6. (B) Spearman’s Rank Correlation Coefficients for the different SPI screens show that target proteins from similar compartments give similar SPIs; for example, Sec63 and Loa1 cluster together, as do two kinetochore proteins, Nuf2 and Dad2. (C) The notched box-and-whisker plot indicates the distributions of the retest log growth ratios and indicates that SPIs produced by a query protein and a target protein from different compartments produce stronger growth defects than those from the same compartments (***indicates a p-value = 1.8 × 10^-5, Wilcoxon's rank-sum). The plot shows the median value (bar) and quartiles (box), the whiskers show the minimum of the range or 1.5 interquartile ranges, outlying data points are indicated as circles and the notches indicate the 95% confidence intervals of the medians. (D) The GFP proteins with SPIs have, on average, more protein-protein interactions than non-SPI query proteins; the notched box-and-whisker plot is in the same format as panel C (***indicates a p-value < 2.2 × 10^-16, Wilcoxon’s rank-sum). The 727 SPI query proteins (red) are superimposed upon the yeast interactome with proteins with ≥10 interactions shown as larger squares. (E) The CLIK interaction density plot for Sec63 is shown (see Figure 4—figure supplement 5 for the other CLIK plots). The ~500 Sec63 associations that show the strongest growth restriction have a high interaction density (inset).

https://doi.org/10.7554/eLife.13053.013

One possible cause of the SPIs is that the target protein would sequester the GFP-tagged query protein away from its normal location. Should this be the case, we would expect low-abundance proteins to be more susceptible to growth defects. However, this is not generally the case for most SPIs, consistent with our earlier findings (Olafsson and Thorpe, 2015), since we found there was no correlation between protein abundance and the z-scores (a relative measure of growth) from the 23 GBP screens (R^2 values ≤0.004). To address the issue further we grouped all GFP strains into eight categories based upon the abundance of their GFP proteins; each group has 421 proteins. We then plotted the proportion of GFP strains within each group that produced SPIs with a given GBP target (Figure 4—figure supplement 3A). Broadly, there are no abundance categories that are consistently enriched for SPIs with all GBP associations. However, we did note that in some cases the group of most abundant proteins had fewer SPIs than the other groups (for example Hta2 and Sec63, see Figure 4—figure supplement 3A). To assess whether the levels of the GBP-tagged protein would influence the SPIs, we altered the GBP-tagged protein levels by virtue of their constitutive copper promoter. The CUP1 promoter functions in the absence of copper and its expression can be gradually increased by adding copper to the growth media. We confirmed that upon addition of increasing amounts of copper, the levels of the GBP target proteins increased, as assayed by quantitative fluorescence imaging of the RFP tag attached to GBP (Figure 4—figure supplement 3B). We then retested 400 GFP strains, representing both high and low abundance proteins, with four different GBP target proteins, two of which had fewer SPIs with high-abundance proteins than expected (Hta2 and Sec63).
The results indicate that increasing the expression of the GBP proteins does not specifically increase the number of SPIs within high abundance categories (Figure 4—figure supplement 3C). Nevertheless, we expected that a subset of proteins would be particularly sensitive to the effects of forced association and relocalization and this proved true. When we examine all the 727 SPI query proteins collectively (Figure 4A), we find that 75 GFP query proteins produce SPIs with at least 10 of our 23 GBP-tagged target proteins (Figure 4—figure supplement 2A). These ‘frequent SPI query proteins’ are on average of lower abundance than less frequent SPI query proteins (Figure 4—figure supplement 2B,C). They are also enriched for essential genes (≈83%) and for proteins whose gene ontology (GO) terms include RNA metabolism (p-value = 9.26 × 10^-5), mRNA polyadenylation (p-value = 1.63 × 10^-9), cytoplasmic and nuclear transport (p-values = 1.14 × 10^-8 and 1.69 × 10^-7, respectively), microtubule nucleation (p-value = 5.09 × 10^-8), and spindle pole body (p-value = 3.22 × 10^-8). We have previously shown that these interactions are mostly suppressed by having an untagged copy of the query protein present in the cell (Olafsson and Thorpe, 2015). In heterozygous diploid strains, the untagged version of the SPI query protein is able to complement for the tagged version of the protein that is mislocalized via its association with the target protein. To confirm that the frequent SPI query proteins fall into this category we retested 41 SPIs from the Nuf2 screen that fall into the frequent SPI query proteins group and 40 from the non-frequent SPI query proteins group. Consistent with our expectation, all 41 frequent SPIs are suppressed in heterozygous diploid cells, whereas 15% (6 out of 40) of the SPIs in the non-frequent group were reproduced in diploid cells (Figure 4—figure supplement 4).
Thus, we conclude that these frequent SPI query proteins are predominantly those whose essential function is location-dependent and whose sequestration to another compartment results in a growth defect (as is routinely achieved using other systems; Haruki et al., 2008).
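The abundance analysis described above (eight equal-sized abundance groups, then the proportion of each group that gives an SPI) can be sketched as follows. The function name, the handling of leftover strains and the toy data are assumptions made for illustration only.

```python
def spi_fraction_by_abundance(abundance, spi_strains, n_bins=8):
    """Split strains into n_bins equal-sized abundance groups (lowest first)
    and return the fraction of each group that produced an SPI.

    Strains left over when the total is not divisible by n_bins are ignored
    in this sketch.
    """
    ranked = sorted(abundance, key=abundance.get)
    size = len(ranked) // n_bins
    fractions = []
    for i in range(n_bins):
        group = ranked[i * size:(i + 1) * size]
        hits = sum(1 for strain in group if strain in spi_strains)
        fractions.append(hits / len(group))
    return fractions

# Toy data: 16 hypothetical strains whose abundance rises with their index.
abundance = {f"s{i}": i for i in range(16)}
spi_strains = {"s0", "s1", "s2"}
fractions = spi_fraction_by_abundance(abundance, spi_strains)
print(fractions)  # [1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
```

A flat profile across the groups would correspond to the reported lack of a consistent abundance effect, whereas the toy data are deliberately skewed toward low abundance to show what enrichment would look like.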

To understand whether associations to similar areas of the cell create growth defects from common sets of query proteins, we compared the SPIs generated from each target protein. Spearman’s correlation coefficient analysis (Lubbock et al., 2013) indicates that, in specific cases, SPI screens using target proteins from the same cellular compartment give similar SPIs (Figure 4B). For example, the Pus1 and Rad52 target proteins, which are both in the nucleus, produce SPIs with a similar set of query GFP proteins. However, it is interesting to note that some target proteins from the same cellular compartment give quite distinct sets of SPIs. For example, the SPI data for nuclear proteins Nop10 (nucleolus), Heh2 (nuclear membrane), and Hta2 (histone) cluster together but are distinct from both Pus1 and Rad52 (two nuclear enzymes). We suggest that these SPIs segregate into two different classes because Pus1 and Rad52 are non-essential nuclear enzymes, whereas Nop10, Heh2, and Hta2 are structural components, which may be more sensitive to movement. We next asked whether SPI query proteins would be located in the same cellular compartment as their target protein. SPIs between query and target proteins that normally localize to the same cellular compartment are enriched (10.4% of our confirmed SPIs are with target and query proteins from the same compartment, versus an expected value of 7.1% for the full dataset, p-value = 1.8 × 10^-9, Fisher's exact test). Also, this notion is true in specific cases, particularly for nuclear proteins. For example, SPIs with a nucleolar protein, Nop10, are enriched for nucleolar components (21 out of 115, p-value = 1.8 × 10^-8, Fisher’s exact test), and SPIs with the microtubule-associated kinetochore component Nuf2 are enriched for microtubule components (described below).
This pattern was typical of nuclear proteins, but not evident for other proteins: for example, the SPIs with the mitochondrial protein Om45 did not include any mitochondrial proteins. However, it should be noted that although there are more SPIs between proteins in the same compartment, SPIs produced by proteins in different compartments tend to give a greater growth defect (Figure 4C).
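The comparison between screens rests on Spearman’s rank correlation, which is the Pearson correlation of the ranked values. The sketch below is a plain-Python version with average ranks for ties; it is not the tool cited in the text, and the z-scores in the example are invented for illustration.

```python
def average_ranks(values):
    """Rank values from 1 upward, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical z-scores for the same four query strains in two screens.
screen_1 = [3.1, 0.2, 2.4, -0.5]
screen_2 = [2.8, 0.1, 3.0, -0.2]
rho = spearman(screen_1, screen_2)
print(round(rho, 3))  # 0.8
```

Because only ranks are compared, two screens score as similar when they order the query strains alike, even if the absolute sizes of their growth defects differ.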

Unexpectedly, we find that SPI query proteins are enriched for characterized physical interactions, compared with non-SPIs (p < 2.2 × 10^-16, Wilcoxon’s rank-sum). This is visualized by overlaying all the confirmed SPI query proteins onto a graph of the yeast physical interaction dataset (HINT database, Das and Yu, 2012; Figure 4D). We also asked the same question for each SPI screen using the Cutoff Linked to Interaction Knowledge tool (CLIK), which examines quantitative data for interaction density (Dittmar et al., 2013). The CLIK tool ranks all genes/proteins by their z-score (high scores bottom left, low scores top right) and then plots the interaction density between all proteins (using data from the BioGRID database; Stark et al., 2006). If, from a specific target screen, the most growth-restricted query proteins are collectively enriched for genetic or physical interactions, then a cluster of high density will be visible in the bottom left of the density plot. Most SPI screens have a strong enrichment for genetic and physical interactions, indicating that the strongest SPIs share interactions (Figure 4E and Figure 4—figure supplement 5), which is a predictor of common function. The overlap with physical interactions is particularly surprising, indicating that proteins that normally interact together can induce a growth defect when constitutively bound. Collectively, these observations are consistent with the idea that proteins and their regulators are often located within the same compartment, but their temporal or spatial physical association is tightly regulated.
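The idea of interaction density among the top-ranked hits can be illustrated with a much simplified sketch. It is not the CLIK algorithm; it only compares the fraction of annotated interacting pairs among the highest-scoring proteins with the fraction across the whole ranked list. The function names, protein labels and interactions are hypothetical.

```python
from itertools import combinations

def interaction_density(proteins, interactions):
    """Fraction of all possible protein pairs annotated as interacting."""
    pairs = list(combinations(proteins, 2))
    if not pairs:
        return 0.0
    hits = sum(1 for a, b in pairs if frozenset((a, b)) in interactions)
    return hits / len(pairs)

def top_vs_all_density(z_scores, interactions, top_n):
    """Density among the top_n highest z-scores versus the whole list."""
    ranked = sorted(z_scores, key=z_scores.get, reverse=True)
    top = interaction_density(ranked[:top_n], interactions)
    overall = interaction_density(ranked, interactions)
    return top, overall

# Hypothetical z-scores and annotated interactions.
z_scores = {"P1": 5.0, "P2": 4.5, "P3": 4.0, "P4": 0.3, "P5": 0.1, "P6": -0.2}
interactions = {
    frozenset(("P1", "P2")),
    frozenset(("P1", "P3")),
    frozenset(("P2", "P3")),
    frozenset(("P4", "P6")),
}
top, overall = top_vs_all_density(z_scores, interactions, top_n=3)
print(top)                # 1.0
print(round(overall, 3))  # 0.267
```

A top density well above the overall density is the kind of signal that would appear as a high-density cluster at the growth-restricted end of the ranking.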

The SPIs for each target protein are also enriched for proteins involved in regulating their function. Gene ontology enrichment analysis for the SPIs demonstrates that specific functional classes of proteins are enriched for each cellular compartment. For example, SPIs for the DNA repair protein Rad52 are enriched for components of the nuclear pore (Ndc1, Nic96, Nup1, Nup85, Nup49, Nup57, Nup84, Nup145, and Nup192; p-value = 7 × 10^-9), specifically the Nup84 complex, which functions in specialized types of DNA repair (Nagai et al., 2008). Another example is the kinetochore protein Nuf2, whose SPIs are enriched for proteins involved in microtubule organization (Ark35, Bir1, Cbf2, Cdc14, Ctf19, Dad2, Dad4, Dsn1, Ipl1, Kip1, Kip3, Okp1, Spc24, Spc29, Spc42, Spc105, Spc110, Stu1, and Tub4; p-value = 8 × 10^-13). Nuf2 is an outer kinetochore protein whose calponin-homology domain directs microtubule binding (Wei et al., 2007; Ciferri et al., 2008). As such, the SPIs may include numerous novel regulators of their target proteins (Olafsson and Thorpe, 2015). To test this, we examined three Nuf2 SPIs in more detail. Hmo1, Sgf29 (both chromatin-associated proteins) and Sst2 (a GTPase activating protein) all gave a strong SPI phenotype with the kinetochore protein Nuf2. Only one of these mutants, hmo1∆, gives a chromosomal instability phenotype (Stirling et al., 2011) and none have a reported role in kinetochore function. The SPI data (Figure 4A) cluster Hmo1 adjacent to Dad4, an outer kinetochore protein and with other kinetochore proteins (Mcm21, Okp1, Nkp2, Ctf19, and Spc24). To test whether the Hmo1-Nuf2 SPI was unique in the kinetochore, we tested various other kinetochore target proteins fused with GBP in an Hmo1-GFP strain. We find that in addition to Nuf2, Hmo1 has SPIs with Mif2 and Ctf19, but not Kre28, Mtw1, Dad2, Ctf3, Chl4, Skp1, Cnn1, or Cbf1 (Figure 5A). These data suggest that the Hmo1 SPI is specific to central/outer kinetochore components.
We examined fluorescently tagged kinetochore proteins in hmo1∆, sgf29∆, and sst2∆ cells. We chose two kinetochore proteins, Mtw1 and Dad4, which are at the central and outer kinetochore, respectively, are adjacent to Nuf2, Ctf19 and Mif2, and have been used in quantitative studies (Joglekar et al., 2006; Ledesma-Fernández and Thorpe, 2015). Strikingly, we find that hmo1∆ and, to a lesser extent, sgf29∆ cells both have elevated levels of Dad4 outer kinetochore protein associated with their centromeres, although the levels of Mtw1 were unaffected (Figure 5B,C and 5D). However, Hmo1 stimulates the activity of the SWI/SNF chromatin remodeling complex (Hepp et al., 2014) and therefore may affect expression of the DAD4 gene. To test whether the hmo1∆ mutant was affecting Dad4 protein levels, we quantified total cellular Dad4-YFP fluorescence in wild-type and mutant cells and find that approximately one third of hmo1∆ cells have higher levels of Dad4 than those found in wild-type cells (Figure 5—figure supplement 1). Nearly half of the hmo1∆ cells have Dad4 levels in the wild-type range (+/- one standard deviation of the wild-type mean); hence cellular Dad4 protein levels are not sufficient to explain the aberrant Dad4 foci seen in most hmo1∆ cells (Figure 5B). Furthermore, it has previously been shown that Hmo1 is associated with purified kinetochores (Akiyoshi et al., 2010), consistent with a direct role at the kinetochore. These data support the notion that in specific cases SPIs define functional regulators.
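The wild-type range used above (within one standard deviation of the wild-type mean) can be made concrete with a short sketch that classifies mutant cells as below, within or above that range. The function name and the fluorescence values are hypothetical and are not measurements from the study.

```python
import statistics

def fraction_by_wild_type_range(wild_type, mutant):
    """Fractions of mutant cells below, within and above the wild-type range,
    defined as the wild-type mean plus or minus one sample standard deviation."""
    mean = statistics.mean(wild_type)
    sd = statistics.stdev(wild_type)
    low, high = mean - sd, mean + sd
    n = len(mutant)
    below = sum(1 for v in mutant if v < low) / n
    within = sum(1 for v in mutant if low <= v <= high) / n
    above = sum(1 for v in mutant if v > high) / n
    return below, within, above

# Hypothetical total-cell fluorescence values in arbitrary units.
wild_type = [90, 100, 110]
mutant = [80, 95, 100, 105, 120, 130]
below, within, above = fraction_by_wild_type_range(wild_type, mutant)
print(within)  # 0.5
```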

Figure 5 (with 1 supplement)
Nuf2 SPIs affect kinetochores.

(A) The Hmo1-GFP query protein encoding strain was transformed separately with 13 plasmids encoding different kinetochore target proteins tagged with GBP (4 replicates each). The growth relative to controls (GBP alone and target protein alone) was assessed as in Figure 1. (B) Deletions of HMO1, SGF29, and SST2 were separately introduced into strains encoding Dad4-YFP and Mtw1-YFP at their endogenous loci. Fluorescence imaging of these strains reveals that hmo1∆ mutants have large-bright Dad4-YFP kinetochore foci (red arrows) and some weak foci (green arrows). sgf29∆ mutants contain bright Dad4-YFP foci (red arrows). In all cases, there are no effects upon Mtw1-YFP foci (right panels). Scale bars in all images are 5 µm. (C) Quantitation of the Dad4-YFP kinetochore foci fluorescence levels from these cells indicates that the levels of Dad4-YFP at kinetochores are affected by deletion of either HMO1 or SGF29. The left notched box and whiskers plot indicates the median (background subtracted) fluorescence values of kinetochore foci in relative units. The plot shows the median value (bar) and quartiles (box), the whiskers show the minimum of the range or 1.5 interquartile ranges, outlying data points are indicated as circles (note that several outlying data points are not shown as they are beyond the scale of the plot). The notches indicate the 95% confidence intervals of the medians (***indicates p-values < 10^-10 from Wilcoxon’s rank-sum test). It should be noted that the distribution of kinetochore intensities does not conform to a normal distribution, particularly for the hmo1∆ mutant. The right panels show the distribution of fluorescent intensities of kinetochore foci of the same data plotted to the left (note that several outlying data points are beyond the scale of the plot).
These data indicate the abundance of both the low and high intensity Dad4 foci of the hmo1∆ mutant (green and red arrows in Figure 5B, respectively). (D) Mtw1 kinetochore foci fluorescence levels are plotted as in panel C; we could not detect a difference from wild-type cells in any of the three mutants.

https://doi.org/10.7554/eLife.13053.021

For each cellular compartment there are relatively few GFP proteins that produce SPIs with just one target protein. The query GFP proteins that produce SPIs have on average 3.8 SPIs with the 23 target proteins. However, those GFP proteins with just one SPI may be informative. For example, the histone subunits Hta2, Htb1, Htb2, and Hhf2 as well as the chromosomal proteins Bub1 and Mft1 have unique SPIs with the eisosome component Pil1. These interactions may indicate a nuclear role for Pil1, which relocalizes from the plasma membrane in response to DNA damage (Tkach et al., 2012) and associates with histones and chromosomal proteins (Lambert et al., 2009; Akiyoshi et al., 2010). Indeed, the Pil1-histone SPIs result from Pil1 recruitment into the nucleus (Figure 2—figure supplement 2).

Since selected SPI query proteins are enriched for physical and genetic interactions and contain proteins involved in regulating the biology of their target, we next performed hierarchical clustering analysis in order to test whether SPI data can be used to assess functional associations between proteins (Figure 4A). We find that query proteins from specific large functional complexes cluster together, for example, the mediator complex, which is involved in activating transcription, clusters together as do members of the COP1 coatomer, the outer ring of the nuclear pore, the signal recognition particle and TRAMP complex (Figure 4—figure supplement 6). It is important to note that SPIs are not a substitute for physical interaction data, but rather represent a common phenotype in response to forced association. Collectively, the clustering of protein complexes, gene ontology enrichment and physical and genetic enrichment indicate that specific target proteins show SPIs with sets of query proteins that share a common location, potentially common components of larger protein complexes. Thus, although the proteome-wide SPI data themselves do not directly give structural information, the SPI data groups query proteins within these known protein complexes.

We next asked whether the SPI data would correlate with the quaternary structure of multi-protein complexes, since protein associations with one part of a complex may give a similar growth phenotype that contrasts with a different part of that same complex. We chose the kinetochore as an example, since this is a large array of between 60 and 100 proteins that are arranged into defined sub-complexes (Biggins, 2013). We selected these proteins (and some kinetochore-associated proteins) and clustered them based upon their SPI scores from the 23 screens. We find that key sub-complexes within the kinetochore are clustered together purely based upon their 23 SPI scores (Figure 6). For example, three of the four members of the COMA complex cluster together (Ctf19, Okp1, and Mcm21) with two members of the Ctf3 complex (Mcm22 and Nkp2), and Cse4 and Chl4, which are all part of the constitutive centromere associated network (CCAN) of inner kinetochore proteins that bind to centromeric DNA. Three of the four MIND complex members (Dsn1, Nnf1, and Nsl1) also cluster with Spc24, Kre28 and Nuf2, which are all part of the KMN network of outer kinetochore proteins. In contrast, the DAM/DASH complex, which is composed of 10 different proteins, segregates into distinct clusters (with Dad2, 3, and 4 distinct from Dam1, Ask1, Dad1, Spc34, and Duo1). Dad2, 3 and 4 are small central domain subunits of the DAM/DASH complex that are important for structural integrity of the complex and therefore potentially sensitive to association with other proteins (consequently they have many SPIs). In contrast Dam1, Duo1, and Spc34 are key interaction hubs for the decameric complex (Shang et al., 2003) and Ask1’s C-terminus plays an important role in intercomplex interactions (Ramey et al., 2011). Thus these proteins form external surfaces on the complex, which may be more tolerant of protein association. 
A similar correlation with the quaternary structure can be made for another large protein assembly, the nuclear pore complex (Figure 6—figure supplement 1). Hence, although SPIs do not substitute for physical interaction data they indicate a common phenotype produced by specific protein-protein associations.

Figure 6 with 1 supplement
Cluster analysis of kinetochore and associated proteins using the SPI data is plotted as a heat-map.

High z-scores (positive; >2) are shown in yellow and low (negative; < -1) scores in blue (as in Figure 4A). The different protein complexes within the kinetochore are color-coded as indicated in the legend. Based on the SPI data alone, key complexes within the kinetochore cluster together as indicated by the colored boxed regions of the plot.

https://doi.org/10.7554/eLife.13053.023

Discussion

The SPI technology has allowed us to create binary protein associations throughout the cell and in many cases these interactions result in protein relocalization. However, only a small fraction of these interactions lead to a measurable growth phenotype, suggesting that cells are highly tolerant of both protein mislocalization and protein-protein associations. There are exceptions: some proteins affect growth in almost any location, for example the ubiquitin hydrolase Doa4 and numerous proteins involved in transport (Figure 4—source data 1). Furthermore, there are proteins whose association with specific proteins causes a growth defect. We find that these SPIs are enriched for proteins that physically interact (Figure 4). Collectively the SPI data allow us both to identify regulatory proteins (Olafsson and Thorpe, 2015 and Figure 5) and to provide information on quaternary structure of specific large complexes within the cell (Figure 6). These data illustrate that SPIs can be used, like physical interactions, to reveal the functional organization of the cell. However, since the readout of SPIs is phenotypic, in this case cell growth, the SPIs indicate functional interactions rather than physical interactions per se. Thus, the SPI methodology provides a powerful in vivo proteomics tool to map the mechanisms underlying spatial regulation within cells. The SPI technology may be particularly informative to define interactions that are detrimental under conditions of stress, drug treatment or other specific cellular perturbations. Many disease pathologies result, at least in part, from the mislocalization of proteins in cells (Hung and Link, 2011). Recent studies are discovering the extent to which specific drugs induce global changes in protein location (Tkach et al., 2012; Breker et al., 2013; Chong et al., 2015). Combining this cellular pharmacodynamics knowledge with SPI data opens the possibility of using drugs to induce therapeutic changes in protein localization; of the 727 SPI query proteins identified here, ~76% (549) have human homologs compared to 56% (3766) of the whole yeast genome (6604 ORFs) (YeastMine, Balakrishnan et al., 2012). This study provides the first comprehensive map of the effects of forced protein associations within cells.

Materials and methods

Yeast strains and methods

All yeast strains used in this study are listed in Table 1. W303 strains are ADE2+ RAD5+ derivatives of W303 (can1-100 his3-11,15 leu2-3,112 ura3-1 unless otherwise indicated; Thomas and Rothstein, 1989; Zou and Rothstein, 1997). GFP strains are all based upon BY4741 (his3∆1 leu2∆0 met15∆0 ura3∆0; Brachmann et al., 1998; Huh et al., 2003). Yeast were grown in standard growth medium including 2% (weight/volume) of the indicated carbon source (Sherman, 2002). Yeast plasmids were created using the gap-repair cloning technique, which combines a linearized plasmid with PCR products using in vivo recombination. All PCR products were generated using primers from Sigma Life Science and PfuII Ultra proofreading polymerase (Agilent Technologies, UK) or Q5 polymerase (New England Biolabs, USA). All plasmid constructs (listed in Figure 1—source data 2) were validated using Sanger sequencing (Beckman Coulter Genomics, UK).

Table 1

Yeast strains used in this study.

https://doi.org/10.7554/eLife.13053.025
Strain name | Genetic background | Relevant genotype | Reference
W8164-2B | W303 | MATα CEN1-16::Gal-Kl-URA3 | (Zou and Rothstein, 1997)
GFP strains | BY4741 | MATa his3∆1 leu2∆0 met15∆0 ura3∆0 XXX-GFP::HIS3 | (Huh et al., 2003)
PT147-7C | W303 | MATa TRP1 lys2∆ DAD4-YFP::NAT SPC42-RFP:: | This study
PT12-13D | W303 | MATa TRP1 MTW1-YFP hmo1∆::KAN | This study
T403 | W303 | MATa TRP1 lys2∆ DAD4-YFP::NAT SPC42-RFP::HYG hmo1∆::KAN | This study
T404 | W303 | MATa TRP1 lys2∆ DAD4-YFP::NAT SPC42-RFP::HYG sgf29∆::KAN | This study
T402 | W303 | MATa TRP1 lys2∆ DAD4-YFP::NAT SPC42-RFP::HYG sst2∆::KAN | This study
T406 | W303 | MATa TRP1 MTW1-YFP hmo1∆::KAN | This study
T407 | W303 | MATa TRP1 MTW1-YFP sgf29∆::KAN | This study
T405 | W303 | MATa TRP1 MTW1-YFP sst2∆::KAN | This study

Selective ploidy ablation (SPA) screening

The SPA screening method is a mating-based approach for yeast transformation, and we followed the established protocol (Reid et al., 2011). The SPA method relies upon a universal donor strain (UDS, W8164-2B) that includes conditional centromeres on each and every chromosome. This strain is transformed with a plasmid encoding the GBP-tagged target protein (or controls) and then mated en masse with the collection of GFP strains. The resulting diploids are converted back to haploids by first destabilizing and then counter-selecting against all of the chromosomes from the UDS. The resulting colonies are then assessed for growth by measuring colony size as described below. In the first step, plasmid constructs (encoding GBP alone, target protein alone or target-GBP) were transferred into the UDS by transformation. The three resulting strains were separately mated with arrays of MATa GFP strains (Huh et al., 2003) on YPD agar plates for 24 hr. The resulting colonies were then copied to synthetic galactose medium lacking leucine to destabilize the donor chromosomes for 24 hr. Finally, colonies were copied onto galactose medium lacking leucine, including the drug 5-Fluoroorotic acid (5-FOA) to counter-select against the UDS chromosomes. Plates were then grown at 30˚C for 48–72 hr prior to imaging. All mating and copying of yeast colonies utilized a RoToR pinning robot (Singer Instruments, UK) with a minimum of four replicates per strain.

Quantitative analysis of high-throughput yeast growth

After SPA screening, the resulting agar plates were scanned using a desktop flatbed scanner (Epson V750 Pro, Seiko Epson Corporation, Japan) at 300 dpi resolution in transmission mode. These images were processed and analyzed using the ScreenMill suite of software (Dittmar et al., 2010), which assesses growth based upon the two-dimensional size of the colonies. The software was run in default mode, both for the kinetochore-specific screen and for the proteome-wide screen. For retesting strains for growth defects, plate images were normalized using specific controls on the plate as a reference, rather than the default plate median. This is necessary when the majority of the strains on a plate are affected since this will influence the plate median.
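The effect of the choice of reference can be illustrated with a minimal sketch. This is not the ScreenMill implementation; the function name `normalize_plate`, the position labels and the colony sizes are all invented for illustration, and real analyses would work from measured colony areas.

```python
from statistics import mean, median


def normalize_plate(sizes, control_positions=None):
    """Scale colony sizes by a plate reference value (hypothetical helper).

    sizes: dict mapping a plate position (e.g. "A1") to colony area.
    control_positions: optional positions of control colonies; when given,
    their mean size is the reference instead of the plate median.
    """
    if control_positions:
        reference = mean(sizes[pos] for pos in control_positions)
    else:
        reference = median(sizes.values())
    if reference <= 0:
        raise ValueError("reference colony size must be positive")
    return {pos: size / reference for pos, size in sizes.items()}


# Toy plate (invented numbers): two healthy controls, four slow-growing strains.
plate = {"A1": 100, "A2": 100, "B1": 50, "B2": 50, "B3": 50, "B4": 50}

by_median = normalize_plate(plate)
by_controls = normalize_plate(plate, control_positions=["A1", "A2"])
print(by_median["B1"], by_controls["B1"])
```

In this toy plate most colonies are small, so the plate median equals the size of the affected strains and their defect disappears after median normalization, whereas normalizing to the controls reports them at half the control size.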

Spatial smoothing algorithm

Colonies arrayed on agar plates often grow faster on one side of the plate than the other. This growth effect can be caused by temperature or humidity gradients within incubators, variable thickness of agar (and hence concentration of nutrients), or uneven pinning pressure during plate copying. These anomalies can result in one side of the plate producing an overall higher z-score than the other. To correct for these types of biases, algorithms adjust colony size data to reflect overall even growth across a plate (Collins et al., 2006; Baryshnikova et al., 2010). The ScreenMill suite of software used for our analysis does not contain such corrections and so we employed a simple algorithm to correct z-scores for spatial anomalies (Olafsson and Thorpe, 2015).
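One generic way to remove a plate-wide gradient is to subtract a local median from each colony's z-score. The sketch below illustrates that general idea only; it is an assumption for illustration and is not the algorithm of Olafsson and Thorpe (2015). The function name `smooth_plate`, the window radius and the toy z-scores are hypothetical.

```python
from statistics import median


def smooth_plate(z, radius=1):
    """Subtract from each colony the median z-score of its neighbourhood.

    z: list of rows, each a list of z-scores laid out as on the plate.
    radius: half-width of the square window (clipped at the plate edges).
    """
    rows, cols = len(z), len(z[0])
    corrected = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [
                z[i][j]
                for i in range(max(0, r - radius), min(rows, r + radius + 1))
                for j in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            out_row.append(z[r][c] - median(window))
        corrected.append(out_row)
    return corrected


# Toy plate (invented): a left-to-right gradient of 0.5 per column,
# plus one genuinely slow-growing colony at row 2, column 2.
plate = [[0.5 * c for c in range(6)] for _ in range(5)]
plate[2][2] -= 4.0

corrected = smooth_plate(plate)
print(corrected[2][2], corrected[2][4])
```

Interior colonies that only follow the gradient are corrected to zero, while the isolated slow-growing colony keeps its full deviation. Windows clipped at the plate edge are asymmetric, so edge colonies retain a small residual in this simple version.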

Fluorescence microscopy

To examine the levels and location of tagged proteins within the cells, we used epifluorescence microscopy. Log phase cells were embedded in 0.7% low melting point agarose dissolved in the appropriate growth medium. The depth of agarose between the slide and coverslip is fixed at 6–8µm, slightly larger than the diameter of the average yeast cell, which maintains a consistent distance from the coverslip to the cell nucleus. Cells were imaged with a Zeiss Axioimager Z2 microscope (Carl Zeiss AG, Germany), using a 63x 1.4NA oil immersion lens, illuminated using a Zeiss Colibri LED illumination system (GFP=470 nm, YFP=505 nm, and RFP=590 nm). Bright field contrast was enhanced with differential interference contrast (DIC) prisms. The resulting light was captured using either a Hamamatsu ORCA ERII CCD camera containing an ER-150 interline CCD sensor with 6.45 µm pixels, binned 2x2 (Hamamatsu Photonics, Japan) or a Hamamatsu Flash 4 Lte. CMOS camera containing a FL-400 sensor with 6.5 µm pixels, binned 2x2. The exposure times were set to ensure that pixels were not saturated and were identical between control and experimental images. All images were acquired using either Axiovision or Zen software from Zeiss. Images shown in the figures were prepared using Volocity imaging software (Perkin Elmer Inc., USA) and control and experimental images have identical linear contrast adjustments unless otherwise stated.

Fluorescence image analysis

To quantify the relative amount of RFP in cells containing GBP-RFP tags we used custom scripts for the Volocity image analysis software (Perkin Elmer Inc. USA). Briefly, red fluorescence regions were identified within the three-dimensional images based upon an intensity threshold. These regions were then dilated by a fixed amount (~600 nm) in each direction to ensure that we assay all of the red fluorescence signal. The regions were further dilated (2.4 µm) to create an outer background region, which was subtracted from each fluorescence measurement (the script is available online https://sourceforge.net/projects/berry-et-al/files/RFP_quantitation.assf/download).
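The threshold, dilation and background-ring steps described above can be sketched on a single two-dimensional image. This is a simplified illustration and not the Volocity script: it works in pixels rather than nanometres, in two dimensions rather than three, and the names `dilate` and `background_subtracted_signal` as well as the default dilation sizes are assumptions.

```python
def dilate(pixels, n):
    """Grow a set of (row, col) pixels by n pixels in every direction."""
    return {
        (r + dr, c + dc)
        for (r, c) in pixels
        for dr in range(-n, n + 1)
        for dc in range(-n, n + 1)
    }


def background_subtracted_signal(image, threshold, inner=1, outer=3):
    """Sum the signal in a thresholded, dilated region minus local background.

    image: list of rows of pixel intensities.
    inner: dilation (pixels) applied to the thresholded region.
    outer: further dilation (pixels) defining the surrounding background ring.
    """
    rows, cols = len(image), len(image[0])

    def inside(p):
        return 0 <= p[0] < rows and 0 <= p[1] < cols

    seed = {(r, c) for r in range(rows) for c in range(cols)
            if image[r][c] > threshold}
    if not seed:
        return 0.0
    region = {p for p in dilate(seed, inner) if inside(p)}
    ring = {p for p in dilate(region, outer) if inside(p)} - region
    if not ring:
        raise ValueError("image too small to define a background ring")
    background = sum(image[r][c] for r, c in ring) / len(ring)
    return sum(image[r][c] - background for r, c in region)


# Toy image (invented): uniform background of 10 with one bright pixel.
image = [[10] * 9 for _ in range(9)]
image[4][4] = 110
print(background_subtracted_signal(image, threshold=50))
```

Because the background is estimated from the ring immediately surrounding each region, the measurement is insensitive to a uniform offset in the image.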

To quantify the relative levels of Dad4-YFP and Mtw1-YFP kinetochore proteins within kinetochore foci, we employed a custom ImageJ script (Ledesma-Fernández and Thorpe, 2015). To quantify the total cellular levels of Dad4-YFP we measured the YFP fluorescence signal from maximum projection images (from a stack of vertically separated z planes) for each cell and subtracted a mean background signal specific to each image (this script is available at https://sourceforge.net/projects/berry-et-al/files/general_cell_quan.ijm/download).

Bioinformatics analysis

Michael Eisen’s cluster program (version 3.0) was used to cluster the SPI data (Eisen et al., 1998). We used hierarchical centroid linkage clustering of both the GBP screens and the GFP-tagged genes. For the quaternary structure examples (Figure 6, Figure 4—figure supplement 6 and Figure 6—figure supplement 1) only a selected subset of the GFP strains were used for the cluster analysis. Cluster diagrams were visualized using Java Treeview (Saldanha, 2004). Gene ontology enrichment analysis was performed using the GOrilla algorithm (cbl-gorilla.cs.technion.ac.il [Eden et al., 2009]).
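For readers unfamiliar with centroid linkage, the following minimal sketch shows the principle: at each step the two clusters with the closest centroids are merged and a new size-weighted centroid is computed. It is not the Cluster 3.0 code; Euclidean distance is an assumption made here (the distance metric used is not stated above), and the function name `centroid_linkage`, the labels and the scores are invented.

```python
from math import dist


def centroid_linkage(profiles):
    """Agglomerative clustering that merges the two closest centroids.

    profiles: dict mapping a label to its list of scores.
    Returns the merges in order as (members_a, members_b, distance).
    """
    # cluster id -> (member labels, centroid vector)
    clusters = {name: ([name], list(vec)) for name, vec in profiles.items()}
    merges = []
    while len(clusters) > 1:
        ids = list(clusters)
        a, b = min(
            ((x, y) for i, x in enumerate(ids) for y in ids[i + 1:]),
            key=lambda pair: dist(clusters[pair[0]][1], clusters[pair[1]][1]),
        )
        members_a, centroid_a = clusters.pop(a)
        members_b, centroid_b = clusters.pop(b)
        na, nb = len(members_a), len(members_b)
        # The new centroid is the size-weighted mean of the two old centroids.
        centroid = [(na * x + nb * y) / (na + nb)
                    for x, y in zip(centroid_a, centroid_b)]
        clusters[a + "+" + b] = (members_a + members_b, centroid)
        merges.append((members_a, members_b, dist(centroid_a, centroid_b)))
    return merges


# Toy score profiles (invented values, three screens per query).
profiles = {
    "A": [2.5, 0.1, 2.2],
    "B": [2.4, 0.0, 2.3],
    "C": [-0.2, 1.9, 0.1],
    "D": [0.0, 2.1, -0.1],
}
for members_a, members_b, d in centroid_linkage(profiles):
    print(members_a, members_b, round(d, 3))
```

With these toy profiles, A and B merge first, then C and D, and the two pairs join last, which mirrors how queries with similar score profiles end up adjacent in the heat-map.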

References

  9. A strategy for extracting and analyzing large-scale quantitative epistatic interaction data
    1. SR Collins
    2. M Schuldiner
    3. NJ Krogan
    4. JS Weissman
    (2006)
    Genome Biology 7:R63.
    https://doi.org/10.1186/gb-2006-7-7-r63
  15. Cluster analysis and display of genome-wide expression patterns
    1. MB Eisen
    2. PT Spellman
    3. PO Brown
    4. D Botstein
    (1998)
    Proceedings of the National Academy of Sciences of the United States of America 95:14863–14868.
    https://doi.org/10.1073/pnas.95.25.14863
  35. Getting started with yeast
    1. F Sherman
    (2002)
    Methods in Enzymology 350:3–41.
  38. The genetic control of direct-repeat recombination in Saccharomyces: the effect of rad52 and rad1 on mitotic recombination at GAL10, a transcriptionally regulated gene
    1. BJ Thomas
    2. R Rothstein
    (1989)
    Genetics 123:725–738.

Decision letter

  1. Randy Schekman
    Reviewing Editor; Howard Hughes Medical Institute, University of California, Berkeley, United States

In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.

Thank you for submitting your work entitled "Synthetic protein interactions reveal a functional map of the cell" for consideration by eLife. Your article has been reviewed by three peer reviewers, and the evaluation has been overseen by Randy Schekman as the Senior and Reviewing Editor.

The following individuals involved in review of your submission have agreed to reveal their identity: Stanley Fields and Nevan Krogan (peer reviewers).

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

Summary:

The authors present a screen using their newly developed SPI system, which allows for the creation of artificial interactions between pairs of proteins. They create binary interactions between each of the ~6,000 yeast proteins and 23 target proteins that represent the major cellular compartments. The effects of these interactions on growth are assayed by measuring colony sizes. In spite of these interactions often leading to protein relocalization, they find only a small fraction lead to a measurable growth phenotype, suggesting that the cells are tolerant of both protein movement and association. The authors highlight how their method can be used to discover new regulatory relationships and to provide structural information on large cellular complexes.

The SPI system provides an exciting complement to PPIs and genetic interactions, and the scale of the collected dataset is impressive. The manuscript is well written and for the most part the methods are clearly described. Together, the importance and quality of the work makes it suited for publication in eLife as a tool/resource. However, there are a number of important points that need to be addressed.

Essential revisions:

1) The analysis on a limited number (24) of GFP-fusion proteins suggests that only ~20% of the time when they co-express a GFP-fusion query protein with a target protein do they see mislocalization of the GFP fusion outside of where it is normally found. But even this may be a large overestimation of the degree to which their system is causing protein mislocalization. First, they do not evaluate what fraction of the GFP-fusion is mislocalized. Second, cleavage of the GFP from the fusion protein will result in mislocalization of the GFP domain but not of the rest of the protein. Finally, they report on ~6000 individual GFP fusion proteins obtained from the library developed in Huh et al. (2003). But in that paper only ~4500 of the fusion proteins were validated and observable. Thus unless the remaining 1500 strains were made and validated separately, they are suspect. In the end because of these and other concerns, I think it is not possible to evaluate from these data what fraction of the proteins tolerate being effectively mislocalized to a cellular location that is different from where they are naturally found. This limitation must be discussed and explained.

2) The potentially most interesting result reported here is the detection of Synthetic Physical Interactions (SPIs), where a forced interaction does cause a growth phenotype. But it's not entirely clear that one learns all that much from these results. The one example of an SPI that they do examine in more detail in Figure 5 could easily be an indirect relationship rather than a direct molecular link (yeast null for chromatin associated proteins, hmo1 and Sgf29, show increased levels of a kinetochore protein Dad4). This would be much more compelling if the authors had data indicating some direct physical interaction (IP/mass spec) or biochemical rationale for how hmo1 could regulate the kinetochore to support their claim that SPIs can identify functional regulators; this could easily be a case where hmo1 regulates transcription of Dad4 directly or indirectly. If so, please include this in the revised version or at least comment on relevant data in support of a direct and/or functional interaction.

3) The clustering of complexes (Figure 6) is not that convincing. The approach seems like an awkward strategy for obtaining information that is much more effectively obtained by mass-spec. For example, the histones fall into two distinct categories and the CPC, NDC80, Spc105 and MAPs complexes don't seem to cluster at all. This seeming disparity with known interactions requires explanation.

4) Results, second paragraph: “we found that 98% of GBP-GFP combinations […] do not affect the growth of cells.” Do 98% not affect growth at all, or are they using a z-score threshold? If z-score, this should be clarified, as it is not necessarily the same as "no effect on growth".

5) Results, third paragraph and Figure 2: “Fluorescent imaging confirmed that ~83% of interactions do occur and typically result in protein relocation […] Of the 524 GBP-GFP combinations that we could score, 435 were detectably colocalized (Figure 2C), indicating that in most cases the protein-protein fusions do occur.” Figure 2C shows this and 2D breaks it down in more detail. However, ~50% of the colocalized strains belong to the "indistinguishable" category = proteins that normally colocalize (2D). It is misleading to include this category for estimating how often relocation/interaction occurs. Correcting for this will result in a percentage dramatically lower than 83%. Similarly, this should be clarified in the counts in the text ("435 of 524") and the Figure 2C legend.

6) Results, fifth paragraph: “Around 2% of the forced interactions restrict growth […]” Specify z-score cutoff used to qualify as restriction in main text.

7) Results, fifth paragraph: “[…]of the 727 SPI proteins […]” -> of the 727 SPI query proteins.

8) Results, fifth paragraph: “[…] whose sequestration to another compartment is lethal […]” Did they show it's actually lethal or just decreases growth (i.e. sick)?

9) Results, fifth paragraph: Discuss why there is a difference in suppression of interactions between the frequent and non-frequent SPI groups.

10) Results, sixth paragraph and Figure 4B: It's here stated that target proteins from the same cellular compartment give similar SPIs. While the figure and analysis are suggestive of this, it would be better to carry out a statistical test to corroborate the statement. E.g. Box plots of the distributions of SPIs from same vs. different compartments, accompanied by a Wilcoxon rank-sum test to determine significance.

11) Results, seventh paragraph and Figure 4C: The authors are comparing two distributions of PPI counts (SPI vs. non-SPI) and compute a p-value for the difference using Spearman's rank correlation. I doubt Spearman's rank correlation can be used to produce a p-value for the difference between two distributions. Additionally, the stated p-value (2.2E-16) appears extremely optimistic given the large overlap of the box plots.

12) Results, seventh paragraph and Figure 4D: The CLIK analysis should be described better.

13) Results, eighth paragraph: Describe or reference the gene ontology enrichment analysis.

14) Results, eighth paragraph: Was fluorescently tagged Nuf2 examined in these cells as well? How is Nuf2 affected?

15) Figure 5A: Explain why Mtw1 was chosen.

16) Figure 5B and 5C: Describe in much more detail how to interpret these plots. Also, what are the error bars? Finally, is the p-value (E-10) really correct? Depending on what the error bars represent this looks low.

17) Results, last paragraph: “[…] SPI data can be used to predict protein complexes […]” Substantiate this claim. Show if the SPI data actually have predictive power with a ROC or precision recall curve.

18) Results, last paragraph: Try to interpret why the DAM/DASH complex segregates into distinct clusters.

19) Discussion: I suggest explicitly writing "physical interactions" instead of "interactions" to not confuse with SPIs.

20) Discussion: “[…] and derive quaternary structure […]” ‘provide information on’ would be more appropriate than ‘derive’.

21) Discussion: “[…] 54% (394) are conserved in human cells.” May be worth discussing how this compares to the conservation of the complete yeast genome to human.

[Editors' note: further revisions were requested prior to acceptance, as described below.]

Thank you for submitting your article "Synthetic protein interactions reveal a functional map of the cell" for consideration by eLife. Your article has been reviewed by three peer reviewers, and the evaluation has been overseen by Randy Schekman as the Senior and Reviewing Editor. Stanley Fields and Nevan Krogan have agreed to share their identity. There remains one concern with the language in your Abstract. Please adjust this according to the suggestion of reviewer #1.

Reviewer #1:

The authors have largely addressed the concerns I raised and put in appropriate caveats in their revised manuscript. I am still concerned that their Abstract is misleading in claiming to establish that proteins have an "unanticipated tolerance for forced protein associations and consequently their relocation". I think it is likely that in many, perhaps the large majority, of the cases where apparent relocation of a protein does not disrupt function, this has to do with at least partial retention of the protein in its correct location. It would indeed be quite surprising if most nuclear proteins could function in the cytosol or most organellar localized proteins could function outside of their native organelle. I do not think this is what the authors intend to say (and certainly is not what they have shown) but I could easily see how a casual reader of the Abstract could be left with this impression.

Reviewer #2:

I find the revised version acceptable.

Reviewer #3:

I am happy with the revisions and support publication.

https://doi.org/10.7554/eLife.13053.030

Author response

Essential revisions: 1) The analysis on a limited number (24) of GFP-fusion proteins suggests that only ~20% of the time when they co-express a GFP-fusion query protein with a target protein do they see mislocalization of the GFP fusion outside of where it is normally found. But even this may be a large overestimation of the degree to which their system is causing protein mislocalization. First, they do not evaluate what fraction of the GFP-fusion is mislocalized. Second, cleavage of the GFP from the fusion protein will result in mislocalization of the GFP domain but not of the rest of the protein. Finally, they report on ~6000 individual GFP fusion proteins obtained from the library developed in Huh et al. (2003). But in that paper only ~4500 of the fusion proteins were validated and observable. Thus unless the remaining 1500 strains were made and validated separately, they are suspect. In the end because of these and other concerns, I think it is not possible to evaluate from these data what fraction of the proteins tolerate being effectively mislocalized to a cellular location that is different from where they are naturally found. This limitation must be discussed and explained.

We agree that it is possible that the imaging overestimates the frequency of mislocalization. We have now discussed these limitations (Results, fourth paragraph). The amount of a given protein that is mislocalized will depend in part on the stoichiometry of the GBP- and GFP-tagged proteins, which is why we examined the effects of stoichiometry in Figure 4—figure supplements 2 and 3. We have made clear that not all the protein is likely to be mislocalized. However, where we score the protein as mislocalized, this is because the majority of the fluorescence signal has changed location; we have made this clear in the legend of Figure 2. The 24 GFP proteins and 23 GBP-tagged proteins tested by imaging show clear localization and have not been reported to be cleaved; however, we cannot rule out cleavage, particularly when the GFP- and GBP-tagged proteins are associated. We have now made clear that cleavage of the GFP from the C-terminus of a protein (or cleavage of the GBP from the target protein) will affect these data (in the aforementioned paragraph). Additionally, we have reanalyzed the global data, excluding nearly 2000 GFP strains where protein levels are low or the signal has not been validated. We find that the proportion of associations that affect growth is 3%, which is similar to the data for the full set of GFP-tagged proteins; this analysis is now discussed (Results, fifth paragraph, top section), with details included in Figure 4—source data 1.

2) The potentially most interesting result reported here is the detection of Synthetic Physical Interactions (SPIs), where a forced interaction does cause a growth phenotype. But it's not entirely clear that one learns all that much from these results. The one example of an SPI that they do examine in more detail in Figure 5 could easily be an indirect relationship rather than a direct molecular link (yeast null for chromatin associated proteins, hmo1 and Sgf29, show increased levels of a kinetochore protein Dad4). This would be much more compelling if the authors had data indicating some direct physical interaction (IP/mass spec) or biochemical rationale for how hmo1 could regulate the kinetochore to support their claim that SPIs can identify functional regulators; this could easily be a case where hmo1 regulates transcription of Dad4 directly or indirectly. If so, please include this in the revised version or at least comment on relevant data in support of a direct and/or functional interaction.

To characterize the effect of hmo1∆ mutants in more detail, we examined whether the Hmo1-Nuf2 SPI is specific to Nuf2 or more general to the kinetochore. We have now tested 10 other kinetochore GBP-target proteins and find that in addition to Nuf2, both Mif2 and Ctf19 give a SPI, whereas Kre28, Mtw1, Dad2, Ctf3, Chl4, Skp1, Cnn1 and Cbf1 do not (new Figure 5A), suggesting that the effect of Hmo1 is restricted to specific regions within the central/outer kinetochore. We have now indicated that Hmo1 was identified by mass spectroscopy from immuno-precipitation of kinetochore complexes (work from Sue Biggins’ lab, Results, eighth paragraph). However, it remains possible that increased DAD4 expression is contributing to the phenotype. In a separate study, we have shown that overexpression of outer kinetochore components does not adversely affect kinetochore function (although not specifically DAD4, Herrero & Thorpe PLoS Genetics 2016). To examine this in more detail, we have quantified the total cellular levels of Dad4 in an hmo1∆ strain and find that they are elevated in approximately 30% of cells. However, nearly half of hmo1∆ cells have normal Dad4-YFP levels (quantitation provided in Figure 5—figure supplement 1); therefore changes in Dad4 levels are not sufficient to explain the aberrant Dad4 foci shown in most hmo1∆ cells in Figure 5B (also in text, in the aforementioned paragraph).

3) The clustering of complexes (Figure 6) is not that convincing. The approach seems like an awkward strategy for obtaining information that is much more effectively obtained by mass-spec. For example, the histones fall into two distinct categories and the CPC, NDC80, Spc105 and MAPs complexes don't seem to cluster at all. This seeming disparity with known interactions requires explanation.

We agree that structural information is best achieved using other methods, such as protein-protein interactions (PPIs) identified by, for example, mass spectroscopy. We realize that the SPIs are distinct from PPIs and have attempted to clarify this (see point 17 below). In our original submission we gave the impression that SPIs predict structure, which is misleading. Rather, SPIs identify proteins that behave similarly when associated with specific other proteins, in some cases reflecting characterized functional complexes. We think that the power of the SPI technology is as stated in the Discussion:

“These data illustrate that SPIs can be used, like physical interactions, to reveal the functional organization of the cell. However, since the readout of SPIs is phenotypic, in this case cell growth, the SPIs indicate functional interactions rather than physical interactions per se.”

Some structural complexes are inferred from the SPI data; however, this is likely via a common functional association and the SPI data cannot be used as structural evidence. We have now made clear that the SPIs do not constitute structural data (this is discussed further in point 17 below). Nevertheless, we feel that the cluster visualization allows readers to clearly see that SPI data for certain groups of proteins are more similar than for other groups. We have simplified Figure 6 to highlight only 3 clusters (CCAN, DAM1 and KMN). As discussed in points 17 and 18 below, we now speculate on why structural complexes may not cluster together using SPI data and conclude the Results section as follows:

“Hence, although SPIs do not substitute for physical interaction data they indicate a common phenotype produced by specific protein-protein associations.”

4) Results, second paragraph: “we found that 98% of GBP-GFP combinations […] do not affect the growth of cells.” Do 98% not affect growth at all, or are they using a z-score threshold? If z-score, this should be clarified, as it is not necessarily the same as "no effect on growth".

We do not use a defined z-score or log growth ratio (LGR) as a cutoff for interactions. One reason for not using a z-score cutoff is that although within a screen the LGR between experiment and controls shows a linear correlation with z-score, this is not true between screens as z-scores were calculated on the data from each screen individually, not by combining the LGRs from the whole dataset. Thus a z-score in one screen does not equate to a z-score in another screen. We could have used an LGR cutoff, however, this assumes that SPIs with one target protein would produce a comparable growth phenotype with SPIs from a different target protein, which is an assumption we were not comfortable with. We felt that the best way to confirm SPIs was to sequentially retest the strongest interactions (as judged by z-score) from every screen. Thus the strongest SPIs were retested with 16 replicates and any target-query interaction that gave an LGR that was consistently more than control strains (a non-GFP strain) was considered a SPI. We stopped this retest process when the false discovery rate reached 40% (Figure 4—figure supplement 1). It is rare for such a large screen to retest all of the ‘hits’, but we felt that this was the highest quality approach to allow us to produce a list of repeatable interactions that lead to a growth defect. All of the LGRs from the retests and original z-scores are reported in Figure 4—figure supplement 1. This is now discussed in the text (Results, fifth paragraph). We also note that some SPIs are likely missed due to the colony measurement methodology and furthermore we did not retest beyond a 40% false discovery rate.
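The point about z-scores not being comparable between screens can be illustrated with a minimal sketch. The LGR values below are hypothetical, and the standardization shown (subtracting the screen mean and dividing by the screen standard deviation) is an assumption about how a per-screen z-score would typically be computed, not a reproduction of the analysis pipeline used in the study.

```python
from statistics import mean, stdev

def z_scores(lgrs):
    """Standardize log growth ratios (LGRs) within a single screen."""
    mu = mean(lgrs)
    sigma = stdev(lgrs)
    return [(x - mu) / sigma for x in lgrs]

# Hypothetical LGRs for two screens; the last strain has the same LGR in both.
screen_a = [0.0, 0.1, -0.1, 0.05, 1.0]
screen_b = [0.0, 0.4, -0.4, 0.2, 1.0]

z_a = z_scores(screen_a)
z_b = z_scores(screen_b)

# Within a screen, z is a linear function of LGR, but the same LGR maps to
# different z-scores in different screens because each screen has its own
# mean and spread.
print(f"LGR 1.0 -> z = {z_a[-1]:.2f} in screen A, z = {z_b[-1]:.2f} in screen B")
```

In this toy case the noisier screen B assigns a lower z-score to an identical LGR, which is why a single z-score cutoff applied across screens would not select equivalent growth effects.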

5) Results, third paragraph and Figure 2: “Fluorescent imaging confirmed that ~83% of interactions do occur and typically result in protein relocation […] Of the 524 GBP-GFP combinations that we could score, 435 were detectably colocalized (Figure 2C), indicating that in most cases the protein-protein fusions do occur.” Figure 2C shows this and 2D breaks it down in more detail. However, ~50% of the colocalized strains belong to the "indistinguishable" category = proteins that normally colocalize (2D). It is misleading to include this category for estimating how often relocation/interaction occurs. Correcting for this will result in a percentage dramatically lower than 83%. Similarly, this should be clarified in the counts in the text ("435 of 524") and the Figure 2C legend.

We have amended the 83% figure to 72% by excluding the 40% of samples that were already colocalized; the sentence now reads:

“In cases where fluorescent imaging was able to detect protein relocalization, we confirmed that ~72% of interactions do occur.”

The legend of Figure 2D (and Results, third paragraph) now reiterates that 40% of the strains examined have GFP and GBP in the same cellular compartment, such that their colocalization status cannot be assessed with microscopy. We have amended Figure 2D to remove the indistinguishable (and uncharacterized) category and clarified this in the legend. The Figure 2C legend now indicates that 210 of the 524 combinations examined already colocalized:

“[…] note that the colocalized category includes 210 combinations where the target and query proteins are within the same compartment and so protein-protein association will not be apparent from this microscopy analysis.”
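The corrected percentage can be checked from the counts quoted in this exchange: 524 scored combinations and 435 colocalized (from the reviewer's quotation of the manuscript), of which 210 were already in the same compartment. The variable names in this short sketch are illustrative only.

```python
scored = 524            # GBP-GFP combinations that could be scored
colocalized = 435       # combinations detectably colocalized
same_compartment = 210  # target and query already share a compartment

# Original estimate: all colocalized combinations over all scored combinations.
raw_fraction = colocalized / scored

# Corrected estimate: exclude combinations where relocalization cannot be seen.
informative = scored - same_compartment
relocalized = colocalized - same_compartment
corrected_fraction = relocalized / informative

print(f"uncorrected: {raw_fraction:.0%}")
print(f"already colocalized: {same_compartment / scored:.0%}")
print(f"corrected: {corrected_fraction:.0%}")
```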

6) Results, fifth paragraph: “Around 2% of the forced interactions restrict growth […]” Specify z-score cutoff used to qualify as restriction in main text.

We have now clarified this as described in point 4 above.

7) Results, fifth paragraph: “[…]of the 727 SPI proteins […]” -> of the 727 SPI query proteins.

We have made this specific change and have also corrected several other instances of ‘SPI proteins’ throughout the manuscript.

8) Results, fifth paragraph: “[…] whose sequestration to another compartment is lethal […]” Did they show it's actually lethal or just decreases growth (i.e. sick)?

We were wrong to state ‘lethal’ when the data show a growth defect (i.e. sick). We have amended the text (Results, end of sixth paragraph).

9) Results, fifth paragraph: Discuss why there is a difference in suppression of interactions between the frequent and non-frequent SPI groups.

This issue relates to point 8 also. We have moved the section concerning suppression of frequent SPI query proteins up to follow on from the characterization of the SPI query proteins (Results, sixth paragraph). We have expanded the explanation which now concludes as follows:

“Thus we conclude that these frequent SPI query proteins are predominantly those whose essential function is location-dependent and whose sequestration to another compartment results in a growth defect (as is routinely achieved using other systems (Haruki et al., 2008)).”

10) Results, sixth paragraph and Figure 4B: It's here stated that target proteins from the same cellular compartment give similar SPIs. While the figure and analysis are suggestive of this, it would be better to carry out a statistical test to corroborate the statement. E.g. Box plots of the distributions of SPIs from same vs. different compartments, accompanied by a Wilcoxon rank-sum test to determine significance.

The relationship between compartments and z-scores is interesting. We find that in some cases, z-scores correlate between two screens in which the target proteins were from the same compartment, for example Pus1 and Rad52 target proteins, which both localize to the nucleus (Figure 4B). We now include a section to describe this compartmentalization in more detail. 7.1% of the possible associations are between a target and query protein within the same compartment, whereas 10.4% of the SPIs occur between members of the same compartment. This enrichment is statistically significant using Fisher’s exact test, but is nevertheless quite small. By including the figures, readers can get a clear picture of the enrichment (Results, seventh paragraph). We give an example of Nop10 (a nucleolar protein), which has an enrichment for SPIs with query proteins in the nucleolus, and use Fisher’s exact test to calculate a p-value, but importantly, we note that this is not true for all compartments.

In contrast to the number of SPIs within a compartment, when we compare the log growth ratios (i.e. the degree of growth restriction) of strains in which the query and target proteins are from the same compartment, the growth restriction is less than that for associations between different compartments. We have now included a description of this in the text, see Results, seventh paragraph and also included the box plot of same vs. different compartment LGRs in Figure 4C, accompanied by a Wilcoxon’s rank-sum test. In summary, there are (slightly) more SPIs between proteins in the same compartment, but the SPIs between proteins in different compartments produce a stronger growth defect.

11) Results, seventh paragraph and Figure 4C: The authors are comparing two distributions of PPI counts (SPI vs. non-SPI) and compute a p-value for the difference using Spearman's rank correlation. I doubt Spearman's rank correlation can be used to produce a p-value for the difference between two distributions. Additionally, the stated p-value (2.2E-16) appears extremely optimistic given the large overlap of the box plots.

We apologize for using the wrong statistical test. Furthermore, the p-value should have been <2.2E-16 (the limit provided by the R function we had used). The p-value of the Wilcoxon rank-sum test is similarly small. We failed to specify in the figure legend that the top of the box plot was cut off, hence some outliers were excluded. We have changed the y axis of this plot to a log scale to allow all the data to be included and changed the box plots to include notches that indicate 95% confidence intervals of the median (Figure 4D), with a full description in the legend. This makes it easier for readers to see and interpret the data.
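The 2.2E-16 limit mentioned above matches double-precision machine epsilon, which is the floor R's default p-value formatting uses when it prints "< 2.2e-16". Assuming that is the limit the R function reported, the same value can be seen in any IEEE 754 environment, for example:

```python
import sys

# Machine epsilon for IEEE 754 double precision (2**-52).
eps = sys.float_info.epsilon
print(f"{eps:.1e}")  # 2.2e-16
```

A p-value reported at this limit is therefore an upper bound set by the software, not the value of the test statistic itself.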

12) Results, seventh paragraph and Figure 4D: The CLIK analysis should be described better.

We have now described the CLIK analysis in more detail (Results, seventh paragraph).

13) Results, eighth paragraph: Describe or reference the gene ontology enrichment analysis.

We have included a reference for the GOrilla algorithm for ontology enrichment.

14) Results, eighth paragraph: Was fluorescently tagged Nuf2 examined in these cells as well? How is Nuf2 affected?

To characterize the effect of Hmo1 on the kinetochore in more detail, we have now created ten additional kinetochore proteins fused with GBP (two of which are tagged at both the N- and C- termini) and tested the effect of introducing these into an Hmo1-GFP strain; we have now discussed this in the text (Results, eighth paragraph) and in a new Figure 5A. This experiment shows that the effect of Hmo1 association is not restricted to Nuf2, but is also found for Mif2 and Ctf19, a central and outer kinetochore component respectively; although a number of other associations do not produce a phenotype. We chose to examine Dad4 and Mtw1 as they are canonical members of the kinetochore complex and both have previously been quantitatively characterized in some detail (published work from both our own lab and Kerry Bloom’s lab). This rationale is now discussed in the text (in the aforementioned paragraph).

15) Figure 5A: Explain why Mtw1 was chosen.

See point 14 above.

16) Figure 5B and 5C: Describe in much more detail how to interpret these plots. Also, what are the error bars? Finally, is the p-value (E-10) really correct? Depending on what the error bars represent this looks low…

We have amended the Figure 5 legends (now C & D) to explain these plots in more detail. The bar charts showed means +/- standard deviations. However, we have changed these plots to notched box and whisker plots and given a full explanation in the legend. We have changed the p-values from two-tailed unpaired t-tests to Wilcoxon’s rank-sum tests, comparing the intensities of WT kinetochore foci with those of the mutants (for hmo1∆ vs WT the p-value is 4 × 10⁻²⁴ and for sgf29∆ the value is 4 × 10⁻³⁷). The distribution of foci fluorescence intensities (right panel), particularly for hmo1∆ cells, is not normal (Gaussian); hence we wanted to show the distribution of the fluorescence data so that readers can get a clearer picture of these data (the right panels in Figure 5C,D).
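For readers unfamiliar with the rank-based test named above, the following is a minimal sketch of a two-sided Wilcoxon rank-sum (Mann-Whitney) test using the normal approximation, with no tie or continuity correction. It is not the R implementation used for the reported p-values, and the intensity values are hypothetical. Because it works on ranks, it does not assume normally distributed intensities.

```python
from math import erfc, sqrt

def rank_sum_test(x, y):
    """Two-sided rank-sum test of x vs y (normal approximation).

    Returns (U statistic for x, approximate p-value). Ties receive average
    ranks; no tie or continuity correction is applied to the variance.
    """
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    n1, n2 = len(x), len(y)
    w = sum(r for r, (_, group) in zip(ranks, combined) if group == 0)
    u = w - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    p = erfc(abs(z) / sqrt(2))  # two-sided tail of the standard normal
    return u, p

# Hypothetical foci intensities (arbitrary units); the mutant set is skewed.
wild_type = [1.0, 1.1, 0.9, 1.2, 1.05, 0.95]
mutant = [1.3, 1.8, 2.5, 1.25, 4.0, 1.6]

u_stat, p_value = rank_sum_test(wild_type, mutant)
print(u_stat, p_value)
```

With only a handful of values per group an exact test would be preferable; the normal approximation is more appropriate for the large numbers of foci typical of imaging data.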

17) Results, last paragraph: “[…] SPI data can be used to predict protein complexes […]” Substantiate this claim. Show if the SPI data actually have predictive power with a ROC or precision recall curve.

This point follows on from point 3: we do not wish to claim that SPIs substitute for PPIs. We realize that it was an overstatement of our data to say that SPI data can predict protein complexes or would substitute for protein-protein interaction data, since SPI data are not structural. Rather, we want to indicate that when specific proteins are associated with the same protein complex they may produce a similar SPI phenotype. We have changed this section to:

“It is important to note that SPIs are not a substitute for physical interaction data, but rather represent a common phenotype in response to forced association. […] Thus although the proteome-wide SPI data themselves do not directly give structural information, the SPI data does group query proteins within these known protein complexes.”

We have also amended the quaternary structure section to reflect this:

“We next asked whether the SPI data would correlate with the quaternary structure of multi-protein complexes, since protein associations with one part of a complex may give a similar growth phenotype that contrasts with a different part of that same complex.”

This section now fits with our speculation of the different phenotypes given for members of the DAM1 complex (Results, last paragraph, as requested in point 18 below). We speculate here why some protein complexes may not cluster together using SPI data.

Furthermore, we conclude the Results section with the sentence:

“Hence, although SPIs do not substitute for physical interaction data they indicate a common phenotype produced by specific protein-protein associations”.

18) Results, last paragraph: Try to interpret why the DAM/DASH complex segregates into distinct clusters.

We have expanded this section of the text to speculate on why the DAM/DASH complex splits into two clusters based upon high-resolution structural models of this complex, see Results, last paragraph. This addresses issues raised in point 3 and 17.

19) Discussion: I suggest explicitly writing "physical interactions" instead of "interactions" to not confuse with SPIs.

We have changed these in the Discussion.

20) Discussion: “[…] and derive quaternary structure […]” ‘provide information on’ would be more appropriate than ‘derive’.

We have changed this in the Discussion.

21) Discussion: “[…] 54% (394) are conserved in human cells.” May be worth discussing how this compares to the conservation of the complete yeast genome to human?

We have amended these values utilizing data from ‘Yeastmine’ (yeastmine.yeastgenome.org). 3766 out of 6604 yeast ORFs have human homologs (57%), whereas 549 out of the 727 SPI query proteins have human homologs (~76%) (Fisher’s exact test, p-value = 3.32 × 10⁻²³).
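The scale of this enrichment can be checked with a minimal sketch using the counts quoted above. It computes a one-sided hypergeometric tail probability (the upper-tail form of Fisher's exact test) and assumes that the 727 SPI query proteins are a subset of the 6604 ORFs; the reported p-value may come from a two-sided test and so need not match exactly. The function name is illustrative.

```python
from math import comb

total_orfs = 6604         # yeast ORFs
orfs_with_homolog = 3766  # ORFs with a human homolog
spi_queries = 727         # SPI query proteins (assumed subset of the ORFs)
spi_with_homolog = 549    # SPI query proteins with a human homolog

def hypergeom_upper_tail(k, successes, draws, population):
    """P(X >= k) when drawing `draws` items without replacement."""
    denominator = comb(population, draws)
    numerator = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, min(successes, draws) + 1)
    )
    # Exact integer arithmetic; the final division yields a float.
    return numerator / denominator

genome_fraction = orfs_with_homolog / total_orfs
spi_fraction = spi_with_homolog / spi_queries
p_enrichment = hypergeom_upper_tail(
    spi_with_homolog, orfs_with_homolog, spi_queries, total_orfs
)

print(f"genome-wide: {genome_fraction:.0%}, SPI queries: {spi_fraction:.0%}")
print(f"one-sided p-value: {p_enrichment:.2e}")
```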

[Editors' note: further revisions were requested prior to acceptance, as described below.]

Reviewer #1: The authors have largely addressed the concerns I raised and put in appropriate caveats in their revised manuscript. I am still concerned that their Abstract is misleading in claiming to establish that proteins have an "unanticipated tolerance for forced protein associations and consequently their relocation". I think it is likely that many, perhaps the large majority, of the cases where apparent relocation of a protein does not disrupt function have to do with at least partial retention of the protein in its correct location. It would indeed be quite surprising if most nuclear proteins could function in the cytosol or most organelle-localized proteins could function outside of their native organelle. I do not think this is what the authors intend to say (and it is certainly not what they have shown), but I could easily see how a casual reader of the Abstract could be left with this impression.

Reviewer #1 asked for one change to the Abstract. We have changed:

“This analysis reveals that cells have a remarkable and unanticipated tolerance for forced protein associations and consequently their relocation.”

to

“This analysis reveals that cells have a remarkable and unanticipated tolerance for forced protein associations, even if these associations lead to a proportion of the protein moving compartments within the cell.”

https://doi.org/10.7554/eLife.13053.031

Article and author information

Author details

  1. Lisa K Berry

    Mitotic Control Laboratory, The Francis Crick Institute, Mill Hill Laboratory, London, United Kingdom
    Contribution
    LKB, Conception and design, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article
    Competing interests
    The authors declare that no competing interests exist.
  2. Guðjón Ólafsson

    Mitotic Control Laboratory, The Francis Crick Institute, Mill Hill Laboratory, London, United Kingdom
    Contribution
    GÓ, Conception and design, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article
    Competing interests
    The authors declare that no competing interests exist.
  3. Elena Ledesma-Fernández

    Mitotic Control Laboratory, The Francis Crick Institute, Mill Hill Laboratory, London, United Kingdom
    Present address
    MRC Laboratory of Molecular Cell Biology, University College London, London, United Kingdom
    Contribution
    EL-F, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article
    Competing interests
    The authors declare that no competing interests exist.
  4. Peter H Thorpe

    Mitotic Control Laboratory, The Francis Crick Institute, Mill Hill Laboratory, London, United Kingdom
    Contribution
    PHT, Conception and design, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article
    For correspondence
    peter.thorpe@crick.ac.uk
    Competing interests
    The authors declare that no competing interests exist.
    ORCID iD: 0000-0003-1649-6816

Funding

Medical Research Council (MC_UP_A252_1027)

  • Peter H Thorpe

Cancer Research UK (Institute core funding)

  • Peter H Thorpe

Wellcome Trust (Institute core funding)

  • Peter H Thorpe

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

This work was funded by a Medical Research Council U.K. Centenary Award and research grant (MC_UP_A252_1027). The Francis Crick Institute is funded by The Medical Research Council UK, Cancer Research UK, the Wellcome Trust, Imperial College London, University College London and King's College London. We thank D Peer, H Caulston, G Brown, B Andrews, E Styles, R Rothstein, J Dittmar, R Reid, I Overton, J Bahler, D Gresham and M Gartenburg. The authors declare no competing financial interests.

Reviewing Editor

  1. Randy Schekman, Howard Hughes Medical Institute, University of California, Berkeley, United States

Publication history

  1. Received: November 16, 2015
  2. Accepted: March 17, 2016
  3. Version of Record published: April 21, 2016 (version 1)

Copyright

© 2016, Berry et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 2,189
    Page views
  • 433
    Downloads
  • 6
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, Scopus, PubMed Central.
