Insertion of helix-forming segments into the membrane and their association determine the structure, function, and expression levels of all plasma membrane proteins. However, systematic and reliable quantification of membrane-protein energetics has been challenging. We developed a deep mutational scanning method to monitor the effects of hundreds of point mutations on helix insertion and self-association within the bacterial inner membrane. The assay quantifies insertion energetics for all natural amino acids at 27 positions across the membrane, revealing that the hydrophobicity of biological membranes is significantly higher than appreciated. We further quantitate the contributions to membrane-protein insertion from positively charged residues at the cytoplasm-membrane interface and reveal large and unanticipated differences among these residues. Finally, we derive comprehensive mutational landscapes in the membrane domains of Glycophorin A and the ErbB2 oncogene, and find that insertion and self-association are strongly coupled in receptor homodimers. https://doi.org/10.7554/eLife.12125.001
Cells are defined by a thin membrane that separates the inside of the cell from the outside. The core of this membrane is hydrophobic, meaning that it repels water. Many signals and nutrients cannot pass through the membrane itself, but can pass through the proteins that span the membrane. Membrane proteins are therefore essential for living cells; yet even after decades of research, it remains unclear how proteins interact with the membrane and which features determine a protein’s stability in a biological membrane.
It has been known since the early 1980s that the bacterium E. coli can grow in the presence of a common antibiotic called ampicillin if it has enough of an antibiotic-degrading enzyme called β-lactamase anchored into its inner membrane. Now, Elazar et al. have used this enzyme to obtain detailed information on the interactions between a biological membrane and a membrane protein. First, hundreds of different mutations were introduced into the gene that encodes the enzyme to generate a population of bacteria that each had a slightly different membrane anchor. The mutant bacteria were then grown in the presence of the antibiotic, meaning that those mutants with a more stable membrane anchor were more likely to survive and grow than those with less stable anchors.
Elazar et al. then collected all the surviving bacteria, sequenced their DNA and measured how common the different mutations were in the final population. This approach was less labor-intensive and more accurate than traditional methods for monitoring membrane-anchored proteins, and the resulting large dataset was used to uncover which features affect a protein’s stability in a membrane. These results also showed that a biological membrane’s core is considerably more hydrophobic than was previously thought.
In addition to being hydrophobic, biological membranes have more negative charge on the side that faces into the cell. This means that membrane proteins with a positive charge in this region will be more stable, and Elazar et al. were able to use their new system to measure this effect for the first time.
Finally, membrane proteins do not only span the membrane; they also bind to other membrane proteins in order to carry out their roles. Elazar et al. used their system to look at the surfaces of human membrane proteins that interact with one another, and built a detailed map of the interaction surfaces, from which they derived accurate models of the membrane proteins.
Overall, these new findings could now be used to model the three-dimensional structures of membrane proteins and improve their stability. This in turn may help efforts to develop these proteins into more robust experimental tools and in the search for drugs that target membrane proteins. https://doi.org/10.7554/eLife.12125.002
The past four decades have seen persistent efforts to decipher the contributions to membrane-protein energetics (Reynolds et al., 1974; Cymer et al., 2015). Membrane-protein folding can be conceptually divided into two thermodynamic stages (Popot and Engelman, 1990; Cymer et al., 2015), each of which affects membrane-protein structure, function, and expression levels: the insertion into the membrane of transmembrane segments as α helices, and their association to form helix bundles (Ben-Tal et al., 1996; Heinrich and Rapoport, 2003; Moll and Thompson, 1994; White and Wimley, 1999; Popot and Engelman, 1990). While significant progress has been made in structure prediction, design, and engineering of soluble proteins (Fleishman and Baker, 2012), important but fewer successes were reported in design of membrane proteins (Joh et al., 2014; Li et al., 2004), largely owing to the complexity of the plasma membrane and the lack of systematic and accurate measurements of membrane-protein energetics (Cymer et al., 2015).
Recently, experimental systems that offer a realistic model for biological membranes have advanced. von Heijne and co-workers quantitated the partitioning of engineered peptides fused to the bacterial transmembrane protein, leader peptidase (Lep), between membrane-inserted and translocated states, and highlighted the importance of interactions between the translocon and the nascent polypeptide chain in determining partitioning (Hessa et al., 2007; Öjemalm et al., 2013). The insertion energetics obtained from this assay, however, were significantly lower than expected from previous theoretical and experimental studies; for instance, the apparent atomic-solvation parameter, which quantifies the free-energy contribution from the partitioning of hydrophobic surfaces to the membrane core, was only 10 cal/mol/Å² according to the Lep measurements (Öjemalm et al., 2011), compared to ~30 cal/mol/Å² from previous analyses (Andrew Karplus, 1997; Vajda et al., 1995). Additionally, the magnitudes of the insertion free energies for individual amino acids were substantially lower according to the Lep system (Hessa et al., 2007; Öjemalm et al., 2011; Öjemalm et al., 2013) than in other studies (Moon and Fleming, 2011; Shental-Bechor et al., 2006). These discrepancies led to suggestions that the Lep measurements were 'compressed' relative to others due to interactions between the engineered protein and other membrane constituents (Johansson and Lindahl, 2009; Shental-Bechor et al., 2006).
Membrane-protein energetics are governed not only by the insertion but also by the association of helices into bundles. A significant body of work has shown that association is driven by packing interactions and short sequence motifs comprising small-xxx-small residues, where small is any of the small polar residues (Ser, Gly, or Ala) and x is any residue (Russ and Engelman, 2000; Senes et al., 2004; Melnyk et al., 2004). However, while it is recognized that insertion and association both play roles in protein energetics (Duong et al., 2007; Finger et al., 2006; Moll and Thompson, 1994; Ben-Tal et al., 1996; Heinrich and Rapoport, 2003; Popot and Engelman, 1990), the interplay between these two aspects has not been subjected to systematic experimental analysis. Given the remaining open questions on membrane-protein and protein-protein interactions within the membrane, we established a high-throughput assay to shed light on both factors and their coupling in a systematic and unbiased way within the bacterial plasma membrane.
To overcome gaps in our understanding of membrane-protein energetics, we adapted the TOXCAT-β-lactamase (TβL) screen (Lis and Blumenthal, 2006; Russ and Engelman, 1999; Langosch et al., 1996) for high-throughput analysis by deep mutational scanning (Boucher et al., 2014; Fowler and Fields, 2014); we refer to this new method as deep-sequencing TOXCAT-β-lactamase (dsTβL). TβL is a genetic screen based on a chimera, in which a membrane-spanning segment is flanked on the amino terminus by the ToxR dimerization-dependent transcriptional activator of a chloramphenicol-resistance gene and on the carboxy terminus by β-lactamase (Figure 1a). In this construct, bacterial survival on ampicillin monitors membrane integration (Broome-Smith et al., 1990; Jaurin et al., 1982), and survival on chloramphenicol correlates with self-association of the membrane span (Lis and Blumenthal, 2006; Russ and Engelman, 1999; Langosch et al., 1996). Furthermore, β-lactamase and ToxR function only in the periplasm and cytoplasm, respectively; therefore, unlike previous assays of membrane-protein insertion (Hessa et al., 2007), the orientation of the inserted segment relative to the membrane is known, and only proteins inserted with their carboxy terminus located in the periplasm would be selected. Most studies using the TOXCAT screen fused maltose-binding protein (MBP) in the carboxy-terminal domain (instead of β-lactamase) and used the MBP-null E. coli strain MM39 and maltose as sole carbon source to monitor membrane integration (Russ and Engelman, 1999; Langosch et al., 1996; Melnyk et al., 2004; Li et al., 2004). Unlike this conventional TOXCAT experiment, in TβL, survival on ampicillin depends linearly on β-lactamase expression levels (Broome-Smith et al., 1990; Jaurin et al., 1982; Li et al., 2004; Lis and Blumenthal, 2006), thus providing a more sensitive reporter for membrane insertion than MBP.
Furthermore, the MM39 strain is not amenable to high-throughput transformation as required by our study; with TβL, we were able to use the high-transformation efficiency E. cloni cells (see Materials and methods).
Previous studies based on ToxR activity measured the effects of mutations using colony growth and enzyme-linked immunosorbent assay (ELISA), which do not allow high-throughput analysis (Mendrola et al., 2002; Langosch et al., 1996; Lis and Blumenthal, 2006; Russ and Engelman, 2000; Melnyk et al., 2004). Here, instead, we subject libraries encoding every amino acid substitution in the membrane domain to selection on agar plates containing either ampicillin alone or ampicillin and chloramphenicol to monitor insertion and self-association, respectively (Figure 1b); the same bacterial population is also plated on non-selective agar and serves as a reference to control for clonal-representation biases. Following overnight growth, the bacteria in each plate are pooled, plasmids encoding the TβL construct are extracted from each pool, and the variable gene segment, which encodes the membrane span, is amplified by PCR. The three resulting DNA samples are subjected to deep sequencing, which reports the relative frequency of each mutant in the selected and reference populations (Boucher et al., 2014) (see equation 1 in Materials and methods). If the cytoplasmic protein fraction were perfectly constant among different mutants, the measured population frequencies could be interpreted as the relative propensities of each mutant to insert into the membrane or to self-associate in the membrane. This condition is unlikely to hold for all mutants; still, the agreement reported below with multiple lines of biophysical evidence on purified proteins suggests that the population frequencies provide a reasonable measure for changes in membrane-insertion and self-association partitioning.
Hence, if we treat the population frequencies as if the mutants’ partitioning between cytosolic, membrane-inserted, and self-associated fractions were under thermodynamic control, following the Boltzmann equation we can derive, at each position i in the membrane span, apparent free energy changes (ΔΔG) for insertion or self-association due to the substitution from wild type to amino acid aa (see equations 2–3 in Materials and methods). Although confounding factors, such as nonspecific interactions between the inserted segment and other bacterial membrane proteins, may affect the readout from the experiment, insertion and self-association are likely to dominate, since every mutant in this library differs from the wild type by only one amino acid; furthermore, all mutants are subjected to identical selection conditions, including temperature and antibiotic, thereby minimizing experimental noise (Mackenzie and Fleming, 2008).
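The conversion from sequencing counts to apparent free energies can be sketched in a few lines of Python. This is only an illustration under stated assumptions: equations 1–3 are not reproduced in this section, so the function names, the enrichment definition (selected frequency divided by reference frequency), the sign convention, and the temperature of 310 K are all assumptions rather than the authors' exact formulation.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def enrichment(selected_count, selected_total, reference_count, reference_total):
    """Frequency of a variant in the selected pool divided by its
    frequency in the non-selective reference pool (assumed definition)."""
    selected_freq = selected_count / selected_total
    reference_freq = reference_count / reference_total
    return selected_freq / reference_freq

def apparent_ddg(mutant_enrichment, wildtype_enrichment, temperature_k=310.0):
    """Boltzmann-style apparent free-energy change in kcal/mol.
    Positive values mean the mutant is depleted relative to wild type
    (assumed sign convention). A mutant with zero reads in the selected
    pool has no finite estimate, so infinity is returned."""
    if mutant_enrichment == 0:
        return math.inf
    return -R_KCAL * temperature_k * math.log(mutant_enrichment / wildtype_enrichment)

# Invented counts, for illustration only.
wt = enrichment(5000, 1_000_000, 4000, 1_000_000)
mut = enrichment(500, 1_000_000, 4000, 1_000_000)
print(round(apparent_ddg(mut, wt), 2))
```

Returning infinity for zero-read mutants mirrors the point made later in the text that strongly depleted substitutions yield only lower bounds on the penalty.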
We used dsTβL to comprehensively map the sequence determinants of membrane insertion in a single-pass membrane segment (Figure 2a). To minimize the effects of self-association on experimental readout, we chose the C-terminal portion of human L-Selectin (CLS), which does not self-associate (Srinivasan et al., 2011) (Figure 2—figure supplement 1). Furthermore, the CLS amino acid sequence includes several polar amino acids (Figure 2a, bottom); we therefore reasoned that its membrane-expression levels would be sensitive even to point mutations. To verify that the deep-sequencing data reflected trends observed in experiments with single clones, we selected 10 single-point substitutions at the membrane-spanning segment’s amino terminus and at its core, and subjected them to experimental analysis on a clone-by-clone basis. Each clone and the wild type were grown overnight in non-selective medium, normalized to the same density, and plated in serial dilutions on ampicillin-containing agar to estimate relative viability (Figure 2—figure supplement 2). Nine of the 10 selected single-point substitutions (all but Val302Lys) showed qualitatively similar trends of viability in deep sequencing and single-clone analysis. The resolution of the deep-sequencing data, however, is much greater than that seen in the single-clone assays; for instance, whereas both charge mutations, Met303Glu and Ala311Arg, are highly deleterious according to deep sequencing and plate viability, the apparent ∆∆G value from deep sequencing for the former is 3.7 kcal/mol compared to 1.3 kcal/mol for the latter, emphasizing the larger dynamic range of deep sequencing compared to traditional viability screens. We next expressed these 10 mutants in non-selective conditions, isolated membrane preparations for each (Molloy, 2008), and measured membrane-localization levels relative to wild type using Western blots with an antibody targeting β-lactamase (Figure 2—figure supplement 3; see Materials and methods).
All clones expressed in the membrane fraction and ran at the expected size of ~55 kDa. Of the 10 tested mutations, 6 showed the expected trends, including mutations that increased (Met303Leu, Val304Phe, and Ala311Leu) or decreased expression (Val302Lys, Leu310Ala, and Ala311Arg) in agreement with the deep-sequencing and single-clone viability data (Figure 2—figure supplement 3). For instance, Ala311Arg was much less viable and showed lower membrane localization than wild type, whereas Ala311Leu was more viable and had higher membrane localization. However, three mutants to charges at the amino terminus (Val302His, Met303Glu, and Val304Asp) increased expression levels according to Western blots but were disruptive according to dsTβL. We attribute this difference to the fact that ampicillin viability reports not only on membrane-expression levels but also on the appropriate membrane integration, which is not captured by Western blots.
We next computed the apparent free-energy change of each substitution across the membrane relative to a substitution to Ala, and at each position i computed the running average over five neighboring positions [i-2…i+2] (Figure 2b; Supplementary file 1). The resulting profiles describe the energetics of inserting each of the twenty amino acids relative to Ala at each position across the bacterial plasma membrane (Figure 2b). Although the location of the membrane mid-plane (Z = 0) could not be determined unambiguously in this assay, we estimated it by aligning the hydrophobic residues’ profiles (Leu, Ile, Met, and Phe), thereby locating the profiles' troughs and the presumed membrane mid-plane at CLS position Ala311.
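The two operations described above, re-referencing each position to Ala and smoothing over the window [i-2…i+2], can be sketched as follows. The function names are hypothetical, and the treatment of the segment ends (truncating the window where fewer than two neighbors exist) is an assumption, since the text does not say how the edges were handled.

```python
def relative_to_ala(ddg_by_aa):
    """Shift one position's apparent free energies so that the
    substitution to Ala ('A') becomes the zero reference."""
    ala = ddg_by_aa["A"]
    return {aa: value - ala for aa, value in ddg_by_aa.items()}

def running_average(values, half_window=2):
    """Average each position with up to `half_window` neighbors on each
    side. At the ends of the segment the window is truncated
    (assumption), so fewer than five values are averaged there."""
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half_window)
        hi = min(len(values), i + half_window + 1)
        window = values[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed

# Toy profile for one amino acid across five positions (invented numbers).
print(running_average([0.0, 1.0, 2.0, 3.0, 4.0]))
```

Smoothing trades positional resolution for lower noise, which is worth keeping in mind when reading sharp features in the profiles.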
The small and polar amino acids, Ser, Thr, and Cys, have shallow, nearly neutral profiles, ranging from −0.1 to +0.8 kcal/mol. By contrast, the helix-distorting amino acids, Gly and Pro, have highly disruptive profiles, which peak (~2 kcal/mol) at the membrane mid-plane, emphasizing the strong unfavorable impact of exposing the polar protein backbone to the hydrophobic membrane environment. The large polar (Asn, His, and Gln) and charged residues (Asp, Glu, Lys, and Arg) are all highly disruptive in the membrane mid-plane. We note that the energetic penalties for Asp, Asn, His, Gln, Glu, and Lys cannot be determined precisely from the dsTβL assay, since the number of reads for substitutions to these residues at the membrane mid-plane in the selected population is nearly 0, reflecting exceedingly large negative-selection pressures (Figure 1; Supplementary file 2).
At the membrane mid-plane, the hydrophobic residues, Val, Ile, Leu, Met, and Phe, show the expected troughs, which are shallower for the small amino acid Val (approximately −0.5 kcal/mol) than for the large amino acids (<−1.5 kcal/mol). We compared the dsTβL values for hydrophobic residues in the membrane mid-plane to values from five hydrophobicity scales (Figure 2—figure supplement 4). dsTβL fits well to the Moon scale (Figure 2c, r2 = 0.90, with a slope close to 1), which, similar to dsTβL, measures substitution effects in a bacterial membrane, albeit the outer membrane (Moon and Fleming, 2011). The correspondence between dsTβL, which is based on in vivo measurement of membrane integration in a bacterial population, and biophysical assays on purified proteins partly confirms the use of dsTβL for studying membrane-protein energetics.
The dsTβL profile for Trp is similar to the profiles of the aliphatic residues, whereas Tyr makes a nearly neutral contribution to insertion in the membrane core. These profiles diverge from statistical inferences from membrane-protein structures and partitioning experiments, which show that Tyr and Trp preferentially line the membrane-water interface (Ulmschneider et al., 2005; Schramm et al., 2012; Senes et al., 2007; Nakashima and Nishikawa, 1992; Yau et al., 1998). Further experimental analysis of the role of aromatic residues in membrane-protein stability is warranted, and one possible explanation for these differences is that in the dsTβL assay aromatic residues on the membrane-spanning segment lack neighboring aromatic residues with which to form stabilizing stacking interactions; indeed, experimental stability measurements have shown that stacking makes a significant contribution to the net stabilization provided by aromatic residues in membrane proteins (Hong et al., 2007, 2013).
Recently, controversy has surrounded the question of how hydrophobic biological membranes are (Johansson and Lindahl, 2009). On the one hand, theoretical considerations and values inferred from hydrocarbons in solution suggested that the free energy contribution due to inserting aliphatic groups into the membrane, or the atomic-solvation parameter, is ~30 ± 5 cal/mol/Å² of nonpolar surface area (Vajda et al., 1995; Andrew Karplus, 1997); on the other hand, the Lep measurements suggested values of only 10 cal/mol/Å² (Öjemalm et al., 2011). We analyzed dsTβL data for 39 substitutions from one aliphatic residue (Ala, Val, Ile, Leu, and Met) to another at the core of the membrane (−9 Å < Z < 13 Å) and inferred an apparent atomic-solvation parameter of 37 ± 6 cal/mol/Å² (Figure 2d). We additionally derived an atomic-solvation parameter of 32 ± 4 cal/mol/Å² by analyzing the apparent insertion free energies at the membrane mid-plane for each of the aliphatic residues (Figure 2d, inset). The values we infer for the atomic-solvation parameter are therefore in fair agreement with values for protein cores and hydrocarbons in aqueous solution (Vajda et al., 1995), and 3–4 times larger than the value inferred from the Lep system (Öjemalm et al., 2011). We further note that while the ranking of apparent insertion free energies of aliphatic amino acids in dsTβL and Lep (Hessa et al., 2007) is similar (r2 = 0.79, Figure 2—figure supplement 4), the magnitude of the insertion free-energy changes is nearly four times greater according to dsTβL. We conclude that our results support the view that the hydrophobicity of the plasma membrane core is similar to that of hydrocarbons and much higher than measured in the Lep system.
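An atomic-solvation parameter of this kind is the slope relating the free-energy change of a substitution to the change in nonpolar surface area. The sketch below shows one simple way to estimate such a slope, a least-squares fit through the origin. The fitting procedure, the function name, and the data are assumptions made for illustration; the numbers are synthetic and constructed to give exactly 30 cal/mol/Å², and they are not measurements from this study.

```python
def atomic_solvation_parameter(delta_area_A2, ddg_cal_per_mol):
    """Least-squares slope through the origin of ddG (cal/mol) against
    the change in nonpolar surface area (Å^2). The sign is flipped so
    that a favorable (negative) ddG per added area gives a positive
    parameter, matching how the value is usually quoted."""
    numerator = sum(a * g for a, g in zip(delta_area_A2, ddg_cal_per_mol))
    denominator = sum(a * a for a in delta_area_A2)
    return -numerator / denominator

# Synthetic example: each added Å^2 of nonpolar surface lowers ddG by 30 cal/mol.
areas = [10.0, 20.0, 40.0]
ddg = [-300.0, -600.0, -1200.0]
print(atomic_solvation_parameter(areas, ddg))
```

Forcing the fit through the origin encodes the expectation that a substitution with no change in nonpolar area has no hydrophobic contribution; a fit with a free intercept would be an equally reasonable choice.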
A hallmark of plasma membrane proteins is the charge asymmetry known as the ‘positive-inside’ rule, according to which the cytoplasmic-facing side of the protein is much more positively charged than the periplasmic or extracellular-facing side (von Heijne, 1989). This asymmetry has been used to successfully predict the orientation of membrane proteins (von Heijne, 1989), but experimental quantification of the energetics of this asymmetry met with difficulty; previous studies measured only a small energy difference (-0.5 kcal/mol) between inserting Arg and Lys in the cytoplasmic relative to the extracellular-facing side of the membrane and no asymmetry for His (Lerch-Bader et al., 2008; Öjemalm et al., 2013). A striking feature of the dsTβL profiles, by contrast, is that they show clear and large asymmetries for Arg, Lys, and His, in agreement with the ‘positive-inside’ rule (Figure 2b). The three profiles, however, are not identical: whereas Lys and Arg are favored by 2 kcal/mol near the cytoplasm compared to near the periplasm, the titratable amino acid His shows a more modest asymmetry of 1 kcal/mol; moreover, of these three amino acids, only Arg stabilizes the segment near the cytosol, whereas Lys and His are nearly neutral at the cytosol-membrane interface. This difference between Arg and Lys, which has not been noted until now, may be due to charge delocalization in the Arg sidechain and Arg’s ability to form more hydrogen bonds with lipid phosphate headgroups.
We compared the relative propensity of each of the 20 amino acids at each position across the membrane (Figure 3; equation 4 in Materials and methods). The results clearly reflect the asymmetric distribution of charges across the plasma membrane, with Arg as the most favored amino acid at the cytoplasmic surface, giving way to the large and hydrophobic amino acids, Leu, Ile, and Phe. Since the propensities reflect protein-lipid interactions, they could be used to engineer variants of natural membrane proteins that exhibit higher stability and expression levels by mutating membrane-facing positions to the highest-propensity identity. Furthermore, the results suggest that the insertion profiles could be used for bioinformatics prediction of the locations and orientations of membrane-spanning proteins (Elazar et al., manuscript in preparation).
With the accumulation of membrane-protein molecular structures, it has become possible to derive knowledge-based potentials for the insertion of amino acids across the membrane from distributions of amino acids observed in structures (Ulmschneider et al., 2005). We compared the dsTβL profiles to three knowledge-based profiles published over the past decade (Senes et al., 2007; Ulmschneider et al., 2005; Schramm et al., 2012) (Figure 4). The dsTβL profiles are similar to the knowledge-based ones for the weakly polar and hydrophobic residues (Val, Ser, and Thr), but they diverge with respect to the more hydrophobic and polar residues: for instance, the dsTβL apparent insertion energy for Leu, Ile, and Phe at the membrane mid-plane is −2 kcal/mol and around −0.5 kcal/mol for the other scales. Additionally, the only knowledge-based scale that attempted to derive an asymmetric insertion potential (Schramm et al., 2012) reported much smaller asymmetries for Lys and Arg, of around 0.5 kcal/mol, compared to 2 kcal/mol in dsTβL. These differences are likely due to the fact that knowledge-based potentials reflect the frequencies of amino acids in membrane proteins and are biased by functional constraints; indeed polar residues are often found in the membrane core, where they have important roles in oligomerization, substrate binding, transport, and conformational change. Furthermore, the dsTβL experiment is based on a single-pass segment, where every position is exposed to the membrane, whereas the knowledge-based profiles are derived from all structures, including multi-pass proteins, where many residues mediate interactions with other protein segments or line water-filled cavities.
Insertion and association of membrane-spanning helices are thermodynamically coupled (Kessel and Ben-Tal, 2002; Moll and Thompson, 1994; Popot and Engelman, 1990), but except in one study (Duong et al., 2007) these two aspects were assayed separately (Fleming et al., 1997; Finger et al., 2009; Hessa et al., 2007; Mendrola et al., 2002). To test the coupling between insertion and association, we applied dsTβL to two model systems for studying membrane protein self-association: the membrane domains of the human erythrocyte sialoglycoprotein Glycophorin A (GpA) and the ErbB2 oncogene, and compared survival on ampicillin and chloramphenicol to survival on ampicillin alone.
Some of the amino acid positions that mediate self-association in GpA and ErbB2 according to their experimentally determined structures (Bocharov et al., 2008; MacKenzie et al., 1997) show large decreases in chloramphenicol viability upon mutation. Unexpectedly, however, many other mutations that have large effects on chloramphenicol viability are not close to the dimerization surfaces (Figure 5—figure supplement 1), suggesting that factors other than self-association dominate the chloramphenicol-viability landscape. To test whether these confounding results are due to variability in expression levels among the mutants, we subtracted from the observed effects of every mutant the expected effects due to expression-level changes according to the dsTβL insertion profiles (see equation 5 in Materials and methods). The corrected self-association mutational landscapes now correctly discriminate positions that mediate self-association from those that do not (Figure 5a), and clearly highlight the interaction motifs in GpA and ErbB2. We compared the results from the systematic self-association landscape of GpA to a previous analysis of 24 GpA mutants that were individually screened for self-association using TOXCAT and corrected for differences in membrane expression using Western blots (Figure 5b) (Duong et al., 2007). The results from dsTβL and the previous study are consistent (r2 = 0.66; Figure 5b), confirming the use of the dsTβL insertion scale to correct self-association mutational landscapes in single-pass homodimers. Furthermore, by systematically probing every position across the membrane, our results highlight additional positions that are sensitive to mutation, such as GpA’s Gly86, which was previously not subjected to mutagenesis (Lemmon et al., 1992; Duong et al., 2007). 
Moreover, it is notable that although the vast majority of mutations are either neutral or disruptive to self-association, some mutations, for instance to Met in the GpA amino terminus, may promote self-association in the context of the TβL construct (Figure 5a). Our results further confirm the strong interplay between membrane-protein expression levels and association, and the importance of accounting for both in biophysical experiments on membrane proteins.
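The expression-level correction described above can be illustrated with a short sketch. For a homodimer in equilibrium with its monomer, the dimer concentration scales with the square of the monomer concentration, so a change in monomer (insertion) energetics enters the dimer readout twice. Equation 5 is not reproduced in this section; the function name, the default weight of 2, and the sign convention below are therefore assumptions based on that general relationship, not the authors' exact formula.

```python
def corrected_self_association(observed_ddg, insertion_ddg, monomer_weight=2.0):
    """Remove the expected contribution of insertion energetics from the
    apparent free-energy change observed under chloramphenicol selection.
    `monomer_weight` defaults to 2 because dimer concentration depends on
    the monomer concentration squared (assumed form of the correction)."""
    return observed_ddg - monomer_weight * insertion_ddg

# Invented values in kcal/mol: a mutation that looks disruptive under
# chloramphenicol selection mostly because it inserts poorly.
print(corrected_self_association(observed_ddg=1.5, insertion_ddg=0.5))
```

In this toy case most of the observed effect is attributed to insertion, leaving a smaller residual assigned to self-association; positions far from the dimer interface would be expected to show residuals near zero.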
We also tested whether the systematic mutational landscapes generated by dsTβL could be used to provide constraints for structure modeling of receptor membrane domains (Fleishman et al., 2002; Kim et al., 2003; Polyansky et al., 2014). We used the Rosetta biomolecular-modeling software (Das and Baker, 2008; Yarov-Yarovoy et al., 2006) to generate 100,000 structure models of GpA and ErbB2 directly from their sequences, and selected structures that self-associate through positions that are sensitive to mutation according to dsTβL but not through positions that are insensitive to mutation (Figure 5c, Figure 5—figure supplement 2). In both cases, fewer than five models passed the selection criteria and of those, some models were within 2 Å of experimentally determined structures.
Despite progress in measuring protein energetics within biological membranes, significant open questions remained, among them, what is the hydrophobicity at the core of biological membranes; what is the magnitude of the bias for positively charged residues at the cytoplasm surface; and how strong is the coupling between membrane-protein insertion and association energetics? To shed light on these fundamental questions, we established a high-throughput genetic screen and used it to generate systematic mutation landscapes of insertion and self-association in the plasma membrane of live bacteria.
The apparent insertion energies in dsTβL are in line with biophysical stability measurements on outer-membrane proteins (Moon and Fleming, 2011), and the inferred atomic-solvation parameter is close to measurements in model systems and protein cores (Andrew Karplus, 1997; Vajda et al., 1995). Our measurements, however, are three to four times larger than the corresponding ones using the Lep system (Öjemalm et al., 2011, 2013; Hessa et al., 2007). To be sure, we are not the first to note these large differences (Johansson and Lindahl, 2009; Shental-Bechor et al., 2006); yet, we find it significant that our measurements, similar to those in the Lep system, use biological membranes. The observation that the dsTβL insertion measurements for aliphatic side chains have the same ranking but are fourfold larger in magnitude compared to those from Lep (Figure 2—figure supplement 4) may indicate that the Lep system measures only a part of the energy contribution to insertion. While further investigation is needed, we speculate that the reason for the large differences between dsTβL and Lep is that total membrane-protein expression levels were not quantified in the Lep system (Hessa et al., 2007; Öjemalm et al., 2011, 2013).
We note the following two caveats regarding the dsTβL insertion profiles. First, the penalties for most polar residues at the membrane mid-plane are likely to indicate lower bounds on their insertion energies, since the number of clones counted in the deep-sequencing data for these mutants is close to 0 (supplementary data). Second, statistical analyses (Ulmschneider et al., 2005; Schramm et al., 2012; Senes et al., 2007) and experiments (Hessa et al., 2007) demonstrated that the aromatics Tyr and Trp are preferred in the water-membrane interface rather than in the core, although dsTβL shows the reverse (Figure 2b). We suggest that these results reflect the fact that dsTβL is based on a monomeric construct where the aromatics are fully exposed to the membrane environment; however, these uncertainties require further research.
The TOXCAT genetic screen has made essential contributions to our understanding of self-association in the membrane (Lindner and Langosch, 2006; Lis and Blumenthal, 2006; Russ and Engelman, 1999; Finger et al., 2009; Mendrola et al., 2002; Li et al., 2004; Srinivasan et al., 2011; Reuven et al., 2012). Some early reports demonstrated that chloramphenicol survival also depends on membrane-protein expression levels (Russ and Engelman, 1999; Duong et al., 2007). Our results strongly support this view and show that expression levels are a dominant factor in chloramphenicol survival. This dominance is perhaps not surprising in retrospect, given that a mutation’s effects on monomer concentrations are counted twice in computing its effects on homodimer concentrations, and therefore on chloramphenicol viability (see equation 5 in Materials and methods). A key contribution of unbiased and systematic assays, such as dsTβL, is that they clarify such trends unambiguously. Furthermore, the dsTβL insertion profiles derived from the monomeric CLS provide a self-consistent way to factor out the contributions from insertion energetics in future assays on membrane-protein association or function in unrelated membrane proteins, thereby eschewing the need to measure the expression levels of individual mutants.
Deep mutational scanning has made important inroads into the analysis and optimization of diverse protein systems (Whitehead et al., 2012; Fowler and Fields, 2014; Boucher et al., 2014). The main strengths of deep mutational scanning are that it measures the effects of all point mutations without bias and that all mutants experience strictly equal experimental conditions, thereby limiting experimental noise. The structural simplicity of the model systems tested here, consisting of a single α helix or of helix homodimers, plays a further role in the ability to accurately infer energetics. Combined with structural modeling, the assay can provide essential information both on association energetics and on the molecular architecture of membrane receptors. More generally, the data on protein-membrane and protein-protein energetics obtained from dsTβL will be used to improve models of membrane-protein energetics and to design, screen, and engineer high-expression mutants of specific membrane proteins (Fleishman and Baker, 2012; Joh et al., 2014).
The p-Mal plasmid was generously provided by the Mark Lemmon laboratory. We replaced the maltose-binding protein domain at the open-reading frame carboxy-terminus with β-lactamase (Lis and Blumenthal, 2006). The restriction sites in multiple-cloning site 1 were changed to XhoI and SpeI. The p-Mal plasmid contains a gene for spectinomycin resistance, which is constitutively expressed, providing selection pressure for transformation. The open-reading frame encompassing the TβL construct is also constitutively expressed and is under the control of the weak ToxR promoter.
The DNA coding sequences of the transmembrane constructs used in the paper are:
CCGCTGTTCATCCCGGTTGCAGTTATGGTTACCGCTTTTAGTGGATTGGCGTTTATCATCTGGCTGGCT (amino acid sequence: PLFIPVAVMVTAFSGLAFIIWLA)
CTCATTATTTTTGGGGTGATGGCTGGTGTTATTGGAACGATCCTGATC (amino acid sequence: LIIFGVMAGVIGTILI)
CTGACGTCTATCATCTCTGCGGTGGTTGGCATTCTGCTGGTCGTGGTCTTGGGCGTGGTCTTTGGCATCCTGATC (amino acid sequence: LTSIISAVVGILLVVVLGVVFGILI)
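As a consistency check on the sequences listed above, the following minimal Python sketch translates each coding sequence with the standard genetic code and compares the result with the stated amino acid sequence. The function and variable names are illustrative and are not taken from the authors' scripts.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame coding sequence into one-letter amino acids."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# (coding sequence, stated amino acid sequence) pairs, in the order listed above.
CONSTRUCTS = [
    ("CCGCTGTTCATCCCGGTTGCAGTTATGGTTACCGCTTTTAGTGGATTGGCGTTTATCATCTGGCTGGCT",
     "PLFIPVAVMVTAFSGLAFIIWLA"),
    ("CTCATTATTTTTGGGGTGATGGCTGGTGTTATTGGAACGATCCTGATC",
     "LIIFGVMAGVIGTILI"),
    ("CTGACGTCTATCATCTCTGCGGTGGTTGGCATTCTGCTGGTCGTGGTCTTGGGCGTGGTCTTTGGCATCCTGATC",
     "LTSIISAVVGILLVVVLGVVFGILI"),
]

for dna, protein in CONSTRUCTS:
    # Each listed DNA sequence should encode exactly the listed peptide.
    assert translate(dna) == protein
```

All three DNA sequences translate in frame to the peptides given in parentheses.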
The CLS construct was deposited in the AddGene repository [pMAL_dstβL-(Plasmid #73805)].
All experiments were conducted using the high-transformation efficiency E. cloni cells (Lucigen Corporation, Middleton, WI).
Customized MatLab 8.0 (MathWorks, Natick, Massachusetts) scripts (supplementary files) were written to generate forward and reverse DNA oligos of 40–85 base pairs in which the central codon is replaced by the degenerate codon NNS (N is any of the four nucleotides, A, T, G, or C, and S is G or C), which encodes all natural amino acids. The resulting primers were ordered from Sigma (Sigma-Aldrich, Rehovot, Israel). For example, to replace the central 302nd codon of human CLS with an NNS codon, the following two primers were ordered:
Each pair of oligos was then cloned into the wild type by restriction-free (RF) cloning (van den Ent and Löwe, 2006).
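The coverage of the NNS codon can be checked directly. The sketch below is an illustration in Python, not the authors' MatLab primer script. It enumerates the 32 NNS codons, groups them by encoded residue, and shows how a single codon in a coding sequence could be swapped for NNS. The helper `degenerate_template` and its zero-based `codon_index` argument are hypothetical, and a real primer would also need appropriate flanking sequence and a reverse complement.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def nns_codons():
    """All codons matching NNS: N is A, C, G, or T; S is G or C."""
    return ["".join(c) for c in product("ACGT", "ACGT", "GC")]


def nns_coverage():
    """Map each encoded residue ('*' marks a stop) to its NNS codons."""
    coverage = {}
    for codon in nns_codons():
        coverage.setdefault(CODON_TABLE[codon], []).append(codon)
    return coverage


def degenerate_template(coding_dna, codon_index):
    """Replace the codon at zero-based codon_index with the literal 'NNS'."""
    start = 3 * codon_index
    if codon_index < 0 or start + 3 > len(coding_dna):
        raise IndexError("codon index outside the coding sequence")
    return coding_dna[:start] + "NNS" + coding_dna[start + 3:]


coverage = nns_coverage()
residues = sorted(aa for aa in coverage if aa != "*")
print(len(nns_codons()), "codons,", len(residues), "amino acids")
print("stop codons within NNS:", coverage.get("*", []))
```

NNS keeps all 20 amino acids while admitting only one of the three stop codons (TAG), which is why it is commonly preferred over a fully degenerate NNN codon.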
The resulting plasmids from the library-construction step above were electroporated into E. cloni and plated on agar plates containing 50 μg/ml spectinomycin. Plasmids for each position were transformed and plated separately and positions with fewer than 200 colonies were retransformed. All positions were then pooled and used to inoculate 10 ml of Luria Broth medium (LB) with 50 μg/ml spectinomycin and grown in a shaker at 200 rpm and 37°C over-night, diluted 1:1000 and grown to OD = 0.2–0.4. The libraries were then diluted to OD = 0.1 and 200 μl of the resulting cultures were plated at different dilutions (1:1, 1:10, 1:100, 1:1000) on large 12-cm petri dishes containing spectinomycin, ampicillin alone, or ampicillin and chloramphenicol. After overnight incubation at 37°C, p-Mal plasmids were extracted from the resulting colonies using a miniprep kit (Qiagen, Valencia, California).
Every wild-type membrane-spanning segment exhibits a different sensitivity to chloramphenicol and ampicillin. To determine the concentrations that are most likely to provide maximal dynamic range, we started by cloning mutants that are predicted to reduce insertion of the membrane-spanning segment or its self-association (Mendrola et al., 2002). Results are represented in Supplementary file 1. We next titrated the wild-type construct as well as the mutants on plates with varying concentrations of antibiotic to find the concentration that shows the largest difference in viability between the wild type and the compromising mutants. Supplementary file 2 provides the ampicillin and chloramphenicol concentrations used in each of the experiments reported in the paper.
To attach the adaptors for deep sequencing, the membrane-spanning segments were amplified from the p-Mal plasmids in a two-step PCR using KAPA HiFi DNA polymerase (Kapa Biosystems, London, England).
1 μl of the PCR product was taken to the next PCR step:
>reverse barcode 1
>reverse barcode 2
The DNA samples from each of the populations (unselected; ampicillin-selected; and chloramphenicol and ampicillin selected) were PCR-amplified using DNA barcodes for deep sequencing. The following barcodes were used:
All the primers were ordered as PAGE-purified oligos. The concentration of the PCR product was verified using the Qubit assay (Life Technologies, Grand Island, New York).
DNA samples were run on an Illumina MiSeq using 150-bp paired-end kits. The quality control for a typical run showed that the membrane-spanning segment was of high quality (source data). FASTQ sequence files were obtained for each run, and customized MatLab 8.0 scripts were written to generate the selection heat maps from the data (scripts are available in supplementary files). Briefly, the script starts by translating the DNA sequence to an amino acid sequence; it then eliminates sequences that harbor more than one amino acid mutation relative to wild type; counts each variant in each population; and eliminates variants with fewer than 100 counts in the reference population (to reduce statistical uncertainty). In a typical experiment, at least 70% of the reads passed these quality-control measures.
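The counting steps described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' MatLab script: reads are assumed to be already merged and trimmed to the in-frame membrane-spanning segment, reads containing non-standard codons are simply skipped, and all function names are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a read; return None if any codon is incomplete or ambiguous."""
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    if any(codon not in CODON_TABLE for codon in codons):
        return None
    return "".join(CODON_TABLE[codon] for codon in codons)


def count_variants(reads, wild_type):
    """Count wild-type reads and single amino acid variants.

    Keys are "WT" or (zero-based position, substituted residue).
    Reads with more than one amino acid change are discarded.
    """
    counts = Counter()
    for read in reads:
        protein = translate(read)
        if protein is None or len(protein) != len(wild_type):
            continue
        changes = [(pos, aa) for pos, (wt, aa)
                   in enumerate(zip(wild_type, protein)) if wt != aa]
        if len(changes) > 1:
            continue
        counts[changes[0] if changes else "WT"] += 1
    return counts


def well_sampled(reference_counts, min_reads=100):
    """Variants with at least min_reads reads in the reference population."""
    return {v for v, n in reference_counts.items() if n >= min_reads}
```

For example, `count_variants(reads, "PLFIPVAVMVTAFSGLAFIIWLA")` would tally the variants of the first construct listed in the Methods, and `well_sampled` would then restrict the analysis to variants that pass the 100-read threshold.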
The ampicillin selected and the unselected populations of CLS mutants were subjected to deep sequencing analysis yielding more than 4 million reads for each population. Out of 540 possible single-point substitutions, 472 (~87%) mutants were each counted more than 100 times in the reference population; the remaining mutants were eliminated from analysis to reduce uncertainty (gray tiles in Figure 2a). The dsTβL assay has a large dynamic range; for instance, at position 307 in the membrane center, the number of reads in the selected population for Lys, Gln, and Glu is 0, whereas the number of reads for Leu is nearly 110,000, spanning five orders of magnitude.
To derive the mutational landscapes (Figures 2a and 4a), we compute the frequency $p^{i,j}$ of each mutant relative to wild type in the selected and reference pools, where i is the position and j is the substitution:
$$p^{i,j} = \frac{\mathrm{count}^{i,j}}{\mathrm{count}^{\text{wild-type}}} \tag{1}$$
where count is the number of reads for each mutant. The selection coefficients are then computed as the ratio
$$s^{i,j} = \frac{p^{i,j}_{\mathrm{selected}}}{p^{i,j}_{\mathrm{reference}}} \tag{2}$$
where selected refers to the selected population (ampicillin in the case of the CLS insertion analysis, and ampicillin plus chloramphenicol in self-association analyses) and reference refers to the reference population (spectinomycin-selection in the case of CLS insertion analysis, and ampicillin in the case of self-association analysis). The resulting values are then transformed to apparent changes in free energy ($\Delta\Delta G^{i,j}_{\mathrm{app}}$) due to each single-point substitution through the Gibbs free-energy equation:
$$\Delta\Delta G^{i,j}_{\mathrm{app}} = -RT \ln s^{i,j} \tag{3}$$
where R is the gas constant, T is the absolute temperature (310 K), and ln is the natural logarithm.
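The conversion from read counts to apparent free energies described above can be sketched in a few lines of Python. The sketch assumes the relation ΔΔG = −RT ln s with s defined as the ratio of wild-type-normalized frequencies in the selected and reference pools, and it reports energies in kcal/mol. The dictionary layout and function names are illustrative assumptions, not the authors' code.

```python
import math

R_KCAL_PER_MOL_K = 1.987e-3   # gas constant in kcal/(mol K)
TEMPERATURE_K = 310.0         # absolute temperature stated in the Methods


def relative_frequency(counts, variant, wild_type="WT"):
    """Read count of a variant divided by the wild-type read count."""
    return counts[variant] / counts[wild_type]


def selection_coefficient(selected, reference, variant, wild_type="WT"):
    """Ratio of relative frequencies in the selected and reference pools."""
    return (relative_frequency(selected, variant, wild_type)
            / relative_frequency(reference, variant, wild_type))


def apparent_ddg(selected, reference, variant, wild_type="WT"):
    """Apparent free-energy change in kcal/mol; +inf if the variant vanished."""
    s = selection_coefficient(selected, reference, variant, wild_type)
    if s == 0:
        # No reads after selection: only a lower bound on the penalty exists.
        return math.inf
    return -R_KCAL_PER_MOL_K * TEMPERATURE_K * math.log(s)


# Hypothetical read counts, for illustration only.
reference = {"WT": 1000, "L": 1000, "K": 500}
selected = {"WT": 1000, "L": 2000, "K": 0}
print(apparent_ddg(selected, reference, "L"))  # negative: enriched on selection
print(apparent_ddg(selected, reference, "K"))  # inf: lower bound only
```

A variant with zero reads in the selected pool yields an unbounded value, which is why the penalties for polar residues at the membrane mid-plane are best read as lower bounds.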
The readout from the insertion selection in dsTβL comprises contributions from the local environment of each position; for example, substitution to a small residue might form a cavity if surrounded by large residues. To reduce such sequence-specific effects, the insertion free-energy values relative to alanine were smoothed using the MatLab smooth function over a window of 5 residues (2 on each side), excluding gray tiles (with insufficient data), and plotted as points in Figure 2b. The points were then fitted using the polynomial fitting function polyfit to yield 4th-order polynomials (Figure 2b lines and Supplementary file 1). Two centrally located polar amino acid positions, CLS positions Ser307 and Gly308, were discarded from the analysis due to their inconsistency with the general trends of the insertion profiles, likely because mutations at these polar positions distort the helix backbone.
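The smoothing and fitting steps can be illustrated with a dependency-free Python sketch. It stands in for the MatLab `smooth` and `polyfit` calls named above and is not a reimplementation of them: positions without data are represented as `None`, the averaging window is simply truncated at the ends of the segment (MatLab's edge handling may differ), and the polynomial is fitted by solving the least-squares normal equations. All names are hypothetical.

```python
def smooth_profile(values, half_window=2):
    """Moving average over 2*half_window + 1 positions, skipping None entries."""
    smoothed = []
    for i, value in enumerate(values):
        if value is None:            # gray tile: insufficient data
            smoothed.append(None)
            continue
        lo = max(0, i - half_window)
        hi = min(len(values), i + half_window + 1)
        window = [v for v in values[lo:hi] if v is not None]
        smoothed.append(sum(window) / len(window))
    return smoothed


def fit_polynomial(xs, ys, degree=4):
    """Least-squares polynomial fit; coefficients ordered highest power first."""
    n = degree + 1
    # Normal equations A c = b with A[r][c] = sum(x^(r+c)), b[r] = sum(y * x^r).
    a = [[sum(x ** (r + c) for x in xs) for c in range(n)] for r in range(n)]
    b = [sum(y * x ** r for x, y in zip(xs, ys)) for r in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    # Back substitution gives coefficients from lowest to highest power.
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(a[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = (b[r] - tail) / a[r][r]
    return coeffs[::-1]
```

In use, one would smooth the per-position values for a given amino acid, drop the `None` entries together with their positions, and pass the remaining (position, value) pairs to `fit_polynomial`. Rescaling the position axis to a range near [-1, 1] before fitting keeps the normal equations well conditioned.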
To compute the amino acid preference at each position in the membrane (Figure 2c), we calculated the Boltzmann-weighted probability of every amino acid residue at each position in the membrane-spanning domain of human CLS (Srinivasan et al., 2011) using the following formula (MatLab script in supplementary files):
$$P^{i,j} = \frac{\exp\left(-\Delta G^{i,j}_{\mathrm{app}}/RT\right)}{\sum_{k=1}^{20} \exp\left(-\Delta G^{i,k}_{\mathrm{app}}/RT\right)} \tag{4}$$
where R is the gas constant, T = 310 K, and $\Delta G^{i,j}_{\mathrm{app}}$ is the apparent free energy of transfer of amino acid j at position i relative to alanine (Figure 2b).
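A minimal Python sketch of the Boltzmann weighting follows. It assumes that each residue's weight at a position is exp(−ΔG/RT), normalized over all residues at that position, with energies in kcal/mol. The function name and the example energies are hypothetical.

```python
import math

R_KCAL_PER_MOL_K = 1.987e-3   # gas constant in kcal/(mol K)


def boltzmann_probabilities(dg_by_residue, temperature=310.0):
    """Convert apparent transfer free energies (kcal/mol) to probabilities."""
    rt = R_KCAL_PER_MOL_K * temperature
    weights = {aa: math.exp(-dg / rt) for aa, dg in dg_by_residue.items()}
    total = sum(weights.values())
    return {aa: w / total for aa, w in weights.items()}


# Hypothetical energies relative to alanine at one position (illustration only).
example = {"A": 0.0, "L": -1.0, "K": 2.0}
for residue, prob in boltzmann_probabilities(example).items():
    print(residue, round(prob, 3))
```

Because only energy differences enter the ratio, the choice of alanine as the reference state does not affect the resulting probabilities.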
We generated a model of the CLS membrane domain by threading its sequence on a canonical α helix, and used Rosetta to singly introduce each substitution from one aliphatic identity (Ala, Val, Ile, Met, Leu, and Phe) to another in the membrane core. Amino acid sidechains were combinatorially repacked and the change in solvent-accessible surface area (ΔSASA) was computed. Four additional data points (marked with asterisks, Figure 2d) were extracted from Glycophorin A’s position Ala82, which is located at the membrane center and away from the dimerization interface. To compute the atomic-solvation parameter from the insertion energies of the aliphatics at the membrane mid plane (Figure 2d, inset), we compared the insertion energy at the membrane mid plane for each aliphatic residue with computed ΔSASA of a change from that residue to Ala on a canonical poly-Ala α helix.
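One way to obtain an atomic-solvation parameter from such paired values is the slope of a least-squares line through the (ΔSASA, insertion energy) points. The Python sketch below shows that calculation. Treating the parameter as an ordinary regression slope with a free intercept is an assumption made here, and the numbers in the example are invented for illustration only.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data: change in buried surface area (Å^2) versus the
# insertion free energy at the membrane mid-plane (kcal/mol).
delta_sasa = [0.0, 20.0, 40.0, 60.0]
insertion_energy = [0.1, 0.7, 1.3, 1.9]
slope, intercept = least_squares_line(delta_sasa, insertion_energy)
print(slope, intercept)  # the slope has units of kcal/mol/Å^2
```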
ToxR activity depends on homodimer concentrations (Langosch et al., 1996; Russ and Engelman, 1999), and homodimer concentrations depend on both monomer insertion into the membrane and self-association strength. The measured effects of every point mutation on self-association (Figure 5—figure supplement 1) therefore comprise contributions from insertion (multiplied by two because the homodimer comprises two mutants) and dimerization (Duong et al., 2007). To isolate the mutation’s effects on self-association (Figure 5a), we subtract from every data point twice the contribution to insertion at the relevant position along the membrane normal:
$$\Delta\Delta G^{i \to j,\,x}_{\text{self-association}} = \Delta\Delta G^{i \to j,\,x}_{\mathrm{measured}} - 2\,\Delta\Delta G^{i \to j,\,x}_{\mathrm{insertion}} \tag{5}$$
where i, j, and x are the wild-type identity, the mutation, and the position along the membrane normal, respectively; $\Delta\Delta G_{\mathrm{measured}}$ is the measured change in self-association free energy (see equation (3)); and $\Delta\Delta G_{\mathrm{insertion}}$ is the free-energy change expected for a mutation from i to j at position x according to the insertion polynomials of Supplementary file 1.
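The correction can be sketched in Python as follows. The sketch assumes that each amino acid has an insertion polynomial giving its energy relative to alanine as a function of position along the membrane normal, so that the insertion contribution of a mutation from i to j is the difference between the two polynomials evaluated at x. The coefficient values below are hypothetical placeholders, not the fitted values of Supplementary file 1.

```python
def evaluate_polynomial(coeffs, x):
    """Horner evaluation; coefficients ordered from highest to lowest power."""
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result


def insertion_ddg(profiles, wild_type, mutant, x):
    """Expected insertion free-energy change for wild_type -> mutant at x."""
    return (evaluate_polynomial(profiles[mutant], x)
            - evaluate_polynomial(profiles[wild_type], x))


def self_association_ddg(measured_ddg, profiles, wild_type, mutant, x):
    """Remove twice the insertion contribution from the measured value."""
    return measured_ddg - 2.0 * insertion_ddg(profiles, wild_type, mutant, x)


# Hypothetical insertion polynomials (kcal/mol relative to alanine).
profiles = {"A": [0.0], "L": [0.1, -1.0]}
print(self_association_ddg(-3.0, profiles, "A", "L", x=0.0))
```

With these placeholder profiles, a mutation that appears strongly stabilizing in the raw data loses most of its apparent effect once the doubled insertion term is removed, which illustrates why insertion can dominate the uncorrected landscape.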
Cells were grown in 5 ml LB overnight at 37°C. The cells were then diluted at a 1:100 ratio into 50 ml LB, grown to A600 = 0.6, harvested on ice, and washed in TBS buffer, and equal amounts of cells were re-suspended in extraction buffer (50 mM Tris pH 8 [Bio-Lab, Israel], 100 mM NaCl, 5% [w/w] sucrose and 1 mM AEBSF [Sigma-Aldrich]). The cells were then disrupted by three cycles of sonication with a Microson XL at 12 watts for 10 s with 60 s intervals. Samples were centrifuged for 15 min at 13,000 rpm to discard cell debris, and the supernatant was ultracentrifuged for 1 hr at 300,000 g using an Optima TLX with a TLA100.1 rotor to sediment the membranes. The pellet was re-suspended in 100 mM Na(HCO3)2, incubated for 15 min at 4°C, and ultracentrifuged for 1 hr at 300,000 g. Pellet and extract protein concentrations were measured with the Lowry protein assay (Peterson, 1977), and equal amounts of protein were loaded on 12.5% Tris-Glycine SDS PAGE gels. Gels were transferred to Protran nitrocellulose membranes (Whatman), incubated with mouse anti-β-lactamase antibody (Santa Cruz, Dallas) and a horseradish-peroxidase-fused rabbit anti-mouse secondary antibody, and developed using SuperSignal West Femto Maximum Sensitivity Substrate (Thermo, Waltham, MA) for enhanced chemiluminescence (ECL). The chemiluminescence signal was detected using the ChemiDoc MP System (Bio-Rad). Band densitometry was analyzed with ImageJ (Schneider et al., 2012).
For each membrane segment, we start by generating all-helical backbone-conformation fragments using the Rosetta utility fragment picker (Gront et al., 2011) and construct a C2 symmetry definition file using the Rosetta symmetry utility function (https://www.rosettacommons.org). We then use the Rosetta Fold-and-Dock application (Das et al., 2009), which samples symmetric degrees of freedom for both docking and folding of the homodimer using the RosettaMembrane energy function (Yarov-Yarovoy et al., 2006). Example files and command lines for running fragment picker and fold-and-dock are available in supplementary files.
For each position in the membrane-spanning region of the target protein, we assign one of two labels: likely mediating binding, if at least four substitutions from wild type disrupted binding by at least 2 kcal/mol (* in Figure 5a); and unlikely to mediate binding, if at least four substitutions improved or did not change binding (†). For each of the 20% lowest-energy Rosetta models, we tested whether at least two of the residues that likely mediate binding are within 5 Å of the partner monomer and whether all positions that are unlikely to mediate binding are outside a 4 Å shell. Structures that passed this filter were clustered using the Rosetta clustering application with default parameters. The clusters were visually inspected and models showing significant kinks were eliminated.
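The labeling and filtering logic can be expressed compactly. The Python sketch below is an illustration under stated assumptions, not the authors' pipeline: "did not change binding" is interpreted as a ΔΔG of at most zero (the `neutral_cutoff` parameter), each model is summarized by a hypothetical dictionary of per-position minimal distances to the partner monomer, and the extraction of those distances from Rosetta models is not shown.

```python
def label_position(ddg_values, disrupt_cutoff=2.0, neutral_cutoff=0.0,
                   min_substitutions=4):
    """Label one position from the ddG values (kcal/mol) of its substitutions."""
    disruptive = sum(1 for v in ddg_values if v >= disrupt_cutoff)
    neutral = sum(1 for v in ddg_values if v <= neutral_cutoff)
    if disruptive >= min_substitutions:
        return "likely"
    if neutral >= min_substitutions:
        return "unlikely"
    return None


def passes_filter(min_distance, likely, unlikely, contact_cutoff=5.0,
                  exclusion_cutoff=4.0, min_contacts=2):
    """Check one model against the mutational constraints.

    min_distance maps a position to its minimal distance (in Å) from the
    partner monomer in the model under consideration.
    """
    contacts = sum(1 for pos in likely if min_distance[pos] <= contact_cutoff)
    clear = all(min_distance[pos] > exclusion_cutoff for pos in unlikely)
    return contacts >= min_contacts and clear


def lowest_energy_fraction(models, fraction=0.2):
    """Return the lowest-energy fraction of (energy, model) pairs."""
    ranked = sorted(models, key=lambda pair: pair[0])
    keep = max(1, int(len(ranked) * fraction))
    return ranked[:keep]
```

In use, `lowest_energy_fraction` would first select the candidate models, and `passes_filter` would then be applied to each candidate with the position labels derived from the self-association landscape.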
Free-energy determinants of alpha-helix insertion into lipid bilayers. Biophysical Journal 70:1803–1812. https://doi.org/10.1016/S0006-3495(96)79744-8
Spatial structure of the dimeric transmembrane domain of the growth factor receptor ErbB2 presumably corresponding to the receptor active state. Journal of Biological Chemistry 283:6950–6956. https://doi.org/10.1074/jbc.M709202200
Beta-lactamase as a probe of membrane protein assembly and protein export. Molecular Microbiology 4:1637–1644. https://doi.org/10.1111/j.1365-2958.1990.tb00540.x
Mechanisms of integral membrane protein insertion and folding. Journal of Molecular Biology 427:999–1022. https://doi.org/10.1016/j.jmb.2014.09.014
Simultaneous prediction of protein folding and docking at high resolution. Proceedings of the National Academy of Sciences of the United States of America 106:18978–18983. https://doi.org/10.1073/pnas.0904407106
Macromolecular modeling with Rosetta. Annual Review of Biochemistry 77:363–382. https://doi.org/10.1146/annurev.biochem.77.062906.171838
Changes in apparent free energy of helix–helix dimerization in a biological membrane due to point mutations. Journal of Molecular Biology 371:422–434. https://doi.org/10.1016/j.jmb.2007.05.026
The stability of transmembrane helix interactions measured in a biological membrane. Journal of Molecular Biology 358:1221–1228. https://doi.org/10.1016/j.jmb.2006.02.065
A putative molecular-activation switch in the transmembrane domain of erbB2. Proceedings of the National Academy of Sciences of the United States of America 99:15937–15940. https://doi.org/10.1073/pnas.252640799
The effect of point mutations on the free energy of transmembrane α-helix dimerization. Journal of Molecular Biology 272:266–275. https://doi.org/10.1006/jmbi.1997.1236
Deep mutational scanning: a new style of protein science. Nature Methods 11:801–807. https://doi.org/10.1038/nmeth.3027
Role of aromatic side chains in the folding and thermodynamic stability of integral membrane proteins. Journal of the American Chemical Society 129:8320–8327. https://doi.org/10.1021/ja068849o
Sequence elements determining ampC promoter strength in E. coli. The EMBO Journal 1:875–881.
Protein contents in biological membranes can explain abnormal solvation of charged and polar residues. Proceedings of the National Academy of Sciences of the United States of America 106:15684–15689. https://doi.org/10.1073/pnas.0905394106
Free energy determinants of peptide association with lipid bilayers. Current Topics in Membranes 52:205–253. https://doi.org/10.1016/S1063-5823(02)52010-X
A simple method for modeling transmembrane helix oligomers. Journal of Molecular Biology 329:831–840. https://doi.org/10.1016/S0022-2836(03)00521-7
A simple method for displaying the hydropathic character of a protein. Journal of Molecular Biology 157:105–132. https://doi.org/10.1016/0022-2836(82)90515-0
Dimerisation of the glycophorin A transmembrane segment in membranes probed with the ToxR transcription activator. Journal of Molecular Biology 263:525–530. https://doi.org/10.1006/jmbi.1996.0595
Glycophorin A dimerization is driven by specific interactions between transmembrane alpha-helices. The Journal of Biological Chemistry 267:7683–7689.
Contribution of positively charged flanking residues to the insertion of transmembrane helices into the endoplasmic reticulum. Proceedings of the National Academy of Sciences of the United States of America 105:4127–4132. https://doi.org/10.1073/pnas.0711580105
Dimerization of the transmembrane domain of integrin IIb subunit in cell membranes. Journal of Biological Chemistry 279:26666–26673. https://doi.org/10.1074/jbc.M314168200
A ToxR-based dominant-negative system to investigate heterotypic transmembrane domain interactions. Proteins: Structure, Function, and Bioinformatics 65:803–807. https://doi.org/10.1002/prot.21226
A modified, dual reporter TOXCAT system for monitoring homodimerization of transmembrane segments of proteins. Biochemical and Biophysical Research Communications 339:321–324. https://doi.org/10.1016/j.bbrc.2005.11.022
Association energetics of membrane spanning α-helices. Current Opinion in Structural Biology 18:412–419. https://doi.org/10.1016/j.sbi.2008.04.007
The affinity of GXXXG motifs in transmembrane helix-helix interactions is modulated by long-range communication. Journal of Biological Chemistry 279:16591–16597. https://doi.org/10.1074/jbc.M313936200
The single transmembrane domains of ErbB receptors self-associate in cell membranes. Journal of Biological Chemistry 277:4704–4712. https://doi.org/10.1074/jbc.M108681200
Isolation of bacterial cell membranes proteins using carbonate extraction. Methods in Molecular Biology 424:397–401. https://doi.org/10.1007/978-1-60327-064-9_30
Side-chain hydrophobicity scale derived from transmembrane protein folding into lipid bilayers. Proceedings of the National Academy of Sciences of the United States of America 108:10174–10177. https://doi.org/10.1073/pnas.1103979108
The amino acid composition is different between the cytoplasmic and extracellular sides in membrane proteins. FEBS Letters 303:141–146.
Apolar surface area determines the efficiency of translocon-mediated membrane-protein integration into the endoplasmic reticulum. Proceedings of the National Academy of Sciences of the United States of America 108:E359. https://doi.org/10.1073/pnas.1100120108
Membrane protein folding and oligomerization: the two-stage model. Biochemistry 29:4031–4037. https://doi.org/10.1021/bi00469a001
Empirical correlation between hydrophobic free energy and aqueous cavity surface area. Proceedings of the National Academy of Sciences of the United States of America 71:2925–2927. https://doi.org/10.1073/pnas.71.8.2925
TOXCAT: a measure of transmembrane helix association in a biological membrane. Proceedings of the National Academy of Sciences of the United States of America 96:863–868. https://doi.org/10.1073/pnas.96.3.863
The GxxxG motif: a framework for transmembrane helix-helix association. Journal of Molecular Biology 296:911–919. https://doi.org/10.1006/jmbi.1999.3489
Ez, a depth-dependent potential for assessing the energies of insertion of amino acid side-chains into membranes: derivation and applications to determining the orientation of transmembrane and interfacial helices. Journal of Molecular Biology 366:436–448. https://doi.org/10.1016/j.jmb.2006.09.020
Folding of helical membrane proteins: the role of polar, GxxxG-like and proline motifs. Current Opinion in Structural Biology 14:465–479. https://doi.org/10.1016/j.sbi.2004.07.007
Has the code for protein translocation been broken? Trends in Biochemical Sciences 31:192–196. https://doi.org/10.1016/j.tibs.2006.02.002
L-selectin transmembrane and cytoplasmic domains are monomeric in membranes. Biochimica Et Biophysica Acta (BBA) - Biomembranes 1808:1709–1715. https://doi.org/10.1016/j.bbamem.2011.02.006
Properties of integral membrane protein structures: derivation of an implicit membrane potential. Proteins: Structure, Function, and Bioinformatics 59:252–265. https://doi.org/10.1002/prot.20334
Extracting hydrophobicity parameters from solute partition and protein mutation/unfolding experiments. Protein Engineering, Design and Selection 8:1081–1092. https://doi.org/10.1093/protein/8.11.1081
RF cloning: a restriction-free method for inserting target genes into plasmids. Journal of Biochemical and Biophysical Methods 67:67–74. https://doi.org/10.1016/j.jbbm.2005.12.008
Membrane protein folding and stability: physical principles. Annual Review of Biophysics and Biomolecular Structure 28:319–365. https://doi.org/10.1146/annurev.biophys.28.1.319
Multipass membrane protein structure prediction using Rosetta. Proteins: Structure, Function, and Bioinformatics 62:1010–1025. https://doi.org/10.1002/prot.20817
The preference of tryptophan for membrane interfaces. Biochemistry 37:14713–14718. https://doi.org/10.1021/bi980809c
Quantitative analysis of SecYEG-mediated insertion of transmembrane α-helices into the bacterial inner membrane. Journal of Molecular Biology 425:2813–2822. https://doi.org/10.1016/j.jmb.2013.04.025
Yibing Shan, Reviewing Editor; DE Shaw Research, United States
In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.
Thank you for submitting your work entitled "Mutational scanning reveals the determinants of protein insertion and association energetics in the plasma membrane" for consideration by eLife. Your article has been reviewed by two peer reviewers, and the evaluation has been overseen by Yibing Shan (Reviewing Editor) and John Kuriyan as the Senior Editor. One of the two reviewers, William DeGrado, has agreed to share his identity.
The reviewers have discussed the reviews with one another and the Reviewing editor has drafted this decision to help you prepare a revised submission.
This paper describes a mutational scanning analysis to determine the energetics of membrane insertion of transmembrane helices and helix-helix association in membrane. Much of the method is new, which uses a modified TOXCAT assay coupled with high throughput screening and deep sequencing. The results suggest a membrane hydrophobicity higher than previously estimated, energetic asymmetry for several amino acids between the periplasmic and the cytoplasmic lipid layers, and a significant difference in the membrane energetics of Arg and Lys. The helix-helix association data confirm that the GxxxG motif represents favored dimerization interfaces. These results may inform analysis and design of membrane proteins and they are of broad interest to the community.
The reviewers suggested a number of important revisions that would help strengthen this manuscript.
1) The reviewers pointed out an important caveat for the present analysis: strictly speaking, without calibrating the expression level, or the relative population of the membrane-inserted protein versus the protein in the cytosol, the results should be considered propensities rather than free energies of membrane insertion and helix-helix association. We suggest that the authors use Western blots to measure the expression levels of a small set (~10) of mutants, which would hopefully show small variance in the expression levels. In any event, this caveat should be discussed in the revised manuscript.
2) Both reviewers thought the discussion concerning TM topology is a distraction from the main focus. It would be better to remove this discussion from this manuscript and to publish it separately with further development.
3) The results reported in Figure 2b (Z-dependent energetic profiles) should be analyzed in comparison with bioinformatics results (Senes et al., 2007; Ulmschneider et al. 2005; Ulmschneider et al. 2006; Schramm et al., 2012).
4) The reviewers asked for further description of the present dsTβL system and for a more detailed comparison of the present system with the previously developed Lep system. Why the choice of β-lactamase instead of the maltose binding protein? What is the precise sequence of L-selectin used in the system? Why the high sensitivity of membrane insertion to even conservative TM mutations? The authors attribute the lower membrane hydrophobicity estimated by previous work (Hessa et al., 2007) to translocon interactions of the Lep system. Please elaborate this point, as the present system also passes through translocons to be embedded in the membrane.
Individual reviewer comments:
Reviewer #1: Sarel Fleishman et al. present a mutational scanning analysis to determine the energetics of TM helix insertion into a membrane. Using a modified TOXCAT assay coupled with high throughput screening and deep sequencing they have derived insertion profiles for each amino acid at all positions within a selected TM helix. Based on these results they conclude that (i.) membranes are more hydrophobic than anticipated based on previous findings, and (ii.) Arg and Lys differ in their membrane insertion asymmetry. Furthermore, the authors have identified an asymmetry for several amino acids and have used their data to predict the membrane topology of a set of E. coli proteins with experimentally determined TM topologies. In addition, the authors have used their insertion energetics together with association energetics to model the structure of two selected TM helix dimers.
I found the first part, i.e. the selection of amino acids and the determination of membrane insertion preferences, rather convincing, although some open questions remain, which are outlined in detail below. However, the authors mention many times throughout the manuscript that they present "the first systematic mutation analysis of insertion and self-association in the plasma membrane of live bacteria". This is simply wrong, and the entire idea is not that new. ToxR-based assays have been in use for quite a while and it is common that authors present some analysis of the proteins' expression levels. Thus, the statement that membrane insertion and association energetics have never been studied in parallel is simply wrong. In fact, in MT Duong et al. (JMB (2007) 371, 422) the authors have adjusted their measured CAT activities to the expression levels. Furthermore, in Finger et al. (JMB (2006) 358, 1221) the authors have used a system very similar to TOXCAT and they have systematically followed the changes associated with changing expression levels. Thus, I do not see the new concept here. Similarly, I found the results of the modeling not very convincing (subsection “High-precision structure modeling constrained by systematic self-association measurements”, last paragraph). Especially for small TM helix dimers there are many good predictions available, and I do not see that the current modeling results are superior.
I feel that the authors should consider concentrating on the first part of the manuscript and on the prediction of the TM topology. In the subsection “Topology prediction” they mention that another manuscript is in preparation. They might consider combining these two manuscripts, which would certainly improve the current manuscript. The analysis of the TM topology would certainly enhance the visibility of the present work.
1) In the second paragraph of the Introduction and later on: It is mentioned several times that the differences observed between the current results and previous results (Hessa et al. 2007) are due to interactions between the engineered protein and other components of the membrane, such as the translocon. However, in the present study the authors also express a fusion protein and a large part of that has to pass the translocon. While the fusion proteins analyzed in the present study certainly differ from the Lep systems, as we have a soluble domain, one could argue that the passage of the large soluble protein domain into the periplasm somehow influences the results.
Along the same lines: I do not understand the argumentation that the lower membrane hydrophobicity seen with the Lep system is due to the translocon (Discussion, second paragraph). Also in E. coli the TM segments are inserted into the membrane via the SecYEG translocon! Please clarify.
2) In the first paragraph of the subsection “dsTβL: a high-throughput assay for measuring membrane-protein energetics”: I agree that the orientation of the segment within the membrane is known due to the system set up. However, the question arises whether the authors have missed all the proteins in their analysis which have a different topology. Thus, does the entire assay really determine insertion energetic or energetics of insertion in only one defined way?
3) System set up: When the equilibrium between soluble and membrane inserted residue is somehow affected by an amino acid, thus when a residue forces (or abolishes) insertion of a TM helix into the membrane, some β-lactamase might still be expressed into the periplasm (just less or more). As far as I understand the assay, one would still select the clones, as they grow on the selective agar, and add their sequence to the analysis. How meaningful is an analysis of residues, which are only partly integrated into the membrane are also expressed as soluble versions? Why should there be a negative selection if the selection pressure allows growth of all constructs? This problem might also influence the later presented interaction analysis. In principle the authors mention and use this argument in a different context in the Discussion (second paragraph).
4) In the subsection “High-precision structure modeling constrained by systematic self-association measurements”, second paragraph: It is clear that some mutations severely affect the expression level of a given protein. While this is here presented in the context of dimerization, a reduced expression level might also affect survival on ampicillin plates. So the question arises, what is actually measured and compared? Membrane insertion propensity of an amino acid or the impact of a substitution on the expression level? Probably both.
5) When "mutations from polar residues distort the helix backbone" and these are therefore discarded, how valuable is then the analysis of any polar residue in the present analysis? Is the argument valid for all positions?
Reviewer #2: This paper describes an elegant approach to screen for helix-helix interactions in the membrane, which relies on deep-sequencing of populations that have been selected for transmembrane peptide insertion and/or dimerization as well as the unselected controls. Much of the method is new. The results are interesting and can help enable the design and analysis of membrane proteins; they should be of broad interest to the community. However, I have many suggestions to improve the clarity of the presentation, the interpretation of the results, and some of the computational methods briefly presented in the paper. Although these comments are lengthy, I do feel this is a very interesting and important paper that should be rapidly published.
The authors used an unusual membrane protein in this study to compare and contrast with earlier study based on Lep. The system is based on the single-span membrane protein, ToxR, which lacks a signal sequence. The construct contains a water-soluble N-terminal DNA-binding domain, a transmembrane helix, and a C-terminal β-lactamase (bla) as a periplasmic selectable marker. Thus, the entire system is in reverse order from the more common Type I insertion process, with no efficient signal protein to direct insertion into the membrane. The protein system studied here is also different from other bacterial Type II membrane proteins in that it has a full N-terminal domain. One strong point of this paper is that it allows comparison of this unusual insertion process with previous systems. However, the authors do not discuss these differences, and there is no discussion of whether their system is cotranslationally or post-translationally inserted, whether it requires SecA and SecB, and whether it uses the SecYEG.
The authors use an elegant series of selections to define the background population, the pool of protein sequences that insert properly, and the pool of protein sequences that both insert and dimerize. The authors point out that much of this has been accomplished previously using other methods, but not in a high-throughput manner. (Also, the Bouchner manuscript was cited in such a way that it seemed that some of the present work was presented previously, which can easily be fixed.) The authors do not comment on why MBP is removed from the starting construct and β-lactamase is used instead. Did they find growth +/- maltose to be an insufficient screen?
The constructed system would appear, perhaps serendipitously, to be very sensitive to small changes. Even an Ala to Leu change in the TM sequence can produce a large change in selective advantage, which almost certainly relates to the efficiency of insertion. Nevertheless, I am not sure whether they have shown this using a membrane prep and Western analysis of a dozen variants spanning different insertion efficiencies, which would be standard in most papers on this subject. Also, the authors do not address whether the L-selectin's TM sequence is acting as a cleaved signal peptide or whether it is retained as a TM tether for the bla domain.
The extreme sensitivity to single mutations through the entire TM sequence is unusual, as similar studies generally show only localized regions to be important. Indeed, if any conservative mutation in a TM sequence led to a significant change in fitness it would be difficult to understand how proteins evolve and show such sequence variability. I therefore became curious what sequence was actually used for L-selectin's TM. Following the gene sequence from the Methods section, and translating in all three possible reading frames, the most hydrophobic peptide sequence would be:
There are many polar amino acids in this stretch. This clearly is not the sequence that was used by Renhao Li's lab in the reference provided in the manuscript [1]. Either there is a major problem in what was intended versus what was actually cloned, or this is a typo that could have led to confusion in the literature. Either way, the correct amino acid sequence should be shown, along with the adjacent sequence N-terminal and C-terminal to the insert. If the native TM from human L-selectin was indeed cloned properly, the authors might wish to discuss how such small changes make large differences in fitness. Normally, one would expect that the sequence would need to be teetering on the edge of stability or function to be so sensitive throughout its sequence.
The authors would like to use the counts of amino acid types at a given position in the sequence observed in the control population versus that in the bla-selected pool to compute the free energy of insertion of the TM helix. There is good reason to believe that this reflects the insertion efficiency (although even this is difficult to be sure of without Western analysis for confirmation). However, it is a real stretch to equate this with true free energies. For insertion efficiency to be interpreted as free energies, the system would need to be under thermodynamic or pseudo-thermodynamic equilibrium. This was achieved in the Lep system by quantifying the two populations of inserted states, and in Karen Fleming's system by quantifying the concentration of the inserted and un-inserted populations of protein. In this case, however, we have no idea of the concentration of the un-inserted protein and whether the rate-determining step in this unusual case represents engagement with accessory proteins, insertion into the translocon, or release from the translocon. Furthermore, the concentration of soluble protein in the cell vs. the protein inserted into the membrane is not measured. I feel the authors should report their primary data as propensities or frequencies and restrict their conclusions to statements like: "if we treat these insertion frequency signatures as if they were under thermodynamic control following a Boltzmann distribution we find that…" Plots derived based on this assumption are fine, but only after appropriate discussion of the caveats.
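To make the distinction between propensities and free energies concrete, the following minimal Python sketch shows how sequencing counts might be converted to an apparent ΔΔG under the Boltzmann assumption discussed above. It is an illustration only and not necessarily the authors' exact procedure: the function name, the default temperature of 310 K, the optional pseudocount, and the example counts are all hypothetical.

```python
import math

R_KCAL_PER_MOL_K = 1.987e-3  # gas constant in kcal/(mol*K)


def apparent_ddg(sel_mut, ref_mut, sel_wt, ref_wt, temp_k=310.0, pseudocount=0.0):
    """Apparent ddG (kcal/mol) of a mutant relative to wild type.

    sel_* are read counts in the selected pool, ref_* in the unselected pool.
    The result is a free energy only if the two pools reflect populations
    under thermodynamic control, which is an assumption, not a measurement.
    Zero counts need a positive pseudocount (hypothetical choice).
    """
    enrich_mut = (sel_mut + pseudocount) / (ref_mut + pseudocount)
    enrich_wt = (sel_wt + pseudocount) / (ref_wt + pseudocount)
    # Reverse Boltzmann statistics: ddG = -RT ln(relative enrichment).
    return -R_KCAL_PER_MOL_K * temp_k * math.log(enrich_mut / enrich_wt)


# Hypothetical counts: the mutant is tenfold depleted after selection.
print(round(apparent_ddg(sel_mut=50, ref_mut=500, sel_wt=1000, ref_wt=1000), 2))
```

Reporting the enrichment ratio itself (the argument of the logarithm) keeps the primary data as a propensity; the conversion to kcal/mol is the step that carries the equilibrium assumption.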
The authors do not compare their Z-dependent position profiles for the individual amino acids to the corresponding frequency plots obtained from bioinformatics analysis of membrane protein crystal structures [2-5]. Also, it is not clear how they decide on the Z-position in the membrane for their inserted TM helix. Clearly it depends on the surrounding sequence, which was not provided. Comparing the present scale to bioinformatics scales, the distributions behave in the expected manner for hydrophobic residues, which is reassuring. Also, if the propensities from earlier papers are converted to energies using the somewhat questionable method of reverse Boltzmann statistics employed in the present paper (see supplement of ) they would appear to support the magnitude of the solvation parameter being similar to that found in Karen Fleming's analysis, supporting the analysis presented in the current paper. This is a striking finding, and one of the reasons I feel that the current paper should be accepted with revision. However, beyond this, the differences between the present scale and previous experimental or bioinformatics scales are not subtle. Tyr and Trp do not show more favorable interaction in the headgroup region than the center of the bilayer in the current analysis, even though it is seen in virtually every other experimental and bioinformatics study. Glu, Asp, Gln, Asn are essentially flat, and do not become more favorable as the depth of insertion approaches +/- 20 Å as in previous studies. If this is due to averaging and is not significant, the profiles should not be shown. The profiles for Lys and Arg are also different from expectation. ΔΔG for Arg vs. Ala is around 0 at -5 Å (very close to the center of the bilayer!) and Lys is unfavorable in the cytoplasmic headgroup region. This must be telling us something about the specific system they are studying, which should be expanded on. 
I very much doubt that these data can be used alone for making general conclusions, although when appropriately analyzed to understand differences with other systems they will be quite informative.
Figure 2c shows a nice correlation with the Moon Fleming values, but it is not clear what is plotted for dsTβL (ΔΔG at Z=0?). Figure 2d would appear to contain a small selection of the data from a very large number of potential sequences that have been obtained from the deep sequencing. What are the slope and correlation if all sequences are used or, if that is computationally difficult, how do they change after choosing 25 sequences versus 50 versus 100 sequences? What if the smoothed value of ΔΔG at Z=0 is used to compute the values? What if one plots the X-values of Figure 2c vs. the area of the sidechains in a poly-Ala helical backbone?
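The subsampling check requested here could be sketched roughly as follows. This is a hypothetical illustration in plain Python: the function names, the number of repeats, and the use of an ordinary least-squares slope with a Pearson correlation are assumptions, and the paired input values would have to come from the authors' own data.

```python
import math
import random


def slope_and_r(xs, ys):
    """Least-squares slope of y on x and the Pearson correlation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx, sxy / math.sqrt(sxx * syy)


def subsample_fits(xs, ys, sizes=(25, 50, 100), repeats=200, seed=0):
    """Mean slope and mean correlation over random subsets of each size."""
    rng = random.Random(seed)  # fixed seed so the check is reproducible
    results = {}
    for k in sizes:
        if k > len(xs):
            continue  # not enough sequences for this subset size
        fits = []
        for _ in range(repeats):
            idx = rng.sample(range(len(xs)), k)
            fits.append(slope_and_r([xs[i] for i in idx], [ys[i] for i in idx]))
        mean_slope = sum(f[0] for f in fits) / repeats
        mean_r = sum(f[1] for f in fits) / repeats
        results[k] = (mean_slope, mean_r)
    return results
```

If the slope and correlation are stable across subset sizes, the small selection shown in the figure is likely representative; a strong dependence on subset size would suggest otherwise.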
Do the data in Figure 4a teach us anything new that would not have been gleaned from Mark Lemmon's, Don Engelman's and Axel Brunger's first papers in the early 1990s or Kevin MacKenzie's and Karen Fleming's subsequent papers [6-11], or does this serve as a calibration exercise? How do the data in Figure 4b compare to the analysis of Mark Lemmon [12] and subsequent papers?
High-precision structure modeling constrained by systematic self-association measurements. In reading this section one is struck by the fact that the computational power required is many orders of magnitude greater than what was used by Axel Brunger to analyze glycophorin A two decades ago [13, 14], and yet the derived structures are not as close in RMSD to those predicted by his group prior to the structure determination of glycophorin. His method also works on asymmetric sequences. Also, the authors do not discuss methods that use mutagenesis results in a prospective way to predict structures, which, again, appear to perform better than the current method [15-17].
"Since dsTβL is the first experimental scale to report large insertion asymmetries for positively charged residues we examined whether the profiles could be used to predict membrane-protein topology directly from sequence using a benchmark comprising 607 bacterial membrane proteins of experimentally determined topology (Daley et al. 2005). The dsTβL-based predictor (see Methods) correctly assigns topology in ~70% of the cases, within the performance range of statistics-based predictors (70-80%) (Tsirigos et al. 2015). Prediction accuracy increases to 81% where the computed energy gap between inserting the protein in one orientation or the other is larger than 3kcal/mol (Figure 3b). It is encouraging that the dsTβL predictor, which is based on an experimental scale measured on a single-pass membrane protein, performs on par with methods that were fitted using experimental topology data. We are currently testing the ability of dsTβL to predict other structural properties of membrane proteins directly from sequence."
These prediction values are poorer than what is seen using analogous structural bioinformatics scales (e.g., [5]), and similar to what is seen when using hydrophobicity alone. If the differences between Lys and Arg are indeed generally applicable one would expect an improvement. In any event, this one subsection appears to be the topic of a future paper (stated by the authors in the Methods section). I am sure there will be sufficient space to properly compare their novel prediction method to existing methods in this future paper. As is, I think this small subsection detracts from what is otherwise an interesting paper.
In conclusion, this is a very interesting paper describing the first (?) application of deep-sequencing methods to the problem of helix-helix interactions in the membrane. Given the novelty and importance of the work, I have endeavored to provide some perspective and suggestions in this review. I would urge the editors to publish the paper. Even if published as is, I do think it would be a fine addition to the literature, and the issues I address would ultimately be worked out in future papers.
Finally, the authors should comment on the large difference in the length of the inserted sequences used as TM helices for glycophorin A versus ErbB2.
Glycophorin: LIIFGVMAGVIGTILI (16 residues)
ErbB2: LTSIISAVVGILLVVVLGVVFGILI (25 residues)
[1] Srinivasan, S., Deng, W., and Li, R. (2011) L-selectin transmembrane and cytoplasmic domains are monomeric in membranes, Biochim Biophys Acta 1808, 1709-1715.
[2] Senes, A., Chadi, D. C., Law, P. B., Walters, R. F., Nanda, V., and DeGrado, W. F. (2007) E(z), a depth-dependent potential for assessing the energies of insertion of amino acid side-chains into membranes: derivation and applications to determining the orientation of transmembrane and interfacial helices, J Mol Biol 366, 436-448.
[3] Ulmschneider, M. B., Sansom, M. S., and Di Nola, A. (2005) Properties of integral membrane protein structures: derivation of an implicit membrane potential, Proteins 59, 252-265.
[4] Ulmschneider, M. B., Sansom, M. S., and Di Nola, A. (2006) Evaluating tilt angles of membrane-associated helices: comparison of computational and NMR techniques, Biophys J 90, 1650-1660.
[5] Schramm, C. A., Hannigan, B. T., Donald, J. E., Keasar, C., Saven, J. G., DeGrado, W. F., and Samish, I. (2012) Knowledge-based potential for positioning membrane-associated structures and assessing residue-specific energetic contributions, Structure 20, 924-935.
[6] Lemmon, M. A., Flanagan, J. M., Treutlein, H. R., Zhang, J., and Engelman, D. M. (1992) Sequence specificity in the dimerization of transmembrane α helices, Biochemistry 31, 12719-12725.
[7] Treutlein, H. R., Lemmon, M. A., Engelman, D. M., and Brunger, A. T. (1992) The glycophorin A transmembrane domain dimer: sequence-specific propensity for a right-handed supercoil of helices, Biochemistry 31, 12726-12733.
[8] Lemmon, M. A., and Engelman, D. M. (1994) Specificity and promiscuity in membrane helix interactions, Quarterly Reviews of Biophysics 27, 157-218.
[9] Lemmon, M. A., Treutlein, H. R., Adams, P. D., Brunger, A. T., and Engelman, D. M. (1994) A dimerization motif for transmembrane α helices, Nature Structural Biology 1, 157-163.
[10] Mingarro, I., Whitley, P., Lemmon, M. A., and von Heijne, G. (1996) Ala-insertion scanning mutagenesis of the glycophorin A transmembrane helix: a rapid way to map helix-helix interactions in integral membrane proteins, Protein Sci 5, 1339-1341.
[11] MacKenzie, K. R., and Fleming, K. G. (2008) Association energetics of membrane spanning α-helices, Curr Opin Struct Biol 18, 412-419.
[12] Mendrola, J. M., Berger, M. B., King, M. C., and Lemmon, M. A. (2002) The single transmembrane domains of ErbB receptors self-associate in cell membranes, J Biol Chem 277, 4704-4712.
[13] Adams, P. D., Arkin, I. T., Engelman, D. M., and Brunger, A. T. (1995) Computational searching and mutagenesis suggest a structure for the pentameric transmembrane domain of phospholamban, Nature Structural Biology 2, 154-162.
[14] Adams, P. D., Engelman, D. M., and Brunger, A. T. (1996) Improved prediction for the structure of the dimeric transmembrane domain of glycophorin obtained through global searching, Proteins 26, 257-261.
[15] Berger, B. W., Kulp, D. W., Span, L. M., DeGrado, J. L., Billings, P. C., Senes, A., Bennett, J. S., and DeGrado, W. F. (2010) Consensus motif for integrin transmembrane helix association, Proc Natl Acad Sci U S A 107, 703-708.
[16] Metcalf, D. G., Law, P. B., and DeGrado, W. F. (2007) Mutagenesis data in the automated prediction of transmembrane helix dimers, Proteins 67, 375-384.
[17] Soto, C. S., Hannigan, B. T., and DeGrado, W. F. (2011) A photon-free approach to transmembrane protein structure determination, J Mol Biol 414, 596-610.
https://doi.org/10.7554/eLife.12125.018
The reviewers suggested a number of important revisions that would help strengthen this manuscript. 1) The reviewers pointed out an important caveat for the present analysis: strictly speaking, without calibrating the expression level, or the relative population of the membrane-inserted protein versus the protein in the cytosol, the results should be considered propensities rather than free energies of membrane insertion and helix-helix association. We suggest that the authors use Western blots to check the expression levels of a small set (~10) of mutants, which would hopefully show small variances in expression. In any event, this caveat should be discussed in the revised manuscript.
We agree with this caveat and have addressed it in several ways:
A) We adopted the argument recommended by Reviewer #2 that the free energy derivation depends on the constancy of the cytosolic amounts of protein among the different mutants (subsection “dsTβL: a high-throughput assay for measuring membrane-protein energetics”, last paragraph) and we explicitly mention now the assumption of thermodynamic control of partitioning between the different states. We further state that this assumption is supported by the agreement between the results and biophysical measurements.
B) We verified the behavior of 10 selected mutants by Western blots of membrane preparations (Figure 2—figure supplement 3). The Western blots showed that the proteins express in the membrane, and all run at the expected size, thereby excluding the possibility of cleavage. 6 mutations that target membrane-core positions and one at the amino terminus show the expected trends, including mutations that increase or decrease expression. However, 3 mutants to charges at the amino terminus increased expression levels according to Western blots, but were disruptive according to dsTβL. We propose in the revised manuscript that these results are due to the fact that ampicillin selections probe appropriate integration in the membrane, not just expression levels as in Western blots.
C) The reviewers found it surprising that the screen is so sensitive, even to mild mutations. In the first paragraph of the subsection “Systematic per-position contributions to membrane-protein insertion” we now note that the CLS amino acid sequence is quite polar, suggesting that its membrane-expression levels would be sensitive even to point mutations. To further confirm this sensitivity, for the 10 above-mentioned mutants we carried out plate-viability assays on a clone-by-clone basis, and in 9 saw the same behavior as in the deep sequencing experiments (Figure 2—figure supplement 2), including complete non-viability for some of the mutants.
2) Both reviewers thought the discussion concerning TM topology is a distraction from the main focus. It would be better to remove this discussion from this manuscript and to publish it separately with further development.
Right. We replaced this discussion with the statement: “…the results suggest that the insertion profiles could be used for sequence-based prediction of the locations and orientations of membrane-spanning proteins (A.E., J.W, et al., manuscript in preparation).”
3) The results reported in Figure 2b (Z-dependent energetic profiles) should be analyzed in comparison with bioinformatics results (Senes et al., 2007; Ulmschneider et al. 2005; Ulmschneider et al. 2006; Schramm et al.,
This now appears in Figure 4 and is discussed in the last paragraph of the subsection “Large differences and strong asymmetries in insertion of positively charged residues”. Briefly, we note some similarities but also significant differences with the dsTβL profiles. Most importantly, the statistics-based profiles are quite flat by comparison and show minor contributions for hydrophobicity or the positive-inside rule. We discuss these differences with respect to the fact that membrane-protein statistics reflect functional constraints rather than pure energetics. Reviewer #2 noted that our profiles for Tyr and Trp did not match the expectation that these residues are favored at the membrane-water interfaces. We agree with this point and discuss it in the last paragraph of the subsection “Systematic per-position contributions to membrane-protein insertion”, suggesting that these observations may reflect differences between single and multi-span membrane proteins. We also summarized caveats regarding the insertion scales in the Discussion, third paragraph, which we hope will clarify these points and spur future research.
4) The reviewers asked for further description of the present dsTβL system and for a more detailed comparison of the present system with the previously developed Lep system. Why the choice of β-lactamase instead of the maltose binding protein? What is the precise sequence of L-selectin used in the system? Why the high sensitivity of membrane insertion to even conservative TM mutations? The authors attribute the lower membrane hydrophobicity estimated by previous work (Hessa et al., 2007) to translocon interactions of the Lep system. Please elaborate this point, as the present system also passes through translocons to be embedded in the membrane.
A) We have elaborated our description of the TβL construct (subsection “dsTβL: a high-throughput assay for measuring membrane-protein energetics”). Briefly, there were two reasons for using β-lactamase instead of MBP: 1. Viability on ampicillin is linearly related to β-lactamase expression levels whereas MBP on maltose is not (see refs. provided in the paper); and 2. The TOXCAT-MBP construct requires working with an MBP-null E. coli strain (MM39), which is not amenable to high-throughput genetic transformations, whereas with β-lactamase we could work with the E. cloni high-transformation efficiency strain.
B) As mentioned above in our response 1C, we verified the high sensitivity on a clone-by-clone basis. The reason for this high sensitivity is that the CLS amino acid sequence is quite polar.
C) Regarding our comparison to the Lep system: we have revised our argument in the Discussion. Briefly, the Lep measurements quantified proteins that were either singly or doubly glycosylated, but did not quantify total membrane-protein levels. The observations that the Lep measurements are rank-ordered as in our measurements (and also in published biophysical experiments on an outer-membrane protein), but are fourfold smaller in magnitude, as well as the observation that the atomic-solvation parameter inferred from Lep is smaller by roughly fourfold compared to both dsTβL and previous biophysical work are consistent with a view that Lep was measuring only part of the insertion equilibrium. We have removed the argument that the differences are due to interactions with the translocon.
D) Reviewer #2 was correct in identifying a mistake in the DNA sequence we provided for CLS in the original supplement, which was of the CLS reverse-complement. We corrected this mistake and apologize for the error. The amino acid sequence of CLS is noted at the bottom of Figure 2a.
In addition to the changes above, we corrected, clarified, and reanalyzed our data following the individual reviewer comments, including those that were not mentioned in the summary. With respect to the self-association data we now compare to the Duong et al. data on GpA (Figure 5b), finding high correspondence between their data and ours. We also revised this section to highlight the new findings in dsTβL, including positions that were not tested in the Lemmon et al. and Duong et al. studies but are sensitive to mutation, and the few mutations that promote self-association; without a comprehensive analysis one would be unlikely to identify the very few beneficial mutations. We further shifted the emphasis from structure prediction to the insertion-association coupling by shortening the structure-prediction paragraph and changing the last sentence in the Abstract and the sub-title of the relevant section. We feel that the self-association section is an important and integral part of this paper because it shows that a single assay can easily provide systematic data on the two primary components of membrane-protein energetics: insertion and association. Furthermore, the fact that we used the insertion data derived from CLS to subtract the insertion contribution in two unrelated systems (GpA and ErbB2) confirms the general usefulness of the insertion scales and should simplify future studies that analyze mutational effects on association or function.
https://doi.org/10.7554/eLife.12125.019
- Assaf Elazar
- Jonathan Weinstein
- Yearit Fridman
- Sarel Jacob Fleishman
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
We thank Olga Khersonsky, Dror Baran, Adina Weinberger, Jens Meiler, and Zohar Mukamel for advice, and Gunnar von Heijne, Nir Ben-Tal, Ingemar Andre, Julia Koehler-Leman, and Dan Tawfik for critical reading. Ilan Samish and William DeGrado provided advice for analyzing knowledge-based insertion energies. The research was supported by the Minerva Foundation with funding from the Federal German Ministry for Education and Research, a European Research Council’s Starter’s Grant, an individual grant from the Israel Science Foundation (ISF), the ISF’s Center for Research Excellence in Structural Cell Biology, career development awards from the Human Frontier Science Program and the Marie Curie Reintegration Grant, an Alon Fellowship, and a charitable donation from Sam Switzer and family.
- Yibing Shan, DE Shaw Research, United States
© 2016, Elazar et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.