Native proline-rich motifs exploit sequence context to target actin-remodeling Ena/VASP protein ENAH

Abstract

The human proteome is replete with short linear motifs (SLiMs) of four to six residues that are critical for protein-protein interactions, yet the importance of the sequence surrounding such motifs is underexplored. We devised a proteomic screen to examine the influence of SLiM sequence context on protein-protein interactions. Focusing on the EVH1 domain of human ENAH, an actin regulator that is highly expressed in invasive cancers, we screened 36-residue proteome-derived peptides and discovered new interaction partners of ENAH and diverse mechanisms by which context influences binding. A pocket on the ENAH EVH1 domain that has diverged from other Ena/VASP paralogs recognizes extended SLiMs and favors motif-flanking proline residues. Many high-affinity ENAH binders that contain two proline-rich SLiMs use a noncanonical site on the EVH1 domain for binding and display a thermodynamic signature consistent with the two-motif chain engaging a single domain. We also found that photoreceptor cilium actin regulator (PCARE) uses an extended 23-residue region to obtain a higher affinity than any known ENAH EVH1-binding motif. Our screen provides a way to uncover the effects of proteomic context on motif-mediated binding, revealing diverse mechanisms of control over EVH1 interactions and establishing that SLiMs can’t be fully understood outside of their native context.

Editor's evaluation

The manuscript uses a new screen called MassTitr to display long (36-mer) peptides derived from the human proteome and to screen for peptides that can bind the EVH1 domain of the ENAH protein. About 100 peptides were identified, and further analysis identified sequence features that contribute to binding of the EVH1 domain, including an additional proline after the FP4 motif and a double FP4 motif. This paper will be of broad interest in the field of proteomics and to scientists interested in how biological interactions achieve specificity.

https://doi.org/10.7554/eLife.70680.sa0

Introduction

Interactions between modular interaction domains and short linear motifs (SLiMs) direct a broad range of intracellular functions, from protein trafficking to substrate targeting for post-translational modifications. To faithfully propagate signals, SLiMs must recognize the correct interaction partners within the cellular environment. But how interaction specificity is achieved is enigmatic. SLiMs, which occur as 3–10 consecutive amino acids in intrinsically disordered regions of proteins, are degenerate and have low complexity, meaning they are defined by just a few key residues or motif features. Crystal structures of SH3, WW, and PDZ domains bound to SLiMs typically reveal three to six residues docked into a shallow groove (Lim et al., 1994; Macias et al., 1996; Schultz et al., 1998). The expansion of modular interaction domain families in metazoan proteomes has led to hundreds of domains that share overlapping SLiM-binding specificity profiles yet carry out distinct functions in the cell (Bhattacharyya et al., 2006). How high-fidelity interactions are maintained between low complexity SLiMs and cognate recognition domains remains poorly understood for many pathways.

Most SLiM research has centered around defining the ‘core SLiM’, or the minimal set of amino acids sufficient to bind to a given domain. High-throughput approaches, such as phage display using libraries of 7–16-residue peptides (Ivarsson et al., 2014; Teyra et al., 2017; Tonikian et al., 2008; Davey et al., 2017), have been instrumental for advancing our understanding. But these assays do not probe how the sequences surrounding core SLiMs affect their interactions, and there is increasing evidence that the surrounding sequence critically influences SLiM interaction affinity and specificity (Palopoli et al., 2018; Prestel et al., 2019; Stein and Aloy, 2008). For example, an alpha-helical extension C-terminal to a SLiM in ankyrin-G confers high-affinity and selective interactions with the GABARAP subfamily of Atg8 proteins by making contacts with the GABARAP interface (Li et al., 2018b). The presence of aromatic residues directly flanking a SLiM in Drebrin prevents its interaction with Homer, demonstrating that SLiM sequence context can also disfavor protein-protein interactions (Li et al., 2019).

The actin interactome contains many proline-rich SLiMs and many proline-binding modules such as SH3, WW, and EVH1 domains that participate in regulating actin dynamics (Holt and Koffer, 2001). Although the extent to which these domains cross-react or bind selectively in the cell is unknown, sequence elements surrounding linear, proline-rich motifs could play an essential role in directing specific interactions. Therefore, we sought to uncover the impact of sequence context on SLiM-mediated interactions with the EVH1 domain of the actin-regulating Ena/VASP protein ENAH.

Ena/VASP proteins form a family of cytoskeletal remodeling factors that are recruited to different regions of the cell by binding proline-rich SLiMs via their N-terminal EVH1 domains and promoting actin polymerization via their C-terminal EVH2 domains. The family is implicated in many cellular functions such as axon guidance and cell adhesion (McConnell et al., 2016; Scott et al., 2006). The Ena/VASP EVH1 domain recognizes the SLiM [FWYL]PXΦP, where X is any amino acid and Φ is any hydrophobic residue (Ball et al., 2000). This motif, referred to in this paper as the FP4 motif, because FPPPP is a common example, adopts a polyproline type II (PPII) helix structure and binds weakly to the EVH1 domain (Prehoda et al., 1999). Searching for this core FP4 motif in the human proteome yields 4,994 instances. This number of potential interaction partners is very large, and although spatial, structural, and temporal context impose additional determinants for cellular interaction (Bugge et al., 2020), the abundant motif matches raise the question of whether sequence elements beyond the FP4 SLiM affect molecular recognition.
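As an illustration of the motif search described above, the following minimal sketch scans a peptide for FP4 matches. It assumes the hydrophobic set [FWYLIAVP] for the Φ position, as given by the regular expression quoted in the Figure 1C legend; the function name `find_fp4_motifs` is hypothetical, and this is not necessarily the code used to obtain the proteome-wide count.

```python
import re

# FP4 pattern: [FWYL] P X Phi P, where X is any residue.
# Assumption: Phi = [FWYLIAVP], following the expression in the Figure 1C legend.
# The zero-width lookahead lets overlapping matches be reported.
FP4_PATTERN = re.compile(r"(?=([FWYL]P.[FWYLIAVP]P))")


def find_fp4_motifs(sequence):
    """Return (0-based start, matched residues) for every FP4 match."""
    return [(m.start(), m.group(1)) for m in FP4_PATTERN.finditer(sequence.upper())]


# NHSL1 36-mer peptide sequence from Table 1
nhsl1 = "ADRSPFLPPPPPVTDCSQGSPLPHSPVFPPPPPEAL"
print(find_fp4_motifs(nhsl1))
```

For the NHSL1 36-mer this reports two matches, LPPPP and FPPPP, in line with its description as a dual-motif peptide.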

We used a new screening approach to uncover examples of how the sequence context surrounding the core Ena/VASP FP4 SLiM affects binding specificity in the proteome. Our unbiased screening method, MassTitr, identified 36-residue human proteome-derived peptides that bind to the ENAH EVH1 domain with a range of affinities. To our knowledge, this is the first use of a high-throughput screening method to systematically discover and characterize both local and distal sequence elements that impact SLiMs. By analyzing features of high-affinity binders, we identified distinct ways in which sequence elements surrounding proteomic FP4 SLiMs impact binding affinity and specificity for ENAH. Our work provides insight into how selective interactions are maintained in proline-rich motif-mediated signaling networks and highlights the importance of considering sequence context when investigating SLiM-mediated interactions. Our pipeline serves as a blueprint to map and predict how the sequence context surrounding SLiMs impacts protein-protein interactions on a proteome-wide scale.

Results

MassTitr identifies ENAH EVH1 domain-binding peptides from the human proteome

To identify ENAH EVH1 binders in the human proteome, we applied a screen called MassTitr. MassTitr is a SORT-SEQ method that is based on fluorescence-activated cell sorting (FACS) of a library of peptide-displaying bacteria and subsequent deconvolution of signals by deep sequencing. As shown in Figure 1A and B, peptide-displaying Escherichia coli cells are sorted into bins according to their binding signals across a range of protein concentrations, and the binding signal for each peptide at each protein concentration is extracted by deep sequencing each bin. Two advantages of this method over phage display are that MassTitr supports screening of long peptides and leads to the identification of binders with a broad range of affinities. MassTitr is similar in concept to yeast-surface-display based methods that have been applied to study the interactions of anti-fluorescein scFvs and SARS-CoV-2 receptor binding domain mutants (Adams et al., 2016; Starr et al., 2020).
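One way to picture how per-gate sequencing reads could be turned into a concentration-dependent binding signal is a read-weighted mean gate index for each clone. The sketch below is an assumption made for illustration and is not the authors' analysis pipeline; the function name and the read counts are invented, and the normalization for sequencing depth and for the number of cells sorted into each gate that a real analysis would need is omitted.

```python
def mean_gate_per_concentration(read_counts):
    """Read-weighted mean gate index for one clone at each concentration.

    read_counts maps a protein concentration to a list of read counts,
    one per sorting gate, ordered from the lowest-signal gate (gate 1) upward.
    """
    curve = {}
    for conc, counts in read_counts.items():
        total = sum(counts)
        if total == 0:
            curve[conc] = None  # clone not observed at this concentration
            continue
        curve[conc] = sum(gate * n for gate, n in enumerate(counts, start=1)) / total
    return curve


# Invented toy counts for a single clone: four gates at three concentrations (µM)
toy_counts = {0.1: [90, 10, 0, 0], 1.0: [20, 50, 30, 0], 10.0: [0, 10, 30, 60]}
toy_curve = mean_gate_per_concentration(toy_counts)
print(toy_curve)
```

In this toy case the mean gate rises with concentration, which is the qualitative behavior expected of a binder.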

Figure 1 (with 4 supplements)

MassTitr screening identifies biologically relevant ENAH EVH1 ligands.

(A) At left, bacterial surface display schematic. Library peptides flanked by a FLAG tag and a c-Myc tag were expressed as fusions to the C-terminus of eCPX on the surface of *E. coli*. Cells were labeled with anti-FLAG-APC to quantify expression and then incubated with tetrameric ENAH EVH1 domain, which was detected by streptavidin conjugated to phycoerythrin (SAV-PE). At right, a FACS plot for surface-displayed ActA peptide binding to ENAH EVH1 tetramer (10 µM monomer concentration). (B) MassTitr schematic. The top row represents a library of three clones (blue, purple, and green) sorted into four gates at three concentrations of ENAH. The rows highlighted in blue, purple, and green illustrate reconstructions of the concentration-dependent binding of each clone based on deep sequencing. The experiment in this paper sorted a pre-enriched library of clones into four gates at eight concentrations. (C) Distribution of MassTitr hits after filtering; 68 peptides contained a canonical FP4 motif matching the regular expression [FWYL]PX[FWYLIAVP]P. (D) Frequency plot made from sequences that match the FP4 motif in the human proteome, the input library, and the MassTitr binders using Weblogo (Crooks et al., 2004). (E) Subcellular locations where at least two MassTitr hits that are predicted to be disordered and localized in the cytoplasm are annotated to reside. White text denotes previously reported Ena/VASP interactions. (F) IP and western blot showing interaction of ENAH with MassTitr hits FHOD1 and IFT52 in cells. GFP-tagged ENAH and FLAG-tagged candidate interactors were overexpressed in cells and resulting lysate was precipitated with anti-FLAG antibody and then blotted with anti-FLAG and anti-GFP.

Using MassTitr, we screened a library of 416,611 36-mer peptides with seven-residue overlaps (the T7-pep library) (Larman et al., 2011). This library spans the entire protein-coding space of the human genome, and we hypothesized that the long lengths of the encoded peptides would illuminate the impact of the sequence surrounding the FP4 motif in a biologically relevant sequence space. We first prescreened the library for binding to an ENAH EVH1 domain that was tetramerized by fusion to the endogenous ENAH coiled coil, as shown in Figure 1A, generating an input library enriched in binders. We then ran MassTitr on the prescreened library, using eight concentrations of ENAH EVH1 tetramer (Figure 1B). After sorting, sequencing, and filtering based on read counts, 108 unique high-confidence binders were identified and classified as either high-affinity or low-affinity as described in the methods (Figure 1C, Supplementary file 1). Of the 108 hits, 14 may have bound to ENAH EVH1 because a library synthesis error introduced a motif that is not present in the human protein; we did not analyze these sequences further.
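The tiling geometry of such a library can be sketched as follows: with 36-residue tiles that share seven residues, consecutive tiles start 29 residues apart. The function name is hypothetical, and the handling of the C-terminal end (anchoring a final tile at the last 36 residues) is an assumption, since the library design rules are not detailed here.

```python
def tile_protein(sequence, length=36, overlap=7):
    """Split a protein into fixed-length tiles whose neighbors share `overlap` residues."""
    if overlap >= length:
        raise ValueError("overlap must be smaller than length")
    if len(sequence) <= length:
        return [sequence]
    step = length - overlap  # 29 residues for 36-mers with 7-residue overlaps
    starts = list(range(0, len(sequence) - length + 1, step))
    # Assumption: add a final tile anchored at the C-terminus so no residue is dropped.
    if starts[-1] + length < len(sequence):
        starts.append(len(sequence) - length)
    return [sequence[s:s + length] for s in starts]


# Synthetic 100-residue sequence used only to exercise the function
demo_sequence = "ACDEFGHIKLMNPQRSTVWY" * 5
demo_tiles = tile_protein(demo_sequence)
print(len(demo_tiles), [len(t) for t in demo_tiles])
```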

We validated the binding of 16 MassTitr peptide hits to monomeric ENAH EVH1 domain by using biolayer interferometry (BLI) to determine dissociation constants that ranged from 0.18 μM to 63 μM (Supplementary file 2). Except for SHROOM3 and TENM1 peptides, binders classified as high-affinity by MassTitr bound to the ENAH EVH1 domain more tightly than peptides classified as low-affinity. Many newly identified peptides bound with affinities similar to or tighter than a well-studied control peptide from Listeria monocytogenes protein ActA, which bound with K_D = 4.9 μM in our BLI assay (Supplementary file 2). Prior to this work, this single FP4-motif-containing sequence from ActA was the tightest known endogenously derived binder of Ena/VASP EVH1 domains (Ball et al., 2000). The highest affinity peptide that we discovered was from photoreceptor cilium actin regulator (PCARE) (K_D = 0.18 μM for 36-residue peptide PCARE^813-848; Supplementary file 2), which contains the FP4 motif LPPPP. Successive truncations of this peptide identified the 23-residue minimal region for high-affinity binding, which extends 14 residues beyond the FP4 motif (PCARE^826-848 K_D = 0.32 μM, Figure 1—figure supplement 1).

Although the majority of MassTitr hits contained FP4 motifs, 40 out of the 108 high-confidence hits did not (Figure 1C and D, Supplementary file 1). Most of the non-canonical binders contained a CXC motif. Although disulfide bond formation between CXC-containing peptides and ENAH may contribute to signal in the cell-surface display assay, we confirmed that this motif, and not just the presence of one or more cysteine residues, was important for the binding of a peptide from OLIG3 to ENAH in the presence of 2 mM DTT (Figure 1—figure supplement 2). Also, a CXC-containing peptide from TRIM1 bound reversibly to the ENAH EVH1 domain at mid-micromolar concentrations (Figure 1—figure supplement 2). Peptides from KIAA1522 and TJAP1 bound to ENAH and lacked either an FP4 or a CXC motif (Supplementary file 2). Our results, therefore, add to increasing evidence that the ENAH EVH1 domain can bind sequences beyond the FP4 motif (Boëda et al., 2007; Chen et al., 2014; Menon et al., 2015).

MassTitr peptides are associated with and expand the ENAH signaling network

To highlight putative biologically relevant interaction partners of ENAH, we applied a bioinformatic analysis to identify those motifs that are likely to be accessible and co-localized with ENAH. We filtered our high-confidence hits by disorder propensity (IUPred2A > 0.4) (Mészáros et al., 2018) and cytoplasmic subcellular localization (Binns et al., 2009; Thul et al., 2017). This resulted in 34 peptides that mapped to 33 unique proteins, of which 13 were previously known to interact or co-localize with an Ena/VASP protein (Supplementary file 1). The Ena/VASP binding sites of 10 of these hits have been previously mapped. Therefore, MassTitr provided new information about the EVH1-binding sites of 23 novel or previously known interaction partners of the Ena/VASP family. Filtered hits were highly enriched in GO biological process terms including actin filament organization (FDR < 10^–6) and positive regulation of cytoskeleton (FDR < 0.05) (Mi et al., 2019), which align with documented cellular functions of ENAH. Notably, we also identified proteins localized to the Golgi body and cilia, where Ena/VASP function is not well characterized (Figure 1E; Kannan et al., 2014; Tang et al., 2016).

We tested whether putative new interaction partners, expressed as full-length proteins, bound to ENAH in mammalian cells. We overexpressed GFP-tagged full-length ENAH with FLAG-tagged FHOD1, IFT52, or TJAP1 and used an anti-FLAG antibody to precipitate complexes from the cell lysate. Probing with anti-FLAG and anti-GFP antibodies showed robust immunoprecipitation (IP) of GFP-ENAH relative to cells expressing GFP-ENAH and a FLAG-tag-only negative control protein (Figure 1F, Figure 1—figure supplement 3).

A proline-rich C-terminal flank binds to a novel site on the EVH1 domain to enhance affinity in ENAH interaction partners

We used MassTitr data to identify FP4 SLiM-flanking elements that enhance binding to the ENAH EVH1 domain. A sequence logo made of the high-confidence MassTitr hits shows enrichment of prolines C-terminal to the FP4 motif, and a binomial test confirms that peptides containing FP4 motifs followed by three consecutive prolines are enriched in our hit list (p < 10^–11; Figure 1D). A peptide from ENAH interactor ABI1 (Chen et al., 2014; Tani et al., 2003) was among the highest affinity ligands that we validated by BLI, with K_D = 2.6 μM (Supplementary file 2). ABI1 contains an FP4 motif followed by four prolines. Mutating FPPPPPPPP (FP₈) to FPPPPSSSS in the context of the ABI1 36-mer reduced affinity by approximately fourfold (p < 0.05; Supplementary file 3). Although this confirms that the C-terminal prolines enhance affinity, peptide FPPPPPPPP alone binds to the ENAH EVH1 domain with K_D = 28 μM, indicating that additional interactions contribute to the high affinity of the ABI1 36-mer. Previous studies have shown that acidic residues N- and C-terminal to the FP4 motif can enhance affinity (Niebuhr et al., 1997) and we hypothesized that positively charged patches on ENAH could bind acidic residues that flank the FP₈ segment in ABI1 (Figure 2—figure supplement 1). Indeed, truncating the N-terminal or C-terminal acidic flanks of the 36-residue ABI1 peptide further decreased affinity (Supplementary file 3).
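The logic of a binomial enrichment test of this kind can be sketched with the standard library alone. The sketch assumes a one-sided test in which the background frequency of the feature is taken from the input library; the counts and the background frequency below are invented placeholders, not values from this study, and the function name is hypothetical.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): the one-sided enrichment p-value."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical inputs, for illustration only:
background_frequency = 0.05  # fraction of input-library FP4 peptides with the feature
n_hits = 68                  # number of FP4-containing hits considered
n_hits_with_feature = 20     # hits that carry the feature

p_value = binomial_upper_tail(n_hits_with_feature, n_hits, background_frequency)
print(f"one-sided p = {p_value:.2e}")
```

A small p-value indicates that the feature appears among the hits more often than expected if hits were drawn at random from the input library.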

We solved a crystal structure of the ENAH EVH1-ABI1 peptide complex at 1.88 Å resolution. Only the FP₈ region was fully resolved in the structure under the crystallization conditions, which included high salt and low pH (Figure 2A). The peptide folds into a PPII helix with prolines 1, 4, and 7 (⁰FPPPPPPPP⁸) contacting the EVH1 surface (Figure 2A and B). The FP4 portion of the peptide binds the canonical FP4 groove, as observed in other structures (Prehoda et al., 1999; Fedorov et al., 1999), whereas the 7th proline docks into a previously uncharacterized site on ENAH composed of Ala12, Gly92, Phe32, and the aliphatic part of the side chain of Asn90 (Figure 2C). We note that residue 12 has diverged in EVL and VASP EVH1 domains (Figure 2C), which may contribute to the weaker binding of ABI1 to those paralogs (Supplementary file 4), although we have not isolated the affinity difference to this specific change. Notably, a similar binding site at the analogous location is used by the Homer EVH1 domain to bind the phenylalanine of PPXXF motifs (Beneken et al., 2000). However, this site is relatively shallow in ENAH, and modeling large aromatic amino acids at this position on the ABI1 peptide using PyMOL leads to severe steric clashes. Homer contains a smaller Gly89 at the site of Asn90 in ENAH and can accommodate the bulky Phe of the PPXXF motif (Figure 2C).

Figure 2 (with 1 supplement)

Prolines C-terminal to FP4 can engage a novel ENAH binding site.

(A) Surface representation of the ENAH EVH1 domain bound to FP₈. The core FP4 motif is light blue, the P₄ flank is orange; insets show details of the interactions. (B) Axial view of a polyproline type II helix highlighting three-fold symmetry (left); a side view shows P1, P4, and P7 facing the same side (right). (C) At left, surface representation of the HOMER1 EVH1 domain bound to TPPSPF (PDB 1DDV, peptide in red) aligned to the ENAH EVH1 domain bound to peptide FP₈ (peptide in light blue/orange). The region corresponding to the Pro7 binding pocket in HOMER1 is colored in green. Inset: magnified views of the Pro7 binding pocket in ENAH and the analogous pocket in HOMER1. The table compares residues in this pocket for HOMER1, ENAH, VASP, and EVL.

Distal sequence elements enhance ENAH EVH1 binding through bivalent interactions

Another enriched feature of MassTitr-identified binders, relative to the pre-screened input library, is the presence of multiple FP4 motifs (binomial test, p < 10^–22). Multi-motif hits highlighted preferred spacings of approximately five or 15 residues between FP4 motifs (Figure 3A). Multiple motifs were also enriched in MassTitr high-affinity hits relative to all hits (p < 0.02), supporting the hypothesis that multiple FP4 motifs enhance affinity. We confirmed this experimentally by showing that binding was changed significantly (p < 0.05), with a 2.5- to sixfold reduction in affinity, when 36-mer sequences from LPP and NHSL1, which contain two FP4 motifs, were truncated to leave only one motif (Table 1, Figure 3B).

Figure 3 (with 1 supplement)

Multiple FP4 motifs enhance peptide binding affinity.

(A) Spacing of FP4 motifs in the input library and in high-confidence hits. (B) Fold change increase in K_D for truncated single-motif peptide variants relative to higher affinity 36-mer dual-motif library peptides for LPP and NHSL1; see Table 1 for sequences. (C) Fold change increase in K_D for 36-mer peptides binding to ENAH EVH1 R47A relative to tighter binding ENAH EVH1 WT. (D) ITC binding curves for 36-residue peptides from ActA, LPP, and NHSL1. (E) The entropic and enthalpic contributions to binding determined using data in panel D. Fold-change errors in (B) and (C) were calculated by propagating the error from two affinity measurements. Sequences for peptides referenced in this figure are given in Table 1 and Supplementary file 6.

Figure 3—source data 1. Raw data for Figure 3B and C: https://cdn.elifesciences.org/articles/70680/elife-70680-fig3-data1-v1.xlsx

Table 1

Affinities of dual FP4 motif peptides and their variants for ENAH EVH1 WT or ENAH EVH1 R47A obtained using biolayer interferometry.

Name‡	Sequence	WT K_D (μM)	R47A K_D (μM)
NHSL1*	ADRSPFLPPPPPVTDCSQGSPLPHSPVFPPPPPEAL	9.7 ± 2.5	51.5 ± 10.0
NHSL1 FP4 1*	ADRSPFLPPPPPVTDCSQGSPLPHSPV	45.9 ± 5.5	93.0 ± 16.0
NHSL1 FP4 2*	PVTDCSQGSPLPHSPVFPPPPPEAL	24.9 ± 1.2	53.0 ± 4.1
NHSL1 Duplicated*	ADRSPFLPPPPPVTDCSQGSPLPHSPVPVTDCSQGSPLPHSPVFPPPPPEAL	18.6 ± 0.2	65.0 ± 7.0
LPP*	KQPGGEGDFLPPPPPPLDDSSALPSISGNFPPPPPL	4.7 ± 2.4	60.1 ± 6.7
LPP FP4 1*	KQPGGEGDFLPPPPPPLDDSSALPSISGN	13.9 ± 2.5	61.5 ± 0.6
LPP FP4 2†	PPLDDSSALPSISGNFPPPPPL	29.6 ± 2.3	67.2 ± 14.9
LPP Duplicated*	KQPGGEGDFLPPPPPPLDDSSALPSISGNDDSSALPSISGNFPPPPPL	7.9 ± 2.2	53.3 ± 2.5

* Difference between WT and R47A K_D values is significant with p < 0.01.
† Difference between WT and R47A K_D values is significant with p < 0.05.
‡ Errors are standard deviations over three replicates.

Table 1—source data 1. Raw data for Table 1: https://cdn.elifesciences.org/articles/70680/elife-70680-table1-data1-v1.xlsx

Zyxin, which contains four clustered FP4 motifs, has been shown to bind to the VASP EVH1 domain by contacting both the canonical FP4 site and a noncanonical site on the opposite side of the EVH1 domain (Acevedo et al., 2017). Interestingly, a crystal structure of the ENAH EVH1 domain bound to a single-FP4-motif peptide at the canonical site also contains a second peptide bound to the region corresponding to the noncanonical binding site in VASP (PDB 5NC7, Barone et al., 2020). To test whether multi-FP4-motif peptides engage this noncanonical site, we designed ENAH EVH1 R47A. In the ActA peptide-bound structure, ENAH Arg47 forms a bidentate hydrogen bond with a carbonyl on the peptide PPII helix backbone in the back-side site. The analogous VASP Arg48 exhibits significant NMR HSQC chemical shifts upon titration with a multi-FP4 motif zyxin peptide (Acevedo et al., 2017). Thus, we predicted that the R47A mutation would disrupt back-site binding. Indeed, while the affinities of single-FP4-motif peptides from ActA and PCARE were minimally affected by this mutation, peptides from zyxin, LPP, and NHSL1 that contain two FP4 motifs showed a 5–15-fold reduction in affinity for ENAH upon mutating Arg47 to Ala (p < 0.01) (Table 1, Figure 3C, Supplementary file 5).
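The fold changes quoted above can be reproduced from the K_D values in Table 1. The sketch below assumes standard first-order propagation of independent errors for a ratio; the exact propagation formula used for Figure 3 is not stated in the text, so this is one reasonable choice and not necessarily the authors' calculation. The function name is hypothetical.

```python
from math import sqrt


def fold_change_with_error(kd_ref, sd_ref, kd_test, sd_test):
    """Return (fold, error) for kd_test / kd_ref.

    Assumes independent errors: relative errors add in quadrature for a ratio.
    """
    fold = kd_test / kd_ref
    relative_error = sqrt((sd_ref / kd_ref) ** 2 + (sd_test / kd_test) ** 2)
    return fold, fold * relative_error


# K_D values (µM) for the 36-mer peptides in Table 1: WT, then R47A
nhsl1_fold, nhsl1_err = fold_change_with_error(9.7, 2.5, 51.5, 10.0)
lpp_fold, lpp_err = fold_change_with_error(4.7, 2.4, 60.1, 6.7)
print(f"NHSL1: {nhsl1_fold:.1f} +/- {nhsl1_err:.1f}")
print(f"LPP:   {lpp_fold:.1f} +/- {lpp_err:.1f}")
```

Under these assumptions the R47A mutation weakens binding roughly 5.3-fold for NHSL1 and 12.8-fold for LPP, within the 5–15-fold range stated above.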

Next, we investigated the stoichiometry and thermodynamics of multi-motif peptide binding to a monomeric ENAH EVH1 domain. Using isothermal titration calorimetry (ITC), we confirmed that dual-FP4 motif peptides from LPP and NHSL1 and the single-FP4 motif peptide from ActA fit well to a 1:1 binding model (Figure 3D). Interestingly, the ITC analysis showed that binding of the ActA-derived single-FP4 motif peptide was driven by favorable entropy, whereas binding of the NHSL1 and LPP dual-motif peptides was enthalpically driven. ActA, LPP, and NHSL1 peptides have similar binding free energies, but the entropic contribution to the dual-motif interactions is ~10-fold less favorable (Figure 3E). These data are consistent with a model in which long, disordered dual-motif peptides pay an entropic penalty to wrap around the EVH1 domain and engage two sites but gain enthalpic binding energy from additional interactions. Duplication of the linkers between the two motifs in LPP and NHSL1 led to a very modest, approximately twofold reduction in affinity for the ENAH EVH1 domain compared to the WT LPP and NHSL1 peptides (p < 0.05 for NHSL1 duplicated, n.s. for LPP duplicated; Table 1). This is consistent with a less favorable conformational entropy of binding when the motifs are separated by a longer linker (Table 1). However, our data do not establish whether the two motifs bind the same way when the linker is duplicated. We observed that the interaction of truncated dual-motif peptides, which contain only a single motif plus the surrounding linker sequence, is weakened by mutation R47A (FP4 1 and FP4 2 peptides in Table 1). This suggests that the linker residues themselves may be able to make favorable interactions with the back-side site.
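The relationship between affinity and the enthalpic and entropic terms discussed here follows from ΔG = RT ln K_D and ΔG = ΔH − TΔS. The sketch below converts a K_D into a binding free energy and splits it once ΔH is known. The temperature of 25 °C and the 1 M standard state are assumptions, the function names are hypothetical, and no ΔH values are supplied because they are reported only graphically (Figure 3E).

```python
from math import log

R_KCAL = 1.98720e-3  # gas constant in kcal/(mol*K)


def kd_to_delta_g(kd_molar, temp_k=298.15):
    """Binding free energy (kcal/mol) from K_D in mol/L; assumes a 1 M standard state."""
    return R_KCAL * temp_k * log(kd_molar)


def entropic_term(delta_g, delta_h):
    """Return -T*dS (kcal/mol) from dG and a measured dH, using dG = dH - T*dS."""
    return delta_g - delta_h


# ActA 36-mer: K_D = 4.9 µM measured by BLI (an ITC-derived K_D may differ)
dg_acta = kd_to_delta_g(4.9e-6)
print(f"dG(ActA) = {dg_acta:.2f} kcal/mol")
```

At the assumed temperature this gives roughly −7.2 kcal/mol for the ActA peptide. Because the K_D values of the three peptides differ by only a few fold, their free energies differ by well under 1 kcal/mol, which is why the differences in binding appear mainly in how ΔG is divided between ΔH and −TΔS.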

Finally, we examined the minimal motif-spacing requirements for bivalent binding. We used Rosetta to build a peptide chain to connect single FP4-motif peptides bound to the canonical and noncanonical sites of the ENAH EVH1 domain. There are two orientations of the chain that preserve the directionalities of the bound FP4 peptides observed in structure 5NC7. Ten residues were required to span the two motifs in orientation 1, whereas nine residues were sufficient in orientation 2 (Figure 3—figure supplement 1). This indicates that the ~15-residue spacing that was enriched in our hits is more than enough to span the two binding sites and implies that the chain is not taut between these two sites.
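To relate the modeled minimum linker length to actual peptides, the gap between consecutive FP4 matches can be counted directly from sequence. This sketch assumes that spacing means the number of residues between the end of one five-residue FP4 match and the start of the next, and that Φ = [FWYLIAVP] as in the Figure 1C legend; the precise definition of spacing used for Figure 3A may differ, and the names below are hypothetical.

```python
import re

FP4 = re.compile(r"(?=([FWYL]P.[FWYLIAVP]P))")
MOTIF_LENGTH = 5
MIN_MODELED_LINKER = 9  # shortest modeled linker (orientation 2) reported above


def motif_gaps(sequence):
    """Residues between consecutive FP4 matches; negative values mean the matches overlap."""
    starts = [m.start() for m in FP4.finditer(sequence)]
    return [nxt - (cur + MOTIF_LENGTH) for cur, nxt in zip(starts, starts[1:])]


# Dual-motif 36-mers from Table 1
peptides = {
    "NHSL1": "ADRSPFLPPPPPVTDCSQGSPLPHSPVFPPPPPEAL",
    "LPP": "KQPGGEGDFLPPPPPPLDDSSALPSISGNFPPPPPL",
}
for name, seq in peptides.items():
    gaps = motif_gaps(seq)
    long_enough = all(g >= MIN_MODELED_LINKER for g in gaps)
    print(name, gaps, "meets modeled minimum" if long_enough else "shorter than modeled minimum")
```

With this counting convention the two motifs are separated by 16 residues in NHSL1 and 15 in LPP, both longer than the nine-residue minimum from the model.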

Discussion

In recent years, phage display screening of peptides derived from the human proteome has been used to define SLiM specificity profiles and predict novel interaction partners (Davey et al., 2017; Ueki et al., 2019; Jespersen et al., 2019; Wigington et al., 2020); these studies have primarily focused on defining the “core SLiM”. In this work, we used MassTitr to screen more than 400,000 36-residue segments of the human proteome against the cytoskeleton regulator ENAH. Analysis of the hits readily identified both local and distal sequence features up to 15 residues away from the core FP4 SLiM that are important for binding. Our study highlights ways in which low-information SLiMs exploit sequence context to selectively recognize modular interaction domains within the proteome, especially in the context of proline-rich signaling networks where over 300 SH3 domains, 80 WW domains, and 20 EVH1 domains coexist to drive signal transduction in humans (Zarrinpar et al., 2003).

We found multiple ways that sequence flanking the FP4 motif can modulate binding to ENAH. We first demonstrated that prolines C-terminal to FP4 motifs can enhance binding by contacting a previously uncharacterized hydrophobic patch on ENAH. Both secondary structure and sequence are key to this binding mode, which positions the 7th proline of a ⁰FPPPPPPPP⁸ peptide to contact ENAH in a shallow groove that we refer to as the Pro7 binding pocket. The relatively flat surface of the EVH1 domain in this region limits the binding energy available from favorable contacts, but PPII helix preorganization presumably minimizes the entropic cost of binding. We anticipate that this binding mode is widely exploited by cellular interaction partners of ENAH. Multiple previously annotated Ena/VASP interactors, including proteins identified in our screen such as FBLIM1, ZYX, and LPP (Zhang et al., 2006; Drees et al., 2000; Petit et al., 2000) contain FP4 motifs followed by trailing prolines, with either a leucine or proline in the 7th position (⁰FPPPPPPPP⁸).

As shown in Figure 2C, the Homer EVH1 domain uses a site structurally analogous to the ENAH Pro7 binding pocket to accommodate Phe in the SLiM PPXXF (Beneken et al., 2000). Thus, part of the core binding site in the Homer EVH1 domain is used by the ENAH EVH1 domain as a secondary affinity-enhancing site. Interestingly, the Ena/VASP family members VASP and EVL have polar Thr or Ser residues in place of Ala12 in this pocket. Both of these proteins bind ABI1 ~5-fold less tightly than ENAH does (Supplementary file 4). The unique hydrophobic pocket on ENAH EVH1 provides a striking example of how a peripheral site that has diverged among homologous domains can engage motif-flanking sequence, which may provide a mechanism for increasing molecular recognition specificity.

Many known Ena/VASP partners contain multiple FP4 motifs (Hansen and Mullins, 2015) and dual-motif peptides were prevalent among our high-affinity hits, consistent with the multiple motifs enhancing binding. Analysis of our multi-FP4 MassTitr hits showed preferential spacing of ~5 or ~15 residues between FP4 motifs in a single chain. For peptides with a motif spacing of ~15 residues, our data and previous work support a model of bivalent binding, where multi-FP4 motif peptides can engage two sites on a single EVH1 domain (Acevedo et al., 2017; Barone et al., 2020). A 1:1 stoichiometry is supported by ITC for dual-motif peptides LPP and NHSL1 binding to a monomeric EVH1 domain, in an interaction that is weakened by disruption of the noncanonical site by mutation R47A. Doubling the linker length between motifs weakened binding slightly, consistent with alteration of the effective concentration of a bivalent interaction (Table 1).

We speculate that diverse sequences, particularly those with the propensity to adopt a PPII helix conformation, could make favorable contacts with the back-side site when present at high effective concentration due to the binding of a primary FP4 motif at the canonical site. In support of this, we saw that the peptides from LPP and NHSL1 that lacked a second FP4 motif but contained at least one proline residue ~10 residues away from a single-FP4 motif bound at least twofold more tightly to wild-type ENAH EVH1 than to ENAH EVH1 R47A (Table 1). Our data suggest that either a second FP4 motif or linker residues can make favorable interactions with the noncanonical site.

FP4 motifs separated by five residues probably do not bind simultaneously to a single EVH1 domain, as structural modeling suggests that the minimum chain length required to span the two putative binding sites is nine residues. In such cases, it may be that two EVH1 domains bind to two closely spaced motifs (see one possible model in Figure 2—figure supplement 1B). Another possibility is that clustered FP4 motifs separated by only a few residues bind using mechanisms such as allovalency, where the increased effective concentration of multiple FP4 motifs close together enhances affinity (Levchenko, 2003).

The critical noncanonical site residues for binding FP4 motifs, including ENAH Arg47 and VASP Tyr38 (Acevedo et al., 2017), are conserved across ENAH, VASP, and EVL, suggesting that bivalent binding is a general mechanism to increase molecular recognition specificity for the Ena/VASP family. However, there is also some evidence that this binding mode could provide paralog specificity, as the linker region connecting multiple FP4 motifs could contact regions on the EVH1 domain that differ across the Ena/VASP paralogs. In support of this, we found that a dual-FP4-motif peptide from LPP bound ~7-fold more tightly to the ENAH EVH1 domain than to the EVL EVH1 domain (p < 0.01; Supplementary file 4).

Finally, we identified a peptide derived from PCARE that binds to ENAH with the highest known affinity of any SLiM (K_D = 0.18 μM). Truncation experiments indicated that the 14 residues C-terminal to the LPPPP motif in PCARE are critical for its high affinity, hinting that extensive contacts between this region and the ENAH EVH1 domain could be responsible for the enhanced binding. Our subsequent work revealed the surprising structural basis for this affinity (Hwang et al., 2021).

Filtering MassTitr hits for interactions of most probable biological significance, based on localization and disorder, yielded peptides from 33 putative binding partners. Nineteen proteins from this list have not, to our knowledge, been reported to associate with Ena/VASP proteins and provide avenues for further investigation. Some of the binding partners that we discovered lack a match to the canonical FP4 motif. The segment from TJAP1 that gave a hit in our screen and was confirmed to bind to ENAH in IP experiments does not contain any recognizable FP4 motif, yet it is proline-rich and binds to the ENAH EVH1 domain with a K_D of 23 μM by BLI (Supplementary file 2). We also confirmed that a region from KIAA1522 that lacks an FP4 motif but includes 10 proline residues in the 36-mer peptide binds to the ENAH EVH1 domain (K_D = 14 μM; Supplementary file 2). KIAA1522 potentiates metastasis in esophageal carcinoma and breast cancer cells (Xie et al., 2017; Li et al., 2018a), potentially linking ENAH and KIAA1522 in tumor progression. Our results imply that while the sequence context surrounding FP4 motifs can significantly impact their affinity and specificity for the Ena/VASP family, noncanonical motifs also contribute to the Ena/VASP interactome.

We were able to verify interactions of ENAH with FHOD1, IFT52, and TJAP1 by co-IP assay in mammalian cells (Figure 1F, Figure 1—figure supplement 3). TJAP1 is primarily localized to the trans-Golgi complex and is thought to help maintain Golgi body structure (Tamaki et al., 2012). ENAH has been shown to regulate Golgi architecture in Drosophila photoreceptors and to play a role in maintaining Golgi structure in human cells via its interaction with GRASP65 (Kannan et al., 2014; Tang et al., 2016). However, the role of Ena/VASP proteins in regulating functions of the Golgi body is largely unexplored, positioning TJAP1 as a promising lead to further explore the role of Ena/VASP proteins in the Golgi body.

FHOD1 is one of several formin proteins (FHOD1, FHDC1, FMN2) that were identified as putative interactors in our screen. Like Ena/VASP proteins, formins also promote unbranched actin polymerization. There is evidence that the two families cooperate in regulating filopodial protrusions (Barzik et al., 2014), although the mechanistic basis behind this interaction is not well understood. Our hits are potential leads to further investigate the intersection between formins and Ena/VASP proteins in fine-tuning filopodial formation and dynamics.

IFT52 is part of the intraflagellar transport B complex (IFT-B) and is critical for the assembly of cilia and flagella. Mutations in the IFT-B complex are associated with several ciliopathies. IFT52 has been linked to short-rib thoracic dysplasia and retinal ciliopathies (Chen et al., 2018). To date, the role of Ena/VASP proteins in cilia has not been well characterized, although PCARE, a cilia-associated protein primarily found in the outer segment of photoreceptor cells, has been reported to associate with ENAH through tandem affinity purification mass spectrometry (Corral-Serrano et al., 2020). PCARE was also a hit from our MassTitr screen, pointing to a potential, as-yet unexplored role for Ena/VASP proteins in cilia.

Conclusion

For many protein domains beyond EVH1, degenerate SLiMs have been cataloged in the Eukaryotic Linear Motif (ELM) database to describe their interaction preferences (Kumar et al., 2022). The ELM listing implies that there is a relatively simple recognition code for many key domain interactions. However, the short sequences of most SLiMs are likely insufficient for biological specificity in many or most cases. Here we showed that defining the EVH1 binding motif as [FWYL]PXΦP is an oversimplification: by systematically examining the role of flanking sequences for just one EVH1 domain, we readily uncovered numerous examples in which binding is modulated by additional residues outside the motif. Added to prior reports from investigations of individual interactions (Stein and Aloy, 2008; Li et al., 2018a; Aitio et al., 2010), our work highlights the importance of sequence context for SLiM behavior by illustrating specific mechanisms, including an unusual conformational specificity mechanism that is documented in our companion paper (Hwang et al., 2021). MassTitr provides a versatile experimental platform for uncovering context effects on domain-peptide interactions and can lead to similar insights into the recognition strategies of other domains.

Materials and methods

Key resources table

Reagent type (species) or resource	Designation	Source or reference	Identifiers	Additional information
Strain, strain background (Escherichia coli)	DH5a	NEB	Cat# 2987 H	Chemically competent cells
Strain, strain background (Escherichia coli)	BL21(DE3)	Novagen	Cat# 71400	Chemically competent cells
Peptide, recombinant protein	Streptavidin, R-Phycoerythrin Conjugate (SAPE)	Thermo Fisher	Cat# S866	(1:100)
Antibody	SureLight Allophycocyanin-anti-FLAG antibody (mouse monoclonal)	Perkin Elmer	Cat# AD0059F	(1:100)
Antibody	Anti-FLAG (mouse monoclonal)	ProteinTech Group	Cat# 66008–3, RRID:AB_2749837	(5 µg per mg of protein)
Antibody	Anti-FLAG (rabbit polyclonal)	ProteinTech Group	Cat# 66002–1, RRID:AB_11232216	(1:1000)
Antibody	Anti-GFP (mouse monoclonal)	ProteinTech Group	Cat# 66002–1, RRID:AB_11182611	(1:1000)
Antibody	Anti-Mouse IgG Alexa Fluor 680 (goat polyclonal)	ThermoFisher Scientific	Cat# A21057	(1:20,000)
Antibody	Anti-Rabbit IgG Alexa Fluor 790 (goat polyclonal)	ThermoFisher Scientific	Cat# A11367	(1:20,000)
Cell line (Homo sapiens)	HEK293T	ATCC

MassTitr screening identifies biologically relevant ENAH EVH1 ligands.

Prolines C-terminal to FP4 can engage a novel ENAH binding site.

Multiple FP4 motifs enhance peptide binding affinity.

Figure 3—source data 1

Affinities of dual FP4 motif peptides and their variants for ENAH EVH1 WT or ENAH EVH1 R47A obtained using biolayer interferometry.

Table 1—source data 1

Author details

Theresa Hwang
Sara S Parker
Samantha M Hill
Robert A Grant
Meucci W Ilunga
Venkatesh Sivaraman
Ghassan Mouneimne
Amy E Keating (for correspondence)
