High-throughput profiling of sequence recognition by tyrosine kinases and SH2 domains using bacterial peptide display
Abstract
Tyrosine kinases and SH2 (phosphotyrosine recognition) domains have binding specificities that depend on the amino acid sequence surrounding the target (phospho)tyrosine residue. Although the preferred recognition motifs of many kinases and SH2 domains are known, we lack a quantitative description of sequence specificity that could guide predictions about signaling pathways or be used to design sequences for biomedical applications. Here, we present a platform that combines genetically encoded peptide libraries and deep sequencing to profile sequence recognition by tyrosine kinases and SH2 domains. We screened several tyrosine kinases against a million-peptide random library and used the resulting profiles to design high-activity sequences. We also screened several kinases against a library containing thousands of human proteome-derived peptides and their naturally occurring variants. These screens recapitulated independently measured phosphorylation rates and revealed hundreds of phosphosite-proximal mutations that impact phosphosite recognition by tyrosine kinases. We extended this platform to the analysis of SH2 domains and showed that screens could predict relative binding affinities. Finally, we expanded our method to assess the impact of non-canonical and post-translationally modified amino acids on sequence recognition. This specificity profiling platform will shed new light on phosphotyrosine signaling and could readily be adapted to other protein modification/recognition domains.
Data availability
All of the processed data from the high-throughput specificity screens are provided as source data files. The raw fastq and fasta sequencing files are available as a Dryad repository (DOI: 10.5061/dryad.0zpc86727). Custom code used to process/analyze screening data can be found in a GitHub repository, as specified in the manuscript.
Data from: High-throughput profiling of sequence recognition by tyrosine kinases and SH2 domains using bacterial peptide display. Dryad Digital Repository, doi:10.5061/dryad.0zpc86727.
Article and author information
Funding
National Institute of General Medical Sciences (R35GM138014)
- Neel H Shah
Damon Runyon Cancer Research Foundation (DFS 31-18)
- Neel H Shah
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Copyright
© 2023, Li et al.
This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.