Abstract
Deep Mutational Scanning (DMS) is an emerging method to systematically test the functional consequences of thousands of sequence changes to a protein target in a single experiment. Because of its utility in interpreting both human variant effects and protein structure-function relationships, it holds substantial promise to improve drug discovery and clinical development. However, applications in this domain require improved experimental and analytical methods. To address this need, we report novel DMS methods to precisely and quantitatively interrogate disease-relevant mechanisms, map protein-ligand interactions, and assess predicted response to drug treatment. Using these methods, we performed a DMS of the melanocortin-4 receptor (MC4R), a G protein-coupled receptor (GPCR) implicated in obesity and an active target of drug development efforts. We assessed the effects of >6,600 single amino acid substitutions on MC4R’s function across 18 distinct experimental conditions, resulting in >20 million unique measurements. From this, we identified variants that have unique effects on MC4R-mediated Gαs- and Gαq-signaling pathways, which could be used to design drugs that selectively bias MC4R’s activity. We also identified pathogenic variants that are likely amenable to a corrector therapy. Finally, we functionally characterized structural relationships that distinguish the binding of peptide versus small molecule ligands, which could guide compound optimization. Collectively, these results demonstrate that DMS is a powerful method to empower drug discovery and development.
Introduction
Deep Mutational Scanning (DMS) employs cutting-edge synthetic biology or genome editing methods, DNA synthesis, and sequencing to systematically assess the effect of every possible single amino acid substitution on the function of a protein target (Araya and Fowler, 2011; Fowler and Fields, 2014; Starita et al., 2017). Researchers have leveraged DMS to gauge a variety of protein functions or their cellular consequences, including viability (Findlay et al., 2014), protein abundance (Faure et al., 2022; Matreyek et al., 2018), transcriptional signaling (Jones et al., 2020), and inter- and intramolecular interactions (Braberg et al., 2022; Faure et al., 2022). DMS assays are increasingly used for human variant interpretation (Weile and Roth, 2018) and to elucidate the relationship between protein structure and function, including in the evaluation (Brandes et al., 2023) and fine-tuning of protein language models (Lafita et al., 2024).
While DMS has significantly advanced our understanding of protein function, its potential in drug discovery and development has yet to be fully realized. Applications in this realm require methods that are more disease-relevant, sensitive, and quantitative to measure the subtle effects of both sequence variants and experimental conditions (e.g., drug treatments). For example, DMS assays often measure effects like viability, which are several biological layers removed from the specific molecular mechanisms modulated by drugs. Additionally, the current signal-to-noise ratios of many assays and challenges around uncertainty quantification make drawing quantitative conclusions from DMS data difficult. Moving from categorical (e.g., benign vs. pathogenic) classifications toward precise quantitative measurements of human variants could improve predictions of safety and efficacy at early stages of drug development programs (Plenge et al., 2013). It would also expand the use of matching patients’ medications to their specific genetic variants (i.e., theratyping), which could lead to better patient outcomes (McDonald et al., 2024). Finally, DMS data contain the functional consequences of thousands of different biochemical perturbations. Sufficiently sensitive assays against functional readouts and improved analysis methods would readily complement structure-based approaches by elucidating the functional consequences of ligand binding. Building on this, it should be possible to identify novel protein-ligand interactions that could be reverse-engineered to increase compound potency, further expanding the potential impact of DMS on drug discovery.
In a previous work, we described DMS methods to measure the effects of thousands of single amino acid variants on the function of the beta-2 adrenergic receptor (β2AR) (Jones et al., 2020), a member of the G-protein-coupled receptor (GPCR) class that is the most commonly targeted protein family in drug development (Sriram and Insel, 2018). Here, we build upon this previous work to demonstrate more sensitive and robust DMS methods for drug discovery and development, focusing on the melanocortin-4 receptor (MC4R). MC4R is a GPCR, and human variants that result in partial or complete loss of its function cause the most common form of inherited obesity [OMIM #618406] (Farooqi et al., 2003; Hinney et al., 2003; Vaisse et al., 2000). Variants that increase MC4R activity are protective against obesity (Lotta et al., 2019; Paisdzior et al., 2020), and numerous small molecule and peptide agonists of MC4R have been tested as potential therapeutics (Chen et al., 2015; Clément et al., 2020; Collet et al., 2017; Greenfield et al., 2009; Hinney et al., 2022; Huang and Tao, 2014; Kievit et al., 2013; Sweeney et al., 2023).
In this study, we developed substantially improved experimental and analytical methods for DMS that are capable of detecting subtle quantitative effects of variants and differences between experimental conditions with a high degree of statistical rigor. We then tested the effects of nearly all possible single amino acid variants of MC4R (6,633 of 6,640) on two distinct GPCR signaling functions under a variety of treatment conditions. From this, we generated a high-resolution map of how MC4R’s structure relates to function, and we accurately classified the quantitative effects of human variants. Additionally, we identified amino acid changes that differentially impact (i.e., bias) MC4R’s different GPCR signaling functions, pinpointed human variants that are amenable to a specific class of therapy, and elucidated the functional impact of protein-ligand interactions between MC4R and both peptide and small molecule ligands. This demonstrates the utility of DMS for various drug discovery and development applications, and the methods described herein should be broadly applicable to GPCRs and other drug target classes that function in transcriptional signaling pathways.
Results
Development of highly quantitative deep mutational scanning methods
Assays for disease-relevant mechanisms
A critical tool in drug discovery programs is a set of highly sensitive assays that directly interrogate the function(s) of a protein target and against which potential therapies can be tested (Hughes et al., 2011). This is essential for ensuring that compounds are modulating a specific mechanism underlying disease and not having undesired “off-target” effects. Stimulation of MC4R with its agonist, alpha-melanocyte-stimulating hormone (α-MSH), results in signaling through multiple canonical GPCR pathways, including Gαs-coupled cyclic adenosine monophosphate signaling (hereafter referred to as Gs) and Gαq-coupled calcium signaling (hereafter referred to as Gq) (Tao, 2010). Therefore, we first developed multiplexed reporter assays for these two critical MC4R G-protein signaling functions (Fig. 1, Supplementary Fig. 1), building on our earlier work performing a deep mutational scan of β2AR (Jones et al., 2020). Both reporters were designed for use in human HEK293T cells and to be compatible with our previously described DMS library construction methods (Jones et al., 2020). Briefly, these methods harness high-throughput DNA synthesis to construct every possible single amino acid variant, and each variant is then linked to a transcriptional reporter containing an oligonucleotide sequence barcode unique to that variant. Reporter constructs are then integrated into cells using a site-specific recombination-based landing pad system and drug selection to ensure that each cell contains a single variant-barcode combination. Activation of the receptor turns on a response element for the signaling pathway, leading to the expression of the barcoded reporter, which is then quantified using RNA sequencing.
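The barcode bookkeeping described above can be illustrated with a minimal Python sketch. It assumes that a barcode-to-variant map is already known from library construction and that reporter reads have been reduced to barcode strings. The function name, the map, and the toy reads are all hypothetical and are not taken from the authors' pipeline.

```python
from collections import Counter, defaultdict

def count_barcodes(reads, barcode_to_variant):
    """Tally reporter reads per barcode, grouped by the variant each barcode tags.

    reads: iterable of barcode strings seen in RNA-seq (hypothetical input).
    barcode_to_variant: dict mapping barcode -> variant label (hypothetical).
    Reads whose barcode is absent from the map are ignored.
    """
    per_barcode = Counter(bc for bc in reads if bc in barcode_to_variant)
    per_variant = defaultdict(dict)
    for bc, n in per_barcode.items():
        per_variant[barcode_to_variant[bc]][bc] = n
    return dict(per_variant)

# Toy data only: two wild-type barcodes, one variant barcode, one unmapped read.
barcode_map = {"AAAC": "WT", "GGTA": "WT", "CCGT": "V103I"}
reads = ["AAAC", "AAAC", "GGTA", "CCGT", "CCGT", "CCGT", "TTTT"]
counts = count_barcodes(reads, barcode_map)
```

Keeping counts at the barcode level, rather than summing per variant, preserves the independent measurements that a replicate-aware model can use.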
The MC4R Gs assay was adapted and further optimized from the cAMP response element-based reporter we previously used for β2AR (Supplementary Fig. 1A,C). For Gq signaling, an analogous reporter using an NFAT response element alone to activate reporter gene expression was not suitable due to weak signal-to-noise. To solve this problem, we incorporated a “relay” system to amplify the reporter signal using a synthetic transcription factor composed of Gal4 fused to the VP64-p65-Rta (VPR) transcriptional activator (Chavez et al., 2015) (Fig. 1, Supplementary Fig. 1B, see Methods for more details). The resulting Gal4-VPR transcription factor was placed under the control of the NFAT response element, and the reporter gene was placed under the control of a UAS element that responds to Gal4 binding. This assay design resulted in robust reporter expression upon stimulation of MC4R (Supplementary Fig. 1C). Together, these assays provide sensitive functional readouts for two important MC4R signaling activities implicated in obesity and other phenotypes (Fatima et al., 2022; Sweeney et al., 2023), and we used them to build DMS libraries to assess the effects of all possible single amino acid substitutions in MC4R.
Analysis model for statistically robust comparisons
To explore the many hypotheses that arise in drug discovery applications, it is valuable to assay a DMS library using experimental replication and under a variety of conditions, such as different drugs and/or pathways of interest. However, most popular methods for DMS analysis do not leverage DNA barcodes or other experimental replicate information, nor do they support hypothesis testing between conditions (Faure et al., 2020; Rubin et al., 2017). Consequently, we developed an alternative modeling framework that uses barcode-level replicate information and supports hypothesis testing between conditions (Fig. 1, Supplementary Fig. 2A, see Methods: Negative Binomial Regression Analysis Pipeline for additional details). Briefly, borrowing from approaches for inferring differential expression from RNA-seq data (Ahlmann-Eltze and Huber, 2021; Love et al., 2014; McCarthy et al., 2012), we applied a mixed-effects negative binomial generalized linear model (GLM) directly to raw barcode counts. The model contains a random effect across barcodes to share barcode information between replicates and conditions, and incorporates sample-specific offsets to account for technical covariates like sequencing depth, as is common for RNA-seq (Robinson and Oshlack, 2010). For each variant, we estimate the mean shift in barcode count and associated standard error for each treatment condition, relative to wild-type. Using per-condition summary statistics, we either directly test whether each variant barcode mean is significantly different from wild-type (zero) in each treatment, or we can define more complex linear contrasts on variant effects across multiple treatments.
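One plausible way to write down a model of this kind is sketched here. This is an illustration under stated assumptions, not the authors' specification. The index names, the log link, the single shared dispersion, and the form of the contrast statistic are all assumptions, and the Methods remain the authoritative description.

```latex
% Sketch only: indices, log link, and NB parameterization are assumptions.
% y_{b,s}: count of barcode b in sample s;  N_s: sample-specific offset (e.g., depth)
% c(s): condition of sample s;  v(b): variant tagged by barcode b
% \beta_{v,c}: shift relative to wild-type (\beta_{\mathrm{WT},c} = 0);  u_b: barcode random effect
\begin{aligned}
y_{b,s} &\sim \mathrm{NB}\!\left(\mu_{b,s},\ \phi\right), \\
\log \mu_{b,s} &= \log N_s + \alpha_{c(s)} + \beta_{v(b),\,c(s)} + u_b,
  \qquad u_b \sim \mathcal{N}(0, \sigma^2), \\
z_{v,c} &= \frac{\hat\beta_{v,c}}{\mathrm{SE}(\hat\beta_{v,c})},
  \qquad
z_{v}^{\text{contrast}} = \frac{\sum_{c} w_c\, \hat\beta_{v,c}}{\sqrt{w^{\top} \hat\Sigma_v\, w}} .
\end{aligned}
```

Here the weights $w_c$ define a linear contrast across conditions (for example, $+1$ for a treated condition and $-1$ for its untreated counterpart), and $\hat\Sigma_v$ denotes the estimated covariance of the per-condition effects for variant $v$.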
Increasing power through barcoding
As described above, our assay design and analysis framework harness DNA barcodes that are uniquely associated with a particular variant and provide multiple independent measures of a variant’s effect. In our previous DMS of β2AR, the median number of barcodes independently linked to each variant was ∼10 (Jones et al., 2020), and we reasoned that increasing this number would increase the power to detect functional effects. To this end, we optimized and scaled our library cloning and cellular integration protocols to target ∼30 barcodes per variant in building DMS libraries for MC4R. As expected, this increased the power to detect variant effects. For example, the separation between the activity of alleles that are clearly deleterious (i.e., a stop codon at any position in the protein) and all other alleles (i.e., wild-type or missense variants) was drastically increased in the MC4R Gs assay relative to the same assay for β2AR (Supplementary Fig. 2B). To further test the effect of the number of barcodes for a given variant, we computationally down-sampled the barcodes for representative positions of MC4R and ran the resulting data through our analysis pipeline. As expected, by increasing the number of barcodes per variant, the magnitude of the standard error of the estimated variant effect decreases in a manner consistent with increased sample size (Supplementary Fig. 2C). This confirms that increasing the number of barcodes per variant enables the quantification of subtle differences within a standard hypothesis testing framework, and provides an experimental parameter that one can vary to improve the power to detect the effects of sequence variants of particular interest.
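The down-sampling logic can be sketched in a few lines of Python. The sketch replaces the full analysis pipeline with a plain standard error of the mean over simulated barcode-level effects, so it only illustrates the expected 1/sqrt(n) scaling. The simulated values, function names, and draw counts are assumptions.

```python
import random
import statistics

def standard_error(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return statistics.stdev(values) / len(values) ** 0.5

def mean_downsampled_se(values, n_barcodes, n_draws=200, seed=0):
    """Average SE over repeated random subsets of n_barcodes barcode-level values."""
    rng = random.Random(seed)
    draws = [standard_error(rng.sample(values, n_barcodes)) for _ in range(n_draws)]
    return statistics.mean(draws)

# Simulated barcode-level effects for one variant (toy data, not from the study).
sim = random.Random(1)
barcode_effects = [sim.gauss(-0.8, 0.5) for _ in range(60)]

se_10 = mean_downsampled_se(barcode_effects, 10)  # roughly the earlier barcode depth
se_30 = mean_downsampled_se(barcode_effects, 30)  # roughly the MC4R target depth
```

Under this simplification, tripling the number of barcodes should shrink the standard error by a factor close to the square root of three.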
Comprehensive deep mutational scanning of MC4R
With these methods in hand, we carried out a comprehensive assessment of the effects of all single amino acid substitutions (including nonsense variants) on MC4R’s Gs and Gq signaling activities under a variety of experimental conditions (Fig. 2A, Supplementary Table 1, Supplementary Figs. 3-4). We selected experimental conditions that would inform aspects of drug discovery and development programs, such as elucidating protein structure-function relationships, identifying regions of the protein that bias activity towards or away from a specific function, classifying the effects of human variants in the presence and absence of potential therapies, and uncovering functional differences in protein-ligand interactions. In total, we tested 18 unique conditions, each performed in quadruplicate, including: basal activity (i.e., no stimulation) of MC4R, stimulation of MC4R with a range of doses of the native peptide agonist alpha-melanocyte-stimulating hormone (α-MSH), stimulation with a range of doses of a small molecule agonist (THIQ) (Sebhat et al., 2002), treatment with a small molecule corrector (Ipsen 17) (Poitout et al., 2007; Wang et al., 2014), and library composition normalization controls (forskolin, see Methods). Our resulting DMS assays had near-complete variant coverage, with 99.9% (6,633/6,640) of all possible single amino acid substitutions present in all experimental conditions. Each variant was represented by an average of 56 and 28 barcodes for the Gs and Gq signaling pathways, respectively. Between both assays, this translates to more than 557,000 uniquely engineered human cells, each containing a distinctive variant-reporter-barcode combination. When factoring in the number of experimental conditions (18 unique), replicates (four per condition), amino acid variants tested (99.9% of 6,640 possible), and the mean barcodes per variant (56 and 28 for Gs and Gq, respectively), this equates to >21,500,000 measurements across all datasets.
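The coverage and cell-count figures quoted above can be checked with simple arithmetic, shown here as a short Python sketch. The breakdown of 6,640 into 332 positions times 20 alternatives (19 substitutions plus a stop codon) is an inference rather than something stated in the text. The total measurement count is not recomputed, because it depends on how the 18 conditions are split between the two assays (Supplementary Table 1).

```python
# Figures quoted in the text.
possible_variants = 6640
observed_variants = 6633
mean_barcodes = {"Gs": 56, "Gq": 28}

# Inference (not stated in the text): 332 positions x 20 alternatives per position.
assert 332 * 20 == possible_variants

coverage_pct = 100 * observed_variants / possible_variants
# One engineered cell line per variant-barcode pair, summed over both assays.
engineered_cells = observed_variants * sum(mean_barcodes.values())

print(f"coverage: {coverage_pct:.1f}%")                           # coverage: 99.9%
print(f"variant-barcode combinations: {engineered_cells:,}")      # 557,172
```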
Multiple lines of evidence support the high quality and utility of these data for classification of variant effects (Fig. 2A-F, Supplementary Figs. 3-5). Focusing on one dataset as a representative example (Gs signaling using a low dose of α-MSH stimulation), variants that introduce stop codons or fall within transmembrane domains and buried surfaces disproportionately lead to significant loss of MC4R function (Fig. 2A-C). The results also correlate well with expectations from human genetics data and variant effect prediction algorithms (Fig. 2C-D, Supplementary Fig. 5). For example, the majority of human MC4R variants classified as pathogenic or likely pathogenic in ClinVar (Landrum et al., 2014) lead to a significant reduction of Gs signaling under low α-MSH stimulation conditions (significance threshold: false discovery rate (FDR) < 1%; Fig. 2C). Variants that are significantly loss-of-function in this condition are rarer in the human population, and more common human variants have no significant effect on MC4R function (significance threshold: FDR < 1%; Fig. 2D). Loss-of-function variants by our DMS assay are also typically predicted to be deleterious by commonly used variant effect predictors like AlphaMissense (Cheng et al., 2023) and popEVE (Orenbuch et al., 2023) (Supplementary Fig. 5).
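The significance calls above rely on a false discovery rate threshold. The text does not say which FDR procedure was used, so the following Python sketch uses the Benjamini-Hochberg step-up rule purely as an assumed, commonly used example. The p-values are toy numbers.

```python
def benjamini_hochberg(p_values, fdr=0.01):
    """Return booleans marking which p-values pass the BH step-up rule at the given FDR."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k such that p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    significant = [False] * m
    for idx in order[:cutoff_rank]:
        significant[idx] = True
    return significant

# Toy p-values for five variants (hypothetical), tested at a 5% FDR for readability.
toy_p = [0.6, 0.001, 0.039, 0.008, 0.041]
calls = benjamini_hochberg(toy_p, fdr=0.05)
```

In this toy case only the two smallest p-values pass, even though four of the five are below 0.05 individually.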
Because of the sensitivity of our reporter system and the statistical power gained by testing dozens of unique barcodes per variant, we anticipated that these assays would capture subtle quantitative, rather than just qualitative, effects on MC4R function. To assess this, we benchmarked our results against previous quantitative characterizations of MC4R variants from the literature (Fig. 2E-F). For example, many MC4R variants that have been observed in the human population have been previously tested for their effects on MC4R function, and a review (Huang et al., 2017) summarizing this work systematically classified >70 variants according to whether they result in “full”, “partial”, or “no/mild” loss of MC4R activity. Our results are consistent with these classifications: variants classified previously as “full” loss-of-function typically have very low MC4R activity in our assay, “partial” variants have intermediate effects, and “no/mild” effect variants have near-normal activity (Fig. 2E, median log2[fold change of variant activity over wild-type activity] of −1.4, −0.8, and −0.3, respectively, for each group). Finally, our results show a high degree of correlation (Pearson correlation = 0.84, R² = 0.71) with quantitative effect measurements reported for 25 variants individually introduced into the orthosteric site of MC4R (Zhang et al., 2021). Collectively, this demonstrates that our high-quality MC4R DMS data accurately and quantitatively assess the effects of variants on MC4R’s function.
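The relationship between the two correlation figures quoted above can be made explicit with a small Python sketch. For a simple linear fit with one predictor, the coefficient of determination equals the squared Pearson correlation. The paired values below are hypothetical and only exercise the function.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Toy paired measurements (hypothetical): DMS log2 fold change vs. a published value.
dms = [-1.4, -0.8, -0.3, 0.0]
published = [-1.2, -0.9, -0.2, 0.1]
r = pearson_r(dms, published)

# Squaring the reported correlation gives the reported R-squared after rounding.
r_squared_from_reported = round(0.84 ** 2, 2)
```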
Systematic human variant interpretation of MC4R
Looking across multiple datasets gives a comprehensive picture of variant effects, as the various experimental conditions tested have differing strengths and weaknesses in detecting loss-of-function (LoF) versus gain-of-function (GoF) activities (Fig. 2G). For example, unstimulated conditions (e.g., zero α-MSH) uncover variants that lead to constitutive activation of MC4R, but they have less ability to detect loss-of-function variants. In contrast, conditions with agonist stimulation are much better suited to identifying loss of MC4R function. Considering individual α-MSH stimulation conditions (zero, low, medium, and high) for both the Gs and Gq assays, each condition identifies 6.6% to 39.3% of variants as loss-of-function and 0.02% to 1.1% as gain-of-function (Fig. 2G). Collectively across all α-MSH stimulation conditions, 3,370 variants (50.8%) are loss-of-function in at least one condition, 347 variants (5.2%) are gain-of-function in at least one condition, and 2,996 (45.2%) always show wild-type activity. Interestingly, 80 (1.2%) variants are classified as both loss- and gain-of-function, depending upon the condition.
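The way per-condition calls combine into these summary categories can be sketched in Python, together with an inclusion-exclusion check that the reported counts add up to the 6,633 tested variants. The variant labels and per-condition calls in the toy input are hypothetical.

```python
def summarize_calls(calls_by_condition):
    """calls_by_condition: dict of variant -> list of 'LoF' | 'GoF' | 'WT', one per condition."""
    lof_any = {v for v, calls in calls_by_condition.items() if "LoF" in calls}
    gof_any = {v for v, calls in calls_by_condition.items() if "GoF" in calls}
    return {
        "lof_any": len(lof_any),
        "gof_any": len(gof_any),
        "both": len(lof_any & gof_any),
        "always_wt": len(set(calls_by_condition) - lof_any - gof_any),
    }

# Toy input (hypothetical variants and calls across three conditions).
toy = {
    "var_a": ["LoF", "LoF", "WT"],
    "var_b": ["WT", "GoF", "WT"],
    "var_c": ["LoF", "GoF", "WT"],
    "var_d": ["WT", "WT", "WT"],
}
toy_summary = summarize_calls(toy)

# Counts reported in the text; variants in both sets are subtracted once.
reported = {"lof_any": 3370, "gof_any": 347, "both": 80, "always_wt": 2996}
total = reported["lof_any"] + reported["gof_any"] - reported["both"] + reported["always_wt"]
```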
To aid in clinical variant interpretation, we provide detailed functional effect classifications for 220 human variants reported in ClinVar (Landrum et al., 2014) or from published patient sequencing studies (Brouwers et al., 2021; Farooqi et al., 2003; Hinney et al., 2013, 2006; Huang et al., 2017; Rodríguez Rondón et al., 2024; Stutzmann et al., 2008; Wade et al., 2021; Yeo et al., 2003) in Supplementary Table 2. In total, 130 of these reported human variants (59.1%) are LoF in at least one condition, consistent with being pathogenic for obesity-related phenotypes. This includes 83.9% (26/31) of the variants that are reported in ClinVar as pathogenic or likely pathogenic, 53.3% (32/60) of those that are unclassified or have conflicting interpretation, and 0% (0/3) of those classified as benign or likely benign. A small number of reported human variants (V103I, H158R, I251L) result in significant increases in Gs and/or Gq signaling, consistent with having a protective effect for obesity, and are generally classified as benign in ClinVar or as having wild-type activity in the literature (Hinney et al., 2006). These results highlight the utility of systematic deep scans across multiple experimental conditions for facilitating human variant interpretation.
Variants that bias MC4R function
MC4R signals through multiple G protein pathways (Breit et al., 2011; Ju et al., 2018; Paisdzior et al., 2020; Sharma et al., 2019), and evidence from human variant interpretation studies (Lotta et al., 2019; Metzger et al., 2024; Rodríguez Rondón et al., 2024), mouse modeling (Li et al., 2016), and previous drug programs (Clément et al., 2018; Sharma et al., 2019) suggests that biasing MC4R’s activity toward or away from specific pathways could be therapeutically valuable for the treatment of obesity. To gain a better understanding of how MC4R structure relates to its various functions, we systematically searched for variants that differentially impact Gs versus Gq signaling. We applied Principal Component Analysis (PCA) across eight total DMS datasets (four each for Gs and Gq: with zero, low, medium, and high α-MSH stimulation; see Supplementary Table 1 for details). The first two principal components explained 66% and 12% of the variance, respectively (Fig. 3A, Supplementary Fig. 6A,B). Through inspection, we found that Principal Component 1 (PC1) separates variants that impact both signaling functions, with variants that are loss-of-function for both Gs and Gq having higher PC1 values (Supplementary Fig. 6A,B). PC2 separates variants that affect Gs and Gq signaling differently (Fig. 3A-D, Supplementary Fig. 6A,B). Variants with higher PC2 values typically exhibit Gq bias by having greater than wild-type levels of Gq signaling activity, while retaining wild-type levels of Gs activity (Fig. 3C,D; Supplementary Fig. 6C,D). In contrast, variants with more negative PC2 values are Gs-biased variants that typically have wild-type levels of Gs signaling and reduced Gq signaling upon agonist stimulation (Fig. 3C).
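A PCA of this kind takes a matrix with one row per variant and one column per dataset and returns the fraction of variance explained by each component. The dependency-free Python sketch below uses power iteration with deflation on the covariance matrix. The source does not describe the implementation, scaling, or preprocessing used, so every choice here is an assumption, and the toy matrix is hypothetical.

```python
import math

def center(rows):
    """Subtract each column's mean (rows = variants, columns = datasets)."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def covariance(rows):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
            for i in range(d)]

def mat_vec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def leading_eigenpair(m, iters=500):
    """Power iteration for the largest eigenpair of a symmetric PSD matrix."""
    v = [1.0 / (i + 1) for i in range(len(m))]  # arbitrary asymmetric start vector
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iters):
        w = mat_vec(m, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:  # no variance left to explain
            return 0.0, v
        v = [x / norm for x in w]
    return sum(a * b for a, b in zip(v, mat_vec(m, v))), v

def pca(rows, n_components=2):
    """Return [(explained_variance_fraction, loading_vector), ...]."""
    cov = covariance(center(rows))
    total = sum(cov[i][i] for i in range(len(cov)))
    out = []
    for _ in range(n_components):
        lam, v = leading_eigenpair(cov)
        out.append((lam / total, v))
        # Deflate: subtract the component just found before searching for the next.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(len(v))]
               for i in range(len(v))]
    return out

# Toy matrix: 4 variants x 2 datasets (hypothetical effect sizes).
toy = [[2.0, 0.0], [-2.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
components = pca(toy)
```

In practice a numerical library would normally be used for this step; the hand-rolled version is shown only to keep the example self-contained.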
Overall, more variants substantially increase Gq signaling than Gs signaling (Fig. 3A). Gs is the primary G protein coupling for MC4R (Podyma et al., 2018; Tao, 2010), and our data suggest that there is little room for further improving MC4R’s already robust Gs signaling activity. Interestingly, biased variants are positionally diverse. For example, the 14 variants that display the most extreme Gq bias (Fig. 3A) are found at 12 different residue positions, with position 79 unique in having several variants that result in Gq bias (Supplementary Fig. 6D). Many of the variants with extreme Gq or Gs bias are located within the regions of MC4R that interact with G proteins, with some scattered throughout the transmembrane domains and far fewer in the vicinity of the protein’s orthosteric site (Fig. 3B-D).
Structural insights into biased signaling
Comparing these results with existing structural information could provide additional detailed insights into MC4R’s signaling functions. It is thought that ligand binding in GPCR orthosteric sites is communicated to the intracellular G-protein binding domain through a series of conserved residues or “microswitches” (Hauser et al., 2021; Zhou et al., 2019). Structural studies comparing the inactive and active state structures have confirmed that MC4R shares a similar signaling cascade (Yu et al., 2020; Zhang et al., 2021). Upon ligand binding, W258 (W258^{6×48} in https://gpcrdb.org/ nomenclature; Isberg et al., 2015) of the conserved CWxP motif undergoes a conformational rearrangement that is translated to L133^{3×36} and I137^{3×40} of the conserved PIF motif (MIF in melanocortin receptors). This causes F254^{6×44} in the PIF motif to rearrange, which in turn disrupts the packing of three different interactions: 1) L140^{3×43} and I143^{3×46}, 2) I251^{6×41} and L247^{6×37}, and 3) R147^{3×50} and N240^{6×30}. These, amongst other rearrangements, culminate in the receptor being able to bind a G-protein. This interaction with the G-protein is primarily mediated through R147^{3×50} in the conserved DRY motif, T150^{3×53}, Y157^{34×53}, H158^{34×54} and R305^{7×56} (Zhang et al., 2021).
Strikingly, a number of mutations at residues throughout this signaling cascade had extremely positive PC2 values, implicating them as Gq-biasing mutations. Within the core of the cascade, we identified I137^{3×40}L, F254^{6×44}P, and L140^{3×43}I as Gq-biased (Fig. 3A; Supplementary Fig. 6C). We also identified a number of Gq-biasing mutations within the G-protein binding pocket, specifically T150^{3×53}G and H158^{34×54}R (Fig. 3A,C). Interestingly, the H158^{34×54}R variant is found in the human population (Hinney et al., 2006; Wade et al., 2021) and has previously been shown to preferentially signal through the Gq pathway (Paisdzior et al., 2020). H158^{34×54} is also located near two other mutations (K164^{3×55}L, F152^{3×55}R), in intracellular loop 2 (ICL2) and near the ends of the third and fourth transmembrane domains (TM3 and TM4), that display bias (Fig. 3C). Interestingly, K164^{3×55}L exhibits a Gs bias in that it drives loss of function through Gq.
Our data also point to a number of potentially novel interactions. For example, M79^{2×39} packs against residues H387^{G.H5.19} and Q390^{G.H5.22} of the Gs alpha subunit (Gsα) (Flock et al., 2015). This position has multiple different variants that result in Gq bias, including M79^{2×39}R, M79^{2×39}S, and M79^{2×39}G (Fig. 3A, Supplementary Fig. 6D). The most extreme bias signal in our PCA came from I223^{5×69}L (Fig. 3A,D), which interfaces with the C-terminus of Gsα (Flock et al., 2015), near position L394. Further down towards the intracellular side of TM5, we also identified V228R (Fig. 3A,D), which interfaces proximal to E323^{G.hgh4.13} in Gsα (Flock et al., 2015). Collectively, the combination of DMS data and structural information is a fruitful avenue for generating protein structure-function hypotheses. These results highlight the power of DMS to identify regions of MC4R that could be harnessed for designing drugs that precisely modulate specific cellular signaling functions.
Systematic prediction of treatment response
Many variants of MC4R disrupt signaling by causing protein misfolding, which ultimately inhibits proper localization of MC4R to the cell membrane (Huang et al., 2017). Correctors are a class of small molecule drugs that facilitate protein folding. Corrector therapies have been developed for phenotypes such as cystic fibrosis (Boyle and De Boeck, 2013) and Fabry disease (Germain et al., 2016), and they have been proposed as a strategy for treating MC4R-associated obesity (Huang et al., 2017; Huang and Tao, 2014; Wang et al., 2014). One feature of corrector therapies is that they are typically only effective, and therefore FDA-approved, for a subset of patients harboring specific sequence variants (Boyle and De Boeck, 2013; Weaver et al., 2022). Identifying variants that respond to corrector therapy is typically done by rigorously testing the effect of a compound on a single variant at a time (Weaver et al., 2022), and DMS offers an attractive avenue to systematically test the treatment response of thousands of patient variants in a single assay.
To this end, we tested whether Ipsen 17, a small molecule tool compound that has been shown to correct MC4R misfolding (Poitout et al., 2007; Wang et al., 2014), is able to restore the Gs signaling function of the MC4R variants in our DMS library. Out of all 6,633 tested variants, 290 (4.4%) showed disrupted Gs signaling in the absence of treatment that was partially or fully rescued by the addition of Ipsen 17 (Supplementary Fig. 7, see Methods for details of statistical analysis). This includes a number of variants that have been classified as pathogenic in ClinVar (Landrum et al., 2014) or otherwise found in patient sequencing studies (Brouwers et al., 2021; Farooqi et al., 2003; Hinney et al., 2013, 2006; Huang et al., 2017; Stutzmann et al., 2008; Wade et al., 2021; Yeo et al., 2003) (Fig. 4 shows results for selected variants reported in the human population). Other reported patient variants showed no functional improvement in response to corrector therapy (Fig. 4). Collectively, these data support that performing DMS in the presence of a small molecule corrector can be used to systematically predict which patients are likely to benefit from such treatment modalities.
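A rescue call of this kind can be thought of as two tests on per-condition summary statistics: the variant must be significantly below wild-type without treatment, and its effect must be significantly higher with the corrector. The Python sketch below is a simplified stand-in for the statistical analysis described in the Methods. It treats the two estimates as independent, uses a normal approximation, and applies a raw p-value threshold instead of an FDR correction. The function names, threshold, and example numbers are hypothetical.

```python
import math

def two_sided_p(z):
    """Two-sided p-value for a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

def rescue_call(untreated, treated, alpha=0.01):
    """untreated / treated: (estimate, standard_error) of the log effect vs. wild-type.

    Returns True when the variant is significantly below wild-type untreated
    AND significantly improved by treatment. Assumes independent estimates.
    """
    est_u, se_u = untreated
    est_t, se_t = treated
    disrupted = est_u < 0 and two_sided_p(est_u / se_u) < alpha
    diff = est_t - est_u
    se_diff = math.sqrt(se_u ** 2 + se_t ** 2)
    improved = diff > 0 and two_sided_p(diff / se_diff) < alpha
    return disrupted and improved

# Hypothetical summary statistics for three variants.
rescued = rescue_call((-1.5, 0.2), (-0.3, 0.2))        # disrupted, then improved
unresponsive = rescue_call((-1.5, 0.2), (-1.4, 0.2))   # disrupted, no improvement
not_disrupted = rescue_call((-0.1, 0.2), (0.5, 0.2))   # near wild-type untreated
```

Requiring both conditions keeps variants that merely drift upward from a wild-type baseline out of the rescued set.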
Mapping protein-ligand interactions
DMS experiments can be used to define “drug-resistant” variants within MC4R that disrupt the activity of different types of ligands, providing functional insight into protein-ligand interactions that are key for understanding the mechanisms underlying agonism. Such functional information would be a valuable addition to structural methods and has the potential to streamline the lengthy and iterative cycle of compound optimization in drug discovery. Substantial work has been done to characterize how peptide agonists interact structurally with MC4R, but similar work on small-molecule agonists with comparable activity and selectivity remains relatively limited (Gonçalves et al., 2018; Heyder et al., 2021; Sharma et al., 2019; Yu et al., 2020; Zhang et al., 2021). To characterize the functional interaction landscapes of different ligand types and to better understand what distinguishes peptide and small-molecule MC4R pharmacophores, we performed DMS assays of MC4R using both native peptide agonist stimulation (α-MSH) and small molecule agonist stimulation (THIQ). Bayesian meta-regression analysis of the lowest dose concentrations of α-MSH and THIQ for the Gs signaling assay revealed a set of variants that uniquely disrupt activation by one ligand but not the other (at FDR < 5%, Fig. 5A). These variants cluster exclusively within the orthosteric binding pocket (Fig. 5B-E) and at positions of known binding interactions of each molecule (Zhang et al., 2021). Notably, there are many more variants that uniquely disrupt activation by α-MSH (Fig. 5A-D). For example, a majority of amino acid substitutions at position I104^{2×6} disrupt activation of MC4R by α-MSH, but none lead to significant reduction in MC4R activity upon stimulation by THIQ (Fig. 5C). Multiple positions within the orthosteric binding pocket displayed this pattern (Fig. 5B-E), which is consistent with how the peptide agonist utilizes a larger network of interactions to increase binding affinity.
Interestingly, this comparison also highlights how the same residue of MC4R can be critical for interfacing with multiple ligands but points to substantive differences in the precise physical interaction between each ligand and that position of the target. For example, positions P48^{1×36} and I129^{3×32} harbor variants that can have differentially deleterious effects under the two ligand conditions (Fig. 5C-E). Many amino acid substitutions at P48^{1×36} (including to V, I, F, Y, Q, and R) disrupt stimulation of MC4R by α-MSH but not THIQ. However, changing this position to the negatively charged P48^{1×36}D variant has the opposite effect, uniquely ablating activation by THIQ. The HFRW motif (His6-Phe7-Arg8-Trp9) of α-MSH represents a conserved pharmacophore critical for activation of MC4R by peptides, and the tri-branched THIQ molecule (R1-R2-R3) mimics the HFRW conformational architecture (Fig. 5E) (Gonçalves et al., 2018; Hruby et al., 1987; Zhang et al., 2021). Our observation that α-MSH stimulation is more sensitive to variants at position P48^{1×36} is consistent with how the His6 of α-MSH forms more interactions with this hydrophobic pocket of MC4R, while the analogous R3 group of THIQ is more flexible and forms non-specific interactions in this region (Fig. 5C–E) (Gonçalves et al., 2018; Hruby et al., 1987; Zhang et al., 2021). At position I129^{3×32}, mutation to any polar uncharged variant (S, T, N, or Q) alters THIQ activation, while I129^{3×32}V uniquely inhibits α-MSH activation (Fig. 5C–E). The Phe7 of α-MSH and the analogous R2 group of THIQ form key interactions with Ca2+ and the core hydrophobic pocket formed by I129^{3×32} of MC4R (Fig. 5E) (Zhang et al., 2021). The enrichment of variants at I129^{3×32} that uniquely disrupt MC4R activation by THIQ points to a stronger dependency of the small molecule on interactions with this residue.
In summary, by assessing the functional consequences of all mutations within the MC4R orthosteric site, we not only confirm known binding interactions but also reveal interactions that distinguish peptide and small-molecule activation. These relationships provide additional functional insight into the structural mechanism of MC4R ligand binding that could be harnessed for drug design.
Discussion
Deep Mutational Scanning holds substantial potential for improving many phases of drug discovery and development, but realizing this potential requires the ability to draw highly quantitative and disease-relevant conclusions from DMS data, an outstanding challenge for the field. Here, we demonstrated the value of improved DMS assays and analysis methods in addressing this challenge by applying our approaches to a medically relevant GPCR, MC4R. First, we were able to quantify differences in Gs-mediated cAMP and Gq-mediated calcium signaling for MC4R in response to its native ligand, α-MSH. Understanding these subtle differences in molecular signaling phenotypes is crucial for driving precision treatments. For example, Gq signaling bias is one of the purported mechanisms for the better side-effect profile of the FDA-approved MC4R agonist setmelanotide (Sharma et al., 2019). Additionally, we systematically assessed the responses of MC4R variants implicated in human obesity to a small-molecule corrector, Ipsen 17. Whereas traditional approaches would have required the separate experimental assessment of every variant, DMS tests every possible single amino acid variant in one experiment. This enables researchers to understand how a potential treatment will work for broader swaths of the population before drugs reach the clinic, which could ultimately lead to more effective clinical trial designs.
Our data also contribute to the growing understanding of the complicated mechanism by which MC4R (and other GPCRs) translate ligand binding on the extracellular side of the membrane, through an allosteric network of residues, to G-protein activation on the intracellular side (Howard et al., 2024). We demonstrate the utility of DMS for understanding how proteins bind other molecules. In particular, we identified mutations that uniquely disrupt the binding of two agonists, one a peptide (α-MSH) and one a small molecule (THIQ). This is crucial information for understanding the mechanism by which these molecules bind MC4R, which could lead to hypotheses for further optimization of more potent compounds. DMS data contain a substantial amount of latent information about protein structure, and can help identify novel pockets to target with new compounds (Weng et al., 2024). As suggested above, we also envision using DMS data to help optimize molecules by identifying mutations that uniquely disrupt or potentiate sets of compounds in a chemical series.
Importantly, the experimental and analysis frameworks we describe are widely applicable to GPCRs and more broadly to other target classes. GPCRs are known to signal through a variety of pathways, and there are existing transcriptional reporters for most G-protein mediated signaling pathways that could be harnessed to gain a holistic understanding of the downstream consequences of GPCR activity (Hauser et al., 2022). Furthermore, there are a substantial number of existing transcriptional reporters for a wide variety of cellular signaling processes mediated by transcription factors, kinases, and other major targets of drug discovery efforts, which could extend these methods to other target classes.
Finally, there has been incredible progress over the last few years in the development of machine learning-based models to predict the structural and functional consequences of mutations on proteins. That said, these models have struggled to address many real-world applications in the treatment of human disease (McDonald et al., 2024). One potential explanation for this gap in performance is that many of the DMS datasets used to train these models (our previous work included) assay a limited set of experimental conditions, often measure cellular effects that are only indirectly related to disease etiology, suffer from low signal-to-noise ratios, lack quantification of uncertainty in their measurements, and are subject to a heterogeneous set of subtle experimental caveats. Detailed experimental characterization of protein function (such as the >21,500,000 different measurements in this work alone) and efforts such as ProteinGym (Notin et al., 2023) to benchmark large-scale functional data will continue to be critical for developing and evaluating large-scale models that can be applied to a wider variety of drug discovery and development applications. As both the experimental and computational methods continue to improve, we foresee DMS having a profound impact on the drug discovery process.
Materials and methods
Reporter assays and cell line development
The Gs (cAMP-luciferase) reporter assay, diagrammed in Supplementary Fig. 1A, was adapted from an assay previously used to assess the function of other GPCRs (Jones et al., 2020, 2019).
The Gq assay relay reporter system (diagrammed in Supplementary Fig. 1B) is described in detail in a patent stemming from this work (Chan et al., 2024). Briefly, it was constructed as follows. A piggyBac transposon plasmid harboring a genetic cassette that expresses a synthetic transcription factor, Gal4_DBD-VPR, under the control of an NFAT response element was constructed using Gibson assembly. Human embryonic kidney cells (HEK293T) were then cotransfected with this plasmid along with a piggyBac transposase expression vector, and cells were selected for puromycin resistance. These cells were then isocloned by dilution plating, and colonies derived from single cells were selected. Five clones were then carried forward based on cell morphology and growth rates. A second series of Bxb1-based landing pad vectors was constructed containing a library of 20 Gq-coupled GPCRs under the control of a dox-inducible promoter, along with a second genetic cassette containing a DNA-barcoded luciferase gene under the control of the Gal4_UAS enhancer. This plasmid library was integrated into each of the five isoclonal Gq-relay cell lines. The best performing Gq-relay cell line was selected on signal-to-noise criteria across a panel of agonists for the 20 Gq-coupled GPCRs. Use of this relay system in combination specifically with an MC4R transgene resulted in α-MSH dose-dependent expression of the reporter gene (Supplementary Fig. 1C).
Building the DMS libraries
Generation and genomic integration of the plasmid libraries containing all possible single amino acid substitutions of MC4R used a further optimized version of an earlier method (Jones et al., 2020). In brief, variant segments of MC4R cDNA were amplified from DNA microarrays and cloned into base vectors through a multi-step process to yield pooled libraries of MC4R variants with fully intact and barcoded reporter gene cassettes. Fully assembled plasmid libraries were then co-transfected with a plasmid encoding Bxb1 recombinase into HEK293T cells containing a landing pad at the H11 safe harbor locus to achieve single copy integration per cell. In a deviation from the previously published method, two independent replicates of each sub-library were cloned and pooled together post-cellular integration in order to maximize library coverage and the number of barcodes per variant.
Variant-barcode mapping
As described above, barcodes were randomly appended to variants during amplification of MC4R segments from variant oligo pools and then ligated into library base vectors. After this first step of plasmid library cloning, variant segments were amplified along with the neighboring barcode and sequenced with 2×150 paired-end reads (see Supplementary Table 3 for amplification and sequencing primers) on an Illumina NextSeq 550 instrument using a 300-cycle Mid Output kit. Illumina 2×150 BCL files were demultiplexed with bcl2fastq2 into R1 and R2 FASTQ files, which were merged into single fragments using Flash2 requiring a minimum 5 bp of overlap (Quan et al., 2019). The first 21 bp of each fragment corresponding to the barcode sequence were extracted into the read name, and the remaining fragment was adapter-trimmed using umi_tools and cutadapt, respectively (Martin, 2011; Smith et al., 2017). The remaining fragments were mapped against a custom reference composed of the designed oligonucleotide library using STAR with default parameters except requiring that alignments be strictly unique to be reported (Dobin et al., 2013). Taking each alignment as an oligo-barcode pair, the read counts per unique oligo-barcode pair were computed for each replicate and joined by barcode. Finally, the resulting maps were filtered to require each oligo-barcode pair to pass three requirements in both replicates: correct barcode length, total read depth > 10, and purity > 0.75. The purity of an oligo-barcode pair was defined as the read count of that pair divided by the total number of reads containing that barcode. Post-processing after STAR was performed using samtools for BAM manipulation and custom R code (Li et al., 2009).
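The filtering step above can be sketched as follows. This is a minimal Python illustration, not the custom R code used for the analysis. The function names are hypothetical, the barcode length of 21 is taken from the "first 21 bp" stated above, and "total read depth" is assumed to mean the read count of the oligo-barcode pair itself.

```python
from collections import Counter, defaultdict

BARCODE_LEN = 21    # barcode = first 21 bp of each merged fragment (from the text)
MIN_DEPTH = 10      # "total read depth > 10" (assumed to be per oligo-barcode pair)
MIN_PURITY = 0.75   # "purity > 0.75"


def passing_pairs(alignments):
    """Return {(oligo, barcode): purity} for pairs passing all three filters.

    `alignments` is an iterable of (oligo_id, barcode) tuples, one per
    uniquely aligned read in a single replicate.
    """
    pair_counts = Counter(alignments)

    # Total reads carrying each barcode, regardless of which oligo they hit.
    barcode_totals = defaultdict(int)
    for (_, barcode), n in pair_counts.items():
        barcode_totals[barcode] += n

    kept = {}
    for (oligo, barcode), n in pair_counts.items():
        purity = n / barcode_totals[barcode]
        if len(barcode) == BARCODE_LEN and n > MIN_DEPTH and purity > MIN_PURITY:
            kept[(oligo, barcode)] = purity
    return kept


def filter_map(rep1_alignments, rep2_alignments):
    """Keep only oligo-barcode pairs that pass the filters in both replicates."""
    rep1 = passing_pairs(rep1_alignments)
    rep2 = passing_pairs(rep2_alignments)
    return sorted(set(rep1) & set(rep2))
```

A barcode shared between two oligos at similar read counts fails the purity filter for both pairs, which is the case this filter is designed to exclude.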
Running DMS assays
The DMS assay protocol was optimized from a previous method (Jones et al., 2020). HEK293T single-copy variant cell libraries were seeded at a density of ∼17×10^6 cells per 150 mm tissue-culture treated dish in DMEM + 10% fetal bovine serum (FBS). Four dishes were seeded for each experimental condition, with each dish being treated as an independent biological replicate (4 replicates per condition). Twenty-four hours after seeding, media was exchanged with DMEM + 0.5% FBS +/- 10 ng/mL doxycycline. For chaperone experiments, all conditions were additionally replicated +/- 1 μM Ipsen 17. Twenty-four hours after doxycycline induction, media was exchanged with Opti-MEM + DMSO, forskolin, or MC4R agonist (α-MSH or THIQ). Forskolin bypasses MC4R to constitutively activate cAMP signaling, so this condition was used as a variant-independent measurement of library composition. For chaperone experiments, cells were washed 3x with 10 mL DMEM to remove Ipsen 17 prior to agonist stimulation. Six hours after agonist stimulation, cells were harvested by scraping in 4 mL lysis buffer (RLT buffer (Qiagen) + 143 mM β-ME). Lysis was performed by passing the cell slurry 6x through a sterile 18G needle and then spinning through QIAshredder (Qiagen) columns. RNA was extracted from 1 mL of the homogenized lysate with the RNeasy Plus Mini kit (Qiagen), including optional on-column DNase digestion, and eluted into 100 µL water. Eight reverse transcriptase reactions per sample were performed with the SuperScript IV kit (Thermo Fisher), as described previously (Jones et al., 2020) (primers listed in Supplementary Table 3). cDNA from each sample was treated with 1 µL RNase A (100 µg/mL, Thermo Fisher) and 3.2 µL RNase H (5,000 U/mL, NEB) at 37°C for 30 min. RNase-treated samples were concentrated to ∼55 µL by spinning through Amicon Ultra 10 kDa concentrators (EMD Millipore) for ∼8 min.
To determine the necessary cycle numbers for equivalent amplification of each sample library, qPCR reactions were performed on 1 µL cDNA (diluted 1:8 in water) with Q5 polymerase (NEB), SYBR Green (Thermo Fisher), and library amplification primers (Supplementary Table 3). Final amplification cycles for each sample were chosen by adding three cycles to the Cq values generated from each respective qPCR reaction. Illumina sequencing libraries were prepared by amplifying 50 µL of each cDNA sample with sequencing adapters (500 nM each library amplification primer, Supplementary Table 3) using the NEBNext Q5 High Fidelity 2x PCR Kit (NEB) under the following cycling conditions: 98°C for 30 s, X cycles of 98°C for 8 s, 65°C for 20 s, and 72°C for 10 s, followed by an extension of 72°C for 2 min. 3 µL of each DNA library sample was run on a 4% E-Gel (Thermo Fisher) and densitometry was performed with Fiji to account for differences in library yields. Samples were mixed at equal amounts into a single pool and then purified into 200 µL IDTE (Qiagen) with AxyPrep Magnetic beads (Fisher Scientific). The purified library was quantified with the DeNovix High-Sensitivity Fluorescence kit and prepared for sequencing with a 10% PhiX spike-in. Final library mixture was sequenced using custom read and index primers (Supplementary Table 3) on an Illumina NextSeq 550 with the High Output 75 cycle kit.
Sequence processing for barcode expression
Illumina 1×26 BCL files were demultiplexed with bcl2fastq2, and the last 5 bp of each read were removed using basic bash commands. Barcode sequences were counted for each sample and joined with the appropriate oligo-barcode map. The mapped counts were then joined with sample and MC4R variant metadata and passed to regression analysis. All processing after demultiplexing was performed with custom R code.
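As a rough illustration of this step, the trimming, counting, and joining can be expressed in a few lines. The sketch below is in Python rather than the bash and R used for the analysis, and the function names and the dictionary-based barcode map are assumptions made for the example.

```python
from collections import Counter

TRIM_3PRIME = 5  # the last 5 bp of each 26-bp read are removed, leaving 21 bp


def count_barcodes(reads):
    """Trim the 3' end of each read and count occurrences of each barcode."""
    return Counter(read[:-TRIM_3PRIME] for read in reads)


def join_with_map(counts, barcode_to_variant):
    """Attach variant labels to barcode counts.

    Barcodes absent from the oligo-barcode map are dropped (an inner join).
    """
    return [
        {"barcode": bc, "variant": barcode_to_variant[bc], "count": n}
        for bc, n in sorted(counts.items())
        if bc in barcode_to_variant
    ]
```

Keeping one row per barcode, rather than summing within variants, preserves the barcode-level counts that the regression model described below takes as input.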
Negative binomial regression analysis pipeline
In many DMS methods, barcodes are summed within protein-coding variants to generate a single variant-level count per sample. This unnecessarily sacrifices available power obtained by repeatedly measuring the same molecular process in distinct cells. An alternative is to model barcode counts directly. However, to apply standard models, missing data must either be removed, which can remove a majority of detected barcodes, or somehow imputed. Additionally, many existing methods either use a log transformation of read counts to obtain approximate normality, or apply Poisson regression and related assumptions for inference (Faure et al., 2020; Rubin et al., 2017). However, there is strong prior reason to believe that DMS counts are overdispersed in our data, since our synthetic reporter system is read out via RNA-seq (Robinson and Smyth, 2007).
Instead, we developed and applied a mixed-effects negative binomial generalized linear model (GLM) to resolve these challenges. These models have been widely deployed to model count data, and in particular to identify differential expression, in bulk and single-cell RNA-seq analysis. We implement maximum likelihood estimation for this model using glmmTMB, which can accommodate the potentially large scale of multiplex count data (Brooks et al., 2017). For each position, we consider all variants located at that position along with all wild-type variants in the same protein subregion and apply the following model:
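Written in assumed notation (the symbols are illustrative choices, not taken from the original), a model consistent with the term-by-term description that follows is given here, with y the observed barcode count, μ its mean, θ a dispersion parameter, and a normally distributed barcode random effect:

```latex
\begin{aligned}
y_{ijkm} &\sim \mathrm{NegBin}\left(\mu_{ijkm},\, \theta\right) \\
\log \mu_{ijkm} &= \beta_{i} + \delta_{ij} + b_{k} + o_{m},
\qquad b_{k} \sim \mathcal{N}\left(0,\, \sigma_{b}^{2}\right)
\end{aligned}
```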
For the ith condition, the jth variant, the kth barcode, and the mth sample. Consequently, the first two terms in the last equation above correspond to a global mean term for each condition and a term for the variant-specific deviation from wild-type in each condition. The last two terms are the random effect for barcode k, and the sample-specific technical offset for sample m. The definition of the offset is often context specific, and here we use the log of the sum of barcode counts derived from stops, reasoning they should be constant across conditions and replicates. We fit this model for each MC4R position independently and extract coefficients for the additive shift in the mean of each variant relative to wild-type. Using the per-condition summary statistics, we obtained Wald test statistics by dividing the effect size by the standard error and computed p-values against the normal distribution. P-values were adjusted for multiple testing using the Benjamini-Hochberg method and thresholded to 1% or 5% FDR where indicated.
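The Wald test and Benjamini-Hochberg adjustment described above can be sketched briefly. This is a minimal Python illustration rather than the R code used for the analysis; the function names are hypothetical, and a two-sided test is assumed because the text does not state sidedness.

```python
import math


def wald_p_value(estimate, std_error):
    """Two-sided p-value for z = estimate / std_error under a standard normal."""
    z = estimate / std_error
    # 2 * (1 - Phi(|z|)) written with the complementary error function.
    return math.erfc(abs(z) / math.sqrt(2.0))


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A variant would then be called significant at 5% FDR when its adjusted p-value falls below 0.05.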
To define more complex null hypotheses like chaperone rescue, we extracted marginal means for each variant under each treatment using the emmeans package (Lenth, 2024). We define the chaperone rescue contrast as the additive shift of each variant in each treatment condition to the wild-type mean specifically in the untreated condition. Since this quantity is a linear contrast across marginal means, we computed the associated contrast estimates and standard errors, and tested them for significance using the same approach as the per-condition summary statistics.
Comparison to human genetics data and variant effect predictors
Pathogenicity classifications of MC4R missense and nonsense variants were obtained from ClinVar (Landrum et al., 2014) on January 5, 2024. Human population frequency data for MC4R missense variants were obtained from gnomAD (Chen et al., 2024) on January 8, 2024. A comprehensive list of 220 MC4R variants that are of potential clinical relevance (Supplementary Table 2) was mined from ClinVar (Landrum et al., 2014), along with papers or review articles describing variants identified in human sequencing studies (Brouwers et al., 2021; Farooqi et al., 2003; Hinney et al., 2013, 2006; Huang et al., 2017; Rodríguez Rondón et al., 2024; Stutzmann et al., 2008; Wade et al., 2021; Yeo et al., 2003). Effect predictions for MC4R missense variants were obtained from the public releases of AlphaMissense (Cheng et al., 2023) and popEVE (Orenbuch et al., 2023).
Identification of functionally biased variants
We used Principal Component Analysis to identify biased variants. Specifically, we used the test statistic (log2 fold change divided by the standard error) for the DMSO and all α-MSH conditions in both the Gq and Gs pathways to create a matrix with conditions (defined as Drug_Concentration) as columns and individual mutants (defined as Position_Substitution) as rows. We then passed this matrix to R’s prcomp function with default parameters and used the first two principal components to visualize the results. Visual inspection of the loadings via a biplot, along with plotting of various variant types (Supplementary Fig. 6A,B), revealed that PC1 separates variants based on their overall effect on MC4R function, and PC2 separates variants based on differential response through the Gq and Gs pathways. We set a simple cutoff of ±7.5 on PC2 to highlight particularly interesting variants.
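The decomposition and cutoff can be illustrated with a small, dependency-free Python sketch. It is not the prcomp call used in the analysis: it centers columns without scaling (matching prcomp's defaults) but finds the components by power iteration with deflation, which is an implementation choice made here for brevity. All function names are hypothetical.

```python
import math


def center_columns(rows):
    """Subtract each column mean (centered, not scaled)."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix of the columns."""
    n = len(centered)
    cols = list(zip(*centered))
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in cols] for ci in cols]


def top_eigenpair(cov, iterations=1000):
    """Leading eigenvalue and eigenvector by power iteration."""
    p = len(cov)
    vec = [1.0 + i for i in range(p)]  # arbitrary deterministic start
    norm = math.sqrt(sum(v * v for v in vec))
    vec = [v / norm for v in vec]
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0.0:
            break
        vec = [v / norm for v in nxt]
    value = sum(vec[i] * cov[i][j] * vec[j] for i in range(p) for j in range(p))
    return value, vec


def pca_scores(rows, n_components=2):
    """Project centered rows (variants) onto the leading principal axes."""
    centered = center_columns(rows)
    cov = covariance(centered)
    p = len(cov)
    axes = []
    for _ in range(n_components):
        value, vec = top_eigenpair(cov)
        axes.append(vec)
        # Deflate: remove the component just found before extracting the next.
        cov = [[cov[i][j] - value * vec[i] * vec[j] for j in range(p)] for i in range(p)]
    return [[sum(x * a for x, a in zip(row, axis)) for axis in axes] for row in centered]


def flag_biased(scores, names, cutoff=7.5):
    """Names of variants whose PC2 score lies beyond +/- cutoff."""
    return [name for name, s in zip(names, scores) if abs(s[1]) > cutoff]
```

Note that the sign of each component is arbitrary, so which direction of PC2 corresponds to Gs or Gq bias has to be read from the loadings, as the text describes.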
Identifying variants that respond to corrector treatment
We identified variants whose activity increased upon treatment with Ipsen 17 using a slight modification of the summary statistics from the general-purpose model described above. For each variant and condition, we compute the log2-scaled marginal mean (averaging across barcodes and replicates) and its associated standard error. Then, for each variant we compute the following two summary statistics and test whether their estimates are significantly different from zero. First, we define a variant’s defect as the marginal mean of that variant under DMSO treatment minus the marginal mean of WT under DMSO treatment. This quantifies the existence and severity of the variant’s intrinsic effect on MC4R activity. Second, we define that variant’s rescue as the marginal mean of that variant under Ipsen 17 treatment minus the marginal mean of that variant under DMSO treatment. This quantifies the magnitude of the change (typically an increase) in MC4R activity upon Ipsen 17 treatment for each variant, relative to its DMSO-only baseline. After computing the indicated estimates and errors, we perform significance testing as in the general model: we define Wald statistics as the estimate divided by the propagated error, compute p-values from the normal distribution, and adjust for multiple testing using the Benjamini-Hochberg procedure.
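The two contrasts can be written out as a short Python sketch. The function names are hypothetical, and the error propagation shown assumes the two marginal means are independent, which is a simplification: a contrast computed from a fitted model would also account for the covariance between the estimates.

```python
import math


def contrast(mean_a, se_a, mean_b, se_b):
    """Difference of two log2 marginal means with its propagated standard error.

    Assumes independent estimates: se = sqrt(se_a^2 + se_b^2).
    """
    estimate = mean_a - mean_b
    std_error = math.sqrt(se_a ** 2 + se_b ** 2)
    return estimate, std_error


def defect_and_rescue(variant_dmso, variant_ipsen, wt_dmso):
    """Each argument is a (log2 marginal mean, standard error) tuple."""
    defect = contrast(*variant_dmso, *wt_dmso)          # variant vs WT, both in DMSO
    rescue = contrast(*variant_ipsen, *variant_dmso)    # Ipsen 17 vs DMSO, same variant
    return {"defect": defect, "rescue": rescue}
```

Dividing each estimate by its standard error gives the Wald statistic used for testing. A rescue candidate would show a significantly negative defect together with a significantly positive rescue.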
Identifying critical variants for protein-ligand interactions
We identified sets of mutations that specifically inhibited or potentiated MC4R activation in the presence of ligands. We applied the general DMS model (see Negative binomial regression analysis pipeline) and extracted log2-scaled fold changes and standard errors for each variant relative to WT, within each ligand-treated condition. To account for systematic differences between ligands across all variants, we applied Bayesian meta-regression via the brms R package (Bürkner, 2017) and regressed the α-MSH summary statistics against THIQ, while including the errors in both quantities via the se() and me() brms functions. Finally, to infer significant α-MSH- or THIQ-specific effects, we extracted the residual of each variant relative to the meta-regression best-fit line and tested whether this residual was significantly non-zero based on the posterior sampling performed with brms.
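To make the residual-based logic concrete, the sketch below uses a deliberately simpler stand-in: a weighted least-squares line and residual z-scores, written in plain Python. It is not the Bayesian meta-regression fitted with brms, it does not sample a posterior, and it handles measurement error in the predictor only by inflating the residual variance. The function names and the choice of α-MSH as the response are assumptions for the example.

```python
import math


def weighted_line_fit(x, y, se_y):
    """Weighted least-squares fit of y = a + b * x with weights 1 / se_y**2."""
    w = [1.0 / s ** 2 for s in se_y]
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def residual_z_scores(thiq, se_thiq, amsh, se_amsh):
    """Z-score of each variant's alpha-MSH effect relative to the fitted line.

    The residual variance combines the error in the response with the error
    in the predictor scaled by the fitted slope.
    """
    intercept, slope = weighted_line_fit(thiq, amsh, se_amsh)
    z = []
    for xi, sxi, yi, syi in zip(thiq, se_thiq, amsh, se_amsh):
        resid = yi - (intercept + slope * xi)
        z.append(resid / math.sqrt(syi ** 2 + (slope * sxi) ** 2))
    return z
```

Variants with large positive or negative z-scores deviate from the trend shared across all variants, which is the sense in which an effect is ligand-specific. The full Bayesian treatment additionally propagates uncertainty in the fitted line itself.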
Structural modeling
Molecular visualization of variant effects on MC4R was performed with UCSF ChimeraX (Pettersen et al., 2021). For visualization of functionally biased extremes in MC4R (Fig. 3B), the maximum absolute PC2 value for each position was calculated and the define attribute function of ChimeraX was used to color the structure of α-MSH-bound MC4R (PDB: 7F53) by these relative values, ranging from white (no bias) to pink (extreme bias). For all other structural panels related to variant bias, positions of interest are colored either green or purple to indicate Gs-bias or Gq-bias, respectively. For visualizing variant effects on protein-ligand interactions (Fig. 5), positions with significant variants identified by meta-regression were colored by whether variants at a given position perturb activation uniquely by α-MSH (blue) or THIQ (orange). Where α-MSH and THIQ structures are shown, the respective α-MSH-bound (PDB: 7F53) and THIQ-bound (PDB: 7F58) cryo-EM models were used for visualization. For the depictions in Fig. 5B, α-MSH-bound MC4R was used.
Figure generation
Cartoon diagrams of reporter assay designs in Fig. 1 and Supplementary Fig. 1A,B were created with BioRender.com.
Sequence data and software availability
Raw sequencing data are available from SRA under project accession number PRJNA1161152. All code used for analysis and figure generation is available at https://github.com/octantbio/mc4r-dms.
Competing Interests
CJH, NSA, BAO, EMJ, LYC, HC, JBA, JSB, ARC, SK, DED, and NBL are current employees of Octant, Inc. and/or hold shares or options in the company. EMJ, HC, AC, and SK are listed inventors on patents related to this work. All other authors declare no competing financial interests.
Acknowledgements
We would like to thank Jakob Sture Madsen for his intellectual contributions to the project and his helpful comments on the paper.
Supplemental Information
Supplemental Figures
References
- glmGamPoi: fitting Gamma-Poisson generalized linear models on single cell count data. Bioinformatics 36:5701–5702. https://doi.org/10.1093/bioinformatics/btaa1009
- Deep mutational scanning: assessing protein function on a massive scale. Trends Biotechnol 29:435–442. https://doi.org/10.1016/j.tibtech.2011.04.003
- A new era in the treatment of cystic fibrosis: correction of the underlying CFTR defect. Lancet Respir Med 1:158–163. https://doi.org/10.1016/S2213-2600(12)70057-7
- From systems to structure - using genetic data to model protein structures. Nat Rev Genet 23:342–354. https://doi.org/10.1038/s41576-021-00441-w
- Genome-wide prediction of disease variant effects with a deep protein language model. Nat Genet 55:1512–1522. https://doi.org/10.1038/s41588-023-01465-0
- Alternative G protein coupling and biased agonism: new insights into melanocortin-4 receptor signalling. Mol Cell Endocrinol 331:232–240. https://doi.org/10.1016/j.mce.2010.07.007
- glmmTMB Balances Speed and Flexibility Among Packages for Zero-inflated Generalized Linear Mixed Modeling. R J 9:378–400. https://doi.org/10.32614/RJ-2017-066
- Human MC4R variants affect endocytosis, trafficking and dimerization revealing multiple cellular mechanisms involved in weight regulation. Cell Rep 34. https://doi.org/10.1016/j.celrep.2021.108862
- brms: An R Package for Bayesian Multilevel Models Using Stan. J Stat Softw 80:1–28. https://doi.org/10.18637/jss.v080.i01
- Systems and Methods for Measuring Cell Signaling Protein Activity. US Patent
- Highly efficient Cas9-mediated transcriptional programming. Nat Methods 12:326–328. https://doi.org/10.1038/nmeth.3312
- Accurate proteome-wide missense variant effect prediction with AlphaMissense. Science 381. https://doi.org/10.1126/science.adg7492
- RM-493, a melanocortin-4 receptor (MC4R) agonist, increases resting energy expenditure in obese individuals. J Clin Endocrinol Metab 100:1639–1645. https://doi.org/10.1210/jc.2014-4024
- A genomic mutational constraint map using variation in 76,156 human genomes. Nature 625:92–100. https://doi.org/10.1038/s41586-023-06045-0
- MC4R agonism promotes durable weight loss in patients with leptin receptor deficiency. Nat Med 24:551–555. https://doi.org/10.1038/s41591-018-0015-9
- Setmelanotide POMC and LEPR Phase 3 Trial Investigators. 2020. Efficacy and safety of setmelanotide, an MC4R agonist, in individuals with severe obesity due to LEPR or POMC deficiency: single-arm, open-label, multicentre, phase 3 trials. Lancet Diabetes Endocrinol 8:960–970. https://doi.org/10.1016/S2213-8587(20)30364-8
- Evaluation of a melanocortin-4 receptor (MC4R) agonist (Setmelanotide) in MC4R deficiency. Mol Metab 6:1321–1329. https://doi.org/10.1016/j.molmet.2017.06.015
- STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29:15–21. https://doi.org/10.1093/bioinformatics/bts635
- Clinical spectrum of obesity and mutations in the melanocortin 4 receptor gene. N Engl J Med 348:1085–1095. https://doi.org/10.1056/NEJMoa022050
- Melanocortin-4 receptor complexity in energy homeostasis, obesity and drug development strategies. Diabetes Obes Metab 24:583–598. https://doi.org/10.1111/dom.14618
- Mapping the energetic and allosteric landscapes of protein binding domains. Nature 604:175–183. https://doi.org/10.1038/s41586-022-04586-4
- DiMSum: an error model and pipeline for analyzing deep mutational scanning data and diagnosing common experimental pathologies. Genome Biol 21. https://doi.org/10.1186/s13059-020-02091-3
- Saturation editing of genomic regions by multiplex homology-directed repair. Nature 513:120–123. https://doi.org/10.1038/nature13695
- Universal allosteric mechanism for Gα activation by GPCRs. Nature 524:173–179. https://doi.org/10.1038/nature14663
- Deep mutational scanning: a new style of protein science. Nat Methods 11:801–807. https://doi.org/10.1038/nmeth.3027
- Treatment of Fabry’s Disease with the Pharmacologic Chaperone Migalastat. N Engl J Med 375:545–555. https://doi.org/10.1056/NEJMoa1510198
- MC4R Agonists: Structural Overview on Antiobesity Therapeutics. Trends Pharmacol Sci 39:402–423. https://doi.org/10.1016/j.tips.2018.01.004
- Modulation of blood pressure by central melanocortinergic pathways. N Engl J Med 360:44–52. https://doi.org/10.1056/NEJMoa0803085
- Common coupling map advances GPCR-G protein selectivity. Elife 11. https://doi.org/10.7554/eLife.74107
- GPCR activation mechanisms across classes and macro/microscales. Nat Struct Mol Biol 28:879–888. https://doi.org/10.1038/s41594-021-00674-7
- Structures of active melanocortin-4 receptor-Gs-protein complexes with NDP-α-MSH and setmelanotide. Cell Res 31:1176–1189. https://doi.org/10.1038/s41422-021-00569-8
- Prevalence, spectrum, and functional characterization of melanocortin-4 receptor gene mutations in a representative population-based sample and obese adults from Germany. J Clin Endocrinol Metab 91:1761–1769. https://doi.org/10.1210/jc.2005-2056
- Melanocortin-4 receptor gene: case-control study and transmission disequilibrium test confirm that functionally relevant mutations are compatible with a major gene effect for extreme obesity. J Clin Endocrinol Metab 88:4258–4267. https://doi.org/10.1210/jc.2003-030233
- The promise of new anti-obesity therapies arising from knowledge of genetic obesity traits. Nat Rev Endocrinol :1–15. https://doi.org/10.1038/s41574-022-00716-0
- Melanocortin-4 receptor in energy homeostasis and obesity pathogenesis. Prog Mol Biol Transl Sci 114:147–191. https://doi.org/10.1016/B978-0-12-386933-3.00005-4
- Molecular basis of proton-sensing by G protein-coupled receptors. bioRxiv. https://doi.org/10.1101/2024.04.17.590000
- alpha-Melanotropin: the minimal active sequence in the frog skin bioassay. J Med Chem 30:2126–2130. https://doi.org/10.1021/jm00394a033
- A small molecule agonist THIQ as a novel pharmacoperone for intracellularly retained melanocortin-4 receptor mutants. Int J Biol Sci 10:817–824. https://doi.org/10.7150/ijbs.9625
- Pharmacological chaperones for the misfolded melanocortin-4 receptor associated with human obesity. Biochim Biophys Acta Mol Basis Dis 1863:2496–2507. https://doi.org/10.1016/j.bbadis.2017.03.001
- Principles of early drug discovery. Br J Pharmacol 162:1239–1249. https://doi.org/10.1111/j.1476-5381.2010.01127.x
- Generic GPCR residue numbers - aligning topology maps while minding the gaps. Trends Pharmacol Sci 36:22–31. https://doi.org/10.1016/j.tips.2014.11.001
- A Scalable, Multiplexed Assay for Decoding GPCR-Ligand Interactions with RNA Sequencing. Cell Syst 8:254–260. https://doi.org/10.1016/j.cels.2019.02.009
- Structural and functional characterization of G protein-coupled receptors with deep mutational scanning. Elife 9. https://doi.org/10.7554/eLife.54895
- Understanding melanocortin-4 receptor control of neuronal circuits: Toward novel therapeutics for obesity syndrome. Pharmacol Res 129:10–19. https://doi.org/10.1016/j.phrs.2018.01.004
- Chronic treatment with a melanocortin-4 receptor agonist causes weight loss, reduces insulin resistance, and improves cardiovascular function in diet-induced obese rhesus macaques. Diabetes 62:490–497. https://doi.org/10.2337/db12-0598
- Fine-tuning protein Language Models with Deep Mutational Scanning improves variant effect prediction. arXiv
- ClinVar: public archive of relationships among sequence variation and human phenotype. Nucleic Acids Res 42:D980–5. https://doi.org/10.1093/nar/gkt1113
- emmeans: Estimated Marginal Means, aka Least-Squares Means. CRAN
- The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078–2079. https://doi.org/10.1093/bioinformatics/btp352
- G(q/11)α and G(s)α mediate distinct physiological responses to central melanocortins. J Clin Invest 126:40–49. https://doi.org/10.1172/JCI76348
- Human Gain-of-Function MC4R Variants Show Signaling Bias and Protect against Obesity. Cell 177:597–607. https://doi.org/10.1016/j.cell.2019.03.044
- Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15. https://doi.org/10.1186/s13059-014-0550-8
- Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal 17:10–12. https://doi.org/10.14806/ej.17.1.200
- Multiplex assessment of protein variant abundance by massively parallel sequencing. Nat Genet 50:874–882. https://doi.org/10.1038/s41588-018-0122-z
- Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation. Nucleic Acids Res 40:4288–4297. https://doi.org/10.1093/nar/gks042
- Benchmarking AlphaMissense pathogenicity predictions against cystic fibrosis variants. PLoS One 19. https://doi.org/10.1371/journal.pone.0297560
- A human obesity-associated MC4R mutation with defective Gq/11α signaling leads to hyperphagia in mice. J Clin Invest 134. https://doi.org/10.1172/JCI165418
- ProteinGym: Large-Scale Benchmarks for Protein Design and Fitness Prediction. bioRxiv. https://doi.org/10.1101/2023.12.07.570727
- Deep generative modeling of the human proteome reveals over a hundred novel genes involved in rare genetic disorders. medRxiv. https://doi.org/10.1101/2023.11.27.23299062
- Differential Signaling Profiles of MC4R Mutations with Three Different Ligands. Int J Mol Sci 21. https://doi.org/10.3390/ijms21041224
- UCSF ChimeraX: Structure visualization for researchers, educators, and developers. Protein Sci 30:70–82. https://doi.org/10.1002/pro.3943
- Validating therapeutic targets through human genetics. Nat Rev Drug Discov 12:581–594. https://doi.org/10.1038/nrd4051
- The stimulatory G protein Gsα is required in melanocortin 4 receptor-expressing cells for normal energy balance, thermogenesis, and glucose metabolism. J Biol Chem 293:10993–11005. https://doi.org/10.1074/jbc.RA118.003450
- Identification of a novel series of benzimidazoles as potent and selective antagonists of the human melanocortin-4 receptor. Bioorg Med Chem Lett 17:4464–4470. https://doi.org/10.1016/j.bmcl.2007.06.010
- FLASH: a next-generation CRISPR diagnostic for multiplexed detection of antimicrobial resistance sequences. Nucleic Acids Res 47. https://doi.org/10.1093/nar/gkz418
- A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biol 11. https://doi.org/10.1186/gb-2010-11-3-r25
- Moderated statistical tests for assessing differences in tag abundance. Bioinformatics 23:2881–2887. https://doi.org/10.1093/bioinformatics/btm453
- MC4R variants modulate α-MSH and setmelanotide induced cellular signaling at multiple levels. J Clin Endocrinol Metab. https://doi.org/10.1210/clinem/dgae210
- A statistical framework for analyzing deep mutational scanning data. Genome Biol 18. https://doi.org/10.1186/s13059-017-1272-5
- Design and pharmacology of N-[(3R)-1,2,3,4-tetrahydroisoquinolinium-3-ylcarbonyl]-(1R)-1-(4-chlorobenzyl)-2-[4-cyclohexyl-4-(1H-1,2,4-triazol-1-ylmethyl)piperidin-1-yl]-2-oxoethylamine (1), a potent, selective, melanocortin subtype-4 receptor agonist. J Med Chem 45:4589–4593. https://doi.org/10.1021/jm025539h
- Current Mechanistic and Pharmacodynamic Understanding of Melanocortin-4 Receptor Activation. Molecules 24. https://doi.org/10.3390/molecules24101892
- UMI-tools: modeling sequencing errors in Unique Molecular Identifiers to improve quantification accuracy. Genome Res 27:491–499. https://doi.org/10.1101/gr.209601.116
- G Protein-Coupled Receptors as Targets for Approved Drugs: How Many Targets and How Many Drugs? Mol Pharmacol 93:251–258. https://doi.org/10.1124/mol.117.111062
- Variant Interpretation: Functional Assays to the Rescue. Am J Hum Genet 101:315–325. https://doi.org/10.1016/j.ajhg.2017.07.014
- Prevalence of melanocortin-4 receptor deficiency in Europeans and their age-dependent penetrance in multigenerational pedigrees. Diabetes 57:2511–2518. https://doi.org/10.2337/db08-0153
- Targeting the central melanocortin system for the treatment of metabolic disorders. Nat Rev Endocrinol 19:507–519. https://doi.org/10.1038/s41574-023-00855-y
- The melanocortin-4 receptor: physiology, pharmacology, and pathophysiology. Endocr Rev 31:506–543. https://doi.org/10.1210/er.2009-0037
- Melanocortin-4 receptor mutations are a frequent and heterogeneous cause of morbid obesityJ Clin Invest 106:253–262https://doi.org/10.1172/JCI9238
- Loss-of-function mutations in the melanocortin 4 receptor in a UK birth cohortNat Med 27:1088–1096https://doi.org/10.1038/s41591-021-01349-y
- Rescue of defective MC4R cell-surface expression and signaling by a novel pharmacoperone Ipsen 17J Mol Endocrinol 53:17–29https://doi.org/10.1530/JME-14-0005
- Expanding Approved Patient Populations for Rare Disease Treatment Using In Vitro DataClin Pharmacol Ther 112:58–61https://doi.org/10.1002/cpt.2414
- Multiplexed assays of variant effects contribute to a growing genotype-phenotype atlasHum Genet 137:665–678https://doi.org/10.1007/s00439-018-1916-x
- The energetic and allosteric landscape for KRAS inhibitionNature 626:643–652https://doi.org/10.1038/s41586-023-06954-0
- Mutations in the human melanocortin-4 receptor gene associated with severe familial obesity disrupts receptor function through multiple molecular mechanismsHum Mol Genet 12:561–574https://doi.org/10.1093/hmg/ddg057
- Determination of the melanocortin-4 receptor structure identifies Ca2+ as a cofactor for ligand bindingScience 368:428–433https://doi.org/10.1126/science.aaz8995
- Structural insights into ligand recognition and activation of the melanocortin-4 receptorCell Res 31:1163–1175https://doi.org/10.1038/s41422-021-00552-3
- Common activation mechanism of class A GPCRsElife 8https://doi.org/10.7554/eLife.50279
Article and author information
Copyright
© 2024, Howard et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.