High resolution deep mutational scanning of the melanocortin-4 receptor enables target characterization for drug discovery

Conor J Howard; Nathan S Abell; Beatriz A Osuna; Eric M Jones; Leon Y Chan; Henry Chan; Dean R Artis; Jonathan B Asfaha; Joshua S Bloom; Aaron R Cooper; Andrew Liao; Eden Mahdavi; Nabil Mohammed; Alan L Su; Giselle A Uribe; Sriram Kosuri; Diane E Dickel; Nathan B Lubock

doi:10.7554/eLife.104725.2

eLife Assessment

The authors use deep mutational scanning to assess the effect of ~6,600 protein-coding variants in MC4R, a G protein coupled receptor associated with obesity. They develop new, more precise approaches to deep mutational scanning, enabling them to probe molecular phenotypes directly relevant to the development of drugs that target this receptor. In this important work, the authors provide compelling evidence that variants impact signaling through MC4R in different ways, that some defective variants are amenable to a corrector drug and that deep mutational scanning data could guide compound optimization.

https://doi.org/10.7554/eLife.104725.2.sa3

Significance of findings

important: Findings that have theoretical or practical implications beyond a single subfield

Strength of evidence

compelling: Evidence that features methods, data and analyses more rigorous than the current state-of-the-art

During the peer-review process the editor and reviewers write an eLife assessment that summarises the significance of the findings reported in the article (on a scale ranging from landmark to useful) and the strength of the evidence (on a scale ranging from exceptional to inadequate).

Abstract

Deep Mutational Scanning (DMS) is an emerging method to systematically test the functional consequences of thousands of sequence changes to a protein target in a single experiment. Because of its utility in interpreting both human variant effects and protein structure-function relationships, it holds substantial promise to improve drug discovery and clinical development. However, applications in this domain require improved experimental and analytical methods. To address this need, we report novel DMS methods to precisely and quantitatively interrogate disease-relevant mechanisms and protein-ligand interactions, and to assess predicted response to drug treatment. Using these methods, we performed a DMS of the melanocortin-4 receptor (MC4R), a G protein-coupled receptor (GPCR) implicated in obesity and an active target of drug development efforts. We assessed the effects of >6,600 single amino acid substitutions on MC4R’s function across 18 distinct experimental conditions, resulting in >20 million unique measurements. From this, we identified variants that have unique effects on MC4R-mediated Gα_s- and Gα_q-signaling pathways, which could be used to design drugs that selectively bias MC4R’s activity. We also identified pathogenic variants that are likely amenable to a corrector therapy. Finally, we functionally characterized structural relationships that distinguish the binding of peptide versus small molecule ligands, which could guide compound optimization. Collectively, these results demonstrate that DMS is a powerful method to empower drug discovery and development.

Introduction

Deep Mutational Scanning (DMS) employs cutting-edge synthetic biology or genome editing methods, DNA synthesis, and sequencing to systematically assess the effect of every possible single amino acid substitution on the function of a protein target (Starita et al. 2017; Araya and Fowler 2011; Fowler and Fields 2014). Researchers have leveraged DMS to gauge a variety of protein functions or their cellular consequences, including viability (Findlay et al. 2014), protein abundance (Matreyek et al. 2018; Faure et al. 2022), transcriptional signaling (Jones et al. 2020), and inter- and intramolecular interactions (Braberg et al. 2022; Faure et al. 2022). DMS assays are increasingly used for human variant interpretation (Weile and Roth 2018) and to elucidate the relationship between protein structure and function, including in the evaluation (Brandes et al. 2023) and finetuning of protein language models (Lafita et al. 2024).

While DMS has significantly advanced our understanding of protein function, its potential in drug discovery and development has yet to be fully realized. Applications in this realm require methods that are more disease-relevant, sensitive, and quantitative to measure the subtle effects of both sequence variants and experimental conditions (e.g., drug treatments). For example, DMS assays often measure effects like viability, which are several biological layers removed from the specific molecular mechanisms modulated by drugs. Additionally, the current signal-to-noise ratios of many assays and challenges around uncertainty quantification make drawing quantitative conclusions from DMS data difficult. Moving from categorical (e.g., benign vs. pathogenic) classifications toward precise quantitative measurements of human variants could improve predictions of safety and efficacy at early stages of drug development programs (Plenge, Scolnick, and Altshuler 2013). It would also expand the practice of matching patients’ medications to their specific genetic variants (i.e., theratyping), which could lead to better patient outcomes (McDonald et al. 2024). Finally, DMS data contain the functional consequences of thousands of different biochemical perturbations. Sufficiently sensitive assays against functional readouts and improved analysis methods would readily complement structure-based approaches by elucidating the functional consequences of ligand binding. Building on this, it should be possible to identify novel protein-ligand interactions that could be reverse-engineered to increase compound potency, further expanding the potential impact of DMS on drug discovery.

In previous work, we described DMS methods to measure the effects of thousands of single amino acid variants on the function of the beta-2 adrenergic receptor (β2AR) (Jones et al. 2020), a member of the G protein-coupled receptor (GPCR) class, which is the most commonly targeted protein family in drug development (Sriram and Insel 2018). Here, we build upon this previous work to demonstrate more sensitive and robust DMS methods for drug discovery and development, focusing on the melanocortin-4 receptor (MC4R). MC4R is a GPCR, and human variants that result in partial or complete loss of its function cause the most common form of inherited obesity [OMIM #618406] (Farooqi et al. 2003; Vaisse et al. 2000; Hinney et al. 2003). Variants that increase MC4R activity are protective against obesity (Lotta et al. 2019; Paisdzior et al. 2020), and numerous small molecule and peptide agonists of MC4R have been tested as potential therapeutics (Sweeney et al. 2023; Greenfield et al. 2009; K. Y. Chen et al. 2015; Kievit et al. 2013; Collet et al. 2017; Clément et al. 2020; Huang and Tao 2014; Hinney, Körner, and Fischer-Posovszky 2022).

In this study, we developed substantially improved experimental and analytical methods for DMS that are capable of detecting subtle quantitative effects of variants and differences between experimental conditions with a high degree of statistical rigor. We then tested the effects of nearly all possible single amino acid variants of MC4R (6,633 of 6,640) on two distinct GPCR signaling functions under a variety of treatment conditions. From this, we generated a high-resolution map of how MC4R’s structure relates to function, and we accurately classified the quantitative effects of human variants. Additionally, we identified amino acid changes that differentially impact (i.e., bias) MC4R’s different GPCR signaling functions, pinpointed human variants that are amenable to a specific class of therapy, and elucidated the functional impact of protein-ligand interactions between MC4R and both peptide and small molecule ligands. This demonstrates the utility of DMS for various drug discovery and development applications, and the methods described herein should be broadly applicable to GPCRs and other drug target classes that function in transcriptional signaling pathways.

Results

Development of highly quantitative deep mutational scanning methods

Assays for disease-relevant mechanisms

A critical tool in drug discovery programs is a set of highly sensitive assays that directly interrogate the function(s) of a protein target and against which potential therapies can be tested (Hughes et al. 2011). This is essential for ensuring that compounds modulate a specific mechanism underlying disease and do not have undesired “off-target” effects. Stimulation of MC4R with its agonist, alpha-melanocyte-stimulating hormone (α-MSH), results in signaling through multiple canonical GPCR pathways, including Gα_s-coupled cyclic adenosine monophosphate signaling (hereafter referred to as Gs) and Gα_q-coupled calcium signaling (hereafter referred to as Gq) (Tao 2010). Therefore, we first developed multiplexed reporter assays for these two critical MC4R G-protein signaling functions (Fig. 1, Supplementary Fig. 1), building on our earlier work performing a deep mutational scan of β2AR (Jones et al. 2020). Both reporters were designed for use in human HEK293T cells and to be compatible with our previously described DMS library construction methods (Jones et al. 2020). Briefly, these methods harness high-throughput DNA synthesis to construct every possible single amino acid variant, and each variant is then linked to a transcriptional reporter containing an oligonucleotide sequence barcode unique to that variant. Reporter constructs are then integrated into cells using a site-specific recombination-based landing pad system and drug selection to ensure that each cell contains a single variant-barcode combination. Activation of the receptor turns on a response element for the signaling pathway, leading to the expression of the barcoded reporter, which is then quantified using RNA sequencing.

Highly quantitative Deep Mutational Scanning (DMS) methods for drug discovery applications.
**Top:** DMS experimental and analysis methods improvements reported in this paper. Schematic made with BioRender.com/g94q960. **Bottom:** Questions commonly encountered in drug discovery and development that can be addressed by improved DMS methodology.

The MC4R Gs assay was adapted and further optimized from the cAMP response element-based reporter we previously used for β2AR (Supplementary Fig. 1A,C). For Gq signaling, an analogous reporter using an NFAT response element (Boss, Talpade, and Murphy 1996) alone to activate reporter gene expression was not suitable due to weak signal-to-noise (Supplementary Fig. 1D). To solve this problem, we incorporated a “relay” system to amplify the reporter signal using a synthetic transcription factor composed of Gal4 fused to the VP64-p65-Rta (VPR) transcriptional activator (Chavez et al. 2015) (Fig. 1, Supplementary Fig. 1B, see Methods for more details). The resulting Gal4-VPR transcription factor was placed under the control of the NFAT response element, and the reporter gene was placed under the control of a UAS element that responds to Gal4 binding. This assay design resulted in robust reporter expression upon stimulation of MC4R (Supplementary Fig. 1C). Together, these assays provide sensitive functional readouts for two important MC4R signaling activities implicated in obesity and other phenotypes (Fatima et al. 2022; Sweeney et al. 2023), and we used them to build DMS libraries to assess the effects of all possible single amino acid substitutions in MC4R.

Analysis model for statistically robust comparisons

To explore many hypotheses that arise in drug discovery applications, it is valuable to assay a DMS library using experimental replication and under a variety of conditions, such as different drugs and/or pathways of interest. However, most popular methods for DMS analysis do not leverage DNA barcodes or other experimental replicate information, nor do they support hypothesis testing between conditions (Rubin et al. 2017; Faure et al. 2020). Consequently, we developed an alternative modeling framework to enable this (Fig. 1, Supplementary Fig. 2A, see Methods: Negative Binomial Regression Analysis Pipeline for additional details). Briefly, borrowing from approaches for inferring differential expression from RNA-seq data (McCarthy, Chen, and Smyth 2012; Love, Huber, and Anders 2014; Ahlmann-Eltze and Huber 2021), we applied a mixed effect negative binomial generalized linear model (GLM) to raw barcode counts directly. The model contains a random effect across barcodes to share barcode information between replicates and conditions, and incorporates sample-specific offsets to account for technical covariates like sequencing depth, as is common for RNA-seq (Robinson and Oshlack 2010). For each variant, we estimate the mean shift in barcode count and associated standard error for each treatment condition, relative to wild-type. Using per-condition summary statistics, we either directly test whether each variant barcode mean is significantly different from wild-type (zero) in each treatment, or we can define more complex linear contrasts on variant effects across multiple treatments.
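The two kinds of test described above (a variant versus wild-type in one condition, and a linear contrast between conditions) can be sketched from summary statistics alone. This is a minimal illustration rather than the pipeline used in the study: it assumes each effect estimate is approximately normal with a known standard error and that estimates from different conditions are independent, and the function names and input numbers are hypothetical.

```python
import math

def wald_z(effect, se):
    """z-score for testing a variant effect against wild-type (zero)."""
    return effect / se

def contrast_z(effect_a, se_a, effect_b, se_b):
    """z-score for the contrast (a - b), assuming independent estimates."""
    return (effect_a - effect_b) / math.sqrt(se_a ** 2 + se_b ** 2)

def two_sided_p(z):
    """Two-sided p-value under a standard normal reference distribution."""
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical effect estimates relative to wild-type, for illustration only.
z_single = wald_z(-1.2, 0.2)                     # one variant, one condition
z_between = contrast_z(-1.2, 0.2, -0.3, 0.2)     # same variant, two conditions
print(z_single, two_sided_p(z_single))
print(z_between, two_sided_p(z_between))
```

In a model with shared barcode random effects the estimates for two conditions are generally correlated, so a full implementation would use the estimated covariance instead of the independence assumption made here.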

Increasing power through barcoding

As described above, our assay design and analysis framework harness DNA barcodes that are uniquely associated with a particular variant and provide multiple independent measures of a variant’s effect. In our previous DMS of β2AR, the median number of barcodes independently linked to each variant was ∼10 (Jones et al. 2020), and we reasoned that increasing this number would increase the power to detect functional effects. To this end, we optimized and scaled our library cloning and cellular integration protocols to target ∼30 barcodes per variant in building DMS libraries for MC4R. As expected, this increased the power to detect variant effects. For example, the separation between the activity of alleles that are clearly deleterious (i.e., a stop codon at any position in the protein) and all other alleles (i.e., wild-type or missense variants) was drastically increased in the MC4R Gs assay relative to the same assay for β2AR (Supplementary Fig. 2B). To further test the effect of the number of barcodes for a given variant, we computationally down-sampled the barcodes for representative positions of MC4R and ran the resulting data through our analysis pipeline. The standard error of the estimated variant effect decreased as the number of barcodes per variant increased, in a manner consistent with increased sample size (Supplementary Fig. 2C). This confirms that increasing the number of barcodes per variant enables the quantification of subtle differences within a standard hypothesis testing framework, and provides an experimental parameter that one can vary to improve the power to detect the effects of sequence variants of particular interest.
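The expected scaling of the standard error with barcode number can be illustrated with a small simulation. This is only a sketch: it uses simulated per-barcode effects and a plain standard error of the mean in place of the negative binomial pipeline, and all names and parameter values are hypothetical.

```python
import random
import statistics

def standard_error(values):
    """Standard error of the mean of independent barcode measurements."""
    return statistics.stdev(values) / len(values) ** 0.5

def downsample_se(values, n_barcodes, n_draws=200, seed=0):
    """Average standard error over random subsets of n_barcodes barcodes."""
    rng = random.Random(seed)
    draws = [standard_error(rng.sample(values, n_barcodes)) for _ in range(n_draws)]
    return statistics.mean(draws)

# Simulated per-barcode effects for one hypothetical variant (not real data).
sim = random.Random(42)
barcode_effects = [sim.gauss(-1.0, 0.5) for _ in range(60)]

for n in (5, 10, 30, 60):
    print(n, round(downsample_se(barcode_effects, n), 3))
```

Under these assumptions the standard error shrinks roughly in proportion to one over the square root of the number of barcodes, so tripling the barcodes from about 10 to about 30 would be expected to reduce it by a factor of about 1.7.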

Comprehensive deep mutational scanning of MC4R

With these methods in hand, we carried out a comprehensive assessment of the effects of all single amino acid substitutions (including nonsense variants) on MC4R’s Gs and Gq signaling activities under a variety of experimental conditions (Fig. 2A, Supplementary Table 1, Supplementary Figs. 3-4). We selected experimental conditions that would inform aspects of drug discovery and development programs, such as elucidating protein structure-function relationships, identifying regions of the protein that bias activity towards or away from a specific function, classifying the effects of human variants in the presence and absence of potential therapies, and uncovering functional differences in protein-ligand interactions. In total, we tested 18 unique conditions, each performed in quadruplicate, including: basal activity (i.e., no stimulation) of MC4R, stimulation of MC4R with a range of doses of the native peptide agonist alpha-melanocyte-stimulating hormone (α-MSH), stimulation with a range of doses of a small molecule agonist (THIQ) (Sebhat et al. 2002), treatment with a small molecule corrector (Ipsen 17) (Wang et al. 2014; Poitout et al. 2007), and library composition normalization controls (forskolin, see Methods). Our resulting DMS assays had near-complete variant coverage, with 99.9% (6,633/6,640) of all possible single amino acid substitutions present in all experimental conditions. Each variant was represented by an average of 56 and 28 barcodes for the Gs and Gq signaling pathways, respectively. Across both assays, this translates to more than 557,000 uniquely engineered human cells, each containing a distinctive variant-reporter-barcode combination. When factoring in the number of experimental conditions (18 unique), replicates (four per condition), amino acid variants tested (99.9% of 6,640 possible), and the mean barcodes per variant (56 and 28 for Gs and Gq, respectively), this equates to >21,500,000 measurements across all datasets.
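The coverage and cell-count figures above follow directly from the quoted numbers, as the short check below shows. The variable names are illustrative; the total measurement count is not recomputed here because it also depends on how the 18 conditions are divided between the Gs and Gq assays, which is given in Supplementary Table 1 and not in this section.

```python
possible_variants = 6640
observed_variants = 6633
mean_barcodes = {"Gs": 56, "Gq": 28}  # mean barcodes per variant, per assay

coverage = observed_variants / possible_variants
# One engineered cell line per variant-barcode combination, summed over assays.
cells = observed_variants * sum(mean_barcodes.values())

print(f"coverage: {coverage:.1%}")                  # coverage: 99.9%
print(f"variant-barcode combinations: {cells:,}")   # variant-barcode combinations: 557,172
```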

Effect of 6,633 missense and nonsense variants on MC4R signaling functions.
A. Heatmaps showing the functional effects (z-scores) for nearly all possible amino acid substitutions (6,633 of 6,640) on MC4R activity for two GPCR signaling functions (Gs and Gq) under a variety of conditions. Heatmaps showing results (both z-score and log2[fold change of variant activity over wild-type]) for all experimental conditions are shown in Supplementary Figs. 3-4. The results of the Gs assay with low α-MSH stimulation are highlighted on the left (and in Panels **B-F**). TM: transmembrane domain; GoF: gain-of-function; LoF: loss-of-function; WT: wild-type activity. B. A modified snake plot (adapted from https://gpcrdb.org/) showing the sensitivity of each MC4R residue to mutation (defined as the mean log2[fold change variant activity over wild-type] divided by sqrt(sum(standard error^2)) after excluding nonsense variants). C. Z-scores for each variant (point), broken out by variant type and clinical (ClinVar) classification. Blue indicates statistically significant LoF, brown is significant GoF (significance threshold: FDR < 1%). VUS: variant of uncertain significance. D. Functional effect (log2[fold change of variant activity over wild-type], x-axis) for all human variants relative to the frequency of the allele in the human population (y-axis, gnomAD global population (S. Chen et al. 2024)). E. DMS results for human MC4R variants (y-axis) relative to previous functional classifications in the literature (x-axis, from Huang, Wang, and Tao 2017). F. DMS results (z-scores, x-axis) compared to change in α-MSH potency (relative to WT) of 25 MC4R variants made to the orthosteric binding site for α-MSH (y-axis, cAMP accumulation assay results from Zhang et al. 2021). G. Fractions of MC4R variants that result in LoF, GoF, or WT activity for eight unique experimental conditions.

Multiple lines of evidence support the high quality and utility of these data for classification of variant effects (Fig. 2A-F, Supplementary Figs. 3-5). Focusing on one dataset as a representative example (Gs signaling using a low dose of α-MSH stimulation), variants that introduce stop codons or fall within transmembrane domains and buried surfaces disproportionately lead to significant loss of MC4R function (Fig. 2A-C). The results also correlate well with expectations from human genetics data and variant effect prediction algorithms (Fig. 2C-D, Supplementary Fig. 5). For example, the majority (63.3%, 31/49) of human MC4R variants classified as pathogenic or likely pathogenic in ClinVar (Landrum et al. 2014) lead to a significant reduction of Gs signaling under low α-MSH stimulation conditions (significance threshold: false discovery rate (FDR) < 1%; Fig. 2C). Variants that are significantly loss-of-function in this condition are rarer in the human population, and more common human variants have no significant effect on MC4R function (significance threshold: FDR < 1%; Fig. 2D). Variants that are loss-of-function in our DMS assay are also typically predicted to be deleterious by commonly used variant effect predictors such as AlphaMissense (93.4%, 1894/2028; Cheng et al. 2023) and popEVE (Orenbuch et al. 2023) (Supplementary Fig. 5).
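For readers unfamiliar with FDR thresholds, the sketch below shows how per-variant p-values can be converted into adjusted values and compared with a 1% cutoff. It assumes the Benjamini-Hochberg procedure, which this section does not specify, and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

p_values = [0.0001, 0.003, 0.03, 0.2, 0.8]   # hypothetical per-variant p-values
q_values = benjamini_hochberg(p_values)
significant = [q < 0.01 for q in q_values]   # FDR < 1%
print(q_values)
print(significant)
```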

Because of the sensitivity of our reporter system and the statistical power gained by testing dozens of unique barcodes per variant, we anticipated that these assays would capture subtle quantitative, rather than just qualitative, effects on MC4R function. To assess this, we benchmarked our results against previous quantitative characterizations of MC4R variants from the literature (Fig. 2E-F). For example, many MC4R variants that have been observed in the human population have been previously tested for their effects on MC4R function, and a review (Huang, Wang, and Tao 2017) summarizing this work systematically classified >70 variants according to whether they result in “full”, “partial”, or “no/mild” loss of MC4R activity. Our results are consistent with these classifications: variants classified previously as “full” loss-of-function typically have very low MC4R activity in our assay, “partial” variants have intermediate effects, and “no/mild” effect variants have near-normal activity (Fig. 2E, median log2[fold change of variant activity over wild-type activity] of -1.4, -0.8, and -0.3, respectively, for each group). Finally, our results show a high degree of correlation (Pearson correlation = 0.84, R² = 0.71) with quantitative effect measurements reported for 25 variants individually introduced into the orthosteric site of MC4R (Zhang et al. 2021). Collectively, this demonstrates that our high quality MC4R DMS data accurately and quantitatively assess the effects of variants on MC4R’s function.
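The two statistics quoted for the orthosteric-site comparison are related: for a linear fit with a single predictor, the coefficient of determination equals the square of the Pearson correlation. Assuming both values were rounded to two decimal places, the reported figures are mutually consistent:

```latex
R^2 = r^2 = (0.84)^2 = 0.7056 \approx 0.71
```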

Systematic human variant interpretation of MC4R

Looking across multiple data sets gives a comprehensive picture of variant effects, as the various experimental conditions tested have disparate strengths and weaknesses at detecting loss-of-function (LoF) versus gain-of-function (GoF) activities (Fig. 2G). For example, unstimulated conditions (e.g., zero α-MSH) uncover variants that lead to constitutive activation of MC4R, but they have less ability to detect loss-of-function variants. In contrast, conditions with agonist stimulation are much better able to identify loss of MC4R function. Considering individual α-MSH stimulation conditions (zero, low, medium, and high) for both the Gs and Gq assays, each condition identifies 6.6% to 39.3% of variants as loss-of-function and 0.02% to 1.1% as gain-of-function (Fig. 2G). Collectively across all α-MSH stimulation conditions, 3,370 variants (50.8%) are loss-of-function in at least one condition, 347 variants (5.2%) are gain-of-function in at least one condition, and 2,996 (45.2%) always show wild-type activity. Interestingly, 80 (1.2%) variants are classified as both loss- and gain-of-function, depending upon the condition.
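These counts are related by inclusion-exclusion, because the 80 variants called both loss- and gain-of-function are counted in both groups. The check below uses only the numbers quoted above; the variable names are illustrative.

```python
total = 6633
lof_any = 3370   # LoF in at least one alpha-MSH condition
gof_any = 347    # GoF in at least one alpha-MSH condition
both = 80        # LoF in one condition and GoF in another

# Inclusion-exclusion: variants with a significant effect in any condition.
any_effect = lof_any + gof_any - both
always_wt = total - any_effect

print(any_effect, always_wt)  # 3637 2996
for label, n in [("LoF", lof_any), ("GoF", gof_any), ("WT", always_wt), ("both", both)]:
    print(label, f"{n / total:.1%}")
```

Because of the overlap, the three headline percentages sum to slightly more than 100%.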

To aid in clinical variant interpretation, we provide detailed functional effect classifications for 220 human variants reported in ClinVar (Landrum et al. 2014) or from published patient sequencing studies (Farooqi et al. 2003; Yeo et al. 2003; Hinney et al. 2006; Stutzmann et al. 2008; Wade et al. 2021; Brouwers et al. 2021; Huang, Wang, and Tao 2017; Hinney, Volckmar, and Knoll 2013; Rodríguez Rondón et al. 2024) in Supplementary Table 2. In total, 130 of these reported human variants (59.1%) are LoF in at least one condition, consistent with being pathogenic for obesity-related phenotypes. This includes 83.9% (26/31) of the variants that are reported in ClinVar as pathogenic or likely pathogenic, 53.3% (32/60) of those that are unclassified or have conflicting interpretation, and 0% (0/3) of those classified as benign or likely benign. A small number of reported human variants (V103I, H158R, I251L) result in significant increases in Gs and/or Gq signaling, consistent with having a protective effect for obesity, and are generally classified as benign in ClinVar or as having wild-type activity in the literature (Hinney et al. 2006). These results highlight the utility of systematic deep scans across multiple experimental conditions for facilitating human variant interpretation.

Variants that bias MC4R function

MC4R signals through multiple G protein pathways (Sharma et al. 2019; Paisdzior et al. 2020; Ju, Cho, and Sohn 2018; Breit et al. 2011), and evidence from human variant interpretation studies (Lotta et al. 2019; Metzger et al. 2024; Rodríguez Rondón et al. 2024), mouse modeling (Y.-Q. Li et al. 2016), and previous drug programs (Sharma et al. 2019; Clément et al. 2018) suggests that biasing MC4R’s activity toward or away from specific pathways could be therapeutically valuable for the treatment of obesity. To gain a better understanding of how MC4R structure relates to its various functions, we systematically searched for variants that differentially impact Gs versus Gq signaling. We applied Principal Component Analysis (PCA) across eight total DMS datasets (four each for Gs and Gq: with zero, low, medium, and high α-MSH stimulation; see Supplementary Table 1 for details). The first two principal components explained 66% and 12% of the variance, respectively (Fig. 3A, Supplementary Fig. 6A,B). Through inspection, we found that Principal Component 1 (PC1) separates variants that impact both signaling functions, with variants that are loss-of-function for both Gs and Gq having higher PC1 values (Supplementary Fig. 6A,B). PC2 separates variants that affect Gs and Gq signaling differently (Fig. 3A-D, Supplementary Fig. 6A,B). Variants with higher PC2 values typically exhibit Gq bias by having greater than wild-type levels of Gq signaling activity, while retaining wild-type levels of Gs activity (Fig. 3C,D; Supplementary Fig. 6C,D). In contrast, variants with more negative PC2 values are Gs-biased variants that typically have wild-type levels of Gs signaling and reduced Gq signaling upon agonist stimulation (Fig. 3C).
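The intuition that one component captures effects shared by both pathways while the next captures their difference can be seen in a toy example. The sketch below is a deliberate simplification: it uses two columns (one Gs and one Gq value per variant) instead of the eight conditions analysed in the study, the values are invented, and the function names are hypothetical. The sign of each principal axis is arbitrary, so only the separation between groups is meaningful.

```python
import math

def pca_2d(points):
    """PCA for two-column data using the closed-form eigen-decomposition of
    the 2x2 covariance matrix (assumes a non-zero covariance term).
    Returns (means, [(variance_fraction, unit_vector), ...]) with PC1 first."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    half_trace = (sxx + syy) / 2
    radius = math.hypot((sxx - syy) / 2, sxy)
    components = []
    for eigenvalue in (half_trace + radius, half_trace - radius):
        vx, vy = sxy, eigenvalue - sxx   # solves (C - eigenvalue * I) v = 0
        norm = math.hypot(vx, vy)
        components.append((eigenvalue / (sxx + syy), (vx / norm, vy / norm)))
    return (mx, my), components

def score(point, means, vector):
    """Projection of a mean-centred point onto a principal axis."""
    return (point[0] - means[0]) * vector[0] + (point[1] - means[1]) * vector[1]

# Hypothetical (Gs, Gq) log2 fold changes; not data from the study.
variants = {
    "wt_like_1": (0.0, 0.1), "wt_like_2": (0.1, -0.1),
    "lof_both_1": (-2.0, -2.1), "lof_both_2": (-1.5, -1.4), "lof_both_3": (-1.0, -1.1),
    "gq_biased": (0.0, 1.0), "gs_biased": (0.0, -1.0),
}
means, (pc1, pc2) = pca_2d(list(variants.values()))
print("variance explained:", round(pc1[0], 2), round(pc2[0], 2))
for name, point in variants.items():
    print(name, round(score(point, means, pc1[1]), 2), round(score(point, means, pc2[1]), 2))
```

In this toy data the first axis has Gs and Gq loadings of the same sign (shared loss of function) and the second has loadings of opposite sign, so the two biased variants fall on opposite sides of the second axis.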

Systematic identification of variants that have biased effects on MC4R signaling.
A. Principal component analysis of eight MC4R DMS conditions (Gs and Gq signaling each at four α-MSH stimulation doses: zero, low, medium, high, see Supplementary Table 1). Each point indicates a unique variant, with those exhibiting extreme Gq or Gs bias colored in purple and green, respectively. PC, principal component. Schematics made with BioRender.com/d70i288. B. Ribbon structure of MC4R showing the maximum absolute PC2 value for each residue position. **C. Top:** Closeup of MC4R structure shown in bottom left box in B, highlighting positions where variants result in Gs (green) or Gq (purple) bias. **Bottom:** MC4R signaling activity (log2[fold change of variant activity relative to wild-type]) for three selected variants across α-MSH doses. Error bars are +/- 2 standard errors. Med: medium; WT: wild-type. **D. Left:** Closeup of MC4R structure shown in bottom right box in B, highlighting positions where variants result in Gs or Gq bias as in C. **Right:** MC4R signaling activity for select variants as in C.

Overall, more variants substantially increase Gq signaling than Gs signaling (Fig. 3A). Gs is the primary G protein coupling for MC4R (Tao 2010; Podyma et al. 2018), and our data suggest that there is little room for further improving MC4R’s already robust Gs signaling activity. Interestingly, biased variants are positionally diverse. For example, the 14 variants that display the most extreme Gq bias (Fig. 3A) are found at 12 different residue positions, with position 79 unique in having several variants that result in Gq bias (Supplementary Fig. 6D). Many of the variants with extreme Gq or Gs bias are located within the regions of MC4R that interact with G proteins, with some scattered throughout the transmembrane domains and far fewer in the vicinity of the protein’s orthosteric site (Fig. 3B-D).

Structural insights into biased signaling

Comparing these results with existing structural information could provide additional detailed insights into MC4R’s signaling functions. It is thought that ligand binding in GPCR orthosteric sites is communicated to the intracellular G-protein binding domain through a series of conserved residues or “microswitches” (Hauser et al. 2021; Zhou et al. 2019). Structural studies comparing the inactive and active state structures have confirmed that MC4R shares a similar signaling cascade (Yu et al. 2020; Zhang et al. 2021). Upon ligand binding, W258 (W258^6×48 in https://gpcrdb.org/ nomenclature, where 6 corresponds to the 6th transmembrane helix and 48 indicates that residue 258 lies 2 positions before the most conserved residue in that helix, which is assigned position 50 (Isberg et al. 2015)) of the conserved CWxP motif undergoes a conformational rearrangement that is translated to L133^3×36 and I137^3×40 of the conserved PIF motif (MIF in melanocortin receptors). This causes F254^6×44 in the PIF motif to rearrange, which in turn disrupts the packing of three different interactions: 1) L140^3×43 and I143^3×46, 2) I251^6×41 and L247^6×37, and 3) R147^3×50 and N240^6×30. These, amongst other rearrangements, culminate in the receptor being able to bind a G-protein. This interaction with the G-protein is primarily mediated through R147^3×50 in the conserved DRY motif, T150^3×53, Y157^34×53, H158^34×54 and R305^7×56 (Zhang et al. 2021).
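The generic numbering used above can be read mechanically: the number before the × names the structural segment and the number after it gives the position relative to the reference residue, which is numbered 50. The sketch below parses the labels quoted in this paragraph and recovers the sequence position of the reference residue in TM3 and TM6. The helper names are hypothetical, and the arithmetic assumes no helix bulges or gaps between a residue and its reference position.

```python
def parse_generic_number(label):
    """Split a GPCRdb-style label such as '6×48' into (segment, position)."""
    segment, position = label.replace("x", "×").split("×")
    return int(segment), int(position)

def anchor_residue(residue_number, label):
    """Sequence number of the position-50 reference residue implied by a label,
    assuming no helix bulges or gaps between the two positions."""
    _, position = parse_generic_number(label)
    return residue_number + (50 - position)

# Residues and labels quoted in the text above.
tm3 = {133: "3×36", 137: "3×40", 140: "3×43", 143: "3×46", 147: "3×50", 150: "3×53"}
tm6 = {240: "6×30", 247: "6×37", 251: "6×41", 254: "6×44", 258: "6×48"}

print({anchor_residue(n, g) for n, g in tm3.items()})  # {147}
print({anchor_residue(n, g) for n, g in tm6.items()})  # {260}
```

All TM3 labels point to residue 147 (the arginine of the DRY motif) and all TM6 labels point to residue 260, two positions after W258, so the labels in this paragraph are internally consistent.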

Strikingly, a number of mutations at residues throughout this signaling cascade had extremely positive PC2 values, implicating them as Gq-biasing mutations. Within the core of the cascade, we identified I137^3×40L, F254^6×44P, and L140^3×43I as Gq-biased (Fig. 3A; Supplementary Fig. 6C). We also identified a number of Gq-biasing mutations within the G-protein binding pocket, specifically T150^3×53G and H158^34×54R (Fig. 3A,C; Supplementary Fig. 6D). Notably, the H158^34×54R variant is found in the human population (Hinney et al. 2006; Wade et al. 2021) and has previously been shown to preferentially signal through the Gq pathway (Paisdzior et al. 2020). H158^34×54 is also located near two other mutations (K164^3×55L, F152^3×55R), in intracellular loop 2 (ICL2) and near the ends of the third and fourth transmembrane domains (TM3 and TM4), that display bias (Fig. 3C). Of these, K164^3×55L exhibits a Gs bias, in that it drives loss of function through Gq.

Our data also point to a number of potentially novel interactions. For example, M79^2×39 packs against residues H387^G.H5.19 and Q390^G.H5.22 of the Gs alpha subunit (G_sα) (Flock et al. 2015). This position has multiple different variants that result in Gq bias, including M79^2×39R, M79^2×39S, and M79^2×39G (Fig. 3A, Supplementary Fig. 6D). The most extreme bias signal in our PCA came from I223^5×69L (Fig. 3A,D), which interfaces with the C-terminus of G_sα (Flock et al. 2015), near position L394. Further down towards the intracellular side of TM5, we also identified V228R (Fig. 3A,D), which interfaces proximal to E323^G.hgh4.13 in G_sα (Flock et al. 2015). Collectively, the combination of DMS data and structural information is a fruitful avenue for generating protein structure-function hypotheses. These results highlight the power of DMS to identify regions of MC4R that could be harnessed for designing drugs that precisely modulate specific cellular signaling functions.

Systematic prediction of treatment response

Many variants of MC4R disrupt signaling by causing protein misfolding, which ultimately inhibits proper localization of MC4R to the cell membrane (Huang, Wang, and Tao 2017). Correctors are a class of small molecule drugs that facilitate protein folding. Corrector therapies have been developed for phenotypes such as cystic fibrosis (Boyle and De Boeck 2013) and Fabry disease (Germain et al. 2016), and they have been proposed as a strategy for treating MC4R-associated obesity (Huang, Wang, and Tao 2017; Wang et al. 2014; Huang and Tao 2014). One feature of corrector therapies is that they are typically only effective, and therefore FDA-approved, for a subset of patients harboring specific sequence variants (Boyle and De Boeck 2013; Weaver et al. 2022). Identifying variants that respond to corrector therapy is typically done by rigorously testing the effect of a compound on a single variant at a time (Weaver et al. 2022), and DMS offers an attractive avenue to systematically test the treatment response of thousands of patient variants in a single assay.

To this end, we tested whether Ipsen 17, a small molecule tool compound that has been shown to correct MC4R misfolding (Wang et al. 2014; Poitout et al. 2007), is able to restore the Gs signaling function of the MC4R variants in our DMS library. Out of all 6,633 tested variants, 290 (4.4%) showed disrupted Gs signaling in the absence of treatment that was partially or fully rescued by the addition of Ipsen 17 (Supplementary Fig. 7, see Methods for details of statistical analysis). This includes a number of variants that have been classified as pathogenic in ClinVar (Landrum et al. 2014) or otherwise found in patient sequencing studies (Farooqi et al. 2003; Yeo et al. 2003; Hinney et al. 2006; Stutzmann et al. 2008; Wade et al. 2021; Brouwers et al. 2021; Huang, Wang, and Tao 2017; Hinney, Volckmar, and Knoll 2013) (Fig. 4 shows results for selected variants reported in the human population). Other reported patient variants showed no functional improvement in response to corrector therapy (Fig. 4). Collectively, these data support that performing DMS in the presence of a small molecule corrector can be used to systematically predict which patients are likely to benefit from such treatment modalities.

Corrector therapy rescues the activity of a subset of human MC4R variants.
Gs signaling activity of 21 selected MC4R variant alleles (of 6,633 tested) with (red) and without (gray) Ipsen 17, a small molecule corrector that has been shown to restore the activity of misfolded MC4R. Bars represent the activity of the variant allele normalized to that of WT MC4R in the no corrector condition, and error bars are +/- two standard errors.

Mapping protein-ligand interactions

DMS experiments can be used to define “drug-resistant” variants within MC4R that disrupt the activity of different types of ligands, providing functional insight into protein-ligand interactions that are key for understanding the mechanisms underlying agonism. Such functional information would be a valuable addition to structural methods and has the potential to streamline the lengthy and iterative cycle of compound optimization in drug discovery. Substantial work has been done to characterize how peptide agonists interact structurally with MC4R, but similar work on small-molecule agonists with comparable activity and selectivity remains relatively limited (Gonçalves, Palmer, and Meldal 2018; Sharma et al. 2019; Yu et al. 2020; Heyder et al. 2021; Zhang et al. 2021). To characterize the functional interaction landscapes of different ligand types and to better understand what distinguishes peptide and small-molecule MC4R pharmacophores, we performed DMS assays of MC4R using both native peptide agonist stimulation (α-MSH) and small molecule agonist stimulation (THIQ). Bayesian meta-regression analysis of the lowest dose concentrations of α-MSH and THIQ for the Gs signaling assay revealed a set of variants that uniquely disrupt activation by one ligand but not the other (at FDR < 5%, Fig. 5A). These variants cluster exclusively within the orthosteric binding pocket (Fig. 5B-E) and at positions of known binding interactions of each molecule (Zhang et al. 2021). Notably, there are many more variants that uniquely disrupt activation by α-MSH (Fig. 5A-D). For example, a majority of amino acid substitutions at position I104^2×6 disrupt activation of MC4R by α-MSH, but none lead to significant reduction in MC4R activity upon stimulation by THIQ (Fig. 5C). Multiple positions within the orthosteric binding pocket displayed this pattern (Fig. 5B-E), which is consistent with how the peptide agonist utilizes a larger network of interactions to increase binding affinity.

Systematic identification of functional protein-ligand interactions.
A. Bayesian meta-regression of the α-MSH and THIQ datasets at low agonist concentration for Gs reveals variants that differentially affect MC4R activation by each agonist. Statistically significant effects are colored by which agonist condition they impaired (α-MSH, blue; THIQ, orange, FDR < 0.05). Variants at positions 48, 104, and 129 are labeled. B. Side-views of MC4R structure in surface (left) and ribbon (right) view show clustering of variants from A that specifically abrogate α-MSH but not THIQ activation (blue) in the extracellular orthosteric binding site. C. Effect of all possible variants at residues 48, 104, and 129 at low concentration of α-MSH (blue) and THIQ (orange). Variants that disproportionately affect activation by only one agonist are boxed by respective color. D. Top-down surface views of MC4R (PDB: 7f58) with bound α-MSH (left, blue; PDB: 7f53) or THIQ (right, orange; PDB: 7f58). Positions are colored by whether variants at that position uniquely perturb activation by α-MSH (blue) or THIQ (orange). E. Zoomed view of the MC4R binding pocket with α-MSH bound (left) or THIQ bound (right). Structures are colored as in (D). Residues that form the HFRW motif of α-MSH and functional groups R1, R2, and R3 of THIQ are labeled in bold. Residues 48, 104, and 129 are shown in stick form.

Interestingly, this comparison also highlights how the same residue of MC4R can be critical for interfacing with multiple ligands but points to substantive differences in the precise physical interaction between each ligand and that position of the target. For example, positions P48^1×36 and I129^3×32 harbor variants that can have differentially deleterious effects under the two ligand conditions (Fig. 5C-E). Many amino acid substitutions at P48^1×36 (including to V, I, F, Y, Q, and R) disrupt stimulation of MC4R by α-MSH but not THIQ. However, changing this position to the negatively charged P48^1×36D variant has the opposite effect, uniquely ablating activation by THIQ. The HFRW motif (His6-Phe7-Arg8-Trp9) of α-MSH represents a conserved pharmacophore critical for activation of MC4R by peptides, and the tri-branched THIQ molecule (R1-R2-R3) mimics the HFRW conformational architecture (Fig. 5E) (Hruby et al. 1987; Gonçalves, Palmer, and Meldal 2018; Zhang et al. 2021). Our observation that α-MSH stimulation is more sensitive to variants at position P48^1×36 is consistent with how the His6 of α-MSH forms more interactions with this hydrophobic pocket of MC4R, while the analogous R3 group of THIQ is more flexible and forms non-specific interactions in this region (Fig. 5C–E) (Hruby et al. 1987; Gonçalves, Palmer, and Meldal 2018; Zhang et al. 2021). At position I129^3×32, mutation to any polar uncharged variant (S, T, N, or Q) alters THIQ activation, while I129^3×32V uniquely inhibits α-MSH activation (Fig. 5C–E). The Phe7 of α-MSH and the analogous R2 group of THIQ form key interactions with Ca²⁺ and the core hydrophobic pocket formed by I129^3×32 of MC4R (Fig. 5E) (Zhang et al. 2021). The enrichment of variants at I129^3×32 that uniquely disrupt MC4R activation by THIQ points to a stronger dependency of the small molecule on interactions with this residue.
In summary, by assessing the functional consequences of all mutations within the MC4R orthosteric site, we not only confirm known binding interactions but also reveal interactions that distinguish peptide and small-molecule activation. These relationships provide additional functional insight into the structural mechanism of MC4R ligand binding that could be harnessed for drug design.

Discussion

Deep Mutational Scanning holds substantial potential for improving many phases of drug discovery and development, but realizing this potential requires the ability to draw highly quantitative and disease-relevant conclusions from DMS data, an outstanding challenge for the field. Here, we demonstrated the value of improved DMS assays and analysis methods in addressing these challenges by applying our approaches to a medically relevant GPCR, MC4R. First, we were able to quantify differences in Gs-mediated cAMP and Gq-mediated calcium signaling for MC4R in response to its native ligand, α-MSH. Understanding these subtle differences in molecular signaling phenotypes is crucial for driving precision treatments. For example, Gq signaling bias is one of the purported mechanisms for the better side-effect profile of the FDA-approved MC4R agonist setmelanotide (Sharma et al. 2019). Additionally, we systematically assessed the responses of MC4R variants implicated in human obesity to a small-molecule corrector, Ipsen 17. Whereas traditional approaches would have required the separate experimental assessment of every variant, DMS tests every conceivable single amino acid variant in one experiment. This enables researchers to understand how a potential treatment will work for broader swaths of the population before drugs reach the clinic, which could ultimately lead to more effective clinical trial designs.

Our data also contribute to the growing understanding of the complicated mechanism by which MC4R (and other GPCRs) translate ligand binding on the extracellular side of the membrane, through an allosteric network of residues, to G-protein activation on the intracellular side (Howard et al. 2024). We demonstrate the utility of DMS for understanding how proteins bind other molecules. In particular, we identified mutations that uniquely disrupt the binding of two agonists, one a peptide (α-MSH) and one a small molecule (THIQ). This is crucial information for understanding the mechanism by which these molecules bind MC4R, which could lead to hypotheses for further optimization of more potent compounds. DMS data contain a substantial amount of latent information about protein structure, and can help identify novel pockets to target with new compounds (Weng et al. 2024). As suggested above, we also envision using DMS data to help optimize molecules by identifying mutations that uniquely disrupt or potentiate sets of compounds in a chemical series.

Importantly, the experimental and analysis frameworks we describe are widely applicable to GPCRs and more broadly to other target classes. GPCRs are known to signal through a variety of pathways, and there are existing transcriptional reporters for most G-protein mediated signaling pathways that could be harnessed to gain a holistic understanding of the downstream consequences of GPCR activity (Hauser et al. 2022). Furthermore, there are a substantial number of existing transcriptional reporters for a wide variety of cellular signaling processes mediated by transcription factors, kinases, and other major targets of drug discovery efforts, which could extend these methods to other target classes.

Finally, there has been incredible progress over the last few years in the development of machine learning-based models to predict the structural and functional consequences of mutations on proteins. That said, these models have struggled to address many real-world applications in the treatment of human disease (McDonald et al. 2024). One potential explanation for this gap in performance is that many DMS datasets (our previous work included) used to train these models assay a limited set of experimental conditions, often assay cellular effects that are only indirectly related to disease etiology, suffer from low signal-to-noise ratios, have no quantification of uncertainty in their measurements, and are subject to a heterogeneous set of subtle experimental caveats. Detailed experimental characterization of protein function (such as the >21,500,000 different measurements in this work alone) and efforts such as ProteinGym (Notin et al. 2023) to benchmark large-scale functional data will continue to be critical for developing and evaluating large-scale models that can be applied to a wider variety of drug discovery and development applications. As both the experimental and computational methods continue to improve, we foresee DMS having a profound impact on the drug discovery process.

Materials and methods

Reporter assays and cell line development

The Gs (cAMP-luciferase) reporter assay, diagrammed in Supplementary Fig. 1A, was adapted from an assay previously used to assess the function of other GPCRs (Jones et al. 2020, 2019).

The Gq assay relay reporter system (diagrammed in Supplementary Fig. 1B) is described in detail in a patent stemming from this work (Chan, Cooper, and Gasperini 2024). Briefly, it was constructed as follows: A piggybac transposon plasmid was constructed using Gibson assembly harboring a genetic cassette that expresses a synthetic transcription factor, Gal4_DBD-VPR under the control of an NFAT response element. Human embryonic kidney cells (HEK293T) were then cotransfected with this plasmid along with a piggybac transposase expression vector, and cells were selected for puromycin resistance. These cells were then isocloned by dilution plating, and colonies were selected based on single cell clones. Five clones were then carried forward based on cell morphology and growth rates. A second series of Bxb1-based landing pad vectors were constructed containing a library of 20 Gq-coupled GPCRs under the control of the dox-inducible promoter, along with a second genetic cassette containing a DNA-barcoded luciferase gene under the control of the Gal4_UAS enhancer. This plasmid library was integrated into each of the five isoclonal Gq-relay cell lines. The best performing Gq-relay cell line was selected on signal-to-noise criteria across a panel of agonists for the 20 Gq-coupled GPCRs. Use of this relay system in combination specifically with an MC4R transgene resulted in α-MSH-dose dependent expression of the reporter gene (Supplementary Fig. 1C).

Building the DMS libraries

Generation and genomic integration of the plasmid libraries containing all possible single amino acid substitutions of MC4R used a further optimized version of an earlier method (Jones et al. 2020). In brief, variant segments of MC4R cDNA were amplified from DNA microarrays and cloned into base vectors through a multi-step process to yield pooled libraries of MC4R variants with fully intact and barcoded reporter gene cassettes. Fully assembled plasmid libraries were then co-transfected with a plasmid encoding Bxb1 recombinase into HEK293T cells containing a landing pad at the H11 safe harbor locus to achieve single copy integration per cell. In a deviation from the previously published method, two independent replicates of each sub-library were cloned and pooled together post-cellular integration in order to maximize library coverage and the number of barcodes per variant.

Variant-barcode mapping

As described above, barcodes were randomly appended to variants during amplification of MC4R segments from variant oligo pools and then ligated into library base vectors. After this first step of plasmid library cloning, variant segments were amplified along with the neighboring barcode and sequenced with 2×150 paired-end reads (see Supplementary Table 3 for amplification and sequencing primers) on an Illumina NextSeq 550 instrument using a 300-cycle Mid Output kit. Illumina 2×150 BCL files were demultiplexed with bcl2fastq2 into R1 and R2 FASTQ files, which were merged into single fragments using Flash2 requiring a minimum 5 bp of overlap (Quan et al. 2019). The first 21 bp of each fragment corresponding to the barcode sequence were extracted into the read name, and the remaining fragment was adapter-trimmed using umi_tools and cutadapt, respectively (Smith, Heger, and Sudbery 2017; Martin 2011). The remaining fragments were mapped against a custom reference composed of the designed oligonucleotide library using STAR with default parameters except requiring that alignments be strictly unique to be reported (Dobin et al. 2013). Taking each alignment as an oligo-barcode pair, the read counts per unique oligo-barcode pair were computed for each replicate and joined by barcode. Finally, the resulting maps were filtered to require each oligo-barcode pair to pass three requirements in both replicates: correct barcode length, total read depth > 10, and purity > 0.75. The purity of an oligo-barcode pair was defined as the read count of that pair divided by the total number of reads containing that barcode. Post-processing after STAR was performed using samtools for BAM manipulation and custom R code (H. Li et al. 2009).
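The final filtering step can be sketched as follows. This is a minimal illustration, not the authors' R code: the function names and the toy counts are hypothetical, and it assumes that "total read depth" refers to the read count of the oligo-barcode pair itself.

```python
from collections import defaultdict


def filter_pairs(pair_counts, barcode_len=21, min_depth=10, min_purity=0.75):
    """Return {(barcode, oligo): purity} for pairs passing all three filters.

    pair_counts maps (barcode, oligo) -> read count for one replicate.
    Assumption: the depth filter applies to the pair's own read count.
    """
    # Total reads per barcode, summed over every oligo it was seen with.
    barcode_totals = defaultdict(int)
    for (barcode, _oligo), n in pair_counts.items():
        barcode_totals[barcode] += n

    kept = {}
    for (barcode, oligo), n in pair_counts.items():
        # Purity: reads for this pair divided by all reads with this barcode.
        purity = n / barcode_totals[barcode]
        if len(barcode) == barcode_len and n > min_depth and purity > min_purity:
            kept[(barcode, oligo)] = purity
    return kept


def filter_both_replicates(rep1_counts, rep2_counts, **thresholds):
    """Keep only pairs that pass the filters in both replicates."""
    passed1 = filter_pairs(rep1_counts, **thresholds)
    passed2 = filter_pairs(rep2_counts, **thresholds)
    return sorted(set(passed1) & set(passed2))
```

A barcode that appears with two different oligos at similar depth fails the purity filter for both pairs, which is how ambiguous barcodes are excluded from the map.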

Running DMS assays

The DMS assay protocol was optimized from a previous method (Jones et al. 2020). HEK293T single-copy variant cell libraries were seeded at a density of ∼17×10⁶ cells per 150 mm tissue-culture treated dish in DMEM + 10% fetal bovine serum (FBS). Four dishes were seeded for each experimental condition, with each dish being treated as an independent biological replicate (4 replicates per condition). Twenty-four hours after seeding, media was exchanged with DMEM + 0.5% FBS +/- 10 ng/mL Doxycycline. For chaperone experiments, all conditions were additionally replicated +/- 1 μM Ipsen 17. Twenty-four hours after Doxycycline induction, media was exchanged with Opti-MEM + DMSO, Forskolin, or MC4R agonist (α-MSH or THIQ). Forskolin bypasses MC4R to constitutively activate cAMP signaling, so this condition was used as a variant-independent measurement of library composition. For chaperone experiments, cells were washed 3x with 10 mL DMEM to remove Ipsen 17 prior to agonist stimulation as it has been shown to be an antagonist of α-MSH activity and is thought to bind directly to the same site on MC4R (Wang et al. 2014). Six hours after agonist stimulation, cells were harvested by scraping in 4 mL lysis buffer (RLT buffer (Qiagen) + 143 mM β-ME). Given the seeding density (∼17×10⁶ cells per 150 mm replicate dish), time from seeding to collection, and doubling time of HEK293T cells, approximately 25.5×10⁶ cells were collected per replicate. This translates to approximately 30-60x cellular coverage per amino acid variant in each replicate. Lysis was performed by passing the cell slurry 6x through a sterile 18G needle and then spinning through QIAshredder (Qiagen) columns. RNA was extracted from 1 mL of the homogenized lysate with the RNeasy Plus Mini kit (Qiagen), including optional on-column DNase digestion, and eluted into 100 µL water. Eight reverse transcriptase reactions per sample were performed with the SuperScript IV kit (Thermo Fisher), as described previously (Jones et al. 2020) (primers listed in Supplementary Table 3). cDNA from each sample was treated with 1 µL RNase A (100 µg/mL, Thermo Fisher) and 3.2 µL RNase H (5,000 U/mL, NEB) at 37°C for 30 min. RNase-treated samples were concentrated to ∼55 µL by spinning through Amicon Ultra 10 kDa concentrators (EMD Millipore) for ∼8 min. To determine the necessary cycle numbers for equivalent amplification of each sample library, qPCR reactions were performed on 1 µL cDNA (diluted 1:8 in water) with Q5 polymerase (NEB), SYBR Green (Thermo Fisher), and library amplification primers (Supplementary Table 3). Final amplification cycles for each sample were chosen by adding three cycles to the Cq values generated from each respective qPCR reaction. Illumina sequencing libraries were prepared by amplifying 50 µL of each cDNA sample with sequencing adapters (500 nM each library amplification primer, Supplementary Table 3) using the NEBNext Q5 High Fidelity 2x PCR Kit (NEB) under the following cycling conditions: 98°C for 30 s, X cycles of 98°C for 8 s, 65°C for 20 s, and 72°C for 10 s, followed by an extension of 72°C for 2 min. 3 µL of each DNA library sample was run on a 4% E-Gel (Thermo Fisher) and densitometry was performed with Fiji to account for differences in library yields. Samples were mixed at equal amounts into a single pool and then purified into 200 µL IDTE (Qiagen) with AxyPrep Magnetic beads (Fisher Scientific). The purified library was quantified with the DeNovix High-Sensitivity Fluorescence kit and prepared for sequencing with a 10% PhiX spike-in. Final library mixture was sequenced using custom read and index primers (Supplementary Table 3) on an Illumina NextSeq 550 with the High Output 75 cycle kit.

Sequence processing for barcode expression

Illumina 1×26 BCL files were demultiplexed with bcl2fastq2 and processed to remove the last 5 bp using basic bash commands. The resulting sequences were counted for each sample, and the barcode counts were joined first with the appropriate oligo-barcode map and then with sample and MC4R variant metadata before being returned for regression analysis. All processing after demultiplexing was performed with custom R code. Total mapped reads per replicate at the RNA-seq stage were as follows:
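As a rough illustration of the trimming, counting, and joining described above, the sketch below uses Python in place of the bash and R code actually used. The function names and the in-memory dictionary standing in for the oligo-barcode map are hypothetical.

```python
from collections import Counter


def count_barcodes(reads, trim=5):
    """Drop the last `trim` bases of each read and tally the remaining barcodes."""
    return Counter(read[:len(read) - trim] for read in reads if len(read) > trim)


def join_to_variants(barcode_counts, barcode_to_variant):
    """Attach a variant label to each counted barcode.

    Barcodes absent from the oligo-barcode map are dropped, mirroring an
    inner join.
    """
    return {
        barcode: (barcode_to_variant[barcode], n)
        for barcode, n in barcode_counts.items()
        if barcode in barcode_to_variant
    }
```

With 26 bp reads and a 5 bp trim, the retained sequence is 21 bp, matching the barcode length used when building the variant-barcode map.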

Gs/CRE: 9.1-18.2 million mapped reads, median=12.3
Gq/UAS: 8.6-24.1 million mapped reads, median=14.5
Gs/CRE+Chaperone: 6.4-9.5 million mapped reads, median=7.5

The median read counts per sample per barcode were 8, 10, and 6 reads for Gs/CRE, Gq/UAS, and Gs/CRE+Chaperone assays, respectively. The median number of barcodes per variant across all samples (the “median of medians”) was 56 for Gs/CRE, 28 for Gq/UAS, and 44 for Gs/CRE+Chaperone. The correlation (r) of barcode read counts between replicates was ∼0.5 and ∼0.4 for the Gs and Gq assays, respectively (Supplementary Fig. 1E).

Negative binomial regression analysis pipeline

In many DMS methods, barcodes are summed within protein-coding variants to generate a single variant-level count per sample. This unnecessarily sacrifices available power obtained by repeatedly measuring the same molecular process in distinct cells. An alternative is to model barcode counts directly. However, to apply standard models, missing data must either be removed, which can remove a majority of detected barcodes, or somehow imputed. Additionally, many existing methods either use a log transformation of read counts to obtain approximate normality, or apply Poisson regression and related assumptions for inference (Rubin et al. 2017; Faure et al. 2020). However, there is strong prior reason to believe that DMS counts are overdispersed in our data, since our synthetic reporter system is read out via RNA-seq (Robinson and Smyth 2007).

Instead, we developed and applied a mixed-effects negative binomial generalized linear model (GLM) to resolve these challenges. These models have been widely deployed to model count data, and in particular to identify differential expression, in bulk and single-cell RNA-seq analysis.

We implement maximum likelihood estimation for this model using glmmTMB, which can accommodate the potentially large scale of multiplex count data (Brooks et al. 2017). For each position, we consider all variants located at that position along with all wild-type variants in the same protein subregion and apply the following model:

Here, i indexes the condition, j the variant, k the barcode, and m the sample. The first two terms in the last equation above correspond to a global mean term for each condition and a term for the variant-specific deviation from wild-type in each condition. The last two terms are the random effect for barcode k and the sample-specific technical offset for sample m. The definition of the offset is often context specific, and here we use the log of the sum of barcode counts derived from stops, reasoning they should be constant across conditions and replicates.
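One way to write a model consistent with this description is sketched below. The symbols ($Y$, $\mu$, $\phi$, $\beta$, $\gamma$, $b$, $\sigma_b$, $o$), the Gaussian form of the barcode random effect, and the choice of negative binomial parameterization are assumptions made for illustration and are not the authors' notation.

```latex
\begin{aligned}
Y_{ijkm} &\sim \mathrm{NegBin}\left(\mu_{ijkm},\, \phi\right) \\
b_k &\sim \mathcal{N}\left(0,\, \sigma_b^2\right) \\
\log \mu_{ijkm} &= \beta_i + \gamma_{ij} + b_k + o_m
\end{aligned}
```

In this sketch, $Y_{ijkm}$ is the read count, $\beta_i$ is the global mean for condition $i$, $\gamma_{ij}$ is the deviation of variant $j$ from wild-type in condition $i$ (zero for wild-type), $b_k$ is the random effect for barcode $k$, and $o_m$ is the offset for sample $m$, taken as the log of the summed stop-codon barcode counts in that sample.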

We fit this model for each MC4R position independently and extract coefficients for the additive shift in the mean of each variant relative to wild-type. Using the per-condition summary statistics, we obtained Wald test statistics by dividing the effect size by the standard error and computed p-values against the normal distribution. P-values were adjusted for multiple testing using the Benjamini-Hochberg method and thresholded to 1% or 5% FDR where indicated.
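The testing procedure in this paragraph can be illustrated with a short standard-library sketch. The function names are hypothetical; in practice the estimates and standard errors would come from the fitted model rather than being typed in.

```python
import math


def wald_p_value(estimate, std_error):
    """Two-sided p-value for z = estimate / std_error under a standard normal."""
    z = estimate / std_error
    # For a standard normal, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def significant(p_values, fdr=0.05):
    """Flag tests whose adjusted p-value falls below the FDR threshold."""
    return [q < fdr for q in benjamini_hochberg(p_values)]
```

A variant would then be called at 1% or 5% FDR by passing `fdr=0.01` or `fdr=0.05` to `significant`.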

To define more complex null hypotheses like chaperone rescue, we extracted marginal means for each variant under each treatment using the emmeans package (Lenth 2024). We define the chaperone rescue contrast as the additive shift of each variant in each treatment condition to the wild-type mean specifically in the untreated condition. Since this quantity is a linear contrast across marginal means, we computed the associated contrast estimates and standard errors, and tested them for significance using the same approach as the per-condition summary statistics.

Comparison to human genetics data and variant effect predictors

Pathogenicity classifications of MC4R missense and nonsense variants were obtained from ClinVar (Landrum et al. 2014) on January 5, 2024, and all available annotations were included in the analysis regardless of ClinVar review status metric. Human population frequency data for MC4R missense variants were obtained from gnomAD (S. Chen et al. 2024) on January 8, 2024. A comprehensive list of 220 MC4R variants that are of potential clinical relevance (Supplementary Table 2) was mined from ClinVar (Landrum et al. 2014), along with papers or review articles describing variants identified in human sequencing studies (Farooqi et al. 2003; Yeo et al. 2003; Hinney et al. 2006; Stutzmann et al. 2008; Wade et al. 2021; Brouwers et al. 2021; Huang, Wang, and Tao 2017; Hinney, Volckmar, and Knoll 2013; Rodríguez Rondón et al. 2024). Effect predictions for MC4R missense variants were obtained from the public releases of AlphaMissense (Cheng et al. 2023) and popEVE (Orenbuch et al. 2023).

Identification of functionally biased variants

We used Principal Component Analysis to identify biased variants. Specifically, we used the test statistic (log2 fold change divided by the standard error) for the DMSO and all α-MSH conditions in both the Gq and Gs pathways to create a matrix with conditions (defined as Drug_Concentration) as columns, and each individual mutant (defined as Position_Substitution) as rows. We then passed this matrix to R’s prcomp function with default parameters, and used the first two principal components to visualize the results. Visual inspection of the loadings via a biplot, along with plotting of various variant types (Supplementary Fig. 6A,B) revealed that PC1 separates variants based on their overall effect on MC4R function, and PC2 separates variants based on differential response through the Gq and Gs pathways. We set a simple cutoff of ±7.5 on PC2 to highlight particularly interesting variants.
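The projection and cutoff can be sketched in plain Python as below. This is an illustration only: power iteration with deflation stands in for the SVD that prcomp performs, columns are centered but not scaled (matching prcomp's defaults), and the function names are hypothetical. The sign of each component is arbitrary, which is why the cutoff is applied to the absolute value of PC2.

```python
import math


def pca_scores(rows, n_components=2, n_iter=500):
    """Project rows onto the leading principal components of the centered data."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[c] for r in rows) / n for c in range(p)]
    centered = [[r[c] - means[c] for c in range(p)] for r in rows]
    # Sample covariance matrix of the centered columns.
    cov = [[sum(x[a] * x[b] for x in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    axes = []
    for _ in range(n_components):
        v = [1.0 / (i + 1) for i in range(p)]  # fixed, asymmetric start vector
        for _ in range(n_iter):
            w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
            norm = math.sqrt(sum(t * t for t in w))
            if norm == 0.0:
                break
            v = [t / norm for t in w]
        eigenvalue = sum(v[a] * sum(cov[a][b] * v[b] for b in range(p))
                         for a in range(p))
        axes.append(v)
        # Deflate so the next pass converges to the next component.
        cov = [[cov[a][b] - eigenvalue * v[a] * v[b] for b in range(p)]
               for a in range(p)]
    return [[sum(x[c] * axis[c] for c in range(p)) for axis in axes]
            for x in centered]


def flag_biased(names, rows, cutoff=7.5):
    """Return names of variants whose PC2 score exceeds the cutoff in magnitude."""
    scores = pca_scores(rows)
    return [name for name, s in zip(names, scores) if abs(s[1]) > cutoff]
```

In a toy two-column matrix of Gs and Gq statistics, variants that move both pathways together load on PC1, while variants that move them in opposite directions load on PC2 and are the ones flagged.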

Identifying variants that respond to corrector treatment

We identified variants whose activity increased upon treatment with Ipsen 17 using a slight modification of the summary statistics from the general-purpose model described above. For each variant and condition, we compute the log2-scaled marginal mean (averaging across barcodes and replicates) and its associated standard error. Then, for each variant we compute the following two summary statistics and test whether their estimates are significantly different from zero. First, we define a variant’s defect as the marginal mean of that variant under DMSO treatment minus the marginal mean of WT under DMSO treatment. This quantifies the existence and severity of the variant’s intrinsic effect on MC4R activity. Second, we define that variant’s rescue as the marginal mean of that variant under Ipsen 17 treatment minus the marginal mean of that variant under DMSO treatment. This quantifies the magnitude of the (typical) increase in MC4R activity upon Ipsen 17 treatment for each variant, relative to its DMSO-only baseline. After computing the indicated estimates and errors, we perform significance testing as in the general model, where we define Wald statistics as the estimate divided by the propagated error, compute p-values from the normal distribution, and adjust for multiple testing using the Benjamini-Hochberg procedure.
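The two summary statistics can be sketched as simple contrasts. This illustration assumes the marginal means being compared are independent, so that their variances add; a fitted model would also supply the covariance between them. The function names and input layout are hypothetical.

```python
import math


def contrast(mean_a, se_a, mean_b, se_b):
    """Return (estimate, propagated SE, Wald z) for mean_a minus mean_b.

    Assumption: the two log2 marginal means are independent, so their
    variances add when forming the difference.
    """
    estimate = mean_a - mean_b
    std_error = math.sqrt(se_a ** 2 + se_b ** 2)
    return estimate, std_error, estimate / std_error


def defect_and_rescue(variant_dmso, variant_ipsen, wt_dmso):
    """Each argument is a (log2 marginal mean, standard error) pair."""
    # Defect: variant under DMSO relative to WT under DMSO.
    defect = contrast(*variant_dmso, *wt_dmso)
    # Rescue: variant under Ipsen 17 relative to the same variant under DMSO.
    rescue = contrast(*variant_ipsen, *variant_dmso)
    return defect, rescue
```

A variant counts as a rescued loss-of-function variant when its defect is significantly negative and its rescue is significantly positive after multiple-testing correction.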

Identifying critical variants for protein-ligand interactions

We identified sets of mutations that specifically inhibited or potentiated MC4R activation in the presence of ligands. We applied the general DMS model (see Negative binomial regression analysis pipeline) and extracted log2-scaled fold changes and standard errors for each variant relative to WT, within each ligand-treated condition. To account for systematic differences between ligands across all variants, we applied Bayesian meta-regression via the brms R package (Bürkner 2017) and regressed the α-MSH summary statistics against THIQ, while including the errors in both quantities via the se() and me() brms functions. Finally, to infer significant α-MSH- or THIQ-specific effects, we extracted the residual of each variant relative to the meta-regression best-fit line and tested whether this residual was significantly non-zero based on the posterior sampling performed with brms.

Structural modeling

Molecular visualization of variant effects on MC4R was performed with UCSF ChimeraX (Pettersen et al. 2021). For visualization of functionally biased extremes in MC4R (Fig. 3B), the maximum absolute PC2 value for each position was calculated and the define attribute function of ChimeraX was used to color the structure of α-MSH-bound MC4R (PDB: 7F53) by these relative values, ranging from white (no bias) to pink (extreme bias). For all other structural panels related to variant bias, positions of interest are colored binarily by green or purple to indicate Gs-bias or Gq-bias, respectively. For visualizing variant effects on protein-ligand interactions (Fig. 5), positions with significant variants identified by meta-regression were colored by whether variants at a given position perturb activation uniquely by α-MSH (blue) or THIQ (orange). Where α-MSH and THIQ structures are shown, the respective α-MSH-bound (PDB: 7F53) and THIQ-bound (PDB: 7F58) cryo-EM models were used for visualization. For the depictions in Fig. 5B, α-MSH-bound MC4R was used.

Sequence data and software availability

Raw sequencing data are available from SRA under project accession number PRJNA1161152. All code used for analysis and figure generation is available at https://github.com/octantbio/mc4r-dms.

Supplemental information

Supplemental Figures

Methods development detail.
A,B. Detailed cartoons of reporters for cAMP (Gs) signaling (A) and calcium (Gq) signaling (B). C. Luciferase assays used to determine agonist (α-MSH and THIQ) dose ranges for the DMS experiments. For each reporter (Gs, Gq), we fit a 3-parameter log-logistic with the bottom of the curve fixed to the DMSO mean. We shared the slope and Emax parameter between α-MSH and THIQ, and let the EC50 parameter vary between the drugs. D. Basal reporter gene expression for the Gs (CRE) reporter, the Gq (NFAT) reporter, and the Gq relay (NFAT) reporter. E. Correlation of sequence read counts of each barcode for two experimental replicates each of the Gs (left) and Gq (right) DMS assays. Panels A and B made with BioRender.com/y06g086.

Analysis method and barcoding detail.
A. Schematic for analysis method. DMS data can be represented as a tensor, with dimensions for Position in the protein, Amino Acid substitution, and Condition + Replicate. Within the Condition + Replicate dimension, there are multiple drugs (i.e., α-MSH and THIQ) and multiple signaling pathways (i.e., Gs and Gq). Furthermore, each amino acid substitution has multiple barcodes associated with it. Our negative binomial mixed-effect model takes all of this structure into account to produce summary statistics that are the basis of our comparisons. B. Distribution of Z-statistics for stop codons versus all other variant effects from published β2AR (Jones et al. 2020) and MC4R (this study) DMS assays assessing Gs or Gq signaling activity. C. Downsampling simulation from the MC4R data showing the effect of number of barcodes per variant on the estimation of log2 fold change relative to wild type. Here we show two representative positions in the α-MSH low condition, randomly downsampled to a fixed number of barcodes associated with each variant. We repeated that process for 5 independent samplings and ran the resulting data through our model to obtain parameter estimates. Increasing the number of barcodes per variant decreases the magnitude of the error bars (+/- 2 standard errors).

Results (z-scores) of MC4R Deep Mutational Scans across all agonist stimulation conditions.
Shown are z-scores indicating the effect of each variant (y-axis) at each position (x-axis) relative to wild-type. GoF: gain-of-function; LoF: loss-of-function; WT: wild-type.

Results (log₂[fold change]) of MC4R Deep Mutational Scans across all agonist stimulation conditions.
Shown are the log₂(fold change) values of each variant (y-axis) at each position (x-axis) relative to wild-type. GoF: gain-of-function; LoF: loss-of-function; WT: wild-type.

MC4R Deep Mutational Scans are consistent with variant effect predictions.
**A,B.** Shown are z-scores for each MC4R variant (x-axis) from the Gs-α-MSH-low (forskolin normalized) DMS condition relative to the AlphaMissense (A) or popEve (B) score of the variant. GoF: gain-of-function; LoF: loss-of-function; WT: wild-type.

Variants with differential effects on Gs and Gq signaling.
A. PCA of z-scores from all Gs and Gq assays using no, low, medium, and high α-MSH. Each point represents a variant, with stop codon variants colored in blue. Their distribution supports that PC1 is driven by variants that are LoF in both Gs and Gq assays versus those that retain activity in one or both assays. B. Same PCA as A but with arrows showing how each of the incorporated datasets contributes to the components. This supports that PC2 is driven by differences in Gs and Gq signaling. **C,D.** Additional examples of variants that have differential effects on Gs and Gq signaling, as in Fig. 3C,D. **Left:** Closeup of MC4R structure showing positions of interest in purple. **Right/Bottom:** MC4R signaling activity (log2[fold change of variant activity relative to wild-type]) for selected variants across α-MSH doses. Error bars are +/- 2 standard errors. Med: medium; WT: wild-type.

Identification of variants that respond to corrector treatment.
Scatter plot showing activity of each MC4R variant in the Gs signaling assay upon stimulation with 1 μM α-MSH (>EC99) in the absence of corrector (x-axis) versus upon treatment with 1 μM Ipsen 17 followed by stimulation with 1 μM α-MSH (>EC99) (y-axis). Each point represents a variant, with those in red classified as both 1) loss-of-function in the absence of corrector; and 2) functionally rescued in the Ipsen 17 treatment condition (see Methods for details).
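The two-part classification described in this caption can be expressed as a small Python sketch. The actual criteria are given in the Methods and are not reproduced here; the z-score cutoffs, the `VariantScores` type, and the example variants below are placeholders invented purely for illustration.

```python
from dataclasses import dataclass

# Placeholder cutoffs chosen only for illustration; not taken from the paper.
LOF_Z_CUTOFF = -2.0
RESCUE_Z_CUTOFF = -2.0


@dataclass(frozen=True)
class VariantScores:
    name: str
    z_untreated: float   # z-score vs. wild type, agonist only
    z_corrector: float   # z-score vs. wild type, corrector then agonist


def is_rescued(v, lof_cutoff=LOF_Z_CUTOFF, rescue_cutoff=RESCUE_Z_CUTOFF):
    """True when the variant is LoF untreated and no longer LoF with corrector."""
    loss_of_function = v.z_untreated < lof_cutoff
    restored = v.z_corrector >= rescue_cutoff
    return loss_of_function and restored


# Hypothetical variants with invented scores.
variants = [
    VariantScores("variant_1", z_untreated=-6.0, z_corrector=-0.5),  # LoF, rescued
    VariantScores("variant_2", z_untreated=-6.0, z_corrector=-5.5),  # LoF, not rescued
    VariantScores("variant_3", z_untreated=0.3, z_corrector=0.4),    # never LoF
]
rescued = [v.name for v in variants if is_rescued(v)]
```

Requiring both conditions keeps variants that were never loss-of-function out of the rescued set, even if their activity happens to rise under corrector treatment.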

Variants that differentially impact activation by peptide versus small molecule agonists.
A. PCA of z-scores from all Gs signaling assays with no, low, medium, and high concentrations of α-MSH or THIQ. Each point represents a variant colored by whether it uniquely abrogates signaling in the α-MSH (blue) or THIQ (orange) condition(s). B. Top-down ribbon view of MC4R with bound α-MSH (left, blue; PDB: 7f53) or THIQ (right, orange; PDB: 7f58). MC4R residues are colored by whether variants at a given position disrupt activation uniquely by α-MSH (blue), THIQ (orange), both (green), or neither (gray). Residues 48, 104, and 129 are shown in stick form.

Acknowledgements

We would like to thank Jakob Sture Madsen for his intellectual contributions to the project and his helpful comments on the paper.

Additional files

Supplementary tables

References

Ahlmann-Eltze Constantin, Huber Wolfgang. 2021. glmGamPoi: Fitting Gamma-Poisson Generalized Linear Models on Single Cell Count Data. Bioinformatics 36:5701–2.
Araya Carlos L., Fowler Douglas M. 2011. Deep Mutational Scanning: Assessing Protein Function on a Massive Scale. Trends in Biotechnology 29:435–42.
Boss V., Talpade D. J., Murphy T. J. 1996. Induction of NFAT-Mediated Transcription by Gq-Coupled Receptors in Lymphoid and Non-Lymphoid Cells. The Journal of Biological Chemistry 271:10429–32.
Boyle Michael P., De Boeck Kris. 2013. A New Era in the Treatment of Cystic Fibrosis: Correction of the Underlying CFTR Defect. The Lancet. Respiratory Medicine 1:158–63.
Braberg Hannes, Echeverria Ignacia, Kaake Robyn M., Sali Andrej, Krogan Nevan J. 2022. From Systems to Structure - Using Genetic Data to Model Protein Structures. Nature Reviews. Genetics 23:342–54.
Brandes Nadav, Goldman Grant, Wang Charlotte H., Ye Chun Jimmie, Ntranos Vasilis. 2023. Genome-Wide Prediction of Disease Variant Effects with a Deep Protein Language Model. Nature Genetics 55:1512–22.
Breit Andreas, Büch Thomas R. H., Boekhoff Ingrid, Solinski Hans Jürgen, Damm Ellen, Gudermann Thomas. 2011. Alternative G Protein Coupling and Biased Agonism: New Insights into Melanocortin-4 Receptor Signalling. Molecular and Cellular Endocrinology 331:232–40.
Brooks Mollie E., Kristensen Kasper, van Benthem Koen J., Magnusson Arni, Berg Casper W., Nielsen Anders, Skaug Hans J., Mächler Martin, Bolker Benjamin M. 2017. glmmTMB Balances Speed and Flexibility Among Packages for Zero-Inflated Generalized Linear Mixed Modeling. The R Journal 9:378–400.
Brouwers Bas, de Oliveira Edson Mendes, Marti-Solano Maria, Monteiro Fabiola B. F., Laurin Suli-Anne, Keogh Julia M., Henning Elana, et al. 2021. Human MC4R Variants Affect Endocytosis, Trafficking and Dimerization Revealing Multiple Cellular Mechanisms Involved in Weight Regulation. Cell Reports 34.
Bürkner Paul-Christian. 2017. Brms: An R Package for Bayesian Multilevel Models Using Stan. Journal of Statistical Software 80:1–28.
Chan Henry, Cooper Aaron, Gasperini Molly Jeanette. 2024. Systems and Methods for Measuring Cell Signaling Protein Activity. US Patent. https://patents.google.com/patent/US20240230643A9/en
Chavez Alejandro, Scheiman Jonathan, Vora Suhani, Pruitt Benjamin W., Tuttle Marcelle, Iyer Eswar P R, Lin Shuailiang, et al. 2015. Highly Efficient Cas9-Mediated Transcriptional Programming. Nature Methods 12:326–28.
Cheng Jun, Novati Guido, Pan Joshua, Bycroft Clare, Žemgulytė Akvilė, Applebaum Taylor, Pritzel Alexander, et al. 2023. Accurate Proteome-Wide Missense Variant Effect Prediction with AlphaMissense. Science 381:eadg7492.
Chen Kong Y., Muniyappa Ranganath, Abel Brent S., Mullins Katherine P., Staker Pamela, Brychta Robert J., Zhao Xiongce, et al. 2015. RM-493, a Melanocortin-4 Receptor (MC4R) Agonist, Increases Resting Energy Expenditure in Obese Individuals. The Journal of Clinical Endocrinology and Metabolism 100:1639–45.
Chen Siwei, Francioli Laurent C., Goodrich Julia K., Collins Ryan L., Kanai Masahiro, Wang Qingbo, Alföldi Jessica, et al. 2024. A Genomic Mutational Constraint Map Using Variation in 76,156 Human Genomes. Nature 625:92–100.
Clément Karine, van den Akker Erica, Argente Jesús, Bahm Allison, Chung Wendy K., Connors Hillori, De Waele Kathleen, et al. 2020. Efficacy and Safety of Setmelanotide, an MC4R Agonist, in Individuals with Severe Obesity due to LEPR or POMC Deficiency: Single-Arm, Open-Label, Multicentre, Phase 3 Trials. The Lancet. Diabetes & Endocrinology 8:960–70.
Clément Karine, Biebermann Heike, Farooqi I. Sadaf, Van der Ploeg Lex, Wolters Barbara, Poitou Christine, Puder Lia, et al. 2018. MC4R Agonism Promotes Durable Weight Loss in Patients with Leptin Receptor Deficiency. Nature Medicine 24:551–55.
Collet Tinh-Hai, Dubern Béatrice, Mokrosinski Jacek, Connors Hillori, Keogh Julia M., de Oliveira Edson Mendes, Henning Elana, et al. 2017. Evaluation of a Melanocortin-4 Receptor (MC4R) Agonist (Setmelanotide) in MC4R Deficiency. Molecular Metabolism 6:1321–29.
Dobin Alexander, Davis Carrie A., Schlesinger Felix, Drenkow Jorg, Zaleski Chris, Jha Sonali, Batut Philippe, Chaisson Mark, Gingeras Thomas R. 2013. STAR: Ultrafast Universal RNA-Seq Aligner. Bioinformatics 29:15–21.
Farooqi I. Sadaf, Keogh Julia M., Yeo Giles S. H., Lank Emma J., Cheetham Tim, O’Rahilly Stephen. 2003. Clinical Spectrum of Obesity and Mutations in the Melanocortin 4 Receptor Gene. The New England Journal of Medicine 348:1085–95.
Fatima Munazza Tamkeen, Ahmed Ikhlak, Fakhro Khalid Adnan, Al-Shabeeb Akil Ammira Sarah. 2022. Melanocortin-4 Receptor Complexity in Energy Homeostasis, Obesity and Drug Development Strategies. Diabetes, Obesity & Metabolism 24:583–98.
Faure Andre J., Domingo Júlia, Schmiedel Jörn M., Hidalgo-Carcedo Cristina, Diss Guillaume, Lehner Ben. 2022. Mapping the Energetic and Allosteric Landscapes of Protein Binding Domains. Nature 604:175–83.
Faure Andre J., Schmiedel Jörn M., Baeza-Centurion Pablo, Lehner Ben. 2020. DiMSum: An Error Model and Pipeline for Analyzing Deep Mutational Scanning Data and Diagnosing Common Experimental Pathologies. Genome Biology 21:207.
Findlay Gregory M., Boyle Evan A., Hause Ronald J., Klein Jason C., Shendure Jay. 2014. Saturation Editing of Genomic Regions by Multiplex Homology-Directed Repair. Nature 513:120–23.
Flock Tilman, Ravarani Charles N. J., Sun Dawei, Venkatakrishnan A. J., Kayikci Melis, Tate Christopher G., Veprintsev Dmitry B., Madan Babu M. 2015. Universal Allosteric Mechanism for Gα Activation by GPCRs. Nature 524:173–79.
Fowler Douglas M., Fields Stanley. 2014. Deep Mutational Scanning: A New Style of Protein Science. Nature Methods 11:801–7.
Germain Dominique P., Hughes Derralynn A., Nicholls Kathleen, Bichet Daniel G., Giugliani Roberto, Wilcox William R., Feliciani Claudio, et al. 2016. Treatment of Fabry’s Disease with the Pharmacologic Chaperone Migalastat. The New England Journal of Medicine 375:545–55.
Gonçalves Juliana Pereira Lopes, Palmer Daniel, Meldal Morten. 2018. MC4R Agonists: Structural Overview on Antiobesity Therapeutics. Trends in Pharmacological Sciences 39:402–23.
Greenfield Jerry R., Miller Jeffrey W., Keogh Julia M., Henning Elana, Satterwhite Julie H., Cameron Gregory S., Astruc Beatrice, et al. 2009. Modulation of Blood Pressure by Central Melanocortinergic Pathways. The New England Journal of Medicine 360:44–52.
Hauser Alexander S., Avet Charlotte, Normand Claire, Mancini Arturo, Inoue Asuka, Bouvier Michel, Gloriam David E. 2022. Common Coupling Map Advances GPCR-G Protein Selectivity. eLife 11 (March). https://doi.org/10.7554/eLife.74107
Hauser Alexander S., Kooistra Albert J., Munk Christian, Heydenreich Franziska M., Veprintsev Dmitry B., Bouvier Michel, Madan Babu M., Gloriam David E. 2021. GPCR Activation Mechanisms across Classes and Macro/microscales. Nature Structural & Molecular Biology 28:879–88.
Heyder Nicolas A., Kleinau Gunnar, Speck David, Schmidt Andrea, Paisdzior Sarah, Szczepek Michal, Bauer Brian, et al. 2021. Structures of Active Melanocortin-4 Receptor-Gs-Protein Complexes with NDP-α-MSH and Setmelanotide. Cell Research 31:1176–89.
Hinney Anke, Bettecken Thomas, Tarnow Patrick, Brumm Harald, Reichwald Kathrin, Lichtner Peter, Scherag André, et al. 2006. Prevalence, Spectrum, and Functional Characterization of Melanocortin-4 Receptor Gene Mutations in a Representative Population-Based Sample and Obese Adults from Germany. The Journal of Clinical Endocrinology and Metabolism 91:1761–69.
Hinney Anke, Hohmann Sarah, Geller Frank, Vogel Constanze, Hess Claudia, Wermter Anne-Kathrin, Brokamp Britta, et al. 2003. Melanocortin-4 Receptor Gene: Case-Control Study and Transmission Disequilibrium Test Confirm That Functionally Relevant Mutations Are Compatible with a Major Gene Effect for Extreme Obesity. The Journal of Clinical Endocrinology and Metabolism 88:4258–67.
Hinney Anke, Körner Antje, Fischer-Posovszky Pamela. 2022. The Promise of New Anti-Obesity Therapies Arising from Knowledge of Genetic Obesity Traits. Nature Reviews. Endocrinology, 1–15.
Hinney Anke, Volckmar Anna-Lena, Knoll Nadja. 2013. Melanocortin-4 Receptor in Energy Homeostasis and Obesity Pathogenesis. Progress in Molecular Biology and Translational Science 114:147–91.
Howard Matthew K., Hoppe Nicholas, Huang Xi-Ping, Macdonald Christian B., Mehrota Eshan, Grimes Patrick Rockefeller, Zahm Adam, et al. 2024. Molecular Basis of Proton-Sensing by G Protein-Coupled Receptors. bioRxiv. https://doi.org/10.1101/2024.04.17.590000
Hruby V. J., Wilkes B. C., Hadley M. E., Al-Obeidi F., Sawyer T. K., Staples D. J., de Vaux A. E., Dym O., Castrucci A. M., Hintz M. F. 1987. Alpha-Melanotropin: The Minimal Active Sequence in the Frog Skin Bioassay. Journal of Medicinal Chemistry 30:2126–30.
Huang Hui, Tao Ya-Xiong. 2014. A Small Molecule Agonist THIQ as a Novel Pharmacoperone for Intracellularly Retained Melanocortin-4 Receptor Mutants. International Journal of Biological Sciences 10:817–24.
Huang Hui, Wang Wei, Tao Ya-Xiong. 2017. Pharmacological Chaperones for the Misfolded Melanocortin-4 Receptor Associated with Human Obesity. Biochimica et Biophysica Acta, Molecular Basis of Disease 1863:2496–2507.
Hughes J. P., Rees S., Kalindjian S. B., Philpott K. L. 2011. Principles of Early Drug Discovery. British Journal of Pharmacology 162:1239–49.
Isberg Vignir, de Graaf Chris, Bortolato Andrea, Cherezov Vadim, Katritch Vsevolod, Marshall Fiona H., Mordalski Stefan, et al. 2015. Generic GPCR Residue Numbers - Aligning Topology Maps While Minding the Gaps. Trends in Pharmacological Sciences 36:22–31.
Jones Eric M., Jajoo Rishi, Cancilla Daniel, Lubock Nathan B., Wang Jeffrey, Satyadi Megan, Chong Rockie, et al. 2019. A Scalable, Multiplexed Assay for Decoding GPCR-Ligand Interactions with RNA Sequencing. Cell Systems 8:254–60.
Jones Eric M., Lubock Nathan B., Venkatakrishnan A. J., Wang Jeffrey, Tseng Alex M., Paggi Joseph M., Latorraca Naomi R., et al. 2020. Structural and Functional Characterization of G Protein-Coupled Receptors with Deep Mutational Scanning. eLife 9 (October). https://doi.org/10.7554/eLife.54895
Ju Sang Hyeon, Cho Gyu-Bon, Sohn Jong-Woo. 2018. Understanding Melanocortin-4 Receptor Control of Neuronal Circuits: Toward Novel Therapeutics for Obesity Syndrome. Pharmacological Research: The Official Journal of the Italian Pharmacological Society 129:10–19.
Kievit Paul, Halem Heather, Marks Daniel L., Dong Jesse Z., Glavas Maria M., Sinnayah Puspha, Pranger Lindsay, Cowley Michael A., Grove Kevin L., Culler Michael D. 2013. Chronic Treatment with a Melanocortin-4 Receptor Agonist Causes Weight Loss, Reduces Insulin Resistance, and Improves Cardiovascular Function in Diet-Induced Obese Rhesus Macaques. Diabetes 62:490–97.
Lafita Aleix, Gonzalez Ferran, Hossam Mahmoud, Smyth Paul, Deasy Jacob, Allyn-Feuer Ari, Seaton Daniel, Young Stephen. 2024. Fine-Tuning Protein Language Models with Deep Mutational Scanning Improves Variant Effect Prediction. arXiv [q-bio.GN]. http://arxiv.org/abs/2405.06729
Landrum Melissa J., Lee Jennifer M., Riley George R., Jang Wonhee, Rubinstein Wendy S., Church Deanna M., Maglott Donna R. 2014. ClinVar: Public Archive of Relationships among Sequence Variation and Human Phenotype. Nucleic Acids Research 42:D980–85.
Lenth Russell V. 2024. Estimated Marginal Means, Aka Least-Squares Means [R Package Emmeans Version 1.10.1]. https://CRAN.R-project.org/package=emmeans
Li Heng, Handsaker Bob, Wysoker Alec, Fennell Tim, Ruan Jue, Homer Nils, Marth Gabor, Abecasis Goncalo, Durbin Richard, 1000 Genome Project Data Processing Subgroup. 2009. The Sequence Alignment/Map Format and SAMtools. Bioinformatics 25:2078–79.
Li Yong-Qi, Shrestha Yogendra, Pandey Mritunjay, Chen Min, Kablan Ahmed, Gavrilova Oksana, Offermanns Stefan, Weinstein Lee S. 2016. G(q/11)α and G(s)α Mediate Distinct Physiological Responses to Central Melanocortins. The Journal of Clinical Investigation 126:40–49.
Lotta Luca A., Mokrosiński Jacek, de Oliveira Edson Mendes, Li Chen, Sharp Stephen J., Luan Jian’an, Brouwers Bas, et al. 2019. Human Gain-of-Function MC4R Variants Show Signaling Bias and Protect against Obesity. Cell 177:597–607.
Love Michael I., Huber Wolfgang, Anders Simon. 2014. Moderated Estimation of Fold Change and Dispersion for RNA-Seq Data with DESeq2. Genome Biology 15:550.
Martin Marcel. 2011. Cutadapt Removes Adapter Sequences from High-Throughput Sequencing Reads. EMBnet.journal 17:10–12.
Matreyek Kenneth A., Starita Lea M., Stephany Jason J., Martin Beth, Chiasson Melissa A., Gray Vanessa E., Kircher Martin, et al. 2018. Multiplex Assessment of Protein Variant Abundance by Massively Parallel Sequencing. Nature Genetics 50:874–82.
McCarthy Davis J., Chen Yunshun, Smyth Gordon K. 2012. Differential Expression Analysis of Multifactor RNA-Seq Experiments with Respect to Biological Variation. Nucleic Acids Research 40:4288–97.
McDonald Eli Fritz, Oliver Kathryn E., Schlebach Jonathan P., Meiler Jens, Plate Lars. 2024. Benchmarking AlphaMissense Pathogenicity Predictions against Cystic Fibrosis Variants. PloS One 19:e0297560.
Metzger Peter J., Zhang Aileen, Carlson Bradley A., Sun Hui, Cui Zhenzhong, Li Yongqi, Jahnke Marshal T., et al. 2024. A Human Obesity-Associated MC4R Mutation with Defective Gq/11α Signaling Leads to Hyperphagia in Mice. The Journal of Clinical Investigation 134 (4). https://doi.org/10.1172/JCI165418
Notin Pascal, Kollasch Aaron W., Ritter Daniel, van Niekerk Lood, Paul Steffanie, Spinner Hansen, Rollins Nathan, et al. 2023. ProteinGym: Large-Scale Benchmarks for Protein Design and Fitness Prediction. bioRxiv. https://doi.org/10.1101/2023.12.07.570727
Orenbuch Rose, Kollasch Aaron W., Spinner Hansen D., Shearer Courtney A., Hopf Thomas A., Franceschi Dinko, Dias Mafalda, Frazer Jonathan, Marks Debora S. 2023. Deep Generative Modeling of the Human Proteome Reveals over a Hundred Novel Genes Involved in Rare Genetic Disorders. medRxiv. https://doi.org/10.1101/2023.11.27.23299062
Paisdzior Sarah, Dimitriou Ioanna Maria, Schöpe Paul Curtis, Annibale Paolo, Scheerer Patrick, Krude Heiko, Lohse Martin J., Biebermann Heike, Kühnen Peter. 2020. Differential Signaling Profiles of MC4R Mutations with Three Different Ligands. International Journal of Molecular Sciences 21 (4). https://doi.org/10.3390/ijms21041224
Pettersen Eric F., Goddard Thomas D., Huang Conrad C., Meng Elaine C., Couch Gregory S., Croll Tristan I., Morris John H., Ferrin Thomas E. 2021. UCSF ChimeraX: Structure Visualization for Researchers, Educators, and Developers. Protein Science: A Publication of the Protein Society 30:70–82.
Plenge Robert M., Scolnick Edward M., Altshuler David. 2013. Validating Therapeutic Targets through Human Genetics. Nature Reviews. Drug Discovery 12:581–94.
Podyma Brandon, Sun Hui, Wilson Eric A., Carlson Bradley, Pritikin Ethan, Gavrilova Oksana, Weinstein Lee S., Chen Min. 2018. The Stimulatory G Protein Gsα Is Required in Melanocortin 4 Receptor-Expressing Cells for Normal Energy Balance, Thermogenesis, and Glucose Metabolism. The Journal of Biological Chemistry 293:10993–5.
Poitout Lydie, Brault Valérie, Sackur Carole, Bernetière Sonia, Camara José, Plas Pascale, Roubert Pierre. 2007. Identification of a Novel Series of Benzimidazoles as Potent and Selective Antagonists of the Human Melanocortin-4 Receptor. Bioorganic & Medicinal Chemistry Letters 17:4464–70.
Quan Jenai, Langelier Charles, Kuchta Alison, Batson Joshua, Teyssier Noam, Lyden Amy, Caldera Saharai, et al. 2019. FLASH: A next-Generation CRISPR Diagnostic for Multiplexed Detection of Antimicrobial Resistance Sequences. Nucleic Acids Research 47:e83.
Robinson Mark D., Oshlack Alicia. 2010. A Scaling Normalization Method for Differential Expression Analysis of RNA-Seq Data. Genome Biology 11:R25.
Robinson Mark D., Smyth Gordon K. 2007. Moderated Statistical Tests for Assessing Differences in Tag Abundance. Bioinformatics 23:2881–87.
Rondón Rodríguez Alejandra V., Welling Mila S., van den Akker Erica L. T., van Rossum Elisabeth F. C., Boon Elles M. J., van Haelst Mieke M., Delhanty Patric J. D., Visser Jenny A. 2024. MC4R Variants Modulate α-MSH and Setmelanotide Induced Cellular Signaling at Multiple Levels. The Journal of Clinical Endocrinology and Metabolism. https://doi.org/10.1210/clinem/dgae210
Rubin Alan F., Gelman Hannah, Lucas Nathan, Bajjalieh Sandra M., Papenfuss Anthony T., Speed Terence P., Fowler Douglas M. 2017. A Statistical Framework for Analyzing Deep Mutational Scanning Data. Genome Biology 18:150.
Sebhat Iyassu K., Martin William J., Ye Zhixiong, Barakat Khaled, Mosley Ralph T., Johnston David B. R., Bakshi Raman, et al. 2002. Design and Pharmacology of N-[(3R)-1,2,3,4-Tetrahydroisoquinolinium-3-Ylcarbonyl]-(1R)-1-(4-Chlorobenzyl)-2-[4-Cyclohexyl-4-(1H-1,2,4-Triazol-1-Ylmethyl)piperidin-1-Yl]-2-Oxoethylamine (1), a Potent, Selective, Melanocortin Subtype-4 Receptor Agonist. Journal of Medicinal Chemistry 45:4589–93.
Sharma Shubh, Garfield Alastair S., Shah Bhavik, Kleyn Patrick, Ichetovkin Ilia, Moeller Ida Hatoum, Mowrey William R., Van der Ploeg Lex H. T. 2019. Current Mechanistic and Pharmacodynamic Understanding of Melanocortin-4 Receptor Activation. Molecules 24 (10). https://doi.org/10.3390/molecules24101892
Smith Tom, Heger Andreas, Sudbery Ian. 2017. UMI-Tools: Modeling Sequencing Errors in Unique Molecular Identifiers to Improve Quantification Accuracy. Genome Research 27:491–99.
Sriram Krishna, Insel Paul A. 2018. G Protein-Coupled Receptors as Targets for Approved Drugs: How Many Targets and How Many Drugs? Molecular Pharmacology 93:251–58.
Starita Lea M., Ahituv Nadav, Dunham Maitreya J., Kitzman Jacob O., Roth Frederick P., Seelig Georg, Shendure Jay, Fowler Douglas M. 2017. Variant Interpretation: Functional Assays to the Rescue. American Journal of Human Genetics 101:315–25.
Stutzmann Fanny, Tan Karen, Vatin Vincent, Dina Christian, Jouret Béatrice, Tichet Jean, Balkau Beverley, et al. 2008. Prevalence of Melanocortin-4 Receptor Deficiency in Europeans and Their Age-Dependent Penetrance in Multigenerational Pedigrees. Diabetes 57:2511–18.
Sweeney Patrick, Gimenez Luis E., Hernandez Ciria C., Cone Roger D. 2023. Targeting the Central Melanocortin System for the Treatment of Metabolic Disorders. Nature Reviews. Endocrinology 19:507–19.
Tao Ya-Xiong. 2010. The Melanocortin-4 Receptor: Physiology, Pharmacology, and Pathophysiology. Endocrine Reviews 31:506–43.
Vaisse C., Clement K., Durand E., Hercberg S., Guy-Grand B., Froguel P. 2000. Melanocortin-4 Receptor Mutations Are a Frequent and Heterogeneous Cause of Morbid Obesity. The Journal of Clinical Investigation 106:253–62.
Wade Kaitlin H., Lam Brian Y. H., Melvin Audrey, Pan Warren, Corbin Laura J., Hughes David A., Rainbow Kara, et al. 2021. Loss-of-Function Mutations in the Melanocortin 4 Receptor in a UK Birth Cohort. Nature Medicine 27:1088–96.
Wang Xiao-Hua, Wang Hao-Meng, Zhao Bao-Lei, Yu Peng, Fan Zhen-Chuan. 2014. Rescue of Defective MC4R Cell-Surface Expression and Signaling by a Novel Pharmacoperone Ipsen 17. Journal of Molecular Endocrinology 53:17–29.
Weaver James L., Wu Wendy, Hyland Paula L., Lim Robert, Smpokou Patroula, Pacanowski Michael. 2022. Expanding Approved Patient Populations for Rare Disease Treatment Using In Vitro Data. Clinical Pharmacology and Therapeutics 112:58–61.
Weile Jochen, Roth Frederick P. 2018. Multiplexed Assays of Variant Effects Contribute to a Growing Genotype-Phenotype Atlas. Human Genetics 137:665–78.
Weng Chenchun, Faure Andre J., Escobedo Albert, Lehner Ben. 2024. The Energetic and Allosteric Landscape for KRAS Inhibition. Nature 626:643–52.
Yeo Giles S. H., Lank Emma J., Farooqi I. Sadaf, Keogh Julia, Challis Benjamin G., O’Rahilly Stephen. 2003. Mutations in the Human Melanocortin-4 Receptor Gene Associated with Severe Familial Obesity Disrupts Receptor Function through Multiple Molecular Mechanisms. Human Molecular Genetics 12:561–74.
Yu Jing, Gimenez Luis E., Hernandez Ciria C., Wu Yiran, Wein Ariel H., Han Gye Won, McClary Kyle, et al. 2020. Determination of the Melanocortin-4 Receptor Structure Identifies Ca2+ as a Cofactor for Ligand Binding. Science 368:428–33.
Zhang Huibing, Chen Li-Nan, Yang Dehua, Mao Chunyou, Shen Qingya, Feng Wenbo, Shen Dan-Dan, et al. 2021. Structural Insights into Ligand Recognition and Activation of the Melanocortin-4 Receptor. Cell Research 31:1163–75.
Zhou Qingtong, Yang Dehua, Wu Meng, Guo Yu, Guo Wanjing, Zhong Li, Cai Xiaoqing, et al. 2019. Common Activation Mechanism of Class A GPCRs. eLife 8 (December). https://doi.org/10.7554/eLife.50279

Article and author information

Author information

Conor J Howard
Octant, Inc, Emeryville, United States
ORCID iD: 0000-0001-5375-6248
- These authors contributed equally to this work.
Nathan S Abell
Octant, Inc, Emeryville, United States
ORCID iD: 0000-0003-1098-5395
- These authors contributed equally to this work.
Beatriz A Osuna
Octant, Inc, Emeryville, United States, Tacit Therapeutics, Inc, South San Francisco, United States
ORCID iD: 0000-0003-2604-6173
Eric M Jones
Octant, Inc, Emeryville, United States
ORCID iD: 0000-0002-6648-1965
Leon Y Chan
Octant, Inc, Emeryville, United States
ORCID iD: 0000-0002-0189-4689
Henry Chan
Octant, Inc, Emeryville, United States
ORCID iD: 0009-0008-5927-3587
Dean R Artis
Octant, Inc, Emeryville, United States, Annexon Biosciences, Brisbane, United States
Jonathan B Asfaha
Octant, Inc, Emeryville, United States
ORCID iD: 0000-0003-1160-5104
Joshua S Bloom
Department of Human Genetics and Department of Computational Medicine, University of California, Los Angeles, Los Angeles, United States, Howard Hughes Medical Institute, Chevy Chase, United States
ORCID iD: 0000-0002-7241-1648
Aaron R Cooper
Octant, Inc, Emeryville, United States, Arsenal Biosciences, South San Francisco, United States
ORCID iD: 0000-0003-4588-2513
Andrew Liao
Octant, Inc, Emeryville, United States
ORCID iD: 0000-0002-4548-9175
Eden Mahdavi
Octant, Inc, Emeryville, United States
Nabil Mohammed
Octant, Inc, Emeryville, United States
ORCID iD: 0009-0003-6446-6379
Alan L Su
Octant, Inc, Emeryville, United States, Department of Chemical and Systems Biology, Stanford University, Stanford, United States
ORCID iD: 0009-0005-3406-9270
Giselle A Uribe
Octant, Inc, Emeryville, United States
ORCID iD: 0000-0003-1523-6916
Sriram Kosuri
Octant, Inc, Emeryville, United States
ORCID iD: 0000-0002-4661-0600
Diane E Dickel
Octant, Inc, Emeryville, United States
ORCID iD: 0000-0001-5497-6824
- For correspondence: diane@octant.bio
Nathan B Lubock
Octant, Inc, Emeryville, United States
ORCID iD: 0000-0001-8064-2465
- For correspondence: nate@octant.bio

Author Notes

Competing interests: CJH, NSA, BAO, EMJ, LYC, HC, JBA, JSB, ARC, SK, DED, and NBL are current employees of Octant, Inc. and/or hold shares or options in the company. EMJ, HC, ARC, and SK are listed inventors on patents related to this work. All other authors declare no competing financial interests.

Version history

Preprint posted: October 12, 2024
Sent for peer review: November 4, 2024
Reviewed Preprint version 1: December 19, 2024
Reviewed Preprint version 2: February 18, 2025
Version of Record published: April 9, 2025

Cite all versions

You can cite all versions using the DOI https://doi.org/10.7554/eLife.104725. This DOI represents all versions, and will always resolve to the latest one.

Copyright

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

