Determinants of trafficking, conduction, and disease within a K+ channel revealed through multiparametric deep mutational scanning

  1. Willow Coyote-Maestas
  2. David Nedrud
  3. Yungui He
  4. Daniel Schmidt  Is a corresponding author
  1. Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, United States
  2. Department of Genetics, Cell Biology and Development, University of Minnesota, United States

Abstract

A long-standing goal in protein science and clinical genetics is to develop quantitative models of sequence, structure, and function relationships to understand how mutations cause disease. Deep mutational scanning (DMS) is a promising strategy to map how amino acids contribute to protein structure and function and to advance clinical variant interpretation. Here, we introduce 7429 single-residue missense mutations into the inward rectifier K+ channel Kir2.1 and determine how this affects folding, assembly, and trafficking, as well as regulation by allosteric ligands and ion conduction. Our data provide high-resolution information on a cotranslationally folded biogenic unit, trafficking and quality control signals, and segregated roles of different structural elements in fold stability and function. We show that Kir2.1 surface trafficking mutants are underrepresented in variant effect databases, which has implications for clinical practice. By comparing fitness scores with expert-reviewed variant effects, we can predict the pathogenicity of ‘variants of unknown significance’ and disease mechanisms of known pathogenic mutations. Our study in Kir2.1 provides a blueprint for how multiparametric DMS can help us understand the mechanistic basis of genetic disorders and the structure–function relationships of proteins.

Editor's evaluation

This manuscript details a new approach to systematically interrogating the relations between the amino acid sequence of ion channels, and their stability, subcellular localization, and function. By applying this approach to an inward-rectifier K+-channel, the authors uncover intriguing structural relations between channel regions likely involved in protein stability and those responsible for functional dynamics. Their analysis also offers predictions for the potential pathogenicity of channel variants of unknown significance found in the human population.

https://doi.org/10.7554/eLife.76903.sa0

Introduction

Inward rectifier K+ (Kir) channels play a central role in physiology by setting, maintaining, and regulating a cell’s resting membrane potential (RMP) (Hibino et al., 2010). Misregulated Kir’s cause neurological disorders (alcohol and opiate addiction; Bodhinathan and Slesinger, 2013; Marron Fernandez de Velasco et al., 2015; Torrecilla et al., 2002), Down’s syndrome (Rachidi and Lopes, 2007), epilepsy (Köhling and Wolfart, 2016), Parkinson’s (Zhang et al., 2020), cardiac disorders (long QT syndrome; Moss and Kass, 2005), hereditary renal diseases (Flagg et al., 1999), and diabetes (Flanagan et al., 2009; Hattersley and Ashcroft, 2005; Marthinet et al., 2005).

While some mechanisms that underlie disease-causing Kir variants involve aberrant function (gating, ion permeation, ligand binding; Abraham et al., 1999) or transcript processing (Vuong et al., 2016), others are linked to trafficking defects. For example, over half the missense mutations investigated in Kir1.1 affected surface expression, often caused by proteosomal degradation of folding-deficient or mistrafficked channels (O’Donnell et al., 2017; Peters et al., 2003). A comparison of mutation hotspots in the Kir C-terminal domain (CTD) revealed that ~60% of variants could be linked to impaired surface expression (Fallen et al., 2009; Zangerl-Plessl et al., 2019). Mutations linked to neonatal diabetes (which is caused by gain-of-function mutations) and congenital hyperinsulinism affect Kir6.2 surface expression to varying degrees (Flanagan et al., 2009; Hattersley and Ashcroft, 2005; Marthinet et al., 2005; Lin et al., 2003). Beyond known trafficking signals (e.g., ER and Golgi-export signal sequences; Li et al., 2016; Ma et al., 2001; Ma et al., 2002; Ma et al., 2011; Stockklausner et al., 2001; Zerangue et al., 1999), mutations along the entire Kir primary sequence can disrupt surface expression (Zangerl-Plessl et al., 2019). Several additional factors control surface expression, such as protein stability or interactions with trafficking partners and complexes that stabilize channels in the membrane (Ma et al., 2001; Ma et al., 2002; Leonoudakis et al., 2004).

The apparent prevalence of Kir trafficking phenotypes has a significant caveat: phenotypes are experimentally determined for only a small fraction of possible or clinically observed variants. Of the 8113 possible missense mutations for KCNJ2/Kir2.1, only 252 are reported in ClinVar – a database of clinically observed variation and phenotypes (Landrum et al., 2018) – and gnomAD – a database of human variation from exome and whole-genome sequences (Karczewski et al., 2020; Figure 1A, Figure 1—figure supplement 1, Supplementary file 1). Most studies focus on common natural variants observed in human genomes or exomes. Rare genetic variants vastly outnumber common variants (Tennessen et al., 2012) and are implicated in a substantial portion of complex human disease (McClellan and King, 2010). Rarity means that clinical significance (benign or pathogenic) of most variants cannot be established using genome-wide association studies due to low statistical power for calling pathogenic variants. For KCNJ2/Kir2.1, ClinVar reports 162 missense mutations, but most (74%) have ‘uncertain significance’ or conflicting or no interpretation (Figure 1A). Computational algorithms (PolyPhen-2 [Adzhubei et al., 2010], SIFT [Ng and Henikoff, 2003], EVE [Frazer et al., 2021], etc.) are currently filling this gap by predicting the consequence of mutations. However, computational approaches work best in conserved regions of genes and cannot predict mechanism of pathogenicity.

Figure 1 with 4 supplements see all
Deep mutational scanning (DMS) to improve clinical variant prediction.

(A) Clinically observed variation in Kir2.1 and other Kir and their interpretation of pathogenicity. (B) Schematic summarizing the various processes that must work correctly for Kir2.1 to fulfill its role in cellular physiology. (C) Schematic summarizing DMS and multiparametric assessment of phenotype. Mutations are introduced into Kir2.1 using SPINE. A stable single-copy insertion library is generated by BxBI-mediated recombination in HEK293T. Cells are sorted based on channel surface expression or function (K+ conductance) as determined by antibody labeling of an extracellular FLAG-tag or voltage-sensitive dye, respectively. Genotypes of each sorted cell population are recovered by NextGen Sequencing (NGS).

Despite a central role for Kir trafficking and function in disease etiology, there are no large-scale studies that comprehensively study sequence determinants of Kir trafficking and function. Apart from an early study that identified functional Kir2.1 substitutions (Minor et al., 1999), previous studies focused on a small subset of common natural variants. Systematic studies are needed for a global picture of Kir trafficking and function. Beyond explaining disease, large-scale mutational studies can reveal intrinsic properties of proteins (Atkinson et al., 2018; Coyote-Maestas et al., 2021; McLaughlin et al., 2012), determine protein structures (Schmiedel and Lehner, 2019), and inform mechanistic models of protein structure–function relationships (Matreyek et al., 2018).

Here, we combine programmable mutagenesis library generation with multiparametric sequencing-based assays for surface expression and function of Kir2.1. Our data reveal distinct determinants for Kir trafficking, folding, and function. The phenotype of most clinically observed variants can be explained by impacting function, not surface expression. We also find further support for the hypothesis that a hierarchical organization of ion channels structure balances stability and flexibility for folding and function (Coyote-Maestas et al., 2021).

Results

A multiparametric deep mutational scan of Kir2.1

For Kir2.1 to fulfill its roles in cellular physiology, it must be ER targeted, folded, tetramerized, and surface trafficked (Figure 1B). Once trafficked to the cell surface, Kir2.1 must be sensitive to phosphatidylinositol-4,5-bisphosphate (PIP2) (Hibino et al., 2010; Hansen et al., 2011), undergo gating-associated conformational transitions (pore opening and closing), and selectively conduct K+.

To probe how Kir2.1 mutations affect these processes, we use a programmable deep mutagenesis approach (SPINE; Coyote-Maestas et al., 2020; Nedrud et al., 2021) and mutated residues 2–392 of mouse Kir2.1 (UniProt P35561) to every other amino acid (Figure 1C). We included synonymous mutations in 20 Kir2.1 positions as an internal standard to determine wildtype fitness. From this deep mutational scanning (DMS) library, we generated a stable single-copy library HEK293 cell line using BxBI-mediated recombination (Matreyek et al., 2020) for stringent genotype–phenotype linkage. Endogenous channel currents in HEK293 are negligible when compared to those of exogenous channel driven from a strong constitutive promoter (CAG) (Varghese et al., 2006). Sequencing of the stable library cell line indicates near-complete coverage and high redundancy: 90.2% of variants were detected; >89% of variants had a read count of greater than 20 (Figure 1—figure supplement 2).

Since mutations can affect any of the abovementioned processes, multiparametric assessment of phenotypes is required to develop a comprehensive mechanistic understanding of Kir2.1 variation. We focused on separating a mutation's effect on secretion, folding, and trafficking (‘surface expression’) and allosteric regulation and ion conductance (‘function’). Considering the variant library size (391 × 19 = 7429 variants), assays that assess surface expression and function must be high-throughput. Fluorescently activated cell sorting (FACS) allows sorting of variant libraries based on protein phenotypes; sequencing can be used to assign a genotype to each sorted phenotype.

Mutations that interfere with secretion, folding, or trafficking will cause reduced Kir2.1 surface expression (Bendahhou et al., 2003; Pegan et al., 2006; Tinker et al., 1996). To measure Kir2.1 surface expression in cells, we inserted an epitope Flag-tag into the extracellular loop (T115; Stockklausner et al., 2001). We can determine surface expression scores for all mutants in the library by using FACS and NextGen Sequencing (NGS) on fluorescently antibody-labeled cells expressing the Kir2.1 DMS library (Figure 1—figure supplement 3). Phenotypes can be linked to genotype by NGS. We have extensively used this method in FACS-based genotype–phenotype assays (Coyote-Maestas et al., 2021; Coyote-Maestas et al., 2020; Coyote-Maestas et al., 2019).

Functional Kir2.1 hyperpolarizes HEK293 cells by driving the RMP toward the reversal potential of K+ (–80 mV for our conditions) while nonfunctional Kir2.1 variants do not affect RMP (–35 mV in HEK293). We can measure RMP with voltage-sensitive dyes, such as the FLIPR Blue, a concept that we and others demonstrated in optical Kir function assays (Coyote-Maestas et al., 2019; Adams and Levin, 2012). Changes in membrane voltage alter dye membrane partitioning, and therefore extinction coefficient and fluorescence. Hyperpolarized cells expressing functional Kir2.1 can be separated from more depolarized cells expressing nonfunctional Kir2.1 (e.g., the nonconducting V302M mutant; Ma et al., 2007) by FACS using FLIPR dye fluorescence (Figure 1—figure supplement 4). As with surface expression, phenotype can be linked to genotype by NGS.

A global view of trafficking determinants

To learn how amino acids contribute to and mutations alter Kir2.1 trafficking and folding, we sorted and sequenced the Kir2.1 DMS library based on surface expression. Surface trafficking fitness was determined for 6898 Kir2.1 variants (93%). Of 521 missing variants (no data in either replicates), only 154 were also missing in the stable library cell line, suggesting that dropout is stochastic and not related to library construction or poor coverage in the original cell pool (Figure 2—figure supplement 1). Biological replicates for subpools were highly correlated (Pearson correlations 0.85–0.94, Figure 2—figure supplement 2) and read depth is excellent (greater than 30-fold at most positions, Figure 2—figure supplements 34). Surface expression fitness scores and standard errors were calculated using Enrich2 (Rubin et al., 2017). Enrich2 calculates log enrichment by fitting a weighted regression based on changes in mutation frequency across samples within an experiment. Positive scores represent enrichment (increased surface expression) greater than wildtype, while negative scores represent depletion relative to wildtype. In our assay, surface fitness follows a bimodal distribution with one population of mutations centered at wildtype fitness and the other population strongly decreasing surface expression (Figure 2A). Median and standard deviation for synonymous and missense mutations were –0.1 ± 0.4 and –0.17 ± 2.18, respectively (Figure 2A, Figure 2—figure supplement 5). 10% percentile tail ends differed by 4.1 and 0.94 log units at the low and high ends, respectively, which suggest that differences in surface trafficking are well-resolved.

Figure 2 with 5 supplements see all
Surface expression fitness.

(A) Distribution of surface expression fitness scores for synonymous mutations (red line) and missense mutations (black line). (B) Heatmap of Enrich2-calculated surface expression fitness scores expressed as a gradient from blue (below wildtype [WT] fitness) over white (same as WT fitness) to red (above WT fitness). WT residues are indicated by filled green boxes. Missing data is indicated by filled dark gray boxes. Synonymous mutations are indicated by red box outlines. (C) Cartoon of Kir2.2 (PDB 3SPI, 70% identity with Kir2.1). Monomer chains shown in different colors. Lipid bilayer boundaries are indicated by dashed lines. (D) Close-up of core β-sheet of the C-terminal domain (CTD) with mean surface expression fitness (averaged across all measured mutations for that position) mapped onto each residue as a gradient from blue over white to red, corresponding to below WT, same as WT, and above WT fitness, respectively. Residues that comprise the Golgi export signal are outlined green. Numbering corresponds to Kir2.1-Flag. (E) Interface close-up of neighboring subunits (green and magenta cartoons). Sidechains are colored by surface fitness score; same scale as in (D). (F) Close-up of βDE and βLM loop mean surface expression fitness. (G) Boxplots showing surface fitness in the βDE and βLM loop if the WT residues of a specific type (e.g., glutamate, charged) is mutated to another residue of the same type (e.g., aspartate) or a different type (e.g., tryptophan, aromatic). Median is marked with a thick line, the vertical length of the box represents the interquartile range (IQR), upper fence: 75th percentile +1.5× IQR, lower fence: 25th percentile −1.5× IQR, outlier points are shown as solid black circles. All data points are indicated by transparent dots. Only replacing aromatic residues significantly increases surface expression. Significance is tested using one-sided t-tests; number of observations (n) and p-values are indicated for each comparison.

As expected, mutations are highly deleterious within the Flag-tag used for surface labeling and known trafficking signals (Figure 2B). Substituting aromatic residues with charged residues strongly decreases surface expression within the 81WRWMLLLLF88 motif that mediates interaction with Caveolin-3 (Vaidyanathan et al., 2018). Glutamates within the di-acidic ER export signal (Ma et al., 2001), 382FCYENE387, are quite sensitive to mutations with the second glutamate being far more sensitive. Interestingly, negatively charged mutations, N386E or N386D, between the di-acidic motif increase surface expression (3.3 ± 0.76 and 3.9 ±1.2, respectively).

A Golgi export signal comprised of two residues patches at Kir2.1’s N- and C-termini interacts with the AP1 adaptin complex to target Kir2.1 to clathrin-coated vesicles for Golgi to plasma membrane trafficking (Ma et al., 2011). Others have used alanine scans to identify sequence determinants for Golgi export. First, Ma et al., 2011 defined the minimal signal motif comprised of hydrophobic residues (I328, W330, Y323) and a juxtaposed electrostatic interaction between R46 and E335. Later, Li et al., 2016 included salt bridges between N-terminus (R44, R46, K50) and C-terminus (E301, E327), and hydrophobic residues (F203, L239, L240, F300). This structured trafficking signal is formed when the CTD has folded correctly to include a hydrophobic cleft, an adjacent cluster of basic residues, and a network of interactions between termini and adjacent subunits. Ma et al. suggested this predisposes Golgi exit on correct protein conformation; our comprehensive mutagenesis suggests this quality control mechanism may extend beyond previously identified sequences. It appears to involve all core β-sheets in the CTD (βD1, βH, βI, βJ, βK, βB2, βC, βG) and helices αG and αF (Figure 2B and C). Across CTD secondary structure elements, we find a distributed network of residues that, when mutated, result in reduced surface expression (Figure 2D). Changes to the packing of the hydrophobic core of the CTD will likely impact the Golgi export motif orientation. We hypothesize that the Golgi export signal is a reporter for folding of the entire CTD. Several lines of evidence support this hypothesis: IgG-like domains are a β-sheet sandwich with conserved hydrophobic residues at sheet interfaces that are crucial for fold stability (Cota et al., 2001; Hamill et al., 2000), computational mutagenesis in Kir1.1 showed these residues are crucial for fold stability, and experimental mutagenesis found that mutations strongly reduce inward rectifier surface expression (Fallen et al., 2009; Koster et al., 1998). Residues in this network belong to the same subunit; disrupting interactions between residues belonging to neighboring subunits (e.g., carbon–π and π–π interactions between F47, L253, Y345; Figure 2B and E) do not uniformly reduce surface expression. Although Kir is presumably already assembled into tetramers (Deutsch, 2002), our data suggests that the conformational checkpoint for Golgi export requires properly folded monomers not correctly assembled tetramers.

Sites enriched for beneficial or deleterious substitutions could point to additional mechanisms for Kir2.1 trafficking control. Along those lines, we observe that substitution of N- and C terminal residues (G2–G31, S377–S381) with polar mutations have neutral trafficking phenotypes, while aromatic or hydrophobic mutations tend to decrease surface expression. While we do not know the underlying mechanism, our data suggests that disordered N- and C-termini are important in targeting Kir2.1 to the cell surface.

Mutations along βB2/C(M211–K215), βDE(S242–P252), βK(R333, P336), βL(F339), βM(Y345), βN(Y356), and αF helix(K372–L376) tend to improve surface trafficking above WT especially when aromatic residues in the βDE and βLM loops are replaced with any non-aromatic amino acids (Figure 2G, t-test p-value 0.017). These sites form a patch on the outer face of the CTD domain where two subunits interdigitate (βDE from one subunit, the remaining elements from another subunit; Figure 2F). In this region, beneficial mutations have no clear preferences beyond non-aromatics. While no known trafficking signal matches, aromatic residues are often found in binding interfaces (Vaidyanathan et al., 2018; Williams and Fukuda, 1990). Perhaps mutations alter trafficking patterns by disrupting protein interactions (e.g., reduce ER retention, reduce forwarding to lysosomal recycling).

Overall, the agreement with earlier trafficking studies in Kir2.1 shows that our high-throughput surface expression assay works. This allows us to establish a global view of sequence/trafficking relationships in Kir2.1. Comprehensive mutagenesis allows us to validate and fill in the gaps on existing models of channel trafficking (Li et al., 2016; Ma et al., 2011) while discovering new potential trafficking determinants.

A cotranslationally folded biogenic unit in Kir2.1

By examining structured regions sensitive to mutations, we can identify determinants of Kir2.1 folding and trafficking. In the pore domain, we find the lower halves of M1 and M2 helices are more tolerant to mutation compared to the upper halves. Residues whose sidechains project toward the pore cavity are particularly deleterious. At the base of the pore domain, mutations are highly deleterious within a cluster of hydrophobic residues (W81, L85, I178, F182, I187, M191) that form inter-subunit interactions between M1 and M2 helices. The top of the pore domain – a region comprised of the upper halves of M1, M2, and pore helix – is extremely intolerant to substitutions. This includes the M1-pore helix loop (e.g., H110–E133), whose amino acid sidechains contribute to tight packing centered around a hydrophobic pocket formed by F103 (M1), V131 (M1-pore helix loop), F143 (pore helix), V158 (pore helix-M2 loop), and V169 (M2) (Figure 3A). The same region contains two highly conserved cysteines (C130, C162) that form a disulfide bond that is critical for folding but not function (K+ conductance) (Cho et al., 2000; Leyland et al., 1999; Tao et al., 2009). Others found that P163, at M2’s apex, is essential for efficient surface trafficking in Kir2.2 (Dassau et al., 2011). In agreement with these prior studies, all substitutions in these positions are deleterious to surface expression scores.

Determinants of improved surface expression and Kir2.1’s biogenic folding unit.

(A) Close-up of mean surface expression fitness scores in the pore domain. Residues that are part of the putative biogenic unit of Kir2.1 are shown as sticks. (B) Mean surface expression fitness of selectivity filter residues. K+ ions are shown as purple spheres. Regions with no data are shown in magenta. (C) Surface expression fitness of specific residues when replaced with charged (left panel) or polar residues (right panel). Selectivity filter residues are colored turquoise; neighboring residues are colored salmon. The solid horizontal line represents the median surface trafficking fitness of Kir2.1 synonymous mutations with dashed lines being ±1 SD. The gray-shaded ribbon represents a 90% confidence interval of mean surface fitness (n = 2–7) for each position (solid blue lines). (D) Cumulative density function of surface fitness score within the selectivity filter (turquoise line) and neighboring residues (salmon line). (E) Mean surface expression fitness scores mapped onto a single Kir2.1 monomer highlighting the importance of the biogenic folding unit in the pore domain and the core β-sheet of the C-terminal domain (CTD). All mapped mean surface fitness scores (averaged across all measured mutations for that position in A, B, and E) are represented as a gradient from blue over white to red, corresponding to below wildtype (WT), WT, and above WT fitness, respectively.

Glycines in the selectivity filter (147TQTTIGYG154 in Kir2.1; conserved residues are underlined) are absolutely conserved in all K+ channels. Because glycine is achiral, it is the only canonical amino acid that can satisfy the unusual conformation of the selectivity filter main chain, which alternates between left-handed and right-handed α-helical dihedral angles (Valiyaveetil et al., 2004). Despite this strong conservation, the selectivity filter is significantly more tolerant to mutations compared to the preceding M1 and the following pore helix (Figure 3B–D, two-sample Kolmogorov–Smirnov test p-value 1.918e-13). Substitutions of either T150 (which occupies unfavorable backbone torsion angle conformation in the Kir2.2. crystal structure; Hansen et al., 2011; Tao et al., 2009) or Y153 (which forms π–π interaction to a neighboring subunit) in the selectivity filter with glycine do not decrease surface expression. This suggests that the peculiar conformation of the selectivity filter is not required for channel biogenesis.

Based on the extreme mutational sensitivity, high contact density of sidechain packing in the Kir2.2 crystal structure, and known disulfide-mediated fold stabilization, we propose that the top halves of M1 and M2, pore helix, and the M2-pore helix loop have critical roles in ion channel biogenesis and trafficking. This may be a general feature of K+ channel folding as a similar ‘cotranslationally folded biogenic unit’ was proposed for Kv1.3, in which S5/pore loop/S6 interact to establish a ‘reentrant pore architecture’ and correct topology early in channel biogenesis without requiring tetramerization (Delaney et al., 2014; Gajewski et al., 2011). In KcsA, pore helix folding and tetramerization are uncoupled, with the selectivity filter being able to access nonnative, partially unstructured states during folding (Song et al., 2021). Consistent with these previous studies, residues from neighboring subunits that interact with this putative biogenic unit in the fully assembled channel are relatively tolerant to mutations (e.g., selectivity filter residues Y153, which forms a π–π interaction with F143 and F103). This supports the idea that early trafficking-critical folding events involve only intra-subunit interactions that promote tertiary structure formation within nascent Kir2.1 monomers (perhaps in the ER translocon) to establish correct topology and promote membrane insertion.

Regions involved in gating transitions do not contribute to fold stability

From the perspective of surface trafficking as the measured phenotype, several regions of Kir2.1 are tolerant to mutations. This includes the interface between pore and CTD (comprised of slide helix αA/αB, inner helix gate [I183–M188], and tether helix αF), the G-loop (βH-βI loop), and the residues lining the CTD cavity (loops between βC-βD and βE-βG) at the interface between different subunits (Figure 3E). Interestingly, these are involved in gating-associated conformational changes (Hansen et al., 2011; Li et al., 2015) induced by the binding of PIP2, Kir2.1's allosteric activator (two-sided Fisher’s exact test for interrelation of mutational tolerance and gating region membership, p-value 8.091e-05, odds ratio 2.47). An explanation for high mutational tolerance in gating structural elements is that they have higher conformational flexibility required for interconversion between different gating states. They are therefore unlikely to contain motifs recognized by folding-based quality control mechanisms.

Taking the biogenic folding unit and the location of mutation tolerant regions together, Kir2.1 appears to be organized internally into contiguous regions involved in fold-stability (biogenic folding unit, IgG-like CTD) and gating transitions (TM/CTD interface, CTD subunit interfaces), each with distinct mutational tolerance. To understand the relationship between folding and function, we assayed K+ conductance as a second phenotype for the DMS library.

Functional phenotype assays map molecular determinants of conduction and PIP2 sensitization

To assay variant function, we sorted the Kir2.1 DMS library (in duplicate, on separate days) into populations based on voltage-sensitive dye fluorescence, sequenced, and calculated fitness using the same Enrich2 package. This assay is based on the premise that Kir2.1 decreases RMP, resulting in decreased fluorescence of the FLIPR dye. Functional fitness was determined for 6944 Kir2.1 variants (93.4%), with similar replicates (Pearson correlations 0.86–0.93, Figure 4—figure supplement 1) and excellent read depth (greater than 30-fold at most positions, Figure 4—figure supplements 23). Of the 475 missing variants (no data in either one or both replicates), only 158 were also missing in the stable library cell line, again suggesting stochastic dropout (Figure 2—figure supplement 1).

Function fitness median and standard deviation for synonymous and missense mutations were 0.07 ± 0.24 and 0.04 ± 0.6, respectively (Figure 4A, Figure 4—figure supplement 4). 10% percentile tail ends differed by 0.53 and 0.35 log units at the low and high ends, respectively, which suggest much less resolution compared to the surface trafficking assay. Unlike the bimodal distribution for surface expression fitness, functional fitness is unimodal, with most mutations being neutral (Figure 4A); there is no distinct low function population likely due to lower dynamic range of the voltage-sensitive dye compared to fluorescent antibody labeling (Figure 1—figure supplements 34). Function fitness is the combination of surface expression (number of channels on surface) and functional properties (open probability, single-channel conductance, K+ selectivity, etc.). Furthermore, our assay is measuring steady-state RMP and even partially functional Kir2.1in the balance with onductances from other channels constiutively expressed within HEK293T cells could hyperpolarize cells. In aggregate, this means we must interpret functional fitness in the context of surface expression fitness.

Figure 4 with 4 supplements see all
Function fitness.

(A) Distribution of function expression fitness scores for synonymous mutations (red line) and missense mutations (black line). (B) Mean function fitness scores averaged across all measured mutations for that position mapped onto a single Kir2.1 monomer highlighting the importance of the pore helix, selectivity filter, and structural elements at the TM/C-terminal domain (CTD) interface (e.g., interfacial helix, tether helix, G-loop gate) for function. (C) Heatmap of function expression fitness scores. Wildtype (WT) residues are indicated by blue filled boxes. Missing data is indicated by filled dark gray boxes. Synonymous mutations are indicated by red box outlines. (D–G) Heatmap of fitness score in regions that contain bona fide trafficking signal (D, E; for surface and function scores, respectively) or the PIP2 binding site (F, G; for surface and function scores, respectively). (H) Close-up of function fitness at the TM/CTD interface. Allosteric ligand PIP2 is shown as sticks. Surface fitness in (D, F) is represented as a gradient from dark blue over white to dark red, while function fitness in B, C, E–H is represented as a gradient from green over white to magenta. Both gradient colors order corresponds to below WT, same as WT, and above WT fitness, respectively.

Many variants in the pore domain have low functional fitness scores (e.g., pore helix, selectivity filter, Figure 4B). For example, Y153 mutations, which have relatively tolerant surface fitness, strongly reduce function (Figure 4C). Y153 interacts through π–π interactions with F143 and F103 on a neighboring subunit and mutations may preclude correct selectivity filter geometry to conduct K+ ions. The conservative substitution Y153F has a neutral phenotype, in agreement with earlier electrophysiology studies (Heginbotham et al., 1994; Splitt et al., 2000). Many positions along the pore helix that form a hydrogen bond network to support a conductive and K+ selective state (Cheng et al., 2011) are very sensitive to mutation. Mutations of conserved hydrophobic residues in the CTD core β-sheet (e.g., W220) and the Golgi export signal (e.g., hydrophobic cleft residues Y323, W330, and salt bridge R44, E327) are deleterious as predicted by their effect on surface expression.

Disordered N and C termini including the ER export signal FCYENE had moderately low surface expression fitness, but their functional fitness is mostly near-neutral (Figure 4D and E). This suggests that mutations within the disordered regions only impact trafficking of otherwise properly folded and functional channels. Mutations within the unstructured termini are deleterious to surface expression but neutral to channel function.

A specific structural mechanism for PIP2 regulation was previously proposed (Hansen et al., 2011; Tao et al., 2009). PIP2 binding at the TM/CTD interface induces a disorder-to-order transition in the tether helix, contracting it and thereby translating the CTD toward the pore domain. The engagement between CTD and pore domain allows the G-loop to wedge into the pore domain and the inner helix activation gate to open. As described by Hansen et al., 2011, PIP2’s binding site includes pore domain residues (R80, W81, R82, K190) and CTD residues (K193, K195, K196, R67, R226, R197). We found that in these residues surface expression fitness was neutral and functional fitness was negative (Figure 4F and G). This gave us the idea that by focusing on mutation-sensitive positions (for function), we may be able to trace secondary structural elements that couple the PIP2 binding site to the Kir2.1’s gate, the G-loop (Pegan et al., 2005). Beginning with function-sensitive residues that directly interact with PIP2 in the pore domain (R80, W81, R82) and tether helix (K190, K193, K195, K196), we find additional residues in the tether helix (R197) that are in close vicinity to function-sensitive residues in neighboring structural elements (L201 in βB1, R67 in the interfacial helix). Both βB1 and interfacial helix form structural contacts with the βCD loop (Hansen et al., 2011; Whorton and MacKinnon, 2011), and prior functional studies have established the role of βCD in mediating PIP2 regulation (D’Avanzo et al., 2013; Lopes et al., 2002). In agreement with these studies, residues are highly function-sensitive in the entire βCD loop (Figure 4H). Function sensitivity extends to residues on the βCD loop (e.g., H229) and βEG loop (e.g., F270) of one subunit that potentially couple to the βH (I305), βI (R320), and the G-loop gate of another subunit, thus completing the trace to Kir2.1’s gate.

Taken together, DMS aligns with established mechanistic knowledge and structural connectivity, supporting the central role of the βCD loop propagating PIP2 binding at the TM/CTD interface to the G-loop gate. Molecular dynamics simulations uncovered an extended network of interactions between the N-terminus, tether helix, and CD loop to keep the G-loop gate in a closed state (Li et al., 2015). Electrophysiology demonstrated that mutations in R218 (R226 in Kir2.1-Flag numbering), which is conserved in the KIR family, attenuate PIP2 affinity (Lopes et al., 2002) to cause Anderson–Tawil syndrome (Plaster et al., 2001). Mutating the corresponding residues in Kir3.2 – R201A – constitutively activates channels, potentially by forcing the βCD loop into a conformation that mimics the G-protein-activated state (Whorton and MacKinnon, 2011).

Another striking feature revealed by functional assays are function-increasing mutations near the helix bundle crossing gate toward the lower half of TM2 (Figure 5A and B). Previous studies suggested that this ‘lower glycine hinge’ is required for tight packing of M2 helices (Shang and Tucker, 2008). Mutation to aspartate (G178D in Kir2.2) resulted in a ‘forced-open’ channel (Zangerl-Plessl et al., 2020). In agreement, G185D (the corresponding residue in Kir2.1-Flag, and – more strongly – G185T) increased function fitness scores (Figure 4C). Several other, but not all, mutations above and below G185 also increase functional fitness (e.g., I184H, M188R, M191Y). Together, these residues form a cuff above the HBC gate coupling TM2 from different subunits (Delaney et al., 2014; Doyle et al., 1998). The cuff may stabilize specific conformations of M2 that decrease open probability by keeping the HBC gate closed; mutations to large residues may disrupt this hydrophobic interaction network with a function-increasing effect. Additional electrophysiological characterization is needed to independently test this idea. The HBC gate is also stabilized by hydrophobic contacts to M1 of an adjacent subunit (Zangerl-Plessl et al., 2020; Meng et al., 2016). Consistent with M1 interactions forming a gating energy barrier, mutating interacting TM1 residues (F88) also increases function fitness scores.

Figure 5 with 1 supplement see all
HBC gate mutations and regions with distinct roles in stability and function.

(A) Function fitness (averaged across all measured mutations for that position) in the pore domain represented as a gradient from green over white to magenta, corresponding to below wildtype (WT), same as WT, and above WT fitness, respectively. One subunit is removed for clarity. K+ ions in the selectivity filter are shown as purple spheres. Sidechains of residues that are enriched for activating mutations are shown as sticks. (B) Detailed view of the HBC gate. Numbering corresponds to Kir2.1-Flag. (C) Surface fitness (orange circles) and function fitness (teal circles) for each position. The line length between each score represents the score difference and line color represents sign combination of both scores (both negative; one negative, one positive; etc.). Vertical solid and dashed lines represent median scores ±1 SD of synonymous Kir2.1 mutations for surface fitness (orange) and function fitness (teal). Kir2.1 secondary structural element are shown as gray boxes; trafficking signals are colored green. (D–F) Residues with significant function/surface fitness score differences are mapped onto the Kir2.2 structure (PBD 3SPI). Residues with mutations that predominantly reduced function are shown in magenta. Residues with mutations that predominantly reduced surface expression are shown in cyan. PIP2 is shown in blue. Overview (D), close-ups of selectivity filter (E), and TM/C-terminal domain (CTD) (F).

To identify regions that are important for function and not surface expression, we compared surface fitness and function fitness for each position (Figure 5C). For many positions, including the FLAG-tag, N-termini, and Golgi export motifs, surface scores were strongly negative (deleterious) while function scores were relatively neutral (Figure 5C, positions with orange lines). This apparent discordance is likely due to the difference in measurement dynamic range between surface expression and function assay. The function assay that is measuring steady-state RMP and fully functional Kir2.1, even at lower surface expression level, may hyperpolarize the cell, giving the appearance of a neutral phenotype. Transmembrane helices and CTD core β-sheets had low surface score and low function scores (Figure 5C, positions with blue lines). For another subset of positions, function scores were significantly lower than surface expression scores (Figure 5C, positions with red lines). This includes sites involved in gating (PIP2 binding, G-loop gate). By making a binary assignment whether a position belongs is ‘surface trafficking and function-sensitive’ or ‘function-sensitive’ and mapping this assignment onto the Kir2.2 structure, we again find structural organization around elements that are important for fold stability (TM, CTD β-sheets) and function (PIP2 sensitization, gating, K+ conduction) (Figure 5D).

This distinct organization is clearest in the pore domain, where all interactions within a subunit are driving surface expression scores (Figure 5E, cyan sticks), while residues mediating subunit interactions required for the selectivity filter stability are driving function scores (Figure 5E, magenta sticks). Other residues that drive function scores are distributed along an interaction network that connects the PIP2 binding site (M2, αF) with the slide helix (αA/B), βCD loop, βEG loop, and the G-loop gate (βHI loop) (Figure 5F). Intriguingly, this network involves interfaces between adjacent subunits (N-terminus<>βLM loop interface, βI<>βCD loop, βD<>βEG loop). Subunit interactions determining gating is consistent with recent crystal structures of a forced-opened Kir2.2 in which inner helix gate opening requires CTD subunits to move relative to each other (Zangerl-Plessl et al., 2020).

For a more quantitative comparison, we used the Protein Contact Atlas (Kayikci et al., 2018) to systematically identify noncovalent intra- and intersubunit contacts in Kir2.2 (using both closed-state PDB 3SPI and forced-open state PDB 6M84; Figure 5—figure supplement 1, Supplementary file 2). Taking only residues in the CTD into account, we find that residues that participate exclusively in intra-subunit contacts are drivers of surface expression fitness. With surface fitness scores as the metric, residues that make inter-subunit contacts are much more tolerant to mutations (Figure 5—figure supplement 1D; two-sample Kolmogorov–Smirnov test p-value<2.2e-16). Conversely, function scores of mutations in residues that do make inter-subunit contacts are significantly lower than those that make only intra-subunit contact (Figure 5—figure supplement 1E, two-sample Kolmogorov–Smirnov test p-value 3.713e-10). However, this effect is less severe than for surface fitness, probably because a properly folded state is a prerequisite for proper function. Residues that exclusively make inter-subunit contacts or make no contacts at all are enriched for neutral/moderately function improving mutations.

Overall, the function assay allows us to validate and test existing models of Kir2.1 gating while also providing further evidence that inter-subunit interactions bias Kir2.1’s conformational ensemble toward closed states.

Surface and Function DMS results agree with prior studies

To provide a more critical assessment of our surface and function fitness dataset quality, we compared our per-variant scores with equivalent data from prior studies. Ma et al. previously had performed an alanine scan in Golgi export motifs and measured cell surface expression (Ma et al., 2011). Their fraction of surface expression compared to WT and our surface fitness score are strongly positively correlated (Pearson correlation coefficient 0.72, Figure 6A). Dart et al. had previously used scanning cysteine mutagenesis of the selectivity filter and Ag+-blockage to map which residue lines the ion conduction pathway (Dart et al., 1998). We find that Ag+-mediated inhibition is strongly positively correlated with surface expression fitness (Pearson correlation coefficient 0.75, Figure 6B.) This is consistent with the idea that buried residues – those involved in structural stability of the pore domain – are drivers of surface fitness score. Of course, we expect a negative correlation if we compare Ag+ block to function scores. However, we need to account for proper folding being a prerequisite for function. We therefore compared the ratio of function/surface fitness score with Ag+ block and found strong negative correlation. Corrected for the effect on surface fitness, residues important for function (higher Ag+ block) have lower function scores in our assay (Figure 6C). In a third comparison, we turned to Minor et al., 1999, who previously had used random mutagenesis of Kir2.1’s transmembrane domains. By using Sanger sequencing on colonies grown expressing Kir2.1 variants in a K+ transport-deficient yeast strain under selective conditions, they recovered frequency counts for substitutions compatible with forming functional channels. By comparing surface and function fitness to Minor et al.’s observed vs. nonobserved substitutions (presumably meaning unallowed mutations), we find that observed substitutions are enriched for more neutral fitness scores (two-sample Kolmogorov–Smirnov test, p-value 7.65e-11 and 3.584e-07, respectively) (Figure 6D and E).

Comparison to prior studies.

(A) Scatterplot of surface-expressed fraction (compared to Kir2.1) for alanine mutants tested in Ma et al., 2011 vs. their corresponding surface fitness score measured in this study. (B, C) Scatterplot of Ag+-mediated Kir2.1 inhibition report in Dart et al., 1998 vs. their corresponding surface fitness score (B) or surface fitness-normalized function score (C) measured in this study. For (A–C), the solid black line represents a linear regression with the gray-shaded ribbon being the 95% confidence interval. Pearson correlation coefficient and p-value are shown in inset. Numbering corresponds to Kir2.1-Flag. (D, E) Cumulative density function of surface scores (D) or function scores (E) for observed substitution (teal line) or not observed substitutions (red line) in Minor et al., 1999. All error bars represent Enrich2-calculated weighted-least-squares regression standard error.

Taken together, we find good agreement between prior studies and our surface fitness and function fitness scores. This suggests our datasets provide sufficient sensitivity, dynamic range, and fidelity for meaningful interpretation and hypothesis formation.

Most pathogenic inward rectifier mutations are clustered within functionally important residues

A central goal in molecular genetics is identifying the mechanistic basis by which mutations cause disease. Multiparametric DMS studies that interrogate multiple phenotypes associated with variants are a promising strategy to answer this question. Based on the premise that most pathogenic mutations affect protein stability, approaches such as Vamp-Seq (Matreyek et al., 2018) were developed as generalizable measures of protein abundance. In cases such as Kir2.1, where proper trafficking, localization, and gating are crucial for producing functioning proteins, measuring abundance likely will not work. It is necessary to differentiate between mutations’ effects on folding, trafficking, and gating to learn the mechanism for how genotype affects Kir2.1 phenotype. To learn potential pathogenicity and mechanism of action for reported disease-associated mutations, we compared the surface expression and function fitness data to clinically observed mutations.

Since the overall domain architecture and structure/function relationships are conserved within the inward rectifier family, we reasoned that the mechanisms underlying variant effects are conserved, as well. We therefore gathered missense variant effects reported in ClinVar (Landrum et al., 2018) and gnomAD (Karczewski et al., 2020) for inward rectifiers (as of March 25, 2022). By aligning all human Kir, we assigned the corresponding Kir2.1 position to each variant and noted whether the wildtype amino acid matches between Kir2.1 and the aligned Kir (total variant count: 2613; Supplementary file 1). To test if variants in ClinVar or gnomAD are related to surface trafficking or function, we compared the trafficking and function scores variants to those in databases vs. those that are not. They differed for trafficking scores but not for function scores (two-sided Kolmogorov–Smirnov test p-values 0.005563 and 0.7298, respectively; Figure 7A and B). This means being listed in genetic variation databases is related to Kir2.1 function and less related to trafficking. Variants with low surface scores in our DMS are underrepresented in databases. A likely explanation is that Kir2.1 is essential for normal physiology and therefore under strong selection. Indeed, in support of Kir2.1’s essential role, homozygous knockout mice die 8–12 hr after birth (Zaritsky et al., 2001) and Kir2.1 has a low missense constraint score of 0.5 within gnomAD (Karczewski et al., 2020). Variants that misregulate surface trafficking are more likely more deleterious (e.g., abolishing all K+ conductance) than variants affecting function (e.g., gating kinetics), which is why the former are extremely rare in the living human population. Consistent with this idea, exome and targeted sequencing in products of conception showed a significant enrichment of pathogenic variants associated with cardiac channelopathies in stillbirths (Sahlin et al., 2019). Furthermore, only two variants of the five most surface-trafficking impairing positions on Kir2.1’s Golgi export motif (R46, S322, E327, I328, W330, G331) are reported in ClinVar (S322F uncertain significance, W330C pathogenic). Another variant (R46H) is only observed as heterozygous in gnomAD.

Figure 7 with 1 supplement see all
Folding and trafficking variants in clinical databases.

Cumulative distributions of surface (A) or function (B) fitness scores for variants that are represented in ClinVar and gnomAD (red line) or that are not (black line). (C) Line plot by position of mean function or surface fitness scores. Fitness scores below wildtype (WT), same as WT, and above WT fitness are represented as a gradient from dark blue over white to dark red for surface fitness and green over white to magenta for function fitness, respectively. Kir2.1 secondary structural elements are shown as gray boxes. In the top row, known pathogenic variants in Kir2.1 (KCNJ2) are mapped as circles filled with color indicating predicted pathogenic mechanism determined in this study. In the row below, known pathogenic mutations in other Kir (KCNJx) are mapped as gray circles. In the third row, reported variant of unknown significance (VUS) are mapped as circles filled with color indicating predicted pathogenic mechanism determined in this study. Mutation hotspots are indicated by gray outlines. (D, E) Distribution of surface fitness (D) and function fitness (E) by ClinVar pathogenicity assignment (represented as line color; ‘Conflicting’ means some studies interpreted a variant as benign, others as pathogenic). Vertical dashed lines represent 10th and 90th percentile boundaries of expert reviewed benign variants (blue) and pathogenic variants (blue), respectively. Variants with score below the 10th percentiles of known benign variant bound are predicted to be pathogenic (shaded red zone), while variants above the 90th percentile of known pathogenic variants are predicted to be benign (shaded blue zone).

The correspondence between variant listing and inward rectifier function is also apparent when we annotate mean surface trafficking and function scores with pathogenic variants (Figure 7C). Across the board, hotspots enriched for pathogenic disease-associated mutation have low function scores, whereas their surface expression scores are more varied. Variants of uncertain significance followed this trend. In addition to regions involved in PIP2 binding (αA/B slide helix, lower part of M1, αF), and propagating conformational changes to the G-loop gate (βB1, βCD, βHI loop), pore helix and selectivity filters are also hotspots for pathogenic variants.

We further divided fitness scores by assigned clinical significance to estimate the fitness score bounds of expert-reviewed benign and pathogenic variants (Figure 7D and E). For surface fitness, we find that >90% of known benign variants have a fitness score >–0.3. With this lower bound assumption for benign variants, we predict that any ‘variant of unknown significance’ (VUS) scoring lower will be pathogenic with a loss of surface expression phenotype. Conversely, the upper surface fitness score bound (10th percentile) for expert-reviewed benign variants was 0.92, and VUS with greater surface scores are predicted to have a gain of surface expression pathogenic phenotype. We applied the same reasoning to function fitness scores. Using this estimate, we can assign a predicted pathogenic phenotype (loss/gain of surface expression/function) to 75 of 89 (76%) Kir2.1 VUS and variants with conflicting interpretation; several have multiple predicted phenotypes (Supplementary file 3). For example, the M1 mutation R80C or the G-loop mutation M307I, which are associated with short QT syndrome 3 and long QT syndrome, respectively, have both predicted loss of function and loss of surface expression phenotypes. In another example, C76C, located in the slide helix, has a gain of surface expression but loss of function predicted phenotype. The primary sequence location of these predicted pathogenic variants is shown in Figure 7C. Many predicted loss of function variants localize to the slide and tether helices, G-loop gate, and selectivity filter/pore helix. Gain of surface expression variants map to the βBC and βK loop near the interface to βDE of a neighboring subunit. The putative biogenic folding unit (top of M1, M2, and pore helix) contains several predicted loss of surface expression variants. Interestingly, no expert-reviewed pathogenic variants map to this region (apart from selectivity filter G152 and Y153), which suggests that DMS-based prediction can identify new mutation hotspots. We can also use estimated bounds of benign variants to predict the mechanism of 26 of the 53 expert-reviewed pathogenic or conflicting variants that cause Kir2.1 disorders (Supplementary file 4). For six of these mutations, a combination of both loss-of-function plus loss of surface expression is predicted.

When we compare the number of predicted phenotypes, we find that the odds are much higher to predict a loss of surface expression vs. loss of function in experimentally measured variants compared to those reported in ClinVar (one-sided Fisher’s exact test, p-value 0.0001145, odds ratio 4.6, Figure 7—figure supplement 1). This again suggests that loss of surface expression variants are underrepresented in ClinVar.

Taken together, we find that variants with deleterious surface score are underreported in available databases, which we propose is due to the essential nature of Kir2.1 in human physiology. Using bounds of surface and function scores for expert-assigned variants, we can predict VUS effects. We can also predict the mechanism of action through which known pathogenic mutations cause Kir2.1 disorders. This suggests that DMS-derived surface trafficking and function scores can be useful to predict pathogenicity and underlying mechanism of inherited and de novo mutations. Because of the conserved architecture of the inward rectifier family, assignment of mutation hotspots may extend to the entire inward rectifier family. Further corresponding DMS studies in other inward rectifiers are required to separate generalizable themes from idiosyncrasies. For example, Kir6.2 requires binding to the ABC transporter SUR1 to be expressed to the cell surface and function. This means that assay-reported phenotypes for a subset of variation at the Kir6.2/SUR1 interface (involving αA/B slide helix, M1, βLM sheets) have greater bearing in Kir6.2-linked diseases (e.g., diabetes mellitus). The methods described here are a blueprint for these studies.

Discussion

DMS quantitatively links protein phenotypes to the genetic variation within the protein’s coding sequence. This style of systematic amino acid substitution can reveal protein function (Araya et al., 2012; Fowler et al., 2010; Starr et al., 2020), determine protein structure (Schmiedel and Lehner, 2019), and explain protein behavior in healthy and diseased cellular contexts (Matreyek et al., 2021). The phenotypic outcome of genetic variation can be multifaceted. Folding, membrane trafficking, and functional modification of a protein can occur through multiple mechanisms such as protein–protein interactions (e.g., chaperones) or ligand binding (e.g., allosteric modulators). To explore the numerous impacts of mutations, we measure two phenotypes of Kir2.1 variation: surface trafficking and K+ conduction. Both assays are based on cell sorting mediated by fluorescent signals (antibody binding to an extracellular loop and a voltage sensor dye). By measuring how variants are enriched or depleted, we assign quantitative fitness scores.

Many other assay types are compatible with this approach, providing opportunities for even richer phenotypic description of ion channel variation. For example, a recently described approach, HiLITR (Coukos et al., 2021), could provide more granular resolution about trafficking motifs and Kir2.1 localization in living cells. Spontaneously-spiking HEK cells (Hochbaum et al., 2014; Park et al., 2013), which co-express Kir2.1, a voltage-dependent Na+ channel (Nav1.7), the channelrhodopsin CheRiff, and genetically encoded voltage indicator QuasAr2, could be adapted to evaluate how ion channel expression levels and gating properties impact excitability (O’Leary et al., 2014). When assays are conducted in combination with small molecule screening, they may aid in the discovery of novel allosteric modulators, state-dependent blockers, and molecular chaperones to precisely treat channelopathies based on genotype. Integration of disparate assays into a common framework of protein variation and its role in structure and function will be challenging, but efforts to standardize reporting and unified statistical frameworks for interpretation (e.g., Enrich2 [Rubin et al., 2017], the Atlas of Variant Effects [Fowler et al., 2021]) are on the horizon.

While our assays were done in an immortalized nonpolarized HEK293 cell line, which is a caveat in trafficking and function fitness, many of our findings with respect to mutational sensitivity of trafficking motifs, core structural elements, and regions of Kir2.1 involved in gating align with existing knowledge and biophysical intuition. Unlike previous more limited screens and intuition, our data is quantitative and enables data-driven approaches to construct global biophysical models of Kir2.1 structure and function in this specific cellular context. For example, the Golgi export signal comprised of two patches at the N-terminus and CTD was first described as a minimal set of hydrophobic residues along a CTD cleft and juxtaposed basic residues (Ma et al., 2011) and later expanded to adjacent sites (Li et al., 2016). Our comprehensive screen shows that these trafficking motifs may indirectly probe the correct folding of the entire β-sheet core of the CTD. This motif appears integral to allowing Kir2.1 to pass intracellular ‘quality control’ after having achieved proper conformation of the entire CTD and assembly into tetramers.

We observed contiguous regions with neutral surface expression fitness involved in gating transitions while regions involved in fold stability were highly sensitive to mutations. Looking at the same regions in functional data, we find the inverse. This suggests two things. First, comprehensive assessment of Kir2.1’s phenotypes after perturbation (i.e., mutation) is a high-throughput method to annotate protein sequences into classes linked to specific functions (e.g., putative trafficking signals, folding units, etc.). Conceptually this is similar to other high-throughput biochemical approaches that probe sequence–function relationships, such as circular permutation profiling (Atkinson et al., 2018) or high-throughput enzyme variant kinetics measurements (Markin et al., 2021). Second, combining multiple assayed parameters is key to discover general organizational principles in Kir2.1, specifically two distinct structural regions with distinct roles in providing fold stability or dynamics required for gating transitions. Expanding to more measured phenotypes, in different cellular contexts, and integrating datasets may be the blueprint for ‘sequencing-based’ biophysics that probes protein function, folding, and dynamics through steady-state biochemical experiments.

Our DMS study provides additional context for the mechanistic basis of structure/function relation (e.g., reaffirming the βCD loop’s role in propagating PIP2 binding at the TM/CTD interface to the G-loop gate). Our data is consistent with a biogenic folding unit that represents an early quality control step of reentrant pore architecture and correct topology, likely while the monomer is in the translocon. A similar biogenic unit was described in voltage-dependent K+ channels, where it is stabilized by an extensive network of interactions (Delaney et al., 2014). Furthermore, many ion channels have structurally homologous reentrant pore loop architectures and earlier studies suggested that the presence of an ‘aromatic cuff’ is a general feature of their biogenesis (Delaney et al., 2014; Doyle et al., 1998). With further assay development to separate variant effect on folding vs. export and trafficking between different cell organelles, DMS studies may provide a path to probe the energetics and biophysics of stabilizing interactions and to test the hypothesis that hydrophobicity is a general stabilizing factor across reentrant pore loop architectures.

DMS may also shed light on how subunit interactions determine inward rectifier gating properties. We find that mutations near the interface of subunits strongly increase Kir2.1 function. This is consistent with subunits’ roles in setting channel gating properties. Single-channel patch electrophysiology of Kir2.1/2.2 heterotetramers showed that addition of a Kir2.2 monomer increases single-channel conductance and decreases the open dwell time (τopen) (Panama et al., 2010). Perhaps this feature contributes to the observed differences in gating between inward rectifier homo- and heterotetramer, which differ in number and nature of subunit interactions.

Our finding that variants with deleterious surface score are underreported in available database supports the emerging theme that many disease-causing variants are linked to trafficking defects (O’Donnell et al., 2017; Peters et al., 2003; Fallen et al., 2009; Zangerl-Plessl et al., 2019; Li et al., 2016; Ma et al., 2001; Ma et al., 2002; Ma et al., 2011; Stockklausner et al., 2001; Zerangue et al., 1999; Lin et al., 2006). Other large-scale mutational analysis, as undertaken for the voltage-dependent K+ channel Kv11.1 (Anderson et al., 2014; Kozek et al., 2020), has similarly shown that 88% of long QT-linked variants have trafficking-deficient mechanisms. They also demonstrate that data-driven approaches outperform smaller, more limited studies that predicted normal trafficking for most mutants (Harley et al., 2012).

Underrepresentation in variant databases has implications for clinical practice since standards for the interpretation of sequence variants are heavily focused on null variants (nonsense, frameshift), population frequency (frequent variants are likely benign), predicted functional effect (is the variant in an important domain), and case evidence (Richards et al., 2015). In genes under strong purifying selection, such as KCNJ2/Kir2.1, missense mutations cause severe developmental defects (craniofacial structures, limb development; Zaritsky et al., 2001; Belus et al., 2018) that likely contribute to or cause spontaneous abortions and pregnancy loss. These variants therefore never enter population or clinic-associated variant databases. DMS studies are not limited by organismal viability to probe the structural and functional consequences of genetic variation. DMS could be useful to identify drivers of spontaneous abortions and the underlying mechanisms of recurrent pregnancy loss.

By comparing our experimentally determined fitness scores to expert-reviewed assignment of variants effects, which are often based on smaller-scale direct biochemical studies, we can estimate the bounds of fitness scores for benign and pathogenic variants. This allows us to make predictions about pathogenicity and mechanisms of action for VUS and known pathogenic variants, closing the loop between high-throughput assays, biophysical mechanisms underlying fitness scores, and clinical interpretation of human variation in ion channel genes.

Materials and methods

Kir2.1 DMS library generation

Request a detailed protocol

Into mouse Kir2.1 (UniProt P35561), we introduced a FLAG-tag into an extracellular loop (at position T115) and added a downstream expression marker miRFP670 co-expressed via a P2A sequence. For Kir2.1 residues 2–391, each wildtype amino acid was mutated to all other 19 amino acids weighted by their codon usage frequency in humans. In addition to these missense mutations, mutations synonymous to wildtype were included for 20 positions as a benchmark. We included missense mutation into the FLAG as a negative control. We generated this mouse Kir2.1 DMS library using the SPINE (Coyote-Maestas et al., 2020; Nedrud et al., 2021), which we briefly summarize as follows: SPINE-based libraries employ synthesized pools of DNA oligos with mutations at each position of a gene. Due to high error rates in oligo synthesis, the current maximum length for oligos from Agilent is 230 base pairs. As most genes are longer than 230 base pairs, we break up our gene (eight sections in the case of Kir2.1) and replace a subsection of the gene with a pool of mutated oligos. The oligos are designed with unique barcodes for amplifying out a specific subpools library, Golden Gate-compatible BsmBI cut sites, and the mutation within a subsection of Kir2.1. OLS oligos were designed using the SPINE scripts on GitHub (https://github.com/schmidt-lab/spine, Coyote-Maestas and Nedrud, 2022). These scripts design oligo libraries, primers for amplifying oligo-sublibraries, and inverse PCR primers for adding compatible cut sites to the Kir2.1 plasmid.

All backbones were amplified using a 25-cycle PCR with GXL polymerase and 1 ng of backbone DNA as template. The PCR product was then gel-purified. All oligo libraries were amplified using 25-cycle PCR with GXL polymerase and with 1 μl of the OLS library (resuspended in 1 ml TE) as template. To assemble the backbone and library DNA, Golden Gate reactions were set up in 20 μl containing 100 ng of amplified backbone DNA, 20 ng of amplified oligo DNA, 0.2 μl BsmBI -HFv2 (New England Biolabs), 0.4 μl T4 DNA ligase (New England Biolabs), 2 μl T4 DNA ligase buffer, and 2 μl 10 mg/ml BSA. These reactions were put in a thermocycler overnight using the following program: (1) 5 min at 42°C, (2) 10 min at 16°C, (3) repeat 40 times, (3) 42°C for 20 min, and (4) 80°C for 10 min. This reaction was cleaned using a Zymo Research Clean and Concentrate 5 kit and eluted in 6 μl of elution buffer. The entirety of this reaction was transformed in E. cloni 10G electrocompetent cells (Lucigen) according to the manufacturer’s instructions. Cells were grown overnight with shaking at 30°C to avoid overgrowth in 30 ml of LB with 40 μg/ml kanamycin and library DNA was isolated using Zymo Zyppy miniprep kits. A small subset of transformed cells was plated with varying dilutions to assess transformation efficiency and validate successful mutations. All libraries at this step yielded >300,000 colonies implying a 100× coverage (assuming 1/3 of the variants were perfect based on our previous analysis of Agilent OLS-based libraries). Each sublibrary was combined at an equimolar ratio to make a complete library with all intended mutations included.

Stable cell line generation

Request a detailed protocol

To generate cell lines, we used a rapid single-copy mammalian cell line generation pipeline (Matreyek et al., 2020). Briefly, mutational libraries are cloned into a staging plasmid with BxBI-compatible attB recombination sites using BsmBI Golden Gate cloning. We amplify the staging plasmid backbone using inverse PCR and the library of interest with primers that add complementary BsmBI cut sites. Golden Gate cloning and subsequent transformation was conducted with BsmbI (NEB), T4 Ligase (NEB) following the manufacturer’s instructions using the same protocol as previously described for library generation. Completed library landing pad constructs are co-transfected (1:1) with a BxBI expression construct (pCAG-NLS-Bxb1) into (TetBxB1BFP-iCasp-Blast Clone 12 HEK293T cells) using Turbofect according to the manufacturer’s instructions in six wells of a six-well dish. This cell line has a genetically integrated tetracycline induction cassette, followed by a BxBI recombination site, and split rapalog-inducible dimerizable Casp-9. Cells were maintained in D10 (DMEM, 10% fetal bovine serum [FBS], 1% sodium pyruvate, and 1% penicillin/streptomycin). 2 days after transfection, doxycycline (2 μg/ml, Sigma-Aldrich) was added to induce expression of our genes of interest (successful recombination) or the iCasp-9 selection system (no recombination). Successful recombination shifts the iCasp-9 out of frame, thus only cells that have undergone recombination survive, while those that have not will die from iCasp-9-induced apoptosis. 1 day after doxycycline induction, AP1903 (10 nM, MedChemExpress) was added to cause dimerization of Casp9 and selectively kill cells without successful recombination. 1 day after AP1903-Casp9 selection, media was changed back to D10 + doxycycline (2 μg/ml, Sigma-Aldrich) for recovery. 2 days after cells have recovered from all wells, they were mixed and are reseeded to enable normal cell growth. The library cell line therefore represents independent recombination reactions across six wells. Once cells reach confluency, library cells are frozen in 50% FBS and 10% DMSO stocks in aliquots for assays. It was also sequenced to establish a baseline for represented Kir2.1 variants (Figure 1—figure supplement 2). Replicate for each assay (surface trafficking and function, see below) represents different aliquots of the same library stock.

Surface expression cell sorting

Request a detailed protocol

Thawed stocks of library cell lines were seeded into a 10 cm dish and media were swapped the following day to D10. Cells were grown and split before confluency to maintain cell health. Media were swapped to D10 + doxycycline (2 μg/ml, Sigma-Aldrich) 2 days prior to the experiment. Cells were detached with 1 ml Accutase (Sigma-Aldrich), spun down and washed three times with FACS buffer (2% FBS, 0.1% NaN3, 1X PBS), incubated for 1 hr rocking at 4°C with a BV421 anti-flag antibody (BD Biosciences), washed twice with FACS buffers, filtered with cell strainer 5 ml tubes (Falcon), covered with aluminum foil, and kept on ice for transfer to the flow cytometry core. Before sorting, 5% of cells were withdrawn for processing and sequencing as a baseline control.

Cells were sorted on a BD FACSAria II P69500132 cell sorter. miRFP670 fluorescence was excited with a 640 nm laser and recorded with a 670/30 nm bandpass filter and 505 nm long-pass filter. BV421 fluorescence was excited using a 405 nm laser. Cells were gated on forward scattering area and side scattering area to find whole cells, forward scattering width, and height to separate single cells, miRFP670 for cells that expressed variants without errors (our library generation results in single base pair deletions that will not have miRFP670 expression because deletions will shift the fluorescent protein out of frame Coyote-Maestas et al., 2020), and label for surface-expressed cells. The surface expression label gate boundaries were determined based on unlabeled cells from the same population because controls tend to have nonrepresentative distributions. An example of our gating strategy is depicted in Figure 1—figure supplement 3.

Cells were sorted based on surface expression into four populations (miRFPhigh/BV421none, miRFPhigh/BV421low, miRFPhigh/BV421medium, miRFPhigh/BV421high). The surface expression experiment was done in duplicate (starting from different aliquots of the library cell line) on separate days for two entirely independent replicates. We collected ~5.1 million cells across both replicates and populations to ensure a greater than 100× coverage for each Kir2.1 variant (Figure 1—figure supplement 3H).

Resting membrane potential cell sorting

Request a detailed protocol

Concurrently with preparing samples for surface labeling, another sample of the same cells was prepared for sorting based on RMP. These cells were initially washed with FACS buffer and concentrated in this buffer for cell health prior to sorting. 30 min prior to sorting, cells were resuspended in Tyrode (125 mM NaCl, 2 mM KCl, 3 mM CaCl2, 1 mM MgCl2, 10 mM HEPES, 30 mM glucose, pH 7.3) that contained FLIPR membrane potential dye with a Blue quencher.

Cells were also sorted based on RMP on a BD FACSAria II P69500132 cell sorter. miRFP670 fluorescence was excited with a 640 nm laser and recorded with a 670/30 nm bandpass filter and 505 nm long-pass filter. FLIPR fluorescence was excited using a 488 nm laser and recorded on a 525/50 nm bandpass filter. As before, the same general sorting scheme was used to identify whole single cells based on forward and side scatter and enriched for good quality library members based on miRFP670 fluorescence. An example of our gating strategy is depicted in Figure 1—figure supplement 4. Cells were sorted based on RMP into three populations (miRFPhigh/FLIPRLow, miRFPhigh/FLIPRmedium, miRFPhigh/FLIPRhigh). The RMP experiment was done in duplicate (starting from different aliquots of the library cell line) on separate days for two independent replicates. We collected ~3.2 million cells across both replicates and sort populations to ensure a greater than 100× coverage for each Kir2.1 variant (Figure 1—figure supplement 4H).

Sequencing

Request a detailed protocol

For both biological replicates, DNA from pre-sort control and sorted cells was extracted with Microprep DNA kits (Zymo Research) and triple-eluted with water. The elute was diluted such that no more than 1.5 µg of DNA was used per PCR reaction and amplified for 20 cycles of PCR using Primestar GXL (Takara Bio), run on a 1% agarose gel, and gel-purified. Primers that bind outside the recombination site ensure leftover plasmid DNA from the original cell line construction step is not amplified. Purified DNA was quantified using Picogreen DNA quantification. Equal amounts (by mass) of each sample were pooled by cell sorting category . Pooled amplicons were prepared for sequencing using the Nextera XT sample preparation workflow and sequenced using Illumina NovaSeq in 2 × 150 bp mode. Source sequencing data is available in the NCBI Sequence Raw Archive (https://www.ncbi.nlm.nih.gov/sra) under accession code PRJNA791691. Read count statistics are listed in Supplementary file 5.

Alignment and calculating variant frequency

Request a detailed protocol

Variant frequencies for every single mutation were determined as follows: 150 bp paired-end reads were trimmed to remove standard Illumina adapters (bbmap/resources) using BBDuk with ‘mink’ at eight bases. Overlapping reads were corrected without merging using BBMerge with the ‘ecco’ and ‘mix’ setting. Reads were aligned to the Kir2.1 sequence using BBMap using ‘maxindel’ at 500 and ‘local’ alignment. The custom Python script is available at https://github.com/schmidt-lab/KirDMS(schmidt-lab, 2022). The resulting SAM file was analyzed for mutation count at each residue position. Since we programmed specific mutations for each codon into the OLS oligo pool that we used for SPINE-mediated variant library synthesis, we could distinguish between expected codons and unexpected codons (predominantly wildtype codons but also sequencing errors). Mutations outside of spine-generated fragments are rare, with 1 bp deletions being the most common (Coyote-Maestas et al., 2020). Only programmed (i.e., expected mutations) were counted; the ratio of unexpected/expected mutations for each sample is listed in Supplementary file 5. This process of generating a variant list and their counts was repeated for every sample sequenced. For surface fitness assay, this resulted in variant frequencies for 6975–7305 mutations across two replicates of four samples, named from lowest to highest surface expression: ‘negative,’ ‘low,’ ‘up,’ and ‘high.’ For the function fitness assay, this resulted in variant frequencies for between 7168 and 7320 mutations across two replicates of three samples, named from lowest to highest conductance: ‘negative,’ ‘low,’ and ‘up.’ Coverage of mapped reads is listed in Figure 2—figure supplement 4 and Figure 4—figure supplement 3.

To calculate enrichment of mutants within the surface expression and function assays, we used Enrich2 (Rubin et al., 2017) with count data for each mutation as input. Enrich2 calculates fitness scores and their errors based on a fitted weighted-least-squares regression across a series of experiments (e.g., different gating conditions that represent different selection pressures). Each variant’s score is defined as the log10-fold transform of slope of the regression line weighted to wildtype frequencies, and the standard deviations within the model account for measurement errors within the sample. This means that positive scores represent enrichment greater than wildtype, while negative scores represent depletion relative to wildtype. Specifically, positive surface expression scores mean higher than wildtype expression, while negative surface scores mean lower than wildtype expression. In function assays, positive scores mean more hyperpolarization relative to wildtype, while negative scores mean less hyperpolarization. Scores were output with a standard error in a .csv file. Positional surface expression and function fitness scores were calculated by taking the mean fitness scores for that experiment across all measured mutations at that position. We were able to collect near-complete datasets for both measured phenotypes. Variant dropout appears to be stochastic; only 20% of variants missing in both surface and function assay were also missing in the stable library cell line baseline (Figure 1—figure supplement 2). Some variants were only missing in the baseline library but were detected in phenotyping assay. Taken together, this suggests that dropout occurs at the cell sorting/NGS stages of our workflow. Additional replication, extending sort, and sequencing depth all could help with removing the remaining analysis gaps.

Inward rectifier phylogenetic alignment and ClinVar mutation assignment

Request a detailed protocol

We downloaded all human inward rectifiers (KCNJ1 [Kir1.1/ROMK1; UniProt P48048], KCNJ10 [Kir1.2/Kir4.1; UniProt P78508], KCNJ15 [Kir1.3/Kir4.1; UniProt Q99712], KCNJ2 [Kir2.1; UniProt P63252], KCNJ12 [Kir2.2; UniProt Q14500], KCNJ4 [Kir2.3; UniProt P48050], KCNJ14 [Kir2.4; UniProt Q9UNX9], KCNJ18 [Kir2.6; UniProt B7U540], KCNJ3 [Kir3.1/GIRK1; UniProt P48549], KCNJ6 [Kir3.2/GIRK2; UniProt P48051], KCNJ9 [Kir3.3/GIRK3; UniProt Q92806], KCNJ5 [Kir3.4/GIRK4; UniProt P48544], KCNJ16 [Kir5.1; UniProt Q9NPI9], KCNJ8 [Kir6.1; UniProt Q15842], KCNJ11 [Kir6.2; UniProt Q14654], KCNJ13 [Kir7.1; UniProt O60928]) and aligned these together using MegaX (Kumar et al., 2018). Based on this alignment, we generated a master list of Inward Rectifier numberings to translate residue numbering between Kir homologues. We assigned mutations observed in ClinVar and gnomAD and their pathogenic classification (ClinVar only) to this master alignment (Supplementary file 1).

Data availability

Sequencing data generated in this study have been deposited in the Sequence Raw Archive (https://www.ncbi.nlm.nih.gov/sra) under accession code PRJNA791691. All remaining source data (including processed data and R scripts to reproduce manuscript figures) are included as supplementary information (Source data 1) and are also available at github.com/schmidt-lab/KirDMS (copy archived at swh:1:rev:76cab17b04e2041b3818bcadfc225cd58c9f231f).

The following data sets were generated
    1. Coyote-Maestas W
    2. Nedrud D
    3. He Y
    4. Schmidt D
    (2022) NCBI BioProject
    ID PRJNA791691. Multiparametric Deep Mutational Scanning in the Inward Rectifier Kir2.1.

References

    1. Torrecilla M
    2. Marker CL
    3. Cintora SC
    4. Stoffel M
    5. Williams JT
    6. Wickman K
    (2002)
    G-protein-gated potassium channels containing Kir3.2 and Kir3.3 subunits mediate the acute inhibitory effects of opioids on locus ceruleus neurons
    The Journal of Neuroscience 22:4328–4334.

Decision letter

  1. Andrés Jara-Oseguera
    Reviewing Editor; The University of Texas at Austin, United States
  2. Kenton J Swartz
    Senior Editor; National Institute of Neurological Disorders and Stroke, National Institutes of Health, United States
  3. Colin G Nichols
    Reviewer; Washington University School of Medicine, United States
  4. Justin Gregory English
    Reviewer; The University of Utah, United States

Our editorial process produces two outputs: i) public reviews designed to be posted alongside the preprint for the benefit of readers; ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.

Decision letter after peer review:

Thank you for submitting your article "Determinants of trafficking, conduction, and disease within a K+ channel revealed through multiparametric deep mutational scanning" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, one of whom is a member of our Board of Reviewing Editors, and the evaluation has been overseen by Kenton Swartz as the Senior Editor. The following individuals involved in review of your submission have agreed to reveal their identity: Colin G. Nichols (Reviewer #2); Justin Gregory English (Reviewer #3).

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission. All three reviewers agree that the manuscript will be of great value to the field, and that the data and analysis presented are of high quality. However, the reviewers identified several points that the authors need to address before publication, which are listed below.

Essential revisions:

1) For Figure S1, it would be informative to also map onto this figure all the currently observed missense mutations in the GnomAD database. Are mutations currently circulating in the population in permissive regions observed later in the study? Are they clustered to particular areas? Looks like majority are in N-term and C-term.

2) For Figure 1C, there does not appear to be a data set sequencing the HEK293T landing pad cell line to demonstrate the extent of library coverage. This is generally important, both for the reader to assess the quality of the initial library as well as to understand the limitations of this method in providing complete coverage maps. I am curious if the cell line coverage landscape would mirror the deficits observed in Supplementary Figure 6 (magneta boxes). While only 7% of the potential data is missing, these gaps appear to be focused in very specific amino acid and receptor domain locations. This data should be made into a heatmap figure (a la figure 2) to demonstrate if the library was uniform or biased toward specific mutational areas. It would be informative to identify how many mutations are under-represented in the library and if this pairs with specific functional deficits. i.e. – are we seeing magenta in Figure 6 because there wasn't sufficient coverage in the initial library to detect them, or are they present, but not measurable?

3) The authors should provide FACS histograms as in Suppl. Figures 2 and 3 for cells expressing only WT channels and for cells without channels – these could be included as an overlay so we can gauge the fluorescence distribution of the negative control cells wherein no channel activity or surface expression is present. It would strengthen the manuscript if other non-functional or surface expression-deficient variants were also included as benchmark, but we leave this up to the authors to decide.

4) For Figure S2 E and F the X-axis is the excitation laser rather than the emission spectra measured. As this signal is referred to as the emission miRPF 67o throughout the paper it may be helpful to change this label to the actual measured unit for the axis, as was done for the Y-axis (BV421). For Figure S3E-F, note in the figure the axis is labeled 640, which is the excitation laser. For consistency it may be best to label this axis as miRFP 670 emission rather than excitation.

5) Figure 2D, Figure 3B, and Figure 5D are too crowded and difficult to decipher. Transparency in some sections or making subunit cartoon representations all the same color could also help. Consider truncating or hiding some elements in the figure to show only regions of interest.

6) For Figure S2 G, these appear to be a sub-sampled statistic of 1% of the total collected population. In the paper (line 634) it was noted that 2.1 million cells were collected, but only 1,000 of each population appear in this statistic table. Can you please also include the final statistics table to evaluate your final sort values. Can you also please note in the legend whether this data is a sub-sample of the published data, a benchmark run, or otherwise. If this gating format is not directly subsampled from the true data collection runs, please re-render these figures using the collected data.

7) Median fitness error to synonymous mutations is reported as "low" for the trafficking defect assay at 7.5%. It is again reported as low for the functional assays at 15.3%. It may be best instead to report the median and deviation of the synonymous and missense datasets and their relative differences at each tail end. Low is a subjective statement here and isn't supported by a positive control measure such as a large N of unmutated channels, run in the same sample, showing similar deviations in these assays as the synonymous mutants. If this data was run, it would be best to include it in the graph for comparison. While not essential, in the future I would encourage the authors to include unmutated controls into their library to assess the impact of synonymous substitutions, and to potentially make single synonymous substitutions at every amino acid position as a reference comparison to the mutants.

8) The authors should include in their discussion some consideration of the "no data" category which generated a consistent number of analysis gaps within certain domains of the channel. How to avoid these in the future? Were these cloning issues? Library construction difficulties with SPINE? Poor coverage of the original cell pool? Could this have been avoided by making multiple cell libraries, sequencing them, and pooling those with compensating coverage?

9) The authors should describe their NGS strategy with much more detail, including more information about the amplicons that they generated for sequencing and how they discriminated between substitutions inside or outside the SPINE-generated variant fragments.

10) The authors should indicate whether the replicates reflect different libraries, different recombined cell libraries, or different stocks of the same library.

11) The "no data" color scheme from Figure 6S should be adopted for the main paper figures. The use of light yellow for no data and white for "0 fold change" makes differentiating these two dataset types difficult visually in these figures. It is important for the reader to be able to easily observe where mutational coverage was lacking in this study.

12) The authors should tone down several observations that are based solely on the DMS screen results. Some of these examples include the increased surface expression of N386E, N386D, T150, Y153, the positive effect in function of N224K and of the residues surrounding the hinge at G176, and importantly, and the predicted phenotypes for the variants of unknown significance. It would further strengthen the manuscript if the authors provided additional experimental validation for a few key examples, either using electrophysiology or the same fluorescence-based assays performed on cell populations expressing single variants, but this is not required.

13) In line 215, the sentence reads: "selectivity filter with glycine significantly increase surface expression." This is difficult to see in Figure 2B, since most mutations in that section of the sequence decrease surface expression. Further, since it is stated to be significant, we believe the authors intended to imply statistical significance? The reader would benefit from a different panel in the figure to highlight the residues within the section of the protein being referenced, as well as information on the statistical test that was run and the p-value.

14) The authors should tone down remarks that involve a distinction between processes that can't be distinguished by their assays, such as co-translational folding and export between different sub-cellular organelles. It needs to be pointed out that the assay is done in an immortalized non-polarized human embryonic kidney cell line. A major caveat to interpretation of trafficking susceptibility is that trafficking constraints may be very different in native cells, and perhaps especially in polarized cells.

15) It would strengthen the manuscript if the authors systematically identified residues engaged in inter vs intra-subunit interactions, and correlated this with the fitness scores for surface expression and function. Otherwise, the observations regarding the role of residues engaging in intra- vs inter-subunit interactions for determining stability and folding or function should be toned down.

16) In line 238, there is a reference to a Fisher's Exact Test, but it's not entirely clear what this test is comparing. We would like to request some clarification of this section, since the current wording suggests that residues involved in gating were analyzed somehow but it's not clear what parameters were used.

17) The authors should systematically compare their per-amino acid variant results with known mutations from the literature, to provide a more critical assessment of their per-variant results. Discrepancies should also be discussed. In Figures 2B and 4C it would be beneficial to place the citation number of the paper corresponding to a previously published defect that matched your observations and border the matching mutation in the heatmap with a unique color identifier so that readers can assess the relative benchmarks and past known mutations and their effects in your assay format more easily. Alternatively, authors could include additional heatmap panels with reduced and simplified content that facilitate comparison between the data in the manuscript and the literature.

18) L189: It is stated that there is agreement between earlier trafficking studies but this statement needs to be validated by a (at least semi-) quantitative comparison, showing a score from the present assay and a score from previous studies, or a Table listing present score and previous results with references, for such residues. This is important given that the authors then talk about establishing a 'global view' of sequence/trafficking relationships. Are there any previous experimental results that do not match? i.e. previous studies showing opposite effects of a given mutation on trafficking? Or on function? While Table S2 compares the present results to clinvar databases, it is unclear whether there are any experimental comparisons.

19) L196: Why is it surprising that pore-lining mutations are tolerant to mutation?

20) L263: What does this sentence mean? The idea that partially functional Kir2.1 will 'eventually' hyperpolarize the cell seems logically wrong. The degree of polarization will depend on the balance of K conductance and other conductances, time dependence is a non sequitur.

21) Paragraph starting l284: This misses any references for the stated mechanistic knowledge. These should be given. The idea that secondary structural elements that couple PIP2 binding to gating can be 'traced' from mutation-sensitive positions is interesting, but far from necessarily correct. This attempt to infer structural connectivity should be made cautiously.

22) Refs 73-75 are 20 years old, and the idea that Kir channels gate by kinking at G176 has been significantly passed over. Gly can be substituted by other residues (DOI: 10.1007/s00249-007-0206-7), and many MD simulations (e.g. DOI: 10.1085/jgp.201912422) show subtle movements along M2 rather than a specific kinking, during gating, as well as highlighting the effects of charge mutations in inducing channel opening. This more nuanced view seems much more consistent with the present data – it is not correct that all mutations reduce function in the present assay. It seems there is improved function for A,T,M,E ,K substitutions.

23) The analysis in Figure 5C conveys no information about the sign and magnitude of the two fitness scores that are being compared; there are multiple possible combinations of surface expression and fitness scores, each implying distinct perturbation mechanisms, that would yield the same calculated difference score. Instead of showing the differences between scores, the authors could show both surface expression and function scores for each position, with a line between the two scores. Different colors could be used for the lines separating the scores, depending on the magnitude of the difference and the sign combination of the two scores. The authors should also discuss if there are positions with a solidly negative surface expression score and a neutral or positive function score – to what extent do these cases raise concern about the accuracy of the function scores? How more reliable are the per-position function scores relative to the per-variant scores, both in regards to what is described above and in general? Granted it is possible that mutations can reduce surface expression without appreciably compromising function, but this should be discussed.

24) Ref 10 is not appropriate for monogenic disease-relevance of Kirs.

25) L47: Neonatal diabetes results from Kir6.2 gain of function. It is confusing to discuss reduced surface expression without noting that this will be a counteracting LOF.

26) On l54 it states that ClinVar lists 163 missense mutations in Kir2.a, but then on l60 it says it reports 144? Something seems wrong.

27) On L66 it is said that there are no large-scale studies of sequence determinants of Kir trafficking and function. The authors should mention Minor et al., (Cell 96:879-91), an early attempt to systematically assess amino acid relevance to structure.

28) L309: What is the important physiological role of GIRK2 here? Presumably you mean Na is an important regulator of GIRK2?

29) In general, it would be best to stick to one nomenclature system, and the system, which names this channel as Kir2.1 (rather than IRK1 or other name) seems best. In this case, GIRK2 should be Kir3.2, ROMK1 should be Kir1.1, etc.

30) Please check the figure numbers and letters that are quoted in the text. We note that lines 279 and 289 reference Figure 4, but it doesn't appear to be referencing the correct panel. For example, line 279, surface and functional data are compared and should reference panels D and E? (Currently references D and F. Similar question on line 289.) Line 631 references the wrong Supplementary Figure. Line 652 also references the wrong Supplementary Figure.

Reviewer #1 (Recommendations for the authors):

We would like to recommend specific content changes in order to address the comments raised above. The below suggestions are in the same order to correspond to each point in the public review.

1) The authors should provide FACS histograms as in Suppl. Figures 2 and 3 for cells expressing only WT channels and for cells without channels. Ideally, other non-functional or surface expression-deficient variants should be included as benchmark.

2) The authors should tone down several observations that are based solely on the DMS screen results, and provide additional experimental validation for a few key examples, either using electrophysiology or the same fluorescence-based assays performed on cell populations expressing single variants. Some of these examples include the increased surface expression of N386E, N386D, T150, Y153, the positive effect in function of N224K and of the residues surrounding the hinge at G176, and importantly, at least some of the predicted phenotypes for the variants of unknown significance.

3) The authors should systematically compare their per-amino acid variant results with known mutations from the literature, to provide a more critical assessment of their per-variant results. Discrepancies should also be discussed.

4) It would strengthen the manuscript if the authors systematically identified residues engaged in inter vs intra-subunit interactions, and correlated this with the fitness scores for surface expression and function. Otherwise, the observations regarding the role of residues engaging in intra- vs inter-subunit interactions for determining stability and folding or function should be toned down.

5) The authors should tone down remarks that involve a distinction between processes that can't be distinguished by their assays, such as co-translational folding and export between different sub-cellular organelles.

6) The authors should describe their NGS strategy with detail, including more information about the amplicons that they generated for sequencing and how they discriminated between substitutions inside or outside the SPINE-generated variant fragments. Also, more information about the library should be provided, such as the error rates and error types found after sequencing.

7) The authors should indicate whether the replicates reflect different libraries, different recombined cell libraries, or different stocks of the same library.

8) The analysis in Figure 5C conveys no information about the sign and magnitude of the two fitness scores that are being compared; there are multiple possible combinations of surface expression and fitness scores, each implying distinct perturbation mechanisms, that would yield the same calculated difference score. Instead of showing the differences between scores, the authors could show both surface expression and function scores for each position, with a line between the two scores. Different colors could be used for the lines separating the scores, depending on the magnitude of the difference and the sign combination of the two scores.

9) Related to point (7), the authors should discuss if there are positions with a solidly negative surface expression score and a neutral or positive function score – to what extent do these cases raise concern about the accuracy of the function scores? How more reliable are the per-position function scores relative to the per-variant scores, both in regards to what is described above and in general? Granted it is possible that mutations can reduce surface expression without appreciably compromising function, but this should be discussed.

In paper organization, we would like to note some 'housekeeping' suggestions for the manuscript and figures to increase clarity.

1) In line 215, the sentence reads: "selectivity filter with glycine significantly increase surface expression." This is difficult to see in Figure 2B, since most mutations in that section of the sequence decrease surface expression. Further, since it is stated to be significant, we believe the authors intended to imply statistical significance? The reader would benefit from a different panel in the figure to highlight the residues within the section of the protein being referenced, as well as information on the statistical test that was run and the p-value.

2) In line 238, there is a reference to a Fisher's Exact Test, but it's not entirely clear what this test is comparing. We would like to request some clarification of this section, since the current wording suggests that residues involved in gating were analyzed somehow but it's not clear what parameters were used.

3) Please check the figure numbers and letters that are quoted in the text. We note that lines 279 and 289 reference Figure 4, but it doesn't appear to be referencing the correct panel. For example, line 279, surface and functional data are compared and should reference panels D and E? (Currently references D and F. Similar question on line 289.) Line 631 references the wrong Supplementary Figure. Line 652 also references the wrong Supplementary Figure.

4) Further on figures, could the authors please consider an alternative presentation or more extensive explanation of Figures 6D and E. We respectfully suggest edits are needed to present that point much more forcefully and help the reader understand the significance of the authors' findings. Panels 6 D and E might benefit from more labels on the graph indicating what the dashed lines are. The figure legend and caption don't fully explain the figure, which makes the interesting conclusions very difficult to interpret. Further, it would be helpful if the caption explained briefly what the orange conflicting and green VUS categories are. Are the colors consistent across panels, from A-E? These data are incredibly relevant to the authors' point and different presentation or perhaps a different figure entirely would assist in making this high-impact finding more accessible.

5) Figure 2D, Figure 3B, and Figure 5D are too crowded and difficult to decipher. Transparency in some sections or making subunit cartoon representations all the same color could also help. Consider truncating or hiding some elements in the figure to show only regions of interest.

6) The authors should provide markings (asterisks, arrows) in the heat maps in Figures 2B and 4C, or additional supplementary figures with reduced and simplified content that facilitate discussing observations for specific positions without requiring readers to refer to the heat maps containing data for all variants.

7) In Figure 6C, we suggest a different color for either the variants without predicted mechanism for Kir2.1 (red) and the gain of function (orange), because it is difficult to visually distinguish between them.

Reviewer #2 (Recommendations for the authors):

This manuscript reports use of a clever approach to generation of a library of single copy mutants in Kir2.1 and then use of FACS sorting of trafficked and functional mutations to analyze the consequences. The authors are to be congratulated on a data set that is very comprehensive and extremely informative, with important controls in place. The authors have made a strong effort to link their work to previous findings, but the text is written in a discursive style, and the grammatical and typographical errors throughout the manuscript make it difficult to read at times. There are a large number of comments below that should be addressed.

1. L189 It is stated that there is agreement between earlier trafficking studies but this statement needs to be validated by a (at least semi-) quantitative comparison, showing a score from the present assay and a score from previous studies, or a Table listing present score and previous results with references, for such residues. This is important given that the authors then talk about establishing a 'global view' of sequence/trafficking relationships. Are there any previous experimental results that do not match? i.e. previous studies showing opposite effects of a given mutation on trafficking? Or on function? While Table S2 compares the present results to clinvar databases, it is unclear whether there are any experimental comparisons.

2. L196 Why is it surprising that pore-lining mutations are tolerant to mutation?

3. L243-246. This is mere conjectural statement seems unnecessary. Suggest deleting.

4. L263 What does this sentence mean? The idea that partially functional Kir2.1 will 'eventually' hyperpolarize the cell seems logically wrong. The degree of polarization will depend on the balance of K conductance and other conductances, time dependence is a non sequitur.

5. In Figure 5 surface expression was subtracted from function fitness, and while the analysis produces interesting clusters, the real significance is unclear. Rather it seems that the 'functional assessment' is a product of expression and activity of individual channels, and so it seems the analysis should somehow seek to 'divide' the raw functional score by the expression level?

6. Paragraph starting l284. This misses any references for the stated mechanistic knowledge. These should be given. The idea that secondary structural elements that couple PIP2 binding to gating can be 'traced' from mutation-sensitive positions is interesting, but far from necessarily correct. This attempt to infer structural connectivity should be made cautiously.

7. l. 30 paragraph. Refs 73-75 are 20 years old, and the idea that Kir channels gate by kinking at G176 has been significantly passed over. Gly can be substituted by other residues (DOI: 10.1007/s00249-007-0206-7), and many MD simulations (e.g. DOI: 10.1085/jgp.201912422) show subtle movements along M2 rather than a specific kinking, during gating, as well as highlighting the effects of charge mutations in inducing channel opening. This more nuanced view seems much more consistent with the present data – it is not correct that all mutations reduce function in the present assay. It seems there is improved function for A,T,M,E ,K substitutions.

8. It needs to be pointed out that the assay is done in an immortalized non-polarized human embryonic kidney cell line. A major caveat to interpretation of trafficking susceptibility is that trafficking constraints may be very different in native cells, and perhaps especially in polarized cells.

9. There are a many odd typos or grammatical idiosyncracies that need cleaning up (missing articles, oddly inserted 'into', 'to', 'they are', 'and but', 'and' instead of 'the', etc , etc, words throughout the text). The use of speech contractions (e.g. 'We've' for 'We have', l108; 'protein don't' for proteins do not', l458) should all be corrected.

10. l38 Ref 10 is not appropriate for monogenic disease-relevance of Kirs.

11. l47. Neonatal diabetes results from Kir6.2 gain of function. It is confusing to discuss reduced surface expression without noting that this will be a counteracting LOF.

12. On l54 it states that ClinVar lists 163 missense mutations in Kir2.a, but then on l60 it says it reports 144? Something seems wrong.

13. On l66 it is said that there are no large-scale studies of sequence determinants of Kir trafficking and function. The authors should mention Minor et al., (Cell 96:879-91), an early attempt to systematically assess amino acid relevance to structure.

14. l309 What is the important physiological role of GIRK2 here? Presumably you mean Na is an important regulator of GIRK2?

15. In general, it would be best to stick to one nomenclature system, and the system, which names this channel as Kir2.1 (rather than IRK1 or other name) seems best. In this case, GIRK2 should be Kir3.2, ROMK1 should be Kir1.1, etc.

16. Paragraph beginning l2438 belongs in discussion.

17. There are weirdly incomplete references e.g, refs 97-99.

Reviewer #3 (Recommendations for the authors):

1. For Figure S1, it would be informative to also map onto this figure all the currently observed missense mutations in the GnomAD database. Are mutations currently circulating in the population in permissive regions observed later in the study? Are they clustered to particular areas? Looks like majority are in N-term and C-term.

2. For Figure 1C, there does not appear to be a data set sequencing the HEK293T landing pad cell line to demonstrate the extent of library coverage. This is generally important, both for the reader to assess the quality of the initial library as well as to understand the limitations of this method in providing complete coverage maps. I am curious if the cell line coverage landscape would mirror the deficits observed in Supplementary Figure 6 (magneta boxes). While only 7% of the potential data is missing, these gaps appear to be focused in very specific amino acid and receptor domain locations. This data should be made into a heatmap figure (a la figure 2) to demonstrate if the library was uniform or biased toward specific mutational areas. It would be informative to identify how many mutations are under-represented in the library and if this pairs with specific functional deficits. i.e. – are we seeing magenta in Figure 6 because there wasn't sufficient coverage in the initial library to detect them, or are they present, but not measurable?

3. In Figure 2b it would be beneficial to place the citation number of the paper corresponding to a previously published defect that matched your observations and border the matching mutation in the heatmap with a unique color identifier so that readers can assess the relative benchmarks and past known mutations and their effects in your assay format more easily.

4. For Figure S2 E and F the X-axis is the excitation laser rather than the emission spectra measured. As this signal is referred to as the emission miRPF 67o throughout the paper it may be helpful to change this label to the actual measured unit for the axis, as was done for the Y-axis (BV421).

5. For Figure S2 G, these appear to be a sub-sampled statistic of 1% of the total collected population. In the paper (line 634) it was noted that 2.1 million cells were collected, but only 1,000 of each population appear in this statistic table. Can you please also include the final statistics table to evaluate your final sort values. Can you also please note in the legend whether this data is a sub-sample of the published data, a benchmark run, or otherwise. If this gating format is not directly subsampled from the true data collection runs, please re-render these figures using the collected data.

6. Median fitness error to synonymous mutations is reported as "low" for the trafficking defect assay at 7.5%. It is again reported as low for the functional assays at 15.3%. It may be best instead to report the median and deviation of the synonymous and missense datasets and their relative differences at each tail end. Low is a subjective statement here and isn't supported by a positive control measure such as a large N of unmutated channels, run in the same sample, showing similar deviations in these assays as the synonymous mutants. If this data was run, it would be best to include it in the graph for comparison. While not essential, in the future I would encourage the authors to include unmutated controls into their library to assess the impact of synonymous substitutions, and to potentially make single synonymous substitutions at every amino acid position as a reference comparison to the mutants.

7. For Figure S3E-F, please show the FLIPR fluorescence of the miRFP 670 negative population as an overlay in F (dotted line of peaks over the red of P4) so we can gauge the fluorescence distribution of the negative control cells wherein no channel activity is present. Is this how "FLIPR-" was determined? Also, note in the figure the axis is labeled 640, which is the excitation laser. For consistency it may be best to label this axis as miRFP 670 emission rather than excitation.

8. The authors should include in their discussion some consideration of the "no data" category which generated a consistent number of analysis gaps within certain domains of the channel. How to avoid these in the future? Were these cloning issues? Library construction difficulties with SPINE? Poor coverage of the original cell pool? Could this have been avoided by making multiple cell libraries, sequencing them, and pooling those with compensating coverage?

9. The "no data" color scheme from Figure 6S should be adopted for the main paper figures. The use of light yellow for no data and white for "0 fold change" makes differentiating these two dataset types difficult visually in these figures. It is important for the reader to be able to easily observe where mutational coverage was lacking in this study.

https://doi.org/10.7554/eLife.76903.sa1

Author response

Essential revisions:

1) For Figure S1, it would be informative to also map onto this figure all the currently observed missense mutations in the GnomAD database. Are mutations currently circulating in the population in permissive regions observed later in the study? Are they clustered to particular areas? Looks like majority are in N-term and C-term.

We have updated reported missense mutations (now Figure 1—figure supplement 1) to include a more recent release of ClinVar (as of 3/25/2022) and added the gnomAD data (v.2.1.1). gnomADonly variants do not appear to cluster in a specific area of Kir2.1.

2) For Figure 1C, there does not appear to be a data set sequencing the HEK293T landing pad cell line to demonstrate the extent of library coverage. This is generally important, both for the reader to assess the quality of the initial library as well as to understand the limitations of this method in providing complete coverage maps. I am curious if the cell line coverage landscape would mirror the deficits observed in Supplementary Figure 6 (magneta boxes). While only 7% of the potential data is missing, these gaps appear to be focused in very specific amino acid and receptor domain locations. This data should be made into a heatmap figure (a la figure 2) to demonstrate if the library was uniform or biased toward specific mutational areas. It would be informative to identify how many mutations are under-represented in the library and if this pairs with specific functional deficits. i.e. – are we seeing magenta in Figure 6 because there wasn't sufficient coverage in the initial library to detect them, or are they present, but not measurable?

We have added these data as new supplementary figures (Figure 1—figure supplement 2 and Figure 2-figure supplement 1) and analysis discussed in the main results section. Some missing data localizes to the “seams” of SPINE-generated fragments from which the Kir2.1 ORF is assembled. This is a known issue with SPINE and may be addressed by using overlapping fragment assembly. Nevertheless, our data comparing missing data in the stable library cell line, surface expression assay, and function assay suggests that variant dropout is stochastic occurring at the cell sorting / NGS stages of our workflow. Additional replication, extending sort and sequencing depth all could help with removing the remaining analysis gaps. We have added this data and discussion.

3) The authors should provide FACS histograms as in Suppl. Figures 2 and 3 for cells expressing only WT channels and for cells without channels – these could be included as an overlay so we can gauge the fluorescence distribution of the negative control cells wherein no channel activity or surface expression is present. It would strengthen the manuscript if other non-functional or surface expression-deficient variants were also included as benchmark, but we leave this up to the authors to decide.

We have added these histograms; both include WT Kir2.1 channels (Figure 1—figure supplement 3 and Figure 1—figure supplement 4). For surface expression assays, we have included a negative control HEK cell line expressing Kv1.3, which lacks the extracellular epitope and therefore will not be labelled by the FLAG antibody. For the function assays, we have included V302M as a negative control, which abolishes gating but does not affect surface expression (Ma […] Welling 2007).

4) For Figure S2 E and F the X-axis is the excitation laser rather than the emission spectra measured. As this signal is referred to as the emission miRPF 67o throughout the paper it may be helpful to change this label to the actual measured unit for the axis, as was done for the Y-axis (BV421). For Figure S3E-F, note in the figure the axis is labeled 640, which is the excitation laser. For consistency it may be best to label this axis as miRFP 670 emission rather than excitation.

We have updated all figures to say that fluorescence intensity at the respective emission wavelength of the fluorophore is shown.

5) Figure 2D, Figure 3B, and Figure 5D are too crowded and difficult to decipher. Transparency in some sections or making subunit cartoon representations all the same color could also help. Consider truncating or hiding some elements in the figure to show only regions of interest.

We have simplified these figures, and several others, to make them easier to understand.

6) For Figure S2 G, these appear to be a sub-sampled statistic of 1% of the total collected population. In the paper (line 634) it was noted that 2.1 million cells were collected, but only 1,000 of each population appear in this statistic table. Can you please also include the final statistics table to evaluate your final sort values. Can you also please note in the legend whether this data is a sub-sample of the published data, a benchmark run, or otherwise. If this gating format is not directly subsampled from the true data collection runs, please re-render these figures using the collected data.

We have included count statistics for all collected cell population in these Supplementary Figures (Figure 1—figure supplement 3 and Figure 1—figure supplement 4). We have also updated the figure to reflect that gating examples are based on a sub-sample of the collected cell population on which this study is based.

7) Median fitness error to synonymous mutations is reported as "low" for the trafficking defect assay at 7.5%. It is again reported as low for the functional assays at 15.3%. It may be best instead to report the median and deviation of the synonymous and missense datasets and their relative differences at each tail end. Low is a subjective statement here and isn't supported by a positive control measure such as a large N of unmutated channels, run in the same sample, showing similar deviations in these assays as the synonymous mutants. If this data was run, it would be best to include it in the graph for comparison. While not essential, in the future I would encourage the authors to include unmutated controls into their library to assess the impact of synonymous substitutions, and to potentially make single synonymous substitutions at every amino acid position as a reference comparison to the mutants.

We have added these metrics and removed the less quantitative statements. Due the specific implementation of our genotype/phenotype linkage workflow (fragmentation of amplicons and short-read NextGen Sequencing), it is not feasible to include unmutated controls in our libraries. For this reason, we used synonymous mutations to assess wildtype fitness (a common approach in the field). We will consider switching to a combination of long-read sequencing (Sequel) and unique molecular identifiers for future studies.

8) The authors should include in their discussion some consideration of the "no data" category which generated a consistent number of analysis gaps within certain domains of the channel. How to avoid these in the future? Were these cloning issues? Library construction difficulties with SPINE? Poor coverage of the original cell pool? Could this have been avoided by making multiple cell libraries, sequencing them, and pooling those with compensating coverage?

Our data comparing missing data in the stable library cell line, surface expression assay, and function assay suggests that variant dropout is stochastic occurring at the cell sorting / NGS stages of our workflow (new Figure 2—figure supplement 1). Additional replication, extending sort and sequencing depth all could help with removing the remaining analysis gaps. We have added this data and discussion.

9) The authors should describe their NGS strategy with much more detail, including more information about the amplicons that they generated for sequencing and how they discriminated between substitutions inside or outside the SPINE-generated variant fragments.

We have edited this section in the Methods description:

– We describe how amplicon libraries are prepared (limited cycle PCR of purified genomic DNA) followed by NexteraXT tagmentation for Illumina Sequencing.

– We describe how reads are analyzed to generate a list of variants and their counts. We are making the analysis script (written in Python) available at https://github.com/schmidtlab/KirDMS. Because we use OLS oligo pools with specific programmed mutations, we can distinguish between expected mutations at each codon and unexpected mutations (wildtype codon, sequencing errors etc.). Our analysis is based on expected mutation counts only.

10) The authors should indicate whether the replicates reflect different libraries, different recombined cell libraries, or different stocks of the same library.

We have added this information. Replicates reflect different aliquots of the same recombined libraries that were frozen down, then thawed, and expanded less than one week prior to experiments. Each biology replicate was done on a different day.

11) The "no data" color scheme from Figure 6S should be adopted for the main paper figures. The use of light yellow for no data and white for "0 fold change" makes differentiating these two dataset types difficult visually in these figures. It is important for the reader to be able to easily observe where mutational coverage was lacking in this study.

We have adapted the magenta = “no data” scheme for figures that illustrate fitness errors and dark grey (grey60) in Figures 2 and 4. Magenta would not work for the latter two figures; other primary colors are too distracting. In our view, a dark gray tone makes it easy to see where data is missing, and this is a standard in the field (e.g. https://elifesciences.org/articles/56707 and https://elifesciences.org/articles/15802).

12) The authors should tone down several observations that are based solely on the DMS screen results. Some of these examples include the increased surface expression of N386E, N386D, T150, Y153, the positive effect in function of N224K and of the residues surrounding the hinge at G176, and importantly, and the predicted phenotypes for the variants of unknown significance. It would further strengthen the manuscript if the authors provided additional experimental validation for a few key examples, either using electrophysiology or the same fluorescence-based assays performed on cell populations expressing single variants, but this is not required.

We are confident in the premise that our assays for surface expression and function provides sufficient sensitivity, dynamic range, and fidelity. This is borne out of prior work (e.g., CoyoteMaestas Nat Comm 2019) and controls included in this study (e.g., Kv1.3 control for surface expression; V302M as a negative control for function).

That said, we agree that additional experimental for single variants could be useful to further bolster this premise. However, finding the right balance between how many variants to test and which metrics to use for comparison (for function – open probability, selectivity, etc.) is not immediately obvious.

Furthermore, since we are using single copy stable cell lines (each cell expresses only one variant) there is not conceptual difference between performing the same fluorescence assay on cell populations expressing a single variant vs. multiple variants. The former is simply a subsample of the latter; we would not expect the distribution of fitness scores to change, only the measurement error would improve.

We have therefore addressed this request in the following ways:

– We have edited the text to make it clear that we are only reporting fitness scores measured by our assays.

– Considering noisy data (in particular for function), we refrain drawing conclusion from single variant but instead focus on themes emerging when taking several variant data points in aggregate.

– We have included additional comparison to prior studies that reported variant effect on surface expression and/or function. We find that there is very strong correlation between our results and prior data, which adds confidence to our result and interpretations.

– In several instance (e.g., surface expression of T150 and Y153 variants, we have added additional analysis and illustration).

– We have removed our speculative statement about N224K.

– We edited the text to emphasize that our interpretation of lower glycine hinge mutations is speculative at this point and requires more validation.

– With respect to predicted phenotypes for variants of unknown significance, we respectfully point out that “predicted” implies uncertainty in the phenotype we are assigning. Even if we were to make individual variants, test them by electrophysiology, and find the same phenotype that our high-throughput assay reported, we would not necessarily have more confidence in assigning clinical significance. That is because of the heavy focus medical genetics and genomics places on case evidence (i.e., finding these variants in patient with disease).

13) In line 215, the sentence reads: "selectivity filter with glycine significantly increase surface expression." This is difficult to see in Figure 2B, since most mutations in that section of the sequence decrease surface expression. Further, since it is stated to be significant, we believe the authors intended to imply statistical significance? The reader would benefit from a different panel in the figure to highlight the residues within the section of the protein being referenced, as well as information on the statistical test that was run and the p-value.

We have added additional analysis and illustration (Figure 3B-D). We find that selectivity filter residues are more tolerant to mutations (when measuring surface expression) than the surrounding structural elements (TM1 and Pore Helix).

14) The authors should tone down remarks that involve a distinction between processes that can't be distinguished by their assays, such as co-translational folding and export between different sub-cellular organelles. It needs to be pointed out that the assay is done in an immortalized non-polarized human embryonic kidney cell line. A major caveat to interpretation of trafficking susceptibility is that trafficking constraints may be very different in native cells, and perhaps especially in polarized cells.

We have rephrased these sections and softened these remarks. It is correct that we currently cannot distinguish between effects on early folding events and later trafficking. However, by using our experimentally measured surface fitness score in a region that is equivalent to a “cotranslationally folded biogenic unit” in other K+ channels, we use inductive reasoning to propose that a similar cotranslationally-folded biogenic unit exists in Kir2.1. We now go on to say that further assay development to separate variant effect on folding vs. export and trafficking between different cell organelles would be very useful.

We thank the reviewer reminding us of context dependence when study ion channel trafficking and biophysics (endogenous vs. overexpression, primary culture, cell lines, oocytes, etc.). We have edited the discussion to point out the caveats that come with performing these assays in HEK293.

15) It would strengthen the manuscript if the authors systematically identified residues engaged in inter vs intra-subunit interactions, and correlated this with the fitness scores for surface expression and function. Otherwise, the observations regarding the role of residues engaging in intra- vs inter-subunit interactions for determining stability and folding or function should be toned down.

We have added this analysis and new figure panels (Figure 5—figure supplement 1). For CTD residues, we can show that intra-subunit contacts are the driver of surface-expression, while intersubunit contact drive function fitness.

16) In line 238, there is a reference to a Fisher's Exact Test, but it's not entirely clear what this test is comparing. We would like to request some clarification of this section, since the current wording suggests that residues involved in gating were analyzed somehow but it's not clear what parameters were used.

We have clarified what this test is comparing: the odds that a loss of surface expression (LoS) vs. a loss of function (LoF) is predicted for experimentally measured variants vs. variants in ClinVar (see Figure 7—figure supplement 1). The odds are significantly higher (odds ratio LoS/LoF 4.6, p-value 0.0001145), meaning that loss of surface expression variants are underrepresented in ClinVar.

17) The authors should systematically compare their per-amino acid variant results with known mutations from the literature, to provide a more critical assessment of their per-variant results. Discrepancies should also be discussed. In Figures 2B and 4C it would be beneficial to place the citation number of the paper corresponding to a previously published defect that matched your observations and border the matching mutation in the heatmap with a unique color identifier so that readers can assess the relative benchmarks and past known mutations and their effects in your assay format more easily. Alternatively, authors could include additional heatmap panels with reduced and simplified content that facilitate comparison between the data in the manuscript and the literature.

We have chosen three comparison studies that allow us to make quantitative comparisons:

Ma et al., 2011, Cell Comparing variant effect on Golgi export and surface expression

Dart et al., 1998, JGP Accessibility of selectivity filter-lining residues to Ag+

Minor et al., 1999, Cell Allowed substitutions in functional Kir2.1 channels

These comparisons are conveyed in a new Figure 6 and under the result subheading “Surface and Function DMS results agree with prior studies”. Our data are in strong agreement with prior these prior studies.

18) L189: It is stated that there is agreement between earlier trafficking studies but this statement needs to be validated by a (at least semi-) quantitative comparison, showing a score from the present assay and a score from previous studies, or a Table listing present score and previous results with references, for such residues. This is important given that the authors then talk about establishing a 'global view' of sequence/trafficking relationships. Are there any previous experimental results that do not match? i.e. previous studies showing opposite effects of a given mutation on trafficking? Or on function? While Table S2 compares the present results to clinvar databases, it is unclear whether there are any experimental comparisons.

See our response to the comment above.

19) L196: Why is it surprising that pore-lining mutations are tolerant to mutation?

We have removed the word “surprisingly”.

20) L263: What does this sentence mean? The idea that partially functional Kir2.1 will 'eventually' hyperpolarize the cell seems logically wrong. The degree of polarization will depend on the balance of K conductance and other conductances, time dependence is a non sequitur.

We have corrected this sentence removing the impression that we are talking about a nonequilibrium process.

21) Paragraph starting l284: This misses any references for the stated mechanistic knowledge. These should be given. The idea that secondary structural elements that couple PIP2 binding to gating can be 'traced' from mutation-sensitive positions is interesting, but far from necessarily correct. This attempt to infer structural connectivity should be made cautiously.

We have edited this section to highlight from which prior studies our statements of mechanistic knowledge are drawn. We use more cautious phrasing about potential coupling between subunit interfaces.

22) Refs 73-75 are 20 years old, and the idea that Kir channels gate by kinking at G176 has been significantly passed over. Gly can be substituted by other residues (DOI: 10.1007/s00249-007-0206-7), and many MD simulations (e.g. DOI: 10.1085/jgp.201912422) show subtle movements along M2 rather than a specific kinking, during gating, as well as highlighting the effects of charge mutations in inducing channel opening. This more nuanced view seems much more consistent with the present data – it is not correct that all mutations reduce function in the present assay. It seems there is improved function for A,T,M,E ,K substitutions.

The thank the reviewer for bringing this to our attention; we have edited this section accordingly.

23) The analysis in Figure 5C conveys no information about the sign and magnitude of the two fitness scores that are being compared; there are multiple possible combinations of surface expression and fitness scores, each implying distinct perturbation mechanisms, that would yield the same calculated difference score. Instead of showing the differences between scores, the authors could show both surface expression and function scores for each position, with a line between the two scores. Different colors could be used for the lines separating the scores, depending on the magnitude of the difference and the sign combination of the two scores. The authors should also discuss if there are positions with a solidly negative surface expression score and a neutral or positive function score – to what extent do these cases raise concern about the accuracy of the function scores? How more reliable are the per-position function scores relative to the per-variant scores, both in regards to what is described above and in general? Granted it is possible that mutations can reduce surface expression without appreciably compromising function, but this should be discussed.

We have created a new version of this figure along the lines of what the reviewer is suggesting. We have added a clarification that explain the apparent discordance of low surface fitness and neutral function fitness: it is due to the difference in measurement dynamic range between surface expression and function assays. The function assay is measuring steady-state RMP and fully functional Kir2.1, even at lower surface expression level may hyperpolarize the cell, giving the appearance of a neutral phenotype.

24) Ref 10 is not appropriate for monogenic disease-relevance of Kirs.

We have updated these references.

25) L47: Neonatal diabetes results from Kir6.2 gain of function. It is confusing to discuss reduced surface expression without noting that this will be a counteracting LOF.

We have edited this sentence to clarify that neonatal diabetes is caused by GOF mutations in Kir6.2.

26) On l54 it states that ClinVar lists 163 missense mutations in Kir2.a, but then on l60 it says it reports 144? Something seems wrong.

ClinVar continuously is updated with new information; we had overlooked one instance referring to an older version. In this revised version, we are using ClinVar data pull on March 25, 2022 and also gnomAD data pulled on the same day.

27) On L66 it is said that there are no large-scale studies of sequence determinants of Kir trafficking and function. The authors should mention Minor et al., (Cell 96:879-91), an early attempt to systematically assess amino acid relevance to structure.

We are now mentioning this early study. We are also using it as one of the validation datasets for a quantitative comparison to earlier literature datasets.

28) L309: What is the important physiological role of GIRK2 here? Presumably you mean Na is an important regulator of GIRK2?

We have removed this paragraph. It seemed superfluous to the point we are trying to make.

29) In general, it would be best to stick to one nomenclature system, and the system, which names this channel as Kir2.1 (rather than IRK1 or other name) seems best. In this case, GIRK2 should be Kir3.2, ROMK1 should be Kir1.1, etc.

We have changed everything to the KirX.Y nomenclature.

30) Please check the figure numbers and letters that are quoted in the text. We note that lines 279 and 289 reference Figure 4, but it doesn't appear to be referencing the correct panel. For example, line 279, surface and functional data are compared and should reference panels D and E? (Currently references D and F. Similar question on line 289.) Line 631 references the wrong Supplementary Figure. Line 652 also references the wrong Supplementary Figure.

We have fixed these errors.

Reviewer #1 (Recommendations for the authors):

We would like to recommend specific content changes in order to address the comments raised above. The below suggestions are in the same order to correspond to each point in the public review.

1) The authors should provide FACS histograms as in Suppl. Figures 2 and 3 for cells expressing only WT channels and for cells without channels. Ideally, other non-functional or surface expression-deficient variants should be included as benchmark.

This is addressed by Essential Revision #3.

2) The authors should tone down several observations that are based solely on the DMS screen results, and provide additional experimental validation for a few key examples, either using electrophysiology or the same fluorescence-based assays performed on cell populations expressing single variants. Some of these examples include the increased surface expression of N386E, N386D, T150, Y153, the positive effect in function of N224K and of the residues surrounding the hinge at G176, and importantly, at least some of the predicted phenotypes for the variants of unknown significance.

This is addressed by Essential Revision #12.

3) The authors should systematically compare their per-amino acid variant results with known mutations from the literature, to provide a more critical assessment of their per-variant results. Discrepancies should also be discussed.

This is addressed by Essential Revision #17.

4) It would strengthen the manuscript if the authors systematically identified residues engaged in inter vs intra-subunit interactions, and correlated this with the fitness scores for surface expression and function. Otherwise, the observations regarding the role of residues engaging in intra- vs inter-subunit interactions for determining stability and folding or function should be toned down.

This is addressed by Essential Revision #15.

5) The authors should tone down remarks that involve a distinction between processes that can't be distinguished by their assays, such as co-translational folding and export between different sub-cellular organelles.

This is addressed by Essential Revision #14.

6) The authors should describe their NGS strategy with detail, including more information about the amplicons that they generated for sequencing and how they discriminated between substitutions inside or outside the SPINE-generated variant fragments. Also, more information about the library should be provided, such as the error rates and error types found after sequencing.

This is addressed by Essential Revision #9.

7) The authors should indicate whether the replicates reflect different libraries, different recombined cell libraries, or different stocks of the same library.

This is addressed by Essential Revision #10.

8) The analysis in Figure 5C conveys no information about the sign and magnitude of the two fitness scores that are being compared; there are multiple possible combinations of surface expression and fitness scores, each implying distinct perturbation mechanisms, that would yield the same calculated difference score. Instead of showing the differences between scores, the authors could show both surface expression and function scores for each position, with a line between the two scores. Different colors could be used for the lines separating the scores, depending on the magnitude of the difference and the sign combination of the two scores.

This is addressed by Essential Revision #23.

9) Related to point (7), the authors should discuss if there are positions with a solidly negative surface expression score and a neutral or positive function score – to what extent do these cases raise concern about the accuracy of the function scores? How more reliable are the per-position function scores relative to the per-variant scores, both in regards to what is described above and in general? Granted it is possible that mutations can reduce surface expression without appreciably compromising function, but this should be discussed.

This is addressed by Essential Revision #23.

In paper organization, we would like to note some 'housekeeping' suggestions for the manuscript and figures to increase clarity.

1) In line 215, the sentence reads: "selectivity filter with glycine significantly increase surface expression." This is difficult to see in Figure 2B, since most mutations in that section of the sequence decrease surface expression. Further, since it is stated to be significant, we believe the authors intended to imply statistical significance? The reader would benefit from a different panel in the figure to highlight the residues within the section of the protein being referenced, as well as information on the statistical test that was run and the p-value.

This is addressed by Essential Revision #13.

2) In line 238, there is a reference to a Fisher's Exact Test, but it's not entirely clear what this test is comparing. We would like to request some clarification of this section, since the current wording suggests that residues involved in gating were analyzed somehow but it's not clear what parameters were used.

This is addressed by Essential Revision #16.

3) Please check the figure numbers and letters that are quoted in the text. We note that lines 279 and 289 reference Figure 4, but it doesn't appear to be referencing the correct panel. For example, line 279, surface and functional data are compared and should reference panels D and E? (Currently references D and F. Similar question on line 289.) Line 631 references the wrong Supplementary Figure. Line 652 also references the wrong Supplementary Figure.

This is addressed by Essential Revision #30.

4) Further on figures, could the authors please consider an alternative presentation or more extensive explanation of Figures 6D and E. We respectfully suggest edits are needed to present that point much more forcefully and help the reader understand the significance of the authors' findings. Panels 6 D and E might benefit from more labels on the graph indicating what the dashed lines are. The figure legend and caption don't fully explain the figure, which makes the interesting conclusions very difficult to interpret. Further, it would be helpful if the caption explained briefly what the orange conflicting and green VUS categories are. Are the colors consistent across panels, from A-E? These data are incredibly relevant to the authors' point and different presentation or perhaps a different figure entirely would assist in making this high-impact finding more accessible.

We have updated this Figure (now: Figure 7). We simplified panels D and E. We have added an explanation for 'conflicting' vs. 'VUS' – essentially, conflicting interpretation vs. unknown significance. Colors are consistent within panel A, within panel C, and within panels D & E, but not across panels. We have chose visually distinct color schemes to avoid confusion.

5) Figure 2D, Figure 3B, and Figure 5D are too crowded and difficult to decipher. Transparency in some sections or making subunit cartoon representations all the same color could also help. Consider truncating or hiding some elements in the figure to show only regions of interest.

This is addressed by Essential Revision #5.

6) The authors should provide markings (asterisks, arrows) in the heat maps in Figures 2B and 4C, or additional supplementary figures with reduced and simplified content that facilitate discussing observations for specific positions without requiring readers to refer to the heat maps containing data for all variants.

We have attempted adding asterisk and other visual aids to highlight specific position mentioned in the text, but this added clutter to an already complicated heatmap. Considering noisy data (in particular for function), we found that drawing from conclusion from several data points in aggregate is better than focusing on single variants. Where possible, we have added dedicated panels with fitness score mapped onto the Kir2.2 structure to provide visual context.

7) In Figure 6C, we suggest a different color for either the variants without predicted mechanism for Kir2.1 (red) and the gain of function (orange), because it is difficult to visually distinguish between them.

We updated this figure (now: Figure 7C) and chose visually distinct color schemes to avoid confusion.

Reviewer #2 (Recommendations for the authors):

This manuscript reports use of a clever approach to generation of a library of single copy mutants in Kir2.1 and then use of FACS sorting of trafficked and functional mutations to analyze the consequences. The authors are to be congratulated on a data set that is very comprehensive and extremely informative, with important controls in place. The authors have made a strong effort to link their work to previous findings, but the text is written in a discursive style, and the grammatical and typographical errors throughout the manuscript make it difficult to read at times. There are a large number of comments below that should be addressed.

1. L189 It is stated that there is agreement between earlier trafficking studies but this statement needs to be validated by a (at least semi-) quantitative comparison, showing a score from the present assay and a score from previous studies, or a Table listing present score and previous results with references, for such residues. This is important given that the authors then talk about establishing a 'global view' of sequence/trafficking relationships. Are there any previous experimental results that do not match? i.e. previous studies showing opposite effects of a given mutation on trafficking? Or on function? While Table S2 compares the present results to clinvar databases, it is unclear whether there are any experimental comparisons.

This is addressed by Essential Revision #18.

2. L196 Why is it surprising that pore-lining mutations are tolerant to mutation?

This is addressed by Essential Revision #19.

3. L243-246. This is mere conjectural statement seems unnecessary. Suggest deleting.

This statement relates to the major motivation for using perturbation as a coarse grain high-throughput approach to annotate protein structure – a major focus of our ongoing research. We respectfully request to leave this statement as is, even if conjecture at this point.

4. L263 What does this sentence mean? The idea that partially functional Kir2.1 will 'eventually' hyperpolarize the cell seems logically wrong. The degree of polarization will depend on the balance of K conductance and other conductances, time dependence is a non sequitur.

This is addressed by Essential Revision #20.

5. In Figure 5 surface expression was subtracted from function fitness, and while the analysis produces interesting clusters, the real significance is unclear. Rather it seems that the 'functional assessment' is a product of expression and activity of individual channels, and so it seems the analysis should somehow seek to 'divide' the raw functional score by the expression level?

This is addressed by Essential Revision #23.

6. Paragraph starting l284. This misses any references for the stated mechanistic knowledge. These should be given. The idea that secondary structural elements that couple PIP2 binding to gating can be 'traced' from mutation-sensitive positions is interesting, but far from necessarily correct. This attempt to infer structural connectivity should be made cautiously.

This is addressed by Essential Revision #21.

7. l. 30 paragraph. Refs 73-75 are 20 years old, and the idea that Kir channels gate by kinking at G176 has been significantly passed over. Gly can be substituted by other residues (DOI: 10.1007/s00249-007-0206-7), and many MD simulations (e.g. DOI: 10.1085/jgp.201912422) show subtle movements along M2 rather than a specific kinking, during gating, as well as highlighting the effects of charge mutations in inducing channel opening. This more nuanced view seems much more consistent with the present data – it is not correct that all mutations reduce function in the present assay. It seems there is improved function for A,T,M,E ,K substitutions.

This is addressed by Essential Revision #22.

8. It needs to be pointed out that the assay is done in an immortalized non-polarized human embryonic kidney cell line. A major caveat to interpretation of trafficking susceptibility is that trafficking constraints may be very different in native cells, and perhaps especially in polarized cells.

This is addressed by Essential Revision #14.

9. There are a many odd typos or grammatical idiosyncracies that need cleaning up (missing articles, oddly inserted 'into', 'to', 'they are', 'and but', 'and' instead of 'the', etc , etc, words throughout the text). The use of speech contractions (e.g. 'We've' for 'We have', l108; 'protein don't' for proteins do not', l458) should all be corrected.

We have fixed these.

10. l38 Ref 10 is not appropriate for monogenic disease-relevance of Kirs.

This is addressed by Essential Revision #24.

11. l47. Neonatal diabetes results from Kir6.2 gain of function. It is confusing to discuss reduced surface expression without noting that this will be a counteracting LOF.

This is addressed by Essential Revision #25.

12. On l54 it states that ClinVar lists 163 missense mutations in Kir2.a, but then on l60 it says it reports 144? Something seems wrong.

This is addressed by Essential Revision #26.

13. On l66 it is said that there are no large-scale studies of sequence determinants of Kir trafficking and function. The authors should mention Minor et al., (Cell 96:879-91), an early attempt to systematically assess amino acid relevance to structure.

This is addressed by Essential Revision #27.

14. l309 What is the important physiological role of GIRK2 here? Presumably you mean Na is an important regulator of GIRK2?

This is addressed by Essential Revision #28.

15. In general, it would be best to stick to one nomenclature system, and the system, which names this channel as Kir2.1 (rather than IRK1 or other name) seems best. In this case, GIRK2 should be Kir3.2, ROMK1 should be Kir1.1, etc.

This is addressed by Essential Revision #29.

16. Paragraph beginning l2438 belongs in discussion.

This is addressed in Essential Revision. This is addressed by our response to reviewer comment 3.

17. There are weirdly incomplete references e.g, refs 97-99.

Fixed.

Reviewer #3 (Recommendations for the authors):

1. For Figure S1, it would be informative to also map onto this figure all the currently observed missense mutations in the GnomAD database. Are mutations currently circulating in the population in permissive regions observed later in the study? Are they clustered to particular areas? Looks like majority are in N-term and C-term.

This is addressed by Essential Revision #1.

2. For Figure 1C, there does not appear to be a data set sequencing the HEK293T landing pad cell line to demonstrate the extent of library coverage. This is generally important, both for the reader to assess the quality of the initial library as well as to understand the limitations of this method in providing complete coverage maps. I am curious if the cell line coverage landscape would mirror the deficits observed in Supplementary Figure 6 (magneta boxes). While only 7% of the potential data is missing, these gaps appear to be focused in very specific amino acid and receptor domain locations. This data should be made into a heatmap figure (a la figure 2) to demonstrate if the library was uniform or biased toward specific mutational areas. It would be informative to identify how many mutations are under-represented in the library and if this pairs with specific functional deficits. i.e. – are we seeing magenta in Figure 6 because there wasn't sufficient coverage in the initial library to detect them, or are they present, but not measurable?

This is addressed by Essential Revision #2.

3. In Figure 2b it would be beneficial to place the citation number of the paper corresponding to a previously published defect that matched your observations and border the matching mutation in the heatmap with a unique color identifier so that readers can assess the relative benchmarks and past known mutations and their effects in your assay format more easily.

This is addressed by Essential Revision #17.

4. For Figure S2 E and F the X-axis is the excitation laser rather than the emission spectra measured. As this signal is referred to as the emission miRPF 67o throughout the paper it may be helpful to change this label to the actual measured unit for the axis, as was done for the Y-axis (BV421).

This is addressed by Essential Revision #4.

5. For Figure S2 G, these appear to be a sub-sampled statistic of 1% of the total collected population. In the paper (line 634) it was noted that 2.1 million cells were collected, but only 1,000 of each population appear in this statistic table. Can you please also include the final statistics table to evaluate your final sort values. Can you also please note in the legend whether this data is a sub-sample of the published data, a benchmark run, or otherwise. If this gating format is not directly subsampled from the true data collection runs, please re-render these figures using the collected data.

This is addressed by Essential Revision #6.

6. Median fitness error to synonymous mutations is reported as "low" for the trafficking defect assay at 7.5%. It is again reported as low for the functional assays at 15.3%. It may be best instead to report the median and deviation of the synonymous and missense datasets and their relative differences at each tail end. Low is a subjective statement here and isn't supported by a positive control measure such as a large N of unmutated channels, run in the same sample, showing similar deviations in these assays as the synonymous mutants. If this data was run, it would be best to include it in the graph for comparison. While not essential, in the future I would encourage the authors to include unmutated controls into their library to assess the impact of synonymous substitutions, and to potentially make single synonymous substitutions at every amino acid position as a reference comparison to the mutants.

This is addressed by Essential Revision #7.

7. For Figure S3E-F, please show the FLIPR fluorescence of the miRFP 670 negative population as an overlay in F (dotted line of peaks over the red of P4) so we can gauge the fluorescence distribution of the negative control cells wherein no channel activity is present. Is this how "FLIPR-" was determined? Also, note in the figure the axis is labeled 640, which is the excitation laser. For consistency it may be best to label this axis as miRFP 670 emission rather than excitation.

This is addressed by Essential Revision #3.

8. The authors should include in their discussion some consideration of the "no data" category which generated a consistent number of analysis gaps within certain domains of the channel. How to avoid these in the future? Were these cloning issues? Library construction difficulties with SPINE? Poor coverage of the original cell pool? Could this have been avoided by making multiple cell libraries, sequencing them, and pooling those with compensating coverage?

This is addressed by Essential Revision #8.

9. The "no data" color scheme from Figure 6S should be adopted for the main paper figures. The use of light yellow for no data and white for "0 fold change" makes differentiating these two dataset types difficult visually in these figures. It is important for the reader to be able to easily observe where mutational coverage was lacking in this study.

This is addressed by Essential Revision #11.

https://doi.org/10.7554/eLife.76903.sa2

Article and author information

Author details

  1. Willow Coyote-Maestas

    Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, United States
    Present address
    Department of Bioengineering and Therapeutic Science and Quantitative Biosciences Institute, University of California, San Francisco, United States
    Contribution
    Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Validation, Writing – original draft, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0001-9614-5340
  2. David Nedrud

    Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, United States
    Contribution
    Conceptualization, Data curation, Investigation, Methodology, Software
    Competing interests
    No competing interests declared
  3. Yungui He

    Department of Genetics, Cell Biology and Development, University of Minnesota, Minneapolis, United States
    Contribution
    Investigation
    Competing interests
    No competing interests declared
  4. Daniel Schmidt

    Department of Genetics, Cell Biology and Development, University of Minnesota, Minneapolis, United States
    Contribution
    Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review and editing
    For correspondence
    schmida@umn.edu
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0001-7609-4873

Funding

National Institute of General Medical Sciences (R01GM136851)

  • Daniel Schmidt

Howard Hughes Medical Institute

  • Willow Coyote-Maestas

Illumina

  • Daniel Schmidt

National Science Foundation

  • Willow Coyote-Maestas

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We are grateful for helpful discussions with Anna Gloyn, James Fraser, Gabbriella Estevam, Eric Greene, the DMS crew, and the rest of the Fraser lab. We also want to acknowledge the hard work of the members of the UMN Flow Cytometry Core that enabled us to do these experiments during the COVID-19 pandemic. Rashi Arora especially assisted us in sorting our cells by FACS. We also thank you for taking the time to read our article. This work was supported by the National Institutes of Health (1R01GM136851 to DS) and a University of Minnesota Genome Center Illumina S2 grant. WC-M was supported by a National Science Foundation Graduate Research Fellowship and a Howard Hughes Medical Institute Gilliam Fellowship for Advanced Study.

Senior Editor

  1. Kenton J Swartz, National Institute of Neurological Disorders and Stroke, National Institutes of Health, United States

Reviewing Editor

  1. Andrés Jara-Oseguera, The University of Texas at Austin, United States

Reviewers

  1. Colin G Nichols, Washington University School of Medicine, United States
  2. Justin Gregory English, The University of Utah, United States

Publication history

  1. Preprint posted: January 6, 2022 (view preprint)
  2. Received: January 8, 2022
  3. Accepted: May 27, 2022
  4. Accepted Manuscript published: May 31, 2022 (version 1)
  5. Version of Record published: July 11, 2022 (version 2)

Copyright

© 2022, Coyote-Maestas et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 1,332
    Page views
  • 490
    Downloads
  • 3
    Citations

Article citation count generated by polling the highest count across the following sources: PubMed Central, Crossref, Scopus.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Open citations (links to open the citations from this article in various online reference manager services)

Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)

  1. Willow Coyote-Maestas
  2. David Nedrud
  3. Yungui He
  4. Daniel Schmidt
(2022)
Determinants of trafficking, conduction, and disease within a K+ channel revealed through multiparametric deep mutational scanning
eLife 11:e76903.
https://doi.org/10.7554/eLife.76903
  1. Further reading

Further reading

    1. Evolutionary Biology
    2. Genetics and Genomics
    Yi Feng, Rafik Neme ... Laura F Landweber
    Research Article

    Ciliates are microbial eukaryotes that undergo extensive programmed genome rearrangement, a natural genome editing process that converts long germline chromosomes into smaller gene-rich somatic chromosomes. Three well-studied ciliates include Oxytricha trifallax, Tetrahymena thermophila and Paramecium tetraurelia, but only the Oxytricha lineage has a massively scrambled genome, whose assembly during development requires hundreds of thousands of precise programmed DNA joining events, representing the most complex genome dynamics of any known organism. Here we study the emergence of such complex genomes by examining the origin and evolution of discontinuous and scrambled genes in the Oxytricha lineage. This study compares six genomes from three species, the germline and somatic genomes for Euplotes woodruffi, Tetmemena sp., and the model ciliate Oxytricha trifallax. To complement existing data, we sequenced, assembled and annotated the germline and somatic genomes of Euplotes woodruffi, which provides an outgroup, and the germline genome of Tetmemena sp.. We find that the germline genome of Tetmemena is as massively scrambled and interrupted as Oxytricha's : 13.6% of its gene loci require programmed translocations and/or inversions, with some genes requiring hundreds of precise gene editing events during development. This study revealed that the earlier-diverged spirotrich, E. woodruffi, also has a scrambled genome, but only roughly half as many loci (7.3%) are scrambled. Furthermore, its scrambled genes are less complex, together supporting the position of Euplotes as a possible evolutionary intermediate in this lineage, in the process of accumulating complex evolutionary genome rearrangements, all of which require extensive repair to assemble functional coding regions. Comparative analysis also reveals that scrambled loci are often associated with local duplications, supporting a gradual model for the origin of complex, scrambled genomes via many small events of DNA duplication and decay.

    1. Chromosomes and Gene Expression
    2. Genetics and Genomics
    Joseph V Geisberg, Zarmik Moqtaderi ... Kevin Struhl
    Research Advance

    Alternative polyadenylation yields many mRNA isoforms whose 3' termini occur disproportionately in clusters within 3' UTRs. Previously, we showed that profiles of poly(A) site usage are regulated by the rate of transcriptional elongation by RNA polymerase (Pol) II (Geisberg et., 2020). Pol II derivatives with slow elongation rates confer an upstream-shifted poly(A) profile, whereas fast Pol II strains confer a downstream-shifted poly(A) profile. Within yeast isoform clusters, these shifts occur steadily from one isoform to the next across nucleotide distances. In contrast, the shift between clusters from the last isoform of one cluster to the first isoform of the next - is much less pronounced, even over large distances. GC content in a region 13-30 nt downstream from isoform clusters correlates with their sensitivity to Pol II elongation rate. In human cells, the upstream shift caused by a slow Pol II mutant also occurs continuously at the nucleotide level within clusters, but not between them. Pol II occupancy increases just downstream of the most speed-sensitive poly(A) sites, suggesting a linkage between reduced elongation rate and cluster formation. These observations suggest that 1) Pol II elongation speed affects the nucleotide-level dwell time allowing polyadenylation to occur, 2) poly(A) site clusters are linked to the local elongation rate and hence do not arise simply by intrinsically imprecise cleavage and polyadenylation of the RNA substrate, 3) DNA sequence elements can affect Pol II elongation and poly(A) profiles, and 4) the cleavage/polyadenylation and Pol II elongation complexes are spatially, and perhaps physically, coupled so that polyadenylation occurs rapidly upon emergence of the nascent RNA from the Pol II elongation complex.