Functional characterization of all CDKN2A missense variants and comparison to in silico models of pathogenicity

Hirokazu Kimura; Kamel Lahouel; Cristian Tomasetti; Nicholas J. Roberts

doi:10.7554/eLife.95347.1

eLife assessment

This is a saturation mutagenesis screening of the CDKN2A gene, successfully assessing the functionality of the missense variants. The results seem robust, but currently, the manuscript is incomplete with a number of weaknesses. The work has the potential to serve as a valuable resource for diagnostic labs as well as cancer geneticists.

https://doi.org/10.7554/eLife.95347.1.sa2

Significance of findings

valuable: Findings that have theoretical or practical implications for a subfield

Strength of evidence

incomplete: Main claims are only partially supported

During the peer-review process the editor and reviewers write an eLife assessment that summarises the significance of the findings reported in the article (on a scale ranging from landmark to useful) and the strength of the evidence (on a scale ranging from exceptional to inadequate).

Abstract

Interpretation of variants identified during genetic testing is a significant clinical challenge. In this study, we developed a high-throughput CDKN2A functional assay and characterized all possible CDKN2A missense variants. We found that 40% of all missense variants were functionally deleterious. We also used our functional classification to assess the performance of in silico models that predict the effect of variants, including recently reported models based on machine learning. Notably, we found that all in silico models performed similarly when compared to our functional classifications, with accuracies of 54.6 – 70.9%. Furthermore, while we found that functionally deleterious variants were enriched within ankyrin repeats, rarely were all missense variants at a single residue functionally deleterious. Our functional classifications are a resource to aid the interpretation of CDKN2A variants and have important implications for the application of variant interpretation guidelines, particularly the use of in silico models for clinical variant interpretation.

Introduction

Genetic testing of patients with cancer to identify variants associated with an increased cancer risk and sensitivity to targeted therapies is becoming more common as broad testing criteria are integrated into clinical care guidelines (Goggins et al., 2020; Stoffel et al., 2019). The American College of Medical Genetics (ACMG) provides a framework to integrate multiple types of evidence, including variant characteristics, disease epidemiology, clinical information, and functional classifications, to interpret variants in any gene (Richards et al., 2015). In silico variant effect predictors are also integrated into ACMG variant interpretation guidelines as supporting evidence to aid classification of variants. While numerous models have been developed, varied accuracy, poor agreement between models, and inflated performance on publicly available data have been reported (Cubuk et al., 2021; Jaffe et al., 2011; Wilcox et al., 2022). Recently developed variant effect predictors aim to overcome these limitations by incorporating deep-learning based protein structure predictions and by not training on human annotated datasets (Brandes et al., 2023; Cheng et al., 2023; Gao et al., 2023). However, post-development assessments of machine learning based variant effect predictors to determine accuracy on novel experimental datasets and suitability for clinical use are limited.

Variants that cannot be classified as either pathogenic or benign are categorized as variants of uncertain significance (VUSs). However, while pathogenic and benign variants identified during genetic testing are clinically actionable, VUSs are the cause of deep uncertainty for patients and their health care providers as an unknown fraction are functionally deleterious and therefore, likely pathogenic. For example, individuals with germline VUSs in a pancreatic cancer susceptibility gene would not be eligible for clinical surveillance programs that are associated with improved patient outcomes, unless they otherwise meet family history criteria (Goggins et al., 2020; Stoffel et al., 2019). Similarly, patients with breast or pancreatic cancer and a germline BRCA2 VUS would not be eligible for treatment with olaparib, a poly (ADP-ribose) polymerase inhibitor (Golan et al., 2019; Tutt et al., 2021). Reclassification of VUSs into pathogenic or benign strata has real-world, life-or-death consequences that necessitate a high degree of accuracy.

Germline VUSs in hereditary cancer genes are a frequent finding in patients with cancer and can often be reclassified as pathogenic on the basis of in vitro functional characterization (Kimura et al., 2022). In patients with pancreatic ductal adenocarcinoma (PDAC), germline CDKN2A VUSs affecting p16^INK4a, most often rare missense variants, are found in up to 4.3% of patients (Chaffee et al., 2018; Kimura et al., 2021; McWilliams et al., 2018; Roberts et al., 2016; Shindo et al., 2017; Zhen et al., 2015). As functional data from well-validated in vitro assays are incorporated into ACMG variant interpretation guidelines, we recently determined the functional consequence of 29 CDKN2A VUSs identified in patients with PDAC using an in vitro cell proliferation assay (Kimura et al., 2022; Richards et al., 2015). We found that over 40% of VUSs assayed were functionally deleterious and could be reclassified as likely pathogenic.

Functional characterization is time-consuming, expensive, and requires technical and scientific expertise. These limitations hinder assessment of in silico variant effect predictors and patient access to functional data that may allow reclassification of VUSs into clinically actionable strata. As CDKN2A VUSs will continue to be identified in patients with cancer undergoing genetic testing, we developed a multiplexed functional assay to provide a broad interpretation framework for CDKN2A variants. We characterized all possible CDKN2A missense variants and compared our functional classifications to in silico variant effect predictors, including recently developed models based on machine learning, to determine the accuracy of variant effect predictions.

Results

Functional characterization of CDKN2A missense variants

We utilized a codon optimized CDKN2A sequence for our multiplexed functional assay (Appendix 1-table 1). Expression of codon optimized CDKN2A or three synonymous CDKN2A variants, p.L32L, p.G101G, and p.V126V, in PANC-1, a PDAC cell line with a homozygous deletion of CDKN2A, resulted in a significant reduction in cell proliferation (P value < 0.0001; Figure 1-figure supplement 1A). Conversely, expression of three pathogenic variants, p.L32P, p.G101W, and p.V126D, in PANC-1 cells did not result in any significant changes in cell proliferation. To determine if there were unappreciated selective effects during in vitro culture, we generated a CellTag library based on the pLJM1 plasmid that contained twenty non-functional 9 base pair barcodes of equal representation. We then transduced PANC-1 cells stably expressing codon optimized CDKN2A with the CellTag library and determined representation of each barcode in the cell pool, before and after in vitro culture. We found no statistically significant changes in barcode representation, indicating that representation of a pool of functionally neutral variants is stable over a period of in vitro culture representing our assay time course (Figure 1-figure supplement 1B).

We next determined whether we could identify functionally deleterious CDKN2A variants at a single residue when all amino acid variants were assayed simultaneously. We generated lentiviral expression libraries for two CDKN2A amino acid residues, p.V126 and p.R144, that include pathogenic and benign variants, respectively. Each lentiviral expression library contained all amino acid variants (19 missense and 1 synonymous variant) at a single residue. We then transduced PANC-1 cells with each of the lentiviral expression libraries and determined the representation of each variant in the resulting cell pool before and after a period of in vitro culture (Figure 1A, B). Synonymous variants, p.V126V and p.R144R, as well as a previously reported benign variant, p.R144C, either decreased or maintained their representation in the cell pool during in vitro culture. Representation of a previously reported pathogenic variant, p.V126D, increased in the cell pool. Notably, several other variants, including p.V126R, p.V126W, p.V126K, and p.V126Y, increased in representation, suggesting that additional amino acid changes at this residue are functionally deleterious (Figure 1A).

Pooled analysis of *CDKN2A* variants at two residues with previously reported pathogenic and benign variants.
PANC-1 cells stably expressing a total of 20 *CDKN2A* variants, 19 missense variants and 1 synonymous variant, at a single residue were cultured. Variant representation, as percent of reads supporting the variant sequence, before and after a period of in vitro cell growth was determined by next generation sequencing for two residues, p.V126 (A) or p.R144 (B). *CDKN2A* variant p.V126D (*) was previously reported as pathogenic and increased in representation during in vitro growth. *CDKN2A* variant p.R144C (**) was previously reported as benign and maintained representation during in vitro growth.

Next, to functionally characterize 2,964 CDKN2A missense variants, we generated 156 CDKN2A lentiviral expression libraries where each library contained all possible amino acid substitutions at a single residue. PANC-1 cells were then transduced with each of the lentiviral expression libraries and representation of each CDKN2A variant in the resulting cell pool was determined before and after a period of in vitro culture. Variant read counts were then analyzed using a gamma generalized linear model and variants with statistically significant P values were classified as functionally deleterious. Variants with P values that did not reach statistical significance were classified as functionally neutral.

We found that 1,182 of 2,964 missense variants (39.9%) characterized in our assay were functionally deleterious and that 1,782 variants (60.1%) were functionally neutral (Figure 2A, Appendix 1-table 2). In general, our results were consistent with previously reported classifications. Thirty-two variants identified in patients with cancer and previously reported to be functionally deleterious in published literature and/or reported in ClinVar as pathogenic or likely pathogenic were characterized as functionally deleterious in our assay (Figure 2B, Appendix 1-table 3) (Chaffee et al., 2018; Chang et al., 2016; Horn et al., 2021; Hu et al., 2018; Kimura et al., 2022; McWilliams et al., 2018; Roberts et al., 2016; Zhen et al., 2015). Of 162 benign variants, including 156 synonymous variants and six missense variants previously reported to be functionally neutral in published literature and/or reported in ClinVar as benign or likely benign, all were characterized as functionally neutral in our assay (Figure 2B, Appendix 1-table 3) (Kimura et al., 2022; McWilliams et al., 2018; Roberts et al., 2016). Similarly, of 50 missense variants classified as VUSs in ClinVar that also have previously reported functional data, only two variants, p.G35Q and p.G101R, had divergent functional classifications. Both variants were previously reported to have functionally neutral effects but were characterized as functionally deleterious in our assay (Figure 2C, Appendix 1-table 2, 3).

Functional characterization of all possible *CDKN2A* missense variants.
(A) Functional classifications of 3,120 CDKN2A variants, including all possible 2,964 missense variants and 156 synonymous variants. Variants were classified as functionally deleterious or neutral based on P value. 1,182 (39.9%) of missense variants were classified as functionally deleterious. (B) P values for 32 benchmark pathogenic variants and 162 benign variants. All pathogenic variants were classified as functionally deleterious, and all benign variants were classified as functionally neutral, based on P values. (C) High-throughput functional assay P values for 32 *CDKN2A* VUSs previously reported to have functionally deleterious effects and 18 VUSs previously reported to have functionally neutral effects. (D) Heat map with P values for all 3,120 CDKN2A variants assayed.

Comparison to in silico prediction algorithms

As in silico predictions of variant effect are integrated into ACMG variant interpretation guidelines as supporting evidence, we compared the ability of different algorithms, including recently described algorithms that incorporate deep-learning models of protein structure, to predict the functional consequence of CDKN2A missense variants. Using our functional classifications for all CDKN2A missense variants as truth, we compared our functional classifications to predictions from CADD, PolyPhen-2, SIFT, VEST, AlphaMissense, ESM1b, and PrimateAI-3D. Predictions for all missense variants (1,182 functionally deleterious and 1,782 functionally neutral) were available for comparison for all algorithms, except CADD and PrimateAI-3D, where 910 (349 functionally deleterious and 561 functionally neutral) and 904 (349 functionally deleterious and 555 functionally neutral) missense variants had predictions available, respectively (Appendix 1-table 4). In silico variant effect predictors performed similarly across a broad range of performance characteristics (Appendix 1-table 5). Accuracy of in silico model predictions was 54.6 – 70.9% (CADD – 56.8%; PolyPhen-2 – 54.6%; SIFT – 61.4%; VEST – 70.8%; AlphaMissense – 70.9%; ESM1b – 64.9%; and PrimateAI-3D – 66.7%) (Figure 3). We also assessed sensitivity, specificity, positive predictive value, and negative predictive value for each model.
We found that sensitivity was 0.15 – 0.90 (CADD – 0.86; PolyPhen-2 – 0.90; SIFT – 0.64; VEST – 0.67; AlphaMissense – 0.69; ESM1b – 0.77; and PrimateAI-3D – 0.15), specificity was 0.31 – 0.99 (CADD – 0.39; PolyPhen-2 – 0.32; SIFT – 0.60; VEST – 0.74; AlphaMissense – 0.72; ESM1b – 0.57; and PrimateAI-3D – 0.99), positive predictive value was 0.46 – 0.93 (CADD – 0.47; PolyPhen-2 – 0.46; SIFT – 0.51; VEST – 0.63; AlphaMissense – 0.62; ESM1b – 0.54; and PrimateAI-3D – 0.93), and negative predictive value was 0.65 – 0.82 (CADD – 0.81; PolyPhen-2 – 0.82; SIFT – 0.71; VEST – 0.77; AlphaMissense – 0.78; ESM1b – 0.79; and PrimateAI-3D – 0.65).
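
The performance measures above all derive from a two-by-two comparison of each model's calls against the functional classifications. A minimal Python sketch of that calculation follows, treating the functionally deleterious class as the positive class. The function name and the toy labels are illustrative assumptions and are not taken from the study's analysis code.

```python
from typing import Dict, Sequence


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def binary_metrics(truth: Sequence[bool], predicted: Sequence[bool]) -> Dict[str, float]:
    """Compare predictions with functional classifications (True = deleterious)."""
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)          # deleterious, predicted deleterious
    tn = sum(1 for t, p in pairs if not t and not p)  # neutral, predicted neutral
    fp = sum(1 for t, p in pairs if not t and p)      # neutral, predicted deleterious
    fn = sum(1 for t, p in pairs if t and not p)      # deleterious, predicted neutral
    return {
        "accuracy": _ratio(tp + tn, len(pairs)),
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }


# Toy example (hypothetical labels): 3 deleterious and 5 neutral variants.
toy_truth = [True, True, True, False, False, False, False, False]
toy_pred = [True, True, False, True, False, False, False, False]
toy_metrics = binary_metrics(toy_truth, toy_pred)
print(toy_metrics)
```

Because only variants with an available prediction enter the comparison, the denominators differ between models (for example 910 variants for CADD and 904 for PrimateAI-3D, against 2,964 for the others).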

Comparison of functional classifications and in silico variant effect predictions for all possible CDKN2A missense variants.
Variant effect predictions for CDKN2A missense variants using CADD, PolyPhen-2, SIFT, VEST, AlphaMissense, ESM1b, and PrimateAI-3D. Predicted deleterious, damaging, or pathogenic effects (black box) and predicted neutral, tolerated, benign, or ambiguous effects (white box) presented as percent of missense variants with available prediction. Number of missense variants with available prediction for each in silico model given in parentheses. CADD, Combined Annotation Dependent Depletion; PolyPhen-2, Polymorphism Phenotyping v2; SIFT, Sorting Intolerant From Tolerant; VEST, Variant Effect Scoring Tool.

Distribution of functionally deleterious variants

Functionally deleterious missense variants were not distributed evenly across CDKN2A. CDKN2A contains four ankyrin repeats that mediate protein-protein interactions, ankyrin repeat 1 at codon 11-40, ankyrin repeat 2 at codon 44-72, ankyrin repeat 3 at codon 77-106, and ankyrin repeat 4 at codon 110-139 (Goldstein, 2004; Ruas and Peters, 1998; Sun et al., 2010) (Figure 2-figure supplement 1A). Functionally deleterious variants were enriched in ankyrin repeat 1 (44.8%, adjusted P value = 8.5 x 10^-4), ankyrin repeat 2 (47.4%, adjusted P value = 1.4 x 10^-6), and ankyrin repeat 3 (55.0%, adjusted P value = 6.1 x 10^-22), and depleted in ankyrin repeat 4 (27.2%, adjusted P value = 1.6 x 10^-8), non-ankyrin repeat residues 1-10 (22.0%, adjusted P value = 1.5 x 10^-5), and residues 140-156 (9.1%, adjusted P value = 4.5 x 10^-30) (Figure 2-figure supplement 1B). Moreover, functionally deleterious variants were further enriched within 10-residue subregions of ankyrin repeats 1-3, with 59.5% of variants in residues 14-23 of ankyrin repeat 1, 63.5% of variants in residues 46-55 of ankyrin repeat 2, and 81.5% of variants in residues 80-89 of ankyrin repeat 3 being classified as functionally deleterious (Figure 2D, Appendix 1-table 2). Furthermore, analysis of functionally deleterious variants may highlight critical and non-critical residues for CDKN2A function. Across all single residues, the mean percent of functionally deleterious missense variants was 39.9% (95% confidence interval: 34.7% – 45.0%) (Figure 2-figure supplement 1C). At six amino acid residues, p.G55, p.P81, p.H83, p.D84, p.G89, and p.P114, all missense variants were functionally deleterious. These residues are conserved between human and murine p16 (Byeon et al., 1998).
Furthermore, p.H83 has been reported to stabilize peptide loops connecting the helix-turn-helix structure exhibited by the four ankyrin repeats (Byeon et al., 1998), whereas p.D84 and p.G89 are located in a 20-residue region reported to interact with CDK4 and CDK6 (Fåhraeus et al., 1996). Conversely, at 18 residues, no missense variant was characterized as functionally deleterious (Appendix 1-table 2).

Functional effect of CDKN2A somatic mutations

Somatic alterations in CDKN2A are a frequent finding in many types of cancer. However, not all somatic alterations are unequivocally deleterious to protein function. Missense somatic mutations, for example, are particularly challenging to functionally interpret and the presence of a functionally neutral somatic mutation may impact patient care (Tung et al., 2020). To understand the functional effect of missense somatic mutations in CDKN2A, we functionally classified mutations reported in the Catalogue Of Somatic Mutations In Cancer (COSMIC) (Forbes et al., 2009), The Cancer Genome Atlas (TCGA) (Muddabhaktuni and Koyyala, 2021), patients with cancer undergoing sequencing at The Johns Hopkins University School of Medicine (JHU), and the Memorial Sloan Kettering-Integrated Mutation Profiling of Actionable Cancer Targets Clinical Sequencing Cohort (MSK-IMPACT) (Cheng et al., 2015). Overall, 355 unique missense somatic mutations were reported, of which 193 (54.4%) were functionally deleterious (Appendix 1-table 6). The percent of missense somatic mutations that were classified as functionally deleterious was greater than the percent of all possible CDKN2A missense variants that were classified as functionally deleterious, suggesting enrichment of functionally deleterious missense changes among somatic mutations (Figure 2A, Appendix 1-table 6). The prevalence of functionally deleterious missense somatic mutations was similar in COSMIC, TCGA, JHU, and MSK-IMPACT with 55.9%, 68.0%, 65.7%, and 66.7% of mutations being classified as functionally deleterious, respectively (Figure 4A, Appendix 1-table 6). Similar to all functionally deleterious variants, functionally deleterious missense somatic mutations were not distributed evenly across CDKN2A.
Functionally deleterious somatic mutations were enriched within ankyrin repeat 3 (Figure 4B, Appendix 1-table 6). We found that 29.0%, 37.1%, 50.8%, and 42.9% of all functionally deleterious missense somatic mutations occurred within ankyrin repeat 3 in COSMIC, TCGA, JHU, and MSK-IMPACT, respectively, and within this domain, 62.7%, 73.1%, 72.7%, and 70.8% of functionally deleterious mutations were in residues 80-89 in COSMIC, TCGA, JHU, and MSK-IMPACT, respectively (Figure 4B).

Functional classification of missense somatic mutations in *CDKN2A*.
(A) Missense somatic mutations in *CDKN2A* reported in COSMIC, TCGA, JHU, or MSK-IMPACT, by functional classification (deleterious – black box; neutral – white box). (B) Distribution of functionally deleterious missense somatic mutations in *CDKN2A* reported in COSMIC, TCGA, JHU, or MSK-IMPACT by ankyrin (ANK) repeat.

We were also able to assess the contribution of functionally deleterious CDKN2A missense somatic mutations in COSMIC, TCGA, JHU, and MSK-IMPACT by cancer type, and found that 44.4 – 95.7% of reported CDKN2A missense somatic mutations were functionally deleterious when stratified by cancer type (Figure 4-figure supplement 1A-D). When considering missense somatic mutations reported in all databases, there was enrichment of functionally deleterious mutations in melanoma (84.8%; adjusted P value = 0.019) and depletion of functionally deleterious mutations in colorectal adenocarcinoma (49.0%; adjusted P value = 2.6 x 10^-4) (Figure 4-figure supplement 2). As the proportion of missense somatic mutations that were functionally deleterious was lower in colorectal carcinoma than in other types of cancer, we assessed whether somatic mutations in mismatch repair genes (MLH1, MSH2, MSH6, and PMS2) were associated with the functional status of CDKN2A missense somatic mutations. Thirty-six samples in COSMIC had a CDKN2A missense somatic mutation, of which 12 samples (33.3%) had a somatic mutation in a mismatch repair gene. We found that 3 of 12 samples (25%) with a somatic mutation in a mismatch repair gene had a functionally deleterious CDKN2A missense somatic mutation compared to 12 of 24 samples (50%) without a somatic mutation in a mismatch repair gene (Fisher’s exact test; P = 0.2821).
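
The mismatch repair comparison can be reproduced from the counts given above. The sketch below is a standard-library Python implementation of a two-sided Fisher's exact test, written for illustration. It assumes the two-sided P value is the sum of the probabilities of all tables no more likely than the observed one, and it is not the authors' analysis code.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def prob(k: int) -> float:
        # Hypergeometric probability of k in the top-left cell with margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / total_tables

    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    p_observed = prob(a)
    # Sum over all tables that are no more likely than the observed table;
    # the small tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(k) for k in range(k_min, k_max + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Rows: samples with / without a mismatch repair gene mutation.
# Columns: functionally deleterious / functionally neutral CDKN2A mutation.
p_value = fisher_exact_two_sided(3, 9, 12, 12)
print(f"P = {p_value:.4f}")  # P = 0.2821
```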

Discussion

VUSs in hereditary cancer susceptibility genes, predominantly rare missense variants, are a frequent finding in patients undergoing genetic testing and the cause of significant uncertainty. ACMG variant interpretation guidelines incorporate functional data, as well as other evidence such as in silico predictions of variant effect, to aid classification of variants as either pathogenic or benign. CDKN2A VUSs are a frequent finding in patients with PDAC. We previously found that over 40% of CDKN2A VUSs were functionally deleterious and therefore could be reclassified as likely pathogenic using ACMG variant interpretation guidelines. In this study, we developed a validated high-throughput in vitro assay and functionally characterized 2,964 CDKN2A missense variants, representing all possible single amino acid variants. We found that 1,182 missense variants (39.9%) were functionally deleterious. These pre-defined functional characterizations are a resource for the scientific community and can be integrated into variant interpretation schema, such as those from the ACMG, to aid classification of CDKN2A germline variants and somatic mutations.

Our comprehensive characterization of all possible CDKN2A missense variants allowed us to assess the ability of in silico algorithms (including the recently published machine learning based predictors AlphaMissense, ESM1b, and PrimateAI-3D) to predict the pathogenicity or functional effect of CDKN2A missense variants. We found that all in silico variant effect predictors assessed performed similarly. Overall, PolyPhen-2 had the highest sensitivity and negative predictive value at 0.90 and 0.82, respectively. PrimateAI-3D had the highest specificity and positive predictive value at 0.99 and 0.93, respectively. However, while its specificity and positive predictive value were high, PrimateAI-3D had the lowest sensitivity and negative predictive value at 0.15 and 0.65, respectively. Highest accuracy was observed with AlphaMissense at 70.9%, closely followed by VEST at 70.8%. Given that reclassification of VUSs in hereditary cancer genes into inappropriate strata has significant implications for patients, integration of in silico models, including those utilizing machine learning, may be premature. Ultimately, our data support current ACMG guidelines that include in silico predictions of variant effect as supporting evidence of pathogenicity or benign impact.

Our study also provides other insight for the implementation of variant interpretation guidelines. ACMG guidelines include presence of a missense variant at a residue with a previously reported pathogenic variant as moderate evidence of pathogenicity. We found that the mean percent of functionally deleterious missense variants per residue was 39.9% and that at only six amino acid residues were all missense variants functionally deleterious. These data suggest, at least for CDKN2A, that the presence of a pathogenic missense variant at a residue should be used with caution when classifying other missense variants at the same residue.

Our high-throughput functional assay characterized variants based upon a broad cellular phenotype, cell proliferation, in a single PDAC cell line. However, there appear to be limited cell-specific and assay-specific differences in functional classifications of CDKN2A variants. In our previous study, we characterized 29 CDKN2A VUSs in three PDAC cell lines, using cell proliferation and cell cycle assays, and found agreement between all functional classifications (Kimura et al., 2022). Nevertheless, our assay may not encompass all cellular functions of CDKN2A and may underestimate the number of functionally deleterious missense variants. Moreover, in this study, we found that functionally deleterious effects were enriched among somatic missense mutations compared to all CDKN2A missense variants, further supporting the veracity of our functional classifications. Furthermore, in silico models to predict variant effect may characterize function (CADD, PolyPhen-2, and SIFT) or pathogenicity (VEST, AlphaMissense, ESM1b, and PrimateAI-3D). As our assay does not directly characterize pathogenicity, to compare our classifications with in silico predictions, we assumed that predicted pathogenic variants were functionally deleterious.

In this study, we determined functional classifications for all possible CDKN2A missense variants to aid variant classification. Furthermore, comparison of our functional classifications to in silico variant effect predictors, including recently described algorithms based on machine learning, provides performance benchmarks and supports current recommendations integrating computational data into variant interpretation guidelines.

Methods

Cell lines

PANC-1 (American Type Culture Collection, Manassas, VA; catalog no. CRL-1469), a human PDAC cell line with a homozygous deletion of CDKN2A (Caldas et al., 1994), and 293T (American Type Culture Collection; catalog no. CRL-3216), a human embryonic kidney cell line, were maintained in Dulbecco’s modified Eagle’s medium (Thermo Fisher Scientific Inc., Waltham, MA; catalog no. 11995-065) supplemented with 10% fetal bovine serum (Thermo Fisher Scientific Inc.; catalog no. 26140-079). Cell line authentication and mycoplasma testing were performed using the GenePrint 10 System (Promega Corporation, Madison, WI; catalog no. B9510) and the PCR-based MycoDtect kit (Greiner Bio-One, Monroe, NC; catalog no. 463 060) (Genetics Resource Core Facility, The Johns Hopkins University, Baltimore, MD).

CDKN2A somatic mutation data

CDKN2A (p16^INK4a; NP_000068.1) missense somatic mutation data were obtained from the Catalogue Of Somatic Mutations In Cancer (Forbes et al., 2009), The Cancer Genome Atlas (Muddabhaktuni and Koyyala, 2021), patients with cancer undergoing sequencing at The Johns Hopkins University School of Medicine (Baltimore, MD), and the Memorial Sloan Kettering-Integrated Mutation Profiling of Actionable Cancer Targets Clinical Sequencing Cohort (Cheng et al., 2015).

Plasmids

pHAGE-CDKN2A (Addgene, Watertown, MA; plasmid no. 116726) was created by Gordon Mills & Kenneth Scott (Ng et al., 2018). pLJM1 (Addgene; plasmid no. 91980) was created by Joshua Mendell (Golden et al., 2017). pLentiV_Blast (Addgene; plasmid no. 111887) was created by Christopher Vakoc (Tarumoto et al., 2020). psPAX2 (Addgene; plasmid no. 12260) was created by Didier Trono, and pCMV-VSV-G (Addgene; plasmid no. 8454) was created by Bob Weinberg (Stewart et al., 2003).

CDKN2A expression plasmid libraries

CDKN2A cDNA from pHAGE-CDKN2A was subcloned into the pLJM1 plasmid as previously described (Kimura et al., 2022). Codon-optimized CDKN2A cDNA, based on the p16^INK4a amino acid sequence (NP_000068.1), was designed (Appendix 1-table 1) and pLJM1 containing codon optimized CDKN2A (pLJM1-CDKN2A) was generated (Twist Bioscience, South San Francisco, CA). Then, 156 plasmid libraries were synthesized using pLJM1-CDKN2A such that each library contained all 20 possible amino acid variants (19 missense and 1 synonymous) at a given position.
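
The library design fixes the number of variants assayed: 156 residues with 19 missense and 1 synonymous amino acid variant each. The short Python sketch below enumerates the variants at one residue and checks those totals. The function names are illustrative assumptions, and "synonymous" is counted at the amino acid level, as in the text.

```python
from typing import List, Tuple

# One-letter codes for the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def residue_library(wild_type: str, position: int) -> List[str]:
    """Name all 20 variants at one residue, e.g. 'p.V126D' or the synonymous 'p.V126V'."""
    if wild_type not in AMINO_ACIDS:
        raise ValueError(f"unknown amino acid: {wild_type!r}")
    return [f"p.{wild_type}{position}{aa}" for aa in AMINO_ACIDS]


def variant_counts(n_residues: int) -> Tuple[int, int, int]:
    """Return (missense, synonymous, total) variant counts for n_residues positions."""
    missense = n_residues * (len(AMINO_ACIDS) - 1)
    synonymous = n_residues
    return missense, synonymous, missense + synonymous


library_v126 = residue_library("V", 126)
print(len(library_v126), "p.V126D" in library_v126)  # 20 True
print(variant_counts(156))  # (2964, 156, 3120)
```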

Single variant CDKN2A expression plasmids

Individual pLJM1-CDKN2A expression constructs for CDKN2A variants p.L32L, p.L32P, p.G101G, p.G101W, p.V126D, and p.V126V were generated using the Q5 Site-Directed Mutagenesis kit (New England Biolabs, Ipswich, MA; catalog no. E0552). Primers used for site-directed mutagenesis are given in Appendix 1-table 7. Integration of each CDKN2A variant was confirmed by Sanger sequencing (Genewiz, South Plainfield, NJ) using the CMV Forward sequencing primer (CGCAAATGGGCGGTAGGCGTG). The manufacturer’s protocol was followed unless otherwise specified.

CellTag plasmid library

Twenty nonfunctional 9 base pair barcodes (“CellTags”) were subcloned into pLentiV_Blast using the Q5® Site-Directed Mutagenesis kit (New England Biolabs; catalog no. E0552) (Biddy et al., 2018). Primers used to generate each CellTag plasmid are given in Appendix 1-table 7. Integration of each CellTag was confirmed by Sanger sequencing (Genewiz) (sequencing primer: AACTGGGAAAGTGATGTCGTG). The manufacturer’s protocol was followed unless otherwise specified. CellTag plasmids were then pooled to form a CellTag plasmid library with equal representation of each CellTag plasmid.

Lentivirus production

Lentivirus production was performed as previously described with the following modifications (Kimura et al., 2022). pLJM1 lentiviral expression vectors (plasmid libraries and single variant expression plasmids) and lentiviral packaging vectors (psPAX2 and pCMV-VSV-G) were transfected into 293T cells using Lipofectamine 3000 Transfection Reagent (Thermo Fisher Scientific, Waltham, MA; catalog no. L3000008). Media was collected at 24 hours and 48 hours, pooled, and lentiviral particles concentrated using Lenti-X Concentrator (Clontech, Mountain View, CA; catalog no. 631231).

Lentiviral transduction

PANC-1 cells were used for CDKN2A plasmid library and single variant CDKN2A expression plasmid transductions. PANC-1 cells previously transduced with pLJM1-CDKN2A (PANC-1^CDKN2A) were used for CellTag library transductions. Briefly, 1 x 10⁵ cells were cultured in media supplemented with 10 µg/ml polybrene and transduced with x 10⁷ transducing units per mL of lentivirus particles. Cells were then centrifuged at 1,200 x g for 1 hour. After 48 hours of culture at 37°C and 5% CO₂, transduced cells were selected using 3 µg/ml puromycin (CDKN2A plasmid libraries and single variant CDKN2A expression plasmids) or 5 µg/ml blasticidin (CellTag plasmid library) for 7 days. After selection, cells were trypsinized for passage into T150 flasks and DNA collection. T150 flasks were cultured for 2-5 weeks until confluent and then DNA was collected. DNA was extracted from PANC-1 cells using the PureLink^TM Genomic DNA Mini Kit (Invitrogen, Carlsbad, CA; catalog no. K1820-01).

Generation of sequence libraries

Library preparation and sequencing were performed as previously described with the following modifications (Kinde et al., 2011). For the 1^st stage PCR, 3 target-specific primer pairs were designed to amplify CDKN2A amino acid positions 1 to 53, 54 to 110, and 111 to 156. Forward and reverse 1^st stage primers contained a 5’ sequence, M13F (GTAAAACGACGGCCAGC) and M13R (CAGGAAACAGCTATGAC) respectively, to enable the second stage of amplification and ligation of Illumina adapter sequences (Appendix 1-table 7). DNA was amplified with Q5 Hot Start High-Fidelity 2X Master Mix (New England Biolabs; catalog no. M0494S). For the 1^st stage PCR, each DNA sample was amplified in three reactions each containing 66 ng of DNA. 1^st stage PCR products for each sample were then pooled and purified using the Agencourt AMPure XP system (Beckman Coulter, Inc, Brea, CA; catalog no. A63881) into 50 µL of elution buffer. Purified PCR product was amplified in a 2^nd stage PCR to add Illumina adaptor sequences and indexes (Appendix 1-table 7). PCR amplification was performed with the KAPA HiFi HotStart PCR Kit (Kapa Biosystems, Wilmington, MA; catalog no. KK2501) in 25 µL reactions containing 5X KAPA HiFi Buffer - 5 µL, 10 mM KAPA dNTP Mix - 0.75 µL, 10 μM forward primer - 0.75 µL, 10 μM reverse primer - 0.75 µL. For the 1^st stage PCR, 66 ng of template DNA and 12.5 µL of Q5 Hot Start High-Fidelity 2X Master Mix were used with the following cycling conditions: 98 °C for 30 seconds; 25 cycles of 98 °C for 10 seconds, 72 °C for 30 seconds, 72 °C for 25 seconds; 72 °C for 2 minutes. For the 2^nd stage PCR, 0.25 µL of 1^st stage PCR product and 0.5 µL of 1 U/μL KAPA HiFi HotStart DNA Polymerase were used with the following cycling conditions: 95 °C for 3 minutes; 25 cycles of 98 °C for 20 seconds, 62 °C for 15 seconds, 72 °C for 1 minute. 2^nd stage PCR products were purified with the Agencourt AMPure XP system (Beckman Coulter, Inc.; catalog no. A63881) into 30 µL of elution buffer.
Samples were quantified by Qubit using the dsDNA HS assay kit (Invitrogen; catalog no. Q33230).

Sequencing and analysis

Sequence libraries were pooled into groups of 16 samples and sequenced on the Illumina MiSeq System (Illumina, San Diego, CA) with the MiSeq Reagent Kit v2 (300 cycles) (Illumina; catalog no. MS-102-2002) to generate 150 base pair paired-end reads. Samples were demultiplexed and FASTQ sequence read files were generated with MiSeq Control Software 2.5.0.5 (Illumina). Paired sequence reads were then combined into a single contiguous sequence using Paired-End reAd mergeR (PEAR) (Zhang et al., 2014). Reads supporting each variant at a given amino acid position were counted using Perl.
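The counting step can be illustrated with a short sketch. The study used Perl; the Python below is only an illustration under stated assumptions: merged reads are assumed to span a whole amplicon in frame, reads of unexpected length or with ambiguous bases are skipped, and the function name `count_amino_acids` and its arguments are hypothetical.

```python
from collections import Counter

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def count_amino_acids(reads, amplicon_length, codon_index):
    """Tally the amino acid encoded at one codon across merged reads.

    Assumes each read starts at the first base of codon 0 of the amplicon.
    Reads with an unexpected length (possible indels or truncation) and
    codons containing ambiguous bases are ignored.
    """
    counts = Counter()
    start = 3 * codon_index
    for read in reads:
        if len(read) != amplicon_length:
            continue
        amino_acid = CODON_TABLE.get(read[start:start + 3].upper())
        if amino_acid is not None:
            counts[amino_acid] += 1
    return counts


# Toy reads: three valid reads, one with an ambiguous base, one truncated.
toy_reads = ["ATGCTGCCC", "ATGCCGCCC", "ATGCTGCCC", "ATGCTNCCC", "ATGCTG"]
toy_counts = count_amino_acids(toy_reads, amplicon_length=9, codon_index=1)
```

Counts of this kind, taken at the initial time point and at confluence, are the raw inputs from which the variant ratios in the following section would be formed.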

Functional characterization of CDKN2A variants using a gamma generalized linear model

We determined whether a variant has a fitness advantage by assessing the significance of the observed ratio 𝑟_{𝑣,𝑐𝑓}, measured at confluence, between the number of cells with a missense variant 𝑣 and the number of cells with a synonymous variant at a given amino acid position. Under the assumption that the missense variant is neutral (null model), we assumed that the distribution of 𝑟_{𝑣,𝑐𝑓} can be explained by two key covariates: 𝑟_{𝑣,𝑖𝑛𝑖𝑡}, which represents the missense variant-to-synonymous variant ratio at day 9, and 𝑝_{𝑣,𝑖𝑛𝑖𝑡}, the proportion of cells with the missense variant among cells with any variant, including the synonymous variant, at the studied position. More specifically, given the variables 𝑟_{𝑣,𝑖𝑛𝑖𝑡} and 𝑝_{𝑣,𝑖𝑛𝑖𝑡}, the ratio at confluence follows a Gamma distribution:

where the mean 𝑢_𝑣 of the Gamma distribution is such that:

Here, the parameters of the null model to estimate are α, 𝑎, and 𝑏, where α is the shape parameter of the Gamma distribution and is assumed to be the same for all variants. This model is a Gamma Generalized Linear Model (GLM) over the response variable 𝑟_{𝑣,𝑐𝑓} with a log-link function and covariates 𝑙𝑜𝑔(𝑟_{𝑣,𝑖𝑛𝑖𝑡}) and 𝑙𝑜𝑔(𝑝_{𝑣,𝑖𝑛𝑖𝑡}). Estimating the parameters provides a null distribution of 𝑟_{𝑣,𝑐𝑓}, from which a p-value is generated for every observed 𝑟_{𝑣,𝑐𝑓} for any variant at a given position.
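The two displayed equations referred to in this section are not reproduced in this copy of the text. One form consistent with the verbal description (shape parameter α shared across variants, log link, covariates log 𝑟_{𝑣,𝑖𝑛𝑖𝑡} and log 𝑝_{𝑣,𝑖𝑛𝑖𝑡}, coefficients 𝑎 and 𝑏) would be the following. The shape-rate parameterization and the absence of a separate intercept are assumptions made here, not statements taken from the source.

```latex
r_{v,cf} \mid r_{v,init},\, p_{v,init} \;\sim\; \mathrm{Gamma}\!\left(\alpha,\ \frac{\alpha}{u_v}\right),
\qquad
\log(u_v) \;=\; a \,\log(r_{v,init}) \;+\; b \,\log(p_{v,init})
```

Under this parameterization (shape α, rate α/u_v) the mean of the distribution is u_v, as the text requires.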

To estimate the parameters α, 𝑎, and 𝑏, we utilized three control experiments in which the CellTag plasmid library was transduced into PANC-1^CDKN2Aco cells so that each CellTag represented a neutral variant. For a single experiment, each variant in turn can be considered as wild-type, and we test the other 19 variants against it, knowing that they are neutral and therefore follow the null distribution. This provides us with 19 × 20 = 380 triplets (𝑟_{𝑣,𝑐𝑓}, 𝑟_{𝑣,𝑖𝑛𝑖𝑡}, 𝑝_{𝑣,𝑖𝑛𝑖𝑡}) for every experiment, yielding 1140 data points when considering all three experiments together. To estimate the parameters using these 1140 data points, we fit the corresponding GLM using the sklearn.linear_model module.
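A fit of this kind can be sketched as follows. This is not the authors' code. It assumes that scikit-learn's `GammaRegressor` (a Gamma GLM with log link in `sklearn.linear_model`) is a reasonable stand-in for the estimator used, that no intercept is fitted, and that the shape parameter is recovered from the Pearson dispersion estimate. The data are synthetic and the "true" parameter values are arbitrary placeholders.

```python
import numpy as np
from sklearn.linear_model import GammaRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for the 1140 neutral control triplets (not study data).
n = 1140
r_init = rng.uniform(0.5, 2.0, size=n)    # initial missense/synonymous ratio
p_init = rng.uniform(0.02, 0.10, size=n)  # initial proportion of the variant
true_a, true_b, true_shape = 1.0, 0.0, 20.0  # hypothetical values
mu_true = np.exp(true_a * np.log(r_init) + true_b * np.log(p_init))
r_cf = rng.gamma(shape=true_shape, scale=mu_true / true_shape)

# Gamma GLM with log link: log(mean) = a*log(r_init) + b*log(p_init).
# alpha=0.0 disables the L2 penalty; fit_intercept=False is an assumption.
X = np.column_stack([np.log(r_init), np.log(p_init)])
glm = GammaRegressor(alpha=0.0, fit_intercept=False, max_iter=1000)
glm.fit(X, r_cf)
a_hat, b_hat = glm.coef_

# GammaRegressor does not report the shape parameter, so estimate the
# dispersion from Pearson residuals; for a Gamma GLM, shape = 1 / dispersion.
mu_hat = glm.predict(X)
dispersion = np.sum(((r_cf - mu_hat) / mu_hat) ** 2) / (n - X.shape[1])
shape_hat = 1.0 / dispersion
```

On the synthetic data, the fitted coefficients and shape should land close to the placeholder values used to simulate it, which gives a quick sanity check of the fitting procedure before it is applied to real control experiments.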

After the estimation of parameters α, 𝑎, and 𝑏, every observation of the triplet (𝑟_{𝑣,𝑐𝑓}, 𝑟_{𝑣,𝑖𝑛𝑖𝑡}, 𝑝_{𝑣,𝑖𝑛𝑖𝑡}) for a tested variant 𝑣 at a given position yields a p-value, defined as the probability of observing a ratio at confluence that is at least 𝑟_{𝑣,𝑐𝑓} given 𝑝_{𝑣,𝑖𝑛𝑖𝑡} and 𝑟_{𝑣,𝑖𝑛𝑖𝑡} under the null Gamma model. As some variants were tested in repeated experiments, we combined their associated p-values into a single p-value using Fisher’s method. Finally, to determine whether a variant presents a fitness advantage, we applied the Benjamini-Hochberg procedure to the p-values of all tested variants, fixing the false discovery rate at 0.05.
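The three steps in this paragraph (upper-tail p-value under the null Gamma model, Fisher combination across repeats, Benjamini-Hochberg selection) can be sketched as below. The fitted parameter values `SHAPE`, `A` and `B` are hypothetical placeholders, the function names are illustrative, and the no-intercept form of the mean is an assumption. SciPy's `gamma.sf` and `combine_pvalues` are used for the distribution tail and Fisher's method.

```python
import numpy as np
from scipy import stats

# Hypothetical fitted null parameters (placeholders, not values from the study).
SHAPE, A, B = 20.0, 1.0, 0.0


def null_p_value(r_cf, r_init, p_init):
    """Upper-tail probability P(R >= r_cf) under the null Gamma model."""
    mean = np.exp(A * np.log(r_init) + B * np.log(p_init))
    return stats.gamma.sf(r_cf, a=SHAPE, scale=mean / SHAPE)


def fisher_combine(p_values):
    """Combine independent p-values for one variant with Fisher's method."""
    return stats.combine_pvalues(p_values, method="fisher")[1]


def benjamini_hochberg(p_values, fdr=0.05):
    """Boolean mask of discoveries under the Benjamini-Hochberg step-up rule."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    passed = p[order] <= fdr * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        k = np.max(np.nonzero(passed)[0])  # largest rank meeting its threshold
        reject[order[: k + 1]] = True
    return reject
```

In use, each repeat of a variant would contribute one `null_p_value`, repeats would be merged with `fisher_combine`, and the resulting per-variant p-values would be passed together to `benjamini_hochberg`.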

Data visualization

A heat map of individual variant p-values by amino acid position was generated using R with the heatmaply package (Galili et al., 2018).

Cell proliferation assay

Cell proliferation assays were performed as previously described with the following modifications (Kimura et al., 2022). 1 × 10⁵ cells were seeded into in vitro culture on day 0. Cells were counted on day 14 using a TC20 Automated Cell Counter (Bio-Rad Laboratories, Hercules, CA; catalog no. 1450102). Relative cell proliferation value was calculated as cell number normalized to empty vector control. Assays were repeated in triplicate. Mean cell proliferation value and standard deviation (s.d.) were calculated.
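The normalization described above amounts to simple arithmetic, sketched here with made-up counts. How replicates were paired with controls is not stated in the text; this sketch assumes each replicate count is divided by the mean empty-vector count, and the function name is hypothetical.

```python
from statistics import mean, stdev


def relative_proliferation(variant_counts, empty_vector_counts):
    """Return (mean, sample s.d.) of cell counts normalized to empty vector.

    Assumption: every replicate is divided by the mean empty-vector count.
    """
    control_mean = mean(empty_vector_counts)
    relative = [count / control_mean for count in variant_counts]
    return mean(relative), stdev(relative)


# Made-up day 14 counts for three replicates (illustrative only).
rel_mean, rel_sd = relative_proliferation(
    variant_counts=[2e5, 4e5, 6e5],
    empty_vector_counts=[9e5, 1.0e6, 1.1e6],
)
```

With these toy numbers the relative proliferation is 0.4 with a standard deviation of 0.2, that is, the variant-expressing cells reach 40% of the empty-vector cell number.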

Variant effect predictions

Publicly available algorithms were used to predict the consequence of CDKN2A missense variants. Prediction algorithms used included: Combined Annotation Dependent Depletion (CADD) (Kircher et al., 2014), Polymorphism Phenotyping v2 (PolyPhen-2) (Adzhubei et al., 2010), Sorting Intolerant From Tolerant (SIFT) (Kumar et al., 2009), Variant Effect Scoring Tool (VEST) (Carter et al., 2013), AlphaMissense (Cheng et al., 2023), ESM1b (Brandes et al., 2023), and PrimateAI-3D (Gao et al., 2023) (Appendix 1-table 4). PolyPhen-2, SIFT, VEST, AlphaMissense, and ESM1b predictions were available for all missense variants. CADD scores were available for 910 missense variants and, where multiple CADD scores were possible, mean values were used. PrimateAI-3D prediction scores were available for 904 assayed missense variants.

Statistical analyses

Statistical analyses were performed using JMP v.11 (SAS, Cary, NC) and the Python statsmodels package (version 0.14.0). Cell proliferation value means were compared with the Student’s t-test. Proportions of functionally deleterious missense variants and somatic mutations were compared with the Z-test, and multiple test correction was performed with the Bonferroni method. P values < 0.05 were considered statistically significant.
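For illustration, a two-proportion Z-test with Bonferroni correction can be written with the Python standard library alone. This is a generic sketch rather than the analysis script used in the study (which relied on JMP and statsmodels). It assumes a pooled-variance, two-sided test, and the counts in the example are invented.

```python
import math


def two_proportion_z_test(successes_1, total_1, successes_2, total_2):
    """Two-sided Z-test for equality of two proportions (pooled variance)."""
    p1 = successes_1 / total_1
    p2 = successes_2 / total_2
    pooled = (successes_1 + successes_2) / (total_1 + total_2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_1 + 1 / total_2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def bonferroni(p_values):
    """Bonferroni-adjusted p-values, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Invented example: 40/100 deleterious in one group versus 20/100 in another.
z_stat, p_raw = two_proportion_z_test(40, 100, 20, 100)
p_adjusted = bonferroni([p_raw, 0.04, 0.5])
```

An adjusted p-value below 0.05 would then be read as a statistically significant difference in proportions, in line with the threshold stated above.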

Acknowledgements

Funding

National Institutes of Health grant P50CA62924 (NJR)

Susan Wojcicki and Dennis Troper (NR)

The Sol Goldman Pancreatic Cancer Research Center (NJR)

The Rolfe Pancreatic Cancer Foundation (NJR)

The Japanese Society of Gastroenterology Support for Young Gastroenterologists Studying in the United States (HK)

The Japan Society for the Promotion of Science Overseas Research Fellowships (HK)

Author contributions

Conceptualization: HK, KL, CT, NJR

Resources: CT, NJR

Data curation: HK, KL, CT, NJR

Formal analysis: HK, KL, CT, NJR

Investigation: HK, KL, CT, NJR

Visualization: HK

Methodology: HK, KL, CT, NJR

Writing-original draft: HK, NJR

Project administration: NJR

Writing-review and editing: HK, KL, CT, NJR

Competing interests

Authors declare that they have no competing interests.

Data and materials availability

All data are available in the main text or the supplementary materials.

Supplementary Information

Figures

Figure 1-figure supplement 1. Development and validation of high-throughput CDKN2A functional assay.

Figure 2-figure supplement 1. Functional characterization of all possible CDKN2A missense variants.

Figure 4-figure supplement 1. Functional classification of missense somatic mutations in CDKN2A.

Figure 4-figure supplement 2. Aggregate functional classifications for missense somatic mutations in CDKN2A.

Tables

Appendix 1-table 1. Codon optimized CDKN2A sequence.

Appendix 1-table 2. Assay outputs and functional classifications for all possible CDKN2A missense and synonymous variants.

Appendix 1-table 3. Benchmark pathogenic variants, benchmark benign variants, and variants of uncertain significance with previously reported function data.

Appendix 1-table 4. In silico variant effect predictions for CDKN2A missense variants.

Appendix 1-table 5. Assessment of in silico variant effect prediction models.

Appendix 1-table 6. Missense somatic mutations in CDKN2A reported in COSMIC, TCGA, JHU, MSK-IMPACT.

Appendix 1-table 7. Sequences of primers used in study.

Source data

Figure 1-source data

Raw data in Figure 1

Figure 1-figure supplement 1-source data 1

Raw data in Figure 1-figure supplement 1A

Figure 1-figure supplement 1-source data 2

Raw data in Figure 1-figure supplement 1B

Figure 2-source data 1

Raw data in Figure 2A

Figure 2-source data 2

Raw data in Figure 2B and C

Figure 2-source data 3

Raw data in Figure 2D

Figure 2-figure supplement 1-source data 1

Raw data in Figure 2-figure supplement 1B

Figure 2-figure supplement 1-source data 2

Raw data in Figure 2-figure supplement 1C

Figure 3-source data 1

Raw data in Figure 3

Figure 4-source data 1

Raw data in Figure 4

Figure 4-figure supplement 1-source data 1

Raw data in Figure 4-figure supplement 1

Figure 4-figure supplement 2-source data 1

Raw data in Figure 4-figure supplement 2

Development and validation of high-throughput CDKN2A functional assay.
(A) Cell proliferation of PANC-1 cells stably expressing empty expression vector, one of three synonymous variants (p.L32L, p.G101G, p.V126V), or one of three pathogenic variants (p.L32P, p.G101W, p.V126D) over 14 days in culture. Cell proliferation values are given as the mean of three repeats ± standard deviation, normalized to PANC-1 cells that stably express empty vector. Statistically significant inhibition of cell proliferation in PANC-1 cells that stably express synonymous variants is indicated (*; Student’s t-test, P value < 0.001). (B) PANC-1 cells stably expressing codon-optimized CDKN2A were transduced with a CellTag lentiviral library of 20 nonfunctional barcodes and cultured, and representation (percent of reads supporting each barcode) before and after a period of in vitro cell proliferation was determined by next-generation sequencing. Percent values are given as the mean of three repeats ± standard deviation.

Functional characterization of all possible CDKN2A missense variants.
(A) Schematic representation of CDKN2A ankyrin repeats. (B) Percent of functionally deleterious (black box) and functionally neutral variants (white box) within ankyrin repeats and non-ankyrin repeat regions of CDKN2A. Ank; Ankyrin repeat. (C) Box plot showing distribution of percent functionally deleterious missense variants per residue.

Functional classification of missense somatic mutations in CDKN2A.
(A-D) Percent of missense somatic mutations in CDKN2A reported in COSMIC (A), TCGA (B), JHU (C), or MSK-IMPACT (D) that were classified as functionally deleterious (black box) or functionally neutral (white box), grouped by tumor type. Cancer types with 10 or more missense somatic mutations in COSMIC are presented. The number of missense somatic mutations for each tumor type is given in parentheses. COSMIC; the Catalogue Of Somatic Mutations In Cancer, TCGA; The Cancer Genome Atlas, JHU; The Johns Hopkins University School of Medicine, MSK-IMPACT; Memorial Sloan Kettering-Integrated Mutation Profiling of Actionable Cancer Targets.

Aggregate functional classifications for missense somatic mutations in CDKN2A.
Percent of missense somatic mutations in CDKN2A that were classified as functionally deleterious (black box) or functionally neutral (white box), grouped by tumor type. Missense somatic mutations reported in COSMIC, TCGA, JHU, and MSK-IMPACT were combined. Cancer types with 10 or more missense somatic mutations in COSMIC are presented. The number of missense somatic mutations for each tumor type is given in parentheses. COSMIC; the Catalogue Of Somatic Mutations In Cancer, TCGA; The Cancer Genome Atlas, JHU; The Johns Hopkins University School of Medicine, MSK-IMPACT; Memorial Sloan Kettering-Integrated Mutation Profiling of Actionable Cancer Targets.

Assessment of in silico variant effect prediction models.

References

Adzhubei IA, Schmidt S, Peshkin L, Ramensky VE, Gerasimova A, Bork P, Kondrashov AS, Sunyaev SR. 2010. A method and server for predicting damaging missense mutations. Nat Methods 7:248–249. https://doi.org/10.1038/nmeth0410-248

Biddy BA, Kong W, Kamimoto K, Guo C, Waye SE, Sun T, Morris SA. 2018. Single-cell mapping of lineage and identity in direct reprogramming. Nature 564:219–224. https://doi.org/10.1038/s41586-018-0744-4

Brandes N, Goldman G, Wang CH, Ye CJ, Ntranos V. 2023. Genome-wide prediction of disease variant effects with a deep protein language model. Nat Genet 55:1512–1522. https://doi.org/10.1038/s41588-023-01465-0

Byeon IJL, Li J, Ericson K, Selby TL, Tevelev A, Kim HJ, O’Maille P, Tsai MD. 1998. Tumor suppressor p16INK4A: Determination of solution structure and analyses of its interaction with cyclin-dependent kinase 4. Mol Cell 1:421–431. https://doi.org/10.1016/S1097-2765(00)80042-8

Caldas C, Hahn SA, Luis T, Marks C, Schutte M, Seymour AB, Weinstein CL, Hruban RH, Yeo CJ, Kern SE. 1994. Frequent somatic mutations and homozygous deletions of the p16 (MTS1) gene in pancreatic adenocarcinoma. Nat Genet 8:27–32.

Carter H, Douville C, Stenson PD, Cooper DN, Karchin R. 2013. Identifying Mendelian disease genes with the variant effect scoring tool. BMC Genomics 14:S3. https://doi.org/10.1186/1471-2164-14-s3-s3

Chaffee KG, Oberg AL, McWilliams RR, Majithia N, Allen BA, Kidd J, Singh N, Hartman A-R, Wenstrup RJ, Petersen GM. 2018. Prevalence of Germline Mutations in Cancer Genes. Genet Med 20:119–127. https://doi.org/10.1038/gim.2017.85

Chang MT, Asthana S, Gao SP, Lee BH, Chapman JS, Kandoth C, Gao JJ, Socci ND, Solit DB, Olshen AB, Schultz N, Taylor BS. 2016. Identifying recurrent mutations in cancer reveals widespread lineage diversity and mutational specificity. Nat Biotechnol 34:155–163. https://doi.org/10.1038/nbt.3391

Cheng DT, Mitchell TN, Zehir A, Shah RH, Benayed R, Syed A, Chandramohan R, Liu ZY, Won HH, Scott SN, Rose Brannon A, O’Reilly C, Sadowska J, Casanova J, Yannes A, Hechtman JF, Yao J, Song W, Ross DS, Oultache A, Dogan S, Borsu L, Hameed M, Nafa K, Arcila ME, Ladanyi M, Berger MF. 2015. Memorial Sloan Kettering-Integrated Mutation Profiling of Actionable Cancer Targets (MSK-IMPACT): A hybridization capture-based next-generation sequencing clinical assay for solid tumor molecular oncology. J Mol Diagnostics 17:251–264. https://doi.org/10.1016/j.jmoldx.2014.12.006

Cheng J, Novati G, Pan J, Bycroft C, Žemgulytė A, Applebaum T, Pritzel A, Wong LH, Zielinski M, Sargeant T, Schneider RG, Senior AW, Jumper J, Hassabis D, Kohli P, Avsec Ž. 2023. Accurate proteome-wide missense variant effect prediction with AlphaMissense. Science 381:eadg7492. https://doi.org/10.1126/science.adg7492

Cubuk C, Garrett A, Choi S, King L, Loveday C, Torr B, Burghel GJ, Durkie M, Callaway A, Robinson R, Drummond J, Berry I, Wallace A, Eccles D, Tischkowitz M, Whiffin N, Ware JS, Hanson H, Turnbull C, CanVIG-UK. 2021. Clinical likelihood ratios and balanced accuracy for 44 in silico tools against multiple large-scale functional assays of cancer susceptibility genes. Genet Med 23:2096–2104. https://doi.org/10.1038/s41436-021-01265-z

Fåhraeus R, Paramio JM, Ball KL, Lain S, Lane DP. 1996. Inhibition of pRb phosphorylation and cell-cycle progression by a 20-residue peptide derived from P16CDKN2/INK4A. Curr Biol. https://doi.org/10.1016/s0960-9822(02)00425-6

Forbes SA, Tang G, Bindal N, Bamford S, Dawson E, Cole C, Kok CY, Jia M, Ewing R, Menzies A, Teague JW, Stratton MR, Futreal PA. 2009. COSMIC (the Catalogue of Somatic Mutations In Cancer): A resource to investigate acquired mutations in human cancer. Nucleic Acids Res 38:652–657. https://doi.org/10.1093/nar/gkp995

Galili T, O’Callaghan A, Sidi J, Sievert C. 2018. Heatmaply: An R package for creating interactive cluster heatmaps for online publishing. Bioinformatics 34:1600–1602. https://doi.org/10.1093/bioinformatics/btx657

Gao H, Hamp T, Ede J, Schraiber JG, McRae J, Singer-Berk M, Yang Y, Dietrich ASD, Fiziev PP, Kuderna LFK, Sundaram L, Wu Y, Adhikari A, Field Y, Chen C, Batzoglou S, Aguet F, Lemire G, Reimers R, Balick D, Janiak MC, Kuhlwilm M, Orkin JD, Manu S, Valenzuela A, Bergman J, Rousselle M, Silva FE, Agueda L, Blanc J, Gut M, de Vries D, Goodhead I, Harris RA, Raveendran M, Jensen A, Chuma IS, Horvath JE, Hvilsom C, Juan D, Frandsen P, de Melo FR, Bertuol F, Byrne H, Sampaio I, Farias I, do Amaral JV, Messias M, da Silva MNF, Trivedi M, Rossi R, Hrbek T, Andriaholinirina N, Rabarivola CJ, Zaramody A, Jolly CJ, Phillips-Conroy J, Wilkerson G, Abee C, Simmons JH, Fernandez-Duque E, Kanthaswamy S, Shiferaw F, Wu D, Zhou L, Shao Y, Zhang G, Keyyu JD, Knauf S, Le MD, Lizano E, Merker S, Navarro A, Bataillon T, Nadler T, Khor CC, Lee J, Tan P, Lim WK, Kitchener AC, Zinner D, Gut I, Melin A, Guschanski K, Schierup MH, Beck RMD, Umapathy G, Roos C, Boubli JP, Lek M, Sunyaev S, O’Donnell-Luria A, Rehm HL, Xu J, Rogers J, Marques-Bonet T, Farh KKH. 2023. The landscape of tolerated genetic variation in humans and primates. Science 380:eabn8197. https://doi.org/10.1126/science.abn8197

Goggins M, Overbeek KA, Brand R, Syngal S, Del Chiaro M, Bartsch DK, Bassi C, Carrato A, Farrell J, Fishman EK, Fockens P, Gress TM, Van Hooft JE, Hruban RH, Kastrinos F, Klein A, Lennon AM, Lucas A, Park W, Rustgi A, Simeone D, Stoffel E, Vasen HFA, Cahen DL, Canto MI, Bruno M. 2020. Management of patients with increased risk for familial pancreatic cancer: updated recommendations from the International Cancer of the Pancreas Screening (CAPS) Consortium. Gut 69:7–17. https://doi.org/10.1136/gutjnl-2019-319352

Golan T, Hammel P, Reni M, Van Cutsem E, Macarulla T, Hall MJ, Park J-O, Hochhauser D, Arnold D, Oh D-Y, Reinacher-Schick A, Tortora G, Algül H, O’Reilly EM, McGuinness D, Cui KY, Schlienger K, Locker GY, Kindler HL. 2019. Maintenance Olaparib for Germline BRCA-Mutated Metastatic Pancreatic Cancer. N Engl J Med 381:317–327. https://doi.org/10.1056/nejmoa1903387

Golden RJ, Chen B, Li T, Braun J, Manjunath H, Chen X, Wu J, Schmid V, Chang TC, Kopp F, Ramirez-Martinez A, Tagliabracci VS, Chen ZJ, Xie Y, Mendell JT. 2017. An Argonaute phosphorylation cycle promotes microRNA-mediated silencing. Nature 542:197–202. https://doi.org/10.1038/nature21025

Goldstein AM. 2004. Familial melanoma, pancreatic cancer and germline CDKN2A mutations. Hum Mutat 23:630. https://doi.org/10.1002/humu.9247

Horn IP, Marks DL, Koenig AN, Hogenson TL, Almada LL, Goldstein LE, Romecin Duran PA, Vera R, Vrabel AM, Cui G, Rabe KG, Bamlet WR, Mer G, Sicotte H, Zhang C, Li H, Petersen GM, Fernandez-Zapico ME. 2021. A rare germline CDKN2A variant (47T>G; p16-L16R) predisposes carriers to pancreatic cancer by reducing cell cycle inhibition. J Biol Chem 296:1–11. https://doi.org/10.1016/J.JBC.2021.100634

Hu C, Hart SN, Polley EC, Gnanaolivu R, Shimelis H, Lee KY, Lilyquist J, Na J, Moore R, Antwi SO, Bamlet WR, Chaffee KG, DiCarlo J, Wu Z, Samara R, Kasi PM, McWilliams RR, Petersen GM, Couch FJ. 2018. Association between inherited germline mutations in cancer predisposition genes and risk of pancreatic cancer. JAMA 319:2401–2409. https://doi.org/10.1001/jama.2018.6228

Jaffe A, Wojcik G, Chu A, Golozar A, Maroo A, Duggal P, Klein AP. 2011. Identification of functional genetic variation in exome sequence analysis. BMC Proc 5:9–13. https://doi.org/10.1186/1753-6561-5-S9-S13

Kimura H, Klein AP, Hruban RH, Roberts NJ. 2021. The Role of Inherited Pathogenic CDKN2A Variants in Susceptibility to Pancreatic Cancer. Pancreas 50:1123–1130. https://doi.org/10.1097/MPA.0000000000001888

Kimura H, Paranal RM, Nanda N, Wood LD, Eshleman JR, Hruban RH, Goggins MG, Klein AP, Brand R, Cote ML, Du M, Gallinger S, Goggins M, Kurtz RC, Petersen GM, Rustgi AK, Schwartz AG, Stoffel EM, Syngal S, Zogopoulos G, Roberts NJ. 2022. Functional CDKN2A assay identifies frequent deleterious alleles misclassified as variants of uncertain significance. eLife 11:1–16. https://doi.org/10.7554/eLife.71137

Kinde I, Wu J, Papadopoulos N, Kinzler KW, Vogelstein B. 2011. Detection and quantification of rare mutations with massively parallel sequencing. Proc Natl Acad Sci U S A 108:9530–9535. https://doi.org/10.1073/pnas.1105422108

Kircher M, Witten DM, Jain P, O’Roak BJ, Cooper GM, Shendure J. 2014. A general framework for estimating the relative pathogenicity of human genetic variants. Nat Genet 46:310–315. https://doi.org/10.1038/ng.2892

Kumar P, Henikoff S, Ng PC. 2009. Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm. Nat Protoc 4:1073–1082. https://doi.org/10.1038/nprot.2009.86

McWilliams RR, Wieben ED, Chaffee KG, Antwi SO, Raskin L, Olopade OI, Li D, Highsmith WE, Colon-Otero G, Khanna LG, Permuth JB, Olson JE, Frucht H, Genkinger J, Zheng W, Blot WJ, Wu L, Almada LL, Fernandez-Zapico ME, Sicotte H, Pedersen KS, Petersen GM. 2018. CDKN2A germline rare coding variants and risk of pancreatic cancer in minority populations. Cancer Epidemiol Biomarkers Prev 27:1364–1370. https://doi.org/10.1158/1055-9965.EPI-17-1065

Muddabhaktuni BMC, Koyyala VPB. 2021. The Cancer Genome Atlas. Indian J Med Paediatr Oncol 42:353–355. https://doi.org/10.1055/s-0041-1735440

Ng PKS, Li J, Jeong KJ, Shao S, Chen H, Tsang YH, Sengupta S, Wang Z, Bhavana VH, Tran R, Soewito S, Minussi DC, Moreno D, Kong K, Dogruluk T, Lu H, Gao J, Tokheim C, Zhou DC, Johnson AM, Zeng J, Ip CKM, Ju Z, Wester M, Yu S, Li Y, Vellano CP, Schultz N, Karchin R, Ding L, Lu Y, Cheung LWT, Chen K, Shaw KR, Meric-Bernstam F, Scott KL, Yi S, Sahni N, Liang H, Mills GB. 2018. Systematic Functional Annotation of Somatic Mutations in Cancer. Cancer Cell 33:450–462. https://doi.org/10.1016/j.ccell.2018.01.021

Richards S, Aziz N, Bale S, Bick D, Das S, Gastier-Foster J, Grody WW, Hegde M, Lyon E, Spector E, Voelkerding K, Rehm HL. 2015. Standards and guidelines for the interpretation of sequence variants: A joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet Med 17:405–424. https://doi.org/10.1038/gim.2015.30

Roberts NJ, Norris AL, Petersen GM, Bondy ML, Brand R, Gallinger S, Kurtz RC, Olson SH, Rustgi AK, Schwartz AG, Stoffel E, Syngal S, Zogopoulos G, Ali SZ, Axilbund J, Chaffee KG, Chen YC, Cote ML, Childs EJ, Douville C, Goes FS, Herman JM, Iacobuzio-Donahue C, Kramer M, Makohon-Moore A, McCombie RW, Wyatt Mcmahon K, Niknafs N, Parla J, Pirooznia M, Potash JB, Rhim AD, Smith AL, Wang Y, Wolfgang CL, Wood LD, Zandi PP, Goggins M, Karchin R, Eshleman JR, Papadopoulos N, Kinzler KW, Vogelstein B, Hruban RH, Klein AP. 2016. Whole genome sequencing defines the genetic heterogeneity of familial pancreatic cancer. Cancer Discov 6:166–175. https://doi.org/10.1158/2159-8290.CD-15-0402

Ruas M, Peters G. 1998. The p16(INK4a)/CDKN2A tumor suppressor and its relatives. Biochim Biophys Acta - Rev Cancer 1378. https://doi.org/10.1016/S0304-419X(98)00017-1

Shindo K, Yu J, Suenaga M, Fesharakizadeh S, Cho C, Macgregor-Das A, Siddiqui A, Witmer PD, Tamura K, Song TJ, Almario JAN, Brant A, Borges M, Ford M, Barkley T, He J, Weiss MJ, Wolfgang CL, Roberts NJ, Hruban RH, Klein AP, Goggins M. 2017. Deleterious germline mutations in patients with apparently sporadic pancreatic adenocarcinoma. J Clin Oncol 35:3382–3390. https://doi.org/10.1200/JCO.2017.72.3502

Stewart SA, Dykxhoorn DM, Palliser D, Mizuno H, Yu EY, An DS, Sabatini DM, Chen ISY, Hahn WC, Sharp PA, Weinberg RA, Novina CD. 2003. Lentivirus-delivered stable gene silencing by RNAi in primary cells. RNA 9:493–501. https://doi.org/10.1261/rna.2192803

Stoffel EM, Mckernin SE, Brand R, Canto M, Goggins M, Moravek C. 2019. Evaluating Susceptibility to Pancreatic Cancer: ASCO Provisional Clinical Opinion. J Clin Oncol 37:153–164. https://doi.org/10.1200/JCO.18.01489

Sun P, Nallar SC, Raha A, Kalakonda S, Velalar CN, Reddy SP, Kalvakolanu DV. 2010. GRIM-19 and p16INK4a synergistically regulate cell cycle progression and E2F1-responsive gene expression. J Biol Chem 285:27545–27552. https://doi.org/10.1074/jbc.M110.105767

Tarumoto Y, Lin S, Wang J, Milazzo JP, Xu Y, Lu B, Yang Z, Wei Y, Polyanskaya S, Wunderlich M, Gray NS, Stegmaier K, Vakoc CR. 2020. Salt-inducible kinase inhibition suppresses acute myeloid leukemia progression in vivo. Blood 135:56–70. https://doi.org/10.1182/blood.2019001576

Tung NM, Robson ME, Ventz S, Santa-Maria CA, Nanda R, Marcom PK, Shah PD, Ballinger TJ, Yang ES, Vinayak S, Melisko M, Brufsky A, DeMeo M, Jenkins C, Domchek S, D’Andrea A, Lin NU, Hughes ME, Carey LA, Wagle N, Wulf GM, Krop IE, Wolff AC, Winer EP, Garber JE. 2020. TBCRC 048: Phase II Study of Olaparib for Metastatic Breast Cancer and Mutations in Homologous Recombination-Related Genes. J Clin Oncol 38:4274–4282. https://doi.org/10.1200/JCO.20.02151

Tutt ANJ, Garber JE, Kaufman B, Viale G, Fumagalli D, Rastogi P, Gelber RD, de Azambuja E, Fielding A, Balmaña J, Domchek SM, Gelmon KA, Hollingsworth SJ, Korde LA, Linderholm B, Bandos H, Senkus E, Suga JM, Shao Z, Pippas AW, Nowecki Z, Huzarski T, Ganz PA, Lucas PC, Baker N, Loibl S, McConnell R, Piccart M, Schmutzler R, Steger GG, Costantino JP, Arahmani A, Wolmark N, McFadden E, Karantza V, Lakhani SR, Yothers G, Campbell C, Geyer CE. 2021. Adjuvant Olaparib for Patients with BRCA1- or BRCA2-Mutated Breast Cancer. N Engl J Med 384:2394–2405. https://doi.org/10.1056/nejmoa2105215

Wilcox EH, Sarmady M, Wulf B, Wright MW, Rehm HL, Biesecker LG, Abou Tayoun AN. 2022. Evaluating the impact of in silico predictors on clinical variant classification. Genet Med 24:924–930. https://doi.org/10.1016/j.gim.2021.11.018

Zhang J, Kobert K, Flouri T, Stamatakis A. 2014. PEAR: A fast and accurate Illumina Paired-End reAd mergeR. Bioinformatics 30:614–620. https://doi.org/10.1093/bioinformatics/btt593

Zhen DB, Rabe KG, Gallinger S, Syngal S, Schwartz AG, Goggins MG, Hruban RH, Cote ML, Mcwilliams RR, Roberts NJ, Cannon-Albright LA, Li D, Moyes K, Wenstrup RJ, Hartman A-R, Seminara D, Klein AP, Petersen GM. 2015. BRCA1, BRCA2, PALB2, and CDKN2A Mutations in Familial Pancreatic Cancer (FPC): A PACGENE Study. Genet Med 17:569–577. https://doi.org/10.1038/gim.2014.153

Article and author information

Author information

Hirokazu Kimura
Department of Pathology, the Johns Hopkins University School of Medicine; Baltimore, 21287, USA
Kamel Lahouel
Division of Integrated Genomics, Translational Genomics Research Institute; Phoenix, 85004, USA
Cristian Tomasetti
Department of Computational and Quantitative Medicine, Beckman Research Institute, City of Hope; Duarte, 91010, USA
Nicholas J. Roberts
Department of Pathology, the Johns Hopkins University School of Medicine; Baltimore, 21287, USA, Department of Oncology, the Johns Hopkins University School of Medicine; Baltimore, 21287, USA
ORCID iD: 0000-0002-8709-0664
- Corresponding author. Email: nrobert8@jhmi.edu

Version history

Preprint posted: December 28, 2023
Sent for peer review: December 28, 2023
Reviewed Preprint version 1: February 19, 2024
Reviewed Preprint version 2: November 27, 2024
Reviewed Preprint version 3: April 3, 2025
Version of Record published: April 16, 2025

Cite all versions

You can cite all versions using the DOI https://doi.org/10.7554/eLife.95347. This DOI represents all versions, and will always resolve to the latest one.

Copyright

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Not revised: This Reviewed Preprint includes the authors’ original preprint (without revision), an eLife assessment, and public reviews.

Reviewing Editor
Murim Choi
Seoul National University, Seoul, Republic of Korea
Senior Editor
Murim Choi
Seoul National University, Seoul, Republic of Korea

Reviewer #1 (Public Review):

Summary:
Kimura et al. performed a saturation mutagenesis study of CDKN2A to assess the functionality of all possible missense variants and compare them to previously identified pathogenic variants. They also compared their assay results with those from in silico predictors.

Strengths:
CDKN2A is an important gene that modulates cell cycle and apoptosis, therefore it is critical to accurately assess the functionality of missense variants. Overall, the paper reads well and touches upon major discoveries in a logical manner.

Weaknesses:
The paper lacks proper details for experiments and basic data, leaving the results less convincing. Analyses are superficial and do not provide variant-level resolution.

https://doi.org/10.7554/eLife.95347.1.sa1

Reviewer #2 (Public Review):

This study describes a deep mutational scan across CDKN2A using suppression of cell proliferation in pancreatic adenocarcinoma cells as a readout for CDKN2A function. The results are also compared to in silico variant predictors currently utilized by the current diagnostic frameworks to gauge these predictors' performance. The authors also functionally classify CDKN2A somatic mutations in cancers across different tissues.

This study is a potentially important contribution to the field of cancer variant interpretation for CDKN2A, but is almost impossible to review because of the severe lack of details regarding the methods and incompleteness of the data provided with the paper. We do believe that the cell proliferation suppression assay is robust and works, but when it comes to the screening of the library of CDKN2A variants the lack of primary data and experimental detail prevents assessment of the scientific merit and experimental rigor.

https://doi.org/10.7554/eLife.95347.1.sa0