Functional characterization of all CDKN2A missense variants and comparison to in silico models of pathogenicity

  1. Hirokazu Kimura
  2. Kamel Lahouel
  3. Cristian Tomasetti
  4. Nicholas Jason Roberts (corresponding author)
  1. Department of Pathology, the Johns Hopkins University School of Medicine, United States
  2. Division of Integrated Genomics, Translational Genomics Research Institute, United States
  3. Department of Computational and Quantitative Medicine, Beckman Research Institute, City of Hope, United States
  4. Department of Oncology, the Johns Hopkins University School of Medicine, United States
5 figures, 1 table and 14 additional files

Figures

Figure 1 with 2 supplements
Pooled analysis of CDKN2A variants at two residues with previously reported pathogenic and benign variants.

PANC-1 cells stably expressing 1 of 20 CDKN2A variants (19 missense variants and 1 synonymous variant) at residue p.V126 or p.R144 were cultured. Variant representation, as the percent of reads supporting the variant sequence, before and after a period of in vitro cell proliferation was determined by next-generation sequencing for the two residues, p.V126 (A) and p.R144 (B). CDKN2A variant p.V126D (*) was previously reported as pathogenic and increased in representation during in vitro proliferation. CDKN2A variant p.R144C (**) was previously reported as benign and maintained its representation during in vitro proliferation.
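The representation measure in this legend (percent of reads supporting each variant sequence, compared before and after proliferation) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names and the read counts are hypothetical and invented for the example.

```python
def percent_representation(read_counts):
    """Convert per-variant read counts into percent of total reads."""
    total = sum(read_counts.values())
    if total <= 0:
        raise ValueError("at least one read is required")
    return {variant: 100.0 * count / total for variant, count in read_counts.items()}


def representation_change(before_counts, after_counts):
    """Percent representation after minus before, for variants seen at both time points."""
    before = percent_representation(before_counts)
    after = percent_representation(after_counts)
    return {v: after[v] - before[v] for v in before if v in after}


# Hypothetical read counts, invented for illustration only (not data from the study).
early_counts = {"p.V126D": 50, "p.V126V": 50, "p.V126A": 100}
late_counts = {"p.V126D": 150, "p.V126V": 10, "p.V126A": 40}
change = representation_change(early_counts, late_counts)
```

In this toy example a positive value means the variant gained representation during culture, which is the pattern the legend describes for p.V126D.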

Figure 1—figure supplement 1
Development and validation of high-throughput CDKN2A functional assay.

(A) Cell proliferation of PANC-1 cells stably expressing empty expression vector, codon-optimized CDKN2A, one of three synonymous variants (p.L32L, p.G101G, p.V126V), or one of three pathogenic variants (p.L32P, p.G101W, p.V126D) over 14 days in culture. Cell proliferation values are given as the mean of three repeats ± standard deviation, normalized to PANC-1 cells that stably express empty vector. Statistically significant inhibition of cell proliferation in PANC-1 cells that stably express synonymous variants is indicated (*; p-value<0.001; Student's t-test). (B) PANC-1 cells stably expressing codon-optimized CDKN2A and transduced with a CellTag lentiviral library of 20 nonfunctional barcodes were cultured, and representation (percent of reads supporting each barcode) before (day 9) and after (day 45) a period of in vitro cell proliferation was determined using next-generation sequencing. Percent values are given as the mean of three repeats ± standard deviation.

Figure 1—figure supplement 2
Data for CDKN2A plasmid library.

(A) Dot plot showing proportion of each variant per residue in the plasmid libraries. (B) Variant proportion in plasmid libraries grouped in 0.5% increments. (C) Dot plot showing variant proportion in the amplified plasmid library compared to the day 9 cell pool. (D) Normalized fold change of variant proportion between day 9 cell pool and the amplified plasmid library based on American College of Medical Genetics (ACMG) classification.

Figure 2 with 5 supplements
Functional characterization of all possible CDKN2A missense variants.

(A) Functional classifications for 3120 CDKN2A variants, including 2964 missense variants and 156 synonymous variants. Variants were classified as functionally deleterious, indeterminate function, or neutral based on p-value using gamma generalized linear model (GLM). 525 (17.7%) variants were classified as functionally deleterious. (B) Log2 p-value (gamma GLM) for 32 benchmark pathogenic variants, 6 benign variants, 31 variants of uncertain significance (VUSs) previously reported to have functionally deleterious effects, and 18 VUSs previously reported to have functionally neutral effects. (C) Heatmap with p-values (gamma GLM) for all 3120 CDKN2A variants assayed.

Figure 2—figure supplement 1
p-Values for all possible CDKN2A missense variants.

(A) Distribution of log2 p-value (gamma GLM) for all possible CDKN2A missense variants. (B) Distribution of log2 p-value (gamma GLM) for benchmark pathogenic variants (red box), benchmark benign variants (blue box), variants of uncertain significance (VUSs) previously reported to have functionally deleterious effects (orange box), and VUSs previously reported to have functionally neutral effects (green box). (C) Dot plot showing log2 p-value (gamma GLM) of all possible CDKN2A missense variants per residue.

Figure 2—figure supplement 2
Normalized fold change for all possible CDKN2A missense variants.

(A) Dot plot showing log2 normalized fold change of all possible CDKN2A missense variants by residue. (B) Log2 normalized fold change for 32 benchmark pathogenic variants, 6 benign variants, 31 variants of uncertain significance (VUSs) previously reported to have functionally deleterious effects, and 18 VUSs previously reported to have functionally neutral effects. (C) Functional classifications for 3120 CDKN2A variants, including 2964 missense variants and 156 synonymous variants. Variants were classified as functionally deleterious, indeterminate function, or neutral based on log2 normalized fold change. (D) Comparison of functional classification of all possible CDKN2A missense variants by log2 p-value (gamma GLM) and log2 normalized fold change.

Figure 2—figure supplement 3
Reproducibility of CDKN2A assay.

(A) Dot plot showing log2 p-value (gamma GLM) for 560 CDKN2A missense variants assayed in duplicate. (B) Comparison of functional classifications for 560 CDKN2A missense variants assayed in duplicate. (C) Dot plot showing log2 normalized fold change for 560 CDKN2A missense variants assayed in duplicate.

Figure 2—figure supplement 4
Proportion of variants in the day 9 cell pool.

(A) Proportion of all 2964 possible CDKN2A missense variants in the day 9 cell pool (replicate 1 if duplicated). (B) Percent of functionally deleterious variants (black box), variants of indeterminate function (gray box), and functionally neutral variants (white box) by variant proportion in the day 9 cell pool (replicate 1 if duplicated). Left graph, variants grouped as <2% and ≥2% in the day 9 cell pool. Right graph, variants grouped as <2%, 1% intervals from 2% to 8%, and ≥8% in the day 9 cell pool.
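The grouping used in the right-hand graph of panel B (<2%, 1% intervals from 2% to 8%, and ≥8%) can be sketched as a simple binning step followed by a per-bin tally of functional classes. This is an illustrative sketch only: the function names are hypothetical, the treatment of bin edges (lower bound inclusive) is an assumption, and the example values are invented rather than taken from the study.

```python
from collections import Counter, defaultdict


def proportion_bin(percent):
    """Assign a day 9 variant proportion (in percent) to a bin label.

    Assumption: each 1% interval includes its lower bound, e.g. 2.0 falls in "2-3%".
    """
    if percent < 0:
        raise ValueError("proportion cannot be negative")
    if percent < 2:
        return "<2%"
    if percent >= 8:
        return ">=8%"
    lower = int(percent)  # floor, valid because percent is non-negative
    return f"{lower}-{lower + 1}%"


def class_percent_by_bin(variants):
    """Percent of each functional class within each bin.

    variants: iterable of (day9_percent, functional_class) pairs.
    """
    counts = defaultdict(Counter)
    for percent, functional_class in variants:
        counts[proportion_bin(percent)][functional_class] += 1
    return {
        label: {cls: 100.0 * n / sum(c.values()) for cls, n in c.items()}
        for label, c in counts.items()
    }


# Hypothetical values, invented for illustration only.
example = [(1.0, "deleterious"), (1.5, "neutral"), (2.5, "neutral"), (9.0, "indeterminate")]
summary = class_percent_by_bin(example)
```

The left-hand graph's two-group split (<2% and ≥2%) follows from the same idea with a single threshold.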

Figure 2—figure supplement 5
Functional characterization of all possible CDKN2A missense variants by ankyrin domain and residue.

(A) Schematic representation of CDKN2A with ankyrin repeats 1–4 represented. (B) Percent of functionally deleterious (black box), indeterminate function (gray box), and functionally neutral variants (white box) within ankyrin repeats and non-ankyrin repeat regions of CDKN2A. Ank: ankyrin repeat. (C) Dot plot showing distribution of percent functionally deleterious missense variants per residue.

Figure 3 with 2 supplements
Comparison of functional classifications and in silico variant effect predictions for all possible CDKN2A missense variants.

Variant effect predictions for CDKN2A missense variants using CADD, PolyPhen-2, SIFT, VEST, AlphaMissense, ESM1b, and PrimateAI-3D. Predicted deleterious, damaging, or pathogenic effects (black box) and predicted neutral, tolerated, benign, or ambiguous effects (white box) are presented as percent of missense variants with an available prediction. The number of missense variants with an available prediction for each in silico model is given in parentheses. Accuracy is shown as a red line. CADD: Combined Annotation Dependent Depletion; PolyPhen-2: Polymorphism Phenotyping v2; SIFT: Sorting Intolerant From Tolerant; VEST: Variant Effect Scoring Tool.
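As a rough illustration of the accuracy value plotted for each model, the sketch below scores binary predictions against the functional classifications, using only variants that have an available prediction. It is a hedged example, not the authors' analysis: the function name and data are hypothetical, and treating indeterminate and functionally neutral variants together as "not deleterious" is an assumption made here for simplicity.

```python
def prediction_accuracy(predicted_deleterious, functional_class):
    """Fraction of variants whose binary prediction agrees with the functional assay.

    predicted_deleterious: dict of variant -> bool, only for variants with a prediction.
    functional_class: dict of variant -> "deleterious", "indeterminate", or "neutral".
    Assumption: "indeterminate" and "neutral" both count as not deleterious.
    """
    shared = [v for v in predicted_deleterious if v in functional_class]
    if not shared:
        raise ValueError("no variants with both a prediction and an assay result")
    correct = sum(
        predicted_deleterious[v] == (functional_class[v] == "deleterious")
        for v in shared
    )
    return correct / len(shared)


# Hypothetical variants and calls, invented for illustration only.
predictions = {"var1": True, "var2": False, "var3": True, "var4": False}
assay = {
    "var1": "deleterious",
    "var2": "neutral",
    "var3": "indeterminate",
    "var4": "deleterious",
    "var5": "neutral",  # no prediction available, so it is excluded
}
acc = prediction_accuracy(predictions, assay)
```

Because each model has a different number of variants with available predictions, accuracy values computed this way are based on different denominators.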

Figure 3—figure supplement 1
Variant in silico predictions for seven algorithms.

(A) Number of algorithms predicting deleterious effect for 904 CDKN2A missense variants with predictions from seven algorithms. (B) Percent of functionally deleterious (black box) and indeterminate function or functionally neutral (white box) variants grouped by the number of algorithms predicting deleterious effect. (C) Number of algorithms predicting deleterious effect for 904 CDKN2A missense variants grouped by ankyrin repeats and non-ankyrin repeat regions. (D–H) Percent of functionally deleterious (black box) and indeterminate function or functionally neutral (white box) variants grouped by the number of algorithms predicting deleterious effect in Ank1 (D), Ank2 (E), Ank3 (F), Ank4 (G), and non-ankyrin repeat regions (H) of CDKN2A.
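The per-variant count of algorithms predicting a deleterious effect, restricted to variants scored by every algorithm, can be sketched as follows. This is a minimal illustration with hypothetical function names and invented calls; how each model's score is converted into a deleterious call is not covered here and is assumed to have been done beforehand.

```python
from collections import Counter


def count_deleterious_calls(calls_by_variant, algorithms):
    """Count deleterious calls per variant, keeping only variants scored by all algorithms.

    calls_by_variant: dict of variant -> dict of algorithm -> bool (None if unavailable).
    algorithms: list of algorithm names that must all have a prediction.
    """
    counts = {}
    for variant, calls in calls_by_variant.items():
        if any(calls.get(name) is None for name in algorithms):
            continue  # skip variants lacking a prediction from any algorithm
        counts[variant] = sum(bool(calls[name]) for name in algorithms)
    return counts


def variants_per_count(counts):
    """Number of variants at each count of deleterious calls."""
    return Counter(counts.values())


# Hypothetical calls for three of the models, invented for illustration only.
models = ["CADD", "SIFT", "VEST"]
calls = {
    "var1": {"CADD": True, "SIFT": True, "VEST": True},
    "var2": {"CADD": True, "SIFT": False, "VEST": False},
    "var3": {"CADD": False, "SIFT": None, "VEST": True},  # incomplete, excluded
    "var4": {"CADD": False, "SIFT": False, "VEST": False},
}
per_variant = count_deleterious_calls(calls, models)
distribution = variants_per_count(per_variant)
```

Requiring a prediction from every algorithm shrinks the variant set, which is consistent with the legends reporting fewer variants for seven algorithms than for five.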

Figure 3—figure supplement 2
Variant in silico predictions for five algorithms.

(A) Number of algorithms predicting deleterious effect for 2060 CDKN2A missense variants with predictions from five algorithms. (B) Percent of functionally deleterious (black box) and indeterminate function or functionally neutral (white box) variants grouped by the number of algorithms predicting deleterious effect. (C) Number of algorithms predicting deleterious effect for 2060 CDKN2A missense variants grouped by ankyrin repeats and non-ankyrin repeat regions. (D–H) Percent of functionally deleterious (black box) and indeterminate function or functionally neutral (white box) variants grouped by the number of algorithms predicting deleterious effect in Ank1 (D), Ank2 (E), Ank3 (F), Ank4 (G), and non-ankyrin repeat regions (H) of CDKN2A.

Figure 4 with 2 supplements
Functional classification of missense somatic mutations in CDKN2A.

(A) Somatic missense variants in CDKN2A reported in COSMIC, TCGA, JHU, or MSK-IMPACT, by functional classification (deleterious – black box; indeterminate – gray box; neutral – white box). (B) Distribution of functionally deleterious missense somatic mutations in CDKN2A reported in COSMIC, TCGA, JHU, or MSK-IMPACT by ankyrin (ANK) repeat. (C) Percent of missense somatic mutations in CDKN2A that were classified as functionally deleterious (black box), indeterminate function (gray box), or functionally neutral (white box) grouped by tumor type. Missense somatic mutations reported in COSMIC, TCGA, JHU, and MSK-IMPACT were combined. The number of missense somatic mutations for each tumor type is given in parentheses. COSMIC: the Catalogue Of Somatic Mutations In Cancer; TCGA: The Cancer Genome Atlas; JHU: The Johns Hopkins University School of Medicine; MSK-IMPACT: Memorial Sloan Kettering-Integrated Mutation Profiling of Actionable Cancer Targets.

Figure 4—figure supplement 1
Missense somatic mutations in CDKN2A.

(A) Percent of missense somatic mutations in CDKN2A reported in either COSMIC, TCGA, JHU, or MSK-IMPACT that were classified as pathogenic or likely pathogenic (black box), variant of uncertain significance (VUS) (gray box), or benign or likely benign (white box) using American College of Medical Genetics (ACMG) interpretation guidelines. (B) Percent of missense somatic mutations in CDKN2A that were classified as pathogenic or likely pathogenic (black box), VUS (gray box), or benign or likely benign (white box) using ACMG interpretation guidelines grouped by mutation database. (C) Number of patients with a pathogenic or likely pathogenic missense somatic mutation grouped by mutation database. Patients with p.His83Tyr mutation (black box), patients with p.Asp84Asn mutations (gray box), and patients with other mutations highlighted. COSMIC: the Catalogue Of Somatic Mutations In Cancer; TCGA: The Cancer Genome Atlas; JHU: The Johns Hopkins University School of Medicine; MSK-IMPACT: Memorial Sloan Kettering-Integrated Mutation Profiling of Actionable Cancer Targets.

Figure 4—figure supplement 2
Functional classification of missense somatic mutations in CDKN2A.

Percent of missense somatic mutations in CDKN2A reported in either COSMIC (A), TCGA (B), JHU (C), or MSK-IMPACT (D) that were classified as functionally deleterious (black box), indeterminate (gray box), or functionally neutral (white box) in our CDKN2A functional assay grouped by tumor type. The number of missense somatic mutations for each tumor type given in parentheses. COSMIC: the Catalogue Of Somatic Mutations In Cancer; TCGA: The Cancer Genome Atlas; JHU: The Johns Hopkins University School of Medicine; MSK-IMPACT: Memorial Sloan Kettering-Integrated Mutation Profiling of Actionable Cancer Targets.

Figure 5
CDKN2A synonymous and missense variants reported in gnomAD and ClinVar.

(A) Synonymous and missense variants in CDKN2A reported in gnomAD. (B) 287 CDKN2A missense variants reported in gnomAD, by American College of Medical Genetics (ACMG) guideline classification. (C) 264 missense variants in CDKN2A reported in gnomAD, by functional classification (deleterious – black box; indeterminate – gray box; neutral – white box). (D) 395 missense variants in CDKN2A reported in ClinVar, by functional classification (deleterious – black box; indeterminate – gray box; neutral – white box).

Tables

Key resources table
Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information
Gene (Homo sapiens) | CDKN2A | GenBank | Gene ID: 1029 NM_000077.5 NP_000068.1 |
Cell line (H. sapiens) | PANC-1 | American Type Culture Collection | Cat. #: CRL-1469; RRID:CVCL_0480 |
Cell line (H. sapiens) | 293T | American Type Culture Collection | Cat. #: CRL-3216; RRID:CVCL_0063 |
Recombinant DNA reagent | pHAGE-CDKN2A (plasmid) | Addgene | RRID:Addgene_116726 | Lentiviral vector expressing CDKN2A
Recombinant DNA reagent | pLentiV_Blast (plasmid) | Addgene | RRID:Addgene_111887 |
Recombinant DNA reagent | pLJM1-Empty (plasmid) | Addgene | RRID:Addgene_91980 |
Recombinant DNA reagent | psPAX2 (plasmid) | Addgene | RRID:Addgene_12260 |
Recombinant DNA reagent | pCMV-VSV-G (plasmid) | Addgene | RRID:Addgene_8454 |
Recombinant DNA reagent | pLJM1-CDKN2A (plasmid) | Twist Bioscience | | Lentiviral vector expressing codon-optimized CDKN2A
Recombinant DNA reagent | pLJM1-CDKN2A-Leu32Leu (plasmid) | This paper | | Lentiviral vector expressing CDKN2A-Leu32Leu
Recombinant DNA reagent | pLJM1-CDKN2A-Leu32Pro (plasmid) | This paper | | Lentiviral vector expressing CDKN2A-Leu32Pro
Recombinant DNA reagent | pLJM1-CDKN2A-Gly101Gly (plasmid) | This paper | | Lentiviral vector expressing CDKN2A-Gly101Gly
Recombinant DNA reagent | pLJM1-CDKN2A-Gly101Trp (plasmid) | This paper | | Lentiviral vector expressing CDKN2A-Gly101Trp
Recombinant DNA reagent | pLJM1-CDKN2A-Val126Asp (plasmid) | This paper | | Lentiviral vector expressing CDKN2A-Val126Asp
Recombinant DNA reagent | pLJM1-CDKN2A-Val126Val (plasmid) | This paper | | Lentiviral vector expressing CDKN2A-Val126Val
Recombinant DNA reagent | pLentiV-Blast-CellTag (plasmid) | This paper | | Lentiviral vector expressing CellTag
Sequence-based reagent | PCR primers | This paper | | See Supplementary file 7
Commercial assay or kit | GenePrint 10 System | Promega Corporation | Cat. #: B9510 |
Commercial assay or kit | PCR-based MycoDtect kit | Greiner Bio-One | Cat. #: 463 060 |
Commercial assay or kit | Q5 Site-Directed Mutagenesis kit | New England Biolabs | Cat. #: E0552 |
Commercial assay or kit | PureLink Genomic DNA Mini Kit | Invitrogen | Cat. #: K1820-01 |
Commercial assay or kit | KAPA HiFi HotStart PCR Kit | Kapa Biosystems | Cat. #: KK2501 |
Commercial assay or kit | Qubit dsDNA HS assay kit | Invitrogen | Cat. #: Q33230 |
Commercial assay or kit | MiSeq Reagent Kit v2 (300 cycles) | Illumina | Cat. #: MS-102-2002 |
Chemical compound, drug | Dulbecco’s modified Eagle’s medium | Gibco/Thermo Fisher | Cat. #: 11995-065 |
Chemical compound, drug | Lipofectamine 3000 Transfection Reagent | Thermo Fisher Scientific | Cat. #: L3000008 |
Chemical compound, drug | Q5 Hot Start High-Fidelity 2X Master Mix | New England Biolabs | Cat. #: M0494S |
Chemical compound, drug | Agencourt AMPure XP system | Beckman Coulter, Inc | Cat. #: A63881 |
Chemical compound, drug | Lenti-X Concentrator | Clontech | Cat. #: 631231 |
Chemical compound, drug | FxCycle Violet Ready Flow Reagent | Invitrogen | Cat. #: R37166 |
Software, algorithm | TC20 Automated Cell Counter | Bio-Rad Laboratories | Cat. #: 1450102 |
Software, algorithm | MiSeq System | Illumina | |
Software, algorithm | MiSeq control software | Illumina | | Version 2.5.0.5
Software, algorithm | R | The R Foundation | RRID:SCR_001905 | Version 4.2.0
Software, algorithm | JMP | SAS | RRID:SCR_014242 | Version 11
Software, algorithm | Python statsmodel package | The Python Software Foundation | RRID:SCR_016074 | Version 0.14.0

Additional files

Supplementary file 1

Assay outputs for CellTag experiments.

https://cdn.elifesciences.org/articles/95347/elife-95347-supp1-v1.xlsx
Supplementary file 2

Proportion of each variant in the initial plasmid library.

https://cdn.elifesciences.org/articles/95347/elife-95347-supp2-v1.xlsx
Supplementary file 3

Proportion of each variant in residues R24, H66, and A127.

https://cdn.elifesciences.org/articles/95347/elife-95347-supp3-v1.xlsx
Supplementary file 4

Assay outputs and functional classifications for all possible CDKN2A missense and synonymous variants.

https://cdn.elifesciences.org/articles/95347/elife-95347-supp4-v1.xlsx
Supplementary file 5

Day of confluency by experiment and residue.

https://cdn.elifesciences.org/articles/95347/elife-95347-supp5-v1.xlsx
Supplementary file 6

Normalized fold change for all possible CDKN2A missense and synonymous variants.

https://cdn.elifesciences.org/articles/95347/elife-95347-supp6-v1.xlsx
Supplementary file 7

In silico variant effect predictions for CDKN2A missense variants.

https://cdn.elifesciences.org/articles/95347/elife-95347-supp7-v1.xlsx
Supplementary file 8

Assessment of in silico variant effect prediction models.

https://cdn.elifesciences.org/articles/95347/elife-95347-supp8-v1.xlsx
Supplementary file 9

Missense somatic mutations in CDKN2A reported in COSMIC, TCGA, JHU, MSK-IMPACT.

https://cdn.elifesciences.org/articles/95347/elife-95347-supp9-v1.xlsx
Supplementary file 10

CDKN2A missense and synonymous variants reported in gnomAD.

https://cdn.elifesciences.org/articles/95347/elife-95347-supp10-v1.xlsx
Supplementary file 11

CDKN2A missense variants of uncertain significance (VUSs) reported in ClinVar.

https://cdn.elifesciences.org/articles/95347/elife-95347-supp11-v1.xlsx
Supplementary file 12

Codon-optimized CDKN2A sequence.

https://cdn.elifesciences.org/articles/95347/elife-95347-supp12-v1.xlsx
Supplementary file 13

Sequences of primers used in study.

https://cdn.elifesciences.org/articles/95347/elife-95347-supp13-v1.xlsx
MDAR checklist
https://cdn.elifesciences.org/articles/95347/elife-95347-mdarchecklist1-v1.docx

Cite this article

  1. Hirokazu Kimura
  2. Kamel Lahouel
  3. Cristian Tomasetti
  4. Nicholas Jason Roberts
(2025)
Functional characterization of all CDKN2A missense variants and comparison to in silico models of pathogenicity
eLife 13:RP95347.
https://doi.org/10.7554/eLife.95347.4