Top Row. CADD scores. A. Empirical cumulative distribution functions of raw CADD scores for matching and non-matching variants across all genes, for Potential Set 1. Non-matching variants are further divided into stable and top variants. A score lower threshold of 1.0 and an upper threshold of 5.0 are used to improve visualization. B. For a given deleteriousness cutoff, the percentage of (i) all matching variants, (ii) all non-matching top variants, and (iii) all non-matching stable variants that are classified as deleterious. We use a sliding cutoff ranging from 10 to 20, as recommended by the CADD authors. Bottom Row. Empirical cumulative distribution functions of perturbation scores for the Enformer-predicted H3K27me3 ChIP-seq track. A score upper threshold of 0.015 and an empirical CDF lower threshold of 0.5 are used to improve visualization. C. Perturbation scores computed from predictions obtained by centering input sequences on the gene TSS as well as on its two flanking positions. D. Perturbation scores computed from predictions obtained by centering input sequences on the gene TSS only.
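The two summaries shown in panels A and B can be illustrated with a minimal sketch. It is not the code used to produce the figure: the function names `ecdf` and `percent_deleterious` and the toy scores are hypothetical, and the sketch assumes that a variant counts as deleterious when its score is greater than or equal to the cutoff.

```python
from bisect import bisect_left


def ecdf(scores):
    """Return sorted scores and the empirical CDF evaluated at each of them."""
    xs = sorted(scores)
    n = len(xs)
    # The i-th smallest value (0-based) has ECDF value (i + 1) / n.
    return xs, [(i + 1) / n for i in range(n)]


def percent_deleterious(scores, cutoffs):
    """Percent of scores at or above each cutoff (assumed deleteriousness rule)."""
    xs = sorted(scores)
    n = len(xs)
    # bisect_left counts the scores strictly below the cutoff.
    return [100.0 * (n - bisect_left(xs, c)) / n for c in cutoffs]


# Toy scores for one variant group (illustrative values only, not real data).
toy_scores = [2.0, 8.0, 12.0, 15.0, 22.0]
cutoffs = list(range(10, 21))  # sliding cutoff from 10 to 20

xs, cdf = ecdf(toy_scores)
pct = percent_deleterious(toy_scores, cutoffs)
for c, p in zip(cutoffs, pct):
    print(f"cutoff {c}: {p:.1f}% classified as deleterious")
```

Applying `percent_deleterious` separately to the matching, non-matching top, and non-matching stable groups would give one curve per group, as in panel B.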