Sequence features of HOT loci. A) Distribution of conservation scores in loci bound by DAPs in HepG2 and K562. The logarithmic portion of the bins is expressed as the percentage of loci that each bin covers, averaged over the two cell lines. The reported correlation is the Pearson correlation coefficient. The shaded region represents HOT loci. B) phastCons conservation scores of regular enhancers, HOT loci, and exons. The values are normalized by the average score of regular enhancers. C) Classification performance (auROC and auPRC values) of HOT loci against backgrounds of DHS, promoter, and regular enhancer regions. The x-axis lists the classification methods. Methods starting with “seq -” are sequence-based (CNNs and gkmSVM; see Methods and main text). Methods starting with “feat -” use all sequence features (GC, CpG, GpC, CpG island). The values shown for feature-based SVMs were obtained with linear kernels.
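As an illustration of the “feat -” inputs named in panel C, the following is a minimal sketch of how GC, CpG, and GpC content might be computed for a single locus. The function name `sequence_features` and the exact definitions (base fraction for GC, dinucleotide fraction for CpG and GpC) are assumptions made for this example; the caption does not specify them. The CpG island feature is omitted because it would require an external annotation.

```python
def sequence_features(seq):
    """Return GC, CpG and GpC content of a DNA sequence.

    Hypothetical helper: the definitions below are assumptions,
    not necessarily those used to produce the figure.
    """
    s = seq.upper()
    n = len(s)
    if n < 2:
        raise ValueError("sequence must contain at least two bases")

    # GC content: fraction of bases that are G or C.
    gc = (s.count("G") + s.count("C")) / n

    # Dinucleotide content: fraction of adjacent base pairs
    # (overlapping windows of length 2) equal to CG or GC.
    dinucs = [s[i:i + 2] for i in range(n - 1)]
    cpg = dinucs.count("CG") / len(dinucs)
    gpc = dinucs.count("GC") / len(dinucs)

    return {"gc": gc, "cpg": cpg, "gpc": gpc}


features = sequence_features("ACGCGT")
```

Feature vectors of this kind, one per locus, could then be passed to a linear-kernel SVM to separate HOT loci from a chosen background set.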