Figures and data in The impact of stability considerations on genetic fine-mapping

12 figures, 14 tables and 1 additional file

Figures

Figure 1

An overview of our study of the impact of stability considerations on genetic fine-mapping.

(A) The two ways in which we perform fine-mapping, the first of which (colored in green) prioritizes the stability of variant discoveries to subpopulation perturbations. The data illustrates the case where there are two distinct environments, or subpopulations (denoted $E_{1}$ and $E_{2}$), that split the observations. (B) Key steps in our comparison of the stability-guided approach with the popular residualization approach.

Figure 2 with 32 supplements

Simulation study results.

(A) The frequency with which at least one causal variant is recovered in Potential Set 1 by Plain PICS and Stable PICS, across 1440 simulated gene expression datasets that incorporate ancestry-mediated environmental heterogeneity. Recovery frequencies are stratified by simulations differing in the number of causal variants, and the Venn diagram reports the number of matching and non-matching variants in Potential Set 1 across all simulations. (B) The frequency with which at least one causal variant is recovered in Potential Set 1 by Combined PICS, Stable PICS, and Top PICS, across 2400 simulated gene expression datasets. Recovery frequencies are stratified by the signal-to-noise ratio (SNR) parameter $ϕ$ used in simulations, and the Venn diagram reports the number of matching and non-matching variants in Potential Set 1 across all simulations. (C) The frequency with which at least one causal variant is recovered in Credible Set 1 by Stable SuSiE and Top SuSiE. The Venn diagram reports the number of matching and non-matching variants in Potential Set 1 across all simulations. (D) The frequency with which matching and non-matching variants in the first credible or potential set recover a causal variant, obtained by comparing the top and stable approaches for each algorithm. We report approximate 95% confidence intervals for each point estimate, computed as the estimate plus or minus 1.96 times its standard error.
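The interval construction mentioned at the end of this caption can be sketched as follows. This is a minimal illustration, assuming the standard error is the usual binomial standard error of a proportion; the caption does not state how the standard error was computed, and the function name and counts are hypothetical.

```python
import math

def recovery_ci(n_recovered, n_simulations, z=1.96):
    """Point estimate and approximate 95% interval for a recovery frequency."""
    p_hat = n_recovered / n_simulations
    # Binomial standard error of a proportion (assumption, not stated in the caption).
    se = math.sqrt(p_hat * (1.0 - p_hat) / n_simulations)
    half_width = z * se
    # Clip to [0, 1], since a frequency cannot fall outside that range.
    return p_hat, max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)

# Hypothetical counts, for illustration only.
p_hat, lower, upper = recovery_ci(n_recovered=150, n_simulations=200)
print(f"{p_hat:.3f} [{lower:.3f}, {upper:.3f}]")
```

The multiplier 1.96 is the 97.5th percentile of the standard normal distribution, which is why the interval is only approximate for frequencies near 0 or 1.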

Figure 2—figure supplement 1

Plain PICS vs Stable PICS (Potential Sets 2 and 3).

Frequency with which at least one causal variant is recovered in Potential Sets 2 and 3 by Plain PICS and Stable PICS, across 1440 simulated gene expression phenotypes. Recovery frequencies are stratified by simulations differing in the number of causal variants, but Venn diagrams report the number of matching and non-matching variants across all simulations.

Figure 2—figure supplement 2

Performance of PICS algorithms (Potential Sets 2 and 3).

Frequency with which at least one causal variant is recovered in Potential Sets 2 and 3 by Combined PICS, Stable PICS, and Top PICS, across 2400 simulated gene expression phenotypes. Recovery frequencies are stratified by simulations differing in the signal-to-noise ratio (SNR) parameter $ϕ$, but Venn diagrams report the number of matching and non-matching variants across all simulations.

Figure 2—figure supplement 3

Performance of SuSiE algorithms (Credible Sets 2 and 3).

Frequency with which at least one causal variant is recovered in Credible Sets 2 and 3 by Stable SuSiE and Top SuSiE, across 2400 simulated gene expression phenotypes. Recovery frequencies are stratified by simulations differing in the signal-to-noise ratio (SNR) parameter $ϕ$, but Venn diagrams report the number of matching and non-matching variants across all simulations.

Figure 2—figure supplement 4

Matching vs non-matching variants (Potential and Credible Sets 2 and 3).

Frequencies with which matching and non-matching variants in the credible or potential set recover a causal variant, obtained by comparing the top and stable approaches for each algorithm. Analysis is performed over 2400 simulated gene expression phenotypes, and recovery frequencies are stratified by simulations differing in the signal-to-noise ratio (SNR) parameter $ϕ$. (A) Credible or Potential Set 2. (B) Credible or Potential Set 3.

Figure 2—figure supplement 5

Stable PICS vs Stable SuSiE (one causal variant).

Empirical discrete probability distributions over the number of causal variants (0 or 1) recovered by stability-guided algorithms in simulations with one causal variant, stratified by the SNR parameter used in simulations (increasing SNR from left to right). The impact of including a greater number of credible or potential sets on the distribution is shown (increasing number of included sets from top to bottom).
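An empirical discrete distribution of this kind can be tabulated in a few lines. The sketch below assumes that a causal variant counts as recovered when it appears among the variants selected by the included sets; the helper names and toy variant identifiers are hypothetical and are not taken from the study.

```python
from collections import Counter

def count_recovered(selected_variants, causal_variants):
    """Number of distinct causal variants among the selected variants."""
    return len(set(selected_variants) & set(causal_variants))

def empirical_distribution(recovered_counts, n_causal):
    """Empirical probability of recovering 0, 1, ..., n_causal causal variants."""
    tally = Counter(recovered_counts)
    total = len(recovered_counts)
    return {k: tally.get(k, 0) / total for k in range(n_causal + 1)}

# Hypothetical toy simulations with one causal variant each:
# (variants selected across the included sets, true causal variants).
simulations = [
    (["rsA", "rsB"], ["rsA"]),
    (["rsC", "rsD"], ["rsE"]),
    (["rsF"], ["rsF"]),
    (["rsG", "rsH"], ["rsH"]),
]
counts = [count_recovered(sel, causal) for sel, causal in simulations]
dist = empirical_distribution(counts, n_causal=1)
print(dist)
```

Including more credible or potential sets enlarges the selected list for each simulation, so the distribution can only shift toward larger counts.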

Figure 2—figure supplement 6

Stable PICS vs Stable SuSiE (two causal variants).

Empirical discrete probability distributions over the number of causal variants (0, 1, or 2) recovered by stability-guided algorithms in simulations with two causal variants, stratified by the SNR parameter used in simulations (increasing SNR from left to right). The impact of including a greater number of credible or potential sets on the distribution is shown (increasing number of included sets from top to bottom).

Figure 2—figure supplement 7

Stable PICS vs Stable SuSiE (three causal variants).

Empirical discrete probability distributions over the number of causal variants (0, 1, 2, or 3) recovered by stability-guided algorithms in simulations with three causal variants, stratified by the SNR parameter used in simulations (increasing SNR from left to right). The impact of including a greater number of credible or potential sets on the distribution is shown (increasing number of included sets from top to bottom).

Figure 2—figure supplement 8

Matching Top vs Stable SNP posterior probabilities.

Posterior probabilities of matching top and stable variants across 2400 simulated gene expression phenotypes. Points are colored by the number of causal variants, $S \in \{1, 2, 3\}$, set in simulations.

Figure 2—figure supplement 9

Non-matching Top vs Stable SNP posterior probabilities.

Posterior probabilities of non-matching top and stable variants across 2400 simulated gene expression phenotypes. Points are colored by the number of causal variants, $S \in \{1, 2, 3\}$, set in simulations.

Figure 2—figure supplement 10

Stable PICS vs SuSiE (one causal variant).

Empirical discrete probability distributions over the number of causal variants (0 or 1) recovered by Stable PICS or Top SuSiE in simulations with one causal variant, stratified by the SNR parameter used in simulations (increasing SNR from left to right). The impact of including a greater number of credible or potential sets on the distribution is shown (increasing number of included sets from top to bottom).

Figure 2—figure supplement 11

Stable PICS vs SuSiE (two causal variants).

Empirical discrete probability distributions over the number of causal variants (0, 1, or 2) recovered by Stable PICS or Top SuSiE in simulations with two causal variants, stratified by the SNR parameter used in simulations (increasing SNR from left to right). The impact of including a greater number of credible or potential sets on the distribution is shown (increasing number of included sets from top to bottom).

Figure 2—figure supplement 12

Stable PICS vs SuSiE (three causal variants).

Empirical discrete probability distributions over the number of causal variants (0, 1, 2, or 3) recovered by Stable PICS or Top SuSiE in simulations with three causal variants, stratified by the SNR parameter used in simulations (increasing SNR from left to right). The impact of including a greater number of credible or potential sets on the distribution is shown (increasing number of included sets from top to bottom).

Figure 2—figure supplement 13

Distribution of the number of variants recovered by PICS (one causal variant).

Empirical discrete probability distributions over the number of causal variants (0 or 1) recovered by PICS algorithms in simulations with one causal variant, stratified by the SNR parameter used in simulations (increasing SNR from left to right). The impact of including a larger number of potential sets on the distribution is shown (increasing number of included sets from top to bottom).

Figure 2—figure supplement 14

Distribution of the number of variants recovered by PICS (two causal variants).

Empirical discrete probability distributions over the number of causal variants (0, 1, or 2) recovered by PICS algorithms in simulations with two causal variants, stratified by the SNR parameter used in simulations (increasing SNR from left to right). The impact of including a larger number of potential sets on the distribution is shown (increasing number of included sets from top to bottom).

Figure 2—figure supplement 15

Distribution of the number of variants recovered by PICS (three causal variants).

Empirical discrete probability distributions over the number of causal variants (0, 1, 2, or 3) recovered by PICS algorithms in simulations with three causal variants, stratified by the SNR parameter used in simulations (increasing SNR from left to right). The impact of including a greater number of potential sets on the distribution is shown (increasing number of included sets from top to bottom).

Figure 2—figure supplement 16

Distribution of the number of variants recovered by SuSiE (one causal variant).

Empirical discrete probability distributions over the number of causal variants (0 or 1) recovered by SuSiE algorithms in simulations with one causal variant, stratified by the SNR parameter used in simulations (increasing SNR from left to right). The impact of including a greater number of credible sets on the distribution is shown (increasing number of included sets from top to bottom).

Figure 2—figure supplement 17

Distribution of the number of variants recovered by SuSiE (two causal variants).

Empirical discrete probability distributions over the number of causal variants (0, 1, or 2) recovered by SuSiE algorithms in simulations with two causal variants, stratified by the SNR parameter used in simulations (increasing SNR from left to right). The impact of including a greater number of credible sets on the distribution is shown (increasing number of included sets from top to bottom).

Figure 2—figure supplement 18

Distribution of the number of variants recovered by SuSiE (three causal variants).

Empirical discrete probability distributions over the number of causal variants (0, 1, 2, or 3) recovered by SuSiE algorithms in simulations with three causal variants, stratified by the SNR parameter used in simulations (increasing SNR from left to right). The impact of including a greater number of credible sets on the distribution is shown (increasing number of included sets from top to bottom).

Figure 2—figure supplement 19

Variant recovery frequency of PICS and SuSiE matching and non-matching variants (one causal variant).

Frequency with which the causal variant is recovered by a matching variant, non-matching top variant, or non-matching stable variant in simulations involving one causal variant. Recovery frequencies are stratified by simulations differing in the signal-to-noise ratio (SNR) parameter $ϕ$.

Figure 2—figure supplement 20

Distribution of the number of variants recovered by PICS matching and non-matching variants (two causal variants).

Empirical discrete probability distributions over the number of causal variants (0, 1, or 2) recovered by matching Top and Stable PICS variants, non-matching Top PICS variants, and non-matching Stable PICS variants, in simulations involving two causal variants. Empirical distributions are stratified by simulations differing in the signal-to-noise ratio (SNR) parameter $ϕ$ (increasing SNR from left to right).

Figure 2—figure supplement 21

Distribution of the number of variants recovered by SuSiE matching and non-matching variants (two causal variants).

Empirical discrete probability distributions over the number of causal variants (0, 1, or 2) recovered by matching Top and Stable SuSiE variants, non-matching Top SuSiE variants, and non-matching Stable SuSiE variants, in simulations involving two causal variants. Empirical distributions are stratified by simulations differing in the signal-to-noise ratio (SNR) parameter $ϕ$ (increasing SNR from left to right).

Figure 2—figure supplement 22

Distribution of the number of variants recovered by PICS matching and non-matching variants (three causal variants).

Empirical discrete probability distributions over the number of causal variants (0, 1, 2, or 3) recovered by matching Top and Stable PICS variants, non-matching Top PICS variants, and non-matching Stable PICS variants, in simulations involving three causal variants. Empirical distributions are stratified by simulations differing in the signal-to-noise ratio (SNR) parameter $ϕ$ (increasing SNR from left to right).

Figure 2—figure supplement 23

Distribution of the number of variants recovered by SuSiE matching and non-matching variants (three causal variants).

Empirical discrete probability distributions over the number of causal variants (0, 1, 2, or 3) recovered by matching Top and Stable SuSiE variants, non-matching Top SuSiE variants, and non-matching Stable SuSiE variants, in simulations involving three causal variants. Empirical distributions are stratified by simulations differing in the signal-to-noise ratio (SNR) parameter $ϕ$ (increasing SNR from left to right).

Figure 2—figure supplement 24

Plain PICS vs Stable PICS in environmental heterogeneity simulations (one causal variant).

Empirical discrete probability distributions over the number of causal variants (0 or 1) recovered by Plain or Stable PICS in simulations with one causal variant and environmental heterogeneity, stratified by the SNR parameter used in simulations (increasing SNR from left to right). The impact of including a greater number of potential sets on the distribution is shown (increasing number of included sets from top to bottom).

Figure 2—figure supplement 25

Plain PICS vs Stable PICS in environmental heterogeneity simulations (two causal variants).

Empirical discrete probability distributions over the number of causal variants (0, 1, or 2) recovered by Plain or Stable PICS in simulations with two causal variants and environmental heterogeneity, stratified by the SNR parameter used in simulations (increasing SNR from left to right). The impact of including a greater number of potential sets on the distribution is shown (increasing number of included sets from top to bottom).

Figure 2—figure supplement 26

Plain PICS vs Stable PICS in environmental heterogeneity simulations (three causal variants).

Empirical discrete probability distributions over the number of causal variants (0, 1, 2, or 3) recovered by Plain or Stable PICS in simulations with three causal variants and environmental heterogeneity, stratified by the SNR parameter used in simulations (increasing SNR from left to right). The impact of including a larger number of potential sets on the distribution is shown (increasing number of included sets from top to bottom).

Figure 2—figure supplement 27

Plain PICS vs Stable PICS in ‘variance shift (t = 8)’ environmental heterogeneity simulations.

Empirical discrete probability distributions over the number of causal variants (0, 1, 2, or 3) recovered by Plain or Stable PICS in ‘$t = 8$’ simulations involving environmental heterogeneity (variance shift scenario), stratified by the SNR parameter used in simulations (increasing SNR from left to right). Each row reports the distribution for simulations with a specific number of causal variants (1, 2, or 3), and we use all three potential sets to compute the number of causal variants recovered in each case.

Figure 2—figure supplement 28

Plain PICS vs Stable PICS in ‘variance shift (t = 16)’ environmental heterogeneity simulations.

Empirical discrete probability distributions over the number of causal variants (0, 1, 2, or 3) recovered by Plain or Stable PICS in ‘$t = 16$’ simulations involving environmental heterogeneity (variance shift scenario), stratified by the SNR parameter used in simulations (increasing SNR from left to right). Each row reports the distribution for simulations with a specific number of causal variants (1, 2, or 3), and we use all three potential sets to compute the number of causal variants recovered in each case.

Figure 2—figure supplement 29

Plain PICS vs Stable PICS in ‘variance shift (t = 128)’ environmental heterogeneity simulations.

Empirical discrete probability distributions over the number of causal variants (0, 1, 2, or 3) recovered by Plain or Stable PICS in ‘$t = 128$’ simulations involving environmental heterogeneity (variance shift scenario), stratified by the SNR parameter used in simulations (increasing SNR from left to right). Each row reports the distribution for simulations with a specific number of causal variants (1, 2, or 3), and we use all three potential sets to compute the number of causal variants recovered in each case.

Figure 2—figure supplement 30

Plain PICS vs Stable PICS in ‘variance shift (t = 256)’ environmental heterogeneity simulations.

Empirical discrete probability distributions over the number of causal variants (0, 1, 2, or 3) recovered by Plain or Stable PICS in ‘$t = 256$’ simulations involving environmental heterogeneity (variance shift scenario), stratified by the SNR parameter used in simulations (increasing SNR from left to right). Each row reports the distribution for simulations with a specific number of causal variants (1, 2, or 3), and we use all three potential sets to compute the number of causal variants recovered in each case.

Figure 2—figure supplement 31

Plain PICS vs Stable PICS in ‘mean shift (|i − 3|)’ environmental heterogeneity simulations.

Empirical discrete probability distributions over the number of causal variants (0, 1, 2, or 3) recovered by Plain or Stable PICS in ‘$| i - 3 |$’ simulations involving environmental heterogeneity (mean shift scenario), stratified by the SNR parameter used in simulations (increasing SNR from left to right). Each row reports the distribution for simulations with a specific number of causal variants (1, 2, or 3), and we use all three potential sets to compute the number of causal variants recovered in each case.

Figure 2—figure supplement 32

Figure 3 with 2 supplements

Venn diagram showing the number of matching and non-matching variants for Potential Set 1 in GEUVADIS fine-mapped variants.

Figure 3—figure supplement 1

Matching GEUVADIS Top vs Stable SNP posterior probabilities.

Pair density plot of posterior probabilities of the top variant and the stable variant, for cases in which they match.

Figure 3—figure supplement 2

Non-matching GEUVADIS Top vs Stable SNP posterior probabilities.

Pair density plot of posterior probabilities of the top variant and the stable variant, for cases in which they do not match.

Figure 4

Distribution of computational VEP scores across matching and non-matching variants.

*Top row*. CADD scores. (A) Empirical cumulative distribution functions of raw CADD scores of matching and non-matching variants across all genes, for Potential Set 1. Non-matching variants are further divided into stable and top variants, with a score lower threshold of 1.0 and upper threshold of 5.0 used to improve visualization. (B) For a deleteriousness cutoff, the percent of (1) all matching variants, (2) all non-matching top variants, and (3) all non-matching stable variants, which are classified as deleterious. We use a sliding cutoff threshold ranging from 10 to 20 as recommended by CADD authors. For each value along the x-axis, 95% confidence intervals for point estimates on the y-axis were obtained using the Sison-Glaz method for constructing multinomial distribution standard errors (R command DescTools::MultinomCI(...)). *Bottom row*. Empirical cumulative distribution functions of perturbation scores of Enformer-predicted H3K27me3 ChIP-seq track. Score upper threshold of 0.015 and empirical CDF lower threshold of 0.5 used to improve visualization. (C) Perturbation scores computed from predictions based on centering input sequences on the gene TSS as well as its two flanking positions. (D) Perturbation scores computed from predictions based on centering input sequences on the gene TSS only.

Figure 5

Comparison of CADD scores across non-matching top and stable variants.

(A) Paired scatterplot of raw CADD scores of the top and stable variants for each gene, for Potential Set 1. (B) Percent of genes that are classified as (1) having a deleterious top variant only, (2) having a deleterious stable variant only, and (3) having both top and stable variants deleterious, using a sliding cutoff threshold ranging from 10 to 20 as recommended by CADD authors.
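The three-way classification in panel (B) can be sketched as below. This is an illustration only: it assumes the cutoff is applied to PHRED-scaled CADD scores and that a variant is called deleterious when its score is at or above the cutoff. Neither detail is stated in the caption, and the score pairs are hypothetical.

```python
def classify_genes(score_pairs, cutoff):
    """Percent of genes whose top variant only, stable variant only,
    or both variants score at or above the cutoff."""
    counts = {"top_only": 0, "stable_only": 0, "both": 0}
    for top_score, stable_score in score_pairs:
        top_deleterious = top_score >= cutoff      # assumption: inclusive cutoff
        stable_deleterious = stable_score >= cutoff
        if top_deleterious and stable_deleterious:
            counts["both"] += 1
        elif top_deleterious:
            counts["top_only"] += 1
        elif stable_deleterious:
            counts["stable_only"] += 1
    n_genes = len(score_pairs)
    return {label: 100.0 * c / n_genes for label, c in counts.items()}

# Hypothetical (top variant score, stable variant score) pairs, one per gene.
pairs = [(12.0, 8.0), (9.0, 15.0), (18.0, 21.0), (3.0, 4.0)]

# Slide the cutoff from 10 to 20, as in the caption.
sweep = {cutoff: classify_genes(pairs, cutoff) for cutoff in range(10, 21)}
print(sweep[10])
print(sweep[20])
```

Genes in which neither variant reaches the cutoff fall into none of the three classes, so the three percentages need not sum to 100.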

Figure 6

Visual summary of the PICS algorithm described in Probabilistic Identification of Causal SNPs.

(A) Breakdown of the calculation of the probability of a focal SNP $A_{i}$ being causal. (B) Illustration of the permutation procedure used to generate the null distribution. An example $N \times P$ genotype array with $N = P = 6$ is used, with two valid row shuffles, or permutations, of the original array shown. Entries affected by the shuffle are highlighted, as is the focal SNP ($A_{3}$).
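The row-shuffling idea in panel (B) can be illustrated with a generic permutation null. The sketch below is not the PICS computation: it uses the Pearson correlation between the focal SNP and the phenotype as a stand-in statistic, and the genotype array, phenotype values, and function names are all hypothetical.

```python
import random

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5

def permutation_null(genotypes, phenotype, focal, n_perm, seed=0):
    """Null distribution of the stand-in statistic under row shuffles of the genotype array."""
    rng = random.Random(seed)
    rows = list(genotypes)  # shallow copy, so the caller's array keeps its row order
    null = []
    for _ in range(n_perm):
        # Whole rows move together, so relationships among SNP columns are preserved
        # while the link between genotypes and phenotype is broken.
        rng.shuffle(rows)
        focal_column = [row[focal] for row in rows]
        null.append(pearson(focal_column, phenotype))
    return null

# Hypothetical 6 x 6 genotype array (rows: individuals, columns: SNPs A_1..A_6).
genotypes = [
    [0, 1, 2, 0, 1, 0],
    [1, 1, 0, 2, 0, 1],
    [2, 0, 1, 1, 2, 0],
    [0, 2, 1, 0, 1, 2],
    [1, 0, 0, 2, 2, 1],
    [2, 1, 2, 1, 0, 0],
]
phenotype = [0.3, 1.1, 2.0, 0.2, 0.9, 1.8]
focal = 2  # zero-based index of the focal SNP A_3

observed = pearson([row[focal] for row in genotypes], phenotype)
null = permutation_null(genotypes, phenotype, focal, n_perm=1000)
# Two-sided empirical p-value with the usual +1 correction.
p_value = (1 + sum(abs(v) >= abs(observed) for v in null)) / (1 + len(null))
```

Shuffling rows rather than individual entries is the design choice that matters here: it keeps each individual's genotype vector intact.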

Author response image 1

Author response image 2

Author response image 3

Author response image 4

Author response image 5

Author response image 6

Tables

Table 1

A list of 378 functional annotations across which the biological significance of stable and top fine-mapped single nucleotide polymorphisms is compared.

Annotations that report multiple scores have the total number of scores reported shown in parentheses. Scores mined from the FAVOR database (Zhou et al., 2023) are indicated by an asterisk (TSS = transcription start site, bp = base pair).

Functional annotation type	Functional annotation
Ensembl	Distance to Canonical TSS (Cunningham et al., 2022)
Ensembl	Regulatory Features (6; Cunningham et al., 2022)
Computational predictions	CADD∗ (2; Rentzsch et al., 2019)
	SIFTVal∗ (Ng and Henikoff, 2003)
	FATHMM-XF∗ (Rogers et al., 2018)
	LINSIGHT∗ (Huang et al., 2017)
	Polyphen∗ (Adzhubei et al., 2010)
	PhyloP∗ (3; Pollard et al., 2010)
	Gerp∗ (2; Davydov et al., 2010)
	B Statistic∗ (McVicker et al., 2009)
	FunSeq2∗ (Fu et al., 2014)
	ALoFT∗ (Balasubramanian et al., 2017)
	Percent CpG in 75 bp window∗ (Rentzsch et al., 2019)
	Percent GC in 75 bp window∗ (Rentzsch et al., 2019)
	FIRE (Ioannidis et al., 2017)
	Enformer (177 tracks × 2 scores per track; Avsec et al., 2021)
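The total of 378 can be checked against the rows of this table. The tally below assumes that each annotation without a count in parentheses contributes a single score; the dictionary keys are shortened labels for the table rows.

```python
# Number of scores per annotation, read from the table above.
# Assumption: annotations listed without a parenthesized count contribute one score.
annotation_scores = {
    "Distance to Canonical TSS": 1,
    "Regulatory Features": 6,
    "CADD": 2,
    "SIFTVal": 1,
    "FATHMM-XF": 1,
    "LINSIGHT": 1,
    "Polyphen": 1,
    "PhyloP": 3,
    "Gerp": 2,
    "B Statistic": 1,
    "FunSeq2": 1,
    "ALoFT": 1,
    "Percent CpG in 75 bp window": 1,
    "Percent GC in 75 bp window": 1,
    "FIRE": 1,
    "Enformer": 177 * 2,  # 177 tracks x 2 scores per track
}

total = sum(annotation_scores.values())
print(total)  # 378
```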

Table 2

List of six moderating factors considered.

Moderator	Quantity/statistic computed
(1) Degree of Stability	No. subpopulations for which stable variant has positive probability
(2) Population Diversity	Maximum of pairwise allele frequency difference between subpopulations for which stable variant has positive posterior probability
(3) Population Differentiation	Maximum $F_{ST}$ between subpopulations for which stable variant has positive posterior probability
(4) Inclusion of Distal Subpopulations (Top)	Whether or not the top variant also had positive probability in Yoruban subpopulation when the stability-guided approach was used
(5) Inclusion of Distal Subpopulations (Stable)	Whether or not the stable variant had positive probability in Yoruban subpopulation when the stability-guided approach was used
(6) Degree of Certainty of Causality Using Residualization Approach	Posterior probability of top variant
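As an illustration of moderator (2), the maximum pairwise allele frequency difference can be computed as sketched below. The allele frequencies and posterior probabilities are hypothetical, and returning 0.0 when fewer than two subpopulations qualify is a convention chosen for this sketch, not something the table specifies.

```python
from itertools import combinations

def max_af_difference(allele_freqs, posterior_probs):
    """Largest absolute allele frequency difference between any two subpopulations
    in which the stable variant has positive posterior probability."""
    supported = [pop for pop, pp in posterior_probs.items() if pp > 0]
    if len(supported) < 2:
        return 0.0  # convention for this sketch: no pair to compare
    return max(
        abs(allele_freqs[a] - allele_freqs[b])
        for a, b in combinations(supported, 2)
    )

# Hypothetical values for a single stable variant.
allele_freqs = {"YRI": 0.10, "TSI": 0.25, "GBR": 0.30, "FIN": 0.32, "CEU": 0.28}
posterior_probs = {"YRI": 0.0, "TSI": 0.4, "GBR": 0.2, "FIN": 0.1, "CEU": 0.3}

diversity = max_af_difference(allele_freqs, posterior_probs)
```

In this toy example YRI is excluded because the posterior probability there is zero, so the maximum is taken over TSI, GBR, FIN, and CEU only.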

Appendix 12—table 1

Plain and Stable PICS matching frequencies.

This table reports the frequencies with which Plain and Stable PICS have matching variants for the same potential set. The numbers of matching variants for each SNR scenario are reported in parentheses. The bottom two rows show matching frequencies when results are stratified by posterior probability (PP) of the Plain PICS variant. The numbers of matching variants for each PP stratum are reported in parentheses.

Stratified by signal-to-noise ratio (SNR) of simulations
	Potential Set 1	Potential Set 2	Potential Set 3
SNR = 0.053	0.736 (265)	0.803 (289)	0.797 (287)
SNR = 0.111	0.775 (279)	0.753 (271)	0.758 (273)
SNR = 0.25	0.903 (325)	0.714 (257)	0.728 (262)
SNR = 0.667	0.906 (326)	0.753 (271)	0.744 (268)
Stratified by posterior probability (PP) of plain PICS variant
p > 0.9	0.978 (441)	0.899 (286)	0.927 (307)
p ≤ 0.9	0.762 (754)	0.715 (802)	0.706 (783)
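The SNR rows of this table can be reproduced from the counts in parentheses. The sketch below assumes the 1440 simulations are split evenly across the four SNR values (360 each); that denominator is inferred from the reported frequencies rather than stated in the table.

```python
def matching_frequency(n_matching, n_simulations):
    """Matching frequency rounded to three decimals, as reported in the table."""
    return round(n_matching / n_simulations, 3)

# Counts of matching variants for Potential Set 1, keyed by SNR (from the table).
potential_set_1_counts = {0.053: 265, 0.111: 279, 0.25: 325, 0.667: 326}

# Assumption: 1440 simulations divided evenly over the four SNR settings.
n_per_snr = 1440 // len(potential_set_1_counts)

frequencies = {
    snr: matching_frequency(count, n_per_snr)
    for snr, count in potential_set_1_counts.items()
}
print(frequencies)
```

Under this assumption the computed values agree with the Potential Set 1 column (0.736, 0.775, 0.903, 0.906).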

Appendix 12—table 2

Stable and Top PICS matching frequencies.

This table reports the frequencies with which Stable and Top PICS have matching variants for the same potential set. The numbers of matching variants for each SNR/‘No. Causal Variants’ scenario are reported in parentheses.

Stratified by signal-to-noise ratio (SNR) and No. Causal Variants (S) in simulations
		Potential Set 1	Potential Set 2	Potential Set 3
One causal variant (S = 1)	SNR = 0.053	0.695 (139)	0.41 (82)	0.22 (44)
	SNR = 0.111	0.75 (150)	0.405 (81)	0.24 (48)
	SNR = 0.25	0.825 (165)	0.45 (90)	0.26 (52)
	SNR = 0.667	0.895 (179)	0.405 (81)	0.275 (55)
Two causal variants (S = 2)	SNR = 0.053	0.545 (109)	0.36 (72)	0.225 (45)
	SNR = 0.111	0.68 (136)	0.38 (76)	0.215 (43)
	SNR = 0.25	0.79 (158)	0.435 (87)	0.27 (54)
	SNR = 0.667	0.78 (156)	0.41 (82)	0.26 (52)
Three causal variants (S = 3)	SNR = 0.053	0.565 (113)	0.37 (74)	0.245 (49)
	SNR = 0.111	0.655 (131)	0.36 (72)	0.265 (53)
	SNR = 0.25	0.72 (144)	0.39 (78)	0.22 (44)
	SNR = 0.667	0.785 (157)	0.48 (96)	0.255 (51)

Appendix 12—table 3

Stable and Top SuSiE matching frequencies.

This table reports the frequencies with which Stable and Top SuSiE have matching variants for the same potential set. The numbers of matching variants for each SNR/‘No. Causal Variants’ scenario are reported in parentheses.

Stratified by signal-to-noise ratio (SNR) and No. Causal Variants (S) in simulations
		Potential Set 1	Potential Set 2	Potential Set 3
One causal variant (S = 1)	SNR = 0.053	0.55 (110)	0.115 (23)	0.095 (19)
	SNR = 0.111	0.725 (145)	0.14 (28)	0.095 (19)
	SNR = 0.25	0.84 (168)	0.145 (29)	0.17 (34)
	SNR = 0.667	0.875 (175)	0.14 (28)	0.19 (38)
Two causal variants (S = 2)	SNR = 0.053	0.505 (101)	0.095 (19)	0.065 (13)
	SNR = 0.111	0.73 (146)	0.14 (28)	0.095 (19)
	SNR = 0.25	0.875 (175)	0.33 (66)	0.105 (21)
	SNR = 0.667	0.875 (175)	0.425 (85)	0.115 (23)
Three causal variants (S = 3)	SNR = 0.053	0.345 (69)	0.075 (15)	0.085 (17)
	SNR = 0.111	0.68 (136)	0.23 (46)	0.145 (29)
	SNR = 0.25	0.795 (159)	0.37 (74)	0.185 (37)
	SNR = 0.667	0.85 (170)	0.585 (117)	0.25 (50)

Appendix 12—table 4

Off-diagonal matching frequencies and causal variant recovery.

This table reports the number of Stable and Top PICS non-matching variants that match across different, or ‘off-diagonal’, potential sets. Frequencies are computed across simulations with the same number of causal variants (S = 1, 2, or 3), with numbers along the yellow-shaded diagonal reporting the number of non-matching variants between the same potential sets. Each off-diagonal element reports both the number of matching variants for the pair of potential sets listed and the percentage of these matches that also correspond to the causal variant.

Simulations with one causal variant
		Top PICS potential set compared against
		Potential Set 1	Potential Set 2	Potential Set 3
Stable PICS potential set	Potential Set 1	167	5 (60%)	3 (67%)
	Potential Set 2	4 (25%)	466	104 (0.96%)
	Potential Set 3	0	94 (0%)	601
Simulations with two causal variants
		Top PICS potential set compared against
		Potential Set 1	Potential Set 2	Potential Set 3
Stable PICS potential set	Potential Set 1	241	24 (46%)	5 (60%)
	Potential Set 2	29 (52%)	483	88 (10%)
	Potential Set 3	9 (11%)	84 (13%)	606
Simulations with three causal variants
		Top PICS potential set compared against
		Potential Set 1	Potential Set 2	Potential Set 3
Stable PICS potential set	Potential Set 1	255	29 (66%)	7 (14%)
	Potential Set 2	30 (40%)	480	79 (18%)
	Potential Set 3	4 (0%)	65 (12%)	603

Appendix 12—table 5

List of matching variants with low stable posterior probability.

This table summarizes the genes and potential sets for which Stable and Top PICS returned matching variants, along with SNP-level and fine-mapping features for interpretation. Five statistics are reported: posterior probability of the stable variant (Stable PP); posterior probability of the top variant (Top PP); posterior probability support size, defined as the number of variants with positive probability (Support Size); the number of ancestry slices, including the ALL slice, for which the stable variant had positive posterior probability from running Stable PICS (Number of Slices); and the maximum difference in allele frequency between any pair of subpopulations among YRI, TSI, GBR, FIN, and CEU (Max AF Difference).

Potential Set	Gene	Matching variant	Stable PP	Top PP	Support size	Number of slices	Max AF Difference
1	ENSG00000134762.11	rs61731921	0.0028	0.76	23	4	0.22
1	ENSG00000197847.8	rs7130955	0.0075	0.23	45	3	0.18
1	ENSG00000255284.1	rs12224894	0.0067	0.65	23	6	0.14
1	ENSG00000104442.5	rs6995242	0.0092	0.31	42	4	0.34
1	ENSG00000146733.9	rs10239528	0.0031	0.53	27	5	0.24
1	ENSG00000248468.1	rs9853505	0.0099	0.29	39	3	0.43
1	ENSG00000122224.10	rs57449	0.0089	0.50	25	4	0.31
1	ENSG00000134262.8	rs17464525	0.0030	0.45	23	4	0.15
2	ENSG00000216522.3	rs5751902	0.0052	1	7	3	0.16
2	ENSG00000108592.9	rs9912201	0.0022	0.32	32	6	0.27
2	ENSG00000134551.7	rs7315843	0.0019	0.58	10	5	0.22
2	ENSG00000221947.3	rs3103860	0.0018	0.99	4	4	0.084
2	ENSG00000081791.4	rs2270113	0.0059	0.77	11	4	0.33
3	ENSG00000140368.8	rs62027296	0.0069	0.29	21	4	0.15
3	ENSG00000254614.1	rs625750	0.0017	0.64	4	3	0.22
3	ENSG00000133835.9	rs2451818	0.0036	0.40	32	3	0.41
3	ENSG00000158234.8	rs693293	0.0069	0.55	24	4	0.13

Appendix 12—table 6

List of variant annotations with interpretations.

Functional annotation	Interpretation
Distance to Canonical Transcription Start Site (TSS)	-
Percent CpG in 75 bp window centered on variant position	-
Percent GC in 75 bp window centered on variant position	-
CTCF Binding Enrichment	Whether the variant lies within a CTCF binding site region as predicted by Ensembl
Enhancer Enrichment	Whether the variant lies within an enhancer region as predicted by Ensembl
Open Chromatin Enrichment	Whether the variant lies within an open chromatin region as predicted by Ensembl
Promoter Enrichment	Whether the variant lies within a promoter region as predicted by Ensembl
TF Binding Enrichment	Whether the variant lies within a TF-binding site region as predicted by Ensembl
Promoter Flanking Enrichment	Whether the variant lies within a promoter flanking region as predicted by Ensembl
CADD (2 scores)	Whether the variant is more likely to be simulated than observed, and hence likely deleterious or not. One score is raw, while the other is rank-normalized
SIFTVal	Whether the variant affects protein function, and is hence deleterious
Polyphen2	Posterior probability that the variant is damaging
LINSIGHT	Probability that the variant site is under selection, thus having functional consequence
PhyloP (3 scores)	Substitution rates measuring cross-species evolutionary conservation at the site of the variant. Each score is computed with respect to a clade (vertebrate, mammal, primate)
GerpN	Estimated neutral substitution rate at variant position, with higher value implying greater conservation
GerpS	Estimated rejected substitution rate at variant position, with positive value implying a deficit in substitutions
B Statistic	Background selection at variant position, with smaller value indicating larger impact of selection
FATHMM-XF	Integrative score measuring deleteriousness of the variant
Funseq2	Integrative score measuring deleteriousness of the variant
ALoft	Integrative score measuring loss of function associated with the variant
FIRE	Integrative score measuring deleteriousness of the variant
Magnitude of Effect on Enformer Track Prediction (177 tracks)	Change in prediction of a gene regulatory track when performing in silico mutagenesis on the variant in a 196,608 bp sequence
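
To make the two window-based sequence annotations in the table above concrete, the following is a minimal Python sketch of how a percent GC and percent CpG value could be computed in a 75 bp window centered on a variant. The function names (`window`, `percent_gc`, `percent_cpg`), the 0-based coordinate convention, the clipping at sequence ends, and the choice to express CpG content as the share of adjacent base pairs reading "CG" are all assumptions made for illustration; the annotation source used in the study may define or normalize these quantities differently.

```python
def window(seq: str, pos: int, width: int = 75) -> str:
    """Return the `width`-bp window centered on 0-based index `pos`.

    Assumption: the window is clipped (not padded) at the sequence ends.
    """
    half = width // 2
    start = max(0, pos - half)
    end = min(len(seq), pos + half + 1)
    return seq[start:end].upper()


def percent_gc(win: str) -> float:
    """Percentage of bases in the window that are G or C."""
    if not win:
        return 0.0
    return 100.0 * sum(base in ("G", "C") for base in win) / len(win)


def percent_cpg(win: str) -> float:
    """Percentage of adjacent base pairs in the window that read 'CG'.

    Assumption: CpG content is normalized by the number of dinucleotide
    positions (len - 1); other normalizations exist.
    """
    n_pairs = len(win) - 1
    if n_pairs <= 0:
        return 0.0
    n_cpg = sum(win[i:i + 2] == "CG" for i in range(n_pairs))
    return 100.0 * n_cpg / n_pairs


if __name__ == "__main__":
    # Hypothetical 200 bp reference sequence with a variant at index 100.
    reference = "ACGT" * 50
    win = window(reference, 100)
    print(len(win), round(percent_gc(win), 2), round(percent_cpg(win), 2))
```

In practice the reference sequence would come from a genome FASTA and the position from the variant record; both are placeholders here.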

Author response table 1

Min	1st Qu.	Median	Mean	3rd Qu.	Max
-0.999	-0.342	-0.107	-0.117	0.067	0.949

Author response table 2

Min	1st Qu.	Median	Mean	3rd Qu.	Max
-1.000	-0.334	-0.0741	-0.0778	0.162	0.976

Author response table 3

Min	1st Qu.	Median	Mean	3rd Qu.	Max
-0.998	-0.327	-0.0564	-0.0629	0.191	0.968

Author response table 4

Min	1st Qu.	Median	Mean	3rd Qu.	Max
-0.994	-0.266	-0.00645	-0.109	0.049	0.924

Author response table 5

Min	1st Qu.	Median	Mean	3rd Qu.	Max
-0.998	-0.376	-0.0944	-0.121	0.115	0.903

Author response table 6

Min	1st Qu.	Median	Mean	3rd Qu.	Max
-0.998	-0.389	-0.100	-0.114	0.146	0.915

Additional files

MDAR checklist: https://cdn.elifesciences.org/articles/88039/elife-88039-mdarchecklist1-v1.pdf

Alan J Aw, Lionel Chentian Jin, Nilah Ioannidis, Yun S Song (2026) The impact of stability considerations on genetic fine-mapping. eLife 12:RP88039. https://doi.org/10.7554/eLife.88039.3
