Sibling Similarity Under Different Tail Architectures.

From left to right: When individuals have extreme (top 1%) tail values as a result of many common alleles of small effect (polygenicity), their siblings’ trait values are expected to show regression to the mean (grey). When individuals have extreme trait values as a result of de novo mutations of large effect, their siblings are expected to have trait values that correspond to the background distribution. When individuals have extreme trait values as a result of a rare variant of large effect inherited from one of their parents (Mendelian), their siblings are expected to have either similarly extreme trait values or trait values drawn from the background distribution, depending on whether or not they inherited the same large-effect allele.

Identifying Complex Tail Genetic Architecture.

Conditional-sibling z-values plotted against index sibling quantiles. Grey depicts purely polygenic architecture across all index sibling values. In the Lower Tail, an extreme scenario of de novo architecture is shown (in green), resulting in sibling discordance. In the Upper Tail, an extreme scenario of Mendelian architecture is shown (in red), whereby half the siblings are concordant and half discordant, resulting in a bimodal conditional sibling distribution. Statistical tests to identify the presence of each type of complex tail architecture are designed to exploit these characteristics.
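The three scenarios in this caption can be illustrated with a small simulation. The sketch below is a minimal illustration under stated assumptions, not the authors' pipeline: sibling pairs share a polygenic component with correlation h2/2, a de novo effect is added to the index sibling only, and a Mendelian effect is passed to the conditional sibling with probability 0.5. The function names, the 4 SD effect size, and the sample size are hypothetical choices made for illustration.

```python
import numpy as np

def simulate_sib_pairs(n, h2, rng, denovo_freq=0.0, mendel_freq=0.0, effect=4.0):
    """Simulate index/conditional sibling trait values (hypothetical sketch)."""
    # Polygenic part: siblings share half of the additive genetic variance.
    shared = rng.normal(0.0, np.sqrt(h2 / 2), n)
    index = shared + rng.normal(0.0, np.sqrt(1 - h2 / 2), n)
    cond = shared + rng.normal(0.0, np.sqrt(1 - h2 / 2), n)
    # De novo: large effect carried by the index sibling only.
    denovo = rng.random(n) < denovo_freq
    index = index + denovo * effect
    # Mendelian: index carries the allele; the sibling inherits it half the time.
    mendel = rng.random(n) < mendel_freq
    sib_inherits = mendel & (rng.random(n) < 0.5)
    index = index + mendel * effect
    cond = cond + sib_inherits * effect
    return index, cond

def upper_tail_sibling_mean(index, cond, tail=0.01):
    """Mean conditional-sibling value for index siblings in the upper tail."""
    threshold = np.quantile(index, 1 - tail)
    return cond[index >= threshold].mean()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for label, kwargs in [("polygenic", {}),
                          ("de novo", {"denovo_freq": 0.005}),
                          ("Mendelian", {"mendel_freq": 0.005})]:
        idx, cnd = simulate_sib_pairs(400_000, 0.5, rng, **kwargs)
        print(label, round(upper_tail_sibling_mean(idx, cnd), 2))
```

Under these assumptions the upper-tail sibling mean falls below the polygenic expectation in the de novo scenario (discordance) and rises above it in the Mendelian scenario, where the conditional-sibling values form a mixture of concordant and discordant siblings.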

Simulation Schematic.

Publicly available GWAS allele frequency and effect size data are used to simulate parent liability (A). Midparent liability (B) is simulated assuming random mating. Offspring genotype and liability (C) are simulated assuming complete recombination. Environmental variation (D) is added to allow comparison with the theoretical polygenic conditional-sibling distribution. De novo and Mendelian rare-variant effects are simulated (E) to benchmark tests for complex architecture (F).

Conditional-Sibling Trait Distribution under Polygenic Architecture.

A: The conditional-sibling trait distribution according to Equation 4 for index siblings at the 1st, 25th, 50th, 75th, and 99th percentile of the standardised trait distribution, when heritability is high (h2 = 0.95, in orange) and moderate (h2 = 0.5, in blue). When heritability is 0.95, the conditional-sibling expectation is almost half of the index sibling z-score; when heritability is 0.5, it is equal to one quarter of the index sibling z-score. B: The conditional distribution transformed into rank space. An individual whose sibling is at the 99th percentile is expected to have a trait value in the 80th percentile when heritability is high and in the 67th percentile when heritability is moderate.
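The numbers quoted in this caption can be checked with a short calculation. Equation 4 is not reproduced here, so the sketch assumes the standard bivariate-normal result for a sibling correlation of h2/2: the conditional sibling is distributed as N(h2·z/2, 1 − h2²/4). The mean agrees with the caption; the variance and the function names are assumptions. The expected rank uses the identity E[Φ(X)] = Φ(μ/√(1 + σ²)) for X ~ N(μ, σ²).

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and SD 1

def conditional_sibling(z_index, h2):
    """Mean and variance of the conditional sibling given the index z-score.

    Assumes a purely additive polygenic trait with sibling correlation h2 / 2.
    """
    r = 0.5 * h2
    return r * z_index, 1 - r ** 2

def expected_rank(z_index, h2):
    """Expected population quantile of the conditional sibling."""
    mean, var = conditional_sibling(z_index, h2)
    return STD_NORMAL.cdf(mean / sqrt(1 + var))

if __name__ == "__main__":
    z99 = STD_NORMAL.inv_cdf(0.99)  # index sibling at the 99th percentile
    for h2 in (0.95, 0.5):
        mean, var = conditional_sibling(z99, h2)
        print(f"h2={h2}: mean={mean:.2f}, var={var:.2f}, "
              f"expected rank={expected_rank(z99, h2):.2f}")
```

Under these assumptions the expected rank comes out at about 0.80 for h2 = 0.95 and about 0.66 for h2 = 0.5, close to the percentiles quoted in panel B.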

Power to detect complex tail architecture for different heritability levels, de novo and Mendelian frequencies, and sample sizes.

Simulations assume highly penetrant de novo and Mendelian variants at frequencies of 0.05%, 0.1%, 0.2%, 0.3%, and 0.5%. The false-positive rate was set at 0.05. Null simulations (red dashed line) demonstrate that the tests are well calibrated.

Analysis of Six UK Biobank Traits.

Application of statistical tests for Mendelian and de novo tail architecture to sibling trait data of six UK Biobank traits. For each trait, the conditional-sibling mean is plotted under polygenicity, with heritability estimated from index siblings lying between the 5th and 95th percentiles. Statistical tests were applied to conditional siblings whose index sibling was in the upper or lower 1% of the distribution. Significant associations for the Mendelian and de novo tests are shown in red and green, respectively. Tail architecture that is not distinct from polygenic expectation is denoted in grey.

Theoretical and simulated conditional expectation and variance in liability (z-score) and rank across index sibling percentiles for conditional siblings, midparents and index siblings. The simulation drew one thousand parent liability values from 𝒩(0, 1); these were randomly paired to produce midparents with liability mi, and two offspring were subsequently drawn for each midparent pair and randomly assigned as index and conditional siblings.
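The simulation described in this caption can be sketched as follows. The offspring distribution is not stated in the caption, so the sketch assumes offspring liability is drawn from N(mi, 1/2), which keeps the offspring variance at 1 when liability is fully heritable. That distribution, the function name and its defaults are assumptions for illustration only.

```python
import numpy as np

def simulate_families(n_parents=1000, seed=0):
    """Simulate midparent, index-sibling and conditional-sibling liabilities.

    n_parents must be even so that parents can be paired.
    """
    rng = np.random.default_rng(seed)
    # Parent liabilities from N(0, 1), randomly paired (random mating).
    parents = rng.permutation(rng.normal(0.0, 1.0, n_parents))
    midparent = parents.reshape(-1, 2).mean(axis=1)
    # Assumption: each offspring is N(midparent, 1/2) (segregation variance).
    sibs = midparent[:, None] + rng.normal(0.0, np.sqrt(0.5), (midparent.size, 2))
    # Randomly assign which offspring is the index sibling.
    swap = rng.random(midparent.size) < 0.5
    index = np.where(swap, sibs[:, 1], sibs[:, 0])
    conditional = np.where(swap, sibs[:, 0], sibs[:, 1])
    return midparent, index, conditional

if __name__ == "__main__":
    mid, idx, cond = simulate_families()
    print("sibling correlation:", round(np.corrcoef(idx, cond)[0, 1], 2))
    print("midparent-offspring correlation:", round(np.corrcoef(mid, idx)[0, 1], 2))
```

With one thousand parents (five hundred families) the estimates are noisy; a larger `n_parents` brings the sibling correlation close to the theoretical value of 0.5 under these assumptions.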

For both Fluid Intelligence and Standing Height, GWAS variants (on chromosome one) were used to simulate parent and offspring genotypes and liability values. Plots show that for both traits the offspring distribution is normal and the sibling distribution is multivariate normal, in line with our theoretical prediction.