Sibling Similarity Under Different Tail Architectures.

Left to right: When an individual’s extreme trait value (top 1%) is due to many alleles of small effect (“Polygenic”), then their siblings’ trait values are expected to show regression-to-the-mean (grey). When an individual’s extreme trait value is due to a de novo mutation of large effect, then their siblings are expected to have trait values that correspond to the background distribution (green). When an individual’s extreme trait value is the result of an inherited rare allele of large effect (“Mendelian”), then their siblings are expected to have either similarly extreme trait values or trait values that are drawn from the background distribution (red), depending on whether or not they inherited the same large effect allele.

Identifying Complex Tail Genetic Architecture.

Conditional-sibling z-values plotted against index-sibling quantiles. Grey depicts complete polygenic architecture across index-sibling values. In the lower tail, an extreme scenario of de novo architecture is shown in green, resulting in sibling discordance. In the upper tail, extreme Mendelian architecture is shown in red, whereby half of the siblings are concordant and half are discordant, resulting in a bimodal conditional-sibling trait distribution. Statistical tests to infer each type of complex tail architecture are designed to exploit these expected trait distributions.

Simulation Schematic.

Publicly available GWAS allele frequency and effect size data are used to simulate parent genetic trait values (A). Midparent genetic trait values (B) are simulated assuming random mating. Offspring genotypes and genetic trait values (C) are simulated assuming complete recombination. Environmental variation (D) is added to allow comparison with the theoretical polygenic conditional-sibling distribution. De novo and Mendelian rare-variant effects are simulated (E) to benchmark tests for complex architecture (F).
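Steps A to C of the schematic can be illustrated with a minimal sketch. It assumes each parental allele is drawn independently with probability equal to its allele frequency, and that "complete recombination" means each variant is transmitted independently of its neighbours. The function names and the toy frequencies and effect sizes are hypothetical and are not taken from the GWAS data used in the paper.

```python
import random

def simulate_parent(freqs, rng):
    """Two haplotypes; each allele is 1 with probability equal to its frequency."""
    return [[1 if rng.random() < p else 0 for p in freqs] for _ in range(2)]

def transmit(parent, rng):
    """Gamete under free recombination: choose a haplotype independently per variant."""
    return [parent[rng.randrange(2)][j] for j in range(len(parent[0]))]

def make_offspring(mother, father, rng):
    """Offspring carries one gamete from each parent."""
    return [transmit(mother, rng), transmit(father, rng)]

def genetic_value(person, effects):
    """Additive genetic trait value: sum of effect size times allele count."""
    return sum(b * (person[0][j] + person[1][j]) for j, b in enumerate(effects))

if __name__ == "__main__":
    rng = random.Random(0)
    freqs = [0.1, 0.3, 0.5, 0.2]          # hypothetical allele frequencies
    effects = [0.05, -0.02, 0.01, 0.03]   # hypothetical per-allele effects
    mother = simulate_parent(freqs, rng)
    father = simulate_parent(freqs, rng)
    midparent = 0.5 * (genetic_value(mother, effects) + genetic_value(father, effects))
    child = make_offspring(mother, father, rng)
    print(midparent, genetic_value(child, effects))
```

Averaged over many offspring of the same parents, the offspring genetic value is centred on the midparent value, which is the property that the later steps of the schematic rely on.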

Conditional-Sibling Trait Distribution under Polygenic Architecture.

A: The conditional-sibling trait distribution according to Equation 4 for index siblings at the 1st, 25th, 50th, 75th, and 99th percentiles of the standardised trait distribution, when heritability is high (h² = 0.95, in orange) and moderate (h² = 0.5, in blue). When heritability is 0.95, the conditional-sibling expectation is almost half of the index-sibling z-score; when heritability is 0.5, the conditional-sibling expectation is equal to 1/4 of the index-sibling z-score. B: The conditional distribution transformed into rank space. An individual whose sibling is at the 99th percentile is expected to have a trait value at the 80th percentile when heritability is high and at the 67th percentile when heritability is moderate.
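Equation 4 is not reproduced in this caption, so the sketch below assumes the standard bivariate-normal result for standardised sibling traits with correlation h²/2: the conditional sibling has mean (h²/2)·z and variance 1 − (h²/2)², where z is the index-sibling z-score. This is consistent with the "almost half" and "1/4" slopes quoted above. The function name is hypothetical, and the rank-space value uses the identity E[Φ(X)] = Φ(μ/√(1+σ²)) for X ~ 𝒩(μ, σ²), which may differ from the transformation used to draw panel B.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def conditional_sibling(index_percentile, h2):
    """Return (mean, sd, expected_rank) of the conditional sibling's z-score.

    Assumes sibling z-scores are bivariate normal with correlation h2 / 2.
    """
    z_index = STD_NORMAL.inv_cdf(index_percentile)
    slope = 0.5 * h2
    mean = slope * z_index
    sd = sqrt(1.0 - slope ** 2)
    # Expected percentile of the conditional sibling in the population.
    expected_rank = STD_NORMAL.cdf(mean / sqrt(1.0 + sd ** 2))
    return mean, sd, expected_rank

if __name__ == "__main__":
    for h2 in (0.95, 0.5):
        for pct in (0.01, 0.25, 0.50, 0.75, 0.99):
            mean, sd, rank = conditional_sibling(pct, h2)
            print(f"h2={h2} index={pct:.2f} mean={mean:+.3f} sd={sd:.3f} rank={rank:.3f}")
```

For an index sibling at the 99th percentile, this sketch gives expected ranks close to the percentiles quoted for panel B.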

Power to detect complex tail architecture for different heritability levels, de novo and Mendelian frequencies, and sample sizes.

Simulations assume highly penetrant de novo and Mendelian variants at frequencies of 0.05%, 0.1%, 0.2%, 0.3%, and 0.5%. The false-positive rate was set at 0.05. Null simulations (red dashed line) demonstrate that the tests are well calibrated.

Analysis of Six UK Biobank Traits.

Application of statistical tests for Mendelian and de novo tail architecture to sibling trait data of six UK Biobank traits. For each trait, the conditional-sibling mean is plotted under polygenicity (black line) for the heritability estimated from the data. The red (high) and blue (low) bands represent the expected conditional-sibling mean under polygenicity at different heritability values. Statistical tests for de novo architecture, Mendelian architecture, and general departure from polygenicity (Kolmogorov-Smirnov test, Dist P-val) were applied to conditional siblings with index siblings in the upper and lower 1% of the distribution. Significant associations for the Mendelian and de novo tests are shown in red and green, respectively. Tail architecture that is not distinct from polygenic expectation is denoted in grey.
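The caption does not spell out how the Kolmogorov-Smirnov test is set up, so the following is only one plausible sketch. It assumes that each conditional-sibling z-score is standardised by the polygenic expectation (mean (h²/2)·z, variance 1 − (h²/2)², with z the index-sibling z-score) and that the residuals are compared with 𝒩(0, 1). The function names are hypothetical, and the p-value uses the asymptotic Kolmogorov series, which is an approximation for small samples.

```python
from math import exp, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def ks_pvalue(d_stat, n, terms=100):
    """Asymptotic two-sided p-value for a one-sample KS statistic."""
    lam = sqrt(n) * d_stat
    if lam < 0.2:
        # The alternating series converges slowly here; the p-value is ~1.
        return 1.0
    total = sum((-1) ** (j - 1) * exp(-2.0 * (j * lam) ** 2)
                for j in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))

def ks_departure_from_polygenic(index_z, sibling_z, h2):
    """KS statistic and p-value for residuals from the polygenic expectation."""
    slope = 0.5 * h2
    sd = sqrt(1.0 - slope ** 2)
    resid = sorted((s - slope * i) / sd for i, s in zip(index_z, sibling_z))
    n = len(resid)
    d_stat = 0.0
    for rank, r in enumerate(resid, start=1):
        cdf = STD_NORMAL.cdf(r)
        # Largest gap between the empirical and the standard normal CDF.
        d_stat = max(d_stat, rank / n - cdf, cdf - (rank - 1) / n)
    return d_stat, ks_pvalue(d_stat, n)
```

In use, `index_z` and `sibling_z` would hold the z-scores of sibling pairs whose index sibling falls in the upper or lower 1% of the trait distribution, and `h2` would be the heritability estimated from the data.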

Theoretical and simulated conditional expectation and variance in liability (z-score) and rank across index-sibling percentiles for conditional siblings, midparents, and index siblings. The simulation drew one thousand parent liability values from 𝒩(0, 1); these were randomly paired to produce midparents with liability m_i, and two offspring were subsequently drawn for each midparent pair and randomly assigned as index and conditional siblings.
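The simulation described in this caption can be sketched as follows. The offspring distribution is not stated in the text, so the sketch assumes a purely genetic liability in which each offspring is drawn from 𝒩(m_i, 1/2); this keeps the offspring variance at 1 and gives an expected sibling correlation of 1/2. The function names and the seed are hypothetical.

```python
import random
from math import sqrt

def simulate_sibling_pairs(n_parents=1000, seed=0):
    """Return a list of (index_sibling, conditional_sibling) liabilities."""
    rng = random.Random(seed)
    parents = [rng.gauss(0.0, 1.0) for _ in range(n_parents)]
    rng.shuffle(parents)  # random mating: pair parents at random
    pairs = []
    for mother, father in zip(parents[0::2], parents[1::2]):
        midparent = 0.5 * (mother + father)
        # Assumed offspring distribution: N(midparent, 1/2).
        offspring = [rng.gauss(midparent, sqrt(0.5)) for _ in range(2)]
        rng.shuffle(offspring)  # random assignment of index vs conditional
        pairs.append((offspring[0], offspring[1]))
    return pairs

def pearson(pairs):
    """Pearson correlation between the two members of each pair."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / sqrt(var_x * var_y)

if __name__ == "__main__":
    pairs = simulate_sibling_pairs()
    print(len(pairs), round(pearson(pairs), 3))
```

One thousand parents yield five hundred sibling pairs, so the simulated correlation fluctuates noticeably around its expectation; a larger `n_parents` tightens the comparison with theory.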

For both Fluid Intelligence and Standing Height, GWAS variants (on chromosome one) were used to simulate parent and offspring genotypes and liability values. The plots show that for both traits the offspring distribution is normal and the sibling distribution is multivariate normal, in line with our theoretical prediction.

Application To UKB (Extended).