Variable prediction accuracy of polygenic scores within an ancestry group

  1. Hakhamanesh Mostafavi (corresponding author)
  2. Arbel Harpak (corresponding author)
  3. Ipsita Agarwal
  4. Dalton Conley
  5. Jonathan K Pritchard
  6. Molly Przeworski (corresponding author)
  1. Department of Biological Sciences, Columbia University, United States
  2. Department of Sociology, Princeton University, United States
  3. Office of Population Research, Princeton University, United States
  4. Department of Genetics, Stanford University, United States
  5. Department of Biology, Stanford University, United States
  6. Howard Hughes Medical Institute, Stanford University, United States
  7. Department of Systems Biology, Columbia University, United States
Research Article
Cite this article as: eLife 2020;9:e48376 doi: 10.7554/eLife.48376

Abstract

Fields as diverse as human genetics and sociology are increasingly using polygenic scores based on genome-wide association studies (GWAS) for phenotypic prediction. However, recent work has shown that polygenic scores have limited portability across groups of different genetic ancestries, restricting the contexts in which they can be used reliably and potentially creating serious inequities in future clinical applications. Using the UK Biobank data, we demonstrate that even within a single ancestry group (i.e., when there are negligible differences in linkage disequilibrium or in causal allele frequencies), the prediction accuracy of polygenic scores can depend on characteristics such as the socio-economic status, age or sex of the individuals in which the GWAS and the prediction were conducted, as well as on the GWAS design. Our findings highlight both the complexities of interpreting polygenic scores and underappreciated obstacles to their broad use.

eLife digest

Complex diseases like cancer and heart disease are caused by the interplay of many factors: the variants of genes we inherit, the lifestyles we lead and the environments we inhabit, plus the interaction of all these factors. In fact, almost every trait, even how many years we will spend studying, is influenced both by our environment and our genes.

To identify some of the genetic factors at play, scientists perform analyses known as genome-wide association studies, or GWAS for short. In these studies, the genomes from many different people are scanned to look for genetic differences associated with differences in traits. By summing up all the small genetic differences, so-called “polygenic scores” can be calculated. When there is a large genetic component to a trait, polygenic scores can be useful predictive tools.

But there is a catch: polygenic scores make less accurate predictions for individuals of a different ancestry than those involved in the GWAS, which limits the use of these tools around the world. Mostafavi, Harpak et al. set out to understand if there are other factors in addition to ancestry that could influence the performance of polygenic scores.

Using data from the UK Biobank, an international health resource that pairs genomic data and clinical information, Mostafavi, Harpak et al. examined polygenic scores among individuals that share a single, common ancestry. These polygenic scores were used to predict three traits (blood pressure, body mass index and educational attainment) in individuals and the predictions were then compared to the actual trait values to see how accurate they were. The analysis revealed that even within a group of people with similar ancestry, the accuracy of polygenic scores can vary, depending on characteristics such as the sex, age or socioeconomic status of the individuals.

This analysis emphasises how variable GWAS and their predictive value can be even within seemingly similar population groups. It further highlights both the complexities of interpreting polygenic scores and underappreciated obstacles to their broad use in medical and social sciences.

Introduction

Genome-wide association studies (GWAS) have now been conducted for thousands of human complex traits, revealing that the genetic architecture is almost always highly polygenic; that is, the bulk of the heritable variation is due to thousands of genetic variants, each with tiny marginal effects (Boyle et al., 2017; Bulik-Sullivan et al., 2015). These findings make it difficult to interpret the molecular basis for variation in a trait, but they lend themselves more immediately to another use: phenotypic prediction. Under the assumption that alleles act additively, a 'polygenic score' (PGS) can be created by summing the effects of the alleles carried by an individual; this score can then be used to predict that individual’s phenotype (Henderson, 1984; Meuwissen et al., 2001; Kathiresan et al., 2008; Lynch and Walsh, 1998). For highly heritable traits, such scores already provide informative predictions in some contexts: for example, prediction accuracies are 24.4% for height (using R² as a measure) (Yengo et al., 2018) and up to 13% for educational attainment (using incremental R²) (Lee et al., 2018).
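The additive construction of a polygenic score can be illustrated with a minimal sketch. The function name, the effect estimates and the genotypes below are hypothetical and chosen only for illustration; genotypes are assumed to be coded as the number of effect alleles carried (0, 1 or 2).

```python
def polygenic_score(effect_sizes, allele_counts):
    """Additive score: sum over SNPs of (estimated effect x effect-allele count)."""
    if len(effect_sizes) != len(allele_counts):
        raise ValueError("need one allele count per effect size")
    return sum(beta * count for beta, count in zip(effect_sizes, allele_counts))


# Hypothetical per-allele effect estimates for three SNPs (made-up values).
betas = [0.5, -0.25, 1.0]
# Hypothetical genotype of one individual: effect-allele counts at those SNPs.
genotype = [2, 1, 0]

score = polygenic_score(betas, genotype)  # 0.5*2 + (-0.25)*1 + 1.0*0
```

In practice the effect sizes would come from GWAS summary statistics and the score would be computed over many thousands of SNPs; the arithmetic is the same.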

This genomic approach to phenotypic prediction has been rapidly adopted in three distinct fields. In human genetics, PGS have been shown to help identify individuals that are more likely to be at risk of diseases such as breast cancer and cardiovascular disease (Khera et al., 2018; Inouye et al., 2018; Mavaddat et al., 2019; Khera et al., 2019). Based on these findings, a number of papers have advocated that PGS be adopted in designing clinical studies, and by clinicians as additional risk factors to consider in treating patients (Torkamani et al., 2018; Khera et al., 2018). In human evolutionary genetics, several lines of evidence suggest that adaptation may often take the form of shifts in the optimum of a polygenic phenotype and hence act jointly on the many variants that influence the phenotype (Pritchard and Di Rienzo, 2010; Berg and Coop, 2014; Höllinger et al., 2019; Sella and Barton, 2019). In this context, the goal is to test whether the set of variants that influence a trait are rapidly evolving across populations or over time (Field et al., 2016; Berg et al., 2019; Uricchio et al., 2019; Edge and Coop, 2019; Racimo et al., 2018; Berg and Coop, 2014). Finally, in various disciplines of the social sciences, PGS are increasingly used to distinguish environmental from genetic sources of variability (Conley, 2016), as well as to understand how genetic variation among individuals may cause heterogeneous treatment effects when studying how an environmental influence (e.g., a schooling reform) affects an outcome (such as BMI) (Barcellos et al., 2018; Davies et al., 2018). In all these applications, the premise is that PGS will ‘port’ well across groups—that is that they remain predictive not only in samples very similar to the ones in which the GWAS was conducted, but also in other sets of individuals (henceforth ‘prediction sets’).

As recent papers have highlighted, however, PGS are not as predictive in individuals whose genetic ancestry differs substantially from the ancestry of individuals in the original GWAS (reviewed in Martin et al., 2019). As one illustration, PGS calculated in the UK Biobank predict phenotypes of individuals sampled in the UK Biobank better than those of individuals sampled in the BioBank Japan Project: for instance, the incremental R² for height is approximately 11% in the UK versus 3% in Japan (Martin et al., 2019). Similarly, using PGS based on Europeans and European-Americans, the largest educational attainment GWAS to date ('EA3') reported an incremental R² of 10.6% for European-Americans but only 1.6% for African-Americans (Lee et al., 2018).

To date, such observations have been discussed mainly in terms of population genetic factors that reduce portability (Martin et al., 2017; Kim et al., 2018; Duncan et al., 2018; De La Vega and Bustamante, 2018; Sirugo et al., 2019; Martin et al., 2019). Notably, GWAS does not pinpoint causal variants, but instead implicates a set of possible causal variants that lie in close physical proximity in the genome. The estimated effect of a given SNP depends on the extent of linkage disequilibrium (LD) with the causal sites (Pritchard and Przeworski, 2001; Bulik-Sullivan et al., 2015). LD differences between populations that arose from their distinct demographic and recombination histories will lead to variation in the estimated effect sizes and hence to variable phenotypic prediction accuracies (Rosenberg et al., 2019). Populations will also differ in the allele frequencies of causal variants. This problem is particularly acute for alleles that are rare in the population in which the GWAS was conducted but common in the population in which the trait is being predicted. Such variants are likely to have noisy effect size estimates in the estimation sample or may not be included in the PGS at all, and yet they contribute substantially to heritability in the target population. Furthermore, causal loci or effect sizes may differ among populations, for instance if the effect of an allele depends on the genetic background on which it arises (e.g., Adhikari et al., 2019). For all these reasons, we should expect PGS to be less predictive across ancestries.

In practice, given that most individuals (about 80%) included in current GWAS are of European ancestry (Popejoy and Fullerton, 2016; Martin et al., 2019), PGS are systematically more predictive in European-ancestry individuals than among other people. As a consequence, the clinical applications and scientific understanding to be gained from PGS will predominantly and unfairly benefit a small subset of humanity. A number of papers have therefore highlighted the importance of expanding GWAS efforts to include more diverse ancestries (Martin et al., 2018; Bien et al., 2019; Wojcik et al., 2019; Martin et al., 2019; Sirugo et al., 2019).

Importantly, factors other than ancestry could also impact the accuracy and portability of PGS. For example, the educational attainment of an individual depends not only on their own genotype, but on the genotypes of their parents, due to nurturing effects (Kong et al., 2018), and of their peers, due to social genetic effects (Domingue et al., 2018), and of course on non-genetic factors. Also, traits such as height and educational attainment show strong patterns of assortative mating, which can distort effect size estimates in GWAS (Domingue et al., 2014; Robinson et al., 2017; Ruby et al., 2018). To what extent these effects remain the same across cultures and environments is unknown, but if they differ, so will the prediction accuracy. More generally, while we still know little about genotype-environment interactions (GxE) in humans, they are well-documented in other species—notably in experimental settings—and would further reduce the portability of PGS across environments (Gibson, 2008; Tropf et al., 2017; Mills and Rahal, 2019; Lynch and Walsh, 1998). In addition, the extent of environmental variability could differ between GWAS and prediction groups, which would change the proportion of the variance in the trait explained by a PGS (i.e., the prediction accuracy). PGS for some traits may also include a component of environmental or cultural confounding with population structure (Sohail et al., 2019; Haworth et al., 2019; Lawson et al., 2020; Kerminen et al., 2018; Berg et al., 2019); this source of confounding can increase or decrease prediction accuracy, depending on the structure in the prediction samples.

Given these considerations, it is important to ask to what extent PGS are portable among groups within the same ancestry. To explore this question, we stratified the subset of UK Biobank samples designated as ‘White British’ (WB) according to some of the standard sample characteristics of GWAS studies: the ages of the individuals, their sex, and socio-economic status. We chose to focus on these particular characteristics because they vary among GWAS samples depending on sample ascertainment procedures. Furthermore, these characteristics have been shown to influence heritability for some traits in a study of a subset of the UK Biobank (Ge et al., 2017), raising the possibility that these choices also influence prediction accuracy. Indeed, for three example traits, we show that there exist major differences in the prediction accuracy of the PGS among these groups, even though they share highly similar genetic ancestries. We further demonstrate for a variety of traits that prediction accuracy differs markedly depending on whether the GWAS is conducted in unrelated individuals or in pairs of siblings, even when controlling for the precision of the estimates. This finding is again unexpected under standard GWAS assumptions; it underscores the importance of genetic effects that are included in estimates from some study designs and not others and highlights underappreciated challenges with GWAS-based phenotypic prediction.

At present, it is difficult to determine the reasons why we see such variable prediction accuracy across these strata and study designs. Contributing factors probably include indirect genetic effects from relatives, assortative mating, varying levels of genetic and environmental variance, GxE interaction effects and perhaps undetected confounding. Nonetheless, our results make clear that the prediction accuracy of PGS can be affected in unpredictable ways by known—and presumably unknown—factors in addition to genetic ancestry.

Results

Sample characteristics of the GWAS and prediction set can influence prediction accuracy even within a single ancestry

We examined how PGS for a few example traits port across samples that are of similar genetic ancestry but differ in terms of some common study characteristics, such as the male:female ratio (henceforth ‘sex ratio’), age distribution, or socio-economic status (SES). To this end, we limited our analysis to the largest subset of individuals in the UKB with a relatively homogeneous ancestry: 337,536 unrelated individuals that were characterized by the UKB, based on self-reported ethnicities as well as genetic analysis, as ‘White British’ (WB) (Bycroft et al., 2018). In all analyses, we further adjusted for the first 20 principal components of the genotype data, to account for population structure within this set of individuals (Materials and methods).

In all analyses, we randomly selected a subset of individuals to be the prediction set; we then conducted GWAS using the remaining individuals and built a PGS model by LD-based clumping of the associations (Materials and methods). To examine the reliability of the prediction, we considered the incremental R², that is, the increase in R² obtained when adding the PGS to a model with other covariates (referred to as 'prediction accuracy' henceforth). Whether this measure is appropriate depends on how PGS are to be used; it is not always the most obvious choice in human genetics, where the goal is often to identify individuals at high risk of developing a particular disease (i.e., in the tail of the polygenic score distribution). Nonetheless, because it has been widely reported in discussions of portability across genetic ancestries (e.g., Lee et al., 2018; Martin et al., 2019), we also used it here; later, we also present some results on binary traits using the incremental area under the receiver operating characteristic curve (AUC).
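As a rough illustration of the incremental R² measure, the sketch below fits two ordinary least-squares models (covariates only, and covariates plus the PGS) and takes the difference in R². It is a plain-Python sketch under simplifying assumptions, not the analysis pipeline used in the paper; the helper names `ols_r2` and `incremental_r2` and the toy data are hypothetical.

```python
def ols_r2(columns, y):
    """R^2 of an ordinary least-squares fit of y on the given columns plus an intercept."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Normal equations (X'X) b = X'y, stored as an augmented k x (k+1) matrix.
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(k)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for p in range(k):
        pivot = max(range(p, k), key=lambda r: abs(A[r][p]))
        A[p], A[pivot] = A[pivot], A[p]
        for r in range(k):
            if r != p:
                factor = A[r][p] / A[p][p]
                A[r] = [a - factor * b for a, b in zip(A[r], A[p])]
    beta = [A[r][k] / A[r][r] for r in range(k)]
    mean_y = sum(y) / n
    rss = sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2 for i in range(n))
    tss = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - rss / tss


def incremental_r2(covariates, pgs, y):
    """Gain in R^2 from adding the PGS to a model that already has the covariates."""
    return ols_r2(covariates + [pgs], y) - ols_r2(covariates, y)


# Toy data (made up): the phenotype is an exact linear function of age and PGS.
age = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
pgs = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
phenotype = [2.0 * a + s for a, s in zip(age, pgs)]

full_r2 = ols_r2([age, pgs], phenotype)
gain = incremental_r2([age], pgs, phenotype)
```

In this toy example the full model fits perfectly, so the incremental R² is simply the share of phenotypic variance that age alone leaves unexplained.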

As a first case, we considered the prediction accuracy of a PGS for diastolic blood pressure in prediction sets stratified by sex, motivated by reports that variation in this trait may arise for somewhat distinct reasons in the two sexes (Reckelhoff, 2001; Zhou et al., 2017). We randomly selected males and females as prediction sets (20K individuals each), and used a subset of the rest of the individuals for GWAS, matching the numbers of females and males in the GWAS set (total sample size 122,774); we refer to this mixed set, somewhat loosely, as the 'diverse GWAS.' Adjusting for mean sex effects and medication use (see Materials and methods), the prediction accuracy is about 1.15-fold higher for females than for males (Mann-Whitney p=1.1×10⁻⁵; Figure 1A). Thus, despite equal representation of males and females in the GWAS set, the prediction accuracy varies depending on the sex ratio of prediction samples. To examine this further, we repeated the same analysis but performed the GWAS in only one sex, using the same sample size as in the diverse GWAS (we refer to this as 'stratified GWAS'). [Note that the diverse GWAS sample is not a merge of the stratified GWAS samples but a mixed-sex sample of equal sample size to that used in the women-only and the men-only GWAS, to allow for direct comparison between GWASs. Results for the merged GWAS (with a much larger sample size) are presented in Appendix 1—figure 1A.] When the GWAS is conducted only in females, the prediction accuracy is about 1.35-fold higher for females than for males; in turn, when the GWAS is conducted only in males, the prediction accuracy is similar in both sexes, as well as somewhat decreased (Figure 1A).

Variable prediction accuracy of polygenic scores within an ancestry group.

Shown are incremental R² values (i.e., the increment in R² obtained by adding a polygenic score predictor to a model with covariates alone) in different prediction sets. Each box and whiskers plot is computed based on 20 iterations of resampling GWAS and prediction sets. Thick horizontal lines denote the medians. The polygenic scores were estimated in samples of unrelated WB individuals. Phenotypes were then predicted in distinct samples of unrelated WB individuals, stratified by sex (A), age (B) or Townsend deprivation index, a measure of SES (C). In red and green cases, polygenic scores are based on a GWAS in a sample limited to one sex, age or SES group (a 'stratum'). In blue, polygenic scores are based on a GWAS in a diverse sample matching the number of individuals in each stratum. GWAS sample sizes are: 122,774 for all three diastolic blood pressure GWAS samples, 72,328 for all three BMI GWAS samples, 73,280 for years of schooling GWAS in the diverse sample and 73,283 for GWAS in the low SES and high SES samples.

We then considered two other cases, evaluating prediction accuracy in groups stratified by age for BMI (since the UK Biobank participants were enrolled within about a five-year span, differences in age could in principle also be reflective of cohort effects) and by adult SES for years of schooling, using the Townsend deprivation index as a measure; our choices were motivated by prior evidence suggesting that these characteristics of the GWAS influence estimates of SNP-heritability (Branigan et al., 2013; Conley et al., 2015; Belsky et al., 2018; Elks et al., 2012; Ge et al., 2017). We withheld a random set of 10K individuals in each quartile of age and SES for prediction and performed GWAS using a subset of the remaining individuals, matching the sample sizes across quartiles in the GWAS set (total sample sizes of 72,328 and 73,280 for BMI and years of schooling GWAS, respectively). Similar to our observation for diastolic blood pressure, the prediction accuracy varies across prediction sets: it is 1.4-fold higher for BMI in the youngest quartile compared to the oldest (Mann-Whitney p=1.1×10⁻⁵; Figure 1B), and 2-fold higher for years of schooling in the lowest SES quartile compared to the highest (Mann-Whitney p=2.9×10⁻⁶; Figure 1C). Furthermore, the differences across groups are again sensitive to the choice of the GWAS set: the differences are marked when GWAS is restricted to the youngest quartile for BMI and the lowest SES quartile for years of schooling, but diminished when the GWAS is performed in the oldest and the highest SES quartiles for BMI and years of schooling, respectively (Figure 1B, C). These results remained qualitatively unchanged when we used R² instead of incremental R² to measure prediction accuracy (Appendix 1—figure 2).

In these analyses, we used a p-value threshold of 10⁻⁴ for inclusion of a SNP in the PGS. The choice of how stringent to make the GWAS p-value threshold is important but somewhat arbitrary, with approaches ranging from requiring genome-wide significance to including all SNPs (Weedon et al., 2008; Pharoah et al., 2008; Euesden et al., 2015; Vilhjálmsson et al., 2015; Ware et al., 2017; Mostafavi et al., 2017; Speidel et al., 2019). Often, this threshold is chosen to maximize prediction accuracy in an independent validation set. When the goal is to compare prediction performance across different groups, there is no obvious optimal choice of the p-value threshold. [The optimal p-value in this context will differ across studies, as it depends not only on the genetic architecture and heritability of the trait, but also on the GWAS sample size, that is power (Dudbridge, 2013).] As we show, however, the qualitative trends reported in Figure 1 do not depend on the p-value threshold choice (Appendix 1—figure 3); moreover, the qualitative trends remain when LDpred is used (with a prior probability of 1 on loci being causal; Vilhjálmsson et al., 2015) instead of pruning approaches (Appendix 1—figure 3).
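The effect of the inclusion threshold can be sketched as a simple filter on GWAS summary statistics. This is a simplified illustration only: it omits the LD-based clumping step, and the function name, record layout, SNP labels and numbers are all hypothetical.

```python
def select_snps(summary_stats, p_threshold=1e-4):
    """Keep the SNPs whose GWAS p-value is below the inclusion threshold."""
    return [snp for snp in summary_stats if snp["p"] < p_threshold]


# Hypothetical summary statistics (labels, effects and p-values are made up).
toy_stats = [
    {"snp": "snpA", "beta": 0.12, "p": 3e-9},
    {"snp": "snpB", "beta": -0.05, "p": 2e-5},
    {"snp": "snpC", "beta": 0.02, "p": 4e-3},
    {"snp": "snpD", "beta": 0.01, "p": 0.6},
]

# Number of SNPs retained at increasingly permissive thresholds, from the
# conventional genome-wide significance level (5e-8) to including everything.
counts = {t: len(select_snps(toy_stats, t)) for t in (5e-8, 1e-4, 1e-2, 1.0)}
```

A more permissive threshold admits more SNPs, each with a noisier effect estimate, which is why the best-performing threshold depends on the trait and on GWAS power.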

These results pertain to three exemplar traits and do not speak to the prevalence of this phenomenon. Nonetheless, they demonstrate that the prediction accuracy of a polygenic score can vary markedly depending on sample characteristics of both the original GWAS and the prediction set, even within a single ancestry, and that this variation in prediction accuracy can be substantial—on the same order as reported for different continental ancestries within the UK Biobank (Martin et al., 2019). As one example, the prediction accuracy in East Asian samples, averaged across a number of traits, is about half of that in European samples when GWAS was European-based; when the GWAS is done in the lowest SES group for years of schooling, prediction accuracy in the highest SES group is less than half of that in the lowest SES (Figure 1C). Moreover, whereas for these traits, we had prior information about which characteristics may be relevant, other aspects that vary across sets of individuals are undoubtedly important as well (e.g., smoking behavior and diet may modify genetic effects on lipid traits; Bentley et al., 2019; Telkar et al., 2019), and for other traits of interest, much less may be known a priori.

Possible explanations for the variable prediction accuracy

Our goal in this paper is to highlight that prediction accuracies can vary across groups of highly similar ancestry, rather than to investigate the likely causes for any particular phenotype. Nonetheless, we provide some observations that may cast light on these results. We first note that in these three examples, the prediction accuracies track SNP heritability differences across strata (Figure 2A,B,C). This relationship should be expected, given that the estimation noise decreases with heritability (Appendix 1), and potentially underlies the observation that prediction accuracies using the diverse GWAS sample are often intermediate between those obtained from stratified GWAS samples of equal sample size (Figure 1).

Differences in environmental variance alone do not explain the variable prediction accuracy.

(A,B,C) The x-axes show heritability estimates (± SE) based on LD score regression in each set. The y-axes show incremental R² values obtained using the procedure described in Figure 1, with GWAS performed in a pooled sample of all strata and testing in stratified prediction sets (see Materials and methods); points and bars show mean and central 80% range computed based on 20 iterations of resampling GWAS and prediction sets. ‘Q’ denotes quartile of age and SES in (B,E) and (C,F), respectively. (D,E,F) The x-axes show phenotypic variance estimates (± SE) across strata after adjusting for covariates (sex, age and 20 PCs). If the heritability differences across strata are due to differences in environmental variance alone, with genetic variance constant, then heritability should be inversely proportional to phenotypic variance. The best-fitting model for this inverse proportionality (dashed line, simple linear regression) provides a poor fit to the observations.

Perhaps the simplest explanation for these findings would be that heritabilities, and hence prediction accuracies, vary only because of differences in the extent of environmental variance across strata, while the genetic variance is the same. We can test this hypothesis by examining whether the heritability decreases with increasing phenotypic variance (more precisely whether it is inversely proportional to it), as expected if the genetic variance is fixed across strata. What we find instead is that the estimated SNP heritabilities for all three traits increase or remain the same with increasing phenotypic variance (Figure 2D,E,F). Thus, for these traits at least, the variable prediction accuracy is not simply the result of differences in the extent of environmental heterogeneity across strata.
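The expectation being tested can be written out explicitly. The symbols below are notation assumed here for illustration (V_G for genetic variance, V_E for environmental variance, V_P for phenotypic variance) and are not taken from the text above.

```latex
h^2 \;=\; \frac{V_G}{V_P} \;=\; \frac{V_G}{V_G + V_E}
\qquad\Longrightarrow\qquad
h^2 \;\propto\; \frac{1}{V_P}
\quad \text{across strata, if } V_G \text{ is the same in every stratum.}
```

Under this null hypothesis, a stratum with larger phenotypic variance must have lower heritability, which is the opposite of the pattern seen in Figure 2D,E,F.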

Another possibility is that there is an interaction between genetic effects and sample characteristics, for instance that different sets of genetic variants contribute to blood pressure levels in males and females or to BMI across different stages of life. [Although such interactions could in some contexts be thought of as reflecting GxE, we use the term ‘sample characteristic’ rather than ‘environment’, as environment has different meaning across disciplines, referring in some contexts only to factors that are exogenous to genetics. Viewed in this lens, SES in adulthood cannot be interpreted as exogenous, because it is in part determined by educational achievement, which is itself influenced by genetic factors, and similarly it is questionable whether age or sex are environments.] This explanation is not supported by bivariate LD score regression, which indicates that the genetic correlations across strata are close to 1 (Appendix 1—table 2; Materials and methods). Yet when we re-estimate individual SNP effects in the prediction sets for SNPs ascertained in the original GWAS, the estimated effects of trait-increasing alleles are larger in the groups with higher prediction accuracy (Appendix 1—figure 4; Materials and methods).

One simple model that could reconcile these findings is if effect sizes are highly correlated across the groups, but systematically larger in those groups with higher prediction accuracy. This explanation is reminiscent of the ‘amplification’ model of genetic influences on cognition during development (Briley and Tucker-Drob, 2013).

Other factors complicate interpretation, however, and may also contribute to our observations. In particular, for the case of years of schooling, conditioning on adult SES induces a form of range restriction, which could contribute to variable prediction accuracy across strata. We note, however, that we see highly variable prediction accuracies across SES strata even when the GWAS is conducted in a diverse sample (i.e., including individuals from all strata) (Figure 1C); in that regard, our approach mimics what happens in practice when polygenic scores are used to predict phenotypes in a sample with a smaller range of SES (e.g., Rimfeld et al., 2018). More generally, although this type of range restriction is artificially amplified in our example, SES differences may often be a problem for GWAS in which the sample is not representative of the population; for instance, the most recent major GWAS of educational attainment (Lee et al., 2018) included numerous medical data sets and the 23andMe data set, which are not representative of the national population.

Another potentially important factor is that the adjustment for PCs may not be a sufficient control for the different ways in which population structure can confound GWAS results (Vilhjálmsson and Nordborg, 2013), leading to variable prediction accuracy across strata if they differ in their population structure. To examine this possibility, we repeated the analysis in Figure 1 but using a linear mixed model (LMM) approach (including PCs among other covariates; see Materials and methods), and obtained qualitatively similar results (Appendix 1—figure 5). Although not a perfect fix (Listgarten et al., 2013; Mathieson and McVean, 2013), the fact that we obtain similar results using PCs and LMM suggests that confounding due to population stratification in the UK Biobank alone does not explain the variable prediction accuracies across strata.

Obstacles to portability explored through a comparison of standard and family-based GWAS

Beyond sample characteristics such as age or sex, a number of other factors may shape the portability of scores across groups of similar ancestry. Standard GWAS is done in samples of individuals that deliberately exclude close relatives; as implemented, it detects direct effects of the genetic variants, but also any indirect genetic effects of parents, siblings, or peers, effects of assortative mating among parents, and potentially environmental differences associated with fine-scale population structure (Young et al., 2018; Trejo and Benjamin, 2019; Kong et al., 2018; Lee et al., 2018; Berg et al., 2019). Given that many of these effects are likely to be culturally mediated (Stulp et al., 2017; Selzam et al., 2019), it seems plausible that they may vary within as well as across groups of individuals with different ancestries. If culturally-contingent effects contribute to GWAS estimates (and hence to PGS), they may lead to differences in the prediction accuracy in samples unlike the original GWAS.

To demonstrate that these considerations are not just hypothetical, we compared the prediction accuracy when the PGS is trained on ‘unrelated’ individuals such as those used in a standard GWAS to one obtained from a sibling-based (or ‘sib-based’) GWAS (Materials and methods). In the latter, genotype differences between sibs, a result of random Mendelian segregation in the parents, are tested for association with the phenotypic differences between them. Because the tests depend on phenotypic differences between siblings who, of course, have the same parents, these tests are conditioned on the parental genotypes and hence exclude many of the indirect effects signals that may be picked up in standard GWAS (Appendix 1). Differences between standard and sib-based GWAS are thus informative about the presence of factors other than direct genetic effects (Wood et al., 2014; Trejo and Benjamin, 2019; Lee et al., 2018; Berg et al., 2019; Selzam et al., 2019).
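The logic of a sibling-difference test can be sketched as follows: within each sib pair, any factor shared by the two siblings cancels when differences are taken, so regressing phenotype differences on genotype differences isolates the within-family association. The sketch is a simplified single-SNP illustration under assumed conditions (regression through the origin, no covariates), not necessarily the exact test described in Materials and methods; the function name and toy numbers are hypothetical.

```python
def sib_difference_effect(sib_pairs):
    """Slope of sibling phenotype differences on genotype differences (through the origin).

    Each element of sib_pairs is ((genotype_1, phenotype_1), (genotype_2, phenotype_2)).
    """
    numerator = 0.0
    denominator = 0.0
    for (g1, y1), (g2, y2) in sib_pairs:
        dg = g1 - g2
        dy = y1 - y2
        numerator += dg * dy
        denominator += dg * dg
    if denominator == 0:
        raise ValueError("no sib pair differs in genotype; effect is not estimable")
    return numerator / denominator


# Toy families (made up): phenotype = family-specific term + 0.5 * genotype.
# The family terms (10, -3, 4) stand in for anything siblings share.
TRUE_DIRECT_EFFECT = 0.5
families = [
    (10.0, (2, 1)),   # (shared family term, sibling genotypes)
    (-3.0, (0, 1)),
    (4.0, (1, 1)),    # identical genotypes: contributes no information
]
pairs = [
    ((g1, f + TRUE_DIRECT_EFFECT * g1), (g2, f + TRUE_DIRECT_EFFECT * g2))
    for f, (g1, g2) in families
]

estimate = sib_difference_effect(pairs)
```

In this toy example the estimate recovers the direct effect despite large differences between families, because the shared family terms drop out of every difference.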

A challenge in this comparison is that the UKB contains only ~22K sibling pairs, ~19K of whom are labeled as ‘White British’ (WB). The siblings are similar to the unrelated individuals in terms of ages, SES distributions and genetic ancestries (Appendix 1—figures 6 and 7) but include a higher proportion of females; this difference is unlikely to influence our analyses (see below). While a large number, 19K pairs is still too few to have adequate power to discover trait-associated SNPs, when compared to a standard GWAS using the much larger sample of unrelated WB individuals (~340K).

To increase power and enable a direct comparison between the two designs, we split the SNP ascertainment and effect estimation steps as follows (Figure 3A): we identified SNPs using a standard GWAS with a large sample size (median ~270K across the traits considered) (see Materials and methods). We then estimated the effect of each significant SNP using (i) a sib-based association test and (ii) a standard association test. We chose the size of the estimation set in (ii) such that the median standard error of effect estimates in (i) and (ii) is approximately equal. We then compared the prediction accuracy of the two PGS obtained in this way (‘standard PGS’ and ‘sib-based PGS’) in an independent prediction set of unrelated individuals; as we show in Appendix 1, our approach leads to highly similar prediction accuracies of the two approaches under a model with direct effects only (see Materials and methods for details). A further advantage is that the two scores are compared for the same set of SNPs, such that LD patterns and allele frequency differences do not come into play.

Figure 3. Comparison of prediction accuracy of standard and sib-based polygenic scores.

(A) After ascertaining SNPs in a large sample of unrelated individuals, we estimated the effects of these SNPs with a standard regression using unrelated individuals and, independently, using sib-regression. We then used the polygenic scores for prediction in a third sample of unrelated individuals. We chose the sample size of the standard PGS estimation set such that median effect estimate SEs are equal in the two designs, thereby ensuring equal prediction accuracy under a vanilla model with no indirect effects or assortative mating. Numbers in parentheses are median sample size in each set across 20 traits (see Materials and methods and Appendix 1—table 1 for the definition of each trait, and Appendix 1—table 3 for sample sizes for each trait). (B) Ratio of prediction accuracy in the two designs across 20 traits. For each trait, we performed 10 resampling iterations of unrelated individuals into three sets for discovery, estimation and prediction (small points). Large points show median values. (C-F) We repeated this procedure with different discovery-set p-value thresholds for including a SNP in the polygenic score. The higher the p-value threshold is, the more SNPs are included. For each p-value threshold, points show 10 iterations as described and large points show median values. Shown are a subset of traits, with traits appearing in (B) but not shown here presented in Appendix 1—figure 12.

We applied the approach to 20 traits, focusing on traits with relatively high heritability estimates as well as social and behavioral traits that have been the focus of recent attention in social sciences. For the majority of the traits, such as diastolic blood pressure, BMI, and hair color, the prediction accuracies of standard and sib-based PGS were similar (Figure 3B), as expected under standard GWAS assumptions and as observed for traits simulated under these assumptions (Appendix 1—figure 8). However, for height and for a range of social and behavioral traits, such as years of schooling, pack years of smoking and household income, the prediction accuracy of the sib-based PGS was substantially lower than that of the standard PGS (Figure 3B). [We caution that, because the first step of our study design is to identify SNPs that are associated with the trait in a large set of unrelated individuals and we subsequently match the sampling variances of sib- and standard GWAS, rather than identify distinct sets of SNPs separately in the two designs, the ratio of prediction accuracies that we obtain cannot be directly compared to those reported in other studies.]

A number of factors could contribute to the differences between prediction accuracies for PGS based on sibs versus unrelated individuals, including confounding effects of population stratification, indirect genetic effects from parents and assortative mating. The relative importance of each factor will vary across traits (Rosenberg et al., 2019; Kong et al., 2018; Haworth et al., 2019; Ruby et al., 2018; Selzam et al., 2019). For educational attainment, this gap is likely to reflect at least in part the documented contribution of indirect genetic effects to the standard PGS (Lee et al., 2018; Kong et al., 2018; Young et al., 2018). We show in Appendix 1 that in the presence of indirect genetic effects mediated through parents, standard PGS outperforms sib-based PGS unless direct and indirect effects are strongly anticorrelated (Appendix 1—figure 9), which seems unlikely to be the case for years of schooling. The difference in the performance of sib-based and standard PGS observed for other social and behavioral outcomes, such as household income and age at first sexual intercourse (Figure 3B), may reflect a similar phenomenon. An additional contribution to divergent prediction accuracies could come from indirect effects among siblings, which would also contribute differentially to standard and sibling-based PGS. For height, there may be an important contribution of assortative mating to the difference in prediction accuracies (Wood et al., 2014; Robinson et al., 2017; Lee et al., 2018). In Appendix 1, we show that under a simple model of positive assortative mating, the prediction accuracy based on a standard PGS is higher than that of a sib-based PGS (Appendix 1—figure 10). We further confirmed that the difference in the sex ratio of the siblings and unrelated individuals, mentioned earlier, has a negligible effect on these differences, though it may underlie the slightly lower prediction accuracy of the standard PGS for pulse rate (Appendix 1—figure 11).

The lower prediction accuracies for PGS based on sib-based GWAS indicate that complications such as assortative mating or indirect effects contribute to the standard GWAS estimates. In the absence of these complications, we ensure that prediction accuracies are comparable by matching the sampling errors of the two approaches (Figure 3A). In the presence of these complications, the magnitude of the ratio of prediction accuracies should reflect the strength of assortative mating, the relative contribution of indirect genetic effects compared to direct effects, and so forth. However, interpreting the magnitude of the deviation from 1 is far from straightforward: as we show in Appendix 1, the relative difference in prediction accuracies between the two approaches stems in part from the noise-to-signal ratio for the effect estimates in sib-based versus standard GWAS (Appendix 1, Appendix 1—figures 9 and 10), and as a result also depends on features of the comparison like the sample sizes used and the PGS model.

Motivated by these considerations, we examined how the prediction accuracy varies when progressively relaxing the GWAS p-value threshold for inclusion of SNPs, that is when including more weakly associated SNPs in the PGS. [In Figure 3B, results are shown for the p-value threshold that maximizes the prediction accuracy of the standard PGS, replicating the practice when comparing populations of different ancestry; Martin et al., 2019.] For hair color and diastolic blood pressure, there is little to no difference in prediction accuracy between the two estimation methods, regardless of the number of SNPs included in the score (Figure 3C,D). In contrast, for height, standard and sib-based PGS perform similarly when based on the most significantly associated SNPs, but standard PGS progressively outperforms sib-based PGS when more SNPs are included (Figure 3E). Similarly, the difference in prediction accuracy between sib-based and standard PGS changes markedly for years of schooling, household income and other social and behavioral traits (Figure 3F and Appendix 1—figure 12). The growing gap in performance with increasing p-value threshold likely reflects a combination of an increasing noise-to-signal ratio for the effect estimates in sib-based versus standard GWAS (see Appendix 1) and changes in the relative importance of direct effects versus other factors such as indirect parental effects and assortative mating.

In summary, the differences between the prediction accuracies of standard and sib-based PGS seen for a number of traits (Figure 3B), notably social and behavioral ones, demonstrate that standard GWAS estimates often include a substantial contribution of factors other than direct effects. In these cases, even if the power to detect direct effects were comparable, standard GWAS would lead to higher prediction accuracy than sib-GWAS. In some contexts that may be a sufficient reason to rely on PGS derived from standard GWAS. However, that gain stems from the inclusion of factors such as indirect effects and assortative mating that are likely to be modulated by SES, environment and culture (e.g., Selzam et al., 2019; Stulp et al., 2017). Thus, the increased prediction accuracy likely comes at a cost of not always porting well across groups, even of the same ancestry, in ways that may be difficult to anticipate.

Discussion

Although the conversation around the portability of PGS has largely focused on genetic ancestries, our results show that prediction accuracy can also differ, in some cases substantially, across groups of similar ancestry—even due to basic study design differences such as age, sex or SES composition. When due only to increased environmental variance, such decreased accuracy may not pose a problem, at least for certain applications. But as we have shown, differences in the degree of environmental variance are not the primary explanation for the patterns we report (Figure 2), and other factors, including differences in the magnitude of genetic effects among groups, indirect effects and assortative mating, also lead to differences in the prediction accuracy of PGS, in ways that may make applications of phenotypic prediction less reliable, even within a single ancestry group. For some traits, there is prior information about which factors are likely to be important, but not always, and even for well-studied traits, it may be difficult to enumerate all the influential factors. As an example, we considered the accuracy of the polygenic score for years of schooling and found that it also varies somewhat depending on whether individuals have no sibling or one sibling in the prediction sets (Materials and methods; Appendix 1—figure 13).

Following the discussion of portability across ancestries, we have focused on incremental $R^2$ as a measure of portability. This measure is less directly informative when the goal is to use PGS to reliably identify individuals in the tails of the distribution, that is, those at elevated risk of developing a disease (the main application of PGS in human genetics, as distinct from social science or evolutionary biology). Nonetheless, the same concerns raised here are likely to apply. To illustrate that point, we considered binary outcomes of the traits considered in Figure 1, 'hypertension' (defined as diastolic blood pressure > 110 mmHg), 'obesity' (defined as BMI > 35 kg/m²), and 'college completion', and evaluated the prediction accuracy as measured by incremental AUC (Appendix 1—figure 14). The qualitative results are the same as in Figure 1. We also examined how incremental AUC varies by sex for five binary disease traits that we chose because they have relatively high heritability. For three of them, hypothyroidism and two cardiovascular outcomes, prediction accuracy varies depending on both the GWAS and prediction sets (Appendix 1—figure 15).

Thus, for both quantitative and binary traits, the question of the domain over which a PGS applies is not just about LD patterns, allele frequencies or GxG effects but also about the extent of environmental and genetic variance, GxE, as well as the contribution of direct effects versus indirect effects, assortative mating and environmental confounding. An important implication is that differences in prediction accuracies among groups with distinct ancestries cannot be interpreted exclusively or even primarily in terms of population genetic parameters when these groups differ dramatically in their SES (Chetty and Hendren, 2018; Conley, 2010; Nuru-Jeter et al., 2018; Reich, 2017) and other factors that may affect portability—especially when the relative contribution of these factors to GWAS signals remains unknown (Young et al., 2019; Mills and Rahal, 2019). Thus, efforts to conduct GWAS in groups that vary in ancestry and geographic locations will need to be accompanied by a careful examination of variation in portability along other dimensions.

While these results raise the question of how to best construct a PGS, the answer is not obvious, and likely depends on the specific trait and samples. For example, for the three cases shown in Figure 1, considering a fixed GWAS sample size, the highest prediction accuracy is attained with a GWAS sample limited to some stratum (e.g., women for diastolic blood pressure). Yet a much larger merged data set containing the union of strata generates the most predictive PGS (Appendix 1—figure 1). Together, these observations suggest a trade-off between the factors that are shared among strata and lead to increased power with sample size and those that differ across strata and underlie the variable prediction accuracy. In principle then, if influential factors were known, the composition of the GWAS sample could be optimized to yield the highest accuracy in a given prediction set, but how much each stratum should be weighted will depend on a number of factors such as the genetic and environmental variance in each stratum, genetic correlation across strata, and sample sizes. Moreover, factors such as assortative mating and indirect effects are soaked up into the GWAS estimates—and critically also into the SNP heritability estimates. Thus, the choice of a GWAS sample is about more than power; it is implicitly making a choice about all sorts of sample characteristics that may or may not hold true of the prediction set.

In that regard, it is worth noting that while classical twin studies were often constituted to be representative of a reference population (often national in nature) (Polderman et al., 2015; Branigan et al., 2013), the same is not true of most contemporary human genetic datasets, which are skewed towards medical case-control studies, biobanks that are opt-in (and thus tend to include individuals who are wealthier and better educated than the population average) or direct-to-consumer proprietary genetic databases (which are even more skewed along these dimensions) (Lee et al., 2018). For instance, individuals in UK Biobank have higher SES than the rest of the British population (Fry et al., 2017) and are presumably self-selected for a certain level of interest in biomedical research. These factors alone raise challenges as to the broad portability of PGS derived from them. More generally, it seems plausible that individuals included in a GWAS differ from those that, for myriad reasons, do not end up participating (Taylor et al., 2018), in ways that make it difficult to predict the domain over which GWAS-based estimates can be reliably generalized.

One fruitful way forward may be to study data from related individuals, in which it should be possible to decompose the components of the signals identified in GWAS into direct and indirect effects, the degree of assortative mating and the contribution of residual stratification (Zhang et al., 2015; Young et al., 2018; Kong et al., 2018). Not only will this decomposition help us to better interpret the results of GWAS and the resulting PGS, it will make it possible to examine under which circumstances, and for which phenotypes, components port more reliably to other sets of individuals, both unrelated and related. Ultimately, we envisage that in order to be broadly applicable, GWAS-based phenotypic prediction models will need to include not only a PGS but some study characteristics, other social and environmental measures and, perhaps crucially, their interactions.

Materials and methods

UK Biobank

The UK Biobank (UKB) is a large study of about half a million United Kingdom residents, recruited between 2006 and 2010 (Bycroft et al., 2018). In addition to genetic data, hundreds of phenotypes were collected through measurements and questionnaires at assessment centers, and by accessing medical records of the participants.

Inclusion criteria

In this study, we focused on 408,434 participants who passed quality control (QC) measures provided by UKB; specifically, for whom the reported sex (QC parameter ‘Submitted.Gender’) matched their inferred sex from genotype data (QC parameter ‘Inferred.Gender’); who were not identified as outliers based on heterozygosity and missing rate (QC parameter ‘het.missing.outliers’==0); and did not have an excessive number of relatives in the database (QC parameter ‘excess.relatives’==0). We further selected individuals identified by UKB to be of ‘White British’ (WB) ancestry (QC parameter ‘in.white.British.ancestry.subset’==1), which is a label that refers to those who, when given a set of choices, self-reported to be of ‘White’ and ‘British’ ethnic backgrounds and, in addition, were tightly clustered in a principal component analysis of the genotype data, as detailed in Bycroft et al. (2018). We excluded individuals that had withdrawn from the UK Biobank by the time of the analyses here. For a given trait, we further conditioned on individuals for whom the trait value was reported.

Phenotype data

We focused on 25 traits, including traits with relatively high heritability estimates as well as social and behavioral traits that have been the focus of recent attention in social sciences (see Appendix 1—table 1 for a complete list of phenotype data used in this work, and their corresponding numeric field codes in the UKB data showcase). We calculated the phenotype ‘years of schooling’ by converting the maximal educational qualification of the participants to years following Okbay et al. (2016) (Appendix 1—table 4). For diastolic blood pressure, pulse rate, and forced vital capacity, we took the average of the first two rounds of measurement taken during the same examination at UKB assessment centers. We adjusted the diastolic blood pressure levels for blood pressure lowering medication following Evangelou et al. (2018) by shifting the values upward by 10 mmHg for individuals taking medication. For hand grip strength, we took the average of the measurements for the two hands. For categorical phenotypes, we assigned integer values to each category (Appendix 1—table 1). For hair color, individuals who reported hair color variable ‘Other’ were excluded from the analyses. We considered the binary traits ‘hypertension’, defined as diastolic blood pressure >110 mmHg; ‘obesity’, defined as BMI >35 kg/m²; and ‘college completion’, defined based on attainment of a college or a university degree. Disease outcomes were ascertained using self-reported information and/or using the hospital inpatient main and secondary diagnoses coded according to the International Classification of Diseases (ICD-9 and ICD-10). Hypothyroidism, type 2 diabetes, and rheumatoid arthritis were ascertained based on ICD-10 codes of E03.X, E11.X and M06.X, respectively. Myocardial infarction was ascertained based on ICD-9 codes of 410.9, 411.9, 412.9, or ICD-10 codes of I21.X, I22.X, I23.X, I24.1, I25.2 following Khera et al. (2018), or based on myocardial infarction outcome data among the UK Biobank’s algorithmically-defined outcomes. We also considered the binary outcome of ever being diagnosed to have had a heart attack, angina or stroke. For a subset of individuals, multiple measurements of a phenotype were provided, corresponding to multiple visits to UKB assessment centers; in those cases, we used the measurements during the first visit.

Genotype data

UKB participants were genotyped on either of two similar genotyping arrays, UK Biobank Axiom and UK BiLEVE arrays, at a total of ~850K markers. We focused on autosomal bi-allelic SNPs shared between both arrays, and used plink v. 1.90b5 (Chang et al., 2015) to retain SNPs with call rate >0.95, minor allele frequency >$10^{-3}$, and Hardy-Weinberg equilibrium test p-value >$10^{-10}$ among the WB samples, resulting in 616,323 SNPs.

GWAS and trait prediction methods

GWAS by sample characteristics

We focused on a set of 337,488 WB samples that were identified by the UKB to be ‘unrelated’ (sample QC parameter ‘used.in.pca.calculation’==1 as provided by UKB), defined such that no pairs of individuals are inferred to be 3rd degree relatives or closer. We split the sample into non-overlapping sets of individuals by one of the following factors: age at recruitment (in years), sex, and Townsend deprivation index at recruitment (used as a proxy for socio-economic status or SES; specifically, we take the negative of the Townsend deprivation index as a measure of SES). For SES and age, we divided the sample into four sets: Q1 [minimum value, first quartile], Q2 (first quartile, second quartile], Q3 (second quartile, third quartile], and Q4 (third quartile, maximum value]. We randomly selected 10K samples in each SES and age group, and 20K males and 20K females as held-out prediction sets, and performed GWAS using the remaining samples, matching sample sizes across groups in the GWAS set. We performed nine GWASs: for years of schooling in SES Q1 and SES Q4 (sample size 73,283 for each), and in a diverse sample with equal number of individuals from all four groups (sample size 73,280); for body mass index (BMI) in Q1, Q4, and in a diverse sample with equal number of individuals from all four groups (sample size 72,328 for each); and for diastolic blood pressure in males, females, and in a diverse sample with equal number of males and females (sample size 122,774 for each). We performed all GWASs using plink v. 2.0 (with the flag --linear), adjusting for sex, age (at recruitment) and first 20 PCs as covariates. PCs are principal components of the genotype data, as provided by UKB, calculated using the entire cohort (not just WB individuals). For a subset of cases (where GWAS was performed in samples restricted by characteristics described above), we additionally performed association tests using a linear mixed model (LMM) as implemented in BOLT-LMM v. 2.3.2 (Loh et al., 2015), using LD scores computed from 1000 Genomes European-ancestry samples, with sex, age and first 20 PCs as covariates. The GWAS summary statistics were used to construct PGS for the samples in the prediction sets.
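The quartile-based stratification described above can be sketched as follows. This is only an illustration: the function names are hypothetical, and the choice of the "inclusive" quantile method is an assumption, since the text does not specify how quartile boundaries were computed.

```python
import statistics


def ses_from_townsend(townsend_index):
    """SES proxy as described in the text: the negative of the Townsend deprivation index."""
    return -townsend_index


def quartile_strata(values):
    """Label each value Q1..Q4 using [min, q1], (q1, q2], (q2, q3], (q3, max]."""
    # Assumption: quartile cut points from the 'inclusive' method of the stdlib.
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    labels = []
    for v in values:
        if v <= q1:
            labels.append("Q1")
        elif v <= q2:
            labels.append("Q2")
        elif v <= q3:
            labels.append("Q3")
        else:
            labels.append("Q4")
    return labels
```

Held-out prediction sets would then be drawn at random within each label before running the GWAS on the remaining samples.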

To better understand the performance of PGS across the strata (see ‘Possible explanations for the variable prediction accuracy’), we estimated the mean effect sizes of significant SNPs in each of the strata. To avoid overfitting, we first performed an association test in the pooled sample of all strata excluding individuals in the prediction sets and matching the number of individuals per stratum; sample size 293,132 for years of schooling, 272,456 for BMI, and 245,548 for diastolic blood pressure. Then for significantly associated SNPs (LD pruned as described in ‘Polygenic score construction and trait prediction’), we re-estimated the effect sizes in each of the strata in the prediction sets (see Appendix 1—figure 4). We also used these pooled GWASs to explore the relationship between prediction accuracy and SNP heritability (as shown in Figure 2) and with GWAS sample size (Appendix 1—figure 1). We performed 20 iterations of all above steps.

In addition to the above examples, we explored the prediction accuracy for years of schooling when GWAS and prediction sets are stratified based on the participants’ number of full siblings. Specifically, we performed GWAS using individuals who had exactly one sibling (sample size 90,417), and evaluated prediction in two independent samples of individuals who reported having no siblings or having one sibling (sample size 20K for each) (see Appendix 1—figure 13).

We also considered five binary disease outcomes stratified by sex. Specifically, we performed GWAS in equally sized samples of males and females for hypothyroidism (sample size 135,526), type 2 diabetes (sample size 136,061), rheumatoid arthritis (sample size 136,039), myocardial infarction (sample size 136,061) and having been diagnosed with a heart attack or angina or stroke (sample size 135,833), leaving out 20K samples of males and females for prediction (see Appendix 1—figures 14 and 15). For these traits we used a logistic regression model for GWAS (using plink v. 2.0 with the flag --logistic). An important caveat to analyses of disease outcomes recorded during multiple follow-ups is that for ‘age’, we could only consider the age at recruitment in the GWAS; that approach is not ideal, considering that a fraction of individuals died during the course of the study (about 20K individuals in the full cohort).

Standard versus sibling-based polygenic score

We used the genetic relatedness information provided by UKB to infer sibling pairs among the WB samples. Following Bycroft et al. (2018), we marked pairs with $1/2^{5/2} < \phi < 1/2^{3/2}$ and IBS0 > 0.0012 as siblings, where $\phi$ is the estimated kinship coefficient and IBS0 is the fraction of loci at which individuals share no alleles. By this approach, we identified 19,329 sibling pairs including 35,634 individuals across 17,328 families. For a given trait, we included pairs with the property that trait values for both individuals were reported. We then formed two sets of individuals: 'Siblings' set, including the sibling pairs randomly sampled to include only one pair per family, and an 'Unrelateds' set, including the unrelated individuals identified by the UKB (see section 'GWAS by sample characteristics' above), but excluding the Siblings and 6,911 individuals that were related to the Siblings (3rd degree or closer).
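As a small illustration of the sibling criterion, the sketch below reads the kinship bounds as 1/2^(5/2) and 1/2^(3/2), following Bycroft et al. (2018). The function and constant names are hypothetical and the code is not taken from the study.

```python
# Kinship bounds for first-degree relatives, read as 1/2^(5/2) and 1/2^(3/2).
KINSHIP_LOWER = 2 ** -2.5   # about 0.177
KINSHIP_UPPER = 2 ** -1.5   # about 0.354
# Minimum fraction of loci with no shared alleles; separates full siblings
# from parent-offspring pairs, who share at least one allele at every locus.
IBS0_MIN = 0.0012


def is_sibling_pair(kinship, ibs0):
    """Return True if a pair meets the kinship and IBS0 criteria for full siblings."""
    return KINSHIP_LOWER < kinship < KINSHIP_UPPER and ibs0 > IBS0_MIN
```

The IBS0 condition is what distinguishes the two kinds of first-degree relatives, since both have an expected kinship coefficient of about 0.25.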

We focused on 20 quantitative traits (see Figure 3B for the list of traits considered in this analysis) and a number of simulated traits (see below). For each trait, we first downsampled the Unrelateds set to a sample size n* such that the median standard error of effect estimates roughly matched the median standard error in the sibling-based regression (see 'Estimating n*' below). We then divided the Unrelateds set into three non-overlapping sets: after sampling n* individuals (Unrelateds-n* set), we randomly split the rest of the Unrelateds set into an Unrelateds-prediction set (10% of the samples) to be used as a sample for trait prediction ('prediction set'), and an Unrelateds-discovery set (90% of the samples) to be used for the discovery of trait associated variants (see Appendix 1—figure 3 for sample sizes in each set). For each trait, we performed standard GWAS in the Unrelateds-discovery set, and ascertained SNPs by thresholding on association p-values. We then estimated the effect sizes for these ascertained SNPs in two ways: by a sibling-based association test in the Siblings set (using plink v. 1.90b5’s QFAM procedure with the flag --qfam), and by a standard association test in the Unrelateds-n* set (using plink v. 2.0). Subsequently, for each set of ascertained SNPs in the Unrelateds-discovery set, two PGS were constructed for the samples in the Unrelateds-prediction set (see Figure 3A for overview of the pipeline). We performed 10 iterations of the above sampling, ascertainment and estimation steps, except for simulated traits where we performed 30 iterations.

Estimating n*

In order to compare the performance of sibling-based and standard GWAS designs, we wanted to match both analyses to have similar prediction accuracy under a vanilla model of no assortative mating, population stratification or indirect effects. In Appendix 1, we show that this could be achieved by matching median effect estimate standard errors. For each trait, we therefore calculated n*, the sample size of a standard GWAS that yields roughly equal standard errors in the standard and sibling-based regressions. Specifically, for each trait, we first performed sibling-based GWAS in the Siblings using plink’s QFAM procedure (with the flag --qfam mperm=100000 emp-se). We then randomly sampled a range of sample sizes from the set of Unrelateds, from 5K to 20K in 1K increments. Following Wood et al. (2014), for each sample size, we performed a standard GWAS, and investigated the linear relationship between the square root of the sample size and the inverse of the median standard error of the effect size estimates. We then used this linear relationship to estimate the sample size of a standard GWAS that corresponds to the inverse of the median standard error of the effect size estimates in the sibling-based GWAS.
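The extrapolation step can be sketched as a simple line fit. The snippet below is an illustrative reading of the procedure rather than the code used in the study: it fits the inverse of the median standard error against the square root of the sample size, then solves for the sample size at which the fitted line reaches the inverse of the sib-based median standard error. The names `fit_line` and `estimate_n_star` are hypothetical.

```python
def fit_line(x, y):
    """Least-squares slope and intercept of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_n_star(sample_sizes, median_ses, sib_median_se):
    """Standard-GWAS sample size whose median SE matches the sib-based median SE.

    sample_sizes: subsample sizes at which a standard GWAS was run
    median_ses:   median standard error of effect estimates at each size
    sib_median_se: median standard error from the sibling-based GWAS
    """
    root_n = [n ** 0.5 for n in sample_sizes]
    inverse_se = [1.0 / se for se in median_ses]
    slope, intercept = fit_line(root_n, inverse_se)
    # Invert the fitted line: 1/SE_sib = intercept + slope * sqrt(n*).
    target_root_n = (1.0 / sib_median_se - intercept) / slope
    return target_root_n ** 2
```

If standard errors scaled exactly as one over the square root of the sample size, the fitted intercept would be zero; fitting it leaves room for small departures from that idealized scaling.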

All standard association tests were performed using plink v. 2.0 (with the flag --linear), adjusting for sex, age and first 20 PCs as covariates. For sibling-based association tests we first residualized the phenotypic values on age and sex, and then regressed the sibling differences in residuals on sibling genotypic differences using plink’s QFAM procedure as described above.

We also considered a version of the analysis described above, in which we first residualized the phenotypes on covariates in the pooled sample of all WB individuals, and then ran the pipeline on the residuals without further adjustment for covariates in the GWAS or prediction evaluation. As shown in Appendix 1—figure 16, this approach produced results that are qualitatively the same as what we present in Figure 3.

Simulated traits

We wanted to check that given the study design described above, sibling-based and standard PGS perform similarly with respect to trait prediction, under the vanilla model of no population stratification, assortative mating or indirect genetic effects (Figure 3). To this end, we simulated traits with heritability $h^2$ = 0.1 or 0.5 and either 10K or 100K causal SNPs. For each set of parameters, we simulated three replicates giving a total of 12 simulated traits.

We randomly selected the causal SNPs from a set of 10,879,183 imputed SNPs, considering that most causal variants are plausibly not directly genotyped on SNP arrays. We used a set of SNPs that passed quality control procedures by the Neale lab (http://www.nealelab.is/uk-biobank), namely autosomal SNPs, imputed using the haplotype reference consortium (HRC) panel, which have INFO score > 0.8 and have minor allele frequency > $10^{-4}$; we further limited the SNP set to ones that were bi-allelic in the WB sample. As in Martin et al. (2017), we randomly assigned effect sizes to these causal SNPs as $\beta \sim N(0, h^2/m)$, with $m$ the number of causal SNPs, and zero for non-causal SNPs. We then calculated genetic component of the trait, $g$, for all WB samples under an additive model by summing the allelic counts weighted by their effect sizes using plink (with the flag --score). Allelic counts were determined by converting imputation dosages to genotype calls with no hard calling threshold. We also assigned environmental contributions as $\varepsilon \sim N(0, 1-h^2)$, and then constructed the PGS for each individual,

$$g = \sum_{i=1}^{m} \beta_i X_i,$$

where $X_i$ is the number of minor alleles at SNP $i$ carried by the individual, and the trait value for the individual is calculated as the sum of genetic and environmental contributions:

$$y = \sqrt{h^2}\,\frac{g - \bar{g}}{\sigma_g} + \sqrt{1-h^2}\,\frac{\varepsilon - \bar{\varepsilon}}{\sigma_\varepsilon}$$

where bars represent averages, $\sigma_g$ is the standard deviation of PGS across individuals and $\sigma_\varepsilon$ is the standard deviation of environmental contributions across individuals. These simulated traits were then analyzed using the same pipelines as the other traits (e.g., adjusting for covariates etc.). Importantly, SNP discovery and effect size estimations in GWAS were performed without knowledge of the causal SNPs.
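The trait simulation can be sketched in a few lines. The snippet below is illustrative only: it takes a small matrix of minor-allele counts at the causal SNPs as input instead of imputed UKB genotypes, it reads the second argument of each normal distribution as a variance (an assumption), and the function names are hypothetical.

```python
import math
import random


def standardize(values):
    """Center to mean 0 and scale to unit (population) standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def simulate_trait(allele_counts, h2, seed=0):
    """Simulate an additive trait with heritability h2.

    allele_counts: one row per individual, one minor-allele count per causal SNP.
    Returns (effect sizes, genetic components g, trait values y).
    """
    rng = random.Random(seed)
    m = len(allele_counts[0])
    # Effect sizes with variance h2/m (random.gauss takes a standard deviation).
    betas = [rng.gauss(0.0, math.sqrt(h2 / m)) for _ in range(m)]
    # Genetic component: allele counts weighted by effect sizes.
    g = [sum(b * x for b, x in zip(betas, row)) for row in allele_counts]
    # Environmental contributions with variance 1 - h2.
    eps = [rng.gauss(0.0, math.sqrt(1.0 - h2)) for _ in allele_counts]
    # Standardize both components, then weight by sqrt(h2) and sqrt(1 - h2).
    y = [math.sqrt(h2) * gs + math.sqrt(1.0 - h2) * es
         for gs, es in zip(standardize(g), standardize(eps))]
    return betas, g, y
```

Standardizing the two components before weighting them fixes the genetic share of the trait variance at h2 in the simulated sample, up to the small sampling covariance between the two components.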

Polygenic score construction and trait prediction

For all GWAS designs described above, we used p-value thresholding followed by clumping to choose sets of roughly independent SNPs to build PGS. We considered a logarithmically-spaced range of p-values: $10^{-8}$, $10^{-7}$, $10^{-6}$, $10^{-5}$, $10^{-4}$, $10^{-3}$, and $10^{-2}$ (or a subset if no SNP reached that significance level). We then used plink’s clumping procedure (with the flag --clump) with LD threshold $r^2 < 0.1$ (using 10,000 randomly selected unrelated WB samples as a reference for LD structure) and physical distance threshold of >1 Mb. The selected SNPs were then used to calculate PGS for individuals in the prediction sets, by summing the allelic counts weighted by their estimated effect sizes (log of the odds ratios in the case of binary traits) using plink (with the flag --score). In a subset of cases, we also calculated polygenic scores using LDpred assuming all loci are causal (Vilhjálmsson et al., 2015). To evaluate prediction accuracy, we calculated the incremental $R^2$: we first determined $R^2$ in a regression of the phenotype on the covariates, and then calculated the change in $R^2$ when including the PGS as a predictor. For binary traits, we calculated the incremental area under the receiver operating characteristic curve (AUC).
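The incremental R² calculation can be illustrated with a small ordinary least squares routine. This is a sketch using only the Python standard library, with hypothetical function names; in practice a statistics package would be used, and the covariates would include sex, age and the genotype PCs.

```python
def ols_r2(columns, y):
    """R^2 of an OLS fit of y on the given predictor columns plus an intercept."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(rows[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    aug = [[sum(r[a] * r[b] for r in rows) for b in range(k)]
           + [sum(r[a] * yi for r, yi in zip(rows, y))] for a in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    coef = [aug[i][k] / aug[i][i] for i in range(k)]
    fitted = [sum(b * x for b, x in zip(coef, r)) for r in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def incremental_r2(covariates, pgs, y):
    """Change in R^2 when the PGS is added to a covariates-only regression."""
    return ols_r2(covariates + [pgs], y) - ols_r2(covariates, y)
```

Here `covariates` is a list of columns, one per covariate, so that `incremental_r2([age, sex], pgs, phenotype)` returns the share of phenotypic variance explained by the score beyond what the covariates already explain.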

Estimating heritability and genetic correlation

We calculated SNP heritability across sex, age and SES groups for diastolic blood pressure, BMI and years of schooling, respectively (as described in the section ‘GWAS by sample characteristics’) as well as genetic correlations across pairs of groups: we first performed GWAS using all unrelated WB individuals in each group. We then used the GWAS summary statistics to perform LD score regression with LD scores computed from the 1000 Genomes European-ancestry samples (Bulik-Sullivan et al., 2015).

Appendix 1

1 Prediction accuracies of polygenic scores based on standard and sib-GWAS

1.1 Overview of derived results

In the main text, we compare the prediction accuracies of polygenic scores (PGS) based on a standard GWAS of unrelated individuals and a GWAS based on sibling differences, for a number of traits. Here, we describe how this comparison is implemented, and how indirect effects and assortative mating manifest in this comparison.

Matching standard and sib-based prediction accuracies

Current standard GWAS are based on huge sample sizes, leading to less noisy estimates than are afforded by family association studies such as those based on sib-differences, which are typically much smaller. This difference in precision needs to be taken into account in making comparisons between the prediction accuracy of scores derived from the two approaches. We show that under a vanilla additive model with no assortative mating, indirect effects, population structure (or other complications), and if the standard GWAS is subsampled to a sample size

$$n^* \approx \frac{1}{1+(1-h^2)(1-2\rho_{sib})}\,n_{pairs},$$

where $n_{pairs}$ is the number of sib pairs, $h^2$ is the heritability and $\rho_{sib}$ is the correlation in environmental effects experienced by siblings, the two study designs are expected to have the same (out-of-sample) prediction accuracy (see Section 1.2). This analytic result is of limited practical use, however; in particular, it requires prior knowledge about the extent to which environmental effects correlate among siblings. Instead, we took an empirical approach to match the prediction accuracies in the two approaches: following Wood et al. (2014), we subsampled the regular GWAS to match the median standard errors of the sib-GWAS. As we show in Section 1.2.3, under our vanilla model, we then expect equal out-of-sample prediction accuracies for polygenic scores derived from the two study designs.

Indirect parental effects

In the presence of indirect parental effects, the out-of-sample prediction accuracy takes a simple form. For a polygenic score based on a standard GWAS, we obtain

$$E[R^2_{ur}] = \tau^2\,\frac{1}{1+c},$$

where $\tau^2$ is the ratio of the variance in the trait due to both direct effects and indirect effects of transmitted parental alleles over the total phenotypic variance; and $c$ is a term representing the noise-to-signal ratio in a standard GWAS. For the polygenic score based on sib-GWAS, we obtain

$$E[R^2_{sib}] = \left(1+\rho\,\frac{\sigma_\eta}{\sigma_\beta}\right)^2 h_\beta^2\,\frac{1}{1+c\,\tau^2/h_\beta^2},$$

where $\sigma_\beta^2$ and $\sigma_\eta^2$ are the variances of random direct and indirect effects, respectively, $\rho$ is the correlation between direct and indirect effects, and $h_\beta^2$ is the proportion of the phenotypic variance explained by direct effects. Our results suggest that under plausible conditions, the presence of indirect effects would lead to higher prediction accuracy in a standard GWAS. This result holds whether direct and indirect effects are positively correlated, uncorrelated or even somewhat negatively correlated (Appendix 1—figure 9).

Assortative mating

We investigated several models of assortative mating by simulation. Standard GWAS-based polygenic scores have greater prediction accuracies than those based on sib-GWAS when the parental phenotypes are positively correlated, and the reverse is true if they are negatively correlated (Appendix 1—figure 10 A,B). The relative difference in prediction accuracies of the two study designs grows with the inclusion of more SNPs in the polygenic score model (Appendix 1—figure 10 D,F).

In our analytic model, we ignored the ascertainment step of our study design, in which it is decided which SNPs to include in the polygenic score. We assumed that SNPs are pre-ascertained and that the set of ascertained SNPs includes all causal ones. In a subset of simulations, we implemented the ascertainment step based on an independent simulated GWAS (see below). In both settings, we refer (somewhat loosely therefore) to the regression on ascertained SNPs in a sample of unrelated individuals as ‘standard GWAS’ and the regression of the difference in phenotypes on the difference in sib genotypes as ‘sib-GWAS.’

1.2 Picking the sample size of the standard GWAS to match the prediction accuracy of the score based on the sib-GWAS

We look for the sample size $n^*$ of a standard GWAS performed on a sample of unrelated individuals such that, under our vanilla model, the resulting polygenic score has the same (out-of-sample) prediction accuracy as the polygenic score obtained from a sib-GWAS with sample size $n_{pairs}$. We begin by assuming that all causal sites $i$ are known; that they are unlinked; that they have only additive, direct effects on the phenotype; and that there is no population stratification or assortative mating. We first find the sampling variance of the effect size estimate for a single site obtained from each of the two study designs. We then examine (and ultimately match) the prediction accuracy of the polygenic scores obtained from effect sizes estimated in the estimation sets, $\hat\beta^{ur}$ and $\hat\beta^{sib}$, on a new, independent prediction sample of unrelated individuals $\{(x,y)\}$.

1.2.1 Sampling error of the estimated effect size at a single site

Our model for the phenotypic value y is

y=g+e

where e is a Normally distributed environmental effect (which includes all sources of random noise) and

$$g = \beta_0^{ur} + \sum_i \beta_i x_i,$$

where $x_i \in \{0,1,2\}$ are random genotypes. The genotype is coded as the number of alleles with effect $\beta_i$ carried by the individual at site $i$. Effect sizes $\beta=\{\beta_i\}$ are treated as fixed parameters throughout (except when noted otherwise in the very last step leading to Equation 23). We can rewrite our model to focus on the effect size at a single site $i$:

$$y = \beta_0 + \beta_i x_i + \epsilon_i, \tag{1}$$

where

$$\epsilon_i = g - \beta_i x_i + e,$$

with variance

$$\mathrm{Var}[\epsilon_i] = \mathrm{Var}[g-\beta_i x_i] + \mathrm{Var}[e] = \mathrm{Var}[y] - \beta_i^2\,\mathrm{Var}[x_i].$$

In an OLS regression, the sampling variance (the squared standard error) of the estimated effect of an allele at site $i$ is

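$$\mathrm{Var}[\hat\beta_i^{ur}] = \frac{\mathrm{Var}[\epsilon_i]}{(n-1)\,\mathrm{Var}[x_i]} = \frac{\mathrm{Var}[y]-\beta_i^2\,\mathrm{Var}[x_i]}{(n-1)\,\mathrm{Var}[x_i]}, \tag{2}$$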

where $n$ is the sample size and $\hat\beta^{ur}$ denotes that the estimate was obtained using a sample of unrelated individuals. In sib-GWAS, our model for site $i$ is

$$\Delta y = \beta_0^{sib} + \beta_i\,\Delta x_i + \Delta\epsilon_i,$$

with variance

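$$\begin{aligned}
\mathrm{Var}[\Delta\epsilon_i] &= \mathrm{Var}[\Delta g-\beta_i\,\Delta x_i] + \mathrm{Var}[\Delta e] \\
&= \mathrm{Var}[\Delta g] + \beta_i^2\,\mathrm{Var}[\Delta x_i] - 2\beta_i^2\,\mathrm{Var}[\Delta x_i] + \mathrm{Var}[\Delta e].
\end{aligned}$$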

Recall that for siblings (denoted with subscripts A and B), we expect

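$$\begin{aligned}
\mathrm{Cov}[x_{i,A},x_{i,B}] &= \tfrac{1}{2}\,\mathrm{Var}[x_i],\\
\mathrm{Cov}[g_A,g_B] &= \tfrac{1}{2}\,\mathrm{Var}[g].
\end{aligned}$$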

Plugging these back in, we obtain

$$\mathrm{Var}[\Delta\epsilon_i] = \mathrm{Var}[g] - \beta_i^2\,\mathrm{Var}[x_i] + 2\,\mathrm{Var}[e]\,(1-\rho_{sib}),$$

where $\rho_{sib}=\mathrm{Cor}[e_A,e_B]$ is the correlation in environmental effects between sibs. The variance of the estimated effect size in sib-GWAS is therefore

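$$\mathrm{Var}[\hat\beta_i^{sib}] = \frac{\mathrm{Var}[\Delta\epsilon_i]}{(n_{pairs}-1)\,\mathrm{Var}[\Delta x_i]} = \frac{\mathrm{Var}[y]-\beta_i^2\,\mathrm{Var}[x_i]+\mathrm{Var}[e]\,(1-2\rho_{sib})}{(n_{pairs}-1)\,\mathrm{Var}[x_i]}. \tag{3}$$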
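The sibling genotype covariance used in this derivation, and the resulting equality of the variance of the sibling genotype difference and the genotype variance, can be checked numerically. The following is a small simulation sketch under random mating at a single site; the allele frequency, the number of families and the helper name `child` are illustrative choices made here.

```python
import numpy as np

rng = np.random.default_rng(0)
p, n_fam = 0.3, 200_000  # illustrative allele frequency and family count

# Two alleles per parent, each drawn independently as Bernoulli(p).
mother = rng.random((n_fam, 2)) < p
father = rng.random((n_fam, 2)) < p


def child(mother, father, rng):
    """Genotype of one offspring per family: one random allele per parent."""
    rows = np.arange(len(mother))
    from_m = mother[rows, rng.integers(0, 2, len(mother))]
    from_f = father[rows, rng.integers(0, 2, len(father))]
    return from_m.astype(int) + from_f.astype(int)


x_a = child(mother, father, rng)  # sibling A
x_b = child(mother, father, rng)  # sibling B, same parents

var_x = x_a.var()                # expected: 2p(1-p)
cov_ab = np.cov(x_a, x_b)[0, 1]  # expected: Var[x] / 2
var_dx = (x_a - x_b).var()       # expected: Var[x]
```

With these settings the three quantities should be close to 0.42, 0.21 and 0.42, respectively, which is why the denominator of Equation 3 involves the genotype variance itself rather than the variance of the sibling difference.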

1.2.2 Sample size required for equal prediction accuracy

We measure prediction accuracy as the expected squared correlation between polygenic scores $\hat g$ and phenotypic values in an independent prediction set of unrelated individuals, denoted $\{(x,y)\}$,

$$E_{\{(x,y)\}}[R^2] = \frac{\mathrm{Cov}^2[\hat g(x),y]}{\mathrm{Var}[y]\,\mathrm{Var}[\hat g(x)]}.$$

To incorporate randomness both in the estimation set (summarized by the Multivariate Normal distribution of $\hat\beta$) and the prediction set $\{(x,y)\}$, we will require

$$E_{\hat\beta^{ur}(n)}\left[E_{\{(x,y)\}}[R^2]\right] \overset{!}{=} E_{\hat\beta^{sib}(n_{pairs})}\left[E_{\{(x,y)\}}[R^2]\right],$$

where $\hat\beta(n)$ is a set $\{\hat\beta_i\}$ estimated in a GWAS with sample size $n$. Equivalently,

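$$E_{\hat\beta^{sib}}\left[\frac{\mathrm{Cov}^2[\hat g^{sib}(x),y]}{\mathrm{Var}[\hat g^{sib}(x)]}\right] \overset{!}{=} E_{\hat\beta^{ur}}\left[\frac{\mathrm{Cov}^2[\hat g^{ur}(x),y]}{\mathrm{Var}[\hat g^{ur}(x)]}\right], \tag{4}$$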

where we left out the sample sizes for brevity, and Var[y] was cancelled out. Finally, we can replace Equation 4 by its first order Taylor approximation to get the requirement

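$$\frac{E_{\hat\beta}\left[\mathrm{Cov}_{\{(x,y)\}}[\hat g^{sib}(x),y]\right]^2}{E_{\hat\beta}\left[\mathrm{Var}_{\{(x,y)\}}[\hat g^{sib}(x)]\right]} \overset{!}{=} \frac{E_{\hat\beta}\left[\mathrm{Cov}_{\{(x,y)\}}[\hat g^{ur}(x),y]\right]^2}{E_{\hat\beta}\left[\mathrm{Var}_{\{(x,y)\}}[\hat g^{ur}(x)]\right]}. \tag{5}$$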

We solve Equation 5 for the sample size $n^*$ to be used for estimation of the polygenic score in a standard GWAS. We note that if the vector of estimates $\hat\beta$ is given, then

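$$\begin{aligned}
\mathrm{Cov}_{\{(x,y)\}}[y,\hat g(x)\,|\,\hat\beta] &= \mathrm{Cov}_{\{(x,y)\}}\Big[g(x),\,g(x)+\sum_i^m x_i(\hat\beta_i-\beta_i)\,\Big|\,\hat\beta\Big] \\
&= \mathrm{Var}_{\{(x,y)\}}[g(x)\,|\,\hat\beta] + \sum_i^m \mathrm{Cov}_{\{(x,y)\}}[\beta_i x_i,\,(\hat\beta_i-\beta_i)x_i\,|\,\hat\beta] \\
&= \sum_i^m \mathrm{Var}[x_i]\,\beta_i\hat\beta_i.
\end{aligned} \tag{6}$$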

Since for every i, we have

$$E[\hat\beta_i^{ur}] = E[\hat\beta_i^{sib}] = \beta_i,$$

we obtain

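$$E_{\hat\beta^{sib}}\left[\mathrm{Cov}[y,\hat g^{sib}(x)\,|\,\hat\beta^{sib}]\right] = \sum_i^m \mathrm{Var}[x_i]\,\beta_i^2 = E_{\hat\beta^{ur}}\left[\mathrm{Cov}[y,\hat g^{ur}(x)\,|\,\hat\beta^{ur}]\right],$$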

which turns the requirement of Equation 5 into

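$$E_{\hat\beta^{sib}}\left[\mathrm{Var}_{\{(x,y)\}}[\hat g^{sib}(x)]\right] \overset{!}{=} E_{\hat\beta^{ur}}\left[\mathrm{Var}_{\{(x,y)\}}[\hat g^{ur}(x)]\right],$$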

or simply

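$$\sum_i^m \mathrm{Var}[x_i]\,\mathrm{Var}[\hat\beta_i^{ur}] \overset{!}{=} \sum_i^m \mathrm{Var}[x_i]\,\mathrm{Var}[\hat\beta_i^{sib}]. \tag{7}$$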

Plugging the sampling variance results from Equation 2 and Equation 3 into Equation 7 and reordering, we obtain

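$$\frac{n^*-1}{n_{pairs}-1} = \frac{\sum_i^m \left(\mathrm{Var}[y]-\beta_i^2\,\mathrm{Var}[x_i]\right)}{\sum_i^m \left(\mathrm{Var}[y]-\beta_i^2\,\mathrm{Var}[x_i]+\mathrm{Var}[e]\,(1-2\rho_{sib})\right)},$$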

or, assuming that the trait is polygenic such that $m \gg 1$,

$$\frac{n^*}{n_{pairs}} \approx \frac{1}{1+(1-h^2)(1-2\rho_{sib})}. \tag{8}$$

Equation 8 can in principle be applied to the estimation of $\rho_{sib}$ for a given trait, under our model assumptions, and given an independent estimate of $h^2$.
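Equation 8 and its inversion can be written as two one-line functions. This is only an illustrative sketch: the function names `matched_sample_size` and `implied_rho_sib` are hypothetical, and both rely on the vanilla model assumptions stated above.

```python
def matched_sample_size(n_pairs, h2, rho_sib):
    """Equation 8: standard-GWAS sample size n* matching a sib-GWAS of n_pairs."""
    return n_pairs / (1.0 + (1.0 - h2) * (1.0 - 2.0 * rho_sib))


def implied_rho_sib(n_star, n_pairs, h2):
    """Equation 8 solved for rho_sib, given n*, n_pairs and h2 (requires h2 < 1)."""
    return 0.5 * (1.0 - (n_pairs / n_star - 1.0) / (1.0 - h2))


# Illustrative values only: 10,000 sib pairs, h2 = 0.5.
n_star_shared_env = matched_sample_size(10_000, 0.5, 0.5)  # rho_sib = 0.5
n_star_no_shared = matched_sample_size(10_000, 0.5, 0.0)   # rho_sib = 0
rho_back = implied_rho_sib(n_star_no_shared, 10_000, 0.5)
```

When $\rho_{sib}=0.5$ the matched sample size equals the number of sib pairs; when siblings share no environmental effects, a smaller standard GWAS suffices to reach the same sampling variance.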

1.2.3 Empirical matching of standard errors

The result of Equation 8 is the same as we would obtain if we required

$$\forall i\quad \mathrm{Var}[\hat\beta_i^{sib}(x_i)] \overset{!}{=} \mathrm{Var}[\hat\beta_i^{ur}(x_i^{sib})] \tag{9}$$

without taking into account randomness in the prediction set. In practice (and in the results shown in the main text), we have no prior knowledge about $\rho_{sib}$ and instead we find a sample size $n^*$ for the standard GWAS such that

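$$\underset{\text{sites } i}{\mathrm{median}}\left(\mathrm{Var}[\hat\beta_i^{sib}(x)]\right) \overset{!}{=} \underset{\text{sites } i}{\mathrm{median}}\left(\mathrm{Var}[\hat\beta_i^{ur}(x)]\right). \tag{10}$$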

We note that the condition in Equation 9 is approximately met because, if we assume that y is a highly polygenic trait where

$$\forall i\quad \beta_i^2\,\mathrm{Var}[x_i] \ll \mathrm{Var}[y],$$

then, if for one site $j$, $n$ satisfies

$$\mathrm{Var}[\hat\beta_j^{sib}(x)] = \mathrm{Var}[\hat\beta_j^{ur}(x)] = \frac{D(n)}{\mathrm{Var}[x_j]}$$

such that $D(n)$ is the same for sib-GWAS and standard GWAS, then for all sites $D(n)=\frac{\mathrm{Var}[y]}{n-1}$ is the same, namely,

$$\forall i\quad \mathrm{Var}[\hat\beta_i^{sib}(x)] = \mathrm{Var}[\hat\beta_i^{ur}(x)] = \frac{D(n)}{\mathrm{Var}[x_i]}.$$

Equation 10 can therefore be thought of as using a weighted median to estimate $n$, where each site $i$ is weighted by $\frac{1}{\mathrm{Var}[x_i]}$. In conclusion, the requirement of Equation 10 leads to equal prediction accuracy of standard and sib-GWAS under the vanilla model assumptions. We note further that in the main text (Figure 3), to follow common practice, we use incremental $R^2$ throughout rather than $R^2$. However, as we show in Appendix 1—figure 16, using $R^2$ instead gives highly similar qualitative results.

1.3 Indirect parental effects

1.3.1 Distribution of the effect size estimate at a single site

We consider an additive model with direct effects as well as indirect parental effects, assuming no interaction between the parents and the polygenic score of the children and ignoring possible indirect effects of siblings on each other. The other assumptions from the previous section—for example independent segregation of alleles across sites—remain. We start by considering the model

y=β0+g+n+e

where g is the sum of direct effects in an individual with genotype (effect-allele count) xi at each site i,

$$g = \sum_i^m \beta_i x_i,$$

and

$$n = \sum_i^m \eta_i\,(x_i+\tilde{x}_i^m+\tilde{x}_i^p)$$

is the sum of parental indirect effects, with overall parental effect allele count $x_i+\tilde{x}_i^p+\tilde{x}_i^m$ at each site, where $\tilde{x}_i^m$ is the untransmitted maternal effect allele count, and $\tilde{x}_i^p$ the untransmitted paternal effect allele count, with $\tilde{x}_i^m,\tilde{x}_i^p \in \{0,1\}$. As we show, when we choose the standard GWAS sample size $n^*$ such that the sampling error of the effect size estimates matches that of the sib-GWAS, the prediction accuracies of the two polygenic scores differ in an independent sample: unless there is a large, negative correlation between indirect and direct effects, the polygenic score from standard GWAS is expected to outperform the one based on sib-GWAS.

We first examine the distribution of an estimated effect size of $x_i$ on the phenotype. The OLS regression for a single site in a standard GWAS follows Equation 1 and can be rewritten as

$$y = \beta_0 + (\beta_i+\eta_i)\,x_i + \eta_i\,(\tilde{x}_i^p+\tilde{x}_i^m) + \epsilon_i \tag{11}$$

with

$$\epsilon_i = g + n + e - (\beta_i+\eta_i)\,x_i - \eta_i\,(\tilde{x}_i^p+\tilde{x}_i^m).$$

By the assumption of no assortative mating or other population structure,

$$\mathrm{Cov}[\tilde{x}_i^p,\tilde{x}_i^m] = \mathrm{Cov}[x_i,\tilde{x}_i^m] = \mathrm{Cov}[x_i,\tilde{x}_i^p] = 0. \tag{12}$$

It directly follows that under the generative model specified by Equation 11, the OLS regression of $y$ on $x_i$ and $\tilde{x}_i^p+\tilde{x}_i^m$ is a regression involving two independent variables. Therefore, $\hat\beta_i^{ur}$ is Normally distributed with expectation

$$E[\hat\beta_i^{ur}] = \beta_i+\eta_i.$$

We next calculate the variance of $\hat\beta_i^{ur}$. From Equation 12 and

$$\mathrm{Var}[\tilde{x}_i^m+\tilde{x}_i^p] = \mathrm{Var}[x_i],$$

we obtain

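$$\begin{aligned}
\mathrm{Var}[\epsilon_i] &= \mathrm{Var}[y] + (\beta_i+\eta_i)^2\,\mathrm{Var}[x_i] + \eta_i^2\,\mathrm{Var}[x_i] - 2\,\mathrm{Cov}[g+n,\,(\beta_i+\eta_i)x_i] - 2\,\mathrm{Cov}[n,\,\eta_i(\tilde{x}_i^m+\tilde{x}_i^p)] \\
&= \mathrm{Var}[y] - \mathrm{Var}[x_i]\,(\beta_i^2+2\beta_i\eta_i+2\eta_i^2).
\end{aligned}$$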

Finally,

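$$\mathrm{Var}[\hat\beta_i^{ur}] = \frac{\mathrm{Var}[\epsilon_i]}{(n-1)\,\mathrm{Var}[x_i]} = \frac{\mathrm{Var}[y]-\mathrm{Var}[x_i]\,(\beta_i^2+2\beta_i\eta_i+2\eta_i^2)}{(n-1)\,\mathrm{Var}[x_i]}. \tag{13}$$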

In sib regression, we have

Δy=Δg+Δe

since indirect parental effects cancel out when taking the difference between siblings (as siblings have the same parental effect allele count). Thus, the expected estimate is the same as it was in the absence of indirect effects. Using the same considerations as in Section 1.2 for the variance in sib differences, we obtain

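$$\hat\beta_i^{sib} \sim N\left(\beta_i,\ \frac{\mathrm{Var}[g]-\beta_i^2\,\mathrm{Var}[x_i]+2\,\mathrm{Var}[e]\,(1-\rho_{sib})}{(n_{pairs}-1)\,\mathrm{Var}[x_i]}\right),$$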

where $\rho_{sib}$ is again the correlation in environmental effects between siblings.

1.3.2 Polygenic score prediction accuracy

We now examine the difference in prediction accuracies of $\hat g^{ur}$ and $\hat g^{sib}$ after matching

$$\mathrm{Var}[\hat\beta_i^{ur}] \overset{!}{=} \mathrm{Var}[\hat\beta_i^{sib}] \tag{14}$$

by choosing a standard GWAS sample size $n^*$ that empirically satisfies the condition, as we do in the main text (see also Section 1.2.3).

We can derive the expected prediction accuracy by averaging over both the estimation set (which we again shorthand as the distribution of $\hat\beta$) and the prediction set $\{(x,y)\}$. By the law of total expectation,

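$$\begin{aligned}
E[R^2] = E_{\hat\beta}\left[E_{\{(x,y)\}}[R^2]\right] &= E_{\hat\beta}\left[\frac{\mathrm{Cov}^2_{\{(x,y)\}}[\hat g(x),y\,|\,\hat\beta]}{\mathrm{Var}_{\{(x,y)\}}[y\,|\,\hat\beta]\,\mathrm{Var}_{\{(x,y)\}}[\hat g(x)\,|\,\hat\beta]}\right] \\
&\approx \frac{E_{\hat\beta}\left[\mathrm{Cov}_{\{(x,y)\}}[\hat g(x),y\,|\,\hat\beta]\right]^2}{\mathrm{Var}_{\{(x,y)\}}[y\,|\,\hat\beta]\,E_{\hat\beta}\left[\mathrm{Var}_{\{(x,y)\}}[\hat g(x)\,|\,\hat\beta]\right]},
\end{aligned} \tag{15}$$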

where the last step is an approximation of the expectation of ratio by its first-order Taylor expansion, a ratio of expectations. The numerator of Equation 15 is

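$$\begin{aligned}
E_{\hat\beta}\left[\mathrm{Cov}_{\{(x,y)\}}[\hat g(x),y\,|\,\hat\beta]\right]^2 &= E_{\hat\beta}\left[\sum_i^m (\beta_i+\eta_i)\,\hat\beta_i\,\mathrm{Cov}_{\{(x,y)\}}[x_i,x_i\,|\,\hat\beta]\right]^2 \\
&= E_{\hat\beta}\left[\sum_i^m \mathrm{Var}[x_i]\,(\beta_i+\eta_i)\,\hat\beta_i\right]^2 \\
&= \left(\sum_i^m \mathrm{Var}[x_i]\,(\beta_i+\eta_i)\,E[\hat\beta_i]\right)^2.
\end{aligned} \tag{16}$$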

The terms in the denominator of Equation 15 are

$$\mathrm{Var}_{\{(x,y)\}}[y\,|\,\hat\beta] = \mathrm{Var}[y] \tag{17}$$

and

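$$E_{\hat\beta}\left[\mathrm{Var}_{\{(x,y)\}}[\hat g(x)\,|\,\hat\beta]\right] = E_{\hat\beta}\left[\sum_i^m \mathrm{Var}[x_i]\,\hat\beta_i^2\right] = \sum_i^m \mathrm{Var}[x_i]\left(E[\hat\beta_i]^2+\mathrm{Var}[\hat\beta_i]\right). \tag{18}$$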

Plugging Equations 16,17,18 back into Equation 15, we obtain

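$$E[R^2] \approx \frac{\left(\sum_i^m \mathrm{Var}[x_i]\,(\beta_i+\eta_i)\,E[\hat\beta_i]\right)^2}{\mathrm{Var}[y]\left(\sum_i^m \mathrm{Var}[x_i]\,\mathrm{Var}[\hat\beta_i]+\sum_i^m \mathrm{Var}[x_i]\,E[\hat\beta_i]^2\right)}. \tag{19}$$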

We note that

$$\tilde{C} := \mathrm{Var}[y]\sum_i^m \mathrm{Var}[x_i]\,\mathrm{Var}[\hat\beta_i]$$

is the same for sib-GWAS and standard GWAS under the requirement of Equation 14. We therefore have

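$$E[R^2_{ur}] \approx \frac{\left(\sum_i^m \mathrm{Var}[x_i]\,(\beta_i+\eta_i)^2\right)^2}{\tilde{C}+\mathrm{Var}[y]\sum_i^m \mathrm{Var}[x_i]\,(\beta_i+\eta_i)^2}, \tag{20}$$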

and

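$$E[R^2_{sib}] \approx \frac{\left(\sum_i^m \mathrm{Var}[x_i]\,(\beta_i+\eta_i)\,\beta_i\right)^2}{\tilde{C}+\mathrm{Var}[y]\sum_i^m \mathrm{Var}[x_i]\,\beta_i^2}. \tag{21}$$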

If we denote the proportion of the phenotypic variance explained by direct effects by

$$h_\beta^2 := \frac{\sum_i^m \mathrm{Var}[x_i]\,\beta_i^2}{\mathrm{Var}[y]},$$

the proportion of the phenotypic variance explained by indirect effects of transmitted parental alleles by

$$\tau_\eta^2 := \frac{\sum_i^m \mathrm{Var}[x_i]\,\eta_i^2}{\mathrm{Var}[y]},$$

and the proportion of phenotypic variance explained by both direct and indirect effects of transmitted alleles by

$$\tau^2 := \frac{\sum_i^m \mathrm{Var}[x_i]\,(\beta_i+\eta_i)^2}{\mathrm{Var}[y]},$$

then Equation 20 can be written as

$$E[R^2_{ur}] \approx \tau^2\,\frac{1}{1+c}, \tag{22}$$

where we defined

$$c := \frac{\sum_i^m \mathrm{Var}[x_i]\,\mathrm{Var}[\hat\beta_i]}{\sum_i^m \mathrm{Var}[x_i]\,(\beta_i+\eta_i)^2}.$$

Here, c can be thought of as a summary of the noise-to-signal ratio, with respect to the signal coming from both direct and indirect effects of transmitted alleles. If we consider effects β and η as random, treating results obtained thus far as conditional on β and η, and further assume that effects are i.i.d. across sites (implying, in particular, that effect sizes and allele frequencies are independent),

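$$\begin{pmatrix}\beta_i\\ \eta_i\end{pmatrix} \sim \left(\begin{pmatrix}0\\ 0\end{pmatrix},\ \begin{pmatrix}\sigma_\beta^2 & \rho\,\sigma_\beta\sigma_\eta\\ \rho\,\sigma_\beta\sigma_\eta & \sigma_\eta^2\end{pmatrix}\right),$$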

the expectation of the numerator of Equation 21 is

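$$E_{\beta,\eta}\left[\sum_i^m \mathrm{Var}[x_i]\,\beta_i(\beta_i+\eta_i)\,\Big|\,\beta,\eta\right] = \sum_i^m \mathrm{Var}[x_i]\,E_{\beta_i,\eta_i}[\beta_i^2+\beta_i\eta_i] = \sum_i^m \mathrm{Var}[x_i]\,(\sigma_\beta^2+\rho\,\sigma_\beta\sigma_\eta)$$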

and thus Equation 21, in expectation, is:

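$$E[R^2_{sib}] \approx E_{\beta,\eta}\left[E[R^2_{sib}\,|\,\beta,\eta]\right] = \left(1+\rho\,\frac{\sigma_\eta}{\sigma_\beta}\right)^2 h_\beta^2\,\frac{1}{1+c/\alpha}, \tag{23}$$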

where

$$\alpha := h_\beta^2/\tau^2 = \frac{\sum_i^m \mathrm{Var}[x_i]\,\beta_i^2}{\sum_i^m \mathrm{Var}[x_i]\,(\beta_i+\eta_i)^2}.$$

We examined the fit of this prediction to simulated data. Specifically, we ran simulations to estimate effect sizes in a sib-GWAS and in a standard GWAS, after choosing n* to match their sampling variances. Finally, we used the polygenic scores to predict phenotypic values in a sample of unrelated individuals (see Section 1.3.3 for further detail).

Appendix 1—figure 9 A,C,D show the analytic result alongside simulation results, for different correlation coefficients between indirect and direct effect sizes. Even in the absence of a correlation between indirect and direct effect sizes, the polygenic score based on standard GWAS outperforms the polygenic score based on sib-GWAS.

To understand this behavior and the dependency of the $R^2_{sib}/R^2_{ur}$ ratio on other parameters, we divide Equation 23 by Equation 22 and obtain

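$$E\left[\frac{R^2_{sib}}{R^2_{ur}}\right] \approx \frac{E[R^2_{sib}]}{E[R^2_{ur}]} \approx \left(1+\rho\,\frac{\sigma_\eta}{\sigma_\beta}\right)^2 \alpha\,\frac{1+c}{1+c/\alpha}.$$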

Noting further that

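$$\left(1+\rho\,\frac{\sigma_\eta}{\sigma_\beta}\right)^2 \alpha = \left(\frac{\sigma_\beta+\rho\,\sigma_\eta}{\sigma_\beta}\right)^2 \frac{\sigma_\beta^2}{\sigma_\beta^2+2\rho\,\sigma_\beta\sigma_\eta+\sigma_\eta^2} = 1-(1-\rho^2)\,\frac{\tau_\eta^2}{\tau^2},$$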

we obtain

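$$E\left[\frac{R^2_{sib}}{R^2_{ur}}\right] \approx \left[1-(1-\rho^2)\,\frac{\tau_\eta^2}{\tau^2}\right]\frac{1+c}{1+c\,\frac{\tau^2}{h_\beta^2}}. \tag{24}$$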

A few conclusions emerge from Equation 24 and the accompanying simulations. First, the sib-GWAS based polygenic score will outperform the standard GWAS-based polygenic score only if direct and indirect effects are strongly negatively correlated (see Appendix 1—figure 9A-D for illustration). Second, the term

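$$\frac{1+c}{1+c\,\frac{\tau^2}{h_\beta^2}} = \frac{1+\frac{\sum_i^m \mathrm{Var}[\hat\beta_i]\,\mathrm{Var}[x_i]}{\tau^2}}{1+\frac{\sum_i^m \mathrm{Var}[\hat\beta_i]\,\mathrm{Var}[x_i]}{h_\beta^2}} \tag{25}$$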

can be interpreted as the dependence on the noise-to-signal ratio (where the signals are the proportions of phenotypic variance explained by direct and indirect effects of transmitted alleles). For a given sampling variance (matched across the two study designs), the extent of the signal will differ between standard GWAS and sib-GWAS. Importantly, the sampling variance influences the ratio of prediction accuracies. If indirect effects do not exist or make negligible contributions to the trait in question, then the ratio of prediction accuracies is expected to be close to one. In the presence of indirect effects, however, the magnitude of the deviation from one depends on the relationship between direct and indirect effects (and their covariance) as well as on the (matched) sampling variance. Simulations of several parameter combinations suggest that the overall effect of this dependence on the noise-to-signal ratio is a decrease in Rsib2/Rur2 as noise increases; as more SNPs are included in the polygenic scores, the advantage of the standard GWAS-based polygenic score over that of the sib-GWAS grows larger (Appendix 1—figure 9 E-H). These considerations inform the interpretation of patterns observed in Figure 3C–F of the main text.
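To make the behavior of Equations 22, 23 and 24 concrete, the following sketch evaluates them for given parameter values. It assumes i.i.d. random effects, so that $\tau_\eta^2/h_\beta^2 = \sigma_\eta^2/\sigma_\beta^2$; the function name `expected_r2` and the example values are illustrative choices made here, not quantities estimated from data.

```python
import math


def expected_r2(h_beta2, var_ratio, rho, c):
    """Evaluate Equations 22-24.

    h_beta2:   proportion of phenotypic variance explained by direct effects
    var_ratio: sigma_beta^2 / sigma_eta^2
    rho:       correlation between direct and indirect effects
    c:         noise-to-signal ratio of the standard GWAS
    """
    s = 1.0 / math.sqrt(var_ratio)  # sigma_eta / sigma_beta
    # Under i.i.d. effects, variance proportions scale like effect variances.
    tau2 = h_beta2 * (1.0 + 2.0 * rho * s + s * s)
    tau_eta2 = h_beta2 * s * s
    r2_ur = tau2 / (1.0 + c)                                              # Eq. 22
    r2_sib = (1.0 + rho * s) ** 2 * h_beta2 / (1.0 + c * tau2 / h_beta2)  # Eq. 23
    ratio = ((1.0 - (1.0 - rho ** 2) * tau_eta2 / tau2)
             * (1.0 + c) / (1.0 + c * tau2 / h_beta2))                    # Eq. 24
    return r2_ur, r2_sib, ratio


# Uncorrelated effects: the standard GWAS-based score does better (ratio < 1).
r2_ur, r2_sib, ratio = expected_r2(h_beta2=0.2, var_ratio=5.0, rho=0.0, c=1.0)
# Strongly negatively correlated effects: the sib-based score does better.
_, _, ratio_neg = expected_r2(h_beta2=0.2, var_ratio=5.0, rho=-0.9, c=1.0)
```

The ratio returned from Equation 24 coincides with the quotient of the two expected accuracies, which provides a quick internal consistency check on the algebra.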

1.3.3 Simulations of indirect effects

For each set of simulated individuals (discovery, estimation and prediction sets), we first simulated mother-father pairs, assigning parental alleles from Bernoulli($p_i$), where $p_i$ denotes the allele frequency at site $i$. We then sampled the parental alleles at random to generate offspring (one offspring per mother-father pair to simulate a sample of unrelated individuals and two offspring to generate sibling pairs). Phenotypes of the offspring were assigned under an additive model, sampling from a Normal distribution with mean

$$\sum_i^m \left[\beta_i x_i+\eta_i\,(x_i^p+x_i^m)\right]$$

(where $x_i^m$ and $x_i^p$ are the maternal and paternal effect allele counts, respectively) and variance $\sigma_e^2$, representing the total variance of environmental effects. When there is no correlation between direct and indirect effects, $\sigma_e^2=1-h_\beta^2-2\tau_\eta^2$. Using this approach, we generated a set of sibling pairs and estimated SNP effect sizes from these simulated data using a sib-GWAS. We calculated $n^*$ as follows: we simulated sets of unrelated individuals with a range of sample sizes. In each set, we performed a simple linear regression of the phenotypic values on the genotypes. We then estimated a linear relationship between the inverse of the median standard error of effect size estimates (as a dependent variable) and the square root of the sample size. Using this linear relationship, we predicted the sample size for the unrelated set that gives a median standard error equal to the median standard error of sib-GWAS effect size estimates ($n^*$). Finally, we simulated a set of unrelated individuals with sample size $n^*$ and compared the prediction accuracy ($R^2$) of the polygenic score based on standard GWAS on this sample with the one obtained from sib-GWAS.
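The calibration of $n^*$ described above amounts to a one-variable linear fit followed by an inversion. A minimal sketch is given below; the function name `calibrate_n_star` is hypothetical, and the example inputs are idealized (standard errors that scale exactly as one over the square root of the sample size), not values from the simulations.

```python
import math


def calibrate_n_star(sample_sizes, median_ses, target_se):
    """Predict the sample size whose median standard error equals target_se.

    Fits 1 / median_se = intercept + slope * sqrt(n) by least squares,
    then inverts the fitted line at 1 / target_se.
    """
    xs = [math.sqrt(n) for n in sample_sizes]
    ys = [1.0 / se for se in median_ses]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    sqrt_n_star = (1.0 / target_se - intercept) / slope
    return sqrt_n_star ** 2


# Idealized example: median SE = k / sqrt(n) for an arbitrary constant k.
k = 1.5
sizes = [2000, 5000, 10000, 20000]
ses = [k / math.sqrt(n) for n in sizes]
sib_median_se = k / math.sqrt(7000)  # stand-in for the sib-GWAS median SE
n_star = calibrate_n_star(sizes, ses, sib_median_se)
```

In this idealized case the fit is exact and the recovered sample size is 7000 up to floating-point error; with simulated standard errors the fit would only be approximate.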

We additionally investigated the effect of the number of SNPs included in the polygenic scores. For this analysis, we sorted the SNPs based on the association p-value obtained in an independent simulated set of unrelated individuals.

In these simulations, we used the following parameter values:

  • The ratio of the phenotypic variance accounted for by direct effects versus by indirect effects ($h_\beta^2/\tau_\eta^2$): 5

  • The phenotypic variance explained by offspring and parental alleles, given no correlation between direct and indirect effects ($h_\beta^2+2\tau_\eta^2$): 0.25 or 0.5

  • The ratio of the variance of direct effects to the variance of indirect effects ($\sigma_\beta^2/\sigma_\eta^2$): 5

  • Allele frequencies, p, drawn from a truncated exponential distribution, truncated on the left such that the minimum allele frequency is 1%.

  • The number of loci, assumed independent (i.e., in linkage equilibrium): 100 (all causal), or 10,000 (all causal) or 10,000 (20% causal)

  • SNP effect sizes drawn as

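    • $$\begin{pmatrix}\beta_i\\ \eta_i\end{pmatrix} \sim N\left(\begin{pmatrix}0\\ 0\end{pmatrix},\ \begin{pmatrix}\sigma_\beta^2 & \rho\,\sigma_\beta\sigma_\eta\\ \rho\,\sigma_\beta\sigma_\eta & \sigma_\eta^2\end{pmatrix}\right),$$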

where $\rho$ is the correlation between direct and indirect effect sizes. Effect sizes were then re-scaled to satisfy $\sum_i^m 2\beta_i^2 p_i(1-p_i)=h_\beta^2$ and $\sum_i^m 2\eta_i^2 p_i(1-p_i)=\tau_\eta^2$. Effects were set to 0 for non-causal loci.

  • The number of sibling pairs for sib-GWAS: 10,000

  • The number of unrelated individuals for prediction: 10,000

  • The number of unrelated individuals for discovery GWAS (i.e., to decide which SNPs to include): 20,000

  • Number of iterations used to estimate n* and R2 for a given set of parameters: 10
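The effect-size draw and re-scaling listed above can be sketched as follows. This is an illustration under assumptions rather than the simulation code itself: the scale of the exponential distribution (0.1), the cap of allele frequencies at 0.5, the choice $\rho=0.5$ and the use of the all-causal setting are choices made here for the example, since the parameter list does not specify them.

```python
import numpy as np

rng = np.random.default_rng(0)
m, rho = 10_000, 0.5                 # number of loci; rho chosen for illustration
tau_eta2 = 0.25 / 7.0                # from h_beta^2 + 2 tau_eta^2 = 0.25 ...
h_beta2 = 5.0 * tau_eta2             # ... and h_beta^2 / tau_eta^2 = 5
sigma_beta, sigma_eta = np.sqrt(5.0), 1.0  # variance ratio 5; scale is arbitrary

# Correlated direct and indirect effects from a bivariate Normal.
cov = np.array([[sigma_beta ** 2, rho * sigma_beta * sigma_eta],
                [rho * sigma_beta * sigma_eta, sigma_eta ** 2]])
effects = rng.multivariate_normal(np.zeros(2), cov, size=m)
beta, eta = effects[:, 0].copy(), effects[:, 1].copy()

# Allele frequencies: exponential, left-truncated at 1% (assumed scale 0.1,
# assumed cap at 0.5). For an exponential, left truncation equals a shift.
p = np.minimum(0.01 + rng.exponential(scale=0.1, size=m), 0.5)
het = 2.0 * p * (1.0 - p)

# Re-scale so that the variance proportions match their targets.
beta *= np.sqrt(h_beta2 / np.sum(het * beta ** 2))
eta *= np.sqrt(tau_eta2 / np.sum(het * eta ** 2))
```

Because each vector is multiplied by a single positive constant, the re-scaling fixes the two variance proportions without changing the correlation between direct and indirect effects.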

1.4 Assortative mating

We consider assortative mating with regard to a phenotype, whereby the parents of individuals were more likely to mate if they were similar with respect to that phenotype. This process generates a correlation between genetic variants that contribute to the phenotype (i.e., linkage disequilibrium). Consequently, in a standard GWAS, the effect sizes of causal SNPs will partially capture the effect of other causal SNPs as well. Estimated effect sizes are thus expected to be inflated under positive assortative mating (mating of similar individuals) and deflated under negative assortative mating (mating of dissimilar individuals). In turn, in a sib-GWAS, the estimates are in expectation unaffected by assortative mating, because genetic differences between siblings arise from random Mendelian segregation in the parents.

1.4.1 Simulations of assortative mating

We used simulations to examine the phenotypic prediction accuracies of polygenic scores based on sib- and standard GWAS under a model with assortative mating (assuming no indirect effects or population stratification beyond assortative mating); to this end, we considered a sample of unrelated individuals, varying the degree of correlation between parental phenotypes ρa. Similar to our simulations for indirect effects (Section 1.3.3), we first simulated the estimation procedure in a sibling-based and in a standard GWAS (with sample size n*). We then computed the prediction accuracy R2 in an independent sample of unrelated individuals (see ‘Further simulation details’ below).

We first considered the simple case of a single generation of assortative mating. In the presence of positive assortative mating (ρa>0), polygenic scores based on standard GWAS outperform those based on sib-GWAS, whereas the opposite is true in the case of negative assortative mating (ρa<0) (Appendix 1—figure 10 A). In simulations of two generations of assortative mating, the gap between the prediction accuracies of scores based on standard and sib-GWAS (Appendix 1—figure 10 B) widens, suggesting that our qualitative findings apply to scenarios of sustained assortative mating as well.

We further investigated prediction accuracy as a function of the number of SNPs included in the polygenic scores, by progressively increasing the p-value threshold, using p-values obtained from an independent GWAS in unrelated samples (similar to our analysis in Figure 3). We considered two genetic architecture scenarios: (i) the case in which all SNPs are causal and (ii) the case in which 20% of SNPs are causal (leading polygenic scores to include non-causal SNPs). Under both scenarios, the gap in prediction accuracies between standard and sib-GWAS grows with the number of SNPs (Appendix 1—figure 10 C-F).

Further simulation details

We simulated parental and offspring alleles as described for indirect effects in Section 1.3.3. To mimic assortative mating between parents, we first simulated i.i.d. genotypes (with effect allele counts $x_i$ at each SNP $i$) and randomly assigned ‘mother’ and ‘father’ labels to each individual. We then generated corresponding parental phenotypes under an additive model as

$$N\left(\sum_i^m \beta_i x_i,\ 1-h^2\right)$$

where $\beta_i$ is the effect size of SNP $i$, and $h^2$ is the heritability. The same model was used to generate offspring phenotypes.

To mimic the assortative mating process, we induced a given correlation between parental phenotypes, $\rho_a$, by pairing mothers and fathers as follows: we first generated a random matrix

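$$\begin{pmatrix}u_{m,i}\\ u_{p,i}\end{pmatrix} \sim N\left(\begin{pmatrix}\bar{y}_m\\ \bar{y}_p\end{pmatrix},\ \begin{pmatrix}\sigma_{y_m}^2 & \rho_a\,\sigma_{y_m}\sigma_{y_p}\\ \rho_a\,\sigma_{y_m}\sigma_{y_p} & \sigma_{y_p}^2\end{pmatrix}\right),$$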

where $\bar{y}_m$ and $\bar{y}_p$ are the average phenotypes of mothers and fathers, respectively, and $\sigma_{y_m}$ and $\sigma_{y_p}$ are the standard deviations of the phenotypes of mothers and fathers, respectively. We then sorted the sets of mothers and fathers such that the ranks of values in $y_m$ and $y_p$ match the ranks of values in $u_m$ and $u_p$, respectively, to obtain $\mathrm{cor}(y_m,y_p) \approx \mathrm{cor}(u_m,u_p) = \rho_a$. In the case of two generations of assortative mating, we simulated the generation of the grandparents similarly. We compared the performance of polygenic scores based on standard and sib-GWAS as described in Section 1.3.3. In the simulations, we used the following parameter values:

  • Heritability under random mating (h2): 0.5

  • The number of loci, assumed independent (i.e., in linkage equilibrium) under random mating: 10,000 (all causal) or 10,000 (20% causal)

  • Allele frequencies, p, drawn from a truncated exponential distribution, truncated on the left such that the minimum allele frequency is 1%.

  • SNP effect sizes set to 0 for non-causal loci and drawn as $\beta_i \sim N(0,\sigma^2)$, choosing $\sigma^2$ to satisfy $\sum_i^m 2\beta_i^2 p_i(1-p_i)=h^2$ for causal loci.

  • The number of sibling pairs for sib-GWAS: 10,000

  • The number of unrelated individuals for prediction: 10,000

  • The number of unrelated individuals for discovery GWAS (i.e., to decide which SNPs to include in the polygenic score): 20,000

  • The number of iterations used to estimate n* and R2 for a given set of parameters: 10
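The rank-matching step used to induce a correlation $\rho_a$ between parental phenotypes can be sketched as below. The function name `pair_by_rank`, the simulated phenotypes and the value $\rho_a=0.4$ are illustrative assumptions; the sketch also assumes equal numbers of mothers and fathers.

```python
import numpy as np


def pair_by_rank(y_m, y_p, rho_a, rng):
    """Return index arrays (mothers, fathers) defining assortatively mated couples.

    Couple k is (mothers[k], fathers[k]). Ranks of the paired phenotypes
    follow the ranks of a bivariate Normal draw with correlation rho_a.
    """
    s_m, s_p = y_m.std(), y_p.std()
    cov = np.array([[s_m ** 2, rho_a * s_m * s_p],
                    [rho_a * s_m * s_p, s_p ** 2]])
    u = rng.multivariate_normal([y_m.mean(), y_p.mean()], cov, size=len(y_m))
    # Rank of each entry of u within its column (0 = smallest).
    rank_um = np.argsort(np.argsort(u[:, 0]))
    rank_up = np.argsort(np.argsort(u[:, 1]))
    # Pick, for couple k, the parent whose phenotype has the matching rank.
    mothers = np.argsort(y_m)[rank_um]
    fathers = np.argsort(y_p)[rank_up]
    return mothers, fathers


rng = np.random.default_rng(0)
n_couples, rho_a = 5000, 0.4
y_m = rng.normal(size=n_couples)  # stand-in maternal phenotypes
y_p = rng.normal(size=n_couples)  # stand-in paternal phenotypes
mothers, fathers = pair_by_rank(y_m, y_p, rho_a, rng)
achieved = np.corrcoef(y_m[mothers], y_p[fathers])[0, 1]
```

Each parent is used exactly once, and the achieved phenotypic correlation between mates is close to, though not exactly equal to, the target $\rho_a$.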

Appendix 1—figure 1
Variable prediction accuracy within an ancestry group.

This figure extends Figure 1 of the main text, showing prediction accuracies based on large-scale diverse GWAS that are the union of all strata matching the number of individuals in each stratum. The numbers in parentheses show GWAS sample sizes (see Materials and methods for details). Each box and whiskers plot was computed based on 20 iterations of resampling estimation and prediction sets. Thick horizontal lines denote the medians. The polygenic scores were estimated in samples of unrelated WB individuals. Phenotypes were then predicted in distinct samples of unrelated WB individuals, stratified by sex (A), age (B) or Townsend deprivation index, a measure of SES (C). In red and green cases, polygenic scores are based on a GWAS in a sample limited to one sex, age or SES group (a 'stratum’). In black, polygenic scores are based on a diverse GWAS in a pooled sample of all strata. In blue, polygenic scores are based on a diverse GWAS in a pooled sample of all strata but downsampled to match the size of the stratified GWAS.

Appendix 1—figure 2
Variable prediction accuracy (measured as R2) within an ancestry group.

This figure mirrors Figure 1 of the main text, except for the y-axis showing R2 values (squared correlation between polygenic score and phenotype residualized on covariates), rather than incremental R2. Each box and whiskers plot was computed based on 20 iterations of resampling estimation and prediction sets. Thick horizontal lines denote the medians. The polygenic scores were estimated in samples of unrelated WB individuals. Phenotypes were then predicted in distinct samples of unrelated WB individuals, stratified by sex (A), age (B) or Townsend deprivation index, a measure of SES (C). In red and green cases, polygenic scores are based on a GWAS in a sample limited to one sex, age or SES group (a 'stratum’). In blue, polygenic scores are based on a GWAS in a diverse sample of all strata downsampled to match the size of the stratified GWAS.

Appendix 1—figure 3
Dependence on the polygenic score model.

This figure extends Figure 1 of the main text, showing the prediction accuracies as a function of the p-value threshold for inclusion of a SNP in the polygenic score when based on a pruning and thresholding approach. The higher the p-value threshold is, the more SNPs are included. Last points on the x-axis correspond to a polygenic score model based on the LDpred approach (Vilhjálmsson et al., 2015) with a prior probability of 1 on loci being causal. Shown are incremental $R^2$ values in different prediction sets. Points and error bars are mean and central 80% range computed based on 20 iterations of resampling estimation and prediction sets. (A–C) The polygenic scores were estimated in samples of unrelated WB individuals. Phenotypes were then predicted in distinct samples of unrelated WB individuals, stratified by sex (A), age (B) or Townsend deprivation index, a measure of SES (C). (D–I) Same as in A-C, but here the polygenic scores are based on a GWAS in a sample limited to one sex, age or SES group. The trends shown in Figure 1 of the main text are for a p-value threshold of $10^{-4}$, and are qualitatively similar to the trends for other choices of the polygenic score model. For each trait, sample sizes are matched across all GWAS sets.

Appendix 1—figure 4
Estimating mean effect size across strata.

SNPs were ascertained in large samples of unrelated WB individuals. The effects of trait-increasing alleles were then re-estimated in an independent set of unrelated WB individuals (that were excluded from the original GWAS) stratified by sex for diastolic blood pressure (A), by age for BMI (B) and by Townsend deprivation index, a measure of SES for years of schooling (C). Points and error bars are mean and central 80% range computed based on 20 iterations of resampling ascertainment and estimation sets, plotted as a function of the p-value threshold (for p-values obtained in the discovery GWAS).

Appendix 1—figure 5
Variable prediction accuracy within an ancestry group is also seen using a linear mixed model.

This figure mirrors the last two columns in Appendix 1—figure 3, except that here, the GWAS estimates were obtained from a linear mixed model (LMM) (Loh et al., 2015). Shown are the prediction accuracies, measured as incremental R2, as a function of the p-value threshold for inclusion of a SNP in the polygenic score. Points and error bars are mean and central 80% range computed based on 20 iterations of resampling estimation and prediction sets. The polygenic scores are based on a GWAS in a sample limited to one sex, age or SES group. Phenotypes are then predicted in distinct samples of unrelated individuals, stratified by sex (A,B), age (C,D) or Townsend deprivation index, as a measure of SES (E,F). The qualitative trends are similar to those in Appendix 1—figure 3, which uses a standard linear regression with PCs (principal components of the genotype data) as a control for population structure when testing for an association between the phenotypes and genotypes. The similarity suggests that the observed differences in prediction accuracies across strata are not driven to a large degree by population structure confounding.

Appendix 1—figure 6
Comparison of siblings and unrelated individuals in the UK Biobank with respect to age, SES, and sex ratio.

Panels show the distribution of Townsend deprivation index, a measure of SES (A), the age distribution (B), and the proportion of males (C) for the siblings and unrelated sets used in the analysis described for Figure 3 of the main text. For each sibling pair, one sibling was randomly selected for these comparisons. The asterisk symbol marks a significant difference at the 1% level between siblings and unrelated individuals, as assessed by a Mann-Whitney test. SES and age distributions are quite similar in siblings and unrelated sets, whereas the proportion of males is significantly smaller in the siblings.

Appendix 1—figure 7
Comparison of siblings and unrelated individuals in the UK Biobank with respect to population structure.

Panels show the distribution of PCs (principal components of the genotype data) for the siblings and unrelated sets used in the analysis described for Figure 3 of the main text. For each sibling pair, one sibling was randomly selected for these comparisons. The asterisk symbol marks a significant difference at the 1% level between siblings and unrelated individuals, as assessed by a Mann-Whitney test. Despite slight but significant differences, siblings and unrelated sets are broadly similar with respect to their genetic ancestries.

Appendix 1—figure 8
Comparison of prediction accuracies of polygenic scores based on standard and sib-GWAS for simulated traits.

This figure mirrors Figure 3B of the main text, but here plotted for 12 simulated traits. The numbers in parentheses are the heritability, the number of causal loci considered, and the simulation replicate number, respectively. Three traits were simulated for each pair of heritability and number of causal loci parameters (see Materials and methods for simulation details). Small points show the ratio of the prediction accuracies in the two designs across 30 iterations; in each iteration, we resample sets of unrelated individuals to constitute three sets for discovery, estimation and prediction. Larger points show median values.

Appendix 1—figure 9
Simulation results for polygenic scores based on standard GWAS and sib-GWAS in the presence of indirect effects.

(A,B) Simulation results as a function of the correlation between direct and indirect effects, $\rho$. Simulations were performed with $h_\beta^2 = 0.5$, $\tau_\eta^2 = 0.1$, and $\sigma_\beta^2/\sigma_\eta^2 = 5$. The size of the estimation set in the sib-GWAS is 10,000, and the size of the estimation set in the standard GWAS is chosen to match sampling variances between the two study designs. The polygenic score is based on 10,000 causal loci; its performance was evaluated in an independent set of 10,000 unrelated individuals. As long as direct and indirect effects are not strongly negatively correlated, the out-of-sample prediction accuracy is higher for the polygenic scores based on standard GWAS. (C) Same as (A) but with three-fold greater environmental noise. (D) Same as (A) but with 100 causal loci. In (A–D) points are mean ± 2 SD in 10 simulation iterations. Solid lines are values based on analytic expressions derived in Section 1.3.2. (E–H) Simulation results, with the same parameters as in (A) but $\rho = 0.5$, as a function of the number of SNPs included in the polygenic scores, with all loci being causal (E,F), or with 20% of loci being causal (G,H). SNPs are added in increasing order of their association p-value in an independent set of 20,000 unrelated individuals. In both cases, the ratio of prediction accuracies of polygenic scores based on sib- versus standard GWAS becomes smaller with the inclusion of more weakly associated SNPs, a behavior qualitatively similar to observations in Figure 3 in the main text. Points are mean ± 2 SD in 10 simulations. See Section 1.3.3 for simulation details.

Appendix 1—figure 10
Simulation results for polygenic scores based on standard GWAS and sib-GWAS in the presence of assortative mating.

(A) Simulation results as a function of the approximate correlation between parental phenotypes, $\rho_a$. Simulations were performed with $h^2 = 0.5$ under random mating. The size of the estimation set in the sib-GWAS is 10,000, and the size of the estimation set in the standard GWAS is chosen to match sampling variances between the two study designs. The polygenic score is based on 10,000 causal loci; its performance was evaluated in an independent set of 10,000 unrelated individuals. Standard-GWAS-based polygenic scores outperform (underperform) sib-GWAS-based polygenic scores under positive (negative) assortative mating. (B) Ratio of prediction accuracies of the polygenic scores based on sib- versus standard GWAS, as a function of $\rho_a$, for two sets of simulations with one or two generations of assortative mating, with the same parameters as in (A). (C–F) Simulation results, with the same parameters as in (A) but $\rho_a = 0.5$, as a function of the number of SNPs included in the polygenic score, with all loci being causal (C,D), or with 20% of loci being causal (E,F). SNPs are added in increasing order of their association p-value in an independent set of 20,000 unrelated individuals. In both cases, the ratio of prediction accuracies for scores based on sib-GWAS versus standard GWAS becomes smaller with the inclusion of more weakly associated SNPs, a behavior that is qualitatively similar to observations in Figure 3 in the main text. Points are mean ± 2 SD in 10 simulation iterations. See Section 1.4.1 for simulation details.

Appendix 1—figure 11
Comparison of prediction accuracies of polygenic scores based on standard and sib-GWAS matched for sex ratio.

This figure mirrors Figure 3B of the main text, but here the samples of siblings and unrelated individuals used in the analysis are matched for their sex ratios. Results are shown for diastolic blood pressure, as the prediction accuracy differed between sexes (Figure 1); the related phenotype of pulse rate; and a subset of the traits for which the prediction accuracy varied by GWAS design (Figure 3B). Small points show the ratio of the prediction accuracies in the two designs across 10 iterations; in each iteration, we resample sets of unrelated individuals to constitute three sets for discovery, estimation and prediction. Larger points show median values. We note that pulse rate is now similarly predicted by the two GWAS approaches, suggesting that the slightly higher prediction accuracy of the sib-GWAS shown in Figure 3B of the main text may be due to the sex ratio difference; for other traits, results are qualitatively unchanged.

Appendix 1—figure 12
Prediction accuracy of polygenic scores based on sib- and standard GWAS, for a range of traits.

This figure complements Figure 3C–F of the main text, showing the results of the study design depicted in Figure 3A for all traits presented in Figure 3. As described for Figure 3, we randomly divided unrelated individuals to constitute three non-overlapping sets for discovery, estimation and prediction. Small points correspond to 10 iterations of resampling these three sets. The prediction accuracy is plotted as a function of the p-value threshold, where p-values come from the discovery GWAS. Lines show median values.

Appendix 1—figure 13
Prediction accuracy for years of schooling, for individuals with 0 or 1 full sibling.

(A) The y-axis shows the prediction accuracy, measured as incremental R², in prediction sets stratified by participants' number of siblings, using a polygenic score for years of schooling based on a GWAS performed using individuals who reported having exactly 1 sibling. The x-axis shows the p-value threshold for inclusion of a SNP in the polygenic score when based on a pruning and thresholding approach. The last points on the x-axis correspond to a polygenic score model based on the LDpred approach (Vilhjálmsson et al., 2015) with a prior probability of 1 on loci being causal. Points are values based on 10 iterations of resampling estimation and prediction sets. Thick horizontal lines denote the mean values. (B–E) Comparison of the distribution of Townsend deprivation index (B), the age distribution (C), the proportion of males (D), and mean years of schooling (± 2 SD) (E) between individuals who reported having no sibling and those who reported having 1 sibling. The two sets have somewhat different distributions of ages (or possibly come from somewhat different birth cohorts), a feature that could contribute to the patterns seen in panel A, but are otherwise similar with respect to the other features considered.

Appendix 1—figure 14
Variable prediction accuracy for binary traits, when measured as incremental AUC.

This figure is analogous to Figure 1 of the main text, but considers dichotomized versions of the traits presented in Figure 1 in the prediction sets, with the y-axis showing incremental AUC values rather than incremental R². The polygenic scores are based on GWAS using the quantitative trait values as in Figure 1. The traits are (A) diastolic blood pressure of over 110 mmHg, (B) BMI of over 35 kg/m², and (C) completing a college or a university degree. Each box-and-whisker plot was computed based on 20 iterations of resampling estimation and prediction sets. Thick horizontal lines denote the medians.
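For readers unfamiliar with the measure, the sketch below shows one way an incremental AUC could be computed for a dichotomized trait: the AUC of a logistic model with covariates plus the polygenic score, minus the AUC of a covariates-only model. This is an illustrative assumption about the computation, not the code used for this figure; the helper names, the Newton-method logistic fit, the simulated data and the 90th-percentile cutoff are all hypothetical.

```python
import numpy as np


def auc(labels, scores):
    """Rank-based (Mann-Whitney) AUC; tied scores receive average ranks."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores, kind="mergesort")
    sorted_scores = scores[order]
    ranks = np.empty(len(scores))
    i = 0
    while i < len(scores):
        j = i
        while j + 1 < len(scores) and sorted_scores[j + 1] == sorted_scores[i]:
            j += 1
        ranks[order[i:j + 1]] = 0.5 * (i + j) + 1.0  # average 1-based rank
        i = j + 1
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    return (ranks[labels].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def logistic_linear_predictor(y, X, n_iter=25):
    """Fit a logistic regression by Newton's method; return the fitted linear predictor."""
    y = np.asarray(y, dtype=float)
    X1 = np.column_stack([np.ones(len(y)), X])
    w = np.zeros(X1.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X1 @ w)))
        hessian = (X1 * (p * (1.0 - p))[:, None]).T @ X1
        hessian += 1e-8 * np.eye(X1.shape[1])  # tiny ridge for numerical stability
        w += np.linalg.solve(hessian, X1.T @ (y - p))
    return X1 @ w


def incremental_auc(case, covariates, score):
    """AUC gained by adding the polygenic score to a covariates-only model."""
    full = np.column_stack([covariates, score])
    return (auc(case, logistic_linear_predictor(case, full))
            - auc(case, logistic_linear_predictor(case, covariates)))


# Toy data: a quantitative trait dichotomized at a hypothetical cutoff.
rng = np.random.default_rng(0)
n = 2000
score = rng.normal(size=n)
covariates = rng.normal(size=(n, 1))
trait = 0.5 * score + 0.3 * covariates[:, 0] + rng.normal(size=n)
case = trait > np.quantile(trait, 0.9)
inc_auc = incremental_auc(case, covariates, score)
```

Unlike incremental R² in nested least-squares models, an incremental AUC is not guaranteed to be non-negative, since logistic regression does not directly maximize the AUC.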

Appendix 1—figure 15
Variable prediction accuracy for binary disease phenotypes, measured as incremental AUC, in men versus women.

This figure is analogous to Figure 1 of the main text, but looks at disease traits, with the y-axis showing incremental AUC rather than incremental R². Each box-and-whisker plot was computed based on 20 iterations of resampling estimation and prediction sets. Thick horizontal lines denote the medians. The variable prediction accuracy of polygenic scores based on GWAS in men only versus women only could be driven in part by the differences in ratios of cases to controls (and hence by differences in the precision of the effect size estimates). However, we also observe that the prediction accuracy can vary depending on the sex composition of the prediction set (e.g., for cardiovascular outcomes), an observation that cannot be attributed to differences in case:control ratios of the GWAS.

Appendix 1—figure 16
Comparison of prediction accuracies of polygenic scores (measured as R²) based on standard and sib-GWAS.

This figure mirrors Figure 3B of the main text, but here we first residualized the phenotypes on covariates, and then ran the same pipeline as that used to generate Figure 3B on the residuals, without further adjustment for covariates in the GWAS or prediction evaluation. Thus, this figure relates more directly to the analytical derivation in Section 1.2. However, the results in Figure 3B and here are qualitatively similar. Small points show the ratio of the prediction accuracies in the two designs across 10 iterations; in each iteration, we resample sets of unrelated individuals to constitute three sets for discovery, estimation and prediction. Larger points show median values.

Appendix 1—table 1
UK Biobank phenotype data used in this study and their corresponding data fields.

In parentheses are the units of measurements.

Trait | Description | UKB data field
Age | Age when attended assessment center (years) | 21003
Age at first sex | Self-reported age at first sexual intercourse (years) | 2139
Alcohol intake frequency | Self-reported category, encoded as an integer: 1, 'Daily or almost daily'; 2, 'Three or four times a week'; 3, 'Once or twice a week'; 4, 'One to three times a month'; 5, 'Special occasions only'; 6, 'Never' | 1558
Basal metabolic rate | Estimated from body composition impedance measurements (kJ) | 23105
Birth weight | Self-reported birth weight (kg) | 20022
Body mass index | Constructed from height and weight measurements (kg/m²) | 21001
Diastolic blood pressure | Measured using automated devices (mmHg); values are adjusted for medicine use (see Materials and methods) | 4079, 6153, 6177
Fluid intelligence | Unweighted sum of the number of correct answers given to 13 fluid intelligence questions | 20016
Forced vital capacity | Calculated from breath spirometry (liters) | 3062
Hair color | Self-reported category, encoded as an integer: 1, 'Blonde'; 2, 'Red'; 3, 'Light brown'; 4, 'Dark brown'; 5, 'Black'; none, 'Other' | 1747
Hand grip strength | Measured right and left hand isometric grip strength (kg) | 46, 47
Height | Measured standing height (cm) | 50
Hip circumference | Measured hip circumference (cm) | 49
Hospital inpatient diagnoses | Diagnoses made during hospital inpatient admissions, coded according to the International Classification of Diseases (ICD-9 and ICD-10) | 41202, 41203, 41204, 41205, 41270, 41271
Household income | Self-reported average total annual household income before tax category, encoded as an integer: 1, 'Less than £18,000'; 2, '£18,000 to £30,999'; 3, '£31,000 to £51,999'; 4, '£52,000 to £100,000'; 5, 'Greater than £100,000' | 738
Myocardial infarction outcomes | Algorithmically-defined myocardial infarction outcomes obtained through combinations of UK Biobank's assessment data collection (e.g., self-reported conditions), and data from hospital admissions | 42000, 42001
Neuroticism score | Derived summary score, based on participants' responses to 12 neurotic behaviour-related questions | 20127
Number of full siblings | Sum of self-reported number of full brothers and full sisters | 1873, 1883
Overall health rating | Self-reported category, encoded as an integer: 1, 'Excellent'; 2, 'Good'; 3, 'Fair'; 4, 'Poor' | 2178
Pack years of smoking | Calculated for individuals who have ever smoked as the number of cigarettes smoked per day, divided by twenty, multiplied by the number of years of smoking (years) | 20161
Pulse rate | Measured during the automated blood pressure readings (bpm) | 102
Qualifications | Self-reported educational or professional qualifications, selected from: 'College or University degree', 'NVQ or HND or HNC or equivalent', 'Other professional qualifications eg: nursing, teaching', 'A levels/AS levels or equivalent', 'O levels/GCSEs or equivalent', 'CSEs or equivalent', or 'None of the above' | 6138
Sex | Self-reported sex and as determined from genotyping analysis | 31, 22001
Skin color | Self-reported category, encoded as an integer: 1, 'Very fair'; 2, 'Fair'; 3, 'Light olive'; 4, 'Dark olive'; 5, 'Brown'; 6, 'Black' | 1717
Townsend deprivation index | Townsend deprivation index at recruitment | 189
Vascular/heart problems | Self-reported vascular/heart problems diagnosed by doctor, selected from the categories: 'Heart attack', 'Angina', 'Stroke', 'High blood pressure', and 'None of the above' | 6150
Waist circumference | Measured waist circumference (cm) | 48
Appendix 1—table 2
Genetic correlations across samples that vary by a study characteristic.

Numbers are genetic correlations estimated using LD score regression for BMI, years of schooling and diastolic blood pressure, across samples stratified by age, Townsend deprivation index (a measure of socioeconomic status, SES), and sex, respectively. 'Q' denotes a quartile of age or SES.

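Trait/characteristic | Pair of strata | Genetic correlation (s.e.)
BMI/Age | (Q1,Q2) | 0.93 (0.036)
BMI/Age | (Q1,Q3) | 0.95 (0.035)
BMI/Age | (Q1,Q4) | 0.95 (0.038)
BMI/Age | (Q2,Q3) | 0.89 (0.032)
BMI/Age | (Q2,Q4) | 0.91 (0.036)
BMI/Age | (Q3,Q4) | 1.00 (0.040)
Years of schooling/SES | (Q1,Q2) | 0.98 (0.054)
Years of schooling/SES | (Q1,Q3) | 1.00 (0.067)
Years of schooling/SES | (Q1,Q4) | 0.93 (0.068)
Years of schooling/SES | (Q2,Q3) | 0.97 (0.064)
Years of schooling/SES | (Q2,Q4) | 1.09 (0.074)
Years of schooling/SES | (Q3,Q4) | 1.04 (0.074)
Diastolic blood pressure/Sex | (male,female) | 0.93 (0.031)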
Appendix 1—table 3
Sample sizes used for siblings and unrelated sets.

As described in Figure 3A, for the comparison of prediction accuracies of polygenic scores based on standard and sib-GWAS, we first ascertain SNPs in a large sample of unrelated individuals ('Unrelated-discovery') and then estimate the effect of these SNPs with a standard regression using unrelated individuals ('Unrelated-n*') and, independently, using sib-regression (in the 'Siblings' set). Finally, we use the polygenic scores for prediction in a third sample of unrelated individuals ('Unrelated-prediction'). This table shows the sample sizes used for each set across the traits analyzed. For simulated traits, the numbers in parentheses are the heritability, the number of causal loci, and the simulation replicate, respectively (three traits were simulated for each pair of heritability and number of causal loci parameters; see Materials and methods for simulation details).

Trait | Siblings (pairs) | Unrelated-n* | Unrelated-discovery | Unrelated-prediction
Age at first sex | 13675 | 8746 | 244988 | 27220
Alcohol intake frequency | 17282 | 10923 | 276885 | 30764
Basal metabolic rate | 16802 | 13467 | 269750 | 29972
Birth weight | 6750 | 5766 | 159074 | 17674
BMI | 17217 | 12359 | 274868 | 30540
Diastolic blood pressure | 14791 | 9514 | 253227 | 28136
Fluid intelligence | 3889 | 2979 | 101016 | 11223
Forced vital capacity | 14605 | 10009 | 252576 | 28064
Hair color | 16859 | 11763 | 272209 | 30245
Hand grip strength | 17070 | 10832 | 275117 | 30568
Height | 17242 | 18147 | 269973 | 29997
Hip circumference | 17254 | 11648 | 275930 | 30658
Household income | 13240 | 8704 | 239326 | 26591
Neuroticism score | 11756 | 6909 | 227010 | 25223
Overall health rating | 17189 | 10378 | 276581 | 30731
Pack years of smoking | 2307 | 1682 | 85544 | 9504
Pulse rate | 14791 | 8812 | 253859 | 28206
Skin color | 16903 | 10334 | 274159 | 30462
Waist circumference | 17257 | 11749 | 275873 | 30652
Years of schooling | 17037 | 11885 | 273553 | 30394
Simulated trait (0.5,10K,1) | 17299 | 11685 | 276404 | 30711
Simulated trait (0.5,10K,2) | 17299 | 11505 | 276566 | 30729
Simulated trait (0.5,10K,3) | 17299 | 11422 | 276641 | 30737
Simulated trait (0.5,100K,1) | 17299 | 11814 | 276288 | 30698
Simulated trait (0.5,100K,2) | 17299 | 11833 | 276271 | 30696
Simulated trait (0.5,100K,3) | 17299 | 11490 | 276579 | 30731
Simulated trait (0.1,10K,1) | 17299 | 9072 | 278756 | 30972
Simulated trait (0.1,10K,2) | 17299 | 9158 | 278678 | 30964
Simulated trait (0.1,10K,3) | 17299 | 9111 | 278721 | 30968
Simulated trait (0.1,100K,1) | 17299 | 9133 | 278701 | 30966
Simulated trait (0.1,100K,2) | 17299 | 9069 | 278758 | 30973
Simulated trait (0.1,100K,3) | 17299 | 9108 | 278723 | 30969
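
The sib-regression step referred to in the legend of this table can be sketched as follows, under the assumption that it regresses the difference in sibling phenotypes on the difference in sibling genotypes, one SNP at a time and without an intercept. The function name `sib_regression` and the simulated data are hypothetical, and for simplicity the toy sibling genotypes are drawn independently, which real siblings' genotypes are not.

```python
import numpy as np


def sib_regression(geno_sib1, geno_sib2, pheno_sib1, pheno_sib2):
    """Per-SNP slope from regressing sibling phenotype differences on
    sibling genotype differences (regression through the origin)."""
    dg = geno_sib1 - geno_sib2      # shape: (pairs, SNPs)
    dy = pheno_sib1 - pheno_sib2    # shape: (pairs,)
    return (dg * dy[:, None]).sum(axis=0) / (dg ** 2).sum(axis=0)


# Toy data (all values are made up for illustration).
rng = np.random.default_rng(1)
n_pairs, n_snps = 5000, 5
true_beta = np.array([0.3, -0.2, 0.1, 0.0, 0.25])
g1 = rng.binomial(2, 0.4, size=(n_pairs, n_snps)).astype(float)
g2 = rng.binomial(2, 0.4, size=(n_pairs, n_snps)).astype(float)
family = rng.normal(0.0, 2.0, size=n_pairs)   # effect shared by both siblings
y1 = g1 @ true_beta + family + rng.normal(size=n_pairs)
y2 = g2 @ true_beta + family + rng.normal(size=n_pairs)

beta_hat = sib_regression(g1, g2, y1, y2)
```

Any factor shared by both siblings, such as the `family` term here, cancels in the phenotype difference, which is why estimates from this design are not affected by it.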
Appendix 1—table 4
Qualifications to years of schooling conversion table.

Educational or professional qualifications were converted to years of schooling following Okbay et al. (2016).

Qualifications (UKB data field 6138) | Years of schooling
College or University degree | 20
NVQ or HND or HNC or equivalent | 19
Other professional qualifications eg: nursing, teaching | 15
A levels/AS levels or equivalent | 13
O levels/GCSEs or equivalent | 10
CSEs or equivalent | 10
None of the above | 7
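
The conversion in this table amounts to a simple lookup. The sketch below encodes it as a dictionary; the function name and the rule of taking the largest value when a participant reports several qualifications are assumptions made for illustration, not a rule stated in this table.

```python
# Years of schooling assigned to each qualification, copied from the table above.
YEARS_OF_SCHOOLING = {
    "College or University degree": 20,
    "NVQ or HND or HNC or equivalent": 19,
    "Other professional qualifications eg: nursing, teaching": 15,
    "A levels/AS levels or equivalent": 13,
    "O levels/GCSEs or equivalent": 10,
    "CSEs or equivalent": 10,
    "None of the above": 7,
}


def years_of_schooling(qualifications):
    """Return the years of schooling for a list of reported qualifications.

    Assumption: when several qualifications are reported, the largest value
    is used. Returns None if no recognized qualification is present.
    """
    years = [YEARS_OF_SCHOOLING[q] for q in qualifications if q in YEARS_OF_SCHOOLING]
    return max(years) if years else None


example = years_of_schooling(
    ["A levels/AS levels or equivalent", "College or University degree"]
)
```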

References

  1. 1
  2. 2
  3. 3
  4. 4
    Multi-ancestry genome-wide gene-smoking interaction study of 387,272 individuals identifies new loci associated with serum lipids
    1. AR Bentley
    2. YJ Sung
    3. MR Brown
    4. TW Winkler
    5. AT Kraja
    6. I Ntalla
    7. K Schwander
    8. DI Chasman
    9. E Lim
    10. X Deng
    11. X Guo
    12. J Liu
    13. Y Lu
    14. CY Cheng
    15. X Sim
    16. D Vojinovic
    17. JE Huffman
    18. SK Musani
    19. C Li
    20. MF Feitosa
    21. MA Richard
    22. R Noordam
    23. J Baker
    24. G Chen
    25. H Aschard
    26. TM Bartz
    27. J Ding
    28. R Dorajoo
    29. AK Manning
    30. T Rankinen
    31. AV Smith
    32. SM Tajuddin
    33. W Zhao
    34. M Graff
    35. M Alver
    36. M Boissel
    37. JF Chai
    38. X Chen
    39. J Divers
    40. E Evangelou
    41. C Gao
    42. A Goel
    43. Y Hagemeijer
    44. SE Harris
    45. FP Hartwig
    46. M He
    47. A Horimoto
    48. FC Hsu
    49. YJ Hung
    50. AU Jackson
    51. A Kasturiratne
    52. P Komulainen
    53. B Kühnel
    54. K Leander
    55. KH Lin
    56. J Luan
    57. LP Lyytikäinen
    58. N Matoba
    59. IM Nolte
    60. M Pietzner
    61. B Prins
    62. M Riaz
    63. A Robino
    64. MA Said
    65. N Schupf
    66. RA Scott
    67. T Sofer
    68. A Stancáková
    69. F Takeuchi
    70. BO Tayo
    71. PJ van der Most
    72. TV Varga
    73. TD Wang
    74. Y Wang
    75. EB Ware
    76. W Wen
    77. YB Xiang
    78. LR Yanek
    79. W Zhang
    80. JH Zhao
    81. A Adeyemo
    82. S Afaq
    83. N Amin
    84. M Amini
    85. DE Arking
    86. Z Arzumanyan
    87. T Aung
    88. C Ballantyne
    89. RG Barr
    90. LF Bielak
    91. E Boerwinkle
    92. EP Bottinger
    93. U Broeckel
    94. M Brown
    95. BE Cade
    96. A Campbell
    97. M Canouil
    98. S Charumathi
    99. YI Chen
    100. K Christensen
    101. MP Concas
    102. JM Connell
    103. L de Las Fuentes
    104. HJ de Silva
    105. PS de Vries
    106. A Doumatey
    107. Q Duan
    108. CB Eaton
    109. RN Eppinga
    110. JD Faul
    111. JS Floyd
    112. NG Forouhi
    113. T Forrester
    114. Y Friedlander
    115. I Gandin
    116. H Gao
    117. M Ghanbari
    118. SA Gharib
    119. B Gigante
    120. F Giulianini
    121. HJ Grabe
    122. CC Gu
    123. TB Harris
    124. S Heikkinen
    125. CK Heng
    126. M Hirata
    127. JE Hixson
    128. MA Ikram
    129. Y Jia
    130. R Joehanes
    131. C Johnson
    132. JB Jonas
    133. AE Justice
    134. T Katsuya
    135. CC Khor
    136. TO Kilpeläinen
    137. WP Koh
    138. I Kolcic
    139. C Kooperberg
    140. JE Krieger
    141. SB Kritchevsky
    142. M Kubo
    143. J Kuusisto
    144. TA Lakka
    145. CD Langefeld
    146. C Langenberg
    147. LJ Launer
    148. B Lehne
    149. CE Lewis
    150. Y Li
    151. J Liang
    152. S Lin
    153. CT Liu
    154. J Liu
    155. K Liu
    156. M Loh
    157. KK Lohman
    158. T Louie
    159. A Luzzi
    160. R Mägi
    161. A Mahajan
    162. AW Manichaikul
    163. CA McKenzie
    164. T Meitinger
    165. A Metspalu
    166. Y Milaneschi
    167. L Milani
    168. KL Mohlke
    169. Y Momozawa
    170. AP Morris
    171. AD Murray
    172. MA Nalls
    173. M Nauck
    174. CP Nelson
    175. KE North
    176. JR O'Connell
    177. ND Palmer
    178. GJ Papanicolau
    179. NL Pedersen
    180. A Peters
    181. PA Peyser
    182. O Polasek
    183. N Poulter
    184. OT Raitakari
    185. AP Reiner
    186. F Renström
    187. TK Rice
    188. SS Rich
    189. JG Robinson
    190. LM Rose
    191. FR Rosendaal
    192. I Rudan
    193. CO Schmidt
    194. PJ Schreiner
    195. WR Scott
    196. P Sever
    197. Y Shi
    198. S Sidney
    199. M Sims
    200. JA Smith
    201. H Snieder
    202. JM Starr
    203. K Strauch
    204. HM Stringham
    205. NYQ Tan
    206. H Tang
    207. KD Taylor
    208. YY Teo
    209. YC Tham
    210. H Tiemeier
    211. ST Turner
    212. AG Uitterlinden
    213. D van Heemst
    214. M Waldenberger
    215. H Wang
    216. L Wang
    217. L Wang
    218. WB Wei
    219. CA Williams
    220. G Wilson
    221. MK Wojczynski
    222. J Yao
    223. K Young
    224. C Yu
    225. JM Yuan
    226. J Zhou
    227. AB Zonderman
    228. DM Becker
    229. M Boehnke
    230. DW Bowden
    231. JC Chambers
    232. RS Cooper
    233. U de Faire
    234. IJ Deary
    235. P Elliott
    236. T Esko
    237. M Farrall
    238. PW Franks
    239. BI Freedman
    240. P Froguel
    241. P Gasparini
    242. C Gieger
    243. BL Horta
    244. JJ Juang
    245. Y Kamatani
    246. CM Kammerer
    247. N Kato
    248. JS Kooner
    249. M Laakso
    250. CC Laurie
    251. IT Lee
    252. T Lehtimäki
    253. PKE Magnusson
    254. AJ Oldehinkel
    255. B Penninx
    256. AC Pereira
    257. R Rauramaa
    258. S Redline
    259. NJ Samani
    260. J Scott
    261. XO Shu
    262. P van der Harst
    263. LE Wagenknecht
    264. JS Wang
    265. YX Wang
    266. NJ Wareham
    267. H Watkins
    268. DR Weir
    269. AR Wickremasinghe
    270. T Wu
    271. E Zeggini
    272. W Zheng
    273. C Bouchard
    274. MK Evans
    275. V Gudnason
    276. SLR Kardia
    277. Y Liu
    278. BM Psaty
    279. PM Ridker
    280. RM van Dam
    281. DO Mook-Kanamori
    282. M Fornage
    283. MA Province
    284. TN Kelly
    285. ER Fox
    286. C Hayward
    287. CM van Duijn
    288. ES Tai
    289. TY Wong
    290. RJF Loos
    291. N Franceschini
    292. JI Rotter
    293. X Zhu
    294. LJ Bierut
    295. WJ Gauderman
    296. K Rice
    297. PB Munroe
    298. AC Morrison
    299. DC Rao
    300. CN Rotimi
    301. LA Cupples
    302. COGENT-Kidney Consortium, EPIC-InterAct Consortium, Understanding Society Scientific Group, Lifelines Cohort
    (2019)
    Nature Genetics 51:636–648.
    https://doi.org/10.1038/s41588-019-0378-y
  5. 5
  6. 6
  7. 7
  8. 8
  9. 9
  10. 10
  11. 11
  12. 12
  13. 13
  14. 14
    Race and Economic Opportunity in the United States: An Intergenerational Perspective
    1. R Chetty
    2. N Hendren
    (2018)
    National Bureau of Economic Research.
  15. 15
    Being Black, Living in the Red: Race, Wealth, and Social Policy in America
    1. D Conley
    (2010)
    University of California Press.
  16. 16
  17. 17
  18. 18
  19. 19
  20. 20
  21. 21
  22. 22
  23. 23
  24. 24
  25. 25
  26. 26
  27. 27
    Genetic analysis of over 1 million people identifies 535 new loci associated with blood pressure traits
    1. E Evangelou
    2. HR Warren
    3. D Mosen-Ansorena
    4. B Mifsud
    5. R Pazoki
    6. H Gao
    7. G Ntritsos
    8. N Dimou
    9. CP Cabrera
    10. I Karaman
    11. FL Ng
    12. M Evangelou
    13. K Witkowska
    14. E Tzanis
    15. JN Hellwege
    16. A Giri
    17. DR Velez Edwards
    18. YV Sun
    19. K Cho
    20. JM Gaziano
    21. PWF Wilson
    22. PS Tsao
    23. CP Kovesdy
    24. T Esko
    25. R Mägi
    26. L Milani
    27. P Almgren
    28. T Boutin
    29. S Debette
    30. J Ding
    31. F Giulianini
    32. EG Holliday
    33. AU Jackson
    34. R Li-Gao
    35. WY Lin
    36. J Luan
    37. M Mangino
    38. C Oldmeadow
    39. BP Prins
    40. Y Qian
    41. M Sargurupremraj
    42. N Shah
    43. P Surendran
    44. S Thériault
    45. N Verweij
    46. SM Willems
    47. JH Zhao
    48. P Amouyel
    49. J Connell
    50. R de Mutsert
    51. ASF Doney
    52. M Farrall
    53. C Menni
    54. AD Morris
    55. R Noordam
    56. G Paré
    57. NR Poulter
    58. DC Shields
    59. A Stanton
    60. S Thom
    61. G Abecasis
    62. N Amin
    63. DE Arking
    64. KL Ayers
    65. CM Barbieri
    66. C Batini
    67. JC Bis
    68. T Blake
    69. M Bochud
    70. M Boehnke
    71. E Boerwinkle
    72. DI Boomsma
    73. EP Bottinger
    74. PS Braund
    75. M Brumat
    76. A Campbell
    77. H Campbell
    78. A Chakravarti
    79. JC Chambers
    80. G Chauhan
    81. M Ciullo
    82. M Cocca
    83. F Collins
    84. HJ Cordell
    85. G Davies
    86. MH de Borst
    87. EJ de Geus
    88. IJ Deary
    89. J Deelen
    90. F Del Greco M
    91. CY Demirkale
    92. M Dörr
    93. GB Ehret
    94. R Elosua
    95. S Enroth
    96. AM Erzurumluoglu
    97. T Ferreira
    98. M Frånberg
    99. OH Franco
    100. I Gandin
    101. P Gasparini
    102. V Giedraitis
    103. C Gieger
    104. G Girotto
    105. A Goel
    106. AJ Gow
    107. V Gudnason
    108. X Guo
    109. U Gyllensten
    110. A Hamsten
    111. TB Harris
    112. SE Harris
    113. CA Hartman
    114. AS Havulinna
    115. AA Hicks
    116. E Hofer
    117. A Hofman
    118. JJ Hottenga
    119. JE Huffman
    120. SJ Hwang
    121. E Ingelsson
    122. A James
    123. R Jansen
    124. MR Jarvelin
    125. R Joehanes
    126. Å Johansson
    127. AD Johnson
    128. PK Joshi
    129. P Jousilahti
    130. JW Jukema
    131. A Jula
    132. M Kähönen
    133. S Kathiresan
    134. BD Keavney
    135. KT Khaw
    136. P Knekt
    137. J Knight
    138. I Kolcic
    139. JS Kooner
    140. S Koskinen
    141. K Kristiansson
    142. Z Kutalik
    143. M Laan
    144. M Larson
    145. LJ Launer
    146. B Lehne
    147. T Lehtimäki
    148. DCM Liewald
    149. L Lin
    150. L Lind
    151. CM Lindgren
    152. Y Liu
    153. RJF Loos
    154. LM Lopez
    155. Y Lu
    156. LP Lyytikäinen
    157. A Mahajan
    158. C Mamasoula
    159. J Marrugat
    160. J Marten
    161. Y Milaneschi
    162. A Morgan
    163. AP Morris
    164. AC Morrison
    165. PJ Munson
    166. MA Nalls
    167. P Nandakumar
    168. CP Nelson
    169. T Niiranen
    170. IM Nolte
    171. T Nutile
    172. AJ Oldehinkel
    173. BA Oostra
    174. PF O'Reilly
    175. E Org
    176. S Padmanabhan
    177. W Palmas
    178. A Palotie
    179. A Pattie
    180. B Penninx
    181. M Perola
    182. A Peters
    183. O Polasek
    184. PP Pramstaller
    185. QT Nguyen
    186. OT Raitakari
    187. M Ren
    188. R Rettig
    189. K Rice
    190. PM Ridker
    191. JS Ried
    192. H Riese
    193. S Ripatti
    194. A Robino
    195. LM Rose
    196. JI Rotter
    197. I Rudan
    198. D Ruggiero
    199. Y Saba
    200. CF Sala
    201. V Salomaa
    202. NJ Samani
    203. AP Sarin
    204. R Schmidt
    205. H Schmidt
    206. N Shrine
    207. D Siscovick
    208. AV Smith
    209. H Snieder
    210. S Sõber
    211. R Sorice
    212. JM Starr
    213. DJ Stott
    214. DP Strachan
    215. RJ Strawbridge
    216. J Sundström
    217. MA Swertz
    218. KD Taylor
    219. A Teumer
    220. MD Tobin
    221. M Tomaszewski
    222. D Toniolo
    223. M Traglia
    224. S Trompet
    225. J Tuomilehto
    226. C Tzourio
    227. AG Uitterlinden
    228. A Vaez
    229. PJ van der Most
    230. CM van Duijn
    231. AC Vergnaud
    232. GC Verwoert
    233. V Vitart
    234. U Völker
    235. P Vollenweider
    236. D Vuckovic
    237. H Watkins
    238. SH Wild
    239. G Willemsen
    240. JF Wilson
    241. AF Wright
    242. J Yao
    243. T Zemunik
    244. W Zhang
    245. JR Attia
    246. AS Butterworth
    247. DI Chasman
    248. D Conen
    249. F Cucca
    250. J Danesh
    251. C Hayward
    252. JMM Howson
    253. M Laakso
    254. EG Lakatta
    255. C Langenberg
    256. O Melander
    257. DO Mook-Kanamori
    258. CNA Palmer
    259. L Risch
    260. RA Scott
    261. RJ Scott
    262. P Sever
    263. TD Spector
    264. P van der Harst
    265. NJ Wareham
    266. E Zeggini
    267. D Levy
    268. PB Munroe
    269. C Newton-Cheh
    270. MJ Brown
    271. A Metspalu
    272. AM Hung
    273. CJ O'Donnell
    274. TL Edwards
    275. BM Psaty
    276. I Tzoulaki
    277. MR Barnes
    278. LV Wain
    279. P Elliott
    280. MJ Caulfield
    281. Million Veteran Program
    (2018)
    Nature Genetics 50:1412–1425.
    https://doi.org/10.1038/s41588-018-0205-x
  28. 28
  29. 29
  30. 30
  31. 31
  32. 32
  33. 33
    Applications of Linear Models in Animal Breeding, 462
    1. CR Henderson
    (1984)
    Guelph: University of Guelph.
  34. 34
  35. 35
  36. 36
  37. 37
  38. 38
  39. 39
  40. 40
  41. 41
  42. 42
  43. 43
  44. 44
  45. 45
  46. 46
    Genetics and Analysis of Quantitative Traits, 1
    1. M Lynch
    2. B Walsh
    (1998)
    Sunderland, MA: Sinauer.
  47. 47
  48. 48
  49. 49
  50. 50
  51. 51
    Polygenic risk scores for prediction of breast cancer and breast cancer subtypes
    1. N Mavaddat
    2. K Michailidou
    3. J Dennis
    4. M Lush
    5. L Fachal
    6. A Lee
    7. JP Tyrer
    8. TH Chen
    9. Q Wang
    10. MK Bolla
    11. X Yang
    12. MA Adank
    13. T Ahearn
    14. K Aittomäki
    15. J Allen
    16. IL Andrulis
    17. H Anton-Culver
    18. NN Antonenkova
    19. V Arndt
    20. KJ Aronson
    21. PL Auer
    22. P Auvinen
    23. M Barrdahl
    24. LE Beane Freeman
    25. MW Beckmann
    26. S Behrens
    27. J Benitez
    28. M Bermisheva
    29. L Bernstein
    30. C Blomqvist
    31. NV Bogdanova
    32. SE Bojesen
    33. B Bonanni
    34. AL Børresen-Dale
    35. H Brauch
    36. M Bremer
    37. H Brenner
    38. A Brentnall
    39. IW Brock
    40. A Brooks-Wilson
    41. SY Brucker
    42. T Brüning
    43. B Burwinkel
    44. D Campa
    45. BD Carter
    46. JE Castelao
    47. SJ Chanock
    48. R Chlebowski
    49. H Christiansen
    50. CL Clarke
    51. JM Collée
    52. E Cordina-Duverger
    53. S Cornelissen
    54. FJ Couch
    55. A Cox
    56. SS Cross
    57. K Czene
    58. MB Daly
    59. P Devilee
    60. T Dörk
    61. I Dos-Santos-Silva
    62. M Dumont
    63. L Durcan
    64. M Dwek
    65. DM Eccles
    66. AB Ekici
    67. AH Eliassen
    68. C Ellberg
    69. C Engel
    70. M Eriksson
    71. DG Evans
    72. PA Fasching
    73. J Figueroa
    74. O Fletcher
    75. H Flyger
    76. A Försti
    77. L Fritschi
    78. M Gabrielson
    79. M Gago-Dominguez
    80. SM Gapstur
    81. JA García-Sáenz
    82. MM Gaudet
    83. V Georgoulias
    84. GG Giles
    85. IR Gilyazova
    86. G Glendon
    87. MS Goldberg
    88. DE Goldgar
    89. A González-Neira
    90. GI Grenaker Alnæs
    91. M Grip
    92. J Gronwald
    93. A Grundy
    94. P Guénel
    95. L Haeberle
    96. E Hahnen
    97. CA Haiman
    98. N Håkansson
    99. U Hamann
    100. SE Hankinson
    101. EF Harkness
    102. SN Hart
    103. W He
    104. A Hein
    105. J Heyworth
    106. P Hillemanns
    107. A Hollestelle
    108. MJ Hooning
    109. RN Hoover
    110. JL Hopper
    111. A Howell
    112. G Huang
    113. K Humphreys
    114. DJ Hunter
    115. M Jakimovska
    116. A Jakubowska
    117. W Janni
    118. EM John
    119. N Johnson
    120. ME Jones
    121. A Jukkola-Vuorinen
    122. A Jung
    123. R Kaaks
    124. K Kaczmarek
    125. V Kataja
    126. R Keeman
    127. MJ Kerin
    128. E Khusnutdinova
    129. JI Kiiski
    130. JA Knight
    131. YD Ko
    132. VM Kosma
    133. S Koutros
    134. VN Kristensen
    135. U Krüger
    136. T Kühl
    137. D Lambrechts
    138. L Le Marchand
    139. E Lee
    140. F Lejbkowicz
    141. J Lilyquist
    142. A Lindblom
    143. S Lindström
    144. J Lissowska
    145. WY Lo
    146. S Loibl
    147. J Long
    148. J Lubiński
    149. MP Lux
    150. RJ MacInnis
    151. T Maishman
    152. E Makalic
    153. I Maleva Kostovska
    154. A Mannermaa
    155. S Manoukian
    156. S Margolin
    157. JWM Martens
    158. ME Martinez
    159. D Mavroudis
    160. C McLean
    161. A Meindl
    162. U Menon
    163. P Middha
    164. N Miller
    165. F Moreno
    166. AM Mulligan
    167. C Mulot
    168. VM Muñoz-Garzon
    169. SL Neuhausen
    170. H Nevanlinna
    171. P Neven
    172. WG Newman
    173. SF Nielsen
    174. BG Nordestgaard
    175. A Norman
    176. K Offit
    177. JE Olson
    178. H Olsson
    179. N Orr
    180. VS Pankratz
    181. TW Park-Simon
    182. JIA Perez
    183. C Pérez-Barrios
    184. P Peterlongo
    185. J Peto
    186. M Pinchev
    187. D Plaseska-Karanfilska
    188. EC Polley
    189. R Prentice
    190. N Presneau
    191. D Prokofyeva
    192. K Purrington
    193. K Pylkäs
    194. B Rack
    195. P Radice
    196. R Rau-Murthy
    197. G Rennert
    198. HS Rennert
    199. V Rhenius
    200. M Robson
    201. A Romero
    202. KJ Ruddy
    203. M Ruebner
    204. E Saloustros
    205. DP Sandler
    206. EJ Sawyer
    207. DF Schmidt
    208. RK Schmutzler
    209. A Schneeweiss
    210. MJ Schoemaker
    211. F Schumacher
    212. P Schürmann
    213. L Schwentner
    214. C Scott
    215. RJ Scott
    216. C Seynaeve
    217. M Shah
    218. ME Sherman
    219. MJ Shrubsole
    220. XO Shu
    221. S Slager
    222. A Smeets
    223. C Sohn
    224. P Soucy
    225. MC Southey
    226. JJ Spinelli
    227. C Stegmaier
    228. J Stone
    229. AJ Swerdlow
    230. RM Tamimi
    231. WJ Tapper
    232. JA Taylor
    233. MB Terry
    234. K Thöne
    235. R Tollenaar
    236. I Tomlinson
    237. T Truong
    238. M Tzardi
    239. HU Ulmer
    240. M Untch
    241. CM Vachon
    242. EM van Veen
    243. J Vijai
    244. CR Weinberg
    245. C Wendt
    246. AS Whittemore
    247. H Wildiers
    248. W Willett
    249. R Winqvist
    250. A Wolk
    251. XR Yang
    252. D Yannoukakos
    253. Y Zhang
    254. W Zheng
    255. A Ziogas
    256. AM Dunning
    257. DJ Thompson
    258. G Chenevix-Trench
    259. J Chang-Claude
    260. MK Schmidt
    261. P Hall
    262. RL Milne
    263. PDP Pharoah
    264. AC Antoniou
    265. N Chatterjee
    266. P Kraft
    267. M García-Closas
    268. J Simard
    269. DF Easton
    270. ABCTB Investigators, kConFab/AOCS Investigators, NBCS Collaborators
    (2019)
    The American Journal of Human Genetics 104:21–34.
    https://doi.org/10.1016/j.ajhg.2018.11.002
  52. 52
    Prediction of total genetic value using genome-wide dense marker maps
    1. TH Meuwissen
    2. BJ Hayes
    3. ME Goddard
    (2001)
    Genetics 157:1819–1829.
  53. 53
  54. 54
  55. 55
  56. 56
    Genome-wide association study identifies 74 loci associated with educational attainment
    1. A Okbay
    2. JP Beauchamp
    3. MA Fontana
    4. JJ Lee
    5. TH Pers
    6. CA Rietveld
    7. P Turley
    8. GB Chen
    9. V Emilsson
    10. SF Meddens
    11. S Oskarsson
    12. JK Pickrell
    13. K Thom
    14. P Timshel
    15. R de Vlaming
    16. A Abdellaoui
    17. TS Ahluwalia
    18. J Bacelis
    19. C Baumbach
    20. G Bjornsdottir
    21. JH Brandsma
    22. M Pina Concas
    23. J Derringer
    24. NA Furlotte
    25. TE Galesloot
    26. G Girotto
    27. R Gupta
    28. LM Hall
    29. SE Harris
    30. E Hofer
    31. M Horikoshi
    32. JE Huffman
    33. K Kaasik
    34. IP Kalafati
    35. R Karlsson
    36. A Kong
    37. J Lahti
    38. SJ van der Lee
    39. C deLeeuw
    40. PA Lind
    41. KO Lindgren
    42. T Liu
    43. M Mangino
    44. J Marten
    45. E Mihailov
    46. MB Miller
    47. PJ van der Most
    48. C Oldmeadow
    49. A Payton
    50. N Pervjakova
    51. WJ Peyrot
    52. Y Qian
    53. O Raitakari
    54. R Rueedi
    55. E Salvi
    56. B Schmidt
    57. KE Schraut
    58. J Shi
    59. AV Smith
    60. RA Poot
    61. B St Pourcain
    62. A Teumer
    63. G Thorleifsson
    64. N Verweij
    65. D Vuckovic
    66. J Wellmann
    67. HJ Westra
    68. J Yang
    69. W Zhao
    70. Z Zhu
    71. BZ Alizadeh
    72. N Amin
    73. A Bakshi
    74. SE Baumeister
    75. G Biino
    76. K Bønnelykke
    77. PA Boyle
    78. H Campbell
    79. FP Cappuccio
    80. G Davies
    81. JE De Neve
    82. P Deloukas
    83. I Demuth
    84. J Ding
    85. P Eibich
    86. L Eisele
    87. N Eklund
    88. DM Evans
    89. JD Faul
    90. MF Feitosa
    91. AJ Forstner
    92. I Gandin
    93. B Gunnarsson
    94. BV Halldórsson
    95. TB Harris
    96. AC Heath
    97. LJ Hocking
    98. EG Holliday
    99. G Homuth
    100. MA Horan
    101. JJ Hottenga
    102. PL de Jager
    103. PK Joshi
    104. A Jugessur
    105. MA Kaakinen
    106. M Kähönen
    107. S Kanoni
    108. L Keltigangas-Järvinen
    109. LA Kiemeney
    110. I Kolcic
    111. S Koskinen
    112. AT Kraja
    113. M Kroh
    114. Z Kutalik
    115. A Latvala
    116. LJ Launer
    117. MP Lebreton
    118. DF Levinson
    119. P Lichtenstein
    120. P Lichtner
    121. DC Liewald
    122. A Loukola
    123. PA Madden
    124. R Mägi
    125. T Mäki-Opas
    126. RE Marioni
    127. P Marques-Vidal
    128. GA Meddens
    129. G McMahon
    130. C Meisinger
    131. T Meitinger
    132. Y Milaneschi
    133. L Milani
    134. GW Montgomery
    135. R Myhre
    136. CP Nelson
    137. DR Nyholt
    138. WE Ollier
    139. A Palotie
    140. L Paternoster
    141. NL Pedersen
    142. KE Petrovic
    143. DJ Porteous
    144. K Räikkönen
    145. SM Ring
    146. A Robino
    147. O Rostapshova
    148. I Rudan
    149. A Rustichini
    150. V Salomaa
    151. AR Sanders
    152. AP Sarin
    153. H Schmidt
    154. RJ Scott
    155. BH Smith
    156. JA Smith
    157. JA Staessen
    158. E Steinhagen-Thiessen
    159. K Strauch
    160. A Terracciano
    161. MD Tobin
    162. S Ulivi
    163. S Vaccargiu
    164. L Quaye
    165. FJ van Rooij
    166. C Venturini
    167. AA Vinkhuyzen
    168. U Völker
    169. H Völzke
    170. JM Vonk
    171. D Vozzi
    172. J Waage
    173. EB Ware
    174. G Willemsen
    175. JR Attia
    176. DA Bennett
    177. K Berger
    178. L Bertram
    179. H Bisgaard
    180. DI Boomsma
    181. IB Borecki
    182. U Bültmann
    183. CF Chabris
    184. F Cucca
    185. D Cusi
    186. IJ Deary
    187. GV Dedoussis
    188. CM van Duijn
    189. JG Eriksson
    190. B Franke
    191. L Franke
    192. P Gasparini
    193. PV Gejman
    194. C Gieger
    195. HJ Grabe
    196. J Gratten
    197. PJ Groenen
    198. V Gudnason
    199. P van der Harst
    200. C Hayward
    201. DA Hinds
    202. W Hoffmann
    203. E Hyppönen
    204. WG Iacono
    205. B Jacobsson
    206. MR Järvelin
    207. KH Jöckel
    208. J Kaprio
    209. SL Kardia
    210. T Lehtimäki
    211. SF Lehrer
    212. PK Magnusson
    213. NG Martin
    214. M McGue
    215. A Metspalu
    216. N Pendleton
    217. BW Penninx
    218. M Perola
    219. N Pirastu
    220. M Pirastu
    221. O Polasek
    222. D Posthuma
    223. C Power
    224. MA Province
    225. NJ Samani
    226. D Schlessinger
    227. R Schmidt
    228. TI Sørensen
    229. TD Spector
    230. K Stefansson
    231. U Thorsteinsdottir
    232. AR Thurik
    233. NJ Timpson
    234. H Tiemeier
    235. JY Tung
    236. AG Uitterlinden
    237. V Vitart
    238. P Vollenweider
    239. DR Weir
    240. JF Wilson
    241. AF Wright
    242. DC Conley
    243. RF Krueger
    244. G Davey Smith
    245. A Hofman
    246. DI Laibson
    247. SE Medland
    248. MN Meyer
    249. J Yang
    250. M Johannesson
    251. PM Visscher
    252. T Esko
    253. PD Koellinger
    254. D Cesarini
    255. DJ Benjamin
    256. LifeLines Cohort Study
    (2016)
    Nature 533:539–542.
    https://doi.org/10.1038/nature17671
  57. 57
  58. 58
  59. 59
  60. 60
  61. 61
  62. 62
  63. 63
  64. 64
    Racial Inequality: A Political-Economic Analysis
    1. M Reich
    (2017)
    Princeton University Press.
  65. 65
  66. 66
  67. 67
  68. 68
  69. 69
  70. 70
  71. 71
  72. 72
  73. 73
  74. 74
  75. 75
  76. 76
  77. 77
  78. 78
  79. 79
  80. 80
  81. 81
  82. 82
  83. 83
  84. 84
  85. 85
  86. 86
    Defining the role of common variation in the genomic and biological architecture of adult human height
    1. AR Wood
    2. T Esko
    3. J Yang
    4. S Vedantam
    5. TH Pers
    6. S Gustafsson
    7. AY Chu
    8. K Estrada
    9. J Luan
    10. Z Kutalik
    11. N Amin
    12. ML Buchkovich
    13. DC Croteau-Chonka
    14. FR Day
    15. Y Duan
    16. T Fall
    17. R Fehrmann
    18. T Ferreira
    19. AU Jackson
    20. J Karjalainen
    21. KS Lo
    22. AE Locke
    23. R Mägi
    24. E Mihailov
    25. E Porcu
    26. JC Randall
    27. A Scherag
    28. AA Vinkhuyzen
    29. HJ Westra
    30. TW Winkler
    31. T Workalemahu
    32. JH Zhao
    33. D Absher
    34. E Albrecht
    35. D Anderson
    36. J Baron
    37. M Beekman
    38. A Demirkan
    39. GB Ehret
    40. B Feenstra
    41. MF Feitosa
    42. K Fischer
    43. RM Fraser
    44. A Goel
    45. J Gong
    46. AE Justice
    47. S Kanoni
    48. ME Kleber
    49. K Kristiansson
    50. U Lim
    51. V Lotay
    52. JC Lui
    53. M Mangino
    54. I Mateo Leach
    55. C Medina-Gomez
    56. MA Nalls
    57. DR Nyholt
    58. CD Palmer
    59. D Pasko
    60. S Pechlivanis
    61. I Prokopenko
    62. JS Ried
    63. S Ripke
    64. D Shungin
    65. A Stancáková
    66. RJ Strawbridge
    67. YJ Sung
    68. T Tanaka
    69. A Teumer
    70. S Trompet
    71. SW van der Laan
    72. J van Setten
    73. JV Van Vliet-Ostaptchouk
    74. Z Wang
    75. L Yengo
    76. W Zhang
    77. U Afzal
    78. J Arnlöv
    79. GM Arscott
    80. S Bandinelli
    81. A Barrett
    82. C Bellis
    83. AJ Bennett
    84. C Berne
    85. M Blüher
    86. JL Bolton
    87. Y Böttcher
    88. HA Boyd
    89. M Bruinenberg
    90. BM Buckley
    91. S Buyske
    92. IH Caspersen
    93. PS Chines
    94. R Clarke
    95. S Claudi-Boehm
    96. M Cooper
    97. EW Daw
    98. PA De Jong
    99. J Deelen
    100. G Delgado
    101. JC Denny
    102. R Dhonukshe-Rutten
    103. M Dimitriou
    104. AS Doney
    105. M Dörr
    106. N Eklund
    107. E Eury
    108. L Folkersen
    109. ME Garcia
    110. F Geller
    111. V Giedraitis
    112. AS Go
    113. H Grallert
    114. TB Grammer
    115. J Gräßler
    116. H Grönberg
    117. LC de Groot
    118. CJ Groves
    119. J Haessler
    120. P Hall
    121. T Haller
    122. G Hallmans
    123. A Hannemann
    124. CA Hartman
    125. M Hassinen
    126. C Hayward
    127. NL Heard-Costa
    128. Q Helmer
    129. G Hemani
    130. AK Henders
    131. HL Hillege
    132. MA Hlatky
    133. W Hoffmann
    134. P Hoffmann
    135. O Holmen
    136. JJ Houwing-Duistermaat
    137. T Illig
    138. A Isaacs
    139. AL James
    140. J Jeff
    141. B Johansen
    142. Å Johansson
    143. J Jolley
    144. T Juliusdottir
    145. J Junttila
    146. AN Kho
    147. L Kinnunen
    148. N Klopp
    149. T Kocher
    150. W Kratzer
    151. P Lichtner
    152. L Lind
    153. J Lindström
    154. S Lobbens
    155. M Lorentzon
    156. Y Lu
    157. V Lyssenko
    158. PK Magnusson
    159. A Mahajan
    160. M Maillard
    161. WL McArdle
    162. CA McKenzie
    163. S McLachlan
    164. PJ McLaren
    165. C Menni
    166. S Merger
    167. L Milani
    168. A Moayyeri
    169. KL Monda
    170. MA Morken
    171. G Müller
    172. M Müller-Nurasyid
    173. AW Musk
    174. N Narisu
    175. M Nauck
    176. IM Nolte
    177. MM Nöthen
    178. L Oozageer
    179. S Pilz
    180. NW Rayner
    181. F Renstrom
    182. NR Robertson
    183. LM Rose
    184. R Roussel
    185. S Sanna
    186. H Scharnagl
    187. S Scholtens
    188. FR Schumacher
    189. H Schunkert
    190. RA Scott
    191. J Sehmi
    192. T Seufferlein
    193. J Shi
    194. K Silventoinen
    195. JH Smit
    196. AV Smith
    197. J Smolonska
    198. AV Stanton
    199. K Stirrups
    200. DJ Stott
    201. HM Stringham
    202. J Sundström
    203. MA Swertz
    204. AC Syvänen
    205. BO Tayo
    206. G Thorleifsson
    207. JP Tyrer
    208. S van Dijk
    209. NM van Schoor
    210. N van der Velde
    211. D van Heemst
    212. FV van Oort
    213. SH Vermeulen
    214. N Verweij
    215. JM Vonk
    216. LL Waite
    217. M Waldenberger
    218. R Wennauer
    219. LR Wilkens
    220. C Willenborg
    221. T Wilsgaard
    222. MK Wojczynski
    223. A Wong
    224. AF Wright
    225. Q Zhang
    226. D Arveiler
    227. SJ Bakker
    228. J Beilby
    229. RN Bergman
    230. S Bergmann
    231. R Biffar
    232. J Blangero
    233. DI Boomsma
    234. SR Bornstein
    235. P Bovet
    236. P Brambilla
    237. MJ Brown
    238. H Campbell
    239. MJ Caulfield
    240. A Chakravarti
    241. R Collins
    242. FS Collins
    243. DC Crawford
    244. LA Cupples
    245. J Danesh
    246. U de Faire
    247. HM den Ruijter
    248. R Erbel
    249. J Erdmann
    250. JG Eriksson
    251. M Farrall
    252. E Ferrannini
    253. J Ferrières
    254. I Ford
    255. NG Forouhi
    256. T Forrester
    257. RT Gansevoort
    258. PV Gejman
    259. C Gieger
    260. A Golay
    261. O Gottesman
    262. V Gudnason
    263. U Gyllensten
    264. DW Haas
    265. AS Hall
    266. TB Harris
    267. AT Hattersley
    268. AC Heath
    269. C Hengstenberg
    270. AA Hicks
    271. LA Hindorff
    272. AD Hingorani
    273. A Hofman
    274. GK Hovingh
    275. SE Humphries
    276. SC Hunt
    277. E Hypponen
    278. KB Jacobs
    279. MR Jarvelin
    280. P Jousilahti
    281. AM Jula
    282. J Kaprio
    283. JJ Kastelein
    284. M Kayser
    285. F Kee
    286. SM Keinanen-Kiukaanniemi
    287. LA Kiemeney
    288. JS Kooner
    289. C Kooperberg
    290. S Koskinen
    291. P Kovacs
    292. AT Kraja
    293. M Kumari
    294. J Kuusisto
    295. TA Lakka
    296. C Langenberg
    297. L Le Marchand
    298. T Lehtimäki
    299. S Lupoli
    300. PA Madden
    301. S Männistö
    302. P Manunta
    303. A Marette
    304. TC Matise
    305. B McKnight
    306. T Meitinger
    307. FL Moll
    308. GW Montgomery
    309. AD Morris
    310. AP Morris
    311. JC Murray
    312. M Nelis
    313. C Ohlsson
    314. AJ Oldehinkel
    315. KK Ong
    316. WH Ouwehand
    317. G Pasterkamp
    318. A Peters
    319. PP Pramstaller
    320. JF Price
    321. L Qi
    322. OT Raitakari
    323. T Rankinen
    324. DC Rao
    325. TK Rice
    326. M Ritchie
    327. I Rudan
    328. V Salomaa
    329. NJ Samani
    330. J Saramies
    331. MA Sarzynski
    332. PE Schwarz
    333. S Sebert
    334. P Sever
    335. AR Shuldiner
    336. J Sinisalo
    337. V Steinthorsdottir
    338. RP Stolk
    339. JC Tardif
    340. A Tönjes
    341. A Tremblay
    342. E Tremoli
    343. J Virtamo
    344. MC Vohl
    345. P Amouyel
    346. FW Asselbergs
    347. TL Assimes
    348. M Bochud
    349. BO Boehm
    350. E Boerwinkle
    351. EP Bottinger
    352. C Bouchard
    353. S Cauchi
    354. JC Chambers
    355. SJ Chanock
    356. RS Cooper
    357. PI de Bakker
    358. G Dedoussis
    359. L Ferrucci
    360. PW Franks
    361. P Froguel
    362. LC Groop
    363. CA Haiman
    364. A Hamsten
    365. MG Hayes
    366. J Hui
    367. DJ Hunter
    368. K Hveem
    369. JW Jukema
    370. RC Kaplan
    371. M Kivimaki
    372. D Kuh
    373. M Laakso
    374. Y Liu
    375. NG Martin
    376. W März
    377. M Melbye
    378. S Moebus
    379. PB Munroe
    380. I Njølstad
    381. BA Oostra
    382. CN Palmer
    383. NL Pedersen
    384. M Perola
    385. L Pérusse
    386. U Peters
    387. JE Powell
    388. C Power
    389. T Quertermous
    390. R Rauramaa
    391. E Reinmaa
    392. PM Ridker
    393. F Rivadeneira
    394. JI Rotter
    395. TE Saaristo
    396. D Saleheen
    397. D Schlessinger
    398. PE Slagboom
    399. H Snieder
    400. TD Spector
    401. K Strauch
    402. M Stumvoll
    403. J Tuomilehto
    404. M Uusitupa
    405. P van der Harst
    406. H Völzke
    407. M Walker
    408. NJ Wareham
    409. H Watkins
    410. HE Wichmann
    411. JF Wilson
    412. P Zanen
    413. P Deloukas
    414. IM Heid
    415. CM Lindgren
    416. KL Mohlke
    417. EK Speliotes
    418. U Thorsteinsdottir
    419. I Barroso
    420. CS Fox
    421. KE North
    422. DP Strachan
    423. JS Beckmann
    424. SI Berndt
    425. M Boehnke
    426. IB Borecki
    427. MI McCarthy
    428. A Metspalu
    429. K Stefansson
    430. AG Uitterlinden
    431. CM van Duijn
    432. L Franke
    433. CJ Willer
    434. AL Price
    435. G Lettre
    436. RJ Loos
    437. MN Weedon
    438. E Ingelsson
    439. JR O'Connell
    440. GR Abecasis
    441. DI Chasman
    442. ME Goddard
    443. PM Visscher
    444. JN Hirschhorn
    445. TM Frayling
    446. Electronic Medical Records and Genomics (eMERGE) Consortium, MIGen Consortium, PAGE Consortium, LifeLines Cohort Study
    (2014)
    Nature Genetics 46:1173–1186.
    https://doi.org/10.1038/ng.3097
  87. 87
  88. 88