Polygenic Scores: How well can we separate genetics from the environment?

A simulation study demonstrates a better method for separating genetic effects from environmental effects in genome-wide association studies, but there is still some way to go before this becomes a "solved" problem.
  1. Jennifer Blanc
  2. Jeremy J Berg (corresponding author)
  1. Human Genetics, University of Chicago, United States

A person’s traits – such as their height or risk of disease – result from a complex interplay between the genes they inherit and the environments they experience over their lifetime. To cut through some of this complexity, human geneticists use a tool called a polygenic score, which attempts to predict a person’s traits solely from their genes (Rosenberg et al., 2019).

To build a polygenic score, geneticists first enroll a large number of people in a genome-wide association study (GWAS). For each participant, researchers measure numerous genetic variants across their genome, together with a trait of interest, and use this data to determine the extent to which different variants are associated with the trait. This information makes it possible to take the genome of someone who was not involved in the original GWAS and add up the effects of multiple genetic variants to calculate a polygenic score for that trait (Figure 1A). These scores have been used to predict a person’s risk of developing a disease (Torkamani et al., 2018), to study our evolutionary past (Rosenberg et al., 2019), and to help understand complex social outcomes (Harden and Koellinger, 2020).
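The "adding up" step can be sketched in a few lines of Python. This is a minimal illustration, assuming genotypes are coded as 0, 1 or 2 copies of the effect allele and that a GWAS has already supplied one effect estimate per variant; the function name and the toy numbers are hypothetical and are not taken from any study cited here.

```python
import numpy as np


def polygenic_score(genotypes, effect_sizes):
    """Weighted sum of allele counts, one score per individual.

    genotypes: (n_individuals, n_variants) array of 0/1/2 allele counts.
    effect_sizes: (n_variants,) array of per-variant GWAS effect estimates.
    """
    genotypes = np.asarray(genotypes, dtype=float)
    effect_sizes = np.asarray(effect_sizes, dtype=float)
    return genotypes @ effect_sizes


# Hypothetical toy data: 3 individuals genotyped at 4 variants.
toy_genotypes = np.array([
    [0, 1, 2, 0],
    [1, 1, 0, 2],
    [2, 0, 1, 1],
])
toy_effects = np.array([0.5, -0.2, 0.1, 0.3])  # made-up effect estimates

toy_scores = polygenic_score(toy_genotypes, toy_effects)
print(toy_scores)
```

Real pipelines typically add steps such as choosing which variants to include and accounting for correlations between nearby variants, which this sketch leaves out.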

Figure 1. Correcting biases in polygenic scores.

(A) A genome-wide association study (GWAS) measures the trait of interest (phenotype) and the genotype of a sample of individuals and uses this data (middle graph) to see which genetic variants (represented by individual dots) are associated with the trait of interest (shown in red). This information is used to compute the polygenic score of individuals not in the original sample. Individuals with a higher polygenic score (orange) are predicted to have a higher trait value (e.g. to be taller or to have a greater risk of disease), while those with a lower polygenic score are predicted to have a lower trait value (bottom graph). (B) Zaidi and Mathieson simulated genetic data for a population that separated into subpopulations in the recent past; the environment was simulated as a six-by-six grid (left) in which environmental factors associated with the trait of interest vary smoothly from top to bottom. The uncorrected mean polygenic scores (top right) have a structure that clearly mirrors the structure in the environment. Correcting the scores with the 'common PCA' approach (middle right) does not solve this problem, but correction with the 'rare PCA' approach (bottom right) does. (C) However, when differences in the environmental factors were localized to a single square in the grid (shown in yellow), not even the rare PCA model could eliminate the correlation between genetic and environmental effects (indicated by an asterisk).

Image credit: Panel A – top (Stux, CC0), middle (Figure 1, Hu et al., 2016, CC BY 4.0), bottom (Jennifer Blanc); Panel B (Adapted from Figure 4, Zaidi and Mathieson, 2020).

However, efforts to use polygenic scores face substantial obstacles. All human populations exhibit genetic structure – variation in how genetically similar pairs of individuals are to one another – due to the complex history of geographic separation, population mixtures and migrations that have occurred throughout our evolutionary history. If this genetic structure correlates with patterns of environmental variation, it will cause many genetic variants to be incorrectly associated with a trait. This phenomenon, which is known as population stratification, will introduce biases into polygenic scores and undermine their purpose (which is to separate out the genetic component of trait variation).

To overcome this barrier, researchers would ideally measure the relevant environmental effects in the GWAS sample and include them as statistical controls in their analyses. However, it is difficult – if not impossible – to quantify all environmental effects on a given trait. Existing theory suggests that researchers can use the patterns of genetic variation they have already measured to model the genetic structure of the GWAS sample, and use this model as a statistical control instead (Song et al., 2015; Wang and Blei, 2019). In essence, because the problem arises from correlations between the environmental effects and patterns of genetic structure, it can be solved by controlling for either of them. The difficulty lies in how to correctly model this genetic structure. Geneticists favor a method called principal components analysis (PCA) (Price et al., 2006), as its simplicity and computational feasibility make it easy to apply to massive GWAS datasets. But the approach has limitations, and population stratification remains an issue in practice (Mathieson and McVean, 2012; Berg et al., 2019; Sohail et al., 2019).
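To make the logic of stratification and PCA correction concrete, here is a toy sketch in Python. It assumes just two groups with slightly different allele frequencies and a trait that is purely environmental, so any gap in expected polygenic score between the groups is stratification bias. This is not the spatial simulation design used by Zaidi and Mathieson; every name, sample size and parameter value below is a hypothetical choice made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_per_group, n_variants = 500, 1000  # hypothetical sizes

# Two groups whose allele frequencies have drifted slightly apart.
base_freq = rng.uniform(0.2, 0.8, n_variants)
drift = rng.normal(0.0, 0.05, n_variants)
freq_a = np.clip(base_freq + drift, 0.05, 0.95)
freq_b = np.clip(base_freq - drift, 0.05, 0.95)
genotypes = np.vstack([
    rng.binomial(2, freq_a, size=(n_per_group, n_variants)),
    rng.binomial(2, freq_b, size=(n_per_group, n_variants)),
]).astype(float)

# The trait has no genetic component: group A gets +1 from its environment.
environment = np.repeat([1.0, 0.0], n_per_group)
phenotype = environment + rng.normal(0.0, 1.0, 2 * n_per_group)


def top_pcs(g, k=1):
    """Top-k principal component scores of the standardized genotypes."""
    standardized = (g - g.mean(axis=0)) / g.std(axis=0)
    u, s, _ = np.linalg.svd(standardized, full_matrices=False)
    return u[:, :k] * s[:k]


def gwas_effects(g, y, covariates=None):
    """Per-variant regression slopes, adjusted for an intercept and any
    covariates by residualizing both the genotypes and the phenotype."""
    design = np.ones((g.shape[0], 1))
    if covariates is not None:
        design = np.column_stack([design, covariates])
    q, _ = np.linalg.qr(design)
    g_res = g - q @ (q.T @ g)
    y_res = y - q @ (q.T @ y)
    return (g_res.T @ y_res) / np.sum(g_res ** 2, axis=0)


def expected_score_gap(effects):
    """Expected mean polygenic score of group A minus that of group B."""
    return float(effects @ (2.0 * (freq_a - freq_b)))


uncorrected_gap = expected_score_gap(gwas_effects(genotypes, phenotype))
corrected_gap = expected_score_gap(
    gwas_effects(genotypes, phenotype, covariates=top_pcs(genotypes, k=1))
)
print(f"uncorrected gap: {uncorrected_gap:.2f}, corrected gap: {corrected_gap:.2f}")
```

In this deliberately easy setting the uncorrected gap should be far from zero while the corrected gap should sit much closer to zero, because a single principal component captures the two-group structure well. The harder cases are those in which the principal components fail to capture the structure that the environment follows.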

Now, in eLife, Arslan Zaidi and Iain Mathieson from the University of Pennsylvania report which PCA models are the most effective at reducing bias in polygenic scores (Zaidi and Mathieson, 2020). To do this, they simulated the genetic data of a single population which had divided into spatially structured sub-groups within the recent past. They then simulated environmental effects on the trait and tested different PCA models to see how well each model controlled for them.

The results showed that the usual approach, known as ‘common PCA’, leads to polygenic scores that inappropriately mirror the environmental effects. Common PCA models infer genetic structure using only variants that appear in more than 5% of individuals in the GWAS sample. These common variants are typically ancient in origin, and therefore do not adequately capture the genetic structure of populations which have been spatially subdivided in the recent past. It is this failure to capture the genetic structure that results in biased polygenic scores.

On the other hand, rare variants, which appear in only a handful of individuals, are typically recent in origin and therefore reflect the history of recent subdivisions. Zaidi and Mathieson show that for this reason, PCA models built using patterns of genetic structure in rare variants (‘rare PCA’) eliminate biases from polygenic scores more effectively than the ‘common PCA’ technique (Figure 1B). However, this approach is not a panacea. When the environmental factors associated with the trait were localized to one geographic place (e.g. pollution localized to a particular city), even the rare PCA approach could not separate genetic effects from environmental biases (Figure 1C).
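The difference between the two PCA variants comes down to which variants feed the decomposition. The sketch below shows one way this selection might look in Python. The 5% cut-off for common variants comes from the description above, but applying it to minor allele frequency is an assumption, and the definition of 'rare' used here (a minor allele seen between two and four times in the sample) is a hypothetical stand-in for "a handful of individuals". The function names and toy data are likewise invented.

```python
import numpy as np


def split_by_frequency(genotypes, common_maf=0.05, rare_max_count=4):
    """Return boolean masks (common, rare) over the variants.

    genotypes: (n_individuals, n_variants) array of 0/1/2 allele counts.
    """
    genotypes = np.asarray(genotypes)
    n_individuals = genotypes.shape[0]
    allele_count = genotypes.sum(axis=0)
    minor_count = np.minimum(allele_count, 2 * n_individuals - allele_count)
    maf = minor_count / (2 * n_individuals)
    common = maf > common_maf
    # Singletons are skipped: they carry no information shared between people.
    rare = (minor_count >= 2) & (minor_count <= rare_max_count)
    return common, rare


def principal_components(genotypes, k=2):
    """Top-k principal component scores from standardized genotypes."""
    g = np.asarray(genotypes, dtype=float)
    centered = g - g.mean(axis=0)
    scale = centered.std(axis=0)
    standardized = centered / np.where(scale > 0, scale, 1.0)
    u, s, _ = np.linalg.svd(standardized, full_matrices=False)
    return u[:, :k] * s[:k]


# Hypothetical toy data: 50 individuals, 4 variants.
toy = np.zeros((50, 4), dtype=int)
toy[:30, 0] = 1   # common variant
toy[:3, 1] = 1    # rare variant (minor allele seen three times)
toy[0, 2] = 1     # singleton, used by neither set
toy[:, 3] = 2
toy[:20, 3] = 1   # common variant

common, rare = split_by_frequency(toy)
common_pcs = principal_components(toy[:, common], k=1)
rare_pcs = principal_components(toy[:, rare], k=1)
```

Either set of components can then be passed to the association model as covariates, and the two sets can be stacked side by side (for example with `np.column_stack`) if both are to be used together.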

Zaidi and Mathieson also explore a more complicated set of simulations which are meant to more accurately mimic the patterns seen in real GWAS datasets, and find that the results are essentially identical to the simplified scenario described above. In all of their simulations, Zaidi and Mathieson know the ground truth, allowing them to experiment with different approaches designed to target the kind of bias they have simulated. In the real world, the ground truth is not known, so it is difficult to have complete confidence that stratification biases have been properly dealt with. Although stratification has long been studied, these findings further demonstrate that separating genetic effects from environmental effects is still not a ‘solved’ problem in genetic studies (Lawson et al., 2020).

Studies that use polygenic scores have exploded in number over the past decade, riding a wave of well-founded optimism that they can open up new, otherwise inaccessible, avenues of research. But care is needed to ensure that this powerful tool is applied appropriately. Ultimately, the possibility of misleading results is an unavoidable risk, especially in research that is restricted to non-experimental settings. Zaidi and Mathieson provide several good recommendations for reducing this risk, and suggest that a combination of the rare and common PCA approaches will minimize the amount by which environmental effects confound GWAS data. Moving forward, their results highlight the need for further statistical methods that more effectively deal with the biases introduced by environmental effects, especially for sharply distributed factors. In addition, more sensitive diagnostics are needed to assess how environmental effects impact polygenic scores.


References

    Wang Y, Blei DM (2019) The blessings of multiple causes. Journal of the American Statistical Association 114:1574–1596.

Article and author information

Author details

  1. Jennifer Blanc

    Jennifer Blanc is in the Department of Human Genetics, University of Chicago, Chicago, United States

    Competing interests
    No competing interests declared
    ORCID: 0000-0001-7569-018X
  2. Jeremy J Berg

    Jeremy J Berg is in the Department of Human Genetics, University of Chicago, Chicago, United States

    Competing interests
    No competing interests declared
    ORCID: 0000-0001-5411-6840


We thank Arjun Biddanda, Xiaoheng Cheng, Graham Coop, Doc Edge and John Novembre for comments on earlier drafts, and Arslan Zaidi and Iain Mathieson for answering questions about their paper.

Publication history

  1. Version of Record published: December 23, 2020 (version 1)


© 2020, Blanc and Berg

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.



Cite this article

Jennifer Blanc, Jeremy J Berg (2020) Polygenic Scores: How well can we separate genetics from the environment? eLife 9:e64948.
