Polygenic Scores: It is time to get real when trying to predict educational performance

A study of 3,500 children in the UK shows that data on socioeconomic background and previous educational achievements can better predict how students will perform at school than genetic data.
Cecile Janssens (corresponding author)
Rollins School of Public Health, Emory University, United States

Interest in using polygenic scores to make predictions is skyrocketing in many areas of life. For example, researchers are exploring the use of these scores to predict the onset of complex diseases, such as cardiovascular disease, diabetes and cancer. It has also been proposed that polygenic scores could be used to predict educational attainment (Lee et al., 2018), and social behaviors such as loneliness (Abdellaoui et al., 2018) and same-sex sexual behavior (Ganna et al., 2019). However, even when the association between a polygenic score and a certain phenotype is statistically significant, this does not guarantee that the polygenic score will have strong predictive power.

Most phenotypes are the result of multiple genetic variations, which are found by screening the genome of populations and identifying which variants appear more frequently in individuals with a specific trait. Polygenic scores are then calculated for each person based on how many of these genetic variations are present in their genome. This score indicates how likely a person is to develop the phenotype of interest.
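The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular study: the variant IDs and effect sizes are hypothetical, and the optional weighting by effect size is an assumption added here (the text above describes a simple count of trait-associated variants).

```python
# Hypothetical per-variant effect sizes (weights). The variant IDs and
# numbers are invented for illustration only.
EFFECT_SIZES = {"rs_A": 0.5, "rs_B": 0.25, "rs_C": -0.25}


def polygenic_score(allele_counts, effect_sizes=None):
    """Return the polygenic score for one person.

    allele_counts maps a variant ID to the number of copies (0, 1 or 2)
    of the trait-associated allele that the person carries.
    With effect_sizes=None every variant has weight 1, so the score is a
    plain count of trait-associated alleles.
    """
    score = 0.0
    for variant, count in allele_counts.items():
        if count not in (0, 1, 2):
            raise ValueError(f"allele count for {variant} must be 0, 1 or 2")
        # Variants without a listed effect size contribute nothing.
        weight = 1.0 if effect_sizes is None else effect_sizes.get(variant, 0.0)
        score += weight * count
    return score


person = {"rs_A": 2, "rs_B": 1, "rs_C": 0}
unweighted = polygenic_score(person)              # 2 + 1 + 0 = 3.0
weighted = polygenic_score(person, EFFECT_SIZES)  # 0.5*2 + 0.25*1 + 0 = 1.25
```

A higher score means the person carries more (or more strongly weighted) trait-associated alleles; on its own it says nothing about how well the score predicts the phenotype.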

Studies using data gathered by the Avon Longitudinal Study of Parents and Children (ALSPAC) in the UK have identified various factors that can predict the educational performance of individual students, including cannabis and tobacco use, and month of birth (Wright et al., 2018; Odd et al., 2016; Stiby et al., 2015). However, it is unclear whether polygenic scores can predict student performance better than other information that is easier to obtain.

Now, in eLife, Tim Morris, Neil Davies and George Davey Smith from the University of Bristol report the results of a study in which they explored whether polygenic scores could be used to predict the educational performance of 3,500 children from the ALSPAC cohort who were born in the early 1990s (Morris et al., 2020). The educational achievement of each student was determined by averaging test scores from national exams taken at 7 and 16 years of age. The team then compared these exam scores against three sets of predictors: polygenic scores, other characteristics available to the school (such as age, sex, and Free School Meal status), and the education and socioeconomic position of the children’s parents.

Morris et al. found that although polygenic scores display some degree of predictive power, socioeconomic factors, such as parental education, are better predictors of how well a child will perform in school. Moreover, earlier educational achievements were found to be the best indicator of educational performance: for example, the results of tests sat at age 14 can predict how well students will perform in tests at age 16. Therefore, polygenic scores are better at predicting earlier performances in school than later academic successes. However, the power of this prediction is still weaker than that of other, more easily measured factors.

These differences in predictive performance are similar to what is seen in complex diseases: polygenic scores on their own are poor predictors and only minimally improve predictions made on the basis of other (readily available) data. Furthermore, just as early school grades predict later grades, early symptoms of a disease are an excellent indicator for how severe the condition may become (Meigs et al., 2008). This suggests that if major risk factors develop and influence the phenotype over time, predictions made before the emergence of these risk factors will be less informative.
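One common way to quantify how much a polygenic score adds to readily available data is the gain in explained variance (R²) when the score is added to a baseline model. The sketch below assumes ordinary least squares regression and uses invented numbers for eight hypothetical students; it is not the analysis or the data of Morris et al.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def r_squared(predictors, outcome):
    """R^2 of a least squares fit with an intercept.

    predictors is a list of columns, one list of values per predictor.
    """
    n = len(outcome)
    rows = [[1.0] + [col[i] for col in predictors] for i in range(n)]
    k = len(rows[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(row[i] * row[j] for row in rows) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(rows, outcome)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in rows]
    mean_y = sum(outcome) / n
    ss_res = sum((y - f) ** 2 for y, f in zip(outcome, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in outcome)
    return 1.0 - ss_res / ss_tot


# Invented illustration data (not from any study).
prior_score = [45, 52, 60, 61, 70, 74, 80, 88]            # earlier test result
pgs = [-0.3, 0.4, -0.1, 0.6, 0.0, -0.5, 0.8, 0.2]         # polygenic score
later_score = [48, 55, 58, 66, 69, 70, 84, 87]            # later test result

r2_pgs = r_squared([pgs], later_score)                    # score on its own
r2_base = r_squared([prior_score], later_score)           # baseline model
r2_full = r_squared([prior_score, pgs], later_score)      # baseline + score
incremental_r2 = r2_full - r2_base                        # what the score adds
```

Adding a predictor can never lower the in-sample R², so the relevant question is whether `incremental_r2` is large enough to matter for the intended use, ideally checked in data that were not used to fit the model.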

Polygenic scores are always created using variables that we know are associated with the phenotype of interest, so they will always have some predictive power. Therefore, what we really want to know is whether this predictive power is high enough to be useful for practical applications. To answer this question, we need to know more about how the polygenic scores are intended to be used (Martens and Janssens, 2019).

Other studies on factors that influence the educational performance of the ALSPAC cohort did not use averaged test scores as a read-out of academic success. Instead they focused on how different factors predict the likelihood that a student would drop out of school, or finish secondary school with fewer than five C+ grades – the minimum requirement for most education and training courses after age 16.

If the aim of education policies is to get students to finish school with five or more C+ grades, then it is important to identify which students are most likely not to achieve this goal. These children can then be offered more teaching and a greater level of support. Knowing when these interventions should be introduced will inform at what age the educational performance of a student needs to be predicted, and which predictors are already available. Therefore, if polygenic scores are going to inform education policy, it is important that future prediction studies are designed with the intended use in mind.
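When the intended use is to flag students for extra support, a prediction can be judged by how many of the students who later miss the target it catches, and how many it flags unnecessarily. The sketch below assumes sensitivity and specificity as the evaluation measures, which is one common choice rather than something prescribed by the studies discussed here; the predicted scores, outcomes and cutoff are invented.

```python
def flag_students(predicted_scores, cutoff):
    """Flag every student whose predicted score falls below the cutoff."""
    return [score < cutoff for score in predicted_scores]


def sensitivity_specificity(flagged, missed_target):
    """Return (sensitivity, specificity) of the flags.

    flagged[i] is True if student i was flagged for support.
    missed_target[i] is True if student i later missed the target
    (for example, finished with fewer than five C+ grades).
    """
    tp = sum(1 for f, m in zip(flagged, missed_target) if f and m)
    fn = sum(1 for f, m in zip(flagged, missed_target) if not f and m)
    tn = sum(1 for f, m in zip(flagged, missed_target) if not f and not m)
    fp = sum(1 for f, m in zip(flagged, missed_target) if f and not m)
    return tp / (tp + fn), tn / (tn + fp)


# Invented illustration data for eight hypothetical students.
predicted = [40, 45, 50, 55, 60, 65, 70, 75]
missed = [True, True, False, True, False, False, False, False]

flags = flag_students(predicted, cutoff=52)   # flags the first three students
sens, spec = sensitivity_specificity(flags, missed)
# sens = 2/3: two of the three students who missed the target were flagged
# spec = 4/5: four of the five students who met the target were not flagged
```

Moving the cutoff trades one measure against the other, which is why the acceptable balance depends on the cost and capacity of the intervention.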


Article and author information

Author details

  1. Cecile Janssens

    Cecile Janssens is in the Rollins School of Public Health, Emory University, Atlanta, United States

    Competing interests: No competing interests declared
    ORCID: 0000-0002-6153-4976

Publication history

  1. Version of Record published: March 13, 2020 (version 1)


© 2020, Janssens

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.



Cecile Janssens (2020) Polygenic Scores: It is time to get real when trying to predict educational performance. eLife 9:e55720.