
Polygenic Scores: It is time to get real when trying to predict educational performance

  1. Cecile Janssens (corresponding author)
  1. Rollins School of Public Health, Emory University, United States
Cite this article as: eLife 2020;9:e55720 doi: 10.7554/eLife.55720


A study of 3,500 children in the UK shows that data on socioeconomic background and previous educational achievements can better predict how students will perform at school than genetic data.

Main text

Interest in using polygenic scores to make predictions is skyrocketing in many areas of life. For example, researchers are exploring the use of these scores to predict the onset of complex diseases, such as cardiovascular disease, diabetes and cancer. It has also been proposed that polygenic scores could be used to predict educational attainment (Lee et al., 2018), and social behaviors such as loneliness (Abdellaoui et al., 2018) and same-sex sexual behavior (Ganna et al., 2019). However, even when the association between a polygenic score and a phenotype is statistically significant, this does not guarantee that the score will have strong predictive power.

Most phenotypes are the result of multiple genetic variations, which are found by screening the genome of populations and identifying which variants appear more frequently in individuals with a specific trait. Polygenic scores are then calculated for each person based on how many of these genetic variations are present in their genome. This score indicates how likely a person is to develop the phenotype of interest.
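As a rough illustration of the calculation described above, the sketch below sums the number of trait-associated alleles a person carries. The variant names, genotypes and weights are hypothetical and chosen only for the example. The optional weighting by effect size is an assumption added here to reflect common practice; it is not a detail taken from this article.

```python
def polygenic_score(genotype, weights=None):
    """Return a simple polygenic score for one person.

    genotype: dict mapping a variant ID to the number of copies (0, 1 or 2)
              of the trait-associated allele the person carries.
    weights:  optional dict mapping a variant ID to an effect size.
              If omitted, every variant counts equally (a plain allele count).
    """
    total = 0.0
    for variant, count in genotype.items():
        if count not in (0, 1, 2):
            raise ValueError(f"allele count for {variant} must be 0, 1 or 2")
        # Variants without a listed weight contribute nothing to a weighted score.
        weight = 1.0 if weights is None else weights.get(variant, 0.0)
        total += weight * count
    return total


# Hypothetical person and hypothetical effect sizes, for illustration only.
person = {"variant_a": 2, "variant_b": 0, "variant_c": 1}
effect_sizes = {"variant_a": 0.5, "variant_b": 0.25, "variant_c": 0.25}

unweighted = polygenic_score(person)               # 2 + 0 + 1 = 3.0
weighted = polygenic_score(person, effect_sizes)   # 0.5*2 + 0.25*0 + 0.25*1 = 1.25
print(unweighted, weighted)
```

A higher score means the person carries more of the associated alleles. How well that score predicts the phenotype is a separate question.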

Studies using data gathered by the Avon Longitudinal Study of Parents and Children (ALSPAC) in the UK have identified various factors that can predict the educational performance of individual students, including cannabis and tobacco use, and month of birth (Wright et al., 2018; Odd et al., 2016; Stiby et al., 2015). However, it is unclear whether polygenic scores can predict student performance better than other information that is easier to obtain.

Now, in eLife, Tim Morris, Neil Davies and George Davey Smith from the University of Bristol report the results of a study in which they explored whether polygenic scores could be used to predict the educational performance of 3,500 children from the ALSPAC cohort who were born in the early 1990s (Morris et al., 2020). The educational achievement of each student was determined by averaging test scores from national exams taken at 7 and 16 years of age. The team then compared these exam scores against polygenic scores, other characteristics available to the school (such as age, sex, and Free School Meal status), and the education and socioeconomic position of the children’s parents.

Morris et al. found that although polygenic scores display some degree of predictive power, socioeconomic factors, such as parent education, are better predictors of how well a child will perform in school. Moreover, earlier educational achievements were found to be the best indicator of educational performance: for example, the results of tests sat at age 14 can predict how well students will perform in tests at age 16. Therefore, polygenic scores are better at predicting earlier performances in school than later academic successes. However, the power of this prediction is still weaker than that of other, more easily measurable factors.

These differences in predictive performance are similar to what is seen in complex diseases: polygenic scores on their own are poor predictors and only minimally improve predictions made on the basis of other (readily available) data. Furthermore, just as early school grades predict later grades, early symptoms of a disease are an excellent indicator for how severe the condition may become (Meigs et al., 2008). This suggests that if major risk factors develop and influence the phenotype over time, predictions made before the emergence of these risk factors will be less informative.
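One way to make "only minimally improve predictions" concrete is to compare the variance explained (R²) by a readily available predictor alone with the variance explained once a polygenic score is added. The sketch below is a minimal illustration of that comparison, not the method used by Morris et al. The six data points are invented toy values, and all function and variable names are hypothetical. It uses the standard formula for R² with two predictors, expressed through pairwise Pearson correlations.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def r2_one(x, y):
    """R² of a linear regression of y on a single predictor x."""
    return pearson(x, y) ** 2


def r2_two(x1, x2, y):
    """R² of a linear regression of y on two predictors x1 and x2.

    Assumes x1 and x2 are not perfectly correlated.
    """
    r1, r2, r12 = pearson(x1, y), pearson(x2, y), pearson(x1, x2)
    return (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2)


# Invented toy values for six hypothetical students (not real data).
prior_score = [1, 2, 3, 4, 5, 6]   # earlier test result
pgs = [2, 1, 4, 3, 6, 5]           # polygenic score
later_score = [1, 3, 2, 5, 4, 6]   # later test result

r2_prior = r2_one(prior_score, later_score)
r2_pgs = r2_one(pgs, later_score)
r2_both = r2_two(prior_score, pgs, later_score)

# The quantity of interest is the increment, not r2_pgs on its own.
increment = r2_both - r2_prior
print(f"prior only: {r2_prior:.3f}, PGS only: {r2_pgs:.3f}, "
      f"both: {r2_both:.3f}, added by PGS: {increment:.3f}")
```

Adding a predictor can never lower R² in the sample used to fit the model, so the increment is always non-negative there. Whether it is large enough to matter depends on the intended use of the prediction.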

Polygenic scores are always created using variables that we know are associated with the phenotype of interest, so they will always have some predictive power. Therefore, what we really want to know is whether this predictive power is high enough to be useful for practical applications. And to answer this question we need to know more about how the polygenic scores are intended to be used (Martens and Janssens, 2019).

Other studies on factors that influence the educational performance of the ALSPAC cohort did not use averaged test scores as a read-out of academic success. Instead they focused on how different factors predict the likelihood that a student would drop out of school, or finish secondary school with fewer than five C+ grades – the minimum requirement for most education and training courses after age 16.

If the aim of education policies is to get students to finish school with five or more C+ grades, then it is important to identify which students are most likely not to achieve this goal. These children can then be offered more teaching and a greater level of support. Knowing when these interventions should be introduced will inform at what age the educational performance of a student needs to be predicted, and which predictors are already available. Therefore, if polygenic scores are going to inform education policy, it is important that future prediction studies are designed with the intended use in mind.


Article and author information

Author details

  1. Cecile Janssens

    Cecile Janssens is in the Rollins School of Public Health, Emory University, Atlanta, United States

    Competing interests: No competing interests declared
    ORCID: 0000-0002-6153-4976

Publication history

  1. Version of Record published: March 13, 2020 (version 1)


© 2020, Janssens

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

