Variable prediction accuracy of polygenic scores within an ancestry group

Abstract

Fields as diverse as human genetics and sociology are increasingly using polygenic scores based on genome-wide association studies (GWAS) for phenotypic prediction. However, recent work has shown that polygenic scores have limited portability across groups of different genetic ancestries, restricting the contexts in which they can be used reliably and potentially creating serious inequities in future clinical applications. Using UK Biobank data, we demonstrate that even within a single ancestry group (i.e., when there are negligible differences in linkage disequilibrium or in causal allele frequencies), the prediction accuracy of polygenic scores can depend on characteristics such as the socio-economic status, age or sex of the individuals in whom the GWAS and the prediction were conducted, as well as on the GWAS design. Our findings highlight both the complexities of interpreting polygenic scores and underappreciated obstacles to their broad use.

Data availability

The GWAS summary statistics generated in this study have been uploaded to Dryad.

The following data sets were generated:

(2019) Variable prediction accuracy of polygenic scores within an ancestry group
Dryad Digital Repository, 10.5061/dryad.66t1g1jxs.

https://doi.org/10.5061/dryad.66t1g1jxs

Article and author information

Author details

Hakhamanesh Mostafavi

Department of Biological Sciences, Columbia University, New York, United States

For correspondence
hsm2137@columbia.edu

Competing interests
No competing interests declared.

ORCID iD: 0000-0002-1060-2844
Arbel Harpak

Department of Biological Sciences, Columbia University, New York, United States

For correspondence
ah3586@columbia.edu

Competing interests
No competing interests declared.

ORCID iD: 0000-0002-3655-748X
Ipsita Agarwal

Department of Biological Sciences, Columbia University, New York, United States

Competing interests
No competing interests declared.
Dalton Conley

Department of Sociology, Princeton University, Princeton, United States

Competing interests
No competing interests declared.
Jonathan K Pritchard

Department of Genetics, Stanford University, Stanford, United States

Competing interests
No competing interests declared.

ORCID iD: 0000-0002-8828-5236
Molly Przeworski

Department of Systems Biology, Columbia University, New York, United States

For correspondence
mp3284@columbia.edu

Competing interests
Molly Przeworski, Reviewing editor, eLife.

ORCID iD: 0000-0002-5369-9009

Funding

National Institute of General Medical Sciences (GM121372)

Molly Przeworski

National Human Genome Research Institute (HG008140)

Jonathan K Pritchard

Robert Wood Johnson Foundation (84337817)

Dalton Conley

Simons Foundation (633313)

Arbel Harpak

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Ethics

Human subjects: This study was conducted using the UK Biobank resource under application number 11138, as approved by the Columbia University Institutional Review Board, protocol AAAS2914.

Copyright

This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.