Variable prediction accuracy of polygenic scores within an ancestry group
Abstract
Fields as diverse as human genetics and sociology are increasingly using polygenic scores based on genome-wide association studies (GWAS) for phenotypic prediction. However, recent work has shown that polygenic scores have limited portability across groups of different genetic ancestries, restricting the contexts in which they can be used reliably and potentially creating serious inequities in future clinical applications. Using the UK Biobank data, we demonstrate that even within a single ancestry group (i.e., when there are negligible differences in linkage disequilibrium or in causal allele frequencies), the prediction accuracy of polygenic scores can depend on characteristics such as the socio-economic status, age or sex of the individuals in whom the GWAS and the prediction were conducted, as well as on the GWAS design. Our findings highlight both the complexities of interpreting polygenic scores and underappreciated obstacles to their broad use.
Data availability
The GWAS summary statistics generated in this study have been uploaded to Dryad.
- Variable prediction accuracy of polygenic scores within an ancestry group. Dryad Digital Repository, 10.5061/dryad.66t1g1jxs.
Article and author information
Author details
Funding
National Institute of General Medical Sciences (GM121372)
- Molly Przeworski
National Human Genome Research Institute (HG008140)
- Jonathan K Pritchard
Robert Wood Johnson Foundation (84337817)
- Dalton Conley
Simons Foundation (633313)
- Arbel Harpak
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Ethics
Human subjects: This study was conducted using the UK Biobank resource under application number 11138, as approved by the Columbia University Institutional Review Board, protocol AAAS2914.
Copyright
© 2020, Mostafavi et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.