Population stratification continues to bias the results of genome-wide association studies (GWAS). When these results are used to construct polygenic scores, even subtle biases can cumulatively lead to large errors. To study the effect of residual stratification, we simulated GWAS under realistic models of demographic history. We show that when population structure is recent, it cannot be corrected using principal components of common variants because they are uninformative about recent history. Consequently, polygenic scores are biased in that they recapitulate environmental structure. Principal components calculated from rare variants or identity-by-descent segments can correct this stratification for some types of environmental effects. While family-based studies are immune to stratification, the hybrid approach of ascertaining variants in GWAS but re-estimating effect sizes in siblings reduces but does not eliminate stratification. We show that the effect of population stratification depends not only on allele frequencies and environmental structure but also on demographic history.
- Iain Mathieson
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
- George H Perry, Pennsylvania State University, United States
- Received: July 29, 2020
- Accepted: November 16, 2020
- Accepted Manuscript published: November 17, 2020 (version 1)
© 2020, Zaidi & Mathieson
This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.
Older age is a strong shared risk factor for many chronic diseases, and there is increasing interest in identifying aging biomarkers. Here, a proteomic analysis of 1301 plasma proteins was conducted in 997 individuals between 21 and 102 years of age. We identified 651 proteins associated with age (506 over-represented and 145 under-represented with age). Mediation analysis suggested a role for partial cis-epigenetic control of protein expression with age. Of the age-associated proteins, 33.5% and 45.3% were associated with mortality and multimorbidity, respectively. There was enrichment of proteins associated with inflammation and extracellular matrix as well as senescence-associated secretory proteins. A 76-protein proteomic age signature predicted accumulation of chronic diseases and all-cause mortality. These data support the premise of proteomic biomarkers to monitor aging trajectories and to identify individuals at higher risk for disease to be targeted for in-depth diagnostic procedures and early interventions.
Human ascariasis is a major neglected tropical disease caused by the nematode Ascaris lumbricoides. We report a 296 megabase (Mb) reference-quality genome comprised of 17,902 protein-coding genes derived from a single, representative Ascaris worm. An additional 68 worms were collected from 60 human hosts in Kenyan villages where pig husbandry is rare. Notably, the majority of these worms (63/68) possessed mitochondrial genomes that clustered closer to the pig parasite Ascaris suum than to A. lumbricoides. Comparative phylogenomic analyses identified over 11 million nuclear-encoded SNPs but just two distinct genetic types that had recombined across the genomes analyzed. The nuclear genomes had extensive heterozygosity, and all samples existed as genetic mosaics with either A. suum-like or A. lumbricoides-like inheritance patterns supporting a highly interbred Ascaris species genetic complex. As no barriers appear to exist for anthroponotic transmission of these ‘hybrid’ worms, a one-health approach to control the spread of human ascariasis will be necessary.