Gene-centric functional dissection of human genetic variation uncovers regulators of hematopoiesis
Abstract
Genome-wide association studies (GWAS) have identified thousands of variants associated with human diseases and traits. However, the majority of GWAS-implicated variants are in non-coding regions of the genome and require in-depth follow-up to identify target genes and decipher biological mechanisms. Here, rather than focusing on causal variants, we have undertaken a pooled loss-of-function screen in primary hematopoietic cells to interrogate 389 candidate genes contained in 75 loci associated with red blood cell traits. Using this approach, we identify 77 genes at 38 GWAS loci, with most loci harboring 1-2 candidate genes. Importantly, the hit set was strongly enriched for genes validated through orthogonal genetic approaches. Genes identified by this approach are enriched in specific and relevant biological pathways, allowing regulators of human erythropoiesis and modifiers of blood diseases to be defined. More generally, this functional screen provides a paradigm for gene-centric follow-up of GWAS for a variety of human diseases and traits.
Data availability
- 1000 Genomes human variation dataset. The 1000 Genomes Project Consortium (2015). Recombinant hotspots accessed at: ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/pilot_data/technical/reference/ ; Phase 1 data (for PLINK) accessed at: https://www.cog-genomics.org/plink/1.9/resources ; Phase 3 data accessed at: http://www.internationalgenome.org/category/phase-3/
- Pooled screen abundance data for shRNA targeting red blood cell trait GWAS-nominated genes during the course of in vitro differentiation of human CD34+ cells. SK Nandakumar, SK McFarland, et al. (2019). Available on the project's companion GitHub repository: https://github.com/sankaranlab/shRNA_screen/tree/master/ref/shref.csv
- Effects of shRNA knockdown of SF3A2 on splicing during human erythropoiesis. SK Nandakumar, SK McFarland, et al. (2019). ID GSE129603. In the public domain at GEO: https://www.ncbi.nlm.nih.gov/geo/
- Effects of SF3B1 mutants on splicing in human erythropoiesis. EA Obeng et al. (2016). ID GSE85712. In the public domain at GEO: https://www.ncbi.nlm.nih.gov/geo/
- SNP sets identified by GWAS of LDL, HDL, and triglyceride traits. CJ Willer et al. (2013). Accessed at: http://csg.sph.umich.edu/willer/public/lipids2013/
- Human hematopoietic lineage gene expression. MR Corces et al. (2016). ID GSE74912. In the public domain at GEO: https://www.ncbi.nlm.nih.gov/geo/
- Human adult and fetal erythropoiesis gene expression. H Yan et al. (2018). ID GSE107218. In the public domain at GEO: https://www.ncbi.nlm.nih.gov/geo/
Article and author information
Funding
National Institutes of Health (R33HL120791)
- Vijay G Sankaran
New York Stem Cell Foundation
- Vijay G Sankaran
National Institutes of Health (R01DK103794)
- Vijay G Sankaran
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Ethics
Animal experimentation: No human subjects were involved in the study. Human CD34+ HSPCs used in these experiments are deidentified and obtained from external sources. All mouse experiments were performed in full compliance with the approved Institutional Animal Care and Use Committee (IACUC) protocols at Boston Children's Hospital (Protocol # 18-05-3680R) and Brigham and Women's Hospital (Protocol # 2017N000060). These studies were approved by local regulatory committees in accordance with the highest ethical standards for biomedical research involving vertebrate animals.
Copyright
© 2019, Nandakumar et al.
This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.
Metrics
- 2,861 views
- 435 downloads
- 14 citations
Views, downloads and citations are aggregated across all versions of this paper published by eLife.