Gene-centric functional dissection of human genetic variation uncovers regulators of hematopoiesis

  1. Satish K Nandakumar
  2. Sean K McFarland
  3. Laura M Mateyka
  4. Caleb A Lareau
  5. Jacob C Ulirsch
  6. Leif S Ludwig
  7. Gaurav Agarwal
  8. Jesse M Engreitz
  9. Bartlomiej Przychodzen
  10. Marie McConkey
  11. Glenn S Cowley
  12. John G Doench
  13. Jaroslaw P Maciejewski
  14. Benjamin L Ebert
  15. David E Root
  16. Vijay G Sankaran (corresponding author)
  1. Harvard Medical School, United States
  2. Broad Institute of MIT and Harvard, United States
  3. Ruprecht-Karls-University Heidelberg, Germany
  4. University of Oxford, United Kingdom
  5. Harvard Stem Cell Institute, United States
  6. Harvard University, United States
  7. Cleveland Clinic, United States
  8. Brigham and Women’s Hospital, United States
  9. Dana-Farber Cancer Institute, United States
  10. Howard Hughes Medical Institute, United States
6 figures, 1 table and 1 additional file

Figures

Figure 1 with 3 supplements
Design and Execution of an shRNA Screen Using Blood Cell Trait GWAS Hits to Identify Genetic Actors in Erythropoiesis.

(A) Overview of shRNA library design. The 75 loci associated with red blood cell traits (van der Harst et al., 2012) were used as the basis to calculate 75 genomic windows of LD 0.8 or greater from the sentinel SNP. Genes with a start site within 110 kb or end site within 40 kb of the LD-defined genomic windows were chosen as candidates to target in the screen. (B) Compositional makeup of the library, depicted as number of genes and number of hairpins for each of the four included subcategories: GWAS-nominated genes, erythroid genes, essential genes, and negative control genes (Figure 1—source data 2). (C) Primary CD34+ hematopoietic stem and progenitor cells (HSPCs) isolated from three independent donors were cultured for a period of 16 days in erythroid differentiation conditions. At day 2, cells were infected with the shRNA library, and the abundance of each shRNA was measured at days 4, 6, 9, 12, 14, and 16 using deep sequencing.

https://doi.org/10.7554/eLife.44080.002
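
The gene selection rule in panel (A) can be illustrated with a minimal sketch. It assumes that "within" means the distance from a gene's start or end site to the nearest edge of the LD-defined window (zero if the site lies inside the window); the caption does not spell this out. The `Gene` and `Window` types, the gene names, and the coordinates are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass

START_PAD = 110_000  # bp; start-site threshold stated in the caption
END_PAD = 40_000     # bp; end-site threshold stated in the caption


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    start: int  # start site (hypothetical coordinates)
    end: int    # end site


@dataclass(frozen=True)
class Window:
    locus: str
    chrom: str
    left: int
    right: int


def distance_to_window(pos, window):
    """Distance in bp from a position to the window; 0 if inside (assumed definition)."""
    if pos < window.left:
        return window.left - pos
    if pos > window.right:
        return pos - window.right
    return 0


def candidate_genes(genes, window):
    """Return names of genes whose start or end site falls near the window."""
    selected = []
    for g in genes:
        if g.chrom != window.chrom:
            continue
        near_start = distance_to_window(g.start, window) <= START_PAD
        near_end = distance_to_window(g.end, window) <= END_PAD
        if near_start or near_end:
            selected.append(g.name)
    return selected


# Toy data: one window and four made-up genes.
window = Window("locus_1", "chr1", 1_000_000, 1_050_000)
genes = [
    Gene("GENE_A", "chr1", 900_000, 940_000),      # start site 100 kb from window
    Gene("GENE_B", "chr1", 700_000, 800_000),      # too far on both criteria
    Gene("GENE_C", "chr1", 1_200_000, 1_080_000),  # end site 30 kb from window
    Gene("GENE_D", "chr2", 1_010_000, 1_020_000),  # different chromosome
]
print(candidate_genes(genes, window))
```

In this toy case GENE_A qualifies through its start site and GENE_C through its end site, which shows why the two thresholds are checked independently.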
Figure 1—source data 1

Table containing annotations and information for the 75 SNPs used to seed the shRNA library.

https://doi.org/10.7554/eLife.44080.006
Figure 1—source data 2

Table containing annotations and information for all hairpins, as well as shRNA counts for each time point and replicate.

https://doi.org/10.7554/eLife.44080.007
Figure 1—figure supplement 1
Characteristics of GWAS Loci and Gene Selection for Pooled Screen.

(A) Counts of loci from among the original 75 annotated with linkage to each of the six RBC traits: hemoglobin (Hb), mean corpuscular hemoglobin (MCH), mean corpuscular hemoglobin concentration (MCHC), mean corpuscular volume (MCV), packed cell volume (PCV), and red blood cell count (RBC). Some loci were associated with multiple traits. Detailed information on each locus is available in Figure 1—source data 1. (B) Kernel density plot showing the log10 sizes in bp of the LD-defined genomic windows used to find overlapping genes. (C) Histogram showing the distribution of the number of genes selected using the LD window method at each locus. A median of 4 genes was present at each locus.

https://doi.org/10.7554/eLife.44080.003
Figure 1—figure supplement 2
Feasibility of Loss of Function Approaches to Perform Pooled Screens in Primary Hematopoietic Stem and Progenitor Cells (HSPCs).

(A) Schematic of the loss of function lentiviral constructs tested for pooled screens in primary CD34+ cells. (B) FACS plots showing the proportion of infected GFP+ cells 4 days after transduction with the respective constructs (MOI, multiplicity of infection). FACS analysis was performed on independent analyzers. (C) Efficient silencing of Duffy surface antigen in primary CD34+ derived erythroid cells by targeting the promoter region using CRISPRi compared to CRISPR constructs.

https://doi.org/10.7554/eLife.44080.004
Figure 1—figure supplement 3
Pooled shRNA screen in primary HSPCs undergoing erythroid differentiation.

(A) Histogram showing the distribution of the number of independent hairpins included in the library to target each of the candidate genes. (B) Representative FACS plots of erythroid cell surface markers CD71 (transferrin receptor) and CD235a (Glycophorin A) expression at the time points during erythroid differentiation at which deep sequencing of shRNAs was performed. Percentages in each quadrant are represented as mean and standard deviation of 3 experiments from independent donors that were uninfected (Mock) or infected with the shRNA library (Pool).

https://doi.org/10.7554/eLife.44080.005
Figure 2 with 2 supplements
Summary Characterization of shRNA Screen Outcomes.

(A) Kernel density plot showing library representation as log2 shRNA CPM across all hairpins. (B) shRNA abundance log2 fold changes from day 4 to day 16. Represented values are the mean of hairpin abundance log2 fold changes across hairpins for each gene and two standard deviations. (C) Kernel density plots representing the day 4 to day 16 log2 fold changes of hairpin abundances for each of the subcategories of the library, including GWAS-nominated genes, known erythroid essential genes, essential genes to cell viability, and orthogonal genes serving as negative controls. (D) Violin plot of day 4 and day 16 log2 CPM for known actors GATA1 and RPS19 and negative controls LacZ and luciferase. (E) Log2 hairpin counts averaged for known actors GATA1 and RPS19 as well as negative controls LacZ and luciferase across the course of the experiment. Gray lines depict the universe of all other gene traces in the library for context.

https://doi.org/10.7554/eLife.44080.008
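
The quantities plotted in panels (A) and (B), log2 counts per million (CPM) per hairpin and the per-gene mean log2 fold change with two standard deviations, can be sketched as follows. This is an illustrative calculation only: the pseudocount of 1, the use of the sample standard deviation, and all hairpin names and counts are assumptions, not values or choices documented in the captions.

```python
import math
from statistics import mean, stdev


def log2_cpm(counts, pseudocount=1.0):
    """log2 counts per million for one sample; the pseudocount is an assumption."""
    total = sum(counts.values())
    return {h: math.log2(c / total * 1e6 + pseudocount) for h, c in counts.items()}


def gene_log2_fold_changes(day4, day16, hairpin_to_gene):
    """Per gene: (mean log2 fold change across hairpins, two standard deviations)."""
    early = log2_cpm(day4)
    late = log2_cpm(day16)
    per_gene = {}
    for hairpin, gene in hairpin_to_gene.items():
        per_gene.setdefault(gene, []).append(late[hairpin] - early[hairpin])
    summary = {}
    for gene, lfcs in per_gene.items():
        sd = stdev(lfcs) if len(lfcs) > 1 else 0.0
        summary[gene] = (mean(lfcs), 2 * sd)
    return summary


# Toy counts for four hypothetical hairpins targeting two hypothetical genes.
hairpin_to_gene = {"g1_sh1": "GENE1", "g1_sh2": "GENE1",
                   "ctl_sh1": "CTRL", "ctl_sh2": "CTRL"}
day4 = {"g1_sh1": 100, "g1_sh2": 100, "ctl_sh1": 100, "ctl_sh2": 100}
day16 = {"g1_sh1": 25, "g1_sh2": 50, "ctl_sh1": 160, "ctl_sh2": 165}

summary = gene_log2_fold_changes(day4, day16, hairpin_to_gene)
for gene, (mean_lfc, two_sd) in summary.items():
    print(f"{gene}: mean log2FC = {mean_lfc:.2f}, 2 SD = {two_sd:.2f}")
```

Because CPM is a relative measure, hairpins that are not depleted show a positive fold change in this toy example when others drop out, which is one reason negative controls are useful as a reference.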
Figure 2—figure supplement 1
shRNA abundance log2 fold changes from day four to each of the other time points.

Represented values are the mean of hairpin abundance log2 fold changes across hairpins for each gene and two standard deviations.

https://doi.org/10.7554/eLife.44080.009
Figure 2—figure supplement 2
Scatter plots showing agreement of replicate observations across independent CD34+ donor populations.
https://doi.org/10.7554/eLife.44080.010
Figure 3 with 3 supplements
Statistical Modeling of Gene Effect Accounting for Off-target shRNA Confounders.

(A) Bar graph showing the 38 of 75 loci in the screen with at least one corresponding statistically significant (FDR < 0.1, β > 0.1) gene effect causing either a positive or negative log2 fold change in shRNA abundance. Statistical model output for each gene in the screen is available in Figure 3—source data 1. (B) Kernel density plot showing the expected distributions of K562 essentiality scores using permuted gene hit sets from the library. (C) Hairpin rank sums for permuted sets of 5 genes. The red line indicates the enriched rank sums for 5 ‘gold standard’ genes included in the library, CCND3, SH2B3, MYB, KIT, and RBM38, for each of which a genetic basis of action has already been established. (D) Permuted distribution of % inclusion of predicted coding variants among the set of identified hits. (E) Heat map depicting strength of expression (as z scores within each gene) for each of the 77 identified hit genes across hematopoietic lineages (top) and throughout the specific stages of adult erythropoiesis (bottom). Purple boxes highlight the cell types that were enriched for expression of hit genes. (F) Calculated enrichment of the identified hit genes for expression across hematopoietic lineages (top) and throughout the specific stages of adult erythropoiesis (bottom). In both cases, cellular states corresponding to those along the erythropoietic lineage had elevated probability of expressing genes from the hit set as compared to other genes from the library.

https://doi.org/10.7554/eLife.44080.011
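
The permutation logic behind panel (C) can be sketched in a few lines: compare the rank sum of a fixed gene set with the rank sums of randomly drawn sets of the same size. This is a generic illustration, not the authors' code. It assumes one rank per gene with rank 1 as the strongest effect (so smaller sums mean stronger enrichment), and the gene names and ranks are made up.

```python
import random


def rank_sum(genes, gene_rank):
    """Sum of ranks for a set of genes."""
    return sum(gene_rank[g] for g in genes)


def permutation_p_value(test_set, gene_rank, n_permutations=2000, seed=0):
    """Empirical one-sided p-value that a random set of equal size has a
    rank sum at least as small as the observed one."""
    rng = random.Random(seed)
    universe = sorted(gene_rank)
    observed = rank_sum(test_set, gene_rank)
    k = len(test_set)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        if rank_sum(rng.sample(universe, k), gene_rank) <= observed:
            at_least_as_extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (at_least_as_extreme + 1) / (n_permutations + 1)


# Toy library of 100 hypothetical genes ranked 1..100; the test set holds the top 5.
gene_rank = {f"G{i:03d}": i for i in range(1, 101)}
gold_standard = ["G001", "G002", "G003", "G004", "G005"]

observed, p_value = permutation_p_value(gold_standard, gene_rank)
print(observed, p_value)
```

The histogram of permuted rank sums plays the role of the null distribution, and the observed value corresponds to the red line described in the caption.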
Figure 3—source data 1

Table containing the R model output for each gene.

https://doi.org/10.7554/eLife.44080.015
Figure 3—figure supplement 1
Additional Characterization of Modeling Outcomes.

(A) Histogram showing the number of gene hits identified at each of the 40 loci with at least one significant gene effect detected. Statistical model output for each gene in the screen is available in Figure 3—source data 1. (B) Bar graph showing the number of gene hits identified for each of the six red blood cell traits used in the original GWAS to identify the studied loci. (C) Density-normalized histogram showing the Pearson correlation of hairpin measurements for both genes nominated as hits and genes not nominated as hits.

https://doi.org/10.7554/eLife.44080.012
Figure 3—figure supplement 2
K562 Essentiality Scores Comparing Hit Genes vs. Genes Implicated by Other Traits.

(A) Permuted enrichment of essentiality among the set of hit genes vs. randomly chosen sets of genes from the human genome. (B) Permuted enrichment of essentiality among the set of hit genes vs. genes implicated by a separate GWAS for LDL cholesterol levels. (C) Permuted enrichment of essentiality among the set of hit genes vs. genes implicated by a separate GWAS for HDL cholesterol levels. (D) Permuted enrichment of essentiality among the set of hit genes vs. genes implicated by a separate GWAS for blood triglyceride levels.

https://doi.org/10.7554/eLife.44080.013
Figure 3—figure supplement 3
Heat map depicting strength of expression (as z scores within each gene) for each of the 77 identified hit genes throughout the specific stages of fetal erythropoiesis.

Purple boxes highlight the cell types that were enriched for expression of hit genes.

https://doi.org/10.7554/eLife.44080.014
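
The "z scores within each gene" used in these heat maps amount to standardizing each gene's expression values across cell types or stages. A minimal sketch follows, assuming the population standard deviation and treating genes with constant expression as all zeros; both choices and the toy values are assumptions, not details given in the captions.

```python
from statistics import mean, pstdev


def row_z_scores(expression):
    """Standardize each gene's values across cell types (one row per gene)."""
    z = {}
    for gene, values in expression.items():
        mu = mean(values)
        sd = pstdev(values)  # population SD; an assumed convention
        z[gene] = [0.0 if sd == 0 else (v - mu) / sd for v in values]
    return z


# Toy expression values for two hypothetical genes across three stages.
expression = {
    "GENE_X": [1.0, 2.0, 3.0],
    "GENE_Y": [5.0, 5.0, 5.0],  # constant expression, so z is defined as 0 here
}
z = row_z_scores(expression)
for gene, scores in z.items():
    print(gene, [round(s, 3) for s in scores])
```

Standardizing per gene puts highly and lowly expressed genes on the same color scale, so the heat map shows where each gene peaks rather than how abundant it is.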
Figure 4
Analysis of Interactions Among Members of the Hit Set Identifies Signaling/Transcription, Membrane, and mRNA Translation-Related Subnetworks Important to Erythropoiesis.

STRING interaction network analysis identifies signaling/transcription, membrane, and mRNA translation-related subnetworks important to erythropoiesis embedded in the genes identified in the screen hit set. Edges connecting the network are color-coded according to the evidence supporting the interaction. In STRING, this evidence can derive from empirical determination, curation in a database, co-expression of the respective gene nodes, genomic proximity, and text-mining of published literature.

https://doi.org/10.7554/eLife.44080.016
Figure 5 with 1 supplement
Transferrin Receptor 2 is a Negative Regulator of Human Erythropoiesis.

(A) Quantitative RT-PCR and (B) Western blot showing the expression of TFR2 in human CD34+ cells five days post-infection with the respective lentiviral shRNAs targeting TFR2 (TFR2 sh1 and sh2) and a control luciferase gene (shLUC). (C) Representative FACS plots of erythroid cell surface markers CD71 (transferrin receptor) and CD235a (Glycophorin A) expression at various time points during erythroid differentiation. Percentages in each quadrant are represented as mean and standard deviation of 3 independent experiments. (D) Hoechst staining showing more enucleated cells after TFR2 knockdown at day 21 of erythroid culture. (E) Representative histogram plots showing increased expression of CD235a (Glycophorin A) after TFR2 knockdown. (F) Enhanced pSTAT5 response after TFR2 knockdown in UT7/EPO cells.

https://doi.org/10.7554/eLife.44080.017
Figure 5—figure supplement 1
Additional Analysis Showing Transferrin Receptor 2 is a Negative Regulator of Human Erythropoiesis.

(A) Representative FACS plots of alternate erythroid cell surface markers CD49d (α4 integrin) and CD235a (Glycophorin A) expression at various time points during erythroid differentiation. (B) May-Grünwald Giemsa staining showing more differentiated erythroid cells after TFR2 knockdown at day 18 of erythroid culture. (C) Western blot showing downregulation of TFR2 in UT7/EPO cells. (D) Time-dependent absolute value of Mean Fluorescence Intensity (MFI) of STAT5 in UT7/EPO cells after TFR2 knockdown.

https://doi.org/10.7554/eLife.44080.018
Figure 6 with 2 supplements
SF3A2 is a Key Regulator of Human Erythropoiesis and Modulates Erythropoiesis Defects in a Murine Model of MDS.

(A) Quantitative RT-PCR and (B) Western blot showing the expression of SF3A2 in human CD34+ cells five days post-infection with the respective lentiviral shRNAs targeting SF3A2 (sh1-4) and a control luciferase gene (shLUC). (C) Growth curves showing that downregulation of SF3A2 results in reduced total cell numbers during erythroid differentiation from three independent experiments. (D) Representative FACS plots of erythroid cell surface markers CD71 (transferrin receptor) and CD235a (Glycophorin A) expression at various time points during erythroid differentiation. Percentages in each quadrant are represented as mean and standard deviation of three independent experiments. (E) Altered splicing events identified by RNA-Seq analysis of stage matched erythroid cells (shSF3A2 vs. shLUC). Overlapping changes were observed in SF3B1 mutant BM cells from MDS patients (Obeng et al., 2016) (Figure 6—source data 5 and 6). Differentially expressed genes and pathway analysis are available in Figure 6—source data 1–4. (F) Lineage negative bone marrow cells from wildtype (WT) and Sf3b1K700E mice were infected with shRNAs targeting the murine Sf3a2 gene and co-expressing a GFP reporter gene. Percentage of Ter119+ CD71+ erythroid cells within the GFP compartment after 48 hr in erythroid differentiation. (G) Total cell numbers of GFP+ erythroid cells after 48 hr in erythroid differentiation.

https://doi.org/10.7554/eLife.44080.019
Figure 6—source data 1

Table containing the DESeq2 output for differentially expressed genes in cells undergoing SF3A2 knockdown or control shRNA treatment.

https://doi.org/10.7554/eLife.44080.022
Figure 6—source data 2

Table containing the DESeq2 output for differentially expressed genes in MDS patients with and without mutations in SF3B1.

https://doi.org/10.7554/eLife.44080.023
Figure 6—source data 3

Tables containing the GO component (Table 1) and function (Table 2) enrichments calculated using GOrilla for cells undergoing SF3A2 knockdown or control shRNA treatment.

https://doi.org/10.7554/eLife.44080.024
Figure 6—source data 4

Tables containing the GO component (Table 1) and function (Table 2) enrichments calculated using GOrilla for MDS patient samples with and without mutations in SF3B1.

https://doi.org/10.7554/eLife.44080.025
Figure 6—source data 5

Tables containing the differential splicing analysis for cells undergoing SF3A2 knockdown or control shRNA treatment.

Categories of splice mutations presented in each table are alternative 3’ splice sites, alternative 5’ splice sites, mutually exclusive exons, retained introns, and skipped exons, respectively.

https://doi.org/10.7554/eLife.44080.026
Figure 6—source data 6

Tables containing the differential splicing analysis for MDS patient samples with and without mutations in SF3B1.

Categories of splice mutations presented in each table are alternative 3’ splice sites, alternative 5’ splice sites, mutually exclusive exons, retained introns, and skipped exons, respectively.

https://doi.org/10.7554/eLife.44080.027
Figure 6—figure supplement 1
Additional Analysis Showing SF3A2 is Required for Human Erythropoiesis.

(A) shRNAs targeting SF3A2 and co-expressing a GFP reporter gene were used to infect human CD34+ cells, which were then cultured in erythroid conditions. GFP expression at various time points from three independent experiments shows that downregulation of SF3A2 results in reduced cell numbers. (B) Representative FACS plots of erythroid (CD235a) and non-erythroid (CD11b/CD41a) cell surface markers at various time points showing an increase in non-erythroid lineages upon SF3A2 downregulation. Cells were gated on the GFP positive population.

https://doi.org/10.7554/eLife.44080.020
Figure 6—figure supplement 2
Additional Analysis of Erythropoiesis Defects Observed in Sf3b1K700E Murine Erythroid Cells upon SF3A2 Knockdown.

(A) Knockdown efficiency of shRNAs targeting SF3A2 in murine erythroleukemia (MEL) cells by western blot. (B) Total cell numbers of GFP+ shRNA-expressing bone marrow cells from wildtype (WT) and Sf3b1K700E mice at the start of murine erythroid differentiation. (C) Percentage of Ter119+ CD71+ erythroid cells within the GFP compartment and (D) total cell numbers of GFP+ erythroid cells after 24 hr in erythroid differentiation. (E) Growth curves of GFP+ erythroid cells during erythroid culture. (F) Putative but insignificant interaction between SF3A2 variant alleles (rs25672) and hemoglobin levels in MDS patients with SF3B1 mutations.

https://doi.org/10.7554/eLife.44080.021

Tables

Key resources table
Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information
Biological sample (Homo sapiens) | CD34+ mobilized peripheral blood | Fred Hutchinson Cancer Research Center | |
Cell line (Homo sapiens) | UT-7/EPO | NA | RRID:CVCL_5202 | maintained in Sankaran laboratory
Cell line (Mus musculus) | MEL | NA | | maintained in Sankaran laboratory
Genetic reagent (Mus musculus) | Sf3b1K700E | Obeng et al., 2016 | | Dr. Benjamin L. Ebert (Brigham and Women's Hospital, Boston MA)
Recombinant DNA reagent (lentiviral shRNA) | PLKO.1-Puro (plasmid) | Sigma-Aldrich | RRID:Addgene_10878 | Pol III based shRNA backbone
Recombinant DNA reagent (lentiviral shRNA) | PLKO-GFP (plasmid) | this paper | | GFP version of pLKO.1-Puro
Recombinant DNA reagent (lentiviral shRNA) | SFFV-Venus-mir30 shRNA (plasmid) | this paper | | Pol II based shRNA backbone
Antibody | mouse monoclonal anti-human CD235a-APC | Thermo Fisher Scientific | Cat#: 17-9987-42; RRID:AB_2043823 | FACS (5 ul per test)
Antibody | mouse monoclonal anti-human CD71-FITC | Thermo Fisher Scientific | Cat#: 11-0719-42; RRID:AB_1724093 | FACS (5 ul per test)
Antibody | mouse monoclonal anti-human CD71-PEcy7 | Thermo Fisher Scientific | Cat#: 25-0719-42; RRID:AB_2573366 | FACS (5 ul per test)
Antibody | mouse monoclonal anti-human CD49d-PE | Miltenyi Biotec | Cat#: 130-093-282; RRID:AB_1036224 | FACS (10 ul per test)
Antibody | mouse monoclonal anti-human CD41a-PE | Thermo Fisher Scientific | Cat#: 12-0419-42; RRID:AB_10870785 | FACS (5 ul per test)
Antibody | mouse monoclonal anti-human CD11b-PE | Thermo Fisher Scientific | Cat#: 12-0118-42; RRID:AB_2043799 | FACS (5 ul per test)
Antibody | Rat monoclonal anti-mouse Ter119-APC | Thermo Fisher Scientific | Cat#: 17-5921-82; RRID:AB_469473 | FACS (0.25 ug/test)
Antibody | Rat monoclonal anti-mouse CD71-PE | Thermo Fisher Scientific | Cat#: 12-0711-82; RRID:AB_465740 | FACS (0.5 ug/test)
Antibody | mouse monoclonal anti-phospho STAT5 Alexa Fluor-647 | BD Bioscience | Cat#: 612599; RRID:AB_399882 | FACS (1:20)
Antibody | mouse monoclonal anti-GAPDH | Santa Cruz Biotechnology | sc-32233; RRID:AB_627679 | Western (1:20,000)
Antibody | mouse monoclonal anti-TFR2 | Santa Cruz Biotechnology | sc-32271; RRID:AB_628395 | Western (1:200)
Antibody | mouse monoclonal anti-SF3A2 | Santa Cruz Biotechnology | sc-390444 | Western (1:1000)
Sequence-based reagent | shLUC | Sigma-Aldrich | TRCN0000072259 | 5’-CGCTGAGTACTTCGAAATGTC-3’
Sequence-based reagent | TFR2 sh1 (human) | Sigma-Aldrich | TRCN0000063628 | 5’-GCCAGATCACTACGTTGTCAT-3’
Sequence-based reagent | TFR2 sh2 (human) | Sigma-Aldrich | TRCN0000063632 | 5’-CAACAACATCTTCGGCTGCAT-3’
Sequence-based reagent | SF3A2 sh1 (human) | Sigma-Aldrich | TRCN0000000060 | 5’-CTACGAGACCATTGCCTTCAA-3’
Sequence-based reagent | SF3A2 sh2 (human) | Sigma-Aldrich | TRCN0000000061 | 5’-CCTGGGCTCCTATGAATGCAA-3’
Sequence-based reagent | SF3A2 sh3 (human) | Sigma-Aldrich | TRCN0000000062 | 5’-CAAAGTGACCAAGCAGAGAGA-3’
Sequence-based reagent | SF3A2 sh4 (human) | Sigma-Aldrich | TRCN0000000063 | 5’-ACATCAACAAGGACCCGTACT-3’
Commercial assay or kit | RNeasy Mini Kit | QIAGEN | Cat#: 74104 |
Commercial assay or kit | iScript cDNA synthesis Kit | Bio-Rad | Cat#: 1708891 |
Commercial assay or kit | iQ SYBR Green Supermix | Bio-Rad | Cat#: 170–8882 |
Commercial assay or kit | NucleoSpin Blood XL-Maxi kit | Clontech | Cat#: 740950.1 |
Commercial assay or kit | Lineage Cell Depletion Kit (mouse) | Miltenyi | Cat#: 130-090-858 |
Commercial assay or kit | Nextera XT DNA Library Preparation Kit | Illumina | Cat#: FC-131–1096 |
Commercial assay or kit | NextSeq 500/550 High Output Kit v2.5 (75 Cycles) | Illumina | Cat#: 20024906 |
Commercial assay or kit | Bioanalyzer High Sensitivity DNA Analysis | Agilent | Cat#: 5067–4626 |
Commercial assay or kit | Agencourt AMPure XP | Beckman-Coulter | Cat#: A63881 |
Commercial assay or kit | TaKaRa Ex Taq DNA Polymerase | Takara | Cat#: RR001B |
Commercial assay or kit | Qubit dsDNA HS Assay Kit | Thermo Fisher | Cat#: Q32854 |
Chemical compound, drug | Human Holo-Transferrin | Sigma Aldrich | Cat#: T0665-1G |
Peptide, recombinant protein | Humulin R (insulin) | Lilly | NDC 0002-8215-01 |
Peptide, recombinant protein | Heparin | Hospira | NDC 00409-2720-01 |
Peptide, recombinant protein | Epogen (recombinant erythropoietin) | Amgen | NDC 55513-267-10 |
Peptide, recombinant protein | Recombinant human stem cell factor (SCF) | Peprotech | Cat#: 300–07 |
Peptide, recombinant protein | Recombinant human interleukin-3 (IL-3) | Peprotech | Cat#: 200–03 |
Peptide, recombinant protein | Recombinant mouse stem cell factor (SCF) | R&D systems | Cat# 455-MC-010 |
Peptide, recombinant protein | recombinant mouse Insulin like Growth Factor 1 (IGF1) | R&D systems | Cat# 791 MG-050 |
Chemical compound, drug | Hoechst 33342 | Life Technologies | Cat#: H1399 | FACS (1:1000)
Chemical compound, drug | Fixation Buffer | BD Bioscience | Cat#: 554655 |
Chemical compound, drug | Perm Buffer III | BD Bioscience | Cat#: 558050 |
Chemical compound, drug | May-Grünwald Stain | Sigma-Aldrich | Cat#: MG500 |
Chemical compound, drug | Giemsa Stain | Sigma-Aldrich | Cat#: GS500 |
Software, algorithm | STAR | Dobin et al., 2013 | RRID:SCR_015899 |
Software, algorithm | MISO | Katz et al., 2010 | RRID:SCR_003124 |
Software, algorithm | R | The R Foundation | RRID:SCR_001905 |
Software, algorithm | Salmon | Patro et al., 2017 | RRID:SCR_017036 |
Software, algorithm | GOrilla | Eden et al., 2009 | RRID:SCR_006848 |
Software, algorithm | VEP | McLaren et al., 2016 | RRID:SCR_007931 |
Software, algorithm | FlowJo version 10 | FlowJo | RRID:SCR_008520 |
Software, algorithm | GraphPad Prism 7 | GraphPad Software Inc | RRID:SCR_002798 |
Software, algorithm | Python 2, 3 | Python Software Foundation | RRID:SCR_008394 |
Software, algorithm | PLINK | Chang et al., 2015 | RRID:SCR_001757 |
Software, algorithm | PoolQ | Broad Institute | https://portals.broadinstitute.org/gpp/public/software/poolq |


Satish K Nandakumar, Sean K McFarland, Laura M Mateyka, Caleb A Lareau, Jacob C Ulirsch, Leif S Ludwig, Gaurav Agarwal, Jesse M Engreitz, Bartlomiej Przychodzen, Marie McConkey, Glenn S Cowley, John G Doench, Jaroslaw P Maciejewski, Benjamin L Ebert, David E Root, Vijay G Sankaran (2019)
Gene-centric functional dissection of human genetic variation uncovers regulators of hematopoiesis
eLife 8:e44080.
https://doi.org/10.7554/eLife.44080