Research Article

Pancreatic progenitor epigenome maps prioritize type 2 diabetes risk genes with roles in development

Department of Pediatrics, Pediatric Diabetes Research Center, University of California, San Diego, United States
Department of Cellular & Molecular Medicine, University of California, San Diego, United States
Sanford Consortium for Regenerative Medicine, United States
Biomedical Graduate Studies Program, University of California, San Diego, United States
Human Genetics Program, Sanford Burnham Prebys Medical Discovery Institute, United States
Graduate School of Biomedical Sciences, Sanford Burnham Prebys Medical Discovery Institute, United States
Ludwig Institute for Cancer Research, United States

Feb 5, 2021

Open access
Copyright information

Abstract
Introduction
Results
Discussion
Materials and methods
Data availability
References
Article and author information
Metrics

Abstract

Genetic variants associated with type 2 diabetes (T2D) risk affect gene regulation in metabolically relevant tissues, such as pancreatic islets. Here, we investigated contributions of regulatory programs active during pancreatic development to T2D risk. Generation of chromatin maps from developmental precursors throughout pancreatic differentiation of human embryonic stem cells (hESCs) identifies enrichment of T2D variants in pancreatic progenitor-specific stretch enhancers that are not active in islets. Genes associated with progenitor-specific stretch enhancers are predicted to regulate developmental processes, most notably tissue morphogenesis. Through gene editing in hESCs, we demonstrate that progenitor-specific enhancers harboring T2D-associated variants regulate cell polarity genes LAMA1 and CRB2. Knockdown of lama1 or crb2 in zebrafish embryos causes a defect in pancreas morphogenesis and impairs islet cell development. Together, our findings reveal that a subset of T2D risk variants specifically affects pancreatic developmental programs, suggesting that dysregulation of developmental processes can predispose to T2D.

Introduction

Type 2 diabetes (T2D) is a multifactorial metabolic disorder characterized by insulin insensitivity and insufficient insulin secretion by pancreatic beta cells (Halban et al., 2014). Genetic association studies have identified hundreds of loci influencing risk of T2D (Mahajan et al., 2018). However, disease-relevant target genes of T2D risk variants, the mechanisms by which these genes cause disease, and the tissues in which the genes mediate their effects remain poorly understood.

The majority of T2D risk variants map to non-coding sequence, suggesting that genetic risk of T2D is largely mediated through variants affecting transcriptional regulatory activity. Intersection of T2D risk variants with epigenomic data has uncovered enrichment of T2D risk variants in regulatory sites active in specific cell types, predominantly in pancreatic beta cells, including risk variants that affect regulatory activity directly (Chiou et al., 2019; Fuchsberger et al., 2016; Gaulton et al., 2015; Gaulton et al., 2010; Greenwald et al., 2019; Mahajan et al., 2018; Parker et al., 2013; Pasquali et al., 2014; Thurner et al., 2018; Varshney et al., 2017). T2D risk-associated variants are further enriched within large, contiguous regions of islet active chromatin, referred to as stretch or super-enhancers (Parker et al., 2013). These regions of active chromatin preferentially bind islet-cell-restricted transcription factors and drive islet-specific gene expression (Parker et al., 2013; Pasquali et al., 2014).

Many genes associated with T2D risk in islets are not uniquely expressed in differentiated islet endocrine cells, but also in pancreatic progenitor cells during embryonic development. For example, T2D risk variants map to HNF1A, HNF1B, HNF4A, MNX1, NEUROG3, PAX4, and PDX1 (Flannick et al., 2019; Mahajan et al., 2018; Steinthorsdottir et al., 2014), which are all transcription factors also expressed in pancreatic developmental precursors. Studies in model organisms and hESC-based models of pancreatic endocrine cell differentiation have shown that inactivation of these transcription factors causes defects in endocrine cell development, resulting in reduced beta cell numbers (Gaertner et al., 2019). Furthermore, heterozygous mutations for HNF1A, HNF1B, HNF4A, PAX4, and PDX1 are associated with maturity onset diabetes of the young (MODY), which is an autosomal dominant form of diabetes with features similar to T2D (Urakami, 2019). Thus, there is evidence that reduced activity of developmentally expressed transcription factors can cause diabetes later in life.

The role of these transcription factors in T2D and MODY could be explained by their functions in regulating gene expression in mature islet cells. However, it is also possible that their function during endocrine cell development could predispose to diabetes instead of, or in addition to, endocrine cell gene regulation. One conceivable mechanism is that individuals with reduced activity of these transcription factors are born with either fewer beta cells or beta cells more prone to fail under conditions of increased insulin demand. Observations showing that disturbed intrauterine metabolic conditions, such as maternal malnutrition, can lead to reduced beta cell mass and T2D predisposition in the offspring (Lumey et al., 2015; Nielsen et al., 2014; Portha et al., 2011) support the concept that compromised beta cell development could predispose to T2D. However, whether there is T2D genetic risk relevant to the regulation of endocrine cell development independent of gene regulation in mature islet cells has not been explored.

In this study, we investigated the contribution of gene regulatory programs specifically active during pancreatic development to T2D risk. First, we employed a hESC-based differentiation system to generate chromatin maps of hESCs during their stepwise differentiation into pancreatic progenitor cells. We then identified T2D-associated variants localized in active enhancers in developmental precursors but not in mature islets, used genome editing in hESCs to define target genes of pancreatic progenitor-specific enhancers harboring T2D variants, and employed zebrafish genetic models to study the role of two target genes in pancreatic and endocrine cell development.

Results

Pancreatic progenitor stretch enhancers are enriched for T2D risk variants

To determine whether there is a development-specific genetic contribution to T2D risk, we generated genome-wide chromatin maps of hESCs during their stepwise differentiation into pancreatic progenitors through four distinct developmental stages: definitive endoderm (DE), gut tube (GT), early pancreatic progenitors (PP1), and late pancreatic progenitors (PP2) (Figure 1A). We then used ChromHMM (Ernst and Kellis, 2012) to annotate chromatin states, such as active promoters and enhancers, at all stages of hESC differentiation as well as in primary islets (Figure 1—figure supplement 1A,B).

Figure 1 with 1 supplement see all

Download asset Open asset

T2D-associated risk variants are enriched in stretch enhancers of pancreatic progenitors independent of islet stretch enhancers.

(A) Schematic illustrating the stepwise differentiation of human embryonic stem cells (hES) into pancreatic progenitors (solid arrows) and lineage relationship to islets (dotted arrow). Developmental intermediates include definitive endoderm (DE), gut tube (GT), early pancreatic progenitor (PP1), and late pancreatic progenitor (PP2) cells. (B) Box plots depicting length of typical enhancers (TE) and stretch enhancers (SE) at each developmental stage and in primary human islets. Plots are centered on median, with box encompassing 25–75th percentile and whiskers extending up to 1.5 interquartile range. Total numbers of enhancers are shown above each box plot. (C) Examples of stretch enhancers (denoted with red boxes) near the genes encoding the pancreatic lineage-determining transcription factors NKX6.1 and PDX1, respectively. Chromatin states are based on ChromHMM classifications: TssA, active promoter; TssFlnk, flanking transcription start site; TssBiv, bivalent promoter; Repr, repressed; EnhA, active enhancer; EnhP, poised enhancer. (D) Percentage of TE and SE overlapping with at least one ATAC-seq peak at PP2 or in islets. Enrichment analysis comparing observed and expected overlap based on random genomic regions of the same size and located on the same chromosome averaged over 10,000 iterations (***p<1 × 10⁻⁴; permutation test). ATAC-seq peaks were merged from two independent differentiations for PP2 stage cells and four donors for primary islets. (E) Genome-wide enrichment of T2D-associated variants (minor allele frequency >0.0025) in stretch enhancers, ATAC-seq peaks, and ATAC-seq peaks within stretch enhancers for all developmental stages when modeling each annotation separately. Points and lines represent log-scaled enrichment estimates and 95% confidence intervals from functional genome wide association analysis (fgwas), respectively. ATAC-seq peaks were merged from two independent differentiations for ES, DE, GT, PP1, and PP2 stage cells and from four donors for primary islets. (F) Genome-wide enrichment of T2D-associated variants (minor allele frequency >0.0025) in ATAC-seq peaks within stretch enhancers for all developmental stages and coding exons when considering all annotations in a joint model. Points and lines represent log-scaled enrichment estimates and 95% confidence intervals from fgwas, respectively. ATAC-seq peaks were merged from two independent differentiations for ES, DE, GT, PP1, and PP2 stage cells and from four donors for primary islets. See also Figure 1—figure supplement 1.

Large and contiguous regions of active enhancer chromatin, which have been termed stretch- or super-enhancers (Parker et al., 2013; Whyte et al., 2013), are highly enriched for T2D risk variants in islets (Parker et al., 2013; Pasquali et al., 2014). We therefore partitioned active enhancers from each hESC developmental stage and islets into stretch enhancers (SE) and traditional (non-stretch) enhancers (TE) (Figure 1B). Consistent with prior observations of SE features (Parker et al., 2013; Whyte et al., 2013), SE comprised a small subset of all active enhancers (7.7%, 7.8%, 8.8%, 8.1%, 8.1%, and 10.4% of active enhancers in ES, DE, GT, PP1, PP2, and islets, respectively; Figure 1B and Figure 1—figure supplement 1C) and genes proximal to SE were more highly expressed than genes proximal to TE (p=4.68 × 10⁻⁷, 4.64 × 10⁻¹¹, 1.31 × 10⁻⁵, 8.85 × 10⁻⁹, 5.34 × 10⁻⁶, and <2.2 × 10⁻¹⁶ for expression of genes near TE vs SE in ES, DE, GT, PP1, PP2, and islets, respectively; Figure 1—figure supplement 1D). Genes near SE in pancreatic progenitors included transcription factors involved in the regulation of pancreatic cell identity, such as NKX6.1 and PDX1 (Figure 1C). Since disease-associated variants are preferentially enriched in narrow peaks of accessible chromatin within broader regions of active chromatin (Greenwald et al., 2019; Thurner et al., 2018; Varshney et al., 2017), we next used ATAC-seq to generate genome-wide maps of chromatin accessibility across all time points of differentiation. Nearly all identified SE contained at least one ATAC-seq peak (Figure 1D and Figure 1—figure supplement 1E,F). At the PP2 stage, 62.3% of SE harbored one, 32.2% two or three, and 0.7% four or more ATAC-seq peaks (Figure 1—figure supplement 1F). Similar percentages were observed in earlier developmental precursors and islets.

Having annotated accessible chromatin sites within SE, we next tested for enrichment of T2D-associated variants in SE active in mature islets and in pancreatic developmental stages. We observed strongest enrichment of T2D-associated variants in islet SE (log enrichment = 2.18, 95% CI = 1.80, 2.54) and late pancreatic progenitor SE (log enrichment = 2.17, 95% CI = 1.40, 2.74), which was more pronounced when only considering variants in accessible chromatin sites within these elements (islet log enrichment = 3.20, 95% CI = 2.74, 3.60; PP2 log enrichment = 3.18, 95% CI = 2.35, 3.79; Figure 1E). Given that a subset of pancreatic progenitor SE is also active in islets, we next determined whether pancreatic progenitor SE contribute to T2D risk independently of islet SE. Variants in accessible chromatin sites of late pancreatic progenitor SE were enriched for T2D association in a joint model including islet SE (islet log enrichment = 2.94, 95% CI = 2.47, 3.35; PP2 log enrichment = 1.27, 95% CI = 0.24, 2.00; Figure 1F). We also observed enrichment of variants in accessible chromatin sites of pancreatic progenitor SE after conditioning on islet SE (log enrichment = 0.60, 95% CI = −0.87, 1.48), as well as when excluding pancreatic progenitor SE active in islets (log enrichment = 1.62, 95% CI = <-20, 3.14). Examples of known T2D loci with T2D-associated variants in SE active in pancreatic progenitors but not in islets included LAMA1 and PROX1. These results suggest that a subset of T2D variants may affect disease risk by altering regulatory programs specifically active in pancreatic progenitors.

Pancreatic progenitor-specific stretch enhancers are near genes that regulate tissue morphogenesis

Having observed enrichment of T2D risk variants in pancreatic progenitor SE independent of islet SE, we next sought to further characterize the regulatory programs of SE with specific function in pancreatic progenitors. We therefore defined a set of pancreatic progenitor-specific stretch enhancers (PSSE) based on the following criteria: (i) annotation as a SE at the PP2 stage, (ii) no classification as a SE at the ES, DE, and GT stages, and (iii) no classification as a TE or SE in islets. Applying these criteria, we identified a total of 492 PSSE genome-wide (Figure 2A and Figure 2—source data 1).

Figure 2 with 1 supplement see all

Download asset Open asset

Candidate target genes of pancreatic progenitor-specific stretch enhancers regulate developmental processes.

(A) Schematic illustrating identification of pancreatic progenitor-specific stretch enhancers (PSSE). (B) Heatmap showing density of H3K27ac ChIP-seq and ATAC-seq reads at PSSE, centered on overlapping H3K27ac and ATAC-seq peaks, respectively, and spanning 5 kb in ES, DE, GT, PP1, PP2, and islets. PSSE coordinates in Figure 2—source data 1. (C) Percentage of PSSE exhibiting indicated chromatin states at defined developmental stages and in islets. (D) Percentage of PSSE overlapping with at least one ChIP-seq peak at PP2 for the indicated transcription factors. Enrichment analysis comparing observed and expected overlap based on random genomic regions of the same size and located on the same chromosome averaged over 10,000 iterations (***p<1×10⁻⁴; permutation test). (E) Gene ontology analysis for nearest expressed genes (fragments per kilobase per million fragments mapped (FPKM) ≥1 at PP2) to the 492 PSSE. See also Figure 2—source data 2. (F) Enrichment (LD score regression coefficient z-scores) of T2D, developmental, and metabolic GWAS trait-associated variants at accessible chromatin sites in PSSE as compared with PP2 and islet stretch enhancers. Significant enrichment was identified within accessible chromatin at PP2 stretch enhancers for lean type 2 diabetes (Z = 2.06, *p=3.94 × 10⁻²), at PP2 stretch enhancers for type 2 diabetes (Z = 3.57, ***p=3.52 × 10⁻⁴), at islet stretch enhancers for type 2 diabetes (Z = 2.78, **p=5.46 × 10⁻³), at islet stretch enhancers for fasting proinsulin levels (Z = 2.83, **p=4.61 × 10⁻³), at islet stretch enhancers for HOMA-B (Z = 2.58, **p=9.85 × 10⁻³), at PP2 stretch enhancers for disposition index (Z = 2.18, *p=2.94 × 10⁻²), at islet stretch enhancers for acute insulin response (Z = 2.24, *p=2.51 × 10⁻²), at islet stretch enhancers for HbA1c (Z = 1.98, *p=4.72 × 10⁻²), and at islet stretch enhancers for fasting glucose levels (Z = 2.64, **p=8.31 × 10⁻³). See also Figure 2—source data 3 and Figure 2—figure supplement 1.

Figure 2—source data 1 Chromosomal coordinates of pancreatic progenitor-specific stretch enhancers (PSSE).: https://cdn.elifesciences.org/articles/59067/elife-59067-fig2-data1-v1.xlsx
Download elife-59067-fig2-data1-v1.xlsx
Figure 2—source data 2 Enriched gene ontology terms for PSSE-associated genes.: https://cdn.elifesciences.org/articles/59067/elife-59067-fig2-data2-v1.xlsx
Download elife-59067-fig2-data2-v1.xlsx
Figure 2—source data 3 Proportion of variants nominally associated with beta cell functional traits.: https://cdn.elifesciences.org/articles/59067/elife-59067-fig2-data3-v1.xlsx
Download elife-59067-fig2-data3-v1.xlsx
Figure 2—source data 4 Tissue identity of downloaded data from ROADMAP consortium.: https://cdn.elifesciences.org/articles/59067/elife-59067-fig2-data4-v1.xlsx
Download elife-59067-fig2-data4-v1.xlsx

As expected based on their chromatin state classification, PSSE acquired broad deposition of the active enhancer mark H3K27ac at the PP1 and PP2 stages (Figure 2B,C). Coincident with an increase in H3K27ac signal, chromatin accessibility at PSSE also increased (Figure 2B), and 93.5% of PSSE contained at least one accessible chromatin site at the PP2 stage (Figure 2—figure supplement 1A,B). Further investigation of PSSE chromatin state dynamics at earlier stages of pancreatic differentiation revealed that PSSE were often poised (defined by H3K4me1 in the absence of H3K27ac) prior to activation (42%, 48%, 63%, and 17% of PSSE in ES, DE, GT, and PP1, respectively; Figure 2C), consistent with earlier observations that a poised enhancer state frequently precedes enhancer activation during development (Rada-Iglesias et al., 2011; Wang et al., 2015). Intriguingly, a subset of PSSE was classified as TE earlier in development (13%, 23%, 29%, and 46% of PSSE in ES, DE, GT, and PP1, respectively; Figure 2C), suggesting that SE emerge from smaller regions of active chromatin seeded at prior stages of development. During differentiation into mature islet cells, PSSE lost H3K27ac but largely retained H3K4me1 signal (62% of PSSE) (Figure 2C), persisting in a poised state in terminally differentiated islet cells.

To gain insight into the transcription factors that regulate PSSE, we conducted motif enrichment analysis of accessible chromatin sites within PSSE (Figure 2—figure supplement 1C). Consistent with the activation of PSSE upon pancreas induction, motifs associated with transcription factors known to regulate pancreatic development (Conrad et al., 2014; Masui et al., 2007) were enriched, including FOXA (p=1 × 10⁻³⁴), PDX1 (p=1 × 10⁻³⁰), GATA (p=1 × 10⁻²⁵), ONECUT (p=1 × 10⁻¹⁷), and RBPJ (p=1 × 10⁻¹⁴), suggesting that pancreatic lineage-determining transcription factors activate PSSE. Analysis of the extent of PSSE overlap with ChIP-seq binding sites for FOXA1, FOXA2, GATA4, GATA6, PDX1, HNF6, and SOX9 at the PP2 stage substantiated this prediction (p<1 × 10⁻⁴ for all transcription factors; permutation test; Figure 2D).

Annotation of biological functions of predicted target genes for PSSE (nearest gene with FPKM ≥1 at PP2 stage) revealed gene ontology terms related to developmental processes, such as tissue morphogenesis (p=1 × 10⁻⁷) and vascular development (p=1 × 10⁻⁸), as well as developmental signaling pathways, including BMP (p=1 × 10⁻⁵), NOTCH (p=1 × 10⁻⁴), and canonical Wnt signaling (p=1 × 10⁻⁴; Figure 2E and Figure 2—source data 2), which have demonstrated roles in pancreas morphogenesis and cell lineage allocation (Ahnfelt-Rønne et al., 2010; Li et al., 2015; Murtaugh, 2008; Sharon et al., 2019; Sui et al., 2013). Consistent with the temporal pattern of H3K27ac deposition at PSSE, transcript levels of PSSE-associated genes increased upon pancreatic lineage induction and peaked at the PP2 stage (p=1.8 × 10⁻⁸; Figure 2—figure supplement 1D). Notably, expression of these genes sharply decreased in islets (p<2.2 × 10⁻¹⁶), underscoring the likely role of these genes in regulating pancreatic development but not mature islet function.

Pancreatic progenitor-specific stretch enhancers are highly specific across T2D-relevant tissues and cell types

We next sought to understand the phenotypic consequences of PSSE activity in the context of T2D pathophysiology. Variants in accessible chromatin sites of PSSE genome-wide were enriched for T2D association (log enrichment = 2.85, 95% CI = <-20, 4.09). We determined enrichment of genetic variants for T2D-related quantitative endophenotypes within accessible chromatin sites of PSSE, as well as all pancreatic progenitor SE (not just progenitor-specific) and islet SE, using LD score regression (Bulik-Sullivan et al., 2015; Finucane et al., 2015). As expected based on prior observations (Parker et al., 2013; Pasquali et al., 2014), we observed enrichment (Z > 1.96) of variants associated with quantitative traits related to insulin secretion and beta cell function within islet SE, exemplified by fasting proinsulin levels, HOMA-B, and acute insulin response (Z = 2.8, Z = 2.6, and Z = 2.2, respectively; Figure 2F). Conversely, PSSE showed a trend toward depletion for these traits, although the estimates were not significant. We further tested for enrichment in the proportion of variants in PSSE and islet SE nominally associated (p<0.05) with beta cell function traits compared to background variants. There was significant enrichment of beta cell trait association among islet SE variants (χ² test; p<0.05 for all beta cell functional traits except for insulin secretion rate), but no corresponding enrichment for PSSE (Figure 2—source data 3).

A prior study found that variants at the LAMA1 locus had stronger effects on T2D risk among lean relative to obese cases (Perry et al., 2012). Since we identified a PSSE at the LAMA1 locus, we postulated that variants in PSSE collectively might have differing impact on T2D risk in cases segregated by BMI. We therefore tested PSSE, as well as pancreatic progenitor SE and islet SE, for enrichment of T2D association using GWAS of lean and obese T2D (Perry et al., 2012), using LD score regression (Bulik-Sullivan et al., 2015; Finucane et al., 2015). We observed nominally significant enrichment of variants in pancreatic progenitor SE for T2D among lean cases (Z = 2.1). Variants in PSSE were mildly enriched for T2D among lean (Z = 1.1) and depleted among obese (Z = −0.70) cases, although neither estimate was significant. By comparison, islet SE showed positive enrichment for T2D among both lean (Z = 1.9) and obese cases (Z = 1.3; Figure 2F). Together, these results suggest that PSSE may affect T2D risk in a manner distinct from islet SE function.

Having observed little evidence for enrichment of PSSE variants for traits related to beta cell function, we asked whether the enrichment of PSSE for T2D-associated variants could be explained by PSSE activity in T2D-relevant tissues and cell types outside the pancreas. We assessed PSSE activity by measuring H3K27ac signal in 95 representative tissues and cell lines from the ENCODE and Epigenome Roadmap projects (Kundaje et al., 2015). Interestingly, there was group-wide specificity of PSSE to pancreatic progenitors relative to other cells and tissues including those relevant to T2D, such as adipose tissue, skeletal muscle, and liver (Figure 2—figure supplement 1E and Figure 2—source data 4). Since gene regulation in adipocyte precursors also contributes to T2D risk (Claussnitzer et al., 2014), we further examined PSSE specificity with respect to chromatin states during adipogenesis, using data from human adipose stromal cell differentiation stages (hASC1-4) (Mikkelsen et al., 2010; Varshney et al., 2017). PSSE exhibited virtually no active chromatin during adipogenesis (9, 8, 6, and 8 out of the 492 PSSE were active enhancers in hACS-1, hASC-2, hASC-3, and hASC-4, respectively; Figure 2—figure supplement 1F). These findings identify PSSE as highly pancreatic progenitor-specific across T2D-relevant tissues and cell types.

Identification of pancreatic progenitor-specific stretch enhancers harboring T2D-associated variants

Given the relative specificity of PSSE to pancreatic progenitors, we next sought to identify T2D-associated variants in PSSE at specific loci which may affect pancreatic development. We therefore identified variants in PSSE with evidence of T2D association (at p=4.7 × 10⁻⁶) after correcting for the total number of variants in PSSE genome-wide (n = 10,738). In total there were 49 variants in PSSE with T2D association exceeding this threshold mapping to 11 loci (Figure 3A). This included variants at nine loci with known genome-wide significant T2D association (PROX1, ST6GAL1, SMARCAD1, XKR6, INS-IGF2, HMGA2, SMEK1, HMG20A, and LAMA1), as well as at two previously unreported loci with sub-genome-wide significant association, CRB2 and PGM1. To identify candidate target genes of the T2D-associated PSSE in pancreatic progenitors, we analyzed the expression of all genes within the same topologically associated domain (TAD) as the PSSE in PP2 cells and in primary human embryonic pancreas tissue (Figure 3B and Figure 3—figure supplement 1A). These expressed genes are candidate effector transcripts of T2D-associated variants in pancreatic progenitors.

Figure 3 with 1 supplement see all

Download asset Open asset

Identification of T2D risk variants associated with pancreatic progenitor-specific stretch enhancers.

(A) Manhattan plot showing T2D association p-values (from Mahajan et al., 2018) for 10,738 variants mapping within PSSE. The dotted line shows the threshold for Bonferroni correction (p=4.66 × 10⁻⁶). Novel loci identified with this threshold and mapping at least 500 kb away from a known locus are highlighted in blue. Chromosomal coordinates of T2D-associated PSSE are indicated. (B) mRNA levels (measured in fragments per kilobase per million fragments mapped [FPKM]) at PP2 (blue) and in human embryonic pancreas (54 and 58 days gestation, gold) of nearest expressed (FPMK ≥1) gene at PP2 for PSSE harboring T2D variants identified in A. (C) PP2 specificity of H3K27ac signal at PSSE harboring T2D variants identified in A. Z-score comparing H3K27ac signal at PP2 to H3K27ac signal in tissues and cell lines from the ENCODE and Epigenome Roadmap projects. See also Figure 3—figure supplement 1.

As many pancreatic progenitor SE remain poised in mature islets (Figure 2C), we considered whether T2D-associated variants in PSSE could have gene regulatory function in islets that is re-activated in the disease state. We therefore assessed overlap of PSSE variants with accessible chromatin of islets from T2D donors (Khetan et al., 2018). None of the strongly T2D-associated variants in PSSE (p=4.7 × 10⁻⁶) overlapped an islet accessible chromatin site in T2D islets, arguing against the relevance of PSSE in broadly regulating islet gene activity during T2D.

A pancreatic progenitor-specific stretch enhancer at LAMA1 harbors T2D risk variants and regulates LAMA1 expression selectively in pancreatic progenitors

Variants in a PSSE at the LAMA1 locus were associated with T2D at genome-wide significance (Figure 3A), and LAMA1 was highly expressed in the human embryonic pancreas (Figure 3B). Furthermore, the activity of the PSSE at the LAMA1 locus was almost exclusively restricted to pancreatic progenitors (Figure 3—figure supplement 1B,C), and was further among the most progenitor-specific across all PSSE harboring T2D risk variants (Figure 3C). In addition, reporter gene assays in zebrafish embryos have shown that this enhancer drives gene expression specific to pancreatic progenitors in vivo (Cebola et al., 2015). We therefore postulated that the activity of T2D-associated variants within the LAMA1 PSSE is relevant for gene regulation in pancreatic progenitors, and we sought to characterize the LAMA1 PSSE in greater depth.

Multiple T2D-associated variants mapped within the LAMA1 PSSE, and these variants were further in the 99% credible set in fine-mapping data from the DIAMANTE consortium (Mahajan et al., 2018; Figure 4A). No other variants in the 99% credible set mapped in an accessible chromatin site active in islets from either non-diabetic or T2D samples. The PSSE is intronic to the LAMA1 gene and contains regions of poised chromatin and TE at prior developmental stages (Figure 4A). Consistent with its stepwise genesis as a SE throughout development, regions of open chromatin within the LAMA1 PSSE were already present at the DE and GT stages. Furthermore, pancreatic lineage-determining transcription factors, such as FOXA1, FOXA2, GATA4, GATA6, HNF6, SOX9, and PDX1, were all bound to the PSSE at the PP2 stage (Figure 4B). Among credible set variants in the LAMA1 PSSE, rs10502347 overlapped an ATAC-seq peak as well as ChIP-seq sites for multiple pancreatic lineage-determining transcription factors. Additionally, rs10502347 directly coincided with a SOX9 footprint identified in ATAC-seq data from PP2 cells, and the T2D risk allele C is predicted to disrupt SOX9 binding (Figure 4B). Consistent with the collective endophenotype association patterns of PSSE (Figure 2F), rs10502347 showed no association with beta cell function (p=0.81, 0.23, 0.46 for fasting proinsulin levels, HOMA-B, and acute insulin response, respectively; Figure 4—figure supplement 1A). Thus, T2D variant rs10502347 is predicted to affect the binding of pancreatic transcription factors and does not appear to affect beta cell function.

Figure 4 with 2 supplements see all

Download asset Open asset

A T2D risk-associated *LAMA1* pancreatic progenitor-specific stretch enhancer regulates *LAMA1* expression specifically in pancreatic progenitors.

(A) (Top) Locus plots showing T2D association p-values for variants in a 35 kb window (hg19 chr18:7,050,000–7,085,000) at the *LAMA1* locus and *LAMA1* PSSE (red box). Fine mapped variants within the 99% credible set for the *LAMA1* locus are colored black. All other variants are colored light gray. (Bottom) Chromatin states and ATAC-seq signal in ES, DE, GT, PP1, and PP2. TssA, active promoter; TssFlnk, flanking transcription start site; TssBiv, bivalent promoter; Repr, repressed; EnhA, active enhancer; EnhP, poised enhancer. (B) FOXA1, FOXA2, GATA4, GATA6, HNF6, SOX9, and PDX1 ChIP-seq profiles at the *LAMA1* PSSE in PP2. The variant rs10502347 (red) overlaps transcription factor binding sites and a predicted ATAC-seq footprint for the SOX9 sequence motif. Purple dotted lines indicate the core binding profile of the average SOX9 footprint genome-wide and the blue dotted line indicates the position of rs10502347 within the SOX9 motif. (C) *LAMA1* mRNA expression at each developmental stage determined by RNA-seq, measured in fragments per kilobase per million fragments mapped (FPKM). Data shown as mean ± S.E.M. (n = 3 replicates from independent differentiations). Light blue and purple indicate classification of the *LAMA1* PSSE as typical enhancer (TE) and stretch enhancer (SE), respectively. (D) *LAMA1* mRNA expression at each developmental stage determined by qPCR in control and ∆*LAMA1*Enh cells. Data are shown as mean ± S.E.M. (n = 3 replicates from independent differentiations for control cells. ∆*LAMA1*Enh cells represent combined data from two clonal lines with three replicates for each line from independent differentiations. n = 3 technical replicates for each sample; p=0.319, 0.594, 0.945, 0.290, and <1 × 10⁻⁶ for comparisons in ES, DE, GT, PP1, and PP2, respectively; student’s t-test, two sided; ***p<0.001, n.s., not significant). Light blue and purple indicate classification of the *LAMA1* PSSE as TE and SE, respectively. Each plotted point represents the average of technical replicates for each differentiation. (E) mRNA expression determined by RNA-seq at PP2 of genes expressed in either control or ∆*LAMA1*Enh cells (FPKM ≥ 1 at PP2) and located within the same topologically associated domain as *LAMA1*. Data are shown as mean FPKM ± S.E.M. (n = 2 replicates from independent differentiations for control cells. ∆*LAMA1*Enh cells represent combined data from two clonal lines with two replicates for each line from independent differentiations. p adj. = 0.389 and 8.11 × 10⁻³ for *ARHGAP28* and *LAMA1*, respectively; DESeq2). See also Figure 4—figure supplements 1 and 2.

Figure 4—source data 1 Genes downregulated in ∆LAMA1Enh PP2 stage cells compared to control cells (p adj. <0.05).: https://cdn.elifesciences.org/articles/59067/elife-59067-fig4-data1-v1.xlsx
Download elife-59067-fig4-data1-v1.xlsx
Figure 4—source data 2 Genes upregulated in ∆LAMA1Enh PP2 stage cells compared to control cells (p adj. <0.05).: https://cdn.elifesciences.org/articles/59067/elife-59067-fig4-data2-v1.xlsx
Download elife-59067-fig4-data2-v1.xlsx

Enhancers can control gene expression over large genomic distances, and therefore their target genes cannot be predicted based on proximity alone. To directly assess the function of the LAMA1 PSSE in regulating gene activity, we utilized CRIPSR-Cas9-mediated genome editing to generate two independent clonal human hESC lines harboring homozygous deletions of the LAMA1 PSSE (hereafter referred to as ∆LAMA1Enh; Figure 4—figure supplement 1B). We examined LAMA1 expression in ∆LAMA1Enh compared to control cells throughout stages of pancreatic differentiation. Consistent with the broad expression of LAMA1 across developmental and mature tissues, control cells expressed LAMA1 at all stages (Figure 4C). LAMA1 was expressed at similar levels in ∆LAMA1Enh and control cells at early developmental stages, but was significantly reduced in PP2 cells derived from ∆LAMA1Enh clones (p=0.319, 0.594, 0.945, 0.290, and <1 × 10⁻⁶ for comparisons in ES, DE, GT, PP1, and PP2, respectively; Figure 4D). To next investigate whether the LAMA1 PSSE regulates other genes at this locus, we utilized Hi-C datasets from PP2 cells to identify topologically associated domains (TADs). We then examined expression of genes mapping in the same TAD as the LAMA1 PSSE. ARHGAP28 was the only other expressed gene within the TAD, and albeit not significantly different from controls (p.adj >0.05), showed a trend toward lower expression in ∆LAMA1Enh PP2 cells (Figure 4E), raising the possibility that ARHGAP28 is an additional target gene of the LAMA1 PSSE. Together, these results demonstrate that while LAMA1 itself is broadly expressed across developmental stages, the T2D-associated PSSE regulates LAMA1 expression specifically in pancreatic progenitors.

To determine whether deletion of the LAMA1 PSSE affects pancreatic development, we generated PP2 stage cells from ∆LAMA1Enh and control hESC lines and analyzed pancreatic cell fate commitment by flow cytometry and immunofluorescence staining for PDX1 and NKX6.1 (Figure 4—figure supplement 1C,D). At the PP2 stage, ∆LAMA1Enh and control cultures contained similar percentages of PDX1- and NKX6.1-positive cells. Furthermore, mRNA expression of PDX1, NKX6.1, PROX1, PTF1A, and SOX9 was either unaffected or only minimally reduced (p adj. = 3.56 × 10⁻², 0.224, 0.829, 8.14 × 10⁻², and 0.142, for comparisons of PDX1, NKX6.1, PROX1, PTF1A, and SOX9 expression, respectively; Figure 4—figure supplement 1E), and the overall gene expression profiles as determined by RNA-seq were similar in ∆LAMA1Enh and control PP2 cells (Figure 4—figure supplement 1F and Figure 4—source datas 1 and 2). To examine effects of complete LAMA1 loss-of-function, we additionally generated a hESC line harboring a deletion of the LAMA1 coding sequences (hereafter referred to as ∆LAMA1; Figure 4—figure supplement 2A,B), and produced PP2 stage cells. Similar to ∆LAMA1Enh cultures, ∆LAMA1 and control PP2 stage cultures contained similar numbers of PDX1- and NKX6.1-positive cells (Figure 4—figure supplement 2C,D). Likewise, mRNA expression of PDX1, NKX6.1, PROX1, PTF1A, and SOX9 was similar in ∆LAMA1 and control PP2 cells (p=4.3 × 10⁻², 0.19, 0.16, 0.17, and 8.7 × 10⁻², respectively; Figure 4—figure supplement 2E). These findings indicate that in vitro pancreatic lineage induction is unperturbed in both ∆LAMA1Enh cells exhibiting reduced LAMA1 expression, as well as ∆LAMA1 cells where LAMA1 coding sequences are disrupted.

Pancreatic progenitor-specific stretch enhancers at the CRB2 and PGM1 loci harbor T2D-associated variants

Multiple variants with evidence for T2D association in PSSE mapped outside of known risk loci, such as those mapping to CRB2 and PGM1 (Figure 3A). As with the LAMA1 PSSE, PSSE harboring variants at CRB2 and PGM1 were intronic to their respective genes, contained ATAC-seq peaks, and bound pancreatic lineage-determining transcription factors FOXA1, FOXA2, GATA4, GATA6, HNF6, SOX9, and PDX1 (Figure 5A,B and Figure 5—figure supplement 1A,B). Compared to the LAMA1 PSSE, CRB2 and PGM1 PSSE were less specific to pancreatic progenitors and exhibited significant H3K27ac signal in several other tissues and cell types, most notably brain, liver, and the digestive tract (Figure 5—figure supplement 1C,D).

Figure 5 with 3 supplements see all

Download asset Open asset

A T2D risk-associated *CRB2* pancreatic progenitor-specific stretch enhancer regulates *CRB2* expression specifically in pancreatic progenitors.

(A) (Top) Locus plots showing T2D association p-values for variants in a 35 kb window (hg19 chr9:126,112,000–126,147,000) at the *CRB2* locus and *CRB2* PSSE (red box). Fine mapped variants within the 99% credible set for the novel *CRB2* locus are colored black. All other variants are colored light gray. (Bottom) Chromatin states and ATAC-seq signal in ES, DE, GT, PP1, and PP2. TssA, active promoter; TssFlnk, flanking transcription start site; TssBiv, bivalent promoter; Repr, repressed; EnhA, active enhancer; EnhP, poised enhancer. (B) FOXA1, FOXA2, GATA4, GATA6, HNF6, SOX9, and PDX1 ChIP-seq profiles at the *CRB2* PSSE in PP2. The variant rs2491353 (black) overlaps with transcription factor binding sites. (C) *CRB2* mRNA expression at each developmental stage determined by RNA-seq, measured in fragments per kilobase per million fragments mapped (FPKM). Data shown as mean ± S.E.M. (n = 3 replicates from independent differentiations). Light blue and purple indicate classification of the *CRB2* PSSE as typical enhancer (TE) and stretch enhancer (SE), respectively. Plotted points represent average of technical replicates for each differentiation. (D) *CRB2* mRNA expression at each developmental stage determined by qPCR in control and ∆*CRB2*Enh cells. Data are shown as mean ± S.E.M. (n = 3 replicates from independent differentiations for control cells. ∆*CRB2*Enh cells represent combined data from two clonal lines with three replicates for each line from independent differentiations. n = 3 technical replicates for each sample; p=7.03 × 10⁻⁴,<1 × 10⁻⁶,<1 × 10⁻⁶, 1.46 × 10⁻², and <1 × 10⁻⁶ for comparisons in ES, DE, GT, PP1, and PP2, respectively; student’s t-test, two sided; ***p<0.001 **p<0.01). Light blue and purple indicate classification of the *CRB2* PSSE as TE and SE, respectively. Each plotted point represents the average of technical replicates for each differentiation. (E) mRNA expression determined by RNA-seq at PP2 of genes expressed in either control or ∆*CRB2*Enh cells (FPKM ≥ 1 at PP2) and located within the same topologically associated domain as *CRB2*. Data are shown as mean FPKM ± S.E.M. (n = 2 replicates from independent differentiations for control cells. ∆*CRB2*Enh cells represent combined data from two clonal lines with two replicates for each line from independent differentiations. p adj. = 0.158, 1.00, and 3.51 × 10⁻³, for *MIR600HG*, *STRBP*, and *CRB2*, respectively; DESeq2; **p<0.01, n.s., not significant). See also Figure 5—figure supplements 1–3.

Figure 5—source data 1 Genes downregulated in ∆CRB2Enh PP2 stage cells compared to control cells (p adj. <0.05).: https://cdn.elifesciences.org/articles/59067/elife-59067-fig5-data1-v1.xlsx
Download elife-59067-fig5-data1-v1.xlsx

CRB2 is a component of the Crumbs protein complex involved in the regulation of cell polarity and neuronal, heart, retinal, and kidney development (Alves et al., 2013; Bulgakova and Knust, 2009; Dudok et al., 2016; Jiménez-Amilburu and Stainier, 2019; Slavotinek et al., 2015). However, its role in pancreatic development is unknown. To determine whether the CRB2 PSSE regulates CRB2 expression in pancreatic progenitors, we generated two independent hESC clones with homozygous deletions of the CRB2 PSSE (hereafter referred to as ∆CRB2Enh; Figure 5—figure supplement 2A) and performed pancreatic differentiation of ∆CRB2Enh and control hESC lines. In control cells, CRB2 was first expressed at the GT stage and increased markedly at the PP1 stage (Figure 5C). This pattern of CRB2 expression is consistent with H3K27ac deposition at the CRB2 PSSE in GT stage cells and classification as a SE at the PP1 and PP2 stages (Figure 5A and Figure 5—figure supplement 1C). In ∆CRB2Enh cells, we observed upregulation of CRB2 expression at earlier developmental stages, in particular at the DE and GT stages (p<1 × 10⁻⁶ at both stages; Figure 5D), suggesting that the CRB2 PSSE may be associated with repressive transcriptional complexes prior to pancreas induction. At the PP2 stage, CRB2 expression was significantly reduced in ∆CRB2Enh cells (p adj. = 3.51 × 10⁻³; Figure 5D), whereas the expression of other genes in the same TAD was not affected (p adj. ≥0.05; Figure 5E). Thus, the CRB2 PSSE specifically regulates CRB2 and is required for CRB2 expression in pancreatic progenitors.

Phenotypic characterization of PP2 stage ∆CRB2Enh cultures revealed similar percentages of PDX1- and NKX6.1-positive cells as in control cultures (Figure 5—figure supplement 2B,C). The expression of pancreatic transcription factors and global gene expression profiles were also similar in ∆CRB2Enh and control PP2 cells (Figure 5—figure supplement 2D,E and Figure 5—source data 1). Likewise, CRB2 deletion hESCs (∆CRB2) differentiated to the PP2 stage (Figure 5—figure supplement 3A,B) produced similar numbers of PDX1- and NKX6.1-positive cells and expressed pancreatic transcription factors at levels similar to control cells (Figure 5—figure supplement 3C–E). Thus, neither deletion of the CRB2 PSSE nor the CRB2 gene overtly impairs pancreatic lineage induction in the in vitro hESC differentiation system.

lama1 and crb2 zebrafish morphants display annular pancreas and decreased beta cell mass

Based on their classification as extracellular matrix and cell polarity proteins, respectively, Laminin (encoded by LAMA1) and CRB2 are predicted to regulate processes related to tissue morphogenesis, such as cell migration, tissue growth, and cell allocation within the developing organ. Furthermore, PSSE in general were enriched for proximity to genes involved in tissue morphogenesis (Figure 2E), suggesting that T2D risk variants acting within PSSE could have roles in pancreas morphogenesis. Since cell migratory processes and niche-specific signaling events are not fully modeled during hESC differentiation, we reasoned that the in vitro pancreatic differentiation system might not be suitable for studying Laminin and CRB2 function in pancreatic development.

To circumvent these limitations, we employed zebrafish as an in vivo vertebrate model to study the effects of reduced lama1 and crb2 levels on pancreatic development. The basic organization and cell types in the pancreas as well as the genes regulating endocrine and exocrine pancreas development are highly conserved between zebrafish and mammals (Dong et al., 2008; Field et al., 2003; Kimmel et al., 2015). To analyze pancreatic expression of Laminin and Crb proteins, we used Tg(ptf1a:eGFP)^jh1 embryos to visualize pancreatic progenitor cells and the acinar pancreas by eGFP expression. At 48 hr post-fertilization (hpf), both Laminin and Crb proteins were detected in the eGFP and Nkx6.1 co-positive pancreatic progenitor cell domain (Figure 6—figure supplement 1A,B).

To determine the respective functions of lama1 and crb2 in pancreatic development, we performed knockdown experiments using anti-sense morpholinos directed against lama1 and the two zebrafish crb2 genes, crb2a and crb2b (Omori and Malicki, 2006; Pollard et al., 2006). Knockdown efficiency of each morpholino was validated using whole-mount immunohistochemistry. We observed significant reduction of Laminin staining throughout the pancreatic progenitor cell domain in embryos treated with morpholinos targeting lama1 (Figure 6—figure supplement 2A–D). In embryos treated with morpholinos targeting crb2a or crb2a and crb2b, we observed loss of staining in the pancreatic progenitor cell domain using antibodies specific to Crb2a or antibodies detecting all Crb proteins, respectively (Figure 6—figure supplement 3A–H) Residual panCrb protein signal was observed in the dorsal pancreas, which may be the result of expression of Crb proteins other than Crb2a and Crb2b in this region.

Consistent with prior studies (Pollard et al., 2006), lama1 morphants exhibited reduced body size and other gross anatomical defects at 78 hpf, whereas crb2a/b morphants appeared grossly normal. Both lama1 and crb2a/b morphants displayed an annular pancreas (15 out of 34 lama1 and 27 out of 69 crb2a/b morphants) characterized by pancreatic tissue partially or completely encircling the duodenum (Figure 6A–D), a phenotype indicative of impaired migration of pancreatic progenitors during pancreas formation. These findings suggest that both lama1 and crb2a/b control cell migratory processes during early pancreatic development and that reduced levels of lama1 or crb2a/b impair pancreas morphogenesis.

Figure 6 with 4 supplements see all

Download asset Open asset

*lama1* and *crb2* regulate pancreas morphogenesis and beta cell differentiation.

(**A,B**) Representative 3D renderings of *Tg(ptf1a:eGFP)^jh1* control zebrafish embryos (**A,A’**) and *lama1* morphants (**B,B’**) stained with DAPI (nuclei, blue) and antibody against insulin (red); n ≥ 15 embryos per condition. To account for reduced acinar pancreas size in *lama1* morphants, control embryos were imaged at 50 hr post fertilization (hpf) and *lama1* morphants at 78 hpf. 15 out of 34 *lama1* morphants displayed an annular pancreas with two acinar pancreas domains (green) connected behind the presumptive intestine (B’, white arrow). Scale bar, 40 µM. (**C,D**) Representative 3D renderings of 78 hpf *Tg(ptf1a:eGFP)^jh1* control zebrafish embryos (**C,C’**) and *crb2a/b* morphants (**D,D’**) stained with DAPI (nuclei, blue) and antibodies against insulin (red); n ≥ 15 embryos per condition. Twenty-seven out of 69 *crb2a/b* morphants displayed an annular pancreas with the acinar pancreas (green) completely surrounding the presumptive intestine. Scale bar, 40 µM. (E) Representative 3D renderings of *Tg(ptf1a:eGFP)^jh1* control zebrafish embryos and *crb2a/b, lama1*, or *crb2a/b + lama1* morphants stained with DAPI (nuclei, blue) and antibody against insulin (red). All embryos were imaged at 78 hpf except for controls to *lama1* and *crb2a/b + lama1* morphants, which were imaged at 50 hpf to account for reduced acinar pancreas size of *lama1* morphants. Scale bar, 20 µM. (F) Quantification of beta (insulin⁺) cell nuclei per embryo from experiment in (E). p adj. = 4.0 × 10⁻³, 8.0 × 10⁻³, and 2.0 × 10⁻⁴ for comparison of hfp 78 control (n = 7 embryos) to hfp 78 *crb2a/b* (n = 8), hpf 50 control (n = 12) to hpf 78 *lama1* (n = 10), or *crb2a/b + lama1* (n = 12) morphants, respectively; ANOVA-Dunnett’s multiple comparison test; ***p<0.001 **p<0.01. 5 out of 8 *crb2a/b*, 3 out of 10 *lama1*, and 9 out of 12 *crb2a/b + lama1* morphants displayed an annular pancreas. MO, morpholino; Control, standard control morpholino. See also Figure 6—figure supplements 1–4.

To gain insight into the effects of lama1 and crb2a/b knockdown on pancreatic endocrine cell development, we examined beta cell numbers (insulin⁺ cells) at 78 hpf. We also evaluated potential synergistic effects of combined lama1 and crb2a/b knockdown. To account for the reduction in body and pancreas size in lama1 morphants, we compared cell numbers in 78 hpf lama1 morphants with 50 hpf control embryos, which have a similarly sized acinar compartment as 78 hpf lama1 morphants. Beta cell numbers were significantly reduced in both lama1 and crb2a/b morphants (p=8.0 × 10⁻³ and 4.0 × 10⁻³ for comparisons of lama1 and crb2a/b morphants, respectively; Figure 6E,F), as well as in morphants with a combined knockdown of lama1 and crb2a/b (p=2.0 × 10⁻⁴; Figure 6F), showing that reduced lama1 and crb2a/b levels, both individually and in combination, impair beta cell development. Furthermore, we found that nearly all lama1, crb2a/b, and combined lama1 and crb2a/b morphants without an annular pancreas had reduced beta cell numbers, indicating independent roles of lama1 and crb2 in pancreas morphogenesis and beta cell differentiation. Finally, to investigate the contributions of individual crb2 genes to the observed phenotype, we performed knockdown experiments using morpholinos against crb2a and crb2b alone. Only crb2b morphants showed a significant reduction in beta cell numbers (p=4.4 × 10⁻²; Figure 6—figure supplement 4), suggesting that crb2b is the predominant crb2 gene required for beta cell development. Combined, these findings demonstrate that lama1 and crb2 are regulators of pancreas morphogenesis and beta cell development in vivo.

Discussion

In this study, we identify T2D-associated variants localized within chromatin active in pancreatic progenitors but not islets or other T2D-relevant tissues, suggesting a novel mechanism whereby a subset of T2D risk variants specifically alters pancreatic developmental processes. We link T2D-associated enhancers active in pancreatic progenitors to the regulation of LAMA1 and CRB2 and demonstrate a functional requirement in zebrafish for lama1 and crb2 in pancreas morphogenesis and endocrine cell formation. Furthermore, we provide a curated list of T2D risk-associated enhancers and candidate effector genes for further exploration of how the regulation of developmental processes in the pancreas can predispose to T2D.

Our analysis identified 11 loci where T2D-associated variants mapped in SE specifically active in pancreatic progenitors. Among these loci was LAMA1, which has stronger effects on T2D risk in lean compared to obese individuals (Perry et al., 2012). We also found evidence that variants in PSSE collectively have stronger enrichment for T2D in lean individuals, although the small number of PSSE and limited sample size of the BMI-stratified T2D genetic data prohibits a more robust comparison. There was also a notable lack of enrichment among PSSE variants for association with traits related to insulin secretion and beta cell function. If T2D-associated variants in PSSE indeed confer diabetes susceptibility by affecting beta cell development, the question arises as to why variants associated with traits related to beta cell function are not enriched within PSSE. As genetic association studies of endophenotypes are based on data from non-diabetic subjects, a possible explanation is that variants affecting beta cell developmental processes have no overt phenotypic effect under physiological conditions and contribute to T2D pathogenesis only during the disease process.

Since the genomic position of enhancers and transcription factor binding sites is not well conserved between species (Villar et al., 2015), a human cell model is necessary to identify target genes of enhancers associated with disease risk. By employing enhancer deletion in hESCs, we demonstrate that T2D-associated PSSE at the LAMA1 and CRB2 loci regulate LAMA1 and CRB2, respectively, and establish LAMA1 and CRB2 as the predominant target gene of their corresponding PSSE within TAD boundaries. By analyzing LAMA1 and CRB2 expression throughout the pancreatic differentiation time course, we show that the identified PSSE control LAMA1 and CRB2 expression in a temporal manner consistent with the activation pattern of their associated PSSE. While the specific T2D-relevant target genes of the majority of T2D-associated PSSE remain to be identified, it is notable that several are localized within TADs containing genes encoding transcriptional regulators. These include PROX1 and GATA4, which are known to regulate pancreatic development (Shi et al., 2017; Tiyaboonchai et al., 2017; Westmoreland et al., 2012), as well as HMGA2 and BCL6 with unknown functions in the pancreas. Our catalogue of T2D-associated PSSE provides a resource to fully characterize the gene regulatory program associated with developmentally mediated T2D risk in the pancreas. Our finding that predicted target genes of PSSE are similarly expressed in hESC-derived pancreatic progenitors and primary human embryonic pancreas (Figure 3B and Figure 3—figure supplement 1A) further underscores the utility of the hESC-based system for these studies.

In the embryo, endocrine cells differentiate by delaminating from a polarized epithelium of progenitors governed by local cell-cell and cell-matrix signaling events (Mamidi et al., 2018). These processes are not well-recapitulated in the hESC-based pancreatic differentiation system, highlighting a limitation of this system for studying the function of Laminin and CRB2, which are mediators of mechanical signals within an epithelium. Therefore, we analyzed their function in zebrafish as an in vivo model. We show that lama1 or crb2 knockdown leads to an annular pancreas and reduced beta cell numbers. The beta cell differentiation defect was also evident in embryos not displaying an annular pancreas, suggesting independent mechanisms.

Consistent with our findings in lama1 morphants, culture of pancreatic progenitors on Laminin-based substrates promotes endocrine cell differentiation (Mamidi et al., 2018). During in vivo pancreatic development, endothelial cells are an important albeit not the only source of Laminin in the pancreas (Heymans et al., 2019; Mamidi et al., 2018; Nikolova et al., 2006). While we do not know the respective contributions of endothelial cell- and pancreatic progenitor cell-derived Laminin to the phenotype of lama1 morphants, the T2D-associated LAMA1 PSSE is not active in endothelial cells (Figure 3—figure supplement 1C). Furthermore, we found no other T2D-associated variants at the LAMA1 locus mapping in endothelial cell enhancers or accessible chromatin sites in islets, suggesting that T2D risk is linked to LAMA1 regulation in pancreatic progenitors.

Similar to Laminin, CRB2 has been shown to regulate mechanosignaling (Varelas et al., 2010). Our observation that pancreatic progenitor cells express Crb proteins is consistent with the phenotype of crb2 morphants reflecting a progenitor-autonomous role of Crb2. Furthermore, the similarity in pancreatic phenotype between lama1 or crb2 morphants raises the possibility that signals from Laminin and Crb2 could converge on the same intracellular pathways in pancreatic progenitors.

Our findings suggest that variation in gene regulation during pancreatic development can predispose to T2D later in life. Several lines of evidence support the concept of a developmental impact on T2D risk. First, human genetic studies have shown a strong correlation between birth weight and adult cardiometabolic traits and disease (Horikoshi et al., 2016). Second, epidemiological studies provide evidence that offspring of mothers who were pregnant during a famine have a higher prevalence of T2D (Lumey et al., 2015). This phenomenon has been experimentally reproduced in rodents, where maternal malnutrition has been shown to cause reduced beta cell mass at birth and to render beta cells more prone to failure under stress (Nielsen et al., 2014). Together, our results provide a strong rationale for further exploration of how genetic variants affecting developmental gene regulation in the pancreas contribute to T2D risk.

Materials and methods

Key resources table

Reagent type (species) or resource	Designation	Source or reference	Identifiers	Additional information
Antibody	APC Mouse monoclonal IgG1, κ Isotype Control	BD Pharmingen	Cat# 555751, RRID:AB_398613	Flow cytometry (1:100)
Antibody	Chicken polyclonal anti-GFP	Aves Labs	Cat# GFP-1020, RRID:AB_10000240	Immunohistochemistry (1:200)
Antibody	Cy3-conjugated donkey polyclonal anti-mouse	Jackson ImmunoResearch Labs	Cat# 715-165-150, RRID:AB_2340813	Immunofluorescence (1:1000)
Antibody	DyLight 488-conjugated donkey polyconal anti-goat	Jackson ImmunoResearch Labs	Cat# 705-545-003, RRID:AB_2340428	Immunofluorescence (1:500)
Antibody	Goat polyclonal anti-CTCF	Santa Cruz Biotechnology	Cat# SC-15914X, RRID:AB_2086899	ChIP-seq (4 ug)
Antibody	Goat polyclonal anti-FOXA1	Abcam	Cat# ab5089, RRID:AB_304744	ChIP-seq (4 ug)
Antibody	Goat polyclonal anti-FOXA2	Santa Cruz Biotechnology	Cat# sc-6554, RRID:AB_2262810	ChIP-seq (4 ug)
Antibody	Goat polyclonal anti-GATA4	Santa Cruz Biotechnology	Cat# sc-1237, RRID:AB_2108747	ChIP-seq (4 ug)
Antibody	Goat polyclonal anti-PDX1	Abcam	Cat# ab47383, RRID:AB_2162359	Immunofluorescence (1:500)
Antibody	Guinea pig polyclonal anti-Insulin	Biomeda	Cat# v2024	Immunohistochemistry (1:200)
Antibody	Mouse monoclonal anti-Crb2a	ZIRC	Cat# Zs-4	Immunohistochemistry (1:100)
Antibody	Mouse polyclonal anti-GATA6	Santa Cruz Biotechnology	Cat# sc-9055, RRID:AB_2108768	ChIP-seq (4 ug)
Antibody	Mouse monoclonal anti-NKX6.1	Developmental Studies Hybridoma Bank	Cat# F64A6B4, RRID:AB_532380	Immunofluorescence (1:300)
Antibody	Mouse monoclonal anti-NKX6.1-Alexa Fluor 647	BD Biosciences	Cat# 563338, RRID:AB_2738144	Flow cytometry (1:5)
Antibody	Mouse monoclonal anti-NKX6.1	Developmental Studies Hybridoma Bank	Cat# F55A10, RRID:AB_532378	Immunohistochemistry (1:10)
Antibody	Mouse monoclonal anti-PDX1-PE	BD Biosciences	Cat# 562161, RRID:AB_10893589	Flow cytometry (1:10)
Antibody	PE Mouse monoclonal IgG1, κ Isotype Control	BD Pharmingen	Cat# 555749, RRID:AB_396091	Flow cytometry (1:100)
Antibody	Rabbit polyclonal anti-CRB2	Sigma	Cat # SAB1301340	Immunofluorescence (1:500)
Antibody	Rabbit polyclonal anti-H3K27ac	Active Motif	Cat# 39133, RRID:AB_2561016	ChIP-seq (4 ug)
Antibody	Rabbit polyclonal anti-H3K4me1	Abcam	Cat# ab8895, RRID:AB_306847	ChIP-seq (4 ug)
Antibody	Rabbit polyclonal anti-HNF6	Santa Cruz Biotechnology	Cat# sc-13050, RRID:AB_2251852	ChIP-seq (4 ug)
Antibody	Rabbit polyclonal anti-laminin	Sigma	Cat# L9393, RRID:AB_477163	Immunohistochemistry (1:100) Immunofluorescence (1:30)
Antibody	Rabbit monoclonal anti-panCrb	Jensen Laboratory, University of Massachusetts, Amherst	N/A	Immunohistochemistry (1:100)
Antibody	Rabbit polyclonal anti-PDX1	Beta Cell Biology Consortium	AB1068	ChIP-seq (4 ug)
Antibody	Rabbit polyclonal anti-SOX9	Chemicon	Cat# 5535, RRID:AB_2239761	ChIP-seq (4 ug)
Cell line (Homo-sapiens)	CyT49	ViaCyte, Inc	NIHhESC-10–0041, RRID:CVCL_B850	Male
Cell line (Homo-sapiens)	H1	WiCell Research Institute	NIHhESC-10–0043, RRID:CVCL_9771	Male
Chemical compound, drug	2-Mercaptoethanol	Thermo Fisher Scientific	Cat# 21985023
Chemical compound, drug	Accutase	Thermo Fisher Scientific	Cat# 00-4555-56
Chemical compound, drug	B-27 supplement	Thermo Fisher Scientific	Cat# 17504044
Chemical compound, drug	Bovine Albumin Fraction V	Life Technologies	Cat# 15260037
Chemical compound, drug	D-(+)-Glucose Solution, 45%	Sigma-Aldrich	Cat# G8769
Chemical compound, drug	DAPI	Invitrogen	Cat# D1306	Immunohistochemistry (1:200)
Chemical compound, drug	DMEM High Glucose	VWR	Cat# 16750–082
Chemical compound, drug	DMEM/F12 [-] L-glutamine	VWR	Cat# 15–090-CV
Chemical compound, drug	DMEM/F12 with L-Glutamine, HEPES	Corning	Cat# 45000–350
Chemical compound, drug	DMF	EMD Millipore	Cat# DX1730
Chemical compound, drug	DPBS	Thermo Fisher Scientific	Cat# 21–031-CV
Chemical compound, drug	DTT	Sigma	Cat# D9779
Chemical compound, drug	Fetal Bovine Serum	Thermo Fisher Scientific	Cat# MT35011CV
Chemical compound, drug	Glutamax	Thermo Fisher Scientific	Cat# 35050–079
Chemical compound, drug	GlutaMAX	Thermo Fisher Scientific	Cat# 35050061
Chemical compound, drug	Hoechst 33342	Thermo Fisher Scientific	Cat# H3570
Chemical compound, drug	HyClone Dulbecco’s Modified Eagles Medium	Thermo Fisher Scientific	Cat# SH30081.FS
Chemical compound, drug	IGEPAL-CA630	Sigma	Cat# I8896
Chemical compound, drug	Illumina tagmentation enzyme	Illumina	Cat# FC-121–1030
Chemical compound, drug	Insulin-Transferrin-Selenium (ITS)	Thermo Fisher Scientific	Cat# 41400045
Chemical compound, drug	Insulin-Transferrin-Selenium-Ethanolamine (ITS-X)	Thermo Fisher Scientific	Cat# 51500–056
Chemical compound, drug	KAAD-Cyclopamine	Toronto Research Chemicals	Cat# K171000
Chemical compound, drug	K-acetate	Sigma	Cat# P5708
Chemical compound, drug	KnockOut SR XenoFree	Thermo Fisher Scientific	Cat# A1099202
Chemical compound, drug	LDN-193189	Stemgent	Cat# 04–0074
Chemical compound, drug	Matrigel	Corning	Cat# 356231
Chemical compound, drug	MCDB 131	Thermo Fisher Scientific	Cat# 10372–019
Chemical compound, drug	Mg-acetate	Sigma	Cat# M2545
Chemical compound, drug	mTeSR1 Complete Kit - GMP	STEMCELL Technologies	Cat# 85850
Chemical compound, drug	NEBNext High-Fidelity 2X PCR Master Mix	NEB	Cat# M0541
Chemical compound, drug	Non-Essential Amino Acids	Thermo Fisher Scientific	Cat# 11140050
Chemical compound, drug	O.C.T. Compound	Sakura Finetek USA	Cat# 25608–930
Chemical compound, drug	Penicillin-Streptomycin	Thermo Fisher Scientific	Cat# 15140122
Chemical compound, drug	Polyethylenimine (PEI)	Polysciences	Cat# 23966–1
Chemical compound, drug	Protease inhibitor	Roche	Cat# 05056489001
Chemical compound, drug	Retinoic acid	Sigma-Aldrich	Cat# R2625
Chemical compound, drug	RNA ScreenTape Sample Buffer	Agilent Technologies	Cat# 5067–5577
Chemical compound, drug	ROCK Inhibitor Y-27632	STEMCELL Technologies	Cat# 72305
Chemical compound, drug	RPMI 1640 [-] L-glutamine	VWR	Cat# 15–040-CV
Chemical compound, drug	SANT-1	Sigma-Aldrich	Cat# S4572
Chemical compound, drug	Sodium Bicarbonate	Sigma-Aldrich	Cat# NC0564699
Chemical compound, drug	Tamoxifen	Sigma	Cat# T5648
Chemical compound, drug	TGF-β RI Kinase Inhibitor IV	Calbiochem	Cat# 616454
Chemical compound, drug	TPB	Calbiochem	Cat# 565740
Chemical compound, drug	Tranylcypromine	Cayman Chemical	Cat# 10010494
Chemical compound, drug	Tris-acetate	Thermo Fisher Scientific	Cat# BP-152
Chemical compound, drug	TTNPB	Enzo Life Sciences	Cat# BML-GR105
Chemical compound, drug	Vectashield Antifade Mounting Medium	Vector Laboratories	Cat# H-1000
Chemical compound, drug	XtremeGene 9	Roche	Cat# 6365787001
Commercial assay	High Sensitivity D1000 ScreenTape	Agilent Technologies	Cat# 5067–5584
Commercial assay, kit	RNA ScreenTape	Agilent Technologies	Cat# 5067–5576
Commercial assay, kit	RNA ScreenTape Ladder	Agilent Technologies	Cat# 5067–5578
Commercial assay, kit	BD Cytofix/Cytoperm Plus Fixation/Permeabilization Solution Kit	BD Biosciences	Cat# 554715
Commercial assay, kit	ChIP-IT High Sensitivity Kit	Active Motif	Cat# 53040
Commercial assay, kit	iQ SYBR Green Supermix	Bio-Rad	Cat# 1708884
Commercial assay, kit	iScript cDNA Synthesis Kit	Bio-Rad	Cat# 1708891
Commercial assay, kit	KAPA Library Preparation Kit (Illumina)	Kapa Biosystems	Cat# KK8234
Commercial assay, kit	KAPA Stranded mRNA-Seq Kits	Kapa Biosystems	Cat# KK8401
Commercial assay, kit	MinElute PCR purification kit	QIAGEN	Cat# 28004
Commercial assay, kit	Qubit ssDNA assay kit	Thermo Fisher Scientific	Cat# Q10212
Commercial assay, kit	RNeasy Micro Kit	QIAGEN	Cat# 74004
Genetic reagent (D. rerio)	Tg(ptf1a:eGFP)^jh1	PMID:16258076	N/A
Other	SPRIselect bead	Beckman Coulter	Cat# B23317
Recombinant protein	Activin A	R and D Systems	Cat# 338-AC/CF
Recombinant protein	Human AB Serum	Valley Biomedical	Cat# HP1022
Recombinant protein	Recombinant EGF	R and D Systems	Cat# 236-EG
Recombinant protein	Recombinant Heregulinβ−1	Peprotech	Cat# 100–03
Recombinant protein	Recombinant KGF/FGF7	R and D Systems	Cat# 251 KG
Recombinant protein	Recombinant Mouse Wnt3A	R and D Systems	Cat# 1324-WN/CF
Recombinant protein	Recombinant Noggin	R and D Systems	Cat# 3344 NG
Sequence-based reagent	Px333 Plasmid	http://www.addgene.org/64073/	RRID:Addgene_64073
Sequence-based reagent	LAMA1 Forward	This paper	qPCR primers	GTG ATG GCA ACA GCG CAA A
Sequence-based reagent	LAMA1 Reverse	This paper	qPCR primers	GAC CCA GTG ATA TTC TCT CCC A
Sequence-based reagent	CRB2 Forward	This paper	qPCR primers	ACC ACT GTG CTT GTC CTG AG
Sequence-based reagent	CRB2 Reverse	This paper	qPCR primers	TCC AGG GTC GCT AGA TGG AG
Sequence-based reagent	TBP Forward	This paper	qPCR primers	TGT GCA CAG GAG CCA AGA GT
Sequence-based reagent	TBP Reverse	This paper	qPCR primers	ATT TTC TTG CTG CCA GTC TGG
Sequence-based reagent	LAMA1Enh Upstream Guide	This paper	CRISPR sgRNA	GTC AAA TTG CTA TAA CAC GG
Sequence-based reagent	LAMA1Enh Downstream Guide	This paper	CRISPR sgRNA	CCA CTT TAA GTA TCT CAG CA
Sequence-based reagent	CRB2Enh Upstream Guide	This paper	CRISPR sgRNA	ATA CAA AGC ACG TGA GA
Sequence-based reagent	CRB2Enh Downstream Guide	This paper	CRISPR sgRNA	GAA TGC GGA TGA CGC CTG AG
Sequence-based reagent	lama1-ATG	PMID:16321372	Morpholino	TCA TCC TCA TCT CCA TCA TCG CTC A Obtained from GeneTools, LLC
Sequence-based reagent	crb2a-SP	PMID:16713951	Morpholino	ACG TTG CCA GTA CCT GTG TAT CCT G Obtained from GeneTools, LLC
Sequence-based reagent	crb2b-SP	PMID:16713951	Morpholino	TAA AGA TGT CCT ACC CAG CTT GAA C Obtained from GeneTools, LLC
Sequence-based reagent	standard control MO	N/A	Morpholino	CCT CTT ACC TCA GTT ACA ATT TAT A Obtained from GeneTools, LLC
Software, algorithm	Adobe Illustrator v 5.1	http://www.adobe.com/products/illustrator.html	RRID:SCR_014198
Software, algorithm	Adobe Photoshop v 5.1	http://www.adobe.com/products/photoshop.html	RRID:SCR_014199
Software, algorithm	BEDtools v 2.26.0	https://github.com/arq5x/bedtools2	RRID:SCR_006646
Software, algorithm	Bioconductor	https://www.bioconductor.org/	RRID: SCR_006442
Software, algorithm	Burrows-Wheeler Aligner v 0.7.13	http://bio-bwa.sourceforge.net/	RRID:SCR_010910
Software, algorithm	CENTIPEDE v 1.2	http://centipede.uchicago.edu/	N/A
Software, algorithm	Cufflinks v 2.2.1	http://cole-trapnell-lab.github.io/cufflinks/	RRID:SCR_014597
Software, algorithm	deepTools2 v 3.1.3	https://deeptools.readthedocs.io/en/develop/content/installation.html	N/A
Software, algorithm	DESeq2 v 3.10	https://bioconductor.org/packages/release/bioc/html/DESeq2.html	RRID:SCR_015687
Software, algorithm	FlowJo v10 software	https://www.flowjo.com/solutions/flowjo	RRID: SCR_008520
Software, algorithm	GraphPad Prism v 8.1.2	https://www.graphpad.com/scientific-software/prism/	RRID: SCR_002798
Software, algorithm	HOMER v 4.10.4	http://homer.ucsd.edu/homer/	RRID: SCR_010881
Software, algorithm	Juicebox Tools v 1.4	https://github.com/aidenlab/Juicebox/wiki/Juicebox-Assembly-Tools	N/A
Software, algorithm	MACS2 v 2.1.4	http://liulab.dfci.harvard.edu/MACS/	RRID:SCR_013291
Software, algorithm	MEME suite v 5.1.1	http://meme-suite.org/	RRID:SCR_001783
Software, algorithm	Metascape	http://metscape.ncibi.org	RRID:SCR_014687
Software, algorithm	Picard Tools v 1.131	http://broadinstitute.github.io/picard/	RRID:SCR_006525
Software, algorithm	R Project for Statistical Computing v 3.6.1	http://www.r-project.org/	RRID:SCR_001905
Software, algorithm	SAMtools v 1.5	http://samtools.sourceforge.net	RRID:SCR_002105
Software, algorithm	STAR v 2.4	https://github.com/alexdobin/STAR	N/A
Software, algorithm	UCSC Genome Browser	http://genome.ucsc.edu/	RRID:SCR_005780
Software, algorithm	vcf2diploid v 0.2.6a	https://github.com/abyzovlab/vcf2diploid	N/A
Software, algorithm	ZEISS ZEN Digital Imaging for Light Microscopy	http://www.zeiss.com/microscopy/en_us/products/microscope-software/zen.html#introduction	RRID:SCR_013672

	H3K4me1	H3K27ac	CTCF	H3K4me3	H3K27me3
ES	0.90	0.91	0.87	0.81	1.00
DE	0.97	0.84	0.86	0.99	0.99
GT	0.97	0.87	0.89	0.97	0.99
PP1	0.97	0.85	0.89	0.96	0.99
PP2	0.98	0.87	0.87	0.97	1.00

ES	0.95
DE	0.83
GT	1.00
PP1	1.00
PP2	1.00

Share this article

Cite this article

T2D-associated risk variants are enriched in stretch enhancers of pancreatic progenitors independent of islet stretch enhancers.

Candidate target genes of pancreatic progenitor-specific stretch enhancers regulate developmental processes.

Figure 2—source data 1

Figure 2—source data 2

Figure 2—source data 3

Figure 2—source data 4

Identification of T2D risk variants associated with pancreatic progenitor-specific stretch enhancers.

A T2D risk-associated LAMA1 pancreatic progenitor-specific stretch enhancer regulates LAMA1 expression specifically in pancreatic progenitors.

Figure 4—source data 1

Figure 4—source data 2

A T2D risk-associated CRB2 pancreatic progenitor-specific stretch enhancer regulates CRB2 expression specifically in pancreatic progenitors.

Figure 5—source data 1

lama1 and crb2 regulate pancreas morphogenesis and beta cell differentiation.

Author details

Ryan J Geusz

Contribution

Contributed equally with

Competing interests

Allen Wang

Contribution

Contributed equally with

Competing interests

Joshua Chiou

Contribution

Contributed equally with

Competing interests

Joseph J Lancman

Contribution

Contributed equally with

Competing interests

Nichole Wetton

Contribution

Competing interests

Samy Kefalopoulou

Contribution

Competing interests

Jinzhao Wang

Contribution

Competing interests

Yunjiang Qui

Contribution

Competing interests

Jian Yan

Contribution

Competing interests

Anthony Aylward

Contribution

Competing interests

Bing Ren

Contribution

Competing interests

P Duc Si Dong

Contribution

Competing interests

Kyle J Gaulton

Contribution

For correspondence

Competing interests

Maike Sander

Contribution

For correspondence

Competing interests

Citations by DOI

Downloads (link to download the article as PDF)

Open citations (links to open the citations from this article in various online reference manager services)

Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)

Categories and tags

Research organisms