3D genomic features across >50 diverse cell types reveal insights into the genomic architecture of childhood obesity

  1. Khanh B Trang
  2. Matthew C Pahl
  3. James A Pippin
  4. Chun Su
  5. Sheridan H Littleton
  6. Prabhat Sharma
  7. Nikhil N Kulkarni
  8. Louis R Ghanem
  9. Natalie A Terry
  10. Joan M O'Brien
  11. Yadav Wagley
  12. Kurt D Hankenson
  13. Ashley Jermusyk
  14. Jason Hoskins
  15. Laufey T Amundadottir
  16. Mai Xu
  17. Kevin Brown
  18. Stewart Anderson
  19. Wenli Yang
  20. Paul Titchenell
  21. Patrick Seale
  22. Klaus H Kaestner
  23. Laura Cook
  24. Megan Levings
  25. Babette S Zemel
  26. Alessandra Chesi
  27. Andrew D Wells
  28. Struan FA Grant  Is a corresponding author
  1. Center for Spatial and Functional Genomics, The Children's Hospital of Philadelphia, United States
  2. Division of Human Genetics, The Children's Hospital of Philadelphia, United States
  3. Cell and Molecular Biology Graduate Group, Perelman School of Medicine, University of Pennsylvania, United States
  4. Department of Genetics, Perelman School of Medicine, University of Pennsylvania, United States
  5. Department of Pathology, The Children's Hospital of Philadelphia, United States
  6. Division of Gastroenterology, Hepatology, and Nutrition, Children's Hospital of Philadelphia, United States
  7. Scheie Eye Institute, Department of Ophthalmology, Perelman School of Medicine, University of Pennsylvania, United States
  8. Penn Medicine Center for Ophthalmic Genetics in Complex Disease, United States
  9. Department of Orthopedic Surgery University of Michigan Medical School Ann Arbor, United States
  10. Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, United States
  11. Department of Child and Adolescent Psychiatry, Children's Hospital of Philadelphia, United States
  12. Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, United States
  13. Institute for Diabetes, Obesity and Metabolism, Perelman School of Medicine, University of Pennsylvania, United States
  14. Department of Cell and Developmental Biology, Perelman School of Medicine, University of Pennsylvania, United States
  15. Department of Physiology, Perelman School of Medicine, University of Pennsylvania, United States
  16. Department of Microbiology and Immunology, University of Melbourne, Peter Doherty Institute for Infection and Immunity, Australia
  17. Department of Critical Care, Melbourne Medical School, University of Melbourne, Australia
  18. Division of Infectious Diseases, Department of Medicine, University of British Columbia, Canada
  19. Department of Surgery, University of British Columbia, Canada
  20. BC Children's Hospital Research Institute, Canada
  21. School of Biomedical Engineering, University of British Columbia, Canada
  22. Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, United States
  23. Department of Pathology, Perelman School of Medicine, University of Pennsylvania, United States
  24. Institute for Immunology, Perelman School of Medicine, University of Pennsylvania, United States
  25. Division Endocrinology and Diabetes, The Children's Hospital of Philadelphia, United States
  26. Penn Neurodegeneration Genomics Center, Perelman School of Medicine, University of Pennsylvania, United States

eLife Assessment

This important study presents genome-wide high-resolution chromatin-based 3D genomic interaction maps for over 50 diverse human cell types and integrates these data with pediatric obesity GWAS. The work provides convincing evidence that multiple pancreatic islet cell types are key effector cell types. The authors also perform variant-to-gene mapping to nominate genes underlying several GWAS hits. Overall, the results will be of interest to both the fields of 3D genome architecture and pediatric obesity.

https://doi.org/10.7554/eLife.95411.3.sa0

Abstract

The prevalence of childhood obesity is increasing worldwide, along with the associated common comorbidities of type 2 diabetes and cardiovascular disease in later life. Motivated by evidence for a strong genetic component, our prior genome-wide association study (GWAS) efforts for childhood obesity revealed 19 independent signals for the trait; however, the mechanism of action of these loci remains to be elucidated. To molecularly characterize these childhood obesity loci, we sought to determine the underlying causal variants and the corresponding effector genes within diverse cellular contexts. Integrating childhood obesity GWAS summary statistics with our existing 3D genomic datasets for 57 human cell types, consisting of high-resolution promoter-focused Capture-C/Hi-C, ATAC-seq, and RNA-seq, we applied stratified LD score regression and calculated the proportion of genome-wide SNP heritability attributable to cell type-specific features, revealing pancreatic alpha cell enrichment as the most statistically significant. Subsequent chromatin contact-based fine-mapping was carried out for genome-wide significant childhood obesity loci and their linkage disequilibrium proxies to implicate effector genes, yielded the most abundant number of candidate variants and target genes at the BDNF, ADCY3, TMEM18, and FTO loci in skeletal muscle myotubes and the pancreatic beta-cell line, EndoC-BH1. One novel implicated effector gene, ALKAL2 – an inflammation-responsive gene in nerve nociceptors – was observed at the key TMEM18 locus across multiple immune cell types. Interestingly, this observation was also supported through colocalization analysis using expression quantitative trait loci (eQTL) derived from the Genotype-Tissue Expression (GTEx) dataset, supporting an inflammatory and neurologic component to the pathogenesis of childhood obesity. Our comprehensive appraisal of 3D genomic datasets generated in a myriad of different cell types provides genomic insights into pediatric obesity pathogenesis.

Introduction

The prevalence of obesity has risen significantly worldwide (NCD Risk Factor Collaboration (NCD-RisC), 2017), especially among children and adolescents (Bryan et al., 2021). Obesity is associated with chronic diseases, such as diabetes, cardiovascular diseases, and certain cancers (GBD 2015 Obesity Collaborators, 2017; Singh et al., 2013; Wormser et al., 2011; Lauby-Secretan et al., 2016), along with mechanical issues including osteoarthritis and sleep apnea (Fontaine and Barofsky, 2001).

Modern lifestyle factors, including physical inactivity, excessive caloric intake, and socioeconomic inequity, along with disrupted sleep and microbiome, represent environmental risk factors for obesity pathogenesis. However, genetics also play a significant role, with the estimated heritability ranging from 40% to 70% (Loos and Yeo, 2022; Maes et al., 1997; Elks et al., 2012). Studies show that body weight and obesity remain stable from infancy to adulthood (Demerath et al., 2007; Dubois et al., 2007; Wardle et al., 2008; Bouchard, 2009), but variation between individuals does exist (Littleton et al., 2020). Genome-wide association studies (GWAS) have improved our understanding of the genetic contribution to childhood obesity (Vogelezang et al., 2020; Yaghootkar et al., 2020; Couto Alves et al., 2019; Fu et al., 2019; Turcot et al., 2018; Littleton and Grant, 2022). However, the functional consequences and molecular mechanisms of identified genetic variants in such GWAS efforts are yet to be fully elucidated. Efforts are now being made to predict target effector genes and explore potential drug targets using various computational and experimental approaches (Yu et al., 2022; Avsec et al., 2021; Gazal et al., 2022; Zhou and Troyanskaya, 2015; Nasser et al., 2021), which subsequently warrant functional follow-up efforts.

With our extensive datasets generated on a range of different cell types, by combining 3D chromatin maps (Hi-C, Capture-C) with matched transcriptome (RNA-seq) and chromatin accessibility data (ATAC-seq), we investigated heritability patterns of pediatric obesity-associated variants and their gene-regulatory functions in a cell type-specific manner. This approach yielded 94 candidate causal variants mapped to their putative effector gene(s) and corresponding cell type(s) setting. In addition, using methods comparable to our prior efforts in other disease contexts (Chesi et al., 2019; Su et al., 2020; Pahl et al., 2021; Cousminer et al., 2021; Vujkovic et al., 2022; Pahl et al., 2022; Su et al., 2022; Palermo et al., 2023), we also uncovered new variant-to-gene combinations within specific novel cellular settings, most notably in immune cell types, which further confirmed the involvement of the immune system in the pathogenesis of obesity in the early stages of life.

Results

Enrichment assessment of childhood obesity variants across cell types

To explore the enrichment of childhood obesity GWAS variants across cell types, we carried out Partitioned Linkage Disequilibrium Score Regression (LDSR) (Finucane et al., 2015) on all ATAC-seq-defined OCRs for each cell type. We assessed cell-type-specific enrichment of GWAS signals in four main categories of genomic regions (Figure 1A): (1) Total OCRs: open chromatin regions defined by ATAC-seq; (2) Promoter OCRs: the subset of OCRs overlapping a gene promoter; (3) cREs: the subset of OCRs that form chromatin loops (as determined by Hi-C/Promoter Capture-C) with a gene promoter, and are therefore considered putative enhancers or suppressors regulating gene expression; (4) cREs ±500bases: extended cREs by 500 bases in both directions. The rationale behind this approach is that different GWAS variants can influence phenotypes by regulating gene expression in a cell-type specific manner through various regulatory mechanisms. For example, they may alter enhancer function (cREs category) or affect the binding of a transcription factor at a gene’s promoter (Promoter OCRs category).

Partitioned Linkage Disequilibrium Score Regression analysis for open chromatin regions of all cell types.

(A) The schematic shows the different types of regions defined in our study and 3 different ways overlapping chromatin contact regions – OCRs – gene promoters define cREs. (B) Heritability enrichment by LDSC analysis for each cell type. (a) Bar-plot shows the total number of OCRs identified by ATAC-seq for each cell type on bulk cells - blue, or on single cells – red; and the portion of OCRs that fall within cREs identified by incorporating Hi-C (green) or by Capture-C (orange). 4 panels of dot-plots show heritability enrichment by LDSC analysis for each cell type, with standard error whiskers. Dots’ colors correspond to -log10(p-values), dots with white asterisks are significant P-values <0.05, and dots’ sizes corresponding to the proportion of SNP contribute to heritability. Dash line at 1, i.e., no enrichmen. (b) Analysis done on whole OCRs set of each cell type (whiskers colors match with bulk/single cell from bar-plot a); (c) On only OCRs that overlapped with promoters (whiskers’ colors match with 623 bulk/single cell from bar-plot a); (d) On the putative cREs of each cell type (whiskers’ colors match with Hi-C/Capture-C from bar-plot a); (e) On the same cREs as (c) panel with their genomic positions expanded ±500 bases on both sides (whiskers’ colors match with Hi-C/Capture-C from bar-plot a).

We observed that 41 of 57 cell types – including 22 metabolic, 21 immune, 7 neural cell types, and 7 independent cell lines (Supplementary file 1a) – showed at least a degree of directional enrichment with the total set of OCRs (Figure 1B Total OCRs). However, only four cell types – two pancreatic alpha and two pancreatic beta cell-based datasets – had statistically significant enrichments (p<0.05). These enrichments were less pronounced when focusing on promoter OCRs only (Figure 1B Promoter OCRs). To further limit the LD enrichment assessment to just those OCRs that can putatively regulate gene expression via chromatin contacts with gene promoters, we used the putative cREs (Chesi et al., 2019; Pahl et al., 2021). This reduced the number of cell types showing at least nominal enrichment (31 of 57), enlarged the dispersion of enrichment ranges across different cell types, increased the 95% confidence intervals (CI) of enrichments, and hence increased the P-value of the resulting regression score. cREs from pancreatic alpha cells derived from single-cell ATAC-seq were the only dataset that remained statically significant (Figure 1B putative cREs).

The original reported LDSR method analyzed enrichment in the 500 bp flanking regions of their regulatory categories (Finucane et al., 2015). However, when we expanded our analysis to the ± 500 bp window for our cREs, albeit incorporating more weighted variants into the enrichment (represented by larger dots in Figure 1B cREs ± 500 bases), this resulted in a decrease in the number of cell types yielding at least nominal enrichment (26 cell types), the enrichment range across cell types, the 95% CI, and level of significance. The pancreatic alpha cell observation also dropped below the bar for significance with this expanded window definition.

Consistency and diversity of childhood obesity proxy variants mapped to cREs

Despite the enrichments above only being limited to just a small number of cell types, it is likely that individual loci have differing levels of contributions in various cellular contexts and could not be detected at the genome wide assessment scale. As such we elected to further explore the candidate effector genes that are directly affected by cREs harboring childhood obesity-associated variants by systematically mapping the genomic positions of the LD proxies onto each cell type’s cREs. Most proxies fall within chromatin contact regions (blue area in Venn diagram Figure 2A) or OCRs (yellow area) or open chromatin contact regions (red area), or completely outside (white area) any defined region. Only 94 proxies fall within our defined cREs (overlapped area with dotted green border in Figure 2A), they clustered at 13 original loci (Supplementary file 1b). Figure 2—figure supplement 1 outlines the number of signals at each locus included or excluded based on the criteria we defined for our regions of interest. The TMEM18 locus yielded the most variants through cREs mapping, with 46 proxies for the two lead independent variants, rs7579427 and rs62104180. The second most abundant locus was ADCY3, with 21 proxies for lead variant rs4077678 (Figure 2B). The higher number of variants at one locus did not correlate with implicating more genes or cell types through mapping. The mapping frequency of various variants within a specific locus exhibited substantial differences (Figure 2—figure supplement 2).

Figure 2 with 6 supplements see all
Mapping 771 proxies to the open chromatin regions of each cell type.

(A) Venn diagram shows how 771 proxies mapped to the OCRs: · Blue area: 758 proxies were located within contact regions of at least one cell type regardless of chromatin state; · Red area: 417 proxies were located within contact regions marked as open by overlapping with OCR; · Yellow area: If we only considered open chromatin regions, 178 proxies were included; · Dotted green bordered area: To focus on just those variants residing within open chromatin and contacting promoter regions in any cell type, we overlapped the genomic positions of these proxies with each cell type’s cRE set, yielding 90 variants (3 from the 99% credible set) directly contacting open gene promoters (Supplementary file 1b), with 10 of which located within a promoter of one gene but contacting another different gene promoter. There were an additional 4 variants located within gene promoters but in chromatin contact with promoter(s) of nearby transcript(s) of the same gene (correspond to 3 cREs illustrations in Figure 1A). · White area: proxies that fall into neither defined region of interest. (B) Bar-plot shows number of proxies, cell types and target genes mapped at each locus. (C) The upSet plot shows the degree of overlap across cell types of the variants; ranked from the most common variant (red) – rs61888800 from BDNF locus, a well-known 5' untranslated region variant of this gene that is associated with anti-depression and therapeutic response (Licinio et al., 2009; Colle et al., 2015) – appeared in 39 cell types, to the group of variants (grey) which appeared in only one cell type.

Inspecting individual variants regardless of their locus, we found that 28 of 94 proxies appeared in cREs across multiple cell types, with another 66 observed in just one cell type (Figure 2C). 45 variants of these 66 just contacted one gene promoter, such as at the GPR1 and TFAP2B loci (Figure 2—figure supplement 3).

Overall, the number of cell types in which a variant was observed in open chromatin correlated with the number of genes contacted via chromatin loops (Figure 2—figure supplement 4). However, we also observed that some variants found in cREs in multiple cell types were more selective with respect to their candidate effector genes (Figure 2—figure supplement 5 - red arrow), or conversely, more selective across given cell types but implicated multiple genes (Figure 2—figure supplement 5 - blue arrow). Figure 2—figure supplement 6 outlines our observations at the TMEM18 locus – an example locus involved in both scenarios.

Implicated genes cluster at loci strongly associated with childhood obesity consistently across multiple cell types

Mapping the variants across all the cell types resulted in a total of 111 implicated childhood obesity candidate effector genes (Table 1). Among these, 45 genes were specific to just one cell type (Figure 3—figure supplement 1), including 13 in myotubes and 7 in natural killer cells. Conversely and notably, BDNF appeared across 42 different cell types. Across the metabolic, neural, and immune systems and seven other cell lines, there were nine genes consistently implicated in all four categories (top panel Figure 3 red stars, Figure 3—figure supplement 2 ‘all’), while five genes were consistently implicated in metabolic, neural, and immune systems (top panel Figure 3 – blue stars, Figure 3—figure supplement 2 ‘all_main’). Two genes, ADCY3 and BDNF, had variants both at their promoters and contacted variants in cREs via chromatin loops (Figure 3—figure supplement 3).

Figure 3 with 3 supplements see all
Profiles of 111 implicated genes by 94 proxies through cREs of each cell type.

Main panel: Bubble plot show corresponding expression level (size) and number of variants (color) target each implicated gene of each cell type. Squares represent genes with variants at their promoters. Circles represent genes with variants contacted through chromatin loops. Some genes were implicated by both types, these ‘double implications’ are represented as diamond shapes, and were identified across several cell types: two cell types (plasmacytoid dendritic cells and pre-differentiated adipocytes) for ADCY3 gene, and five for BDNF (human embryonic stem cells - hESC, differentiated human fetal osteoblast cells - hFOB_Diff, neural progenitor cells derived from induced pluripotent stem cells - NPC_iPSC, PANC-1, and NCIH716 cell lines). Genes with expression undetected in our arrays are shown as triangles. Top panel: bar-plot shows numbers of cell types each gene was implicated within, color-coded by which systems the cell types belong to. Right panel: bar-plot shows numbers of genes implicated by the variants with each cell type.

Table 1
PubMed-query known functions for 111 genes implicated by obesity variants.
LocusImplicated genesObesity or related traitsDifferent traits
TNNI3KLRRIQ3(NA)Associated with opioid usage [PMID:34728798] and MDD [PMID: 31748543]
FPGTPredict BMI in Korean pop. [PMID: 28674662](NA)
FPGT-TNNI3KAssociated with MDD [PMID: 31748543]
LRRC53Associated with high BMI increased risk heart attack [PMID: 32471361](NA)
ASTN1Identified as obesity QTL in rat [PMID: 35729251]Associated with neurodevelopmental traits [PMID: 24381304] and variety of cancers [PMID: 32945491]
BRINP2(NA)Associated with neurodevelopmental traits [PMID: 34267256]
SEC16BAL122019.1(NA)
AL162431.1
TMEM18FAM110C(NA)Overexpression induces microtubule aberrancies [PMID: 17499476], involved in cell spreading and migration [PMID: 19698782]
SH3YL1Associated with BMI in type 2 diabetes nephropathy [PMID: 33223406]Influence on T cell activation [PMID: 31427643], involved in different cancer types [PMID: 26305679,24508479]
ACP1Associated with early-onset obesity [PMID: 24129437], correlated with cardiovascular risks [PMID: 19570551], drive adipocyte differentiation via control of pdgfrα signaling [PMID: 33615467]Associated with bipolar disorder [PMID: 31830721]
ALKAL2Associated with childhood BMI [PMID: 33627773]Enhance expression in response to inflammatory pain in nociceptors [PMID: 35608912, 35610945]
MYT1LAssociated with early-onset obesity [PMID: 24129437](NA)
AC079779.1(NA)
AC079779.2
AC079779.3
AC079779.4
LINC01865
AC105393.2
AC105393.1
LINC01874
LINC01875
AC093326.1
AC141930.2
ADCY3ITSN2(NA)Regulate T-cells function [PMID: 32618424] and help the interaction with B-cells [PMID: 29337666]
NCOA1Meta-inflammation gene [PMID: 25647480], reduce adipogenesis, shift the energy balance between white and brown fat [PMID: 31133421](NA)
ADCY3Regulate/impair MC4R within energy-regulating melanocortin signaling pathway [PMID: 29311635,32955435]
DNAJC27-AS1Linked to obesity, diabetes traits [PMID: 30131766]
DNAJC27Linked to obesity, diabetes traits [PMID: 30131766]
EFR3BAssociated with T1D [PMID: 21980299], down-regulated in rare obesity-related disorder [PMID: 25705109]
WDR43(NA)Associated with breast cancer [PMID: 27117709]
AC013267.1(NA)
RF00016
GPR1GPR1Increase expression in obese phenotype [PMID: 34174278](NA)
TFAP2BTFAP2D(NA)Involve in embryogenesis [PMID: 12711551]
CALCRHEPACAM2Associated with colorectal cancer [PMID: 29659199, 29973580]
VPS50Involve in neurodevelopmental disorders and defects [PMID: 30828385, 34037727]
MIR653Involve in different types of cancer [PMID: 35777307]
MIR489Promote adipogenesis in mice [PMID: 34004251](NA)
CALCRAssociated with BMI and control of food-intake [PMID: 34462445, 34210852, 31955990, 29522093]
TFPI2(NA)Involved in colorectal cancer [PMID: 35004840, 34092617, 25902909]
BET1Involved in triacylglycerol metabolism [PMID: 24423365]Associated with muscular dystrophy [PMID: 34310943, 34779586]
AC003092.1(NA)Association with glioblastoma [PMID: 33815468, 30442884]
AC002076.1(NA)
BDNFLIN7CAssociated in T2D [PMID: 20215397], obesity [PMID: 23044507]Associated with psychopathology [PMID: 23044507]
BDNF-ASRegulate BDNF and LIN7C expression [PMID: 22960213, 22446693](NA)
BDNFRegulate eating behavior and energy balance [PMID: 34556834]
MIR610(NA)Involve in different types of cancer [PMID: 34408418, 29228616, 26885452]
KIF18AInvolve in different types of cancer [PMID: 35591854, 35286090]
METTL15Associated with childhood obesity [PMID: 31504550](NA)
AC090124.1(NA)Reported to differentially prognostic of pancreatic cancer [PMID: 34307375]
ARL14EPInvolve in WAGR syndrome [PMID: 36011342, 31511512]
DCDC1Involvement with eyes anomalies [PMID: 34773354, 34703991]
THEM7PAssociated with mechanisms underlying inguinal hernia [PMID: 34392144]
AL035078.2(NA)
ELP4
LINC00678
AC023206.1
RN7SKP158
AC104978.1
MIR8068
AC013714.1
AC100773.1
AC090833.1
AC090791.1
AC110056.1
AL035078.2
FAIM2PRPF40B(NA)Splicing regulator involved in T-cell development [PMID: 31088860, 34323272]
TMBIM6Deficiency leads to obesity by increasing Ca2+-dependent insulin secretion [PMID: 32394396]Immune cell function and survival [PMID: 26470731]
BCDIN3DAssociated with obesity, T2D [PMID: 20215397](NA)
FAIM2Associated with childhood obesity [PMID: 31504550]
AQP2Associated with obesity, diabetes [PMID: 33367818]
AQP5Associated with non-obese diabetes [PMID: 25635992,22320885]Responsible for transporting water, involve in Sjogren’s syndrome [PMID: 25635992, 31557796]
AQP6Down-regulated in retina in diabetes [PMID: 21851171]Associated with renal diseases [PMID: 30654539]
RACGAP1Involve in diabetes nephropathy [PMID: 35222021](NA)
ASIC1Inhibition increase food intake and decrease energy expenditure [PMID: 35894166]
LSM6P2(NA)
RPL35AP28
LINC02396
LINC02395
AC025154.1
AC025154.2
ADCY9SLX4(NA)Associated with blood pressure [PMID: 30671673]
DNASE1Associated with obesity hypertension [PMID: 33351325](NA)
TRAP1Involve in global metabolic network, deletion reduce obesity incidence [PMID: 25088416]
CREBBPAssociated with high adiposity and low cardiometabolic risk [PMID: 33619380]
ADCY9Asoociated with BMI, obesity [PMID: 33619380, 23563607]
SRL(NA)Involve in cardiac dysfunction [PMID: 22119571]
LINC01569Associated with cancer and endometriosis [PMID: 35341703, 34422671]
TFAP4Associated with BMI, birth weight, maternal glycemic [PMID: 35708509](NA)
AC012676.1(NA)Involve in hepatocellular carcinoma [PMID: 35210216]
AC009171.2(NA)
FTOFTOMost extensively studied obesity locus [PMID: 34556834](NA)
IRX3Obesogenic effects in adipocytes [PMID: 26760096], brain [PMID: 24646999], pancreas[93]
IRX5
AC018553.1(NA)Associated with melanoma [PMID: 35611195]
CRNDERegulator of angiogenesis in obesity-induced diabetes [PMID: 31863035](NA)
MMP2Involve in obesity-relate angiogenesis [PMID: 35919566]
CAPNS2(NA)Associated with thyroid-related traits [PMID: 23408906]
AMFRInvolve in hepatic lipid metabolism [PMID: 33591966](NA)
CETPInvolve in monogenic hyperalphalipoproteinemia [PMID: 34878751]
RPGRIP1LHypomorphism of this ciliary gene linked to morbid obesity [PMID: 27064284, 30597647, 29657248]Required for hypothalamic arcuate neuron development [PMID: 30728336]
LINC02169(NA)Associated with occupational exposure to gases/fumes and mineral dust [PMID: 31152171]
AC007491.1(NA)
AC018553.2
LINC02140
AC106738.1
AC106738.2
MTND5P34
AC007336.1
MC4RAC090771.1(NA)

At the TMEM18 locus on chr 2p25.3, a highly significant human obesity locus that has long been associated with both adult and childhood obesity, we observed differing degrees of evidence for 16 genes, but noted that rs6548240, rs35796073, and rs35142762 consistently contacted the SH3YL1, ACP1, and ALKAL2 promoters across multiple cell types (Figure 2C -third and fourth column).

At the chr 2p23 locus, ADCY3 yielded the most contacts (i.e. many proxies contacting the same gene via chromatin loops), suggesting this locus acts as a regulatory hub. However, we observed a similar composition in cell types for four other genes: DNAJC27, DNAJC27-AS1 (both previously implicated in obesity and/or diabetes traits Cherian et al., 2018), AC013267.1, and SNORD14 (RF00016). ITSN2, NCOA1, and EFR3B were three genes within this locus that were only implicated in immune cell types. NCOA1 encodes a prominent meta-inflammation factor (Rollins et al., 2015) known to reduce adipogenesis and shift the energy balance between white and brown fat, and its absence known to induce obesity (Mohsen G et al., 2019).

CALCR was the most frequently implicated gene at its locus, supported by 20 cell types across all systems. While within the BDNF locus, METTL15 and KIF18A – two non-cell-type-specific genes - plus some lncRNA genes, were contacted by childhood obesity-associated proxies within the same multiple cell types as BNDF, again suggesting the presence of a regulatory hub.

At the FAIM2 locus on chr 12q13.12, we observed known genes associated with obesity, eating patterns, and diabetes-related traits, including ASIC1, AQP2, AQP5, AQP6, RACGAP1, and AC025154.2 (AQP5-AS1) along with FAIM2 (Table 1). These genes were harbored within cREs of astrocytes, neural progenitors, hypothalamic neurons, and multiple metabolic cell types. Plasmacytoid and CD1c+conventional dendritic cells were the only two immune cell types that harbored such proxies within their cREs, implicating ASIC1, PRPF40B, RPL35AP28, TMBIM6, and LSM6P2 at the FAIM2 locus.

The independent ADCY9 and FTO loci are both located on chromosome 16. Genes at the ADCY9 locus were only implicated in a subset of immune cell types. Interestingly, genes at the FTO locus were only implicated in Hi-C datasets (as opposed to Capture C), including 6 metabolic cell types and astrocytes. Most genes at the FTO locus were implicated in skeletal myotubes, differentiated osteoblasts, and astrocytes, namely FTO and IRX3; while IRX5, CRNDE, and AC106738.1 were also implicated in adipocytes and hepatocytes.

The most implicated cell types by two sets of analyses

EndoC-BH1 and myotubes are the two cell types in which we implicated the most effector genes, with 38 and 42, respectively – Figure 3 side panel. This phenomenon is likely proportional in the case of myotubes, given the large number of cREs identified by overlapped Hi-C contact data and ATAC-seq open regions (Figure 1A), but not for EndoC-BH1. Albeit harboring an average number of cREs compared to other cell types, EndoC-BH1 cells were consistently among the top-ranked heritability estimates for the childhood obesity variants resulting from the EGG consortium GWAS (Figure 1) and harbored a significant number of implicated genes by the mapping of proxies. Interestingly, the pancreatic alpha cell type – shown above to be the most significant for heritability estimate by LDSC – revealed only six implicated genes contacted by the defined proxies, namely BDNF and five lncRNA genes.

Pathway analysis

Of the 111 implicated genes in total, PubMed query revealed functional studies for 66 genes. The remaining were principally lncRNA and miRNA genes with currently undefined functions (Table 1). To investigate how our implicated genes could confer obesity risk, we performed several pathway analyses keeping them either separated for each cell type or pooling into the respective metabolic, neural, or immune system sets. Figure 4—figure supplement 1 shows simple Gene Ontology (GO) biological process terms enrichment results.

Leveraging the availability of our expression data generated via RNA-seq (available for 46 of 57 cell types), we performed pathway analysis. Given that our gene sets from the variant-to-gene process was stringently mapped, the sparse enrichment from normal direct analyses is not ideal for exploring obesity genetic etiology. Thus, we incorporated two methods from the pathfindR package (Ulgen et al., 2019) and our customized SPIA (details in Materials and methods, Figure 4—figure supplement 2). The result of 60 enriched KEGG terms is shown in Figure 4A, Supplementary file 1c, with 13 genes in 14 cell types for pathfindR and 39 enriched KEGG terms shown in Figure 4, Supplementary file 1d, with 10 genes in 42 cell types for customized SPIA. There were 20 overlapping pathways between the two approaches (yellow rows in Supplementary file 1c-d) including many signaling pathways such as the GnRH (hsa04912), cAMP (hsa04024), HIF-1 (hsa04066), Glucagon (hsa04922), Relaxin (hsa04926), Apelin (hsa04371), and Phospholipase D (hsa04072) signaling pathways. They were all driven by one or more of these 5 genes: ADCY3, ADCY9, CREBBP, MMP2, and NCOA1. Interestingly, we observed the involvement of natural killer cells in nearly all the enriched KEGG terms from pathfindR due to the high expression of the two adenylyl cyclase encoded genes, ADCY3 and ADCY9, along with CREBBP. The SPIA approach disregarded the aquaporin genes (given they appear so frequently in so many pathways that involve cellular channels) but highlighted the central role of BDNF which single-handedly drove four signaling pathways: the Ras, Neurotrophin, PI3K-Akt, and MAPK signaling pathways. This also revealed the role of TRAP1 in neurodegeneration.

Figure 4 with 4 supplements see all
KEGG pathways enrichment analysis.

(A) The pathfindR method: Focused on ‘leveraging interaction information from a protein-protein interaction network (PIN) to identify distinct active subnetworks and then perform enrichment analyses on these subnetworks’, thus aiding enriched pathway analyses through the inter-connection between the genes targeted by obesity variants with key genes driving the pathology of the disease. The result with 60 enriched KEGG terms in the main panel (Supplementary file 1c), shows 13 genes in 14 cell types and their scaled expression levels in the lower panel. (B) The modified SPIA: After applying the adjusted P-value of 0.05 as the filtering threshold, the analysis yielded 39 enriched KEGG terms (full table at Supplementary file 1d) with only 10 genes, but involved up to 42 cell types.

These two approaches did not discount the role of FAIM2 and CALCR. However, their absence was mainly due to the content of the current KEGG database. On the other hand, these approaches accentuated the role of the MMP2 gene at the FTO locus in skeletal myotubes, given its consistency within the GnRH signaling pathway (Figure 4—figure supplement 3), which is in line with previous studies linking its expression with obesity (Derosa et al., 2008; Aksoyer Sezgin et al., 2022; Nonino et al., 2021).

Supportive evidence by colocalization of target effector genes with eQTLs

The GTEx consortium has characterized thousands of eQTLs, albeit in heterogeneous bulk tissues (Pejman, 2017). To assess how many observed gene-SNP pairs agreed with our physical variant-to-gene mapping approach in our multiple separate cellular settings, we performed colocalization analysis using ColocQuiaL (Chen et al., 2022).

282 genes were reported to be associated with the variants within 13 loci from our variant-to-genes analysis. We found 114 colocalizations for ten of our loci that had high conditioned posterior probabilities (cond.PP.H4.abf≥0.8), involving 44 genes and 41 tissues among the eQTLs. We extracted the posterior probabilities for each SNP within each colocalization and selected the 95% credible set as the likely causal variants (complete list in Supplementary file 1e). Despite sensitivity differences and varying cellular settings, when compared with our variant-to-gene mapping results, colocalization analysis yielded consistent identification for 21 pairs of SNP-gene interactions when considering the analyses across all our cell types, composed of 20 SNPs and 7 genes. Details of these SNP-gene pairs are shown in Figure 5A and B.

Colocalization of target effector genes with eQTLs.

(A) Venn diagram shows the overlaps between sets of genes yielded by ColocQuiaL and the variant-to-gene mapping process. (B) Circos plot of the 10 loci demonstrates the differences in the ranges of associations between the two approaches, with long-ranged chromatin contacts between obesity variants and target genes displayed as orange links and short-range eQTLs colocalizations as green links. Two SNPs – rs35796073, and rs35142762 within the TMEM18 locus, in linkage disequilibrium with rs7579427 – were estimated with high probability (cond.PP.H4=0.78) of colocalizing with the expression of ALKAL2 gene in subcutaneous adipose tissue. These pairs of SNP-gene were also identified by our variant-to-gene mapping approach in natural killer cells, plasmacytoid dendritic cells, unstimulated PBMC naïve CD4 T cells and astrocytes. The rs7132908 variant at the FAIM2 locus colocalized with the expression of AQP6 in thyroid tissue and with ASIC1 in prostate tissue, not only with high cond.PP.H4 but also with high individual SNP causal probability (SNP.PP.H4>0.95). rs7132908 was the second most consistent observation in our variant-to-gene mapping, namely across 25 different cell types (Figure 2B) and all three systems plus the other independent cell lines. The pair of rs7132908-contacting-AQP6 was observed in 15 different cell types - 8 metabolic and 4 neural cell types, and 3 independent cell lines. The pair of rs7132908-contacting-ASIC1 was observed in 11 different cell types - 8 metabolic and 2 neural cell types, and plasmacytoid dendritic cells. The other eQTL signals that overlapped with our variant-to-gene mapping results were: BDNF at the METTL15 locus with its promoter physically contacted by rs11030197 in 4 cell types and its expression significantly colocalized (cond.PP.H4=0.82) in tibial artery; ADCY9 at its locus with its promoter physically contacted by rs2531995 in natural killer cells and its expression significantly colocalized in skin tissue (‘Skin_Not_Sun_Exposed_Suprapubic’, cond.PP.H4=0.97). And ADCY3 in the C panel. (C) ColocQuiaL estimated that these SNPs highly colocalize with the expression of ADCY3 in 11 different tissues, where the overlapping with the 16 cell types is represented, color-coded by the proxies rs numbers.

Of these 20 SNPs, 15 were at the ADCY3 locus, in LD with sentinel variant rs4077678, and all implicated ADCY3 as the effector gene in 29 cell types – 15 metabolic, 6 immune, 4 neural cell types, and 4 independent cell lines (Figure 5C). Indeed, missense mutations have been previously reported for this gene in the context of obesity (Grarup et al., 2018; Stergiakouli et al., 2014) while another member of this gene family, ADCY5, has also been extensively implicated in metabolic traits (Sinnott-Armstrong et al., 2021).

Predicting transcription factors (TFs) binding disruption at implicated genes contributing to obesity risk

TFs regulate gene expression by binding to DNA motifs at enhancers and silencers, where any disruption by a SNP can potentially cause dysregulation of a target gene. Thus, we used motifbreakR (R package) to predict such possible events at the loci identified by our variant-to-gene mapping. Each variant was predicted to disrupt the binding of several different TFs, thus requiring further literature cross-examination to select the most probable effects. For example, rs7132908 (consistently contacting FAIM2 in 25 cell types) was predicted to disrupt the binding of 12 different transcription factors. Among them, SREBF1 (Figure 6A) was the only TF that concurred with evidence that it regulates AQP2 and FAIM2 at the same enhancer (Kikuchi et al., 2021). The full prediction list can be found in Supplementary file 1f.

Figure 6 with 1 supplement see all
motifbreakR vs ATAC-seq footprint analysis.

(A) Genome view at the FAIM2 locus where rs7132908 is located and can target many genes through many chromatin contacts, presented by arcs (different colors for different cell types). rs7132908 was predicted by motifBreakR to disrupt the TF SREBF1’s binding site, thus potentially altering the expression of its implicated genes. (B) Mosaic plot shows number of variants that were annotated with disrupt-TF-binding affect by motifBreakR, and the proportions that also overlapped with predicted TF footprint from ATAC-seq TF footprint analysis. Fisher exact test was performed and produced p-value = 1. (C) Stacked bar plot for all the variants from variant-to-gene analysis, showing number of transcription factor binding sites each of the variant can disrupt (predicted by motifbreakR – green), or simply overlap (analyzed by RGT suite – purple), or both (blue).

To narrow down the list of putative TF binding sites at each variant position, we leveraged the ATAC-seq footprint analysis using the RGT suite (Li et al., 2019). The final set of Motif-Predicted Binding Sites (MPBS) within each cell type ATAC-seq footprints was used to overlap with the genomic locations of the OCRs, and then overlapped with our obesity variants, resulting in annotated 29 variants. Mosaic plot in Figure 6B shows the number and proportions of variants predicted by motifbreakR and/or overlapped with MPBS. Insignificant p-value from Fisher’s exact test indicated the independence of the two analyses. Only seven variants were found within the cREs for the same TF motifs predicted to be disrupted by motifbreakR (Figure 6C). Figure 6—figure supplement 1 outlines the seven variants that motifbreakR and ATAC-seq footprint analysis agreed on the TF bindings they might disrupt.

Discussion

Given the challenge of uncovering the underlying molecular mechanisms driving such a multifactorial disease as obesity, our approach leveraging GWAS summary statistics, RNA-seq, ATAC-seq, and promoter Capture C / Hi-C offers new insights. This is particularly true as it is becoming increasingly evident that multiple effector genes can operate in a temporal fashion at a given locus depending on cell state, including at the FTO locus (Sobreira et al., 2021). Our approach offers an opportunity to implicate relevant cis-regulatory regions across different cell types contributing to the genetic etiology of the disease. By assigning GWAS signals to candidate causal variants and corresponding putative effector genes via open chromatin and chromatin contact information, we enhanced the fine-mapping process with an experimental genomic perspective to yield new insights into the biological pathways influencing childhood obesity.

LD score regression is a valuable method that estimates the relationship between linkage disequilibrium score and the summary statistics of GWAS SNPs to quantify the separate contributions of polygenic effects and various confounding factors that produce SNP-based heritability of disease. The general positive heritability enrichment across our open chromatin features spanning multiple cell types (Figure 1B.a) reinforces the notion that obesity etiology involves many systems in our body.

While obesity has long been known to be a risk factor for pancreatitis and pancreatic cancer, the significant enrichment of pancreatic alpha and beta cell related 3D genomic features for childhood obesity GWAS signals demonstrates the bidirectional relationship between obesity and the pancreas; indeed, it is well established that insulin has obesogenic properties. Moreover, the comorbidity of obesity and diabetes (either causal or a result of the overlap between SNPs associated with these two diseases) is tangible. When focusing on genetic annotation of the cREs only, the association with obesity became more diverse across cell types, especially in metabolic cells. Interestingly, the lack of enrichment (only 8 of 57 cell types yielded no degree of enrichment) of obesity SNPs heritability in open gene promoters (Figure 1B.b) reveals that cRE regions harboring obesity SNPs are more involved in gene regulation than disruption, and therefore potentially contributing more weight to the manifestation of the disease.

Of course, we should factor in the effective sample sizes of the GWAS efforts that are wide-ranging (2000–24,000 – given that the N for each variant is different within a single dataset, thus contributing to the weights and p-value of each SNP when the algorithm calculates the genome-wide heritability), which could result in noise and negative enrichment observed in the analysis – a methodology limitation of partial linkage regression that has been extensively discussed in the field (Steinsaltz et al., 2020). Thus, it is crucial to interpret the enrichment (or lack thereof) of disease variants in a certain cellular setting with an ad hoc biological context.

From mapping the common proxies of 19 independent sentinel SNPs that were genome-wide significantly associated with childhood obesity to putative effector genes through chromatin contacting cREs, one striking finding was the several potential ‘hubs’ of putatively core effector genes, whose occurrence spread across three human physiological systems. With the data available from so many cell types, our approach connected new candidate causal variants to known obesity-related genes and new implications of cell modality for previously known associations.

A potential application of this association could be to fine-tune the effect of a drug toward controlling appetite. An example of bringing new aspects to the old is for the signal within the FTO locus that contacted IRX3 and IRX5: previous studies have suggested these obesogenic effects operate in adipocytes (Claussnitzer et al., 2016), brain (Smemo et al., 2014), or pancreas Ragvin et al., 2010; here we confirmed this association in adipocytes and uncover the presence of distal chromatin contacts in myotubes for the first time.

Besides the above-mentioned genes with known associations with obesity, we discovered newly implicated genes. For example, the LRRIQ3 gene at the TNNI3K locus had its open promoter contacted by two SNPs, rs1040070 and rs10493544, in NTERA2 cells only. The published studies (Johnston et al., 2019; Sanchez-Roige et al., 2021) that associated LRRIQ3 with major depressive disorder and opioid usage acknowledged the overlapping promoter of this gene, albeit in the opposite direction, with a run-through transcript of FPGT-TNNI3K – previously shown to be associated with BMI in European Graff et al., 2013 and Korean populations (Lee et al., 2017).

It is apparent that not all the implicated genes we report would contribute equally to the susceptibility of obesity pathogenesis. Each locus comprises genes whose functions are obviously related to obesity or similar traits like BMI, fat weight, etc., while other genes are not so directly obvious in their relation to these traits.

It is encouraging that for implicated genes within these multi-cell-type loci across different physiological systems we could find previous associations to the corresponding cell types or systems. Examples are the two aforementioned genes at the TMEM18 locus (SH3YL1 and ACP1) (Fernandes et al., 2019; Blessing et al., 2015; Kobayashi et al., 2014; Choi et al., 2021; Gaynor et al., 2020) with the broad spectrum of their functions, HEPACAM2 implicated in the NCIH716 cell line at the CALCR locus Wu et al., 2018; Huang et al., 2018, and LRRIQ3 in the NTERA2 cell line at theTNNI3K locus (Pleasure and Lee, 1993).

Chronic inflammation is an essential characteristic of obesity pathogenesis. Adipose tissue-resident immune cells have been observed, leading to an increased focus in recent years on their potential contribution to metabolic dysfunction. On the other hand, neurological or psychological conditions, such as stress, induce the secretion of both glucocorticoids (increase motivation for food) and insulin (promotes food intake and obesity). Pleasure feeding then reduces activity in the stress-response network, reinforcing the feeding habit. It has been shown that voluntary behaviors, stimulated by external or internal stressors or pleasurable feelings, memories, and habits, can override the basic homeostatic controls of energy balance (Dallman, 2010). The potential link between the immune system and metabolic disease, and moreover, through the neural system, was tangible in our findings.

Two of the three SNPs which ranked the third most consistent in our variant-to-gene mapping (Figure 2C) – rs35796073 and rs35142762 – contacted the ALKAL2 promoter (supported by GTEx evidence to colocalize with ALKAL2 expression). The anaplastic lymphoma kinase (encoded by ALK gene) is a receptor tyrosine kinase, belongs to the insulin receptor family, and has been reported to promote nerve cell growth and differentiation (Iwahara et al., 1997; Motegi et al., 2004). Despite ALKAL2 (ALK and LTK ligand 2) being studied principally in the context of immunity, a recent study using the EGCUT biobank GWAS identified ALK as a candidate thinness gene and genetic deletion showed that its expression in hypothalamic neurons acts as a negative regulator in controlling energy expenditure via sympathetic control of adipose tissue lipolysis (Orthofer et al., 2020). ALKAL2 – encoding a high-affinity agonist of ALK/LTK receptors – which has been reported to enhance expression in response to inflammatory pain in nociceptors (Defaye et al., 2022; Sun et al., 2023) - has been recently implicated as a novel candidate gene for childhood BMI by transcriptome-wide association study Yao et al., 2021, and achieved genome-wide significance in a GWAS study contrasting persistent healthy thinness with severe early-onset obesity using the STILTS and SCOOP cohorts (Riveros-McKay et al., 2019). The finding that overexpression of ALKAL2 could potentiate neuroblastoma progression in the absence of ALK mutation (Borenäs et al., 2021) echoes the relationship between ADCY3 and MC4R (Siljee et al., 2018), where a peripheral gene, ADCY3, can regulate/impair the function of a core gene, that is MC4R, within the energy-regulating melanocortin signaling pathway (Timshel et al., 2020).

Our approach implicates putative target genes based on a mechanism of regulation for these variants to alter gene expression – through regulator TF(s) that bind to these contact sites. A potential limitation of the predictions from motifbreakR and matching TF motifs to ATAC-seq footprint by the RGT toolkit is that they were both based on the position probability matrixes of Jaspar and Hocomoco, which come from public motif databases. The ATAC-seq footprint analysis also carries sequence bias that can lead to false positive discovery. Thus, our attempt to call such regulators by predicting TF binding disruption can only serve as nominations – but warrant further functional follow up.

Another limitation of this work is the diversity in data quality among different samples, since different datasets were sampled and collected at different time points, from different patients, using different protocols, with libraries sequenced at different depths and qualities, and initially preprocessed with different pipelines and parameters. Thus, it is crucial to keep in mind that the discrepancy in data points might have resulted from variations in data quality. Importantly, any association discovered must be validated functionally before effector genes of the genetic variants can be leveraged to develop new therapies. Their putative function(s) must be characterized, together with the mechanism whereby the given variant’s alleles differentially affect the expression of the targeted genes. The next step is to explore how the target genes affect the trait of interest more directly.

Our results have provided a set of leads for future exploratory experiments in specific cellular settings in order to further expand our knowledge of childhood obesity genomics and hence equip us with more effective means to overcome the burden of this systematic disease.

Conclusion

Our approach of combining RNA-seq, ATAC-seq, and promoter Capture C/Hi-C datasets with GWAS summary statistics offers a systemic view of the multi-cellular nature of childhood obesity, shedding light on potential regulatory regions and effector genes. By leveraging physical properties, such as open chromatin status and chromatin contacts, we enhanced the fine-mapping process and gained new insights into the biological pathways influencing the disease. Although further functional validation is required, our findings provide valuable leads together with their cellular contexts for future research and the development of more effective strategies to address the burden of childhood obesity.

Materials and methods

Data and resource

Request a detailed protocol

Datasets used in prior studies are listed in Supplementary file 1a. ATAC-seq, RNA-seq, Hi-C, and Capture-C library generation for each cell type is provided in their original published study.

Cell lines/types resources

Request a detailed protocol

details of each cell type/cell line source were described in their original published study, provided in Supplementary file 1a.

Myotubes (Primary Skeletal Muscle Cells, PCS-950–010), MiaPaCa (CRL-1420), Panc-1 (CRL-1469), microglia (HMC1, CRL-3304), human fetal osteoblastic (hFOB 1.19, CRL-3602), HepG2 hepatocarcinoma (HB-8065) and Colorectal adenocarcinoma ascites derived cells (NCIH716, CCL-251) cell lines were purchased from American Type Cell Center (ATCC, Manassas, Virginia, USA). EndoC-BH1 were purchased from Univercell Biosolutions. Primary Normal Human Astrocytes (NHA) of unknown sex were obtained from Lonza as cryopreserved cells. Human embryonic stem cells (ESC H9, WA09) were obtained from WiCell Research Institute. Melanocytes isolated from foreskin healthy newborn males, were obtained from the Specialized Programs of Research Excellence (SPORE) in Skin Cancer Specimen Resource Core at Yale University. Pancreata from deceased organ donors were obtained by Human Pancreas Analysis Program (HPAP) (https://hpap.pmacs.upenn.edu), a Human Islet Research Network consortium. All were authenticated by the source and confirmed to be mycoplasma negative.

The rest of the cell types were isolated from samples, obtained from consented human donors, maintained and expanded within corresponding laboratories. Cell identities were verified based on observed global expression variation, marker genes by PCR and immunofluorescence microscopy.

This study also includes human adipocytes and pre-adipocytes from P. Seale’s lab, hepatocytes from P. Titchnell’s lab, trabecular meshwork cells from J. O’Brien’s lab, which will be described in detail in separate manuscripts currently in preparation.

ATAC-seq preprocessing and peaks calling

Request a detailed protocol

The detailed configurations and technical details for each data set are provided in the original studies. In brief, open chromatin regions were called using ENCODE ATAC-seq pipeline as previously described in each study the published data provided. In brief, reads were aligned to the hg19 or hg38 genome using bowtie2; duplicates were removed, alignments from all replicates were pooled, and narrow peaks were called using MACS2. A region was considered open if it overlapped at least 1 bp with ATAC-seq peak. We lifted all coordinates from hg19 to hg38 to ensure consistency between datasets.

Promoter Capture-C pre-processing and interaction calling

Request a detailed protocol

in brief, paired-end reads were pre-processed using HICUP pipeline (Wingett et al., 2015) with bowtie2 as aligner and hg19 for reference genome. Significant promoters’ interactions were called using unique read pairs from all baits promoter in the reference by CHICAGO (Cairns et al., 2016) pipeline. In addition to analysis of individual fragments (1frag), we also binned four fragments to improve long-distance sensitivity in interactions calling (Su et al., 2021). Interactions with CHICAGO score >5 in either 1-fragment or 4-fragment resolution were considered significant. These interactions were output as ibed format (similar to BEDPE format) in which each line represents one physical contact between fragments. Interactions from both resolutions were merged and their genomic coordinates were lifted from hg19 to hg38.

Hi-C pre-processing and interaction calling

Request a detailed protocol

We follow the pipeline as a recent study described (Su et al., 2022). Paired-end reads from each replicate were pre-processed using the HICUP pipeline v0.7.4 (Wingett et al., 2015), aligned by bowtie2 with hg38 as the reference genome. The alignments files were parsed to pairtools v0.3.0 to process and pairix v0.3.7 to index and compress, then converted to Hi-C matrix binary format.cool by cooler v0.8.11 at multiple resolutions (500 bp, 1, 2, 4, 10, 40, 500kbp and 1Mbp) and normalized with ICE method (Imakaev et al., 2012). The matrices from different replicates were merged at each resolution using cooler. Mustache v1.0.1 Roayaei et al., 2020 and Fit-Hi-C2 v2.0.7 (Kaul et al., 2020) were used to call significant intra-chromosomal interaction loops from merged replicates matrices at three resolutions 1 kb, 2 kb, and 4 kb, with significance threshold at q-value <0.1 and FDR <1 × 10−6, respectively. The identified interaction loops were merged between both tools at each resolution. Lastly, interaction loops from all three resolutions were merged with preference for smaller resolution if overlapped.

Definition of cis-regulatory elements (cREs)

Request a detailed protocol

We intersected ATAC-seq open chromatin regions (OCRs) of each cell type with chromatin conformation capture data determined by Hi-C/Capture-C of the same cell type, and with promoters –1,500/+500 bp of TSS, which were referenced by GENCODE v30.

Childhood obesity GWAS summary statistics

Request a detailed protocol

Data on childhood obesity from the EGG consortium was downloaded from https://www.egg-consortium.org/. We used 8,566,179 European ancestry variants (consisting of 8,613 cases and 12,696 controls in stage I; of 921 cases and 1930 controls in stage II), representing ~55% of the total 15,504,218 variants observed across all ancestries in the original study (Bradfield et al., 2019). The sumstats file was reformatted by munge_sumstats.py to standardize with the weighted variants from HapMap v3 within the LDSC baseline, which reduced the variants to 1,217,311 (7.8% of total).

Cell type specific partitioned heritability

Request a detailed protocol

We used LDSC (http://www.github.com/bulik/ldsc) v.1.0.1 (Finucane et al., 2015) with --h2 flag to estimate SNP-based heritability of childhood obesity within 4 defined sets of input genomic regions: (1) OCRs, (2) OCRs at gene promoters, (3) cREs, and (4) cREs with an expanded window of ± 500 bp. The baseline model LD scores, plink filesets, allele frequencies and variants weights files for the European 1000 genomes project phase 3 in hg38 were downloaded from the provided link (https://alkesgroup.broadinstitute.org/LDSCORE/GRCh38/). The cREs of each cell type were used to create the annotation, which in turn were used to compute annotation-specific LD scores for each cell types cREs set.

Genetic loci included in variant-to-genes mapping

Request a detailed protocol

19 sentinel signals that achieved genome-wide significance in the trans-ancestral meta-analysis study (Bradfield et al., 2019) were leveraged for our analyses. Proxies for each sentinel SNP were queried using TopLD Huang et al., 2022 and LDlinkR tool (Myers et al., 2020) with the GRCh38 Genome assembly, 1000 Genomes phase 3 v5 variant set, European population, and LD threshold of r2 >0.8, which resulted in 771 proxies, including the 21 SNPs from the 99% credible set of the original study (Supplementary file 1g).

RNA-seq preprocessing and expression profiling

Request a detailed protocol

The detailed configurations, steps, and technical details for each data set are provided in the original studies. In brief, read fragments from fastq files were mapped to genome assembly hg19 or hg38 using STAR, independently for each replicate and condition. We used GENCODE annotation files for feature annotation and htseq-count for raw read count calculation at each feature. Read counts were transformed into TPM (transcript per million) and normalized internally between replicates/conditions in each individual study. For comparative measurements, we transformed all the expression values into 0–100 scale.

Differential analysis and clustering of correlated genes

Request a detailed protocol

Normalized transcripts per million (TPM) of all measured genes in 46 of 57 cell types was used to perform differential analysis using DEseq2 package (Love et al., 2014), where cell type and system (immune, metabolic, neural and other) were used as variables for the modeling contrast. Because many of the genes we gathered from the variant-to-gene mapping were lowly expressed in corresponding cell types or others, causing relatively high levels of variability, we used apeglm method for effect size (logarithmic fold change estimates) shrinkage Zhu et al., 2019 to alleviate this phenomenon during the genes ranking. Weighted correlation network analysis (WGCNA) package (Langfelder and Horvath, 2008) was used to cluster genes from the variant-to-genes mapping process. WGCNA network construction power was chosen based on the analysis of scale-free topology for soft-thresholding. blockwiseModules with a power of 10 were used to create the correlation network and cluster genes into 3 modules of similarly expressed genes (Figure 4—figure supplement 4). The assigned colors were used to identify the gene modules: turquoise, grey, and blue.

Pathways enrichment analyses

Request a detailed protocol

We performed three analyses on the set of genes from the variant-to-genes mapping process:

Gene set over-representation analysis (ORA) was performed using clusterProfiler package Wu et al., 2021 to identify GO biology process terms (org.Hs.eg.db databese) enriched within our genes set in each cell type. A relaxed p-value cutoff was set at 0.1, and the minimum including genes was set at 2 to ensure the capture of all possible enriched terms. An adjusted p-value of 0.05 was later used to filter the significant terms.

Active-subnetwork-oriented gene set enrichment analysis

Request a detailed protocol

We used the pathfindR package Ulgen et al., 2019 to identify active subnetworks in protein-protein interaction networks from Biogrid, KEGG, STRING, GeneMania, and IntAct databases, using the list of genes from the variant-to-genes mapping process. Then we provided the statistic from the differential expression analysis for pathfindR to perform enrichment analyses on the identified subnetworks, discovering enriched KEGG pathways.

Customized signaling pathway impact analysis (SPIA)

Request a detailed protocol

The original method, proposed by Tarca in 2009 in SPIA package (Tarca et al., 2009), incorporates ORA with the adjacency matrix to measure the importance of genes within each pathway – genes that are connected to more other genes are likely more important than the downstream end-point genes. This pathway-topology approach measures the actual perturbation on a given pathway under a given condition and given differential effect size. We added more metrics to improve this method: (a) the score of gene impact among networks, (b) the neighborhood of genes that measure the importance of a gene based on its downstream effects, (c) the betweenness puts weight on the genes that act as a gateway for the network flow. The combination of metrics produces two ways of evidence – perturbance and enrichment – for each pathway similar to SPIA. Normal inverse cumulative distribution function was used to combine the p-values of this evidence, then Bonferroni and FDR correction was applied (Figure 4—figure supplement 2).

GWAS-eQTL colocalization

Request a detailed protocol

The summary statistics for the European ancestry subset from the EGG consortium GWAS for childhood obesity was used. Common variants (MAF  ≥ 0.01) from the 1000 Genomes Project v3 samples were used as a reference panel. We used non-overlapped genomic windows of ±250,000 bases extended in both directions from the median genomic position of each of 19 sentinel loci as input. We used ColocQuiaL Chen et al., 2022 to test genome-wide colocalization of all possible variants included in each inputted window against GTEx v.8 eQTLs associations for all 49 tissues available from https://www.gtexportal.org/home/datasets. A conditional posterior probability of colocalization of 0.8 or greater was imposed.

ATAC-seq transcription factor footprint analysis with RGT toolkit

Request a detailed protocol

Bam files of mapped reads from replicates and samples were merged for each cell type. The merged bam files were then used for footprinting by RGT-HINT with parameters: --atac-seq, –paired-end, –organism = hg38. If bam files were generated on hg19, we performed lift-over using CrossMap.py bam and hg19ToHg38.over.chain.gz file. We then used RGT-MOTIFANALYSIS matching to scan each footprint for possible transcription binding sites from HOCOMOCO and JASPAR databases for human only with parameter --filter ‘species:sapiens;database:hocomoco,jaspar_vertebrates’. Parameter –rand-proportion 10 was used to generate random putative binding sites with sizes ten times larger than the input footprints. After performing motif matching, we evaluated which transcription factors were more likely to occur in those footprints than in background regions (generated by the previous command) using RGT-MOTIFANALYSIS enrichment with the same filtered databases and default parameters. Output included all the Motif-Predicted Binding Sites (MPBS) that occurred within the identified footprints in each cell type. We overlapped these sites with the loci of our obesity variants.

Prediction of variant’s effect on transcription factor binding

Request a detailed protocol

Genomic positions (0-based coordinates) and allele alternatives of each proxy (from SNPlocs.Hsapiens.dbSNP155.GRCh38 package with matching reference sequence from BSgenome package) were used to scan all position frequency matrix databases (from MotifDb package) for potential transcription factor binding disruptive effects. The motifbreakR function from motifBreakR package was used, with filterp = TRUE and setting a p-value threshold = 0.0005, information content methods method='ic' with even background probabilities of the four nucleotides bkg = c(A=0.25, C=0.25, G=0.25, T=0.25) and BPPARAM = BiocParallel::SerialParam() to allow serial evaluation.

Data availability

The current manuscript is a computational study, so no data have been generated for this manuscript. Public software packages are available at the citations and URLs listed. Custom code for this analysis has been deposited on GitHub (copy archived at Trang, 2024).

The following previously published data sets were used
    1. Trang KB
    2. Levings MK
    3. Grant SF
    4. Wells AD
    (2024) NCBI Gene Expression Omnibus
    ID GSE272080. 3D chromatin-based variant-to-gene maps across 57 human cell types reveal the cellular and genetic architecture of autoimmune disease susceptibility [captureC].
    1. Trang KB
    2. Levings MK
    3. Grant SF
    4. Wells AD
    (2024) NCBI Gene Expression Omnibus
    ID GSE272120. 3D chromatin-based variant-to-gene maps across 57 human cell types reveal the cellular and genetic architecture of autoimmune disease susceptibility [ATAC-seq].
    1. Trang KB
    2. Levings MK
    3. Grant SF
    4. Wells AD
    (2024) NCBI Gene Expression Omnibus
    ID GSE271957. 3D chromatin-based variant-to-gene maps across 57 human cell types reveal the cellular and genetic architecture of autoimmune disease susceptibility [RNA-seq].
    1. Trang KB
    2. Levings MK
    3. Grant SF
    4. Wells AD
    (2024) NCBI Gene Expression Omnibus
    ID GSE272172. 3D chromatin-based variant-to-gene maps across 57 human cell types reveal the cellular and genetic architecture of autoimmune disease susceptibility [HiC].
    1. Pahl MC
    2. Sullivan KE
    3. Romberg N
    4. Grant SF
    5. Wells AD
    (2022) NCBI Gene Expression Omnibus
    ID GSE174658. Implicating effector genes at COVID-19 GWAS loci using promoter-focused Capture-C in disease-relevant immune cell types.
    1. Chesi A
    2. Romberg N
    3. Johnson M
    4. Lu S
    5. Manduchi E
    6. Leonard M
    7. Wells AD
    8. Grant SF
    (2020) EMBL-EBI ArrayExpress
    ID E-MTAB-6621. Promoter capture-C of primary human T Follicular Helper (TFH) cells and naive CD4-positive helper T cells from tonsils of healthy volunteers.
    1. Manduchi E
    2. Johnson M
    3. Chesi A
    4. Wells AD
    5. Grant SF
    6. Romberg N
    7. Lu S
    8. Hodge K
    (2020) EMBL-EBI ArrayExpress
    ID E-MTAB-6617. ATAC-seq of primary human T Follicular Helper (TFH) cells and naive CD4-positive helper T cells from tonsils of healthy volunteers.
    1. Romberg N
    2. Johnson M
    3. Grant SF
    4. Le Coz C
    5. Manduchi E
    6. Wells AD
    (2019) EMBL-EBI ArrayExpress
    ID E-MTAB-6637. Gene expression of primary human T Follicular Helper (TFH) cells and naive CD4-positive helper T cells from tonsils of healthy volunteers.
    1. Pahl MC
    2. Sharma P
    3. Wells AD
    (2024) NCBI Gene Expression Omnibus
    ID GSE230346. Chromatin conformation dynamics during CD4+ T cell activation implicates 2 autoimmune disease-associated genes and regulatory elements.
    1. Caliskan M
    2. Brown CD
    (2019) NCBI Gene Expression Omnibus
    ID GSE128072. Genetic and Epigenetic Fine Mapping of Complex Trait Associated Loci in the Human Liver.
    1. Conery M
    2. Pippin JA
    3. Pahl MC
    4. Grant SF
    (2024) NCBI Gene Expression Omnibus
    ID GSE261284. Bioinformatics and single-cell CRISPRi-based screen reveals effector genes and implicates multi-tissue etiology for BMD.
    1. Chesi A
    2. Grant SA
    3. Manduchi E
    4. Wells AD
    5. Johnson M
    6. Hankenson K
    7. Wagley Y
    (2019) EMBL-EBI ArrayExpress
    ID E-MTAB-6862. Capture-C of primary human mesenchymal stem cells (MSC)-derived osteoblasts from healthy donors (differentiated with BPM2).
    1. Chesi A
    2. Grant SF
    3. Johnson M
    4. Hankenson K
    5. Wells AD
    6. Wagley Y
    7. Manduchi E
    (2019) EMBL-EBI ArrayExpress
    ID E-MTAB-6834. ATAC-seq of primary mesenchymal stem cell (MSC)-derived osteoblasts (differentiated with BMP2) from 4 human donors.
    1. Chesi A
    2. Manduchi E
    3. Grant SF
    4. Johnson M
    5. Hankenson K
    6. Wells AD
    7. Wagley Y
    (2019) EMBL-EBI ArrayExpress
    ID E-MTAB-6835. RNA-seq of primary mesenchymal stem cells -derived osteoblasts (differentiated with BMP2) from 3 human donors.
    1. Su C
    2. Pippin J
    3. Kaestner K
    4. Grant SF
    5. Wells A
    (2022) NCBI Gene Expression Omnibus
    ID GSE188311. 3D chromatin organizations of human pancreatic cells reveal cell-type specific regulatory architectures of diabetes risk.
    1. Su C
    2. Pippin J
    3. Kaestner K
    4. Grant SF
    5. Wells A
    (2022) Data portal of The Human Pancreas Analysis Program
    ID PANC-DB. Single cell RNAseq and scATACseq data of Human pancreatic islets.
    1. Ackermann AM
    2. Wang Z
    3. Schug J
    4. Naji A
    5. Kaestner KH
    (2015) NCBI Gene Expression Omnibus
    ID GSE76268. Integration of ATAC-seq and RNA-seq Identifies Human Alpha Cell and Beta Cell Signature Genes.
    1. Littleton SH
    2. Grant SF
    3. Trang KB
    4. Chesi A
    (2024) NCBI Gene Expression Omnibus
    ID GSE241691. Variant-to-function analysis of the childhood obesity chr12q13 locus implicates rs7132908 as a causal variant within the 3' UTR of FAIM2.
    1. Pahl MC
    2. Grant SF
    (2021) NCBI Gene Expression Omnibus
    ID GSE152098. Cis-regulatory architecture of human ESC-derived hypothalamic neuron differentiation aids in variant-to-gene mapping of relevant common complex traits.
    1. Su C
    2. Grant SF
    (2022) EMBL-EBI ArrayExpress
    ID E-MTAB-9159. Promoter Capture-C of iPSC-derived NPCs and neurons.
    1. Su C
    2. Grant SF
    (2022) EMBL-EBI ArrayExpress
    ID E-MTAB-9087. ATAC-seq of iPSC-derived NPCs and neurons.
    1. Su C
    2. Grant SF
    (2022) EMBL-EBI ArrayExpress
    ID E-MTAB-9085. RNA-seq of iPSC-derived NPCs and neurons from CHOPWT10 and CHOPWT14 cell lines.
    1. Johnson M
    2. Grant SF
    (2021) NCBI Gene Expression Omnibus
    ID GSE175368. Identification of 22 novel susceptibility loci associated with testicular germ cell tumors.

References

    1. Turcot V
    2. Lu Y
    3. Highland HM
    4. Schurmann C
    5. Justice AE
    6. Fine RS
    7. Bradfield JP
    8. Esko T
    9. Giri A
    10. Graff M
    11. Guo X
    12. Hendricks AE
    13. Karaderi T
    14. Lempradl A
    15. Locke AE
    16. Mahajan A
    17. Marouli E
    18. Sivapalaratnam S
    19. Young KL
    20. Alfred T
    21. Feitosa MF
    22. Masca NGD
    23. Manning AK
    24. Medina-Gomez C
    25. Mudgal P
    26. Ng MCY
    27. Reiner AP
    28. Vedantam S
    29. Willems SM
    30. Winkler TW
    31. Abecasis G
    32. Aben KK
    33. Alam DS
    34. Alharthi SE
    35. Allison M
    36. Amouyel P
    37. Asselbergs FW
    38. Auer PL
    39. Balkau B
    40. Bang LE
    41. Barroso I
    42. Bastarache L
    43. Benn M
    44. Bergmann S
    45. Bielak LF
    46. Blüher M
    47. Boehnke M
    48. Boeing H
    49. Boerwinkle E
    50. Böger CA
    51. Bork-Jensen J
    52. Bots ML
    53. Bottinger EP
    54. Bowden DW
    55. Brandslund I
    56. Breen G
    57. Brilliant MH
    58. Broer L
    59. Brumat M
    60. Burt AA
    61. Butterworth AS
    62. Campbell PT
    63. Cappellani S
    64. Carey DJ
    65. Catamo E
    66. Caulfield MJ
    67. Chambers JC
    68. Chasman DI
    69. Chen Y-DI
    70. Chowdhury R
    71. Christensen C
    72. Chu AY
    73. Cocca M
    74. Collins FS
    75. Cook JP
    76. Corley J
    77. Corominas Galbany J
    78. Cox AJ
    79. Crosslin DS
    80. Cuellar-Partida G
    81. D’Eustacchio A
    82. Danesh J
    83. Davies G
    84. Bakker PIW
    85. Groot MCH
    86. Mutsert R
    87. Deary IJ
    88. Dedoussis G
    89. Demerath EW
    90. Heijer M
    91. Hollander AI
    92. Ruijter HM
    93. Dennis JG
    94. Denny JC
    95. Di Angelantonio E
    96. Drenos F
    97. Du M
    98. Dubé M-P
    99. Dunning AM
    100. Easton DF
    101. Edwards TL
    102. Ellinghaus D
    103. Ellinor PT
    104. Elliott P
    105. Evangelou E
    106. Farmaki A-E
    107. Farooqi IS
    108. Faul JD
    109. Fauser S
    110. Feng S
    111. Ferrannini E
    112. Ferrieres J
    113. Florez JC
    114. Ford I
    115. Fornage M
    116. Franco OH
    117. Franke A
    118. Franks PW
    119. Friedrich N
    120. Frikke-Schmidt R
    121. Galesloot TE
    122. Gan W
    123. Gandin I
    124. Gasparini P
    125. Gibson J
    126. Giedraitis V
    127. Gjesing AP
    128. Gordon-Larsen P
    129. Gorski M
    130. Grabe H-J
    131. Grant SFA
    132. Grarup N
    133. Griffiths HL
    134. Grove ML
    135. Gudnason V
    136. Gustafsson S
    137. Haessler J
    138. Hakonarson H
    139. Hammerschlag AR
    140. Hansen T
    141. Harris KM
    142. Harris TB
    143. Hattersley AT
    144. Have CT
    145. Hayward C
    146. He L
    147. Heard-Costa NL
    148. Heath AC
    149. Heid IM
    150. Helgeland Ø
    151. Hernesniemi J
    152. Hewitt AW
    153. Holmen OL
    154. Hovingh GK
    155. Howson JMM
    156. Hu Y
    157. Huang PL
    158. Huffman JE
    159. Ikram MA
    160. Ingelsson E
    161. Jackson AU
    162. Jansson J-H
    163. Jarvik GP
    164. Jensen GB
    165. Jia Y
    166. Johansson S
    167. Jørgensen ME
    168. Jørgensen T
    169. Jukema JW
    170. Kahali B
    171. Kahn RS
    172. Kähönen M
    173. Kamstrup PR
    174. Kanoni S
    175. Kaprio J
    176. Karaleftheri M
    177. Kardia SLR
    178. Karpe F
    179. Kathiresan S
    180. Kee F
    181. Kiemeney LA
    182. Kim E
    183. Kitajima H
    184. Komulainen P
    185. Kooner JS
    186. Kooperberg C
    187. Korhonen T
    188. Kovacs P
    189. Kuivaniemi H
    190. Kutalik Z
    191. Kuulasmaa K
    192. Kuusisto J
    193. Laakso M
    194. Lakka TA
    195. Lamparter D
    196. Lange EM
    197. Lange LA
    198. Langenberg C
    199. Larson EB
    200. Lee NR
    201. Lehtimäki T
    202. Lewis CE
    203. Li H
    204. Li J
    205. Li-Gao R
    206. Lin H
    207. Lin K-H
    208. Lin L-A
    209. Lin X
    210. Lind L
    211. Lindström J
    212. Linneberg A
    213. Liu C-T
    214. Liu DJ
    215. Liu Y
    216. Lo KS
    217. Lophatananon A
    218. Lotery AJ
    219. Loukola A
    220. Luan J
    221. Lubitz SA
    222. Lyytikäinen L-P
    223. Männistö S
    224. Marenne G
    225. Mazul AL
    226. McCarthy MI
    227. McKean-Cowdin R
    228. Medland SE
    229. Meidtner K
    230. Milani L
    231. Mistry V
    232. Mitchell P
    233. Mohlke KL
    234. Moilanen L
    235. Moitry M
    236. Montgomery GW
    237. Mook-Kanamori DO
    238. Moore C
    239. Mori TA
    240. Morris AD
    241. Morris AP
    242. Müller-Nurasyid M
    243. Munroe PB
    244. Nalls MA
    245. Narisu N
    246. Nelson CP
    247. Neville M
    248. Nielsen SF
    249. Nikus K
    250. Njølstad PR
    251. Nordestgaard BG
    252. Nyholt DR
    253. O’Connel JR
    254. O’Donoghue ML
    255. Olde Loohuis LM
    256. Ophoff RA
    257. Owen KR
    258. Packard CJ
    259. Padmanabhan S
    260. Palmer CNA
    261. Palmer ND
    262. Pasterkamp G
    263. Patel AP
    264. Pattie A
    265. Pedersen O
    266. Peissig PL
    267. Peloso GM
    268. Pennell CE
    269. Perola M
    270. Perry JA
    271. Perry JRB
    272. Pers TH
    273. Person TN
    274. Peters A
    275. Petersen ERB
    276. Peyser PA
    277. Pirie A
    278. Polasek O
    279. Polderman TJ
    280. Puolijoki H
    281. Raitakari OT
    282. Rasheed A
    283. Rauramaa R
    284. Reilly DF
    285. Renström F
    286. Rheinberger M
    287. Ridker PM
    288. Rioux JD
    289. Rivas MA
    290. Roberts DJ
    291. Robertson NR
    292. Robino A
    293. Rolandsson O
    294. Rudan I
    295. Ruth KS
    296. Saleheen D
    297. Salomaa V
    298. Samani NJ
    299. Sapkota Y
    300. Sattar N
    301. Schoen RE
    302. Schreiner PJ
    303. Schulze MB
    304. Scott RA
    305. Segura-Lepe MP
    306. Shah SH
    307. Sheu WH-H
    308. Sim X
    309. Slater AJ
    310. Small KS
    311. Smith AV
    312. Southam L
    313. Spector TD
    314. Speliotes EK
    315. Starr JM
    316. Stefansson K
    317. Steinthorsdottir V
    318. Stirrups KE
    319. Strauch K
    320. Stringham HM
    321. Stumvoll M
    322. Sun L
    323. Surendran P
    324. Swift AJ
    325. Tada H
    326. Tansey KE
    327. Tardif J-C
    328. Taylor KD
    329. Teumer A
    330. Thompson DJ
    331. Thorleifsson G
    332. Thorsteinsdottir U
    333. Thuesen BH
    334. Tönjes A
    335. Tromp G
    336. Trompet S
    337. Tsafantakis E
    338. Tuomilehto J
    339. Tybjaerg-Hansen A
    340. Tyrer JP
    341. Uher R
    342. Uitterlinden AG
    343. Uusitupa M
    344. Laan SW
    345. Duijn CM
    346. Leeuwen N
    347. van Setten J
    348. Vanhala M
    349. Varbo A
    350. Varga TV
    351. Varma R
    352. Velez Edwards DR
    353. Vermeulen SH
    354. Veronesi G
    355. Vestergaard H
    356. Vitart V
    357. Vogt TF
    358. Völker U
    359. Vuckovic D
    360. Wagenknecht LE
    361. Walker M
    362. Wallentin L
    363. Wang F
    364. Wang CA
    365. Wang S
    366. Wang Y
    367. Ware EB
    368. Wareham NJ
    369. Warren HR
    370. Waterworth DM
    371. Wessel J
    372. White HD
    373. Willer CJ
    374. Wilson JG
    375. Witte DR
    376. Wood AR
    377. Wu Y
    378. Yaghootkar H
    379. Yao J
    380. Yao P
    381. Yerges-Armstrong LM
    382. Young R
    383. Zeggini E
    384. Zhan X
    385. Zhang W
    386. Zhao JH
    387. Zhao W
    388. Zhao W
    389. Zhou W
    390. Zondervan KT
    391. CHD Exome+ Consortium
    392. EPIC-CVD Consortium
    393. ExomeBP Consortium
    394. Global Lipids Genetic Consortium
    395. GoT2D Genes Consortium
    396. EPIC InterAct Consortium
    397. INTERVAL Study
    398. ReproGen Consortium
    399. T2D-Genes Consortium
    400. MAGIC Investigators
    401. Understanding Society Scientific Group
    402. Rotter JI
    403. Pospisilik JA
    404. Rivadeneira F
    405. Borecki IB
    406. Deloukas P
    407. Frayling TM
    408. Lettre G
    409. North KE
    410. Lindgren CM
    411. Hirschhorn JN
    412. Loos RJF
    (2018) Protein-altering variants associated with body mass index implicate pathways that control energy intake and expenditure in obesity
    Nature Genetics 50:26–41.
    https://doi.org/10.1038/s41588-017-0011-x
    1. Vogelezang S
    2. Bradfield JP
    3. Ahluwalia TS
    4. Curtin JA
    5. Lakka TA
    6. Grarup N
    7. Scholz M
    8. van der Most PJ
    9. Monnereau C
    10. Stergiakouli E
    11. Heiskala A
    12. Horikoshi M
    13. Fedko IO
    14. Vilor-Tejedor N
    15. Cousminer DL
    16. Standl M
    17. Wang CA
    18. Viikari J
    19. Geller F
    20. Íñiguez C
    21. Pitkänen N
    22. Chesi A
    23. Bacelis J
    24. Yengo L
    25. Torrent M
    26. Ntalla I
    27. Helgeland Ø
    28. Selzam S
    29. Vonk JM
    30. Zafarmand MH
    31. Heude B
    32. Farooqi IS
    33. Alyass A
    34. Beaumont RN
    35. Have CT
    36. Rzehak P
    37. Bilbao JR
    38. Schnurr TM
    39. Barroso I
    40. Bønnelykke K
    41. Beilin LJ
    42. Carstensen L
    43. Charles MA
    44. Chawes B
    45. Clément K
    46. Closa-Monasterolo R
    47. Custovic A
    48. Eriksson JG
    49. Escribano J
    50. Groen-Blokhuis M
    51. Grote V
    52. Gruszfeld D
    53. Hakonarson H
    54. Hansen T
    55. Hattersley AT
    56. Hollensted M
    57. Hottenga JJ
    58. Hyppönen E
    59. Johansson S
    60. Joro R
    61. Kähönen M
    62. Karhunen V
    63. Kiess W
    64. Knight BA
    65. Koletzko B
    66. Kühnapfel A
    67. Landgraf K
    68. Langhendries JP
    69. Lehtimäki T
    70. Leinonen JT
    71. Li A
    72. Lindi V
    73. Lowry E
    74. Bustamante M
    75. Medina-Gomez C
    76. Melbye M
    77. Michaelsen KF
    78. Morgen CS
    79. Mori TA
    80. Nielsen TRH
    81. Niinikoski H
    82. Oldehinkel AJ
    83. Pahkala K
    84. Panoutsopoulou K
    85. Pedersen O
    86. Pennell CE
    87. Power C
    88. Reijneveld SA
    89. Rivadeneira F
    90. Simpson A
    91. Sly PD
    92. Stokholm J
    93. Teo KK
    94. Thiering E
    95. Timpson NJ
    96. Uitterlinden AG
    97. van Beijsterveldt CEM
    98. van Schaik BDC
    99. Vaudel M
    100. Verduci E
    101. Vinding RK
    102. Vogel M
    103. Zeggini E
    104. Sebert S
    105. Lind MV
    106. Brown CD
    107. Santa-Marina L
    108. Reischl E
    109. Frithioff-Bøjsøe C
    110. Meyre D
    111. Wheeler E
    112. Ong K
    113. Nohr EA
    114. Vrijkotte TGM
    115. Koppelman GH
    116. Plomin R
    117. Njølstad PR
    118. Dedoussis GD
    119. Froguel P
    120. Sørensen TIA
    121. Jacobsson B
    122. Freathy RM
    123. Zemel BS
    124. Raitakari O
    125. Vrijheid M
    126. Feenstra B
    127. Lyytikäinen LP
    128. Snieder H
    129. Kirsten H
    130. Holt PG
    131. Heinrich J
    132. Widén E
    133. Sunyer J
    134. Boomsma DI
    135. Järvelin MR
    136. Körner A
    137. Davey Smith G
    138. Holm JC
    139. Atalay M
    140. Murray C
    141. Bisgaard H
    142. McCarthy MI
    143. Jaddoe VWV
    144. Grant SFA
    145. Felix JF
    146. Early Growth Genetics Consortium
    (2020) Novel loci for childhood body mass index and shared heritability with adult cardiometabolic traits
    PLOS Genetics 16:e1008718.
    https://doi.org/10.1371/journal.pgen.1008718

Article and author information

Author details

  1. Khanh B Trang

    1. Center for Spatial and Functional Genomics, The Children's Hospital of Philadelphia, Philadelphia, United States
    2. Division of Human Genetics, The Children's Hospital of Philadelphia, Philadelphia, United States
    Contribution
    Conceptualization, Data curation, Formal analysis, Investigation, Visualization, Methodology, Writing – original draft
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-9434-508X
  2. Matthew C Pahl

    1. Center for Spatial and Functional Genomics, The Children's Hospital of Philadelphia, Philadelphia, United States
    2. Division of Human Genetics, The Children's Hospital of Philadelphia, Philadelphia, United States
    Contribution
    Data curation, Formal analysis, Methodology, Writing – review and editing
    Competing interests
    No competing interests declared
  3. James A Pippin

    1. Center for Spatial and Functional Genomics, The Children's Hospital of Philadelphia, Philadelphia, United States
    2. Division of Human Genetics, The Children's Hospital of Philadelphia, Philadelphia, United States
    Contribution
    Resources, Writing – review and editing
    Competing interests
    No competing interests declared
  4. Chun Su

    1. Center for Spatial and Functional Genomics, The Children's Hospital of Philadelphia, Philadelphia, United States
    2. Division of Human Genetics, The Children's Hospital of Philadelphia, Philadelphia, United States
    Contribution
    Data curation, Methodology
    Competing interests
    No competing interests declared
  5. Sheridan H Littleton

    1. Center for Spatial and Functional Genomics, The Children's Hospital of Philadelphia, Philadelphia, United States
    2. Division of Human Genetics, The Children's Hospital of Philadelphia, Philadelphia, United States
    3. Cell and Molecular Biology Graduate Group, Perelman School of Medicine, University of Pennsylvania, Philadelphia, United States
    4. Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, United States
    Contribution
    Resources, Data curation, Writing – review and editing
    Competing interests
    No competing interests declared
  6. Prabhat Sharma

    1. Center for Spatial and Functional Genomics, The Children's Hospital of Philadelphia, Philadelphia, United States
    2. Department of Pathology, The Children's Hospital of Philadelphia, Philadelphia, United States
    Contribution
    Resources
    Competing interests
    No competing interests declared
  7. Nikhil N Kulkarni

    1. Center for Spatial and Functional Genomics, The Children's Hospital of Philadelphia, Philadelphia, United States
    2. Department of Pathology, The Children's Hospital of Philadelphia, Philadelphia, United States
    Contribution
    Resources
    Competing interests
    No competing interests declared
  8. Louis R Ghanem

    Division of Gastroenterology, Hepatology, and Nutrition, Children's Hospital of Philadelphia, Philadelphia, United States
    Contribution
    Resources
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-7723-5241
  9. Natalie A Terry

    Division of Gastroenterology, Hepatology, and Nutrition, Children's Hospital of Philadelphia, Philadelphia, United States
    Contribution
    Resources
    Competing interests
    No competing interests declared
  10. Joan M O'Brien

    1. Scheie Eye Institute, Department of Ophthalmology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, United States
    2. Penn Medicine Center for Ophthalmic Genetics in Complex Disease, Philadelphia, United States
    Contribution
    Resources, Writing – review and editing
    Competing interests
    No competing interests declared
  11. Yadav Wagley

    Department of Orthopedic Surgery University of Michigan Medical School Ann Arbor, Ann Arbor, United States
    Contribution
    Resources
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-1261-4267
  12. Kurt D Hankenson

    Department of Orthopedic Surgery University of Michigan Medical School Ann Arbor, Ann Arbor, United States
    Contribution
    Resources
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0001-6361-143X
  13. Ashley Jermusyk

    Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, United States
    Contribution
    Resources
    Competing interests
    No competing interests declared
  14. Jason Hoskins

    Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, United States
    Contribution
    Resources
    Competing interests
    No competing interests declared
  15. Laufey T Amundadottir

    Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, United States
    Contribution
    Resources, Writing – review and editing
    Competing interests
    No competing interests declared
  16. Mai Xu

    Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, United States
    Contribution
    Resources
    Competing interests
    No competing interests declared
  17. Kevin Brown

    Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, United States
    Contribution
    Resources
    Competing interests
    No competing interests declared
  18. Stewart Anderson

    1. Department of Child and Adolescent Psychiatry, Children's Hospital of Philadelphia, Philadelphia, United States
    2. Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, United States
    Contribution
    Resources
    Competing interests
    No competing interests declared
  19. Wenli Yang

    1. Institute for Diabetes, Obesity and Metabolism, Perelman School of Medicine, University of Pennsylvania, Philadelphia, United States
    2. Department of Cell and Developmental Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, United States
    Contribution
    Resources
    Competing interests
    No competing interests declared
  20. Paul Titchenell

    1. Institute for Diabetes, Obesity and Metabolism, Perelman School of Medicine, University of Pennsylvania, Philadelphia, United States
    2. Department of Physiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, United States
    Contribution
    Resources
    Competing interests
    No competing interests declared
  21. Patrick Seale

    1. Institute for Diabetes, Obesity and Metabolism, Perelman School of Medicine, University of Pennsylvania, Philadelphia, United States
    2. Department of Cell and Developmental Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, United States
    Contribution
    Resources
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0001-7119-1615
  22. Klaus H Kaestner

    1. Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, United States
    2. Institute for Diabetes, Obesity and Metabolism, Perelman School of Medicine, University of Pennsylvania, Philadelphia, United States
    Contribution
    Resources
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-1228-021X
  23. Laura Cook

    1. Department of Microbiology and Immunology, University of Melbourne, Peter Doherty Institute for Infection and Immunity, Melbourne, Australia
    2. Department of Critical Care, Melbourne Medical School, University of Melbourne, Melbourne, Australia
    3. Division of Infectious Diseases, Department of Medicine, University of British Columbia, Vancouver, Canada
    Contribution
    Resources, Writing – review and editing
    Competing interests
    No competing interests declared
  24. Megan Levings

    1. Department of Surgery, University of British Columbia, Vancouver, Canada
    2. BC Children's Hospital Research Institute, Vancouver, Canada
    3. School of Biomedical Engineering, University of British Columbia, Vancouver, Canada
    Contribution
    Resources
    Competing interests
    No competing interests declared
  25. Babette S Zemel

    1. Division of Gastroenterology, Hepatology, and Nutrition, Children's Hospital of Philadelphia, Philadelphia, United States
    2. Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, United States
    Contribution
    Resources, Data curation, Methodology, Writing – review and editing
    Competing interests
    No competing interests declared
  26. Alessandra Chesi

    1. Center for Spatial and Functional Genomics, The Children's Hospital of Philadelphia, Philadelphia, United States
    2. Department of Pathology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, United States
    Contribution
    Resources, Data curation, Supervision, Methodology, Writing – review and editing
    Competing interests
    No competing interests declared
  27. Andrew D Wells

    1. Center for Spatial and Functional Genomics, The Children's Hospital of Philadelphia, Philadelphia, United States
    2. Department of Pathology, The Children's Hospital of Philadelphia, Philadelphia, United States
    3. Department of Pathology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, United States
    4. Institute for Immunology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, United States
    Contribution
    Conceptualization, Resources, Supervision, Funding acquisition, Writing – review and editing
    Competing interests
    No competing interests declared
  28. Struan FA Grant

    1. Center for Spatial and Functional Genomics, The Children's Hospital of Philadelphia, Philadelphia, United States
    2. Division of Human Genetics, The Children's Hospital of Philadelphia, Philadelphia, United States
    3. Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, United States
    4. Institute for Diabetes, Obesity and Metabolism, Perelman School of Medicine, University of Pennsylvania, Philadelphia, United States
    5. Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, United States
    6. Division Endocrinology and Diabetes, The Children's Hospital of Philadelphia, Philadelphia, United States
    7. Penn Neurodegeneration Genomics Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, United States
    Contribution
    Conceptualization, Resources, Supervision, Funding acquisition, Writing – review and editing
    For correspondence
    grants@chop.edu
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0003-2025-5302

Funding

Eunice Kennedy Shriver National Institute of Child Health and Human Development (R01 HD056465)

  • Babette S Zemel
  • Alessandra Chesi
  • Andrew D Wells
  • Struan FA Grant

National Institute of Diabetes and Digestive and Kidney Diseases (R01 DK122586)

  • Andrew D Wells
  • Struan FA Grant

National Institute of Diabetes and Digestive and Kidney Diseases (UM1 DK126194)

  • Wenli Yang
  • Paul Titchenell
  • Patrick Seale
  • Klaus H Kaestner
  • Andrew D Wells
  • Struan FA Grant

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

This work was supported by National Institutes of Health awards R01 HD056465, R01 DK122586 and UM1 DK126194, and the Daniel B Burke Endowed Chair for Diabetes Research. Given the use of de-identified datasets and biospecimens was not considered human subjects research, ethical oversight was waived by the Institutional Review Board of the Children’s Hospital of Philadelphia.

Version history

  1. Sent for peer review:
  2. Preprint posted:
  3. Reviewed Preprint version 1:
  4. Reviewed Preprint version 2:
  5. Version of Record published:

Cite all versions

You can cite all versions using the DOI https://doi.org/10.7554/eLife.95411. This DOI represents all versions, and will always resolve to the latest one.

Copyright

This is an open-access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.

Metrics

  • 393
    views
  • 7
    downloads
  • 0
    citations

Views, downloads and citations are aggregated across all versions of this paper published by eLife.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Open citations (links to open the citations from this article in various online reference manager services)

Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)

  1. Khanh B Trang
  2. Matthew C Pahl
  3. James A Pippin
  4. Chun Su
  5. Sheridan H Littleton
  6. Prabhat Sharma
  7. Nikhil N Kulkarni
  8. Louis R Ghanem
  9. Natalie A Terry
  10. Joan M O'Brien
  11. Yadav Wagley
  12. Kurt D Hankenson
  13. Ashley Jermusyk
  14. Jason Hoskins
  15. Laufey T Amundadottir
  16. Mai Xu
  17. Kevin Brown
  18. Stewart Anderson
  19. Wenli Yang
  20. Paul Titchenell
  21. Patrick Seale
  22. Klaus H Kaestner
  23. Laura Cook
  24. Megan Levings
  25. Babette S Zemel
  26. Alessandra Chesi
  27. Andrew D Wells
  28. Struan FA Grant
(2025)
3D genomic features across >50 diverse cell types reveal insights into the genomic architecture of childhood obesity
eLife 13:RP95411.
https://doi.org/10.7554/eLife.95411.3

Share this article

https://doi.org/10.7554/eLife.95411

Further reading

    1. Developmental Biology
    2. Genetics and Genomics
    Anne-Sophie Pepin, Patrycja A Jazwiec ... Sarah Kimmins
    Research Article Updated

    Paternal obesity has been implicated in adult-onset metabolic disease in offspring. However, the molecular mechanisms driving these paternal effects and the developmental processes involved remain poorly understood. One underexplored possibility is the role of paternally induced effects on placenta development and function. To address this, we investigated paternal high-fat diet-induced obesity in relation to sperm histone H3 lysine 4 tri-methylation signatures, the placenta transcriptome, and cellular composition. C57BL6/J male mice were fed either a control or high-fat diet for 10 weeks beginning at 6 weeks of age. Males were timed-mated with control-fed C57BL6/J females to generate pregnancies, followed by collection of sperm, and placentas at embryonic day (E)14.5. Chromatin immunoprecipitation targeting histone H3 lysine 4 tri-methylation (H3K4me3) followed by sequencing (ChIP-seq) was performed on sperm to define obesity-associated changes in enrichment. Paternal obesity corresponded with altered sperm H3K4me3 at promoters of genes involved in metabolism and development. Notably, altered sperm H3K4me3 was also localized at placental enhancers. Bulk RNA-sequencing on placentas revealed paternal obesity-associated sex-specific changes in expression of genes involved in hypoxic processes such as angiogenesis, nutrient transport, and imprinted genes, with a subset of de-regulated genes showing changes in H3K4me3 in sperm at corresponding promoters. Paternal obesity was also linked to impaired placenta development; specifically, a deconvolution analysis revealed altered trophoblast cell lineage specification. These findings implicate paternal obesity effects on placenta development and function as one potential developmental route to offspring metabolic disease.

    1. Genetics and Genomics
    Tade Souaiaia, Hei Man Wu ... Paul F O'Reilly
    Research Article

    The use of siblings to infer the factors influencing complex traits has been a cornerstone of quantitative genetics. Here, we utilise siblings for a novel application: the inference of genetic architecture, specifically that relating to individuals with extreme trait values (e.g. in the top 1%). Inferring the genetic architecture most relevant to this group of individuals is important because they are at the greatest risk of disease and may be more likely to harbour rare variants of large effect due to natural selection. We develop a theoretical framework that derives expected distributions of sibling trait values based on an index sibling’s trait value, estimated trait heritability, and null assumptions that include infinitesimal genetic effects and environmental factors that are either controlled for or have combined Gaussian effects. This framework is then used to develop statistical tests powered to distinguish between trait tails characterised by common polygenic architecture from those that include substantial enrichments of de novo or rare variant (Mendelian) architecture. We apply our tests to UK Biobank data here, although we note that they can be used to infer genetic architecture in any cohort or health registry that includes siblings and their trait values, since these tests do not use genetic data. We describe how our approach has the potential to help disentangle the genetic and environmental causes of extreme trait values, and to improve the design and power of future sequencing studies to detect rare variants.