Figures and data

Transcription Factor Occupancy Profiling with MNase “MOA-seq” in HUVECs following hypoxia for 1, 3, and 24h.
MOA-seq, MNase-defined cistrome-Occupancy Analysis, (Savadel et al. 2021), was developed to globally identify protein-bound cis-regulatory sequences in a single robust assay using modified nuclease profiling of fixed chromatin and customized bioinformatic analysis pipelines. MOA-seq was applied to HUVEC cells stressed with hypoxia to map short (1h and 3h) or long (24h) exposure-induced changes in TF occupancies. (A) The treatment scheme for the four samples used in this study: 0h = normoxia; 1h, 3h, and 24h refer to hypoxia (2% oxygen) exposure times. (B) Diagrammatic summary of the MOA-seq, starting with formaldehyde-fixed cells to preserve chromatin structure including bound transcription factors (TFs) at their cognate binding sites (cis-elements). Light/partial digest with MNase preferentially cuts DNA within nucleosome-free/accessible chromatin regions (NRF/ACR), but not where protein occupancy occurs. Recovery of de-crosslinked, small DNA fragments from ideal partial digests (yellow asterisks, inset gel) is used for NGS library production. MOA-seq coverage profiles of aligned read pair fragments (orange) and their fragment centers (frenters, dark orange) reveal footprints for analysis. The frenter peaks reflect protein-protected footprints within the NFR/ACRs (e.g. cis-element 1 and 2) but do not mark unoccupied cis-elements (e.g. cis-element 3). Peak regions are used for comparative genomics and motif discovery and analysis.

MOA-seq maps gene regulatory features at ENCODE cCREs and additional sites with HUVEC cell-type specificity in normoxic cells.
MOA-seq coverage profiles were compared to ENCODE and other genomic annotations related to gene expression and regulation. (A,B) Profile trend plots of MOA coverage showing peaks at the transcription start (TSS) and end (TES) sites of genes, with peak heights positively correlating with transcript abundance quartiles. (C) Profile trend plots of MOA frenter coverage profiles colocalize with cCRE subtypes (K4m3, DNase I-H3K4me3; dELS, distal enhancer-like signature; pELS, proximal enhancer-like signature; PLS, promoter-like signature). (D) Intersection significance is shown by Z-scores calculated from peak bp intersections with cCRE subtypes. (E) Intersection significance is shown by Z-scores calculated from peak bp intersections with DNase I hypersensitive (DHS) peaks from multiple different cell types. (F) The abundance of MOA-peaks relative to DHS regions or known cCRE regions. (G) Genome browser session showing an example of a MOA peak within a cCRE and DHS region (top) and another example in a non-cCRE and non-DHS region (bottom), with tracks labeled on the left.

MOA-seq Enrichment at ReMap ChIP Summits for Canonical HUVEC Transcription Factors.
MOA-seq enrichment at previously characterized TF binding sites as defined by ChIP-seq, using ReMAP (Hammal et al. 2022) ChIP summits for 13 canonical HUVEC TFs (Payne et al. 2024). Profile trend (A) and heatmap (B) plots of MOA 0hr frenters coverage centered on ReMap HUVEC ChIP peak summits. Data plotted were from regions with non-zero MOA coverage of ∼250k summits from a total of ∼750k spanning various studies and treatments. (C) Bar chart showing total unfiltered enrichment scores (MOA coverage ratio of summit value/flanks, see Methods) for a select subset of TF ChIPs from experiments conducted in HUVEC only, with nearby (< 10 bp) summits for MOA and ChIP. (D) Profile trend plots depicting local average reads per million for MOA frenters (red) and ChIP-seq (blue). ChIP-seq data is from individual HUVEC experiments (Wang et al. 2019) centered on the canonical motif for the respective transcription factor (TFs). (E) Bar charts illustrating the distance (bp) of MOA and ChIP summits from the motif midpoint for each TF.

Hypoxia-induced differential MOA (diff-MOA) peak analysis and transcription factor enrichment across genomic regions.
To identify genomic regions that changed their TF occupancy footprints in response to hypoxia, bdgdiff (MACS3) of MOA frenter coverage was used to call peaks using the elbow method. (A) Table of diff-MOA peak numbers and sizes for diff-MOA peaks. Each is one of the three hypoxia time points relative to 0h normoxia. (B) Venn diagram depicting the overlap of genes with diff-MOA footprints in cCRE versus non-cCRE regions. Genes in the intersection contain multiple diff-MOA footprints, with at least one footprint located in a cCRE and one located in a non-cCRE region. Schematics depict genes with a diff-MOA peak overlapping a cCRE region (light blue box) or a non-CRE region (light orange box). (C) Enrichments (adjusted p-values) for gene lists derived from diff-MOA peaks in cCRE versus non-cCRE regions. Results from ENRICHR (https://maayanlab.cloud/Enrichr/) using the libraries for ENCODE and ChEA Consensus TFs from ChIP-X. The plot was sorted by P-values for the total diff-MOA peaks (grey bars) and all significant TFs (dashed lines) are shown. (D) Base pair density overlap (per megabase) of gain and loss peaks across ENCODE cCRE subclasses: K4m3 (DNase I–H3K4me3), dELS (distal enhancer-like signature), pELS (proximal enhancer-like signature), and PLS (promoter-like signature), along with an additional category labeled “Other” representing peaks not overlapping any cCRE subclass. (E) Genome browser session showing two examples each for diff-MOA gain, diff-MOA loss, or shared for 24h hypoxia treatment. The within-sample, non-differential peaks are indicated (Peak, blue boxes). The differential, diff-MOA, peaks are shown above each pair, and boxed with a unique color and indicated as 24h GAIN (dark orange), 24h LOSS (dark grey), or 24h SHARED (purple).

Comparison of differential MOA with differential gene expression.
Comparison of hypoxia-induced diff-MOA genes to hypoxia-induced differentially expressed genes (DEGs) with regard to gene enrichment analyses, changes in RNA levels, and gain or loss of diff-MOA footprints. (A) Gene lists defined by diff-MOA (sum of all gain- or loss-peak genes) for each hypoxia time point were pooled and used as input gene lists for the ENRICHR libraries. Top hits are shown for two libraries, Human MSigDB Hallmark and WikiPathway Human gene sets. The significance value (-log10(FDR)) is proportional to the circle size; gene list sizes are indicated at the bottom. (B) Lack of directional relationship between diff-MOA (gain or loss) and DEG (transcript up or down) is shown as gene numbers plotted for each time point. (C) Analysis of genes showing both diff-MOA and DEGs. The top shows Venn diagrams for each time point. The bottom shows ENRICHR results using the same two libraries as in panel A. Genes with both diff-MOA and DEG gene subsets (MOA) have stronger enrichments as shown by comparative analysis using randomly sampled subset of genes from the list of DEGs (Control). For each time point the average enrichment is plotted based on three size-matched subsets of genes (sampling 58 DEG genes for 1h, 101 DEG genes for 3h, and 217 DEG genes for 24h. For each such comparison, the average (n=3) subsamples significance (-log10(FDR)) was less than for the MOA plus DEG group.

Motif discovery and analysis at diff-MOA peaks identify candidate hypoxia-responsive gene regulatory network cis-elements.
The diff-MOA peaks were split into six sets based on gain (A) or loss (B) of footprints at each of the three hypoxia treatment timepoints. Genomic sequences under the peaks were analyzed using MEME motif analysis. For each list of motifs, three results are shown: the significant de novo motifs with their sequence LOGO, the TF families (italicized) and their members (bolded in parentheses), and the top 3 individual TFs sorted by E-value. For each motif (TF), the background-adjusted expected number of motifs (Eval) is listed in the second column, followed by the percentage of diff-MOA peaks containing the motif (diffMOA), and the percentage of the motif intersected with ReMap ChIP-seq data (ReMap, NA represents absence of a ReMap ChIP-seq data).

Cluster analysis and GSEA GO enrichment results.
We defined 5,383 hypoxia-induced peak regions as 30-bp windows centered on all diff-MOA peaks. The R-package k-means clustering of the MOA coverage values with the elbow method resulted in 10 clusters. (A) The hypoxia MOA clusters were named “hmc1” through “hmc10”, plotted for each peak region using Z-score standardization (Y-axes -1 to +1), along with the total number of peaks and DEGs associated with each cluster. The peak regions for each cluster were separately used as input for MEME motif analysis, and the top motifs for each are listed under their respective panels. (B, C) The cluster-specific DEGs and their fold change transcript abundance for each hypoxia treatment hour were used as input for the gene set enrichment program, gseGO in clusterProfiler. Resulting enrichments are shown and plotted across the time points for significant pathways, for suppressed (pink) or activated (blue) genes. Plots list the GO term, the hmc cluster number, and the GO ID for each pathway for those found to be shared among all three time points (left) or shared between any two of the three time points (right).

Classification into HIF1A and non-HIF1A groups, and motif co-occurrence analysis.
MOA clusters were analyzed and grouped with regard to HIF1A occupancy using intersections of diff-MOA peaks with HIF1A ChIP-seq and HIF1A motif data. (A) Chart of the number of diff-MOA peaks that intersect with ReMap HIF1A ChIP-seq peaks, sorted by time and diff-MOA footprint type, GAIN or LOSS. (B) Chart showing the delineation of clusters (listed below the graph) into two groups according to diff-MOA footprint type, gain or loss. Each bar is further subdivided to show the proportion of diff-MOA peaks that intersect only HIF1A ChIP-seq peaks (lavender for GAIN, pink for LOSS) and those that also have the HIF1A motif (grey). The schematics depict the three-feature regions where the diff-MOA peak overlaps both HIF1A ChIP and HIF1A motif (grey) or the two-feature regions where the diff-MOA peak only intersects the HIF1A ChIP peak but not the motif (lavender for GAIN, pink for LOSS). (C) Classification of “hmc” clusters into HIF1A and non-HIF1A groups based on ChIP validation and HIF1A motif intersection greater than expected by chance (z-score > 2). The HIF1A group is shown with timepoints labeled on the x-axis for each cluster, left y-axis (as in Fig. 7A), and the right y-axis represents the percentage of HIF1A-intersecting peaks in a cluster. The star marks the percentage and the timepoint that makes up the greatest proportion of intersections (boxed on x-axis) with HIF1A ChIP for a given cluster. (D) The non-HIF1A group of clusters, which show a general downward trend for occupancy relative to 0h. (E) Other non-HIF1A TF motifs found in clusters from the two groups. The Venn diagram shows 21 non-HIF1A-group TFs (found in at least two clusters), 14 HIF1A-group TFs (found in at least three clusters), and 34 shared TFs. The unique and shared TF motifs are listed in the boxes, along with asterisks to indicate those which co-occur (+/- 50 bp) with the HIF1A motif. Percentages in the Venn diagram represent the proportion of TF motifs in each group that co-occur with the HIF1A motif.

MOA-seq signal stability across read-depth downsampling.
Aggregate MOA-seq coverage profiles (average counts per million, CPM) centered on ERG motif midpoints (left column) and CTCF candidate cis-regulatory elements (cCREs; right column) are shown for two biological replicates at 0, 1, 3, and 24 hours. For each panel, the original dataset (blue) is overlaid with profiles generated after random downsampling to 15 million, 10 million, 5 million, and 1 million reads (colored traces).

Genomic distribution of normoxic MOA peaks.
Comparable genomic distribution of MOA peaks detected under normoxic conditions, annotated using ChIPseeker (Wang et al. 2022; Yu et al. 2015). Peaks were classified by genomic context, including promoter (±200 bp from the TSS), 5′ UTR, 3′ UTR, exon, intron (first and other), downstream, and distal intergenic regions. (A) All normoxic peaks (n = 21,765, 100%). (B) Peaks not overlapping DHSs (n = 9,803, 45.0%). (C) Peaks not overlapping cCREs (n = 12,274, 56.4%). (D) Peaks overlapping neither DHSs nor cCREs (n = 7,202, 33.1%).

MOA Coverage at ReMap ChIP summits from HUVEC experiments for 13 transcription factors.
Profile trend plots of local MOA frenter coverage for the individual TFs grouped primarily by signal height. Plots are centered on the ReMAP ChIP peak summit for each TF.

Heatmaps of MOA-seq frenter coverage at diff-MOA peaks across hypoxia time points.
MOA-seq frenter coverage heatmaps of the diff-MOA (GAIN vs LOSS of footprints relative to normoxia) for each time point. Heatmaps were sorted by total MOA signal averaged over the 1 kb window centered on the peaks. In addition, sorting was done for diff-MOA gain sites using the hypoxic time point as the primary sort, and for diff-MOA loss sites, using the normoxic time point as the primary sort.

Distribution of diff-MOA peaks across ENCODE cCRE subclasses.
Percentage of peaks overlapping ENCODE cCRE subclasses: PLS (promoter-like signature), pELS (proximal enhancer-like signature), dELS (distal enhancer-like signature), CTCF-binding sites, K4m3 (DNase I–H3K4me3), and Other. The “Other” category refers to peaks not overlapping any ENCODE cCRE subclass. (A) Normoxic MOA peaks. (B) diff-MOA gain peaks at 1h, 3h, and 24h. (C) diff-MOA loss peaks at 1 h, 3 h, and 24 h. (D) diff-MOA shared peaks at 1h, 3h, and 24h.

Base pair density overlap of normoxic and shared diff-MOA peaks across ENCODE cCRE subclasses.
Base pair density overlap (per megabase) of MOA peaks across ENCODE cCRE subclasses: K4m3 (DNase I–H3K4me3), dELS (distal enhancer-like signature), pELS (proximal enhancer-like signature), and PLS (promoter-like signature), along with an additional category labeled “Other” representing peaks not overlapping any cCRE subclass. (A) Normoxic MOA peaks. (B) Shared diff-MOA peaks at 1h, 3h, 2 h.

Comparison of EPAS1/HIF2A and HIF1A ChIP-seq Overlap and Motif Intersections.
(A) Pie charts showing the overlap between the HIF1A ReMap ChIP-seq data set and the EPAS1/HIF2A ChIP-seq data set. The pink sections represent peaks unique to each ChIP-seq dataset, while the lavender sections represent overlapping peaks. For HIF1A, 85.6% of its peaks are unique, while 14.4% are shared. For EPAS1/HIF2A, only 17.6% of its peaks are unique, while 82.4% are shared. (B) Motif logos from HOCOMOCOv11 for both HIF1A and EPAS1/HIF2A. (C) ChIP-seq peak intersections: The intersection of each set of gain diff-MOA peaks by hour with the HIF1A and EPAS1/HIF2A ChIP-seq datasets. Bars are color-coded to indicate the proportion that is unique to the ChIP-seq dataset (blue) and the proportion that is shared or overlapping (grey). (D) Motif intersections: The intersection of each set of gain diff-MOA peaks by hour with the HIF1A and EPAS1/HIF2A motifs found via the MEME suite. Bars are color-coded to indicate the proportion of regions exclusive to the motif (blue) and the proportion of regions with overlapping motifs (grey).