Figure 1. The computational framework for analyzing the ability of transcription factors to bind to nucleosomes by integrating ChIP-seq, MNase-seq and DNase-seq data for motif enrichment and binding motif analysis.

Figure 2. Identifying pioneer factors using motif enrichment analysis. a) Ranking of transcription factors (TFs) by binding motif enrichment scores. Known pioneer factors from the FOXA, GATA and Yamanaka factor families are indicated by squares, diamonds and triangles, respectively, while other TFs are shown as circles. Colors correspond to false discovery rate (FDR) q-values. Mann–Whitney U tests are performed for FOXA, GATA and Yamanaka factors under the null hypothesis that their mean enrichment scores are equal to those of canonical TFs. * - p-value < 0.05; ** - p-value < 0.005. b) Binding motif profiles of TFs with the highest and lowest motif enrichment scores (ranked at the top 20% among all TFs). The number of motifs for each TF is normalized to the range between 0 and 1 as follows: $X_{\text{normalized}}(i) = \frac{X(i) - X_{\min}}{X_{\max} - X_{\min}}$, where $X(i)$ is the number of binding motifs at the $i$th base pair from the nucleosomal dyad position, and $X_{\max}$ and $X_{\min}$ are the maximal and minimal motif counts. c) TF motif enrichment scores are used to distinguish pioneer TFs (PTFs; FOXA, GATA, and CEBP families) from other canonical TFs. Receiver operating characteristic (ROC) analyses of motif enrichment scores are performed. Here NRs (nucleosomal regions) are defined as nucleosomal DNA regions located in differentially open chromatin regions, and NDRs (nucleosome-depleted regions) are located in conserved open chromatin regions. Using differential and conserved open chromatin regions in motif enrichment analysis significantly increased the area under the curve (AUC) from 0.79 to 0.94. d) Comparison of the enrichment scores of TFs in different clusters identified from recent EMSA experiments.
Cluster 1: strong binders to both free DNA and nucleosomal DNA; Cluster 2: weak binders to both free DNA and nucleosomal DNA; Cluster 3: strong binders to free DNA but weak binders to nucleosomal DNA.

Figure 3. Association between the number of TF binding motifs (motif density) and nucleosome occupancy. a) The number of TF binding motifs and nucleosome occupancy values as a function of distance in base pairs from the nucleosome dyad. CUX1 in the MCF-7 cell line is shown as an example. Nucleosome occupancy is calculated as the number of mapped nucleosomal DNA base pairs at each location using the MNase-seq data. The number of binding motifs at each location is then normalized to the range between 0 and 1. b) Pearson correlation coefficients between the number of TF binding motifs and nucleosome occupancy values for each TF (n=225), with a median value of -0.46. c) Binding motif profiles of TFs with positive (red) or negative (black) correlation coefficients between the number of binding motifs and nucleosome occupancy. Dashed lines represent the average of the binding motif profiles. d) Comparison of binding motif profiles of CTCF between the H1 embryonic stem cell line and other somatic cell lines.

Figure 4. Clusters of TF binding motif profiles on nucleosomal DNA. Binding motif profiles centered at nucleosomal dyad locations (+/− 60 base pairs from the dyad) are clustered using k-medoids clustering with k=6. Binding motif profiles of the two symmetrical nucleosomal halves are combined for each TF. The black line represents the average profile of all TFs in the same cluster.
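
The min-max normalization in Figure 2b, the motif-occupancy correlation in Figure 3b, and the combination of the two nucleosomal halves in Figure 4 can be illustrated with a minimal sketch. This is not the authors' code. The function names and the toy arrays are hypothetical, and averaging the two halves is an assumption, since the captions say only that the halves are "combined".

```python
import numpy as np

def min_max_normalize(counts):
    """Scale motif counts to [0, 1]: (X(i) - Xmin) / (Xmax - Xmin)."""
    counts = np.asarray(counts, dtype=float)
    lo, hi = counts.min(), counts.max()
    if hi == lo:
        # Flat profile: the formula is undefined, so return zeros (assumption).
        return np.zeros_like(counts)
    return (counts - lo) / (hi - lo)

def fold_halves(profile):
    """Combine the two symmetrical halves around the dyad (center element).

    Assumes an odd-length profile with the dyad in the middle and
    averages positions -d and +d (averaging is an assumption).
    """
    profile = np.asarray(profile, dtype=float)
    if len(profile) % 2 == 0:
        raise ValueError("profile must have odd length with the dyad at the center")
    center = len(profile) // 2
    left = profile[:center][::-1]   # positions -1, -2, ...
    right = profile[center + 1:]    # positions +1, +2, ...
    return np.concatenate(([profile[center]], (left + right) / 2.0))

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    return float(np.corrcoef(x, y)[0, 1])

# Toy data for positions -4..+4 relative to the dyad (made up for illustration).
motif_counts = [8, 6, 4, 2, 0, 2, 4, 6, 8]
occupancy = [1, 2, 3, 4, 5, 4, 3, 2, 1]

normalized = min_max_normalize(motif_counts)
folded = fold_halves(normalized)
r = pearson(normalized, occupancy)

print(normalized)  # values in [0, 1], lowest at the dyad
print(folded)      # one value per distance 0..4 from the dyad
print(r)           # strongly negative for this toy example
```

In this toy example the motif counts fall as occupancy rises, so the correlation is close to -1, which is the same sign as the median of -0.46 reported in Figure 3b.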