Schematic of the three-way combinatorial approach. A) Three-way combinatorial massively parallel reporter assay (MPRA) design to test enhancer-enhancer-promoter combinations. Eight barcoded reporter assay libraries, one per promoter, were constructed. Pairs of DNA elements (enhancers and scrambled control sequences) were inserted downstream of the barcoded reporter. Enhancers and controls can be placed in either orientation in enhancer position 1 (E1) or enhancer position 2 (E2). B) The library design yields 8 matrices that contain control-control combinations (CC), enhancer-control combinations (EC and CE) and enhancer-enhancer combinations (EE). C) and D) Two example loci, the Sox2 LCR (locus control region) (C) and Otx2 (D), from which we selected enhancers to test in the reporter libraries. Enhancers (orange) were defined around DNase I hypersensitivity sites from mouse embryonic stem cells (mESC). Promoters (blue) were chosen according to TSS annotation and mESC DNase I hypersensitivity sites. DNase I data are publicly available from Joshi et al., 2015.

Effects of single enhancers across promoters. A) Enhancer-promoter median boost index matrix of single enhancers. For each single enhancer, a median boost index (see figure and Methods; log2(Activity_Combination / control-control baseline)) was calculated across all enhancer-control combinations for that particular enhancer in any position and orientation. The baseline is the median activity of the promoter across all control-control combinations. For controls, a median boost index was calculated across all controls over the median activity of all controls. Color coding of the matrix corresponds to the median boost indices; white spaces are missing data. B) Distribution of boost indices for all enhancer+control combinations for the Sox2 promoter. The leftmost column corresponds to the boost index distribution for all control-control combinations for the Sox2 promoter. Each dot represents one enhancer+control combination. Horizontal lines correspond to the median of each distribution (same median as represented in the matrix in A). C) Distribution of boost indices for all enhancer+control combinations for the enhancer Nanog_E074 across all promoters. Each dot represents one enhancer+control combination. Horizontal lines correspond to the median of each distribution. D) Distribution of median boost indices for each single enhancer across promoters. Each dot represents the median of an enhancer across all enhancer+control combinations for that enhancer (same median as represented in the matrix in A). Coloring in B and D represents the significance at a 1% FDR for a Wilcoxon test comparing the boost index distribution of a single enhancer's enhancer+control combinations with that of the controls.
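As a minimal sketch of the boost index described in this legend, the Python snippet below computes log2(activity / control-control baseline) for each enhancer-control combination and takes the median. The function names and the activity values are hypothetical and only illustrate the arithmetic; they are not taken from the study's analysis code.

```python
import math
from statistics import median


def boost_index(activity, baseline):
    """log2 ratio of a combination's activity over the control-control baseline."""
    return math.log2(activity / baseline)


def median_boost_index(ec_activities, cc_activities):
    """Median boost index of one enhancer with one promoter.

    ec_activities: activities of all enhancer-control combinations of that
                   enhancer (any position and orientation).
    cc_activities: activities of all control-control combinations of the
                   same promoter; their median is the baseline.
    """
    baseline = median(cc_activities)
    return median([boost_index(a, baseline) for a in ec_activities])


# Hypothetical activities, for illustration only.
cc = [0.8, 1.0, 1.2]   # control-control combinations, baseline = 1.0
ec = [2.0, 4.0, 8.0]   # enhancer-control combinations of one enhancer
print(median_boost_index(ec, cc))  # 2.0
```

With a baseline of 1.0, the three enhancer-control activities give boost indices of 1, 2 and 3, so the value that would be shown in the matrix for this enhancer and promoter is 2.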

Effects of enhancer-enhancer combinations. A) Fragment-fragment combinatorial boost index matrix for the Lefty1 promoter. Each square represents one control-control (CC), enhancer-control (EC or CE) or enhancer-enhancer (EE) combination. Color coding corresponds to the average boost index for each combination across all orientations, measured over the median control-control baseline. B) Boost index distributions across all 8 promoters for each combination type: control-control (CC), enhancer-control (EC, regardless of position) or enhancer-enhancer (EE). P-values correspond to the result of a Wilcoxon test. C) Relationship between the observed boost index for each EE combination and the observed boost index of the stronger single enhancer of the pair for the Sox2 and Lefty1 promoters. Blue lines represent the LOESS fit of the data. D) Observed and expected additive activities for the Sox2_E178+Sox2_E182 combination with the Sox2 promoter, and the individual activities of each of the elements. Each column represents the observed activities for the control-control combinations, the enhancer-control combinations and the enhancer-enhancer combinations. The horizontal bars represent the median of each distribution. The horizontal black line represents the expected additive activity of the enhancer-enhancer combination, calculated in linear space by the formula in the panel. The horizontal red lines represent the propagated standard deviation of the expected additive activity of the enhancer-enhancer combination, as calculated by the formula in the panel. E and F) Relationship between observed and expected activities (additive in E, multiplicative in F) for all enhancer-enhancer combinations for the Sox2 promoter. The blue lines represent the linear fit of the data. The grey diagonal line is the x=y identity line. In all panels R represents Pearson’s correlation. Expected activities are calculated in linear space and then plotted in log2 space.
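The legend refers to formulas shown in the figure panel that are not reproduced in the text. The sketch below therefore uses one common convention as an assumption: the additive expectation adds each enhancer's activity in excess of the baseline to the baseline, the multiplicative expectation multiplies the two fold changes, and the standard deviation of the additive expectation is propagated as for a sum of independent terms. All function names and numbers are hypothetical.

```python
import math


def expected_additive(a_e1, a_e2, baseline):
    """Assumed additive model in linear space: baseline plus the excess
    activity of each single enhancer over the baseline."""
    return a_e1 + a_e2 - baseline


def expected_multiplicative(a_e1, a_e2, baseline):
    """Assumed multiplicative model in linear space: baseline times the
    fold change of each single enhancer."""
    return a_e1 * a_e2 / baseline


def additive_sd(sd_e1, sd_e2, sd_baseline):
    """Propagated standard deviation of the additive expectation,
    assuming the three terms are independent."""
    return math.sqrt(sd_e1 ** 2 + sd_e2 ** 2 + sd_baseline ** 2)


# Hypothetical linear-space activities, for illustration only.
baseline, a_e1, a_e2 = 1.0, 4.0, 2.0
add = expected_additive(a_e1, a_e2, baseline)        # 5.0
mult = expected_multiplicative(a_e1, a_e2, baseline)  # 8.0

# Expectations are computed in linear space and only then log2-transformed.
print(math.log2(add), math.log2(mult))
```

In this toy case the two models already diverge (5 versus 8 in linear space), which is why comparing observed activities against each expectation can discriminate between them.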

Supra- and sub-additive behaviours of enhancer combinations. A) Distributions of log2 ratios of observed activity over expected additive activity for enhancer-enhancer combinations across promoters. Colored in turquoise are supra- and sub-additive combinations, for which the observed activity is more than 1 standard deviation away from the expected activity. Horizontal bars represent the median of each distribution. Numbers at the top are the percentage of supra-additive combinations for each promoter. Numbers at the bottom are the percentage of sub-additive combinations for each promoter. B) Relationship between the percentages of supra- and sub-additive enhancer-enhancer combinations and promoter control-control baselines. Blue lines are the linear fit of the data. R is Pearson’s correlation. C) Average supra- or sub-additive behaviour of each single enhancer across enhancer-enhancer combinations for each promoter. Each dot represents the median log2 observed over expected ratio for all enhancer-enhancer combinations of a single enhancer and a particular promoter. Grey bars represent the median of each distribution. D) For 2 example enhancers, distribution of log2 observed over expected ratios for combinations of that enhancer with any other enhancer and promoter. Horizontal black bars represent the median of the distribution. E) Distribution of log2 observed over expected ratios for enhancer-enhancer pairs from the same enhancer cluster (within clusters) or from different enhancer clusters (between clusters). The p-value results from comparing both distributions using a Wilcoxon test. Horizontal grey bars represent the median of each distribution. In all panels Obs/exp refers to observed activity over expected additive activity.
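The classification rule in panel A can be sketched as follows: a combination is called supra-additive when its observed activity exceeds the expected additive activity by more than one standard deviation, and sub-additive when it falls short by more than one standard deviation. The sketch assumes the standard deviation is that of the expected activity, in linear space; the function names and input values are hypothetical.

```python
import math


def classify(observed, expected, expected_sd):
    """Label one enhancer-enhancer combination by the 1-SD rule."""
    if observed > expected + expected_sd:
        return "supra-additive"
    if observed < expected - expected_sd:
        return "sub-additive"
    return "additive"


def log2_obs_over_exp(observed, expected):
    """log2 ratio of observed over expected additive activity."""
    return math.log2(observed / expected)


def percentages(combinations):
    """Percentage of supra- and sub-additive combinations for one promoter.

    combinations: list of (observed, expected, expected_sd) tuples.
    """
    labels = [classify(o, e, sd) for o, e, sd in combinations]
    n = len(labels)
    supra = 100.0 * labels.count("supra-additive") / n
    sub = 100.0 * labels.count("sub-additive") / n
    return supra, sub


# Hypothetical (observed, expected, SD) values, for illustration only.
combos = [(10.0, 5.0, 1.0), (2.0, 5.0, 1.0), (5.5, 5.0, 1.0), (4.6, 5.0, 1.0)]
print(percentages(combos))  # (25.0, 25.0)
```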

Non-linear responses of promoters to enhancer-enhancer combinations. A) Relationship between observed boost indices and average boost index across promoters for all shared enhancer-enhancer combinations. Blue lines represent the linear fit of the data. B) Relationship between the slopes extracted from the linear fits in A and the baseline promoter activities derived from the control-control combinations. The formulae depict the relationship between the average boost indices and the observed boost indices of each promoter through the extracted slopes. For both panels R is Pearson’s correlation.
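To illustrate how a per-promoter slope of the kind described in panels A and B could be extracted, the sketch below fits an ordinary least-squares line of one promoter's observed boost indices against the average boost indices across promoters, and reports Pearson's correlation. Whether the original fits include an intercept is not stated in the legend, so the intercept here is an assumption, and the data are invented.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def pearson_r(x, y):
    """Pearson's correlation coefficient between x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


# Hypothetical boost indices for shared enhancer-enhancer combinations.
average_across_promoters = [0.0, 1.0, 2.0, 3.0]
observed_for_promoter = [0.0, 0.5, 1.0, 1.5]  # responds half as strongly

slope, intercept = fit_line(average_across_promoters, observed_for_promoter)
print(slope, intercept)  # 0.5 0.0
print(pearson_r(average_across_promoters, observed_for_promoter))
```

A slope below 1 in this toy example means the promoter responds less strongly than the average promoter to the same enhancer-enhancer combinations; panel B relates such slopes to the control-control baseline activity.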

Schematic of the cloning strategy to generate enhancer-enhancer-promoter libraries. First, the 8 different promoters were cloned into a reporter vector. Each of the 8 vectors was then barcoded and linearised, and random enhancer-enhancer, enhancer-control and control-control combinations were cloned downstream of the barcoded reporter.

Reproducibility of experimental data. Matrix of replicate correlations for all 8 enhancer-enhancer-promoter (EEP) libraries. Lower left panels show the two-dimensional density plots of replicate-replicate activity correlations. Middle panels show the one-dimensional density plot of each replicate. Upper right panels show the Pearson’s correlation of the mirrored lower left panels.

Position and orientation bias. A) Relationship between single enhancer boost indices across all EC combinations in position 1 versus position 2 (see Figure 1). B) Relationship between single enhancer boost indices across all EC combinations in plus versus minus orientation, in position 1 (left) and in position 2 (right). For both panels, each dot is one enhancer with one promoter. R is Pearson’s correlation.

Selectivity of single enhancers. A) Relationship between average boost index of each single enhancer across all promoters and the F-statistic of a Welch F-test for that enhancer across all promoters. Colors indicate significance at a 5% FDR. B) Distributions of single enhancer boost indices for each enhancer across promoters. Each dot is one enhancer with one promoter and the boost index is the average boost index across all enhancer-control combinations. Colors indicate significance of the Welch F-test at a 5% FDR. Vertical black bars represent the median of the distribution.

Additivity versus multiplicativity for all promoters. Relationship between observed and expected activities (additive in A, multiplicative in B) for all enhancer-enhancer combinations for all promoters. The blue lines represent the linear fit of the data. In all panels R and R² are based on Pearson’s correlation.

Promoter-promoter boost index correlations for all shared enhancer-enhancer combinations. Lower left panels show the two-dimensional density plots of promoter-promoter boost index correlations. Middle panels show the one-dimensional density plot of each promoter. Upper right panels show the Pearson’s correlation of the mirrored lower left panels. Blue lines are the linear fit of the data.