At the top, two native genes in the genome with multiple promoter variants (stars) are shown. Below the two genes, the TSS and Upstream MPRA library designs are illustrated. The TSS library tests …
(A) Number of barcodes tagging a given oligo in the TSS library. The inset shows the range from zero to 5000 barcodes, which contains the majority of the distribution. (B) As in (A), but for the …
Boxplots show the median as a thick horizontal line, with the box showing the 25th and 75th percentiles. Whiskers show the largest value no further than 1.5 times the inter-quartile range; data points …
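The boxplot convention described in this legend can be made concrete with a small sketch. The function name `boxplot_stats`, the choice of Python, and the quartile interpolation method are assumptions made for illustration; the analysis in the paper was carried out in R, and its plotting code is not shown in this section.

```python
from statistics import quantiles


def boxplot_stats(values, whisker=1.5):
    """Summarise values the way a Tukey-style boxplot draws them.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    data = sorted(values)
    # Quartiles by linear interpolation between the min and max
    # (the "inclusive" method; the paper's exact method is not stated).
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    # Whiskers end at the most extreme observations inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        # Points beyond the whiskers are drawn individually.
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


if __name__ == "__main__":
    # Made-up numbers, not data from the study.
    print(boxplot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 100]))
```

Note that the whisker ends at an observed value, not at the fence itself, which is why a single extreme point does not stretch the whisker.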
The figure shows details of the molecular reactions used to make Illumina sequencing libraries for barcode counting. Primer names are given in quotes. For RNA, Protocol one is on the left and …
(A) Correlations of expression driven by oligos among replicates in the TSS library. (B) Average expression across TSS replicates driven by the 200 oligos from Sharon et al., 2012 compared to their …
(A) A scatterplot showing the effect size and significance for each variant. The genome-wide significance threshold is shown as a dashed horizontal line. Variants with the most significant effects, …
Statistical tests for each variant.
Results for the TSS and Upstream libraries are in separate worksheets.
Statistical results for each variant after aggregating across the two sub-libraries.
(A) Variants measured in the same library (TSS or Upstream) but in opposite orientation (‘strand’) with respect to the reporter gene. Blue points: variants with significant (5% FDR) effects on the …
(A) Scatterplot showing the MPRA effect of the most significant causal variant per gene (y-axis) versus the effect of the local eQTL (x-axis). Red dots indicate local eQTLs with a LOD score of at …
Local eQTLs and variant results.
Each panel shows, for one gene, MPRA expression driven by four oligos with the indicated combination of BY and RM alleles at the two variants. Each panel states the gene name in bold along with …
Results from the test for epistatic interactions.
Each panel shows, for one gene, MPRA expression driven by four oligos with the indicated combination of BY and RM alleles at the two variants. Each panel states the gene name in bold along with …
(A) Non-TF features. The figure shows the strength of association between each feature and variant causality. Error bars show the standard error of the mean. Significant associations are indicated …
Single-feature regression analyses of causal variants.
Logistic and linear regression (to predict log-fold change) are in separate worksheets.
(A) A variant in the promoter of the SFA1 gene alters a strong Msn2/4 motif (Yeastract consensus motif: CCCCT) as well as a strong Haa1 motif (SMGGSG). TFBSs detected by the Yeastract website in …
(A) Prediction results for 112 models. On the left, the plot shows the performance of binomial classifiers on the 10% test as black bars. On the right, the plot shows the performance of the linear …
Multiple-feature regression analyses of causal variants.
Logistic and linear regression (to predict log-fold change) are in separate worksheets.
Library | TSS | Upstream |
---|---|---|
Designed oligos | 7211 | 9882 |
Variants in design | 3645 | 4547 |
Genes in design | 2172 | 1918 |
Oligos in finished library | 6565 | 9646 |
Barcodes | 9.2 million | 20 million |
Median barcodes per oligo | 590 | 1008 |
Variants with data | 2427 | 4467 |
Genes with data | 1429 | 1824 |
Number of replicates | 12 | 6 |
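As a reading aid, the counts in the table above can be turned into recovery fractions, for example the share of designed oligos present in the finished library. The sketch below only divides numbers copied from the table; the names `TABLE`, `LIBRARIES`, and `fraction` are hypothetical and not part of the paper.

```python
# Counts copied from the table above, as (TSS, Upstream).
TABLE = {
    "Designed oligos": (7211, 9882),
    "Oligos in finished library": (6565, 9646),
    "Variants in design": (3645, 4547),
    "Variants with data": (2427, 4467),
    "Genes in design": (2172, 1918),
    "Genes with data": (1429, 1824),
}
LIBRARIES = ("TSS", "Upstream")


def fraction(numerator_row, denominator_row):
    """Per-library ratio of two table rows (hypothetical helper)."""
    return {
        lib: TABLE[numerator_row][i] / TABLE[denominator_row][i]
        for i, lib in enumerate(LIBRARIES)
    }


oligo_recovery = fraction("Oligos in finished library", "Designed oligos")
variant_recovery = fraction("Variants with data", "Variants in design")
gene_recovery = fraction("Genes with data", "Genes in design")

if __name__ == "__main__":
    for label, result in [
        ("Oligos recovered", oligo_recovery),
        ("Variants with data", variant_recovery),
        ("Genes with data", gene_recovery),
    ]:
        for lib in LIBRARIES:
            print(f"{label:20s} {lib:9s} {result[lib]:.1%}")
```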
Information on replicate samples.
Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information |
---|---|---|---|---|
Strain, strain background (Saccharomyces cerevisiae) | BY4741 | Albert et al., 2018 (doi:10.7554/eLife.35471) | | |
Recombinant DNA reagent | RCP83 plasmid | This paper (Addgene plasmid #163466) | Plasmid backbone | |
Recombinant DNA reagent | SurePrint Oligonucleotide Libraries | Agilent | Custom DNA oligo library | |
Gene (Aequorea victoria) | yEGFP | pKT0127 (Addgene plasmid #8728) | Reporter gene | |
Software, algorithm | R version 3.5.0 | https://www.r-project.org | Data analysis | |
Library | Fisher’s exact test (odds ratio) | Fisher’s exact test (p-value) | Correlation with number of significant ASE datasets (rho) | Correlation with number of significant ASE datasets (p-value) | Correlation with ASE magnitude (rho) | Correlation with ASE magnitude (p-value) |
---|---|---|---|---|---|---|
TSS | 3.1 | 3e-7 | 0.18 | 2e-8 | 0.11 | 0.0009 |
Upstream | 2.8 | 6e-10 | 0.21 | 3e-12 | 0.13 | 0.00001 |
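For readers unfamiliar with the first two columns, the following sketch shows how an odds ratio and a two-sided p-value are obtained from a 2x2 contingency table with Fisher's exact test. Which categories form the 2x2 table is not stated in this section, so the counts in the usage example are made up, and `fisher_exact_2x2` is a hypothetical pure-Python helper, not the authors' code. R's `fisher.test` reports a conditional maximum-likelihood odds ratio, which can differ slightly from the simple sample odds ratio computed here.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Returns (sample odds ratio, p-value). Illustrative sketch only.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Two-sided p-value: sum over all tables that are no more
    # probable than the observed one (small tolerance for rounding).
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-7)
    )
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, min(p_value, 1.0)


if __name__ == "__main__":
    # Hypothetical counts, not data from the study.
    odds, p = fisher_exact_2x2(30, 70, 12, 88)
    print(f"odds ratio = {odds:.2f}, p = {p:.3g}")
```

An odds ratio above 1, as in both rows of the table, indicates that the two properties being cross-tabulated co-occur more often than expected by chance.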
Oligo design.
Oligo counts per replicate.
Non-TF features for each variant, along with statistical results for each variant.
All features used in variant annotation, including TFBS (gzipped text file).
Primer sequences and sequences of various components of the reporter gene construct.
BY/RM sequence variants used in MPRA design (gzipped vcf file).
Sequence and map of a plasmid in the library after completed library construction.
In place of the library, the sequence contains an example promoter fragment.