Overview of the LS-MPRA.

(A) Schematic representation of LS-MPRA. BACs containing genomic regions of interest are enzymatically fragmented to generate a high-complexity DNA library. Size-selected fragments are cloned into a vector containing a minimal promoter that drives expression of GFP and a barcode positioned within the 3′ untranslated region (UTR). The LS-MPRA library is then electroporated into retinal cells, where CRM activity is inferred by quantifying barcode enrichment in transcribed mRNA. (B) Illustration of the fragment-barcoding strategy and the elements that must be sequenced, wherein each DNA fragment is uniquely barcoded. (C) Preparation and sequencing of the plasmid library, which associates fragment-barcode pairs and establishes a baseline for barcode abundance. This baseline is used to normalize barcode counts after mapping of barcode-labeled fragments to genomic coordinates. ORF, open reading frame; BC, barcode.
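The normalization described in (C) can be illustrated with a minimal sketch. It assumes a simple log₂ ratio of library-size-scaled RNA barcode counts to plasmid barcode counts; the function name, the pseudocount, and the counts-per-million scaling are illustrative assumptions and are not taken from the study's pipeline.

```python
import math

def barcode_enrichment(rna_counts, plasmid_counts, pseudocount=1.0):
    """Return log2(RNA CPM / plasmid CPM) for every barcode in the plasmid library.

    Hypothetical normalization: counts are scaled to counts per million and a
    pseudocount keeps barcodes with zero RNA reads from producing log2(0).
    """
    rna_total = sum(rna_counts.values())
    plasmid_total = sum(plasmid_counts.values())
    enrichment = {}
    for barcode, plasmid in plasmid_counts.items():
        rna = rna_counts.get(barcode, 0)
        rna_cpm = (rna + pseudocount) / rna_total * 1e6
        plasmid_cpm = (plasmid + pseudocount) / plasmid_total * 1e6
        enrichment[barcode] = math.log2(rna_cpm / plasmid_cpm)
    return enrichment

# Made-up counts for three barcodes
rna = {"AAA": 30, "CCC": 10}
plasmid = {"AAA": 10, "CCC": 10, "GGG": 20}
scores = barcode_enrichment(rna, plasmid)
```

In this toy input, a barcode that is over-represented in RNA relative to the plasmid baseline receives a positive score, and a barcode present only in the plasmid library receives a negative score.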

Rho LS-MPRA to identify CRMs in the neonatal mouse retina.

(A) LS-MPRA barcode enrichment plots for the Rho locus aligned to a genome browser track from in vivo and ex vivo experiments (N=3 experimental replicates, combined), and annotated with (i) known Rho regulatory regions, (ii) the coverage of the barcode-fragment association library across the locus, (iii) log₂-transformed base conservation among 60 vertebrate and 40 placental mammal species, (iv) regions of open chromatin in the P8 mouse retina, and (v) RefSeq gene models for the locus. (B) Expanded view of a region of interest from the Rho LS-MPRA plot, showing peaks that correspond to known regulatory regions (red bars), genomic conservation, and areas of open chromatin.

Olig2 LS-MPRA to identify candidate CRMs in the developing mouse retina.

(A) Barcode enrichment plot from the Olig2 LS-MPRA, aligned with a genome browser track and annotated with (i) previously described Olig2 regulatory regions identified in mouse embryonic stem cells, fertilized murine oocytes, a mouse lymphoma cell line, or mouse ventral neural tube (Chen et al., 2008; Fan et al., 2023; Friedli et al., 2010; Sun et al., 2023), (ii) coverage of the barcode-fragment association library across the locus, (iii) log₂ base conservation across 60 vertebrate or 40 placental mammal species, (iv) regions of open chromatin in the E14 mouse retina, and (v) RefSeq gene models. (B) Two expanded regions of interest from the Olig2 LS-MPRA, showing peaks that align with known regulatory elements, genomic conservation, and open chromatin. Candidate CRMs are marked by blue bars.

Dynamic CRM activity and Olig2 expression following inhibition of Notch signaling.

(A) Bar plot showing relative endogenous Olig2 RNA expression in the retina over 0–24 hours of treatment with the γ-secretase inhibitor LY411575 or DMSO, normalized to the 0-hour control. Unpaired t-tests with Holm’s multiple comparisons correction were performed for each Control vs. Treated timepoint. (B) Barcode enrichment plot from the Olig2 LS-MPRA, aligned with a genome browser track after 0–28 hours of LY411575 treatment. Notch inhibitor-responsive CRM candidate regions (NR1–3, blue boxes) displayed differential barcode enrichment across the treatment period. Statistical analysis was performed using one-way Welch’s ANOVA with Dunnett’s T3 multiple comparisons test on AUC values, normalized to the 0-hour timepoint. Error bars in (A) and (B) represent SEM. Asterisks indicate statistical significance (*p ≤ 0.05, **p ≤ 0.01, ***p ≤ 0.001).
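As an illustration of the multiple-comparisons step named in (A), the sketch below applies Holm's step-down adjustment to a list of p-values and maps adjusted values to the asterisk thresholds given in this legend. The p-values are invented, the function names are hypothetical, and the t-tests that would produce the raw p-values are not shown.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (0-based rank) is multiplied by (m - rank)
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: adjusted values never decrease with rank
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted

def significance_stars(p):
    """Map a p-value to the asterisk notation used in the legend."""
    if p <= 0.001:
        return "***"
    if p <= 0.01:
        return "**"
    if p <= 0.05:
        return "*"
    return "ns"

# Made-up raw p-values for three Control vs. Treated comparisons
raw = [0.01, 0.04, 0.03]
adj = holm_adjust(raw)
labels = [significance_stars(p) for p in adj]
```

With these invented inputs, only the first comparison remains at or below 0.05 after adjustment.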

Co-localization of GFP driven by Olig2 CRMs and Olig2 in retinal cells.

(A) Schematic of plasmid constructs containing one or all Notch inhibitor-responsive regions driving GFP expression with the endogenous Olig2 minimal promoter, alongside the experimental workflow. (B) Representative transverse sections of E14 retinas electroporated with these plasmids, incubated in vitro for 24 hours, and stained using IHC for GFP and Olig2. Merged and single-channel images show GFP colocalization with Olig2 (orange arrow) as well as GFP+ cells lacking detectable Olig2 expression (yellow arrow). (C) Quantification of GFP and Olig2 co-localization, represented as pie charts showing (i) the percentage of electroporated Olig2+ cells that express GFP protein, (ii) the percentage of GFP protein+ cells that express Olig2, and (iii) the percentage of electroporated Olig2− cells that express GFP protein in retinas electroporated at E14 with plasmids containing one or all Notch inhibitor-responsive regions driving GFP. Scale bar: 20 µm.

Co-localization of CRM-directed RNA and Olig2 RNA or protein in retinal cells.

(A-C) Representative transverse sections of E14 retinas incubated in vitro for 24 hours for analysis of localization patterns of Olig2 and/or GFP. (A) Sections stained for Olig2 protein (cyan, yellow arrows) using IHC and Olig2 RNA (red, magenta arrows) using FISH. A 100% stacked column plot quantifies the overlap between Olig2 protein and Olig2 RNA expression. (B) Intrinsic GFP fluorescence and GFP RNA driven by Olig2-NR1 (yellow and magenta arrows). (C) In retinas electroporated with plasmids containing one or all Notch inhibitor-responsive regions, Olig2 RNA co-localized with GFP protein only (yellow arrow), GFP RNA only (magenta arrow), or both (orange arrow). (D) Pie charts quantifying (column 1) the percentage of electroporated Olig2 RNA+ cells that expressed GFP, (column 2) the percentage of GFP RNA+ cells that expressed Olig2 RNA, (column 3) the percentage of GFP protein+ cells that expressed Olig2 RNA, and (column 4) the percentage of Olig2 RNA− cells that expressed GFP in the electroporated retinas. Scale bar: 20 µm.

Degenerate MPRAs to identify functional residues within candidate CRMs.

(A) Schematic of d-MPRA library assembly. Point mutations were introduced into Olig2 CRM fragments via error-prone PCR, followed by intraplasmid duplication (IPD) to generate constructs with duplicated mutant CRMs flanking a minimal promoter and GFP ORF. The 3′ CRM copy, located in the 3′ UTR, served as a barcode, with a WPRE sequence included to potentially stabilize transcripts. (B) Conceptual diagram illustrating expected d-MPRA results, showing predicted changes in CRM activity upon disruption of enhancer or repressor binding sites. (C-E) d-MPRA plots displaying log₂ fold changes in mutational frequencies, smoothed with a 5-base-pair sliding window average and normalized across the Olig2-NR1 (C), Olig2-NR2 (D), and Olig2-NR3 (E) regions.
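The smoothing described for panels (C-E) can be sketched as follows. This is a minimal illustration that assumes the fold change is the per-position mutation frequency in RNA divided by that in the plasmid library, and that the window shrinks at the ends of the region; the function names, the pseudocount, and the edge handling are assumptions rather than details taken from the study.

```python
import math

def log2_fold_change(rna_freq, plasmid_freq, pseudo=1e-6):
    """Per-position log2 ratio of mutation frequencies (RNA over plasmid)."""
    return [
        math.log2((r + pseudo) / (p + pseudo))
        for r, p in zip(rna_freq, plasmid_freq)
    ]

def sliding_mean(values, window=5):
    """Centered moving average; the window is truncated at the sequence ends."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        chunk = values[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

# Made-up per-position mutation frequencies across a 7-bp stretch
rna = [0.010, 0.010, 0.002, 0.002, 0.002, 0.010, 0.010]
plasmid = [0.010] * 7
profile = sliding_mean(log2_fold_change(rna, plasmid), window=5)
```

In this toy profile, positions where mutations are depleted from the RNA pool produce a negative dip, which is the kind of signature panel (B) associates with disruption of an activating site.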

TF binding sites within the Olig2-NR2 CRM.

(A) TF binding motifs predicted using HOMER within the Olig2-NR2 CRM candidate, aligned with the average d-MPRA plot for this region (from Fig. 7). (B-D) Position frequency matrices of TF binding sites aligned to motifs predicted by HOMER in Olig2-NR2: Mybl1 to Motif 1 (B), Foxn4 and Pax6 to Motif 3 (C), and Otx2 to Motif 14 (D). (E) UMAP visualization of scRNA-seq profiles from E14 mouse retinas, with cell types previously identified from marker genes by Clark et al. (2019). (F) Expression of Olig2 on UMAP. (G) Co-expression (pink) of Olig2 (blue) with Mybl1, Foxn4, Pax6, or Otx2 (red) on UMAP. (H) TF occupancy track aligned to (i) Olig2 LS-MPRA barcode enrichment after 12-hour LY411575 treatment (replicate of Fig. 4B) and (ii) gene models. Occupancy peaks for Mybl1, Foxn4, and Pax6 align with the Olig2-NR2 CRM candidate (blue box).

Ngn2 LS-MPRA to identify CRM candidates active in mouse retinal cells expressing Ngn2.

(A) Ngn2 LS-MPRA barcode enrichment plot aligned with a genome browser track, annotated with (i) known Ngn2 regulatory regions, (ii) coverage of the barcode-fragment association library, (iii) log₂ base conservation across 60 vertebrate or 40 placental mammal species, (iv) regions of open chromatin in E14 mouse retina, (v) gene models, and (vi) CRM candidate regions 1-4 (blue boxes). (B) Representative transverse sections of E14 retinas incubated for 24 hours in vitro, showing localization of Ngn2 RNA with GFP RNA only (magenta arrow) or both intrinsic GFP and GFP RNA (orange arrows) in retinas electroporated with plasmids containing one of the CRM1-4 regions. (C) Pie charts showing the percentage of Ngn2 RNA+ cells expressing GFP (column 1), GFP RNA+ cells expressing Ngn2 RNA (column 2), GFP protein+ cells expressing Ngn2 RNA (column 3), and Ngn2 RNA− cells expressing GFP (column 4) in retinas electroporated at E14 with plasmids containing CRM1-4 regions driving GFP. (D) Co-localization analysis of Ngn2-CRM3 plasmid and GFP following 16-hour incubation in vitro. Scale bar: 20 µm.

OLIG2 LS-MPRA to identify CRM candidates in chick embryos.

(A) OLIG2 LS-MPRA barcode enrichment plot following electroporation into E5 chick retinal explants and into E2 and E4 spinal cords in ovo, aligned with a genome browser track and annotated with (i) coverage of the barcode-fragment association library, (ii) log₂ base conservation across 77 vertebrate species, (iii) gene models, and (iv) CRM candidate regions 1-3 (blue boxes). (B) Representative transverse sections of E5 chick retinas incubated for 24 hours in vitro, showing localization of chick OLIG2 RNA with GFP RNA only (magenta arrow) or both intrinsic GFP fluorescence and GFP RNA (orange arrows) driven by plasmids containing one of the CRM1-3 regions. (C) Pie charts showing the percentage of OLIG2 RNA+ cells expressing GFP (column 1), GFP RNA+ cells expressing OLIG2 RNA (column 2), GFP protein+ cells expressing OLIG2 RNA (column 3), and OLIG2 RNA− cells expressing GFP (column 4) in retinas electroporated at E5 with plasmids containing CRM1-3 regions driving GFP and visualized by FISH. Scale bar: 20 µm.

Activity of Olig2 CRM candidates in postnatal retinal cells expressing Olig2.

(A) Representative transverse sections of postnatal retinas electroporated at P0 and incubated in vivo for 24 hours, stained with antibodies against Olig2 and GFP to identify co-localized expression (orange arrows). GFP expression was driven by plasmids containing one or more Notch inhibitor-responsive regions (NR1-3). (B) Analysis of GFP and Olig2 co-localization, shown with pie charts depicting the percentage of Olig2+ electroporated cells expressing GFP (column 1), the percentage of GFP+ cells expressing Olig2 (column 2), and the percentage of Olig2− electroporated cells expressing GFP (column 3) in retinas electroporated at P0 with plasmids containing one or more Notch inhibitor-responsive regions driving GFP. Scale bar: 20 µm.

Grm6, Vsx2, and Cabp5 LS-MPRAs to identify CRMs in the neonatal mouse retina.

Barcode enrichment plots from the Grm6 (A), Vsx2 (B), and Cabp5 (C) LS-MPRAs, aligned with genome browser tracks and annotated with (i) known regulatory regions, (ii) coverage of the barcode-fragment association library across the locus, (iii) log₂ base conservation across 60 vertebrate or 40 placental mammal species, (iv) regions of open chromatin in the P8 mouse retina, and (v) RefSeq gene models. Expanded regions of interest (yellow dashed boxes) show peaks or regions that align with known regulatory elements for each gene.

Activity of backbone plasmids containing the Olig2 minimal promoter and EGFP.

(A) Representative transverse sections of E14 retinas incubated in vitro for 24 hours show sparse GFP RNA and GFP fluorescence driven by control plasmids containing either EGFP alone or EGFP under the Olig2 minimal promoter. Pie charts show the percentage of electroporated cells with detectable GFP expression.

TF binding sites in the Olig2-NR1 and Olig2-NR3 CRMs.

(A, H) TF binding motifs identified within Olig2-NR1 (A) and Olig2-NR3 (H) CRM candidates, aligned with the average d-MPRA plot (from Fig. 7). (B-E, G, I-K) Position frequency matrices of transcription factors aligning to Olig2-NR1: Sox4/11 to Motifs 13 and 16 (B), Lhx2 and Dlx2 to Motif 11 (C), Isl1 to Motif 8 (D), Foxp1 to Motif 18 (E), Mybl1 to Motif 2 (G), and Otx2 to Motif 12 (G); or Olig2-NR3: Ngn2 to Motifs 3 and 5 (I), Bhlhe22 to Motif 13 (J), and Lhx9 to Motif 15 (K). (F, L) Co-expression (pink) of Olig2 (blue) with Sox11, Sox4, Lhx2, Dlx2, Isl1, or Foxp1 (red) for Olig2-NR1 (F) and Ngn2, Bhlhe22, or Lhx9 (red) for Olig2-NR3 (L) on UMAP visualization of E14 mouse retinal gene expression.

Activity of OLIG2 CRMs in OLIG2+ cells within embryonic chick spinal cords.

(A) Schematic of a transverse section of the E2 chick spinal cord with the field of view (red outline) shown in B. (B) Representative transverse hemi-sections of ventral E2 chick spinal cords incubated for 24 hours in ovo, showing (i) OLIG2 RNA localized with GFP RNA (orange arrow), (ii) OLIG2 expression in GFP− cells (yellow arrows), and (iii) GFP RNA expression in OLIG2− cells (magenta arrows). GFP expression is driven by plasmids containing one of the CRM candidate regions (CRM1-3). Scale bar: 20 µm.

Primer sequences

Genome Coordinates for ROIs