Photoreceptor-enriched full-length scRNA-seq identifies regulon-defined RPC and photoreceptor precursor states.

A. Overview of sample collection and sequencing. RPCs and developing photoreceptors were FACS-enriched and isolated directly or using C1 microfluidics. B. UMAP plot of final dataset, low resolution clusters labeled by cell type. C. Expression of marker genes for RPC/MGs (LHX2), rods (NR2E3), S cones (OPN1SW), L/M cones (THRB), rod maturation (PDE6G, RHO) and cone maturation (PDE6H, OPN1LW). Arrowheads: Late maturing RHO+ rods (top), late maturing OPN1LW+ cones (bottom). D. Ward-clustered heatmap of the highest scoring SCENIC regulons in each cluster, displaying Z-score normalized regulon activities. Late = late maturing L/M cones. E. Box plot of RAX regulon area under the curve (AUC) values for early and late L/M cones and rods. *, p<0.005; ***, p<0.0005, Dunn test. F,G. UMAP plots of regulon AUC values for (F) NRL and THRB (rod and L/M cone) and (G) PAX6, E2F2, OTX2, ISL2 (MG/RPC, RPC, PR, L/M cone). See also Figure S1.

Differential expression of NRL isoforms in rod and cone populations.

A. Expression of NRL gene and the most highly assigned Ensembl isoforms ENST00000397002, encoding full-length NRL (FL-NRL) and ENST00000560550, encoding truncated NRL (Tr-NRL). B. Mean NRL isoform assignments for clusters defined in Figure 1B, presented as normalized transcript counts (top) and percentage of total counts (bottom). Significance lines for LM vs. ER fold change, colored by isoform. ****, p <0.0002; *****, <0.000001 (bootstrapped Welch’s t-test). C. Top: Mean read counts across Ensembl annotated NRL exons for each cluster. Bottom: Transcript structures for ENST00000397002, ENST00000560550, and DD10, numbered according to amino acid positions. Minimal transactivation domain (MTD) in green. Arrowheads: Red/black: First exons where red is higher of two peaks. D. Relative difference box plot of raw reads mapping to truncated (Tr) and full length (F) transcript first exons in each cell, according to cluster. Relative difference is the difference in reads mapping to truncated and full-length NRL first exons (Tr-F) divided by the sum of both (Tr+F). Values >0 indicate more reads assigned to truncated isoform, values < 0 indicate more reads assigned to full-length isoform. ***, p <0.0001 (post-hoc Dunn test). E. Combined NRL and RXRψ immunohistologic staining and RNA FISH with probes specific to truncated Tr-NRL exon 1T (green puncta) and to FL-NRL exons 1 and 2 (orange puncta) in FW16 retina. Boxed regions enlarged at right show a RXRlo, NRL+ rod with one Tr-NRL and six FL-NRL puncta (top) and an RXRψhi, NRL cone with one Tr-NRL and no FL-NRL puncta (bottom), as indicated with same-color arrows. Scale bar: 10 µm. F. Ratio of fluorescent puncta observed in experiment depicted in (E) for NRL+ or RXRψhi cells where Tr puncta >0. ***; p <0.0005 (Welch’s t-test). G. Top: PDE6B-luciferase reporter activity in NIH3T3 cells transfected with indicated amounts (ng) of pcDNA4C-EF1α and derived FL-NRL and Tr-NRL. Bottom: Expression constructs. Error bars = standard deviation of triplicate measurements. *, p <0.05; **, <0.005 (Student’s t-test). Data representative of two experiments in NIH3T3 and one in 293T. H. Long-read nanopore sequencing of pooled 5’ RACE reactions initiated with NRL exon 3 primers and performed on cDNA libraries from 23 ER cells (top) and 21 LM cells (bottom). Each schematic shows total exon coverage (above) and individual transcript structures (below), where expressed sequences are grey and introns light blue. Full-length (FL), alternatively spliced or internally initiated exon 2 (ýex2), and truncated (Tr) transcripts are indicated by brackets. Red arrow: Transcripts resembling DD10, with internal exon 2 transcription initiation and premature splicing to exon 3. Ensembl FL-NRL and Tr-NRL transcript isoforms and RACE primer positions are shown below.

Premature transcription termination of THRB isoforms in rod and cone populations.

A. Expression of THRB and two highly assigned isoforms (ENST00000280696 and ENST00000396671) in LM and LR populations. B. THRB isoform assignments for low resolution clusters presented as total transcript counts (top) and as a percentage of the total counts (bottom). C. Top: Mean read counts across Ensembl annotated THRB exons. Bottom: Transcript structures for ENST00000396671 (TRβ1), ENST00000280696 (TRβ2), and predicted TRβ1 truncations. Green arrowhead: First TRβ2 exon. Exon numbers84 are indicated above and protein domains (AF1, DNA-binding (DBD), and ligand binding (LBD)) below TRβ1. D. Compiled read coverage for LR cells across THRB exons 4 and 6 splice donor site. E. Percentage of exon 3-6 splice donor region reads that are spliced and the exon splice donor or readthrough to the subsequent intron. F. Long-read nanopore sequencing of pooled 3’ RACE reactions initiated with THRB TRβ1 exon 4 (left) or TRβ2 exon 1 (right) primers and performed on cDNA libraries from 21 LM cells (top) and 5 LR cells (bottom). Each schematic shows total exon coverage (above) and individual transcript structures (below), where expressed sequences are grey and introns light blue. Each panel shows the entire TRβ1 or TRβ2 locus with first exons (in green boxes) shown in detail at right. Red arrowheads: intronic premature transcription termination. Ensembl TRβ1 and TRβ2 transcript isoforms and RACE primer positions are shown below.

Two post-mitotic photoreceptor precursor populations rapidly expressing rod or cone markers.

A. UMAP plot colored by high resolution clusters. B. RNA Velocity plots with cell clusters as in A and cell velocity vectors. C. Enlarged view highlighting RPC and MG clusters (left), RPC-localized iPRP and TR clusters (middle), and RNA velocity (right). Black line: limit of the RPC-localized region. Arrows depict inferred trajectories. D. Violin plots depicting expression of selected genes in RPC, MG, and RPC-localized iPRP and TR cells. Colored asterisks compare cluster of same color to cluster at right of line. E. SCENIC regulon violin and box plots for RPC-localized cells in each cluster, selected from most specific regulons for MG, RPC, ER and LM clusters. ^, p <0.1; *, <0.05; **, <0.005; ***, <0.0005 (post-hoc Dunn test).

An iPRP bridge population with properties of a common cone-rod precursor.

A-C. UMAP bridge region plots displaying high resolution ER and iPRP clusters and RNA velocity (A), rod and cone marker gene expression (B), and NRL and THRB regulon activities (C). In B, circle colors indicate different patterns of GNAT1/NR2E3/GNAT2/THRB co-expression. In C, arrows indicate cells with both regulon signals. D. Box plot of Z-score-normalized AUC for each cluster. Bridge region iPRP and ER cells are segregated from other iPRP and ER cells (red labels). Mean of absolute Z-score differences between THRB and NRL regulons displayed at top. E. UMAP plots of iPRP bridge region marker genes. F-H. Combined RXRψ/NRL immunohistochemical staining and CHRNA1 FISH of FW12 retina. F. Tiled images of retina section with nuclei stained with DAPI. White boxes: fields used for quantitative fluorescent imaging. Distances along apical edge of tissue marked in mm from midpoint of central image (0 mm, C). *: Imaged region in G. Scale bar = 500 µm. G. Top: Retinal nuclear and cellular segmentation and identification of cells as RXRψ+ (green), NRL+ (red). Yellow box: Field shown below. Bottom: RXRψ or NRL immunofluorescence with CHRNA1 FISH. Arrows: RXRψ+CHRNA1+ (green), NRL+CHRNA1+ (yellow). Scale bars = 15 µm. H. Quantitation of fluorescent puncta in RXRψ+ and NRL+ cells by image field. X-axis: Distance from the midpoint of each image to retina center (0 mm, C). **, p <0.005 (Wald test, images from 0-6 mm).

L/M cone subcluster marker genes, regulons, and pseudotemporal trajectory with successive lncRNA gene expression.

A. Violin plots of high-resolution cone cluster marker genes with increasing maturation-associated expression. All significant differences between adjacent clusters are indicated. B. Violin plots of regulons with highest LM1-4 cone cluster specificity scores. *, p <0.05; **, <0.005; ***, <0.0005 (post-hoc Dunn test). C. Pseudotemporal trajectory through the L/M cone population derived with Monocle 3. *: Root cell used to define endpoint. D. Trendlines of relative count expression (left) and UMAP plots for lncRNAs correlating with early or late-upregulating modules. Line color matched to label boxes. E, F. Combined RXRψ immunohistochemical staining and FISH of lncRNAs on FW16 retina. E. Tiled images of retina with nuclei stained with DAPI. White boxes: fields used for quantitative fluorescent imaging. Distances along apical edge of tissue marked in mm from fovea to ciliary margins. Scale bar = 500 µm. Asterisks identify fields shown in F. F. Combined RXRψ immunostaining and multiplex FISH for four lncRNAs, of which two are shown in peripheral and central retina regions. Arrows: White: RXRψ/CTC-378H22.2+. Magenta: RXRψ/HOTAIRM1+. Blue: RXRψ/CTD-2034I21.2+. Green: RXRψ/RP13-143G15.4+. Scale bar = 15 µm. G. Quantitation of lncRNA fluorescent puncta assigned to RXRψ+ cells after segmentation. Colored bars mark lncRNA expression regions as described in the text. Arrows indicate significant count differences between images (p <0.05, Wald test).

Cone intrinsic MYCN and SYK contribute to the proliferative response to pRB loss.

A. Volcano plot of differential expression between early rod cluster ER and cone LM, excluding late-maturing OPN1LW+ population. pAdj cutoff =0.05, log2FC cutoff = |0.4|. Labeled genes: pAdj <10e-32, except for SYK and MYCN. B. Overrepresentation analysis of cone-enriched genes pAdj <0.05, log2FC ≥ |0.4|. C. Immunofluorescent staining of MYCN in ARR3+ cones but not NRL+ rods in FW18 retina. D-F. UMAP plots of MYCN expression (D), MYCN regulon activity (E), and SYK expression (F). G. SYK and MYCN gene expression violin plots by cluster. *, p <0.05; ns = not significant (t-test). H. Immunohistochemical staining analysis of SYK expression in FW18 and FW16 retinae with rabbit polyclonal antibody (LUMIF-hCAR) for human cone arrestin (ARR3). Green arrow: ARR3+, SYK+. White arrow: ARR3+, SYK-. Scale bar = 25 µm. H. Effect of SYK inhibitor GS-9876 on KI67+ expression in shRB1- and YFP transduced RXRψ+ cone cells from dissociated FW16.5 fetal retina. Values represent means of three immunocytochemical analyses from two treatment replicates. Error bars: standard deviation. *, p <0.05; **, <0.005 (Student’s T-test with equal variance, 2-tailed). RXRψ+ cells: Experiment 1, n=1,340. Experiment 2, n=804. Range 107 – 366 cells per condition. I. Model of SYK expression and function in cone maturation and retinoblastoma development.

Resource tables

scRNA-seq summary and cell type identification.

A. Retinae and cell numbers examined at each age. B. Histogram of total read counts for each cell, colored by sequencing run. Seq 1 cells were isolated by C1, Seq 2 cells by FACS or C1 as indicated, and all others by FACS. Dotted line: 100,000 read cutoff for cell exclusion. C. Box plots of read counts, genes detected, and transcripts detected per cell ordered by fetal age and specimen number. D-F. UMAP plots colored by sequencing run (C), isolation method (D), or retina ID (E). G-J. Expression of cell type marker genes for RPCs and Müller glia (G), photoreceptors (H), rods (I), and cones (J). Insets: Gene expression violin plots (from left to right): RPC/MG (red), iPRP (brown), LM cone (green), S cone (teal), early rod (blue), late rod (pink). UMAP and violin plots for any gene or Ensembl transcript can be produced at https://docker.saban.chla.usc.edu/cobrinik/app/seuratApp/. K, M, O. Volcano plots of differential expression (pAdj <0.05, log2FC >|0.5|) between ER and LR (K), LM and S (M), and spatially separated late maturing L/M cones and remaining LM cluster (O). Labels indicate genes with highest significance and fold change. L, N. Overrepresentation of gene ontologies for genes upregulated in LR over ER (GO-Molecular Function instead of Hallmark50) (L) or upregulated in LM over S (N). P. UMAP plots of upregulated genes in the late maturing L/M cone group.

Further characterization of NRL isoforms.

A. Top: Sashimi plots of NRL transcript splicing in long read sequencing of 5’ RACE reactions from 23 ER and 21 LM cells. Bottom: Exon structure of FL and Tr NRL isoforms and RACE primer location. The NRL gene is oriented relative to chromosome 14 coordinates. B. NRL antibody validation by immunoblot (IB) of ectopic FL-NRL and Tr-NRL expression in HEK293T. Left: anti-total NRL. Right: anti-N-terminal NRL specific to FL-NRL isoforms. C. Average mapped NRL isoform expression in scRNA-seq of RB31 retinoblastoma cell line. D. Immunoblot of endogenous total NRL in RB31 cells treated with or without 10 μM retinoic acid (RA) and 10 μM proteasome inhibitor MG132 and in FL-NRL transfected HEK293T. E. pULTRA-EGFP-P2A-Tr-NRL lentiviral vector used for explanted retina transduction. F. NRL Immunofluorescent staining of HEK293T cells 48 h after lentiviral transduction of Tr-NRL-GFP. Scale bar = 50 μm. G, H. Immunostaining of total NRL (G) or N-terminal-NRL (H) in explanted FW16.5 fetal retina 7 d after lentiviral transduction of Tr-NRL-GFP. Arrows: Yellow = GFP+, NRL-. Green = GFP+,RXRψ+. White = GFP-,FL-NRL+. Scale bar = 25 μm.

Further characterization of THRB isoforms.

Top: Sashimi plots of THRB transcript splicing in long read sequencing of 3’ RACE reactions from 23LM cells and 5 LR cells RACE reactions were initiated from primers specific to TRβ1 exon 4 (left) or specific to the unique TRβ2 exon (right). Gray arrows: premature transcription termination following TRβ1 exon 4, the TRβ2 first exon, the shared exon 6, and a novel exon used solely in TRβ2 transcripts. Bottom: Exon structure of TRβ1 and TRβ2 isoforms and RACE primer locations. The THRB gene is oriented relative to chromosome 3 coordinates.

RXRG isoform expression in developing human photoreceptors.

A. Expression of RXRG and the two most highly assigned RXRG isoforms (ENST00000359842 and ENST00000619224). B. Average RXRG isoform assignments for low resolution clusters presented as total transcript counts (top) and as a percentage of the total counts (bottom). Lines above figure indicate significant LM vs. ER fold change, colored by isoform. *****, p <0.000001 (bootstrapped Welch’s t-test). C. Top: Mean read counts across Ensembl annotated RXRG exons for each cluster. Bottom: Transcript structures for ENST00000359842 and ENST00000619224. Red arrowheads: Extended (Ext) RXRG 5’ UTR in ENST00000619224. D. Long-read nanopore sequencing of pooled 5’ RACE initiated with RXRG exon 4 primers and performed on cDNA libraries from 23 ER cells (top) and 21 LM cells (bottom). Each schematic shows total exon coverage (above) and individual transcript structures (below), where expressed sequences are grey and introns light blue. Predicted exons for ENST00000359842 and ENST00000619224 isoforms shown below. Blue arrows: RXRG exon 4 RACE primer. Panels at left show the RXRG locus with exon 1 in green box enlarged at right, revealing a similar range of RNA 5’ ends in ER and LM populations and no evidence of the unique ENST00000619224 exon. E. Immunohistochemical analysis of RXRψ and NRL in FW16 retina. Arrows: Yellow = RXRψ-high cone. White = NRL+ rod with weak RXRψ. Scale bar = 25 μm. Right: Boxplot of RXRψ mean grey values in NRL+,RXRψlo and NRL,RXRψhi cells. ***, p<0.0001 (Welch’s T-test).

High resolution cluster marker genes and differential RPC versus MG gene expression.

A. Marker gene dot plot for high resolution Louvain clusters. Dot size: Percent of cells expressing gene. B. Volcano plot of differential expression (pAdj <0.05, log2FC >|0.5|) between MG and RPC clusters with genes labeled when p <10-4 except for RPL34. C. Overrepresentation analysis for genes upregulated in RPC above pAdj and log2FC cutoff. D. Ward-clustered heatmap of the highest scoring SCENIC regulons in each high-resolution cluster, displaying Z-score normalized regulon activities. Subcluster labels: RPC-l = RPC-localized, Br = Bridge-localized, O = Other/remainder of original cluster, Late = late maturing LM4 cones.

NR2E3 and GNAT2 co-expression in early-born photoreceptors.

A-C. Combined immunofluorescence staining of RXRψ and NR2E3 and RNA FISH of GNAT2 and NR2E3 in FW14 retina. A. Tiled composite image of retinal section. Boxed regions are further evaluated in C-F. (B) Diagram of the retina shown in A, indicating regions with the most peripheral (i.e., youngest and least mature) and most central (i.e., oldest and most mature) rods and cones in this section. C. Composite images of regions 2-13. Boxes: regions shown at higher magnification in D. D. Enlarged images of boxed regions within images 2, 3, 5, and 8. Dotted yellow line demarcates the outer and inner NBL cells analyzed in E, F. White arrows = GNAT2 puncta. Green arrows = NR2E3 puncta. Boxed regions in column 3 (GNAT2 RNA plus NR2E3 protein) and column 4 (NR2E3 RNA) are enlarged in column 6 (GNAT2 RNA plus NR2E3 RNA) and illustrate initial expression of GNAT2 RNA without NR2E3 RNA in outer NBL nascent cones in region 2, initial co-expression of GNAT2 and NR2E3 RNA in early outer NBL cones in region 3, initial expression of NR2E3 RNA with or without co-expressed NR2E3 protein in region 5, and extensive co-expression of GNAT2 and NR2E3 RNA without NR2E3 protein in RXRψhi outer NBL cones in region 8. E,F. Quantitation of (E) outer and (F) inner layer photoreceptor precursors expressing combinations of RXRψ, NR2E3, GNAT2, and NR2E3.

Additional iPRP bridge region marker genes and regulon activities.

A. UMAP plots of iPRP marker S100A6 and photoreceptor transition state marker DLL3 gene expression. B. UMAP plots OLIG2, LHX9, and THRB regulon activities (left) and OLIG2, LHX9, and ONECUT1 gene expression (right). Boxes indicate bridge region.

Heatmaps of coregulated gene modules 1-7 across Monocle-derived pseudotime.

A. Heatmaps of individual genes in each module across the pseudotime trajectory shown in Figure 6C. lncRNA genes of interest are labeled. B. Heatmap of averaged gene module expression across pseudotime.