entpd5a, but not osterix, is a marker for a non-classical osteoblast population in zebrafish.

(A) Schematic of the embryonic notochord structure (adapted from Grotmol et al., 2005). Alizarin red staining of mineralised bone reveals non-classical osteoblasts along the trunk that are negative for osterix (B, Bi) and positive for entpd5a (D), as well as classical osteoblasts of the head and mature vertebrae (arrowheads) that are positive for osterix (C, Ci). (E) entpd5a and col9a2 label bone-producing and intersegmental NSCs, respectively. cl, cleithrum; CNS, central nervous system; da, dorsal arch; no, notochord; NSCs, notochord sheath cells; op, operculum; VCs, vacuolated cells. All lateral views, anterior to the left. (B, Bi, D, E) scale bars: 100μm. (C, Ci) scale bars: 200μm.

Revised list of tissues with detected entpd5a in the developing zebrafish embryo.

Validation of potential enhancer regions in the entpd5a promoter indicates distinct regulation of classical and non-classical osteoblasts.

(A) Chromatin accessibility profiles of representative samples of each cell type. Highlighted are six differentially accessible regions between entpd5a+ and entpd5a- cells. Chromatin regions 4-6 lie proximal to the entpd5a ORF, regions 2 and 3 are positioned within introns 1 and 3, respectively, whereas region 1 is found in the 3' end of the coq6 gene. (B) View of the UCSC browser showing a total of 10 regions of open chromatin within the 40kb surrounding the ORF of entpd5a, including regions 1-6 shown in (A) and regions 7-10 located further upstream. ORFs of entpd5a and downstream coq6 are shown in purple. The TSS of entpd5a is marked as a green bar in the 5' UTR. The ZF_carp_phastcons track indicates conservation amongst zebrafish, goldfish and 2 species of carp. Peaks identified by macs2 as significant compared to background are shown as boxes for each sample (red tracks for cartilage and intersegmental cells and green tracks for osteoblasts and segments). Boxes darker in colour indicate peaks with higher score values. Peaks of interest 1-10 are highlighted, and asterisks indicate whether a peak was tested for enhancer function by inserting the sequence in a construct driving GFP and/or by deletion in the BAC construct. (C) Embryo from a stable transgenic line with region 4 placed upstream of GFP, imaged at 6 dpf. GFP is expressed in osteoblasts (marked using entpd5a:pkRed), but not in the notochord. Insets demonstrate GFP and pkRed overlap in the operculum (op), but not in pkRed+ segmented NSCs. (D, F) Lateral and (E) ventral head view of a stable line where GFP is driven both by region 4 and by regions 2 and 3. Cranial osteoblasts express GFP (D, E), and segmentation is clearly observed in the notochord (D, F). (G) Quantification of fluorescence in segments 1-6 versus intersegmental regions as shown in (F). p-value = 0.003. cl, cleithrum; no, notochord; op, operculum. (C, D, F) lateral views; (E) ventral view. Anterior to the left in all images. (C-F) scale bars: 100μm. Operculum (C) inset scale bar: 20μm.

Deletion of open chromatin further upstream or downstream of the entpd5a(2.2-introns) domain does not affect the entpd5a expression pattern.

(A) Chromatin accessibility profiles along the length of the entpd5a BAC, starting from the ATG of entpd5a. Representative tracks for each cell population are shown. Peaks of interest are labelled 4-10, following the labelling in Fig. 2. The region highlighted in red indicates the 21.4kb deleted in the BAC construct entpd5a(Δ21):pkRed. (B-E) Embryo stably expressing the entpd5a(Δ21):pkRed construct shows a normal segmentation pattern at 6 dpf (B, C, asterisks), as well as normal osteoblast and chondrocyte (D, E, insets) expression of pkRed. (F-I) Two representative entpd5a:pkRed+ embryos injected with entpd5a(2.2-introns):GFP show mosaic GFP expression along NSCs associated with pkRed+ segments (F, G). In contrast, representative embryos injected with entpd5a(2.2-coq6):GFP only show background GFP (H, I). (J) Schematic of the entpd5a(2.2-coq6):GFP construct, integrating peak number 1 downstream of the GFP. (K) Proportion of embryos injected with GFP constructs with observable GFP expression in NSCs. c, chondrocyte; cl, cleithrum; op, operculum; VC, vacuolated cell. (B, C, F-I) lateral views; (D, E) ventral views. Heads positioned towards the left. (B-I) scale bars: 100μm. (D, E) insets’ scale bar: 20μm.

Sequential deletions of the entpd5a(2.2):GFP construct highlight a 422bp region containing active entpd5a enhancers.

(A) Schematic of the initial construct (2.2kb cloned upstream of GFP), and the subsequent deleted constructs. In the final construct, the purple star depicts the −31bp deletion and the light blue star the −37bp deletion. The osteoblast ATAC peak position is indicated. The 422bp of interest is highlighted in red. (B-I) Representative entpd5a:pkRed embryos at 3 dpf injected with the respective deletion constructs, or as a control with the complete construct (B). The cleithra (cl) for each embryo are depicted without transmitted light. (J) Graph indicating, across all experimental repeats, the percentage of successfully injected embryos (n) for each injected construct in which GFP+ cells were observed in the cleithrum. Successfully injected embryos were identified based on the presence of GFP+ cells in background tissues. Lateral views, anterior to the left. (B-I) scale bar: 100μm.

RNA-seq analysis of skeletogenic populations of the head and trunk.

(A, B) Heatmaps for each tissue, indicating the top differentially regulated genes. The colours on the main heatmap indicate the Z-score value, while the second heatmap indicates the corresponding log fold change, and the third the average expression of each gene. (C) Venn diagram indicating the overlap of total DEGs in all 4 cell populations. Osteogenic and cartilage tissues share 872 and 765 upregulated genes, respectively, while the non-contrasted osteoblasts vs. intersegmental and cartilage vs. segments share 43 and 7 genes, respectively. (D) PCA plot showing the clustering of cell populations by tissue (head versus trunk) and by entpd5a expression (positive and negative cell populations in blue and orange, respectively).
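As an illustration of the Z-score scaling mentioned for the heatmaps, the following is a minimal sketch, assuming that each gene (row) is scaled across samples (columns) using the sample standard deviation. The function name `row_zscores` and the toy values are hypothetical and are not taken from the analysis pipeline used for the figure.

```python
from statistics import mean, stdev

def row_zscores(matrix):
    """Scale each gene (row) across samples (columns) to mean 0 and SD 1.

    Assumption: sample standard deviation (n - 1 denominator) is used;
    rows with no variation are returned as zeros.
    """
    scaled = []
    for row in matrix:
        mu = mean(row)
        sd = stdev(row) if len(row) > 1 else 0.0
        if sd == 0:
            scaled.append([0.0 for _ in row])
        else:
            scaled.append([(x - mu) / sd for x in row])
    return scaled

# Toy normalised expression values (hypothetical): 2 genes x 3 samples.
z = row_zscores([[1.0, 2.0, 3.0], [5.0, 5.0, 5.0]])
print(z)
```

Scaling per gene makes genes with very different absolute expression comparable on one colour scale, which is why the log fold change and average expression are shown in separate columns.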

Integration of ATAC-seq and RNA-seq data highlights differentially expressed genes with accessible promoter regions in distinct cell populations.

(A, B) Bar graphs indicate the proportion of ATAC peaks found within 5kb, 10kb and 20kb of the TSS in each cell population. In (A) peaks were searched against all Zv11 genes and in (B) only against differentially expressed genes (DEGs). (C-E) Venn diagrams indicating, for each cell population, the overlap between Zv11 genes associated with ATAC peaks within 20kb of the TSS and differentially regulated genes.
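The proportions plotted in (A, B) can be illustrated with a minimal sketch, assuming that each peak is represented by its midpoint, that distance is measured to the nearest TSS on the same chromosome, and that the 5kb, 10kb and 20kb windows are cumulative. These choices, the function names and the toy coordinates are assumptions for illustration and are not taken from the original analysis.

```python
from bisect import bisect_left

def nearest_tss_distance(position, sorted_tss):
    """Distance from a position to the closest TSS in a sorted list."""
    i = bisect_left(sorted_tss, position)
    candidates = []
    if i < len(sorted_tss):
        candidates.append(sorted_tss[i] - position)
    if i > 0:
        candidates.append(position - sorted_tss[i - 1])
    return min(candidates)

def proportion_near_tss(peaks, tss_by_chrom, windows=(5000, 10000, 20000)):
    """Fraction of peaks whose midpoint lies within each window of a TSS.

    peaks: list of (chrom, start, end); tss_by_chrom: chrom -> TSS positions.
    Peaks on chromosomes without any TSS count towards the total only.
    """
    sorted_tss = {c: sorted(p) for c, p in tss_by_chrom.items()}
    counts = {w: 0 for w in windows}
    for chrom, start, end in peaks:
        tss = sorted_tss.get(chrom)
        if not tss:
            continue
        distance = nearest_tss_distance((start + end) // 2, tss)
        for w in windows:
            if distance <= w:
                counts[w] += 1
    total = len(peaks)
    return {w: (counts[w] / total if total else 0.0) for w in windows}

# Toy coordinates (hypothetical).
peaks = [("chr1", 900, 1100), ("chr1", 8000, 8200),
         ("chr1", 50000, 50200), ("chr2", 100, 300)]
result = proportion_near_tss(peaks, {"chr1": [1000, 20000]})
print(result)
```

Restricting `tss_by_chrom` to the TSSs of differentially expressed genes would give the equivalent of panel (B) under the same assumptions.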

Candidate TFs regulating bone and cartilage development.

Distinct predicted binding sites could facilitate transcription factor binding to the entpd5a promoter, aiding cell type-specific regulation of the gene’s expression in classical and non-classical osteoblasts.

Our analysis suggests that, with the exception of Runx2, distinct transcriptional regulators are functioning in (A) classical versus (B) non-classical osteoblasts. In silico analyses predict the presence and distribution of binding sites (squares, coloured according to transcription factor specificity) of the corresponding regulators in the 3 non-coding regions of entpd5a which are shown to have regulatory function. The width of each square represents the length of sequence of the respective binding site. The red box indicates the location of the 37bp deleted using entpd5a(r37):GFP.

Volume of TDE1 Tagment DNA Enzyme, total reaction volume and incubation time of transposition reaction according to cell numbers.

The complex expression pattern of entpd5a is dynamically regulated during zebrafish development.

(A) Schematic of the BAC driving expression of Gal4FF in the entpd5a:Gal4FF transgenic line. The BAC contains the entire open reading frame of entpd5a, and part of the open reading frames of adjacent genes, coq6 and aldh6a1. (B, C, D) GFP expression is detected in osteoblasts (arrow) and (partially) in cartilage (asterisk; D, inset) making up the head skeleton. (C) Strong GFP expression is seen in the notochord and the cleithrum, but also in a subset of CNS neurons. (E-J) Using the entpd5a:Kaede photoconversion line we first detect entpd5a expression at the 15-somite stage (E). Following the same embryo, active expression of the gene continues until shortly before 24 hpf (F-H). From 24 hpf until notochord segmentation takes place, entpd5a is only actively expressed in the ventral-most NSCs of the notochord, while expression in the remaining NSCs and vacuolated cells is turned off (I, J). (I, J, insets) Cross section of the notochord at 48 hpf and at 72 hpf in the position indicated by the white vertical line. (B) ventral view; (C-J) lateral views, anterior towards the left. (B-J) scale bars: 100μm. (D, I, J) insets scale bars: 50μm.

Quality assessment of ATAC-seq of skeletogenic cells.

(A, B) Trunk and head, respectively, of transgenic fish used for collection of FAC-sorted cells. Dashed lines indicate the site where the cut was made to separate head from trunk tissue. GFP+ cells indicate mineralising cells (A, NSCs; B, cranial osteoblasts) while mCherry+ cells indicate (A) intersegmental NSCs and (B) head cartilage. (C) PCA plot, with the red circle indicating mCherry+ cell samples and the green circle GFP+ cell samples. (D) Heatmap indicating correlation amongst individual replicates. (E, F) Volcano plots showing, for (E) head and (F) trunk, the chromatin regions identified as significantly differentially accessible. Positive log fold change indicates regions open in mCherry+ cells, negative log fold change indicates regions open in GFP+ cells. (A, B) lateral views, anterior to the left. Scale bars: 100μm.

Gating strategy for FAC-sorting cells for ATAC-seq and RNA-seq.

Cells from (A) heads and (C) trunks of GFP-; mCherry- embryos were used as gating controls. Cells from (B) heads and (D) trunks of GFP+; mCherry+ siblings were then sorted.

No additional enhancers are present in the sequence within peak number 5.

(A) Chromatin accessibility profiles of representative samples of each cell type. Boxes mark the peak region covered by the entpd5a(2.2):GFP construct and the one covered by the entpd5a(5.7):GFP construct. (B, C) GFP driven in a stable line by the 5.7kb upstream of the start codon of entpd5a is expressed only in cranial osteoblasts (B), and not in the notochord (C) or in cartilage of the head skeleton (B, inset). (B) ventral view; (C) lateral view. Anterior towards the left. (B, C) scale bars: 100μm. (B inset) scale bar: 20μm.

GFP expression in the notochord under the control of the introns is first detected in notochord progenitor cells.

(A, B) GFP and pkRed expression (under the control of the complete BAC) appear to completely overlap in notochord progenitor cells at 24 hpf. Scale bar: 100μm.

Chondrocyte markers maintain an open chromatin configuration in osteoblasts and are actively expressed in entpd5a+ cells at levels lower than those found in entpd5a- cells.

(A-D) Views of the col9a2 (A), sox9a (B), acana (C) and ccn6 (D) promoters in representative ATAC samples for each of our cell populations, with the scale bars indicating the maximum normalised ATAC signal (read counts) in each sample. (E) Normalised col9a2, sox9a, acana and ccn6 transcript counts in the three entpd5a+ (green spots) and the three entpd5a- (red circles) samples sequenced in the head and in the trunk. (F) Corresponding log2 fold change values for each gene when entpd5a+ vs. entpd5a- cell populations are compared within each tissue.

Quality control measurements for ATAC-seq samples.

Fragment Length Distribution Distance (FLDD) is defined as the distance of the experiment’s fragment length distribution (FLD) to a reference distribution. Negative and positive distances are associated with under- and over-transposed samples, respectively. TSS enrichment score is calculated based on the normalised fragment coverage ±2kb around the TSS.
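The legend does not state how the coverage is normalised, so the following is only a sketch of one common convention for a TSS enrichment score. It assumes that coverage has already been aggregated over all TSSs into a single per-base profile spanning ±2kb, that the background is the mean coverage of the outermost 100bp on each side, and that the score is the maximum of the background-normalised profile. The function name, the `flank` parameter and the toy profile are hypothetical.

```python
def tss_enrichment(aggregate_coverage, flank=100):
    """Return the peak of a TSS coverage profile relative to its flanks.

    aggregate_coverage: per-base coverage summed over all TSSs, centred on
    the TSS (e.g. 4001 values for a +/-2kb window).
    Assumption: background = mean of the first and last `flank` positions.
    """
    profile = list(aggregate_coverage)
    if len(profile) < 2 * flank + 1:
        raise ValueError("profile is too short for the chosen flank size")
    flanks = profile[:flank] + profile[-flank:]
    background = sum(flanks) / len(flanks)
    if background == 0:
        raise ValueError("flank coverage is zero; score is undefined")
    return max(value / background for value in profile)

# Toy profile (hypothetical): flat background of 2 with a spike at the TSS.
profile = [2.0] * 4001
profile[2000] = 20.0
score = tss_enrichment(profile)
print(score)
```

Under this convention a score near 1 means no enrichment at the TSS over the flanking background, and higher values indicate a better signal-to-noise ratio.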

Primers used for cloning.