A cell atlas of the developing human outflow tract of the heart and its adult aortic valve derivatives

  1. Rotem Leshem
  2. Syed Murtuza-Baker
  3. Joshua Mallen
  4. Lu Wang
  5. John Dark
  6. Andrew D Sharrocks
  7. Karen Piper Hanley
  8. Neil Hanley
  9. Magnus Rattray
  10. Simon D Bamforth (corresponding author)
  11. Nicoletta Bobola (corresponding author)
  1. Faculty of Biology, Medicine and Health, University of Manchester, United Kingdom
  2. Translational and Clinical Research Institute, Newcastle University, United Kingdom
  3. College of Medicine & Health, University of Birmingham, United Kingdom
  4. Newcastle University Biosciences Institute, Faculty of Medical Sciences, United Kingdom

eLife Assessment

This is a valuable study that presents human single nuclei RNA-seq and spatial transcriptomics data of the developing outflow tract and adult aortic valves that will facilitate research in this area. Data presented are solid, with bioinformatics analyses showing cell lineage and trajectory relationships, intriguingly suggesting persistence of embryonic signature in adult aortic valve cells. The latter results would be strengthened by experimental validation.

https://doi.org/10.7554/eLife.107748.3.sa0

Abstract

The outflow tract (OFT) of the heart carries blood away from the heart into the great arteries. During embryogenesis, the OFT divides to form the aorta and pulmonary trunk, creating the double circulation present in mammals. Defects in this area account for one-third of all congenital heart defect cases. Here, we present comprehensive transcriptomic data on the developing OFT at two distinct time points (embryonic and fetal) and its adult derivatives, the aortic valves, and use spatial transcriptomics to define the distribution of cell populations. We uncover that distinctive embryonic signatures persist in adult cells and can be used as labels to retrospectively attribute relationships between cells separated by a large timescale. Single-cell regulatory network inference identifies GATA6, a transcription factor linked to common arterial trunk and bicuspid aortic valve, as a key regulator of valve precursor cells. Its downstream network reveals candidate drivers of human cardiac defects and illuminates the molecular mechanisms of both normal and pathological valve development. Our findings define the cellular and molecular signatures of the human OFT and its distinct cell lineages, which is critical for understanding congenital heart defects and developing cardiac tissue for regenerative medicine.

Introduction

Congenital heart defects (CHD) are major birth abnormalities affecting ~1% of newborn babies (Hoffman and Kaplan, 2002). The outflow tract (OFT) carries blood away from the heart into the great arteries. During embryogenesis, the OFT divides to provide separate aorta and pulmonary trunk vessels, which arise from the left and right ventricles, respectively, giving the double circulation found in mammals (Stefanovic et al., 2021). Defects specifically affecting the OFT of the heart represent a third of all CHD cases (Thom et al., 2006). In humans, the formation and remodeling of the OFT is a relatively rapid process, occurring in embryogenesis over an ~4-week period from Carnegie Stage (CS)13 (4 weeks) to CS23 (8 weeks). Septation of the OFT begins around the CS14 stage when a transient aortopulmonary septal complex protrudes from the dorsal wall of the aortic sac (Anderson et al., 2012b). This divides the common OFT vessel into separate aorta and pulmonary trunks. The arterial (semilunar) valves are formed in the intermediate component of the OFT, initially as endocardial cushions (Anderson et al., 2012a) and continue to mature after septation of the arterial vessels. The morphological changes underlying OFT formation are orchestrated by two main cell lineages, the second heart field and the cardiac neural crest. The second heart field is a population of cardiac progenitor cells, originating from the pharyngeal mesoderm, that contribute to the formation of the myocardium of the right ventricle, atria, and the OFT of the heart (Kelly, 2012). The neural crest is a transient, pluripotent cell population, which migrates from the neural tube to multiple areas of the body. The cardiac neural crest, a subpopulation of the neural crest, forms the aorticopulmonary septal complex, which separates the aorta and pulmonary trunk (Waldo et al., 1998; Jiang et al., 2000; Kirby and Hutson, 2010).

The use of model systems has substantially advanced our understanding of the cell lineages that contribute to the OFT. However, much less is known about OFT development in humans. In addition, relative to the embryonic period, adulthood is underexplored at the molecular level, and a clear relationship between cell lineages and cells found in the adult OFT is lacking. Previous studies have conducted single-cell analysis of both developing and adult (whole or microdissected) hearts (Wei et al., 2024; Xu et al., 2023; Asp et al., 2019; Sahara et al., 2019; Cui et al., 2019; Knight-Schrijver et al., 2022; Cao et al., 2020; Suryawanshi et al., 2020; Lahm et al., 2021; Litviňuková et al., 2020; Koenig et al., 2022; Tucker et al., 2020; Queen et al., 2023; Farah et al., 2024; De Bono et al., 2025). Here, we present comprehensive transcriptomic data of the developing OFT (two distinct time points, embryonic and fetal) and its adult derivatives, the aortic valves, providing a large reference framework of OFT cell repertoires and their gene expression profiles. Using spatial transcriptomics, we describe the distribution of cell populations and cell–cell co-localizations. Remarkably, we identify the persistence of distinctive embryonic signatures in cells separated by a large timescale and use these signatures to establish lineage relationships between embryonic and adult cells. Our study defines the cellular and molecular signatures of the developing OFT and adult valves and highlights the distinct cell lineages that construct these structures. This is of major importance for understanding the origin of congenital heart malformations and for producing cardiac tissue for use in regenerative medicine.

Results

Cellular landscape of the human OFT and its adult derivatives

To identify the cell types present in the human OFT, we conducted single nuclei (sn) RNA-seq on human OFT tissues. We isolated nuclei from four embryonic OFTs (CS 16–17), which were pooled into two independent pools (each from two embryos), two fetal (post conception week 12 (12 pcw)) OFTs and from three adult aortic valves (Figure 1A; Supplementary file 1). The nuclei exhibit strong consistency between biological replicate samples (Figure 1B). After quality control and filtering, a total of 30,166 nuclei were segregated into 18 different clusters by unsupervised clustering (Figure 1C). These clusters were visualized by uniform manifold approximation and projection (UMAP). Each color on the chart represents a distinct cell population, arranged in order from the largest (5127 cells in cluster 0) to the smallest (355 cells in cluster 17). We decided against using batch correction (Korsunsky et al., 2019) to preserve the biological variability in our datasets, which reflects real changes across developmental time points (Figure 1—figure supplement 1A, B). We observed the highest number of clusters (n = 12) in the fetal samples (Figure 1D), consistent with new cell types arising from embryonic to fetal stage. The low number of clusters in the adult samples reflects sampling of the aortic valves, one of the derivatives of the entire OFT. Subsequently, we performed differential gene expression analysis to aid in the classification of each cell cluster (Supplementary file 2). Leveraging recognized markers, we were able to assign the clusters in developmental samples to three main compartments: cardiac, endothelial, and mesenchymal cells (Figure 1E, F). We also detected minor compartments of neuronal and immune cell types. In addition to mesenchymal (valve interstitial) and endothelial cell types, we allocated adult clusters to immune cells. We confirmed cell types in the main compartments using additional verified markers. 
The cardiac and endothelial nuclei populations appear to already express their appropriate cell type markers in the embryonic samples (Figure 1—figure supplement 1C). In contrast, most of the mesenchymal nuclei do not express markers of differentiated cell types (e.g. DCN or MYH11) at the earlier (embryonic) stage, but express a combination of PDGFRA (Farahani and Xaymardan, 2015) and PDGFRB (Wang et al., 2018), implying the embryonic stage captured mesenchymal nuclei before differentiation (Figure 1—figure supplement 1C). In addition, while embryonic and fetal cells display considerable variations in gene expression, adult cells share distinguishing features that set them apart from developing cells. This includes the expression of NEAT1 (Ge et al., 2022), a long non-coding RNA (lncRNA), which does not distinguish cell types but discriminates adult from developmental tissues (Figure 1—figure supplement 1D–G).

Figure 1 with 1 supplement
The cellular landscape of the developing outflow tract (OFT) and its adult derivatives.

(A) Experimental schematics. Nuclei isolated from two embryonic (CS 16–17) and two fetal (12 pcw) OFTs and from three adult aortic valves (AV) were analyzed by snRNA-seq. Four cryo-sections including the OFT region of a 12 pcw heart were used in spatial transcriptomics (Visium). Section is shown within the Visium capture area (~6 mm × 6 mm; spot diameter 55 µm, centre-to-centre spacing 100 µm), which defines spatial scale. (B) Sample correlation visualized by unsupervised clustering and projected on a two-dimensional uniform manifold approximation and projection (UMAP). Nuclei are colored by sample, with embryonic (blue, orange), fetal (green, red), and adult (pink, purple, and brown). (C) Cell clusters visualized by the same UMAP as in B. Nuclei are colored by cluster. (D) Cluster composition in each sample, presented as percentage of nuclei. (E) Dotplot shows the mean expression levels of top differential genes across clusters and identifies five main cell types: cardiac, endothelial, mesenchymal (including valve interstitial), neural, and immune. (F) Cell types in E visualized by UMAP. Nuclei are colored by cell type. See also Figure 1—figure supplement 1.

Identification of a GATA6-driven mesenchymal program underlying semilunar valve development

Given their critical role in OFT development, we focused our analysis on mesenchymal cells. To explore the heterogeneity within this lineage and uncover potential subtypes, we performed nuclei subclustering of embryonic and fetal datasets, followed by t-SNE visualization (Figure 2—figure supplement 1A, B). Using this approach, we obtained seven distinct mesenchymal clusters, two embryonic and five fetal (Figure 2A, B). Fetal nuclei predominantly express DCN, a fibroblast marker, except for cluster 9, which contains MYH11-positive smooth muscle cells (Figure 2C, see also Figure 1—figure supplement 1C). In contrast, embryonic mesenchymal nuclei lacked definitive lineage markers, suggesting these cells are undifferentiated or at an early progenitor stage.

Figure 2 with 1 supplement
Characterization of embryonic mesenchymal nuclei.

Mesenchymal cell clusters (A) and sample projection (B) of fetal and embryonic samples following subclustering, visualized on a two-dimensional t-SNE. Nuclei are colored by cluster (A) and sample (B). (C) Embryonic clusters (blue contour) do not express fibroblast (DCN) or smooth muscle (MYH11) markers, which are present in fetal nuclei (green contour). Nuclei are colored according to their scaled expression. (D) Top 10 regulons in fetal clusters based on Regulon Specificity Score (RSS). (E) Gene ontologies associated with the GATA6 regulon highlight terms related to arterial and pulmonary valve morphogenesis. Functional annotation clustering of the top 400 genes of the GATA6 regulon was performed using DAVID, and −log10(p-value) was plotted in Excel. (F) Hematoxylin and eosin (H&E)-stained section showing the aortic and pulmonary arteries with their respective valves (left) and a corresponding map of the spatial expression of the GATA6 regulon (right). (G) GREAT analysis of GATA6 high-confidence peaks (FE >10) in posterior pharyngeal arches and outflow tract (OFT) at embryonic day (E) 11.5 (mouse). GATA6 peaks predominantly cluster around genes associated with cardiovascular terms, and specifically with OFT and artery development, as well as semilunar valve development (red arrows). (H) Selected regulon genes associated with GATA6 binding in mouse embryo OFT and pharyngeal arches (see also Supplementary file 3) and associated with OFT-related abnormalities. Yellow genes are associated with human disease; light green genes cause mouse phenotypes; dark green genes are associated with both human and mouse defects. (I) UCSC tracks of H3K27Ac ChIP-seq, GATA6 ChIP-seq (boxed in red) in posterior pharyngeal arches and OFT at E11.5 (mouse) and mammal sequence conservation at MECOM (top), LTBP1 (middle) and NOTCH2 (bottom) loci. (J) Spatial transcriptomics of the aorta (Ao) and pulmonary artery (PA) (clockwise): H&E staining of the tissue area, with asterisks marking the semilunar valves; spatial distribution of LTBP1, NOTCH2, and MECOM. (K) Trajectory inference of future state of embryonic nuclei (CS16–17) showing mesenchymal (4, 20), endothelial-like (7), and cardiac (2, 17) clusters. Embryonic clusters derive from subclustering of aggregated fetal and embryonic nuclei shown in Figure 2—figure supplement 1A. (L) Expression signatures in embryonic mesenchymal (4, 20) and endothelial-like (7) clusters. Both clusters 7 and 4 express high levels of cardiac TFs (GATA4 and TBX20) and HAPLN1, a marker of semilunar valves. In contrast, cluster 20 nuclei exhibit higher expression of the neural crest markers HOXA3/B3 and PLXNA2.

Because the absence of differentiation markers makes it challenging to define the identity of embryonic mesenchymal populations, we applied SCENIC (Aibar et al., 2017) to the embryonic single-nucleus transcriptomes to infer key cell fate regulators and gain insight into their developmental potential. SCENIC links transcription factors (TFs) with their target genes based on co-expression, and it identified GATA6 as a top regulator of embryonic mesenchymal cluster 4 (Figure 2D). GATA6 is implicated in common arterial trunk (CAT) and bicuspid aortic valve (BAV) in humans (Gharibeh et al., 2018; Kodo et al., 2009). Pathway enrichment analysis of the 400 genes in the GATA6 regulon highlighted GO terms related to arterial and pulmonary valve morphogenesis (Figure 2E), a process known to be regulated by GATA6, supporting the idea that this regulon contains functional downstream targets involved in valve formation. Consistent with this, spatial transcriptomic analysis of a later stage (12 pcw) OFT shows that the GATA6 regulon is mainly restricted to the aortic and pulmonary valves (Figure 2F). To prune TF–target interactions and identify GATA6 high-confidence direct targets, we used GATA6 genomic occupancy in the mouse OFT and posterior pharyngeal arches (Losa et al., 2017). Notably, GATA6 peaks are linked to GO terms highly specific to OFT and valve development (McLean et al., 2010), emphasizing that this dataset captures GATA6 binding to targets essential for the formation and maturation of these structures (Figure 2G). We inferred direct regulation by assigning peaks to genes (McLean et al., 2010). The GATA6 regulon is significantly enriched for genes occupied by GATA6 (p = 1.2 × 10⁻³³). This supports the interpretation that many genes within the GATA6 regulon are associated with GATA6 binding and are likely to be direct GATA6 targets in the OFT.
We systematically analyzed direct targets for their implication in mouse phenotypes, human defects, and genome-wide association studies (GWAS) using the MGI Database (Motenko et al., 2015) and the EMBL GWAS database (Cerezo et al., 2025). The results are summarized in the network shown in Figure 2H. The GATA6 regulon includes genes identified in GWAS studies on aortic valve calcification (RNF144A, ZEB2, MECOM, LPL, PDE3A, and TWIST1) (Thériault et al., 2024) as well as genes whose inactivation in mice leads to phenotypes that overlap with GATA6 loss, including aortic valve and OFT defects (Supplementary file 3).

We detect strong GATA6 peaks overlapping with H3K27Ac, a histone mark associated with active enhancers, within the Mecom locus (Figure 2I). In mice, mutations in Mecom, which encodes a histone-lysine N-methyltransferase, result in CAT, interrupted aortic arch, and ventricular septal defects (Bard-Chapeau et al., 2014). In humans, GWAS have linked MECOM to calcified aortic stenosis (Thériault et al., 2024; Small et al., 2023). MECOM transcripts are sparsely expressed, and primarily localized within the aorta and pulmonary artery, in regions occupied by the semilunar valves (Figure 2J). The Notch2 locus also exhibits multiple GATA6 peaks in regions of high acetylation and evolutionary conservation across mammals (Figure 2I). Conditional deletion of Notch2 in cardiac neural crest cells leads to hypoplastic aorta and pulmonary arteries due to reduced smooth muscle content (Varadkar et al., 2008). NOTCH2 is one of the causative genes in Alagille syndrome, a multisystem disorder involving hepatic and cardiac anomalies, most commonly peripheral pulmonary stenosis and tetralogy of Fallot (TOF; McDaniell et al., 2006). Like MECOM, NOTCH2 is primarily expressed in the aorta and pulmonary artery (Figure 2J). Strong GATA6 binding is also observed at the Ltbp1 gene, which encodes an extracellular regulator of TGF-β signaling (Figure 2I). Loss of the long isoform of Ltbp1 in mice results in CAT and interrupted aortic arch, due to defective cardiac neural crest cell function (Todorovic et al., 2007). Unlike MECOM and NOTCH2, LTBP1 is highly expressed in the walls of the aorta and pulmonary artery, including the OFT septum (Figure 2J). Additional genes in the GATA6-regulated network include SLIT2 and ROBO1, components of the SLIT–ROBO signaling pathway implicated in BAV (Mommersteeg et al., 2015), and structural genes like MYH10 and COL1A2 (Figure 2—figure supplement 1C). 
Collectively, the convergence of OFT phenotypes, OFT-specific GATA6 binding and enhancer activity (H3K27Ac enrichment), and the expression patterns of regulon genes supports a GATA6-driven transcriptional network implicated in OFT development, particularly the semilunar valves. These findings delineate molecular mechanisms underlying GATA6 function in the developing valves and highlight candidate genes that may contribute to BAV susceptibility.

SCENIC analysis links mesenchymal cluster 4 to semilunar valve development. We next applied trajectory inference to examine its relationship to other embryonic clusters at the same stage. RNA velocity predicts the future state of individual cells by distinguishing between unspliced and spliced mRNAs (La Manno et al., 2018). We detected the transition of some nuclei from cluster 7 (endothelial-like cells) specifically toward cluster 4 (mesenchymal cells) (Figure 2K). Cluster 7 exhibits a similar expression profile to cluster 4, but distinct from the other mesenchymal nuclei (cluster 20) (Figure 2L). This profile includes high expression of HAPLN1, encoding the extracellular matrix cross-linking protein Hyaluronan and Proteoglycan Link Protein 1, found in the endocardial lining and the developing semilunar valves in the OFT. Compared with fetal endothelial clusters, endothelial cluster 7 is enriched for genes associated with aortic and pulmonary valve morphogenesis and epithelial-to-mesenchymal transition (EMT) (Figure 2—figure supplement 1F; Supplementary file 4). In mice, semilunar valve formation begins with EMT of endothelial cells in the endocardium, the cell layer adjacent to the myocardium (Gittenberger-de Groot et al., 1998; Kirby et al., 1983). This suggests that cluster 7 likely contains an endocardial population, with RNA velocity tracking their transition into mesenchymal cells in cluster 4, thereby generating the precursors of valve interstitial cells.
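The reasoning behind RNA velocity can be illustrated with a minimal sketch. It assumes the simplest steady-state form of the model described by La Manno et al. (2018), in which unspliced counts are scaled so that the splicing rate equals 1 and the velocity of a gene in a nucleus is u − γs. The function names and toy counts are hypothetical, and γ is fitted across all nuclei for brevity. This is an illustration of the principle, not the pipeline used in this study.

```python
def fit_gamma(spliced, unspliced):
    """Slope of a least-squares line through the origin (unspliced ~ gamma * spliced)."""
    numerator = sum(s * u for s, u in zip(spliced, unspliced))
    denominator = sum(s * s for s in spliced)
    return numerator / denominator if denominator else 0.0


def velocity(spliced, unspliced, gamma):
    """Per-nucleus velocity for one gene: positive = induction, negative = repression."""
    return [u - gamma * s for s, u in zip(spliced, unspliced)]


# Toy counts for one gene in nuclei assumed to be at steady state
# (hypothetical values, not data from the study).
steady_spliced = [0.0, 1.0, 2.0, 4.0]
steady_unspliced = [0.0, 0.5, 1.0, 2.0]
gamma = fit_gamma(steady_spliced, steady_unspliced)

# A nucleus carrying more unspliced mRNA than the steady state predicts:
# the gene is being induced, so its velocity is positive.
v = velocity([2.0], [3.0], gamma)
print(gamma, v)
```

Combining the signs of these per-gene velocities across many genes is what gives each nucleus a predicted direction of travel in expression space, such as the movement from cluster 7 toward cluster 4.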

To understand why GATA6 emerges as a top regulator specifically in cluster 4, we examined GATA6 expression across embryonic nuclei. Although GATA6 is expressed in all embryonic clusters, its levels are highest in cluster 4 (Figure 2—figure supplement 1D), which may account for its restricted activity in this population. Alternatively, given that TFs typically act cooperatively, GATA6 may coregulate the cluster 4 regulon in combination with additional factors.

To identify these additional factors, we compared the regulons of cluster 4 top transcriptional regulators (Figure 2D). As expected, since regulon genes are sampled from cluster 4 enriched transcripts, these regulators share many downstream targets with GATA6 (19–30% overlap). Notably, GLI3 shows substantially greater regulon overlap (56%), suggesting functional cooperation with GATA6. This is consistent with their reported cooperation in the developing mouse limb (Hayashi et al., 2016). GLI3 is also enriched in cluster 4, further supporting the hypothesis that these TFs cooperate in the development and differentiation of this cell population (Figure 2—figure supplement 1E).
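The regulon comparison can be sketched as a simple set overlap. The example below assumes that overlap is expressed as the fraction of the reference regulon's targets that also appear in another regulator's regulon; the text does not state which denominator was used, so this choice, the function names, and the placeholder gene identifiers are all hypothetical.

```python
def overlap_fraction(reference, other):
    """Fraction of the reference regulon's targets that also occur in `other`."""
    reference, other = set(reference), set(other)
    if not reference:
        return 0.0
    return len(reference & other) / len(reference)


def rank_partners(reference_name, regulons):
    """Sort all other regulators by their overlap with the reference regulon."""
    reference = regulons[reference_name]
    scores = {
        name: overlap_fraction(reference, targets)
        for name, targets in regulons.items()
        if name != reference_name
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Placeholder regulons (hypothetical gene identifiers, not study data).
regulons = {
    "TF_ref": {"g1", "g2", "g3", "g4", "g5"},
    "TF_a": {"g1", "g9"},
    "TF_b": {"g1", "g2", "g3", "g7"},
}
ranking = rank_partners("TF_ref", regulons)
print(ranking)
```

A regulator that stands well above the background overlap shared by all regulators of the same cluster, as GLI3 does here, is the kind of candidate this ranking is meant to surface.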

Spatial resolution of mesenchymal nuclei in the OFT

Mesenchymal cells build key structures in the OFT, specifically the septum separating the aorta from the pulmonary artery (Figure 3A, C, E) and the semilunar valves at the base of the aorta and pulmonary artery (Figure 3B, D, F). At 12 pcw, mesenchymal nuclei express markers of differentiated cell types (Figure 2C). To visualize the distribution of fetal mesenchymal cell populations within the OFT, we conducted spatial gene expression analysis. We generated four sections of a 12 pcw OFT (equivalent stage to snRNA-seq), starting at the level of the aortic valves (a) and ending at the level of the pulmonary valves (d) (Figure 3G). We used Cell2location (Kleshchevnikov et al., 2022) to transfer the labels from the transcriptomic data to the spatial gene expression data and mapped the different cell types to each spatial location. Cardiac nuclei correctly mapped to the atria and the ventricle (Figure 3—figure supplement 1A). Mesenchymal cells were largely distributed within and around the vessels, and also mapped to the valves of the pulmonary artery and the aorta (Figure 3—figure supplement 1B). Endothelial cells were primarily located in the aortic valves (Figure 3—figure supplement 1C). Immune cells were scattered around the tissue (Figure 3—figure supplement 1D), while neural cells were concentrated in one spot in the atria (Figure 3—figure supplement 1E). Next, we mapped the five fetal mesenchymal clusters to distinct structures in the OFT (Figure 3H) and used distinctive markers to confirm spatial assignments. Clusters 3 and 6 (Figure 3I, II) map to the arterial outer walls and the septum between the aorta and pulmonary arteries and largely co-localize with DCN-positive cells (Figure 3III). These two clusters largely overlap. Cluster 9 (Figure 3J) coincides with MYH11 expression (Figure 3JI): MYH11 is a terminal marker of smooth muscle differentiation (Babij et al., 1991), which identifies the aortic smooth muscle layer.
Cluster 12 (Figure 3K) is concentrated in both the aortic and pulmonary valves and co-localizes with HAPLN1 (Figure 3KI), a marker of the semilunar valves (Fang et al., 2014). No molecular differences or distinguishing markers were identified between the aortic and pulmonary valves. Finally, the smallest cluster (15) is restricted to the cardiac ventricle and is DCN-negative; it was therefore excluded from further analysis (Figure 3—figure supplement 1F). Thus, the five subtypes of mesenchymal nuclei identified by snRNA-seq largely correspond to spatially segregated cell populations of fibroblasts (clusters 3 and 6), smooth muscle cells (cluster 9), and valvular interstitial cells (VICs; cluster 12). Gene expression across four sections of the fetal heart can be visualized at https://cellxgene.cziscience.com/collections/5d2077ea-7b49-45c8-b4cb-64790b698591.

Figure 3 with 1 supplement
Spatial distribution of mesenchymal clusters.

(A–F) Outflow tract (OFT) valve formation and remodeling. Images were prepared using high-resolution episcopic microscopy at embryonic stages (A–D) and by micro-CT at the fetal stage (E, F). (A) By CS16 the OFT has septated into the aorta (Ao) and pulmonary trunk (PT). The immature OFT cushions are visible: the septal (yellow), parietal (green), and intercalated (purple) cushions. (B) Neural crest cells (asterisks) contribute to valve formation. (C, D) At CS20 the cushions have begun to remodel to form the three leaflets of the aortic and pulmonary semilunar valves. (E, F) At 11 pcw, the valves have transformed into the leaflets that control the unidirectional flow of blood from the heart. Boxed regions in (A, C, E) are shown at higher magnification in (B, D, F). (G) Heart alignment for sectioning, with the OFT region marked in red (left). Hematoxylin and eosin (H&E) staining of OFT cryo-sections used for spatial transcriptomics from the base of the OFT (a) to the pulmonary valves (d). The aorta and pulmonary trunk are indicated by blue and green arrowheads, respectively. (H) H&E section ‘c’ annotated to show major structures. (I, II, J, K) Spatial distribution of mesenchymal clusters (purple and yellow). (III, JI, KI) Spatial distribution of lineage-specific markers (white and red). (I–III) Clusters 3 (I) and 6 (II) largely overlap with fibroblast lineage marker DCN (III). Cluster 9 (J) and smooth muscle lineage-specific marker MYH11 (JI) map to the aortic walls as well as to the pulmonary artery. Cluster 12 (K) and valve-specific marker HAPLN1 (KI) are mainly found in the valves at the base of the aorta and pulmonary artery. See also Figure 3—figure supplement 1.

Developmental trajectories of mesenchymal cells in the developing OFT

At CS16–17, we identified two clusters of undifferentiated mesenchymal cells. By 12 pcw, three main types of differentiated mesenchymal populations had emerged: fibroblasts, smooth muscle cells, and VICs, each occupying distinct spatial locations within the OFT (Figure 2A, B). Our objective was to track the developmental trajectories of these populations and identify the 12 pcw descendant cells from each embryonic cluster. Connecting mesenchymal embryonic progenitors to their differentiated fetal counterparts is challenging because the embryonic nuclei are yet to express the molecular markers of differentiated lineage descendants (Figure 2C). Trajectory inference methods (Trapnell et al., 2014) failed to establish lineage relationships between embryonic and fetal populations. To overcome this, we used gene module scores. The rationale behind this approach was that any distinctive developmental signature present in the embryonic clusters would likely be retained in the fetal nuclei, thereby enabling us to trace the trajectories of mesenchymal cell populations.

We first performed pairwise differential gene expression analysis of embryonic clusters to identify distinct developmental signatures of mesenchymal subtypes. Cluster 4, which partly derives from endocardial cells, is enriched in cardiac-like markers (Figure 4A) and is linked to ‘aortic valve and endocardial cushion morphogenesis’ (Figure 4—figure supplement 1A). In contrast, cluster 20 is largely associated with neural-like GOs (Figure 4—figure supplement 1B) and enriched in neural crest markers HOXA3/B3 and PDGFRB (Figure 4A). These observations suggest two separate embryonic origins for clusters 20 and 4, neural crest and second heart field, respectively. From the set of differentially expressed genes, we largely selected TFs, which define lineage identity, to construct two distinct gene modules. Specifically, the ‘TBX20 SOX6 GATA4 PRRX1’ module is highly expressed in embryonic cluster 4, while the ‘MEIS1 JAG1 ROR1 PRDM6’ module is enriched in embryonic cluster 20. These eight developmental genes were sufficient to segregate the embryonic nuclei into two distinct subtypes (Figure 4B), which we designated as groups 1 and 2. Group 1 consists entirely of nuclei from cluster 4, while group 2 is primarily composed of nuclei from cluster 20, with a smaller contribution from cluster 4.
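The gene-module approach can be sketched as follows. The two module definitions are taken from the text above; everything else is an assumption made for illustration: the score is the plain mean of normalized expression over the module genes, nuclei are assigned to whichever module scores higher (a simplification of the k-means grouping used for Figure 4B), and the expression values are hypothetical.

```python
MODULE_1 = ("TBX20", "SOX6", "GATA4", "PRRX1")  # high in embryonic cluster 4
MODULE_2 = ("MEIS1", "JAG1", "ROR1", "PRDM6")   # high in embryonic cluster 20


def module_score(expression, genes):
    """Mean expression of the module genes in one nucleus (undetected genes count as 0)."""
    return sum(expression.get(gene, 0.0) for gene in genes) / len(genes)


def assign_group(expression):
    """Return 1 or 2 depending on which module dominates (ties go to group 1)."""
    score_1 = module_score(expression, MODULE_1)
    score_2 = module_score(expression, MODULE_2)
    return 1 if score_1 >= score_2 else 2


# Two toy nuclei with hypothetical normalized expression values.
nucleus_a = {"TBX20": 2.0, "GATA4": 1.0, "MEIS1": 0.5}
nucleus_b = {"MEIS1": 2.0, "PRDM6": 2.0, "SOX6": 1.0}

groups = [assign_group(nucleus_a), assign_group(nucleus_b)]
print(groups)
```

Because the score depends only on a handful of lineage-defining genes, the same function can be applied unchanged to fetal nuclei to ask whether an embryonic signature has been retained.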

Figure 4 with 1 supplement
Lineage deconvolution of embryonic and fetal nuclei.

(A) Pairwise differential gene expression of the two embryonic mesenchymal clusters; genes chosen for gene modules are marked by asterisks. (B) Heatmap using embryonic gene modules, obtained using k-means clustering, separates embryonic mesenchymal nuclei into two groups. (C) Mesenchymal cell clusters of embryonic and fetal time points, projected on a two-dimensional t-SNE and labeled using gene modules. Fetal clusters derive from separate ‘blue’ and ‘red’ embryonic lineages. (D) Heatmap of embryonic gene modules and cell type marker genes using k-means clustering identifies three main groups of fetal nuclei. (E) Lineage trajectories of embryonic and fetal nuclei. Using the entire fetal dataset, nuclei of clusters 3 and 12 are identified as descendants of embryonic cluster 4, while clusters 6 and 9 are the most likely descendants of embryonic cluster 20, consistent with the use of gene modules in the mesenchymal subset of fetal nuclei in (D).

Next, we asked whether embryonic gene modules are inherited by fetal nuclei. Indeed, we found that the expression of our embryonic gene modules largely segregates developmental nuclei into two distinct lineages (Figure 4C): a ‘blue’ lineage, composed mainly of clusters 4, 3, and 12, which exhibits higher expression of module 1 (SOX6, TBX20), and a ‘red’ lineage, including clusters 20, 6, and 9, which shows higher levels of module 2 expression (MEIS1 and PRDM6). We then combined embryonic gene modules with markers of differentiated cell types, obtaining three distinct groups of 12 pcw mesenchymal nuclei (Figure 4D). The module 1 embryonic signature is inherited by fibroblasts and valvular cells, corresponding to clusters 3 and 12, respectively (Figure 4D), suggesting a common embryonic progenitor for these cell types. Since module 1 defines the second heart field-derived embryonic cluster 4, we conclude that fetal group 1 nuclei derive from the second heart field. Conversely, the module 2 signature was inherited by two distinct cell types: smooth muscle cells (cluster 9) and fibroblasts (cluster 6). These groups were further segregated by the expression of cell type-specific markers, MYH11 (group 2) and DCN (group 3). Module 2 is primarily associated with cluster 20, which is enriched in neural crest markers, with a smaller contribution from cluster 4, suggesting that fetal group 2–3 nuclei derive from both the neural crest and second heart field. Consistent with this, expression of the cardiac neural crest markers HOXA3/B3 is almost exclusively confined to ‘group 2–3’ fetal nuclei, supporting the notion that these nuclei (mainly clusters 6 and 9) partially derive from neural crest progenitors. This is in line with observations in the mouse model (Sawada et al., 2017), where smooth muscle cells in the aortic root originate from both neural crest and second heart field progenitors.

To confirm the lineage relationships, we developed a robust method to trace embryonic signatures in fetal cells. We expanded the gene module repertoires to include the top 100 most distinctive genes from the mesenchymal progenitors of clusters 4 and 20 (Supplementary file 5) and examined fetal nuclei populations that retained expression of these genes. When applied to the entire 12 pcw dataset (including cardiac, endothelial, and mesenchymal clusters), our method confirmed the same lineage relationships between embryonic and fetal mesenchymal nuclei that were previously identified using gene modules (Figure 4E). In sum, our analysis indicates that the two spatially distinct mesenchymal populations in the fetal OFT, smooth muscle cells and valvular fibroblasts (Figure 3J, JI and K, KI), derive from separate embryonic populations: cluster 20 for smooth muscle cells and cluster 4 for valvular fibroblasts. In contrast, the DCN-positive, HAPLN1-negative fibroblast cells, which spatially intermingle in OFT tissues, derive from both cluster 4 and cluster 20 embryonic progenitors (Figure 4—figure supplement 1C).

Cellular constituents of adult aortic valves

Semilunar (aortic and pulmonary) valves are the only distinctive and recognizable derivatives of the OFT that are retained in the adult. For adult samples, we focused on the aortic valves because of their frequent association with disease, both genetic (BAVs) and adult-onset (valve calcification). We collected female samples to mitigate individual variability and maximize the likelihood of analyzing healthy aortic valves, justified by the lower incidence and severity of aortic disease in females versus males. Histologically normal aortic valves were procured from healthy adult hearts collected for transplantation and subsequently rejected. A total of 5430 single nuclei from three human aortic valve samples (Figure 5A) were segregated into 11 different clusters by unsupervised clustering (Figure 5B). Leveraging established markers, we could separate the clusters into three major compartments: interstitial (seven clusters, largely NAV2-positive), endothelial (two clusters, CDH5-positive) (Giannotta et al., 2013), and immune (two clusters, ITGAM-positive macrophages and IKZF1-positive dendritic cells) (Hulin et al., 2018; Collin and Bigley, 2018; Figure 5C).

Figure 5 with 3 supplements
Cellular constituents of the mature aortic valves.

(A) Aortic valve sample association projected on a two-dimensional uniform manifold approximation and projection (UMAP). Nuclei are colored by sample. (B) Nuclei clusters visualized by unsupervised clustering. Nuclei are colored by cluster. (C) Cell lineages identified using established lineage-specific markers. Each nucleus is colored based on the scaled expression of the indicated marker. (D) Top differentially expressed genes in clusters identify known lineage markers. (E) Overview of the method used to trace adult descendants of embryonic nuclei. The first step is the identification of distinctive signatures in embryonic progenitors; for this, we used the top 100 differentially expressed (DE) genes in our chosen progenitor populations, cluster 4 (blue) and cluster 20 (red). The second step is the identification of the top 5000 marker genes for each adult population; this is done by comparing each cluster with the rest of the dataset. Finally, we search for the 100 DE embryonic genes in the marker genes of adult clusters. Adult clusters with top hits are identified as the descendants of the embryonic lineage; the statistical significance is calculated using a hypergeometric test. (F) Dotplot displaying the top 30 DE genes (mean expression values) in embryonic clusters 4 and 20, respectively. The same dotplot, previously shown in Figure 3C, has been included here to facilitate cross-comparison with Figure 5G, H. (G) Distribution of cluster 4 embryonic signature genes in adult nuclei clusters. Clusters 4, 7, and 5 express a highly significant fraction of embryonic cluster 4 genes. Top 30 DE genes in embryonic clusters 4 and 20 (shown in F) are highlighted by dots. (H) Distribution of cluster 20 embryonic signature genes in adult nuclei clusters. Cluster 1 expresses a highly significant fraction of embryonic cluster 20 genes. Top 30 DE genes in embryonic clusters 4 and 20 (shown in F) are highlighted by dots.

We performed differential gene expression analysis to aid in the classification of each cell cluster. Heatmaps (Figure 5—figure supplement 1A) group together immune (2, 6) and endothelial cell types (8, 10). In addition to CDH5 (Bach et al., 1998), endothelial clusters are distinguished by high levels of ADAMTS9, which encodes a metalloproteinase implicated in aortic valve anomalies in mice (Kern et al., 2010; Figure 5D). ADAMTS9 is also expressed in cluster 11, which is CDH5-negative (Figure 5D). Of the remaining clusters, cluster 1 is transcriptionally distinct and contains CALD1-, ACT1-positive SMC. AV3 is highly enriched in this cluster (26% vs 5.8% and 2.9% in AV1 and AV2, respectively) (Figure 5—figure supplement 1B); we attribute this skewed enrichment to aortic wall tissue being sampled with the aortic valves in AV3, as SMC are not a main constituent of the aortic valves, rather than to intrinsic variability across samples. Indeed, when cluster 1 is removed, the three samples are consistently similar to each other (Figure 5—figure supplement 1C). Cluster 11, which accounts for a small proportion of nuclei in all three samples (ranging from 1.5% to 2.3%), was the cluster most closely related to cluster 1 (Figure 5—figure supplement 1B; Figure 5E).

Clusters 3, 4, 5, and 7 display expression of similar transcripts, such as NAV2, LTBP1, and FN1, and were assigned to VICs (Figure 5—figure supplement 1A; Figure 5D). VICs deposit three highly organized layers (fibrosa, spongiosa, and ventricularis) of extracellular matrix (ECM) that compose the mature valve structure (Rutkovskiy et al., 2017). These cells are identified by the expression of transcripts encoding ECM proteins, most notably COL1A1, VIM, COL3A1, VCAN, BGN, and LUM. ECM-encoding transcripts were enriched in clusters 3, 4, 5, and 7 (VIC) across all the adult samples examined (Figure 5—figure supplement 1D), with VIM and VCAN displaying a broader expression. Unlike cluster 5, clusters 4 and 7 express high levels of CCL2, an inflammatory cytokine (Kumar and Boss, 2000), suggesting that nuclei in these clusters correspond to activated fibroblasts. Finally, cluster 9 displays enrichment in DLC1 and TLN2, implicated in cytoskeletal changes (Yuan et al., 1998; Monkley et al., 2001), and SPARC (osteonectin), implicated in valve calcification (Bradshaw et al., 2003). While the above clusters are represented across all samples, AV1 clusters 4, 7, and 9 contain on average more nuclei relative to AV2 and AV3 (Figure 5D; Figure 5—figure supplement 1E). The increased expression of SPARC (cluster 9), combined with higher levels of CCL2 (clusters 4 and 7) in AV1, indicates the possibility of inflammatory processes that could eventually lead to valve calcification in the AV1 sample. We did not detect myocardial cells in the adult valve tissue, consistent with evidence that myocardium contributes to early arterial root and cushion formation but does not persist in mature valves. Myocardial gene expression is already absent from the valve leaflet cluster by CS16–19 (Queen et al., 2023). Our adult dataset therefore reflects the valve complex and adjacent arterial root region, a subset of embryonic OFT derivatives rather than the entire OFT myocardium.

Persistence of embryonic signatures in adult cells

Whilst differentiated, 12 pcw nuclei maintain robust embryonic signatures (Figure 4D, E). We asked if such signatures also persist in terminally differentiated adult cells and could be used as labels to retrospectively attribute cellular relationships between embryonic and adult cells. To address this, we leveraged our method to identify embryonic signatures in adult cells. Using the top 100 most distinctive genes of the embryonic mesenchymal progenitors (clusters 4 and 20 in the embryo) (Supplementary file 5), we looked for populations of adult nuclei enriched in the expression of most of these genes (Figure 5E). We found that adult clusters 1, 4, and 7 retain expression of a highly significant fraction of distinctive embryonic genes, with clusters 4 and 7 largely maintaining the embryonic cluster 4 expression signature (blue) (Figure 5F, G), while cluster 1 displays a significant match with embryonic cluster 20 (red) (Figure 5F–H). The finding that smooth muscle cells (adult cluster 1) are derived from embryonic cluster 20 is consistent with cluster 20 being the source of smooth muscle cells at 12 pcw (Figure 4). Valvular fibroblasts (clusters 4 and 7) derive from the embryonic population in cluster 4, which is also linked to fetal valvular fibroblasts (Figure 4). This result is significant because our embryonic signatures derive from undifferentiated cells that are yet to express obvious differentiation markers, suggesting that adult cells retain their ancestral embryonic makeup. The expression of a representative gene set from the 100-gene embryonic signatures was projected onto adult valve cells, confirming the findings shown in Figure 5F–H (Figure 5—figure supplement 2A, B). As our adult clusters reflect aggregation of three individual samples, we asked if distinctive embryonic signatures can be detected despite confounding individual factors (aging, environmental exposure, genetic background, etc.).
We performed the same analysis on individual samples AV1 and AV3 (AV2 was excluded due to the lower number of nuclei). We independently re-clustered each sample and applied our method to identify the descendant nuclei of embryonic cluster 4 within individual samples. Our method linked embryonic cluster 4 to cluster 0 in AV1, and to clusters 2 and 3 in AV3 (Figure 5—figure supplement 2A, B); these clusters contain the majority of adult cluster 4 aggregate nuclei (Figure 5—figure supplement 2C, D). This indicates that distinctive embryonic signatures can be detected over individual variability. In sum, our analysis indicates that distinctive patterns of embryonic gene expression persist in adult cells, can be consistently detected in individual adult samples, and can be used as labels to retrospectively attribute cellular relationships between embryonic and adult cells. Finally, as the level of expression of a gene across different cells can provide an initial indication of its functional role, we explored the relative expression levels of distinctive embryonic signatures in embryonic nuclei and their adult descendant nuclei. Generally, embryonic genes display lower expression levels in adult cells relative to their embryonic progenitors (Figure 5—figure supplement 2E, F).
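
The signature-tracing step described above (top 100 embryonic DE genes searched among the top marker genes of each adult cluster, scored with a hypergeometric test) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy gene labels, and the size of the background gene universe (`n_background`) are assumptions introduced for illustration, and every signature and marker gene is assumed to belong to that background.

```python
from math import comb


def hypergeom_sf(k, population, successes, draws):
    """Return P(X >= k) for a hypergeometric variable.

    population: size of the gene universe (assumed background)
    successes:  number of signature genes in the universe
    draws:      number of marker genes taken for the adult cluster
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return tail / total


def trace_signature(signature, cluster_markers, n_background):
    """Score each adult cluster by its overlap with an embryonic signature.

    Returns {cluster: (overlap, p_value)}; a small p-value means more
    signature genes among the cluster markers than expected by chance.
    """
    sig = set(signature)
    results = {}
    for cluster, markers in cluster_markers.items():
        marker_set = set(markers)
        overlap = len(sig & marker_set)
        p_value = hypergeom_sf(overlap, n_background, len(sig), len(marker_set))
        results[cluster] = (overlap, p_value)
    return results


# Toy example with hypothetical gene labels and a 10-gene background.
toy_signature = ["A", "B", "C"]
toy_markers = {"c1": ["A", "B", "C", "D"], "c2": ["W", "X", "Y", "Z"]}
toy_results = trace_signature(toy_signature, toy_markers, n_background=10)
```

In a real analysis the signature would hold 100 genes and each marker list up to 5000 genes, as in Figure 5E; the choice of background universe affects the p-values and should be reported.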

Spatial profiling of OFT defect genes reveals candidates for congenital heart malformations

OFT defects occur when the vessels leaving the heart do not form or remodel correctly, resulting in problems with blood circulation and/or oxygenation in the neonate (Neeb et al., 2013). The most severe include common arterial trunk (CAT), transposition of the great arteries (TGA), double-outlet right ventricle (DORV), and TOF. Using spatial transcriptomics, we investigated the distribution of genes whose mutations cause OFT defects (Figure 6A, AI). JAG1 mutations cause TOF (Eldadah et al., 2001); JAG1 transcripts largely concentrate around the aorta, in the semilunar valves, and the septum (Figure 6B, BI). Mutations in GATA family members affect OFT formation in different ways, leading to DORV (GATA5) (Jiang et al., 2013), TOF (GATA4 and GATA6) (Maitra et al., 2010; Tomita-Mitchell et al., 2007), and CAT (GATA6) (Kodo et al., 2009). Unlike GATA5, whose expression is restricted to the vessels (Figure 6C, CI), GATA6 and GATA4 transcripts are more broadly distributed. GATA6 is detected in the vessels, the septum, and cardiac tissue (Figure 5—figure supplement 3A), while GATA4 transcripts are more prominent in cardiac tissues (Figure 5—figure supplement 3B). Mutations in GATA4, GATA5, and GATA6 also cause BAV, a less severe condition in which the aortic valve has two leaflets instead of three (Yang et al., 2017). Similar to GATA5, GATA6 is highly expressed in the valves, marked by HAPLN1 (Figure 3KI). As a final example, mutations in NR2F2 cause DORV and TOF (Al Turki et al., 2014). In contrast to the previous examples, NR2F2 transcripts are largely excluded from OFT structures and are mainly located in cardiac tissues surrounding the aorta and the pulmonary artery (Figure 6D, DI).

Figure 6 with 1 supplement
Genes mutated in congenital OFT defects.

(A) Hematoxylin and eosin (H&E) staining of spatial transcriptomics section (Figure 2J), and magnified view of aortic and pulmonary valve area (AI). The aorta and pulmonary trunk are indicated by blue and green arrowheads, respectively. JAG1 (B), GATA5 (C), and NR2F2 (D) gene expression patterns on the same section. (BI–DI) Genes as in B–D with corresponding magnification of valve area. (E) Genes identified as displaying spatially similar expression patterns to JAG1.

We reasoned that genes whose distribution pattern resembles or matches that of genes mutated in OFT defects could serve as potential candidates for OFT anomalies. To test this hypothesis, we focused on JAG1, whose transcripts exhibit a distinct pattern around the vessels, and generated an algorithm to identify genes that display comparable expression patterns to JAG1 (Figure 6E). The algorithm filters out any gene that is not expressed in the majority (>50%) of JAG1-positive spots or that is also expressed in JAG1-negative spots (see methods). Using this approach, we identified FOXC1 and OSR1, whose loss of function in mice results in semilunar valve abnormalities and aortic arch coarctation, and defects in heart septation, respectively (Sanchez et al., 2020; Wang et al., 2005). In addition, variants in FOXC1 were recently identified in patients with conotruncal heart defects (Wei et al., 2024).
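
A minimal sketch of such a spot-based filter is given below. It is an illustration under stated assumptions, not the published algorithm: a spot counts as positive when its count is above zero, the tolerated fraction of expressing JAG1-negative spots (`max_neg_frac`) is a hypothetical parameter, since the exact cut-off is given only in the methods, and the gene labels in the toy data are placeholders.

```python
def genes_similar_to_reference(expr, reference="JAG1",
                               min_pos_frac=0.5, max_neg_frac=0.1):
    """Return genes whose spot-level pattern resembles the reference gene.

    expr maps gene name -> list of counts, one per spot, in the same
    spot order for every gene. A gene is kept when it is detected in
    more than `min_pos_frac` of reference-positive spots and in at most
    `max_neg_frac` (assumed threshold) of reference-negative spots.
    """
    ref_counts = expr[reference]
    pos_spots = [i for i, c in enumerate(ref_counts) if c > 0]
    neg_spots = [i for i, c in enumerate(ref_counts) if c == 0]

    kept = []
    for gene, counts in expr.items():
        if gene == reference:
            continue
        pos_frac = (sum(counts[i] > 0 for i in pos_spots) / len(pos_spots)
                    if pos_spots else 0.0)
        neg_frac = (sum(counts[i] > 0 for i in neg_spots) / len(neg_spots)
                    if neg_spots else 0.0)
        if pos_frac > min_pos_frac and neg_frac <= max_neg_frac:
            kept.append(gene)
    return kept


# Toy data: six spots, hypothetical gene labels.
toy_expr = {
    "JAG1":   [3, 2, 0, 0, 1, 0],
    "GENE_A": [1, 4, 0, 0, 0, 0],  # mostly co-detected with JAG1
    "GENE_B": [1, 1, 1, 1, 1, 1],  # ubiquitous, rejected
    "GENE_C": [0, 0, 0, 0, 1, 0],  # too sparse in JAG1-positive spots
}
toy_hits = genes_similar_to_reference(toy_expr)
```

Setting `max_neg_frac` to zero gives the strictest reading of the filter; a small non-zero tolerance is often more practical for sparse spatial data.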

Discussion

We used snRNA-seq to analyze 30,166 nuclei, which covered two stages of OFT development (CS16–17 and 12 pcw) up to the adult derivatives of the OFT, the aortic valves. In parallel, we used spatial transcriptomics to define the distribution of fetal cell populations and visualize the expression patterns of genes implicated in OFT defects, which make up one-third of all cases of CHD. These datasets provide a large reference framework of OFT cell repertoires and their gene expression profiles, and constitute a valuable resource for enhancing our understanding of OFT defects and advancing the discovery of previously unknown genes implicated in these conditions. A major finding emerging from our analysis of a timeline of human development is that adult cells retain their ancestral embryonic signatures, which may have implications for adult-onset aortic valve disease.

During human embryo development, the OFT undergoes formation and remodeling between CS13 and CS23. This process results in the separation of the aorta and pulmonary trunks, creating the double circulation found in mammals. It also leads to the formation of the arterial valves, which continue to mature after the arterial vessels have separated. Focusing primarily on mesenchymal cells, which are responsible for building the semilunar valves and the separation of the aorta from the pulmonary artery, we identify two distinct groups of embryonic progenitors at the earliest stage (CS16/17). These mesenchymal nuclei do not yet express clear differentiation markers, but display distinctive neural crest and second heart field expression signatures. In contrast, by 12 pcw mesenchymal nuclei express cell type markers and cluster into four subgroups, corresponding to fibroblast-like cells, smooth muscle cells, and VICs, each localized to distinct regions within the OFT. Using gene modules and a new lineage tracing tool, we traced these four fetal mesenchymal populations back to their embryonic precursors. Our results reveal that valvular fibroblasts and smooth muscle cells have separate embryonic origins. Valvular fibroblasts, the mesenchymal cells involved in constructing the semilunar valves, appear to originate from endothelial cells. This aligns with the finding that mouse arterial valves derive from endothelial cells undergoing EMT in the endocardium – the myocardium-adjacent cell layer (Gittenberger-de Groot et al., 1998; Kirby et al., 1983). The mesenchymal progenitors of fetal valvular fibroblasts and adult VICs are regulated by a GATA6-driven gene network. Mutations in GATA6 are a known cause of BAV, suggesting that this regulatory program plays a critical role in semilunar valve development. The GATA6 regulon includes genes whose mutations are associated with OFT and valve defects in mice, as well as aortic valve disease in humans. 
Although we did not uncover a novel single-gene marker specific to humans (analogous to LGR5; Sahara et al., 2019), our identification of a GATA6 network highlights molecular mechanisms downstream of GATA6 that drive valve formation, advancing our understanding of normal valve development and informing the search for genes involved in BAV susceptibility and aortic valve disease. In contrast, smooth muscle cells derive from a distinct CS16–17 embryonic population. A consistent set of these cells displays expression of cardiac neural crest markers, in agreement with previous observations in mice, indicating that smooth muscle cells in the human aortic root originate from the neural crest and second heart field (Sawada et al., 2017). Finally, both embryonic mesenchymal populations also give rise to non-valvular fibroblast cells (DCN-positive and HAPLN1-negative).

We find that distinctive embryonic signatures persist into adulthood. While previous studies have described the reactivation of fetal programs in adult heart disease (Dirkx et al., 2013; Oka et al., 2007), all our adult samples are derived from healthy individuals. This suggests that the persistence of early developmental signatures is a pervasive feature of normal adult cells, not merely a pathological response. Single-cell analyses have revealed high levels of heterogeneity within the same cell type. Our findings suggest that the inheritance of 'embryonic memories' by adult cells may be a major contributing factor to this heterogeneity. In support of this, organ-specific developmental signatures have been observed in adult fibroblasts (Forte et al., 2022). In the human context, the persistence of distinct embryonic signatures in adult cell types may influence their susceptibility to disease, and potentially contribute to a better understanding of disease heterogeneity.

Connecting cell lineages over extended periods of time can be challenging. Existing tools typically use single-cell RNA sequencing to capture the complete expression profiles of individual cells in a single experiment, treating each cell as a unique time point on a continuum (Trapnell et al., 2014). However, the significant transcriptional changes that occur as cells transition from embryonic to fetal and adult cell types pose a challenge when comparing and linking cells over prolonged periods based on their global expression profiles. The persistence of embryonic signatures into adulthood opens the possibility of using these enduring molecular ‘labels’ to trace developmental ancestry in complex tissues, particularly in contexts where related cells are temporally distant or phenotypically divergent, such as in aging or cancer.

The observation that embryonic expression signatures persist in adult cells raises obvious questions about functional significance. We find that embryonic genes are generally expressed at lower levels in adult cells relative to embryonic cells (where they are known to have a function); their persistent expression may reflect fortuitous remnants of developmental histories. Alternatively, these retained embryonic gene expressions could serve as preserved developmental blueprints, ready to be swiftly and efficiently reactivated when the need arises, such as during injury or tissue repair. Reinforcing this perspective, heart failure is associated with the reawakening of a fetal gene program (Dirkx et al., 2013; Oka et al., 2007).

In summary, our work extends beyond confirming previously reported cell types by (1) defining a GATA6-regulated human valve progenitor lineage and its derivatives, (2) establishing distinct embryonic origins for smooth muscle and valvular fibroblasts, and (3) demonstrating the persistence of embryonic signatures in adult valve cell populations. These conclusions are directly supported in tissue by our spatial transcriptomics data, which map these lineages and regulatory programs to defined anatomical domains within the human OFT and semilunar valves.

Materials and methods

Key resources table
| Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information |
| --- | --- | --- | --- | --- |
| Biological sample (human embryonic tissue) | Human embryonic outflow tract (CS16–17) | Human Developmental Biology Resource (HDBR) | | Staged by HDBR guidelines; male |
| Biological sample (human fetal tissue) | Human fetal outflow tract (12 pcw) | Human Developmental Biology Resource (HDBR) | | Male |
| Biological sample (human adult tissue) | Human aortic valve | Newcastle Institute of Transplantation Tissue Biobank | | Female (age 55–70) |
| Commercial assay or kit | Chromium Next GEM Single Cell 3′ Kit v3.1 | 10x Genomics | CG000315 | RNA-seq library construction |
| Commercial assay or kit | Visium spatial gene expression – Spatial transcriptomics | 10x Genomics | CG000239 | Fetal heart spatial analysis |
| Peptide, recombinant protein (bovine serum albumin) | Ultrapure BSA | Invitrogen | AM2616 | Nuclei resuspension |
| Peptide, recombinant protein (Protector RNAse inhibitor) | RNAse inhibitor | Roche | 3335399001 | Nuclei extraction and sorting |
| Software | Cell Ranger – Pipeline for 10x data | 10x Genomics v3.1.0, v6.1.2 | RRID:SCR_017344 | Alignment and UMI quantification |
| Software | Scanpy – Single cell data analysis | Scanpy v1.9.5 | RRID:SCR_018139 | Normalization and integration |
| Algorithm | scVelo – RNA-velocity analysis | scVelo v0.2.4 | RRID:SCR_018168 | Analysis of embryonic stage dynamics |
| Algorithm | SCENIC – Gene regulatory network inference | SCENIC | RRID:SCR_017247 | Regulon activity scores |
| Algorithm | Cell2location – Spatial deconvolution | Cell2location v0.1.3 | RRID:SCR_024859 | Maps snRNA-seq signatures to spatial spots |

Sample acquisition

Male CS16–17 (embryonic) and 12 pcw (fetal) OFTs were collected after pregnancy termination and snap frozen by the Human Developmental Biology Resource (https://www.hdbr.org/). Embryonic samples were staged according to appearance by the HDBR embryo staging guidelines and pooled due to their limited size, with one CS16 and one CS17 sample in each pool. For fetal (12 pcw) OFT, one OFT was used for each single nuclei RNA-seq, and one OFT for spatial transcriptomics. Adult aortic valve tissue was collected from three female adult (age 55–70) hearts declined for clinical transplantation with written informed consent for research from their families. Donors did not have ECHO evidence of aortic valve pathology or past medical history of aortic valve disease, and inspection of aortic valve leaflets during dissection showed no signs of calcification. The research ethical approval (REC ref 16/NE/0230) was provided by the North East – Newcastle & North Tyneside Research Ethics Committee. The hearts were retrieved in the clinical standard fashion; arrested with 1 l of cold St. Thomas’ cardioplegia solution at the agreed time point when both cardiothoracic and abdominal retrieval teams were ready for organ procurement. The donor hearts were then rapidly retrieved and preserved with either static cold storage or a hypothermic oxygenated perfusion device for 4 hr during which they were transported back to the laboratory at Newcastle University. After preservation, they were reanimated on a modified Langendorff system with blood-based oxygenated perfusate for 4 hr. At the end of the normothermic reperfusion, the donor hearts were dissected. Aortic roots (containing the aortic valves) were excised, immediately snap frozen by being submerged into liquid nitrogen, and stored at –80°C in the Newcastle Institute of Transplantation Tissue Biobank (17/NE/0022). The material was then obtained from the Newcastle Institute of Transplantation Tissue Biobank for analysis.

Imaging

Human samples were processed for high-resolution episcopic microscopy and micro-computed tomography techniques as previously described. Briefly, aligned serial digital sections were imported into Amira (Thermo Fisher Scientific) to produce two- and three-dimensional images (Anderson and Bamforth, 2022). The voxel size was 3 µm (isotropic), which defines the spatial scale of all reconstructed volumes.

Nuclei extraction

Snap frozen embryonic and fetal tissue samples were minced using either a Dounce Homogeniser or a pellet pestle, then lysed for 30 min in ice cold lysis buffer (10 mM Tris HCl pH 7.4, 10 mM NaCl, 3 mM MgCl2, 0.1% Tween-20, 0.1% Igepal, 0.0005% Digitonin, and 0.2 U/µl Protector RNAse inhibitor) or until no tissue pieces could be seen. Lysis was stopped with wash buffer (10 mM Tris HCl pH 7.4, 10 mM NaCl, 3 mM MgCl2, 0.1% Tween-20, 0.1% BSA, and 0.2 U/µl Protector RNAse inhibitor) and filtered through a 20-µm filter. Nuclei were resuspended with PBS with bovine serum albumin (0.1% BSA UltraPure, AM2616 Invitrogen), and Protector RNAse inhibitor (0.2 U/µl, 3335399001 Roche), and processed using 10x Chromium Single Cell 3′ with target recovery set at 2500 nuclei/sample. Snap frozen adult aorta samples were cryosectioned to 50 µm, morphology was assessed by hematoxylin and eosin staining and the valve tissue identified. Valve tissue from 3 to 4 unstained frozen sections was scraped off the slides, minced using a pellet pestle, and lysed in ice cold Igepal lysis buffer (10 mM Tris HCl pH 7.4, 10 mM NaCl, 3 mM MgCl2, 0.05% Igepal, 1 mM DTT, and 1 U/µl Protector RNAse inhibitor) for 10 min. Sample was filtered using a 40-µm cell strainer. After spinning down (500 rcf, 5 min, 4°C) supernatant was removed and gently replaced with PBS/BSA (PBS containing BSA 1% and RNAse inhibitor 1 U/µl) and incubated for 5 min without disturbing the pellet. Supernatant was then removed, and the pellet resuspended in PBS/BSA supplemented with 7-aminoactinomycin D (7-AAD) dye. Nuclei were FACS sorted using a 100-µm diameter nozzle and a sheath pressure of 20 PSI as per 10x protocols (document CG000375). 7AAD allowed nuclei to be identified from debris generated during the processing procedure. 7AAD stained nuclei were excited with a 488-nm blue laser and emission was collected through a 685–725 nm bandpass filter, at an event rate of ~300–500 events per second. 
Following sorting, nuclei were permeabilized using 0.05X Lysis buffer (10 mM Tris HCl pH 7.4, 10 mM NaCl, 3 mM MgCl2, 0.01% Igepal, 0.001% Digitonin, 0.05 mM DTT, 1% BSA, and 1 U/µl Protector RNAse inhibitor) for 1 min, then washed and processed using 10x Chromium Single Cell 3′ with target recovery set at 2500 nuclei/sample.

Visium spatial gene expression

One 12 pcw whole fetal heart sample was embedded in OCT and cryosectioned at 5 µm according to 10x Genomics (document CG000240). A tissue optimization was performed according to 10x Genomics protocol (document CG000238). Permeabilization time was set to 12 min. Four spatial gene expression sections were collected from the base of the aorta at regular intervals ending at the pulmonary valve. Tissue was processed and libraries created according to manufacturer instructions (document CG000239). 25% Ct value was determined by qPCR at 16 cycles.

Single nuclei isolation and library construction

Gene expression libraries were prepared from single nuclei using the Chromium Controller and Single Cell 3ʹ Reagent Kits v3.1 (10x Genomics, Inc Pleasanton, USA) according to the manufacturer’s protocol (document CG000315). Briefly, nanoliter-scale Gel Beads-in-emulsion (GEMs) were generated by combining barcoded Gel Beads, a master mix containing nuclei, and partitioning oil onto a Chromium chip. Nuclei were delivered at a limiting dilution, such that the majority (90–99%) of generated GEMs contained no nuclei, while the remainder largely contained a single nucleus. The Gel Beads were then dissolved, primers released, and any co-partitioned nuclei lysed. Primers containing an Illumina TruSeq Read 1 sequencing primer, a 16-nucleotide 10x Barcode, a 12-nucleotide unique molecular identifier (UMI), and a 30-nucleotide poly(dT) sequence were then mixed with the nuclear lysate and a master mix containing reverse transcription (RT) reagents. Incubation of the GEMs then yielded barcoded cDNA from poly-adenylated mRNA. Following incubation, GEMs were broken and pooled fractions recovered. First-strand cDNA was then purified from the post GEM-RT reaction mixture using silane magnetic beads and amplified via PCR to generate sufficient mass for library construction. Enzymatic fragmentation and size selection were then used to optimize the cDNA amplicon size. Illumina P5 and P7 sequences, i7 and i5 sample indexes, and TruSeq Read 2 sequence were added via end repair, A-tailing, adaptor ligation, and PCR to yield final Illumina-compatible sequencing libraries.

Sequencing

The resulting sequencing libraries comprised standard Illumina paired-end constructs flanked with P5 and P7 sequences. The 16 bp 10x Barcode and 12 bp UMI were encoded in Read 1, while Read 2 was used to sequence the cDNA fragment. i7 and i5 sample indexes were incorporated as the sample index reads. Paired-end sequencing (28:90) was performed on the Illumina NextSeq500 platform using NextSeq 500/550 High Output v2.5 (150 Cycles) reagents. The .bcl sequence data were processed for QC purposes using bcl2fastq software (v. 2.20.0.422) and the resulting .fastq files assessed using FastQC (v. 0.11.3), FastqScreen (v. 0.14.0), and FastqStrand (v. 0.0.7) prior to pre-processing with the CellRanger pipeline.
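
The Read 1 layout described above can be made concrete with a short sketch that splits a 28-base Read 1 into its 10x Barcode and UMI. The helper name is hypothetical, and the barcode-then-UMI order is the standard layout for Single Cell 3′ v3.1 chemistry; in practice this parsing is done by Cell Ranger rather than by hand.

```python
BARCODE_LEN = 16  # 10x Barcode length stated in the text
UMI_LEN = 12      # UMI length stated in the text


def split_read1(sequence):
    """Split a Read 1 sequence into (cell barcode, UMI).

    Assumes the barcode occupies the first 16 bases and the UMI the
    following 12 bases; any extra bases beyond position 28 are ignored.
    """
    required = BARCODE_LEN + UMI_LEN
    if len(sequence) < required:
        raise ValueError(
            f"Read 1 must contain at least {required} bases, got {len(sequence)}"
        )
    barcode = sequence[:BARCODE_LEN]
    umi = sequence[BARCODE_LEN:required]
    return barcode, umi


# Toy read: 16 barcode bases followed by 12 UMI bases.
toy_barcode, toy_umi = split_read1("ACGTACGTACGTACGT" + "TTTTGGGGCCCC")
```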

Data analysis

Data pre-processing

Sequence files generated by the sequencer were processed using the 10x Genomics Cell Ranger pipeline. Developmental samples were analyzed with Cell Ranger v3.1.0, while adult samples used v6.1.2. FASTQ files were generated and aligned to a custom hg38 reference genome using default parameters. For developmental data, a custom premRNA_patched gtf file was used to include intronic reads, whereas for adult data the ‘include-introns’ option was enabled during the cellranger count runs. The pipeline identified cell barcodes corresponding to individual nuclei and quantified UMIs per nucleus. Read alignment was performed using the STAR aligner, with multimapping reads excluded from UMI counting.

Filtering

Low-quality nuclei were removed from the dataset using three commonly used parameters for cell quality evaluation: the number of UMIs per cell barcode (library size), the number of genes per cell barcode, and the proportion of UMIs that map to mitochondrial genes. The thresholds for these three parameters were adjusted individually for each sample to retain only good-quality nuclei. Outlier nuclei with a total read count >50,000 were removed as potential doublets. After filtering, the numbers of retained nuclei/total nuclei were as follows (all replicates are biological replicates): CS16-17_rep1, 4703/7011; CS16-17_rep2, 2576/7761; 12W_rep1, 10,151/12,110; 12W_rep2, 7306/16,110; AV1, 2765/3008; AV2, 446/609; AV3, 2219/2362. To obtain more fine-grained information for each dataset, each individual sample was first analyzed separately. Developmental time points (embryonic and fetal) were then merged, and adult time points were combined independently. Finally, all time points (developmental and adult) were integrated using scanpy (Wolf et al., 2018), yielding a total of 30,166 nuclei for downstream analysis.
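
The three-parameter filter can be sketched as follows. Only the upper limit of 50,000 counts comes from the text; the lower bounds and the mitochondrial fraction in the example are hypothetical placeholders, because the per-sample thresholds are not listed here, and the record type and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class NucleusQC:
    barcode: str
    total_counts: int  # UMIs per barcode (library size)
    n_genes: int       # genes detected per barcode
    mito_counts: int   # UMIs mapped to mitochondrial genes


def passes_qc(nucleus, min_counts, min_genes, max_mito_frac,
              max_counts=50_000):
    """Apply library-size, gene-number and mitochondrial-fraction filters.

    max_counts=50_000 mirrors the doublet cut-off stated in the text;
    the other thresholds are chosen per sample by the analyst.
    """
    if nucleus.total_counts == 0:
        return False
    mito_frac = nucleus.mito_counts / nucleus.total_counts
    return (
        min_counts <= nucleus.total_counts <= max_counts
        and nucleus.n_genes >= min_genes
        and mito_frac <= max_mito_frac
    )


def filter_nuclei(nuclei, **thresholds):
    """Return only the nuclei that pass every QC criterion."""
    return [n for n in nuclei if passes_qc(n, **thresholds)]


# Toy sample with hypothetical thresholds.
toy_nuclei = [
    NucleusQC("n1", 3000, 1500, 30),    # passes
    NucleusQC("n2", 60000, 6000, 100),  # above 50,000 counts
    NucleusQC("n3", 400, 250, 4),       # library too small
    NucleusQC("n4", 2500, 1200, 900),   # high mitochondrial fraction
]
toy_kept = filter_nuclei(toy_nuclei, min_counts=500, min_genes=300,
                         max_mito_frac=0.1)
```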

Normalization and classification of cell-cycle phase

For individual sample analysis, raw gene expression counts were normalized using a deconvolution-based method (Lun et al., 2016a). For integrated analysis in Scanpy, the default normalization approach was applied. Cell cycle phase scores (G1 and G2/M) for each nucleus were calculated using the cyclone method (Scialdone et al., 2015).

Visualization and clustering

The variance of each gene's expression values was decomposed into technical and biological components; highly variable genes (HVGs) were identified as genes whose biological component was significantly greater than zero. HVGs were used to reduce the dimensions of the dataset using PCA; the dataset was further reduced to two dimensions using tSNE and UMAP, with PCA components 1–14 given as input.

Nuclei were grouped into seven clusters using the dynamic tree cut method (Langfelder et al., 2008). For all datasets combined, highly variable genes were identified using scanpy’s highly_variable_genes function (scanpy v. 1.9.5). Nuclei in the scanpy aggregated data were clustered using Louvain clustering with a resolution of 0.4.

Identification of marker genes

For individual sample analysis, marker genes for each cluster were identified using the findMarkers function from the scran package, v. 1.26.2 (Lun et al., 2016b). The function also returns FDR values, corrected for multiple testing using the Benjamini–Hochberg procedure. Marker genes were then used to annotate the cell type of each cluster. For the scanpy aggregated data, scanpy’s rank_genes_groups was used to rank genes in each cluster using the Wilcoxon rank-sum test, which compares each cluster to the union of the remaining clusters. This function also uses the Benjamini–Hochberg procedure for multiple testing correction. Gene ontology enrichment was evaluated for the top 100 genes using DAVID (Huang et al., 2009; Sherman et al., 2022), with p values limited to <0.05.
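The Benjamini–Hochberg procedure mentioned above can be written in a few lines. The following is a minimal standard-library sketch of the adjustment itself, not the scran or scanpy code; the function name `benjamini_hochberg` and the example p values are illustrative assumptions.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(pvalues)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up p values for four genes.
raw = [0.01, 0.04, 0.03, 0.005]
fdr = benjamini_hochberg(raw)
print(fdr)
```

Each p value is multiplied by the number of tests divided by its rank, and a running minimum from the largest rank downward keeps the adjusted values monotone.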

RNA velocity

scVelo (v. 0.2.4) was applied to identify transient cellular dynamics in the embryonic stage (Bergen et al., 2020). The ratio of spliced to unspliced transcripts for each gene was calculated with the velocyto command-line tool (velocyto 10x) using default parameters (La Manno et al., 2018).

SCENIC

The SCENIC pipeline was applied to infer gene regulatory networks in our developmental samples. In the first step, GRNBoost2 identified gene co-expression modules potentially regulated by the same TF. cisTarget then used motif enrichment to identify the TFs that directly regulate each co-expression module; the resulting TF-target sets are termed regulons. Finally, AUCell was used to calculate regulon activity scores in each cell. Output from AUCell was then used for downstream analysis using pySCENIC. The Regulon Specificity Score was used to quantify how specific each regulon's activity is to a cluster and to identify cluster-specific regulons.
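The Regulon Specificity Score is usually described as one minus the square root of the Jensen-Shannon divergence between a regulon's activity distribution across cells and a cluster's membership distribution. The sketch below follows that description using only the standard library; it is not the pySCENIC code, the function names are hypothetical, and the base-2 logarithm (which bounds the score between 0 and 1) is an assumption.

```python
import math


def _kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def regulon_specificity_score(activity, in_cluster):
    """Score how concentrated a regulon's activity is in one cluster.

    activity: per-cell regulon activity (for example AUCell scores).
    in_cluster: per-cell booleans, True for cells in the cluster of interest.
    """
    total_activity = sum(activity)
    n_members = sum(1 for flag in in_cluster if flag)
    if total_activity <= 0 or n_members == 0:
        raise ValueError("need positive activity and at least one cluster cell")
    p = [a / total_activity for a in activity]
    q = [1.0 / n_members if flag else 0.0 for flag in in_cluster]
    mix = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    jsd = 0.5 * _kl_divergence(p, mix) + 0.5 * _kl_divergence(q, mix)
    return 1.0 - math.sqrt(max(jsd, 0.0))


# Made-up activities for four cells; the first two cells form the cluster.
activity = [0.4, 0.4, 0.0, 0.0]
rss_on = regulon_specificity_score(activity, [True, True, False, False])
rss_off = regulon_specificity_score(activity, [False, False, True, True])
```

A regulon active only, and evenly, in the cluster's cells scores 1; a regulon active exclusively outside the cluster scores 0.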

Spatial transcriptomics

Spaceranger v.1.3.1 was used to pre-process the Visium slides. After filtering out low-count spots and genes, data were normalized using scanpy’s normalize_total function. Spots were clustered using Leiden clustering, and spatially variable genes were identified using Moran’s I (Moran, 1950). Because each Visium spot contains more than one cell, cell2location (v. 0.1.3) was used to deconvolute the spots (Kleshchevnikov et al., 2022). For cell2location, genes were filtered using the filter_genes() function with the parameters cell_count_cutoff = 5, cell_percentage_cutoff2 = 0.03, and nonz_mean_cutoff = 1.12. The model was then trained using reference cell type signatures from snRNA-seq data, estimated using a negative binomial regression model. The Cell2location() function was then used to map the reference data to spatial spots, with the N_cells_per_location parameter set to 12 and detection_alpha set to 20.
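Moran's I statistic (Moran, 1950) is a standard measure of spatial autocorrelation, and a gene with a high value tends to have similar expression in neighbouring spots. The sketch below computes it from its textbook definition on a toy one-dimensional chain of spots; it is not the implementation used in the study, and the function names, the binary neighbour weights, and the example values are all assumptions.

```python
def morans_i(values, weights):
    """Moran's I for one gene.

    values: expression value per spot.
    weights: dict mapping (i, j) spot index pairs to a spatial weight.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    w_total = sum(weights.values())
    if denom == 0 or w_total == 0:
        raise ValueError("need non-constant values and non-zero weights")
    num = sum(w * dev[i] * dev[j] for (i, j), w in weights.items())
    return (n / w_total) * (num / denom)


def chain_weights(n):
    """Binary symmetric weights linking each spot to its chain neighbours."""
    weights = {}
    for i in range(n - 1):
        weights[(i, i + 1)] = 1.0
        weights[(i + 1, i)] = 1.0
    return weights


w = chain_weights(4)
clustered = morans_i([1, 1, 0, 0], w)    # similar values sit together
alternating = morans_i([1, 0, 1, 0], w)  # neighbours always differ
print(clustered, alternating)
```

Positive values indicate spatial clustering of expression and negative values indicate a checkerboard-like pattern. On real Visium data the weights would come from the spot neighbourhood graph rather than a chain.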

Identification of gene expression patterns in spatial transcriptomics

To identify genes with expression patterns similar to a gene of interest, all spatial spots with expression values greater than 1 for the target gene were first selected. Genes not expressed in at least 50% of these spots were excluded. From the remaining set, any gene expressed in more than 25% of the spots where the target gene had zero expression was also removed. This filtering process yielded a list of genes with expression profiles closely matching that of the gene of interest.
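The filtering steps above can be sketched as a small function. This is a minimal illustration under stated assumptions: "expressed" is taken to mean a value greater than zero, and the function name `genes_matching_pattern`, the dictionary layout, and the gene names in the toy data are hypothetical.

```python
def genes_matching_pattern(expression, target, on_threshold=1.0,
                           min_on_fraction=0.5, max_off_fraction=0.25):
    """Return genes whose spatial pattern resembles that of `target`.

    expression: dict mapping gene name to a list of per-spot values.
    """
    target_values = expression[target]
    # Spots where the target gene exceeds the expression threshold.
    on_spots = [i for i, v in enumerate(target_values) if v > on_threshold]
    # Spots where the target gene is not detected at all.
    off_spots = [i for i, v in enumerate(target_values) if v == 0]
    if not on_spots:
        return []
    matches = []
    for gene, values in expression.items():
        if gene == target:
            continue
        on_fraction = sum(values[i] > 0 for i in on_spots) / len(on_spots)
        if on_fraction < min_on_fraction:
            continue  # not expressed in at least 50% of target-positive spots
        off_fraction = (sum(values[i] > 0 for i in off_spots) / len(off_spots)
                        if off_spots else 0.0)
        if off_fraction > max_off_fraction:
            continue  # expressed in more than 25% of target-negative spots
        matches.append(gene)
    return matches


# Toy data: six spots, placeholder gene names.
expression = {
    "TARGET": [3, 2, 0, 0, 0, 0],
    "GENE_A": [1, 2, 0, 0, 0, 0],
    "GENE_B": [1, 1, 1, 1, 0, 0],
    "GENE_C": [0, 0, 0, 0, 0, 0],
}
similar = genes_matching_pattern(expression, "TARGET")
print(similar)
```

In this toy example only GENE_A survives: GENE_B is too widely expressed outside the target's domain and GENE_C is absent from it.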

Lineage tracing algorithm

A transcriptional signature-based lineage inference method was developed to infer developmental relationships between embryonic populations and their fetal and adult counterparts. For each selected embryonic progenitor population (cluster), we defined a lineage signature by identifying the top 100 differentially expressed genes relative to other embryonic mesenchymal populations. In parallel, for each fetal or adult cell cluster, we identified the top 5000 differentially expressed genes by comparing expression within each cluster to all other clusters at the same (fetal or adult) stage. Lineage relationships were inferred by quantifying the overlap between each embryonic progenitor signature and the corresponding 5000-gene sets from later-stage clusters. Statistical significance of the observed overlaps was assessed using a hypergeometric test, enabling identification of fetal and adult populations significantly enriched for embryonic transcriptional signatures. Clusters exhibiting the strongest and most significant overlaps were interpreted as probable descendants of the corresponding embryonic progenitor populations.
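The overlap test at the heart of this method can be illustrated with an upper-tail hypergeometric probability. The sketch below uses only the standard library; the signature size (100) and later-stage gene set size (5000) come from the text, whereas the function name, the gene universe of 20,000, and the example overlap of 60 genes are hypothetical values chosen for illustration.

```python
from math import comb


def hypergeom_overlap_pvalue(overlap, universe_size, cluster_set_size,
                             signature_size):
    """P(X >= overlap) when drawing `signature_size` genes at random from
    a universe in which `cluster_set_size` genes belong to the cluster set."""
    max_overlap = min(cluster_set_size, signature_size)
    tail = sum(
        comb(cluster_set_size, i)
        * comb(universe_size - cluster_set_size, signature_size - i)
        for i in range(overlap, max_overlap + 1)
    )
    # Exact integer arithmetic; converted to float only at the final division.
    return tail / comb(universe_size, signature_size)


# Hypothetical example: 60 of the 100 signature genes fall in a 5000-gene set
# drawn from an assumed universe of 20,000 genes (25 expected by chance).
p_value = hypergeom_overlap_pvalue(60, 20_000, 5_000, 100)
print(p_value)
```

The choice of gene universe strongly affects the resulting p value, so it should match the set of genes actually tested at both stages.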

All the experiments described (single nuclei RNA-seq and spatial transcriptomics) have been deposited in ArrayExpress; accession numbers are detailed in Supplementary file 1.

Data availability

All data described have been deposited in ArrayExpress with accession numbers listed in Supplementary file 1 and are also available through the Human Cell Atlas at https://cellxgene.cziscience.com/collections/5d2077ea-7b49-45c8-b4cb-64790b698591.

The following data sets were generated
    1. Leshem R
    2. Baker SM
    3. Bobola N
    (2025) ArrayExpress
    ID E-MTAB-13456. A cell atlas of the human outflow tract of the heart and its adult derivatives - CS16-17 Replicate 1.
    1. Leshem R
    2. Baker SM
    3. Bobola N
    (2025) ArrayExpress
    ID E-MTAB-13447. A cell atlas of the human outflow tract of the heart and its adult derivatives.
    1. Leshem R
    2. Baker SM
    3. Bobola N
    (2025) ArrayExpress
    ID E-MTAB-13453. A cell atlas of the human outflow tract of the heart and its adult derivatives - Adult samples.
    1. Leshem R
    2. Baker SM
    3. Bobola N
    (2025) ArrayExpress
    ID E-MTAB-13461. A cell atlas of the human outflow tract of the heart and its adult derivatives - 10X visium sample.

References

  1. Book
    1. Anderson R
    2. Brown NA
    3. Chaudhry B
    4. Henderson DJ
    5. Bamforth SD
    6. Mohun TJ
    7. Moorman AFM
    (2012a) The reappraisal of normal and abnormal cardiac development
    In: Anderson R, editors. Hemodynamics and Cardiology Neonatology Questions and Controversies. Elsevier. pp. 391–414.
    https://doi.org/10.1016/B978-1-4377-2763-0.00019-6
    1. Neeb Z
    2. Lajiness JD
    3. Bolanis E
    4. Conway SJ
    (2013) Cardiac outflow tract anomalies
    Wiley Interdisciplinary Reviews. Developmental Biology 2:499–530.
    https://doi.org/10.1002/wdev.98
    1. Yuan BZ
    2. Miller MJ
    3. Keck CL
    4. Zimonjic DB
    5. Thorgeirsson SS
    6. Popescu NC
    (1998)
    Cloning, characterization, and chromosomal localization of a gene frequently deleted in human liver cancer (DLC-1) homologous to rat RhoGAP
    Cancer Research 58:2196–2199.

Article and author information

Author details

  1. Rotem Leshem

    Faculty of Biology, Medicine and Health, University of Manchester, Manchester, United Kingdom
    Contribution
    Conceptualization, Data curation, Formal analysis, Investigation, Writing – review and editing
    Contributed equally with
    Syed Murtuza-Baker
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-8916-9843
  2. Syed Murtuza-Baker

    Faculty of Biology, Medicine and Health, University of Manchester, Manchester, United Kingdom
    Contribution
    Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Writing – review and editing
    Contributed equally with
    Rotem Leshem
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-6633-333X
  3. Joshua Mallen

    Faculty of Biology, Medicine and Health, University of Manchester, Manchester, United Kingdom
    Contribution
    Data curation, Investigation, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID: 0009-0003-7617-2028
  4. Lu Wang

    Translational and Clinical Research Institute, Newcastle University, Newcastle, United Kingdom
    Contribution
    Resources, Investigation
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-4418-8602
  5. John Dark

    Translational and Clinical Research Institute, Newcastle University, Newcastle, United Kingdom
    Contribution
    Resources, Investigation
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-4727-6085
  6. Andrew D Sharrocks

    Faculty of Biology, Medicine and Health, University of Manchester, Manchester, United Kingdom
    Contribution
    Conceptualization, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0001-7395-9552
  7. Karen Piper Hanley

    Faculty of Biology, Medicine and Health, University of Manchester, Manchester, United Kingdom
    Contribution
    Conceptualization, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0001-9473-9647
  8. Neil Hanley

    1. Faculty of Biology, Medicine and Health, University of Manchester, Manchester, United Kingdom
    2. College of Medicine & Health, University of Birmingham, Edgbaston, United Kingdom
    Contribution
    Conceptualization, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0003-3234-4038
  9. Magnus Rattray

    Faculty of Biology, Medicine and Health, University of Manchester, Manchester, United Kingdom
    Contribution
    Conceptualization, Investigation, Methodology, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0001-8196-5565
  10. Simon D Bamforth

    Newcastle University Biosciences Institute, Faculty of Medical Sciences, Newcastle, United Kingdom
    Contribution
    Conceptualization, Data curation, Funding acquisition, Investigation, Project administration, Writing – review and editing
    For correspondence
    simon.bamforth@newcastle.ac.uk
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-5666-4485
  11. Nicoletta Bobola

    Faculty of Biology, Medicine and Health, University of Manchester, Manchester, United Kingdom
    Contribution
    Conceptualization, Supervision, Funding acquisition, Investigation, Writing – original draft, Project administration, Writing – review and editing
    For correspondence
    nicoletta.bobola@manchester.ac.uk
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-7103-4932

Funding

Medical Research Council (MR/S03613X/1)

  • Simon D Bamforth
  • Nicoletta Bobola

British Heart Foundation (SP/18/12/34300)

  • Simon D Bamforth
  • Nicoletta Bobola

National Institute for Health and Care Research (NIHR203332)

  • John Dark

The funders had no role in study design, data collection, and interpretation, or the decision to submit the work for publication. For the purpose of Open Access, the authors have applied a CC BY public copyright license to any Author Accepted Manuscript version arising from this submission.

Acknowledgements

We thank Andy Hayes and the other members of the Genomic Technologies Core Facility, Roger Meadows of the Bioimaging facility, and Gareth Howell of the Flow Sorting Core Facility at the University of Manchester. We also thank Zoulfia Darieva, Peyman Zarrineh, Rachel Jennings, and Aoibheann Mullan for help and discussions. A special thanks to Jasmin Turner for help with cryosectioning and histology. This work was supported by joint funding from the Medical Research Council (MRC) (http://www.mrc.ukri.org/) grant MR/S03613X/1 and British Heart Foundation (BHF) grant SP/18/12/34300 to NB and SDB. The Human Developmental Biology Resource (https://www.hdbr.org/) is funded jointly by the Medical Research Council and the Wellcome Trust (MR/R006237/1). Adult tissue acquisition was funded by the National Institute for Health and Care Research (NIHR) Blood and Transplant Research Unit in Organ Donation and Transplantation (NIHR203332), a partnership between NHS Blood and Transplant, University of Cambridge, and Newcastle University.

Ethics

Adult aortic valve tissue was collected from adult hearts declined for clinical transplantation with written informed consent for research from their families. The research ethical approval (REC ref 16/NE/0230) was provided by the North East – Newcastle & North Tyneside Research Ethics Committee. Human embryonic (Carnegie stage 16–17) and fetal (12 post-conception weeks) outflow tract samples were obtained from the Human Developmental Biology Resource (HDBR; https://www.hdbr.org/), a UK Human Tissue Authority-licensed research tissue bank. All tissue within the HDBR is collected following elective pregnancy termination with appropriate maternal written informed consent for research use, including consent for use in future research studies. Donor information is anonymized prior to distribution to researchers. The HDBR operates under a generic ethical approval from the North East – Newcastle & North Tyneside 1 Research Ethics Committee (REC reference 23/NE/0135; IRAS project ID 330783), which provides a favorable ethical opinion for the tissue bank and for research projects using material supplied by it, provided that their use complies with the conditions of the approval.

Version history

  1. Preprint posted:
  2. Sent for peer review:
  3. Reviewed Preprint version 1:
  4. Reviewed Preprint version 2:
  5. Version of Record published:

Cite all versions

You can cite all versions using the DOI https://doi.org/10.7554/eLife.107748. This DOI represents all versions, and will always resolve to the latest one.

Copyright

© 2025, Leshem, Murtuza-Baker et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

  1. Rotem Leshem
  2. Syed Murtuza-Baker
  3. Joshua Mallen
  4. Lu Wang
  5. John Dark
  6. Andrew D Sharrocks
  7. Karen Piper Hanley
  8. Neil Hanley
  9. Magnus Rattray
  10. Simon D Bamforth
  11. Nicoletta Bobola
(2026)
A cell atlas of the developing human outflow tract of the heart and its adult aortic valve derivatives
eLife 14:RP107748.
https://doi.org/10.7554/eLife.107748.3
