Analysis of single cell RNA-seq (scRNA-seq) data for BMSC-OBs derived from 80 Diversity Outbred (DO) (A)

Uniform Manifold Approximation and Projection (UMAP) of 139,392 single cells (BMSC-OBs). Cell numbers and corresponding percentages for the fifteen (15) annotated cell clusters are listed in parenthesis to the right of the annotated cluster name. (B) Dot plot63 portraying representative and highly expressed genes for all annotated cell clusters. Dot color indicates the scaled average gene expression while the size of the dot corresponds to the percentage of cells of a given cluster that express a given gene. (C) The raw proportional abundances of seven (7) mesenchymal cell clusters and one (1) cluster (Hem) representing the remain cells (i.e., hematopoietic immune cells) across all 80 DO mice. (D) UMAP plots for mesenchymal lineage cell clusters for DO mouse 50 and DO mouse 233. (E) CELLECT (CELL-type Expression-specific integration for Complex Traits) cell type prioritization results. The seven (7) mesenchymal cell clusters were significantly enriched (P < 0.05) for BMD heritability and the MALP, Ocy, and OB1 cell clusters were significantly enriched after correcting P-value for multiple testing. All remaining cell clusters (i.e., hematopoietic immune cells) were not significant (P > 0.05)

Overview of the network analysis pipeline

Step 1: For all seven (7) of the mesenchymal lineage cell clusters (MPC, LMP, OBP, OB1, OB2, Ocy, MALP), cell type-specific co-expression modules were generated using iterative Weighted Gene Co-expression Network Analysis (iterativeWGCNA). Step 2: Bayesian networks were learned to generate directed networks and model causal interactions between co-expressed genes. Step 3: Differentiation Driver Genes (DDGs) were identified by extracting subnetworks (i.e., large 3-step neighborhood) for each gene in each cell type-specific Bayesian network and highlighting those subnetworks that were enriched (Padj < 0.05) for trajectory-specific tradeSeq genes for the cell type boundary. Step 4: DDGs (and associated networks) were prioritized if the DDG was identified previously as an expression/splicing quantitative trait loci (eQTL/sQTL) that colocalized with BMD GWAS associations. Created with Biorender.com.

Pseudotime Trajectory Inference analysis and establishment of cell type boundaries for tradeSeq analysis (A)

Three (3) trajectories (two adipogenic, one adipogenic) were inferred from the mesenchymal cell clusters of the BMSC-OB scRNA-seq data using Slingshot. All trajectories originate from the MPC and end in either osteogenic (Ocy, OB2) or adipogenic (MALP) cell fates. (B) For each of the trajectories, cell type boundaries were generated using pseudotime values along the trajectories, which encompass the majority of cells of a cell type mapping to their respective trajectory. (C) Normalized gene expression of select genes associated with each cluster are represented in feature plots (top) and each gene plotted as a function of pseudotime (bottom) for all pseudotime trajectories (color corresponds to cell type annotation observed throughout). Vertical lines (red) represent the cell type (pseudotime) boundaries established for each cell type (label). The cell type boundary for OB1 and OB2 are represented as one red line/label for visualization purposes.

TradeSeq-identified genes associated with BMSC-OB differentiation exhibit eQTL effects. (A)

Pkm was identified as a significant global tradeSeq-identified gene (Padj = 8.35 x 10-8) for LMP cells along an osteogenic trajectory (LMP_to_OBP) (left). S100a1 was identified as a significant global tradeSeq-identified gene (Padj = 0.023) for OBP cells along an osteogenic trajectory 1 (OBP_to_OB1) (right). (B) Plots indicating the cell type-specific expression quantitative trait loci (eQTL) signal for both Pkm and S100a1. A negative eQTL effect on Pkm expression was observed in LMPs for Diversity Outbred (DO) mice with a PWK haplotype background at the Pkm locus (left). A positive eQTL effect on the expression of S100a1 was observed in OBPs for DO mice with a 129 haplotype background at the S100a1 locus, while a negative effect was observed for NZO mice (right). (C) The expression of Pkm and S100a1 based on DO mouse (expression values transformed via variance stabilizing transformation (VST), as described in Methods). Genotype abbreviations correspond to DO haplotype background (legend) at the respective gene locus. Mice with at least one PWK allele (genotype abbreviation G) tend to have decreased expression of Pkm (left). Mice with at least one 129 allele (genotype abbreviation C) tend to have increased expression of S100a1, while NZO mice (genotype abbreviation E) have decreased expression (right). (D) PWK mice had a significant reduction in mature osteoblasts (OB1) and osteocyte-like cells (Ocy) proportions relative to other mice (P = 0.030 and P = 0.026, respectively; t-test), while LMP proportions were unaffected. Asterisks represent any of the other haplotype backgrounds. 129 mice showed a significant decrease in LMP proportion and increase in OB1 proportion (P = 0.008 and P = 0.016, respectively; t-test), but OBP proportions were unaffected. No significant effects on cell type proportions were observed in NZO mice (Supplementary Fig. 2).

Prioritized Differentiation Driver Genes (DDGs) that have BMD GWAS associations that colocalize with splicing/expression QTL (eQTL/sQTL) identified in a Genotype-Tissue Expression project (GTEx) tissue.

The tissue with the most significant colocalization (RCP and/or H4PP) is listed for each DGG (26 total, 21 distinct), as determined from Al-Barghouthi et al. (2022) and Abood et al. (2023) for eQTL and sQTL, respectively7,18. RCP = Regional Colocalization Probability (GWAS and eQTL colocalization). H4PP = H4 Posterior Probability (GWAS and sQTL colocalization).

Fgfrl1 and Tpx2 are prioritized DDGs and putative drivers of mesenchymal differentiation. (A)

Fgfrl1 was identified as a Differentiation Driver Gene (DDG) of a network for LMPs differentiating along an adipogenic trajectory. The network is enriched (Padj = 7.5 x 10-3) for trajectory-specific tradeSeq-identified genes for the LMP_to_MALP cell type boundary (Hnmt, St5, Igfbp4, Cyp1b1, Pdzrn4, Mark1). Fgfrl1 was previous identified as a gene that has BMD GWAS associations that colocalize with an eQTL in the cultured fibroblast GTEx tissue. (B) Tpx2 was identified as a DDG of a network for LMPs differentiating along an osteogenic trajectory. The network is enriched (Padj = 5.7 x 10-7) for tradeSeq-identified genes for the LMP_to_OBP cell type boundary (Tpx2, Top2a, Kif4, Iqgap3, Prc1, Kif11, Ect2, Sgo2a, Ube2c). Tpx2 is both a tradeSeq gene and previous identified as a gene that has BMD GWAS associations that colocalize with an eQTL in the Testis GTEx tissue. (C-D) An increase in the expression of all tradeSeq-identified genes coincides with the cell type boundary in which they were identified as significant along their respective trajectories. (E) Box plot displaying whole-body bone mineral density (BMD) measurements (excluding skull) from the International Mouse Knockout Consortium (IMPC) for Tpx2 mutant mice, which exhibited a significant increase in BMD (Genotype P-value = 1.03 x 10-3) in both male and female mice (N = 8 (M) and 8 (F) mutants; N = 2574 (M) and 2633 (F) controls)