Modulation and quantification of gene dosage using CRISPR and targeted multimodal single-cell sequencing.

A. Co-expression network representation of the 92 selected genes under study. Genes (nodes) are connected by edges when their co-expression across single cells was above 0.5 (data from Morris et al. 2023). Highlighted in colour are the two control genes with constant high (GAPDH) and low (LHX3) expression, as well as the cis genes whose dosage was modulated with CRISPRi/a.
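The edge rule in panel A can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes that co-expression is measured as a Pearson correlation across cells (the legend does not name the measure), and the gene names and expression values are hypothetical placeholders.

```python
import math
from itertools import combinations

def pearson(u, v):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    sd_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    sd_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    return cov / (sd_u * sd_v)  # undefined (division by zero) for constant vectors

def coexpression_edges(expression, threshold=0.5):
    """Return (gene1, gene2, r) for every gene pair with r above the threshold."""
    edges = []
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        if r > threshold:
            edges.append((g1, g2, r))
    return edges

# Hypothetical per-cell expression values for three placeholder genes.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [1.1, 2.0, 2.9, 4.2, 5.1],
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],
}
edges = coexpression_edges(expression, threshold=0.5)
edge_pairs = [(g1, g2) for g1, g2, _ in edges]
```

Only the strongly correlated pair (geneA, geneB) passes the 0.5 cut-off in this toy input; the anti-correlated pairs are left unconnected.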

B. Design of the multimodal single-cell experiment (HTO = hash-tag oligos).

C. Distribution of the GFI1B (left) or NFE2 (right) normalised expression across single cells for different classes of sgRNAs (NTC = Non-targeting controls, TSS = transcription start site).

D. Resulting relative expression change (log2 fold change) of the 4 cis genes upon each unique CRISPR perturbation when grouped across different classes of sgRNAs.
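As a rough sketch of the quantity plotted here, the log2 fold change of a cis gene for one sgRNA can be computed against cells carrying non-targeting controls. The function name, the optional pseudocount and the input values are assumptions made for illustration; the legend does not state how expression was averaged or normalised before taking the ratio.

```python
import math
from statistics import mean

def log2_fold_change(perturbed, control, pseudocount=0.0):
    """log2 ratio of mean expression in perturbed cells over NTC cells.

    `pseudocount` is a hypothetical safeguard against zero means;
    it is not taken from the source.
    """
    return math.log2((mean(perturbed) + pseudocount) / (mean(control) + pseudocount))

# Hypothetical normalised expression values of one cis gene.
ntc_cells = [1.0, 1.0, 1.0, 1.0]
crispri_cells = [0.5, 0.5, 0.5, 0.5]   # repression halves expression
crispra_cells = [4.0, 4.0, 4.0, 4.0]   # activation quadruples expression

lfc_i = log2_fold_change(crispri_cells, ntc_cells)
lfc_a = log2_fold_change(crispra_cells, ntc_cells)
```

With these inputs, halving expression gives a log2FC of -1 and a four-fold increase gives +2, which is the scale used on the axes of panels D and E.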

E. Distribution of cis gene log2FC across all sgRNA perturbations.

Cis determinants of dosage.

A. Comparison of the relative expression change (log2FC) from the same sgRNA between the two different CRISPR modalities.

B. Relative expression change of the targeted cis gene based on distance from the transcription start site (TSS). The top plot excludes attenuated and NTC sgRNAs, while the bottom plot also excludes enhancer sgRNAs.

C. Number of sgRNAs that overlap with the different epigenetic or open chromatin peaks.

D. Expression change relative to NTC sgRNAs (log2FC) of all cis genes, depending on whether or not their sgRNAs fall within the different epigenetic or open chromatin peaks. P-values result from Wilcoxon rank-sum tests, with nominally significant p-values shown in black.
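For readers unfamiliar with the test used in panel D, a two-sided Wilcoxon rank-sum comparison can be sketched with the standard library alone. This version uses the large-sample normal approximation without tie or continuity correction, which is an assumption; the legend does not say which variant was run, and the log2FC values below are hypothetical.

```python
import math

def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns (z, p_value). Tied values receive their average rank.
    """
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    n1, n2 = len(x), len(y)
    w = sum(r for r, (_, group) in zip(ranks, combined) if group == 0)
    expected = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - expected) / sd
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical log2FC values for sgRNAs inside vs. outside a peak.
in_peak = [-1.2, -0.9, -1.5, -0.8, -1.1]
outside_peak = [-0.1, 0.0, -0.3, 0.2, -0.2]
z, p_value = rank_sum_test(in_peak, outside_peak)
```

A negative z here indicates that sgRNAs inside the peak tend to produce lower log2FC values than those outside it.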

Trans responses of transcription factor dosage modulation

A. Average absolute expression change of all trans genes relative to the changes in expression of the cis genes.

B. Changes in relative expression of all trans genes (bottom heatmap) in response to GFI1B expression changes (top barplot) upon each distinct targeted sgRNA perturbation. The rows of the heatmap (trans genes) are hierarchically clustered based on their expression fold change linked to alterations in GFI1B dosage.

C. Dosage response curves of the trans genes highlighted in B as a function of changes in GFI1B expression. The orange line represents the sigmoid model fit, except for GATA2, which displays a non-monotonic response and is fitted with a loess curve.

D. Illustration of the linear and sigmoid models and equations used to fit the dosage response curves.

E. Distribution of the difference in Akaike Information Criterion (ΔAIClinear-sigmoid) after fitting the sigmoidal or linear model for each trans gene upon GFI1B dosage modulation (top panel), and the direct comparison of the AIC of each fit (bottom panel).
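The model comparison in panel E can be illustrated with a small sketch. It assumes a four-parameter logistic as the sigmoid (the actual equations are shown in panel D and are not reproduced in this legend), uses the least-squares form of the AIC, and takes the sigmoid parameters as given rather than estimating them; in practice they would come from a non-linear fit. All data are synthetic.

```python
import math

def fit_linear(x, y):
    """Ordinary least-squares slope and intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def sigmoid(x, lower, upper, midpoint, steepness):
    """Assumed four-parameter logistic; not necessarily the authors' form."""
    return lower + (upper - lower) / (1 + math.exp(-steepness * (x - midpoint)))

def aic(residuals, n_params):
    """AIC for Gaussian errors, up to an additive constant: n*ln(RSS/n) + 2k."""
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    return n * math.log(rss / n) + 2 * n_params

# Synthetic dosage response: a saturating curve plus small alternating noise.
params = (-1.0, 1.0, 0.0, 6.0)
cis_log2fc = [i / 5 - 1 for i in range(11)]
noise = [0.01 * (-1) ** i for i in range(11)]
trans_log2fc = [sigmoid(x, *params) + e for x, e in zip(cis_log2fc, noise)]

slope, intercept = fit_linear(cis_log2fc, trans_log2fc)
linear_res = [y - (slope * x + intercept) for x, y in zip(cis_log2fc, trans_log2fc)]
sigmoid_res = [y - sigmoid(x, *params) for x, y in zip(cis_log2fc, trans_log2fc)]

# Positive values favour the sigmoid model, as in the legend's ΔAIC convention.
delta_aic = aic(linear_res, 2) - aic(sigmoid_res, 4)
```

Because the sigmoid is charged for its two extra parameters, a positive difference means the improvement in fit outweighs the added complexity.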

Relationship between gene and dosage response properties

A. Predicted changes (using sigmoid or loess fits for monotonic and non-monotonic responses, respectively) in relative expression of all trans genes in response to changes in GFI1B, MYB and NFE2 expression. Trans genes (rows) were hierarchically clustered based on their expression fold change linked to alterations in the dosage of all three TFs. The dendrogram of the resulting clustering is shown on the left.

B. Heatmap highlighting the qualitative gene features of each trans gene. The x axis indicates the gene property, and the top subtitles indicate the source of the data. Grey indicates missing data.

C. Heatmap indicating the z-scaled quantitative gene features of each trans gene. The x axis indicates the gene property, and the top subtitles indicate the source of the data. Grey indicates missing data.

D. Difference in the average value of the sigmoid parameter indicated on the right between the genes classified into the no/yes categories of the gene properties shown in B.

E. Pearson correlation coefficient between the quantitative trans gene features (shown in C) and the sigmoid parameter value of each trans gene in response to dosage modulation of the TF indicated on the left. Point size is inversely related to the significance of the correlation, and colour indicates the direction of the correlation.

F. Differences in the range of expression response between housekeeping and non-housekeeping trans genes upon changes in the dosage of MYB, GFI1B and NFE2.

G. Negative correlation between the haploinsufficiency score (pHaplo) and the range of the response of trans genes to MYB dosage modulation.

Non-linearities in TF dosage responses of complex traits and disease genes

A. Heatmap illustrating the correlation between the mean expression of cell types and the changes in expression linked to individual TF dosage perturbations. The barplot in the top panel represents the cis gene dosage perturbation. Asterisks (*) denote correlations significant at 10% FDR.

B. Enrichment log(odds) ratio of non-linear TF dosage responses (ΔAIClinear-sigmoid > 0) in disease-related genes (OMIM genes linked to 1 or more diseases, top panel) or in genes associated with GWAS blood traits (closest expressed gene to the lead GWAS variant, bottom panel). Log(odds) ratios significant by Fisher’s exact test at FDR < 0.05 are highlighted in blue.
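The enrichment statistic in panel B can be sketched from a 2×2 table of gene counts (non-linear vs. linear response, in vs. not in the gene set). The counts below are hypothetical, the natural logarithm is an assumption (the legend does not give the base), and the multiple-testing correction behind the FDR threshold is not reproduced here.

```python
import math

def log_odds_ratio(a, b, c, d):
    """Natural-log odds ratio of the table [[a, b], [c, d]]; needs non-zero b and c."""
    return math.log((a * d) / (b * c))

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value from the hypergeometric distribution.

    Sums the probability of every table with the same margins that is
    no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = math.comb(row1 + row2, col1)

    def prob(k):
        return math.comb(row1, k) * math.comb(row2, col1 - k) / total_tables

    p_observed = prob(a)
    k_min, k_max = max(0, col1 - row2), min(row1, col1)
    p_value = sum(
        prob(k) for k in range(k_min, k_max + 1)
        if prob(k) <= p_observed * (1 + 1e-9)  # tolerance for float ties
    )
    return min(1.0, p_value)

# Hypothetical counts: rows = non-linear / linear response,
# columns = in gene set / not in gene set.
a, b, c, d = 8, 2, 1, 5
log_or = log_odds_ratio(a, b, c, d)
p_value = fisher_exact_two_sided(a, b, c, d)
```

A positive log odds ratio indicates that genes in the set are over-represented among non-linear responders in this toy table.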

C. Examples of TF dosage response curves of genes both associated with disease (OMIM) and complex traits (Blood GWAS).