Nonlinear transcriptional responses to gradual modulation of transcription factor dosage

  1. Júlia Domingo (corresponding author)
  2. Mariia Minaeva
  3. John A Morris
  4. Samuel Ghatan
  5. Marcello Ziosi
  6. Neville E Sanjana
  7. Tuuli Lappalainen (corresponding author)
  1. New York Genome Center, United States
  2. Science for Life Laboratory, Department of Gene Technology, KTH Royal Institute of Technology, Sweden
  3. Department of Biology, New York University, United States
5 figures, 1 table and 1 additional file

Figures

Figure 1 with 3 supplements
Modulation and quantification of gene dosage using CRISPR and targeted multimodal single-cell sequencing.

(A) Co-expression network representation of the 92 selected genes under study. Genes (nodes) are connected by edges when their co-expression across single cells was above 0.5 (data from Morris et al., 2023). Highlighted in colour are the two control genes with constant high (GAPDH) and constant low (LHX3) expression, as well as the cis genes whose dosage was modulated with CRISPRi/a. (B) Design of the multimodal single-cell experiment (HTO = hashtag oligos). (C) Distribution of normalized GFI1B (left) or NFE2 (right) expression across single cells for different classes of sgRNAs (NTC = non-targeting controls, TSS = transcription start site). (D) Relative expression change (log2 fold change, log2FC) of the four cis genes upon each unique CRISPR perturbation, grouped by sgRNA class. (E) Distribution of cis gene log2FC across all sgRNA perturbations.
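As a rough illustration of the log2 fold change reported in panels D and E, the sketch below compares the mean normalized expression of a cis gene in perturbed cells with that in NTC cells. The function name, the optional pseudocount, and the expression values are assumptions made for this example; the caption does not describe the authors' exact computation.

```python
import math
from statistics import mean

def log2_fold_change(perturbed, control, pseudocount=0.0):
    """log2 ratio of mean expression in perturbed cells over control (NTC) cells.

    `pseudocount` is a hypothetical guard against zero means, not a documented
    parameter of the original analysis.
    """
    return math.log2((mean(perturbed) + pseudocount) / (mean(control) + pseudocount))

# Made-up normalized expression values for one cis gene.
ntc_cells = [4.0, 4.0, 4.0, 4.0]        # non-targeting control cells
crispri_cells = [1.0, 3.0, 2.0, 2.0]    # cells carrying a repressing sgRNA
crispra_cells = [8.0, 8.0, 8.0, 8.0]    # cells carrying an activating sgRNA

lfc_i = log2_fold_change(crispri_cells, ntc_cells)  # expression halved
lfc_a = log2_fold_change(crispra_cells, ntc_cells)  # expression doubled
print(lfc_i, lfc_a)
```

A negative value indicates repression relative to NTC cells and a positive value indicates activation, which matches how CRISPRi and CRISPRa perturbations are laid out on a shared axis.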

Figure 1—figure supplement 1
Experimental design and data processing from UMIs to expression fold change, related to Figure 1 and STAR methods.

(A) Co-expression matrix of the 76 selected GFI1B trans genes based on K562 data from Maurano et al., 2012. Three clusters from the selected targeted panel show a co-expression architecture similar to that of the original clusters identified using the entire GFI1B trans-network (original clusters A in blue, B in green and C in red). (B) Same as A for the 39 NFE2 trans genes (original clusters A in green, B in orange, C in blue, and D in red). (C) Correlation of total UMI counts per gene between 10x chip lanes. Targeted panel genes are shown in orange and highlighted names correspond to dosage genes (NFE2, MYB, GFI1B, and TET2) and low/high expression controls (LHX3 and GAPDH). (D) The number of singlet cells carrying each sgRNA in the two different CRISPR cell lines. NTC = non-targeting controls. (E) Q-Q plots from Sceptre calibration test. (F) Distribution of normalized UMI expression of the cis gene labelled on top for cells with single guide RNAs (sgRNAs) targeting their transcription start site (TSS) or harbouring non-targeting control (NTC) sgRNAs.

Figure 1—figure supplement 2
Biochemical and activity properties of different types of single guide RNAs (sgRNAs).

(A) Relationship between off-target and on-target activity of sgRNAs and the change in expression of their target cis gene. (B) The relationship between the number of cells carrying each sgRNA perturbation and the absolute fold change of the cis gene (top) or the number of differentially expressed trans genes due to the cis gene perturbation (bottom). (C) The relationship between the location of the mismatch mutation of attenuated sgRNAs (position 1 being farthest from the protospacer adjacent motif, PAM) and their effect on cis gene expression.

Figure 1—figure supplement 3
Gradual effects of the single guide RNAs (sgRNAs).

(A) Distribution of the normalized cis gene UMIs in single cells, grouped by their unique sgRNAs, ranked top to bottom by mean normalized expression. Transparent distributions correspond to non-targeting controls. (B) Distribution of the correlation in trans gene expression fold changes when splitting the same sgRNA cells into 0 UMI or >0 UMI for the cis gene (top panel). Comparison of the strength of these correlations with the effect of that sgRNA on the cis gene (bottom panel). The size of dots indicates the difference in the size of the 0 UMI or >0 UMI cell groups. (C) UMAPs of the cells with GFI1B, MYB, and NFE2 guides together with non-targeting guides. The left and right clusters in each figure represent CRISPRa and CRISPRi cells, respectively. The cells are coloured by the median fold change associated with their sgRNA.

Figure 2
Cis determinants of dosage.

(A) Comparison of the relative expression change (log2FC) from the same single guide RNA (sgRNA) between the two different CRISPR modalities. Vertical and horizontal bars represent CRISPRa and CRISPRi standard errors, respectively. (B) Relative expression change of the targeted cis gene based on distance from transcription start site (TSS). The top plot excludes attenuated and non-targeting control (NTC) sgRNAs, while the bottom plot also excludes enhancer sgRNAs. (C) Number of sgRNAs that overlap with the different epigenetic or open chromatin peaks. (D) Expression change relative to NTC sgRNAs (log2FC) of all cis genes for sgRNAs that do or do not fall within the different epigenetic or open chromatin peaks. P-values are from Wilcoxon rank-sum tests, with nominally significant p-values shown in black.

Figure 3 with 7 supplements
Trans responses of transcription factor dosage modulation.

(A) Average absolute expression change of all trans genes relative to the changes in expression of the cis genes. (B) Changes in relative expression of all trans genes (bottom heatmap) in response to GFI1B expression changes (top barplot) upon each distinct targeted single guide RNA (sgRNA) perturbation, in comparison to non-targeting control (NTC) cells. The rows of the heatmap (trans genes) are hierarchically clustered based on their expression fold change linked to alterations in GFI1B dosage. Highlighted rows are selected dosage response examples shown in C. (C) Dosage response curves of the trans genes highlighted in B as a function of changes in GFI1B expression. The orange line represents the sigmoid model fit, except for GATA2, which displays a non-monotonic response and is fitted with a loess curve. (D) Illustration of the linear and sigmoid models and equations used to fit the dosage response curves. (E) Distribution of the difference in Akaike Information Criterion (ΔAIClinear-sigmoid) after fitting the sigmoidal or linear model for each trans gene upon GFI1B dosage modulation (top panel), and the direct comparison of the Akaike Information Criterion (AIC) of each fit (bottom panel).
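The model comparison in panels D and E can be sketched as follows: fit a straight line and a four-parameter sigmoid to the same dosage response, then take the AIC of the linear fit minus the AIC of the sigmoid fit. This is a standard-library stand-in, not the authors' pipeline. The grid search over `rate` and `x_if`, the helper names, and the synthetic data are all assumptions made for illustration, and the AIC is computed only up to an additive constant that cancels in the difference.

```python
import math

def sigmoid4(x, min_asmp, max_asmp, rate, x_if):
    """Four-parameter sigmoid; its slope at x_if is (max_asmp - min_asmp) * rate / 4."""
    return min_asmp + (max_asmp - min_asmp) / (1.0 + math.exp(-rate * (x - x_if)))

def ols(u, y):
    """Closed-form least squares for y ~ a + b*u; returns (a, b, rss)."""
    n = len(u)
    mu, my = sum(u) / n, sum(y) / n
    suu = sum((ui - mu) ** 2 for ui in u)
    b = sum((ui - mu) * (yi - my) for ui, yi in zip(u, y)) / suu if suu > 0 else 0.0
    a = my - b * mu
    rss = sum((yi - (a + b * ui)) ** 2 for ui, yi in zip(u, y))
    return a, b, rss

def aic(rss, n, n_mean_params):
    """Gaussian AIC up to a constant shared by all models fitted to the same data."""
    return n * math.log(max(rss, 1e-300) / n) + 2 * (n_mean_params + 1)

def fit_sigmoid(x, y, rates, midpoints):
    """Grid search over (rate, x_if); both asymptotes come from least squares."""
    best = None
    for rate in rates:
        for x_if in midpoints:
            u = [1.0 / (1.0 + math.exp(-rate * (xi - x_if))) for xi in x]
            a, b, rss = ols(u, y)
            if best is None or rss < best["rss"]:
                best = {"rss": rss,
                        "min_asmp": min(a, a + b), "max_asmp": max(a, a + b),
                        "rate": rate if b >= 0 else -rate, "x_if": x_if}
    return best

# Synthetic (made-up) response: cis gene log2FC on x, trans gene log2FC on y.
x = [-2 + 0.25 * i for i in range(17)]
y = [sigmoid4(xi, -1.0, 1.0, 3.0, 0.0) + 0.01 * (-1) ** i for i, xi in enumerate(x)]

_, _, rss_linear = ols(x, y)
sig = fit_sigmoid(x, y, rates=[0.5, 1, 2, 3, 4, 6], midpoints=[-1, -0.5, 0, 0.5, 1])

n = len(x)
delta_aic = aic(rss_linear, n, 2) - aic(sig["rss"], n, 4)  # linear minus sigmoid
print(f"delta AIC (linear - sigmoid) = {delta_aic:.1f}")
```

A positive difference favours the sigmoid despite its two extra parameters. In practice a nonlinear least-squares routine would replace the coarse grid; the grid is used here only to keep the example free of dependencies.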

Figure 3—figure supplement 1
Global view of trans effects and their replication.

(A) Principal component analysis (PCA) of mean UMI normalized expression (not relative to each cell line of origin) for all genes across unique single guide RNA (sgRNA) perturbations. (B) Same as A but using relative expression fold-change when normalising by the CRISPR cell line of origin. (C) Replication of trans-effects of CRISPRi of CREs for GFI1B and NFE2, targeted both in this study (x-axis) and in Morris et al., 2023 (y-axis). GFI1B CRE 1 and NFE2 CRE 1 were targeted in Morris et al. data batches V1 and V2, and the effects are shown here for both separately. (D) Replication of trans-effects of transcription start site (TSS) silencing between this study and Replogle et al., 2022. Only guides from this study that target TSSs were analysed; these do not exactly match the guides used in Replogle et al. The effect size in Replogle et al. is quantified using their metric of Wilcox mean difference. The dashed line represents a linear regression fit between the x and y variables. (E) Number of differentially expressed trans genes relative to the cis gene dosage perturbation.

Figure 3—figure supplement 2
Trans gene responses to GFI1B dosage modulation.

(A) Changes in relative expression of all trans genes (heatmap) in response to GFI1B expression (top barplot) upon each distinct targeted single guide RNA (sgRNA) perturbation. The rows of the heatmap (trans genes) are hierarchically clustered based on their expression fold change linked to alterations in GFI1B dosage. (B) Dosage response curves are plotted for each trans gene against changes in GFI1B expression. The orange line represents the sigmoid model fit, and the blue line represents a loess curve.

Figure 3—figure supplement 3
Trans gene responses to MYB dosage modulation.

(A) Changes in relative expression of all trans genes (bottom heatmap) in response to MYB expression (top barplot) upon each distinct targeted MYB single guide RNA (sgRNA) perturbation. The rows of the heatmap (trans genes) are hierarchically clustered based on their expression fold change linked to alterations in MYB dosage. (B) Dosage response curves are plotted for each trans gene against changes in MYB expression. The orange line represents the sigmoid model fit.

Figure 3—figure supplement 4
Trans gene responses to NFE2 dosage modulation.

(A) Changes in relative expression of all trans genes (bottom heatmap) in response to NFE2 expression (top barplot) upon each distinct targeted NFE2 sgRNA perturbation. The rows of the heatmap (trans genes) are hierarchically clustered based on their expression fold change linked to alterations in NFE2 dosage. (B) Dosage response curves are plotted for each trans gene against changes in NFE2 expression. The orange line represents the sigmoid model fit.

Figure 3—figure supplement 5
Trans gene responses to TET2 dosage modulation.

(A) Changes in relative expression of all trans genes (bottom heatmap) in response to TET2 expression (top barplot) upon each distinct targeted TET2 single guide RNA (sgRNA) perturbation. The rows of the heatmap (trans genes) are hierarchically clustered based on their expression fold change linked to alterations in TET2 dosage. (B) Dosage response curves are plotted for each trans gene against changes in TET2 expression. The orange line represents the sigmoid model fit.

Figure 3—figure supplement 6
Dosage response linear and non-linear model fitting.

(A) Distribution of the difference in Akaike Information Criterion (ΔAIClinear-sigmoid) after fitting the sigmoidal or linear model for each trans gene based on the gradual expression perturbations of the four cis genes (top panel), and the direct comparison of the Akaike Information Criterion (AIC) of each fit (bottom panel). Red lines indicate median ΔAIC. (B) Same as A but only fitting the models on those single guide RNA (sgRNA) perturbations that lead to a cis gene dosage change bounded between log2(1/2) and log2(3/2). (C) Agreement between observed and predicted trans gene expression fold changes upon cis gene dosage modulation across a 10-fold cross-validation scheme. (D) Comparison of the Root Mean Square Error (RMSE) of the sigmoid model on the different trans gene dosage responses to the RMSE of the equivalent loess fit (bottom panel). Highlighted in blue are the non-monotonic responses that correspond to the top four ΔRMSEsigmoid-loess (RMSEsigmoid - RMSEloess) values (top panel).
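The cross-validation and RMSE comparisons in panels C and D follow a generic pattern: hold out a fold, fit on the remainder, predict the held-out points, and summarise the error. The sketch below shows that pattern with a straight-line model as a placeholder for the sigmoid or loess fits. The interleaved fold assignment, the helper names, and the data are assumptions for illustration only.

```python
import math

def fit_line(x, y):
    """Least-squares intercept and slope (placeholder for the dosage response model)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - slope * mx, slope

def predict_line(params, x_new):
    intercept, slope = params
    return intercept + slope * x_new

def kfold_predictions(x, y, k=10, fit=fit_line, predict=predict_line):
    """Out-of-fold prediction for every point; fold f holds indices f, f+k, f+2k, ..."""
    n = len(x)
    preds = [None] * n
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        train_x = [x[i] for i in range(n) if i not in test_idx]
        train_y = [y[i] for i in range(n) if i not in test_idx]
        params = fit(train_x, train_y)
        for i in test_idx:
            preds[i] = predict(params, x[i])
    return preds

def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))

# Made-up, noise-free linear response so the held-out error is essentially zero.
x = [-2 + 0.2 * i for i in range(20)]
y = [1.0 + 0.5 * xi for xi in x]

cv_pred = kfold_predictions(x, y, k=10)
cv_rmse = rmse(y, cv_pred)
print(f"10-fold CV RMSE = {cv_rmse:.2e}")
```

Passing a different `fit`/`predict` pair would let the same loop score two competing models on identical folds, which is what a difference such as RMSEsigmoid - RMSEloess requires.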

Figure 3—figure supplement 7
Distribution of the fitted parameters of the sigmoidal model on dosage responses.

Cumulative distribution of the four fitted parameters (first four columns) of the sigmoid model across genes given the independent perturbation of the four transcription factors (TFs) (rows). slope_IF = slope of dosage response curve at the inflection point, min_asmp = minimum asymptote (minimum trans gene dosage level), max_asmp = maximum asymptote (maximum trans gene dosage level), x_IF = TF expression FC at the dosage response inflection point.

Figure 4 with 1 supplement
Relationship between gene and dosage response properties.

(A) Predicted changes (using sigmoid or loess fits for monotonic and non-monotonic responses, respectively) in relative expression of all trans genes in response to changes in GFI1B, MYB, and NFE2 expression. Trans genes (rows) were hierarchically clustered based on their expression fold change linked to alterations in the dosage of all transcription factors (TFs). A dendrogram of the resulting clustering is shown on the left. (B) Heatmap showing the qualitative properties of each trans gene. The x-axis indicates specific gene features. The top labels specify the source of the data, while the bottom labels describe the corresponding gene properties. WBCs, platelets, RBCs, and reticulocytes refer to genome-wide association studies (GWAS) of white blood cells, platelets, red blood cells, and reticulocytes, respectively. (C) Heatmap indicating the z-scaled quantitative gene features of each trans gene. The x-axis indicates specific gene features. The top labels specify the source of the data, while the bottom labels describe the corresponding gene properties. Erythroblast, platelets, monocytes, and dendritic cells refer to cell types from Hay et al., 2018. Gray cells indicate missing data. (D) The difference in the average value of the sigmoid parameter indicated on the right between the genes classified into the no/yes category of the gene properties indicated in B. (E) Pearson correlation coefficient of the quantitative trans gene features (shown in C) with the sigmoid parameter value for each trans gene in the response to the modulation of dosage of the TF indicated on the left. The size of the points is inversely related to the significance of correlation, and colour indicates the direction of correlation. (F) Differences in the range of expression response for Housekeeping vs. non-Housekeeping trans genes with changes of dosage of MYB, GFI1B, and NFE2. (G) Negative correlation between haploinsufficiency score (pHaplo) and the range of the response of trans genes to the modulation of MYB.
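For readers who want to reproduce the kind of summary shown in panels C, E, and G, the sketch below z-scales a quantitative gene feature and computes its Pearson correlation with a response parameter. The use of the sample standard deviation for scaling, the variable names, and the numbers are assumptions for this example and are not taken from the study.

```python
import math

def z_scale(values):
    """Centre on the mean and divide by the sample standard deviation (assumed convention)."""
    n = len(values)
    m = sum(values) / n
    sd = math.sqrt(sum((v - m) ** 2 for v in values) / (n - 1))
    return [(v - m) / sd for v in values]

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up values: a haploinsufficiency-like score and a response range per trans gene.
feature = [0.1, 0.4, 0.5, 0.9]
response_range = [2.0, 1.4, 1.2, 0.4]

feature_z = z_scale(feature)
r_raw = pearson_r(feature, response_range)
r_scaled = pearson_r(feature_z, response_range)
print(r_raw, r_scaled)
```

Pearson correlation is unchanged by z-scaling, so scaling affects only how features are displayed in a heatmap, not the correlation reported for them.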

Figure 4—figure supplement 1
Relationship of gene properties and transcription factor (TF)-target network properties with TF dosage responses.

(A) A regulatory network constructed based on TF-target gene data (Minaeva et al., 2025) with nodes and edges coloured by betweenness. Nodes are sized by their degree. (B) Heatmap illustrating the correlation between the sigmoid parameters in response to cis-gene modulation and network centrality metrics calculated based on the regulatory networks from Minaeva et al., 2025. Point size is scaled to -log10 p-value.

Figure 5 with 1 supplement
Non-linearities in transcription factor (TF) dosage responses of complex traits and disease genes.

(A) Heatmap illustrating the correlation between the mean expression of cell types and the changes in expression linked to individual TF dosage perturbations. The bar plot on the top panel represents cis gene dosage perturbation. Asterisks (*) denote correlations significant at 10% FDR. (B) Enrichment log(odds) ratio of non-linear TF dosage responses (ΔAIClinear-sigmoid > 0) in disease-related genes (OMIM genes linked to 1 or more diseases, top panel) or in GWAS blood traits-associated genes (closest expressed gene to lead GWAS variant, bottom panel). Log(odds) with Fisher's exact test at FDR < 0.05 are highlighted in blue. (C) Examples of TF dosage response curves of genes both associated with disease (OMIM) and complex traits (Blood GWAS).
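The enrichment in panel B rests on a 2x2 contingency table (non-linear versus linear response, gene in the set versus not). A minimal sketch of the log odds ratio and a two-sided Fisher's exact test is given below. The counts are invented, the function names are hypothetical, and the multiple-testing (FDR) step mentioned in the caption is not included.

```python
from math import comb, log

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more likely than the observed one.
    """
    n = a + b + c + d
    row1, col1 = a + b, a + c

    def prob(k):
        return comb(col1, k) * comb(n - col1, row1 - k) / comb(n, row1)

    p_obs = prob(a)
    k_min, k_max = max(0, row1 - (n - col1)), min(row1, col1)
    p_value = sum(prob(k) for k in range(k_min, k_max + 1)
                  if prob(k) <= p_obs * (1 + 1e-7))
    return min(1.0, p_value)

def log_odds_ratio(a, b, c, d):
    """Natural log of the sample odds ratio (a*d)/(b*c); requires non-zero b and c."""
    return log((a * d) / (b * c))

# Invented counts: rows = non-linear / linear response, columns = in gene set / not.
a, b, c, d = 8, 2, 1, 5

log_or = log_odds_ratio(a, b, c, d)
p_value = fisher_exact_two_sided(a, b, c, d)
print(f"log(odds ratio) = {log_or:.3f}, p = {p_value:.4f}")
```

A positive log odds ratio means non-linear responses are over-represented in the gene set. With several TFs and gene sets tested, the resulting p-values would still need a correction such as Benjamini-Hochberg before applying an FDR threshold.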

Figure 5—figure supplement 1
Transcriptional similarity among bone marrow cell types at different transcription factor (TF) dosage levels.

(A) Normalized z-score mean expression across donors for targeted genes within each bone marrow cell type (data from the Human Cell Atlas). (B) Examples of trends in the correlation of trans gene expression with the change in TF dosage. The title specifies the cis gene and the cell type for which the trans effects of TF dosage modulation have been contrasted.

Tables

Key resources table
Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information
Recombinant DNA reagent | pCC_05: Lentiviral puromycin CRISPRa dCas9-VPR system | Addgene | RRID:Addgene_139090 | Used as PCR template for dCas9-VPR cassette (Legut et al., 2020).
Recombinant DNA reagent | pGC02: Lentiviral blasticidin CRISPRi KRAB-dCas9-MeCP2 system | other | RRID:Addgene_170068 | Sourced from Sanjana Laboratory (Morris et al., 2023). Backbone for pJDE003 construction; digested with XbaI-FD and BamHI-FD.
Recombinant DNA reagent | pJDE003: Lentiviral blasticidin CRISPRa dCas9-VPR system | this study | NA | Constructed by replacing KRAB-dCas9-MeCP2 cassette in pGC02 with dCas9-VPR PCR product from pCC_05; Gibson assembled (2:1 insert:vector).
Recombinant DNA reagent | pGC03: Lentiviral puromycin sgRNA library cloning vector | Addgene | RRID:Addgene_170069 | Used for cloning 96-sgRNA library (BsmBI digestion; NEBuilder HiFi assembly).
Recombinant DNA reagent | pMD2.G: Lentiviral envelope plasmid | Addgene | RRID:Addgene_12259 | Envelope plasmid for lentiviral production.
Recombinant DNA reagent | psPAX2: Lentiviral packaging plasmid | Addgene | RRID:Addgene_12260 | Packaging plasmid for lentiviral production.
Strain, strain background (Escherichia coli) | NEB 5-alpha competent cells | New England Biolabs | NEB:C2987H | Used for plasmid transformations (pJDE003 assemblies).
Strain, strain background (E. coli) | One Shot Stbl3 chemically competent cells | Thermo Fisher Scientific | ThermoFisher:C737303 | Used for cloning/propagating lentiviral vectors.
Strain, strain background (E. coli) | Endura electrocompetent cells | Lucigen | Lucigen:60242-2 | Used for sgRNA library transformation by electroporation; >2.5e5 transformants obtained.
Cell line (Human) | HEK293FT | Thermo Fisher Scientific | ThermoFisher:R70007; RRID:CVCL_6911 | Maintained at 37 °C, 5% CO2 in DMEM high glucose (Cytiva SH30022.01) + 10% Serum Plus II (Sigma 14009C).
Cell line (Human) | K562 | ATCC | ATCC:CCL-243; RRID:CVCL_0004 | Maintained at 37 °C, 5% CO2 in IMDM, GlutaMAX (ThermoFisher:31980097) + 10% Serum Plus II (Sigma 14009C).
Antibody | Purified anti-CRISPR (CAS9) antibody (clone 7A9) | BioLegend | BioLegend:844302; RRID:AB_2749904 | Primary antibody for western blot of dCas9 (conditions not specified in excerpt).
Antibody | GAPDH (14C10) Rabbit monoclonal antibody | Cell Signaling Technology | CST:2118S; RRID:AB_561053 | Primary antibody for loading control western blot (conditions not specified in excerpt).
Antibody | IRDye 800CW goat anti-mouse IgG (H+L) | LI-COR | LI-COR:925-32212 | Secondary antibody for CAS9 western blot (conditions not specified in excerpt).
Antibody | IRDye 680RD goat anti-rabbit IgG (H+L) | LI-COR | LI-COR:925-68073 | Secondary antibody for GAPDH western blot (conditions not specified in excerpt).
Antibody | FITC anti-human CD4 antibody (clone RPA-T4) | BioLegend | BioLegend:300505; RRID:AB_314073 | Used for FACS validation of CRISPRa activation (day 4 and day 10/11 post-transduction).
Antibody | APC anti-human CD19 antibody (clone HIB19) | BioLegend | BioLegend:302211; RRID:AB_314241 | Used for FACS validation of CRISPRa activation.
Antibody | PE anti-human CD45 antibody (clone 2D1) | BioLegend | BioLegend:368509; RRID:AB_2566369 | Used for FACS validation of CRISPRa activation.
Commercial assay or kit | Q5 High-Fidelity 2X Master Mix | New England Biolabs | NEB:M0492L | PCR amplification of dCas9-VPR cassette.
Commercial assay or kit | Gibson Assembly Master Mix | New England Biolabs | NEB:E2611S | Used for Gibson assembly (2:1 insert:vector).
Commercial assay or kit | NEBuilder HiFi DNA Assembly kit | New England Biolabs | NEB:NEBuilder-HiFi | Used for cloning pooled sgRNA library into BsmBI-digested pGC03 (10 reactions).
Commercial assay or kit | Plasmid Maxiprep Kit | QIAGEN | QIAGEN:12362 | Used for plasmid DNA preparation for virus production.
Commercial assay or kit | Maxi Fast-Ion Plasmid Kit, Endotoxin Free | IBI Scientific | IBI:IB47123 | Used for sgRNA library plasmid maxiprep.
Commercial assay or kit | Steriflip-HV 0.45 µm filter | Millipore | Millipore:SE1M003M00 | Filtration of harvested lentiviral supernatant.
Commercial assay or kit | Lentivirus Precipitation Solution | Alstem | Alstem:VC100 | Used for lentiviral concentration (10X or 2X as described).
Commercial assay or kit | 10x Chromium Next GEM Single Cell 5' Reagent Kit v2 (single indexing) | 10x Genomics | 10x:PN-1000265; 10x:PN-1000190 | Used for 5' single-cell library prep (two lanes; ECCITE-seq modifications).
Commercial assay or kit | 10x Targeted Gene Expression protocol | 10x Genomics | 10x:PN-1000248 | Custom probe library used for targeted enrichment of genes of interest.
Commercial assay or kit | Illumina NextSeq 500/550 Mid Output v2.5 kit (150 cycles) | Illumina | Illumina:NextSeq-MidOutput-v2.5-150 | Sequencing of targeted gene expression, HTO and GDO libraries.
Commercial assay or kit | Illumina MiSeq Reagent Kit v3 (150 cycles) | Illumina | Illumina:MiSeq-v3-150 | Sequencing of dCas9 targeted enrichment and additional HTO libraries.
Commercial assay or kit | xGen Custom Hybridization Capture Panel (biotinylated oligos) | IDT | IDT:xGen-Custom-Panel | Custom targeted gene expression panel (final 4,405 probes; ~15% discarded during design).
Commercial assay or kit | LookOut Mycoplasma PCR Detection Kit | Sigma-Aldrich | Sigma:MP0035 | Routine mycoplasma testing (frequency not specified).
Peptide, recombinant protein | XbaI FastDigest (XbaI-FD) | Thermo Fisher Scientific | ThermoFisher:FD0685 | Restriction digest of pGC02.
Peptide, recombinant protein | BamHI FastDigest (BamHI-FD) | Thermo Fisher Scientific | ThermoFisher:FD0054 | Restriction digest of pGC02.
Peptide, recombinant protein | FastAP Thermosensitive Alkaline Phosphatase | Thermo Fisher Scientific | ThermoFisher:EF0651 | Vector dephosphorylation after restriction digest.
Peptide, recombinant protein | DpnI | Thermo Fisher Scientific | ThermoFisher:FD1704 | Digest PCR template plasmid (15 min) prior to Gibson assembly.
Other | DMEM high glucose with L-glutamine; without sodium pyruvate | Cytiva (HyClone) | Cytiva:SH30022.01 | Used for HEK293FT culture and lentivirus resuspension media.
Other | IMDM, GlutaMAX | Thermo Fisher Scientific | ThermoFisher:31980097 | Used for K562 culture.
Other | Serum Plus II medium supplement | Sigma-Aldrich | Sigma:14009C | Used at 10% supplementation for HEK293FT and K562 culture.
Chemical compound, drug | Polyethylenimine (PEI) linear MW 25,000 | Polysciences | Polysciences:23966 | Used for HEK293FT transfection for lentivirus production.
Chemical compound, drug | Blasticidin | A.G. Scientific | A.G.Scientific:B-1247 | Used at 10 µg/mL for 16 days to select dCas9-VPR K562 clones; also 5 µg/mL during sgRNA library culture as described.
Chemical compound, drug | Puromycin | InvivoGen | InvivoGen:ant-pr-1 | Used at 2 µg/mL for sgRNA integration selection.
Chemical compound, drug | GlycoBlue | Thermo Fisher Scientific | ThermoFisher:AM9515 | Used for DNA precipitation of pooled sgRNA library assemblies.
Chemical compound, drug | Isopropanol | other | NA | Used for DNA precipitation of pooled sgRNA library assemblies.
Chemical compound, drug | NaCl | other | NA | Used at 50 mM during DNA precipitation of pooled sgRNA library assemblies.
Chemical compound, drug | Ethanol 70% | other | NA | Used for washes during DNA precipitation cleanup.
Sequence-based reagent | PCR primers oJDE005 and oJDE006 | this study | NA | Used to amplify dCas9-VPR cassette from pCC_05; primer sequences not provided in excerpt.
Sequence-based reagent | 96-sgRNA library (ssDNA oligos, 60 bp) for gene dosage library | IDT | IDT:ssDNA-oligos-plate | 96 guides pooled equimolarly to 0.2 µM; cloned into pGC03; guide sequences not provided in excerpt.
Software, algorithm | FastQC | Babraham Bioinformatics | RRID:SCR_014583 | Used for QC/demultiplexing of FASTQs (version not specified).
Software, algorithm | Cell Ranger (cellranger count) | 10x Genomics | RRID:SCR_017344 | Used for gene expression (with targeted-panel) and guide capture analysis (Gaussian mixture model calling).
Software, algorithm | Seurat | Hao et al., 2021 | RRID:SCR_016341 | Used for normalization, scaling, and UMAP; Seurat v4.3 used for NormalizeData and downstream analyses.
Software, algorithm | Salmon/Alevin | Srivastava et al., 2020 | RRID:SCR_017036 | Used for HTO quantification (version not specified).
Software, algorithm | Sceptre | Barry et al., 2021 | NA | Used to validate calibration of control cells (Figure 1e).
Software, algorithm | R (stats: lm, loess, AIC; drc: drm(fct = L.4())) | R Foundation; drc package | RRID:SCR_001905 | Used for model fitting (linear, LOESS, 4-parameter sigmoid) and AIC calculation.

Additional files

Cite this article

  1. Júlia Domingo
  2. Mariia Minaeva
  3. John A Morris
  4. Samuel Ghatan
  5. Marcello Ziosi
  6. Neville E Sanjana
  7. Tuuli Lappalainen
(2026)
Nonlinear transcriptional responses to gradual modulation of transcription factor dosage
eLife 13:RP100555.
https://doi.org/10.7554/eLife.100555.3