Mapping of the native DNA G4 through the G4-hemin-mediated proximal biotinylation

A, Immunofluorescence staining of HEK293 cells treated under the indicated conditions, using Alexa Fluor 647-labelled recombinant streptavidin (Strep-647). Nuclei were stained with Hoechst 33342. Scale bar, 50 µm. Bio-An, biotin-aniline; H2O2, hydrogen peroxide.

B, Schematic of the HepG4-seq procedure. SA-scFv, the recombinant fusion protein of mSA and anti-GP41 Single Chain Fragment Variable (scFv); GP41-pG-Tn5, the recombinant fusion protein of the GP41 tag, protein G and Tn5.

C, Top: Heatmap showing the signal of BG4-seq, HepG4-seq and maxScores of PQS ±1.5kb around the center of peaks identified by HepG4-seq in HEK293 cells. Color scales represent the density of the signals. Bottom: Profile plot showing the average signal of HepG4-seq reads ±1.5kb around the center of peaks and the average maxScores of PQS calculated by pqsfinder at the same positions. HepG4 rep1/rep2, two biologically independent HepG4-seq replicates in HEK293 cells; Non rep1/rep2, two biologically independent non-label negative control replicates in HEK293 cells.
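As an illustration of how a profile plot of this kind can be derived, the sketch below averages a per-base signal in fixed bins across a ±1.5 kb window around each peak center. It is a minimal sketch and not the authors' pipeline: the function name `average_profile`, the 100 bp bin size, and the list-based signal track are all assumptions made for the example.

```python
def average_profile(signal, centers, flank=1500, bin_size=100):
    """Mean binned signal in a +/- flank window, averaged over peak centers.

    signal   : per-base signal values for one chromosome (hypothetical input)
    centers  : 0-based peak center positions on that chromosome
    flank    : half-window size in bp (1500 matches the +/-1.5 kb in the legend)
    bin_size : bin width in bp (an assumption; the legend does not state it)
    """
    n_bins = (2 * flank) // bin_size
    totals = [0.0] * n_bins
    used = 0
    for center in centers:
        start = center - flank
        end = center + flank
        if start < 0 or end > len(signal):
            continue  # skip windows that run off either end of the track
        for b in range(n_bins):
            lo = start + b * bin_size
            window = signal[lo:lo + bin_size]
            totals[b] += sum(window) / bin_size
        used += 1
    if used == 0:
        return [0.0] * n_bins
    return [t / used for t in totals]


# Toy track: a 100 bp block of signal starting at the peak center.
toy_track = [0.0] * 4000
toy_track[2000:2100] = [1.0] * 100
profile = average_profile(toy_track, centers=[2000])
```

The same function could be applied to a track of PQS maxScores to obtain the second curve, assuming the scores are available per base.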

D, Representative genome browser tracks showing HepG4-seq (red), non-label negative control of HepG4-seq (yellow), and PQS (black) signals in HEK293 cells along the indicated genomic loci.

E, Distribution of HepG4-seq signals in HEK293 cells in different gene features.

F, The top enriched motifs on the HepG4-seq peaks in HEK293 cells.

G, Immunofluorescence staining of HEK293 cells treated with DMSO, ML216 (25 µM), or NSC617145 (3 µM), using Alexa Fluor 647-labelled recombinant streptavidin. Nuclei were stained with Hoechst 33342. Scale bar, 50 µm.

H, Heatmap showing the HepG4-seq signals ±1.5kb around the center of peaks identified in HEK293 cells treated with DMSO, ML216 (25 µM), or NSC617145 (3 µM). Color scales represent the density of the signals.

I, Profile plot showing the average signal of HepG4-seq reads ±1.5kb around the center of peaks identified in HEK293 cells treated with DMSO, ML216 (25 µM), or NSC617145 (3 µM). The p values (ML216 vs. DMSO; NSC617145 vs. DMSO) were calculated using the Mann-Whitney test.
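For readers unfamiliar with the Mann-Whitney test, a minimal sketch follows. It computes the U statistic by pairwise comparison and a two-sided p value from the normal approximation, without tie or continuity correction. The function name `mann_whitney_u` is hypothetical, and the legend does not state which per-peak values were compared or which software was used, so this is an illustration of the test only.

```python
import math


def mann_whitney_u(x, y):
    """Return (U, two-sided p) for samples x and y.

    U counts pairs where a value from x exceeds a value from y (ties count 0.5).
    The p value uses the large-sample normal approximation with no tie or
    continuity correction, so it is only approximate for small samples.
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return u, p


# Toy values standing in for per-peak signals in treated vs. DMSO cells.
u_stat, p_value = mann_whitney_u([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

In practice an established implementation with exact p values and tie handling would be preferable to this pairwise version, which scales with the product of the sample sizes.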

J, Representative genome browser tracks showing the HepG4-seq signals in HEK293 cells treated with DMSO, ML216 (25 µM), or NSC617145 (3 µM) along the indicated genomic loci.

Mapping of the co-localized G4s and R-loops in HEK293 cells by combining HepG4-seq and HBD-seq

A, Schematic of the HBD-seq procedure. HBD-V5, the recombinant fusion protein of the N-terminal hybrid-binding domain (HBD) of RNase H1 and V5 tag; GP41-pG-Tn5, the recombinant fusion protein of the GP41 tag, protein G and Tn5.

B, Heatmap showing the signal of HBD-seq reads ±1.5kb around the center of peaks in HEK293 cells. Two biologically independent replicates are shown. “+ RNase” represents treatment with RNase A and RNase H prior to HBD-seq. Color scales represent the density of the signals.

C, Heatmap showing the signal of HepG4-seq and HBD-seq reads ±1.5kb around the center of co-localized G4 and R-loop peaks in HEK293 cells. Color scales represent the density of the signals.

D, Profile plot showing the average signal of HepG4-seq and HBD-seq reads ±1.5kb around the center of co-localized G4 and R-loop peaks in HEK293 cells. The plot is visualized using the heatmap. Color scales represent the density of the signals.

E, Distribution of the co-localized G4s and R-loops in HEK293 cells in different gene features.

F, Top: Heatmap showing the signal of HepG4-seq and HBD-seq reads of the co-localized G4s and R-loops in HEK293 cells along the gene body, 3kb upstream of transcription start site (TSS) and 2kb downstream of transcription end site (TES). Bottom: Profile plot showing the average signal of HepG4-seq and HBD-seq reads of the co-localized G4s and R-loops in HEK293 cells along the indicated gene features. The plot is visualized using the heatmap. Color scales represent the density of the signals.

G, Representative genome browser tracks showing the HepG4-seq and HBD-seq signals of the co-localized G4s and R-loops in HEK293 cells along the indicated genomic loci.

H, The top enriched motifs of the co-localized G4s and R-loops in HEK293 cells.

The co-localized G4s and R-loops-mediated transcriptional regulation in HEK293 cells

A, Cumulative distribution plot showing comparisons of FPKMs of G4-, R-loop-, and co-localized G4 & R-loop-associated genes and all genes in HEK293 cells.

B, Scatter plot showing the distributions of FPKMs of G4-, R-loop-, and co-localized G4 & R-loop-associated genes versus the distances of these peaks to the nearest TSS. The distance is in bp. Color scales represent the density of dots.

C, Left: Scatter plot showing the distributions of fold changes (FCs) of RNA levels of co-localized G4 & R-loop-associated genes (p value < 0.05) versus FCs of G4 signals of co-localized G4s & R-loops (FC ≥ 1.5) after treatment with the indicated inhibitors of the G4-resolving helicases BLM or WRN. The numbers of genes are labelled on the plot. Right: Heatmap showing differential expression levels of co-localized G4 & R-loop-associated genes after treatment with the indicated inhibitors. RNA-seq data are from three biologically independent repeats. Color scales represent the normalized expression levels.

D, Distribution across different gene features of the co-localized G4s & R-loops whose associated genes were significantly up- or down-regulated after treatment with the indicated inhibitors.

E, Circos plot showing the overlap of co-localized G4 & R-loop-associated genes differentially expressed after treatment with ML216 or NSC617145 in HEK293 cells. Purple lines link the same genes that are shared by multiple groups. Blue lines link genes that, although different, fall under the same ontology term. The dark orange color of the inside arc represents genes that are shared by multiple groups, and the light orange color of the inside arc represents genes that are unique to that group.

F, Heatmap showing the GO-based enrichment terms of co-localized G4 & R-loop-associated genes differentially expressed after treatment with ML216 or NSC617145 in HEK293 cells. The heatmap cells are colored by their p-values.

Mapping of the co-localized G4s and R-loops in the mouse embryonic stem cells

A, Heatmap showing the signal of HepG4-seq and HBD-seq ±1.5kb around the center of peaks in mESCs. Two biologically independent replicates are shown. Two biologically independent non-label replicates were the negative controls for HepG4-seq. Two biologically independent replicates with treatment of RNase A and RNase H prior to HBD-seq were the negative controls for HBD-seq. Color scales represent the density of the signals.

B, Profile plot showing the average signal of HepG4-seq reads ±1.5kb around the center of peaks and the average maxScores of PQS calculated by pqsfinder at the same positions. HepG4 rep1/rep2, two biologically independent HepG4-seq replicates in mESCs; Non rep1/rep2, two biologically independent non-label negative control replicates in mESCs.

C, Venn diagram comparing the DNA G4 and R-loop in mESCs.

D, Top: Heatmap showing the signal of HepG4-seq and HBD-seq ±1.5kb around the center of co-localized G4s & R-loops in mESCs. Two biologically independent replicates are shown. Color scales represent the density of the signals. Bottom: Profile plot showing the average signal of HepG4-seq and HBD-seq reads ±1.5kb around the center of co-localized G4s & R-loops in mESCs.

E, Distribution of G4s, R-loops, and co-localized G4s & R-loops signals in mESCs in different gene features.

F, Left: Heatmap showing the signal of HepG4-seq and HBD-seq reads of the co-localized G4s & R-loops in mESCs along the gene body, 2kb upstream of TSS and 2kb downstream of TES. Right: Profile plot showing the average signal of HepG4-seq and HBD-seq reads of the co-localized G4s & R-loops in mESCs along the indicated gene features. The plot is visualized using the heatmap. Color scales represent the density of the signals.

G, The top enriched motifs of the co-localized G4s & R-loops in mESCs.

H, Representative genome browser tracks showing the HepG4-seq and HBD-seq signals of the co-localized G4s & R-loops in mESCs along the indicated genomic loci.

The co-localized G4s & R-loops are mainly localized in active promoters and enhancers

A, Cumulative distribution plot showing comparisons of FPKMs of G4-, R-loop-, and co-localized G4 & R-loop-associated genes and all genes in mESCs.

B, Scatter plot showing the distributions of FPKMs of G4-, R-loop-, and co-localized G4 & R-loop-associated genes versus the distances of these peaks to the nearest TSS. The distance is in bp. Color scales represent the density of dots.

C, Heatmap showing the signal of representative HepG4-seq (G4), HBD-seq (R-loop), RNA polymerase II Ser5P, H3K4me3, H3K27ac, H3K36me3, H3K4me1, and H3K27me3 ±2kb around the center of the co-localized G4s & R-loops in mESCs. Color scales represent the density of the signals. The average signal is plotted at the top of each heatmap panel.

D, Schematic of different types of promoters and enhancers. Pie chart showing the proportion of different types of promoters or enhancers that harbor the co-localized G4s & R-loops.

E-F, Heatmap showing the signal of representative HepG4-seq (G4), HBD-seq (R-loop), RNA polymerase II Ser5P, H3K4me3, H3K27ac, H3K36me3, H3K4me1, and H3K27me3 ±2kb around the center of the co-localized G4s & R-loops in the different types of promoters (E) or enhancers (F) in mESCs. Color scales represent the density of the signals. The average signal is plotted at the top of each heatmap panel.

Modulation of the co-localized G4s & R-loops by the helicase Dhx9

A, Western blot showing the protein levels of Dhx9 and Gapdh in the wildtype (WT) and dhx9KO mESCs. The non-specific band is labeled with a star.

B, Immunofluorescence staining of Dhx9 in the WT and dhx9KO mESCs cultured without MEF feeders. Nuclei were stained with Hoechst 33342. The immunofluorescence signal of contaminating MEFs is labeled with a star. Scale bar, 100 µm.

C, Heatmap showing the signal of DNA G4 (HepG4-seq) and R-loop (HBD-seq) ±1.5kb around the center of significantly differential peaks (p-value < 0.05, fold change ≥ 1.5) in WT and dhx9KO mESCs. Two biologically independent replicates are shown. Color scales represent the density of the signals.

D, Scatter plot showing distributions of fold changes of differential G4s versus fold changes of differential R-loops in WT and dhx9KO mESCs. “G4&R-loop Up”, both G4 and R-loop up-regulated; “G4&R-loop Down”, both G4 and R-loop down-regulated; “G4/R-loop Up”, G4 or R-loop up-regulated; “G4/R-loop Down”, G4 or R-loop down-regulated; “G4, R-loop Opposite”, G4 up-regulated and R-loop down-regulated, or G4 down-regulated and R-loop up-regulated.
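The category labels in this panel can be read as a simple decision rule, sketched below. This is an interpretation rather than the authors' code: the function names are hypothetical, the up-regulation cutoff (p-value < 0.05, fold change ≥ 1.5) is taken from panel C, the reciprocal cutoff for down-regulation is an assumption, and "G4/R-loop Up" and "G4/R-loop Down" are assumed to mean that only one of the two structures changes.

```python
def call_change(fold_change, p_value, fc_cut=1.5, p_cut=0.05):
    """Label one peak as 'up', 'down', or 'unchanged' (dhx9KO relative to WT)."""
    if p_value < p_cut and fold_change >= fc_cut:
        return "up"
    # Assumption: down-regulation uses the reciprocal fold-change cutoff.
    if p_value < p_cut and fold_change <= 1 / fc_cut:
        return "down"
    return "unchanged"


def classify_pair(g4_call, rloop_call):
    """Map the G4 and R-loop calls at one locus onto the panel's categories."""
    calls = (g4_call, rloop_call)
    if calls == ("up", "up"):
        return "G4&R-loop Up"
    if calls == ("down", "down"):
        return "G4&R-loop Down"
    if "up" in calls and "down" in calls:
        return "G4, R-loop Opposite"
    if "up" in calls:
        return "G4/R-loop Up"      # exactly one structure is up-regulated
    if "down" in calls:
        return "G4/R-loop Down"    # exactly one structure is down-regulated
    return "Not differential"      # hypothetical label; not shown in the legend


# Toy locus: G4 signal doubles while the R-loop signal halves.
category = classify_pair(call_change(2.0, 0.01), call_change(0.5, 0.01))
```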

E, Top: pie chart showing proportions of differential G4s, R-loops and co-localized G4s & R-loops in dhx9KO mESCs in promoters, enhancers and other regions; Bottom: bar chart showing numbers of differential G4s, R-loops and co-localized G4s & R-loops in dhx9KO mESCs in different types of promoters or enhancers.

F, Volcano plot showing distributions of G4-, R-loop-, or co-localized G4 & R-loop-associated genes differentially expressed in WT and dhx9KO mESCs. Significantly up-regulated (p-value < 0.05, fold change ≥ 1.5) and down-regulated (p-value < 0.05, fold change ≤ 0.67) genes in the dhx9KO mESCs are labeled with red and blue dots, respectively. The numbers of up- or down-regulated genes are labeled on the plot.

G, Representative genome browser tracks showing the G4 (HepG4), R-loop (HBD-seq), and RNA-seq signals in WT and dhx9KO mESCs along the indicated genomic loci.

H, GO-based enrichment terms of G4-, R-loop-, or co-localized G4 & R-loop-associated genes differentially expressed in WT and dhx9KO mESCs were hierarchically clustered into a tree based on Kappa-statistical similarities among their gene memberships. The heatmap cells are colored by their p-values.

Characterization of the co-localized G4s & R-loops directly bound by Dhx9

A, Heatmap showing expression levels of resolving helicases or regulators of G4s and/or R-loops differentially expressed in WT and dhx9KO mESCs. Color scales represent the normalized expression levels.

B, Heatmap showing the signal of Dhx9 CUT&Tag reads ±1.5kb around the center of peaks in WT mESCs. Two biologically independent replicates are shown. Color scales represent the density of the signals.

C, Pie chart showing proportions of the overlapping peaks between Dhx9 binding peaks and the co-localized G4 & R-loop peaks.

D, Heatmap showing the signal of Dhx9 CUT&Tag, HepG4-seq and HBD-seq ±1.5kb around the center of Dhx9-bound co-localized G4 & R-loop peaks in mESCs. Two biologically independent replicates are shown. Color scales represent the density of the signals.

E, The top enriched motifs of Dhx9 binding peaks overlapping with the co-localized G4 & R-loops in mESCs.

F, Heatmap showing the signal of Dhx9 CUT&Tag, HepG4-seq and HBD-seq ±1.5kb around the center of Dhx9-bound significantly differential co-localized G4 & R-loop peaks (p-value < 0.05, fold change ≥ 1.5) in WT and dhx9KO mESCs. Two biologically independent replicates are shown. Color scales represent the density of the signals.

G, Pie chart showing proportions of differential Dhx9-bound co-localized G4 & R-loops in dhx9KO mESCs in different types of promoters or enhancers.

H, Top enriched GO terms in Dhx9-bound co-localized G4 & R-loop-associated genes that are differentially expressed in WT and dhx9KO mESCs. The bubble size represents the number of genes in each indicated term. The color scale represents the p-value.

I, Representative genome browser tracks showing the Dhx9 CUT&Tag, G4 (HepG4), R-loop (HBD-seq), and RNA-seq signals in WT and dhx9KO mESCs along the indicated genomic loci. The dashed box highlights Dhx9-bound significantly differential co-localized G4 & R-loop peaks in dhx9KO mESCs.

Dhx9 regulates the cell fate of mouse embryonic stem cells

A, Relative RNA levels of indicated genes in the WT and dhx9KO mESCs that were measured by quantitative RT-PCR (qRT-PCR). Data are means ± SD; n=3, significance was determined using the two-tailed Student’s t-test.

B, Western blot showing the protein levels of the indicated genes in the WT and dhx9KO mESCs. The normalized relative protein levels are labeled below each panel; gel images were quantified with ImageJ and the level of Gapdh was used for normalization.
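The normalization described for this panel amounts to a ratio of band intensities. The sketch below shows one common way to compute it, assuming hypothetical intensity values and assuming that the Gapdh-normalized ratio is further scaled so that WT equals 1; the legend states only that Gapdh was used for normalization, so the WT scaling and the function name `normalized_levels` are illustrative choices.

```python
def normalized_levels(target, gapdh, reference="WT"):
    """Gapdh-normalized band intensity per sample, scaled to the reference sample.

    target, gapdh : dicts mapping sample name -> band intensity (e.g. from ImageJ)
    reference     : sample set to 1.0 (assumed to be WT; not stated in the legend)
    """
    ratios = {sample: target[sample] / gapdh[sample] for sample in target}
    ref_ratio = ratios[reference]
    return {sample: ratio / ref_ratio for sample, ratio in ratios.items()}


# Hypothetical intensities, for illustration only.
levels = normalized_levels(
    target={"WT": 200.0, "dhx9KO": 50.0},
    gapdh={"WT": 100.0, "dhx9KO": 100.0},
)
```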

C, Immunofluorescence staining of Oct4 and Nanog in the WT and dhx9KO mESCs cultured on MEF feeders. Nuclei were stained with Hoechst 33342. BF, bright field. Scale bar, 100 µm.

D, Cell cycle profiles determined by flow cytometry of DAPI staining in the WT and dhx9KO mESCs. The proportions of different phases of cell cycle were analyzed by ModFit. Data are means ± SD; n=3, significance was determined using the two-tailed Student’s t-test.

E, Cell proliferation rates of WT and dhx9KO mESCs were measured by the CCK-8 cell proliferation assay. The number of cells seeded at the beginning is labeled on the x-axis. Absorbance at 450 nm was determined after 2 days of culture. Data are means ± SD; n=6, significance was determined using the two-tailed Student’s t-test.

F, Pictures of embryoid bodies at the indicated days of in vitro differentiation of WT and dhx9KO mESCs. Scale bar, 200 µm.

G, Relative RNA levels of the indicated genes in the WT and dhx9KO embryoid bodies at the indicated days of in vitro differentiation, measured by qRT-PCR. Data are means ± SD; n=3, significance was determined using the two-tailed Student’s t-test.