HIV-1 infection induces genomic R-loop accumulation in cells at early times post-infection.

(A) Summary of experimental design for DRIPc-seq in HeLa cells, primary CD4+ T cells and Jurkat cells infected with HIV-1. (B) Bar graphs indicating DRIPc-seq peak counts for HIV-1-infected HeLa cells, primary CD4+ T cells and Jurkat cells harvested at the indicated hours post infection (hpi). Pre-immunoprecipitated samples were untreated (−) or treated (+) with RNase H, as indicated. Each bar corresponds to pooled datasets from two biologically independent experiments. (C) All genomic loci overlapping a DRIPc-seq peak from HIV-1-infected HeLa cells, primary CD4+ T cells and Jurkat cells in at least one sample are stacked vertically; the position of each peak in a stack is constant horizontally across samples. Each hpi occupies a vertical bar, as indicated. Each bar corresponds to pooled datasets from two biologically independent experiments. Peaks common to all samples are shown in black, and peaks common to at least two samples in dark gray. The lack of a DRIP signal over a given peak in any sample is shown in light gray. The sample-unique peaks are colored blue, yellow, green, and red at 0, 3, 6, and 12 hpi, respectively. (D) Dot blot analysis of R-loops in gDNA extracts from HIV-1-infected HeLa cells at an MOI of 0.6 harvested at the indicated hpi. gDNAs were probed with anti-S9.6. gDNA extracts were incubated with or without RNase H in vitro before membrane loading (anti-RNA/DNA signal). Fold induction was calculated by quantifying the dot intensity of the blots, taking the ratio of the S9.6 signal to the total amount of gDNA (anti-ssDNA signal), and normalizing to the value for cells harvested at 0 hpi. (E) Representative images of the immunofluorescence assay of S9.6 nuclear signals in HIV-1-infected HeLa cells at an MOI of 0.6 harvested at 6 hpi. The cells were pre-extracted of cytoplasm and co-stained with anti-S9.6 (red), anti-nucleolin antibodies (green), and DAPI (blue).
The cells were incubated with or without RNase H in vitro before staining with anti-S9.6 antibodies, as indicated. S9.6 signal intensity per nucleus was quantified after subtraction of the nucleolar signal. The mean value for each data point is indicated by the red line. Statistical significance was assessed using one-way ANOVA (n > 53).
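
The fold-induction calculation described for panel D can be illustrated with a minimal sketch. It assumes that each time point has one quantified S9.6 dot intensity and one anti-ssDNA dot intensity; the function name `fold_induction` and all intensity values are hypothetical and serve only to show the arithmetic, not the authors' actual quantification pipeline.

```python
def fold_induction(s96, ssdna, reference="0 hpi"):
    """Normalize S9.6/ssDNA ratios to the reference sample (assumed: 0 hpi)."""
    # Ratio of the S9.6 signal to the total gDNA (anti-ssDNA) signal per sample
    ratios = {sample: s96[sample] / ssdna[sample] for sample in s96}
    ref = ratios[reference]
    # Express every ratio relative to the reference sample
    return {sample: ratio / ref for sample, ratio in ratios.items()}


# Hypothetical dot intensities in arbitrary units, for illustration only
s96_signal = {"0 hpi": 100.0, "3 hpi": 180.0, "6 hpi": 300.0, "12 hpi": 150.0}
ssdna_signal = {"0 hpi": 50.0, "3 hpi": 60.0, "6 hpi": 50.0, "12 hpi": 50.0}

result = fold_induction(s96_signal, ssdna_signal)
for sample, value in result.items():
    print(f"{sample}: {value:.2f}")
```

Dividing by the anti-ssDNA signal first corrects for unequal gDNA loading, so the reference sample always has a fold induction of exactly 1.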

HIV-1-induced R-loops are enriched at both transcriptionally active and silent regions.

(A) Distribution of DRIPc-seq peak lengths for HIV-1-infected HeLa cells, primary CD4+ T cells and Jurkat cells harvested at the indicated time points (blue, 0 hpi; yellow, 3 hpi; green, 6 hpi; red, 12 hpi). (B) Stacked bar graphs indicating the proportion of DRIPc-seq peaks mapped to different genomic features for HIV-1-infected HeLa cells, primary CD4+ T cells and Jurkat cells harvested at the indicated hpi. (C) Stacked bar graphs indicating the proportion of DRIPc-seq peaks mapped to the indicated genomic compartments for HIV-1-infected HeLa cells, primary CD4+ T cells and Jurkat cells harvested at 0, 3, 6, and 12 hpi. (D) Correlation between gene expression and DRIPc-seq signals of HIV-1-infected HeLa cells at an MOI of 0.6 harvested at the indicated hpi. Statistical significance was assessed using Pearson’s r and p-values.

An R-loop-inducible cell line model directly addresses R-loop-mediated HIV-1 integration site selection.

(A) Summary of the experimental design for R-loop-inducible cell lines, pgR-poor and pgR-rich. (B) Gene expression of ECFP (gray) and mAIRN (red), as measured using RT-qPCR in pgR-poor or pgR-rich cells. Where indicated, the cells were incubated with 1 µg/ml DOX for 24 h. Gene expression was normalized relative to β-actin. Data are presented as the mean ± SEM, n = 3. (C) DRIP-qPCR using the anti-S9.6 antibody against ECFP and mAIRN in pgR-poor or pgR-rich cells. Where indicated, the cells were incubated with 1 µg/ml DOX for 24 h. Pre-immunoprecipitated samples were untreated or treated with RNase H as indicated. Values are relative to those of DOX-treated (+) RNase H-untreated (−) pgR-poor cells. Data are presented as the mean ± SEM; statistical significance was assessed using two-way ANOVA (n = 2). (D) Bar graphs indicating luciferase activity at 48 hpi in pgR-poor or pgR-rich cells infected with 100 ng of p24 capsid antigen of luciferase reporter HIV-1 virus per 1 × 10^5 cells/mL. Data are presented as the mean ± SEM; P values were calculated using one-way ANOVA (n = 6). (E) Box plot indicating the quantified HIV-1 integration site sequencing read count across pgR-poor and pgR-rich transposon sequences in untreated (−) or DOX-treated (+) pgR-poor or pgR-rich cell lines infected with 100 ng of p24 capsid antigen of luciferase reporter HIV-1 virus per 1 × 10^5 cells/mL. Each box corresponds to pooled datasets from three biologically independent experiments (n = 3). In each box plot, the centerline denotes the median, the upper and lower box limits denote the upper and lower quartiles, and the whiskers denote the 1.5 × interquartile range. Statistical significance was assessed using a two-sided Mann–Whitney U test. (F and G) Heat maps representing the number of HIV-1 integration-seq mapped reads across the pgR-poor (F) or pgR-rich (G) transposon sequence in untreated (−) or DOX-treated (+) pgR-poor (F) or pgR-rich (G) cell lines.
Each rectangular box corresponds to the pooled number of HIV-1 integration-seq mapped reads from three biologically independent experiments (n = 3) at the indicated position within the pgR-poor (F) or pgR-rich (G) transposon vector. Each light blue box represents the actual position of the R-loop-forming or non-R-loop-forming sequence (ECFP or mAIRN), and the yellow stars indicate the position of the TRE promoter within the vector.
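
The box plot conventions stated for panel E (median centerline, quartile box limits, whiskers at 1.5 × the interquartile range) can be sketched as follows. This is an illustration under assumptions: the legend does not state which quantile method was used or how the whisker ends were placed, so the sketch uses the inclusive method of Python's `statistics.quantiles` and the common Tukey convention of drawing each whisker to the most extreme data point within 1.5 × IQR of the box. The helper name `box_stats` and the read counts are hypothetical.

```python
from statistics import quantiles


def box_stats(values):
    """Median, quartiles, and Tukey-style whisker ends for one box."""
    data = sorted(values)
    # Quartiles by the inclusive method (an assumption, see lead-in)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # Whiskers end at the most extreme points still inside the fences
    lower_whisker = min(v for v in data if v >= low_fence)
    upper_whisker = max(v for v in data if v <= high_fence)
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_whisker": lower_whisker,
        "upper_whisker": upper_whisker,
    }


# Hypothetical read counts, for illustration only
stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 30])
print(stats)
```

In this made-up example the value 30 lies beyond the upper fence, so it would be drawn as an outlier rather than extending the whisker.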

HIV-1 prefers host genomic R-loop regions for its viral cDNA integration.

(A) Bar graphs showing the number of HIV-1 integration sites per Mb across the total region covered by 30-kb windows centered on DRIPc-seq peaks from HIV-1-infected HeLa cells, primary CD4+ T cells and Jurkat cells (magenta) or across non-R-loop regions of the cellular genome (gray). (B) Proportion of integration sites within the 30-kb windows centered on DRIPc-seq peaks (magenta solid lines) or randomized DRIPc-seq peaks (gray dotted lines). Control comparisons between randomized integration sites with DRIPc-seq peaks and randomized DRIPc-seq peaks are indicated by black dotted lines and gray solid lines, respectively. (C and D) Superimpositions of HIV-1-induced R-loop-positive chromatin regions, P1-P3 (C), and HIV-1-induced R-loop-negative chromatin regions, N1 and N2 (D), on DRIPc-seq signals (blue, 0 hpi; yellow, 3 hpi; green, 6 hpi; red, 12 hpi) and the number of mapped reads of HIV-1 integration-seq (integration, black). Magenta dotted lines represent primer binding sites in qPCR following DRIP.
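
The density measure in panel A (integration sites per Mb within 30-kb windows centered on DRIPc-seq peaks) can be sketched as below. This is a simplified illustration, not the authors' pipeline: it assumes a single chromosome, represents each peak by its center coordinate, treats windows as half-open intervals, and merges overlapping windows so that no base pair is counted twice. The function names `peak_windows` and `sites_per_mb` and all coordinates are hypothetical.

```python
from bisect import bisect_left

WINDOW = 30_000  # 30-kb window centered on each peak


def peak_windows(peak_centers, window=WINDOW):
    """Build half-open windows around peak centers and merge overlaps."""
    half = window // 2
    intervals = sorted((max(0, c - half), c + half) for c in peak_centers)
    merged = []
    for start, end in intervals:
        if merged and start <= merged[-1][1]:
            # Overlapping or touching window: extend the previous interval
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def sites_per_mb(peak_centers, integration_sites, window=WINDOW):
    """Integration sites per Mb of sequence covered by the peak windows."""
    windows = peak_windows(peak_centers, window)
    sites = sorted(integration_sites)
    total_bp = sum(end - start for start, end in windows)
    # Count sites with start <= position < end for each merged window
    n_sites = sum(
        bisect_left(sites, end) - bisect_left(sites, start)
        for start, end in windows
    )
    return n_sites / (total_bp / 1e6)


# Hypothetical coordinates on one chromosome, for illustration only
peaks = [100_000, 110_000, 500_000]
integrations = [90_000, 120_000, 125_000, 300_000, 500_000]

windows = peak_windows(peaks)
density = sites_per_mb(peaks, integrations)
print(windows, round(density, 2))
```

The same function applied to a set of non-R-loop control regions would give the comparison value; how those control regions were chosen is not specified in the legend and is not modeled here.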

HIV-1 integrase proteins directly bind to host genomic R-loops.

(A) Representative gel images for EMSA of Sso7d-tagged HIV-1-integrase (E152Q) with R-loop and dsDNA. Nucleic acid substrate (10 nM) was incubated with Sso7d-tagged HIV-1-integrase (E152Q) at 0 nM, 20 nM, 50 nM, 100 nM, 200 nM, and 400 nM (left). Unbound fractions were quantified for EMSA of Sso7d-tagged HIV-1-integrase (E152Q) with different types of substrates (R-loop, dsDNA, R-loop, R:D+ssDNA and Hybrid). Data are presented as the mean ± SEM, n = 3 (right). (B) Summary of the experimental design for R-loop immunoprecipitation using S9.6 antibody in FLAG-tagged HIV-1 integrase protein-expressing HeLa cells. (C) Western blotting for HIV-1 integrase protein, H3, and LaminA/C of DNA–RNA hybrid immunoprecipitation using the S9.6 antibody. (D and E) HeLa gDNA input was either untreated (−) or treated (+) with RNase H before enrichment for DNA–RNA hybrids using the S9.6 antibody. gDNA–RNA hybrids were incubated with nuclear extracts depleted of DNA–RNA hybrids with RNase A followed by S9.6 immunoprecipitation. DNA–RNA hybrid dot blot (D) and western blot of DNA–RNA hybrid immunoprecipitation, probed with the indicated antibodies (E). (F) DNA–RNA hybrid dot blot of FLAG antibody-immunoprecipitated nucleic acid extracts. Where indicated, nucleic acid extracts were untreated (−) or treated (+) with RNase H before probing with the S9.6 antibodies. (G) Representative images of the proximity-ligation assay (PLA) between GFP and S9.6 antibodies in HIV-IN-EGFP virion-infected HeLa cells at 6 hpi. Cells were subjected to PLA (orange) and co-stained with DAPI (blue). PLA puncta in the nucleus are indicated by the yellow arrows. Quantification of the number of PLA foci per nucleus (left). GFP_alone and S9.6_alone were used as single-antibody controls from HIV-IN-EGFP virion-infected HeLa cells (right). The mean value for each data point is indicated by the red line. The P value was calculated using a two-tailed unpaired t-test (n > 50).

Primary CD4+ T cell sorting strategies and GFP-HIV-1 infection.

(A) Gating strategy used to determine the efficiency of CD4+ T cell sorting from human PBMCs. Pre-sorted PBMCs were stained with FITC-conjugated anti-CD4 and subjected to positive CD4+ T cell sorting. The percentages of the FITC-stained cell population at each step of cell sorting are as indicated. (B) Gating strategy used to determine non-activated (Naïve) and activated cells (αCD3/28) with two markers, CD25 (FITC) and CD69 (APC), for each donor (upper panels, Donor 1; lower panels, Donor 2). (C) Gating strategy used to determine HIV-1 infectivity of CD4+ T cells from each donor infected with GFP reporter HIV-1 virus at 48 hpi. The percentages of the GFP-positive cell population are as indicated.

Genome browser screenshots of the HIV-1-induced R-loop-positive or -negative genomic regions.

(A–C) Genome browser screenshots of the P1 (A), P2 (B), and P3 (C) HIV-1-induced R-loop-positive chromosomal regions showing results from DRIPc-seq in HIV-1-infected HeLa cells (blue, 0 hpi; yellow, 3 hpi; green, 6 hpi; red, 12 hpi; black, input signals for each indicated sample) on the plus (+) or minus (−) DNA strand. Magenta dotted lines represent primer binding sites in qPCR following DRIP. (D and E) Genome browser screenshots of the N1 (D) and N2 (E) HIV-1-induced R-loop-negative chromosomal regions showing results from DRIPc-seq in HIV-1-infected HeLa cells (blue, 0 hpi; yellow, 3 hpi; green, 6 hpi; red, 12 hpi; black, input signals for each indicated sample) on the plus (+) or minus (−) DNA strand. Magenta dotted lines represent primer binding sites in qPCR following DRIP.

Host cellular R-loop induction by HIV-1 infection is host-genome specific.

(A) DRIP-qPCR using the anti-S9.6 antibody at P1, P2, P3, N1, and N2 in HIV-1-infected cells at an MOI of 0.6 harvested at the indicated hpi (blue, 0 hpi; green, 6 hpi). Pre-immunoprecipitated materials were untreated (−) or treated (+) with RNase H, as indicated. Data are presented as the mean ± SEM; P values were calculated using one-way ANOVA (n = 2). (B) Dot blot analysis of R-loops in gDNA extracts from HIV-1-infected HeLa cells at an MOI of 0.6 harvested at 6 hpi. The cells were treated with DMSO, 10 µM nevirapine (NVP), or 10 µM raltegravir (RAL) for 24 h before infection, as indicated. gDNAs were probed with anti-S9.6. gDNA extracts were incubated with or without RNase H in vitro before membrane loading (anti-RNA/DNA signal). Fold induction was calculated by quantifying the dot intensity of the blots, taking the ratio of the S9.6 signal to the total amount of gDNA (anti-ssDNA signal), and normalizing to the value for cells harvested at 0 hpi. (C) Representative images of the immunofluorescence assay of S9.6 nuclear signals in HIV-1-infected HeLa cells at an MOI of 0.6 at 6 hpi. The cells were pre-extracted of cytoplasm and co-stained with anti-S9.6 (red), anti-nucleolin antibodies (green), and DAPI (blue). The cells were treated with DMSO, 10 µM nevirapine (NVP), or 10 µM raltegravir (RAL) for 24 h before infection, as indicated. S9.6 signal intensity per nucleus was quantified after subtraction of the nucleolar signal. The mean value for each data point is indicated by the red line. Statistical significance was assessed using one-way ANOVA (n > 51). (D) Pie graphs indicating the percentage of DRIPc-seq reads aligned to the host cellular genome (aquamarine) or to the HIV-1 viral genome (gray), out of the total consensus DRIPc-seq peaks from HIV-1-infected HeLa cells, primary CD4+ T cells and Jurkat cells.

R-loop induction by HIV-1 infection does not follow transcriptome changes in HeLa cells.

(A) Line graphs and heat maps representing expression levels of the indicated repetitive elements (SINE, right; LINE, middle; LTR, left) at the indicated hpi of HIV-1 in HeLa cells. Data are presented as the mean expression levels of two biologically independent experiments. (B) Expression of the indicated genes as measured by RT-qPCR in HIV-1-infected HeLa cells harvested at 0 or 6 hpi. Data represent mean ± SEM, n = 3; P values were calculated using a two-tailed Student’s t-test. P > 0.05; n.s, not significant.

Regulation of cellular R-loops by RNase H1 expression, or by transposon-transposase insertion of R-loop forming and non-R-loop forming sequences in HeLa cells.

(A) Copy number of piggyBac transposon inserts in each cell line constructed by transfecting the transposon vector and transposase-expressing vector. (B and C) Fold induction of gene expression for the indicated genes as measured by RT-qPCR. Fold induction was calculated by dividing the gene expression level of DOX-treated (+) cells by that of DOX-untreated (−) cells in pgR-poor cells (B) or pgR-rich cells (C). Data represent mean ± SEM, n = 2; P values were calculated using two-way ANOVA. P > 0.05; n.s, not significant. (D and E) Relative gene expression of the indicated genes as measured by RT-qPCR in DOX-treated (+) or DOX-untreated (−) pgR-poor cells (D) or pgR-rich cells (E). Data represent mean ± SEM, n = 2; P values were calculated using two-way ANOVA. P > 0.05; n.s, not significant.

HIV-1 integrase proteins directly bind to host genomic R-loops.

(A) Representative gel images for EMSA of Sso7d-tagged HIV-1-integrase (E152Q) with different types of nucleic acid substrates (R:D+ssDNA and Hybrid). Nucleic acid substrate (100 nM) was incubated with Sso7d-tagged HIV-1-integrase (E152Q) at 0 nM, 20 nM, 50 nM, 100 nM, 200 nM, and 400 nM (n = 3). (B) Nucleic acid extracts from FLAG-HIV-1-integrase-expressing cells were immunoprecipitated using S9.6 antibody. gDNA was precipitated from the eluates of the immunoprecipitation and subjected to DNA–RNA hybrid dot blotting. Where indicated, the gDNA extracts were either untreated (−) or treated (+) with RNase H after elution of immunoprecipitated materials. (C) Summary of the experimental design for R-loop immunoprecipitation using S9.6 antibody in FLAG-tagged HIV-1 integrase protein-expressing HeLa cells with pre-immunoprecipitation in vitro RNase H treatment. (D) Protein extracts from FLAG-HIV-1-integrase-expressing cells were immunoprecipitated using anti-FLAG antibody. The western blot of the FLAG immunoprecipitation was probed with anti-FLAG or anti-H3 antibodies. (E) Representative images of the proximity-ligation assay (PLA) using a single antibody (anti-GFP or anti-S9.6) in HIV-IN-EGFP virion-infected HeLa cells at 6 hpi, as PLA signal negative controls. Cells were subjected to PLA (orange) and co-stained with DAPI (blue) (n > 50).

Chromosomal position and DRIPc-seq signal for referenced R-loop-positive and -negative regions in HIV-1-infected HeLa cells

Chromosomal position and DRIPc-seq signal for referenced R-loop-positive and -negative regions in HIV-1-infected primary CD4+ T cells

Chromosomal position and DRIPc-seq signal for referenced R-loop-positive and -negative regions in HIV-1-infected Jurkat cells

RNA-seq analysis of relative gene expression levels of the P1-P3 and N1 and N2 R-loop regions

Oligonucleotides used for DRIPc-seq library construction

Primers used for qPCR

Oligonucleotides used for HIV-1 integration site sequencing library construction

Oligonucleotides used for electrophoretic mobility shift assay substrate preparation

Accession numbers and data sources.