Mapping HIV-1 RNA structure, homodimers, long-range interactions and persistent domains by HiCapR

  1. Yan Zhang
  2. Jingwan Han
  3. Xie Dejian
  4. Wenlong Shen
  5. Ping Li
  6. Jian You Lau
  7. Jingyun Li (corresponding author)
  8. Lin Li (corresponding author)
  9. Grzegorz Kudla (corresponding author)
  10. Zhihu Zhao (corresponding author)
  1. Laboratory of Advanced Biotechnology, Beijing Institute of Biotechnology, China
  2. Institute of Microbiology and Epidemiology, China
  3. MRC Human Genetics Unit, University of Edinburgh, United Kingdom
5 figures and 4 additional files

Figures

Figure 1 with 4 supplements
Experimental design of HiCapR and overall HIV-1 contact matrix.

(A) Experimental design of HiCapR for profiling RNA-RNA interactions and dynamics of the HIV genome. Based on the simplified SPLASH protocol, HiCapR incorporates a post-library probe-based hybridization and streptavidin pulldown method to enrich HIV RNA chimeras from the SPLASH library. The HiCapR method has been applied to two strains of HIV, NL4-3 and GX2005002, in both infected cells and their corresponding virus particles. (B) Comprehensive contact matrix derived from infected cells and virions of both NL4-3 and GX2005002 HIV-1 strains in the HiCapR experimental groups. The heatmap displays the average count of chimeras per 1 million mapped reads, combining data from two biological replicates for each sample.
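The normalization described for panel B (chimeras per 1 million mapped reads, averaged over two biological replicates) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the 100-nt bin size, and the representation of a chimera as a pair of arm midpoints are all assumptions made for the example.

```python
from collections import defaultdict

def contact_matrix_cpm(chimeras, mapped_reads, bin_size=100):
    """Bin chimeras into a sparse symmetric matrix scaled to counts per million.

    chimeras: iterable of (pos_a, pos_b) arm midpoints (hypothetical input format).
    mapped_reads: total mapped reads in the sample, used for scaling.
    """
    scale = 1_000_000 / mapped_reads
    matrix = defaultdict(float)
    for pos_a, pos_b in chimeras:
        # Store each contact once, with the smaller bin index first.
        i, j = sorted((pos_a // bin_size, pos_b // bin_size))
        matrix[(i, j)] += scale
    return dict(matrix)

def average_replicates(matrices):
    """Average sparse matrices bin by bin; a missing bin counts as zero."""
    keys = set().union(*matrices)
    return {k: sum(m.get(k, 0.0) for m in matrices) / len(matrices) for k in keys}

# Toy example with two replicates of different sequencing depth.
rep1 = contact_matrix_cpm([(50, 950), (60, 940), (120, 130)], mapped_reads=2_000_000)
rep2 = contact_matrix_cpm([(55, 945)], mapped_reads=1_000_000)
averaged = average_replicates([rep1, rep2])
```

Scaling each replicate before averaging keeps a deeply sequenced replicate from dominating the combined heatmap.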

Figure 1—source data 1

Designed probes for enriching HIV chimeras in fasta format.

https://cdn.elifesciences.org/articles/102550/elife-102550-fig1-data1-v1.zip
Figure 1—figure supplement 1
Workflow for identifying chimeric reads supporting RNA-RNA interactions as well as homodimers.

The rounded rectangle in the upper right corner of the figure displays the start and end positions of the 5'- and 3'-arms of chimeric reads. These arms can be ligated in multiple ways to form chimeras, some of which are thought to be produced exclusively by homodimers. First, sequencing reads are mapped to the HIV RNA genome, and chimeric reads are identified using the hyb pipeline. Dimeric chimeras are then filtered to select only those that can be formed exclusively by dimerization. To reduce background noise, especially contamination from homodimers, we selected non-overlapping chimeras for our analysis of RNA interactions. This approach allowed us to identify both RNA-RNA interactions and candidate dimer sites, and to predict secondary structures from high-coverage chimeras.
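The separation of chimeras into homodimer candidates and interaction-supporting chimeras can be sketched as below. This is a simplified illustration that uses arm overlap as the only criterion; the actual filtering in the workflow may apply further rules. The function names and the half-open (start, end) interval format are assumptions.

```python
def arms_overlap(arm_a, arm_b):
    """True if two half-open intervals (start, end) share at least one nucleotide."""
    return arm_a[0] < arm_b[1] and arm_b[0] < arm_a[1]

def split_chimeras(chimeras):
    """Split chimeras into overlapping (dimer candidates) and non-overlapping sets.

    chimeras: iterable of (arm5, arm3), each arm a (start, end) tuple
    in genome coordinates (hypothetical input format).
    """
    dimer_candidates, interactions = [], []
    for arm5, arm3 in chimeras:
        if arms_overlap(arm5, arm3):
            # Both arms cover the same sequence, which a single RNA
            # molecule cannot contribute twice.
            dimer_candidates.append((arm5, arm3))
        else:
            interactions.append((arm5, arm3))
    return dimer_candidates, interactions

example = [
    ((100, 130), (120, 150)),  # arms overlap by 10 nt
    ((100, 130), (500, 530)),  # distant arms
    ((100, 130), (130, 160)),  # adjacent arms, no shared nucleotide
]
dimers, contacts = split_chimeras(example)
```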

Figure 1—figure supplement 2
HIV genome coverage of HiCapR data.

(A) Organization of the HIV NL4-3 genome, following the NCBI annotation. (B, C) Sequencing data from each group were mapped to the HIV genome, and genome coverage was then calculated. Both the NL4-3 (B) and GX2005002 (C) strains showed uniformly complete genome coverage, whether in infected cells or virions and whether or not the RNAs were ligated. This suggests that the HiCapR method captures the entire HIV genome with low bias.
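A per-base coverage calculation of the kind summarized in panels B and C can be sketched with a difference array. This is a generic illustration, not the tool used in the study; the function names and the 0-based half-open interval format are assumptions.

```python
def per_base_coverage(intervals, genome_length):
    """Return read depth at every position from 0-based half-open alignments."""
    diff = [0] * (genome_length + 1)
    for start, end in intervals:
        # Clip alignments to the genome and skip empty intervals.
        start, end = max(start, 0), min(end, genome_length)
        if start < end:
            diff[start] += 1
            diff[end] -= 1
    coverage, depth = [], 0
    for delta in diff[:genome_length]:
        depth += delta
        coverage.append(depth)
    return coverage

def covered_fraction(coverage):
    """Fraction of positions covered by at least one read."""
    return sum(1 for depth in coverage if depth > 0) / len(coverage)

toy_coverage = per_base_coverage([(0, 3), (2, 5)], genome_length=6)
```

A covered fraction close to 1 across all sample groups is what "uniformly complete genome coverage" would correspond to in this simplified picture.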

Figure 1—figure supplement 3
Sample correlation.

The heatmap displays the pairwise correlation between samples, measured using Pearson correlation coefficients.
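A pairwise Pearson correlation matrix of this kind can be computed as sketched below. The input (one vector of binned contact counts per sample) is an assumption; the legend does not state which values were correlated, and the sample names are placeholders.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / sqrt(var_x * var_y)

def correlation_matrix(samples):
    """Return {(name_a, name_b): r} for every ordered pair of samples."""
    names = list(samples)
    return {(a, b): pearson(samples[a], samples[b]) for a in names for b in names}

# Hypothetical flattened contact vectors for three samples.
toy_samples = {
    "cell_rep1": [1, 2, 3, 4],
    "cell_rep2": [2, 4, 6, 8],
    "control": [4, 3, 2, 1],
}
corr = correlation_matrix(toy_samples)
```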

Figure 1—figure supplement 4
Contact matrix derived from infected cells and virions of both NL4-3 and GX2005002 HIV-1 strains.

Related to Figure 1B, but with the inclusion of information from individual replicates and data from non-ligation controls, offering a more detailed view of the interactions and distributions presented in Figure 1B.

Figure 2 with 1 supplement
Conformations of HIV-1 5'-UTR.

(A) Known conformation of HIV-1 5'-UTR supported by chimeras. Previously reported stems in the 5’-UTR are supported by chimeras from HiCapR. Colors of the nucleotides indicate the log2 transformed base pairing scores. (B) Representation of the one-dimensional structure of the HIV-1 5'-UTR, highlighting the conservation between the GX2005002 and NL4-3 strains in this region. The diagram includes rectangular boxes denoting the locations of key structural elements, with numerical coordinates referencing NCBI DNA genome coordinates. Dashed boxes indicate regions that are either absent or distinct in GX2005002 compared to NL4-3. Small triangular arrows indicate insertions in the GX2005002 5'-UTR. At the top, a sequence logo displays the consensus nucleotides in the SL1 region. (C) Multidimensional scaling (MDS) plot clustering each of the 1000 computationally predicted structures of the 5’-UTR in two strains under two conditions. (D) Base pairing probability matrices were calculated from 1000 computationally predicted structures, with color indicating the percentage of structures supporting the specific base pair.
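The quantity plotted in panel D, the percentage of predicted structures that contain a given base pair, can be sketched as follows. The dot-bracket input format and the function names are assumptions, and this simple parser does not handle pseudoknot notation.

```python
from collections import Counter

def pairs_from_dot_bracket(structure):
    """Return the set of (i, j) base pairs encoded in a dot-bracket string."""
    stack, pairs = [], set()
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            # Each closing bracket pairs with the most recent open one.
            pairs.add((stack.pop(), i))
    return pairs

def pair_support(structures):
    """Percentage of structures in the ensemble that contain each base pair."""
    counts = Counter()
    for structure in structures:
        counts.update(pairs_from_dot_bracket(structure))
    return {pair: 100.0 * n / len(structures) for pair, n in counts.items()}

# Toy ensemble of four structures for a 6-nt sequence.
ensemble = ["((..))", "(....)", "......", "((..))"]
support = pair_support(ensemble)
```

In the toy ensemble the outer pair appears in three of four structures and the inner pair in two, which corresponds to 75% and 50% support.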

Figure 2—figure supplement 1
Dimer and monomer conformations of HIV-1 5'-UTR.

The established conformation of the HIV-1 5'-UTR, supported by chimeras in NL4-3 and GX2005002, both in cells and in virions. Colors of the nucleotides indicate the log2 transformed base pairing scores.

Figure 3 with 8 supplements
Identification and validation of homodimers in HIV-1 genome.

(A) Visual depiction of the RNA homodimer formation process: Inter-ligated fragments (arms) originating from homodimeric RNA molecules generate chimeras, where each arm aligns to overlapping coordinates on the HIV-1 genome. This process enables the plotting of coverage and specific details of dimers by utilizing the positions and counts of these overlapping chimeras. (B) Distribution of homodimers throughout the HIV-1 genome. The blue line plot shows the homodimer coverage derived from ligated samples of NL4-3 infected cells along the HIV-1 genome. Arc plots show discontinuous reads in non-ligated samples of NL4-3 infected cells, with dark red segments indicating peaks of homodimers. The lower panels depict the base pairing of homodimers in the SL1 region of the 5’-UTR and downstream of the RRE region. Color mapping indicates the log2 transformed dimer score. (C) Assessment of dimer self-binding using Bio-Layer Interferometry (BLI). The data are from three separate experiments. The dissociation constant (Kd), shown as mean ± standard deviation, was determined from these experiments.
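One way to turn overlapping chimeras into a homodimer coverage track like the line plot in panel B is sketched below. Counting only the segment shared by both arms is an assumption made for this example; the legend does not specify how coverage was accumulated, and the function name is hypothetical.

```python
def dimer_coverage(dimer_chimeras, genome_length):
    """Count, per position, the dimer chimeras whose two arms both cover it.

    dimer_chimeras: iterable of (arm5, arm3), each a half-open (start, end)
    interval (hypothetical input format).
    """
    coverage = [0] * genome_length
    for (start5, end5), (start3, end3) in dimer_chimeras:
        # Segment shared by both arms; empty if the arms do not overlap.
        shared_start = max(start5, start3, 0)
        shared_end = min(end5, end3, genome_length)
        for pos in range(shared_start, shared_end):
            coverage[pos] += 1
    return coverage

toy_track = dimer_coverage([((2, 6), (4, 9)), ((3, 7), (5, 8))], genome_length=10)
```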

Figure 3—figure supplement 1
Statistics of 3'–5', 5'–3', and dimer chimeras.

The stacked bar chart displays the number of 3'–5', 5'–3', and dimer chimeras in various ranges across the sample groups. The panel on the left pertains to NL4-3, while the one on the right pertains to GX2005002.

Figure 3—figure supplement 2
Homodimers around HIV-1 5’-UTR.

(A) Heatmaps simultaneously plot contact matrices calculated from non-overlapping chimeras, which are informative for interactions, and from dimer chimeras, which are derived from homodimers. Coordinates in the heatmaps are RNA-based, with the numbers in brackets indicating the corresponding coordinates in the DNA genome. (B) Details of homodimer chimeras around splicing sites, with the 5'-arm represented by blue lines and the 3'-arm by yellow lines.

Figure 3—figure supplement 3
Dimer score matrices around 5’-UTR of NL4-3 and GX2005002 strains in different stages.

Heatmaps show the dimer score between the X and Y coordinates.

Figure 3—figure supplement 4
Distribution and motif enrichment of homodimers along HIV genome.

(A) Homodimer coverage along the HIV-1 genome in NL4-3 and GX2005002, in both cell and virion samples. Gag-CLIP data from Kutluay et al., 2014 are also shown. The bottom panel depicts the delta G of each 100-nt window in the NL4-3 genome. (B) Distribution of the enriched motif in homodimer peaks of NL4-3; the region similar to the SRSF motif is underlined in the sequence logo. (C) Distribution of the enriched motif in homodimer peaks of GX2005002; the region similar to the SRSF motif is underlined in the sequence logo.

Figure 3—figure supplement 5
Homodimers around splicing sites.

Top: Details of homodimer chimeras around splicing sites, with the 5'-arm represented by blue lines and the 3'-arm by yellow lines. Known splicing regulatory elements are also indicated. Bottom: Base pairing of homodimers around splicing sites as indicated. Color code indicates the log2 transformed base pairing score supporting each base pair.

Figure 3—figure supplement 6
Homodimers around RRE.

(A) Contact matrix and homodimeric chimeras surrounding the RRE region in the HIV genome. The left panel displays the contact matrix derived from non-overlapping chimeras and the contact matrix derived from dimeric chimeras. The right panel presents specific details of the dimeric chimeras around the RRE elements. (B) Base pairing of homodimers in the extended RRE region in specific samples. Color mapping indicates the log2 transformed dimer score.

Figure 3—figure supplement 7
Validation of homodimers around RRE.

(A) Native agarose gel confirming the dimeric conformation of RRE (extended to 620 nt). (B) Agilent TapeStation 4200 capillary electrophoresis confirming the dimeric conformation of 5’-UTR and RRE (extended to 620 nt).

Figure 3—figure supplement 8
Predicted structure of extended RRE.

Color code indicates log2 transformed base pairing score supporting each base-pair.

Figure 4 with 1 supplement
Genome domains along HIV-1 genome.

(A) Each panel shows the triangular contact matrix and genome domains of two strains of the virus in both cell and virion states, as calculated using C-world. The x-axis represents the genomic position, while the y-axis represents the insulation score (solid lines) and dimer coverage (dashed lines). (B) Correlation of insulation scores in infected cells and virions for the NL4-3 and GX2005002 strains. (C) Violin and boxplot comparing the boundary strength of the genomic domains of two strains between infected cells and virions. The boxplot displays the median, quartiles, and range of the boundary strength values for each strain in each state.
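The insulation score plotted in panel A can be illustrated with a simplified sliding-square calculation on a binned contact matrix. This sketch is not the C-world implementation: the window size, the log2 normalization to the genome-wide mean, and the function name are assumptions chosen for the example.

```python
from math import log2

def insulation_scores(matrix, window):
    """Log2 ratio of cross-bin contacts to their genome-wide mean, per bin.

    For bin i, contacts between the `window` bins upstream and the `window`
    bins starting at i are averaged. Bins too close to either end, or with
    zero contacts, get None.
    """
    n = len(matrix)
    raw = []
    for i in range(n):
        if i < window or i + window > n:
            raw.append(None)
            continue
        total = sum(matrix[a][b]
                    for a in range(i - window, i)
                    for b in range(i, i + window))
        raw.append(total / (window * window))
    valid = [value for value in raw if value is not None]
    if not valid:
        return raw
    mean = sum(valid) / len(valid)
    return [None if value is None or value == 0 else log2(value / mean)
            for value in raw]

# Toy matrix: two 3-bin domains with strong internal contacts (10)
# and weak contacts between domains (1).
toy_matrix = [[10 if a // 3 == b // 3 else 1 for b in range(6)] for a in range(6)]
scores = insulation_scores(toy_matrix, window=2)
```

In the toy matrix the lowest score falls at bin 3, where the second domain starts; local minima of this kind are what mark domain boundaries.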

Figure 4—figure supplement 1
Average contact matrices around peak centers.

Heatmaps showing the normalized average interaction frequencies for all peak centers as well as their flanking regions (±0.5 peak width). The heatmaps were binned at 10 nt resolution.
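Averaging contact submatrices around a set of peak centers can be sketched as below. For simplicity this example uses a fixed flank measured in bins rather than rescaling each window to ±0.5 peak width as in the figure, so it is an approximation of the idea rather than the procedure used; the function name is hypothetical.

```python
def average_submatrix(matrix, centers, flank):
    """Average the square submatrices centered on each peak-center bin.

    matrix: dense square contact matrix (list of lists).
    centers: bin indices of peak centers.
    flank: number of bins kept on each side of a center.
    Centers whose window would run off the matrix are skipped.
    """
    n = len(matrix)
    size = 2 * flank + 1
    total = [[0.0] * size for _ in range(size)]
    used = 0
    for center in centers:
        if center - flank < 0 or center + flank >= n:
            continue
        for di in range(size):
            for dj in range(size):
                total[di][dj] += matrix[center - flank + di][center - flank + dj]
        used += 1
    if used == 0:
        return None
    return [[value / used for value in row] for row in total]

# Toy 5 x 5 matrix whose entries encode their own position.
toy = [[row * 5 + col for col in range(5)] for row in range(5)]
pileup = average_submatrix(toy, centers=[0, 1, 3], flank=1)
```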

Figure 5 with 2 supplements
Long-range interaction in HIV-1 genome.

(A) Heatmaps of enriched interactions obtained from NL4-3 and GX2005002 infected cells and virions. The upper diagonal shows interactions from infected cells, while the lower diagonal region displays interactions from virions. (B) Viewpoint lines depict the binding positions of the 5’-UTR along the HIV-1 genome. The gray rectangles indicate the viewpoint regions. The colors of the lines represent specific samples, with samples from virions shown as dashed lines. (C) Contact matrices and base pairing details between 5’-UTR and 3’-UTR. The top panels display heatmaps indicating contact probability, with the color bar indicating chimeric reads per 1 M reads in each specific sample. The bottom panels show base pairing colored by base pairing scores.
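A viewpoint profile like the lines in panel B can be sketched by counting, for every genome bin, the chimeras that link it to the viewpoint region. The input format (pairs of arm midpoints), the bin size, and the function name are assumptions; chimeras with both arms inside the viewpoint are ignored in this sketch.

```python
def viewpoint_profile(chimeras, viewpoint, genome_length, bin_size=100):
    """Count chimeras joining the viewpoint region to each genome bin.

    chimeras: iterable of (pos_a, pos_b) arm midpoints within the genome.
    viewpoint: half-open (start, end) interval of the viewpoint region.
    """
    vp_start, vp_end = viewpoint
    n_bins = -(-genome_length // bin_size)  # ceiling division
    profile = [0] * n_bins
    for pos_a, pos_b in chimeras:
        a_inside = vp_start <= pos_a < vp_end
        b_inside = vp_start <= pos_b < vp_end
        if a_inside and not b_inside:
            profile[pos_b // bin_size] += 1
        elif b_inside and not a_inside:
            profile[pos_a // bin_size] += 1
    return profile

toy_chimeras = [(50, 950), (980, 20), (400, 600), (10, 30)]
profile = viewpoint_profile(toy_chimeras, viewpoint=(0, 100), genome_length=1000)
```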

Figure 5—figure supplement 1
Length distribution of interactions across the HIV genome.

(A) Histograms show distributions of different types of interactions across the HIV genome. The x-axis represents the span of the interactions, while the y-axis represents the frequency of each interaction type. (B) Contact probability decay curves (P(s) curves) showing progressive reconfiguration of HIV genomes. Both strains of the virus exhibit faster decay rates within virions than in cells.
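A contact probability decay curve of the kind shown in panel B can be sketched from a binned contact matrix by averaging each off-diagonal. Normalizing the curve to sum to 1 and the function name are assumptions for this example; the figure may use a different normalization or distance binning.

```python
def contact_decay(matrix):
    """Mean contact frequency at each bin separation s = 1 .. n-1, normalized.

    matrix: dense square contact matrix (list of lists).
    Returns a list whose entry k is the normalized mean contact frequency
    at separation k + 1 bins.
    """
    n = len(matrix)
    means = []
    for separation in range(1, n):
        diagonal = [matrix[i][i + separation] for i in range(n - separation)]
        means.append(sum(diagonal) / len(diagonal))
    total = sum(means)
    return [m / total for m in means] if total else means

# Toy matrix in which contacts halve with every additional bin of separation.
toy_contacts = [[8 / 2 ** abs(a - b) for b in range(4)] for a in range(4)]
decay = contact_decay(toy_contacts)
```

A steeper fall of this curve with increasing separation corresponds to the faster decay described for virions.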

Figure 5—figure supplement 2
Base pairing of long-range interactions involving the 5’-UTR.

Contact matrices and base pairing details between 5’-UTR and 2.2 K (A), 6 K (B) and 7.6~8.5 K (C).

Cite this article

  1. Yan Zhang
  2. Jingwan Han
  3. Xie Dejian
  4. Wenlong Shen
  5. Ping Li
  6. Jian You Lau
  7. Jingyun Li
  8. Lin Li
  9. Grzegorz Kudla
  10. Zhihu Zhao
(2025)
Mapping HIV-1 RNA structure, homodimers, long-range interactions and persistent domains by HiCapR
eLife 13:RP102550.
https://doi.org/10.7554/eLife.102550.3