Calibrated analysis framework for nanopore direct RNA sequencing uncovers cell-specific m⁶A stoichiometry at conserved sites

Denise Ohnezeit
Elene Loliashvili
Gregory Putzel
Ruth Verstraten
Jianheng Liu
Luke S Nicholson
Alejandro Pironti
Samie R Jaffrey
Daniel P Depledge
Angus C Wilson

Department of Microbiology, New York University School of Medicine, New York, United States
Institute of Virology, Hannover Medical School, Hanover, Germany
Antimicrobial-Resistant Pathogens Program, New York University Grossman School of Medicine, New York, United States
German Center for Infection Research (DZIF), partner site Hannover-Braunschweig, Germany
Department of Pharmacology, Weill Medical College, Cornell University, New York, United States
Cluster of Excellence RESIST (EXC 2155), Hannover Medical School, Hanover, Germany

https://doi.org/10.7554/eLife.110672.1

Open access

Figures and data

Workflow for m⁶A mapping in different human cell types using orthogonal sequencing approaches.
(A) Schematic of the experimental setup. RNA was isolated from NHDFs and differentiated HD10.6 cells treated with STM2457 or DMSO for 48 h. Poly(A) RNA was subjected to DRS and GLORI from the same input material. IVT RNA was generated in parallel to assess potential false-positive modification calls. (B) Workflow for benchmarking Dorado basecalling. Reads were processed with Dorado versions v0.5.0 through v0.9.0, aligned to the human genome or transcriptome, and analyzed with ONT’s Modkit tool. Multiple filtering strategies were tested against IVT controls and METTL3 inhibition (STM2457). Resulting m⁶A sites were compared with GLORI data from the same input RNA. (C) Final benchmarked Dorado pipeline to detect high-confidence m⁶A sites.

Modification probability of 0.98 and stoichiometry score of > 10% are necessary cutoffs to identify m⁶A sites accurately.
(A) Dorado v0.9.0-generated modification probability distributions for all-context m⁶A from genome-aligned DMSO- and STM2457-treated NHDFs plotted against the ones from IVT NHDF poly(A) RNA. DMSO probability score distribution is shown as a red solid line, STM2457 - red dashed line, IVT - grey solid line. (B) Same as in (A), but for DRACH-context m⁶A sites. (C) Stoichiometry (> 10%) distributions for all-context m⁶A from DMSO- (in red) or STM2457-treated (in pink) NHDF datasets (genome-aligned) are plotted using a 0.98 modification probability cutoff and a filter for > 20 reads. (D) Same as in (C), but for DRACH-context m⁶A sites. (E) Overlap of m⁶A sites between all-context and DRACH models (A basecall accuracy > 80%, coverage > 20 reads, stoichiometry > 10%), comparing non-filtered to filtered (modification probability > 0.98) outputs. (F) Metagene plots showing the density of m⁶A (all-context - dark red, DRACH-context - bright red) sites across all annotated poly(A) transcripts (left) and monoexonic poly(A) transcripts (right) from genome-aligned DMSO-treated NHDF datasets filtered with 0.98 modification probability, > 20 coverage and > 10% stoichiometry.
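
The three cutoffs named in this caption (modification probability > 0.98, coverage > 20 reads, stoichiometry > 10%) can be illustrated with a minimal Python sketch. It is not the authors' pipeline and does not reproduce Modkit's exact accounting: the `Site` record, the per-read probability list and the definition of stoichiometry as the share of covering reads above the probability cutoff are simplifying assumptions made for illustration only.

```python
from dataclasses import dataclass, field

# Thresholds taken from the caption; everything else here is hypothetical.
PROB_CUTOFF = 0.98   # per-read modification probability
MIN_COVERAGE = 20    # reads covering the site (strictly greater than)
MIN_STOICH = 10.0    # percent modified (strictly greater than)


@dataclass
class Site:
    """Hypothetical record: a site name and its per-read modification probabilities."""
    name: str
    read_probs: list = field(default_factory=list)


def stoichiometry(site, prob_cutoff=PROB_CUTOFF):
    """Percent of covering reads whose modification probability exceeds the cutoff.

    Simplification: every covering read counts in the denominator.
    """
    if not site.read_probs:
        return 0.0
    modified = sum(1 for p in site.read_probs if p > prob_cutoff)
    return 100.0 * modified / len(site.read_probs)


def passes_filters(site):
    """Apply the coverage and stoichiometry cutoffs after probability thresholding."""
    return len(site.read_probs) > MIN_COVERAGE and stoichiometry(site) > MIN_STOICH


def high_confidence(sites):
    """Names of the sites that pass all three cutoffs."""
    return [s.name for s in sites if passes_filters(s)]
```

In this sketch a site covered by 25 reads, 5 of which exceed the probability cutoff, has a stoichiometry of 20% and is retained, whereas a site with the same 5 confident reads but only 15 reads of coverage is dropped.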

Comparison of m⁶A sites detected by Dorado and GLORI.
(A) Filtering strategy for Dorado vs GLORI to obtain a high-confidence list of m⁶A sites in NHDFs. (B) Stoichiometry distributions are plotted from genome-aligned DMSO- or STM2457-treated NHDF datasets using the indicated Dorado models and the filtering strategy shown in (A). (C) Stoichiometry distributions from GLORI performed on DMSO- or STM2457-treated NHDFs are plotted, using the filtering strategy shown in (A). In addition, m⁶A sites were filtered for DRACH motifs (right plot). (D) Overlap of high confidence m⁶A sites detected by Dorado vs GLORI. GLORI datasets were filtered using increasing m⁶A stoichiometry cutoffs (from left to right). (E) m⁶A sites in the SPEN gene detected by GLORI and Dorado. Boxplots (right) represent the m⁶A stoichiometries per m⁶A site shown on the left. (F) SPEN isoforms with m⁶A sites, determined by Dorado. Boxplots (right) show the distribution of m⁶A stoichiometries on the m⁶A sites on SPEN-201 and SPEN-202.
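
The overlap comparison in panel (D) can be sketched as a set intersection repeated over a series of GLORI stoichiometry cutoffs. The Python sketch below is illustrative only: the `(chrom, pos, strand)` site key, the function name `overlap_by_cutoff` and any cutoff values passed to it are assumptions, since the caption does not state the cutoffs or the site representation used.

```python
def overlap_by_cutoff(dorado_sites, glori_sites, cutoffs):
    """Count shared and method-specific sites at each GLORI stoichiometry cutoff.

    dorado_sites: set of (chrom, pos, strand) keys (hypothetical representation).
    glori_sites:  dict mapping the same kind of key to GLORI stoichiometry in percent.
    cutoffs:      iterable of stoichiometry cutoffs in percent (hypothetical values).
    """
    result = {}
    for cutoff in cutoffs:
        # Keep only GLORI sites above the current cutoff.
        kept = {key for key, stoich in glori_sites.items() if stoich > cutoff}
        result[cutoff] = {
            "glori": len(kept),
            "shared": len(kept & dorado_sites),
            "dorado_only": len(dorado_sites - kept),
            "glori_only": len(kept - dorado_sites),
        }
    return result
```

Raising the cutoff can only shrink the GLORI set, so the shared count is non-increasing from left to right while the Dorado-only count is non-decreasing.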

Comparative analysis of m⁶A modification patterns in NHDFs and HD10.6 cells using Dorado v0.9.0.
(A) Overlap of high confidence m⁶A sites between NHDFs and HD10.6 cells. (B) Correlation of shared m⁶A stoichiometries (n=38,217). (C) Boxplots showing the m⁶A stoichiometry distributions of the 38,217 sites detected in both NHDFs and HD10.6 cells. (D+E) Motif analysis showing m⁶A (all-context and DRACH-context) motif logos, 5-mer distributions and modification stoichiometries in NHDFs. (F+G) Same as in (D+E) but for HD10.6 cells. (H) KEGG pathway analysis (DAVID Bioinformatics Resources) of shared m⁶A-containing genes. (I) m⁶A stoichiometries of representative examples of individual genes from enriched pathways (H). ACTG1 serves as a housekeeping control.

Evaluation of Dorado by base and modification probability distributions.
(A) Dorado v0.9.0-generated basecall accuracy probability values for A (all-context and DRACH-context basecaller models) from genome-aligned DMSO- and STM2457-treated NHDF samples against IVT poly(A) NHDF RNA. A probability score distribution is shown as a black solid line for DMSO, STM2457 - black dashed line, IVT - grey solid line. (B) Dorado v0.9.0-generated modification probability distributions, visualized as histograms, for DRACH-context m⁶A from genome-aligned DMSO- and STM2457-treated NHDFs and IVT NHDFs. IVT modification probability score distribution is shown in grey, DMSO - bright red, STM2457 - dark red. (C) Same as in (B), but for all-context m⁶A sites.

Modification probability cutoffs are necessary to identify ψ and inosine sites more accurately.
(A) Dorado v0.9.0-generated basecall accuracy probability values for U (left) and modification probability distributions for ψ (right) from genome-aligned DMSO- and STM2457-treated NHDF samples are plotted against IVT poly(A) RNA. ψ probabilities are visualized in blue solid and dashed lines for DMSO and STM2457, respectively and in grey solid line for IVT. (B) Same as in (A) but showing A base accuracy probabilities and inosine sites. Inosine probabilities are visualized in purple solid and dashed lines for DMSO and STM2457, respectively and in grey solid line for IVT. (C) Same as in (A) but showing C base accuracy probabilities and m⁵C sites. m⁵C probabilities are visualized in green solid and dashed lines for DMSO and STM2457, respectively and in grey solid line for IVT. (D+E) DRACH-context and all-context m⁶A modification probability distributions (red) from genome-aligned DMSO- and STM2457-treated NHDFs are plotted against RCS negative control (grey).

Modification probability cutoffs are necessary to identify modification sites accurately, shown for transcriptome-aligned data.
Dorado v0.9.0-generated basecall accuracy probability values for A, U and C (left, A-E) and modification probability distributions for m⁶A (all-context), m⁶A (DRACH-context), ψ, inosine and m⁵C (right, A-E) in transcriptome-aligned DMSO- and STM2457-treated NHDFs, plotted against the values from IVT NHDF poly(A) RNA.

Modification probability distribution profiles in genome-aligned DMSO- and STM2457-treated HD10.6 cells.
Dorado v0.9.0-generated basecall accuracy probability values for A, U and C (left, A-E) and modification probability distributions for m⁶A (all-context), m⁶A (DRACH-context), ψ, inosine and m⁵C (right, A-E) in human genome-aligned DMSO- and STM2457-treated HD10.6 cells, plotted against the values from IVT NHDF poly(A) RNA.

Modification probability distribution profiles in transcriptome-aligned DMSO- and STM2457-treated HD10.6 cells.
Dorado v0.9.0-generated basecall accuracy probability values for A, U and C (left, A-E) and modification probability distributions for m⁶A (all-context), m⁶A (DRACH-context), ψ, inosine and m⁵C (right, A-E) in human transcriptome-aligned DMSO- and STM2457-treated HD10.6 cells, plotted against the values from IVT NHDF poly(A) RNA.

Modification site probabilities detected with different versions of Dorado and compared to RCS.
(A+B) DRACH-context and all-context m⁶A modification probability distributions from genome-aligned DMSO (solid red line)- and STM2457 (dashed red line)-treated NHDFs are plotted against RCS negative control (grey). (C) Modification probability distributions for all-context m⁶A (Dorado v0.7.0, v0.8.0 and v0.9.0), the bottom panel showing the distributions in 0.95-1.00 modification range. (D) Same as in (C), but for DRACH-context m⁶A (Dorado v0.5.0, v0.6.0, v0.8.0 and v0.9.0). (E) Same as in (C), but for ψ (Dorado v0.7.0, v0.8.0, v0.9.0). All the plots in C-E are generated from genome-aligned DMSO-treated NHDF datasets and are plotted against RCS negative controls.

Evaluation of Dorado by stoichiometry distributions.
(A) Stoichiometry distributions for all-context m⁶A from DMSO- (in red) or STM2457-treated (in pink) NHDF datasets (genome-aligned and filtered for > 20 reads) are plotted, comparing 0.98 modification probability-unfiltered (left) and filtered (right) sites. (B) Same as in (A), but for DRACH-context m⁶A sites.

Stoichiometry distributions of m⁶A, ψ and inosine in IVT datasets.
(A) All-context m⁶A stoichiometry distributions (all stoichiometries (top) and > 10% stoichiometries (bottom)) in (genome-aligned) IVT RNA from DMSO-treated NHDFs, comparing modification probability-unfiltered and filtered data. The bottom panel shows stoichiometry distributions only for > 10% stoichiometry sites. (B) As in (A), but for DRACH-context m⁶A. (C) As in (A), but for ψ. (D) As in (A), but for inosine.

Evaluation of Dorado by stoichiometry distributions in transcriptome-aligned NHDFs.
(A) Stoichiometry distributions for all-context m⁶A from DMSO- (in red) or STM2457-treated (in pink) NHDF datasets (transcriptome-aligned and filtered for > 20 reads) are plotted, assessing the 0.98 modification probability cutoff. The bottom panel shows stoichiometry distributions only for > 10% stoichiometry sites. (B) Same as in (A), but for DRACH-context m⁶A sites.

Evaluation of Dorado by stoichiometry distributions in transcriptome-aligned HD10.6 cells.
(A) Stoichiometry distributions for all-context m⁶A from DMSO- (in red) or STM2457-treated (in pink) HD10.6 datasets (transcriptome-aligned and filtered for > 20 reads) are plotted, assessing the 0.98 modification probability cutoff. The bottom panel shows stoichiometry distributions only for > 10% stoichiometry sites. (B) Same as in (A), but for DRACH-context m⁶A sites.

Stoichiometry distributions of ψ and inosine in NHDFs.
(A) Distributions of all (left) and > 10% (right) ψ stoichiometries in the 0.99 modification probability-unfiltered and filtered data from genome-aligned DMSO- (darker color) and STM2457-treated (lighter color) NHDFs (filtered for > 20 reads). (B) Same as in (A), but for inosine.

Stoichiometry distributions of ψ and inosine in HD10.6 cells.
(A) Distributions of all and > 10% ψ stoichiometries in the 0.99 modification probability-unfiltered and filtered data from genome-aligned DMSO- (darker color) and STM2457-treated (lighter color) HD10.6 cells (filtered for > 20 reads). (B) Same as in (A), but for inosine.

m⁶A and ψ stoichiometry distributions in different versions of Dorado.
Stoichiometry distributions (all stoichiometries - left, > 10% stoichiometries - right) in a line graph format for (A) all-context m⁶A detected in genome-aligned DMSO-treated NHDF dataset, with Dorado versions 0.7.0, 0.8.0 and 0.9.0 colored in gradients of red, darkness increasing with the later versions. (B) DRACH-context m⁶A detected with Dorado versions 0.5.0, 0.6.0, 0.8.0 and 0.9.0, colored in gradients of red, darkness increasing with the later versions. (C) ψ detected with Dorado versions 0.7.0, 0.8.0 and 0.9.0, colored in gradients of blue, darkness increasing with the later versions. All the datasets were filtered for > 20 reads.

Comparison of modification sites detected with different versions of Dorado.
Upset plots showing unique and shared detected modifications (filtered for > 20 reads, > 10% stoichiometry) in modification probability-unfiltered and filtered genome-aligned DMSO-treated NHDF data. The versions are ordered on the plot by the number of detected sites, in ascending order (on the left). Results are shown for the following models and versions: (A) all-context m⁶A detected with v0.7.0, v0.8.0 and v0.9.0, (B) DRACH-context m⁶A detected with v0.5.0, v0.6.0, v0.8.0 and v0.9.0, (C) ψ detected with v0.7.0, v0.8.0 and v0.9.0. (D) Nucleotide frequencies in the 5-base motif of ψ sites for uniquely v0.7.0-, v0.8.0-0.9.0- and v0.7.0-0.8.0-0.9.0-detected ψ sites. Position 0 marks the modification site.

Comparison of m⁶A sites detected by Dorado and GLORI-Sequencing in NHDFs and HD10.6s.
(A) Upset plot showing the number of m⁶A sites detected by GLORI and Dorado in NHDFs and their overlaps. Dorado sites represent high-confidence m⁶A sites obtained after filtering DRACH-model detected data. Similarly, only high confidence m⁶A sites detected by GLORI are shown but were additionally filtered for DRACH sites. (B) Stoichiometry distributions are plotted from genome-aligned DMSO- or STM2457-treated HD10.6 datasets using the indicated Dorado models. (C) Stoichiometry distributions from GLORI performed on DMSO- or STM2457-treated NHDFs are plotted. In addition, m⁶A sites were filtered for DRACH motifs (right plot). (D) Stoichiometry distributions are plotted from transcriptome-aligned DMSO- or STM2457-treated NHDF datasets using the indicated Dorado models. (E) Stoichiometry distributions are plotted from transcriptome-aligned DMSO- or STM2457-treated HD10.6 datasets using the indicated Dorado models.

m⁶A (all-context and DRACH-context) motif logos for modification probability-unfiltered NHDF and HD10.6 data.
(A) 5-mer motif logo (showing nucleotide frequencies at each position) of m⁶A-modified sites (position 0 = modified A) from genome-aligned DMSO-treated NHDFs (all-context - left, DRACH-context - right). All the datasets were filtered for > 20 coverage & > 10% stoichiometry. (B) Same as (A), but for HD10.6 cells.
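
The per-position nucleotide frequencies that underlie a 5-mer motif logo can be computed as in the following Python sketch. It assumes the 5-mers have already been extracted with the modified A at the central position; the function name `position_frequencies` and the example sequences in the usage note are hypothetical and not taken from the study.

```python
from collections import Counter


def position_frequencies(kmers):
    """Return {offset: {base: frequency}} for equal-length k-mers.

    Offsets are relative to the central base, so for 5-mers they run
    from -2 to +2 and offset 0 is the modified position.
    """
    if not kmers:
        return {}
    k = len(kmers[0])
    if any(len(kmer) != k for kmer in kmers):
        raise ValueError("all k-mers must have the same length")
    half = k // 2
    freqs = {}
    for i in range(k):
        # Count the bases observed at column i across all k-mers.
        counts = Counter(kmer[i] for kmer in kmers)
        total = sum(counts.values())
        freqs[i - half] = {base: n / total for base, n in counts.items()}
    return freqs
```

For example, the four made-up 5-mers `GGACT`, `AGACA`, `GAACT` and `GGACC` give a frequency of 1.0 for A at offset 0 and 0.75 for G at offset -2.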

Metagene plots showing m⁶A (all-context and DRACH-context) distribution profiles in NHDF and HD10.6 data filtered at different stoichiometries.
(A) Metagene plots showing the density of m⁶A (all-context - dark red, DRACH-context - bright red) sites across all annotated poly(A) transcripts (upper plot) and monoexonic poly(A) transcripts (lower plot) from genome-aligned DMSO-treated NHDF datasets filtered with 0.98 modification probability, > 20 coverage and > 10% stoichiometry. (B) The density of all-context m⁶A sites (0.98 modification probability-unfiltered - left, filtered - right) is plotted across all annotated poly(A) transcripts of NHDF (dark red) and HD10.6 (bright red) datasets, filtered with different stoichiometries (descending: all stoichiometries, > 50%, > 10% and < 10%). All datasets were filtered additionally for > 20 coverage & > 10% stoichiometry. (C) Same as (A), but for DRACH-context m⁶A sites.
