Workflow for m6A mapping in different human cell types using orthogonal sequencing approaches.

(A) Schematic of the experimental setup. RNA was isolated from NHDFs and differentiated HD10.6 cells treated with STM2457 or DMSO for 48 h. Poly(A) RNA was subjected to DRS and GLORI from the same input material. IVT RNA was generated in parallel to assess potential false-positive modification calls. (B) Workflow for benchmarking Dorado basecalling. Reads were processed with Dorado versions v0.5.0 through v0.9.0, aligned to the human genome or transcriptome, and analyzed with ONT’s Modkit tool. Multiple filtering strategies were tested against IVT controls and METTL3 inhibition (STM2457). Resulting m6A sites were compared with GLORI data from the same input RNA. (C) Final benchmarked Dorado pipeline to detect high-confidence m6A sites.

A modification probability cutoff of 0.98 and a stoichiometry cutoff of > 10% are necessary to identify m6A sites accurately.

(A) Dorado v0.9.0-generated modification probability distributions for all-context m6A from genome-aligned DMSO- and STM2457-treated NHDFs plotted against the ones from IVT NHDF poly(A) RNA. DMSO probability score distribution is shown as a red solid line, STM2457 - red dashed line, IVT - grey solid line. (B) Same as in (A), but for DRACH-context m6A sites. (C) Stoichiometry (> 10%) distributions for all-context m6A from DMSO- (in red) or STM2457-treated (in pink) NHDF datasets (genome-aligned) are plotted using a 0.98 modification probability cutoff and a filter for > 20 reads. (D) Same as in (C), but for DRACH-context m6A sites. (E) Overlap of m6A sites between all-context and DRACH models (A basecall accuracy > 80%, coverage > 20 reads, stoichiometry > 10%), comparing non-filtered to filtered (modification probability > 0.98) outputs. (F) Metagene plots showing the density of m6A (all-context - dark red, DRACH-context - bright red) sites across all annotated poly(A) transcripts (left) and monoexonic poly(A) transcripts (right) from genome-aligned DMSO-treated NHDF datasets filtered with 0.98 modification probability, > 20 coverage and > 10% stoichiometry.
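
The site-level cutoffs named in this legend (> 20 reads, > 10% stoichiometry) can be illustrated with a minimal Python sketch. The `Site` record, its field names and the toy values are hypothetical and do not reflect the authors' code or the Modkit output format; the 0.98 modification probability cutoff is assumed to have been applied upstream, when per-read calls were tallied into `n_mod`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    """Hypothetical per-site record (not the Modkit output schema)."""
    chrom: str
    pos: int
    strand: str
    n_valid: int  # reads covering the site with a usable call
    n_mod: int    # reads called modified (probability cutoff assumed applied upstream)

    @property
    def stoichiometry(self) -> float:
        # Percentage of covering reads that carry the modification.
        return 100.0 * self.n_mod / self.n_valid if self.n_valid else 0.0


def filter_sites(sites, min_coverage=20, min_stoich=10.0):
    """Keep sites with coverage > min_coverage and stoichiometry > min_stoich (%)."""
    return [
        s for s in sites
        if s.n_valid > min_coverage and s.stoichiometry > min_stoich
    ]


# Toy data for illustration only.
sites = [
    Site("chr1", 100, "+", 50, 25),   # 50% stoichiometry: kept
    Site("chr1", 200, "+", 20, 10),   # coverage is not > 20: dropped
    Site("chr2", 300, "-", 40, 4),    # exactly 10%, not > 10%: dropped
    Site("chr2", 400, "-", 100, 11),  # 11% stoichiometry: kept
]
kept = filter_sites(sites)
```

Both comparisons are strict, matching the "> 20" and "> 10%" wording of the legend.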

Comparison of m6A sites detected by Dorado and GLORI.

(A) Filtering strategy for Dorado vs GLORI to obtain a high-confidence list of m6A sites in NHDFs. (B) Stoichiometry distributions are plotted from genome-aligned DMSO- or STM2457-treated NHDF datasets using the indicated Dorado models and the filtering strategy shown in (A). (C) Stoichiometry distributions from GLORI performed on DMSO- or STM2457-treated NHDFs are plotted, using the filtering strategy shown in (A). In addition, m6A sites were filtered for DRACH motifs (right plot). (D) Overlap of high confidence m6A sites detected by Dorado vs GLORI. GLORI datasets were filtered using increasing m6A stoichiometry cutoffs (from left to right). (E) m6A sites in the SPEN gene detected by GLORI and Dorado. Boxplots (right) represent the m6A stoichiometries per m6A site shown on the left. (F) SPEN isoforms with m6A sites, determined by Dorado. Boxplots (right) show the distribution of m6A stoichiometries on the m6A sites on SPEN-201 and SPEN-202.
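
The overlap comparison in (D), where the GLORI set is restricted by increasing stoichiometry cutoffs, can be sketched as a set operation. This is an illustrative sketch only: the site keys, the function name `overlap_by_cutoff`, the cutoff values and the toy data are all assumptions and are not taken from the study.

```python
def overlap_by_cutoff(dorado_sites, glori_stoich, cutoffs):
    """For each GLORI stoichiometry cutoff (%), count shared and unique sites.

    dorado_sites: set of (chrom, pos, strand) keys (hypothetical layout)
    glori_stoich: dict mapping the same keys to GLORI stoichiometry in percent
    Returns {cutoff: (shared, dorado_only, glori_only)}.
    """
    result = {}
    for cutoff in cutoffs:
        glori_sites = {k for k, v in glori_stoich.items() if v > cutoff}
        shared = dorado_sites & glori_sites
        result[cutoff] = (
            len(shared),
            len(dorado_sites - glori_sites),
            len(glori_sites - dorado_sites),
        )
    return result


# Toy data for illustration only.
dorado = {("chr1", 100, "+"), ("chr1", 200, "+"), ("chr2", 300, "-")}
glori = {
    ("chr1", 100, "+"): 80.0,
    ("chr1", 200, "+"): 15.0,
    ("chr3", 50, "+"): 30.0,
}
overlaps = overlap_by_cutoff(dorado, glori, cutoffs=[10, 20, 50])
```

Raising the cutoff shrinks the GLORI set, so the GLORI-only count can only fall while the Dorado-only count can only rise.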

Comparative analysis of m6A modification patterns in NHDFs and HD10.6 cells using Dorado v0.9.0.

(A) Overlap of high confidence m6A sites between NHDFs and HD10.6 cells. (B) Correlation of shared m6A stoichiometries (n=38,217). (C) Boxplots showing the m6A stoichiometry distributions of the 38,217 sites detected in both NHDFs and HD10.6 cells. (D+E) Motif analysis showing m6A (all-context and DRACH-context) motif logos, 5-mer distributions and modification stoichiometries in NHDFs. (F+G) Same as in (D+E) but for HD10.6 cells. (H) KEGG pathway analysis (DAVID Bioinformatics Resources) of shared m6A-containing genes. (I) m6A stoichiometries of representative examples of individual genes from enriched pathways (H). ACTG1 serves as a housekeeping control.
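
The correlation of shared stoichiometries in (B) can be illustrated as follows. The legend does not state which correlation coefficient was used; Pearson's r is assumed here purely for illustration, and the site keys and values are toy data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def shared_stoichiometries(cell_a, cell_b):
    """Return paired stoichiometries for sites present in both dicts."""
    shared = sorted(cell_a.keys() & cell_b.keys())
    return [cell_a[k] for k in shared], [cell_b[k] for k in shared]


# Toy data for illustration only (site key -> stoichiometry in percent).
nhdf = {"siteA": 10.0, "siteB": 20.0, "siteC": 30.0, "siteD": 99.0}
hd106 = {"siteA": 20.0, "siteB": 40.0, "siteC": 60.0, "siteE": 5.0}

x, y = shared_stoichiometries(nhdf, hd106)
r = pearson(x, y)
```

Only sites detected in both cell types enter the calculation, mirroring the restriction to shared sites in the legend.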

Evaluation of Dorado by base and modification probability distributions.

(A) Dorado v0.9.0-generated basecall accuracy probability values for A (all-context and DRACH-context basecaller models) from genome-aligned DMSO- and STM2457-treated NHDF samples against IVT poly(A) NHDF RNA. A probability score distribution is shown as a black solid line for DMSO, STM2457 - black dashed line, IVT - grey solid line. (B) Dorado v0.9.0-generated modification probability distributions, visualized as histograms, for DRACH-context m6A from genome-aligned DMSO- and STM2457-treated NHDFs and IVT NHDFs. IVT modification probability score distribution is shown in grey, DMSO - bright red, STM2457 - dark red. (C) Same as in (B), but for all-context m6A sites.

Modification probability cutoffs are necessary to identify ψ and inosine sites more accurately.

(A) Dorado v0.9.0-generated basecall accuracy probability values for U (left) and modification probability distributions for ψ (right) from genome-aligned DMSO- and STM2457-treated NHDF samples are plotted against IVT poly(A) RNA. ψ probabilities are visualized in blue solid and dashed lines for DMSO and STM2457, respectively, and in grey solid line for IVT. (B) Same as in (A) but showing A base accuracy probabilities and inosine sites. Inosine probabilities are visualized in purple solid and dashed lines for DMSO and STM2457, respectively, and in grey solid line for IVT. (C) Same as in (A) but showing C base accuracy probabilities and m5C sites. m5C probabilities are visualized in green solid and dashed lines for DMSO and STM2457, respectively, and in grey solid line for IVT. (D+E) DRACH-context and all-context m6A modification probability distributions (red) from genome-aligned DMSO- and STM2457-treated NHDFs are plotted against RCS negative control (grey).

Modification probability cutoffs are necessary to identify modification sites accurately, shown for transcriptome-aligned data.

Dorado v0.9.0-generated basecall accuracy probability values for A, U and C (left, A-E) and modification probability distributions for m6A (all-context), m6A (DRACH-context), ψ, inosine and m5C (right, A-E) in transcriptome-aligned DMSO- and STM2457-treated NHDFs, plotted against the values from IVT NHDF poly(A) RNA.

Modification probability distribution profiles in genome-aligned DMSO- and STM2457-treated HD10.6 cells.

Dorado v0.9.0-generated basecall accuracy probability values for A, U and C (left, A-E) and modification probability distributions for m6A (all-context), m6A (DRACH-context), ψ, inosine and m5C (right, A-E) in human genome-aligned DMSO- and STM2457-treated HD10.6 cells, plotted against the values from IVT NHDF poly(A) RNA.

Modification probability distribution profiles in transcriptome-aligned DMSO- and STM2457-treated HD10.6 cells.

Dorado v0.9.0-generated basecall accuracy probability values for A, U and C (left, A-E) and modification probability distributions for m6A (all-context), m6A (DRACH-context), ψ, inosine and m5C (right, A-E) in human transcriptome-aligned DMSO- and STM2457-treated HD10.6 cells, plotted against the values from IVT NHDF poly(A) RNA.

Modification site probabilities detected with different versions of Dorado and compared to RCS.

(A+B) DRACH-context and all-context m6A modification probability distributions from genome-aligned DMSO (solid red line)- and STM2457 (dashed red line)-treated NHDFs are plotted against RCS negative control (grey). (C) Modification probability distributions for all-context m6A (Dorado v0.7.0, v0.8.0 and v0.9.0), with the bottom panel showing the distributions in the 0.95-1.00 modification probability range. (D) Same as in (C), but for DRACH-context m6A (Dorado v0.5.0, v0.6.0, v0.8.0 and v0.9.0). (E) Same as in (C), but for ψ (Dorado v0.7.0, v0.8.0, v0.9.0). All the plots in C-E are generated from genome-aligned DMSO-treated NHDF datasets and are plotted against RCS negative controls.

Evaluation of Dorado by stoichiometry distributions.

(A) Stoichiometry distributions for all-context m6A from DMSO- (in red) or STM2457-treated (in pink) NHDF datasets (genome-aligned and filtered for > 20 reads) are plotted, comparing 0.98 modification probability-unfiltered (left) and filtered (right) sites. (B) Same as in (A), but for DRACH-context m6A sites.

Stoichiometry distributions of m6A, ψ and inosine in IVT datasets.

(A) All-context m6A stoichiometry distributions (all stoichiometries (top) and > 10% stoichiometries (bottom)) in (genome-aligned) IVT RNA from DMSO-treated NHDFs, comparing modification probability-unfiltered and filtered data. The bottom panel shows stoichiometry distributions only for > 10% stoichiometry sites. (B) As in (A), but for DRACH-context m6A. (C) As in (A), but for ψ. (D) As in (A), but for inosine.

Evaluation of Dorado by stoichiometry distributions in transcriptome-aligned NHDFs.

(A) Stoichiometry distributions for all-context m6A from DMSO- (in red) or STM2457-treated (in pink) NHDF datasets (transcriptome-aligned and filtered for > 20 reads) are plotted, assessing the 0.98 modification probability cutoff. The bottom panel shows stoichiometry distributions only for > 10% stoichiometry sites. (B) Same as in (A), but for DRACH-context m6A sites.

Evaluation of Dorado by stoichiometry distributions in transcriptome-aligned HD10.6 cells.

(A) Stoichiometry distributions for all-context m6A from DMSO- (in red) or STM2457-treated (in pink) HD10.6 datasets (transcriptome-aligned and filtered for > 20 reads) are plotted, assessing the 0.98 modification probability cutoff. The bottom panel shows stoichiometry distributions only for > 10% stoichiometry sites. (B) Same as in (A), but for DRACH-context m6A sites.

Stoichiometry distributions of ψ and inosine in NHDFs.

(A) Distributions of all (left) and > 10% (right) ψ stoichiometries in the 0.99 modification probability-unfiltered and filtered data from genome-aligned DMSO- (darker color) and STM2457-treated (lighter color) NHDFs (filtered for > 20 reads). (B) Same as in (A), but for inosine.

Stoichiometry distributions of ψ and inosine in HD10.6 cells.

(A) Distributions of all and > 10% ψ stoichiometries in the 0.99 modification probability-unfiltered and filtered data from genome-aligned DMSO- (darker color) and STM2457-treated (lighter color) HD10.6 cells (filtered for > 20 reads). (B) Same as in (A), but for inosine.

m6A and ψ stoichiometry distributions in different versions of Dorado.

Stoichiometry distributions (all stoichiometries - left, > 10% stoichiometries - right) in a line graph format for (A) all-context m6A detected in genome-aligned DMSO-treated NHDF dataset, with Dorado versions 0.7.0, 0.8.0 and 0.9.0 colored in gradients of red, darkness increasing with the later versions. (B) DRACH-context m6A detected with Dorado versions 0.5.0, 0.6.0, 0.8.0 and 0.9.0, colored in gradients of red, darkness increasing with the later versions. (C) ψ detected with Dorado versions 0.7.0, 0.8.0 and 0.9.0, colored in gradients of blue, darkness increasing with the later versions. All the datasets were filtered for > 20 reads.

Comparison of modification sites detected with different versions of Dorado.

Upset plots showing unique and shared detected modifications (filtered for > 20 reads, > 10% stoichiometry) in modification probability-unfiltered and filtered genome-aligned DMSO-treated NHDF data. The versions are ordered on the plot by the number of detected sites, in ascending order (on the left). Results are shown for the following models and versions: (A) all-context m6A detected with v0.7.0, v0.8.0 and v0.9.0, (B) DRACH-context m6A detected with v0.5.0, v0.6.0, v0.8.0 and v0.9.0, (C) ψ detected with v0.7.0, v0.8.0 and v0.9.0. (D) Nucleotide frequencies in the 5-base motif of ψ sites for uniquely v0.7.0-, v0.8.0-0.9.0- and v0.7.0-0.8.0-0.9.0-detected ψ sites. Position 0 marks the modification site.

Comparison of m6A sites detected by Dorado and GLORI-Sequencing in NHDFs and HD10.6 cells.

(A) Upset plot showing the number of m6A sites detected by GLORI and Dorado in NHDFs and their overlaps. Dorado sites represent high-confidence m6A sites obtained after filtering DRACH-model detected data. Similarly, only high confidence m6A sites detected by GLORI are shown but were additionally filtered for DRACH sites. (B) Stoichiometry distributions are plotted from genome-aligned DMSO- or STM2457-treated HD10.6 datasets using the indicated Dorado models. (C) Stoichiometry distributions from GLORI performed on DMSO- or STM2457-treated NHDFs are plotted. In addition, m6A sites were filtered for DRACH motifs (right plot). (D) Stoichiometry distributions are plotted from transcriptome-aligned DMSO- or STM2457-treated NHDF datasets using the indicated Dorado models. (E) Stoichiometry distributions are plotted from transcriptome-aligned DMSO- or STM2457-treated HD10.6 datasets using the indicated Dorado models.

m6A (all-context and DRACH-context) motif logos for modification probability-unfiltered NHDF and HD10.6 data.

(A) 5-mer motif logo (showing nucleotide frequencies at each position) of m6A-modified sites (position 0 = modified A) from genome-aligned DMSO-treated NHDFs (all-context - left, DRACH-context - right). All the datasets were filtered for > 20 coverage & > 10% stoichiometry. (B) Same as (A), but for HD10.6 cells.
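
The per-position nucleotide frequencies behind a 5-mer logo, and the DRACH test used to separate motif contexts, can be sketched in a few lines of Python. DRACH follows the standard IUPAC definition (D = A/G/U, R = A/G, H = A/C/U); the function names and example 5-mers are hypothetical, and sequences are assumed to be centered on the modified A (position 0 = index 2).

```python
import re
from collections import Counter

# DRACH in IUPAC terms, written with T so that DNA-style sequences match.
DRACH_PATTERN = re.compile(r"^[AGT][AG]AC[ACT]$")


def is_drach(kmer: str) -> bool:
    """True if a 5-mer centered on the modified A matches the DRACH motif."""
    return bool(DRACH_PATTERN.match(kmer.upper().replace("U", "T")))


def position_frequencies(kmers):
    """Per-position nucleotide frequencies for equal-length k-mers."""
    length = len(kmers[0])
    freqs = []
    for i in range(length):
        counts = Counter(k[i] for k in kmers)
        total = sum(counts.values())
        freqs.append({base: n / total for base, n in counts.items()})
    return freqs


# Toy 5-mers for illustration only.
kmers = ["GGACT", "AGACA", "CGACT", "GGACG"]
drach_flags = [is_drach(k) for k in kmers]
freqs = position_frequencies(kmers)
```

For these toy 5-mers, index 2 of `freqs` is entirely A, which is what position 0 of a logo built from m6A-centered sequences would show.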

Metagene plots showing m6A (all-context and DRACH-context) distribution profiles in NHDF and HD10.6 data filtered at different stoichiometries.

(A) Metagene plots showing the density of m6A (all-context - dark red, DRACH-context - bright red) sites across all annotated poly(A) transcripts (upper plot) and monoexonic poly(A) transcripts (lower plot) from genome-aligned DMSO-treated NHDF datasets filtered with 0.98 modification probability, > 20 coverage and > 10% stoichiometry. (B) The density of all-context m6A sites (0.98 modification probability-unfiltered - left, filtered - right) is plotted across all annotated poly(A) transcripts of NHDF (dark red) and HD10.6 (bright red) datasets, filtered with different stoichiometries (descending: all stoichiometries, > 50%, > 10% and < 10%). All datasets were filtered additionally for > 20 coverage & > 10% stoichiometry. (C) Same as (A), but for DRACH-context m6A sites.