Universally high transcript error rates in bacteria

Abstract

Errors can occur at any level during the replication and transcription of genetic information. Genetic mutations derived mainly from replication errors have been extensively studied. However, fundamental details of transcript errors, such as their rate, molecular spectrum, and functional effects, remain largely unknown. To globally identify transcript errors, we applied an adapted rolling-circle sequencing approach to Escherichia coli, Bacillus subtilis, Agrobacterium tumefaciens, and Mesoplasma florum, revealing transcript-error rates 3 to 4 orders of magnitude higher than the corresponding genetic mutation rates. The majority of detected errors would result in amino-acid changes, if translated. With errors identified from 9929 loci, the molecular spectrum and distribution of errors were uncovered in great detail. A G→A substitution bias was observed in M. florum, which apparently has an error-prone RNA polymerase. Surprisingly, an increased frequency of nonsense errors towards the 3′ end of mRNAs was observed, suggesting a Nonsense-Mediated Decay-like quality-control mechanism in prokaryotes.

eLife digest

Most cells contain molecules of DNA that carry instructions to make the proteins cells need to perform different tasks. When a cell requires a certain protein, the corresponding DNA sequence is first transcribed into molecules of ribonucleic acid (RNA) known as transcripts. These sequences of RNA are then read by the cell and translated into the desired protein sequence.

Errors in copying DNA before a cell divides can lead to genetic mutations that affect the ability of the cell to carry out certain roles, influencing the overall ‘fitness’ of the cell. Similar to genetic mutations, errors that arise when forming RNA transcripts may also alter the tasks a cell performs. However, it is difficult to find out what kinds of errors cells have in their transcripts and how often these mistakes occur. This is because current methods for sequencing RNA are prone to technical inaccuracies that interfere with the ability to detect true transcript errors.

Now, Li and Lynch have adapted a method for high-throughput sequencing of RNA, which can accurately identify transcript errors in Escherichia coli and other species of bacteria. The experiments showed that errors in RNA molecules occurred more frequently than genetic mutations in the same sequence of DNA. Li and Lynch also found that the transcripts contained more nonsense errors – that is, mutations which prematurely stop transcripts from being translated, resulting in shorter proteins – at the end of the RNA molecule than at the beginning or middle. It is possible that transcripts with errors at the beginning or the middle are more efficiently eliminated than those at the end, suggesting that bacteria have a quality-control mechanism for removing transcripts with premature stop sequences.

These findings suggest that at any one time cells carry thousands of transcripts with inaccuracies in their sequence, which likely impact the tasks cells perform. The next step will be to investigate how these different transcript errors affect the fitness of cells.

Introduction

Transcript errors refer to any inconsistencies between RNA transcripts and their corresponding genomic loci. They can occur during ribonucleotide (rNTP) incorporations by RNA polymerases and/or via post-transcriptional modifications. Errors on RNA transcripts may directly cause dysfunctions due to the regulatory roles of small RNAs and the fate determination of mRNAs by RNA structural motifs (Strathern et al., 2012). Such errors can also indirectly induce various effects at the protein level. Transcript errors can inactivate proteins and result in a loss-of-function (Gordon et al., 2013). They can also indirectly give rise to misfolded proteins and induce proteotoxic stress (Gout et al., 2017; Vermulst et al., 2015). Errors on RNA transcripts may be causal factors leading to neurodegenerative diseases (van Leeuwen et al., 1998a; van Leeuwen et al., 1998b) and tumorigenesis (Saxowsky et al., 2008). Therefore, transcript errors represent a significant potential mechanism influencing cellular integrity and fitness.

Reporter-construct assays have long been the major approach to evaluating the fidelity of RNA polymerases and identifying transcript errors (Blank et al., 1986; Bubunenko et al., 2017; Nesser et al., 2006; Rosenberger and Foskett, 1981; Rosenberger and Hilton, 1983; Shaw et al., 2002; Springgate and Loeb, 1975; Strathern et al., 2012), but these methods focus only on individual loci and cannot identify errors without phenotypic marker effects. Conventional high-throughput sequencing approaches have been considered to identify transcript errors at a large scale (van Dijk et al., 2015). However, the challenge is to distinguish the real signal of transcript errors from noise produced by technical errors resulting during reverse transcription and sequencing. To circumvent this problem, a rolling-circle amplification-based sequencing (CirSeq) method (Acevedo and Andino, 2014; Acevedo et al., 2014; Lou et al., 2013) was recently proposed and later applied to identify transcript errors in the whole transcriptome of prokaryotes (Traverse and Ochman, 2016). We further modified this protocol to minimize RNA damage potentially introduced during the preparation of sequencing libraries (Gout et al., 2017).

In this study, we applied an adapted CirSeq approach, which has been demonstrated to identify transcript errors accurately and efficiently at a large scale in eukaryotes (Gout et al., 2017), to prokaryotes for the first time. A large number of transcript errors was detected, and transcript-error rates were revealed to be orders of magnitude higher than corresponding genetic mutation rates. Our results indicate that the bias in molecular spectra of transcript errors can be influenced by both RNA polymerases and cellular rNTP concentrations. Furthermore, the spatial distribution of transcript errors on RNAs provides novel insights into the mechanism of RNA quality-control in prokaryotes.

Results

A global view of the transcript error distribution

Applying the adapted CirSeq method (see Materials and methods) to E. coli, B. subtilis, A. tumefaciens, and M. florum, we prepared RNA sequencing libraries with three biological replicates for each species. Key steps of library preparation involve circularizing RNA fragments and generating cDNAs with tandem repeats by rolling-circle reverse transcription. In this way, transcript errors tend to appear on all repeats of sequencing reads, while sequencing and reverse transcription errors are nearly always revealed as singletons (Figure 1—figure supplement 1). The number of loci where transcript errors were identified from each species ranges from 2006 to 2942, totaling 9929 loci across all species. M. florum showed a per-site error rate of $(1.82 \pm 0.01) \times 10^{-5}$ (mean ± SEM), the highest among the four species ($P = 0.009$, Mann-Whitney U test). The error rates in E. coli, B. subtilis, and A. tumefaciens were $(5.84 \pm 0.10) \times 10^{-6}$, $(5.80 \pm 0.14) \times 10^{-6}$, and $(7.26 \pm 0.35) \times 10^{-6}$ (mean ± SEM), respectively. These error rates are 3 to 4 orders of magnitude higher than the corresponding genomic (DNA-level) mutation rates estimated from mutation-accumulation experiments in these species (Lee et al., 2012; Lynch et al., 2016; Sung et al., 2016; Sung et al., 2015; Sung et al., 2012).

With such a large number of transcript errors identified, a transcriptome-wide view of the error distribution in each species was uncovered. Based on the circular genomes of bacteria (except for A. tumefaciens, which has one circular chromosome, one linear chromosome, and two plasmids [Goodner et al., 2001]), we annotated genomic positions of transcript errors with different potential functional effects and plotted transcript-error rates in 10 kb sliding windows (1 kb for M. florum) (Figure 1). To test whether transcript errors are randomly distributed across different genes, a previously proposed test (Long et al., 2016) was performed to identify genes enriched with transcript errors. For each gene, the expected number of transcript errors was calculated as the product of the average transcriptome-wide error rate per base and the sequencing coverage of the gene. The Poisson probability of observing a number of errors greater than or equal to the observed number was calculated. Out of 607, 495, 586, and 186 genes with detected transcript errors in E. coli, A. tumefaciens, B. subtilis and M. florum, respectively, 1, 4, 0 and 4 genes were revealed to have significantly larger numbers of errors than random expectations (Bonferroni-corrected P values of 0.05, Supplementary file 1, Tables 2-5), suggesting that transcript errors are in general randomly distributed across genes.
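
The per-gene enrichment test described above can be sketched as follows. This is a minimal illustration rather than the authors' script: the function names, the input layout (a mapping from gene name to observed errors and bases assayed), and the example numbers are assumptions made here, and "sequencing coverage" is interpreted as the total number of bases assayed for the gene.

```python
import math


def _log_pmf(k: int, lam: float) -> float:
    """Log of the Poisson probability mass P(X = k) for mean lam > 0."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def poisson_upper_tail(k_obs: int, lam: float) -> float:
    """P(X >= k_obs) for X ~ Poisson(lam)."""
    if k_obs <= 0:
        return 1.0
    if lam <= 0.0:
        return 0.0
    if k_obs <= lam:
        # The tail is large here, so the complement of the lower sum is accurate.
        lower = sum(math.exp(_log_pmf(k, lam)) for k in range(k_obs))
        return max(0.0, 1.0 - lower)
    # Small tail: add terms upward from k_obs until they stop contributing.
    total, k = 0.0, k_obs
    term = math.exp(_log_pmf(k, lam))
    while term > total * 1e-17:
        total += term
        k += 1
        term *= lam / k
    return min(1.0, total)


def enriched_genes(genes, rate_per_base, alpha=0.05):
    """Return genes with more errors than expected (Bonferroni-corrected).

    genes maps a gene name to (observed_errors, bases_assayed).
    """
    if not genes:
        return []
    threshold = alpha / len(genes)  # Bonferroni correction over all tested genes
    hits = []
    for name, (observed, bases) in genes.items():
        expected = rate_per_base * bases  # expected errors under a uniform rate
        if poisson_upper_tail(observed, expected) < threshold:
            hits.append(name)
    return hits


# Hypothetical input, for illustration only (not data from this study).
example = {"geneA": (10, 100_000), "geneB": (1, 100_000)}
print(enriched_genes(example, rate_per_base=1e-5))
```

Summing the upper tail directly, rather than subtracting a cumulative sum from one, keeps very small P values from being lost to rounding before the Bonferroni threshold is applied.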

Figure 1 (with 2 supplements)

The distribution of transcript errors across the whole transcriptomes of *E. coli*, *B. subtilis*, *A. tumefaciens*, and *M. florum*.

The first nucleotide of the circular chromosome starts at the 12 o’clock position. For *A. tumefaciens*, chromosomes and plasmids are arranged from the largest to smallest size in a clockwise orientation. From the outer ring to the inner ring: bacterial chromosomes (dark gray), protein-coding region (grey, black strokes indicate gene densities), synonymous substitutions (blue), missense substitutions (orange), nonsense substitutions (purple) and average transcript-error rates (plots in dark gray) in a 10 kb sliding window with a step size of 1 bp (1 kb windows for *M. florum*). Windows without sufficient sequencing coverages to detect transcript errors are left blank.

Figure 1—source data 1. Numerical data that are represented as a graph in Figure 1: https://cdn.elifesciences.org/articles/54898/elife-54898-fig1-data1-v1.zip

The whole bacterial transcriptome is synthesized by a single type of RNA polymerase. However, RNA products from protein-coding and noncoding RNA (ncRNA) regions undergo distinct co- and post-transcriptional processes. mRNAs are mature upon transcription and ready for translation, while ncRNAs, such as ribosomal RNAs (rRNA) and transfer RNAs (tRNA), need to be further processed to be functional (Cooper, 2000). To evaluate whether transcript-error rates of these two genomic regions are different, we calculated the error rates of protein-coding and ncRNA transcripts by dividing the number of errors by the number of nucleotides assayed in corresponding regions. Transcript-error rates of these two regions are similar in E. coli and A. tumefaciens, but the error rate of ncRNA transcripts is higher than that of protein-coding transcripts in B. subtilis and lower in M. florum (p<0.05, paired t-test) (Figure 2).

Figure 2

Transcript-error rates of protein-coding and ncRNA regions.

cds includes all protein-coding genes that were sequenced in this study. ncRNA refers to RNAs that are functional but not translated into proteins, for example tRNA and rRNA. Transcript-error rates were calculated by dividing the number of errors by the number of nucleotides assayed in corresponding regions. Error bars indicate standard errors. The level of significant difference is indicated by asterisks (*p<0.05, paired t-test).

Figure 2—source data 1. Numerical data that are represented as a graph in Figure 2: https://cdn.elifesciences.org/articles/54898/elife-54898-fig2-data1-v1.txt

The molecular spectra of transcript errors are biased to C→U and G→A substitutions

A transition/transversion bias of genetic mutations has been widely observed in different species, with the molecular spectrum mostly dominated by G:C→A:T substitutions (Hershberg and Petrov, 2010; Hildebrand et al., 2010; Lynch, 2010). However, knowledge on the molecular spectrum of transcript errors in prokaryotes remains limited (Imashimizu et al., 2015; Traverse and Ochman, 2016; Traverse and Ochman, 2018). In this study, we calculated the error rate of all twelve categories of substitutions for each species (Figure 3), revealing a general bias of transitions over transversions. This bias has been thought to be driven solely by C→U substitutions (Traverse and Ochman, 2016), which may mainly result from post-transcriptional cytosine deaminations. However, the transition/transversion bias here even holds after C→U substitutions are excluded (P < 0.005, χ² test, Supplementary file 1, Table 6). This observation indicates that the transcriptional machinery in bacteria, similar to the replication machinery, tends to have a low ability to distinguish rNTPs within the same structural class of nitrogenous bases (Keightley et al., 2009; Kucukyildirim et al., 2016; Lee et al., 2012; Long et al., 2015a; Long et al., 2015b; Lynch, 2007; Lynch, 2010; Lynch et al., 2008; Ossowski et al., 2010; Sung et al., 2015). Of all transitions, the C→U substitution rate is consistently high in all four species. In addition, an unexpectedly high G→A substitution rate is revealed in M. florum, which displayed the highest transcript-error rates among four species in the present study. Intriguingly, this substitution bias was also recently observed in yeast and E. coli transcription-machinery mutants with decreased fidelity (Gout et al., 2017; Imashimizu et al., 2015; Traverse and Ochman, 2018). Thus, the G→A substitution bias may be a signature of error-prone RNA polymerase in both eukaryotes and prokaryotes.

Figure 3

The molecular spectra of transcript errors for four bacterial species.

The conditional error rates of each type of substitutions were calculated from the number of particular transcript errors, divided by the number of corresponding ribonucleotides assayed. Error bars indicate standard errors.

Figure 3—source data 1. Numerical data that are represented as a graph in Figure 3: https://cdn.elifesciences.org/articles/54898/elife-54898-fig3-data1-v1.txt

Characterization of transcript errors

To evaluate potential functional effects of transcript errors, we categorized transcript errors within protein-coding regions into synonymous, missense, and nonsense substitutions using SnpEff (Cingolani et al., 2012; Table 1). Based on the bias of rNTP substitution rates (Figure 3) and codon usages of each bacterium, we also calculated the expected percentages of each error type under the assumption that transcript errors are randomly generated across the genome without error-correction processes (see Materials and methods, and Supplementary file 1, Table 7). Consistent with observations, the majority of transcript errors are expected to result in amino-acid changes, if translated (Table 1). For nonsense errors, the observed percentages are close to or significantly lower than the random expectation (P < 0.005, χ² test, Table 1).

Table 1

Percentages of transcript errors in mRNAs that are synonymous, missense, or nonsense (other potential types of transcript errors with small percentages, such as start/stop codon loss-errors, are not shown).

Observed and expected (in parentheses) percentages are presented. Based on the bias of observed rNTP substitution rates and codon usages of each bacterium, expected percentages are calculated assuming a random generation of errors and an absence of error-correction processes. The level of significant difference is indicated by asterisks (*P < 0.05, ** P < 0.005, χ² test).

Species	Synonymous	Missense	Nonsense
E. coli	40.18 (34.35) **	56.25 (59.79) *	3.57 (5.62) **
B. subtilis	32.76 (31.86)	61.69 (61.63)	5.15 (6.15)
A. tumefaciens	40.68 (36.76) *	56.36 (59.17)	2.96 (3.86)
M. florum	17.58 (24.12) **	79.27 (70.58) **	2.37 (4.85) **
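
The expected percentages in Table 1 come from combining codon usage with the observed substitution rates. A minimal sketch of that kind of calculation is given below. It is not the authors' code: the function name and input layout are assumptions made here, stop-codon changes are simply skipped, and the standard genetic code is hard-coded, so a species-specific translation table would have to be substituted for any organism that departs from it.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code (NCBI translation table 1), codons in UCAG order.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def expected_effect_fractions(codon_freq, sub_rate):
    """Expected fractions of synonymous, missense and nonsense errors.

    codon_freq: maps each codon (RNA alphabet) to its usage weight.
    sub_rate:   maps (reference_base, erroneous_base) to a substitution rate.
    Assumes errors arise independently at each site with no error correction.
    """
    weights = {"synonymous": 0.0, "missense": 0.0, "nonsense": 0.0}
    for codon, freq in codon_freq.items():
        aa = CODON_TABLE[codon]
        if aa == "*":
            continue  # changes to stop codons (stop-loss) are ignored in this sketch
        for pos, ref in enumerate(codon):
            for alt in BASES:
                if alt == ref:
                    continue
                new_aa = CODON_TABLE[codon[:pos] + alt + codon[pos + 1:]]
                if new_aa == aa:
                    kind = "synonymous"
                elif new_aa == "*":
                    kind = "nonsense"
                else:
                    kind = "missense"
                # Weight each possible error by codon usage and its substitution rate.
                weights[kind] += freq * sub_rate[(ref, alt)]
    total = sum(weights.values())
    return {kind: w / total for kind, w in weights.items()}


# Uniform codon usage and uniform substitution rates, for illustration only.
uniform_codons = {c: 1.0 for c, aa in CODON_TABLE.items() if aa != "*"}
uniform_rates = {(a, b): 1.0 for a in BASES for b in BASES if a != b}
print(expected_effect_fractions(uniform_codons, uniform_rates))
```

With real inputs, `codon_freq` would hold the codon counts of the expressed genes and `sub_rate` the twelve conditional substitution rates of the kind shown in Figure 3.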

Biased distribution of nonsense errors in RNA transcripts

As shown in Table 1, nonsense errors represent only a small percentage of all errors. However, they are of particular interest because they will result in the formation of a premature termination codon (PTC) and thus truncated proteins if not degraded. To ameliorate the potential severe fitness effects resulting from such errors, eukaryotes have evolved the Nonsense-Mediated Decay (NMD) mechanism (Losson and Lacroute, 1979; Maquat, 1995; Peltz et al., 1993) to facilitate the degradation of RNA transcripts carrying PTCs. A key to the success of NMD is distinguishing a PTC from the original stop codon (Amrani et al., 2004; Le Hir et al., 2001), and the ability of the NMD machinery to identify a PTC is thought to diminish as the PTC approaches the 3′ end of an mRNA (Isken and Maquat, 2007). This hypothesis is supported by yeast transcript-error data that show a marked increase in the frequency of PTCs towards the 3′ end of mRNAs (Gout et al., 2017).

Although no analog of the eukaryotic NMD system is known in prokaryotes, a destabilizing effect of PTCs on mRNA stability has been observed in bacteria (Arnold et al., 1998; Braun, 1998; Morse and Yanofsky, 1969; Nilsson et al., 1987). Evaluating the distribution of nonsense errors across the whole length of mRNA transcripts, we observed an increased frequency of nonsense errors at the 3′ end of transcripts, although the trend is not statistically significant in A. tumefaciens (Figure 4A). Compared to the other three species, a smaller number of nonsense errors was detected in A. tumefaciens (Supplementary file 1, Table 7), which may result in low statistical power to reveal a potential pattern in the distribution of nonsense errors. We further modified the analysis by dividing the frequency of nonsense errors by that of all errors. This ratio tends to be higher at the 3′ end of mRNAs (Figure 4—figure supplement 1), excluding the possibility that the enrichment of nonsense errors results mainly from a higher overall transcript-error rate at the 3′ end of mRNAs.

Figure 4 (with 2 supplements)

Nonsense errors in prokaryotic transcripts.

(A) Distributions of nonsense errors across mRNA transcripts. The frequency of nonsense errors is calculated in a 100-nt sliding window with a step size of 1 nt for data visualization. Grey intervals represent standard deviations assuming the number of errors at each locus follows a binomial distribution. Linear regression between the distance to the original stop codon and the frequency of nonsense errors of each window is indicated by dark grey lines. P values were calculated from weighted linear regressions of individual data points before binning into a window. (B) The ribosome-release model for PTC degradation in prokaryotes. Compared to a late PTC, an early PTC results in a larger portion of ribonucleotides unprotected by ribosomes, and therefore a higher probability of being digested by cellular ribonucleases.

Figure 4—source data 1. Numerical data that are represented as a graph in Figure 4A: https://cdn.elifesciences.org/articles/54898/elife-54898-fig4-data1-v1.txt

Of all types of genetic codons, those with one nucleotide difference from a stop codon (one-off codons) have a higher probability of mutating into PTCs. We further normalized the frequency of nonsense errors by the abundance of one-off codons at corresponding loci. This still revealed an increased frequency of nonsense errors towards 3′ ends of transcripts (Figure 4—figure supplement 2), suggesting the higher frequency of nonsense errors is not caused by more abundant one-off codons at the 3′ end of transcripts.

The increased frequency of PTCs at the 3′ end of mRNA transcripts suggests the presence of an NMD-like process, albeit by a likely different mechanism than in eukaryotes, which largely rely on the poly-A tail or exon-exon junction complex (Amrani et al., 2004). One speculative model for the degradation of PTCs in eukaryotes, the ribosome-release model (Brogna and Wen, 2009), in which the degradation of RNAs with PTCs depends on the degree of ribosome coverage on RNA molecules, has the potential to hold true in prokaryotes. Ribosomes can load on to nascent transcripts immediately after RNA synthesis. Therefore, a whole transcript with a normal stop codon can be covered by multiple ribosomes towards its 3′ end, with these ribosomes protecting the transcript from degradation by blocking ribonuclease cleavage sites. In contrast, a PTC upstream of the original stop codon will stall the ribosomes, leaving the ribonucleotides between the PTC and the site of the original stop codon unprotected by ribosomes, potentially promoting degradation by cellular ribonucleases (Figure 4B).

Discussion

A key to accurately identifying bona fide transcript errors is to distinguish them from technical errors and low-frequency genetic mutations. With previous efforts on method development to eliminate sequencing errors (Acevedo and Andino, 2014; Acevedo et al., 2014; Lou et al., 2013) and to evaluate the error rate of the reverse transcriptase (Gout et al., 2013), it is now possible to ensure that contributions from such technical errors are orders of magnitude lower than true transcript-error rates by the CirSeq approach (see Materials and methods). Except for M. florum, transcript-error rates in bacteria estimated by the current study are about one order of magnitude lower than those from a previous study (Traverse and Ochman, 2016). Specifically in E. coli, our error-rate estimates for each type of substitution tend to be lower than those from Traverse and Ochman (2016), the most striking difference involving the C→U substitution rate, which could be partly due to the use of a metal ion-based RNA fragmentation approach in the previous work vs. enzymatic RNA fragmentation in the present study. The latter minimizes RNA damage (Gout et al., 2017), in particular cytosine deaminations, introduced during the preparation of the sequencing library.

Besides base-substitution errors, a small portion of transcript errors can occur in other forms such as insertions and deletions. Estimates of transcript insertion/deletion (indel) error rates from species in this study are 0.1 to 0.2 of the corresponding base-substitution error rates (Supplementary file 1, Table 1).

Bacterial transcriptomes predominantly consist of ncRNA transcripts, such as rRNAs and tRNAs (Westermann et al., 2012). However, only a small portion of the whole ncRNA transcripts was evaluated in the present study (Supplementary file 1, Table 8) because of technical limitations. The rRNA depletion procedure in the sequencing library preparation protocol removes the majority of rRNAs. Secondary structures and nucleotide modifications of tRNAs interfere with cDNA synthesis and sequencing adapter ligation. In the future, to achieve a better measurement of transcript-error rates of ncRNA transcripts, total RNAs can be mixed with rRNA-depleted RNAs at a certain ratio to increase the abundance of rRNAs in the sequencing library. Demethylase enzymes and thermophilic reverse transcriptase can be used to remove nucleotide modifications of tRNAs and to improve processivity in generating cDNAs from highly structured RNA templates (Schwartz et al., 2018; Zheng et al., 2015).

The molecular spectrum of transcript errors revealed in our work indicates a general C→U substitution bias, which has been proposed to be due to spontaneous deamination (Imashimizu et al., 2013; Traverse and Ochman, 2016) owing to the chemical instability of cytosine (Alberts et al., 2015). Besides this widely accepted mechanism, non-Watson-Crick base pairing during rNTP incorporations may also contribute to this bias. Because dG and rU can form a base pair (Sugimoto et al., 2000; Sugimoto et al., 1997), mispairing between a template DNA (dG) and an RNA (rU) during rNTP incorporations likely also contributes to the C→U substitution bias.

Another intriguing observation from the molecular spectra in the present study is the G→A substitution bias in M. florum. One source for this substitution may be unrepaired uracils on the DNA antisense strand, which pair with rATPs during transcription, resulting in a G→A substitution on the RNA transcript. Although M. florum has a diminutive genome (0.79 Mb) and lacks many genes (RefSeq NC_006055.1), a uracil-DNA glycosylase (UDG) ortholog whose product presumably removes uracils (McCullough et al., 1999) does exist in the genome. Therefore, the extent to which mismatches between the unrepaired uracil and rATP can explain the G→A bias remains unclear.

Taking data from previous studies (Gout et al., 2017; Imashimizu et al., 2015; Traverse and Ochman, 2018) and this work together, G→A substitution bias seems to be a general pattern in cells with error-prone transcription machineries. What might be the underlying mechanism? The error spectrum is shaped by two factors. One is the ability of an RNA polymerase to distinguish correct rNTPs from incorrect ones. The other factor, which is sometimes neglected, is the rNTP pool within a cell. The error rate of competitive binding of rNTPs to the template can be expressed as $(k_{\text{incorrect}} \cdot C_{\text{incorrect-rNTPs}}) / (k_{\text{correct}} \cdot C_{\text{correct-rNTPs}})$, where $k$ refers to the rNTP incorporation rate and $C$ indicates the concentration of rNTPs. As suggested by this equation, a biased cellular rNTP concentration might present an additional challenge to transcriptional fidelity for certain categories of rNTPs. Based on observations that RNA polymerases have a low ability to distinguish rNTPs with the same structural class of nitrogenous bases and that the cellular concentration of rATPs is the highest among all types of nucleotides in both eukaryotes and prokaryotes (Bennett et al., 2009; Buckstein et al., 2008; Traut, 1994), it is reasonable to speculate that the high cellular concentration of rATPs contributes to the observed bias towards G→A substitutions.
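
As one illustrative reading of this expression, the G→A case corresponds to rATP competing with rGTP opposite a template dC. The subscripted symbols below are labels introduced here for illustration, not notation from the original analysis:

$$
\text{relative G→A error rate} \approx \frac{k_{\text{A}\mid\text{dC}} \cdot C_{\text{rATP}}}{k_{\text{G}\mid\text{dC}} \cdot C_{\text{rGTP}}}
$$

Under this reading, the ratio grows either when the polymerase discriminates poorly between the two purines (so that $k_{\text{A}\mid\text{dC}}$ approaches $k_{\text{G}\mid\text{dC}}$) or when rATP is in excess over rGTP in the cellular pool.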

An additional cellular process influencing transcript errors is RNA quality-control. Because genes involved in NMD, such as up-frameshift (UPF) genes, have not been identified in prokaryotes, evidence for the existence of NMD in prokaryotes is still lacking. However, previous studies based on single gene-reporters (Baker and Mackie, 2003; Braun, 1998; Nilsson et al., 1987) and our transcriptome-wide survey suggest a Nonsense-Mediated Decay-like quality-control mechanism in prokaryotes. A key implication of the increased frequency of nonsense errors at the 3′ end of mRNAs (Figure 4A) is that the degradation of RNAs carrying nonsense errors may simply result from a higher degree of exposure to cellular ribonucleases rather than from a reliance on specific protein-based systems.

Current models of mRNA surveillance mechanisms mostly focus on stop codon-related errors (Deutscher, 2006; Richards et al., 2008), which are expected to represent only a small portion of the total transcript errors in a cell. It is largely unknown whether, and if so by which mechanisms, the most common transcript errors (missense errors) are degraded. To resolve this, future research will be required to evaluate the rate at which transcript errors are degraded after initially being generated during transcription. This might be possible by comparing transcript errors on nascent transcripts bound to RNA polymerases with those on mature transcripts associated with ribosomes.

Materials and methods

Bacteria strains and growth conditions

All bacterial strains were inoculated into liquid culture from single colonies and grown to mid-exponential phase before harvest. E. coli MG1655 and B. subtilis NCIB 3610 were grown at 37°C in LB liquid medium. M. florum L1 (ATCC #33453) was grown at 30°C in SNE liquid medium. A. tumefaciens C58 was grown at 28°C in LB liquid medium.

RNA extraction

Bacteria were harvested from liquid culture media by centrifugation and total RNA was extracted and purified using the FastRNA Blue Kit (MPBiomedicals), RNase-free DNase set (Qiagen), and the RNeasy Mini Kit (Qiagen). rRNA was depleted by the Ribo-Zero rRNA Removal Kit (Bacteria) (Illumina) for the following library preparations.

Library preparation and sequencing

We followed a refined protocol of CirSeq (Gout et al., 2017) to prepare libraries for transcript error identification. Five hundred nanograms of rRNA-depleted RNA were first fragmented with the NEBNext RNase III RNA Fragmentation Module (New England Biolabs) for 90 min at 37°C. After a clean-up using the Oligo Clean and Concentrator kit (Zymo Research), RNA fragments were circularized with RNA ligase 1 (New England Biolabs) according to the manufacturer’s guidelines. cDNA with tandem repeats was generated by rolling-circle reverse transcription as described in the refined CirSeq protocol. Synthesis of the second strand of cDNA and sequencing library preparation were performed using the NEBNext Ultra RNA Library Prep Kit and NEBNext Multiplex Oligos for Illumina (New England Biolabs). The size selection and clean-up during sequencing library preparation were performed with Agencourt AMPure XP Beads (Beckman Coulter) according to the NEB guideline that is optimized for approximately 200 nt RNA inserts. A final gel-based size selection was performed to enrich PCR-amplified products that are longer than 300 nt. Single-end reads (300 nt) were then generated using the Illumina HiSeq 2500 System. The sequencing data were deposited in NCBI with the BioProject Number PRJNA592142.

Genome references and annotation files

The accession numbers of genome references for E. coli, B. subtilis, and M. florum are NC_000913.3, NZ_CM000488.1, and NC_006055.1. For A. tumefaciens, accession numbers are NC_003062.2, NC_003063.2, NC_003064.2 and NC_003065.3. The corresponding genome annotation files are from RefSeq.

Data analysis

Several analysis pipelines already existed to process reads with multiple tandem repeats and call transcript errors, but each has its own limitations. The CirSeq_v2 pipeline (Acevedo and Andino, 2014; Acevedo et al., 2014) can only analyze reads with exactly three repeats, whereas reads generated by the CirSeq approach can contain more than four repeats if the original RNA template is smaller than 75 nt. Another pipeline described in a recent work in yeast (Gout et al., 2017) cannot generate consensus calls and recalculate the quality score at a site where not all base calls are identical. Therefore, we developed Python scripts following the methods outlined by Lou et al. (2013) (Figure 1—figure supplement 2). The structure of repeats within one read was identified by an autocorrelation-based method, in which the length of one potential repeat $P$ is detected by the maximum fraction of identical base calls that are separated by a distance $P$ within one read. The consensus sequence was constructed and the corresponding new quality score was calculated by a Bayesian approach where an inferred consensus call is taken with the maximum posterior probability given all observed base calls. This approach also allows the processing of varied numbers and types of base calls at one site. To identify the ligation junction of circular templates and to reorganize the consensus sequence, a tandem duplicate of the consensus sequence was constructed and then mapped back to the reference genome by BWA (Li and Durbin, 2009). The longest continuous mapped regions of the duplicated consensus sequences therefore correspond to original RNA fragments. We also excluded the 4 nucleotides at both ends of the reorganized consensus sequence, because mapping can be ambiguous at the two ends of RNA fragments. After mapping of reconstructed consensus sequences, reads uniquely mapped to protein-coding regions and all reads mapped to ncRNA regions were kept.
Transcript errors were called if a mismatch between a consensus call and the reference was supported by less than 1% of reads at the corresponding locus. To exclude false positives caused by genetic differences among multiple copies of ncRNA genes (such as rRNA and tRNA genes), an additional filter removed any error call that is supported by genetic variation among the different copies of an ncRNA gene. The transcript-error rate of a given region was calculated as the number of transcript errors divided by the total number of rNTPs assayed from the corresponding region. The code for the bioinformatic pipeline can be found at https://github.com/LynchLab/CirSeq4TranscriptErrors (Li, 2020; copy archived at https://github.com/elifesciences-publications/CirSeq4TranscriptErrors).
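
The calling rule and rate calculation above can be illustrated with a short Python sketch. The function name and input layout are hypothetical, and two details are assumptions of this example and not statements from the methods: each mismatching consensus call is counted as one error, and every assayed base contributes to the denominator.

```python
def call_transcript_errors(sites, max_fraction=0.01):
    """sites: iterable of (ref_base, counts), where counts maps each base to
    the number of consensus calls observed at that locus.
    Returns (n_errors, n_assayed, error_rate)."""
    n_errors = 0
    n_assayed = 0
    for ref_base, counts in sites:
        coverage = sum(counts.values())
        if coverage == 0:
            continue  # nothing assayed at this locus
        n_assayed += coverage
        for base, n in counts.items():
            # A mismatch counts as a transcript error only when it is rare;
            # frequent mismatches are more likely to be genetic variants.
            if base != ref_base and n / coverage < max_fraction:
                n_errors += n
    rate = n_errors / n_assayed if n_assayed else float("nan")
    return n_errors, n_assayed, rate
```
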

Strategies to distinguish transcript errors from other types of errors

First, reverse transcription and sequencing errors need to be filtered out in the analysis. Because the transcript-error rate is generally $10^{-6}$ to $10^{-5}$/nt, a recalculated probability of an erroneous base call of $10^{-7}$ or lower was required to minimize contamination from sequencing errors. Because the error rate of the reverse transcriptase used here is ~$10^{-4}$/nt (Gout et al., 2013), at least two tandem repeats were required in the analysis to minimize false positives from reverse transcription errors.
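
As a rough illustration of why two tandem repeats are enough, assume that reverse-transcription errors arise independently in each repeat and that the three possible wrong bases are equally likely (neither assumption is stated in the methods). The chance that two repeats carry the same erroneous base at a given site is then approximately $\frac{(10^{-4})^{2}}{3} \approx 3 \times 10^{-9}$, which is more than two orders of magnitude below the $10^{-6}$ to $10^{-5}$/nt range of transcript-error rates.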

Second, genetic mutations (DNA level) can arise during cell culture, and low-frequency mutations can behave like transcript errors in the sequencing data. The probability of capturing a genetic mutation can be calculated by dividing the expected number of genetic mutations generated during cell propagation by the total transcriptome size at the time of sample collection, $\frac{\mu \cdot g \cdot T \cdot n}{T \cdot n}$, in which $\mu$ is the per-site per-generation mutation rate, $g$ is the number of generations during cell culture, $T$ is the size of the genomic regions that are transcribed, and $n$ is the average expression level per site. This expression simplifies to $\mu g$. Because we know the mutation rates from mutation-accumulation experiments (Lee et al., 2012; Lynch et al., 2016; Sung et al., 2016; Sung et al., 2015; Sung et al., 2012) and the number of generations from culture-growth dynamics (~30 generations), low-frequency genetic mutations can inflate the transcript-error rates calculated here by only ~1‰ to 1%.
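
The size of this effect can be checked with a few lines of Python. The mutation rate used below ($3 \times 10^{-10}$ per site per generation) is an illustrative placeholder of a plausible order of magnitude, not a value reported in this section, and the function name is hypothetical.

```python
def mutation_inflation(mu, generations, transcript_error_rate):
    """Fraction by which low-frequency genetic mutations could inflate an
    observed transcript-error rate, using the simplified expression mu * g."""
    captured = mu * generations  # expected mutations captured per site
    return captured / transcript_error_rate


# Illustrative inputs only: mu is an assumed placeholder, g = 30 generations.
for rate in (1e-5, 1e-6):
    print(f"error rate {rate:.0e}/nt -> inflation {mutation_inflation(3e-10, 30, rate):.2%}")
```

With these inputs the inflation falls between about 0.09% and 0.9%, in line with the ~1‰ to 1% range quoted above.
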

To calculate the expected percentages of transcript errors with different effects

Take the calculation for synonymous substitutions as an example. The percentage can be calculated by summing, over all codons, the probability that each codon undergoes a synonymous change, $P(\mathrm{syn}) = \sum_{i=1}^{64} P_{i} \cdot P_{i(\mathrm{syn})}$. $P_{i}$ is the probability of observing codon $i$ based on the codon usage of a specific genome, and there are 64 codons in total. $P_{i(\mathrm{syn})}$ is the probability that codon $i$ has a synonymous substitution, calculated as $P_{i(\mathrm{syn})} = \sum_{j=1}^{9} \mu_{j} \cdot 1_{\{j \text{ results in syn}\}}$. $\mu_{j}$ denotes the relative probability of substitution $j$, one of the 9 single-base substitutions that can occur in one codon, and is calculated as $\mu_{j} = \frac{e_{j}}{\sum_{k=1}^{9} e_{k}}$, in which $e_{j}$ is the error rate of substitution $j$. Estimates of $e_{j}$ are displayed in Figure 3.
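
A minimal Python sketch of this calculation is given below. It is an illustration under stated assumptions and not the authors' script: the function names and input dictionaries are hypothetical, codons are written in the DNA alphabet (T instead of U), and the standard genetic code is used, so a species with a different code (for example one that reads TGA as tryptophan) would need its own table.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def p_syn_for_codon(codon, error_rates, table=CODON_TABLE):
    """P_i(syn): share of the 9 single-base substitutions of one codon that are
    synonymous, each weighted by its error rate e_j.
    error_rates maps (from_base, to_base) to a rate."""
    weighted = []
    for pos, ref in enumerate(codon):
        for alt in BASES:
            if alt == ref:
                continue
            mutant = codon[:pos] + alt + codon[pos + 1:]
            weighted.append((error_rates[(ref, alt)], table[mutant] == table[codon]))
    total = sum(e for e, _ in weighted)  # denominator of mu_j
    return sum(e for e, is_syn in weighted if is_syn) / total


def expected_synonymous_fraction(codon_usage, error_rates, table=CODON_TABLE):
    """P(syn): sum over codons of P_i * P_i(syn), with P_i from codon counts."""
    n_codons = sum(codon_usage.values())
    return sum(
        (count / n_codons) * p_syn_for_codon(codon, error_rates, table)
        for codon, count in codon_usage.items()
    )
```

The same structure extends to missense or nonsense changes by swapping the comparison `table[mutant] == table[codon]` for the relevant condition.
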

The sliding window analysis and weighted linear regression to evaluate the distribution of nonsense errors on mRNA transcripts

The sliding window analysis (window size = 100 nt and step size = 1 nt) of the distribution of nonsense errors across mRNAs was used for data visualization. To evaluate whether the negative correlation between the frequency of nonsense errors and the distance from a nonsense error to the original stop codon is statistically significant, a weighted linear regression was used. Each weight was calculated as the reciprocal of the variance of the nonsense-error frequency. Because the observed number of transcript errors at each locus is expected to follow a binomial distribution, the variance of the nonsense-error frequency can be estimated as $\frac{p (1 - p)}{n}$, where $p$ is the estimated frequency of errors and $n$ is the read coverage at the corresponding locus.
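
The weighting scheme can be sketched in plain Python as follows. The function name is hypothetical, and the handling of loci with $p = 0$ or $p = 1$ (skipped here because their estimated variance is zero) is an assumption of this example; the original analysis may treat them differently. The sketch returns only the fitted slope and intercept and does not compute a significance test.

```python
def weighted_linear_regression(distance, freq, coverage):
    """Fit freq = intercept + slope * distance by weighted least squares,
    with weight 1 / variance = n / (p * (1 - p)) for each locus."""
    points = [
        (x, p, n / (p * (1 - p)))
        for x, p, n in zip(distance, freq, coverage)
        if 0 < p < 1 and n > 0  # zero-variance loci are skipped (assumption)
    ]
    sum_w = sum(w for _, _, w in points)
    x_bar = sum(w * x for x, _, w in points) / sum_w  # weighted means
    y_bar = sum(w * y for _, y, w in points) / sum_w
    s_xx = sum(w * (x - x_bar) ** 2 for x, _, w in points)
    s_xy = sum(w * (x - x_bar) * (y - y_bar) for x, y, w in points)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return slope, intercept
```
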

Data availability

Sequencing data of this study are available at NCBI with the BioProject Number PRJNA592142.

The following data sets were generated

1. Weiyi Li
2. Michael L
(2020) NCBI BioProject
ID PRJNA592142. Transcript error studies on Escherichia coli, Bacillus subtilis, Agrobacterium tumefaciens, and Mesoplasma florum.

http://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA592142

References

(2014) Mutational and fitness landscapes of an RNA virus revealed through population sequencing
Nature 505:686–690.

https://doi.org/10.1038/nature12861
1. Acevedo A
2. Andino R
(2014) Library preparation for highly accurate population sequencing of RNA viruses
Nature Protocols 9:1760–1769.

https://doi.org/10.1038/nprot.2014.118
Book
1. Alberts B
2. Johnson A
3. Lewis J
4. Morgan D
5. Raff M
6. Roberts K
7. Walter P
(2015)
Molecular Biology of the Cell

Garland Science.
1. Amrani N
2. Ganesan R
3. Kervestin S
4. Mangus DA
5. Ghosh S
6. Jacobson A
(2004) A faux 3′-UTR promotes aberrant termination and triggers nonsense-mediated mRNA decay
Nature 432:112–118.

https://doi.org/10.1038/nature03060
(1998)
mRNA stabilization by the ompA 5' untranslated region: two protective elements hinder distinct pathways for mRNA degradation

RNA 4:319–330.
1. Baker KE
2. Mackie GA
(2003) Ectopic RNase E sites promote bypass of 5'-end-dependent mRNA decay in Escherichia coli
Molecular Microbiology 47:75–88.

https://doi.org/10.1046/j.1365-2958.2003.03292.x
(2009) Absolute metabolite concentrations and implied enzyme active site occupancy in Escherichia coli
Nature Chemical Biology 5:593–599.

https://doi.org/10.1038/nchembio.186
1. Blank A
2. Gallant JA
3. Burgess RR
4. Loeb LA
(1986) An RNA polymerase mutant with reduced accuracy of chain elongation
Biochemistry 25:5920–5928.

https://doi.org/10.1021/bi00368a013
1. Braun F
(1998) Ribosomes inhibit an RNase E cleavage which induces the decay of the rpsO mRNA of Escherichia coli
The EMBO Journal 17:4790–4797.

https://doi.org/10.1093/emboj/17.16.4790
1. Brogna S
2. Wen J
(2009) Nonsense-mediated mRNA decay (NMD) mechanisms
Nature Structural & Molecular Biology 16:107–113.

https://doi.org/10.1038/nsmb.1550
1. Bubunenko MG
2. Court CB
3. Rattray AJ
4. Gotte DR
5. Kireeva ML
6. Irizarry-Caro JA
7. Li X
8. Jin DJ
9. Court DL
10. Strathern JN
11. Kashlev M
(2017) A cre Transcription Fidelity Reporter Identifies GreA as a Major RNA Proofreading Factor in Escherichia coli
Genetics 206:179–187.

https://doi.org/10.1534/genetics.116.198960
(2008) Characterization of nucleotide pools as a function of physiological state in Escherichia coli
Journal of Bacteriology 190:718–726.

https://doi.org/10.1128/JB.01020-07
1. Cingolani P
2. Platts A
3. Wang leL
4. Coon M
5. Nguyen T
6. Wang L
7. Land SJ
8. Lu X
9. Ruden DM
(2012) A program for annotating and predicting the effects of single Nucleotide Polymorphisms, SnpEff: snps in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3
Fly 6:80–92.

https://doi.org/10.4161/fly.19695
Book
1. Cooper GM
(2000)
The Cell: A Molecular Approach

ASM Press.
1. Deutscher MP
(2006) Degradation of RNA in Bacteria: comparison of mRNA and stable RNA
Nucleic Acids Research 34:659–666.

https://doi.org/10.1093/nar/gkj472
1. Goodner B
2. Hinkle G
3. Gattung S
4. Miller N
5. Blanchard M
6. Qurollo B
7. Goldman BS
8. Cao Y
9. Askenazi M
10. Halling C
11. Mullin L
12. Houmiel K
13. Gordon J
14. Vaudin M
15. Iartchouk O
16. Epp A
17. Liu F
18. Wollam C
19. Allinger M
20. Doughty D
21. Scott C
22. Lappas C
23. Markelz B
24. Flanagan C
25. Crowell C
26. Gurson J
27. Lomo C
28. Sear C
29. Strub G
30. Cielo C
31. Slater S
(2001) Genome sequence of the plant pathogen and biotechnology agent Agrobacterium tumefaciens C58
Science 294:2323–2328.

https://doi.org/10.1126/science.1066803
(2013) Heritable change caused by transient transcription errors
PLOS Genetics 9:e1003595.

https://doi.org/10.1371/journal.pgen.1003595
1. Gout JF
2. Thomas WK
3. Smith Z
4. Okamoto K
5. Lynch M
(2013) Large-scale detection of in vivo transcription errors
PNAS 110:18584–18589.

https://doi.org/10.1073/pnas.1309843110
1. Gout JF
2. Li W
3. Fritsch C
4. Li A
5. Haroon S
6. Singh L
7. Hua D
8. Fazelinia H
9. Smith Z
10. Seeholzer S
11. Thomas K
12. Lynch M
13. Vermulst M
(2017) The landscape of transcription errors in eukaryotic cells
Science Advances 3:e1701484.

https://doi.org/10.1126/sciadv.1701484
1. Hershberg R
2. Petrov DA
(2010) Evidence that mutation is universally biased towards AT in Bacteria
PLOS Genetics 6:e1001115.

https://doi.org/10.1371/journal.pgen.1001115
(2010) Evidence of selection upon genomic GC-content in Bacteria
PLOS Genetics 6:e1001107.

https://doi.org/10.1371/journal.pgen.1001107
(2013) Direct assessment of transcription fidelity by high-resolution RNA sequencing
Nucleic Acids Research 41:9090–9104.

https://doi.org/10.1093/nar/gkt698
(2015) Visualizing translocation dynamics and nascent transcript errors in paused RNA polymerases in vivo
Genome Biology 16:98.

https://doi.org/10.1186/s13059-015-0666-5
1. Isken O
2. Maquat LE
(2007) Quality control of eukaryotic mRNA: safeguarding cells from abnormal mRNA function
Genes & Development 21:1833–1856.

https://doi.org/10.1101/gad.1566807
1. Keightley PD
2. Trivedi U
3. Thomson M
4. Oliver F
5. Kumar S
6. Blaxter ML
(2009) Analysis of the genome sequences of three Drosophila melanogaster spontaneous mutation accumulation lines
Genome Research 19:1195–1201.

https://doi.org/10.1101/gr.091231.109
1. Kucukyildirim S
2. Long H
3. Sung W
4. Miller SF
5. Doak TG
6. Lynch M
(2016) The Rate and Spectrum of Spontaneous Mutations in Mycobacterium smegmatis, a Bacterium Naturally Devoid of the Postreplicative Mismatch Repair Pathway
G3: Genes|Genomes|Genetics 6:2157–2163.

https://doi.org/10.1534/g3.116.030130
(2001) The exon-exon junction complex provides a binding platform for factors involved in mRNA export and nonsense-mediated mRNA decay
The EMBO Journal 20:4987–4997.

https://doi.org/10.1093/emboj/20.17.4987
1. Lee H
2. Popodi E
3. Tang H
4. Foster PL
(2012) Rate and molecular spectrum of spontaneous mutations in the bacterium Escherichia coli as determined by whole-genome sequencing
PNAS 109:E2774–E2783.

https://doi.org/10.1073/pnas.1210309109
Software
1. Li W
(2020) Analysis pipelines to call and analyze transcript errors from CirSeq data, version 606267b
GitHub.

https://github.com/LynchLab/CirSeq4TranscriptErrors
1. Li H
2. Durbin R
(2009) Fast and accurate short read alignment with Burrows-Wheeler transform
Bioinformatics 25:1754–1760.

https://doi.org/10.1093/bioinformatics/btp324
1. Long H
2. Kucukyildirim S
3. Sung W
4. Williams E
5. Lee H
6. Ackerman M
7. Doak TG
8. Tang H
9. Lynch M
(2015a) Background mutational features of the Radiation-Resistant bacterium Deinococcus radiodurans
Molecular Biology and Evolution 32:2383–2392.

https://doi.org/10.1093/molbev/msv119
1. Long H
2. Sung W
3. Miller SF
4. Ackerman MS
5. Doak TG
6. Lynch M
(2015b) Mutation rate, spectrum, topology, and Context-Dependency in the DNA mismatch Repair-Deficient Pseudomonas fluorescens ATCC948
Genome Biology and Evolution 7:262–271.

https://doi.org/10.1093/gbe/evu284
1. Long H
2. Miller SF
3. Strauss C
4. Zhao C
5. Cheng L
6. Ye Z
7. Griffin K
8. Te R
9. Lee H
10. Chen CC
11. Lynch M
(2016) Antibiotic treatment enhances the genome-wide mutation rate of target cells
PNAS 113:E2498–E2505.

https://doi.org/10.1073/pnas.1601208113
1. Losson R
2. Lacroute F
(1979) Interference of nonsense mutations with eukaryotic messenger RNA stability
PNAS 76:5134–5137.

https://doi.org/10.1073/pnas.76.10.5134
1. Lou DI
2. Hussmann JA
3. McBee RM
4. Acevedo A
5. Andino R
6. Press WH
7. Sawyer SL
(2013) High-throughput DNA sequencing errors are reduced by orders of magnitude using circle sequencing
PNAS 110:19872–19877.

https://doi.org/10.1073/pnas.1319590110
Book
1. Lynch M
(2007)
The Origins of Genome Architecture

Sinauer Associates.
1. Lynch M
2. Sung W
3. Morris K
4. Coffey N
5. Landry CR
6. Dopman EB
7. Dickinson WJ
8. Okamoto K
9. Kulkarni S
10. Hartl DL
11. Thomas WK
(2008) A genome-wide view of the spectrum of spontaneous mutations in yeast
PNAS 105:9272–9277.

https://doi.org/10.1073/pnas.0803466105
1. Lynch M
(2010) Rate, molecular spectrum, and consequences of human mutation
PNAS 107:961–968.

https://doi.org/10.1073/pnas.0912629107
1. Lynch M
2. Ackerman MS
3. Gout JF
4. Long H
5. Sung W
6. Thomas WK
7. Foster PL
(2016) Genetic drift, selection and the evolution of the mutation rate
Nature Reviews Genetics 17:704–714.

https://doi.org/10.1038/nrg.2016.104
1. Maquat LE
(1995)
When cells stop making sense: effects of nonsense codons on RNA metabolism in vertebrate cells

RNA 1:453–465.
(1999) Initiation of base excision repair: glycosylase mechanisms and structures
Annual Review of Biochemistry 68:255–285.

https://doi.org/10.1146/annurev.biochem.68.1.255
1. Morse DE
2. Yanofsky C
(1969) Polarity and the degradation of mRNA
Nature 224:329–331.

https://doi.org/10.1038/224329a0
(2006) RNA polymerase II subunit Rpb9 is important for transcriptional fidelity in vivo
PNAS 103:3268–3273.

https://doi.org/10.1073/pnas.0511330103
(1987) Effect of premature termination of translation on mRNA stability depends on the site of ribosome release
PNAS 84:4890–4894.

https://doi.org/10.1073/pnas.84.14.4890
(2010) The rate and molecular spectrum of spontaneous mutations in Arabidopsis thaliana
Science 327:92–94.

https://doi.org/10.1126/science.1180677
(1993) mRNA destabilization triggered by premature translational termination depends on at least three cis-acting sequence elements and one trans-acting factor
Genes & Development 7:1737–1754.

https://doi.org/10.1101/gad.7.9.1737
(2008) Quality control of bacterial mRNA decoding and decay
Biochimica Et Biophysica Acta (BBA) - Gene Regulatory Mechanisms 1779:574–582.

https://doi.org/10.1016/j.bbagrm.2008.02.008
1. Rosenberger RF
2. Foskett G
(1981) An estimate of the frequency of in vivo transcriptional errors at a nonsense Codon in Escherichia coli
Molecular and General Genetics MGG 183:561–563.

https://doi.org/10.1007/BF00268784
1. Rosenberger RF
2. Hilton J
(1983) The frequency of transcriptional and translational errors at nonsense codons in the lacZ gene of Escherichia coli
Molecular and General Genetics MGG 191:207–212.

https://doi.org/10.1007/BF00334815
(2008) 8-Oxoguanine-mediated transcriptional mutagenesis causes ras activation in mammalian cells
PNAS 105:18877–18882.

https://doi.org/10.1073/pnas.0806464105
1. Schwartz MH
2. Wang H
3. Pan JN
4. Clark WC
5. Cui S
6. Eckwahl MJ
7. Pan DW
8. Parisien M
9. Owens SM
10. Cheng BL
11. Martinez K
12. Xu J
13. Chang EB
14. Pan T
15. Eren AM
(2018) Microbiome characterization by high-throughput transfer RNA sequencing and modification analysis
Nature Communications 9:5353.

https://doi.org/10.1038/s41467-018-07675-z
(2002) Use of an in vivo reporter assay to test for transcriptional and translational fidelity in yeast
Journal of Biological Chemistry 277:24420–24426.

https://doi.org/10.1074/jbc.M202059200
1. Springgate CF
2. Loeb LA
(1975) On the fidelity of transcription by Escherichia coli ribonucleic acid polymerase
Journal of Molecular Biology 97:577–591.

https://doi.org/10.1016/S0022-2836(75)80060-X
(2012) Isolation and characterization of transcription fidelity mutants
Biochimica Et Biophysica Acta (BBA) - Gene Regulatory Mechanisms 1819:694–699.

https://doi.org/10.1016/j.bbagrm.2012.02.005
(1997)
Stabilities of internal rU-dG and rG-dT pairs in RNA/DNA hybrids

Nucleic Acids Symposium Series 37:199–200.
(2000) Thermodynamics-structure relationship of single mismatches in RNA/DNA duplexes
Biochemistry 39:11270–11281.

https://doi.org/10.1021/bi000819p
1. Sung W
2. Ackerman MS
3. Miller SF
4. Doak TG
5. Lynch M
(2012) Drift-barrier hypothesis and mutation-rate evolution
PNAS 109:18488–18492.

https://doi.org/10.1073/pnas.1216223109
1. Sung W
2. Ackerman MS
3. Gout JF
4. Miller SF
5. Williams E
6. Foster PL
7. Lynch M
(2015) Asymmetric Context-Dependent mutation patterns revealed through Mutation-Accumulation experiments
Molecular Biology and Evolution 32:1672–1683.

https://doi.org/10.1093/molbev/msv055
1. Sung W
2. Ackerman MS
3. Dillon MM
4. Platt TG
5. Fuqua C
6. Cooper VS
7. Lynch M
(2016) Evolution of the Insertion-Deletion mutation rate across the tree of life
G3: Genes|Genomes|Genetics 6:2583–2591.

https://doi.org/10.1534/g3.116.030890
1. Traut TW
(1994) Physiological concentrations of purines and pyrimidines
Molecular and Cellular Biochemistry 140:1–22.

https://doi.org/10.1007/BF00928361
1. Traverse CC
2. Ochman H
(2016) Conserved rates and patterns of transcription errors across bacterial growth states and lifestyles
PNAS 113:3311–3316.

https://doi.org/10.1073/pnas.1525329113
1. Traverse CC
2. Ochman H
(2018) A Genome-Wide Assay Specifies Only GreA as a Transcription Fidelity Factor in Escherichia coli
G3: Genes|Genomes|Genetics 8:2257–2264.

https://doi.org/10.1534/g3.118.200209
1. van Dijk D
2. Dhar R
3. Missarova AM
4. Espinar L
5. Blevins WR
6. Lehner B
7. Carey LB
(2015) Slow-growing cells within isogenic populations have increased RNA polymerase error rates and DNA damage
Nature Communications 6:7972.

https://doi.org/10.1038/ncomms8972
(1998a) Mutations in RNA: a first example of molecular misreading in Alzheimer's disease
Trends in Neurosciences 21:331–335.

https://doi.org/10.1016/S0166-2236(98)01280-6
1. van Leeuwen FW
2. de Kleijn DP
3. van den Hurk HH
4. Neubauer A
5. Sonnemans MA
6. Sluijs JA
7. Köycü S
8. Ramdjielal RD
9. Salehi A
10. Martens GJ
11. Grosveld FG
12. Peter J
13. Burbach H
14. Hol EM
(1998b) Frameshift mutants of beta amyloid precursor protein and ubiquitin-B in Alzheimer's and Down patients
Science 279:242–247.

https://doi.org/10.1126/science.279.5348.242
1. Vermulst M
2. Denney AS
3. Lang MJ
4. Hung CW
5. Moore S
6. Moseley MA
7. Mosely AM
8. Thompson JW
9. Thompson WJ
10. Madden V
11. Gauer J
12. Wolfe KJ
13. Summers DW
14. Schleit J
15. Sutphin GL
16. Haroon S
17. Holczbauer A
18. Caine J
19. Jorgenson J
20. Cyr D
21. Kaeberlein M
22. Strathern JN
23. Duncan MC
24. Erie DA
(2015) Transcription errors induce proteotoxic stress and shorten cellular lifespan
Nature Communications 6:8065.

https://doi.org/10.1038/ncomms9065
(2012) Dual RNA-seq of pathogen and host
Nature Reviews Microbiology 10:618–630.

https://doi.org/10.1038/nrmicro2852
1. Zheng G
2. Qin Y
3. Clark WC
4. Dai Q
5. Yi C
6. He C
7. Lambowitz AM
8. Pan T
(2015) Efficient and quantitative high-throughput tRNA sequencing
Nature Methods 12:835–837.

https://doi.org/10.1038/nmeth.3478

Article and author information

Author details

Weiyi Li

Department of Biology, Indiana University, Bloomington, United States

Present address
Joint Initiative for Metrology in Biology, SLAC National Accelerator Laboratory, Stanford University, Stanford, United States

Contribution
Conceptualization, Resources, Data curation, Software, Formal analysis, Validation, Investigation, Visualization, Methodology, Writing - original draft, Writing - review and editing

Competing interests
No competing interests declared

ORCID iD: 0000-0002-1168-7093

Michael Lynch
1. Department of Biology, Indiana University, Bloomington, United States
2. Center for Mechanisms of Evolution, The Biodesign Institute, Arizona State University, Tempe, United States
Contribution
Conceptualization, Supervision, Funding acquisition, Validation, Investigation, Writing - original draft, Project administration, Writing - review and editing

For correspondence
mlynch11@asu.edu

Competing interests
No competing interests declared

ORCID iD: 0000-0002-1653-0642

Funding

National Institutes of Health (R01-GM036827)

Michael Lynch

National Institutes of Health (R35-GM122566)

Michael Lynch

Army Research Office (W911NF-09-1-0444)

Michael Lynch

Army Research Office (W911NF-14-1-0411)

Michael Lynch

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank Stephen Simpson, W Kelley Thomas, Samuel F Miller, Jiaqi Zheng, Jie Huang, and James Ford for technical support, and Daniel Kearns and Clay Fuqua for providing B. subtilis NCIB 3610 and A. tumefaciens C58 strains. We also thank Hongan Long, Michelle Marasco, Chi-Chun Chen, Parul Johri, and Jean-Francois Gout for helpful discussions. This work was supported by National Institutes of Health Awards R01-GM036827 and R35-GM122566, and by Multidisciplinary University Research Initiative Awards W911NF-09-1-0444 and W911NF-14-1-0411 from the US Army Research Office (to ML).

Copyright

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.