LOGO diagram of base proportions in the observed spliced leader sequences

The relative proportion of nucleotides retrieved within 50 bases of the 5-prime end of dinoflagellate transcripts is shown above the canonical spliced leader sequence from (28). The “anchor” sequence used to retrieve potential spliced leader sequences bioinformatically, “GCTCAAG”, is shown on the right.
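The retrieval and tabulation described in this legend can be sketched as follows. This is a minimal illustration rather than the pipeline actually used: it assumes that the anchor must lie entirely within the first 50 bases and that hits are right-aligned on the anchor before counting. The function names `leader_candidates` and `base_proportions` are hypothetical.

```python
from collections import Counter

ANCHOR = "GCTCAAG"  # anchor motif quoted in the legend
WINDOW = 50         # assumption: anchor must sit within the first 50 bases


def leader_candidates(transcripts, anchor=ANCHOR, window=WINDOW):
    """Return the 5' end of each transcript up to and including the anchor.

    Transcripts without the anchor inside the first `window` bases are skipped.
    """
    hits = []
    for seq in transcripts:
        seq = seq.upper()
        pos = seq.find(anchor, 0, window)  # search restricted to seq[0:window]
        if pos != -1:
            hits.append(seq[:pos + len(anchor)])
    return hits


def base_proportions(hits):
    """Right-align hits on the anchor and tabulate base proportions per column.

    Column 0 is the last base of the anchor; higher columns lie further 5'.
    """
    longest = max((len(h) for h in hits), default=0)
    columns = []
    for offset in range(longest):
        counts = Counter(h[-1 - offset] for h in hits if len(h) > offset)
        total = sum(counts.values())
        columns.append({base: n / total for base, n in counts.items()})
    return columns
```

Right alignment is chosen here because leaders truncated at the 5-prime end would otherwise shift every column; per-column proportions of this kind are what a LOGO diagram displays.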

Products recovered from total small RNA using biotinylated oligonucleotides complementary to the SL and U4 RNAs

SL- and U4-containing sequences from a small RNA fraction (<200 bases) were retrieved using biotinylated oligonucleotides complementary to either SL or U4 RNA. Products retrieved with biotinylated oligonucleotides complementary to the SL were subsequently treated with T2 RNAse or RNAse A. The products were separated electrophoretically using an Agilent Bioanalyzer and compared with size standards. A virtual gel is shown here. The samples from left to right are the size standards, the total small RNA (<200 bases), the pulldown with the anti-spliced leader oligonucleotide, the T2 RNAse digest of the spliced leader pulldown, the RNAse A digest of the spliced leader pulldown, and the pulldown with the anti-U4 snRNA oligonucleotide.

Modified bases in the 22-nucleotide SL, U4 snRNA, and decapping products

The percentage of total detectable bases contributed by each non-standard ribonucleoside is plotted, with the specific moiety on the X axis and percent on the Y axis; “ND” indicates not detectable. The first four classes are totals for each non-standard nucleoside, followed by the specific moieties for which there were standards: 1-methyladenosine (M1A), 6-methyl adenosine (M6A), 5-methyl cytosine (M5C), 1-methyl guanosine (M1G), 7-methyl guanosine (M7G), pseudouridine (Y), and 2,2,7-trimethyl guanosine (M227G). The RNAse A degradation of spliced leader isolates is shown in black, the U4 snRNA isolate is shown in light grey, and the free nucleosides following decapping of the 22nt spliced leader are shown in dark grey. Error bars represent triplicate compositional analyses from a single sample. Moieties other than 7-methyl guanosine following decapping are likely contaminants from the total RNA pool bound to the Sepharose beads used in sequence enrichment.

Contigs from the A. carterae transcriptome (Genbank SRA SRX722011) presumed to play a role in capping based on annotation

Contigs from the A. carterae transcriptome presumed to play a role in mRNA capping or RNA modification are listed using their contig designation (comp*****). Also listed for each contig are the expression level as fragments per thousand bases per million reads (FPKM), the eValue score and alignment length returned by BLAST, and the domain description and eValue, also returned by BLAST, denoting its functional role in cap structure formation.
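For readers unfamiliar with the unit, FPKM normalizes a fragment count by contig length and sequencing depth. The sketch below uses the standard definition; the function name and the example numbers are hypothetical and are not taken from the table.

```python
def fpkm(fragments, contig_length_bases, total_mapped_fragments):
    """Fragments per thousand bases of contig per million mapped fragments."""
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (fragments per million)
    return fragments * 1e9 / (contig_length_bases * total_mapped_fragments)


# Hypothetical example: 500 fragments on a 2,000-base contig,
# out of 20 million mapped fragments in the library.
example = fpkm(500, 2000, 20_000_000)
```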

Schematic phylogeny of the eIF4E gene family in A. carterae, based on an amino acid alignment using maximum likelihood

The eight different eIF4E family members from this species form three major eIF4E clades in this phylogeny. Sequence labels use the nomenclature from (32). Bootstrap support is not present for the light grey branches within the eIF4E-1 clade. The eIF4E family members highlighted in bold were selected for functional analysis in this investigation.

Summary of the eIF4E family members from Amphidinium carterae, their biochemical features, and their similarity to murine eIF4E-1

A) There are eight eIF4E family members present in A. carterae. They have been named according to their phylogenetic relationships as well as predictions of their biochemical functionality. We used S35-methionine to label recombinant proteins for our experiments. The predicted eIF4G/IP binding site is shown with negatively charged residues in red letters and positively charged residues in blue letters. The residues, using murine numbering, are displayed for each eIF4E to highlight the differences in residues thought to be essential for cap binding.

The calculated isoelectric point (pI) is an average of several algorithms used to predict hypothetical pI values. The FPKM expression level values are taken from an Illumina HiSeq RNA-seq run and Trinity assembly (unpublished results). B) The percent identity and similarity of each eIF4E family member were compared to each other and to murine eIF4E-1. Green and yellow indicate greater similarity and red indicates less.

Amino acid alignment of the eight eIF4E family members from A. carterae with the murine class I eIF4E1A

A) MUSCLE alignment

The amino acid sequences of the core eIF4E domains from the eight A. carterae eIF4E family members were aligned to the murine class I eIF4E, eIF4E1A, using MUSCLE. Conserved amino acids are shown in black, similar amino acids are shown in grey. The eight tryptophan residues (numbered as in the mammalian sequence) characteristic of eIF4Es are highlighted in yellow, along with the residues that are found at the corresponding positions in representative sequences from the three A. carterae clades. The residues known to be involved in interacting with the 7-methylguanosine cap in mammalian eIF4E1A are highlighted with a green arrowhead. The region that is associated with eIF4G/eIF4E-BP interaction in mammalian eIF4E1A is highlighted by a green bar. Residues in cyan are important for binding the phosphates of the m7G. In the eIF4G/4EBP interaction domain, negatively charged residues are highlighted in red and positively charged residues in blue.

B) Schematic of important residues: Comparison of the conserved core region of A. carterae eIF4E family members with that of mammalian eIF4E1A, illustrating the important binding regions for the m7G cap and eIF4G. The eight conserved tryptophan residues (numbered as in the mammalian sequence) are shown in black along with the residues that are found at the corresponding positions in representative sequences from the three A. carterae clades. Residues in yellow are important for binding the phosphates of the m7G (the aspartate at position 90 coordinates binding by arginine 157), while the eIF4G binding motif (S/TVxxFW) is shown in blue. The glutamate at position 103 (red) is involved in hydrogen bonding to the m7G. Note that dinoflagellate clade 1 eIF4Es have an insertion (↑) of 12–13 amino acids between positions equivalent to W73 and W102, as well as an insertion of 7–9 amino acids between W130 and W166.

Quantitative-PCR cycle thresholds of each eIF4E family member at a mid-day and mid-night time point on a diel cycle

The relative transcript abundance of each eIF4E family member is shown as a cycle threshold for the mid-day and mid-night time points of A. carterae cultures maintained on a 14:10 light:dark cycle. Ten nanograms of RNA were reverse-transcribed using random primers and used as template for qPCR, as outlined in Materials and Methods. cDNA was measured using SYBR green as an indicator.

Quantification of protein abundance of each eIF4E family member in A. carterae

Recombinant GST-tagged protein was purified from either the soluble or insoluble fraction of an E. coli lysate. Recombinant protein was quantified and diluted into a two-fold dilution standard curve starting at 25 ng for eIF4E-1a and eIF4E-1d1 or 3 ng for eIF4E-1b, eIF4E-1c, eIF4E-2, and eIF4E-3a. This standard curve was compared with a dilution curve of A. carterae cells suspended and boiled directly in SDS-PAGE sample buffer. Four two-fold dilutions were loaded, one per well, starting at 250,000 cell equivalents for eIF4E-1a and eIF4E-1d1 or 500,000 cell equivalents for eIF4E-1b, eIF4E-1c, and eIF4E-2. The pixel densities of each band from the standard curve were used to calculate the nanogram quantities, which were converted into molecules per cell based on the molecular weight and the cell equivalents loaded in that well.
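The final conversion from nanograms to molecules per cell is simple arithmetic and can be sketched as below. The function names and the example values (1 ng, a 50 kDa protein) are hypothetical illustrations, not measurements from this study; only the 250,000 cell-equivalent starting load and the four two-fold dilutions come from the legend.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def two_fold_series(top, n_points):
    """Two-fold dilution series starting at `top`, e.g. 25, 12.5, 6.25, ..."""
    return [top / 2 ** i for i in range(n_points)]


def molecules_per_cell(nanograms, mol_weight_da, cell_equivalents):
    """Convert a protein mass in one well to molecules per cell.

    nanograms        -- mass read off the recombinant-protein standard curve
    mol_weight_da    -- molecular weight of the protein in daltons (g/mol)
    cell_equivalents -- number of cells loaded in that well
    """
    moles = nanograms * 1e-9 / mol_weight_da
    return moles * AVOGADRO / cell_equivalents


# Cell loads for the four wells described in the legend (250,000 start).
cell_loads = two_fold_series(250_000, 4)

# Hypothetical example: 1 ng of a 50 kDa protein in the 250,000-cell well.
example = molecules_per_cell(1.0, 50_000, cell_loads[0])
```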

Assessment of the m7GTP and TMG binding capability of eIF4E family members from A. carterae

S35-methionine-labelled eIF4E family members were produced using a rabbit reticulocyte in vitro expression system. The ability to interact with m7GTP vs TMG cap was assessed by measuring the CPM of the unbound and bound fraction after loading, equilibrating, and washing the cap-Sepharose column. The counts were expressed as a percent of the total incorporation into each family member. The A. carterae eIF4E family members were compared to a positive control, C. elegans IFE-1, and a negative control, luciferase.

Only eIF4E-1a can be retrieved by m7GTP-Sepharose chromatography from cell lysates of A. carterae

Cell lysate was generated from a mid-day, actively growing culture of A. carterae. Proteins bound to m7GTP-Sepharose were analyzed by western blot using antibodies specific for each eIF4E family member. eIF4E-3a was not included in this analysis because we could not detect its expression by western blot with two separate antibodies.

A. carterae eIF4E-1a and eIF4E-1d complement yeast lacking the endogenous eIF4E gene

The S. cerevisiae strain JOS003 (61) was transformed with the Ura-selectable vector pRS416GPD that was either empty or contained a cDNA encoding one of the following: A. carterae eIF4E-1a, -1b, -1c, -1d1, eIF4E-2a, or eIF4E-3a. pRS416GPD constructs containing zebrafish eIF4E-1A and -1B were also included as positive and negative controls, respectively (59). Following selection on SC medium with galactose but lacking uracil and leucine, yeast from the resulting single colonies were transferred to YP-agar media containing G418 and either glucose (right) or galactose (left) as carbon source and allowed to grow at 30 °C for 72 h.

The binding affinity of purified A. carterae eIF4E family members for three separate cap structure analogs using SPR

Recombinant GST-tagged eIF4E proteins were purified from E. coli and loaded onto a surface plasmon resonance chip equipped with an anti-GST antibody. Binding experiments were carried out in phosphate-buffered potassium buffer (20 mM sodium phosphate pH 7.5, 150 mM KCl, 0.05% Tween-20) at 25 °C. A two-fold dilution series (250–1.95 μM for the dinucleotide and 62.5–0.97 μM for the mononucleotide) was used to measure the change in response for each cap structure analog. All dissociation constants (KD) were calculated from a dose-response curve, *except for the affinities of eIF4E-1d1 and mouse eIF4E-1 for m7GpppC, which were calculated from a kinetic curve fit by Biacore software algorithms.
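As an illustration of how a KD can be read from a dose-response curve, the sketch below generates the two dilution series given in the legend and fits a 1:1 steady-state binding model, R = Rmax·C/(KD + C), by a simple grid search. The 1:1 model, the grid search, and the function names are assumptions made for this example; the legend does not state which model or fitting routine was used.

```python
def two_fold_series(top, n_points):
    """Two-fold dilution series in the same units as `top` (here μM)."""
    return [top / 2 ** i for i in range(n_points)]


def steady_state_response(conc, kd, rmax):
    """Equilibrium response for 1:1 binding (assumed model)."""
    return rmax * conc / (kd + conc)


def fit_kd(concs, responses, kd_grid):
    """Least-squares KD chosen from `kd_grid`.

    For each candidate KD the best Rmax has a closed form, so only KD is
    searched. Returns (kd, rmax) with the smallest squared error.
    """
    best = None
    for kd in kd_grid:
        f = [c / (kd + c) for c in concs]
        rmax = sum(fi * r for fi, r in zip(f, responses)) / sum(fi * fi for fi in f)
        sse = sum((r - rmax * fi) ** 2 for fi, r in zip(f, responses))
        if best is None or sse < best[0]:
            best = (sse, kd, rmax)
    return best[1], best[2]


dinucleotide = two_fold_series(250.0, 8)    # 250 down to about 1.95 μM
mononucleotide = two_fold_series(62.5, 7)   # 62.5 down to about 0.98 μM

# Synthetic check with a made-up KD of 20 μM and Rmax of 100 response units.
synthetic = [steady_state_response(c, 20.0, 100.0) for c in dinucleotide]
kd_est, rmax_est = fit_kd(dinucleotide, synthetic, [float(k) for k in range(1, 101)])
```

A steady-state fit of this kind needs responses that reach a plateau at each concentration, which is one plausible reason a kinetic fit would be used instead for some protein and ligand pairs.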