Figures and data

In vitro construction of minimal, near-standard, and standard genetic codes
(A) Anticodons of tRNAs and their corresponding codon assignments in the minimal genetic code (MGC), the near-standard genetic code (near-SGC), and the standard genetic code (SGC). Each codon is colored according to the physicochemical properties of the assigned amino acid: hydrophobic (green), aromatic (yellow), polar uncharged (orange), basic (blue), and acidic (pink). In each box, tRNA anticodons are shown on the left and the corresponding codons on the right. (B) Codon sets used for reporter genes. The 21-codon set contains only codons that are usable in MGC. Reporter genes composed of these codons were used in D and in the subsequent experiments using non-standard genetic codes. The 32- and 46-codon sets contain codons that are usable in near-SGC and SGC, respectively. Reporter genes composed of these codons were used in D. (C) Schematic of the translation assay. A reporter gene (NanoLuc) composed of the 21-, 32-, or 46-codon set was translated in a customized reconstituted translation system lacking endogenous tRNAs (tRNA-free PURE system; tfPURE) supplemented with in vitro-synthesized tRNAs corresponding to MGC, near-SGC, or SGC (IPEN tRNA at 100 ng/µL; all other tRNAs at 12 ng/µL) and T7 RNA polymerase (0.42 U/µL) at 30 °C for 16 h. (D) NanoLuc activity after incubation. In near-SGC (RV), the concentrations of two tRNAs (tRNAValCAC and tRNAArgCCU) were increased to 100 ng/µL. Each dot represents an independent experiment (n = 3). Bars indicate mean values, and error bars represent standard deviations.

Reassignment experiments to test the availability of 10 vacant codons for Ala, Ser, and Leu
(A) Schematic illustration of reassignment experiments. Translation with the original MGC and NanoLuc template is shown at the top for comparison. An example of Ala reassignment to the UUG codon is shown at the bottom. In this example, three Ala codons in the NanoLuc sequence were replaced with one type of vacant codon (e.g., UUG), generating a 21 + 1 (UUG-Ala) codon set. Similar reassignment experiments were performed for three amino acids (Ala, Ser, and Leu) and nine vacant codons. (B) NanoLuc translation results for each codon reassignment experiment. Translation reactions were performed in tfPURE supplemented with a 21-tRNA mixture (600 ng/µL), one additional tRNA variant (12 ng/µL), and a NanoLuc template (1 nM) containing two to four copies of the codon being tested (21 + 1 NNN-Ala/Ser/Leu codons). Reactions were incubated at 30 °C for 16 h, after which NanoLuc activity was measured. As a control, translation reactions lacking the additional tRNA variant were conducted (21 code, gray bars) and compared with the reactions containing the additional tRNA (21 + 1 code, pink bars). Additional controls included translation without any tRNA (no tRNA) and translation using MGC with NanoLuc templates encoded by the original 21 codons (21 codons), both shown for comparison. Each dot represents three technical replicates, and error bars represent standard deviations.
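The codon replacement described in panel A can be sketched as a simple string operation on a coding sequence. This is a minimal illustration only: the function name `replace_codons`, the example sequence, and the choice to replace the first matching codons in reading order are assumptions, not details taken from the experiments.

```python
def replace_codons(cds, old_codon, new_codon, max_replacements):
    """Replace up to max_replacements in-frame copies of old_codon with new_codon.

    Returns the edited sequence and the number of codons replaced.
    Which positions were replaced in the actual templates is not stated
    in the legend; replacing the first matches is an assumption.
    """
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    replaced = 0
    for idx, codon in enumerate(codons):
        if codon == old_codon and replaced < max_replacements:
            codons[idx] = new_codon
            replaced += 1
    return "".join(codons), replaced


# Hypothetical toy sequence (not NanoLuc): Met-Ala-Gly-Ala-Ala-Ala
toy_cds = "AUGGCUGGUGCUGCUGCU"
edited, n = replace_codons(toy_cds, "GCU", "UUG", 3)
print(edited, n)
```

Working codon by codon, rather than with a plain substring replacement, keeps the edit in frame, so an out-of-frame occurrence of the same three bases is never touched.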

Distribution of mutational costs of reassigned genetic codes
(A) Method for calculating the mutational cost of each genetic code based on three physicochemical properties of amino acids. For each property, the average change upon single-nucleotide substitutions from the 21 codons was calculated (see Methods for details). In the reassigned genetic codes analyzed here, one of three amino acids (Ala, Ser, or Leu) was assigned to each of the nine vacant codons shown in gray, and the costs were calculated for all possible reassignment combinations. (B) Distributions of mutational costs for each physicochemical property of amino acids. Dashed lines indicate the cost values of the 10 genetic codes selected for experimental construction. Red dashed lines indicate the minimum and maximum cost values for each cost definition, and orange lines indicate the cost values of near-SGC. (C, D, E) Genetic codes exhibiting the minimum and maximum mutational costs based on PR (C), MV (D), and HI (E). The physicochemical values of amino acids assigned to each codon are shown as heatmaps.
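The enumeration described in panel A can be sketched as follows. This is a toy illustration under stated assumptions, not the procedure from Methods: the property values, the fixed codon assignments, and the vacant codons are hypothetical placeholders, the cost is taken as the mean absolute property change, and substitutions to unassigned codons are skipped. With nine vacant codons and three candidate amino acids, the same loop would cover 3^9 = 19,683 codes; three vacant codons are used here to keep the example small.

```python
from itertools import product

BASES = "UCAG"

# Hypothetical property values in arbitrary units (illustration only).
PROPERTY = {"Ala": 1.0, "Ser": 0.0, "Leu": 3.0, "Gly": 0.5, "Val": 4.0}

# Hypothetical fixed assignments standing in for the 21-codon set.
FIXED_CODE = {"GCU": "Ala", "UCU": "Ser", "CUU": "Leu", "GGU": "Gly", "GUU": "Val"}

# Hypothetical vacant codons (the legend describes nine).
VACANT = ["GCC", "UCC", "CUC"]
CANDIDATES = ("Ala", "Ser", "Leu")


def neighbors(codon):
    """Yield the nine codons reachable by one nucleotide substitution."""
    for pos, base in enumerate(codon):
        for alt in BASES:
            if alt != base:
                yield codon[:pos] + alt + codon[pos + 1:]


def mutational_cost(code, source_codons, prop):
    """Mean absolute property change over single-nucleotide substitutions
    from source_codons. Substitutions to unassigned codons are skipped
    (an assumption of this sketch)."""
    changes = [
        abs(prop[code[src]] - prop[code[dst]])
        for src in source_codons
        for dst in neighbors(src)
        if dst in code
    ]
    return sum(changes) / len(changes)


def enumerate_costs():
    """Compute the cost of every reassignment combination."""
    costs = {}
    for combo in product(CANDIDATES, repeat=len(VACANT)):
        code = dict(FIXED_CODE)
        code.update(zip(VACANT, combo))
        costs[combo] = mutational_cost(code, FIXED_CODE, PROPERTY)
    return costs


costs = enumerate_costs()
best = min(costs, key=costs.get)    # combination with the minimum cost
worst = max(costs, key=costs.get)   # combination with the maximum cost
print(len(costs), best, worst)
```

Substitutions are counted only from the fixed codons, mirroring the legend's "from the 21 codons"; a reassigned vacant codon therefore affects the cost only as a substitution target.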

Translation of random mutagenesis libraries with near-SGC
(A) Schematic overview of the protein activity assay using a random mutation library. Reporter genes composed of the 21 codons were subjected to random mutagenesis by error-prone PCR at different Mn²⁺ concentrations to generate DNA libraries, as shown in Fig. S6. These libraries (5 nM) were translated using near-SGC, consisting of a 32-tRNA mixture (tRNAIPEN, tRNAValCAC, and tRNAArgCCU at 100 ng/µL; all other tRNAs at 12 ng/µL) in tfPURE, including T7 RNA polymerase (1.7 U/µL) at 30 °C for 16 h, and each protein activity was measured. (B, C, D) Dependence of β-galactosidase (GAL) (B), firefly luciferase (Luc) (C), and mStayGold (mSG) (D) activity on mutation rate. Note that the vertical axis of panel C (Luc) is on a log scale. Each dot represents the results of three technical replicates, and error bars represent standard deviations.

Translation of mutagenized DNA libraries with nonSGCs
(A) Schematic of the experiment for comparing protein activities translated with different genetic codes. Random libraries prepared at low and high mutation rates were translated using either the 10 nonSGCs or the near-SGC (RV). Translation conditions were identical to those described in Fig. 4. (B, D, F) Protein activities of products translated with each genetic code using low- and high-mutation DNA libraries. Activities are shown for β-galactosidase (GAL; B, mutation rate = 2.6 × 10⁻³ per base), firefly luciferase (Luc; D, mutation rate = 2.7 × 10⁻³ per base), and mStayGold (mSG; F, mutation rate = 4.8 × 10⁻³ per base). (C, E, G) Ratios of the protein activity of high-mutation libraries to that of low-mutation libraries, plotted against the corresponding theoretical mutational costs. Data are shown for GAL (C), Luc (E), and mSG (G). Mean values of three technical replicates are shown with standard deviations for GAL.
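The high-to-low activity ratio plotted in panels C, E, and G can be sketched as below. The replicate values are hypothetical, and the standard deviation of the ratio is obtained by first-order error propagation assuming independent measurements; how the error bars were actually derived is not stated in the legend, so that step is an assumption.

```python
from math import sqrt
from statistics import mean, stdev


def activity_ratio(high, low):
    """Return (ratio of means, propagated SD) for high- vs low-mutation activity.

    The SD uses first-order propagation for a quotient of independent
    quantities (an assumption of this sketch).
    """
    mean_high, mean_low = mean(high), mean(low)
    ratio = mean_high / mean_low
    relative_sd = sqrt((stdev(high) / mean_high) ** 2 + (stdev(low) / mean_low) ** 2)
    return ratio, ratio * relative_sd


# Hypothetical technical replicates in arbitrary activity units.
high_mutation = [410.0, 395.0, 402.0]
low_mutation = [1010.0, 980.0, 995.0]
ratio, ratio_sd = activity_ratio(high_mutation, low_mutation)
print(f"ratio = {ratio:.3f} +/- {ratio_sd:.3f}")
```

A ratio closer to 1 means that activity is better retained at the high mutation rate, which is the quantity compared against the theoretical mutational cost of each code.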