Introduction

Tea is one of the most popular natural non-alcoholic beverages worldwide, with about 2 billion cups consumed daily (1). Theanine (γ-glutamylethylamide) is a unique non-protein amino acid and the most abundant free amino acid in tea plants (Camellia sinensis). It accounts for more than 50% of the total free amino acids and approximately 1-2% of the dry weight of the new shoots of tea plants (2). Theanine is the secondary metabolite that confers the umami taste of tea infusion and balances the astringency and bitterness caused by catechins and caffeine (3). It also has many health-promoting functions, including neuroprotective effects, enhancement of immune functions, and potential anti-obesity capabilities, among others (2, 4-9). Therefore, theanine content is highly correlated with green tea quality (10).

Theanine is synthesized from ethylamine (EA) and glutamate (Glu) by theanine synthetase (11, 12). Importantly, the large amount of theanine biosynthesis is determined by the high availability of EA in tea plants (13). Therefore, the evolution of EA biosynthesis in tea plants provided the basis for theanine biosynthesis and the formation of tea quality. EA is synthesized from alanine decarboxylation by alanine decarboxylase (CsAlaDC) (14) (Figure 1A). Indeed, CsAlaDC expression level and catalytic activity largely determine the theanine accumulation level in tea plants (15). As a novel gene, CsAlaDC had not been reported in other plant species before it was identified in tea plants. Previous studies indicated that AlaDC originated from the serine decarboxylase gene (SerDC) by gene duplication in tea plants (16). However, despite sharing highly conserved amino acid sequences, CsAlaDC and CsSerDC specifically catalyze the decarboxylation of alanine and serine, respectively (16) (Figure 1A and B). In addition, CsAlaDC exhibited a much lower enzymatic activity compared with CsSerDC (16). However, the structural basis and key amino acids underlying the evolution of the substrate specificity and enzymatic activity of CsAlaDC are unknown.

Metabolic pathways and sequence analysis of CsAlaDC and SerDCs in plants.

(A) The decarboxylation of serine and alanine in plants. SerDC, serine decarboxylase; AlaDC, alanine decarboxylase. (B) Multiple alignment of the amino acid sequences of CsAlaDC and the 6 SerDCs from kiwifruit (Actinidia chinensis), grape (Vitis vinifera), coffee (Coffea eugenioides), cocoa (Theobroma cacao), Arabidopsis and tea plants. Residues with primary (100%), secondary (80%), and tertiary (60%) levels of conservation are shaded in deep blue, light blue and cheer red, respectively. “*” indicates amino acid residues mutated only in CsAlaDC.

SerDC belongs to the Group II pyridoxal-5-phosphate (PLP)-dependent decarboxylase superfamily (17). These Group II amino acid decarboxylases produce many bioactive secondary metabolites and signaling molecules (18-20). In plants, SerDC catalyzes the biosynthesis of ethanolamine from serine and was first functionally characterized in Arabidopsis thaliana (21) (Figure 1A). Ethanolamine is an important metabolite in the synthesis of phosphatidylethanolamine and phosphatidylcholine, two main phospholipids that maintain the structure of eukaryotic cell membranes (22-24). Furthermore, SerDC has been shown to be essential for embryogenesis in Arabidopsis (25). SerDC is therefore of vital importance for plant growth and development, but the mechanisms underlying its substrate recognition and catalytic activity also remain unclear.

In this work, we sought to understand the mechanism of functional divergence of plant AlaDC and SerDC by analyzing their structural characteristics. We obtained the X-ray crystal structures of CsAlaDC and AtSerDC. In these structures, we found a distinctive zinc finger in both CsAlaDC and AtSerDC, which has not been identified in any other previously characterized Group II PLP-dependent amino acid decarboxylase. By comparing the substrate binding pockets, we identified Phe106 of CsAlaDC and Tyr111 of AtSerDC as the crucial sites for substrate specificity. By conducting mutation screening based on the protein structures, we identified the amino acids repressing the catalytic activity and discovered that CsAlaDCL110F/P114A exhibited a 2.3-fold increase in catalytic activity compared with wild-type CsAlaDC and improved theanine production in vitro.

Results

Enzymatic properties of CsAlaDC, AtSerDC, and CsSerDC

CsAlaDC originates from CsSerDC by gene duplication and neofunctionalization in tea plants, but they catalyze different metabolic processes (Figure 1A). We performed a multiple sequence alignment of the amino acid sequences of the 6 SerDCs from kiwifruit (Actinidia chinensis), grape (Vitis vinifera), coffee (Coffea eugenioides), cocoa (Theobroma cacao), Arabidopsis and tea plants and CsAlaDC (Figure 1B). The results showed that the amino acid sequences of CsAlaDC and SerDCs were highly conserved, but CsAlaDC has amino acid mutations in highly conserved regions compared with SerDCs.

Next, to verify the substrate specificity and enzyme activity, we conducted enzyme activity assays and enzyme kinetics analysis for CsAlaDC, CsSerDC and AtSerDC. The 5′ truncated CsAlaDC and AtSerDC were inserted into the pET22b and pET28a expression vectors to generate the recombinant plasmids pET22b-CsAlaDC and pET28a-AtSerDC, respectively. Additionally, the full-length CsSerDC was inserted into the pET28a expression vector to generate the recombinant plasmid pET28a-CsSerDC. Subsequently, the recombinant proteins were expressed in Escherichia coli and purified via nickel affinity chromatography. The purified proteins were examined by SDS-PAGE (Figure 2A) and by size-exclusion chromatography (Supplementary figure 1A), which indicated that the proteins form oligomers in solution.

Purification and characterization of CsAlaDC, AtSerDC, and CsSerDC.

(A) Identification of CsAlaDC, AtSerDC, and CsSerDC using SDS-PAGE. (B) Detection of enzyme activities of CsAlaDC, AtSerDC, and CsSerDC by UPLC. (C) Reaction rates of substrates with different concentrations catalyzed by CsAlaDC, AtSerDC, and CsSerDC.

To verify the substrate specificity of CsAlaDC, AtSerDC, and CsSerDC, enzyme activity assays were conducted using Ala and Ser as substrates, followed by product identification via UPLC analysis. The results showed that CsAlaDC can effectively catalyze the decarboxylation of alanine to generate EA, whereas its catalytic efficacy towards serine was relatively inferior. Conversely, AtSerDC and CsSerDC selectively catalyzed serine decarboxylation to yield ethanolamine but did not exhibit activity towards alanine (Figure 2B). These findings confirmed that CsAlaDC and SerDCs do not have promiscuous decarboxylase activity on Ala and Ser.

The kinetic properties of CsAlaDC, AtSerDC, and CsSerDC were determined with their corresponding substrates (Figure 2C), and the results are presented in Table 1. The Km of CsAlaDC was 1.215 mM, similar to those of AtSerDC (1.522 mM) and CsSerDC (2.364 mM). However, AtSerDC has a Vmax of 7.053 μmol L⁻¹ s⁻¹, approximately 4-fold that of CsAlaDC (1.709 μmol L⁻¹ s⁻¹), while CsSerDC has a Vmax of 9.031 μmol L⁻¹ s⁻¹, 5.3-fold that of CsAlaDC. These findings suggest that the catalytic efficiency of CsSerDC and AtSerDC is approximately triple that of CsAlaDC, and that the Vmax of CsAlaDC is considerably lower than that of both SerDCs. The disparity in Vmax suggests that CsAlaDC and SerDCs differ in substrate-to-product conversion rates under saturating substrate conditions. It is plausible that SerDCs possess a higher turnover rate, facilitating the rapid conversion of substrate to product and subsequent release of the enzyme. Another possibility is that SerDCs are more stable, thereby maintaining superior catalytic activity even at higher substrate concentrations and consequently yielding a higher Vmax. Because CsSerDC and AtSerDC have similar catalytic activity, we chose the more representative AtSerDC for further analysis.

Kinetic parameters of CsAlaDC, AtSerDC and CsSerDC
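The fold differences quoted for the kinetic parameters can be cross-checked with a short calculation. The sketch below is illustrative only: it uses the Km and Vmax values reported in the text, evaluates the standard Michaelis-Menten rate law, and compares Vmax/Km between enzymes. Using Vmax/Km as a stand-in for catalytic efficiency assumes that the same enzyme concentration was used in every assay, which the text does not state; the function names are hypothetical.

```python
# Km (mM) and Vmax (umol L^-1 s^-1) exactly as reported in the text.
KINETICS = {
    "CsAlaDC": {"km": 1.215, "vmax": 1.709},
    "AtSerDC": {"km": 1.522, "vmax": 7.053},
    "CsSerDC": {"km": 2.364, "vmax": 9.031},
}


def michaelis_menten_rate(substrate_mm, km, vmax):
    """Initial rate v = Vmax * [S] / (Km + [S]); [S] and Km in mM."""
    return vmax * substrate_mm / (km + substrate_mm)


def vmax_of(name):
    """Reported Vmax of the named enzyme."""
    return KINETICS[name]["vmax"]


def vmax_over_km(name):
    """Vmax/Km, used here as a proxy for catalytic efficiency.

    ASSUMPTION: equal enzyme concentration in all assays, so that
    ratios of Vmax/Km track ratios of kcat/Km.
    """
    params = KINETICS[name]
    return params["vmax"] / params["km"]


def fold_vs_reference(metric, name, reference="CsAlaDC"):
    """Ratio of a metric for one enzyme relative to the reference enzyme."""
    return metric(name) / metric(reference)


if __name__ == "__main__":
    for enzyme in ("AtSerDC", "CsSerDC"):
        vmax_fold = fold_vs_reference(vmax_of, enzyme)
        efficiency_fold = fold_vs_reference(vmax_over_km, enzyme)
        print(f"{enzyme}: Vmax {vmax_fold:.2f}-fold, "
              f"Vmax/Km {efficiency_fold:.2f}-fold relative to CsAlaDC")
```

Under the stated assumption, the Vmax/Km ratios come out at roughly 3.3 for AtSerDC and 2.7 for CsSerDC, consistent with the "approximately triple" efficiency described above.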

The overall structures of CsAlaDC and AtSerDC

To enhance our comprehension of CsAlaDC and AtSerDC, we conducted structural analyses of these two proteins. Through optimization of crystallization conditions, we successfully determined the crystal structures of CsAlaDC, CsAlaDC-EA, and AtSerDC at resolutions of 2.50 Å, 2.60 Å and 2.85 Å, respectively (Supplementary table 1).

The overall structure of CsAlaDC or AtSerDC is homodimeric, with the two subunits arranged asymmetrically (Figure 3A and 3D). The monomer of CsAlaDC or AtSerDC is divided into three distinct structural domains: an N-terminal domain (N-terminus to 104 aa in CsAlaDC, N-terminus to 109 aa in AtSerDC), a large domain (105-364 aa in CsAlaDC, 110-369 aa in AtSerDC), and a small C-terminal domain (365 aa to C-terminus in CsAlaDC, 370 aa to C-terminus in AtSerDC), colored light pink, khaki, and sky blue, respectively. Compared to the full-length proteins, the crystallized CsAlaDC and AtSerDC were truncated at the N-terminus by 60 and 65 amino acid residues, respectively. The N-terminal domain of the truncated protein contains a long α-helix, which is connected to the large domain through a long loop. The helix of one subunit is antiparallel to the corresponding helix in the neighboring subunit, forming a clamp. Therefore, the N-terminal domain may not be an independent folding unit and is likely only stable in dimers. The large domain contains a seven-stranded mixed β-sheet surrounded by eight α-helices, while the C-terminal domain is composed of a three-stranded antiparallel β-sheet and three α-helices (Supplementary figure 1). The catalytic site of the enzyme is located in a shallow cleft at the dimer interface. Amino acid residues from both subunits participate in binding PLP, anchoring it to the active site. In the catalytic site, one monomer accommodates PLP, while the other monomer also contributes to PLP binding and enzymatic function (Figure 3A and 3D).

Crystal structures of CsAlaDC and AtSerDC.

(A) Dimer structure of CsAlaDC. The N-terminal domain, large domain, and C-terminal domain of chain A are shown in light pink, khaki and sky blue, respectively. Chain B is shown in spring green. The PLP molecule is shown as a sphere model. The zinc finger structure at the C-terminus of CsAlaDC is indicated by the red box. The gray spheres represent zinc ions, while the red dotted lines depict the coordination bonds formed by zinc ions with cysteine and histidine. (B) The 2Fo-Fc electron density maps of K309-PLP-EA (contoured at 1σ level). The PLP is shown in violet, K309 in spring green, and the EA in light blue. (C) Active center of the CsAlaDC-EA complex, with hydrogen bonds denoted by black dotted lines. “*” denotes the amino acids on the adjacent subunit. (D) Dimer structure of AtSerDC. The N-terminal domain, large domain, and C-terminal domain of chain A are shown in light pink, khaki and sky blue, respectively. Chain B is shown in cyan. The PLP molecule is shown as a sphere model. The zinc finger structure at the C-terminus of AtSerDC is indicated by the red box. The gray spheres represent zinc ions, while the red dotted lines depict the coordination bonds formed by zinc ions with cysteine and histidine. (E) Active center of AtSerDC, with hydrogen bonds denoted by black dotted lines. “*” denotes the amino acids on the adjacent subunit. (F) Superposition of the CsAlaDC and AtSerDC monomers. CsAlaDC is depicted in spring green, while AtSerDC is shown in plum. The conserved catalytic loop is indicated by the red box. (G) Superposition of the active center residues of CsAlaDC apo and the CsAlaDC-EA complex. CsAlaDC apo is shown in floral white, while the CsAlaDC-EA complex is shown in spring green. (H) The relative activity of wild-type CsAlaDC and its Y336F mutant (left), and of wild-type AtSerDC and its Y341F mutant (right).

Notably, our investigation has revealed the presence of a distinctive zinc finger structure (as depicted in Figure 3A and 3D) located at the C-terminus of CsAlaDC and AtSerDC. This structure is composed of a loop structure spanning 17 amino acid residues, wherein coordination of Zn2+ is facilitated by three Cys residues and one His residue. Importantly, this particular configuration is exclusive to the two proteins under examination and has not been identified in any other Group II PLP-dependent amino acid decarboxylases that have been previously characterized (26, 27).

The crystal structures of the CsAlaDC-EA complex and AtSerDC were further analyzed. In the CsAlaDC-EA complex, PLP is bound to Lys309 via a Schiff base linkage (internal aldimine form), and the pyridine ring of PLP is sandwiched between the imidazole ring of His196 and the methyl group of Ala279, parallel to the imidazole ring. The carboxyl group of Asp277 stabilizes the N1 of PLP through a salt bridge, providing PLP with the strong electrophilicity necessary to stabilize carbanion intermediates during catalysis. The O3 forms hydrogen bonds with Thr247 and Lys309. Additionally, His308, Gly169, Thr170, Lys309, and Ser347* (“*” denotes amino acids on the adjacent subunit) form a hydrogen bonding network with the phosphate group of PLP, anchoring it to the protein (Figure 3C). Similarly, in AtSerDC, the pyridine ring of PLP is sandwiched between the imidazole ring of His201 and the methyl group of Ala284. The N1 of PLP forms a salt bridge with Asp282, while the O3 forms a hydrogen bond with Thr252. The phosphate group of PLP forms hydrogen bonds with Lys314, His313, Gly174, Thr175, and Ser352* (Figure 3E). These amino acid residues are conserved across various Group II PLP-dependent amino acid decarboxylases (see further discussion below), suggesting that these enzymes share common features in their catalytic mechanisms.

Factors affecting CsAlaDC and AtSerDC activity

The Group II PLP-dependent amino acid decarboxylases are dimeric enzymes in which each monomer contains an active site located within a shallow cavity (28). A flexible loop from one monomer extends into the active site of the other monomer and plays a crucial role in catalysis. The catalytic loop in AtSerDC, consisting of amino acid residues 328-341, is well ordered and exhibits clear electron density. However, the corresponding loop in CsAlaDC is disordered (Figure 3F). This loop harbors a conserved Tyr residue that is believed to act as the proton donor to the carbanion during catalysis (29). CsAlaDC Tyr336 and AtSerDC Tyr341 correspond to this Tyr residue (Supplementary figure 2). Superimposing the active site residues of CsAlaDC apo and the CsAlaDC-EA complex showed that the hydroxyl group of Tyr336* in the CsAlaDC-EA complex is rotated by about 60 degrees around the Cα-Cβ bond (Figure 3G). To assess the function of this Tyr, we substituted the corresponding Tyr residues in CsAlaDC and AtSerDC with Phe. The resulting CsAlaDCY336F and AtSerDCY341F mutants were then incubated with Ala and Ser, respectively, and we measured the production of EA or ethanolamine in the reaction mixtures. Decarboxylation was largely abolished in these mutants, as evidenced by the absence of detectable EA and only a small amount of ethanolamine in the reaction mixtures (Figure 3H). This result suggests that this Tyr is required for the catalytic activity of CsAlaDC and AtSerDC.

Identification of key amino acids for the substrate specificity

The superposition of amino acid residues in the substrate binding pocket of CsAlaDC and AtSerDC revealed that all residues are identical, except for the substitution of Tyr at position 111 in AtSerDC with Phe at position 106 in CsAlaDC (Figure 4A). This observation suggests a potential role for this specific residue in determining the selective binding of the appropriate substrate.

Key amino acid residues for substrate recognition.

(A) Superposition of substrate binding pocket amino acid residues in CsAlaDC and AtSerDC. The amino acid residues of CsAlaDC are shown in spring green and those of AtSerDC in plum, with the substrate specificity-related amino acid residue highlighted by a red ellipse. (B) Active-site-lining amino acid residues of SerDC homologs from Embryophyta. The height of each amino acid is scaled proportionally to its information content (measured in bits). The first line depicts the conserved motif in all SerDC homologs from Embryophyta, whereas lines 2-5 represent the conserved motifs grouped by the variable third amino acid residue. (C) Histogram showing the distribution of the number of key motifs. (D) Histogram showing the number of key motifs in different plant orders. (E) Relative enzyme activities of wild-type CsAlaDC and the mutant protein CsAlaDCF106Y against Ala (columns 1 and 2), and of wild-type AtSerDC and various AtSerDC mutant proteins against Ala (columns 3-9). Activities are expressed relative to wild-type CsAlaDC (taken as 100%). (F) Relative enzyme activities of wild-type AtSerDC and AtSerDC mutant proteins against Ser (columns 1-7), and of wild-type CsAlaDC and the mutant protein CsAlaDCF106Y against Ser (columns 8 and 9). Activities are expressed relative to wild-type AtSerDC (taken as 100%). Three independent experiments were conducted. (G) The EA contents of AtSerDC and its mutant AtSerDCY111F in N. benthamiana. (H) The EA contents of CsAlaDC and its mutant CsAlaDCF106Y in N. benthamiana. Significant differences (P<0.05) are labeled with different letters according to Duncan’s multiple range test.

To gain further insights into the functional role of this residue (Tyr111 in AtSerDC or Phe106 in CsAlaDC) in selective binding of the appropriate substrate, we identified 563 potential serine decarboxylases in Embryophyta using the amino acid sequence of AtSerDC (Supplementary figure 3A). By comparing the amino acid sequences of these serine decarboxylases, we observed that the residues corresponding to Tyr111 or Phe106 are located within a conserved motif comprising 9 amino acids (Figure 4B). Within this motif, the first two residues, Y (Tyrosine) and P (Proline), are completely conserved across all 563 proteins. However, the third residue, where Tyr111 in AtSerDC or Phe106 in CsAlaDC is positioned, varies among Y, T (Threonine), F (Phenylalanine), V (Valine), A (Alanine), I (Isoleucine), L (Leucine), and other amino acids (Figure 4B). As in AtSerDC, this residue is Y in 83.7% of these homologs; in contrast, F is present in only 2.3% of these homologs, from plant species belonging to the Poales, Asterales, Ranunculales, Ericales, Caryophyllales and Nymphaeales orders (Figure 4C-4D).
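The tally of the variable third residue described above can be sketched in a few lines. This is a minimal illustration rather than the pipeline used in the study: the input motifs are hypothetical toy strings (only the first three positions of the 9-residue motif are modelled), and the function names are invented for this example.

```python
from collections import Counter


def third_residue_distribution(motifs):
    """Return {residue: percent} for position 3 of each Y-P-X motif."""
    if not motifs:
        raise ValueError("at least one motif is required")
    counts = Counter()
    for motif in motifs:
        # The first two positions (Y, P) are reported as fully conserved.
        if len(motif) < 3 or motif[:2] != "YP":
            raise ValueError(f"motif does not start with conserved YP: {motif!r}")
        counts[motif[2]] += 1
    total = sum(counts.values())
    return {residue: 100.0 * n / total for residue, n in counts.items()}


def implied_count(total_sequences, percent):
    """Approximate number of sequences implied by a reported percentage."""
    return round(total_sequences * percent / 100.0)


if __name__ == "__main__":
    # HYPOTHETICAL toy input, not the 563 homologs analyzed in the text.
    toy_motifs = ["YPY", "YPY", "YPY", "YPF", "YPT"]
    print(third_residue_distribution(toy_motifs))
    # Counts implied by the percentages reported in the text.
    print(implied_count(563, 83.7), implied_count(563, 2.3))
```

Applied to the reported percentages, 83.7% and 2.3% of 563 homologs correspond to roughly 471 and 13 sequences, respectively; these counts are back-calculated here and are not stated in the text.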

To verify the role of Tyr111 in AtSerDC or Phe106 in CsAlaDC in the selective binding of the appropriate substrate, we mutated Tyr111 to Phe (AtSerDCY111F) and Phe106 to Tyr (CsAlaDCF106Y), and performed in vitro enzymatic activity assays. The results revealed that AtSerDCY111F acquired alanine decarboxylase activity, whereas CsAlaDCF106Y completely lost its alanine decarboxylase activity (Figure 4E). Additionally, we generated other mutations of Tyr111 in AtSerDC, including AtSerDCY111A, AtSerDCY111I, AtSerDCY111L, AtSerDCY111V and AtSerDCY111W. Among these mutants, only AtSerDCY111I exhibited a marginal level of alanine decarboxylase activity (Figure 4E). These findings indicate the essential role of Phe106 in the selective binding of alanine by CsAlaDC.

On the other hand, both CsAlaDC and CsAlaDCF106Y exhibited a very low level of serine decarboxylase activity (Figure 4F). Moreover, AtSerDCY111F retained about 30% of the serine decarboxylase activity of AtSerDC, and AtSerDCY111I retained less than 5%, while the other mutations, including AtSerDCY111A, AtSerDCY111L, AtSerDCY111V and AtSerDCY111W, abolished the serine decarboxylase activity (Figure 4F). These results suggest that Tyr111 of AtSerDC is also important for the selective binding of serine.

To further verify in planta that Phe106 of CsAlaDC and Tyr111 of AtSerDC are the key amino acid residues determining the selective binding of substrates, we employed the Nicotiana benthamiana transient expression system. To this end, Agrobacterium tumefaciens strain GV3101 (pSoup-p19) carrying each recombinant plasmid was infiltrated into leaves of 5-week-old N. benthamiana plants, and the pCAMBIA1305 empty vector was used as the control (EV). Relative mRNA levels (Supplementary figure 4) indicated that the genes had been successfully overexpressed in tobacco. We found that a high level of EA was produced in the CsAlaDC-expressing tobacco leaves, while no EA was detected in tobacco leaves infiltrated with the mutant CsAlaDCF106Y. In addition, we did not detect EA in the AtSerDC-expressing tobacco leaves, while a high level of EA was detected in tobacco leaves infiltrated with the mutant AtSerDCY111F. As anticipated, EA was not detected in tobacco leaves infiltrated with EV (Figure 4G and H). These results further verified the critical role of Phe106 in the selective binding of alanine by CsAlaDC.

Key amino acids for the evolution of CsAlaDC enzymatic activity

CsAlaDC and AtSerDC have a high sequence similarity of 74.5% and nearly identical structures, with an RMSD (root mean square deviation) of only 0.77 Å between the monomers. However, the two enzymes catalyze different amino acid decarboxylation reactions, and AtSerDC exhibits a significantly higher Vmax than CsAlaDC (Figure 2B and C). In light of this observation, we hypothesized that EA, generated via Ala decarboxylation by CsAlaDC, may be toxic to plants if it accumulates excessively. Thus, during the evolution of plant serine decarboxylase into alanine decarboxylase, the enzyme may have evolved not just to alter its substrate preference but also to reduce its catalytic activity, keeping EA production within a suitable range. Based on this hypothesis, we reasoned that mutating specific amino acids in CsAlaDC to the corresponding amino acids in SerDCs could enhance its activity, and that the results could provide insights into the evolution of CsAlaDC enzymatic activity.

Dimerization is essential for AlaDC/SerDC activity because the active site is composed of residues from two monomers. To identify the amino acids repressing the enzymatic activity of CsAlaDC during its evolution from SerDC, we analyzed the crystal structures and compared the amino acids at the dimer interfaces of CsAlaDC and AtSerDC. This analysis revealed that the amino acids at the dimer interface at positions 66, 97, 110, 114, 116, 117, 122, 315 and 345 differ between CsAlaDC and SerDCs (Figure 1B). Therefore, we mutated these amino acids of CsAlaDC into those at the corresponding positions of AtSerDC and CsSerDC, and performed enzyme activity assays (Figure 5A).

Mutations enhance CsAlaDC enzyme activity and theanine synthesis in vitro.

(A) Relative enzyme activities of CsAlaDC mutant proteins against Ala substrate. (B) Relative enzyme activities of CsAlaDCL110F, CsAlaDCP114A, and CsAlaDCL110F/P114A against Ala substrate. (C) Histogram showing the relative content of theanine resulting from different combinations of alanine decarboxylase and theanine synthetase. Three independent experiments were conducted.

The results demonstrated that CsAlaDCL110F and CsAlaDCP114A exhibited significantly enhanced enzyme activity compared to the wild-type CsAlaDC, with 2.1-fold and 1.59-fold increases, respectively. Furthermore, the catalytic activity of the CsAlaDCL110F/P114A double mutant was 2.3-fold higher than that of the wild-type protein (Figure 5B). These findings suggest a critical role of Leu110 and Pro114 of CsAlaDC in the evolution of enzymatic activity and provide a basis for improving CsAlaDC activity. One possible explanation is that these substitutions increase the hydrophobicity of the protein dimer interface.

In vitro synthesis of theanine

The biosynthetic pathway of theanine in tea plants comprises two consecutive enzymatic steps: alanine decarboxylase facilitates the decarboxylation of alanine to generate EA, while theanine synthetase catalyzes the condensation reaction between EA and Glu to synthesize theanine. By conducting mutation screens, we discovered a CsAlaDC mutant protein (L110F/P114A) that exhibited a 2.3-fold higher catalytic activity compared to the wild-type protein. Subsequently, we employed an in vitro theanine synthesis system utilizing CsAlaDC and either glutamine synthetase (PsGS [Pseudomonas syringae pv. syringae]) or gamma-glutamate methylamine ligase (MmGMAS [Methylovorus mays]) which have the ability to synthesize theanine from EA and Glu (30, 31).

The results showed that theanine was successfully synthesized from Ala and Glu using CsAlaDC in conjunction with either of the two theanine synthetases (Figure 5C). The theanine content generated by the combination of CsAlaDCL110F and MmGMAS, as well as CsAlaDCL110F/P114A and MmGMAS, was 4.57-fold and 6.72-fold higher, respectively, than that produced by the combination of wild-type CsAlaDC and MmGMAS. Comparable outcomes were observed with PsGS: the theanine content resulting from the combination of CsAlaDCL110F and PsGS, as well as CsAlaDCL110F/P114A and PsGS, was 1.62-fold and 4.33-fold higher, respectively, than that of the wild-type protein combination (Figure 5C). Hence, CsAlaDCL110F/P114A could effectively enhance theanine yield and thus holds potential for large-scale engineered production of theanine.

Discussion

The common and distinctive structural characteristics of CsAlaDC and AtSerDC compared with other amino acid decarboxylases

CsAlaDC and AtSerDC are Group II PLP-dependent amino acid decarboxylases, which exhibit numerous structural features common to other amino acid decarboxylases. The Dali search (32) revealed that AtSerDC and CsAlaDC share structural similarities with 7CIG (MetDC [Streptomyces sp. 590]), 7ERV (HisDC1 [Photobacterium phosphoreum]), 6KHO (TrpDC [Oryza sativa Japonica Group]), 4E1O (HisDC2 [Homo sapiens]), 6JY1 (AspDC [Methanocaldococcus jannaschii]), and 5GP4 (GluDC [Levilactobacillus brevis]). The monomers of these amino acid decarboxylases all consist of three characteristic domains: the C-terminal domain, the large domain and the N-terminal domain (Supplementary figure 5). The amino acid residues that bind to PLP cofactors are conserved across multiple enzymes (Supplementary figure 5 and 6). Out of the eight proteins analyzed, a total of thirteen amino acid residues were found to be conserved across all sequences (Glu143, Glu171, Lys201, Gly245, Thr247, Asp253, His275, Asp277, Ala279, Pro286, Ser302, Lys309, and Tyr336 in CsAlaDC). Some of them are situated within the active center and play a role in stabilizing PLP or facilitating catalytic reactions. Conversely, other residues reside outside the active center and their function remains unclear (Supplementary figure 6).

Group II PLP-dependent amino acid decarboxylases are characterized by a highly flexible loop, which is of great significance for the catalytic mechanism of the decarboxylase (33). The loops are located at the dimer interface and, in the closed conformation, extend into the active site of the other monomer. Prior research has established that the conserved Tyr residue within the loop plays a crucial role in catalysis by donating a proton to the carbanion of the quinonoid intermediate that arises following decarboxylation (34, 35). Our experiments demonstrate that substituting the corresponding Tyr with Phe renders the protein inactive. Using circular dichroism, we confirmed that the stability of the protein is not compromised by the mutations (Supplementary figure 7A, B, D, E). Both the mutant and wild-type proteins show absorption bands at 420 nm, signifying the formation of a Schiff base between PLP and the lysine residue at the active site. Given that PLP was incorporated during protein purification, the absorbance at 420 nm is consistent between the mutant and wild-type proteins when compared at equivalent concentrations (Supplementary figure 8). This implies that the mutation does not interfere with PLP binding. Consequently, the inactivation observed in the mutant proteins results neither from instability caused by the mutations nor from changes in the binding affinity between the protein and PLP.

We observed that substituting Phe106 with Tyr in CsAlaDC rendered CsAlaDC inactive, while substituting Tyr111 with Phe in AtSerDC conferred alanine decarboxylase activity. These changes in activity were unrelated to protein stability or affinity for PLP (Supplementary figure 7A, C, D, F and Supplementary figure 8), suggesting that the amino acid at this position is critical for determining substrate specificity. Based on the crystal structures of the two enzymes, we speculate that the hydrophilic nature of the Tyr residue in AtSerDC predisposes the active site to preferentially accommodate the hydrophilic amino acid Ser as its substrate. In contrast, the equivalent position in CsAlaDC is occupied by Phe, an amino acid lacking the hydroxyl group. This substitution enhances the hydrophobic nature of the substrate-binding pocket. Consequently, CsAlaDC selectively binds Ala, a comparatively hydrophobic amino acid, as its preferred substrate.

The monomeric configuration of CsAlaDC and AtSerDC is akin to that of other Group II PLP-dependent amino acid decarboxylases, with one conspicuous difference. Unlike other Group II PLP-dependent amino acid decarboxylases, the C-terminus of CsAlaDC and AtSerDC harbors an evident zinc finger structure. We proposed that this structure could influence enzyme stability. To test this hypothesis, we truncated the zinc finger structure from both proteins and expressed them. Following the excision of the zinc finger structure, CsAlaDC became insoluble, while AtSerDC displayed a similar trait to a certain extent (Supplementary figure 9). These results indicate that the zinc finger structure influences the enzyme’s stability. Whether it has additional functions requires further investigation.

Evolution of AlaDC and SerDC

The structure of CsAlaDC closely resembles that of AtSerDC; however, the two enzymes differ in substrate specificity. To clarify the relationship between CsAlaDC and other SerDCs, we analyzed serine decarboxylase-like proteins in Embryophyta that are highly homologous to CsAlaDC and constructed phylogenetic trees to gain insight into their evolutionary relationships (Supplementary figure 3). Based on our experimental findings and evolutionary evidence, we have identified a conserved YPX motif at the substrate binding pocket. For most plants, the residue X in this motif is predominantly Tyr, which corresponds to the serine decarboxylase protein. However, other residues such as Thr, Phe, Val, Ala, and Ile can also occupy this position (Figure 4B and C). Interestingly, serine decarboxylase-like proteins containing YPF motifs, which have the potential to function as alanine decarboxylases, were found to be distributed throughout the phylogenetic tree, including Asteroideae, Ericales, Chenopodiaceae, Poaceae, and Ranunculales (Supplementary figure 3B). However, these proteins were absent in some more recent species. This observation implies that the emergence of alanine decarboxylase in these particular species could be attributed to convergent evolution.

Moreover, we have observed the enrichment of serine decarboxylase-like proteins that possess a YPT motif within the Fabales (Supplementary figure 3B). Prior research has demonstrated that XP_004496485-1, which possesses a YPT motif, displays catalytic efficacy toward the processes of decarboxylation and oxidative deamination of Phe, Met, Leu, and Trp (36). Given that the YPT motif is highly conserved and widely distributed in Fabales, serine decarboxylase-like proteins bearing the YPT motif may have developed a unique substrate specificity in Fabales, beyond their conventional decarboxylation functions, as exemplified by XP_004496485-1 protein mentioned above. We speculate that these proteins may be capable of catalyzing other reactions, potentially involving non-protein amino acids or other substances as substrates.
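The motif-based grouping described in this section can be illustrated with a short sketch. It assumes that an alignment is already available and that the alignment column where the YPX motif starts is known; the sequences, names, column index, and labels below are hypothetical placeholders, not data from this study.

```python
from collections import Counter

# Labels paraphrase the text; they are illustrative, not a validated classifier.
MOTIF_LABELS = {
    "YPY": "typical serine decarboxylase-like",
    "YPF": "candidate alanine decarboxylase",
    "YPT": "variant enriched in Fabales",
}


def motif_at(aligned_seq: str, start: int) -> str:
    """Return the three aligned residues beginning at column `start`."""
    return aligned_seq[start:start + 3]


def tally_motifs(alignment: dict, start: int) -> Counter:
    """Count YPX motif variants at a fixed alignment column."""
    counts = Counter()
    for seq in alignment.values():
        motif = motif_at(seq, start)
        # Accept only an intact Y-P-X triplet with no gap at position X.
        if len(motif) == 3 and motif[:2] == "YP" and motif[2] != "-":
            counts[motif] += 1
        else:
            counts["no YPX"] += 1
    return counts


# Toy aligned fragments (hypothetical); the motif starts at column 2.
toy_alignment = {
    "seqA": "GAYPYLK",
    "seqB": "GAYPFLK",
    "seqC": "GAYPTLK",
    "seqD": "GA-PYLK",
}
counts = tally_motifs(toy_alignment, start=2)
for motif, n in counts.items():
    print(motif, n, MOTIF_LABELS.get(motif, "other"))
```

Fixing the column rather than searching for any "YP" dipeptide avoids counting unrelated YP occurrences elsewhere in the protein.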

Application to improving theanine synthesis

Theanine is an important indicator of green tea quality. Therefore, improving the synthesis of theanine in tea plants is a focus of research. In this study, through crystal structure analysis and mutational verification, we found that the catalytic activity of CsAlaDC L110F/P114A is 2.3 times higher than that of the wild-type protein, resulting in more abundant synthesis of theanine in vitro. This finding suggests that EA synthesis in tea plants could be improved by gene editing, thereby increasing the theanine content of tea plants (Figure 5).

Theanine is in high market demand because of its health effects and medicinal value and its use as a constituent in food, cosmetics, and other fields. To meet this demand, a variety of methods have been used to obtain theanine, mainly direct extraction, chemical synthesis, biotransformation (microbial fermentation), and plant cell culture. Based on the findings of this study, site-directed mutagenesis can be employed to modify enzymes involved in theanine synthesis. Such modification could enhance the capacity of bacteria, yeast, model plants, and other organisms to synthesize theanine, thereby facilitating industrial theanine production.

Conclusions

In conclusion, our structural and functional analyses have advanced understanding of the substrate-specific activities of alanine and serine decarboxylases, typified by CsAlaDC and AtSerDC. The critical amino acid residues responsible for substrate selection, Tyr111 in AtSerDC and Phe106 in CsAlaDC, were identified, highlighting their pivotal roles in enzyme specificity. The engineered CsAlaDC mutant (L110F/P114A) not only displayed enhanced catalytic efficiency but also substantially improved L-theanine yield in a synthetic biosynthesis setup with PsGS or MmGMAS. Our research expanded the repertoire of potential alanine decarboxylases through the discovery of 13 homologous enzyme candidates across embryophytic species and uncovered a special motif present in serine decarboxylase-like proteins within Fabales, suggesting a potential divergence in substrate specificity and catalytic functions. These insights lay the groundwork for the development of industrial biocatalytic processes for L-theanine production and support innovation within the tea industry.

Materials and Methods

Plant materials

Tobacco (Nicotiana benthamiana) plants were grown in a controlled chamber, under a 16-h light and 8-h dark photoperiod at 25 ℃. Leaves of 5-week-old tobacco plants were used for transient transformation, mediated by Agrobacterium tumefaciens strain GV3101.

Gene cloning and protein expression

The coding sequences of AtSerDC and CsAlaDC, encoding amino acid residues 66-482 and 61-478, respectively, were amplified by PCR and ligated into the Nde I and Xho I restriction sites of the pET-28a and pET-22b vectors, respectively, yielding the recombinant vectors pET-28a-AtSerDC and pET-22b-CsAlaDC. The cDNA encoding CsSerDC was amplified by PCR and ligated into the Nde I and Xho I restriction sites of the pET-28a vector. Gene-specific primers are listed in Supplementary table 2. The recombinant plasmids were transformed into E. coli BL21 (DE3) competent cells. Positive transformants were grown in 5 mL of LB medium containing 30 μg/mL kanamycin or 50 μg/mL ampicillin at 37 °C overnight and then subcultured into 800 mL of LB medium containing the corresponding antibiotic. When the optical density (OD) at 600 nm reached 0.6-0.8, protein expression was induced by the addition of 0.2 mM isopropyl-β-D-thiogalactoside (IPTG) for 20 h at 16 °C. Cells were then harvested by centrifugation at 4 °C and 4,000 rpm for 30 min.

Protein production and crystallization

The cell pellet was suspended in 30 mL of lysis buffer (20 mM HEPES, pH 7.5, 200 mM NaCl, 0.1 mM PLP), disrupted with a high-pressure homogenizer, and then centrifuged at 16,000 rpm for 30 min at 4 °C to remove the cell debris. The supernatants were purified on a Ni-Agarose resin column followed by size-exclusion chromatography. Before crystallization, purified proteins were concentrated to 10 mg/mL. Crystallization conditions were screened by the sitting-drop vapor diffusion method using the reservoir solutions supplied in commercially available screening kits (Crystal Screen, Crystal Screen 2, PEGRx 1, 2, and SaltRx 1, 2). A droplet made by mixing 1.0 μL of purified AtSerDC or CsAlaDC (10 mg/mL) with an equal volume of a reservoir solution was equilibrated against 100 μL of the reservoir solution at 16 °C. The crystal of AtSerDC was obtained using a buffer at pH 8.0 containing 20% (w/v) PEG400 as the precipitant and 0.2 M CaCl2. The crystal of the CsAlaDC-EA complex was obtained at pH 7.5 with 2.6 M sodium acetate. The crystal of CsAlaDC was obtained at pH 6.0 with 3.5 M sodium formate.

Data collection and processing

Crystals were grown by the sitting-drop vapor diffusion method at 16 °C. The volume of the reservoir solution was 100 µL and the drop volume was 2 µL, containing 1 µL of protein sample and 1 µL of reservoir solution. The reservoir solution for AtSerDC contained 0.1 M Tris-HCl (pH 8.0), 0.2 M CaCl2, and 20% PEG 400. The reservoir solution for the CsAlaDC-EA complex contained 0.1 M HEPES (pH 7.5) and 2.6 M sodium acetate, and that for CsAlaDC contained 0.1 M Bis-Tris (pH 6.0) and 3.5 M sodium formate. Crystals grew in 3-5 days at a protein concentration of 10 mg/mL. Diffraction data were collected at the Shanghai Synchrotron Radiation Facility (China). The collected data sets were indexed, integrated, and scaled using the HKL3000 software package. The structure of AtSerDC was solved by molecular replacement using the structure of HisDC (PDB code: 7ERV [Photobacterium phosphoreum]) as the search model, and the AtSerDC structure was in turn used as the molecular replacement model for both CsAlaDC and the CsAlaDC-EA complex. The structures of AtSerDC, CsAlaDC, and the CsAlaDC-EA complex were determined at resolutions of 2.85 Å, 2.50 Å, and 2.60 Å, respectively. The statistics for data collection and processing are summarized in Supplementary table 1.

Enzyme activity assays

Decarboxylase activity was measured by detecting the products (EA or ethanolamine) on a Waters Acquity ultraperformance liquid chromatography (UPLC) system (16, 37). The 100 μL reaction mixture, containing 20 mM substrate (Ala or Ser), 100 mM potassium phosphate, 0.1 mM PLP, and 0.025 mM purified enzyme, was prepared and incubated for 10 min under standard conditions (45 °C and pH 8.0 for CsAlaDC; 40 °C and pH 8.0 for AtSerDC). The reaction was then stopped with 20 μL of 10% trichloroacetic acid. The product was derivatized with 6-aminoquinolyl-N-hydroxy-succinimidyl carbamate (AQC) and analyzed by UPLC. All enzymatic assays were performed in triplicate.
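As a worked illustration of how an endpoint measurement from this assay could be converted to an apparent turnover, the sketch below uses the stated volumes, enzyme concentration, and incubation time. The conversion itself is an assumption (the text does not specify how activity values were calculated), it presumes product formation is linear over the incubation, it ignores any further dilution during derivatization, and the product concentration is a made-up value.

```python
def apparent_turnover(product_mM_after_quench: float,
                      enzyme_mM: float = 0.025,
                      minutes: float = 10.0,
                      reaction_uL: float = 100.0,
                      quench_uL: float = 20.0) -> float:
    """Apparent turnover (mol product per mol enzyme per minute).

    Assumes the product concentration was measured after the stop
    solution was added, so the quench dilution is undone first.
    """
    dilution = (reaction_uL + quench_uL) / reaction_uL
    product_mM_in_reaction = product_mM_after_quench * dilution
    return product_mM_in_reaction / (enzyme_mM * minutes)


# Hypothetical reading: 1.5 mM product measured in the quenched sample.
example = apparent_turnover(1.5)
print(f"apparent turnover: {example:.2f} per min")
```

With these placeholder numbers the 1.2-fold quench dilution gives 1.8 mM product in the original reaction, or about 7.2 turnovers per minute.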

Theanine production was detected by the same method as described above. The 100 μL reaction mixture, containing 20 mM Ala, 45 mM Glu, 50 mM HEPES (pH 7.5), 0.1 mM PLP, 30 mM MgCl2, 10 mM ATP, 0.03 mM PsGS or MmGMAS, and 0.025 mM CsAlaDC, CsAlaDC L110F, or CsAlaDC L110F/P114A, was prepared and incubated under standard conditions for 1 h. The reaction was then terminated by placing the reaction vessel in a metal bath at 96 °C for 3 min (31). Theanine was derivatized with AQC and analyzed by UPLC. All enzymatic assays were performed in triplicate.

Site-directed mutagenesis

Site-directed mutagenesis was performed by PCR using the wild-type constructs pET-28a-AtSerDC and pET-22b-CsAlaDC as templates. Dpn I endonuclease was used to digest the parental DNA template. The reaction mixture was used to transform E. coli DH5α competent cells, and the plasmids extracted from positive strains were transformed into E. coli BL21 (DE3) for protein expression, purification, and analysis of enzymatic activity.

In vivo enzyme activity assay in N. benthamiana

The amplified PCR products were cloned into the plant expression vector pCAMBIA1305, which had been linearized by restriction digestion with Spe I and BamH I. Recombinant colonies were selected on plates containing the appropriate antibiotic and validated by PCR. After validation, the plasmids were electroporated into Agrobacterium tumefaciens strain GV3101 (pSoup-p19). The empty pCAMBIA1305 vector, with an intron containing the GFP gene, was used as the control.

Agrobacterium transient expression assays were performed on 5-week-old N. benthamiana plants. Agrobacterium tumefaciens strain GV3101 (pSoup-p19) carrying the above-described vectors was cultured in Luria Bertani (LB) medium containing appropriate antibiotics at 28 °C. When the culture reached OD600 = 0.6-0.8, bacterial cells were collected and resuspended in MMA solution (10 mM MgCl2, 10 mM 2-(N-morpholino) ethane sulfonic acid (MES), pH 5.6). After the OD600 of the resuspended bacterial solution was adjusted to approx. 1.0, acetosyringone (AS) was added to a final concentration of 200 µM, and the solution was incubated at room temperature for at least 3 h in darkness. Next, cell suspensions were infiltrated into N. benthamiana leaves with a needle-free syringe. The leaves were collected 3 days post infiltration, frozen in liquid nitrogen, and stored at –80 °C. Internal EA in N. benthamiana leaves was extracted as previously described (18), and the extract was analyzed by gas chromatography-mass spectrometry (GC-MS).

Transcript level analysis in N. benthamiana

Total RNA was isolated from samples using the RNAprep Pure Plant Kit (Tiangen, Beijing, China), according to the manufacturer’s protocol. The cDNAs were synthesized using the TransScript One-Step gDNA Removal and cDNA Synthesis SuperMix Kit (TransGen Biotech, Beijing, China). The qRT-PCR assays were performed as previously described (38). Primers used for qRT-PCR assays are listed in Supplementary table 3, and the qRT-PCR was run on a Bio-Rad CFX96™ Real-Time PCR Detection System with CFX Manager Software. Each reaction (20 μL) contained 0.4 μL forward and reverse primers (10 μM), 2 μL cDNA (200±5 ng/μL), 10 μL SYBR Green Supermix (Vazyme, Nanjing, China) and 6.2 μL double-distilled water. The reaction was performed by a two-step method: 95 °C for 5 min, followed by 40 cycles of 95 °C for 10 s and 60 °C for 30 s. The glyceraldehyde-3-phosphate dehydrogenase (GAPDH) gene was used for internal normalization in each qRT-PCR, and the 2^(−ΔΔCT) method (39) was used to calculate the relative gene expression. All samples were analyzed in three replicates.
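The relative-expression calculation named above can be sketched as follows. This is a minimal illustration of the standard 2^(−ΔΔCT) arithmetic, which assumes near-100% amplification efficiency for both the target and the reference gene; the function name and the CT values are hypothetical and are not taken from this study.

```python
from statistics import mean


def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control) -> float:
    """Fold change of a target gene by the 2^(-ddCT) method.

    Each argument is a list of replicate CT values. The reference gene
    (GAPDH in the text) normalizes for input amount in each condition.
    """
    # dCT: target minus reference, within each condition.
    dct_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    dct_control = mean(ct_target_control) - mean(ct_ref_control)
    # ddCT: sample relative to the control condition.
    ddct = dct_sample - dct_control
    return 2.0 ** (-ddct)


# Made-up triplicate CT values for illustration only.
fold = relative_expression(
    ct_target_sample=[22.0, 22.5, 21.5],
    ct_ref_sample=[18.0, 18.25, 17.75],
    ct_target_control=[25.0, 25.5, 24.5],
    ct_ref_control=[18.0, 18.0, 18.0],
)
print(f"fold change: {fold:.1f}")
```

In this toy case the target CT is three cycles lower in the sample than in the control after normalization, which corresponds to an eightfold higher transcript level.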

Multiple sequence alignment

MUSCLE was used to generate the protein multiple sequence alignment of amino acid decarboxylases with default settings (40). ESPript 3.x was used to display the multiple sequence alignment (41).

Phylogenetic analysis

AtSerDC was used as a query to search for homologs in Embryophyta using BLASTP (42) (version 2.13.0). To retrieve homologs of AtSerDC from distant species, we performed a two-round BLAST search. We first identified the best hit in each order within Embryophyta. We then searched each species within that order using the order-level best hit as the query, keeping up to 10 hits. Sequences with E-values above 0.001 were removed, as were sequences shorter than 200 or longer than 800 amino acids. Multiple sequence alignment of the filtered sequences was then performed with clustalo (version 1.2.4) using default settings (43). The neighbor-joining (NJ) algorithm implemented in clustalw (version 2.1) was used to build the phylogenetic tree (44). The R packages ggtree (version 3.2.1) and universalmotif (version 1.12.4) were used for tree and motif visualization (45).
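The hit-filtering step of this pipeline can be sketched in a few lines. The sketch assumes that the E-value cutoff is meant to retain hits at or below 0.001 and that the 10-hit cap applies per species; the Hit record, function name, and example values are hypothetical and do not reproduce the scripts used in this study.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    species: str
    seq_id: str
    evalue: float
    length: int  # protein length in amino acids


def filter_hits(hits, max_hits=10, max_evalue=1e-3, min_len=200, max_len=800):
    """Keep up to `max_hits` best hits per species, then apply the cutoffs."""
    by_species = {}
    for hit in hits:
        by_species.setdefault(hit.species, []).append(hit)

    kept = []
    for species_hits in by_species.values():
        # Best hits first (lowest E-value), capped per species.
        top = sorted(species_hits, key=lambda h: h.evalue)[:max_hits]
        for hit in top:
            if hit.evalue <= max_evalue and min_len <= hit.length <= max_len:
                kept.append(hit)
    return kept


# Hypothetical hits for illustration only.
hits = [
    Hit("species_1", "p1", 1e-120, 480),  # kept
    Hit("species_1", "p2", 5e-2, 450),    # dropped: E-value above cutoff
    Hit("species_2", "p3", 1e-80, 150),   # dropped: shorter than 200 aa
    Hit("species_2", "p4", 1e-60, 900),   # dropped: longer than 800 aa
    Hit("species_2", "p5", 1e-90, 470),   # kept
]
kept_ids = sorted(h.seq_id for h in filter_hits(hits))
print(kept_ids)
```

The filtered sequences would then be passed to the alignment and tree-building tools named in the text.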

Author contributions

W.G, Z.Z and X.W designed research; H.W, B.Z, S.Q and C.D performed research; H.W and B.Z analyzed data; H.W, B.Z, W.G, and Z.Z wrote the paper.

Competing interest statement

The authors declare no competing financial interests.

Classification

Biochemistry (major), Plant Sciences (minor)

Acknowledgements

We thank the team of beamline BL18U1 in the Shanghai Synchrotron Radiation Facility for diffraction data collection.

Funding

This study is supported by grants from the Ministry of Science and Technology of China (2019YFA0904100 to W.G, 2022YFF1003103 to X.W) and the Natural Science Foundation of China (T2221005 to W.G, 32072624 to Z.Z).

Data availability

The structures of AtSerDC, CsAlaDC-EA complex and CsAlaDC have been deposited in the Protein Data Bank (PDB) under accession numbers 8JG7, 8JIK and 8JIJ respectively.

Supplementary figures

Purification of CsAlaDC, AtSerDC, and CsSerDC and Crystal structures of CsAlaDC and AtSerDC.

(A) Comparison of elution profiles of CsAlaDC, AtSerDC and CsSerDC. (B) Monomer structure of CsAlaDC. (C) Monomer structure of AtSerDC. The N-terminal domain, large domain, and C-terminal domain are shown in light pink, khaki and sky blue, respectively. The PLP molecule is shown as a sphere model. The gray spheres represent zinc ions.

Catalytic mechanisms and conformational changes of CsAlaDC.

After transaldimination of the internal aldimine within CsAlaDC, which releases the active-site residue Lys309, the PLP-amino acid external aldimine undergoes decarboxylation, leading to the removal of the α-carboxyl group as CO2. This process generates a quinonoid intermediate that is stabilized by the delocalization of paired electrons (1,2). The carbanion at Cα is subsequently protonated by the acidic p-hydroxyl group of Tyr336* located on the large loop, facilitated by its neighboring residue His196, which is situated on the small loop (3,4). Simultaneously, the internal aldimine LLP309 in CsAlaDC is restored, resulting in the release of the product (5).

Evolutionary analysis of CsAlaDC in Embryophyta.

(A) Unrooted evolutionary tree of CsAlaDC, showing only the topology of the tree without distance information. The color of the inner ring corresponds to the different orders, while the leaf nodes of the outer ring are colored according to the motif type of each sequence. (B) Diversity of serine decarboxylase-like proteins in Embryophyta (196 species). Colored dots on the right side of the leaf nodes correspond to the respective motifs shown in panel A.

The relative mRNA levels of AtSerDC and its mutant Y111F (A) and of CsAlaDC and its mutant F106Y (B) in N. benthamiana leaves.

Transcript levels were measured using two primers. WT, wild type of N. benthamiana; EV, empty vector control; NbGAPDH was used as an internal control. Data represent mean ± SD (n=3). Significant differences (P<0.05) are labeled with different letters according to Duncan’s multiple range test.

Structures of HisDC2, MetDC, TrpDC, AspDC, HisDC1 and GluDC.

(A) The overall structures of HisDC2, MetDC, TrpDC, AspDC, HisDC1 and GluDC. Chain A is shown in khaki, and chain B is shown in cyan. (B) Amino acid residues in the substrate binding pocket of HisDC2, MetDC, TrpDC, AspDC, HisDC1 and GluDC. The amino acid residues in chain A are shown in khaki, and the amino acid residues in chain B are shown in cyan.

Multiple sequence alignment of CsAlaDC, AtSerDC, MetDC, HisDC1, TrpDC, HisDC2, TyrDC, and GluDC was generated using MUSCLE and visualized with ESPript 3.x. Amino acid residues conserved in all eight proteins are highlighted with red backgrounds.

The magenta box marks key amino acid residues involved in substrate recognition for CsAlaDC and AtSerDC. The Lys residue covalently bound to the PLP cofactor is denoted by a red star, while the green triangle indicates the Tyr residue associated with enzymatic activity. Amino acid residues involved in the CsAlaDC substrate binding pocket are marked with blue circles.

Circular Dichroism Spectra of proteins.

(A) Circular dichroism spectrum of CsAlaDC (WT). (B) Circular dichroism spectrum of CsAlaDC Y336F. (C) Circular dichroism spectrum of CsAlaDC F106Y. (D) Circular dichroism spectrum of AtSerDC (WT). (E) Circular dichroism spectrum of AtSerDC Y341F. (F) Circular dichroism spectrum of AtSerDC Y111F.

Absorption Spectra of different proteins.

(A) Absorption spectra of CsAlaDC (WT), CsAlaDC Y336F and CsAlaDC F106Y. (B) Absorption spectra of AtSerDC (WT), AtSerDC Y341F and AtSerDC Y111F.

Purification of truncated CsAlaDC and AtSerDC.

Lane 1, marker. The precipitated and eluted samples of AtSerDC with a truncated zinc finger structure are shown in lanes 2 and 3, respectively; the precipitated and eluted samples of CsAlaDC with a truncated zinc finger structure are shown in lanes 4 and 5, respectively.

Supplementary tables

Data collection and refinement statistics

Primers used for gene cloning.

Primers used for real-time PCR.