The Lipocone Superfamily: A Unifying Theme In Metabolism Of Lipids, Peptidoglycan And Exopolysaccharides, Inter-Organismal Conflicts And Immunity

A Maxwell Burroughs; Gianlucca G Nicastro; L Aravind

doi:10.7554/eLife.108061.1

eLife Assessment

This fundamental study presents a compelling and comprehensive analysis of the newly defined Lipocone superfamily, offering unprecedented insights into the evolutionary origins of Wnt proteins. The authors provide evidence that this superfamily evolved from membrane proteins. The work is exemplary in its use of sequence analysis and structural modeling and will be of broad interest to researchers studying protein evolution and enzymology.

[Editors' note: this paper was reviewed by Review Commons.]

https://doi.org/10.7554/eLife.108061.1.sa4

Abstract

Wnt proteins are critical signaling molecules in developmental processes across animals. Despite intense study, their evolutionary roots have remained enigmatic. Using sensitive sequence analysis and structure modeling, we establish that the Wnts are part of a vast assemblage of domains, the Lipocone superfamily, defined here for the first time. It includes previously studied enzymatic domains like the phosphatidylserine synthases (PTDSS1/2) and the TelC toxin domain from Streptococcus intermedius, the enigmatic VanZ proteins, the animal Serum Amyloid A (SAA) and a further host of uncharacterized proteins in a total of 30 families. Although the metazoan Wnts are catalytically inactive, we present evidence for a conserved active site across this superfamily, versions of which are consistently predicted to operate on head groups of either phospholipids or polyisoprenoid lipids, catalyzing transesterification and phosphate-containing head group cleavage reactions. We argue that this superfamily originated as membrane proteins, with one branch (including Wnt and SAA) evolving into diffusible versions. By comprehensively analyzing contextual information networks derived from comparative genomics, we establish that they act in varied functional contexts, including regulation of membrane lipid composition, extracellular polysaccharide biosynthesis, and biogenesis of bacterial outer-membrane components, like lipopolysaccharides. On multiple occasions, members of this superfamily, including the bacterial progenitors of Wnt and SAA, have been recruited as effectors in biological conflicts spanning inter-organismal interactions and anti-viral immunity in both prokaryotes and eukaryotes. These findings establish a unifying theme in lipid biochemistry, explain the origins of Wnt signaling and provide new leads regarding immunity across the tree of life.

Introduction

The canonical Wnt signaling network is central to developmental decisions across animals relating to axis patterning, cell fate, cell migration and proliferation, and systems morphogenesis at many levels (1–7). Other crucial pathways, dubbed non-canonical Wnt signaling pathways, include those that regulate planar cell polarity and intracellular calcium levels (8–10). With these roles in development and homeostasis, dysfunction of Wnt signaling is causally associated with a range of diseases, including diverse cancer types and type II diabetes (11,12). Wnt signaling networks are centered on the secreted Wnt proteins acting as both paracrine and autocrine diffusible, extracellular messenger molecules (13). Wnt proteins are ligands for the N-terminal, cysteine-rich CRD/Fz domains of the Frizzled class of G-protein coupled receptors (GPCRs) (14,15). Modifications of the Wnt proteins via palmitoleoylation and glycosylation at internal sites are associated with their secretion (16). Palmitoleoylation of Wnt occurs at a conserved serine residue and is also required for recognition by the Frizzled receptors (17). Binding of the Frizzled receptor by Wnt recruits the Disheveled (Dsh) protein to its cytoplasmic face, in turn triggering a bevy of downstream responses, resulting in β-catenin stabilization in canonical pathways (18,19). When β-catenin concentrations cross a threshold, it is translocated into the nucleus, where it acts as a transcriptional coactivator, usually with an HMG domain transcription factor, to stimulate multiple transcriptional programs (20–22).

Despite its initial discovery over 40 years ago, the evolutionary origins of the Wnt protein have, until recently, been mysterious (23). In 2020, our group reported the discovery of the first prokaryotic versions of the Wnt domain (24). Using comparative genomics, we showed that these bacterial Wnt domains present contexts characteristic of toxins or effectors in biological conflict systems (24,25). Prompted by these initial observations, we set out to comprehensively identify and computationally characterize the evolutionary relationships of these newly identified Wnt homologs in an effort to understand their evolutionary history and predict their functions.

Consequently, we were able to unify the Wnt family with several other domains into a large superfamily described for the first time herein. These include two biochemically characterized families that were hitherto not known as Wnt homologs: the phosphatidylserine synthase (PTDSS1/2, EC: 2.7.8.29) (26–28) and the toxin domain of TelC from Streptococcus intermedius (29). However, the majority of the families we unify are either reported for the first time or are functionally poorly understood, including the animal Serum Amyloid A (SAA) (30) and the vancomycin resistance protein VanZ families (31,32). Our comparative genomics analyses, paired with existing experimental evidence, suggest that the superfamily is broadly comprised of enzymes operating on lipid head groups (e.g., transesterification reactions) in a diversity of biochemical contexts, notably including the regulation of membrane composition, extracellular biopolymer metabolism and as effectors in biological conflicts. Thus, we identify a unifying theme across diverse aspects of lipid metabolism.

Results

Identification of the structural core of the Wnt domain

Although the structure of Wnt was described over a decade prior (33–35), its origins have been a mystery as it is phyletically restricted to Metazoa. Much attention has been focused on the three extended β-hairpins and a poorly structured loop extruding out of the core, their stabilizing cysteine residues, and the absolutely conserved serine residue, the site of palmitoleoylation (34,36) (Figure 1A). Our discovery of the first prokaryotic Wnt domains helped define its ancestral α-helical core, revealing the cysteine-rich extensions as Metazoa-specific insertions. Comparison of the core of the metazoan Wnt with AlphaFold structural models of the prokaryotic versions (24) revealed a shared globular domain composed of five α-helices (Figure 1A). As the prokaryotic homologs retained just the conserved core of the Wnt proteins, we named these the minimal Wnt (Min-Wnt) family. The core helices of the Min-Wnt family contained absolutely conserved sequence motifs (Figure 2), consistent with the enzymatic function we had earlier proposed for them (24) (see below).

(A) The four individual helices forming the core of the Lipocone superfamily are consistently colored across the illustrated representatives. The inter-helix linkers are colored gray, and lineage-specific synapomorphic insertions and extensions are colored light brown. Active site and other residues of interest are rendered as ball-and-stick. Protein Data Bank (PDB) IDs or Genbank accessions used to generate AF3 models are provided. (B) Relationship network of the Lipocone families. The thickness of the edges is scaled by the negative-log HHalign p-values. Families are colored according to the community identified by the Leiden algorithm (37) (see Methods). (C) Box plots displaying core helix transmembrane propensity scores of individual sequences within different Lipocone families. The horizontal divider represents the boundary between typical TM and soluble sequences.

Sequence logo of conserved core elements of the Lipocone families.
These correspond to the core helices H2, H3, and H4. The three conserved active site residue positions are boxed in dotted lines with the inferred ancestral residue indicated at the top of the alignment. Families are grouped and labeled on the left in their higher-order clades.

Having defined this shared core, we initiated sequence-based homology searches in an effort to identify remote homologs. Iterative position-specific scoring matrix (PSSM)-based searches (see Methods) initially recovered animal and bacterial versions of the Serum Amyloid A (SAA) proteins, and further rounds of searching initiated from this set of sequences recovered a vast collection of additional homologous families. As an example, a search initiated with a bacterial SAA-like sequence from Bdellovibrio bacteriovorus (Genbank acc: AHZ84906.1) retrieved sequences overlapping with the Pfam models for the “Domain of unknown function” DUF2279 (acc: WP_146898260.1, iteration: 5, e-value: 0.004) and the DUF4056 domain (acc: MBW8016507.1, iteration: 5, e-value: 0.005), as well as sequences automatically annotated as “YfiM” in the GenBank database (acc: WP_019077413.1, iteration: 4, e-value: 0.004). Sequence profile-profile searches with HHpred confirmed these relationships and captured more distant ones. For instance, an HHpred search initiated with the Bacteriovorax stolpii Min-Wnt domain (acc: WP_102242990.1, residues 1-109) recovered the Pfam Wnt profile (PF: PF00110.23, p-value: 1.5e-6) and the Pfam SAA profile (PF: PF00277.22, p-value: 3.7e-5). Similarly, an HHpred search initiated with a Gemmatimonadetes sequence (acc: PYP94660.1, residues 75-170) recovered the DUF2279 Pfam profile (PF: PF10043.12, p-value: 5.4e-21), the DUF2238 profile (PF: PF09997.12, p-value: 1.3e-7), and the DUF4056 Pfam profile (PF: PF13265.9, p-value: 2.5e-5), among others.

Exhaustion of these searches, followed by clustering and manual inspection of the multiple sequence alignments of the retrieved sequences (see Methods), revealed a shared four-helix core across all of them, hereinafter referred to as H1 through H4 (Figure 1A). This 4-helix core of the domain was further confirmed by inspection of AlphaFold structural models constructed for representatives of the individual families, along with the rare instances of experimentally determined structures. These comparisons established that the above-mentioned fifth C-terminal helix in the Wnt core is a synapomorphy (shared derived character) restricted to the Wnts and closely related families like SAA (Figure 1A). In all, the results of our clustering analysis tallied 30 distinct families constituting a large superfamily. Remarkably, of these, 17 families had no pre-existing annotations. Phyletic analysis of individual families revealed a range of distributions, ranging from broad conservation in multiple superkingdoms of Life to those restricted to a small number of lineages (see below, Figure S1). A relationship network for the superfamily was constructed based on p-value and e-value scores using alignments of each family as a query in HHalign profile-profile searches against the rest (see Methods, Figure 1B). The Leiden community detection algorithm (37) was then applied to this network to identify higher-order assemblages (see Methods). These groupings were also supported by structural synapomorphies, such as a circular permutation and versions with a two-stranded ‘handle’ (see below).
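
The edge-weighting step of such a relationship network can be sketched as follows. This is a minimal illustration, not the pipeline used in this work: the family names, the p-values, the base-10 logarithm, and the `max_p` cutoff are all assumptions made for the example. Community detection (for instance with a Leiden implementation) would then be run separately on the resulting weighted graph.

```python
import math

# Hypothetical pairwise profile-profile p-values between families.
# These numbers are illustrative only and are not taken from the paper.
pairwise_p = {
    ("FamA", "FamB"): 1e-12,
    ("FamA", "FamC"): 3e-5,
    ("FamB", "FamC"): 0.2,
}


def build_weighted_edges(pvalues, max_p=1e-3):
    """Keep pairs with p <= max_p and weight each edge by -log10(p).

    Both the cutoff and the logarithm base are assumptions of this sketch.
    """
    edges = {}
    for (a, b), p in pvalues.items():
        if 0 < p <= max_p:
            # Sort the pair so (a, b) and (b, a) map to the same edge key.
            edges[tuple(sorted((a, b)))] = -math.log10(p)
    return edges


edges = build_weighted_edges(pairwise_p)
for (a, b), weight in sorted(edges.items()):
    print(f"{a} -- {b}: weight {weight:.2f}")
```

Stronger similarity (smaller p-value) yields a heavier edge, which is what lets a community-finding algorithm pull closely related families together.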

The four helices conserved across the superfamily constitute a cone-like structure (Figure 1A), with the helices tending to coalesce on one end and opening out into a pocket on the other, lined by the conserved sequence positions (Figures 1A, S2). The core is also marked by a linker between H1 and H2, which adopts characteristic extended conformations in certain families and higher-order groups. While the linkers joining H2 and H3 and H3 and H4 tend to be more constrained, there are some exceptions; for example, the extended loop insert housing the palmitoleoylated serine residue between H2 and H3 in the metazoan Wnt family (Figure 1A).

Dramatic variability in hydrophobicity of the conserved core across the superfamily

We observed that these Wnt-related families dramatically varied in their hydrophobicity. Using an index for transmembrane propensity (38) (see Methods) and comparing that to known transmembrane (TM) segments, we predict that the α-helices in 17 of the 30 families are hydrophobic enough to qualify as TM domains, and that these families show a statistically significant tendency to group to the exclusion of the other families (Figure 1C, S3). Thus, these are predicted to be integral membrane domains. Further, these ‘hydrophobic families’ often evince a broader and deeper phyletic distribution pattern than the less-hydrophobic families (Figure S1, see Methods), implying that the ancestral version of the superfamily was likely an integral membrane domain. Thus, their association with the lipid membrane, combined with the cone-like shape of the conserved core (Figure 1A), leads us to refer to the whole superfamily hereinafter as the Lipocone superfamily.
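
The idea of scoring core helices for membrane propensity can be illustrated with a short sketch. Note the substitution: the code below uses the widely known Kyte-Doolittle hydropathy scale as a stand-in, not the transmembrane propensity index (38) used in this work. The example sequences and the 1.6 cutoff are likewise assumptions chosen only for illustration.

```python
# Kyte-Doolittle hydropathy values, used here as a stand-in scale
# (an assumption; the paper uses a different TM propensity index).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(segment):
    """Average hydropathy over the standard residues in one helix segment."""
    scores = [KD[aa] for aa in segment.upper() if aa in KD]
    if not scores:
        raise ValueError("segment contains no standard residues")
    return sum(scores) / len(scores)


def classify_family(helices, cutoff=1.6):
    """Label a set of core helices as TM-like or soluble-like.

    The label is based on the average of the per-helix mean scores;
    the cutoff is an illustrative assumption.
    """
    avg = sum(mean_hydropathy(h) for h in helices) / len(helices)
    label = "TM-like" if avg > cutoff else "soluble-like"
    return label, avg


# Made-up helix sequences, not real Lipocone sequences.
hydrophobic_helices = ["LLIVAFLLGAVLIVLLAFL", "AVLLIGFLVALLIAVGLLV"]
polar_helices = ["DEKRSNQTEEKLRDSGNQE", "KEELRKSDNQAEERTKDGS"]

print(classify_family(hydrophobic_helices))
print(classify_family(polar_helices))
```

Averaging per helix before averaging across helices keeps one unusually long helix from dominating the family-level score.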

AlphaFold3-assisted transmembrane topology prediction (39) revealed that 14 of the 17 integral membrane families are consistently oriented with the aperture of the cone-like structure opening toward the outer face of the membrane. This predicted TM topology is also generally consistent with the domain fusions when present: e.g., domains that are typically cytoplasmic and those that have extracellular or periplasmic functions are respectively predicted as projecting either inside or outside the membrane (see below). However, three families in the cpCone clade (see below) did not yield consistent orientation predictions, potentially owing to the diversity of structural variations observed in the clade, including a circular permutation event.

A unified biochemistry for the Lipocone superfamily

Of the 30 identified families, 26 display a striking conservation pattern of polar residues associated with the pocket of the Lipocone domain (Figure 2, S2). Of these, a set of three positions, one mapping to each of H2, H3, and H4, can be inferred as being ancestrally present and were likely occupied by a histidine (H2), glutamate (H3), and aspartate (H4), though in some families their identities have secondarily changed (Figures 2, S2). A fourth well-conserved polar position is observed at or near the end of H3; while its ancestral identity is difficult to establish, it is frequently an aspartate or glutamate (Figure 2). Two further well-conserved positions are often seen in H4: a polar position downstream of the broadly conserved aspartate residue and a glycine residue near the C-terminus of H4 (Figure 2) that likely caps the said helix. Although the ancestral pattern is noticeably degraded in the metazoan Wnt (Met-Wnt) family, it is strongly preserved in the prokaryotic Min-Wnt family (Figure 1A). In experimentally determined and modeled structures, the above set of 4 conserved positions forms a predicted active site in the aperture of the Lipocone domain. This, in turn, implies a shared biochemistry across the superfamily, with secondary inactivation in some families like Met-Wnt (see below, Figure 2). At the same time, the differences in the specific residues in the conserved positions between different families point to a range of distinct but related activities across the superfamily (40–42).

Consistent with these observations, two of the families with intact active sites, the PTDSS1/2 (28,43) and TelC (29), which we identified in this work as members of the Lipocone superfamily, have been characterized as active enzymes operating on different lipid substrates (Figure 3A). The eukaryotic PTDSS1/2 localizes to the endoplasmic reticulum (ER) membrane and catalyzes a reaction on the polar head group of phosphatidylcholine or phosphatidylethanolamine (44–46). PTDSS1 and PTDSS2, respectively, exchange the phosphate-linked choline or ethanolamine head groups with L-serine (28) (Figure 3A). The toxin domain of TelC acts on lipid II (29), the final intermediate in peptidoglycan biosynthesis, which couples an undecaprenyl diphosphate tail to a head group comprised of an N-acetylmuramic acid-N-acetylglucosamine disaccharide, with a pentapeptide further linked to the former sugar (47,48). TelC cleaves the bond between the undecaprenol and the diphosphate coupled to the head group (29) (Figure 3B). The reaction is comparable to that catalyzed by PTDSS1/2, as both attack phosphate linkages in lipid head groups. However, TelC apparently directs a water molecule for the attack in lieu of the hydroxyl group of serine directed by PTDSS1/2 (Figure 3B).
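
The parallel between the two characterized reactions can be written out schematically. The scheme below simply restates the reactions described above in equation form; it shows overall substrates and products only and is not a mechanistic proposal.

```latex
\begin{align*}
\text{PTDSS1:}\quad & \text{phosphatidylcholine} + \text{L-serine}
  \longrightarrow \text{phosphatidylserine} + \text{choline} \\
\text{PTDSS2:}\quad & \text{phosphatidylethanolamine} + \text{L-serine}
  \longrightarrow \text{phosphatidylserine} + \text{ethanolamine} \\
\text{TelC:}\quad & \text{lipid II} + \mathrm{H_2O}
  \longrightarrow \text{undecaprenol} + \text{diphosphate-linked head group}
\end{align*}
```

In each case a nucleophile (the serine hydroxyl or water) attacks a phosphate linkage in the lipid head group, and the difference lies in which group leaves.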

Known and predicted Lipocone reaction mechanisms.
Experimentally supported reactions are boxed in blue (A-B), while a predicted reaction based on genome displacement by a Lipocone domain of an experimentally characterized enzyme is boxed in orange (C). The remaining reactions (D-G) are suggested based on the contextual inferences in this work. Attacking and leaving groups are denoted by dashed green and red circles, respectively.

Combining the above observations, we infer the unified biochemistry for the catalytically active families thus: 1) They act on the head groups of lipids either by removing or swapping phosphate-linked head groups (Figure 3A-B). These would be comparable to the phospholipase D (PLD), transphosphatidylation or polyisoprenol phosphoesterase reactions (49). 2) Given the cone-like cavity and the hydrophobicity of the helices, the lipid tail is predicted to be housed within the Lipocone with the head group positioned in the active site. 3) In the case of the integral membrane versions, their orientation would predict the targeting of the head groups of the outer leaflet of the bilayer.

Major clades of the Lipocone superfamily

The extreme sequence divergence of the superfamily, coupled with the small size of the domain, prevents the use of simple phylogenetic tree analyses to resolve its deep evolutionary history. Hence, we combined community finding algorithms applied on profile-profile similarity networks, comparison of structural features and motifs, and phyletic patterns (Figures 1B, 2, S1) to reconstruct the most parsimonious evolutionary scenario for the diversification of the Lipocone superfamily (Figure 4, see Methods). In the below sections, we survey the higher-order clades, highlighting their specific features.

Reconstructed evolutionary scenario for the Lipocone superfamily.
The relative temporal epochs are demarcated by vertical lines and labeled at the bottom. The clades are represented by colored lines indicating the maximum depth to which the families listed to the right can be traced. Colors track the superkingdom-level phyletic distribution of the family. Dashed-line circles indicate uncertainty in the origin of lineage(s). Inferred or experimentally characterized functions for families are indicated to the left of family names. Asterisks denote newly described families.

SAW (SAA-Wnt) clade

This clade consists of four families, with the two prokaryotic families (Min-Wnt and prok-SAA) (24,50), respectively, giving rise to their counterpart eukaryotic families (Met-Wnt and Met-SAA; Figures 1A,4). This clade is structurally unified by the presence of a fifth helix that stacks in the space between the H2 and H4 helices (Figure 1A, S2). In the Wnt families, this helix is comparable in length to the core helices, while in the SAA families it is usually shorter (Figure 1A). The clade is further unified by the pronounced conservation of a sNxxGR motif (where ‘s’ is a small residue) encompassing the conserved active site position in H4 (Figure 2). SAW clade Lipocones show low overall hydrophobicity and are known or predicted to be soluble domains. Outside of the clearly inactive eukaryotic Wnt family, the remaining three families largely conserve the core active site residues (Figure 2).

VanZ-Skillet clade

This clade unites seven families: the two VanZ families, VanZ-1 and VanZ-2, prototyped by the bacterial VanZ protein originally identified in the context of vancomycin resistance and the five Skillet families, which form a distinct subclade. These are unified by a “handle”-like structure (hence, “Skillet”), adopting a helical conformation in the H1-H2 linker (Figure 1A, S2). Strikingly, a symmetric helical handle is present in the H3-H4 loop of the Skillet-DUF2809 and Skillet-3 families (Figure S2) of this clade. VanZ-1 features a conserved asparagine residue in the H2 position and a DxDDxxxN motif in H4, while VanZ-2 features RKxxH and DxxxD motifs in these respective positions (Figure 2). The Skillet families are largely unifiable in their conservation of an ExxQ motif in H3, an aspartate three positions upstream of the canonical H4 aspartate, and another aspartate in the H2 contributing to the active site. These first two features specifically ally them with the VanZ-1 family (Figure 2).

While the VanZ domain was previously reported as including a fifth TM helix, which is C-terminal to the 4-helix Lipocone core defined here (51,52), our survey instead reveals a striking diversity of configurations around the core 4-helix Lipocone domain (Supplemental Material). These range from standalone Lipocone configurations to one or more TM-helices adorning the domain at its N- and/or C-terminus. This variation is consistent with a further tendency for the VanZ families to feature an extensive diversity of domain fusions to both soluble globular domains and discrete TM modules (see below).

The VanZ families are deep-branching, as suggested by their wide phyletic spread (Figure S1). VanZ-2 is the most widespread individual Lipocone family in bacteria, with several genomes encoding multiple paralogs (Supplemental Material) (51,53). It is also found in certain eukaryotes, including a pan-fungal presence and in some representatives of the SAR clade. Both VanZ-1 and VanZ-2 are particularly well-represented in Gram-positive bacterial lineages like Actinomycetota and Firmicutes, while VanZ-2 is nearly universally conserved in the Bacteroidetes/Chlorobi lineage (Figure S1). In contrast, only one of the Skillet families, Skillet-DUF2809, is widely but sporadically distributed, with the four others being more restricted (Figures 4, S1).

YfiM clade

This clade includes three families that are consistently centrally located in the profile-profile similarity network (Figure 1B). This is likely due to their being close in sequence conservation to the ancestral state of the superfamily (Figure 2). Consistent with this, the YfiM-1 family also presents a structurally minimal Lipocone domain, comprised of just the 4-helix configuration with no further elaborations. Notably, this also extends to a lack of domain fusions in this family. In contrast, YfiM-DUF2279 and YfiM-Griddle (DUF3943) are structurally distinguished by an unusual H1-H2 linker (Figure 1A), which wraps around the outside and stacks against the H3-H4 linker (Figure S2). The YfiM-Griddle family further features a unique ‘flattened’ surface around the aperture of the Lipocone formed by protruding loops (hence, “Griddle”; Figure S2). This leaves the active site pocket more accessible relative to families with more elaborately structured inter-helix linkers. The Griddle family also features a C-terminal extension with a two-helix hairpin (with a hhsP motif in the turn between the two helices, where ‘h’ is a hydrophobic residue and ‘s’ is a small residue) (Figure S2). The three YfiM families straddle the membrane-propensity boundary in the plot (Figure 1C). Further, the YfiM-DUF2279 and Griddle families are strikingly absent in Gram-positive bacterial lineages (Figure S1). Consistent with these features, they are often predicted by the deep-learning-based localization predictor DeepTMHMM as outer-membrane proteins, suggesting a role in this subcellular location (see below).

ClaspCone-CapCone-TelC clade

Members of this clade are unified by an elaborated H1-H2 linker that often contains one or more helical segments that are typically predicted to guard the aperture of the Lipocone domain (Figure S2). This linker ends in a “clasp”-like element, which forms a range of structures in different families of the clade before leading into H2 (Figure S2). The clade is also unified by a striking reduction of overall hydrophobicity, predicting that the members of this clade are soluble domains (Figure 1C). Outside of the divergent TelC subclade, most of the families in this clade conserve a serine residue three positions upstream of the active site aspartate in H4, often preceded by an aromatic residue, which is typically phenylalanine. H4 also usually features a conserved asparagine four positions downstream of the conserved aspartate active site position, immediately preceded by a small residue (Figure 2). The second H3 active site position is generally poorly conserved, though when present, it is usually an aspartate residue. Finally, H2 contains either a DK or xD motif four positions upstream of the canonical H2 active site histidine residue (Figure 2).

The most rudimentary clasps are found in the ClaspCone-1, -2, and -3 families, where it is little more than a rounded loop, though, in ClaspCone-1, a small β-hairpin emerges within it. The three ClaspCone families are further unified by the presence of a two-helix insert leading into H2 that stacks against the Lipocone core (Figure S2). The three CapCone families, CapCone-DUF4056, CapCone-1, and CapCone-2, are named so for an encasing structure over the active site resembling a cap (Figure S2). They share a conserved glycine residue six positions upstream of the active site H2 histidine and a S/GxxSxx motif upstream of the conserved H4 aspartate (Figure 2). They are further unified by a pronounced β-hairpin clasp augmented by an additional strand (Figure S2). They also display varying degrees of degeneration of H1, along with family-specific structural elaborations.

The TelC group of this clade, prototyped by the streptococcal TelC toxin (29), is divided into two families featuring prokaryotic (prok-TelC) (29) and metazoan versions (Met-TelC) (54). Both TelC families feature a “cap” with contributions from inserts in the H1-H2 and H3-H4 loops (Figures 1A,S2). Unique to these families is the conservation of an aspartate residue located six positions downstream of the canonical active site aspartate of H4 (Figure 2). This aspartate points away from the center of the Lipocone and interacts with a conserved arginine from a synapomorphic C-terminal helical extension.

cpCone clade

A widespread yet sporadically distributed clade of seven families emerging as a stable community in the profile-profile similarity network (Figure 1B) is defined by a unique structural synapomorphy: a circular permutation (55) (hence, cpCone) placing the normally N-terminal H1 at the C-terminus of H4 (Figure S2, S4). This clade is also united by unique sequence features, viz., a polar residue (typically aspartate) six positions upstream of the conserved H2 histidine and a second glutamate three positions downstream of the conserved H3 glutamate (Figure 2). While the circular permutation is shared across the clade, several structural variations are seen, often within the same family (Figure S4). These include: 1) Versions containing a duplication of the Lipocone domain; while the second copy in these versions is catalytically inactive, the H1’ from the second duplicate displaces the H1 from the first copy, suggestive of an intermediate to the circular permutation. 2) Versions retaining a candidate H1 that has been displaced by H1’ in a five-helix arrangement. 3) Versions containing just the circularly permuted core. 4) Versions showing a degradation of the H1 helix, preserving just a 3-helix core (Figure S4). Despite this propensity for structural variation, the active site residues are strongly conserved, with the exception of the cpCONE-i family, which we infer to be catalytically inactive (Figure 2). The core helices of the cpCone clade are strongly hydrophobic, and they are all predicted to be integral membrane domains (Figure 1C). Consistent with this, the eukaryotic PTDSS1/2 domains reside in the ER membrane (44,45).
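
The effect of the circular permutation on the linear order of the core helices can be shown with a toy sketch. The helix labels follow the text; the function names are hypothetical and the code models only the order of elements, not any structural detail.

```python
def circular_permutation(elements, shift):
    """Rotate the linear order of elements leftward by `shift` positions."""
    if not elements:
        return []
    shift %= len(elements)
    return elements[shift:] + elements[:shift]


def is_circular_permutation(order_a, order_b):
    """Return True if order_b is some rotation of order_a."""
    if len(order_a) != len(order_b):
        return False
    return any(
        circular_permutation(order_a, k) == order_b
        for k in range(max(len(order_a), 1))
    )


canonical = ["H1", "H2", "H3", "H4"]
# Moving the normally N-terminal H1 to follow H4 is a rotation by one.
permuted = circular_permutation(canonical, 1)
print(" - ".join(permuted))
print(is_circular_permutation(canonical, permuted))
```

A rotation preserves which helices are sequence neighbors in the cycle, which is why a circularly permuted domain can retain the same three-dimensional arrangement of its core.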

Wok family

The Wok family (partly covered by the Pfam DUF2238 model) shows a higher order grouping with the above circularly permuted clade (Figures 1B,4) but has a phyletic distribution only rivaled by the VanZ-2 family (Figure S1), suggesting a deep-branching origin. The shape of this family is reminiscent of a wok formed by two distinguishing structural synapomorphies: a 2-TM helix N-terminal extension and a unique “handle” formed by the linker between the H3 and H4 (Figure S2). It additionally features a C-terminal, rapidly diversifying cytoplasmic tail. Despite these elaborations, it retains the inferred ancestral active site configuration (Figure 2). The strongly hydrophobic core helices of the Wok family predict it to be an integral membrane enzyme (Figure 1C).

Functional themes in the Lipocone superfamily

Given our inference of shared general biochemistry across the Lipocone superfamily in targeting phosphate-containing linkages in head groups of both classic phospholipids and polyisoprenoid lipids, we next used contextual information from conserved gene-neighborhoods, domain architectures and phyletic pattern vectors, a powerful means of deciphering gene function (56), to narrow down the predictions for specific families (Figure 5, Table S1, Supplementary Data). To this end, we constructed a graph (network) wherein the nodes are individual domains and edges indicate adjacency in domain architectures or conserved gene-neighborhoods (Figure 6, see Methods). We then identified cliques in these networks and merged the individual cliques containing a particular Lipocone domain to define its dense subgraph (Figures S5-S7). We then analyzed these subgraphs to identify statistically significant functional categories represented in them (Table S2; see Methods). This data was combined with existing experimental results and the sequence and structure analyses outlined above to arrive at the functional themes surveyed in the below sections.
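
The clique-merging step can be sketched on a toy network. Several things here are assumptions: the domain names and edges are invented, the Bron-Kerbosch algorithm is one standard way to enumerate maximal cliques and is not necessarily the method used in this work, and the `min_size` parameter is introduced only for illustration.

```python
from collections import defaultdict


def build_graph(edges):
    """Undirected adjacency map from (domain, domain) adjacency pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(frozenset(current))
            return
        for v in list(candidates):
            expand(current | {v}, candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(adj), set())
    return cliques


def dense_subgraph(adj, focal, min_size=3):
    """Merge all maximal cliques of at least min_size containing `focal`."""
    merged = set()
    for clique in maximal_cliques(adj):
        if focal in clique and len(clique) >= min_size:
            merged |= clique
    return merged


# Hypothetical contextual associations (not data from this study).
toy_edges = [
    ("Lipocone", "DomA"), ("Lipocone", "DomB"), ("DomA", "DomB"),
    ("Lipocone", "DomC"), ("Lipocone", "DomD"), ("DomC", "DomD"),
    ("DomD", "DomE"),
]
graph = build_graph(toy_edges)
subgraph = dense_subgraph(graph, "Lipocone")
print(sorted(subgraph))
```

In this toy case DomE is linked only to DomD and never co-occurs in a clique with the focal domain, so it is left out of the dense subgraph.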

Representative contexts for the Lipocone superfamily, grouped by shared functional themes.
Genes are depicted by box arrows, with the arrowhead indicating the 3’ end of genes. Genes encoding proteins with multiple domains are broken into labeled sections corresponding to them. Domain architectures are depicted by the individual domains represented by distinct shapes. TMs, lipoboxes (LPs), and SPs are depicted as unlabeled, narrow yellow, blue, and red rectangles, respectively. All Lipocone domains are consistently colored in orange. Genes marked with asterisks are labeled by the Genbank accession number below each context. Colored labels above genes denote well-known gene names or gene cluster modules. Abbreviations: PTase, peptidase; TFase, transferase; GlycosylTFase, Glycosyltransferase; MPTase, metallopeptidase; TGase, transglycosylase; SLP, serine-containing lipobox; cNMPBD, cNMP-binding domain; NCPBM, novel putative carbohydrate binding module; (w)HTH, (winged) helix-turn-helix; ZnR, Zinc ribbon; PPTs, pentapeptide repeats; Imm, immunity protein; βPs, β-propeller repeats; Cystatin-FD, Cystatin fold domain; MTase, methylase; PGBD, peptidoglycan-binding domain; MβL, metallo-β-lactamase; L12-ClpS, ClpS-ribosomal L7/L12 domain; TA, teichoic acid.

Lipocone contextual network.
The network represents the conserved contextual associations of Lipocone domains (hexagonal nodes). Nodes and edges are colored based on known or inferred functional categories of the domains. The nodes are scaled by their degree. Gray coloring indicates domains without specific functional assignments. Examples of conserved gene neighborhoods and domain architectures supplementing those in Figure 5 illustrate contexts that bridge functional themes. Here, individual domains are colored to match network coloring. Additional abbreviations to those in Figure 5: APH-Pkinase, aminoglycoside phosphotransferase-like kinase; HUP, HIGH, UspA and PP-ATPase superfamily-like domain; Alk-phosphatase, Alkaline phosphatase; dehyd, dehydrogenase; TPRs, tetratricopeptide repeats; PMM/PGM, phosphomannomutase/phosphoglucomutase; ZnF, zinc finger; APC-transporter, amino acid-polyamine-organocation transporter; LPS, lipopolysaccharide.

Lipocone domains in membrane lipid, peptidoglycan and exopolysaccharide modifications

Across different Lipocone families, we found statistically significant connections to roles in modifying lipid head groups in various membranes and in lipids involved in the synthesis of extracellular matrix polymers such as peptidoglycan and lipopolysaccharides (Figure 6, Table S2, Supplementary Data).

Archetypal lipid head group exchange reactions catalyzed by the cpCone clade

One of the few experimentally characterized Lipocone families is the eukaryotic PTDSS1/2 family of the cpCone clade, members of which exchange the head group of essential membrane phospholipids to generate phosphatidylserine from phosphatidylethanolamine or phosphatidylcholine (Figure 3A) (28,57). Given the pervasive presence of this clade in archaea (Figure S1), it is thus tempting to speculate that these archaeal cpCones may play a role in the modification of Archaea-specific lipids (58–60) through a comparable head group exchange reaction (see below).

In bacteria, the related cpCone-1 family shows operonic association with a LolA-like lipoprotein, which shuttles lipoproteins to the outer membrane (61), and a novel 4TM protein (Figure 5A). This raises the possibility that cpCone-1 might mediate the formation of membrane domains featuring lipids with a modified head group that act as foci for the trafficking of lipoproteins. Curiously, the cpCone-1 gene might also be inserted between the genes for two subunits of the bacterial chromosome segregation and condensation complex, the kleisin ScpA and the wHTH protein ScpB (62–65). The bacterial cpCone-DUF2585 family is operonically coupled to a GNAT family NH₂-group-acetyltransferase and further linked to genes for the glycolate oxidase subunits GlcE and GlcF (66) and the bacterial proteasome subunits HslV and HslU (67) (Figure 5A, Supplementary Data). These might point to the coupling of membrane lipid head group modifications with disparate processes, such as chromosome segregation during cell division or different responses to stress (68–70).

The Wok and YfiM-1 families in cardiolipin and modified isoprenoid lipid pathways

We observed a set of conserved gene neighborhoods in which a synaptojanin-like phosphatase gene is paired, in a mutually exclusive manner, with a gene encoding either a member of the Wok family or a cardiolipin synthase of the HKD superfamily (71) (Figure 5B, Supplementary Data). This suggested that the latter two are analogous enzymes catalyzing equivalent reactions. The cardiolipin synthase utilizes two phosphatidylglycerol molecules as substrates to generate cardiolipin with the release of one of the glycerol head groups (72). This is comparable to the head group exchange reaction catalyzed by PTDSS1/2 from the cpCone clade (Figure 3A). Hence, we propose that these members of the Wok clade are cardiolipin synthases (Figure 3C). Distinct phosphoesterases, namely the synaptojanin-like, calcineurin-like (73) and HAD (74) enzymes, are also observed in gene-neighborhood associations with the Wok family, suggesting that they might together regulate membrane lipid composition by acting on the phospholipids or their precursors (Figure 5C). In a distinct neighborhood, the Wok clade enzyme is coupled to carotenoid biosynthesis genes (75,76) (Figures 5D,S5). This raises the possibility that these members might also catalyze a comparable reaction to the above on isoprenoid lipids: for instance, they could synthesize a carotenoid from two geranylgeranyl-diphosphate molecules (77,78). In both these contexts, the actinobacterial operons often include genes for GT-A family glycosyltransferases, suggesting the further synthesis of glycosylated derivatives of the lipids or carotenoids (79) (Supplementary Data). In several bacteria, a YfiM-1 family Lipocone is operonically coupled to a UbiA-like prenyltransferase (80). This gene neighborhood additionally codes for a slew of enzymes, such as an amidophosphoribosyltransferase (81), a RidA-like deaminase (82), and a pair of structurally distinct phosphoesterases containing an HD and a PHP domain, respectively (73,83) (Figures 5E,S5, Supplementary Data). This suggests a role for the YfiM-1 Lipocone and the associated enzymes in generating a modified polyisoprenoid metabolite.
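The proposal that Wok and the HKD cardiolipin synthase are analogous rests on their mutually exclusive pairing with the synaptojanin-like phosphatase gene. A minimal sketch of such a check is given below, assuming each conserved neighborhood is represented as a set of gene-family labels. The function names, the labels, and the toy neighborhoods are hypothetical and are not drawn from the Supplementary Data.

```python
def exclusive_partner_counts(neighborhoods, anchor, partner_a, partner_b):
    """Among neighborhoods that contain `anchor`, count how many carry
    only partner_a, only partner_b, both, or neither."""
    counts = {"only_a": 0, "only_b": 0, "both": 0, "neither": 0}
    for genes in neighborhoods:
        if anchor not in genes:
            continue  # neighborhoods lacking the anchor gene are ignored
        has_a, has_b = partner_a in genes, partner_b in genes
        if has_a and has_b:
            counts["both"] += 1
        elif has_a:
            counts["only_a"] += 1
        elif has_b:
            counts["only_b"] += 1
        else:
            counts["neither"] += 1
    return counts


def is_mutually_exclusive(counts):
    """Both partners occur with the anchor, but never together."""
    return counts["only_a"] > 0 and counts["only_b"] > 0 and counts["both"] == 0


# Toy neighborhoods (hypothetical, for illustration only).
toy_neighborhoods = [
    {"synaptojanin-like", "Wok"},
    {"synaptojanin-like", "HKD-CLS"},
    {"synaptojanin-like", "Wok", "HAD"},
    {"Wok", "carotenoid-biosynthesis"},
]
toy_counts = exclusive_partner_counts(
    toy_neighborhoods, "synaptojanin-like", "Wok", "HKD-CLS"
)
print(toy_counts, is_mutually_exclusive(toy_counts))
```

A real analysis would additionally need to account for sampling depth and phylogenetic redundancy before treating the absence of co-occurrence as meaningful.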

VanZ families modifying lipid head groups in peptidoglycan and exopolysaccharide metabolism

The widespread VanZ-1 and VanZ-2 families (Figure 1A) frequently show either gene neighborhood associations or direct domain fusions with diverse genes involved in both peptidoglycan and other extracellular polysaccharide pathways. Chief among these are the lipid carrier flippase (Pfam: MviN_MATE clan) (84–86), the UDP-GlcNAc/MurNAc lipid transferases, which generate the lipid-linked exopolysaccharide precursors (lipid I) (48,87), and UDP-N-acetylglucosamine (UDP-GlcNAc) biosynthesis enzymes (88,89). Despite certain examples of crossover in functional themes, the gene-neighborhood contexts of VanZ-1 and VanZ-2 suggest a metabolic partitioning, with VanZ-2 significantly associating specifically with peptidoglycan-related genes and VanZ-1 significantly linking with biosynthesis genes for other exopolysaccharides (e.g., the outer-membrane-associated lipopolysaccharide) (90) (Figures 5F,6,S6, Table S2). The latter include the WaaL-like lipid A transferase (91), the polysaccharide chain-length determination domain Wzz (92), the Wzc kinase and the “extracellular antigen”-regulating ElyC-like domain (Pfam: DUF218) (93), and numerous nucleotide-diphosphate sugar biosynthesis and modification enzymes (94) (Figures 5F,6,S6).

The precursors of both peptidoglycan and exopolysaccharides are synthesized in the cytosol, linked to lipid carriers via a diphosphate linkage, e.g., the polyisoprenoid lipid undecaprenol (bactoprenol) (90,94–97). A key step in their maturation is the flipping by the flippase of the lipid-linked intermediates associated with the inner membrane to the outer membrane. These flipped units are then incorporated into the maturing chain (98,99) by the peptidoglycan glycosyltransferase (GTase) (100) and the chain length determination protein, WzzE/polymerase (WzyE) (92,101), in peptidoglycan and other exopolysaccharide maturation pathways, respectively. Based on the precedent of the TelC-catalyzed reaction (Figure 3B), we propose that VanZ-1 and VanZ-2 comparably act on the flipped lipid II head groups bearing the modified sugar intermediates to release the undecaprenol via phosphoester cleavage (Figure 3F). Such activity could modulate the concentration of available peptidoglycan intermediates and allow formation of peptidoglycan with varying thickness and composition during different phases of the life cycle, e.g., sporulation versus vegetative growth in Bacillota. Such a reaction could also possibly modulate exopolysaccharide biosynthesis by comparably acting on their precursors.

The terminal transfer from the lipid carrier of the Gram-negative bacterial O-antigen (as well as other exopolysaccharides attached to the lipid A carrier) has been attributed to the WaaL-like enzymes (91,102). However, bacteria generate further lineage-specific polysaccharide decorations, capsule structures, and other exopolysaccharides (e.g., xanthan, enterobacterial common antigen (ECA), alginate, colanic acid), as well as teichoic acids (e.g., wall teichoic acids, WTA) (103,104). Notably, the analogs of WaaL, i.e., the terminal transferases for several exopolysaccharides, including ECA and WTA, have to date escaped identification (93). Hence, it is possible that, by analogy to the PTDSS1/2 reaction (Figure 3A), the VanZ families act on the lipid carrier-linked sugar head groups to catalyze either the extension of the polysaccharide chains through transesterification or the terminal release of the mature chain through phosphoester cleavage (Figure 3E).

Atypical VanZ domains in uncharacterized modifications of peptidoglycan and the outer membrane

Certain representatives of the two VanZ families also show operonic associations indicative of outer membrane-associated or peptidoglycan modification functions distinct from those described above (Figures 5G,6, Supplementary Data): 1) An operon in FCB group bacteria couples a VanZ-2 gene with those coding for a SprA secretin-like channel protein (105), a glycine cleavage H (GCVH)-like lipoyl-group carrier protein (106), a 2TM protein fused via a proline-rich linker to a C-terminal TonB-C domain (107), and a secreted, second TonB-C domain fused to a Wzi-like outer membrane protein (OMP) superfamily β-barrel (108) (Figures 5G,6). 2) In betaproteobacteria, certain VanZ-1 domains are duplicated, with the C-terminal copy being inactive (VanZ-i), and found in an unusual four-gene operon with a thioredoxin-fold [2Fe-2S] ferredoxin (109), a possible lipase of the α/β-hydrolase superfamily (110), and a metallo-β-lactamase (MβL) fold D-Ala-D-Ala cross-linking transpeptidase (111,112). 3) A patescibacterial operon encodes a VanZ-2 domain with an ABC ATPase transporter system, either of two structurally distinct peptidases, namely a Papain-like or glycine-glycine peptidase (113,114), fused to the same membrane-anchored N-terminal coiled-coil region, and a further TM protein containing one or more external Lamin-Tail domains (LTDs) predicted to bind extracellular DNA or polysaccharides (115) (Figures 5G,6,S6). The associations in the first of the above neighborhoods point to a distinct outer membrane-associated lipid modification, while the other two might be involved in lineage-specific decorations/modifications of peptidoglycan, accompanied by peptide-crosslinking or cleavage activities.

Lipocone domains operating in the outer membrane

Contextual associations, phyletic patterns, and localization predictions support the action of two Lipocone families directly in the outer membrane. Notably, the YfiM-Griddle and YfiM-DUF2279 families are found nearly obligately directly fused or operonically linked to several distinct OMP β-barrels (116,117) (Figures 5H,6,S5). Up to three YfiM-Griddle Lipocones, usually with a cognate OMP β-barrel, might be encoded next to each other in the genome. Additionally, YfiM-Griddle family genes are often encoded in operons with several components of the outer membrane lipid and protein trafficking apparatus, including the LolA-like chaperone (118), the POTRA domain (119,120), the channel-blocking Plug domains (121), and the TolA-binding TolB-N domain (122). Further, these operons might encode a Patatin-like lipase (123), GT-B family glycosyltransferases (79), and a range of phosphoesterases (e.g., an integral membrane phosphatidic acid phosphatase PAP2 (124), a lipobox-containing synaptojanin superfamily phosphoesterase (125) and a secreted R-P phosphatase (126) (see Figures 5H,6, and Supplementary Data)). In addition to the fusion to the OMP β-barrel, the YfiM-DUF2279 family (Figure 5H) shows operonic associations with a secreted MltG-like peptidoglycan lytic transglycosylase (127,128), a lipid-anchored cytochrome c heme-binding domain (129), a phosphoglucomutase/phosphomannomutase enzyme (130), a GNAT acyltransferase (131), a diaminopimelate (DAP) epimerase (132), and a lysozyme-like enzyme (133). In a distinct operon, YfiM-DUF2279 is combined with a GT-A glycosyltransferase domain (79), a further OMP β-barrel, and a secreted PDZ-like domain fused to a ClpP-like serine protease (134,135) (Figure 5H).

The strong linkage to the OMP β-barrel, together with their predicted localization, suggests that these YfiM-Griddle and YfiM-DUF2279 Lipocone domains operate in the outer membrane, potentially in concert with both cytoplasmic carbohydrate biosynthetic modules and periplasmic lipid- and carbohydrate-processing enzymes. As with the inner membrane lipids, they could potentially catalyze modifications of head groups through transesterification and/or linkage/release of outer membrane-associated polysaccharide chains through action on lipid-head group phosphoesters.

Lipocone domains acting on lipids in transit to the outer membrane

The ClaspCone-1 and ClaspCone-3 families lack the hydrophobicity indicative of direct residence in the membrane (Figure 1C); instead, they are predicted to localize to the periplasmic space. In the ClaspCone-1 family, the Lipocone domain is fused at the extreme N-terminus to either a single TM or a 5TM domain predicted to anchor it to the cell membrane. Between this TM element and the Lipocone domain, we detected a previously uncharacterized version of the Tubular lipid binding protein (TULIP) domain (136,137) or an Ig-like and a Zincin-like metallopeptidase (MPTase) domain (138) (Figures 5I, 6). These ClaspCone-1 genes may also show operonic associations with genes encoding a lipase of the SGNH family (139) and a membrane-bound O-acyltransferase (MBOAT; Figure 5I, Supplementary Data) (140). The TULIP domain superfamily has recently been characterized as a lipid-binding domain (136,137), which in proteobacteria functions in outer membrane lipid transport (141,142). Thus, we propose that the ClaspCone-1 family is likely to act in the periplasmic space on the head groups of outer-membrane targeted lipids bound to the TULIP or potentially to the Ig-like domains occupying an equivalent position in the domain architecture.

A Lipocone domain catalyzing a predicted lipoprotein lipid linkage reaction

The Skillet-1 Lipocone is strongly coupled in an operon with a downstream gene coding for a protein with an unusual lipobox-like sequence followed by one of several extracellular domains (e.g., concanavalin, β-jelly roll, OB-fold, Ig-like, β-propeller) predicted to bind carbohydrates or other ligands (143–147) (Figures 5J,6,S6, Supplementary Data). The lipobox-like sequence features a conserved GS motif at its C-terminus instead of the usual GC of the classic lipobox of bacterial lipoproteins (148) (Figure S8). In the canonical lipoprotein processing pathway, a thioether linkage is formed between the sulfhydryl of the cysteine and a diacylglycerol lipid embedded in the inner membrane by the lipoprotein diacylglyceryl transferase (Lgt) enzyme, followed by the cleavage of the signal peptide at the GC motif junction by the signal peptidase (149,150). Given the serine in place of the cysteine in these lipobox-like sequences, we propose that it undergoes non-canonical lipidation by the associated Skillet-1 Lipocone protein in lieu of Lgt. We propose that, comparable to PTDSS1/2, which act on free serine, the Skillet-1 family links the conserved serine from the lipobox-like sequence to a phospholipid (Figures 3A,D).

Lipocone domains in predicted lipid-associated signaling systems

Systems defined by standalone proteins with Lipocone domains

Several representatives of the two VanZ and Skillet-3 families are fused to a diverse array of known or predicted extracellular ligand-binding domains (Figure 5K), where the architecture takes the form of SP+X+TM+Lipocone or Lipocone+TM+X, where ‘X’ is the extracellular ligand-binding domain and SP is a signal peptide. The ligand-binding domains include: (i) carbohydrate-binding lectin domains such as jelly-roll, concanavalin-like, NPCBM-like, CBD9-like, and other β-sandwiches (143,144,151–153); (ii) a lipid-binding helix-grip superfamily domain (154); (iii) those binding other potential ligands (e.g., Ig, OB-fold, YycI-like, DUF498-like, PepSY-like, β-helix, TPR, MORN, and β-propeller repeats (145–147,153,155–158)) (Figure 5K, Supplementary Data). We interpret these architectures as implying signaling, wherein the binding of the cognate ligand by one of the above domains regulates the catalytic activity of the associated Lipocone domain. Among these, the extracellular domains fused to the Skillet-3 family are particularly notable for their extreme variability (Figure 5K). This suggests their diversification under an arms race scenario (also see below) in a biological conflict. Further, the genes coding for the above are sporadically associated with exopolysaccharide metabolism genes (Supplementary Data). Hence, it is conceivable that this signaling is associated with exopolysaccharide variation (e.g., O-antigen phase-variation (159,160)), which might play a role in evading bacteriophage attachment.
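The two architectural layouts named above can be expressed as a simple pattern check. The sketch below is illustrative only: it assumes an architecture is given as an N-to-C-terminal list of domain labels, treats any label other than SP, TM, or a Lipocone family name as a candidate ligand-binding domain ‘X’, and collapses runs of consecutive X domains into one. The function name and the example architectures are hypothetical.

```python
# Families discussed in this subsection; the set is an illustrative assumption.
LIPOCONE_FAMILIES = {"VanZ-1", "VanZ-2", "Skillet-3"}
SIGNALING_LAYOUTS = ("SP+X+TM+Lipocone", "Lipocone+TM+X")


def classify_architecture(domains):
    """Return the matching layout string for an N-to-C domain list,
    or None if the architecture fits neither layout."""
    kinds = []
    for name in domains:
        if name in LIPOCONE_FAMILIES:
            kind = "Lipocone"
        elif name in ("SP", "TM"):
            kind = name
        else:
            kind = "X"  # any other domain is treated as a ligand-binding 'X'
        # Collapse tandem X domains (e.g., repeated Ig domains) into one X.
        if not (kind == "X" and kinds and kinds[-1] == "X"):
            kinds.append(kind)
    layout = "+".join(kinds)
    return layout if layout in SIGNALING_LAYOUTS else None


# Hypothetical example architectures.
print(classify_architecture(["SP", "concanavalin-like", "TM", "VanZ-1"]))
print(classify_architecture(["Skillet-3", "TM", "Ig", "Ig"]))
print(classify_architecture(["VanZ-1", "RDD"]))
```

Because this check does not model membrane topology, it cannot by itself confirm that X is extracellular; that would require a separate topology prediction.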

Additionally, VanZ-1 Lipocone domains are also fused to several known signaling domains confidently predicted to reside in the cytoplasm, including the cyclic nucleotide-binding domain (cNMPBD), phosphopeptide-binding FHA, and DNA-binding RHH and HTH domains (65,161–163) (Figure 5K). These associations suggest potential VanZ regulation via a cytoplasmic cyclic nucleotide (sensed by cNMPBD) or, conversely, VanZ acting as an allosteric regulator of a transcriptional program via the HTH or RHH domain. One of the most common yet enigmatic fusions to VanZ is with the integral membrane RDD domain (53). The role of this domain is unknown; however, our analysis indicates that it contains a conserved intra-membrane binding site oriented towards the cytoplasmic face of the membrane (Nicastro GN, Burroughs AM, Aravind L, manuscript in preparation). The VanZ-RDD fusion is sometimes further fused to other domains (Figure 5K), the most notable being a highly derived but active novel Histidine Kinase domain (Figure 5K). Together, these associations point to the coupling of lipid modification with a signaling event on the cytoplasmic face of the membrane, which might relate to the dynamic regulation of lipid-carrier-bound exopolysaccharide precursors.

Multi-component associations of the Lipocone proteins in signaling

These systems resemble the above-discussed versions but are encoded by conserved gene neighborhoods that separate the Lipocone and the signaling elements (typically predicted transcription regulators) into distinct genes. Our analysis recovered at least three such systems: 1) A VanZ-1 Lipocone in the recently described HAAS/PadR-HTH two-component systems, which sometimes replace classical Histidine kinase-Receiver two-component systems (164). In these systems, the detection of an extracellular or intramembrane stimulus by a sensor domain releases the PadR-HTH transcription regulator bound to the sensor-fused HAAS domain. Here, VanZ-1 occupies the sensor position (Figure 5L). 2) A Skillet-2 Lipocone is coupled in a core two-gene system to a conserved upstream gene (Figures 5L,S5). That gene encodes a single TM protein with either a zinc ribbon (ZnR) fused to a conserved helix or an HTH domain fused to a ClpS-ribosomal L7/L12 domain in its cytoplasmic region (165). These neighborhoods might also code for an HMG-CoA reductase and GHMP kinase that catalyze successive reactions in the production of phosphomevalonate, a precursor of isoprenoid lipids (166,167). 3) A Skillet-DUF2809 Lipocone protein is operonically coupled with a 6TM protein and a further predicted transcription factor with a wHTH domain. These operons are further elaborated via additional embedded and flanking genes, either coding for components of isoprenoid lipid (e.g., undecaprenol) (168,169) or exopolysaccharide (e.g., ECA and related polysaccharides) metabolism (94,97) (Figures 5L,6,S6, Supplementary Data).

The Lipocone domains in these systems are predicted to be active enzymes, which, together with their operonic associations, point to functions involving the modification or transesterification of isoprenoid lipid head groups, sometimes in the context of exopolysaccharide biosynthesis. However, their associations with the intracellular HTH domains suggest that the Lipocone enzymatic activity is potentially coupled with the transcriptional regulation of the production of precursors of the lipids or exopolysaccharides. Given the high variability in the associated genes related to exopolysaccharide/lipopolysaccharide biosynthesis, we anticipate that the associated transcriptional regulation potentially relates to functional categories showing high diversity across bacteria, such as responses to environmental stress, phages, predatory bacteria attacks, or host immune response.

Lipocone domains as effectors in biological conflicts

Lipocone domains in antiviral immunity

The Min-Wnt domains (Figure 1A) that we originally identified were predicted to play a role in biological conflicts with invasive selfish elements, such as viruses (24). In this work, we better explain their potential mechanism of action. These versions show no fusions to extracellular domains or secretory signals, suggesting that they are deployed from within the bacterial cell (Figure 1C). These Min-Wnts are typically fused to the DUF3892, which displays a fold characterized by a three-stranded meander followed by a helix also seen in the dsRNA-binding domain and the ribosome hibernation factors (HPF) (170,171) (Figure 5M). Hence, we propose that these versions might potentially act to sense virally induced RNAs or modified ribosomes (24) to trigger a dormancy or suicide response to limit viral infection via the Min-Wnt effector. Specifically, the Min-Wnt might attack peptidoglycan precursors, such as lipid II, prior to their ‘flipping’ to restrict cell wall synthesis (85,172,173) or other such carrier lipids.

One other Min-Wnt domain, N-terminally fused to a three-stranded β-meander, is pervasive in the Bacteroidetes clade. This is operonically coupled with genes encoding a TM-linked run of pentapeptide repeats and two structurally distinct, secreted glycosyl hydrolase enzymes, containing a TIM barrel domain and a run of β-helix repeats, respectively (Figure 5M, Supplementary Data). Further, cyanobacteria show a standalone prok-TelC domain without any secretory signals. These could again act as effectors targeting lipid-linked precursors of peptidoglycan or exopolysaccharides in response to intracellular invaders or stress (Figure 5M). Interestingly, some tailed bacteriophages also code for intracellular Min-Wnt domains, suggesting that they might also be deployed on the virus side in biological conflicts such as limiting superinfection (Supplementary Data).

Lipocones as toxin domains in polymorphic and allied conflict systems

Polymorphic toxins and related systems, widespread across bacteria and certain archaea, are characterized by a highly variable C-terminal toxin domain (“toxin tip”) that is preceded by a range of more conserved domains typically required for autoproteolytic processing of the toxin, its packaging and trafficking (e.g., RHS repeats), adhesion and secretion via one of several secretory systems (174,175). The toxin might be delivered via one of the secretory systems into a target cell or else via direct contact between interacting cells. Classical polymorphic toxins are usually involved in kin discrimination and are accompanied by genomically linked cognate immunity proteins that protect against self-intoxication (174,176). Keeping with the principle of effector sharing between systems involved in distinct types of biological conflicts, we had originally identified a Min-Wnt domain closely related to those described in the above subsection as a toxin tip in polymorphic toxin systems (24). In the current work, we extend these findings to show that several distinct Lipocone families have been independently recruited as toxin tips of polymorphic toxins and related systems, namely Min-Wnt, prok-SAA, prok-TelC, CapCone-1, CapCone-2, ClaspCone-2 and VanZ-1 (Figures 4,5N,6,S7).

Certain CapCone-2 and Min-Wnt toxins from Gram-positive bacteria define some of the simplest of these toxin systems. Here, a standalone Lipocone domain is coupled to a signal peptide or lipobox via a poorly structured linker. These are usually encoded in a two-gene configuration with their cognate immunity protein (Figure 5O). More complex versions present, in addition to adhesion, peptidoglycan-binding, lipid-binding and proteolytic processing domains, multiple hallmarks of delivery through specific secretion systems. These include T4SS (VirD4-binding domain), T6SS (PAAR domain), T7SS (WXG/LXG domain), T9SS, and MuF domains (174,176) (Figure 5P, Supplementary Data). Additionally, we recovered standalone CapCone-1 domains encoded in an operon with a PsbP/MOG1 superfamily domain diagnostic of secretion via the T6SS (174,177) (Figure 5P). Further, we also found Min-Wnt domains fused to the N-termini of RTX-like β-roll repeats, suggestive of T1SS-mediated export (178) (Figure 5P).

Our analysis also uncovered multiple, previously uncharacterized trafficking/packaging systems associated with different Lipocone polymorphic toxins. Several Min-Wnt and CapCone-1 domains with lipoboxes are fused to an N-terminal Cystatin-like superfamily domain (179) (Figure 5Q). The same domain is also comparably fused to several other C-terminal toxin domains in related organisms, some of which are also predicted to target lipid head groups: (i) a novel toxin domain we unified with the lipid-targeting Colicin M fold (180); (ii) a lipid-binding START-domain-like helix-grip fold domain (154); (iii) a papain-like fold fatty acyltransferase (181); (iv) a domain related to the VanY-like D-Ala-D-Ala carboxypeptidase (182) (Figure 5Q, Supplementary Data). In all these cases, the toxins are coupled to a related immunity protein (see below), suggesting that they define a distinct polymorphic toxin system. We propose that this Cystatin-like domain specifies a novel packaging or deployment system upon secretion for the C-terminal toxin domain, analogous to the functioning of Cystatin domains with eukaryotic proteases (183). The prok-TelC family Lipocones are found in distinctive architectures in two poorly characterized, predicted polymorphic toxin systems. In one of them, they are fused to an N-terminal glucan-binding GbpC β-sandwich domain (184) and repeats of MucBP-like Ig domains (185), which might anchor them to exopolysaccharides (Figures 5Q,S7, Supplementary Data). The second variant, found in association with T9SS components (186), shows fusions to one or more copies of a previously undetected TPM domain (Figure 5Q). While the domain has been claimed to be a phosphatase (187), our recent analysis indicates that this is unlikely to be the case (164). Instead, we propose that the TPM domain might assist in assembling membrane-linked protein complexes, a role that might be relevant to the trafficking of these toxins (164).

To date, the only experimentally characterized Lipocone domain from polymorphic toxins belongs to the prok-TelC family, members of which are secreted via T7SS (29,188) (Figures 5P,6). Notably, prok-TelC has been shown to be active only outside the cell and not in the cytoplasm (29). As noted above, it attacks lipid II to cleave off the peptide-linked disaccharide pyrophosphate head group from the undecaprenol tail (Figure 3B). Prok-TelC has also been speculated to similarly attack WTA-lipid II linkages (29). These findings provide a template for other Lipocone superfamily effectors in potentially targeting lipid carrier linkages in peptidoglycan and exopolysaccharide intermediates. However, given the diversity within the family (Figure 3F), it is conceivable that they also target other lipids.

Immunity proteins of Lipocone polymorphic toxins indicate periplasmic/intramembrane action

To date, only a single immunity protein has been reported for Lipocone toxins, viz., TipC, which counters prok-TelC toxin in the periplasm (29,189) (Figures 5P,6). Here, we uncovered a range of immunity proteins belonging to structurally distinct folds that counter the remaining Lipocone toxins (Figures 5N-Q,S9, Supplementary Data). The most widespread of these is a rapidly evolving, membrane-anchored member of the BamE-like superfamily that associates with not only Min-Wnt and CapCone toxins but also other above-mentioned lipid-head-group targeting toxins (e.g., the novel Colicin M-like domain). The BamE-like fold features a core two-helix hairpin followed by a run of three β-strands (Figure S9). The classical BamE operates in a pathway for the assembly of OMP β-barrels (190,191), suggesting that these immunity proteins emerged from an ancestral BamE and, like it, function in the periplasm. Additional candidate immunity proteins with more restricted phyletic spreads include (Figure S9): (i) a β-jelly-roll fold-containing protein (144); (ii) an integral membrane protein with a 4-TM core. These two are observed with Min-Wnt toxins. (iii) A novel domain combining an α-helix with a run of 4 β-strands stabilized by four absolutely conserved cysteine residues. This is coupled to both Min-Wnt and prok-SAA toxins. (iv) A protein with an OB-fold domain (145); (v) a protein with a β-sandwich related to the eukaryotic centriolar assembly SAS-6 N-terminal domain (192). The last two are coupled to CapCone-2 toxins (Supplementary Data). Notably, despite their structural diversity, these immunity proteins are all TM or lipoproteins and, like TipC (29,189), are predicted to operate at the membrane or in the periplasm (Figures 5N-Q, Supplementary Data). This suggests that they intercept their cognate Lipocone toxin domain outside cells or in the membrane rather than within the cell.

Lipocone toxins in predator-prey and other interspecific conflicts

In contrast to polymorphic toxins, which are typically deployed in intraspecific conflict between competing strains of the same species, other toxin systems are deployed against more distantly related target cells, such as prey and eukaryotic hosts (193). While some of these closely parallel polymorphic toxins in their domain architecture, they are usually distinguished by the lack of an accompanying immunity protein. The simplest of these systems are secreted Min-Wnt proteins from bacteria and fungi. These present just a standalone Min-Wnt domain or one fused to a novel domain with a half β-barrel wrapping around a helix (Figure 5R, Supplementary Data). These are probably deployed as diffusible toxins that target rival organisms in the environment.

Another architectural theme is defined by Min-Wnt and prok-SAA Lipocones fused to an enigmatic, novel, short C-terminal domain, which consists of a long β-hairpin with a characteristic “break” in its central region, causing it to acquire an arch-like appearance (Figures 5S,S10). Hence, we refer to this domain as the broken-hairpin. We found the broken-hairpin domain to be fused to a wide array of predicted toxin domains across the bacterial superkingdom. These include effector domains otherwise found in polymorphic toxin and allied systems that target peptidoglycan, carrier lipids and the membrane, such as members of the Colicin M (180), Zeta toxin-kinase (194), lysozyme (195), and α/β-hydrolase superfamilies (110), and nuclease toxins such as members of the HNH, HipA, SNase, and BECR superfamilies (174) (Figures 5S,S10C). Remarkably, these proteins with the broken-hairpin tend to lack a signal peptide or association with any other secretion system or immunity proteins (Figure 5S, Supplementary Data). Hence, we propose that the broken-hairpin domain itself serves as a trafficking mechanism for the externalization of these toxins in conflicts with rival environmental organisms.

Some predicted secreted Lipocones are found predominantly in predatory bacteria. The first of these are CapCone-2 domains from lineages like Bdellovibrionota, which are encoded in two-gene systems, with the second gene coding for a further secreted effector such as an α/β-hydrolase, Patatin, or acyltransferase or an OMP β-barrel domain (110,116,117,123) (Figures 5T,S7). Myxobacteria and some other lineages code for secreted prok-SAA domains fused to an N-terminal Zincin-like metallopeptidase domain, and the first bacterial example of the von Willebrand Factor D (vWD) and Ig domains at the C-terminus (196) (Figure 5T, Supplemental Material). In the recently described predatory Patescibacterial branch of Omnitrophota species, Skillet-clade Lipocone domains are found in gigantic proteins combined with several other domains and TM segments. Domains found in these proteins include polysaccharide biosynthesis enzymes (94,97), signaling proteins involved in Histidine kinase-Receiver relays (197), peptidases of the MPTase and Papain-like superfamily (113,138), diverse methylases, and extracellular ligand-binding domains like the peptidoglycan-binding LysM domain (198) (Figures 5T,6). Given the concentration of the above systems in predatory bacteria (Supplementary Data), we posit that the above Lipocones might function as toxins targeting prey membranes alongside a battery of effectors targeting other cellular components. In particular, the CapCone-2 systems might play a role in the breaching of outer membranes by Bdellovibrionota. Animal vWD domains are involved in adhesion (199); hence, the bacterial versions might play a similar role in adhering to prey cells, while the MPTase in these proteins potentially releases the associated prok-SAA toxin through autoproteolysis. Finally, the giant proteins from the Patescibacteria are likely to combine signaling prey presence with overcoming prey defenses and breaching prey membranes.

Certain prok-TelC proteins are observed as part of several distinctive systems that could be involved in as-yet-undiscovered predatory interactions or in targeting environmental competitors. One such, defined by large proteins from spore-forming Bacillota, combines a diversifying set of extracellular ligand-binding domains (e.g., Ig-like, Cell-wall-binding β-hairpins and β-propellers (146,147,200)) with a two-enzyme core formed by a prok-TelC and an N-acetylglucosamine (GlcNAc)-1-phosphodiester alpha-N-acetylglucosaminidase (NAGPA). NAGPA catalyzes phosphoric-diester hydrolysis to release phosphodiester-linked sugars (Figures 5U,6,S7) (201). Some of these proteins feature an additional NlpC/p60 superfamily peptidase domain predicted to target peptidoglycan (181). The recombinational diversity of ligand-binding domains in this system, even among closely related Bacillota species, supports a possible arms race and involvement in a biological conflict. Other TelC domains in some Bacillota, Actinomycetota, and fungi are fused to peptidoglycan-binding domains (PGBD) (202) and an Rv2525c-like TIM-barrel (203) (Figures 5U,S7). In Actinomycetota, this protein is further combined in operons with either of two mutually exclusive genes coding for rapidly evolving proteins (Figure 5U): (i) a secreted protein containing a pair of Ig domains (200); (ii) a 3-TM protein (3TM-CCDN) with two conserved cysteines, an aspartate, and an asparagine predicted to be located between the TM segments outside the cell. This version is further coupled to a gene for a secreted VanY superfamily peptidase (182) (Figure 5U, Supplementary Data). Common to these contexts are rapidly evolving and variable domains on the one hand and peptidoglycan/exopolysaccharide binding or degrading domains on the other. Hence, we interpret these as potential conflict systems that engage the cell wall and target it and associated membranes in rival bacteria.

Lipocone domains in resistance to antimicrobial agents

VanZ-1 proteins (Figure 1A) were initially identified as encoded by a gene linked to that coding for the VanY D-alanyl-D-alanine carboxypeptidase involved in resistance to glycopeptide antibiotics like vancomycin and teicoplanin (31,204–206) (Figure 6). These antibiotics bind the terminal D-Ala-D-Ala in the peptide moiety of peptidoglycan, preventing the transpeptidase cross-linking reaction necessary for its maturation. Upon detection of these antibiotics, enzymes encoded by the core vancomycin resistance operon re-engineer the exported peptidoglycan by inserting a D-Ala-D-Lac in place of the D-Ala-D-Ala linkage, precluding antibiotic binding (53). The VanY peptidase, while not strictly required for antibiotic resistance, acts as an accessory to this system by cleaving any remaining D-Ala-D-Ala linkages generated via the canonical pathway (53,205). However, the role of VanZ in this system has so far remained unknown. While only a small fraction of the VanZ-1 genes are found in these antibiotic resistance contexts (Supplementary Data), interestingly, other Lipocone genes, namely those of the VanZ-2 and the Skillet families, might also be linked to VanY in lieu of VanZ-1. Further, VanY might be replaced by a structurally unrelated secreted D-Ala-D-Ala carboxypeptidase of the metallo-beta-lactamase fold (111) in operonic contexts with VanZ-1 (Figure 5V). Hence, given our above prediction regarding VanZ acting in peptidoglycan and/or exopolysaccharide metabolism, VanZ-1 and the Lipocones displacing it might indeed play an accessory role with VanY at the membrane (204,205) in antibiotic resistance. We posit that, in these contexts, it likely acts on the head group of Lipid II to recycle canonical peptidoglycan intermediates for their accelerated or more thorough replacement with the resistant versions (Figure 3G).

We also identified a conserved five-gene operon featuring a YfiM-1 family Lipocone that might play a role in resistance to antibacterial agents (Figures 5W,S5). Other than YfiM-1, this operon contains genes for: (i) a thioredoxin domain protein (109); (ii) a DTW clade RNA modifying enzyme of the SPOUT superfamily (207,208); (iii) a protein with acyl-CoA ligase, GNAT superfamily N-acetyltransferase and ATP-grasp domains (131,209,210); (iv) a PssA-like phosphatidylserine synthetase of the HKD superfamily (211) (Figure 5W, Supplementary Data). Of these enzymes, the phosphatidylserine synthetase is predicted to act in its usual capacity to generate a lipid with a serine head group (211). We propose that this would then function as a substrate for the YfiM-1 Lipocone domain, which might exchange the serine for another moiety via a reaction paralleling PTDSS1/2 (Figure 3A). This moiety could then be modified by aminoacylation, further acylation and a redox modification by the third protein listed above, together with the thioredoxin. Indeed, such peptide modifications of lipid head groups by lysine, alanine, or arginine aminoacylation catalyzed by derived tRNA synthetases fused to GNATs have been shown to be a key resistance mechanism against breaching of the membrane by antibacterial peptides (212,213). Hence, we predict the modifications catalyzed by this system might play a comparable role. The presence of a tRNA-modifying DTW domain suggests that in parallel to the tRNA synthetases, the GNAT in this system might use a tRNA-linked acyl group as a substrate, as seen in peptidoglycan biosynthesis (214,215).

Eukaryotic recruitments of the Lipocone superfamily

Lipocone domains have been transferred on several occasions from bacteria to eukaryotes (Figures 4,S1). While there is predicted functional overlap with the above-described, predominantly bacterial versions, we discuss these separately as the inferred biological contexts of their deployment are often distinct from the above.

Plant YfiM-1 and eukaryotic VanZ-2 proteins

A conserved YfiM-1 family protein typified by the Arabidopsis AT1G15900 was acquired from the Bacteroidetes lineage of bacteria at the base of the plant lineage prior to the chlorophyte-streptophyte (including land plants) split and is predicted to be catalytically active (Figure S2, Supplementary Data). In Arabidopsis, this gene is widely expressed across different tissue types, developmental stages, and other tested conditions (216,217). Given the above-predicted roles for bacterial YfiM-1 proteins, it is conceivable that the plant version plays a comparable role in the metabolism of a conserved plant-specific lipid. In a similar vein, a distinct clade of standalone VanZ-2 domains typified by the Saccharomyces cerevisiae YJR112W-A was acquired early in the fungal lineage. A similar transfer is also seen in the SAR clade of eukaryotes (Figure S1). Since these eukaryotes lack peptidoglycan and other bacterial-type isoprenoid lipid-borne exopolysaccharide intermediates, we suggest that this version was recruited for modifications of a fungus-specific lipid (e.g., highly oxygenated isoprenoid lipids) (218).

The Met-TelC proteins

The Met-TelC clade is comprised of versions of the TelC family with a reconfigured active site transferred from bacteria to Metazoa prior to the divergence of the cnidarians, and most members are predicted to be catalytically inactive (Figure 2). In cnidarians and arthropods, the Met-TelC domain is found in a secreted protein fused to C-terminal adhesion-related vWA (219) and Ig domains, followed by a TM helix (Figure 5X, Supplementary Data). The chordate version, typified by human PGLYRP2 (220), is also secreted and is fused to a C-terminal Amidase targeting the N-acetylmuramoyl-L-alanine linkage (Figure 5X, Supplementary Data). PGLYRP2 is a key innate immunity factor against bacterial pathogens degrading sugar-peptide linkages in peptidoglycan via the Amidase domain (221–223). As most Met-TelC proteins lack the active site residues but are modeled to retain the substrate-binding pocket, we propose that they participate in anti-bacterial immunity as a Pathogen-Associated Molecular Pattern (PAMP) receptor (224). Specifically, they could recognize polyisoprenoid pyrophosphate-linkage-containing lipid intermediates of bacterial cell-surface molecules like peptidoglycan or exopolysaccharides.

Eukaryotic Wnt proteins

Wnt family Lipocones were transferred on multiple occasions to eukaryotes. The best-known of these are Met-Wnt proteins, which were acquired from bacteria at the base of Metazoa after they had separated from their closest sister group, the choanoflagellates. These lost the ancestral active site residues and function as well-studied secreted signaling molecules and will not be detailed further in this work (for review, see (1,225)). Independently of the Met-Wnt proteins, catalytically active, secreted versions closely related to the bacterial Min-Wnt proteins were transferred to fungi and, within Metazoa, to the rotifers and the hemichordate acorn worm Saccoglossus kowalevskii, where they are lineage-specifically expanded (Figure S1, Supplementary Data). These versions are primarily standalone versions of the Min-Wnt domain, lacking the large inserts typical of the Met-Wnt proteins (Figure 1A). We predict that these eukaryotic Min-Wnt proteins retain their ancestral toxin role and might participate in anti-bacterial immunity.

Met-SAA proteins

Met-SAA proteins (Figure 1A) were acquired from bacteria prior to the divergence of the cnidarians from the rest of Metazoa. However, unlike the Met-Wnt and Met-TelC proteins, they often conserve the ancestral active site residues, indicating that they are usually enzymatically active (Figure 3). Human SAA has been recognized as a key immune marker that dramatically increases in blood during the Acute Phase Response (226). It has been reported to bind the E. coli outer membrane protein OmpA (227) and claimed to function as an opsonin in innate immunity (228). Like Met-TelC, but in contrast to Met-Wnts, Met-SAAs appear to have been lost or pseudogenized in several animal lineages (229–231) (Figure 3). This is consistent with an arms-race scenario in immunity and the development of pathogen resistance against the Met-SAAs, leading to loss. In keeping with an immune role for the Met-SAAs, we propose a catalytic function for the active versions in severing lipid head groups of outer-membrane lipids or of isoprenoid lipid carrier intermediates. Such action could also generate PAMPs that could explain the activation of neutrophil- and macrophage-based immunity by SAA (228). Pertinent to these observations, diverse OMP β-barrels have been linked to the translocation of polymorphic toxin domains across the outer membrane of target cells (232–234). Given this and the origin of Met-SAA from bacterial polymorphic toxin-related systems (Figure 4), its interaction with OmpA might help it cross over into the periplasmic space and act on maturing peptidoglycan or teichoic acid intermediates.

SAA was first reported as a component of secondary amyloid deposits (235), and its capacity to form amyloid fibrils upon protease cleavage was theorized as a potential PAMP activating the immune response (30). Indeed, bacteria produce their own secreted amyloids, such as Curli and Fap, believed to contribute to biofilm formation (236,237), and might be PAMPs recognized by animal immune systems (238). Further, other animal amyloids, such as the β-amyloid, have been proposed to play a role as physical barriers in immunity against bacteria (239). Thus, amyloid formation by protease cleavage (including potentially by bacterial proteases) may represent a second line of defense mediated by Met-SAA proteins.

Discussion

Early Evolution of the Lipocone superfamily

No single well-defined Lipocone clade is universally conserved across the three superkingdoms of Life (Figures 4, S1). However, the VanZ and Wok clades are both found across all major bacterial phyla (notwithstanding sporadic losses in certain lineages) and in some archaeal lineages (Figure S1). At the same time, the cpCone clade is found across most major archaeal lineages and is nearly universally conserved in the eukaryotes (absent in Ascomycota and some choanoflagellates) (Figure S1). Notably, the cpCone and Wok clades tend to group together in the profile-profile similarity network (Figure 1B). These observations suggest that at least a single version of the Lipocone superfamily was likely present in the Last Universal Common Ancestor (LUCA). The phyletic patterns suggest that the LUCA Lipocone gave rise to the VanZ/Wok precursor in the bacterial lineage on the one hand and the cpCone clade via a circular permutation event in the archaeo-eukaryotic lineage on the other (Figure 4). Based on the features of these deep-branching clades, the LUCA version is inferred to feature a hydrophobic domain with a 4TM helix core, with the active site facing the outer leaf of the lipid bilayer (Figure 1C). Given that extant versions operate both on classic phospholipids and isoprenoid lipids, it is difficult to infer which of these might have been substrates for the LUCA version. It is possible that this early version had a generic specificity that became specialized in the descendant clades.

Subsequent diversification of the Lipocone domain

The early diversification of the Lipocone domain appears to have had different drivers in the two prokaryotic superkingdoms. The presence of an extensive repertoire of exopolysaccharides in the cell wall (peptidoglycan, teichoic acids), cell surface (e.g., ECA), and outer membrane (e.g., lipopolysaccharide), synthesized via isoprenoid lipid-linked intermediates, like lipid-II, was the primary driver in the bacterial superkingdom (240). Here, this diversification yielded 4 monophyletic groups: the VanZs, Wok, YfiM and Skillet (Figure 4). The deeper VanZ and Wok branches, which were likely recruited first for lipid-II-related functions, were probably the predecessors of the more restricted bacterial families with specialized functions. For instance, the emergence of the outer membrane in certain bacteria was potentially coupled with the origin of the YfiM-like clade (Figure 4). Similarly, our predictions suggest that within these clades, further diversification accompanied the acquisition of specialized functional roles in antibiotic resistance, secondary sensor roles in single and multicomponent signaling and lipoprotein processing. The interoperability of Lipocone domains on lipid carriers shared across different biosynthetic pathways (see above, Figures 3,S5-S6) appears to have been a key factor leading to this versatility.

In the ancestral archaeo-eukaryotic lineage, the absence of peptidoglycan and an apparently lower diversity of structures with exopolysaccharides was reflected in the lesser diversification of the Lipocone clades (Figure 4). There are open questions regarding the biochemical functions of the primary archaeo-eukaryotic Lipocone clade, the cpCone. Although the eukaryotic cpCone PTDSS1/2 family has been shown to swap serine for ethanolamine or choline in lipid head groups (28,57), their archaeal counterparts remain uncharacterized. Archaea have their own lipid with a serine in the head group (archaeophosphatidylserine), but to date, its synthesis has been shown to depend on a patchwork of different CDP-alcohol phosphatidyltransferase enzymes (CaPs) in different archaeal species (58,241,242). While the CaPs are also integral membrane enzymes with a 6TM helix core, catalyzing reactions comparable to those of the Lipocones on lipid head groups in archaea and eukaryotes (242), they are evolutionarily unrelated. Nevertheless, we suggest that the archaeal cpCones, like their eukaryotic counterparts, could contribute to distinct, as yet uncharacterized, pathways for the generation of cell membrane phospholipids like archaeophosphatidylserine or those with other head groups.

Emergence of diffusible versions of the Lipocone domain and their repeated recruitment in biological conflicts

One of the remarkable aspects of the Lipocone superfamily is the loss of ancestral hydrophobicity in several families (Figure 1C), transforming them from integral membrane proteins to diffusible domains. While unexpected, such a transition in integral membrane enzymes acting on lipid substrates is not unprecedented. The PAP2 superfamily of integral membrane enzymes (e.g., diacylglycerol diphosphate phosphatase) (124,243) also contains several soluble versions (244) that appear to have emerged from an integral membrane ancestor (AMB and LA, unpublished observations). Most of the soluble Lipocone domains retain their active site conservation (Figure 2) and, at least in one experimentally characterized case, catalyze a reaction comparable to that of the TM version (29) (Figure 3B). The weight of the evidence presented here, including the profile-profile similarity network (Figure 1B), phyletic patterns (Figure S1), functional contexts (Figures 5-6), and the broadly shared structural features (Figures 1A, S2, S4), suggests that the loss of hydrophobicity occurred on a single occasion in the Lipocone superfamily, followed by diversification of these diffusible versions.

Our analysis of the diffusible Lipocone families reveals repeated recruitment as toxins/effectors in anti-viral and polymorphic toxin and allied systems (174), suggesting that their diversification was driven by the arms races arising from the biological conflicts where they are deployed. Recruitment of a representative of the VanZ-1 family as a polymorphic toxin on rare occasions (Figures 5N,6) suggests a possible evolutionary pathway for their recruitment as toxins: the effector version of Lipocones attacking lipids in competing bacteria likely emerged from an ancestral version that catalyzed endogenous lipid-head-group modifications on the same lipids in metabolic pathways. Once versions with reduced hydrophobicity emerged, they could be deployed as diffusible effectors that were shared across extracellular and intracellular conflict systems, a trend previously recognized in many other effector domains (25).

Repeated acquisition of Lipocones of bacterial origin by eukaryotes

Unlike bacteria, eukaryotes as a whole do not possess a rich repertoire of Lipocone domains. The PTDSS1/2 family, vertically inherited from the archaeal progenitor, is the only version that can be inferred as being present in the Last Eukaryotic Common Ancestor (Figure 4). However, distinct Lipocone families of ultimately bacterial provenance were acquired early and fixed in certain eukaryotic lineages: (i) YfiM-1 in the plant lineage; (ii) the fungal VanZ-2 domains typified by the Saccharomyces cerevisiae YJR112W-A; (iii) Met-Wnt (discussed further below) (Figure S1). The early fixation of these versions in the eukaryotic lineages possessing them suggests that they were recruited for definitive “housekeeping” or developmental functions in the respective lineages. Beyond these, the fungal and metazoan lineages show more sporadically distributed versions, which have all been acquired from bacterial secreted-toxin or antiviral systems: (i) Min-Wnt independently in fungi and certain Metazoa; (ii) SAA; (iii) TelC; the latter two are absent in the basal-most metazoans, the sponges, but are present in Cnidaria, suggesting a relatively early acquisition (Figure 4). The weight of the evidence suggests that they have retained certain aspects of the ancestral bacterial effector function for anti-pathogen immunity in eukaryotes. This is consistent with both their episodic loss and lineage-specific expansion, the tendency to show rapid sequence divergence and, in the Met-TelC family, loss of catalytic activity (Figures 2,S1, Supplementary Data).

This independent acquisition of at least 3 distinct Lipocone families in metazoan immunity from polymorphic and allied effector systems of prokaryotes points to a persistent evolutionary trend. Notably, the Lipocone domains participating in animal immunity have been drawn from secreted effectors rather than the intracellular versions (bacterial intracellular Min-Wnts) predicted to participate in bacterial anti-selfish element immunity. More generally, this adds to a growing list of components drawn from secreted effector systems of prokaryotes in eukaryotic immune systems (174,193,245). For example, this closely parallels another structurally unrelated effector domain, the Zn-dependent deaminase (e.g., metazoan AID/APOBEC deaminases) (246). Hence, these observations add further support to our hypothesis that the extensive expansion of effectors in diverse prokaryotic inter-organismal conflict systems served as a reservoir from which eukaryotic immune systems repeatedly acquired components (193,245). We propose that symbiotic associations between the early animals and bacteria resulted in potential interactions via secreted effectors of the latter that aided the former against antagonistic bacteria. This probably led to their eventual acquisition by animals and incorporation into their immune processes.

Origin of Wnt as a signaling molecule

Earlier considerations on the evolution of Wnt signaling indicated that it emerged at the base of the metazoan lineage and incorporated a wide range of components of different origins (e.g., the HMG domain transcription factor TCF/LEF, the HEAT repeat protein β-catenin and the 7TM receptor Frizzled) (1). However, the provenance of Met-Wnt itself had been mysterious and was seen as a possible example of a metazoan innovation (225). While the Met-Wnt domains possess peculiar structural elaborations (34), their conserved core is a Lipocone domain (Figure 1A). We establish that the progenitor of Met-Wnt emerged as part of the radiation of Lipocone domains in bacteria as effectors deployed in both intracellular and inter-organismal conflict: the Min-Wnt proteins.

Whereas the Min-Wnt proteins are predicted to be secreted toxins, the Met-Wnts underwent an ancestral inactivation through loss of the catalytic residues (Figure 2). However, they retained their ancient involvement in cell-cell interactions as secreted agents. The Met-Wnt residues recognized as essential for the receptor (Frizzled) binding, including the absolutely conserved palmitoleoylated serine residue, are found in the aforementioned Metazoa-specific hairpins and loops (34,36). However, despite their inactivation, the Met-Wnts retain the ancestral substrate-binding pocket (Figures 1A, S2). This raises the possibility that they might be involved in as-yet unexplored interactions with ligands such as lipids.

Our tracing of the provenance of Wnt back to an effector in secreted bacterial toxin systems adds it to a growing list of components in metazoan signaling networks that have been acquired from such systems. For instance, this is also the case with components of the other key metazoan signaling pathway, Hedgehog (247). Here, the Hedgehog protein itself contains an autoproteolytic HINT peptidase domain that was likely drawn from a structurally and functionally cognate domain observed in polymorphic toxin systems (174,247). Further, an intracellular component of the same signaling pathway, Suppressor of Fused (SuFu), was derived from a common immunity protein found in polymorphic toxin systems (247). Similarly, the Teneurin/Odd Oz proteins mediating signaling in cell migration, neuronal pathfinding, and fasciculation in Metazoa descended from a polymorphic toxin protein with a C-terminal HNH endonuclease toxin tip (174). In a similar vein, the immunity protein of certain CapCone toxins identified in this study might have given rise to the β-sandwich domain in the eukaryotic centriolar assembly factor SAS-6. These observations suggest that, in addition to immune system components, interactions with symbiotic bacteria also potentially furnished the progenitors of components of eukaryotic signaling and cytoskeletal networks that were central to the emergence of Metazoa as a clade of multicellular eukaryotes (248,249).

Conclusions

Using sensitive sequence and structure analysis, we unify a large, hitherto unrecognized superfamily of enzymatic domains, the Lipocone. By combining analysis of the active site and the structure of the Lipocone domain with contextual information from conserved gene-neighborhoods and domain architectures, we present evidence that members of this superfamily target phosphate linkages in head groups of both classical phospholipids and polyisoprenoid lipids. Specifically, they catalyze reactions such as head group exchange or severing of the head group-diphosphate linkage from the polyisoprenol. We present evidence that these activities have been recruited in a wide range of biochemical contexts, including cell membrane lipid modification, metabolism of peptidoglycan and exopolysaccharide lipid-carrier linked intermediates, lipoprotein modifications, bacterial outer membrane modification, sensing of membrane-associated signals, effector activity in antiviral and inter-organismal conflicts and resistance to antimicrobials. Further, catalytically inactive versions like Met-Wnt have been recruited for signaling roles in Metazoa. We predict the catalytic activity and potential biochemical pathways of numerous representatives for the first time, including some proteins that have remained enigmatic for over two decades, like VanZ.

We identify three notable trends in Lipocone evolution. First, although we reconstruct the ancestral member of the superfamily as being a 4TM integral membrane domain, a large monophyletic subset underwent a dramatic loss of hydrophobicity, transforming them into diffusible versions, including the Wnts and the SAAs (Figure 1C). Second, the superfamily expanded in two major functional niches in bacteria, namely peptidoglycan/exopolysaccharide metabolism and effector domains of both secreted toxins and immune systems (Figure 4). Finally, members of the Lipocone superfamily were acquired on multiple occasions from bacteria by Metazoa and were reused in new functional contexts as signaling messengers and immune factors (Figure 4).

Importantly, our predictions in this regard underscore that much remains unexplored in terms of lineage-specific cell wall and membrane metabolism in prokaryotes. We present several testable biochemical, functional hypotheses for the many poorly understood branches of the superfamily, several of which are being recognized as enzymatic for the first time here. We hope this will also open new avenues of research to fill key gaps in our understanding of lipid metabolism.

Methods

Sequence analysis

Sequence similarity searches were performed using PSI-BLAST (250) and JackHMMER (251) against the NCBI non-redundant protein database (nr) (252) or a version clustered down to 50% sequence identity (nr50). The searches were initiated using the previously identified prokaryotic Wnt (24), with multiple rounds of searches conducted, each using seeds collected from the preceding searches. Clusters based on sequence similarity (percentage identity or bit-score) were generated using MMseqs (253). The clustering parameters were adjusted according to specific goals, enabling redundancy removal, the definition of homologous groups, and the creation of new profiles. Multiple sequence alignments (MSA) were generated using the MAFFT program (254) with the local-pair algorithm, combined with the parameters --maxiterate 3000, --op 1.5, and --ep 0.2, and were manually refined based on structural superpositions and profile-profile comparisons.
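As an illustration only, the MAFFT settings named above can be assembled into a command line as in the following Python sketch. The input file name is hypothetical, and the helper function is ours rather than part of the pipeline used in this work; the flags are those stated in the text, with `--localpair` selecting the local-pair algorithm.

```python
import shlex

def mafft_command(fasta_path, maxiterate=3000, op=1.5, ep=0.2):
    """Build the argument list for a MAFFT local-pair alignment run."""
    return [
        "mafft",
        "--localpair",                    # local-pair algorithm
        "--maxiterate", str(maxiterate),  # iterative refinement cycles
        "--op", str(op),                  # gap opening penalty
        "--ep", str(ep),                  # offset (gap extension-like) value
        fasta_path,
    ]

# "family_seeds.fasta" is a hypothetical input file name.
cmd = mafft_command("family_seeds.fasta")
print(shlex.join(cmd))
```

MAFFT writes the alignment to standard output, so in practice the list would be passed to `subprocess.run` with the output redirected to a file.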

Sequence similarity network analysis

The HHalign program (255) was used to perform profile-profile comparisons, with the resulting p-value and e-value scores used to weight the edges of a superfamily relationship network. This was then analyzed using the Leiden community finding algorithm (37) to detect sub-networks. Network analysis and visualization were performed using the R igraph (256) or Python NetworkX (257) libraries.
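The construction of such a network can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the family names and e-values below are invented, the e-value cutoff of 1e-3 is arbitrary, the edge weight is taken as -log10(e-value), and grouping is done by plain connected components as a simple stand-in for the Leiden algorithm, which is not implemented here.

```python
import math
from collections import defaultdict

def build_network(hits, max_evalue=1e-3):
    """Return {node: {neighbor: weight}} for pairs at or below max_evalue."""
    graph = defaultdict(dict)
    for a, b, evalue in hits:
        if a == b or evalue > max_evalue:
            continue
        weight = -math.log10(max(evalue, 1e-300))  # guard against log10(0)
        # Keep the strongest edge if a pair is reported more than once.
        if weight > graph[a].get(b, float("-inf")):
            graph[a][b] = weight
            graph[b][a] = weight
    return graph

def connected_components(graph, nodes):
    """Group nodes into connected components by depth-first traversal."""
    seen, components = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(graph.get(node, {}).keys() - comp)
        seen |= comp
        components.append(comp)
    return components

# Invented profile-profile hits: (family 1, family 2, e-value).
toy_hits = [
    ("famA", "famB", 1e-12),
    ("famB", "famC", 1e-5),
    ("famD", "famE", 1e-20),
    ("famD", "famA", 0.5),   # too weak, dropped by the cutoff
]
toy_nodes = {name for a, b, _ in toy_hits for name in (a, b)}
network = build_network(toy_hits)
groups = connected_components(network, toy_nodes)
print(groups)
```

Connected components will merge any two groups joined by a single passing edge, which is why a modularity-based method such as Leiden is preferable for real data.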

Comparative genomics, domain identification, and phylogenetic analysis

Genomic neighborhoods were obtained from genomes available in the NCBI Genome database (252) using in-house scripts written in Perl and Python. Conservation analysis of these genomic neighborhoods was performed by clustering the protein products of neighboring genes. Domain identification was conducted using a collection of HMMs and PSSMs maintained by the Aravind lab, along with HMMs from the Pfam database (258), utilizing the RPSBLAST (259) and HMMSCAN (260) programs. To further refine detection, domain identification was extended through remote homology analysis using the HHpred (261) program, against profiles built from the Pfam (258) and PDB70 (262) databases. Phylogenetic analyses were performed using FastTree (263) and iqTREE2 (264). Experimental functional data for characterized members of the superfamily were collected with the assistance of the ChatGPT language model (https://chat.openai.com). Structural comparisons, along with shared genomic associations, were used to further refine the interrelationships within and between the groups of the superfamily.

Families with broader presence across multiple major lineages (“phyla”) and deeper conservation within each of those lineages were inferred to be more ancient. In contrast, those with a more limited phyletic spread and/or limited depth of occurrence within each major lineage were likely later derivations (Figures 4,S1). We formalized this inference by calculating a phyletic metric for the Lipocone clades (Figure S1) comprised of both the phyletic spread and depth. The phyletic spread S_i of the ith Lipocone clade was computed thus:

$$S_i = \frac{m_i}{M}$$

where m_i is the number of lineages with at least one representative of the ith Lipocone clade, and M is the total number of lineages examined. The phyletic depth D_i of the ith Lipocone clade was computed as a weighted average of its occurrence within each lineage in the form of the mediant:

$$D_i = \frac{\sum_j n_j}{\sum_j N_j}$$

where n_j is the number of species in lineage j with a Lipocone domain of the ith Lipocone clade and N_j is the total number of species sampled in lineage j. S_i and D_i are plotted as a bar graph with S_i as its width and D_i as its height.
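The two quantities can be illustrated with a short Python sketch. It assumes that the spread is the fraction m_i/M and that the depth is the mediant of the per-lineage fractions n_j/N_j, that is, the sum of the n_j divided by the sum of the N_j. Whether the sums run over all examined lineages or only over those containing the clade is not specified here, so the sketch exposes this as a hypothetical `present_only` switch. The lineage names and counts are invented.

```python
def phyletic_spread(counts):
    """S_i = m_i / M, where counts maps lineage -> (n_j, N_j)."""
    total_lineages = len(counts)
    with_clade = sum(1 for n, _ in counts.values() if n > 0)
    return with_clade / total_lineages

def phyletic_depth(counts, present_only=False):
    """D_i as the mediant of the fractions n_j / N_j.

    present_only=True restricts the sums to lineages with n_j > 0
    (an assumption; the alternative sums over every lineage examined).
    """
    pairs = [(n, N) for n, N in counts.values() if n > 0 or not present_only]
    return sum(n for n, _ in pairs) / sum(N for _, N in pairs)

# Invented example: species with the clade / species sampled, per lineage.
toy_counts = {
    "lineage_A": (8, 10),
    "lineage_B": (0, 20),
    "lineage_C": (5, 10),
}
spread = phyletic_spread(toy_counts)
depth_all = phyletic_depth(toy_counts)
depth_present = phyletic_depth(toy_counts, present_only=True)
print(spread, depth_all, depth_present)
```

The mediant weights each lineage by its sample size, so a heavily sampled lineage influences the depth more than a sparsely sampled one, unlike a plain mean of the per-lineage fractions.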

Contextual network construction

Each domain architecture and conserved gene neighborhood was decomposed into its constituent domains. These domains were then labeled by biochemical function and stored as a YAML file (Supplementary Data). The contextual connections were then rendered as a graph with the domains as its nodes and the adjacency relationships as its edges. Cliques containing a given Lipocone domain were detected in this graph and merged to constitute their respective dense subgraphs. These subgraphs were then examined for the statistically significant prevalence of particular labeled functions using the Fisher exact test. Network analysis was performed using the functions of the R igraph or Python NetworkX libraries.
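The clique-merging and enrichment steps can be sketched as follows. This is a dependency-free illustration rather than the igraph or NetworkX calls used in the study: maximal cliques are enumerated with a basic Bron-Kerbosch recursion, and the Fisher exact test is taken as a one-sided hypergeometric tail. The domain names, labels and edges are toy values invented for the example.

```python
from math import comb


def build_adjacency(edges):
    """Undirected adjacency map: node -> set of neighbouring nodes."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques (Bron-Kerbosch without pivoting)."""
    def expand(r, p, x):
        if not p and not x:
            yield frozenset(r)
            return
        for v in list(p):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}
    yield from expand(set(), set(adj), set())


def dense_subgraph(adj, query):
    """Merge the nodes of every maximal clique that contains `query`."""
    nodes = set()
    for clique in maximal_cliques(adj):
        if query in clique:
            nodes |= clique
    return nodes


def fisher_enrichment(k, n, K, N):
    """One-sided p-value P(X >= k): n nodes drawn from N, K carrying the label."""
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return tail / comb(N, n)


# Toy contextual network (hypothetical domains and labels).
edges = [("Lipocone", "GT"), ("Lipocone", "Flippase"), ("GT", "Flippase"),
         ("Lipocone", "Kinase"), ("Kinase", "HTH")]
labels = {"Lipocone": "lipid", "GT": "polysaccharide",
          "Flippase": "polysaccharide", "Kinase": "signaling",
          "HTH": "transcription"}
adj = build_adjacency(edges)
sub = dense_subgraph(adj, "Lipocone")
k = sum(labels[d] == "polysaccharide" for d in sub)
K = sum(v == "polysaccharide" for v in labels.values())
p_value = fisher_enrichment(k, len(sub), K, len(labels))
```

In this toy graph the HTH node is excluded because it shares no clique with the Lipocone node; a real analysis would involve far larger networks and a multiple-testing consideration across labels.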

Structure analysis

Protein structures were modeled using AlphaFold3 (265), with visualization and manipulation performed using either Mol* (266) or PyMOL. Structural similarity searches were conducted using the DaliLite (267) and Foldseek (268) programs. DaliLite was also used to generate structural alignments.

Hydrophobicity analysis

To create the membrane propensity plots, for each protein P_i in a given family, we computed the average TM propensity of its amino acids using the TM tendency scale (38). This score H_i for P_i is calculated as:

$$H_i = \frac{1}{n}\sum_{j=1}^{n} h_j$$

where h_j is the TM tendency of the j-th amino acid in the protein P_i, and n is its total length in amino acids. The Kruskal–Wallis nonparametric test was applied to assess whether TM propensity scores differed across the 30 groups. As the Kruskal–Wallis test indicated a significant difference (p<0.05), we performed post-hoc pairwise comparisons using Dunn’s test with Bonferroni correction to control for multiple testing. Group-wise visualizations were presented using critical difference diagrams, where groups not connected by horizontal bars are significantly different (adjusted p<0.05) (Figure S3).
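The per-protein score can be illustrated with a short sketch that takes H_i as the arithmetic mean of the per-residue TM tendency values. The scale below is a placeholder with made-up numbers for four residues, not the published TM tendency scale (38), and the function name is hypothetical.

```python
def mean_tm_propensity(sequence, scale):
    """H_i = (1/n) * sum of h_j over the n residues of one protein."""
    if not sequence:
        raise ValueError("empty sequence")
    return sum(scale[aa] for aa in sequence) / len(sequence)


# Placeholder values for illustration only; NOT the published scale.
TOY_SCALE = {"L": 1.0, "A": 0.5, "K": -1.0, "D": -1.5}

h_membrane_like = mean_tm_propensity("LLAK", TOY_SCALE)   # (1 + 1 + 0.5 - 1) / 4
h_soluble_like = mean_tm_propensity("KDDA", TOY_SCALE)    # (-1 - 1.5 - 1.5 + 0.5) / 4
```

Scores computed this way for every member of each family would form the per-group samples that the nonparametric tests then compare.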

Acknowledgements

This research was supported by the Division of Intramural Research at the National Library of Medicine (NLM), National Institutes of Health (NIH). This research was supported in part by an appointment to the NLM Research Participation Program administered by the Oak Ridge Institute for Science and Education (ORISE) through an interagency agreement between the U.S. Department of Energy (DOE) and the NLM.

Additional files

Supplementary tables and figure

References

1. Richards G.S., Degnan B.M. (2009) The dawn of developmental signaling in the metazoa. Cold Spring Harb Symp Quant Biol 74:81–90
2. Jessen S., Gu B., Dai X. (2008) Pygopus and the Wnt signaling pathway: a diverse set of connections. Bioessays 30:448–456
3. Morata G., Lawrence P.A. (1977) The development of wingless, a homeotic mutation of Drosophila. Dev Biol 56:227–240
4. Rijsewijk F., Schuermann M., Wagenaar E., Parren P., Weigel D., Nusse R. (1987) The Drosophila homolog of the mouse mammary oncogene int-1 is identical to the segment polarity gene wingless. Cell 50:649–657
5. Sharma R.P. (1973) Wingless a new mutant in Drosophila melanogaster. Drosophila information service 134
6. Nusse R., Varmus H.E. (1982) Many tumors induced by the mouse mammary tumor virus contain a provirus integrated in the same region of the host genome. Cell 31:99–109
7. Nusse R., Varmus H.E. (1992) Wnt genes. Cell 69:1073–1087
8. Komiya Y., Habas R. (2008) Wnt signal transduction pathways. Organogenesis 4:68–75
9. Segalen M., Bellaiche Y. (2009) Cell division orientation and planar cell polarity pathways. Semin Cell Dev Biol 20:972–977
10. Slusarski D.C., Pelegri F. (2007) Calcium signaling in vertebrate embryonic patterning and morphogenesis. Dev Biol 307:1–13
11. Logan C.Y., Nusse R. (2004) The Wnt signaling pathway in development and disease. Annu Rev Cell Dev Biol 20:781–810
12. Welters H.J., Kulkarni R.N. (2008) Wnt signaling: relevance to beta-cell biology and diabetes. Trends Endocrinol Metab 19:349–355
13. Christian J.L. (2000) BMP, Wnt and Hedgehog signals: how far can they go? Curr Opin Cell Biol 12:244–249
14. Schulte G., Bryja V. (2007) The Frizzled family of unconventional G-protein-coupled receptors. Trends Pharmacol Sci 28:518–525
15. Bhanot P., Brink M., Samos C.H., Hsieh J.C., Wang Y., Macke J.P., Andrew D., Nathans J., Nusse R. (1996) A new member of the frizzled family from Drosophila functions as a Wingless receptor. Nature 382:225–230
16. Takada R., Satomi Y., Kurata T., Ueno N., Norioka S., Kondoh H., Takao T., Takada S. (2006) Monounsaturated fatty acid modification of Wnt protein: its role in Wnt secretion. Dev Cell 11:791–801
17. Kurayoshi M., Yamamoto H., Izumi S., Kikuchi A. (2007) Post-translational palmitoylation and glycosylation of Wnt-5a are necessary for its signalling. Biochem J 402:515–523
18. Gao C., Chen Y.G. (2010) Dishevelled: The hub of Wnt signaling. Cell Signal 22:717–727
19. Klingensmith J., Nusse R., Perrimon N. (1994) The Drosophila segment polarity gene dishevelled encodes a novel protein required for response to the wingless signal. Genes Dev 8:118–130
20. Kramps T., Peter O., Brunner E., Nellen D., Froesch B., Chatterjee S., Murone M., Zullig S., Basler K. (2002) Wnt/wingless signaling requires BCL9/legless-mediated recruitment of pygopus to the nuclear beta-catenin-TCF complex. Cell 109:47–60
21. van Tienen L.M., Mieszczanek J., Fiedler M., Rutherford T.J., Bienz M. (2017) Constitutive scaffolding of multiple Wnt enhanceosome components by Legless/BCL9. eLife 6
22. Archbold H.C., Yang Y.X., Chen L., Cadigan K.M. (2012) How do they do Wnt they do?: Regulation of transcription by the Wnt/beta-catenin pathway. Acta Physiol (Oxf) 204:74–109
23. Holstein T.W. (2012) The evolution of the Wnt pathway. Cold Spring Harb Perspect Biol 4:a007922
24. Burroughs A.M., Aravind L. (2020) Identification of Uncharacterized Components of Prokaryotic Immune Systems and Their Diverse Eukaryotic Reformulations. J Bacteriol 202
25. Aravind L., Iyer L.M., Burroughs A.M. (2022) Discovering Biological Conflict Systems Through Genome Analysis: Evolutionary Principles and Biochemical Novelty. Annu Rev Biomed Data Sci 5:367–391
26. Kuge O., Saito K., Nishijima M. (1997) Cloning of a Chinese hamster ovary (CHO) cDNA encoding phosphatidylserine synthase (PSS) II, overexpression of which suppresses the phosphatidylserine biosynthetic defect of a PSS I-lacking mutant of CHO-K1 cells. J Biol Chem 272:19133–19139
27. Kuge O., Nishijima M., Akamatsu Y. (1991) A Chinese hamster cDNA encoding a protein essential for phosphatidylserine synthase I activity. J Biol Chem 266:24184–24189
28. Stone S.J., Vance J.E. (1999) Cloning and expression of murine liver phosphatidylserine synthase (PSS)-2: differential regulation of phospholipid metabolism by PSS1 and PSS2. Biochem J 342:57–64
29. Whitney J.C., Peterson S.B., Kim J., Pazos M., Verster A.J., Radey M.C., Kulasekara H.D., Ching M.Q., Bullen N.P., Bryant D., Goo Y.A., Surette M.G., Borenstein E., Vollmer W., Mougous J.D. (2017) A broadly distributed toxin family mediates contact-dependent antagonism between gram-positive bacteria. eLife 6
30. Sack G.H. (2018) Serum amyloid A - a review. Mol Med 24:46
31. Arthur M., Depardieu F., Molinas C., Reynolds P., Courvalin P. (1995) The vanZ gene of Tn1546 from Enterococcus faecium BM4147 confers resistance to teicoplanin. Gene 154:87–92
32. Arthur M., Depardieu F., Reynolds P., Courvalin P. (1999) Moderate-level resistance to glycopeptide LY333328 mediated by genes of the vanA and vanB clusters in enterococci. Antimicrob Agents Chemother 43:1875–1880
33. Bazan J.F., Janda C.Y., Garcia K.C. (2012) Structural architecture and functional evolution of Wnts. Dev Cell 23:227–232
34. Janda C.Y., Waghray D., Levin A.M., Thomas C., Garcia K.C. (2012) Structural basis of Wnt recognition by Frizzled. Science 337:59–64
35. Chu M.L., Ahn V.E., Choi H.J., Daniels D.L., Nusse R., Weis W.I. (2013) Structural Studies of Wnts and identification of an LRP6 binding site. Structure 21:1235–1242
36. Zhong Q., Zhao Y., Ye F., Xiao Z., Huang G., Xu M., Zhang Y., Zhan X., Sun K., Wang Z., Cheng S., Feng S., Zhao X., Zhang J., Lu P., Xu W., Zhou Q., Ma D. (2021) Cryo-EM structure of human Wntless in complex with Wnt3a. Nat Commun 12:4541
37. Traag V.A., Waltman L., van Eck N.J. (2019) From Louvain to Leiden: guaranteeing well-connected communities. Sci Rep 9:5233
38. Zhao G., London E. (2006) An amino acid “transmembrane tendency” scale that approaches the theoretical limit to accuracy for prediction of transmembrane helices: relationship to biological hydrophobicity. Protein Sci 15:1987–2001
39. Jumper J., Evans R., Pritzel A., Green T., Figurnov M., Ronneberger O., Tunyasuvunakool K., Bates R., Zidek A., Potapenko A., Bridgland A., Meyer C., Kohl S.A.A., Ballard A.J., Cowie A., Romera-Paredes B., Nikolov S., Jain R., Adler J., Back T., Petersen S., Reiman D., Clancy E., Zielinski M., Steinegger M., Pacholska M., Berghammer T., Bodenstein S., Silver D., Vinyals O., Senior A.W., Kavukcuoglu K., Kohli P., Hassabis D. (2021) Highly accurate protein structure prediction with AlphaFold. Nature 596:583–589
40. Bastard K., Smith A.A., Vergne-Vaxelaire C., Perret A., Zaparucha A., De Melo-Minardi R., Mariage A., Boutard M., Debard A., Lechaplais C., Pelle C., Pellouin V., Perchat N., Petit J.L., Kreimeyer A., Medigue C., Weissenbach J., Artiguenave F., De Berardinis V., Vallenet D., Salanoubat M. (2014) Revealing the hidden functional diversity of an enzyme family. Nat Chem Biol 10:42–49
41. Glasner M.E., Gerlt J.A., Babbitt P.C. (2006) Evolution of enzyme superfamilies. Curr Opin Chem Biol 10:492–497
42. Zhang D., Iyer L.M., Burroughs A.M., Aravind L. (2014) Resilience of biochemical activity in protein domains in the face of structural divergence. Curr Opin Struct Biol 26:92–103
43. Tomohiro S., Kawaguti A., Kawabe Y., Kitada S., Kuge O. (2009) Purification and characterization of human phosphatidylserine synthases 1 and 2. Biochem J 418:421–429
44. Saito K., Kuge O., Akamatsu Y., Nishijima M. (1996) Immunochemical identification of the pssA gene product as phosphatidylserine synthase I of Chinese hamster ovary cells. FEBS Lett 395:262–266
45. Stone S.J., Vance J.E. (2000) Phosphatidylserine synthase-1 and -2 are localized to mitochondria-associated membranes. J Biol Chem 275:34534–34540
46. Miyata N., Kuge O. (2021) Topology of phosphatidylserine synthase 1 in the endoplasmic reticulum membrane. Protein Sci 30:2346–2353
47. Anderson J.S., Matsuhashi M., Haskin M.A., Strominger J.L. (1967) Biosynthesis of the peptidoglycan of bacterial cell walls. II. Phospholipid carriers in the reaction sequence. J Biol Chem 242:3180–3190
48. Higashi Y., Strominger J.L., Sweeley C.C. (1967) Structure of a lipid intermediate in cell wall peptidoglycan synthesis: a derivative of a C55 isoprenoid alcohol. Proc Natl Acad Sci U S A 57:1878–1884
49. English D. (1996) Phosphatidic acid: a lipid messenger involved in intracellular and extracellular signalling. Cell Signal 8:341–347
50. Zamocky M., Ferianc P. (2023) Discovering the deep evolutionary roots of serum amyloid A protein family. Int J Biol Macromol 252:126537
51. Woods E.C., Wetzel D., Mukerjee M., McBride S.M. (2018) Examination of the Clostridioides (Clostridium) difficile VanZ ortholog, CD1240. Anaerobe 53:108–115
52. Sur V.P., Mazumdar A., Vimberg V., Stefani T., Androvic L., Kracikova L., Laga R., Kamenik Z., Komrskova K. (2021) Specific Inhibition of VanZ-Mediated Resistance to Lipoglycopeptide Antibiotics. Int J Mol Sci 23
53. Stogios P.J., Savchenko A. (2020) Molecular mechanisms of vancomycin resistance. Protein Sci 29:654–669
54. Dziarski R., Gupta D. (2006) The peptidoglycan recognition proteins (PGRPs). Genome Biol 7:232
55. Bliven S., Prlic A. (2012) Circular permutation in proteins. PLoS Comput Biol 8:e1002445
56. Aravind L. (2000) Guilt by association: contextual information in genome analysis. Genome Res 10:1074–1077
57. Vance J.E. (2018) Historical perspective: phosphatidylserine and phosphatidylethanolamine from the 1800s to the present. J Lipid Res 59:923–944
58. Koga Y., Morii H. (2005) Recent advances in structural research on ether lipids from archaea including comparative and physiological aspects. Biosci Biotechnol Biochem 69:2019–2034
59. Caforio A., Driessen A.J.M. (2017) Archaeal phospholipids: Structural properties and biosynthesis. Biochim Biophys Acta Mol Cell Biol Lipids 1862:1325–1339
60. Rezanka T., Kyselova L., Murphy D.J. (2023) Archaeal lipids. Prog Lipid Res 91:101237
61. Narita S., Tokuda H. (2010) Sorting of bacterial lipoproteins to the outer membrane by the Lol system. Methods Mol Biol 619:117–129
62. Kamada K., Miyata M., Hirano T. (2013) Molecular basis of SMC ATPase activation: role of internal structural changes of the regulatory subcomplex ScpAB. Structure 21:581–594
63. Schleiffer A., Kaitna S., Maurer-Stroh S., Glotzer M., Nasmyth K., Eisenhaber F. (2003) Kleisins: a superfamily of bacterial and eukaryotic SMC protein partners. Mol Cell 11:571–575
64. Soppa J., Kobayashi K., Noirot-Gros M.F., Oesterhelt D., Ehrlich S.D., Dervyn E., Ogasawara N., Moriya S. (2002) Discovery of two novel families of proteins that are proposed to interact with prokaryotic SMC proteins, and characterization of the Bacillus subtilis family members ScpA and ScpB. Mol Microbiol 45:59–71
65. Aravind L., Anantharaman V., Balaji S., Babu M.M., Iyer L.M. (2005) The many faces of the helix-turn-helix domain: transcription regulation and beyond. FEMS Microbiol Rev 29:231–262
66. Pellicer M.T., Badia J., Aguilar J., Baldoma L. (1996) glc locus of Escherichia coli: characterization of genes encoding the subunits of glycolate oxidase and the glc regulator protein. J Bacteriol 178:2051–2059
67. Ramachandran R., Hartmann C., Song H.K., Huber R., Bochtler M. (2002) Functional interactions of HslV (ClpQ) with the ATPase HslU (ClpY). Proc Natl Acad Sci U S A 99:7396–7401
68. Storck E.M., Ozbalci C., Eggert U.S. (2018) Lipid Cell Biology: A Focus on Lipids in Cell Division. Annu Rev Biochem 87:839–869
69. Barak I., Muchova K. (2013) The role of lipid domains in bacterial cell processes. Int J Mol Sci 14:4050–4065
70. Hauck A.K., Bernlohr D.A. (2016) Oxidative stress and lipotoxicity. J Lipid Res 57:1976–1986
71. Guo D., Tropp B.E. (2000) A second Escherichia coli protein with CL synthase activity. Biochim Biophys Acta 1483:263–274
72. Tan B.K., Bogdanov M., Zhao J., Dowhan W., Raetz C.R., Guan Z. (2012) Discovery of a cardiolipin synthase utilizing phosphatidylethanolamine and phosphatidylglycerol as substrates. Proc Natl Acad Sci U S A 109:16504–16509
73. Aravind L., Koonin E.V. (1998) Phosphoesterase domains associated with DNA polymerases of diverse origins. Nucleic Acids Res 26:3746–3752
74. Burroughs A.M., Allen K.N., Dunaway-Mariano D., Aravind L. (2006) Evolutionary genomics of the HAD superfamily: understanding the structural adaptations and catalytic diversity in a superfamily of phosphoesterases and allied enzymes. J Mol Biol 361:1003–1034
75. Vershinin A. (1999) Biological functions of carotenoids--diversity and evolution. Biofactors 10:99–104
76. Šesták Z., Britton G., Liaaen-Jensen S., Pfander H. (2004) Carotenoids. Handbook. Photosynthetica 42:186
77. Sandmann G., Misawa N. (1992) New functional assignment of the carotenogenic genes crtB and crtE with constructs of these genes from Erwinia species. FEMS Microbiol Lett 69:253–257
78. Chamovitz D., Misawa N., Sandmann G., Hirschberg J. (1992) Molecular cloning and expression in Escherichia coli of a cyanobacterial gene coding for phytoene synthase, a carotenoid biosynthesis enzyme. FEBS Lett 296:305–310
79. Liu J., Mushegian A. (2003) Three monophyletic superfamilies account for the majority of the known glycosyltransferases. Protein Sci 12:1418–1431
80. Tran U.C., Clarke C.F. (2007) Endogenous synthesis of coenzyme Q in eukaryotes. Mitochondrion 7:S62–71
81. Massiere F., Badet-Denisot M.A. (1998) The mechanism of glutamine-dependent amidotransferases. Cell Mol Life Sci 54:205–222
82. Liu X., Zeng J., Chen X., Xie W. (2016) Crystal structures of RidA, an important enzyme for the prevention of toxic side products. Sci Rep 6:30494
83. Aravind L., Koonin E.V. (1998) The HD domain defines a new superfamily of metal-dependent phosphohydrolases. Trends Biochem Sci 23:469–472
84. Becker A., Kleickmann A., Kuster H., Keller M., Arnold W., Puhler A. (1993) Analysis of the Rhizobium meliloti genes exoU, exoV, exoW, exoT, and exoI involved in exopolysaccharide biosynthesis and nodule invasion: exoU and exoW probably encode glucosyltransferases. Mol Plant Microbe Interact 6:735–744
85. Ruiz N. (2015) Lipid Flippases for Bacterial Peptidoglycan Biosynthesis. Lipid Insights 8:21–31
86. Ruiz N. (2008) Bioinformatics identification of MurJ (MviN) as the peptidoglycan lipid II flippase in Escherichia coli. Proc Natl Acad Sci U S A 105:15553–15557
87. Heydanek M.G., Neuhaus F.C. (1969) The initial stage in peptidoglycan synthesis. IV. Solubilization of phospho-N-acetylmuramyl-pentapeptide translocase. Biochemistry 8:1474–1481
88. Mengin-Lecreulx D., van Heijenoort J. (1993) Identification of the glmU gene encoding N-acetylglucosamine-1-phosphate uridyltransferase in Escherichia coli. J Bacteriol 175:6150–6157
89. Mengin-Lecreulx D., van Heijenoort J. (1996) Characterization of the essential gene glmM encoding phosphoglucosamine mutase in Escherichia coli. J Biol Chem 271:32–39
90. van Heijenoort J. (2007) Lipid intermediates in the biosynthesis of bacterial peptidoglycan. Microbiol Mol Biol Rev 71:620–635
91. Ashraf K.U., Nygaard R., Vickery O.N., Erramilli S.K., Herrera C.M., McConville T.H., Petrou V.I., Giacometti S.I., Dufrisne M.B., Nosol K., Zinkle A.P., Graham C.L.B., Loukeris M., Kloss B., Skorupinska-Tudek K., Swiezewska E., Roper D.I., Clarke O.B., Uhlemann A.C., Kossiakoff A.A., Trent M.S., Stansfeld P.J., Mancia F. (2022) Structural basis of lipopolysaccharide maturation by the O-antigen ligase. Nature 604:371–376
92. Franco A.V., Liu D., Reeves P.R. (1998) The wzz (cld) protein in Escherichia coli: amino acid sequence variation determines O-antigen chain length specificity. J Bacteriol 180:2670–2675
93. Rai A.K., Carr J.F., Bautista D.E., Wang W., Mitchell A.M. (2021) ElyC and Cyclic Enterobacterial Common Antigen Regulate Synthesis of Phosphoglyceride-Linked Enterobacterial Common Antigen. mBio 12:e0284621
94. Kumar A.S., Mody K., Jha B. (2007) Bacterial exopolysaccharides--a perception. J Basic Microbiol 47:103–117
95. van Heijenoort J. (2001) Formation of the glycan chains in the synthesis of bacterial peptidoglycan. Glycobiology 11:25–36
96. Hong Y., Hu D., Verderosa A.D., Qin J., Totsika M., Reeves P.R. (2023) Repeat-Unit Elongations To Produce Bacterial Complex Long Polysaccharide Chains, an O-Antigen Perspective. EcoSal Plus 11:eesp00202022
97. Rai A.K., Mitchell A.M. (2020) Enterobacterial Common Antigen: Synthesis and Function of an Enigmatic Molecule. mBio 11
98. Sham L.T., Butler E.K., Lebar M.D., Kahne D., Bernhardt T.G., Ruiz N. (2014) Bacterial cell wall. MurJ is the flippase of lipid-linked precursors for peptidoglycan biogenesis. Science 345:220–222
99. Kim S., Pires M.M., Im W. (2018) Insight into Elongation Stages of Peptidoglycan Processing in Bacterial Cytoplasmic Membranes. Sci Rep 8:17704
100. Di Guilmi A.M., Dessen A., Dideberg O., Vernet T. (2003) The glycosyltransferase domain of penicillin-binding protein 2a from Streptococcus pneumoniae catalyzes the polymerization of murein glycan chains. J Bacteriol 185:4418–4423
101. Weckener M., Woodward L.S., Clarke B.R., Liu H., Ward P.N., Le Bas A., Bhella D., Whitfield C., Naismith J.H. (2023) The lipid linked oligosaccharide polymerase Wzy and its regulating co-polymerase, Wzz, from enterobacterial common antigen biosynthesis form a complex. Open Biol 13:220373
102. Han W., Wu B., Li L., Zhao G., Woodward R., Pettit N., Cai L., Thon V., Wang P.G. (2012) Defining function of lipopolysaccharide O-antigen ligase WaaL using chemoenzymatically synthesized substrates. J Biol Chem 287:5357–5365
103. Imperiali B. (2019) Bacterial carbohydrate diversity - a Brave New World. Curr Opin Chem Biol 53:1–8
104. Mostowy R.J., Holt K.E. (2018) Diversity-Generating Machines: Genetics of Bacterial Sugar-Coating. Trends Microbiol 26:1008–1021
105. Saiki K., Konishi K. (2007) Identification of a Porphyromonas gingivalis novel protein sov required for the secretion of gingipains. Microbiol Immunol 51:483–491
106. Pares S., Cohen-Addad C., Sieker L., Neuburger M., Douce R. (1994) X-ray structure determination at 2.6-A resolution of a lipoate-containing protein: the H-protein of the glycine decarboxylase complex from pea leaves. Proc Natl Acad Sci U S A 91:4850–4853
107. Shultis D.D., Purdy M.D., Banchs C.N., Wiener M.C. (2006) Outer membrane active transport: structure of the BtuB:TonB complex. Science 312:1396–1399
108. Rahn A., Beis K., Naismith J.H., Whitfield C. (2003) A novel outer membrane protein, Wzi, is involved in surface assembly of the Escherichia coli K30 group 1 capsule. J Bacteriol 185:5882–5890
109. Qi Y., Grishin N.V. (2005) Structural classification of thioredoxin-like fold proteins. Proteins 58:376–388
110. Suplatov D.A., Besenmatter W., Svedas V.K., Svendsen A. (2012) Bioinformatic analysis of alpha/beta-hydrolase fold enzymes reveals subfamily-specific positions responsible for discrimination of amidase and lipase activities. Protein Eng Des Sel 25:689–697
111. Palomeque-Messia P., Englebert S., Leyh-Bouille M., Nguyen-Disteche M., Duez C., Houba S., Dideberg O., Van Beeumen J., Ghuysen J.M. (1991) Amino acid sequence of the penicillin-binding protein/DD-peptidase of Streptomyces K15. Predicted secondary structures of the low Mr penicillin-binding proteins of class A. Biochem J 279:223–230
112. Aravind L. (1999) An evolutionary classification of the metallo-beta-lactamase fold proteins. In Silico Biol 1:69–91
113. Novinec M., Lenarcic B. (2013) Papain-like peptidases: structure, function, and evolution. Biomol Concepts 4:287–308
114. Razew A., Schwarz J.N., Mitkowski P., Sabala I., Kaus-Drobek M. (2022) One fold, many functions-M23 family of peptidoglycan hydrolases. Front Microbiol 13:1036964
115. Mans B.J., Anantharaman V., Aravind L., Koonin E.V. (2004) Comparative genomics, evolution and origins of the nuclear envelope and nuclear pore complex. Cell Cycle 3:1612–1637
116. Wimley W.C. (2003) The versatile beta-barrel membrane protein. Curr Opin Struct Biol 13:404–411
117. Fairman J.W., Noinaj N., Buchanan S.K. (2011) The structural biology of beta-barrel membrane proteins: a summary of recent reports. Curr Opin Struct Biol 21:523–531
118. Tokuda H., Matsuyama S. (2004) Sorting of lipoproteins to the outer membrane in E. coli. Biochim Biophys Acta 1693:5–13
119. Sanchez-Pulido L., Devos D., Genevrois S., Vicente M., Valencia A. (2003) POTRA: a conserved domain in the FtsQ family and a class of beta-barrel outer membrane proteins. Trends Biochem Sci 28:523–526
120. Kim S., Malinverni J.C., Sliz P., Silhavy T.J., Harrison S.C., Kahne D. (2007) Structure and function of an essential component of the outer membrane protein assembly machine. Science 317:961–964
121. Oke M., Sarra R., Ghirlando R., Farnaud S., Gorringe A.R., Evans R.W., Buchanan S.K. (2004) The plug domain of a neisserial TonB-dependent transporter retains structural integrity in the absence of its transmembrane beta-barrel. FEBS Lett 564:294–300
122. Carr S., Penfold C.N., Bamford V., James R., Hemmings A.M. (2000) The structure of TolB, an essential component of the tol-dependent translocation system, and its protein-protein interaction with the translocation domain of colicin E9. Structure 8:57–66
123. Ghosh M., Tucker D.E., Burchett S.A., Leslie C.C. (2006) Properties of the Group IV phospholipase A2 family. Prog Lipid Res 45:487–510
124. Stukey J., Carman G.M. (1997) Identification of a novel phosphatase sequence motif. Protein Sci 6:469–472
125. Whisstock J.C., Romero S., Gurung R., Nandurkar H., Ooms L.M., Bottomley S.P., Mitchell C.A. (2000) The inositol polyphosphate 5-phosphatases and the apurinic/apyrimidinic base excision repair endonucleases share a common mechanism for catalysis. J Biol Chem 275:37055–37061
126. Burroughs A.M., Aravind L. (2023) New biochemistry in the Rhodanese-phosphatase superfamily: emerging roles in diverse metabolic processes, nucleic acid modifications, and biological conflicts. NAR Genom Bioinform 5:lqad029
127. Holtje J.V., Mirelman D., Sharon N., Schwarz U. (1975) Novel type of murein transglycosylase in Escherichia coli. J Bacteriol 124:1067–1076
128. Yunck R., Cho H., Bernhardt T.G. (2016) Identification of MltG as a potential terminase for peptidoglycan polymerization in bacteria. Mol Microbiol 99:700–718
129. Einsle O., Messerschmidt A., Stach P., Bourenkov G.P., Bartunik H.D., Huber R., Kroneck P.M. (1999) Structure of cytochrome c nitrite reductase. Nature 400:476–480
130. Levin S., Almo S.C., Satir B.H. (1999) Functional diversity of the phosphoglucomutase superfamily: structural implications. Protein Eng 12:737–746
131. Dong X., Kato-Murayama M., Muramatsu T., Mori H., Shirouzu M., Bessho Y., Yokoyama S. (2007) The crystal structure of leucyl/phenylalanyl-tRNA-protein transferase from Escherichia coli. Protein Sci 16:528–534
132. Antia M., Hoare D.S., Work E. (1957) The stereoisomers of alpha epsilon-diaminopimelic acid. III. Properties and distribution of diaminopimelic acid racemase, an enzyme causing interconversion of the LL and meso isomers. Biochem J 65:448–459
133. Koraimann G. (2003) Lytic transglycosylases in macromolecular transport systems of Gram-negative bacteria. Cell Mol Life Sci 60:2371–2388
134. Muley V.Y., Akhter Y., Galande S. (2019) PDZ Domains Across the Microbial World: Molecular Link to the Proteases, Stress Response, and Protein Synthesis. Genome Biol Evol 11:644–659
135. Hara H., Yamamoto Y., Higashitani A., Suzuki H., Nishimura Y. (1991) Cloning, mapping, and characterization of the Escherichia coli prc gene, which is involved in C-terminal processing of penicillin-binding protein 3. J Bacteriol 173:4799–4813
136. Wong L.H., Levine T.P. (2017) Tubular lipid binding proteins (TULIPs) growing everywhere. Biochim Biophys Acta Mol Cell Res 1864:1439–1449
137. Levine T.P. (2019) Remote homology searches identify bacterial homologues of eukaryotic lipid transfer proteins, including Chorein-N domains in TamB and AsmA and Mdm31p. BMC Mol Cell Biol 20:43
138. Dhanaraj V., Ye Q.Z., Johnson L.L., Hupe D.J., Ortwine D.F., Dunbar J.B., Rubin J.R., Pavlovsky A., Humblet C., Blundell T.L. (1996) X-ray structure of a hydroxamate inhibitor complex of stromelysin catalytic domain and its comparison with members of the zinc metalloproteinase superfamily. Structure 4:375–386
139. Lo Y.C., Lin S.C., Shaw J.F., Liaw Y.C. (2003) Crystal structure of Escherichia coli thioesterase I/protease I/lysophospholipase L1: consensus sequence blocks constitute the catalytic center of SGNH-hydrolases through a conserved hydrogen bond network. J Mol Biol 330:539–551
140. Hofmann K. (2000) A superfamily of membrane-bound O-acyltransferases with implications for wnt signaling. Trends Biochem Sci 25:111–112
141. Rahlwes K.C., Ha S.A., Motooka D., Mayfield J.A., Baumoel L.R., Strickland J.N., Torres-Ocampo A.P., Nakamura S., Morita Y.S. (2017) The cell envelope-associated phospholipid-binding protein LmeA is required for mannan polymerization in mycobacteria. J Biol Chem 292:17407–17417
142. Yeow J., Chng S.S. (2022) Of zones, bridges and chaperones - phospholipid transport in bacterial outer membrane assembly and homeostasis. Microbiology (Reading) 168
143. Kadirvelraj R., Foley B.L., Dyekjaer J.D., Woods R.J. (2008) Involvement of water in carbohydrate-protein binding: concanavalin A revisited. J Am Chem Soc 130:16933–16942
144. Flint J., Nurizzo D., Harding S.E., Longman E., Davies G.J., Gilbert H.J., Bolam D.N. (2004) Ligand-mediated dimerization of a carbohydrate-binding molecule reveals a novel mechanism for protein-carbohydrate recognition. J Mol Biol 337:417–426
145. Murzin A.G. (1993) OB(oligonucleotide/oligosaccharide binding)-fold: common structural and functional solution for non-homologous sequences. EMBO J 12:861–867
146. Williams A.F., Barclay A.N. (1988) The immunoglobulin superfamily--domains for cell surface recognition. Annu Rev Immunol 6:381–405
147. Chen C.K., Chan N.L., Wang A.H. (2011) The many blades of the beta-propeller proteins: conserved but versatile. Trends Biochem Sci 36:553–561
148. Babu M.M., Priya M.L., Selvan A.T., Madera M., Gough J., Aravind L., Sankaran K. (2006) A database of bacterial lipoproteins (DOLOP) with functional assignments to predicted lipoproteins. J Bacteriol 188:2761–2773
149. Sankaran K., Wu H.C. (1994) Lipid modification of bacterial prolipoprotein. Transfer of diacylglyceryl moiety from phosphatidylglycerol. J Biol Chem 269:19701–19706
150. Tjalsma H., Zanen G., Venema G., Bron S., van Dijl J.M. (1999) The potential active site of the lipoprotein-specific (type II) signal peptidase of Bacillus subtilis. J Biol Chem 274:28191–28197
151. Notenboom V., Boraston A.B., Kilburn D.G., Rose D.R. (2001) Crystal structures of the family 9 carbohydrate-binding module from Thermotoga maritima xylanase 10A in native and ligand-bound forms. Biochemistry 40:6248–6256
152. Rigden D.J. (2005) Analysis of glycoside hydrolase family 98: catalytic machinery, mechanism and a novel putative carbohydrate binding module. FEBS Lett 579:5466–5472
153. Boraston A.B., Bolam D.N., Gilbert H.J., Davies G.J. (2004) Carbohydrate-binding modules: fine-tuning polysaccharide recognition. Biochem J 382:769–781
154. Iyer L.M., Koonin E.V., Aravind L. (2001) Adaptations of the helix-grip fold for ligand binding and catalysis in the START domain superfamily. Proteins 43:134–144
155. Santelli E., Liddington R.C., Mohan M.A., Hoch J.A., Szurmant H. (2007) The crystal structure of Bacillus subtilis YycI reveals a common fold for two members of an unusual class of sensor histidine kinase regulatory proteins. J Bacteriol 189:3290–3295
156. Das K., Xiao R., Wahlberg E., Hsu F., Arrowsmith C.H., Montelione G.T., Arnold E. (2001) X-ray crystal structure of MTH938 from Methanobacterium thermoautotrophicum at 2.2 A resolution reveals a novel tertiary protein fold. Proteins 45:486–488
157. Bennett H.J., Davenport J.B., Collins R.F., Trafford A.W., Pinali C., Kitmitto A. (2013) Human junctophilin-2 undergoes a structural rearrangement upon binding PtdIns(3,4,5)P3 and the S101R mutation identified in hypertrophic cardiomyopathy obviates this response. Biochem J 456:205–217
158. Cortajarena A.L., Regan L. (2006) Ligand binding by TPR domains. Protein Sci 15:1193–1198
159. Seed K.D., Faruque S.M., Mekalanos J.J., Calderwood S.B., Qadri F., Camilli A. (2012) Phase variable O antigen biosynthetic genes control expression of the major protective antigen and bacteriophage receptor in Vibrio cholerae O1. PLoS Pathog 8:e1002917
160.
1. Cai R.
2. Wang G.
3. Le S.
4. Wu M.
5. Cheng M.
6. Guo Z.
7. Ji Y.
8. Xi H.
9. Zhao C.
10. Wang X.
11. Xue Y.
12. Wang Z.
13. Zhang H.
14. Fu Y.
15. Sun C.
16. Feng X.
17. Lei L.
18. Yang Y.
19. Ur Rahman S.
20. Liu X.
21. Han W.
22. Gu J
2019Three Capsular Polysaccharide Synthesis-Related Glucosyltransferases, GT-1, GT-2 and WcaJ, Are Associated With Virulence and Phage Sensitivity of Klebsiella pneumoniaeFront Microbiol 10:1189Google Scholar
161.
1. Yau K.W
1994Cyclic nucleotide-gated channels: an expanding new family of ion channelsProc Natl Acad Sci U S A 91:3481–3483Google Scholar
162.
1. Durocher D.
2. Henckel J.
3. Fersht A.R.
4. Jackson S.P
1999The FHA domain is a modular phosphopeptide recognition motifMol Cell 4:387–394Google Scholar
163.
1. Schreiter E.R.
2. Drennan C.L
2007Ribbon-helix-helix transcription factors: variations on a themeNat Rev Microbiol 5:710–720Google Scholar
164.
1. Ravi J.
2. Anantharaman V.
3. Chen S.Z.
4. Brenner E.P.
5. Datta P.
6. Aravind L.
7. Gennaro M.L
2024The phage shock protein (PSP) envelope stress response: discovery of novel partners and evolutionary historymSystems 9:e0084723Google Scholar
165.
1. AhYoung A.P.
2. Koehl A.
3. Vizcarra C.L.
4. Cascio D.
5. Egea P.F
2016Structure of a putative ClpS N-end rule adaptor protein from the malaria pathogen Plasmodium falciparumProtein Sci 25:689–701Google Scholar
166.
1. Durr I.F.
2. Rudney H
1960The reduction of beta-hydroxy-beta-methyl-glutaryl coenzyme A to mevalonic acidJ Biol Chem 235:2572–2578Google Scholar
167.
1. Tchen T.T
1958Mevalonic kinase: purification and propertiesJ Biol Chem 233:1100–1103Google Scholar
168.
1. Wolff M
2. Seemann M
3. Tse Sum Bui B
4. Frapart Y
5. Tritsch D
6. Garcia Estrabot A
7. Rodríguez-Concepción M
8. Boronat A
9. Marquet A
10. Rohmer M
2003Isoprenoid biosynthesis via the methylerythritol phosphate pathway: the (E)-4-hydroxy-3-methylbut-2-enyl diphosphate reductase (LytB/IspH) from Escherichia coli is a [4Fe-4S] proteinFEBS Lett 541:115–120Google Scholar
169.
1. Guo R.T.
2. Ko T.P.
3. Chen A.P.
4. Kuo C.J.
5. Wang A.H.
6. Liang P.H
2005Crystal structures of undecaprenyl pyrophosphate synthase in complex with magnesium, isopentenyl pyrophosphate, and farnesyl thiopyrophosphate: roles of the metal ion and conserved residues in catalysisJ Biol Chem 280:20762–20774Google Scholar
170.
1. Burroughs A.M.
2. Aravind L
2016RNA damage in biological conflicts and the diversity of responding RNA repair systemsNucleic Acids Res 44:8525–8555Google Scholar
171.
1. Ueta M.
2. Ohniwa R.L.
3. Yoshida H.
4. Maki Y.
5. Wada C.
6. Wada A
2008Role of HPF (hibernation promoting factor) in translational activity in Escherichia coliJ Biochem 143:425–433Google Scholar
172.
1. Akanuma G.
2. Kazo Y.
3. Tagami K.
4. Hiraoka H.
5. Yano K.
6. Suzuki S.
7. Hanai R.
8. Nanamiya H.
9. Kato-Yamada Y.
10. Kawamura F
2016Ribosome dimerization is essential for the efficient regrowth of Bacillus subtilisMicrobiology (Reading) 162:448–458Google Scholar
173.
1. Kuk A.C.Y.
2. Hao A.
3. Lee S.Y
2022Structure and Mechanism of the Lipid Flippase MurJAnnu Rev Biochem 91:705–729Google Scholar
174.
1. Zhang D.
2. de Souza R.F.
3. Anantharaman V.
4. Iyer L.M.
5. Aravind L.
2012Polymorphic toxin systems: Comprehensive characterization of trafficking modes, processing, mechanisms of action, immunity and ecology using comparative genomicsBiol Direct 7:18Google Scholar
175.
1. Iyer L.M.
2. Zhang D.
3. Rogozin I.B.
4. Aravind L
2011Evolution of the deaminase fold and multiple origins of eukaryotic editing and mutagenic nucleic acid deaminases from bacterial toxin systemsNucleic Acids Res 39:9473–9497Google Scholar
176.
1. Ruhe Z.C.
2. Low D.A.
3. Hayes C.S
2020Polymorphic Toxins and Their Immunity Proteins: Diversity, Evolution, and Mechanisms of DeliveryAnnu Rev Microbiol 74:497–520Google Scholar
177.
1. Alcoforado Diniz J.
2. Coulthurst S.J.
2015Intraspecies Competition in Serratia marcescens Is Mediated by Type VI-Secreted Rhs Effectors and a Conserved Effector-Associated Accessory ProteinJ Bacteriol 197:2350–2360Google Scholar
178.
1. Satchell K.J
2011Structure and function of MARTX toxins and other large repetitive RTX proteinsAnnu Rev Microbiol 65:71–90Google Scholar
179.
1. Kordis D.
2. Turk V
2009Phylogenomic analysis of the cystatin superfamily in eukaryotes and prokaryotesBMC Evol Biol 9:266Google Scholar
180.
1. Cherier D.
2. Patin D.
3. Blanot D.
4. Touze T.
5. Barreteau H
2021The Biology of Colicin M and Its OrthologsAntibiotics (Basel) 10Google Scholar
181.
1. Anantharaman V.
2. Aravind L
2003Evolutionary history, structural features and biochemical diversity of the NlpC/P60 superfamily of enzymesGenome Biol 4:R11Google Scholar
182.
1. Arthur M.
2. Molinas C.
3. Courvalin P
1992Sequence of the vanY gene required for production of a vancomycin-inducible D,D-carboxypeptidase in Enterococcus faecium BM4147Gene 120:111–114Google Scholar
183.
1. Turk V.
2. Bode W
1991The cystatins: protein inhibitors of cysteine proteinasesFEBS Lett 285:213–219Google Scholar
184.
1. Sato Y.
2. Yamamoto Y.
3. Kizaki H
1997Cloning and sequence analysis of the gbpC gene encoding a novel glucan-binding protein of Streptococcus mutansInfect Immun 65:668–675Google Scholar
185.
1. MacKenzie D.A.
2. Tailford L.E.
3. Hemmings A.M.
4. Juge N
2009Crystal structure of a mucus-binding protein repeat reveals an unexpected functional immunoglobulin binding activityJ Biol Chem 284:32444–32453Google Scholar
186.
1. Chagnot C.
2. Zorgani M.A.
3. Astruc T.
4. Desvaux M
2013Proteinaceous determinants of surface colonization in bacteria: bacterial adhesion and biofilm formation from a protein secretion perspectiveFront Microbiol 4:303Google Scholar
187.
1. Wu H.Y.
2. Liu M.S.
3. Lin T.P.
4. Cheng Y.S
2011Structural and functional assays of AtTLP18.3 identify its novel acid phosphatase activity in thylakoid lumenPlant Physiol 157:1015–1025Google Scholar
188.
1. Cao Z.
2. Casabona M.G.
3. Kneuper H.
4. Chalmers J.D.
5. Palmer T
2016The type VII secretion system of Staphylococcus aureus secretes a nuclease toxin that targets competitor bacteriaNat Microbiol 2:16183Google Scholar
189.
1. Klein T.A.
2. Pazos M.
3. Surette M.G.
4. Vollmer W.
5. Whitney J.C
2018Molecular Basis for Immunity Protein Recognition of a Type VII Secretion System Exported Antibacterial ToxinJ Mol Biol 430:4344–4358Google Scholar
190.
1. Knowles T.J.
2. Browning D.F.
3. Jeeves M.
4. Maderbocus R.
5. Rajesh S.
6. Sridhar P.
7. Manoli E.
8. Emery D.
9. Sommer U.
10. Spencer A.
11. Leyton D.L.
12. Squire D.
13. Chaudhuri R.R.
14. Viant M.R.
15. Cunningham A.F.
16. Henderson I.R.
17. Overduin M
2011Structure and function of BamE within the outer membrane and the beta-barrel assembly machineEMBO Rep 12:123–128Google Scholar
191.
1. Hagan C.L.
2. Kahne D
2011The reconstituted Escherichia coli Bam complex catalyzes multiple rounds of beta-barrel assemblyBiochemistry 50:7444–7446Google Scholar
192.
1. Kitagawa D.
2. Vakonakis I.
3. Olieric N.
4. Hilbert M.
5. Keller D.
6. Olieric V.
7. Bortfeld M.
8. Erat M.C.
9. Fluckiger I.
10. Gonczy P.
11. Steinmetz M.O
2011Structural basis of the 9-fold symmetry of centriolesCell 144:364–375Google Scholar
193.
1. Aravind L.
2. Anantharaman V.
3. Zhang D.
4. de Souza R.F.
5. Iyer L.M.
2012Gene flow and biological conflict systems in the origin and evolution of eukaryotesFront Cell Infect Microbiol 2:89Google Scholar
194.
1. Mutschler H.
2. Gebhardt M.
3. Shoeman R.L.
4. Meinhart A
2011A novel mechanism of programmed cell death in bacteria by toxin-antitoxin systems corrupts peptidoglycan synthesisPLoS Biol 9:e1001033Google Scholar
195.
1. Monzingo A.F.
2. Marcotte E.M.
3. Hart P.J.
4. Robertus J.D
1996Chitinases, chitosanases, and lysozymes can be divided into procaryotic and eucaryotic families sharing a conserved coreNat Struct Biol 3:133–140Google Scholar
196.
1. Dong X.
2. Leksa N.C.
3. Chhabra E.S.
4. Arndt J.W.
5. Lu Q.
6. Knockenhauer K.E.
7. Peters R.T.
8. Springer T.A
2019The von Willebrand factor D’D3 assembly and structural principles for factor VIII binding and concatemer biogenesisBlood 133:1523–1533Google Scholar
197.
1. West A.H.
2. Stock A.M
2001Histidine kinases and response regulator proteins in two-component signaling systemsTrends Biochem Sci 26:369–376Google Scholar
198.
1. Costa T.
2. Isidro A.L.
3. Moran C.P.
4. Henriques A.O
2006Interaction between coat morphogenetic proteins SafA and SpoVIDJ Bacteriol 188:7731–7741Google Scholar
199.
1. Zhang C.
2. Kelkar A.
3. Nasirikenari M.
4. Lau J.T.Y.
5. Sveinsson M.
6. Sharma U.C.
7. Pokharel S.
8. Neelamegham S
2018The physical spacing between the von Willebrand factor D’D3 and A1 domains regulates platelet adhesion in vitro and in vivoJ Thromb Haemost 16:571–582Google Scholar
200.
1. Hermoso J.A.
2. Monterroso B.
3. Albert A.
4. Galan B.
5. Ahrazem O.
6. Garcia P.
7. Martinez-Ripoll M.
8. Garcia J.L.
9. Menendez M
2003Structural basis for selective recognition of pneumococcal cell wall by modular endolysin from phage Cp-1Structure 11:1239–1249Google Scholar
201.
1. Das D.
2. Lee W.S.
3. Grant J.C.
4. Chiu H.J.
5. Farr C.L.
6. Vance J.
7. Klock H.E.
8. Knuth M.W.
9. Miller M.D.
10. Elsliger M.A.
11. Deacon A.M.
12. Godzik A.
13. Lesley S.A.
14. Kornfeld S.
15. Wilson I.A
2013Structure and function of the DUF2233 domain in bacteria and in the human mannose 6-phosphate uncovering enzymeJ Biol Chem 288:16789–16799Google Scholar
202.
1. Dideberg O.
2. Charlier P.
3. Dive G.
4. Joris B.
5. Frere J.M.
6. Ghuysen J.M
1982Structure of a Zn2+-containing D-alanyl-D-alanine-cleaving carboxypeptidase at 2.5 A resolutionNature 299:469–470Google Scholar
203.
1. Bellinzoni M.
2. Haouz A.
3. Miras I.
4. Magnet S.
5. Andre-Leroux G.
6. Mukherjee R.
7. Shepard W.
8. Cole S.T.
9. Alzari P.M
2014Structural studies suggest a peptidoglycan hydrolase function for the Mycobacterium tuberculosis Tat-secreted protein Rv2525cJ Struct Biol 188:156–164Google Scholar
204.
1. Wright G.D.
2. Molinas C.
3. Arthur M.
4. Courvalin P.
5. Walsh C.T
1992Characterization of vanY, a DD-carboxypeptidase from vancomycin-resistant Enterococcus faecium BM4147Antimicrob Agents Chemother 36:1514–1518Google Scholar
205.
1. Arthur M.
2. Depardieu F.
3. Cabanie L.
4. Reynolds P.
5. Courvalin P
1998Requirement of the VanY and VanX D,D-peptidases for glycopeptide resistance in enterococciMol Microbiol 30:819–830Google Scholar
206.
1. Arthur M.
2. Depardieu F.
3. Snaith H.A.
4. Reynolds P.E.
5. Courvalin P
1994Contribution of VanY D,D-carboxypeptidase to glycopeptide resistance in Enterococcus faecalis by hydrolysis of peptidoglycan precursorsAntimicrob Agents Chemother 38:1899–1903Google Scholar
207.
1. Meyer B.
2. Wurm J.P.
3. Sharma S.
4. Immer C.
5. Pogoryelov D.
6. Kotter P.
7. Lafontaine D.L.
8. Wohnert J.
9. Entian K.D
2016Ribosome biogenesis factor Tsr3 is the aminocarboxypropyl transferase responsible for 18S rRNA hypermodification in yeast and humansNucleic Acids Res 44:4304–4316Google Scholar
208.
1. Burroughs A.M.
2. Aravind L
2014Analysis of two domains with novel RNA-processing activities throws light on the complex evolution of ribosomal RNA biogenesisFront Genet 5:424Google Scholar
209.
1. Fraser M.E.
2. Joyce M.A.
3. Ryan D.G.
4. Wolodko W.T
2002Two glutamate residues, Glu 208 alpha and Glu 197 beta, are crucial for phosphorylation and dephosphorylation of the active-site histidine residue in succinyl-CoA synthetaseBiochemistry 41:537–546Google Scholar
210.
1. Iyer L.M.
2. Abhiman S.
3. Maxwell Burroughs A.
4. Aravind L
2009Amidoligases with ATP-grasp, glutamine synthetase-like and acetyltransferase-like domains: synthesis of novel metabolites and peptide modifications of proteinsMol Biosyst 5:1636–1660Google Scholar
211.
1. Raetz C.R.
2. Kennedy E.P
1974Partial purification and properties of phosphatidylserine synthetase from Escherichia coliJ Biol Chem 249:5083–5045Google Scholar
212.
1. Fields R.N.
2. Roy H
2018Deciphering the tRNA-dependent lipid aminoacylation systems in bacteria: Novel components and structural advancesRNA Biol 15:480–491Google Scholar
213.
1. Hancock R.E
1997Peptide antibioticsLancet 349:418–422Google Scholar
214.
1. Benson T.E.
2. Prince D.B.
3. Mutchler V.T.
4. Curry K.A.
5. Ho A.M.
6. Sarver R.W.
7. Hagadorn J.C.
8. Choi G.H.
9. Garlick R.L
2002X-ray crystal structure of Staphylococcus aureus FemAStructure 10:1107–1115Google Scholar
215.
1. Schneider T.
2. Senn M.M.
3. Berger-Bachi B.
4. Tossi A.
5. Sahl H.G.
6. Wiedemann I
2004In vitro assembly of a complete, pentaglycine interpeptide bridge containing cell wall precursor (lipid II-Gly5) of Staphylococcus aureusMol Microbiol 53:675–685Google Scholar
216.
1. Schmid M.
2. Uhlenhaut N.H.
3. Godard F.
4. Demar M.
5. Bressan R.
6. Weigel D.
7. Lohmann J.U
2003Dissection of floral induction pathways using global expression analysisDevelopment 130:6001–6012Google Scholar
217.
1. Berardini T.Z.
2. Reiser L.
3. Li D.
4. Mezheritsky Y.
5. Muller R.
6. Strait E.
7. Huala E
2015The Arabidopsis information resource: Making and mining the “gold standard” annotated reference plant genomeGenesis 53:474–485Google Scholar
218.
1. Savidov N.
2. Gloriozova T.A.
3. Poroikov V.V.
4. Dembitsky V.M
2018Highly oxygenated isoprenoid lipids derived from fungi and fungal endophytes: Origin and biological activitiesSteroids 140:114–124Google Scholar
219.
1. Lee J.O.
2. Rieu P.
3. Arnaout M.A.
4. Liddington R
1995Crystal structure of the A domain from the alpha subunit of integrin CR3 (CD11b/CD18)Cell 80:631–638Google Scholar
220.
1. Zhang Y.
2. van der Fits L.
3. Voerman J.S.
4. Melief M.J.
5. Laman J.D.
6. Wang M.
7. Wang H.
8. Wang M.
9. Li X.
10. Walls C.D.
11. Gupta D.
12. Dziarski R
2005Identification of serum N-acetylmuramoyl-l-alanine amidase as liver peptidoglycan recognition protein 2Biochim Biophys Acta 1752:34–46Google Scholar
221.
1. Lee J.
2. Geddes K.
3. Streutker C.
4. Philpott D.J.
5. Girardin S.E
2012Role of mouse peptidoglycan recognition protein PGLYRP2 in the innate immune response to Salmonella enterica serovar Typhimurium infection in vivoInfect Immun 80:2645–2654Google Scholar
222.
1. Dziarski R.
2. Gupta D
2010Review: Mammalian peptidoglycan recognition proteins (PGRPs) in innate immunityInnate Immun 16:168–174Google Scholar
223.
1. Dziarski R.
2. Gupta D
2006Mammalian PGRPs: novel antibacterial proteinsCell Microbiol 8:1059–1069Google Scholar
224.
1. Jones J.D.
2. Dangl J.L
2006The plant immune systemNature 444:323–329Google Scholar
225.
1. Holzem M.
2. Boutros M.
3. Holstein T.W
2024The origin and evolution of Wnt signallingNat Rev Genet 25:500–512Google Scholar
226.
1. Morrow J.F.
2. Stearman R.S.
3. Peltzman C.G.
4. Potter D.A
1981Induction of hepatic synthesis of serum amyloid A protein and actinProc Natl Acad Sci U S A 78:4718–4722Google Scholar
227.
1. Hari-Dass R.
2. Shah C.
3. Meyer D.J.
4. Raynes J.G
2005Serum amyloid A protein binds to outer membrane protein A of gram-negative bacteriaJ Biol Chem 280:18562–18567Google Scholar
228.
1. Shah C.
2. Hari-Dass R.
3. Raynes J.G
2006Serum amyloid A is an innate immune opsonin for Gram-negative bacteriaBlood 108:1751–1757Google Scholar
229.
1. Sack G.H.
2. Talbot C.C.
3. Seuanez H.
4. O’Brien S.J
1989Molecular analysis of the human serum amyloid A (SAA) gene familyScand J Immunol 29:113–119Google Scholar
230.
1. Uhlar C.M.
2. Burgess C.J.
3. Sharp P.M.
4. Whitehead A.S
1994Evolution of the serum amyloid A (SAA) protein superfamilyGenomics 19:228–235Google Scholar
231.
1. Sun L.
2. Ye R.D
2016Serum amyloid A1: Structure, function and gene polymorphismGene 583:48–57Google Scholar
232.
1. Aoki S.K.
2. Malinverni J.C.
3. Jacoby K.
4. Thomas B.
5. Pamma R.
6. Trinh B.N.
7. Remers S.
8. Webb J.
9. Braaten B.A.
10. Silhavy T.J.
11. Low D.A
2008Contact-dependent growth inhibition requires the essential outer membrane protein BamA (YaeT) as the receptor and the inner membrane transport protein AcrBMol Microbiol 70:323–340Google Scholar
233.
1. Virtanen P.
2. Waneskog M.
3. Koskiniemi S
2019Class II contact-dependent growth inhibition (CDI) systems allow for broad-range cross-species toxin delivery within the Enterobacteriaceae familyMol Microbiol 111:1109–1125Google Scholar
234.
1. Ruhe Z.C.
2. Nguyen J.Y.
3. Xiong J.
4. Koskiniemi S.
5. Beck C.M.
6. Perkins B.R.
7. Low D.A.
8. Hayes C.S
2017CdiA Effectors Use Modular Receptor-Binding Domains To Recognize Target BacteriamBio 8Google Scholar
235.
1. Levin M.
2. Franklin E.C.
3. Frangione B.
4. Pras M
1972The amino acid sequence of a major nonimmunoglobulin component of some amyloid fibrilsJ Clin Invest 51:2773–2776Google Scholar
236.
1. Blanco L.P.
2. Evans M.L.
3. Smith D.R.
4. Badtke M.P.
5. Chapman M.R
2012Diversity, biogenesis and function of microbial amyloidsTrends Microbiol 20:66–73Google Scholar
237.
1. Rouse S.L.
2. Matthews S.J.
3. Dueholm M.S
2018Ecology and Biogenesis of Functional Amyloids in PseudomonasJ Mol Biol 430:3685–3695Google Scholar
238.
1. Tukel C.
2. Wilson R.P.
3. Nishimori J.H.
4. Pezeshki M.
5. Chromy B.A.
6. Baumler A.J
2009Responses to amyloids of microbial and host origin are mediated through toll-like receptor 2Cell Host Microbe 6:45–53Google Scholar
239.
1. Prosswimmer T.
2. Heng A.
3. Daggett V
2024Mechanistic insights into the role of amyloid-beta in innate immunitySci Rep 14:5376Google Scholar
240.
1. Egan A.J.F.
2. Errington J.
3. Vollmer W
2020Regulation of peptidoglycan synthesis and remodellingNat Rev Microbiol 18:446–460Google Scholar
241.
1. Koga Y.
2. Morii H
2007Biosynthesis of ether-type polar lipids in archaea and evolutionary considerationsMicrobiol Mol Biol Rev 71:97–120Google Scholar
242.
1. Daiyasu H.
2. Kuma K.
3. Yokoi T.
4. Morii H.
5. Koga Y.
6. Toh H
2005A study of archaeal enzymes involved in polar lipid synthesis linking amino acid sequence information, genomic contexts and lipid compositionArchaea 1:399–410Google Scholar
243.
1. Carman G.M.
2. Han G.S
2006Roles of phosphatidate phosphatase enzymes in lipid metabolismTrends Biochem Sci 31:694–699Google Scholar
244.
1. Neuwald A.F
1997An unexpected structural relationship between integral membrane phosphatases and soluble haloperoxidasesProtein Sci 6:1764–1767Google Scholar
245.
1. Aravind L.
2. Nicastro G.G.
3. Iyer L.M.
4. Burroughs A.M
2024The Prokaryotic Roots of Eukaryotic Immune SystemsAnnu Rev Genet 58:365–389Google Scholar
246.
1. Krishnan A.
2. Iyer L.M.
3. Holland S.J.
4. Boehm T.
5. Aravind L
2018Diversification of AID/APOBEC-like deaminases in metazoa: multiplicity of clades and widespread roles in immunityProc Natl Acad Sci U S A 115:E3201–E3210Google Scholar
247.
1. Zhang D.
2. Iyer L.M.
3. Aravind L
2011A novel immunity system for bacterial nucleic acid degrading toxins and its recruitment in various eukaryotic and DNA viral systemsNucleic Acids Res 39:4532–4552Google Scholar
248.
1. Kaur G.
2. Burroughs A.M.
3. Iyer L.M.
4. Aravind L
2020Highly regulated, diversifying NTP-dependent biological conflict systems with implications for the emergence of multicellularityeLife 9Google Scholar
249.
1. Kaur G.
2. Iyer L.M.
3. Burroughs A.M.
4. Aravind L
2021Bacterial death and TRADD-N domains help define novel apoptosis and immunity mechanisms shared by prokaryotes and metazoanseLife 10Google Scholar
250.
1. Altschul S.F.
2. Madden T.L.
3. Schaffer A.A.
4. Zhang J.
5. Zhang Z.
6. Miller W.
7. Lipman D.J
1997Gapped BLAST and PSI-BLAST: a new generation of protein database search programsNucleic Acids Res 25:3389–3402Google Scholar
251.
1. Johnson L.S.
2. Eddy S.R.
3. Portugaly E
2010Hidden Markov model speed heuristic and iterative HMM search procedureBMC Bioinformatics 11:431Google Scholar
252.
1. Sayers E.W.
2. Bolton E.E.
3. Brister J.R.
4. Canese K.
5. Chan J.
6. Comeau D.C.
7. Connor R.
8. Funk K.
9. Kelly C.
10. Kim S.
11. Madej T.
12. Marchler-Bauer A.
13. Lanczycki C.
14. Lathrop S.
15. Lu Z.
16. Thibaud-Nissen F.
17. Murphy T.
18. Phan L.
19. Skripchenko Y.
20. Tse T.
21. Wang J.
22. Williams R.
23. Trawick B.W.
24. Pruitt K.D.
25. Sherry S.T
2022Database resources of the national center for biotechnology informationNucleic Acids Res 50:D20–D26Google Scholar
253.
1. Hauser M.
2. Steinegger M.
3. Soding J
2016MMseqs software suite for fast and deep clustering and searching of large protein sequence setsBioinformatics 32:1323–1330Google Scholar
254.
1. Katoh K.
2. Standley D.M
2013MAFFT multiple sequence alignment software version 7: improvements in performance and usabilityMol Biol Evol 30:772–780Google Scholar
255.
1. Steinegger M.
2. Meier M.
3. Mirdita M.
4. Vohringer H.
5. Haunsberger S.J.
6. Soding J
2019HH-suite3 for fast remote homology detection and deep protein annotationBMC Bioinformatics 20:473Google Scholar
256.
1. Csardi G.
2. Nepusz T
2006The Igraph Software Package for Complex Network Research. InterJournalComplex Systems Google Scholar
257.
1. Hagberg A.A.
2. Schult D.A.
3. Swart P.
4. Hagberg J.M
2008Exploring Network Structure, Dynamics, and Function using NetworkXIn: Proceedings of the Python in Science Conference Google Scholar
258.
1. Finn R.D.
2. Coggill P.
3. Eberhardt R.Y.
4. Eddy S.R.
5. Mistry J.
6. Mitchell A.L.
7. Potter S.C.
8. Punta M.
9. Qureshi M.
10. Sangrador-Vegas A.
11. Salazar G.A.
12. Tate J.
13. Bateman A
2016The Pfam protein families database: towards a more sustainable futureNucleic Acids Res 44:D279–285Google Scholar
259.
1. Schaffer A.A.
2. Wolf Y.I.
3. Ponting C.P.
4. Koonin E.V.
5. Aravind L.
6. Altschul S.F
1999IMPALA: matching a protein sequence against a collection of PSI-BLAST-constructed position-specific score matricesBioinformatics 15:1000–1011Google Scholar
260.
1. Eddy S.R
2011Accelerated Profile HMM SearchesPLoS Comput Biol 7:e1002195Google Scholar
261.
1. Soding J.
2. Biegert A.
3. Lupas A.N
2005The HHpred interactive server for protein homology detection and structure predictionNucleic Acids Res 33:W244–248Google Scholar
262.
1. Berman H.
2. Henrick K.
3. Nakamura H.
4. Markley J.L
2007The worldwide Protein Data Bank (wwPDB): ensuring a single, uniform archive of PDB dataNucleic Acids Res 35:D301–303Google Scholar
263.
1. Price M.N.
2. Dehal P.S.
3. Arkin A.P
2010FastTree 2--approximately maximum-likelihood trees for large alignmentsPLoS One 5:e9490Google Scholar
264.
1. Minh B.Q.
2. Schmidt H.A.
3. Chernomor O.
4. Schrempf D.
5. Woodhams M.D.
6. von Haeseler A.
7. Lanfear R.
2020IQ-TREE 2: New Models and Efficient Methods for Phylogenetic Inference in the Genomic EraMol Biol Evol 37:1530–1534Google Scholar
265.
1. Abramson J.
2. Adler J.
3. Dunger J.
4. Evans R.
5. Green T.
6. Pritzel A.
7. Ronneberger O.
8. Willmore L.
9. Ballard A.J.
10. Bambrick J.
11. Bodenstein S.W.
12. Evans D.A.
13. Hung C.C.
14. O’Neill M.
15. Reiman D.
16. Tunyasuvunakool K.
17. Wu Z.
18. Zemgulyte A.
19. Arvaniti E.
20. Beattie C.
21. Bertolli O.
22. Bridgland A.
23. Cherepanov A.
24. Congreve M.
25. Cowen-Rivers A.I.
26. Cowie A.
27. Figurnov M.
28. Fuchs F.B.
29. Gladman H.
30. Jain R.
31. Khan Y.A.
32. Low C.M.R.
33. Perlin K.
34. Potapenko A.
35. Savy P.
36. Singh S.
37. Stecula A.
38. Thillaisundaram A.
39. Tong C.
40. Yakneen S.
41. Zhong E.D.
42. Zielinski M.
43. Zidek A.
44. Bapst V.
45. Kohli P.
46. Jaderberg M.
47. Hassabis D.
48. Jumper J.M
2024Accurate structure prediction of biomolecular interactions with AlphaFold 3Nature 630:493–500Google Scholar
266.
1. Sehnal D.
2. Bittrich S.
3. Deshpande M.
4. Svobodova R.
5. Berka K.
6. Bazgier V.
7. Velankar S.
8. Burley S.K.
9. Koca J.
10. Rose A.S
2021Mol* Viewer: modern web app for 3D visualization and analysis of large biomolecular structuresNucleic Acids Res 49:W431–W437Google Scholar
267.
1. Holm L.
2019Benchmarking fold detection by DaliLite v.5Bioinformatics 35:5326–5327Google Scholar
268.
1. van Kempen M.
2. Kim S.S.
3. Tumescheit C.
4. Mirdita M.
5. Lee J.
6. Gilchrist C.L.M.
7. Soding J.
8. Steinegger M.
2024Fast and accurate protein structure search with FoldseekNat Biotechnol 42:243–246Google Scholar

Article and author information

Author information

A Maxwell Burroughs
Division of Intramural Research, National Library of Medicine, National Institutes of Health, Bethesda, United States
- These authors contributed equally to this work
Gianlucca G Nicastro
Division of Intramural Research, National Library of Medicine, National Institutes of Health, Bethesda, United States
- These authors contributed equally to this work
L Aravind
Division of Intramural Research, National Library of Medicine, National Institutes of Health, Bethesda, United States
ORCID iD: 0000-0003-0771-253X
- For correspondence: aravind@ncbi.nlm.nih.gov

Author Notes

Competing interests: No competing interests declared

Version history

Preprint posted: June 18, 2025
Sent for peer review: June 22, 2025
Reviewed Preprint version 1: July 22, 2025
Version of Record published: September 9, 2025

Cite all versions

You can cite all versions using the DOI https://doi.org/10.7554/eLife.108061. This DOI represents all versions, and will always resolve to the latest one.

Copyright

This is an open-access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.
