The lipocone superfamily, a unifying theme in metabolism of lipids, peptidoglycan and exopolysaccharides, inter-organismal conflicts and immunity

  1. A Maxwell Burroughs
  2. Gianlucca G Nicastro
  3. L Aravind (corresponding author)
  1. Division of Intramural Research, National Library of Medicine, National Institutes of Health, United States
6 figures and 1 additional file

Figures

Figure 1 with 5 supplements
Identification and compositional analysis of the Lipocone superfamily.

(A) The four individual helices forming the core of the Lipocone superfamily are consistently colored across the illustrated representatives. The inter-helix linkers are colored gray, and lineage-specific synapomorphic insertions and extensions are colored light brown. Active site and other residues of interest are rendered as ball-and-stick. Protein Data Bank (PDB) IDs or GenBank accessions used to generate AF3 models are provided. (B) Relationship network of the Lipocone families. The thickness of the edges is scaled by negative-log HHalign p-values. Families are colored according to the community identified by the Leiden algorithm (Traag et al., 2019) (see Methods). (C) Box plots displaying core helix transmembrane propensity scores of individual sequences within different Lipocone families. The horizontal divider represents the boundary between typical transmembrane (TM) and soluble sequences.
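The edge scaling described for panel (B) can be sketched as follows. This is a minimal illustration, not the authors' plotting code: the base-10 logarithm, the `scale` and `min_width` parameters, and the family names and p-values are all assumptions made for the example.

```python
import math

def edge_width(p_value, scale=0.1, min_width=0.5, floor=1e-300):
    """Map a p-value to a line width via its negative log10 (assumed base)."""
    # Clamp so that p-values that underflow to zero do not break the log.
    p = max(p_value, floor)
    score = -math.log10(p)
    # Keep weak edges visible by enforcing a minimum width.
    return max(min_width, scale * score)

# Hypothetical family pairs and p-values, for illustration only.
edges = {
    ("FamA", "FamB"): 1e-30,
    ("FamA", "FamC"): 1e-5,
    ("FamB", "FamC"): 0.5,
}
widths = {pair: edge_width(p) for pair, p in edges.items()}
```

Smaller p-values give thicker edges, so the strongest profile-profile relationships stand out in the network.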

Figure 1—source data 1

Transmembrane tendency scores by Lipocone family sequence for (C) and network YAML file for (B).

https://cdn.elifesciences.org/articles/108061/elife-108061-fig1-data1-v1.zip
Figure 1—figure supplement 1
Phyletic distribution patterns of Lipocone superfamily.

(Top) The percentage of genomes within a discrete phylogeny that contain at least one representative of a given Lipocone family is colored according to the provided legend. (Bottom) Bar graph depicting phyletic depth (bar height, Di; see Methods) and breadth (bar width) for Lipocone clades (see Figure 4). Coloring contrasts membrane-associating clades with diffusible clades.

Figure 1—figure supplement 2
Structural representatives of the Lipocone superfamily.

Representative structures or predicted models from Lipocone families. Core helices (H1, H2, H3, and H4) are colored uniformly across the structures as in Figure 1A. Loops and inserts are outlined and transparent; distinctive features are labeled as appropriate. Protein Data Bank (PDB) IDs or protein sequence identifiers used to generate AF models are provided. The three core ancestral active site positions (see Figure 2) are rendered as ball-and-stick, with carbons colored green and other atoms colored in standard element colors.

Figure 1—figure supplement 3
Critical difference diagram depicting group-wise differences across transmembrane (TM) tendency score distributions in Figure 1C (see Methods).

Groups connected by horizontal bars are not significantly different (Bonferroni-adjusted p>0.05); groups not connected are significantly different (adjusted p<0.05). Three non-linked groups are observed in this diagram: (1) those with clear negative TM propensity scores (containing families predicted to be soluble), (2) those with clear positive TM propensity scores (containing families predicted to insert into the membrane), and (3) those with borderline scores. The third category includes families known to insert into the endoplasmic reticulum (ER) membrane (PTDSS1/2) and those predicted to insert into the outer bacterial membrane (YfiM clade).
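The Bonferroni adjustment and the 0.05 threshold used to connect groups can be illustrated with a short sketch. The pairwise test that produces the raw p-values is not stated in this legend (see Methods), so the raw values and family labels below are hypothetical.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Hypothetical raw p-values for three pairwise family comparisons.
comparisons = [("FamA", "FamB"), ("FamA", "FamC"), ("FamB", "FamC")]
raw = [0.01, 0.02, 0.5]
adjusted = bonferroni(raw)

# Pairs at or above the threshold would be joined by a horizontal bar.
alpha = 0.05
not_different = [pair for pair, p in zip(comparisons, adjusted) if p >= alpha]
```

In this toy input only the first pair remains significantly different after adjustment; the other two would be drawn as connected.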

Figure 1—figure supplement 4
Structural diversity in the cpCone clade.

Each panel depicts a distinct structural configuration observed in the cpCone clade, colored by a rainbow palette from the N- to the C-terminus (left structure) and by equivalent helix (right structure). All representative structures were selected from the cpCone-1 family, except where noted. AF models are based on the sequences identified by the NCBI accession number labels.

Figure 1—figure supplement 5
Summary of the Methodology and Main Findings.

(A) Flowchart of the research strategy. Research endpoints are shaded in gray circles. Methods are shaded in orange boxes. Resources (e.g. databases) are shaded in green boxes. Programs and algorithms used are shaded in red boxes. (B) Graphical overview of the evolutionary history of the Lipocone superfamily.

Figure 2
Sequence logo of conserved core elements of the Lipocone families.

These correspond to the core helices H2, H3, and H4. The three conserved active site residue positions are boxed in dotted lines with the inferred ancestral residue indicated at the top of the alignment. Families are grouped and labeled on the left in their higher-order clades.

Figure 3
Known and predicted Lipocone reaction mechanisms.

Experimentally supported reactions are boxed in blue (A–B), while a predicted reaction based on genome displacement by a Lipocone domain of an experimentally characterized enzyme is boxed in orange (C). The remaining reactions (D–G) are suggested based on the contextual inferences in this work. Attacking and leaving groups are denoted by dashed green and red circles, respectively.

Figure 4
Reconstructed evolutionary scenario for the Lipocone superfamily.

The relative temporal epochs are demarcated by vertical lines and labeled at the bottom. The clades are represented by colored lines indicating the maximum depth to which the families listed to the right can be traced. Colors track the superkingdom-level phyletic distribution of the family. Dashed-line circles indicate uncertainty in the origin of lineage(s). Inferred or experimentally characterized functions for families are indicated to the left of family names. Asterisks denote newly described families.

Figure 5 with 3 supplements
Representative contexts for the Lipocone superfamily, grouped by shared functional themes.

Genes are depicted by box arrows, with the arrowhead indicating the 3’ end of genes. Genes encoding proteins with multiple domains are broken into labeled sections corresponding to them. Domain architectures are depicted by the individual domains represented by distinct shapes. TM segments, lipoboxes (LPs), and signal peptides (SPs) are depicted as unlabeled, narrow yellow, blue, and red rectangles, respectively. All Lipocone domains are consistently colored in orange. Genes marked with asterisks are labeled by the GenBank accession number below each context. Colored labels above genes denote well-known gene names or gene cluster modules. Abbreviations: PTase, peptidase; TFase, transferase; GlycosylTFase, Glycosyltransferase; MPTase, metallopeptidase; TGase, transglycosylase; SLP, serine-containing lipobox; cNMPBD, cNMP-binding domain; NCPBM, novel putative carbohydrate binding module; (w)HTH, (winged) helix-turn-helix; ZnR, Zinc ribbon; PPTs, pentapeptide repeats; Imm, immunity protein; βPs, β-propeller repeats; Cystatin-FD, Cystatin fold domain; MTase, methylase; PGBD, peptidoglycan-binding domain; MβL, metallo-β-lactamase; L12-ClpS, ClpS-ribosomal L7/L12 domain; TA, teichoic acid.

Figure 5—source data 1

Table of Lipocone family conserved contextual associations across distinct functional themes.

https://cdn.elifesciences.org/articles/108061/elife-108061-fig5-data1-v1.pdf
Figure 5—source data 2

List of identified genome contexts.

Features and coloring as described in the Figure 5 legend.

https://cdn.elifesciences.org/articles/108061/elife-108061-fig5-data2-v1.pdf
Figure 5—figure supplement 1
Multiple sequence alignment of serine-containing lipobox (SLP).

Sequences are labeled to the left by NCBI accession number and organism abbreviations. The conserved serine residue position, denoted at the top of the alignment by an asterisk, is shaded in red, with text colored in white. Other residue positions are colored according to consensus conserved biochemical properties: hydrophobic (h) and aromatic (a) residues are shaded yellow, polar (p) residues are shaded blue, small (s) and tiny (u) residues are shaded green, and positively charged (+) residues are shaded red. The diversity of domains C-terminally fused to the SLP is depicted to the right of the alignment, represented as geometric shapes. Organism abbreviations are as follows: Obac: Oscillospiraceae bacterium; Cbac: Clostridia bacterium; Lbac: Lachnospiraceae bacterium; CEqu: Candidatus Equihabitans; Rzha: Roseburia zhanii; Rusp: Ruminococcus sp; Aaut: Aceticella autotrophica; Nthe: Natranaerobius thermophilus; CFim: Candidatus Fimenecus; Busp: Butyrivibrio sp; Eusp: Eubacterium sp; Rbac: Ruminococcaceae bacterium; Chsp: Chryseobacterium sp; Flsp: Fluviicola sp; Zpro: Zunongwangia profunda; Ibac: Ignavibacteria bacterium; Mbac: Myxococcales bacterium; Pbac1: Planctomycetota bacterium; Byua: Bradyrhizobium yuanmingense; Rhsp: Rhodopseudomonas sp; Afer: Acidimicrobium ferrooxidans; Bbac1: Bacillota bacterium; Etay: Eisenbergiella tayi; Rosp: Roseburia sp; Bbac2: Betaproteobacteria bacterium; Pbac2: Pseudomonadota bacterium; Bbac3: Burkholderiales bacterium; Idsp: Ideonella sp; Pisp: Piscinibacter sp; Aant: Algoriphagus antarcticus; Fbac: Flavobacteriales bacterium; Friv: Flavobacterium rivulicola; Mlut: Mongoliitalea lutea; Ga: Gammaproteobacteria; Sysp: Syntrophorhabdus sp; Aadv: Apibacter adventoris; Bbac4: Bacteroidota bacterium; Spsy: Sphingobacterium psychroaquaticum; mbac: marine bacterium; Ster: Sebaldella termitidis; Abac: Armatimonadota bacterium; Bcla: BD1-7 clade; Gpen: Gallaecimonas pentaromativorans; Kgeo: Kangiella geojedonensis; Ktai: Kangiella taiwanensis; Posp: Porphyromonas sp; Xbre: Xylanibacter brevis; Pmul: Prevotella multiformis; Scop: Segatella copri; Bbac5: Bacteroidales bacterium; Masp: Mariniphaga sp; Pbac3: Prolixibacteraceae bacterium; Dbac: Desulfobacteraceae bacterium.

Figure 5—figure supplement 2
Structural overview of newly identified immunity proteins pairing with toxin-containing proteins in polymorphic and allied toxin systems.

The top panel depicts the concordance of core secondary structure elements across BamE-like immunity protein families, with the N-terminal α-helix dyad colored blue and green and the four strands of the core β-meander colored, in order, yellow, light green, orange, and purple. The bottom panel depicts rainbow palette coloring of the Jellyroll domain-containing immunity protein and the 4-transmembrane (TM) protein. The Immunity-SAA protein is colored by secondary structure element, with conserved cysteine residues rendered as ball-and-stick. The Protein Data Bank (PDB) ID or the sequence used to generate the AF model is provided.

Figure 5—figure supplement 3
Sequence and structure overview of the broken-hairpin domain.

(A) Multiple sequence alignment of the broken-hairpin domain, with conserved axR motif positions labeled above the alignment (‘a’ represents an aromatic residue, ‘x’ any residue, and ‘R’ an arginine residue). Coloring and conserved consensus abbreviations as in Figure 4. (B) AF models of the broken-hairpin domain, with loops colored in gray and strands in orange. axR motifs are rendered as ball-and-stick representations. (C) Selection of domain architectures observed with the broken-hairpin domain, arranged and labeled by general functional theme. Depictions and abbreviations as described in the legends of Figures 5 and 6. (D–E) AF models exploring the positioning of the broken-hairpin domain relative to distinct N- or C-terminal effector domains.

Figure 6 with 3 supplements
Lipocone contextual network.

The network represents the conserved contextual associations of Lipocone domains (hexagonal nodes). Nodes and edges are colored based on known or inferred functional categories of the domains. The nodes are scaled by their degree. Gray coloring indicates domains without specific functional assignments. Examples of conserved gene neighborhoods and domain architectures supplementing those in Figure 5 illustrate contexts that bridge functional themes. Here, individual domains are colored to match network coloring. Additional abbreviations to those in Figure 5: APH-Pkinase, aminoglycoside phosphotransferase-like kinase; HUP, HIGH, UspA and PP-ATPase superfamily-like domain; Alk-phosphatase, Alkaline phosphatase; dehyd, dehydrogenase; TPRs, tetratricopeptide repeats; PMM/PGM, phosphomannomutase/phosphoglucomutase; ZnF, zinc finger; APC-transporter, amino acid-polyamine-organocation transporter; LPS, lipopolysaccharide.
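Scaling nodes by their degree, as described above, amounts to counting the edges incident on each node. The sketch below assumes an undirected edge list; the `base` and `per_edge` sizing parameters and the example edges are hypothetical and are not taken from the Figure 6 source data.

```python
from collections import Counter

def node_degrees(edges):
    """Count how many edges touch each node in an undirected edge list."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree

def node_size(degree, base=10.0, per_edge=4.0):
    """Linear size scaling; both parameters are illustrative choices."""
    return base + per_edge * degree

# Hypothetical contextual associations between domains.
example_edges = [
    ("Lipocone", "PGBD"),
    ("Lipocone", "TGase"),
    ("PGBD", "TGase"),
    ("Lipocone", "Imm"),
]
degrees = node_degrees(example_edges)
sizes = {node: node_size(d) for node, d in degrees.items()}
```

Domains with more conserved contextual partners are drawn larger, which makes hubs in the network easy to spot.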

Figure 6—source data 1

Significant enrichment of Lipocone family contextual associations across functional categories.

https://cdn.elifesciences.org/articles/108061/elife-108061-fig6-data1-v1.pdf
Figure 6—source data 2

Figure 6 network and node annotation YAML files.

https://cdn.elifesciences.org/articles/108061/elife-108061-fig6-data2-v1.zip
Figure 6—figure supplement 1
Lipocone domain-centered subgraphs of contextual network in Figure 6.

These subgraphs capture significant enrichment of different functional categories (Figure 6—source data 1). Subgraph network nodes, edges, scaling, and coloring are described in Figure 6 legend.

Figure 6—figure supplement 2
Lipocone domain-centered subgraphs of contextual network in Figure 6.

These subgraphs capture significant enrichment of different functional categories (Figure 6—source data 1). Subgraph network nodes, edges, scaling, and coloring are described in Figure 6 legend.

Figure 6—figure supplement 3
Lipocone domain-centered subgraphs of contextual network in Figure 6.

These subgraphs capture significant enrichment of different functional categories (Figure 6—source data 1). Subgraph network nodes, edges, scaling, and coloring are described in Figure 6 legend.

  1. A Maxwell Burroughs
  2. Gianlucca G Nicastro
  3. L Aravind
(2025)
The lipocone superfamily, a unifying theme in metabolism of lipids, peptidoglycan and exopolysaccharides, inter-organismal conflicts and immunity
eLife 14:RP108061.
https://doi.org/10.7554/eLife.108061.2