Eukaryote-microbiome interactions are among the most diverse associations in marine and terrestrial ecosystems. Chemical exchanges between diverse eukaryotes and their microbiomes are hallmarks of these relations, particularly involving secondary metabolites [1,2]. Despite the importance of secondary metabolites as signaling molecules that play critical roles in structuring microbial populations and regulating global biogeochemical cycles, the molecular mechanisms by which identified chemicals regulate microbial relationships in the ocean are largely elusive. The C9 dicarboxylic acid, azelaic acid (or azelate, hereafter Aze) is a ubiquitous, yet enigmatic metabolite produced by photosynthetic organisms, such as plants [3,4] and phytoplankton [5]. It is postulated to be a product of the peroxidation of galactolipids [6], yet its exact biosynthetic pathway is unknown. Aze plays a crucial role as an infochemical that primes systemic acquired resistance against phytopathogens [79] and influences microbial diversity in soil [10]. In marine environments, diatoms secrete Aze to modulate bacterial populations by promoting the growth of symbionts while simultaneously inhibiting opportunists [5]. Despite this undeniable importance, no Aze uptake proteins are known in bacteria and its assimilation into bacterial cells to promote or inhibit growth is poorly understood. A single gene has recently been identified in Pseudomonas nitroreducens that acts as a transcriptional regulator of Aze, azeR [11], yet its regulatory targets are unknown. Understanding the molecular mechanisms that enable the transport and assimilation of Aze into bacterial cells to induce positive or negative phenotypes will help to reveal how eukaryotic hosts influence and control their microbiome.

To examine how bacteria transport and catabolise Aze, we used Phycobacter azelaicus (formerly Phaeobacter sp. F10, and hereafter Phycobacter) as a model to study how Aze promotes bacterial growth and Alteromonas macleodii F12 (hereafter A. macleodii) as a model to study the inhibitory effects of Aze [5]. Both bacteria were isolated from the diatom Asterionellopsis glacialis, which was shown to produce Aze to modulate its microbiome [5]. In previous experiments, Aze caused an increase in cell density of Phycobacter relative to controls, while temporarily inhibiting the growth of A. macleodii before its growth recovered relative to controls [5]. A combination of transcriptomics, transcriptional coexpression networks of master transcriptional regulators, metabolomics, and uptake experiments were used to elucidate how these bacteria process Aze. Additionally, in situ mesocosm experiments with seawater microbial communities and Arabidopsis thaliana were conducted to examine if Aze influences microbial diversity in natural environments and enrich microbial taxa that carry genes involved in Aze uptake and response. We discover the first putative Aze transport system in bacteria and map out how each bacterium responds to and processes Aze through complex regulatory networks.


Growth and Transcriptional Response to Aze

Phycobacter incubated with 1 mM Aze as the sole carbon source showed a significant increase in growth relative to no-carbon controls or growth with glutamate as the sole carbon source (Figure S1a). To elucidate the transcriptional responses of Phycobacter to growth on Aze, RNA samples were collected at 0.5 and 8 hours after Aze addition to Phycobacter to capture the short-term uptake and longer-term catabolism of Aze (Figure S1b). To infer possible mechanisms of short-term toxicity by A. macleodii, samples were collected at 0.5 hours. After 0.5 hours, Phycobacter differentially expressed 273 genes in response to Aze; this number increased to 558 genes after 8 hours, corresponding to ∼20% of all CDS in its genome (Table S1). Approximately 78% of differentially expressed genes were upregulated in response to Aze (n=652) and most occurred at 8 hours (n=494). Only 70 genes displayed consistent expression profiles at both 0.5 and 8 hours, suggesting the immediate, short-term response to Aze is different from long-term growth on the substrate. In contrast, A. macleodii differentially expressed 274 genes at 0.5 hours, corresponding to ∼7% of all CDS in its genome, half of which were upregulated (Table S1).

Global analysis of KEGG pathways showed significant differences between the transcriptional profiles of Phycobacter and A. macleodii (Figure 1). While fatty acid degradation, amino acid metabolism and degradation, ABC transporters, oxidative phosphorylation, propanoate metabolism, glyoxylate and dicarboxylate metabolism pathways exhibited similar expression ratios across both bacteria in response to Aze, upregulated genes in these pathways were mostly dominated by Phycobacter transcripts. In contrast, downregulated genes were mostly enriched in A. macleodii transcripts. Interestingly, fatty acid degradation genes were among the most upregulated genes in Phycobacter (Figure 1).

Enriched KEGG pathways in Phycobacter and A. macleodii.

Left: Scatter plot of the log2 fold-change of differentially expressed (DE) genes present in KEGG pathways listed on the y-axis. Each circle represents the differential expression of a gene in the presence of Aze relative to controls. Right: Expression ratios for KEGG pathways of interest were calculated as the total number of DE genes: total number of genes present in the genome in the given pathway. Circle and bar colors indicate the strain and time point (red=Phycobacter at 0.5 hours; green=Phycobacter at 8 hours; blue=A. macleodii at 0.5 hours).

Aze Catabolism by Phycobacter

No uptake genes are known for Aze. A C4-dicarboxylate tripartite ATP-independent periplasmic (TRAP) transporter substrate-binding protein (INS80_RS11065) was the most and the third most upregulated gene in Phycobacter grown on Aze at 0.5 and 8 hours, respectively. The small and large permease genes (INS80_RS11060 and INS80_RS11055) neighboring this gene, which form a functional TRAP transporter, were among the top 20 most upregulated genes in the transcriptome at 0.5 hours and continued to be upregulated at 8 hours, implicating these genes in transporting Aze (Figure 2, Tables S2 and S3). All three genes are co-localized with the putative Aze transcriptional regulator, azeR (INS80_RS11050, belonging to the lclR family transcriptional factor) [11], which was also upregulated at 0.5 hours. In silico predictions revealed that azeR, the small and large permeases all fall within a single operon while the substrate-binding gene is transcribed independently (Figure 3). We assign this cluster of genes the designation azeTSLR (T=substrate-binding protein, S=small permease, L=large permease, R=regulator). Once inside the cell, Aze appeared to be fed into the fatty acid degradation pathway via a PaaI family thioesterase and acyl-coA ligase (fadD) that are present directly downstream of azeTSLR and upregulated at 0.5 hours (Figures 2 and 3, Table S2). Aze-coA was then putatively degraded through a series of steps catalyzed by genes in the fatty acid degradation pathway (acd, paaF, and fadN) that were among the most highly upregulated genes at 0.5 hours and continued to be upregulated at 8 hours (Figure 2, Tables S2 and S3). These successive reactions presumably liberated two acetyl-coA molecules to generate glutaryl-coA. This molecule can either be converted to glutaconyl-coA, or further degraded into acetyl-coA via gcdH, paaF, fadN and atoB, all of which were upregulated at both time points (Figure 2).

Proposed Aze catabolism in Phycobacter based on transcriptomics and isotope labeling.

Left: Log2 fold-change of differentially expressed (DE) genes is shown as circles at 0.5 (left circle) and 8 hours (right circle) after the addition of Aze to Phycobacter cells relative to controls. The Aze transport system (azeTSL) consists of 3 genes, the DE values of which are shown next to the transporter. Intermediate reactions and substrates for the successive liberation of acetyl-coA from azeloyl-coA and pimeloyl-coA are not shown; DE values of the genes involved are shown next to each overall reaction. 13C-labeled metabolites detected in the intracellular metabolome of Phycobacter cells are marked with cyan circles at each labeled carbon atom site. Right: Relative abundance of each detected labeled metabolite after addition of 100 μM 13C-Aze to Phycobacter cells, shown as total ion current (TIC). Box plot values are based on triplicates. NS indicates no significant relative abundance, * denotes p<0.05 and ** denotes p<0.005 based on a Student’s t-test.

Genomic neighborhood structure of azeR and azeTSL in Phycobacter and Pseudomonas nitroreducens DSM9128.

Genes predicted in silico to belong to a single operon are marked by blue boxes. Bold-faced gene abbreviations in Phycobacter denote upregulated genes in response to Aze. Operon prediction was carried out on OperonMapper [55]. azeT=substrate-binding protein, azeS=small permease, azeL=large permease, azeR=transcriptional regulator, TR=transcriptional regulator, OR=oxidoreductase, HP=hypothetical protein, GNAT=GNAT family N-acetyltransferase, SGD=succinylglutamate desuccinylase, SBP=substrate-binding protein.

To confirm the patterns observed in the Phycobacter transcriptome and resolve the fate of glutaryl-coA, Phycobacter was supplemented with 13C9-Aze, and intracellular metabolites were extracted and analyzed using an elemental analyzer-isotope ratio mass spectrometer (IRMS) and a Fourier-transform ion cyclotron resonance mass spectrometer (FTMS). Within 15 minutes of 13C9-Aze addition, Phycobacter cells were significantly enriched in 13C and continued to exhibit an increase in δ13C up to 120 minutes as shown using IRMS (repeated-measure ANOVA, p<0.001) (Figure S2). Within 15 min, 0.18% of the carbon content of Phycobacter cells was derived from Aze, corresponding to an uptake rate of 3.8 x10-9 nmol cell-1 hr-1. After 120 min, the fraction of carbon derived from Aze contributed to 0.67% of the total carbon in the cells (Table S4). Consistent with this observation, FTMS analysis showed that 15-30 minutes after the addition of labeled Aze to cells, 13C9-Aze, 13C9-Azeloyl-coA, 13C7-pimeloyl-coA, and 13C5- glutaryl-coA were detected and increased in relative abundance over the course of 120 minutes (Figure 2, Table S4). Furthermore, 13C5-glutaconyl-coA was detected at 2 hours and 13C3- propionate-coA at 1-2 hours after addition of 13C9-Aze. Glutaconyl-coA is a side product of glutaryl-coA oxidation that is shuttled into butanoate metabolism [12], while propionyl-coA is an important intermediate in glyoxylate and propanoate metabolism, indicating these pathways maybe activated downstream of Aze catabolism. Indeed, while greater upregulation of fatty acid degradation occurred at 0.5 hours than at 8 hours, greater upregulation of glyoxylate and dicarboxylate metabolism, butanoate metabolism and other downstream pathways occurred at 8 hours (Figure 4a, Table S2).

Model of Aze catabolism in Phycobacter and toxic effects in A. macleodii.

Colored boxes and circles represent metabolic pathway expression ratios and differential gene expression (log2 fold- change), respectively, at 0.5 and 8 hours for a, Phycobacter and at 0.5 hours for b, A. macleodii. White circles/boxes indicate non-statistically significant DE genes/silent pathways. Putative transcriptional factor regulation of pathways/genes is shown by purple lines. Log2 fold-change values shown for all transporters represent the mean log2 fold-change value of all genes in the cluster. Pathway expression ratios were the same as in Fig. 1 except the sign of the ratio (indicating up- or downregulation) are shown. Amino acid biosynthesis and metabolism ratios in (b) are outside the range of the pathway expression ratio scale (values ∼1.5) but are colored in dark red to indicate significant downregulation. T2SS=Type II secretion system, AHLs=acyl-homoserine lactones.

Toxicity of Aze in A. macleodii

The genome of A. macleodii completely lacks any TRAP transport systems, suggesting Aze crosses the cell membrane non-specifically. In fact, A. macleodii assimilated negligible amounts of 13C9-Aze, with a relative uptake ∼160 times lower than that of Phycobacter at 120 minutes (Figure S2). In contrast to the Phycobacter transcriptome, the highest upregulated gene 0.5 hours after Aze addition to A. macleodii was an efflux RND transporter periplasmic subunit (GKZ85_RS10170). The permease and outer membrane subunits of this efflux pump (GKZ85_RS10160, GKZ85_RS10165) that belong to the same operon were among the top 10 most upregulated genes in the A. macleodii transcriptome (Figure 4b, Tables S2 and S3). This efflux system presumably removes Aze from the cytoplasm. Several other efflux transporters and ABC multidrug transporters were among the most upregulated genes in the presence of Aze relative to the controls. Among upregulated genes in the presence of Aze were secretion systems, protein export, spore formation, stress-response proteins, heat shock proteins, proteases, and ribosome protection genes (Tables S2 and S3). Fatty acid degradation was not differentially expressed in the presence of Aze, while most ribosomal genes were either downregulated or not differentially expressed. In addition, nucleotide metabolism, pyruvate metabolism, oxidative phosphorylation, electron transfer, amino acid metabolism and biosynthesis, and fatty acid biosynthesis were downregulated (Figures 1 and 4b, Table S2). Collectively, these transcriptomic patterns confirm previous work [5] that indicate A. macleodii mitigates the toxic effects of Aze over time and that this mitigation relies on removing Aze from the cell via efflux systems and simultaneously arresting cellular metabolism. Interestingly, several genes involved in ribosome protection and response to protein degradation (rmf, hpf, hslR, hslU, hslV) were upregulated, suggesting Aze may disrupt protein synthesis (Table S2).

AzeR and xre Transcriptional Networks

Due to the stark transcriptional response to Aze in each bacterium, we hypothesized that transcriptional regulation plays a major role in activating essential pathways that catabolize or mitigate the toxicity of Aze, particularly since both bacteria have complete fatty acid degradation pathways. To identify transcriptional master regulators for Phycobacter and A. macleodii, we performed regulatory impact factor (RIF) analyses [13]. Transcriptional factors (TF) in each species were compared to the unique corresponding differentially expressed gene lists, which led to the identification of 29 regulators with significant scores for Phycobacter and 12 for A. macleodii (deviating ± 1.96 SD from the mean; p<0.05) (Table S5). Interestingly, azeR was the top differentially expressed TF identified in Phycobacter. In contrast, only two TFs, namely (i) XRE family transcriptional regulator (xre), a bacterial TF associated with stress tolerance [14], and (ii) helix-turn-helix transcriptional regulators (GKZ85_RS06065 and GKZ85_RS16495), were differentially expressed in A. macleodii. To gain insights into the regulatory mechanisms at play during exposure to Aze, we used transcriptional coexpression networks, which enabled characterization of transcriptional patterns during response to disease, stress, and development [1517].

Significant connections (r ≥ ±0.95) within the initial network identified 286 genes with 7,208 connections in Phycobacter and 274 genes with 7,102 connections in A. macleodii (Figure S3). Subnetworks that contain azeR and xre show that they act as hub genes in putatively regulating specific gene categories in Phycobacter and A. macleodii, respectively. The subnetwork of azeR shows that it putatively regulates transcription of most of the pathways that are involved in assimilating Aze in Phycobacter, including azeTSL, fatty acid degradation, butanoate metabolism, glyoxylate and dicarboxylate metabolism and oxidative phosphorylation, in addition to other genes that may be indirectly affected by Aze catabolism (Figures 4a and 5a). In contrast, the subnetwork of xre shows that it putatively regulates transcription of the efflux system implicated in removal of Aze from the cytoplasm of A. macleodii, stress proteins and proteases, protein export, nucleotide metabolism, and amino acid metabolism and biosynthesis [18] (Figures 4b and 5b). These findings highlight the critical role of AzeR in mediating the response to Aze and may explain the large differences in response between both bacteria.

Transcriptional coexpression networks of hub transcriptional factors (TFs) azeR and xre, and their putative target genes in Phycobacter and A. macleodii.

Nodes represent genes (circles) and TFs (triangles) connected by edges based on significant co-expression correlation (PCIT; r≥0.95) for a, Phycobacter and b, A. macleodii. Nodes are grouped based on functions and represented by different colors. The size of the node corresponds to the normalized mean expression values in Aze-treated samples, whereas the color of the node border corresponds to differential expression. Edge colors indicate the direction of the correlation between each gene pair. DEG=differentially expressed gene, TF=transcriptional factor.

Aze Addition to Seawater Mesocosms

To examine the influence of Aze as a potential carbon source or antimicrobial compound on marine microbial populations, surface seawater was collected and enriched for prokaryotes by size-fractionated filtration. Seawater mesocosms were supplemented with 100 μM Aze or an equivalent volume of Milli-Q water (see Methods) and incubated in the dark for 16 hours, after which DNA was extracted and 16S ribosomal RNA (rRNA) gene amplicon sequencing was used to assess prokaryotic diversity. The total number of species (richness) was higher in Aze treated samples relative to T=16 hours control samples (Wilcox, p<0.05), while the diversity of the microbial community was significantly lower (Wilcox, p<0.05) in Aze-treated samples relative to controls (Figure 6a). PCoA showed that the microbial community was significantly different between each sample group (PERMANOVA, p<0.001, Figure 6b). Taxonomic classification at the family level showed that Rhodobacteraceae, SAR11 clade I, Cyanobiaceae, and Flavobacteriaceae constituted >60% of prokaryotic diversity across all samples. Notably, the relative abundance of Rhodobacteraceae increased in the Aze-treated samples (mean ∼47%) relative to the T=16 hours control samples (mean ∼37%) (Figure 6c). Differential abundance analysis with DESeq2 (p-adj<0.05) identified 44 amplicon sequence variants (ASVs) enriched after Aze addition relative to T=16 hours controls. At the genus level, these ASVs belonged to Rhodobacteraceae (n=17), Marine Group II Euryarchaea (MG II) (n=19), Pseudoalteromonas (n=6), Vibrio (n=1), and Psychrosphaera (n=1). Rhodobacteraceae had the highest mean taxonomic proportion of ASVs ranging from ∼1.5 to ∼95, suggesting a significant reliance on Aze, while MG II had the lowest mean ratios of ASVs ranging from ∼0.02 to ∼0.04 (Figure 6d).

Bacterial and archaeal diversity in seawater treated with Aze.

a, Alpha-diversity indices of observed OTUs, Chao1, Shannon and Simpson across the Aze-treated and control samples. b, PCoA of Bray-Curtis distances (PERMANOVA: R2 = 0.73; p<0.001) between samples. The two principal components (PCoA1 and PCoA2) explained 72.5% and 15.1% variance, respectively. c, Relative abundance of the top 25 microbial families based on 16S rRNA gene amplicon sequencing of Aze-treated samples at T=16 hours and control samples at T=0 and T=16 hours. d, Distribution of the amplicon sequence variants (ASVs) belonging to significantly differentially abundant taxa between the Aze-treated and control samples at T=16 hours, according to their log2 fold-change and p-adjusted values. The bubble size indicates the mean taxonomic proportion of each ASV, calculated as the mean number of reads in an ASV: mean number of reads present in all ASVs of the same taxonomic classification. The bubble color indicates the taxonomic classification of each ASV according to (c).

Aze Addition to soil and infiltration into Arabidopsis

We hypothesized that Aze may have similar influences in rhizobial ecosystems as it did in marine ecosystems, particularly its influence on soil microbial community composition. Potting soil was amended with 100 μM Aze, suberic acid or an equivalent volume of Milli-Q water and incubated in the dark for 16 hours at 24°C. The C8-dicarboxylic acid suberic acid was used to ensure potential changes in the microbial community structure were specific to Aze. DNA was extracted and 16S rRNA gene amplicon sequencing was used to assess prokaryotic diversity. Unlike seawater (Figure S4a), Aze-treated soil did not significantly alter the soil microbial community (Figure S4b), which suggests a fundamental difference compared to seawater. Because Aze is known to prime the immune system in plants, a fundamentally different function from its role in diatoms where it modulates phycosphere microbial communities, we hypothesized that Aze may act indirectly on soil communities through the plant host.

Four-week-old Arabidopsis thaliana leaves were infiltrated with 500 µL of 1 mM Aze, suberic acid or an equivalent volume of MES buffer as a control and incubated for five days. Rhizosphere microbial communities were collected, DNA was extracted and 16S rRNA gene amplicon sequencing was used to assess prokaryotic diversity. Aze-treated plants displayed lower diversity and richness indices compared to controls (MES buffer and suberic acid) (Figure S5a), though these trends were not statistically significant. Taxonomic classification at the family level revealed a highly diverse microbial community across all samples (Figure S6a), but Aze- treated plants showed a distinct root microbiome compared to controls (PERMANOVA, p<0.05; Figure S5b). Differential abundance analysis with DESeq2 (p-adj<0.05) identified 75 and 78 ASVs enriched between Aze vs MES controls and Aze vs suberic acid controls, respectively (Figures S6b and S6c). Interestingly, many of the enriched ASVs belonged to typical rhizobial symbionts, such as Azospirillaceae and Rhizobiaceae. These findings suggest that Aze does not directly influence the soil microbiome but acts indirectly by priming the immune system of A. thaliana, which ultimately modifies rhizosphere composition.


Aze is a ubiquitous metabolite produced endogenously by photosynthetic organisms. It was recently revealed that Aze production by a ubiquitous diatom enables it to selectively promote growth of its bacterial symbionts while simultaneously inhibiting growth of opportunists [5]. In addition, plants use Aze to prime their immune system in response to phytopathogens [7]. Its toxic effects against the acne-causing bacterium, Propionibacterium acnes, popularized its use in skin care products [19]. Aze has also been shown to have antitumor effects on cancerous cells, and functions as a competitive inhibitor of mitochondrial oxidoreductases and steroid biosynthesis [20]. The ability of a single metabolite to selectively inhibit or promote different populations of bacteria simultaneously poses important questions about how this metabolite evolved in the eukaryotic chemical repertoire to influence their microbiome and its role in unicellular vs multicellular phototrophs.

Identifying Aze as a putative substrate of a TRAP transporter system is a novel and exciting result. Despite numerous attempts, our efforts to knock-out azeTSL in Phycobacter failed. Information about bacterial transporters and their substrates remain speculative, and much less is known about substrate specificity for bacterial TRAP transporters than other transporters, such as ABC transporters [21]. TRAP transporters consist of a substrate-binding domain and two transmembrane segments (large and small permeases), while lacking the canonical nucleotide-binding domain found in ABC transporters [22]. Recognized substrates for TRAP transporters vary from sugars, mono- and di-carboxylates, organosulfur molecules, heterocyclic carboxylates, and amino acids. Among dicarboxylates, only C4-compounds, such as succinate, fumarate, and malate, have been identified as substrates for TRAP transporters, which explains the prevalent annotation of “C4-dicarboxylate transporter” for proteins in bacterial genomes, including Phycobacter. Other than C6-adipate that binds with high affinity to a tripartite tricarboxylate transporter [23], it is not known how dicarboxylates with >4 carbons are taken up [24]. This makes AzeTSL a putative transporter of the longest chain dicarboxylate molecule known to date. The sheer abundance of dicarboxylate transporters in bacterial genomes begs the question of whether they transport a variety of Cn-dicarboxylates, such as C5-glutarate, C6-adipate, C7-pimelate, C8-suberate, C9-azelate and C10-sebacate. Some of these metabolites are produced by primary producers and thus potentially serve as growth factors or signals for heterotrophic bacteria [25]. Therefore, further work is needed to expand on the nature of substrates specific to dicarboxylate TRAP transporters.

We have shown that while Phycobacter uses fatty acid degradation and subsequently glyoxylate, dicarboxylate, and butanoate metabolism pathways to catabolize Aze, A. macleodii attempts to mitigate the toxicity of Aze by cytoplasm efflux and by downregulating its ribosome and protein synthesis pathways (Figure 4). However, the fact that A. macleodii possesses a complete fatty acid degradation pathway similar to bacteria that display inhibition by Aze is intriguing, as it is unclear why such bacteria cannot catabolize this metabolite. The answer may lie in the fact that Phycobacter, and bacteria that catabolize Aze such as P. nitroreducens, possess the transcriptional factor AzeR. Using RIF analysis, we have shown that azeR in Phycobacter putatively regulates its fatty acid degradation pathway and azeTSL, along with other pathways essential for the catabolism of Aze (Figure 5a). By activating uptake, fatty acid degradation and other essential genes, AzeR enables bacteria to efficiently metabolize Aze. In contrast, bacteria deficient in AzeR, such as A. macleodii, are presumably unable to activate these pathways rapidly and thus fall susceptible to the toxic effects of Aze. Nevertheless, A. macleodii likely uses efflux pumps and the activation of a variety of stress-related pathways mediated by the transcriptional factor XRE (Figure 5b) to recover from Aze-dependent toxicity in a relatively short period of time. While Aze exhibits a wide range of inhibitory activity, in A. macleodii, ribosome and/or protein synthesis inhibition seem to contribute to the toxicity associated with Aze, evidenced by the strong downregulation of most ribosomal protein coding genes, protein synthesis genes and proteases. This finding agrees with previous studies that demonstrated the bacteriostatic effects of Aze on protein, DNA and RNA synthesis in P. acnes and Staphylococcus epidermidis [26,27].

Aze addition to seawater mesocosms induced a statistically significant increase in the relative abundance of taxa that are capable of assimilating Aze compared to controls. Rhodobacteraceae ASVs comprised the most abundant group to be influenced by Aze additions, highlighting their adaptation to phytoplankton exudates [28,29], while MG II archaea ASVs exhibited the largest enrichment relative to controls (Figure 6d). While we cannot discount competitive interactions from inducing the significant enrichment of MG II archaea ASVs, the possibility that Aze may be a growth substrate for this important group is intriguing. The MG II archaea (Candidatus Poseidoniales) [30] remain uncultivated [31], hindering our understanding of their metabolic capacity, though isotope probing and metagenomic analysis showed its assimilation of phytoplankton-derived substrates [32]. Several MG II genera exhibited chemotaxis towards phytoplankton exudates [33] and displayed significant correlations to chlorophyll a in an oceanographic time-series [30]. Further work is needed to confirm the link, if any, between Euryarchaeota and Aze. In soil, Aze primes the plant immune response [7], which likely drives the production of a myriad of metabolites that modulate the rhizosphere microbiome. Indeed, introducing Aze directly to soil had negligible effect on the soil microbiome, while Aze introduction to A. thaliana appears to cause significant changes to the rhizosphere composition that may reflect complex interactions between the host and rhizosphere microbes (Figure S6). Further work is needed to clarify the mechanisms enabling plants to modulate rhizosphere microbiomes through Aze.

In summary, photosynthetically derived secondary metabolites play a significant role in structuring microalgal and plant microbiomes. In the oceans, they are hypothesized to influence algal blooms, succession of microbial populations and global biogeochemical cycles. However, the perception and response of these metabolites in microbial heterotrophs remain largely unknown. This study set out to unravel the molecular mechanisms of Aze on bacteria that enables eukaryotic hosts to modulate their associated microbiome. More broadly, we demonstrate that bacterial lineages occupying different ecosystems, such as the phycosphere surrounding microalgal cells, may have gained an ecological advantage over others by evolving mechanisms that assimilate Aze. These findings expand understanding of bacterial TRAP- transporters and their potential substrates, while also highlighting the importance of linking ubiquitous transporter systems to chemical currencies identified in the ocean. While Aze assimilation represents an important mechanism to structure the microbiome of primary producers, it is likely one of many such metabolites that together work in concert to ensure the maintenance of healthy microbial communities, which ultimately influence higher trophic levels and control global biogeochemical cycling.

Online Materials and Methods

Growth of bacterial isolates with Aze

To test if Phycobacter could use Aze as a sole carbon course, 2 mL of an overnight culture grown in Zobell marine broth34 was centrifuged and washed three times with modified mineral media (K2HPO4 was used instead of KH2PO4) [34] lacking a carbon source. Cells were then used to inoculate 5mL cultures of filter-sterilized mineral media supplemented with 1 mM Aze, 1 mM L-glutamate, or lacking a carbon source. While 1 mM is typically high compared for environmental concentrations, it is on par with amounts of carbon sources added to bacterial culture. Cultures were incubated in the dark with shaking at 26°C. Growth was monitored in 96- well plates and measured using absorbance at 600 nm (OD600) with an Epoch 1 Microplate Spectrophotometer (Biotek).

To test the transcriptional response of Phycobacter and A. macleodii to Aze, cells were grown as previously described [5] and inoculated into cultures at a starting density of ∼7.5x105 cells/mL) and OD600 0.2, respectively, in 10% marine broth. This medium was used instead of a no-carbon-source medium to enable the cells to grow in controls and thus shed light on uptake and assimilation mechanisms of Aze while removing the transcriptional responses of cellular growth/division from the transcriptomes. Cultures were supplemented with either 500 μM filter- sterilized Aze (Sigma-Aldrich) or an equal volume of filter-sterilized Milli-Q water. All cultures were shaken in a shaker-incubator at 26°C. Cells were either centrifuged at 13,000 x g or were filtered onto 0.22 μm Sterivex cartridges (Merck Millipore, Burlington, MA, USA) using a peristaltic pump at a flow rate of 40 mL/min. All cells were flash frozen in liquid nitrogen and stored at -80°C until RNA extraction.

RNA extraction and sequencing

Cartridges containing Phycobacter cells were thawed on ice before extraction, while cell pellets of A. macleodii were processed directly from the freezer. Filter membranes were removed with sterile tweezers from cartridges and placed directly in the RLT lysis buffer (Qiagen). RNA was extracted from cell pellets and filter membranes using the RNeasy Mini Kit (Qiagen) according to the manufacturer’s protocol. RNA samples were treated with DNase I (Thermo Fisher Scientific) to eliminate genomic DNA contamination and sent to NovogeneAIT Genomics (Singapore) for library preparation using the NEBNext Ultra RNA Library Prep Kit and paired-end (2x150 bp) sequencing on the Illumina NovaSeq 6000 (San Diego, CA) platform. All time points were carried out in three biological replicates except for Phycobacter at 0.5 hours, which had six replicates. All samples passed RNA QC except for one control replicate from Phycobacter at 0.5 hours.

RNAseq analysis

Raw reads of Phycobacter and A. macleodii samples were processed using fastp v0.22 [35] for quality filtering, adapter removal, and trimming. The resulting sequences were then aligned with their respective genomes (WKFH00000000.1 and CP046140.1) using Bowtie2 v2.3.5 [36]. Resulting BAM files were processed on SAMtools v1.12 [37] and read counts were generated with featureCounts v2.0.3 [38]. Differential expression analysis between treatment and control samples was done using DESeq2 v3.14 [39] on R v4.1 (R Core Team). Genes were considered differentially expressed (DE) if they had a p-adjusted value of <0.05 and log2 fold change of ≥± 0.5. Pearson coefficient calculations and correlation plots were done using the R gplots package [40]. DE genes were fed into the Kyoto Encyclopedia of Genes and Genomes (KEGG) [41] database to infer pathways. Scatter plots of genes in enriched KEGG pathways and bar plots of their expression ratios were generated using the R ggplot2 package [42].

Isotope labeling and metabolomics

Phycobacter cells were grown as described above. Filter-sterilized Aze-13C9 (Sigma- Aldrich) was added to triplicate cultures with an OD600 of 0.3 at a final concentration of 500 μM and cultures were shaken in a shaker-incubator at 26°C. Two-mL samples were collected at 0, 15, 30, 60 and 120 minutes after addition of Aze and centrifuged at 13,000 x g for 5 minutes at 4°C, resuspended and washed with 35 g/L NaCl in PBS, then centrifuged and resuspended in ice cold 100% methanol. Subsequently, samples were sonicated for 2 minutes on ice, then centrifuged at 4°C for 5 minutes. Supernatant was collected and dried under nitrogen flow, then the extract was stored at -20°C until analysis using mass spectroscopy.

Samples for direct infusion measurements are prone to salt interference and thus solid- phase extraction with PPL-bond elute columns (Agilent Technologies, US) was applied. High- resolution mass spectra were acquired on a Bruker solariX XR Fourier-transform ion cyclotron resonance mass spectrometer (FT-MS) (Bruker Daltonics GmbH, Germany) equipped with a 7T superconducting magnet and operated in ESI (+) ionization mode with 0.5 bar nebulizer pressure, 4.0 L dry gas and a 220°C capillary temperature. Source optics parameters were 200V at capillary exit, 220V deflector plate, 150V funnel 1, 15V skimmer 1, 150Vpp funnel RF amplitude, octopole frequency of 5 MHz and an RF amplitude of 450Vpp. Para Cell parameters were set to -20V transfer exit lens, -10V analyzer entrance, 0V side kick, 3V front and back trap plates, -30V back trap plate quench and 24% sweep excitation power. For sample injection, a Triversera Nanomate (Advion BioSciences Inc., US) with 1 psi gas pressure and ESI (+) voltage of 1.7kV was used. Three-hundred and twenty scans were accumulated for each sample with an accumulation time of 200 ms and acquired with a time domain of 4 mega words over a mass range of m/z 75 to 1200, at an optimal mass range from 200-600 m/z. Spectra were internally calibrated using primary metabolites (e.g., amino acids, organic acids) in Data analysis 5.0 Software (Bruker Daltonics GmbH, Germany). The FT-MS mass spectra were exported to peak lists with a cut-off signal-to-noise ratio (S/N) of 4. The masses for relevant 13C isotope-labeled metabolites were calculated using enviPat: isotope pattern calculator47. MSMS was performed at 15V, 25V and 35V collision voltage and characteristic fragments identified manually.

Azelaic acid uptake

Phycobacter and A. macleodii cells were grown as described above. Cells were centrifuged and resuspended in triplicate flasks with 10% Zobell marine broth at an OD600 of 0.3. Filter-sterilized Aze-13C9 (Sigma-Aldrich) was added to each culture flask at a final concentration of 10 μM and cultures were incubated in a shaker-incubator at 26°C. A lower concentration of Aze-13C9 was used here to allow for an accurate quantification of its uptake. Triplicate samples (20 mL) were collected at 0, 15, 30, 60 and 120 minutes after addition of Aze and centrifuged at 5,000 × g for 15 minutes. Samples were subsequently fixed with paraformaldehyde (1% final concentration) in artificial seawater48 for 1 hour at 4°C. Samples were then washed twice with artificial seawater to eliminate any residual paraformaldehyde and cell pellets were dried in an oven at 60°C for 48 hours. A total of 0.50 mg of each dried cell pellet was weighed in a tin capsule (IVA Analysentechnik, Meerbusch, Germany) and loaded onto the autosampler of an elemental analyzer (EA) connected to isotope ratio mass spectrometer (IRMS). In EA, samples were combusted in the presence of oxygen added to helium stream at a temperature of 1000 °C in a reactor comprised of Cr2O3 (Sercon, UK) and silvered oxides of cobalt (Sercon, UK). The yield gasses were carried through reduction reactor (650 °C, electrolytic copper (Sercon, UK)), water traps (granular magnesium perchlorate, Sercon, UK) and GC column in a stream of helium (grade 99.999 % purity; BOC, Australia, 100 mL/min) [43]. All stable isotope results are reported as 1000 of δ(13C) value in permille (‰) after normalization to VPDB scale [44] using international standards provided by the International Atomic Energy Agency (IAEA): δ13C-USGS40 (-26.39 ‰), USGS41 (37.63 ‰), NBS22 (-30.03 ‰), USGS24 (-16.05 ‰), IAEA603 (2.46 ‰) [45]. The combined standard measurement uncertainty of the δ(13C) did not exceed 0.10 ‰ (1σ).

To calculate the uptake rate of Aze, we quantified: the amount of carbon in bacterial cells at all time points and determined how much was derived from the medium (stable isotope composition of 1.0852 atm%), and how much was derived from 13C-Aze (stable isotope composition of 99 atm%). The results were converted to 13C fractions prior recalculation of the uptakes of azelaic acid using EasyIsoCalculator spreadsheet [46].

Seawater mesocosms and 16S rRNA gene amplicon sequencing

Seawater samples were collected in July 2021 from surface waters off the coast of Saadiyat Island (24°38’28.6"N 54°27’09.4"E), United Arab Emirates, kept in the dark, and immediately brought back to the lab. The seawater was filtered through a 1.2-μm filter (Whatman, UK) to remove most phytoplankton and divided into 12 vessels each containing 30 mL to carry out a mesocosm experiment. Control replicates (n=4 at T=0 hours) were immediately filtered onto a 0.2-μm filter (Whatman, UK) to capture microbial cells. Treatment replicates (n=4) with 100 μM Aze and another set of control replicates (n=4) with an equal volume of Milli-Q water added in lieu of Aze were incubated at 24°C for 16 hours in the dark (T=16 hours). We used 100 μM Aze in this case to avoid potential uptake, degradation, and precipitation of this molecule over the long course of the experiment. After incubation, they were immediately filtered onto a 0.2-μm filter (Whatman, UK). Genomic DNA was extracted immediately after filtration from all filters using the DNeasy PowerWater Kit (Qiagen) and sent to NovogeneAIT Genomics (Singapore) for 16S rRNA gene amplification using the 515F-Y (5′- GTGYCAGCMGCCGCGGTAA-3′) and 926R (5′-CCGYCAATTYMTTTRAGTTT-3′) universal primers and paired-end (2x250 bp) sequencing on the Illumina NovaSeq 6000 (San Diego, CA) platform.

Clean raw reads were processed with the rANOMALY R package [47] using the DADA2 R package [48] to generate amplicon sequence variants (ASVs). Taxonomic classification was based on the SILVA v138 database. Alpha-diversity was assessed using the richness indices; observed OTUs and Chao1, and diversity indices; Shannon and Simpson. Beta-diversity was assessed and visualized by principal coordinate analysis (PCoA) based on pairwise Bray-Curtis distance and tested by the permutational analysis of variance (PERMANOVA). The differential abundance of taxa across the treatment and control samples was calculated with DESeq2 v3.14 [39] with a p-adjusted value cutoff of <0.05. All plots were generated using the ggplot2 and phylosmith [49] R packages and all statistical analyses were performed on R 4.1.2.

Soil supplementation, Arabidopsis thaliana Aze infusion, incubations and 16S rRNA gene amplicon sequencing

Two hundred and fifty milligrams of potting soil (Planting Mix No.1; ACE Hardware) was supplemented with 300 µL of 100 µM Aze, suberic acid or Milli-Q water and incubated at 24°C, in the dark for 16 hours. Following incubation, the soil samples were flash frozen in liquid nitrogen and stored at -80°C until DNA extraction.

Arabidopsis thaliana seeds (Col-0) were sowed into potting soil and allowed to germinate under short day conditions (8 hours light/24°C and 16 hours dark/22°C) in a plant growth chamber (Fitoclima 600, Aralab, Rio de Mouro, Portugal). One-week old healthy seedlings (n=15) were transplanted into round pots containing ∼90 g of fresh potting soil, with one seedling per pot. Three additional pots were designated as “plant-free” that received no seedlings. Pots were covered with cling film for 48 hours to maintain humidity after transplantation. All pots (including plant-free controls) were spatially randomized in the growth chamber and allowed to grow for three additional weeks.

One millimole treatment stocks of Aze (Sigma-Aldrich) and suberic acid (Sigma-Aldrich) were prepared in 5 mM 2-(N-Morpholino) ethanesulfonic acid (MES) buffer (Sigma-Aldrich) (pH 5.4) and filter sterilized through 0.2 µm Sterivex filters. Each plant was syringe-infiltrated with 500 µl Aze, suberic acid (equivalent to 2 µmol) or MES buffer (n=5 for each treatment) through the abaxial surface of three leaves. Infiltrated leaves were gently dried with absorbent paper and marked with a felt-tipped pen. Plant-free pots (n=3) were left untreated, and all pots were incubated for five days, as described above. Plants roots and plant-free controls were harvested, and the rhizosphere microbial communities were collected as described in Lebeis et al., 2015 [50] with modifications. Briefly, above ground plant organs were removed, and loose soil was physically removed until ∼1 mm of soil remained on the root surface. Roots were placed in 25 mL sterile MES buffer in 50 mL centrifuge tubes. Rhizosphere microbial communities were collected by first vortexing the roots in MES buffer for 15 seconds at maximum speed and subsequently sonicating for 5 minutes using an Ultrasonic Cleaner (VWR) at power level 3 to liberate tightly adherent and endophytic microbes. The root system was removed aseptically, and the turbid solution was filtered through a 100 µm nylon filter to remove large debris and plant material. The filtrate was centrifuged at 3,200 × g for 30 minutes and most of the supernatant discarded. The pellet was resuspended in the remaining supernatant (∼200-400 µL) and transferred to sterile 1.5 mL microfuge tubes and centrifuged for 5 minutes at 10,000 × g. The supernatant was removed, and the remaining pellets were flash frozen in liquid nitrogen and stored at -80°C until DNA extraction. For plant-free pots, approximately 250 mg of soil was collected 2 cm below the surface, transferred to MES buffer and processed and stored as described above.

Microbial genomic DNA was isolated using the DNeasy PowerSoil Pro Kit (Qiagen) according to the manufacturer’s instructions and samples were sequenced at NovogeneAIT Genomics (Singapore) using 16S rRNA gene amplification using the primers 515F (5′- GTGCCAGCMGCCGCGGTAA-3′) and 806R (5′-GGACTACHVGGGTWTCTAAT-3′) [10] and paired-end (2x250bp) sequencing on the Illumina NovaSeq 6000 (San Diego, CA) platform. Clean raw reads were processed with the same pipeline as the seawater samples described above.

Transcriptional master regulator analysis

We used the regulatory impact factor (RIF) algorithm [13] to detect transcription factors (TFs) with high regulatory potential contributing to the observed transcriptional remodeling upon Aze uptake. RIF is designed to identify key regulatory loci contributing to transcriptome divergence between two biological conditions. Predicted TF lists were obtained from the genome annotation (GFF file) of both Phycobacter and A. macleodii and normalized data (variance stabilized, log2-transformed counts) of these genes were retrieved. The same normalization strategy was applied to the DE genes. An expression matrix containing normalized expression of the TF genes and DE genes were subjected to RIF analysis per species. RIF analysis identifies regulators that are consistently differentially co-expressed with highly abundant and highly DE genes (RIF1 metric) and regulators that have the most altered ability to act as predictors of the abundance of DE genes (RIF2 metric). TFs were considered significant when the RIF score deviates ±1.96 standard deviation from the mean (t-test, p<0.05).

Transcriptional coexpression networks (TCNs)

The Partial Correlation and Information Theory (PCIT) [51] has been extensively used for gene network analysis [52,53]. We utilized PCIT to detect significant connections (edges) between pairs of genes (nodes) while considering the influence of a third player (gene). PCIT calculates pairwise correlations between given gene pairs after considering all possible three- way combinations of genes (triads) present within a gene expression matrix. The algorithm calculates partial correlations after exploring all triads before determining the significance threshold that depends on the average ratio of partial and direct correlation. The set of key regulatory TFs (identified by RIF analysis) and DE genes per species were used for construction of the networks. Normalized data (variance stabilized, log2-transformed counts) of these genes were used for network construction. The PCIT-inferred networks were visualized using Cytoscape v3.9 [54]. From these initial networks, we explored a series of subnetworks; first connections between nodes were considered when the partial correlation r was >± 0.95. From these networks, hub genes (potential regulatory components within the network) and their connected genes (first neighbors) were extracted based on: 1) key regulatory factors (with highest RIF scores), 2) differential expression significance, and 3) degree centrality (the number of connections of a node with other nodes in the network).

Data availability

RNAseq raw reads of Phycobacter are deposited in NCBI under the BioProject number PRJNA823575. RNAseq raw reads of A. macleodii are deposited in NCBI under the BioProject number PRJNA823732. 16S rRNA gene amplicon sequencing raw reads from the seawater mesocosm and Arabidopsis root microbiome are deposited in NCBI under the BioProject numbers PRJNA823745 and PRJNA841046, respectively. Mass spectral datasets are available in the MassIVE database under accession MSV000089221. All software packages used in this study are open source. Bacterial strains used are available upon request from the corresponding author.


This work was supported by a grant from the Gordon and Betty Moore Foundation to S.A.A. (GBMF9335, and by a grant from NYU Abu Dhabi to S.A.A. (AD179). We thank the NYU Abu Dhabi Core Technology Platforms for mass spectrometry support. We also thank Dain McParland for helping collect seawater samples.

Competing interests

The authors declare no competing interests.

Figure legends; main and supplementary

Effect of Aze on growth of Phycobacter.

a, as a sole carbon source (1 mM) and b, in comparison to a control. For (b) Aze was added at T=0 hours to treatments while controls received an equivalent volume of Milli-Q water. RNA samples were collected at the times marked by arrows (T=0.5 hours and T=8 hours). Error bars represent standard deviation of triplicate cultures.

13C-Aze assimilation by Phycobacter and A. macleodii (reported as

δ13C [‰, VPDB]) during a two-hour incubation. Asterisks denote significant differences between the two bacteria (repeated measure ANOVA, n=3, p<0.001). Error bars indicate the standard deviation of the mean.

Transcriptional coexpression networks constructed using the Partial Correlation coefficient with Information Theory algorithm in Phycobacter and A. macleodii.

Initial networks (a, d) consisted of key transcriptional factors (TFs) (identified by regulatory impact factor analysis) and differentially expressed genes. Nodes are depicted either as circles for genes or triangles for TFs. Edges represent significant interactions between nodes. Edge color represents directions of the interaction (red=positive correlation and blue=negative correlation). The size of the node corresponds to the normalized mean expression values in Aze-treated samples. Subnetworks were extracted from initial networks based on significant co-expression correlation (PCIT; r ≥ ±0.95) (b, e) and those containing only hub genes (identified based on RIF scores, differential expression, and the degree centrality) and their connected genes (c, f).

Effect of Aze treatment on the alpha and beta-diversity of soil.

a, Alpha-diversity indices of observed OTUs, Chao1, Shannon and Simpson across the Aze-treated and control samples. b, PCoA of Bray-Curtis distances (PERMANOVA: R2 = 0.73; p<0.001) between samples. The two principal components (PCoA1 and PCoA2) explained 25.1% and 14.5% variance, respectively.

Effect of Aze treatment on the alpha and beta-diversity of A. thaliana root microbiome.

a, Alpha-diversity indices of observed OTUs, Chao1, Shannon and Simpson across the Aze-treated and control treatments. b, PCoA of Unweighted Unifrac distances between Aze-treatment and controls (PERMANOVA, p<0.05). The two principal components (PCoA1 and PCoA2) explained 13.2% and 10.8% variance, respectively.

Bacterial diversity in the roots of A. thaliana treated with Aze.

a, Relative abundance of the top 25 microbial families based on 16S rRNA gene amplicon sequencing of A. thaliana root- microbiome following Aze infiltration. Plant-free control is untreated soil and MES buffer and suberic acid are treatment controls. b, Distribution of the ASVs belonging to significantly differentially abundant taxa between the Aze-treated and MES buffer samples according to their log2 fold-change and p-adjusted values. The bubble color denotes the species belonging to each genus on the x-axis. c, Distribution of the ASVs belonging to significantly differentially abundant taxa between the Aze-treated and suberic acid samples according to their log2 fold-change and p-adjusted values. Bubble colors denote the species belonging to each genus on the x-axis.

Supplementary Table Legends

Table S1. The number of differentially expressed (DE) genes for Phycobacter and Alteromonas macleodii F12 in response to azelaic acid. Percent values represent the ratio of genes in each category to the total number of genes in each genome. Genes were considered DE with a p-adjusted value of <0.05 and a log2 fold-change of ≥ ±0.50.

Tables S2. List of selected differentially expressed genes by (a) Phycobacter and (b) A. macleodii in response to azelaic acid addition. Genes with no values indicate no statistically significant differential expression compared to controls. Differential expression was defined as having log2FC ≥ 0.5 and adjusted p-value ≤ 0.05. Genes predicted to be in a single operon structure are listed in the direction of transcription within a single box, where relevant. Genes Operon structures were predicted by OperonMapper. Genes common to more than one pathway are listed only once.

Table S3. List of all differentially expressed genes by (a) Phycobacter and (b) A. macleodii in response to azelaic acid addition. A cutoff of adjusted p-value < 0.05 was used for statistically significant differential gene expression.

Table S4. Characteristics of 13C-Aze uptake by Phycobacter and A. macleodii. a, Carbon fractionation and uptake rate of bacteria exposed to 13C-Aze. b, 13C-labeled metabolites detected in Phycobacter after incubation with 13C-Aze.

Table S5. Transcriptional factors (TFs) with significant RIF score and their differential gene expression in Phycobacter and A. macleodii.