Chelating substructures found in bacterial NRP metallophores and their biosynthetic pathways.

(A) Representative NRP metallophore structures. Nearly all known NRP metallophores contain one or more of the eight labeled chelating groups. Most chelating groups provide bidentate metal chelation, as shown for ferric pyoverdine L48. (B) Chelator biosynthesis pathways that form the basis for the new antiSMASH detection algorithm, as described in the text. The same chelator colors are used in each figure.

Summary of NRP metallophore BGC detection, comparing the chelator-based rules newly implemented in antiSMASH, the transporter-based method of Crits-Christoph et al.,41 and a combined either/or ensemble.

a Detection methods were each tested on a set of 758 manually annotated NRPS BGC regions (180 true positives). Full results are given in Supplemental Table 2.
b Detection methods were applied to 15,562 NCBI RefSeq representative bacterial genomes. The full results are given in Supplemental Table 3. A region is “complete” if it is not on a contig edge, as determined by antiSMASH.
c F1 score is equal to 2×(Precision×Recall)/(Precision+Recall).
d Percentages indicate the fraction of NRPS regions that were predicted to encode NRP metallophores.
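The F1 definition in footnote c and the either/or ensemble can be illustrated with a minimal Python sketch. The function names and the toy region identifiers below are hypothetical and are not taken from antiSMASH or from the supplemental tables; the sketch assumes that each method reports a set of region identifiers and that the ensemble is the union of the two sets.

```python
def precision_recall_f1(predicted, truth):
    """Return (precision, recall, F1) for a set of predicted region IDs."""
    tp = len(predicted & truth)   # regions correctly called metallophore
    fp = len(predicted - truth)   # regions called metallophore in error
    fn = len(truth - predicted)   # true metallophore regions that were missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * (precision * recall) / denom if denom else 0.0
    return precision, recall, f1


def either_or_ensemble(chelator_hits, transporter_hits):
    """A region is positive if either method detects it (set union)."""
    return chelator_hits | transporter_hits


# Toy data (hypothetical region IDs, for illustration only).
truth = {"r1", "r2", "r3", "r4"}
chelator_hits = {"r1", "r2", "r5"}
transporter_hits = {"r2", "r3"}

ensemble_hits = either_or_ensemble(chelator_hits, transporter_hits)
precision, recall, f1 = precision_recall_f1(ensemble_hits, truth)
```

In this toy case the ensemble recovers three of the four true regions and adds one false positive, so precision, recall, and F1 are all 0.75.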

An upset plot of chelator frequency among 2,489 complete NRP metallophore BGC regions from RefSeq representative genomes.

An additional 38 BGC regions were detected by metallophore-specific NRPS domains (VibH-like or tandem Cy) rather than chelator biosynthesis, and may produce catechol and/or salicylate metallophores using biosyntheses encoded elsewhere in the genome.
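The counts behind an upset plot are tallies of exact chelator combinations per BGC region. A minimal sketch follows, assuming each region is annotated with the set of chelator types detected in it; the function name, region identifiers, and chelator labels are illustrative placeholders rather than actual results.

```python
from collections import Counter


def chelator_combination_counts(regions):
    """Count regions per exact combination of detected chelator types.

    `regions` maps a region ID to the set of chelator types found in it.
    """
    return Counter(frozenset(chelators) for chelators in regions.values())


# Toy annotations (hypothetical, for illustration only).
regions = {
    "region_1": {"catechol"},
    "region_2": {"catechol", "hydroxamate"},
    "region_3": {"hydroxamate", "catechol"},
    "region_4": {"salicylate"},
}

counts = chelator_combination_counts(regions)
```

Each key of `counts` corresponds to one intersection column of the plot, and the per-chelator totals (the side bars) can be obtained by summing over all combinations that contain a given chelator.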

BiG-SCAPE similarity network of complete NRP metallophore BGC regions from RefSeq representative genomes.

Numbered square nodes indicate published BGCs, as given in Supplemental Table 1. Select hybrid metallophore BGC nodes are highlighted yellow, and their corresponding structures are shown. Nodes are colored by the type(s) of chelator biosynthesis detected therein. BGC regions colored light gray contain only metallophore-specific NRPS domains (VibH-like or tandem Cy) and may produce catechol and/or salicylate metallophores using biosyntheses encoded elsewhere in the genome. The network was constructed in BiG-SCAPE v1.1.2 using 2,596 BGC regions as input, including 78 reference BGCs, and a distance cutoff of 0.5.
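The effect of a distance cutoff on such a network can be sketched as follows. This is not BiG-SCAPE code: it is a simplified illustration that assumes pairwise distances are already available, keeps an edge when the distance is at or below the cutoff (the exact comparison used by BiG-SCAPE is an assumption here), and reports the connected components. All names and distances are hypothetical.

```python
def similarity_components(nodes, distances, cutoff=0.5):
    """Group nodes into connected components using edges with distance <= cutoff."""
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent links to the component root, shortening the path as we go.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for (a, b), dist in distances.items():
        if dist <= cutoff:
            parent[find(a)] = find(b)  # merge the two components

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    return sorted(groups.values(), key=lambda g: sorted(g))


# Toy input (hypothetical BGC names and distances).
nodes = ["bgc_a", "bgc_b", "bgc_c", "bgc_d"]
distances = {
    ("bgc_a", "bgc_b"): 0.30,
    ("bgc_b", "bgc_c"): 0.45,
    ("bgc_c", "bgc_d"): 0.80,
    ("bgc_a", "bgc_d"): 0.90,
}

components = similarity_components(nodes, distances, cutoff=0.5)
```

With these toy values, `bgc_a`, `bgc_b`, and `bgc_c` fall into one component and `bgc_d` remains a singleton; a stricter cutoff would split the network into more, smaller components.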

Identification of siderophores predicted from genome mining.

(A) Chemical structures of marinobactins A-E,42 produced by Terasakiispira papahanaumokuakeensis DSM 29361; enterobactin,43 produced by Buttiauxella brennerae DSM 9396; and pyoverdine A214,40 and ornicorrugatin,39 both produced by Pseudomonas brassicacearum DSM 13227. The position and orientation of the fatty acid desaturation in marinobactins B and D were not determined in this work. (B-D) High-pressure liquid chromatography/high-resolution mass spectrometry (HPLC-HRMS) total ion chromatograms of culture supernatant extracts, overlaid with extracted ion chromatograms for siderophore features. Additional details and spectra are provided in the Supplemental Methods and Results.

Taxonomic distribution of 4,953 NRP metallophore BGC regions detected in 59,851 GTDB representative bacterial genomes.

Phylum nomenclature is preserved from GTDB r207. An additional 413 BGC regions with “unknown” taxonomy are not included here. Phyla not listed had zero detected regions.

NRP metallophore biosynthesis across the bacterial domain.

Center: The Genome Taxonomy Database (GTDB) phylogenetic tree (version r207), with strains collapsed to the REDgroup level.46 Numbered circles indicate the most parsimonious origins of chelator pathways, as determined by reconciliation with eMPRess.47 The bottom-right legend lists the specific profile hidden Markov models (pHMMs) associated with each estimated origin. Arrows indicate ancient horizontal gene transfers predicted by eMPRess. Ring A: Phylum-level divisions. Phyla with detected chelating groups are labeled using nomenclature from GTDB r207. Ring B: Chelator biosynthetic pathways detected in at least one member of each REDgroup. Ring C: Average number of detected NRP metallophore BGC regions per genome for each REDgroup. Annotations were mapped to the phylogenetic tree using iTOL v6.49