RdnE homologs act as DNA endonucleases and contain interchangeable domains.

A) Cell viability (colony forming units [CFU] per mL) after protein production in swarms of P. mirabilis strain idrD*, which does not produce RdnE or RdnI. Cells produced GFPmut2, RdnE, or mutant variants in the predicted PD-(D/E)XK motif: D39A, E53A, K55A, or all three. B) In vitro DNA degradation assay for ProteusRdnE. Increasing concentrations of a negative control, ProteusRdnE-FLAG, or ProteusRdnED39A-FLAG were incubated with methylated or unmethylated lambda DNA (48,502 bp) and analyzed by gel electrophoresis. Plasmid DNA degradation is shown in Figure 1 - Figure supplement 1. C) In vitro DNA degradation assay for domain deletions of ProteusRdnE. The first construct removed the first alpha helix without disturbing the catalytic residues, and the second construct contained the PD-(D/E)XK motif and removed region 2. Increasing concentrations were analyzed as in (B). D) Multiple sequence alignment between P. mirabilis and R. dentocariosa RdnE sequences. The black bar marks the PD-(D/E)XK motif, and the gray bar marks the variable region 2 domain. Conserved residues are highlighted in dark blue. Secondary structure predictions were identified using Ali2D [55,56] (h for alpha helix, e for beta sheet); the catalytic residues (stars) are noted above the alignment. E, F) In vitro DNA degradation assay and analysis as in (B). E) Increasing concentrations of either a negative control, RothiaRdnE-FLAG, or RothiaRdnED39A-FLAG. F) The PD-(D/E)XK motifs were swapped between the RothiaRdnE (orange bar) and the ProteusRdnE (green bar) sequences and compared to the wild-type RdnE proteins.

RdnE, an endonuclease, is lethal in Escherichia coli and cuts plasmid DNA.

A) Growth curve of E. coli cells overexpressing ProteusRdnE or variants with mutations in the PD-(D/E)XK active site. Cells were grown at 37°C for 16 hours. Optical density at 595 nm (OD595) was measured every half hour. A strain carrying an empty vector was used as the negative control. B) Micrographs of E. coli cells producing ProteusRdnE-FLAG or ProteusRdnED39A-FLAG, isolated during mid-logarithmic growth and imaged. DAPI was used to detect DNA within cells. Top, the empty vector as a negative control. Middle, E. coli producing ProteusRdnE from an inducible plasmid. Bottom, E. coli producing ProteusRdnED39A from an inducible plasmid. Left, phase; right, fluorescence. C) Anti-FLAG western blot for ProteusRdnE-FLAG and ProteusRdnED39A-FLAG generated by in vitro translation. Protein levels were determined by comparison to a standard dilution of FLAG-BAP. A negative control (DHFR) without a FLAG tag was also produced with the in vitro translation reaction. A vertical orange line separates the membrane where the ladder was marked with a pencil after transfer (to the left) and the membrane after western blot detection (to the right). D) In vitro DNase assay reactions on cut and uncut plasmid DNA. In vitro translation products of either a negative control (DHFR), ProteusRdnE-FLAG, or ProteusRdnED39A-FLAG were incubated with cut or uncut plasmid DNA and analyzed by gel electrophoresis.

RdnI binds to and protects against RdnE in vivo and in vitro.

A) Domain architecture for the idr locus in P. mirabilis strain BB2000 [18]. At the top are genes with Pfam domains listed below them. Gray boxes denote PAAR and Rhs domains in the N-terminal region of the full-length IdrD protein. B) Micrographs of P. mirabilis strain idrD* cells carrying an empty vector, a vector for producing RdnE, or a vector for producing RdnE and RdnI. DNA was visualized by DAPI stain. Phase, left; fluorescence, right. C) Swarm competition assay [18] of wild-type P. mirabilis strain BB2000 (donor) competed against the vulnerable target, which is P. mirabilis strain ATCC29906 carrying an empty vector, a vector for producing RdnI-StrepII, or a vector for producing GFP, both under the fla promoter. Left: schematic of the swarm competition assay, where the top left colony is BB2000, the top right colony is ATCC29906 with its vector cargo, and the bottom colony is a 1:1 mixture of BB2000 and ATCC29906 with its vector cargo. Gray boxes underneath indicate whether BB2000 (top) or ATCC29906 (bottom) dominates in the 1:1 mixture, and white arrows point to a boundary line that forms between different strains. D) Bacterial two-hybrid (BACTH) assay with RdnED39A-FLAG, RdnI-StrepII, and GFPmut2. The colorimetric change was discerned in the presence of the substrate X-gal and inducer IPTG. E) An anti-FLAG batch co-immunoprecipitation of RdnED39A-FLAG and RdnI-StrepII. RdnED39A-FLAG or exogenous FLAG-BAP (soluble fraction) was incubated with anti-FLAG resin (FLAG flow through). RdnI-StrepII was then added to the resin (RdnI-StrepII flow through). Any proteins bound to resin were eluted with FLAG-peptide (Elution) and analyzed by anti-FLAG and anti-StrepII western blots.

RdnI offers protection against and binds to RdnE.

A) Viability assays of E. coli cells after production of RdnE, RdnI, or co-production of RdnE and RdnI within a cell. Cells were assayed for colony forming units per milliliter over a six-hour time course. B) The Coomassie blue-stained gel for the anti-FLAG batch co-immunoprecipitation assay results shown in the main text, Figure 2E.

RdnE and RdnI protein families share conserved residues and predicted structures.

A) Gene neighborhoods for RdnE and RdnI homologs. Listed are gene neighborhoods, relevance (Agr: agriculture; Med: medical; Env: environmental), niche, and the site of isolation, which we identified using IMG/M from the Joint Genome Institute. Colors highlight conserved functions or genes (not to scale). B) Phylogenetic tree based on NCBI taxonomy. Scale is located below the graph. The colored circles represent phyla (green: Actinobacteriota; yellow: Firmicutes; blue: Bacteroidota; pink: Proteobacteria). C) Unrooted maximum likelihood trees of the RdnE (left) and RdnI (right) homologs. Trees were created with RAxML [70], and the scale is annotated below. The colored circles represent phyla (same as in B). D) Protein alignments overlaid with either predicted secondary structures (top) or conserved residues (bottom) of the RdnE and RdnI homologs. MUSCLE alignments [63] are highlighted by secondary structures (red: alpha helices, light blue: beta sheets), or conserved residues (dark blue). White represents gaps in the protein alignment. The bars below mark the predicted conserved and variable domains. E) Alignments of AlphaFold2 predictions for RdnE and RdnI sequences from P. mirabilis (green), R. dentocariosa (orange), P. jejuni (magenta), and P. ogarae (dark blue). Structures were generated using ColabFold [40] and aligned using PyMOL.

RdnE and RdnI protein families show conserved structures.

A) Diagram detailing the methodology used for identifying sequences homologous to RdnE and RdnI. Seven homologs of ProteusRdnE (orange) found using BLAST [37], with their corresponding downstream genes (pink), were aligned and used as the seed for a sequential search using HMMER search [38] and the Ensembl database [64]. Gene neighborhoods were analyzed for genomes with adjacent rdnE and rdnI genes. B) Tanglegram [71] of the RdnE and RdnI protein families from the 21 sequences. On the left is the maximum-likelihood tree for the RdnE protein family; on the right is the maximum-likelihood tree for the RdnI protein family. Black lines match effector and immunity pairs from the same species. C-F) MUSCLE alignment [63] of RdnE (C and D) and RdnI (E and F) protein families highlighted with either secondary structure predictions (C and E) or conserved residues (D and F). Secondary structures were predicted using Ali2D [65,66] and are shaded by confidence. Predicted α-helices are in pink; β-strands are in light blue. Conserved residues were highlighted (dark blue) using Jalview [63]. Black lines underneath mark the truncated variants of RdnI described in the main text Figure 4.

AlphaFold2 predictions for RdnE and RdnI homologs.

A) Confidence scores (pLDDT) for AlphaFold2 [39,40] predictions for RdnE sequences from P. mirabilis, R. dentocariosa, P. jejuni, and P. ogarae. The confidence score (y-axis) for each residue (x-axis) is graphed for the five ranked models (rank 1: blue, rank 2: orange, rank 3: green, rank 4: red, rank 5: purple). B) RdnE AlphaFold2 rank 1 models. Models were colored by confidence scores. Red indicates high confidence (90-100%) while blue indicates low confidence (30-50%). C) Confidence scores for AlphaFold2 predictions of RdnI homologs from P. mirabilis, R. dentocariosa, P. jejuni, and P. ogarae. D) The AlphaFold2 rank 1 model, including both the BB2000 sequence and natural variant, was colored by confidence scores, where red is high confidence (90-100%) and blue is low confidence (30-50%).

RdnE and RdnI homolog species.

The RdnI protein family can offer cross-protection due to an interchangeable conserved domain that is critical for function.

A) Sequence logo of the RdnI conserved motif. Stars indicate the seven analyzed residues. B) Swarm competition assay with ATCC29906 producing either RdnI-StrepII or RdnI7mut-StrepII, which contains mutations in all seven conserved residues. For ease of cloning, we used a sequence-optimized (SO) RdnI construct that had a higher GC content and encoded an identical amino acid sequence. Left: schematic of the swarm competition assay as in Figure 2. Gray boxes indicate which strain dominated over the other. White arrows point to the boundary formed between different strains. C) BACTH assay of RdnED39A-FLAG with SO RdnI-StrepII or RdnI7mut-StrepII. GFPmut2 was used as a negative control. D) Swarm competition assay with ATCC29906 expressing either the wild-type RdnI or an RdnI truncation. The three truncations were in the first alpha helix (amino acids 1-85), the second half of RdnI (amino acids 150-305), and the end of the protein (amino acids 235-305). E) BACTH assay of RdnED39A-FLAG with wild-type RdnI and the three RdnI truncations. F) Swarm competition assay with ATCC29906 expressing foreign RdnI proteins. G) BACTH assay of RdnED39A-FLAG with each of the foreign RdnI proteins. GFPmut2 was used as a negative control. H) Swarm competition assay with ATCC29906 producing SO RdnI with swapped conserved motifs. I) BACTH assay of RdnED39A-FLAG with SO RdnI with swapped conserved motifs. Colored bars denote RdnI-StrepII proteins from P. mirabilis (green), R. dentocariosa (orange), P. jejuni (magenta), or P. ogarae (dark blue).

Single mutations in the RdnI conserved motif do not alter protective function.

Swarm competition assay [19] with single-residue mutations in the conserved motif of RdnI. P. mirabilis BB2000 (donor) was competed against the vulnerable P. mirabilis ATCC29906 expressing RdnI with single-residue mutations in four of the seven conserved residues (highlighted in Figure 4A) in ProteusRdnI-StrepII (ProteusRdnIY197A-StrepII, ProteusRdnIH221A-StrepII, ProteusRdnIY244A-StrepII, ProteusRdnIY246A-StrepII). All constructs were made in a sequence-optimized RdnI.

RdnI protein levels are similar under constitutive fla promoter in P. mirabilis.

Swarm cell protein expression assay2 on P. mirabilis ATCC29906 cells expressing each of the four RdnI proteins under the constitutive fla promoter. Soluble fraction and whole cell extract samples were then run on SDS-PAGE gels and either probed with anti-StrepII antibodies (left) or stained with Coomassie blue (right). 20 ng of GFP-StrepII (Iba Lifesciences, Göttingen, Germany) was used as a positive control.

Anti-FLAG co-IPs reveal mixed binding results for foreign immunity proteins.

Anti-FLAG co-immunoprecipitation assay between ProteusRdnED39A-FLAG and the RdnI-StrepII proteins from P. mirabilis, R. dentocariosa, P. jejuni, or P. ogarae. RdnED39A-FLAG was incubated with anti-FLAG resin (FLAG soluble fraction). Lysate containing RdnI-StrepII was then added to the resin (RdnI-StrepII flow through). Proteins bound to resin were then eluted with FLAG-peptide (elution). The negative control was exogenous FLAG-BAP. Samples were probed with either anti-FLAG antibodies (A) or anti-StrepII antibodies (B), or were stained with Coomassie blue (C).

The RdnI protein family has the potential for broader protection within oral and gut microbiomes.

A) Methodology used to identify rdnE and rdnI genes in publicly available metagenomic data. Metagenomes were mapped against sequences with a stringency of 90% identity. “Coverage” denotes the average depth of short reads mapping to a gene in a single sample. Colors represent rdnE and rdnI from P. mirabilis (green), R. dentocariosa (orange), P. jejuni (magenta), or P. ogarae (dark blue). B) The experimentally tested rdnE gene sequences from different organisms (colors) are found in thousands of human-associated metagenomes. Each dot represents a single sample’s coverage of an individual rdnE gene; note the log10-transformed y-axis. Only samples with >1x coverage are shown. C) Euler diagram showing the number of samples with co-occurring rdnE genes from different taxa (colors). D) Kernel density plot of the ratio of rdnI to rdnE coverage. The ratio of rdnI to rdnE was defined as log10(I/E), where I and E are the mean per-nucleotide coverage for rdnI and rdnE, respectively. The distribution of ratios was summarized as a probability density function (PDF) for each taxon (color) in each environment (subpanel). Here, the y-axis (unitless) reflects the probability density of observing a given ratio (x-axis) in that dataset. The colored numbers in the top right of each panel show the number of metagenomes above the detection limit for both rdnE and rdnI for each taxon. Dashed vertical lines represent the median ratio. E) Skeleton-key model for immunity protein protection. Top, the current prevailing model for T6SS immunity proteins is that protection is defined by necessary and sufficient binding between cognate effectors (locks) and immunity proteins (keys). Bottom, a proposed, expanded model: multiple immunity proteins (skeleton-keys) can bind a single effector due to a flexible (promiscuous) binding site. Protection is a two-step process of binding and then neutralization.
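The coverage ratio in panel D can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the made-up read depths, the Gaussian kernel, the bandwidth, and the reuse of the ">1x" cutoff from panel B as the detection limit are all assumptions made for illustration.

```python
import math
from statistics import mean, median

DETECTION_LIMIT = 1.0  # assumed: same ">1x coverage" cutoff as panel B


def mean_coverage(per_base_depth):
    """Average read depth across the nucleotides of one gene in one sample."""
    return mean(per_base_depth)


def log_ratio(rdnI_cov, rdnE_cov):
    """log10(I/E); positive values mean rdnI is covered more deeply than rdnE."""
    return math.log10(rdnI_cov / rdnE_cov)


def ratios_above_limit(samples):
    """samples: iterable of (rdnE per-base depths, rdnI per-base depths)."""
    ratios = []
    for e_depth, i_depth in samples:
        e_cov, i_cov = mean_coverage(e_depth), mean_coverage(i_depth)
        # Keep only samples where both genes exceed the detection limit
        if e_cov > DETECTION_LIMIT and i_cov > DETECTION_LIMIT:
            ratios.append(log_ratio(i_cov, e_cov))
    return ratios


def gaussian_kde(values, grid, bandwidth=0.2):
    """Gaussian kernel density estimate evaluated at each grid point."""
    norm = len(values) * bandwidth * math.sqrt(2 * math.pi)
    return [
        sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values) / norm
        for x in grid
    ]


# Made-up depths for three hypothetical metagenomes
samples = [
    ([10, 10, 10], [100, 100, 100]),  # rdnI ten times deeper than rdnE
    ([20, 20], [20, 20]),             # equal coverage
    ([0, 1, 0], [50, 50, 50]),        # rdnE below the limit, so dropped
]
ratios = ratios_above_limit(samples)
grid = [-3 + 0.01 * k for k in range(701)]  # ratio axis from -3 to 4
density = gaussian_kde(ratios, grid)
print(ratios, median(ratios))
```

A median ratio above zero in such a sketch would correspond to rdnI being covered more deeply than rdnE in a typical sample; the density values are per unit of log10 ratio, so the curve integrates to one rather than each point being a probability.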

Metagenomic analysis with a 70% stringency revealed similar patterns in RdnE and RdnI localization.

The same metagenomic analysis as described in Figure 5 was performed at a lower stringency (70% instead of 90% identity to the rdnE and rdnI sequences). A) Each dot represents a single sample’s coverage of an individual rdnE gene; note the log10-transformed y-axis. Only samples with >1x coverage are shown. B) Euler diagram showing the number of samples with co-occurring rdnE genes from different taxa (colors). C) Kernel density plot of the ratio of rdnI to rdnE coverage. The ratio of rdnI to rdnE was defined as log10(I/E), where I and E are the mean per-nucleotide coverage for rdnI and rdnE, respectively. The distribution of ratios was summarized as a probability density function (PDF) for each taxon (color) in each environment (subpanel). Here, the y-axis (unitless) reflects the probability density of observing a given ratio (x-axis) in that dataset. The colored numbers in the top right of each panel show the number of metagenomes above the detection limit for both rdnE and rdnI for each taxon. Dashed vertical lines represent the median ratio.

RdnE and RdnI sequences are found in metagenomic datasets.

A) Heatmap of the log10-normalized coverage of rdnE and rdnI (rows) from the focal taxa for all metagenomes (columns) where any were detected. Metagenomes are sorted by decreasing rdnE coverage. B) Span chart showing the difference in coverage between cognate rdnE (circles) and rdnI (crosses) for all taxa (colors) for metagenomes in which rdnE-rdnI from multiple taxa were detected.
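The column ordering and log scaling described for panel A can be sketched as follows. This is only an illustration under stated assumptions: the pseudocount of 1 (so that zero coverage maps to 0), the use of a single rdnE row as the sort key, the row labels, and the coverage values are hypothetical and do not come from the legend.

```python
import math

PSEUDOCOUNT = 1.0  # assumed, so that zero coverage maps to log10(1) = 0


def log10_normalize(coverage):
    """Log-scale one coverage value after adding the assumed pseudocount."""
    return math.log10(coverage + PSEUDOCOUNT)


def build_heatmap(matrix, sort_row):
    """Order metagenome columns by decreasing coverage in sort_row, then log-scale.

    matrix: dict mapping a row label to a list of coverages (one per metagenome).
    Returns the transformed matrix and the column order that was applied.
    """
    n_columns = len(matrix[sort_row])
    order = sorted(range(n_columns), key=lambda j: matrix[sort_row][j], reverse=True)
    heat = {
        label: [log10_normalize(row[j]) for j in order]
        for label, row in matrix.items()
    }
    return heat, order


# Made-up coverages for three hypothetical metagenomes
matrix = {
    "rdnE": [9.0, 99.0, 0.0],
    "rdnI": [999.0, 0.0, 9.0],
}
heat, order = build_heatmap(matrix, sort_row="rdnE")
print(order)
print(heat)
```

Sorting on the raw coverage before the log transform gives the same order as sorting afterwards, because log10 is monotonic.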

List of strains used in this study.

Plasmids used in this study.