Introduction

The discovery and successful clinical deployment of antibiotics is one of the most important breakthroughs in medical history, as it has dramatically reduced the morbidity and mortality of infections. Conversely, the extensive use of antibiotics has promoted the selection of mechanisms that enable bacteria to thrive despite antibiotic exposure [1, 2]. Not all resistance mechanisms can be attributed to human-derived antibiotic exposure, as many predate the clinical use of antibiotics and/or are observed in bacteria of no clinical importance [3]. These mechanisms are derived from the natural evolution of species and the environment they occupy, as bacteria that co-inhabit niches with antibiotic-producing organisms naturally develop ways to avoid antibiotic-caused death [4, 5]. Of key importance, antibiotic resistance mechanisms do not have to evolve de novo, as they can be incorporated from other organisms via horizontal gene transfer (HGT).

Two main strategies are used to identify novel antibiotic resistance determinants. The first, a clinical or host-biased approach, begins when a patient who is infected with a presumably antibiotic-sensitive species fails to improve upon treatment. Isolation and study of the resistant strain then leads to the identification of a novel determinant of antibiotic resistance. Several seminal discoveries of direct clinical importance have taken place in this manner [6–8]. However, this approach mostly discovers incremental, strain-specific mechanisms, such as single nucleotide polymorphisms (SNPs) that alter the binding of an antibiotic to its target. The second strategy involves ecological sampling and screening, using targeted approaches or metagenomics [9]. This approach holds the promise of uncovering truly novel mechanisms, although these mechanisms might never find their way into extant pathogens and therefore might never represent a clinically relevant problem.

Mycobacterium is a genus that includes both environmental and clinically relevant microorganisms. Mycobacteria can be found in most environments including rivers and lakes [10], soil [11], plant roots [12], moss [13], reptiles [14], amphibians [15], fish [16] and mammals [17]. A typical divide of the genus is based on growth rate in a defined solid medium when sub-cultured from highly dilute inocula [18]. A mycobacterial species is “fast-growing” if it forms visible colonies within seven days; this phenotype is considered to be the ancestral state of the genus [19]. A species that takes longer than seven days to form mature colonies is classified as “slow-growing”; this category includes the most devastating disease-causing species: Mycobacterium ulcerans, M. avium complex (MAC), M. leprae, and the M. tuberculosis complex (MTBC). Until recently, human infections with mycobacteria other than the MTBC and M. leprae, i.e. non-tuberculous mycobacteria (NTM), were overshadowed by the TB burden, but they are now gaining increased attention due to their growing prevalence [20]. Some of the key NTM species that cause disease are the fast-growers M. abscessus and M. fortuitum, and the slow-growers M. avium, M. marinum, M. xenopi, M. gordonae and M. kansasii [21]. NTM can infect a variety of tissues including the lungs, central nervous system, lymphatic system, joints, and skin [21]. As the frequency of NTM infections is increasing, so is the concern about their severity and their resistance to treatment with available antibiotics. In contrast to M. tuberculosis and M. leprae, which, owing to their isolation inside the host, do not show cross-species HGT, NTM might acquire resistance determinants from environmental bacteria. Therefore, the potential of mycobacterial species to contain and disseminate antibiotic resistance determinants is likely as great as that of other environmental bacteria.

Here, we propose a new method to study antibiotic resistance by comparing antibiotic resistance profiles across related bacterial species (macroevolution) to identify previously unknown high-level antibiotic resistant species. Then, taking advantage of the natural genetic diversity among species (encoded in their accessory genome) and their genetic similarity (encoded in their core genome), computational, molecular, and cellular approaches can more readily pinpoint resistance determinants. We illustrate the power of this method by characterizing a previously unrecognized rifamycin-inactivating enzyme that is widely distributed across bacterial genera.

Results

Building a diverse library of mycobacterial species

A collection of 44 tractable mycobacterial species was assembled to cover most of the mycobacterial phylogenetic tree (Fig. 1a). To evaluate the biological diversity of our library we analyzed the ecological and genomic information available for the selected species. In terms of niche/pathogenicity, mycobacteria can be broadly divided into bona fide avirulent saprophytes, opportunistic pathogens, and professional pathogens. However, we considered it necessary to add further nuance to the opportunistic category by assessing the strength of the evidence of pathogenicity provided in the literature, and we used this knowledge to establish a five-level scoring system (Fig. 1b). We determined doubling times in liquid culture under equivalent experimental conditions for 26 of the species (Fig. 1c). The results revealed large differences within the genus. For example, M. szulgai and M. flavescens divided every 1.3 and 2.0 hours, respectively, while M. marinum and M. tuberculosis divided every 17.1 hours. Strikingly, among slow-growers there is a 13-fold difference between the fastest- and the slowest-growing species (M. szulgai and M. tuberculosis, respectively). Genome size also varies widely across the library, from genomes as small as 3.59 Mbp and 4.08 Mbp for M. triviale and M. koreense, respectively, to genomes as large as 6.99 Mbp and 8.01 Mbp for M. smegmatis and M. mageritense, respectively (Fig. 1d). These genome size differences suggest that the nature of the accessory genome varies widely from species to species. Interestingly, there is only a very small difference between the average genome sizes of fast and slow growers, 6.1 vs 5.8 Mbp, respectively. A high guanine-cytosine content (GC%) is a defining characteristic of mycobacteria. Within our library, GC% ranges from 66.9% in slow growers to 68.8% in the M. terrae clade (Fig. 1e).
Next, we investigated the number of ribosomal RNA encoding genes (rrn operon), a feature that has often been linked to growth rate. Twenty-five species possess a single copy of the rrn operon while the other 19 possess two copies (Fig. 1f). In general, but not always, “fast growers” tend to have two copies and “slow growers” one. In the genus Bacillus, a correlation between growth rate and rrn operon copy number has been experimentally disproven [22], and the number of rrn operon copies could instead be related to the response to resource availability [23]. Furthermore, comparative pangenome analyses conducted by Bachmann and collaborators suggested that growth limitation in slow-growing mycobacteria might instead be related to loss of amino acid transporters [24]. Finally, we analyzed the gene ontology (GO) distribution for our library in order to evaluate functional genetic diversity at the genome level. GO distribution varies noticeably across the analyzed genomes. The number of genes associated with transcription and membrane transport (categories indicated by arrows in Fig. 1g) appeared to be particularly variable. We noted that much of the genome of several of these species is poorly annotated. Taken together, the data in Fig. 1 support the potential usefulness of our library as a resource to investigate mycobacterial biology, antibiotic resistance, and pathogen evolution.

A diverse species library of the genus Mycobacterium.

a, Phylogenetic tree of mycobacterial species in our library calculated using the bcgTree pipeline v 1.1.0 [15]. b, Pathogenicity score. c, Doubling time. d, Genome size. e, Guanine and cytosine percentage (GC%). f, Ribosomal copies (rrn operon). g, Gene ontology (GO) distribution. Colors from left to right represent the following GO categories: regulation of DNA-templated transcription (highlighted with a light orange arrow); transmembrane transport (highlighted with a dark orange arrow); amino acid, lipid, carbohydrate derivative, nucleobase-containing small molecule, and carbohydrate metabolic processes; generation of precursor metabolites and energy; sulfur compound, vitamin, and tRNA metabolic processes; DNA repair, protein modification process, signaling, cell wall organization or biogenesis, cellular modified amino acid metabolic process, DNA replication, DNA recombination, ribosome biogenesis, anatomical structure development, protein catabolic process, protein-containing complex assembly, protein maturation, nitrogen cycle metabolic process, intracellular protein transport, metal ion homeostasis, cell division, protein secretion, mRNA metabolic process, DNA integration, transport, organic substance transport, defense response to other organism, organic substance biosynthetic process, organic substance metabolic process, cellular process, nitrogen compound transport, regulation of gene expression, cellular biosynthetic process, other metabolic processes.

Wide variation in antibiotic resistance profiles in mycobacteria

To harness the potential of our library, we measured antibiotic potency and the extent of its variation across the Mycobacterium genus, in order to identify biologically relevant differences for further study. We determined minimal inhibitory concentrations (MIC99) for 15 antibiotics, spanning most of the classes employed to treat mycobacterial infections, including TB (Fig. 2a, Supplementary Table 1). We found that several species displayed at least one MIC99 value that is considerably different from the mean (Supplementary Figs. 1 and 2), highlighting the biological diversity of the genus with respect to antibiotic action. As expected, the notoriously multi-drug resistant M. abscessus was resistant to several antibiotics (Fig. 2a) [21, 25], yet it was not the most resistant of the species studied: M. mageritense, M. salmoniphilum and M. houstonense were highly resistant to most of the antibiotics tested. M. abscessus was somewhat sensitive to amikacin (AMK) and bedaquiline (BDQ), which is consistent with other findings in the literature [26, 27]. The magnitude of the changes in MIC99 is also remarkable, of the order of 100- to 1000-fold in some instances. Of note, when the data were re-ordered by unbiased clustering based on the overall antibiotic sensitivity of each individual species, the species fell into three main clusters that differ dramatically from their phylogenetic positioning, as shown by the tanglegram between the two heatmaps in Fig. 2a. To illustrate the absence of a taxonomic trend in antibiotic resistance, we explored in detail the clade composed of M. holsaticum, M. phlei, M. flavescens, M. tusciae and M. moriokaense (Supplementary Fig. 3). While most antibiotics behaved similarly across this group (i.e., MIC99 fold change (FC) < 3-fold), M. holsaticum was highly sensitive to para-aminosalicylic acid (PAS) and highly resistant to BDQ. High-level BDQ resistance was also observed in M. flavescens. Interestingly, M. flavescens is unusually sensitive to D-cycloserine (DCS). The mechanisms underpinning these distinct responses to antibiotics are currently unknown. We also observed that, once ordered by antibiotic sensitivity, M. tuberculosis is positioned at the middle of the heatmap, and the number of NTM that are more resistant than M. tuberculosis is nearly equal to the number of NTM that are more sensitive. Therefore, NTM are not, in general, intrinsically more resistant than M. tuberculosis to the antibiotics tested. In summary, antibiotic sensitivity varies dramatically across the Mycobacterium genus and our data provide the first quantitative blueprint of this variation.

Antibiotic sensitivity mapping reveals complex patterns.

a, Heatmaps of overall MIC99 values. In the X axis, antibiotics are organized based on their mechanism of action; in the Y axis, mycobacterial species are organized phylogenetically in the left heatmap, and based on their response to the set of antibiotics tested (Manhattan clustering) in the right heatmap. Colors represent the standardized MIC99 (mean/SD and centered scaled). Lower MIC99 values are in brown/orange and higher MIC99 values in lilac/purple. The details of the data can be found in Supplementary Figure 2. b, Radar plots displaying the standardized MIC99 for all antibiotics tested. MIC99 values are normalized to be plotted in radar plots. All radar plots display the results for M. tuberculosis in orange. M. branderi is displayed in purple, M. conceptionense in dark blue and M. smegmatis in light blue. c, Violin plots showing the distribution of MIC99 values. In the X axis, the set of antibiotics tested; in the Y axis, the MIC99 values in μg/mL. d, Relationship between mycobacterial doubling time (X axis) and MIC99 for the antibiotics BDQ, LZD and RIF (Y axis). Antibiotics targeting the cell wall are: isoniazid (INH), ethionamide (ETH), ethambutol (EMB), D-cycloserine (DCS), and meropenem (MEM); RNA/protein synthesis: rifampicin (RIF), streptomycin (STR), kanamycin (KAN), amikacin (AMK), capreomycin (CAP) and linezolid (LZD); DNA gyrase: moxifloxacin (MFX) and ofloxacin (OFX); folate metabolism: para-aminosalicylic acid (PAS); and ATP synthase: bedaquiline (BDQ).

There are striking differences in antibiotic response across the genus that highlight the value of a genus-wide approach to inform antibiotic research efforts. For example, M. smegmatis is frequently used as a model organism for M. tuberculosis in TB antibiotic discovery [28], but it displayed a completely different sensitivity profile from M. tuberculosis, being highly resistant to PAS, ethionamide (ETH), DCS and RIF. Our results suggest that M. marinum is a better M. tuberculosis proxy [29], as both have a similar sensitivity profile except for ofloxacin (OFX) (Fig. 2b). The overall distribution of antibiotic potency (MIC99) against different species is shown in Fig. 2c. Cell-envelope-targeting antibiotics and PAS exhibit weaker potency across the genus, while antibiotics that target protein synthesis, DNA gyrase and the ATP synthase displayed, on average, a lower MIC99, indicating that most mycobacteria are sensitive to them. Among the antibiotics that inhibit protein synthesis, linezolid (LZD) displays the lowest overall MIC99 and was effective against most species (Fig. 2c). To verify whether there is a correlation between doubling time and sensitivity to antibiotics, we compared the doubling time of a subset of species (Fig. 1c) with the MIC99 of a subset of antibiotics. As can be seen in Fig. 2d, no correlation is apparent between growth rate and antibiotic sensitivity in mycobacteria. Below, we explore the molecular causes of these dramatic changes in antibiotic potency observed across the Mycobacterium genus.

Intra-bacterial antibiotic accumulation does not predict potency.

We employed liquid chromatography–time-of-flight mass spectrometry (LC-MS) to determine the relative intra-bacterial concentration of antibiotic ([ABX]IB) with an antibiotic concentration in the growth medium of 6 × MIC99. [ABX]IB is a function of three parameters: antibiotic uptake, efflux, and modification. Figure 3a shows extracted ion chromatograms (EICs) in five mycobacterial species for BDQ, LZD and RIF (Fig. 3b). Quantification of [BDQ]IB, [LZD]IB and [RIF]IB illustrates the variability and the magnitude of the changes observed in [ABX]IB, spanning from 2- to 200-fold (Fig. 3c). As the experiment was performed at a concentration of antibiotic where every antibiotic was equally potent, we replotted these data as a function of each antibiotic's MIC99 (Fig. 3d). Only for BDQ could we observe a correlation between antibiotic potency and [BDQ]IB, which could be indicative of efflux playing a role in antibiotic efficacy. In the case of RIF, where there is no correlation between antibiotic potency and its accumulation in mycobacteria (Fig. 3d), factors other than uptake and efflux are likely to be the dominant drivers of RIF potency in mycobacteria.

Intra-bacterial antibiotic concentration does not correlate with potency.

a, Positive mode extracted ion chromatograms (EIC) of whole-cell extracts of mycobacteria treated with selected antibiotics: BDQ (m/z 555.1642), LZD (m/z 338.1511) and RIF (m/z 823.4124). b, Chemical structure of BDQ, LZD and RIF. c, Relative intracellular antibiotic concentration, obtained by comparing the peak height of the samples with an injection of 10 μM of antibiotic, then normalized by the concentration of antibiotic used to treat the cells. d, Relative intracellular antibiotic concentrations in relation to the MIC99. Data in c and d come from independent experiments.
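The normalization described for panel c can be sketched as below. This is a minimal illustration, not the authors' analysis script: the function name, argument names and the example numbers are hypothetical, and the result is a relative (unitless in practice) index because the standard is expressed in μM while the dose is expressed in μg/mL.

```python
def relative_intrabacterial_conc(sample_height, standard_height, mic99_ug_ml,
                                 standard_um=10.0, mic_multiple=6.0):
    """Relative intra-bacterial antibiotic level (arbitrary units).

    sample_height   : EIC peak height in the cell extract
    standard_height : EIC peak height of the injected antibiotic standard
    mic99_ug_ml     : MIC99 of the species for this antibiotic (ug/mL)
    standard_um     : concentration of the standard (10 uM in the text)
    mic_multiple    : cells were exposed to 6 x MIC99 in the text
    """
    if standard_height <= 0 or mic99_ug_ml <= 0:
        raise ValueError("standard height and MIC99 must be positive")
    # Apparent concentration in the extract, scaled against the standard.
    apparent_um = sample_height / standard_height * standard_um
    # Normalize by the dose each species actually received.
    dose_ug_ml = mic_multiple * mic99_ug_ml
    return apparent_um / dose_ug_ml


# Hypothetical peak heights, for illustration only.
value = relative_intrabacterial_conc(5000.0, 10000.0, mic99_ug_ml=0.5)
print(round(value, 4))
```

Dividing by the dose matters because each species was treated at a concentration proportional to its own MIC99, so raw peak heights are not directly comparable across species.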

A minor role for pre-existing target modification in RIF resistance.

Considering the importance of rifamycins for the treatment of TB, leprosy, Buruli ulcer, MAC and M. kansasii infections, we focused on RIF resistance mechanisms operating in mycobacteria. Figure 4a highlights the diversity in RIF potency across our library, ranging from an MIC99 of more than 100 μg/mL to less than 0.2 μg/mL. Arranging species by decreasing MIC99 value highlights that there are species better suited for the identification of target-mediated resistance mechanisms (dark orange), and species that are better suited for the identification of non-target-based resistance mechanisms (dark purple). At this stage, we focused our work on four species, all of which are resistant (MIC99 = 12.5 μg/mL) or super-resistant (MIC99 ≥ 100.0 μg/mL) to RIF, compared to M. tuberculosis (MIC99 = 0.9 μg/mL): M. smegmatis and M. flavescens (MIC99 = 12.5 μg/mL), M. houstonense (MIC99 = 25.0 μg/mL), and M. conceptionense (MIC99 ≥ 100.0 μg/mL) (Fig. 4b).

High-level rifampicin resistance is caused by rifamycin modification in selected mycobacteria.

a, RIF MIC99 values for the mycobacterial species in our library organized in decreasing MIC99 value order. b, Cultures of selected species on solid medium (7H10) containing RIF at different concentrations, starting at 1× M. tuberculosis MIC. c, Comparison of the amino acid residue sequence of the rifampicin resistance-determining region (RRDR) of RpoB in selected mycobacterial species; the only residue that differs is Ser 450 in M. branderi. d and e, Volcano plots showing the differential protein expression in whole cells with and without RIF, revealing inducible expression of RIF ADP-ribosyltransferase 1 (Arr-1) in M. smegmatis, M. conceptionense and M. flavescens. f, Detection and quantification of ribosyl-RIF in whole-cell extracts by LC-MS.

In M. tuberculosis, the fixation of mutations that decrease the affinity of RIF to the RNA polymerase β subunit (RpoB) represents the dominant cause of RIF resistance [7, 30], and therefore target modification is an obvious starting point to explore probable mechanisms of resistance to the rifamycin class of antibiotics in other mycobacteria. Figure 4c shows the Rifampicin Resistance Determining Region (RRDR), the segment of RpoB where most mutations conferring resistance to RIF are found. Except for M. branderi, no amino acid variations are found in our species of interest. This observation suggests that in contrast to M. tuberculosis, most mycobacteria are not resistant to RIF due to variations in the RIF binding region of RpoB.

As neither [RIF]IB nor RpoB target variation can account for the observed resistance to RIF, we evaluated the remaining major mechanism of resistance to rifamycins, drug modification. RIF modification is widely found in nature and is carried out by various enzyme types, including phosphotransferases, glycosyltransferases, ADP-ribosyltransferases (ARTs) and monooxygenases [31–35]. Importantly, a RIF-ART has been characterized in M. smegmatis [36]; it is encoded by the gene MSMEG_1221, also known as arr-ms, and it has been shown to be the sole determinant of RIF resistance in M. smegmatis by chemical and genetic methods [37, 38]. We employed proteomics to first check whether Arr-ms is expressed in the absence of RIF and whether it is differentially expressed in the presence of RIF. Figure 4d shows that expression of Arr-ms is stimulated (5.6-fold) in the presence of RIF at 6 × MIC99, and therefore proteomics can assist in the identification of RIF-modifying enzymes in mycobacteria. Next, we evaluated whether the annotated Arr homologous proteins in M. conceptionense (SAMEA3305051) and in M. flavescens (SAMN05729960) were also induced in the presence of RIF (Fig. 4e); this was indeed the case (4.12- and 2.75-fold change, respectively). To confirm that these putative RIF-ARTs inactivate RIF, we employed LC-MS to identify ribosyl-RIF (m/z 955.4601), a fragment of the larger ADP-ribosyl-RIF product, which fragments under LC-MS conditions. Figure 4f illustrates that ribosyl-RIF was observed in M. smegmatis, M. conceptionense and M. flavescens treated with RIF. Additionally, other mycobacteria with annotated putative arr genes also displayed high levels of RIF ADP-ribose (Supplementary Fig. 4a and 4b), indicating that RIF modification, and specifically ADP-ribosylation, is the dominant mechanism of resistance to RIF in mycobacteria.

A novel group of rifamycin ADP-ribosyltransferases

To gain a comprehensive understanding of the distribution of Arrs in mycobacteria, we mined for arr sequences in reference genomes and built a phylogenetic tree. Arr proteins were found to be widespread in both fast- and slow-growing mycobacteria, but in a dispersed pattern, suggesting that local vertical inheritance as well as gene losses and acquisitions have taken place. Mycobacterial Arrs form two monophyletic groups (Fig. 5a; Supplementary Table 2). One of the groups, which we designated Arr-1, corresponds to sequences closely related to Arr-ms (median sequence identity of 80%). Arr-1 group members are predominantly Actinomycetota of the orders Geodermatophilales, Propionibacteriales, Micrococcales and Mycobacteriales. The second group, which we have named Arr-X, is taxonomically more broadly distributed, including members from Actinomycetota, Bacillota, Pseudomonadota and Bacteroidota. Within mycobacteria, more species have an arr-1 gene than an arr-X gene, and a few species have both, for example M. conceptionense and M. flavescens (Supplementary Fig. 4a). M. conceptionense Arr-1 (Uniprot A0A0U1D6J3) and Arr-X (Uniprot A0A0U1DL14) share 50% identity and 63% similarity (BLOSUM62). The equivalents of the three residues shown by Baysarowich and collaborators to be necessary for enzymatic activity in Arr-ms (Asp84, His19 and Tyr49) are conserved in all mycobacterial Arr-1 and Arr-X proteins, suggesting that they are all active ADP-ribosyltransferases [34]. However, the hydrophobic nature of the RIF binding cleft of Arr-ms is not completely preserved in the Arr-X group (Supplementary Table 3), hinting at probable differences in substrate binding preference.

Characterization of a novel rifabutin-ADP ribosyltransferase in mycobacteria.

a, Phylogenetic tree of mycobacterial RIF ADP-ribosyltransferases (RIF-ARTs) and related PFAM family PF12120 sequences. Many mycobacterial species encode the equivalent of M. smegmatis RIF-ART (MSMEG_1221; Arr-1/ms in purple), and some mycobacterial species encode a previously unidentified sister group we have named Arr-X (dark orange). See Supplementary Table 3 for detailed information of the sequences in the tree. b, Ribbon representation of the crystal structure of M. smegmatis RIF-ART (PDB code 2HW2; light blue) with RIF bound (red) overlaid with the AlphaFold2 models of M. conceptionense Arr-X (dark blue) and M. flavescens Arr-X (dark orange). c, Apparent velocity of reaction of each of the enzymes (X axis) with different rifamycins as substrate. d, M. conceptionense single and double knockdown (KD) arr strains in the presence of rifabutin.

To confirm that Arr-X enzymes are indeed RIF-ARTs and to understand why some species have two arr genes, we cloned, overexpressed, purified, and tested the enzymatic activity of Arr-ms (as a control) and Arr-1 and Arr-X from both M. conceptionense and M. flavescens. Figure 5c displays the catalytic activity (Vapp) of the different proteins with six rifamycins. All Arr-1 enzymes had similar activity and substrate preference, but Arr-X enzymes were markedly faster at inactivating rifamycins. For example, M. flavescens Arr-X is 3.9-fold faster with rifapentine than Arr-ms. Surprisingly, M. conceptionense Arr-X is 29-fold faster at inactivating rifabutin than Arr-ms. These results therefore demonstrate not only that Arr-Xs are bona fide rifamycin-inactivating enzymes, but also that they are significantly more efficient than Arr-1s. We also determined the MIC99 for different rifamycins in selected species (Supplementary Table 4). Interestingly, these species are resistant to all rifamycins except for rifabutin. To probe whether Arr-X is active in bacterio, we used CRISPR interference to reduce the transcription of arr-1, arr-X and both genes in M. conceptionense. M. conceptionense continued to be resistant to rifabutin upon arr-1 silencing but became more sensitive when arr-X was knocked down (Fig. 5d). Thus, Arr-X is an active “rifabutinase” that confers rifamycin resistance in M. conceptionense.

Discussion

Discovery of unknown mechanisms for high-level antibiotic resistance in nature is essential for more efficient antibiotic discovery and development and for the continuous treatment of patients. Here we propose a powerful approach for the discovery of novel antibiotic resistance determinants. Our strategy consists of mapping and comparing the antibiotic response profile of a library of tractable species representative of the entire genus Mycobacterium; an approach that is applicable for the study of many biological traits and to other bacterial genera.

Using our mycobacterial library, we identified high- and ultra-high-level intrinsic resistance [3] to many of the antibiotics tested. Resistance profiles are highly variable across the genus and do not follow phylogeny, implicating HGT as the key mechanism for acquisition of resistance determinants [39].

Our study revealed that resistance levels to BDQ, LZD and RIF were particularly divergent across the genus and often could not be explained by our current knowledge of antibiotic resistance mechanisms [6, 7, 28]. We found that resistance to these antibiotics in mycobacteria cannot be explained by uptake/efflux mechanisms and it does not correlate with growth rate.

We illustrated the power of this comparative method by characterizing a previously unrecognized group of rifamycin-inactivating enzymes (Arr-X) that is present in a wide range of bacteria including Actinomycetes, Bacilli and Gammaproteobacteria. We found that several mycobacterial species have the gene coding for the RIF-inactivating enzyme RIF-ART (Arr-1), and revealed that some species also code for a homologous protein, Arr-X. The existence of a superior rifabutin-inactivating enzyme in several mycobacterial species might jeopardize the use of rifabutin, currently an antibiotic of choice to treat infections caused by the M. avium complex and other infections. Novel inhibitors of these two distinct Arr enzymes [37, 38] might become essential to re-sensitize mycobacteria to rifamycins, and to rifabutin in particular.

Materials and methods

Mycobacterial species and cultures

Mycobacterial species were acquired from the German Collection of Microorganisms and Cell Cultures GmbH - DSMZ (Braunschweig, Germany). The species comprised in our library were selected based on (i) broad genus coverage; (ii) diversity with respect to niche/pathogenicity; (iii) availability of genome sequence; (iv) availability of the type or laboratory strain; and (v) ability to grow on Middlebrook 7H9 culture medium. Upon arrival, long-term (-80°C), short-term (-20°C) and agar plate stocks were prepared according to DSMZ’s protocols.

Mycobacterial diversity

Genotypic and phenotypic diversity is an essential feature required for our approach. To describe the niche/pathogenicity diversity encompassed in our library, we developed an ad-hoc pathogenicity score system based on the number and detail of peer-reviewed publications describing a particular species as pathogenic. We searched for the number of records in PubMed containing the species name and the words “bacteraemia” or “pathogen” or “infection”. Species with more than twenty matches were designated as a “common proven pathogen” (score of 4), and those with none as non-pathogenic (score of 0). The remaining species were classified as a “rare proven pathogen” (score of 3) if the diagnostic evidence presented in the paper was strong and the patient was immunocompetent, or as a “pathogen” (score of 2) if either we considered that the diagnostic evidence was insufficient or the patient was immunocompromised; a score of 1 was used for those that did not meet either criterion.

Genome size and guanine-cytosine content (GC%) values were obtained from NCBI Microbial Genomes [40] and ribosomal copy number was annotated using rrnDB [41] or BLAST. When more than one hit was obtained, the genomic sequence and context were analyzed to confirm duplication. The growth rate of selected species was determined by turbidity measurements (OD600) taken in equal intervals. We inoculated 100 mL Middlebrook 7H9 broth complete in roller bottles. Middlebrook 7H9 broth complete contained 10% Albumin-Dextrose-Catalase (ADC) supplement, 0.05% Glycerol and 0.05% Tyloxapol. Growth rate was calculated using the specific growth rate formula [42]. Finally, encoded genes were classified based on their gene ontology using OmicsBox’s functional analysis tool, with default parameters [43, 44].
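The text cites the specific growth rate formula [42] without writing it out. A minimal sketch of the calculation is given below, assuming the standard exponential-phase form μ = ln(OD2/OD1)/(t2 − t1) and a doubling time of ln(2)/μ; the function names and the example OD600 readings are hypothetical and are not taken from the study.

```python
import math


def specific_growth_rate(od_start, od_end, hours):
    """Specific growth rate mu (per hour) from two exponential-phase OD600 readings.

    Assumes mu = ln(od_end / od_start) / hours, i.e. the usual textbook form;
    the exact variant used in reference [42] is an assumption here.
    """
    if od_start <= 0 or od_end <= 0 or hours <= 0:
        raise ValueError("OD values and elapsed time must be positive")
    return math.log(od_end / od_start) / hours


def doubling_time(mu):
    """Doubling time (hours) corresponding to a specific growth rate mu."""
    if mu <= 0:
        raise ValueError("growth rate must be positive")
    return math.log(2) / mu


# Hypothetical readings: OD600 rises from 0.1 to 0.4 in 4 hours,
# which is two doublings, so the doubling time is 2 hours.
mu = specific_growth_rate(0.1, 0.4, 4.0)
print(round(doubling_time(mu), 3))
```

In practice the rate would normally be fitted across several equally spaced readings within the exponential phase rather than from only two points.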

Minimal inhibitory concentration measurements

Our mid-throughput minimal inhibitory concentration assay (MIC99) was adapted from previously described protocols [45]. Briefly, in sterile Eppendorf tubes, each antibiotic was diluted in the recommended solvent to 4 mg/mL. Fluoroquinolones and BDQ were diluted to 0.4 mg/mL. Subsequently, 200 μL of each drug was added to column 10 of a 96-well plate. In the same plate, 100 μL of DMSO or water was added to columns 1–9 and 11. The drug titration was performed using a multi-channel pipette, transferring 100 μL at each step for 1:2 dilutions. The remaining 100 μL left over from column 1 was added to column 12, which is the negative control (contamination control). A secondary replicate plate was prepared in the same manner, and the two were then combined for a final volume of 200 μL in each well. These plates were the master plates, which were then copied into 36 new plates by transferring 5 μL of each well using Biomek FX Liquid Handling Automation (Beckman Coulter, California, USA). Once antibiotics were in the wells, 195 μL of Middlebrook 7H10 containing 10% Oleic Acid Albumin Dextrose Catalase (OADC) supplement was added to each well and homogenized using a Multidrop™ Combi Reagent Dispenser (Thermo Fisher Scientific, Massachusetts, USA). All procedures were performed in a biosafety cabinet.
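The plate layout above implies a final concentration for each column, which can be worked out as in the sketch below. It assumes that column 10 holds the undiluted stock, that each step toward column 1 halves the concentration, and that copying 5 μL into a 200 μL final volume gives a 40-fold dilution; the small volume of spotted inoculum is ignored. The function name and defaults are illustrative, not part of the published protocol.

```python
def plate_concentrations(stock_mg_per_ml, top_column=10,
                         transfer_ul=5.0, final_ul=200.0):
    """Final drug concentration (ug/mL) in assay columns top_column..1.

    stock_mg_per_ml : stock placed in the top column of the master plate
    transfer_ul     : volume copied from the master plate into each assay well
    final_ul        : final well volume after adding 7H10 medium
    """
    # Concentration in the highest well after the master-to-assay dilution.
    top_ug_per_ml = stock_mg_per_ml * 1000.0 * transfer_ul / final_ul
    # Each column to the left is a further 1:2 dilution.
    return {col: top_ug_per_ml / 2 ** (top_column - col)
            for col in range(top_column, 0, -1)}


standard = plate_concentrations(4.0)   # most antibiotics (4 mg/mL stock)
low_range = plate_concentrations(0.4)  # fluoroquinolones and BDQ (0.4 mg/mL stock)
for col in sorted(standard, reverse=True):
    print(col, round(standard[col], 3), round(low_range[col], 4))
```

Under these assumptions the 4 mg/mL stocks span roughly 100 to 0.2 μg/mL, which is consistent with the RIF MIC99 range reported in the Results.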

Bacterial cultures were grown in Middlebrook 7H9 broth complete at their preferred temperature, shaking at 180 rpm. Once cultures reached approximately OD600 of 1, they were aliquoted in sterile micro tubes, and frozen at -20°C for future use.

For the MIC99 determination, stock cultures were diluted to an OD600 of 0.006 in Middlebrook 7H9 broth complete and 2 μL of the dilution was spotted into columns 1–11 of the 96-well plates. Plates were then incubated at the appropriate temperature and analyzed after confluent growth was observed in the growth control wells (column 11). Pictures of the plates were taken, and visual analysis was carried out to record the MIC99. At least three independent experiments were recorded for each species-antibiotic pair, for 15 antibiotics and 44 species, totaling 1,980 individual MIC determinations.

Mass spectrometry and proteomics

Sample preparation for LC-MS and proteomics was performed as previously described [46]. Mycobacterial species were grown in roller bottles in 100 mL of Middlebrook 7H9 broth complete to an OD600 of 1. Cultures were filtered through MF-Millipore membrane filters (0.22 μm pore size) to concentrate the cells. For each species, 24 bacteria-laden filters were prepared and 3 were placed in petri-dish plates of Middlebrook 7H10 containing 10% OADC supplement and incubated at the appropriate temperature (37 or 30°C, depending on the species) for 5 doubling times to expand the bacterial biomass. Subsequently, filters were transferred to fresh 7H10 plates containing 10% OADC supplement and either vehicle or antibiotic at 6 × MIC99, and incubated for one doubling time at the appropriate temperature. Cells were then scraped either into screw-cap tubes containing 1 mL of ACN:MeOH:water (2:2:1, v:v:v) and glass beads (150 μm) for LC-MS or, for proteomics, washed twice with 1 mL of PBS and placed in lysis buffer with glass beads (150–212 μm). The lysis buffer was 4% SDS/100 mM HEPES/50 mM DTT for the pilot proteomics experiment and 1% SDC/100 mM HEPES/50 mM DTT for the remaining proteomics experiments. All samples were lysed by bead beating. At this stage, LC-MS samples were centrifuged, and the supernatant was collected and filtered using Corning® Costar® Spin-X® plastic centrifuge tube filters (0.22 μm). Proteomics samples were heat killed, subjected to acetone precipitation (only when extracted with SDS), and digested with LysC and trypsin in 100 mM HEPES pH 8, 1 M guanidine HCl.

LC-MS samples were analyzed using a previously described method [47]. Briefly, aqueous normal phase liquid chromatography was performed using an Agilent 1200 LC system at controlled temperature (4 °C) with a flow rate of 0.4 mL min−1. Elution of polar compounds was performed using a gradient of two solvents, A (MS-grade water with 0.1% formic acid) and B (acetonitrile with 0.1% formic acid), in positive mode. The data were analyzed with MassHunter Qualitative Analysis B07.00 or XCMS [48]. To verify the intracellular drug concentration for all samples, the molecular formula of each drug was searched against the raw spectra and the integrated Extracted Ion Chromatogram (EIC) was used to extract quantitative information. The peak heights of both the antibiotic standard and the samples were used to calculate the relative drug concentration in each sample. Although peak area would generally be the variable of choice for quantification, the peak shape was not consistent for LZD, so peak height was expected to correlate better with the amount of drug inside the cell. Further, because each species was exposed to a drug concentration proportional to its MIC99, the amount of drug added to the cells was used for normalization. To identify and quantify ribosyl-RIF, XCMS was used to perform peak picking and alignment, and MetaboAnalyst was used to perform batch effect corrections and statistical analysis [49]. A feature with m/z 955.4601 was identified at 5.8 minutes and confirmed to be present only in RIF-treated samples. The feature abundance was normalized by the abundance in the pooled biological quality control (PBQC) samples (shown in Supplementary Fig. 3). This accurate mass was searched using MassHunter Qualitative Analysis B07.00 with an acceptable error of ±10 ppm and the obtained peaks were integrated to extract the peak area, which was normalized by the peak area of the PBQC (shown in Fig. 4). Finally, the average and the standard deviation of the normalized values from six biological replicates were calculated and plotted.
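The ±10 ppm search window and the PBQC normalization described above amount to simple arithmetic, sketched here for illustration. The function names and the peak areas are hypothetical placeholders, not measured data. The use of the sample standard deviation is also an assumption, since the text does not specify which estimator was used.

```python
from statistics import mean, stdev


def ppm_window(mz, tolerance_ppm):
    """Return the (lower, upper) m/z bounds for a given ppm tolerance."""
    delta = mz * tolerance_ppm / 1e6
    return mz - delta, mz + delta


def normalize_to_pbqc(peak_areas, pbqc_area):
    """Divide each sample peak area by the PBQC peak area."""
    return [area / pbqc_area for area in peak_areas]


# Search window for the ribosyl-RIF feature at m/z 955.4601 with +/-10 ppm.
low_mz, high_mz = ppm_window(955.4601, 10.0)

# Hypothetical peak areas for six biological replicates (illustrative only).
hypothetical_areas = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0]
normalized = normalize_to_pbqc(hypothetical_areas, pbqc_area=2.0)
summary = (mean(normalized), stdev(normalized))  # sample SD assumed
```

For m/z 955.4601, a 10 ppm tolerance corresponds to about ±0.0096 m/z units.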

Proteomics analyses were performed at the Proteomics Scientific Technology Platform (STP) at The Francis Crick Institute. Data Dependent Acquisition (DDA) was used to build a peptide library and Data Independent Acquisition (DIA) was used to analyze the experimental samples. For both DDA and DIA, an Evosep LC system (Evosep) was employed with the standard gradient for a total LC runtime of 44 min, using the manufacturer-supplied 15 cm column [50]. Each sample was loaded from the peptide digests in a volume of at least 10 μL (samples were diluted to give the optimum load in a final loading volume of 10 μL, ensuring that all liquid entered the tip). An aliquot of the recommended amount of iRT peptides (Biognosys AG, Switzerland) was added to each sample at the sample loading stage. The protocol supplied with the Evotips was followed for conditioning, equilibrating, loading and washing the tips.

The outlet of the analytical column was connected directly to an adapter that allowed the EasySpray nano-source to be employed on the Orbitrap Fusion Lumos (Thermo Fisher Scientific, USA) using a stainless-steel emitter. The spray voltage was set to 2.2 kV. The default charge state was set to 2+. For the DDA runs, MS1 data were acquired in profile mode at a resolution of 60,000 (FWHM), with an AGC target of 1E6 ions and a maximum injection time of 50 ms. The ion funnel RF was set at 30%. Quadrupole isolation was employed over the MS1 mass range of 375–1200 m/z. The monoisotopic precursor selection (MIPS) was set to “Peptide” and an intensity threshold of 5E4 was applied. Charge states from 2+ to 6+ were considered for MS/MS and dynamic exclusion was set to 15 seconds/10 ppm, including exclusion of isotopes. Cycle time for the data-dependent MS/MS acquisition was set to 1 second. For MS/MS, quadrupole isolation was set to 1.4 Da and HCD collision energy was employed at 32%. Data were acquired in the Orbitrap at a resolution of 15,000 (FWHM) in centroid mode, with a fixed first mass of 120 m/z. The AGC target was set at 1E6 and the maximum injection time at 22 ms.

For the DIA data acquisition, the following parameters were adjusted. The default charge state was set to 4+, and MS1 data were acquired in profile mode at a resolution of 120,000 (FWHM) with an AGC setting of 1E6 and a maximum injection time of 20 ms. The MS1 scan range was set to 393–907 m/z to allow enough data points per peak. Twenty-seven DIA windows (20 Da wide with 1 Da overlap) were employed over this range for DIA MS2 acquisition. These data were acquired at a resolution of 30,000 (FWHM) in the Orbitrap in centroid mode. HCD collision energy was the same as for the DDA runs. MS2 data were acquired over the mass range 200–2000 m/z with an AGC setting of 1E6 and a maximum injection time of 54 ms. Ions were injected for all available parallelizable time.
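The DIA window scheme is internally consistent: 27 windows of 20 Da that each overlap the next by 1 Da advance in 19 Da steps and cover exactly 393–907 m/z. The sketch below checks this. It assumes the windows tile the range edge to edge starting at 393 m/z, so the actual instrument method may place the boundaries differently. The function name is hypothetical.

```python
def dia_windows(start_mz=393.0, end_mz=907.0, width=20.0, overlap=1.0):
    """Return (lower, upper) m/z bounds of fixed-width, overlapping DIA windows."""
    step = width - overlap  # each window starts 19 Da after the previous one
    windows = []
    lower = start_mz
    while lower + width <= end_mz:
        windows.append((lower, lower + width))
        lower += step
    return windows


windows = dia_windows()
# 27 windows x 19 Da step + 1 Da = 514 Da = 907 - 393
```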

Gene knockdown using CRISPRi

Gene silencing of M. conceptionense arr-1, arr-X, or both genes was performed following the protocols for M. smegmatis described in Wong and Rock [51] with the primers listed in Supplementary Table 4.

Electrocompetent M. conceptionense cells were transformed with one of the three silencing constructs or with the empty vector pLJR962 as a negative control. Transformants were selected by plating cells into Middlebrook 7H9 broth complete with 20 μg/mL of kanamycin.

Individual colonies were picked and inoculated into 10 mL of Middlebrook 7H9 broth complete with 20 μg/mL of kanamycin, and glycerol stocks were prepared for subsequent experiments. To test the antibiotic sensitivity of the knockdown constructs, glycerol stocks were used to inoculate 5 mL of Middlebrook 7H9 broth complete with 20 μg/mL of kanamycin and incubated at 37°C with shaking to an OD600 of ca. 1. These saturated cultures were then used to inoculate fresh 5 mL aliquots of medium with kanamycin to an OD600 of 0.05, and the cultures were incubated at 37°C with shaking to an OD600 of 0.4–0.8. The cultures were then diluted to an OD600 of 0.1 in Middlebrook 7H9 broth complete with 20 μg/mL of kanamycin and 100 μg/mL of anhydrotetracycline (ATc) and grown at 37°C with shaking to an OD600 of 0.4–0.8. This step was repeated a second time. Cultures were then streaked onto Middlebrook 7H10 plates containing 10% OADC, 0.05% glycerol, 20 μg/mL of kanamycin, 200 μg/mL of ATc and rifabutin at 0.5 ×, 1 × and 2 × the MIC99.

Cloning, expression and purification of Arr enzymes

The RIF ADP-ribosyl transferase (arr) genes from M. smegmatis, M. flavescens and M. conceptionense were cloned from genomic DNA by PCR and inserted via isothermal assembly into the pNIC-CTHF expression vector (a gift from Opher Gileadi; Addgene plasmid #26105) [52]. The resulting plasmids were Sanger sequenced to confirm the correct insertion of the genes and then transformed into E. coli BL21 (DE3) Gold competent cells (Agilent Technologies).

The cells transformed with each of the arr genes were grown at 37°C and 200 rpm in 1 L of lysogeny broth (LB) supplemented with 50 μg/mL kanamycin to an OD600 of 0.6, at which point the temperature was lowered to 16°C. Protein expression was induced by addition of isopropyl β-D-thiogalactopyranoside (IPTG) to a final concentration of 0.5 mM, and cells were allowed to grow for 20 hours. Cells were harvested by centrifugation at 4000 × g for 30 min and stored at −80°C.

The purification was performed according to previously reported protocols [34]. In summary, cells were thawed on ice for 30 min before being resuspended in buffer A [50 mM HEPES (pH 7.5), 1 mM EDTA] containing a tablet of complete EDTA-free protease inhibitor cocktail (Roche), 2 μL Benzonase nuclease (Millipore), and 6 mM MgCl2. Samples were further lysed by probe sonication (amplitude 35%, 10 s on, 50 s off, 2 min total on time per cycle, 2–3 cycles) and centrifuged at 48000 × g for 45 min to remove the cell debris. The supernatant was filtered through a 0.45-μm membrane and loaded onto a 20 mL HiPrep Q Sepharose column (GE Healthcare) pre-equilibrated with buffer A. The column was washed with 5 column volumes (CV) of 5% buffer B [50 mM HEPES (pH 7.5), 1 mM EDTA, 1 M NaCl], and the adsorbed proteins were eluted with 5 CV of a linear gradient from 10 to 30% buffer B. Fractions containing the desired Arr protein were pooled and brought to 1.25 M (NH4)2SO4 by dropwise addition of ammonium sulfate solution while stirring. After 30 min, the sample was filtered through a 0.45-μm membrane and loaded onto a 1 mL HiTrap Phenyl Sepharose column (GE Healthcare) pre-equilibrated with buffer C [50 mM sodium phosphate (pH 7.0), 1.25 M (NH4)2SO4]. The column was washed with 15 CV of 10% buffer D [50 mM sodium phosphate (pH 7)], and the adsorbed proteins were eluted with 20 CV of a linear gradient from 10 to 90% buffer D. Fractions were analyzed by sodium dodecyl sulphate-polyacrylamide gel electrophoresis (SDS-PAGE) (NuPAGE Bis-Tris 4–12% precast gels, Thermo Fisher Scientific), pooled, dialyzed against 2 × 2 L of 20 mM HEPES (pH 8.0), concentrated using 5000-molecular-weight-cut-off (MWCO) centrifugal ultrafiltration membranes (Millipore), aliquoted, and stored at −80°C. The concentration was determined spectrophotometrically (NanoDrop, Thermo Fisher Scientific) at 280 nm using a theoretical extinction coefficient of 16960 M−1 cm−1 (ExPASy’s ProtParam [53]).
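Two calculations are implicit in this purification: converting A280 to a protein concentration through the Beer-Lambert law, and converting gradient percentages to salt concentrations. The sketch below illustrates both. The function names and the example absorbance are hypothetical, and the 1 cm path length assumes that the instrument reports absorbance normalized to a 10 mm path.

```python
def protein_concentration_uM(a280, epsilon_per_M_cm=16960.0, path_cm=1.0):
    """Beer-Lambert law: c = A / (epsilon * l), returned in micromolar."""
    return a280 / (epsilon_per_M_cm * path_cm) * 1e6


def salt_at_percent(percent_second, salt_second_M, salt_first_M=0.0):
    """Salt concentration (M) when mixing two buffers at a given % of the second."""
    fraction = percent_second / 100.0
    return salt_first_M * (1.0 - fraction) + salt_second_M * fraction


# Hypothetical reading: A280 of 0.1696 corresponds to 10 uM protein.
example_uM = protein_concentration_uM(0.1696)

# Anion exchange (buffer A: no NaCl, buffer B: 1 M NaCl).
wash_nacl = salt_at_percent(5, 1.0)                                   # 0.05 M
elution_nacl = (salt_at_percent(10, 1.0), salt_at_percent(30, 1.0))   # 0.1 to 0.3 M

# Hydrophobic interaction (buffer C: 1.25 M ammonium sulfate, buffer D: none).
hic_start = salt_at_percent(10, 0.0, salt_first_M=1.25)   # 1.125 M
hic_end = salt_at_percent(90, 0.0, salt_first_M=1.25)     # 0.125 M
```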

In vitro activity assay of Arr enzymes

Assays were carried out based on previously reported methods. Briefly, 150 nM protein (except for M. conceptionense Arr-X, where 50 nM was used) in 50 mM HEPES buffer (pH 7.5) was mixed with 150 μM rifampicin and 2 mM NAD+. Time points (50 μL) were quenched by the addition of methanol (200 μL). The samples were then analyzed by injecting 20 μL onto a Poroshell 120 Å, EC-C18, 3.0 × 150 mm, 2.7 μm HPLC column (Agilent Technologies) and monitoring the consumption of rifampicin and the formation of the ADP-ribosylated rifampicin product. The same protocol was followed for each of the rifamycins under study: rifabutin, rifapentine, rifaximin, rifamycin B and rifamycin S. A standard curve for each rifamycin was measured independently.
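A standard curve can be used to convert an HPLC peak area back to a rifamycin concentration in the reaction, as sketched below. The sketch assumes a linear standard curve fitted by ordinary least squares and corrects for the 5-fold dilution introduced by quenching 50 μL with 200 μL of methanol. The function names and standard values are hypothetical, and the source does not state how the curves were fitted.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_in_reaction(peak_area, slope, intercept,
                              sample_ul=50.0, quench_ul=200.0):
    """Convert a peak area to the concentration in the unquenched reaction."""
    injected = (peak_area - intercept) / slope       # concentration as injected
    dilution = (sample_ul + quench_ul) / sample_ul   # 5-fold from the quench
    return injected * dilution


# Hypothetical standards: concentration (uM) versus peak area (arbitrary units).
standard_uM = [0.0, 10.0, 20.0, 30.0]
standard_area = [0.0, 100.0, 200.0, 300.0]
slope, intercept = fit_line(standard_uM, standard_area)
reaction_uM = concentration_in_reaction(300.0, slope, intercept)
```

With these placeholder standards, a peak area of 300 corresponds to 30 μM as injected and therefore 150 μM in the reaction, which matches the starting rifampicin concentration.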

Arr distribution analysis

To build a phylogenetic tree of mycobacterial Arr enzymes, Arr-ms (UniProt A0QRS5) was used to query a local BLAST [54] database of genomic coding sequences of the mycobacterial reference genomes available in NCBI [40]. The matching sequences were combined with those obtained by searching UniProt [55] for entries matching the RIF-ART protein family (Pfam PF12120 [56]). CD-HIT [53, 57] and Jalview [58] were used to reduce redundancy, and a multiple protein sequence alignment was calculated using MUSCLE [59]. Trees were generated with IQ-Tree 1.6.11 [60, 61] with 1000 ultrafast bootstrap replicates.

Note

This reviewed preprint has been updated to correct the order of the author list.

Additional Declarations

There is no competing interest.

Supplementary figures

Detailed overall MIC99 quantifications against the mycobacterial species in our library. X axes represent the species tested in phylogenetic order, and Y axes are the MIC99 measurements in log10 scale. Dotted lines represent the mean MIC99 for each drug.

Log2(FC) MIC99 (Y axis) compared to the mean MIC99 for each antibiotic. Species organized by phylogeny (X axis). Dotted line represents the mean MIC99 for each antibiotic.

Detailed view of the Mycobacterium holsaticum group MIC99 values

a, Phylogenetic distribution of RIF ADP-ribosyl transferases present in the mycobacterial species in our library. b, Quantification of ribosyl-RIF in some mycobacterial species that encode Arr-1 and/or Arr-X. Not detected (ND).