An antimicrobial drug recommender system using MALDI-TOF MS and dual-branch neural networks

  1. Gaetan De Waele (corresponding author)
  2. Gerben Menschaert
  3. Willem Waegeman
  1. Department of Data Analysis and Mathematical Modelling, Ghent University, Belgium
13 figures, 6 tables and 1 additional file

Figures

Architectural overview of the proposed model.

Antimicrobial resistance (AMR) labels of spectrum–drug pairs can be represented in an incomplete matrix. A microbial sample that is susceptible to a drug is denoted by a negative label (orange), whereas positive labels (blue) signify an intermediate or resistant combination. Instance (spectrum) and target (drug) embeddings $\boldsymbol{x}_i$ and $\boldsymbol{t}_j$ are obtained from their respective input representations passed through their respective neural network branch. The two resulting embeddings are aggregated to a single score by their (scaled) dot product. The cross-entropy loss optimizes this score to be maximal or minimal for positive or negative combinations of microbial spectra and drugs, respectively.
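As a minimal sketch of the scoring step described in this caption, the snippet below combines a spectrum embedding and a drug embedding by a dot product and evaluates a binary cross-entropy loss over the observed entries of the label matrix only. The scaling by the square root of the embedding dimension, the use of `None` for missing labels, and all function names are assumptions made for illustration; in the model itself the embeddings come from the two neural network branches rather than from hard-coded vectors.

```python
import math


def scaled_dot_score(x_i, t_j):
    """Aggregate a spectrum embedding and a drug embedding into one logit.

    Assumption: the dot product is scaled by sqrt(embedding dimension).
    """
    dot = sum(a * b for a, b in zip(x_i, t_j))
    return dot / math.sqrt(len(x_i))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def cross_entropy(logit, label):
    """Binary cross-entropy for one spectrum-drug pair (label 1 = positive)."""
    p = sigmoid(logit)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def masked_loss(X, T, Y):
    """Mean loss over observed entries of the incomplete label matrix Y.

    X: spectrum embeddings, T: drug embeddings, Y[i][j] in {0, 1, None}.
    """
    losses = [
        cross_entropy(scaled_dot_score(X[i], T[j]), y)
        for i, row in enumerate(Y)
        for j, y in enumerate(row)
        if y is not None  # unobserved pairs contribute nothing
    ]
    return sum(losses) / len(losses)


# Toy embeddings (illustrative values only).
X = [[1.0, 0.0, 1.0, 0.0], [-1.0, 0.5, 0.0, 0.0]]
T = [[2.0, 0.0, 2.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
Y = [[1, None], [0, 0]]

example_score = scaled_dot_score(X[0], T[0])
example_loss = masked_loss(X, T, Y)
```

A higher score lowers the loss for a positive pair and raises it for a negative pair, which is the behavior the caption attributes to the cross-entropy objective.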

Barplots showing test performance results for all trained models.

Area under the receiver operating characteristic curve (ROC-AUC) evaluates overall ranking of predictions. Prec@1(-) evaluates how often the top suggested treatment would be effective. Both metrics are calculated per spectrum/patient and then averaged. Errorbars represent the standard deviation over five random model seeds. The x-axis and colors show the different drug and spectrum embedders, respectively.
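The two spectrum-level metrics can be sketched as follows. The snippet assumes that higher scores mean "more likely resistant", that the top suggested treatment is the drug with the lowest predicted score, and that spectra with only one label class are skipped when averaging ROC-AUC; none of these conventions is spelled out in the caption, so they should be read as assumptions.

```python
def roc_auc(labels, scores):
    """Probability that a random positive outranks a random negative.

    Ties count as 0.5. Returns None when one class is absent.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def prec_at_1_neg(labels, scores):
    """1.0 if the drug with the lowest predicted score is truly susceptible."""
    best = min(range(len(scores)), key=scores.__getitem__)
    return 1.0 if labels[best] == 0 else 0.0


def spectrum_macro(metric, per_spectrum):
    """Compute a metric per spectrum, then average the defined values."""
    values = [metric(labels, scores) for labels, scores in per_spectrum]
    values = [v for v in values if v is not None]
    return sum(values) / len(values)


# Toy data: one (labels, scores) pair per spectrum, over its observed drugs.
per_spectrum = [
    ([1, 0, 0], [0.9, 0.2, 0.4]),
    ([0, 1, 1, 0], [0.3, 0.8, 0.1, 0.2]),
    ([0, 0], [0.1, 0.3]),  # no positives: ROC-AUC undefined, skipped
]

macro_auc = spectrum_macro(roc_auc, per_spectrum)
macro_prec = spectrum_macro(prec_at_1_neg, per_spectrum)
```

In this toy example the second spectrum receives its lowest score for a drug it is actually resistant to, so it contributes 0 to Prec@1(-).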

Transfer learning of DRIAMS-A models to other hospitals.

Errorbands show the standard deviation over five runs. Results in terms of other evaluation metrics are shown in Appendix 3—figure 4.

UMAP scatterplots of test set matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) spectra embeddings $\boldsymbol{x}_i$.

Top: embeddings from a ‘general’ (trained on all species) recommender. Only embeddings belonging to the 25 most frequently occurring species in the test set are shown. The panels on the right show the same embeddings as on the left, but colored according to their antimicrobial resistance (AMR) status to a certain drug. The four displayed drugs are selected based on a ranking of the product of the number of positive and negative labels, $\sum_{i=1}^{n}[y_{ij}=0] \cdot \sum_{i=1}^{n}[y_{ij}=1]$. In this way, the drugs that have many observed labels, both positive and negative, are displayed. Bottom: highlighted embeddings from an S. epidermidis-specific recommender model.
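The drug selection rule in this caption can be illustrated with a short sketch: for each drug (column), count the observed negative and positive labels and rank drugs by the product of the two counts. The label matrix, the `None` convention for unobserved pairs, and the drug names below are hypothetical.

```python
def rank_drugs_by_label_balance(Y, drug_names):
    """Rank drugs by (#negative labels) * (#positive labels), descending.

    Y[i][j] is 0, 1, or None (unobserved) for spectrum i and drug j.
    """
    products = {}
    for j, name in enumerate(drug_names):
        column = [row[j] for row in Y if row[j] is not None]
        n_neg = sum(1 for y in column if y == 0)
        n_pos = sum(1 for y in column if y == 1)
        products[name] = n_neg * n_pos
    ranking = sorted(drug_names, key=lambda d: products[d], reverse=True)
    return ranking, products


# Hypothetical label matrix: 4 spectra x 3 drugs.
Y = [
    [0, 1, None],
    [1, 1, 0],
    [0, None, 0],
    [1, 0, 0],
]
ranking, products = rank_drugs_by_label_balance(Y, ["drug_A", "drug_B", "drug_C"])
```

A drug with many labels of only one class scores zero under this rule, which is why it favors drugs that are both frequently tested and reasonably balanced.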

Appendix 2—figure 1
Structure used for the residual blocks, used in the 1D CNN, 2D CNN, and transformer.

In the case of convolutions, the output is zero padded so as to produce the same output dimensions as in the input.
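A residual connection adds a block's input to its output, so both must have the same shape. The sketch below shows one common way to achieve this for a 1D convolution: zero-padding so that the output length equals the input length. The ReLU nonlinearity, the single-layer block, and the odd kernel length are assumptions for illustration; the actual layer order is given in the figure, not in this caption.

```python
def conv1d_same(x, kernel):
    """1D cross-correlation with zero padding so that len(output) == len(x).

    Assumes an odd kernel length.
    """
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [
        sum(kernel[m] * padded[i + m] for m in range(k))
        for i in range(len(x))
    ]


def residual_block(x, kernel):
    """Convolution, nonlinearity (ReLU as a stand-in), then skip connection."""
    y = conv1d_same(x, kernel)
    y = [max(0.0, v) for v in y]
    return [a + b for a, b in zip(x, y)]


x = [1.0, 2.0, 3.0, 4.0]
kernel = [1.0, 0.0, -1.0]
conv_out = conv1d_same(x, kernel)
block_out = residual_block(x, kernel)
```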

Appendix 2—figure 2
Overview of all different drug embedders tested in this work.

One-hot embeddings are the only technique not incorporating prior knowledge of the structure of the compound. Hence, they are the only technique incapable of directly transferring to new compounds. Morgan fingerprints produce a bit-vector containing information on the presence of certain substructures. DeepSMILES strings are encoded and processed with a 1D CNN, GRU, or transformer. Drawings of molecules are processed with a 2D CNN. A string kernel on SMILES strings produces a numerical vector for every drug (taken as the row in the resulting Gram matrix).
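To make the contrast between structure-free and structure-aware embedders concrete, the sketch below builds a one-hot vector and a string-kernel embedding in which each drug is represented by its row of the Gram matrix. The kernel shown is a generic q-gram (spectrum) kernel used as a stand-in; the exact kernel, the value of q, and the toy SMILES strings (ethanol, acetic acid, ethylamine, not antimicrobials) are assumptions.

```python
from collections import Counter


def one_hot(index, n_drugs):
    """Structure-free embedding: cannot represent a drug unseen in training."""
    return [1.0 if k == index else 0.0 for k in range(n_drugs)]


def qgrams(s, q):
    """Count all substrings of length q in a string."""
    return Counter(s[i:i + q] for i in range(len(s) - q + 1))


def spectrum_kernel(a, b, q):
    """Inner product of q-gram count vectors of two strings."""
    ca, cb = qgrams(a, q), qgrams(b, q)
    return float(sum(ca[g] * cb[g] for g in ca.keys() & cb.keys()))


def gram_rows(smiles, q):
    """Row i of the Gram matrix serves as the numerical vector for drug i."""
    return [[spectrum_kernel(a, b, q) for b in smiles] for a in smiles]


toy_smiles = ["CCO", "CC(=O)O", "CCN"]
G = gram_rows(toy_smiles, q=2)
first_one_hot = one_hot(0, len(toy_smiles))
```

Unlike the one-hot vector, a Gram-matrix row can be computed for a new compound by evaluating the kernel between its string and the training drugs, which is what allows structure-based embedders to transfer to unseen compounds.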

Appendix 2—figure 3
All hyperparameter tuning experiments.

All evaluations are listed in terms of validation area under the receiver operating characteristic curve (ROC-AUC). All numbers are averages of five model runs, with errorbars showing standard deviations. In every experiment, the setting with the highest average is chosen for use in the final models. (A) Tuning of kernel and hidden size in a DeepSMILES CNN. (B) Tuning of kernel and hidden size in an Image CNN. (C) Tuning of alphabet in a DeepSMILES CNN. (D) Tuning of positional encodings in a DeepSMILES Transformer. (E) Tuning of directionality in a DeepSMILES GRU. (F) Tuning of number of bits in a Morgan fingerprint-based drug embedder.

Appendix 3—figure 1
Spectrum-macro receiver operating characteristic (ROC) curve for best-performing model (Morgan fingerprints drug embedder, medium-sized spectrum embedder).

The y-axis shows the average sensitivity (across patients), while the x-axis shows one minus the average specificity. Note that this ROC curve is not a traditional ROC curve constructed from one single label set and one corresponding prediction set. Rather, it is constructed from spectrum-macro metrics as follows: for any possible threshold value, binarize all predictions. Then, for every spectrum/patient independently, compute the sensitivity and specificity for the subset of labels corresponding to that spectrum/patient. Finally, those sensitivities and specificities are averaged across patients to obtain one point on the ROC curve above. The optimal sensitivity and specificity according to the Youden index are indicated in blue (Youden, 1950).
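The construction described in this caption can be sketched as follows: binarize all predictions at a threshold, compute sensitivity and specificity per spectrum, average them across spectra, and repeat over thresholds. The Youden index of a point is sensitivity + specificity − 1. Skipping spectra for which a quantity is undefined (no positives or no negatives) is an assumption here, as are the toy data and the threshold grid.

```python
def sens_spec(labels, preds):
    """Sensitivity and specificity for one spectrum; None if undefined."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    sens = tp / (tp + fn) if (tp + fn) else None
    spec = tn / (tn + fp) if (tn + fp) else None
    return sens, spec


def mean_defined(values):
    values = [v for v in values if v is not None]
    return sum(values) / len(values)


def spectrum_macro_roc(per_spectrum, thresholds):
    """One (threshold, mean sensitivity, mean specificity) point per threshold."""
    points = []
    for thr in thresholds:
        pairs = [
            sens_spec(labels, [1 if s >= thr else 0 for s in scores])
            for labels, scores in per_spectrum
        ]
        points.append((
            thr,
            mean_defined([p[0] for p in pairs]),
            mean_defined([p[1] for p in pairs]),
        ))
    return points


def youden_optimum(points):
    """Point maximizing J = sensitivity + specificity - 1."""
    return max(points, key=lambda p: p[1] + p[2] - 1.0)


per_spectrum = [
    ([1, 0, 0], [0.9, 0.2, 0.4]),
    ([0, 1, 1, 0], [0.3, 0.8, 0.6, 0.2]),
]
points = spectrum_macro_roc(per_spectrum, thresholds=[0.1, 0.5, 0.95])
best = youden_optimum(points)
```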

Appendix 3—figure 2
Barplots showing test performance results for all trained models.

Colors represent the different spectrum embedder model sizes. Performance is shown in terms of macro area under the receiver operating characteristic curve (ROC-AUC) (computed per drug and averaged). Errorbars represent the standard deviation over five random seeds.

Appendix 3—figure 3
Performance of models compared against a linear spectrum embedder baseline.

The comparison is only shown for the best-performing drug embedder (Morgan fingerprints). Errorbars represent the standard deviation over five random seeds.

Appendix 3—figure 4
Transfer learning of DRIAMS-A models to other hospitals.

Errorbands show the standard deviation over five runs.

Appendix 3—figure 5
UMAP scatterplots of test set matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) spectra embeddings $\boldsymbol{x}_i$.

Embeddings from a ‘general’ (trained on all spectra across species) recommender are shown. Only embeddings belonging to the 25 most frequently occurring species in the test set are shown. Spectra are colored according to their antimicrobial resistance (AMR) status to a certain drug. The 20 displayed drugs were selected based on a ranking of the product of the number of positive and negative labels, $\sum_{i=1}^{n}[y_{ij}=0] \cdot \sum_{i=1}^{n}[y_{ij}=1]$. In this way, the drugs that have many observed labels, both positive and negative, are displayed. The drugs here are ranked 5–24 (the first four are shown in Figure 4). To map the clusters back to species, readers are referred to Figure 4.

Appendix 3—figure 6
UMAP scatterplots of test set matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) spectra embeddings $\boldsymbol{x}_i$.

Embeddings from two ‘species-specific’ recommenders are shown. Spectra are colored according to their antimicrobial resistance (AMR) status to a certain drug.

Tables

Table 1
All tested model sizes for the (instance) spectrum branch.

Hidden sizes represent the evolution of the hidden state dimensionality as it goes through the model, with every hyphen defining one fully connected layer. The listed number of parameters only includes those of the instance (spectrum) branch.

| Size | # weights | Hidden sizes |
| --- | --- | --- |
| S | 1,578,176 | 6000-256-128-64 |
| M | 3,246,784 | 6000-512-256-128-64 |
| L | 6,846,144 | 6000-1024-512-256-128-64 |
| XL | 15,093,440 | 6000-2048-1024-512-256-128-64 |
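
The weight counts in Table 1 can be checked against the hidden sizes with a short calculation. Counting only the weights and biases of the fully connected layers gives slightly fewer parameters than listed (for example 1,577,408 for size S). The listed totals are reproduced exactly if every hidden layer except the final 64-dimensional output also carries one scale and one shift parameter per unit, as a normalization layer would. That extra term is inferred from the arithmetic and is not stated in the caption, so it is an assumption.

```python
def count_parameters(hidden_sizes, norm_on_hidden=True):
    """Count parameters of an MLP described as e.g. '6000-256-128-64'.

    Dense layers contribute weights plus biases. If norm_on_hidden is True,
    each hidden layer except the last output adds 2 parameters per unit
    (assumed per-feature scale and shift; inferred, not stated in the source).
    """
    sizes = [int(s) for s in hidden_sizes.split("-")]
    dense = sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))
    norm = sum(2 * h for h in sizes[1:-1]) if norm_on_hidden else 0
    return dense + norm


TABLE_1 = {
    "S": "6000-256-128-64",
    "M": "6000-512-256-128-64",
    "L": "6000-1024-512-256-128-64",
    "XL": "6000-2048-1024-512-256-128-64",
}
counts = {size: count_parameters(spec) for size, spec in TABLE_1.items()}
dense_only_S = count_parameters(TABLE_1["S"], norm_on_hidden=False)
```

In every model size the first layer, which maps the 6000-dimensional spectrum input to the first hidden state, accounts for the large majority of the parameters.
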
Table 2
Test performance of selected general and species-specific dual branch recommender models.

The listed averages and standard deviations are calculated over five independent runs of the same model. Performance is computed on the subset of labels spanning the 25 most common species in DRIAMS-A.

| Model | ROC-AUC | Prec@1(-) | Macro ROC-AUC |
| --- | --- | --- | --- |
| General recommender (Morgan fingerprints – M) | 0.9411 ± 0.0007 | 0.9967 ± 0.0011 | 0.7684 ± 0.0050 |
| General recommender (one-hot – L) | 0.9408 ± 0.0011 | 0.9940 ± 0.0009 | 0.7746 ± 0.0316 |
| Species-specific recommenders (Morgan fingerprints – M) | 0.9461 ± 0.0010 | 0.9973 ± 0.0004 | 0.7905 ± 0.0151 |
| Species-specific recommenders (one-hot – L) | 0.9468 ± 0.0012 | 0.9950 ± 0.0011 | 0.7686 ± 0.0155 |
  1. ROC-AUC, area under the receiver operating characteristic curve.

Table 3
Test performance of selected recommender models, compared to the performance of a collection of models – each trained on only one species–drug combination – coined ‘species–drug classifiers’.

‘Species–drug classifiers’ refer to a collection of binary classifiers, each trained to predict antimicrobial resistance (AMR) status for a subset of data comprising a single species–drug combination. ‘Simulated expert’s best guess’ refers to counting AMR label frequencies in single species–drug combinations and taking those as predictions. The listed averages and standard deviations are calculated over five independent runs of the same model. Given the non-stochastic nature of the logistic regression and XGBoost implementations, only one set of models is trained and, hence, no standard deviations are reported. Performance is computed on the subset of labels spanning the 200 most common species–drug combinations.
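The ‘simulated expert’s best guess’ baseline can be sketched as a frequency table: for every species–drug combination, the fraction of positive labels is used as the prediction for any new sample of that combination. The records below are toy values, and fitting the frequencies on training data before applying them to test data is an assumption about the procedure.

```python
from collections import defaultdict


def fit_label_frequencies(records):
    """Fraction of positive AMR labels per (species, drug) combination.

    records: iterable of (species, drug, label) with label in {0, 1}.
    """
    counts = defaultdict(lambda: [0, 0])  # [n_positive, n_total]
    for species, drug, label in records:
        counts[(species, drug)][0] += label
        counts[(species, drug)][1] += 1
    return {key: pos / total for key, (pos, total) in counts.items()}


def predict(frequencies, species, drug):
    """Same score for every sample of a given species-drug combination."""
    return frequencies[(species, drug)]


# Toy training records (labels are illustrative, not taken from DRIAMS).
train = [
    ("Escherichia coli", "cefuroxime", 0),
    ("Escherichia coli", "cefuroxime", 0),
    ("Escherichia coli", "cefuroxime", 1),
    ("Escherichia coli", "cefuroxime", 0),
    ("Staphylococcus aureus", "penicillin", 1),
    ("Staphylococcus aureus", "penicillin", 1),
]
frequencies = fit_label_frequencies(train)
guess = predict(frequencies, "Escherichia coli", "cefuroxime")
```

Because every sample within one combination receives the same score, this baseline cannot rank samples within a combination, which is consistent with the species–drug macro ROC-AUC of 0.5000 listed for it in the table.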

| Model | ROC-AUC | Prec@1(-) | Macro ROC-AUC | Species–drug macro ROC-AUC |
| --- | --- | --- | --- | --- |
| Species-specific recommenders (Morgan fingerprints – M) | 0.9009 ± 0.0018 | 0.9830 ± 0.0015 | 0.8283 ± 0.0059 | 0.6381 ± 0.0121 |
| Species-specific recommenders (one-hot – L) | 0.9030 ± 0.0018 | 0.9814 ± 0.0020 | 0.8129 ± 0.0079 | 0.6511 ± 0.0290 |
| General recommender (Morgan fingerprints – M) | 0.8939 ± 0.0016 | 0.9746 ± 0.0006 | 0.8114 ± 0.0064 | 0.6517 ± 0.0076 |
| General recommender (one-hot – L) | 0.8933 ± 0.0020 | 0.9778 ± 0.0023 | 0.8124 ± 0.0033 | 0.6521 ± 0.0078 |
| Species–drug classifiers (MLP – S) | 0.8341 ± 0.0135 | 0.9420 ± 0.0123 | 0.8005 ± 0.0032 | 0.6745 ± 0.0218 |
| Species–drug classifiers (MLP – M) | 0.8382 ± 0.0077 | 0.9421 ± 0.0196 | 0.8075 ± 0.0049 | 0.6797 ± 0.0097 |
| Species–drug classifiers (MLP – L) | 0.8457 ± 0.0088 | 0.9505 ± 0.0100 | 0.8037 ± 0.0079 | 0.6648 ± 0.0149 |
| Species–drug classifiers (MLP – XL) | 0.8611 ± 0.0049 | 0.9722 ± 0.0041 | 0.8106 ± 0.0069 | 0.6801 ± 0.0101 |
| Species–drug classifiers (logistic regression) | 0.8684 | 0.9432 | 0.7989 | 0.7200 |
| Species–drug classifiers (XGBoost) | 0.8346 | 0.9196 | 0.7763 | 0.6236 |
| Simulated expert’s best guess | 0.8681 | 0.9743 | 0.7159 | 0.5000 |
  1. ROC-AUC, area under the receiver operating characteristic curve.

Appendix 1—table 1
Full list of modifications made to drug names in DRIAMS.

Modifications consist of (1) removal of drugs, (2) merging of drugs, and (3) renaming drugs.

| Original drug name | Step undertaken |
| --- | --- |
| Quinolones | Removed |
| Aminoglycosides | Removed |
| Ofloxacin | Merged with levofloxacin |
| Benzylpenicillin | Merged with penicillin |
| Benzylpenicillin_others | Merged with penicillin |
| Benzylpenicillin_with_meningitis | Merged with penicillin |
| Benzylpenicillin_with_pneumonia | Merged with penicillin |
| Penicillin_with_endokarditis | Merged with penicillin |
| Penicillin_without_endokarditis | Merged with penicillin |
| Penicillin_without_meningitis | Merged with penicillin |
| Penicillin_with_meningitis | Merged with penicillin |
| Penicillin_with_pneumonia | Merged with penicillin |
| Penicillin_with_other_infections | Merged with penicillin |
| Cefuroxime.1 | Merged with cefuroxime |
| Cotrimoxazol | Merged with cotrimoxazole |
| Gentamicin_high_level | Merged with gentamicin |
| Cefoxitin_screen | Merged with cefoxitin |
| Teicoplanin_GRD | Merged with teicoplanin |
| Vancomycin_GRD | Merged with vancomycin |
| Rifampicin_1mg-l | Merged with rifampicin |
| Meropenem_with_meningitis | Merged with meropenem |
| Meropenem_without_meningitis | Merged with meropenem |
| Meropenem_with_pneumonia | Merged with meropenem |
| Amoxicillin-Clavulanic acid_uncomplicated_HWI | Merged with amoxicillin-clavulanic acid |
| Strepomycin_high_level | Renamed to streptomycin |
| Bacitracin | Renamed to bacitracin A |
| Ceftarolin | Renamed to ceftaroline fosamil |
| Fosfomycin-Trometamol | Renamed to fosfomycin tromethamine |
Appendix 3—table 1
Full table of test results.

The listed averages and standard deviations are calculated over five independent runs of the same model. The best models for every metric per drug embedder are underlined. The overall best model for every metric is in bold face.

| Drug embedder | Spectrum embedder | ROC-AUC | Prec@1(-) | Macro ROC-AUC |
| --- | --- | --- | --- | --- |
| Morgan fingerprints | S | 0.9341 ± 0.0014 | 0.9917 ± 0.0009 | 0.8158 ± 0.0070 |
| | M | 0.9345 ± 0.0014 | 0.9922 ± 0.0009 | 0.8078 ± 0.0081 |
| | L | 0.9341 ± 0.0007 | 0.9920 ± 0.0010 | 0.8070 ± 0.0128 |
| | XL | 0.9322 ± 0.0017 | 0.9920 ± 0.0012 | 0.7904 ± 0.0155 |
| One-hot embedding | S | 0.9326 ± 0.0017 | 0.9899 ± 0.0010 | 0.7984 ± 0.0086 |
| | M | 0.9337 ± 0.0014 | 0.9910 ± 0.0016 | 0.7920 ± 0.0175 |
| | L | 0.9338 ± 0.0018 | 0.9882 ± 0.0010 | 0.8011 ± 0.0116 |
| | XL | 0.9327 ± 0.0011 | 0.9890 ± 0.0026 | 0.7932 ± 0.0201 |
| DeepSMILES 1-D CNN | S | 0.9303 ± 0.0012 | 0.9864 ± 0.0016 | 0.7949 ± 0.0185 |
| | M | 0.9336 ± 0.0011 | 0.9903 ± 0.0008 | 0.8009 ± 0.0044 |
| | L | 0.9337 ± 0.0015 | 0.9890 ± 0.0014 | 0.7940 ± 0.0052 |
| | XL | 0.9317 ± 0.0012 | 0.9898 ± 0.0020 | 0.7960 ± 0.0155 |
| String Kernel (LINGO) | S | 0.9327 ± 0.0022 | 0.9913 ± 0.0012 | 0.7972 ± 0.0087 |
| | M | 0.9332 ± 0.0017 | 0.9916 ± 0.0008 | 0.7919 ± 0.0051 |
| | L | 0.9317 ± 0.0017 | 0.9909 ± 0.0013 | 0.7859 ± 0.0136 |
| | XL | 0.9303 ± 0.0021 | 0.9893 ± 0.0025 | 0.7935 ± 0.0135 |
| Image – 2-D CNN | S | 0.9310 ± 0.0016 | 0.9888 ± 0.0025 | 0.7820 ± 0.0101 |
| | M | 0.9317 ± 0.0008 | 0.9885 ± 0.0019 | 0.7866 ± 0.0084 |
| | L | 0.9332 ± 0.0010 | 0.9901 ± 0.0016 | 0.7758 ± 0.0070 |
| | XL | 0.9309 ± 0.0012 | 0.9900 ± 0.0013 | 0.7711 ± 0.0109 |
| DeepSMILES Transformer | S | 0.9306 ± 0.0021 | 0.9900 ± 0.0022 | 0.7862 ± 0.0124 |
| | M | 0.9325 ± 0.0012 | 0.9891 ± 0.0014 | 0.7925 ± 0.0075 |
| | L | 0.9308 ± 0.0014 | 0.9885 ± 0.0027 | 0.7902 ± 0.0072 |
| | XL | 0.9311 ± 0.0014 | 0.9895 ± 0.0009 | 0.7791 ± 0.0075 |
| DeepSMILES RNN | S | 0.9291 ± 0.0015 | 0.9872 ± 0.0032 | 0.7881 ± 0.0053 |
| | M | 0.9293 ± 0.0028 | 0.9863 ± 0.0008 | 0.7793 ± 0.0116 |
| | L | 0.9266 ± 0.0012 | 0.9868 ± 0.0019 | 0.7684 ± 0.0058 |
| | XL | 0.9278 ± 0.0029 | 0.9879 ± 0.0027 | 0.7689 ± 0.0113 |
  1. ROC-AUC, area under the receiver operating characteristic curve.

Appendix 3—table 2
Test area under the receiver operating characteristic curve (ROC-AUC) performance per species.

Reported figures are averages across the five different medium-sized Morgan fingerprint-based recommenders.

| Species | ROC-AUC |
| --- | --- |
| Staphylococcus aureus | 0.9578 |
| Staphylococcus epidermidis | 0.9478 |
| Escherichia coli | 0.9184 |
| Klebsiella pneumoniae | 0.9643 |
| Pseudomonas aeruginosa | 0.7614 |
| Enterobacter cloacae | 0.9831 |
| Proteus mirabilis | 0.9727 |
| Staphylococcus hominis | 0.9594 |
| Serratia marcescens | 0.9848 |
| Staphylococcus capitis | 0.9425 |
| Enterococcus faecium | 0.9914 |
| Klebsiella oxytoca | 0.9861 |
| Klebsiella variicola | 0.9824 |
| Citrobacter koseri | 0.9970 |
| Enterococcus faecalis | 0.9594 |
| Staphylococcus lugdunensis | 0.9705 |
| Citrobacter freundii | 0.9622 |
| Morganella morganii | 0.9931 |
| Proteus vulgaris | 0.9828 |
| Staphylococcus haemolyticus | 0.9751 |
| Candida albicans | 0.7446 |
| Streptococcus pneumoniae | 0.9059 |
| Stenotrophomonas maltophilia | 1.0000 |
| Campylobacter jejuni | 1.0000 |
| Haemophilus influenzae | 1.0000 |


Cite this article

Gaetan De Waele, Gerben Menschaert, Willem Waegeman (2024)
An antimicrobial drug recommender system using MALDI-TOF MS and dual-branch neural networks
eLife 13:RP93242.
https://doi.org/10.7554/eLife.93242.4