A chemical informatics pipeline to screen for pleasant-smelling insect repellents

A, Overview of the cheminformatics pipeline used to identify novel DEET-like ligands from a larger chemical space. Summary table of the top 10 physicochemical features that optimally correlate with repellency. Receiver operating characteristic (ROC) curve, which indicates the success in predicting test repellents. Performance is averaged over 100 splits, each using 80% of the data for training and 20% for testing. The mean area under the curve (AUC) is provided. Representative structures from the top 150 predicted repellent compounds.
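The split-and-average evaluation described in this legend can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the one-feature `centroid_scorer` is a hypothetical stand-in for the actual models, the toy data are invented, and splits whose test or training part lacks one of the two classes are simply skipped (an assumption; the legend does not say how such splits were handled).

```python
import random

def auc(labels, scores):
    """ROC AUC as the probability that a random positive outscores
    a random negative (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def centroid_scorer(train_x, train_y):
    """Hypothetical stand-in model on a single feature: score a compound by
    its signed offset from the midpoint of the two class means."""
    pos = [x for x, y in zip(train_x, train_y) if y == 1]
    neg = [x for x, y in zip(train_x, train_y) if y == 0]
    pos_mean, neg_mean = sum(pos) / len(pos), sum(neg) / len(neg)
    mid = (pos_mean + neg_mean) / 2
    sign = 1.0 if pos_mean >= neg_mean else -1.0
    return lambda x: sign * (x - mid)

def mean_auc(xs, ys, n_splits=100, test_frac=0.2, seed=0):
    """Average test-set AUC over repeated random train/test splits."""
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    n_test = max(1, round(test_frac * len(xs)))
    aucs = []
    for _ in range(n_splits):
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        # Assumption: skip splits in which either part lacks a class.
        if len({ys[i] for i in test}) < 2 or len({ys[i] for i in train}) < 2:
            continue
        score = centroid_scorer([xs[i] for i in train], [ys[i] for i in train])
        aucs.append(auc([ys[i] for i in test], [score(xs[i]) for i in test]))
    return sum(aucs) / len(aucs)

# Invented, perfectly separable toy data (not from the study).
toy_x = [i / 10 for i in range(20)]
toy_y = [0] * 10 + [1] * 10
toy_mean_auc = mean_auc(toy_x, toy_y)
```

On the separable toy data every usable split scores perfectly, which makes the sketch easy to check; real data would give a spread of AUC values whose mean is the figure reported in the plot.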

Models successfully predict odor perceptual qualities of repellents in humans

A, Validation of models predicting the odor perceptual qualities (146 perceptual descriptors) of a set of compounds that were screened for repellency in Figure 3A. The performance metric for the validation is the area under the ROC curve, which assesses the rate of correctly predicted perceptual descriptors relative to the rate of incorrectly predicted descriptors. The dashed line is the average area under the curve (AUC). B, Sample ROC curves for select compounds, together with the ROC curve across all compounds (including those shown as separate curves in the ROC plot and those not shown whose identities appear in Figure 2A); the respective AUC values are reported inside the plot. C, The top 25 perceptual descriptors for the predicted repellents that underwent experimental validation, displayed as a word cloud. The font size of each descriptor is scaled relative to its frequency.

Experimental validation of predicted insect repellents shows a high success rate, and several of the repellents are pleasant-smelling

A, Representative still image from the landing assay of female Aedes aegypti mosquitoes on a filter-paper-covered heat block over 5 min (overlay of all frames), and B, mean repellency index for each of the indicated compounds tested at ∼0.2 μg/cm2. The repellency coefficient is the number of mosquitoes attracted to the solvent-treated control netting minus the number of mosquitoes on the treated netting, compared to the solvent control, with a 37 °C heated rectangle as the attractant. Mosquito numbers are counted for the 5 min duration, over multiple frames, as described in the methods. N = 2-4 trials, ∼20 mosquitoes/trial. For DEET, N = 23 trials. C, Mean preference index of Drosophila adults for repellents at three different concentrations dissolved in acetone, measured after 48 h in a two-choice trap assay baited with 10% apple cider vinegar. N = 7-10 trials per treatment, 10 flies/trial, error bars = s.e.m.
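One possible reading of the repellency calculation in panel B is sketched below. The normalized form, (control minus treated) divided by control, is an assumption; the legend gives the difference and says it is compared to solvent, but the exact formula is in the methods. The trial counts are invented for illustration.

```python
from statistics import mean

def repellency_index(control_count, treated_count):
    """Assumed form: (control - treated) / control.
    1.0 means no landings on treated netting; 0.0 means no difference
    from the solvent control."""
    if control_count <= 0:
        raise ValueError("control count must be positive")
    return (control_count - treated_count) / control_count

# Hypothetical (control, treated) landing counts from three trials;
# these are not data from the study.
trials = [(20, 5), (18, 9), (22, 0)]
indices = [repellency_index(c, t) for c, t in trials]
mean_index = mean(indices)
```

Under this reading a value near 1 indicates strong repellency, and a negative value would indicate more landings on the treated netting than on the solvent control.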

Some, but not all, insect repellents increase Ca2+ mobilization in a human cell line

DEET induces a non-specific calcium response in vitro. A. Dose-response curves for three mammalian cell lines (HEK293, CHOK1, NG108) and one insect cell line (S2). EC50 and Hill coefficients for each cell line are given below. B. Representative DEET-induced calcium kinetics in HEK293 cells. C. HEK293 cells treated with thapsigargin 10 min before the assay to deplete intracellular calcium stores no longer respond to DEET, suggesting that the DEET response depends on intracellular calcium. Error bars are standard deviation. D. F, Luminescence of the Ca2+ indicator at two different concentrations for predicted and previously known repellent chemicals, scaled relative to DEET.
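For readers unfamiliar with the two parameters reported for panel A, the standard Hill equation relates them to a dose-response curve as sketched below. The function name, the `bottom` and `top` plateau parameters, and the example values are illustrative assumptions; the fitted values for each cell line are those given with the figure, and the fitting procedure itself is not shown here.

```python
def hill_response(conc, ec50, hill, bottom=0.0, top=1.0):
    """Standard Hill equation:
    bottom + (top - bottom) * c^n / (EC50^n + c^n),
    where n is the Hill coefficient."""
    if conc < 0 or ec50 <= 0:
        raise ValueError("conc must be >= 0 and ec50 must be > 0")
    if conc == 0:
        return bottom
    ratio = (conc / ec50) ** hill
    return bottom + (top - bottom) * ratio / (1.0 + ratio)

# Hypothetical parameters, chosen only to illustrate the curve shape.
example_ec50, example_hill = 2.0, 1.5
curve = [hill_response(c, example_ec50, example_hill)
         for c in (0.0, 0.5, 2.0, 8.0, 32.0)]
```

By construction the response is halfway between the plateaus at the EC50, and a larger Hill coefficient makes the transition around the EC50 steeper.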

Supplemental Figure S1

A, Overview of the approach to generate and validate the machine learning models. Here, a model-averaged prediction is made, where each model has access to different physicochemical features and is trained on different combinations of training set chemicals. B, The R2 values over the 100 train/test splits, averaged into 10 bins. C, The average classification success over 100 train/test splits, assessed by the area under the ROC curve. Active labels (positive cases) were assigned to chemicals scoring in the top ∼40% of Ca2+ values. This is compared to shuffling the labels before training, reported as “Shuffle.”
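The labeling step and the shuffled-label control in panel C can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the cutoff is taken as exactly 40% (the legend says approximately 40%), ties at the cutoff are broken by input order, and the Ca2+ values are invented.

```python
import random

def label_top_fraction(values, fraction=0.4):
    """Label the highest-scoring `fraction` of values as active (1),
    the rest as inactive (0)."""
    n_active = round(fraction * len(values))
    ranked = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    active = set(ranked[:n_active])
    return [1 if i in active else 0 for i in range(len(values))]

def shuffled_labels(labels, seed=0):
    """Return a permuted copy of the labels: the 'Shuffle' control keeps the
    class balance but breaks any link between chemical and label."""
    rng = random.Random(seed)
    out = list(labels)
    rng.shuffle(out)
    return out

# Hypothetical Ca2+ response values (not data from the study).
ca_values = [0.1, 0.9, 0.5, 0.7, 0.2]
labels = label_top_fraction(ca_values)
control_labels = shuffled_labels(labels)
```

A model trained on `control_labels` should score near chance (AUC around 0.5), which is the baseline the real models are compared against.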

Screen of 10+ million chemicals predicts several insect repellents with different smells

A, Tabulated predicted % repellency (relative to DEET) from a library of 10+ million chemicals, filtered to the top values. The best-matching known repellent is displayed alongside the Euclidean distance, the predicted LD50, and the predicted Ca2+ mobilization. B, Heatmap organizing the top predictions according to estimated perceptual qualities.
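The best-match lookup in panel A can be sketched as a nearest-neighbor search by Euclidean distance. The feature space is an assumption (the legend does not say which features the distance is computed over), and the names and vectors below are placeholders rather than real compounds.

```python
import math

def nearest_known(query, known):
    """Return (name, distance) of the known repellent whose feature vector
    is closest to `query` in Euclidean distance."""
    name = min(known, key=lambda k: math.dist(query, known[k]))
    return name, math.dist(query, known[name])

# Placeholder feature vectors (not real physicochemical descriptors).
known_repellents = {
    "known_1": (3.0, 4.0),
    "known_2": (1.0, 0.0),
}
best_name, best_distance = nearest_known((0.0, 0.0), known_repellents)
```

In practice the features would typically be scaled before computing distances, since a Euclidean distance is dominated by whichever feature has the largest numeric range.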