Machine Learning Based Modelling of Human and Insect Olfaction Screens Millions of compounds to Identify Pleasant Smelling Insect Repellents

  1. Interdepartmental Neuroscience Program, University of California, Riverside, CA 92521, USA
  2. Department of Molecular, Cell and Systems Biology, University of California, Riverside, CA 92521, USA

Peer review process

Not revised: This Reviewed Preprint includes the authors’ original preprint (without revision), an eLife assessment, public reviews, and a provisional response from the authors.


Editors

  • Reviewing Editor
    John Tuthill
    University of Washington, Seattle, United States of America
  • Senior Editor
    Claude Desplan
    New York University, New York, United States of America

Reviewer #1 (Public Review):

Summary:

In this manuscript, the authors set up a pipeline to predict insect repellents that are pleasant and safe for humans. This is done by daisy-chaining a new classification model based on predicting repellents with a published model on predicting human perception. Models use a feature-engineered selection of chemical features to make their predictions. The predicted molecules are then validated against a proxy humanoid (heated brick) and their safety is tested by molecular assays of human cells. The humanistic approach to modeling these authors have taken (which considers cosmetic/aesthetic appeal and safety) is novel and a necessary step for consumer usage. However, the importance of pleasantness over effectiveness is still up for debate (DEET is unpleasant but still used often) and the generalization of safety tests is unknown and assumed. The effectiveness of the prediction models has also yet to be established. They pass the authors' own behavioral tests, but their contribution to the field is unknown as both models (new and published) have not been rigorously benchmarked against previous models. Moreover, the authors' coverage of the literature in this field is sparse, ignoring directly related studies.

Strengths:

Humanistic approach to modeling considers pleasantness and safety. Chaining models can help limit the candidate odorants from the vastness of odor space.

Weaknesses:

The current models need to be benchmarked against leading models predicting similar outcomes. Similarly, many of these papers need to be addressed and discussed in the introduction. The authors might even consider their data sources for model training to increase performance and lexical categorization for interoperability. For instance, the Dravnieks data lexicon, currently used in the human perception lexicon, has been highly criticized for its overlapping and hard-to-interpret descriptive terms ("FRAGRANT", "AROMATIC").

Human Perception:

Khan, R. M., Luk, C. H., Flinker, A., Aggarwal, A., Lapid, H., Haddad, R., & Sobel, N. (2007). Predicting odor pleasantness from odorant structure: pleasantness as a reflection of the physical world. Journal of Neuroscience, 27(37), 10015-10023.

Keller, A., Gerkin, R. C., Guan, Y., Dhurandhar, A., Turu, G., Szalai, B., ... & Meyer, P. (2017). Predicting human olfactory perception from chemical features of odor molecules. Science, 355(6327), 820-826.

Gutiérrez, E. D., Dhurandhar, A., Keller, A., Meyer, P., & Cecchi, G. A. (2018). Predicting natural language descriptions of mono-molecular odorants. Nature communications, 9(1), 4979.

Lee, B. K., Mayhew, E. J., Sanchez-Lengeling, B., Wei, J. N., Qian, W. W., Little, K. A., ... & Wiltschko, A. B. (2023). A principal odor map unifies diverse tasks in olfactory perception. Science, 381(6661), 999-1006.
Related cleaned data: https://github.com/BioMachineLearning/openpom

Insect Repellents:

Wright, R. H. (1956). Physical basis of insect repellency. Nature, 178(4534), 638-638.

Katritzky, A. R., Wang, Z., Slavov, S., Tsikolia, M., Dobchev, D., Akhmedov, N. G., ... & Linthicum, K. J. (2008). Synthesis and bioassay of improved mosquito repellents predicted from chemical structure. Proceedings of the National Academy of Sciences, 105(21), 7359-7364.

Bernier, U. R., & Tsikolia, M. (2011). Development of Novel Repellents Using Structure-Activity Modeling of Compounds in the USDA Archival Database. In Recent Developments in Invertebrate Repellents (pp. 21-46). American Chemical Society.

Wei, J. N., Vlot, M., Sanchez-Lengeling, B., Lee, B. K., Berning, L., Vos, M. W., ... & Dechering, K. J. (2022). A deep learning and digital archaeology approach for mosquito repellent discovery. bioRxiv, 2022-09.

The current study assumes that insect repellents repel via their odor valence to the insect, but this is not accurate. Insect repellents also mask the body odor of humans making them hard to locate. The authors need to consult the literature to understand the localization and landing mechanisms of insects to their hosts. Here, they will understand that heat alone is not the attractant as their behavioral assay would have you believe. I suggest the authors test other behaviour assays to show more convincing evidence of effectiveness. See the following studies:

De Obaldia, M. E., Morita, T., Dedmon, L. C., Boehmler, D. J., Jiang, C. S., Zeledon, E. V., ... & Vosshall, L. B. (2022). Differential mosquito attraction to humans is associated with skin-derived carboxylic acid levels. Cell, 185(22), 4099-4116.

McBride, C. S., Baier, F., Omondi, A. B., Spitzer, S. A., Lutomiah, J., Sang, R., ... & Vosshall, L. B. (2014). Evolution of mosquito preference for humans linked to an odorant receptor. Nature, 515(7526), 222-227.

Wei, J. N., Vlot, M., Sanchez-Lengeling, B., Lee, B. K., Berning, L., Vos, M. W., ... & Dechering, K. J. (2022). A deep learning and digital archaeology approach for mosquito repellent discovery. bioRxiv, 2022-09.

Reviewer #2 (Public Review):

Summary:
This is an interesting study that seeks to identify novel mosquito repellents that smell attractive to humans.

Strengths:
The combination of standard machine learning methods with mosquito behavioral tests is a strength.

Weaknesses:
The study would be strengthened by describing how other modern ML approaches (RF, decision trees) would classify and identify other potential repellents.

A comparison in the repellent activity between DEET and the top ten hits identified in this new study indicates little change in repellent activity (~3%), suggesting that DEET remains the gold standard. Without additional toxicity tests, the study is arguably incremental. The study's novelty should be better clarified.

The Methods in the repellency tests are sparse, and more information would be useful. Testing the top repellents at low doses (<<1%) and for long periods (2-12 h) would strengthen the manuscript. Without this information, the manuscript is lacking in depth.

Testing human subjects on their olfactory perceptions of the repellents would also increase the depth and utility of the manuscript. Without additional experiments, the authors' conclusions lack support and have limited impact on the state-of-the-art.

This manuscript is a mix of different approaches, which makes it lack cohesion. There is the ML method for classifying new repellents that smell good, but no testing of the repellents on human volunteers. The repellents are not tested at realistic concentrations and durations. And the calcium mobilization test is strange and makes little sense in the context of the other experiments and framing of the manuscript.

Reviewer #3 (Public Review):

While I am not a specialist in this field, I do have some knowledge of the subject matter and the computational aspects involved.

The authors employ simple machine learning techniques (such as SVM) for the following purposes:

a. Prediction of aversive valence.
b. Predicting anti-repellent chemicals.
c. Predicting calcium mobilization.

The approach is commonplace in chemoinformatics literature.

Weaknesses:

- All the above models are presented discretely, making it difficult to discern experiment design principles and connectedness.
- The ML work is rudimentary, lacking adequate details. Chemoinformatics has reached great heights, and SVM does not seem contemporary.
- There is significant existing research on finding repellents.

Strengths:

- Authors attempt to make a case for calcium mobilization in the context of repellency. This aspect sounds interesting but is not surprising.
- Behavioral profiling of repellents could be useful.

Author Response

We would like to thank the editors for giving us an opportunity to address the insightful comments made by the referees. In our response to the comments, we provide a guide to important information that may have been overlooked, and hope to elaborate on the context for better evaluating this study.

As mentioned in the introduction of our manuscript, mosquito-transmitted diseases cause nearly a million deaths every year and significant worldwide morbidity. Moreover, the geographical range of mosquito vectors is rapidly expanding due to climate change and mosquito-borne disease risks are emerging in new parts of the world. DEET was discovered in the 1940s and has remained the primary insect repellent for >70 years in the developed world. The US Environmental Protection Agency (EPA) regulates mosquito repellents, and DEET-based commercial products are typically assigned protection times that vary with concentration. Products with lower concentration need repeated applications, whereas those with higher concentrations feel oily and cost more.

We also mentioned that DEET inhibits mammalian cation channels and human acetylcholinesterase. The latter is a target of carbamate insecticides that are commonly used in disease-endemic areas, raising additional concerns about prolonged use of DEET. DEET is also a solvent and damages several forms of plastics, synthetic fabrics, and painted surfaces. Unfortunately, DEET has been of little value in disease control in Africa and Asia. Even in developed countries, a natural, cosmetically pleasant alternative could benefit millions of people who currently avoid repellents.

Innovation in finding new repellents has been slow due to limitations in current research approaches and high costs for EPA registration (especially for synthetic compounds). Since DEET, only five additional actives have been approved by the EPA for repellent products. In the 20+ years since the discovery of insect odorant receptors from genomes, not a single novel repellent compound has been identified and registered by the EPA. Thus, there is both a strong need for new approaches to find insect repellents and a need for new active ingredients that are safe and strategically effective. In fact, this goal of finding new mosquito repellents has been the topic of multiple Gates Foundation Grand Challenge grants, and numerous NIH-funded grants to many research groups around the world.

Reviewer #1 (Public Review):

Summary:

In this manuscript, the authors set up a pipeline to predict insect repellents that are pleasant and safe for humans. This is done by daisy-chaining a new classification model based on predicting repellents with a published model on predicting human perception. Models use a feature-engineered selection of chemical features to make their predictions. The predicted molecules are then validated against a proxy humanoid (heated brick) and their safety is tested by molecular assays of human cells. The humanistic approach to modeling these authors have taken (which considers cosmetic/aesthetic appeal and safety) is novel and a necessary step for consumer usage. However, the importance of pleasantness over effectiveness is still up for debate (DEET is unpleasant but still used often) and the generalization of safety tests is unknown and assumed. The effectiveness of the prediction models has also yet to be established. They pass the authors' own behavioral tests, but their contribution to the field is unknown as both models (new and published) have not been rigorously benchmarked against previous models. Moreover, the authors' coverage of the literature in this field is sparse, ignoring directly related studies.

Strengths:

Humanistic approach to modeling considers pleasantness and safety. Chaining models can help limit the candidate odorants from the vastness of odor space.

Weaknesses:

The current models need to be benchmarked against leading models predicting similar outcomes. Similarly, many of these papers need to be addressed and discussed in the introduction. The authors might even consider their data sources for model training to increase performance and lexical categorization for interoperability. For instance, the Dravnieks data lexicon, currently used in the human perception lexicon, has been highly criticized for its overlapping and hard-to-interpret descriptive terms ("FRAGRANT", "AROMATIC").

Human Perception:

Khan, R. M., Luk, C. H., Flinker, A., Aggarwal, A., Lapid, H., Haddad, R., & Sobel, N. (2007). Predicting odor pleasantness from odorant structure: pleasantness as a reflection of the physical world. Journal of Neuroscience, 27(37), 10015-10023.

Keller, A., Gerkin, R. C., Guan, Y., Dhurandhar, A., Turu, G., Szalai, B., ... & Meyer, P. (2017). Predicting human olfactory perception from chemical features of odor molecules. Science, 355(6327), 820-826.

Gutiérrez, E. D., Dhurandhar, A., Keller, A., Meyer, P., & Cecchi, G. A. (2018). Predicting natural language descriptions of mono-molecular odorants. Nature communications, 9(1), 4979.

Lee, B. K., Mayhew, E. J., Sanchez-Lengeling, B., Wei, J. N., Qian, W. W., Little, K. A., ... & Wiltschko, A. B. (2023). A principal odor map unifies diverse tasks in olfactory perception. Science, 381(6661), 999-1006.

Author response: The human perception predictions were performed using models that we had reported in two earlier publications: Kowalewski & Ray, iScience (2020b) and Kowalewski, Huynh & Ray, Chem. Senses (2021). Three of the four references pointed out by the referee were cited in these prior studies, which involved computational validation by predicting on a test set of the data that was left out of training (as is typically done), and also predicting across different human studies with a high degree of success. A rigorous benchmarking of the odor perception models was done in Kowalewski, Huynh & Ray, Chem. Senses (2021) and in a mini-review published in the same issue of the journal by Gerkin, Chem. Senses (2021). This included a favorable comparison with two of the references indicated by the referee: Keller et al., Science (2017) and Gutiérrez et al., Nat. Commun. (2018). The fourth reference, Lee et al., Science (2023), describes a neural network approach and was published well after our mosquito behavior studies were completed. Although they used an advanced neural network model, Lee et al. worked with 2-D structures of compounds, in contrast to our 3-D approach. They also did not report cross-study validations, comparisons with Keller et al. (2017), or benchmarks against past studies, so it is difficult to compare advances, if any.

The intent of the current study was to move beyond testing approaches, of which there are many, and instead work on a practical use case. As we see it, it is not necessarily the prediction of fragrance character or quality alone that matters but overlap with other predicted bioactivities. From the perspective of human use, a molecule with a pleasing scent that also repels insects is likely to be far more useful than one with an unappealing scent. Accordingly, our task in this study was to select molecules that fit into specific use categories: display strong insect repellency, have pleasing scent profiles, are natural in origin and are potentially repurposed from flavors and fragrances.
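The selection logic described above, keeping only molecules whose predicted bioactivities overlap, can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the `Candidate` fields, the score scales, the thresholds, and the compound names are all hypothetical placeholders standing in for the outputs of separate repellency and perception models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One screened molecule with hypothetical model outputs (not study data)."""
    name: str
    repellency_score: float      # assumed repellency-model output in [0, 1]
    pleasantness_score: float    # assumed perception-model output in [0, 1]
    is_natural: bool             # reported as naturally occurring
    is_flavor_or_fragrance: bool # already used as a flavor or fragrance


def select_candidates(cands, min_repellency=0.8, min_pleasantness=0.6):
    """Keep molecules that pass every filter, strongest predicted repellents first.

    The thresholds are illustrative assumptions, not values from the study.
    """
    kept = [
        c for c in cands
        if c.repellency_score >= min_repellency
        and c.pleasantness_score >= min_pleasantness
        and c.is_natural
        and c.is_flavor_or_fragrance
    ]
    return sorted(kept, key=lambda c: c.repellency_score, reverse=True)


# Invented example molecules, one per failure mode plus two that pass.
candidates = [
    Candidate("compound_A", 0.93, 0.81, True, True),
    Candidate("compound_B", 0.95, 0.20, True, True),    # repellent but unpleasant
    Candidate("compound_C", 0.40, 0.90, True, True),    # pleasant but weak repellent
    Candidate("compound_D", 0.88, 0.75, False, False),  # synthetic, not repurposed
    Candidate("compound_E", 0.85, 0.70, True, True),
]

selected = select_candidates(candidates)
print([c.name for c in selected])
```

Requiring every criterion at once means that a molecule scoring highly on repellency alone is discarded, which reflects the stated priority on overlap rather than on any single prediction.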

Insect Repellents:

Wright, R. H. (1956). Physical basis of insect repellency. Nature, 178(4534), 638-638.

Katritzky, A. R., Wang, Z., Slavov, S., Tsikolia, M., Dobchev, D., Akhmedov, N. G., ... & Linthicum, K. J. (2008). Synthesis and bioassay of improved mosquito repellents predicted from chemical structure. Proceedings of the National Academy of Sciences, 105(21), 7359-7364.

Bernier, U. R., & Tsikolia, M. (2011). Development of Novel Repellents Using Structure-Activity Modeling of Compounds in the USDA Archival Database. In Recent Developments in Invertebrate Repellents (pp. 21-46). American Chemical Society.

Author response: The Katritzky et al., PNAS (2008) paper is cited in our study, and we have indicated that the chemical analogs reported therein are part of the training data set in our study. We thank the reviewer for pointing us to the book chapter by Bernier & Tsikolia (2011), which reviews the QSAR approaches taken for repellent discovery and in large measure focuses on the Katritzky et al., PNAS (2008) paper. We did cite two relevant studies by Uli Bernier, but agree that citation of the book chapter would make a nice addition.

The current study assumes that insect repellents repel via their odor valence to the insect, but this is not accurate. Insect repellents also mask the body odor of humans making them hard to locate. The authors need to consult the literature to understand the localization and landing mechanisms of insects to their hosts. Here, they will understand that heat alone is not the attractant as their behavioral assay would have you believe. I suggest the authors test other behaviour assays to show more convincing evidence of effectiveness. See the following studies:

De Obaldia, M. E., Morita, T., Dedmon, L. C., Boehmler, D. J., Jiang, C. S., Zeledon, E. V., ... & Vosshall, L. B. (2022). Differential mosquito attraction to humans is associated with skin-derived carboxylic acid levels. Cell, 185(22), 4099-4116.

McBride, C. S., Baier, F., Omondi, A. B., Spitzer, S. A., Lutomiah, J., Sang, R., ... & Vosshall, L. B. (2014). Evolution of mosquito preference for humans linked to an odorant receptor. Nature, 515(7526), 222-227.

Wei, J. N., Vlot, M., Sanchez-Lengeling, B., Lee, B. K., Berning, L., Vos, M. W., ... & Dechering, K. J. (2022). A deep learning and digital archaeology approach for mosquito repellent discovery. bioRxiv, 2022-09.

Author response: In this study, we took an unbiased approach to compile the training data set, including several known insect repellents of varying chemical structures and volatility, for most of which there is no information on how they are sensed by insects. Not surprisingly, the repellents we identified are varied in structure and in functional groups, and are likely detected in more than one way by the mosquitoes, using olfactory and/or gustatory systems. We did not consider “masking” of skin attraction as a factor in the training data set in this study, which precluded the need to discuss the papers pointed out by the referee in any detail. In fact, there is an extremely vast and rich body of literature regarding human skin odor, CO2 and breath emanations, which includes our own contributions of research and review articles that are not discussed in the current paper.

We did in fact conduct human arm-in-cage experiments with a few of the compounds reported in this study using female Aedes aegypti mosquitoes; a preprint describes the smaller-scale analysis, the results of which show strong repellency, in Boyle et al., bioRxiv (2016) https://doi.org/10.1101/060178 (Figure 4). However, heat offers a practical proxy for evaluating prospective repellents in a high-throughput manner. It would certainly be desirable to further evaluate additional candidates from the heat attraction assay with human subjects in the future.

We thank the reviewer for pointing out the preprint by Wei et al., bioRxiv (2022). Our approaches differ in that Wei et al. do not consider properties such as fragrance and toxicity. We also cannot assume that their newer neural network model is superior because, although the model uses a large training dataset, it does not use 3D chemical structures, which are extremely relevant for biological activity. While very little information is available for the actives reported in Wei et al., we independently evaluated their top compounds that are similar to or better than DEET (CAS#3731-16-6, 4282-32-0, 2040-04-2, 32940-15-1 and 3446-90-0) and could not find information about toxicity, smell, or natural source. In contrast, the top repellents that we identify here as similar to or better than DEET (N=8) are all classified as GRAS (Generally Recognized as Safe) compounds by the Flavor and Extract Manufacturers Association (FEMA), are all naturally occurring (plum, jasmine, mushroom, grapes, etc.), and have pleasant smells. The dermal toxicity values in rabbits are known for six of our compounds and are at the best possible levels (5000 mg/kg).

Reviewer #2 (Public Review):

Summary:

This is an interesting study that seeks to identify novel mosquito repellents that smell attractive to humans.

Strengths:

The combination of standard machine learning methods with mosquito behavioral tests is a strength.

Weaknesses:

The study would be strengthened by describing how other modern ML approaches (RF, decision trees) would classify and identify other potential repellents.

Author response: The current approach already shows a success rate >85% for repellency coefficient >0.5 and identifies eight naturally occurring GRAS compounds with repellency as strong as or greater than DEET. This substantially expands the repertoire of strong natural repellents. Since the 1950s, only six active ingredients have been registered by the US EPA for use in topical repellents, of which only two are natural in origin (oil of lemon eucalyptus and catmint oil), and they typically do not protect as well as DEET does. That being said, we have since explored other predictive algorithms, for instance neural networks. The experimental evaluation of these newer pipelines will take significant resources and time and will be the focus of future grants.
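For readers unfamiliar with how such a success rate is tallied, the following Python sketch shows one straightforward reading of it: the fraction of behaviorally tested candidates whose repellency coefficient exceeds a cutoff. The function name and the coefficient values are invented for illustration and are not data from the study; the 0.5 cutoff is the one quoted in the response.

```python
def success_rate(coefficients, threshold=0.5):
    """Fraction of tested compounds whose repellency coefficient exceeds threshold."""
    if not coefficients:
        raise ValueError("at least one tested compound is required")
    hits = sum(1 for value in coefficients if value > threshold)
    return hits / len(coefficients)


# Invented coefficients for eight hypothetical predicted repellents.
example_coefficients = [0.92, 0.81, 0.77, 0.64, 0.58, 0.55, 0.31, 0.95]

rate = success_rate(example_coefficients)
print(f"{rate:.1%} of tested candidates exceed the 0.5 cutoff")
```

Under this reading, the rate depends both on the cutoff and on how many predicted compounds were tested, so both should be reported alongside the percentage.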

A comparison in the repellent activity between DEET and the top ten hits identified in this new study indicates little change in repellent activity (~3%), suggesting that DEET remains the gold standard. Without additional toxicity tests, the study is arguably incremental. The study's novelty should be better clarified.

Author response: There is an urgent need to find new insect repellents that have better chances of being adopted by people who avoid DEET, such as in Africa and Asia. Having more natural actives that are effective expands the tools against disease-transmitting mosquitoes. As mentioned above, the top repellents that we identified as similar to or better than DEET (N=8) are all classified as GRAS (Generally Recognized as Safe) compounds by the Flavor and Extract Manufacturers Association (FEMA), are all naturally occurring (plum, jasmine, mushroom, grapes), and have pleasant smells. The dermal toxicity values in rabbits are known for six of them and are at the best possible levels (5000 mg/kg).

The Methods in the repellency tests are sparse, and more information would be useful. Testing the top repellents at low doses (<<1%) and for long periods (2-12 h) would strengthen the manuscript. Without this information, the manuscript is lacking in depth.

Author response: The US Environmental Protection Agency (EPA) regulates mosquito repellents, and DEET-based commercial products are typically assigned protection times that vary with concentration (10%: ~2 h, 30%: ~5 h, 100%: ~8 h). These would be the relevant concentrations for testing protection times on human volunteers, not lower as suggested. Such studies fall within the realm of EPA registration efforts, involving extensive GLP-testing for safety, physical chemistry, and Human Subjects Board approvals. This is outside the scope of the current study and is typically accomplished during development efforts.

Testing human subjects on their olfactory perceptions of the repellents would also increase the depth and utility of the manuscript. Without additional experiments, the authors' conclusions lack support and have limited impact on the state-of-the-art.

This manuscript is a mix of different approaches, which makes it lack cohesion. There is the ML method for classifying new repellents that smell good, but no testing of the repellents on human volunteers. The repellents are not tested at realistic concentrations and durations. And the calcium mobilization test is strange and makes little sense in the context of the other experiments and framing of the manuscript.

Author response: The human olfaction validation that we present in this paper is consistent with most current publications in the field (for example, Keller et al., Gutiérrez et al.). More systematic validation of the human odor character prediction pipelines used was presented in two previous papers, Kowalewski & Ray, iScience (2020b) and Kowalewski, Huynh & Ray, Chem. Senses (2021), and in a mini-review published in the same issue of the journal by Gerkin, Chem. Senses (2021).

Reviewer #3 (Public Review):

While I am not a specialist in this field, I do have some knowledge of the subject matter and the computational aspects involved. The authors employ simple machine learning techniques (such as SVM) for the following purposes:

(a) Prediction of aversive valence.

(b) Predicting anti-repellent chemicals.

(c) Predicting calcium mobilization.

The approach is commonplace in chemoinformatics literature.

Weaknesses:

  • All the above models are presented discretely, making it difficult to discern experiment design principles and connectedness.
  • The ML work is rudimentary, lacking adequate details. Chemoinformatics has reached great heights, and SVM does not seem contemporary.
  • There is significant existing research on finding repellents.

Author response: In the current study, we aimed to showcase how computational research may be combined with basic science to create scalable pipelines that address real-world problems, rather than to demonstrate methodological novelty of chemoinformatics approaches. Specifically, we wanted to use different predictive models to identify compounds that display strong insect repellency, have pleasing scent profiles, are natural in origin and are potentially repurposed from flavors and fragrances. Unfortunately, there is very little existing research on insect repellents that have these types of properties, which would make them better candidates for EPA registration. Most tested compounds are synthetic, are often analogs of known repellents like DEET, and necessitate substantial time and resources to register. Moreover, the identities of chemosensory receptors that are responsible for repellency to DEET and other compounds, and that are conserved across Anopheles, Aedes and Culex mosquitoes, are not known.

It is true that the field of cheminformatics has experimented with a variety of newer approaches, based in part on neural networks (e.g., Graph Neural Networks and graph embeddings to encode chemical structure rather than a more conventional Extended Connectivity Fingerprint (ECFP)). Importantly, however, novelty does not imply usefulness. The mosquito behavior experiments that we present show a very high success rate (>85%), validating our approach and identifying several excellent candidates already.

Strengths:

  • Authors attempt to make a case for calcium mobilization in the context of repellency. This aspect sounds interesting but is not surprising.
  • Behavioral profiling of repellents could be useful.

Author Comment: We thank the referee for this comment. We have indeed done behavioral profiling for several repellents that evoke calcium mobilization, but we do not see any clear correlation thus far.
