Active machine learning-driven experimentation to determine compound effects on protein patterns

  1. Armaghan W Naik
  2. Joshua D Kangas
  3. Devin P Sullivan
  4. Robert F Murphy (corresponding author)
  1. Carnegie Mellon University, United States

Abstract

High-throughput screening determines the effects of many conditions on a given biological target. Currently, estimating the effects of those conditions on other targets requires either strong modeling assumptions (e.g. similarities among targets) or separate screens. Ideally, data-driven experimentation could be used to learn accurate models for many conditions and targets without doing all possible experiments. We have previously described an active machine learning algorithm that can iteratively choose small sets of experiments to learn models of multiple effects. We now show that, with no prior knowledge and with liquid handling robotics and automated microscopy under its control, this learner accurately learned the effects of 48 chemical compounds on the subcellular localization of 48 proteins while performing only 29% of all possible experiments. The results represent the first practical demonstration of the utility of active learning-driven biological experimentation in which the set of possible phenotypes is unknown in advance.
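To make the scale of the experiment space concrete, the following is a minimal Python sketch of a batch-wise experiment loop over the 48 x 48 compound-protein grid, stopped at a 29% budget. It is not the authors' algorithm: the selection rule is a random placeholder where an active learner would rank untested pairs with its current model, and `simulated_experiment`, `choose_batch`, `run_campaign`, and the batch size of 96 are all hypothetical names and values chosen for illustration.

```python
import random

N_COMPOUNDS = 48  # from the abstract
N_PROTEINS = 48   # from the abstract


def simulated_experiment(compound, protein):
    """Hypothetical stand-in for a robot-run imaging experiment.

    Returns a made-up phenotype label; it carries no biological meaning.
    """
    return (compound * 7 + protein * 3) % 4


def choose_batch(unobserved, observed, batch_size, rng):
    """Placeholder selection rule: pick untested pairs uniformly at random.

    An active learner would instead use `observed` to rank candidates.
    """
    candidates = sorted(unobserved)
    return rng.sample(candidates, min(batch_size, len(candidates)))


def run_campaign(budget_fraction=0.29, batch_size=96, seed=0):
    """Run batches of experiments until the budget fraction is reached."""
    rng = random.Random(seed)
    all_pairs = {(c, p) for c in range(N_COMPOUNDS) for p in range(N_PROTEINS)}
    budget = int(budget_fraction * len(all_pairs))
    observed = {}
    while len(observed) < budget:
        unobserved = all_pairs - set(observed)
        size = min(batch_size, budget - len(observed))  # assumed batch size
        for pair in choose_batch(unobserved, observed, size, rng):
            observed[pair] = simulated_experiment(*pair)
    return observed, len(all_pairs)


if __name__ == "__main__":
    observed, total = run_campaign()
    print(f"{len(observed)} of {total} experiments performed")
```

Under these assumptions the full grid contains 2,304 compound-protein pairs, and a 29% budget corresponds to 668 of them.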

Article and author information

Author details

  1. Armaghan W Naik

    Computational Biology Department, Center for Bioimage Informatics, Carnegie Mellon University, Pittsburgh, United States
    Competing interests
    The authors declare that no competing interests exist.
  2. Joshua D Kangas

    Computational Biology Department, Center for Bioimage Informatics, Carnegie Mellon University, Pittsburgh, United States
  3. Devin P Sullivan

    Computational Biology Department, Center for Bioimage Informatics, Carnegie Mellon University, Pittsburgh, United States
  4. Robert F Murphy

    Computational Biology Department, Center for Bioimage Informatics, Carnegie Mellon University, Pittsburgh, United States
    For correspondence
    murphy@cmu.edu

Copyright

© 2016, Naik et al.

This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

  1. Armaghan W Naik
  2. Joshua D Kangas
  3. Devin P Sullivan
  4. Robert F Murphy
(2016)
Active machine learning-driven experimentation to determine compound effects on protein patterns
eLife 5:e10047.
https://doi.org/10.7554/eLife.10047
