Top-down machine learning approach for high-throughput single-molecule analysis

  1. David S White
  2. Marcel P Goldschen-Ohm
  3. Randall H Goldsmith (corresponding author)
  4. Baron Chanda (corresponding author)
  1. University of Wisconsin-Madison, United States
  2. University of Texas at Austin, United States

Abstract

Single-molecule approaches provide enormous insight into the dynamics of biomolecules, but adequately characterizing distributions of states and events often requires extensive sampling. Although emerging experimental techniques can generate such large datasets, existing analysis tools are not suited to processing the volume of data obtained in high-throughput paradigms. Here, we present a new analysis platform (DISC) that accelerates unsupervised analysis of single-molecule trajectories. By merging model-free statistical learning with the Viterbi algorithm, DISC idealizes single-molecule trajectories up to three orders of magnitude faster, and with improved accuracy, compared to other commonly used algorithms. Further, we demonstrate the utility of the DISC algorithm for probing cooperativity between multiple binding events in the cyclic nucleotide binding domains of the HCN pacemaker channel. Given the flexible and efficient nature of DISC, we anticipate it will be a powerful tool for unsupervised processing of high-throughput data across a range of single-molecule experiments.
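To make the Viterbi step mentioned in the abstract concrete, the following is a minimal sketch of how a noisy trajectory can be idealized once candidate intensity levels are known. It is not the DISC implementation: the function name `viterbi_idealize`, the fixed noise width `sigma`, the single `stay_prob` transition parameter, and the assumption that the levels are supplied by the caller are all illustrative choices, whereas DISC learns the states from the data before this step.

```python
import math


def viterbi_idealize(trace, levels, sigma=0.3, stay_prob=0.95):
    """Return the most likely state index for each sample of `trace`.

    Hypothetical helper: assumes Gaussian noise of width `sigma` around
    each level and a single probability `stay_prob` of remaining in the
    current state between consecutive samples.
    """
    if not trace:
        return []
    k = len(levels)
    if k < 2:
        return [0] * len(trace)

    log_stay = math.log(stay_prob)
    log_switch = math.log((1.0 - stay_prob) / (k - 1))

    def log_emit(x, mu):
        # Gaussian log-likelihood, dropping the constant shared by all states.
        return -((x - mu) ** 2) / (2.0 * sigma ** 2)

    def log_trans(i, j):
        return log_stay if i == j else log_switch

    # score[j] holds the best log-probability of any path ending in state j.
    score = [log_emit(trace[0], mu) for mu in levels]
    backptr = []
    for x in trace[1:]:
        new_score, ptr = [], []
        for j in range(k):
            best_i = max(range(k), key=lambda i: score[i] + log_trans(i, j))
            new_score.append(score[best_i] + log_trans(best_i, j) + log_emit(x, levels[j]))
            ptr.append(best_i)
        score = new_score
        backptr.append(ptr)

    # Trace back from the best final state to recover the full path.
    state = max(range(k), key=lambda j: score[j])
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    noisy = [0.1, -0.2, 0.05, 1.1, 0.9, 1.2, 0.0, 0.1]
    print(viterbi_idealize(noisy, levels=[0.0, 1.0]))
```

Because switching states carries a penalty, an isolated noisy sample is usually absorbed into the surrounding state rather than reported as a brief event, which is the behaviour one generally wants from an idealization step.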

Data availability

Simulated and raw data, in addition to analysis scripts, are available at https://zenodo.org/record/3727917#.Xn0Fw9NKjq0 (DOI: 10.5281/zenodo.3727917).

Article and author information

Author details

  1. David S White

    Neuroscience, University of Wisconsin-Madison, Madison, United States
    Competing interests
    No competing interests declared.
  2. Marcel P Goldschen-Ohm

    Neuroscience, University of Texas at Austin, Austin, United States
    Competing interests
    No competing interests declared.
    ORCID: 0000-0003-1466-9808
  3. Randall H Goldsmith

    Chemistry, University of Wisconsin-Madison, Madison, United States
    For correspondence
    rhg@chem.wisc.edu
    Competing interests
    No competing interests declared.
  4. Baron Chanda

    Department of Neuroscience, University of Wisconsin-Madison, Madison, United States
    For correspondence
    chanda@wisc.edu
    Competing interests
    Baron Chanda, Reviewing editor, eLife.
    ORCID: 0000-0003-4954-7034

Funding

National Institute of Neurological Disorders and Stroke (NS-101723)

  • Baron Chanda

National Institute of Neurological Disorders and Stroke (NS-081320)

  • Baron Chanda

National Institute of Neurological Disorders and Stroke (NS-081293)

  • Baron Chanda

National Institute of General Medical Sciences (GM007507)

  • David S White

National Institute of General Medical Sciences (GM127957)

  • Randall H Goldsmith

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Copyright

© 2020, White et al.

This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

David S White, Marcel P Goldschen-Ohm, Randall H Goldsmith, Baron Chanda (2020)
Top-down machine learning approach for high-throughput single-molecule analysis
eLife 9:e53357.
https://doi.org/10.7554/eLife.53357
