Deciphering anomalous heterogeneous intracellular transport with neural networks

  1. Daniel Han (corresponding author)
  2. Nickolay Korabel
  3. Runze Chen
  4. Mark Johnston
  5. Anna Gavrilova
  6. Victoria J Allan (corresponding author)
  7. Sergei Fedotov (corresponding author)
  8. Thomas A Waigh (corresponding author)
  1. Department of Mathematics, University of Manchester, United Kingdom
  2. School of Biological Sciences, University of Manchester, United Kingdom
  3. Department of Physics and Astronomy, University of Manchester, United Kingdom
  4. Department of Computer Science, University of Manchester, United Kingdom
  5. The Photon Science Institute, University of Manchester, United Kingdom
13 figures, 3 tables and 1 additional file

Figures

Figure 1
Tests of exponent estimation for the DLFNN using N = 10^4 simulated fBm trajectories.

(a) Plots showing the Hurst exponent estimates of fBm trajectories with n = 10^2 data points by a triangular DLFNN with three hidden layers compared with conventional methods. Plots are vertically grouped by Hurst exponent estimation method: (left to right) rescaled range, MSD, sequential range and DLFNN. σ_H values are shown in the title. Top row: Scatter plots of estimated Hurst exponents H_est and the true value of Hurst exponents from simulation H_sim. The red line shows perfect estimation. Second row: Due to the density of points, a Gaussian kernel density estimation was made of the plots in the top row (see Materials and methods). Third row: Scatter plots of the difference between the true value of Hurst exponents from simulation and estimated Hurst exponent, ΔH = H_sim − H_est. Last row: Gaussian kernel density estimation of the plots in the third row. (b) σ_H as a function of the number of consecutive fBm trajectory data points n for different methods of exponent estimation. Example structures for two hidden layers and n = 5 time series input points of the anti-triangular, rectangular and triangular DLFNN are shown in (c, d and e), respectively. (f) σ_H as a function of the number of hidden layers in the DLFNN for triangular, rectangular and anti-triangular structures. (g) σ_H as a function of the number of randomly sampled fBm trajectory data points n_rand, with the number of hidden layers in the DLFNN shown in the legend. (h) σ_H as a function of the noise-to-signal ratio (NSR = Noise/Signal) from Gaussian random numbers added to all n = 10^2 data points in simulated fBm trajectories. (i) Plots of bias b(H_sim), variance Var(H_sim) and mean square error (MSE) as functions of H_sim. For each value of H_sim, fBm trajectories with n = 100 points were simulated and estimated by a triangular DLFNN.
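
The bias, variance and MSE in panel (i) can be illustrated with a minimal NumPy sketch. It does not reproduce the DLFNN: as a stand-in it uses a simple MSD-slope estimator, and it simulates fBm by Cholesky factorisation of the fBm covariance rather than with the fbm package. The function names, the lag range and the sign convention for the bias (mean of H_est minus H_sim) are assumptions made for illustration.

```python
import numpy as np

def simulate_fbm(n, hurst, rng):
    """Sample fBm at times t = 1..n via Cholesky factorisation of its covariance."""
    t = np.arange(1, n + 1, dtype=float)
    # fBm covariance: 0.5 * (t^2H + s^2H - |t - s|^2H)
    cov = 0.5 * (t[:, None] ** (2 * hurst) + t[None, :] ** (2 * hurst)
                 - np.abs(t[:, None] - t[None, :]) ** (2 * hurst))
    return np.linalg.cholesky(cov) @ rng.standard_normal(n)

def msd_hurst(x, max_lag=10):
    """Stand-in estimator: H is half the log-log slope of the time-averaged MSD."""
    lags = np.arange(1, max_lag + 1)
    msd = np.array([np.mean((x[lag:] - x[:-lag]) ** 2) for lag in lags])
    slope, _ = np.polyfit(np.log(lags), np.log(msd), 1)
    return slope / 2.0

def estimator_errors(h_sim, n=100, trials=200, seed=0):
    """Return (bias, variance, MSE) of the estimator at a fixed true exponent."""
    rng = np.random.default_rng(seed)
    h_est = np.array([msd_hurst(simulate_fbm(n, h_sim, rng)) for _ in range(trials)])
    bias = h_est.mean() - h_sim            # assumed convention: E[H_est] - H_sim
    variance = h_est.var()                 # population variance of the estimates
    mse = np.mean((h_est - h_sim) ** 2)    # equals bias**2 + variance
    return bias, variance, mse

if __name__ == "__main__":
    for h in (0.3, 0.5, 0.7):
        b, v, m = estimator_errors(h)
        print(f"H_sim={h:.1f}  bias={b:+.3f}  var={v:.4f}  MSE={m:.4f}")
```

With the population variance, the MSE decomposes exactly as bias squared plus variance, which is why the three curves in panel (i) carry the same information as any two of them.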

Figure 2
DLFNN analysis of a simulated trajectory.

Top: Plot of displacement as a function of time from a simulated fBm trajectory (blue) with multiple exponent values. Bottom: Hurst exponent values used for simulation (magenta), and the DLFNN exponent predictions using a 15 point moving window (black).

Figure 3 with 2 supplements
DLFNN analysis of a GFP-Rab5 endosome trajectory.

Top: Plot of displacement from a single trajectory in an MRC-5 cell (blue). Shaded areas show persistent (0.55 < H < 1 in green) and anti-persistent (0 < H < 0.45 in magenta) behaviour. Middle: A 15 point moving window DLFNN exponent estimate for the trajectory (black) with a dashed line marking diffusion (H = 0.5) and two dotted lines marking the confidence bounds for estimation at H = 0.45 and 0.55. Bottom: Plot of instantaneous and moving (15 point) window velocity. Right: Plot of the trajectory with start and finish positions. Persistent (green) and anti-persistent (magenta) segments are shown. Sections with 0.45 < H < 0.55 were not classified as persistent or anti-persistent and are depicted in blue.
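
The moving-window classification described in this caption can be sketched as follows. The DLFNN itself is not reproduced: `estimator` is a hypothetical callable standing in for any function that maps a window of points to a Hurst exponent, and the function names and the label encoding (+1, -1, 0) are assumptions for illustration. Only the 15 point window and the 0.45 and 0.55 thresholds come from the caption.

```python
import numpy as np

def moving_window_estimate(x, estimator, window=15):
    """Apply `estimator` to every run of `window` consecutive points of x."""
    x = np.asarray(x, dtype=float)
    return np.array([estimator(x[i:i + window])
                     for i in range(len(x) - window + 1)])

def classify(h, lower=0.45, upper=0.55):
    """Label each local exponent: +1 persistent, -1 anti-persistent, 0 unclassified."""
    h = np.asarray(h, dtype=float)
    labels = np.zeros(h.shape, dtype=int)
    labels[h > upper] = 1
    labels[h < lower] = -1
    return labels

def segments(labels):
    """Return (label, start, stop) for each contiguous run; stop is exclusive."""
    labels = np.asarray(labels)
    if labels.size == 0:
        return []
    change = np.flatnonzero(np.diff(labels)) + 1   # indices where the label changes
    starts = np.concatenate(([0], change))
    stops = np.concatenate((change, [labels.size]))
    return [(int(labels[s]), int(s), int(e)) for s, e in zip(starts, stops)]

if __name__ == "__main__":
    local_h = [0.6, 0.7, 0.5, 0.3, 0.2, 0.5]       # hypothetical local estimates
    print(segments(classify(local_h)))
```

Keeping the unclassified band as its own label mirrors the caption, where sections with 0.45 < H < 0.55 are drawn in blue rather than merged into a neighbouring segment.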

Figure 3—figure supplement 1
DLFNN analysis of a GFP-SNX1-labeled endosome trajectory, depicted as in Figure 3.
Figure 3—figure supplement 2
DLFNN analysis of a lysosome trajectory, depicted as in Figure 3.

Figure 4
Survival functions plotted with error bars for persistent and anti-persistent segments for Rab5-positive endosomes, SNX1-positive endosomes and lysosomes with the power-law fits.

Fit parameters can be found in Appendix 3—table 1.

Figure 5
Comparison of Hurst exponent distributions for GFP-Rab5, GFP-SNX1 and lysosomes.

(a) Histograms of Hurst exponents for GFP-Rab5 (black) and GFP-SNX1 (magenta) endosomes and lysosomes (green), plotted on the same axes for comparison. The individual histograms of Hurst exponents (black solid) for GFP-Rab5-tagged endosomes, GFP-SNX1-tagged endosomes and lysosomes are shown in (b, c and d), respectively. For each histogram, the Gaussian mixture model fit for six components (red dashed) and the individual Gaussian distribution components are shown on the same plot. The number of components was chosen using the Bayes information criterion shown in Appendix 4—figure 1.
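
Choosing the number of mixture components by the Bayes information criterion can be sketched with a small one-dimensional EM fit in NumPy. This is an illustrative stand-in, not the fitting code used for the figure: the initialisation, iteration count and variance floor are assumptions, and the demonstration data are synthetic with two clusters rather than measured Hurst exponents.

```python
import numpy as np

def _log_joint(x, weights, means, variances):
    """log(weight_k * N(x_i | mean_k, var_k)) for every point i and component k."""
    return (np.log(weights) - 0.5 * np.log(2 * np.pi * variances)
            - 0.5 * (x[:, None] - means) ** 2 / variances)

def fit_gmm_1d(x, k, n_iter=200, var_floor=1e-6):
    """Fit a k-component 1D Gaussian mixture by EM.

    Returns (weights, means, variances, log_likelihood).
    """
    x = np.asarray(x, dtype=float)
    means = np.quantile(x, (np.arange(k) + 0.5) / k)   # spread initial means over the data
    variances = np.full(k, x.var())
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibilities, computed in log space for stability.
        log_p = _log_joint(x, weights, means, variances)
        resp = np.exp(log_p - np.logaddexp.reduce(log_p, axis=1)[:, None])
        # M-step: re-estimate weights, means and variances.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / nk.sum()
        means = (resp * x[:, None]).sum(axis=0) / nk
        variances = np.maximum(
            (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk, var_floor)
    log_l = np.logaddexp.reduce(_log_joint(x, weights, means, variances), axis=1).sum()
    return weights, means, variances, log_l

def gmm_bic(x, k):
    """BIC = p ln(N) - 2 ln(L), with p = 3k - 1 free parameters in one dimension."""
    x = np.asarray(x, dtype=float)
    log_l = fit_gmm_1d(x, k)[3]
    return (3 * k - 1) * np.log(x.size) - 2.0 * log_l

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic data: two well-separated clusters of "exponents".
    data = np.concatenate([rng.normal(0.3, 0.04, 300), rng.normal(0.7, 0.04, 300)])
    for k in range(1, 5):
        print(k, round(gmm_bic(data, k), 1))
```

The preferred model is the one with the lowest BIC; extra components raise the likelihood slightly but pay a penalty of three parameters each.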

Figure 6 with 4 supplements
MRC-5 cells stably expressing GFP-Rab5, GFP-SNX1 or stained with Lysobrite with tracking data overlaid.

The colours show the value of H estimated by the neural network using a 15 point window. The scalebar is 10 µm.

Figure 6—figure supplement 1
Distribution of endogenous Rab5 and SNX1.

MRC-5 cells were fixed and labeled with antibodies to Rab5 and SNX1, then imaged by confocal microscopy. A maximum-intensity z-projection of deconvolved images is shown. The boxed region is enlarged and presented as grey-scale single channels and a two color merged image. The scale bar is 10 µm (main image) and 2 µm (enlargements).

Figure 6—video 1
Video of MRC-5 cell stably expressing GFP-Rab5.

The field of view is 62.88 µm by 62.88 µm (600 × 600 pixels) and playback is in real time. 

Figure 6—video 2
Video of MRC-5 cell stably expressing GFP-SNX1.

The field of view is 62.88 µm by 62.88 µm (600 × 600 pixels) and playback is in real time. 

Figure 6—video 3
Video of MRC-5 cell stained with Lysobrite.

The field of view is 62.88 µm by 62.88 µm (600 × 600 pixels) and playback is in real time. 

Figure 7
Box and whisker plots of displacements, times and velocities of persistent retrograde, persistent anterograde and anti-persistent segments in experimental trajectories.

Any segment with H>0.55 was classed as persistent and any segment with H<0.45 as anti-persistent. These H values were chosen as a precaution against the mean error of the neural network estimation. Each data point within the box and whisker plots is the average of all trajectory segments in a single cell. A total of 65 MRC-5 cells for GFP-Rab5-tagged endosomes, 63 MRC-5 cells for SNX1-GFP-tagged endosomes and 71 MRC-5 cells for lysosomes were analysed, with 5 to 500 (average 54) anterograde or retrograde segments for each cell.

Appendix 1—figure 1
The same single trajectory as in Figure 3 but processed using the HMM-Bayes package described in Monnier et al. (2015).

The plot shows: the original trajectory (top left); the inferred state sequence of the most likely model (top right); the model probabilities given a maximum of three possible states (bottom left); the average lifetime of each state and estimated parameters of the most likely model (bottom center left), which in this case is two different diffusive states; individual increment displacements (bottom center right); and the step size distribution of those increments classed into the two different states.

Appendix 1—figure 2
Top: Plot of displacement from a single GFP-SNX1 endosome trajectory in an MRC-5 cell (blue).

Shaded areas show persistent (0.55<H<1 in green) and anti-persistent (0<H<0.45 in magenta) behaviour. Middle: A 15 point moving window DLFNN exponent estimate for the trajectory (black) with a line (dashed) marking diffusion H=0.5 and lines (dotted) marking the confidence bounds H=0.55 and 0.45. Bottom: Plot of instantaneous and moving (15 point) window velocity. Right: Plot of the trajectory of a GFP-SNX1 endosome in an MRC-5 cell with start and finish positions, and persistent (green) and anti-persistent (magenta) segments indicated.

Appendix 1—figure 3
The same HMM-Bayes analysis as shown in Appendix 1—figure 1 applied to the trajectory in Appendix 1—figure 2.
Appendix 2—figure 1
Top: MSD (points) and power-law fits (dashed) for two different Brownian trajectories containing 1000 data points with diffusion coefficient 2.5 (red) and 0.025 (blue).

The exponent should be α=2H=1 (Hurst exponent H=0.5) for both trajectories. Second row: Simulation of the two Brownian trajectories with diffusion coefficient 2.5 (red) and 0.025 (blue). Third row: Local Hurst exponent estimates given by the DLFNN for the two different trajectories using a 90 point window. The averages of the DLFNN Hurst exponent estimates are α=2H=1.110 (red) and α=2H=1.114 (blue). Bottom: Local Hurst exponent estimates of the D=2.5 track given by the DLFNN and MSDs using a 90 point window. The average of the DLFNN Hurst exponent estimates is α=2H=1.110 and the average of the MSD estimates is α=2H=0.937.
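
The point that the exponent does not depend on the diffusion coefficient can be checked with a short NumPy sketch of an MSD power-law fit. The time step, lag range and function names are assumptions, and the two tracks here deliberately share the same random increments (unlike independent simulations), so that they differ only by a scale factor.

```python
import numpy as np

def brownian_track(noise, diffusion, dt=1.0):
    """1D Brownian trajectory built from unit Gaussian increments scaled by sqrt(2 D dt)."""
    return np.cumsum(np.sqrt(2.0 * diffusion * dt) * noise)

def msd_exponent(x, max_lag=10):
    """Fit MSD(lag) ~ lag^alpha on a log-log scale and return alpha (= 2H)."""
    lags = np.arange(1, max_lag + 1)
    msd = np.array([np.mean((x[lag:] - x[:-lag]) ** 2) for lag in lags])
    alpha, _ = np.polyfit(np.log(lags), np.log(msd), 1)
    return alpha

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noise = rng.standard_normal(1000)          # 1000 data points, as in the figure
    for d in (2.5, 0.025):
        alpha = msd_exponent(brownian_track(noise, d))
        print(f"D={d}: alpha = {alpha:.3f}")
```

Changing D multiplies the MSD by a constant, which shifts the log-log line vertically without changing its slope, so both tracks return the same α up to rounding.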

Appendix 3—figure 1
Normalized histograms (blue) and corresponding maximum likelihood estimation for Burr distributions (line) of segment displacements from lysosome and endosome experimental trajectories segmented using DLFNN.

Parameter estimates are shown in the legend.

Appendix 4—figure 1
The Akaike and Bayes information criteria plotted against the number of components in the Gaussian mixture model shown in Figure 5 for GFP-Rab5 tagged endosomes (top), SNX1-GFP tagged endosomes (middle) and lysosomes (bottom).

Tables

Table 1
Statistics of experimental trajectory segments.

The persistent and anti-persistent segments in this table come from trajectories that: travelled over 0.5 µm at any point from their initial starting positions; contained more points than the window size; and switched behavior more than twice. Note that these conditions are much stricter than those used to generate Figures 4 and 5. Each persistent segment was then further subdivided into retrograde and anterograde segments (see Materials and methods).

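| Quantity | Statistic | Rab5 | SNX1 | Lyso |
|---|---|---|---|---|
| Number of persistent segments | | 2369 | 2099 | 7645 |
| Number of anti-persistent segments | | 6983 | 3947 | 19,320 |
| Number of retrograde segments | | 2925 | 2343 | 5882 |
| Number of anterograde segments | | 2303 | 1609 | 6827 |
| Anti-persistent displacement (µm) | Mean | 0.05 | 0.05 | 0.03 |
| | Median | 0.04 | 0.05 | 0.03 |
| | St. Dev | 0.02 | 0.01 | 0.004 |
| Anti-persistent speed (µm s⁻¹) | Mean | 0.82 | 0.75 | 0.10 |
| | Median | 0.70 | 0.73 | 0.09 |
| | St. Dev | 0.31 | 0.19 | 0.02 |
| Anti-persistent time (s) | Mean | 0.23 | 0.20 | 0.93 |
| | Median | 0.23 | 0.19 | 0.92 |
| | St. Dev | 0.05 | 0.03 | 0.11 |
| Retrograde displacement (µm) | Mean | 0.53 | 0.74 | 0.29 |
| | Median | 0.49 | 0.69 | 0.29 |
| | St. Dev | 0.19 | 0.28 | 0.08 |
| Retrograde speed (µm s⁻¹) | Mean | 2.29 | 1.35 | 1.49 |
| | Median | 2.21 | 1.29 | 1.46 |
| | St. Dev | 0.87 | 0.39 | 0.25 |
| Retrograde time (s) | Mean | 0.22 | 0.46 | 0.17 |
| | Median | 0.21 | 0.45 | 0.17 |
| | St. Dev | 0.09 | 0.09 | 0.03 |
| Anterograde displacement (µm) | Mean | 0.35 | 0.43 | 0.31 |
| | Median | 0.33 | 0.37 | 0.32 |
| | St. Dev | 0.17 | 0.20 | 0.08 |
| Anterograde speed (µm s⁻¹) | Mean | 2.06 | 1.10 | 1.51 |
| | Median | 1.71 | 1.08 | 1.48 |
| | St. Dev | 0.95 | 0.30 | 0.27 |
| Anterograde time (s) | Mean | 0.18 | 0.34 | 0.18 |
| | Median | 0.15 | 0.33 | 0.18 |
| | St. Dev | 0.08 | 0.08 | 0.03 |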
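
The three selection conditions stated in the Table 1 caption can be written as a small filter. This is a sketch under stated assumptions: the function name and label encoding (+1 persistent, -1 anti-persistent, 0 unclassified) are hypothetical, positions are taken to be in µm, and a "switch" is counted as a change between persistent and anti-persistent labels with unclassified points ignored, which is one possible reading of the caption.

```python
import numpy as np

def keep_trajectory(x, y, labels, window=15, min_excursion=0.5):
    """Return True if a trajectory satisfies the three Table 1 conditions.

    x, y   : positions in micrometres
    labels : per-window labels (+1 persistent, -1 anti-persistent, 0 unclassified)
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    labels = np.asarray(labels)
    # 1. Travelled over 0.5 µm from the starting position at some point.
    excursion = np.hypot(x - x[0], y - y[0]).max()
    # 2. Contains more points than the moving-window size.
    long_enough = len(x) > window
    # 3. Switched behaviour more than twice (assumption: unclassified points ignored).
    classified = labels[labels != 0]
    switches = int(np.count_nonzero(np.diff(classified)))
    return bool(excursion > min_excursion and long_enough and switches > 2)

if __name__ == "__main__":
    x = np.linspace(0.0, 1.0, 30)
    y = np.zeros(30)
    labels = [1] * 5 + [-1] * 5 + [1] * 5 + [-1] * 5   # three switches
    print(keep_trajectory(x, y, labels))
```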
Key resources table
| Reagent type or resource | Designation | Source | Identifiers | Additional information |
|---|---|---|---|---|
| Cell line (Homo sapiens) | Lung fibroblast line | Allison et al., 2017 https://doi.org/10.1083/jcb.201609033 | GFP-SNX1-MRC5 | MRC5 cell line stably expressing GFP-SNX1. Mycoplasma free. |
| Cell line (H. sapiens) | Lung fibroblast line | Other | GFP-Rab5-MRC5 | MRC5 cell line stably expressing GFP-Rab5 generated by retroviral transduction by G. Pearson and E. Reid, University of Cambridge. Mycoplasma free. |
| Cell line (H. sapiens) | MRC-5 SV1 TG1 Lung fibroblast line | ECACC | MRC-5 SV1 TG1 cells, cat no. 85042501 | Mycoplasma free. |
| Antibody | Anti-human Rab5A Rabbit monoclonal | Cell Signalling Technology | 3547S | IF (1/200) |
| Antibody | Anti-human sorting nexin 1 (mouse monoclonal) | BD Biosciences | 611482 | IF (1/200) |
| Antibody | Alexa594-conjugated anti-mouse IgG (donkey polyclonal) | Jackson ImmunoResearch | 715-585-150 | IF (1/400) |
| Antibody | A488-conjugated donkey anti-rabbit IgG | Jackson ImmunoResearch | 711-545-152 | IF (1/400) |
| Recombinant DNA reagent | pLXIN-GFP-Rab5C-I-NeoR | Other | | Used by G. Pearson and E. Reid, University of Cambridge to generate retrovirus containing GFP-Rab5C |
| Sequence-based reagent | Hpa1 GFP Forward | Other | PCR primer | Used by G. Pearson and E. Reid, University of Cambridge to generate retrovirus containing GFP-Rab5C. TAGGGAGTTAACATGGTGAGCAAGGGCGAGGA |
| Sequence-based reagent | Not1 Rab5C Reverse | Other | PCR primer | Used by G. Pearson and E. Reid, University of Cambridge to generate retrovirus containing GFP-Rab5C. ATCCCTGCGGCCGCTCAGTTGCTGCAGCACTGGC |
| Chemical compound, drug | DAPI | Biolegend | 422801 | IF (1 µg/mL) |
| Chemical compound, drug | Prolong Gold | ThermoFisher | P36930 | |
| Chemical compound, drug | Lysobrite Red | AAT Bioquest | 22645 | (1/2500) |
| Chemical compound, drug | Geneticin (G418) | Sigma-Aldrich | G1397 | 200 µg/mL to maintain GFP-Rab5-MRC5 and GFP-SNX1-MRC5 cells in culture. |
| Chemical compound, drug | Formaldehyde solution, 37% (wt/v) | Sigma-Aldrich | 252549 | |
| Chemical compound, drug | Triton X-100 | Anatrace | T1001 | |
| Software, algorithm | NNT (aitracker.net) | Newby et al., 2018 | AITracker | Web-based automated tracking service |
| Software, algorithm | Metamorph | Molecular Devices LLC | Metamorph | Metamorph Microscopy Automation and Image Analysis Software |
| Software, algorithm | FIJI | Schindelin, J.; Arganda-Carreras, I. and Frise, E. et al. (2012), ‘Fiji: an open-source platform for biological-image analysis’, Nature methods 9 (7): 676–682, PMID 22743772, doi:10.1038/nmeth.2019 | FIJI/ImageJ | |
| Software, algorithm | DLFNN Exponent Estimator | Han, Daniel. (2020, January 20). DLFNN Exponent Estimator (Version 0). http://doi.org/10.1101/777615 | DLFNN/DLFNN Exponent Estimator | Hurst exponent estimator with Deep Learning Feed-forward Neural Network application for Windows 10. Documentation included. |
| Software, algorithm | Python3 | Python Software Foundation. Python Language Reference 3.7. Available at www.python.org | Python/Python3 | |
| Software, algorithm | SciPy | Virtanen et al. (2020) SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python. Nature Methods, in press. | SciPy/scipy | |
| Software, algorithm | Tensorflow | Abadi et al., 2016 | Tensorflow | |
| Software, algorithm | Keras | Chollet, François and others. ‘Keras.' (2015). Available from https://keras.io | Keras | |
| Software, algorithm | fbm | Flynn, Christopher, fbm 0.3.0 available for download at https://pypi.org/project/fbm/ or https://github.com/crflynn/fbm | FBM package in Python | Exact methods for simulating fractional Brownian motion (fBm) or fractional Gaussian noise (fGn) in python. Approximate simulation of multifractional Brownian motion (mBm) or multifractional Gaussian noise (mGn). |
| Other | 35 mm glass-bottomed dishes (µ-Dish) | Ibidi | Cat. No. 81150 | |
Appendix 3—table 1
Results for the fits of survival time probabilities shown in Figure 4.

The parameters and the analytical survival functions used to fit the Kaplan-Meier estimator survival curves.

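| Data | Segment type | Survival function Ψ(t) | Fit parameters for Figure 4 |
|---|---|---|---|
| Rab5 | Anti-persistent | $e^{-\lambda t}\left(\frac{\tau_0}{\tau_0+t}\right)^{\mu}$ | μ = 0.518 ± 0.004, τ0 = 0.140 ± 0.002 s, λ = 0.352 ± 0.002 s⁻¹ |
| Rab5 | Persistent | $e^{-\lambda t}\left(\frac{\tau_0}{\tau_0+t}\right)^{\mu}$ | μ = 1.352 ± 0.102, τ0 = 0.045 ± 0.006 s, λ = 1.286 ± 0.142 s⁻¹ |
| SNX1 | Anti-persistent | $e^{-\lambda t}\left(\frac{\tau_0}{\tau_0+t}\right)^{\mu}$ | μ = 0.757 ± 0.023, τ0 = 0.118 ± 0.006 s, λ = 1.004 ± 0.016 s⁻¹ |
| SNX1 | Persistent | $e^{-\lambda t}\left(\frac{\tau_0}{\tau_0+t}\right)^{\mu}$ | μ = 2.034 ± 0.205, τ0 = 0.185 ± 0.026 s, λ = 0.659 ± 0.137 s⁻¹ |
| Lysosome | Anti-persistent | $e^{-\lambda t}\left(\frac{\tau_0}{\tau_0+t}\right)^{\mu}$ | μ = 1.113 ± 0.009, τ0 = 0.208 ± 0.003 s, λ = 0.5 s⁻¹ (fixed) |
| Lysosome | Persistent | $e^{-\lambda t}\left(\frac{\tau_0}{\tau_0+t}\right)^{\mu}$ | μ = 1.748 ± 0.065, τ0 = 0.041 ± 0.003 s, λ = 1.216 ± 0.139 s⁻¹ |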
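
The survival function in this table and a basic Kaplan-Meier estimator can be sketched in a few lines of NumPy. The function names are assumptions, and the estimator below is the textbook product-limit form with an optional censoring flag; whether and how censoring was handled in the original analysis is not stated here. The parameters in the usage example are the Rab5 anti-persistent values from the table.

```python
import numpy as np

def survival(t, mu, tau0, lam):
    """Exponentially tempered power law: exp(-lam * t) * (tau0 / (tau0 + t)) ** mu."""
    t = np.asarray(t, dtype=float)
    return np.exp(-lam * t) * (tau0 / (tau0 + t)) ** mu

def kaplan_meier(durations, observed=None):
    """Product-limit estimate of the survival curve.

    durations : segment durations
    observed  : True where the end of the segment was seen, False if censored
                (all True by default)
    Returns (event_times, survival_probabilities).
    """
    durations = np.asarray(durations, dtype=float)
    if observed is None:
        observed = np.ones(durations.shape, dtype=bool)
    observed = np.asarray(observed, dtype=bool)
    times = np.unique(durations[observed])
    at_risk = np.array([(durations >= t).sum() for t in times])
    events = np.array([((durations == t) & observed).sum() for t in times])
    return times, np.cumprod(1.0 - events / at_risk)

if __name__ == "__main__":
    t = np.array([0.0, 0.1, 0.5, 1.0, 5.0])
    # Rab5 anti-persistent fit parameters from the table above.
    print(survival(t, mu=0.518, tau0=0.140, lam=0.352))
    print(kaplan_meier([1, 2, 2, 3]))
```

For times well below 1/λ the curve behaves as a power law with exponent μ, and the exponential factor truncates it at longer times.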

Cite this article

Daniel Han, Nickolay Korabel, Runze Chen, Mark Johnston, Anna Gavrilova, Victoria J Allan, Sergei Fedotov, Thomas A Waigh (2020) Deciphering anomalous heterogeneous intracellular transport with neural networks. eLife 9:e52224. https://doi.org/10.7554/eLife.52224