Graphical-model framework for automated annotation of cell identities in dense cellular images

  1. Shivesh Chaudhary
  2. Sol Ah Lee
  3. Yueyi Li
  4. Dhaval S Patel
  5. Hang Lu (corresponding author)
  1. School of Chemical & Biomolecular Engineering, Georgia Institute of Technology, United States
  2. Petit Institute for Bioengineering and Bioscience, Georgia Institute of Technology, United States
12 figures and 1 additional file

Figures

Figure 1 with 3 supplements
CRF_ID annotation framework automatically predicts cell identities in image stacks.

(A) Steps in CRF_ID framework applied to neuron imaging in C. elegans. (i) Max-projection of a 3D image stack showing head ganglion neurons whose biological names (identities) are to be determined. …

Figure 1—figure supplement 1
Schematic description of various features in the CRF model that relate to intrinsic similarity and extrinsic similarity.

(A) An example of the binary positional relationship feature (Appendix 1–Extended methods S1.2.2), illustrated for positional relationships along the AP axis. The table lists feature values for some exemplary …
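
The ordering comparison behind this binary feature can be sketched in a few lines of Python (a minimal illustration; the function name, arguments, and sign conventions are assumptions for exposition, not the paper's implementation):

```python
def ap_order_feature(x_i, x_j, atlas_x_m, atlas_x_n):
    """Binary pairwise feature: 1.0 if the anterior-posterior (AP) ordering
    of cells i and j in the image agrees with the AP ordering of their
    candidate labels m and n in the atlas, else 0.0."""
    return float((x_i < x_j) == (atlas_x_m < atlas_x_n))
```

For example, if cell i lies anterior to cell j and label m lies anterior to label n in the atlas, the feature is 1.0, rewarding assignments that preserve the observed AP ordering.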

Figure 1—figure supplement 2
Additional examples of unary and pairwise potentials and label consistency scores calculated for each cell.

(A) Unary potentials encode the affinities of each cell to take specific labels in the atlas. Here, the affinities of all cells to take the label specified in the top-right corner of each image are shown. Randomly …
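
As a sketch, the total (unnormalized) log-score that a pairwise CRF assigns to one candidate labeling combines these unary and pairwise terms; the data layout below is an assumption for illustration, not the paper's code:

```python
def labeling_score(unary, pairwise, assignment):
    """Unnormalized log-score of a candidate labeling.

    unary[i][a]            -- affinity of cell i for label a
    pairwise[(i, j)][a][b] -- agreement of the (i, j) cell pair's observed
                              relationship with labels (a, b) in the atlas
    assignment[i]          -- label index assigned to cell i
    """
    score = sum(unary[i][a] for i, a in enumerate(assignment))
    score += sum(pw[assignment[i]][assignment[j]]
                 for (i, j), pw in pairwise.items())
    return score

# Two cells, two labels: unaries prefer labels (0, 1), and the pairwise
# term agrees with that assignment.
unary = [[2.0, 0.0], [0.0, 1.0]]
pairwise = {(0, 1): [[0.0, 1.0], [0.0, 0.0]]}
print(labeling_score(unary, pairwise, [0, 1]))  # 2.0 + 1.0 + 1.0 = 4.0
```

Inference then amounts to searching for the assignment with the highest such score.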

Figure 1—video 1
Identities predicted automatically by the CRF_ID framework in head ganglion stack.

The top five predicted identities are shown, sorted by consistency score. Scale bar 5 µm.

Figure 2 with 7 supplements
CRF_ID annotation framework outperforms other approaches.

(A) CRF_ID framework achieves high prediction accuracy (average 73.5% for top labels) using data-driven atlases without using color information. Results shown for whole-brain experimental ground …

Figure 2—figure supplement 1
Performance characterization using synthetic data.

(A) The freely available open-source 3D atlas (OpenWorm atlas) was used to generate synthetic data. (B) Four scenarios were simulated using the atlas, and prediction accuracies were quantified. These …

Figure 2—figure supplement 2
Method of applying position noise to the atlas to generate synthetic data.

(A, C, E) Variability in the positions of cells was quantified in experimental data using landmark strains GT290 and GT298. The panels show the variability of landmark cells along the AP, LR, and DV axes (n = …
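
The basic recipe of perturbing atlas positions to make synthetic data can be sketched as follows (a minimal illustration under assumed conventions; `sigma` would be set from the positional variability quantified here):

```python
import numpy as np

def make_synthetic(atlas_xyz, sigma, seed=0):
    """Create one synthetic dataset by adding zero-mean Gaussian noise
    (standard deviation sigma, in the same units as the atlas)
    independently to each cell's 3D atlas position."""
    rng = np.random.default_rng(seed)
    return atlas_xyz + rng.normal(0.0, sigma, size=atlas_xyz.shape)
```

With `sigma = 0` the atlas is returned unchanged; increasing `sigma` simulates progressively noisier cell positions.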

Figure 2—figure supplement 3
Details of manually annotated experimental ground-truth datasets.

(A) Number of cells manually annotated in each of the anterior (anterior ganglion), middle (lateral, dorsal, and ventral ganglia), and posterior (retrovesicular ganglion) regions of the head ganglion in two …

Figure 2—figure supplement 4
Model tuning/characterization – feature selection and simulating missing cells.

(A) Feature selection in the model was performed by keeping various feature combinations in the model and assessing prediction accuracy. Left panel – experimental data without using color …

Figure 2—figure supplement 5
CRF_ID framework with relative positional features outperforms registration method.

(A) Prediction accuracies achieved by the Top, Top 3, and Top 5 labels predicted by three methods – Registration, the CRF_ID framework with Relative Position features, and the CRF_ID framework with combined …

Figure 2—figure supplement 6
Variability in absolute positions of cells and relative positional features in experimental data compared to the static atlas.

(A) DV view (top) and LR view showing positions of cells across ground-truth data (n = 9 worms, strain OH15495). Each single-color point cloud represents the positions of a specific cell across …

Figure 2—figure supplement 7
Comparison of optimization runtimes of CRF_ID framework with a registration method CPD (Myronenko and Song, 2010).

(A) Optimization runtimes of the CRF method using Loopy Belief Propagation (LBP) and of the registration method CPD, across different numbers of cells to be annotated in the data. Synthetic …
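
For reference, max-product loopy belief propagation on a fully connected pairwise CRF can be sketched as below (log-space, synchronous updates; a simplified illustration of the technique, not the paper's implementation):

```python
import numpy as np

def loopy_max_product(unary, pairwise, n_iters=50):
    """Max-product loopy belief propagation (LBP) in log space on a fully
    connected pairwise CRF.

    unary          -- (n, L) array of log unary potentials
    pairwise[i][j] -- (L, L) log pairwise potentials, indexed [label_i, label_j]
    Returns the label index chosen for each of the n nodes.
    """
    n, L = unary.shape
    msg = np.zeros((n, n, L))          # msg[i, j]: message from node i to node j
    for _ in range(n_iters):
        new = np.zeros_like(msg)
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                # Belief at i from all incoming messages except the one from j
                b = unary[i] + msg[:, i].sum(axis=0) - msg[j, i]
                new[i, j] = np.max(pairwise[i][j] + b[:, None], axis=0)
                new[i, j] -= new[i, j].max()   # normalize for stability
        msg = new
    beliefs = unary + msg.sum(axis=0)          # sum of incoming messages per node
    return beliefs.argmax(axis=1)
```

On a tiny problem with one confident node and agreement-favoring pairwise terms, all nodes settle on that node's label; the cost per iteration grows with the square of the number of nodes, which is one reason runtime scales with the number of cells to be annotated.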

Figure 3
CRF_ID framework predicts identities for gene expression pattern analyses.

(A) (Top) Schematic showing a fluorescent reporter strain with GFP expressed in cells whose names need to be determined. Since no candidate labels are known a priori, neuron labels are predicted …

Figure 4 with 4 supplements
Cell identity prediction in mock multi-cell calcium imaging experiments and landmark strain.

(A) (Top) Schematic showing automatic identification of cells in multi-cell calcium imaging videos for high-throughput analysis. (Bottom) A mock strain with GFP-labeled cells was used as an …

Figure 4—figure supplement 1
Relative position features perform better than registration in handling missing cells in images.

(A) Comparison of prediction accuracies across three methods for different numbers of missing cells (out of 16 total cells) simulated in experimental data. Experimental data comes from the AML5 strain …

Figure 4—figure supplement 2
Spatially distributed landmarks or landmarks in lateral ganglion perform best in supporting CRF_ID framework for predicting identities.

(A) Top panel – Region-wise prediction accuracy achieved by our CRF_ID framework when landmarks were constrained to lie in specific regions of the head. n = 200 runs when landmarks were constrained …

Figure 4—figure supplement 3
Microfluidic device used in chemical stimulation experiments and characterization.

(A) Schematic of the microfluidic device (Cho et al., 2020) used in chemical stimulation experiments. The position of the nematode in the imaging channel is shown. A temporally varying stimulus is applied …

Figure 4—video 1
Comparison between the CRF_ID framework and the registration method for predicting identities in case of missing cells.

Identities shown in red are incorrect predictions. Scale bar 5 μm.

Figure 5 with 3 supplements
CRF_ID framework identifies neurons representing sensory and motor activities in whole-brain recording.

(A) GCaMP6s activity traces of 73 cells automatically tracked throughout a 278-s-long whole-brain recording and the corresponding predicted identities (top labels). Periodic stimulus (5 sec-on – 5 …

Figure 5—figure supplement 1
Further analysis of data in periodic food stimulation and whole-brain imaging experiment.

(A) Identities (top labels) predicted by our CRF_ID framework overlaid on the image (max-projection of image stack shown). Data comes from strain GT296. (B) Cumulative variance captured by …

Figure 5—video 1
Whole-brain functional imaging with bacteria supernatant stimulation.

Circles indicate the tracking of two cells that show ON and OFF response to food stimulus. Scale bar 5 µm.

Figure 5—video 2
Wave propagation in animal and correlation of neuron activities to worm motion.

The top-left panel shows the tracking of a cell along the anterior-posterior axis, used to calculate the motion of the worm. Scale bar 5 μm. The bottom-left panel shows the velocity (px/s) of the cells. Top-right …

Figure 6 with 3 supplements
Annotation framework is generalizable and compatible with different strains and imaging scenarios.

(A) A representative image (max-projection of 3D stack) of head ganglion neurons in NeuroPAL strain OH15495. (B) (Left) comparison of prediction accuracy for various methods that use different …

Figure 6—figure supplement 1
Additional results on prediction performance of CRF_ID method on NeuroPAL data: comparison against registration method and utility of ensemble of color atlases.

(A) Comparison of the accuracy of the top 3 and top 5 identities predicted by different methods shows that the CRF_ID framework with pairwise positional relationship features performs better than the registration method (top …

Figure 6—figure supplement 2
Example annotations predicted by the CRF_ID framework for animals imaged lying on the LR axis.

Data comes from OH15495 strain.

Figure 6—figure supplement 3
Example annotations predicted by the CRF_ID framework for animals twisted about the anterior-posterior axis (note that the anterior and lateral ganglia show clear left-right separation, whereas the retrovesicular ganglion, instead of lying in the middle, is shifted toward either the left or the right side).

Data comes from OH15495 and OH15500 strains.

Appendix 1—figure 1
Examples of PA (blue), LR (green), and DV (black) axes generated automatically in a whole-brain image stack.

Here, red dots correspond to the segmented nuclei in the image stack. Shown are the 3D (a), XY (b), YZ (c), and XZ (d) views of the image stack.

Author response image 1
Accuracy comparison on experimental datasets when prediction was done using an atlas built with all data vs. leave-one-out atlases.

Prediction was done using positional relationship information only. This figure is an update of Figure 2A. Experimental datasets come from NeuroPAL strains.

Author response image 2
Accuracy achieved on experimental datasets when prediction was done using different combinations of atlases for positional relationships and color.

In these cases, a leave-one-out atlas was used for either positional relationships, color, or both. Experimental datasets come from NeuroPAL strains.

Author response image 3
Accuracy on experimental datasets when prediction was done using different leave-one-out atlases for color.

In each method, a different technique was used to match the colors of the images used to build the atlas to the colors of the test image. A leave-one-out atlas for both positional relationships and color was used for …

Author response image 4
Accuracy comparison between the base case (prediction using a leave-one-out color atlas without any color matching) and an ensemble of two leave-one-out color atlases (with different color-matching techniques).

In both cases, the same leave-one-out atlases for positional relationships were used. Experimental datasets come from the NeuroPAL strain (n = 9). Results in this figure are updated in Figure 6C.

Author response image 5
Accuracy comparison between the base case (prediction using a leave-one-out color atlas without any color matching) and an ensemble of two leave-one-out color atlases (with different color-matching techniques).

In both cases, the same leave-one-out atlases for positional relationships were used. Experimental data comes from NeuroPAL strains with non-rigidly rotated animals (n = 7). Results in this figure are …

Additional files

Download links