Automated cell annotation in multi-cell images using an improved CRF_ID algorithm

  1. School of Chemical & Biomolecular Engineering, Georgia Institute of Technology, United States
  2. Department of Organismic and Evolutionary Biology, Harvard University, United States
  3. Interdisciplinary BioEngineering Program, Georgia Institute of Technology, United States
  4. Advanced Institute of Natural Sciences, Beijing Normal University, Zhuhai 519087, China
  5. Center for Brain Science, Harvard University, United States

Editors

  • Reviewing Editor
    Paschalis Kratsios
    University of Chicago, Chicago, United States of America
  • Senior Editor
    Piali Sengupta
    Brandeis University, Waltham, United States of America

Reviewer #1 (Public Review):

In this paper, the authors developed an image analysis pipeline to automatically identify individual neurons within a population of fluorescently tagged neurons. This application is optimized to deal with multi-cell analysis and builds on a previous software version, developed by the same team, to resolve individual neurons from whole-brain imaging stacks. Using advanced statistical approaches and several heuristics tailored for C. elegans anatomy, the method successfully identifies individual neurons with a fairly high accuracy. Thus, while specific to C. elegans, this method can become instrumental for a variety of research directions such as in-vivo single-cell gene expression analysis and calcium-based neural activity studies.

The analysis procedure depends on the availability of an accurate atlas that serves as a reference map for neural positions. Thus, when imaging a new reporter line without fair prior knowledge of the tagged cells, such an atlas may be very difficult to construct. Moreover, usage of available reference atlases, constructed based on other databases, is not very helpful (as shown by the authors in Fig 3), so for each new reporter line a de-novo atlas needs to be constructed.

I have a few comments that may help clarify the tool's practical potential:

1) I wonder to what degree strain mosaicism affects the analysis (Figs 1-4), as it was performed on a non-integrated reporter strain. As stated, for constructing the reference atlas, the authors used worms in which they could identify the complete set of tagged neurons. But how sensitive is the analysis when assaying worms with different levels of mosaicism? Do the results shown in the paper stem from animals expressing the full neural set? Could the authors add results for assayed worms showing partial expression, where only 80%, 70%, or 50% of the cell population is observed, and show how this affects identification accuracy? This may be important, as many non-integrated reporter lines show highly mosaic patterns and may therefore not be suitable for this analytic method. In that sense, could the authors describe the degree of mosaicism of the line used for validating the method?

2) For the gene expression analysis (Fig 5), where was the GFP intensity extracted from? As it has no nuclear tag, the protein should be cytoplasmic (as seen in Fig 5a), but in Fig 5c it is shown as if the region of interest used to extract fluorescence was nuclear. If fluorescence was indeed extracted from the cytoplasm, then it would be helpful to include in the software and in the results description how this was done, as a huge hurdle in dissecting such multi-cell images is avoiding cross-reads between adjacent/intersecting neurons.

3) On the same matter: in the methods, it is specified that the strain expressing GCaMP was also used in the gene expression analysis shown in Figure 5. But the calcium indicator may show transient intensities depending on spontaneous neural activity during imaging. This would introduce significant variability that may affect the expression correlation analysis depicted in Figure 5.
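The mosaicism check requested in the first comment above could be set up roughly as follows. This is only a minimal sketch under stated assumptions: the function names, the toy cell names and coordinates, and the jitter level are all hypothetical, and the default annotator is a simple nearest-atlas-position placeholder, not CRF_ID. Because the placeholder labels each cell independently, its accuracy is largely insensitive to missing neighbors; the point of the harness is that the actual annotation method can be passed in through the `annotate` argument.

```python
import math
import random


def subsample_cells(cells, keep_fraction, rng):
    """Keep a random subset of cells to mimic mosaic (partial) expression."""
    n_keep = max(1, round(keep_fraction * len(cells)))
    kept = rng.sample(sorted(cells), n_keep)
    return {name: cells[name] for name in kept}


def nearest_atlas_label(positions, atlas):
    """Placeholder annotator (NOT CRF_ID): closest atlas entry per cell."""
    return {
        key: min(atlas, key=lambda name: math.dist(pos, atlas[name]))
        for key, pos in positions.items()
    }


def accuracy_under_mosaicism(atlas, worm, keep_fraction,
                             annotate=nearest_atlas_label, seed=0):
    """Fraction of retained cells whose predicted label matches the true name."""
    rng = random.Random(seed)
    observed = subsample_cells(worm, keep_fraction, rng)
    predicted = annotate(observed, atlas)
    correct = sum(predicted[name] == name for name in observed)
    return correct / len(observed)


# Toy data (hypothetical): ten cells; the "worm" is the atlas plus small jitter.
atlas = {f"cell_{i}": (5.0 * i, 4.0 * (i % 3), 0.0) for i in range(10)}
jitter = random.Random(1)
worm = {
    name: tuple(c + jitter.gauss(0.0, 0.5) for c in pos)
    for name, pos in atlas.items()
}

for fraction in (1.0, 0.8, 0.7, 0.5):
    acc = accuracy_under_mosaicism(atlas, worm, fraction)
    print(f"kept {fraction:.0%} of cells -> accuracy {acc:.2f}")
```

Repeating the loop over several seeds and animals would give a distribution of accuracies at each expression level rather than a single value.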

Reviewer #2 (Public Review):

The authors succeed in generalizing the pre-alignment procedure for their cell identification method to allow it to work effectively on data with only small subsets of cells labeled. They convincingly show that their extension accurately identifies head angle, based on finding autofluorescent tissue and looking for a symmetric left/right axis. They demonstrate that the method works to identify known subsets of neurons with varying accuracy depending on the nature of the underlying atlas data. Their approach should be a useful one for researchers wishing to identify subsets of head neurons in C. elegans, for example in whole-brain recording, and the ideas might be useful elsewhere.

The authors also strive to give some general insights on what makes a good atlas. It is interesting and valuable to see (at least for this specific set of neurons) that 5-10 ideal examples are sufficient. However, some critical details would help in understanding how far their insights generalize. I believe the set of neurons in each atlas version is matched to the known set of cells in the sparse neuronal marker; however, this critical detail isn't explicitly stated anywhere I can see. In addition, it is stated that some neuron positions are missing in the NeuroPAL data and replaced with the (single) position available from the OpenWorm atlas. It should be stated how many neurons are missing and replaced in this way (providing weaker information). It also is not explicitly stated that the putative identities for the uncertain cells (designated with Greek letters) are used to sample the NeuroPAL data. Either a large number of single OpenWorm positions, or misidentified uncertain cells that force alignment against the positions of nearby but different cells, would handicap the NeuroPAL atlas relative to the matched fluorescence atlas. This is an important question, since sufficient performance from an ideal (subsampled) NeuroPAL atlas would avoid the need to build custom atlases per strain.
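The bookkeeping asked for here (how many neurons fall back to a single reference position) could be reported with something like the sketch below. It is an illustration under stated assumptions, not the authors' procedure: `build_atlas`, the neuron names, and the coordinates are hypothetical, and the atlas entry is taken as a plain per-axis mean of the annotated examples, which may differ from how the actual atlases encode positional information.

```python
def build_atlas(examples, fallback_positions, target_neurons):
    """Average positions across annotated examples; track fallback neurons.

    examples: list of dicts mapping neuron name -> (x, y, z) in one animal.
    fallback_positions: dict of single reference positions, used only for
        neurons that appear in none of the examples.
    Returns (atlas, replaced), where `replaced` lists the fallback neurons.
    """
    atlas, replaced = {}, []
    for name in target_neurons:
        observed = [example[name] for example in examples if name in example]
        if observed:
            # Per-axis mean over all animals in which the neuron was annotated.
            atlas[name] = tuple(sum(axis) / len(observed) for axis in zip(*observed))
        else:
            atlas[name] = fallback_positions[name]
            replaced.append(name)
    return atlas, replaced


# Toy data (hypothetical): N3 is absent from every example.
examples = [
    {"N1": (0.0, 0.0, 0.0), "N2": (10.0, 0.0, 0.0)},
    {"N1": (2.0, 0.0, 0.0), "N2": (12.0, 2.0, 0.0)},
]
fallback_positions = {"N3": (5.0, 5.0, 5.0)}

atlas, replaced = build_atlas(examples, fallback_positions, ["N1", "N2", "N3"])
print(f"{len(replaced)} of {len(atlas)} neurons use a single fallback position: {replaced}")
```

Reporting the `replaced` list alongside each atlas version would make it clear how much of the atlas rests on single-position information.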
