
Bi-channel image registration and deep-learning segmentation (BIRDS) for efficient, versatile 3D mapping of mouse brain

  1. Xuechun Wang
  2. Weilin Zeng
  3. Xiaodan Yang
  4. Chunyu Fang
  5. Yunyun Han (corresponding author)
  6. Peng Fei (corresponding author)
  1. School of Optical and Electronic Information- Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, China
  2. School of Basic Medicine, Tongji Medical College, Huazhong University of Science and Technology, China
Research Article
Cite this article as: eLife 2021;10:e63455 doi: 10.7554/eLife.63455

Abstract

We have developed open-source software called bi-channel image registration and deep-learning segmentation (BIRDS) for the mapping and analysis of 3D microscopy data and applied this to the mouse brain. The BIRDS pipeline includes image preprocessing, bi-channel registration, automatic annotation, creation of a 3D digital frame, high-resolution visualization, and expandable quantitative analysis. This new bi-channel registration algorithm is adaptive to various types of whole-brain data from different microscopy platforms and shows dramatically improved registration accuracy. Additionally, as this platform combines registration with neural networks, its improved function relative to the other platforms lies in the fact that the registration procedure can readily provide training data for network construction, while the trained neural network can efficiently segment incomplete/defective brain data that is otherwise difficult to register. Our software is thus optimized to enable either minute-timescale registration-based segmentation of cross-modality, whole-brain datasets or real-time inference-based image segmentation of various brain regions of interest. Jobs can be easily submitted and implemented via a Fiji plugin that can be adapted to most computing environments.

eLife digest

Mapping all the cells and nerve connections in the mouse brain is a major goal of the neuroscience community, as this will provide new insights into how the brain works and what happens during disease. To achieve this, researchers must first capture three-dimensional images of the brain. These images are then processed using computational tools that can identify distinct anatomical features and cell types within the brain.

Various microscopy techniques are used to capture three-dimensional images of the brain. This has led to an increasing number of computational programs that can extract data from these images. However, these tools have been specifically designed for certain microscopy techniques. For example, some work on whole-brain datasets while others are built to analyze specific brain regions. Developing a more flexible, standardized method for annotating microscopy images of the brain would therefore enable researchers to analyze data more efficiently and compare results across experiments.

To this end, Wang, Zeng, Yang et al. have designed an open-source software program for extracting features from three-dimensional brain images which have been captured using different microscopes. Similar to other tools, the program uses an ‘image registration’ method that is able to recognize and annotate features in the brain. These tools, however, are limited to whole-brain datasets in which the complete anatomy of each feature must be present in order to be recognized by the software.

To overcome this, Wang et al. combined the image registration method with a deep-learning algorithm which uses pixels in the image to identify features in isolated regions of the brain. Although these neural networks do not require whole-brain images, they do need large datasets to ‘learn’ from. Therefore, the image registration method also benefits the neural network by providing a dataset of annotated features that the algorithm can train on.

Wang et al. showed that their software program, named BIRDS, could accurately recognize pixel-level brain features within imaging datasets of brain regions, as well as whole-brain images. The deep-learning algorithm could also adapt to analyze various types of imaging data from different microscopy platforms. This open-source software should make it easier for researchers to share, analyze and compare brain imaging datasets from different experiments.

Introduction

The mapping of the brain and neural circuits is currently a major endeavor in neuroscience and has great potential for facilitating an understanding of fundamental and pathological brain processes (Alivisatos et al., 2012; Kandel et al., 2013; Zuo et al., 2014). Large projects, including the Mouse Brain Architecture project (Bohland et al., 2009), the Allen Mouse Brain Connectivity Atlas (Oh et al., 2014), and the Mouse Connectome project, have mapped the mouse brain (Zingg et al., 2014) in terms of cell types, long-range connectivity patterns, and microcircuit connectivity. In addition to these large-scale collaborative efforts, an increasing number of laboratories are also developing independent, automated, or semi-automated frameworks for processing brain data obtained for specific projects (Fürth et al., 2018; Ni et al., 2020; Niedworok et al., 2016; Renier et al., 2016; Wang et al., 2020a; Iqbal et al., 2019). With the improvement of experimental methods for dissection of brain connectivity and function, development of a standardized and automated computational pipeline to map, analyze, visualize, and share brain data has become a major challenge to all brain connectivity mapping efforts (Alivisatos et al., 2012; Fürth et al., 2018). Thus, the implementation of an efficient and reliable method is fundamentally required for defining the accurate anatomical boundaries of brain structures, by which the anatomical positions of cells or neuronal connections can be determined to enable interpretation and comparison across experiments (Renier et al., 2016). The commonly used approach for automatic anatomical segmentation is to register an experimental image dataset within a standardized, fully segmented reference space, thus obtaining the anatomical segmentation for this set of experimental images (Oh et al., 2014; Ni et al., 2020; Renier et al., 2016; Kim et al., 2015; Lein et al., 2007). 
There are currently several registration-based high-throughput image frameworks for analyzing large-scale brain datasets (Fürth et al., 2018; Ni et al., 2020; Niedworok et al., 2016; Renier et al., 2016). Most of these frameworks require the user to set a few parameters based on the image intensity or graphics outlines or to completely convert the dataset into a framework-readable format to ensure the quality of the resulting segmentation. However, with the rapid development of sample labeling technology (Lee et al., 2016; Richardson and Lichtman, 2015; Schwarz et al., 2015) and high-resolution whole-brain microscopic imaging (Economo et al., 2016; Gong et al., 2013; Nie et al., 2020; Liu et al., 2017; Li et al., 2010), the heterogeneous and non-uniform characteristics of brain structures make it difficult to use traditional registration methods for registering datasets from different imaging platforms to a standard brain space with high accuracy. In this case, laborious visual inspection, followed by manual correction, is often required, which significantly reduces the productivity of these techniques. Therefore, the research community urgently needs a robust, comprehensive registration method that can extract a significant number of unique features from image data and provide accurate registration between different types of individual datasets.

Moreover, though registration-based methods can achieve full anatomical annotation in reference to a standard atlas for whole-brain datasets, their region-based 3D registration to a whole-brain atlas lacks the flexibility to analyze incomplete brain datasets or those focused on a certain volume of interest (Song and Song, 2018), which is often the case in neuroscience research. Though some frameworks can register certain types of brain slabs that contain complete coronal outlines slice by slice (Fürth et al., 2018; Song and Song, 2018; Ferrante and Paragios, 2017), it remains very difficult to register a small brain block without obvious anatomical outlines. As neural networks have emerged as a technique of choice for image processing (Long et al., 2015; Chen et al., 2018a; He et al., 2019; Zhang et al., 2019), deep-learning-based brain mapping methods have also recently been reported to directly provide segmentation/annotation of primary regions for 3D brain datasets (Iqbal et al., 2019; Akkus et al., 2017; Chen et al., 2018b; Milletari et al., 2017; de Brebisson and Montana, 2015). Such deep-learning-based segmentation networks are efficient in extracting pixel-level features and thus are not dependent on the presence of global features such as complete anatomical outlines, making them better suited for processing of incomplete brain data, as compared to registration-based methods. On the other hand, the establishment of these networks still relies on a sufficiently large training dataset, which is often laboriously registered, segmented, and annotated. Therefore, a combination of image registration and a neural network can possibly provide a synergistic improved analysis method and lead to more efficient and versatile brain mapping techniques.

Here, we provide an open-source software as a Fiji (Schindelin et al., 2012) plugin, termed bi-channel image registration and deep-learning segmentation (BIRDS), to support brain mapping efforts and to make it feasible to analyze, visualize, and share brain datasets. We developed BIRDS to allow investigators to quantify and spatially map 3D brain data in its own 3D digital space with reference to Allen CCFv3 (Wang et al., 2020b). This facilitates analysis in its native status at cellular level. The pipeline features: (1) A bi-channel registration algorithm integrating a feature map with raw image data for co-registration with significantly improved accuracy and (2) a mutually beneficial strategy in which the registration procedure can readily provide training data for a neural network, while this network can efficiently segment incomplete brain data that is otherwise difficult to register with a standardized atlas. The whole computational framework is designed to be robust and flexible, allowing its application to a wide variety of imaging systems (e.g., epifluorescent microscopy or light-sheet microscopy) and labeling approaches (e.g., fluorescent proteins, immunohistochemistry, and in situ hybridization). The BIRDS pipeline offers a complete set of tools, including image preprocessing, feature-based registration and annotation, visualization of digital maps and quantitative analysis via a link with Imaris, and a neural network segmentation algorithm that allows efficient processing of incomplete brain data. We further demonstrate how BIRDS can be employed for fully automatic mapping of various brain structures and integration of multidimensional anatomical neuronal labeling datasets. The whole pipeline has been packaged into a Fiji plugin, with step-by-step tutorials that permit rapid implementation of this plugin in a standard laboratory computing environment.

Results

Bi-channel image registration with improved accuracy

Figure 1 shows our bi-channel registration procedure, which registers experimental whole-brain images using a standardized Allen Institute mouse brain average template, and then provides segmentations and annotations from CCFv3 for experimental data. The raw high-resolution 3D images (1 × 1 × 10 μm³ per voxel), obtained by serial two-photon tomography (STPT, see Materials and methods), were first down-sampled into isotropic low-resolution data with a 20 μm voxel size identical to an averaged Allen template image (Figure 1a). The re-sampling ratios along the x (lateral-medial axis), y (dorsal-ventral axis), and z (anterior-posterior, AP axis) axes were thus 0.05, 0.05, and 0.5, respectively. It should be noted that, in addition to the individual differences, the preparation/mounting steps can also cause non-uniform deformation of samples, thereby posing extra challenges to the precise registration of experimental image to an averaged template (Figure 1b, original dataset). To mitigate this non-uniform deformation issue before registration, we applied a dynamic re-sampling ratio rather than using a fixed value of 0.5 to the z reslicing. We first subdivided the entire image stack into multiple sub-stacks (n = 6 in our demonstration, Figure 1a) according to seven selected landmark planes (Figure 1a, Figure 1—figure supplement 1). Then we applied a dynamic z re-sampling ratio calculated corresponding to the positions of the landmark planes in the Allen template and sample data (varying from ~0.35 to 0.55) to each sub-stack, to finely compress (<0.5) or stretch (>0.5) the z depth of the sub-stacks, thereby better matching the depth of each sub-stack to the Allen template brain and rectifying the deformation along the AP axis (Figure 1a, Materials and methods). The rectified whole-brain stack assembled by these dynamically re-sampled sub-stacks showed higher original similarity to the Allen template brain as compared to a raw experimental image stack (Figure 1b).
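As a rough illustration of the dynamic z re-sampling step, the sketch below derives one ratio per sub-stack from paired landmark planes. The function name, the landmark slice indices, and the assumption that each ratio is simply the template depth divided by the sample depth between consecutive landmarks are ours for illustration only; the calculation actually used is described in Materials and methods.

```python
def dynamic_z_ratios(sample_landmarks, template_landmarks):
    """Return one z re-sampling ratio per sub-stack.

    Both arguments hold slice indices of the same landmark planes, ordered
    along the AP axis: one list for the raw sample stack, one for the template.
    """
    if len(sample_landmarks) != len(template_landmarks):
        raise ValueError("each landmark needs a sample and a template index")
    ratios = []
    for i in range(len(sample_landmarks) - 1):
        sample_depth = sample_landmarks[i + 1] - sample_landmarks[i]
        template_depth = template_landmarks[i + 1] - template_landmarks[i]
        if sample_depth <= 0 or template_depth <= 0:
            raise ValueError("landmark indices must be strictly increasing")
        # Ratio < 0.5 compresses the sub-stack, ratio > 0.5 stretches it.
        ratios.append(template_depth / sample_depth)
    return ratios


# Hypothetical landmark positions: seven planes define six sub-stacks.
sample = [0, 200, 400, 640, 860, 1060, 1300]  # raw slices, 10 um spacing
template = [0, 90, 200, 310, 420, 530, 650]   # template slices, 20 um spacing
ratios = dynamic_z_ratios(sample, template)
```

With these made-up landmarks every ratio stays within the ~0.35 to 0.55 range quoted in the text, and a sub-stack whose depth already matches the template keeps the fixed ratio of 0.5.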
The implementation of such a preprocessing step was beneficial for the better alignment of non-uniformly morphed brain data to a standardized template (Figure 1—figure supplement 2). After data preprocessing, we applied a feature-based iterative registration using the Allen reference images to the preprocessed experimental images. We note that previous registration methods were vulnerable to inadequate alignment accuracy (Niedworok et al., 2016; Renier et al., 2016; Goubran et al., 2019), which was associated with inadequate registration information provided by merely using the raw background image data. To address this issue, in addition to the primary channel containing the background images of each sample and template brains, we further generated an assistant channel to augment the image registration and enhance the accuracy. First, we used a phase congruency (PC) algorithm (Kovesi, 2019) to extract the high-contrast edge and texture information from both the experimental and template brain images based on their relatively fixed anatomy features (Figure 1c, Materials and methods). Then, we obtained the geometry features of both brains along their lateral–medial, dorsal–ventral, and anterior–posterior axes with enhanced axial mutual information (MI) extracted using a grayscale reversal processing (Maes et al., 1997; Thévenaz and Unser, 2000) (Figure 1c, Figure 1—figure supplement 3, Materials and methods). Finally, the primary channel containing raw brain images, in conjunction with the assistant channel containing the texture and geometry maps of brains, were included in the registration procedure to fulfill an information-augmented bi-channel registration requirement (Figure 1—figure supplement 4), which was verified to show notably better registration accuracy as compared to conventional single-channel registration methods (aMAP [Niedworok et al., 2016], ClearMap [Renier et al., 2016], and MIRACL [Goubran et al., 2019]). 
During registration, through an iterative optimization of the transformation from an averaged Allen brain template to the experimental data, the MI gradually reached its maximum when the inverse grayscale images, PC images, and the raw images were finally geometrically aligned (Figure 1d). The displacement was presented in a grid form to illustrate the non-linear deformation effects. The geometric warping parameters obtained from the registration process were then applied to the Allen annotation file to generate a transformed version specifically for experimental data (Figure 1—figure supplement 4). Our bi-channel registration achieved fully automated registration/annotation at sufficiently high accuracy when processing STPT experimental data of an intact brain (Han et al., 2018). As for low-quality or highly deformed brain data (e.g., clarified brain with obvious shrinkage), though the registration accuracy of our method was accordingly reduced, our method still clearly surpassed other methods (Figure 2). For such challenging data types, we also developed an interactive graphical user interface (GUI) to readily permit manual correction of the visible inaccuracies in the annotation file, through fine-tuning of the selected corresponding points (Figure 1e). Finally, an accurate 3D annotation could be generated and applied to experimental data, either fully automatically (STPT data) or after mild manual correction (light-sheet fluorescence microscopy [LSFM] data of clarified brain), as shown in Figure 1f.
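To make the similarity measure concrete, the following sketch estimates mutual information (MI) from a joint intensity histogram and combines a primary and an assistant channel into a single score. The equal-width binning, the function names, and in particular the weighted sum of the two channel scores are illustrative assumptions; this section does not specify how BIRDS combines the channels inside its optimizer.

```python
import math
from collections import Counter


def mutual_information(a, b, bins=8, max_value=255):
    """MI in bits between two equally sized flat sequences of intensities."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    def to_bin(v):
        # Equal-width binning of intensities in [0, max_value].
        return min(int(v * bins / (max_value + 1)), bins - 1)

    pairs = [(to_bin(x), to_bin(y)) for x, y in zip(a, b)]
    n = len(pairs)
    joint = Counter(pairs)
    marg_a = Counter(p[0] for p in pairs)
    marg_b = Counter(p[1] for p in pairs)
    # Sum p(i, j) * log2(p(i, j) / (p(i) * p(j))) over occupied bins.
    return sum(
        (c / n) * math.log2(c * n / (marg_a[i] * marg_b[j]))
        for (i, j), c in joint.items()
    )


def bichannel_score(raw_fixed, raw_moving, feat_fixed, feat_moving, weight=0.5):
    """Weighted sum of MI over the primary (raw) and assistant (feature) channels.

    The weighting scheme is an assumption made for this sketch.
    """
    primary = mutual_information(raw_fixed, raw_moving)
    assistant = mutual_information(feat_fixed, feat_moving)
    return (1.0 - weight) * primary + weight * assistant
```

An iterative registration would evaluate a score of this kind after each candidate transformation and keep the transformation that increases it.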

Figure 1 (with 4 supplements)
Bi-channel brain registration procedure.

(a) Re-sampling of a raw 3D image into an isotropic low-resolution one, which has the same voxel size (20 μm) as an averaged Allen template image. The raw brain dataset was first subdivided into six sub-stacks along the AP axis according to landmarks identified in seven selected coronal planes (a1). Then an appropriate z re-sampling ratio, which was different for each slice, was applied to each sub-stack (a2, left) to finely adjust the depth of the stack in the down-sampled data (a2, right). This step roughly restored the deformation of non-uniformly morphed samples, thereby allowing the following registration with an Allen reference template. (b) Plot showing the variation of the down-sampling ratio applied to the six sub-stacks and comparison with the Allen template brain before and after the dynamic re-sampling, showing the shape restoration effects of this preprocessing step. (c) Additional feature channels containing a geometry and outline feature map extracted using grayscale reversal processing (left), as well as an edge and texture feature map extracted by a phase congruency algorithm (right). This feature channel was combined with a raw image channel for implementing our information-enriched bi-channel registration, which showed improved accuracy as compared to conventional single-channel registration solely based on raw images. (d) 3D view and anatomical sections (coronal and sagittal planes) of the registration results displayed in a grid deformed from an average Allen template. (e) Visual inspection and manual correction of automatically registered results from an optically clarified brain, which showed obvious deformation. Using the GUI provided, this step could be readily operated by adjusting the interactive nodes in the annotation file (red points to light blue points). (f) A final atlas of an experimental brain image containing region segmentations and annotations.

Figure 2 (with 4 supplements)
Comparison of BIRDS with conventional single-channel registration methods.

(a) Comparative registration accuracy (STPT data from an intact brain) using four different registration methods, aMAP, ClearMap, MIRACL, and BIRDS. (b) Comparative registration accuracy (LSFM data from clarified brain) using four methods. Magnified views of four regions of interest (a1–a4, b1–b4, blue boxes) selected from the horizontal (left, top) and coronal planes (left, bottom) are shown in the right four columns, with 3D detail for the registration/annotation accuracy for each method. All comparative annotation results were directly output from respective programs without manual correction. Scale bar, 1 mm (whole-brain view) and 250 μm (magnified view). (c) Ten groups of 3D fiducial points of interest (POIs) manually identified across the 3D space of whole brains. The blue and red points belong to the fixed experimental images and the registered Allen template images, respectively. The ten POIs were selected from the following landmarks: cc1: corpus callosum, midline; acoL, acoR: anterior commissure, olfactory limb; CPL, CPR: caudoputamen, striatum dorsal region; cc2: corpus callosum, midline; cc3: corpus callosum, midline; MM: medial mammillary nucleus, midline; DGsgL, DGsgR: dentate gyrus, granule cell layer. The registration error by each method could be thereby quantified through measuring the Euclidean distance between each pair of POIs in the experimental image and template image. (d) Box diagram comparing the POI distances of five brains registered by the four methods. Brains 1, 2: STPT images from two intact brains. Brain 1 is also shown in (a). Brains 3, 4, and 5: LSFM images from three clarified brains (u-DISCO) that showed significant deformations. Brain 5 is also shown in (b). The median error distance of 50 pairs of POIs in the five brains registered by BIRDS was ~104 μm, as compared to ~292 μm for aMAP, ~204 μm for ClearMap, and ~151 μm for MIRACL. (e, f) Comparative plot of Dice scores in nine registered regions of the five brains.
The results were grouped by brain in (e) and region in (f). The calculation was implemented at the single nuclei level. When the results were analyzed by brain, BIRDS surpassed the other three methods most clearly using LSFM dataset #5, with a 0.881 median Dice score as compared to 0.574 from aMAP, 0.72 from ClearMap, and 0.645 from MIRACL. At the same time, all the methods performed well on STPT dataset #2, with a median Dice score of 0.874 from aMAP, 0.92 from ClearMap, 0.872 from MIRACL, and 0.933 from BIRDS. When the results were compared using nine functional regions, the median values acquired by BIRDS were also higher than the other three methods. Even the lowest median Dice score by our method was still 0.799 (indicated by black line), which was notably higher than 0.566 by aMAP, 0.596 by ClearMap, and 0.722 by MIRACL, respectively.

Comparison with conventional single-channel-based registration methods

Next, we merged our experimental brain image with a registered annotation file to generate a 3D annotated image and quantitatively compared its registration accuracy with aMAP, ClearMap, and MIRACL results. We made comparisons of both STPT data from intact brains that contained only minor deformations (Figure 2a) and LSFM data from clarified brains (u-DISCO) that showed obvious shrinkage (Figure 2b). It should be noted here that the annotated results of either previous single-channel methods or our bi-channel method were all using automatic registration without any manual correction applied, and the averaged manual annotations by our experienced researchers served as a ground truth for quantitative comparisons. It was visually obvious that, as compared to the other three methods (green: aMAP; red: ClearMap; and blue: MIRACL in Figure 2a,b), the Allen annotation files transformed and registered by our BIRDS method (yellow in Figure 2a,b) were far better aligned with both STPT (as shown in VISC, CENT, AL, and PAL regions, Figure 2a) and LSFM (as shown in HPF, CB, VIS, and COA regions, Figure 2b) images. Furthermore, we manually labeled 10 3D fiducial points of interest (POIs) across the registered Allen template images together with their corresponding experimental images (Figure 2c) and then measured the error distances between the paired anatomical landmarks in the two datasets, so that the registration accuracy by each registration method could be quantitatively evaluated (Figure 2—figure supplement 1). As shown in Figure 2d, the error distance distributions of POIs in five brains (two STPT + three LSFM) registered by the abovementioned four methods were then quantified, showing the smallest median error distance (MED) was obtained using our method for all five brains (Supplementary file 3). 
In two different sets of STPT data, only our BIRDS method could provide an MED below 100 μm (~80 μm, n = 2), and this value slightly increased to ~120 μm for LSFM data (n = 3), but was still smaller than all the results obtained using the other three methods (aMAP, ~342 μm, n = 3; ClearMap, ~258 μm, n = 3; and MIRACL, ~175 μm, n = 3). Moreover, the Dice scores (Dice, 1945), defined as a similarity scale function used to calculate the similarity of two samples, for each method were also calculated at the nucleus precision level based on nine functional regions in the five brains. The comparative results were then grouped by brain and region, as shown in Figure 2e,f, respectively. The highest Dice scores with an average median value of >0.89 (Supplementary file 3, calculated for five brains, 0.75, 0.81, and 0.81 for aMAP, ClearMap, and MIRACL, respectively) or >0.88 (Supplementary file 3, calculated using nine regions, 0.74, 0.77, and 0.84 for aMAP, ClearMap, and MIRACL, respectively) were obtained by BIRDS, further confirming the superior registration accuracy of our method. Through a comparative Wilcoxon test, our results were demonstrated to be superior to the other three methods (providing larger Dice scores) with a p value < 0.05 calculated either by brain or by region. More detailed comparisons of registration accuracies can be found in Figure 2—figure supplements 2–4.
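The two accuracy measures used in this comparison can be sketched as follows: the error distance is the Euclidean distance between paired POIs, and the Dice score is twice the overlap of two regions divided by the sum of their sizes. The function names, the representation of a region as a set of voxel coordinates, and the 20 μm voxel size used to convert voxel units to micrometers are assumptions made for this example (requires Python 3.8+ for `math.dist`).

```python
import math
from statistics import median


def poi_errors_um(fixed_pois, registered_pois, voxel_um=20.0):
    """Euclidean distance in micrometers between each pair of POIs.

    POIs are (x, y, z) voxel coordinates; isotropic voxels are assumed.
    """
    if len(fixed_pois) != len(registered_pois):
        raise ValueError("POIs must be paired one to one")
    return [voxel_um * math.dist(p, q) for p, q in zip(fixed_pois, registered_pois)]


def dice_score(region_a, region_b):
    """Dice score of two regions given as collections of voxel coordinates."""
    a, b = set(region_a), set(region_b)
    if not a and not b:
        return 1.0  # two empty regions are treated as identical
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical POI pairs and regions, for illustration only.
errors = poi_errors_um([(0, 0, 0), (10, 10, 10)], [(3, 4, 0), (10, 10, 12)])
med = median(errors)
overlap = dice_score({(0, 0, 0), (0, 0, 1), (0, 1, 0)},
                     {(0, 0, 1), (0, 1, 0), (1, 1, 1)})
```

A Dice score of 1 means the registered and ground-truth regions coincide exactly, and a score of 0 means they do not overlap at all.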

Whole-brain digital map identifying the distributions of labeled neurons and axon projections

A 3D digital map (CCFv3) based on the abovementioned bi-channel registration was generated to support automatic annotation, analysis, and visualization of neurons in a whole mouse brain (see Materials and methods). The framework thus enabled large-scale mapping of neuronal connectivity and activity to reveal the architecture and function of brain circuits. Here, we demonstrated how the BIRDS pipeline visualizes and quantifies single-neuron projection patterns obtained by STPT imaging. A mouse brain containing six GFP-labeled layer-2/3 neurons in the right visual cortex was imaged with STPT at 1 × 1 × 10 μm³ resolution (Han et al., 2018). After applying the BIRDS procedure to this STPT image stack, we generated a 3D map of this brain (Figure 3a). An interactive hierarchical tree of brain regions in the software interface allowed navigation through the corresponding selected-and-highlighted brain regions with their annotation information (Figure 3b, Video 1). Through linking with Imaris, we visualized and traced each fluorescently labeled neuronal cell (n = 5) using the filament module of Imaris across the 3D space of the entire brain (Figure 3c, Materials and methods, Video 2). The BIRDS software can also apply reverse transformation to a raw image stack to generate a standard template-like rendered 3D map, including both traced axonal projections and selected whole-brain structures, which faithfully captures true 3D axonal arborization patterns and anatomical locations, as shown in Figure 3d. This software can also quantify the lengths and arborizations of traced axons according to the segmentation of the 3D digital map generated using the BIRDS pipeline (Figure 3e).
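The per-region axon quantification can be pictured with a small sketch: a traced axon is treated as a polyline, each segment's length is credited to the annotated region containing the segment midpoint, and the totals are normalized to the length in VIS (the normalization described in the Figure 3e legend). The polyline representation, the midpoint rule, and the `region_of` lookup are hypothetical choices for this example and are not taken from the BIRDS or Imaris implementation.

```python
import math
from collections import defaultdict


def axon_length_per_region(points, region_of):
    """Sum polyline segment lengths per region.

    points: ordered (x, y, z) coordinates along one traced axon.
    region_of: callable mapping a coordinate to a region name (hypothetical
    stand-in for a lookup in the registered annotation volume).
    """
    lengths = defaultdict(float)
    for p, q in zip(points, points[1:]):
        midpoint = tuple((a + b) / 2 for a, b in zip(p, q))
        lengths[region_of(midpoint)] += math.dist(p, q)
    return dict(lengths)


def projection_strength(lengths, reference="VIS"):
    """Normalize per-region axon length to the length in the reference region."""
    ref = lengths.get(reference, 0.0)
    if ref <= 0:
        raise ValueError("reference region contains no axon length")
    return {region: length / ref for region, length in lengths.items()}


# Toy annotation: everything with x < 10 is VIS, the rest is CP.
def toy_region_of(coord):
    return "VIS" if coord[0] < 10 else "CP"


lengths = axon_length_per_region([(0, 0, 0), (8, 0, 0), (12, 0, 0)], toy_region_of)
strength = projection_strength(lengths)
```

Segments that cross a region border are assigned whole to one side under the midpoint rule, so a finely sampled trace gives a better approximation.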

Figure 3
3D digital atlas of a whole brain for visualization and quantitative analysis of inter-areal neuronal projections.

(a) Rendered 3D digital atlas of a whole brain (a2, pseudo color), which was generated from registered template and annotation files (a1, overlay of annotation mask and image data). (b) Interactive hierarchical tree shown as a sidebar menu in the BIRDS program, indexing the name of brain regions annotated in CCFv3. Clicking on any annotation name in the sidebar of the hierarchical tree highlights the corresponding structure in the 3D brain map (b1, b2), and vice versa. For example, brain region LP was highlighted in the space after its name was chosen in the menu (b1). 3D rendering of an individual brain after applying a deformation field in reverse to a whole brain surface mask. The left side of the brain displays the 3D digital atlas (CCFv3, colored part in b2), while the right side of the brain is displayed in its original form (grayscale part in b2). (c) The distribution of axonal projections from five single neurons in 3D map space. The color-rendered space shown in horizontal, sagittal, and coronal views highlights multiple areas in the telencephalon, anterior cingulate cortex, striatum, and amygdala, which are all potential target areas of layer-2/3 neuron projections. (d) The traced axons of five selected neurons (n = 5) are shown. ENTm, entorhinal area, medial part, dorsal zone; RSPagl, retrosplenial area, lateral agranular part; VIS, visual areas; SSp-bfd, primary somatosensory area, barrel field; AUDd, dorsal auditory area; AUDpo, posterior auditory area; TEa, temporal association areas; CP, caudoputamen; IA, intercalated amygdalar nucleus; LA, lateral amygdalar nucleus; BLA, basolateral amygdalar nucleus; CEA, central amygdalar nucleus; ECT, ectorhinal area. (e) Quantification of the projection strength across the targeting areas of five GFP-labeled neurons. The color codes reflect the projection strengths of each neuron, defined as axon length per target area, normalized to the axon length in VIS.

Video 1
Displays the 3D digital atlas.
Video 2
Shows the arborization of 5 neurons in 3D map space.

BIRDS can be linked to Imaris to perform automated cell counting with higher efficiency and accuracy (Materials and methods). Here, we demonstrate it with an example brain where neurons were retrogradely labeled by CAV-mCherry injected into the right striatum and imaged by STPT at 1 × 1 × 10 μm³ resolution (Han et al., 2018). The whole-brain image stacks were first processed by BIRDS to generate a 3D annotation map. Two of the example segregated brain areas (STR and BS) are outlined in the left panel of Figure 4a. The annotation map and the raw image stack were then transferred to Imaris, which processed the images within each segregated area independently. Imaris calculated the local image statistics for cell recognition only using the image stack within each segregated area; therefore, it fit the dynamic range of the local images to achieve better results, as shown in the middle column in the right panel of Figure 4a. In contrast, the conventional Imaris automated cell counting program processed the whole-brain image stack at once to calculate the global cell recognition parameters for every brain area, which easily resulted in false positive or false negative counts in brain areas where the labeling signal was too strong or too weak compared to the global signal, as demonstrated in the STR and BS in the right column of the right panels of Figure 4a, respectively. The BIRDS–Imaris program could perform automated cell counting for each brain area and reconstruct the counts over the entire brain. The 3D model of the brain-wide distribution of labeled striatum-projecting neurons was visualized using the BIRDS–Imaris program as a 3D rendered brain image and projection views from three axes in Figure 4b. The BIRDS program could calculate the volume of each segregated region according to the 3D segregation map and the density of labeled cells across the brain as shown in Figure 4c.
Meanwhile, manual cell counting was also performed on one out of every four sections using an ImageJ plugin (Figure 4d). Compared to conventional Imaris results, our BIRDS–Imaris results were more consistent with the manual counts, especially for brain regions where the fluorescent signal was at the high or low end of the dynamic range (BS and STR, Figure 4e). Thanks to the 3D digital map generated by the BIRDS pipeline, BIRDS–Imaris can process each segmented brain area separately, namely calculating the parameters for the cell recognition algorithm using local image statistics instead of processing the whole-brain image stack at once. Such a segmented cell counting strategy is much less demanding on computation resources, and moreover, it is optimized for each brain area to solve the problem that the same global cell recognition parameter works poorly in certain brain regions with signal intensity at either of the two extreme ends of the dynamic range of the entire brain.
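The benefit of local over global image statistics can be shown with a toy example. The sketch below uses a simple mean plus k standard deviations threshold as a stand-in for cell recognition; this rule, the function names, and the intensity values are assumptions for illustration and do not reproduce the Imaris spot detection algorithm.

```python
from statistics import mean, pstdev


def detection_threshold(values, k=2.0):
    """Toy cell-recognition threshold: mean + k standard deviations."""
    return mean(values) + k * pstdev(values)


def count_above(values, threshold):
    return sum(1 for v in values if v > threshold)


def regional_counts(intensities_by_region, k=2.0):
    """Count candidates per region using that region's own statistics."""
    return {
        region: count_above(values, detection_threshold(values, k))
        for region, values in intensities_by_region.items()
    }


def global_counts(intensities_by_region, k=2.0):
    """Count candidates per region using one whole-brain threshold."""
    pooled = [v for values in intensities_by_region.values() for v in values]
    threshold = detection_threshold(pooled, k)
    return {
        region: count_above(values, threshold)
        for region, values in intensities_by_region.items()
    }


def cell_density(count, volume_mm3):
    """Cells per cubic millimeter for one region."""
    return count / volume_mm3


# Made-up intensities: one bright cell on a dim (BS) or bright (STR) background.
data = {"BS": [10] * 9 + [30], "STR": [100] * 9 + [300]}
local = regional_counts(data)
pooled = global_counts(data)
```

With these values the per-region thresholds recover one cell in each region, whereas the single pooled threshold misses the cell in the weakly labeled region, which mirrors the false negatives described in the text.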

Cell-type-specific counting and comparison between different cell counting methods.

(a) Cell counting of retrogradely labeled striatum-projecting cells. We selected two volumes (1 × 1 × 1 mm³) from the SS and VIS areas, respectively, to show the difference in cell density and the quantitative results obtained by BIRDS–Imaris versus conventional Imaris. Here, the separate sets of quantification parameters used for different brain areas in the BIRDS–Imaris procedure led to clearly more accurate counting results. Scale bar, 2 mm. (b) 3D-rendered images of labeled cells in the whole-brain space, shown in horizontal, sagittal, and coronal views. The color rendering of the cell bodies was in accordance with CCFv3, and the cells were mainly distributed in the Isocortex (darker hue). (c) The cell density calculated for 20 brain areas. The cell densities of MO and SS were the highest (MO = 421.80 mm⁻³; SS = 844.71 mm⁻³) among all areas. GU, gustatory areas; TEa, temporal association areas; AI, agranular insular area; PL, prelimbic area; PERI, perirhinal area; RSP, retrosplenial area; ECT, ectorhinal area; ORB, orbital area; VISC, visceral area; VIS, visual areas; MO, somatomotor areas; SS, somatosensory areas; AUD, auditory areas; HPF, hippocampal formation; OLF, olfactory areas; CTXsp, cortical subplate; STR, striatum; PAL, pallidum; BS, brainstem; CB, cerebellum. (d) Comparison of the cell numbers from three different counting methods: BIRDS, Imaris (3D whole brain directly), and manual counting (2D slice by slice for a whole brain). (e) The cell counting accuracy of the BIRDS–Imaris (orange) and conventional Imaris (blue) methods, relative to manual counting. Besides the highly divergent accuracy across the 20 regions, the counting results by conventional Imaris in the STR and BS regions were especially inaccurate.

Inference-based segmentation of incomplete brain datasets using a deep-learning procedure

In practice, acquired brain datasets are often incomplete, due to researcher’s particular interest in specific brain regions, or limited imaging conditions. The registration of such incomplete brain datasets to an Allen template is often difficult due to the lack of sufficient morphology information for comparison of both datasets. To overcome this limitation, we further introduced a deep neural network (DNN)-based method for efficient segmentation/annotation of incomplete brain sections with minimal human supervision. Herein, we optimized a Deeplab V3+ network, which was based on an encoding-decoding structure, for our deep-learning implementation (Figure 5a). The input images passed through a series of feature processing stages in the network, with pixels being allocated, classified, and segmented into brain regions. It should be noted that the training of a neural network fundamentally requires a sufficiently large dataset containing various incomplete brain blocks which have been well segmented. Benefiting from our efficient BIRDS method, we could readily obtain a large number of such labeled datasets through cropping processed whole brains and without experiencing time-consuming manual annotation. Various types of incomplete brains, as shown in Figure 5b, were hereby generated and sent to our DNN for iterative training, after which the skilled network could directly infer the segmentations/annotations for new modes of incomplete brain images (Materials and methods). Next, we validated the network performance on three different modes of input brain images cropped from the registered whole-brain dataset (STPT). The DNN successfully inferred annotation results for a cropped hemisphere, irregular cut of hemisphere, and a randomly cropped volume, as shown in Figure 5c–e, respectively. The inferred annotations (red lines) were found to be highly similar to the registered annotation results (green lines) in all three types of incomplete data. 
To further quantify the inference accuracy, the Dice scores of the network-segmented regions were also calculated by comparing the network outputs to the ground truth, which was the registration result after visual inspection and correction (Figure 5—figure supplement 1). The averaged median Dice scores for the individual sub-regions in the hemisphere, the irregular cut of the hemisphere, and the random volume were 0.86, 0.87, and 0.87, respectively, showing a sufficiently high inference accuracy in most brain regions, such as the Isocortex, HPF, OLF, and STR. It is worth noting that the performance of our network in segmenting the PAL, MBsta, and P-sen regions remained sub-optimal (Dice score 0.78–0.8), owing to their lack of obvious borders and their large structural variations across planes (Figure 5—figure supplement 1). Finally, we applied our network inferences to generate 3D atlases for these three incomplete brains. The hemisphere was segmented into 18 regions (Isocortex, HPF, OLF, CTXsp, STR, PAL, CB, DORpm, DORsm, HY, MBsen, MBmot, MBsta, P-sen, P-mot, P-sat, MY-sen, and MY-mot), the irregular cut of half the telencephalon into 10 regions (Isocortex, HPF, OLF, CTXsp, STR, PAL, DORpm, DORsm, HY, and MY-mot), and the random volume into seven regions (Isocortex, HPF, STR, PAL, DORpm, DORsm, and HY) (Figure 5f,g,h). Therefore, our DNN performed reasonably well even when the brain was highly incomplete. Furthermore, it could achieve second-level fine segmentation within a small brain region of interest. For example, we successfully segmented the hippocampus (CA1, CA2, CA3, and DG), as shown in Figure 5—figure supplement 2. This capability of our DNN possibly derives from its detection of pixel-level features rather than regions, and it substantially strengthens the robustness of our hybrid BIRDS method over conventional brain registration techniques when the data are highly incomplete/defective.
More detailed performance comparisons between our DNN-based inference and other methods are shown in Figure 5—figure supplements 15.

Figure 5 with 5 supplements
Inference-based network segmentation of incomplete brain data.

(a) The deep neural network architecture for directly inferring brain segmentation without required registration. The training datasets contained various types of incomplete brain images, which were cropped from annotated whole-brain datasets created using our bi-channel registration beforehand. (b) Four models of incomplete brain datasets for network training: whole brain (1), a large portion of telencephalon (2), a small portion of telencephalon (3), and a horizontal slab of whole brain (4). Scale bar, 1 mm. (c–e) The inference-based segmentation results for three new modes of incomplete brain images, defined as the right hemisphere (c), an irregular cut of half the telencephalon (d), and a randomly cropped volume (e). The annotated sub-regions are shown in the x-y, x-z, and y-z planes, with the Isocortex, HPF, OLF, CTXsp, STR, PAL, CB, DORpm, DORsm, HY, MBsen, MBmot, MBsta, P-sen, P-mot, P-sat, MY-sen, and MY-mot for right hemisphere, Isocortex, HPF, OLF, CTXsp, STR, PAL, DORpm, DORsm, HY, and MY-mot for the irregular cut of half the telencephalon, and Isocortex, HPF, STR, PAL, DORpm, DORsm, and HY for the random volume. Scale bar, 1 mm. (f–h) Corresponding 3D atlases generated for these three incomplete brains.

Discussion

In summary, we demonstrate a bi-channel image registration method, in conjunction with a deep-learning framework, that readily provides accuracy-improved anatomical segmentation of the whole mouse brain in reference to an Allen average template, as well as direct segmentation inference for incomplete brain datasets that are otherwise not easily registered to a standardized whole-brain space. The addition of a brain feature channel to the registration process greatly improved the accuracy of automatically registering individual whole-brain data to the standardized Allen average template. It should be noted that the registration is based on the two-photon template images provided by the Allen CCF, so it is currently limited to like-imaged brains, for example, brains imaged using wide-field, confocal, or light-sheet microscopes. For processing the various incomplete brain datasets that are challenging for registration-based methods yet very common in neuroscience research, we applied our deep neural network to rapidly infer segmentations. The sufficiently accurate results obtained with different types of incomplete data verify the advantages of network segmentation. Although a full annotation using a neural network is currently too computationally demanding compared to registration-based segmentation, it is a good complement to it. Therefore, in our hybrid BIRDS pipeline, the DNN inference compensated for the weaknesses of registration, while the registration readily provided high-quality training data for our DNN. We believe such a synergistic effect could provide a paradigm shift toward robust and efficient 3D image segmentation/annotation for biology research. With the continuing development of deep learning, we envision that network-based segmentation will play an increasingly important role in new pipelines.
A variety of applications, such as tracing of long-distance neuronal projections and parallel counting of cell populations in different brain regions, were also enabled by our efficient brain mapping. The BIRDS pipeline is now fully open source and has also been packaged into a Fiji plugin to make it easier for biological researchers to use. We expect that the BIRDS method can immediately allow new insights using current brain mapping techniques, and thus further push the resolution and scale limits in future explorations of brain space.

Materials and methods

Acquisition of STPT image dataset

Brains 1 and 2 were obtained with STPT, and each dataset encompassed ~180 Gigavoxels, for example, 11,980 × 7540 × 1075 in Dataset 1, with a voxel size of 1 × 1 × 10 μm³. The procedures for sample preparation and image acquisition were described in Han et al., 2018. Briefly, an adult C57BL/6 mouse (RRID:IMSR_JAX:000664) was anesthetized, and a craniotomy was performed over the right visual cortex. Individual neuronal axons were labeled with plasmid DNA (pCAG-eGFP [Addgene accession 11150]) by two-photon microscopy-guided single-cell electroporation, and the brain was fixed by cardioperfusion of 4% paraformaldehyde 8 days later. Striatum-projecting neurons were labeled by stereotactically injecting PRV-cre into the right striatum of tdTomato reporter mice (Ai14, JAX), and the brain was fixed by cardioperfusion 30 days later. The brains were embedded in 5% oxidized agarose and imaged with a commercial STPT system (TissueVision, USA) excited at 940 nm. Coronally, the brain was optically scanned every 10 μm at 1 μm/pixel without averaging and physically sectioned every 50 μm. The power of the excitation laser was adjusted to compensate for the depth of the optical sections.

Acquisition of LSFM images dataset

Brains 3, 4, and 5 were obtained with LSFM, and each dataset encompassed ~700 Gigavoxels (~10,000 × 8000 × 5000), with an isotropic voxel size of 1 μm³. Brain tissues of eight-week-old Thy-GFP-M mice (RRID:IMSR_JAX:007788) were first cleared with the u-DISCO protocol (Pan et al., 2016) before imaging. Brains 3 and 4 were acquired using a custom-built Bessel plane illumination microscope, a type of LSFM modality employing a non-diffracting thin Bessel light-sheet. Brain 5 was a whole-brain 3D image of a Thy-GFP-M mouse acquired using a lab-built selective plane illumination microscope (Nie et al., 2020), another LSFM modality that combines a Gaussian light-sheet with multi-view image acquisition/fusion.

Implementation of Bi-channel registration

Preprocessing of raw data

First, we developed an interactive GUI in the Fiji plugin (RRID:SCR_002285) to match the coronal planes in the Allen Reference Atlas (ARA) (132 planes, 100 μm interval) with those in the experimental 3D image stack (e.g., 1075 planes with a 10 μm step size in Dataset 1). As shown in Figure 1—figure supplement 1, seven coronal planes, from the anterior bulbus olfactorius to the posterior cerebellum, were identified across the entire brain, with their layer numbers recorded as $a_i$ in the template atlas and $b_i$ in the raw image stack. Six sub-stacks, indexed by $k$ ($k \in [1, 6]$, $k \in \mathbb{N}$), were therefore defined by these seven planes (Figure 1b). According to the ratio of step sizes between the ARA and its template image stack (100 μm versus 20 μm), we also obtained the layer numbers of the selected planes in the template image as $c_i = 5a_i - 2$. The reslicing ratio of the $k$th sub-stack, sandwiched between two adjacent planes ($a_k$ to $a_{k+1}$), was then calculated as $l_k = \frac{b_{k+1} - b_k}{c_{k+1} - c_k}$. Each $l_k$ was applied to the $k$th sub-stack to obtain the resliced version of that sub-stack. Finally, the six resliced sub-stacks together formed a complete image stack of the whole brain (20 μm step size), which had a rectified shape more similar to the Allen average template image than the raw experimental data. To match the isotropic 20 μm voxel size of the template, the lateral voxel size in the resliced image stack was also adjusted from the original 1 μm to 20 μm, with a uniform lateral down-sampling ratio of 20 applied to all the coronal planes. The computational cost of this re-sampling operation was low, taking merely ~5 min to process 180 GB of raw STPT data on a Xeon workstation (E5-2630 V3 CPU).
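
As a minimal sketch of the reslicing arithmetic described above, the following Python snippet computes the template layer numbers and the per-sub-stack reslicing ratios. It assumes the relations read from the text, c_i = 5a_i − 2 and l_k = (b_{k+1} − b_k)/(c_{k+1} − c_k); the function names and the plane indices are hypothetical and are not taken from the BIRDS source code.

```python
from typing import List, Sequence


def template_layers(a: Sequence[int]) -> List[int]:
    """Map ARA plane numbers a_i (100 um spacing) to template layer
    numbers c_i (20 um spacing), assuming c_i = 5 * a_i - 2."""
    return [5 * a_i - 2 for a_i in a]


def reslice_ratios(a: Sequence[int], b: Sequence[int]) -> List[float]:
    """Return l_k = (b_{k+1} - b_k) / (c_{k+1} - c_k) for each sub-stack
    bounded by two adjacent landmark planes."""
    if len(a) != len(b):
        raise ValueError("a and b must list the same landmark planes")
    c = template_layers(a)
    return [(b[k + 1] - b[k]) / (c[k + 1] - c[k]) for k in range(len(a) - 1)]


# Hypothetical layer numbers for the seven landmark planes.
a_planes = [10, 30, 50, 70, 90, 110, 125]          # in the ARA
b_planes = [60, 260, 450, 640, 830, 1000, 1060]    # in the raw stack
ratios = reslice_ratios(a_planes, b_planes)        # six ratios, one per sub-stack
```

Seven landmark planes yield six ratios, so each sub-stack is compressed or stretched along the anterior-posterior axis by its own factor before the uniform lateral down-sampling is applied.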

Features extraction and processing

We extracted feature information from the purified signals of the image data after filtering out the background. To do so, we calculated the threshold values separating signal from background using Huang's fuzzy thresholding method (Huang and Wang, 1995) and removed the background according to the calculated thresholds. The feature information was then detected using a phase congruency (PC) algorithm, which is robust to changes in signal intensity and can efficiently extract corner, line, and texture information from the image. Furthermore, when the images had relatively low contrast at the border, which was very common in our study, the edge information was much better retained using PC detection. Finally, the pixel intensity in the generated PC feature map can be calculated by the following formulas:

(1) $E(x) = \sum_n A_n(x) \left[ \cos\left(\varphi_n(x) - \bar{\varphi}(x)\right) - \left| \sin\left(\varphi_n(x) - \bar{\varphi}(x)\right) \right| \right]$
(2) $PC(x) = \frac{W(x)\,|E(x)| - T}{\sum_n A_n(x) + \varepsilon}$

where $x$ is the angle vector of a pixel after Fourier transform of the image, $A_n$ is the local amplitude of the $n$th cosine component, $\varphi_n$ is the local phase, $\bar{\varphi}$ is the weighted average of phase, $E(x)$ is the local energy, $W(x)$ is the filter band, $T$ is the noise threshold, and $\varepsilon$ is a small positive number (0.01 in our practice) that keeps a near-zero denominator from producing an excessively large value of $PC(x)$.
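
To make Equations 1 and 2 concrete, here is a minimal NumPy sketch that evaluates them for already-computed local amplitudes and phases. Several points are assumptions rather than details from the source: the weighted mean phase is taken as the amplitude-weighted circular mean, W(x) defaults to 1, the numerator is read as W(x)|E(x)| − T and clipped at zero, and the filter bank that would produce the amplitudes and phases is omitted entirely. The function name is hypothetical.

```python
import numpy as np


def phase_congruency(amplitudes, phases, noise_threshold=0.0, weight=1.0, eps=0.01):
    """Evaluate Equations 1-2 per pixel.

    amplitudes, phases: arrays of shape (n_components, ...) holding the
    local amplitude A_n and local phase phi_n of each component.
    """
    A = np.asarray(amplitudes, dtype=float)
    phi = np.asarray(phases, dtype=float)

    # Assumption: amplitude-weighted circular mean as the "weighted average of phase".
    mean_phase = np.arctan2((A * np.sin(phi)).sum(axis=0),
                            (A * np.cos(phi)).sum(axis=0))

    # Equation 1: local energy.
    delta = phi - mean_phase
    energy = (A * (np.cos(delta) - np.abs(np.sin(delta)))).sum(axis=0)

    # Equation 2; clipping negative values to zero is an assumption.
    numerator = np.maximum(weight * np.abs(energy) - noise_threshold, 0.0)
    return numerator / (A.sum(axis=0) + eps)


# Three components with identical phase at four pixels: congruency is near 1.
pc_map = phase_congruency(np.ones((3, 4)), np.full((3, 4), 0.3))
```

When all components share the same phase, the energy equals the summed amplitude and the output approaches 1, which is why edges and corners remain visible even where the raw intensity contrast is low.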

Bi-channel registration procedure

Image registration is fundamentally an iterative process optimized by a pre-designed cost function, which reasonably assesses the similarity between experimental and template datasets. Our bi-channel registration procedure is implemented based on the Elastix open-source program (Version 4.9.0) (Shamonin et al., 2013; Klein et al., 2010). Unlike conventional single-channel image registration, our method simultaneously registers all groups of input data using a single cost function defined as:

(3) $c(T_\mu; I_F, I_M) = \frac{1}{\sum_{i=1}^{N} \omega_i} \sum_{i=1}^{N} \omega_i \, c(T_\mu; I_F^i, I_M^i)$
(4) $c(T_\mu; I_F^i, I_M^i) = \sum_{m \in L_M} \sum_{f \in L_F} p(f, m; T_\mu) \log_2\left(p_F(f)\, p_M(m; T_\mu)\right) - \sum_{m \in L_M} \sum_{f \in L_F} p(f, m; T_\mu) \log_2 p(f, m; T_\mu)$

where $N$ represents the number of data groups and $\omega_i$ is the weighting parameter for each data group. Since we used a primary channel containing the raw image stack in conjunction with an assistant channel containing the geometry and texture feature maps for simultaneous registration, here $N = 3$. $c(T_\mu; I_F^i, I_M^i)$ is the cost function of each channel, where $I_F^i$ represents the fixed image (experimental data) and $I_M^i$ represents the moving image (template). $T_\mu$ denotes the deformation function of the registration model, with the parameter $\mu$ being optimized during the iterative registration process. In Equation 4, $L_F$ and $L_M$ are two sets of regularly spaced intensity bin centers, $p$ is the discrete joint probability, and $p_F$ and $p_M$ are the marginal discrete probabilities of the fixed and moving images. Here we used a three-level rigid + affine + B-spline model for the registration, with the rigid and affine transformations mainly aligning the overall orientation differences between the datasets and the B-spline model mainly aligning the local geometric differences. The B-spline model places a regular grid of control points onto the images. These control points are movable during the registration and cause the surrounding image data to be transformed, thereby permitting local, non-linear alignment of the image data. A step size of 30 pixels was set for the movement of control points in 3D space, and a five-level coarse-to-fine pyramidal registration was applied to achieve faster convergence. During the iterative optimization of $T_\mu$, we used the gradient descent method to efficiently approach the optimal registration of the template images to the fixed experimental images. The solution for $\mu$ can be expressed as
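
The weighted multi-channel cost of Equations 3 and 4 can be illustrated with a short NumPy sketch. This is not the Elastix implementation: it assumes the per-channel cost is the negative mutual information (which is what the two-term form of Equation 4 evaluates to), estimates the joint probability with a plain 2D histogram rather than the Parzen-window estimator a registration toolkit would typically use, and uses hypothetical function names and weights.

```python
import numpy as np


def neg_mutual_information(fixed, moving, bins=32):
    """Per-channel cost (Equation 4): negative mutual information in bits,
    estimated from a joint intensity histogram."""
    joint, _, _ = np.histogram2d(np.ravel(fixed), np.ravel(moving), bins=bins)
    p = joint / joint.sum()                  # joint probability p(f, m)
    p_f = p.sum(axis=1, keepdims=True)       # marginal of the fixed image
    p_m = p.sum(axis=0, keepdims=True)       # marginal of the moving image
    nz = p > 0                               # skip empty bins to avoid log(0)
    mi = np.sum(p[nz] * np.log2(p[nz] / (p_f @ p_m)[nz]))
    return -float(mi)


def multi_channel_cost(fixed_channels, moving_channels, weights, bins=32):
    """Equation 3: weight-normalized sum of the per-channel costs."""
    w = np.asarray(weights, dtype=float)
    costs = np.array([neg_mutual_information(f, m, bins)
                      for f, m in zip(fixed_channels, moving_channels)])
    return float(np.sum(w * costs) / np.sum(w))


# Toy volumes standing in for the raw, geometry, and texture channels.
rng = np.random.default_rng(0)
volume = rng.random((16, 16, 16))
unrelated = rng.random((16, 16, 16))
aligned_cost = multi_channel_cost([volume] * 3, [volume] * 3, [1.0, 1.0, 1.0])
misaligned_cost = multi_channel_cost([volume] * 3, [unrelated] * 3, [1.0, 1.0, 1.0])
```

A well-aligned pair shares more information and therefore yields a lower (more negative) cost, which is the quantity the optimizer drives down across all channels at once.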

(5) $\mu_{k+1} = \mu_k - l \cdot \frac{\partial c(T_\mu; I_F, I_M)}{\partial \mu}$

where $l$ is the learning rate, that is, the step size of the gradient descent optimization. The transformation parameters obtained from the multi-channel registration were finally applied to the annotation files to generate the atlas/annotation for the whole-brain data.
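
The update rule in Equation 5 can be sketched on a toy problem. The example below assumes a simple quadratic cost with a known minimum instead of the real registration cost, and the function names, learning rate, and iteration count are illustrative choices only.

```python
import numpy as np


def gradient_descent(grad, mu0, learning_rate=0.1, n_iter=200):
    """Iterate mu_{k+1} = mu_k - l * dc/dmu for a fixed number of steps."""
    mu = np.asarray(mu0, dtype=float)
    for _ in range(n_iter):
        mu = mu - learning_rate * grad(mu)
    return mu


# Toy cost c(mu) = ||mu - target||^2, whose gradient is 2 * (mu - target).
target = np.array([2.0, -1.0, 0.5])


def toy_cost_gradient(mu):
    return 2.0 * (mu - target)


mu_opt = gradient_descent(toy_cost_gradient, np.zeros(3))
```

In an actual registration, the gradient would be that of the multi-channel cost with respect to the transformation parameters, evaluated at each level of the coarse-to-fine pyramid.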

Visualization and quantification of brain-map results

Generation of 3D digital map for whole brain

We obtained a low-resolution annotation (20 μm isotropic resolution) for the entire mouse brain after registration. To generate a 3D digital framework based on the high-resolution raw image, the values of b and l recorded in the down-sampling step were used for resolution restoration. The annotation distinguishes different brain regions by pixel intensity. To generate a digital frame for quantitative analysis, we used the marching cubes algorithm (Lorensen and Cline, 1987) to generate 3D surface graphics, that is, the 3D digital maps. Then, through the programmable API link with Imaris (RRID:SCR_007370), we could readily visualize the 3D digital map and perform various quantifications in Imaris. After registration and interpolation were applied to the experimental data, a 3D digital map was visualized in Imaris (9.0.0) invoked by our program. Neuron tracing and cell counting tasks could then be performed in Imaris at native resolution (e.g., 1 × 1 × 10 μm³ for STPT data). During the neuron tracing process, the brain regions through which the selected neurons passed could be displayed in 3D from an arbitrary view. Furthermore, cell counting could be performed in parallel by simultaneously setting a number of kernel diameters and intensity thresholds for different segmented brain regions.

Distance calculation

The Euclidean distance between one pair of landmarks shown in different image datasets (Figure 2c) indicates the registration error and can be calculated as:

(6) $\rho = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}$
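
A minimal sketch of Equation 6 in Python follows. The optional voxel-size scaling, which converts index differences into physical units, is an added convenience and not something the source specifies; the function name is hypothetical.

```python
import math


def landmark_distance(p1, p2, voxel_size=(1.0, 1.0, 1.0)):
    """Euclidean distance between two 3D landmarks (Equation 6).

    p1, p2: (x, y, z) coordinates of the same landmark in two datasets.
    voxel_size: assumed per-axis scaling (e.g., in micrometers per voxel).
    """
    return math.sqrt(sum(((q - p) * s) ** 2
                         for p, q, s in zip(p1, p2, voxel_size)))


# Landmarks two voxels apart along y and z and one along x, at 20 um voxels.
error_um = landmark_distance((0, 0, 0), (1, 2, 2), voxel_size=(20.0, 20.0, 20.0))
```

Averaging this distance over all landmark pairs gives a single registration-error figure per method.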

Calculation of Dice scores

Dice score is the indicator for quantifying the accuracy of segmentation and can be calculated as:

(7) $\mathrm{Dice} = \frac{2(A \cap B)}{A + B}$

where $A$ is the ground truth of segmentation and $B$ is the segmentation result produced by the brain-mapping method. $A \cap B$ represents the number of pixels where $A$ and $B$ overlap, and $A + B$ refers to the total number of pixels in $A$ and $B$.
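
The Dice score of Equation 7 can be computed per region from two label maps, as in the following NumPy sketch. The label values and arrays are toy data, and returning NaN when a region is absent from both maps is a convention chosen here, not one stated in the source.

```python
import numpy as np


def dice_score(truth, prediction, label):
    """Dice = 2 * |A and B| / (|A| + |B|) for one region label."""
    a = np.asarray(truth) == label
    b = np.asarray(prediction) == label
    total = a.sum() + b.sum()
    if total == 0:
        return float("nan")   # region absent from both maps
    return float(2.0 * np.logical_and(a, b).sum() / total)


# Toy 2D label maps with two regions (1 and 2) and background (0).
truth_map = np.array([[1, 1, 0],
                      [2, 2, 0]])
pred_map = np.array([[1, 0, 0],
                     [2, 2, 2]])
dice_region_1 = dice_score(truth_map, pred_map, 1)
dice_region_2 = dice_score(truth_map, pred_map, 2)
```

Computing the score separately for each label is what allows region-by-region comparisons such as those between large regions and small nuclei.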

Here, following the BrainsMapi approach (Ni et al., 2020), we compared the registration/segmentation results of four registration tools at both the coarse region level and the fine nuclei level. To assess the accuracy of these methods at the brain-region level, 10 brain regions, Outline, CB, CP, HB (hindbrain), HIP, HY (hypothalamus), Isocortex, MB (midbrain), OLF, and TH (thalamus), were first selected from the entire brain for comparison. We then picked 50 planes in each selected brain region (500 planes in total for the 10 regions) and manually segmented them to generate the reference results (ground truth). For the nuclei-level comparison, we selected nine small sub-regions, ACA, ENT, MV, PAG, RT, SSp, SUB, VISp, and VMH, as targets and performed a similar operation on them, selecting five representative coronal sections for each region. To make the manual segmentation as objective as possible, two skilled annotators each independently repeated the above process five times, and the STAPLE algorithm (Warfield et al., 2004) was used to fuse the 10 manual segmentation results into the final ground-truth segmentation for each region.

Deep neural network segmentation

Generation of ground-truth training data

We chose 18 primary regions (levels 4 and 5) in the whole brain, Isocortex, HPF, OLF, CTXsp, STR, PAL, CB, DORpm, DORsm, HY, MBsen, MBmot, MBsta, P-sen, P-mot, P-sat, MY-sen, and MY-mot, for the DNN training and performance validation. The ground-truth annotation masks for these regions were readily obtained from the bi-channel registration procedure of BIRDS. To achieve highly generalizable DNN segmentation of whole and incomplete brains, we prepared two groups of data for training and validation, as follows: (1) Nine whole mouse brains containing 5000 annotated sagittal slices (660 × 400 for each slice) were first used as the training dataset. The sagittal sections of the whole brains were then cropped to generate different types of incomplete brains, as shown in Figure 5b. The training dataset thus comprised both complete and incomplete sagittal sections (5000 and 4500 slices, respectively). The DNN trained on these datasets was able to infer segmentations for given complete or incomplete sagittal planes. Figure 5—figure supplement 3 compares the DNN segmentation results for both the whole brain (a) and incomplete brains (b, c, d) with the corresponding ground truths. (2) To demonstrate the performance of the DNN in segmenting sub-regions at a finer scale, we chose ground-truth images from coronal sections of the hippocampus region (1100 slices from eight mouse brains, 570 × 400 for each slice) for DNN training. The corresponding ground-truth masks for the four major sub-regions of the hippocampus, CA1, CA2, CA3, and the DG, were then generated by registration. We validated the DNN's performance in segmenting these small sub-regions through comparison with the ground truths at four different coronal planes.

During network training, the number of segmentation classes was set to the number of regions plus one background class: 19 for the whole and large incomplete brains in Figure 5 and Figure 5—figure supplement 3, and 5 for the hippocampal sub-regions in Figure 5—figure supplement 2. Finally, each region was assigned a channel for generating the segmented output. The details of the training and test datasets are shown in Supplementary file 2.

Modified Deeplab V3+ network

Our DNN is a modification of Deeplab V3+, which has a classical encoder–decoder structure (Chen et al., 2020). The main framework is based on Xception, which is optimized for depthwise separable convolutions and thereby reduces the computational complexity while maintaining high performance. The network gradually reduces the spatial dimension of the feature map at the encoder stage, which allows complex information to be extracted at deeper levels. At the final stage of the encoder structure, we introduced Atrous Spatial Pyramid Pooling (ASPP), a dilated-convolution module that enlarges the receptive field of the convolution by changing the atrous rate. The atrous rates selected in ASPP were 6, 12, and 18. To solve the issue of aliased edges in the inference results of conventional Deeplab V3+, we introduced more original image information into the decoder by serially convolving the 2× and 4× down-sampling results generated at the encoder stage and concatenating them to the decoder. The network was trained for ~30,000 iterations with a learning rate of 0.001, a learning momentum of 0.9, and an output stride of 8. The training was implemented using two NVIDIA GeForce GTX 1080 Ti graphics cards and took approximately 40 hr.
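
To illustrate how dilation enlarges the receptive field in ASPP, the short sketch below computes the effective span of a dilated kernel for the three values quoted above. It reads 6, 12, and 18 as atrous (dilation) rates and assumes 3 × 3 kernels, as in standard Deeplab ASPP branches; neither detail is confirmed by the source, and the function name is hypothetical.

```python
def effective_kernel_size(kernel_size: int, rate: int) -> int:
    """Span covered by a dilated kernel: k + (k - 1) * (rate - 1)."""
    return kernel_size + (kernel_size - 1) * (rate - 1)


# Assumed 3 x 3 kernels at the three ASPP rates.
aspp_spans = {rate: effective_kernel_size(3, rate) for rate in (6, 12, 18)}
```

Each branch still uses only nine weights per channel, so the wider context comes at no extra parameter cost, which is the main appeal of combining several rates in parallel.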

Neuron tracing

The neurons in the whole-brain data were segmented semi-automatically using the Imaris software. By registering our brain to the ABA, we obtained the anatomical annotation for all the segmented areas. The Autopath Mode of the Filament module was applied to trace long-distance neurons. We first assigned one point on a long-distance neuron to initiate the tracing. Then, Imaris automatically calculated the pathway in accordance with the image data, reconstructed the 3D morphology, and linked it with the previous part. This procedure was repeated several times until the whole neuron, which could also be recognized by eye, was segmented.

Cell counting

The Spots module and Surface module of the Imaris software were used to count retrogradely labeled striatum-projecting cells in the SS and VIS areas. We first separated the brain regions into multiple channels in the Surface module. The automatic creation function in the Spots module was then applied to count the number of cells in each channel. To achieve accurate counting, the essential steps were an appropriate estimate of the cell body diameter and filtering of the chosen cells by tuning the quality parameters. The accuracy of this automatic counting procedure was compared with manual counting, which served here as the ground truth. After obtaining the total number of cells in each brain region, Imaris could also calculate the volume of each brain region according to the sub-region ranges divided by the Surface module. Then, knowing the number of cell nuclei and the volume of each segmented brain region, the density of cell nuclei inside each brain region could be calculated.
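
The density calculation described above reduces to dividing each regional cell count by the regional volume. The NumPy sketch below derives the volumes from a labeled annotation volume rather than from Imaris surfaces; this substitution, the function names, the label values, and the counts are all hypothetical and serve only to show the arithmetic.

```python
import numpy as np


def region_volumes_mm3(annotation, voxel_size_um):
    """Volume of each labeled region in mm^3 (label 0 is background)."""
    voxel_mm3 = float(np.prod(voxel_size_um)) * 1e-9   # um^3 -> mm^3
    labels, counts = np.unique(annotation, return_counts=True)
    return {int(lab): float(n * voxel_mm3)
            for lab, n in zip(labels, counts) if lab != 0}


def cell_density(cell_counts, volumes_mm3):
    """Cells per mm^3 for every region with a positive volume."""
    return {region: cell_counts[region] / volumes_mm3[region]
            for region in cell_counts if volumes_mm3.get(region, 0) > 0}


# Toy annotation: region 1 fills half the volume, region 2 a quarter.
toy_annotation = np.zeros((10, 10, 10), dtype=int)
toy_annotation[:5] = 1
toy_annotation[5:, :5] = 2
toy_volumes = region_volumes_mm3(toy_annotation, (100.0, 100.0, 100.0))
toy_density = cell_density({1: 200, 2: 50}, toy_volumes)
```

Because both the counts and the volumes are indexed by region, the same two dictionaries can be reused to compare densities across counting methods.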

Computing resources

We ran BIRDS on a Windows 10 workstation equipped with dual Xeon E5-2630 V3 CPUs, 1 TB RAM, and two GeForce GTX 1080 Ti graphics cards. Supplementary file 1 lists the memory and time consumption of our BIRDS plugin for processing 180 GB and 320 GB brain datasets. The image preprocessing time at stage 1 is approximately proportional to the size of the data. In contrast, since the datasets for registration are down-sampled to the same size to match the Allen CCF template, the time and memory consumption at stages 2 and 3 are nearly the same for the two datasets. The time and memory consumption for generating the 3D digital framework at stage 4 is proportional to the data size. It should be noted that memory with a capacity of at least 1.5 times the data size is required at this step. Therefore, when applying BIRDS to larger datasets, such as rat brain, we recommend a powerful workstation with at least one Xeon CPU, 1 TB of memory, and one state-of-the-art graphics card (GeForce RTX 3090) to guarantee smooth running of the whole pipeline.

Code availability

We have made our pipeline open access for the community. We have provided source code of the BIRDS for Windows 10 at https://github.com/bleach1by1/birds_reg (Wang, 2021a; copy archived at swh:1:rev:22cf3d792c3887708a65ddae43d6dde7ed8b7836). FIJI plugin and ancillary installation packages are provided at https://github.com/bleach1by1/BIRDS_plugin (Wang, 2021b; copy archived at swh:1:rev:41e90d4518321d6ca8e806ccadb2809bfa6bd475). BIRDS contains five core modules, image preprocessing, bi-channel registration, manual correction, link with Imaris software, and deep-learning segmentation, all of which can be executed on a GUI. We also provided a step-by-step tutorial and test data to facilitate the program implementation by other researchers.

Data availability

The Allen CCF is open access and available with related tools at https://atlas.brain-map.org/. The datasets (Brain1~5) have been deposited in Dryad at https://datadryad.org/stash/share/4fesXcJif0L2DnSj7YmjREe37yPm1bEnUiK49ELtALg and https://datadryad.org/stash/share/PWwOzHmOqVBa_CBplDW133X5AEGwFsuoZZ4BNW_nAsQ. The code and plugin can be found at the following link: https://github.com/bleach1by1/BIRDS_plugin (copy archived at https://archive.softwareheritage.org/swh:1:rev:41e90d4518321d6ca8e806ccadb2809bfa6bd475/), https://github.com/bleach1by1/birds_reg (copy archived at https://archive.softwareheritage.org/swh:1:rev:22cf3d792c3887708a65ddae43d6dde7ed8b7836/), https://github.com/bleach1by1/birds_dl.git (copy archived at https://archive.softwareheritage.org/swh:1:rev:92d3a68c7805cbef58c834e39c807e8cbaa902e6/), https://github.com/bleach1by1/BIRDS_demo (copy archived at https://archive.softwareheritage.org/swh:1:rev:61ad20ab070b7af9881d69df643fa4b878993f90/). All data generated or analysed during this study are included in the manuscript. Source data files have been provided for Figures 1, 2, 3, 4, 5 and Figure 2—figure supplements 3,4; Figure 5—figure supplements 2,3.

References

  1. Conference
    1. Chen L-C
    2. Zhu Y
    3. Papandreou G
    4. Schroff F
    5. Adam H
    (2020)
    Encoder-decoder with atrous separable convolution for semantic image segmentation
    Proceedings of the European Conference on Computer Vision (ECCV). pp. 801–818.
  2. Conference
    1. de Brebisson A
    2. Montana G
    (2015) Deep neural networks for anatomical brain segmentation
    IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). pp. 20–28.
    https://doi.org/10.1109/CVPRW.2015.7301312
  3. Conference
    1. He K
    2. Gkioxari G
    3. Dollár P
    4. Girshick R
    (2019) Mask R-CNN
    IEEE International Conference on Computer Vision (ICCV). pp. 2961–2969.
    https://doi.org/10.1109/ICCV.2017.322
  4. Conference
    1. Kovesi P
    (2019)
    Psivt 2019
    Pacific-Rim Symposium on Image and Video Technology.
  5. Conference
    1. Long J
    2. Shelhamer E
    3. Darrell T
    (2015) Fully convolutional networks for semantic segmentation
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR). pp. 3431–3440.
    https://doi.org/10.1109/CVPR.2015.7298965
    1. Thévenaz P
    2. Unser M
    (2000) Optimization of mutual information for multiresolution image registration
    IEEE Transactions on Image Processing : A Publication of the IEEE Signal Processing Society 9:2083–2099.
    https://doi.org/10.1109/83.887976

Decision letter

  1. Laura L Colgin
    Senior Editor; University of Texas at Austin, United States
  2. Jeffrey C Smith
    Reviewing Editor; National Institute of Neurological Disorders and Stroke, United States
  3. Charles R Gerfen
    Reviewer; National Institute of Mental Health, United States
  4. Jahandar Jahanipour
    Reviewer; National Institutes of Health, United States

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Acceptance summary:

A major advance in analyses of brain and neural circuit morphology is the capability to register labeled neuronal structures to a reference digital 3D brain structural atlas such as the Allen Institute for Brain Sciences mouse brain Common Coordinate Framework. In this technical advance, the authors have developed open source, automated methods for 3D morphological mapping in the mouse brain employing dual-channel microscopic neuroimage data and deep neural network learning. The authors demonstrate improved accuracy of image registration to the Allen Institute atlas and morphological reconstruction of mouse brain image sets, suggesting that the authors' approach is an important addition to the tool box for 3D brain morphological reconstruction and representation.

Decision letter after peer review:

Thank you for submitting your article "Bi-channel Image Registration and Deep-learning Segmentation (BIRDS) for efficient, versatile 3D mapping of mouse brain" for consideration by eLife. Your article has been reviewed by two peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Laura Colgin as the Senior Editor. The following individuals involved in review of your submission have agreed to reveal their identity: Charles R Gerfen (Reviewer #1); Jahandar Jahanipour (Reviewer #2).

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

We would like to draw your attention to changes in our revision policy that we have made in response to COVID-19 (https://elifesciences.org/articles/57162). Specifically, we are asking editors to accept without delay manuscripts, like yours, that they judge can stand as eLife papers without additional data, even if they feel that they would make the manuscript stronger. Thus the revisions requested below only address clarity and presentation.

Summary:

Wang et al. present a technique for registering and analysis of whole mouse brain image sets to the Allen CCF. A major advance in the analysis of neuroanatomical circuits is the ability to register labeled neuronal structures in the whole mouse brain to a common mouse brain atlas. Numerous studies have used various techniques for such registration with varying success and accuracy. The authors' BIRDS process for registering whole mouse brain image sets to the Allen CCF uses a number of innovative approaches that provide some advances in the accuracy of registration, at least for the examples described. There are a number of major revisions listed below that need to be addressed.

Essential revisions:

1) Despite the steps being described in detail, parts of the image processing and registration process are difficult to understand, which requires careful revision of the text to clarify the steps in the procedure depicted in Figure 1:

– The down-sampling is straightforward. However, the "dynamic down-sampling ratios applied to the.… " is not at all clear. What exactly is being done and what is the purpose? By "dynamic down-sampling ratio" is it meant that instead of the ratio of the original (1:20 downsampling ratio to convert the original to the dimensions of the Allen CCF ) that the ratio might be different at different rostro-caudal levels of the brain? And is this to account for differences in how the brains are processed. For instance, experimental brains might have different dimensions in the x and y planes than the Allen CCF or might be deformed in different ways. And that this process makes those corrections at the appropriate rostro-caudal levels.

– The next step(s) are very difficult to follow: the "feature based iterative registration of the Allen reference image with pre-processed experimental images". This step involves the application of: 1) phase congruency algorithm to extract high contrast information from both experimental and template brain (CCF?) images and 2) enhanced axial mutual information extracted by grayscale reversal processing. And then somehow a "bi-channel" registration was performed. What is unclear is what channels are used for the "bi-channel" registration. Is a separate channel formed by the combination of the processes of 1) and 2)? Is the original image also combined with these channels? Are these processes applied to the Allen 2-photon images? And then exactly what is registered and how is that done?

– It might be better if the process steps were described separately and succinctly from comments about how this is better than other methods. The description in the Materials and methods section is somewhat better, but still confusing. The figure legend for Figure 1—figure supplement 4 comes close to being understood.

– The "bi-channel" process is the most confusing, and since it is part of the acronym should be better explained. In general, it seems straightforward, using not only the original image data but also that data that has been processed to enhance edges and other features. What is confusing is whether you are also using image data from channels that have labeled neurons and axons – which is stored in other channels?

2) Authors propose an algorithm for registration of high-resolution large-scale microscopic images of whole mouse brain datasets. This type of computational processing requires adequate memory and processing power which might not be available in a standard laboratory environment. Authors should elaborate on the computational techniques to address the large dataset handling of the proposed algorithm in regards of memory and the speed. Furthermore, is the proposed algorithm expandable to larger datasets such as super-resolution images or high-resolution whole rat brain datasets specifically in the context of memory and speed?

3) Authors discuss downsampling the dataset before registration. How does this downsampling affect the results specifically the smoothness of the borders between the regions?

4) The tracing of each fluorescently labelled neuronal cell is not clear! The method for tracing is neither explained nor referenced in the Results section.

5) Is the Deep Neural Network mainly to be used for incomplete brain datasets? Would be nice if it helped with providing better registration for registration of local structure differences, such as different layers in cortical structures or areas with multiple nuclei in addition to when there is damage to the tissue. Also, the 2-photon image data used for registration does not always provide sufficient morphologic features to identify certain brain structures, such as certain subnuclei in the brainstem and thalamus and to some extent different cortical layers. Basing the training data set on the structures detected using bi-channel procedures on 2-photon data may not best identify certain brain structures.

6) The DNN structure needs further clarification in terms of input and output. The input size is not specified; is it 2D section or 3D? How many output classes are defined? Does each region get a specific node in the output? It is not clear how the network output is selected. In "Generation of ground-truth training data" section it is mentioned that 18 primary regions were chosen for training set. In inference, the four major sub-regions of hippocampus, CA1, CA2, CA3, and the DG are not in the original 18 regions defined in the training set. The structure of the training set both in the training and test mode should be clearly explained.

7) It is recommended to report the median values of the comparison of different methods in a separate table rather than the legend of the figure for easier readability.

8) Variables in Equations 1 and 2 are not fully defined.

9) Equation 3 only explains the relationship of the final cost function to the Individual cost functions. The individual cost functions are not defined.

10) The "Marching cubes" algorithm introduced in "Generation of 3D digital map for whole brain" section is not defined. Is it a proposed algorithm by authors or an algorithm used from somewhere else? If latter, the authors should cite the paper.

https://doi.org/10.7554/eLife.63455.sa1

Author response

Essential revisions:

1) Despite the steps being described in detail, parts of the image processing and registration process are difficult to understand, which requires careful revision of the text to clarify the steps in the procedure depicted in Figure 1:

– The down-sampling is straightforward. However, the "dynamic down-sampling ratios applied to the.… " is not at all clear. What exactly is being done and what is the purpose? By "dynamic down-sampling ratio" is it meant that instead of the ratio of the original (1:20 downsampling ratio to convert the original to the dimensions of the Allen CCF ) that the ratio might be different at different rostro-caudal levels of the brain? And is this to account for differences in how the brains are processed. For instance, experimental brains might have different dimensions in the x and y planes than the Allen CCF or might be deformed in different ways. And that this process makes those corrections at the appropriate rostro-caudal levels.

We apologize for the ambiguity. Beyond individual differences between animals, sample preparation often introduces additional nonuniform deformation, so the scaling factors relative to the Allen CCF template differ between parts of the brain. This nonuniform deformation is especially pronounced along the AP axis and makes the subsequent precise registration much more difficult. At the image preprocessing step, we therefore divided the whole brain into several z sub-stacks (6 in our demonstration) delimited by selected landmark planes (7 in our demonstration) and applied a different, appropriate re-sampling ratio to each sub-stack to finely re-adjust its depth. The dynamic re-sampling ratios are calculated by matching the positions of the landmark planes in the Allen CCF3 template with those in the sample data (Materials and methods section). Compressing or stretching the depth of each sub-stack by its ratio rectifies the nonuniform deformation and restores uniform proportions along the rostro-caudal axis. We verified that this step benefits the subsequent image registration: Figure 1—figure supplement 2 of the revised manuscript shows the improvement in registration accuracy obtained with our dynamic down-sampling preprocessing. We have further illustrated the dynamic down-sampling operation in the revised Figure 1 and improved the descriptions in Results and Figure 1—figure supplement 1.
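
As an illustration only, and not the BIRDS implementation, the per-sub-stack re-sampling described in this response can be sketched in a few lines of Python. The function names, the nearest-neighbour slice selection, and the landmark positions in the usage example are all assumptions made for the sketch; the response only states that 7 landmark planes define 6 sub-stacks, each with its own ratio.

```python
from bisect import bisect_right


def substack_ratios(sample_planes, template_planes):
    """Return one z re-sampling ratio (template depth / sample depth) per sub-stack."""
    if len(sample_planes) != len(template_planes) or len(sample_planes) < 2:
        raise ValueError("need two matching landmark lists with at least two planes")
    ratios = []
    for i in range(len(sample_planes) - 1):
        sample_depth = sample_planes[i + 1] - sample_planes[i]
        template_depth = template_planes[i + 1] - template_planes[i]
        if sample_depth <= 0 or template_depth <= 0:
            raise ValueError("landmark planes must be strictly increasing")
        ratios.append(template_depth / sample_depth)
    return ratios


def template_to_sample_z(z, sample_planes, template_planes):
    """Map a template z index to a (fractional) sample z index, piecewise linearly."""
    i = bisect_right(template_planes, z) - 1
    i = max(0, min(i, len(template_planes) - 2))  # clamp to a valid sub-stack
    t0, t1 = template_planes[i], template_planes[i + 1]
    s0, s1 = sample_planes[i], sample_planes[i + 1]
    return s0 + (z - t0) * (s1 - s0) / (t1 - t0)


def resample_stack(slices, sample_planes, template_planes):
    """Pick, for every template z, the nearest sample slice (assumed interpolation)."""
    out = []
    for z in range(template_planes[0], template_planes[-1] + 1):
        src = round(template_to_sample_z(z, sample_planes, template_planes))
        out.append(slices[src])
    return out


# Hypothetical landmark positions (slice indices): 7 planes, hence 6 sub-stacks.
sample_landmarks = [0, 40, 95, 160, 230, 300, 350]
template_landmarks = [0, 50, 100, 170, 250, 310, 370]
print(substack_ratios(sample_landmarks, template_landmarks))
```

A ratio above 1 stretches a sub-stack and a ratio below 1 compresses it, so each landmark plane in the sample ends up at the depth of its counterpart in the template.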

– The next step(s) are very difficult to follow: the "feature based iterative registration of the Allen reference image with pre-processed experimental images". This step involves the application of: 1) phase congruency algorithm to extract high contrast information from both experimental and template brain (CCF?) images and 2) enhanced axial mutual information extracted by grayscale reversal processing. And then somehow a "bi-channel" registration was performed. What is unclear is what channels are used for the "bi-channel" registration. Is a separate channel formed by the combination of the processes of 1) and 2)? Is the original image also combined with these channels? Are these processes applied to the Allen 2-photon images? And then exactly what is registered and how is that done?

– It might be better if the process steps were described separately and succinctly from comments about how this is better than other methods. The description in the Materials and methods section is somewhat better, but still confusing. The figure legend for Figure 1—figure supplement 4 comes close to being understood.

– The "bi-channel" process is the most confusing, and since it is part of the acronym should be better explained. In general, it seems straightforward, using not only the original image data but also that data that has been processed to enhance edges and other features. What is confusing is whether you are also using image data from channels that have labeled neurons and axons – which is stored in other channels?

We use the term "bi-channel" to distinguish our method from conventional registration, which uses only the brain background images. The first, primary channel consists of the brain background images of both the sample and the Allen CCF brains and is always included in the registration. The second, assistant channel consists of the generated texture and geometry maps of both brains and participates in the registration with an appropriate weighting applied. Both the primary and the assistant channels are therefore derived from the background images of the sample and Allen CCF brains. No images of labelled neurons or axons are required at the registration step. In the revised manuscript, we have further clarified the definition and implementation of the bi-channel registration in the Results and Materials and methods sections. The registration procedure is also better illustrated in the revised Figure 1—figure supplement 4.
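
To make the role of the weighting concrete, here is a minimal sketch of how a primary-channel and an assistant-channel similarity term could be combined into one cost. It assumes a mutual-information similarity and a simple weighted sum; the exact cost functions are those given as Equations 3 and 4 of the paper, which are not reproduced in this response, and the names `mutual_information`, `bichannel_cost` and `weight` are hypothetical.

```python
from collections import Counter
from math import log


def mutual_information(a, b):
    """Mutual information (in nats) of two equally long sequences of discrete intensities."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    count_a, count_b, count_ab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        (c / n) * log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in count_ab.items()
    )


def bichannel_cost(primary_fixed, primary_moving,
                   assistant_fixed, assistant_moving, weight=0.5):
    """Assumed combined cost: negative weighted sum of the two channel similarities."""
    primary_term = mutual_information(primary_fixed, primary_moving)
    assistant_term = mutual_information(assistant_fixed, assistant_moving)
    return -(primary_term + weight * assistant_term)


# Toy flattened images with two intensity levels (hypothetical data).
background_template = [0, 0, 1, 1]
background_sample = [0, 0, 1, 1]
feature_template = [1, 0, 0, 1]
feature_sample = [1, 0, 1, 0]
print(bichannel_cost(background_template, background_sample,
                     feature_template, feature_sample))
```

An optimizer would lower this cost by adjusting the transform applied to the moving images; setting `weight` to 0 reduces the sketch to a conventional single-channel registration on the background images alone.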

2) Authors propose an algorithm for registration of high-resolution large-scale microscopic images of whole mouse brain datasets. This type of computational processing requires adequate memory and processing power which might not be available in a standard laboratory environment. Authors should elaborate on the computational techniques to address the large dataset handling of the proposed algorithm in regards of memory and the speed. Furthermore, is the proposed algorithm expandable to larger datasets such as super-resolution images or high-resolution whole rat brain datasets specifically in the context of memory and speed?

We appreciate the reviewer's advice, which has helped us improve the manuscript. We have discussed the computational cost and the possible extension to larger datasets in the Materials and methods section. A new Supplementary file 1 summarizes the data size, memory cost and time consumption at the different BIRDS stages for processing 180 GB STPT and 320 GB LSFM datasets.

3) Authors discuss downsampling the dataset before registration. How does this downsampling affect the results specifically the smoothness of the borders between the regions?

Because it greatly reduces computational cost (time and memory), down-sampling the raw image dataset before registration is a standard operation, and it has only a minor effect on the smoothness of the borders between regions. This step is widely used in established brain registration tools such as ClearMap (Renier et al., 2016), aMAP (Niedworok et al., 2016) and MIRACL (Goubran et al., 2019), which down-sample the raw image data to 25-μm, 12.5-μm and 25-μm voxel sizes, respectively. In our method, we chose to down-sample to a 20-μm voxel size, which is within this normal range.
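
For readers who want to see the arithmetic, the following sketch computes the per-axis scale factors needed to reach a 20-μm voxel and the resulting volume shape. The raw voxel size and raw shape in the usage example are hypothetical, since this response does not state them, and the function names are illustrative.

```python
def downsample_factors(raw_voxel_um, target_voxel_um=20.0):
    """Per-axis scale factors (below 1 shrinks) from raw voxel size to the target size."""
    if target_voxel_um <= 0 or any(v <= 0 for v in raw_voxel_um):
        raise ValueError("voxel sizes must be positive")
    return tuple(v / target_voxel_um for v in raw_voxel_um)


def downsampled_shape(raw_shape, factors):
    """Shape of the volume after scaling each axis by its factor (at least 1 voxel)."""
    return tuple(max(1, round(n * f)) for n, f in zip(raw_shape, factors))


# Hypothetical raw data: (z, y, x) voxel size in micrometres and shape in voxels.
raw_voxel = (10.0, 2.0, 2.0)
raw_shape = (500, 4000, 6000)
factors = downsample_factors(raw_voxel)
small_shape = downsampled_shape(raw_shape, factors)
print(factors, small_shape)
```

Under these assumed numbers the voxel count drops by the product of the three factors, which is where the savings in memory and registration time come from.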

4) The tracing of each fluorescently labelled neuronal cell is not clear! The method for tracing is neither explained nor referenced in the Results section.

The fluorescently labelled neurons shown in Figure 3 were traced using the filament module of Imaris, which combines automatic neuron segmentation with human inspection and correction. We have added descriptions of neuron tracing and counting to the Results and Materials and methods sections.

5) Is the Deep Neural Network mainly to be used for incomplete brain datasets? Would be nice if it helped with providing better registration for registration of local structure differences, such as different layers in cortical structures or areas with multiple nuclei in addition to when there is damage to the tissue. Also, the 2-photon image data used for registration does not always provide sufficient morphologic features to identify certain brain structures, such as certain subnuclei in the brainstem and thalamus and to some extent different cortical layers. Basing the training data set on the structures detected using bi-channel procedures on 2-photon data may not best identify certain brain structures.

We appreciate the reviewer's question. In the BIRDS pipeline, the registration step already handles whole-brain datasets, so the DNN is designed for incomplete brain datasets, which are otherwise difficult to register. Provided that finer labelling data are available, the DNN can certainly segment finer brain structures, such as the CA1, CA2, CA3 and DG structures shown in Figure 5—figure supplement 4. Because only 2-photon and light-sheet datasets are available to us, we are currently unable to train the network with data containing sufficient morphological features for certain brain structures, such as certain subnuclei in the brainstem and thalamus. However, BIRDS is fully open source, so higher-quality data from other laboratories, with specific regions labelled, could be processed using this approach. We look forward to seeing BIRDS show increasingly strong segmentation capabilities in the future.

6) The DNN structure needs further clarification in terms of input and output. The input size is not specified; is it 2D section or 3D? How many output classes are defined? Does each region get a specific node in the output? It is not clear how the network output is selected. In "Generation of ground-truth training data" section it is mentioned that 18 primary regions were chosen for training set. In inference, the four major sub-regions of hippocampus, CA1, CA2, CA3, and the DG are not in the original 18 regions defined in the training set. The structure of the training set both in the training and test mode should be clearly explained.

We apologize for the confusion. The inputs to the DNN are 2D image slices. We trained the network twice, separately, once on a dataset of 18 primary brain regions and once on a dataset of 4 small sub-regions of the hippocampus, to demonstrate the network's inference capability for both coarse segmentation of a large incomplete brain (Figure 5) and fine segmentation of a specific region of interest (Figure 5—figure supplement 4). During network training, we set the number of segmentation classes to the number of regions plus one background class, which gives 19 classes in Figure 5 and 5 classes in Figure 5—figure supplement 4. Finally, each region was assigned a channel, rather than a node, for generating the segmented output. In the revised manuscript, we have included a Supplementary file 2 that provides detailed information on the training and test datasets. More information on the DNN can be found in the revised Materials and methods section and in the source code we have provided.
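
The one-channel-per-region output described here can be turned into a single label image in several ways. The sketch below assumes a per-pixel argmax over the class channels, with channel 0 as background; this is a common convention rather than something this response specifies, and `channels_to_labels` is a hypothetical helper, not a function from the BIRDS source code.

```python
def channels_to_labels(scores):
    """Collapse per-class score maps (C x H x W nested lists) into an H x W label map.

    Each pixel receives the index of the channel with the highest score;
    ties go to the lowest index. Channel 0 is assumed to be background.
    """
    n_classes = len(scores)
    height, width = len(scores[0]), len(scores[0][0])
    return [
        [max(range(n_classes), key=lambda c: scores[c][y][x]) for x in range(width)]
        for y in range(height)
    ]


# Toy output with 3 channels (background plus 2 regions) on a 1 x 2 slice.
toy_scores = [
    [[0.7, 0.1]],  # channel 0: background
    [[0.2, 0.3]],  # channel 1: first region
    [[0.1, 0.6]],  # channel 2: second region
]
print(channels_to_labels(toy_scores))  # prints [[0, 2]]
```

With the class counts given in this response, `scores` would hold 19 channels for the coarse model of Figure 5 and 5 channels for the hippocampal model of Figure 5—figure supplement 4.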

7) It is recommended to report the median values of the comparison of different methods in a separate table rather than the legend of the figure for easier readability.

In response to the reviewer's suggestion, we have added a separate Supplementary file 3 that reports the median values.

8) Variables in Equations 1 and 2 are not fully defined.

We have fully defined the variables in Equations 1 and 2 (Materials and methods section).

9) Equation 3 only explains the relationship of the final cost function to the Individual cost functions. The individual cost functions are not defined.

We have defined the individual cost functions as new Equation 4 in the Materials and methods section.

10) The "Marching cubes" algorithm introduced in "Generation of 3D digital map for whole brain" section is not defined. Is it a proposed algorithm by authors or an algorithm used from somewhere else? If latter, the authors should cite the paper.

We have cited the paper that introduced the Marching cubes algorithm in the revised manuscript (Lorensen and Cline, 1987).

https://doi.org/10.7554/eLife.63455.sa2

Article and author information

Author details

  1. Xuechun Wang

    School of Optical and Electronic Information- Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, China
    Contribution
    Data curation, Software, Validation, Visualization, Methodology, Writing - original draft
    Contributed equally with
    Weilin Zeng and Xiaodan Yang
    Competing interests
    No competing interests declared
  2. Weilin Zeng

    School of Optical and Electronic Information- Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, China
    Contribution
    Software, Validation
    Contributed equally with
    Xuechun Wang and Xiaodan Yang
    Competing interests
    No competing interests declared
  3. Xiaodan Yang

    School of Basic Medicine, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
    Contribution
    Resources, Visualization
    Contributed equally with
    Xuechun Wang and Weilin Zeng
    Competing interests
    No competing interests declared
  4. Chunyu Fang

    School of Optical and Electronic Information- Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, China
    Contribution
    Resources, Visualization
    Competing interests
    No competing interests declared
  5. Yunyun Han

    School of Basic Medicine, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
    Contribution
    Resources, Visualization, Writing - review and editing
    For correspondence
    yhan@hust.edu.cn
    Competing interests
    No competing interests declared
  6. Peng Fei

    School of Optical and Electronic Information- Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, China
    Contribution
    Software, Funding acquisition, Visualization, Writing - review and editing
    For correspondence
    feipeng@hust.edu.cn
    Competing interests
    No competing interests declared
    ORCID: 0000-0003-3764-817X

Funding

National Natural Science Foundation of China (21874052)

  • Peng Fei

National Natural Science Foundation of China (31871089)

  • Yunyun Han

Innovation Fund of WNLO

  • Peng Fei

Junior Thousand Talents Program of China

  • Yunyun Han
  • Peng Fei

The FRFCU (HUST:2172019kfyXKJC077)

  • Yunyun Han

National Key R&D program of China (2017YFA0700501)

  • Peng Fei

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank Yongsheng Zhang, Shaoqun Zeng, Haohong Li, Luoying Zhang, Man Jiang, and Bo Xiong for discussions and comments on the work, and Hao Zhang for help with the code implementation. This work was supported by the National Key R&D Program of China (2017YFA0700501, PF), the National Natural Science Foundation of China (21874052 for PF, 31871089 for YH), the Innovation Fund of WNLO (PF), the Junior Thousand Talents Program of China (PF and YH), and the FRFCU (HUST:2172019kfyXKJC077, YH).

Senior Editor

  1. Laura L Colgin, University of Texas at Austin, United States

Reviewing Editor

  1. Jeffrey C Smith, National Institute of Neurological Disorders and Stroke, United States

Reviewers

  1. Charles R Gerfen, National Institute of Mental Health, United States
  2. Jahandar Jahanipour, National Institutes of Health, United States

Publication history

  1. Received: September 25, 2020
  2. Accepted: December 27, 2020
  3. Accepted Manuscript published: January 18, 2021 (version 1)
  4. Version of Record published: January 27, 2021 (version 2)

Copyright

© 2021, Wang et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 2,246
    Page views
  • 267
    Downloads
  • 2
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.
