A scalable and modular automated pipeline for stitching of large electron microscopy datasets
Abstract
Serial-section electronmicroscopy (ssEM) is themethod of choice for studyingmacroscopic biological samples at extremely high resolution in three dimensions. In the nervous system, nanometer-scale images are necessary to reconstruct dense neural wiring diagrams in the brain, so called connectomes. In order to use this data, consisting of up to 108 individual EM images, it must be assembled into a volume, requiring seamless 2D stitching from each physical section followed by 3D alignment of the stitched sections. The high throughput of ssEM necessitates 2D stitching to be done at the pace of imaging, which currently produces tens of terabytes per day. To achieve this, we present a modular volume assembly software pipeline ASAP (Assembly Stitching and Alignment Pipeline) that is scalable to datasets containing petabytes of data and parallelized to work in a distributed computational environment. The pipeline is built on top of the Render (27) services used in the volume assembly of the brain of adult Drosophilamelanogaster (30). It achieves high throughput by operating on themeta-data and transformations of each image stored in a database, thus eliminating the need to render intermediate output. ASAP ismodular, allowing for easy incorporation of new algorithms without significant changes in the workflow. The entire software pipeline includes a complete set of tools for stitching, automated quality control, 3D section alignment, and final rendering of the assembled volume to disk. ASAP has been deployed for continuous stitching of several large-scale datasets of the mouse visual cortex and human brain samples including one cubic millimeter of mouse visual cortex (28; 8) at speeds that exceed imaging. The pipeline also has multi-channel processing capabilities and can be applied to fluorescence and multi-modal datasets like array tomography.
Data availability
The current manuscript describes is a software infrastructure resource that is being made publicly available. The manuscript is not a data generation manuscript. Nevertheless, one of the datasets used is already publicly available on https://www.microns-explorer.org/cortical-mm3#em-imagery with available imagery and segmentation (https://tinyurl.com/cortical-mm3).Moreover cloud-volume (https://github.com/seung-lab/cloud-volume) can be used to programmatically download EM imagery from either Amazon or Google with the cloud paths described below. The imagery was reconstructed in two portions, referred to internally by their nicknames 'minnie65' and 'minnie35' reflecting their relative portions of the total data. The two portions are aligned across an interruption in sectioning.minnie65:AWS Bucket: precomputed://https://bossdb-open-data.s3.amazonaws.com/iarpa_microns/minnie/minnie65/emGoogle Bucket: precomputed://https://storage.googleapis.com/iarpa_microns/minnie/minnie65/emminnie35:AWS Bucket: precomputed://https://bossdb-open-data.s3.amazonaws.com/iarpa_microns/minnie/minnie35/emGoogle Bucket: precomputed://https://storage.googleapis.com/iarpa_microns/minnie/minnie35/emWe have also made available in Dryad raw data of the remaining datasets https://doi.org/10.5061/dryad.qjq2bvqhr
-
ASAP-TEM-sampleDryad Digital Repository, doi:10.5061/dryad.qjq2bvqhr.
-
MICrONS multi-area datasethttps://doi.org/10.1101/2021.07.28.454025.
Article and author information
Author details
Funding
IARPA (D16PC00004)
- Gayathri Mahalingam
- Russel Torres
- Daniel Kapner
- Tim Fliss
- Shamishtaa Seshamani
- Rob Young
- Samuel Kinn
- JoAnn Buchanan
- Marc M Takeno
- Wenjing Yin
- Daniel J Bumbarger
- R Clay Reid
- Forrest Collman
- Nuno Macarico da Costa
The funders had no role in study design and interpretation, or the decision to submit the work for publication.
Ethics
Animal experimentation: All procedures were carried out in accordance with Institutional Animal Care and Use Committee approval at the Allen Institute for Brain Science with protocol numbers 1503, 1801 and 1808
Human subjects: Human surgical specimen was obtained from local hospital in collaboration with local neurosurgeon. The sample collection was approved by the Western Institutional Review Board (Protocol # SNI 0405). Patient provided informed consent and experimental procedures were approved by hospital institute review boards before commencing the study.
Copyright
© 2022, Mahalingam et al.
This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.
Metrics
-
- 2,278
- views
-
- 359
- downloads
-
- 22
- citations
Views, downloads and citations are aggregated across all versions of this paper published by eLife.
Download links
Downloads (link to download the article as PDF)
Open citations (links to open the citations from this article in various online reference manager services)
Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)
Further reading
-
- Cell Biology
- Computational and Systems Biology
Induced pluripotent stem cell (iPSC) technology is revolutionizing cell biology. However, the variability between individual iPSC lines and the lack of efficient technology to comprehensively characterize iPSC-derived cell types hinder its adoption in routine preclinical screening settings. To facilitate the validation of iPSC-derived cell culture composition, we have implemented an imaging assay based on cell painting and convolutional neural networks to recognize cell types in dense and mixed cultures with high fidelity. We have benchmarked our approach using pure and mixed cultures of neuroblastoma and astrocytoma cell lines and attained a classification accuracy above 96%. Through iterative data erosion, we found that inputs containing the nuclear region of interest and its close environment, allow achieving equally high classification accuracy as inputs containing the whole cell for semi-confluent cultures and preserved prediction accuracy even in very dense cultures. We then applied this regionally restricted cell profiling approach to evaluate the differentiation status of iPSC-derived neural cultures, by determining the ratio of postmitotic neurons and neural progenitors. We found that the cell-based prediction significantly outperformed an approach in which the population-level time in culture was used as a classification criterion (96% vs 86%, respectively). In mixed iPSC-derived neuronal cultures, microglia could be unequivocally discriminated from neurons, regardless of their reactivity state, and a tiered strategy allowed for further distinguishing activated from non-activated cell states, albeit with lower accuracy. Thus, morphological single-cell profiling provides a means to quantify cell composition in complex mixed neural cultures and holds promise for use in the quality control of iPSC-derived cell culture models.
-
- Cell Biology
Collagen-I fibrillogenesis is crucial to health and development, where dysregulation is a hallmark of fibroproliferative diseases. Here, we show that collagen-I fibril assembly required a functional endocytic system that recycles collagen-I to assemble new fibrils. Endogenous collagen production was not required for fibrillogenesis if exogenous collagen was available, but the circadian-regulated vacuolar protein sorting (VPS) 33b and collagen-binding integrin α11 subunit were crucial to fibrillogenesis. Cells lacking VPS33B secrete soluble collagen-I protomers but were deficient in fibril formation, thus secretion and assembly are separately controlled. Overexpression of VPS33B led to loss of fibril rhythmicity and overabundance of fibrils, which was mediated through integrin α11β1. Endocytic recycling of collagen-I was enhanced in human fibroblasts isolated from idiopathic pulmonary fibrosis, where VPS33B and integrin α11 subunit were overexpressed at the fibrogenic front; this correlation between VPS33B, integrin α11 subunit, and abnormal collagen deposition was also observed in samples from patients with chronic skin wounds. In conclusion, our study showed that circadian-regulated endocytic recycling is central to homeostatic assembly of collagen fibrils and is disrupted in diseases.