A multi-resolution imaging and analysis pipeline for comparative circuit reconstruction in insects

  1. Lund University, Department of Biology, Lund Vision Group, Lund, Sweden
  2. Macquarie University, School of Natural Sciences, Ecological Neuroscience Group, Sydney, Australia
  3. Microscopy Australia Facility at the Centre for Microscopy and Microanalysis (CMM), The University of Queensland, Brisbane, Australia
  4. Tedore Interactive, Hamburg, Germany
  5. HHMI Janelia, Ashburn, United States
  6. Lund University, NanoLund, Lund, Sweden

Peer review process

Not revised: This Reviewed Preprint includes the authors’ original preprint (without revision), an eLife assessment, public reviews, and a provisional response from the authors.


Editors

  • Reviewing Editor
    Albert Cardona
    University of Cambridge, Cambridge, United Kingdom
  • Senior Editor
    Albert Cardona
    University of Cambridge, Cambridge, United Kingdom

Reviewer #1 (Public review):

Summary:

This manuscript presents an end-to-end pipeline intended to accelerate EM-based connectomics by combining low-resolution imaging of large volumes with synapse-level imaging restricted to selected regions of interest. In principle, this strategy can substantially reduce imaging time, computational demands, analysis time, and overall cost.

General note:

Overall, I found the manuscript interesting and valuable, particularly as a description of how one laboratory has assembled and applied a practical workflow to reconstruct and analyze the central complex across multiple insect species. In that sense, the work is compelling as an account of a real, functioning strategy for comparative connectomics, and I appreciated reading it. My main reservation is not about the relevance of the biological problem or the utility of the pipeline in the authors' own hands, but about whether the manuscript, in its current form, fully meets the expectations of a tools-and-resources paper. Such a paper is expected to share new techniques, software tools, datasets, and other resources intended to be usable by the community. Here, because much of the pipeline builds on existing methods and software, the key value added should be a particularly clear demonstration of how these components were adapted, integrated, validated, and documented for this specific use case, in a way that others could realistically reproduce and adopt. At present, that translational and reproducibility-oriented component does not yet seem sufficiently developed, despite the clear promise of the overall approach.

Major comments:

(1) The work is valuable as a practical integration and application of multiple existing tools into a coherent pipeline, together with a new multi-resolution imaging strategy. However, the manuscript at times reads as though it introduces an entirely novel workflow. I would encourage the authors to clarify the contribution more explicitly: which components are genuinely new (for example, the acquisition strategy and the end-to-end integration/validation), and which are adaptations of already established methods or software. This would make the scope and novelty of the paper easier to assess.

(2) The most distinctive element is the multi-resolution acquisition strategy. However, as described, the selection of high-resolution regions seems to be decided a priori based on anatomy (guided by xCT localization of the CX), rather than being determined automatically from the data (i.e., ROI placement is anatomy-driven rather than data-driven). A more data-driven or machine learning-guided ROI strategy would strengthen the methodological contribution and the adaptability to new scenarios, along the lines of approaches such as SmartEM [1].

(3) The manuscript emphasizes open-source availability and reduced barriers to entry, but the current software release, as referenced, does not yet appear to support straightforward external reuse. Since much of the pipeline builds on existing methods, the main added value lies in how these technologies were adapted, combined, and validated for the present problem. A clear and complete explanation of this adaptation is therefore essential, but is currently missing. I would suggest the following concrete improvements:
a) Provide a single landing page or umbrella repository that links each pipeline step in the paper to the corresponding codebase, including version tags/commits and expected inputs/outputs for each step.
b) Include step-by-step tutorials for each component.
c) Provide an example dataset together with a full reproduction walkthrough in a controlled environment.
d) Clearly explain the required parameters and configuration for each step, including how they should be adjusted for other datasets or scenarios.
e) Follow packaging and distribution best practices (for example, PyPI/conda releases, Docker containers, and version pinning).
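As a concrete illustration of point (e), a pinned environment specification along these lines would remove the ambiguity about supported Python versions noted below for EMalign. This is a hypothetical sketch: the file name, package list, and version numbers are illustrative, not taken from the actual repositories.

```yaml
# environment.yml -- hypothetical example of version pinning
name: emalign
channels:
  - conda-forge
dependencies:
  - python=3.10        # state the supported interpreter explicitly
  - numpy=1.26.*       # pin major/minor to avoid dependency drift
  - scipy=1.11.*
  - pip
  - pip:
      - emalign==0.1.0   # placeholder version tag for the released package
```

A Docker image built from such a file, referenced by digest from the umbrella repository, would make each pipeline step reproducible independently of the user's local setup.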

(4) In my own attempt to set up and run parts of the released code, I encountered issues that currently limit reproducibility. For example, when creating an environment for EMalign (https://github.com/Heinze-lab/EMalign), the required Python version is not specified, and installation did not succeed under Python 3.12 due to dependency constraints. Additionally, synful_312 (https://github.com/Heinze-lab/synful_312) and SegToPCG (https://github.com/Heinze-lab/SegToPCG) appear to be empty despite being referenced in the manuscript. These are fixable issues, but addressing them is important if the paper is to deliver on its "low entry cost" claim.

(5) Table 1 reports acquisition times, which is helpful. However, the multi-resolution approach adds essential processing steps that arise from the strategy itself (e.g., "XY alignment high-res" and "high-res to low-res alignment"). Please include registration/alignment (and other major post-processing) runtimes and resource requirements, such as storage, in a comparable table so readers can assess the true end-to-end cost.

References:

[1] Meirovitch, Y., et al. "SmartEM: machine learning-guided electron microscopy." Nature Methods (2025).

Reviewer #2 (Public review):

Summary:

The paper proposes a workflow to accelerate EM connectomics by combining multi-scale imaging with image processing and analysis (image alignment, registration, neuron tracing, automated segmentation and synapse prediction, proof-reading) to derive a brain region connectome. The paper argues and (partially) demonstrates that this approach facilitates comparative connectomics.

The data acquisition pipeline uses a well-established sample preparation protocol, µCT-guided acquisition, and SBEM imaging at cellular and synaptic resolution.

Data processing and analysis combine existing state-of-the-art components and focus on the alignment and complementary analysis of the two SBEM resolution levels. The paper applies the workflow to the central complex of six different insects and performs some preliminary analysis based on this (which is acceptable for a resource/tool).

Disclaimer for the rest of the review: I am an expert in image analysis and segmentation, so I have mainly focused on these aspects as I am not qualified to analyze the details of image acquisition.

Strengths:

The paper addresses an important problem and promises an acceleration and democratization of comparable connectomics. The time savings of the imaging approach are well-motivated and derived. The methods used for image alignment, segmentation, synapse detection, and proofreading are state-of-the-art.

Weaknesses:

I see two major weaknesses in the paper:

(1) The paper introduces the (approximate) equivalence of the projectome and connectome in the insect brain very prominently in the introduction and uses this as a central motivation for the multi-resolution image acquisition protocol. But - to me - it is unclear how this principle is actually used in the analysis presented in the final results, or whether this assumption is evaluated at all. Specifically, Figure 4a shows the anatomical neuron reconstructions (from cellular-resolution SBEM), while d-g show connectome-level analysis from the synaptic-resolution data. The only link I can see between the two is that the neural processes in the synapse-resolution data can be mapped to the neurons from the cellular-resolution data, thanks to the image alignment. This is certainly important, BUT it is only tangentially related to the projectome-vs.-connectome claim from the introduction. That claim implies that a tentative connectome is derived from projectome-level data (e.g. by assuming a uniform probability of synapse formation given the surface area or distance between projections) and is then validated against the "true" connectome data from synaptic resolution. Instead, what is actually solved - to my understanding - is mapping the local connectome onto the projectome. While related, these are different things, and the current framing of the paper, together with the quite brief description of the section on comparative connectomics (which also has no corresponding Methods section), leaves this claim inadequately supported.

(2) Reporting on segmentation and proofreading is purely qualitative. Given that this is claimed as a core contribution of the paper (e.g. statement in line 497 and following), I would expect substantially more reporting and evaluation of this claim:
a) Report the actual time needed for proofreading the segmentations in CAVE. I could not find any numbers on this.
b) Report the initial segmentation quality of the model: How many errors does it make? Note: There is a brief mention of VoI-based quantification in Methods (around line 1060), but the results are not reported.

What should be done: report the error rates (with an accurate measure such as skeleton VoI) independently for all 6 volumes. Given that the authors have the proofread versions, this is feasible. Only then can the claims made here be evaluated. Note that the F1-score of synapse prediction is quantified; this is a good starting point, but it could also be extended to further species in order to assess the actual transferability. Furthermore, none of the data from the study seems to be available. The training data for the network has to be made available. If possible, the high-resolution data should be proofread too.
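For reference, the plain voxel-wise form of the variation of information the reviewer asks for can be computed directly from two label volumes. The sketch below is a minimal illustration of the metric itself (not the skeleton-based variant, and not the authors' actual evaluation code); function and variable names are my own.

```python
import numpy as np

def variation_of_information(seg_a, seg_b):
    """Voxel-wise variation of information between two label volumes.

    VI = H(A|B) + H(B|A), in bits; 0 means the two partitions agree
    exactly. Background masking and skeleton-based variants, which a
    real connectomics evaluation would need, are ignored here.
    """
    a = np.asarray(seg_a).ravel()
    b = np.asarray(seg_b).ravel()
    # Joint contingency table of label co-occurrences.
    labels_a, ia = np.unique(a, return_inverse=True)
    labels_b, ib = np.unique(b, return_inverse=True)
    joint = np.zeros((labels_a.size, labels_b.size))
    np.add.at(joint, (ia, ib), 1)
    p = joint / joint.sum()              # joint probabilities
    pa = p.sum(axis=1, keepdims=True)    # marginal over A's labels
    pb = p.sum(axis=0, keepdims=True)    # marginal over B's labels
    nz = p > 0
    h_a_given_b = -np.sum(p[nz] * np.log2((p / pb)[nz]))
    h_b_given_a = -np.sum(p[nz] * np.log2((p / pa)[nz]))
    return h_a_given_b + h_b_given_a
```

Splitting one true object into two, for example, contributes one bit of conditional entropy per affected voxel fraction, which is why VoI penalizes both split and merge errors symmetrically.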

Further points:

(1) Why isn't reconstruction at the cellular level addressed with ML? This is surely possible and should be easier than the full connectome analysis. As before, the actual times needed for tracing with CATMAID are not reported; the manuscript only states that this can be done in minutes for a neuron, but it is unclear whether this is the best case or the average case. Quantitative numbers would help assess whether automation would bring any benefit.

(2) Finally, regarding the underlying software: I did not try this myself due to time constraints, but I did check the repositories. They seem to be in an OK state, with some documentation in a README. However, given the central role of the software contribution, I would expect a centralized documentation page that explains how to use the different parts of the software, including a full example with sample data. Without this, adoption by other labs - a central claim - will be difficult.

Author Response:

Public Review:

On behalf of all authors, I would like to thank the reviewers for their highly constructive and helpful comments, which, once fully addressed, will make the paper stronger and more useful as a tools and resources contribution.

Besides addressing all minor issues pointed out by the reviewers, we see three main lines of changes we will need to pursue in order to address all major concerns. We plan to complete all of these as quickly as possible. Given that new alignments, segmentation, and tracing are needed, this will take between one and three months.

(1) Availability of code, software documentation, and accessibility of the pipeline.

Both reviewers and the editorial summary agreed that we need to improve the availability of our code, provide more instructions and examples of how to use the code, and make our methods more reusable to outsiders. To achieve this we will follow the suggestions made by the reviewers, in particular the list presented by reviewer 1 (point three of weaknesses in the public review).

First, we would like to apologize for the faulty link to the SegToPCG (https://github.com/Heinze-lab/SegToPCG) repository (the correct name and link are: LSDtoPCG and https://github.com/Heinze-lab/LSDtoPCG), as well as for the missing code in the https://github.com/Heinze-lab/synful_312 repository; these issues have already been fixed and will be reflected in an updated bioRxiv version.

Second, we will generate an overarching umbrella page that will serve as a go-to site for any user who would like to implement our pipeline. To enable implementation, we will expand the documentation, provide detailed instructions, and include an example dataset with these instructions.

(2) Quantification of analysis steps, including segmentation, alignment and manual tracing, to validate our claims of increased efficiency and transferability across species.

As with point 1, both reviewers as well as the editorial summary highlighted the need for more comprehensive quantification of the workflow, especially with respect to segmentation quality and the time invested in manual tracing and high-resolution alignments. In particular, these data should validate the transferability of the segmentation models across species and support the claims made about the time savings resulting from our multi-resolution workflow compared to a whole-sample synaptic-resolution approach.

To this aim, we will generate all analyses according to the reviewer suggestions and incorporate the resulting data into new figures and tables. To make the data fully comparable across species, we will apply the latest version of our alignment and segmentation scripts to at least one high-resolution data stack per species, quantify manual tracing of a comparable, defined set of neurons in each species, and perform VoI analyses of each species' segmentation against manually traced neurons in identically sized test volumes in each dataset. Additionally, we will proofread identical branches of homologous neurons in each species and quantify the number of edits required to go from raw segmentation output to completion.

As the segmentation pipeline has evolved over the last years, a fair comparison between all datasets requires fresh analysis based on the latest version of our machine learning models (this cannot be done with existing data) and will therefore take a few weeks.

(3) Clarification of aims for multi-resolution pipeline and how projectomes and connectomes inform each other

Reviewer 2 highlighted that there is insufficient clarity about the aims of combining projectome and connectome. Judging from the reviewer's comment, we may have inadvertently left the impression that we aimed to predict a connectome from projectome data by using spatial proximity of neurons as a proxy for connectivity. In fact, our data show that this is not possible, and that projection-level data cannot predict connectivity. For instance, in the head direction system, the projection data suggest identical circuits for bees and flies (except at the edges of the ring), but connectivity data show that the components of the ring attractor circuit form circuits that are distinctly different between the species (despite the same neurons with the same projection patterns being involved).

What we aim to do is slightly different. We define global patterns of information flow using the projectome, and then define circuits in one part of this global circuit at the synaptic level. We then extrapolate the global connectivity by assuming that the circuits identified in one or two computational units (columns) are repeated in each column. This rests on the assumption that the same neurons form the same connections in each repeated module, as long as the cellular repertoire is identical (verified by the projectome); it does not use proximity data to predict connectivity. The method thus applies only to brain regions that consist of repeated computational modules, i.e. where we can assume that knowing the connectivity in one module allows extrapolation to the entire brain region. While this is a simplification, the Drosophila CX has in principle confirmed this assumption.
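The extrapolation step described here can be sketched in a few lines: given a within-column connectivity matrix from one reconstructed column, the same motif is tiled across all columns whose cell-type repertoire the projectome has verified to be identical. This is an illustrative sketch with hypothetical names, and it deliberately leaves out cross-column connections, which real CX circuits do contain.

```python
import numpy as np

def extrapolate_connectome(local_conn, cell_types, n_columns):
    """Tile a single-column connectivity matrix across repeated columns.

    local_conn[i, j] is the synapse count from cell type i to cell
    type j within one reconstructed column. The extrapolation assumes
    (per the projectome) that every column contains the same cell-type
    repertoire and forms the same within-column connections.
    """
    n = len(cell_types)
    full = np.zeros((n * n_columns, n * n_columns))
    for c in range(n_columns):
        s = c * n
        full[s:s + n, s:s + n] = local_conn  # repeat the local motif
    return full

# Hypothetical two-cell-type example tiled across three columns.
local = np.array([[0.0, 3.0],
                  [1.0, 0.0]])
full = extrapolate_connectome(local, ["E-PG", "P-EN"], 3)
```

The resulting block-diagonal matrix makes the assumption explicit: any off-diagonal (cross-column) block left at zero is a prediction that can later be tested against further synaptic-resolution volumes.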

We will generate a new figure in which we illustrate the process of combining local connectomes and global projectomes using examples from our data, but illustrating this schematically also for other brain regions, e.g. the insect optic lobe or the cerebral cortex of mammals. We will also carefully rewrite the relevant text passages to avoid misunderstandings.

Overall, we would like to thank the reviewers again for their thorough and detailed comments, which will help to make our connectomics workflow more accessible and reproducible.
