Peer review process
Not revised: This Reviewed Preprint includes the authors’ original preprint (without revision), an eLife assessment, public reviews, and a provisional response from the authors.
Editors
- Reviewing Editor: Annalisa Scimemi, University at Albany, State University of New York, Albany, United States of America
- Senior Editor: Panayiota Poirazi, FORTH Institute of Molecular Biology and Biotechnology, Heraklion, Greece
Reviewer #1 (Public review):
Summary:
The authors note that while many software packages exist for spike sorting, these do not automatically differentiate with known accuracy between excitatory and inhibitory neurons. Moreover, most existing spike sorting packages are for in vivo use, where the majority of electrodes are separated from each other by several hundred microns or more. There is a need for spike sorting packages that can take advantage of high-density electrode arrays where all electrodes are within a few tens of microns of other electrodes. Here, the authors offer such a software package with SpikeMAP, and they validate its performance in identifying parvalbumin interneurons that were optogenetically stimulated.
Strengths:
The main strength of this work is that the authors use ground-truth measures to show that SpikeMAP can use features of spike shape to correctly identify known parvalbumin interneurons against a background of other neuron types. They use spike width and peak-to-peak distance as the key features for distinguishing between neuron types, a method that has been around for many years (Barthó, Peter, et al. "Characterization of neocortical principal cells and interneurons by network interactions and extracellular features." Journal of neurophysiology 92.1 (2004): 600-608.), but whose performance has not been validated in the context of high-density electrode arrays.
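As a purely illustrative sketch (not SpikeMAP's implementation), the two kinds of waveform feature mentioned above could be measured as follows. The toy waveform generator, the function names, and the parameter values are assumptions made for this example; only the 18 kHz sampling rate is taken from the reviews.

```python
import math

def toy_spike(fs_hz, delay_ms, width_ms, duration_ms=3.0):
    """Hypothetical biphasic spike: a negative trough at 1 ms, then a positive rebound."""
    n = int(duration_ms * fs_hz / 1e3)
    wave = []
    for i in range(n):
        t = i / fs_hz * 1e3  # time in ms
        trough = -math.exp(-((t - 1.0) / width_ms) ** 2)
        rebound = 0.4 * math.exp(-((t - 1.0 - delay_ms) / (2 * width_ms)) ** 2)
        wave.append(trough + rebound)
    return wave

def waveform_features(waveform, fs_hz):
    """Return (trough_to_peak_ms, half_width_ms) for a negative-going spike."""
    trough_idx = min(range(len(waveform)), key=lambda i: waveform[i])
    # Peak = largest value at or after the trough (repolarisation phase).
    peak_idx = max(range(trough_idx, len(waveform)), key=lambda i: waveform[i])
    trough_to_peak_ms = (peak_idx - trough_idx) / fs_hz * 1e3
    # Half-width: contiguous samples around the trough at or below half its amplitude.
    half = waveform[trough_idx] / 2.0
    left = trough_idx
    while left > 0 and waveform[left - 1] <= half:
        left -= 1
    right = trough_idx
    while right < len(waveform) - 1 and waveform[right + 1] <= half:
        right += 1
    half_width_ms = (right - left + 1) / fs_hz * 1e3
    return trough_to_peak_ms, half_width_ms

FS_HZ = 18000  # sampling rate mentioned in the reviews
narrow = waveform_features(toy_spike(FS_HZ, delay_ms=0.3, width_ms=0.1), FS_HZ)
broad = waveform_features(toy_spike(FS_HZ, delay_ms=0.8, width_ms=0.2), FS_HZ)
print("narrow-spiking toy unit:", narrow)
print("broad-spiking toy unit: ", broad)
```

On these toy waveforms, the narrow unit yields smaller values for both features than the broad unit. The validation question is whether that separation holds on real high-density recordings.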
Another strength of this approach is that it is automated - a necessity if your electrode array has 4096 electrodes. Hand-sorting or even checking such a large number of channels is something even the cruelest advisor would not wish upon a graduate student. With such large channel counts, it is essential to have automated methods that are known to work accurately. Hence, the combination of validation and automation is an important advance.
A nice feature of this work is that with high-density electrode arrays, the spike waveforms appear on multiple nearby electrodes simultaneously. And since spike amplitudes fall off with distance, this allows triangulation of neuron locations within the regular electrode array. Thus, spike correlations between neuron types, or within neuron types, can be plotted as a function of distance. While SpikeMAP is not the first to do this (Peyrache, Adrien, et al. "Spatiotemporal dynamics of neocortical excitation and inhibition during human sleep." Proceedings of the National Academy of Sciences 109.5 (2012): 1731-1736.), it is a welcome capability of this package.
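A minimal sketch of amplitude-weighted localization, assuming a centre-of-mass estimator (the approach that Reviewer #2's comment (2) attributes to SpikeMAP). The electrode pitch, the inverse-distance amplitude model, and all names here are hypothetical and are not taken from the manuscript.

```python
import math

def center_of_mass(positions_um, amplitudes):
    """Amplitude-weighted mean of the electrode positions."""
    weights = [abs(a) for a in amplitudes]
    total = sum(weights)
    x = sum(w * p[0] for w, p in zip(weights, positions_um)) / total
    y = sum(w * p[1] for w, p in zip(weights, positions_um)) / total
    return x, y

def toy_amplitudes(source_xy, positions_um, depth_um=20.0):
    """Hypothetical amplitude model: inverse distance to a source below the array plane."""
    sx, sy = source_xy
    return [
        100.0 / math.sqrt((px - sx) ** 2 + (py - sy) ** 2 + depth_um ** 2)
        for px, py in positions_um
    ]

PITCH_UM = 42.0  # assumed electrode pitch, for illustration only
grid = [(i * PITCH_UM, j * PITCH_UM) for i in range(4) for j in range(4)]

# A source at the centre of the 4x4 patch is recovered by symmetry.
inside = center_of_mass(grid, toy_amplitudes((63.0, 63.0), grid))
# A source placed outside the patch is pulled back inside the electrode footprint.
outside = center_of_mass(grid, toy_amplitudes((-50.0, 63.0), grid))
print("true (63, 63)  -> estimate", inside)
print("true (-50, 63) -> estimate", outside)
```

Because the estimate is a weighted average of electrode positions with non-negative weights, it can never fall outside the area spanned by the electrodes. This is the boundary bias raised in comment (2).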
It is also good that the code for this package is open-source, allowing a community of people (I expect in vitro labs will especially want to use this) to use the code and further improve it.
Weaknesses:
As this code was developed for use with a 4096 electrode array, it is important to be aware of double-counting neurons across the many electrodes. I understand that there are ways within the code to ensure that this does not happen, but care must be taken in two key areas. Firstly, action potentials traveling down axons will exhibit a triphasic waveform that is different from the biphasic waveform that appears near the cell body, but these two signals will still be from the same neuron (for example, see Litke et al., 2004 "What does the eye tell the brain: Development of a System for the Large-Scale Recording of Retinal Output Activity"; figure 14). I did not see anything that would directly address this situation, so it might be something for you to consider in updated versions of the code. Secondly, spike shapes are known to change when firing rates are high, like in bursting neurons (Harris, K.D., Hirase, H., Leinekugel, X., Henze, D.A. & Buzsáki, G. Temporal interaction between single spikes and complex spike bursts in hippocampal pyramidal cells. Neuron 32, 141-149 (2001)). I did not see this addressed in the present version of the manuscript.
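One hedged way to flag candidate double-counted units is to check how often spikes on two putative units coincide within a short window, which tolerates a small conduction delay between somatic and axonal signals. This is a sketch under stated assumptions, not a description of what SpikeMAP does: the function name, the 0.5 ms window, and the spike times are all hypothetical.

```python
import bisect

def coincidence_fraction(spikes_a_ms, spikes_b_ms, window_ms=0.5):
    """Fraction of spikes in A that have a spike in B within +/- window_ms.

    Both inputs must be sorted lists of spike times in milliseconds.
    """
    if not spikes_a_ms:
        return 0.0
    hits = 0
    for t in spikes_a_ms:
        # First spike in B that is not earlier than the start of the window.
        i = bisect.bisect_left(spikes_b_ms, t - window_ms)
        if i < len(spikes_b_ms) and spikes_b_ms[i] <= t + window_ms:
            hits += 1
    return hits / len(spikes_a_ms)

soma_unit = [10.0, 55.0, 120.0, 300.0, 410.0]
axon_unit = [t + 0.2 for t in soma_unit]          # same neuron, small conduction delay
other_unit = [33.0, 87.0, 250.0, 377.0, 500.0]    # unrelated neuron

same = coincidence_fraction(soma_unit, axon_unit)
different = coincidence_fraction(soma_unit, other_unit)
print("soma vs axon: ", same)
print("soma vs other:", different)
```

Pairs with a coincidence fraction close to 1 would be candidates for merging or manual inspection. Strongly synchronised but distinct neurons can also score highly, so waveform shape and location would still need to be checked.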
Another area for possible improvement would be to build on the excellent validation experiments you have already conducted with parvalbumin interneurons. Although it would take more work, similar experiments could be conducted for somatostatin and vasoactive intestinal peptide neurons against a background of excitatory neurons. These may have different spike profiles, but your success in distinguishing them can only be known if you validate against ground truth, like you did for the PV interneurons.
Appraisal:
This work addresses the need for an automated spike sorting software package for high-density electrode arrays. Although no spike sorting software is flawless, the package presented here, SpikeMAP, has been validated on PV interneurons, inspiring a degree of confidence. This is a good start, and further validation on other neuron types could increase that confidence. Groups doing in vitro experiments, where 4096 electrode arrays are more common, could find this system particularly helpful.
Reviewer #2 (Public review):
Summary:
In this paper, entitled "SpikeMAP: An unsupervised spike sorting pipeline for cortical excitatory and inhibitory neurons in high-density multielectrode arrays with ground-truth validation", the authors present spikeMAP, a pipeline for the analysis of large-scale recordings of in vitro cortical activity. According to the authors, spikeMAP not only allows for the detection of spikes produced by single neurons (spike sorting), but also allows for the reliable distinction between genetically determined cell types by utilizing viral and optogenetic strategies as ground-truth validation. While the paper is nicely written and easy to follow, I find that the algorithmic part of the paper is not really new and should have been compared more carefully to existing solutions. While the GT recordings used to assess how well a spike sorting tool can distinguish between excitatory and inhibitory neurons are interesting, spikeMAP does not seem to bring anything new to state-of-the-art solutions, or, at the very least, it deserves to be properly benchmarked. I would suggest that the authors perform a more extensive comparison with existing spike sorters.
Strengths:
The GT recordings with optogenetic activation of the cells, based on the opsins, are interesting and might provide useful data to quantify how well spike sorting pipelines discriminate, in vitro, between excitatory and inhibitory neurons. Such an approach can be quite complementary to artificially generated ground truth.
Weaknesses:
(1) The global workflow of spikeMAP, described in Figure 1, seems to be very similar to that of Hilgen et al. 2017 (10.1016/j.celrep.2017.02.038). Therefore, the first question is what the rationale is for reinventing the wheel rather than using tools that do something very similar (as mentioned by the authors themselves). I have a hard time, in general, believing that spikeMAP has something particularly special, given its Methods, compared to state-of-the-art spike sorters. This is why, at the very least, the title of the paper is misleading, because it lets the reader think that the core of the paper will be about a new spike sorting pipeline. If this is the main message the authors want to convey, then I think that numerous validations/benchmarks are missing to assess first how good spikeMAP is, with reference to spike sorting in general, before deciding if this is indeed the right tool to discriminate excitatory vs inhibitory cells. The GT validation, while interesting, is not enough to entirely validate the paper. The details are a bit too scarce for me, or would deserve to be better explained (see the comments below).
(2) Regarding the putative location of the spikes, it has been shown that the center of mass, while easy to compute, is not the most accurate solution [Scopin et al, 2024, 10.1016/j.jneumeth.2024.110297]. For example, it has an intrinsic bias for finding positions within the boundaries of the electrodes, while some other methods, such as monopolar triangulation or grid-based convolution, might perform better. Can the authors comment on the choice of the Center of Mass as the only way to triangulate the sources?
(3) Still in Figure 1, I am not sure I really see the point of Spline Interpolation. I see the point of such a smoothing, but the authors should demonstrate that it has a key impact on the distinction of Excitatory vs. Inhibitory cells. What is special about the value of 90 kHz for a signal recorded at 18 kHz? What is the gain with spline enhancement compared to without? Does such a value depend on the sampling rate, or is it a global optimum found by the authors?
(4) Figure 2 is not really clear, especially panel B. The choice of the time scale for the B panel might not be the most appropriate, and the legend filtered/unfiltered with a dot is not clear to me in Bii. In panel E, the authors are making two clusters with PCA projections on single waveforms. Does this mean that the PCA is only applied to the main waveforms, i.e. the ones obtained where the amplitudes are peaking the most? This is not really clear from the methods, but if this is the case, then this approach is a bit simplistic and does not really match state-of-the-art solutions. Spike waveforms are quite often, especially with such high-density arrays, covering multiple channels at once, and thus the extracellular patterns triggered by the single units on the MEA are spatio-temporal motifs occurring on several channels. This is why, in modern spike sorters, the information in a local neighbourhood is often kept to be projected, via PCA, on the lower-dimensional space before clustering. Information on a single channel only might not be informative enough to disambiguate sources. Can the authors comment on that, and what is the exact spatial resolution of the 3Brain device? The way the authors are performing the SVD should be clarified in the methods section. Is it on a single channel, and/or on multiple channels in a local neighbourhood?
(5) About the isolation of the single units, here again, I think the manuscript lacks some technical details. The authors say that they are using a k-means cluster analysis with k=2. Does this mean that the authors are explicitly looking for 2 clusters per electrode? If so, this is a really strong assumption that should not be made in the context of spike sorting because, since it is a blind source separation technique, one cannot determine in advance how many sources are present in the vicinity of a given electrode. While the illustration in Figure 2E is ok, there is no guarantee that one cannot find more clusters, so why this choice of k=2? Again, this is why most modern spike sorting pipelines do not rely on k-means, to avoid any hard-coded number of clusters. Can the authors comment on that?
(6) I'm surprised by the linear decay of the maximal amplitude as a function of the distance from the soma, as shown in Figure 2H. Is this really what should be expected? Based on the properties of the extracellular medium, shouldn't we expect a power law for the decay of the amplitude? It is strange that up to 100 µm away from the soma, the max amplitude only dropped from 260 to 240 µV. Can the authors comment on that? It would be interesting to plot this for all recorded neurons, in a normalized manner, V/max(V) as a function of distance, to see what the curve looks like.
(7) In Figure 3A, the total number of cells seems rather low for such a large number of electrodes. Did the authors exclude some cells from the analysis and, if so, what quality criteria were used to keep cells? If no criteria were used (none are mentioned in the Methods), then how come so few cells are detected, and can the authors convince us that these neurons are indeed "clean" units (RPVs, SNRs, ...)?
(8) Still in Figure 3A, it looks like there is a bias to find inhibitory cells at the borders, since they do not appear to be uniformly distributed over the MEA. Can the authors comment on that? What would be the explanation for such a behaviour? It would be interesting to see some macroscopic quantities on Excitatory/Inhibitory cells, such as mean firing rates, averaged SNRs... Because again, in Figure 3C, it is not clear to me that the firing rates of inhibitory cells are higher than Excitatory ones, whilst they should be in theory.
(9) For Figure 3 in general, I would have performed an exhaustive comparison of putative cells found by spikeMAP and other sorters. More precisely, I think that to prove the point that spikeMAP is indeed bringing something new to the field of spike sorting, the authors should have compared the performances of various spike sorters to discriminate Exc vs Inh cells based on their ground truth recordings. For example, either using Kilosort [Pachitariu et al, 2024, 10.1038/s41592-024-02232-7], or some other sorters that might be working with such large high-density data [Yger et al, 2018, 10.7554/eLife.34518].
(10) Figure 4 has a big issue, and I guess the panels A and B should be redrawn. I don't understand what the red rectangle is displaying.
(11) I understand that Figure 4 is only one example, but I have a hard time understanding from the manuscript how many slices/mice were used to obtain the GT data. I guess the manuscript could be enhanced by turning the data into an open-access dataset, but then some clarification is needed. How many flashes/animals/slices are we talking about? Maybe this should be illustrated in Figure 4, if this figure is devoted to the introduction of the GT data.
(12) While there is no doubt that GT data such as those recorded here by the authors are the most interesting data from a validation point of view, the rather low yield of such experiments should not discourage the use of artificially generated recordings such as the ones made in [Buccino et al, 2020, 10.1007/s12021-020-09467-7] or, more recently, in [Laquitaine et al, 2024, 10.1101/2024.12.04.626805v1]. In these papers, putative waveforms and firing rate patterns are available for excitatory and inhibitory cells, so the authors could test how well spikeMAP discriminates between the two subtypes.
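Returning to comment (7), the two unit-quality measures named there, refractory period violations (RPVs) and signal-to-noise ratio (SNR), could be computed along the following lines. This is a sketch only: the 1.5 ms refractory period, the MAD-based noise estimate, the function names, and the toy data are assumptions, and none of them is taken from the manuscript.

```python
import statistics

def rpv_fraction(spike_times_ms, refractory_ms=1.5):
    """Fraction of inter-spike intervals shorter than the assumed refractory period."""
    isis = [b - a for a, b in zip(spike_times_ms, spike_times_ms[1:])]
    if not isis:
        return 0.0
    return sum(1 for isi in isis if isi < refractory_ms) / len(isis)

def waveform_snr(mean_waveform_uv, noise_trace_uv):
    """Peak absolute amplitude of the mean waveform over a robust noise SD estimate."""
    med = statistics.median(noise_trace_uv)
    mad = statistics.median([abs(v - med) for v in noise_trace_uv])
    noise_sd = mad / 0.6745  # MAD-to-SD conversion, valid for Gaussian noise
    peak = max(abs(v) for v in mean_waveform_uv)
    return peak / noise_sd

# Toy unit: one interval (0.8 ms) violates the assumed 1.5 ms refractory period.
spikes_ms = [0.0, 10.0, 10.8, 30.0, 55.0]
mean_waveform_uv = [0.0, -20.0, -60.0, -15.0, 25.0, 5.0]
noise_uv = [-3.0, -1.0, 0.0, 1.0, 3.0, -2.0, 2.0]

rpv = rpv_fraction(spikes_ms)
snr = waveform_snr(mean_waveform_uv, noise_uv)
print("RPV fraction:", rpv)
print("SNR:", round(snr, 2))
```

Reporting the distribution of such measures for the retained units, together with the thresholds used, would let readers judge whether the low cell count in Figure 3A reflects strict curation or limited detection.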