Dimensionality reduction suggests a limited relationship between neuroanatomy and spike timing.

A. Experimental pipeline. Left to right: recording, raw data, extracted spike times, spike timing features (e.g., rates, CV), and model training protocols. The Allen Institute Visual Coding dataset comprises high-density silicon extracellular recordings that span multiple brain regions and structures. During recording, mice were head-fixed and presented with multiple visual stimuli, including drifting gratings. For supervised experiments, classifiers were either transductive (neurons from all animals were pooled and then divided into train and test sets) or inductive (train and test sets were divided at the level of the animal). B. Brain regions and structures included in our analyses. (Left to right) Brain Regions: Hippocampus, Midbrain, Thalamus, and Visual Cortex. Hippocampal Structures: CA1, CA3, Dentate Gyrus (DG), Prosubiculum (ProS), Subiculum (SUB). Thalamic Structures: Ethmoid Nucleus (Eth), Dorsal Lateral Geniculate (LGd), Lateral Posterior Nucleus (LP), Ventral Medial Geniculate (MGv), Posterior Complex (PO), Suprageniculate Nucleus (SGN), Ventral Posteromedial Nucleus (VPM). Visuocortical Structures: Anterolateral (VISal), Anteromedial (VISam), Lateral (VISl), Primary (VISp), Posteromedial (VISpm), Rostrolateral (VISrl). C. Unsupervised t-SNE plots of units recorded in each set of regions/structures. For each unit, 14 spiking metrics (see Methods) describe the spike train, which is then placed in a 2D t-SNE scatterplot. Color scheme follows 1B. P-values are derived from a permutation test (shuffled control) of structure/region classifiability on the dimensionally reduced space (see Unsupervised Analysis section of Methods).
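The shuffled-label permutation test of classifiability on the reduced space can be sketched roughly as follows. This is illustrative Python on synthetic data; the embedding parameters, classifier settings, and variable names are our assumptions, not the paper's exact implementation.

```python
# Sketch of a shuffled-label permutation test of classifiability on a 2D
# embedding (synthetic stand-in data; not the paper's exact pipeline).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Toy stand-in for a units x 14-spiking-metrics matrix: two separable groups.
X = np.vstack([rng.normal(0, 1, (60, 14)), rng.normal(3, 1, (60, 14))])
labels = np.array([0] * 60 + [1] * 60)

# Place each unit's metric vector in a 2D t-SNE scatterplot.
coords = TSNE(n_components=2, perplexity=15, random_state=0).fit_transform(X)

def train_accuracy(y):
    """Training accuracy of a logistic regression on the 2D coordinates."""
    return LogisticRegression(max_iter=1000).fit(coords, y).score(coords, y)

observed = train_accuracy(labels)

# Null distribution: refit after shuffling the anatomical labels.
n_perm = 200
null = np.array([train_accuracy(rng.permutation(labels)) for _ in range(n_perm)])
p_value = (1 + np.sum(null >= observed)) / (n_perm + 1)
```

The add-one correction in `p_value` keeps the estimate conservative when no shuffle matches the observed accuracy.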

A linear classifier can learn to predict single neuron location based on standard spiking metrics.

A-D. Balanced accuracy of logistic regression models trained to predict the anatomical location of single units based on each of 14 individual spiking metrics: Coefficient of Variation 2 (CV2), Local Variation (LV), Revised Local Variation (LVR), Mean Firing Rate (FR), Standard Deviation (Std Dev) of interspike intervals (ISIs), Coefficient of Variation (CV), Minimum ISI (Min ISI), Median ISI (Med ISI), Maximum ISI (Max ISI), power spectral density (PSD)-δ (Delta Band 0.1-4 Hz), PSD-θ (Theta Band 4-8 Hz), PSD-α (Alpha Band 8-12 Hz), PSD-β (Beta Band 12-40 Hz), PSD-γ (Gamma Band 40-100 Hz). Balanced accuracy expected by chance varies by task and is indicated by the dashed red line. Features on the x-axis are ordered by performance. Feature (bar) colors are assigned by the ordering in A (brain region task) and maintained for the structure tasks (B-D), showing the extent to which individual features maintain their relative importance across tasks. E-H. Confusion matrices from logistic regression models trained to predict unit location from the combination of all 14 spiking metrics. Each confusion matrix shows the average of 5 train/test splits of the data. Proportions are printed only in cells where the value exceeded chance (1/number of classes). Balanced accuracy for each task: Brain Regions = 52.91 ± 1.24%; Hippocampal Structures = 44.10 ± 1.99%; Thalamic Structures = 37.14 ± 2.57%; Visuocortical Structures = 24.09 ± 1.46% (error is the SEM across 5 splits).
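The all-metrics pipeline in E-H (balanced accuracy over 5 train/test splits, with split-averaged confusion matrices) can be sketched as follows. This is illustrative Python on synthetic metrics; the fold scheme and preprocessing here are assumptions, not the paper's exact code.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score, confusion_matrix
from sklearn.model_selection import StratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)

# Toy stand-in: 400 units x 14 spiking metrics, 4 anatomical classes.
y = np.repeat(np.arange(4), 100)
X = rng.normal(size=(400, 14)) + 0.8 * y[:, None]

accs, cms = [], []
for train, test in StratifiedKFold(n_splits=5, shuffle=True, random_state=0).split(X, y):
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    model.fit(X[train], y[train])
    pred = model.predict(X[test])
    accs.append(balanced_accuracy_score(y[test], pred))
    cms.append(confusion_matrix(y[test], pred, normalize='true'))  # rows sum to 1

mean_cm = np.mean(cms, axis=0)                   # average over the 5 splits
sem = np.std(accs, ddof=1) / np.sqrt(len(accs))  # SEM across splits
```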

Anatomical information in spike trains is captured by nonlinear models that learn patterns in ISIs and stimulus-specific responses.

A. Average balanced accuracy of transductive multi-layer perceptrons (MLPs) in classifying unit location in each task (rows) based on three representations of the spike train (columns). Chance is red, and peak balanced accuracy is blue. ISI dist—full distribution of ISIs; Avg PSTH—mean peristimulus time histogram across all trials; Cat PSTH—concatenation of PSTHs from all 40 stimuli (see Methods). B. MLP sensitivity as a function of test data duration. Models were trained normally and tested on varying amounts of data from each of the four brain regions. C. Feature importance from models that classified anatomy based on ISI dist. Features are ISI ranges between 0 and 3 s in 10 ms bins; 5 splits and 100 iterations. Error is ±1 SEM across 100 shuffles. High-value ranges for each region are highlighted. D. Illustration of ISI mean, slope, and variance. E. Regional distributions of mean, slope, and variance within the highlighted range (in C) compared to averages from all other regions within that range (gray). Multiple-comparisons-corrected t-tests (ns: p >= 0.05, *: p < 0.05, **: p < 0.01, ***: p < 0.001). F. Visuocortical structure information is enriched in subsets of stimuli. Each rectangle represents one of the 40 stimulus parameter combinations. Colors correspond to visuocortical structures, and the area of color shows the relative importance of that stimulus to model classification of the given structure. Cat PSTH model. Dashed boxes—exemplar stimulus conditions. G. PSTHs corresponding to the structure/stimulus pair indicated in F. The specific structure's PSTH is shown in color; the average PSTH of all other structures is shown in gray.
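The ISI-distribution input (0-3 s in 10 ms bins) and the mean/slope/variance summaries illustrated in panels C-E can be sketched as follows. This is illustrative Python on a synthetic spike train; `range_features` is a hypothetical helper, not the paper's code, and the range choice is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy stand-in for one unit's ISIs (seconds), roughly Poisson-like.
isis = rng.exponential(scale=0.1, size=5000)

# ISI distribution: 10 ms bins spanning 0-3 s, normalized to sum to 1.
edges = np.arange(0, 3.01, 0.01)
hist, edges = np.histogram(isis, bins=edges)
dist = hist / hist.sum()

def range_features(dist, edges, lo, hi):
    """Mean height, linear slope, and variance of the distribution within
    [lo, hi) seconds -- mirroring the panel D illustration."""
    centers = (edges[:-1] + edges[1:]) / 2
    mask = (centers >= lo) & (centers < hi)
    slope = np.polyfit(centers[mask], dist[mask], deg=1)[0]
    return dist[mask].mean(), slope, dist[mask].var()

mean_h, slope, var_h = range_features(dist, edges, 0.0, 0.5)
```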

Spike time features that predict anatomy generalize across animals.

A. Balanced accuracy (± SE) of multi-layer perceptrons (MLPs) trained on ISI distributions and concatenated PSTHs in both inductive (withhold entire animals for testing) and transductive splits. Chance (red dashed line) varies by task. Linear mixed effects regression (ns/not significant: p >= 0.05, *: p < 0.05, **: p < 0.01, ***: p < 0.001). B. Left: Illustration of an example implanted silicon array spanning isocortex, hippocampus, and thalamus. Right: Brain-region probability for the example implant shown on the left, calculated by smoothing across neuron-based classifications. Colored background shows the consensus prediction as a function of neuron location (electrode number). C. Confusion matrix resulting from hierarchical (region then structure) inductive classification with smoothing. Matrix cells with proportion less than chance (1/number of classes) contain no text. Average balanced accuracy after smoothing (by task): Brain Regions = 89.47 ± 2.98%; Hippocampal Structures = 51.01 ± 4.50%; Thalamic Structures = 53.21 ± 7.59%; Visuocortical Structures = 38.48 ± 3.31% (error is the SEM across 5 splits). Overall balanced accuracy: 46.91 ± 1.90%.
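The probe-level smoothing in panel B can be sketched as a moving average over per-neuron class probabilities ordered by electrode depth, followed by an argmax consensus. This is an assumed implementation; the exact kernel and window used in the paper are not specified in this legend.

```python
import numpy as np

def smooth_consensus(proba, window=11):
    """Moving-average smoothing of (n_neurons, n_classes) prediction
    probabilities along the probe, then an argmax consensus per neuron."""
    kernel = np.ones(window) / window
    smoothed = np.column_stack(
        [np.convolve(proba[:, c], kernel, mode='same') for c in range(proba.shape[1])]
    )
    return smoothed, smoothed.argmax(axis=1)

rng = np.random.default_rng(3)

# Toy probe: 50 neurons in region 0, then 50 in region 1, with noisy
# per-neuron probabilities favoring the true region.
true = np.array([0] * 50 + [1] * 50)
proba = rng.dirichlet([1.0, 1.0], size=100)
proba[np.arange(100), true] += 1.0
proba /= proba.sum(axis=1, keepdims=True)

smoothed, consensus = smooth_consensus(proba)
```

Smoothing suppresses isolated misclassifications, so the consensus tracks the anatomical boundary rather than single-neuron noise.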

Primary vs. secondary distinction and cortical layer are more evident in spike timing than are individual structures.

A. Illustration of visuocortical structures that can be grouped into primary versus secondary superstructures: VISam, VISpm, VISl, VISal, and VISrl are grouped into secondary visual cortex (VISs), while VISp is primary visual cortex. B. Confusion matrix resulting from inductive classification of superstructure (with smoothing). Cells with proportion less than chance (1/number of classes) contain no text. Balanced accuracy is 79.98 ± 3.03%. C. Left: Illustration of an example array implanted across cortical layers. Right: Layer probability for the example implant shown on the left, calculated by smoothing across neuron-based classifications. Colored background shows the consensus prediction as a function of neuron location (electrode number). D. Confusion matrix resulting from smoothed inductive classification of layer. Balanced accuracy is 62.59 ± 1.12%.
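The grouping in panel A amounts to a relabeling of structures into superstructures; a minimal sketch follows (label strings are from the legend, the mapping name and example label list are our own):

```python
# Collapse the six visuocortical structure labels into the two
# superstructures named in the legend (illustrative mapping).
SUPERSTRUCTURE = {s: 'VISs' for s in ('VISam', 'VISpm', 'VISl', 'VISal', 'VISrl')}
SUPERSTRUCTURE['VISp'] = 'VISp'  # primary visual cortex stays its own class

# Hypothetical per-unit structure labels relabeled for the superstructure task.
structure_labels = ['VISp', 'VISal', 'VISrl', 'VISp', 'VISpm']
super_labels = [SUPERSTRUCTURE[s] for s in structure_labels]
```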

Anatomical information embedded in spike trains generalizes across diverse stimuli.

(Top Left) Schematic showing the inductive train/test split along with the visual stimuli: naturalistic movie, drifting gratings, and spontaneous activity (i.e., gray screen). (Left Column) Grids showing Matthews Correlation Coefficient (MCC) values for pairs of training stimuli (grid rows) and testing stimuli (grid columns). Grid diagonals (top right to bottom left) represent training and testing within the same stimulus. (Right Column) Confusion matrices corresponding to each of the MCC grids in the left column.
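The train/test stimulus grid can be sketched as a double loop over stimulus conditions, scoring each train/test pair with MCC. This is illustrative Python on synthetic features; the classifier and unit-level split here are simplified stand-ins for the inductive MLPs described in the paper.

```python
import numpy as np
from sklearn.metrics import matthews_corrcoef
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(4)
stimuli = ['gratings', 'movie', 'spontaneous']

# Toy per-stimulus features for the same 200 units (2 anatomical classes);
# the class-dependent shift is shared across stimuli so that some
# information generalizes across conditions.
y = np.tile([0, 1], 100)
feats = {s: rng.normal(size=(200, 20)) + y[:, None] * (1.0 + 0.2 * i)
         for i, s in enumerate(stimuli)}

test_mask = np.zeros(200, dtype=bool)
test_mask[:50] = True  # fixed held-out units

mcc = np.zeros((3, 3))
for i, s_train in enumerate(stimuli):      # grid rows: training stimulus
    for j, s_test in enumerate(stimuli):   # grid columns: testing stimulus
        clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0)
        clf.fit(feats[s_train][~test_mask], y[~test_mask])
        mcc[i, j] = matthews_corrcoef(y[test_mask], clf.predict(feats[s_test][test_mask]))
```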

Anatomical information embedded in spike trains generalizes across laboratories and protocols.

A. Illustration of behavioral tasks employed by two laboratories: (left) Passive Viewing (Allen Institute) vs. (right) Active Decision-Making (Steinmetz et al.). In Active Decision-Making, the mouse spins a wheel in response to the location of the drifting grating presented on the screen. B. Bubble chart showing balanced accuracy of individual test-set animals from the Steinmetz et al. dataset. The model was trained on Allen Institute data. Bubble size indicates the number of units recorded in the test animal. Black horizontal lines represent the median of the balanced-accuracy distributions across animals. Models are trained with either drifting gratings alone (purple) or a combination of drifting gratings, natural movies, and spontaneous/gray screen (green). Balanced accuracy across all Steinmetz et al. neurons: BR DG: 80.46%, BR Mix: 81.28%, HS DG: 40.07%, HS Mix: 69.89%, TS DG: 21.52%, TS Mix: 58.22%, VCS DG: 28.41%, VCS Mix: 28.34%, VCSS DG: 58%, VCSS Mix: 59%, VC Layers DG: 46.36%, VC Layers Mix: 49.01%, where BR stands for brain regions, HS hippocampal structures, TS thalamic structures, VCS visual cortex structures, VCSS visual cortex superstructures, DG drifting gratings, and Mix denotes the combined stimuli. C. Confusion matrices for prediction of hippocampal structures in the Steinmetz et al. test set when trained on the drifting gratings stimulus (left) vs. mixed stimuli (right) from the Allen Institute. D. Confusion matrices for prediction of thalamic structures in the Steinmetz et al. test set when trained on the drifting gratings stimulus (left) vs. mixed stimuli (right) from the Allen Institute.

Number of included single units as a function of brain structure: Allen Institute data

Number of included single units as a function of brain structure: Steinmetz et al. data

MLP Hyperparameters

Unsupervised analysis of spiking activity.

Left columns (scatters) show three methods of unsupervised dimensionality reduction (2D): principal component analysis (PCA), t-distributed Stochastic Neighbor Embedding (t-SNE), and Uniform Manifold Approximation and Projection (UMAP). Right columns (Wasserstein distance matrices) show quantification of the distance between anatomical groupings in the corresponding scatter plots on the left. Distance is calculated along the primary dimension of each plot. Rows denote three separate representations of unit activity: 14 predetermined spiking metrics (e.g., firing rate), the full ISI distribution, or concatenated PSTHs. Note that within a task (e.g., Brain Regions), there is not a consistent pattern in the distance matrices, suggesting that any apparent structure in the scatterplots is circumstantial. Colors in scatter plots correspond to the anatomical labels on the matrices. Horizontal lines separate tasks. P-values in the bottom right corner of plots refer to a permutation test of a logistic regression classifier's training accuracy for prediction of region/structure labels (relative to shuffled labels) based on plot coordinates for all individual units.
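The Wasserstein-distance quantification along the primary dimension can be sketched as follows. This is illustrative Python with synthetic groups; `scipy.stats.wasserstein_distance` operates on the 1D coordinate samples for each pair of anatomical groups.

```python
import numpy as np
from scipy.stats import wasserstein_distance
from sklearn.decomposition import PCA

rng = np.random.default_rng(5)

# Toy units x features matrix with two anatomical groups.
X = np.vstack([rng.normal(0, 1, (100, 14)), rng.normal(2, 1, (100, 14))])
groups = np.array([0] * 100 + [1] * 100)

# 2D reduction; distances are computed along the primary (first) dimension.
coords = PCA(n_components=2).fit_transform(X)
primary = coords[:, 0]

labels = np.unique(groups)
D = np.zeros((len(labels), len(labels)))
for i, a in enumerate(labels):
    for j, b in enumerate(labels):
        D[i, j] = wasserstein_distance(primary[groups == a], primary[groups == b])
```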

Unsmoothed transductive classification: confusion matrices.

Confusion matrices for MLP models using a transductive train/test split are shown, for four classification tasks (rows) and three different spiking-activity representations (columns). Note that in the transductive condition, all neurons across animals are pooled and divided into train/test splits, such that there is a within-animal aspect to classifier learning. In individual matrices, the proportion of true instances of a particular class (rows) is distributed across the row, reflecting the distribution of MLP predictions (columns). Correct classifications fall along the diagonal from top left to bottom right. The confusion matrices' labels, provided in the first column, are consistent in subsequent columns. Average (across the 5 splits) balanced accuracy (expressed as percent) for each task across the representations, from left to right: Regions (65.29, 53.96, 59.15), Hippocampus (41.72, 35.51, 44.16), Thalamus (39.15, 31.91, 43.25), Visual Ctx (25.84, 28.84, 38.35).

MLP-based model sensitivity as a function of input duration.

Models were trained normally and tested with varying amounts (durations) of input data. In all but one case, model performance increases as a function of input size (duration). The exception is in the visuocortical task, where performance on the secondary structure VISam declines with time; this reflects the model's failure to learn a robust pattern for that structure. Three tasks are shown: A. Hippocampal structures, B. Thalamic structures, and C. Visuocortical structures.

Interpretation of hippocampal and thalamic MLPs.

A. Hippocampal structure information is enriched in subsets of stimuli. Each rectangle represents one of the 40 stimulus parameter combinations. Colors correspond to hippocampal structures, and the area of color shows the relative importance of that stimulus to model classification of the given structure. Cat PSTH model. Dashed boxes—exemplar stimulus conditions. B. PSTHs corresponding to the structure/stimulus pair indicated in A. The specific structure's PSTH is shown in color; the average PSTH of all other structures is shown in gray. C, D. Same as A and B, except for thalamic neurons and structures.

Unsmoothed inductive classification: confusion matrices.

Confusion matrices for MLP models using an inductive train/test split are shown. Note that in the inductive condition, models are trained on all neurons from a subset of animals and tested on all neurons from a withheld group of animals. This eliminates any possibility of learning a local solution within an animal. Matrices for four classification tasks (rows) and three different spiking-activity representations (columns) are shown. In individual matrices, the proportion of true instances of a particular class (rows) is distributed across the row, reflecting the distribution of MLP predictions (columns). Correct classifications fall along the diagonal from top left to bottom right. The confusion matrices' labels, provided in the first column, are consistent in subsequent columns. Average (across the 5 splits) balanced accuracy values (in %) for each task across the representations, from left to right: Regions (65.10, 51.72, 49.16), Hippocampus (35.89, 26.39, 21.50), Thalamus (32.79, 26.28, 21.72), Visual Ctx (25.52, 25.83, 26.51).

Smoothed inductive classification: confusion matrices.

Confusion matrices for MLP models using an inductive (across animals) train/test split are shown. Note that in the inductive condition, models are trained on all neurons from a subset of animals and tested on all neurons from a withheld group of animals. This eliminates any possibility of learning a local solution within an animal. Matrices for four classification tasks (rows) and three different spiking-activity representations (columns) are shown. In individual matrices, the proportion of true instances of a particular class (rows) is distributed across the row, reflecting the distribution of MLP predictions (columns). Correct classifications fall along the diagonal from top left to bottom right. The confusion matrices' labels, provided in the first column, are consistent in subsequent columns. Average (across the 5 splits) balanced accuracy values (in %) for each task across the representations, from left to right: Regions (89.47, 67.95, 63.92), Hippocampus (51.01, 39.31, 25.84), Thalamus (53.21, 35.38, 24.81), Visual Ctx (38.48, 44.66, 43.97).

The effect of training and testing within and across stimulus conditions.

For each classification task (e.g., Brain Regions), the bar chart shows, in gray, the mean MCC value with standard error for models trained and tested on the same stimulus condition (e.g., train: drifting gratings & test: drifting gratings; train: natural movie & test: natural movie). In black, the MCC of models trained and tested on different stimuli (e.g., train: drifting gratings & test: spontaneous; train: natural movies & test: drifting gratings). Linear mixed effects regression (ns/not significant: p >= 0.05, *: p < 0.05, **: p < 0.01, ***: p < 0.001).

Train/test across laboratories: Allen-to-Steinmetz et al. confusion matrices.

Six tasks are displayed. The y-axis label denotes the task (e.g., Regions), and the column label denotes the training condition (drifting gratings or mixed stimuli). All models were trained on the Allen Institute data and tested on Steinmetz et al. data. Mixed-stimuli models are trained on neuronal activity recorded during the presentation of drifting gratings, natural movies, and spontaneous activity. Balanced accuracy corresponding to each confusion matrix: BR DG: 80.46%, BR Mix: 81.28%, HS DG: 40.07%, HS Mix: 69.89%, TS DG: 21.52%, TS Mix: 58.22%, VCS DG: 28.41%, VCS Mix: 28.34%, VCSS DG: 58%, VCSS Mix: 59%, VC Layers DG: 46.36%, VC Layers Mix: 49.01%, where BR stands for brain regions, HS hippocampal structures, TS thalamic structures, VCS visual cortex structures, VCSS visual cortex superstructures, DG drifting gratings, and Mix denotes the combined stimuli.