Schematic depiction of drift: a. Mice were implanted with a 4-shank Neuropixels 2.0 probe in visual cortex area V1. b. Each colored star represents the location of a unit recorded on the probe. In this hypothetical case, the same color indicates unit correspondence across days. The black unit is missing on day 48, while the turquoise star is an example of a new unit. Tracking aims to correctly match the red and blue units across all datasets and to determine that the black unit was likely not detected on day 48. c. Two example spatial-temporal waveforms of units recorded in two datasets that likely represent the same neuron, based on similar visual responses. Each trace is the average waveform on one channel over 2.7 milliseconds. The blue traces are waveforms on the peak channel and 9 nearby channels (two rows above, two rows below, and one in the same row) from the first dataset (Day 1). The red traces, similarly selected, are from the second dataset. Waveforms are aligned at the electrodes with peak amplitude, which differ on the two days.

The EMD can detect the displacement of single units: a. Schematic of EMD unit matching. Each blue unit in day 1 is matched to a red unit in day 2. Dashed lines indicate the matches to be found by minimizing the weighted sum of physical and waveform distances. b. Open and filled circles show positions of units on days 1 and 2, respectively. Arrows indicate matching using EMD. The arrow color represents the match direction; upward matches found with the EMD are in red and downward matches in black. Solid lines indicate a z-match distance within 15µm, while dashed lines indicate a z distance > 15µm. An expanded segment is shown for the probe section from 3120 to 3220µm. c. The match distance histogram (black and red bars) and kernel fit (light blue solid curve). The light blue dashed line shows the mode (dm = 15.65µm). The dark blue dashed line shows the imposed drift (di = 12µm). The red region shows the matches within 15µm of the mode. The EMD needs to detect the homogeneous movement against the background, i.e. units in the black region that are unlikely to be the real match due to biological constraints.
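The matching step in panel a can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes equal unit counts on both days and unit mass per unit, in which case the EMD reduces to a one-to-one assignment, and it finds that assignment by brute force. The cost form (physical distance plus `omega` times waveform distance), the dictionary fields `pos` and `wf`, the function names, and the toy data are all hypothetical.

```python
import math
from itertools import permutations


def pair_cost(u, v, omega):
    """Assumed combined cost: physical distance + omega * waveform distance."""
    physical = math.dist(u["pos"], v["pos"])   # Euclidean distance in (x, z), in µm
    waveform = math.dist(u["wf"], v["wf"])     # L2 distance between waveform features
    return physical + omega * waveform


def match_units(day1, day2, omega):
    """Return match[i] = index in day2 assigned to day1[i], minimizing total cost.

    Brute force over all permutations; only practical for a handful of units.
    """
    n = len(day1)
    best = min(
        permutations(range(n)),
        key=lambda p: sum(pair_cost(day1[i], day2[p[i]], omega) for i in range(n)),
    )
    return list(best)


# Toy data: day 2 holds the same three units, shuffled and shifted 12 µm in z.
day1 = [
    {"pos": (0.0, 100.0), "wf": (1.0, 0.2)},
    {"pos": (32.0, 200.0), "wf": (0.5, 0.9)},
    {"pos": (0.0, 300.0), "wf": (0.1, 0.4)},
]
day2 = [
    {"pos": (0.0, 312.0), "wf": (0.1, 0.4)},
    {"pos": (0.0, 112.0), "wf": (1.0, 0.2)},
    {"pos": (32.0, 212.0), "wf": (0.5, 0.9)},
]

match = match_units(day1, day2, omega=1500.0)
# Signed z displacement of each matched pair (positive = upward).
z_shifts = [day2[j]["pos"][1] - day1[i]["pos"][1] for i, j in enumerate(match)]
```

In this toy case every matched pair has the same z displacement, which is the homogeneous movement that the histogram in panel c is meant to expose. A realistic unit count would need a proper transport or assignment solver instead of enumeration.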

The ROC curve of matching accuracy vs. distance. The blue curve indicates the recovery rate of reference units. The red line indicates the number of reference units included. The solid vertical line indicates the average z distance across all reference pairs in all animals (z = 6.96µm). The dashed vertical black line indicates a z-distance threshold at z = 10µm.

Recovery rate, accuracy and putative pairs: a. The histogram distribution fit for all KSgood units (top) and reference units alone (middle). False positives for reference units are defined as units matched by EMD but not matched when using receptive fields. The false positive fraction for the set of all KSgood units is obtained by integration. A z = 10µm threshold gives a false positive rate of 27% for KSgood units. b. Light blue bars represent the number of reference units successfully recovered using only unit location and waveform. The numbers on the bars are the recovery rates of each dataset, and the red portion indicates incorrect matches. Incorrect matches are matches established using receptive field data that were not recovered by EMD, which uses no receptive field information. Similarly, the green bars show matching accuracy with the z distance under a threshold of 10µm. The orange portion indicates incorrect matches after thresholding. The false positives are mostly eliminated by adding the threshold. Purple bars are the number of putative units (units with no reference information) inferred with z-threshold = 10µm.
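The quantities in panel b can be illustrated with a short sketch. It assumes matches are stored as (day-1 unit, day-2 unit) tuples and that a z distance is available for every EMD match; the function name, argument names, and toy values are hypothetical and chosen only to make the definitions concrete.

```python
def recovery_summary(emd_matches, reference_pairs, z_dist, z_threshold=10.0):
    """Summarize EMD matches against reference pairs.

    emd_matches, reference_pairs: lists of (day1_unit, day2_unit) tuples.
    z_dist: dict mapping each EMD match to its z distance in µm.
    Returns (recovery_rate, thresholded_accuracy, n_putative).
    """
    ref = set(reference_pairs)
    ref_day1 = {a for a, _ in ref}
    ref_day2 = {b for _, b in ref}

    # Recovery rate: fraction of reference pairs that EMD reproduced.
    recovered = [m for m in emd_matches if m in ref]
    recovery_rate = len(recovered) / len(ref)

    # Keep only EMD matches whose z distance is within the threshold.
    kept = [m for m in emd_matches if z_dist[m] <= z_threshold]

    # Accuracy after thresholding, over kept matches that involve a reference unit.
    kept_ref = [m for m in kept if m[0] in ref_day1]
    kept_correct = [m for m in kept_ref if m in ref]
    accuracy = len(kept_correct) / len(kept_ref) if kept_ref else 0.0

    # Putative pairs: kept matches in which neither unit has reference information.
    putative = [m for m in kept if m[0] not in ref_day1 and m[1] not in ref_day2]
    return recovery_rate, accuracy, len(putative)


# Toy example: three reference pairs, four EMD matches.
emd_matches = [(1, 11), (2, 12), (3, 14), (4, 15)]
reference_pairs = [(1, 11), (2, 12), (3, 13)]
z_dist = {(1, 11): 3.0, (2, 12): 8.5, (3, 14): 22.0, (4, 15): 6.0}

rate, accuracy, n_putative = recovery_summary(emd_matches, reference_pairs, z_dist)
```

Here the incorrect match (3, 14) lies 22 µm away in z, so the 10 µm threshold removes it, mirroring how thresholding removes most false positives in the figure.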

Number of reference units (deep blue, dark orange and green for different subjects), putative units (medium green, medium orange and blue), and mixed units (light green, yellow, and light blue) tracked over different durations in days. The rate of decrease is similar across chain types within the same subject.

Example mixed chain: a. Above: Firing rates for this neuron on each day (Day 1, 2, 13, 23, 48). Below: Firing rate percent change compared to the previous day. b. Visual response similarity (yellow line), PSTH similarity correlation (orange line), and visual fingerprint similarity correlation (blue line). The similarity score is the sum of vfp and PSTH. The dashed black line represents the threshold to be considered a reference unit. c. Spatial-temporal waveform of a trackable unit. Each pair of traces represents the waveform on a single channel. d. Estimated location of this unit on different days. Each colored dot represents a unit on one day. The orange squares represent the electrodes. e. The pairwise vfp and PSTH traces of this unit.

Summary of dataset: a. The recording intervals for each animal. A black dash indicates one recording on that day. b. All animals were recorded from visual cortex V1 with a 720 µm section of the probe containing 96 recording sites. The blue arrow indicates the main drift direction. c. Examples of visual fingerprint (vfp) and peri-stimulus time histogram (PSTH) from a high-correlation unit (left column) and a just-above-threshold correlation unit (right column). Both vfp and PSTH values range over [0,1]. d. Kilosort-good and reference unit counts for animal AL032, including units from all four shanks.

The waveform L2 similarity change distribution per dataset by neuron group and across all neurons. Box plots indicate the 25th percentile, median, and 75th percentile. Whiskers at the ends of the box plot show maximum and minimum values. n and N represent the number of unit comparisons, i.e. number of units times (number of datasets − 1).
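As a rough sketch of the quantities named in this caption, the snippet below computes an L2 distance between two spatial-temporal waveforms and the comparison count n. It assumes each waveform is stored as a list of channels, each a list of samples, with identical shapes on both days; any normalization the authors may apply is not reproduced, and the function names are hypothetical.

```python
import math


def waveform_l2(w1, w2):
    """Plain L2 distance between two waveforms of identical shape.

    w1, w2: list of channels, each a list of voltage samples.
    """
    return math.sqrt(
        sum((a - b) ** 2 for ch1, ch2 in zip(w1, w2) for a, b in zip(ch1, ch2))
    )


def n_comparisons(n_units, n_datasets):
    """Number of unit comparisons: units times (datasets - 1), as in the caption."""
    return n_units * (n_datasets - 1)


# Two-channel, two-sample toy waveforms: squared differences are 9 and 16.
d = waveform_l2([[0.0, 3.0], [0.0, 0.0]], [[0.0, 0.0], [4.0, 0.0]])

# Ten units each tracked across five datasets give forty comparisons.
n = n_comparisons(10, 5)
```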

The top plot shows the location change distribution per unit, and the bottom plot shows the change per dataset, by neuron group and across all neurons. Box plots indicate the 25th percentile, median, and 75th percentile. Whiskers at the ends of the box plot show maximum and minimum values. n and N represent the number of units (upper plot) and the number of unit comparisons, i.e. number of units times (number of datasets − 1) (lower plot).

The average firing rate fold change per dataset by neuron group and across all neurons. Box plots indicate the 25th percentile, median, and 75th percentile. Whiskers at the ends of the box plot show maximum and minimum values. n and N represent the number of units.

The visual fingerprint and PSTH change distribution per dataset by neuron group and across all neurons. Box plots indicate the 25th percentile, median, and 75th percentile. Whiskers at the ends of the box plot show maximum and minimum values. n and N represent the number of unit comparisons, i.e. number of units times (number of datasets − 1).

The similarity score distribution per dataset by neuron group and across all neurons. Box plots indicate the 25th percentile, median, and 75th percentile. Whiskers at the ends of the box plot show maximum and minimum values. n and N represent the number of unit occurrences, i.e. number of units times the number of datasets in which each unit appears.

An example similarity score (vfp + PSTH) heatmap from animal AL032 shank 2 Kilosort-good units between day 1 and 2. Each small square represents the similarity score (values range over [0,2]) between one unit from day 1 and one unit from day 2. A warm-colored square indicates a higher score. All clusters are ordered by their physical locations on the probe. There is a diagonal line of the brightest color blocks, indicating that units with more similar firing responses across days tend to be physically close. This confirms our assumption that neurons are physically stable over time. Also notice that, in each column, there may be more than one bright block among the more distant clusters. We minimize the effect of the distant units by constraining the feasible region during reference unit selection. There are also columns without bright yellow blocks. This happens because some units do not respond to the stimulus, and those units are not included in the reference set.
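A heatmap of this kind can be built from pairwise scores, as in the sketch below. The caption states only that the score is the sum of the vfp and PSTH terms with values in [0,2]; the use of a Pearson correlation clipped at zero for each term is an assumption made here to respect that range, and the function names and toy vectors are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def similarity_score(unit1, unit2):
    """Assumed score: clipped vfp correlation + clipped PSTH correlation, in [0, 2]."""
    vfp_term = max(0.0, pearson(unit1["vfp"], unit2["vfp"]))
    psth_term = max(0.0, pearson(unit1["psth"], unit2["psth"]))
    return vfp_term + psth_term


def score_matrix(day1_units, day2_units):
    """Rows index day-1 units, columns index day-2 units (both sorted by position)."""
    return [[similarity_score(u, v) for v in day2_units] for u in day1_units]


# Toy units: `same` repeats the day-1 responses, `flipped` inverts them.
unit = {"vfp": [0.1, 0.4, 0.9, 0.3], "psth": [0.0, 0.5, 1.0, 0.2]}
same = {"vfp": list(unit["vfp"]), "psth": list(unit["psth"])}
flipped = {"vfp": [1.0 - x for x in unit["vfp"]], "psth": [1.0 - x for x in unit["psth"]]}

scores = score_matrix([unit], [same, flipped])
```

A unit whose responses are constant, for instance one that does not respond to the stimulus, scores zero against everything under this sketch, which corresponds to a column without bright blocks.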

The effect of drift correction in finding reference units for all three animals. Note that drift correction improves the recovery rate for most cases; the degree of improvement is a function of the magnitude of the drift.

The distribution fitting for KSgood and correct reference units with simulated data at f = 0.23, 0.5, 0.6, 0.7, 0.96, respectively.

The reference unit recovery rate for spike-sorted recordings spanning different numbers of days. Each triangle represents the matching results of two datasets. Animal AL031 has 6 sets of matching, with one outlier removed. Animal AL032 has 24 sets of matching. Animal AL036 has 60 sets of matching. The recovery quality decreases as the datasets span a longer time.

An example of a reference chain. a. Above: Firing rates of this neuron on each day. Below: Firing rate percent change compared to the previous day. b. Visual response similarity (yellow line), PSTH similarity correlation (orange line), and visual fingerprint similarity correlation (blue line). The similarity score is the sum of vfp and PSTH. The dashed black line represents the threshold to be considered a reference unit. c. Spatial-temporal waveform of a trackable unit. Each pair of traces represents the waveform on a single channel. d. Estimated location of this unit on different days. Each colored dot represents a unit on one day. The orange squares represent the electrodes. e. The pairwise vfp and PSTH traces of this unit.

An example of a putative chain. The panel order is the same as above.

Reference unit counts and normalized EMD cost for each pair of datasets recorded by the same shank. For animal AL036 (left), we excluded the first two datasets and all of their matching results (first two rows of each matrix on the left) based on the reference unit counts. Subsequent analysis of their matching EMD cost, location-only cost and waveform-only cost shows a large difference from the later days (datasets in the red rectangles). We infer that the first two datasets were recorded from a different population than the later days. The other matrices show the same information for animal AL032 for reference. To show the relative size of the EMD cost in related versus unrelated datasets, we calculated the cost between unrelated datasets with similar numbers of units (AL032 shank 1 and AL036 shank 1, EMD cost = 78, location cost = 67, and waveform cost = 32). This EMD cost is between 70 and 80, much larger than the costs between related datasets (between 20 and 30).

The normalized EMD cost (unitless), z distance (µm), physical distance (µm), and waveform distance (unitless) and the corresponding recovery rate in pairwise matches of each recording to all other recordings, shank by shank. Each triangle represents the recovery rate of two datasets. Animal AL031 has 6 sets of matching, with one outlier removed. Animal AL032 has 24 sets of matching. Animal AL036 has 60 sets of matching. Overall, most dataset pairs with a high recovery rate have a per-unit EMD cost around 50. Note that the EMD cost is not predictive of recovery rate.

The ratio of reference units to KSgood units decreases for datasets separated by larger time intervals. However, the variability in the number of reference units is generally large among datasets with the same time interval.

We varied the weight ω in equation 3, which combines physical and waveform distances, in increments of 500. The vertical line indicates weight = 1500, where the overall recovery rate = 86.29%. The maximum recovery rate = 87.68% occurs at weight = 3000. We chose weight = 1500 for all subsequent analyses.

The Kilosort-good and reference unit counts for animals AL031 and AL036, as shown for animal AL032 in Figure 5.