(A) WaveMAP begins with UMAP, which projects high-dimensional data into a lower dimension while preserving local and global relationships (see Figure 2—figure supplement 1A for an intuitive diagram). Normalized average waveforms from single units (i) are passed to UMAP (McInnes et al., 2018), which begins with the construction of a high-dimensional graph (ii). In the high-dimensional space (ii.a), UMAP constructs a distance metric local to each data point (ii.b). The unit ball (ball with radius of one) of each local metric stretches to the 1st-nearest neighbor. Beyond this unit ball, local distances decrease (ii.c) according to an exponential distribution that is scaled by the local density. This local metric is used to construct a weighted graph with asymmetric edges (ii.d). Each point is connected to its 1st-nearest neighbor by an edge of weight 1.0. For the remaining nearest neighbors, this weight falls off according to the exponential local distance metric (in this diagram, some low-weight connections are omitted for clarity). The two directed edges between each pair of points are then made symmetric (ii.e).
(B) The high-dimensional graph (iii) captures latent structure in the high-dimensional space. We can use this graph in Louvain community detection (Louvain, iv) (Blondel et al., 2008) to find clusters (see Figure 2—figure supplement 1B for an intuitive diagram). In Louvain, each data point is first initialized as belonging to its own ‘community’ (iv.a, analogous to a cluster in a metric space). Then, in an iterative procedure, each data point joins neighboring communities until a measure called ‘modularity’ is maximized (iv.b, see Supplemental Information for a definition of modularity). Next, data points in the same final community are aggregated into a single node, and the process repeats until the maximal modularity is found on this newly aggregated graph.
These aggregation and optimization steps repeat until modularity can no longer be increased, and the final community memberships are then passed back to the original data points. We can also use this graph to find a low-dimensional representation through a graph layout procedure (v). The graph layout proceeds by finding a ‘low energy’ configuration that balances attractive (shown as springs in v.a) and repulsive (not shown) forces between pairs of points as a function of edge weight or lack thereof. This procedure iteratively minimizes the cross-entropy between the low-dimensional and high-dimensional graphs (v.b). The communities found through Louvain are then combined with the graph layout procedure to arrive at a set of clusters in a low-dimensional embedded space (vi). These clusters (vi, top) can be used to classify the original waveforms (vi, bottom). To investigate ‘why’ these data points became clusters, each cluster is examined for locally (within-cluster) important features (SHAP; Lundberg and Lee, 2017) (vii) and globally important trends (UMAP inverse transform, viii). The classifier from which the SHAP values are calculated is not shown. The diagrams for the graph construction and layout are based on the UMAP documentation, and the diagram for Louvain community detection is based on Blondel et al., 2008.
Figure 2—figure supplement 1: An intuitive diagram of local and global distance preservation in UMAP and a schematic of the Louvain clustering process.
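As a rough illustration of steps (ii.d), (ii.e), and the modularity measure used in (iv), the following minimal Python sketch builds the directed edge weights, symmetrizes them, and scores a candidate partition. It is not the WaveMAP implementation. Several things here are assumptions made for illustration: the toy one-dimensional `points`, the function names, and a fixed scale `sigma` (UMAP instead tunes this scale separately for each point according to local density). The symmetrization rule a + b - a*b is the fuzzy union described by McInnes et al., 2018, and the modularity formula is the standard weighted form from Blondel et al., 2008.

```python
import math

def directed_weights(dist, k, sigma=1.0):
    """Directed edge weights from each point to its k nearest neighbors.

    The 1st-nearest neighbor gets weight 1.0; farther neighbors decay
    exponentially with the distance beyond that first neighbor.
    sigma is held fixed here as a simplifying assumption.
    """
    n = len(dist)
    w = {}
    for i in range(n):
        nbrs = sorted((j for j in range(n) if j != i),
                      key=lambda j: dist[i][j])[:k]
        rho = dist[i][nbrs[0]]  # radius of the local unit ball
        for j in nbrs:
            w[(i, j)] = math.exp(-max(0.0, dist[i][j] - rho) / sigma)
    return w

def symmetrize(w):
    """Combine the two directed weights a and b of a pair as a + b - a*b."""
    sym = {}
    for (i, j), a in w.items():
        b = w.get((j, i), 0.0)
        sym[(min(i, j), max(i, j))] = a + b - a * b
    return sym

def modularity(sym, community):
    """Weighted modularity of a partition (community[i] is a label)."""
    two_m = 2.0 * sum(sym.values())
    inside = 0.0   # edge weight falling within communities (both directions)
    totals = {}    # summed node strength per community
    for (i, j), wij in sym.items():
        if community[i] == community[j]:
            inside += 2.0 * wij
        for node in (i, j):
            c = community[node]
            totals[c] = totals.get(c, 0.0) + wij
    return inside / two_m - sum((t / two_m) ** 2 for t in totals.values())

# Hypothetical toy data: two well-separated groups on a line.
points = [0.0, 1.0, 1.5, 10.0, 11.0, 11.2]
dist = [[abs(a - b) for b in points] for a in points]

w = directed_weights(dist, k=2)
sym = symmetrize(w)
communities = [0, 0, 0, 1, 1, 1]
q = modularity(sym, communities)
print("symmetric edges:", sym)
print("modularity of the two-group partition:", q)
```

In this toy example, placing all points in a single community gives a modularity of zero, whereas the two-group partition scores higher, which is the kind of improvement Louvain searches for as it merges communities.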