Whole-brain comparison of rodent and human brains using spatial transcriptomics

  1. Antoine Beauchamp (corresponding author)
  2. Yohan Yee
  3. Ben C Darwin
  4. Armin Raznahan
  5. Rogier B Mars (corresponding author)
  6. Jason P Lerch (corresponding author)
  1. The Hospital for Sick Children, Canada
  2. Mouse Imaging Centre, Canada
  3. Department of Medical Biophysics, University of Toronto, Canada
  4. Section on Developmental Neurogenomics, Human Genetics Branch, National Institute of Mental Health Intramural Research Program, United States
  5. Wellcome Centre for Integrative Neuroimaging, Centre for Functional MRI of the Brain (FMRIB), Nuffield Department of Clinical Neurosciences, John Radcliffe Hospital, University of Oxford, United Kingdom
  6. Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Netherlands

Abstract

The ever-increasing use of mouse models in preclinical neuroscience research calls for an improvement in the methods used to translate findings between mouse and human brains. Previously, we showed that the brains of primates can be compared in a direct quantitative manner using a common reference space built from white matter tractography data (Mars et al., 2018b). Here, we extend the common space approach to evaluate the similarity of mouse and human brain regions using openly accessible brain-wide transcriptomic data sets. We show that mouse-human homologous genes capture broad patterns of neuroanatomical organization, but the resolution of cross-species correspondences can be improved using a novel supervised machine learning approach. Using this method, we demonstrate that sensorimotor subdivisions of the neocortex exhibit greater similarity between species, compared with supramodal subdivisions, and mouse isocortical regions separate into sensorimotor and supramodal clusters based on their similarity to human cortical regions. We also find that mouse and human striatal regions are strongly conserved, with the mouse caudoputamen exhibiting an equal degree of similarity to both the human caudate and putamen.

Editor's evaluation

This important work develops new methods for aligning measures of brain-wide gene expression in the mouse and human brains. It presents compelling evidence in support of both conserved and species-specific transcriptional patterns. The work will be of interest to neuroscientists and geneticists interested in the molecular correlates of brain evolution.

https://doi.org/10.7554/eLife.79418.sa0

Introduction

Animal models play an indispensable role in neuroscience research, not only for understanding disease and developing treatments but also for obtaining data that cannot be obtained in the human. While numerous species have been used to model the human brain, the mouse has emerged as the most prominent of these, due to its rapid life cycle, straightforward husbandry, and amenability to genetic engineering (Dietrich et al., 2014; Ellenbroek and Youn, 2016; Kabakci et al., 2004; Houdebine, 2004). Mouse models have proven to be extremely useful for understanding diverse features of the brain, from its molecular neurobiological properties to its large-scale network properties (Hodge et al., 2019; Oh et al., 2014; Yao et al., 2021). However, translating findings from the mouse to the human has not been straightforward. This is especially evident in the context of neuropsychopharmacology, where promising neuropsychiatric drugs have one of the highest failure rates in Phase III clinical trials (Hay et al., 2014).

Successful translation requires an understanding of how effects on the brain of the model species are likely to manifest in the brain of the actual species of interest. This is not trivial in the case of the mouse and human, as the two species diverged from a common ancestor about 80 million years ago (Kaas, 2012). Although common themes are apparent in the brains of all mammalian species studied to date (Krubitzer, 2007), there remain substantial differences between the mouse and human brain. Beyond the obvious differences in size, large parts of the human cortex potentially have no corresponding homologues in the mouse (Preuss, 1995). Direct comparisons across the brains of different species are further complicated by the fact that researchers from different traditions use inconsistent nomenclature to refer to similar neuroanatomical areas (van Heukelum et al., 2020; Laubach et al., 2018).

Over the course of the last decade, we have developed novel approaches to explicitly evaluate similarities and differences between the brains of related species. These approaches describe brains using common data spaces that are directly comparable between species, making it possible to evaluate the similarity of different regions in a quantitative fashion (Mars et al., 2021). This way, potential homologues can be formally tested, and regions of the brain that do not allow for straightforward translation can be identified (Mars et al., 2018b). Establishing such a formal translation between the mouse and the human brain would allow scientists involved in translational research to explicitly test hypotheses about conservation of brain regions, identify regions that are well suited to translational paradigms, and directly transform quantitative maps from the brain of one species to the other.

One approach toward building these common spaces has been to exploit connectivity. It has previously been demonstrated that brain regions can be identified via their unique set of connections to other regions in the brain. This connectivity fingerprint can therefore be seen as a diagnostic of an area (Mars et al., 2018a; Passingham et al., 2002). The common connectivity space approach relies on defining agreed upon neuroanatomical homologues a priori and then expressing the connectivity fingerprint of regions under investigation with those established homologues in the two brains (Mars et al., 2016a). The connections of any given region to the established homologues thus form a common space, which links the two brains. In a series of early studies, we compared the connectivity of the macaque and human brain, identifying homologies as well as specializations across association cortex (Mars et al., 2013; Neubert et al., 2014; Sallet et al., 2013). The same approach has recently been applied to mouse-human comparisons for the first time, demonstrating conserved organization between the mouse and human striatum, but some specialization in the human caudate related to connectivity with the prefrontal cortex (Balsters et al., 2020). A similar study recently compared connectivity of the medial frontal cortex across rats, marmosets, and humans (Schaeffer et al., 2020). However, the lack of established neuroanatomical homologues in mice, particularly in the cortex, limits the use of connectivity to compare these species.

A more promising approach to mouse-human comparisons could be to exploit the spatial patterns of gene expression. Advances in transcriptomic mapping can be used to characterise the differential expression of many thousands of genes across the brain and compare the pattern between regions (Ortiz et al., 2020). Moreover, the availability of whole-brain spatial transcriptomic data sets for multiple species provides an opportunity to run novel analyses at low cost (Hawrylycz et al., 2012; Lein et al., 2007). Such maps for the human cortex show topographic patterns that mimic those observed in other modalities, such as a gradient between primary and heteromodal areas of the neocortex (Burt et al., 2018). Importantly, these patterns appear to be conserved across mammalian species (Fulcher et al., 2019), which opens up the possibility of using the expression of homologous genes as a common space across species. In fact, a recent study demonstrated how the expression of homologous genes can be used to directly register mouse and vole brains into a common reference frame, which allows for direct point-by-point comparisons of brain maps (Englund et al., 2021). However, this specific approach is only feasible because of the large degree of morphological similarity between mouse and vole brains. In the case of mouse-human comparisons, we almost certainly cannot directly register mouse and human brains into a common coordinate frame using methods for image registration. Hence we need to be more creative in our approach.

Here we examine the patterns of similarity between the mouse and human brain using a common space constructed from spatial gene expression data sets. We begin with an initial set of 2835 homologous genes. Subsequently, we present and evaluate a novel method for improving the resolution of mouse-human neuroanatomical correspondences using a supervised machine learning approach. Using the novel representation of the gene expression common space, that is, a latent gene expression space, we analyze the similarity of mouse and human isocortical subdivisions and demonstrate that sensorimotor regions exhibit a higher degree of similarity than supramodal regions. Finally, we examine the patterns of transcriptomic similarity at a voxel-wise level in the mouse and human striatum.

Results

Homologous genes capture broad similarities in the mouse and human brains

We first examined the pattern of similarities that emerged when comparing mouse and human brain regions on the basis of their gene expression profiles. We constructed a gene expression common space using widely available data sets from the Allen Institute for Brain Science: the Allen Mouse Brain Atlas (AMBA) and the Allen Human Brain Atlas (AHBA) (Hawrylycz et al., 2012; Lein et al., 2007). These data sets provide whole-brain coverage of expression intensity for thousands of genes in the mouse and human genomes. For our purposes, we filtered these gene sets to retain only mouse-human homologous genes using a list of orthologues obtained from the NCBI HomoloGene system (NCBI Resource Coordinators, 2018). Using a gene enrichment analysis, we found that this reduced gene set was significantly associated with a number of biological processes related to the nervous system, with Gene Ontology labels such as ‘nervous system development’, ‘neurogenesis’, and ‘regulation of nervous system development’. Additional modules returned with high significance were ‘regulation of multicellular organismal process’, ‘regulation of biological quality’, and ‘multicellular organism development’. The full set of significant modules can be found in Supplementary file 1.

Prior to analysis, the mouse and human homologous gene expression data sets were pre-processed using a pipeline that included quality control checks, normalization procedures, and aggregation of the expression values under a set of atlas labels. The result was a gene-by-region matrix for each species, describing the normalized expression of 2835 homologous genes across 67 mouse regions and 88 human regions (see Materials and methods). We quantified the degree of similarity between all pairs of mouse and human regions using the Pearson correlation coefficient, resulting in a mouse-human similarity matrix (Figure 1A).
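
As an illustration of the correlation step described above, the following is a minimal sketch that uses toy data rather than the AMBA and AHBA matrices. The function names (`pearson`, `similarity_matrix`), the region names, and the expression values are hypothetical; the only assumption taken from the text is that each region is represented by a vector of normalized expression values over the same ordered set of homologous genes.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def similarity_matrix(mouse, human):
    """Correlate every human region with every mouse region.

    `mouse` and `human` map a region name to its expression vector, with
    genes listed in the same order in both species. The result is a nested
    dict indexed as sim[human_region][mouse_region] (rows human, columns
    mouse, as in Figure 1A).
    """
    return {
        h_name: {m_name: pearson(h_expr, m_expr) for m_name, m_expr in mouse.items()}
        for h_name, h_expr in human.items()
    }


# Toy example: 4 hypothetical genes, 2 mouse regions, 2 human regions.
mouse = {"mouse_A": [1.0, 2.0, 3.0, 4.0], "mouse_B": [4.0, 3.0, 2.0, 1.0]}
human = {"human_A": [2.0, 4.0, 6.0, 8.0], "human_B": [1.0, 0.0, 1.0, 0.0]}
sim = similarity_matrix(mouse, human)
```

With the real data, the same computation would yield an 88-by-67 matrix; a vectorized implementation would be preferable at that scale.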

Transcriptomic similarity in the mouse and human brains.

(A) Similarity matrix displaying the correlation between 67 mouse regions and 88 human regions based on the expression of 2835 homologous genes. Columns are annotated with 11 broad mouse regions: cortical subplate (CTXsp), olfactory areas (OLF), hippocampal formation (HPF), isocortex, cerebral nuclei (CNU), interbrain (IB), midbrain (MB), pons (P), medulla (MY), cerebellar cortex (CBX), and cerebellar nuclei (CBN). Rows are annotated with 16 broad human regions: claustrum (Cl), limbic lobe (LL), frontal lobe (FL), insula (Ins), occipital lobe (OL), parietal lobe (PL), temporal lobe (TL), amygdala (Amg), basal ganglia (BG), basal forebrain (BF), diencephalon (DIE), mesencephalon (MES), pons, myelencephalon (MY), cerebellar cortex (CbCx), and cerebellar nuclei (CbN). Broad patterns of similarity are evident between coarsely defined brain regions, while correlation patterns are mostly homogeneous within these regions. (B) Mouse brain coronal slices showing similarity profiles for the human precentral gyrus, cuneus, and crus I. Correlation patterns for the precentral gyrus and cuneus are highly similar to one another and broadly similar to most isocortical regions. The crus I is homogeneously similar to the mouse cerebellum. (C) Anatomically ordered line charts displaying the similarity profiles for the seed regions in (B). Dashed vertical lines indicate the canonical mouse homologue for each human seed. Annotation colors correspond to atlas colors from the Allen Mouse Brain Atlas and Allen Human Brain Atlas for mouse and human regions, respectively.

Figure 1—source data 1

Mouse-human similarity matrix using homologous genes.

https://cdn.elifesciences.org/articles/79418/elife-79418-fig1-data1-v2.csv

We find that the similarity matrix exhibits broad patterns of positive correlation between the mouse and human brains. These clusters of similarity correspond to coarse neuroanatomical regions that are generally well defined in both species. For instance, we observe that, overall, the mouse isocortex is similar to the human cerebral cortex, with the exception of the hippocampal formation, which forms a unique cluster. Similarly, the mouse and human cerebellar hemispheres cluster together, while the cerebellar nuclei show relatively high correlation to each other (r=0.351) as well as to brain stem structures like the pons (r=0.328 and r=0.335 for the mouse and human nuclei, respectively) and myelencephalon (r=0.288 and r=0.351). The associations between broad regions such as these are self-evident in the correlation matrix.

Our ability to resolve regional matches on a finer scale is limited when using all homologous genes in this way. This is especially true for regions within the cerebral and cerebellar cortices, which exhibit a high degree of internal homogeneity. This is apparent in the similarity profiles, defined here as the set of correlation values between a given seed region and all target regions in the other species. For example, the human precentral gyrus and cuneus are most strongly correlated to many regions of the mouse isocortex. While the brain maps feature a rostral-caudal gradient (Figure 1B), the profiles of the two seeds are highly similar despite the regions having very different functions (Figure 1C). Indeed, the correlation between the similarity profiles of the precentral gyrus and cuneus is r=0.975. The similarity profile of human cerebellar crus 1 highlights another example of this homogeneity. The profile of crus 1 is similar to that of all regions of the mouse cerebellum, with an average correlation of r=0.213 and a standard deviation of σ=0.034. The variance of the correlations across cortical regions is σ²=0.0067, while that across cerebellar hemispheric regions is σ²=0.0013, compared with a total variance of σ²=0.031 across all entries in the matrix.

Although there is distinguishing power in the profiles of regions at a finer scale, this is much smaller than between coarse anatomical regions. This is also true for parts of the broad anatomical systems that are part of the same functional system. This suggests that the regional expression patterns of mouse-human homologous genes can be used to identify general similarities between the brains of the two species using a simple correlation measure, but the ability to identify finer scale matches might require a more subtle approach.

A latent gene expression space improves the resolution of mouse-human associations

In the previous analyses, we showed that the expression profiles of homologous genes capture broad similarities across the mouse and the human for the major subdivisions of the brain. Some information at a finer resolution (e.g. within the isocortex) was also evident but much less distinctive. Our next goal was to investigate whether it is possible to leverage the gene expression data sets to relate mouse and human brains to one another at a finer regional level. In order to do so, we sought to maximize the informational value in the set of 2835 homologous genes by creating a new latent common space that exploits the regional distinctiveness of the expression profiles.

The approach used in the previous analysis relied on using homologous genes as a common space between the mouse and human brain. This approach effectively assigns equal value to each gene, whereas a more powerful approach would be to weight genes by their ability to distinguish between different brain regions. We investigated whether we could accomplish this by constructing a new set of variables from combinations of the homologous genes. Our primary goal here was to transform the initial gene space into a new common space that would improve the locality of the matches. However, while we sought a transformation that would allow us to recapitulate known mouse-human neuroanatomical homologues, we also wanted to avoid directly encoding such correspondences in the transformation. Using this information as part of the optimization process for the transformation would run the risk of driving the transformation toward mouse-human pairs that are already known. While we are interested in being able to recover such matches, we are equally interested in identifying novel and unexpected associations between neuroanatomical regions in the mouse and human brains (e.g. one-to-many correspondences). Given these criteria, our approach to identifying an appropriate transformation was to train a multi-layer perceptron classifier on the data from the AMBA. The classifier was tasked with predicting the 67 labels in our mouse atlas from the voxel-wise expression of the homologous genes (Figure 2A).

Figure 2 with 1 supplement
Creating a new common space.

(A) Voxel-wise expression maps from 2835 homologous genes in the Allen Mouse Brain Atlas were used to train the neural network to classify each mouse voxel into one of the 67 atlas regions. (B) Once the network is trained, the output layer is removed. The mouse and human regional gene expression matrices are passed through the network, resulting in lower-dimensional latent space representations of the data. The training and transformation process was repeated 500 times. (C) A similarity matrix displaying the gene expression latent space correlation between mouse and human regions, averaged over 500 neural network training runs. Similar brain regions exhibit very high correlation values. Column and row annotations as described in Figure 1.

Figure 2—source data 1

Correlations between mouse and human brain regions in all latent spaces (1 of 3), related to Figure 2C.

https://cdn.elifesciences.org/articles/79418/elife-79418-fig2-data1-v2.csv
Figure 2—source data 2

Correlations between mouse and human brain regions in all latent spaces (2 of 3), related to Figure 2C.

https://cdn.elifesciences.org/articles/79418/elife-79418-fig2-data2-v2.csv
Figure 2—source data 3

Correlations between mouse and human brain regions in all latent spaces (3 of 3), related to Figure 2C.

https://cdn.elifesciences.org/articles/79418/elife-79418-fig2-data3-v2.csv

While the model could have been trained using the data from either species, we chose to use the mouse data because it provides continuous coverage of the entire brain and is thus better suited to this purpose. In training the model to perform this classification task, we effectively optimize the network architecture to identify a transformation from the input gene space to a space that encodes information about the delineation between mouse brain regions. To extract this transformation, we removed the output layer from the trained neural network. The resulting architecture defines a transformation from the input space to a lower-dimensional gene expression latent space. We then applied this transformation to the mouse and human gene-by-region expression matrices to obtain representations of the data in the latent common space (Figure 2B). Finally, we used these gene expression latent common space matrices to compute the new similarity matrix (Figure 2C). Since the optimization algorithm used to train the perceptron features an inherent degree of stochasticity, we repeated this training and transformation process 500 times to generate a distribution of latent spaces and similarity matrices over training runs. Although the neural network and associated latent space do not directly provide information about which genes are most important for the classification of specific mouse atlas labels, this type of information can be derived from the model using attribution methods such as integrated gradients (Figure 2—figure supplement 1; Sundararajan et al., 2017). Each brain region in the classification task is associated with the input genes in different ways, such that there isn’t a single weighting of gene importance for the entire model. While most genes contribute to the classification of any given label in some capacity, it is often the case that the network relies on a reduced subset of genes to arrive at a decision. 
For example, the genes Prrg2 and Cd4 were found to be the most influential for the classification of the caudoputamen when the feature attributions were averaged over all training runs. In contrast, Rfx4 and Glra3 were the most influential for the classification of the primary motor area. In some cases, the spatial expression pattern of the gene clearly shows a demarcation of the region of interest (e.g. Cd4), but this is not always the case, nor is it necessary, as the network learns from the entire gene expression signature of all voxels.
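
The idea of removing the output layer to obtain a latent representation can be sketched as follows. This is an illustration only and not the network used in the study: the class `TinyMLP`, the single hidden layer, the ReLU activation, the layer sizes, and the random weights (which stand in for trained weights, since training is omitted) are all assumptions made for the example.

```python
import math
import random


def affine(weights, bias, v):
    """Compute weights @ v + bias for a row-major weight matrix."""
    return [sum(w * x for w, x in zip(row, v)) + b for row, b in zip(weights, bias)]


def relu(v):
    return [max(0.0, x) for x in v]


def softmax(v):
    m = max(v)
    exps = [math.exp(x - m) for x in v]
    total = sum(exps)
    return [e / total for e in exps]


class TinyMLP:
    """One-hidden-layer perceptron; weights are random stand-ins for trained ones."""

    def __init__(self, n_genes, n_hidden, n_labels, seed=0):
        rng = random.Random(seed)
        self.w_hidden = [[rng.gauss(0, 0.1) for _ in range(n_genes)] for _ in range(n_hidden)]
        self.b_hidden = [0.0] * n_hidden
        self.w_out = [[rng.gauss(0, 0.1) for _ in range(n_hidden)] for _ in range(n_labels)]
        self.b_out = [0.0] * n_labels

    def latent(self, expression):
        """Hidden-layer activations: the network with its output layer removed."""
        return relu(affine(self.w_hidden, self.b_hidden, expression))

    def predict_proba(self, expression):
        """Full forward pass: label probabilities from the latent representation."""
        return softmax(affine(self.w_out, self.b_out, self.latent(expression)))


# Hypothetical sizes: 6 genes, 3 latent dimensions, 4 atlas labels.
net = TinyMLP(n_genes=6, n_hidden=3, n_labels=4)
mouse_region = [0.2, -1.0, 0.5, 1.3, 0.0, -0.4]
human_region = [0.1, -0.8, 0.7, 1.1, 0.2, -0.5]

# Both species are passed through the same truncated network.
mouse_latent = net.latent(mouse_region)
human_latent = net.latent(human_region)
```

The point of the design is that the same transformation, learned from mouse voxels only, is applied to both species, so the latent vectors live in a shared space and can be correlated as before.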

To assess whether the latent space representations of the data improved the resolution of the mouse-human matches, we considered two criteria. The first was whether the similarity profiles of the mouse atlas regions were more localized within the corresponding broad regions of interest (e.g. primary motor area within isocortex), compared with their similarity profiles in the original gene space. We term this the locality criterion. The second criterion was whether the degree of similarity between canonical neuroanatomical homologues improved in this new latent common space. We term this the homology criterion. The locality criterion tells us about our ability to extract finer-scale signal in these profiles, while the homology criterion informs us about our ability to recover expected matches in this finer-scale signal. To evaluate these criteria, we computed ranked similarity profiles for every region in the mouse brain, ordered such that a rank of 1 indicates the most similar human region. In addition, given the difference in absolute value between the input gene space and gene expression latent space correlations, we scaled the similarity profiles to the interval [0, 1] in order to make comparisons between the spaces.
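
The ranking and scaling of a similarity profile described above can be sketched as follows. The sketch assumes min-max scaling, which is one way to map a profile onto [0, 1] and is not confirmed by this section; the function names and the toy profile are hypothetical.

```python
def scale_unit_interval(values):
    """Min-max scale a list of numbers to [0, 1] (assumes they are not all equal)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def ranked_profile(profile):
    """Rank-order a similarity profile.

    `profile` maps a human region name to its correlation with one mouse
    seed region. Returns (rank, region, scaled_similarity) tuples, with
    rank 1 assigned to the most similar human region.
    """
    names = list(profile)
    scaled = scale_unit_interval([profile[name] for name in names])
    ordered = sorted(zip(names, scaled), key=lambda pair: pair[1], reverse=True)
    return [(rank, name, value) for rank, (name, value) in enumerate(ordered, start=1)]


# Toy profile for a single mouse seed region against three human regions.
profile = {"human_A": 0.30, "human_B": -0.10, "human_C": 0.10}
ranked = ranked_profile(profile)
```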

We evaluated the locality criterion by examining the decay rate of the top of the similarity profiles. We reasoned that the plateau of similarity to a broad brain region, as seen in the anatomically ordered similarity matrices and profiles (Figure 1A, C; Figure 2C), would correspond to a similar plateau at the head of the rank-ordered profiles. Moreover, the emergence of local signal would manifest as an increase in the range between the peaks and troughs within the broad region. In the rank-ordered profiles, this would correspond to a faster rate of decay at the head of the profile. In order to quantify this decay, we computed the rank at which each region’s similarity profile decreased to a scaled value of 0.75. This was calculated for every mouse region in the initial gene space, as well as in each of the 500 gene expression latent spaces. As a measurement of performance between the two representations of the data, we then took the difference in this rank between each of the latent spaces and the original gene space (Figure 3A). A negative rank difference indicates an improvement in the latent space.
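
The decay metric described above can be sketched as follows. The sketch assumes that "the rank at which the profile decreased to 0.75" means the first rank whose scaled similarity is at or below the threshold, with no interpolation between ranks; the function name and the two toy profiles are hypothetical.

```python
def rank_at_threshold(scaled_profile, threshold=0.75):
    """First rank (1-based) at which a scaled profile falls to or below `threshold`.

    `scaled_profile` holds similarities already scaled to [0, 1]; it is
    sorted in descending order here so that rank 1 is the best match.
    """
    ordered = sorted(scaled_profile, reverse=True)
    for rank, value in enumerate(ordered, start=1):
        if value <= threshold:
            return rank
    return len(ordered)  # only reached if no value is at or below the threshold


# Toy profiles for one mouse region: a broad plateau in the initial gene
# space versus a faster decay in a latent space.
gene_space = [1.0, 0.95, 0.9, 0.85, 0.8, 0.7, 0.2, 0.0]
latent_space = [1.0, 0.8, 0.6, 0.3, 0.2, 0.1, 0.05, 0.0]

# Negative values indicate improved locality in the latent space.
rank_difference = rank_at_threshold(latent_space) - rank_at_threshold(gene_space)
```

In this toy case the gene-space profile reaches the threshold at rank 6 and the latent-space profile at rank 3, giving a rank difference of -3.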

Quantifying improvement in locality in gene expression latent space.

(A) The amount of local signal within a broadly similar region of the brain for a finer seed region’s (e.g. primary motor area) similarity profile can be quantified by the decay rate of the head of the rank-ordered profile. Decay rate was quantified by computing the rank at a similarity of 0.75. This metric was compared between the initial gene expression space (orange line) and every gene expression latent space resulting from repeated training of the neural network (every blue line is a training outcome, heavy blue line serves as an example). A negative difference between these rank metrics indicates an improvement in locality in the latent space. (B) Structure-wise distributions of differences in rank at a similarity of 0.75 between the initial gene expression space and the gene expression latent spaces. Points and error bars represent mean and 95% CI with n = 500. Dashed black line at 0 indicates the threshold for improvement in one space over the other. Colors correspond to Allen Mouse Brain Atlas annotations as in Figures 1 and 2. Binomial likelihood (logistic regression) estimate of pB=0.78 with 95% CI [0.66, 0.86]. The probability of obtaining at least this many successes under the null binomial distribution, B(67, 0.5), is p=8.64 × 10⁻⁷. (C) Proportion of perceptron training runs resulting in an improvement or null difference in the gene expression latent space compared with the initial space, estimated using region-wise logistic regressions. Cortical and cerebellar regions exhibit high proportions of improvement, while subcortical regions are less likely to be improved by the classification process.

Figure 3—source data 1

Scaled similarity profiles of the mouse primary motor area, related to Figure 3A.

https://cdn.elifesciences.org/articles/79418/elife-79418-fig3-data1-v2.csv
Figure 3—source data 2

Ranks at a similarity of 0.75 for mouse regions in the homologous gene space and all latent spaces, related to Figure 3B.

https://cdn.elifesciences.org/articles/79418/elife-79418-fig3-data2-v2.csv
Figure 3—source data 3

Logistic regression model estimates for mouse regions, related to Figure 3C.

https://cdn.elifesciences.org/articles/79418/elife-79418-fig3-data3-v2.csv

Examining the structure-wise distributions of these rank differences, we found that for the majority of regions in our mouse atlas, the classification approach resulted in either an improvement in the amount of locality within a broad region, or no difference from the original gene space (Figure 3B, C). We quantified the improvement overall by fitting a logistic regression model with no predictors to the mean rank differences of each of the atlas regions. We considered the success condition for the Bernoulli trials to be a mean rank difference less than or equal to zero. The model estimate for the Bernoulli probability (which we denote pB to distinguish it from the p-value p) was pB=0.78 with a 95% CI of [0.66, 0.86]. In other words, 52 of the 67 brain regions saw an improvement (or no change) on average when using the latent spaces. The probability of obtaining at least as many successes as this under the null model, i.e. a binomial distribution with pB=0.50 and n=67, is p=8.64 × 10⁻⁷. We additionally evaluated the same kind of logistic regression on a region-wise basis to quantify how often the latent spaces resulted in an improvement for individual brain regions (Figure 3C). We found that for 46 regions (69%), the model estimated the probability to be at least as high as pB=0.95. While confidence intervals varied around this estimate, the range between the upper and lower bounds never exceeded 0.04. For 53 of the 67 regions (79%), the q-values, i.e. p-values adjusted for multiple comparisons, were effectively zero, with the largest being q=3.77 × 10⁻¹⁶. Of the remaining 14 regions, 13 had q-values equal to 1 and one region, the periaqueductal gray, had a q-value of q=0.854. The regions with the smallest estimates for the Bernoulli probabilities are the dentate gyrus (pB=0.0, no variance, q=1), the striatum ventral region (pB=0.016, 95% CI [0.008, 0.032], q=1), and the lateral septal complex (pB=0.016, 95% CI [0.008, 0.032], q=1). 
The remaining regions with q=1 are all subcortical and fall under the broad subdivisions of cerebral nuclei, olfactory areas, interbrain, midbrain, pons, medulla, and cerebellar nuclei. Beyond this binary measure of improvement, some regions exhibited a large range of differences in rank over the various latent spaces. In particular, regions like the main olfactory bulb (mean rank difference of μ=10, 95% CI [-12, 33]) and accessory olfactory bulb (μ=9, 95% CI [-13, 31]) exhibit a substantial degree of variance. Other than these two areas, regions within the olfactory areas (e.g. piriform area) were among those that benefited the most from the classification approach, showing improvement in all sampled latent spaces, with all Bernoulli probability estimates equal to 1 and all q-values equal to 0. While the effects, i.e. rank differences, are smaller, the similarity profiles of regions belonging to the isocortex and cerebellar cortex also saw an improvement in locality. All models for isocortical areas returned Bernoulli probability estimates greater than pB=0.85 and q-values that were at most q=1.35 × 10⁻⁶⁷. Moreover, 9 of the 19 isocortical regions were improved in all latent spaces, that is, pB=1. Brain regions belonging to the cerebellar cortex saw similar improvement. In contrast, regions belonging to the cerebral nuclei, the diencephalon, midbrain, and hindbrain did not see much improvement in this new common space, with an average Bernoulli probability estimate of pB=0.36 for this subset. Other than the caudoputamen (pB=0.99, 95% CI [0.97, 1.00], q=1.35 × 10⁻¹³⁹), the superior colliculus (pB=0.90, 95% CI [0.87, 0.92], q=9.82 × 10⁻⁸¹), and the inferior colliculus (pB=0.75, 95% CI [0.71, 0.78], q=3.12 × 10⁻³⁰), all regions in this subset return q-values equal to 1. 
For many such regions, the degree of locality appears to be worse in this space, though only by a small number of ranks, for example, striatum ventral region (mean rank difference of μ=4, 95% CI [1, 7]) and lateral septal complex (μ=6, 95% CI [0, 11]). Indeed, computing the average rank difference over this subset of regions across all latent spaces, we find μ=2 with 95% CI [-5, 8]. These results demonstrate that the supervised learning approach used here can improve the resolution of neuroanatomical correspondences between the mouse and human brains, though the amount of improvement varies over the brain. Regions that were already well characterized using the initial set of homologous genes (e.g. subcortical regions) did not benefit tremendously, but numerous regions in the cortical plate and subplate, as well as the cerebellum, saw an improvement in locality in this new common space.
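
The overall estimate from a logistic regression with no predictors can be reproduced in closed form, as the sketch below illustrates. For such a model the fitted probability is simply the observed proportion of successes. The confidence interval here is a Wald interval computed on the logit scale and then back-transformed; that choice, like the function name, is an assumption for illustration, although with 52 successes out of 67 it gives values that round to the estimate and interval reported above.

```python
from math import exp, log, sqrt


def intercept_only_logistic(successes, trials, z=1.96):
    """Estimate and Wald CI for a logistic regression with only an intercept.

    Requires 0 < successes < trials. The maximum-likelihood intercept is
    logit(successes / trials), and its standard error is
    1 / sqrt(trials * p * (1 - p)).
    """
    p_hat = successes / trials
    logit = log(p_hat / (1.0 - p_hat))
    se = sqrt(1.0 / (trials * p_hat * (1.0 - p_hat)))

    def inv_logit(x):
        return 1.0 / (1.0 + exp(-x))

    return p_hat, inv_logit(logit - z * se), inv_logit(logit + z * se)


# 52 of the 67 mouse regions met the success condition (mean rank difference <= 0).
estimate, lower, upper = intercept_only_logistic(52, 67)
```

The region-wise models follow the same logic, with the 500 training runs for a single region serving as the Bernoulli trials.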

While the supervised learning approach improved our ability to identify matches on a finer scale for a number of brain regions, this does not necessarily mean that those improved matches are biologically meaningful. The second criterion for evaluating the performance of the neural network addresses whether this improvement in locality captures what we would expect in terms of known mouse-human homologies. To this end, we examined the degree of similarity between established mouse-human neuroanatomical pairs, both in the initial gene expression space and in the set of latent spaces. We began by establishing a list of 36 canonical mouse-human homologous pairs on the basis of common neuroanatomical labels in our atlases. For each of these regions in the mouse brain, we compared the rank of the canonical human match in the rank-ordered similarity profiles between the latent spaces and the original gene expression space (Figure 4A). The lower the rank, the more similar the canonical pair, with a rank of 1 indicating maximal similarity. As described above, we evaluated the overall performance of the classification approach by running a logistic regression using the average latent space rank difference over all regions in our subset. Here we find an estimated Bernoulli probability of pB=0.64 with 95% CI [0.47, 0.78]. Under the null binomial distribution, B(36, 0.5), the probability of getting at least as many successes as this is p=0.033. We also evaluated the model for each brain region and found that 30 of the 36 regions (83%) return Bernoulli probability estimates of at least pB=0.80. Under the null binomial distribution, B(500, 0.5), we find that the largest q-value among these 30 regions is q=4.39 × 10⁻⁵⁴. Moreover, 24 regions (67%) return Bernoulli probability estimates of at least pB=0.90, and 8 regions show improvement in all latent spaces, that is, pB=1 and q=0 (Figure 4B). 
Among these 8 regions are the claustrum, the piriform area, the primary motor and somatosensory areas, and the crus 2. Additional examples of the many regions that demonstrate improvement include: the primary auditory area (pB=0.83, 95% CI [0.80, 0.86], q=1.80 × 10^-55), the pallidum (pB=0.86, 95% CI [0.83, 0.89], q=3.63 × 10^-65), and the crus 1 (pB=0.92, 95% CI [0.90, 0.94], q=7.68 × 10^-95). Once again we find that many regions in the sub-cortex do not benefit greatly from the gene expression latent spaces, since the initial gene set was already recapitulating the appropriate match with maximal similarity. We find that the striatum ventral region, caudoputamen, hypothalamus, and pons are maximally similar to their canonical matches in at least 95% of latent spaces. In such cases, the classification approach performs as well as the original approach. While these probability estimates provide a sense of how often an improvement is returned, it is important to note that many regions in this set exhibit a substantial degree of variance over the latent spaces in the ranking of the canonical pairs, for example, the primary auditory area (μ=9, 95% CI [1, 19]), the visual areas (μ=18, 95% CI [7, 29]), and the paraflocculus (μ=16, 95% CI [2, 29]). This is especially apparent for cerebellar regions, indicating some instability in the neural network’s ability to recover these matches.
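
The two quantities used in this evaluation, the rank of the canonical match and the upper-tail probability under a null binomial distribution, can be illustrated with a minimal Python sketch. The similarity values and region names below are hypothetical, and counting a "success" as a latent-space rank no worse than the initial rank is an assumption taken from the wording of the Figure 4B caption, not a statement of the authors' exact procedure.

```python
from math import comb


def canonical_rank(similarities, canonical):
    """Rank (1 = most similar) of the canonical human region for one mouse seed."""
    ordered = sorted(similarities, key=similarities.get, reverse=True)
    return ordered.index(canonical) + 1


def binomial_upper_tail(successes, trials, p=0.5):
    """P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical similarity profiles for a single mouse seed (the pallidum).
initial_space = {"caudate": 0.61, "putamen": 0.64, "pallidum": 0.62, "thalamus": 0.55}
latent_space = {"caudate": 0.72, "putamen": 0.75, "pallidum": 0.91, "thalamus": 0.40}

rank_initial = canonical_rank(initial_space, "pallidum")
rank_latent = canonical_rank(latent_space, "pallidum")

# Assumed success criterion: the latent space does at least as well.
is_success = rank_latent <= rank_initial
```

Repeating the rank comparison over all seed regions yields a success count that can be passed to `binomial_upper_tail` together with the number of regions.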

Recovering canonical neuroanatomical pairs in gene expression space.

(A) Comparison between the ranks of canonical human matches for mouse seed regions between the initial gene expression space and gene expression latent spaces. Points and error bars represent mean and 95% CI with n = 500. Mouse region names are colored according to the Allen Mouse Brain Atlas palette. Binomial likelihood estimate of p=0.64 with 95% CI [0.47, 0.78]. The probability of obtaining at least this many successes under the null binomial distribution, B(36, 0.5), is p=0.033. (B) Proportion of latent spaces resulting in an improvement or null difference compared with the initial gene space, estimated using region-wise logistic regressions. Uncolored voxels correspond to regions with no established canonical human match.

Figure 4—source data 1

Ranks of canonical neuroanatomical pairs for mouse regions in the homologous gene space and all latent spaces, related to Figure 4A.

https://cdn.elifesciences.org/articles/79418/elife-79418-fig4-data1-v2.csv
Figure 4—source data 2

Logistic regression model estimates for mouse regions, related to Figure 4B.

https://cdn.elifesciences.org/articles/79418/elife-79418-fig4-data2-v2.csv

Together, these results demonstrate that the multi-layer perceptron classification approach improves our ability to resolve finer scale mouse-human neuroanatomical matches within the broadly similar regions obtained using the initial gene expression space. By training a classifier to predict the atlas labels in one species, we were able to generate a new common space that amplified the amount of local signal within broadly similar regions while also improving our ability to recover known mouse-human neuroanatomical pairs.

Cortical areas involved in sensorimotor processing show greater transcriptomic similarity than supramodal areas

It is well established that the brains of most, if not all, extant mammalian species follow a common organizational blueprint inherited from an early mammalian ancestor (Kaas, 2011a). A number of cortical subdivisions have consistently been identified in members of many distantly related mammalian species (Krubitzer, 2007) and hypothesized to have been present in the common ancestor of all mammals (Kaas, 2011a). While it is clear that basic sensorimotor cortical regions are found in the majority of mammals, including mice and humans, there is much debate about the extent to which cortical areas involved in supramodal processing are conserved across mammalian taxa. Although some supramodal regions were likely present in the earliest mammals, including some cingulate regions and an orbitofrontal cortex (Kaas, 2011a), since the divergence of mouse and human lineages some 80 million years ago, the primate neocortex has undergone substantial expansion and re-organization (Kaas, 2012). Indeed, when comparing the human neocortex even to primate model species, this is the likely locus of areas that cannot be easily translated between species (Mars et al., 2018b). As a result, it is important to investigate whether our between-species mapping is more successful in somatosensory areas than supramodal areas.

We assessed the similarity between mouse and human isocortical areas using the pairwise correlations in each of the gene expression latent spaces returned from the multi-layer perceptron. For every region in the mouse isocortex, we evaluated the distribution of maximal correlation values over latent spaces (Figure 5A). While the region-wise variance for each isocortical area was large, we found that, on average, sensorimotor regions exhibited higher maximal correlation values than supramodal regions (linear regression with binary predictor: β=-0.042, 95% CI [-0.087, 0.003], t(17)=-1.854, p=0.0812). The mouse primary somatosensory (r=0.96, 95% CI [0.93, 0.98]) and motor (r=0.95, 95% CI [0.92, 0.98]) areas have the highest average maximal correlation values. We additionally examined the distributions of maximal correlation, grouped by cortex type (Figure 5B). To generate these distributions, we computed average maximal correlation values by cortex type in each of the latent spaces. Here too we find that sensorimotor regions are associated with higher maximal correlation values on average compared with supramodal areas (linear mixed-effects regression: β=-0.042, 95% CI [-0.044, -0.040], t(499)=-49.9, p < 2 × 10^-16). These distributions demonstrate that sensorimotor isocortical regions exhibit more similarity overall on the basis of homologous gene expression than do supramodal regions.
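
As an illustration of the maximal correlation measure, the following Python sketch computes, for each mouse region, the highest Pearson correlation to any human region in one latent space, and then averages these maxima over latent spaces. The region names, feature vectors, and function names are hypothetical; in the study each latent space is produced by a trained multi-layer perceptron, which is not reproduced here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length feature vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def maximal_correlations(mouse_regions, human_regions):
    """Map each mouse region to its highest correlation with any human region."""
    return {
        m_name: max(pearson(m_vec, h_vec) for h_vec in human_regions.values())
        for m_name, m_vec in mouse_regions.items()
    }


def mean_over_latent_spaces(per_space_maxima):
    """Average each region's maximal correlation over a list of latent spaces."""
    n_spaces = len(per_space_maxima)
    regions = per_space_maxima[0].keys()
    return {r: sum(space[r] for space in per_space_maxima) / n_spaces for r in regions}


# Hypothetical latent-space representations (one vector per region).
mouse = {"Primary motor area": [1.0, 2.0, 3.0, 4.0]}
human = {
    "precentral gyrus": [2.0, 4.0, 6.0, 8.0],
    "frontal pole": [4.0, 3.0, 2.0, 1.0],
}

maxima = maximal_correlations(mouse, human)
averaged = mean_over_latent_spaces([maxima, maxima])
```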

Figure 5 with 2 supplements
Similarity of mouse-human isocortical regions.

(A) Maximal correlation distributions of mouse isocortical regions. Points and error bars represent mean and 95% CI over n = 500 latent space samples. Linear regression using average maximal correlation values: β=-0.042, 95% CI [-0.087, 0.003], t(17)=-1.854, p=0.0812. (B) Distributions of average maximal correlation for sensorimotor and supramodal isocortical areas in each gene expression latent space. Gray lines correspond to individual latent spaces. Linear mixed-effects regression: β=-0.042, 95% CI [-0.044, -0.040], t(499)=-49.9, p < 2 × 10^-16. (C) Hierarchical clustering of mouse and human isocortical regions based on average latent space correlation values. Mouse regions are annotated as sensorimotor or supramodal. Four clusters were chosen for visualization using the elbow method. (D) Within-cluster sum of squared distances for different numbers of mouse and human isocortical clusters in the average latent space and initial homologous gene space.

Figure 5—source data 1

Maximal correlations of mouse isocortical regions in all latent spaces, related to Figure 5A and B.

https://cdn.elifesciences.org/articles/79418/elife-79418-fig5-data1-v2.csv
Figure 5—source data 2

Correlations between mouse and human isocortical regions in all latent spaces, related to Figure 5C.

https://cdn.elifesciences.org/articles/79418/elife-79418-fig5-data2-v2.csv
Figure 5—source data 3

Scree plot data, related to Figure 5D.

https://cdn.elifesciences.org/articles/79418/elife-79418-fig5-data3-v2.csv

While we found that sensorimotor isocortical areas in the mouse brain were more similar to human brain regions than supramodal areas, the distributions of maximal correlation do not speak to the neuroanatomical patterns of organization for these matches. To understand how the similarity patterns of mouse and human cortical subdivisions were organized, we used hierarchical clustering to cluster mouse and human isocortical regions on the basis of their similarity profiles in the average gene expression latent space (Figure 5C). This allows us to examine the similarity of regions to one another within and across brains at multiple levels simultaneously.

At a high level, we find a striking segregation of the mouse isocortex into one main cluster that corresponds to regions that are primarily engaged in sensorimotor processing and separate clusters of regions that are supramodal. All of the sensorimotor areas cluster together, but two supramodal areas also form part of this cluster: the posterior parietal association areas and the anterior cingulate cortex. The mouse sensorimotor cluster is characterized by high correlation values to human sensorimotor regions like the precentral gyrus, the cuneus, and the postcentral gyrus, as well as low correlation values to the piriform cortex and paraterminal gyrus. At this level of clustering, the remaining mouse supramodal subdivisions form three clusters. The retrosplenial area belongs to its own cluster, while the infralimbic and perirhinal areas cluster together. The similarity profile of the retrosplenial area is more similar to the sensorimotor cluster, and these two clusters are combined in the three-cluster solution. The remaining two mouse clusters are characterized by low correlations to the human cluster containing sensorimotor areas. This is especially true for the cluster containing the infralimbic and perirhinal areas.

On the human side, the four-cluster solution also features a sensorimotor cluster, which contains regions like the pre- and post-central gyri, the cuneus, and Heschl’s gyrus. This cluster exhibits a high degree of similarity to the mouse sensorimotor cluster and low similarity to the mouse supramodal clusters. The isocortical regions not belonging to this cluster are split into three clusters. The majority of these remaining regions form a large cluster that contains areas like the cingulate gyrus and the frontal pole. The parolfactory gyri, parahippocampal gyrus, and temporal pole form a separate cluster that exhibits high correlation to the mouse ectorhinal, orbital, and prelimbic areas. Finally, the paraterminal gyrus and piriform cortex are clustered together and exhibit high similarity to the mouse infralimbic area and low similarity to the mouse sensorimotor cluster.

We additionally ran hierarchical clustering on the isocortical similarity matrix in the original homologous gene space. While the cluster annotations were not substantially different in this space, we observed that the Euclidean distances within and between clusters were smaller compared with the latent space clustering, further confirming that the perceptron classification approach improves the segregation of brain regions in the gene expression common space (Figure 5D).
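
The within-cluster sum of squared distances that underlies the elbow method can be sketched as follows. This is a minimal Python illustration that assumes distances are measured from each point to its cluster centroid; the points and cluster labels are hypothetical and the function name is not taken from the authors' code.

```python
def within_cluster_ss(points, labels):
    """Total squared Euclidean distance of each point to its cluster centroid."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    total = 0.0
    for members in clusters.values():
        # Centroid: coordinate-wise mean of the cluster members.
        centroid = [sum(column) / len(members) for column in zip(*members)]
        total += sum(
            (value - centre) ** 2
            for member in members
            for value, centre in zip(member, centroid)
        )
    return total


# Hypothetical similarity profiles for four regions, in two dimensions.
profiles = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]

wss_one_cluster = within_cluster_ss(profiles, [0, 0, 0, 0])
wss_two_clusters = within_cluster_ss(profiles, [0, 0, 1, 1])
```

Plotting this quantity against the number of clusters gives a scree plot; a space with tighter clusters produces smaller values at a given cluster number.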

Overall, we observe a greater degree of similarity between mouse and human cortical regions involved in basic sensorimotor processing compared with supramodal or association areas. This is in line with the large body of existing research that suggests that sensory and motor areas of the cortex are conserved across the brains of mammals. While sensorimotor areas exhibit a greater degree of similarity than supramodal areas, the neuroanatomical pattern of correspondences obtained using mouse-human homologous genes is not at the level of individual cortical areas. Still, using a clustering approach we identified clear distinctions in the patterns of similarity between sensorimotor and supramodal areas, especially for regions in the mouse isocortex.

Transcriptomic comparison of the mouse and human striatum

We have focused here on comparing mouse and human brain organization using transcriptomic data, with a latent space based on homologous genes as the common space between the two species. To date, common space comparisons between the mouse and human brain have only been performed using functional connectivity (Balsters et al., 2020; Schaeffer et al., 2020). As a case in point, Balsters et al., 2020 compared mouse and human striatal organization using this measure. They found that the nucleus accumbens was highly conserved between mice and humans, and that voxels in the posterior part of the human putamen were significantly similar to the lateral portion of their mouse caudoputamen parcellation. Additionally, they report that 85% of voxels in the human striatum were not significantly similar to any of their mouse striatal seeds, and that 25% of human striatal voxels were significantly dissimilar compared with the mouse. These differences were understandable, as they involved parts of the human striatum that connected to parts of prefrontal cortex that have no known homologue in the mouse (Neubert et al., 2014). However, it is not necessarily the case that between-species differences in connectivity are associated with distinct architectonic or molecular signatures. Therefore, we investigated the patterns of similarity between the mouse and human striata on the basis of gene expression using the neural network latent space representations.

We first identified the striatal regions present in the Allen human dataset: the caudate, the putamen, and the nucleus accumbens. We evaluated the correlation between the microarray samples in these regions and every region in the mouse atlas. Based on these correlation values, we focused our analysis on the four mouse regions that were consistently the most similar across all latent spaces: the caudoputamen, the nucleus accumbens, the fundus of striatum, and the olfactory tubercle. For each of the human striatal regions, we then calculated the average correlation over the samples to each of the mouse targets. We examined the distribution of these average correlation values over the latent spaces (Figure 6A). We find that the human caudate and putamen consistently exhibit the strongest degree of similarity to the mouse caudoputamen. The median of distributions for the caudate-caudoputamen pairs and putamen-caudoputamen pairs is 0.93, with modal values of 0.92 and 0.94, respectively. All latent spaces return correlations greater than 0.85 for caudate-caudoputamen and putamen-caudoputamen pairs. Beyond this expected top match, the caudate and putamen both exhibit high similarity to the nucleus accumbens and the fundus of striatum, with mean correlation values of about 0.80. Neither of these target regions is consistently more similar to the mouse caudoputamen over all latent spaces.

Similarity among mouse and human striatal regions.

(A) Distributions over gene expression latent spaces of region-wise average correlation values for mouse and human striatal pairs. Human regions were chosen based on the Allen Human Brain Atlas ontology. Mouse target regions were chosen to be those with the highest average correlation values. (B) Latent space averaged correlations between voxels in the mouse striatum and human target regions. Target regions were selected based on the highest mean correlation across all striatal voxels. (C) Proportions of latent spaces in which mouse striatal voxels are maximally similar to human target regions.

Figure 6—source data 1

Correlations between human striatal samples and mouse striatal targets in all latent spaces, related to Figure 6A.

https://cdn.elifesciences.org/articles/79418/elife-79418-fig6-data1-v2.csv
Figure 6—source data 2

Average latent space correlations of mouse striatal voxels with human regions, related to Figure 6B.

https://cdn.elifesciences.org/articles/79418/elife-79418-fig6-data2-v2.csv
Figure 6—source data 3

Maximal correlations of mouse striatal voxels in all latent spaces (1 of 3), related to Figure 6C.

https://cdn.elifesciences.org/articles/79418/elife-79418-fig6-data3-v2.csv
Figure 6—source data 4

Maximal correlations of mouse striatal voxels in all latent spaces (2 of 3), related to Figure 6C.

https://cdn.elifesciences.org/articles/79418/elife-79418-fig6-data4-v2.csv
Figure 6—source data 5

Maximal correlations of mouse striatal voxels in all latent spaces (3 of 3), related to Figure 6C.

https://cdn.elifesciences.org/articles/79418/elife-79418-fig6-data5-v2.csv

While the similarity of the caudate and the putamen to the caudoputamen is unsurprising, the story is not as clear for the human nucleus accumbens. We find that the variance in correlation calculated over all mouse targets is much lower (σ=0.04) compared with the equivalent variances for the caudate (σ=0.08) and putamen (σ=0.08), indicating less specificity to any one mouse striatal target. In particular, the human nucleus accumbens is not as specifically similar to the mouse nucleus accumbens as the caudate and putamen are to the caudoputamen. The mouse target distributions are right-shifted compared with those for the caudate and putamen, with median values of 0.89, 0.86, and 0.87 for the mouse nucleus accumbens, caudoputamen, and fundus of striatum, respectively. The human accumbens also exhibits a high degree of similarity to the mouse olfactory tubercle, the distribution of which is also right-shifted compared with the caudate and putamen.

Given the high correlation of the human caudate and putamen to the mouse caudoputamen, as well as the finding reported by Balsters et al. about the similarity of the lateral caudoputamen to the putamen, we were curious as to whether we could identify sub-regional patterns of similarity in the caudoputamen and other striatal regions using these gene expression data. To probe this question, we first examined the average latent space correlation between each voxel in the mouse striatum and every region in the human atlas. We created brain maps for the human regions that exhibited the highest mean correlation values, averaged over mouse striatal voxels: the caudate, the putamen, the nucleus accumbens, and the septal nuclei (Figure 6B). We find that voxels in the caudoputamen exhibit a homogeneous pattern of similarity to both the caudate and the putamen. On average, voxels in the caudoputamen have a correlation of 0.92 to the caudate and 0.91 to the putamen, with standard deviations of 0.05 and 0.06, respectively. The caudate and putamen are associated with correlations of at least 0.90 in 79% and 73% of caudoputamen voxels, respectively. A number of voxels are also highly similar to the human nucleus accumbens, with an average correlation value of 0.86 and 30% of voxels returning a correlation of at least 0.90. The caudoputamen voxels most similar to the nucleus accumbens lie in the ventral-rostral part of the region. Of course, voxels in the mouse nucleus accumbens are also highly similar to the human nucleus accumbens, with an average of 0.89 and standard deviation of 0.06. While the human nucleus accumbens is the most strongly correlated region, a number of voxels also exhibit reasonably strong correlations to the substantia innominata and the amygdala. Indeed, 88% of voxels in the accumbens are correlated at a value of 0.7 or higher to the amygdala, and 57% of voxels pass this threshold for the substantia innominata.

We additionally examined the proportion of latent spaces in which each voxel in the mouse striatum was maximally similar to the human target regions (Figure 6C). As expected, we find that voxels in the caudoputamen are most often maximally similar to the human caudate and putamen, with 77% of voxels in the caudoputamen being maximally similar to the caudate or putamen in at least 95% of latent spaces, and 59% of voxels being maximally similar to one of those targets in all latent spaces. Interestingly, we observe the emergence of a continuous bilateral pattern of specificity to the caudate and putamen, with voxels in the rostral and lateral-caudal parts of the caudoputamen being maximally similar to the caudate in a high proportion of latent spaces. In contrast, while voxels in the medial-rostral part of the caudoputamen are often maximally similar to the caudate, they are also maximally similar to the putamen in some latent spaces. This map highlights subtle differences in the similarity between caudoputamen voxels and the caudate or putamen. While this pattern distinguishes the two regions on the basis of which is the top match, individual voxels have very similar correlation values to the targets (Figure 6B), with a mean difference in correlation of only 0.01. Beyond the caudoputamen, we find that the accumbens and olfactory tubercle in the mouse are consistently similar to the human nucleus accumbens, with 84% of mouse accumbens voxels and 75% of olfactory tubercle voxels having the human accumbens as their top match in at least 80% of latent spaces. For those voxels below this threshold, the human regions that are most often the top match are the amygdala and the piriform cortex.
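
The per-voxel proportions shown in Figure 6C can be illustrated with a short Python sketch: for a single voxel, it counts how often each human target is the top match across latent spaces. The correlation values are hypothetical, and only four latent spaces are shown instead of the 500 used in the study.

```python
from collections import Counter


def top_match_proportions(correlations_by_space):
    """Fraction of latent spaces in which each target is a voxel's top match.

    `correlations_by_space` holds one dict per latent space, mapping each
    human target region to its correlation with the voxel.
    """
    winners = Counter(max(space, key=space.get) for space in correlations_by_space)
    n_spaces = len(correlations_by_space)
    return {target: count / n_spaces for target, count in winners.items()}


# Hypothetical correlations for one caudoputamen voxel in four latent spaces.
voxel_correlations = [
    {"caudate": 0.93, "putamen": 0.92, "nucleus accumbens": 0.85},
    {"caudate": 0.91, "putamen": 0.92, "nucleus accumbens": 0.84},
    {"caudate": 0.94, "putamen": 0.93, "nucleus accumbens": 0.86},
    {"caudate": 0.92, "putamen": 0.91, "nucleus accumbens": 0.87},
]

proportions = top_match_proportions(voxel_correlations)
```

In this toy example the top match alternates between the caudate and the putamen even though their correlation values differ by only about 0.01, which mirrors the distinction drawn above between top-match maps and the underlying correlation values.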

Overall, we observe a strong association between the mouse caudoputamen and both the human caudate and putamen. While we find a subtle pattern of specificity to either region among voxels in the caudoputamen on the basis of maximal similarity, the high degree of similarity in the correlation values to each region suggests that the majority of voxels in the caudoputamen are equally similar to the caudate and the putamen on the basis of the expression of mouse-human homologous genes. We also find that the nucleus accumbens is well conserved across species. However, the region also exhibits patterns of similarity that go beyond the simple one-to-one match. The human accumbens features similar correlation values to the mouse caudoputamen and fundus of striatum, in addition to the accumbens proper, with no sharp distinction between these regions. It also exhibits a larger degree of similarity to the mouse olfactory tubercle. This is also seen in the mouse striatum, where voxels in the accumbens and the olfactory tubercle map onto the human accumbens.

Discussion

We have demonstrated how spatial transcriptomic patterns of homologous genes can be used to make quantitative comparisons between the mouse and human brain. We showed that using homologous genes as a common space allows one to easily identify coarse similarities in brain structures across species, but that more fine-scaled parcellations, such as at the level of cortical areas, are more complex. Despite this limitation, the approach still allows for a formal assessment of different patterns of between-species similarity in primary compared to supramodal regions, identifications of distinct clusters of cortical territories across species, and comparison of between-species similarities at the transcriptomic level to those observed using other modalities. We will discuss our observations in the context of the importance of the mouse as a model for human neuroscience below.

The abundance of neuroscience research performed using mice has resulted in a wealth of knowledge about the mouse brain. In the preclinical setting, mouse models are utilized with the intention of better understanding human neuropathology. For instance, in the context of autism spectrum disorders, a plethora of studies using mouse models have reported on the neurobiological and neuroanatomical phenotypes that arise from mutations at specific genetic loci (Gompers et al., 2017; Horev et al., 2011; Pagani et al., 2021). It is common for researchers involved in translational neuroscience to rely on findings of this kind to make inferences about the human disorder. The typical approach, which is to identify rough post-hoc correspondences between neuroanatomical ontologies, is not particularly comprehensive and is subject to confirmation bias. While it may be a reasonable starting point for comparison, the true correspondence between the mouse and human brain is likely more complicated given the evolutionary distance between the two species. Although overall patterns of brain organization, including the general pattern of neocortical organization, are similar across all mammals, substantial differences are evident (Ventura-Antunes et al., 2013). To make matters worse, researchers from the different neuroscientific traditions often use distinct terminology, further complicating detailed information exchange. To address these problems, we sought to establish a first quantitative whole-brain comparison between the two species.

The expression of homologous genes provides an elegant way to define a common space for quantitative cross-species comparisons since it relies on homology at a deep molecular biological level. The approach is not without limitations, however. First, the acquisition of whole-brain transcriptomic data is labour-intensive, time-consuming, and invasive. These data sets cannot be generated easily, especially in the human, in which the process depends on the availability of post-mortem samples. As a result, the effective sample sizes are extremely limited in this domain. For instance, in the Allen mouse coronal in-situ hybridization data set used here, the brain-wide expression of each gene is sampled only once (barring a few exceptions). This constrains the types of analyses that are possible (e.g. null hypothesis significance testing) and largely limits the availability of replication data sets. That being said, new technologies, such as spatial transcriptomics, are gradually making it easier to acquire brain-wide gene expression data in less time and at lower cost (Ortiz et al., 2020; Ståhl et al., 2016; Vickovic et al., 2019). Second, the approach of relying on all available genes is subject to noise. To address this issue, Myers, 2017 used a method of gene set selection to attempt to improve the correspondence between established mouse-human homologies. While this led to an improvement, it was only at the level of coarsely defined regions (e.g. cortex-cortex). Our approach, therefore, was to use supervised machine learning to create a latent common space based on combinations of homologous genes that can delineate areas within a single species.

This latent common space approach led to a substantial improvement in specificity of between-species comparisons. Nevertheless, it is evident that the first major distinction in gene expression patterns within a species, and the easiest identification of similarity across species, are at the coarse anatomical level of the major subdivisions of the vertebrate brain, such as the isocortex, cerebellar hemispheres and nuclei, and brain stem. All of these territories were present in the ancestral vertebrate brain (Striedter and Northcutt, 2020), and the ability to detect conserved transcriptomic signatures at this level is not surprising. Within such structures, such as the isocortex, our ability to make simple one-to-one correspondences decreased. This is partly because areas within a coarse structure have more similar transcriptomic profiles, but also likely due to the fact that a single area in one brain does not have a single correspondent in another, larger brain. In other words, regions in the brains of related species may exhibit one-to-many or many-to-many mappings. In our study, we found greater cross-species similarity between isocortical areas associated with sensorimotor processing than areas in supramodal isocortex. Primary areas, including the sensorimotor areas, are present in all mammals studied to date and likely part of the common ancestors of all mammals (Kaas, 2011a; Krubitzer, 2007). Although this common ancestor likely also had non-primary areas, it cannot be denied that association cortex expanded dramatically in primates and especially so in the human brain (Chaplin et al., 2013; Mars et al., 2016b). Again, the pattern found here of greater similarity in more conserved areas might reflect this evolutionary history. In that context it is interesting to note that some non-primary areas thought to be present in the common mammalian ancestor, such as cingulate and orbitofrontal cortex (Kaas, 2011a) showed relatively high correlation to human areas.

An advantage of the approach presented here is that it can, in principle, be applied to any aspect of brain organization. Beyond simply establishing whether areas are similar across species in a particular common space, comparing the results across common spaces established using different types of neuronal data can inform on which larger principles of organization are similar across brains (Eichert et al., 2020). This is illustrated here by the results of our striatal analysis. We found high similarity between the human caudate and putamen and mouse caudoputamen, with little differentiation within these regions in a single species. In contrast, Balsters et al., 2020 demonstrated that human caudoputamen contains a distinct pattern of connectivity. At first sight, one could argue the results are in contrast. However, evolutionarily speaking, it is quite probable that an overall similar transcriptomic signature of the striatum can be accompanied by a distinct connectivity pattern to areas of the cortex present in only one of the two species. Indeed, this speaks to the different types of similarity that can be studied, depending on which aspect of brain organization one is interested in. Although the human brain is much larger than the mouse brain and contains a number of cortical territories that have no homologue in the mouse brain (Kaas, 2011b; Rudebeck and Izquierdo, 2022), the similarity in transcriptomic signature means that translation between the species is valid in many contexts. The supervised learning approach also provides interesting avenues for future research. For instance, rather than classifying all regions in the brain at once, separate models could be trained to classify regions belonging to different sub-trees in the neuroanatomical hierarchy (see Figure 5—figure supplement 1 and Figure 5—figure supplement 2).
This type of approach requires more exploration, however, such as where to split the hierarchy, how to optimize the classifiers for each sub-tree, and how to stitch all this information back together at the end in order to make comparisons between different sub-trees.

The power of a formal understanding of similarities and differences between brains at different levels of organization is evident. In fundamental neuroscience, it will help translate results from data types that cannot be obtained in humans to the human brain (Barron et al., 2021). In translational neuroscience, it will, in a negative sense, help establish the limits of the translational paradigm by showing which aspects of the human brain cannot be understood using the model species (Liu et al., 2021). In a positive sense, it will also help by establishing and improving our understanding of the many aspects in which the model and human brain do concur (Mandino et al., 2021). More ambitious still, it can provide a way in which highly diverse manifestations of certain disease syndromes (e.g. autism spectrum disorder) (Grzadzinski et al., 2013; Simonoff et al., 2008) and the availability of many distinct model strains (Ellegood et al., 2015), each hypothesized to capture a distinct aspect of a multi-dimensional clinical syndrome, can be related to one another. Ultimately, we believe that using the mapping of homologous gene expression between species can be an important part of building a transform that maps information obtained using mice to humans and vice versa.

Materials and methods

Mouse gene expression data

We used the adult mouse whole-brain in-situ hybridization data sets from the AMBA (Lein et al., 2007). Specifically, we used 3D expression grid data, that is, expression data aligned to the Allen Mouse Brain Common Coordinate Framework (CCFv3) (Wang et al., 2020) and summarized under a grid at a resolution of 200 μm. We downloaded the gene expression ‘energy’ volumes from both the coronal and sagittal in-situ hybridization experiments as a sequence of 32-bit float values using the Allen Institute’s API (http://help.brain-map.org/display/api/Downloading+3-D+Expression+Grid+Data). These volumes were subsequently reshaped into 3D images in the Medical Image NetCDF (MINC) format. Origin, extents, and spacing were defined such that the image was RAS-oriented, with the origin at the point where the anterior commissure crosses the midline. The MINC images from the coronal and sagittal data sets were then processed separately using the Python programming language. The sagittal data set was first filtered to keep only those genes that were also present in the coronal set. Images were imported using the pyminc package, then masked and reshaped to form an experiment-by-voxel expression matrix. We pre-processed these data by first applying a log2 transformation for consistency with the human data set. For those genes associated with more than one in-situ hybridization experiment, we averaged the expression of each voxel across the experiments. We subsequently filtered out genes for which more than 20% of voxels contained missing values. Finally, we applied a K-nearest neighbours algorithm to impute the remaining missing values. The result of this pre-processing pipeline was a gene-by-voxel expression matrix with 3958 genes and 61,315 voxels for the coronal data set and a matrix with 3619 genes and 26,317 voxels for the sagittal data set.
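
The first three pre-processing steps described above (log2 transformation, averaging over repeated experiments, and filtering genes by their proportion of missing voxels) can be sketched in plain Python as follows. This is an illustrative sketch only: the gene names and values are hypothetical, missing values are represented as `None`, expression values are assumed to be strictly positive, and the K-nearest neighbours imputation step is omitted.

```python
from math import log2


def average_experiments(experiments):
    """Average log2 expression per gene and voxel, skipping missing values.

    `experiments` is a list of (gene, values) pairs, one pair per experiment.
    Values are assumed strictly positive; None marks a missing voxel.
    """
    grouped = {}
    for gene, values in experiments:
        logged = [None if v is None else log2(v) for v in values]
        grouped.setdefault(gene, []).append(logged)

    averaged = {}
    for gene, rows in grouped.items():
        averaged[gene] = []
        for voxel_values in zip(*rows):
            present = [v for v in voxel_values if v is not None]
            averaged[gene].append(sum(present) / len(present) if present else None)
    return averaged


def drop_sparse_genes(expression, max_missing=0.2):
    """Keep genes whose fraction of missing voxels does not exceed max_missing."""
    return {
        gene: values
        for gene, values in expression.items()
        if sum(v is None for v in values) / len(values) <= max_missing
    }


# Hypothetical experiment-by-voxel data: GeneA was measured twice.
experiments = [
    ("GeneA", [1.0, 2.0, 4.0, 8.0, 16.0]),
    ("GeneA", [4.0, None, 4.0, 8.0, 16.0]),
    ("GeneB", [None, None, 2.0, 2.0, 2.0]),
]

gene_by_voxel = average_experiments(experiments)
filtered = drop_sparse_genes(gene_by_voxel)
```

Here `GeneB` is missing in 40% of voxels and is therefore removed, while the single missing voxel in the second `GeneA` experiment is covered by the first.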

Human gene expression data

Human gene expression data was obtained from the AHBA (Hawrylycz et al., 2012). The data were downloaded from the Allen Institute’s API (http://api.brain-map.org) and pre-processed using the abagen package in Python (https://abagen.readthedocs.io/en/stable/) (Arnatkeviciute et al., 2019; Hawrylycz et al., 2012; Markello et al., 2021). We used the microarray data from the brains of all six donors, each of which contains log2 expression values for 58,692 gene probes across numerous tissue samples. The pre-processing pipeline included probe selection using differential stability on data from all donors and intensity-based filtering of probes at a threshold of 0.5. The samples and genes were additionally normalized for each donor individually using a scaled robust sigmoid function. In practice, this pipeline was implemented using the get_samples_in_mask function from the abagen package. The remaining parameters were set to their default values. The output of the pre-processing pipeline was a gene-by-sample expression matrix with 15,627 genes and 3702 samples across all donors.
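For readers unfamiliar with the scaled robust sigmoid, the transform can be sketched in a few lines. The version below follows the commonly cited definition (a sigmoid centred on the median and scaled by the interquartile range divided by 1.35, followed by rescaling to the unit interval). It is an illustrative assumption about the form of the function, not a copy of the abagen implementation, whose documentation remains authoritative.

```python
import numpy as np

def scaled_robust_sigmoid(x):
    """Illustrative scaled robust sigmoid for a 1D array of expression values."""
    x = np.asarray(x, dtype=float)
    median = np.median(x)
    q75, q25 = np.percentile(x, [75, 25])
    iqr_norm = (q75 - q25) / 1.35  # normalized interquartile range
    sig = 1.0 / (1.0 + np.exp(-(x - median) / iqr_norm))
    # Rescale to the unit interval.
    return (sig - sig.min()) / (sig.max() - sig.min())

# Toy values with one extreme outlier.
values = np.array([1.0, 2.0, 3.0, 4.0, 100.0])
scaled = scaled_robust_sigmoid(values)
```

Because the centre and scale are the median and interquartile range, the outlier saturates at 1 without compressing the remaining values towards 0, which is the motivation for using a robust transform on microarray data.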

Mouse atlases

We used a version of the DSURQE atlas from the Mouse Imaging Centre (Dorr et al., 2008; Qiu et al., 2018; Richards et al., 2011; Steadman et al., 2014; Ullmann et al., 2013), modified using the AMBA hierarchical ontology, which was downloaded from the Allen Institute’s API. The labels of the DSURQE atlas correspond to the leaf node regions in the AMBA ontology, which allowed us to use the hierarchical neuroanatomical tree to aggregate and prune the atlas labels to the desired level of granularity. For the purposes of our analyses, we removed white matter and ventricular regions entirely. The remaining gray matter regions were aggregated up the hierarchy so that the majority of resulting labels contained enough voxels to be classified appropriately by the multi-layer perceptron. In doing so, we maintained approximately the same level of tree depth within a broad region (e.g. cerebellar regions were chosen at the same level of granularity). This resulted in a mouse atlas with 67 gray matter regions. We additionally generated an atlas with 11 broader regions for visualization and annotation purposes.
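The aggregation and pruning of atlas labels along a hierarchy can be illustrated with a small sketch. The function name, the toy tree, and the convention of returning None for pruned labels are assumptions made for illustration; the actual tree depth chosen for each broad region is not reproduced here.

```python
def aggregate_labels(parent, leaf_labels, keep):
    """Map each leaf label to its nearest ancestor (or itself) found in `keep`.

    parent: dict mapping each node to its parent (the root maps to None).
    keep: set of nodes retained at the desired level of granularity.
    Leaves with no retained ancestor are pruned (mapped to None).
    """
    mapping = {}
    for leaf in leaf_labels:
        node = leaf
        while node is not None and node not in keep:
            node = parent.get(node)
        mapping[leaf] = node
    return mapping

# Toy hierarchy (hypothetical subset of an ontology).
toy_parent = {
    "root": None,
    "Isocortex": "root",
    "MOp": "Isocortex",
    "MOp1": "MOp",
    "MOp5": "MOp",
    "fiber tracts": "root",
    "cc": "fiber tracts",
}
toy_mapping = aggregate_labels(toy_parent, ["MOp1", "MOp5", "cc"], keep={"MOp"})
```

Here the two cortical layers are merged into their parent region, and the white matter label is pruned because none of its ancestors is retained.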

Human atlases

We used the hierarchical ontology from the AHBA, which we obtained using the Allen Institute’s API. We aggregated and pruned the neuroanatomical hierarchy to correspond roughly to the level of granularity obtained in our mouse atlas, resulting in 88 human brain regions. We additionally generated a set of 16 broad regions for visualization and annotation. White matter and ventricular regions were omitted entirely.

Expression matrices and similarity matrices

We created the mouse and human gene-by-region expression matrices from the mouse gene-by-voxel and human gene-by-sample expression matrices. First, we intersected the gene sets in these matrices with a list of 3331 homologous genes obtained from the NCBI HomoloGene database (NCBI Resource Coordinators, 2018), resulting in 2835 homologous genes present in both the mouse and human expression matrices. We then annotated each of the human samples with one of the 88 human atlas regions, and each of the mouse voxels with one of the 67 mouse atlas regions, discarding white matter and ventricular entries in the process. These labeled expression matrices were subsequently normalized as follows: For each matrix, we first normalized each voxel/sample across all homologous genes using a z-scoring procedure to create a normalized gene expression signature for each voxel/sample. We then centered the distribution of expression signatures in gene space by subtracting the mean expression of each homologous gene over all voxels/samples. Finally, we generated the gene-by-region expression matrices by averaging the expression of every gene over the voxels/samples corresponding to each atlas region. Using these expression matrices, we generated the mouse-human similarity matrix by computing the Pearson correlation coefficient between all pairs of mouse and human regions.
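The normalization, regional averaging, and correlation steps can be sketched as follows. This is a minimal illustration on random data; the function names and matrix orientation (voxels or samples in rows, genes in columns) are assumptions.

```python
import numpy as np

def normalize(expr):
    """expr: (voxels or samples) x genes."""
    # z-score each voxel/sample across genes.
    z = (expr - expr.mean(axis=1, keepdims=True)) / expr.std(axis=1, keepdims=True)
    # Centre each gene across all voxels/samples.
    return z - z.mean(axis=0, keepdims=True)

def regional_means(expr, labels):
    """Average the rows of `expr` within each atlas region."""
    labels = np.asarray(labels)
    regions = sorted(set(labels.tolist()))
    return regions, np.vstack([expr[labels == r].mean(axis=0) for r in regions])

def similarity(mouse_regions, human_regions):
    """Pearson correlation between every mouse region and every human region."""
    m = mouse_regions - mouse_regions.mean(axis=1, keepdims=True)
    h = human_regions - human_regions.mean(axis=1, keepdims=True)
    m = m / np.linalg.norm(m, axis=1, keepdims=True)
    h = h / np.linalg.norm(h, axis=1, keepdims=True)
    return m @ h.T

# Toy data: 30 mouse voxels and 20 human samples over 8 shared genes.
rng = np.random.default_rng(0)
mouse = normalize(rng.normal(size=(30, 8)))
human = normalize(rng.normal(size=(20, 8)))
_, mouse_reg = regional_means(mouse, ["a"] * 10 + ["b"] * 10 + ["c"] * 10)
_, human_reg = regional_means(human, ["x"] * 10 + ["y"] * 10)
sim = similarity(mouse_reg, human_reg)  # mouse regions x human regions
```

Each row of the resulting matrix is the similarity profile of one mouse region across all human regions.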

Gene enrichment analysis

We ran a gene enrichment analysis on the set of homologous genes obtained from the NCBI HomoloGene database. We first downloaded Gene Ontology data for biological process related modules from the Bader Lab at the University of Toronto (http://baderlab.org/GeneSets). These data include a gene set of 16,563 genes and a module set of 15,757 biological process modules. Every module is associated with a subset of genes from the full gene set. For each module, we used a hypergeometric test to evaluate whether the homologous gene set was over-represented in the module subset, compared with the full gene set. The resulting p-values were adjusted for multiple comparisons using the false-discovery rate method (Benjamini and Hochberg, 1995). A total of 938 modules were found to be significant at a threshold of 0.001. The surviving modules were ordered according to their p-values and written out to a comma-separated values data file (Supplementary file 1). This analysis was carried out using the tmod package in the R programming language.
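The over-representation test and the false-discovery rate adjustment can be illustrated with a short sketch. The analysis itself was run with the tmod package in R; the Python functions below are stand-ins written for illustration, and the toy numbers are not taken from the study.

```python
from math import comb

def hypergeom_sf(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric(N genes in total, K genes of interest,
    n genes in the module). k is the observed overlap."""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)

def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running = min(running, pvals[i] * m / rank)
        adjusted[i] = running
    return adjusted

# Toy example: 10 genes in total, 3 of interest, a module of 4 genes
# that contains all 3 genes of interest.
p_toy = hypergeom_sf(3, N=10, K=3, n=4)
adj_toy = benjamini_hochberg([0.01, 0.04, 0.03])
```

A module would then be retained if its adjusted p-value falls below the chosen threshold (0.001 in the analysis above).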

Multi-layer perceptron classification and latent space

To improve the resolution of mouse-human neuroanatomical matches, we used a supervised learning approach, wherein we trained a multi-layer perceptron neural network to classify 67 mouse atlas regions from the expression values of 2835 homologous genes. We chose a model architecture in which each layer of the network was fully connected to previous and subsequent layers. To optimize the hyperparameters, we implemented an ad hoc cross-validation procedure that took into account the fact that the majority of genes in the coronal AMBA data set are sampled only once over the entire mouse brain. The procedure involved a combination of the coronal and sagittal in-situ hybridization data sets. For the sagittal data set, we used the expression matrix described above. For the coronal data set, however, we used a modified version of the expression matrix. This matrix was generated using the pipeline described above with the following modifications: (1) We applied the unilateral brain mask from the sagittal data set to the coronal images in order to have the same spatial extent, and (2) we did not aggregate the expression of multiple in-situ hybridization experiments for those genes in the coronal set that were measured more than once. We then filtered these experiment-by-voxel expression matrices according to the list of mouse-human homologous genes, as well as the human sample expression matrix. We also annotated the voxels in each of the expression matrices with one of the 67 regions in the mouse atlas. Our validation procedure then involved iterative construction of training and validation sets by sampling gene experiments from either the coronal or sagittal matrices. For every gene in the homologous set, we first determined whether that gene was associated with more than one experiment in the coronal matrix. If this was the case, we randomly sampled one of those experiments for the training set and one of the remaining experiments for the validation set. If the gene was associated with only one experiment in the coronal set, we randomly sampled either the coronal or sagittal experiment for the training set and the other for the validation set. Once the training and validation sets were generated, they were normalized using the procedure described above. We then optimized the neural network using the training set and evaluated its performance on the validation set. We repeated this construction, training, and validation procedure five times for every combination of hyperparameters.
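The construction of one training/validation split can be sketched as below. The function name, the dictionary layout, and the experiment identifiers are hypothetical; skipping genes that have a single coronal experiment and no sagittal experiment is also an assumption, since the text does not describe that case.

```python
import random

def split_experiments(coronal, sagittal, rng):
    """coronal: dict gene -> list of coronal experiment ids.
    sagittal: dict gene -> sagittal experiment id.
    Returns (train, valid) dicts mapping gene -> experiment id."""
    train, valid = {}, {}
    for gene, experiments in coronal.items():
        if len(experiments) > 1:
            # Two distinct coronal experiments: one for training, one for validation.
            train[gene], valid[gene] = rng.sample(experiments, 2)
        elif gene in sagittal:
            # Single coronal experiment: randomly assign coronal/sagittal to the two sets.
            pair = [experiments[0], sagittal[gene]]
            rng.shuffle(pair)
            train[gene], valid[gene] = pair
    return train, valid

# Toy inputs with hypothetical experiment ids.
toy_coronal = {"g1": ["c1", "c2", "c3"], "g2": ["c4"]}
toy_sagittal = {"g1": "s1", "g2": "s2"}
toy_train, toy_valid = split_experiments(toy_coronal, toy_sagittal, random.Random(0))
```

Repeating this call with different random states yields the independent training and validation sets used for each repetition of the procedure.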

Using this validation approach, we tuned the number of hidden layers in the network, the number of hidden units per hidden layer, the amount of weight decay, the maximum learning rate, and the optimization method. The values we sampled were as follows:

  • Number of hidden layers: 3, 4, 5

  • Number of hidden units: 200, 500, 1000

  • Weight decay: 0, 10^-6, 10^-3

  • Maximum learning rate: 10^-6, 10^-5, 10^-4, 10^-3, 10^-2, 10^-1

  • Optimizer: SGD, AdamW
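The combinations implied by this list can be enumerated as follows. This sketch assumes an exhaustive grid over the listed values, which the text suggests but does not state explicitly; the dictionary keys are hypothetical names.

```python
from itertools import product

grid = {
    "n_hidden_layers": [3, 4, 5],
    "n_hidden_units": [200, 500, 1000],
    "weight_decay": [0.0, 1e-6, 1e-3],
    "max_lr": [1e-6, 1e-5, 1e-4, 1e-3, 1e-2, 1e-1],
    "optimizer": ["SGD", "AdamW"],
}

# One dictionary of settings per combination of hyperparameter values.
combos = [dict(zip(grid, values)) for values in product(*grid.values())]
```

Each entry in `combos` would be evaluated with the repeated training and validation procedure described above.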

All models were trained over 200 epochs using a one-cycle learning rate policy. The activation function used in the forward pass was the rectified linear unit, and the loss function was the negative log-likelihood loss. We found that the best-performing model had 3 hidden layers, 200 neurons per layer, and no weight decay. It was optimized using the AdamW optimization algorithm (Loshchilov and Hutter, 2019) with a maximum learning rate of 10^-5. This model returned an average loss of 0.215 on the training sets and of 1.224 on the validation sets. The average training classification accuracy was 0.936, and the validation accuracy was 0.597.

Using the optimal hyperparameters, we trained the multi-layer perceptron on the full bilateral coronal voxel-wise expression matrix. We used the trained network to generate the latent gene expression space. To extract the appropriate transformation, we removed the predictive output layer and soft-max transformation from the network architecture. The resulting architecture returns the 200 hidden units in the third hidden layer as the output of the model. To create the latent space data representations, we applied this network to the mouse and human regional and voxel-/sample-wise expression matrices. The resulting matrices have 200 columns corresponding to the hidden units and rows corresponding to the number of regions, voxels, or samples in the mouse and human matrices. This process was repeated 500 times to generate 500 latent spaces.
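The idea of truncating the classifier to obtain a latent representation can be illustrated in plain NumPy. The models in the study were implemented with PyTorch and skorch; the sketch below uses random weights in place of a trained network, and it assumes that the latent representation is taken after the rectified linear unit of the third hidden layer and that the output is a log-soft-max. Both are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_genes, n_hidden, n_regions = 2835, 200, 67

# Random weights stand in for a trained network (assumption for illustration).
sizes = [n_genes, n_hidden, n_hidden, n_hidden, n_regions]
weights = [rng.normal(0.0, 0.01, size=(a, b)) for a, b in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(b) for b in sizes[1:]]

def hidden_forward(x):
    """Pass inputs through the three hidden layers only (the latent space)."""
    for w, b in zip(weights[:-1], biases[:-1]):
        x = np.maximum(x @ w + b, 0.0)  # fully connected layer + ReLU
    return x

def classify(x):
    """Full network: hidden layers, output layer, and log-soft-max."""
    logits = hidden_forward(x) @ weights[-1] + biases[-1]
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

# Toy regional expression matrices (regions x homologous genes).
mouse_regions = rng.normal(size=(67, n_genes))
human_regions = rng.normal(size=(88, n_genes))
mouse_latent = hidden_forward(mouse_regions)  # 67 x 200
human_latent = hidden_forward(human_regions)  # 88 x 200
```

Because mouse and human inputs share the same homologous gene columns, the network trained on mouse data can be applied unchanged to the human matrices, placing both species in the same 200-dimensional space.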

These models were implemented in Python using PyTorch (https://pytorch.org) and the skorch package (https://skorch.readthedocs.io/en/stable/).

Multi-layer perceptron feature importance

We used integrated gradients to evaluate the contribution of different genes in the classification of mouse atlas labels. Since the homologous gene inputs contribute to the classification of distinct labels in different ways, we examined the feature attributions for three regions: the caudoputamen, the primary motor area, and the infralimbic area. Using the trained multi-layer perceptron, we computed integrated gradients for each of these three regions. We then averaged the values over all input voxels for each gene, resulting in a vector of gene attributions for each of the three example regions. This process was repeated for 200 training runs of the neural network. We then averaged the gene importance vectors of each region over all training runs to get a summary of gene importance. This process was implemented using the IntegratedGradients function from the captum package in Python (https://captum.ai/).
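For intuition, integrated gradients can be written out by hand for a small network. The sketch below is not the captum implementation used in the analysis: it uses a toy one-hidden-layer network with random weights, an all-zero baseline, and a midpoint-rule approximation of the path integral, all of which are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_genes, n_hidden = 6, 4
W1 = rng.normal(size=(n_genes, n_hidden))
b1 = rng.normal(size=n_hidden)
w2 = rng.normal(size=n_hidden)

def score(x):
    """Toy network output for one target label."""
    return np.maximum(x @ W1 + b1, 0.0) @ w2

def grad(x):
    """Analytic gradient of `score` with respect to the inputs (1D or 2D x)."""
    active = (x @ W1 + b1 > 0).astype(float)
    return (active * w2) @ W1.T

def integrated_gradients(x, baseline, steps=2000):
    """Midpoint-rule approximation of the integrated gradients attribution."""
    alphas = (np.arange(steps) + 0.5) / steps
    path = baseline + alphas[:, None] * (x - baseline)
    return (x - baseline) * grad(path).mean(axis=0)

# Attributions per voxel, then averaged over voxels to give one value per gene.
voxels = rng.normal(size=(5, n_genes))
baseline = np.zeros(n_genes)
attributions = np.array([integrated_gradients(v, baseline) for v in voxels])
gene_importance = attributions.mean(axis=0)
```

A useful sanity check is the completeness property: for each voxel, the attributions sum approximately to the difference between the network output at the input and at the baseline.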

Statistical modeling

To quantify the improvement in the mouse-human matches when using the latent spaces versus the original gene expression space (Figures 3 and 4), we used a set of logistic regression models to estimate the probability that the rank difference was less than or equal to zero. To estimate the overall improvement due to the latent spaces, we created a binary variable to encode whether the average rank difference over latent spaces for each region met the success criterion. This variable was then used as our target in a logistic regression with no regressors. Once the model was fit, we applied the logistic function to the intercept parameter estimate to get the corresponding estimate for the Bernoulli probability, p_B. This transformation was also applied to the bounds on the variance estimate for the intercept to get the corresponding confidence interval. Using the estimated Bernoulli probability, we calculated the corresponding number of successes, k. We then evaluated the probability of obtaining at least k successful outcomes under the null binomial distribution, B(n, 0.5). The parameter n was taken to be the number of brain regions under consideration. We additionally applied this approach on a region-wise basis to evaluate the likelihood of a region seeing improvement in the latent spaces. In this case, the null distribution was B(500, 0.5) for each region. The resulting p-values were adjusted for multiple comparisons using the false-discovery rate method (Benjamini and Hochberg, 1995). These models were implemented using the glm function from the stats package in the R programming language.
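The logic of the intercept-only model and the binomial test can be sketched as follows. The analysis itself used glm in R; the Python functions below are illustrative stand-ins, the Wald-type interval is an assumption about how the bounds were obtained, and the counts in the example are hypothetical.

```python
from math import comb, exp, log, sqrt

def logistic(x):
    return 1.0 / (1.0 + exp(-x))

def intercept_only_logit(successes, n, z=1.96):
    """Maximum-likelihood fit of a logistic regression with no regressors.
    Requires 0 < successes < n. Returns (p_B, (lower, upper))."""
    p = successes / n
    intercept = log(p / (1.0 - p))          # MLE of the intercept
    se = 1.0 / sqrt(n * p * (1.0 - p))      # standard error of the intercept
    bounds = (logistic(intercept - z * se), logistic(intercept + z * se))
    return logistic(intercept), bounds

def binomial_tail(k, n, p=0.5):
    """P(X >= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))

# Hypothetical example: 50 of 67 regions meet the success criterion.
p_b, (ci_low, ci_high) = intercept_only_logit(50, 67)
p_value = binomial_tail(50, 67)
```

With no regressors, the fitted probability reduces to the observed proportion of successes, so the model mainly serves to provide an interval estimate on the probability scale.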

In our comparison of sensorimotor and supramodal cortical regions (Figure 5), we used linear models to evaluate the impact of cortex type on maximal correlation values. In the first instance, we computed each region’s average maximal correlation over all latent spaces. We then regressed those average values against a binary variable indicating whether the regions were sensorimotor or supramodal. Here we used a simple linear regression. In the second instance, for each latent space we computed average maximal correlation values for sensorimotor regions and supramodal regions. We then regressed these average values against a binary variable as described above. In this case, we used a linear mixed-effects regression. The simple linear regression was implemented using the lm function from the stats package, while the linear mixed-effects regression was implemented using the lmer function from the lme4 package. The lmerTest package was used to estimate the degrees of freedom in the mixed-effects model and perform hypothesis testing.

Data availability

The Allen Mouse Brain Atlas and Allen Human Brain Atlas data sets are openly accessible and can be downloaded from the Allen Institute's API (http://api.brain-map.org). This manuscript and all figures were generated programmatically using R Markdown (https://rmarkdown.rstudio.com) and LaTeX (https://www.latex-project.org). All of the code and additional data needed to generate this analysis, including figures and manuscript, are accessible at GitHub (copy archived at swh:1:rev:0ad9c547e18e8ca5d08872cbecb9f729a4b8b62b; Beauchamp, 2022).

References

  1. Barron HC, Mars RB, Dupret D, Lerch JP, Sampaio-Baptista C (2021) Cross-species neuroscience: closing the explanatory gap. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences 376:20190633. https://doi.org/10.1098/rstb.2019.0633
  2. Houdebine LM (2004) Chapter 6: the mouse as an animal model for human diseases. In: Houdebine LM, editors. The Laboratory Mouse. Elsevier Academic Press. pp. 97–107.
  3. Myers E (2017) Molecular neuroanatomy: Mouse-human homologies and the landscape of genes implicated in language disorders. Thesis, Boston University.
  4. Striedter GF, Northcutt RG (2020) Brains Through Time. Oxford University Press.
  5. Sundararajan M, Taly A, Yan Q (2017) Axiomatic Attribution for Deep Networks. Proceedings of the 34th International Conference on Machine Learning. PMLR. pp. 3319–3328.

Decision letter

  1. Alex Fornito
    Reviewing Editor; Monash University, Australia
  2. Kate M Wassum
    Senior Editor; University of California, Los Angeles, United States
  3. Bratislav Misic
    Reviewer; McGill University, Canada

Our editorial process produces two outputs: i) public reviews designed to be posted alongside the preprint for the benefit of readers; ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.

Decision letter after peer review:

Thank you for submitting your article "Whole-brain comparison of rodent and human brains using spatial transcriptomics" for consideration by eLife. Your article has been reviewed by 2 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Kate Wassum as the Senior Editor. The following individual involved in the review of your submission has agreed to reveal their identity: Bratislav Misic (Reviewer #2).

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

Essential revisions:

Each of the Reviewers has raised some specific points that require further attention or analysis in their public review and recommendations for authors. Please provide a point-by-point response to each of these. We especially ask that you well address Reviewer 1 recommendations to authors points 2 and 3.

In your revision, if you have not already done so, please ensure your manuscript complies with the eLife policies for statistical reporting: https://reviewer.elifesciences.org/author-guide/full "Report exact p-values wherever possible alongside the summary statistics and 95% confidence intervals. These should be reported for all key questions and not only when the p-value is less than 0.05."

Reviewer #1 (Recommendations for the authors):

From the methodological point of view, this study is well-executed and the specific questions/suggestions are presented below.

1. Expression patterns across broad anatomical divisions such as the human cortex, subcortex, brainstem, and cerebellum demonstrate substantial differences. Similar tendencies are also observed in the mouse brain, where differences between neocortical and other brain areas tend to be much stronger compared to the differences within these divisions. The analyses presented in this work are performed on the combined datasets covering the whole brain and the resulting similarity metrics appear to be significantly skewed to the right with values broadly ranging from 0.7-1. Could the authors please comment if these transcriptional differences between broad anatomical divisions may attenuate/diminish the potential differences within these structures, e.g. within cortex/neocortex/subcortex/cerebellum? It might be interesting to expand the analyses by analyzing each anatomical division independently in order to disentangle more subtle transcriptional similarities/differences between species.

2. Currently, in the description of the processing of AHBA data there is no mention of within-donor normalization prior to data aggregation. It has been previously shown that samples acquired from the same donor tend to cluster together rather than reflecting anatomical divisions of the brain when samples across 6 brains are combined. Based on the current documentation, samples from all 6 brains are first aggregated into a sample x gene matrix and only then normalized for every gene across samples. This type of normalization retains expression differences between different donor brains and can bias the resulting sample x gene and region x gene datasets as well as subsequent analyses. Markello et al., (2021) have recently shown that within-donor data normalization is the most influential step in AHBA data processing, therefore, I suggest revisiting this data processing step. Also, could the authors comment on the choice of mean expression level subtraction for within-sample/region normalization rather than the standard z-score normalization?

3. Does the latent gene space method allow the identification of genes that are most informative in region identification? Could the authors provide some comments in the manuscript?

4. Some formal statistical evaluations should be presented when performing comparisons. For example, but not limited to, comparing maximal correlational values between sensorimotor and supramodal areas (lines 277-280, Figure 5B).

References

Markello, R. D., Arnatkeviciute, A., Poline, J.-B., Fulcher, B. D., Fornito, A., and Misic, B. (2021). Standardizing workflows in imaging transcriptomics with the abagen toolbox. eLife, 10, e72129. https://doi.org/10.7554/eLife.72129

Reviewer #2 (Recommendations for the authors):

I think the manuscript is very polished as-is. I have a number of questions/suggestions that should be considered optional:

1) Line 61: "the connections of a brain region tend to be unique". I know exactly what the authors mean (each brain region has a unique/specific connectivity profile), but the sentence could perhaps be clearer.

2) Why use a multi-layer perceptron to map homologues, as opposed to a more interpretable, SVD-based method, such as PLS or CCA?

3) It is still not entirely clear to me how well the perceptron performs in the more conventional, global sense – is there a final, cross-validated accuracy? Is this accuracy significantly greater than what would be expected by chance?

4) In most of the analyses, there is a clear distinction between the cortex and cerebellum, which should then be expected to drive the configuration of the latent spaces. Have the authors attempted to perform the analysis using cortex only?

5) Do the authors have a sense of what biological pathways the homologous genes are involved in?

https://doi.org/10.7554/eLife.79418.sa1

Author response

Essential revisions:

Each of the Reviewers has raised some specific points that require further attention or analysis in their public review and recommendations for authors. Please provide a point-by-point response to each of these. We especially ask that you well address Reviewer 1 recommendations to authors points 2 and 3.

In your revision, if you have not already done so, please ensure your manuscript complies with the eLife policies for statistical reporting: https://reviewer.elifesciences.org/author-guide/full "Report exact p-values wherever possible alongside the summary statistics and 95% confidence intervals. These should be reported for all key questions and not only when the p-value is less than 0.05."

Reviewer #1 (Recommendations for the authors):

From the methodological point of view, this study is well-executed and the specific questions/suggestions are presented below.

1. Expression patterns across broad anatomical divisions such as the human cortex, subcortex, brainstem, and cerebellum demonstrate substantial differences. Similar tendencies are also observed in the mouse brain, where differences between neocortical and other brain areas tend to be much stronger compared to the differences within these divisions. The analyses presented in this work are performed on the combined datasets covering the whole brain and the resulting similarity metrics appear to be significantly skewed to the right with values broadly ranging from 0.7-1. Could the authors please comment if these transcriptional differences between broad anatomical divisions may attenuate/diminish the potential differences within these structures, e.g. within cortex/neocortex/subcortex/cerebellum? It might be interesting to expand the analyses by analyzing each anatomical division independently in order to disentangle more subtle transcriptional similarities/differences between species.

Separately classifying subtrees of the hierarchical ontology (e.g. isocortex, cerebellum) is a good idea, and one that we’d considered previously. Given the difficulty of discriminating between cerebellar regions using transcriptomic data, we believe that this approach would be most beneficial in the isocortex. To that end, we trained the multi-layer perceptron to classify the 19 regions in the AMBA ontology that make up the mouse isocortex, and then generated 500 latent spaces in the way we’ve described previously. In this case, however, we only transformed mouse and human isocortical regions into the latent spaces, since the network was only trained to classify isocortical regions. Given this reduced set of regions, we focused on examining the impact of the isocortical latent spaces on the outcomes that were specific to isocortical regions.

We first generated a version of Figure 4 that included only those mouse isocortical regions with established neuroanatomical homologues. Here we find that in terms of averages, the original set of latent spaces and the isocortical-specific latent spaces perform equally well for most of the regions surveyed. The exception is the visual areas, which feature an improved mouse-human correspondence in the isocortical-specific spaces. Another salient feature is that the variance in the rank of the canonical pairs is smaller in the isocortical-specific latent spaces.

Author response image 1

In addition to this analysis, we repeated the clustering analysis from Figure 5 using the isocortical latent spaces. We find striking differences here. The first is that supramodal regions exhibit higher maximal correlation values than sensorimotor regions on average (panels A and B). Another important change is that the average latent space similarity matrix exhibits fewer pairs of regions with high correlations (panel C). This in turn changes the clustering. The mouse sensorimotor regions no longer form a single cluster but are split off into three clusters. A sensorimotor cluster mostly remains on the human side, though the precentral gyrus is split off into a different cluster. Interestingly, the scree plot in panel D suggests that the optimal clustering in this space might be as high as 10 clusters.

While we find the subtree classification approach interesting, we don’t believe that we see enough improvement in this preliminary analysis to justify switching approaches at this stage. The global classification task performs well and offers a simple way to evaluate comparisons between different regions across the entire brain. Tackling the subtree classification approach in full is an interesting avenue for future work but would require more exploration. In particular, the latent spaces obtained from the classification of different subtrees cannot trivially be stitched together. This renders it difficult to make comparisons between different broad regions of the brain, e.g. isocortical regions and sub-cortical regions. Additional considerations include where to split the anatomical hierarchy for classification, and how to optimize networks for each sub-division. We’ve added the following sentences in the discussion (lines 660-666):

“The supervised learning approach also provides interesting avenues for future research. For instance, rather than classifying all regions in the brain at once, separate models could be trained to classify regions belonging to different sub-trees in the neuroanatomical hierarchy. This type of approach requires more exploration however, such as where to split the hierarchy, how to optimize the classifiers for each sub-tree, and how to stitch all this information back together at the end in order to make comparisons between different sub-trees.”

2. Currently, in the description of the processing of AHBA data there is no mention of within-donor normalization prior to data aggregation. It has been previously shown that samples acquired from the same donor tend to cluster together rather than reflecting anatomical divisions of the brain when samples across 6 brains are combined. Based on the current documentation, samples from all 6 brains are first aggregated into a sample x gene matrix and only then normalized for every gene across samples. This type of normalization retains expression differences between different donor brains and can bias the resulting sample x gene and region x gene datasets as well as subsequent analyses. Markello et al., (2021) have recently shown that within-donor data normalization is the most influential step in AHBA data processing, therefore, I suggest revisiting this data processing step. Also, could the authors comment on the choice of mean expression level subtraction for within-sample/region normalization rather than the standard z-score normalization?

Our original pre-processing pipeline was built before the release of the abagen Python package, which implements the pipeline options described in Markello et al. 2021. That being said, we recognize the importance of using standardized tools, and so we’ve modified the human pre-processing pipeline such that it uses the abagen toolkit. Compared with our original pipeline, this new pipeline implements probe selection via differential stability, as well as within-donor normalization of both samples and genes using a scaled robust sigmoid function. We have updated the subsection “Human gene expression data” in the “Materials and methods” section to reflect these changes (lines 705-716):

“Human gene expression data was obtained from the Allen Human Brain Atlas (Hawrylycz et al., 2012). The data were downloaded from the Allen Institute's API (http://api.brain-map.org) and pre-processed using the `abagen` package in Python (https://abagen.readthedocs.io/en/stable/) (Arnatkeviciūtė et al., 2019; Hawrylycz et al., 2012; Markello et al., 2021). We used the microarray data from the brains of all six donors, each of which contains `log2` expression values for 58692 gene probes across numerous tissue samples. The pre-processing pipeline included probe selection using differential stability on data from all donors and intensity-based filtering of probes at a threshold of 0.5. The samples and genes were additionally normalized for each donor individually using a scaled robust sigmoid function. In practice, this pipeline was implemented using the `get_samples_in_mask` function from the `abagen` package. The remaining parameters were set to their default values. The output of the pre-processing pipeline was a gene-by-sample expression matrix with 15627 genes and 3702 samples across all donors.”

Using this updated pipeline to pre-process the human expression data, we re-generated all downstream aspects of the analysis, including 500 new latent spaces. Notably, using this updated pipeline, our subset of mouse-human homologous genes now contains 2835 genes rather than the original 2624. This value has been updated in the manuscript. We updated all figures to examine the impact of the new human pipeline on the outcomes.

In Figure 1, we observe a slight increase in contrast in the similarity matrices (panel A), particularly for mouse regions in the cortical subplate, olfactory areas, and hippocampal formation.

In Figure 2, we also see a slight increase in contrast in the similarity matrix (panel C).

In Figure 3, panel B, we see some minor tweaks to the region-wise distributions. The biggest change occurs in the cerebellar regions, where we see some improvement (i.e. distributions have moved left) when using the abagen pre-processing pipeline, compared with our original pipeline.

In Figure 4, panel A, we see some worsening of the ranks in the initial homologous gene expression space when using the updated pre-processing pipeline compared with our original pipeline (e.g. claustrum, piriform area, subiculum, primary motor area). However, the ranks are improved in the latent spaces as desired. Note that in the figure at submission, some of the error bars in panel A were erroneously being discarded by the plotting routine since they fell outside of the range of the x-axis (e.g. claustrum, anterior cingulate area). This has been resolved in the updated figure. In panel B., we see an improvement in the proportions in the hippocampal formation when using the new processing pipeline.

In Figure 5, we see some minor shuffling of the order of isocortical regions in panel A. We also see a slight increase in the separate between the medians of the distributions in panel B. In panel C, the mouse clusters are unchanged. However, we see the emergence of a sensorimotor cluster on the human side. Sensorimotor regions that we present in the large cluster in the original figure (e.g. precentral gyrus, Heschl’s gyrus) are now clustered with the somatosensory and visual regions. This cluster is characterized by high similarity to the mouse sensorimotor cluster. In panel D, we find that the cluster separations have increased when using the updated pre-processing pipeline.

In Figure 6, panel A, the distributions of caudate-caudoputamen and putamen-caudoputamen correlation values are shifted slightly to the left in the updated figure. They are also slightly wider. This widening is also seen for the distributions of the human nucleus accumbens. In panel B, the voxel-wise correlations to the caudate and putamen are slightly lower in absolute value, but the spatial pattern of correlation remains unchanged. In panel C, the proportions are different for the caudate and putamen. In the updated figure, we find that voxels in the mouse caudoputamen are most consistently maximally similar to the human caudate. The distinction between the caudate and putamen that we saw in the original figure is mostly subdued. While hints of it remain in the dorsal and caudal parts of the caudoputamen, these voxels are still mostly maximally similar to the caudate, rather than the putamen.

Overall, we find that pre-processing the human data using the abagen package, including within-donor normalization, doesn’t hugely impact the outcomes of the study. The most important change is the emergence of a human sensorimotor cluster in the isocortical analysis.

Note that these changes in the figures are associated with changes to the quantitative statements reported in the discussion for each figure. These are outlined below in our response to recommendation #4 from reviewer #1.

Finally, to address our choice of normalization for the expression matrices: Upon revision of the manuscript and code, we found that we incorrectly described the normalization procedure in our original submission. The correct normalization procedure, along with our reasoning, is as follows: We first performed a z-score normalization for each voxel or sample across all homologous genes to generate a normalized gene expression signature for each voxel/sample. While this is likely good enough on its own, we wanted to ensure that the point-cloud distributions for both the mouse and human data sets were centered at the origin in the gene expression space, rather than existing in separate domains. To ensure this for each species, we subtracted the mean value for each gene across all voxels/samples. The result is a set of voxel-wise/sample-wise gene expression signatures, centered at the origin in gene expression space.

The manuscript has been adapted accordingly (lines 743-748):

“These labelled expression matrices were subsequently normalized as follows: For each matrix, we first normalized each voxel/sample across all homologous genes using a z-scoring procedure to create a normalized gene expression signature for each voxel/sample. We then centered the distribution of expression signatures in gene space by subtracting the mean expression of each homologous gene over all voxels/samples.”
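The two normalization steps described above can be sketched as follows. This is a minimal NumPy illustration rather than the code used in the study: the function name `normalize_expression`, the toy matrix, and the use of the sample standard deviation (`ddof=1`) are assumptions, with rows taken to be voxels/samples and columns to be homologous genes.

```python
import numpy as np


def normalize_expression(expr):
    """Normalize a (voxels/samples x genes) expression matrix.

    Step 1: z-score each row across genes, giving one normalized
            expression signature per voxel/sample.
    Step 2: subtract each gene's mean over all rows, so the point cloud
            is centered at the origin of gene expression space.
    """
    expr = np.asarray(expr, dtype=float)
    row_mean = expr.mean(axis=1, keepdims=True)
    # ddof=1 (sample standard deviation) is an assumption of this sketch.
    row_sd = expr.std(axis=1, ddof=1, keepdims=True)
    signatures = (expr - row_mean) / row_sd
    return signatures - signatures.mean(axis=0, keepdims=True)


# Hypothetical toy data: 6 voxels/samples by 4 genes.
rng = np.random.default_rng(0)
toy = rng.normal(loc=5.0, scale=2.0, size=(6, 4))
normalized = normalize_expression(toy)
```

Applying the function separately to the mouse and human matrices would place both point clouds at the same origin, which is the property the second step is meant to guarantee.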

3. Does the latent gene space method allow the identification of genes that are most informative in region identification? Could the authors provide some comments in the manuscript?

The relative importance of input genes for the classification of voxels into atlas regions, called feature importance or feature attributions, can be obtained using integrated gradients (Sundararajan et al., 2017). This method can be implemented using the captum (https://captum.ai/) package in Python. Using this toolkit, we can identify the relative importance of all genes in the classification of any given single label (e.g. caudoputamen), but not the classification of all labels at once. A comprehensive characterization of the importance of all input genes for all 67 mouse atlas labels across all 500 latent spaces is beyond the scope of this study. However, we have applied the method to generate feature attributions for the classification of three labels: the caudoputamen, the primary motor area, and the infralimbic area. Using captum, we computed integrated gradients for each of these three labels for 200 training runs of the perceptron. For each target label of interest, we averaged the importance of each gene over the 200 training iterations to get an average measure of importance over latent spaces. These results are summarized in a new supplemental figure, which is tied to Figure 2.

We discuss this additional figure in the section titled “A latent gene expression space improves the resolution of mouse-human associations” (lines 219-232):

“Although the neural network and associated latent space do not directly provide information about which genes are most important for the classification of specific mouse atlas labels, this type of information can be derived from the model using attribution methods such as integrated gradients (Figure 2-figure supplement 1; Sundararajan et al., 2017). Each brain region in the classification task is associated with the input genes in different ways, such that there isn't a single weighting of gene importance for the entire model. While most genes contribute to the classification of any given label in some capacity, it is often the case that the network relies on a reduced subset of genes to arrive at a decision. For example, the genes Prrg2 and Cd4 were found to be most influential for the classification of the caudoputamen, when the feature attributions were averaged over all training runs. In contrast, Rfx4 and Glra3 were the most influential for the classification of the primary motor area. In some cases, the spatial expression pattern of the gene clearly shows a demarcation of the region of interest (e.g. Cd4), but this is not always the case, nor is it necessary, as the network learns from the entire gene expression signature of all voxels.”

We also included a new subsection in “Materials and methods”, titled “Multi-layer perceptron feature importance” (lines 821-830):

“We used integrated gradients to evaluate the contribution of different genes in the classification of mouse atlas labels. Since the homologous gene inputs contribute to the classification of distinct labels in different ways, we examined the feature attributions for three regions: the caudoputamen, the primary motor area, and the infralimbic area. Using the trained multi-layer perceptron, we computed integrated gradients for each of these three regions. We then averaged the values over all input voxels for each gene, resulting in a vector of gene attributions for each of the three example regions. This process was repeated for 200 training runs of the neural network. We then averaged the gene importance vectors of each region over all training runs to get a summary of gene importance. This process was implemented using the `IntegratedGradients` function from the `captum` package in Python (https://captum.ai/).”
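To illustrate what integrated gradients compute, the sketch below implements the method directly in NumPy for a toy logistic model. It does not use captum and is not the perceptron from the study: the weights `w`, the bias `b`, the input `x`, the all-zero baseline, and the helper names are all hypothetical. The attribution for each input feature is the input-minus-baseline difference multiplied by the gradient averaged along a straight path from the baseline to the input.

```python
import numpy as np


def integrated_gradients(grad_fn, x, baseline, steps=100):
    """Approximate integrated gradients with a midpoint Riemann sum.

    grad_fn returns the gradient of the model output with respect to
    its input; the path runs in a straight line from baseline to x.
    """
    alphas = (np.arange(steps) + 0.5) / steps
    grads = np.stack([grad_fn(baseline + a * (x - baseline)) for a in alphas])
    return (x - baseline) * grads.mean(axis=0)


# Hypothetical toy model: a logistic unit over four "genes".
w = np.array([2.0, -1.0, 0.0, 0.5])
b = -0.1


def model(x):
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))


def model_grad(x):
    s = model(x)
    return s * (1.0 - s) * w  # analytic gradient of the logistic unit


x = np.array([1.0, 0.5, 3.0, -2.0])
baseline = np.zeros_like(x)
attributions = integrated_gradients(model_grad, x, baseline)
```

A useful sanity check is the completeness property: the attributions sum, up to numerical error, to `model(x) - model(baseline)`. A gene with zero weight (the third feature here) receives zero attribution regardless of its expression value.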

4. Some formal statistical evaluations should be presented when performing comparisons. For example, but not limited to, comparing maximal correlational values between sensorimotor and supramodal areas (lines 277-280, Figure 5B).

We initially avoided relying on formal statistical evaluations due to the absence of biologically relevant sources of variance in the data sets, i.e. variation in mouse-human correlation values arising from variation in transcriptomic maps. Since all donors were aggregated on the human side, and each gene was sampled only once in the coronal data set on the mouse side, the sample size is effectively n = 1.

The resampling of the latent spaces does introduce variance and this can be used to make more formal statistical statements. The caveat here however is that we have complete control over the relevant sample size (the number of latent spaces), and so metrics like p-values are meaningless, since the sample size can be made arbitrarily high at low cost.

That being said, we formalized some of the quantitative statements made with respect to the analyses using statistical models. Note that the values cited below reflect the changes resulting from the updated human pre-processing pipeline, as discussed in our response to recommendation #2 from reviewer #1.

For Figure 3, panels B and C, we used binomial likelihood models to quantify (1) the probability that any given region would see improvement on average in the latent spaces, and (2) the probability that any given latent space would return an improvement, for each individual region. While the resulting estimate of the Bernoulli probability in these models is equivalent to the proportions cited in the original manuscript, the models additionally return a confidence interval around this value. The relevant text in the section “A latent gene expression space improves the resolution of mouse-human associations” has been updated to reflect these changes (lines 266-316):

“Examining the structure-wise distributions of these rank differences, we found that for the majority of regions in our mouse atlas, the classification approach resulted in either an improvement in the amount of locality within a broad region, or no difference from the original gene space (Figure 3, B and C). We quantified the improvement overall by fitting a logistic regression model with no predictors to the mean rank differences of each of the atlas regions. We considered the success condition for the Bernoulli trials to be a mean rank difference less than or equal to zero. The model estimate for the Bernoulli probability – which we denote pB to distinguish from the p-value p – was pB = 0.78 with a 95% confidence interval of [0.66, 0.86]. In other words, 52 of the 67 brain regions saw an improvement on average when using the latent spaces. The probability of obtaining at least as many successes as this under the null model, i.e. a binomial distribution with pB = 0.50 and n = 67, is p = 8.64 · 10^−7. We additionally evaluated the same kind of logistic regression on a region-wise basis to quantify how often the latent spaces resulted in an improvement for individual brain regions (Figure 3C). We found that for 46 regions (69%), the model estimated the probability to be at least as high as pB = 0.95. While confidence intervals varied around this estimate, the range between the upper and lower bound was only ever as high as 0.04. For 53 of the 67 regions (79%), the q-values, i.e. p-values adjusted for multiple comparisons, were effectively zero, with the largest being q = 3.77 · 10^−16. Of the remaining 14 regions, 13 had q-values equal to 1 and one region, the periaqueductal gray, had a q-value of q = 0.854.
The regions with the smallest estimates for the Bernoulli probabilities are the dentate gyrus (pB = 0.0, no variance, q = 1), the striatum ventral region (pB = 0.016, 95% CI [0.008, 0.032], q = 1), and the lateral septal complex (pB = 0.016, 95% CI [0.008, 0.032], q = 1). The remaining regions with q = 1 are all subcortical and fall under the broad subdivisions of cerebral nuclei, olfactory areas, interbrain, midbrain, pons, medulla, and cerebellar nuclei. Beyond this binary measure of improvement, some regions exhibited a large range of differences in rank over the various latent spaces. In particular, regions like the main olfactory bulb (mean rank difference of μ = 10, 95% CI [−12, 33]) and accessory olfactory bulb (μ = 9, 95% CI [−13, 31]) exhibit a substantial degree of variance. Other than these two areas, regions within the olfactory areas (e.g. piriform area) were among those that benefited the most from the classification approach, showing improvement in all sampled latent spaces, with all Bernoulli probability estimates equal to 1 and all q-values equal to 0. While the effects, i.e. rank differences, are smaller, the similarity profiles of regions belonging to the isocortex and cerebellar cortex also saw an improvement in locality. All models for isocortical areas returned Bernoulli probability estimates greater than pB = 0.85 and q-values that were at most q = 1.35 · 10^−67. Moreover, 9 of the 19 isocortical regions were improved in all latent spaces, i.e. pB = 1. Brain regions belonging to the cerebellar cortex saw similar improvement. In contrast, regions belonging to the cerebral nuclei, the diencephalon, midbrain and hindbrain did not see much improvement in this new common space, with an average Bernoulli probability estimate of pB = 0.36 for this subset.
Other than the caudoputamen (pB = 0.99, 95% CI [0.97, 1.00], q = 1.35 · 10^−139), the superior colliculus (pB = 0.90, 95% CI [0.87, 0.92], q = 9.82 · 10^−81), and the inferior colliculus (pB = 0.75, 95% CI [0.71, 0.78], q = 3.12 · 10^−30), all regions in this subset return q-values equal to 1. For many such regions the degree of locality appears to be worse in this space, though only by a small number of ranks, e.g. striatum ventral region (mean rank difference of μ = 4, 95% CI [1, 7]) and lateral septal complex (μ = 6, 95% CI [0, 11]). Indeed, computing the average rank difference over this subset of regions across all latent spaces, we find μ = 2 with 95% confidence interval [−5, 8]. These results demonstrate that the supervised learning approach used here can improve the resolution of neuroanatomical correspondences between the mouse and human brains, though the amount of improvement varies over the brain. Regions that were already well-characterized using the initial set of homologous genes (e.g. subcortical regions) did not benefit tremendously, but numerous regions in the cortical plate and subplate, as well as the cerebellum, saw an improvement in locality in this new common space.”

We have also updated the caption to Figure 3 (lines 318-332).

We applied the same kind of binomial likelihood models to the rank comparisons in Figure 4. The manuscript has been updated to reflect these changes (lines 333-368):

“While the supervised learning approach improved our ability to identify matches on a finer scale for a number of brain regions, this does not necessarily mean that those improved matches are biologically meaningful. The second criterion for evaluating the performance of the neural network addresses whether this improvement in locality captures what we would expect in terms of known mouse-human homologies. To this end, we examined the degree of similarity between established mouse-human neuroanatomical pairs, both in the initial gene expression space and in the set of latent spaces. We began by establishing a list of 36 canonical mouse-human homologous pairs on the basis of common neuroanatomical labels in our atlases. For each of these regions in the mouse brain, we compared the rank of the canonical human match in the rank-ordered similarity profiles between the latent spaces and the original gene expression space (Figure 4A). The lower the rank, the more similar the canonical pair, with a rank of 1 indicating maximal similarity. As described above, we evaluated the overall performance of the classification approach by running a logistic regression using the average latent space rank difference over all regions in our subset. Here we find an estimated Bernoulli probability of pB = 0.64 with 95% confidence interval [0.47, 0.78]. Under the null binomial distribution, B(36, 0.5), the probability of getting at least as many successes as this is p = 0.033. We also evaluated the model for each brain region and found that 30 of the 36 regions (83%) return Bernoulli probability estimates of at least pB = 0.80. Under the null binomial distribution, B(500, 0.5), we find that the largest q-value among these 30 regions is q = 4.39 · 10^−54. Moreover, 24 regions (67%) return Bernoulli probability estimates of at least pB = 0.90 and 8 regions show improvement in all latent spaces, i.e. pB = 1 and q = 0 (Figure 4B).
Among these 8 regions are the claustrum, the piriform area, the primary motor and somatosensory areas, and the crus 2. Additional examples of the many regions that demonstrate improvement include: the primary auditory area (pB = 0.83, 95% CI [0.80, 0.86], q = 1.80 · 10^−55), the pallidum (pB = 0.86, 95% CI [0.83, 0.89], q = 3.63 · 10^−65), and the crus 1 (pB = 0.92, 95% CI [0.90, 0.94], q = 7.68 · 10^−95). Once again we find that many regions in the sub-cortex do not benefit greatly from the gene expression latent spaces, since the initial gene set was already recapitulating the appropriate match with maximal similarity. We find that the striatum ventral region, caudoputamen, hypothalamus, and pons are maximally similar to their canonical matches in at least 95% of latent spaces. In such cases, the classification approach performs as well as the original approach. While these probability estimates provide a sense of how often an improvement is returned, it is important to note that many regions in this set exhibit a substantial degree of variance over the latent spaces in the ranking of the canonical pairs, e.g. the primary auditory area (μ = 9, 95% CI [1, 19]), the visual areas (μ = 18, 95% CI [7, 29]), the paraflocculus (μ = 16, 95% CI [2, 29]). This is especially apparent for cerebellar regions, indicating some instability in the neural network<milestone-start />’<milestone-end />s ability to recover these matches.”

We also updated the caption to Figure 4 (lines 370-377).

To quantify the results presented in Figure 5, panels A and B, we used linear regression models. For panel A, we used a simple linear regression to model the average maximal correlation values against a binary variable indicating whether a region was a sensorimotor or supramodal region. For panel B, we used a linear mixed-effects regression to model the average maximal correlation against the type of cortex. A random intercept term was used to model the latent spaces. The results were described in the manuscript (lines 401-416):

“We assessed the similarity between mouse and human isocortical areas using the pairwise correlations in each of the gene expression latent spaces returned from the multi-layer perceptron. For every region in the mouse isocortex, we evaluated the distribution of maximal correlation values over latent spaces (Figure 5A). While the region-wise variance for each isocortical area was large, we found that, on average, sensorimotor regions exhibited higher maximal correlation values than supramodal regions (linear regression with binary predictor: β = −0.042, 95% CI [−0.087, 0.003], t(17) = −1.854, p = 0.0812). The mouse primary somatosensory (r = 0.96, 95% CI [0.93, 0.98]) and motor (r = 0.95, 95% CI [0.92, 0.98]) areas have the highest average maximal correlation values. We additionally examined the distributions of maximal correlation, grouped by cortex type (Figure 5B). To generate these distributions, we computed average maximal correlation values by cortex type in each of the latent spaces. Here too we find that sensorimotor regions are associated with higher maximal correlation values on average compared with supramodal areas (linear mixed-effects regression: β = −0.042, 95% CI [−0.044, −0.040], t(499) = −49.9, p < 2 · 10^−16). These distributions demonstrate that sensorimotor isocortical regions exhibit more similarity overall on the basis of homologous gene expression than do supramodal regions.”

We also updated the caption to Figure 5 (lines 465-474).

We did not perform any statistical tests for Figure 6, but the body of the text has been updated to reflect the changes induced by the updated human pre-processing pipeline.

We additionally included a new subsection in the “Materials and methods” section to describe these changes. The section is titled “Statistical modelling” (lines 832-861):

“To quantify the improvement in the mouse-human matches when using the latent spaces versus the original gene expression space (Figures 3 and 4), we used a set of logistic regression models to estimate the probability that the rank difference was less than or equal to zero. To estimate the overall improvement due to the latent spaces, we created a binary variable to encode whether the average rank difference over latent spaces for each region met the success criterion. This variable was then used as our target in a logistic regression with no regressors. Once the model was fit, we applied the logistic function to the intercept parameter estimate to get the corresponding estimate for the Bernoulli probability, pB. This transformation was also applied to the bounds on the variance estimate for the intercept to get the corresponding confidence interval. Using the estimated Bernoulli probability, we calculated the corresponding number of successes, k. We then evaluated the probability of obtaining at least k successful outcomes under the null binomial distribution, B(n, 0.5). The parameter n was taken to be the number of brain regions under consideration. We additionally applied this approach on a region-wise basis to evaluate the likelihood of a region seeing improvement in the latent spaces. In this case, the null distribution was B(500, 0.5) for each region. The resulting p-values were adjusted for multiple comparisons using the false-discovery rate method (Benjamini and Hochberg, 1995). These models were implemented using the glm function from the stats package in the R programming language. In our comparison of sensorimotor and supramodal cortical regions (Figure 5), we used linear models to evaluate the impact of cortex type on maximal correlation values. In the first instance, we computed each region<milestone-start />’<milestone-end />s average maximal correlation over all latent spaces.
We then regressed those average values against a binary variable indicating whether the regions were sensorimotor or supramodal. Here we used a simple linear regression. In the second instance, for each latent space we computed average maximal correlation values for sensorimotor regions and supramodal regions. We then regressed these average values against a binary variable as described above. In this case we used a linear mixed-effects regression with a random intercept term to control for observations coming from the same latent space. These models were implemented in the R programming language. The simple linear regression was implemented using the lm function from the stats package, while the linear mixed-effects regression was implemented using the lmer function from the lme4 package. The lmerTest package was used to estimate the degrees of freedom in the mixed-effects model and perform hypothesis testing.”
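The logistic regression with no regressors and the binomial null comparison can be sketched in a few lines. The analysis itself was run with `glm` in R; the Python below is only an illustration using the closed-form solution of an intercept-only model. The function names, the Wald-type interval, and the choice to compute the tail as P(X ≥ k) rather than P(X > k) are assumptions of this sketch.

```python
import math


def logistic(value):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-value))


def intercept_only_logistic(successes, n, z=1.96):
    """Closed-form fit of a logistic regression with no regressors.

    The maximum-likelihood intercept is logit(successes / n) and its
    Wald standard error is 1 / sqrt(n * p * (1 - p)). The interval is
    built on the logit scale and mapped back with the logistic function.
    Requires 0 < successes < n.
    """
    p = successes / n
    intercept = math.log(p / (1.0 - p))
    se = 1.0 / math.sqrt(n * p * (1.0 - p))
    return p, (logistic(intercept - z * se), logistic(intercept + z * se))


def binomial_upper_tail(k, n, p0=0.5):
    """P(X >= k) for X ~ B(n, p0)."""
    return sum(
        math.comb(n, i) * p0**i * (1.0 - p0) ** (n - i) for i in range(k, n + 1)
    )


# Figure 3 example from the text: 52 of 67 regions met the success criterion.
p_b, (ci_low, ci_high) = intercept_only_logistic(52, 67)
tail = binomial_upper_tail(52, 67)
```

With 52 successes out of 67, the estimate and Wald interval round to 0.78 and [0.66, 0.86]. The tail probability depends on whether the boundary value k is included, so it may differ from a value computed under the other convention.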

Reviewer #2 (Recommendations for the authors):

I think the manuscript is very polished as-is. I have a number of questions/suggestions that should be considered optional:

1) Line 61: "the connections of a brain region tend to be unique". I know exactly what the authors mean (each brain region has a unique/specific connectivity profile), but the sentence could perhaps be clearer.

This sentence has been replaced with the following (lines 67-70):

“It has previously been demonstrated that brain regions can be identified via their unique set of connections to other regions in the brain. This connectivity fingerprint can therefore be seen as a diagnostic of an area (Mars et al., 2018a; Passingham et al., 2002).”

2) Why use a multi-layer perceptron to map homologues, as opposed to a more interpretable, SVD-based method, such as PLS or CCA?

We used a classifier rather than a decomposition approach in order to increase the information value found in the transcriptomic data. While the SVD methods are nice in that they jointly decompose both the mouse and human data, the objective functions are such that the resulting variables do not improve the locality of the brain matches. This is especially true if the variable set is truncated after the modelling, e.g. by selecting the first k canonical variables, etc. In our first attempt at implementing the classification approach, we initially opted for a more interpretable class of models, namely multinomial logistic regressions with LASSO regularization. However, these models failed to converge on the 67-label classification task and we soon moved on to more powerful classifiers. We decided to use a neural network approach rather than a tree-based method, because we wanted to be able to extract a set of latent space variables to use for our pairwise correlation analysis.

As you rightly pointed out, the move towards more complicated models reduces the interpretability of the resulting latent spaces. However, in the case of neural networks, we can use feature attribution methods like integrated gradients to extract information about how the resulting classifications or latent spaces relate to the input genes. For more details, please see our response to recommendation #3 from reviewer #1.

Still, the idea of a model that is jointly optimized on the mouse and human data is an attractive one. In the future it may be possible to adapt the classifier approach to use information from both the mouse and human data sets in the construction of the latent space variables.

3) It is still not entirely clear to me how well the perceptron performs in the more conventional, global sense – is there a final, cross-validated accuracy? Is this accuracy significantly greater than what would be expected by chance?

Using our ad hoc resampling cross-validation strategy, the optimal classifier returns an average validation accuracy of 0.597.

To determine the accuracy that we would expect by chance, we ran a data simulation exercise in which we randomly assigned one of the atlas labels to each voxel in our training set. Rather than giving every label equal weight, we estimated the probability of drawing a given label using the proportion of training voxels with that label. For instance, 2955 voxels have the “Caudoputamen” label, so the probability of drawing that label in our simulation is 2955/51219 = 0.0568, where n = 51219 is the total number of voxels. We then randomly drew 51219 labels and computed the resulting accuracy. We repeated this simulation 10000 times to get a null distribution of accuracy scores. The resulting null accuracy was 0.029 with 95% CI [0.028, 0.031]. This is slightly larger than the expected 1/67 = 0.015 that would result if all labels were given equal weight. So, the neural network’s validation accuracy of 0.597 is much greater than what we would expect by chance.
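The simulation can be sketched with the Python standard library. The label counts below are hypothetical stand-ins for the 67 atlas labels, and the function name `simulate_null_accuracy` is an assumption. When labels are drawn with probabilities equal to their observed proportions, the expected accuracy is the sum of the squared proportions, which is why unequal label sizes push the null accuracy above 1/(number of labels).

```python
import random
from collections import Counter


def simulate_null_accuracy(labels, n_sims=200, seed=0):
    """Mean accuracy of random guesses drawn with the empirical label proportions."""
    rng = random.Random(seed)
    counts = Counter(labels)
    classes = list(counts)
    weights = [counts[c] for c in classes]
    n = len(labels)
    accuracies = []
    for _ in range(n_sims):
        guesses = rng.choices(classes, weights=weights, k=n)
        accuracies.append(sum(g == t for g, t in zip(guesses, labels)) / n)
    return sum(accuracies) / n_sims


# Hypothetical toy label set with unequal class sizes.
labels = ["A"] * 50 + ["B"] * 30 + ["C"] * 20
proportions = [count / len(labels) for count in Counter(labels).values()]
expected = sum(p**2 for p in proportions)  # analytic expectation
uniform = 1 / len(proportions)             # accuracy with equal label weights
simulated = simulate_null_accuracy(labels)
```

In this toy case the analytic expectation (0.38) exceeds the equal-weight value (1/3), mirroring the gap between the simulated null accuracy and 1/67 reported above.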

Author response image 2

4) In most of the analyses, there is a clear distinction between the cortex and cerebellum, which should then be expected to drive the configuration of the latent spaces. Have the authors attempted to perform the analysis using cortex only?

Please see our response to recommendation #1 from reviewer #1. In summary, we trained the multi-layer perceptron to classify only the mouse isocortical regions. We found that the resulting latent spaces improve the variance in the ranks of the canonical neuroanatomical homologues for certain isocortical regions, but don’t substantially improve the central tendencies of these rank distributions. We also found that supramodal regions exhibit higher maximal correlation values than sensorimotor regions in these latent spaces. In the clustering analysis, the sensorimotor isocortical regions no longer cluster together when using the isocortical latent spaces rather than the original latent spaces. Interestingly, the scree plot suggests that the optimal clustering solution might have as many as 10 clusters.

We chose not to pursue this approach for the current paper, since the work needed to do it properly would amount to an entirely new research project.

5) Do the authors have a sense of what biological pathways the homologous genes are involved in?

To get a sense of what biological pathways are over-represented by the homologous gene set, we ran a gene enrichment analysis. We obtained a data set of biological process modules from the Bader Lab at the University of Toronto. These modules are lists of genes involved in different biological processes, e.g. “nervous system development”. Then for each of the modules, we ran a hypergeometric test to identify whether our set of homologous genes was over-represented in the module compared with the full gene set, i.e. whether the proportion of homologous genes in the module set was larger than the proportion in the full set. The resulting p-values were corrected for multiple comparisons using the Benjamini-Hochberg method. We found that 938 modules were significant at a q-value threshold of 0.001. These modules were saved to a CSV file, to be included with the manuscript as Supplementary File 1. The 10 most significantly over-represented modules are:

1. Nervous system development
2. Regulation of multicellular organismal process
3. Generation of neurons
4. Regulation of biological quality
5. Neurogenesis
6. System development
7. Multicellular organism development
8. Regulation of nervous system development
9. Regulation of localization
10. Anatomical structure development

These results are described in the Results section titled “Homologous genes capture broad similarities in the mouse and human brains” (lines 120-126):

“Using a gene enrichment analysis, we found that this reduced gene set was significantly associated with a number of biological processes related to the nervous system, with Gene Ontology labels such as <milestone-start />“<milestone-end />nervous system development”, <milestone-start />“<milestone-end />neurogenesis”, and <milestone-start />“<milestone-end />regulation of nervous system development”. Additional modules returned with high significance were <milestone-start />“<milestone-end />regulation of multicellular organismal process”, <milestone-start />“<milestone-end />regulation of biological quality”, and <milestone-start />“<milestone-end />multicellular organism development”. The full set of significant modules can be found in Supplementary File 1.”

We also included a new section in “Materials and methods”, titled “Gene enrichment analysis” (lines 753-764):

“We ran a gene enrichment analysis on the set of homologous genes obtained from the NCBI HomoloGene database. We first downloaded Gene Ontology data for biological process related modules from the Bader Lab at the University of Toronto (http://baderlab.org/GeneSets). These data include a gene set of 16563 genes and a module set of 15757 biological process modules. Every module is associated with a subset of genes from the full gene set. For each module, we used a hypergeometric test to evaluate whether the homologous gene set was over-represented in the module subset, compared with the full gene set. The resulting p-values were adjusted for multiple comparisons using the false-discovery rate method (Benjamini and Hochberg, 1995). A total of 938 modules were found to be significant at a threshold of 0.001. The surviving modules were ordered according to their p-values and written out to a comma-separated values data file (Supplementary File 1). This analysis was carried out using the `tmod` package in the R programming language.”
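The per-module test and the multiple-comparison adjustment can be sketched as follows. The analysis itself used the `tmod` package in R; this standard-library Python version is an illustration only, and the function names and the small numbers in the example are hypothetical. The hypergeometric tail gives the probability of seeing at least the observed overlap between a module and the homologous gene set, and the Benjamini-Hochberg step converts the resulting p-values into q-values.

```python
import math


def hypergeom_upper_tail(overlap, n_genes, n_homologous, module_size):
    """P(X >= overlap) when drawing module_size genes without replacement
    from n_genes genes, of which n_homologous are homologous."""
    upper = min(n_homologous, module_size)
    numerator = sum(
        math.comb(n_homologous, i)
        * math.comb(n_genes - n_homologous, module_size - i)
        for i in range(overlap, upper + 1)
    )
    return numerator / math.comb(n_genes, module_size)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical example: 10 genes, 5 homologous, a module of 4 genes,
# all 4 of which are homologous.
p_module = hypergeom_upper_tail(4, 10, 5, 4)
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Modules would then be retained when their q-value falls below the chosen threshold (0.001 in the analysis described above).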

The full set of enriched modules is made available as a CSV file in Supplementary File 1.

https://doi.org/10.7554/eLife.79418.sa2

Article and author information

Author details

  1. Antoine Beauchamp

    1. The Hospital for Sick Children, Toronto, Canada
    2. Mouse Imaging Centre, Toronto, Canada
    3. Department of Medical Biophysics, University of Toronto, Toronto, Canada
    Contribution
    Conceptualization, Data curation, Software, Formal analysis, Funding acquisition, Validation, Investigation, Visualization, Methodology, Writing – original draft, Project administration, Writing – review and editing
    For correspondence
    antoine.beauchamp@mail.utoronto.ca
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-0008-7471
  2. Yohan Yee

    1. The Hospital for Sick Children, Toronto, Canada
    2. Mouse Imaging Centre, Toronto, Canada
    3. Department of Medical Biophysics, University of Toronto, Toronto, Canada
    Contribution
    Conceptualization, Data curation, Software, Methodology, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID iD: 0000-0001-7083-1932
  3. Ben C Darwin

    1. The Hospital for Sick Children, Toronto, Canada
    2. Mouse Imaging Centre, Toronto, Canada
    Contribution
    Software, Validation, Methodology, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID iD: 0000-0001-8689-046X
  4. Armin Raznahan

    Section on Developmental Neurogenomics, Human Genetics Branch, National Institute of Mental Health Intramural Research Program, Bethesda, United States
    Contribution
    Data curation, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID iD: 0000-0002-5622-1190
  5. Rogier B Mars

    1. Wellcome Centre for Integrative Neuroimaging, Centre for Functional MRI of the Brain (FMRIB), Nuffield Department of Clinical Neurosciences, John Radcliffe Hospital, University of Oxford, Oxford, United Kingdom
    2. Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Nijmegen, Netherlands
    Contribution
    Conceptualization, Resources, Supervision, Funding acquisition, Investigation, Methodology, Writing – original draft, Project administration, Writing – review and editing
    Contributed equally with
    Jason P Lerch
    For correspondence
    rogier.mars@ndcn.ox.ac.uk
    Competing interests
    No competing interests declared
    ORCID iD: 0000-0001-6302-8631
  6. Jason P Lerch

    1. The Hospital for Sick Children, Toronto, Canada
    2. Mouse Imaging Centre, Toronto, Canada
    3. Department of Medical Biophysics, University of Toronto, Toronto, Canada
    4. Wellcome Centre for Integrative Neuroimaging, Centre for Functional MRI of the Brain (FMRIB), Nuffield Department of Clinical Neurosciences, John Radcliffe Hospital, University of Oxford, Oxford, United Kingdom
    Contribution
    Conceptualization, Resources, Supervision, Funding acquisition, Investigation, Methodology, Writing – original draft, Project administration, Writing – review and editing
    Contributed equally with
    Rogier B Mars
    For correspondence
    jason.lerch@ndcn.ox.ac.uk
    Competing interests
    No competing interests declared

Funding

Canadian Institutes of Health Research (GSD-165737)

  • Antoine Beauchamp

Wellcome Trust (203139/Z/16/Z)

  • Rogier B Mars
  • Jason P Lerch

University of Oxford (E P A Cephalosporin Fund)

  • Rogier B Mars

National Institutes of Health (5R01HD100298)

  • Armin Raznahan
  • Jason P Lerch

Canadian Institutes of Health Research (FSS-167844)

  • Antoine Beauchamp

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication. For the purpose of Open Access, the authors have applied a CC BY public copyright license to any Author Accepted Manuscript version arising from this submission.

Acknowledgements

We thank C Hammill, DJ Fernandes, E Anagnostou, BJ Nieman, and E Sibille for providing advice and for interesting conceptual discussions. This study was supported by the Canadian Institutes of Health Research (doctoral funding and foreign study award for AB), the National Institutes of Health (grant 5R01HD100298), and the E P A Cephalosporin Fund. The Wellcome Centre for Integrative Neuroimaging is supported by core funding from the Wellcome Trust (203139/Z/16/Z).

Senior Editor

  1. Kate M Wassum, University of California, Los Angeles, United States

Reviewing Editor

  1. Alex Fornito, Monash University, Australia

Reviewer

  1. Bratislav Misic, McGill University, Canada

Publication history

  1. Preprint posted: March 18, 2022
  2. Received: April 12, 2022
  3. Accepted: November 4, 2022
  4. Accepted Manuscript published: November 7, 2022 (version 1)
  5. Version of Record published: November 29, 2022 (version 2)

Copyright

This is an open-access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.

Cite this article

  1. Antoine Beauchamp
  2. Yohan Yee
  3. Ben C Darwin
  4. Armin Raznahan
  5. Rogier B Mars
  6. Jason P Lerch
(2022)
Whole-brain comparison of rodent and human brains using spatial transcriptomics
eLife 11:e79418.
https://doi.org/10.7554/eLife.79418