Tools and Resources

Neuroscience

Hierarchical Bayesian modeling of multiregion brain cell count data

School of Engineering Mathematics and Technology, University of Bristol, Michael Ventris Building, United Kingdom
School of Physiology, Pharmacology and Neuroscience, University of Bristol, Biomedical Sciences Building, University Walk, United Kingdom
Centre for Neurotechnology and Department of Bioengineering, Imperial College London, South Kensington, United Kingdom
Department of Basic and Clinical Neuroscience, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, United Kingdom
School of Computing, Engineering and Intelligent Systems, Ulster University, United Kingdom

Nov 21, 2025

https://doi.org/10.7554/eLife.102391.3

Open access

14 figures, 4 tables and 1 additional file

Figures

Figure 1

Introduction.

(A) Each of $N$ animals produces a cell count for each of $R$ brain regions of interest. Cell-count data are typically undersampled, with $N ≪ R$. Scientists analyze the brain sections from the experiment for positive signals. Here, an example section is shown where teal points mark cells expressing the immediate early gene c-Fos (green and red lines indicate regions labeled as damaged). The final cell count is equal to the sum of these individual items. Sagittal brain map taken from the Allen Mouse Brain Atlas: https://mouse.brain-map.org. (B) Partial pooling is a hierarchical structure that jointly models observations as draws from a shared population distribution. It is a continuum that depends on the value of the population standard deviation $τ$. When $τ = 0$, there is no variation in the population, and each individual observation is modeled as a conditionally independent estimate of some fixed population mean $θ$ (complete pooling). As $τ$ tends to infinity, observations do not combine inferential strength; each informs an independent estimate $γ_{i}$ (no pooling). In between the two extremes, the two behaviors combine: each observation contributes to the population estimate while simultaneously supporting a local one, which effectively models the variance in the data. The observed data quantities, $y_{1}$ to $y_{n}$, are highlighted with a thick line in the model diagrams. (C) An example of partial pooling on simulated count data. As the population standard deviation increases on the $x$-axis, the individual estimates $\exp (γ_{i})$ trace a path from a completely pooled estimate to an unpooled estimate. Circular points give the raw data values. Parameters are exponentiated because the outcomes are Poisson, so parameters are fit on the log scale.
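The pooling continuum in panels B and C can be illustrated with a minimal Python sketch. It is not the authors' R/Stan code. It assumes $y_{i} \sim \mathrm{Poisson}(\exp (γ_{i}))$ with $γ_{i} \sim \mathrm{Normal}(θ, τ)$, fixes $θ$ at the log of the sample mean instead of inferring it, and reports the posterior mode of each $γ_{i}$ rather than a full posterior. The counts and the function names `map_log_rate` and `pooled_estimates` are hypothetical.

```python
import math

def map_log_rate(y, theta, tau, iters=50):
    """Posterior mode of gamma for y ~ Poisson(exp(gamma)), gamma ~ Normal(theta, tau)."""
    gamma = theta  # start at the population mean
    for _ in range(iters):
        rate = math.exp(gamma)
        grad = y - rate - (gamma - theta) / tau**2  # derivative of the log posterior
        hess = -rate - 1.0 / tau**2                 # always negative: concave problem
        gamma -= grad / hess                        # Newton step
    return gamma

counts = [3, 8, 10, 12, 17]                   # made-up counts for illustration
theta = math.log(sum(counts) / len(counts))   # assumption: theta fixed, not inferred

def pooled_estimates(tau):
    """Estimates exp(gamma_i) on the count scale for a given population scale tau."""
    return [math.exp(map_log_rate(y, theta, tau)) for y in counts]

# Small tau pulls every estimate to the pooled mean; large tau returns the raw counts.
for tau in (0.001, 0.5, 1000.0):
    print(tau, [round(e, 2) for e in pooled_estimates(tau)])
```

With a very small $τ$ every estimate sits at the pooled mean, with a very large $τ$ each estimate returns to its own observation, and intermediate values shrink each estimate part of the way toward the mean.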

Figure 2

Methods.

A table of partial pooling behavior for different likelihood and prior combinations. Rows are the two prior choices for the population distribution, and columns the two distributions for the data. Within each cell, the expectation of the marginal posterior $p (\exp (γ_{i}) | θ, τ, y)$ is plotted as a function of $τ$ . The solid black line is the expectation of the marginal posterior $p (θ | τ, y)$ with one standard deviation highlighted in gray. Top left: Combining a normal prior for the population with a Poisson likelihood is unsatisfactory in the presence of a zero observation. The zero observations influence the population mean in an extreme way owing to their high importance under the Poisson likelihood. Bottom left: By changing to a horseshoe prior, the problematic zero observations can escape the regularization machinery. However, regularization of the estimates with positive observations is much less impactful. Top right: A zero-inflated Poisson likelihood accounts for the zero observations with an added process, reducing the burden on the population estimate to compromise between extreme values. Bottom right: No model.
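The "added process" of the zero-inflated Poisson likelihood can be written out as a short Python sketch. It assumes the standard zero-inflated Poisson form, in which a structural zero occurs with probability $π$ and the count is otherwise Poisson. The caption does not state the exact parameterization used in the models, so the functions `poisson_pmf` and `zip_pmf` and the example values are illustrative only.

```python
import math

def poisson_pmf(k, lam):
    """Probability of observing count k under Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson: structural zero with probability pi, else Poisson(lam)."""
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base

lam, pi = 10.0, 0.2  # hypothetical rate and zero-inflation probability
# A zero is almost impossible under Poisson(10) but unsurprising under the ZIP.
print(poisson_pmf(0, lam), zip_pmf(0, lam, pi))
```

Because the extra mass at zero is absorbed by $π$, a zero observation no longer forces the Poisson rate, and through it the population mean, toward very small values.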

Figure 3

Recognition memory circuit.

Schematic of the recognition memory network adapted from Exley, 2019. Bold arrows show the assumed two-way connection between the medial prefrontal cortex (MPFC) and the hippocampus (HPC) facilitated by the nucleus reuniens (NRe). Colors highlight the HPC (red), MPFC (blue), and specific areas of the rhinal cortex (yellow). The NRe was lesioned in the experiment.

Figure 4

Results - Case study 1.

(A) Heatmap of the raw log cell count data. Each row corresponds to a single animal, columns correspond to brain regions. Animals are grouped into lesion-familiar (LF), lesion-novel (LN), sham-familiar (SF), and sham-novel (SN). (B, C) $\log_{2}$-fold differences for each surgery type: B shows differences between SF and SN groups; C shows differences between LF and LN groups. The 95% Bayesian highest density interval (HDI) is given in green, and the 95% confidence interval calculated from a Welch’s $t$-test in orange. Horizontal lines within the intervals mark the posterior mean of the Bayesian results, and the raw data means in the $t$-test case. The $x$-axis is ordered in terms of decreasing p-value from the significance test and ticks have been color-paired with the nodes in the recognition memory circuit diagram (Figure 3). Black ticks are not present in the circuit because they are the control regions in the experiment.
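A 95% HDI is the shortest interval that contains 95% of the posterior mass. The Python sketch below computes an empirical HDI from samples and applies it to a $\log_{2}$-fold difference. It is an illustration, not the authors' pipeline (Appendix 1—table 1 lists the R package HDInterval for this step). It assumes group means are available as posterior draws on the natural-log scale; the draws are simulated, and the names `hdi`, `log2_fold`, `theta_novel`, and `theta_familiar` are hypothetical.

```python
import math
import random

def hdi(samples, mass=0.95):
    """Shortest interval containing a fraction `mass` of the samples."""
    xs = sorted(samples)
    n = len(xs)
    width = max(1, math.ceil(mass * n))  # number of samples inside the interval
    # Slide a window of `width` samples and keep the narrowest one.
    best = min(range(n - width + 1), key=lambda i: xs[i + width - 1] - xs[i])
    return xs[best], xs[best + width - 1]

def log2_fold(theta_a, theta_b):
    """log2 fold difference from paired draws of natural-log group means."""
    return [(a - b) / math.log(2) for a, b in zip(theta_a, theta_b)]

rng = random.Random(1)
theta_novel = [rng.gauss(3.0, 0.2) for _ in range(4000)]     # made-up posterior draws
theta_familiar = [rng.gauss(2.5, 0.2) for _ in range(4000)]  # made-up posterior draws

lo, hi = hdi(log2_fold(theta_novel, theta_familiar))
print(lo, hi)
```

An interval that excludes zero indicates a credible difference between the two groups for that region, under the model that produced the draws.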

Figure 5

Results - Case study 2.

(A) Heatmap of the raw log cell count data. Each row corresponds to a single animal, columns correspond to brain regions. L and R denote left and right hemispheres, respectively. (B) $\log_{2}$-fold differences in green fluorescent protein (GFP)-positive cells between mouse genotypes, heterozygous (HET) and knockout (KO), for each of the 50 recorded brain regions, spread across two rows. The 95% Bayesian highest density interval (HDI) is given in purple for the Bayesian horseshoe model and in pink for the zero-inflated Poisson (ZIP) model. The 95% confidence interval calculated from a Welch’s $t$-test is in orange. Horizontal lines within the intervals mark the posterior mean of the Bayesian results and the data estimate for the $t$-test. The $x$-axis is ordered in terms of decreasing p-value from the significance test.

Figure 6

Example data and inferences highlighting model discrepancies.

On the left under ‘data’: boxplots with medians and interquartile ranges for the raw data for four example brain regions. The shape of each point pairs left and right hemisphere readings in each of the five animals. On the right under ‘inference’: highest density intervals (HDIs) and confidence intervals are plotted. Purple is the Bayesian horseshoe model, pink is the Bayesian ZIP model, and orange is the sample mean. The Bayesian estimates are not strongly influenced by the zero-valued observations (medial preoptic nucleus [MPN], suprachiasmatic nucleus [SCH], dorsal tuberomammillary nucleus [TMd]) or large-valued outliers (medial habenula [MH]) and have means close to the data median. This explains the advantage of the Bayesian results over the confidence interval.

Appendix 1—figure 1

Diagnostics - Poisson.

Standard Poisson model - Case study 1.

Appendix 1—figure 2

Diagnostics - Horseshoe.

Horseshoe model - Case study 2.

Appendix 1—figure 3

Diagnostics - ZIPoisson.

Zero-inflated Poisson - Case study 2.

Appendix 1—figure 4

PPC - Poisson.

Posterior predictive check for the standard Poisson model in Case study 1. (A) The proportion of zeroes in the data matches the proportion of zeroes in posterior predictive samples. This proportion is zero. (B) The distribution of standard deviations computed over a number of posterior predictive datasets (histogram) aligns with the standard deviation of the data.

Appendix 1—figure 5

PPC - Horseshoe.

Posterior predictive check for the standard horseshoe model in Case study 2. (A) The proportion of zeroes in the data is larger than that found in posterior predictive datasets. This is expected, because the likelihood is still a Poisson distribution. (B) The distribution of standard deviations computed over a number of posterior predictive datasets (histogram) aligns with the standard deviation of the data.
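The zero-proportion check in panel A can be sketched in Python as follows. This is a hedged illustration and not the fitted models: the data are simulated with 20 structural zeros, and the "posterior draws" are a stand-in (the sample mean with small log-normal jitter) for draws from a real Poisson fit. The names `sample_poisson`, `zero_proportion`, and `rate_draws` are hypothetical.

```python
import math
import random

def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def zero_proportion(values):
    """Test statistic: fraction of observations equal to zero."""
    return sum(v == 0 for v in values) / len(values)

rng = random.Random(0)
# Made-up zero-heavy data: 20 structural zeros plus 30 counts near 10.
observed = [0] * 20 + [sample_poisson(10.0, rng) for _ in range(30)]

# Assumption: stand-in for posterior draws of a single Poisson rate.
mean_count = sum(observed) / len(observed)
rate_draws = [mean_count * math.exp(rng.gauss(0.0, 0.05)) for _ in range(200)]

# One replicated dataset per draw, summarized by its proportion of zeroes.
replicated = [
    zero_proportion([sample_poisson(rate, rng) for _ in observed])
    for rate in rate_draws
]
print(zero_proportion(observed), max(replicated))
```

When the data contain many zeroes but the likelihood is plain Poisson, the replicated zero proportions fall well below the observed one, which is the mismatch the caption describes.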

Appendix 1—figure 6

PPC - ZIPoisson.

Zero-inflated Poisson - Case study 2. (A) The proportion of zeroes in the data matches the proportion of zeroes in posterior predictive samples. (B) The distribution of standard deviations computed over a number of posterior predictive datasets (histogram) aligns with the standard deviation of the data.

Appendix 1—figure 7

Horseshoe densities.

(A) Conditional posterior. (B) MCMC pair plots. Divergent samples are colored in pink, non-divergent in blue.

Appendix 1—figure 8

Modified horseshoe densities.

(A) The conditional posterior $p (\tilde{γ}, κ ∣ θ, τ, y)$ when $y = 0$ (left) and $y ≠ 0$ (right). (B) MCMC pair plots of samples from the marginal posterior density $p (\tilde{γ}, κ ∣ y)$.

Tables

Table 1

Parameter table for the hierarchical model.

Parameter	Description
$E_{i}$	Exposure
$κ_{i}$	Horseshoe inflation
$π$	Zero inflation
$γ_{i}$	Random effect for observation $i$
$θ_{r g}$	Fixed effect for region $r$ in group $g$
$τ_{r g}$	Scale of random effects for region $r$ in group $g$

Appendix 1—table 1

Software packages used.

R Libraries	Version	Description
rstan	2.26.3	complete Stan library
cmdstanr	0.5.2	lightweight Stan library
HDInterval	0.2.2	calculating HDI in R
ggplot2	3.4.1	plotting
bayesplot	1.9.0	plotting
tidyverse	1.3.1	tibble, tidyr, readr, purrr, dplyr, stringr, forcats

R version 4.2.1, ‘Funny-Looking Kid’.
Computation was performed locally on a Dell XPS 13 7390 laptop (Intel i7-10510U @ 1.80 GHz, 16 GB of RAM, Ubuntu 20.04.4 LTS).
Panels composed using Inkscape version 1.2.2.

Appendix 1—table 2

Acronyms for the brain regions in Case study 1.

Term	Definition
ACC	Anterior cingulate cortex
DCA1/3	Dorsal CA1/3
DDG	Dorsal dentate gyrus
DPC	Dorsal peduncular cortex
DSUB	Dorsal subiculum
HPC	Hippocampus
ICA1/3	Intermediate CA1/3
IDG	Intermediate dentate gyrus
IFC	Infralimbic cortex
LENT	Lateral entorhinal cortex
MOC	Medial orbital cortex
MPFC	Medial prefrontal cortex
M2C	Motor cortex M2
NRe	Nucleus reuniens
PRL	Prelimbic cortex
PRH	Perirhinal cortex
PSTC	Postrhinal cortex
TE2	Temporal association cortex
VCA1/3	Ventral CA1/3
VDG	Ventral dentate gyrus
VOC	Ventral orbital cortex
VSUB	Ventral subiculum
V2C	Visual cortex V2

Appendix 1—table 3

Acronyms for the brain regions in Case study 2.

Term	Definition	Term	Definition
AHN	Anterior hypothalamic nucleus	PP	Peripeduncular nucleus
ARH	Arcuate hypothalamic nucleus	PR	Perireunensis nucleus
CL	Central lateral nucleus of the thalamus	PVa	Periventricular hypothalamic nucleus, anterior part
DMH	Dorsomedial nucleus of the hypothalamus	PVH	Paraventricular hypothalamic nucleus
FF	Fields of Forel	PVHd	Paraventricular hypothalamic nucleus, descending division
IGL	Intergeniculate leaflet of the lateral geniculate complex	PVi	Periventricular hypothalamic nucleus, intermediate part
LD	Lateral dorsal nucleus of thalamus	PVp	Periventricular hypothalamic nucleus, posterior part
LM	Lateral mammillary nucleus	RCH	Retrochiasmatic area
LGv	Ventral part of the lateral geniculate complex	RT	Reticular nucleus of the thalamus
LGd	Dorsal part of the lateral geniculate complex	SBPV	Subparaventricular zone
LH	Lateral habenula	SCH	Suprachiasmatic nucleus
LHA	Lateral hypothalamic area	SGN	Suprageniculate nucleus
LP	Lateral posterior nucleus of the thalamus	SPFm	Subparafascicular nucleus, magnocellular part
MD	Mediodorsal nucleus of thalamus	SPFp	Subparafascicular nucleus, parvicellular part
MGd	Medial geniculate complex, dorsal part	SUM	Supramammillary nucleus
MGv	Medial geniculate complex, ventral part	TMd	Tuberomammillary nucleus, dorsal part
MGm	Medial geniculate complex, medial part	TMv	Tuberomammillary nucleus, ventral part
MH	Medial habenula	TU	Tuberal nucleus
MMme	Medial mammillary nucleus, median part	VAL	Ventral anterior-lateral complex of the thalamus
MPN	Medial preoptic nucleus	VMH	Ventromedial hypothalamic nucleus
PH	Posterior hypothalamic nucleus	VM	Ventral medial nucleus of the thalamus
PMd	Dorsal premammillary nucleus	VPL	Ventral posterolateral nucleus of the thalamus
PMv	Ventral premammillary nucleus	VPM	Ventral posteromedial nucleus of the thalamus
PO	Posterior complex of the thalamus	VPMpc	Ventral posteromedial nucleus of the thalamus, parvicellular part
POL	Posterior limiting nucleus of the thalamus	ZI	Zona incerta